
Seminar Announcement


Periodic Step-size Adaptation in Second-order Gradient Descent for Single-pass On-line Learning

  • 2010-01-25 (Mon.), 10:30 AM
  • Auditorium 208, 2F, Tsai Yuan-Pei Memorial Hall, Academia Sinica
  • Tea reception: 10:10 AM, 2F, Tsai Yuan-Pei Memorial Hall, Institute of Statistical Science
  • Prof. Yuh-Jye Lee (李育杰)
  • Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology

Abstract

It has been established that the second-order stochastic gradient descent (2SGD) method can potentially achieve generalization performance as good as the empirical optimum in a single pass through the training examples. However, 2SGD requires computing the inverse of the Hessian matrix of the loss function, which is prohibitively expensive, particularly when the learning task involves a very high-dimensional feature space. In this talk, we present a new second-order SGD method, called Periodic Step-size Adaptation (PSA). PSA approximates the Jacobian matrix of the update mapping and exploits a linear relation between the Jacobian and the Hessian to approximate the Hessian periodically. We tested PSA on large-scale sequence labeling tasks using conditional random fields and on large-scale classification tasks using linear support vector machines. Experimental results show that the single-pass performance of PSA is always very close to the empirical optimum.
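To make the Jacobian-to-Hessian idea concrete, below is a minimal, hypothetical Python sketch of a PSA-style update; it is not the speaker's implementation, and the names (psa_sgd, loss_grad, eta0, period) are placeholders chosen for illustration. The sketch runs plain SGD and, once every period of steps, estimates the diagonal of the update mapping's Jacobian from the ratio of successive parameter differences, converts it to an approximate Hessian diagonal via the relation J = I - diag(eta) * H, and rescales the per-coordinate step sizes accordingly.

import numpy as np

def psa_sgd(loss_grad, w0, examples, eta0=0.1, period=100, eps=1e-8):
    """Single-pass SGD with per-coordinate step sizes adapted every `period` steps."""
    w = np.asarray(w0, dtype=float).copy()
    eta = np.full_like(w, eta0)      # per-coordinate step sizes
    w_mark = w.copy()                # parameters at the start of the current period
    delta_prev = None                # parameter change over the previous period

    for t, x in enumerate(examples, start=1):
        w -= eta * loss_grad(w, x)   # plain first-order SGD step

        if t % period == 0:
            delta = w - w_mark       # change accumulated over this period
            if delta_prev is not None:
                # Component-wise ratio of successive differences estimates the
                # diagonal of the Jacobian J of the update mapping; clip it so the
                # implied curvature stays positive and the step sizes stay finite.
                j_diag = np.divide(delta, delta_prev,
                                   out=np.zeros_like(delta),
                                   where=np.abs(delta_prev) > eps)
                j_diag = np.clip(j_diag, 0.0, 1.0 - eps)
                # J = I - diag(eta) * H  =>  diag(H) ~ (1 - diag(J)) / eta
                h_diag = (1.0 - j_diag) / eta
                eta = 1.0 / (h_diag + eps)   # Newton-like per-coordinate scaling
            delta_prev = delta
            w_mark = w.copy()
    return w

In this sketch the Hessian is never formed or inverted; only its diagonal is approximated periodically, which is what keeps the per-step cost close to that of first-order SGD even in very high-dimensional feature spaces.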
