
Seminar Announcement


Periodic Step-size Adaptation in Second-order Gradient Descent for Single-pass On-line Learning

  • 2010-01-25 (Mon.), 10:30 AM
  • Auditorium 208, 2F, Tsai Yuan-Pei Memorial Hall, Academia Sinica
  • Tea reception: 10:10 AM, 2F, Tsai Yuan-Pei Memorial Hall, Institute of Statistical Science
  • Prof. Yuh-Jye Lee (李育杰)
  • Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology

Abstract

It has been established that the second-order stochastic gradient descent (2SGD) method can potentially achieve generalization performance as good as the empirical optimum in a single pass through the training examples. However, 2SGD requires computing the inverse of the Hessian matrix of the loss function, which is prohibitively expensive, particularly when the learning task involves a very high-dimensional feature space. In this talk, we present a new second-order SGD method, called Periodic Step-size Adaptation (PSA). PSA approximates the Jacobian matrix of the update mapping and exploits a linear relation between the Jacobian and the Hessian to approximate the Hessian periodically. We tested PSA on large-scale sequence labeling tasks using conditional random fields and on large-scale classification tasks using linear support vector machines. Experimental results show that the single-pass performance of PSA is always very close to the empirical optimum.
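To make the Jacobian-to-Hessian idea concrete, below is a minimal, hypothetical Python sketch of a PSA-style update; it is not the speaker's implementation, and the names (psa_sgd, loss_grad, eta0, period) are placeholders chosen for illustration. The sketch runs plain SGD and, once every period of steps, estimates the diagonal of the update mapping's Jacobian from the ratio of successive parameter differences, converts it to an approximate Hessian diagonal via the relation J = I - diag(eta) * H, and rescales the per-coordinate step sizes accordingly.

import numpy as np

def psa_sgd(loss_grad, w0, examples, eta0=0.1, period=100, eps=1e-8):
    """Single-pass SGD with per-coordinate step sizes adapted every `period` steps."""
    w = np.asarray(w0, dtype=float).copy()
    eta = np.full_like(w, eta0)      # per-coordinate step sizes
    w_mark = w.copy()                # parameters at the start of the current period
    delta_prev = None                # parameter change over the previous period

    for t, x in enumerate(examples, start=1):
        w -= eta * loss_grad(w, x)   # plain first-order SGD step

        if t % period == 0:
            delta = w - w_mark       # change accumulated over this period
            if delta_prev is not None:
                # Component-wise ratio of successive differences estimates the
                # diagonal of the Jacobian J of the update mapping; clip it so the
                # implied curvature stays positive and the step sizes stay finite.
                j_diag = np.divide(delta, delta_prev,
                                   out=np.zeros_like(delta),
                                   where=np.abs(delta_prev) > eps)
                j_diag = np.clip(j_diag, 0.0, 1.0 - eps)
                # J = I - diag(eta) * H  =>  diag(H) ~ (1 - diag(J)) / eta
                h_diag = (1.0 - j_diag) / eta
                eta = 1.0 / (h_diag + eps)   # Newton-like per-coordinate scaling
            delta_prev = delta
            w_mark = w.copy()
    return w

In this sketch the Hessian is never formed or inverted; only its diagonal is approximated periodically, which is what keeps the per-step cost close to that of first-order SGD even in very high-dimensional feature spaces.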
