Learning Linear Support Vector Machines in Sublinear Time

Tomer Koren, M.Sc. Thesis Seminar
Wednesday, 7.3.2012, 14:30
Taub 701
Dr. E. Hazan

In recent years, stochastic approximation approaches such as Stochastic Gradient Descent (SGD) and Stochastic Dual Averaging have become the optimization method of choice for many learning problems, including linear Support Vector Machines (SVMs). This is not surprising, since such methods yield optimal generalization guarantees with only a single pass over the data. They in a sense have an unbeatable runtime: the runtime to get to a desired generalization goal is the same as the size of the data set required to do so. In this work we show, for the first time, how to beat this unbeatable runtime, and present a method that in a certain relevant regime, learns in runtime less than the size of the minimal training set size required for learning. Our method is based on a stochastic primal-dual approach, where the primal step is akin to an importance-weighted SGD, and the dual step is a stochastic update on the importance weights.

Back to the index of events