First-order methods for stochastic optimization — Optimization, particularly deterministic (convex) optimization, is essential to many learning algorithms (regression, support vector machines, matrix factorization etc). This post discusses first-order/gradient-based optimization approaches that