Sketchy Polytopes

α-strong convexity, β-strong smoothness

Strong convexity often allows for optimization algorithms that converge very quickly to an ϵ-optimum (rf. FISTA and NESTA). This post will cover some fundamentals of strongly convex functions.

Convexity

For a convex function, f:RR and γ[0,1],

f(tx+(1t)y)tf(x)+(1t)f(y)

Strict convexity

For a strictly convex function,

Geometrically, the definition of convexity implies that all points on any straight line connecting any two points in a convex set also lie in the set. Strict convexity excludes linear and affine functions, or functions with linear/affine subsets in their boundaries. (How would this extend to non-Euclidean geometries?)

α-strong convexity

A function is α-strongly convex with respect to a norm |.| if, for α > 0

Alternatively,

(f(x)f(y))(xy)αxy2

or,

f(y)f(x)+f(x)(yx)+α2xy2

Strongly convexity extends strict convexity. For twice-differentiable functions, this implies that 2f(x)α. As Bubeck explains [1], strongly convex functions speed up convergence of first-order methods. Larger values of α imply larger gradient, and hence step size, when further away from the optimum.

Strong smoothness is another property of certain convex functions:

β-smoothness

A function is β-smoothly convex with respect to a norm |.| if, for β > 0, [4]

f(y)f(x)+f(x)(yx)+β2xy2

This definition gives an lower bound on the improvement in one step of (sub)gradient descent [1]:

f(x1βf(x))f(x)12βf(x)2

Alternatively, β-smoothly convex function [lemma 3.3, 1]:

f(x)f(y)f(x)(yx)12βf(x)f(y)2

Strong/smooth duality

Under certain conditions, a-strong convexity and β-smoothness are dual notions. For now, we’ll state the result without discussion.

If f is a closed and convex function, then f is α-strongly convex function with respect to a norm |.| if and only if f is 1α-strongly smooth with respect to the dual norm |.| (corollary 7 in [4]).

References:

  • 1 S. Bubeck, Theory of Convex Optimization for Machine Learning, section 3.4
  • 2 Wikipedia, Convex function
  • 3 S. Boyd and G. Vandenberghe, Convex Optimization, section 9.1.2
  • 4 S. M. Kakade, S. Shalev-Schwartz, A. Tiwari. On the duality of strong convexity and strong smoothness: Learning applications and matrix regularization. Technical Report, 2009

Credits

Image from Sebastian Pokutta’s post, Cheat Sheet: Smooth Convex Optimization.