α-strong convexity, β-strong smoothness
Strong convexity often allows for optimization algorithms that converge very quickly to an ϵ-optimum (cf. FISTA and NESTA). This post will cover some fundamentals of strongly convex functions.
Convexity
For a convex function f: ℝ → ℝ and ∀t ∈ [0,1],
$$f(tx + (1-t)y) \leq t f(x) + (1-t) f(y)$$

Strict convexity
For a strictly convex function, the inequality is strict whenever x ≠ y and t ∈ (0,1):

$$f(tx + (1-t)y) < t f(x) + (1-t) f(y)$$
Geometrically, the definition of convexity means that the straight line (chord) connecting any two points on the graph of f lies on or above the graph; equivalently, the epigraph of f is a convex set. Strict convexity excludes linear and affine functions, as well as functions with linear/affine segments in their graphs. (How would this extend to non-Euclidean geometries?)
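As a quick illustration (my own, not from the references), here is a brute-force numerical check of the convexity inequality on a grid; `is_convex_on_grid` and the example functions are just for this sketch, and passing the check is necessary but not sufficient for convexity:

```python
import numpy as np

def is_convex_on_grid(f, xs, ts):
    """Check f(t*x + (1-t)*y) <= t*f(x) + (1-t)*f(y) on a finite grid."""
    for x in xs:
        for y in xs:
            for t in ts:
                lhs = f(t * x + (1 - t) * y)
                rhs = t * f(x) + (1 - t) * f(y)
                if lhs > rhs + 1e-9:   # tolerance for floating-point error
                    return False
    return True

xs = np.linspace(-3, 3, 31)
ts = np.linspace(0, 1, 11)

print(is_convex_on_grid(lambda x: x**2, xs, ts))   # True:  strictly convex
print(is_convex_on_grid(abs, xs, ts))              # True:  convex, but not strictly
print(is_convex_on_grid(np.sin, xs, ts))           # False: not convex on [-3, 3]
```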
α-strong convexity
A function is α-strongly convex with respect to a norm ‖·‖ if, for α > 0 and all t ∈ [0,1],

$$f(tx + (1-t)y) \leq t f(x) + (1-t) f(y) - \frac{\alpha}{2}\, t(1-t)\, \|x - y\|^2$$
Alternatively,
$$(\nabla f(x) - \nabla f(y))^\top (x - y) \geq \alpha \|x - y\|^2$$

or,
$$f(y) \geq f(x) + \nabla f(x)^\top (y - x) + \frac{\alpha}{2} \|x - y\|^2$$

Strong convexity extends strict convexity. For twice-differentiable functions, it implies that $\nabla^2 f(x) \succeq \alpha I$. As Bubeck explains [1], strongly convex functions speed up the convergence of first-order methods. Larger values of α imply a larger gradient, and hence a larger step, when further away from the optimum.
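To make the role of α concrete, here is a minimal sketch (my own example, not from [1]): gradient descent on f(x) = ½ xᵀ diag(α, β) x, which is α-strongly convex and β-smooth. With step size 1/β the error contracts by a factor (1 − α/β) per iteration, so larger α means faster convergence:

```python
import numpy as np

def gradient_descent(A, x0, step, iters):
    """Gradient descent on f(x) = 0.5 * x^T A x, whose gradient is A x."""
    x = x0.copy()
    for _ in range(iters):
        x = x - step * (A @ x)
    return x

beta = 10.0                      # smoothness constant (largest eigenvalue)
x0 = np.array([1.0, 1.0])

for alpha in (0.1, 1.0, 5.0):    # strong convexity constant (smallest eigenvalue)
    A = np.diag([alpha, beta])
    x = gradient_descent(A, x0, step=1.0 / beta, iters=50)
    print(f"alpha={alpha:4.1f}  ||x_50 - x*|| = {np.linalg.norm(x):.2e}")
```

With the minimizer at x* = 0, the printed error shrinks dramatically as α grows toward β.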
Strong smoothness is another property of certain convex functions:
β-smoothness
A function is β-smooth with respect to a norm ‖·‖ if, for β > 0 [4],
$$f(y) \leq f(x) + \nabla f(x)^\top (y - x) + \frac{\beta}{2} \|x - y\|^2$$

This definition gives a lower bound on the improvement in one step of gradient descent [1]:
$$f\!\left(x - \frac{1}{\beta} \nabla f(x)\right) - f(x) \leq -\frac{1}{2\beta} \|\nabla f(x)\|^2$$

Alternatively, for a β-smooth convex function [Lemma 3.3, 1]:
$$f(x) - f(y) \leq \nabla f(x)^\top (x - y) - \frac{1}{2\beta} \|\nabla f(x) - \nabla f(y)\|^2$$
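To see the improvement bound in action, here is a minimal numerical sketch (my own, not from [1]) using an arbitrary quadratic; for a quadratic, the smoothness constant β is the largest eigenvalue of its Hessian:

```python
import numpy as np

rng = np.random.default_rng(0)

# f(x) = 0.5 * x^T A x + b^T x is beta-smooth with beta = lambda_max(A).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -2.0])
beta = np.linalg.eigvalsh(A).max()

def f(x):
    return 0.5 * x @ A @ x + b @ x

def grad(x):
    return A @ x + b

# Check, at random points, that a gradient step of size 1/beta decreases f
# by at least ||grad f(x)||^2 / (2*beta).
for _ in range(5):
    x = rng.normal(size=2)
    g = grad(x)
    actual = f(x - g / beta) - f(x)
    bound = -g @ g / (2 * beta)
    print(f"decrease {actual:+.4f} <= bound {bound:+.4f}: {actual <= bound + 1e-12}")
```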
Strong/smooth duality

Under certain conditions, α-strong convexity and β-smoothness are dual notions. For now, we’ll state the result without discussion.
If f is a closed and convex function, then f is α-strongly convex with respect to a norm ‖·‖ if and only if f∗ is (1/α)-strongly smooth with respect to the dual norm ‖·‖∗ (Corollary 7 in [4]).
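As a quick sanity check of this statement (my own example, not from [4]): take f(x) = (α/2)‖x‖² with the Euclidean norm, which is α-strongly convex. Its conjugate is

$$f^*(y) = \sup_x \left\{ \langle x, y \rangle - \frac{\alpha}{2}\|x\|^2 \right\} = \frac{1}{2\alpha}\|y\|^2,$$

which is (1/α)-smooth (its Hessian is (1/α)I), and the Euclidean norm is its own dual.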
References:
- [1] S. Bubeck, Theory of Convex Optimization for Machine Learning, Section 3.4
- [2] Wikipedia, Convex function
- [3] S. Boyd and L. Vandenberghe, Convex Optimization, Section 9.1.2
- [4] S. M. Kakade, S. Shalev-Shwartz, and A. Tewari, On the duality of strong convexity and strong smoothness: Learning applications and matrix regularization. Technical Report, 2009
Credits
Image from Sebastian Pokutta’s post, Cheat Sheet: Smooth Convex Optimization.