Sketchy Polytopes

Subgradient generalizes the notion of gradient/derivative and quantifies the rate of change of non-differentiable/non-smooth function ([1], section 8.1 in [2]). For a real-valued function, the “subgradients” of the function at a point $x_0$ are all values of $c$ such that:

Unlike derivative of smooth function, which is a singleton value, the subgradients form a set (e.g. an interval of points). The set of all subgradients is called the “subdifferential”.

Subdifferential always exists if $f(x)$ is a convex function with a convex domain $X$, and vice versa (proposition 1.1 in [3]):

The generalized subgradient descent scheme is given as:

## References

• 1 Wikipedia, Subgradient
• [2] J. F. Bonnan, J. C. Gilbert, C. Lemaréchal, C. A. Sagastizábal. Numerical Optimization: Theoretical and Practical Aspects. Second Edition, Springer, 2006
• 3 S. Bubeck. Theory of Convex Optimization for Machine Learning. Lecture Notes, 2014

## Credits

Image from the Hal Daumé’s post, Subgradient descent, or something….