Sketchy Polytopes

## Geometric Median & Robust Estimation

Stanislav Minsker addresses the following question:

Problem: Given $X_1, \ldots, X_k$ that are i.i.d. random variables, how can we estimate the mean of the distribution so that the confidence interval still has a coverage probability $\ge 1 - O(e^{-t})$? Note: the smaller the confidence interval, the better the estimator.

1. Assuming the variables from normal distribution $N(\mu, \sigma^2)$. If the estimator is constructed as $\hat{\mu}_n = \Sigma_i X_i / n$, then:

For non-normal distribution, the confidence interval is much different:

Since the length of this interval depends on $e^t$, can we do better?

1. If we split the sample into $k = floor(t) + 1$ groups $G_1, \ldots, G_k$ each of size approximate $frac{n}{t}$, then median-of-means estimator has the following confidence interval (result due to Nemirovskii and Yudin, 1983):

Where $C$ is a constant. The length of confidence interval for median-of-means is much smaller than for the naive mean estimator.

1. Minsker’s work extends this to multi-dimensional distributions in Banach space using geometric median to obtain confidence intervals of sub-exponential length with high coverage.