Abstract:
Given data drawn from a mixture of multivariate Gaussians, a basic problem is to accurately
estimate the mixture parameters. We provide a polynomial-time algorithm for this problem for
any fixed number $k$ of Gaussians in $n$ dimensions (even if they overlap), with provably minimal
assumptions on the Gaussians and polynomial data requirements.
In statistical terms, our estimator converges at an inverse polynomial rate, and no such estimator (even exponential time) was known for this problem (even in one dimension, restricted to two Gaussians). Our algorithm reduces the $n$-dimensional problem to the one dimensional problem, where the method of moments is applied. Additionally, in order to prove correctness for our univariate learning algorithm, we develop a novel explanation for why the method of moments (due to Pearson in 1894) works based on connections to algebraic geometry. [Joint work with Adam Tauman Kalai and Gregory Valiant. See also independent work of Mikhail Belkin and
Kaushik Sinha]