Globally Optimal and Robust k-means Clustering via Convex Optimization

Wednesday, January 27, 2016 - 3:15pm - 4:05pm
Keller 3-180
Rachel Ward (The University of Texas at Austin)
The k-means clustering objective is known to be hard to minimize in general, requiring an exhaustive search over all possible partitions of a given set of data into k clusters. At the same time, fast alternating minimization algorithms act as effective surrogates for the k-means optimization problem in practice, despite only being guaranteed in general to converge to local minimizers. As a compromise between accuracy and speed, we consider a semidefinite programming relaxation of the k-means optimization problem, followed by rounding if necessary. We first consider the case of points drawn randomly from k isotropic separated distributions, and show the SDP recovers the globally optimal k-means clustering at near-minimal ball separation. We then show robustness: we consider the case where points are drawn from a Gaussian mixture model, and show that that the SDP recovers the k planted means up to the error of the globally-optimal k-means centers. We conclude by discussing the role of convex optimization in geometric clustering problems more generally.

This talk is based on joint with with Pranjal Awasthi, Afonso Bandeira, Moses Charikar, Ravi Krishnaswami, Dustin Mixon, and Soledad Villar.