Efficient Dimension Reduction with Guarantees, from Clustering to Hashing

Thursday, September 15, 2016 - 9:00am - 9:50am
Keller 3-180
Rachel Ward (The University of Texas at Austin)
In the first part of the talk, we discuss an semidefinite programming relaxation of the k-means clustering problem. We show that even when the SDP is not tight, its solution produces a denoised copy of the original set of points, and can be combined with a rounding scheme to produce a set of k centers with improved stability guarantees under subgaussian mixture model compared to previous algorithms.

In the second part, we discuss locality sensitive hashing (LSH) for approximate nearest neighbor search. We introduce a fast variant of the cross-polytope LSH scheme for angular distance. To our knowledge, is the first LSH scheme with provably optimal sensitivity which is also fast.