KDD’15, August 10-13, 2015, Sydney, NSW, Australia.
Li Zhou, David G. Andersen, Mu Li, Alexander J. Smola
Carnegie Mellon University
In this paper we present a novel data structure for sparse vectors based on cuckoo hashing. It is highly memory-efficient and supports random access at rates close to those of dense vectors. This allows us to solve sparse ℓ1 programming problems exactly and without preprocessing, at a cost that is identical to that of dense linear algebra in terms of both memory and speed. Our approach provides a feasible alternative to the hash kernel, and it excels whenever exact solutions are required, such as for feature selection.
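To make the idea of a cuckoo-hashed sparse vector concrete, the following is a minimal sketch and not the authors' implementation. It assumes the textbook two-table cuckoo scheme. The class name `CuckooSparseVector`, the `max_kicks` limit, the salted use of Python's built-in `hash`, and the grow-by-doubling policy are all hypothetical choices made for illustration. The point it shows is that every lookup probes at most two slots, and that keys are stored exactly, so distinct features are never merged as they can be under the hash kernel.

```python
class CuckooSparseVector:
    """Toy sparse vector backed by two-table cuckoo hashing (illustrative only)."""

    def __init__(self, capacity=16, max_kicks=32):
        self.capacity = capacity      # slots per table (assumed parameter)
        self.max_kicks = max_kicks    # eviction limit before growing (assumed)
        # Each slot holds None or an exact (key, value) pair.
        self.tables = [[None] * capacity, [None] * capacity]

    def _slot(self, which, key):
        # Two hash functions obtained by salting with the table index;
        # this choice is an assumption, not taken from the paper.
        return hash((which, key)) % self.capacity

    def get(self, key):
        # A lookup inspects at most two slots, one per table.
        for which in (0, 1):
            entry = self.tables[which][self._slot(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return 0.0  # absent coordinates of a sparse vector are zero

    def set(self, key, value):
        # Overwrite in place if the key is already stored.
        for which in (0, 1):
            idx = self._slot(which, key)
            entry = self.tables[which][idx]
            if entry is not None and entry[0] == key:
                self.tables[which][idx] = (key, value)
                return
        # Otherwise insert, evicting occupants to their alternate table.
        item, which = (key, value), 0
        for _ in range(self.max_kicks):
            idx = self._slot(which, item[0])
            item, self.tables[which][idx] = self.tables[which][idx], item
            if item is None:
                return  # landed in an empty slot
            which = 1 - which  # the evicted pair tries the other table
        self._grow(item)

    def _grow(self, pending):
        # Double the tables and reinsert everything, including the
        # pair that could not be placed.
        entries = [e for table in self.tables for e in table if e is not None]
        entries.append(pending)
        self.capacity *= 2
        self.tables = [[None] * self.capacity, [None] * self.capacity]
        for k, v in entries:
            self.set(k, v)


# Usage: store weights under large, sparse feature indices.
w = CuckooSparseVector(capacity=4)
for i in range(100):
    w.set(i * 7919, float(i))
```

Because the stored key is compared on every probe, `w.get` returns either the exact stored weight or zero. That exactness is the property the abstract highlights for feature selection.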