Asymmetric Hashing

Most binary coding and hashing algorithms were designed to deal with Approximate Nearest Neighbor (ANN) search. Studies have rarely been dedicated to Maximum Inner Product Search (MIPS), which actually plays a critical role in various vision and learning applications.

Inspired by the recent Asymmetric LSH work (Shrivastava and Li, 2014), in this project, we investigate learning binary codes to exclusively handle the MIPS problem. The core problem is, the similaritis between two sets of data should be revealed by the inner products of their corresponding binary codes. We study this problem by proposing the Asymmetric Inner-product Bianry Coding algorithm for large-scale MIPS, and the Discrete Collaborative Filtering algorithm for recommendation systems.

  • Learning Binary Codes for Maximum Inner Product Search, Fumin Shen, Wei Liu, Shaoting Zhang, Yang Yang, and Heng Tao Shen, IEEE International Conference on Computer Vision (ICCV), 2015 [PDF, BibTeX]
    @InProceedings{Shen_2015_ICCV,
    author = {Shen, Fumin and Liu, Wei and Zhang, Shaoting and Yang, Yang and Tao Shen, Heng},
    title = {Learning Binary Codes for Maximum Inner Product Search},
    booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
    month = {December},
    pages = {4148--4156},
    year = {2015}
    }
    
  • Discrete Collaborative Filtering, Hanwang Zhang, Fumin Shen, Wei Liu, Xiangnan He, Huanbo Luan, Tat-Seng Chua, ACM SIGIR conference on Research and Development in Information Retrieval (SIGIR), 2016 [PDF, Codes, BibTeX]
    Best Paper Award Honorable Mention
    @inproceedings{Zhang2016DCF,
     author = {Zhang, Hanwang and Shen, Fumin and Liu, Wei and He, Xiangnan and Luan, Huanbo and Chua, Tat-Seng},
     title = {Discrete Collaborative Filtering},
     booktitle = {Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval},
     series = {SIGIR '16},
     year = {2016},
     location = {Pisa, Italy},
     pages = {325--334},
     numpages = {10},
    } 
    

Classification by Hamming Retrieval

Recently, we study asymmetric hashing on the large-scale linear classification problem: classify examples by a linear transformation matrix from a large number of categories. This is a typical inner-product involving problem. We propose to respresent both the samples and classifiers using binary hash codese, which are simultaneously learned from the training data. Classifying an example thereby reduces to retrieving its nearest class codes in the Hamming spac, i.e., computing the Hamming distance between the binary codes of the example and classifiers and selecting the class with minimal Hamming distance.

Overview of Hashed Linear Classification: binarize both data and classifiers

  • Classification by Retrieval: Binarizing Data and Classifier, Fumin Shen, Yadong Mu, Yang Yang, Wei Liu, Li Liu, Jingkuan Song, Heng Tao Shen, ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2017 [PDF, BibTeX]
    Best Paper Award Honorable Mention
    @inproceedings{'shen2017sigir',
     author = {Fumin Shen and Yadong Mu and Yang Yang and Wei Liu and Li Liu and Jingkuan Song and Heng Tao Shen},
     booktitle={ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR}, 
     title = {Learning Binary Codes and Binary Weights for Efficient Classification},
     year = {2017}
    }