WSDM 2021

DECAF: Deep Extreme Classification with Label Features

Anshul Mittal¹, Kunal Dahiya¹, Sheshansh Agrawal², Deepak Saini², Sumeet Agarwal¹, Manik Varma², Purushottam Kar²
¹ Indian Institute of Technology Delhi, India
² Microsoft Research, India

This paper develops the DECAF algorithm for extreme multi-label classification, where the objective is to tag a data point with its most relevant subset of labels from an extremely large label set. Leading extreme classifiers ignore label metadata, such as label descriptions. Siamese networks, on the other hand, can learn from available label metadata but do not scale accurately to extreme settings with millions of labels. DECAF addresses these challenges by learning metadata-enriched probabilistic label trees through a novel formulation that ensures metadata is shared among labels, thereby enabling DECAF to scale to millions of labels while achieving state-of-the-art accuracy. DECAF also introduces a classifier initialization strategy that further improves accuracy. Experiments on publicly available benchmark datasets, including AmazonTitles-2M, show that DECAF can be up to 4% more accurate than leading extreme classifiers. At the same time, DECAF is 10-80x faster at inference than leading deep extreme classifiers, which makes it suitable for critical real-world applications requiring real-time predictions in a few milliseconds. The code for DECAF will be made available at a public repository.
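
To make the idea of sharing metadata among labels concrete, the following is a minimal illustrative sketch and not the formulation used by DECAF. It assumes that each label has a short text description, that documents and label descriptions draw on one shared token-embedding table, and that each label's classifier is the mean embedding of its description plus a per-label residual vector. The function names, the toy embedding table, and the zero-initialized residuals are all hypothetical, and the probabilistic label tree that would shortlist labels at scale is omitted.

```python
import math

def mean_embedding(token_ids, table):
    """Average the embedding rows for the given token ids."""
    dim = len(table[0])
    total = [0.0] * dim
    for t in token_ids:
        for j in range(dim):
            total[j] += table[t][j]
    return [v / len(token_ids) for v in total]

def score_labels(doc_tokens, label_tokens, residuals, table):
    """Score every label for one document (assumed toy model).

    Each label classifier is the mean embedding of the label's text
    (built from the SAME table used for documents, so token knowledge
    is shared across labels) plus a per-label residual vector.
    """
    x = mean_embedding(doc_tokens, table)
    scores = []
    for toks, r in zip(label_tokens, residuals):
        text_part = mean_embedding(toks, table)
        w = [a + b for a, b in zip(text_part, r)]
        logit = sum(wi * xi for wi, xi in zip(w, x))
        scores.append(1.0 / (1.0 + math.exp(-logit)))  # sigmoid
    return scores

def top_k(scores, k):
    """Return the indices of the k highest-scoring labels."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

# Hypothetical toy data: 4 tokens with 2-dimensional embeddings.
TABLE = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
LABEL_TOKENS = [[0], [1], [3]]                       # one token list per label
RESIDUALS = [[0.0, 0.0] for _ in LABEL_TOKENS]       # assumed zero start

scores = score_labels([0, 2], LABEL_TOKENS, RESIDUALS, TABLE)
ranking = top_k(scores, 2)
print(ranking)
```

In this toy setup, a label with few training points still receives a meaningful classifier from its description alone, because the token embeddings are learned jointly across all labels. Scoring every label exhaustively, as done here, would not be feasible with millions of labels.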