
Product embeddings have been extensively investigated in recent years, serving as the cornerstone for a broad range of machine learning models in e-commerce. Despite the empirical success of product embeddings, little is known about how and why they work from a theoretical standpoint. Analogous results from natural language processing (NLP) often rely on domain-specific properties that do not transfer to the e-commerce setting, and the downstream machine learning tasks focus on different aspects of the embeddings. We take an e-commerce-oriented view of product embedding models and develop a complete theoretical account from both the representation learning and the learning theory perspectives. We prove that product embeddings trained by the widely adopted skip-gram negative sampling algorithm and its variants constitute a sufficient dimension reduction with respect to a critical product relatedness measure, and that the generalization performance on downstream machine learning tasks is controlled by the alignment between the embedding space and the spectral space of the product relatedness measure. Following these theoretical discoveries, we conduct exploratory experiments that support the theoretical insights on product embeddings.
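
To make the two central objects concrete, the following is a minimal sketch, not the paper's code: it trains product embeddings with skip-gram negative sampling (SGNS) on toy user sessions via gensim, builds a shifted positive PMI (SPPMI) co-occurrence matrix as an illustrative stand-in for the product relatedness measure (Levy and Goldberg, 2014, show SGNS implicitly factorizes such a matrix), and then measures spectral alignment between the embedding space and the top spectral subspace of that matrix via principal angles. The session data, embedding dimension, and the choice of SPPMI are all assumptions made for illustration.

```python
# Minimal illustrative sketch; the sessions, dimensions, and SPPMI
# relatedness measure are assumptions, not the paper's actual setup.
import numpy as np
from gensim.models import Word2Vec

# Toy browse/purchase sessions; each session is a sequence of product IDs.
sessions = [
    ["p1", "p2", "p3", "p4"], ["p2", "p3", "p5"], ["p4", "p5", "p6"],
    ["p1", "p6", "p7"], ["p3", "p7", "p8"], ["p2", "p8", "p5"],
] * 50  # repeat so the toy corpus has enough co-occurrence counts

# SGNS: sg=1 selects skip-gram, negative=5 enables negative sampling.
model = Word2Vec(
    sessions, vector_size=4, window=2, sg=1, negative=5,
    min_count=1, epochs=20, seed=0,
)

products = sorted(model.wv.key_to_index)
E = np.stack([model.wv[p] for p in products])  # (n_products, dim)

# Shifted positive PMI matrix as a stand-in relatedness measure.
idx = {p: i for i, p in enumerate(products)}
C = np.zeros((len(products), len(products)))
for s in sessions:
    for i, a in enumerate(s):
        for b in s[max(0, i - 2):i + 3]:  # same +/-2 window as training
            if a != b:
                C[idx[a], idx[b]] += 1
P = C / C.sum()
pa, pb = P.sum(1, keepdims=True), P.sum(0, keepdims=True)
with np.errstate(divide="ignore"):
    pmi = np.log(P / (pa * pb))
sppmi = np.maximum(pmi - np.log(5), 0)  # shift by log(#negatives)

# Spectral alignment: principal angles between the embedding column space
# and the top-k spectral subspace of SPPMI; cosines near 1 mean alignment.
k = E.shape[1]
U = np.linalg.svd(sppmi)[0][:, :k]  # top-k left singular vectors
Q = np.linalg.qr(E)[0]              # orthonormal basis for the embeddings
cosines = np.linalg.svd(U.T @ Q, compute_uv=False)
print("principal-angle cosines:", np.round(cosines, 3))
```

On this toy corpus the cosines should be close to 1, illustrating the kind of embedding-to-spectrum alignment that the abstract ties to downstream generalization; the paper's formal statements concern the actual relatedness measure, not this SPPMI surrogate.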