Tomas Mikolov [email protected], Google Inc., Mountain View, CA; Kai Chen [email protected], Google Inc., Mountain View, CA; Greg Corrado [email protected], Google Inc., Mountain View, CA; Jeffrey Dean, Google Inc., Mountain View, CA (2013)
The paper proposes two novel model architectures for efficiently learning continuous vector representations of words from very large datasets, achieving better performance on a word similarity task than previous techniques. The proposed models, Continuous Bag-of-Words (CBOW) and Skip-gram, leverage large corpora (up to 1.6 billion words) to derive high-quality embeddings and deliver state-of-the-art accuracy in measuring syntactic and semantic word relationships. The authors evaluate the learned vectors on a comprehensive test set of 8869 semantic and 10675 syntactic questions, reporting notable improvements in both accuracy and computational efficiency over prior models. The paper also discusses how model complexity can be managed while maximizing accuracy, and compares the new architectures against recurrent neural network language models (RNNLM) and feedforward neural network language models (NNLM). The architectures aim to balance training complexity against word representation quality, enabling potential applications in NLP tasks such as machine translation and information retrieval.
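The semantic and syntactic test questions are answered with simple vector arithmetic followed by a nearest-neighbour search over the learned embeddings. The sketch below is a minimal illustration of that evaluation step, not the authors' code; the `embeddings` matrix, `vocab` mapping, and `analogy` function name are assumptions made for the example.

```python
import numpy as np

def analogy(embeddings, vocab, word_a, word_b, word_c):
    """Return the word whose vector is closest (by cosine similarity) to
    vector(word_b) - vector(word_a) + vector(word_c).
    Example question: a="man", b="king", c="woman" should ideally yield "queen".
    `embeddings` is a (vocab_size, dim) array; `vocab` maps word -> row index."""
    target = embeddings[vocab[word_b]] - embeddings[vocab[word_a]] + embeddings[vocab[word_c]]
    target /= np.linalg.norm(target)
    # Normalize every embedding row so the dot product equals cosine similarity.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    scores = normed @ target
    # Exclude the three question words from the candidate answers.
    for w in (word_a, word_b, word_c):
        scores[vocab[w]] = -np.inf
    inv_vocab = {i: w for w, i in vocab.items()}
    return inv_vocab[int(np.argmax(scores))]
```

A question from the test set counts as correct only if the closest remaining word exactly matches the expected answer.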
This paper employs the following methods (the first two are illustrated in the sketch after this list):
- Continuous Bag-of-Words (CBOW)
- Skip-gram
- Feedforward Neural Network Language Model (NNLM), as a baseline
- Recurrent Neural Network Language Model (RNNLM), as a baseline
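As a rough illustration of the two proposed architectures (not the authors' implementation), the sketch below extracts the training pairs each model is built on: CBOW predicts the current word from its surrounding context, while Skip-gram predicts each surrounding word from the current word. The window size, tokenization, and function name are illustrative assumptions.

```python
def training_pairs(tokens, window=2):
    """Yield (context_words, center_word) pairs for CBOW and
    (center_word, context_word) pairs for Skip-gram. Window size is illustrative."""
    cbow, skipgram = [], []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        # CBOW: the (averaged) context representation predicts the center word.
        cbow.append((context, center))
        # Skip-gram: the center word predicts each context word separately.
        skipgram.extend((center, c) for c in context)
    return cbow, skipgram

cbow_pairs, skipgram_pairs = training_pairs("the quick brown fox jumps".split())
```

In both cases the models drop the expensive non-linear hidden layer of the NNLM, which is what keeps training tractable on billion-word corpora.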
The following datasets were used in this research:
- A large training corpus of up to 1.6 billion words
- A test set of 8869 semantic and 10675 syntactic word relationship questions
The authors identified the following limitations: