
Jointly Learning Word Embeddings and Latent Topics

Published: 07 August 2017

Abstract
Word embedding models such as Skip-gram learn a vector-space representation for each word, based on the local word collocation patterns that are observed in a text corpus. Latent topic models, on the other hand, take a more global view, looking at the word distributions across the corpus to assign a topic to each word occurrence. These two paradigms are complementary in how they represent the meaning of word occurrences. While some previous works have already looked at using word embeddings for improving the quality of latent topics, and conversely, at using latent topics for improving word embeddings, such "two-step" methods cannot capture the mutual interaction between the two paradigms. In this paper, we propose STE, a framework which can learn word embeddings and latent topics in a unified manner. STE naturally obtains topic-specific word embeddings, and thus addresses the issue of polysemy. At the same time, it also learns the term distributions of the topics, and the topic distributions of the documents. Our experimental results demonstrate that the STE model can indeed generate useful topic-specific word embeddings and coherent latent topics in an effective and efficient way.
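To make the joint-learning idea concrete, the sketch below pairs topic-specific input embeddings with shared context embeddings and alternates an E-step (assigning topic responsibilities to each word-context pair, weighted by the document's current topic mixture) with an M-step (responsibility-weighted skip-gram gradient updates, followed by re-estimating the document-topic distributions). This is a minimal illustration, not the published STE implementation: the toy corpus, the two-topic setup, the full-softmax likelihood (at realistic scale one would use negative sampling or hierarchical softmax), and all variable names are assumptions made for this sketch.

```python
# Minimal sketch of jointly learning topic-specific word embeddings and
# document-topic mixtures, in the spirit of STE. NOT the authors' code:
# corpus, hyperparameters, and the EM/SGD schedule are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)

docs = [["apple", "fruit", "pie"],        # "apple" as a fruit
        ["apple", "iphone", "mac"],       # "apple" as a company
        ["fruit", "pie", "iphone"]]
vocab = sorted({w for d in docs for w in d})
w2i = {w: i for i, w in enumerate(vocab)}
V, K, D, lr = len(vocab), 2, 8, 0.1       # vocab size, topics, dims, step size

U = rng.normal(0, 0.1, (K, V, D))         # topic-specific input embeddings
C = rng.normal(0, 0.1, (V, D))            # shared output (context) embeddings
theta = np.full((len(docs), K), 1.0 / K)  # per-document topic distributions

def context_probs(k, w):
    """Softmax distribution over context words, given word w under topic k."""
    s = C @ U[k, w]
    e = np.exp(s - s.max())
    return e / e.sum()

for epoch in range(50):
    for d, doc in enumerate(docs):
        ids = [w2i[w] for w in doc]
        resp_sum, n_pairs = np.zeros(K), 0
        for i, w in enumerate(ids):
            for j, c in enumerate(ids):
                if i == j:
                    continue
                # E-step: topic responsibilities for this (word, context) pair
                lik = np.array([context_probs(k, w)[c] for k in range(K)])
                r = theta[d] * lik
                r /= r.sum()
                resp_sum += r
                n_pairs += 1
                # M-step (SGD): responsibility-weighted skip-gram gradients
                for k in range(K):
                    p = context_probs(k, w)
                    onehot = np.zeros(V)
                    onehot[c] = 1.0
                    grad_u = C[c] - p @ C                # d log p(c|w,k) / dU
                    C += lr * r[k] * np.outer(onehot - p, U[k, w])
                    U[k, w] += lr * r[k] * grad_u
        # Re-estimate the document's topic mixture from the responsibilities
        theta[d] = resp_sum / n_pairs

# The same surface form now has one vector per topic, the polysemy effect
# described in the abstract: the two "apple" vectors typically drift apart.
a = w2i["apple"]
cos = U[0, a] @ U[1, a] / (np.linalg.norm(U[0, a]) * np.linalg.norm(U[1, a]))
print("apple/topic-0 vs apple/topic-1 cosine:", cos)
print("document-topic mixtures:\n", theta)
```

On this toy corpus the two "apple" vectors tend to separate, one toward the fruit/pie context and one toward the iphone/mac context, while theta recovers per-document topic proportions; both outputs correspond to what the abstract claims STE learns jointly rather than in two steps.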




    Published In

    SIGIR '17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
    August 2017
    1476 pages
    ISBN:9781450350228
    DOI:10.1145/3077136
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. document modeling
    2. topic model
    3. word embedding

    Qualifiers

    • Research-article

    Funding Sources

• Microsoft Research Asia Urban Informatics Grant
• ERC Starting Grant
• Research Grants Council of the Hong Kong Special Administrative Region, China

    Conference

    SIGIR '17

    Acceptance Rates

SIGIR '17 Paper Acceptance Rate: 78 of 362 submissions, 22%
Overall Acceptance Rate: 792 of 3,983 submissions, 20%

Article Metrics

• Downloads (last 12 months): 16
• Downloads (last 6 weeks): 2

    Cited By

• (2024) CoTE: A Flexible Method for Joint Learning of Topic and Embedding Models. Web and Big Data, 406-421. DOI: 10.1007/978-981-97-2421-5_27. Online publication date: 12-May-2024.
• (2023) Class-Specific Word Sense Aware Topic Modeling via Soft Orthogonalized Topics. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 1218-1227. DOI: 10.1145/3583780.3614809. Online publication date: 21-Oct-2023.
• (2023) Contextualized Word Embeddings via Generative Adversarial Learning of Syntagmatic and Paradigmatic Structure. 2023 6th International Conference on Software Engineering and Computer Science (CSECS), 1-8. DOI: 10.1109/CSECS60003.2023.10428465. Online publication date: 22-Dec-2023.
• (2023) A decision support system in precision medicine: contrastive multimodal learning for patient stratification. Annals of Operations Research. DOI: 10.1007/s10479-023-05545-6. Online publication date: 29-Aug-2023.
• (2023) Exclusive Topic Model. Research Papers in Statistical Inference for Time Series and Related Models, 83-109. DOI: 10.1007/978-981-99-0803-5_3. Online publication date: 1-Jun-2023.
• (2022) A supervised topic embedding model and its application. PLOS ONE 17(11), e0277104. DOI: 10.1371/journal.pone.0277104. Online publication date: 4-Nov-2022.
• (2022) Neural Embedding Allocation: Distributed Representations of Topic Models. Computational Linguistics 48(4), 1021-1052. DOI: 10.1162/coli_a_00457. Online publication date: 1-Dec-2022.
• (2022) Collaborative Filtering With Network Representation Learning for Citation Recommendation. IEEE Transactions on Big Data 8(5), 1233-1246. DOI: 10.1109/TBDATA.2020.3034976. Online publication date: 1-Oct-2022.
• (2022) Multi-scaled Topic Embedding for Text Classification. 2022 7th International Conference on Computational Intelligence and Applications (ICCIA), 52-58. DOI: 10.1109/ICCIA55271.2022.9828449. Online publication date: 24-Jun-2022.
• (2022) Topic Modeling for Short Texts Via Dual View Collaborate Optimization. 2022 7th IEEE International Conference on Data Science in Cyberspace (DSC), 160-166. DOI: 10.1109/DSC55868.2022.00028. Online publication date: Jul-2022.
