Showing 1–7 of 7 results for author: Krstovski, K

  1. arXiv:2308.12381  [pdf, other]

    cs.CL cs.AI cs.LG

    Inferring gender from name: a large scale performance evaluation study

    Authors: Kriste Krstovski, Yao Lu, Ye Xu

    Abstract: A person's gender is a crucial piece of information when performing research across a wide range of scientific disciplines, such as medicine, sociology, political science, and economics, to name a few. However, in increasing instances, especially given the proliferation of big data, gender information is not readily available. In such cases researchers need to infer gender from readily available i…

    Submitted 22 August, 2023; originally announced August 2023.

  2. arXiv:2209.08129  [pdf, other]

    cs.CV cs.CL cs.LG

    Evons: A Dataset for Fake and Real News Virality Analysis and Prediction

    Authors: Kriste Krstovski, Angela Soomin Ryu, Bruce Kogut

    Abstract: We present a novel collection of news articles originating from fake and real news media sources for the analysis and prediction of news virality. Unlike existing fake news datasets which either contain claims or news article headline and body, in this collection each article is supported with a Facebook engagement count which we consider as an indicator of the article virality. In addition we als…

    Submitted 16 September, 2022; originally announced September 2022.

  3. arXiv:2010.07289  [pdf, other]

    q-fin.ST cs.LG stat.ML

    Choosing News Topics to Explain Stock Market Returns

    Authors: Paul Glasserman, Kriste Krstovski, Paul Laliberte, Harry Mamaysky

    Abstract: We analyze methods for selecting topics in news articles to explain stock returns. We find, through empirical and theoretical results, that supervised Latent Dirichlet Allocation (sLDA) implemented through Gibbs sampling in a stochastic EM algorithm will often overfit returns to the detriment of the topic model. We obtain better out-of-sample performance through a random search of plain LDA models…

    Submitted 13 October, 2020; originally announced October 2020.

  4. arXiv:2004.12864  [pdf, other]

    cs.CL

    DeSePtion: Dual Sequence Prediction and Adversarial Examples for Improved Fact-Checking

    Authors: Christopher Hidey, Tuhin Chakrabarty, Tariq Alhindi, Siddharth Varia, Kriste Krstovski, Mona Diab, Smaranda Muresan

    Abstract: The increased focus on misinformation has spurred development of data and systems for detecting the veracity of a claim as well as retrieving authoritative evidence. The Fact Extraction and VERification (FEVER) dataset provides such a resource for evaluating end-to-end fact-checking, requiring retrieval of evidence from Wikipedia to validate a veracity prediction. We show that current systems for…

    Submitted 27 April, 2020; originally announced April 2020.

    Comments: ACL 2020

  5. arXiv:1803.09123  [pdf, other]

    stat.ML cs.CL cs.LG

    Equation Embeddings

    Authors: Kriste Krstovski, David M. Blei

    Abstract: We present an unsupervised approach for discovering semantic representations of mathematical equations. Equations are challenging to analyze because each is unique, or nearly unique. Our method, which we call equation embeddings, finds good representations of equations by using the representations of their surrounding words. We used equation embeddings to analyze four collections of scientific art…

    Submitted 24 March, 2018; originally announced March 2018.

    Comments: 12 pages, 2 figures

  6. arXiv:1712.06704  [pdf, ps, other]

    stat.ML cs.CL cs.IR

    Multilingual Topic Models

    Authors: Kriste Krstovski, Michael J. Kurtz, David A. Smith, Alberto Accomazzi

    Abstract: Scientific publications have evolved several features for mitigating vocabulary mismatch when indexing, retrieving, and computing similarity between articles. These mitigation strategies range from simply focusing on high-value article sections, such as titles and abstracts, to assigning keywords, often from controlled vocabularies, either manually or through automatic annotation. Various document…

    Submitted 18 December, 2017; originally announced December 2017.

    Comments: 18 pages, 9 figures

  7. arXiv:1601.01611  [pdf, other]

    cs.IR

    Automatic Construction of Evaluation Sets and Evaluation of Document Similarity Models in Large Scholarly Retrieval Systems

    Authors: Kriste Krstovski, David A. Smith, Michael J. Kurtz

    Abstract: Retrieval systems for scholarly literature offer the ability for the scientific community to search, explore and download scholarly articles across various scientific disciplines. Mostly used by the experts in the particular field, these systems contain user community logs including information on user specific downloaded articles. In this paper we present a novel approach for automatically evalua…

    Submitted 7 January, 2016; originally announced January 2016.