Showing 1–50 of 70 results for author: Weikum, G

  1. arXiv:2405.02732  [pdf, other]

    cs.CL cs.IR

    Recall Them All: Retrieval-Augmented Language Models for Long Object List Extraction from Long Documents

    Authors: Sneha Singhania, Simon Razniewski, Gerhard Weikum

    Abstract: Methods for relation extraction from text mostly focus on high precision, at the cost of limited recall. High recall is crucial, though, to populate long lists of object entities that stand in a specific relation with a given subject. Cues for relevant objects can be spread across many passages in long texts. This poses the challenge of extracting long lists from long texts. We present the L3X met…

    Submitted 4 May, 2024; originally announced May 2024.

  2. arXiv:2402.15400  [pdf, other]

    cs.IR cs.CL

    Faithful Temporal Question Answering over Heterogeneous Sources

    Authors: Zhen Jia, Philipp Christmann, Gerhard Weikum

    Abstract: Temporal question answering (QA) involves time constraints, with phrases such as "... in 2019" or "... before COVID". In the former, time is an explicit condition, in the latter it is implicit. State-of-the-art methods have limitations along three dimensions. First, with neural inference, time constraints are merely soft-matched, giving room to invalid or inexplicable answers. Second, questions wi…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted at WWW 2024

  3. arXiv:2402.10689  [pdf, other]

    cs.CL

    Multi-Cultural Commonsense Knowledge Distillation

    Authors: Tuan-Phong Nguyen, Simon Razniewski, Gerhard Weikum

    Abstract: Despite recent progress, large language models (LLMs) still face the challenge of appropriately reacting to the intricacies of social and cultural conventions. This paper presents MANGO, a methodology for distilling high-accuracy, high-recall assertions of cultural knowledge. We judiciously and iteratively prompt LLMs for this purpose from two entry points, concepts and cultures. Outputs are conso…

    Submitted 17 April, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

    Comments: 20 pages, 5 figures, 13 tables

  4. arXiv:2311.01314  [pdf, other]

    cs.IR

    Recommendations by Concise User Profiles from Review Text

    Authors: Ghazaleh Haratinezhad Torbati, Anna Tigunova, Andrew Yates, Gerhard Weikum

    Abstract: Recommender systems are most successful for popular items and users with ample interactions (likes, ratings etc.). This work addresses the difficult and underexplored case of supporting users who have very sparse interactions but post informative review texts. Our experimental studies address two book communities with these characteristics. We design a framework with Transformer-based representati…

    Submitted 13 December, 2023; v1 submitted 2 November, 2023; originally announced November 2023.

  5. arXiv:2310.14771  [pdf, other]

    cs.CL cs.AI

    Evaluating the Knowledge Base Completion Potential of GPT

    Authors: Blerta Veseli, Simon Razniewski, Jan-Christoph Kalo, Gerhard Weikum

    Abstract: Structured knowledge bases (KBs) are an asset for search engines and other applications, but are inevitably incomplete. Language models (LMs) have been proposed for unsupervised knowledge base completion (KBC), yet, their ability to do this at scale and with high accuracy remains an open question. Prior experimental studies mostly fall short because they only evaluate on popular subjects, or sampl…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: 12 pages 4 tables

    Journal ref: Findings of EMNLP 2023

  6. arXiv:2310.13505  [pdf, other]

    cs.CL cs.AI cs.IR

    Robust Training for Conversational Question Answering Models with Reinforced Reformulation Generation

    Authors: Magdalena Kaiser, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Models for conversational question answering (ConvQA) over knowledge graphs (KGs) are usually trained and tested on benchmarks of gold QA pairs. This implies that training is limited to surface forms seen in the respective datasets, and evaluation is on a small set of held-out questions. Through our proposed framework REIGN, we take several steps to remedy this restricted learning setup. First, we…

    Submitted 16 February, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

    Comments: WSDM 2024 Research Paper, 11 pages

  7. arXiv:2307.03122  [pdf, other]

    cs.CL

    Extracting Multi-valued Relations from Language Models

    Authors: Sneha Singhania, Simon Razniewski, Gerhard Weikum

    Abstract: The widespread usage of latent language representations via pre-trained language models (LMs) suggests that they are a promising source of structured knowledge. However, existing methods focus only on a single object per subject-relation pair, even though often multiple objects are correct. To overcome this limitation, we analyze these representations for their potential to yield materialized mult…

    Submitted 7 July, 2023; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: Accepted to Repl4NLP Workshop at ACL 2023

  8. arXiv:2306.17472  [pdf, other]

    cs.CL

    Knowledge Base Completion for Long-Tail Entities

    Authors: Lihu Chen, Simon Razniewski, Gerhard Weikum

    Abstract: Despite their impressive scale, knowledge bases (KBs), such as Wikidata, still contain significant gaps. Language models (LMs) have been proposed as a source for filling these gaps. However, prior works have focused on prominent entities with rich coverage by LMs, neglecting the crucial case of long-tail entities. In this paper, we present a novel method for LM-based-KB completion that is specific…

    Submitted 30 June, 2023; originally announced June 2023.

    Comments: In ACL23 (MATCHING workshop)

  9. arXiv:2306.12235  [pdf, other]

    cs.IR

    CompMix: A Benchmark for Heterogeneous Question Answering

    Authors: Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Fact-centric question answering (QA) often requires access to multiple, heterogeneous, information sources. By jointly considering several sources like a knowledge base (KB), a text collection, and tables from the web, QA systems can enhance their answer coverage and confidence. However, existing QA benchmarks are mostly constructed with a single source of knowledge in mind. This limits capabiliti…

    Submitted 19 August, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

  10. arXiv:2305.01548  [pdf, other]

    cs.IR

    Explainable Conversational Question Answering over Heterogeneous Sources via Iterative Graph Neural Networks

    Authors: Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: In conversational question answering, users express their information needs through a series of utterances with incomplete context. Typical ConvQA methods rely on a single source (a knowledge base (KB), or a text corpus, or a set of tables), thus being unable to benefit from increased answer coverage and redundancy of multiple sources. Our method EXPLAIGNN overcomes these limitations by integratin…

    Submitted 18 July, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

    Comments: Accepted at SIGIR 2023 (extended version)

  11. arXiv:2303.11082  [pdf, other]

    cs.CL cs.AI

    Evaluating Language Models for Knowledge Base Completion

    Authors: Blerta Veseli, Sneha Singhania, Simon Razniewski, Gerhard Weikum

    Abstract: Structured knowledge bases (KBs) are a foundation of many intelligent applications, yet are notoriously incomplete. Language models (LMs) have recently been proposed for unsupervised knowledge base completion (KBC), yet, despite encouraging initial results, questions regarding their suitability remain open. Existing evaluations often fall short because they only evaluate on popular subjects, or sa…

    Submitted 20 March, 2023; originally announced March 2023.

    Comments: Data and code available at https://github.com/bveseli/LMsForKBC

    Journal ref: ESWC 2023

  12. arXiv:2303.04532  [pdf, ps, other]

    cs.IR cs.AI

    Class Cardinality Comparison as a Fermi Problem

    Authors: Shrestha Ghosh, Simon Razniewski, Gerhard Weikum

    Abstract: Questions on class cardinality comparisons are quite tricky to answer and come with their own challenges. They require some kind of reasoning since web documents and knowledge bases, indispensable sources of information, rarely store direct answers to questions, such as, "Are there more astronauts or Physics Nobel Laureates?" We tackle questions on class cardinality comparison by tapping into thre…

    Submitted 8 March, 2023; originally announced March 2023.

    Comments: Accepted to the Web Conference 2023

  13. Extracting Cultural Commonsense Knowledge at Scale

    Authors: Tuan-Phong Nguyen, Simon Razniewski, Aparna Varde, Gerhard Weikum

    Abstract: Structured knowledge is important for many AI applications. Commonsense knowledge, which is crucial for robust human-centric AI, is covered by a small number of structured knowledge projects. However, they lack knowledge about human traits and behaviors conditioned on socio-cultural contexts, which is crucial for situative AI. This paper presents CANDLE, an end-to-end methodology for extracting hi…

    Submitted 10 May, 2023; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: 11 pages, 6 figures, 10 tables

    Journal ref: ACM Web Conference 2023

  14. Answering Count Questions with Structured Answers from Text

    Authors: Shrestha Ghosh, Simon Razniewski, Gerhard Weikum

    Abstract: In this work we address the challenging case of answering count queries in web search, such as "number of songs by John Lennon". Prior methods merely answer these with a single, and sometimes puzzling number or return a ranked list of text snippets with different numbers. This paper proposes a methodology for answering count queries with inference, contextualization and explanatory evidence. Unl…

    Submitted 15 September, 2022; originally announced September 2022.

    Comments: arXiv admin note: text overlap with arXiv:2204.05039

  15. UnCommonSense: Informative Negative Knowledge about Everyday Concepts

    Authors: Hiba Arnaout, Simon Razniewski, Gerhard Weikum, Jeff Z. Pan

    Abstract: Commonsense knowledge about everyday concepts is an important asset for AI applications, such as question answering and chatbots. Recently, we have seen an increasing interest in the construction of structured commonsense knowledge bases (CSKBs). An important part of human commonsense is about properties that do not apply to concepts, yet existing CSKBs only store positive statements. Moreover, si…

    Submitted 5 September, 2022; v1 submitted 19 August, 2022; originally announced August 2022.

  16. arXiv:2204.11677  [pdf, other]

    cs.IR cs.CL

    Conversational Question Answering on Heterogeneous Sources

    Authors: Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Conversational question answering (ConvQA) tackles sequential information needs where contexts in follow-up questions are left implicit. Current ConvQA systems operate over homogeneous sources of information: either a knowledge base (KB), or a text corpus, or a collection of tables. This paper addresses the novel issue of jointly tapping into all of these together, this way boosting answer coverag…

    Submitted 30 June, 2023; v1 submitted 25 April, 2022; originally announced April 2022.

    Comments: SIGIR 2022 Research Track Long Paper

  17. Answering Count Queries with Explanatory Evidence

    Authors: Shrestha Ghosh, Simon Razniewski, Gerhard Weikum

    Abstract: A challenging case in web search and question answering are count queries, such as "number of songs by John Lennon". Prior methods merely answer these with a single, and sometimes puzzling number or return a ranked list of text snippets with different numbers. This paper proposes a methodology for answering count queries with inference, contextualization and explanatory evidence. Unlike p…

    Submitted 30 August, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: Version published at SIGIR 2022

  18. Refined Commonsense Knowledge from Large-Scale Web Contents

    Authors: Tuan-Phong Nguyen, Simon Razniewski, Julien Romero, Gerhard Weikum

    Abstract: Commonsense knowledge (CSK) about concepts and their properties is helpful for AI applications. Prior works, such as ConceptNet, have compiled large CSK collections. However, they are restricted in their expressiveness to subject-predicate-object (SPO) triples with simple concepts for S and strings for P and O. This paper presents a method called ASCENT++ to automatically build a large-scale knowl…

    Submitted 23 June, 2022; v1 submitted 30 November, 2021; originally announced December 2021.

    Comments: This is a substantial extension of the previous WWW paper: arXiv:2011.00905

    Journal ref: IEEE Transactions on Knowledge and Data Engineering, 2022

  19. arXiv:2111.13611  [pdf, other]

    cs.CL cs.AI

    Predicting Document Coverage for Relation Extraction

    Authors: Sneha Singhania, Simon Razniewski, Gerhard Weikum

    Abstract: This paper presents a new task of predicting the coverage of a text document for relation extraction (RE): does the document contain many relational tuples for a given entity? Coverage predictions are useful in selecting the best documents for knowledge base construction with large input corpora. To study this problem, we present a dataset of 31,366 diverse documents for 520 entities. We analyze t…

    Submitted 26 November, 2021; originally announced November 2021.

    Comments: To appear in TACL. The arXiv version is a pre-MIT Press publication version

  20. arXiv:2110.04888  [pdf, other]

    cs.CL cs.AI cs.DB

    Language Models As or For Knowledge Bases

    Authors: Simon Razniewski, Andrew Yates, Nora Kassner, Gerhard Weikum

    Abstract: Pre-trained language models (LMs) have recently gained attention for their potential as an alternative to (or proxy for) explicit knowledge bases (KBs). In this position paper, we examine this hypothesis, identify strengths and limitations of both LMs and KBs, and discuss the complementary nature of the two paradigms. In particular, we offer qualitative arguments that latent LMs are not suitable a…

    Submitted 10 October, 2021; originally announced October 2021.

    Journal ref: DL4KG 2021

  21. Complex Temporal Question Answering on Knowledge Graphs

    Authors: Zhen Jia, Soumajit Pramanik, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Question answering over knowledge graphs (KG-QA) is a vital topic in IR. Questions with temporal intent are a special class of practical importance, but have not received much attention in research. This work presents EXAQT, the first end-to-end system for answering complex temporal questions that have multiple entities and predicates, and associated temporal conditions. EXAQT answers natural lang…

    Submitted 18 September, 2021; originally announced September 2021.

    Comments: CIKM 2021 Long Paper, 11 pages

  22. arXiv:2109.04716  [pdf, other]

    cs.IR

    You Get What You Chat: Using Conversations to Personalize Search-based Recommendations

    Authors: Ghazaleh Haratinezhad Torbati, Andrew Yates, Gerhard Weikum

    Abstract: Prior work on personalized recommendations has focused on exploiting explicit signals from user-specific queries, clicks, likes, and ratings. This paper investigates tapping into a different source of implicit signals of interests and tastes: online chats between users. The paper develops an expressive model and effective methods for personalizing search-based entity recommendations. User models d…

    Submitted 10 September, 2021; originally announced September 2021.

  23. arXiv:2109.04713  [pdf, other]

    cs.IR

    Personalized Entity Search by Sparse and Scrutable User Profiles

    Authors: Ghazaleh Haratinezhad Torbati, Andrew Yates, Gerhard Weikum

    Abstract: Prior work on personalizing web search results has focused on considering query-and-click logs to capture users' individual interests. For product search, extensive user histories about purchases and ratings have been exploited. However, for general entity search, such as for books on specific topics or travel destinations with certain features, personalization is largely underexplored. In this pap…

    Submitted 10 September, 2021; originally announced September 2021.

  24. arXiv:2109.04432  [pdf, other]

    cs.LG cs.IR stat.ML

    Detecting and Mitigating Test-time Failure Risks via Model-agnostic Uncertainty Learning

    Authors: Preethi Lahoti, Krishna P. Gummadi, Gerhard Weikum

    Abstract: Reliably predicting potential failure risks of machine learning (ML) systems when deployed with production data is a crucial aspect of trustworthy AI. This paper introduces Risk Advisor, a novel post-hoc meta-learner for estimating failure risks and predictive uncertainties of any already-trained black-box classification model. In addition to providing a risk score, the Risk Advisor decomposes the…

    Submitted 9 September, 2021; originally announced September 2021.

    Comments: To appear in the 21st IEEE International Conference on Data Mining (ICDM 2021), Auckland, New Zealand

  25. arXiv:2108.08614  [pdf]

    cs.IR cs.CL

    UNIQORN: Unified Question Answering over RDF Knowledge Graphs and Natural Language Text

    Authors: Soumajit Pramanik, Jesujoba Alabi, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Question answering over RDF data like knowledge graphs has been greatly advanced, with a number of good systems providing crisp answers for natural language questions or telegraphic queries. Some of these systems incorporate textual sources as additional evidence for the answering process, but cannot compute answers that are present in text alone. Conversely, the IR and NLP communities have addres…

    Submitted 10 October, 2023; v1 submitted 19 August, 2021; originally announced August 2021.

    Comments: 24 pages

    ACM Class: H.3.3

  26. arXiv:2108.08597  [pdf, other]

    cs.IR cs.CL

    Beyond NED: Fast and Effective Search Space Reduction for Complex Question Answering over Knowledge Bases

    Authors: Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Answering complex questions over knowledge bases (KB-QA) faces huge input data with billions of facts, involving millions of entities and thousands of predicates. For efficiency, QA systems first reduce the answer search space by identifying a set of facts that is likely to contain all answers and relevant cues. The most common technique for doing this is to apply named entity disambiguation (NED)…

    Submitted 4 April, 2022; v1 submitted 19 August, 2021; originally announced August 2021.

    Comments: WSDM 2022 Research Track Long Paper (Extended version)

  27. Inside ASCENT: Exploring a Deep Commonsense Knowledge Base and its Usage in Question Answering

    Authors: Tuan-Phong Nguyen, Simon Razniewski, Gerhard Weikum

    Abstract: ASCENT is a fully automated methodology for extracting and consolidating commonsense assertions from web contents (Nguyen et al., WWW 2021). It advances traditional triple-based commonsense knowledge representation by capturing semantic facets like locations and purposes, and composite concepts, i.e., subgroups and related aspects of subjects. In this demo, we present a web portal that allows user…

    Submitted 28 May, 2021; originally announced May 2021.

    Comments: Demo website: https://ascent.mpi-inf.mpg.de; introductory video: https://youtu.be/qMkJXqu_Yd4

    Journal ref: ACL 2021 system demonstration

  28. Reinforcement Learning from Reformulations in Conversational Question Answering over Knowledge Graphs

    Authors: Magdalena Kaiser, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: The rise of personal assistants has made conversational question answering (ConvQA) a very popular mechanism for user-system interaction. State-of-the-art methods for ConvQA over knowledge graphs (KGs) can only learn from crisp question-answer pairs found in popular benchmarks. In reality, however, such training data is hard to come by: users would rarely mark answers explicitly as correct or wron…

    Submitted 20 August, 2021; v1 submitted 11 May, 2021; originally announced May 2021.

    Comments: SIGIR 2021 Long Paper, 11 pages

  29. arXiv:2102.09388  [pdf, other]

    cs.IR cs.AI cs.LG

    ELIXIR: Learning from User Feedback on Explanations to Improve Recommender Models

    Authors: Azin Ghazimatin, Soumajit Pramanik, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: System-provided explanations for recommendations are an important component towards transparent and trustworthy AI. In state-of-the-art research, this is a one-way signal, though, to improve user acceptance. In this paper, we turn the role of explanations around and investigate how they can contribute to enhancing the quality of the generated recommendations themselves. We devise a human-in-the-lo…

    Submitted 30 April, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: WWW 2021, 11 pages

  30. arXiv:2011.06844  [pdf, other]

    cs.CL

    Cross-Domain Learning for Classifying Propaganda in Online Contents

    Authors: Liqiang Wang, Xiaoyu Shen, Gerard de Melo, Gerhard Weikum

    Abstract: As news and social media exhibit an increasing amount of manipulative polarized content, detecting such propaganda has received attention as a new task for content analysis. Prior work has focused on supervised learning with training data from the same domain. However, as propaganda can be subtle and keeps evolving, manual identification and proper labeling are very demanding. As a consequence, tr…

    Submitted 22 November, 2020; v1 submitted 13 November, 2020; originally announced November 2020.

    Comments: TTO 2020

  31. Advanced Semantics for Commonsense Knowledge Extraction

    Authors: Tuan-Phong Nguyen, Simon Razniewski, Gerhard Weikum

    Abstract: Commonsense knowledge (CSK) about concepts and their properties is useful for AI applications such as robust chatbots. Prior works like ConceptNet, TupleKB and others compiled large CSK collections, but are restricted in their expressiveness to subject-predicate-object (SPO) triples with simple concepts for S and monolithic strings for P and O. Also, these projects have either prioritized precisio…

    Submitted 25 October, 2022; v1 submitted 2 November, 2020; originally announced November 2020.

    Comments: 12 pages, 3 figures, 11 tables

    Journal ref: Proceedings of the Web Conference 2021 (WWW '21)

  32. arXiv:2009.11564  [pdf, other]

    cs.AI cs.DB

    Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases

    Authors: Gerhard Weikum, Luna Dong, Simon Razniewski, Fabian Suchanek

    Abstract: Equipping machines with comprehensive knowledge of the world's entities and their relationships has been a long-standing goal of AI. Over the last decade, large-scale knowledge bases, also known as knowledge graphs, have been automatically constructed from web contents and text sources, and have become a key asset for search engines. This machine knowledge can be harnessed to semantically interpre…

    Submitted 22 March, 2021; v1 submitted 24 September, 2020; originally announced September 2020.

    Comments: Submitted to Foundations and Trends in Databases

    Journal ref: Foundations and Trends in Databases, 2021

  33. arXiv:2005.03529  [pdf, other]

    cs.IR cs.AI cs.DB

    CounQER: A System for Discovering and Linking Count Information in Knowledge Bases

    Authors: Shrestha Ghosh, Simon Razniewski, Gerhard Weikum

    Abstract: Predicate constraints of general-purpose knowledge bases (KBs) like Wikidata, DBpedia and Freebase are often limited to subproperty, domain and range constraints. In this demo we showcase CounQER, a system that illustrates the alignment of counting predicates, like staffSize, and enumerating predicates, like workInstitution^{-1}. In the demonstration session, attendees can inspect these alignment…

    Submitted 7 May, 2020; originally announced May 2020.

    Comments: Accepted at ESWC 2020

  34. arXiv:2004.13117  [pdf, other]

    cs.IR cs.CL

    Conversational Question Answering over Passages by Leveraging Word Proximity Networks

    Authors: Magdalena Kaiser, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Question answering (QA) over text passages is a problem of long-standing interest in information retrieval. Recently, the conversational setting has attracted attention, where a user asks a sequence of questions to satisfy her information needs around a topic. While this setup is a natural one and similar to humans conversing with each other, it introduces two key research challenges: understandin…

    Submitted 25 May, 2020; v1 submitted 27 April, 2020; originally announced April 2020.

    Comments: SIGIR 2020 Demonstrations

  35. arXiv:2003.03155  [pdf, other]

    cs.DB cs.IR

    Uncovering Hidden Semantics of Set Information in Knowledge Bases

    Authors: Shrestha Ghosh, Simon Razniewski, Gerhard Weikum

    Abstract: Knowledge Bases (KBs) contain a wealth of structured information about entities and predicates. This paper focuses on set-valued predicates, i.e., the relationship between an entity and a set of entities. In KBs, this information is often represented in two formats: (i) via counting predicates such as numberOfChildren and staffSize, that store aggregated integers, and (ii) via enumerating predicat…

    Submitted 26 March, 2020; v1 submitted 6 March, 2020; originally announced March 2020.

    Comments: This work is under review in the Journal of Web Semantics, Special Issue on Language Technology and Knowledge Graphs. This is a revision draft

  36. arXiv:2001.04425  [pdf, other]

    cs.IR cs.AI cs.CL cs.DB

    Negative Statements Considered Useful

    Authors: Hiba Arnaout, Simon Razniewski, Gerhard Weikum, Jeff Z. Pan

    Abstract: Knowledge bases (KBs) about notable entities and their properties are an important asset in applications such as search, question answering and dialogue. All popular KBs capture virtually only positive statements, and abstain from taking any stance on statements not stored in the KB. This paper makes the case for explicitly stating salient statements that do not hold. Negative statements are usefu…

    Submitted 25 September, 2021; v1 submitted 13 January, 2020; originally announced January 2020.

    Journal ref: Journal of Web Semantics (JWS), Volume 71, 2021

  37. arXiv:2001.04170  [pdf, other]

    cs.CL cs.AI cs.IR

    Joint Reasoning for Multi-Faceted Commonsense Knowledge

    Authors: Yohan Chalier, Simon Razniewski, Gerhard Weikum

    Abstract: Commonsense knowledge (CSK) supports a variety of AI applications, from visual understanding to chatbots. Prior works on acquiring CSK, such as ConceptNet, have compiled statements that associate concepts, like everyday objects or activities, with properties that hold for most or some instances of the concept. Each concept is treated in isolation from other concepts, and the only quantitative meas…

    Submitted 4 May, 2020; v1 submitted 13 January, 2020; originally announced January 2020.

    Comments: 11 pages

    Journal ref: AKBC 2020

  38. arXiv:1911.08378  [pdf, other]

    cs.LG cs.AI stat.ML

    PRINCE: Provider-side Interpretability with Counterfactual Explanations in Recommender Systems

    Authors: Azin Ghazimatin, Oana Balalau, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Interpretable explanations for recommender systems and other machine learning models are crucial to gain user trust. Prior works that have focused on paths connecting users and items in a heterogeneous network have several limitations, such as discovering relationships rather than true explanations, or disregarding other users' privacy. In this work, we take a fresh perspective, and present PRINCE…

    Submitted 24 December, 2019; v1 submitted 19 November, 2019; originally announced November 2019.

    Comments: WSDM 2020, 9 pages

  39. arXiv:1911.02850  [pdf, other]

    cs.IR cs.CL

    CROWN: Conversational Passage Ranking by Reasoning over Word Networks

    Authors: Magdalena Kaiser, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Information needs around a topic cannot be satisfied in a single turn; users typically ask follow-up questions referring to the same theme and a system must be capable of understanding the conversational context of a request to retrieve correct answers. In this paper, we present our submission to the TREC Conversational Assistance Track 2019, in which such a conversational setting is explored. We…

    Submitted 11 February, 2020; v1 submitted 7 November, 2019; originally announced November 2019.

    Comments: TREC 2019, 14 pages

    Journal ref: TREC 2019

  40. arXiv:1910.06048  [pdf, other]

    cs.CL cs.AI cs.LG

    STANCY: Stance Classification Based on Consistency Cues

    Authors: Kashyap Popat, Subhabrata Mukherjee, Andrew Yates, Gerhard Weikum

    Abstract: Controversial claims are abundant in online media and discussion forums. A better understanding of such claims requires analyzing them from different perspectives. Stance classification is a necessary step for inferring these perspectives in terms of supporting or opposing the claim. In this work, we present a neural network model for stance classification leveraging BERT representations and augme…

    Submitted 14 October, 2019; originally announced October 2019.

    Comments: Accepted at EMNLP 2019

  41. Look before you Hop: Conversational Question Answering over Knowledge Graphs Using Judicious Context Expansion

    Authors: Philipp Christmann, Rishiraj Saha Roy, Abdalghani Abujabal, Jyotsna Singh, Gerhard Weikum

    Abstract: Fact-centric information needs are rarely one-shot; users typically ask follow-up questions to explore a topic. In such a conversational setting, the user's inputs are often incomplete, with entities or predicates left out, and ungrammatical phrases. This poses a huge challenge to question answering (QA) systems that typically rely on cues in full-fledged interrogative sentences. As a solution, we…

    Submitted 5 November, 2019; v1 submitted 8 October, 2019; originally announced October 2019.

    Comments: CIKM 2019 Long Paper, 10 pages

    Journal ref: CIKM 2019

  42. arXiv:1909.00749  [pdf, other]

    cs.IR

    Know2Look: Commonsense Knowledge for Visual Search

    Authors: Sreyasi Nag Chowdhury, Niket Tandon, Gerhard Weikum

    Abstract: With the rise in popularity of social media, images accompanied by contextual text form a huge section of the web. However, search and retrieval of documents are still largely dependent on solely textual cues. Although visual cues have started to gain focus, the imperfection in object/scene detection do not lead to significantly improved results. We hypothesize that the use of background commonsen…

    Submitted 2 September, 2019; originally announced September 2019.

    Comments: Published in AKBC 2016

    Journal ref: 5th Workshop on Automated Knowledge Base Construction (AKBC) 2016

  43. arXiv:1909.00741  [pdf, other]

    cs.MM cs.CV cs.IR

    VISIR: Visual and Semantic Image Label Refinement

    Authors: Sreyasi Nag Chowdhury, Niket Tandon, Hakan Ferhatosmanoglu, Gerhard Weikum

    Abstract: The social media explosion has populated the Internet with a wealth of images. There are two existing paradigms for image retrieval: 1) content-based image retrieval (CBIR), which has traditionally used visual features for similarity search (e.g., SIFT features), and 2) tag-based image retrieval (TBIR), which has relied on user tagging (e.g., Flickr tags). CBIR now gains semantic expressiveness by…

    Submitted 2 September, 2019; originally announced September 2019.

    Comments: Published in WSDM 2018

    Journal ref: ACM ISBN 978-1-4503-5581-0/18/02 2018

  44. arXiv:1909.00692  [pdf, other]

    cs.CL

    Story-oriented Image Selection and Placement

    Authors: Sreyasi Nag Chowdhury, Simon Razniewski, Gerhard Weikum

    Abstract: Multimodal contents have become commonplace on the Internet today, manifested as news articles, social media posts, and personal or business blog posts. Among the various kinds of media (images, videos, graphics, icons, audio) used in such multimodal stories, images are the most popular. The selection of images from a collection - either author's personal photo album, or web repositories - and the…

    Submitted 2 September, 2019; originally announced September 2019.

  45. TEQUILA: Temporal Question Answering over Knowledge Bases

    Authors: Zhen Jia, Abdalghani Abujabal, Rishiraj Saha Roy, Jannik Stroetgen, Gerhard Weikum

    Abstract: Question answering over knowledge bases (KB-QA) poses challenges in handling complex questions that need to be decomposed into sub-questions. An important case, addressed here, is that of temporal questions, where cues for temporal relations need to be discovered and handled. We present TEQUILA, an enabler method for temporal QA that can run on top of any KB-QA engine. TEQUILA has four stages. It…

    Submitted 25 January, 2021; v1 submitted 9 August, 2019; originally announced August 2019.

    Comments: CIKM 2018 Short Paper

    Journal ref: CIKM 2018

  46. arXiv:1908.03109  [pdf, other]

    cs.SI cs.LG stat.ML

    FAIRY: A Framework for Understanding Relationships between Users' Actions and their Social Feeds

    Authors: Azin Ghazimatin, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Users increasingly rely on social media feeds for consuming daily information. The items in a feed, such as news, questions, songs, etc., usually result from the complex interplay of a user's social contacts, her interests and her actions on the platform. The relationship of the user's own behavior and the received feed is often puzzling, and many users would like to have a clear explanation on wh…

    Submitted 5 November, 2019; v1 submitted 8 August, 2019; originally announced August 2019.

    Comments: WSDM 2019

    MSC Class: http://www.acm.org/about/class/1998

    Journal ref: WSDM 2019

  47. Answering Complex Questions by Joining Multi-Document Evidence with Quasi Knowledge Graphs

    Authors: Xiaolu Lu, Soumajit Pramanik, Rishiraj Saha Roy, Abdalghani Abujabal, Yafang Wang, Gerhard Weikum

    Abstract: Direct answering of questions that involve multiple entities and relations is a challenge for text-based QA. This problem is most pronounced when answers can be found only by joining evidence from multiple documents. Curated knowledge graphs (KGs) may yield good answers, but are limited by their inherent incompleteness and potential staleness. This paper presents QUEST, a method that can answer co…

    Submitted 28 November, 2020; v1 submitted 1 August, 2019; originally announced August 2019.

    Comments: SIGIR 2019 Long Paper, 10 pages

  48. Operationalizing Individual Fairness with Pairwise Fair Representations

    Authors: Preethi Lahoti, Krishna P. Gummadi, Gerhard Weikum

    Abstract: We revisit the notion of individual fairness proposed by Dwork et al. A central challenge in operationalizing their approach is the difficulty in eliciting a human specification of a similarity metric. In this paper, we propose an operationalization of individual fairness that does not rely on a human specification of a distance metric. Instead, we propose novel approaches to elicit and leverage s…

    Submitted 1 December, 2019; v1 submitted 2 July, 2019; originally announced July 2019.

    Comments: To be published in the proceedings of the VLDB Endowment, Vol. 13, Issue. 4

  49. arXiv:1905.10989  [pdf, other]

    cs.CL cs.AI cs.DB

    Commonsense Properties from Query Logs and Question Answering Forums

    Authors: Julien Romero, Simon Razniewski, Koninika Pal, Jeff Z. Pan, Archit Sakhadeo, Gerhard Weikum

    Abstract: Commonsense knowledge about object properties, human behavior and general concepts is crucial for robust AI applications. However, automatic acquisition of this knowledge is challenging because of sparseness and bias in online sources. This paper presents Quasimodo, a methodology and tool suite for distilling commonsense properties from non-standard web sources. We devise novel ways of tapping int…

    Submitted 10 February, 2021; v1 submitted 27 May, 2019; originally announced May 2019.

    Comments: Updated appendix reporting on Quasimodo v4.3 (2/2021)

    Journal ref: CIKM 2019

  50. arXiv:1904.10887  [pdf, other]

    cs.CL

    Listening between the Lines: Learning Personal Attributes from Conversations

    Authors: Anna Tigunova, Andrew Yates, Paramita Mirza, Gerhard Weikum

    Abstract: Open-domain dialogue agents must be able to converse about many topics while incorporating knowledge about the user into the conversation. In this work we address the acquisition of such knowledge, for personalization in downstream Web applications, by extracting personal attributes from conversations. This problem is more challenging than the established task of information extraction from scient…

    Submitted 24 April, 2019; originally announced April 2019.

    Comments: published in WWW'19