Showing 1–21 of 21 results for author: Jacovi, A

  1. arXiv:2407.00402  [pdf, other]

    cs.CL cs.AI

    Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP

    Authors: Omer Goldman, Alon Jacovi, Aviv Slobodkin, Aviya Maimon, Ido Dagan, Reut Tsarfaty

    Abstract: Improvements in language models' capabilities have pushed their applications towards longer contexts, making long-context evaluation and development an active research area. However, many disparate use-cases are grouped together under the umbrella term of "long-context", defined simply by the total length of the model's input, including - for example - Needle-in-a-Haystack tasks, book summarizatio…

    Submitted 11 July, 2024; v1 submitted 29 June, 2024; originally announced July 2024.

  2. arXiv:2406.13632  [pdf, other]

    cs.CL

    Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations

    Authors: Arie Cattan, Alon Jacovi, Alex Fabrikant, Jonathan Herzig, Roee Aharoni, Hannah Rashkin, Dror Marcus, Avinatan Hassidim, Yossi Matias, Idan Szpektor, Avi Caciularu

    Abstract: Despite recent advancements in Large Language Models (LLMs), their performance on tasks involving long contexts remains sub-optimal. In-Context Learning (ICL) with few-shot examples may be an appealing solution to enhance LLM performance in this scenario; however, naively adding ICL examples with long context introduces challenges, including substantial token overhead added for each few-shot examp…

    Submitted 23 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

  3. arXiv:2406.03618  [pdf, other]

    cs.CL

    TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools

    Authors: Avi Caciularu, Alon Jacovi, Eyal Ben-David, Sasha Goldshtein, Tal Schuster, Jonathan Herzig, Gal Elidan, Amir Globerson

    Abstract: Large Language Models (LLMs) often do not perform well on queries that require the aggregation of information across texts. To better evaluate this setting and facilitate modeling efforts, we introduce TACT - Text And Calculations through Tables, a dataset crafted to evaluate LLMs' reasoning and computational abilities using complex instructions. TACT contains challenging instructions that demand…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Website (https://tact-benchmark.github.io), Huggingface (https://huggingface.co/datasets/google/TACT)

  4. arXiv:2402.00559  [pdf, other]

    cs.CL

    A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains

    Authors: Alon Jacovi, Yonatan Bitton, Bernd Bohnet, Jonathan Herzig, Or Honovich, Michael Tseng, Michael Collins, Roee Aharoni, Mor Geva

    Abstract: Prompting language models to provide step-by-step answers (e.g., "Chain-of-Thought") is the prominent approach for complex reasoning tasks, where more accurate reasoning chains typically improve downstream task performance. Recent literature discusses automatic methods to verify reasoning to evaluate and improve their correctness. However, no fine-grained step-level datasets are available to enabl…

    Submitted 21 May, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Accepted to ACL 2024

  5. arXiv:2310.10062  [pdf, other]

    cs.CL cs.AI

    A Comprehensive Evaluation of Tool-Assisted Generation Strategies

    Authors: Alon Jacovi, Avi Caciularu, Jonathan Herzig, Roee Aharoni, Bernd Bohnet, Mor Geva

    Abstract: A growing area of research investigates augmenting language models with tools (e.g., search engines, calculators) to overcome their shortcomings (e.g., missing or incorrect knowledge, incorrect logical inferences). Various few-shot tool-usage strategies have been proposed. However, there is no systematic and fair comparison across different strategies, or between these strategies and strong baseli…

    Submitted 28 December, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 Findings

  6. arXiv:2310.03392  [pdf]

    cs.HC cs.AI

    Unpacking Human-AI Interaction in Safety-Critical Industries: A Systematic Literature Review

    Authors: Tita A. Bach, Jenny K. Kristiansen, Aleksandar Babic, Alon Jacovi

    Abstract: Ensuring quality human-AI interaction (HAII) in safety-critical industries is essential. Failure to do so can lead to catastrophic and deadly consequences. Despite this urgency, what little research there is on HAII is fragmented and inconsistent. We present here a survey of that literature and recommendations for research best practices that will improve the field. We divided our investigation in…

    Submitted 5 October, 2023; originally announced October 2023.

    Comments: Under review in IEEE Access, September 2023. 23 pages, 2 figures, 9 tables

  7. arXiv:2305.10160  [pdf, other]

    cs.CL cs.AI

    Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination by Evaluation Benchmarks

    Authors: Alon Jacovi, Avi Caciularu, Omer Goldman, Yoav Goldberg

    Abstract: Data contamination has become prevalent and challenging with the rise of models pretrained on large automatically-crawled corpora. For closed models, the training data becomes a trade secret, and even for open models, it is not trivial to detect contamination. Strategies such as leaderboards with hidden answers, or using test data which is guaranteed to be unseen, are expensive and become fragile…

    Submitted 18 October, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: Accepted to EMNLP 2023

  8. arXiv:2305.02679  [pdf, other]

    cs.CL cs.HC

    Neighboring Words Affect Human Interpretation of Saliency Explanations

    Authors: Alon Jacovi, Hendrik Schuff, Heike Adel, Ngoc Thang Vu, Yoav Goldberg

    Abstract: Word-level saliency explanations ("heat maps over words") are often used to communicate feature-attribution in text-based models. Recent studies found that superficial factors such as word length can distort human interpretation of the communicated saliency scores. We conduct a user study to investigate how the marking of a word's neighboring words affects the explainee's perception of the word's i…

    Submitted 6 May, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of ACL 2023

  9. arXiv:2301.05433  [pdf, other]

    cs.AI

    Trends in Explainable AI (XAI) Literature

    Authors: Alon Jacovi

    Abstract: The XAI literature is decentralized, both in terminology and in publication venues, but recent years saw the community converge around keywords that make it possible to more reliably discover papers automatically. We use keyword search using the SemanticScholar API and manual curation to collect a well-formatted and reasonably comprehensive set of 5199 XAI papers, available at https://github.com/a…

    Submitted 13 January, 2023; originally announced January 2023.

  10. arXiv:2201.11569  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    Human Interpretation of Saliency-based Explanation Over Text

    Authors: Hendrik Schuff, Alon Jacovi, Heike Adel, Yoav Goldberg, Ngoc Thang Vu

    Abstract: While a lot of research in explainable AI focuses on producing effective explanations, less work is devoted to the question of how people understand and interpret the explanation. In this work, we focus on this question through a study of saliency-based explanations over textual data. Feature-attribution explanations of text models aim to communicate which parts of the input text were more influen…

    Submitted 17 June, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

    Comments: FAccT 2022

  11. Diagnosing AI Explanation Methods with Folk Concepts of Behavior

    Authors: Alon Jacovi, Jasmijn Bastings, Sebastian Gehrmann, Yoav Goldberg, Katja Filippova

    Abstract: We investigate a formalism for the conditions of a successful explanation of AI. We consider "success" to depend not only on what information the explanation contains, but also on what information the human explainee understands from it. Theory of mind literature discusses the folk concepts that humans use to understand and generalize behavior. We posit that folk concepts of behavior provide us wi…

    Submitted 15 November, 2023; v1 submitted 26 January, 2022; originally announced January 2022.

    Comments: Accepted to JAIR (Vol. 78, 2023)

    Journal ref: Journal of Artificial Intelligence Research 78 (2023) 459-489

  12. arXiv:2103.01378  [pdf, other]

    cs.CL cs.AI cs.LG

    Contrastive Explanations for Model Interpretability

    Authors: Alon Jacovi, Swabha Swayamdipta, Shauli Ravfogel, Yanai Elazar, Yejin Choi, Yoav Goldberg

    Abstract: Contrastive explanations clarify why an event occurred in contrast to another. They are more inherently intuitive to humans to both produce and comprehend. We propose a methodology to produce contrastive explanations for classification models by modifying the representation to disregard non-contrastive information, and modifying model behavior to only be based on contrastive reasoning. Our method…

    Submitted 14 September, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

    Comments: Accepted to EMNLP 2021 as a long paper

  13. arXiv:2010.07487  [pdf, other]

    cs.AI cs.CY

    Formalizing Trust in Artificial Intelligence: Prerequisites, Causes and Goals of Human Trust in AI

    Authors: Alon Jacovi, Ana Marasović, Tim Miller, Yoav Goldberg

    Abstract: Trust is a central component of the interaction between people and AI, in that 'incorrect' levels of trust may cause misuse, abuse or disuse of the technology. But what, precisely, is the nature of trust in AI? What are the prerequisites and goals of the cognitive mechanism of trust, and how can we promote them, or assess whether they are being satisfied in a given interaction? This work aims to a…

    Submitted 20 January, 2021; v1 submitted 14 October, 2020; originally announced October 2020.

    Comments: Accepted to ACM FAccT 2021

  14. arXiv:2010.03656  [pdf, other]

    cs.CL

    Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data

    Authors: Shachar Rosenman, Alon Jacovi, Yoav Goldberg

    Abstract: The process of collecting and annotating training data may introduce distribution artifacts which may limit the ability of models to learn correct generalization behavior. We identify failure modes of SOTA relation extraction (RE) models trained on TACRED, which we attribute to limitations in the data annotation process. We collect and annotate a challenge-set we call Challenging RE (CRE), based o…

    Submitted 7 October, 2020; originally announced October 2020.

    Comments: Accepted as a short paper in EMNLP 2020

  15. arXiv:2006.01067  [pdf, other]

    cs.CL cs.AI

    Aligning Faithful Interpretations with their Social Attribution

    Authors: Alon Jacovi, Yoav Goldberg

    Abstract: We find that the requirement of model interpretations to be faithful is vague and incomplete. With interpretation by textual highlights as a case-study, we present several failure cases. Borrowing concepts from social science, we identify that the problem is a misalignment between the causal chain of decisions (causal attribution) and the attribution of human behavior to the interpretation (social…

    Submitted 14 January, 2021; v1 submitted 1 June, 2020; originally announced June 2020.

    Comments: Accepted as a journal paper to TACL

  16. arXiv:2006.00995  [pdf, other]

    cs.CL

    Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals

    Authors: Yanai Elazar, Shauli Ravfogel, Alon Jacovi, Yoav Goldberg

    Abstract: A growing body of work makes use of probing to investigate the working of neural models, often considered black boxes. Recently, an ongoing debate emerged surrounding the limitations of the probing paradigm. In this work, we point out the inability to infer behavioral conclusions from probing results and offer an alternative method that focuses on how the information is being used, rather than on…

    Submitted 19 February, 2021; v1 submitted 1 June, 2020; originally announced June 2020.

    Comments: TACL journal. Initial title was: "When Bert Forgets How To POS: Amnesic Probing of Linguistic Properties and MLM Predictions"

  17. arXiv:2004.03685  [pdf, other]

    cs.CL cs.LG

    Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?

    Authors: Alon Jacovi, Yoav Goldberg

    Abstract: With the growing popularity of deep-learning based NLP models, comes a need for interpretable systems. But what is interpretability, and what constitutes a high-quality interpretation? In this opinion piece we reflect on the current state of interpretability evaluation research. We call for more clearly differentiating between different desired criteria an interpretation should satisfy, and focus…

    Submitted 27 April, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

    Comments: Accepted to ACL 2020

  18. arXiv:1910.13339  [pdf, other]

    cs.CL cs.LG

    Scalable Evaluation and Improvement of Document Set Expansion via Neural Positive-Unlabeled Learning

    Authors: Alon Jacovi, Gang Niu, Yoav Goldberg, Masashi Sugiyama

    Abstract: We consider the situation in which a user has collected a small set of documents on a cohesive topic, and they want to retrieve additional documents on this topic from a large collection. Information Retrieval (IR) solutions treat the document set as a query, and look for similar documents in the collection. We propose to extend the IR approach by treating the problem as an instance of positive-un…

    Submitted 14 January, 2021; v1 submitted 29 October, 2019; originally announced October 2019.

    Comments: Accepted as a long paper to EACL 2021

  19. arXiv:1901.03995  [pdf, other]

    cs.LG stat.ML

    Neural network gradient-based learning of black-box function interfaces

    Authors: Alon Jacovi, Guy Hadash, Einat Kermany, Boaz Carmeli, Ofer Lavi, George Kour, Jonathan Berant

    Abstract: Deep neural networks work well at approximating complicated functions when provided with data and trained by gradient descent methods. At the same time, there is a vast amount of existing functions that programmatically solve different tasks in a precise manner eliminating the need for training. In many cases, it is possible to decompose a task to a series of functions, of which for some we may pr…

    Submitted 13 January, 2019; originally announced January 2019.

    Comments: Published as a conference paper at ICLR 2019

  20. arXiv:1809.08037  [pdf, other]

    cs.CL

    Understanding Convolutional Neural Networks for Text Classification

    Authors: Alon Jacovi, Oren Sar Shalom, Yoav Goldberg

    Abstract: We present an analysis into the inner workings of Convolutional Neural Networks (CNNs) for processing text. CNNs used for computer vision can be interpreted by projecting filters into image space, but for discrete sequence inputs CNNs remain a mystery. We aim to understand the method by which the networks process and classify text. We examine common hypotheses to this problem: that filters, accomp…

    Submitted 27 April, 2020; v1 submitted 21 September, 2018; originally announced September 2018.

    Comments: Accepted to "Analyzing and interpreting neural networks for NLP" workshop in EMNLP 2018. v2: Added link to online github implementation

  21. arXiv:1804.09028  [pdf, other]

    cs.LG cs.CL stat.ML

    Estimate and Replace: A Novel Approach to Integrating Deep Neural Networks with Existing Applications

    Authors: Guy Hadash, Einat Kermany, Boaz Carmeli, Ofer Lavi, George Kour, Alon Jacovi

    Abstract: Existing applications include a huge amount of knowledge that is out of reach for deep neural networks. This paper presents a novel approach for integrating calls to existing applications into deep learning architectures. Using this approach, we estimate each application's functionality with an estimator, which is implemented as a deep neural network (DNN). The estimator is then embedded into a ba…

    Submitted 24 April, 2018; originally announced April 2018.