Showing 1–32 of 32 results for author: Alon, U

  1. arXiv:2405.00200  [pdf, other]

    cs.CL

    In-Context Learning with Long-Context Models: An In-Depth Exploration

    Authors: Amanda Bertsch, Maor Ivgi, Uri Alon, Jonathan Berant, Matthew R. Gormley, Graham Neubig

    Abstract: As model context lengths continue to increase, the number of demonstrations that can be provided in-context approaches the size of entire training datasets. We study the behavior of in-context learning (ICL) at this extreme scale on multiple datasets and models. We show that, for many datasets with large label spaces, performance continues to increase with hundreds or thousands of demonstrations.…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 27 pages; preprint

  2. arXiv:2402.09371  [pdf, other]

    cs.LG cs.AI cs.CL

    Transformers Can Achieve Length Generalization But Not Robustly

    Authors: Yongchao Zhou, Uri Alon, Xinyun Chen, Xuezhi Wang, Rishabh Agarwal, Denny Zhou

    Abstract: Length generalization, defined as the ability to extrapolate from shorter training sequences to longer test ones, is a significant challenge for language models. This issue persists even with large-scale Transformers handling relatively straightforward tasks. In this paper, we test the Transformer's ability of length generalization using the task of addition of two integers. We show that the succe…

    Submitted 14 February, 2024; originally announced February 2024.

  3. arXiv:2402.05403  [pdf, other]

    cs.CL cs.AI

    In-Context Principle Learning from Mistakes

    Authors: Tianjun Zhang, Aman Madaan, Luyu Gao, Steven Zheng, Swaroop Mishra, Yiming Yang, Niket Tandon, Uri Alon

    Abstract: In-context learning (ICL, also known as few-shot prompting) has been the standard method of adapting LLMs to downstream tasks, by learning from a few input-output examples. Nonetheless, all ICL-based approaches only learn from correct input-output pairs. In this paper, we revisit this paradigm, by learning more from the few given input-output examples. We introduce Learning Principles (LEAP): Firs…

    Submitted 9 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

  4. arXiv:2311.17311  [pdf, other]

    cs.CL cs.AI

    Universal Self-Consistency for Large Language Model Generation

    Authors: Xinyun Chen, Renat Aksitov, Uri Alon, Jie Ren, Kefan Xiao, Pengcheng Yin, Sushant Prakash, Charles Sutton, Xuezhi Wang, Denny Zhou

    Abstract: Self-consistency with chain-of-thought prompting (CoT) has demonstrated remarkable performance gains on various challenging tasks, by utilizing multiple reasoning paths sampled from large language models (LLMs). However, self-consistency relies on the answer extraction process to aggregate multiple solutions, which is not applicable to free-form answers. In this work, we propose Universal Self-Con…

    Submitted 28 November, 2023; originally announced November 2023.

  5. arXiv:2310.01602  [pdf, other]

    cs.SE cs.AI

    CAT-LM: Training Language Models on Aligned Code And Tests

    Authors: Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn

    Abstract: Testing is an integral part of the software development process. Yet, writing tests is time-consuming and therefore often neglected. Classical test generation tools such as EvoSuite generate behavioral test suites by optimizing for coverage, but tend to produce tests that are hard to understand. Language models trained on code can generate code that is highly similar to that written by humans, but…

    Submitted 2 October, 2023; originally announced October 2023.

  6. arXiv:2309.02389  [pdf, other]

    cs.SE

    Contextual Predictive Mutation Testing

    Authors: Kush Jain, Uri Alon, Alex Groce, Claire Le Goues

    Abstract: Mutation testing is a powerful technique for assessing and improving test suite quality that artificially introduces bugs and checks whether the test suites catch them. However, it is also computationally expensive and thus does not scale to large systems and projects. One promising recent approach to tackling this scalability problem uses machine learning to predict whether the tests will detect…

    Submitted 5 September, 2023; originally announced September 2023.

  7. arXiv:2307.13854  [pdf, other]

    cs.AI cs.CL cs.LG

    WebArena: A Realistic Web Environment for Building Autonomous Agents

    Authors: Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig

    Abstract: With advances in generative AI, there is now potential for autonomous agents to manage daily tasks via natural language commands. However, current agents are primarily created and tested in simplified synthetic environments, leading to a disconnect with real-world scenarios. In this paper, we build an environment for language-guided agents that is highly realistic and reproducible. Specifically, w…

    Submitted 16 April, 2024; v1 submitted 25 July, 2023; originally announced July 2023.

    Comments: Our code, data, environment reproduction resources, and video demonstrations are publicly available at https://webarena.dev/

  8. arXiv:2306.07941  [pdf, other]

    cs.CL cs.LG

    GPT-Calls: Enhancing Call Segmentation and Tagging by Generating Synthetic Conversations via Large Language Models

    Authors: Itzik Malkiel, Uri Alon, Yakir Yehuda, Shahar Keren, Oren Barkan, Royi Ronen, Noam Koenigstein

    Abstract: Transcriptions of phone calls are of significant value across diverse fields, such as sales, customer service, healthcare, and law enforcement. Nevertheless, the analysis of these recorded conversations can be an arduous and time-intensive process, especially when dealing with extended or multifaceted dialogues. In this work, we propose a novel method, GPT-distilled Calls Segmentation and Tagging…

    Submitted 9 June, 2023; originally announced June 2023.

  9. arXiv:2305.02582  [pdf, other]

    cs.LG

    On the Expressivity Role of LayerNorm in Transformers' Attention

    Authors: Shaked Brody, Uri Alon, Eran Yahav

    Abstract: Layer Normalization (LayerNorm) is an inherent component in all Transformer-based models. In this paper, we show that LayerNorm is crucial to the expressivity of the multi-head attention layer that follows it. This is in contrast to the common belief that LayerNorm's only role is to normalize the activations during the forward pass, and their gradients during the backward pass. We consider a geome…

    Submitted 11 May, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: Accepted as a short paper in Findings of ACL 2023

  10. arXiv:2305.01625  [pdf, other]

    cs.CL

    Unlimiformer: Long-Range Transformers with Unlimited Length Input

    Authors: Amanda Bertsch, Uri Alon, Graham Neubig, Matthew R. Gormley

    Abstract: Since the proposal of transformers, these models have been limited to bounded input lengths, because of their need to attend to every token in the input. In this work, we propose Unlimiformer: a general approach that wraps any existing pretrained encoder-decoder transformer, and offloads the cross-attention computation to a single k-nearest-neighbor (kNN) index, while the returned kNN distances ar…

    Submitted 30 October, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023

  11. arXiv:2303.17651  [pdf, other]

    cs.CL cs.AI cs.LG

    Self-Refine: Iterative Refinement with Self-Feedback

    Authors: Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, Peter Clark

    Abstract: Like humans, large language models (LLMs) do not always generate the best output on their first try. Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement. The main idea is to generate an initial output using an LLM; then, the same LLM provides feedback for its output and uses it…

    Submitted 25 May, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: Code, data, and demo at https://selfrefine.info/

  12. arXiv:2302.07867  [pdf, other]

    cs.SE cs.AI cs.LG cs.PF

    Learning Performance-Improving Code Edits

    Authors: Alexander Shypula, Aman Madaan, Yimeng Zeng, Uri Alon, Jacob Gardner, Milad Hashemi, Graham Neubig, Parthasarathy Ranganathan, Osbert Bastani, Amir Yazdanbakhsh

    Abstract: With the decline of Moore's law, optimizing program performance has become a major focus of software research. However, high-level optimizations such as API and algorithm changes remain elusive due to the difficulty of understanding the semantics of code. Simultaneously, pretrained large language models (LLMs) have demonstrated strong capabilities at solving a wide range of programming tasks. To t…

    Submitted 26 April, 2024; v1 submitted 15 February, 2023; originally announced February 2023.

    Comments: Published as a conference paper at ICLR 2024 (Spotlight). Project website: https://pie4perf.com/

  13. arXiv:2302.05527  [pdf, other]

    cs.SE cs.LG cs.PL

    CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code

    Authors: Shuyan Zhou, Uri Alon, Sumit Agarwal, Graham Neubig

    Abstract: Since the rise of neural natural-language-to-code models (NL->Code) that can generate long expressions and statements rather than a single next-token, one of the major problems has been reliably evaluating their generated output. In this paper, we propose CodeBERTScore: an evaluation metric for code generation, which builds on BERTScore (Zhang et al., 2020). Instead of encoding only the generated…

    Submitted 31 October, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

  14. arXiv:2301.02828  [pdf, other]

    cs.CL cs.LG

    Why do Nearest Neighbor Language Models Work?

    Authors: Frank F. Xu, Uri Alon, Graham Neubig

    Abstract: Language models (LMs) compute the probability of a text by sequentially computing a representation of an already-seen context and using this representation to predict the next word. Currently, most LMs calculate these representations through a neural network consuming the immediate previous context. However, retrieval-augmented LMs have recently been shown to improve over standard neural LMs, by access…

    Submitted 17 January, 2023; v1 submitted 7 January, 2023; originally announced January 2023.

    Comments: Preprint, 21 pages

  15. arXiv:2211.10435  [pdf, other]

    cs.CL cs.AI

    PAL: Program-aided Language Models

    Authors: Luyu Gao, Aman Madaan, Shuyan Zhou, Uri Alon, Pengfei Liu, Yiming Yang, Jamie Callan, Graham Neubig

    Abstract: Large language models (LLMs) have recently demonstrated an impressive ability to perform arithmetic and symbolic reasoning tasks, when provided with a few examples at test time ("few-shot prompting"). Much of this success can be attributed to prompting methods such as "chain-of-thought", which employ LLMs for both understanding the problem description by decomposing it into steps, as well as solv…

    Submitted 27 January, 2023; v1 submitted 18 November, 2022; originally announced November 2022.

    Comments: The first three authors contributed equally. Our code and data are publicly available at http://reasonwithpal.com/

  16. arXiv:2210.07128  [pdf, other]

    cs.CL cs.LG

    Language Models of Code are Few-Shot Commonsense Learners

    Authors: Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang, Graham Neubig

    Abstract: We address the general task of structured commonsense reasoning: given a natural language input, the goal is to generate a graph such as an event -- or a reasoning-graph. To employ large language models (LMs) for this task, existing approaches "serialize" the output graph as a flat list of nodes and edges. Although feasible, these serialized graphs strongly deviate from the natural language corp…

    Submitted 6 December, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022

  17. arXiv:2208.03471  [pdf, other]

    cs.LG cs.IT

    Oversquashing in GNNs through the lens of information contraction and graph expansion

    Authors: Pradeep Kr. Banerjee, Kedar Karhadkar, Yu Guang Wang, Uri Alon, Guido Montúfar

    Abstract: The quality of signal propagation in message-passing graph neural networks (GNNs) strongly influences their expressivity as has been observed in recent works. In particular, for prediction tasks relying on long-range interactions, recursive aggregation of node features can lead to an undesired phenomenon called "oversquashing". We present a framework for analyzing oversquashing based on informatio…

    Submitted 6 August, 2022; originally announced August 2022.

    Comments: 8 pages, 5 figures; Accepted at the 58th Annual Allerton Conference on Communication, Control, and Computing

  18. arXiv:2207.05987  [pdf, other]

    cs.CL cs.AI cs.SE

    DocPrompting: Generating Code by Retrieving the Docs

    Authors: Shuyan Zhou, Uri Alon, Frank F. Xu, Zhiruo Wang, Zhengbao Jiang, Graham Neubig

    Abstract: Publicly available source-code libraries are continuously growing and changing. This makes it impossible for models of code to keep current with all available APIs by simply training these models on existing code repositories. Thus, existing models inherently cannot generalize to using unseen functions and libraries, because these would never appear in the training data. In contrast, when human pr…

    Submitted 18 February, 2023; v1 submitted 13 July, 2022; originally announced July 2022.

    Comments: ICLR 2023 (notable-top-25%); code and data are available at https://github.com/shuyanzhou/docprompting

  19. arXiv:2202.13169  [pdf, other]

    cs.PL cs.CL

    A Systematic Evaluation of Large Language Models of Code

    Authors: Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn

    Abstract: Large language models (LMs) of code have recently shown tremendous promise in completing code and synthesizing code from natural language descriptions. However, the current state-of-the-art code LMs (e.g., Codex (Chen et al., 2021)) are not publicly available, leaving many questions about their model and data design decisions. We aim to fill in some of these blanks through a systematic evaluation…

    Submitted 4 May, 2022; v1 submitted 26 February, 2022; originally announced February 2022.

    Comments: DL4C@ICLR 2022, and MAPS@PLDI 2022

  20. arXiv:2201.12431  [pdf, other]

    cs.CL cs.LG

    Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval

    Authors: Uri Alon, Frank F. Xu, Junxian He, Sudipta Sengupta, Dan Roth, Graham Neubig

    Abstract: Retrieval-based language models (R-LM) model the probability of natural language text by combining a standard language model (LM) with examples retrieved from an external datastore at test time. While effective, a major bottleneck of using these models in practice is the computationally costly datastore search, which can be performed as frequently as every time step. In this paper, we present Reto…

    Submitted 9 June, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

    Comments: Accepted to ICML'2022. Code and models are available at https://github.com/neulab/retomaton

  21. arXiv:2105.14491  [pdf, other]

    cs.LG

    How Attentive are Graph Attention Networks?

    Authors: Shaked Brody, Uri Alon, Eran Yahav

    Abstract: Graph Attention Networks (GATs) are one of the most popular GNN architectures and are considered as the state-of-the-art architecture for representation learning with graphs. In GAT, every node attends to its neighbors given its own representation as the query. However, in this paper we show that GAT computes a very limited kind of attention: the ranking of the attention scores is unconditioned on…

    Submitted 31 January, 2022; v1 submitted 30 May, 2021; originally announced May 2021.

    Comments: Published in ICLR 2022

  22. Single-Node Attacks for Fooling Graph Neural Networks

    Authors: Ben Finkelshtein, Chaim Baskin, Evgenii Zheltonozhskii, Uri Alon

    Abstract: Graph neural networks (GNNs) have shown broad applicability in a variety of domains. These domains, e.g., social networks and product recommendations, are fertile ground for malicious users and behavior. In this paper, we show that GNNs are vulnerable to the extremely limited (and thus quite realistic) scenarios of a single-node adversarial attack, where the perturbed node cannot be chosen by the…

    Submitted 29 September, 2022; v1 submitted 6 November, 2020; originally announced November 2020.

    Comments: Appeared in Neurocomputing

  23. arXiv:2006.05205  [pdf, other]

    cs.LG stat.ML

    On the Bottleneck of Graph Neural Networks and its Practical Implications

    Authors: Uri Alon, Eran Yahav

    Abstract: Since the proposal of the graph neural network (GNN) by Gori et al. (2005) and Scarselli et al. (2008), one of the major problems in training GNNs was their struggle to propagate information between distant nodes in the graph. We propose a new explanation for this problem: GNNs are susceptible to a bottleneck when aggregating messages across a long path. This bottleneck causes the over-squashing o…

    Submitted 9 March, 2021; v1 submitted 9 June, 2020; originally announced June 2020.

    Comments: Accepted to ICLR'2021

  24. arXiv:2005.13209  [pdf, other]

    cs.PL cs.LG

    A Structural Model for Contextual Code Changes

    Authors: Shaked Brody, Uri Alon, Eran Yahav

    Abstract: We address the problem of predicting edit completions based on a learned model that was trained on past edits. Given a code snippet that is partially edited, our goal is to predict a completion of the edit for the rest of the snippet. We refer to this task as the EditCompletion task and present a novel approach for tackling it. The main idea is to directly represent structural edits. This allows u…

    Submitted 12 October, 2020; v1 submitted 27 May, 2020; originally announced May 2020.

    Comments: Accepted to OOPSLA 2020

  25. arXiv:1910.07517  [pdf, other]

    cs.LG cs.PL

    Adversarial Examples for Models of Code

    Authors: Noam Yefet, Uri Alon, Eran Yahav

    Abstract: Neural models of code have shown impressive results when performing tasks such as predicting method names and identifying certain kinds of bugs. We show that these models are vulnerable to adversarial examples, and introduce a novel approach for attacking trained models of code using adversarial examples. The main idea of our approach is to force a given trained model to make an incorrect predicti…

    Submitted 12 October, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

    Comments: Accepted to OOPSLA'2020

  26. arXiv:1910.00577  [pdf, other]

    cs.LG cs.PL stat.ML

    Structural Language Models of Code

    Authors: Uri Alon, Roy Sadaka, Omer Levy, Eran Yahav

    Abstract: We address the problem of any-code completion - generating a missing piece of source code in a given program without any restriction on the vocabulary or structure. We introduce a new approach to any-code completion that leverages the strict syntax of programming languages to model a code snippet as a tree - structural language modeling (SLM). SLM estimates the probability of the program's abstrac…

    Submitted 29 July, 2020; v1 submitted 30 September, 2019; originally announced October 2019.

    Comments: Appeared in ICML'2020

  27. arXiv:1902.09122  [pdf, other]

    cs.LG cs.CR cs.PL stat.ML

    Neural Reverse Engineering of Stripped Binaries using Augmented Control Flow Graphs

    Authors: Yaniv David, Uri Alon, Eran Yahav

    Abstract: We address the problem of reverse engineering of stripped executables, which contain no debug information. This is a challenging problem because of the low amount of syntactic information available in stripped executables, and the diverse assembly code patterns arising from compiler optimizations. We present a novel approach for predicting procedure names in stripped executables. Our approach co…

    Submitted 16 October, 2020; v1 submitted 25 February, 2019; originally announced February 2019.

  28. arXiv:1902.08295  [pdf, other]

    cs.LG stat.ML

    Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

    Authors: Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, et al. (66 additional authors not shown)

    Abstract: Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus towards sequence-to-sequence models. Lingvo models are composed of modular building blocks that are flexible and easily extensible, and experiment configurations are centralized and highly customizable. Distributed training and quantized inference are supported directly w…

    Submitted 21 February, 2019; originally announced February 2019.

  29. arXiv:1810.12170  [pdf, other]

    eess.AS cs.LG

    Contextual Speech Recognition with Difficult Negative Training Examples

    Authors: Uri Alon, Golan Pundak, Tara N. Sainath

    Abstract: Improving the representation of contextual information is key to unlocking the potential of end-to-end (E2E) automatic speech recognition (ASR). In this work, we present a novel and simple approach for training an ASR context mechanism with difficult negative examples. The main idea is to focus on proper nouns (e.g., unique entities such as names of people and places) in the reference transcript,…

    Submitted 29 October, 2018; originally announced October 2018.

  30. arXiv:1808.01400  [pdf, other]

    cs.LG cs.PL stat.ML

    code2seq: Generating Sequences from Structured Representations of Code

    Authors: Uri Alon, Shaked Brody, Omer Levy, Eran Yahav

    Abstract: The ability to generate natural language sequences from source code snippets has a variety of applications such as code summarization, documentation, and retrieval. Sequence-to-sequence (seq2seq) models, adopted from neural machine translation (NMT), have achieved state-of-the-art performance on these tasks by treating source code as a sequence of tokens. We present code2seq:…

    Submitted 21 February, 2019; v1 submitted 3 August, 2018; originally announced August 2018.

    Comments: Accepted to ICLR'2019

  31. arXiv:1803.09544  [pdf, other]

    cs.PL cs.LG

    A General Path-Based Representation for Predicting Program Properties

    Authors: Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

    Abstract: Predicting program properties such as names or expression types has a wide range of applications. It can ease the task of programming and increase programmer productivity. A major challenge when learning from programs is how to represent programs in a way that facilitates effective learning. We present a general path-based representation for learning from programs. Our repr…

    Submitted 22 April, 2018; v1 submitted 26 March, 2018; originally announced March 2018.

    Comments: to appear in PLDI 2018

  32. arXiv:1803.09473  [pdf, other]

    cs.LG cs.AI cs.PL stat.ML

    code2vec: Learning Distributed Representations of Code

    Authors: Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

    Abstract: We present a neural model for representing snippets of code as continuous distributed vectors ("code embeddings"). The main idea is to represent a code snippet as a single fixed-length code vector, which can be used to predict semantic properties of the snippet. This is performed by decomposing code to a collection of paths in its abstract syntax tree, and learning the atomic representa…

    Submitted 30 October, 2018; v1 submitted 26 March, 2018; originally announced March 2018.

    Comments: Accepted in POPL 2019