Showing 1–48 of 48 results for author: Ballesteros, M

  1. arXiv:2405.00204  [pdf, other]

    cs.CL cs.AI

    General Purpose Verification for Chain of Thought Prompting

    Authors: Robert Vacareanu, Anurag Pratik, Evangelia Spiliopoulou, Zheng Qi, Giovanni Paolini, Neha Anna John, Jie Ma, Yassine Benajiba, Miguel Ballesteros

    Abstract: Many of the recent capabilities demonstrated by Large Language Models (LLMs) arise primarily from their ability to exploit contextual information. In this paper, we explore ways to improve reasoning capabilities of LLMs through (1) exploration of different chains of thought and (2) validation of the individual steps of the reasoning process. We propose three general principles that a model should…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 22 pages, preprint

  2. arXiv:2402.18479  [pdf, other]

    cs.CL

    NewsQs: Multi-Source Question Generation for the Inquiring Mind

    Authors: Alyssa Hwang, Kalpit Dixit, Miguel Ballesteros, Yassine Benajiba, Vittorio Castelli, Markus Dreyer, Mohit Bansal, Kathleen McKeown

    Abstract: We present NewsQs (news-cues), a dataset that provides question-answer pairs for multiple news documents. To create NewsQs, we augment a traditional multi-document summarization dataset with questions automatically generated by a T5-Large model fine-tuned on FAQ-style news articles from the News On the Web corpus. We show that fine-tuning a model with control codes produces questions that are judg…

    Submitted 15 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: minor wording change

  3. arXiv:2305.17127  [pdf, other]

    cs.CL

    Characterizing and Measuring Linguistic Dataset Drift

    Authors: Tyler A. Chang, Kishaloy Halder, Neha Anna John, Yogarshi Vyas, Yassine Benajiba, Miguel Ballesteros, Dan Roth

    Abstract: NLP models often degrade in performance when real world data distributions differ markedly from training data. However, existing dataset drift metrics in NLP have generally not considered specific dimensions of linguistic drift that affect model performance, and they have not been validated in their ability to predict model performance at the individual example level, where such metrics are often…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023

  4. arXiv:2305.13191  [pdf, other]

    cs.CL cs.AI cs.LG

    Taxonomy Expansion for Named Entity Recognition

    Authors: Karthikeyan K, Yogarshi Vyas, Jie Ma, Giovanni Paolini, Neha Anna John, Shuai Wang, Yassine Benajiba, Vittorio Castelli, Dan Roth, Miguel Ballesteros

    Abstract: Training a Named Entity Recognition (NER) model often involves fixing a taxonomy of entity types. However, requirements evolve and we might need the NER model to recognize additional entity types. A simple approach is to re-annotate the entire dataset with both existing and additional entity types and then train the model on the re-annotated dataset. However, this is an extremely laborious task. To re…

    Submitted 22 May, 2023; originally announced May 2023.

  5. arXiv:2305.11979  [pdf, other]

    cs.CL

    A Weak Supervision Approach for Few-Shot Aspect Based Sentiment

    Authors: Robert Vacareanu, Siddharth Varia, Kishaloy Halder, Shuai Wang, Giovanni Paolini, Neha Anna John, Miguel Ballesteros, Smaranda Muresan

    Abstract: We explore how weak supervision on abundant unlabeled data can be leveraged to improve few-shot performance in aspect-based sentiment analysis (ABSA) tasks. We propose a pipeline approach to construct a noisy ABSA dataset, and we use it to adapt a pre-trained sequence-to-sequence model to the ABSA tasks. We test the resulting model on three widely used ABSA datasets, before and after fine-tuning.…

    Submitted 19 May, 2023; originally announced May 2023.

  6. arXiv:2305.11242  [pdf, other]

    cs.CL

    Comparing Biases and the Impact of Multilingual Training across Multiple Languages

    Authors: Sharon Levy, Neha Anna John, Ling Liu, Yogarshi Vyas, Jie Ma, Yoshinari Fujinuma, Miguel Ballesteros, Vittorio Castelli, Dan Roth

    Abstract: Studies in bias and fairness in natural language processing have primarily examined social biases within a single language and/or across few attributes (e.g. gender, race). However, biases can manifest differently across various languages for individual attributes. As a result, it is critical to examine biases within each language and attribute. Of equal importance is to study how these biases com…

    Submitted 18 May, 2023; originally announced May 2023.

  7. arXiv:2303.11660  [pdf, other]

    cs.CL

    Simple Yet Effective Synthetic Dataset Construction for Unsupervised Opinion Summarization

    Authors: Ming Shen, Jie Ma, Shuai Wang, Yogarshi Vyas, Kalpit Dixit, Miguel Ballesteros, Yassine Benajiba

    Abstract: Opinion summarization provides an important solution for summarizing opinions expressed among a large number of reviews. However, generating aspect-specific and general summaries is challenging due to the lack of annotated data. In this work, we propose two simple yet effective unsupervised approaches to generate both aspect-specific and general opinion summaries by training on synthetic datasets…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: EACL 2023 Findings

  8. arXiv:2302.12297  [pdf, other]

    cs.CL

    Dynamic Benchmarking of Masked Language Models on Temporal Concept Drift with Multiple Views

    Authors: Katerina Margatina, Shuai Wang, Yogarshi Vyas, Neha Anna John, Yassine Benajiba, Miguel Ballesteros

    Abstract: Temporal concept drift refers to the problem of data changing over time. In NLP, that would entail that language (e.g. new expressions, meaning shifts) and factual knowledge (e.g. new concepts, updated facts) evolve over time. Focusing on the latter, we benchmark 11 pretrained masked language models (MLMs) on a series of tests designed to evaluate the effect of temporal concept drift, as it is c…

    Submitted 23 February, 2023; originally announced February 2023.

    Comments: To appear at EACL 2023. Our code will be available at https://github.com/amazon-science/temporal-robustness

  9. arXiv:2211.04903  [pdf, other]

    cs.CL

    Novel Chapter Abstractive Summarization using Spinal Tree Aware Sub-Sentential Content Selection

    Authors: Hardy Hardy, Miguel Ballesteros, Faisal Ladhak, Muhammad Khalifa, Vittorio Castelli, Kathleen McKeown

    Abstract: Summarizing novel chapters is a difficult task due to the input length and the fact that sentences that appear in the desired summaries draw content from multiple places throughout the chapter. We present a pipelined extractive-abstractive approach where the extractive step filters the content that is passed to the abstractive component. Extremely lengthy input also results in a highly skewed data…

    Submitted 9 November, 2022; originally announced November 2022.

  10. arXiv:2210.06629  [pdf, other]

    cs.CL

    Instruction Tuning for Few-Shot Aspect-Based Sentiment Analysis

    Authors: Siddharth Varia, Shuai Wang, Kishaloy Halder, Robert Vacareanu, Miguel Ballesteros, Yassine Benajiba, Neha Anna John, Rishita Anubhai, Smaranda Muresan, Dan Roth

    Abstract: Aspect-based Sentiment Analysis (ABSA) is a fine-grained sentiment analysis task which involves four elements from user-generated texts: aspect term, aspect category, opinion term, and sentiment polarity. Most computational approaches focus on some of the ABSA sub-tasks such as tuple (aspect term, sentiment polarity) or triplet (aspect term, opinion term, sentiment polarity) extraction using eithe…

    Submitted 11 June, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: Camera ready copy for WASSA at ACL 2023

  11. arXiv:2210.05613  [pdf, other]

    cs.CL cs.AI

    Contrastive Training Improves Zero-Shot Classification of Semi-structured Documents

    Authors: Muhammad Khalifa, Yogarshi Vyas, Shuai Wang, Graham Horwood, Sunil Mallya, Miguel Ballesteros

    Abstract: We investigate semi-structured document classification in a zero-shot setting. Classification of semi-structured documents is more challenging than that of standard unstructured documents, as positional, layout, and style information play a vital role in interpreting such documents. The standard classification setting where categories are fixed during both training and testing falls short in dynam…

    Submitted 11 October, 2022; originally announced October 2022.

  12. arXiv:2204.11117  [pdf, other]

    cs.CL cs.LG

    Exploring the Role of Task Transferability in Large-Scale Multi-Task Learning

    Authors: Vishakh Padmakumar, Leonard Lausen, Miguel Ballesteros, Sheng Zha, He He, George Karypis

    Abstract: Recent work has found that multi-task training with a large number of diverse tasks can uniformly improve downstream performance on unseen target tasks. In contrast, literature on task transferability has established that the choice of intermediate tasks can heavily affect downstream task performance. In this work, we aim to disentangle the effect of scale and relatedness of tasks in multi-task re…

    Submitted 12 July, 2022; v1 submitted 23 April, 2022; originally announced April 2022.

    Comments: NAACL 2022 - Camera ready version

  13. arXiv:2203.08985  [pdf, other]

    cs.CL

    Label Semantics for Few Shot Named Entity Recognition

    Authors: Jie Ma, Miguel Ballesteros, Srikanth Doss, Rishita Anubhai, Sunil Mallya, Yaser Al-Onaizan, Dan Roth

    Abstract: We study the problem of few shot learning for named entity recognition. Specifically, we leverage the semantic information in the names of the labels as a way of giving the model additional signal and enriched priors. We propose a neural architecture that consists of two BERT encoders, one to encode the document and its tokens and another one to encode each of the labels in natural language format…

    Submitted 16 March, 2022; originally announced March 2022.

    Comments: Findings of ACL 2022

  14. arXiv:2112.08345  [pdf, other]

    cs.CV

    Reliable Multi-Object Tracking in the Presence of Unreliable Detections

    Authors: Travis Mandel, Mark Jimenez, Emily Risley, Taishi Nammoto, Rebekka Williams, Max Panoff, Meynard Ballesteros, Bobbie Suarez

    Abstract: Recent multi-object tracking (MOT) systems have leveraged highly accurate object detectors; however, training such detectors requires large amounts of labeled data. Although such data is widely available for humans and vehicles, it is significantly more scarce for other animal species. We present Robust Confidence Tracking (RCT), an algorithm designed to maintain robust performance even when detec…

    Submitted 7 November, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: The full journal version of this article (published in Pattern Recognition, Vol. 135) can be found at https://www.sciencedirect.com/science/article/pii/S0031320322005878. The article is open access. The source code and dataset can be found at https://github.com/tmandel/fish-detrac

  15. arXiv:2109.08232  [pdf, other]

    cs.CL

    A Bag of Tricks for Dialogue Summarization

    Authors: Muhammad Khalifa, Miguel Ballesteros, Kathleen McKeown

    Abstract: Dialogue summarization comes with its own peculiar challenges as opposed to news or scientific articles summarization. In this work, we explore four different challenges of the task: handling and differentiating parts of the dialogue belonging to multiple speakers, negation understanding, reasoning about the situation, and informal language understanding. Using a pretrained sequence-to-sequence la…

    Submitted 16 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 - short paper

  16. arXiv:2109.03160  [pdf, other]

    cs.CL

    How much pretraining data do language models need to learn syntax?

    Authors: Laura Pérez-Mayos, Miguel Ballesteros, Leo Wanner

    Abstract: Transformer-based pretrained language models achieve outstanding results in many well-known NLU benchmarks. However, while pretraining methods are very convenient, they are expensive in terms of time and resources. This calls for a study of the impact of pretraining data size on the knowledge of the models. We explore this impact on the syntactic capabilities of RoBERTa, using models trained on i…

    Submitted 9 September, 2021; v1 submitted 7 September, 2021; originally announced September 2021.

    Comments: To be published in proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021)

  17. arXiv:2104.08413  [pdf, other]

    cs.CL

    Sequential Cross-Document Coreference Resolution

    Authors: Emily Allaway, Shuai Wang, Miguel Ballesteros

    Abstract: Relating entities and events in text is a key component of natural language understanding. Cross-document coreference resolution, in particular, is important for the growing interest in multi-document analysis tasks. In this work we propose a new model that extends the efficient sequential prediction paradigm for coreference resolution to cross-document settings and achieves competitive results fo…

    Submitted 16 April, 2021; originally announced April 2021.

  18. arXiv:2101.11492  [pdf, other]

    cs.CL

    On the Evolution of Syntactic Information Encoded by BERT's Contextualized Representations

    Authors: Laura Pérez-Mayos, Roberto Carlini, Miguel Ballesteros, Leo Wanner

    Abstract: The adaptation of pretrained language models to solve supervised tasks has become a baseline in NLP, and many recent works have focused on studying how linguistic information is encoded in the pretrained sentence representations. Among other information, it has been shown that entire syntax trees are implicitly embedded in the geometry of such models. As these models are often fine-tuned, it becom…

    Submitted 10 February, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

  19. arXiv:2101.11059  [pdf, other]

    cs.CL cs.AI cs.IR

    Event-Driven News Stream Clustering using Entity-Aware Contextual Embeddings

    Authors: Kailash Karthik Saravanakumar, Miguel Ballesteros, Muthu Kumar Chandrasekaran, Kathleen McKeown

    Abstract: We propose a method for online news stream clustering that is a variant of the non-parametric streaming K-means algorithm. Our model uses a combination of sparse and dense document representations, aggregates document-cluster similarity along these multiple representations and makes the clustering decision using a neural classifier. The weighted document-cluster similarity model is learned using a…

    Submitted 26 January, 2021; originally announced January 2021.

    Comments: To appear in Proceedings of The 16th Conference of the European Chapter of the Association for Computational Linguistics

    ACM Class: I.2.7

  20. arXiv:2010.14042  [pdf, other]

    cs.CL

    To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging

    Authors: Kasturi Bhattacharjee, Miguel Ballesteros, Rishita Anubhai, Smaranda Muresan, Jie Ma, Faisal Ladhak, Yaser Al-Onaizan

    Abstract: Leveraging large amounts of unlabeled data using Transformer-like architectures, like BERT, has gained popularity in recent times owing to their effectiveness in learning general representations that can then be further fine-tuned for downstream tasks to much success. However, training these models can be costly both from an economic and environmental standpoint. In this work, we investigate how t…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: Accepted in the Proceedings of 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020) (https://2020.emnlp.org/papers/main)

  21. arXiv:2010.11333  [pdf, other]

    cs.CL

    Linking Entities to Unseen Knowledge Bases with Arbitrary Schemas

    Authors: Yogarshi Vyas, Miguel Ballesteros

    Abstract: In entity linking, mentions of named entities in raw text are disambiguated against a knowledge base (KB). This work focuses on linking to unseen KBs that do not have training data and whose schema is unknown during training. Our approach relies on methods to flexibly convert entities from arbitrary KBs with several attribute-value pairs into flat strings, which we use in conjunction with state-of…

    Submitted 21 October, 2020; originally announced October 2020.

  22. arXiv:2010.10669  [pdf, other]

    cs.CL

    Transition-based Parsing with Stack-Transformers

    Authors: Ramon Fernandez Astudillo, Miguel Ballesteros, Tahira Naseem, Austin Blodgett, Radu Florian

    Abstract: Modeling the parser state is key to good performance in transition-based parsing. Recurrent Neural Networks considerably improved the performance of transition-based systems by modelling the global state, e.g. stack-LSTM parsers, or local state modeling of contextualized features, e.g. Bi-LSTM parsers. Given the success of Transformer architectures in recent parsing systems, this work explores mod…

    Submitted 20 October, 2020; originally announced October 2020.

    Comments: Accepted to Findings of EMNLP2020, open review https://openreview.net/forum?id=b36spsuUAde, code https://github.com/IBM/transition-amr-parser

  23. arXiv:2010.05725  [pdf, other]

    cs.CL

    Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models

    Authors: Ethan Wilcox, Peng Qian, Richard Futrell, Ryosuke Kohita, Roger Levy, Miguel Ballesteros

    Abstract: Humans can learn structural properties about a word from minimal experience, and deploy their learned syntactic representations uniformly in different grammatical contexts. We assess the ability of modern neural language models to reproduce this behavior in English and evaluate the effect of structural supervision on learning outcomes. First, we assess few-shot learning capabilities by developing…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: To appear at EMNLP 2020

  24. arXiv:2010.03022  [pdf, other]

    cs.CL

    Resource-Enhanced Neural Model for Event Argument Extraction

    Authors: Jie Ma, Shuai Wang, Rishita Anubhai, Miguel Ballesteros, Yaser Al-Onaizan

    Abstract: Event argument extraction (EAE) aims to identify the arguments of an event and classify the roles that those arguments play. Despite great efforts made in prior work, there remain many challenges: (1) Data scarcity. (2) Capturing the long-range dependency, specifically, the connection between an event trigger and a distant event argument. (3) Integrating event trigger information into candidate ar…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: Findings of EMNLP 2020

  25. arXiv:2004.04295  [pdf, ps, other]

    cs.CL

    Severing the Edge Between Before and After: Neural Architectures for Temporal Ordering of Events

    Authors: Miguel Ballesteros, Rishita Anubhai, Shuai Wang, Nima Pourdamghani, Yogarshi Vyas, Jie Ma, Parminder Bhatia, Kathleen McKeown, Yaser Al-Onaizan

    Abstract: In this paper, we propose a neural architecture and a set of training methods for ordering events by predicting temporal relations. Our proposed models receive a pair of events within a span of text as input and they identify temporal relations (Before, After, Equal, Vague) between them. Given that a key challenge with this task is the scarcity of annotated data, our models rely on either pretrain…

    Submitted 8 April, 2020; originally announced April 2020.

  26. arXiv:2001.08279   

    cs.CL cs.AI cs.LG

    Transition-Based Dependency Parsing using Perceptron Learner

    Authors: Rahul Radhakrishnan Iyer, Miguel Ballesteros, Chris Dyer, Robert Frederking

    Abstract: Syntactic parsing using dependency structures has become a standard technique in natural language processing with many different parsing models, in particular data-driven models that can be trained on syntactically annotated corpora. In this paper, we tackle transition-based dependency parsing using a Perceptron Learner. Our proposed model, which adds more relevant features to the Perceptron Learn…

    Submitted 28 January, 2020; v1 submitted 22 January, 2020; originally announced January 2020.

    Comments: This was part of an assignment at my graduate course at LTI. This does not offer any major novelties

  27. arXiv:1905.13370  [pdf, ps, other]

    cs.CL

    Rewarding Smatch: Transition-Based AMR Parsing with Reinforcement Learning

    Authors: Tahira Naseem, Abhishek Shah, Hui Wan, Radu Florian, Salim Roukos, Miguel Ballesteros

    Abstract: Our work involves enriching the Stack-LSTM transition-based AMR parser (Ballesteros and Al-Onaizan, 2017) by augmenting training with Policy Learning and rewarding the Smatch score of sampled graphs. In addition, we also combined several AMR-to-text alignments with an attention mechanism and we supplemented the parser with pre-processed concept identification, named entities and contextualized emb…

    Submitted 30 May, 2019; originally announced May 2019.

    Comments: Accepted as short paper at ACL 2019

  28. arXiv:1903.03260  [pdf, other]

    cs.CL

    Neural Language Models as Psycholinguistic Subjects: Representations of Syntactic State

    Authors: Richard Futrell, Ethan Wilcox, Takashi Morita, Peng Qian, Miguel Ballesteros, Roger Levy

    Abstract: We deploy the methods of controlled psycholinguistic experimentation to shed light on the extent to which the behavior of neural network language models reflects incremental representations of syntactic state. To do so, we examine model behavior on artificial sentences containing a variety of syntactically complex structures. We test four models: two publicly available LSTM sequence models of Engl…

    Submitted 7 March, 2019; originally announced March 2019.

    Comments: Accepted to NAACL 2019. Not yet edited into the camera-ready version

  29. arXiv:1903.00943  [pdf, other]

    cs.CL

    Structural Supervision Improves Learning of Non-Local Grammatical Dependencies

    Authors: Ethan Wilcox, Peng Qian, Richard Futrell, Miguel Ballesteros, Roger Levy

    Abstract: State-of-the-art LSTM language models trained on large corpora learn sequential contingencies in impressive detail and have been shown to acquire a number of non-local grammatical dependencies with some success. Here we investigate whether supervision with hierarchical structure enhances learning of a range of grammatical dependencies, a question that has previously been addressed only for subject…

    Submitted 6 April, 2019; v1 submitted 3 March, 2019; originally announced March 2019.

    Comments: To appear: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

  30. arXiv:1902.09781  [pdf, other]

    cs.CL

    Recursive Subtree Composition in LSTM-Based Dependency Parsing

    Authors: Miryam de Lhoneux, Miguel Ballesteros, Joakim Nivre

    Abstract: The need for tree structure modelling on top of sequence modelling is an open issue in neural dependency parsing. We investigate the impact of adding a tree layer on top of a sequential model by recursively composing subtree representations (composition) in a transition-based parser that uses features extracted by a BiLSTM. Composition seems superfluous with such a model, suggesting that BiLSTMs c…

    Submitted 26 February, 2019; originally announced February 2019.

    Comments: Accepted at NAACL 2019

  31. arXiv:1806.03280  [pdf, other]

    cs.CL

    Multilingual Neural Machine Translation with Task-Specific Attention

    Authors: Graeme Blackwood, Miguel Ballesteros, Todd Ward

    Abstract: Multilingual machine translation addresses the task of translating between multiple source and target languages. We propose task-specific attention models, a simple but effective technique for improving the quality of sequence-to-sequence neural multilingual translation. Our approach seeks to retain as much of the parameter sharing generalization of NMT models as possible, while still allowing for…

    Submitted 8 June, 2018; originally announced June 2018.

    Comments: COLING 2018

  32. arXiv:1804.08915  [pdf, other]

    cs.CL

    Scheduled Multi-Task Learning: From Syntax to Translation

    Authors: Eliyahu Kiperwasser, Miguel Ballesteros

    Abstract: Neural encoder-decoder models of machine translation have achieved impressive results, while learning linguistic knowledge of both the source and target languages in an implicit end-to-end manner. We propose a framework in which our model begins learning syntax and translation interleaved, gradually putting more focus on translation. Using this approach, we achieve considerable improvements in ter…

    Submitted 24 April, 2018; originally announced April 2018.

    Journal ref: Transactions of the Association for Computational Linguistics, 6:225-240 (2018)

  33. arXiv:1804.05038  [pdf, ps, other]

    cs.CL

    Pieces of Eight: 8-bit Neural Machine Translation

    Authors: Jerry Quinn, Miguel Ballesteros

    Abstract: Neural machine translation has achieved levels of fluency and adequacy that would have been surprising a short time ago. Output quality is extremely relevant for industry purposes; however, it is equally important to produce results in the shortest time possible, mainly for latency-sensitive applications and to control cloud hosting costs. In this paper we show the effectiveness of translating with…

    Submitted 13 April, 2018; originally announced April 2018.

    Comments: To appear at NAACL 2018 Industry Track

  34. arXiv:1803.02392  [pdf, other]

    cs.CL

    Multimodal Emoji Prediction

    Authors: Francesco Barbieri, Miguel Ballesteros, Francesco Ronzano, Horacio Saggion

    Abstract: Emojis are small images that are commonly included in social media text messages. The combination of visual and textual content in the same message builds up a modern way of communication, that automatic systems are not used to deal with. In this paper we extend recent advances in emoji prediction by putting forward a multimodal approach that is able to predict emojis in Instagram posts. Instagram…

    Submitted 17 April, 2018; v1 submitted 6 March, 2018; originally announced March 2018.

    Comments: NAACL 2018 (short)

  35. arXiv:1709.00489  [pdf, ps, other]

    cs.CL

    Arc-Standard Spinal Parsing with Stack-LSTMs

    Authors: Miguel Ballesteros, Xavier Carreras

    Abstract: We present a neural transition-based parser for spinal trees, a dependency representation of constituent trees. The parser uses Stack-LSTMs that compose constituent nodes with dependency-based derivations. In experiments, we show that this model adapts to different styles of dependency relations, but this choice has little effect for predicting constituent structure, suggesting that LSTMs induce u…

    Submitted 1 September, 2017; originally announced September 2017.

    Comments: IWPT 2017

  36. arXiv:1707.07755  [pdf, ps, other]

    cs.CL

    AMR Parsing using Stack-LSTMs

    Authors: Miguel Ballesteros, Yaser Al-Onaizan

    Abstract: We present a transition-based AMR parser that directly generates AMR parses from plain text. We use Stack-LSTMs to represent our parser state and make decisions greedily. In our experiments, we show that our parser achieves very competitive scores on English using only AMR training data. Adding additional information, such as POS tags and dependency trees, improves the results further.

    Submitted 2 August, 2017; v1 submitted 24 July, 2017; originally announced July 2017.

    Comments: EMNLP 2017

  37. arXiv:1702.07285  [pdf, other]

    cs.CL

    Are Emojis Predictable?

    Authors: Francesco Barbieri, Miguel Ballesteros, Horacio Saggion

    Abstract: Emojis are ideograms which are naturally combined with plain text to visually complement or condense the meaning of a message. Despite being widely used in social media, their underlying semantics have received little attention from a Natural Language Processing standpoint. In this paper, we investigate the relation between words and emojis, studying the novel task of predicting which emojis are e…

    Submitted 24 February, 2017; v1 submitted 23 February, 2017; originally announced February 2017.

    Comments: To appear at EACL 2017

  38. arXiv:1701.03980  [pdf, other]

    stat.ML cs.CL cs.MS

    DyNet: The Dynamic Neural Network Toolkit

    Authors: Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, Pengcheng Yin

    Abstract: We describe DyNet, a toolkit for implementing neural network models based on dynamic declaration of network structure. In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its deriva…

    Submitted 14 January, 2017; originally announced January 2017.

    Comments: 33 pages

  39. arXiv:1611.05774  [pdf, other]

    cs.CL

    What Do Recurrent Neural Network Grammars Learn About Syntax?

    Authors: Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Graham Neubig, Noah A. Smith

    Abstract: Recurrent neural network grammars (RNNG) are a recently proposed probabilistic generative modeling family for natural language. They show state-of-the-art language modeling and parsing performance. We investigate what information they learn, from a linguistic perspective, through various ablations to the model and the data, and by augmenting the model with an attention mechanism (GA-RNNG) to enabl…

    Submitted 10 January, 2017; v1 submitted 17 November, 2016; originally announced November 2016.

    Comments: 10 pages. To appear in EACL 2017, Valencia, Spain

  40. arXiv:1609.07561  [pdf, other]

    cs.CL

    Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser

    Authors: Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Noah A. Smith

    Abstract: We introduce two first-order graph-based dependency parsers achieving a new state of the art. The first is a consensus parser built from an ensemble of independently trained greedy LSTM transition-based parsers with different random initializations. We cast this approach as minimum Bayes risk decoding (under the Hamming cost) and argue that weaker consensus within the ensemble is a useful signal o…

    Submitted 23 September, 2016; originally announced September 2016.

    Comments: 10 pages. To appear at EMNLP 2016

  41. arXiv:1606.08954  [pdf, other]

    cs.CL cs.AI

    Greedy, Joint Syntactic-Semantic Parsing with Stack LSTMs

    Authors: Swabha Swayamdipta, Miguel Ballesteros, Chris Dyer, Noah A. Smith

    Abstract: We present a transition-based parser that jointly produces syntactic and semantic dependencies. It learns a representation of the entire algorithm state, using stack long short-term memories. Our greedy inference algorithm has linear time, including feature extraction. On the CoNLL 2008–9 English shared tasks, we obtain the best published parsing performance among models that jointly learn syntax…

    Submitted 4 July, 2018; v1 submitted 29 June, 2016; originally announced June 2016.

    Comments: Proceedings of CoNLL 2016; 13 pages, 5 figures

  42. arXiv:1603.06503  [pdf, ps, other]

    cs.CL

    Static and Dynamic Feature Selection in Morphosyntactic Analyzers

    Authors: Bernd Bohnet, Miguel Ballesteros, Ryan McDonald, Joakim Nivre

    Abstract: We study the use of greedy feature selection methods for morphosyntactic tagging under a number of different conditions. We compare a static ordering of features to a dynamic ordering based on mutual information statistics, and we apply the techniques to standalone taggers as well as joint systems for tagging and parsing. Experiments on five languages show that feature selection can result in more…

    Submitted 21 March, 2016; originally announced March 2016.

  43. arXiv:1603.03793  [pdf, ps, other]

    cs.CL

    Training with Exploration Improves a Greedy Stack-LSTM Parser

    Authors: Miguel Ballesteros, Yoav Goldberg, Chris Dyer, Noah A. Smith

    Abstract: We adapt the greedy Stack-LSTM dependency parser of Dyer et al. (2015) to support a training-with-exploration procedure using dynamic oracles (Goldberg and Nivre, 2013) instead of cross-entropy minimization. This form of training, which accounts for model predictions at training time rather than assuming an error-free action history, improves parsing accuracies for both English and Chinese, obtaini…

    Submitted 13 September, 2016; v1 submitted 11 March, 2016; originally announced March 2016.

    Comments: In proceedings of EMNLP 2016

  44. arXiv:1603.01360  [pdf, other]

    cs.CL

    Neural Architectures for Named Entity Recognition

    Authors: Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, Chris Dyer

    Abstract: State-of-the-art named entity recognition systems rely heavily on hand-crafted features and domain-specific knowledge in order to learn effectively from the small, supervised training corpora that are available. In this paper, we introduce two new neural architectures---one based on bidirectional LSTMs and conditional random fields, and the other that constructs and labels segments using a transit…

    Submitted 7 April, 2016; v1 submitted 4 March, 2016; originally announced March 2016.

    Comments: Proceedings of NAACL 2016

  45. arXiv:1602.07776  [pdf, other]

    cs.CL cs.NE

    Recurrent Neural Network Grammars

    Authors: Chris Dyer, Adhiguna Kuncoro, Miguel Ballesteros, Noah A. Smith

    Abstract: We introduce recurrent neural network grammars, probabilistic models of sentences with explicit phrase structure. We explain efficient inference procedures that allow application to both parsing and language modeling. Experiments show that they provide better parsing in English than any single previously published supervised generative model and better language modeling than state-of-the-art seque…

    Submitted 12 October, 2016; v1 submitted 24 February, 2016; originally announced February 2016.

    Comments: Proceedings of NAACL 2016 (contains corrigendum)

  46. arXiv:1602.01595  [pdf, other]

    cs.CL

    Many Languages, One Parser

    Authors: Waleed Ammar, George Mulcaire, Miguel Ballesteros, Chris Dyer, Noah A. Smith

    Abstract: We train one multilingual model for dependency parsing and use it to parse sentences in several languages. The parsing model uses (i) multilingual word clusters and embeddings; (ii) token-level language information; and (iii) language-specific features (fine-grained POS tags). This input representation enables the parser not only to parse effectively in multiple languages, but also to generalize a…

    Submitted 26 July, 2016; v1 submitted 4 February, 2016; originally announced February 2016.

  47. arXiv:1508.00657  [pdf, other]

    cs.CL

    Improved Transition-Based Parsing by Modeling Characters instead of Words with LSTMs

    Authors: Miguel Ballesteros, Chris Dyer, Noah A. Smith

    Abstract: We present extensions to a continuous-state dependency parsing method that make it applicable to morphologically rich languages. Starting with a high-performance transition-based parser that uses long short-term memory (LSTM) recurrent neural networks to learn representations of the parser state, we replace lookup-based word representations with representations constructed from the orthographic r…

    Submitted 11 August, 2015; v1 submitted 4 August, 2015; originally announced August 2015.

    Comments: In Proceedings of EMNLP 2015

  48. arXiv:1505.08075  [pdf, other]

    cs.CL cs.LG cs.NE

    Transition-Based Dependency Parsing with Stack Long Short-Term Memory

    Authors: Chris Dyer, Miguel Ballesteros, Wang Ling, Austin Matthews, Noah A. Smith

    Abstract: We propose a technique for learning representations of parser states in transition-based dependency parsers. Our primary innovation is a new control structure for sequence-to-sequence neural networks---the stack LSTM. Like the conventional stack data structures used in transition-based parsing, elements can be pushed to or popped from the top of the stack in constant time, but, in addition, an LST…

    Submitted 29 May, 2015; originally announced May 2015.

    Comments: Proceedings of ACL 2015