Showing 1–41 of 41 results for author: Gatt, A

  1. arXiv:2407.10488  [pdf, other]

    cs.CL cs.AI

    How and where does CLIP process negation?

    Authors: Vincent Quantmeyer, Pablo Mosteiro, Albert Gatt

    Abstract: Various benchmarks have been proposed to test linguistic understanding in pre-trained vision & language (VL) models. Here we build on the existence task from the VALSE benchmark (Parcalabescu et al., 2022) which we use to test models' understanding of negation, a particularly interesting issue for multimodal models. However, while such VL benchmarks are useful for measuring model performance, they…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted at the 3rd Workshop on Advances in Language and Vision Research (ALVR 2024)

  2. arXiv:2406.18403  [pdf, other]

    cs.CL

    LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks

    Authors: Anna Bavaresco, Raffaella Bernardi, Leonardo Bertolazzi, Desmond Elliott, Raquel Fernández, Albert Gatt, Esam Ghaleb, Mario Giulianelli, Michael Hanna, Alexander Koller, André F. T. Martins, Philipp Mondorf, Vera Neplenbroek, Sandro Pezzelle, Barbara Plank, David Schlangen, Alessandro Suglia, Aditya K Surikuchi, Ece Takmaz, Alberto Testoni

    Abstract: There is an increasing trend towards evaluating NLP models with LLM-generated judgments instead of human judgments. In the absence of a comparison against human data, this raises concerns about the validity of these evaluations; in case they are conducted with proprietary models, this also raises concerns over reproducibility. We provide JUDGE-BENCH, a collection of 20 NLP datasets with human anno…

    Submitted 26 June, 2024; originally announced June 2024.

  3. arXiv:2406.11341  [pdf, other]

    cs.CL

    A Systematic Analysis of Large Language Models as Soft Reasoners: The Case of Syllogistic Inferences

    Authors: Leonardo Bertolazzi, Albert Gatt, Raffaella Bernardi

    Abstract: The reasoning abilities of Large Language Models (LLMs) are becoming a central focus of study in NLP. In this paper, we consider the case of syllogistic reasoning, an area of deductive reasoning studied extensively in logic and cognitive psychology. Previous research has shown that pre-trained LLMs exhibit reasoning biases, such as content effects, avoid answering that…

    Submitted 17 June, 2024; originally announced June 2024.

  4. arXiv:2311.07022  [pdf, other]

    cs.CL cs.AI cs.CV

    ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models

    Authors: Ilker Kesen, Andrea Pedrotti, Mustafa Dogan, Michele Cafagna, Emre Can Acikgoz, Letitia Parcalabescu, Iacer Calixto, Anette Frank, Albert Gatt, Aykut Erdem, Erkut Erdem

    Abstract: With the ever-increasing popularity of pretrained Video-Language Models (VidLMs), there is a pressing need to develop robust evaluation methodologies that delve deeper into their visio-linguistic capabilities. To address this challenge, we present ViLMA (Video Language Model Assessment), a task-agnostic benchmark that places the assessment of fine-grained capabilities of these models on a firm foo…

    Submitted 12 November, 2023; originally announced November 2023.

    Comments: Preprint. 48 pages, 22 figures, 10 tables

  5. arXiv:2310.06588  [pdf, other]

    cs.CL cs.LG

    FTFT: Efficient and Robust Fine-Tuning by Transferring Training Dynamics

    Authors: Yupei Du, Albert Gatt, Dong Nguyen

    Abstract: Despite the massive success of fine-tuning Pre-trained Language Models (PLMs), they remain susceptible to out-of-distribution input. Dataset cartography is a simple yet effective dual-model approach that improves the robustness of fine-tuned PLMs. It involves fine-tuning a model on the original training set (i.e. reference model), selecting a subset of important training instances based on the tra…

    Submitted 29 March, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

  6. arXiv:2309.11252  [pdf, other]

    cs.CL cs.CV

    The Scenario Refiner: Grounding subjects in images at the morphological level

    Authors: Claudia Tagliaferri, Sofia Axioti, Albert Gatt, Denis Paperno

    Abstract: Derivationally related words, such as "runner" and "running", exhibit semantic differences which also elicit different visual scenarios. In this paper, we ask whether Vision and Language (V&L) models capture such distinctions at the morphological level, using a new methodology and dataset. We compare the results from V&L models to human judgements and find that models' predictions differ from…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: presented at the LIMO workshop (Linguistic Insights from and for Multimodal Language Processing @KONVENS 2023)

  7. arXiv:2307.02882  [pdf, other]

    cs.CL cs.AI

    Contrast Is All You Need

    Authors: Burak Kilic, Florix Bex, Albert Gatt

    Abstract: In this study, we analyze data-scarce classification scenarios, where available labeled legal data is small and imbalanced, potentially hurting the quality of the results. We focused on two finetuning objectives: SetFit (Sentence Transformer Finetuning), a contrastive learning setup, and a vanilla finetuning setup on a legal provision classification task. Additionally, we compare the features that…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: 10 pages + bib, 12 figures, ACAIL2023/ASAIL2023 Workshop

    Journal ref: ASAIL Vol-3441 2023 72-82

  8. arXiv:2305.01633  [pdf, other]

    cs.CL

    Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP

    Authors: Anya Belz, Craig Thomson, Ehud Reiter, Gavin Abercrombie, Jose M. Alonso-Moral, Mohammad Arvan, Anouck Braggaar, Mark Cieliebak, Elizabeth Clark, Kees van Deemter, Tanvi Dinkar, Ondřej Dušek, Steffen Eger, Qixiang Fang, Mingqi Gao, Albert Gatt, Dimitra Gkatzia, Javier González-Corbelle, Dirk Hovy, Manuela Hürlimann, Takumi Ito, John D. Kelleher, Filip Klubicka, Emiel Krahmer, Huiyuan Lai , et al. (17 additional authors not shown)

    Abstract: We report our efforts in identifying a set of previous human evaluations in NLP that would be suitable for a coordinated study examining what makes human evaluations in NLP more/less reproducible. We present our results and findings, which include that just 13% of papers had (i) sufficiently low barriers to reproduction, and (ii) enough obtainable information, to be considered for reproduction, a…

    Submitted 7 August, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

    Comments: 5 pages plus appendix, 4 tables, 1 figure. To appear at "Workshop on Insights from Negative Results in NLP" (co-located with EACL2023). Updated author list and acknowledgements

    MSC Class: 68 ACM Class: I.2.7

  9. Interpreting Vision and Language Generative Models with Semantic Visual Priors

    Authors: Michele Cafagna, Lina M. Rojas-Barahona, Kees van Deemter, Albert Gatt

    Abstract: When applied to Image-to-text models, interpretability methods often provide token-by-token explanations; namely, they compute a visual explanation for each token of the generated sequence. Those explanations are expensive to compute and unable to comprehensively explain the model's output. Therefore, these models often require some sort of approximation that eventually leads to misleading explanat…

    Submitted 4 May, 2023; v1 submitted 28 April, 2023; originally announced April 2023.

  10. arXiv:2302.12189  [pdf, other]

    cs.CL cs.CV

    HL Dataset: Visually-grounded Description of Scenes, Actions and Rationales

    Authors: Michele Cafagna, Kees van Deemter, Albert Gatt

    Abstract: Current captioning datasets focus on object-centric captions, describing the visible objects in the image, e.g. "people eating food in a park". Although these datasets are useful to evaluate the ability of Vision & Language models to recognize and describe visual content, they do not support controlled experiments involving model testing or fine-tuning, with more high-level captions, which humans…

    Submitted 25 September, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

  11. arXiv:2211.04971  [pdf, other]

    cs.CL cs.CV

    Understanding Cross-modal Interactions in V&L Models that Generate Scene Descriptions

    Authors: Michele Cafagna, Kees van Deemter, Albert Gatt

    Abstract: Image captioning models tend to describe images in an object-centric way, emphasising visible objects. But image descriptions can also abstract away from objects and describe the type of scene depicted. In this paper, we explore the potential of a state-of-the-art Vision and Language model, VinVL, to caption images at the scene level using (1) a novel dataset which pairs images with both object-ce…

    Submitted 10 November, 2022; v1 submitted 9 November, 2022; originally announced November 2022.

  12. arXiv:2205.12342  [pdf, other]

    cs.CV cs.NE

    Face2Text revisited: Improved data set and baseline results

    Authors: Marc Tanti, Shaun Abdilla, Adrian Muscat, Claudia Borg, Reuben A. Farrugia, Albert Gatt

    Abstract: Current image description generation models do not transfer well to the task of describing human faces. To encourage the development of more human-focused descriptions, we developed a new data set of facial descriptions based on the CelebA image data set. We describe the properties of this data set, and present results from a face description generator trained on it, which explores the feasibility…

    Submitted 24 May, 2022; originally announced May 2022.

    Comments: 7 pages, 5 figures, 4 tables, to appear in LREC 2022 (P-VLAM workshop)

  13. Pre-training Data Quality and Quantity for a Low-Resource Language: New Corpus and BERT Models for Maltese

    Authors: Kurt Micallef, Albert Gatt, Marc Tanti, Lonneke van der Plas, Claudia Borg

    Abstract: Multilingual language models such as mBERT have seen impressive cross-lingual transfer to a variety of languages, but many languages remain excluded from these models. In this paper, we analyse the effect of pre-training with monolingual data for a low-resource language that is not included in mBERT -- Maltese -- with a range of pre-training set ups. We conduct evaluations with the newly pre-train…

    Submitted 26 May, 2022; v1 submitted 21 May, 2022; originally announced May 2022.

    Comments: DeepLo 2022 camera-ready version

  14. VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena

    Authors: Letitia Parcalabescu, Michele Cafagna, Lilitta Muradjan, Anette Frank, Iacer Calixto, Albert Gatt

    Abstract: We propose VALSE (Vision And Language Structured Evaluation), a novel benchmark designed for testing general-purpose pretrained vision and language (V&L) models for their visio-linguistic grounding capabilities on specific linguistic phenomena. VALSE offers a suite of six tests covering various linguistic constructs. Solving these requires models to ground linguistic phenomena in the visual modali…

    Submitted 14 March, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: Paper accepted for publication at ACL 2022 Main; 28 pages, 4 figures, 11 tables

    MSC Class: 68Txx ACM Class: I.2.7; I.2.10

  15. arXiv:2111.07793  [pdf, ps, other]

    cs.CL

    Analysis of Data Augmentation Methods for Low-Resource Maltese ASR

    Authors: Andrea DeMarco, Carlos Mena, Albert Gatt, Claudia Borg, Aiden Williams, Lonneke van der Plas

    Abstract: Recent years have seen an increased interest in the computational speech processing of Maltese, but resources remain sparse. In this paper, we consider data augmentation techniques for improving speech recognition for low-resource languages, focusing on Maltese as a test case. We consider three different types of data augmentation: unsupervised training, multilingual training and the use of synthe…

    Submitted 20 January, 2023; v1 submitted 15 November, 2021; originally announced November 2021.

    Comments: 12 pages

  16. arXiv:2109.07301  [pdf, other]

    cs.CL

    What Vision-Language Models `See' when they See Scenes

    Authors: Michele Cafagna, Kees van Deemter, Albert Gatt

    Abstract: Images can be described in terms of the objects they contain, or in terms of the types of scene or place that they instantiate. In this paper we address to what extent pretrained Vision and Language models can learn to align descriptions of both types with images. We compare 3 state-of-the-art models, VisualBERT, LXMERT and CLIP. We find that (i) V&L models are susceptible to stylistic biases acqu…

    Submitted 15 September, 2021; originally announced September 2021.

  17. arXiv:2109.06935  [pdf, other]

    cs.CL cs.NE

    On the Language-specificity of Multilingual BERT and the Impact of Fine-tuning

    Authors: Marc Tanti, Lonneke van der Plas, Claudia Borg, Albert Gatt

    Abstract: Recent work has shown evidence that the knowledge acquired by multilingual BERT (mBERT) has two components: a language-specific and a language-neutral one. This paper analyses the relationship between them, in the context of fine-tuning on two tasks -- POS tagging and natural language inference -- which require the model to bring to bear different degrees of language-specific knowledge. Visualisat…

    Submitted 26 December, 2021; v1 submitted 14 September, 2021; originally announced September 2021.

    Comments: 14 pages, 6 figures, 5 tables, submitted in BlackBoxNLP 2021 (https://aclanthology.org/2021.blackboxnlp-1.15/)

  18. arXiv:2101.01634  [pdf, other]

    cs.CL

    On the interaction of automatic evaluation and task framing in headline style transfer

    Authors: Lorenzo De Mattei, Michele Cafagna, Huiyuan Lai, Felice Dell'Orletta, Malvina Nissim, Albert Gatt

    Abstract: An ongoing debate in the NLG community concerns the best way to evaluate systems, with human evaluation often being considered the most reliable method, compared to corpus-based metrics. However, tasks involving subtle textual differences, such as style transfer, tend to be hard for humans to perform. In this paper, we propose an evaluation method for this task based on purposely-trained classifie…

    Submitted 5 January, 2021; originally announced January 2021.

  19. arXiv:2012.12352  [pdf, other]

    cs.CV cs.CL

    Seeing past words: Testing the cross-modal capabilities of pretrained V&L models on counting tasks

    Authors: Letitia Parcalabescu, Albert Gatt, Anette Frank, Iacer Calixto

    Abstract: We investigate the reasoning ability of pretrained vision and language (V&L) models in two tasks that require multimodal integration: (1) discriminating a correct image-sentence pair from an incorrect one, and (2) counting entities in an image. We evaluate three pretrained V&L models on these tasks: ViLBERT, ViLBERT 12-in-1 and LXMERT, in zero-shot and finetuned settings. Our results show that mod…

    Submitted 17 June, 2021; v1 submitted 22 December, 2020; originally announced December 2020.

    Comments: Paper accepted for publication at MMSR 2021; 13 pages, 3 figures, 7 Tables

    MSC Class: 68Txx ACM Class: I.2.7; I.2.10

    Journal ref: Proceedings of the 1st Workshop on Multimodal Semantic Representations (MMSR), 2021, Groningen, Netherlands (Online), Association for Computational Linguistics, p. 32--44

  20. arXiv:2011.07975  [pdf, other]

    cs.CL

    Datasets and Models for Authorship Attribution on Italian Personal Writings

    Authors: Gaetana Ruggiero, Albert Gatt, Malvina Nissim

    Abstract: Existing research on Authorship Attribution (AA) focuses on texts for which a lot of data is available (e.g. novels), mainly in English. We approach AA via Authorship Verification on short Italian texts in two novel datasets, and analyze the interaction between genre, topic, gender and length. Results show that AV is feasible even with little data, but more evidence helps. Gender and topic can be i…

    Submitted 16 November, 2020; originally announced November 2020.

    Comments: Accepted for publication in: 7th Italian Conference on Computational Linguistics (CLIC-IT 2020)

  21. arXiv:2010.14534  [pdf, other]

    cs.CL

    Unmasking Contextual Stereotypes: Measuring and Mitigating BERT's Gender Bias

    Authors: Marion Bartl, Malvina Nissim, Albert Gatt

    Abstract: Contextualized word embeddings have been replacing standard embeddings as the representational knowledge source of choice in NLP systems. Since a variety of biases have previously been found in standard word embeddings, it is crucial to assess biases encoded in their replacements as well. Focusing on BERT (Devlin et al., 2018), we measure gender bias by studying associations between gender-denotin…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: 10 pages, 4 figures, to appear in Proceedings of the 2nd Workshop on Gender Bias in Natural Language Processing at COLING 2020

  22. arXiv:2008.06222  [pdf, other]

    cs.CY cs.CL

    Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis

    Authors: Stavros Assimakopoulos, Rebecca Vella Muskat, Lonneke van der Plas, Albert Gatt

    Abstract: This paper presents a novel scheme for the annotation of hate speech in corpora of Web 2.0 commentary. The proposed scheme is motivated by the critical analysis of posts made in reaction to news reports on the Mediterranean migration crisis and LGBTIQ+ matters in Malta, which was conducted under the auspices of the EU-funded C.O.N.T.A.C.T. project. Based on the realization that hate speech is not…

    Submitted 14 August, 2020; originally announced August 2020.

    Comments: 10 pages, 1 table. Appears in Proceedings of the 12th edition of the Language Resources and Evaluation Conference (LREC'20)

  23. arXiv:2008.05760  [pdf, other]

    cs.CL cs.LG

    MASRI-HEADSET: A Maltese Corpus for Speech Recognition

    Authors: Carlos Mena, Albert Gatt, Andrea DeMarco, Claudia Borg, Lonneke van der Plas, Amanda Muscat, Ian Padovani

    Abstract: Maltese, the national language of Malta, is spoken by approximately 500,000 people. Speech processing for Maltese is still in its early stages of development. In this paper, we present the first spoken Maltese corpus designed purposely for Automatic Speech Recognition (ASR). The MASRI-HEADSET corpus was developed by the MASRI project at the University of Malta. It consists of 8 hours of speech pai…

    Submitted 13 August, 2020; originally announced August 2020.

    Comments: 8 pages, 2 figures, 4 tables, 1 appendix. Appears in Proceedings of the 12th edition of the Language Resources and Evaluation Conference (LREC'20)

  24. arXiv:1911.03738  [pdf, other]

    cs.NE cs.CL

    On Architectures for Including Visual Information in Neural Language Models for Image Description

    Authors: Marc Tanti, Albert Gatt, Kenneth P. Camilleri

    Abstract: A neural language model can be conditioned into generating descriptions for images by providing visual information apart from the sentence prefix. This visual information can be included into the language model through different points of entry resulting in different neural architectures. We identify four main architectures which we call init-inject, pre-inject, par-inject, and merge. We analyse…

    Submitted 9 November, 2019; originally announced November 2019.

    Comments: 145 pages, 41 figures, 15 tables, Doctoral thesis

  25. arXiv:1909.09788  [pdf, other]

    cs.CL cs.AI cs.NE

    Visuallly Grounded Generation of Entailments from Premises

    Authors: Somaye Jafaritazehjani, Albert Gatt, Marc Tanti

    Abstract: Natural Language Inference (NLI) is the task of determining the semantic relationship between a premise and a hypothesis. In this paper, we focus on the generation of hypotheses from premises in a multimodal setting, to generate a sentence (hypothesis) given an image and/or its description (premise) as the input. The main goals of this paper are (a) to investigate whether it is reasonable to…

    Submitted 21 September, 2019; originally announced September 2019.

    Comments: Proceedings of the 12th International Conference on Natural Language Generation (INLG 2019), 11 pages, 5 figures

  26. arXiv:1907.07265  [pdf, other]

    cs.CL

    You Write Like You Eat: Stylistic variation as a predictor of social stratification

    Authors: Angelo Basile, Albert Gatt, Malvina Nissim

    Abstract: Inspired by Labov's seminal work on stylistic variation as a function of social stratification, we develop and compare neural models that predict a person's presumed socio-economic status, obtained through distant supervision, from their writing style on social media. The focus of our work is on identifying the most important stylistic parameters to predict socio-economic group. In particular, we s…

    Submitted 16 July, 2019; originally announced July 2019.

    Comments: 11 pages, 5 figures, ACL Conference 2019

    ACM Class: I.2.7

  27. arXiv:1901.01216  [pdf, other]

    cs.CL cs.LG cs.NE

    Transfer learning from language models to image caption generators: Better models may not transfer better

    Authors: Marc Tanti, Albert Gatt, Kenneth P. Camilleri

    Abstract: When designing a neural caption generator, a convolutional neural network can be used to extract image features. Is it possible to also use a neural language model to extract sentence prefix features? We answer this question by trying different ways to transfer the recurrent neural network and embedding layer from a neural language model to an image caption generator. We find that image caption ge…

    Submitted 1 January, 2019; originally announced January 2019.

    Comments: 17 pages, 4 figures, 3 tables, unpublished (comments welcome)

  28. Quantifying the amount of visual information used by neural caption generators

    Authors: Marc Tanti, Albert Gatt, Kenneth P. Camilleri

    Abstract: This paper addresses the sensitivity of neural image caption generators to their visual input. A sensitivity analysis and omission analysis based on image foils is reported, showing that the extent to which image captioning architectures retain and are sensitive to visual information varies depending on the type of word being generated and the position in the caption as a whole. We motivate this w…

    Submitted 12 October, 2018; originally announced October 2018.

    Comments: 10 pages, 4 figures. This publication will appear in the Proceedings of the First Workshop on Shortcomings in Vision and Language (2018). DOI to be inserted later

  29. Pre-gen metrics: Predicting caption quality metrics without generating captions

    Authors: Marc Tanti, Albert Gatt, Adrian Muscat

    Abstract: Image caption generation systems are typically evaluated against reference outputs. We show that it is possible to predict output quality without generating the captions, based on the probability assigned by the neural model to the reference captions. Such pre-gen metrics are strongly correlated to standard evaluation metrics.

    Submitted 12 October, 2018; originally announced October 2018.

    Comments: 13 pages, 6 figures. This publication will appear in the Proceedings of the First Workshop on Shortcomings in Vision and Language (2018). DOI to be inserted later

  30. arXiv:1810.00333  [pdf, other]

    cs.CL

    Specificity measures and reference

    Authors: Albert Gatt, Nicolás Marín, Gustavo Rivas-Gervilla, Daniel Sánchez

    Abstract: In this paper we study empirically the validity of measures of referential success for referring expressions involving gradual properties. More specifically, we study the ability of several measures of referential success to predict the success of a user in choosing the right object, given a referring expression. Experimental results indicate that certain fuzzy measures of success are able to pred…

    Submitted 30 September, 2018; originally announced October 2018.

    Comments: Accepted for publication in: Proceedings of the 11th International Conference on Natural Language Generation. Tilburg, The Netherlands: Association for Computational Linguistics. 11 pages, 5 figures, 6 tables

  31. arXiv:1809.02494  [pdf, other]

    cs.CL cs.AI

    Meteorologists and Students: A resource for language grounding of geographical descriptors

    Authors: Alejandro Ramos-Soto, Ehud Reiter, Kees van Deemter, Jose M. Alonso, Albert Gatt

    Abstract: We present a data resource which can be useful for research purposes on language grounding tasks in the context of geographical referring expression generation. The resource is composed of two data sets that encompass 25 different geographical descriptors and a set of associated graphical representations, drawn as polygons on a map by two groups of human subjects: teenage students and expert meteo…

    Submitted 7 September, 2018; originally announced September 2018.

    Comments: Resource paper, 5 pages, 6 figures, 1 table. Conference: INLG 2018

  32. arXiv:1808.03507  [pdf, other]

    cs.CL cs.CY

    Making effective use of healthcare data using data-to-text technology

    Authors: Steffen Pauws, Albert Gatt, Emiel Krahmer, Ehud Reiter

    Abstract: Healthcare organizations are in a continuous effort to improve health outcomes, reduce costs and enhance patient experience of care. Data is essential to measure and help achieving these improvements in healthcare delivery. Consequently, a data influx from various clinical, financial and operational sources is now overtaking healthcare organizations and their patients. The effective use of this da…

    Submitted 10 August, 2018; originally announced August 2018.

    Comments: 27 pages, 2 figures, book chapter

  33. arXiv:1806.05645  [pdf, other]

    cs.CL cs.CV

    Grounded Textual Entailment

    Authors: Hoa Trong Vu, Claudio Greco, Aliia Erofeeva, Somayeh Jafaritazehjan, Guido Linders, Marc Tanti, Alberto Testoni, Raffaella Bernardi, Albert Gatt

    Abstract: Capturing semantic relations between sentences, such as entailment, is a long-standing challenge for computational semantics. Logic-based models analyse entailment in terms of possible worlds (interpretations, or situations) where a premise P entails a hypothesis H iff in all worlds where P is true, H is also true. Statistical models view this relationship probabilistically, addressing it in terms…

    Submitted 14 June, 2018; originally announced June 2018.

    Comments: 15 pages, 2 figures, 14 tables, 2 appendices. Accepted in COLING 2018

  34. arXiv:1803.03827  [pdf, other]

    cs.CL cs.AI cs.CV

    Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions

    Authors: Albert Gatt, Marc Tanti, Adrian Muscat, Patrizia Paggio, Reuben A. Farrugia, Claudia Borg, Kenneth P. Camilleri, Mike Rosner, Lonneke van der Plas

    Abstract: The past few years have witnessed renewed interest in NLP tasks at the interface between vision and language. One intensively-studied problem is that of automatically generating text from images. In this paper, we extend this problem to the more specific domain of face description. Unlike scene descriptions, face descriptions are more fine-grained and rely on attributes extracted from the image, r…

    Submitted 5 March, 2021; v1 submitted 10 March, 2018; originally announced March 2018.

    Comments: Proceedings of the 11th edition of the Language Resources and Evaluation Conference (LREC'18)

  35. arXiv:1708.02043  [pdf, other]

    cs.CL cs.CV cs.NE

    What is the Role of Recurrent Neural Networks (RNNs) in an Image Caption Generator?

    Authors: Marc Tanti, Albert Gatt, Kenneth P. Camilleri

    Abstract: In neural image captioning systems, a recurrent neural network (RNN) is typically viewed as the primary `generation' component. This view suggests that the image features should be `injected' into the RNN. This is in fact the dominant view in the literature. Alternatively, the RNN can instead be viewed as only encoding the previously generated words. This view suggests that the RNN should only be…

    Submitted 25 August, 2017; v1 submitted 7 August, 2017; originally announced August 2017.

    Comments: Appears in: Proceedings of the 10th International Conference on Natural Language Generation (INLG'17)

  36. Track Everything: Limiting Prior Knowledge in Online Multi-Object Recognition

    Authors: Sebastien C. Wong, Victor Stamatescu, Adam Gatt, David Kearney, Ivan Lee, Mark D. McDonnell

    Abstract: This paper addresses the problem of online tracking and classification of multiple objects in an image sequence. Our proposed solution is to first track all objects in the scene without relying on object-specific prior knowledge, which in other systems can take the form of hand-crafted features or user-based track initialization. We then classify the tracked objects with a fast-learning image clas…

    Submitted 21 April, 2017; originally announced April 2017.

    Comments: 15 pages

    ACM Class: I.4.8

  37. arXiv:1703.10429  [pdf, other]

    cs.AI

    An Empirical Approach for Modeling Fuzzy Geographical Descriptors

    Authors: Alejandro Ramos-Soto, Jose M. Alonso, Ehud Reiter, Kees van Deemter, Albert Gatt

    Abstract: We present a novel heuristic approach that defines fuzzy geographical descriptors using data gathered from a survey with human subjects. The participants were asked to provide graphical interpretations of the descriptors `north' and `south' for the Galician region (Spain). Based on these interpretations, our approach builds fuzzy descriptors that are able to compute membership degrees for geograph…

    Submitted 30 March, 2017; originally announced March 2017.

    Comments: Conference paper: Accepted for FUZZIEEE-2017. One column version for arXiv (8 pages)

  38. arXiv:1703.09902  [pdf, other]

    cs.CL cs.AI cs.NE

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    Authors: Albert Gatt, Emiel Krahmer

    Abstract: This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore ai…

    Submitted 29 January, 2018; v1 submitted 29 March, 2017; originally announced March 2017.

    Comments: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118 pages, 8 figures, 1 table

    ACM Class: I.2.7; H.5

    Journal ref: Journal of AI Research, volume 61, 2018

  39. Where to put the Image in an Image Caption Generator

    Authors: Marc Tanti, Albert Gatt, Kenneth P. Camilleri

    Abstract: When a recurrent neural network language model is used for caption generation, the image information can be fed to the neural network either by directly incorporating it in the RNN -- conditioning the language model by `injecting' image features -- or in a layer following the RNN -- conditioning the language model by `merging' image features. While both options are attested in the literature, ther…

    Submitted 14 March, 2018; v1 submitted 27 March, 2017; originally announced March 2017.

    Comments: Accepted in JNLE Special Issue: Language for Images (24.3) (expanded with content that was removed from journal paper in order to reduce number of pages), 28 pages, 5 figures, 6 tables

  40. arXiv:1703.08701  [pdf, ps, other]

    cs.CL

    Morphological Analysis for the Maltese Language: The Challenges of a Hybrid System

    Authors: Claudia Borg, Albert Gatt

    Abstract: Maltese is a morphologically rich language with a hybrid morphological system which features both concatenative and non-concatenative processes. This paper analyses the impact of this hybridity on the performance of machine learning techniques for morphological labelling and clustering. In particular, we analyse a dataset of morphologically related word clusters to evaluate the difference in resul…

    Submitted 25 March, 2017; originally announced March 2017.

    Comments: 11 pages, Proceedings of the 3rd Arabic Natural Language Processing Workshop (WANLP'17)

    ACM Class: I.2.7

  41. arXiv:1609.08764  [pdf, ps, other]

    cs.CV

    Understanding data augmentation for classification: when to warp?

    Authors: Sebastien C. Wong, Adam Gatt, Victor Stamatescu, Mark D. McDonnell

    Abstract: In this paper we investigate the benefit of augmenting data with synthetically created samples when training a machine learning classifier. Two approaches for creating additional training samples are data warping, which generates additional samples through transformations applied in the data-space, and synthetic over-sampling, which creates additional samples in feature-space. We experimentally ev…

    Submitted 26 November, 2016; v1 submitted 28 September, 2016; originally announced September 2016.

    Comments: 6 pages, 6 figures, DICTA 2016 conference

    ACM Class: I.5.2; I.4.7