Showing 1–10 of 10 results for author: Klubicka, F

  1. arXiv:2305.01633

    cs.CL

    Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP

    Authors: Anya Belz, Craig Thomson, Ehud Reiter, Gavin Abercrombie, Jose M. Alonso-Moral, Mohammad Arvan, Anouck Braggaar, Mark Cieliebak, Elizabeth Clark, Kees van Deemter, Tanvi Dinkar, Ondřej Dušek, Steffen Eger, Qixiang Fang, Mingqi Gao, Albert Gatt, Dimitra Gkatzia, Javier González-Corbelle, Dirk Hovy, Manuela Hürlimann, Takumi Ito, John D. Kelleher, Filip Klubicka, Emiel Krahmer, Huiyuan Lai, et al. (17 additional authors not shown)

    Abstract: We report our efforts in identifying a set of previous human evaluations in NLP that would be suitable for a coordinated study examining what makes human evaluations in NLP more/less reproducible. We present our results and findings, which include that just 13% of papers had (i) sufficiently low barriers to reproduction, and (ii) enough obtainable information, to be considered for reproduction, a…

    Submitted 7 August, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

    Comments: 5 pages plus appendix, 4 tables, 1 figure. To appear at "Workshop on Insights from Negative Results in NLP" (co-located with EACL2023). Updated author list and acknowledgements

    MSC Class: 68

    ACM Class: I.2.7

  2. arXiv:2304.14333

    cs.CL cs.AI cs.LG

    Idioms, Probing and Dangerous Things: Towards Structural Probing for Idiomaticity in Vector Space

    Authors: Filip Klubička, Vasudevan Nedumpozhimana, John D. Kelleher

    Abstract: The goal of this paper is to learn more about how idiomatic information is structurally encoded in embeddings, using a structural probing method. We repurpose an existing English verbal multi-word expression (MWE) dataset to suit the probing framework and perform a comparative probing study of static (GloVe) and contextual (BERT) embeddings. Our experiments indicate that both encode some idiomatic…

    Submitted 27 April, 2023; originally announced April 2023.

    Comments: 9 pages, 5 tables, In proceedings of the 19th Workshop on Multiword Expressions @ EACL2023

    MSC Class: 68T50

  3. Queer In AI: A Case Study in Community-Led Participatory AI

    Authors: Organizers Of QueerInAI: Anaelia Ovalle, Arjun Subramonian, Ashwin Singh, Claas Voelcker, Danica J. Sutherland, Davide Locatelli, Eva Breznik, Filip Klubička, Hang Yuan, Hetvi J, Huan Zhang, Jaidev Shriram, Kruno Lehman, Luca Soldaini, Maarten Sap, Marc Peter Deisenroth, Maria Leonor Pacheco, Maria Ryskina, Martin Mundt, Milind Agarwal, Nyx McLean, Pan Xu, A Pranav, et al. (26 additional authors not shown)

    Abstract: We present Queer in AI as a case study for community-led participatory design in AI. We examine how participatory design and intersectional tenets started and shaped this community's programs over the years. We discuss different challenges that emerged in the process, look at ways this organization has fallen short of operationalizing participatory and intersectional principles, and then assess th…

    Submitted 8 June, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

    Comments: To appear at FAccT 2023

    Journal ref: 2023 ACM Conference on Fairness, Accountability, and Transparency

  4. arXiv:2301.10656

    cs.CL cs.AI cs.LG

    Probing Taxonomic and Thematic Embeddings for Taxonomic Information

    Authors: Filip Klubička, John D. Kelleher

    Abstract: Modelling taxonomic and thematic relatedness is important for building AI with comprehensive natural language understanding. The goal of this paper is to learn more about how taxonomic information is structurally encoded in embeddings. To do this, we design a new hypernym-hyponym probing task and perform a comparative probing study of taxonomic and thematic SGNS and GloVe embeddings. Our experimen…

    Submitted 25 January, 2023; originally announced January 2023.

    Comments: 9 pages, 1 figure, 4 tables, In proceedings of the 12th International Global Wordnet Conference

    MSC Class: 68T30

  5. arXiv:2210.12206

    cs.CL cs.AI cs.LG

    Probing with Noise: Unpicking the Warp and Weft of Embeddings

    Authors: Filip Klubička, John D. Kelleher

    Abstract: Improving our understanding of how information is encoded in vector space can yield valuable interpretability insights. Alongside vector dimensions, we argue that it is possible for the vector norm to also carry linguistic information. We develop a method to test this: an extension of the probing framework which allows for relative intrinsic interpretations of probing results. It relies on introdu…

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: 10 pages, 3 tables, Workshop on analyzing and interpreting neural networks for NLP

    MSC Class: 68Uxx

  6. arXiv:2002.06235

    cs.CL cs.AI cs.LG

    Semantic Relatedness and Taxonomic Word Embeddings

    Authors: Magdalena Kacmajor, John D. Kelleher, Filip Klubicka, Alfredo Maldonado

    Abstract: This paper connects a series of papers dealing with taxonomic word embeddings. It begins by noting that there are different types of semantic relatedness and that different lexical representations encode different forms of relatedness. A particularly important distinction within semantic relatedness is that of thematic versus taxonomic relatedness. Next, we present a number of experiments that ana…

    Submitted 14 February, 2020; originally announced February 2020.

    Comments: 7 pages, 0 figures

  7. arXiv:1807.06998

    cs.CL

    Is it worth it? Budget-related evaluation metrics for model selection

    Authors: Filip Klubička, Giancarlo D. Salton, John D. Kelleher

    Abstract: Creating a linguistic resource is often done by using a machine learning model that filters the content that goes through to a human annotator, before going into the final resource. However, budgets are often limited, and the amount of available data exceeds the amount of affordable annotation. In order to optimize the benefit from the invested human work, we argue that deciding on which model one…

    Submitted 18 July, 2018; originally announced July 2018.

    Comments: 7 pages, 1 figure, 5 tables, In proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

  8. arXiv:1805.04661

    cs.CL cs.AI cs.CY

    Examining a hate speech corpus for hate speech detection and popularity prediction

    Authors: Filip Klubička, Raquel Fernández

    Abstract: As research on hate speech becomes more and more relevant every day, most of it is still focused on hate speech detection. By attempting to replicate a hate speech detection experiment performed on an existing Twitter corpus annotated for hate speech, we highlight some issues that arise from doing research in the field of hate speech, which is essentially still in its infancy. We take a critical l…

    Submitted 12 May, 2018; originally announced May 2018.

    Comments: 8 pages, 1 figure, 10 tables, published in proceedings of 4REAL2018: Workshop on Replicability and Reproducibility of Research Results in Science and Technology of Language

    MSC Class: 68T50

    Journal ref: In Proceedings of 4REAL Workshop 9-16 (2018)

  9. Quantitative Fine-Grained Human Evaluation of Machine Translation Systems: a Case Study on English to Croatian

    Authors: Filip Klubička, Antonio Toral, Víctor M. Sánchez-Cartagena

    Abstract: This paper presents a quantitative fine-grained manual evaluation approach to comparing the performance of different machine translation (MT) systems. We build upon the well-established Multidimensional Quality Metrics (MQM) error taxonomy and implement a novel method that assesses whether the differences in performance for MQM error types between different MT systems are statistically significant…

    Submitted 2 February, 2018; originally announced February 2018.

    Comments: 22 pages, 2 figures, 9 tables, 1 equation. This is a post-peer-review, pre-copyedit version of an article published in Machine Translation Journal. The final authenticated version will be available online at the journal page. arXiv admin note: substantial text overlap with arXiv:1706.04389

    MSC Class: 68T50

    Journal ref: Machine Translation, pp 1-21, (2018), http://rdcu.be/GIkb

  10. Fine-grained human evaluation of neural versus phrase-based machine translation

    Authors: Filip Klubička, Antonio Toral, Víctor M. Sánchez-Cartagena

    Abstract: We compare three approaches to statistical machine translation (pure phrase-based, factored phrase-based and neural) by performing a fine-grained manual evaluation via error annotation of the systems' outputs. The error types in our annotation are compliant with the multidimensional quality metrics (MQM), and the annotation is performed by two annotators. Inter-annotator agreement is high for such…

    Submitted 14 June, 2017; originally announced June 2017.

    Comments: 12 pages, 2 figures, The Prague Bulletin of Mathematical Linguistics

    ACM Class: I.2.7

    Journal ref: The Prague Bulletin of Mathematical Linguistics No. 108, pp. 121-132 (2017)