Showing 1–8 of 8 results for author: Serikov, O

  1. arXiv:2309.11197  [pdf, other]

    cs.LG cs.CL

    The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute

    Authors: Aleksandar Stanić, Dylan Ashley, Oleg Serikov, Louis Kirsch, Francesco Faccio, Jürgen Schmidhuber, Thomas Hofmann, Imanol Schlag

    Abstract: The Languini Kitchen serves as both a research collective and codebase designed to empower researchers with limited computational resources to contribute meaningfully to the field of language modelling. We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours. The number of tokens on which a model is trained is defined by the m…

    Submitted 20 September, 2023; originally announced September 2023.

  2. arXiv:2211.05100  [pdf, other]

    cs.CL

    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

    Authors: BigScience Workshop: Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, et al. (369 additional authors not shown)

    Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access…

    Submitted 27 June, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

  3. arXiv:2210.13236  [pdf, other]

    cs.CL cs.AI

    Universal and Independent: Multilingual Probing Framework for Exhaustive Model Interpretation and Evaluation

    Authors: Oleg Serikov, Vitaly Protasov, Ekaterina Voloshina, Viktoria Knyazkova, Tatiana Shavrina

    Abstract: Linguistic analysis of language models is one of the ways to explain and describe their reasoning, weaknesses, and limitations. In the probing part of the model interpretability research, studies concern individual languages as well as individual linguistic structures. The question arises: are the detected regularities linguistically coherent, or on the contrary, do they dissonate at the typologic…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: Accepted to BlackBoxNLP, EMNLP 2022

    MSC Class: 68-04; 68-06; 68T50 ACM Class: G.3; I.2.7

  4. Is neural language acquisition similar to natural? A chronological probing study

    Authors: Ekaterina Voloshina, Oleg Serikov, Tatiana Shavrina

    Abstract: The probing methodology allows one to obtain a partial representation of linguistic phenomena stored in the inner layers of the neural network, using external classifiers and statistical analysis. Pre-trained transformer-based language models are widely used both for natural language understanding (NLU) and natural language generation (NLG) tasks making them most commonly used for downstream appli…

    Submitted 1 July, 2022; originally announced July 2022.

    Comments: Published in proceedings of Dialogue-2022 "Computational Linguistics and Intellectual Technologies"

  5. arXiv:2201.09997  [pdf, other]

    cs.CL

    Razmecheno: Named Entity Recognition from Digital Archive of Diaries "Prozhito"

    Authors: Timofey Atnashev, Veronika Ganeeva, Roman Kazakov, Daria Matyash, Michael Sonkin, Ekaterina Voloshina, Oleg Serikov, Ekaterina Artemova

    Abstract: The vast majority of existing datasets for Named Entity Recognition (NER) are built primarily on news, research papers and Wikipedia with a few exceptions, created from historical and literary texts. What is more, English is the main source for data for further labelling. This paper aims to fill in multiple gaps by creating a novel dataset "Razmecheno", gathered from the diary texts of the project…

    Submitted 24 January, 2022; originally announced January 2022.

    Comments: Submitted to LREC 2022

  6. arXiv:2106.03895  [pdf, other]

    cs.CL cs.SD eess.AS

    SIGTYP 2021 Shared Task: Robust Spoken Language Identification

    Authors: Elizabeth Salesky, Badr M. Abdullah, Sabrina J. Mielke, Elena Klyachko, Oleg Serikov, Edoardo Ponti, Ritesh Kumar, Ryan Cotterell, Ekaterina Vylomova

    Abstract: While language identification is a fundamental speech and language processing task, for many languages and language families it remains a challenging task. For many low-resource and endangered languages this is in part due to resource availability: where larger datasets exist, they may be single-speaker or have different domains than desired application scenarios, demanding a need for domain and s…

    Submitted 7 June, 2021; originally announced June 2021.

    Comments: The first three authors contributed equally

  7. arXiv:2104.12847  [pdf, other]

    cs.CL

    Morph Call: Probing Morphosyntactic Content of Multilingual Transformers

    Authors: Vladislav Mikhailov, Oleg Serikov, Ekaterina Artemova

    Abstract: The outstanding performance of transformer-based language models on a great variety of NLP and NLU tasks has stimulated interest in exploring their inner workings. Recent research has focused primarily on higher-level and complex linguistic phenomena such as syntax, semantics, world knowledge, and common sense. The majority of the studies are anglocentric, and little remains known regarding other…

    Submitted 4 May, 2021; v1 submitted 26 April, 2021; originally announced April 2021.

    Comments: To appear in the Proceedings of the 3rd Workshop on Research in Computational Typology and Multilingual NLP (SIGTYP, NAACL)

  8. Teaching a Massive Open Online Course on Natural Language Processing

    Authors: Ekaterina Artemova, Murat Apishev, Veronika Sarkisyan, Sergey Aksenov, Denis Kirjanov, Oleg Serikov

    Abstract: This paper presents a new Massive Open Online Course on Natural Language Processing, targeted at non-English speaking students. The course lasts 12 weeks; every week consists of lectures, practical sessions, and quiz assignments. Three weeks out of 12 are followed by Kaggle-style coding assignments. Our course intends to serve multiple purposes: (i) familiarize students with the core concepts an…

    Submitted 4 May, 2021; v1 submitted 26 April, 2021; originally announced April 2021.

    Comments: To appear in the Proceedings of the Fifth Workshop on Teaching NLP @ NAACL