
Showing 1–8 of 8 results for author: San, N

  1. arXiv:2406.16746  [pdf, other]

    cs.LG cs.AI cs.CL

    The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources

    Authors: Shayne Longpre, Stella Biderman, Alon Albalak, Hailey Schoelkopf, Daniel McDuff, Sayash Kapoor, Kevin Klyman, Kyle Lo, Gabriel Ilharco, Nay San, Maribeth Rauh, Aviya Skowron, Bertie Vidgen, Laura Weidinger, Arvind Narayanan, Victor Sanh, David Adelani, Percy Liang, Rishi Bommasani, Peter Henderson, Sasha Luccioni, Yacine Jernite, Luca Soldaini

    Abstract: Foundation model development attracts a rapidly expanding body of contributors, scientists, and applications. To help shape responsible development practices, we introduce the Foundation Model Development Cheatsheet: a growing collection of 250+ tools and resources spanning text, vision, and speech modalities. We draw on a large body of prior work to survey resources (e.g. software, documentation,…

    Submitted 25 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2402.02302  [pdf, other]

    eess.AS cs.CL

    Predicting positive transfer for improved low-resource speech recognition using acoustic pseudo-tokens

    Authors: Nay San, Georgios Paraskevopoulos, Aryaman Arora, Xiluo He, Prabhjot Kaur, Oliver Adams, Dan Jurafsky

    Abstract: While massively multilingual speech models like wav2vec 2.0 XLSR-128 can be directly fine-tuned for automatic speech recognition (ASR), downstream performance can still be relatively poor on languages that are under-represented in the pre-training data. Continued pre-training on 70-200 hours of untranscribed speech in these languages can help -- but what about languages without that much recorded…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: Accepted for SIGTYP2024

  3. arXiv:2306.06086  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Developing Speech Processing Pipelines for Police Accountability

    Authors: Anjalie Field, Prateek Verma, Nay San, Jennifer L. Eberhardt, Dan Jurafsky

    Abstract: Police body-worn cameras have the potential to improve accountability and transparency in policing. Yet in practice, they result in millions of hours of footage that is never reviewed. We investigate the potential of large pre-trained speech models for facilitating reviews, focusing on ASR and officer speech detection in footage from traffic stops. Our proposed pipeline includes training data alig…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: Accepted to INTERSPEECH 2023

  4. arXiv:2305.10951  [pdf, other]

    cs.CL eess.AS

    Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation

    Authors: Martijn Bartelds, Nay San, Bradley McDonnell, Dan Jurafsky, Martijn Wieling

    Abstract: The performance of automatic speech recognition (ASR) systems has advanced substantially in recent years, particularly for languages for which a large amount of transcribed speech is available. Unfortunately, for low-resource languages, such as minority languages, regional languages or dialects, ASR performance generally remains much lower. In this study, we investigate whether data augmentation t…

    Submitted 18 May, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023

  5. arXiv:2302.04975  [pdf, other]

    cs.CL

    Leveraging supplementary text data to kick-start automatic speech recognition system development with limited transcriptions

    Authors: Nay San, Martijn Bartelds, Blaine Billings, Ella de Falco, Hendi Feriza, Johan Safri, Wawan Sahrozi, Ben Foley, Bradley McDonnell, Dan Jurafsky

    Abstract: Recent research using pre-trained transformer models suggests that just 10 minutes of transcribed speech may be enough to fine-tune such a model for automatic speech recognition (ASR) -- at least if we can also leverage vast amounts of text data (803 million tokens). But is that much text data necessary? We study the use of different amounts of text data, both for creating a lexicon that constrain…

    Submitted 9 February, 2023; originally announced February 2023.

    Comments: Accepted for ComputEL-6

  6. arXiv:2204.07272  [pdf, other]

    cs.CL cs.SD eess.AS

    Automated speech tools for helping communities process restricted-access corpora for language revival efforts

    Authors: Nay San, Martijn Bartelds, Tolúlopé Ògúnrèmí, Alison Mount, Ruben Thompson, Michael Higgins, Roy Barker, Jane Simpson, Dan Jurafsky

    Abstract: Many archival recordings of speech from endangered languages remain unannotated and inaccessible to community members and language learning programs. One bottleneck is the time-intensive nature of annotation. An even narrower bottleneck occurs for recordings with access constraints, such as language that must be vetted or filtered by authorised community members before annotation can begin. We pro…

    Submitted 24 April, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

    Comments: Accepted at ComputEL-5

  7. arXiv:2104.01176  [pdf]

    cs.CY q-fin.GN

    Trends in eBusiness and eGovernment

    Authors: Antonio Sánchez-Bayón, Miguel Ángel García-Ramos Lucero, Annie Ng Cheng San, Choy Johnn Yee, Krishna Moorthy, Alex Foo Tun Lee, Angelita Kithatu-Kiwekete, Shikha Vyas-Doorgapersad, Anthony Kiryagana Isabirye, Nobukhosi Dlodlo, Lydia Mbati, Edmore Tarambiwa, Chengedzai Mafini, Anastas Djurovski, Ephrem Habtemichael Redda, Jhalukpreya Surujlal

    Abstract: The first chapter is a critical review and a case study in eBusiness, with special attention to the digital currencies resource and its possibilities. The second chapter attempts to incorporate the UTAUT model with perceived risk theory to explore its impact on the intention to use m-government services. The third chapter aims to assess the level of gender inclusivity in the municipal e-procurement processes in…

    Submitted 2 April, 2021; originally announced April 2021.

  8. arXiv:2103.14583  [pdf, other]

    cs.CL cs.SD eess.AS

    Leveraging pre-trained representations to improve access to untranscribed speech from endangered languages

    Authors: Nay San, Martijn Bartelds, Mitchell Browne, Lily Clifford, Fiona Gibson, John Mansfield, David Nash, Jane Simpson, Myfany Turpin, Maria Vollmer, Sasha Wilmoth, Dan Jurafsky

    Abstract: Pre-trained speech representations like wav2vec 2.0 are a powerful tool for automatic speech recognition (ASR). Yet many endangered languages lack sufficient data for pre-training such models, or are predominantly oral vernaculars without a standardised writing system, precluding fine-tuning. Query-by-example spoken term detection (QbE-STD) offers an alternative for iteratively indexing untranscri…

    Submitted 13 September, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

    Comments: Accepted at ASRU 2021