Showing 1–15 of 15 results for author: Roth, J

  1. arXiv:2405.07965  [pdf, other]

    math.OC cs.LG

    Fast Computation of Superquantile-Constrained Optimization Through Implicit Scenario Reduction

    Authors: Jake Roth, Ying Cui

    Abstract: Superquantiles have recently gained significant interest as a risk-aware metric for addressing fairness and distribution shifts in statistical learning and decision making problems. This paper introduces a fast, scalable and robust second-order computational framework to solve large-scale optimization problems with superquantile-based constraints. Unlike empirical risk minimization, superquantile-…

    Submitted 20 May, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: 34 pages, 2 figures

    MSC Class: 90-04; 90-08; 90C06; 90C25

  2. arXiv:2403.08847  [pdf, ps, other]

    astro-ph.IM cs.LG stat.CO

    JAXbind: Bind any function to JAX

    Authors: Jakob Roth, Martin Reinecke, Gordian Edenhofer

    Abstract: JAX is widely used in machine learning and scientific computing, the latter of which often relies on existing high-performance code that we would ideally like to incorporate into JAX. Reimplementing the existing code in JAX is often impractical and the existing interface in JAX for binding custom code either limits the user to a single Jacobian product or requires deep knowledge of JAX and its C++…

    Submitted 27 June, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: 4 pages, Github: https://github.com/NIFTy-PPL/JAXbind

    Journal ref: Journal of Open Source Software, 9(98), 6532 (2024)

  3. arXiv:2402.16683  [pdf, other]

    astro-ph.IM cs.LG stat.ML

    Re-Envisioning Numerical Information Field Theory (NIFTy.re): A Library for Gaussian Processes and Variational Inference

    Authors: Gordian Edenhofer, Philipp Frank, Jakob Roth, Reimar H. Leike, Massin Guerdi, Lukas I. Scheel-Platz, Matteo Guardiani, Vincent Eberle, Margret Westerkamp, Torsten A. Enßlin

    Abstract: Imaging is the process of transforming noisy, incomplete data into a space that humans can interpret. NIFTy is a Bayesian framework for imaging and has already successfully been applied to many fields in astrophysics. Previous design decisions held the performance and the development of methods in NIFTy back. We present a rewrite of NIFTy, coined NIFTy.re, which reworks the modeling principle, ext…

    Submitted 15 June, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: 10 pages, 2 figures, published in Journal of Open Source Software (JOSS)

    Journal ref: Journal of Open Source Software, volume 9(98), year 2024, page 6593

  4. arXiv:2307.03463  [pdf, ps, other]

    cs.CE

    Parametrised polyconvex hyperelasticity with physics-augmented neural networks

    Authors: Dominik K. Klein, Fabian J. Roth, Iman Valizadeh, Oliver Weeger

    Abstract: In the present work, neural networks are applied to formulate parametrised hyperelastic constitutive models. The models fulfill all common mechanical conditions of hyperelasticity by construction. In particular, partially input-convex neural network (pICNN) architectures are applied based on feed-forward neural networks. Receiving two different sets of input arguments, pICNNs are convex in one of…

    Submitted 7 July, 2023; originally announced July 2023.

  5. arXiv:2212.08568  [pdf, other]

    cs.CV cs.LG

    Biomedical image analysis competitions: The state of current participation practice

    Authors: Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali, Anubha Gupta, Jan Kybic, Alison Noble, Carlos Ortiz de Solórzano, Samiksha Pachade, Caroline Petitjean, Daniel Sage, Donglai Wei, Elizabeth Wilden, Deepak Alapatt, Vincent Andrearczyk, Ujjwal Baid, Spyridon Bakas, Niranjan Balu, Sophia Bano, et al. (331 additional authors not shown)

    Abstract: The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis,…

    Submitted 12 September, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

  6. arXiv:2202.08164  [pdf, other]

    eess.AS cs.CL cs.LG

    Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module

    Authors: Adam Gabryś, Goeric Huybrechts, Manuel Sam Ribeiro, Chung-Ming Chien, Julian Roth, Giulia Comini, Roberto Barra-Chicote, Bartek Perz, Jaime Lorenzo-Trueba

    Abstract: State-of-the-art text-to-speech (TTS) systems require several hours of recorded speech data to generate high-quality synthetic speech. When using reduced amounts of training data, standard TTS models suffer from speech quality and intelligibility degradations, making training low-resource TTS systems problematic. In this paper, we propose a novel extremely low-resource TTS method called Voice Filt…

    Submitted 16 February, 2022; originally announced February 2022.

    Comments: Accepted at ICASSP 2022

  7. arXiv:2202.05083  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Cross-speaker style transfer for text-to-speech using data augmentation

    Authors: Manuel Sam Ribeiro, Julian Roth, Giulia Comini, Goeric Huybrechts, Adam Gabrys, Jaime Lorenzo-Trueba

    Abstract: We address the problem of cross-speaker style transfer for text-to-speech (TTS) using data augmentation via voice conversion. We assume to have a corpus of neutral non-expressive data from a target speaker and supporting conversational expressive data from different speakers. Our goal is to build a TTS system that is expressive, while retaining the target speaker's identity. The proposed approach…

    Submitted 10 February, 2022; originally announced February 2022.

    Comments: 5 pages, 3 figures, 4 tables. ICASSP 2022

  8. arXiv:2111.07889  [pdf, other]

    econ.EM cs.CY

    An Outcome Test of Discrimination for Ranked Lists

    Authors: Jonathan Roth, Guillaume Saint-Jacques, YinYin Yu

    Abstract: This paper extends Becker (1957)'s outcome test of discrimination to settings where a (human or algorithmic) decision-maker produces a ranked list of candidates. Ranked lists are particularly relevant in the context of online platforms that produce search results or feeds, and also arise when human decisionmakers express ordinal preferences over a list of candidates. We show that non-discriminatio…

    Submitted 15 November, 2021; originally announced November 2021.

  9. arXiv:2106.05762  [pdf, other]

    cs.SD cs.CL eess.AS

    Improving multi-speaker TTS prosody variance with a residual encoder and normalizing flows

    Authors: Iván Vallés-Pérez, Julian Roth, Grzegorz Beringer, Roberto Barra-Chicote, Jasha Droppo

    Abstract: Text-to-speech systems recently achieved almost indistinguishable quality from human speech. However, the prosody of those systems is generally flatter than natural speech, producing samples with low expressiveness. Disentanglement of speaker id and prosody is crucial in text-to-speech systems to improve on naturalness and produce more variable syntheses. This paper proposes a new neural text-to-s…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: in Proceedings of Interspeech 2021 conference

  10. The ASHRAE Great Energy Predictor III competition: Overview and results

    Authors: Clayton Miller, Pandarasamy Arjunan, Anjukan Kathirgamanathan, Chun Fu, Jonathan Roth, June Young Park, Chris Balbach, Krishnan Gowri, Zoltan Nagy, Anthony Fontanini, Jeff Haberl

    Abstract: In late 2019, ASHRAE hosted the Great Energy Predictor III (GEPIII) machine learning competition on the Kaggle platform. This launch marked the third energy prediction competition from ASHRAE and the first since the mid-1990s. In this updated version, the competitors were provided with over 20 million points of training data from 2,380 energy meters collected for 1,448 buildings from 16 sources. T…

    Submitted 14 July, 2020; originally announced July 2020.

    Journal ref: Science and Technology for the Built Environment, 26:10, 1427-1447, (2020)

  11. arXiv:1912.07747  [pdf]

    cs.IR cs.CL cs.LG

    Pipelines for Procedural Information Extraction from Scientific Literature: Towards Recipes using Machine Learning and Data Science

    Authors: Huichen Yang, Carlos A. Aguirre, Maria F. De La Torre, Derek Christensen, Luis Bobadilla, Emily Davich, Jordan Roth, Lei Luo, Yihong Theis, Alice Lam, T. Yong-Jin Han, David Buttler, William H. Hsu

    Abstract: This paper describes a machine learning and data science pipeline for structured information extraction from documents, implemented as a suite of open-source tools and extensions to existing tools. It centers around a methodology for extracting procedural information in the form of recipes, stepwise procedures for creating an artifact (in this case synthesizing a nanomaterial), from published scie…

    Submitted 16 December, 2019; originally announced December 2019.

    Comments: 15th International Conference on Document Analysis and Recognition Workshops (ICDARW 2019)

    Report number: 2019-1

    MSC Class: I.2.7; I.2.6; H.3.3; H.3.4; I.2.10; I.5.4

    ACM Class: I.2.7; I.2.6; H.3.3; H.3.4; I.2.10; I.5.4

  12. Bias In, Bias Out? Evaluating the Folk Wisdom

    Authors: Ashesh Rambachan, Jonathan Roth

    Abstract: We evaluate the folk wisdom that algorithmic decision rules trained on data produced by biased human decision-makers necessarily reflect this bias. We consider a setting where training labels are only generated if a biased decision-maker takes a particular action, and so "biased" training data arise due to discriminatory selection into the training data. In our baseline model, the more biased the…

    Submitted 19 December, 2020; v1 submitted 18 September, 2019; originally announced September 2019.

    Journal ref: 1st Symposium on Foundations of Responsible Computing (FORC 2020)

  13. arXiv:1901.01342  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    AVA-ActiveSpeaker: An Audio-Visual Dataset for Active Speaker Detection

    Authors: Joseph Roth, Sourish Chaudhuri, Ondrej Klejch, Radhika Marvin, Andrew Gallagher, Liat Kaver, Sharadh Ramaswamy, Arkadiusz Stopczynski, Cordelia Schmid, Zhonghua Xi, Caroline Pantofaru

    Abstract: Active speaker detection is an important component in video analysis algorithms for applications such as speaker diarization, video re-targeting for meetings, speech enhancement, and human-robot interaction. The absence of a large, carefully labeled audio-visual dataset for this task has constrained algorithm evaluations with respect to data diversity, environments, and accuracy. This has made com…

    Submitted 24 May, 2019; v1 submitted 4 January, 2019; originally announced January 2019.

  14. arXiv:1810.00319  [pdf, other]

    cs.LG cs.CV stat.ML

    Modeling Uncertainty with Hedged Instance Embedding

    Authors: Seong Joon Oh, Kevin Murphy, Jiyan Pan, Joseph Roth, Florian Schroff, Andrew Gallagher

    Abstract: Instance embeddings are an efficient and versatile image representation that facilitates applications like recognition, verification, retrieval, and clustering. Many metric learning methods represent the input as a single point in the embedding space. Often the distance between points is used as a proxy for match confidence. However, this can fail to represent uncertainty arising when the input is…

    Submitted 26 August, 2019; v1 submitted 30 September, 2018; originally announced October 2018.

    Comments: 15 pages, 11 figures, updated version of ICLR'19

  15. arXiv:1808.00606  [pdf, other]

    cs.SD eess.AS

    AVA-Speech: A Densely Labeled Dataset of Speech Activity in Movies

    Authors: Sourish Chaudhuri, Joseph Roth, Daniel P. W. Ellis, Andrew Gallagher, Liat Kaver, Radhika Marvin, Caroline Pantofaru, Nathan Reale, Loretta Guarino Reid, Kevin Wilson, Zhonghua Xi

    Abstract: Speech activity detection (or endpointing) is an important processing step for applications such as speech recognition, language identification and speaker diarization. Both audio- and vision-based approaches have been used for this task in various settings, often tailored toward end applications. However, much of the prior work reports results in synthetic settings, on task-specific datasets, or…

    Submitted 23 August, 2018; v1 submitted 1 August, 2018; originally announced August 2018.

    Comments: Interspeech, 2018