-
On Speaker Attribution with SURT
Authors:
Desh Raj,
Matthew Wiesner,
Matthew Maciejewski,
Leibny Paola Garcia-Perera,
Daniel Povey,
Sanjeev Khudanpur
Abstract:
The Streaming Unmixing and Recognition Transducer (SURT) has recently become a popular framework for continuous, streaming, multi-talker speech recognition (ASR). With advances in architecture, objectives, and mixture simulation methods, it was demonstrated that SURT can be an efficient streaming method for speaker-agnostic transcription of real meetings. In this work, we push this framework further by proposing methods to perform speaker-attributed transcription with SURT, for both short mixtures and long recordings. We achieve this by adding an auxiliary speaker branch to SURT, and synchronizing its label prediction with ASR token prediction through HAT-style blank factorization. In order to ensure consistency in relative speaker labels across different utterance groups in a recording, we propose "speaker prefixing" -- appending each chunk with high-confidence frames of speakers identified in previous chunks, to establish the relative order. We perform extensive ablation experiments on synthetic LibriSpeech mixtures to validate our design choices, and demonstrate the efficacy of our final model on the AMI corpus.
Submitted 28 January, 2024;
originally announced January 2024.
-
The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios
Authors:
Samuele Cornell,
Matthew Wiesner,
Shinji Watanabe,
Desh Raj,
Xuankai Chang,
Paola Garcia,
Matthew Maciejewski,
Yoshiki Masuyama,
Zhong-Qiu Wang,
Stefano Squartini,
Sanjeev Khudanpur
Abstract:
The CHiME challenges have played a significant role in the development and evaluation of robust automatic speech recognition (ASR) systems. We introduce the CHiME-7 distant ASR (DASR) task, within the 7th CHiME challenge. This task comprises joint ASR and diarization in far-field settings with multiple, and possibly heterogeneous, recording devices. Different from previous challenges, we evaluate systems on 3 diverse scenarios: CHiME-6, DiPCo, and Mixer 6. The goal is for participants to devise a single system that can generalize across different array geometries and use cases with no a-priori information. Another departure from earlier CHiME iterations is that participants are allowed to use open-source pre-trained models and datasets. In this paper, we describe the challenge design, motivation, and fundamental research questions in detail. We also present the baseline system, which is fully array-topology agnostic and features multi-channel diarization, channel selection, guided source separation and a robust ASR model that leverages self-supervised speech representations (SSLR).
Submitted 14 July, 2023; v1 submitted 23 June, 2023;
originally announced June 2023.
-
Training Noisy Single-Channel Speech Separation With Noisy Oracle Sources: A Large Gap and A Small Step
Authors:
Matthew Maciejewski,
Jing Shi,
Shinji Watanabe,
Sanjeev Khudanpur
Abstract:
As the performance of single-channel speech separation systems has improved, there has been a desire to move to more challenging conditions than the clean, near-field speech that initial systems were developed on. When training deep learning separation models, a need for ground truth leads to training on synthetic mixtures. As such, training in noisy conditions requires either using noise synthetically added to clean speech, preventing the use of in-domain data for a noisy-condition task, or training using mixtures of noisy speech, requiring the network to additionally separate the noise. We demonstrate the relative inseparability of noise and that this noisy speech paradigm leads to significant degradation of system performance. We also propose an SI-SDR-inspired training objective that tries to exploit the inseparability of noise to implicitly partition the signal and discount noise separation errors, enabling the training of better separation systems with noisy oracle sources.
Submitted 22 February, 2021; v1 submitted 23 October, 2020;
originally announced October 2020.
-
The JHU Multi-Microphone Multi-Speaker ASR System for the CHiME-6 Challenge
Authors:
Ashish Arora,
Desh Raj,
Aswin Shanmugam Subramanian,
Ke Li,
Bar Ben-Yair,
Matthew Maciejewski,
Piotr Żelasko,
Paola García,
Shinji Watanabe,
Sanjeev Khudanpur
Abstract:
This paper summarizes the JHU team's efforts in tracks 1 and 2 of the CHiME-6 challenge for distant multi-microphone conversational speech diarization and recognition in everyday home environments. We explore multi-array processing techniques at each stage of the pipeline, such as multi-array guided source separation (GSS) for enhancement and acoustic model training data, posterior fusion for speech activity detection, PLDA score fusion for diarization, and lattice combination for automatic speech recognition (ASR). We also report results with different acoustic model architectures, and integrate other techniques such as online multi-channel weighted prediction error (WPE) dereverberation and variational Bayes-hidden Markov model (VB-HMM) based overlap assignment to deal with reverberation and overlapping speakers, respectively. As a result of these efforts, our ASR systems achieve word error rates of 40.5% and 67.5% on tracks 1 and 2, respectively, on the evaluation set. These are absolute improvements of 10.8% and 10.4% over the challenge baselines for the respective tracks.
Submitted 14 June, 2020;
originally announced June 2020.
-
WHAMR!: Noisy and Reverberant Single-Channel Speech Separation
Authors:
Matthew Maciejewski,
Gordon Wichern,
Emmett McQuinn,
Jonathan Le Roux
Abstract:
While significant advances have been made with respect to the separation of overlapping speech signals, studies have been largely constrained to mixtures of clean, near anechoic speech, not representative of many real-world scenarios. Although the WHAM! dataset introduced noise to the ubiquitous wsj0-2mix dataset, it did not include reverberation, which is generally present in indoor recordings outside of recording studios. The spectral smearing caused by reverberation can result in significant performance degradation for standard deep learning-based speech separation systems, which rely on spectral structure and the sparsity of speech signals to tease apart sources. To address this, we introduce WHAMR!, an augmented version of WHAM! with synthetic reverberated sources, and provide a thorough baseline analysis of current techniques as well as novel cascaded architectures on the newly introduced conditions.
Submitted 14 February, 2020; v1 submitted 22 October, 2019;
originally announced October 2019.
-
Building Corpora for Single-Channel Speech Separation Across Multiple Domains
Authors:
Matthew Maciejewski,
Gregory Sell,
Leibny Paola Garcia-Perera,
Shinji Watanabe,
Sanjeev Khudanpur
Abstract:
To date, the bulk of research on single-channel speech separation has been conducted using clean, near-field, read speech, which is not representative of many modern applications. In this work, we develop a procedure for constructing high-quality synthetic overlap datasets, necessary for most deep learning-based separation frameworks. We produced datasets that are more representative of realistic applications using the CHiME-5 and Mixer 6 corpora and evaluate standard methods on this data to demonstrate the shortcomings of current source-separation performance. We also demonstrate the value of a wide variety of data in training robust models that generalize well to multiple conditions.
Submitted 6 November, 2018;
originally announced November 2018.
-
STEAM: A Hierarchical Co-Simulation Framework for Superconducting Accelerator Magnet Circuits
Authors:
Lorenzo Bortot,
Bernhard Auchmann,
Idoia Cortes Garcia,
Alejandro M. Fernandez Navarro,
Michał Maciejewski,
Matthias Mentink,
Marco Prioli,
Emmanuele Ravaioli,
Sebastian Schöps,
Arjan Verweij
Abstract:
Simulating the transient effects occurring in superconducting accelerator magnet circuits requires including the mutual electro-thermo-dynamic interaction among the circuit elements, such as power converters, magnets, and protection systems. Nevertheless, the numerical analysis is traditionally done separately for each element in the circuit, which can lead to inconsistent results. We present STEAM, a hierarchical co-simulation framework featuring the waveform relaxation method. The framework simulates a complex system as a composition of simpler, independent models that exchange information. The convergence of the coupling algorithm ensures the consistency of the solution. The modularity of the framework allows integrating models developed with both proprietary and in-house tools. The framework implements a user-customizable hierarchical algorithm to schedule how models participate in the co-simulation, for the purpose of using computational resources efficiently. As a case study, a quench scenario is co-simulated for the inner triplet circuit for the High Luminosity upgrade of the LHC at CERN.
Submitted 26 January, 2018;
originally announced January 2018.
-
Coupling of Magneto-Thermal and Mechanical Superconducting Magnet Models by Means of Mesh-Based Interpolation
Authors:
Michał Maciejewski,
Pascal Bayrasy,
Klaus Wolf,
Michał Wilczek,
Bernhard Auchmann,
Tina Griesemer,
Lorenzo Bortot,
Marco Prioli,
Alejandro Manuel Fernandez Navarro,
Sebastian Schöps,
Idoia Cortes Garcia,
Arjan Verweij
Abstract:
In this paper we present an algorithm for the coupling of magneto-thermal and mechanical finite element models representing superconducting accelerator magnets. The mechanical models are used during the design of the mechanical structure as well as the optimization of the magnetic field quality under nominal conditions. The magneto-thermal models allow for the analysis of transient phenomena occurring during quench initiation, propagation, and protection. Mechanical analysis of quenching magnets is of high importance considering the design of new protection systems and the study of new superconductor types. We use field/circuit coupling to determine temperature and electromagnetic force evolution during the magnet discharge. These quantities are provided as a load to existing mechanical models. The models are discretized with different meshes and, therefore, we employ a mesh-based interpolation method to exchange coupled quantities. The coupling algorithm is illustrated with a simulation of a mechanical response of a standalone high-field dipole magnet protected with CLIQ (Coupling-Loss Induced Quench) technology.
Submitted 29 December, 2017;
originally announced December 2017.
-
PRT (Personal Rapid Transit) network simulation
Authors:
Włodzimierz Choromański,
Wiktor Daszczuk,
Jarosław Dyduch,
Mariusz Maciejewski,
Paweł Brach,
Waldemar Grabski
Abstract:
The transportation problems of large urban conurbations inspire the search for new transportation systems that meet high environmental standards, are relatively cheap, and are user-friendly; user-friendliness also covers the needs of disabled and elderly people. This article concerns one such system, PRT (Personal Rapid Transit), and focuses on analyzing the efficiency of the PRT transport network. A simulator of vehicle movement in a PRT network is presented, along with algorithms for traffic management and control. A proposal for its physical implementation is also included.
Submitted 18 October, 2017;
originally announced November 2017.
-
Reduced Order Modelling for the Simulation of Quenches in Superconducting Magnets
Authors:
Sebastian Schöps,
Idoia Cortes Garcia,
Michał Maciejewski,
Bernhard Auchmann
Abstract:
This contribution discusses the simulation of magnetothermal effects in superconducting magnets as used in particle accelerators. An iterative coupling scheme using reduced order models between a magnetothermal partial differential model and an electrical lumped-element circuit is demonstrated. The multiphysics, multirate and multiscale problem requires a consistent formulation and framework to tackle the challenging transient effects occurring at both system and device level.
Submitted 13 October, 2017;
originally announced October 2017.
-
Application of the Waveform Relaxation Technique to the Co-Simulation of Power Converter Controller and Electrical Circuit Models
Authors:
Michał Maciejewski,
Idoia Cortes Garcia,
Sebastian Schöps,
Bernhard Auchmann,
Lorenzo Bortot,
Marco Prioli,
Arjan Verweij
Abstract:
In this paper we present the co-simulation of a PID class power converter controller and an electrical circuit by means of the waveform relaxation technique. The simulation of the controller model is characterized by a fixed-time stepping scheme reflecting its digital implementation, whereas a circuit simulation usually employs an adaptive time stepping scheme in order to account for a wide range of time constants within the circuit model. In order to maintain the characteristic of both models as well as to facilitate model replacement, we treat them separately by means of input/output relations and propose an application of a waveform relaxation algorithm. Furthermore, the maximum and minimum number of iterations of the proposed algorithm are mathematically analyzed. The concept of controller/circuit coupling is illustrated by an example of the co-simulation of a PI power converter controller and a model of the main dipole circuit of the Large Hadron Collider.
Submitted 10 April, 2017;
originally announced April 2017.
-
Optimized Field/Circuit Coupling for the Simulation of Quenches in Superconducting Magnets
Authors:
Idoia Cortes Garcia,
Sebastian Schöps,
Michał Maciejewski,
Lorenzo Bortot,
Marco Prioli,
Bernhard Auchmann,
Arjan Verweij
Abstract:
In this paper, we propose an optimized field/circuit coupling approach for the simulation of magnetothermal transients in superconducting magnets. The approach improves the convergence of the iterative coupling scheme between a magnetothermal partial differential model and an electrical lumped-element circuit. Such a multi-physics, multi-rate and multi-scale problem requires a consistent formulation and a dedicated framework to tackle the challenging transient effects occurring at both circuit and magnet level during normal operation and in case of faults. We derive an equivalent magnet model at the circuit side for the linear and the non-linear settings and discuss the convergence of the overall scheme in the framework of optimized Schwarz methods. The efficiency of the developed approach is illustrated by a numerical example of an accelerator dipole magnet with accompanying protection system.
Submitted 6 July, 2017; v1 submitted 3 February, 2017;
originally announced February 2017.