Showing 1–50 of 63 results for author: García, P

  1. arXiv:2407.05817  [pdf, other]

    cs.DC cs.AI

    Cyber Physical Games

    Authors: Warisa Sritriratanarak, Paulo Garcia

    Abstract: We describe a formulation of multi-agents operating within a Cyber-Physical System, resulting in collaborative or adversarial games. We show that the non-determinism inherent in the communication medium between agents and the underlying physical environment gives rise to environment evolution that is a probabilistic function of agents' strategies. We name these emergent properties Cyber Physical G…

    Submitted 8 July, 2024; originally announced July 2024.

  2. arXiv:2406.02210  [pdf, other]

    cs.RO eess.SY

    An Open and Reconfigurable User Interface to Manage Complex ROS-based Robotic Systems

    Authors: Pablo Malvido Fresnillo, Saigopal Vasudevan, Jose A. Perez Garcia, Jose L. Martinez Lastra

    Abstract: The Robot Operating System (ROS) has significantly gained popularity among robotic engineers and researchers over the past five years, primarily due to its powerful infrastructure for node communication, which enables developers to build modular and large robotic applications. However, ROS presents a steep learning curve and lacks the intuitive usability of vendor-specific robotic Graphical User I…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 14 pages, 12 figures, 3 tables

  3. Perfect codes over non-prime power alphabets: an approach based on Diophantine equations

    Authors: Pedro-José Cazorla García

    Abstract: Perfect error correcting codes allow for an optimal transmission of information while guaranteeing error correction. For this reason, proving their existence has been a classical problem in both pure mathematics and information theory. Indeed, the classification of the parameters of $e$-error correcting perfect codes over $q$-ary alphabets was a very active topic of research in the late 20th centu…

    Submitted 24 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

    Comments: 12 pages, 2 tables. The new version includes the comments by the anonymous referees

    MSC Class: 94B65; 11D61 (Primary); 11G05; 11G50; 14G05 (Secondary)

    Journal ref: Mathematics 2024, 12(11), 1642

  4. arXiv:2404.11498  [pdf, other]

    cs.SE cs.RO

    Runtime Verification and Field Testing for ROS-Based Robotic Systems

    Authors: Ricardo Caldas, Juan Antonio Piñera García, Matei Schiopu, Patrizio Pelliccione, Genaína Rodrigues, Thorsten Berger

    Abstract: Robotic systems are becoming pervasive and adopted in increasingly many domains, such as manufacturing, healthcare, and space exploration. To this end, engineering software has emerged as a crucial discipline for building maintainable and reusable robotic systems. Robotics software engineering research has received increasing attention, fostering autonomy as a fundamental goal. However, robotics d…

    Submitted 17 April, 2024; originally announced April 2024.

  5. Enhancing Students' Learning Process Through Self-Generated Tests

    Authors: Marcos Sánchez-Élez, Inmaculada Pardines, Pablo García, Guadalupe Miñana, Sara Román, Margarita Sánchez, José L. Risco-Martín

    Abstract: The use of new technologies in higher education has surprisingly emphasized students' tendency to adopt a passive behavior in class. Participation and interaction of students are essential to improve academic results. This paper describes an educational experiment aimed at the promotion of students' autonomous learning by requiring them to generate test type questions related to the contents of th…

    Submitted 21 March, 2024; originally announced March 2024.

    Journal ref: Journal of Science Education and Technology, 23(1), pp. 15-25, 2014

  6. arXiv:2402.10642  [pdf, other]

    eess.AS cs.AI

    Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model

    Authors: Xiangyu Zhang, Daijiao Liu, Hexin Liu, Qiquan Zhang, Hanyu Meng, Leibny Paola Garcia, Eng Siong Chng, Lina Yao

    Abstract: Recently, Denoising Diffusion Probabilistic Models (DDPMs) have attained leading performances across a diverse range of generative tasks. However, in the field of speech synthesis, although DDPMs exhibit impressive performance, their long training duration and substantial inference costs hinder practical deployment. Existing approaches primarily focus on enhancing inference speed, while approaches…

    Submitted 16 February, 2024; originally announced February 2024.

  7. arXiv:2402.09734  [pdf, other]

    cs.AI

    Agents Need Not Know Their Purpose

    Authors: Paulo Garcia

    Abstract: Ensuring artificial intelligence behaves in such a way that is aligned with human values is commonly referred to as the alignment challenge. Prior work has shown that rational agents, behaving in such a way that maximizes a utility function, will inevitably behave in such a way that is not aligned with human values, especially as their level of intelligence goes up. Prior work has also shown that…

    Submitted 15 February, 2024; originally announced February 2024.

  8. arXiv:2402.06201  [pdf, other]

    cs.RO eess.SY

    Maximizing Consistent Force Output for Shape Memory Alloy Artificial Muscles in Soft Robots

    Authors: Meredith L. Anderson, Ran Jing, Juan C. Pacheco Garcia, Ilyoung Yang, Sarah Alizadeh-Shabdiz, Charles DeLorey, Andrew P. Sabelhaus

    Abstract: Soft robots have immense potential given their inherent safety and adaptability, but challenges in soft actuator forces and design constraints have limited scaling up soft robots to larger sizes. Electrothermal shape memory alloy (SMA) artificial muscles have the potential to create these large forces and high displacements, but consistently using these muscles under a well-defined model, in-situ…

    Submitted 9 February, 2024; originally announced February 2024.

    Comments: 8 pages, 8 figures, accepted by 2024 IEEE International Conference on Soft Robotics (RoboSoft)

  9. arXiv:2401.07726  [pdf, other]

    cs.PL cs.RO

    Preserving Power Optimizations Across the High Level Synthesis of Distinct Application-Specific Circuits

    Authors: Paulo Garcia

    Abstract: We evaluate the use of software interpretation to push High Level Synthesis of application-specific accelerators toward a higher level of abstraction. Our methodology is supported by a formal power consumption model that computes the power consumption of accelerator components, accurately predicting the power consumption on new designs from prior optimization estimations. We demonstrate how our ap…

    Submitted 9 July, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

    Comments: Accepted at IEEE 10th International Conference on Communications and Electronics (ICCE) 2024

  10. arXiv:2401.07429  [pdf, other]

    cs.AR

    Accelerating Boolean Constraint Propagation for Efficient SAT-Solving on FPGAs

    Authors: Hariprasadh Govindasamy, Babak Esfandiari, Paulo Garcia

    Abstract: We present a hardware-accelerated SAT solver targeting processor/Field Programmable Gate Arrays (FPGA) SoCs. Our solution accelerates the most expensive subroutine of the Davis-Putnam-Logemann-Loveland (DPLL) algorithm, Boolean Constraint Propagation (BCP) through fine-grained FPGA parallelism. Unlike prior state-of-the-art solutions, our solver eliminates costly clause look-up operations by assig…

    Submitted 13 April, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

    Comments: Accepted at ACM GLSVLSI 2024

  11. arXiv:2312.11279  [pdf, other]

    cs.AR

    FPGAs (Can Get Some) SATisfaction

    Authors: Hariprasadh Govindasamy, Babak Esfandiari, Paulo Garcia

    Abstract: We present a hardware-accelerated SAT solver suitable for processor/Field Programmable Gate Arrays (FPGA) hybrid platforms, which have become the norm in the embedded domain. Our solution addresses a known bottleneck in SAT solving acceleration: unlike prior state-of-the-art solutions that have addressed the same bottleneck by limiting the amount of exploited parallelism, our solver takes advantag…

    Submitted 18 December, 2023; originally announced December 2023.

  12. arXiv:2312.09546  [pdf, other]

    cs.AI

    On a Functional Definition of Intelligence

    Authors: Warisa Sritriratanarak, Paulo Garcia

    Abstract: Without an agreed-upon definition of intelligence, asking "is this system intelligent?" is an untestable question. This lack of consensus hinders research, and public perception, on Artificial Intelligence (AI), particularly since the rise of generative- and large-language models. Most work on precisely capturing what we mean by "intelligence" has come from the fields of philosophy, psychology, a…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: submitted; under review at "Journal of Intelligent Computing, SPJ"

  13. arXiv:2312.08650  [pdf, other]

    cs.CV eess.SP

    PhyOT: Physics-informed object tracking in surveillance cameras

    Authors: Kawisorn Kamtue, Jose M. F. Moura, Orathai Sangpetch, Paulo Garcia

    Abstract: While deep learning has been very successful in computer vision, real world operating conditions such as lighting variation, background clutter, or occlusion hinder its accuracy across several tasks. Prior work has shown that hybrid models -- combining neural networks and heuristics/algorithms -- can outperform vanilla deep learning for several computer vision tasks, such as classification or trac…

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: Accepted at IEEE ICASSP 2024 on December 13, 2023

  14. arXiv:2311.15954  [pdf, other]

    cs.CL eess.AS

    A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors

    Authors: Shuyue Stella Li, Beining Xu, Xiangyu Zhang, Hexin Liu, Wenhan Chao, Leibny Paola Garcia

    Abstract: In this work, we study the features extracted by English self-supervised learning (SSL) models in cross-lingual contexts and propose a new metric to predict the quality of feature representations. Using automatic speech recognition (ASR) as a downstream task, we analyze the effect of model size, training objectives, and model architecture on the models' performance as a feature extractor for a set…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: 12 pages, 5 figures, 4 tables

  15. arXiv:2310.01719  [pdf]

    cs.SE

    Software Testing and Code Refactoring: A Survey with Practitioners

    Authors: Danilo Leandro Lima, Ronnie de Souza Santos, Guilherme Pires Garcia, Sildemir S. da Silva, Cesar Franca, Luiz Fernando Capretz

    Abstract: Nowadays, software testing professionals are commonly required to develop coding skills to work on test automation. One essential skill required from those who code is the ability to implement code refactoring, a valued quality aspect of software development; however, software developers usually encounter obstacles in successfully applying this practice. In this scenario, the present study aims to…

    Submitted 2 October, 2023; originally announced October 2023.

  16. arXiv:2309.16953  [pdf, other]

    eess.AS cs.SD

    Enhancing Code-switching Speech Recognition with Interactive Language Biases

    Authors: Hexin Liu, Leibny Paola Garcia, Xiangyu Zhang, Andy W. H. Khong, Sanjeev Khudanpur

    Abstract: Languages usually switch within a multilingual speech signal, especially in a bilingual society. This phenomenon is referred to as code-switching (CS), making automatic speech recognition (ASR) challenging under a multilingual scenario. We propose to improve CS-ASR by biasing the hybrid CTC/attention ASR model with multi-level language information comprising frame- and token-level language posteri…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: Submitted to IEEE ICASSP 2024

  17. arXiv:2309.15018  [pdf, other]

    cs.CV cs.AI cs.HC q-bio.NC

    Unidirectional brain-computer interface: Artificial neural network encoding natural images to fMRI response in the visual cortex

    Authors: Ruixing Liang, Xiangyu Zhang, Qiong Li, Lai Wei, Hexin Liu, Avisha Kumar, Kelley M. Kempski Leadingham, Joshua Punnoose, Leibny Paola Garcia, Amir Manbachi

    Abstract: While significant advancements in artificial intelligence (AI) have catalyzed progress across various domains, its full potential in understanding visual perception remains underexplored. We propose an artificial neural network dubbed VISION, an acronym for "Visual Interface System for Imaging Output of Neural activity," to mimic the human brain and show how it can foster neuroscientific inquiries…

    Submitted 26 September, 2023; originally announced September 2023.

  18. arXiv:2306.13734  [pdf, other]

    eess.AS cs.CL cs.SD

    The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios

    Authors: Samuele Cornell, Matthew Wiesner, Shinji Watanabe, Desh Raj, Xuankai Chang, Paola Garcia, Matthew Maciejewski, Yoshiki Masuyama, Zhong-Qiu Wang, Stefano Squartini, Sanjeev Khudanpur

    Abstract: The CHiME challenges have played a significant role in the development and evaluation of robust automatic speech recognition (ASR) systems. We introduce the CHiME-7 distant ASR (DASR) task, within the 7th CHiME challenge. This task comprises joint ASR and diarization in far-field settings with multiple, and possibly heterogeneous, recording devices. Different from previous challenges, we evaluate…

    Submitted 14 July, 2023; v1 submitted 23 June, 2023; originally announced June 2023.

  19. arXiv:2306.01031  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts

    Authors: Dongji Gao, Matthew Wiesner, Hainan Xu, Leibny Paola Garcia, Daniel Povey, Sanjeev Khudanpur

    Abstract: This paper presents a novel algorithm for building an automatic speech recognition (ASR) model with imperfect training data. Imperfectly transcribed speech is a prevalent issue in human-annotated speech corpora, which degrades the performance of ASR models. To address this problem, we propose Bypass Temporal Classification (BTC) as an expansion of the Connectionist Temporal Classification (CTC) cr…

    Submitted 1 June, 2023; originally announced June 2023.

  20. arXiv:2212.10249  [pdf, other]

    q-bio.NC cs.LG cs.NE

    Learning efficient backprojections across cortical hierarchies in real time

    Authors: Kevin Max, Laura Kriener, Garibaldi Pineda García, Thomas Nowotny, Ismael Jaras, Walter Senn, Mihai A. Petrovici

    Abstract: Models of sensory processing and learning in the cortex need to efficiently assign credit to synapses in all areas. In deep learning, a known solution is error backpropagation, which however requires biologically implausible weight transport from feed-forward to feedback paths. We introduce Phaseless Alignment Learning (PAL), a bio-plausible method to learn efficient feedback weights in layered…

    Submitted 2 February, 2024; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: Updated with streamlined main part, CIFAR-10 simulations, including DFA and minor fixes

  21. arXiv:2212.06039  [pdf]

    cs.CL cs.AI

    Technological taxonomies for hypernym and hyponym retrieval in patent texts

    Authors: You Zuo, Yixuan Li, Alma Parias García, Kim Gerdes

    Abstract: This paper presents an automatic approach to creating taxonomies of technical terms based on the Cooperative Patent Classification (CPC). The resulting taxonomy contains about 170k nodes in 9 separate technological branches and is freely available. We also show that a Text-to-Text Transfer Transformer (T5) model can be fine-tuned to generate hypernyms and hyponyms with relatively high precision, c…

    Submitted 13 December, 2022; v1 submitted 14 November, 2022; originally announced December 2022.

    Comments: ToTh 2022 - Terminology & Ontology: Theories and applications, Jun 2022, Chambéry, France

  22. arXiv:2211.17196  [pdf, other]

    cs.CL cs.SD eess.AS

    EURO: ESPnet Unsupervised ASR Open-source Toolkit

    Authors: Dongji Gao, Jiatong Shi, Shun-Po Chuang, Leibny Paola Garcia, Hung-yi Lee, Shinji Watanabe, Sanjeev Khudanpur

    Abstract: This paper describes the ESPnet Unsupervised ASR Open-source Toolkit (EURO), an end-to-end open-source toolkit for unsupervised automatic speech recognition (UASR). EURO adopts the state-of-the-art UASR learning method introduced by the Wav2vec-U, originally implemented at FAIRSEQ, which leverages self-supervised speech representations and adversarial training. In addition to wav2vec2, EURO extend…

    Submitted 20 May, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

  23. arXiv:2211.13101  [pdf, other]

    cs.DC cs.NI

    High-Quality Fault Resiliency in Fat Trees

    Authors: John Gliksberg, Antoine Capra, Alexandre Louvet, Pedro Javier Garcia, Devan Sohier

    Abstract: Coupling regular topologies with optimised routing algorithms is key in pushing the performance of interconnection networks of supercomputers. In this paper we present Dmodc, a fast deterministic routing algorithm for Parallel Generalised Fat-Trees (PGFTs) which minimises congestion risk even under massive network degradation caused by equipment failure. Dmodc computes forwarding tables with a close…

    Submitted 23 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: text overlap with arXiv:2211.11817

    Journal ref: IEEE Micro, 2020, 40 (1), pp.44-49. ⟨10.1109/MM.2019.2949978⟩

  24. Node-Type-Based Load-Balancing Routing for Parallel Generalized Fat-Trees

    Authors: John Gliksberg, Jean-Noel Quintin, Pedro Javier Garcia

    Abstract: High-Performance Computing (HPC) clusters are made up of a variety of node types (usually compute, I/O, service, and GPGPU nodes) and applications don't use nodes of a different type the same way. Resulting communication patterns reflect organization of groups of nodes, and current optimal routing algorithms for all-to-all patterns will not always maximize performance for group-specific communicat…

    Submitted 21 November, 2022; originally announced November 2022.

    Journal ref: 2018 IEEE 4th International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era (HiPINEB), Feb 2018, Vienna, Austria. pp.9-15

  25. High-Quality Fault-Resiliency in Fat-Tree Networks (Extended Abstract)

    Authors: John Gliksberg, Antoine Capra, Alexandre Louvet, Pedro Javier Garcia, Devan Sohier

    Abstract: Coupling regular topologies with optimized routing algorithms is key in pushing the performance of interconnection networks of HPC systems. In this paper we present Dmodc, a fast deterministic routing algorithm for Parallel Generalized Fat-Trees (PGFTs) which minimizes congestion risk even under massive topology degradation caused by equipment failure. It applies a modulo-based computation of forw…

    Submitted 21 November, 2022; originally announced November 2022.

    Journal ref: 2019 IEEE Symposium on High-Performance Interconnects (HOTI), Aug 2019, Santa Clara, United States. pp.9-12

  26. arXiv:2211.03025  [pdf, other]

    cs.CL cs.SD eess.AS

    Bridging Speech and Textual Pre-trained Models with Unsupervised ASR

    Authors: Jiatong Shi, Chan-Jan Hsu, Holam Chung, Dongji Gao, Paola Garcia, Shinji Watanabe, Ann Lee, Hung-yi Lee

    Abstract: Spoken language understanding (SLU) is a task aiming to extract high-level semantics from spoken utterances. Previous works have investigated the use of speech self-supervised models and textual pre-trained models, which have shown reasonable improvements to various SLU tasks. However, because of the mismatched modalities between speech signals and text tokens, previous methods usually need comple…

    Submitted 6 November, 2022; originally announced November 2022.

    Comments: ICASSP2023 submission

  27. arXiv:2211.00482  [pdf, other]

    eess.AS cs.SD

    Adapting self-supervised models to multi-talker speech recognition using speaker embeddings

    Authors: Zili Huang, Desh Raj, Paola García, Sanjeev Khudanpur

    Abstract: Self-supervised learning (SSL) methods which learn representations of data without explicit supervision have gained popularity in speech-processing tasks, particularly for single-talker applications. However, these models often have degraded performance for multi-talker scenarios -- possibly due to the domain mismatch -- which severely limits their use for such applications. In this paper, we inve…

    Submitted 1 November, 2022; originally announced November 2022.

    Comments: submitted to ICASSP 2023

  28. arXiv:2210.14567  [pdf, other]

    eess.AS cs.SD

    Reducing Language confusion for Code-switching Speech Recognition with Token-level Language Diarization

    Authors: Hexin Liu, Haihua Xu, Leibny Paola Garcia, Andy W. H. Khong, Yi He, Sanjeev Khudanpur

    Abstract: Code-switching (CS) refers to the phenomenon that languages switch within a speech signal and leads to language confusion for automatic speech recognition (ASR). This paper aims to address language confusion for improving CS-ASR from two perspectives: incorporating and disentangling language information. We incorporate language information in the CS-ASR model by dynamically biasing the model with…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: Submitted to ICASSP 2023

  29. arXiv:2210.07189  [pdf, other]

    cs.CL cs.SD eess.AS

    On Compressing Sequences for Self-Supervised Speech Models

    Authors: Yen Meng, Hsuan-Jui Chen, Jiatong Shi, Shinji Watanabe, Paola Garcia, Hung-yi Lee, Hao Tang

    Abstract: Compressing self-supervised models has become increasingly necessary, as self-supervised models become larger. While previous approaches have primarily focused on compressing the model size, shortening sequences is also effective in reducing the computational cost. In this work, we study fixed-length and variable-length subsampling along the time axis in self-supervised learning. We explore how in…

    Submitted 25 October, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: Accepted to IEEE SLT 2022

  30. arXiv:2210.05581  [pdf, other]

    cs.CL

    Aggregating Crowdsourced and Automatic Judgments to Scale Up a Corpus of Anaphoric Reference for Fiction and Wikipedia Texts

    Authors: Juntao Yu, Silviu Paun, Maris Camilleri, Paloma Carretero Garcia, Jon Chamberlain, Udo Kruschwitz, Massimo Poesio

    Abstract: Although several datasets annotated for anaphoric reference/coreference exist, even the largest such datasets have limitations in terms of size, range of domains, coverage of anaphoric phenomena, and size of documents included. Yet, the approaches proposed to scale up anaphoric annotation haven't so far resulted in datasets overcoming these limitations. In this paper, we introduce a new release of…

    Submitted 11 October, 2022; originally announced October 2022.

  31. arXiv:2210.03459  [pdf, other]

    eess.AS cs.CL cs.SD

    Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization

    Authors: Shota Horiguchi, Yuki Takashima, Shinji Watanabe, Paola Garcia

    Abstract: Due to the high performance of multi-channel speech processing, we can use the outputs from a multi-channel model as teacher labels when training a single-channel model with knowledge distillation. To the contrary, it is also known that single-channel speech data can benefit multi-channel models by mixing it with multi-channel speech data during training or by using it for model pretraining. This…

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: Accepted to IEEE SLT 2022

  32. arXiv:2210.03221  [pdf, other]

    cs.LG cs.CL quant-ph

    PQLM -- Multilingual Decentralized Portable Quantum Language Model for Privacy Protection

    Authors: Shuyue Stella Li, Xiangyu Zhang, Shu Zhou, Hongchao Shu, Ruixing Liang, Hexin Liu, Leibny Paola Garcia

    Abstract: With careful manipulation, malicious agents can reverse engineer private information encoded in pre-trained language models. Security concerns motivate the development of quantum pre-training. In this work, we propose a highly Portable Quantum Language Model (PQLM) that can easily transmit information to downstream tasks on classical machines. The framework consists of a cloud PQLM built with rand…

    Submitted 26 February, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: 5 pages, 3 figures, 3 tables

  33. arXiv:2209.12702  [pdf, other]

    eess.AS cs.SD

    End-to-End Lyrics Recognition with Self-supervised Learning

    Authors: Xiangyu Zhang, Shuyue Stella Li, Zhanhong He, Roberto Togneri, Leibny Paola Garcia

    Abstract: Lyrics recognition is an important task in music processing. Despite traditional algorithms such as the hybrid HMM-TDNN model achieving good performance, studies on applying end-to-end models and self-supervised learning (SSL) are limited. In this paper, we first establish an end-to-end baseline for lyrics recognition and then explore the performance of SSL models on lyrics recognition task. We e…

    Submitted 26 October, 2022; v1 submitted 26 September, 2022; originally announced September 2022.

    Comments: 4 pages, 2 figures, 3 tables

  34. arXiv:2208.02693  [pdf, other]

    cs.CV eess.IV physics.data-an

    Relict landslide detection using Deep-Learning architectures for image segmentation in rainforest areas: A new framework

    Authors: Guilherme P. B. Garcia, Carlos H. Grohmann, Lucas P. Soares, Mateus Espadoto

    Abstract: Landslides are destructive and recurrent natural disasters on steep slopes and represent a risk to lives and properties. Knowledge of relict landslides location is vital to understand their mechanisms, update inventory maps and improve risk assessment. However, relict landslide mapping is complex in tropical regions covered with rainforest vegetation. A new CNN framework is proposed for semi-autom…

    Submitted 29 May, 2023; v1 submitted 4 August, 2022; originally announced August 2022.

  35. arXiv:2206.02432  [pdf, other]

    eess.AS cs.CL cs.SD

    Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors

    Authors: Shota Horiguchi, Shinji Watanabe, Paola Garcia, Yuki Takashima, Yohei Kawaguchi

    Abstract: A method to perform offline and online speaker diarization for an unlimited number of speakers is described in this paper. End-to-end neural diarization (EEND) has achieved overlap-aware speaker diarization by formulating it as a multi-label classification problem. It has also been extended for a flexible number of speakers by introducing speaker-wise attractors. However, the output number of spea…

    Submitted 22 December, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: Accepted to IEEE/ACM TASLP

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 706-720, 2023

  36. arXiv:2201.04093  [pdf, other]

    quant-ph cs.LG

    Systematic Literature Review: Quantum Machine Learning and its applications

    Authors: David Peral García, Juan Cruz-Benito, Francisco José García-Peñalvo

    Abstract: Quantum computing is the process of performing calculations using quantum mechanics. This field studies the quantum behavior of certain subatomic particles for subsequent use in performing calculations, as well as for large-scale information processing. These capabilities can give quantum computers an advantage in terms of computational time and cost over classical computers. Nowadays, there are s…

    Submitted 6 December, 2023; v1 submitted 11 January, 2022; originally announced January 2022.

    Comments: 28 pages, 25 figures

  37. arXiv:2110.04694  [pdf, other]

    eess.AS cs.CL cs.SD

    Multi-Channel End-to-End Neural Diarization with Distributed Microphones

    Authors: Shota Horiguchi, Yuki Takashima, Paola Garcia, Shinji Watanabe, Yohei Kawaguchi

    Abstract: Recent progress on end-to-end neural diarization (EEND) has enabled overlap-aware speaker diarization with a single neural network. This paper proposes to enhance EEND by using multi-channel signals from distributed microphones. We replace Transformer encoders in EEND with two types of encoders that process a multi-channel input: spatio-temporal and co-attention encoders. Both are independent of t…

    Submitted 28 March, 2022; v1 submitted 9 October, 2021; originally announced October 2021.

    Comments: Accepted to ICASSP 2022

  38. arXiv:2109.11224  [pdf, other]

    cs.CR cs.NI

    A Novel Open Set Energy-based Flow Classifier for Network Intrusion Detection

    Authors: Manuela M. C. Souza, Camila Pontes, Joao Gondim, Luis P. F. Garcia, Luiz DaSilva, Marcelo A. Marotta

    Abstract: Network intrusion detection systems (NIDS) are one of many solutions that make up a computer security system. Several machine learning-based NIDS have been proposed in recent years, but most of them were developed and evaluated under the assumption that the training context is similar to the test context. In real networks, this assumption is false, given the emergence of new attacks and variants o…

    Submitted 26 April, 2022; v1 submitted 23 September, 2021; originally announced September 2021.

  39. Modeling Systems with Machine Learning based Differential Equations

    Authors: Pedro Garcia

    Abstract: The prediction of behavior in dynamical systems is frequently subject to the design of models. When a time series obtained from observing the system is available, the task can be performed by designing the model from these observations without additional assumptions or by assuming a preconceived structure in the model, with the help of additional information about the system. In the second case,…

    Submitted 9 September, 2021; originally announced September 2021.

  40. arXiv:2107.01545  [pdf, other]

    eess.AS cs.CL cs.SD

    Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors

    Authors: Shota Horiguchi, Shinji Watanabe, Paola Garcia, Yawen Xue, Yuki Takashima, Yohei Kawaguchi

    Abstract: Attractor-based end-to-end diarization is achieving comparable accuracy to the carefully tuned conventional clustering-based methods on challenging datasets. However, the main drawback is that it cannot deal with the case where the number of speakers is larger than the one observed during training. This is because its speaker counting relies on supervised learning. In this work, we introduce an un…

    Submitted 23 September, 2021; v1 submitted 4 July, 2021; originally announced July 2021.

    Comments: Accepted to ASRU 2021

  41. Encoder-Decoder Based Attractors for End-to-End Neural Diarization

    Authors: Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Paola Garcia

    Abstract: This paper investigates an end-to-end neural diarization (EEND) method for an unknown number of speakers. In contrast to the conventional cascaded approach to speaker diarization, EEND methods are better in terms of speaker overlap handling. However, EEND still has a disadvantage in that it cannot deal with a flexible number of speakers. To remedy this problem, we introduce encoder-decoder-based a…

    Submitted 28 March, 2022; v1 submitted 20 June, 2021; originally announced June 2021.

    Comments: Accepted to IEEE/ACM TASLP. This article is based on our previous conference paper arXiv:2005.09921

  42. arXiv:2106.04764  [pdf, other]

    eess.AS cs.SD

    Semi-Supervised Training with Pseudo-Labeling for End-to-End Neural Diarization

    Authors: Yuki Takashima, Yusuke Fujita, Shota Horiguchi, Shinji Watanabe, Paola García, Kenji Nagamatsu

    Abstract: In this paper, we present a semi-supervised training technique using pseudo-labeling for end-to-end neural diarization (EEND). The EEND system has shown promising performance compared with traditional clustering-based methods, especially in the case of overlapping speech. However, to get a well-tuned model, EEND requires labeled data for all the joint speech activities of every speaker at each tim…

    Submitted 8 June, 2021; originally announced June 2021.

    Comments: Accepted for Interspeech 2021

  43. arXiv:2106.04078  [pdf, other]

    eess.AS cs.SD

    End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection

    Authors: Yuki Takashima, Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Paola García, Kenji Nagamatsu

    Abstract: In this paper, we present a conditional multitask learning method for end-to-end neural speaker diarization (EEND). The EEND system has shown promising performance compared with traditional clustering-based methods, especially in the case of overlapping speech. In this paper, to further improve the performance of the EEND system, we propose a novel multitask learning framework that solves speaker…

    Submitted 7 June, 2021; originally announced June 2021.

    Comments: Accepted for SLT 2021

    Journal ref: IEEE Spoken Language Technology Workshop (SLT), 2021, pp. 849-856

  44. Swarm Robots in Agriculture

    Authors: Daniel Albiero, Angel Pontin Garcia, Claudio Kiyoshi Umezu, Rodrigo Leme de Paulo

    Abstract: Agricultural mechanization is an area of knowledge that has evolved a lot over the past century, its main actors being agricultural tractors that, in 100 years, have increased their powers by 3,300%. This evolution has resulted in an exponential increase in the field capacity of such machines. However, it has also generated negative results such as excessive consumption of fossil fuel, excessive w…

    Submitted 11 March, 2021; originally announced March 2021.

    Comments: Paper published in Brazilian Congress of Automatic. Porto Alegre, 2020

  45. arXiv:2102.01363  [pdf, other]

    eess.AS cs.CL cs.SD

    The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap

    Authors: Shota Horiguchi, Nelson Yalta, Paola Garcia, Yuki Takashima, Yawen Xue, Desh Raj, Zili Huang, Yusuke Fujita, Shinji Watanabe, Sanjeev Khudanpur

    Abstract: This paper provides a detailed description of the Hitachi-JHU system that was submitted to the Third DIHARD Speech Diarization Challenge. The system outputs the ensemble results of the five subsystems: two x-vector-based subsystems, two end-to-end neural diarization-based subsystems, and one hybrid subsystem. We refine each system and all five subsystems become competitive and complementary. After…

    Submitted 2 February, 2021; originally announced February 2021.

  46. arXiv:2101.08473  [pdf, other]

    cs.SD eess.AS

    Online Streaming End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers

    Authors: Yawen Xue, Shota Horiguchi, Yusuke Fujita, Yuki Takashima, Shinji Watanabe, Paola Garcia, Kenji Nagamatsu

    Abstract: We propose a streaming diarization method based on an end-to-end neural diarization (EEND) model, which handles flexible numbers of speakers and overlapping speech. In our previous study, the speaker-tracing buffer (STB) mechanism was proposed to achieve a chunk-wise streaming diarization using a pre-trained EEND model. STB traces the speaker information in previous chunks to map the speakers in a…

    Submitted 6 April, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

  47. arXiv:2012.10055  [pdf, other]

    eess.AS cs.CL cs.SD

    End-to-End Speaker Diarization as Post-Processing

    Authors: Shota Horiguchi, Paola Garcia, Yusuke Fujita, Shinji Watanabe, Kenji Nagamatsu

    Abstract: This paper investigates the utilization of an end-to-end diarization model as post-processing of conventional clustering-based diarization. Clustering-based diarization methods partition frames into clusters of the number of speakers; thus, they typically cannot handle overlapping speech because each frame is assigned to one speaker. On the other hand, some end-to-end diarization methods can handl…

    Submitted 23 December, 2020; v1 submitted 18 December, 2020; originally announced December 2020.

  48. Déjà Vu: Side-Channel Analysis of Mozilla's NSS

    Authors: Sohaib ul Hassan, Iaroslav Gridin, Ignacio M. Delgado-Lozano, Cesar Pereida García, Jesús-Javier Chi-Domínguez, Alejandro Cabrera Aldaya, Billy Bob Brumley

    Abstract: Recent work on Side Channel Analysis (SCA) targets old, well-known vulnerabilities, even previously exploited, reported, and patched in high-profile cryptography libraries. Nevertheless, researchers continue to find and exploit the same vulnerabilities in old and new products, highlighting a big issue among vendors: effectively tracking and fixing security vulnerabilities when disclosure is not do…

    Submitted 13 August, 2020; originally announced August 2020.

    Comments: To appear at ACM CCS 2020

  49. arXiv:2008.02931  [pdf, other]

    cs.PL cs.LO cs.SC

    From Big-Step to Small-Step Semantics and Back with Interpreter Specialisation

    Authors: John P. Gallagher, Manuel Hermenegildo, Bishoksan Kafle, Maximiliano Klemen, Pedro López García, José Morales

    Abstract: We investigate representations of imperative programs as constrained Horn clauses. Starting from operational semantics transition rules, we proceed by writing interpreters as constrained Horn clause programs directly encoding the rules. We then specialise an interpreter with respect to a given source program to achieve a compilation of the source language to Horn clauses (an instance of the first…

    Submitted 6 August, 2020; originally announced August 2020.

    Comments: In Proceedings VPT/HCVS 2020, arXiv:2008.02483

    Journal ref: EPTCS 320, 2020, pp. 50-64

  50. CellEVAC: An adaptive guidance system for crowd evacuation through behavioral optimization

    Authors: Miguel A. Lopez-Carmona, Alvaro Paricio Garcia

    Abstract: A critical aspect of crowds' evacuation processes is the dynamism of individual decision making. Here, we investigate how to favor a coordinated group dynamic through optimal exit-choice instructions using behavioral strategy optimization. We propose and evaluate an adaptive guidance system (Cell-based Crowd Evacuation, CellEVAC) that dynamically allocates colors to cells in a cell-based pedestria…

    Submitted 18 May, 2021; v1 submitted 12 July, 2020; originally announced July 2020.

    Comments: 47 pages, 26 figures

    ACM Class: I.6.4