-
A Large-Scale Evaluation of Speech Foundation Models
Authors:
Shu-wen Yang,
Heng-Jui Chang,
Zili Huang,
Andy T. Liu,
Cheng-I Lai,
Haibin Wu,
Jiatong Shi,
Xuankai Chang,
Hsiang-Sheng Tsai,
Wen-Chin Huang,
Tzu-hsun Feng,
Po-Han Chi,
Yist Y. Lin,
Yung-Sung Chuang,
Tzu-Hsien Huang,
Wei-Cheng Tseng,
Kushal Lakhotia,
Shang-Wen Li,
Abdelrahman Mohamed,
Shinji Watanabe,
Hung-yi Lee
Abstract:
The foundation model paradigm leverages a shared foundation model to achieve state-of-the-art (SOTA) performance for various tasks, requiring minimal downstream-specific modeling and data annotation. This approach has proven crucial in the field of Natural Language Processing (NLP). However, the speech processing community lacks a similar setup to explore the paradigm systematically. In this work, we establish the Speech processing Universal PERformance Benchmark (SUPERB) to study the effectiveness of the paradigm for speech. We propose a unified multi-tasking framework to address speech processing tasks in SUPERB using a frozen foundation model followed by task-specialized, lightweight prediction heads. Combining our results with community submissions, we verify that the foundation model paradigm is promising for speech, and our multi-tasking framework is simple yet effective, as the best-performing foundation model shows competitive generalizability across most SUPERB tasks. For reproducibility and extensibility, we have developed a long-term maintained platform that enables deterministic benchmarking, allows for result sharing via an online leaderboard, and promotes collaboration through a community-driven benchmark database to support new development cycles. Finally, we conduct a series of analyses to offer an in-depth understanding of SUPERB and speech foundation models, including information flows across tasks inside the models, the correctness of the weighted-sum benchmarking protocol and the statistical significance and robustness of the benchmark.
Submitted 29 May, 2024; v1 submitted 14 April, 2024;
originally announced April 2024.
-
ALMA Spectroscopy of Europa: A Search for Active Plumes
Authors:
M. A. Cordiner,
A. E. Thelen,
I. -L. Lai,
W. -L. Tseng,
C. A. Nixon,
Y. -J. Kuan,
G. L. Villanueva,
L. Paganini,
S. B. Charnley,
K. D. Retherford
Abstract:
The subsurface ocean of Europa is a high priority target in the search for extraterrestrial life, but direct investigations are hindered by the presence of a thick, exterior ice shell. Here we present spectral line and continuum maps of Europa obtained over four epochs in May-June 2021 using the Atacama Large Millimeter/submillimeter Array (ALMA), to search for molecular emission from atmospheric plumes, with the aim of investigating subsurface processes. Using a 3D physical model, we obtained upper limits for the plume abundances of HCN, H$_2$CO, SO$_2$ and CH$_3$OH. If active plume(s) were present, they contained very low abundances of these molecules. Assuming a total gas production rate of $10^{29}$ s$^{-1}$, our H$_2$CO abundance upper limit of $<0.016$\% is more than an order of magnitude less than measured in the Enceladus plume by the Cassini spacecraft, implying a possible chemical difference between the plume source materials for these two icy moons.
Submitted 8 April, 2024;
originally announced April 2024.
-
Understanding Physical Breakdowns in Virtual Reality
Authors:
Wen-Jie Tseng
Abstract:
Virtual Reality (VR) moves away from well-controlled laboratory environments into public and personal spaces. As users are visually disconnected from the physical environment, interacting in an uncontrolled space frequently leads to collisions and raises safety concerns. In my thesis, I investigate this phenomenon, which I define as the physical breakdown in VR. The goal is to understand the reasons for physical breakdowns, provide solutions, and explore future mechanisms that could perpetuate safety risks. First, I explored the reasons for physical breakdowns by investigating how people interact with the current VR safety mechanism (e.g., Oculus Guardian). Results show that one reason for breaking out of the safety boundary is that, when interacting with large motions (e.g., swinging arms), users do not have enough time to react even though they see the safety boundary. I proposed a solution, FingerMapper, that maps small-scale finger motions onto virtual arms and hands to enable whole-body virtual arm motions in VR and avoid physical breakdowns. To demonstrate future safety risks, I explored the malicious use of perceptual manipulations (e.g., redirection techniques) in VR, which could deliberately create physical breakdowns without users noticing. Results indicate further open challenges about the cognitive process of how users comprehend their physical environment when they are blindfolded in VR.
Submitted 20 March, 2024;
originally announced April 2024.
-
Climate Downscaling: A Deep-Learning Based Super-resolution Model of Precipitation Data with Attention Block and Skip Connections
Authors:
Chia-Hao Chiang,
Zheng-Han Huang,
Liwen Liu,
Hsin-Chien Liang,
Yi-Chi Wang,
Wan-Ling Tseng,
Chao Wang,
Che-Ta Chen,
Ko-Chih Wang
Abstract:
Human activities accelerate the consumption of fossil fuels and produce greenhouse gases, resulting in urgent issues today: global warming and climate change. These indirectly cause severe natural disasters, widespread human suffering, and huge agricultural losses. To mitigate the impacts on our lands, scientists are developing renewable, reusable, and clean energy sources, and climatologists are trying to predict extreme events. Meanwhile, governments are publicizing resource-saving policies for a more eco-friendly society and raising environmental awareness. One of the most influential factors is precipitation, which brings condensed water vapor onto land. Water resources are among the most significant and basic needs in society, supporting not only our livelihoods but also the economy. In Taiwan, although the average annual precipitation is up to 2,500 millimeters (mm), the water allocation for each person is lower than the global average due to drastic geographical elevation changes and uneven distribution through the year. Thus, it is crucial to track and predict rainfall to make the most of it and to prevent floods. However, climate models have limited resolution and require intensive computational power for local-scale use. Therefore, we propose a deep convolutional neural network with skip connections, attention blocks, and auxiliary data concatenation to downscale low-resolution precipitation data into high-resolution data. Finally, we compare our model with other climate downscaling methods and show better performance in terms of Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Pearson Correlation, structural similarity index (SSIM), and forecast indicators.
Submitted 26 March, 2024;
originally announced March 2024.
-
Mass supply from Io to Jupiter's magnetosphere
Authors:
L. Roth,
A. Blöcker,
K. de Kleer,
D. Goldstein,
E. Lellouch,
J. Saur,
C. Schmidt,
D. F. Strobel,
C. Tao,
F. Tsuchiya,
V. Dols,
H. Huybrighs,
A. Mura,
J. R. Szalay,
S. V. Badman,
I. de Pater,
A. -C. Dott,
M. Kagitani,
L. Klaiber,
R. Koga,
A. McEwen,
Z. Milby,
K. D. Retherford,
S. Schlegel,
N. Thomas,
et al. (2 additional authors not shown)
Abstract:
Since the Voyager mission flybys in 1979, we have known the moon Io to be extremely volcanically active as well as to be the main source of plasma in the vast magnetosphere of Jupiter. Material lost from Io forms neutral clouds, the Io plasma torus and ultimately the extended plasma sheet. This material is supplied from the upper atmosphere and atmospheric loss is likely driven by plasma-interaction effects with possible contributions from thermal escape and photochemistry-driven escape. Direct volcanic escape is negligible. The supply of material to maintain the plasma torus was estimated from various methods at roughly one ton per second. Most of the time the magnetospheric plasma environment of Io is stable on timescales from days to months. Similarly, Io's atmosphere was found to have a stable average density on the dayside, although it exhibits lateral, diurnal and seasonal variations. There is a potential positive feedback in the Io torus supply: collisions of torus plasma with atmospheric neutrals likely are a significant loss process, which increases with torus density. The stability of the torus environment might be maintained by limiting mechanisms of either torus supply from Io or the loss from the torus by centrifugal interchange in the middle magnetosphere. Various observations suggest that occasionally the plasma torus undergoes major transient changes over a period of several weeks, apparently overcoming possible stabilizing mechanisms. Such events (and more frequent minor changes) are commonly explained by some kind of change in volcanic activity that triggers a chain of reactions which modify the plasma torus state via a net increase in supply of new mass. However, it remains unknown what kind of volcanic event can trigger torus events, whether Io's atmosphere undergoes a change before or during such magnetospheric events, and what processes could enable such a change.
Submitted 20 March, 2024;
originally announced March 2024.
-
Electrically tunable flat bands with layer-resolved charge distribution in twisted monolayer-bilayer graphene
Authors:
Wei-En Tseng,
Mei-Yin Chou
Abstract:
At a small twist angle, exotic electronic properties emerge in twisted monolayer-bilayer graphene (aAB), including electrically switchable magnetic order and correlated insulating states. These fascinating many-body phenomena manifest when the low-energy bands feature a narrow band width. In this study, we examine the electronic structure of aAB using first-principles calculations combined with an accurate tight-binding model. We find that the presence of an intrinsic polarization greatly modifies the low-energy bands of aAB. Furthermore, the low-energy bands reach a minimum width at a quasi-magic angle and feature a layer-dependent charge localization and delocalization pattern. In the presence of an electric field, an energy gap opens only if lattice relaxation is taken into account. The particle-hole asymmetry in aAB further leads to flatter conduction bands compared with the valence bands, with an electrically tunable band width and band gap, and a switchable sublattice-dependent charge localization and delocalization pattern.
Submitted 4 December, 2023;
originally announced December 2023.
-
BN-embedded monolayer graphene with tunable electronic and topological properties
Authors:
Chih-Piao Chuu,
Wei-En Tseng,
Kuan-Hung Liu,
Ching-Ming Wei,
Mei-Yin Chou
Abstract:
Finding an effective and controllable way to create a sizable energy gap in graphene-based systems has been a challenging topic of intensive research. We propose that the hybrid of boron nitride and graphene (h-BNC) at low BN doping serves as an ideal platform for band-gap engineering and valleytronic applications. We report a systematic first-principles study of the atomic configurations and band gap opening for energetically favorable BN patches embedded in graphene. Based on first-principles calculations, we construct a tight-binding model to simulate general doping configurations in large supercells. Unexpectedly, the calculations find a linear dependence of the band gap on the effective BN concentration at low doping, arising from an induced effective on-site energy difference at the two C sublattices as they are substituted by B and N dopants alternately. The significant and tunable band gap of a few hundred meVs, with preserved topological properties of graphene and feasible sample preparation in the laboratory, presents great opportunities to realize valley physics applications in graphene systems at room temperature.
Submitted 30 November, 2023;
originally announced November 2023.
-
The composition of Saturn's rings
Authors:
Kelly E. Miller,
Gianrico Filacchione,
Jeffrey Cuzzi,
Philip D. Nicholson,
Matthew M. Hedman,
Kevin Baillie,
Robert E. Johnson,
Wei-Ling Tseng,
Paul R. Estrada,
J. Hunter Waite,
Mauro Ciarniello,
Cécile Ferrari,
Zhimeng Zhang,
Amanda Hendrix,
Julianne I. Moses
Abstract:
The origin and evolution of Saturn's rings is critical to understanding the Saturnian system as a whole. Here, we discuss the physical and chemical composition of the rings, as a foundation for evolutionary models described in subsequent chapters. We review the physical characteristics of the main rings, and summarize current constraints on their chemical composition. Radial trends are observed in temperature and to a limited extent in particle size distribution, with the C ring exhibiting higher temperatures and a larger population of small particles. The C ring also shows evidence for the greatest abundance of silicate material, perhaps indicative of formation from a rocky body. The C ring and Cassini Division have lower optical depths than the A and B rings, which contributes to the higher abundance of the exogenous neutral absorber in these regions. Overall, the main ring composition is strongly dominated by water ice, with minor silicate, UV absorber, and neutral absorber components. Sampling of the innermost D ring during Cassini's Grand Finale provides a new set of in situ constraints on the ring composition, and we explore ongoing work to understand the linkages between the main rings and the D ring. The D ring material is organic- and silicate-rich and water-poor relative to the main rings, with a large population of small grains. This composition may be explained in part by volatile losses in the D ring, and current constraints suggest some degree of fractionation rather than sampling of the bulk D ring material.
Submitted 28 November, 2023;
originally announced November 2023.
-
Lightly Weighted Automatic Audio Parameter Extraction for the Quality Assessment of Consensus Auditory-Perceptual Evaluation of Voice
Authors:
Yi-Heng Lin,
Wen-Hsuan Tseng,
Li-Chin Chen,
Ching-Ting Tan,
Yu Tsao
Abstract:
The Consensus Auditory-Perceptual Evaluation of Voice is a widely employed tool in clinical voice quality assessment that is significant for streaming communication among clinical professionals and benchmarking for the determination of further treatment. Currently, because the assessment relies on experienced clinicians, it tends to be inconsistent and thus difficult to standardize. To address this problem, we propose to leverage lightly weighted automatic audio parameter extraction to increase the clinical relevance, reduce the complexity, and enhance the interpretability of voice quality assessment. The proposed method utilizes age, sex, and five audio parameters: jitter, absolute jitter, shimmer, harmonic-to-noise ratio (HNR), and zero crossing. A classical machine learning approach is employed. The results reveal that our approach performs similarly to state-of-the-art (SOTA) methods and outperforms the latent representation obtained by using popular audio pre-trained models. This approach provides insights into the feasibility of different feature extraction approaches for voice evaluation. Audio parameters such as jitter and the HNR are proven to be suitable for characterizing voice quality attributes, such as roughness and strain. Conversely, pre-trained models exhibit limitations in effectively addressing noise-related scorings. This study contributes toward more comprehensive and precise voice quality evaluations, achieved by comprehensively exploring diverse assessment methodologies.
Submitted 27 November, 2023;
originally announced November 2023.
-
Online Learning Quantum States with the Logarithmic Loss via VB-FTRL
Authors:
Wei-Fu Tseng,
Kai-Chun Chen,
Zi-Hong Xiao,
Yen-Huan Li
Abstract:
Online learning quantum states with the logarithmic loss (LL-OLQS) is a quantum generalization of online portfolio selection (OPS), a classic open problem in the field of online learning for over three decades. The problem also emerges in designing randomized optimization algorithms for maximum-likelihood quantum state tomography. Recently, Jezequel et al. (arXiv:2209.13932) proposed the VB-FTRL algorithm, the first nearly regret-optimal algorithm for OPS with moderate computational complexity. In this note, we generalize VB-FTRL for LL-OLQS. Let $d$ denote the dimension and $T$ the number of rounds. The generalized algorithm achieves a regret rate of $O ( d^2 \log ( d + T ) )$ for LL-OLQS. Each iteration of the algorithm consists of solving a semidefinite program that can be implemented in polynomial time by, e.g., cutting-plane methods. For comparison, the best-known regret rate for LL-OLQS is currently $O ( d^2 \log T )$, achieved by the exponential weight method. However, there is no explicit implementation available for the exponential weight method for LL-OLQS. To facilitate the generalization, we introduce the notion of VB-convexity. VB-convexity is a sufficient condition for the logarithmic barrier associated with any function to be convex and is of independent interest.
Submitted 6 November, 2023;
originally announced November 2023.
-
Monitoring H$α$ Emission from the Wide-orbit Brown-dwarf Companion FU Tau B
Authors:
Ya-Lin Wu,
Yu-Chi Cheng,
Li-Ching Huang,
Brendan Bowler,
Laird Close,
Wei-Ling Tseng,
Ning Chen,
Da-Wei Chen
Abstract:
Monitoring mass accretion onto substellar objects provides insights into the geometry of the accretion flows. We use the Lulin One-meter Telescope to monitor H$α$ emission from FU Tau B, a $\sim$19 $M_{\rm Jup}$ brown-dwarf companion at 5.7" (719 au) from the host star, for six consecutive nights. This is the longest continuous H$α$ monitoring for a substellar companion near the deuterium-burning limit. We aim to investigate if accretion near the planetary regime could be rotationally modulated as suggested by magnetospheric accretion models. We find tentative evidence that H$α$ mildly varies on hourly and daily timescales, though our sensitivity is not sufficient to definitively establish any rotational modulation. No burst-like events are detected, implying that accretion onto FU Tau B is overall stable during the time baseline and sampling windows over which it was observed. The primary star FU Tau A also exhibits H$α$ variations over timescales from minutes to days. This program highlights the potential of monitoring accretion onto substellar objects with small telescopes.
Submitted 13 September, 2023;
originally announced September 2023.
-
Memory Manipulations in Extended Reality
Authors:
Elise Bonnail,
Eric Lecolinet,
Wen-Jie Tseng,
Samuel Huron,
Mark Mcgill,
Jan Gugenheimer
Abstract:
Human memory has notable limitations (e.g., forgetting) which have necessitated a variety of memory aids (e.g., calendars). As we grow closer to mass adoption of everyday Extended Reality (XR), which is frequently leveraging perceptual limitations (e.g., redirected walking), it becomes pertinent to consider how XR could leverage memory limitations (forgetting, distorting, persistence) to induce memory manipulations. As memories highly impact our self-perception, social interactions, and behaviors, there is a pressing need to understand XR Memory Manipulations (XRMMs). We ran three speculative design workshops (n=12), with XR and memory researchers creating 48 XRMM scenarios. Through thematic analysis, we define XRMMs, present a framework of their core components and reveal three classes (at encoding, pre-retrieval, at retrieval). Each class differs in terms of technology (AR, VR) and impact on memory (influencing quality of memories, inducing forgetting, distorting memories). We raise ethical concerns and discuss opportunities of perceptual and memory manipulations in XR.
Submitted 5 April, 2023;
originally announced April 2023.
-
Parallel Diffusion Model-based Sparse-view Cone-beam Breast CT
Authors:
Wenjun Xia,
Hsin Wu Tseng,
Chuang Niu,
Wenxiang Cong,
Xiaohua Zhang,
Shaohua Liu,
Ruola Ning,
Srinivasan Vedantham,
Ge Wang
Abstract:
Breast cancer is the most prevalent cancer among women worldwide, and early detection is crucial for reducing its mortality rate and improving quality of life. Dedicated breast computed tomography (CT) scanners generally offer better image quality than mammography and tomosynthesis, but at a higher radiation dose. To enable breast CT for cancer screening, the challenge is to minimize the radiation dose without compromising image quality, according to the ALARA principle (as low as reasonably achievable). Over the past years, deep learning has shown remarkable successes in various tasks, including low-dose CT, especially few-view CT. Currently, the diffusion model represents the state of the art for CT reconstruction. To develop the first diffusion model-based breast CT reconstruction method, here we report innovations to address the large memory requirement for breast cone-beam CT reconstruction and the high computational cost of the diffusion model. Specifically, in this study we transform the cutting-edge Denoising Diffusion Probabilistic Model (DDPM) into a parallel framework for sub-volume-based sparse-view breast CT image reconstruction in projection and image domains. This novel approach involves the concurrent training of two distinct DDPM models dedicated to processing projection and image data synergistically in the dual domains. Our experimental findings reveal that this method delivers competitive reconstruction performance at half to one-third of the standard radiation doses. This advancement demonstrates an exciting potential of diffusion-type models for volumetric breast reconstruction at high resolution with much-reduced radiation dose and as such hopefully redefines breast cancer screening and diagnosis.
Submitted 28 January, 2024; v1 submitted 22 March, 2023;
originally announced March 2023.
-
VMCML: Video and Music Matching via Cross-Modality Lifting
Authors:
Yi-Shan Lee,
Wei-Cheng Tseng,
Fu-En Wang,
Min Sun
Abstract:
We propose a content-based system for matching video and background music. The system aims to address the challenges in music recommendation for new users or new music given short-form videos. To this end, we propose a cross-modal framework, VMCML, that finds a shared embedding space between video and music representations. To ensure the embedding space can be effectively shared by both representations, we leverage CosFace loss based on margin-based cosine similarity loss. Furthermore, we establish a large-scale dataset called MSVD, in which we provide 390 individual music tracks and the corresponding 150,000 matched videos. We conduct extensive experiments on YouTube-8M and our MSVD datasets. Our quantitative and qualitative results demonstrate the effectiveness of our proposed framework, which achieves state-of-the-art video and music matching performance.
Submitted 22 March, 2023;
originally announced March 2023.
-
SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks
Authors:
Kai-Wei Chang,
Yu-Kai Wang,
Hua Shen,
Iu-thing Kang,
Wei-Cheng Tseng,
Shang-Wen Li,
Hung-yi Lee
Abstract:
Prompt tuning is a technology that tunes a small set of parameters to steer a pre-trained language model (LM) to directly generate the output for downstream tasks. Recently, prompt tuning has demonstrated its storage and computation efficiency in both natural language processing (NLP) and speech processing fields. These advantages have also revealed prompt tuning as a candidate approach to serving a pre-trained LM for multiple tasks in a unified manner. For speech processing, SpeechPrompt shows high parameter efficiency and competitive performance on a few speech classification tasks. However, whether SpeechPrompt is capable of serving a large number of tasks is unanswered. In this work, we propose SpeechPrompt v2, a prompt tuning framework capable of performing a wide variety of speech classification tasks, covering multiple languages and prosody-related tasks. The experimental results show that SpeechPrompt v2 achieves performance on par with prior works with fewer than 0.15M trainable parameters in a unified framework.
Submitted 1 March, 2023;
originally announced March 2023.
-
Ensemble knowledge distillation of self-supervised speech models
Authors:
Kuan-Po Huang,
Tzu-hsun Feng,
Yu-Kuan Fu,
Tsu-Yuan Hsu,
Po-Chieh Yen,
Wei-Cheng Tseng,
Kai-Wei Chang,
Hung-yi Lee
Abstract:
Distilled self-supervised models have shown competitive performance and efficiency in recent years. However, there is a lack of experience in jointly distilling multiple self-supervised speech models. In our work, we performed Ensemble Knowledge Distillation (EKD) on various self-supervised speech models such as HuBERT, RobustHuBERT, and WavLM. We applied two different aggregation techniques, layerwise-average and layerwise-concatenation, to the representations of different teacher models and found that the former was more effective. On top of that, we proposed a multiple prediction head method for student models to predict different layer outputs of multiple teacher models simultaneously. The experimental results show that our method improves the performance of the distilled models on four downstream speech processing tasks (Phoneme Recognition, Speaker Identification, Emotion Recognition, and Automatic Speech Recognition) in the hidden-set track of the SUPERB benchmark.
Submitted 24 February, 2023;
originally announced February 2023.
-
FingerMapper: Mapping Finger Motions onto Virtual Arms to Enable Safe Virtual Reality Interaction in Confined Spaces
Authors:
Wen-Jie Tseng,
Samuel Huron,
Eric Lecolinet,
Jan Gugenheimer
Abstract:
Whole-body movements enhance the presence and enjoyment of Virtual Reality (VR) experiences. However, using large gestures is often uncomfortable and impossible in confined spaces (e.g., public transport). We introduce FingerMapper, mapping small-scale finger motions onto virtual arms and hands to enable whole-body virtual movements in VR. In a first target selection study (n=13) comparing FingerMapper to hand tracking and ray-casting, we found that FingerMapper can significantly reduce physical motions and fatigue while having a similar degree of precision. In a consecutive study (n=13), we compared FingerMapper to hand tracking inside a confined space (the front passenger seat of a car). The results showed participants had significantly higher perceived safety and fewer collisions with FingerMapper while preserving a similar degree of presence and enjoyment as hand tracking. Finally, we present three example applications demonstrating how FingerMapper could be applied for locomotion and interaction for VR in confined spaces.
Submitted 23 February, 2023;
originally announced February 2023.
-
CroCo: Cross-Modal Contrastive learning for localization of Earth Observation data
Authors:
Wei-Hsin Tseng,
Hoàng-Ân Lê,
Alexandre Boulch,
Sébastien Lefèvre,
Dirk Tiede
Abstract:
It is of interest to localize a ground-based LiDAR point cloud on remote sensing imagery. In this work, we tackle a subtask of this problem, i.e., mapping a digital elevation model (DEM) rasterized from an aerial LiDAR point cloud onto aerial imagery. We propose a contrastive learning-based method that trains on DEM and high-resolution optical imagery, and we evaluate the framework under different data sampling strategies and hyperparameters. In the best scenario, a Top-1 score of 0.71 and a Top-5 score of 0.81 are obtained. The proposed method is promising for feature learning from RGB and DEM for localization and is potentially applicable to other data sources too. Source code will be released at https://github.com/wtseng530/AVLocalization.
Submitted 14 April, 2022;
originally announced April 2022.
-
DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training and Distribution of Opinion Scores
Authors:
Wei-Cheng Tseng,
Wei-Tsung Kao,
Hung-yi Lee
Abstract:
Mean opinion score (MOS) is a typical subjective evaluation metric for speech synthesis systems. Since collecting MOS is time-consuming, accurate MOS prediction models for automatic evaluation are desirable. In this work, we propose DDOS, a novel MOS prediction model. DDOS utilizes domain adaptive pre-training to further pre-train self-supervised learning models on synthetic speech. In addition, a proposed module models the opinion score distribution of each utterance. With the proposed components, DDOS outperforms previous works on the BVCC dataset, and the zero-shot transfer result on the BC2019 dataset is significantly improved. DDOS also won second place in the Interspeech 2022 VoiceMOS challenge in terms of system-level score.
Submitted 15 August, 2022; v1 submitted 7 April, 2022;
originally announced April 2022.
-
SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks
Authors:
Kai-Wei Chang,
Wei-Cheng Tseng,
Shang-Wen Li,
Hung-yi Lee
Abstract:
Speech representations learned from self-supervised learning (SSL) models can benefit various speech processing tasks. However, utilizing SSL representations usually requires fine-tuning the pre-trained models or designing task-specific downstream models and loss functions, which costs considerable memory and human labor. Recently, prompting in Natural Language Processing (NLP) has been found to be an efficient technique to leverage pre-trained language models (LMs). Specifically, prompt tuning optimizes a limited number of task-specific parameters with a fixed pre-trained model; as a result, only a small set of parameters needs to be stored for each task. Prompt tuning improves computation and memory efficiency by leveraging the pre-trained LM's prediction ability. Nevertheless, such a paradigm has been little studied in the speech community. We report in this paper the first exploration of the prompt tuning paradigm for speech processing tasks based on the Generative Spoken Language Model (GSLM). Experimental results show that the prompt tuning technique achieves competitive performance in speech classification tasks with fewer trainable parameters than fine-tuning specialized downstream models. We further study the technique in challenging sequence generation tasks, where prompt tuning also demonstrates its potential; its limitations and possible research directions are discussed in this paper. The source code is available at https://github.com/ga642381/SpeechPrompt.
Submitted 10 July, 2022; v1 submitted 30 March, 2022;
originally announced March 2022.
-
Boreas: A Multi-Season Autonomous Driving Dataset
Authors:
Keenan Burnett,
David J. Yoon,
Yuchen Wu,
Andrew Zou Li,
Haowei Zhang,
Shichen Lu,
Jingxing Qian,
Wei-Kang Tseng,
Andrew Lambert,
Keith Y. K. Leung,
Angela P. Schoellig,
Timothy D. Barfoot
Abstract:
The Boreas dataset was collected by driving a repeated route over the course of one year, resulting in stark seasonal variations and adverse weather conditions such as rain and falling snow. In total, the Boreas dataset includes over 350km of driving data featuring a 128-channel Velodyne Alpha Prime lidar, a 360$^\circ$ Navtech CIR304-H scanning radar, a 5MP FLIR Blackfly S camera, and centimetre-accurate post-processed ground truth poses. Our dataset will support live leaderboards for odometry, metric localization, and 3D object detection. The dataset and development kit are available at https://www.boreas.utias.utoronto.ca
Submitted 26 January, 2023; v1 submitted 18 March, 2022;
originally announced March 2022.
-
The Dark Side of Perceptual Manipulations in Virtual Reality
Authors:
Wen-Jie Tseng,
Elise Bonnail,
Mark McGill,
Mohamed Khamis,
Eric Lecolinet,
Samuel Huron,
Jan Gugenheimer
Abstract:
"Virtual-Physical Perceptual Manipulations" (VPPMs) such as redirected walking and haptics expand the user's capacity to interact with Virtual Reality (VR) beyond what would ordinarily physically be possible. VPPMs leverage knowledge of the limits of human perception to effect changes in the user's physical movements, becoming able to (perceptibly and imperceptibly) nudge their physical actions to enhance interactivity in VR. We explore the risks posed by the malicious use of VPPMs. First, we define, conceptualize and demonstrate the existence of VPPMs. Next, using speculative design workshops, we explore and characterize the threats/risks posed, proposing mitigations and preventative recommendations against the malicious use of VPPMs. Finally, we implement two sample applications to demonstrate how existing VPPMs could be trivially subverted to create the potential for physical harm. This paper aims to raise awareness that the current way we apply and publish VPPMs can lead to malicious exploits of our perceptual vulnerabilities.
Submitted 26 February, 2022;
originally announced February 2022.
-
A SUBLIME 3D Model for Cometary Coma Emission: the Hypervolatile-Rich Comet C/2016 R2 (PanSTARRS)
Authors:
M. A. Cordiner,
I. M. Coulson,
E. Garcia-Berrios,
C. Qi,
F. Lique,
M. Zoltowski,
M. de Val-Borro,
Y. -J. Kuan,
W. -H. Ip,
S. Mairs,
N. X. Roth,
S. B. Charnley,
S. N. Milam,
W. -L Tseng,
Y. -L Chuang
Abstract:
The coma of comet C/2016 R2 (PanSTARRS) is one of the most chemically peculiar ever observed, in particular due to its extremely high CO/H2O and N2+/H2O ratios, and unusual trace volatile abundances. However, the complex shape of its CO emission lines, as well as uncertainties in the coma structure and excitation, has led to ambiguities in the total CO production rate. We performed high-resolution, spatially, spectrally and temporally resolved CO observations using the James Clerk Maxwell Telescope (JCMT) and Submillimeter Array (SMA) to elucidate the outgassing behaviour of C/2016 R2. Results are analyzed using a new, time-dependent, three-dimensional radiative transfer code (SUBLIME), incorporating, for the first time, accurate state-to-state collisional rate coefficients for the CO--CO system. The total CO production rate was found to be in the range $(3.8-7.6)\times10^{28}$ s$^{-1}$ between 2018-01-13 and 2018-02-01, with a mean value of $(5.3\pm0.6)\times10^{28}$ s$^{-1}$ at $r_H$ = 2.8-2.9 au. The emission is concentrated in a near-sunward jet, with an outflow velocity $0.51\pm0.01$ km/s, compared to $0.25\pm0.01$ km/s in the ambient (and night-side) coma. Evidence was also found for an extended source of CO emission, possibly due to icy grain sublimation around $1.2\times10^5$ km from the nucleus. Based on the coma molecular abundances, we propose that the nucleus ices of C/2016 R2 can be divided into a rapidly sublimating apolar phase, rich in CO, CO2, N2 and CH3OH, and a predominantly frozen (or less abundant), polar phase containing more H2O, CH4, H2CO and HCN.
Submitted 23 February, 2022;
originally announced February 2022.
-
CLA-NeRF: Category-Level Articulated Neural Radiance Field
Authors:
Wei-Cheng Tseng,
Hung-Ju Liao,
Lin Yen-Chen,
Min Sun
Abstract:
We propose CLA-NeRF -- a Category-Level Articulated Neural Radiance Field that can perform view synthesis, part segmentation, and articulated pose estimation. CLA-NeRF is trained at the object category level using no CAD models and no depth, but a set of RGB images with ground truth camera poses and part segments. During inference, it only takes a few RGB views (i.e., few-shot) of an unseen 3D object instance within the known category to infer the object part segmentation and the neural radiance field. Given an articulated pose as input, CLA-NeRF can perform articulation-aware volume rendering to generate the corresponding RGB image at any camera pose. Moreover, the articulated pose of an object can be estimated via inverse rendering. In our experiments, we evaluate the framework across five categories on both synthetic and real-world data. In all cases, our method shows realistic deformation results and accurate articulated pose estimation. We believe that both few-shot articulated object rendering and articulated pose estimation open doors for robots to perceive and interact with unseen articulated objects.
Submitted 3 March, 2022; v1 submitted 31 January, 2022;
originally announced February 2022.
-
Meta-CPR: Generalize to Unseen Large Number of Agents with Communication Pattern Recognition Module
Authors:
Wei-Cheng Tseng,
Wei Wei,
Da-Cheng Juan,
Min Sun
Abstract:
Designing an effective communication mechanism among agents in reinforcement learning has been a challenging task, especially for real-world applications. In real-world scenarios, the number of agents can grow, or an environment may need to interact with a changing number of agents. To be practical for real-world applications, a multi-agent framework therefore needs to handle various agent scenarios, in terms of both scale and dynamics. We formulate the multi-agent environment with a different number of agents as a multi-tasking problem and propose a meta reinforcement learning (meta-RL) framework to tackle this problem. The proposed framework employs a meta-learned Communication Pattern Recognition (CPR) module to identify communication behavior and extract information that facilitates the training process. Experimental results are poised to demonstrate that the proposed framework (a) generalizes to an unseen larger number of agents and (b) allows the number of agents to change between episodes. An ablation study is also provided to justify the proposed CPR design and show that it is effective.
Submitted 31 January, 2022; v1 submitted 14 December, 2021;
originally announced December 2021.
-
Leveraging Sequence Embedding and Convolutional Neural Network for Protein Function Prediction
Authors:
Wei-Cheng Tseng,
Po-Han Chi,
Jia-Hua Wu,
Min Sun
Abstract:
Accurate prediction of protein functions and properties is essential in the biotechnology industry, e.g., in drug development and artificial protein synthesis. The main challenges of protein function prediction are the large label space and the lack of labeled training data. Our method leverages unsupervised sequence embedding and the success of deep convolutional neural networks to overcome these challenges. In contrast, most existing methods delete the rare protein functions to reduce the label space. Furthermore, some existing methods require additional bio-information (e.g., the 3-dimensional structure of the proteins), which is difficult to determine in biochemical experiments. Our proposed method significantly outperforms the other methods on the publicly available benchmark using only protein sequences as input. This allows the process of identifying protein functions to be sped up.
Submitted 1 December, 2021;
originally announced December 2021.
-
The 3D Direct Simulation Monte Carlo Study of Europa Gas Plume
Authors:
Wei-Ling Tseng,
Ian-Lin Lai,
Wing-Huen Ip,
Hsiang-Wen Hsu,
Jong-Shinn Wu
Abstract:
Water outgassing activity on Europa has been detected by space- and ground-based telescopes as well as through reanalysis of the Galileo data (Roth et al. 2014; Sparks et al. 2016, 2017; Paganini et al. 2020; Jia et al. 2018; Arnold et al. 2019). However, these observations provide only limited information about plume dynamics, which is critical for understanding the eruption mechanism and preparing for future exploration. We adopt a 3D DSMC model to investigate the plume characteristics of Europa, assuming supersonic expansion originating from the undersurface vent. The main goal is to understand the physical processes and structures of Europa's water vapor plumes, which can play a key role in probing the undersurface vent conditions and outgassing mechanism. With a parametric study of the total gas production rate and initial gas bulk velocity, the gas number density, temperature and velocity of the outgassing plumes are derived for the various case studies. Our results show that the plume gases experience acceleration through mutual collisions and adiabatic cooling when exiting and expanding from the surface. The central part of the plume with the relatively large gas production rates (of $10^{29}$ and $10^{30}$ H2O s$^{-1}$) is found to sustain thermal equilibrium and a nearly continuum condition. Column density maps integrated along two different viewing angles are presented to demonstrate the importance of the projection effect on remote sensing diagnostics. Finally, the density profiles at different altitudes are provided to prepare for observations of Europa plumes, including those by upcoming spacecraft missions such as JUICE and Europa Clipper.
Submitted 29 March, 2022; v1 submitted 26 November, 2021;
originally announced November 2021.
-
Membership Inference Attacks Against Self-supervised Speech Models
Authors:
Wei-Cheng Tseng,
Wei-Tsung Kao,
Hung-yi Lee
Abstract:
Recently, adapting the idea of self-supervised learning (SSL) to continuous speech has started gaining attention. SSL models pre-trained on a huge amount of unlabeled audio can generate general-purpose representations that benefit a wide variety of speech processing tasks. Despite their ubiquitous deployment, however, the potential privacy risks of these models have not been well investigated. In this paper, we present the first privacy analysis of several SSL speech models using Membership Inference Attacks (MIA) under black-box access. The experimental results show that these pre-trained models are vulnerable to MIA and prone to membership information leakage, with high Area Under the Curve (AUC) at both the utterance level and the speaker level. Furthermore, we conduct several ablation studies to understand the factors that contribute to the success of MIA.
Submitted 15 August, 2022; v1 submitted 9 November, 2021;
originally announced November 2021.
-
Self-Calibration of the Offset Between GPS and Semantic Map Frames for Robust Localization
Authors:
Wei-Kang Tseng,
Angela P. Schoellig,
Timothy D. Barfoot
Abstract:
In self-driving, standalone GPS is generally considered to have insufficient positioning accuracy to stay in lane. Instead, many turn to LIDAR localization, but this comes at the expense of building LIDAR maps that can be costly to maintain. Another possibility is to use semantic cues such as lane lines and traffic lights to achieve localization, but these are usually not continuously visible. This issue can be remedied by combining semantic cues with GPS to fill in the gaps. However, due to elapsed time between mapping and localization, the live GPS frame can be offset from the semantic map frame, requiring calibration. In this paper, we propose a robust semantic localization algorithm that self-calibrates for the offset between the live GPS and semantic map frames by exploiting common semantic cues, including traffic lights and lane markings. We formulate the problem using a modified Iterated Extended Kalman Filter, which incorporates GPS and camera images for semantic cue detection via Convolutional Neural Networks. Experimental results show that our proposed algorithm achieves decimetre-level accuracy comparable to typical LIDAR localization performance and is robust against sparse semantic features and frequent GPS dropouts.
Submitted 30 June, 2021; v1 submitted 25 May, 2021;
originally announced May 2021.
-
SUPERB: Speech processing Universal PERformance Benchmark
Authors:
Shu-wen Yang,
Po-Han Chi,
Yung-Sung Chuang,
Cheng-I Jeff Lai,
Kushal Lakhotia,
Yist Y. Lin,
Andy T. Liu,
Jiatong Shi,
Xuankai Chang,
Guan-Ting Lin,
Tzu-Hsien Huang,
Wei-Cheng Tseng,
Ko-tik Lee,
Da-Rong Liu,
Zili Huang,
Shuyan Dong,
Shang-Wen Li,
Shinji Watanabe,
Abdelrahman Mohamed,
Hung-yi Lee
Abstract:
Self-supervised learning (SSL) has proven vital for advancing research in natural language processing (NLP) and computer vision (CV). The paradigm pretrains a shared model on large volumes of unlabeled data and achieves state-of-the-art (SOTA) for various tasks with minimal adaptation. However, the speech processing community lacks a similar setup to systematically explore the paradigm. To bridge this gap, we introduce Speech processing Universal PERformance Benchmark (SUPERB). SUPERB is a leaderboard to benchmark the performance of a shared model across a wide range of speech processing tasks with minimal architecture changes and labeled data. Among multiple usages of the shared model, we especially focus on extracting the representation learned from SSL due to its preferable re-usability. We present a simple framework to solve SUPERB tasks by learning task-specialized lightweight prediction heads on top of the frozen shared model. Our results demonstrate that the framework is promising as SSL representations show competitive generalizability and accessibility across SUPERB tasks. We release SUPERB as a challenge with a leaderboard and a benchmark toolkit to fuel the research in representation learning and general speech processing.
Submitted 15 October, 2021; v1 submitted 3 May, 2021;
originally announced May 2021.
-
Utilizing Self-supervised Representations for MOS Prediction
Authors:
Wei-Cheng Tseng,
Chien-yu Huang,
Wei-Tsung Kao,
Yist Y. Lin,
Hung-yi Lee
Abstract:
Speech quality assessment has been a critical issue in speech processing for decades. Existing automatic evaluations usually require clean references or parallel ground truth data, which is infeasible when the amount of data soars. Subjective tests, on the other hand, do not need any additional clean or parallel data and correlate better with human perception. However, such a test is expensive and time-consuming because crowd work is necessary. It is thus highly desirable to develop an automatic evaluation approach that correlates well with human perception while not requiring ground truth data. In this paper, we use self-supervised pre-trained models for MOS prediction. We show that their representations can distinguish between clean and noisy audio. Then, we fine-tune these pre-trained models followed by simple linear layers in an end-to-end manner. The experimental results show that our framework outperforms the two previous state-of-the-art models by a significant margin on Voice Conversion Challenge 2018 and achieves comparable or superior performance on Voice Conversion Challenge 2016. We also conducted an ablation study to further investigate how each module benefits the task. The experiments are implemented with publicly available toolkits and are reproducible.
Submitted 20 September, 2021; v1 submitted 7 April, 2021;
originally announced April 2021.
-
Toward Robust Long Range Policy Transfer
Authors:
Wei-Cheng Tseng,
Jin-Siang Lin,
Yao-Min Feng,
Min Sun
Abstract:
Humans can master a new task within a few trials by drawing upon skills acquired through prior experience. To mimic this capability, hierarchical models combining primitive policies learned from prior tasks have been proposed. However, these methods fall short of the human range of transferability. We propose a method that leverages the hierarchical structure to alternately train the combination function and adapt the set of diverse primitive policies, to efficiently produce a range of complex behaviors on challenging new tasks. We also design two regularization terms to improve the diversity and utilization rate of the primitives in the pre-training phase. We demonstrate that our method outperforms other recent policy transfer methods by combining and adapting these reusable primitives in tasks with continuous action space. The experimental results further show that our approach provides a broader transferring range. The ablation study also shows that the regularization terms are critical for long range policy transfer. Finally, we show that our method consistently outperforms other methods when the quality of the primitives varies.
Submitted 4 March, 2021;
originally announced March 2021.
-
Query Expansion System for the VoxCeleb Speaker Recognition Challenge 2020
Authors:
Yu-Sen Cheng,
Chun-Liang Shih,
Tien-Hong Lo,
Wen-Ting Tseng,
Berlin Chen
Abstract:
In this report, we describe our submission to the VoxCeleb Speaker Recognition Challenge (VoxSRC) 2020. Two approaches are adopted. One is to apply query expansion to speaker verification, which shows significant progress compared to the baseline in the study. The other is to use Kaldi to extract x-vectors and to combine their Probabilistic Linear Discriminant Analysis (PLDA) score with the ResNet score.
Submitted 4 November, 2020;
originally announced November 2020.
-
Effective FAQ Retrieval and Question Matching With Unsupervised Knowledge Injection
Authors:
Wen-Ting Tseng,
Tien-Hong Lo,
Yung-Chang Hsu,
Berlin Chen
Abstract:
Frequently asked question (FAQ) retrieval, with the purpose of providing information on frequent questions or concerns, has far-reaching applications in many areas, where a collection of question-answer (Q-A) pairs compiled a priori can be employed to retrieve an appropriate answer in response to a user's query that is likely to reoccur frequently. To this end, predominant approaches to FAQ retrieval typically rank question-answer pairs by considering either the similarity between the query and a question (q-Q), the relevance between the query and the associated answer of a question (q-A), or combining the clues gathered from the q-Q similarity measure and the q-A relevance measure. In this paper, we extend this line of research by combining the clues gathered from the q-Q similarity measure and the q-A relevance measure and meanwhile injecting extra word interaction information, distilled from a generic (open domain) knowledge base, into a contextual language model for inferring the q-A relevance. Furthermore, we explore capitalizing on domain-specific topically-relevant relations between words in an unsupervised manner, acting as a surrogate for supervised domain-specific knowledge base information. This enables the model to equip sentence representations with knowledge about domain-specific and topically-relevant relations among words, thereby providing a better q-A relevance measure. We evaluate variants of our approach on a publicly-available Chinese FAQ dataset, and further apply and contextualize it to a large-scale question-matching task, which aims to search a QA dataset for questions that have a similar intent to an input query. Extensive experimental results on these two datasets confirm the promising performance of the proposed approach in relation to some state-of-the-art ones.
Submitted 27 October, 2020;
originally announced October 2020.
-
The Saturn Ring Skimmer Mission Concept: The next step to explore Saturn's rings, atmosphere, interior, and inner magnetosphere
Authors:
Matthew S. Tiscareno,
Mar Vaquero,
Matthew M. Hedman,
Hao Cao,
Paul R. Estrada,
Andrew P. Ingersoll,
Kelly E. Miller,
Marzia Parisi,
David H. Atkinson,
Shawn M. Brooks,
Jeffrey N. Cuzzi,
James Fuller,
Amanda R. Hendrix,
Robert E. Johnson,
Tommi Koskinen,
William S. Kurth,
Jonathan I. Lunine,
Philip D. Nicholson,
Carol S. Paty,
Rebecca Schindhelm,
Mark R. Showalter,
Linda J. Spilker,
Nathan J. Strange,
Wendy Tseng
Abstract:
The innovative Saturn Ring Skimmer mission concept enables a wide range of investigations that address fundamental questions about Saturn and its rings, as well as giant planets and astrophysical disk systems in general. This mission would provide new insights into the dynamical processes that operate in astrophysical disk systems by observing individual particles in Saturn's rings for the first time. The Ring Skimmer would also constrain the origin, history, and fate of Saturn's rings by determining their compositional evolution and material transport rates. In addition, the Ring Skimmer would reveal how the rings, magnetosphere, and planet operate as an inter-connected system by making direct measurements of the ring's atmosphere, Saturn's inner magnetosphere and the material flowing from the rings into the planet. At the same time, this mission would clarify the dynamical processes operating in the planet's visible atmosphere and deep interior by making extensive high-resolution observations of cloud features and repeated measurements of the planet's extremely dynamic gravitational field. Given the scientific potential of this basic mission concept, we advocate that it be studied in depth as a potential option for the New Frontiers program.
Submitted 16 September, 2020; v1 submitted 30 July, 2020;
originally announced July 2020.
-
Deep Learning Based Segmentation of Various Brain Lesions for Radiosurgery
Authors:
Siang-Ruei Wu,
Hao-Yun Chang,
Florence T Su,
Heng-Chun Liao,
Wanju Tseng,
Chun-Chih Liao,
Feipei Lai,
Feng-Ming Hsu,
Furen Xiao
Abstract:
Semantic segmentation of medical images with deep learning models is developing rapidly. In this study, we benchmarked state-of-the-art deep learning segmentation algorithms on our clinical stereotactic radiosurgery dataset, demonstrating the strengths and weaknesses of these algorithms in a fairly practical scenario. In particular, we compared the model performances with respect to their sampling method, model architecture, and the choice of loss functions, identifying the suitable settings for their applications and shedding light on the possible improvements.
Submitted 22 July, 2020;
originally announced July 2020.
-
Direct growth of mm-size twisted bilayer graphene by plasma-enhanced chemical vapor deposition
Authors:
Yen-Chun Chen,
Wei-Hsiang Lin,
Wei-Shiuan Tseng,
Chien-Chang Chen,
George R. Rossman,
Chii-Dong Chen,
Yu-Shu Wu,
Nai-Chang Yeh
Abstract:
Plasma enhanced chemical vapor deposition (PECVD) techniques have been shown to be an efficient method to achieve single-step synthesis of high-quality monolayer graphene (MLG) without the need for active heating. Here we report PECVD-growth of single-crystalline hexagonal bilayer graphene (BLG) flakes and mm-size BLG films with the interlayer twist angle controlled by the growth parameters. The twist angle has been determined by three experimental approaches, including direct measurement of the relative orientation of crystalline edges between two stacked monolayers by scanning electron microscopy, analysis of the twist angle-dependent Raman spectral characteristics, and measurement of the Moiré period with scanning tunneling microscopy. In mm-sized twisted BLG (tBLG) films, the average twist angle can be controlled from 0° to approximately 20°, and the angular spread for a given growth condition can be limited to < 7°. Different work functions between MLG and BLG have been verified by Kelvin probe force microscopy and ultraviolet photoelectron spectroscopy. Electrical measurements of back-gated field-effect-transistor devices based on small-angle tBLG samples revealed high-quality electric characteristics at 300 K and insulating temperature dependence down to 100 K. This controlled PECVD-growth of tBLG thus provides an efficient approach to investigate the effect of varying Moiré potentials on tBLG.
Submitted 11 May, 2020;
originally announced May 2020.
-
Oscillating magnetic field effects in high precision metrology
Authors:
H. C. J. Gan,
G. Maslennikov,
K. W. Tseng,
T. R. Tan,
R. Kaewuam,
K. J. Arnold,
D. Matsukevich,
M. D. Barrett
Abstract:
We examine a range of effects arising from ac magnetic fields in high precision metrology. These results are directly relevant to high precision measurements, and accuracy assessments for state-of-the-art optical clocks. Strategies to characterize these effects are discussed and a simple technique to accurately determine trap-induced ac magnetic fields in a linear Paul trap is demonstrated using $^{171}\mathrm{Yb}^+$.
Submitted 7 July, 2018; v1 submitted 1 July, 2018;
originally announced July 2018.
-
Comparison Training for Computer Chinese Chess
Authors:
Wen-Jie Tseng,
Jr-Chang Chen,
I-Chen Wu,
Tinghan Wei
Abstract:
This paper describes the application of comparison training (CT) for automatic feature weight tuning, with the final objective of improving the evaluation functions used in Chinese chess programs. First, we propose an n-tuple network to extract features, since n-tuple networks require very little expert knowledge through their large number of features, while simultaneously allowing easy access. Second, we propose a novel evaluation method that incorporates tapered eval into CT. Experiments show that with the same features and the same Chinese chess program, the automatically tuned comparison training feature weights achieved a win rate of 86.58% against the weights that were hand-tuned. The above trained version was then improved by adding additional features, most importantly n-tuple features. This improved version achieved a win rate of 81.65% against the trained version without additional features.
Submitted 23 January, 2018;
originally announced January 2018.
-
Atomic-scale Structural and Chemical Characterization of Hexagonal Boron Nitride Layers Synthesized at the Wafer-Scale with Monolayer Thickness Control
Authors:
Wei-Hsiang Lin,
Victor W. Brar,
Deep Jariwala,
Michelle C. Sherrott,
Wei-Shiuan Tseng,
Chih-I Wu,
Nai-Chang Yeh,
Harry A. Atwater
Abstract:
Hexagonal boron nitride (h-BN) is a promising two-dimensional insulator with a large band gap and low density of charged impurities that is isostructural and isoelectronic with graphene. Here we report the chemical and atomic-scale structure of CVD-grown wafer-scale (~25 cm$^2$) h-BN sheets ranging in thickness from 1-20 monolayers. Atomic-scale images of h-BN on Au and graphene/Au substrates obtained by scanning tunneling microscopy (STM) reveal high h-BN crystalline quality in monolayer samples. Further characterization of 1-20 monolayer samples indicates uniform thickness for wafer-scale areas; this thickness control is a result of precise control of the precursor flow rate, deposition temperature and pressure. Raman and infrared spectroscopy indicate the presence of B-N bonds and reveal a linear dependence of thickness with growth time. X-ray photoelectron spectroscopy (XPS) shows the film stoichiometry, and the B/N atom ratio in our films is 1 ± 0.6% across the range of thicknesses. Electrical current transport in metal/insulator/metal (Au/h-BN/Au) heterostructures indicates that our CVD-grown h-BN films can act as excellent tunnel barriers with a high hard-breakdown field strength. Our results suggest that large-area h-BN films are structurally, chemically and electronically uniform over the wafer scale, opening the door to pervasive application as a dielectric in layered nanoelectronic and nanophotonic heterostructures.
Submitted 22 May, 2017;
originally announced May 2017.
-
Nanograin densities outside Saturn's A-ring
Authors:
Robert E Johnson,
Wei-Lin Tseng,
Meredith K Elrod,
Ann M Persoon
Abstract:
The observed disparity between the radial dependence of the ion and electron densities measured by the Cassini plasma and radio science instruments is used to show that the region between the outer edge of Saturn's main rings and its tenuous G-ring is permeated with small charged grains (nanograins). These grains emanate from the edge of the A-ring and from the tenuous F-ring and G-ring. This is a region of Saturn's magnetosphere that is relatively unexplored, but will be a focus of Cassini's F-ring orbits prior to the end of mission in September 2017. Confirmation of the grain densities predicted here will enhance our ability to describe the formation and destruction of material in this important region of Saturn's magnetosphere.
Submitted 8 November, 2016;
originally announced November 2016.
-
Optical spectroscopy study of charge density wave order in Sr$_{3}$Rh$_{4}$Sn$_{13}$ and (Sr$_{0.5}$Ca$_{0.5}$)$_{3}$Rh$_{4}$Sn$_{13}$
Authors:
W. J. Ban,
H. P. Wang,
C. W. Tseng,
C. N. Kuo,
C. S. Lue,
N. L. Wang
Abstract:
We perform optical spectroscopy measurements across the charge density wave (CDW) phase transitions on single-crystal samples of Sr$_{3}$Rh$_{4}$Sn$_{13}$ and (Sr$_{0.5}$Ca$_{0.5}$)$_{3}$Rh$_{4}$Sn$_{13}$. Formation of a CDW energy gap was clearly observed for both single-crystal samples when they undergo the phase transitions. The existence of a Drude component in $σ_1(ω)$ below T$_{CDW}$ indicates that the Fermi surface is only partially gapped in the CDW state. The obtained value of 2$Δ$/k$_{B}$T$_{CDW}$ is roughly 13 for both Sr$_{3}$Rh$_{4}$Sn$_{13}$ and (Sr$_{0.5}$Ca$_{0.5}$)$_{3}$Rh$_{4}$Sn$_{13}$ compounds. The value is considerably larger than the mean-field value based on the weak-coupling BCS theory. The observed spectral feature in (Sr$_{x}$Ca$_{1-x}$)$_{3}$Rh$_{4}$Sn$_{13}$ resembles those seen in many other CDW systems.
Submitted 9 January, 2017; v1 submitted 14 September, 2016;
originally announced September 2016.
-
Mn-doping induced ferromagnetism and enhanced superconductivity in Bi_4-x Mn_x O_4 S_3 (0.075 ≤ x ≤ 0.15)
Authors:
Zhenjie Feng,
Xunqing Yin,
Yiming Cao,
Xianglian Peng,
Tian Gao,
Chuan Yu,
Jingzhe Chen,
Baojuan Kang,
Bo Lu,
Juan Guo,
Qing Li,
Wei-Shiuan Tseng,
Zhongquan Ma,
Chao Jing,
Shixun Cao,
Jincang Zhang,
N. -C. Yeh
Abstract:
We demonstrate that Mn-doping in the layered sulfide Bi_4O_4S_3 leads to stable Bi_4-x Mn_x O_4 S_3 compounds that exhibit both long-range ferromagnetism and enhanced superconductivity for 0.075 ≤ x ≤ 0.15, with a possible record superconducting transition temperature (T_c) = 15 K among all BiS_2-based superconductors. We conjecture that the coexistence of superconductivity and ferromagnetism may be attributed to Mn-doping in the spacer Bi_2O_2 layers away from the superconducting BiS_2 layers, whereas the enhancement of T_c may be due to excess electron transfer to BiS_2 from the Mn4+/Mn3+-substitutions in Bi_2O_2. This notion is empirically corroborated by the increased electron-carrier densities upon Mn doping, and by further studies of the Bi_4-x A_x O_4 S_3 compounds (A = Co, Ni; x = 0.1, 0.125), where the T_c values remain comparable to that of the undoped Bi_4O_4S_3 system (= 4.5 K) due to lack of 4+ valences in either Co or Ni ions for excess electron transfer to the BiS_2 layers. These findings therefore shed new light on feasible pathways to enhance the T_c values of BiS_2-based superconductors.
Submitted 15 August, 2016;
originally announced August 2016.
-
Central role of domain wall depinning for perpendicular magnetization switching driven by spin torque from the spin Hall effect
Authors:
O. J. Lee,
L. Q. Liu,
C. F. Pai,
H. W. Tseng,
Y. Li,
D. C. Ralph,
R. A. Buhrman
Abstract:
We study deterministic magnetic reversal of a perpendicularly magnetized Co layer in a Co/MgO/Ta nano-square driven by spin Hall torque from an in-plane current flowing in an underlying Pt layer. The rate-limiting step of the switching process is domain-wall (DW) depinning by spin Hall torque via a thermally-assisted mechanism that eventually produces full reversal by domain expansion. An in-plane applied magnetic field collinear with the current is required, with the necessary field scale set by the need to overcome DW chirality imposed by the Dzyaloshinskii-Moriya interaction. Once Joule heating is taken into account, the switching current density is quantitatively consistent with a spin Hall angle θ$_{SH}$ ≈ 0.07 for 4 nm of Pt.
Submitted 27 December, 2013;
originally announced December 2013.
-
Seasonal and radial trends in Saturn's thermal plasma between the main rings and Enceladus
Authors:
Meredith K. Elrod,
Wei-Ling Tseng,
Adam K. Woodson,
Robert E. Johnson
Abstract:
A goal of Cassini's extended mission has been to examine the seasonal variations of Saturn's magnetosphere, moons, and rings. Recently we showed that the magnetospheric plasma between the main rings and Enceladus exhibited a time dependence that we attributed to a seasonally variable source of oxygen from the main rings (Elrod et al., 2012). Such a temporal variation was subsequently seen in the energetic ion composition (Christon et al., 2013). Here we include the most recent measurements by the Cassini Plasma Spectrometer (CAPS) in our analysis (Elrod et al., 2012) and modeling (Tseng et al., 2013a) of the temporal and radial dependence of the thermal plasma in the region between the main rings and the orbit of Enceladus. Data taken in 2012, well past equinox for which the northern side of the main rings was illuminated, appear consistent with a seasonal variation. Although the thermal plasma in this region comes from two sources, the extended ring atmosphere and the Enceladus torus, that have very different radial and temporal trends, the heavy ion density is found to exhibit a steep radial dependence that is similar for all years examined. Using our chemical model, we show that this dependence requires a radial dependence for the Enceladus torus that differs from recent models or, more likely, enhanced heavy ion quenching with decreasing distance from the edge of the main rings. We examine the possible physical processes and suggest that the precipitation of the inward diffusing high energy background radiation onto the edge of the main rings could play an important role.
Submitted 14 December, 2013;
originally announced December 2013.
-
FuSSO: Functional Shrinkage and Selection Operator
Authors:
Junier B. Oliva,
Barnabas Poczos,
Timothy Verstynen,
Aarti Singh,
Jeff Schneider,
Fang-Cheng Yeh,
Wen-Yih Tseng
Abstract:
We present the FuSSO, a functional analogue to the LASSO, that efficiently finds a sparse set of functional input covariates to regress a real-valued response against. The FuSSO does so in a semi-parametric fashion, making no parametric assumptions about the nature of input functional covariates and assuming a linear form to the mapping of functional covariates to the response. We provide a statistical backing for use of the FuSSO via proof of asymptotic sparsistency under various conditions. Furthermore, we observe good results on both synthetic and real-world data.
Submitted 8 March, 2014; v1 submitted 9 November, 2013;
originally announced November 2013.
-
The Atomic Hydrogen Cloud in the Saturnian System
Authors:
W. -L. Tseng,
R. E. Johnson,
W. -H. Ip
Abstract:
The Voyager flyby observations revealed that a very broad doughnut-shaped distribution of hydrogen atoms existed in the Saturnian magnetosphere. Recent Cassini observations confirmed the local-time asymmetry but also showed the hydrogen cloud density increases with decreasing distance to Saturn. The origin of the atomic hydrogen cloud has been debated ever since. Therefore, we have carried out a global investigation of the atomic hydrogen cloud taking into account all possible sources: 1) the Saturnian atmosphere, 2) the H2 atmosphere of main rings, 3) Enceladus H2O and OH torus, 4) Titan H2 torus and 5) the atomic hydrogen directly escaping from Titan. We show that the H ejection velocity and angle distribution are modified by collisions of the hot H, produced by electron-impact dissociation of H2, with the ambient atmospheric H2 and H. This in turn affects the morphology of the escaping hydrogen as does the morphology of the ionospheric electron distribution. That Saturn's atmosphere is an important source is suggested by the fact that the H cloud peaks well below the ring plane, a feature that, so far, we cannot reproduce by the dissociation of the ring H2 atmosphere or other proposed sources. Our simulations show that H directly escaping from Titan is a major contribution in the outer magnetosphere. The morphology of Titan H torus, shaped by the solar radiation pressure and the Saturnian oblateness, can account for the local time asymmetry near Titan orbit. Dissociation of H2O and OH in the Enceladus torus contributes inside ~5 RS, but dissociation of Titan H2 torus does not due to the significant energy released. For the total number of H observed by Cassini inside 5 RS, our modeling results suggest ~20% from dissociation in the Enceladus torus, ~10% from dissociation of ring H2 atmosphere, and ~50% from Titan H torus, implying that ~20% comes from the Saturnian atmosphere.
Submitted 13 February, 2013;
originally announced February 2013.
-
Spin transfer torque devices utilizing the giant spin Hall effect of tungsten
Authors:
Chi-Feng Pai,
Luqiao Liu,
Y. Li,
H. W. Tseng,
D. C. Ralph,
R. A. Buhrman
Abstract:
We report a giant spin Hall effect (SHE) in β-W thin films. Using spin torque induced ferromagnetic resonance with a β-W/CoFeB bilayer microstrip we determine the spin Hall angle to be |θ| = 0.30 ± 0.02, large enough for an in-plane current to efficiently reverse the orientation of an in-plane magnetized CoFeB free layer of a nanoscale magnetic tunnel junction adjacent to a thin β-W layer. From switching data obtained with such 3-terminal devices we independently determine |θ| = 0.33 ± 0.06. We also report variation of the spin Hall switching efficiency with W layers of different resistivities and hence of variable (α and β) phase composition.
Submitted 8 August, 2012;
originally announced August 2012.
-
Spin torque switching with the giant spin Hall effect of tantalum
Authors:
Luqiao Liu,
Chi-Feng Pai,
Y. Li,
H. W. Tseng,
D. C. Ralph,
R. A. Buhrman
Abstract:
We report a giant spin Hall effect (SHE) in β-Ta that generates spin currents intense enough to induce efficient spin-transfer-torque switching of ferromagnets, thereby providing a new approach for controlling magnetic devices that can be superior to existing technologies. We quantify this SHE by three independent methods and demonstrate spin-torque (ST) switching of both out-of-plane and in-plane magnetized layers. We implement a three-terminal device that utilizes current passing through a low impedance Ta-ferromagnet bilayer to effect switching of a nanomagnet, with a higher-impedance magnetic tunnel junction for read-out. The efficiency and reliability of this device, together with its simplicity of fabrication, suggest that this three-terminal SHE-ST design can eliminate the main obstacles currently impeding the development of magnetic memory and non-volatile spin logic technologies.
Submitted 13 March, 2012;
originally announced March 2012.
-
Modeling the Seasonal Variability of the Plasma Environment in Saturn's Magnetosphere between Main Rings and Mimas
Authors:
W. -L. Tseng,
R. E. Johnson,
M. K. Elrod
Abstract:
The detection of O2+ and O+ ions over Saturn's main rings by the Cassini INMS and CAPS instruments at Saturn orbit insertion (SOI) in 2004 confirmed the existence of the ring atmosphere and ionosphere. The source mechanism was suggested to be primarily photolytic decomposition of water ice producing neutral O2 and H2 (Johnson et al., 2006). Therefore, we predicted that there would be seasonal variations in the ring atmosphere and ionosphere due to the orientation of the ring plane to the sun (Tseng et al., 2010). The atoms and molecules scattered out of the ring atmosphere by ion-molecule collisions are an important source for the inner magnetosphere (Johnson et al., 2006; Martens et al., 2008; Tseng et al., 2010 and 2011). This source competes with water products from Enceladus' plumes, which, although possibly variable, do not appear to have a seasonal variability (Smith et al., 2010). Recently, we found that the plasma density, composition and temperature in the region from 2.5 to 3.5 RS exhibited significant seasonal variation between 2004 and 2010 (Elrod et al., 2011). Here we present a one-box ion chemistry model to explain the complex and highly variable plasma environment observed by the CAPS instrument on Cassini. We combine the water products from Enceladus with the molecules scattered from a corrected ring atmosphere, in order to describe the temporal changes in ion densities, composition and temperature detected by CAPS. We found that the observed temporal variations are primarily seasonal, due to the predicted seasonal variation in the ring atmosphere, and are consistent with a compressed magnetosphere at SOI.
Submitted 22 December, 2011;
originally announced December 2011.