arXiv:2406.13448 [pdf, other]

Demonstration of High-Efficiency Microwave Heating Producing Record Highly Charged Xenon Ion Beams with Superconducting ECR Ion Sources

Authors: X. Wang, J. B. Li, V. Mironov, J. W. Guo, X. Z. Zhang, O. Tarvainen, Y. C. Feng, L. X. Li, J. D. Ma, Z. H. Zhang, W. Lu, S. Bogomolov, L. Sun, H. W. Zhao

Abstract: Intense highly charged ion beam production is essential for high-power heavy ion accelerators. A novel movable Vlasov launcher for superconducting high charge state Electron Cyclotron Resonance (ECR) ion source has been devised that can affect the microwave power effectiveness by a factor of about 4 in terms of highly charged ion beam production. This approach based on a dedicated microwave launching system instead of the traditional coupling scheme has led to new insight on microwave-plasma interaction. With this new understanding, the world record highly charged xenon ion beam currents have been enhanced by up to a factor of 2, which could directly and significantly enhance the performance of heavy ion accelerators and provide many new research opportunities in nuclear physics, atomic physics and other disciplines.

Submitted 14 July, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

arXiv:2405.15525 [pdf, other]

Sparse Matrix in Large Language Model Fine-tuning

Authors: Haoze He, Juncheng Billy Li, Xuan Jiang, Heather Miller

Abstract: LoRA and its variants have become popular parameter-efficient fine-tuning (PEFT) methods due to their ability to avoid excessive computational costs. However, an accuracy gap often exists between PEFT methods and full fine-tuning (FT), and this gap has yet to be systematically studied. In this work, we introduce a method for selecting sparse sub-matrices that aims to minimize the performance gap between PEFT and full fine-tuning (FT) while also reducing both fine-tuning computational cost and memory cost. Our Sparse Matrix Tuning (SMT) method begins by identifying the most significant sub-matrices in the gradient update, updating only these blocks during the fine-tuning process. In our experiments, we demonstrate that SMT consistently surpasses other PEFT baselines (e.g. LoRA and DoRA) in fine-tuning popular large language models such as LLaMA across a broad spectrum of tasks, while reducing the GPU memory footprint by 67% compared to FT. We also examine how the performance of LoRA and DoRA tends to plateau and decline as the number of trainable parameters increases; in contrast, our SMT method does not suffer from such an issue.

Submitted 29 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

Comments: 14 pages

arXiv:2404.04543 [pdf]

Early Adoption of Generative AI by Global Business Leaders: Insights from an INSEAD Alumni Survey

Authors: Jason P Davis, Jian Bai Li

Abstract: How are new technologies like generative AI quickly adopted and used by executive and managerial leaders to create value in organizations? A survey of INSEAD's global alumni base revealed several intriguing insights into perceptions and engagements with generative AI across a broad spectrum of demographics, industries, and geographies. Notably, there's a prevailing optimism about the role of generative AI in enhancing productivity and innovation, as evidenced by the 90% of respondents being excited about its time-saving and efficiency benefits. Analysis revealed different attitudes about adoption and use across demographic variables. Younger respondents are significantly more excited about generative AI and more likely to be using it at work and in personal life than older participants. Those in Europe have a somewhat more distant view of generative AI than those in North America and Asia, in that they see the gains more likely to be captured by organizations than individuals, and are less likely to be using it in professional and personal contexts than those in North America and Asia. This may also be related to the fact that those in Europe are more likely to be working in Financial Services and less likely to be working in Information Technology industries than those in North America and Asia. Despite this, those in Europe are more likely to see AGI happening faster than those in North America, although this may reflect less interaction with generative AI in personal and professional contexts.
These findings collectively underscore the complex and multifaceted perceptions of generative AI's role in society, pointing to both its promising potential and the challenges it presents.

Submitted 6 April, 2024; originally announced April 2024.

arXiv:2212.05603 [pdf, other]

Error-aware Quantization through Noise Tempering

Authors: Zheng Wang, Juncheng B Li, Shuhui Qu, Florian Metze, Emma Strubell

Abstract: Quantization has become a predominant approach for model compression, enabling deployment of large models trained on GPUs onto smaller form-factor devices for inference. Quantization-aware training (QAT) optimizes model parameters with respect to the end task while simulating quantization error, leading to better performance than post-training quantization. Approximation of gradients through the non-differentiable quantization operator is typically achieved using the straight-through estimator (STE) or additive noise. However, STE-based methods suffer from instability due to biased gradients, whereas existing noise-based methods cannot reduce the resulting variance. In this work, we incorporate exponentially decaying quantization-error-aware noise together with a learnable scale of task loss gradient to approximate the effect of a quantization operator. We show this method combines gradient scale and quantization noise in a better optimized way, providing finer-grained estimation of gradients at each weight and activation layer's quantizer bin size. Our controlled noise also contains an implicit curvature term that could encourage flatter minima, which we show is indeed the case in our experiments. Experiments training ResNet architectures on the CIFAR-10, CIFAR-100 and ImageNet benchmarks show that our method obtains state-of-the-art top-1 classification accuracy for uniform (non mixed-precision) quantization, out-performing previous methods by 0.5-1.2% absolute.

Submitted 11 December, 2022; originally announced December 2022.

arXiv:2210.07171 [pdf, other]

SQuAT: Sharpness- and Quantization-Aware Training for BERT

Authors: Zheng Wang, Juncheng B Li, Shuhui Qu, Florian Metze, Emma Strubell

Abstract: Quantization is an effective technique to reduce memory footprint, inference latency, and power consumption of deep learning models. However, existing quantization methods suffer from accuracy degradation compared to full-precision (FP) models due to the errors introduced by coarse gradient estimation through non-differentiable quantization layers. The existence of sharp local minima in the loss landscapes of overparameterized models (e.g., Transformers) tends to aggravate such performance penalty in low-bit (2, 4 bits) settings. In this work, we propose sharpness- and quantization-aware training (SQuAT), which would encourage the model to converge to flatter minima while performing quantization-aware training. Our proposed method alternates training between sharpness objective and step-size objective, which could potentially let the model learn the most suitable parameter update magnitude to reach convergence near-flat minima. Extensive experiments show that our method can consistently outperform state-of-the-art quantized BERT models under 2, 3, and 4-bit settings on GLUE benchmarks by 1%, and can sometimes even outperform full precision (32-bit) models. Our experiments on empirical measurement of sharpness also suggest that our method would lead to flatter minima compared to other quantization methods.

Submitted 13 October, 2022; originally announced October 2022.

arXiv:2205.03268 [pdf, other]

Robustness of Neural Architectures for Audio Event Detection

Authors: Juncheng B Li, Zheng Wang, Shuhui Qu, Florian Metze

Abstract: Traditionally, in the audio recognition pipeline, noise is suppressed by the "frontend", relying on preprocessing techniques such as speech enhancement. However, it is not guaranteed that noise will not cascade into downstream pipelines. To understand the actual influence of noise on the entire audio pipeline, in this paper, we directly investigate the impact of noise on different types of neural models without the preprocessing step. We measure the recognition performance of 4 different neural network models on the task of environmental sound classification under 3 types of noise: occlusion (to emulate intermittent noise), Gaussian noise (to model continuous noise), and adversarial perturbations (worst-case scenario). Our intuition is that the different ways in which these models process their input (i.e. CNNs have strong locality inductive biases, which Transformers do not have) should lead to observable differences in performance and/or robustness, an understanding of which will enable further improvements. We perform extensive experiments on AudioSet, which is the largest weakly-labeled sound event dataset available. We also seek to explain the behaviors of different models through output distribution change and weight visualization.

Submitted 29 July, 2022; v1 submitted 6 May, 2022; originally announced May 2022.

arXiv:2203.13448 [pdf, other]

AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification

Authors: Juncheng B Li, Shuhui Qu, Po-Yao Huang, Florian Metze

Abstract: After its sweeping success in vision and language tasks, pure attention-based neural architectures (e.g. DeiT) are emerging to the top of audio tagging (AT) leaderboards, which seemingly obsoletes traditional convolutional neural networks (CNNs), feed-forward networks or recurrent networks. However, taking a closer look, there is great variability in published research, for instance, performances of models initialized with pretrained weights differ drastically from without pretraining, training time for a model varies from hours to weeks, and often, essences are hidden in seemingly trivial details. This urgently calls for a comprehensive study since our 1st comparison is half-decade old. In this work, we perform extensive experiments on AudioSet which is the largest weakly-labeled sound event dataset available, we also did an analysis based on the data quality and efficiency. We compare a few state-of-the-art baselines on the AT task, and study the performance and efficiency of 2 major categories of neural architectures: CNN variants and attention-based variants. We also closely examine their optimization procedures. Our opensourced experimental results provide insights to trade-off between performance, efficiency, optimization process, for both practitioners and researchers. Implementation: https://github.com/lijuncheng16/AudioTaggingDoneRight

Submitted 2 April, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

Journal ref: InterSpeech 2022

arXiv:2203.12122 [pdf, other]

On Adversarial Robustness of Large-scale Audio Visual Learning

Authors: Juncheng B Li, Shuhui Qu, Xinjian Li, Po-Yao Huang, Florian Metze

Abstract: As audio-visual systems are being deployed for safety-critical tasks such as surveillance and malicious content filtering, their robustness remains an under-studied area. Existing published work on robustness either does not scale to large-scale dataset, or does not deal with multiple modalities. This work aims to study several key questions related to multi-modal learning through the lens of robustness: 1) Are multi-modal models necessarily more robust than uni-modal models? 2) How to efficiently measure the robustness of multi-modal learning? 3) How to fuse different modalities to achieve a more robust multi-modal model? To understand the robustness of the multi-modal model in a large-scale setting, we propose a density-based metric, and a convexity metric to efficiently measure the distribution of each modality in high-dimensional latent space. Our work provides a theoretical intuition together with empirical evidence showing how multi-modal fusion affects adversarial robustness through these metrics. We further devise a mix-up strategy based on our metrics to improve the robustness of the trained model. Our experiments on AudioSet and Kinetics-Sounds verify our hypothesis that multi-modal models are not necessarily more robust than their uni-modal counterparts in the face of adversarial examples. We also observe our mix-up trained method could achieve as much protection as traditional adversarial training, offering a computationally cheap alternative. Implementation: https://github.com/lijuncheng16/AudioSetDoneRight

Submitted 21 April, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

Journal ref: 2022 International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2022)

arXiv:2011.07430 [pdf, other]

Audio-Visual Event Recognition through the lens of Adversary

Authors: Juncheng B Li, Kaixin Ma, Shuhui Qu, Po-Yao Huang, Florian Metze

Abstract: As audio/visual classification models are widely deployed for sensitive tasks like content filtering at scale, it is critical to understand their robustness along with improving the accuracy. This work aims to study several key questions related to multimodal learning through the lens of adversarial noises: 1) The trade-off between early/middle/late fusion affecting its robustness and accuracy 2) How do different frequency/time domain features contribute to the robustness? 3) How do different neural modules contribute to the adversarial noise? In our experiment, we construct adversarial examples to attack state-of-the-art neural models trained on Google AudioSet. We compare how much attack potency in terms of adversarial perturbation of size $ε$ using different $L_p$ norms we would need to "deactivate" the victim model. Using adversarial noise to ablate multimodal models, we are able to provide insights into what is the best potential fusion strategy to balance the model parameters/accuracy and robustness trade-off and distinguish the robust features versus the non-robust features that various neural networks model tend to learn.

Submitted 14 November, 2020; originally announced November 2020.

Comments: 4 pages

arXiv:1911.00126 [pdf, other]

Adversarial Music: Real World Audio Adversary Against Wake-word Detection System

Authors: Juncheng B. Li, Shuhui Qu, Xinjian Li, Joseph Szurley, J. Zico Kolter, Florian Metze

Abstract: Voice Assistants (VAs) such as Amazon Alexa or Google Assistant rely on wake-word detection to respond to people's commands, which could potentially be vulnerable to audio adversarial examples. In this work, we target our attack on the wake-word detection system, jamming the model with some inconspicuous background music to deactivate the VAs while our audio adversary is present. We implemented an emulated wake-word detection system of Amazon Alexa based on recent publications. We validated our models against the real Alexa in terms of wake-word detection accuracy. Then we computed our audio adversaries with consideration of expectation over transform and we implemented our audio adversary with a differentiable synthesizer. Next, we verified our audio adversaries digitally on hundreds of samples of utterances collected from the real world. Our experiments show that we can effectively reduce the recognition F1 score of our emulated model from 93.4% to 11.0%. Finally, we tested our audio adversary over the air, and verified it works effectively against Alexa, reducing its F1 score from 92.5% to 11.0%. We also verified that non-adversarial music does not disable Alexa as effectively as our music at the same sound level. To the best of our knowledge, this is the first real-world adversarial attack against a commercial-grade VA wake-word detection system. Our code and demo videos can be accessed at https://www.junchengbillyli.com/AdversarialMusic

Submitted 5 December, 2019; v1 submitted 31 October, 2019; originally announced November 2019.

Comments: 9 pages, In Proceedings of NeurIPS 2019 Conference

Journal ref: NIPS2019_9362, pages 11908-11918, 2019, Curran Associates, Inc., http://papers.nips.cc/paper/9362-adversarial-music-real-world-audio-adversary-against-wake-word-detection-system.pdf

arXiv:1711.07102 [pdf]

Elastomeric focusing enables application of hydraulic principles to solid materials in order to create micromechanical actuators with giant displacements

Authors: Nate J Cira, Jason W Khoo, Mika Jain, Jack T Andraka, Morgan L Paull, Amber L Thomas, Kevin Aliado, Chad Viergever, Feiqiao Yu, Jonathan B Li, Canh T Nguyen, Michael Robles, Ismail E Araci, Stephen R Quake

Abstract: A continuing challenge in material science is how to create active materials in which shape changes or displacements can be generated electrically or thermally. Here we borrow principles from hydraulics, in particular that confined geometries can be used to focus expansion into large displacements, to create solid materials with amplified shape changes. Specifically, we confined an elastomeric poly(dimethylsiloxane) sheet between two more rigid layers and caused focused expansion into embossed channels by local resistive heating, resulting in a 10x greater relative displacement than the unconfined geometry. We used this effect to create electrically controlled microfluidic valves that open and close in less than 100 ms, can cycle >10,000 times, and operate with as little as 20 mW of power. We investigate this mechanism and establish design rules by varying dimensions, configurations, and materials. We show the generality of elastomeric focusing by creating additional devices where local heating and expansion are generated either wirelessly through inductive coupling or optically with a laser, allowing arbitrary and dynamic positioning of a microfluidic valve along the channels.

Submitted 19 November, 2017; originally announced November 2017.

Comments: 9 pages, 4 figures, and supplemental material

arXiv:cond-mat/0312319 [pdf, ps, other]

doi 10.1103/PhysRevE.72.021806

Elasticity of polymer vesicles by osmotic pressure: an intermediate theory between fluid membranes and solid shells

Authors: Z. C. Tu, L. Q. Ge, J. B. Li, Z. C. Ou-Yang

Abstract: The entropy of a polymer confined in a curved surface and the elastic free energy of a membrane consisting of polymers are obtained by scaling analysis. It is found that the elastic free energy of the membrane has the form of the in-plane strain energy plus Helfrich's curvature energy [Z. Naturforsch. C 28, 693 (1973)]. The elastic constants in the free energy are obtained by discussing two simplified models: one is the polymer membrane without in-plane strains and asymmetry between its two sides, which is the counterpart of quantum mechanics in curved surface [Jensen and Koppe, Ann. Phys. 63, 586 (1971)]; another is the planar rubber membrane with homogeneous in-plane strains. The equations to describe equilibrium shape and in-plane strains of the polymer vesicles by osmotic pressure are derived by taking the first order variation of the total free energy containing the elastic free energy, the surface tension energy and the term induced by osmotic pressure. The critical pressure, above which a spherical polymer vesicle will lose its stability, is obtained by taking the second order variation of the total free energy. It is found that the in-plane mode also plays an important role in the critical pressure because it couples with the out-of-plane mode. Theoretical results reveal that polymer vesicles possess mechanical properties intermediate between fluid membranes and solid shells.

Submitted 18 August, 2005; v1 submitted 12 December, 2003; originally announced December 2003.

Comments: 17 pages, 1 figure, 1 table

Journal ref: Phys. Rev. E 72, 021806 (2005)

Showing 1–12 of 12 results for author: Li, J B