
arXiv:2407.11925 [pdf, other]

Calibration and simulation of ionization signal and electronics noise in the ICARUS liquid argon time projection chamber

Authors: ICARUS collaboration, P. Abratenko, N. Abrego-Martinez, A. Aduszkiewicz, F. Akbar, L. Aliaga Soplin, M. Artero Pons, J. Asaadi, W. F. Badgett, B. Baibussinov, B. Behera, V. Bellini, R. Benocci, S. Berkman, S. Bertolucci, M. Betancourt, M. Bonesini, T. Boone, B. Bottino, A. Braggiotti, D. Brailsford, S. J. Brice, V. Brio, C. Brizzolari, H. S. Budd, A. Campani, et al. (153 additional authors not shown)

Abstract: The ICARUS liquid argon time projection chamber (LArTPC) neutrino detector has been taking physics data since 2022 as part of the Short-Baseline Neutrino (SBN) Program. This paper details the equalization of the response to charge in the ICARUS time projection chamber (TPC), as well as data-driven tuning of the simulation of ionization charge signals and electronics noise. The equalization procedure removes non-uniformities in the ICARUS TPC response to charge in space and time. This work leverages the copious number of cosmic ray muons available to ICARUS at the surface. The ionization signal shape simulation applies a novel procedure that tunes the simulation to match what is measured in data. The end result of the equalization procedure and simulation tuning allows for a comparison of charge measurements in ICARUS between Monte Carlo simulation and data, showing good performance with minimal residual bias between the two.

Submitted 16 July, 2024; originally announced July 2024.

Report number: FERMILAB-PUB-24-0330-PPD

arXiv:2407.10339 [pdf, other]

Supernova Pointing Capabilities of DUNE

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, D. A. Andrade, et al. (1340 additional authors not shown)

Abstract: The determination of the direction of a stellar core collapse via its neutrino emission is crucial for the identification of the progenitor for a multimessenger follow-up. A highly effective method of reconstructing supernova directions within the Deep Underground Neutrino Experiment (DUNE) is introduced. The supernova neutrino pointing resolution is studied by simulating and reconstructing electron-neutrino charged-current absorption on $^{40}$Ar and elastic scattering of neutrinos on electrons. Procedures to reconstruct individual interactions, including a newly developed technique called "brems flipping", as well as the burst direction from an ensemble of interactions are described. Performance of the burst direction reconstruction is evaluated for supernovae happening at a distance of 10 kpc for a specific supernova burst flux model. The pointing resolution is found to be 3.4 degrees at 68% coverage for a perfect interaction-channel classification and a fiducial mass of 40 kton, and 6.6 degrees for a 10 kton fiducial mass respectively. Assuming a 4% rate of charged-current interactions being misidentified as elastic scattering, DUNE's burst pointing resolution is found to be 4.3 degrees (8.7 degrees) at 68% coverage.

Submitted 14 July, 2024; originally announced July 2024.

Comments: 25 pages, 16 figures

Report number: FERMILAB-PUB-24-0319-LBNF

arXiv:2407.08156 [pdf, other]

AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization

Authors: Shixiong Xu, Chenghao Zhang, Lubin Fan, Gaofeng Meng, Shiming Xiang, Jieping Ye

Abstract: In this study, we introduce a new problem raised by social media and photojournalism, named Image Address Localization (IAL), which aims to predict the readable textual address where an image was taken. Existing two-stage approaches involve predicting geographical coordinates and converting them into human-readable addresses, which can lead to ambiguity and be resource-intensive. In contrast, we propose an end-to-end framework named AddressCLIP to solve the problem with more semantics, consisting of two key ingredients: i) image-text alignment to align images with addresses and scene captions by contrastive learning, and ii) image-geography matching to constrain image features with the spatial distance in terms of manifold learning. Additionally, we have built three datasets from Pittsburgh and San Francisco on different scales specifically for the IAL problem. Experiments demonstrate that our approach achieves compelling performance on the proposed datasets and outperforms representative transfer learning methods for vision-language models. Furthermore, extensive ablations and visualizations exhibit the effectiveness of the proposed method. The datasets and source code are available at https://github.com/xsx1001/AddressCLIP.

Submitted 10 July, 2024; originally announced July 2024.

Comments: Accepted at ECCV 2024

arXiv:2407.00056 [pdf, other]

MMBee: Live Streaming Gift-Sending Recommendations via Multi-Modal Fusion and Behaviour Expansion

Authors: Jiaxin Deng, Shiyao Wang, Yuchen Wang, Jiansong Qi, Liqin Zhao, Guorui Zhou, Gaofeng Meng

Abstract: Live streaming services are becoming increasingly popular due to real-time interactions and entertainment. Viewers can chat and send comments or virtual gifts to express their preferences for the streamers. Accurately modeling the gifting interaction not only enhances users' experience but also increases streamers' revenue. Previous studies on live streaming gifting prediction treat this task as a conventional recommendation problem, and model users' preferences using categorical data and observed historical behaviors. However, it is challenging to precisely describe the real-time content changes in live streaming using limited categorical information. Moreover, due to the sparsity of gifting behaviors, capturing the preferences and intentions of users is quite difficult. In this work, we propose MMBee based on real-time Multi-Modal Fusion and Behaviour Expansion to address these issues. Specifically, we first present a Multi-modal Fusion Module with Learnable Query (MFQ) to perceive the dynamic content of streaming segments and process complex multi-modal interactions, including images, text comments and speech. To alleviate the sparsity issue of gifting behaviors, we present a novel Graph-guided Interest Expansion (GIE) approach that learns both user and streamer representations on large-scale gifting graphs with multi-modal attributes. Comprehensive experiment results show that MMBee achieves significant performance improvements on both public datasets and Kuaishou real-world streaming datasets and the effectiveness has been further validated through online A/B experiments. MMBee has been deployed and is serving hundreds of millions of users at Kuaishou.

Submitted 15 June, 2024; originally announced July 2024.

Comments: Accepted at KDD 2024

arXiv:2406.18817 [pdf, other]

Correspondence-Free Non-Rigid Point Set Registration Using Unsupervised Clustering Analysis

Authors: Mingyang Zhao, Jingen Jiang, Lei Ma, Shiqing Xin, Gaofeng Meng, Dong-Ming Yan

Abstract: This paper presents a novel non-rigid point set registration method that is inspired by unsupervised clustering analysis. Unlike previous approaches that treat the source and target point sets as separate entities, we develop a holistic framework where they are formulated as clustering centroids and clustering members, separately. We then adopt Tikhonov regularization with an $\ell_1$-induced Laplacian kernel instead of the commonly used Gaussian kernel to ensure smooth and more robust displacement fields. Our formulation delivers closed-form solutions, theoretical guarantees, independence from dimensions, and the ability to handle large deformations. Subsequently, we introduce a clustering-improved Nyström method to effectively reduce the computational complexity and storage of the Gram matrix to linear, while providing a rigorous bound for the low-rank approximation. Our method achieves high accuracy results across various scenarios and surpasses competitors by a significant margin, particularly on shapes with substantial deformations. Additionally, we demonstrate the versatility of our method in challenging tasks such as shape transfer and medical registration.

Submitted 26 June, 2024; originally announced June 2024.

Comments: [CVPR 2024 Highlight] Project and code at: https://github.com/zikai1/CVPR24_PointSetReg

arXiv:2406.04749 [pdf, other]

Enhanced preprocessed multi-step splitting iterations for computing PageRank

Authors: Guangcong Meng, Yuehua Feng, Yongxin Dong

Abstract: In recent years, the PageRank algorithm has garnered significant attention due to its crucial role in search engine technologies and its applications across various scientific fields. It is well-known that the power method is a classical method for computing PageRank. However, there is a pressing demand for alternative approaches that can address its limitations and enhance its efficiency. Specifically, the power method converges very slowly when the damping factor is close to 1. To address this challenge, this paper introduces a new multi-step splitting iteration approach for accelerating PageRank computations. Furthermore, we present two new approaches for computing PageRank, which are modifications of the new multi-step splitting iteration approach, specifically utilizing the thick restarted Arnoldi and generalized Arnoldi methods. We provide detailed discussions on the construction and theoretical convergence results of these two approaches. Extensive experiments using large test matrices demonstrate the significant performance improvements achieved by our proposed algorithms.

Submitted 7 June, 2024; originally announced June 2024.
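
Note: The abstract above observes that the classical power method slows down as the damping factor approaches 1. The following is a minimal sketch of that baseline only, not the paper's multi-step splitting or Arnoldi-based methods. The function name `pagerank_power`, the uniform teleportation vector, the tolerance, and the 4-page toy link matrix are all assumptions made for illustration.

```python
import numpy as np

def pagerank_power(P, alpha, tol=1e-10, max_iter=100_000):
    """Classical power iteration x <- alpha * P @ x + (1 - alpha) * v.

    P is assumed column-stochastic; v is a uniform teleportation vector
    (an illustrative choice). Returns the iterate and the number of steps.
    """
    n = P.shape[0]
    v = np.full(n, 1.0 / n)
    x = v.copy()
    for k in range(1, max_iter + 1):
        x_new = alpha * (P @ x) + (1.0 - alpha) * v
        # Stop when successive iterates agree in the 1-norm.
        if np.abs(x_new - x).sum() < tol:
            return x_new, k
        x = x_new
    return x, max_iter

# Hypothetical 4-page link matrix: column j holds the out-link
# probabilities of page j, so every column sums to 1.
P = np.array([
    [0.0, 0.0, 1.0, 0.5],
    [1/3, 0.0, 0.0, 0.0],
    [1/3, 0.5, 0.0, 0.5],
    [1/3, 0.5, 0.0, 0.0],
])

for alpha in (0.85, 0.99):
    x, iters = pagerank_power(P, alpha)
    print(f"alpha={alpha}: {iters} iterations, scores={x.round(4)}")
```

On this toy matrix the larger damping factor needs more iterations to reach the same tolerance. The gap is far more pronounced on large web-like graphs, which is the regime the abstract targets.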

arXiv:2405.20335 [pdf, other]

Xwin-LM: Strong and Scalable Alignment Practice for LLMs

Authors: Bolin Ni, JingCheng Hu, Yixuan Wei, Houwen Peng, Zheng Zhang, Gaofeng Meng, Han Hu

Abstract: In this work, we present Xwin-LM, a comprehensive suite of alignment methodologies for large language models (LLMs). This suite encompasses several key techniques, including supervised finetuning (SFT), reward modeling (RM), rejection sampling finetuning (RS), and direct preference optimization (DPO). The key components are as follows: (1) Xwin-LM-SFT, models initially finetuned with high-quality instruction data; (2) Xwin-Pair, a large-scale, multi-turn preference dataset meticulously annotated using GPT-4; (3) Xwin-RM, reward models trained on Xwin-Pair, developed at scales of 7B, 13B, and 70B parameters; (4) Xwin-Set, a multiwise preference dataset in which each prompt is linked to 64 unique responses generated by Xwin-LM-SFT and scored by Xwin-RM; (5) Xwin-LM-RS, models finetuned with the highest-scoring responses from Xwin-Set; (6) Xwin-LM-DPO, models further optimized on Xwin-Set using the DPO algorithm. Our evaluations on AlpacaEval and MT-bench demonstrate consistent and significant improvements across the pipeline, demonstrating the strength and scalability of Xwin-LM. The repository https://github.com/Xwin-LM/Xwin-LM will be continually updated to foster community research.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.15426 [pdf, other]

AuthNet: Neural Network with Integrated Authentication Logic

Authors: Yuling Cai, Fan Xiang, Guozhu Meng, Yinzhi Cao, Kai Chen

Abstract: Model stealing, i.e., unauthorized access and exfiltration of deep learning models, has become one of the major threats. Proprietary models may be protected by access controls and encryption. However, in reality, these measures can be compromised due to system breaches, query-based model extraction or a disgruntled insider. Security hardening of neural networks also has limits: for example, model watermarking is passive, cannot prevent the occurrence of piracy and is not robust against transformations. To this end, we propose a native authentication mechanism, called AuthNet, which integrates authentication logic as part of the model without any additional structures. Our key insight is to reuse redundant neurons with low activation and embed authentication bits in an intermediate layer, called a gate layer. Then, AuthNet fine-tunes the layers after the gate layer to embed authentication logic so that only inputs with a special secret key can trigger the correct logic of AuthNet. It exhibits two intuitive advantages. It provides the last line of defense, i.e., even if exfiltrated, the model is not usable as the adversary cannot generate valid inputs without the key. Moreover, the authentication logic is difficult to inspect and identify given millions or billions of neurons in the model. We theoretically demonstrate the high sensitivity of AuthNet to the secret key and its high confusion for unauthorized samples. AuthNet is compatible with any convolutional neural network, where our extensive evaluations show that AuthNet successfully achieves the goal in rejecting unauthenticated users (whose average accuracy drops to 22.03%) with a trivial accuracy decrease (1.18% on average) for legitimate users, and is robust against model transformation and adaptive attacks.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.10428 [pdf, other]

Energetic particles transport in constants of motion space due to collisions in tokamak plasmas

Authors: Guo Meng, Philipp Lauber, Zhixin Lu, Andreas Bergmann, Mirelle Schneider

Abstract: The spatio-temporal evolution of the energetic particles in the transport time scale in tokamak plasmas is a key issue of the plasma confinement, especially in burning plasmas. In order to include sources and sinks and collisional slowing down processes, a new solver, ATEP-3D, was implemented to simulate the evolution of the EP distribution in the three-dimensional constants of motion (CoM) space. The Fokker-Planck collision operator represented in the CoM space is derived and numerically calculated. The collision coefficients are averaged over the unperturbed orbits to capture the fundamental properties of EPs. ATEP-3D is fully embedded in the ITER IMAS framework and combined with the LIGKA/HAGIS codes. The finite volume method and the implicit Crank-Nicolson scheme are adopted due to their optimal numerical properties for transport time scale studies. ATEP-3D allows the analysis of the particle and power balance with the source and sink during the transport process to evaluate the EP confinement properties.

Submitted 16 May, 2024; originally announced May 2024.

Comments: 14 pages, 5 figures, 29th IAEA Fusion Energy Conf. 2023

arXiv:2404.00360 [pdf, other]

Reusable Architecture Growth for Continual Stereo Matching

Authors: Chenghao Zhang, Gaofeng Meng, Bin Fan, Kun Tian, Zhaoxiang Zhang, Shiming Xiang, Chunhong Pan

Abstract: The remarkable performance of recent stereo depth estimation models benefits from the successful use of convolutional neural networks to regress dense disparity. Akin to most tasks, this needs gathering training data that covers a number of heterogeneous scenes at deployment time. However, training samples are typically acquired continuously in practical applications, making the capability to learn new scenes continually even more crucial. For this purpose, we propose to perform continual stereo matching where a model is tasked to 1) continually learn new scenes, 2) overcome forgetting previously learned scenes, and 3) continuously predict disparities at inference. We achieve this goal by introducing a Reusable Architecture Growth (RAG) framework. RAG leverages task-specific neural unit search and architecture growth to learn new scenes continually in both supervised and self-supervised manners. It can maintain high reusability during growth by reusing previous units while obtaining good performance. Additionally, we present a Scene Router module to adaptively select the scene-specific architecture path at inference. Comprehensive experiments on numerous datasets show that our framework performs impressively in various weather, road, and city circumstances and surpasses the state-of-the-art methods in more challenging cross-dataset settings. Further experiments also demonstrate the adaptability of our method to unseen scenes, which can facilitate end-to-end stereo architecture learning and practical deployment.

Submitted 30 March, 2024; originally announced April 2024.

Comments: Extended version of CVPR 2022 paper "Continual Stereo Matching of Continuous Driving Scenes with Growing Architecture" - Accepted to TPAMI in 2024

arXiv:2403.16124 [pdf, other]

Enhancing Visual Continual Learning with Language-Guided Supervision

Authors: Bolin Ni, Hongbo Zhao, Chenghao Zhang, Ke Hu, Gaofeng Meng, Zhaoxiang Zhang, Shiming Xiang

Abstract: Continual learning (CL) aims to empower models to learn new tasks without forgetting previously acquired knowledge. Most prior works concentrate on the techniques of architectures, replay data, regularization, etc. However, the category name of each class is largely neglected. Existing methods commonly utilize the one-hot labels and randomly initialize the classifier head. We argue that the scarce semantic information conveyed by the one-hot labels hampers the effective knowledge transfer across tasks. In this paper, we revisit the role of the classifier head within the CL paradigm and replace the classifier with semantic knowledge from pretrained language models (PLMs). Specifically, we use PLMs to generate semantic targets for each class, which are frozen and serve as supervision signals during training. Such targets fully consider the semantic correlation between all classes across tasks. Empirical studies show that our approach mitigates forgetting by alleviating representation drifting and facilitating knowledge transfer across tasks. The proposed method is simple to implement and can seamlessly be plugged into existing methods with negligible adjustments. Extensive experiments based on eleven mainstream baselines demonstrate the effectiveness and generalizability of our approach to various protocols. For example, under the class-incremental learning setting on ImageNet-100, our method significantly improves the Top-1 accuracy by 3.2% to 6.1% while reducing the forgetting rate by 2.6% to 13.1%.

Submitted 24 March, 2024; originally announced March 2024.

Comments: Accepted by CVPR 2024

arXiv:2403.14910 [pdf, other]

Defying Imbalanced Forgetting in Class Incremental Learning

Authors: Shixiong Xu, Gaofeng Meng, Xing Nie, Bolin Ni, Bin Fan, Shiming Xiang

Abstract: We observe a high level of imbalance in the accuracy of different classes in the same old task for the first time. This intriguing phenomenon, discovered in replay-based Class Incremental Learning (CIL), highlights the imbalanced forgetting of learned classes, as their accuracy is similar before the occurrence of catastrophic forgetting. This discovery remains previously unidentified due to the reliance on average incremental accuracy as the measurement for CIL, which assumes that the accuracy of classes within the same task is similar. However, this assumption is invalid in the face of catastrophic forgetting. Further empirical studies indicate that this imbalanced forgetting is caused by conflicts in representation between semantically similar old and new classes. These conflicts are rooted in the data imbalance present in replay-based CIL methods. Building on these insights, we propose CLass-Aware Disentanglement (CLAD) to predict the old classes that are more likely to be forgotten and enhance their accuracy. Importantly, CLAD can be seamlessly integrated into existing CIL methods. Extensive experiments demonstrate that CLAD consistently improves current replay-based methods, resulting in performance gains of up to 2.56%.

Submitted 21 March, 2024; originally announced March 2024.

Comments: AAAI2024

arXiv:2403.14214 [pdf, other]

Asteroseismological analysis of the non-Blazhko RRab star EPIC 248846335 in LAMOST -- Kepler/K2 project

Authors: Peng Zong, Jian-Ning Fu, Jie Su, Xueying Hu, Bo Zhang, Jiaxin Wang, Gao-Chao Liu, Gang Meng, Gianni Catanzaro, Antonio Frasca, Haotian Wang, Weikai Zong

Abstract: We conduct an asteroseismological analysis on the non-Blazhko ab-type RR Lyrae star EPIC 248846335 employing the Radial Stellar Pulsations (RSP) module of the Modules for Experiments in Stellar Astrophysics (MESA) based on the set of stellar parameters. The atmospheric parameters as $T_\mathrm{eff}$ = 6933$\pm$70 $K$, log $g$ = 3.35$\pm$ 0.50 and [Fe/H] = -1.18 $\pm$ 0.14 are estimated from the Low-Resolution Spectra of LAMOST DR9. The luminosity $L$ = 49.70$_{-1.80}^{+2.99}$ $L_\odot$ and mass M = 0.56 $\pm$ 0.07 $M_\odot$ are calculated, respectively, using the distance provided by Gaia and the metallicity estimated from the Low-Resolution Spectra. The Fourier parameters of the light curves observed by $K2$ and RV curves determined from the Medium-Resolution Spectra of LAMOST DR10 are also calculated in this work. The period of the fundamental mode of the star and the residuals $r$ of the Fourier parameters between the models and observations serve to select optimal model, whose stellar parameters are $T_\mathrm{eff}$ = 6700 $\pm$ 220 K, log $g$ = 2.70, [Fe/H] = -1.20 $\pm$ 0.2, M = 0.59 $\pm$ 0.05 $M_\odot$, and $L$ = 56.0 $\pm$ 4.2 $L_\odot$. The projection factors are constrained as 1.20 $\pm$ 0.02 and 1.59 $\pm$ 0.13 by the blue- and red-arm observed velocities with their corresponding RV curves derived from the best-fit model, respectively. The precise determination of stellar parameters in ab-type RR Lyrae stars is crucial for understanding the physical processes that occur during pulsation and for providing a deeper understanding of its Period-Luminosity relationship.

Submitted 23 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

arXiv:2403.11794 [pdf]

In-situ observation of field-induced nano-protrusion growth on a carbon-coated tungsten nanotip

Authors: Guodong Meng, Yimeng Li, Roni Aleksi Koitermaa, Veronika Zadin, Yonghong Cheng, Andreas Kyritsakis

Abstract: Nano-protrusion (NP) on metal surface and its inevitable contamination layer under high electric field is often considered as the primary precursor that leads to vacuum breakdown, which plays an extremely detrimental effect for high energy physics equipment and many other devices. Yet, the NP growth has never been experimentally observed. Here, we conduct field emission (FE) measurements along with in-situ Transmission Electron Microscopy (TEM) imaging of an amorphous-carbon (a-C) coated tungsten nanotip at various nanoscale vacuum gap distances. We find that under certain conditions, the FE current-voltage (I-V) curves switch abruptly into an enhanced-current state, implying the growth of an NP. We then run field emission simulations, demonstrating that the temporary enhanced-current I-V is perfectly consistent with the hypothesis that an NP has grown at the apex of the tip. This hypothesis is also confirmed by the repeatable in-situ observation of such a nano-protrusion and its continued growth during successive FE measurements in TEM. We tentatively attribute this phenomenon to field-induced biased diffusion of surface a-C atoms, after performing a finite element analysis that excludes the alternative possibility of field-induced plastic deformation.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.11530 [pdf, other]

Continual Forgetting for Pre-trained Vision Models

Authors: Hongbo Zhao, Bolin Ni, Haochen Wang, Junsong Fan, Fei Zhu, Yuxi Wang, Yuntao Chen, Gaofeng Meng, Zhaoxiang Zhang

Abstract: For privacy and security concerns, the need to erase unwanted information from pre-trained vision models is becoming evident nowadays. In real-world scenarios, erasure requests originate at any time from both users and model owners. These requests usually form a sequence. Therefore, under such a setting, selective information is expected to be continuously removed from a pre-trained model while maintaining the rest. We define this problem as continual forgetting and identify two key challenges. (i) For unwanted knowledge, efficient and effective deleting is crucial. (ii) For remaining knowledge, the impact brought by the forgetting procedure should be minimal. To address them, we propose Group Sparse LoRA (GS-LoRA). Specifically, towards (i), we use LoRA modules to fine-tune the FFN layers in Transformer blocks for each forgetting task independently, and towards (ii), a simple group sparse regularization is adopted, enabling automatic selection of specific LoRA groups and zeroing out the others. GS-LoRA is effective, parameter-efficient, data-efficient, and easy to implement. We conduct extensive experiments on face recognition, object detection and image classification and demonstrate that GS-LoRA manages to forget specific classes with minimal impact on other classes. Codes will be released on https://github.com/bjzhb666/GS-LoRA.

Submitted 18 March, 2024; originally announced March 2024.

Comments: Accepted by CVPR 2024

arXiv:2403.03212 [pdf, other]

Performance of a modular ton-scale pixel-readout liquid argon time projection chamber

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, D. A. Andrade, et al. (1340 additional authors not shown)

Abstract: The Module-0 Demonstrator is a single-phase 600 kg liquid argon time projection chamber operated as a prototype for the DUNE liquid argon near detector. Based on the ArgonCube design concept, Module-0 features a novel 80k-channel pixelated charge readout and advanced high-coverage photon detection system. In this paper, we present an analysis of an eight-day data set consisting of 25 million cosmic ray events collected in the spring of 2021. We use this sample to demonstrate the imaging performance of the charge and light readout systems as well as the signal correlations between the two. We also report argon purity and detector uniformity measurements, and provide comparisons to detector simulations.

Submitted 5 March, 2024; originally announced March 2024.

Comments: 47 pages, 41 figures

Report number: FERMILAB-PUB-24-0073-LBNF

arXiv:2402.18393 [pdf, other]

Evaluating Decision Optimality of Autonomous Driving via Metamorphic Testing

Authors: Mingfei Cheng, Yuan Zhou, Xiaofei Xie, Junjie Wang, Guozhu Meng, Kairui Yang

Abstract: Autonomous Driving System (ADS) testing is crucial in ADS development, with the current primary focus being on safety. However, the evaluation of non-safety-critical performance, particularly the ADS's ability to make optimal decisions and produce optimal paths for autonomous vehicles (AVs), is equally vital to ensure the intelligence and reduce risks of AVs. Currently, there is little work dedicated to assessing ADSs' optimal decision-making performance due to the lack of corresponding oracles and the difficulty in generating scenarios with non-optimal decisions. In this paper, we focus on evaluating the decision-making quality of an ADS and propose the first method for detecting non-optimal decision scenarios (NoDSs), where the ADS does not compute optimal paths for AVs. Firstly, to deal with the oracle problem, we propose a novel metamorphic relation (MR) aimed at exposing violations of optimal decisions. The MR identifies the property that the ADS should retain optimal decisions when the optimal path remains unaffected by non-invasive changes. Subsequently, we develop a new framework, Decictor, designed to generate NoDSs efficiently. Decictor comprises three main components: Non-invasive Mutation, MR Check, and Feedback. The Non-invasive Mutation ensures that the original optimal path in the mutated scenarios is not affected, while the MR Check is responsible for determining whether non-optimal decisions are made. To enhance the effectiveness of identifying NoDSs, we design a feedback metric that combines both spatial and temporal aspects of the AV's movement. We evaluate Decictor on Baidu Apollo, an open-source and production-grade ADS. The experimental results validate the effectiveness of Decictor in detecting non-optimal decisions of ADSs. Our work provides valuable and original insights into evaluating the non-safety-critical performance of ADSs.

Submitted 28 February, 2024; originally announced February 2024.

arXiv:2402.18104 [pdf, other]

Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction

Authors: Tong Liu, Yingjie Zhang, Zhe Zhao, Yinpeng Dong, Guozhu Meng, Kai Chen

Abstract: In recent years, large language models (LLMs) have demonstrated notable success across various tasks, but the trustworthiness of LLMs is still an open problem. One specific threat is the potential to generate toxic or harmful responses. Attackers can craft adversarial prompts that induce harmful responses from LLMs. In this work, we pioneer a theoretical foundation in LLMs security by identifying bias vulnerabilities within the safety fine-tuning and design a black-box jailbreak method named DRA (Disguise and Reconstruction Attack), which conceals harmful instructions through disguise and prompts the model to reconstruct the original harmful instruction within its completion. We evaluate DRA across various open-source and closed-source models, showcasing state-of-the-art jailbreak success rates and attack efficiency. Notably, DRA boasts a 91.1% attack success rate on OpenAI GPT-4 chatbot.

Submitted 10 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

arXiv:2402.03856 [pdf, ps, other]

High-order stochastic integration schemes for the Rosenbluth-Trubnikov collision operator in particle simulations

Authors: Zhixin Lu, Guo Meng, Tomasz Tyranowski, Alex Chankin

Abstract: In this study, we consider a numerical implementation of the nonlinear Rosenbluth-Trubnikov collision operator for particle simulations in plasma physics in the framework of the finite element method (FEM). The relevant particle evolution equations are formulated as stochastic differential equations, both in the Stratonovich and Itô forms, and are then solved with advanced high-order stochastic numerical schemes. Due to its formulation as a stochastic differential equation, both the drift and diffusion components of the collision operator are treated on an equal footing. Our investigation focuses on assessing the accuracy of these schemes. Previous studies on this subject have used the Euler-Maruyama scheme, which, although popular, is of low order, and requires small time steps to achieve satisfactory accuracy. In this work, we compare the performance of the Euler-Maruyama method to other high-order stochastic methods known in the stochastic differential equations literature. Our study reveals advantageous features of these high-order schemes, such as better accuracy and improved conservation properties of the numerical solution. The main test case used in the numerical experiments is the thermalization of isotropic and anisotropic particle distributions.

Submitted 6 February, 2024; originally announced February 2024.

Comments: 27 pages, 8 figures

arXiv:2402.01568 [pdf, other]

Doping Liquid Argon with Xenon in ProtoDUNE Single-Phase: Effects on Scintillation Light

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, H. Amar Es-sghir, P. Amedo, J. Anderson, D. A. Andrade, C. Andreopoulos , et al. (1300 additional authors not shown)

Abstract: Doping of liquid argon TPCs (LArTPCs) with a small concentration of xenon is a technique for light-shifting and facilitates the detection of the liquid argon scintillation light. In this paper, we present the results of the first doping test ever performed in a kiloton-scale LArTPC. From February to May 2020, we carried out this special run in the single-phase DUNE Far Detector prototype (ProtoDUNE-SP) at CERN, featuring 770 t of total liquid argon mass with 410 t of fiducial mass. The goal of the run was to measure the light and charge response of the detector to the addition of xenon, up to a concentration of 18.8 ppm. The main purpose was to test the possibility for reduction of non-uniformities in light collection, caused by deployment of photon detectors only within the anode planes. Light collection was analysed as a function of the xenon concentration, by using the pre-existing photon detection system (PDS) of ProtoDUNE-SP and an additional smaller set-up installed specifically for this run. In this paper we first summarize our current understanding of the argon-xenon energy transfer process and the impact of the presence of nitrogen in argon with and without xenon dopant. We then describe the key elements of ProtoDUNE-SP and the injection method deployed. Two dedicated photon detectors were able to collect the light produced by xenon and the total light. The ratio of these components was measured to be about 0.65 as 18.8 ppm of xenon were injected. We performed studies of the collection efficiency as a function of the distance between tracks and light detectors, demonstrating enhanced uniformity of response for the anode-mounted PDS. We also show that xenon doping can substantially recover light losses due to contamination of the liquid argon by nitrogen.

Submitted 9 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: 35 pages, 20 figures

Report number: CERN-EP-2024-024; FERMILAB-PUB-23-0819-LBNF

arXiv:2401.07378 [pdf, other]

Efficient approximation of Earth Mover's Distance Based on Nearest Neighbor Search

Authors: Guangyu Meng, Ruyu Zhou, Liu Liu, Peixian Liang, Fang Liu, Danny Chen, Michael Niemier, X. Sharon Hu

Abstract: Earth Mover's Distance (EMD) is an important similarity measure between two distributions, used in computer vision and many other application domains. However, its exact calculation is computationally and memory intensive, which hinders its scalability and applicability for large-scale problems. Various approximate EMD algorithms have been proposed to reduce computational costs, but they suffer from lower accuracy and may require additional memory usage or manual parameter tuning. In this paper, we present a novel approach, NNS-EMD, to approximate EMD using Nearest Neighbor Search (NNS), in order to achieve high accuracy, low time complexity, and high memory efficiency. The NNS operation reduces the number of data points compared in each NNS iteration and offers opportunities for parallel processing. We further accelerate NNS-EMD via vectorization on GPU, which is especially beneficial for large datasets. We compare NNS-EMD with both the exact EMD and state-of-the-art approximate EMD algorithms on image classification and retrieval tasks. We also apply NNS-EMD to calculate transport mapping and realize color transfer between images. NNS-EMD can be 44x to 135x faster than the exact EMD implementation, and achieves superior accuracy, speedup, and memory efficiency over existing approximate EMD methods.

Submitted 19 January, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

arXiv:2401.02316 [pdf]

First-principles Nonadiabatic Dynamics of Molecules at Metal Surfaces with Vibrationally Coupled Electron Transfer

Authors: Gang Meng, James Gardner, Wenjie Dou, Reinhard J. Maurer, Bin Jiang

Abstract: Accurate description of nonadiabatic dynamics of molecules at metal surfaces involving electron transfer has been a longstanding challenge for theory. Here, we tackle this problem by first constructing high-dimensional neural network diabatic potentials including state crossings determined by constrained density functional theory, then applying mixed quantum-classical surface hopping simulations to evolve coupled electron-nuclear motion. Our approach accurately describes the nonadiabatic effects in CO scattering from Au(111) without empirical parameters and yields results agreeing well with experiments under various conditions for this benchmark system. We find that both adiabatic and nonadiabatic energy loss channels have important contributions to the vibrational relaxation of highly vibrationally excited CO(vi = 17), whereas relaxation of low vibrationally excited states of CO(vi = 2) is weak and dominated by nonadiabatic energy loss. The presented approach paves the way for accurate first-principles simulations of electron transfer mediated nonadiabatic dynamics at metal surfaces.

Submitted 4 January, 2024; originally announced January 2024.

arXiv:2312.11057 [pdf, other]

DataElixir: Purifying Poisoned Dataset to Mitigate Backdoor Attacks via Diffusion Models

Authors: Jiachen Zhou, Peizhuo Lv, Yibing Lan, Guozhu Meng, Kai Chen, Hualong Ma

Abstract: Dataset sanitization is a widely adopted proactive defense against poisoning-based backdoor attacks, aimed at filtering out and removing poisoned samples from training datasets. However, existing methods have shown limited efficacy in countering the ever-evolving trigger functions, and often lead to considerable degradation of benign accuracy. In this paper, we propose DataElixir, a novel sanitization approach tailored to purify poisoned datasets. We leverage diffusion models to eliminate trigger features and restore benign features, thereby turning the poisoned samples into benign ones. Specifically, with multiple iterations of the forward and reverse process, we extract intermediary images and their predicted labels for each sample in the original dataset. Then, we identify anomalous samples in terms of the presence of label transition of the intermediary images, detect the target label by quantifying distribution discrepancy, select their purified images considering pixel and feature distance, and determine their ground-truth labels by training a benign model. Experiments conducted on 9 popular attacks demonstrate that DataElixir effectively mitigates various complex attacks while exerting minimal impact on benign accuracy, surpassing the performance of baseline defense methods.

Submitted 19 December, 2023; v1 submitted 18 December, 2023; originally announced December 2023.

Comments: Accepted by AAAI2024

arXiv:2312.03130 [pdf, other]

The DUNE Far Detector Vertical Drift Technology, Technical Design Report

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, H. Amar, P. Amedo, J. Anderson, D. A. Andrade, C. Andreopoulos , et al. (1304 additional authors not shown)

Abstract: DUNE is an international experiment dedicated to addressing some of the questions at the forefront of particle physics and astrophysics, including the mystifying preponderance of matter over antimatter in the early universe. The dual-site experiment will employ an intense neutrino beam focused on a near and a far detector as it aims to determine the neutrino mass hierarchy and to make high-precision measurements of the PMNS matrix parameters, including the CP-violating phase. It will also stand ready to observe supernova neutrino bursts, and seeks to observe nucleon decay as a signature of a grand unified theory underlying the standard model. The DUNE far detector implements liquid argon time-projection chamber (LArTPC) technology, and combines the many tens-of-kiloton fiducial mass necessary for rare event searches with the sub-centimeter spatial resolution required to image those events with high precision. The addition of a photon detection system enhances physics capabilities for all DUNE physics drivers and opens prospects for further physics explorations. Given its size, the far detector will be implemented as a set of modules, with LArTPC designs that differ from one another as newer technologies arise. In the vertical drift LArTPC design, a horizontal cathode bisects the detector, creating two stacked drift volumes in which ionization charges drift towards anodes at either the top or bottom. The anodes are composed of perforated PCB layers with conductive strips, enabling reconstruction in 3D. Light-trap-style photon detection modules are placed both on the cryostat's side walls and on the central cathode where they are optically powered. This Technical Design Report describes in detail the technical implementations of each subsystem of this LArTPC that, together with the other far detector modules and the near detector, will enable DUNE to achieve its physics goals.

Submitted 5 December, 2023; originally announced December 2023.

Comments: 425 pages; 281 figures. Central editing team: A. Heavey, S. Kettell, A. Marchionni, S. Palestini, S. Rajogopalan, R. J. Wilson

Report number: Fermilab Report no: TM-2813-LBNF

arXiv:2311.17824 [pdf, other]

Depth-multiplexing spectral domain OCT for full eye length imaging with a single modulation unit

Authors: Guanghan Meng, Andrew Zhang, Fabio Feroldi, Austin Roorda, Laura Waller

Abstract: Measuring the axial length of the eye is emerging as a crucial approach to measure progression and monitor management of myopia. The high cost of current swept-source OCT devices, the preferred method for such measurements, limits their broad use, especially in lower-income communities. While spectral domain (SD) OCT is a more affordable option, its limited imaging range falls short for full eye length measurement. Existing depth-multiplexing (DM) techniques for SD-OCT provide a workaround by capturing images at multiple depths within the eye. However, these methods typically require multiple light modulation units or detectors for simultaneous imaging across depths, adding complexity and cost. In response, we propose a novel DM-SD-OCT approach that utilizes a single light modulation unit for depth encoding. This innovative method facilitates the capture of images at multiple depths within the eye using a single line scan camera, with subsequent computational demixing. Our implementation of this system successfully enabled simultaneous acquisition and demixing of signals from three distinct depths within the eye. The system's effectiveness was demonstrated using a model eye, confirming its potential as a cost-effective solution for comprehensive eye length measurement in clinical myopia research.

Submitted 29 November, 2023; originally announced November 2023.

arXiv:2310.10170 [pdf]

doi 10.1109/ICICML60161.2023.10424815

Leveraging Knowledge Distillation for Efficient Deep Reinforcement Learning in Resource-Constrained Environments

Authors: Guanlin Meng

Abstract: This paper aims to explore the potential of combining Deep Reinforcement Learning (DRL) with Knowledge Distillation (KD) by distilling various DRL algorithms and studying their distillation effects. By doing so, the computational burden of deep models could be reduced while maintaining the performance. The primary objective is to provide a benchmark for evaluating the performance of different DRL algorithms that have been refined using KD techniques. By distilling these algorithms, the goal is to develop efficient and fast DRL models. This research is expected to provide valuable insights that can facilitate further advancements in this promising direction. By exploring the combination of DRL and KD, this work aims to promote the development of models that require fewer GPU resources, learn more quickly, and make faster decisions in complex environments. The results of this research have the capacity to significantly advance the field of DRL and pave the way for the future deployment of resource-efficient, decision-making intelligent systems.

Submitted 16 October, 2023; originally announced October 2023.

arXiv:2309.05679 [pdf, other]

doi 10.1145/3576915.3616605

Good-looking but Lacking Faithfulness: Understanding Local Explanation Methods through Trend-based Testing

Authors: Jinwen He, Kai Chen, Guozhu Meng, Jiangshan Zhang, Congyi Li

Abstract: While enjoying the great achievements brought by deep learning (DL), people are also worried about the decisions made by DL models, since the high degree of non-linearity of DL models makes the decisions extremely difficult to understand. Consequently, attacks such as adversarial attacks are easy to carry out, but difficult to detect and explain, which has led to a boom in the research on local explanation methods for explaining model decisions. In this paper, we evaluate the faithfulness of explanation methods and find that traditional tests on faithfulness encounter the random dominance problem, i.e., the random selection performs the best, especially for complex data. To further solve this problem, we propose three trend-based faithfulness tests and empirically demonstrate that the new trend tests can better assess faithfulness than traditional tests on image, natural language and security tasks. We implement the assessment system and evaluate ten popular explanation methods. Benefiting from the trend tests, we successfully assess the explanation methods on complex data for the first time, bringing unprecedented discoveries and inspiring future research. Downstream tasks also greatly benefit from the tests. For example, model debugging equipped with faithful explanation methods performs much better for detecting and correcting accuracy and security problems.

Submitted 9 September, 2023; originally announced September 2023.

arXiv:2309.02926 [pdf, other]

Demystifying RCE Vulnerabilities in LLM-Integrated Apps

Authors: Tong Liu, Zizhuang Deng, Guozhu Meng, Yuekang Li, Kai Chen

Abstract: In recent years, Large Language Models (LLMs) have demonstrated remarkable potential across various downstream tasks. LLM-integrated frameworks, which serve as the essential infrastructure, have given rise to many LLM-integrated web apps. However, some of these frameworks suffer from Remote Code Execution (RCE) vulnerabilities, allowing attackers to execute arbitrary code on apps' servers remotely via prompt injections. Despite the severity of these vulnerabilities, no existing work has been conducted for a systematic investigation of them. This leaves a great challenge on how to detect vulnerabilities in frameworks as well as LLM-integrated apps in real-world scenarios. To fill this gap, we present two novel strategies, including 1) a static analysis-based tool called LLMSmith to scan the source code of the framework to detect potential RCE vulnerabilities and 2) a prompt-based automated testing approach to verify the vulnerability in LLM-integrated web apps. We discovered 13 vulnerabilities in 6 frameworks, including 12 RCE vulnerabilities and 1 arbitrary file read/write vulnerability. 11 of them are confirmed by the framework developers, resulting in the assignment of 7 CVE IDs. After testing 51 apps, we found vulnerabilities in 17 apps, 16 of which are vulnerable to RCE and 1 to SQL injection. We responsibly reported all 17 issues to the corresponding developers and received acknowledgments. Furthermore, we amplify the attack impact beyond achieving RCE by allowing attackers to exploit other app users (e.g., app response hijacking, user API key leakage) without direct interaction between the attacker and the victim. Lastly, we propose some mitigating strategies for improving the security awareness of both framework and app developers, helping them to mitigate these risks effectively.

Submitted 8 October, 2023; v1 submitted 6 September, 2023; originally announced September 2023.

arXiv:2308.02770 [pdf, other]

One-stage Low-resolution Text Recognition with High-resolution Knowledge Transfer

Authors: Hang Guo, Tao Dai, Mingyan Zhu, Guanghao Meng, Bin Chen, Zhi Wang, Shu-Tao Xia

Abstract: Recognizing characters from low-resolution (LR) text images poses a significant challenge due to the information deficiency as well as the noise and blur in low-quality images. Current solutions for low-resolution text recognition (LTR) typically rely on a two-stage pipeline that involves super-resolution as the first stage followed by the second-stage recognition. Although this pipeline is straightforward and intuitive, it has to use an additional super-resolution network, which causes inefficiencies during training and testing. Moreover, the recognition accuracy of the second stage heavily depends on the reconstruction quality of the first stage, causing ineffectiveness. In this work, we attempt to address these challenges from a novel perspective: adapting the recognizer to low-resolution inputs by transferring the knowledge from the high-resolution. Guided by this idea, we propose an efficient and effective knowledge distillation framework to achieve multi-level knowledge transfer. Specifically, the visual focus loss is proposed to extract the character position knowledge with resolution gap reduction and character region focus, the semantic contrastive loss is employed to exploit the contextual semantic knowledge with contrastive learning, and the soft logits loss facilitates both local word-level and global sequence-level learning from the soft teacher label. Extensive experiments show that the proposed one-stage pipeline significantly outperforms super-resolution based two-stage frameworks in terms of effectiveness and efficiency, accompanied by favorable robustness. Code is available at https://github.com/csguoh/KD-LTR.

Submitted 4 August, 2023; originally announced August 2023.

Comments: Accepted as the ACM MM 2023 paper

arXiv:2307.09749 [pdf, other]

Towards Robust Scene Text Image Super-resolution via Explicit Location Enhancement

Authors: Hang Guo, Tao Dai, Guanghao Meng, Shu-Tao Xia

Abstract: Scene text image super-resolution (STISR), aiming to improve image quality while boosting downstream scene text recognition accuracy, has recently achieved great success. However, most existing methods treat the foreground (character regions) and background (non-character regions) equally in the forward process, and neglect the disturbance from the complex background, thus limiting the performance. To address these issues, in this paper, we propose a novel method LEMMA that explicitly models character regions to produce high-level text-specific guidance for super-resolution. To model the location of characters effectively, we propose the location enhancement module to extract character region features based on the attention map sequence. Besides, we propose the multi-modal alignment module to perform bidirectional visual-semantic alignment to generate high-quality prior guidance, which is then incorporated into the super-resolution branch in an adaptive manner using the proposed adaptive fusion module. Experiments on TextZoom and four scene text recognition benchmarks demonstrate the superiority of our method over other state-of-the-art methods. Code is available at https://github.com/csguoh/LEMMA.

Submitted 29 July, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

Comments: Accepted as IJCAI2023 paper

arXiv:2307.05642 [pdf, other]

ConFL: Constraint-guided Fuzzing for Machine Learning Framework

Authors: Zhao Liu, Quanchen Zou, Tian Yu, Xuan Wang, Guozhu Meng, Kai Chen, Deyue Zhang

Abstract: As machine learning gains prominence in various sectors of society for automated decision-making, concerns have risen regarding potential vulnerabilities in machine learning (ML) frameworks. Nevertheless, testing these frameworks is a daunting task due to their intricate implementation. Previous research on fuzzing ML frameworks has struggled to effectively extract input constraints and generate valid inputs, leading to extended fuzzing durations for deep execution or revealing the target crash. In this paper, we propose ConFL, a constraint-guided fuzzer for ML frameworks. ConFL automatically extracts constraints from kernel codes without the need for any prior knowledge. Guided by the constraints, ConFL is able to generate valid inputs that can pass the verification and explore deeper paths of kernel codes. In addition, we design a grouping technique to boost the fuzzing efficiency. To demonstrate the effectiveness of ConFL, we evaluated its performance mainly on TensorFlow. We find that ConFL is able to cover more code lines and generate more valid inputs than state-of-the-art (SOTA) fuzzers. More importantly, ConFL found 84 previously unknown vulnerabilities in different versions of TensorFlow, all of which were assigned new CVE IDs, of which 3 were critical-severity and 13 were high-severity. We also extended ConFL to test PyTorch and Paddle, where 7 vulnerabilities have been found to date.

Submitted 11 July, 2023; originally announced July 2023.

Comments: 13 pages, 15 figures

arXiv:2306.14392 [pdf, other]

ContentCTR: Frame-level Live Streaming Click-Through Rate Prediction with Multimodal Transformer

Authors: Jiaxin Deng, Dong Shen, Shiyao Wang, Xiangyu Wu, Fan Yang, Guorui Zhou, Gaofeng Meng

Abstract: In recent years, live streaming platforms have gained immense popularity as they allow users to broadcast their videos and interact in real-time with hosts and peers. Due to the dynamic changes of live content, accurate recommendation models are crucial for enhancing user experience. However, most previous works treat the live stream as a whole item and explore the Click-Through-Rate (CTR) prediction framework at the item level, neglecting the dynamic changes that occur even within the same live room. In this paper, we propose a ContentCTR model that leverages a multimodal transformer for frame-level CTR prediction. First, we present an end-to-end framework that can make full use of multimodal information, including visual frames, audio, and comments, to identify the most attractive live frames. Second, to prevent the model from collapsing into a mediocre solution, a novel pairwise loss function with first-order difference constraints is proposed to utilize the contrastive information existing in the highlight and non-highlight frames. Additionally, we design a temporal text-video alignment module based on Dynamic Time Warping to eliminate noise caused by the ambiguity and non-sequential alignment of visual and textual information. We conduct extensive experiments on both real-world scenarios and public datasets, and our ContentCTR model outperforms traditional recommendation models in capturing real-time content changes. Moreover, we deploy the proposed method on our company platform, and the results of online A/B testing further validate its practical significance.

Submitted 25 June, 2023; originally announced June 2023.

arXiv:2304.09476 [pdf, other]

doi 10.1088/1741-4326/acd1a0

MAS: A versatile Landau-fluid eigenvalue code for plasma stability analysis in general geometry

Authors: Jian Bao, Wenlu Zhang, Ding Li, Zhihong Lin, Ge Dong, Chang Liu, Huasheng Xie, Guo Meng, Junyi Cheng, Chao Dong, Jintao Cao

Abstract: We have developed a new global eigenvalue code, Multiscale Analysis for plasma Stabilities (MAS), for studying plasma problems with wave toroidal mode number n and frequency omega in a broad range of interest in general tokamak geometry, based on a five-field Landau-fluid description of thermal plasmas. Beyond keeping the necessary plasma fluid response, we further retain the important kinetic effects including diamagnetic drift, ion finite Larmor radius, finite parallel electric field, ion and electron Landau resonances in a self-consistent and non-perturbative manner without sacrificing the attractive efficiency in computation. The physical capabilities of the code are evaluated and examined in the aspects of both theory and simulation. In theory, the comprehensive Landau-fluid model implemented in MAS can be reduced to the well-known ideal MHD model, electrostatic ion-fluid model, and drift-kinetic model in various limits, which clearly delineates the physics validity regime. In simulation, MAS has been well benchmarked with theory and other gyrokinetic and kinetic-MHD hybrid codes in a manner of adopting the unified physical and numerical framework, which covers the kinetic Alfven wave, ion sound wave, low-n kink, high-n ion temperature gradient mode and kinetic ballooning mode. Moreover, MAS is successfully applied to model the Alfven eigenmode (AE) activities in DIII-D discharge #159243, which faithfully captures the frequency sweeping of RSAE, the tunneling damping of TAE, as well as the polarization characteristics of KBAE and BAAE being consistent with former gyrokinetic theory and simulation. With respect to the key progress contributed to the community, MAS has the advantage of combining rich physics ingredients, realistic global geometry and high computation efficiency together for plasma stability analysis in linear regime.

Submitted 19 April, 2023; originally announced April 2023.

Comments: 40 pages, 21 figures

arXiv:2303.17007 [pdf]

doi 10.1103/PhysRevD.107.112012

Impact of cross-section uncertainties on supernova neutrino spectral parameter fitting in the Deep Underground Neutrino Experiment

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, Z. Ahmad, J. Ahmed, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, P. Amedo, J. Anderson, D. A. Andrade , et al. (1294 additional authors not shown)

Abstract: A primary goal of the upcoming Deep Underground Neutrino Experiment (DUNE) is to measure the $\mathcal{O}(10)$ MeV neutrinos produced by a Galactic core-collapse supernova if one should occur during the lifetime of the experiment. The liquid-argon-based detectors planned for DUNE are expected to be uniquely sensitive to the $ν_e$ component of the supernova flux, enabling a wide variety of physics and astrophysics measurements. A key requirement for a correct interpretation of these measurements is a good understanding of the energy-dependent total cross section $σ(E_ν)$ for charged-current $ν_e$ absorption on argon. In the context of a simulated extraction of supernova $ν_e$ spectral parameters from a toy analysis, we investigate the impact of $σ(E_ν)$ modeling uncertainties on DUNE's supernova neutrino physics sensitivity for the first time. We find that the currently large theoretical uncertainties on $σ(E_ν)$ must be substantially reduced before the $ν_e$ flux parameters can be extracted reliably: in the absence of external constraints, a measurement of the integrated neutrino luminosity with less than 10% bias with DUNE requires $σ(E_ν)$ to be known to about 5%. The neutrino spectral shape parameters can be known to better than 10% for a 20% uncertainty on the cross-section scale, although they will be sensitive to uncertainties on the shape of $σ(E_ν)$. A direct measurement of low-energy $ν_e$-argon scattering would be invaluable for improving the theoretical precision to the needed level.

Submitted 7 July, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

Comments: 25 pages, 21 figures

Report number: FERMILAB-PUB-23-132-CSAID-LBNF-ND-T

Journal ref: Phys. Rev. D 107, 112012 (2023)

arXiv:2301.09072 [pdf, other]

ContraBERT: Enhancing Code Pre-trained Models via Contrastive Learning

Authors: Shangqing Liu, Bozhi Wu, Xiaofei Xie, Guozhu Meng, Yang Liu

Abstract: Large-scale pre-trained models such as CodeBERT, GraphCodeBERT have earned widespread attention from both academia and industry. Attributed to the superior ability in code representation, they have been further applied in multiple downstream tasks such as clone detection, code search and code translation. However, it is also observed that these state-of-the-art pre-trained models are susceptible to adversarial attacks. The performance of these pre-trained models drops significantly with simple perturbations such as renaming variable names. This weakness may be inherited by their downstream models and thereby amplified at an unprecedented scale. To this end, we propose an approach namely ContraBERT that aims to improve the robustness of pre-trained models via contrastive learning. Specifically, we design nine kinds of simple and complex data augmentation operators on the programming language (PL) and natural language (NL) data to construct different variants. Furthermore, we continue to train the existing pre-trained models by masked language modeling (MLM) and contrastive pre-training task on the original samples with their augmented variants to enhance the robustness of the model. The extensive experiments demonstrate that ContraBERT can effectively improve the robustness of the existing pre-trained models. Further study also confirms that these robustness-enhanced models provide improvements as compared to original models over four popular downstream tasks.

Submitted 22 January, 2023; originally announced January 2023.

arXiv:2301.06657

Free Lunch for Generating Effective Outlier Supervision

Authors: Sen Pei, Jiaxi Sun, Richard Yi Da Xu, Bin Fan, Shiming Xiang, Gaofeng Meng

Abstract: When deployed in practical applications, computer vision systems will encounter numerous unexpected images (i.e., out-of-distribution data). Due to the potentially raised safety risks, these aforementioned unseen data should be carefully identified and handled. Generally, existing approaches in dealing with out-of-distribution (OOD) detection mainly focus on the statistical difference between the features of OOD and in-distribution (ID) data extracted by the classifiers. Although many of these schemes have brought considerable performance improvements, reducing the false positive rate (FPR) when processing open-set images, they necessarily lack reliable theoretical analysis and generalization guarantees. Unlike the observed ways, in this paper, we investigate the OOD detection problem based on the Bayes rule and present a convincing description of the reason for failures encountered by conventional classifiers. Concretely, our analysis reveals that refining the probability distribution yielded by the vanilla neural networks is necessary for OOD detection, alleviating the issues of assigning high confidence to OOD data. To achieve this effortlessly, we propose an ultra-effective method to generate near-realistic outlier supervision. Extensive experiments on large-scale benchmarks reveal that our proposed BayesAug significantly reduces the FPR95 by over 12.50% compared with the previous schemes, boosting the reliability of machine learning systems. The code will be made publicly available.

Submitted 17 January, 2024; v1 submitted 16 January, 2023; originally announced January 2023.

Comments: We have rewritten this paper, and published as "Image Background Serves as Good Proxy for Out-of-distribution Data" arXiv:2307.00519

arXiv:2212.09807 [pdf, other]

Highly-parallelized simulation of a pixelated LArTPC on a GPU

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, Z. Ahmad, J. Ahmed, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, C. Alt, A. Alton, R. Alvarez, P. Amedo, J. Anderson , et al. (1282 additional authors not shown)

Abstract: The rapid development of general-purpose computing on graphics processing units (GPGPU) is allowing the implementation of highly-parallelized Monte Carlo simulation chains for particle physics experiments. This technique is particularly suitable for the simulation of a pixelated charge readout for time projection chambers, given the large number of channels that this technology employs. Here we present the first implementation of a full microphysical simulator of a liquid argon time projection chamber (LArTPC) equipped with light readout and pixelated charge readout, developed for the DUNE Near Detector. The software is implemented with an end-to-end set of GPU-optimized algorithms. The algorithms have been written in Python and translated into CUDA kernels using Numba, a just-in-time compiler for a subset of Python and NumPy instructions. The GPU implementation achieves a speed up of four orders of magnitude compared with the equivalent CPU version. The simulation of the current induced on $10^3$ pixels takes around 1 ms on the GPU, compared with approximately 10 s on the CPU. The results of the simulation are compared against data from a pixel-readout LArTPC prototype.

Submitted 28 February, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

Comments: 26 pages, 15 figures

Report number: FERMILAB-PUB-22-926-LBNF

arXiv:2212.03854 [pdf, other]

BiPMAP: A Toolbox for Predictions of Perceived Motion Artifacts on Modern Displays

Authors: Guanghan Meng, Dekel Galor, Laura Waller, Martin S. Banks

Abstract: Presenting dynamic scenes without incurring motion artifacts visible to observers requires sustained effort from the display industry. A tool that predicts motion artifacts and simulates artifact elimination through optimizing the display configuration is highly desired to guide the design and manufacture of modern displays. Despite the popular demand, there is no such tool available on the market. In this study, we deliver an interactive toolkit, Binocular Perceived Motion Artifact Predictor (BiPMAP), as an executable file with GPU acceleration. BiPMAP accounts for an extensive collection of user-defined parameters and directly visualizes a variety of motion artifacts by presenting the perceived continuous and sampled moving stimuli side-by-side. For accurate artifact predictions, BiPMAP utilizes a novel model of the human contrast sensitivity function to effectively imitate the frequency modulation of the human visual system. In addition, BiPMAP is capable of deriving various in-plane motion artifacts for 2D displays and depth distortion in 3D stereoscopic displays.

Submitted 3 January, 2024; v1 submitted 7 December, 2022; originally announced December 2022.

Comments: 11 pages, 9 figures

arXiv:2211.10624 [pdf, other]

A Unified Model for Video Understanding and Knowledge Embedding with Heterogeneous Knowledge Graph Dataset

Authors: Jiaxin Deng, Dong Shen, Haojie Pan, Xiangyu Wu, Ximan Liu, Gaofeng Meng, Fan Yang, Size Li, Ruiji Fu, Zhongyuan Wang

Abstract: Video understanding is an important task in short video business platforms and it has a wide application in video recommendation and classification. Most of the existing video understanding works only focus on the information that appeared within the video content, including the video frames, audio and text. However, introducing common sense knowledge from the external Knowledge Graph (KG) dataset is essential for video understanding when referring to the content which is less relevant to the video. Owing to the lack of video knowledge graph dataset, the work which integrates video understanding and KG is rare. In this paper, we propose a heterogeneous dataset that contains the multi-modal video entity and fruitful common sense relations. This dataset also provides multiple novel video inference tasks like the Video-Relation-Tag (VRT) and Video-Relation-Video (VRV) tasks. Furthermore, based on this dataset, we propose an end-to-end model that jointly optimizes the video understanding objective with knowledge graph embedding, which can not only better inject factual knowledge into video understanding but also generate effective multi-modal entity embedding for KG. Comprehensive experiments indicate that combining video understanding embedding with factual knowledge benefits the content-based video retrieval performance. Moreover, it also helps the model generate better knowledge graph embedding which outperforms traditional KGE-based methods on VRT and VRV tasks with at least 42.36% and 17.73% improvement in HITS@10.

Submitted 1 April, 2023; v1 submitted 19 November, 2022; originally announced November 2022.

Comments: Accepted by ICMR 2023

arXiv:2211.01166 [pdf, other]

Identification and reconstruction of low-energy electrons in the ProtoDUNE-SP detector

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, Z. Ahmad, J. Ahmed, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, C. Alt, A. Alton, R. Alvarez, P. Amedo, J. Anderson , et al. (1235 additional authors not shown)

Abstract: Measurements of electrons from $ν_e$ interactions are crucial for the Deep Underground Neutrino Experiment (DUNE) neutrino oscillation program, as well as searches for physics beyond the standard model, supernova neutrino detection, and solar neutrino measurements. This article describes the selection and reconstruction of low-energy (Michel) electrons in the ProtoDUNE-SP detector. ProtoDUNE-SP is one of the prototypes for the DUNE far detector, built and operated at CERN as a charged particle test beam experiment. A sample of low-energy electrons produced by the decay of cosmic muons is selected with a purity of 95%. This sample is used to calibrate the low-energy electron energy scale with two techniques. An electron energy calibration based on a cosmic ray muon sample uses calibration constants derived from measured and simulated cosmic ray muon events. Another calibration technique makes use of the theoretically well-understood Michel electron energy spectrum to convert reconstructed charge to electron energy. In addition, the effects of detector response to low-energy electron energy scale and its resolution including readout electronics threshold effects are quantified. Finally, the relation between the theoretical and reconstructed low-energy electron energy spectrum is derived and the energy resolution is characterized. The low-energy electron selection presented here accounts for about 75% of the total electron deposited energy. After the addition of lost energy using a Monte Carlo simulation, the energy resolution improves from about 40% to 25% at 50 MeV. These results are used to validate the expected capabilities of the DUNE far detector to reconstruct low-energy electrons.

Submitted 31 May, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

Comments: 19 pages, 10 figures

Report number: FERMILAB-PUB-22-784, CERN-EP-DRAFT-MISC-2022-008

Journal ref: Phys. Rev. D 107, 092012 (2023)

arXiv:2210.04354 [pdf, other]

Full $f$ and $δf$ gyrokinetic particle simulations of Alfvén waves and energetic particle physics

Authors: Zhixin Lu, Guo Meng, Roman Hatzky, Matthias Hoelzl, Philipp Lauber

Abstract: In this work, we focus on the development of the particle-in-cell scheme and the application to the studies of Alfvén waves and energetic particle physics in tokamak plasmas. The $δf$ and full $f$ schemes are formulated on the same footing adopting mixed variables and the pullback scheme for electromagnetic problems. The TRIMEG-GKX code [Lu et al. J. Comput. Phys. 440 (2021) 110384] has been upgraded using cubic spline finite elements and full $f$ and $δf$ schemes. The EP-driven TAE has been simulated for the ITPA-TAE case featured by a small electron skin depth $\sim 1.18\times10^{-3}\;{\rm m}$, which is a challenging parameter regime of electromagnetic simulations, especially for the full $f$ model. The simulation results using the $δf$ scheme are in good agreement with previous work. Excellent performance of the mixed variable/pullback scheme has been observed for both full $f$ and $δf$ schemes. Simulations with mixed full $f$ EPs and $δf$ electrons and thermal ions demonstrate the good features of this novel scheme in mitigating the noise level. The full $f$ scheme is a natural choice for EP physics studies which allows a large variation of EP profiles and distributions in velocity space, providing a powerful tool for kinetic studies using realistic experimental distributions related to intermittent and transient plasma activities.

Submitted 9 October, 2022; originally announced October 2022.

Comments: 27 pages, 8 figures

arXiv:2210.04170 [pdf]

Multi-Objective Personalized Product Retrieval in Taobao Search

Authors: Yukun Zheng, Jiang Bian, Guanghao Meng, Chao Zhang, Honggang Wang, Zhixuan Zhang, Sen Li, Tao Zhuang, Qingwen Liu, Xiaoyi Zeng

Abstract: In large-scale e-commerce platforms like Taobao, it is a big challenge to retrieve products that satisfy users from billions of candidates. This has been a common concern of academia and industry. Recently, plenty of works in this domain have achieved significant improvements by enhancing embedding-based retrieval (EBR) methods, including the Multi-Grained Deep Semantic Product Retrieval (MGDSPR) model [16] in Taobao search engine. However, we find that MGDSPR still has problems of poor relevance and weak personalization compared to other retrieval methods in our online system, such as lexical matching and collaborative filtering. These problems prompt us to further strengthen the capabilities of our EBR model in both relevance estimation and personalized retrieval. In this paper, we propose a novel Multi-Objective Personalized Product Retrieval (MOPPR) model with four hierarchical optimization objectives: relevance, exposure, click and purchase. We construct entire-space multi-positive samples to train MOPPR, rather than the single-positive samples for existing EBR models. We adopt a modified softmax loss for optimizing multiple objectives. Results of extensive offline and online experiments show that MOPPR outperforms the baseline MGDSPR on evaluation metrics of relevance estimation and personalized retrieval. MOPPR achieves 0.96% transaction and 1.29% GMV improvements in a 28-day online A/B test. Since the Double-11 shopping festival of 2021, MOPPR has been fully deployed in mobile Taobao search, replacing the previous MGDSPR. Finally, we discuss several advanced topics of our deeper explorations on multi-objective retrieval and ranking to contribute to the community.

Submitted 9 October, 2022; originally announced October 2022.

Comments: 9 pages, 4 figures, submitted to the 28th ACM SIGKDD Conference on Knowledge Discovery & Data Mining

arXiv:2210.03986 [pdf, other]

TransRepair: Context-aware Program Repair for Compilation Errors

Authors: Xueyang Li, Shangqing Liu, Ruitao Feng, Guozhu Meng, Xiaofei Xie, Kai Chen, Yang Liu

Abstract: Automatically fixing compilation errors can greatly raise the productivity of software development, by guiding the novice or AI programmers to write and debug code. Recently, learning-based program repair has gained extensive attention and become the state-of-the-art in practice. But it still leaves plenty of space for improvement. In this paper, we propose an end-to-end solution TransRepair to locate the error lines and create the correct substitute for a C program simultaneously. Superior to the counterpart, our approach takes into account the context of erroneous code and diagnostic compilation feedback. Then we devise a Transformer-based neural network to learn the ways of repair from the erroneous code as well as its context and the diagnostic feedback. To increase the effectiveness of TransRepair, we summarize 5 types and 74 fine-grained sub-types of compilation errors from two real-world program datasets and the Internet. Then a program corruption technique is developed to synthesize a large dataset with 1,821,275 erroneous C programs. Through the extensive experiments, we demonstrate that TransRepair outperforms the state-of-the-art in both single repair accuracy and full repair accuracy. Further analysis sheds light on the strengths and weaknesses in the contemporary solutions for future improvement.

Submitted 8 October, 2022; originally announced October 2022.

Comments: 11 pages, accepted to ASE '22

arXiv:2209.09577 [pdf, other]

doi 10.1145/3548606.3559388

Understanding Real-world Threats to Deep Learning Models in Android Apps

Authors: Zizhuang Deng, Kai Chen, Guozhu Meng, Xiaodong Zhang, Ke Xu, Yao Cheng

Abstract: Famous for its superior performance, deep learning (DL) has been popularly used within many applications, which also at the same time attracts various threats to the models. One primary threat is from adversarial attacks. Researchers have intensively studied this threat for several years and proposed dozens of approaches to create adversarial examples (AEs). But most of the approaches are only evaluated on limited models and datasets (e.g., MNIST, CIFAR-10). Thus, the effectiveness of attacking real-world DL models is not quite clear. In this paper, we perform the first systematic study of adversarial attacks on real-world DNN models and provide a real-world model dataset named RWM. Particularly, we design a suite of approaches to adapt current AE generation algorithms to the diverse real-world DL models, including automatically extracting DL models from Android apps, capturing the inputs and outputs of the DL models in apps, generating AEs and validating them by observing the apps' execution. For black-box DL models, we design a semantic-based approach to build suitable datasets and use them for training substitute models when performing transfer-based attacks. After analyzing 245 DL models collected from 62,583 real-world apps, we have a unique opportunity to understand the gap between real-world DL models and contemporary AE generation algorithms. To our surprise, the current AE generation algorithms can only directly attack 6.53% of the models. Benefiting from our approach, the success rate upgrades to 47.35%.

Submitted 28 September, 2022; v1 submitted 20 September, 2022; originally announced September 2022.

Comments: 18 pages, 9 figures, accepted by CCS'22

arXiv:2209.03563 [pdf, other]

SSL-WM: A Black-Box Watermarking Approach for Encoders Pre-trained by Self-supervised Learning

Authors: Peizhuo Lv, Pan Li, Shenchen Zhu, Shengzhi Zhang, Kai Chen, Ruigang Liang, Chang Yue, Fan Xiang, Yuling Cai, Hualong Ma, Yingjun Zhang, Guozhu Meng

Abstract: Recent years have witnessed tremendous success in Self-Supervised Learning (SSL), which has been widely utilized to facilitate various downstream tasks in Computer Vision (CV) and Natural Language Processing (NLP) domains. However, attackers may steal such SSL models and commercialize them for profit, making it crucial to verify the ownership of the SSL models. Most existing ownership protection solutions (e.g., backdoor-based watermarks) are designed for supervised learning models and cannot be used directly since they require that the models' downstream tasks and target labels be known and available during watermark embedding, which is not always possible in the domain of SSL. To address such a problem, especially when downstream tasks are diverse and unknown during watermark embedding, we propose a novel black-box watermarking solution, named SSL-WM, for verifying the ownership of SSL models. SSL-WM maps watermarked inputs of the protected encoders into an invariant representation space, which causes any downstream classifier to produce expected behavior, thus allowing the detection of embedded watermarks. We evaluate SSL-WM on numerous tasks, such as CV and NLP, using different SSL models both contrastive-based and generative-based. Experimental results demonstrate that SSL-WM can effectively verify the ownership of stolen SSL models in various downstream tasks. Furthermore, SSL-WM is robust against model fine-tuning, pruning, and input preprocessing attacks. Lastly, SSL-WM can also evade detection from evaluated watermark detection approaches, demonstrating its promising application in protecting the ownership of SSL models.

Submitted 29 January, 2024; v1 submitted 8 September, 2022; originally announced September 2022.

Comments: To Appear in the Network and Distributed System Security (NDSS) Symposium 2024, 26 February - 1 March 2024, San Diego, CA, USA

arXiv:2208.02816 [pdf, other]

Expanding Language-Image Pretrained Models for General Video Recognition

Authors: Bolin Ni, Houwen Peng, Minghao Chen, Songyang Zhang, Gaofeng Meng, Jianlong Fu, Shiming Xiang, Haibin Ling

Abstract: Contrastive language-image pretraining has shown great success in learning visual-textual joint representation from web-scale data, demonstrating remarkable "zero-shot" generalization ability for various image tasks. However, how to effectively expand such new language-image pretraining methods to video domains is still an open problem. In this work, we present a simple yet effective approach that adapts the pretrained language-image models to video recognition directly, instead of pretraining a new model from scratch. More concretely, to capture the long-range dependencies of frames along the temporal dimension, we propose a cross-frame attention mechanism that explicitly exchanges information across frames. Such a module is lightweight and can be plugged into pretrained language-image models seamlessly. Moreover, we propose a video-specific prompting scheme, which leverages video content information for generating discriminative textual prompts. Extensive experiments demonstrate that our approach is effective and can be generalized to different video recognition scenarios. In particular, under fully-supervised settings, our approach achieves a top-1 accuracy of 87.1% on Kinetics-400, while using 12 times fewer FLOPs compared with Swin-L and ViViT-H. In zero-shot experiments, our approach surpasses the current state-of-the-art methods by +7.6% and +14.9% in terms of top-1 accuracy under two popular protocols. In few-shot scenarios, our approach outperforms previous best methods by +32.1% and +23.1% when the labeled data is extremely limited. Code and models are available at https://aka.ms/X-CLIP

Submitted 4 August, 2022; originally announced August 2022.

Comments: Accepted by ECCV2022, Oral

arXiv:2207.14381 [pdf, other]

Pro-tuning: Unified Prompt Tuning for Vision Tasks

Authors: Xing Nie, Bolin Ni, Jianlong Chang, Gaomeng Meng, Chunlei Huo, Zhaoxiang Zhang, Shiming Xiang, Qi Tian, Chunhong Pan

Abstract: In computer vision, fine-tuning is the de-facto approach to leverage pre-trained vision models to perform downstream tasks. However, deploying it in practice is quite challenging, due to adopting parameter inefficient global update and heavily relying on high-quality downstream data. Recently, prompt-based learning, which adds a task-relevant prompt to adapt the downstream tasks to pre-trained models, has drastically boosted the performance of many natural language downstream tasks. In this work, we extend this notable transfer ability benefited from prompt into vision models as an alternative to fine-tuning. To this end, we propose parameter-efficient Prompt tuning (Pro-tuning) to adapt frozen vision models to various downstream vision tasks. The key to Pro-tuning is prompt-based tuning, i.e., learning task-specific vision prompts for downstream input images with the pre-trained model frozen. By only training a few additional parameters, it can work on diverse CNN-based and Transformer-based architectures. Extensive experiments evidence that Pro-tuning outperforms fine-tuning in a broad range of vision tasks and scenarios, including image classification (generic objects, class imbalance, image corruption, adversarial robustness, and out-of-distribution generalization), and dense prediction tasks such as object detection and semantic segmentation.

Submitted 22 August, 2022; v1 submitted 28 July, 2022; originally announced July 2022.

arXiv:2207.12194 [pdf, other]

Domain Decorrelation with Potential Energy Ranking

Authors: Sen Pei, Jiaxi Sun, Richard Yi Da Xu, Shiming Xiang, Gaofeng Meng

Abstract: Machine learning systems, especially the methods based on deep learning, enjoy great success in modern computer vision tasks under experimental settings. Generally, these classic deep learning methods are built on the i.i.d. assumption, supposing the training and test data are drawn from a similar distribution independently and identically. However, the aforementioned i.i.d. assumption is in general unavailable in the real-world scenario, and as a result, leads to sharp performance decay of deep learning algorithms. Behind this, domain shift is one of the primary factors to be blamed. In order to tackle this problem, we propose using Potential Energy Ranking (PoER) to decouple the object feature and the domain feature (i.e., appearance feature) in given images, promoting the learning of label-discriminative features while filtering out the irrelevant correlations between the objects and the background. PoER helps the neural networks to capture label-related features which contain the domain information first in shallow layers and then distills the label-discriminative representations out progressively, enforcing the neural networks to be aware of the characteristic of objects and background which is vital to the generation of domain-invariant features. PoER reports superior performance on domain generalization benchmarks, improving the average top-1 accuracy by at least 1.20% compared to the existing methods. Moreover, we use PoER in the ECCV 2022 NICO Challenge (https://nicochallenge.com), achieving top place with only a vanilla ResNet-18. The code has been made available at https://github.com/ForeverPs/PoER.

Submitted 16 December, 2022; v1 submitted 25 July, 2022; originally announced July 2022.

Comments: 2022 ECCV jury award, accepted by AAAI 2023

Journal ref: AAAI 2023 Oral

arXiv:2206.14521 [pdf, other]

doi 10.1140/epjc/s10052-023-11733-2

Reconstruction of interactions in the ProtoDUNE-SP detector with Pandora

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, Z. Ahmad, J. Ahmed, B. Aimard, F. Akbar, B. Ali-Mohammadzadeh, K. Allison, S. Alonso Monsalve, M. AlRashed, C. Alt, A. Alton, R. Alvarez, P. Amedo, et al. (1203 additional authors not shown)

Abstract: The Pandora Software Development Kit and algorithm libraries provide pattern-recognition logic essential to the reconstruction of particle interactions in liquid argon time projection chamber detectors. Pandora is the primary event reconstruction software used at ProtoDUNE-SP, a prototype for the Deep Underground Neutrino Experiment far detector. ProtoDUNE-SP, located at CERN, is exposed to a charged-particle test beam. This paper gives an overview of the Pandora reconstruction algorithms and how they have been tailored for use at ProtoDUNE-SP. In complex events with numerous cosmic-ray and beam background particles, the simulated reconstruction and identification efficiency for triggered test-beam particles is above 80% for the majority of particle type and beam momentum combinations. Specifically, simulated 1 GeV/$c$ charged pions and protons are correctly reconstructed and identified with efficiencies of 86.1$\pm0.6$% and 84.1$\pm0.6$%, respectively. The efficiencies measured for test-beam data are shown to be within 5% of those predicted by the simulation.

Submitted 17 July, 2023; v1 submitted 29 June, 2022; originally announced June 2022.

Comments: 39 pages, 20 figures. Accepted version. Published version available in Eur. Phys. J. C 83, 618 (2023) https://doi.org/10.1140/epjc/s10052-023-11733-2

Report number: FERMILAB-PUB-22-488-AD-ESH-LBNF-ND-SCD, CERN-EP-DRAFT-MISC-2022-007

Journal ref: Eur. Phys. J. C 83, 618 (2023)

arXiv:2205.10617 [pdf, other]

Gradient Concealment: Free Lunch for Defending Adversarial Attacks

Authors: Sen Pei, Jiaxi Sun, Xiaopeng Zhang, Gaofeng Meng

Abstract: Recent studies show that the deep neural networks (DNNs) have achieved great success in various tasks. However, even the state-of-the-art deep learning based classifiers are extremely vulnerable to adversarial examples, resulting in sharp decay of discrimination accuracy in the presence of enormous unknown attacks. Given the fact that neural networks are widely used in the open world scenario which can be safety-critical situations, mitigating the adversarial effects of deep learning methods has become an urgent need. Generally, conventional DNNs can be attacked with a dramatically high success rate since their gradient is exposed thoroughly in the white-box scenario, making it effortless to ruin a well trained classifier with only imperceptible perturbations in the raw data space. For tackling this problem, we propose a plug-and-play layer that is training-free, termed as Gradient Concealment Module (GCM), concealing the vulnerable direction of gradient while guaranteeing the classification accuracy during the inference time. GCM reports superior defense results on the ImageNet classification benchmark, improving up to 63.41% top-1 attack robustness (AR) when faced with adversarial inputs compared to the vanilla DNNs. Moreover, we use GCM in the CVPR 2022 Robust Classification Challenge, currently achieving 2nd place in Phase II with only a tiny version of ConvNext. The code will be made available.

Submitted 21 May, 2022; originally announced May 2022.

Showing 1–50 of 190 results for author: Meng, G