arXiv:2407.09315 [pdf, other]

RBMD: A molecular dynamics package enabling to simulate 10 million all-atom particles in a single graphics processing unit

Authors: Weihang Gao, Teng Zhao, Yongfa Guo, Jiuyang Liang, Huan Liu, Maoying Luo, Zedong Luo, Wei Qin, Yichao Wang, Qi Zhou, Shi Jin, Zhenli Xu

Abstract: This paper introduces a random-batch molecular dynamics (RBMD) package for fast simulations of particle systems at the nano/micro scale. Different from existing packages, the RBMD uses random batch methods for nonbonded interactions of particle systems. The long-range part of Coulomb interactions is calculated in Fourier space by the random batch Ewald algorithm, which achieves linear complexity and superscalability, surpassing classical lattice-based Ewald methods. For the short-range part, the random batch list algorithm is used to construct neighbor lists, significantly reducing both computational and memory costs. The RBMD is implemented on GPU-CPU heterogeneous architectures, with classical force fields for all-atom systems. Benchmark systems are used to validate accuracy and performance of the package. Comparison with the particle-particle particle-mesh method and the Verlet list method in the LAMMPS package is performed on three different NVIDIA GPUs, demonstrating high efficiency of the RBMD on heterogeneous architectures. Our results also show that the RBMD enables simulations on a single GPU with a CPU core up to 10 million particles. Typically, for systems of one million particles, the RBMD allows simulating all-atom systems with a high efficiency of 8.20 ms per step, demonstrating the attractive feature for running large-scale simulations of practical applications on a desktop machine.

Submitted 12 July, 2024; originally announced July 2024.

Comments: 26 pages, 8 figures
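
The random-batch idea named in the RBMD abstract above can be illustrated with a generic toy example. This is a minimal sketch, not the package's implementation: the function name `random_batch_sum`, the uniform sampling without replacement, and the synthetic `terms` list are all assumptions made for illustration (the abstract does not specify how the random batch Ewald algorithm draws its samples). It only shows the basic principle that a large sum can be replaced by a rescaled sum over a small random batch whose expectation equals the full sum.

```python
import math
import random


def random_batch_sum(terms, batch_size, rng):
    """Estimate sum(terms) from a uniformly drawn batch (hypothetical helper).

    Each index is included with probability batch_size / n, so rescaling the
    batch sum by n / batch_size gives an unbiased estimate of the full sum.
    """
    n = len(terms)
    batch = rng.sample(range(n), batch_size)  # indices drawn without replacement
    return (n / batch_size) * sum(terms[i] for i in batch)


rng = random.Random(0)  # fixed seed so the run is reproducible

# Synthetic stand-in for the many per-mode contributions of a long-range sum.
terms = [1.0 + 0.5 * math.sin(k) for k in range(1000)]
exact = sum(terms)

# Each estimate touches only 100 of the 1000 terms.
estimates = [random_batch_sum(terms, 100, rng) for _ in range(2000)]
mean_estimate = sum(estimates) / len(estimates)

print(f"exact sum:             {exact:.3f}")
print(f"mean of batch results: {mean_estimate:.3f}")
```

A single batch estimate is noisy; in a time-stepping simulation the noise is averaged over many steps, which is the usual argument for why such estimators can be useful at a fraction of the cost of the full sum.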

arXiv:2407.09278 [pdf, ps, other]

Exact local distribution of the absolutely continuous spectral measure

Authors: Xianzhe Li, Jiangong You, Qi Zhou

Abstract: It is well-established that the spectral measure for one-frequency Schrödinger operators with Diophantine frequencies exhibits optimal $1/2$-Hölder continuity within the absolutely continuous spectrum. This study extends these findings by precisely characterizing the local distribution of the spectral measure for dense small potentials, including a notable result for any subcritical almost Mathieu operators. Additionally, we investigate the stratified Hölder continuity of the spectral measure at subcritical energies.

Submitted 12 July, 2024; originally announced July 2024.

Comments: 49 pages

arXiv:2407.09139 [pdf, other]

Measurement of $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, A. Aloisio, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer , et al. (414 additional authors not shown)

Abstract: We report measurements of time-dependent $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays based on a data sample of $(388\pm6)\times10^6$ $B\bar{B}$ events collected at the $Υ(4S)$ resonance with the Belle II detector. The Belle II experiment operates at the SuperKEKB asymmetric-energy $e^+e^-$ collider. We measure decay-time distributions to determine $CP$-violating parameters $S$ and $C$. We determine these parameters for two ranges of $K^0_S π^0$ invariant mass: $m(K^0_S π^0)\in (0.8, 1.0)$ $GeV/c^2$, which is dominated by $B^0 \to K^{*0} (\to K^0_S π^0) γ$ decays, and a complementary region $m(K^0_S π^0)\in (0.6, 0.8)\cup(1.0, 1.8)$ $GeV/c^2$. Our results have improved precision as compared to previous measurements and are consistent with theory predictions.

Submitted 12 July, 2024; originally announced July 2024.

Comments: 10 pages, 4 figures

Report number: Belle II Preprint 2024-009, KEK Preprint 2024-1

arXiv:2407.08984 [pdf, ps, other]

Measurement of branching fractions, CP asymmetry, and isospin asymmetry for $\boldsymbol{B\rightarrowργ}$ decays using Belle and Belle II data

Authors: Belle II Collaboration, I. Adachi, K. Adamczyk, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer , et al. (385 additional authors not shown)

Abstract: We present measurements of $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays using a combined data sample of $772 \times 10^6$ $B\overline{B}$ pairs collected by the Belle experiment and $387\times 10^6$ $B\overline{B}$ pairs collected by the Belle II experiment in $e^{+}e^{-}$ collisions at the $Υ(4S)$ resonance. After an optimized selection, a simultaneous fit to the Belle and Belle II data sets yields $114\pm 12$ $B^{+}\rightarrowρ^{+}γ$ and $99\pm 12$ $B^{0}\rightarrowρ^{0}γ$ decays. The measured branching fractions are $(13.1^{+2.0 +1.3}_{-1.9 -1.2})\times 10^{-7}$ and $(7.5\pm 1.3^{+1.0}_{-0.8})\times 10^{-7}$ for $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays, respectively, where the first uncertainty is statistical and the second is systematic. We also measure the isospin asymmetry $A_{\rm I}(B\rightarrowργ)=(10.9^{+11.2 +7.8}_{-11.7 -7.3})\%$ and the direct CP asymmetry $A_{CP}(B^{+}\rightarrowρ^{+}γ)=(-8.2\pm 15.2^{+1.6}_{-1.2})\%$.

Submitted 12 July, 2024; originally announced July 2024.

Comments: 12 pages, 4 figures

Report number: Belle II Preprint 2023-019; KEK Preprint 2023-37

arXiv:2407.08801 [pdf, other]

DG-PIC: Domain Generalized Point-In-Context Learning for Point Cloud Understanding

Authors: Jincen Jiang, Qianyu Zhou, Yuhang Li, Xuequan Lu, Meili Wang, Lizhuang Ma, Jian Chang, Jian Jun Zhang

Abstract: Recent point cloud understanding research suffers from performance drops on unseen data, due to the distribution shifts across different domains. While recent studies use Domain Generalization (DG) techniques to mitigate this by learning domain-invariant features, most are designed for a single task and neglect the potential of testing data. Despite In-Context Learning (ICL) showcasing multi-task learning capability, it usually relies on high-quality context-rich data and considers a single dataset, and has rarely been studied in point cloud understanding. In this paper, we introduce a novel, practical, multi-domain multi-task setting, handling multiple domains and multiple tasks within one unified model for domain generalized point cloud understanding. To this end, we propose Domain Generalized Point-In-Context Learning (DG-PIC) that boosts the generalizability across various tasks and domains at testing time. In particular, we develop dual-level source prototype estimation that considers both global-level shape contextual and local-level geometrical structures for representing source domains and a dual-level test-time feature shifting mechanism that leverages both macro-level domain semantic information and micro-level patch positional relationships to pull the target data closer to the source ones during the testing. Our DG-PIC does not require any model updates during the testing and can handle unseen domains and multiple tasks, i.e., point cloud reconstruction, denoising, and registration, within one unified model. We also introduce a benchmark for this new setting. Comprehensive experiments demonstrate that DG-PIC outperforms state-of-the-art techniques significantly.

Submitted 11 July, 2024; originally announced July 2024.

Comments: Accepted to ECCV 2024

arXiv:2407.08225 [pdf, other]

Cosmic dawn constraints on freeze-in dark matter from Lyman-alpha forest and 21-cm signal : single-field models

Authors: Zixuan Xu, Quan Zhou, Sibo Zheng

Abstract: We propose cosmological observations of Lyman-alpha and 21-cm signal to set stringent constraints on freeze-in dark matter (FIDM). Explicitly we consider Higgs (neutrino)-portal FIDM in the single-field context, which injects energy into the intergalactic medium via its annihilation (decay). With respect to Lyman-alpha the baseline ionization history is inferred from low redshift data about astrophysical reionization, whereas for 21-cm signal the baseline values of 21-cm power spectrum are obtained through a standard modeling of star formation developed. We use numerical tools to derive the FIDM induced deviations from these baseline values in high redshift region. Our results show that (i) current Lyman-alpha data has already constrained the neutrino portal FIDM mass to be less than 1.06 MeV, (ii) future Lyman-alpha data about the intergalactic medium temperature with a 10(100)% precision at z~9-15 is sufficient to exclude the Higgs (neutrino)-portal FIDM, and (iii) future SKA sensitivity (1000 hrs) on the 21-cm power spectrum for reference wavenumber k*=0.2h /Mpc at z~15-16 is also able to exclude the surviving neutrino-portal FIDM mass window.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 17 pages, 5 figures

arXiv:2407.08019 [pdf, other]

Coherent and Multi-modality Image Inpainting via Latent Space Optimization

Authors: Lingzhi Pan, Tong Zhang, Bingyuan Chen, Qi Zhou, Wei Ke, Sabine Süsstrunk, Mathieu Salzmann

Abstract: With the advancements in denoising diffusion probabilistic models (DDPMs), image inpainting has significantly evolved from merely filling information based on nearby regions to generating content conditioned on various prompts such as text, exemplar images, and sketches. However, existing methods, such as model fine-tuning and simple concatenation of latent vectors, often result in generation failures due to overfitting and inconsistency between the inpainted region and the background. In this paper, we argue that the current large diffusion models are sufficiently powerful to generate realistic images without further tuning. Hence, we introduce PILOT (inPainting vIa Latent OpTimization), an optimization approach grounded on a novel semantic centralization and background preservation loss. Our method searches latent spaces capable of generating inpainted regions that exhibit high fidelity to user-provided prompts while maintaining coherence with the background. Furthermore, we propose a strategy to balance optimization expense and image quality, significantly enhancing generation efficiency. Our method seamlessly integrates with any pre-trained model, including ControlNet and DreamBooth, making it suitable for deployment in multi-modal editing tools. Our qualitative and quantitative evaluations demonstrate that PILOT outperforms existing approaches by generating more coherent, diverse, and faithful inpainted regions in response to provided prompts.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07325 [pdf, other]

HiLight: Technical Report on the Motern AI Video Language Model

Authors: Zhiting Wang, Qiangong Zhou, Kangjie Yang, Zongyang Liu, Xin Mao

Abstract: This technical report presents the implementation of a state-of-the-art video encoder for video-text modal alignment and a video conversation framework called HiLight, which features dual visual towers. The work is divided into two main parts: (1) alignment of video and text modalities; (2) a convenient and efficient way to interact with users. Our goal is to address the task of video comprehension in the context of billiards. The report includes a discussion of the concepts and the final solution developed during the task's implementation.

Submitted 11 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.07016 [pdf]

Is Large Language Model All You Need to Predict the Synthesizability and Precursors of Crystal Structures?

Authors: Zhilong Song, Shuaihua Lu, Minggang Ju, Qionghua Zhou, Jinlan Wang

Abstract: Assessing the synthesizability of crystal structures is pivotal for advancing the practical application of theoretical material structures designed by machine learning or high-throughput screening. However, a significant gap exists between the actual synthesizability and thermodynamic or kinetic stability, which is commonly used for screening theoretical structures for experiments. To address this, we develop the Crystal Synthesis Large Language Models (CSLLM) framework, which includes three LLMs for predicting the synthesizability, synthesis methods, and precursors. We create a comprehensive synthesizability dataset including 140,120 crystal structures and develop an efficient text representation method for crystal structures to fine-tune the LLMs. The Synthesizability LLM achieves a remarkable 98.6% accuracy, significantly outperforming traditional synthesizability screening based on thermodynamic and kinetic stability by 106.1% and 44.5%, respectively. The Methods LLM achieves a classification accuracy of 91.02%, and the Precursors LLM has an 80.2% success rate in predicting synthesis precursors. Furthermore, we develop a user-friendly graphical interface that enables automatic predictions of synthesizability and precursors from uploaded crystal structure files. Through these contributions, CSLLM bridges the gap between theoretical material design and experimental synthesis, paving the way for the rapid discovery of novel and synthesizable functional materials.

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.06489 [pdf]

T2MAT (text-to-materials): A universal framework for generating material structures with goal properties from a single sentence

Authors: Zhilong Song, Shuaihua Lu, Qionghua Zhou, Jinlan Wang

Abstract: Artificial Intelligence-Generated Content (AIGC), content autonomously produced by AI systems without human intervention, has significantly boosted efficiency across various fields. However, the AIGC in material science faces challenges in the ability to efficiently discover innovative materials that surpass existing databases, alongside the invariances and stability considerations of crystal structures. To address these challenges, we develop T2MAT (Text-to-Material), a comprehensive framework processing from a user-input sentence to inverse design material structures with goal properties beyond the existing database via globally exploring chemical space, followed by an entirely automated workflow of first-principles validation. Furthermore, we propose CGTNet (Crystal Graph Transformer NETwork), a novel graph neural network model that captures long-term interactions, to enhance the accuracy and data efficiency of property prediction and thereby improve the reliability of inverse design. Through these contributions, T2MAT minimizes the dependency on human expertise and significantly enhances the efficiency of designing novel, high-performance functional materials, thereby actualizing AIGC in the materials design domain.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.05490 [pdf, ps, other]

Structured quantitative almost reducibility and its applications

Authors: Lingrui Ge, Jiangong You, Qi Zhou

Abstract: We establish structured quantitative almost reducibility, tailored for analytic quasiperiodic $SL(2,\mathbb{R})$-cocycles, which effectively addresses the challenge of infinitely many normal frequency resonances. This method paves the way for optimal arithmetic reducibility results for such cocycles, thereby resolving Jitomirskaya's conjecture. From a spectral perspective, it leads to optimal arithmetic Anderson localization for a class of quasiperiodic long-range operators on higher-dimensional lattices. In particular, using structured quantitative almost reducibility, we establish a sharp quantitative version of Aubry duality, enabling us to uncover new spectral insights for almost Mathieu operators with Diophantine frequencies. For example, we precisely determine the exponential decay rate of spectral gaps in non-critical cases, thus addressing a question raised by Goldstein. Additionally, we reveal the optimal asymptotic growth of extended eigenfunctions for subcritical almost Mathieu operators.

Submitted 7 July, 2024; originally announced July 2024.

Comments: 55 pages

arXiv:2407.05396 [pdf, other]

Evolutionary Trigger Detection and Lightweight Model Repair Based Backdoor Defense

Authors: Qi Zhou, Zipeng Ye, Yubo Tang, Wenjian Luo, Yuhui Shi, Yan Jia

Abstract: Deep Neural Networks (DNNs) have been widely used in many areas such as autonomous driving and face recognition. However, DNN models are vulnerable to backdoor attacks. A backdoor in a DNN model can be activated by a poisoned input carrying a trigger and leads to wrong predictions, which causes serious security issues in applications. It is challenging for current defenses to eliminate the backdoor effectively with limited computing resources, especially when the sizes and numbers of the triggers are variable, as in the physical world. We propose an efficient backdoor defense based on evolutionary trigger detection and lightweight model repair. In the first phase of our method, the CAM-focus Evolutionary Trigger Filter (CETF) is proposed for trigger detection. CETF is an effective sample-preprocessing method based on an evolutionary algorithm, and our experimental results show that CETF not only accurately distinguishes images with triggers from clean images, but can also be widely used in practice because of its simplicity and stability across different backdoor attack situations. In the second phase of our method, we leverage several lightweight unlearning methods with the trigger detected by CETF for model repair, which also constructively demonstrates the underlying correlation of the backdoor with Batch Normalization layers. Source code will be published after acceptance.

Submitted 14 July, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

Comments: 13 pages, 9 figures

arXiv:2407.05285 [pdf, other]

Gradient Diffusion: A Perturbation-Resilient Gradient Leakage Attack

Authors: Xuan Liu, Siqi Cai, Qihua Zhou, Song Guo, Ruibin Li, Kaiwei Lin

Abstract: Recent years have witnessed the vulnerability of Federated Learning (FL) against gradient leakage attacks, where the private training data can be recovered from the exchanged gradients, making gradient protection a critical issue for the FL training process. Existing solutions often resort to perturbation-based mechanisms, such as differential privacy, where each participating client injects a specific amount of noise into local gradients before aggregating to the server, and the global distribution variation finally conceals the gradient privacy. However, perturbation is not always the panacea for gradient protection since the robustness heavily relies on the injected noise. This intuition raises an interesting question: is it possible to deactivate existing protection mechanisms by removing the perturbation inside the gradients? In this paper, we present the answer: yes, and propose the Perturbation-resilient Gradient Leakage Attack (PGLA), the first attempt to recover the perturbed gradients, without additional access to the original model structure or third-party data. Specifically, we leverage the inherent diffusion property of gradient perturbation protection and construct a novel diffusion-based denoising model to implement PGLA. Our insight is that capturing the disturbance level of perturbation during the diffusion reverse process can release the gradient denoising capability, which promotes the diffusion model to generate approximate gradients as the original clean version through adaptive sampling steps. Extensive experiments demonstrate that PGLA effectively recovers the protected gradients and exposes the FL training process to the threat of gradient leakage, achieving the best quality in gradient denoising and data recovery compared to existing models. We hope to arouse public attention on PGLA and its defense.

Submitted 7 July, 2024; originally announced July 2024.

arXiv:2407.05117 [pdf, ps, other]

Search for the baryon number and lepton number violating decays $τ^-\to Λπ^-$ and $τ^-\to \barΛπ^-$ at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien , et al. (349 additional authors not shown)

Abstract: We present a search for the baryon number $B$ and lepton number $L$ violating decays $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛ π^-$ produced from the $e^+e^-\to τ^+τ^-$ process, using a 364 fb$^{-1}$ data sample collected by the Belle II experiment at the SuperKEKB collider. No evidence of signal is found in either decay mode, which have $|Δ(B-L)|$ equal to $2$ and $0$, respectively. Upper limits at 90% credibility level on the branching fractions of $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛπ^-$ are determined to be $4.7 \times 10^{-8}$ and $4.3 \times 10^{-8}$, respectively.

Submitted 6 July, 2024; originally announced July 2024.

Comments: 8 pages, 4 figures

Report number: Belle II Preprint 2024-020; KEK Preprint 2024-17

arXiv:2407.04057 [pdf, other]

TALENT: A Tabular Analytics and Learning Toolbox

Authors: Si-Yang Liu, Hao-Run Cai, Qi-Le Zhou, Han-Jia Ye

Abstract: Tabular data is one of the most common data sources in machine learning. Although a wide range of classical methods demonstrate practical utilities in this field, deep learning methods on tabular data are becoming promising alternatives due to their flexibility and ability to capture complex interactions within the data. Considering that deep tabular methods have diverse design philosophies, including the ways they handle features, design learning objectives, and construct model architectures, we introduce a versatile deep-learning toolbox called TALENT (Tabular Analytics and LEarNing Toolbox) to utilize, analyze, and compare tabular methods. TALENT encompasses an extensive collection of more than 20 deep tabular prediction methods, associated with various encoding and normalization modules, and provides a unified interface that is easily integrable with new methods as they emerge. In this paper, we present the design and functionality of the toolbox, illustrate its practical application through several case studies, and investigate the performance of various methods fairly based on our toolbox. Code is available at https://github.com/qile2000/LAMDA-TALENT.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2407.00965 [pdf, other]

Measurement of the integrated luminosity of data samples collected during 2019-2022 by the Belle II experiment

Authors: The Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, J. K. Ahn, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, M. Barrett, J. Baudot, A. Baur, A. Beaubien , et al. (382 additional authors not shown)

Abstract: A series of data samples was collected with the Belle II detector at the SuperKEKB collider from March 2019 to June 2022. We determine the integrated luminosities of these data samples using three distinct methodologies involving Bhabha ($e^+e^- \to e^+e^-(nγ)$), digamma ($e^+e^- \to γγ(nγ)$), and dimuon ($e^+e^- \to μ^+ μ^- (nγ)$) events. The total integrated luminosity obtained with Bhabha, digamma, and dimuon events is (426.52 $\pm$ 0.03 $\pm$ 2.48) fb$^{-1}$, (427.32 $\pm$ 0.03 $\pm$ 2.56) fb$^{-1}$, and (424.84 $\pm$ 0.04 $\pm$ 3.88) fb$^{-1}$, where the first uncertainties are statistical and the second are systematic. The resulting total integrated luminosity obtained from the combination of the three methods is (426.88 $\pm$ 1.93) fb$^{-1}$.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 12 pages, 3 figures

Report number: Belle II Preprint 2024-019; KEK Preprint 2024-16

arXiv:2407.00956 [pdf, other]

A Closer Look at Deep Learning on Tabular Data

Authors: Han-Jia Ye, Si-Yang Liu, Hao-Run Cai, Qi-Le Zhou, De-Chuan Zhan

Abstract: Tabular data is prevalent across various domains in machine learning. Although Deep Neural Network (DNN)-based methods have shown promising performance comparable to tree-based ones, in-depth evaluation of these methods is challenging due to varying performance ranks across diverse datasets. In this paper, we propose a comprehensive benchmark comprising 300 tabular datasets, covering a wide range of task types, size distributions, and domains. We perform an extensive comparison between state-of-the-art deep tabular methods and tree-based methods, revealing the average rank of all methods and highlighting the key factors that influence the success of deep tabular methods. Next, we analyze deep tabular methods based on their training dynamics, including changes in validation metrics and other statistics. For each dataset-method pair, we learn a mapping from both the meta-features of datasets and the first part of the validation curve to the final validation set performance and even the evolution of validation curves. This mapping extracts essential meta-features that influence prediction accuracy, helping the analysis of tabular methods from novel aspects. Based on the performance of all methods on this large benchmark, we identify two subsets of 45 datasets each. The first subset contains datasets that favor either tree-based methods or DNN-based methods, serving as effective analysis tools to evaluate strategies (e.g., attribute encoding strategies) for improving deep tabular models. The second subset contains datasets where the ranks of methods are consistent with the overall benchmark, acting as a probe for tabular analysis. These "tiny tabular benchmarks" will facilitate further studies on tabular data.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2407.00934 [pdf, other]

CLEME2.0: Towards More Interpretable Evaluation by Disentangling Edits for Grammatical Error Correction

Authors: Jingheng Ye, Zishan Xu, Yinghui Li, Xuxin Cheng, Linlin Song, Qingyu Zhou, Hai-Tao Zheng, Ying Shen, Xin Su

Abstract: The paper focuses on improving the interpretability of Grammatical Error Correction (GEC) metrics, which has received little attention in previous studies. To bridge the gap, we propose CLEME2.0, a reference-based evaluation strategy that can describe four elementary dimensions of GEC systems, namely hit-correction, error-correction, under-correction, and over-correction. They collectively contribute to revealing the critical characteristics and locating drawbacks of GEC systems. Evaluating systems by combining these dimensions leads to high human consistency over other reference-based and reference-less metrics. Extensive experiments on 2 human judgement datasets and 6 reference datasets demonstrate the effectiveness and robustness of our method. All the codes will be released after the peer review.

Submitted 30 June, 2024; originally announced July 2024.

Comments: 16 pages, 8 tables, 2 figures. Under review

arXiv:2407.00136 [pdf, other]

Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, X. H. Bai, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (495 additional authors not shown)

Abstract: Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\toπ^0h_c$ and $e^+e^-\toπ^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components.

Submitted 2 July, 2024; v1 submitted 28 June, 2024; originally announced July 2024.

arXiv:2406.17758 [pdf, other]

MotionBooth: Motion-Aware Customized Text-to-Video Generation

Authors: Jianzong Wu, Xiangtai Li, Yanhong Zeng, Jiangning Zhang, Qianyu Zhou, Yining Li, Yunhai Tong, Kai Chen

Abstract: In this work, we present MotionBooth, an innovative framework designed for animating customized subjects with precise control over both object and camera movements. By leveraging a few images of a specific object, we efficiently fine-tune a text-to-video model to capture the object's shape and attributes accurately. Our approach presents subject region loss and video preservation loss to enhance the subject's learning performance, along with a subject token cross-attention loss to integrate the customized subject with motion control signals. Additionally, we propose training-free techniques for managing subject and camera motions during inference. In particular, we utilize cross-attention map manipulation to govern subject motion and introduce a novel latent shift module for camera movement control as well. MotionBooth excels in preserving the appearance of subjects while simultaneously controlling the motions in generated videos. Extensive quantitative and qualitative evaluations demonstrate the superiority and effectiveness of our method. Our project page is at https://jianzongwu.github.io/projects/motionbooth

Submitted 25 June, 2024; originally announced June 2024.

Comments: Project page at https://jianzongwu.github.io/projects/motionbooth

arXiv:2406.17019 [pdf, ps, other]

Constraining galaxy properties with complete samples of lenses

Authors: Qing Zhou, Alessandro Sonnenfeld, Henk Hoekstra

Abstract: The statistics of Einstein radii for a sample of strong lenses can provide valuable constraints on the underlying mass distribution. The correct interpretation, however, relies critically on the modelling of the selection of the sample, which has proven to be a limiting factor. This may change thanks to upcoming uniform high-resolution imaging surveys that cover a large fraction of the sky, because they can provide complete lens samples, with well understood selection criteria. To explore how the observed distribution of Einstein radii depends on the galaxy properties, we simulated a realistic complete sample of strong lenses, predicting a number density of lenses of about 2.5 deg$^{-2}$ for a Euclid-like setup. Such data can break the degeneracy between the stellar initial mass function (IMF) and the inner slope of the density profile of dark matter, without having to rely on additional information from stellar dynamics. We found that a survey covering only 50 deg$^2$ can already provide tight constraints: assuming that the cosmology is known, the dark matter slope is recovered with an uncertainty of $3.5\%$, while the uncertainty in the ratio between the true stellar mass and that inferred from stellar population modelling was found to be $10\%$. These findings highlight the potential of this method when applied to samples of lenses with well-understood selection functions.

Submitted 24 June, 2024; originally announced June 2024.

Comments: Submitted to Astronomy & Astrophysics

arXiv:2406.13154 [pdf, other]

Conditional score-based diffusion models for solving inverse problems in mechanics

Authors: Agnimitra Dasgupta, Harisankar Ramaswamy, Javier Murgoitio Esandi, Ken Foo, Runze Li, Qifa Zhou, Brendan Kennedy, Assad Oberai

Abstract: We propose a framework to perform Bayesian inference using conditional score-based diffusion models to solve a class of inverse problems in mechanics involving the inference of a specimen's spatially varying material properties from noisy measurements of its mechanical response to loading. Conditional score-based diffusion models are generative models that learn to approximate the score function of a conditional distribution using samples from the joint distribution. More specifically, the score functions corresponding to multiple realizations of the measurement are approximated using a single neural network, the so-called score network, which is subsequently used to sample the posterior distribution using an appropriate Markov chain Monte Carlo scheme based on Langevin dynamics. Training the score network only requires simulating the forward model. Hence, the proposed approach can accommodate black-box forward models and complex measurement noise. Moreover, once the score network has been trained, it can be re-used to solve the inverse problem for different realizations of the measurements. We demonstrate the efficacy of the proposed approach on a suite of high-dimensional inverse problems in mechanics that involve inferring heterogeneous material properties from noisy measurements. Some examples we consider involve synthetic data, while others include data collected from actual elastography experiments. Further, our applications demonstrate that the proposed approach can handle different measurement modalities, complex patterns in the inferred quantities, non-Gaussian and non-additive noise models, and nonlinear black-box forward models. The results show that the proposed framework can solve large-scale physics-based inverse problems efficiently.

Submitted 21 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.12670 [pdf, other]

Stealth edits for provably fixing or attacking large language models

Authors: Oliver J. Sutton, Qinghua Zhou, Wei Wang, Desmond J. Higham, Alexander N. Gorban, Alexander Bastounis, Ivan Y. Tyukin

Abstract: We reveal new methods and the theoretical foundations of techniques for editing large language models. We also show how the new theory can be used to assess the editability of models and to expose their susceptibility to previously unknown malicious attacks. Our theoretical approach shows that a single metric (a specific measure of the intrinsic dimensionality of the model's features) is fundamental to predicting the success of popular editing approaches, and reveals new bridges between disparate families of editing methods. We collectively refer to these approaches as stealth editing methods, because they aim to directly and inexpensively update a model's weights to correct the model's responses to known hallucinating prompts without otherwise affecting the model's behaviour, without requiring retraining. By carefully applying the insight gleaned from our theoretical investigation, we are able to introduce a new network block -- named a jet-pack block -- which is optimised for highly selective model editing, uses only standard network operations, and can be inserted into existing networks. The intrinsic dimensionality metric also determines the vulnerability of a language model to a stealth attack: a small change to a model's weights which changes its response to a single attacker-chosen prompt. Stealth attacks do not require access to or knowledge of the model's training data, therefore representing a potent yet previously unrecognised threat to redistributed foundation models. They are computationally simple enough to be implemented in malware in many cases. Extensive experimental results illustrate and support the method and its theoretical underpinnings. Demos and source code for editing language models are available at https://github.com/qinghua-zhou/stealth-edits.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 24 pages, 9 figures. Open source implementation: https://github.com/qinghua-zhou/stealth-edits

MSC Class: 68T07; 68T50; 68W40 ACM Class: I.2.7; F.2.0

arXiv:2406.11739 [pdf, other]

V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results

Authors: Jiaqi Wang, Yuhang Zang, Pan Zhang, Tao Chu, Yuhang Cao, Zeyi Sun, Ziyu Liu, Xiaoyi Dong, Tong Wu, Dahua Lin, Zeming Chen, Zhi Wang, Lingchen Meng, Wenhao Yao, Jianwei Yang, Sihong Wu, Zhineng Chen, Zuxuan Wu, Yu-Gang Jiang, Peixi Wu, Bosong Chai, Xuan Nie, Longquan Yan, Zeyu Wang, Qifan Zhou, et al. (9 additional authors not shown)

Abstract: Detecting objects in real-world scenes is a complex task due to various challenges, including the vast range of object categories, and potential encounters with previously unknown or unseen objects. The challenges necessitate the development of public benchmarks and challenges to advance the field of object detection. Inspired by the success of previous COCO and LVIS Challenges, we organize the V3Det Challenge 2024 in conjunction with the 4th Open World Vision Workshop: Visual Perception via Learning in an Open World (VPLOW) at CVPR 2024, Seattle, US. This challenge aims to push the boundaries of object detection research and encourage innovation in this field. The V3Det Challenge 2024 consists of two tracks: 1) Vast Vocabulary Object Detection: This track focuses on detecting objects from a large set of 13204 categories, testing the detection algorithm's ability to recognize and locate diverse objects. 2) Open Vocabulary Object Detection: This track goes a step further, requiring algorithms to detect objects from an open set of categories, including unknown objects. In the following sections, we will provide a comprehensive summary and analysis of the solutions submitted by participants. By analyzing the methods and solutions presented, we aim to inspire future research directions in vast vocabulary and open-vocabulary object detection, driving progress in this field. Challenge homepage: https://v3det.openxlab.org.cn/challenge

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.11653 [pdf, other]

Communication-Efficient MARL for Platoon Stability and Energy-efficiency Co-optimization in Cooperative Adaptive Cruise Control of CAVs

Authors: Min Hua, Dong Chen, Kun Jiang, Fanggang Zhang, Jinhai Wang, Bo Wang, Quan Zhou, Hongming Xu

Abstract: Cooperative adaptive cruise control (CACC) has been recognized as a fundamental function of autonomous driving, in which platoon stability and energy efficiency are outstanding challenges that are difficult to accommodate in real-world operations. This paper studied the CACC of connected and autonomous vehicles (CAVs) based on the multi-agent reinforcement learning algorithm (MARL) to optimize platoon stability and energy efficiency simultaneously. The optimal use of communication bandwidth is the key to guaranteeing learning performance in real-world driving, and thus this paper proposes a communication-efficient MARL by incorporating the quantified stochastic gradient descent (QSGD) and a binary differential consensus (BDC) method into a fully-decentralized MARL framework. We benchmarked the performance of our proposed BDC-MARL algorithm against several non-communicative and communicative MARL algorithms, e.g., IA2C, FPrint, and DIAL, through the evaluation of platoon stability, fuel economy, and driving comfort. Our results show that BDC-MARL achieved the highest energy savings, improving by up to 5.8%, with an average velocity of 15.26 m/s and an inter-vehicle spacing of 20.76 m. In addition, we conducted different information-sharing analyses to assess communication efficacy, along with sensitivity analyses and scalability tests with varying platoon sizes. The practical effectiveness of our approach is further demonstrated using real-world scenarios sourced from open-sourced OpenACC.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.09201 [pdf, other]

Enhanced Object Detection: A Study on Vast Vocabulary Object Detection Track for V3Det Challenge 2024

Authors: Peixi Wu, Bosong Chai, Xuan Nie, Longquan Yan, Zeyu Wang, Qifan Zhou, Boning Wang, Yansong Peng, Hebei Li

Abstract: In this technical report, we present our findings from the research conducted on the Vast Vocabulary Visual Detection (V3Det) dataset for the Supervised Vast Vocabulary Visual Detection task. How to deal with complex categories and detection boxes has become a difficulty in this track. The original supervised detector is not suitable for this task. We have designed a series of improvements, including adjustments to the network structure, changes to the loss function, and design of training strategies. Our model has shown improvement over the baseline and achieved excellent rankings on the Leaderboard for both the Vast Vocabulary Object Detection (Supervised) track and the Open Vocabulary Object Detection (OVD) track of the V3Det Challenge 2024.

Submitted 21 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

Journal ref: Second Place in CVPR 2024 Vast Vocabulary Visual Detection Challenge

arXiv:2406.08634 [pdf, other]

Unveiling Incomplete Modality Brain Tumor Segmentation: Leveraging Masked Predicted Auto-Encoder and Divergence Learning

Authors: Zhongao Sun, Jiameng Li, Yuhan Wang, Jiarong Cheng, Qing Zhou, Chun Li

Abstract: Brain tumor segmentation remains a significant challenge, particularly in the context of multi-modal magnetic resonance imaging (MRI) where missing modality images are common in clinical settings, leading to reduced segmentation accuracy. To address this issue, we propose a novel strategy, which is called masked predicted pre-training, enabling robust feature learning from incomplete modality data. Additionally, in the fine-tuning phase, we utilize a knowledge distillation technique to align features between complete and missing modality data, simultaneously enhancing model robustness. Notably, we leverage the Holder pseudo-divergence instead of the KLD for distillation loss, offering improved mathematical interpretability and properties. Extensive experiments on the BRATS2018 and BRATS2020 datasets demonstrate significant performance enhancements compared to existing state-of-the-art methods.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.06706 [pdf, other]

Application of Black-Litterman Bayesian in Statistical Arbitrage

Authors: Qiqin Zhou

Abstract: In this paper, we integrated the statistical arbitrage strategy, pairs trading, into the Black-Litterman model and constructed efficient mean-variance portfolios. Typically, pairs trading underperforms under volatile or distressed market conditions because the selected asset pairs fail to revert to equilibrium within the investment horizon. By enhancing this strategy with the Black-Litterman portfolio optimization, we achieved superior performance compared to the S&P 500 market index under both normal and extreme market conditions. Furthermore, this research presents an innovative idea of incorporating traditional pairs trading strategies into the portfolio optimization framework in a scalable and systematic manner.

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2406.06277 [pdf, other]

Measurement of the branching fractions of $\bar{B}\to D^{(*)} K^- K^{(*)0}_{(S)}$ and $\bar{B}\to D^{(*)}D_s^{-}$ decays at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, et al. (382 additional authors not shown)

Abstract: We present measurements of the branching fractions of eight $\overline B{}^0\to D^{(*)+} K^- K^{(*)0}_{(S)}$, $B^{-}\to D^{(*)0} K^- K^{(*)0}_{(S)}$ decay channels. The results are based on data from SuperKEKB electron-positron collisions at the $Υ(4S)$ resonance collected with the Belle II detector, corresponding to an integrated luminosity of $362~\text{fb}^{-1}$. The event yields are extracted from fits to the distributions of the difference between expected and observed $B$ meson energy, and are efficiency-corrected as a function of $m(K^-K^{(*)0}_{(S)})$ and $m(D^{(*)}K^{(*)0}_{(S)})$ in order to avoid dependence on the decay model. These results include the first observation of $\overline B{}^0\to D^+K^-K_S^0$, $B^-\to D^{*0}K^-K_S^0$, and $\overline B{}^0\to D^{*+}K^-K_S^0$ decays and a significant improvement in the precision of the other channels compared to previous measurements. The helicity-angle distributions and the invariant mass distributions of the $K^- K^{(*)0}_{(S)}$ systems are compatible with quasi-two-body decays via a resonant transition with spin-parity $J^P=1^-$ for the $K^-K_S^0$ systems and $J^P= 1^+$ for the $K^-K^{*0}$ systems. We also present measurements of the branching fractions of four $\overline B{}^0\to D^{(*)+} D_s^-$, $B^{-}\to D^{(*)0} D_s^- $ decay channels with a precision compatible to the current world averages.

Submitted 10 June, 2024; originally announced June 2024.

Comments: Prepared for submission to JHEP. 34 pages, 14 figures

Report number: Belle II Preprint: 2024-014, KEK Preprint: 2024-8

arXiv:2406.05852 [pdf, other]

RefGaussian: Disentangling Reflections from 3D Gaussian Splatting for Realistic Rendering

Authors: Rui Zhang, Tianyue Luo, Weidong Yang, Ben Fei, Jingyi Xu, Qingyuan Zhou, Keyi Liu, Ying He

Abstract: 3D Gaussian Splatting (3D-GS) has made a notable advancement in the field of neural rendering, 3D scene reconstruction, and novel view synthesis. Nevertheless, 3D-GS encounters the main challenge when it comes to accurately representing physical reflections, especially in the case of total reflection and semi-reflection that are commonly found in real-world scenes. This limitation causes reflections to be mistakenly treated as independent elements with physical presence, leading to imprecise reconstructions. Herein, to tackle this challenge, we propose RefGaussian to disentangle reflections from 3D-GS for realistically modeling reflections. Specifically, we propose to split a scene into transmitted and reflected components and represent these components using two Spherical Harmonics (SH). Given that this decomposition is not fully determined, we employ local regularization techniques to ensure local smoothness for both the transmitted and reflected components, thereby achieving more plausible decomposition outcomes than 3D-GS. Experimental results demonstrate that our approach achieves superior novel view synthesis and accurate depth estimation outcomes. Furthermore, it enables the utilization of scene editing applications, ensuring both high-quality results and physical coherence.

Submitted 9 June, 2024; originally announced June 2024.

arXiv:2406.05758 [pdf, other]

Planar Turán number for balanced double stars

Authors: Xin Xu, Qiang Zhou, Tong Li, Guiying Yan

Abstract: The planar Turán number, denoted by $ex_{\mathcal{P}}(n,H)$, is the maximum number of edges in an $n$-vertex planar graph which does not contain $H$ as a subgraph. Ghosh, Győri, Paulos and Xiao initiated the topic of the planar Turán number for double stars. For balanced double stars, $S_{3,3}$ is the only remaining graph that needs to be considered. In this paper, we give the exact value of $ex_{\mathcal{P}}(n,S_{3,3})$, so that the planar Turán number is completely determined for all balanced double stars.

Submitted 9 June, 2024; originally announced June 2024.

Comments: 26 pages, 16 figures

arXiv:2406.05397 [pdf, other]

Metamorphic Relation Generation: State of the Art and Visions for Future Research

Authors: Rui Li, Huai Liu, Pak-Lok Poon, Dave Towey, Chang-Ai Sun, Zheng Zheng, Zhi Quan Zhou, Tsong Yueh Chen

Abstract: Metamorphic testing has become one mainstream technique to address the notorious oracle problem in software testing, thanks to its great successes in revealing real-life bugs in a wide variety of software systems. Metamorphic relations, the core component of metamorphic testing, have continuously attracted research interests from both academia and industry. In the last decade, a rapidly increasing number of studies have been conducted to systematically generate metamorphic relations from various sources and for different application domains. In this article, based on the systematic review on the state of the art for metamorphic relations' generation, we summarize and highlight visions for further advancing the theory and techniques for identifying and constructing metamorphic relations, and discuss potential research trends in related areas.

Submitted 10 June, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

Comments: Accepted by International Workshop on Software Engineering in 2030

arXiv:2406.04642 [pdf, ps, other]

Measurements of the branching fractions of $Ξ_{c}^{0}\toΞ^{0}π^{0}$, $Ξ_{c}^{0}\toΞ^{0}η$, and $Ξ_{c}^{0}\toΞ^{0}η^{\prime}$ and asymmetry parameter of $Ξ_{c}^{0}\toΞ^{0}π^{0}$

Authors: Belle, Belle II Collaborations, I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, M. Barrett, J. Baudot, A. Baur, A. Beaubien, et al. (360 additional authors not shown)

Abstract: We present a study of $Ξ_{c}^{0}\toΞ^{0}π^{0}$, $Ξ_{c}^{0}\toΞ^{0}η$, and $Ξ_{c}^{0}\toΞ^{0}η^{\prime}$ decays using the Belle and Belle~II data samples, which have integrated luminosities of 980~$\mathrm{fb}^{-1}$ and 426~$\mathrm{fb}^{-1}$, respectively. We measure the following relative branching fractions $${\cal B}(Ξ_{c}^{0}\toΞ^{0}π^{0})/{\cal B}(Ξ_{c}^{0}\toΞ^{-}π^{+}) = 0.48 \pm 0.02 ({\rm stat}) \pm 0.03 ({\rm syst}) ,$$ $${\cal B}(Ξ_{c}^{0}\toΞ^{0}η)/{\cal B}(Ξ_{c}^{0}\toΞ^{-}π^{+}) = 0.11 \pm 0.01 ({\rm stat}) \pm 0.01 ({\rm syst}) ,$$ $${\cal B}(Ξ_{c}^{0}\toΞ^{0}η^{\prime})/{\cal B}(Ξ_{c}^{0}\toΞ^{-}π^{+}) = 0.08 \pm 0.02 ({\rm stat}) \pm 0.01 ({\rm syst}) $$ for the first time, where the uncertainties are statistical ($\rm stat$) and systematic ($\rm syst$). By multiplying by the branching fraction of the normalization mode, ${\mathcal B}(Ξ_{c}^{0}\toΞ^{-}π^{+})$, we obtain the following absolute branching fraction results $(6.9 \pm 0.3 ({\rm stat}) \pm 0.5 ({\rm syst}) \pm 1.3 ({\rm norm})) \times 10^{-3}$, $(1.6 \pm 0.2 ({\rm stat}) \pm 0.2 ({\rm syst}) \pm 0.3 ({\rm norm})) \times 10^{-3}$, and $(1.2 \pm 0.3 ({\rm stat}) \pm 0.1 ({\rm syst}) \pm 0.2 ({\rm norm})) \times 10^{-3}$, for $Ξ_{c}^{0}$ decays to $Ξ^{0}π^{0}$, $Ξ^{0}η$, and $Ξ^{0}η^{\prime}$ final states, respectively. The third errors are from the uncertainty on ${\mathcal B}(Ξ_{c}^{0}\toΞ^{-}π^{+})$. The asymmetry parameter for $Ξ_{c}^{0}\toΞ^{0}π^{0}$ is measured to be $α(Ξ_{c}^{0}\toΞ^{0}π^{0}) = -0.90\pm0.15({\rm stat})\pm0.23({\rm syst})$.

Submitted 7 June, 2024; originally announced June 2024.

Comments: 23 pages, 5 figures

Report number: Belle II Preprint 2024-015; KEK Preprint 2024-9

arXiv:2406.00610 [pdf]

Portfolio Optimization with Robust Covariance and Conditional Value-at-Risk Constraints

Authors: Qiqin Zhou

Abstract: The measure of portfolio risk is an important input of the Markowitz framework. In this study, we explored various methods to obtain robust covariance estimators that are less susceptible to financial data noise. We evaluated the performance of large-cap portfolios using various forms of Ledoit Shrinkage Covariance and Robust Gerber Covariance matrix during the period of 2012 to 2022. Out-of-sample performance indicates that robust covariance estimators can outperform the market capitalization-weighted benchmark portfolio, particularly during bull markets. The Gerber covariance with Mean-Absolute-Deviation (MAD) emerged as the top performer. However, robust estimators do not manage tail risk well under extreme market conditions, for example, the Covid-19 period. When we aim to control for tail risk, we should add a constraint on Conditional Value-at-Risk (CVaR) to make more conservative decisions on risk exposure. Additionally, we incorporated the unsupervised clustering algorithm K-means into the optimization algorithm (i.e. Nested Clustering Optimization, NCO). It not only helps mitigate numerical instability of the optimization algorithm, but also contributes to lower drawdown.

Submitted 1 June, 2024; originally announced June 2024.

Comments: 11 pages
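
To make the CVaR constraint mentioned in the entry above concrete, here is a minimal sketch of a historical (empirical) CVaR check. It is not the paper's implementation: the function names, the scenario-based estimator, and the tail-size rounding rule are all assumptions chosen for illustration.

```python
def historical_cvar(returns, alpha=0.95):
    """Average loss over the worst (1 - alpha) share of scenarios.

    Assumption: empirical estimator on equally weighted scenarios.
    """
    losses = sorted((-r for r in returns), reverse=True)  # largest loss first
    # Number of tail scenarios; always keep at least one.
    n_tail = max(1, round(len(losses) * (1 - alpha)))
    return sum(losses[:n_tail]) / n_tail


def portfolio_returns(weights, scenario_returns):
    """Per-scenario portfolio return for fixed asset weights."""
    return [sum(w * r for w, r in zip(weights, row)) for row in scenario_returns]


def satisfies_cvar_limit(weights, scenario_returns, alpha, limit):
    """True when the portfolio's historical CVaR does not exceed `limit`."""
    cvar = historical_cvar(portfolio_returns(weights, scenario_returns), alpha)
    return cvar <= limit
```

In an optimizer, a check like `satisfies_cvar_limit` would typically be replaced by a linear-programming formulation of CVaR; the version here only evaluates a candidate weight vector against a limit.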

arXiv:2405.19734 [pdf, other]

Search for the decay $B^{0}\toγγ$ using Belle and Belle II data

Authors: Belle and Belle II Collaborations: I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, S. Al Said, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, et al. (385 additional authors not shown)

Abstract: We report the result of a search for the rare decay $B^{0} \to γγ$ using a combined dataset of $753\times10^{6}$ $B\bar{B}$ pairs collected by the Belle experiment and $387\times10^{6}$ $B\bar{B}$ pairs collected by the Belle II experiment from decays of the $\rm Υ(4S)$ resonance produced in $e^{+}e^{-}$ collisions. A simultaneous fit to the Belle and Belle II data sets yields $11.0^{+6.5}_{-5.5}$ signal events, corresponding to a 2.5$σ$ significance. We determine the branching fraction $\mathcal{B}(B^{0} \to γγ) = (3.7^{+2.2}_{-1.8}(\rm stat)\pm0.5(\rm syst))\times10^{-8}$ and set a 90% credibility level upper limit of $\mathcal{B}(B^{0} \to γγ) < 6.4\times10^{-8}$.

Submitted 30 May, 2024; originally announced May 2024.

Report number: Belle II Preprint: 2024-017, KEK Preprint: 2024-13

arXiv:2405.19624 [pdf, other]

Single spin asymmetry $A_{UL}^{\sin(2φ_h)}$ in dihadron production in SIDIS

Authors: Ren Yang, Yangyang Yu, Qihang Zhou, Gang Li, Mao Song, Xuan Luo

Abstract: The paper calculates the helicity-dependent dihadron fragmentation function (DiFF) by extending the dihadron spectator model and examines the single longitudinal spin asymmetry $A^{\sin(2φ_h)}_{UL}$ from dihadron production in semi-inclusive deep inelastic scattering (SIDIS). This function elucidates the relationship between the longitudinal polarization of the fragmented quark and the transverse momentum of the resulting hadron pairs. A study by the COMPASS collaboration detected a minimal signal in their experimental search for this azimuthal asymmetry in SIDIS. Here, we use the spectator model to calculate the unknown T-odd dihadron fragmentation function $H_1^\perp$. We adopt collinear factorization to describe the data, avoiding the transverse momentum dependent factorization and the associated resummation effects, which helps us understand the asymmetry and explain why the signal is so weak. We involve the approach of transverse momentum dependence in the model calculations, in order to formulate the differential cross sections and the spin asymmetries in terms of the collinear parton distributions and the collinear DiFFs. A transverse momentum factor analysis method was used, in which the transverse momentum of the final hadron pairs was not integrated. The $\sin(2φ_h)$ asymmetry in COMPASS kinematics was calculated and compared with experimental data. In addition, predictions for the same asymmetry are also presented for HERMES and the Electron Ion Collider.

Submitted 24 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

Comments: 10 pages, 10 figures. Version appearing in PRD

Journal ref: Phys. Rev. D 109, 114038 (2024)

arXiv:2405.18928 [pdf, other]

Measurement of the energy dependence of the $e^+e^- \to B\bar{B}$, $B\bar{B}{}^*$, and $B^*\bar{B}{}^*$ cross sections at Belle~II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, M. Bauer, A. Baur, et al. (444 additional authors not shown)

Abstract: We report measurements of the $e^+e^- \to B\bar{B}$, $B\bar{B}{}^*$, and $B^*\bar{B}{}^*$ cross sections at four energies, 10653, 10701, 10746 and 10805 MeV, using data collected by the Belle~II experiment. We reconstruct one $B$ meson in a large number of hadronic final states and use its momentum to identify the production process. In the first $2-5$ MeV above $B^*\bar{B}{}^*$ threshold, the $e^+e^- \to B^*\bar{B}{}^*$ cross section increases rapidly. This may indicate the presence of a pole close to the threshold.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 30 pages, 15 figures, submitted to JHEP

Report number: Belle II Preprint 2024-016, KEK Preprint 2024-12

arXiv:2405.18891 [pdf]

Inverse Design of Promising Alloys for Electrocatalytic CO$_2$ Reduction via Generative Graph Neural Networks Combined with Bird Swarm Algorithm

Authors: Zhilong Song, Linfeng Fan, Shuaihua Lu, Qionghua Zhou, Chongyi Ling, Jinlan Wang

Abstract: Directly generating material structures with optimal properties is a long-standing goal in material design. One of the fundamental challenges lies in how to overcome the limitation of traditional generative models to efficiently explore the global chemical space rather than a small localized space. Herein, we develop a framework named MAGECS to address this dilemma, by integrating the bird swarm algorithm and supervised graph neural network to effectively navigate the generative model in the immense chemical space towards materials with target properties. As a demonstration, MAGECS is applied to design compelling alloy electrocatalysts for CO$_2$ reduction reaction (CO$_2$RR) and works extremely well. Specifically, the chemical space of CO$_2$RR is effectively explored, where over 250,000 promising structures with high activity have been generated and notably, the proportion of desired structures is 2.5-fold increased. Moreover, five predicted alloys, i.e., CuAl, AlPd, Sn$_2$Pd$_5$, Sn$_9$Pd$_7$, and CuAlSe$_2$ are successfully synthesized and characterized experimentally, two of which exhibit about 90% Faraday efficiency of CO$_2$RR, and CuAl achieved 76% efficiency for C$_2$ products. This pioneering application of inverse design in CO$_2$RR catalysis showcases the potential of MAGECS to dramatically accelerate the development of functional materials, paving the way for fully automated, artificial intelligence-driven material design.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.18649 [pdf, other]

Training LLMs to Better Self-Debug and Explain Code

Authors: Nan Jiang, Xiaopeng Li, Shiqi Wang, Qiang Zhou, Soneya Binta Hossain, Baishakhi Ray, Varun Kumar, Xiaofei Ma, Anoop Deoras

Abstract: In the domain of code generation, self-debugging is crucial. It allows LLMs to refine their generated code based on execution feedback. This is particularly important because generating correct solutions in one attempt proves challenging for complex tasks. Prior works on self-debugging mostly focus on prompting methods by providing LLMs with few-shot examples, which work poorly on small open-sourced LLMs. In this work, we propose a training framework that significantly improves self-debugging capability of LLMs. Intuitively, we observe that a chain of explanations on the wrong code followed by code refinement helps LLMs better analyze the wrong code and do refinement. We thus propose an automated pipeline to collect a high-quality dataset for code explanation and refinement by generating a number of explanations and refinement trajectories and filtering via execution verification. We perform supervised fine-tuning (SFT) and further reinforcement learning (RL) on both success and failure trajectories with a novel reward design considering code explanation and refinement quality. SFT improves the pass@1 by up to 15.92% and pass@10 by 9.30% over four benchmarks. RL training brings additional up to 3.54% improvement on pass@1 and 2.55% improvement on pass@10. The trained LLMs show iterative refinement ability, and can keep refining code continuously. Lastly, our human evaluation shows that the LLMs trained with our framework generate more useful code explanations and help developers better understand bugs in source code.

Submitted 28 May, 2024; originally announced May 2024.
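
The pass@1 and pass@10 figures in the entry above refer to the pass@k metric. The abstract does not say which estimator the authors used; the sketch below shows the widely used unbiased estimator, 1 - C(n-c, k)/C(n, k), for n sampled solutions of which c pass the tests. The function name and argument checks are illustrative assumptions.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that pass all tests, k: budget (k <= n).
    Assumption: this is the common combinatorial estimator, not necessarily
    the exact procedure behind the numbers quoted in the abstract.
    """
    if not 0 <= c <= n:
        raise ValueError("c must lie between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must lie between 1 and n")
    if n - c < k:
        # Fewer than k failing samples: every size-k draw contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

A benchmark-level score would then be the mean of `pass_at_k` over all problems.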

arXiv:2405.18258 [pdf, other]

Text-only Synthesis for Image Captioning

Authors: Qing Zhou, Junlin Huang, Qiang Li, Junyu Gao, Qi Wang

Abstract: From paired image-text training to text-only training for image captioning, the pursuit of relaxing the requirements for high-cost and large-scale annotation of good quality data remains consistent. In this paper, we propose Text-only Synthesis for Image Captioning (ToCa), which further advances this relaxation with less human labor and less computing time. Specifically, we deconstruct caption text into structures and lexical words, which serve as the fundamental components of the caption. By combining different structures and lexical words as inputs to the large language model, massive captions that contain various patterns of lexical words are generated. This method not only approaches the target domain but also surpasses it by generating new captions, thereby enhancing the zero-shot generalization ability of the model. Considering the different levels of data access in the real world, we define three synthesis scenarios: cross-domain synthesis, in-domain synthesis, and data-efficient synthesis. Experiments in these scenarios demonstrate the generalizability, transferability and practicability of ToCa with a nearly 5 CIDEr improvement for zero-shot cross-domain captioning and a maximum increase of over 20 CIDEr for data-efficient captioning.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.16940 [pdf, other]

Adversarial Attacks on Both Face Recognition and Face Anti-spoofing Models

Authors: Fengfan Zhou, Qianyu Zhou, Xiangtai Li, Xuequan Lu, Lizhuang Ma, Hefei Ling

Abstract: Adversarial attacks on Face Recognition (FR) systems have proven highly effective in compromising pure FR models, yet adversarial examples may be ineffective to the complete FR systems as Face Anti-Spoofing (FAS) models are often incorporated and can detect a significant number of them. To address this under-explored and essential problem, we propose a novel setting of adversarially attacking both FR and FAS models simultaneously, aiming to enhance the practicability of adversarial attacks on FR systems. In particular, we introduce a new attack method, namely Style-aligned Distribution Biasing (SDB), to improve the capacity of black-box attacks on both FR and FAS models. Specifically, our SDB framework consists of three key components. Firstly, to enhance the transferability of FAS models, we design a Distribution-aware Score Biasing module to optimize adversarial face examples away from the distribution of spoof images utilizing scores. Secondly, to mitigate the substantial style differences between live images and adversarial examples initialized with spoof images, we introduce an Instance Style Alignment module that aligns the style of adversarial examples with live images. In addition, to alleviate the conflicts between the gradients of FR and FAS models, we propose a Gradient Consistency Maintenance module to minimize disparities between the gradients using Hessian approximation. Extensive experiments showcase the superiority of our proposed attack method to state-of-the-art adversarial attacks.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.16126 [pdf, other]

Near-Optimal Distributed Minimax Optimization under the Second-Order Similarity

Authors: Qihao Zhou, Haishan Ye, Luo Luo

Abstract: This paper considers the distributed convex-concave minimax optimization under the second-order similarity. We propose the stochastic variance-reduced optimistic gradient sliding (SVOGS) method, which takes advantage of the finite-sum structure in the objective by involving the mini-batch client sampling and variance reduction. We prove SVOGS can achieve the $\varepsilon$-duality gap within communication rounds of ${\mathcal O}(δD^2/\varepsilon)$, communication complexity of ${\mathcal O}(n+\sqrt{n}δD^2/\varepsilon)$, and local gradient calls of $\tilde{\mathcal O}(n+(\sqrt{n}δ+L)D^2/\varepsilon\log(1/\varepsilon))$, where $n$ is the number of nodes, $δ$ is the degree of the second-order similarity, $L$ is the smoothness parameter and $D$ is the diameter of the constraint set. We can verify that all of the above complexities (nearly) match the corresponding lower bounds. For the specific $μ$-strongly-convex-$μ$-strongly-concave case, our algorithm has the upper bounds on communication rounds, communication complexity, and local gradient calls of $\mathcal O(δ/μ\log(1/\varepsilon))$, ${\mathcal O}((n+\sqrt{n}δ/μ)\log(1/\varepsilon))$, and $\tilde{\mathcal O}((n+(\sqrt{n}δ+L)/μ)\log(1/\varepsilon))$ respectively, which are also nearly tight. Furthermore, we conduct numerical experiments to show the empirical advantages of the proposed method.

Submitted 25 May, 2024; originally announced May 2024.

arXiv:2405.15358 [pdf, ps, other]

Coordinated Multi-Neighborhood Learning on a Directed Acyclic Graph

Authors: Stephen Smith, Qing Zhou

Abstract: Learning the structure of causal directed acyclic graphs (DAGs) is useful in many areas of machine learning and artificial intelligence, with wide applications. However, in the high-dimensional setting, it is challenging to obtain good empirical and theoretical results without strong and often restrictive assumptions. Additionally, it is questionable whether all of the variables purported to be included in the network are observable. It is of interest then to restrict consideration to a subset of the variables for relevant and reliable inferences. In fact, researchers in various disciplines can usually select a set of target nodes in the network for causal discovery. This paper develops a new constraint-based method for estimating the local structure around multiple user-specified target nodes, enabling coordination in structure learning between neighborhoods. Our method facilitates causal discovery without learning the entire DAG structure. We establish consistency results for our algorithm with respect to the local neighborhood structure of the target nodes in the true graph. Experimental results on synthetic and real-world data show that our algorithm is more accurate in learning the neighborhood structures with much less computational cost than standard methods that estimate the entire DAG. An R package implementing our methods may be accessed at https://github.com/stephenvsmith/CML.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 13 pages, 6 figures

arXiv:2405.14625 [pdf, other]

Test of light-lepton universality in $τ$ decays with the Belle II experiment

Authors: Belle II Collaboration, I. Adachi, K. Adamczyk, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Anh Ky, D. M. Asner, H. Atmacan, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, J. Becker, et al. (406 additional authors not shown)

Abstract: We present a measurement of the ratio $R_μ= \mathcal{B}(τ^-\to μ^-\barν_μν_τ) / \mathcal{B}(τ^-\to e^-\barν_eν_τ)$ of branching fractions $\mathcal{B}$ of the $τ$ lepton decaying to muons or electrons using data collected with the Belle II detector at the SuperKEKB $e^+e^-$ collider. The sample has an integrated luminosity of 362 fb$^{-1}$ at a centre-of-mass energy of 10.58 GeV. Using an optimised event selection, a binned maximum likelihood fit is performed using the momentum spectra of the electron and muon candidates. The result, $R_μ= 0.9675 \pm 0.0007 \pm 0.0036$, where the first uncertainty is statistical and the second is systematic, is the most precise to date. It provides a stringent test of the light-lepton universality, translating to a ratio of the couplings of the muon and electron to the $W$ boson in $τ$ decays of $0.9974 \pm 0.0019$, in agreement with the standard model expectation of unity.

Submitted 23 May, 2024; originally announced May 2024.

Report number: Belle II Preprint 2024-002, KEK Preprint 2023-49
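
The translation from $R_μ$ to a coupling ratio in the entry above can be reproduced approximately with the standard leading-order phase-space factor $f(x) = 1 - 8x + 8x^3 - x^4 - 12x^2\ln x$, where $x = m_\ell^2/m_τ^2$. The sketch below is an assumption-laden cross-check, not the collaboration's procedure: the lepton masses are values supplied here (they do not appear in the abstract), radiative corrections are neglected, and the two quoted uncertainties are combined in quadrature.

```python
from math import log, sqrt

# Lepton masses in MeV. These are assumed values, not taken from the abstract.
M_E, M_MU, M_TAU = 0.51099895, 105.6583755, 1776.86


def phase_space(x):
    """Leading-order phase-space factor for tau -> lepton nu nu, x = (m_l/m_tau)^2."""
    return 1 - 8 * x + 8 * x**3 - x**4 - 12 * x**2 * log(x)


def coupling_ratio(r_mu):
    """|g_mu / g_e| implied by R_mu, ignoring radiative corrections."""
    f_mu = phase_space((M_MU / M_TAU) ** 2)
    f_e = phase_space((M_E / M_TAU) ** 2)
    return sqrt(r_mu * f_e / f_mu)


def coupling_ratio_uncertainty(r_mu, stat, syst):
    """Propagate the R_mu uncertainties (added in quadrature) through the square root."""
    total = sqrt(stat**2 + syst**2)
    return 0.5 * coupling_ratio(r_mu) * total / r_mu
```

With the quoted inputs, `coupling_ratio(0.9675)` and `coupling_ratio_uncertainty(0.9675, 0.0007, 0.0036)` come out close to the 0.9974 and 0.0019 given in the abstract.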

arXiv:2405.14444 [pdf]

DuEDL: Dual-Branch Evidential Deep Learning for Scribble-Supervised Medical Image Segmentation

Authors: Yitong Yang, Xinli Xu, Haigen Hu, Haixia Long, Qianwei Zhou, Qiu Guan

Abstract: Despite the recent progress in medical image segmentation with scribble-based annotations, the segmentation results of most models are still not robust and generalizable enough in open environments. Evidential deep learning (EDL) has recently been proposed as a promising solution to model predictive uncertainty and improve the reliability of medical image segmentation. However, directly applying EDL to scribble-supervised medical image segmentation faces a tradeoff between accuracy and reliability. To address the challenge, we propose a novel framework called Dual-Branch Evidential Deep Learning (DuEDL). Firstly, the decoder of the segmentation network is changed to two different branches, and the evidence of the two branches is fused to generate high-quality pseudo-labels. Then the framework applies partial evidence loss and two-branch consistent loss for joint training of the model to adapt to the scribble supervision learning. The proposed method was tested on two cardiac datasets: ACDC and MSCMRseg. The results show that our method significantly enhances the reliability and generalization ability of the model without sacrificing accuracy, outperforming state-of-the-art baselines. The code is available at https://github.com/Gardnery/DuEDL.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 14 pages, 2 figures

arXiv:2405.13872 [pdf, other]

Image-of-Thought Prompting for Visual Reasoning Refinement in Multimodal Large Language Models

Authors: Qiji Zhou, Ruochen Zhou, Zike Hu, Panzhong Lu, Siyang Gao, Yue Zhang

Abstract: Recent advancements in Chain-of-Thought (CoT) and related rationale-based works have significantly improved the performance of Large Language Models (LLMs) in complex reasoning tasks. With the evolution of Multimodal Large Language Models (MLLMs), enhancing their capability to tackle complex multimodal reasoning problems is a crucial frontier. However, incorporating multimodal rationales in CoT has yet to be thoroughly investigated. We propose the Image-of-Thought (IoT) prompting method, which helps MLLMs to extract visual rationales step-by-step. Specifically, IoT prompting can automatically design critical visual information extraction operations based on the input images and questions. Each step of visual information refinement identifies specific visual rationales that support answers to complex visual reasoning questions. Beyond the textual CoT, IoT simultaneously utilizes visual and textual rationales to help MLLMs understand complex multimodal information. IoT prompting has improved zero-shot visual reasoning performance across various visual understanding tasks in different MLLMs. Moreover, the step-by-step visual feature explanations generated by IoT prompting elucidate the visual reasoning process, aiding in analyzing the cognitive processes of large multimodal models.

Submitted 28 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

Comments: Correct the case title

arXiv:2405.13042 [pdf, other]

doi 10.1145/3649921.3656987

StoryVerse: Towards Co-authoring Dynamic Plot with LLM-based Character Simulation via Narrative Planning

Authors: Yi Wang, Qian Zhou, David Ledo

Abstract: Automated plot generation for games enhances the player's experience by providing rich and immersive narrative experience that adapts to the player's actions. Traditional approaches adopt a symbolic narrative planning method which limits the scale and complexity of the generated plot by requiring extensive knowledge engineering work. Recent advancements use Large Language Models (LLMs) to drive the behavior of virtual characters, allowing plots to emerge from interactions between characters and their environments. However, the emergent nature of such decentralized plot generation makes it difficult for authors to direct plot progression. We propose a novel plot creation workflow that mediates between a writer's authorial intent and the emergent behaviors from LLM-driven character simulation, through a novel authorial structure called "abstract acts". The writers define high-level plot outlines that are later transformed into concrete character action sequences via an LLM-based narrative planning process, based on the game world state. The process creates "living stories" that dynamically adapt to various game world states, resulting in narratives co-created by the author, character simulation, and player. We present StoryVerse as a proof-of-concept system to demonstrate this plot creation workflow. We showcase the versatility of our approach with examples in different stories and game environments.

Submitted 17 May, 2024; originally announced May 2024.

Comments: Proceedings of the 19th international conference on the foundations of digital games 2024

arXiv:2405.12914 [pdf, other]

An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation

Authors: Zhiyu Tan, Mengping Yang, Luozheng Qin, Hao Yang, Ye Qian, Qiang Zhou, Cheng Zhang, Hao Li

Abstract: One critical prerequisite for faithful text-to-image generation is the accurate understanding of text inputs. Existing methods leverage the text encoder of the CLIP model to represent input prompts. However, the pre-trained CLIP model can merely encode English with a maximum token length of 77. Moreover, the model capacity of the text encoder from CLIP is relatively limited compared to Large Language Models (LLMs), which offer multilingual input, accommodate longer context, and achieve superior text representation. In this paper, we investigate LLMs as the text encoder to improve the language understanding in text-to-image generation. Unfortunately, training a text-to-image generative model with LLMs from scratch demands significant computational resources and data. To this end, we introduce a three-stage training pipeline that effectively and efficiently integrates the existing text-to-image model with LLMs. Specifically, we propose a lightweight adapter that enables fast training of the text-to-image model using the textual representations from LLMs. Extensive experiments demonstrate that our model supports not only multilingual but also longer input context with superior image generation quality.

Submitted 21 May, 2024; originally announced May 2024.

Comments: Technical report. Project page: https://github.com/llm-conditioned-diffusion/llm-conditioned-diffusion

arXiv:2405.12827 [pdf, other]

On an impulsive faecal-oral model in a periodically evolving environment

Authors: Qi Zhou, Zhigui Lin, Carlos Alberto Santos

Abstract: To understand how impulsive intervention and regional evolution jointly influence the spread of faecal-oral diseases, this paper develops an impulsive faecal-oral model in a periodically evolving environment. The well-posedness of the model is first checked. Then, the existence of the principal eigenvalue dependent on impulse intensity and evolving rate is proved based on the Krein-Rutman theorem. With the help of this value, the threshold dynamical behaviours of the model are explored. More importantly, this paper also derives the monotonicity of the principal eigenvalue with respect to initial region and impulse intensity and estimates the principal eigenvalue in some special cases. Finally, numerical simulations are used to verify the correctness of the theoretical results and to explore the impact of regional evolution rate on the spread of the diseases. Our research shows that large impulsive intensity $1-g'(0)$ and small evolving rate $ρ(t)$ play a positive role in the prevention and control of the diseases.

Submitted 21 May, 2024; originally announced May 2024.

Comments: 29 pages, 8 figures

MSC Class: 35K57; 35R12; 92B05

arXiv:2405.12543 [pdf, other]

Like Humans to Few-Shot Learning through Knowledge Permeation of Vision and Text

Authors: Yuyu Jia, Qing Zhou, Wei Huang, Junyu Gao, Qi Wang

Abstract: Few-shot learning aims to generalize the recognizer from seen categories to an entirely novel scenario. With only a few support samples, several advanced methods initially introduce class names as prior knowledge for identifying novel classes. However, obstacles still impede achieving a comprehensive understanding of how to harness the mutual advantages of visual and textual knowledge. In this paper, we propose a coherent Bidirectional Knowledge Permeation strategy called BiKop, which is grounded in a human intuition: A class name description offers a general representation, whereas an image captures the specificity of individuals. BiKop primarily establishes a hierarchical joint general-specific representation through bidirectional knowledge permeation. On the other hand, considering the bias of joint representation towards the base set, we disentangle base-class-relevant semantics during training, thereby alleviating the suppression of potential novel-class-relevant information. Experiments on four challenging benchmarks demonstrate the remarkable superiority of BiKop. Our code will be publicly available.

Submitted 22 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

Showing 1–50 of 1,194 results for author: Zhou, Q