subscribe to arXiv mailings

Measurement of the branching fraction of $D^+_s\to \ell^+ν_\ell$ via $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

Abstract: Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and… ▽ More Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and $\mathcal{B}(D_s^+\toτ^+ν_τ)=(\bftauv)\%$, respectively. The product of the decay constant and Cabibbo-Kobayashi-Maskawa matrix element $|V_{cs}|$ is determined to be $f_{D_s^+}|V_{cs}|=(\mufdsxvcsresult)_{μν}~\mathrm{MeV}$ and $f_{D_s^+}|V_{cs}|=(\taufdsxvcsresult))_{τν}~\mathrm{MeV}$, respectively. Taking the value of $|V_{cs}|$ from a global fit in the Standard Model, we obtain ${f_{D^+_s}}=(\mufdsresult)_{μν}$\,MeV and ${f_{D^+_s}}=(\taufdsresult)_{τν}$\,MeV, respectively. Conversely, taking the value for $f_{D_s^+}$ from the latest lattice quantum chromodynamics calculation, we obtain $|V_{cs}| =(\muvcsresult)_{μν}$ and $|V_{cs}| = (\tauvcsresult)_{τν}$, respectively. △ Less

Submitted 16 July, 2024; originally announced July 2024.

Comments: 27 pages, 13 figures

arXiv:2407.11712 [pdf, other]

Harnessing Large Language Models for Multimodal Product Bundling

Authors: Xiaohao Liu, Jie Wu, Zhulin Tao, Yunshan Ma, Yinwei Wei, Tat-seng Chua

Abstract: Product bundling provides clients with a strategic combination of individual items.And it has gained significant attention in recent years as a fundamental prerequisite for online services. Recent methods utilize multimodal information through sophisticated extractors for bundling, but remain limited by inferior semantic understanding, the restricted scope of knowledge, and an inability to handle… ▽ More Product bundling provides clients with a strategic combination of individual items.And it has gained significant attention in recent years as a fundamental prerequisite for online services. Recent methods utilize multimodal information through sophisticated extractors for bundling, but remain limited by inferior semantic understanding, the restricted scope of knowledge, and an inability to handle cold-start issues.Despite the extensive knowledge and complex reasoning capabilities of large language models (LLMs), their direct utilization fails to process multimodalities and exploit their knowledge for multimodal product bundling. Adapting LLMs for this purpose involves demonstrating the synergies among different modalities and designing an effective optimization strategy for bundling, which remains challenging.To this end, we introduce Bundle-LLM to bridge the gap between LLMs and product bundling tasks. Sepcifically, we utilize a hybrid item tokenization to integrate multimodal information, where a simple yet powerful multimodal fusion module followed by a trainable projector embeds all non-textual features into a single token. This module not only explicitly exhibits the interplays among modalities but also shortens the prompt length, thereby boosting efficiency.By designing a prompt template, we formulate product bundling as a multiple-choice question given candidate items. Furthermore, we adopt progressive optimization strategy to fine-tune the LLMs for disentangled objectives, achieving effective product bundling capability with comprehensive multimodal semantic understanding.Extensive experiments on four datasets from two application domains show that our approach outperforms a range of state-of-the-art (SOTA) methods. △ Less

Submitted 16 July, 2024; originally announced July 2024.

Comments: under review

arXiv:2407.11638 [pdf, other]

A Comprehensive Evaluation of Large Language Models on Temporal Event Forecasting

Authors: He Chang, Chenchen Ye, Zhulin Tao, Jie Wu, Zhengmao Yang, Yunshan Ma, Xianglin Huang, Tat-Seng Chua

Abstract: Recently, Large Language Models (LLMs) have demonstrated great potential in various data mining tasks, such as knowledge question answering, mathematical reasoning, and commonsense reasoning. However, the reasoning capability of LLMs on temporal event forecasting has been under-explored. To systematically investigate their abilities in temporal event forecasting, we conduct a comprehensive evaluat… ▽ More Recently, Large Language Models (LLMs) have demonstrated great potential in various data mining tasks, such as knowledge question answering, mathematical reasoning, and commonsense reasoning. However, the reasoning capability of LLMs on temporal event forecasting has been under-explored. To systematically investigate their abilities in temporal event forecasting, we conduct a comprehensive evaluation of LLM-based methods for temporal event forecasting. Due to the lack of a high-quality dataset that involves both graph and textual data, we first construct a benchmark dataset, named MidEast-TE-mini. Based on this dataset, we design a series of baseline methods, characterized by various input formats and retrieval augmented generation(RAG) modules. From extensive experiments, we find that directly integrating raw texts into the input of LLMs does not enhance zero-shot extrapolation performance. In contrast, incorporating raw texts in specific complex events and fine-tuning LLMs significantly improves performance. Moreover, enhanced with retrieval modules, LLM can effectively capture temporal relational patterns hidden in historical events. Meanwhile, issues such as popularity bias and the long-tail problem still persist in LLMs, particularly in the RAG-based method. These findings not only deepen our understanding of LLM-based event forecasting methods but also highlight several promising research directions.We consider that this comprehensive evaluation, along with the identified research opportunities, will significantly contribute to future research on temporal event forecasting through LLMs. △ Less

Submitted 16 July, 2024; originally announced July 2024.

arXiv:2407.11296 [pdf, other]

Modeling Foreground Spatial Variations in 21 cm Gaussian Process Component Separation

Authors: Kangning Diao, Richard D. P. Grumitt, Yi Mao

Abstract: Gaussian processes (GPs) have been extensively utilized as nonparametric models for component separation in 21 cm data analyses. This exploits the distinct spectral behavior of the cosmological and foreground signals, which are modeled through the GP covariance kernel. Previous approaches have employed a global GP kernel along all lines of sight (LoS). In this work, we study Bayesian approaches th… ▽ More Gaussian processes (GPs) have been extensively utilized as nonparametric models for component separation in 21 cm data analyses. This exploits the distinct spectral behavior of the cosmological and foreground signals, which are modeled through the GP covariance kernel. Previous approaches have employed a global GP kernel along all lines of sight (LoS). In this work, we study Bayesian approaches that allow for spatial variations in foreground kernel parameters, testing them against simulated HI intensity mapping observations. We consider a no-pooling (NP) model, which treats each LoS independently by fitting for separate covariance kernels, and a hierarchical Gaussian Process (HGP) model that allows for variation in kernel parameters between different LoS, regularized through a global hyperprior. We find that accounting for spatial variations in the GP kernel parameters results in a significant improvement in cosmological signal recovery, achieving up to a 30% reduction in the standard deviation of the residual distribution and improved model predictive performance. Allowing for spatial variations in GP kernel parameters also improves the recovery of the HI power spectra and wavelet scattering transform coefficients. Whilst the NP model achieves superior recovery as measured by the residual distribution, it demands extensive computational resources, faces formidable convergence challenges, and is prone to overfitting. Conversely, the HGP model strikes a balance between the accuracy and robustness of the signal recovery. Further improvements to the HGP model will require more physically motivated modeling of foreground spatial variations. △ Less

Submitted 15 July, 2024; originally announced July 2024.

Comments: 20 pages, 11 figures, 2 tables. Submitted to ApJ, comments welcome

arXiv:2407.11046 [pdf, other]

A Survey on LoRA of Large Language Models

Authors: Yuren Mao, Yuhang Ge, Yijiang Fan, Wenyi Xu, Yu Mi, Zhonghao Hu, Yunjun Gao

Abstract: Low-Rank Adaptation~(LoRA), which updates the dense neural network layers with pluggable low-rank matrices, is one of the best performed parameter efficient fine-tuning paradigms. Furthermore, it has significant advantages in cross-task generalization and privacy-preserving. Hence, LoRA has gained much attention recently, and the number of related literature demonstrates exponential growth. It is… ▽ More Low-Rank Adaptation~(LoRA), which updates the dense neural network layers with pluggable low-rank matrices, is one of the best performed parameter efficient fine-tuning paradigms. Furthermore, it has significant advantages in cross-task generalization and privacy-preserving. Hence, LoRA has gained much attention recently, and the number of related literature demonstrates exponential growth. It is necessary to conduct a comprehensive overview of the current progress on LoRA. This survey categorizes and reviews the progress from the perspectives of (1) downstream adaptation improving variants that improve LoRA's performance on downstream tasks; (2) cross-task generalization methods that mix multiple LoRA plugins to achieve cross-task generalization; (3) efficiency-improving methods that boost the computation-efficiency of LoRA; (4) data privacy-preserving methods that use LoRA in federated learning; (5) application. Besides, this survey also discusses the future directions in this field. △ Less

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.10956 [pdf, other]

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

Authors: Ruisheng Cao, Fangyu Lei, Haoyuan Wu, Jixuan Chen, Yeqiao Fu, Hongcheng Gao, Xinzhuang Xiong, Hanchong Zhang, Yuchen Mao, Wenjing Hu, Tianbao Xie, Hongshen Xu, Danyang Zhang, Sida Wang, Ruoxi Sun, Pengcheng Yin, Caiming Xiong, Ansong Ni, Qian Liu, Victor Zhong, Lu Chen, Kai Yu, Tao Yu

Abstract: Data science and engineering workflows often span multiple stages, from warehousing to orchestration, using tools like BigQuery, dbt, and Airbyte. As vision language models (VLMs) advance in multimodal understanding and code generation, VLM-based agents could potentially automate these workflows by generating SQL queries, Python code, and GUI operations. This automation can improve the productivit… ▽ More Data science and engineering workflows often span multiple stages, from warehousing to orchestration, using tools like BigQuery, dbt, and Airbyte. As vision language models (VLMs) advance in multimodal understanding and code generation, VLM-based agents could potentially automate these workflows by generating SQL queries, Python code, and GUI operations. This automation can improve the productivity of experts while democratizing access to large-scale data analysis. In this paper, we introduce Spider2-V, the first multimodal agent benchmark focusing on professional data science and engineering workflows, featuring 494 real-world tasks in authentic computer environments and incorporating 20 enterprise-level professional applications. These tasks, derived from real-world use cases, evaluate the ability of a multimodal agent to perform data-related tasks by writing code and managing the GUI in enterprise data software systems. To balance realistic simulation with evaluation simplicity, we devote significant effort to developing automatic configurations for task setup and carefully crafting evaluation metrics for each task. Furthermore, we supplement multimodal agents with comprehensive documents of these enterprise data software systems. Our empirical evaluation reveals that existing state-of-the-art LLM/VLM-based agents do not reliably automate full data workflows (14.0% success). Even with step-by-step guidance, these agents still underperform in tasks that require fine-grained, knowledge-intensive GUI actions (16.2%) and involve remote cloud-hosted workspaces (10.6%). We hope that Spider2-V paves the way for autonomous multimodal agents to transform the automation of data science and engineering workflow. Our code and data are available at https://spider2-v.github.io. △ Less

Submitted 15 July, 2024; originally announced July 2024.

Comments: 34 pages, 14 figures, 10 tables

arXiv:2407.10892 [pdf, other]

First Measurement of Solar $^8$B Neutrino Flux through Coherent Elastic Neutrino-Nucleus Scattering in PandaX-4T

Authors: PandaX Collaboration, Zihao Bo, Wei Chen, Xun Chen, Yunhua Chen, Zhaokan Cheng, Xiangyi Cui, Yingjie Fan, Deqing Fang, Zhixing Gao, Lisheng Geng, Karl Giboni, Xunan Guo, Xuyuan Guo, Zichao Guo, Chencheng Han, Ke Han, Changda He, Jinrong He, Di Huang, Houqi Huang, Junting Huang, Ruquan Hou, Yu Hou, Xiangdong Ji , et al. (77 additional authors not shown)

Abstract: The PandaX-4T liquid xenon detector at the China Jinping Underground Laboratory is used to measure the solar $^8$B neutrino flux by detecting neutrinos through coherent scattering with xenon nuclei. Data samples requiring the coincidence of scintillation and ionization signals (paired), as well as unpaired ionization-only signals (US2), are selected with energy threshold of approximately 1.1 keV (… ▽ More The PandaX-4T liquid xenon detector at the China Jinping Underground Laboratory is used to measure the solar $^8$B neutrino flux by detecting neutrinos through coherent scattering with xenon nuclei. Data samples requiring the coincidence of scintillation and ionization signals (paired), as well as unpaired ionization-only signals (US2), are selected with energy threshold of approximately 1.1 keV (0.33 keV) nuclear recoil energy. Combining the commissioning run and the first science run of PandaX-4T, a total exposure of 1.25 and 1.04 tonne$\cdot$year are collected for the paired and US2, respectively. After unblinding, 3 and 332 events are observed with an expectation of 2.8$\pm$0.5 and 251$\pm$32 background events, for the paired and US2 data, respectively. A combined analysis yields a best-fit $^8$B neutrino signal of 3.5 (75) events from the paired (US2) data sample, with $\sim$37\% uncertainty, and the background-only hypothesis is disfavored at 2.64$σ$ significance. This gives a solar $^8$B neutrino flux of ($8.4\pm3.1$)$\times$10$^6$ cm$^{-2}$s$^{-1}$, consistent with the standard solar model prediction. This is the first indication of solar $^8$B neutrino ``fog'' in a dark matter direct detection experiment. △ Less

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.10168 [pdf, other]

Black Holes and Covariance in Effective Quantum Gravity

Authors: Cong Zhang, Jerzy Lewandowski, Yongge Ma, Jinsong Yang

Abstract: This work addresses the longstanding issue of general covariance in spherically symmetric gravity, arising when canonical quantum gravity leads to a semiclassical model described by an effective Hamiltonian constraint. Applying the minimal requirements proposed to ensure general covariance, we derive the equations for the effective Hamiltonian constraint with the aid of imposing demands on Dirac o… ▽ More This work addresses the longstanding issue of general covariance in spherically symmetric gravity, arising when canonical quantum gravity leads to a semiclassical model described by an effective Hamiltonian constraint. Applying the minimal requirements proposed to ensure general covariance, we derive the equations for the effective Hamiltonian constraint with the aid of imposing demands on Dirac observables. The equations yield two candidates for effective Hamiltonian constraints dependent on a quantum parameter. The resulting quantum modified black hole spacetimes are analyzed. Particularly, our models show improvement by casting off the known limitations of previous works with similar results. △ Less

Submitted 14 July, 2024; originally announced July 2024.

Comments: 7 pages

arXiv:2407.10084 [pdf, other]

Part2Object: Hierarchical Unsupervised 3D Instance Segmentation

Authors: Cheng Shi, Yulin Zhang, Bin Yang, Jiajin Tang, Yuexin Ma, Sibei Yang

Abstract: Unsupervised 3D instance segmentation aims to segment objects from a 3D point cloud without any annotations. Existing methods face the challenge of either too loose or too tight clustering, leading to under-segmentation or over-segmentation. To address this issue, we propose Part2Object, hierarchical clustering with object guidance. Part2Object employs multi-layer clustering from points to object… ▽ More Unsupervised 3D instance segmentation aims to segment objects from a 3D point cloud without any annotations. Existing methods face the challenge of either too loose or too tight clustering, leading to under-segmentation or over-segmentation. To address this issue, we propose Part2Object, hierarchical clustering with object guidance. Part2Object employs multi-layer clustering from points to object parts and objects, allowing objects to manifest at any layer. Additionally, it extracts and utilizes 3D objectness priors from temporally consecutive 2D RGB frames to guide the clustering process. Moreover, we propose Hi-Mask3D to support hierarchical 3D object part and instance segmentation. By training Hi-Mask3D on the objects and object parts extracted from Part2Object, we achieve consistent and superior performance compared to state-of-the-art models in various settings, including unsupervised instance segmentation, data-efficient fine-tuning, and cross-dataset generalization. Code is release at https://github.com/ChengShiest/Part2Object △ Less

Submitted 14 July, 2024; originally announced July 2024.

Comments: Accept to ECCV2024

arXiv:2407.09833 [pdf, other]

LiveHPS++: Robust and Coherent Motion Capture in Dynamic Free Environment

Authors: Yiming Ren, Xiao Han, Yichen Yao, Xiaoxiao Long, Yujing Sun, Yuexin Ma

Abstract: LiDAR-based human motion capture has garnered significant interest in recent years for its practicability in large-scale and unconstrained environments. However, most methods rely on cleanly segmented human point clouds as input, the accuracy and smoothness of their motion results are compromised when faced with noisy data, rendering them unsuitable for practical applications. To address these lim… ▽ More LiDAR-based human motion capture has garnered significant interest in recent years for its practicability in large-scale and unconstrained environments. However, most methods rely on cleanly segmented human point clouds as input, the accuracy and smoothness of their motion results are compromised when faced with noisy data, rendering them unsuitable for practical applications. To address these limitations and enhance the robustness and precision of motion capture with noise interference, we introduce LiveHPS++, an innovative and effective solution based on a single LiDAR system. Benefiting from three meticulously designed modules, our method can learn dynamic and kinematic features from human movements, and further enable the precise capture of coherent human motions in open settings, making it highly applicable to real-world scenarios. Through extensive experiments, LiveHPS++ has proven to significantly surpass existing state-of-the-art methods across various datasets, establishing a new benchmark in the field. △ Less

Submitted 13 July, 2024; originally announced July 2024.

Comments: Accepted by ECCV 2024

arXiv:2407.09751 [pdf, other]

TASeg: Temporal Aggregation Network for LiDAR Semantic Segmentation

Authors: Xiaopei Wu, Yuenan Hou, Xiaoshui Huang, Binbin Lin, Tong He, Xinge Zhu, Yuexin Ma, Boxi Wu, Haifeng Liu, Deng Cai, Wanli Ouyang

Abstract: Training deep models for LiDAR semantic segmentation is challenging due to the inherent sparsity of point clouds. Utilizing temporal data is a natural remedy against the sparsity problem as it makes the input signal denser. However, previous multi-frame fusion algorithms fall short in utilizing sufficient temporal information due to the memory constraint, and they also ignore the informative tempo… ▽ More Training deep models for LiDAR semantic segmentation is challenging due to the inherent sparsity of point clouds. Utilizing temporal data is a natural remedy against the sparsity problem as it makes the input signal denser. However, previous multi-frame fusion algorithms fall short in utilizing sufficient temporal information due to the memory constraint, and they also ignore the informative temporal images. To fully exploit rich information hidden in long-term temporal point clouds and images, we present the Temporal Aggregation Network, termed TASeg. Specifically, we propose a Temporal LiDAR Aggregation and Distillation (TLAD) algorithm, which leverages historical priors to assign different aggregation steps for different classes. It can largely reduce memory and time overhead while achieving higher accuracy. Besides, TLAD trains a teacher injected with gt priors to distill the model, further boosting the performance. To make full use of temporal images, we design a Temporal Image Aggregation and Fusion (TIAF) module, which can greatly expand the camera FOV and enhance the present features. Temporal LiDAR points in the camera FOV are used as mediums to transform temporal image features to the present coordinate for temporal multi-modal fusion. Moreover, we develop a Static-Moving Switch Augmentation (SMSA) algorithm, which utilizes sufficient temporal information to enable objects to switch their motion states freely, thus greatly increasing static and moving training samples. Our TASeg ranks 1st on three challenging tracks, i.e., SemanticKITTI single-scan track, multi-scan track and nuScenes LiDAR segmentation track, strongly demonstrating the superiority of our method. Codes are available at https://github.com/LittlePey/TASeg. △ Less

Submitted 12 July, 2024; originally announced July 2024.

Comments: Accepted by CVPR 2024

arXiv:2407.09367 [pdf, other]

Reshaping the Online Data Buffering and Organizing Mechanism for Continual Test-Time Adaptation

Authors: Zhilin Zhu, Xiaopeng Hong, Zhiheng Ma, Weijun Zhuang, Yaohui Ma, Dai Yong, Yaowei Wang

Abstract: Continual Test-Time Adaptation (CTTA) involves adapting a pre-trained source model to continually changing unsupervised target domains. In this paper, we systematically analyze the challenges of this task: online environment, unsupervised nature, and the risks of error accumulation and catastrophic forgetting under continual domain shifts. To address these challenges, we reshape the online data bu… ▽ More Continual Test-Time Adaptation (CTTA) involves adapting a pre-trained source model to continually changing unsupervised target domains. In this paper, we systematically analyze the challenges of this task: online environment, unsupervised nature, and the risks of error accumulation and catastrophic forgetting under continual domain shifts. To address these challenges, we reshape the online data buffering and organizing mechanism for CTTA. We propose an {uncertainty-aware buffering approach} to identify {and aggregate} significant samples with high certainty from the unsupervised, single-pass data stream. {Based on this}, we propose a graph-based class relation preservation constraint to overcome catastrophic forgetting. Furthermore, a pseudo-target replay objective is used to mitigate error accumulation. Extensive experiments demonstrate the superiority of our method in both segmentation and classification CTTA tasks. Code is available at \href{https://github.com/z1358/OBAO}{this https URL}. △ Less

Submitted 12 July, 2024; originally announced July 2024.

Comments: This is the preprint version of our paper and supplemental material to appear in ECCV 2024

arXiv:2407.09317 [pdf, other]

Bonding states underpinning structural transitions in IrTe$_2$ observed with micro-ARPES

Authors: C. W. Nicholson, M. D. Watson, A. Pulkkinen, M. Rumo, G. Kremer, K. Y. Ma, F. O. von Rohr, C. Cacho, C. Monney

Abstract: Competing interactions in low-dimensional materials can produce nearly degenerate electronic and structural phases. We investigate the staircase of structural phase transitions in layered IrTe$_2$ for which a number of potential transition mechanisms have been postulated. The spatial coexistence of multiple phases on the micron scale has prevented a detailed analysis of the electronic structure. B… ▽ More Competing interactions in low-dimensional materials can produce nearly degenerate electronic and structural phases. We investigate the staircase of structural phase transitions in layered IrTe$_2$ for which a number of potential transition mechanisms have been postulated. The spatial coexistence of multiple phases on the micron scale has prevented a detailed analysis of the electronic structure. By exploiting micro-ARPES obtained with synchrotron radiation we extract the electronic structure of the multiple structural phases in IrTe$_2$ in order to address the mechanism underlying the phase transitions. We find direct evidence of lowered energy states that appear in the low-temperature phases, states previously predicted by \textit{ab initio} calculations and extended here. Our results validate a proposed scenario of bonding and anti-bonding states as the driver of the phase transitions. △ Less

Submitted 12 July, 2024; originally announced July 2024.

Comments: None

arXiv:2407.09139 [pdf, other]

Measurement of $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, A. Aloisio, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer , et al. (414 additional authors not shown)

Abstract: We report measurements of time-dependent $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays based on a data sample of $(388\pm6)\times10^6$ $B\bar{B}$ events collected at the $Υ(4S)$ resonance with the Belle II detector. The Belle II experiment operates at the SuperKEKB asymmetric-energy $e^+e^-$ collider. We measure decay-time distributions to determine $CP$-violating parameters $S$ and $C$. We det… ▽ More We report measurements of time-dependent $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays based on a data sample of $(388\pm6)\times10^6$ $B\bar{B}$ events collected at the $Υ(4S)$ resonance with the Belle II detector. The Belle II experiment operates at the SuperKEKB asymmetric-energy $e^+e^-$ collider. We measure decay-time distributions to determine $CP$-violating parameters $S$ and $C$. We determine these parameters for two ranges of $K^0_S π^0$ invariant mass: $m(K^0_S π^0)\in (0.8, 1.0)$ $GeV/c^2$, which is dominated by $B^0 \to K^{*0} (\to K^0_S π^0) γ$ decays, and a complementary region $m(K^0_S π^0)\in (0.6, 0.8)\cup(1.0, 1.8)$ $GeV/c^2$. Our results have improved precision as compared to previous measurements and are consistent with theory predictions. △ Less

Submitted 12 July, 2024; originally announced July 2024.

Comments: 10 pages, 4 figures

Report number: Belle II Preprint 2024-009, KEK Preprint 2024-1

arXiv:2407.09056 [pdf, other]

A Novel Quantum Realization of Jet Clustering in High-Energy Physics Experiments

Authors: Yongfeng Zhu, Weifeng Zhuang, Chen Qian, Yunheng Ma, Dong E. Liu, Manqi Ruan, Chen Zhou

Abstract: Exploring the application of quantum technologies to fundamental sciences holds the key to fostering innovation for both sides. In high-energy particle collisions, quarks and gluons are produced and immediately form collimated particle sprays known as jets. Accurate jet clustering is crucial as it retains the information of the originating quark or gluon and forms the basis for studying properties… ▽ More Exploring the application of quantum technologies to fundamental sciences holds the key to fostering innovation for both sides. In high-energy particle collisions, quarks and gluons are produced and immediately form collimated particle sprays known as jets. Accurate jet clustering is crucial as it retains the information of the originating quark or gluon and forms the basis for studying properties of the Higgs boson, which underlies teh mechanism of mass generation for subatomic particles. For the first time, by mapping collision events into graphs--with particles as nodes and their angular separations as edges--we realize jet clustering using the Quantum Approximate Optimization Algorithm (QAOA), a hybrid quantum-classical algorithm for addressing classical combinatorial optimization problems with available quantum resources. Our results, derived from 30 qubits on quantum computer simulator and 6 qubits on quantum computer hardware, demonstrate that jet clustering performance with QAOA is comparable with or even better than classical algorithms for a small-sized problem. This study highlights the feasibility of quantum computing to revolutionize jet clustering, bringing the practical application of quantum computing in high-energy physics experiments one step closer. △ Less

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.09001 [pdf]

Coupling multi-space topologies in 2D ferromagnetic lattice

Authors: Zhonglin He, Wenhui Du, Kaiying Dou, Ying Dai, Baibiao Huang, Yandong Ma

Abstract: Topology can manifest topological magnetism (e.g., skyrmion and bimeron) in real space and quantum anomalous Hall (QAH) state in momentum space, which have changed the modern conceptions of matter phase. While the topologies in different spaces are widely studied separately, their coexistence and coupling in single phase is seldomly explored. Here, we report a novel phenomenon that arises from the… ▽ More Topology can manifest topological magnetism (e.g., skyrmion and bimeron) in real space and quantum anomalous Hall (QAH) state in momentum space, which have changed the modern conceptions of matter phase. While the topologies in different spaces are widely studied separately, their coexistence and coupling in single phase is seldomly explored. Here, we report a novel phenomenon that arises from the interaction of topological magnetism and band topology, the multi-space topology, in 2D ferromagnetic lattice. Based on continuum theory and tight-binding model, we reveal that the interconnection between skyrmion/bimeron and QAH state generates distinctive localized chiral bound states (CBSs). With moderating topological magnetism through magnetic field, the multi-space topologies accompanied with different CBSs can be reversed, facilitating the coupling of multi-space topologies. By performing firstprinciples and atomic spin model simulations, we further demonstrate such multi-space topologies and their coupling in monolayer Cr2NSb. These results represent an important step towards the development of multispace topological phenomena in 2D lattice. △ Less

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.08984 [pdf, ps, other]

Measurement of branching fractions, CP asymmetry, and isospin asymmetry for $\boldsymbol{B\rightarrowργ}$ decays using Belle and Belle II data

Authors: Belle II Collaboration, I. Adachi, K. Adamczyk, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer , et al. (385 additional authors not shown)

Abstract: We present measurements of $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays using a combined data sample of $772 \times 10^6$ $B\overline{B}$ pairs collected by the Belle experiment and $387\times 10^6$ $B\overline{B}$ pairs collected by the Belle II experiment in $e^{+}e^{-}$ collisions at the $Υ(4S)$ resonance. After an optimized selection, a simultaneous fit to the Belle and Belle I… ▽ More We present measurements of $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays using a combined data sample of $772 \times 10^6$ $B\overline{B}$ pairs collected by the Belle experiment and $387\times 10^6$ $B\overline{B}$ pairs collected by the Belle II experiment in $e^{+}e^{-}$ collisions at the $Υ(4S)$ resonance. After an optimized selection, a simultaneous fit to the Belle and Belle II data sets yields $114\pm 12$ $B^{+}\rightarrowρ^{+}γ$ and $99\pm 12$ $B^{0}\rightarrowρ^{0}γ$ decays. The measured branching fractions are $(13.1^{+2.0 +1.3}_{-1.9 -1.2})\times 10^{-7}$ and $(7.5\pm 1.3^{+1.0}_{-0.8})\times 10^{-7}$ for $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays, respectively, where the first uncertainty is statistical and the second is systematic. We also measure the isospin asymmetry $A_{\rm I}(B\rightarrowργ)=(10.9^{+11.2 +7.8}_{-11.7 -7.3})\%$ and the direct CP asymmetry $A_{CP}(B^{+}\rightarrowρ^{+}γ)=(-8.2\pm 15.2^{+1.6}_{-1.2})\%$. △ Less

Submitted 12 July, 2024; originally announced July 2024.

Comments: 12 pages, 4 figures

Report number: Belle II Preprint 2023-019; KEK Preprint 2023-37

arXiv:2407.08949 [pdf, other]

One-Shot Pose-Driving Face Animation Platform

Authors: He Feng, Donglin Di, Yongjia Ma, Wei Chen, Tonghua Su

Abstract: The objective of face animation is to generate dynamic and expressive talking head videos from a single reference face, utilizing driving conditions derived from either video or audio inputs. Current approaches often require fine-tuning for specific identities and frequently fail to produce expressive videos due to the limited effectiveness of Wav2Pose modules. To facilitate the generation of one-… ▽ More The objective of face animation is to generate dynamic and expressive talking head videos from a single reference face, utilizing driving conditions derived from either video or audio inputs. Current approaches often require fine-tuning for specific identities and frequently fail to produce expressive videos due to the limited effectiveness of Wav2Pose modules. To facilitate the generation of one-shot and more consecutive talking head videos, we refine an existing Image2Video model by integrating a Face Locator and Motion Frame mechanism. We subsequently optimize the model using extensive human face video datasets, significantly enhancing its ability to produce high-quality and expressive talking head videos. Additionally, we develop a demo platform using the Gradio framework, which streamlines the process, enabling users to quickly create customized talking head videos. △ Less

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08948 [pdf, other]

Symmetry Awareness Encoded Deep Learning Framework for Brain Imaging Analysis

Authors: Yang Ma, Dongang Wang, Peilin Liu, Lynette Masters, Michael Barnett, Weidong Cai, Chenyu Wang

Abstract: The heterogeneity of neurological conditions, ranging from structural anomalies to functional impairments, presents a significant challenge in medical imaging analysis tasks. Moreover, the limited availability of well-annotated datasets constrains the development of robust analysis models. Against this backdrop, this study introduces a novel approach leveraging the inherent anatomical symmetrical… ▽ More The heterogeneity of neurological conditions, ranging from structural anomalies to functional impairments, presents a significant challenge in medical imaging analysis tasks. Moreover, the limited availability of well-annotated datasets constrains the development of robust analysis models. Against this backdrop, this study introduces a novel approach leveraging the inherent anatomical symmetrical features of the human brain to enhance the subsequent detection and segmentation analysis for brain diseases. A novel Symmetry-Aware Cross-Attention (SACA) module is proposed to encode symmetrical features of left and right hemispheres, and a proxy task to detect symmetrical features as the Symmetry-Aware Head (SAH) is proposed, which guides the pretraining of the whole network on a vast 3D brain imaging dataset comprising both healthy and diseased brain images across various MRI and CT. Through meticulous experimentation on downstream tasks, including both classification and segmentation for brain diseases, our model demonstrates superior performance over state-of-the-art methodologies, particularly highlighting the significance of symmetry-aware learning. Our findings advocate for the effectiveness of incorporating symmetry awareness into pretraining and set a new benchmark for medical imaging analysis, promising significant strides toward accurate and efficient diagnostic processes. Code is available at https://github.com/bitMyron/sa-swin. △ Less

Submitted 11 July, 2024; originally announced July 2024.

Comments: MICCAI 2024

ACM Class: I.2.10; I.4.10

arXiv:2407.08923 [pdf, other]

A Bistatic ISAC Framework for LEO Satellite Systems: A Rate-Splitting Approach

Authors: Juha Park, Jaehyup Seong, Jaehak Ryu, Yijie Mao, Wonjae Shin

Abstract: Aiming to achieve ubiquitous global connectivity and target detection on the same platform with improved spectral/energy efficiency and reduced onboard hardware cost, low Earth orbit (LEO) satellite systems capable of simultaneously performing communications and radar have attracted significant attention. Designing such a joint system should address not only the challenges of integrating two funct… ▽ More Aiming to achieve ubiquitous global connectivity and target detection on the same platform with improved spectral/energy efficiency and reduced onboard hardware cost, low Earth orbit (LEO) satellite systems capable of simultaneously performing communications and radar have attracted significant attention. Designing such a joint system should address not only the challenges of integrating two functions but also the unique propagation characteristics of the satellites. To overcome severe echo signal path loss due to the high altitude of the satellite, we put forth a bistatic integrated sensing and communication (ISAC) framework with a radar receiver separated from the satellite. For robust and effective interference management, we employ rate-splitting multiple access (RSMA), which splits and encodes users messages into private and common streams. We optimize the dual-functional precoders to maximize the minimum rate among all users while satisfying the Cramer-Rao bound (CRB) constraints. Given the challenge of acquiring instantaneous channel state information (iCSI) for LEO satellites, we exploit the geometrical and statistical characteristics of the satellite channel. To develop an efficient optimization algorithm, semidefinite relaxation (SDR), sequential rank-1 constraint relaxation (SROCR), and successive convex approximation (SCA) are utilized. Numerical results show that the proposed framework efficiently performs both communication and radar, demonstrating superior interference control capabilities. Furthermore, it is validated that the common stream plays three vital roles: i) beamforming towards the radar target, ii) interference management between communications and radar, and iii) interference management among communication users. △ Less

Submitted 11 July, 2024; originally announced July 2024.

Comments: 33 pages, 8 figures, 2 tables

arXiv:2407.08427 [pdf, other]

doi 10.1016/j.dark.2024.101568

Constraining Holographic Dark Energy and Analyzing Cosmological Tensions

Authors: Xin Tang, Yin-Zhe Ma, Wei-Ming Dai, Hong-Jian He

Abstract: We investigate cosmological constraints on the holographic dark energy (HDE) using the state-of-the-art cosmological datasets: Planck CMB angular power spectra and weak lensing power spectra, Atacama Cosmology Telescope (ACT) temperature power spectra, baryon acoustic oscillation (BAO) and redshift-space distortion (RSD) measurements from six-degree-field galaxy survey and Sloan Digital Sky Survey… ▽ More We investigate cosmological constraints on the holographic dark energy (HDE) using the state-of-the-art cosmological datasets: Planck CMB angular power spectra and weak lensing power spectra, Atacama Cosmology Telescope (ACT) temperature power spectra, baryon acoustic oscillation (BAO) and redshift-space distortion (RSD) measurements from six-degree-field galaxy survey and Sloan Digital Sky Survey (DR12 & DR16) and the Cepheids-Supernovae measurement from SH0ES team (R22). We also examine the HDE model and $Λ$CDM with and without $N_{\rm eff}$ (effective number of relativistic species) being treated as a free parameter. We find that the HDE model can relieve the tensions of $H_0$ and $S_8$ to certain degrees. With ``Planck+ACT+BAO+RSD'' datasets, the constraints are $H_0 = 69.70 \pm 1.39\ \mathrm{km\ s^{-1} Mpc^{-1}}$ and $S_8 = 0.823 \pm 0.011$ in HDE model, which brings down the Hubble tension down to $1.92σ$ confidence level (C.L.) and the $S_8$ tension to $1$-$2σ$ C.L. By adding the R22 data, their values are improved as $H_0 = 71.86 \pm 0.93 \,\mathrm{km\ s^{-1} Mpc^{-1}}$ and $S_8 = 0.813 \pm 0.010$, which further brings the Hubble tension down to $0.85σ$ C.L. and relieves the $S_{8}$ tension. We also quantify the goodness-of-fit of different models with Akaike information criterion (AIC) and Bayesian information criterion (BIC), and find that the HDE agrees with the observational data better than the $Λ$CDM and other extended models (treating $N_{\rm eff}$ as free for fitting). △ Less

Submitted 11 July, 2024; originally announced July 2024.

Comments: 11 pages, 7 figures, 5 tables

Journal ref: Physics of the Dark Universe 46 (2024) 101568

arXiv:2407.08126 [pdf, other]

Label-anticipated Event Disentanglement for Audio-Visual Video Parsing

Authors: Jinxing Zhou, Dan Guo, Yuxin Mao, Yiran Zhong, Xiaojun Chang, Meng Wang

Abstract: Audio-Visual Video Parsing (AVVP) task aims to detect and temporally locate events within audio and visual modalities. Multiple events can overlap in the timeline, making identification challenging. While traditional methods usually focus on improving the early audio-visual encoders to embed more effective features, the decoding phase -- crucial for final event classification, often receives less… ▽ More Audio-Visual Video Parsing (AVVP) task aims to detect and temporally locate events within audio and visual modalities. Multiple events can overlap in the timeline, making identification challenging. While traditional methods usually focus on improving the early audio-visual encoders to embed more effective features, the decoding phase -- crucial for final event classification, often receives less attention. We aim to advance the decoding phase and improve its interpretability. Specifically, we introduce a new decoding paradigm, \underline{l}abel s\underline{e}m\underline{a}ntic-based \underline{p}rojection (LEAP), that employs labels texts of event categories, each bearing distinct and explicit semantics, for parsing potentially overlapping events.LEAP works by iteratively projecting encoded latent features of audio/visual segments onto semantically independent label embeddings. This process, enriched by modeling cross-modal (audio/visual-label) interactions, gradually disentangles event semantics within video segments to refine relevant label embeddings, guaranteeing a more discriminative and interpretable decoding process. To facilitate the LEAP paradigm, we propose a semantic-aware optimization strategy, which includes a novel audio-visual semantic similarity loss function. This function leverages the Intersection over Union of audio and visual events (EIoU) as a novel metric to calibrate audio-visual similarities at the feature level, accommodating the varied event densities across modalities. Extensive experiments demonstrate the superiority of our method, achieving new state-of-the-art performance for AVVP and also enhancing the relevant audio-visual event localization task. △ Less

Submitted 10 July, 2024; originally announced July 2024.

Comments: Accepted by ECCV2024

arXiv:2407.07651 [pdf, other]

Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (645 additional authors not shown)

Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be… ▽ More The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes. △ Less

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07436 [pdf, other]

Alternating Subspace Approximate Message Passing

Authors: Xu Zhu, Yufei Ma, Xiaoguang Li, Tiejun Li

Abstract: Numerous renowned algorithms for tackling the compressed sensing problem employ an alternating strategy, which typically involves data matching in one module and denoising in another. Based on an in-depth analysis of the connection between the message passing and operator splitting, we present a novel approach, the Alternating Subspace Method (ASM), which intuitively combines the principles of the… ▽ More Numerous renowned algorithms for tackling the compressed sensing problem employ an alternating strategy, which typically involves data matching in one module and denoising in another. Based on an in-depth analysis of the connection between the message passing and operator splitting, we present a novel approach, the Alternating Subspace Method (ASM), which intuitively combines the principles of the greedy methods (e.g., the orthogonal matching pursuit type methods) and the splitting methods (e.g., the approximate message passing type methods). Essentially, ASM modifies the splitting method by achieving fidelity in a subspace-restricted fashion. We reveal that such confining strategy still yields a consistent fixed point iteration and establish its local geometric convergence on the lasso problem. Numerical experiments on both the lasso and channel estimation problems demonstrate its high convergence rate and its capacity to incorporate different prior distributions. Further theoretical analysis also demonstrates the advantage of the motivated message-passing splitting by incorporating quasi-variance degree of freedom even for the classical lasso optimization problem. Overall, the proposed method is promising in efficiency, accuracy and flexibility, which has the potential to be competitive in different sparse recovery applications. △ Less

Submitted 10 July, 2024; originally announced July 2024.

Comments: 19 pages, 6 figures

MSC Class: 94A12; 65F10; 90C06

arXiv:2407.07119 [pdf, other]

Optimal bias of utility function between two-layer network for the evolution of prosocial behavior in two-order game and higher-order game

Authors: Yihe Ma, Hui Zhang

Abstract: Cooperation is an important research object in economics, sociology, and biology, and the evolution of cooperation in structured populations is a interesting research topic. We mainly focus on the evolution of cooperation with two-order and higher-order game in two-layer network. We introduce a bias coefficient of utility function and study the influence of bias coefficient on the evolution of coo… ▽ More Cooperation is an important research object in economics, sociology, and biology, and the evolution of cooperation in structured populations is a interesting research topic. We mainly focus on the evolution of cooperation with two-order and higher-order game in two-layer network. We introduce a bias coefficient of utility function and study the influence of bias coefficient on the evolution of cooperation in two-layer network. We firstly provide theoretical analysis of fixation probabilities of two-order and higher-order game under weak selection in two-layer network.Secondly,based on the expression of fixation probability, we obtain the critical value of the two different games by comparing the size relationship of fixation probability under weak selection condition and neutral selection condition. Finally, by comparing the relationship between the critical value of single-layer and two-layer network in two-order game and higher-order game, when the nonlinear factor satisfies certain conditions, it is concluded that when the optimal bias coefficient tends towards 0 is met, some two-layer networks promote the evolution of cooperative behavior more than some single-layer networks. △ Less

Submitted 12 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

arXiv:2407.07053 [pdf, other]

Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model

Authors: Wenqi Zhang, Zhenglin Cheng, Yuanyu He, Mengna Wang, Yongliang Shen, Zeqi Tan, Guiyang Hou, Mingqian He, Yanna Ma, Weiming Lu, Yueting Zhuang

Abstract: Although most current large multimodal models (LMMs) can already understand photos of natural scenes and portraits, their understanding of abstract images, e.g., charts, maps, or layouts, and visual reasoning capabilities remains quite rudimentary. They often struggle with simple daily tasks, such as reading time from a clock, understanding a flowchart, or planning a route using a road map. In lig… ▽ More Although most current large multimodal models (LMMs) can already understand photos of natural scenes and portraits, their understanding of abstract images, e.g., charts, maps, or layouts, and visual reasoning capabilities remains quite rudimentary. They often struggle with simple daily tasks, such as reading time from a clock, understanding a flowchart, or planning a route using a road map. In light of this, we design a multi-modal self-instruct, utilizing large language models and their code capabilities to synthesize massive abstract images and visual reasoning instructions across daily scenarios. Our strategy effortlessly creates a multimodal benchmark with 11,193 instructions for eight visual scenarios: charts, tables, simulated maps, dashboards, flowcharts, relation graphs, floor plans, and visual puzzles. \textbf{This benchmark, constructed with simple lines and geometric elements, exposes the shortcomings of most advanced LMMs} like Claude-3.5-Sonnet and GPT-4o in abstract image understanding, spatial relations reasoning, and visual element induction. Besides, to verify the quality of our synthetic data, we fine-tune an LMM using 62,476 synthetic chart, table and road map instructions. The results demonstrate improved chart understanding and map navigation performance, and also demonstrate potential benefits for other visual reasoning tasks. Our code is available at: \url{https://github.com/zwq2018/Multi-modal-Self-instruct}. △ Less

Submitted 10 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

Comments: code: https://github.com/zwq2018/Multi-modal-Self-instruct dataset: https://huggingface.co/datasets/zwq2018/Multi-modal-Self-instruct Leaderboard: https://multi-modal-self-instruct.github.io/

arXiv:2407.06530 [pdf, ps, other]

RS-BNN: A Deep Learning Framework for the Optimal Beamforming Design of Rate-Splitting Multiple Access

Authors: Yiwen Wang, Yijie Mao, Sijie Ji

Abstract: Rate splitting multiple access (RSMA) relies on beamforming design for attaining spectral efficiency and energy efficiency gains over traditional multiple access schemes. While conventional optimization approaches such as weighted minimum mean square error (WMMSE) achieve suboptimal solutions for RSMA beamforming optimization, they are computationally demanding. A novel approach based on fractiona… ▽ More Rate splitting multiple access (RSMA) relies on beamforming design for attaining spectral efficiency and energy efficiency gains over traditional multiple access schemes. While conventional optimization approaches such as weighted minimum mean square error (WMMSE) achieve suboptimal solutions for RSMA beamforming optimization, they are computationally demanding. A novel approach based on fractional programming (FP) has unveiled the optimal beamforming structure (OBS) for RSMA. This method, combined with a hyperplane fixed point iteration (HFPI) approach, named FP-HFPI, provides suboptimal beamforming solutions with identical sum rate performance but much lower computational complexity compared to WMMSE. Inspired by such an approach, in this work, a novel deep unfolding framework based on FP-HFPI, named rate-splitting-beamforming neural network (RS-BNN), is proposed to unfold the FP-HFPI algorithm. Numerical results indicate that the proposed RS-BNN attains a level of performance closely matching that of WMMSE and FP-HFPI, while dramatically reducing the computational complexity. △ Less

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.06480 [pdf, other]

Vector meson's spin alignments in high energy reactions

Authors: Jin-Hui Chen, Zuo-Tang Liang, Yu-Gang Ma, Xin-Li Sheng, Qun Wang

Abstract: The global spin alignment of vector mesons has been observed by the STAR collaboration at the Relativistic Heavy Ion Collider (RHIC) at Brookhaven National Laboratory (BNL). It provides a unique opportunity to probe the correlation between the polarized quark and antiquark in the strongly coupled quark-gluon plasma (sQGP) produced in relativistic heavy ion collisions, opening a new window to explo… ▽ More The global spin alignment of vector mesons has been observed by the STAR collaboration at the Relativistic Heavy Ion Collider (RHIC) at Brookhaven National Laboratory (BNL). It provides a unique opportunity to probe the correlation between the polarized quark and antiquark in the strongly coupled quark-gluon plasma (sQGP) produced in relativistic heavy ion collisions, opening a new window to explore the properties of sQGP. In addition, spin alignments of vector mesons have also been observed in other high-energy particle collisions. The results seem to be strongly dependent on the hadronization mechanism, so comprehensive studies are needed.In this article, we present a brief review of theoretical and experimental advances in the study of vector meson's spin alignments in a variety of high-energy particle collisions, with emphasis on hadronization mechanisms. △ Less

Submitted 8 July, 2024; originally announced July 2024.

Comments: ReVTex 4.1, 15 pages, 7 figures

arXiv:2407.06135 [pdf, other]

ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation

Authors: Ethan Chern, Jiadi Su, Yan Ma, Pengfei Liu

Abstract: Previous open-source large multimodal models (LMMs) have faced several limitations: (1) they often lack native integration, requiring adapters to align visual representations with pre-trained large language models (LLMs); (2) many are restricted to single-modal generation; (3) while some support multimodal generation, they rely on separate diffusion models for visual modeling and generation. To mi… ▽ More Previous open-source large multimodal models (LMMs) have faced several limitations: (1) they often lack native integration, requiring adapters to align visual representations with pre-trained large language models (LLMs); (2) many are restricted to single-modal generation; (3) while some support multimodal generation, they rely on separate diffusion models for visual modeling and generation. To mitigate these limitations, we present Anole, an open, autoregressive, native large multimodal model for interleaved image-text generation. We build Anole from Meta AI's Chameleon, adopting an innovative fine-tuning strategy that is both data-efficient and parameter-efficient. Anole demonstrates high-quality, coherent multimodal generation capabilities. We have open-sourced our model, training framework, and instruction tuning data. △ Less

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.05553 [pdf, other]

A Color Image Analysis Tool to Help Users Choose a Makeup Foundation Color

Authors: Yafei Mao, Christopher Merkle, Jan P. Allebach

Abstract: This paper presents an approach to predict the color of skin-with-foundation based on a no makeup selfie image and a foundation shade image. Our approach first calibrates the image with the help of the color checker target, and then trains a supervised-learning model to predict the skin color. In the calibration stage, We propose to use three different transformation matrices to map the device dep… ▽ More This paper presents an approach to predict the color of skin-with-foundation based on a no makeup selfie image and a foundation shade image. Our approach first calibrates the image with the help of the color checker target, and then trains a supervised-learning model to predict the skin color. In the calibration stage, We propose to use three different transformation matrices to map the device dependent RGB response to the reference CIE XYZ space. In so doing, color correction error can be minimized. We then compute the average value of the region of interest in the calibrated images, and feed them to the prediction model. We explored both the linear regression and support vector regression models. Cross-validation results show that both models can accurately make the prediction. △ Less

Submitted 7 July, 2024; originally announced July 2024.

Comments: Color Imaging: Displaying, Processing, Hardcopy, and Applications, 373:1-373:6

arXiv:2407.05363 [pdf, other]

Multi-branch Collaborative Learning Network for 3D Visual Grounding

Authors: Zhipeng Qian, Yiwei Ma, Zhekai Lin, Jiayi Ji, Xiawu Zheng, Xiaoshuai Sun, Rongrong Ji

Abstract: 3D referring expression comprehension (3DREC) and segmentation (3DRES) have overlapping objectives, indicating their potential for collaboration. However, existing collaborative approaches predominantly depend on the results of one task to make predictions for the other, limiting effective collaboration. We argue that employing separate branches for 3DREC and 3DRES tasks enhances the model's capac… ▽ More 3D referring expression comprehension (3DREC) and segmentation (3DRES) have overlapping objectives, indicating their potential for collaboration. However, existing collaborative approaches predominantly depend on the results of one task to make predictions for the other, limiting effective collaboration. We argue that employing separate branches for 3DREC and 3DRES tasks enhances the model's capacity to learn specific information for each task, enabling them to acquire complementary knowledge. Thus, we propose the MCLN framework, which includes independent branches for 3DREC and 3DRES tasks. This enables dedicated exploration of each task and effective coordination between the branches. Furthermore, to facilitate mutual reinforcement between these branches, we introduce a Relative Superpoint Aggregation (RSA) module and an Adaptive Soft Alignment (ASA) module. These modules significantly contribute to the precise alignment of prediction results from the two branches, directing the module to allocate increased attention to key positions. Comprehensive experimental evaluation demonstrates that our proposed method achieves state-of-the-art performance on both the 3DREC and 3DRES tasks, with an increase of 2.05% in Acc@0.5 for 3DREC and 3.96% in mIoU for 3DRES. △ Less

Submitted 10 July, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

Comments: ECCV2024

arXiv:2407.05352 [pdf, other]

Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model

Authors: Danni Yang, Ruohan Dong, Jiayi Ji, Yiwei Ma, Haowei Wang, Xiaoshuai Sun, Rongrong Ji

Abstract: Recently, diffusion models have increasingly demonstrated their capabilities in vision understanding. By leveraging prompt-based learning to construct sentences, these models have shown proficiency in classification and visual grounding tasks. However, existing approaches primarily showcase their ability to perform sentence-level localization, leaving the potential for leveraging contextual inform… ▽ More Recently, diffusion models have increasingly demonstrated their capabilities in vision understanding. By leveraging prompt-based learning to construct sentences, these models have shown proficiency in classification and visual grounding tasks. However, existing approaches primarily showcase their ability to perform sentence-level localization, leaving the potential for leveraging contextual information for phrase-level understanding largely unexplored. In this paper, we utilize Panoptic Narrative Grounding (PNG) as a proxy task to investigate this capability further. PNG aims to segment object instances mentioned by multiple noun phrases within a given narrative text. Specifically, we introduce the DiffPNG framework, a straightforward yet effective approach that fully capitalizes on the diffusion's architecture for segmentation by decomposing the process into a sequence of localization, segmentation, and refinement steps. The framework initially identifies anchor points using cross-attention mechanisms and subsequently performs segmentation with self-attention to achieve zero-shot PNG. Moreover, we introduce a refinement module based on SAM to enhance the quality of the segmentation masks. Our extensive experiments on the PNG dataset demonstrate that DiffPNG achieves strong performance in the zero-shot PNG task setting, conclusively proving the diffusion model's capability for context-aware, phrase-level understanding. Source code is available at \url{https://github.com/nini0919/DiffPNG}. △ Less

Submitted 7 July, 2024; originally announced July 2024.

Comments: Accepted by ECCV2024

arXiv:2407.05332 [pdf, other]

Experimental investigation of direct non-Hermitian measurement and uncertainty relation towards high-dimensional quantum domain

Authors: Yi-Tao Wang, Zhao-An Wang, Zhi-Peng Li, Xiao-Dong Zeng, Jia-Ming Ren, Wei Liu, Yuan-Ze Yang, Nai-Jie Guo, Lin-Ke Xie, Jun-You Liu, Yu-Hang Ma, Jian-Shun Tang, Chengjie Zhang, Chuan-Feng Li, Guang-Can Guo

Abstract: Non-Hermitian dynamics in quantum systems have unveiled novel phenomena, yet the implementation of valid non-Hermitian quantum measurement remains a challenge, because a universal quantum projective mechanism on the complete but skewed non-Hermitian eigenstates is not explicit in experiment. This limitation hinders the direct acquisition of non-Hermitian observable statistics (e.g., non-Hermitian… ▽ More Non-Hermitian dynamics in quantum systems have unveiled novel phenomena, yet the implementation of valid non-Hermitian quantum measurement remains a challenge, because a universal quantum projective mechanism on the complete but skewed non-Hermitian eigenstates is not explicit in experiment. This limitation hinders the direct acquisition of non-Hermitian observable statistics (e.g., non-Hermitian population dynamics), also constrains investigations of non-Hermitian quantum measurement properties such as uncertainty relation. Here, we address these challenges by presenting a non-Hermitian projective protocol and investigating the non-Hermitian uncertainty relation. We derive the uncertainty relation for pseudo-Hermitian (PH) observables that is generalized beyond the Hermitian ones. We then investigate the projective properties of general quantum states onto complete non-Hermitian eigenvectors, and present a quantum simulating method to apply the valid non-Hermitian projective measurement on a direct-sum dilated space. Subsequently, we experimentally construct a quantum simulator in the quantum optical circuit and realize the 3-dimensional non-Hermitian quantum measurement on the single-photon qutrit. Employing this platform, we explore the uncertainty relation experimentally with different PH metrics. Our non-Hermitian quantum measurement method is state-independent and outputs directly the non-Hermitian quantum projective statistics, paving the way for studies of extensive non-Hermitian observable in quantum domain. △ Less

Submitted 7 July, 2024; originally announced July 2024.

Comments: 6 pages, 4 figures

arXiv:2407.05155 [pdf, other]

Wi-Fi Beyond Communications: Experimental Evaluation of Respiration Monitoring and Motion Detection Using COTS Devices

Authors: Jiuyu Liu, Yi Ma, Rahim Tafazolli

Abstract: Wi-Fi sensing has become an attractive option for non-invasive monitoring of human activities and vital signs. This paper explores the feasibility of using state-of-the-art commercial off-the-shelf (COTS) devices for Wi-Fi sensing applications, particularly respiration monitoring and motion detection. We utilize the Intel AX210 network interface card (NIC) to transmit Wi-Fi signals in both 2.4 GHz… ▽ More Wi-Fi sensing has become an attractive option for non-invasive monitoring of human activities and vital signs. This paper explores the feasibility of using state-of-the-art commercial off-the-shelf (COTS) devices for Wi-Fi sensing applications, particularly respiration monitoring and motion detection. We utilize the Intel AX210 network interface card (NIC) to transmit Wi-Fi signals in both 2.4 GHz and 6 GHz frequency bands. Our experiments rely on channel frequency response (CFR) and received signal strength indicator (RSSI) data, which are processed using a moving average algorithm to extract human behavior patterns. The experimental results demonstrate the effectiveness of our approach in capturing and representing human respiration and motion patterns. Furthermore, we compare the performance of Wi-Fi sensing across different frequency bands, highlighting the advantages of using higher frequencies for improved sensitivity and clarity. Our findings showcase the practicality of using COTS devices for Wi-Fi sensing and lay the groundwork for the development of non-invasive, contactless sensing systems. These systems have potential applications in various fields, including healthcare, smart homes, and Metaverse. △ Less

Submitted 6 July, 2024; originally announced July 2024.

Comments: This work has been accepted by IEEE ICCC Workshop 2024. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2407.05117 [pdf, ps, other]

Search for the baryon number and lepton number violating decays $τ^-\to Λπ^-$ and $τ^-\to \barΛπ^-$ at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien , et al. (349 additional authors not shown)

Abstract: We present a search for the baryon number $B$ and lepton number $L$ violating decays $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛ π^-$ produced from the $e^+e^-\to τ^+τ^-$ process, using a 364 fb$^{-1}$ data sample collected by the Belle~II experiment at the SuperKEKB collider. No evidence of signal is found in either decay mode, which have $|Δ(B-L)|$ equal to $2$ and $0$, respectively. Upper… ▽ More We present a search for the baryon number $B$ and lepton number $L$ violating decays $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛ π^-$ produced from the $e^+e^-\to τ^+τ^-$ process, using a 364 fb$^{-1}$ data sample collected by the Belle~II experiment at the SuperKEKB collider. No evidence of signal is found in either decay mode, which have $|Δ(B-L)|$ equal to $2$ and $0$, respectively. Upper limits at 90\% credibility level on the branching fractions of $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛπ^-$ are determined to be $4.7 \times 10^{-8}$ and $4.3 \times 10^{-8}$, respectively. △ Less

Submitted 6 July, 2024; originally announced July 2024.

Comments: 8 pages, 4 figures

Report number: Belle II Preprint 2024-020; KEK Preprint 2024-17

arXiv:2407.05001 [pdf, other]

Treatment effect estimation under covariate-adaptive randomization with heavy-tailed outcomes

Authors: Hongzi Li, Wei Ma, Yingying Ma, Hanzhong Liu

Abstract: Randomized experiments are the gold standard for investigating causal relationships, with comparisons of potential outcomes under different treatment groups used to estimate treatment effects. However, outcomes with heavy-tailed distributions pose significant challenges to traditional statistical approaches. While recent studies have explored these issues under simple randomization, their applicat… ▽ More Randomized experiments are the gold standard for investigating causal relationships, with comparisons of potential outcomes under different treatment groups used to estimate treatment effects. However, outcomes with heavy-tailed distributions pose significant challenges to traditional statistical approaches. While recent studies have explored these issues under simple randomization, their application in more complex randomization designs, such as stratified randomization or covariate-adaptive randomization, has not been adequately addressed. To fill the gap, this paper examines the properties of the estimated influence function-based M-estimator under covariate-adaptive randomization with heavy-tailed outcomes, demonstrating its consistency and asymptotic normality. Yet, the existing variance estimator tends to overestimate the asymptotic variance, especially under more balanced designs, and lacks universal applicability across randomization methods. To remedy this, we introduce a novel stratified transformed difference-in-means estimator to enhance efficiency and propose a universally applicable variance estimator to facilitate valid inferences. Additionally, we establish the consistency of kernel-based density estimation in the context of covariate-adaptive randomization. Numerical results demonstrate the effectiveness of the proposed methods in finite samples. △ Less

Submitted 6 July, 2024; originally announced July 2024.

arXiv:2407.04923 [pdf, other]

OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding

Authors: Tiancheng Zhao, Qianqian Zhang, Kyusong Lee, Peng Liu, Lu Zhang, Chunxin Fang, Jiajia Liao, Kelei Jiang, Yibo Ma, Ruochen Xu

Abstract: We introduce OmChat, a model designed to excel in handling long contexts and video understanding tasks. OmChat's new architecture standardizes how different visual inputs are processed, making it more efficient and adaptable. It uses a dynamic vision encoding process to effectively handle images of various resolutions, capturing fine details across a range of image qualities. OmChat utilizes an ac… ▽ More We introduce OmChat, a model designed to excel in handling long contexts and video understanding tasks. OmChat's new architecture standardizes how different visual inputs are processed, making it more efficient and adaptable. It uses a dynamic vision encoding process to effectively handle images of various resolutions, capturing fine details across a range of image qualities. OmChat utilizes an active progressive multimodal pretraining strategy, which gradually increases the model's capacity for long contexts and enhances its overall abilities. By selecting high-quality data during training, OmChat learns from the most relevant and informative data points. With support for a context length of up to 512K, OmChat demonstrates promising performance in tasks involving multiple images and videos, outperforming most open-source models in these benchmarks. Additionally, OmChat proposes a prompting strategy for unifying complex multimodal inputs including single image text, multi-image text and videos, and achieving competitive performance on single-image benchmarks. To further evaluate the model's capabilities, we proposed a benchmark dataset named Temporal Visual Needle in a Haystack. This dataset assesses OmChat's ability to comprehend temporal visual details within long videos. Our analysis highlights several key factors contributing to OmChat's success: support for any-aspect high image resolution, the active progressive pretraining strategy, and high-quality supervised fine-tuning datasets. This report provides a detailed overview of OmChat's capabilities and the strategies that enhance its performance in visual understanding. △ Less

Submitted 5 July, 2024; originally announced July 2024.

Comments: 14 pages

arXiv:2407.04418 [pdf, other]

Enabling On-Device LLMs Personalization with Smartphone Sensing

Authors: Shiquan Zhang, Ying Ma, Le Fang, Hong Jia, Simon D'Alfonso, Vassilis Kostakos

Abstract: This demo presents a novel end-to-end framework that combines on-device large language models (LLMs) with smartphone sensing technologies to achieve context-aware and personalized services. The framework addresses critical limitations of current personalization solutions via cloud-based LLMs, such as privacy concerns, latency and cost, and limited personal sensor data. To achieve this, we innovati… ▽ More This demo presents a novel end-to-end framework that combines on-device large language models (LLMs) with smartphone sensing technologies to achieve context-aware and personalized services. The framework addresses critical limitations of current personalization solutions via cloud-based LLMs, such as privacy concerns, latency and cost, and limited personal sensor data. To achieve this, we innovatively proposed deploying LLMs on smartphones with multimodal sensor data and customized prompt engineering, ensuring privacy and enhancing personalization performance through context-aware sensing. A case study involving a university student demonstrated the proposed framework's capability to provide tailored recommendations. In addition, we show that the proposed framework achieves the best trade-off in privacy, performance, latency, cost, battery and energy consumption between on-device and cloud LLMs. Future work aims to integrate more diverse sensor data and conduct large-scale user studies to further refine the personalization. We envision the proposed framework could significantly improve user experiences in various domains such as healthcare, productivity, and entertainment by providing secure, context-aware, and efficient interactions directly on users' devices. △ Less

Submitted 5 July, 2024; originally announced July 2024.

Comments: 5 pages, 3 figures, conference demo paper

arXiv:2407.03783 [pdf, other]

Evidence of $h_{b}(\text{2P}) \to Υ(\text{1S})η$ decay and search for $h_{b}(\text{1P,2P}) \to Υ(\text{1S})π^0$ with the Belle detector

Authors: Belle Collaboration, E. Kovalenko, I. Adachi, H. Aihara, D. M. Asner, T. Aushev, R. Ayad, V. Babu, Sw. Banerjee, K. Belous, J. Bennett, M. Bessner, T. Bilka, D. Biswas, A. Bobrov, D. Bodrov, A. Bondar, A. Bozek, M. Bračko, P. Branchini, T. E. Browder, A. Budano, M. Campajola, M. -C. Chang, B. G. Cheon , et al. (142 additional authors not shown)

Abstract: We report the first evidence for the $h_{b}(\text{2P}) \to Υ(\text{1S})η$ transition with a significance of $3.5$ standard deviations. The decay branching fraction is measured to be $\mathcal{B}[h_{b}(\text{2P}) \to Υ(\text{1S})η]=(7.1 ~^{+3.7} _{-3.2}\pm 0.8)\times10^{-3}$, which is noticeably smaller than expected. We also set upper limits on $π^0$ transitions of… ▽ More We report the first evidence for the $h_{b}(\text{2P}) \to Υ(\text{1S})η$ transition with a significance of $3.5$ standard deviations. The decay branching fraction is measured to be $\mathcal{B}[h_{b}(\text{2P}) \to Υ(\text{1S})η]=(7.1 ~^{+3.7} _{-3.2}\pm 0.8)\times10^{-3}$, which is noticeably smaller than expected. We also set upper limits on $π^0$ transitions of $\mathcal{B}[h_{b}(\text{2P}) \to Υ(\text{1S})π^0] < 1.8\times10^{-3}$, and $\mathcal{B}[h_{b}(\text{1P})\to Υ(\text{1S})π^0] < 1.8\times10^{-3}$, at the $90\%$ confidence level. These results are obtained with a $131.4$~fb$^{-1}$ data sample collected near the $Υ(\text{5S})$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+e^-$ collider. △ Less

Submitted 4 July, 2024; originally announced July 2024.

Comments: to be submitted to PRL

Report number: Belle Preprint 2024-03, KEK Preprint 2024-03

arXiv:2407.03061 [pdf, other]

ALTER: Augmentation for Large-Table-Based Reasoning

Authors: Han Zhang, Yuheng Ma, Hanfang Yang

Abstract: While extensive research has explored the use of large language models (LLMs) for table-based reasoning, most approaches struggle with scalability when applied to large tables. To maintain the superior comprehension abilities of LLMs in these scenarios, we introduce ALTER(Augmentation for Large-Table-Based Reasoning)-a framework designed to harness the latent augmentation potential in both free-fo… ▽ More While extensive research has explored the use of large language models (LLMs) for table-based reasoning, most approaches struggle with scalability when applied to large tables. To maintain the superior comprehension abilities of LLMs in these scenarios, we introduce ALTER(Augmentation for Large-Table-Based Reasoning)-a framework designed to harness the latent augmentation potential in both free-form natural language (NL) questions, via the query augmentor, and semi-structured tabular data, through the table augmentor. By utilizing only a small subset of relevant data from the table and supplementing it with pre-augmented schema, semantic, and literal information, ALTER achieves outstanding performance on table-based reasoning benchmarks. We also provide a detailed analysis of large-table scenarios, comparing different methods and various partitioning principles. In these scenarios, our method outperforms all other approaches and exhibits robustness and efficiency against perturbations. △ Less

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03050 [pdf, other]

Semantic-Aware Power Allocation for Generative Semantic Communications with Foundation Models

Authors: Chunmei Xu, Mahdi Boloursaz Mashhadi, Yi Ma, Rahim Tafazolli

Abstract: Recent advancements in diffusion models have made a significant breakthrough in generative modeling. The combination of the generative model and semantic communication (SemCom) enables high-fidelity semantic information exchange at ultra-low rates. A novel generative SemCom framework for image tasks is proposed, wherein pre-trained foundation models serve as semantic encoders and decoders for sema… ▽ More Recent advancements in diffusion models have made a significant breakthrough in generative modeling. The combination of the generative model and semantic communication (SemCom) enables high-fidelity semantic information exchange at ultra-low rates. A novel generative SemCom framework for image tasks is proposed, wherein pre-trained foundation models serve as semantic encoders and decoders for semantic feature extractions and image regenerations, respectively. The mathematical relationship between the transmission reliability and the perceptual quality of the regenerated image and the semantic values of semantic features are modeled, which are obtained by conducting numerical simulations on the Kodak dataset. We also investigate the semantic-aware power allocation problem, with the objective of minimizing the total power consumption while guaranteeing semantic performance. To solve this problem, two semanticaware power allocation methods are proposed by constraint decoupling and bisection search, respectively. Numerical results show that the proposed semantic-aware methods demonstrate superior performance compared to the conventional one in terms of total power consumption. △ Less

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.02935 [pdf, other]

Properties of the QCD Matter -- An Experimental Review of Selected Results from RHIC BES Program

Authors: Jinhui Chen, Xin Dong, Xionghong He, Huanzhong Huang, Feng Liu, Xiaofeng Luo, Yu-Gang Ma, Lijuan Ruan, Ming Shao, Shusu Shi, Xu Sun, Aihong Tang, Zebo Tang, Fuqiang Wang, Hai Wang, Yi Wang, Zhigang Xiao, Guannan Xie, Nu Xu, Qinghua Xu, Zhangbu Xu, Chi Yang, Shuai Yang, Wangmei Zha, Yapeng Zhang , et al. (3 additional authors not shown)

Abstract: In the paper, we discuss the development of the multi-gap resistive plate chamber Time-of-Flight (TOF) technology and the production of the STAR TOF detector in China at the beginning of the 21st century. Then we review recent experimental results from the first beam energy scan program (BES-I) at the Relativistic Heavy Ion Collider (RHIC). Topics cover measurements of collectivity, chirality, cri… ▽ More In the paper, we discuss the development of the multi-gap resistive plate chamber Time-of-Flight (TOF) technology and the production of the STAR TOF detector in China at the beginning of the 21st century. Then we review recent experimental results from the first beam energy scan program (BES-I) at the Relativistic Heavy Ion Collider (RHIC). Topics cover measurements of collectivity, chirality, criticality, global polarization, strangeness, heavy-flavor, di-lepton and light nuclei productions. △ Less

Submitted 3 July, 2024; originally announced July 2024.

Comments: 31 pages, 33 figures. This review is dedicated to Professor Wenqing Shen on the occasion to celebrate his leadership of the Chinese STAR Collaboration, the development and production of the STAR MRPC TOF detector in China and many physics analyses

arXiv:2407.02899 [pdf, other]

Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

Abstract: A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be… ▽ More A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed. △ Less

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.02131 [pdf, other]

Modeling the Nonlinear Power Spectrum in Low-redshift HI Intensity Mapping

Authors: Zhixing Li, Laura Wolz, Hong Guo, Steven Cunnington, Yi Mao

Abstract: We present a simulation-based framework to forecast the HI power spectrum on non-linear scales ($k\gtrsim 1\ {\rm Mpc^{-1}}$), as measured by interferometer arrays like MeerKAT in the low-redshift ($z\leq 1.0$) universe. Building on a galaxy-based HI mock catalog, we meticulously consider various factors, including the emission line profiles of HI discs and some observational settings, and explore… ▽ More We present a simulation-based framework to forecast the HI power spectrum on non-linear scales ($k\gtrsim 1\ {\rm Mpc^{-1}}$), as measured by interferometer arrays like MeerKAT in the low-redshift ($z\leq 1.0$) universe. Building on a galaxy-based HI mock catalog, we meticulously consider various factors, including the emission line profiles of HI discs and some observational settings, and explore their impacts on the HI power spectrum. While it is relatively insensitive to the profile shape of HI emission line at these scales, we identify a strong correlation with the profile width, that is, the Full Width at Half Maxima (FWHM, also known as $W_{\rm 50}$ in observations) in this work. By modeling the width function of $W_{50}$ as a function of $v_{\rm max}$, we assign each HI source a emission line profile and find that the resulting HI power spectrum is comparatively close to results from particles in the IllustrisTNG hydrodynamical simulation. After implementing $k$-space cuts matching the MeerKAT data, our prediction replicates the trend of the measurements obtained by MeerKAT at $z\approx 0.44$, though with a significantly lower amplitude. Utilizing a Monte Carlo Markov Chain sampling method, we constrain the parameter $A_{W_{\rm 50}}$ in the $W_{\rm 50}$ models and $Ω_{\rm HI}$ with the MeerKAT measurements and find that a strong degeneracy exists between these two parameters. △ Less

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.02034 [pdf, other]

TrAME: Trajectory-Anchored Multi-View Editing for Text-Guided 3D Gaussian Splatting Manipulation

Authors: Chaofan Luo, Donglin Di, Yongjia Ma, Zhou Xue, Chen Wei, Xun Yang, Yebin Liu

Abstract: Despite significant strides in the field of 3D scene editing, current methods encounter substantial challenge, particularly in preserving 3D consistency in multi-view editing process. To tackle this challenge, we propose a progressive 3D editing strategy that ensures multi-view consistency via a Trajectory-Anchored Scheme (TAS) with a dual-branch editing mechanism. Specifically, TAS facilitates a… ▽ More Despite significant strides in the field of 3D scene editing, current methods encounter substantial challenge, particularly in preserving 3D consistency in multi-view editing process. To tackle this challenge, we propose a progressive 3D editing strategy that ensures multi-view consistency via a Trajectory-Anchored Scheme (TAS) with a dual-branch editing mechanism. Specifically, TAS facilitates a tightly coupled iterative process between 2D view editing and 3D updating, preventing error accumulation yielded from text-to-image process. Additionally, we explore the relationship between optimization-based methods and reconstruction-based methods, offering a unified perspective for selecting superior design choice, supporting the rationale behind the designed TAS. We further present a tuning-free View-Consistent Attention Control (VCAC) module that leverages cross-view semantic and geometric reference from the source branch to yield aligned views from the target branch during the editing of 2D views. To validate the effectiveness of our method, we analyze 2D examples to demonstrate the improved consistency with the VCAC module. Further extensive quantitative and qualitative results in text-guided 3D scene editing indicate that our method achieves superior editing quality compared to state-of-the-art methods. We will make the complete codebase publicly available following the conclusion of the double-blind review process. △ Less

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.01634 [pdf, other]

Brownian thermal birefringent noise due to non-diagonal anisotropic photoelastic effect in multilayer coated mirrors

Authors: Yu-Pei Zhang, Shi-Xiang Yang, Wen-Hai Tan, Cheng-Gang Shao, Yiqiu Ma, Shan-Qing Yang

Abstract: Thermal noise in the mirror coatings limits the accuracy of today's most optical precision measurement experiments. Unlike the more commonly discussed thermal phase noise, the crystalline coating can generate thermal birefringent noise due to its anisotropic nature. In this study, we propose that the non-diagonal anisotropic photoelastic effect induced by the Brownian motion of mirror coating laye… ▽ More Thermal noise in the mirror coatings limits the accuracy of today's most optical precision measurement experiments. Unlike the more commonly discussed thermal phase noise, the crystalline coating can generate thermal birefringent noise due to its anisotropic nature. In this study, we propose that the non-diagonal anisotropic photoelastic effect induced by the Brownian motion of mirror coating layers may contribute to this noise. Employing a standard model for the coating surface, we calculate the spectrum of the non-diagonal anisotropic Brownian photoelastic(NABP) noise to be $1.2 \times 10^{-11} p_{63} f^{-1/2}/\rm{Hz}^{1/2}$. Further experiments are warranted to validate the influence of this effect and reduce its uncertainty. Our findings highlight that for high-precision experiments involving optical resonant cavities targeting signals imprinted in optical polarizations, this noise could emerge as a limiting factor for experimental sensitivity. △ Less

Submitted 30 June, 2024; originally announced July 2024.

Comments: 8 pages, 4 figures, Accepted by Physical Review D

arXiv:2407.01523 [pdf, other]

MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations

Authors: Yubo Ma, Yuhang Zang, Liangyu Chen, Meiqi Chen, Yizhu Jiao, Xinze Li, Xinyuan Lu, Ziyu Liu, Yan Ma, Xiaoyi Dong, Pan Zhang, Liangming Pan, Yu-Gang Jiang, Jiaqi Wang, Yixin Cao, Aixin Sun

Abstract: Understanding documents with rich layouts and multi-modal components is a long-standing and practical task. Recent Large Vision-Language Models (LVLMs) have made remarkable strides in various tasks, particularly in single-page document understanding (DU). However, their abilities on long-context DU remain an open problem. This work presents MMLongBench-Doc, a long-context, multi-modal benchmark co… ▽ More Understanding documents with rich layouts and multi-modal components is a long-standing and practical task. Recent Large Vision-Language Models (LVLMs) have made remarkable strides in various tasks, particularly in single-page document understanding (DU). However, their abilities on long-context DU remain an open problem. This work presents MMLongBench-Doc, a long-context, multi-modal benchmark comprising 1,062 expert-annotated questions. Distinct from previous datasets, it is constructed upon 130 lengthy PDF-formatted documents with an average of 49.4 pages and 20,971 textual tokens. Towards comprehensive evaluation, answers to these questions rely on pieces of evidence from (1) different sources (text, image, chart, table, and layout structure) and (2) various locations (i.e. page number). Moreover, 33.2% of the questions are cross-page questions requiring evidence across multiple pages. 22.8% of the questions are designed to be unanswerable for detecting potential hallucinations. Experiments on 14 LVLMs demonstrate that long-context DU greatly challenges current models. Notably, the best-performing model, GPT-4o, achieves an F1 score of only 42.7%, while the second-best, GPT-4V, scores 31.4%. Furthermore, 12 LVLMs (all except GPT-4o and GPT-4V) even present worse performance than their LLM counterparts which are fed with lossy-parsed OCR documents. These results validate the necessity of future research toward more capable long-context LVLMs. Project Page: https://mayubo2333.github.io/MMLongBench-Doc △ Less

Submitted 10 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

arXiv:2407.00965 [pdf, other]

Measurement of the integrated luminosity of data samples collected during 2019-2022 by the Belle II experiment

Authors: The Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, J. K. Ahn, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, M. Barrett, J. Baudot, A. Baur, A. Beaubien , et al. (382 additional authors not shown)

Abstract: A series of data samples was collected with the Belle II detector at the SuperKEKB collider from March 2019 to June 2022. We determine the integrated luminosities of these data samples using three distinct methodologies involving Bhabha ($e^+e^- \to e^+e^-(nγ)$), digamma ($e^+e^- \to γγ(nγ)$), and dimuon ($e^+e^- \to μ^+ μ^- (nγ)$) events. The total integrated luminosity obtained with Bhabha, diga… ▽ More A series of data samples was collected with the Belle II detector at the SuperKEKB collider from March 2019 to June 2022. We determine the integrated luminosities of these data samples using three distinct methodologies involving Bhabha ($e^+e^- \to e^+e^-(nγ)$), digamma ($e^+e^- \to γγ(nγ)$), and dimuon ($e^+e^- \to μ^+ μ^- (nγ)$) events. The total integrated luminosity obtained with Bhabha, digamma, and dimuon events is (426.52 $\pm$ 0.03 $\pm$ 2.48)~fb$^{-1}$, (427.32 $\pm$ 0.03 $\pm$ 2.56)~fb$^{-1}$, and (424.84 $\pm$ 0.04 $\pm$ 3.88)~fb$^{-1}$, where the first uncertainties are statistical and the second are systematic. The resulting total integrated luminosity obtained from the combination of the three methods is (426.88 $\pm$ 1.93)~fb$^{-1}$. △ Less

Submitted 1 July, 2024; originally announced July 2024.

Comments: 12 pages, 3 figures

Report number: Belle II Preprint 2024-019; KEK Preprint 2024-16

arXiv:2407.00879 [pdf, ps, other]

Study of $χ_{bJ}(2P)\toωΥ(1S)$ at Belle

Authors: Belle Collaboration, Z. S. Stottler, T. K. Pedlar, B. G. Fulsom, I. Adachi, K. Adamczyk, H. Aihara, S. Al Said, D. M. Asner, H. Atmacan, T. Aushev, R. Ayad, V. Babu, Sw. Banerjee, M. Bauer, P. Behera, K. Belous, J. Bennett, F. Bernlochner, M. Bessner, T. Bilka, D. Biswas, A. Bobrov, D. Bodrov, G. Bonvicini , et al. (157 additional authors not shown)

Abstract: We report a study of the hadronic transitions $χ_{bJ}(2P)\toωΥ(1S)$, with $ω\toπ^{+}π^{-}π^{0}$, using $28.2\times10^6~Υ(3S)$ mesons recorded by the Belle detector. We present the first evidence for the near--threshold transition $χ_{b0}(2P)\toωΥ(1S)$, the analog of the charm sector decay $χ_{c1}(3872)\toωJ/ψ$, with a branching fraction of… ▽ More We report a study of the hadronic transitions $χ_{bJ}(2P)\toωΥ(1S)$, with $ω\toπ^{+}π^{-}π^{0}$, using $28.2\times10^6~Υ(3S)$ mesons recorded by the Belle detector. We present the first evidence for the near--threshold transition $χ_{b0}(2P)\toωΥ(1S)$, the analog of the charm sector decay $χ_{c1}(3872)\toωJ/ψ$, with a branching fraction of $B\big(χ_{b0}(2P)\toωΥ(1S)\big) = \big(0.55\pm0.19\pm0.07\big)\%$. We also obtain branching fractions of $B\big(χ_{b1}(2P)\toωΥ(1S)\big) = \big(2.39{}^{+0.20}_{-0.19}\pm0.24\big)\%$ and $B\big(χ_{b2}(2P)\toωΥ(1S)\big) = \big(0.47{}^{+0.13}_{-0.12}\pm0.06\big)\%$, confirming the measurement of the $ω$ transitions of the $J=1,2~P$--wave states. The ratio for the $J=2$ to $J=1$ transitions is also measured and found to differ by 3.3 standard deviations from the expected value in the QCD multipole expansion. △ Less

Submitted 8 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

Comments: 6 pages, 2 figures

Report number: Belle Preprint: 2024-05; KEK Preprint: 2024-10

arXiv:2407.00737 [pdf, other]

LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation

Authors: Mushui Liu, Yuhang Ma, Xinfeng Zhang, Yang Zhen, Zeng Zhao, Zhipeng Hu, Bai Liu, Changjie Fan

Abstract: Diffusion Models have exhibited substantial success in text-to-image generation. However, they often encounter challenges when dealing with complex and dense prompts that involve multiple objects, attribute binding, and long descriptions. This paper proposes a framework called \textbf{LLM4GEN}, which enhances the semantic understanding ability of text-to-image diffusion models by leveraging the se… ▽ More Diffusion Models have exhibited substantial success in text-to-image generation. However, they often encounter challenges when dealing with complex and dense prompts that involve multiple objects, attribute binding, and long descriptions. This paper proposes a framework called \textbf{LLM4GEN}, which enhances the semantic understanding ability of text-to-image diffusion models by leveraging the semantic representation of Large Language Models (LLMs). Through a specially designed Cross-Adapter Module (CAM) that combines the original text features of text-to-image models with LLM features, LLM4GEN can be easily incorporated into various diffusion models as a plug-and-play component and enhances text-to-image generation. Additionally, to facilitate the complex and dense prompts semantic understanding, we develop a LAION-refined dataset, consisting of 1 million (M) text-image pairs with improved image descriptions. We also introduce DensePrompts which contains 7,000 dense prompts to provide a comprehensive evaluation for the text-to-image generation task. With just 10\% of the training data required by recent ELLA, LLM4GEN significantly improves the semantic alignment of SD1.5 and SDXL, demonstrating increases of 7.69\% and 9.60\% in color on T2I-CompBench, respectively. The extensive experiments on DensePrompts also demonstrate that LLM4GEN surpasses existing state-of-the-art models in terms of sample quality, image-text alignment, and human evaluation. The project website is at: \textcolor{magenta}{\url{https://xiaobul.github.io/LLM4GEN/}} △ Less

Submitted 30 June, 2024; originally announced July 2024.

Comments: 11 pages, 13 figures

Showing 1–50 of 5,477 results for author: Mao, Y