Measurement of the branching fraction of $D^+_s\to \ell^+ν_\ell$ via $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (634 additional authors not shown)

Abstract: Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and $\mathcal{B}(D_s^+\toτ^+ν_τ)=(\bftauv)\%$, respectively. The product of the decay constant and Cabibbo-Kobayashi-Maskawa matrix element $|V_{cs}|$ is determined to be $f_{D_s^+}|V_{cs}|=(\mufdsxvcsresult)_{μν}~\mathrm{MeV}$ and $f_{D_s^+}|V_{cs}|=(\taufdsxvcsresult)_{τν}~\mathrm{MeV}$, respectively. Taking the value of $|V_{cs}|$ from a global fit in the Standard Model, we obtain ${f_{D^+_s}}=(\mufdsresult)_{μν}$\,MeV and ${f_{D^+_s}}=(\taufdsresult)_{τν}$\,MeV, respectively. Conversely, taking the value for $f_{D_s^+}$ from the latest lattice quantum chromodynamics calculation, we obtain $|V_{cs}| =(\muvcsresult)_{μν}$ and $|V_{cs}| = (\tauvcsresult)_{τν}$, respectively.
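
For context, the abstract does not spell out how a branching fraction is turned into $f_{D_s^+}|V_{cs}|$. A sketch, assuming the standard lowest-order Standard Model expression for the leptonic width (the symbols $G_F$ for the Fermi constant, $m_\ell$ and $m_{D_s^+}$ for the lepton and meson masses, and $τ_{D_s^+}$ for the meson lifetime are introduced here and are not quoted in the abstract), is:

$$
\Gamma(D_s^+\to\ell^+ν_\ell)=\frac{G_F^2}{8\pi}\,f_{D_s^+}^2\,|V_{cs}|^2\,m_\ell^2\,m_{D_s^+}\left(1-\frac{m_\ell^2}{m_{D_s^+}^2}\right)^2,
\qquad
\mathcal{B}(D_s^+\to\ell^+ν_\ell)=\frac{τ_{D_s^+}}{\hbar}\,\Gamma(D_s^+\to\ell^+ν_\ell).
$$

The factor $m_\ell^2$ reflects helicity suppression, which is why the $τ^+ν_τ$ mode has a larger branching fraction than the $μ^+ν_μ$ mode. Radiative corrections that an analysis may apply are not included in this expression.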

Submitted 16 July, 2024; originally announced July 2024.

Comments: 27 pages, 13 figures

arXiv:2407.11702 [pdf, ps, other]

Dynamics for a diffusive epidemic model with a free boundary: sharp asymptotic profile

Authors: Xueping Li, Lei Li, Mingxin Wang

Abstract: This paper concerns the sharp asymptotic profiles of the solution of a diffusive epidemic model with one free boundary and one fixed boundary which is subject to the homogeneous Dirichlet boundary condition and Neumann boundary condition, respectively. The long-time behavior has been proved to be governed by a spreading-vanishing dichotomy in \cite{LL}, and when spreading happens, the spreading speed is determined in \cite{LLW}. In this paper, by constructing some subtle upper and lower solutions, as well as employing some detailed analysis, we improve the results in \cite{LLW} and obtain the sharp asymptotic spreading profiles, which show that the homogeneous Dirichlet boundary condition and Neumann boundary condition imposed at the fixed boundary $x=0$ lead to the same asymptotic behaviors of $h(t)$ and $(u,v)$ near the spreading front $h(t)$.

Submitted 16 July, 2024; originally announced July 2024.

arXiv:2407.11683 [pdf, other]

Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning

Authors: Yunbin Tu, Liang Li, Li Su, Chenggang Yan, Qingming Huang

Abstract: Change captioning aims to succinctly describe the semantic change between a pair of similar images, while being immune to distractors (illumination and viewpoint changes). Under these distractors, unchanged objects often exhibit pseudo changes in location and scale, and certain objects might overlap others, resulting in perturbational and discrimination-degraded features between two images. However, most existing methods directly capture the difference between them, which risks obtaining error-prone difference features. In this paper, we propose a distractors-immune representation learning network that correlates the corresponding channels of two image representations and decorrelates different ones in a self-supervised manner, thus attaining a pair of stable image representations under distractors. Then, the model can better interact them to capture the reliable difference features for caption generation. To yield words based on the most related difference features, we further design a cross-modal contrastive regularization, which regularizes the cross-modal alignment by maximizing the contrastive alignment between the attended difference features and generated words. Extensive experiments show that our method outperforms the state-of-the-art methods on four public datasets. The code is available at https://github.com/tuyunbin/DIRL.

Submitted 16 July, 2024; originally announced July 2024.

Comments: Accepted by ECCV 2024

arXiv:2407.11541 [pdf, other]

Uniformly Accelerated Motion Model for Inter Prediction

Authors: Zhuoyuan Li, Yao Li, Chuanbo Tang, Li Li, Dong Liu, Feng Wu

Abstract: Inter prediction is a key technology to reduce the temporal redundancy in video coding. In natural videos, there are usually multiple moving objects with variable velocity, resulting in complex motion fields that are difficult to represent compactly. In Versatile Video Coding (VVC), existing inter prediction methods usually assume uniform speed motion between consecutive frames and use linear models for motion estimation (ME) and motion compensation (MC), which may not handle the complex motion fields in the real world well. To address these issues, we introduce a uniformly accelerated motion model (UAMM) to exploit motion-related elements (velocity, acceleration) of moving objects between the video frames, and further combine them to assist the inter prediction methods to handle the variable motion in the temporal domain. Specifically, first, the theory of UAMM is introduced. Second, based on that, we propose the UAMM-based parameter derivation and extrapolation schemes in the coding process. Third, we integrate the UAMM into existing inter prediction modes (Merge, MMVD, CIIP) to achieve higher prediction accuracy. The proposed method is implemented in the VVC reference software, VTM version 12.0. Experimental results show that the proposed method achieves up to 0.38% and on average 0.13% BD-rate reduction compared to the VTM anchor, under the Low-delay P configuration, with a slight increase of time complexity on the encoding/decoding side.
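
To illustrate the difference between a uniform-speed and a uniformly accelerated motion assumption, here is a minimal one-dimensional sketch. It is not the paper's parameter derivation or its integration into VVC; the function names and the three-frame setup are hypothetical and rely only on textbook constant-acceleration kinematics.

```python
def fit_uniform_acceleration(p0, p1, p2):
    """Fit p(t) = p0 + v0*t + 0.5*a*t**2 through positions at t = 0, 1, 2.

    Time is measured in frame intervals. Returns (v0, a).
    Hypothetical helper for illustration only.
    """
    a = p2 - 2.0 * p1 + p0          # second finite difference
    v0 = (p1 - p0) - 0.5 * a        # initial velocity consistent with a
    return v0, a


def extrapolate_accelerated(p0, v0, a, t):
    """Position at time t under the constant-acceleration model."""
    return p0 + v0 * t + 0.5 * a * t * t


def extrapolate_linear(p1, p2):
    """Next position under the uniform-speed assumption (last displacement repeats)."""
    return p2 + (p2 - p1)


# An object whose position follows 1 + 2t + 2t^2 at frames t = 0, 1, 2.
positions = [1.0, 5.0, 13.0]
v0, a = fit_uniform_acceleration(*positions)
accelerated_guess = extrapolate_accelerated(positions[0], v0, a, 3)
linear_guess = extrapolate_linear(positions[1], positions[2])
```

In this toy case the accelerated model recovers the true next position (25.0), while the linear model underestimates it (21.0), which is the kind of mismatch the abstract attributes to uniform-speed assumptions.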

Submitted 16 July, 2024; originally announced July 2024.

Comments: 5 pages, 4 figures

arXiv:2407.11474 [pdf, other]

Search for the rare $Λ_c^+ \to p μ^+ μ^-$ decay

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1062 additional authors not shown)

Abstract: A search for the nonresonant $Λ_c^+ \to p μ^+ μ^-$ decay is performed using proton-proton collision data recorded at a centre-of-mass energy of 13 TeV by the LHCb experiment, corresponding to an integrated luminosity of 5.4 fb$^{-1}$. No evidence for the decay is found in the dimuon invariant-mass regions where the expected contribution of resonances is subdominant. The upper limit on the branching fraction of the $Λ_c^+ \to p μ^+ μ^-$ decay is determined to be $2.9~(3.2) \times 10^{-8}$ at 90% (95%) confidence level. The branching fractions in the dimuon invariant-mass regions dominated by the $η$, $ρ$ and $ω$ resonances are also determined.

Submitted 16 July, 2024; originally announced July 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-005.html (LHCb public pages)

Report number: LHCb-PAPER-2024-005, CERN-EP-2024-158

arXiv:2407.11466 [pdf, other]

Navigating the Data Trading Crossroads: An Interdisciplinary Survey

Authors: Yi Yu, Jingru Yu, Xuhong Wang, Juanjuan Li, Yilun Lin, Conghui He, Yanqing Yang, Yu Qiao, Li Li, Fei-Yue Wang

Abstract: Data has been increasingly recognized as a critical factor in the future economy. However, constructing an efficient data trading market faces challenges such as privacy breaches, data monopolies, and misuse. Despite numerous studies proposing algorithms to protect privacy and methods for pricing data, a comprehensive understanding of these issues and systemic solutions remain elusive. This paper provides an extensive review and evaluation of data trading research, aiming to identify existing problems and research gaps and to propose potential solutions. We categorize the challenges into three main areas: Compliance Challenges, Collateral Consequences, and Costly Transactions (the "3C problems"), all stemming from ambiguity in data rights. Through a quantitative analysis of the literature, we observe a paradigm shift from isolated solutions to integrated approaches. Addressing the unresolved issue of right ambiguity, we introduce the novel concept of "data usufruct," which allows individuals to use and benefit from data they do not own. This concept helps reframe data as a more conventional factor of production and aligns it with established economic theories, paving the way for a comprehensive framework of research theories, technical tools, and platforms. We hope this survey provides valuable insights and guidance for researchers, practitioners, and policymakers, thereby contributing to digital economy advancements.

Submitted 16 July, 2024; originally announced July 2024.

arXiv:2407.11341 [pdf, other]

SN 2021dbg: A Luminous Type IIP-IIL Supernova Exploding from a Massive Star with a Layered Shell

Authors: Zeyi Zhao, Jujia Zhang, Liping Li, Qian Zhai, Yongzhi Cai, Shubham Srivastav, Xiaofeng Wang, Han Lin, Yi Yang, Alexei V. Filippenko, Thomas G. Brink, WeiKang Zheng

Abstract: We present extensive observations and analysis of supernova (SN) 2021dbg, utilizing optical photometry and spectroscopy. For approximately 385 days following the explosion, SN 2021dbg exhibited remarkable luminosity, surpassing most SNe II. This initial high luminosity is potentially attributed to the interaction between the ejected material and the surrounding circumstellar material (CSM), as evidenced by the pronounced interaction signatures observed in its spectra. The subsequent high luminosity is primarily due to the significant $^{56}$Ni ($0.17 \pm 0.05$ M$_{\odot}$) produced in the explosion. Based on the flux of flash emission lines detected in the initial spectra, we estimate that the CSM mass near the progenitor amounted to $\sim$(1.0--2.0) $\times 10^{-3}$ M$_{\odot}$, likely resulting from intense stellar wind activity 2--3 yr preceding the explosion. Considering the bolometric light curve, nebular spectrum modeling, and mass-loss rate, we suggest that the progenitor of SN 2021dbg was a red supergiant (RSG) with a mass of $\sim 20$ M$_{\odot}$ and a radius of 1200 R$_{\odot}$. This RSG featured a thick hydrogen shell, which may have contained a region with a sharp decrease in material density, electron density, and temperature, contributing to its layered structure. This object demonstrates mixed features of SNe IIP and SNe IIL, making it a transitional event linking the above two subclasses of SNe II.

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.11100 [pdf, other]

Building Intelligence Identification System via Large Language Model Watermarking: A Survey and Beyond

Authors: Xuhong Wang, Haoyu Jiang, Yi Yu, Jingru Yu, Yilun Lin, Ping Yi, Yingchun Wang, Qiao Yu, Li Li, Fei-Yue Wang

Abstract: Large Language Models (LLMs) are increasingly integrated into diverse industries, posing substantial security risks due to unauthorized replication and misuse. To mitigate these concerns, robust identification mechanisms are widely acknowledged as an effective strategy. Identification systems for LLMs now rely heavily on watermarking technology to manage and protect intellectual property and ensure data security. However, previous studies have primarily concentrated on the basic principles of algorithms and lacked a comprehensive analysis of watermarking theory and practice from the perspective of intelligent identification. To bridge this gap, firstly, we explore how a robust identity recognition system can be effectively implemented and managed within LLMs by various participants using watermarking technology. Secondly, we propose a mathematical framework based on mutual information theory, which systematizes the identification process to achieve more precise and customized watermarking. Additionally, we present a comprehensive evaluation of performance metrics for LLM watermarking, reflecting participant preferences and advancing discussions on its identification applications. Lastly, we outline the existing challenges in current watermarking technologies and theoretical frameworks, and provide directional guidance to address these challenges. Our systematic classification and detailed exposition aim to enhance the comparison and evaluation of various methods, fostering further research and development toward a transparent, secure, and equitable LLM ecosystem.

Submitted 15 July, 2024; originally announced July 2024.

Comments: 59 pages, 7 figures

arXiv:2407.10937 [pdf, other]

IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation

Authors: Yuanhao Zhai, Kevin Lin, Linjie Li, Chung-Ching Lin, Jianfeng Wang, Zhengyuan Yang, David Doermann, Junsong Yuan, Zicheng Liu, Lijuan Wang

Abstract: Significant advances have been made in human-centric video generation, yet the joint video-depth generation problem remains underexplored. Most existing monocular depth estimation methods may not generalize well to synthesized images or videos, and multi-view-based methods have difficulty controlling the human appearance and motion. In this work, we present IDOL (unIfied Dual-mOdal Latent diffusion) for high-quality human-centric joint video-depth generation. Our IDOL consists of two novel designs. First, to enable dual-modal generation and maximize the information exchange between video and depth generation, we propose a unified dual-modal U-Net, a parameter-sharing framework for joint video and depth denoising, wherein a modality label guides the denoising target, and cross-modal attention enables the mutual information flow. Second, to ensure a precise video-depth spatial alignment, we propose a motion consistency loss that enforces consistency between the video and depth feature motion fields, leading to harmonized outputs. Additionally, a cross-attention map consistency loss is applied to align the cross-attention map of the video denoising with that of the depth denoising, further facilitating spatial alignment. Extensive experiments on the TikTok and NTU120 datasets show our superior performance, significantly surpassing existing methods in terms of video FVD and depth accuracy.

Submitted 15 July, 2024; originally announced July 2024.

Comments: ECCV 2024; project page: https://yhzhai.github.io/idol/

arXiv:2407.10926 [pdf, other]

In-Loop Filtering via Trained Look-Up Tables

Authors: Zhuoyuan Li, Jiacheng Li, Yao Li, Li Li, Dong Liu, Feng Wu

Abstract: In-loop filtering (ILF) is a key technology for removing the artifacts in image/video coding standards. Recently, neural network-based in-loop filtering methods have achieved remarkable coding gains beyond the capability of advanced video coding standards, making them a powerful coding tool candidate for future video coding standards. However, the use of deep neural networks brings heavy time and computational complexity and demands high-performance hardware, which makes these methods challenging to apply in general coding scenarios. To address this limitation, inspired by explorations in image restoration, we propose an efficient and practical in-loop filtering scheme by adopting the Look-up Table (LUT). We train the DNN of in-loop filtering within a fixed filtering reference range, and cache the output values of the DNN into a LUT via traversing all possible inputs. At testing time in the coding process, the filtered pixel is generated by locating input pixels (to-be-filtered pixel with reference pixels) and interpolating cached filtered pixel values. To further enable the large filtering reference range with the limited storage cost of LUT, we introduce the enhanced indexing mechanism in the filtering process, and clipping/finetuning mechanism in the training. The proposed method is implemented in the Versatile Video Coding (VVC) reference software, VTM-11.0. Experimental results show that the ultrafast, very fast, and fast modes of the proposed method achieve on average 0.13%/0.34%/0.51% and 0.10%/0.27%/0.39% BD-rate reduction under the all intra (AI) and random access (RA) configurations, respectively. In particular, our method has modest time and computational complexity, with only 101%/102%-104%/108% time increase with 0.13-0.93 kMACs/pixel, and only 164-1148 KB storage cost for a single model. Our solution may shed light on the journey of practical neural network-based coding tool evolution.
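
The cache-then-interpolate idea described in the abstract can be sketched in a heavily simplified form. The example below is an assumption-laden toy: it uses a single 8-bit input instead of a to-be-filtered pixel with reference pixels, a plain function as a stand-in for the trained DNN, and hypothetical names (`build_lut`, `lut_filter`, `toy_filter`). It does not reproduce the paper's indexing, clipping, or finetuning mechanisms.

```python
def build_lut(filter_fn, step=16, max_value=255):
    """Cache filter_fn at every `step`-th input level.

    Sampling at 0, step, 2*step, ... up to max_value + 1 keeps the table
    small while leaving one entry past the top of the range for interpolation.
    """
    n_entries = (max_value + 1) // step + 1
    return [filter_fn(i * step) for i in range(n_entries)]


def lut_filter(lut, pixel, step=16):
    """Approximate filter_fn(pixel) by linear interpolation between cached entries."""
    idx, rem = divmod(pixel, step)
    frac = rem / step
    return (1.0 - frac) * lut[idx] + frac * lut[idx + 1]


def toy_filter(x):
    """Stand-in for an expensive trained network (hypothetical)."""
    return 0.9 * x + 5.0


lut = build_lut(toy_filter)          # 17 cached values instead of 256 evaluations
filtered = lut_filter(lut, 100)      # looked up and interpolated, no network call
```

Because the stand-in filter is linear, interpolation is exact here; for a real nonlinear network the sampling step trades storage against approximation error, which is presumably why the abstract reports several speed/size modes.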

Submitted 15 July, 2024; originally announced July 2024.

Comments: 11 pages, 6 figures

arXiv:2407.10205 [pdf, other]

Parallel Ising Annealer via Gradient-based Hamiltonian Monte Carlo

Authors: Hao Wang, Zixuan Liu, Zhixin Xie, Langyu Li, Zibo Miao, Wei Cui, Yu Pan

Abstract: The Ising annealer is a promising quantum-inspired computing architecture for combinatorial optimization problems. In this paper, we introduce an Ising annealer based on the Hamiltonian Monte Carlo, which updates the variables of all dimensions in parallel. The main innovation is the fusion of an approximate gradient-based approach into the Ising annealer, which introduces significant acceleration and allows a portable and scalable implementation on a commercial FPGA. Comprehensive simulation and hardware experiments show that the proposed Ising annealer has promising performance and scalability on all types of benchmark problems when compared to other Ising annealers including the state-of-the-art hardware. In particular, we have built a prototype annealer which solves Ising problems of both integer and fractional coefficients with up to 200 spins on a single low-cost FPGA board, whose performance is demonstrated to be better than the state-of-the-art quantum hardware D-Wave 2000Q and similar to the expensive coherent Ising machine. The sub-linear scalability of the annealer signifies its potential in solving challenging combinatorial optimization problems and evaluating the advantage of quantum hardware.

Submitted 14 July, 2024; originally announced July 2024.

arXiv:2407.10159 [pdf, other]

RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for 3D LiDAR Segmentation

Authors: Li Li, Hubert P. H. Shum, Toby P. Breckon

Abstract: 3D point clouds play a pivotal role in outdoor scene perception, especially in the context of autonomous driving. Recent advancements in 3D LiDAR segmentation often focus intensely on the spatial positioning and distribution of points for accurate segmentation. However, these methods, while robust in variable conditions, encounter challenges due to sole reliance on coordinates and point intensity, leading to poor isometric invariance and suboptimal segmentation. To tackle this challenge, our work introduces Range-Aware Pointwise Distance Distribution (RAPiD) features and the associated RAPiD-Seg architecture. Our RAPiD features exhibit rigid transformation invariance and effectively adapt to variations in point density, with a design focus on capturing the localized geometry of neighboring structures. They utilize inherent LiDAR isotropic radiation and semantic categorization for enhanced local representation and computational efficiency, while incorporating a 4D distance metric that integrates geometric and surface material reflectivity for improved semantic segmentation. To effectively embed high-dimensional RAPiD features, we propose a double-nested autoencoder structure with a novel class-aware embedding objective to encode high-dimensional features into manageable voxel-wise embeddings. Additionally, we propose RAPiD-Seg which incorporates a channel-wise attention fusion and two effective RAPiD-Seg variants, further optimizing the embedding for enhanced performance and generalization. Our method outperforms contemporary LiDAR segmentation work in terms of mIoU on SemanticKITTI (76.1) and nuScenes (83.6) datasets.

Submitted 14 July, 2024; originally announced July 2024.

Comments: ECCV 2024; 18 pages, 6 figures, 7 tables; Code at https://github.com/l1997i/rapid_seg

Journal ref: Eur. Conf. Comput. Vis. (ECCV 2024)

arXiv:2407.10103 [pdf, other]

From SuperBIT to GigaBIT: Informing next-generation balloon-borne telescope design with Fine Guidance System flight data

Authors: Philippe Voyer, Steven J. Benton, Christopher J. Damaren, Spencer W. Everett, Aurelien A. Fraisse, Ajay S. Gill, John W. Hartley, David Harvey, Michael Henderson, Bradley Holder, Eric M. Huff, Mathilde Jauzac, William C. Jones, David Lagattuta, Jason S. -Y. Leung, Lun Li, Thuy Vy T. Luu, Richard Massey, Jacqueline E. McCleary, Johanna M. Nagy, C. Barth Netterfield, Emaad Paracha, Susan F. Redmond, Jason D. Rhodes, Andrew Robertson, et al. (6 additional authors not shown)

Abstract: The Super-pressure Balloon-borne Imaging Telescope (SuperBIT) is a near-diffraction-limited 0.5m telescope that launched via NASA's super-pressure balloon technology on April 16, 2023. SuperBIT achieved precise pointing control through the use of three nested frames in conjunction with an optical Fine Guidance System (FGS), resulting in an average image stability of 0.055" over 300-second exposures. The SuperBIT FGS includes a tip-tilt fast-steering mirror that corrects for jitter on a pair of focal plane star cameras. In this paper, we leverage the empirical data from SuperBIT's successful 45-night stratospheric mission to inform the FGS design for the next-generation balloon-borne telescope. The Gigapixel Balloon-borne Imaging Telescope (GigaBIT) is designed to be a 1.35m wide-field, high resolution imaging telescope, with specifications to extend the scale and capabilities beyond those of its predecessor SuperBIT. A description and analysis of the SuperBIT FGS will be presented along with methodologies for extrapolating this data to enhance GigaBIT's FGS design and fine pointing control algorithm. We employ a systems engineering approach to outline and formalize the design constraints and specifications for GigaBIT's FGS. GigaBIT, building on the SuperBIT legacy, is set to enhance high-resolution astronomical imaging, marking a significant advancement in the field of balloon-borne telescopes.

Submitted 14 July, 2024; originally announced July 2024.

Comments: 13 pages, 7 figures, SPIE Astronomical Telescopes + Instrumentation 2024

arXiv:2407.10069 [pdf, other]

Exact bulk-boundary correspondence of spectral winding topology under mixed boundary conditions

Authors: Gan Liang, Linhu Li

Abstract: Complex spectral winding is a topological feature unique to non-Hermitian systems and is known to be responsible for the non-Hermitian skin effect (NHSE), namely, the sign of the spectral winding number predicts the localization direction of skin modes. In this paper, we analytically establish an exact bulk-boundary correspondence (BBC) for the spectral winding topology by adopting mixed boundary conditions (MBCs). By presenting analytical solutions of a class of one-dimensional non-Hermitian systems with nearest-neighbor hoppings identical for different components, we demonstrate that the emergence of skin modes further requires the absolute value of the spectral winding number to exceed a threshold determined by the MBCs. In contrast to most topological phases, this exact BBC of spectral winding topology is not only robust against disorder, but also favored by it, as disorder can restore the correspondence in certain cases where it is originally violated. Finally, we extend our analysis to a more general model by relaxing the restriction on hopping parameters, verifying the universality of the exact BBC under MBCs.
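
For reference, the abstract does not write out the spectral winding number. A commonly used definition, given here as an assumption about notation (the Bloch Hamiltonian $H(k)$ and the reference energy $E_b$ are symbols introduced for this sketch), is:

$$
w(E_b)=\int_0^{2\pi}\frac{dk}{2\pi i}\,\frac{d}{dk}\ln\det\left[H(k)-E_b\right].
$$

This counts how many times the periodic-boundary spectrum encircles $E_b$ in the complex energy plane as $k$ runs over the Brillouin zone. The sign of $w$ is what the abstract associates with the localization direction of skin modes, and the threshold condition under MBCs is a statement about $|w|$.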

Submitted 14 July, 2024; originally announced July 2024.

Comments: 17 pages and 6 figures. Comments are welcome

arXiv:2407.09966 [pdf, other]

Optimizing ROI Benefits Vehicle ReID in ITS

Authors: Mei Qiu, Lauren Ann Christopher, Lingxi Li, Stanley Chien, Yaobin Chen

Abstract: Vehicle re-identification (ReID) is a computer vision task that matches the same vehicle across different cameras or viewpoints in a surveillance system. This is crucial for Intelligent Transportation Systems (ITS), where the effectiveness is influenced by the regions from which vehicle images are cropped. This study explores whether optimal vehicle detection regions, guided by detection confidence scores, can enhance feature matching and ReID tasks. Using our framework with multiple Regions of Interest (ROIs) and lane-wise vehicle counts, we employed YOLOv8 for detection and DeepSORT for tracking across twelve Indiana Highway videos, including two pairs of videos from non-overlapping cameras. Tracked vehicle images were cropped from inside and outside the ROIs at five-frame intervals. Features were extracted using pre-trained models: ResNet50, ResNeXt50, Vision Transformer, and Swin-Transformer. Feature consistency was assessed through cosine similarity, information entropy, and clustering variance. Results showed that features from images cropped inside ROIs had higher mean cosine similarity values compared to those involving one image inside and one outside the ROIs. The most significant difference was observed during night conditions (0.7842 inside vs. 0.5 outside the ROI with Swin-Transformer) and in cross-camera scenarios (0.75 inside-inside vs. 0.52 inside-outside the ROI with Vision Transformer). Information entropy and clustering variance further supported that features in ROIs are more consistent. These findings suggest that strategically selected ROIs can enhance tracking performance and ReID accuracy in ITS.

Submitted 13 July, 2024; originally announced July 2024.

arXiv:2407.09710 [pdf, other]

DisQ: A Markov Decision Process Based Language for Quantum Distributed Systems

Authors: Le Chang, Saitej Yavvari, Rance Cleaveland, Samik Basu, Liyi Li

Abstract: The development of quantum computers has reached a great milestone, in spite of restrictions on important quantum resources: the number of qubits being entangled at a single-location quantum computer. Recently, there has been some work to combine single-location quantum computing and quantum networking techniques to develop distributed quantum systems such that large entangled qubit groups can be established through remote processors, and quantum algorithms can be executed distributively. We present DisQ as a framework to facilitate the rewrites of quantum algorithms to their distributed versions. The core of DisQ is a distributed quantum programming language that combines the concepts of Chemical Abstract Machine (CHAM) and Markov Decision Processes (MDP) with the objective of clearly distinguishing quantum concurrent and distributed behaviors. Based on the DisQ language, we develop a simulation relation for verifying the equivalence of a quantum algorithm and its distributed versions. We present several case studies, such as quantum addition and Shor's algorithm, to demonstrate their equivalent rewrites to distributed versions.

Submitted 12 July, 2024; originally announced July 2024.

Comments: Version 1

arXiv:2407.09191 [pdf, other]

From Easy to Hard: Learning Curricular Shape-aware Features for Robust Panoptic Scene Graph Generation

Authors: Hanrong Shi, Lin Li, Jun Xiao, Yueting Zhuang, Long Chen

Abstract: Panoptic Scene Graph Generation (PSG) aims to generate a comprehensive graph-structure representation based on panoptic segmentation masks. Despite remarkable progress in PSG, almost all existing methods neglect the importance of shape-aware features, which inherently focus on the contours and boundaries of objects. To bridge this gap, we propose a model-agnostic Curricular shApe-aware FEature (CAFE) learning strategy for PSG. Specifically, we incorporate shape-aware features (i.e., mask features and boundary features) into PSG, moving beyond reliance solely on bbox features. Furthermore, drawing inspiration from human cognition, we propose to integrate shape-aware features in an easy-to-hard manner. To achieve this, we categorize the predicates into three groups based on cognition learning difficulty and correspondingly divide the training process into three stages. Each stage utilizes a specialized relation classifier to distinguish specific groups of predicates. As the learning difficulty of predicates increases, these classifiers are equipped with features of ascending complexity. We also incorporate knowledge distillation to retain knowledge acquired in earlier stages. Due to its model-agnostic nature, CAFE can be seamlessly incorporated into any PSG model. Extensive experiments and ablations on two PSG tasks under both robust and zero-shot PSG have attested to the superiority and robustness of our proposed CAFE, which outperforms existing state-of-the-art methods by a large margin.

Submitted 12 July, 2024; originally announced July 2024.

Comments: Accepted by IJCV

arXiv:2407.09139 [pdf, other]

Measurement of $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, A. Aloisio, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, et al. (414 additional authors not shown)

Abstract: We report measurements of time-dependent $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays based on a data sample of $(388\pm6)\times10^6$ $B\bar{B}$ events collected at the $Υ(4S)$ resonance with the Belle II detector. The Belle II experiment operates at the SuperKEKB asymmetric-energy $e^+e^-$ collider. We measure decay-time distributions to determine $CP$-violating parameters $S$ and $C$. We determine these parameters for two ranges of $K^0_S π^0$ invariant mass: $m(K^0_S π^0)\in (0.8, 1.0)$ $GeV/c^2$, which is dominated by $B^0 \to K^{*0} (\to K^0_S π^0) γ$ decays, and a complementary region $m(K^0_S π^0)\in (0.6, 0.8)\cup(1.0, 1.8)$ $GeV/c^2$. Our results have improved precision as compared to previous measurements and are consistent with theory predictions.

Submitted 12 July, 2024; originally announced July 2024.

Comments: 10 pages, 4 figures

Report number: Belle II Preprint 2024-009, KEK Preprint 2024-1

arXiv:2407.08984 [pdf, ps, other]

Measurement of branching fractions, CP asymmetry, and isospin asymmetry for $\boldsymbol{B\rightarrowργ}$ decays using Belle and Belle II data

Authors: Belle II Collaboration, I. Adachi, K. Adamczyk, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, et al. (385 additional authors not shown)

Abstract: We present measurements of $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays using a combined data sample of $772 \times 10^6$ $B\overline{B}$ pairs collected by the Belle experiment and $387\times 10^6$ $B\overline{B}$ pairs collected by the Belle II experiment in $e^{+}e^{-}$ collisions at the $Υ(4S)$ resonance. After an optimized selection, a simultaneous fit to the Belle and Belle II data sets yields $114\pm 12$ $B^{+}\rightarrowρ^{+}γ$ and $99\pm 12$ $B^{0}\rightarrowρ^{0}γ$ decays. The measured branching fractions are $(13.1^{+2.0 +1.3}_{-1.9 -1.2})\times 10^{-7}$ and $(7.5\pm 1.3^{+1.0}_{-0.8})\times 10^{-7}$ for $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays, respectively, where the first uncertainty is statistical and the second is systematic. We also measure the isospin asymmetry $A_{\rm I}(B\rightarrowργ)=(10.9^{+11.2 +7.8}_{-11.7 -7.3})\%$ and the direct CP asymmetry $A_{CP}(B^{+}\rightarrowρ^{+}γ)=(-8.2\pm 15.2^{+1.6}_{-1.2})\%$.

Submitted 12 July, 2024; originally announced July 2024.

Comments: 12 pages, 4 figures

Report number: Belle II Preprint 2023-019; KEK Preprint 2023-37

arXiv:2407.08951 [pdf, other]

Audio Spotforming Using Nonnegative Tensor Factorization with Attractor-Based Regularization

Authors: Shoma Ayano, Li Li, Shogo Seki, Daichi Kitamura

Abstract: Spotforming is a target-speaker extraction technique that uses multiple microphone arrays. This method applies beamforming (BF) to each microphone array, and the common components among the BF outputs are estimated as the target source. This study proposes a new common component extraction method based on nonnegative tensor factorization (NTF) for higher model interpretability and more robust spotforming against hyperparameters. Moreover, attractor-based regularization was introduced to facilitate the automatic selection of optimal target bases in the NTF. Experimental results show that the proposed method performs better than conventional methods in spotforming performance and also shows some characteristics suitable for practical use.

Submitted 11 July, 2024; originally announced July 2024.

Comments: Accepted at EUSIPCO2024

arXiv:2407.08928 [pdf, ps, other]

Dynamics for a diffusive epidemic model with a free boundary: spreading speed

Authors: Xueping Li, Lei Li, Mingxin Wang

Abstract: We study the spreading speed of a diffusive epidemic model proposed by Li et al. \cite{LL}, where the Stefan boundary condition is imposed at the right boundary, and the left boundary is subject to the homogeneous Dirichlet and Neumann condition, respectively. A spreading-vanishing dichotomy and some sharp criteria were obtained in \cite{LL}. In this paper, when spreading happens, we not only obtain the exact spreading speed of the spreading front described by the right boundary, but also derive some sharp estimates on the asymptotic behavior of the solution component $(u,v)$. Our arguments depend crucially on a detailed understanding of a corresponding semi-wave problem and a steady state problem.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08722 [pdf, other]

Unifying 3D Representation and Control of Diverse Robots with a Single Camera

Authors: Sizhe Lester Li, Annan Zhang, Boyuan Chen, Hanna Matusik, Chao Liu, Daniela Rus, Vincent Sitzmann

Abstract: Mirroring the complex structures and diverse functions of natural organisms is a long-standing challenge in robotics. Modern fabrication techniques have dramatically expanded feasible hardware, yet deploying these systems requires control software to translate desired motions into actuator commands. While conventional robots can easily be modeled as rigid links connected via joints, it remains an open challenge to model and control bio-inspired robots that are often multi-material or soft, lack sensing capabilities, and may change their material properties with use. Here, we introduce Neural Jacobian Fields, an architecture that autonomously learns to model and control robots from vision alone. Our approach makes no assumptions about the robot's materials, actuation, or sensing, requires only a single camera for control, and learns to control the robot without expert intervention by observing the execution of random commands. We demonstrate our method on a diverse set of robot manipulators, varying in actuation, materials, fabrication, and cost. Our approach achieves accurate closed-loop control and recovers the causal dynamic structure of each robot. By enabling robot control with a generic camera as the only sensor, we anticipate our work will dramatically broaden the design space of robotic systems and serve as a starting point for lowering the barrier to robotic automation.

Submitted 11 July, 2024; originally announced July 2024.

Comments: Project Page: https://sizhe-li.github.io/publication/neural_jacobian_field

arXiv:2407.08492 [pdf, ps, other]

Syzygies of projected canonical and paracanonical curves

Authors: Li Li

Abstract: Let $X\subset\mathbb{P}^r$ be an integral linearly normal variety and $R=k[x_0,\cdots,x_r]$ the coordinate ring of $\mathbb{P}^r$. It is known that the syzygies of $X$ contain some geometric information. In recent years the syzygies of non-projectively normal varieties, or in other words of the projection $X'$ of $X$ away from a linear subspace $W\subset\mathbb{P}^r$, have been taken into consideration. Assuming that the coordinate ring of the ambient space that $X'$ lives in is $S$, there are two types of vanishing properties of the Betti diagrams of the projected varieties, the so-called $N_{d,p}^S$ and $\widetilde{N}_{d,p}$. The former has been widely discussed for general varieties, for example by S. Kwak, Y. Choi and E. Park, while the latter was discussed by W. Lee and E. Park for curves of very large degree. In this paper I discuss the $\widetilde{N}_{d,p}$ properties of the projection of a generic canonical and paracanonical curve away from a generic point, and in particular whether they are cut out by quadrics. Some conjectures are stated based on tests in Macaulay2.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 20 pages

MSC Class: 14H51(Primary); 14Q05(Secondary)

arXiv:2407.08355 [pdf, other]

Dynamically assisted pair production enhancement by combined multiple potentials

Authors: Lie-Juan Li, Li Wang, Melike Mohamedsedik, Li-Na Hu, Bai-Song Xie

Abstract: We propose a new Sauter-like field model with combined multiple potentials consisting of a deep slowly varying potential and some shallow fast-varying potentials. The dynamically assisted Sauter-Schwinger effect on pair production is found by using computational quantum field theory. Pair production is found to be enhanced by about one order of magnitude for multiple potentials compared with a single potential. In the case dominated by the Schwinger mechanism, the obvious time effect leads to electrons concentrating at the two edges of the potential, while the momentum is located near zero. In contrast, for multiphoton processes, pair generation distributes the electrons outside the potential, and the momentum shows multiple peaks far from zero that evolve evenly toward a step-like structure. An interesting finding is that the particles of the pairs produced in the alternating potential have a quasi-monoenergetic structure compared to the oscillating potential well and/or potential barrier, which is helpful for achieving a high-quality positron source.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 32 pages, 19 figures

arXiv:2407.08351 [pdf, other]

AutoBencher: Creating Salient, Novel, Difficult Datasets for Language Models

Authors: Xiang Lisa Li, Evan Zheran Liu, Percy Liang, Tatsunori Hashimoto

Abstract: Evaluation is critical for assessing capabilities, tracking scientific progress, and informing model selection. In this paper, we present three desiderata for a good benchmark for language models: (i) salience (e.g., knowledge about World War II is more salient than a random day in history), (ii) novelty (i.e., the benchmark reveals new trends in model rankings not shown by previous benchmarks), and (iii) difficulty (i.e., the benchmark should be difficult for existing models, leaving headroom for future improvement). We operationalize these three desiderata and cast benchmark creation as a search problem, that of finding benchmarks that satisfy all three desiderata. To tackle this search problem, we present AutoBencher, which uses a language model to automatically search for datasets that meet the three desiderata. AutoBencher uses privileged information (e.g. relevant documents) to construct reliable datasets, and adaptivity with reranking to optimize for the search objective. We use AutoBencher to create datasets for math, multilingual, and knowledge-intensive question answering. The scalability of AutoBencher allows it to test fine-grained categories and tail knowledge, creating datasets that are on average 27% more novel and 22% more difficult than existing benchmarks. A closer investigation of our constructed datasets shows that we can identify specific gaps in LM knowledge that are not captured by existing benchmarks, such as Gemini Pro performing much worse on question answering about the Permian Extinction and Fordism, while OpenAGI-7B performs surprisingly well on QA about COVID-19.

Submitted 11 July, 2024; originally announced July 2024.

Comments: preprint

arXiv:2407.08168 [pdf, other]

Berry phases in Coulomb drag of double-layer graphene system

Authors: Jianghui Pan, Lijun Zhu, Xiaoqiang Liu, Lin Li, Changgan Zeng, Ji Feng

Abstract: Recent experiments suggest quantum interference effects in the Coulomb drag of double-layer graphene systems. By accounting for correlated interlayer impurity scattering under a weak magnetic field, our theoretical results reveal drag resistivities resembling those in weak (anti-)localization. It is established that the quantum interference effect is most significant when the chemical potentials match. The theory clarifies the roles of intra- and interlayer Berry phases in Coulomb drag in double-layer graphene systems and helps delineate the intra- and intervalley contributions. These insights are valuable for designing graphene-based electronic devices exploiting quantum effects.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.07860 [pdf, other]

Controlling Space and Time with Diffusion Models

Authors: Daniel Watson, Saurabh Saxena, Lala Li, Andrea Tagliasacchi, David J. Fleet

Abstract: We present 4DiM, a cascaded diffusion model for 4D novel view synthesis (NVS), conditioned on one or more images of a general scene, and a set of camera poses and timestamps. To overcome challenges due to limited availability of 4D training data, we advocate joint training on 3D (with camera pose), 4D (pose+time) and video (time but no pose) data and propose a new architecture that enables the same. We further advocate the calibration of SfM posed data using monocular metric depth estimators for metric scale camera control. For model evaluation, we introduce new metrics to enrich and overcome shortcomings of current evaluation schemes, demonstrating state-of-the-art results in both fidelity and pose control compared to existing diffusion models for 3D NVS, while at the same time adding the ability to handle temporal dynamics. 4DiM is also used for improved panorama stitching, pose-conditioned video to video translation, and several other tasks. For an overview see https://4d-diffusion.github.io

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07842 [pdf, other]

Study on Aspect Ratio Variability toward Robustness of Vision Transformer-based Vehicle Re-identification

Authors: Mei Qiu, Lauren Christopher, Lingxi Li

Abstract: Vision Transformers (ViTs) have excelled in vehicle re-identification (ReID) tasks. However, non-square aspect ratios of image or video input might significantly affect the re-identification performance. To address this issue, we propose a novel ViT-based ReID framework in this paper, which fuses models trained on a variety of aspect ratios. Our main contributions are threefold: (i) We analyze aspect ratio performance on VeRi-776 and VehicleID datasets, guiding input settings based on aspect ratios of original images. (ii) We introduce patch-wise mixup intra-image during ViT patchification (guided by spatial attention scores) and implement uneven stride for better object aspect ratio matching. (iii) We propose a dynamic feature fusing ReID network, enhancing model robustness. Our ReID method achieves a significantly improved mean Average Precision (mAP) of 91.0\% compared to the closest state-of-the-art (CAL) result of 80.9\% on the VehicleID dataset.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07699 [pdf, other]

Transmission Design for XL-RIS-Aided Massive MIMO System with Visibility Regions

Authors: Luchu Li, Kangda Zhi, Cunhua Pan

Abstract: This paper proposes a two-timescale transmission scheme for extremely large-scale (XL)-reconfigurable intelligent surfaces (RIS)-aided massive multi-input multi-output (MIMO) systems considering visibility regions (VRs). The beamforming of base stations (BS) is designed based on rapidly changing instantaneous channel state information (CSI), while the phase shifts of RIS are configured based on slowly changing statistical CSI. Specifically, we first formulate a system model with spatially correlated Rician fading channels and introduce the concept of VRs. Then, we derive a closed-form approximate expression for the achievable rate applicable to any number of BS antennas and RIS elements, and analyze the impact of VRs on system performance and complexity. Next, we solve the problem of maximizing the minimum user rate by optimizing the phase shifts of RIS through an algorithm based on accelerated gradient ascent. Finally, we present numerical results to demonstrate the performance of the gradient algorithm from different aspects and reveal the low system complexity of deploying XL-RIS in massive MIMO systems with the help of VRs.

Submitted 17 May, 2024; originally announced July 2024.

arXiv:2407.07651 [pdf, other]

Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (645 additional authors not shown)

Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07503 [pdf, other]

Metasurface-based Snapshot Shortwave-Infrared Hyperspectral Image Reconstruction with Inter and Intra Prior Learning Network

Authors: Linqiang Li, Jinglei Hao, Yongqiang Zhao, Pan Liu, Haofang Yan, Ziqin Zhang, Seong G. Kong

Abstract: Shortwave-infrared (SWIR) spectral information, ranging from 1 μm to 2.5 μm, breaks the limitations of traditional color cameras in acquiring scene information and has been used in many fields. However, conventional SWIR hyperspectral imaging systems face challenges due to their bulky setups and low acquisition speed. In this work, we introduce a snapshot SWIR hyperspectral imaging system based on a metasurface filter and a corresponding filter selection method to achieve the lowest correlation coefficient among these filters. This system has the advantages of small size and snapshot imaging. We propose a novel inter and intra prior learning unfolding framework to achieve high-quality SWIR hyperspectral image reconstruction, which bridges the gap between prior learning and cross-stage information interaction. We also design an adaptive feature transfer mechanism to adaptively transfer the contextual correlation of multi-scale encoder features to prevent detailed information loss in the decoder. Experiment results demonstrate that our method can reconstruct HSI with high speed and superior performance over existing methods.

Submitted 10 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

Comments: 10 pages, 5 figures

arXiv:2407.06981 [pdf, other]

Reconfigurable unitary transformations of optical beam arrays

Authors: Aldo C. Martinez-Becerril, Siwei Luo, Liu Li, Jordan Pagé, Lambert Giner, Raphael A. Abrahao, Jeff S. Lundeen

Abstract: Spatial transformations of light are ubiquitous in optics, with examples ranging from simple imaging with a lens to quantum and classical information processing in waveguide meshes. Multi-plane light converter (MPLC) systems have emerged as a platform that promises completely general spatial transformations, i.e., a universal unitary. However, until now, MPLC systems have demonstrated transformations that are far from general, e.g., converting from a Gaussian to Laguerre-Gauss mode. Here, we demonstrate the promise of an MPLC, the ability to impose an arbitrary unitary transformation that can be reconfigured dynamically. Specifically, we consider transformations on superpositions of parallel free-space beams arranged in an array, which is a common information encoding in photonics. We experimentally test the full gamut of unitary transformations for a system of two parallel beams and make a map of their fidelity. We obtain an average transformation fidelity of $0.85 \pm 0.03$. This high fidelity suggests MPLCs are a useful tool for implementing the unitary transformations that comprise quantum and classical information processing.

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.06705 [pdf, other]

Integrated Sensing and Communications for Resource Allocation in Non-Terrestrial Networks

Authors: Israel Leyva-Mayorga, Fabio Saggese, Lintao Li, Petar Popovski

Abstract: The integration of Non-Terrestrial Networks (NTNs) with Low Earth Orbit (LEO) satellite constellations into 5G and Beyond is essential to achieve truly global connectivity. A distinctive characteristic of LEO mega-constellations is that they constitute a global infrastructure with predictable dynamics, which enables the pre-planned allocation of the radio resources. However, the different bands that can be used for ground-to-satellite communication are affected differently by atmospheric conditions such as precipitation, which introduces uncertainty on the attenuation of the communication links at high frequencies. Based on this, we present a compelling case for applying integrated sensing and communications (ISAC) in heterogeneous and multi-layer LEO satellite constellations over wide areas. Specifically, we present an ISAC framework and frame structure to accurately estimate the attenuation in the communication links due to precipitation, with the aim of finding the optimal serving satellites and resource allocation for downlink communication with users on ground. The results show that, by dedicating an adequate amount of resources for sensing and solving the association and resource allocation problems jointly, it is feasible to increase the average throughput by 59% and the fairness by 600% when compared to solving these problems separately.

Submitted 9 July, 2024; originally announced July 2024.

Comments: Submitted for publication to IEEE Transactions on Wireless Communications

arXiv:2407.06597 [pdf, other]

TVR-Ranking: A Dataset for Ranked Video Moment Retrieval with Imprecise Queries

Authors: Renjie Liang, Li Li, Chongzhi Zhang, Jing Wang, Xizhou Zhu, Aixin Sun

Abstract: In this paper, we propose the task of Ranked Video Moment Retrieval (RVMR) to locate a ranked list of matching moments from a collection of videos, through queries in natural language. Although a few related tasks have been proposed and studied by CV, NLP, and IR communities, RVMR is the task that best reflects the practical setting of moment search. To facilitate research in RVMR, we develop the TVR-Ranking dataset, based on the raw videos and existing moment annotations provided in the TVR dataset. Our key contribution is the manual annotation of relevance levels for 94,442 query-moment pairs. We then develop the $NDCG@K, IoU\geq μ$ evaluation metric for this new task and conduct experiments to evaluate three baseline models. Our experiments show that the new RVMR task brings new challenges to existing models and we believe this new dataset contributes to the research on multi-modality search. The dataset is available at https://github.com/Ranking-VMR/TVR-Ranking

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.06573 [pdf, other]

LLM for Mobile: An Initial Roadmap

Authors: Daihang Chen, Yonghui Liu, Mingyi Zhou, Yanjie Zhao, Haoyu Wang, Shuai Wang, Xiao Chen, Tegawendé F. Bissyandé, Jacques Klein, Li Li

Abstract: When mobile meets LLMs, mobile app users deserve to have more intelligent usage experiences. For this to happen, we argue that there is a strong need to apply LLMs for the mobile ecosystem. We therefore provide a research roadmap for guiding our fellow researchers to achieve that as a whole. In this roadmap, we sum up six directions that we believe are urgently required for research to enable native intelligence in mobile devices. In each direction, we further summarize the current research progress and the gaps that still need to be filled by our fellow researchers.

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.06540 [pdf, other]

General and Task-Oriented Video Segmentation

Authors: Mu Chen, Liulei Li, Wenguan Wang, Ruijie Quan, Yi Yang

Abstract: We present GvSeg, a general video segmentation framework for addressing four different video segmentation tasks (i.e., instance, semantic, panoptic, and exemplar-guided) while maintaining an identical architectural design. Currently, there is a trend towards developing general video segmentation solutions that can be applied across multiple tasks. This streamlines research endeavors and simplifies deployment. However, such a highly homogenized framework in current design, where each element maintains uniformity, could overlook the inherent diversity among different tasks and lead to suboptimal performance. To tackle this, GvSeg: i) provides a holistic disentanglement and modeling for segment targets, thoroughly examining them from the perspective of appearance, position, and shape, and on this basis, ii) reformulates the query initialization, matching and sampling strategies in alignment with the task-specific requirement. These architecture-agnostic innovations empower GvSeg to effectively address each unique task by accommodating the specific properties that characterize them. Extensive experiments on seven gold-standard benchmark datasets demonstrate that GvSeg surpasses all existing specialized/general solutions by a significant margin on four different video segmentation tasks.

Submitted 9 July, 2024; originally announced July 2024.

Comments: ECCV 2024; Project page: https://github.com/kagawa588/GvSeg

arXiv:2407.06113 [pdf, other]

C2C: Component-to-Composition Learning for Zero-Shot Compositional Action Recognition

Authors: Rongchang Li, Zhenhua Feng, Tianyang Xu, Linze Li, Xiao-Jun Wu, Muhammad Awais, Sara Atito, Josef Kittler

Abstract: Compositional actions consist of dynamic (verbs) and static (objects) concepts. Humans can easily recognize unseen compositions using the learned concepts. For machines, solving such a problem requires a model to recognize unseen actions composed of previously observed verbs and objects, thus requiring so-called compositional generalization ability. To facilitate this research, we propose a novel Zero-Shot Compositional Action Recognition (ZS-CAR) task. For evaluating the task, we construct a new benchmark, Something-composition (Sth-com), based on the widely used Something-Something V2 dataset. We also propose a novel Component-to-Composition (C2C) learning method to solve the new ZS-CAR task. C2C includes an independent component learning module and a composition inference module. Last, we devise an enhanced training strategy to address the challenges of component variation between seen and unseen compositions and to handle the subtle balance between learning seen and unseen actions. The experimental results demonstrate that the proposed framework significantly surpasses the existing compositional generalization methods and sets a new state-of-the-art. The new Sth-com benchmark and code are available at https://github.com/RongchangLi/ZSCAR_C2C.

Submitted 8 July, 2024; originally announced July 2024.

Comments: Accepted by ECCV2024

arXiv:2407.06078 [pdf, ps, other]

Few-Shot Keyword Spotting from Mixed Speech

Authors: Junming Yuan, Ying Shi, LanTian Li, Dong Wang, Askar Hamdulla

Abstract: Few-shot keyword spotting (KWS) aims to detect unknown keywords with limited training samples. A commonly used approach is the pre-training and fine-tuning framework. While effective in clean conditions, this approach struggles with mixed keyword spotting -- simultaneously detecting multiple keywords blended in an utterance, which is crucial in real-world applications. Previous research has proposed a Mix-Training (MT) approach to solve the problem; however, it has never been tested in the few-shot scenario. In this paper, we investigate the possibility of using MT and other relevant methods to solve the two practical challenges together: few-shot and mixed speech. Experiments conducted on the LibriSpeech and Google Speech Command corpora demonstrate that MT is highly effective on this task when employed in either the pre-training phase or the fine-tuning phase. Moreover, combining SSL-based large-scale pre-training (HuBert) and MT fine-tuning yields very strong results in all the test conditions.

Submitted 4 July, 2024; originally announced July 2024.

Comments: accepted by INTERSPEECH 2024

arXiv:2407.05975 [pdf, other]

LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages

Authors: Yinquan Lu, Wenhao Zhu, Lei Li, Yu Qiao, Fei Yuan

Abstract: Large Language Models (LLMs) demonstrate remarkable translation capabilities in high-resource language tasks, yet their performance in low-resource languages is hindered by insufficient multilingual data during pre-training. To address this, we dedicate 35,000 A100-SXM4-80GB GPU hours in conducting extensive multilingual continual pre-training on the LLaMA series models, enabling translation support across more than 100 languages. Through a comprehensive analysis of training strategies, such as vocabulary expansion and data augmentation, we develop LLaMAX. Remarkably, without sacrificing its generalization ability, LLaMAX achieves significantly higher translation performance compared to existing open-source LLMs (by more than 10 spBLEU points) and performs on par with a specialized translation model (M2M-100-12B) on the Flores-101 benchmark. Extensive experiments indicate that LLaMAX can serve as a robust multilingual foundation model. The code (https://github.com/CONE-MT/LLaMAX/) and models (https://huggingface.co/LLaMAX/) are publicly available.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.05616 [pdf, other]

Explainable Image Recognition via Enhanced Slot-attention Based Classifier

Authors: Bowen Wang, Liangzhi Li, Jiahao Zhang, Yuta Nakashima, Hajime Nagahara

Abstract: The imperative to comprehend the behaviors of deep learning models is of utmost importance. In this realm, Explainable Artificial Intelligence (XAI) has emerged as a promising avenue, garnering increasing interest in recent years. Despite this, most existing methods primarily depend on gradients or input perturbation, which often fails to embed explanations directly within the model's decision-making process. Addressing this gap, we introduce ESCOUTER, a visually explainable classifier based on the modified slot attention mechanism. ESCOUTER distinguishes itself by not only delivering high classification accuracy but also offering more transparent insights into the reasoning behind its decisions. It differs from prior approaches in two significant aspects: (a) ESCOUTER incorporates explanations into the final confidence scores for each category, providing a more intuitive interpretation, and (b) it offers positive or negative explanations for all categories, elucidating "why an image belongs to a certain category" or "why it does not." A novel loss function specifically for ESCOUTER is designed to fine-tune the model's behavior, enabling it to toggle between positive and negative explanations. Moreover, an area loss is also designed to adjust the size of the explanatory regions for a more precise explanation. Our method, rigorously tested across various datasets and XAI metrics, outperformed previous state-of-the-art methods, solidifying its effectiveness as an explanatory tool.

Submitted 8 July, 2024; originally announced July 2024.

Comments: 16 pages, 12 figures

arXiv:2407.05554 [pdf, other]

PANS: Probabilistic Airway Navigation System for Real-time Robust Bronchoscope Localization

Authors: Qingyao Tian, Zhen Chen, Huai Liao, Xinyan Huang, Bingyu Yang, Lujie Li, Hongbin Liu

Abstract: Accurate bronchoscope localization is essential for pulmonary interventions, by providing six degrees of freedom (DOF) in airway navigation. However, the robustness of current vision-based methods is often compromised in clinical practice, and they struggle to perform in real-time and to generalize across cases unseen during training. To overcome these challenges, we propose a novel Probabilistic Airway Navigation System (PANS), leveraging a Monte-Carlo method with pose hypotheses and likelihoods to achieve robust and real-time bronchoscope localization. Specifically, our PANS incorporates diverse visual representations (e.g., odometry and landmarks) by leveraging two key modules, including the Depth-based Motion Inference (DMI) and the Bronchial Semantic Analysis (BSA). To generate the pose hypotheses of bronchoscope for PANS, we devise the DMI to accurately propagate the estimation of pose hypotheses over time. Moreover, to estimate the accurate pose likelihood, we devise the BSA module by effectively distinguishing between similar bronchial regions in endoscopic images, along with a novel metric to assess the congruence between estimated depth maps and the segmented airway structure. Under this probabilistic formulation, our PANS is capable of achieving the 6-DOF bronchoscope localization with superior accuracy and robustness. Extensive experiments on the collected pulmonary intervention dataset comprising 10 clinical cases confirm the advantage of our PANS over state-of-the-art methods, in terms of both robustness and generalization in localizing deeper airway branches and the efficiency of real-time inference. The proposed PANS reveals its potential to be a reliable tool in the operating room, promising to enhance the quality and safety of pulmonary interventions.

Submitted 7 July, 2024; originally announced July 2024.

arXiv:2407.05388 [pdf, other]

Forest2Seq: Revitalizing Order Prior for Sequential Indoor Scene Synthesis

Authors: Qi Sun, Hang Zhou, Wengang Zhou, Li Li, Houqiang Li

Abstract: Synthesizing realistic 3D indoor scenes is a challenging task that traditionally relies on manual arrangement and annotation by expert designers. Recent advances in autoregressive models have automated this process, but they often lack semantic understanding of the relationships and hierarchies present in real-world scenes, yielding limited performance. In this paper, we propose Forest2Seq, a framework that formulates indoor scene synthesis as an order-aware sequential learning problem. Forest2Seq organizes the inherently unordered collection of scene objects into structured, ordered hierarchical scene trees and forests. By employing a clustering-based algorithm and a breadth-first traversal, Forest2Seq derives meaningful orderings and utilizes a transformer to generate realistic 3D scenes autoregressively. Experimental results on standard benchmarks demonstrate Forest2Seq's superiority in synthesizing more realistic scenes compared to top-performing baselines, with significant improvements in FID and KL scores. Our additional experiments for downstream tasks and ablation studies also confirm the importance of incorporating order as a prior in 3D scene generation.

Submitted 7 July, 2024; originally announced July 2024.

Comments: ECCV 2024

arXiv:2407.05117 [pdf, ps, other]

Search for the baryon number and lepton number violating decays $τ^-\to Λπ^-$ and $τ^-\to \barΛπ^-$ at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien , et al. (349 additional authors not shown)

Abstract: We present a search for the baryon number $B$ and lepton number $L$ violating decays $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛ π^-$ produced from the $e^+e^-\to τ^+τ^-$ process, using a 364 fb$^{-1}$ data sample collected by the Belle II experiment at the SuperKEKB collider. No evidence of signal is found in either decay mode, which have $|Δ(B-L)|$ equal to $2$ and $0$, respectively. Upper limits at 90\% credibility level on the branching fractions of $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛπ^-$ are determined to be $4.7 \times 10^{-8}$ and $4.3 \times 10^{-8}$, respectively.

Submitted 6 July, 2024; originally announced July 2024.

Comments: 8 pages, 4 figures

Report number: Belle II Preprint 2024-020; KEK Preprint 2024-17

arXiv:2407.05104 [pdf, other]

Crowdsourced reviews reveal substantial disparities in public perceptions of parking

Authors: Lingyao Li, Songhua Hu, Ly Dinh, Libby Hemphill

Abstract: Due to increased reliance on private vehicles and growing travel demand, parking remains a longstanding urban challenge globally. Quantifying parking perceptions is paramount as it enables decision-makers to identify problematic areas and make informed decisions on parking management. This study introduces a cost-effective and widely accessible data source, crowdsourced online reviews, to investigate public perceptions of parking across the U.S. Specifically, we examine 4,987,483 parking-related reviews for 1,129,460 points of interest (POIs) across 911 core-based statistical areas (CBSAs) sourced from Google Maps. We employ the Bidirectional Encoder Representations from Transformers (BERT) model to classify the parking sentiment and conduct regression analyses to explore its relationships with socio-spatial factors. Findings reveal significant variations in parking sentiment across POI types and CBSAs, with Restaurant POIs showing the most negative sentiment. Regression results further indicate that denser urban areas with higher proportions of African Americans and Hispanics and lower socioeconomic status are more likely to exhibit negative parking sentiment. Interestingly, an opposite relationship between parking supply and sentiment is observed, indicating increasing supply does not necessarily improve parking experiences. Finally, our textual analysis identifies keywords associated with positive or negative sentiments and highlights disparities between urban and rural areas. Overall, this study demonstrates the potential of a novel data source and methodological framework in measuring parking sentiment, offering valuable insights that help identify hyperlocal parking issues and guide targeted parking management strategies.

Submitted 6 July, 2024; originally announced July 2024.

arXiv:2407.04206 [pdf, other]

Computational Graph Representation of Equations System Constructors in Hierarchical Circuit Simulation

Authors: Zichao Long, Lin Li, Lei Han, Xianglong Meng, Chongjun Ding, Ruiyan Li, Wu Jiang, Fuchen Ding, Jiaqing Yue, Zhichao Li, Yisheng Hu, Ding Li, Heng Liao

Abstract: Equations system constructors of hierarchical circuits play a central role in device modeling, nonlinear equations solving, and circuit design automation. However, existing constructors present limitations in applications to different extents. For example, the costs of developing and reusing device models -- especially coarse-grained equivalent models of circuit modules -- remain high while parameter sensitivity analysis is complex and inefficient. Inspired by differentiable programming and leveraging the ecosystem benefits of open-source software, we propose an equations system constructor using the computational graph representation, along with its JSON format netlist, to address these limitations. This representation allows for runtime dependencies between signals and subcircuit/device parameters. The proposed method streamlines the model development process and facilitates end-to-end computation of gradients of equations remainders with respect to parameters. This paper discusses in detail the overarching concept of hierarchical subcircuit/device decomposition and nested invocation by drawing parallels to functions in programming languages, and introduces rules for parameters passing and gradient propagation across hierarchical circuit modules. The presented numerical examples, including (1) an uncoupled CMOS model representation using "equivalent circuit decomposition+dynamic parameters" and (2) operational amplifier (OpAmp) auto device sizing, have demonstrated that the proposed method supports circuit simulation and design and particularly subcircuit modeling with improved efficiency, simplicity, and decoupling compared to existing techniques.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2407.03966 [pdf, other]

Serialized Output Training by Learned Dominance

Authors: Ying Shi, Lantian Li, Shi Yin, Dong Wang, Jiqing Han

Abstract: Serialized Output Training (SOT) has showcased state-of-the-art performance in multi-talker speech recognition by sequentially decoding the speech of individual speakers. To address the challenging label-permutation issue, prior methods have relied on either the Permutation Invariant Training (PIT) or the time-based First-In-First-Out (FIFO) rule. This study presents a model-based serialization strategy that incorporates an auxiliary module into the Attention Encoder-Decoder architecture, autonomously identifying the crucial factors to order the output sequence of the speech components in multi-talker speech. Experiments conducted on the LibriSpeech and LibriMix databases reveal that our approach significantly outperforms the PIT and FIFO baselines in both 2-mix and 3-mix scenarios. Further analysis shows that the serialization module identifies dominant speech components in a mixture by factors including loudness and gender, and orders speech components based on the dominance score.

Submitted 4 July, 2024; originally announced July 2024.

Comments: accepted by INTERSPEECH 2024

arXiv:2407.03783 [pdf, other]

Evidence of $h_{b}(\text{2P}) \to Υ(\text{1S})η$ decay and search for $h_{b}(\text{1P,2P}) \to Υ(\text{1S})π^0$ with the Belle detector

Authors: Belle Collaboration, E. Kovalenko, I. Adachi, H. Aihara, D. M. Asner, T. Aushev, R. Ayad, V. Babu, Sw. Banerjee, K. Belous, J. Bennett, M. Bessner, T. Bilka, D. Biswas, A. Bobrov, D. Bodrov, A. Bondar, A. Bozek, M. Bračko, P. Branchini, T. E. Browder, A. Budano, M. Campajola, M. -C. Chang, B. G. Cheon , et al. (142 additional authors not shown)

Abstract: We report the first evidence for the $h_{b}(\text{2P}) \to Υ(\text{1S})η$ transition with a significance of $3.5$ standard deviations. The decay branching fraction is measured to be $\mathcal{B}[h_{b}(\text{2P}) \to Υ(\text{1S})η]=(7.1 ~^{+3.7} _{-3.2}\pm 0.8)\times10^{-3}$, which is noticeably smaller than expected. We also set upper limits on $π^0$ transitions of $\mathcal{B}[h_{b}(\text{2P}) \to Υ(\text{1S})π^0] < 1.8\times10^{-3}$, and $\mathcal{B}[h_{b}(\text{1P})\to Υ(\text{1S})π^0] < 1.8\times10^{-3}$, at the $90\%$ confidence level. These results are obtained with a $131.4$~fb$^{-1}$ data sample collected near the $Υ(\text{5S})$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+e^-$ collider.

Submitted 4 July, 2024; originally announced July 2024.

Comments: to be submitted to PRL

Report number: Belle Preprint 2024-03, KEK Preprint 2024-03

arXiv:2407.02899 [pdf, other]

Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

Abstract: A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.02806 [pdf, other]

Multiple topological transitions and spectral singularities in non-Hermitian Floquet systems

Authors: Weiwei Zhu, Longwen Zhou, Linhu Li, Jiangbin Gong

Abstract: The interplay between Floquet driving and non-Hermitian gain/loss could give rise to intriguing phenomena including topological funneling of light, edge-state delocalization, anomalous topological transitions and Floquet non-Hermitian skin effects. In this work, we uncover two unique phenomena in Floquet systems caused by gain and loss. First, multiple topological transitions from anomalous Floquet second-order topological insulators to anomalous Floquet first-order topological insulators and then to normal insulators can be induced by gain and loss. Interestingly, the resulting anomalous Floquet insulators further carry hybrid skin-topological boundary modes, which could either be fully localized or localized to different edges at different time slices and traversing along all edges in a single driving period. The topological phase transitions are also shown to be detectable through studies of transmission properties in the setting of coupled ring resonators. Second, gain and loss are found to induce singularities in the Floquet spectrum, around which anomalous transmissions at flat quasienergy bands are predicted. These discoveries not only enhance our understanding of topological matter and phase transitions in driven non-Hermitian systems, but also promote their experimental realizations in optical and acoustic settings.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 10 pages, 7 figures

arXiv:2407.02530 [pdf, ps, other]

Unifying quantum spatial search, state transfer and uniform sampling on graphs: simple and exact

Authors: Qingwen Wang, Ying Jiang, Lvzhou Li

Abstract: This article presents a novel and succinct algorithmic framework via alternating quantum walks, unifying quantum spatial search, state transfer and uniform sampling on a large class of graphs. Using the framework, we can achieve exact uniform sampling over all vertices and perfect state transfer between any two vertices, provided that the eigenvalues of the Laplacian matrix of the graph are all integers. Furthermore, if the graph is vertex-transitive as well, then we can achieve deterministic quantum spatial search that finds a marked vertex with certainty. In contrast, existing quantum search algorithms generally have a certain probability of failure. Even if the graph is not vertex-transitive, such as the complete bipartite graph, we can still adjust the algorithmic framework to obtain deterministic spatial search, which thus shows its flexibility. Besides unifying and improving plenty of previous results, our work provides new results on more graphs. The approach is easy to use since it has a succinct formalism that depends only on the depth of the Laplacian eigenvalue set of the graph, and may shed light on the solution of more problems related to graphs.

Submitted 1 July, 2024; originally announced July 2024.

Comments: This manuscript has some overlap with arXiv:2307.16133. More precisely, it is an advanced version of arXiv:2307.16133, which not only modifies the paper structure and some results but also adds several new results

Showing 1–50 of 6,418 results for author: Li, L