-
Measurement of the branching fraction of $D^+_s\to \ell^+ν_\ell$ via $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and $\mathcal{B}(D_s^+\toτ^+ν_τ)=(\bftauv)\%$, respectively. The product of the decay constant and Cabibbo-Kobayashi-Maskawa matrix element $|V_{cs}|$ is determined to be $f_{D_s^+}|V_{cs}|=(\mufdsxvcsresult)_{μν}~\mathrm{MeV}$ and $f_{D_s^+}|V_{cs}|=(\taufdsxvcsresult)_{τν}~\mathrm{MeV}$, respectively. Taking the value of $|V_{cs}|$ from a global fit in the Standard Model, we obtain ${f_{D^+_s}}=(\mufdsresult)_{μν}$\,MeV and ${f_{D^+_s}}=(\taufdsresult)_{τν}$\,MeV, respectively. Conversely, taking the value for $f_{D_s^+}$ from the latest lattice quantum chromodynamics calculation, we obtain $|V_{cs}| =(\muvcsresult)_{μν}$ and $|V_{cs}| = (\tauvcsresult)_{τν}$, respectively.
Submitted 16 July, 2024;
originally announced July 2024.
-
Nonreciprocal Single-Photon Band Structure in a Coupled-Spinning-Resonator chain
Authors:
Jing Li,
Ya Yang,
Xun Wei Xu,
Jing Lu,
Hui Jing,
Lan Zhou
Abstract:
We analyze the single-photon band structure and the transport of a single photon in a one-dimensional coupled-spinning-resonator chain. The time-reversal symmetry of the resonator chain is broken by the spinning of the resonators rather than by an external or synthetic magnetic field. Two nonreciprocal single-photon band gaps can be obtained in the coupled-spinning-resonator chain, whose widths depend on the angular velocity of the spinning resonators. Based on the nonreciprocal band gaps, we can implement a single-photon circulator at multiple frequency windows, and the direction of photon cycling is opposite for different band gaps. In addition, reciprocal single-photon band structures can also be realized in the coupled-spinning-resonator chain when all resonators rotate in the same direction with equal angular velocity. Our work opens a new route to achieve, manipulate, and switch nonreciprocal or reciprocal single-photon band structures, and provides new opportunities to realize novel single-photon devices.
Submitted 15 July, 2024;
originally announced July 2024.
-
Automated Label Unification for Multi-Dataset Semantic Segmentation with GNNs
Authors:
Rong Ma,
Jie Chen,
Xiangyang Xue,
Jian Pu
Abstract:
Deep supervised models possess significant capability to assimilate extensive training data, thereby presenting an opportunity to enhance model performance through training on multiple datasets. However, conflicts arising from different label spaces among datasets may adversely affect model performance. In this paper, we propose a novel approach to automatically construct a unified label space across multiple datasets using graph neural networks. This enables semantic segmentation models to be trained simultaneously on multiple datasets, resulting in performance improvements. Unlike existing methods, our approach facilitates seamless training without the need for additional manual reannotation or taxonomy reconciliation. This significantly enhances the efficiency and effectiveness of multi-dataset segmentation model training. The results demonstrate that our method significantly outperforms other multi-dataset training methods when trained on seven datasets simultaneously, and achieves state-of-the-art performance on the WildDash 2 benchmark.
Submitted 15 July, 2024;
originally announced July 2024.
-
Second-order topological insulator in Bilayer borophene
Authors:
Licheng Wang,
Ali Hamza Qureshi,
Yi Sun,
Xiaokang Xu,
Xiaojing Yao,
Xinli Zhao,
Ai-Lei He,
Yuan Zhou,
Xiuyun Zhang
Abstract:
As novel topological states, higher-order topological insulators have attracted great attention in recent years. However, their realization in realistic materials, in particular in two-dimensional systems, remains a major challenge due to the lack of adequate candidates. Here, based on first-principles calculations and tight-binding model simulations, we identify the currently \emph{existing} bilayer $α_{5}$-phase borophenes as two-dimensional second-order topological insulators protected by $C_{2}$-rotational symmetry. The formation of interlayer B-B covalent bonds, which stabilizes the bilayer borophenes and opens large direct bulk gaps ($\sim 0.55-0.62$ eV) at the Fermi level, plays the key role. The second-order topology is characterized by the bulk quantized quadrupole moment. Our results enrich the candidates for second-order topological insulators and also provide a way to study topological states in borophenes.
Submitted 15 July, 2024;
originally announced July 2024.
-
Transformer for Multitemporal Hyperspectral Image Unmixing
Authors:
Hang Li,
Qiankun Dong,
Xueshuo Xie,
Xia Xu,
Tao Li,
Zhenwei Shi
Abstract:
Multitemporal hyperspectral image unmixing (MTHU) is important for monitoring and analyzing dynamic changes of the surface. However, compared to single-temporal unmixing, the multitemporal approach demands comprehensive consideration of information across different phases, which makes it more challenging. To address this challenge, we propose the Multitemporal Hyperspectral Image Unmixing Transformer (MUFormer), an end-to-end unsupervised deep learning model. To effectively perform multitemporal hyperspectral image unmixing, we introduce two key modules: the Global Awareness Module (GAM) and the Change Enhancement Module (CEM). The Global Awareness Module computes self-attention across all phases, facilitating global weight allocation. The Change Enhancement Module, in turn, dynamically learns local temporal changes by comparing endmember changes between adjacent phases. The synergy between these modules allows for capturing semantic information regarding endmember and abundance changes, thereby enhancing the effectiveness of multitemporal hyperspectral image unmixing. We conducted experiments on one real dataset and two synthetic datasets, demonstrating that our model significantly improves multitemporal hyperspectral image unmixing.
Submitted 15 July, 2024;
originally announced July 2024.
-
Charge radii of $^{11-16}$C, $^{13-17}$N and $^{15-18}$O determined from their charge-changing cross-sections and the mirror-difference charge radii
Authors:
J. W. Zhao,
B. -H. Sun,
I. Tanihata,
J. Y. Xu,
K. Y. Zhang,
A. Prochazka,
L. H. Zhu,
S. Terashima,
J. Meng,
L. C. He,
C. Y. Liu,
G. S. Li,
C. G. Lu,
W. J. Lin,
W. P. Lin,
Z. Liu,
P. P Ren,
Z. Y. Sun,
F. Wang,
J. Wang,
M. Wang,
S. T. Wang,
X. L. Wei,
X. D. Xu,
J. C. Zhang
, et al. (2 additional authors not shown)
Abstract:
Charge-changing cross-sections (CCCSs) of $^{11-16}$C, $^{13-17}$N and $^{15-18}$O on a carbon target have been determined at energies around 300 MeV/nucleon. A correction factor that depends on the nucleon separation energy has been introduced to the Glauber model calculation for extracting the nuclear charge radii from the experimental CCCSs. The charge radii of $^{11}$C, $^{13,16}$N and $^{15}$O were thus determined for the first time. With the new radii, we studied the experimental mirror-difference charge radii ($ΔR_{\text {ch}}^{\text {mirror}}$) of the $^{11}$B-$^{11}$C, $^{13}$C-$^{13}$N, $^{15}$N-$^{15}$O, and $^{17}$N-$^{17}$Ne pairs for the first time. We find that the $ΔR_{\text {ch}}^{\text {mirror}}$, including both bound and weakly bound proton-rich mirror partners, are reproduced by the empirical relation to the isospin asymmetry predicted by the $ab$ $initio$ calculations.
Submitted 14 July, 2024;
originally announced July 2024.
-
Inferior interfacial superconductivity in 1 UC FeSe/SrVO$_3$/SrTiO$_3$ with screened interfacial electron-phonon coupling
Authors:
Nan Guo,
Xiaoyang Chen,
Tianlun Yu,
Yu Fan,
Qinghua Zhang,
Minyinan Lei,
Xiaofeng Xu,
Xuetao Zhu,
Jiandong Guo,
Lin Gu,
Haichao Xu,
Rui Peng,
Donglai Feng
Abstract:
Monolayer FeSe/TiO$_x$ and FeSe/FeO$_x$ interfaces exhibit significant superconductivity enhancement compared to bulk FeSe, with interfacial electron-phonon coupling (EPC) playing a crucial role. However, the reduced dimensionality in monolayer FeSe, which may drive superconducting fluctuations, complicates the understanding of the enhancement mechanisms. Here we construct a new superconducting interface: monolayer FeSe/SrVO$_3$/SrTiO$_3$, in which the itinerant electrons of highly metallic SrVO$_3$ films can screen all the high-energy Fuchs-Kliewer phonons, including those of SrTiO$_3$, making it the first FeSe/oxide system with screened interfacial EPC while maintaining the monolayer FeSe thickness. Despite comparable doping levels, the heavily electron-doped monolayer FeSe/SrVO$_3$ exhibits a lower pairing temperature ($T_\mathrm{g}$ $\sim$ 48 K) than FeSe/SrTiO$_3$ and FeSe/LaFeO$_3$. Our findings disentangle the contributions of interfacial EPC from dimensionality on enhancing $T_\mathrm{g}$ in FeSe/oxide interfaces, underscoring the importance of interfacial EPC in $T_\mathrm{g}$ enhancement. This FeSe/VO$_x$ interface also provides a platform for studying the interfacial superconductivity.
Submitted 13 July, 2024;
originally announced July 2024.
-
3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance
Authors:
Xiaoxu Xu,
Yitian Yuan,
Jinlong Li,
Qiudan Zhang,
Zequn Jie,
Lin Ma,
Hao Tang,
Nicu Sebe,
Xu Wang
Abstract:
In this paper, we propose 3DSS-VLG, a weakly supervised approach for 3D Semantic Segmentation with 2D Vision-Language Guidance, in which a 3D model predicts a dense embedding for each point that is co-embedded with both the aligned image and text spaces of a 2D vision-language model. Specifically, our method exploits the superior generalization ability of 2D vision-language models and proposes the Embeddings Soft-Guidance Stage, which uses this ability to implicitly align 3D embeddings and text embeddings. Moreover, we introduce the Embeddings Specialization Stage to purify the feature representation with the help of a given scene-level label, specifying a better feature supervised by the corresponding text embedding. Thus, the 3D model is able to gain informative supervision from both the image embedding and the text embedding, leading to competitive segmentation performance. To the best of our knowledge, this is the first work to investigate 3D weakly supervised semantic segmentation by using the textual semantic information of text category labels. Moreover, with extensive quantitative and qualitative experiments, we show that 3DSS-VLG not only achieves state-of-the-art performance on both the S3DIS and ScanNet datasets, but also maintains strong generalization capability.
Submitted 13 July, 2024;
originally announced July 2024.
-
Geometric Understanding of Discriminability and Transferability for Visual Domain Adaptation
Authors:
You-Wei Luo,
Chuan-Xian Ren,
Xiao-Lin Xu,
Qingshan Liu
Abstract:
To overcome the restriction of the identical distribution assumption, invariant representation learning for unsupervised domain adaptation (UDA) has made significant advances in the computer vision and pattern recognition communities. In the UDA scenario, the training and test data belong to different domains while the task model is learned to be invariant. Recently, empirical connections between transferability and discriminability have received increasing attention, and these connections are the key to understanding the invariant representations. However, theoretical study of these abilities and in-depth analysis of the learned feature structures remain unexplored. In this work, we systematically analyze the essentials of transferability and discriminability from the geometric perspective. Our theoretical results provide insights into understanding the co-regularization relation and prove the possibility of learning these abilities. From the methodological aspect, the abilities are formulated as geometric properties between domain/cluster subspaces (i.e., orthogonality and equivalence) and characterized as the relation between the norms/ranks of multiple matrices. Two optimization-friendly learning principles are derived, which also admit intuitive explanations. Moreover, a feasible range for the co-regularization parameters is deduced to balance the learning of geometric structures. Based on the theoretical results, a geometry-oriented model is proposed for enhancing the transferability and discriminability via nuclear norm optimization. Extensive experimental results validate the effectiveness of the proposed model in empirical applications, and verify that the geometric abilities can be sufficiently learned in the derived feasible range.
Submitted 24 June, 2024;
originally announced July 2024.
-
Dynamic-Mode Decomposition of Geostrophically Balanced and Unbalanced Motions from SWOT
Authors:
Takaya Uchida,
Yadidya Badarvada,
Karl E. Lapo,
Xiaobiao Xu,
Brian K. Arbic,
Dimitris Menemenlis,
Luna Hiron,
Eric P. Chassignet,
Jay F. Shriver
Abstract:
The decomposition of oceanic flow into its balanced and unbalanced motions carries theoretical and practical significance for the oceanographic community. These two motions have distinct dynamical characteristics and affect the transport of tracers differently from one another. The launch of the Surface Water and Ocean Topography (SWOT) satellite provides a prime opportunity to diagnose the surface balanced and unbalanced motions on a global scale at an unprecedented spatial resolution. Here, we apply dynamic-mode decomposition (DMD), a linear-algebraic data-driven method, to a tidally-forced numerical simulation and one-day-repeat SWOT observations of sea-surface height (SSH) in the Gulf Stream extension. DMD is able to separate the spatial modes associated with sub-inertial periods from those associated with super-inertial periods. The sub-inertial modes of DMD can be used to extract geostrophically balanced motions from SSH fields, which have an imprint of internal tides.
Submitted 12 July, 2024;
originally announced July 2024.
-
Measurement of $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer
, et al. (414 additional authors not shown)
Abstract:
We report measurements of time-dependent $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays based on a data sample of $(388\pm6)\times10^6$ $B\bar{B}$ events collected at the $Υ(4S)$ resonance with the Belle II detector. The Belle II experiment operates at the SuperKEKB asymmetric-energy $e^+e^-$ collider. We measure decay-time distributions to determine $CP$-violating parameters $S$ and $C$. We determine these parameters for two ranges of $K^0_S π^0$ invariant mass: $m(K^0_S π^0)\in (0.8, 1.0)$ GeV/$c^2$, which is dominated by $B^0 \to K^{*0} (\to K^0_S π^0) γ$ decays, and a complementary region $m(K^0_S π^0)\in (0.6, 0.8)\cup(1.0, 1.8)$ GeV/$c^2$. Our results have improved precision as compared to previous measurements and are consistent with theory predictions.
Submitted 12 July, 2024;
originally announced July 2024.
-
Dynamic protected states in the non-Hermitian system
Authors:
Lei Chen,
Zhen-Xia Niu,
Xingran Xu
Abstract:
The non-Hermitian skin effect and nonreciprocal behavior are sensitive to the boundary conditions, which are unique features of non-Hermitian systems. The eigenenergies become complex and all eigenstates are localized at the boundary, in contrast to Hermitian topologies. In this work, we theoretically study the dynamic behavior of the propagation of Gaussian wavepackets inside a non-Hermitian lattice and analyze the self-acceleration process of bulk states or Gaussian wavepackets toward the system's boundary. The initial wavepackets not only propagate toward the side where the eigenstates are localized, but their momentum also approaches a specific value at which the imaginary part of the energy dispersion is maximal. In addition, if the wavepackets cover this specific momentum, they will eventually exhibit exponentially increasing amplitudes with time evolution, maintaining the dynamic protected condition for an extended period of time until they approach the boundary. We also take two widely used toy models as examples in one and two dimensions to verify the correspondence between the non-Hermitian skin effect and the dynamic protected state.
Submitted 12 July, 2024;
originally announced July 2024.
-
Dynamic neural network with memristive CIM and CAM for 2D and 3D vision
Authors:
Yue Zhang,
Woyu Zhang,
Shaocong Wang,
Ning Lin,
Yifei Yu,
Yangu He,
Bo Wang,
Hao Jiang,
Peng Lin,
Xiaoxin Xu,
Xiaojuan Qi,
Zhongrui Wang,
Xumeng Zhang,
Dashan Shang,
Qi Liu,
Kwang-Ting Cheng,
Ming Liu
Abstract:
The brain is dynamic, associative and efficient. It reconfigures by associating the inputs with past experiences, with fused memory and processing. In contrast, AI models are static, unable to associate inputs with past experiences, and run on digital computers with physically separated memory and processing. We propose a hardware-software co-design, a semantic memory-based dynamic neural network (DNN) using memristors. The network associates incoming data with past experience stored as semantic vectors. The network and the semantic memory are physically implemented on noise-robust ternary memristor-based Computing-In-Memory (CIM) and Content-Addressable Memory (CAM) circuits, respectively. We validate our co-designs, using a 40nm memristor macro, on ResNet and PointNet++ for classifying images and 3D points from the MNIST and ModelNet datasets. The co-design not only achieves accuracy on par with software but also yields a 48.1% and 15.9% reduction in computational budget. Moreover, it delivers a 77.6% and 93.3% reduction in energy consumption.
Submitted 12 July, 2024;
originally announced July 2024.
-
Measurement of branching fractions, CP asymmetry, and isospin asymmetry for $\boldsymbol{B\rightarrowργ}$ decays using Belle and Belle II data
Authors:
Belle II Collaboration,
I. Adachi,
K. Adamczyk,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer
, et al. (385 additional authors not shown)
Abstract:
We present measurements of $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays using a combined data sample of $772 \times 10^6$ $B\overline{B}$ pairs collected by the Belle experiment and $387\times 10^6$ $B\overline{B}$ pairs collected by the Belle II experiment in $e^{+}e^{-}$ collisions at the $Υ(4S)$ resonance. After an optimized selection, a simultaneous fit to the Belle and Belle II data sets yields $114\pm 12$ $B^{+}\rightarrowρ^{+}γ$ and $99\pm 12$ $B^{0}\rightarrowρ^{0}γ$ decays. The measured branching fractions are $(13.1^{+2.0 +1.3}_{-1.9 -1.2})\times 10^{-7}$ and $(7.5\pm 1.3^{+1.0}_{-0.8})\times 10^{-7}$ for $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays, respectively, where the first uncertainty is statistical and the second is systematic. We also measure the isospin asymmetry $A_{\rm I}(B\rightarrowργ)=(10.9^{+11.2 +7.8}_{-11.7 -7.3})\%$ and the direct CP asymmetry $A_{CP}(B^{+}\rightarrowρ^{+}γ)=(-8.2\pm 15.2^{+1.6}_{-1.2})\%$.
Submitted 12 July, 2024;
originally announced July 2024.
-
Self-dualities and Galois symmetries in Feynman integrals
Authors:
Sebastian Pögel,
Xing Wang,
Stefan Weinzierl,
Konglong Wu,
Xiaofeng Xu
Abstract:
It is well-known that all Feynman integrals within a given family can be expressed as a finite linear combination of master integrals. The master integrals naturally group into sectors. Starting from two loops, there can exist sectors made up of more than one master integral. In this paper we show that such sectors may have additional symmetries. First of all, self-duality, which was first observed in Feynman integrals related to Calabi--Yau geometries, often carries over to non-Calabi--Yau Feynman integrals. Secondly, we show that in addition there can exist Galois symmetries relating integrals. In the simplest case of two master integrals within a sector, whose definition involves a square root $r$, we may choose a basis $(I_1,I_2)$ such that $I_2$ is obtained from $I_1$ by the substitution $r \rightarrow -r$. This pattern also persists in sectors, which a priori are not related to any square root with dependence on the kinematic variables. We show in several examples that in such cases a suitable redefinition of the integrals introduces constant square roots like $\sqrt{3}$. The new master integrals are then again related by a Galois symmetry, for example the substitution $\sqrt{3} \rightarrow -\sqrt{3}$. To handle the case where the argument of a square root would be a perfect square we introduce a limit Galois symmetry. Both self-duality and Galois symmetries constrain the differential equation.
Submitted 11 July, 2024;
originally announced July 2024.
-
Modeling and Suppressing Unwanted Parasitic Interactions in Superconducting Circuits
Authors:
Xuexin Xu
Abstract:
Superconducting qubits are among the most promising candidates for building quantum computers. Despite significant improvements in qubit coherence, achieving a fault-tolerant quantum computer remains a major challenge, largely due to imperfect gate fidelity. A key source of this infidelity is the parasitic interaction between coupled qubits, which this thesis addresses in two- and three-qubit circuits. This parasitic interaction causes a bending between computational and non-computational levels, leading to a parasitic ZZ interaction. The thesis first investigates the possibility of zeroing the ZZ interaction in two qubit combinations: a pair of interacting transmons, and a hybrid pair of a transmon coupled to a capacitively shunted flux qubit (CSFQ). The theory developed is used to accurately simulate experimental results from our collaborators, who measured a CSFQ-transmon pair with and without a cross-resonance (CR) gate. The strong agreement between theory and experiment motivated further study of a CR gate that achieves 99.9% fidelity in the absence of static ZZ interaction. Since the CR pulse adds an additional ZZ component to the static part, a new strategy called dynamical ZZ freedom is proposed to zero the total ZZ interaction. This strategy can be applied in all-transmon circuits to enable perfect entanglement. Based on these findings, a new two-qubit gate, the parasitic-free (PF) gate, is proposed. Additionally, the thesis explores how to utilize the ZZ interaction to enhance the performance of a controlled-Z gate. Lastly, the impact of a third qubit on two-qubit gate performance is examined, with several examples illustrating the properties of two-body ZZ and three-body ZZZ interactions in circuits with more than two qubits.
Submitted 11 July, 2024;
originally announced July 2024.
-
Purity benchmarking study of error coherence in a single Xmon qubit
Authors:
Auda Zhu,
Jérémy H. Béjanin,
Xicheng Xu,
Matteo Mariantoni
Abstract:
In this study, we employ purity benchmarking (PB) to explore the dynamics of gate noise in a superconducting qubit system. Over 1110 hours of observations on an Xmon qubit, we simultaneously measure the coherence noise budget across two different operational frequencies. We find that incoherent errors, which predominate in overall error rates, exhibit minimal frequency dependence, suggesting they are primarily due to wide-band, diffusive incoherent error sources. In contrast, coherent errors, although less prevalent, show significant sensitivity to operational frequency variations and telegraphic noise. We speculate that this sensitivity is due to interactions with a single strongly coupled environmental defect -- modeled as a two-level system -- which influences qubit control parameters and causes coherent calibration errors. Our results also demonstrate that PB offers improved sensitivity, capturing additional dynamics that conventional relaxation time measurements cannot detect, thus presenting a more comprehensive method for capturing dynamic interactions within quantum systems. The intricate nature of these coherence dynamics underscores the need for further research.
Submitted 10 July, 2024;
originally announced July 2024.
-
Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (645 additional authors not shown)
Abstract:
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.
Submitted 10 July, 2024;
originally announced July 2024.
-
Beat-It: Beat-Synchronized Multi-Condition 3D Dance Generation
Authors:
Zikai Huang,
Xuemiao Xu,
Cheng Xu,
Huaidong Zhang,
Chenxi Zheng,
Jing Qin,
Shengfeng He
Abstract:
Dance, as an art form, fundamentally hinges on the precise synchronization with musical beats. However, achieving aesthetically pleasing dance sequences from music is challenging, with existing methods often falling short in controllability and beat alignment. To address these shortcomings, this paper introduces Beat-It, a novel framework for beat-specific, key pose-guided dance generation. Unlike prior approaches, Beat-It uniquely integrates explicit beat awareness and key pose guidance, effectively resolving two main issues: the misalignment of generated dance motions with musical beats, and the inability to map key poses to specific beats, critical for practical choreography. Our approach disentangles beat conditions from music using a nearest beat distance representation and employs a hierarchical multi-condition fusion mechanism. This mechanism seamlessly integrates key poses, beats, and music features, mitigating condition conflicts and offering rich, multi-conditioned guidance for dance generation. Additionally, a specially designed beat alignment loss ensures the generated dance movements remain in sync with the designated beats. Extensive experiments confirm Beat-It's superiority over existing state-of-the-art methods in terms of beat alignment and motion controllability.
Submitted 10 July, 2024;
originally announced July 2024.
-
Higher-order Fuzzy Membership in Motif Modularity Optimization
Authors:
Jing Xiao,
Ya-Wei Wei,
Xiao-Ke Xu
Abstract:
Higher-order community detection (HCD) reveals both mesoscale structures and functional characteristics of real-life networks. Although many methods have been developed from diverse perspectives, to our knowledge, none can provide fine-grained higher-order fuzzy community information. This study presents a novel concept of higher-order fuzzy memberships that quantify the membership grades of motifs to crisp higher-order communities, thereby revealing the partial community affiliations. Furthermore, we employ higher-order fuzzy memberships to enhance HCD via a general framework called fuzzy memberships assisted motif-based evolutionary modularity (FMMEM). In FMMEM, on the one hand, a fuzzy membership-based neighbor community modification (FM-NCM) strategy is designed to correct misassigned bridge nodes, thereby improving partition quality. On the other hand, a fuzzy membership-based local community merging (FM-LCM) strategy is also proposed to combine excessively fragmented communities for enhancing local search ability. Experimental results indicate that the FMMEM framework outperforms state-of-the-art methods in both synthetic and real-world datasets, particularly in the networks with ambiguous and complex structures.
Submitted 9 July, 2024;
originally announced July 2024.
-
Accelerating Mobile Edge Generation (MEG) by Constrained Learning
Authors:
Xiaoxia Xu,
Yuanwei Liu,
Xidong Mu,
Hong Xing,
Arumugam Nallanathan
Abstract:
A novel accelerated mobile edge generation (MEG) framework is proposed for generating high-resolution images on mobile devices. Exploiting a large-scale latent diffusion model (LDM) distributed across edge server (ES) and user equipment (UE), cost-efficient artificial intelligence generated content (AIGC) is achieved by transmitting low-dimensional features between ES and UE. To reduce overheads of both distributed computations and transmissions, a dynamic diffusion and feature merging scheme is conceived. By jointly optimizing the denoising steps and feature merging ratio, the image generation quality is maximized subject to latency and energy consumption constraints. To address this problem and tailor LDM sub-models, a low-complexity MEG acceleration protocol is developed. Particularly, a backbone meta-architecture is trained via offline distillation. Then, dynamic diffusion and feature merging are determined in online channel environment, which can be viewed as a constrained Markov Decision Process (MDP). A constrained variational policy optimization (CVPO) based MEG algorithm is further proposed for constraint-guaranteed learning, namely MEG-CVPO. Numerical results verify that: 1) The proposed framework can generate 1024$\times$1024 high-quality images over noisy channels while reducing over $40\%$ latency compared to conventional generation schemes. 2) The developed MEG-CVPO effectively mitigates constraint violations, thus flexibly controlling the trade-off between image distortion and generation costs.
Submitted 9 July, 2024;
originally announced July 2024.
-
Multiple Instance Verification
Authors:
Xin Xu,
Eibe Frank,
Geoffrey Holmes
Abstract:
We explore multiple-instance verification, a problem setting where a query instance is verified against a bag of target instances with heterogeneous, unknown relevancy. We show that naive adaptations of attention-based multiple instance learning (MIL) methods and standard verification methods like Siamese neural networks are unsuitable for this setting: directly combining state-of-the-art (SOTA) MIL methods and Siamese networks is shown to be no better, and sometimes significantly worse, than a simple baseline model. Postulating that this may be caused by the failure of the representation of the target bag to incorporate the query instance, we introduce a new pooling approach named ``cross-attention pooling'' (CAP). Under the CAP framework, we propose two novel attention functions to address the challenge of distinguishing between highly similar instances in a target bag. Through empirical studies on three different verification tasks, we demonstrate that CAP outperforms adaptations of SOTA MIL methods and the baseline by substantial margins, in terms of both classification accuracy and quality of the explanations provided for the classifications. Ablation studies confirm the superior ability of the new attention functions to identify key instances.
Submitted 9 July, 2024;
originally announced July 2024.
-
Improving Speech Enhancement by Integrating Inter-Channel and Band Features with Dual-branch Conformer
Authors:
Jizhen Li,
Xinmeng Xu,
Weiping Tu,
Yuhong Yang,
Rong Zhu
Abstract:
Recent speech enhancement methods based on convolutional neural networks (CNNs) and transformers have been shown to capture time-frequency (T-F) information on spectrograms effectively. However, the correlation among the channels of speech features has not been explored. Theoretically, each channel map of speech features obtained by different convolution kernels contains information at different scales demonstrating strong correlations. To fill this gap, we propose a novel dual-branch architecture named channel-aware dual-branch conformer (CADB-Conformer), which effectively explores the long-range time and frequency correlations among different channels, respectively, to extract channel-relation-aware time-frequency information. Ablation studies conducted on the DNS-Challenge 2020 dataset demonstrate the importance of leveraging channel features and the significance of channel-relation-aware T-F information for speech enhancement. Extensive experiments also show that the proposed model outperforms recent methods at an attractive computational cost.
Submitted 13 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
-
4D Contrastive Superflows are Dense 3D Representation Learners
Authors:
Xiang Xu,
Lingdong Kong,
Hui Shuai,
Wenwei Zhang,
Liang Pan,
Kai Chen,
Ziwei Liu,
Qingshan Liu
Abstract:
In the realm of autonomous driving, accurate 3D perception is the foundation. However, developing such models relies on extensive human annotations -- a process that is both costly and labor-intensive. To address this challenge from a data representation learning perspective, we introduce SuperFlow, a novel framework designed to harness consecutive LiDAR-camera pairs for establishing spatiotemporal pretraining objectives. SuperFlow stands out by integrating two key designs: 1) a dense-to-sparse consistency regularization, which promotes insensitivity to point cloud density variations during feature learning, and 2) a flow-based contrastive learning module, carefully crafted to extract meaningful temporal cues from readily available sensor calibrations. To further boost learning efficiency, we incorporate a plug-and-play view consistency module that enhances the alignment of the knowledge distilled from camera views. Extensive comparative and ablation studies across 11 heterogeneous LiDAR datasets validate our effectiveness and superiority. Additionally, we observe several interesting emerging properties by scaling up the 2D and 3D backbones during pretraining, shedding light on the future research of 3D foundation models for LiDAR-based perception.
Submitted 9 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
-
Receiver Selection and Transmit Beamforming for Multi-static Integrated Sensing and Communications
Authors:
Dan Wang,
Yuanming Tian,
Chuan Huang,
Hao Chen,
Xiaodong Xu,
Ping Zhang
Abstract:
Next-generation wireless networks are expected to develop a novel paradigm of integrated sensing and communications (ISAC) to enable both high-accuracy sensing and high-speed communications. However, conventional mono-static ISAC systems, which simultaneously transmit and receive at the same equipment, may suffer from severe self-interference, which significantly degrades the system performance. To address this issue, this paper studies a multi-static ISAC system for cooperative target localization and communications, where the transmitter transmits ISAC signals to multiple receivers (REs) deployed at different positions. We derive the closed-form Cramér-Rao bound (CRB) on the joint estimations of both the transmission delay and Doppler shift for cooperative target localization, and the CRB minimization problem is formulated by considering the cooperative cost and communication rate requirements for the REs. To solve this problem, we first decouple it into two subproblems for RE selection and transmit beamforming, respectively. Then, a minimax linkage-based method is proposed to solve the RE selection subproblem, and a successive convex approximation algorithm is adopted to deal with the transmit beamforming subproblem with non-convex constraints. Finally, numerical results validate our analysis and reveal that our proposed multi-static ISAC scheme achieves better ISAC performance than the conventional mono-static ones when the number of cooperative REs is large.
Submitted 8 July, 2024;
originally announced July 2024.
-
Multi-Colouring of Kneser Graphs: Notes on Stahl's Conjecture
Authors:
Jan van den Heuvel,
Xinyi Xu
Abstract:
If a graph is $n$-colourable, then it obviously is $n'$-colourable for any $n'\ge n$. But the situation is not so clear when we consider multi-colourings of graphs. A graph is $(n,k)$-colourable if we can assign each vertex a $k$-subset of $\{1,2,\ldots,n\}$ so that adjacent vertices receive disjoint subsets. In this note we consider the following problem: if a graph is $(n,k)$-colourable, then for what pairs $(n', k')$ is it also $(n',k')$-colourable? This question can be translated into a question regarding multi-colourings of Kneser graphs, for which Stahl formulated a conjecture in 1976. We present new results, strengthen existing results, and in particular present much simpler proofs of several known cases of the conjecture.
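The definition of an $(n,k)$-colouring above can be checked mechanically. The following is a minimal Python sketch, not taken from the paper: the function names `is_nk_colouring` and `kneser_graph` are hypothetical, and the Kneser graph is built using its standard definition (vertices are the $k$-subsets of $\{1,\ldots,n\}$, adjacent when disjoint).

```python
from itertools import combinations


def is_nk_colouring(edges, assignment, n, k):
    """Return True if `assignment` is an (n, k)-colouring of the graph.

    edges      -- iterable of (u, v) vertex pairs
    assignment -- dict mapping each vertex to a collection of colours
    Every vertex must receive exactly k distinct colours from {1, ..., n},
    and adjacent vertices must receive disjoint colour sets.
    """
    palette = set(range(1, n + 1))
    for colours in assignment.values():
        colour_set = set(colours)
        if len(colour_set) != k or not colour_set <= palette:
            return False
    return all(
        set(assignment[u]).isdisjoint(assignment[v]) for u, v in edges
    )


def kneser_graph(n, k):
    """Build the Kneser graph K(n, k) as (vertices, edges)."""
    vertices = [frozenset(c) for c in combinations(range(1, n + 1), k)]
    edges = [(a, b) for a, b in combinations(vertices, 2) if a.isdisjoint(b)]
    return vertices, edges


# The 5-cycle is (5, 2)-colourable: vertex i gets colours {2i, 2i+1} mod 5.
c5_edges = [(i, (i + 1) % 5) for i in range(5)]
c5_colouring = {i: {(2 * i) % 5 + 1, (2 * i + 1) % 5 + 1} for i in range(5)}

# K(5, 2) is the Petersen graph; each vertex coloured by itself is a
# (5, 2)-colouring by construction.
petersen_vertices, petersen_edges = kneser_graph(5, 2)
petersen_colouring = {v: v for v in petersen_vertices}
```

Colouring each vertex of the Kneser graph by itself is valid by construction, which is one way to see why questions about $(n,k)$-colourability translate into questions about Kneser graphs.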
Submitted 8 July, 2024;
originally announced July 2024.
-
Learning to Adapt Category Consistent Meta-Feature of CLIP for Few-Shot Classification
Authors:
Jiaying Shi,
Xuetong Xue,
Shenghui Xu
Abstract:
Recent CLIP-based methods have shown promising zero-shot and few-shot performance on image classification tasks. Existing approaches such as CoOp and Tip-Adapter only focus on high-level visual features that are fully aligned with textual features representing the ``Summary'' of the image. However, the goal of few-shot learning is to classify unseen images of the same category with few labeled samples. In particular, in contrast to high-level representations, low-level local representations (LRs) are more consistent between seen and unseen samples. Based on this point, we propose the Meta-Feature Adaption method (MF-Adapter) that combines the complementary strengths of both LRs and high-level semantic representations. Specifically, we introduce the Meta-Feature Unit (MF-Unit), which is a simple yet effective local similarity metric to measure category-consistent local context in an inductive manner. Then we train an MF-Adapter to map image features to MF-Unit for adequately generalizing the intra-class knowledge between unseen images and the support set. Extensive experiments show that our proposed method is superior to the state-of-the-art CLIP downstream few-shot classification methods, even showing stronger performance on a set of challenging visual classification tasks.
Submitted 8 July, 2024;
originally announced July 2024.
-
HPFF: Hierarchical Locally Supervised Learning with Patch Feature Fusion
Authors:
Junhao Su,
Chenghao He,
Feiyu Zhu,
Xiaojie Xu,
Dongzhi Guan,
Chenyang Si
Abstract:
Traditional deep learning relies on end-to-end backpropagation for training, but it suffers from drawbacks such as high memory consumption and not aligning with biological neural networks. Recent advancements have introduced locally supervised learning, which divides networks into modules with isolated gradients and trains them locally. However, this approach can lead to performance lag due to limited interaction between these modules, and the design of auxiliary networks occupies a certain amount of GPU memory. To overcome these limitations, we propose a novel model called HPFF that performs hierarchical locally supervised learning and patch-level feature computation on the auxiliary networks. Hierarchical Locally Supervised Learning (HiLo) enables the network to learn features at different granularity levels along their respective local paths. Specifically, the network is divided into two-level local modules: independent local modules and cascade local modules. The cascade local modules combine two adjacent independent local modules, incorporating both updates within the modules themselves and information exchange between adjacent modules. Patch Feature Fusion (PFF) reduces GPU memory usage by splitting the input features of the auxiliary networks into patches for computation. By averaging these patch-level features, it enhances the network's ability to focus more on those patterns that are prevalent across multiple patches. Furthermore, our method exhibits strong generalization capabilities and can be seamlessly integrated with existing techniques. We conduct experiments on CIFAR-10, STL-10, SVHN, and ImageNet datasets, and the results demonstrate that our proposed HPFF significantly outperforms previous approaches, consistently achieving state-of-the-art performance across different datasets. Our code is available at: https://github.com/Zeudfish/HPFF.
Submitted 8 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
-
AdaPI: Facilitating DNN Model Adaptivity for Efficient Private Inference in Edge Computing
Authors:
Tong Zhou,
Jiahui Zhao,
Yukui Luo,
Xi Xie,
Wujie Wen,
Caiwen Ding,
Xiaolin Xu
Abstract:
Private inference (PI) has emerged as a promising solution to execute computations on encrypted data, safeguarding user privacy and model parameters in edge computing. However, existing PI methods are predominantly developed considering constant resource constraints, overlooking the varied and dynamic resource constraints in diverse edge devices, like energy budgets. Consequently, model providers have to design specialized models for different devices, where all of them have to be stored on the edge server, resulting in inefficient deployment. To fill this gap, this work presents AdaPI, a novel approach that achieves adaptive PI by allowing a model to perform well across edge devices with diverse energy budgets. AdaPI employs a PI-aware training strategy that optimizes the model weights alongside weight-level and feature-level soft masks. These soft masks are subsequently transformed into multiple binary masks to enable adjustments in communication and computation workloads. Through sequentially training the model with increasingly dense binary masks, AdaPI attains optimal accuracy for each energy budget, which outperforms the state-of-the-art PI methods by 7.3\% in terms of test accuracy on CIFAR-100. The code of AdaPI can be accessed via https://github.com/jiahuiiiiii/AdaPI.
Submitted 8 July, 2024;
originally announced July 2024.
-
Momentum Auxiliary Network for Supervised Local Learning
Authors:
Junhao Su,
Changpeng Cai,
Feiyu Zhu,
Chenghao He,
Xiaojie Xu,
Dongzhi Guan,
Chenyang Si
Abstract:
Deep neural networks conventionally employ end-to-end backpropagation for their training process, which lacks biological credibility and triggers a locking dilemma during network parameter updates, leading to significant GPU memory use. Supervised local learning segments the network into multiple local blocks updated by independent auxiliary networks. However, these methods cannot replace end-to-end training due to lower accuracy, as gradients only propagate within their local block, creating a lack of information exchange between blocks. To address this issue and establish information transfer across blocks, we propose a Momentum Auxiliary Network (MAN) that establishes a dynamic interaction mechanism. The MAN leverages an exponential moving average (EMA) of the parameters from adjacent local blocks to enhance information flow. This auxiliary network, updated through EMA, helps bridge the informational gap between blocks. Nevertheless, we observe that directly applying EMA parameters has certain limitations due to feature discrepancies among local blocks. To overcome this, we introduce learnable biases, further boosting performance. We have validated our method on four image classification datasets (CIFAR-10, STL-10, SVHN, ImageNet), attaining superior performance and substantial memory savings. Notably, our method can reduce GPU memory usage by more than 45\% on the ImageNet dataset compared to end-to-end training, while achieving higher performance. The Momentum Auxiliary Network thus offers a new perspective for supervised local learning. Our code is available at: https://github.com/JunhaoSu0/MAN.
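For readers unfamiliar with the term, an exponential moving average of parameters can be sketched in a few lines of Python. This is a generic illustration only: the function name `ema_update`, the flat-list parameter representation, and the decay value are assumptions, and the abstract does not specify the update rule actually used by MAN.

```python
def ema_update(ema_params, source_params, decay=0.9):
    """Generic EMA step: ema <- decay * ema + (1 - decay) * source.

    ema_params    -- current averaged parameters (list of floats)
    source_params -- latest parameters of the block being tracked
    decay         -- smoothing factor in [0, 1); 0.9 is a hypothetical
                     default, not a value taken from the paper
    """
    if len(ema_params) != len(source_params):
        raise ValueError("parameter lists must have the same length")
    return [
        decay * e + (1.0 - decay) * s
        for e, s in zip(ema_params, source_params)
    ]


# Track the parameters of a (hypothetical) adjacent block over two steps.
ema = [0.0, 0.0]
ema = ema_update(ema, [1.0, 2.0], decay=0.5)  # moves halfway to the source
ema = ema_update(ema, [1.0, 2.0], decay=0.5)  # moves halfway again
```

A decay closer to 1 makes the averaged copy change more slowly, so it follows the tracked block's parameters smoothly rather than copying them at every step.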
Submitted 9 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
-
VideoCoT: A Video Chain-of-Thought Dataset with Active Annotation Tool
Authors:
Yan Wang,
Yawen Zeng,
Jingsheng Zheng,
Xiaofen Xing,
Jin Xu,
Xiangmin Xu
Abstract:
Multimodal large language models (MLLMs) are flourishing, but they mainly focus on images and pay less attention to videos, especially in sub-fields such as prompt engineering, video chain-of-thought (CoT), and instruction tuning on videos. Therefore, we explore the collection of CoT datasets for videos to support video OpenQA and improve the reasoning ability of MLLMs. Unfortunately, making such video CoT datasets is not an easy task. Given that human annotation is too cumbersome and expensive, while machine-generated annotation is not reliable due to the hallucination issue, we develop an automatic annotation tool that combines machine and human experts, under the active learning paradigm. Active learning is an interactive strategy between the model and human experts; in this way, the workload of human labeling can be reduced and the quality of the dataset can be guaranteed. With the help of the automatic annotation tool, we strive to contribute three datasets, namely VideoCoT, TopicQA, and TopicCoT. Furthermore, we propose a simple but effective benchmark based on the collected datasets, which exploits CoT to maximize the complex reasoning capabilities of MLLMs. Extensive experiments demonstrate the effectiveness of our solution.
△ Less
Submitted 7 July, 2024;
originally announced July 2024.
-
Search for the baryon number and lepton number violating decays $τ^-\to Λπ^-$ and $τ^-\to \barΛπ^-$ at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Althubiti,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien
, et al. (349 additional authors not shown)
Abstract:
We present a search for the baryon number $B$ and lepton number $L$ violating decays $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛ π^-$ produced from the $e^+e^-\to τ^+τ^-$ process, using a 364 fb$^{-1}$ data sample collected by the Belle~II experiment at the SuperKEKB collider. No evidence of signal is found in either decay mode, which have $|Δ(B-L)|$ equal to $2$ and $0$, respectively. Upper limits at 90\% credibility level on the branching fractions of $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛπ^-$ are determined to be $4.7 \times 10^{-8}$ and $4.3 \times 10^{-8}$, respectively.
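As a quick check of the quoted $|Δ(B-L)|$ values, assuming the standard quantum-number assignments ($B=0$, $L=1$ for $τ^-$; $B=+1$ for $Λ$; $B=-1$ for $\barΛ$; $B=L=0$ for $π^-$), the initial state has $B-L=-1$ and the two final states give:

\[
\begin{aligned}
τ^- \to Λπ^- &: \quad (B-L)_{\text{final}} = 1 - 0 = +1, \qquad |Δ(B-L)| = |{+1} - ({-1})| = 2,\\
τ^- \to \barΛπ^- &: \quad (B-L)_{\text{final}} = -1 - 0 = -1, \qquad |Δ(B-L)| = |{-1} - ({-1})| = 0.
\end{aligned}
\]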
Submitted 6 July, 2024;
originally announced July 2024.
-
Spectroscopy of deeply bound orbitals in neutron-rich Ca isotopes
Authors:
P. J. Li,
J. Lee,
P. Doornenbal,
S. Chen,
S. Wang,
A. Obertelli,
Y. Chazono,
J. D. Holt,
B. S. Hu,
K. Ogata,
Y. Utsuno,
K. Yoshida,
N. L. Achouri,
H. Baba,
F. Browne,
D. Calvet,
F. Château,
N. Chiga,
A. Corsi,
M. L. Cortés,
A. Delbart,
J-M. Gheller,
A. Giganon,
A. Gillibert,
C. Hilaire
, et al. (63 additional authors not shown)
Abstract:
The calcium isotopes are an ideal system to investigate the evolution of shell structure and magic numbers. Although the properties of surface nucleons in calcium have been well studied, probing the structure of deeply bound nucleons remains a challenge. Here, we report on the first measurement of unbound states in $^{53}$Ca and $^{55}$Ca, populated from $^{54,56}$Ca($p,pn$) reactions at a beam energy of around 216 MeV/nucleon at the RIKEN Radioactive Isotopes Beam Factory. The resonance properties, partial cross sections, and momentum distributions of these unbound states were analyzed. Orbital angular momentum $l$ assignments were extracted from momentum distributions based on calculations using the distorted wave impulse approximation (DWIA) reaction model. The resonances at excitation energies of 5516(41)\,keV in $^{53}$Ca and 6000(250)\,keV in $^{55}$Ca indicate a significant $l=3$ component, providing the first experimental evidence for the $ν0f_{7/2}$ single-particle strength of unbound hole states in the neutron-rich Ca isotopes. The observed excitation energies and cross-sections point towards extremely localized and well separated strength distributions, with some fragmentation for the $ν0f_{7/2}$ orbital in $^{55}$Ca. These results are in good agreement with predictions from shell-model calculations using the effective GXPF1Bs interaction and \textit{ab initio} calculations and diverge markedly from the experimental distributions in the nickel isotones at $Z=28$.
Submitted 5 July, 2024;
originally announced July 2024.
-
Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection
Authors:
Zhiqiang Yang,
Qiu Guan,
Keer Zhao,
Jianmin Yang,
Xinli Xu,
Haixia Long,
Ying Tang
Abstract:
Due to the effective performance of multi-scale feature fusion, Path Aggregation FPN (PAFPN) is widely employed in YOLO detectors. However, it cannot efficiently and adaptively integrate high-level semantic information with low-level spatial information simultaneously. We propose a new model named MAF-YOLO in this paper, which is a novel object detection framework with a versatile neck named Multi-Branch Auxiliary FPN (MAFPN). Within MAFPN, the Superficial Assisted Fusion (SAF) module is designed to combine the output of the backbone with the neck, preserving an optimal level of shallow information to facilitate subsequent learning. Meanwhile, the Advanced Assisted Fusion (AAF) module deeply embedded within the neck conveys a more diverse range of gradient information to the output layer.
Furthermore, our proposed Re-parameterized Heterogeneous Efficient Layer Aggregation Network (RepHELAN) module ensures that both the overall model architecture and convolutional design embrace the utilization of heterogeneous large convolution kernels. Therefore, this guarantees the preservation of information related to small targets while simultaneously achieving the multi-scale receptive field. Finally, taking the nano version of MAF-YOLO for example, it can achieve 42.4% AP on COCO with only 3.76M learnable parameters and 10.51G FLOPs, and approximately outperforms YOLOv8n by about 5.1%. The source code of this work is available at: https://github.com/yang-0201/MAF-YOLO.
△ Less
Submitted 5 July, 2024;
originally announced July 2024.
-
An Interactive Multi-modal Query Answering System with Retrieval-Augmented Large Language Models
Authors:
Mengzhao Wang,
Haotian Wu,
Xiangyu Ke,
Yunjun Gao,
Xiaoliang Xu,
Lu Chen
Abstract:
Retrieval-augmented Large Language Models (LLMs) have reshaped traditional query-answering systems, offering unparalleled user experiences. However, existing retrieval techniques often struggle to handle multi-modal query contexts. In this paper, we present an interactive Multi-modal Query Answering (MQA) system, empowered by our newly developed multi-modal retrieval framework and navigation graph index, integrated with cutting-edge LLMs. It comprises five core components: Data Preprocessing, Vector Representation, Index Construction, Query Execution, and Answer Generation, all orchestrated by a dedicated coordinator to ensure smooth data flow from input to answer generation. One notable aspect of MQA is its utilization of contrastive learning to assess the significance of different modalities, facilitating precise measurement of multi-modal information similarity. Furthermore, the system achieves efficient retrieval through our advanced navigation graph index, refined using computational pruning techniques. Another highlight of our system is its pluggable processing framework, allowing seamless integration of embedding models, graph indexes, and LLMs. This flexibility provides users diverse options for gaining insights from their multi-modal knowledge base. A preliminary video introduction of MQA is available at https://youtu.be/xvUuo2ZIqWk.
Submitted 4 July, 2024;
originally announced July 2024.
-
An Autoencoder Architecture for L-band Passive Microwave Retrieval of Landscape Freeze-Thaw Cycle
Authors:
Divya Kumawat,
Ardeshir Ebtehaj,
Xiaolan Xu,
Andreas Colliander,
Vipin Kumar
Abstract:
Estimating the landscape and soil freeze-thaw (FT) dynamics in the Northern Hemisphere is crucial for understanding permafrost response to global warming and changes in regional and global carbon budgets. A new framework is presented for surface FT-cycle retrievals using L-band microwave radiometry based on a deep convolutional autoencoder neural network. This framework defines the landscape FT-cycle retrieval as a time series anomaly detection problem considering the frozen states as normal and thawed states as anomalies. The autoencoder retrieves the FT-cycle probabilistically through supervised reconstruction of the brightness temperature (TB) time series using a contrastive loss function that minimizes (maximizes) the reconstruction error for the peak winter (summer). Using the data provided by the Soil Moisture Active Passive (SMAP) satellite, it is demonstrated that the framework learns to isolate the landscape FT states over different land surface types with varying complexities related to the radiometric characteristics of snow cover, lake-ice phenology, and vegetation canopy. The consistency of the retrievals is evaluated over Alaska, against in situ ground-based observations, showing reduced uncertainties compared to the traditional methods that use thresholding of the normalized polarization ratio.
Submitted 4 July, 2024;
originally announced July 2024.
-
Equilibrium moderate deviations for occupation times of SSEP on regular trees
Authors:
Xiaofeng Xue
Abstract:
In this paper, we are concerned with the symmetric simple exclusion process on the regular tree $\mathbb{T}^d$ for $d\geq 2$. Our main result gives moderate deviation principles for occupation times of the process starting from an invariant product measure. Two replacement lemmas play key roles in the proof of our main result. To obtain these replacement lemmas, we utilize duality relationships between the symmetric exclusion process and two types of random walks on $\mathbb{T}^d$ and $\left(\mathbb{T}^d\right)^2$, respectively.
Submitted 4 July, 2024;
originally announced July 2024.
-
Spatio-temporal cooperative control Method of Highway Ramp Merge Based on Vehicle-road Coordination
Authors:
Xiaoxue Xu,
Maokai Lai,
Haitao Zhang,
Xiang Dong,
Tao Li,
Jie Wu,
Yuan Li,
Ting Peng
Abstract:
The merging area of highway ramps faces multiple challenges, including traffic congestion, collision risks, speed mismatches, driver behavior uncertainties, limited visibility, and bottleneck effects. However, autonomous vehicles that engage in in-depth vehicle-road coordination in merging zones, by pre-planning and uploading travel trajectories, can significantly enhance the safety and efficiency of these zones. In this paper, we introduce a mainline-priority cooperation method to achieve spatio-temporal cooperative control of highway merging. Vehicle-mounted intelligent units share real-time vehicle status and driving intentions with Road Section Management Units, which pre-plan the spatiotemporal trajectories of vehicle travel. After receiving these trajectories, Vehicle Intelligent Units strictly adhere to them. Through this deep collaboration between vehicles and roads, conflicts in time and space during vehicle travel are eliminated in advance.
Submitted 4 July, 2024;
originally announced July 2024.
-
Machine Learning for Economic Forecasting: An Application to China's GDP Growth
Authors:
Yanqing Yang,
Xingcheng Xu,
Jinfeng Ge,
Yan Xu
Abstract:
This paper aims to explore the application of machine learning in forecasting Chinese macroeconomic variables. Specifically, it employs various machine learning models to predict the quarterly real GDP growth of China, and analyzes the factors contributing to the performance differences among these models. Our findings indicate that the average forecast errors of machine learning models are generally lower than those of traditional econometric models or expert forecasts, particularly in periods of economic stability. However, during certain inflection points, although machine learning models still outperform traditional econometric models, expert forecasts may exhibit greater accuracy in some instances due to experts' more comprehensive understanding of the macroeconomic environment and real-time economic variables. In addition to macroeconomic forecasting, this paper employs interpretable machine learning methods to identify the key attributive variables from different machine learning models, aiming to enhance the understanding and evaluation of their contributions to macroeconomic fluctuations.
Submitted 3 July, 2024;
originally announced July 2024.
-
Consistent Point Orientation for Manifold Surfaces via Boundary Integration
Authors:
Weizhou Liu,
Xingce Wang,
Haichuan Zhao,
Xingfei Xue,
Zhongke Wu,
Xuequan Lu,
Ying He
Abstract:
This paper introduces a new approach for generating globally consistent normals for point clouds sampled from manifold surfaces. Given that the generalized winding number (GWN) field generated by a point cloud with globally consistent normals is a solution to a PDE with jump boundary conditions and possesses harmonic properties, and the Dirichlet energy of the GWN field can be defined as an integral over the boundary surface, we formulate a boundary energy derived from the Dirichlet energy of the GWN. Taking as input a point cloud with randomly oriented normals, we optimize this energy to restore the global harmonicity of the GWN field, thereby recovering the globally consistent normals. Experiments show that our method outperforms state-of-the-art approaches, exhibiting enhanced robustness to noise, outliers, complex topologies, and thin structures. Our code can be found at https://github.com/liuweizhou319/BIM.
Submitted 3 July, 2024;
originally announced July 2024.
-
"It's like a rubber duck that talks back": Understanding Generative AI-Assisted Data Analysis Workflows through a Participatory Prompting Study
Authors:
Ian Drosos,
Advait Sarkar,
Xiaotong Xu,
Carina Negreanu,
Sean Rintel,
Lev Tankelevitch
Abstract:
Generative AI tools can help users with many tasks. One such task is data analysis, which is notoriously challenging for non-expert end-users due to its expertise requirements, and where AI holds much potential, such as finding relevant data sources, proposing analysis strategies, and writing analysis code. To understand how data analysis workflows can be assisted or impaired by generative AI, we conducted a study (n=15) using Bing Chat via participatory prompting. Participatory prompting is a recently developed methodology in which users and researchers reflect together on tasks through co-engagement with generative AI. In this paper we demonstrate the value of the participatory prompting method. We found that generative AI benefits the information foraging and sensemaking loops of data analysis in specific ways, but also introduces its own barriers and challenges, arising from the difficulties of query formulation, specifying context, and verifying results.
Submitted 3 July, 2024;
originally announced July 2024.
-
Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
A high-precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10\,087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the BESIII detector at the BEPCII storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$, where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.
Submitted 3 July, 2024;
originally announced July 2024.
-
PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation
Authors:
Zeyu Xie,
Xuenan Xu,
Zhizheng Wu,
Mengyue Wu
Abstract:
Recently, audio generation tasks have attracted considerable research interest. Precise temporal controllability is essential to integrate audio generation with real applications. In this work, we propose a temporally controlled audio generation framework, PicoAudio. PicoAudio integrates temporal information to guide audio generation through tailored model design. It leverages data crawling, segmentation, filtering, and simulation of fine-grained temporally-aligned audio-text data. Both subjective and objective evaluations demonstrate that PicoAudio dramatically surpasses current state-of-the-art generation models in terms of timestamp and occurrence frequency controllability. The generated samples are available on the demo website https://PicoAudio.github.io.
Submitted 3 July, 2024;
originally announced July 2024.
-
AudioTime: A Temporally-aligned Audio-text Benchmark Dataset
Authors:
Zeyu Xie,
Xuenan Xu,
Zhizheng Wu,
Mengyue Wu
Abstract:
Recent advancements in audio generation have enabled the creation of high-fidelity audio clips from free-form textual descriptions. However, temporal relationships, a critical feature for audio content, are currently underrepresented in mainstream models, resulting in imprecise temporal controllability. Specifically, users cannot accurately control the timestamps of sound events using free-form text. We acknowledge that a significant factor is the absence of high-quality, temporally-aligned audio-text datasets, which are essential for training models with temporal control. The more temporally-aligned the annotations, the better the models can understand the precise relationship between audio outputs and temporal textual prompts. Therefore, we present a strongly aligned audio-text dataset, AudioTime. It provides text annotations rich in temporal information such as timestamps, duration, frequency, and ordering, covering almost all aspects of temporal control. Additionally, we offer a comprehensive test set and evaluation metric to assess the temporal control performance of various models. Examples are available at https://zeyuxie29.github.io/AudioTime/
Submitted 3 July, 2024;
originally announced July 2024.
-
Convergence of Implicit Gradient Descent for Training Two-Layer Physics-Informed Neural Networks
Authors:
Xianliang Xu,
Zhongyi Huang,
Ye Li
Abstract:
Optimization algorithms are crucial in training physics-informed neural networks (PINNs); unsuitable methods may lead to poor solutions. Compared to the common gradient descent algorithm, implicit gradient descent (IGD) performs better on some multi-scale problems. In this paper, we provide a convergence analysis of implicit gradient descent for training over-parametrized two-layer PINNs. We first demonstrate the positive definiteness of Gram matrices for general smooth activation functions, such as the sigmoid, softplus, and tanh functions. Over-parameterization then allows us to show that randomly initialized IGD converges to a globally optimal solution at a linear convergence rate. Moreover, due to the different training dynamics, the learning rate of IGD can be chosen independently of the sample size and the least eigenvalue of the Gram matrix.
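As a rough illustration of the difference between the two update rules named in this abstract, the sketch below compares explicit gradient descent, $θ_{k+1} = θ_k - η\nabla L(θ_k)$, with the implicit update, $θ_{k+1} = θ_k - η\nabla L(θ_{k+1})$. It is not the paper's PINN setting: it assumes a toy quadratic loss $L(θ) = \tfrac{1}{2}θ^\top Aθ - b^\top θ$, for which the implicit step has a closed form, and the matrix `A`, vector `b`, and learning rate `lr` are illustrative choices only.

```python
import numpy as np

def explicit_gd_step(theta, A, b, lr):
    """Explicit step: the gradient A @ theta - b is evaluated at the current iterate."""
    return theta - lr * (A @ theta - b)

def implicit_gd_step(theta, A, b, lr):
    """Implicit step: solve theta_new = theta - lr * (A @ theta_new - b) for theta_new.

    For a quadratic loss this is the linear system (I + lr * A) theta_new = theta + lr * b.
    """
    n = A.shape[0]
    return np.linalg.solve(np.eye(n) + lr * A, theta + lr * b)

# Toy multi-scale problem (assumed for illustration): curvatures 1 and 100.
A = np.diag([1.0, 100.0])
b = np.array([1.0, 1.0])
theta_star = np.linalg.solve(A, b)  # exact minimizer of the quadratic

lr = 1.0  # far above the explicit stability limit 2 / 100 for this A
theta_exp = np.zeros(2)
theta_imp = np.zeros(2)
for _ in range(50):
    theta_exp = explicit_gd_step(theta_exp, A, b, lr)
    theta_imp = implicit_gd_step(theta_imp, A, b, lr)

err_exp = np.linalg.norm(theta_exp - theta_star)
err_imp = np.linalg.norm(theta_imp - theta_star)
print(f"explicit GD error: {err_exp:.3e}")
print(f"implicit GD error: {err_imp:.3e}")
```

On this toy problem the implicit iteration contracts the error by a factor $1/(1+ηλ)$ along each eigen-direction with eigenvalue $λ$, so it stays stable for any positive learning rate, while the explicit iteration diverges along the stiff direction. For a neural network the implicit step has no closed form and would itself need an inner solver.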
Submitted 3 July, 2024;
originally announced July 2024.
-
Mobile Edge Generation-Enabled Digital Twin: Architecture Design and Research Opportunities
Authors:
Xiaoxia Xu,
Ruikang Zhong,
Xidong Mu,
Yuanwei Liu,
Kaibin Huang
Abstract:
A novel paradigm of mobile edge generation (MEG)-enabled digital twin (DT) is proposed, which enables distributed on-device generation at mobile edge networks for real-time DT applications. First, an MEG-DT architecture is put forward to decentralize generative artificial intelligence (GAI) models onto edge servers (ESs) and user equipments (UEs), which has the advantages of low latency, privacy preservation, and individual-level customization. Then, various single-user and multi-user generation mechanisms are conceived for MEG-DT, which strike trade-offs between generation latency, hardware costs, and device coordination. Furthermore, to perform efficient distributed generation, two operating protocols are explored for transmitting interpretable and latent features between ESs and UEs, namely sketch-based generation and seed-based generation, respectively. Based on the proposed protocols, the convergence between MEG and DT is highlighted. Considering the seed-based image generation scenario, numerical case studies are provided to reveal the superiority of MEG-DT over centralized generation. Finally, promising applications and research opportunities are identified.
Submitted 3 July, 2024;
originally announced July 2024.
-
When Could Abelian Fractional Topological Insulators Exist in Twisted MoTe$_2$ (and Other Systems)
Authors:
Yves H. Kwan,
Glenn Wagner,
Jiabin Yu,
Andrea Kouta Dagnino,
Yi Jiang,
Xiaodong Xu,
B. Andrei Bernevig,
Titus Neupert,
Nicolas Regnault
Abstract:
Using comprehensive exact diagonalization calculations on $θ\approx 3.7 ^{\circ}$ twisted bilayer MoTe$_2$ ($t$MoTe$_2$), as well as idealized Landau level models also relevant for lower $θ$, we extract general principles for engineering fractional topological insulators (FTIs) in realistic situations. First, in a Landau level setup at $ν=1/3+1/3$, we investigate what features of the interaction destroy an FTI. For both pseudopotential interactions and realistic screened Coulomb interactions, we find that sufficient suppression of the short-range repulsion is needed for stabilizing an FTI. We then study $θ\approx 3.7 ^{\circ}$ $t$MoTe$_2$ with realistic band-mixing and anisotropic non-local dielectric screening. Our finite-size calculations only find an FTI phase at $ν=-4/3$ in the presence of a significant additional short-range attraction $g$ that acts to counter the Coulomb repulsion at short distances. We discuss how further finite-size drifts, dielectric engineering, Landau level character, and band-mixing effects may reduce the required value of $g$ closer towards the experimentally relevant conditions of $t$MoTe$_2$. Projective calculations into the $n=1$ Landau level, which resembles the second valence band of $θ\simeq 2.1^\circ$ $t$MoTe$_2$, do not yield FTIs for any $g$, suggesting that FTIs at low-angle $t$MoTe$_2$ for $ν=-8/3$ and $-10/3$ may be unlikely. While our study highlights the challenges, at least for the fillings considered, to obtaining an FTI with transport plateaus, even in large-angle $t$MoTe$_2$ where fractional Chern insulators are experimentally established, we also provide potential sample-engineering routes to improve the stability of FTI phases.
Submitted 2 July, 2024;
originally announced July 2024.
-
Similarity Distance-Based Label Assignment for Tiny Object Detection
Authors:
Shuohao Shi,
Qiang Fang,
Tong Zhao,
Xin Xu
Abstract:
Tiny object detection is becoming one of the most challenging tasks in computer vision because of the limited object size and lack of information. The label assignment strategy is a key factor affecting the accuracy of object detection. Although there are some effective label assignment strategies for tiny objects, most of them focus on reducing the sensitivity to the bounding boxes to increase the number of positive samples, and they have fixed hyperparameters that need to be set. However, more positive samples may not necessarily lead to better detection results; in fact, excessive positive samples may lead to more false positives. In this paper, we introduce a simple but effective strategy named the Similarity Distance (SimD) to evaluate the similarity between bounding boxes. This proposed strategy not only considers both location and shape similarity but also learns hyperparameters adaptively, ensuring that it can adapt to different datasets and various object sizes in a dataset. Our approach can be simply applied in common anchor-based detectors in place of the IoU for label assignment and Non Maximum Suppression (NMS). Extensive experiments on four mainstream tiny object detection datasets demonstrate the superior performance of our method; in particular, on AI-TOD it is 1.8 AP points higher overall and 4.1 AP points higher on very tiny objects than the state-of-the-art competitors. Code is available at: https://github.com/cszzshi/SimD.
Submitted 3 July, 2024; v1 submitted 2 July, 2024;
originally announced July 2024.
-
Global calibration of large-scale photonic integrated circuits
Authors:
Jin-Hao Zheng,
Qin-Qin Wang,
Lan-Tian Feng,
Yu-Yang Ding,
Xiao-Ye Xu,
Xi-Feng Ren,
Chuan-Feng Li,
Guang-Can Guo
Abstract:
The advancing maturity of photonic integrated circuit (PIC) fabrication technology enables the high integration of an increasing number of optical components onto a single chip. With increasing circuit complexity, the calibration of active phase shifters in a large-scale PIC becomes a crucially important issue. The traditional one-by-one calibration techniques encounter significant hurdles with the propagation of calibration errors, and achieving the decoupling of all phase shifters for independent calibration is not straightforward. To address this issue, we propose a machine-learning approach for globally calibrating the large-scale PIC. Our method utilizes a custom network to simultaneously learn the nonlinear phase-current relations for all thermo-optic phase shifters on the PIC by minimizing the negative likelihood of the measurement datasets. Moreover, the reflectivities of all static beamsplitter components can be extracted at the same time using this calibration method. As an example, a quantum walk PIC with a circuit depth of 12 is calibrated, and a programmable discrete-time quantum walk is experimentally demonstrated. These results will greatly benefit the applications of large-scale PICs in photonic quantum information processing.
Submitted 2 July, 2024;
originally announced July 2024.
-
Counterfactual Data Augmentation with Denoising Diffusion for Graph Anomaly Detection
Authors:
Chunjing Xiao,
Shikang Pang,
Xovee Xu,
Xuan Li,
Goce Trajcevski,
Fan Zhou
Abstract:
A critical aspect of Graph Neural Networks (GNNs) is to enhance the node representations by aggregating node neighborhood information. However, when detecting anomalies, the representations of abnormal nodes are prone to be averaged by normal neighbors, making the learned anomaly representations less distinguishable. To tackle this issue, we propose CAGAD -- an unsupervised Counterfactual data Augmentation method for Graph Anomaly Detection -- which introduces a graph pointer neural network as the heterophilic node detector to identify potential anomalies whose neighborhoods are normal-node-dominant. For each identified potential anomaly, we design a graph-specific diffusion model to translate a part of its neighbors, which are probably normal, into anomalous ones. Finally, we involve these translated neighbors in GNN neighborhood aggregation to produce counterfactual representations of anomalies. Through aggregating the translated anomalous neighbors, counterfactual representations become more distinguishable and further improve detection performance. The experimental results on four datasets demonstrate that CAGAD significantly outperforms strong baselines, with an average improvement of 2.35% on F1, 2.53% on AUC-ROC, and 2.79% on AUC-PR.
Submitted 2 July, 2024;
originally announced July 2024.