-
Measurement of the branching fraction of $D^+_s\to \ell^+ν_\ell$ via $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and $\mathcal{B}(D_s^+\toτ^+ν_τ)=(\bftauv)\%$, respectively. The product of the decay constant and Cabibbo-Kobayashi-Maskawa matrix element $|V_{cs}|$ is determined to be $f_{D_s^+}|V_{cs}|=(\mufdsxvcsresult)_{μν}~\mathrm{MeV}$ and $f_{D_s^+}|V_{cs}|=(\taufdsxvcsresult)_{τν}~\mathrm{MeV}$, respectively. Taking the value of $|V_{cs}|$ from a global fit in the Standard Model, we obtain ${f_{D^+_s}}=(\mufdsresult)_{μν}$\,MeV and ${f_{D^+_s}}=(\taufdsresult)_{τν}$\,MeV, respectively. Conversely, taking the value for $f_{D_s^+}$ from the latest lattice quantum chromodynamics calculation, we obtain $|V_{cs}| =(\muvcsresult)_{μν}$ and $|V_{cs}| = (\tauvcsresult)_{τν}$, respectively.
Submitted 16 July, 2024;
originally announced July 2024.
-
No Questions Asked: Effects of Transparency on Stablecoin Liquidity During the Collapse of Silicon Valley Bank
Authors:
Walter Hernandez Cruz,
Jiahua Xu,
Paolo Tasca,
Carlo Campajola
Abstract:
Fiat-pegged stablecoins are by nature exposed to spillover effects during market turmoil in Traditional Finance (TradFi). We observe that TradFi market shocks affect stablecoins differently, in particular USD Coin (USDC) and Tether (USDT), the former having a higher reporting frequency and transparency than the latter. We investigate this, using top USDC and USDT liquidity pools in Uniswap, by adapting the Marginal Cost of Immediacy (MCI) measure to Uniswap's Automated Market Maker, and then conducting Difference-in-Differences analysis on MCI and Total Value Locked (TVL) in USD, as well as measuring liquidity concentration across different providers. Results show that the Silicon Valley Bank (SVB) event reduced USDC's TVL dominance over USDT and increased USDT's liquidity cost relative to USDC, while liquidity provision remained concentrated with pool-specific trends. These findings reveal a flight-to-safety behavior and counterintuitive effects of stablecoin transparency: USDC's frequent and detailed disclosures led to swift market reactions, while USDT's opacity and less frequent reporting provided a safety net against immediate impacts.
Submitted 16 July, 2024;
originally announced July 2024.
-
Search for the rare $Λ_c^+ \to p μ^+ μ^-$ decay
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1062 additional authors not shown)
Abstract:
A search for the nonresonant $Λ_c^+ \to p μ^+ μ^-$ decay is performed using proton-proton collision data recorded at a centre-of-mass energy of 13 TeV by the LHCb experiment, corresponding to an integrated luminosity of 5.4 fb$^{-1}$. No evidence for the decay is found in the dimuon invariant-mass regions where the expected contribution of resonances is subdominant. The upper limit on the branching fraction of the $Λ_c^+ \to p μ^+ μ^-$ decay is determined to be $2.9~(3.2) \times 10^{-8}$ at 90% (95%) confidence level. The branching fractions in the dimuon invariant-mass regions dominated by the $η$, $ρ$ and $ω$ resonances are also determined.
Submitted 16 July, 2024;
originally announced July 2024.
-
SPINACH: SPARQL-Based Information Navigation for Challenging Real-World Questions
Authors:
Shicheng Liu,
Sina J. Semnani,
Harold Triedman,
Jialiang Xu,
Isaac Dan Zhao,
Monica S. Lam
Abstract:
Recent work integrating Large Language Models (LLMs) has led to significant improvements in the Knowledge Base Question Answering (KBQA) task. However, we posit that existing KBQA datasets that either have simple questions, use synthetically generated logical forms, or are based on small knowledge base (KB) schemas, do not capture the true complexity of KBQA tasks.
To address this, we introduce the SPINACH dataset, an expert-annotated KBQA dataset collected from forum discussions on Wikidata's "Request a Query" forum with 320 decontextualized question-SPARQL pairs. Much more complex than existing datasets, SPINACH calls for strong KBQA systems that do not rely on training data to learn the KB schema, but can dynamically explore large and often incomplete schemas and reason about them.
Along with the dataset, we introduce the SPINACH agent, a new KBQA approach that mimics how a human expert would write SPARQLs for such challenging questions. Experiments on existing datasets show SPINACH's capability in KBQA, achieving a new state of the art on the QALD-7, QALD-9 Plus and QALD-10 datasets by 30.1%, 27.0%, and 10.0% in F1, respectively, and coming within 1.6% of the fine-tuned LLaMA SOTA model on WikiWebQuestions. On our new SPINACH dataset, SPINACH agent outperforms all baselines, including the best GPT-4-based KBQA agent, by 38.1% in F1.
Submitted 16 July, 2024;
originally announced July 2024.
-
MedBench: A Comprehensive, Standardized, and Reliable Benchmarking System for Evaluating Chinese Medical Large Language Models
Authors:
Mianxin Liu,
Jinru Ding,
Jie Xu,
Weiguo Hu,
Xiaoyang Li,
Lifeng Zhu,
Zhian Bai,
Xiaoming Shi,
Benyou Wang,
Haitao Song,
Pengfei Liu,
Xiaofan Zhang,
Shanshan Wang,
Kang Li,
Haofen Wang,
Tong Ruan,
Xuanjing Huang,
Xin Sun,
Shaoting Zhang
Abstract:
Ensuring that medical large language models (LLMs) are effective and beneficial to human beings before real-world deployment is crucial. However, a widely accepted and accessible evaluation process for medical LLMs, especially in the Chinese context, remains to be established. In this work, we introduce "MedBench", a comprehensive, standardized, and reliable benchmarking system for Chinese medical LLMs. First, MedBench assembles the largest evaluation dataset to date (300,901 questions), covering 43 clinical specialties, and performs multi-facet evaluation of medical LLMs. Second, MedBench provides a standardized and fully automatic cloud-based evaluation infrastructure, with physical separation of questions and ground truth. Third, MedBench implements dynamic evaluation mechanisms to prevent shortcut learning and answer memorization. Applying MedBench to popular general and medical LLMs, we observe unbiased, reproducible evaluation results that largely align with medical professionals' perspectives. This study establishes a significant foundation for preparing the practical applications of Chinese medical LLMs. MedBench is publicly accessible at https://medbench.opencompass.org.cn.
Submitted 23 June, 2024;
originally announced July 2024.
-
Qwen2-Audio Technical Report
Authors:
Yunfei Chu,
Jin Xu,
Qian Yang,
Haojie Wei,
Xipin Wei,
Zhifang Guo,
Yichong Leng,
Yuanjun Lv,
Jinzheng He,
Junyang Lin,
Chang Zhou,
Jingren Zhou
Abstract:
We introduce the latest progress of Qwen-Audio, a large-scale audio-language model called Qwen2-Audio, which is capable of accepting various audio signal inputs and performing audio analysis or producing direct textual responses to speech instructions. In contrast to complex hierarchical tags, we have simplified the pre-training process by utilizing natural language prompts for different data and tasks, and have further expanded the data volume. We have boosted the instruction-following capability of Qwen2-Audio and implemented two distinct audio interaction modes for voice chat and audio analysis. In the voice chat mode, users can freely engage in voice interactions with Qwen2-Audio without text input. In the audio analysis mode, users can provide audio and text instructions for analysis during the interaction. Note that we do not use any system prompts to switch between voice chat and audio analysis modes. Qwen2-Audio is capable of intelligently comprehending the content within audio and following voice commands to respond appropriately. For instance, in an audio segment that simultaneously contains sounds, multi-speaker conversations, and a voice command, Qwen2-Audio can directly understand the command and provide an interpretation and response to the audio. Additionally, DPO has optimized the model's performance in terms of factuality and adherence to desired behavior. According to the evaluation results from AIR-Bench, Qwen2-Audio outperformed previous SOTAs, such as Gemini-1.5-pro, in tests focused on audio-centric instruction-following capabilities. Qwen2-Audio is open-sourced with the aim of fostering the advancement of the multi-modal language community.
Submitted 15 July, 2024;
originally announced July 2024.
-
Edwards thermodynamic framework controls density segregation in cyclically sheared granular materials
Authors:
Haiyang Lu,
Houfei Yuan,
Shuyang Zhang,
Zhikun Zeng,
Yi Xing,
Jiazhao Xu,
Xin Wang,
Yujie Wang
Abstract:
Using X-ray tomography, we experimentally investigate granular segregation phenomena in a mixture of particles with different densities under quasi-static cyclic shear. We quantitatively characterize their height distributions at steady states by minimizing effective free energy based on a segregation temperature that captures the competition between the mixing entropy and gravitational potential energy. We find this temperature coincides with Edwards' compactivity within error under various pressures and cyclic shear amplitudes. Therefore, we find that granular segregation in quasi-static conditions can be fundamentally explained by an effective granular thermodynamic framework including real energy terms based on the Edwards statistical ensemble.
Submitted 15 July, 2024;
originally announced July 2024.
-
FRI-Net: Floorplan Reconstruction via Room-wise Implicit Representation
Authors:
Honghao Xu,
Juzhan Xu,
Zeyu Huang,
Pengfei Xu,
Hui Huang,
Ruizhen Hu
Abstract:
In this paper, we introduce a novel method called FRI-Net for 2D floorplan reconstruction from 3D point clouds. Existing methods typically rely on corner regression or box regression, which lack consideration for the global shapes of rooms. To address this issue, we propose a novel approach using a room-wise implicit representation with structural regularization to characterize the shapes of rooms in floorplans. By incorporating geometric priors of room layouts in floorplans into our training strategy, the generated room polygons are more geometrically regular. We have conducted experiments on two challenging datasets, Structured3D and SceneCAD. Our method demonstrates improved performance compared to state-of-the-art methods, validating the effectiveness of our proposed representation for floorplan reconstruction.
Submitted 15 July, 2024;
originally announced July 2024.
-
Qwen2 Technical Report
Authors:
An Yang,
Baosong Yang,
Binyuan Hui,
Bo Zheng,
Bowen Yu,
Chang Zhou,
Chengpeng Li,
Chengyuan Li,
Dayiheng Liu,
Fei Huang,
Guanting Dong,
Haoran Wei,
Huan Lin,
Jialong Tang,
Jialin Wang,
Jian Yang,
Jianhong Tu,
Jianwei Zhang,
Jianxin Ma,
Jin Xu,
Jingren Zhou,
Jinze Bai,
Jinzheng He,
Junyang Lin,
Kai Dang
, et al. (34 additional authors not shown)
Abstract:
This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, and exhibits competitive performance relative to proprietary models across diverse benchmarks on language understanding, generation, multilingual proficiency, coding, mathematics, and reasoning.
The flagship model, Qwen2-72B, showcases remarkable performance: 84.2 on MMLU, 37.9 on GPQA, 64.6 on HumanEval, 89.5 on GSM8K, and 82.4 on BBH as a base language model. The instruction-tuned variant, Qwen2-72B-Instruct, attains 9.1 on MT-Bench, 48.1 on Arena-Hard, and 35.7 on LiveCodeBench. Moreover, Qwen2 demonstrates robust multilingual capabilities, proficient in approximately 30 languages, spanning English, Chinese, Spanish, French, German, Arabic, Russian, Korean, Japanese, Thai, Vietnamese, and more, underscoring its versatility and global reach.
To foster community innovation and accessibility, we have made the Qwen2 model weights openly available on Hugging Face and ModelScope, and the supplementary materials including example code on GitHub. These platforms also include resources for quantization, fine-tuning, and deployment, facilitating a wide range of applications and research endeavors.
Submitted 16 July, 2024; v1 submitted 15 July, 2024;
originally announced July 2024.
-
Sudden polarization angle jumps of the repeating fast radio burst FRB 20201124A
Authors:
J. R. Niu,
W. Y. Wang,
J. C. Jiang,
Y. Qu,
D. J. Zhou,
W. W. Zhu,
K. J. Lee,
J. L. Han,
B. Zhang,
D. Li,
S. Cao,
Z. Y. Fang,
Y. Feng,
Q. Y. Fu,
P. Jiang,
W. C. Jing,
J. Li,
Y. Li,
R. Luo,
L. Q. Meng,
C. C. Miao,
X. L. Miao,
C. H. Niu,
Y. C. Pan,
B. J. Wang
, et al. (19 additional authors not shown)
Abstract:
We report the first detection of polarization angle (PA) orthogonal jumps, a phenomenon previously only observed from radio pulsars, from a fast radio burst (FRB) source, FRB 20201124A. We find three cases of orthogonal jumps in over two thousand bursts, all resembling those observed in pulsar single pulses. We propose that the jumps are due to the superposition of two orthogonal emission modes that could only be produced in a highly magnetized plasma, and they are caused by the line of sight sweeping across a rotating magnetosphere. The shortest jump timescale is of the order of one millisecond, which hints that the emission modes come from regions smaller than the light cylinder of most pulsars or magnetars. This discovery provides convincing evidence that FRB emission originates from the complex magnetosphere of a magnetar, suggesting an FRB emission mechanism that is analogous to that of radio pulsars despite a huge luminosity difference between the two types of objects.
Submitted 15 July, 2024;
originally announced July 2024.
-
Charge radii of $^{11-16}$C, $^{13-17}$N and $^{15-18}$O determined from their charge-changing cross-sections and the mirror-difference charge radii
Authors:
J. W. Zhao,
B. -H. Sun,
I. Tanihata,
J. Y. Xu,
K. Y. Zhang,
A. Prochazka,
L. H. Zhu,
S. Terashima,
J. Meng,
L. C. He,
C. Y. Liu,
G. S. Li,
C. G. Lu,
W. J. Lin,
W. P. Lin,
Z. Liu,
P. P Ren,
Z. Y. Sun,
F. Wang,
J. Wang,
M. Wang,
S. T. Wang,
X. L. Wei,
X. D. Xu,
J. C. Zhang
, et al. (2 additional authors not shown)
Abstract:
Charge-changing cross-sections (CCCSs) of $^{11-16}$C, $^{13-17}$N and $^{15-18}$O on a carbon target have been determined at energies around 300 MeV/nucleon. A correction factor dependent on the nucleon separation energy has been introduced to the Glauber model calculation for extracting the nuclear charge radii from the experimental CCCSs. The charge radii of $^{11}$C, $^{13,16}$N and $^{15}$O were thus determined for the first time. With the new radii, we studied the experimental mirror-difference charge radii ($ΔR_{\text {ch}}^{\text {mirror}}$) of $^{11}$B-$^{11}$C, $^{13}$C-$^{13}$N, $^{15}$N-$^{15}$O, $^{17}$N-$^{17}$Ne pairs for the first time. We find that the $ΔR_{\text {ch}}^{\text {mirror}}$, including both bound and weakly bound proton-rich mirror partners, are reproduced by the empirical relation to the isospin asymmetry predicted by the $ab$ $initio$ calculations.
Submitted 14 July, 2024;
originally announced July 2024.
-
Towards Robust Recommendation via Decision Boundary-aware Graph Contrastive Learning
Authors:
Jiakai Tang,
Sunhao Dai,
Zexu Sun,
Xu Chen,
Jun Xu,
Wenhui Yu,
Lantao Hu,
Peng Jiang,
Han Li
Abstract:
In recent years, graph contrastive learning (GCL) has received increasing attention in recommender systems due to its effectiveness in reducing bias caused by data sparsity. However, most existing GCL models rely on heuristic approaches and usually assume entity independence when constructing contrastive views. We argue that these methods struggle to strike a balance between semantic invariance and view hardness across the dynamic training process, both of which are critical factors in graph contrastive learning.
To address the above issues, we propose a novel GCL-based recommendation framework, RGCL, which effectively maintains the semantic invariance of contrastive pairs and dynamically adapts as the model capability evolves through the training process. Specifically, RGCL first introduces decision boundary-aware adversarial perturbations to constrain the exploration space of contrastive augmented views, avoiding the loss of task-specific information. Furthermore, to incorporate global user-user and item-item collaboration relationships for guiding the generation of hard contrastive views, we propose an adversarial-contrastive learning objective to construct a relation-aware view-generator. Besides, considering that unsupervised GCL could potentially narrow the margins between data points and the decision boundary, resulting in decreased model robustness, we introduce adversarial examples based on maximum perturbations to achieve margin maximization. We also provide theoretical analyses on the effectiveness of our designs. Through extensive experiments on five public datasets, we demonstrate the superiority of RGCL over twelve baseline models.
Submitted 14 July, 2024;
originally announced July 2024.
-
Bipartizing (Pseudo-)Disk Graphs: Approximation with a Ratio Better than 3
Authors:
Daniel Lokshtanov,
Fahad Panolan,
Saket Saurabh,
Jie Xue,
Meirav Zehavi
Abstract:
In a disk graph, every vertex corresponds to a disk in $\mathbb{R}^2$ and two vertices are connected by an edge whenever the two corresponding disks intersect. Disk graphs form an important class of geometric intersection graphs, which generalizes both planar graphs and unit-disk graphs. We study a fundamental optimization problem in algorithmic graph theory, Bipartization (also known as Odd Cycle Transversal), on the class of disk graphs. The goal of Bipartization is to delete a minimum number of vertices from the input graph such that the resulting graph is bipartite. A folklore (polynomial-time) $3$-approximation algorithm for Bipartization on disk graphs follows from the classical framework of Goemans and Williamson [Combinatorica'98] for cycle-hitting problems. For over two decades, this result has remained the best known approximation for the problem (in fact, even for Bipartization on unit-disk graphs). In this paper, we achieve the first improvement upon this result, by giving a $(3-α)$-approximation algorithm for Bipartization on disk graphs, for some constant $α>0$. Our algorithm directly generalizes to the broader class of pseudo-disk graphs. Furthermore, our algorithm is robust in the sense that it does not require a geometric realization of the input graph to be given.
Submitted 12 July, 2024;
originally announced July 2024.
-
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
Authors:
Youliang Yuan,
Wenxiang Jiao,
Wenxuan Wang,
Jen-tse Huang,
Jiahao Xu,
Tian Liang,
Pinjia He,
Zhaopeng Tu
Abstract:
This study addresses a critical gap in safety tuning practices for Large Language Models (LLMs) by identifying and tackling a refusal position bias within safety tuning data, which compromises the models' ability to appropriately refuse to generate unsafe content. We introduce a novel approach, Decoupled Refusal Training (DeRTa), designed to empower LLMs to refuse to comply with harmful prompts at any response position, significantly enhancing their safety capabilities. DeRTa incorporates two novel components: (1) Maximum Likelihood Estimation (MLE) with Harmful Response Prefix, which trains models to recognize and avoid unsafe content by appending a segment of harmful response to the beginning of a safe response, and (2) Reinforced Transition Optimization (RTO), which equips models with the ability to transition from potential harm to safety refusal consistently throughout the harmful response sequence. Our empirical evaluation, conducted using LLaMA3 and Mistral model families across six attack scenarios, demonstrates that our method not only improves model safety without compromising performance but also surpasses well-known models such as GPT-4 in defending against attacks. Importantly, our approach successfully defends against recent advanced attack methods (e.g., CodeAttack) that have jailbroken GPT-4 and LLaMA3-70B-Instruct. Our code and data can be found at https://github.com/RobustNLP/DeRTa.
Submitted 12 July, 2024;
originally announced July 2024.
-
On the structure of the complement of skeleton
Authors:
Morgan Brown,
Jiachang Xu,
Muyuan Zhang
Abstract:
We study the higher dimensional geometry of Berkovich spaces using open fiber disks, which are given by open disks in a relative dimension $1$ fibration. Inspired by birational geometry, we conjecture that the Berkovich skeleton is the complement of the union of all open fiber disks, and prove this conjecture for $\mathcal{X}$ admitting a strictly semistable model with semiample canonical class.
Submitted 12 July, 2024;
originally announced July 2024.
-
ST-Mamba: Spatial-Temporal Mamba for Traffic Flow Estimation Recovery using Limited Data
Authors:
Doncheng Yuan,
Jianzhe Xue,
Jinshan Su,
Wenchao Xu,
Haibo Zhou
Abstract:
Traffic flow estimation (TFE) is crucial for urban intelligent traffic systems. While traditional on-road detectors are hindered by limited coverage and high costs, cloud computing and data mining of vehicular network data, such as driving speeds and GPS coordinates, present a promising and cost-effective alternative. Furthermore, minimizing data collection can significantly reduce overhead. However, limited data can lead to inaccuracies and instability in TFE. To address this, we introduce the spatial-temporal Mamba (ST-Mamba), a deep learning model combining a convolutional neural network (CNN) with a Mamba framework. ST-Mamba is designed to enhance TFE accuracy and stability by effectively capturing the spatial-temporal patterns within traffic flow. Our model aims to achieve results comparable to those from extensive data sets while only utilizing minimal data. Simulations using real-world datasets have validated our model's ability to deliver precise and stable TFE across an urban landscape based on limited data, establishing a cost-efficient solution for TFE.
Submitted 11 July, 2024;
originally announced July 2024.
-
Selective area epitaxy of in-plane HgTe nanostructures on CdTe(001) substrate
Authors:
Nicolas Chaize,
Xavier Baudry,
Pierre-Henri Jouneau,
Eric Gautier,
Jean-Luc Rouvière,
Yves Deblock,
Jimmy Xu,
Maxime Berthe,
Clément Barbot,
Bruno Grandidier,
Ludovic Desplanque,
Hermann Sellier,
Philippe Ballet
Abstract:
Semiconductor nanowires are believed to play a crucial role for future applications in electronics, spintronics and quantum technologies. A potential candidate is HgTe, but its sensitivity to nanofabrication processes restrains its development. A way to circumvent this obstacle is the selective area growth technique. Here, in-plane HgTe nanostructures are grown by selective area molecular beam epitaxy on a semi-insulating CdTe substrate covered with a patterned SiO$_{\mathrm{2}}$ mask. The shape of these nanostructures is defined by the in-plane orientation of the mask aperture along the <$110$>, <$1\bar{\mathrm{1}}0$>, or <$100$> direction, the deposited thickness, and the growth temperature. Several-micron-long in-plane nanowires can be achieved, as well as more complex nanostructures such as networks, diamond structures or rings. A good selectivity is achieved with very little parasitic growth on the mask, even for a growth temperature as low as $140$°C and a growth rate up to $0.5$ ML/s. For <$110$> oriented nanowires, the center of the nanostructure exhibits a trapezoidal shape with {$111$}B facets and two grains on the sides, while <$1\bar{\mathrm{1}}0$> oriented nanowires show {$111$}A facets with adatom accumulation on the sides of the top surface. Transmission electron microscopy observations reveal a continuous epitaxial relation between the CdTe substrate and the HgTe nanowire. Measurements of the resistance with four-point scanning tunneling microscopy indicate a good electrical homogeneity along the main nanowire (NW) axis and a thermally activated transport. This growth method paves the way toward the fabrication of complex HgTe-based nanostructures for electronic transport measurements.
Submitted 11 July, 2024;
originally announced July 2024.
-
Improved Model and Analysis for RIS-Assisted Indoor Terahertz Wireless Networks
Authors:
Zhi Chai,
Jiajie Xu,
Mohamed-Slim Alouini,
Justin P. Coon
Abstract:
In this paper, we propose a new model for indoor THz communication assisted by RIS. We conduct a realistic modeling of indoor obstacles and analyze their impact on performance. Order statistics are applied to calculate the cumulative distribution functions (CDFs) of distances from the transmitter to the selected RIS, i.e., the nearest RIS in the bounded indoor environment to the transmitter, and from the selected RIS to the receiver. We calculate the coverage probability (CP) as a function of RIS number, obstacle density, room size, and the transmitter's location. By comparing the numerical results obtained from the analytical expressions with Monte Carlo simulations, we verify the accuracy of our analysis. Through numerical results, it is observed that room size and obstacle density affect the CP in a significant way. However, by optimizing the transmitter's location and increasing the RIS number deployed in the room, the CP can be significantly improved (e.g., an increase of around 15% by optimizing the transmitter's location, and an increase of around 30% by increasing the RIS number deployed in the room).
Submitted 11 July, 2024;
originally announced July 2024.
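The order-statistics step mentioned in the abstract above can be illustrated with a standard result. This is only a sketch under an assumption not stated in the listing: the $N$ RIS-to-transmitter distances $D_1,\dots,D_N$ are taken to be independent and identically distributed with common CDF $F_D(r)$ (the symbols $N$, $D_i$ and $F_D$ are introduced here for illustration). The distance to the nearest RIS is then the first order statistic, and its CDF follows from the probability that no RIS lies within distance $r$:

```latex
F_{D_{(1)}}(r)
  = \Pr\!\left[\min_{1 \le i \le N} D_i \le r\right]
  = 1 - \Pr\!\left[D_1 > r, \dots, D_N > r\right]
  = 1 - \bigl(1 - F_D(r)\bigr)^{N}.
```

The paper's actual expressions account for the bounded room geometry and obstacles, so they may differ from this idealized form.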
-
Spatial-Temporal Attention Model for Traffic State Estimation with Sparse Internet of Vehicles
Authors:
Jianzhe Xue,
Dongcheng Yuan,
Yu Sun,
Tianqi Zhang,
Wenchao Xu,
Haibo Zhou,
Xuemin Shen
Abstract:
The growing number of connected vehicles offers an opportunity to leverage internet of vehicles (IoV) data for traffic state estimation (TSE) which plays a crucial role in intelligent transportation systems (ITS). By utilizing only a portion of IoV data instead of the entire dataset, the significant overheads associated with collecting and processing large amounts of data can be avoided. In this paper, we introduce a novel framework that utilizes sparse IoV data to achieve cost-effective TSE. Particularly, we propose a novel spatial-temporal attention model called the convolutional retentive network (CRNet) to improve the TSE accuracy by mining spatial-temporal traffic state correlations. The model employs the convolutional neural network (CNN) for spatial correlation aggregation and the retentive network (RetNet) based on the attention mechanism to extract temporal correlations. Extensive simulations on a real-world IoV dataset validate the advantage of the proposed TSE approach in achieving accurate TSE using sparse IoV data, demonstrating its cost effectiveness and practicality for real-world applications.
Submitted 14 July, 2024; v1 submitted 10 July, 2024;
originally announced July 2024.
-
Spatial-Temporal Generative AI for Traffic Flow Estimation with Sparse Data of Connected Vehicles
Authors:
Jianzhe Xue,
Yunting Xu,
Dongcheng Yuan,
Caoyi Zha,
Hongyang Du,
Haibo Zhou,
Dusit Niyato
Abstract:
Traffic flow estimation (TFE) is crucial for intelligent transportation systems. Traditional TFE methods rely on extensive road sensor networks and typically incur significant costs. Sparse mobile crowdsensing enables a cost-effective alternative by utilizing sparsely distributed probe vehicle data (PVD) provided by connected vehicles. However, as pointed out by the central limit theorem, the sparsification of PVD leads to the degradation of TFE accuracy. In response, this paper introduces a novel and cost-effective TFE framework that leverages sparse PVD and improves accuracy by applying the spatial-temporal generative artificial intelligence (GAI) framework. Within this framework, the conditional encoder mines spatial-temporal correlations in the initial TFE results derived from averaging vehicle speeds of each region, and the generative decoder generates high-quality and accurate TFE outputs. Additionally, the design of the spatial-temporal neural network is discussed, which is the backbone of the conditional encoder for effectively capturing spatial-temporal correlations. The effectiveness of the proposed TFE approach is demonstrated through evaluations based on real-world connected vehicle data. The experimental results affirm the feasibility of our sparse PVD-based TFE framework and highlight the significant role of the spatial-temporal GAI framework in enhancing the accuracy of TFE.
Submitted 10 July, 2024;
originally announced July 2024.
-
AutoMate: Specialist and Generalist Assembly Policies over Diverse Geometries
Authors:
Bingjie Tang,
Iretiayo Akinola,
Jie Xu,
Bowen Wen,
Ankur Handa,
Karl Van Wyk,
Dieter Fox,
Gaurav S. Sukhatme,
Fabio Ramos,
Yashraj Narang
Abstract:
Robotic assembly for high-mixture settings requires adaptivity to diverse parts and poses, which is an open challenge. Meanwhile, in other areas of robotics, large models and sim-to-real have led to tremendous progress. Inspired by such work, we present AutoMate, a learning framework and system that consists of 4 parts: 1) a dataset of 100 assemblies compatible with simulation and the real world, along with parallelized simulation environments for policy learning, 2) a novel simulation-based approach for learning specialist (i.e., part-specific) policies and generalist (i.e., unified) assembly policies, 3) demonstrations of specialist policies that individually solve 80 assemblies with 80% or higher success rates in simulation, as well as a generalist policy that jointly solves 20 assemblies with an 80%+ success rate, and 4) zero-shot sim-to-real transfer that achieves performance similar to (or better than) simulation, including on perception-initialized assembly. The key methodological takeaway is that a union of diverse algorithms from manufacturing engineering, character animation, and time-series analysis provides a generic and robust solution for a diverse range of robotic assembly problems. To our knowledge, AutoMate provides the first simulation-based framework for learning specialist and generalist policies over a wide range of assemblies, as well as the first system demonstrating zero-shot sim-to-real transfer over such a range.
Submitted 10 July, 2024;
originally announced July 2024.
-
Belief Information based Deep Channel Estimation for Massive MIMO Systems
Authors:
Jialong Xu,
Liu Liu,
Xin Wang,
Lan Chen
Abstract:
In next-generation wireless communication systems, transmission rates should continue to rise to support emerging scenarios, e.g., immersive communications. From the perspective of communication system evolution, multiple-input multiple-output (MIMO) technology remains pivotal for enhancing transmission rates. However, current MIMO systems rely on inserting pilot signals to achieve accurate channel estimation. As the number of transmit streams increases, the pilots consume a significant portion of transmission resources, severely reducing the spectral efficiency. In this correspondence, we propose a belief information based mechanism. By introducing a plug-and-play belief information module, existing single-antenna channel estimation networks can be seamlessly adapted to multi-antenna channel estimation and fully exploit the spatial correlation among multiple antennas. Experimental results demonstrate that the proposed method can either improve channel estimation performance by 1 to 2 dB or reduce pilot overhead by 1/3 to 1/2, particularly in bad channel conditions.
Submitted 23 June, 2024;
originally announced July 2024.
-
Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (645 additional authors not shown)
Abstract:
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.
Submitted 10 July, 2024;
originally announced July 2024.
-
BoostCom: Towards Efficient Universal Fully Homomorphic Encryption by Boosting the Word-wise Comparisons
Authors:
Ardhi Wiratama Baskara Yudha,
Jiaqi Xue,
Qian Lou,
Huiyang Zhou,
Yan Solihin
Abstract:
Fully Homomorphic Encryption (FHE) allows for the execution of computations on encrypted data without the need to decrypt it first, offering significant potential for privacy-preserving computational operations. Emerging arithmetic-based FHE schemes (ar-FHE), like BGV, demonstrate even better performance in word-wise comparison operations than non-arithmetic FHE (na-FHE) schemes, such as TFHE, especially for basic tasks like comparing values and finding maximums and minimums. This shows the universality of ar-FHE in effectively handling both arithmetic and non-arithmetic operations without the expensive conversion between arithmetic and non-arithmetic FHEs. We refer to universal arithmetic Fully Homomorphic Encryption as uFHE. The arithmetic operations in uFHE remain consistent with those in the original arithmetic FHE, which have seen significant acceleration. However, its non-arithmetic comparison operations differ, are slow, and have not been as thoroughly studied or accelerated. In this paper, we introduce BoostCom, a scheme designed to speed up word-wise comparison operations, enhancing the efficiency of uFHE systems. BoostCom involves multi-pronged optimizations, including infrastructure acceleration (multi-level heterogeneous parallelization and GPU-related improvements) and algorithm-aware optimizations (slot compaction, non-blocking comparison semantics). Together, these optimizations achieve an end-to-end performance improvement of more than an order of magnitude (11.1x faster) compared to the state-of-the-art CPU-based uFHE systems, across various FHE parameters and tasks.
Submitted 9 July, 2024;
originally announced July 2024.
-
Reference-based Controllable Scene Stylization with Gaussian Splatting
Authors:
Yiqun Mei,
Jiacong Xu,
Vishal M. Patel
Abstract:
Reference-based scene stylization that edits the appearance based on a content-aligned reference image is an emerging research area. Starting with a pretrained neural radiance field (NeRF), existing methods typically learn a novel appearance that matches the given style. Despite their effectiveness, they inherently suffer from time-consuming volume rendering, and thus are impractical for many real-time applications. In this work, we propose ReGS, which adapts 3D Gaussian Splatting (3DGS) for reference-based stylization to enable real-time stylized view synthesis. Editing the appearance of a pretrained 3DGS is challenging as it uses discrete Gaussians as its 3D representation, which tightly binds appearance to geometry. Simply optimizing the appearance as prior methods do is often insufficient for modeling continuous textures in the given reference image. To address this challenge, we propose a novel texture-guided control mechanism that adaptively adjusts the locally responsible Gaussians to a new geometric arrangement that serves the desired texture details. The proposed process is guided by texture clues for effective appearance editing, and regularized by scene depth to preserve the original geometric structure. With these novel designs, we show ReGS can produce state-of-the-art stylization results that respect the reference texture while embracing real-time rendering speed for free-view navigation.
Submitted 9 July, 2024;
originally announced July 2024.
-
A Survey of Controllable Learning: Methods and Applications in Information Retrieval
Authors:
Chenglei Shen,
Xiao Zhang,
Teng Shi,
Changshuo Zhang,
Guofu Xie,
Jun Xu
Abstract:
Controllable learning (CL) emerges as a critical component in trustworthy machine learning, ensuring that learners meet predefined targets and can adaptively adjust without retraining according to the changes in those targets. We provide a formal definition of CL, and discuss its applications in information retrieval (IR) where information needs are often complex and dynamic. The survey categorizes CL according to who controls (users or platforms), what is controllable (e.g., retrieval objectives, users' historical behaviors, controllable environmental adaptation), how control is implemented (e.g., rule-based method, Pareto optimization, Hypernetwork), and where to implement control (e.g., pre-processing, in-processing, post-processing methods). Then, we identify challenges faced by CL across training, evaluation, task setting, and deployment in online environments. Additionally, we outline promising directions for CL in theoretical analysis, efficient computation, empowering large language models, application scenarios and evaluation frameworks in IR.
Submitted 4 July, 2024;
originally announced July 2024.
-
Distilling System 2 into System 1
Authors:
Ping Yu,
Jing Xu,
Jason Weston,
Ilia Kulikov
Abstract:
Large language models (LLMs) can spend extra compute during inference to generate intermediate thoughts, which helps to produce better final responses. Since Chain-of-Thought (Wei et al., 2022), many such System 2 techniques have been proposed such as Rephrase and Respond (Deng et al., 2023a), System 2 Attention (Weston and Sukhbaatar, 2023) and Branch-Solve-Merge (Saha et al., 2023). In this work we investigate self-supervised methods to ``compile'' (distill) higher quality outputs from System 2 techniques back into LLM generations without intermediate reasoning token sequences, as this reasoning has been distilled into System 1. We show that several such techniques can be successfully distilled, resulting in improved results compared to the original System 1 performance, and with less inference cost than System 2. We posit that such System 2 distillation will be an important feature of future continually learning AI systems, enabling them to focus System 2 capabilities on the reasoning tasks that they cannot yet do well.
Submitted 9 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
-
Collaborative Secret and Covert Communications for Multi-User Multi-Antenna Uplink UAV Systems: Design and Optimization
Authors:
Jinpeng Xu,
Lin Bai,
Xin Xie,
Lin Zhou
Abstract:
Motivated by the diverse security requirements of multiple users in UAV systems, we propose a collaborative secret and covert transmission method for multi-antenna ground users to unmanned aerial vehicle (UAV) communications. Specifically, based on power-domain non-orthogonal multiple access (NOMA), two ground users with distinct security requirements, named Bob and Carlo, superimpose their signals and transmit the combined signal to the UAV named Alice. An adversary Willie attempts to simultaneously eavesdrop on Bob's confidential message and detect whether Carlo is transmitting or not. We derive closed-form expressions of the secrecy connection probability (SCP) and the covert connection probability (CCP) to evaluate the link reliability for wiretap and covert transmissions, respectively. Furthermore, we bound the secrecy outage probability (SOP) from Bob to Alice and the detection error probability (DEP) of Willie to evaluate the link security for wiretap and covert transmissions, respectively. To characterize the theoretical benchmark of the above model, we formulate a weighted multi-objective optimization problem to maximize the average of the secret and covert transmission rates subject to constraints on the SOP, the DEP, the beamformers of Bob and Carlo, and the UAV trajectory parameters. To solve the optimization problem, we propose an iterative optimization algorithm using successive convex approximation and block coordinate descent (SCA-BCD) methods. Our results reveal the influence of the system's design parameters on the wiretap and covert rates, analytically and numerically. In summary, our study fills the gaps in joint secret and covert transmission for multi-user multi-antenna uplink UAV communications and provides insights for constructing such systems.
Submitted 8 July, 2024;
originally announced July 2024.
-
Flying Calligrapher: Contact-Aware Motion and Force Planning and Control for Aerial Manipulation
Authors:
Xiaofeng Guo,
Guanqi He,
Jiahe Xu,
Mohammadreza Mousaei,
Junyi Geng,
Sebastian Scherer,
Guanya Shi
Abstract:
Aerial manipulation has gained interest for completing high-altitude tasks that are challenging for human workers, such as contact inspection and defect detection. Previous research has focused on maintaining static contact points or forces. This letter addresses a more general and dynamic task: simultaneously tracking time-varying contact force in the surface normal direction and motion trajectories on tangential surfaces. We propose a pipeline that includes a contact-aware trajectory planner to generate dynamically feasible trajectories, and a hybrid motion-force controller to track such trajectories. We demonstrate the approach in an aerial calligraphy task using a novel sponge pen design as the end-effector, whose stroke width is proportional to the contact force. Additionally, we develop a touchscreen interface for flexible user input. Experiments show our method can effectively draw diverse letters, achieving an IoU of 0.59 and an end-effector position (force) tracking RMSE of 2.9 cm (0.7 N). Website: https://xiaofeng-guo.github.io/flying-calligrapher/
Submitted 7 July, 2024;
originally announced July 2024.
-
VideoCoT: A Video Chain-of-Thought Dataset with Active Annotation Tool
Authors:
Yan Wang,
Yawen Zeng,
Jingsheng Zheng,
Xiaofen Xing,
Jin Xu,
Xiangmin Xu
Abstract:
Multimodal large language models (MLLMs) are flourishing, but mainly focus on images, with less attention paid to videos, especially in sub-fields such as prompt engineering, video chain-of-thought (CoT), and instruction tuning on videos. Therefore, we explore the collection of CoT datasets for videos to enable video OpenQA and improve the reasoning ability of MLLMs. Unfortunately, making such video CoT datasets is not an easy task. Given that human annotation is too cumbersome and expensive, while machine-generated annotation is not reliable due to the hallucination issue, we develop an automatic annotation tool that combines machine and human experts under the active learning paradigm. Active learning is an interactive strategy between the model and human experts; in this way, the workload of human labeling can be reduced and the quality of the dataset can be guaranteed. With the help of the automatic annotation tool, we contribute three datasets, namely VideoCoT, TopicQA, and TopicCoT. Furthermore, we propose a simple but effective benchmark based on the collected datasets, which exploits CoT to maximize the complex reasoning capabilities of MLLMs. Extensive experiments demonstrate the effectiveness of our solution.
Submitted 7 July, 2024;
originally announced July 2024.
-
Advancing Prompt Recovery in NLP: A Deep Dive into the Integration of Gemma-2b-it and Phi2 Models
Authors:
Jianlong Chen,
Wei Xu,
Zhicheng Ding,
Jinxin Xu,
Hao Yan,
Xinyu Zhang
Abstract:
Prompt recovery, a crucial task in natural language processing, entails the reconstruction of prompts or instructions that language models use to convert input text into a specific output. Although pivotal, the design and effectiveness of prompts represent a challenging and relatively untapped field within NLP research. This paper delves into an exhaustive investigation of prompt recovery methodologies, employing a spectrum of pre-trained language models and strategies. Our study is a comparative analysis aimed at gauging the efficacy of various models on a benchmark dataset, with the goal of pinpointing the most proficient approach for prompt recovery. Through meticulous experimentation and detailed analysis, we elucidate the outstanding performance of the Gemma-2b-it + Phi2 model + Pretrain. This model surpasses its counterparts, showcasing its exceptional capability in accurately reconstructing prompts for text transformation tasks. Our findings offer a significant contribution to the existing knowledge on prompt recovery, shedding light on the intricacies of prompt design and offering insightful perspectives for future innovations in text rewriting and the broader field of natural language processing.
Submitted 6 July, 2024;
originally announced July 2024.
-
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Authors:
Yu Sun,
Xinhao Li,
Karan Dalal,
Jiarui Xu,
Arjun Vikram,
Genghan Zhang,
Yann Dubois,
Xinlei Chen,
Xiaolong Wang,
Sanmi Koyejo,
Tatsunori Hashimoto,
Carlos Guestrin
Abstract:
Self-attention performs well in long context but has quadratic complexity. Existing RNN layers have linear complexity, but their performance in long context is limited by the expressive power of their hidden state. We propose a new class of sequence modeling layers with linear complexity and an expressive hidden state. The key idea is to make the hidden state a machine learning model itself, and the update rule a step of self-supervised learning. Since the hidden state is updated by training even on test sequences, our layers are called Test-Time Training (TTT) layers. We consider two instantiations: TTT-Linear and TTT-MLP, whose hidden state is a linear model and a two-layer MLP respectively. We evaluate our instantiations at the scale of 125M to 1.3B parameters, comparing with a strong Transformer and Mamba, a modern RNN. Both TTT-Linear and TTT-MLP match or exceed the baselines. Similar to Transformer, they can keep reducing perplexity by conditioning on more tokens, while Mamba cannot after 16k context. With preliminary systems optimization, TTT-Linear is already faster than Transformer at 8k context and matches Mamba in wall-clock time. TTT-MLP still faces challenges in memory I/O, but shows larger potential in long context, pointing to a promising direction for future research.
Submitted 5 July, 2024;
originally announced July 2024.
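The key idea in the abstract above (a hidden state that is itself a model, updated by one step of self-supervised learning per token) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `ttt_linear_forward`, the plain reconstruction loss, the zero initialization, and the learning rate `lr` are all assumptions made for this example.

```python
import numpy as np


def ttt_linear_forward(tokens, lr=0.1):
    """Toy test-time-training layer with a linear hidden state.

    tokens: array of shape (T, d), one row per token.
    Returns (outputs, W), where outputs has shape (T, d) and W is the
    final hidden state, a (d, d) weight matrix.
    """
    T, d = tokens.shape
    W = np.zeros((d, d))  # hidden state: weights of a linear model (assumed init)
    outputs = np.empty((T, d), dtype=float)
    for t in range(T):
        x = tokens[t]
        # Assumed self-supervised loss: 0.5 * ||W x - x||^2 (plain reconstruction).
        err = W @ x - x
        grad = np.outer(err, x)  # gradient of the loss with respect to W
        W = W - lr * grad        # update rule: one gradient step per token
        outputs[t] = W @ x       # output computed with the updated state
    return outputs, W


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq = rng.normal(size=(8, 4))
    out, state = ttt_linear_forward(seq, lr=0.05)
    print(out.shape, state.shape)
```

The state `W` has a fixed size regardless of sequence length and each token costs one gradient step, which is where the linear complexity in sequence length comes from in this toy version.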
-
Longitudinal optical phonons in photonic time crystals containing a stationary charge
Authors:
Sihao Zhang,
Junhua Dong,
Huanan Li,
Jingjun Xu,
Boris Shapiro
Abstract:
Lorentzian-type media support optical phonons that oscillate with longitudinal polarization parallel to the wave direction, at a wave vector-independent frequency at which the permittivity becomes zero. Here, we study the interactions between the longitudinal optical phonons and Lorentzian medium-based dispersive photonic time crystals (PTCs). We demonstrate that a stationary charge embedded in the PTCs can excite these longitudinal modes through the conversion of the static polarization field induced by the charge. Furthermore, the PTCs can develop a momentum bandgap across the entire wave vector space to amplify the longitudinal modes. Remarkably, this infinite momentum bandgap can be established with minimal temporal modulation of the refractive index when creating the PTCs. Our approach expands the range of waves that can be manipulated in PTCs and shows potential for observing momentum bandgap phenomenon in realistic optical experiments, where the modulation depth of the refractive index is severely constrained.
Submitted 5 July, 2024;
originally announced July 2024.
-
Robust Decision Transformer: Tackling Data Corruption in Offline RL via Sequence Modeling
Authors:
Jiawei Xu,
Rui Yang,
Feng Luo,
Meng Fang,
Baoxiang Wang,
Lei Han
Abstract:
Learning policies from offline datasets through offline reinforcement learning (RL) holds promise for scaling data-driven decision-making and avoiding unsafe and costly online interactions. However, real-world data collected from sensors or humans often contains noise and errors, posing a significant challenge for existing offline RL methods. Our study indicates that traditional offline RL methods based on temporal difference learning tend to underperform Decision Transformer (DT) under data corruption, especially when the amount of data is limited. This suggests the potential of sequence modeling for tackling data corruption in offline RL. To further unleash the potential of sequence modeling methods, we propose Robust Decision Transformer (RDT) by incorporating several robust techniques. Specifically, we introduce Gaussian weighted learning and iterative data correction to reduce the effect of corrupted data. Additionally, we leverage embedding dropout to enhance the model's resistance to erroneous inputs. Extensive experiments on MuJoCo, Kitchen, and Adroit tasks demonstrate RDT's superior performance under diverse data corruption compared to previous methods. Moreover, RDT exhibits remarkable robustness in a challenging setting that combines training-time data corruption with testing-time observation perturbations. These results highlight the potential of robust sequence modeling for learning from noisy or corrupted offline datasets, thereby promoting the reliable application of offline RL in real-world tasks.
Submitted 5 July, 2024;
originally announced July 2024.
-
BiosERC: Integrating Biography Speakers Supported by LLMs for ERC Tasks
Authors:
Jieying Xue,
Minh Phuong Nguyen,
Blake Matheny,
Le Minh Nguyen
Abstract:
In the Emotion Recognition in Conversation task, recent investigations have utilized attention mechanisms exploring relationships among utterances from intra- and inter-speakers for modeling emotional interaction between them. However, attributes such as speaker personality traits remain unexplored and present challenges in terms of their applicability to other tasks or compatibility with diverse model architectures. Therefore, this work introduces a novel framework named BiosERC, which investigates speaker characteristics in a conversation. By employing Large Language Models (LLMs), we extract the "biographical information" of the speaker within a conversation as supplementary knowledge injected into the model to classify emotional labels for each utterance. Our proposed method achieved state-of-the-art (SOTA) results on three famous benchmark datasets: IEMOCAP, MELD, and EmoryNLP, demonstrating the effectiveness and generalization of our model and showcasing its potential for adaptation to various conversation analysis tasks. Our source code is available at https://github.com/yingjie7/BiosERC.
Submitted 5 July, 2024;
originally announced July 2024.
-
A Complex-Coefficient Voltage Control for Virtual Synchronous Generators for Dynamic Enhancement and Power-Voltage Decoupling
Authors:
Jingzhe Xu,
Weihua Zhou,
Behrooz Bahrani
Abstract:
As electric power systems evolve towards decarbonization, the transition to inverter-based resources (IBRs) presents challenges to grid stability, necessitating innovative control solutions. The virtual synchronous generator (VSG) has emerged as a prominent solution. However, conventional VSGs are prone to instability in strong grids, slow voltage regulation, and coupled power-voltage response. To address these issues, this paper introduces an advanced VSG control strategy. A novel analysis of the VSG control dynamics is presented through a second-order closed-loop complex single-input single-output system, employing a vectorized geometrical pole analysis technique for enhanced voltage stability and dynamics. The proposed comprehensive controller design mitigates issues related to control interacted subsynchronous resonance and $dq \leftrightarrow 3φ$ transformation-induced voltage-coupled power transients, achieving improved system robustness and simplified control tuning. Key contributions include a two-fold design: optimized voltage transition characteristics through direct pole placement and transient power overshoot correction via a compensator. Validated by simulation and experiments, the findings offer a pragmatic solution for integrating VSG technology into decarbonizing power systems, ensuring reliability and efficiency.
Submitted 5 July, 2024;
originally announced July 2024.
-
Securing Multi-turn Conversational Language Models Against Distributed Backdoor Triggers
Authors:
Terry Tong,
Jiashu Xu,
Qin Liu,
Muhao Chen
Abstract:
The security of multi-turn conversational large language models (LLMs) is understudied despite this being one of the most popular uses of LLMs. Specifically, LLMs are vulnerable to data poisoning backdoor attacks, where an adversary manipulates the training data to cause the model to output malicious responses to predefined triggers. Specific to the multi-turn dialogue setting, LLMs are at risk of even more harmful and stealthy backdoor attacks where the backdoor triggers may span across multiple utterances, giving leeway to context-driven attacks. In this paper, we explore a novel distributed backdoor trigger attack that serves as an extra tool in an adversary's toolbox and can interface with other single-turn attack strategies in a plug-and-play manner. Results on two representative defense mechanisms indicate that distributed backdoor triggers are robust against existing defense strategies, which are designed for single-turn user-model interactions, motivating us to propose a new defense strategy for the more challenging multi-turn dialogue setting. To this end, we also explore a novel contrastive-decoding-based defense that is able to mitigate the backdoor with a low computational tradeoff.
Submitted 4 July, 2024;
originally announced July 2024.
-
Benchmarking Complex Instruction-Following with Multiple Constraints Composition
Authors:
Bosi Wen,
Pei Ke,
Xiaotao Gu,
Lindong Wu,
Hao Huang,
Jinfeng Zhou,
Wenchuang Li,
Binxin Hu,
Wendy Gao,
Jiaxin Xu,
Yiming Liu,
Jie Tang,
Hongning Wang,
Minlie Huang
Abstract:
Instruction following is one of the fundamental capabilities of large language models (LLMs). As the ability of LLMs is constantly improving, they have been increasingly applied to deal with complex human instructions in real-world scenarios. Therefore, how to evaluate the ability of complex instruction-following of LLMs has become a critical research problem. Existing benchmarks mainly focus on modeling different types of constraints in human instructions while neglecting the composition of different constraints, which is an indispensable constituent in complex instructions. To this end, we propose ComplexBench, a benchmark for comprehensively evaluating the ability of LLMs to follow complex instructions composed of multiple constraints. We propose a hierarchical taxonomy for complex instructions, including 4 constraint types, 19 constraint dimensions, and 4 composition types, and manually collect a high-quality dataset accordingly. To make the evaluation reliable, we augment LLM-based evaluators with rules to effectively verify whether generated texts can satisfy each constraint and composition. Furthermore, we obtain the final evaluation score based on the dependency structure determined by different composition types. ComplexBench identifies significant deficiencies in existing LLMs when dealing with complex instructions with multiple constraints composition.
Submitted 11 July, 2024; v1 submitted 4 July, 2024;
originally announced July 2024.
-
The Solution for the GAIIC2024 RGB-TIR object detection Challenge
Authors:
Xiangyu Wu,
Jinling Xu,
Longfei Huang,
Yang Yang
Abstract:
This report introduces a solution to the task of RGB-TIR object detection from the perspective of unmanned aerial vehicles. Unlike traditional object detection methods, RGB-TIR object detection aims to utilize both RGB and TIR images for complementary information during detection. The challenges of RGB-TIR object detection from the perspective of unmanned aerial vehicles include highly complex image backgrounds, frequent changes in lighting, and uncalibrated RGB-TIR image pairs. To address these challenges at the model level, we utilized a lightweight YOLOv9 model with extended multi-level auxiliary branches that enhance the model's robustness, making it more suitable for practical applications in unmanned aerial vehicle scenarios. For image fusion in RGB-TIR detection, we incorporated a fusion module into the backbone network to fuse images at the feature level, implicitly addressing calibration issues. Our proposed method achieved mAP scores of 0.516 and 0.543 on the A and B benchmarks, respectively, while maintaining the highest inference speed among all models.
Submitted 4 July, 2024;
originally announced July 2024.
-
Let the Code LLM Edit Itself When You Edit the Code
Authors:
Zhenyu He,
Jun Zhang,
Shengjie Luo,
Jingjing Xu,
Zhi Zhang,
Di He
Abstract:
In this work, we investigate a typical scenario in code generation where a developer edits existing code in real time and requests a code assistant, e.g., a large language model, to re-predict the next token or next line on the fly. Naively, the LLM needs to re-encode the entire KV cache to provide an accurate prediction. However, this process is computationally expensive, especially when the sequence length is long. Simply encoding the edited subsequence and integrating it into the original KV cache runs into the temporal confusion problem, leading to significantly worse performance. We address this efficiency and accuracy trade-off by introducing Positional Integrity Encoding (PIE). Building upon the rotary positional encoding, PIE first removes the rotary matrices in the Key cache that introduce temporal confusion and then reapplies the correct rotary matrices. This process ensures that positional relationships between tokens are correct and requires only a single round of matrix multiplication. We validate the effectiveness of PIE through extensive experiments on the RepoBench-C-8k dataset, utilizing DeepSeek-Coder models with 1.3B, 6.7B, and 33B parameters. Our evaluation includes three real-world coding tasks: code insertion, code deletion, and multi-place code editing. Results demonstrate that PIE reduces computational overhead by over 85% compared to the standard full recomputation approach across all model sizes and tasks while well approximating the model performance.
Submitted 3 July, 2024;
originally announced July 2024.
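The abstract above describes removing the stale rotary matrices from cached keys and reapplying the correct ones in a single matrix multiplication. A minimal sketch of why one multiplication can suffice, assuming standard rotary positional encoding with adjacent-pair rotation and base 10000 (the function names `rope_rotate` and `reposition_key` are hypothetical and not taken from the paper's code):

```python
import math

def rope_rotate(vec, pos, base=10000.0):
    """Rotate each adjacent pair (vec[i], vec[i+1]) by pos * base**(-i/d)."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        angle = pos * base ** (-i / d)  # per-pair rotary frequency times position
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def reposition_key(cached_key, old_pos, new_pos, base=10000.0):
    """Move a cached key from old_pos to new_pos.

    Planar rotations compose additively, so undoing the rotation for
    old_pos and applying the one for new_pos collapses into a single
    rotation by (new_pos - old_pos).
    """
    return rope_rotate(cached_key, new_pos - old_pos, base)

# A key encoded at position 7 whose token shifts to position 12 after an edit.
raw_key = [0.3, -1.2, 0.5, 2.0]
cached = rope_rotate(raw_key, 7)
moved = reposition_key(cached, 7, 12)
```

Under these assumptions, `moved` matches what a full re-encoding at position 12 would produce for the key, up to floating-point error, without touching the unrotated key again.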
-
SlerpFace: Face Template Protection via Spherical Linear Interpolation
Authors:
Zhizhou Zhong,
Yuxi Mi,
Yuge Huang,
Jianqing Xu,
Guodong Mu,
Shouhong Ding,
Jingyun Zhang,
Rizen Guo,
Yunsheng Wu,
Shuigeng Zhou
Abstract:
Contemporary face recognition systems use feature templates extracted from face images to identify persons. To enhance privacy, face template protection techniques are widely employed to conceal sensitive identity and appearance information stored in the template. This paper identifies an emerging privacy attack form utilizing diffusion models that could nullify prior protection, referred to as inversion attacks. The attack can synthesize high-quality, identity-preserving face images from templates, revealing persons' appearance. Based on studies of the diffusion model's generative capability, this paper proposes a defense that degrades the attack by rotating templates to a noise-like distribution. This is achieved efficiently by spherical linear interpolation of templates, or slerp, on the hypersphere where they lie. This paper further proposes to divide templates' feature dimensions into groups and drop out dimensions within them, to enhance the irreversibility of rotated templates. The division of groups and dropouts within each group are learned in a recognition-favored way. The proposed techniques are concretized as a novel face template protection technique, SlerpFace. Extensive experiments show that SlerpFace provides satisfactory recognition accuracy and comprehensive privacy protection against inversion and other attack forms, superior to prior art.
Submitted 3 July, 2024;
originally announced July 2024.
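The abstract above relies on spherical linear interpolation (slerp) of templates on a hypersphere. A minimal sketch of the generic slerp formula for unit vectors follows; it illustrates only the interpolation itself, not SlerpFace's key handling or its learned grouping and dropout, and the function name `slerp` and the tolerance are assumptions:

```python
import math

def slerp(a, b, t):
    """Spherical linear interpolation between unit vectors a and b, 0 <= t <= 1."""
    dot = sum(x * y for x, y in zip(a, b))
    dot = max(-1.0, min(1.0, dot))       # guard acos against rounding error
    omega = math.acos(dot)               # angle between a and b
    if omega < 1e-8:                     # nearly identical vectors
        return list(a)
    if math.pi - omega < 1e-8:           # antipodal: the great-circle path is not unique
        raise ValueError("slerp is undefined for antipodal vectors")
    s = math.sin(omega)
    wa = math.sin((1.0 - t) * omega) / s
    wb = math.sin(t * omega) / s
    return [wa * x + wb * y for x, y in zip(a, b)]

# Halfway between two orthogonal unit vectors.
mid = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

Unlike plain linear interpolation, the result stays on the unit hypersphere for every `t`, which is what lets a template be rotated toward another direction without changing its norm.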
-
Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the BESIII detector at the BEPCII storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.
Submitted 3 July, 2024;
originally announced July 2024.
-
A Direct Construction of Solitary Waves for a Fractional Korteweg-de Vries Equation With an Inhomogeneous Symbol
Authors:
Swati Yadav,
Jun Xue
Abstract:
We construct solitary waves for the fractional Korteweg-de Vries type equation $u_t + (Λ^{-s}u + u^2)_x = 0$, where $Λ^{-s}$ denotes the Bessel potential operator $(1 + |D|^2)^{-\frac{s}{2}}$ for $s \in (0,\infty)$. The approach is to parameterise the known periodic solution curves through the relative wave height. Using a priori estimates, we show that the periodic waves locally uniformly converge to waves with negative tails, which are transformed to the desired branch of solutions. The obtained branch reaches a highest wave, the behavior of which varies with $s$. The work is a generalisation of recent work by Ehrnström-Nik-Walker, and is, as far as we know, the first simultaneous construction of small, intermediate and highest solitary waves for the complete family of (inhomogeneous) fractional KdV equations with negative-order dispersive operators. The obtained waves display an exponential decay rate as $|x| \to \infty$.
Submitted 2 July, 2024;
originally announced July 2024.
-
Automatic Adaptation Rule Optimization via Large Language Models
Authors:
Yusei Ishimizu,
Jialong Li,
Jinglue Xu,
Jinyu Cai,
Hitoshi Iba,
Kenji Tei
Abstract:
Rule-based adaptation is a foundational approach to self-adaptation, characterized by its human readability and rapid response. However, building high-performance and robust adaptation rules is often a challenge because it essentially involves searching for the optimal design in a complex (variables) space. In response, this paper attempts to employ large language models (LLMs) as an optimizer to construct and optimize adaptation rules, leveraging the common sense and reasoning capabilities inherent in LLMs. Preliminary experiments conducted in SWIM have validated the effectiveness and limitations of our method.
Submitted 2 July, 2024;
originally announced July 2024.
-
I2EKF-LO: A Dual-Iteration Extended Kalman Filter Based LiDAR Odometry
Authors:
Wenlu Yu,
Jie Xu,
Chengwei Zhao,
Lijun Zhao,
Thien-Minh Nguyen,
Shenghai Yuan,
Mingming Bai,
Lihua Xie
Abstract:
LiDAR odometry is a pivotal technology in the fields of autonomous driving and autonomous mobile robotics. However, most current works focus on nonlinear optimization methods, and many challenges remain in using the traditional Iterative Extended Kalman Filter (IEKF) framework to tackle the problem: IEKF only iterates over the observation equation, relying on a rough estimate of the initial state, which is insufficient to fully eliminate motion distortion in the input point cloud; the system process noise is difficult to determine during state estimation of complex motions; and motion models vary across different sensor carriers. To address these issues, we propose the Dual-Iteration Extended Kalman Filter (I2EKF) and the LiDAR odometry based on I2EKF (I2EKF-LO). This approach not only iterates over the observation equation but also leverages state updates to iteratively mitigate motion distortion in LiDAR point clouds. Moreover, it dynamically adjusts process noise based on the confidence level of prior predictions during state estimation and establishes motion models for different sensor carriers to achieve accurate and efficient state estimation. Comprehensive experiments demonstrate that I2EKF-LO achieves outstanding levels of accuracy and computational efficiency in the realm of LiDAR odometry. Additionally, to foster community development, our code is open-sourced at https://github.com/YWL0720/I2EKF-LO.
Submitted 2 July, 2024;
originally announced July 2024.
-
Machine-learning designed smart coating: temperature-dependent self-adaptation between a solar absorber and a radiative cooler
Authors:
Zhaocheng Zhang,
Jiahao Xu,
Pengran Hou,
Yang Deng
Abstract:
We designed a multilayered self-adaptive absorber/emitter metamaterial, which can smartly switch between a solar absorber and a radiative cooler based on temperature change. The switching capability is facilitated by the phase change material and the structure is optimized by machine learning. Our design not only advances the machine-learning-based development of metamaterials but also has the potential to significantly reduce carbon emissions and contribute to the goal of achieving carbon neutrality.
Submitted 2 July, 2024;
originally announced July 2024.
-
MeMo: Meaningful, Modular Controllers via Noise Injection
Authors:
Megan Tjandrasuwita,
Jie Xu,
Armando Solar-Lezama,
Wojciech Matusik
Abstract:
Robots are often built from standardized assemblies (e.g., arms, legs, or fingers), but each robot must be trained from scratch to control all the actuators of all the parts together. In this paper we demonstrate a new approach that takes a single robot and its controller as input and produces a set of modular controllers for each of these assemblies such that when a new robot is built from the same parts, its control can be quickly learned by reusing the modular controllers. We achieve this with a framework called MeMo which learns (Me)aningful, (Mo)dular controllers. Specifically, we propose a novel modularity objective to learn an appropriate division of labor among the modules. We demonstrate that this objective can be optimized simultaneously with standard behavior cloning loss via noise injection. We benchmark our framework in locomotion and grasping environments on simple to complex robot morphology transfer. We also show that the modules help in task transfer. On both structure and task transfer, MeMo achieves improved training efficiency compared to graph neural network and Transformer baselines.
Submitted 24 May, 2024;
originally announced July 2024.
-
Integration of Computer Networks and Artificial Neural Networks for an AI-based Network Operator
Authors:
Binbin Wu,
Jingyu Xu,
Yifan Zhang,
Bo Liu,
Yulu Gong,
Jiaxin Huang
Abstract:
This paper proposes an integrated approach combining computer networks and artificial neural networks to construct an intelligent network operator, functioning as an AI model. State information from computer networks is transformed into embedded vectors, enabling the operator to efficiently recognize different pieces of information and accurately output appropriate operations for the computer network at each step. The operator has undergone comprehensive testing, achieving a 100% accuracy rate, thus eliminating operational risks. Furthermore, a novel algorithm is proposed to emphasize crucial training losses, aiming to enhance the efficiency of operator training. Additionally, a simple computer network simulator is created and encapsulated into training and testing environment components, enabling automation of the data collection, training, and testing processes. This abstract outlines the core contributions of the paper while highlighting the innovative methodology employed in the development and validation of the AI-based network operator.
Submitted 9 April, 2024;
originally announced July 2024.
-
VisEval: A Benchmark for Data Visualization in the Era of Large Language Models
Authors:
Nan Chen,
Yuge Zhang,
Jiahang Xu,
Kan Ren,
Yuqing Yang
Abstract:
Translating natural language to visualization (NL2VIS) has shown great promise for visual data analysis, but it remains a challenging task that requires multiple low-level implementations, such as natural language processing and visualization design. Recent advancements in pre-trained large language models (LLMs) are opening new avenues for generating visualizations from natural language. However, the lack of a comprehensive and reliable benchmark hinders our understanding of LLMs' capabilities in visualization generation. In this paper, we address this gap by proposing a new NL2VIS benchmark called VisEval. Firstly, we introduce a high-quality and large-scale dataset. This dataset includes 2,524 representative queries covering 146 databases, paired with accurately labeled ground truths. Secondly, we advocate for a comprehensive automated evaluation methodology covering multiple dimensions, including validity, legality, and readability. By systematically scanning for potential issues with a number of heterogeneous checkers, VisEval provides reliable and trustworthy evaluation outcomes. We run VisEval on a series of state-of-the-art LLMs. Our evaluation reveals prevalent challenges and delivers essential insights for future advancements.
Submitted 1 July, 2024;
originally announced July 2024.
-
A new characterization of the dissipation structure and the relaxation limit for the compressible Euler-Maxwell system
Authors:
Timothée Crin-Barat,
Yue-Jun Peng,
Ling-Yun Shou,
Jiang Xu
Abstract:
We investigate the three-dimensional compressible Euler-Maxwell system, a model for simulating the transport of electrons interacting with propagating electromagnetic waves in semiconductor devices. First, we show the global well-posedness of classical solutions being a sharp small perturbation of constant equilibrium in a critical regularity setting, uniformly with respect to the relaxation parameter $\varepsilon>0$. Then, for all times $t>0$, we derive quantitative error estimates at the rate $O(\varepsilon)$ between the rescaled Euler-Maxwell system and the limit drift-diffusion model. To the best of our knowledge, this work provides the first global-in-time strong convergence for the relaxation procedure in the case of ill-prepared data.
In order to prove our results, we develop a new characterization of the dissipation structure for the linearized Euler-Maxwell system with respect to the relaxation parameter $\varepsilon$. This is done by partitioning the frequency space into three distinct regimes: low, medium and high frequencies, each associated with a different behaviour of the solution. Then, in each regime, the use of efficient unknowns and Lyapunov functionals based on the hypocoercivity theory leads to uniform a priori estimates.
Submitted 28 June, 2024;
originally announced July 2024.