-
Qwen2 Technical Report
Authors:
An Yang,
Baosong Yang,
Binyuan Hui,
Bo Zheng,
Bowen Yu,
Chang Zhou,
Chengpeng Li,
Chengyuan Li,
Dayiheng Liu,
Fei Huang,
Guanting Dong,
Haoran Wei,
Huan Lin,
Jialong Tang,
Jialin Wang,
Jian Yang,
Jianhong Tu,
Jianwei Zhang,
Jianxin Ma,
Jin Xu,
Jingren Zhou,
Jinze Bai,
Jinzheng He,
Junyang Lin,
Kai Dang, et al. (34 additional authors not shown)
Abstract:
This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, and exhibits competitive performance relative to proprietary models across diverse benchmarks on language understanding, generation, multilingual proficiency, coding, mathematics, and reasoning.
The flagship model, Qwen2-72B, showcases remarkable performance: 84.2 on MMLU, 37.9 on GPQA, 64.6 on HumanEval, 89.5 on GSM8K, and 82.4 on BBH as a base language model. The instruction-tuned variant, Qwen2-72B-Instruct, attains 9.1 on MT-Bench, 48.1 on Arena-Hard, and 35.7 on LiveCodeBench. Moreover, Qwen2 demonstrates robust multilingual capabilities, proficient in approximately 30 languages, spanning English, Chinese, Spanish, French, German, Arabic, Russian, Korean, Japanese, Thai, Vietnamese, and more, underscoring its versatility and global reach.
To foster community innovation and accessibility, we have made the Qwen2 model weights openly available on Hugging Face and ModelScope, and the supplementary materials including example code on GitHub. These platforms also include resources for quantization, fine-tuning, and deployment, facilitating a wide range of applications and research endeavors.
Submitted 16 July, 2024; v1 submitted 15 July, 2024;
originally announced July 2024.
-
Quantum Clock Synchronization Network with Silicon-chip Dual-Pumped Entangled Photon Source
Authors:
J. A. Li,
H. Han,
X. P. Huang,
B. Y. Tang,
K. Guo,
J. Q. Huang,
S. Y. Xiong,
W. R. Yu,
Z. J. Zhang,
J. B. Yang,
B. Liu,
H. Chen,
Z. K. Lu
Abstract:
In this paper, we propose a quantum clock synchronization (QCS) network scheme with silicon-chip dual-pumped entangled photon source. This scheme couples two pump beams into the silicon-based waveguide, where degenerate and non-degenerate spontaneous four-wave mixing (SFWM) occurs, generating entanglement between one signal channel and three idler channels. The entangled photons are distributed to remote users through the wavelength division multiplexing strategy to construct an entanglement distribution network, and the round-trip QCS is adopted to realize a QCS network that can serve multiple users. A proof-of-principle QCS network experiment is implemented among the server and multiple users (Alice, Bob, and Charlie) for 11.1 hours, where Alice and Charlie are 10 km away from the server and Bob is 25 km away from the server. The lowest time deviations (TDEV) between the server and each user (Alice, Bob, and Charlie) are 1.57 ps, 0.82 ps and 2.57 ps at the average time of 8000 s, 8000 s and 800 s respectively. The results show that the QCS network scheme with dual-pumped SFWM photon source proposed by us achieves high accuracy, and the channel resources used by n users are reduced by about 30% compared with other round-trip QCS schemes.
Submitted 13 July, 2024;
originally announced July 2024.
-
Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation
Authors:
Chun-Yi Kuan,
Chih-Kai Yang,
Wei-Ping Huang,
Ke-Han Lu,
Hung-yi Lee
Abstract:
In this work, we introduce Speech-Copilot, a modular framework for instruction-oriented speech-processing tasks that minimizes human effort in toolset construction. Unlike end-to-end methods using large audio-language models, Speech-Copilot builds speech processing-specific toolsets by analyzing pre-collected task instructions and breaking tasks into manageable sub-tasks. It features a flexible agent based on large language models that performs tasks through program generation. Our approach achieves state-of-the-art performance on the Dynamic-SUPERB benchmark, demonstrating its effectiveness across diverse speech-processing tasks. Key contributions include: 1) developing an innovative framework for speech processing-specific toolset construction, 2) establishing a high-performing agent based on large language models, and 3) offering a new perspective on addressing challenging instruction-oriented speech-processing tasks. Without additional training processes required by end-to-end approaches, our method provides a flexible and extendable solution for a wide range of speech-processing applications.
Submitted 13 July, 2024;
originally announced July 2024.
-
Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models
Authors:
Yi-Cheng Lin,
Tzu-Quan Lin,
Chih-Kai Yang,
Ke-Han Lu,
Wei-Chih Chen,
Chun-Yi Kuan,
Hung-yi Lee
Abstract:
Speech Integrated Large Language Models (SILLMs) combine large language models with speech perception to perform diverse tasks, from emotion recognition to speaker verification, demonstrating universal audio understanding capability. However, these models may amplify biases present in training data, potentially leading to biased access to information for marginalized groups. This work introduces a curated spoken bias evaluation toolkit and corresponding dataset. We evaluate gender bias in SILLMs across four semantic-related tasks: speech-to-text translation (STT), spoken coreference resolution (SCR), spoken sentence continuation (SSC), and spoken question answering (SQA). Our analysis reveals that bias levels are language-dependent and vary with different evaluation methods. Our findings emphasize the necessity of employing multiple approaches to comprehensively assess biases in SILLMs, providing insights for developing fairer SILLM systems.
Submitted 9 July, 2024;
originally announced July 2024.
-
Velocity-Resolved Ionization Mapping of Broad Line Region. I. Insights into Diverse Geometry and Kinematics
Authors:
Sha-Sha Li,
Hai-Cheng Feng,
H. T. Liu,
J. M. Bai,
Xiang Ji,
Cheng Cheng,
Kai-Xing Lu,
Jian-Guo Wang,
Rui Li
Abstract:
Broad emission lines of active galactic nuclei (AGNs) originate from the broad-line region (BLR), consisting of dense gas clouds in orbit around an accreting supermassive black hole. Understanding the geometry and kinematics of the region is crucial for gaining insights into the physics and evolution of AGNs. Conventional velocity-resolved reverberation mapping may face challenges in disentangling the degeneracy between intricate motion and geometry of this region. To address this challenge, new key constraints are required. Here, we report the discovery of an asymmetric BLR using a novel technique: velocity-resolved ionization mapping, which can map the distance of emitting gas clouds by measuring Hydrogen line ratios at different velocities. By analyzing spectroscopic monitoring data, we find that the Balmer decrement is anticorrelated with the continuum and correlated with the lags across broad emission line velocities. Some line ratio profiles deviate from the expectations for a symmetrically virialized BLR, suggesting that the red-shifted and blue-shifted gas clouds may not be equidistant from the supermassive black hole (SMBH). This asymmetric geometry might represent a formation imprint, provide new perspectives on the evolution of AGNs, and influence SMBH mass measurements.
Submitted 7 July, 2024;
originally announced July 2024.
-
SQLaser: Detecting DBMS Logic Bugs with Clause-Guided Fuzzing
Authors:
Jin Wei,
Ping Chen,
Kangjie Lu,
Jun Dai,
Xiaoyan Sun
Abstract:
Database Management Systems (DBMSs) are vital components in modern data-driven systems. Their complexity often leads to logic bugs, which are implementation errors within the DBMSs that can lead to incorrect query results, data exposure, unauthorized access, etc., without necessarily causing visible system failures. Existing detection employs two strategies: rule-based bug detection and coverage-guided fuzzing. In general, rule specification itself is challenging; as a result, rule-based detection is limited to specific and simple rules. Coverage-guided fuzzing blindly explores code paths or blocks, many of which are unlikely to contain logic bugs; therefore, this strategy is cost-ineffective. In this paper, we design SQLaser, a SQL-clause-guided fuzzer for detecting logic bugs in DBMSs. Through a comprehensive examination of most existing logic bugs across four distinct DBMSs, excluding those causing system crashes, we have identified 35 logic bug patterns. These patterns manifest as certain SQL clause combinations that commonly result in logic bugs, and behind each of these clause combinations is a sequence of functions. We therefore model logic bug patterns as error-prone function chains (i.e., sequences of functions). We further develop a directed fuzzer with a new path-to-path distance-calculation mechanism for effectively testing these chains and discovering additional logic bugs. This mechanism enables SQLaser to swiftly navigate to target sites and uncover potential bugs emerging from these paths. Our evaluation, conducted on SQLite, MySQL, PostgreSQL, and TiDB, demonstrates that SQLaser significantly accelerates bug discovery compared to other fuzzing approaches, reducing detection time by approximately 60%.
Submitted 5 July, 2024;
originally announced July 2024.
-
DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment
Authors:
Ke-Han Lu,
Zhehuai Chen,
Szu-Wei Fu,
He Huang,
Boris Ginsburg,
Yu-Chiang Frank Wang,
Hung-yi Lee
Abstract:
Recent speech language models (SLMs) typically incorporate pre-trained speech models to extend the capabilities from large language models (LLMs). In this paper, we propose a Descriptive Speech-Text Alignment approach that leverages speech captioning to bridge the gap between speech and text modalities, enabling SLMs to interpret and generate comprehensive natural language descriptions, thereby facilitating the capability to understand both linguistic and non-linguistic features in speech. Enhanced with the proposed approach, our model demonstrates superior performance on the Dynamic-SUPERB benchmark, particularly in generalizing to unseen tasks. Moreover, we discover that the aligned model exhibits a zero-shot instruction-following capability without explicit speech instruction tuning. These findings highlight the potential to reshape instruction-following SLMs by incorporating rich, descriptive speech captions.
Submitted 26 June, 2024;
originally announced June 2024.
-
LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback
Authors:
Bofei Gao,
Zefan Cai,
Runxin Xu,
Peiyi Wang,
Ce Zheng,
Runji Lin,
Keming Lu,
Dayiheng Liu,
Chang Zhou,
Wen Xiao,
Junjie Hu,
Tianyu Liu,
Baobao Chang
Abstract:
Mathematical verifiers achieve success in mathematical reasoning tasks by validating the correctness of solutions. However, existing verifiers are trained with binary classification labels, which are not informative enough for the model to accurately assess the solutions. To mitigate this insufficiency of binary labels, we introduce step-wise natural language feedback as rationale labels (i.e., the correctness of the current step and the explanations). In this paper, we propose Math-Minos, a natural language feedback enhanced verifier, by constructing automatically-generated training data and a two-stage training paradigm for effective training and efficient inference. Our experiments reveal that a small set (30k) of natural language feedback can significantly boost the accuracy of the verifier by 1.6\% (86.6\% $\rightarrow$ 88.2\%) on GSM8K and 0.8\% (37.8\% $\rightarrow$ 38.6\%) on MATH. We have released our code and data for further exploration.
Submitted 8 July, 2024; v1 submitted 20 June, 2024;
originally announced June 2024.
-
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models
Authors:
Guanting Dong,
Keming Lu,
Chengpeng Li,
Tingyu Xia,
Bowen Yu,
Chang Zhou,
Jingren Zhou
Abstract:
One core capability of large language models (LLMs) is to follow natural language instructions. However, the issue of automatically constructing high-quality training data to enhance the complex instruction-following abilities of LLMs without manual annotation remains unresolved. In this paper, we introduce AutoIF, the first scalable and reliable method for automatically generating instruction-following training data. AutoIF transforms the validation of instruction-following data quality into code verification, requiring LLMs to generate instructions, the corresponding code to check the correctness of the instruction responses, and unit test samples to verify the code's correctness. Then, execution feedback-based rejection sampling can generate data for Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) training. AutoIF achieves significant improvements across three training algorithms, SFT, Offline DPO, and Online DPO, when applied to the top open-source LLMs, Qwen2 and LLaMA3, in self-alignment and strong-to-weak distillation settings. Our code is publicly available at https://github.com/QwenLM/AutoIF.
Submitted 19 June, 2024;
originally announced June 2024.
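The verification loop described in the AutoIF abstract above can be illustrated with a minimal sketch. Everything here is a hand-written stand-in: in the described method the instruction, the checking code, and the unit test samples are generated by an LLM, whereas the function names, the example instruction, and the candidate responses below are hypothetical and are not taken from the AutoIF repository.

```python
def verifier_passes_tests(verifier, test_cases):
    """Check the checking code itself against unit test samples."""
    return all(verifier(response) == expected for response, expected in test_cases)


def rejection_sample(verifier, test_cases, candidates):
    """Keep only candidate responses accepted by a verifier that passed its own tests."""
    if not verifier_passes_tests(verifier, test_cases):
        return []  # an unreliable verifier yields no training data
    return [c for c in candidates if verifier(c)]


# Hypothetical instruction: "Answer in at most five words."
def at_most_five_words(response):
    return len(response.split()) <= 5


# Hypothetical unit test samples: (response, expected verdict).
test_cases = [
    ("short answer", True),
    ("this answer has far too many words", False),
]

candidates = [
    "Paris is the capital.",
    "The capital city of France is Paris, of course.",
]

kept = rejection_sample(at_most_five_words, test_cases, candidates)
```

Only responses that survive this filter would be passed on as training data; how the accepted and rejected responses are then used for SFT or preference training is outside this sketch.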
-
Affine $\imath$quantum groups and twisted Yangians in Drinfeld presentations
Authors:
Kang Lu,
Weiqiang Wang,
Weinan Zhang
Abstract:
We formulate a family of algebras, twisted Yangians (of split type) in current generators and relations, via a degeneration of the Drinfeld presentation of affine $\imath$quantum groups (associated with split Satake diagrams). These new algebras admit PBW type bases and are shown to be a deformation of twisted current algebras; presentations for twisted current algebras are also provided. For type AI, it matches with the Drinfeld presentation of twisted Yangian obtained via Gauss decomposition. We conjecture that our split twisted Yangians are isomorphic to the corresponding ones in RTT presentation.
Submitted 7 June, 2024;
originally announced June 2024.
-
Morpho-Photometric Classification of KiDS DR5 Sources Based on Neural Networks: A Comprehensive Star-Quasar-Galaxy Catalog
Authors:
Hai-Cheng Feng,
Rui Li,
Nicola R. Napolitano,
Sha-Sha Li,
J. M. Bai,
Ran Li,
H. T. Liu,
Kai-Xing Lu,
Mario Radovich,
Huan-Yuan Shan,
Jian-Guo Wang,
Wen-Zhe Xi,
Ling-Hua Xie,
Yang-Wei Zhang
Abstract:
We present a novel multimodal neural network for classifying astronomical sources in multiband ground-based observations, from optical to near infrared, to separate sources into stars, galaxies and quasars. Our approach combines a convolutional neural network branch for learning morphological features from $r$-band images with an artificial neural network branch for extracting spectral energy distribution (SED) information. Specifically, we have used 9-band optical ($ugri$) and NIR ($ZYHJK_s$) data from the Kilo-Degree Survey (KiDS) Data Release 5. The outputs of the two branches of the network are concatenated and fed into fully-connected layers for final classification. We train the network on a spectroscopically confirmed sample from the Sloan Digital Sky Survey cross-matched with KiDS. The trained model achieves 98.76\% overall accuracy on an independent testing dataset, with F1 scores exceeding 95\% for each class. Raising the output probability threshold, we obtain higher purity at the cost of lower completeness. We have also validated the network using external catalogs cross-matched with KiDS, correctly classifying 99.74\% of a pure star sample selected from Gaia parallaxes and proper motions, and 99.74\% of an external galaxy sample from the Galaxy and Mass Assembly survey, adjusted for low-redshift contamination. We apply the trained network to 27,334,751 KiDS DR5 sources with $r \leqslant 23$ mag to generate a new classification catalog. This multimodal neural network successfully leverages both morphological and SED information to enable efficient and robust classification of stars, quasars, and galaxies in large photometric surveys.
Submitted 6 June, 2024;
originally announced June 2024.
-
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
Authors:
Zefan Cai,
Yichi Zhang,
Bofei Gao,
Yuliang Liu,
Tianyu Liu,
Keming Lu,
Wayne Xiong,
Yue Dong,
Baobao Chang,
Junjie Hu,
Wen Xiao
Abstract:
In this study, we investigate whether attention-based information flow inside large language models (LLMs) is aggregated through noticeable patterns for long context processing. Our observations reveal that LLMs aggregate information through Pyramidal Information Funneling, where attention is scattered widely in lower layers, progressively consolidates within specific contexts, and ultimately focuses on critical tokens (a.k.a. massive activation or attention sink) in higher layers. Motivated by these insights, we developed PyramidKV, a novel and effective KV cache compression method. This approach dynamically adjusts the KV cache size across different layers, allocating more cache in lower layers and less in higher ones, diverging from traditional methods that maintain a uniform KV cache size. Our experimental evaluations, utilizing the LongBench benchmark, show that PyramidKV matches the performance of models with a full KV cache while retaining only 12% of the KV cache, thus significantly reducing memory usage. In scenarios emphasizing memory efficiency, where only 0.7% of the KV cache is maintained, PyramidKV surpasses other KV cache compression techniques, achieving up to a 20.5 absolute accuracy improvement on TREC.
Submitted 16 June, 2024; v1 submitted 4 June, 2024;
originally announced June 2024.
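The layer-wise allocation described in the PyramidKV abstract above (more cache in lower layers, less in higher ones) can be illustrated with a small sketch. The linear schedule, the function name `pyramid_budgets`, and the `ratio` parameter are assumptions made for illustration; the abstract does not state the exact allocation rule, so this should not be read as the authors' implementation.

```python
def pyramid_budgets(num_layers, avg_budget, ratio=0.5):
    """Per-layer KV cache sizes that shrink linearly from the first layer to the last.

    Assumed schedule (not from the paper): the first layer keeps
    avg_budget * (1 + ratio) entries, the last keeps avg_budget * (1 - ratio),
    so the mean stays close to avg_budget.
    """
    if num_layers == 1:
        return [avg_budget]
    highest = avg_budget * (1 + ratio)
    lowest = avg_budget * (1 - ratio)
    step = (highest - lowest) / (num_layers - 1)
    return [round(highest - i * step) for i in range(num_layers)]


# Four layers with an average of 128 cached entries per layer.
budgets = pyramid_budgets(num_layers=4, avg_budget=128)
uniform = [128] * 4  # the uniform allocation that the abstract contrasts against
```

Both allocations use roughly the same total memory; the pyramid variant only redistributes it toward the lower layers, where the abstract reports attention to be most widely spread.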
-
Towards Scalable Automated Alignment of LLMs: A Survey
Authors:
Boxi Cao,
Keming Lu,
Xinyu Lu,
Jiawei Chen,
Mengjie Ren,
Hao Xiang,
Peilin Liu,
Yaojie Lu,
Ben He,
Xianpei Han,
Le Sun,
Hongyu Lin,
Bowen Yu
Abstract:
Alignment is the most critical step in building large language models (LLMs) that meet human needs. With the rapid development of LLMs gradually surpassing human capabilities, traditional alignment methods based on human-annotation are increasingly unable to meet the scalability demands. Therefore, there is an urgent need to explore new sources of automated alignment signals and technical approaches. In this paper, we systematically review the recently emerging methods of automated alignment, attempting to explore how to achieve effective, scalable, automated alignment once the capabilities of LLMs exceed those of humans. Specifically, we categorize existing automated alignment methods into 4 major categories based on the sources of alignment signals and discuss the current status and potential development of each category. Additionally, we explore the underlying mechanisms that enable automated alignment and discuss the essential factors that make automated alignment technologies feasible and effective from the fundamental role of alignment.
Submitted 3 June, 2024;
originally announced June 2024.
-
Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
Authors:
Keming Lu,
Bowen Yu,
Fei Huang,
Yang Fan,
Runji Lin,
Chang Zhou
Abstract:
Effectively aligning Large Language Models (LLMs) with human-centric values while preventing the degradation of abilities acquired through Pre-training and Supervised Fine-tuning (SFT) poses a central challenge in Reinforcement Learning from Human Feedback (RLHF). In this paper, we first discover that interpolating RLHF and SFT model parameters can adjust the trade-off between human preference and basic capabilities, thereby reducing the alignment tax at the cost of alignment reward. Inspired by this, we propose integrating the RL policy and SFT models at each optimization step in RLHF to continuously regulate the training direction, introducing the Online Merging Optimizer. Specifically, we merge gradients with the parameter differences between SFT and pretrained models, effectively steering the gradient towards maximizing rewards in the direction of SFT optimization. We demonstrate that our optimizer works well with different LLM families, such as Qwen and LLaMA, across various model sizes ranging from 1.8B to 8B, various RLHF algorithms like DPO and KTO, and existing model merging methods. It significantly enhances alignment reward while mitigating alignment tax, achieving higher overall performance across 14 benchmarks.
Submitted 28 May, 2024;
originally announced May 2024.
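The first observation in the Online Merging abstract above, that interpolating RLHF and SFT model parameters trades alignment reward against basic capabilities, can be sketched as a plain linear interpolation. The function name, the `alpha` parameter, and the toy weights are hypothetical; the sketch covers only this offline interpolation and not the per-step Online Merging Optimizer the paper proposes.

```python
def interpolate_parameters(sft, rlhf, alpha):
    """Blend two parameter sets: alpha = 0 returns SFT weights, alpha = 1 returns RLHF weights."""
    return {
        name: [(1 - alpha) * s + alpha * r for s, r in zip(sft[name], rlhf[name])]
        for name in sft
    }


# Toy parameters standing in for real model weights (hypothetical values).
sft_params = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
rlhf_params = {"layer.weight": [2.0, 4.0], "layer.bias": [3.0]}

halfway = interpolate_parameters(sft_params, rlhf_params, alpha=0.5)
```

Sweeping `alpha` between 0 and 1 moves the merged model between the two endpoints, which is the trade-off the abstract refers to.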
-
Note on the union-closed sets conjecture and Reimer's average set size theorem
Authors:
Kengbo Lu,
Abigail Raz
Abstract:
The Union-Closed Sets Conjecture, often attributed to Péter Frankl in 1979, remains an open problem in discrete mathematics. It posits that for any finite family of sets $S\neq\{\emptyset\}$, if the union of any two sets in the family is also in the family, then $\underline{\text{there must exist an element that belongs to at least half of the member sets}}$. We will refer to the underlined text as the abundance condition. In 2001, David Reimer proved that the average set size of a union-closed family $S$ must be at least $\frac{1}{2}\log_{2}|S|$. When proving this result, he showed that a family being union-closed implies that the family satisfies certain conditions, which we will refer to as Reimer's conditions. Therefore, as seen in the context of Tim Gowers' polymath project on the Union-Closed Sets Conjecture, it is natural to ask if all families that satisfy Reimer's conditions meet the abundance condition. A minimal counterexample to this question was offered by Raz in 2017. In this paper, we will discuss a general method to construct infinitely many such counterexamples with any fixed lower bound on the size of the member sets. Furthermore, we will discuss some properties related to these counterexamples, especially those focusing on how far these counterexamples are from being union-closed.
Submitted 29 May, 2024; v1 submitted 17 May, 2024;
originally announced May 2024.
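The three notions in the abstract above (union-closed family, abundance condition, and Reimer's bound on the average set size) can be checked by brute force on a small family. This is an illustrative sketch with hypothetical function names; it does not implement Reimer's conditions or the counterexample construction from the paper.

```python
from itertools import combinations
from math import log2


def is_union_closed(family):
    """True if the union of any two member sets is again a member."""
    return all((a | b) in family for a, b in combinations(family, 2))


def has_abundant_element(family):
    """Abundance condition: some element belongs to at least half of the member sets."""
    universe = set().union(*family)
    return any(2 * sum(x in s for s in family) >= len(family) for x in universe)


def average_set_size(family):
    return sum(len(s) for s in family) / len(family)


# The power set of {1, 2, 3}: a union-closed family with 8 member sets.
family = {frozenset(c) for r in range(4) for c in combinations((1, 2, 3), r)}

reimer_bound = 0.5 * log2(len(family))  # (1/2) * log2 |S|
```

For this family the average set size equals the bound exactly, so the power set is a case where Reimer's inequality is tight.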
-
Achieving millisecond coherence fluxonium through overlap Josephson junctions
Authors:
Fei Wang,
Kannan Lu,
Huijuan Zhan,
Lu Ma,
Feng Wu,
Hantao Sun,
Hao Deng,
Yang Bai,
Feng Bao,
Xu Chang,
Ran Gao,
Xun Gao,
Guicheng Gong,
Lijuan Hu,
Ruizi Hu,
Honghong Ji,
Xizheng Ma,
Liyong Mao,
Zhijun Song,
Chengchun Tang,
Hongcheng Wang,
Tenghui Wang,
Ziang Wang,
Tian Xia,
Hongxin Xu, et al. (10 additional authors not shown)
Abstract:
Fluxonium qubits are recognized for their high coherence times and high operation fidelities, attributed to their unique design incorporating over 100 Josephson junctions per superconducting loop. However, this complexity poses significant fabrication challenges, particularly in achieving high yield and junction uniformity with traditional methods. Here, we introduce an overlap process for Josephson junction fabrication that achieves nearly 100% yield and maintains uniformity across a 2-inch wafer with less than 5% variation for the phase slip junction and less than 2% for the junction array. Our compact junction array design facilitates fluxonium qubits with energy relaxation times exceeding 1 millisecond at the flux frustration point, demonstrating consistency with state-of-the-art dielectric loss tangents and flux noise across multiple devices. This work suggests the scalability of high coherence fluxonium processors using CMOS-compatible processes, marking a significant step towards practical quantum computing.
Submitted 8 May, 2024;
originally announced May 2024.
-
Rapid Mobile App Development for Generative AI Agents on MIT App Inventor
Authors:
Jaida Gao,
Calab Su,
Etai Miller,
Kevin Lu,
Yu Meng
Abstract:
The evolution of Artificial Intelligence (AI) stands as a pivotal force shaping our society, finding applications across diverse domains such as education, sustainability, and safety. Leveraging AI within mobile applications makes it easily accessible to the public, catalyzing its transformative potential. In this paper, we present a methodology for the rapid development of AI agent applications using the development platform provided by MIT App Inventor. To demonstrate its efficacy, we share the development journey of three distinct mobile applications: SynchroNet for fostering sustainable communities; ProductiviTeams for addressing procrastination; and iHELP for enhancing community safety. All three applications seamlessly integrate a spectrum of generative AI features, leveraging OpenAI APIs. Furthermore, we offer insights gleaned from overcoming challenges in integrating diverse tools and AI functionalities, aiming to inspire young developers to join our efforts in building practical AI agent applications.
Submitted 31 March, 2024;
originally announced May 2024.
-
SARMA: Scalable Low-Rank High-Dimensional Autoregressive Moving Averages via Tensor Decomposition
Authors:
Feiqing Huang,
Kexin Lu,
Yao Zheng
Abstract:
Existing models for high-dimensional time series are overwhelmingly developed within the finite-order vector autoregressive (VAR) framework, whereas the more flexible vector autoregressive moving averages (VARMA) have been much less considered. This paper introduces a high-dimensional model for capturing VARMA dynamics, namely the Scalable ARMA (SARMA) model, by combining novel reparameterization and tensor decomposition techniques. To ensure identifiability and computational tractability, we first consider a reparameterization of the VARMA model and discover that this interestingly amounts to a Tucker-low-rank structure for the AR coefficient tensor along the temporal dimension. Motivated by this finding, we further consider Tucker decomposition across the response and predictor dimensions of the AR coefficient tensor, enabling factor extraction across variables and time lags. Additionally, we consider sparsity assumptions on the factor loadings to accomplish automatic variable selection and greater estimation efficiency. For the proposed model, we develop both rank-constrained and sparsity-inducing estimators. Algorithms and model selection methods are also provided. Simulation studies and empirical examples confirm the validity of our theory and advantages of our approaches over existing competitors.
Submitted 1 May, 2024;
originally announced May 2024.
-
Discrete non-commutative hungry Toda lattice and its application in matrix computation
Authors:
Zheng Wang,
Shi-Hao Li,
Kang-Ya Lu,
Jian-Qing Sun
Abstract:
In this paper, we present an eigenvalue algorithm for block Hessenberg matrices based on non-commutative integrable systems and matrix-valued orthogonal polynomials. We introduce adjacent families of matrix-valued $θ$-deformed bi-orthogonal polynomials, and derive the corresponding discrete non-commutative hungry Toda lattice from discrete spectral transformations for polynomials. It is shown that this discrete system can be used as a preprocessing algorithm for block Hessenberg matrices. In addition, convergence analysis and numerical examples of this algorithm are presented.
Submitted 20 April, 2024;
originally announced April 2024.
-
SpatialPIN: Enhancing Spatial Reasoning Capabilities of Vision-Language Models through Prompting and Interacting 3D Priors
Authors:
Chenyang Ma,
Kai Lu,
Ta-Ying Cheng,
Niki Trigoni,
Andrew Markham
Abstract:
Current state-of-the-art spatial reasoning-enhanced VLMs are trained to excel at spatial visual question answering (VQA). However, we believe that higher-level 3D-aware tasks, such as articulating dynamic scene changes and motion planning, require a fundamental and explicit 3D understanding beyond current spatial VQA datasets. In this work, we present SpatialPIN, a framework designed to enhance the spatial reasoning capabilities of VLMs through prompting and interacting with priors from multiple 3D foundation models in a zero-shot, training-free manner. Extensive experiments demonstrate that our spatial reasoning-imbued VLM performs well on various forms of spatial VQA and can extend to help in various downstream robotics tasks such as pick and stack and trajectory planning.
Submitted 6 June, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
-
Re-Search for The Truth: Multi-round Retrieval-augmented Large Language Models are Strong Fake News Detectors
Authors:
Guanghua Li,
Wensheng Lu,
Wei Zhang,
Defu Lian,
Kezhong Lu,
Rui Mao,
Kai Shu,
Hao Liao
Abstract:
The proliferation of fake news has had far-reaching implications on politics, the economy, and society at large. While fake news detection methods have been employed to mitigate this issue, they primarily depend on two essential elements: the quality and relevance of the evidence, and the effectiveness of the verdict prediction mechanism. Traditional methods, which often source information from static repositories like Wikipedia, are limited by outdated or incomplete data, particularly for emerging or rare claims. Large Language Models (LLMs), known for their remarkable reasoning and generative capabilities, introduce a new frontier for fake news detection. However, like traditional methods, LLM-based solutions also grapple with the limitations of stale and long-tail knowledge. Additionally, retrieval-enhanced LLMs frequently struggle with issues such as low-quality evidence retrieval and context length constraints. To address these challenges, we introduce a novel retrieval-augmented LLM framework, the first of its kind to automatically and strategically extract key evidence from web sources for claim verification. Employing a multi-round retrieval strategy, our framework ensures the acquisition of sufficient, relevant evidence, thereby enhancing performance. Comprehensive experiments across three real-world datasets validate the framework's superiority over existing methods. Importantly, our model not only delivers accurate verdicts but also offers human-readable explanations to improve result interpretability.
Submitted 13 March, 2024;
originally announced March 2024.
-
EM-TTS: Efficiently Trained Low-Resource Mongolian Lightweight Text-to-Speech
Authors:
Ziqi Liang,
Haoxiang Shi,
Jiawei Wang,
Keda Lu
Abstract:
Recently, deep learning-based Text-to-Speech (TTS) systems have achieved high-quality speech synthesis results. Recurrent neural networks (RNNs) have become a standard modeling technique for sequential data in TTS systems and are widely used. However, training a TTS model that includes RNN components requires powerful GPUs and takes a long time. In contrast, CNN-based sequence synthesis techniques can significantly reduce the parameters and training time of a TTS model while guaranteeing a certain level of performance due to their high parallelism, which alleviates the economic cost of training. In this paper, we propose a lightweight TTS system based on deep convolutional neural networks, which is a two-stage end-to-end TTS model and does not employ any recurrent units. Our model consists of two stages: Text2Spectrum and SSRN. The former encodes phonemes into a coarse mel spectrogram and the latter synthesizes the complete spectrum from the coarse mel spectrogram. Meanwhile, to address the low-resource Mongolian setting, we improve the robustness of our model through a series of data augmentations, such as noise suppression, time warping, frequency masking and time masking. Experiments show that, compared with mainstream TTS models, our model reduces training time and parameters while ensuring the quality and naturalness of the synthesized speech. Our method is validated on the NCMMSC2022-MTTSC Challenge dataset, where it significantly reduces training time while maintaining a certain level of accuracy.
Submitted 17 March, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation
Authors:
Xinyao Li,
Yuke Li,
Zhekai Du,
Fengling Li,
Ke Lu,
Jingjing Li
Abstract:
Large vision-language models (VLMs) like CLIP have demonstrated good zero-shot learning performance in the unsupervised domain adaptation task. Yet, most transfer approaches for VLMs focus on either the language or visual branches, overlooking the nuanced interplay between both modalities. In this work, we introduce a Unified Modality Separation (UniMoS) framework for unsupervised domain adaptation. Leveraging insights from modality gap studies, we craft a nimble modality separation network that distinctly disentangles CLIP's features into language-associated and vision-associated components. Our proposed Modality-Ensemble Training (MET) method fosters the exchange of modality-agnostic information while maintaining modality-specific nuances. We align features across domains using a modality discriminator. Comprehensive evaluations on three benchmarks reveal our approach sets a new state-of-the-art with minimal computational costs. Code: https://github.com/TL-UESTC/UniMoS
Submitted 11 March, 2024;
originally announced March 2024.
-
Agile Multi-Source-Free Domain Adaptation
Authors:
Xinyao Li,
Jingjing Li,
Fengling Li,
Lei Zhu,
Ke Lu
Abstract:
Efficiently utilizing rich knowledge in pretrained models has become a critical topic in the era of large models. This work focuses on adaptively transferring knowledge from multiple source-pretrained models to an unlabeled target domain without accessing the source data. Despite being a practically useful setting, existing methods require extensive parameter tuning over each source model, which is computationally expensive when facing abundant source domains or larger source models. To address this challenge, we propose a novel approach that requires no parameter tuning over source backbones. Our technical contribution lies in the Bi-level ATtention ENsemble (Bi-ATEN) module, which learns both intra-domain weights and inter-domain ensemble weights to achieve a fine balance between instance specificity and domain consistency. By slightly tuning source bottlenecks, we achieve comparable or even superior performance on the challenging DomainNet benchmark with less than 3% of the trained parameters and 8 times the throughput of the SOTA method. Furthermore, with minor modifications, the proposed module can be easily integrated into existing methods to gain a performance boost of more than 4%. Code is available at https://github.com/TL-UESTC/Bi-ATEN.
Submitted 8 March, 2024;
originally announced March 2024.
-
Domain-Agnostic Mutual Prompting for Unsupervised Domain Adaptation
Authors:
Zhekai Du,
Xinyao Li,
Fengling Li,
Ke Lu,
Lei Zhu,
Jingjing Li
Abstract:
Conventional Unsupervised Domain Adaptation (UDA) strives to minimize distribution discrepancy between domains, which neglects to harness rich semantics from data and struggles to handle complex domain shifts. A promising technique is to leverage the knowledge of large-scale pre-trained vision-language models for more guided adaptation. Despite some endeavors, current methods often learn textual prompts to embed domain semantics for source and target domains separately and perform classification within each domain, limiting cross-domain knowledge transfer. Moreover, prompting only the language branch lacks flexibility to adapt both modalities dynamically. To bridge this gap, we propose Domain-Agnostic Mutual Prompting (DAMP) to exploit domain-invariant semantics by mutually aligning visual and textual embeddings. Specifically, the image contextual information is utilized to prompt the language branch in a domain-agnostic and instance-conditioned way. Meanwhile, visual prompts are imposed based on the domain-agnostic textual prompt to elicit domain-invariant visual embeddings. These two branches of prompts are learned mutually with a cross-attention module and regularized with a semantic-consistency loss and an instance-discrimination contrastive loss. Experiments on three UDA benchmarks demonstrate the superiority of DAMP over state-of-the-art approaches.
Submitted 5 March, 2024;
originally announced March 2024.
-
Probabilistic Routing for Graph-Based Approximate Nearest Neighbor Search
Authors:
Kejing Lu,
Chuan Xiao,
Yoshiharu Ishikawa
Abstract:
Approximate nearest neighbor search (ANNS) in high-dimensional spaces is a pivotal challenge in the field of machine learning. In recent years, graph-based methods have emerged as the superior approach to ANNS, establishing a new state of the art. Although various optimizations for graph-based ANNS have been introduced, they predominantly rely on heuristic methods that lack formal theoretical backing. This paper aims to enhance routing within graph-based ANNS by introducing a method that offers a probabilistic guarantee when exploring a node's neighbors in the graph. We formulate the problem as probabilistic routing and develop two baseline strategies by incorporating locality-sensitive techniques. Subsequently, we introduce PEOs, a novel approach that efficiently identifies which neighbors in the graph should be considered for exact distance calculation, thus significantly improving efficiency in practice. Our experiments demonstrate that equipping PEOs can increase throughput on commonly utilized graph indexes (HNSW and NSSG) by a factor of 1.6 to 2.5, and its efficiency consistently outperforms the leading-edge routing technique by 1.1 to 1.4 times.
Submitted 10 July, 2024; v1 submitted 17 February, 2024;
originally announced February 2024.
-
Empowering Federated Learning for Massive Models with NVIDIA FLARE
Authors:
Holger R. Roth,
Ziyue Xu,
Yuan-Ting Hsieh,
Adithya Renduchintala,
Isaac Yang,
Zhihong Zhang,
Yuhong Wen,
Sean Yang,
Kevin Lu,
Kristopher Kersten,
Camir Ricketts,
Daguang Xu,
Chester Chen,
Yan Cheng,
Andrew Feng
Abstract:
In the ever-evolving landscape of artificial intelligence (AI) and large language models (LLMs), handling and leveraging data effectively has become a critical challenge. Most state-of-the-art machine learning algorithms are data-centric. However, as the lifeblood of model performance, necessary data cannot always be centralized due to various factors such as privacy, regulation, geopolitics, copyright issues, and the sheer effort required to move vast datasets. In this paper, we explore how federated learning enabled by NVIDIA FLARE can address these challenges with easy and scalable integration capabilities, enabling parameter-efficient and full supervised fine-tuning of LLMs for natural language processing and biopharmaceutical applications to enhance their accuracy and robustness.
Submitted 12 February, 2024;
originally announced February 2024.
-
Safe Reinforcement Learning-Based Eco-Driving Control for Mixed Traffic Flows With Disturbances
Authors:
Ke Lu,
Dongjun Li,
Qun Wang,
Kaidi Yang,
Lin Zhao,
Ziyou Song
Abstract:
This paper presents a safe learning-based eco-driving framework tailored for mixed traffic flows, which aims to optimize energy efficiency while guaranteeing safety during real-system operations. Even though reinforcement learning (RL) is capable of optimizing energy efficiency in intricate environments, it is challenged by safety requirements during the training process. The lack of safety guarantees is a further concern when deploying a trained policy in real-world applications. Compared with RL, model predictive control (MPC) can handle constrained dynamic systems, ensuring safe driving. However, the major challenges lie in complicated eco-driving tasks and the presence of disturbances, which respectively challenge the MPC design and the satisfaction of constraints. To address these limitations, the proposed framework incorporates the tube-based enhanced MPC (RMPC) to ensure the safe execution of the RL policy under disturbances, thereby improving the control robustness. RL not only optimizes the energy efficiency of the connected and automated vehicle in mixed traffic but also handles more uncertain scenarios, in which the energy consumption of the human-driven vehicle and its diverse and stochastic driving behaviors are considered in the optimization framework. Simulation results demonstrate that the proposed algorithm achieves an average improvement of 10.88% in holistic energy efficiency over the RMPC technique and, unlike the RL algorithm alone, effectively prevents inter-vehicle collisions.
Submitted 31 January, 2024;
originally announced January 2024.
-
INSTILLER: Towards Efficient and Realistic RTL Fuzzing
Authors:
Gen Zhang,
Pengfei Wang,
Tai Yue,
Danjun Liu,
Yubei Guo,
Kai Lu
Abstract:
Bugs exist in hardware, such as CPUs. Unlike software bugs, these hardware bugs need to be detected before deployment. Previous fuzzing work in CPU bug detection has several disadvantages, e.g., the length of RTL input instructions keeps growing, and longer inputs are ineffective for fuzzing. In this paper, we propose INSTILLER (Instruction Distiller), an RTL fuzzer based on ant colony optimization (ACO). First, to keep the input instruction length short and efficient in fuzzing, it distills input instructions with a variant of ACO (VACO). Second, related work cannot simulate realistic interruptions well in fuzzing, and INSTILLER solves the problem of inserting interruptions and exceptions when generating the inputs. Third, to further improve the fuzzing performance of INSTILLER, we propose hardware-based seed selection and mutation strategies. We implement a prototype and conduct extensive experiments against state-of-the-art fuzzing work on real-world target CPU cores. In experiments, INSTILLER achieves 29.4% more coverage than DiFuzzRTL. In addition, 17.0% more mismatches are detected by INSTILLER. With the VACO algorithm, INSTILLER generates 79.3% shorter input instructions than DiFuzzRTL, demonstrating its effectiveness in distilling the input instructions. In addition, the distillation leads to a 6.7% increase in execution speed on average.
Submitted 29 January, 2024;
originally announced January 2024.
-
MobFuzz: Adaptive Multi-objective Optimization in Gray-box Fuzzing
Authors:
Gen Zhang,
Pengfei Wang,
Tai Yue,
Xiangdong Kong,
Shan Huang,
Xu Zhou,
Kai Lu
Abstract:
Coverage-guided gray-box fuzzing (CGF) is an efficient software testing technique. There are usually multiple objectives to optimize in CGF. However, existing CGF methods cannot successfully find the optimal values for multiple objectives simultaneously. In this paper, we propose a gray-box fuzzer for multi-objective optimization (MOO) called MobFuzz. We model the multi-objective optimization process as a multi-player multi-armed bandit (MPMAB). First, it adaptively selects the objective combination that contains the most appropriate objectives for the current situation. Second, our model deals with the power schedule, which adaptively allocates energy to the seeds under the chosen objective combination. In MobFuzz, we propose an evolutionary algorithm called NIC to optimize our chosen objectives simultaneously without incurring additional performance overhead. To prove the effectiveness of MobFuzz, we conduct experiments on 12 real-world programs and the MAGMA data set. Experiment results show that multi-objective optimization in MobFuzz outperforms single-objective fuzzing in the baseline fuzzers. In contrast to them, MobFuzz can select the optimal objective combination and increase the values of multiple objectives up to 107%, with at most a 55% reduction in the energy consumption. Moreover, MobFuzz has up to 6% more program coverage and finds 3x more unique bugs than the baseline fuzzers. The NIC algorithm has at least a 2x improvement with a performance overhead of approximately 3%.
Submitted 29 January, 2024;
originally announced January 2024.
-
Improving Expressive Power of Spectral Graph Neural Networks with Eigenvalue Correction
Authors:
Kangkang Lu,
Yanhua Yu,
Hao Fei,
Xuan Li,
Zixuan Yang,
Zirui Guo,
Meiyu Liang,
Mengran Yin,
Tat-Seng Chua
Abstract:
In recent years, spectral graph neural networks, characterized by polynomial filters, have garnered increasing attention and have achieved remarkable performance in tasks such as node classification. These models typically assume that eigenvalues for the normalized Laplacian matrix are distinct from each other, thus expecting a polynomial filter to have a high fitting ability. However, this paper empirically observes that normalized Laplacian matrices frequently possess repeated eigenvalues. Moreover, we theoretically establish that the number of distinguishable eigenvalues plays a pivotal role in determining the expressive power of spectral graph neural networks. In light of this observation, we propose an eigenvalue correction strategy that can free polynomial filters from the constraints of repeated eigenvalue inputs. Concretely, the proposed eigenvalue correction strategy enhances the uniform distribution of eigenvalues, thus mitigating repeated eigenvalues, and improving the fitting capacity and expressive power of polynomial filters. Extensive experimental results on both synthetic and real-world datasets demonstrate the superiority of our method.
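The observation about repeated eigenvalues can be checked on a toy graph. The following is a minimal sketch, not the authors' method: it assumes NumPy is available, and the helper names (`normalized_laplacian`, `distinct_eigenvalues`, `star_graph`) are hypothetical. It builds the normalized Laplacian of a star graph and counts its distinct eigenvalues. Because a polynomial filter h(λ) assigns the same response to equal eigenvalues, only the distinct values can be shaped independently.

```python
import numpy as np

def normalized_laplacian(adj):
    """Return L = I - D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    return np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def distinct_eigenvalues(adj, tol=1e-8):
    """Return all eigenvalues (ascending) and the distinct ones up to tol."""
    eigvals = np.linalg.eigvalsh(normalized_laplacian(adj))
    distinct = [eigvals[0]]
    for value in eigvals[1:]:
        if value - distinct[-1] > tol:
            distinct.append(value)
    return eigvals, np.array(distinct)

def star_graph(n_leaves):
    """Adjacency matrix of a star: node 0 is the hub, the rest are leaves."""
    n = n_leaves + 1
    adj = np.zeros((n, n))
    adj[0, 1:] = 1.0
    adj[1:, 0] = 1.0
    return adj

eigvals, distinct = distinct_eigenvalues(star_graph(5))
print(len(eigvals), "eigenvalues,", len(distinct), "distinct")
```

For this star graph the eigenvalue 1 is repeated, so a filter of any polynomial degree can realize at most three independent spectral responses on it.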
Submitted 18 March, 2024; v1 submitted 28 January, 2024;
originally announced January 2024.
-
Color Maker: a Mixed-Initiative Approach to Creating Accessible Color Maps
Authors:
Amey Salvi,
Kecheng Lu,
Michael E. Papka,
Yunhai Wang,
Khairi Reda
Abstract:
Quantitative data is frequently represented using color, yet designing effective color mappings is a challenging task, requiring one to balance perceptual standards with personal color preference. Current design tools either overwhelm novices with complexity or offer limited customization options. We present ColorMaker, a mixed-initiative approach for creating colormaps. ColorMaker combines fluid user interaction with real-time optimization to generate smooth, continuous color ramps. Users specify their loose color preferences while leaving the algorithm to generate precise color sequences, meeting both designer needs and established guidelines. ColorMaker can create new colormaps, including designs accessible for people with color-vision deficiencies, starting from scratch or with only partial input, thus supporting ideation and iterative refinement. We show that our approach can generate designs with similar or superior perceptual characteristics to standard colormaps. A user study demonstrates how designers of varying skill levels can use this tool to create custom, high-quality colormaps. ColorMaker is available at https://colormaker.org
Submitted 26 January, 2024;
originally announced January 2024.
-
Online Distributed Optimization with Clipped Stochastic Gradients: High Probability Bound of Regrets
Authors:
Yuchen Yang,
Kaihong Lu,
Long Wang
Abstract:
In this paper, the problem of distributed optimization is studied via a network of agents. Each agent only has access to a stochastic gradient of its own objective function in the previous time, and can communicate with its neighbors via a network. To handle this problem, an online distributed clipped stochastic gradient descent algorithm is proposed. Dynamic regrets are used to capture the performance of the algorithm. Particularly, the high probability bounds of regrets are analyzed when the stochastic gradients satisfy the heavy-tailed noise condition. For the convex case, the offline benchmark of the dynamic regret is to seek the minimizer of the objective function each time. Under mild assumptions on the graph connectivity, we prove that the dynamic regret grows sublinearly with high probability under a certain clipping parameter. For the non-convex case, the offline benchmark of the dynamic regret is to find the stationary point of the objective function each time. We show that the dynamic regret grows sublinearly with high probability if the variation of the objective function grows within a certain rate. Finally, numerical simulations are provided to demonstrate the effectiveness of our theoretical results.
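The general shape of a distributed clipped stochastic gradient step can be sketched as follows. This is an illustrative sketch under stated assumptions, not the algorithm analyzed in the paper: the mixing matrix `weights`, the step size `step`, the clipping threshold `tau`, and both function names are hypothetical. Each agent first averages the states of its neighbors and then moves along its own stochastic gradient after rescaling it to norm at most `tau`, which bounds the effect of heavy-tailed noise on any single update.

```python
import math

def clip(grad, tau):
    """Rescale grad so that its Euclidean norm is at most tau."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= tau or norm == 0.0:
        return list(grad)
    return [g * tau / norm for g in grad]

def distributed_clipped_sgd_step(states, weights, stoch_grads, step, tau):
    """One round: consensus averaging, then a clipped gradient step per agent.

    states      -- list of per-agent state vectors
    weights     -- mixing matrix; weights[i][j] > 0 only if j is a neighbor of i
    stoch_grads -- list of per-agent stochastic gradients
    """
    n = len(states)
    dim = len(states[0])
    new_states = []
    for i in range(n):
        # Weighted average of neighbor states (consensus step).
        mixed = [sum(weights[i][j] * states[j][k] for j in range(n))
                 for k in range(dim)]
        # Local descent step using the clipped stochastic gradient.
        g = clip(stoch_grads[i], tau)
        new_states.append([mixed[k] - step * g[k] for k in range(dim)])
    return new_states

# Two agents with scalar states; agent 0 receives an outlier gradient.
states = [[0.0], [2.0]]
weights = [[0.5, 0.5], [0.5, 0.5]]
grads = [[10.0], [-0.5]]
print(distributed_clipped_sgd_step(states, weights, grads, step=0.1, tau=1.0))
```

In this usage example the outlier gradient of agent 0 is rescaled to norm 1 before the update, while the small gradient of agent 1 passes through unchanged.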
Submitted 26 January, 2024;
originally announced January 2024.
-
Value-Driven Mixed-Precision Quantization for Patch-Based Inference on Microcontrollers
Authors:
Wei Tao,
Shenglin He,
Kai Lu,
Xiaoyang Qu,
Guokuan Li,
Jiguang Wan,
Jianzong Wang,
Jing Xiao
Abstract:
Deploying neural networks on microcontroller units (MCUs) presents substantial challenges due to their constrained computation and memory resources. Previous research has explored patch-based inference as a strategy to conserve memory without sacrificing model accuracy. However, this technique suffers from severe redundant computation overhead, leading to a substantial increase in execution latency. A feasible solution to address this issue is mixed-precision quantization, but it faces the challenges of accuracy degradation and a time-consuming search. In this paper, we propose QuantMCU, a novel patch-based inference method that utilizes value-driven mixed-precision quantization to reduce redundant computation. We first utilize value-driven patch classification (VDPC) to maintain the model accuracy. VDPC classifies patches into two classes based on whether they contain outlier values. For patches containing outlier values, we apply 8-bit quantization to the feature maps on the dataflow branches that follow. In addition, for patches without outlier values, we utilize value-driven quantization search (VDQS) on the feature maps of their following dataflow branches to reduce search time. Specifically, VDQS introduces a novel quantization search metric that takes into account both computation and accuracy, and it employs entropy as an accuracy representation to avoid additional training. VDQS also adopts an iterative approach to determine the bitwidth of each feature map to further accelerate the search process. Experimental results on real-world MCU devices show that QuantMCU can reduce computation by 2.2x on average while maintaining comparable model accuracy compared to the state-of-the-art patch-based inference methods.
△ Less
Submitted 23 January, 2024;
originally announced January 2024.
-
Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment
Authors:
Keming Lu,
Bowen Yu,
Chang Zhou,
Jingren Zhou
Abstract:
Considerable efforts have been invested in augmenting the role-playing proficiency of open-source large language models (LLMs) by emulating proprietary counterparts. Nevertheless, we posit that LLMs inherently harbor role-play capabilities, owing to the extensive knowledge of characters and potential dialogues ingrained in their vast training corpora. Thus, in this study, we introduce Ditto, a self-alignment method for role-play. Ditto capitalizes on character knowledge, encouraging an instruction-following LLM to simulate role-play dialogues as a variant of reading comprehension. This method creates a role-play training set comprising 4,000 characters, surpassing the scale of currently available datasets by tenfold regarding the number of roles. Subsequently, we fine-tune the LLM using this self-generated dataset to augment its role-playing capabilities. When evaluated on our meticulously constructed and reproducible role-play benchmark and the role-play subset of MT-Bench, Ditto, at various parameter scales, consistently maintains role identity and provides accurate role-specific knowledge in multi-turn role-play conversations. Notably, it outperforms all open-source role-play baselines, showcasing performance levels comparable to advanced proprietary chatbots. Furthermore, we present the first comprehensive cross-supervision alignment experiment in the role-play domain, revealing that the intrinsic capabilities of LLMs confine the knowledge within role-play. Meanwhile, the role-play styles can be easily acquired with the guidance of smaller models. We open-source related resources at https://github.com/OFA-Sys/Ditto.
Submitted 22 January, 2024;
originally announced January 2024.
-
Correcting the Contamination of Second-order Spectra: Improving Hα Measurements in Reverberation Mapping Campaigns
Authors:
Wen-Zhe Xi,
Kai-Xing Lu,
Hai-Cheng Feng,
Sha-Sha Li,
Jin-Ming Bai,
Rui-Lei Zhou,
Hong-Tao Liu,
Jian-Guo Wang
Abstract:
Long-term spectroscopic monitoring campaigns on active galactic nuclei (AGNs) provide a wealth of information about their interior structure and kinematics. However, a number of the observations suffer from the contamination of second-order spectra (SOS), which introduces undesirable uncertainties at the red side of the spectra. In this paper, we test the effect of SOS and propose a method to correct it in time-domain spectroscopic data using the simultaneously observed comparison stars. Based on the reverberation mapping (RM) data of NGC 5548 in 2019, one of the most intensively monitored AGNs by the Lijiang 2.4 m telescope, we find that the scientific object, comparison star, and spectrophotometric standard star can jointly introduce up to ~30% SOS for Grism 14. This irregular but smooth SOS significantly affects the flux density and profile of the emission line, while having little effect on the light curve. After applying our method to each spectrum, we find that the SOS can be corrected effectively. The deviation between corrected and intrinsic spectra is ~2%, and the impact of SOS on time lag is very minor. This method makes it possible to obtain Hα RM measurements from archival data, provided that the spectral shape of the AGN under investigation does not change substantially.
Submitted 22 January, 2024;
originally announced January 2024.
-
Finite solvable tidy Groups whose orders are divisible by two primes
Authors:
Nicolas F. Beike,
Rachel Carleton,
David G. Costanzo,
Colin Heath,
Mark L. Lewis,
Kaiwen Lu,
Jamie D. Pearce
Abstract:
In this paper, we investigate finite solvable tidy groups. We classify the tidy $\{ p, q \}$-groups. Combining this with a previous result, we are able to characterize the finite tidy solvable groups. Using this characterization, we bound the Fitting height of finite tidy solvable groups and we prove that the quotients of finite tidy solvable groups are tidy.
Submitted 21 January, 2024;
originally announced January 2024.
-
Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision
Authors:
Chih-Kai Yang,
Kuan-Po Huang,
Ke-Han Lu,
Chun-Yi Kuan,
Chi-Yuan Hsiao,
Hung-yi Lee
Abstract:
This work evaluated several cutting-edge large-scale foundation models based on self-supervision or weak supervision, including SeamlessM4T, SeamlessM4T v2, and Whisper-large-v3, on three code-switched corpora. We found that self-supervised models can achieve performances close to the supervised model, indicating the effectiveness of multilingual self-supervised pre-training. We also observed that these models still have room for improvement as they kept making similar mistakes and had unsatisfactory performances on modeling intra-sentential code-switching. In addition, the validity of several variants of Whisper was explored, and we concluded that they remained effective in a code-switching scenario, and similar techniques for self-supervised models are worth studying to boost the performance of code-switched tasks.
Submitted 30 December, 2023;
originally announced January 2024.
-
A computationally efficient semi-blind source separation based approach for nonlinear echo cancellation based on an element-wise iterative source steering
Authors:
Kunxing Lu,
Xianrui Wang,
Tetsuya Ueda,
Shoji Makino,
Jingdong Chen
Abstract:
While the semi-blind source separation-based acoustic echo cancellation (SBSS-AEC) has received much research attention due to its promising performance during double-talk compared to the traditional adaptive algorithms, it suffers from system latency and nonlinear distortions. To circumvent these drawbacks, the recently developed ideas on convolutive transfer function (CTF) approximation and nonlinear expansion have been used in the iterative projection (IP)-based semi-blind source separation (SBSS) algorithm. However, because of the introduction of CTF approximation and nonlinear expansion, this algorithm becomes computationally very expensive, which makes it difficult to implement in embedded systems. Thus, we attempt in this paper to improve this IP-based algorithm, thereby developing an element-wise iterative source steering (EISS) algorithm. In comparison with the IP-based SBSS algorithm, the proposed algorithm is computationally much more efficient, especially when the nonlinear expansion order is high and the length of the CTF filter is long. Meanwhile, its AEC performance is as good as that of IP-based SBSS.
Submitted 13 December, 2023;
originally announced December 2023.
-
Supervised Factor Modeling for High-Dimensional Linear Time Series
Authors:
Feiqing Huang,
Kexin Lu,
Guodong Li
Abstract:
Motivated by Tucker tensor decomposition, this paper imposes low-rank structures on the column and row spaces of coefficient matrices in a multivariate infinite-order vector autoregression (VAR), which leads to a supervised factor model with two factor modelings being conducted on responses and predictors simultaneously. Interestingly, the stationarity condition implies an intrinsic weak group sparsity mechanism of infinite-order VAR, and hence a rank-constrained group Lasso estimation is considered for high-dimensional linear time series. Its non-asymptotic properties are discussed thoroughly by balancing the estimation, approximation and truncation errors. Moreover, an alternating gradient descent algorithm with thresholding is designed to search for high-dimensional estimates, and its theoretical justifications, including statistical and convergence analysis, are also provided. Theoretical and computational properties of the proposed methodology are verified by simulation experiments, and the advantages over existing methods are demonstrated by two real examples.
Submitted 30 November, 2023;
originally announced December 2023.
-
Twisted super Yangians of type AIII and their representations
Authors:
Kang Lu
Abstract:
We study the super analogue of the Molev-Ragoucy reflection algebras, which we call twisted super Yangians of type AIII, and classify their finite-dimensional irreducible representations. These superalgebras are coideal subalgebras of the super Yangian $\mathscr{Y}(\mathfrak{gl}_{m|n})$ and are associated with symmetric pairs of type AIII in Cartan's classification. We establish the Schur-Weyl type duality between degenerate affine Hecke algebras of type BC and twisted super Yangians.
Submitted 27 November, 2023;
originally announced November 2023.
-
FlashOcc: Fast and Memory-Efficient Occupancy Prediction via Channel-to-Height Plugin
Authors:
Zichen Yu,
Changyong Shu,
Jiajun Deng,
Kangjie Lu,
Zongdai Liu,
Jiangyong Yu,
Dawei Yang,
Hui Li,
Yan Chen
Abstract:
Given the capability of mitigating the long-tail deficiencies and intricate-shaped absence prevalent in 3D object detection, occupancy prediction has become a pivotal component in autonomous driving systems. However, the processing of three-dimensional voxel-level representations inevitably introduces large overhead in both memory and computation, obstructing the deployment of existing occupancy prediction approaches. In contrast to the trend of making the model larger and more complicated, we argue that a desirable framework should be deployment-friendly to diverse chips while maintaining high precision. To this end, we propose a plug-and-play paradigm, namely FlashOCC, to consolidate rapid and memory-efficient occupancy prediction while maintaining high precision. Particularly, our FlashOCC makes two improvements based on the contemporary voxel-level occupancy prediction approaches. Firstly, the features are kept in the BEV, enabling the employment of efficient 2D convolutional layers for feature extraction. Secondly, a channel-to-height transformation is introduced to lift the output logits from the BEV into the 3D space. We apply FlashOCC to diverse occupancy prediction baselines on the challenging Occ3D-nuScenes benchmarks and conduct extensive experiments to validate its effectiveness. The results substantiate the superiority of our plug-and-play paradigm over previous state-of-the-art methods in terms of precision, runtime efficiency, and memory costs, demonstrating its potential for deployment. The code will be made available.
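The channel-to-height idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `channel_to_height`, the class-major channel layout (channel index = class * depth + height bin), and the use of plain nested lists instead of tensors are all assumptions made for illustration.

```python
def channel_to_height(bev, num_classes, depth):
    """Split the channel axis of BEV logits into (class, height) axes.

    bev: nested list indexed as [channel][row][col], where the number of
         channels is assumed to equal num_classes * depth.
    Returns a nested list indexed as [class][height][row][col].
    The class-major channel layout is an assumption of this sketch.
    """
    if len(bev) != num_classes * depth:
        raise ValueError("channel count must equal num_classes * depth")
    # Each bev[k] is one 2D plane; regroup the planes without copying them.
    return [
        [bev[c * depth + z] for z in range(depth)]
        for c in range(num_classes)
    ]


# Hypothetical input: 6 channels (3 classes x 2 height bins) on a 2x2 BEV grid.
bev = [[[k * 100 + i * 10 + j for j in range(2)] for i in range(2)]
       for k in range(6)]
voxels = channel_to_height(bev, num_classes=3, depth=2)
```

The point of the sketch is that the lift into 3D is only a regrouping of 2D planes, so no 3D convolution is needed to produce voxel-level logits.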
Submitted 18 November, 2023;
originally announced November 2023.
-
Speculative Contrastive Decoding
Authors:
Hongyi Yuan,
Keming Lu,
Fei Huang,
Zheng Yuan,
Chang Zhou
Abstract:
Large language models (LLMs) exhibit exceptional performance in language tasks, yet their auto-regressive inference is limited due to high computational requirements and is sub-optimal due to the exposure bias. Inspired by speculative decoding and contrastive decoding, we introduce Speculative Contrastive Decoding (SCD), a straightforward yet powerful decoding approach that leverages predictions from smaller language models (LMs) to achieve both decoding acceleration and quality improvement. Extensive evaluations and analyses on four diverse language tasks demonstrate the effectiveness of SCD, showing that decoding efficiency and quality can compatibly benefit from one smaller LM.
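As background for the contrastive half of the method, the sketch below shows the token-scoring rule commonly used in contrastive decoding: the log-probability under a larger "expert" model minus the log-probability under a smaller "amateur" model, restricted to tokens the expert finds plausible. The abstract does not give SCD's exact formulation, so the function names, the `alpha` plausibility threshold, and the example probabilities are assumptions, and the speculative (draft-and-verify) half is not shown.

```python
import math


def contrastive_scores(expert_probs, amateur_probs, alpha=0.1):
    """Score each token by log p_expert - log p_amateur.

    Tokens whose expert probability falls below alpha * max(expert_probs)
    are masked with -inf (a plausibility constraint). Amateur probabilities
    are clamped away from zero to keep the logarithm finite.
    """
    cutoff = alpha * max(expert_probs)
    scores = []
    for p_e, p_a in zip(expert_probs, amateur_probs):
        if p_e >= cutoff:
            scores.append(math.log(p_e) - math.log(max(p_a, 1e-12)))
        else:
            scores.append(float("-inf"))
    return scores


def pick_token(expert_probs, amateur_probs, alpha=0.1):
    """Return the index of the token with the highest contrastive score."""
    scores = contrastive_scores(expert_probs, amateur_probs, alpha)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical next-token distributions over a 4-token vocabulary.
expert = [0.5, 0.3, 0.15, 0.05]
amateur = [0.6, 0.1, 0.25, 0.05]
choice = pick_token(expert, amateur, alpha=0.2)
```

In this made-up example the expert's greedy choice would be token 0, but the contrastive rule prefers token 1, which the expert rates much higher than the amateur does.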
Submitted 13 March, 2024; v1 submitted 15 November, 2023;
originally announced November 2023.
-
Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models
Authors:
Keming Lu,
Hongyi Yuan,
Runji Lin,
Junyang Lin,
Zheng Yuan,
Chang Zhou,
Jingren Zhou
Abstract:
The complementary potential of Large Language Models (LLMs) assumes off-the-shelf LLMs have heterogeneous expertise in a wide range of domains and tasks so that an ensemble of LLMs can achieve consistently better performance. Existing ensemble methods for LLMs mainly focus on reward model ranking of outputs, leading to significant computation overhead. To combat this issue, we revisit the complementary potential of LLMs and further elaborate it by mining latent expertise with off-the-shelf reward models. We propose Zooter, a reward-guided routing method distilling rewards on training queries to train a routing function, which can precisely distribute each query to the LLM with expertise about it. We also integrate a tag-based label enhancement to mitigate noise from uncertainty when using rewards as silver supervision. Zooter is computationally efficient at inference, as it introduces only the minor overhead of a routing function compared with reward model ranking methods. We evaluate Zooter on a comprehensive benchmark collection with 26 subsets on different domains and tasks. Zooter outperforms the best single model on average and ranks first on 44% of tasks, even surpassing multiple reward model ranking methods.
Submitted 14 November, 2023;
originally announced November 2023.
-
Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning
Authors:
Shengguang Wu,
Keming Lu,
Benfeng Xu,
Junyang Lin,
Qi Su,
Chang Zhou
Abstract:
Enhancing the instruction-following ability of Large Language Models (LLMs) primarily demands substantial instruction-tuning datasets. However, the sheer volume of these imposes a considerable computational burden and annotation cost. To investigate a label-efficient instruction tuning method that allows the model itself to actively sample subsets that are equally or even more effective, we introduce a self-evolving mechanism DiverseEvol. In this process, a model iteratively augments its training subset to refine its own performance, without requiring any intervention from humans or more advanced LLMs. The key to our data sampling technique lies in the enhancement of diversity in the chosen subsets, as the model selects new data points most distinct from any existing ones according to its current embedding space. Extensive experiments across three datasets and benchmarks demonstrate the effectiveness of DiverseEvol. Our models, trained on less than 8% of the original dataset, maintain or improve performance compared with finetuning on full data. We also provide empirical evidence to analyze the importance of diversity in instruction data and the iterative scheme as opposed to one-time sampling. Our code is publicly available at https://github.com/OFA-Sys/DiverseEvol.git.
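The selection rule described in the abstract (pick new data points most distinct from those already chosen, measured in an embedding space) resembles greedy farthest-point sampling. The sketch below illustrates that general technique under stated assumptions: the function name, the Euclidean distance, and the toy 2D embeddings are hypothetical, and it is not taken from the DiverseEvol repository.

```python
import math


def farthest_point_select(embeddings, selected, k):
    """Greedily pick up to k indices that are farthest from the chosen set.

    embeddings: list of equal-length numeric tuples (one per data point).
    selected:   non-empty list of indices already in the training subset.
    Returns the newly chosen indices in the order they were picked.
    """
    # Distance from every point to its nearest already-selected point.
    min_dist = [
        min(math.dist(e, embeddings[s]) for s in selected)
        for e in embeddings
    ]
    chosen = []
    for _ in range(k):
        idx = max(range(len(embeddings)), key=min_dist.__getitem__)
        if min_dist[idx] == 0.0:
            break  # every remaining point duplicates a selected one
        chosen.append(idx)
        # The new pick may now be the nearest selected point for others.
        for i, e in enumerate(embeddings):
            min_dist[i] = min(min_dist[i], math.dist(e, embeddings[idx]))
    return chosen


# Hypothetical 2D embeddings; index 0 is the initial training subset.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (0.0, 9.0), (0.5, 0.5)]
new_indices = farthest_point_select(points, selected=[0], k=2)
```

In the iterative scheme the abstract describes, the embeddings would be recomputed with the current model before each round; this sketch covers a single round only.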
Submitted 14 November, 2023;
originally announced November 2023.
-
Exploring ChatGPT's Capabilities on Vulnerability Management
Authors:
Peiyu Liu,
Junming Liu,
Lirong Fu,
Kangjie Lu,
Yifan Xia,
Xuhong Zhang,
Wenzhi Chen,
Haiqin Weng,
Shouling Ji,
Wenhai Wang
Abstract:
Recently, ChatGPT has attracted great attention from the code analysis domain. Prior works show that ChatGPT has the capabilities of processing foundational code analysis tasks, such as abstract syntax tree generation, which indicates the potential of using ChatGPT to comprehend code syntax and static behaviors. However, it is unclear whether ChatGPT can complete more complicated real-world vulnerability management tasks, such as the prediction of security relevance and patch correctness, which require an all-encompassing understanding of various aspects, including code syntax, program semantics, and related manual comments.
In this paper, we explore ChatGPT's capabilities on 6 tasks involving the complete vulnerability management process with a large-scale dataset containing 70,346 samples. For each task, we compare ChatGPT against SOTA approaches, investigate the impact of different prompts, and explore the difficulties. The results suggest promising potential in leveraging ChatGPT to assist vulnerability management. One notable example is ChatGPT's proficiency in tasks like generating titles for software bug reports. Furthermore, our findings reveal the difficulties encountered by ChatGPT and shed light on promising future directions. For instance, directly providing random demonstration examples in the prompt cannot consistently guarantee good performance in vulnerability management. By contrast, leveraging ChatGPT in a self-heuristic way (extracting expertise from the demonstration examples itself and integrating the extracted expertise into the prompt) is a promising research direction. Besides, ChatGPT may misunderstand and misuse the information in the prompt. Consequently, effectively guiding ChatGPT to focus on helpful information rather than irrelevant content is still an open problem.
Submitted 20 June, 2024; v1 submitted 11 November, 2023;
originally announced November 2023.
-
Empowering high-dimensional optical fiber communications with integrated photonic processors
Authors:
Kaihang Lu,
Zengqi Chen,
Hao Chen,
Wu Zhou,
Zunyue Zhang,
Hon Ki Tsang,
Yeyu Tong
Abstract:
Mode division multiplexing (MDM) in optical fibers enables multichannel capabilities for various applications, including data transmission, quantum networks, imaging, and sensing. However, MDM optical fiber systems usually necessitate bulk-optics approaches for launching different orthogonal fiber modes into the multimode optical fiber, and multiple-input multiple-output digital electronic signal processing at the receiver side to undo the arbitrary mode scrambling in a circular-core optical fiber. Here we show that a high-dimensional optical fiber communication system can be entirely implemented by a reconfigurable integrated photonic processor, featuring kernels of multichannel mode multiplexing transmitter and all-optical descrambling receiver. High-speed and inter-chip communications involving six spatial and polarization modes have been experimentally demonstrated with high efficiency and high-quality eye diagrams, despite the presence of random mode scrambling and polarization rotation in a circular-core few-mode fiber. The proposed photonic integration approach holds promising prospects for future space-division multiplexing applications.
Submitted 9 November, 2023;
originally announced November 2023.
-
DynPoint: Dynamic Neural Point For View Synthesis
Authors:
Kaichen Zhou,
Jia-Xing Zhong,
Sangyun Shin,
Kai Lu,
Yiyuan Yang,
Andrew Markham,
Niki Trigoni
Abstract:
The introduction of neural radiance fields has greatly improved the effectiveness of view synthesis for monocular videos. However, existing algorithms face difficulties when dealing with uncontrolled or lengthy scenarios, and require extensive training time specific to each new scenario. To tackle these limitations, we propose DynPoint, an algorithm designed to facilitate the rapid synthesis of novel views for unconstrained monocular videos. Rather than encoding the entirety of the scenario information into a latent representation, DynPoint concentrates on predicting the explicit 3D correspondence between neighboring frames to realize information aggregation. Specifically, this correspondence prediction is achieved through the estimation of consistent depth and scene flow information across frames. Subsequently, the acquired correspondence is utilized to aggregate information from multiple reference frames to a target frame, by constructing hierarchical neural point clouds. The resulting framework enables swift and accurate view synthesis for desired views of target frames. The experimental results obtained demonstrate the considerable acceleration of training time achieved - typically an order of magnitude - by our proposed method while yielding comparable outcomes compared to prior approaches. Furthermore, our method exhibits strong robustness in handling long-duration videos without learning a canonical representation of video content.
Submitted 18 January, 2024; v1 submitted 29 October, 2023;
originally announced October 2023.
-
Multi-band analyses of the bright GRB 230812B and the associated SN2023pel
Authors:
T. Hussenot-Desenonges,
T. Wouters,
N. Guessoum,
I. Abdi,
A. Abulwfa,
C. Adami,
J. F. Agüí Fernández,
T. Ahumada,
V. Aivazyan,
D. Akl,
S. Anand,
C. M. Andrade,
S. Antier,
S. A. Ata,
P. D'Avanzo,
Y. A. Azzam,
A. Baransky,
S. Basa,
M. Blazek,
P. Bendjoya,
S. Beradze,
P. Boumis,
M. Bremer,
R. Brivio,
V. Buat
, et al. (87 additional authors not shown)
Abstract:
GRB 230812B is a bright and relatively nearby ($z = 0.36$) long gamma-ray burst (GRB) that has generated significant interest in the community and has thus been observed over the entire electromagnetic spectrum. We report over 80 observations in X-ray, ultraviolet, optical, infrared, and sub-millimeter bands from the GRANDMA (Global Rapid Advanced Network for Multi-messenger Addicts) network of observatories and from observational partners. Adding complementary data from the literature, we then derive essential physical parameters associated with the ejecta and external properties (i.e. the geometry and environment) of the GRB and compare with other analyses of this event. We spectroscopically confirm the presence of an associated supernova, SN2023pel, and we derive a photospheric expansion velocity of $v \sim 17 \times 10^3$ km s$^{-1}$. We analyze the photometric data first using empirical fits of the flux and then with full Bayesian Inference. We again strongly establish the presence of a supernova in the data, with a maximum (pseudo-)bolometric luminosity of $5.75 \times 10^{42}$ erg/s, at $15.76^{+0.81}_{-1.21}$ days (in the observer frame) after the trigger, with a half-max time width of 22.0 days. We compare these values with those of SN1998bw, SN2006aj, and SN2013dx. Our best-fit model favours a very low density environment ($\log_{10}({n_{\rm ISM}/{\rm cm}^{-3}}) = -2.38^{+1.45}_{-1.60}$) and small values for the jet's core angle $\theta_{\rm core} = 1.54^{+1.02}_{-0.81} \ \rm{deg}$ and viewing angle $\theta_{\rm obs} = 0.76^{+1.29}_{-0.76} \ \rm{deg}$. GRB 230812B is thus one of the best observed afterglows with a distinctive supernova bump.
Submitted 17 February, 2024; v1 submitted 22 October, 2023;
originally announced October 2023.
-
Inelastic Scattering of Dark Matter with Heavy Cosmic Rays
Authors:
Keyu Lu,
Yue-Lin Sming Tsai,
Qiang Yuan,
Le Zhang
Abstract:
We investigate the impact of inelastic collisions between dark matter (DM) and heavy cosmic ray (CR) nuclei on CR propagation. We approximate the fragmentation cross-sections for DM-CR collisions using collider-measured proton-nuclei scattering cross-sections, allowing us to assess how these collisions affect the spectra of CR Boron and Carbon. We derive new CR spectra from DM-CR collisions by incorporating their cross-sections into the source terms and solving the diffusion equation for the complete network of reactions involved in generating secondary species. In a specific example with a coupling strength of $b_\chi=0.1$ and a DM mass of $m_\chi=0.1$ GeV, considering a simplified scenario where DM interacts exclusively with Oxygen, a notable modification in the Boron-to-Carbon spectrum due to the DM-CR interaction is observed. Particularly, the peak within the spectrum, spanning from $0.1$ GeV to $10$ GeV, experiences an enhancement of approximately 1.5 times. However, in a more realistic scenario where DM particles interact with all CRs, this peak can be amplified to twice its original value. Utilizing the latest data from AMS-02 and DAMPE on the Boron-to-Carbon ratio, we estimate a 95% upper limit for the effective inelastic cross-section of DM-proton as a function of DM mass. Our findings reveal that at $m_\chi \simeq 2$ MeV, the effective inelastic cross-section between DM and protons must be less than $\mathcal{O}(10^{-32})~{\rm cm}^2$.
Submitted 7 June, 2024; v1 submitted 19 October, 2023;
originally announced October 2023.