Showing 1–50 of 824 results for author: Bai, J

  1. arXiv:2407.12023  [pdf, other]

    cs.CL cs.AI

    CMMaTH: A Chinese Multi-modal Math Skill Evaluation Benchmark for Foundation Models

    Authors: Zhong-Zhi Li, Ming-Liang Zhang, Fei Yin, Zhi-Long Ji, Jin-Feng Bai, Zhen-Ru Pan, Fan-Hu Zeng, Jian Xu, Jia-Xin Zhang, Cheng-Lin Liu

    Abstract: Due to the rapid advancements in multimodal large language models, evaluating their multimodal mathematical capabilities continues to receive wide attention. Despite the datasets like MathVista proposed benchmarks for assessing mathematical capabilities in multimodal scenarios, there is still a lack of corresponding evaluation tools and datasets for fine-grained assessment in the context of K12 ed…

    Submitted 27 June, 2024; originally announced July 2024.

  2. arXiv:2407.10671  [pdf, other]

    cs.CL cs.AI

    Qwen2 Technical Report

    Authors: An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jianxin Yang, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, et al. (37 additional authors not shown)

    Abstract: This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, a…

    Submitted 17 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: 25 pages, 1 figure

  3. arXiv:2407.10435  [pdf]

    cond-mat.mes-hall

    Nontrivial impact of interlayer coupling on thermal conductivity: opposing trends in in-plane and out-of-plane phonons

    Authors: H. F. Feng, B. Liu, J. L. Bai, X. Zhang, Z. X. Song, Zhi-Xin Guo

    Abstract: The study of heat transport in two-dimensional (2D) materials reveals novel behaviors due to quantum confinement effects, where in-plane and out-of-plane phonons play crucial roles. In 2D materials like graphene, it is widely recognized that the out-of-plane vibrational mode is the primary contributor to thermal conductivity owing to the mirror symmetry. Based on this perspective, the introduction…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 4 figures

  4. arXiv:2407.09550  [pdf]

    cs.CV cs.AI cs.LG

    CAPM: Fast and Robust Verification on Maxpool-based CNN via Dual Network

    Authors: Jia-Hau Bai, Chi-Ting Liu, Yu Wang, Fu-Chieh Chang, Pei-Yuan Wu

    Abstract: This study uses CAPM (Convex Adversarial Polytope for Maxpool-based CNN) to improve the verified bound for general purpose maxpool-based convolutional neural networks (CNNs) under bounded norm adversarial perturbations. The maxpool function is decomposed as a series of ReLU functions to extend the convex relaxation technique to maxpool functions, by which the verified bound can be efficiently comp…

    Submitted 27 June, 2024; originally announced July 2024.

  5. arXiv:2407.09021  [pdf, other]

    eess.AS

    Squeeze-and-Excite ResNet-Conformers for Sound Event Localization, Detection, and Distance Estimation for DCASE 2024 Challenge

    Authors: Jun Wei Yeow, Ee-Leng Tan, Jisheng Bai, Santi Peksi, Woon-Seng Gan

    Abstract: This technical report details our systems submitted for Task 3 of the DCASE 2024 Challenge: Audio and Audiovisual Sound Event Localization and Detection (SELD) with Source Distance Estimation (SDE). We address only the audio-only SELD with SDE (SELDDE) task in this report. We propose to improve the existing ResNet-Conformer architectures with Squeeze-and-Excitation blocks in order to introduce add…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Technical report for DCASE 2024 Challenge Task 3

  6. arXiv:2407.08183  [pdf, other]

    astro-ph.SR

    The white-light superflares from cool stars in GWAC triggers

    Authors: Guang-Wei Li, Liang Wang, Hai-Long Yuan, Li-Ping Xin, Jing Wang, Chao Wu, Hua-Li Li, Hasitieer Haerken, Wei-Hua Wang, Hong-Bo Cai, Xu-Hui Han, Yang Xu, Lei Huang, Xiao-Meng Lu, Jian-Ying Bai, Xiang-Yu Wang, Zi-Gao Dai, En-Wei Liang, Jian-Yan Wei

    Abstract: M-type stars are the ones that flare most frequently, but how big their maximum flare energy can reach is still unknown. We present 163 flares from 162 individual M2 through L1-type stars that triggered the GWAC, with flare energies ranging from $10^{32.2}$ to $10^{36.4}$ erg. The flare amplitudes range from $\triangle G = 0.84$ to $\sim 10$ mag. Flare energy increases with stellar surface temper…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 18 pages, 11 figures, 4 tables

  7. arXiv:2407.08120  [pdf, other]

    astro-ph.GA

    Spectroastrometry and Reverberation Mapping (SARM) of Active Galactic Nuclei. I. The H$β$ Broad-line Region Structure and Black Hole Mass of Five Quasars

    Authors: Yan-Rong Li, Chen Hu, Zhu-Heng Yao, Yong-Jie Chen, Hua-Rui Bai, Sen Yang, Pu Du, Feng-Na Fang, Yi-Xin Fu, Jun-Rong Liu, Yue-Chang Peng, Yu-Yang Songsheng, Yi-Lin Wang, Ming Xiao, Shuo Zhai, Hartmut Winkler, Jin-Ming Bai, Luis C. Ho, Romain G. Petrov, Jesus Aceituno, Jian-Min Wang

    Abstract: We conduct a reverberation mapping (RM) campaign to spectroscopically monitor a sample of selected bright active galactic nuclei with large anticipated broad-line region (BLR) sizes adequate for spectroastrometric observations by the GRAVITY instrument on the Very Large Telescope Interferometer. We report the first results for five objects, IC 4329A, Mrk 335, Mrk 509, Mrk 1239, and PDS 456, among…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 32 pages, 6 tables, 20 figures. To appear in ApJ

  8. arXiv:2407.06964  [pdf, other]

    cs.CV

    Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach

    Authors: Taolin Zhang, Jiawang Bai, Zhihe Lu, Dongze Lian, Genping Wang, Xinchao Wang, Shu-Tao Xia

    Abstract: Recent works on parameter-efficient transfer learning (PETL) show the potential to adapt a pre-trained Vision Transformer to downstream recognition tasks with only a few learnable parameters. However, since they usually insert new structures into the pre-trained model, entire intermediate features of that model are changed and thus need to be stored to be involved in back-propagation, resulting in…

    Submitted 14 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  9. arXiv:2407.05676  [pdf, other]

    physics.atom-ph physics.app-ph

    Continuous broadband Rydberg receiver using AC Stark shifts and Floquet States

    Authors: Danni Song, Yuechun Jiao, Jinlian Hu, Yuwen Yin, Zhenhua Li, Yunhui He, Jingxu Bai, Jianming Zhao, Suotang Jia

    Abstract: We demonstrate the continuous broadband microwave receivers based on AC Stark shifts and Floquet States of Rydberg levels in a cesium atomic vapor cell. The resonant transition frequency of two adjacent Rydberg states 78$S_{1/2}$ and 78$P_{1/2}$ is tuned based on AC Stark effect of 70 MHz radio frequency (RF) field that is applied outside the vapor cell. Meanwhile, the Rydberg states also exhibit…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 5 pages, 4 figures

  10. arXiv:2407.05414  [pdf, other]

    astro-ph.GA astro-ph.HE

    Velocity-Resolved Ionization Mapping of Broad Line Region. I. Insights into Diverse Geometry and Kinematics

    Authors: Sha-Sha Li, Hai-Cheng Feng, H. T. Liu, J. M. Bai, Xiang Ji, Cheng Cheng, Kai-Xing Lu, Jian-Guo Wang, Rui Li

    Abstract: Broad emission lines of active galactic nuclei (AGNs) originate from the broad-line region (BLR), consisting of dense gas clouds in orbit around an accreting supermassive black hole. Understanding the geometry and kinematics of the region is crucial for gaining insights into the physics and evolution of AGNs. Conventional velocity-resolved reverberation mapping may face challenges in disentangling…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: 20 pages, 10 figures, Accepted by ApJ

  11. arXiv:2407.05369  [pdf, other]

    math.CO

    A Novel Property of Generalized Fibonacci Sequence in Grids

    Authors: Zixian Yang, Jianchao Bai

    Abstract: Fibonacci sequence, generated by summing the preceding two terms, is a classical sequence renowned for its elegant properties. In this paper, leveraging properties of generalized Fibonacci sequences and formulas for consecutive sums of equidistant subsequences, we investigate the ratio of the sum of numbers along main-diagonal and sub-diagonal of odd-order grids containing generalized Fibonacci se…

    Submitted 7 July, 2024; originally announced July 2024.

  12. arXiv:2407.03654  [pdf, other]

    eess.AS

    Mixstyle based Domain Generalization for Sound Event Detection with Heterogeneous Training Data

    Authors: Yang Xiao, Han Yin, Jisheng Bai, Rohan Kumar Das

    Abstract: This work explores domain generalization (DG) for sound event detection (SED), advancing adaptability towards real-world scenarios. Our approach employs a mean-teacher framework with domain generalization to integrate heterogeneous training data, while preserving the SED model performance across the datasets. Specifically, we first apply mixstyle to the frequency dimension to adapt the mel-spectro…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Submitted to DCASE WS 2024. 5 pages. arXiv admin note: text overlap with arXiv:2407.00291

  13. arXiv:2407.00291  [pdf, other]

    eess.AS cs.SD

    FMSG-JLESS Submission for DCASE 2024 Task4 on Sound Event Detection with Heterogeneous Training Dataset and Potentially Missing Labels

    Authors: Yang Xiao, Han Yin, Jisheng Bai, Rohan Kumar Das

    Abstract: This report presents the systems developed and submitted by Fortemedia Singapore (FMSG) and Joint Laboratory of Environmental Sound Sensing (JLESS) for DCASE 2024 Task 4. The task focuses on recognizing event classes and their time boundaries, given that multiple events can be present and may overlap in an audio recording. The novelty this year is a dataset with two sources, making it challenging…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: Technical report for DCASE 2024 Challenge Task 4

  14. arXiv:2406.16855  [pdf, other]

    cs.CV

    DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation

    Authors: Yuang Peng, Yuxin Cui, Haomiao Tang, Zekun Qi, Runpei Dong, Jing Bai, Chunrui Han, Zheng Ge, Xiangyu Zhang, Shu-Tao Xia

    Abstract: Personalized image generation holds great promise in assisting humans in everyday work and life due to its impressive function in creatively generating personalized content. However, current evaluations either are automated but misalign with humans or require human evaluations that are time-consuming and expensive. In this work, we present DreamBench++, a human-aligned benchmark automated by advan…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Project page: https://dreambenchplus.github.io/

  15. arXiv:2406.11045  [pdf, other]

    cs.LG math.NA

    Kolmogorov Arnold Informed neural network: A physics-informed deep learning framework for solving PDEs based on Kolmogorov Arnold Networks

    Authors: Yizheng Wang, Jia Sun, Jinshuai Bai, Cosmin Anitescu, Mohammad Sadegh Eshaghi, Xiaoying Zhuang, Timon Rabczuk, Yinghua Liu

    Abstract: AI for partial differential equations (PDEs) has garnered significant attention, particularly with the emergence of Physics-informed neural networks (PINNs). The recent advent of Kolmogorov-Arnold Network (KAN) indicates that there is potential to revisit and enhance the previously MLP-based PINNs. Compared to MLPs, KANs offer interpretability and require fewer parameters. PDEs can be described in…

    Submitted 16 June, 2024; originally announced June 2024.

  16. arXiv:2406.10885  [pdf, other]

    cs.CL

    On the Role of Entity and Event Level Conceptualization in Generalizable Reasoning: A Survey of Tasks, Methods, Applications, and Future Directions

    Authors: Weiqi Wang, Tianqing Fang, Haochen Shi, Baixuan Xu, Wenxuan Ding, Liyu Zhang, Wei Fan, Jiaxin Bai, Haoran Li, Xin Liu, Yangqiu Song

    Abstract: Entity- and event-level conceptualization, as fundamental elements of human cognition, plays a pivotal role in generalizable reasoning. This process involves abstracting specific instances into higher-level concepts and forming abstract knowledge that can be applied in unfamiliar or novel situations, which can enhance models' inferential capabilities and support the effective transfer of knowledge…

    Submitted 16 June, 2024; originally announced June 2024.

  17. arXiv:2406.10701  [pdf, other]

    cs.CL

    MIND: Multimodal Shopping Intention Distillation from Large Vision-language Models for E-commerce Purchase Understanding

    Authors: Baixuan Xu, Weiqi Wang, Haochen Shi, Wenxuan Ding, Huihao Jing, Tianqing Fang, Jiaxin Bai, Long Chen, Yangqiu Song

    Abstract: Improving user experience and providing personalized search results in E-commerce platforms heavily rely on understanding purchase intention. However, existing methods for acquiring large-scale intentions bank on distilling large language models with human annotation for verification. Such an approach tends to generate product-centric intentions, overlook valuable visual information from product i…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 8 pages, 5 figures

  18. arXiv:2406.10173  [pdf, other]

    cs.CL

    IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce

    Authors: Wenxuan Ding, Weiqi Wang, Sze Heng Douglas Kwok, Minghao Liu, Tianqing Fang, Jiaxin Bai, Junxian He, Yangqiu Song

    Abstract: Enhancing Language Models' (LMs) ability to understand purchase intentions in E-commerce scenarios is crucial for their effective assistance in various downstream tasks. However, previous approaches that distill intentions from LMs often fail to generate meaningful and human-centric intentions applicable in real-world E-commerce contexts. This raises concerns about the true comprehension and utili…

    Submitted 14 June, 2024; originally announced June 2024.

  19. arXiv:2406.09695  [pdf, other]

    eess.SP

    Machine learning-based Near-field Emitter Localization via Grouped Hybrid Analog and Digital Massive MIMO Receive Array

    Authors: Yifan Li, Feng Shu, Jiatong Bai, Cunhua Pan, Yongpeng Wu, Yaoliang Song, Jiangzhou Wang

    Abstract: A fully-digital massive MIMO receive array is promising to meet the high-resolution requirement of near-field (NF) emitter localization, but it also results in the significantly increasing of hardware costs and algorithm complexity. In order to meet the future demand for green communication while maintaining high performance, the grouped hybrid analog and digital (HAD) structure is proposed for NF…

    Submitted 13 June, 2024; originally announced June 2024.

  20. arXiv:2406.07880  [pdf, other]

    cs.CV eess.IV

    A Comprehensive Survey on Machine Learning Driven Material Defect Detection: Challenges, Solutions, and Future Prospects

    Authors: Jun Bai, Di Wu, Tristan Shelley, Peter Schubel, David Twine, John Russell, Xuesen Zeng, Ji Zhang

    Abstract: Material defects (MD) represent a primary challenge affecting product performance and giving rise to safety issues in related products. The rapid and accurate identification and localization of MD constitute crucial research endeavours in addressing contemporary challenges associated with MD. Although conventional non-destructive testing methods such as ultrasonic and X-ray approaches have mitigat…

    Submitted 12 June, 2024; originally announced June 2024.

  21. arXiv:2406.05696  [pdf, other]

    eess.SP

    Two Power Allocation and Beamforming Strategies for Active IRS-aided Wireless Network via Machine Learning

    Authors: Qiankun Cheng, Jiatong Bai, Baihua Shi, Wei Gao, Feng Shu

    Abstract: This paper models an active intelligent reflecting surface (IRS)-assisted wireless communication network, which has the ability to adjust power between BS and IRS. We aim to maximize the signal-to-noise ratio of user by jointly designing power allocation (PA) factor, active IRS phase shift matrix, and beamforming vector of BS, subject to a total power constraint. To tackle this non-convex problem…

    Submitted 9 June, 2024; originally announced June 2024.

  22. arXiv:2406.03797  [pdf, other]

    astro-ph.GA

    Morpho-Photometric Classification of KiDS DR5 Sources Based on Neural Networks: A Comprehensive Star-Quasar-Galaxy Catalog

    Authors: Hai-Cheng Feng, Rui Li, Nicola R. Napolitano, Sha-Sha Li, J. M. Bai, Ran Li, H. T. Liu, Kai-Xing Lu, Mario Radovich, Huan-Yuan Shan, Jian-Guo Wang, Wen-Zhe Xi, Ling-Hua Xie, Yang-Wei Zhang

    Abstract: We present a novel multimodal neural network for classifying astronomical sources in multiband ground-based observations, from optical to near infrared, to separate sources in stars, galaxies and quasars. Our approach combines a convolutional neural network branch for learning morphological features from $r$-band images with an artificial neural network branch for extracting spectral energy distri…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 18 pages, 12 figures, 2 tables, Submitted to ApJS

  23. arXiv:2406.03127  [pdf, other]

    cs.CL

    Towards Real-world Scenario: Imbalanced New Intent Discovery

    Authors: Shun Zhang, Chaoran Yan, Jian Yang, Jiaheng Liu, Ying Mo, Jiaqi Bai, Tongliang Li, Zhoujun Li

    Abstract: New Intent Discovery (NID) aims at detecting known and previously undefined categories of user intent by utilizing limited labeled and massive unlabeled data. Most prior works often operate under the unrealistic assumption that the distribution of both familiar and new intent classes is uniform, overlooking the skewed and long-tailed distributions frequently encountered in real-world scenarios. To…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: ACL 2024

  24. arXiv:2406.02993  [pdf, other]

    physics.optics

    Dual-color Q-switched mode-locking in an Erbium-doped fiber laser

    Authors: Chenyue Lv, Baole Lu, Jintao Bai

    Abstract: Q-switched mode-locking (QML) has been widely observed in various lasers, but its generation mechanism in passive mode-locking remains unclear. In this paper, we build up a dual-color QML Erbium-doped fiber laser and find a bound-state-like envelope on the optical spectrum for the first time. Theoretically, the formation mechanism of QML is numerically investigated using the coupled Ginzburg-Landa…

    Submitted 5 June, 2024; originally announced June 2024.

  25. arXiv:2405.19732  [pdf, other]

    cs.CV cs.CL cs.LG

    Two Optimizers Are Better Than One: LLM Catalyst Empowers Gradient-Based Optimization for Prompt Tuning

    Authors: Zixian Guo, Ming Liu, Zhilong Ji, Jinfeng Bai, Yiwen Guo, Wangmeng Zuo

    Abstract: Learning a skill generally relies on both practical experience by doer and insightful high-level guidance by instructor. Will this strategy also work well for solving complex non-convex optimization problems? Here, a common gradient-based optimizer acts like a disciplined doer, making locally optimal update at each step. Recent methods utilize large language models (LLMs) to optimize solutions for…

    Submitted 6 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

  26. arXiv:2405.15758  [pdf, other]

    cs.CV cs.AI

    InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation

    Authors: Yuchi Wang, Junliang Guo, Jianhong Bai, Runyi Yu, Tianyu He, Xu Tan, Xu Sun, Jiang Bian

    Abstract: Recent talking avatar generation models have made strides in achieving realistic and accurate lip synchronization with the audio, but often fall short in controlling and conveying detailed expressions and emotions of the avatar, making the generated video less vivid and controllable. In this paper, we propose a novel text-guided approach for generating emotionally expressive 2D avatars, offering f…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Project page: https://wangyuchi369.github.io/InstructAvatar/

  27. arXiv:2405.12072  [pdf, other]

    cond-mat.mtrl-sci

    Real topological phonons in 3D carbon allotropes

    Authors: Xiaotian Wang, Jingbo Bai, Jianhua Wang, Zhenxiang Cheng, Shifeng Qian, Wenhong Wang, Gang Zhang, Zhi-Ming Yu, Yugui Yao

    Abstract: There has been a significant focus on real topological systems that enjoy space-time inversion symmetry (PT) and lack spin-orbit coupling. While the theoretical classification of the real topology has been established, more progress has yet to be made in the materials realization of such real topological systems in three dimensions (3D). To address this crucial issue, by selecting the carbon-base…

    Submitted 20 May, 2024; originally announced May 2024.

  28. arXiv:2405.10612  [pdf, other]

    cs.CV cs.CR cs.LG

    Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers

    Authors: Sheng Yang, Jiawang Bai, Kuofeng Gao, Yong Yang, Yiming Li, Shu-tao Xia

    Abstract: Given the power of vision transformers, a new learning paradigm, pre-training and then prompting, makes it more efficient and effective to address downstream visual recognition tasks. In this paper, we identify a novel security threat towards such a paradigm from the perspective of backdoor attacks. Specifically, an extra prompt token, called the switch token in this work, can turn the backdoor mo…

    Submitted 17 May, 2024; originally announced May 2024.

  29. arXiv:2405.09981  [pdf, other]

    cs.CV

    Adversarial Robustness for Visual Grounding of Multimodal Large Language Models

    Authors: Kuofeng Gao, Yang Bai, Jiawang Bai, Yong Yang, Shu-Tao Xia

    Abstract: Multi-modal Large Language Models (MLLMs) have recently achieved enhanced performance across various vision-language tasks including visual grounding capabilities. However, the adversarial robustness of visual grounding remains unexplored in MLLMs. To fill this gap, we use referring expression comprehension (REC) as an example task in visual grounding and propose three adversarial attack paradigms…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: ICLR 2024 Workshop on Reliable and Responsible Foundation Models

  30. arXiv:2405.09556  [pdf, other]

    eess.SP cs.AI cs.IT

    Co-learning-aided Multi-modal-deep-learning Framework of Passive DOA Estimators for a Heterogeneous Hybrid Massive MIMO Receiver

    Authors: Jiatong Bai, Feng Shu, Qinghe Zheng, Bo Xu, Baihua Shi, Yiwen Chen, Weibin Zhang, Xianpeng Wang

    Abstract: Due to its excellent performance in rate and resolution, fully-digital (FD) massive multiple-input multiple-output (MIMO) antenna arrays has been widely applied in data transmission and direction of arrival (DOA) measurements, etc. But it confronts with two main challenges: high computational complexity and circuit cost. The two problems may be addressed well by hybrid analog-digital (HAD) structu…

    Submitted 12 June, 2024; v1 submitted 27 April, 2024; originally announced May 2024.

  31. Robust Covariance-Based Activity Detection for Massive Access

    Authors: Jianan Bai, Erik G. Larsson

    Abstract: The wireless channel is undergoing continuous changes, and the block-fading assumption, despite its popularity in theoretical contexts, never holds true in practical scenarios. This discrepancy is particularly critical for user activity detection in grant-free random access, where joint processing across multiple resource blocks is usually undesirable. In this paper, we propose employing a low-dim…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 5 pages, 11 figures. Asilomar SSC 2023 Conference

  32. arXiv:2405.07551  [pdf, other]

    cs.CL cs.AI

    MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning

    Authors: Shuo Yin, Weihao You, Zhilong Ji, Guoqiang Zhong, Jinfeng Bai

    Abstract: The tool-use Large Language Models (LLMs) that integrate with external Python interpreters have significantly enhanced mathematical reasoning capabilities for open-source LLMs, while tool-free methods chose another track: augmenting math reasoning data. However, a great method to integrate the above two research paths and combine their advantages remains to be explored. In this work, we firstly in…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: The state-of-the-art open-source tool-use LLMs for mathematical reasoning

  33. arXiv:2405.07518  [pdf, other]

    cs.AR cs.AI

    SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts

    Authors: Raghu Prabhakar, Ram Sivaramakrishnan, Darshan Gandhi, Yun Du, Mingran Wang, Xiangyu Song, Kejie Zhang, Tianren Gao, Angela Wang, Karen Li, Yongning Sheng, Joshua Brot, Denis Sokolov, Apurv Vivek, Calvin Leung, Arjun Sabnis, Jiayu Bai, Tuowen Zhao, Mark Gottscho, David Jackson, Mark Luttrell, Manish K. Shah, Edison Chen, Kaizhao Liang, Swayambhoo Jain, et al. (5 additional authors not shown)

    Abstract: Monolithic large language models (LLMs) like GPT-4 have paved the way for modern generative AI applications. Training, serving, and maintaining monolithic LLMs at scale, however, remains prohibitively expensive and challenging. The disproportionate increase in compute-to-memory ratio of modern AI accelerators have created a memory wall, necessitating new methods to deploy AI. Composition of Expert…

    Submitted 13 May, 2024; originally announced May 2024.

  34. arXiv:2405.07497  [pdf, other]

    cs.LG

    Towards Subgraph Isomorphism Counting with Graph Kernels

    Authors: Xin Liu, Weiqi Wang, Jiaxin Bai, Yangqiu Song

    Abstract: Subgraph isomorphism counting is known as #P-complete and requires exponential time to find the accurate solution. Utilizing representation learning has been shown as a promising direction to represent substructures and approximate the solution. Graph kernels that implicitly capture the correlations among substructures in diverse graphs have exhibited great discriminative power in graph classifica…

    Submitted 13 May, 2024; originally announced May 2024.

  35. arXiv:2405.05840  [pdf, other]

    astro-ph.CO astro-ph.IM

    FREmu: Power Spectrum Emulator for $f(R)$ Gravity

    Authors: Jiachen Bai, Junqing Xia

    Abstract: To investigate gravity in the non-linear regime of cosmic structure using measurements from Stage-IV surveys, it is imperative to accurately compute large-scale structure observables, such as non-linear matter power spectra, for gravity models that extend beyond general relativity. However, the theoretical predictions of non-linear observables are typically derived from N-body simulations, which d…

    Submitted 5 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

    Comments: 12 pages, 5 figures, 1 table, accepted by The Astrophysical Journal (ApJ)

  36. arXiv:2405.05806  [pdf, other]

    cs.CV

    MasterWeaver: Taming Editability and Identity for Personalized Text-to-Image Generation

    Authors: Yuxiang Wei, Zhilong Ji, Jinfeng Bai, Hongzhi Zhang, Lei Zhang, Wangmeng Zuo

    Abstract: Text-to-image (T2I) diffusion models have shown significant success in personalized text-to-image generation, which aims to generate novel images with human identities indicated by the reference images. Despite promising identity fidelity has been achieved by several tuning-free methods, they usually suffer from overfitting issues. The learned identity tends to entangle with irrelevant information…

    Submitted 10 May, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

    Comments: 34 pages

  37. Optimizing E-commerce Search: Toward a Generalizable and Rank-Consistent Pre-Ranking Model

    Authors: Enqiang Xu, Yiming Qiu, Junyang Bai, Ping Zhang, Dadong Miao, Songlin Wang, Guoyu Tang, Lin Liu, Mingming Li

    Abstract: In large e-commerce platforms, search systems are typically composed of a series of modules, including recall, pre-ranking, and ranking phases. The pre-ranking phase, serving as a lightweight module, is crucial for filtering out the bulk of products in advance for the downstream ranking module. Industrial efforts on optimizing the pre-ranking model have predominantly focused on enhancing ranking c…

    Submitted 13 May, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

    ACM Class: H.3.3

  38. Array SAR 3D Sparse Imaging Based on Regularization by Denoising Under Few Observed Data

    Authors: Yangyang Wang, Xu Zhan, Jing Gao, Jinjie Yao, Shunjun Wei, Jiansheng Bai

    Abstract: Array synthetic aperture radar (SAR) three-dimensional (3D) imaging can obtain 3D information of the target region, which is widely used in environmental monitoring and scattering information measurement. In recent years, with the development of compressed sensing (CS) theory, sparse signal processing is used in array SAR 3D imaging. Compared with matched filter (MF), sparse SAR imaging can effect…

    Submitted 26 May, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

  39. arXiv:2405.03349  [pdf, other]

    cs.CV

    Retinexmamba: Retinex-based Mamba for Low-light Image Enhancement

    Authors: Jiesong Bai, Yuhao Yin, Qiyuan He, Yuanxian Li, Xiaofeng Zhang

    Abstract: In the field of low-light image enhancement, both traditional Retinex methods and advanced deep learning techniques such as Retinexformer have shown distinct advantages and limitations. Traditional Retinex methods, designed to mimic the human eye's perception of brightness and color, decompose images into illumination and reflection components but struggle with noise management and detail preserva…

    Submitted 19 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  40. arXiv:2405.02942  [pdf, other]

    physics.optics cs.CV cs.RO eess.IV

    Design, analysis, and manufacturing of a glass-plastic hybrid minimalist aspheric panoramic annular lens

    Authors: Shaohua Gao, Qi Jiang, Yiqi Liao, Yi Qiu, Wanglei Ying, Kailun Yang, Kaiwei Wang, Benhao Zhang, Jian Bai

    Abstract: We propose a high-performance glass-plastic hybrid minimalist aspheric panoramic annular lens (ASPAL) to solve several major limitations of the traditional panoramic annular lens (PAL), such as large size, high weight, and complex system. The field of view (FoV) of the ASPAL is 360°x(35°~110°) and the imaging quality is close to the diffraction limit. This large FoV ASPAL is composed of only 4 len…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: Accepted to Optics & Laser Technology

  41. arXiv:2405.01074  [pdf, other]

    cs.IT eess.SY

    Stability Analysis of Interacting Wireless Repeaters

    Authors: Erik G. Larsson, Jianan Bai

    Abstract: We consider a wireless network with multiple single-antenna repeaters that amplify and instantaneously re-transmit the signals they receive to improve the channel rank and system coverage. Due to the positive feedback formed by inter-repeater interference, stability could become a critical issue. We investigate the problem of determining the maximum amplification gain that the repeaters can use wi…

    Submitted 7 July, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted to SPAWC 2024. 5 pages, 7 figures

  42. arXiv:2404.18700  [pdf, other]

    physics.app-ph

    Real-fluid Transport Property Computations Based on the Boltzmann-weighted Full-dimensional Potential Model

    Authors: Xin Zhang, Junfeng Bai, Bowen Liu, Tong Zhu, Hao Zhao

    Abstract: The intermolecular potential plays crucial roles in real-fluid interactions away from the ideal-gas equilibrium, such as supercritical fluid, high-enthalpy fluid, plasma interactions, etc. We propose a Boltzmann-weighted Full-dimensional (BWF) potential model for real-fluid computations. It includes diverse intermolecular interactions so as to determine the potential well, molecular diameter, dipo…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 18 pages, 10 figures

    MSC Class: 82 (Primary) ACM Class: J.2

  43. arXiv:2404.18356  [pdf, other]

    cs.DC

    FEDQ-Trust: Efficient Data-Driven Trust Prediction for Mobile Edge-Based IoT Systems

    Authors: Jiahui Bai, Hai Dong, Athman Bouguettaya

    Abstract: We introduce FEDQ-Trust, an innovative data-driven trust prediction approach designed for mobile edge-based Internet of Things (IoT) environments. The decentralized nature of mobile edge environments introduces challenges due to variations in data distribution, impacting the accuracy and training efficiency of existing distributed data-driven trust prediction models. FEDQ-Trust effectively tackles…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 14 pages, 6 figures, submitted to IEEE Transactions on Services Computing

  44. arXiv:2404.16655  [pdf]

    physics.chem-ph physics.optics

    Rational Designing of Anthocyanidins-Directed Near-Infrared Two-Photon Fluorescence Probes

    Authors: Xiu-e Zhang, Xue Wei, Wei-Bo Cui, Jin-Pu Bai, Aynur Matyusup, Jing-Fu Guo, Hui Li, Ai-Min Ren

    Abstract: Recently, two-photon fluorescent probes based on anthocyanidins molecules have attracted extensive attention due to their outstanding photophysical properties. However, there are only a few two-photon excited fluorescent probes that really meet the requirements of relatively long emission wavelengths (>600 nm), large two-photon absorption (TPA) cross sections (300 GM), significant Stokes shift (>8…

    Submitted 25 April, 2024; originally announced April 2024.

  45. arXiv:2404.10763  [pdf, other]

    cs.AI cs.CL cs.CV

    LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation?

    Authors: Yuchi Wang, Shuhuai Ren, Rundong Gao, Linli Yao, Qingyan Guo, Kaikai An, Jianhong Bai, Xu Sun

    Abstract: Diffusion models have exhibited remarkable capabilities in text-to-image generation. However, their performance in image-to-text generation, specifically image captioning, has lagged behind Auto-Regressive (AR) models, casting doubt on their applicability for such tasks. In this work, we revisit diffusion models, highlighting their capacity for holistic context modeling and parallel decoding. With…

    Submitted 16 April, 2024; originally announced April 2024.

  46. arXiv:2404.08998  [pdf, other]

    physics.optics

    Dual-comb mode-locked Yb:CALGO laser based on cavity-shared configuration with separated end mirrors

    Authors: Ruixin Tang, Ziyu Luo, Pengfei Li, Pengrun Ying, Haiyang Xie, Siyuan Xu, Hui Liu, Jintao Bai

    Abstract: Dual-comb spectroscopy typically requires the utilization of two independent and phase-locked femtosecond lasers, resulting in a complex and expensive system that hinders its industrial applications. Single-cavity dual-comb lasers are considered one of the primary solutions to simplify the system. However, controlling the crucial parameter of difference in repetition rates remains challenging. I…

    Submitted 13 April, 2024; originally announced April 2024.

  47. arXiv:2404.08977  [pdf, other]

    cs.CL cs.LG

    RoNID: New Intent Discovery with Generated-Reliable Labels and Cluster-friendly Representations

    Authors: Shun Zhang, Chaoran Yan, Jian Yang, Changyu Ren, Jiaqi Bai, Tongliang Li, Zhoujun Li

    Abstract: New Intent Discovery (NID) strives to identify known and reasonably deduce novel intent groups in the open-world scenario. But current methods face issues with inaccurate pseudo-labels and poor representation learning, creating a negative feedback loop that degrades overall model performance, including accuracy and the adjusted rand index. To address the aforementioned challenges, we propose a Rob…

    Submitted 18 April, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

    Comments: DASFAA 2024

  48. arXiv:2404.07943  [pdf, other]

    cs.CE cs.LG

    HomoGenius: a Foundation Model of Homogenization for Rapid Prediction of Effective Mechanical Properties using Neural Operators

    Authors: Yizheng Wang, Xiang Li, Ziming Yan, Yuqing Du, Jinshuai Bai, Bokai Liu, Timon Rabczuk, Yinghua Liu

    Abstract: Homogenization is an essential tool for studying multiscale physical phenomena. However, traditional numerical homogenization, heavily reliant on finite element analysis, requires extensive computation costs, particularly in handling complex geometries, materials, and high-resolution problems. To address these limitations, we propose a numerical homogenization model based on operator learning: Hom…

    Submitted 18 March, 2024; originally announced April 2024.

  49. arXiv:2404.07343  [pdf, other]

    astro-ph.GA

    Monitoring AGNs with H$β$ Asymmetry. IV. First Reverberation Mapping Results of 14 AGNs

    Authors: T. E. Zastrocky, Michael S. Brotherton, Pu Du, Jacob N. McLane, Kianna A. Olson, D. A. Dale, H. A. Kobulnicky, Jaya Maithil, My L. Nguyen, William T. Chick, David H. Kasper, Derek Hand, C. Adelman, Z. Carter, G. Murphree, M. Oeur, T. Roth, S. Schonsberg, M. J. Caradonna, J. Favro, A. J. Ferguson, I. M. Gonzalez, L. M. Hadding, H. D. Hagler, C. J. Rogers, et al. (19 additional authors not shown)

    Abstract: We report first-time reverberation mapping results for 14 AGNs from the ongoing Monitoring AGNs with H$β$ Asymmetry campaign (MAHA). These results utilize optical spectra obtained with the Long Slit Spectrograph on the Wyoming Infrared 2.3m Telescope between 2017 November-2023 May. MAHA combines long-duration monitoring with high cadence. We report results from multiple observing seasons for 9 of…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 35 pages, 19 figures, accepted for publication in ApJ Supplement

  50. arXiv:2404.07246  [pdf, other]

    astro-ph.IM astro-ph.CO

    Prospects of the multi-channel photometric survey telescope in the cosmological application of Type Ia supernovae

    Authors: Zhenyu Wang, Ju-Jia Zhang, Xinzhong Er, Jinming Bai

    Abstract: The Multi-channel Photometric Survey Telescope (Mephisto) is a real-time, three-color photometric system designed to capture the color evolution of stars and transients accurately. This telescope system can be crucial in cosmological distance measurements of low-redshift (low-$z$, $z$ $\lesssim 0.1$) Type Ia supernovae (SNe Ia). To optimize the capabilities of this instrument, we perform a compreh…

    Submitted 17 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: 15 pages, 7 figures