doi 10.1021/acs.nanolett.4c01493

Interface suppressed nematicity and enhanced superconductivity of FeSe/NdFeO3 in the low doping regime

Authors: Chihao Li, Yuanhe Song, Xiaoxiao Wang, Minyinan Lei, Xiaoyang Chen, Haichao Xu, Rui Peng, Donglai Feng

Abstract: The discovery of interface-enhanced superconductivity in single-layer FeSe/oxides has generated intensive research interest. Beyond the family of FeSe interfaced with various TiO$_2$-terminated oxides, a high pairing temperature of up to 80~K has recently been observed in FeSe interfaced with FeO$_x$-terminated LaFeO$_3$. Here we successfully extend the FeSe/FeO$_x$ superconducting interface to FeSe/NdFeO$_3$ by constructing 1uc-FeSe/6uc-NdFeO$_3$/Nb:SrTiO$_3$ heterostructures. Intriguingly, well-annealed FeSe/NdFeO$_3$ exhibits a low doping level of 0.038$\sim$0.046 e$^-$/Fe, which deviates from the universal magic doping level (0.10$\sim$0.12 e$^-$/Fe) and provides a new playground for studying the FeSe/oxide interface in the low electron-doped regime. Compared with thick FeSe films at a comparable electron doping level induced by surface potassium dosing, FeSe/NdFeO$_3$ shows a larger superconducting gap and the absence of a nematic gap, indicating an enhancement of the superconductivity and suppression of nematicity by the FeSe/FeO$_x$ interface. These results not only expand the FeSe/FeO$_x$ superconducting family but also enrich the current understanding of the roles of the oxide interface.

Submitted 16 July, 2024; originally announced July 2024.

Comments: 8 pages, 4 figures

Journal ref: Nano Lett. 24, 27, 8303-8310 (2024)

arXiv:2407.11273 [pdf]

Informational Size in School Choice

Authors: Di Feng, Yun Liu

Abstract: This paper introduces a novel measurement of informational size to school choice problems, which inherits its ideas from Mount and Reiter (1974). This concept measures a matching mechanism's information size by counting the maximal relevant preference and priority rankings to secure a certain pairwise assignment of a student to a school across all possible matching problems. Our analysis uncovers two key insights. First, the three prominent strategy-proof matching mechanisms, the deferred acceptance (DA) mechanism, the top trading cycles (TTC) mechanism, and the serial dictatorship (SD) mechanism, are (strictly) less informative than the non-strategy-proof immediate acceptance (IA) mechanism. This result highlights a previously omitted advantage of IA in terms of its information demand, which partially explains its popularity in real-world matching problems, especially when acquiring information is both pecuniarily and cognitively costly. Second, when the matching problem contains at least four students, the TTC demands less information compared to the DA to implement a desired allocation. The issue of comparison between TTC and DA has puzzled researchers both in theory (Gonczarowski and Thomas, 2023) and in experiment (Hakimov and Kubler, 2021). Our result responds to this issue from an informational perspective: in experiments with relatively fewer students, agents tend to prefer DA over TTC as DA requires less information to secure one's allocation in all problems (Guillen and Veszteg, 2021), while the opposite is true when the market size increases (Pais et al., 2011). Among others, our informational size concept offers a new perspective to understand the differences in auditability (Grigoryan and Moller, 2024), manipulation vulnerability (Pathak and Sonmez, 2013), and privacy protection (Haupt and Hitzig, 2022), among some commonly used matching mechanisms.

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.11018 [pdf, other]

Online Multi-Task Offloading for Semantic-Aware Edge Computing Systems

Authors: Xuyang Chen, Qu Luo, Gaojie Chen, Daquan Feng, Yao Sun

Abstract: Mobile edge computing (MEC) provides low-latency offloading solutions for computationally intensive tasks, effectively improving the computing efficiency and battery life of mobile devices. However, for data-intensive tasks or scenarios with limited uplink bandwidth, network congestion might occur due to massive simultaneous offloading nodes, increasing transmission latency and affecting task performance. In this paper, we propose a semantic-aware multi-modal task offloading framework to address the challenges posed by limited uplink bandwidth. By introducing a semantic extraction factor, we balance the relationship among transmission latency, computation energy consumption, and task performance. To measure the offloading performance of multi-modal tasks, we design a unified and fair quality of experience (QoE) metric that includes execution latency, energy consumption, and task performance. Lastly, we formulate the optimization problem as a Markov decision process (MDP) and exploit the multi-agent proximal policy optimization (MAPPO) reinforcement learning algorithm to jointly optimize the semantic extraction factor, communication resources, and computing resources to maximize overall QoE. Experimental results show that the proposed method achieves a reduction in execution latency and energy consumption of 18.1% and 12.9%, respectively, compared with the semantic-unaware approach. Moreover, the proposed approach can be easily extended to models with different user preferences.

Submitted 28 June, 2024; originally announced July 2024.

arXiv:2407.09880 [pdf, other]

doi 10.1021/acs.nanolett.4c01612

Inferior interfacial superconductivity in 1 UC FeSe/SrVO$_3$/SrTiO$_3$ with screened interfacial electron-phonon coupling

Authors: Nan Guo, Xiaoyang Chen, Tianlun Yu, Yu Fan, Qinghua Zhang, Minyinan Lei, Xiaofeng Xu, Xuetao Zhu, Jiandong Guo, Lin Gu, Haichao Xu, Rui Peng, Donglai Feng

Abstract: Monolayer FeSe/TiO$_x$ and FeSe/FeO$_x$ interfaces exhibit significant superconductivity enhancement compared to bulk FeSe, with interfacial electron-phonon coupling (EPC) playing a crucial role. However, the reduced dimensionality in monolayer FeSe, which may drive superconducting fluctuations, complicates the understanding of the enhancement mechanisms. Here we construct a new superconducting interface: monolayer FeSe/SrVO$_3$/SrTiO$_3$, in which the itinerant electrons of highly metallic SrVO$_3$ films can screen all the high-energy Fuchs-Kliewer phonons, including those of SrTiO$_3$, making it the first FeSe/oxide system with screened interfacial EPC while maintaining the monolayer FeSe thickness. Despite comparable doping levels, the heavily electron-doped monolayer FeSe/SrVO$_3$ exhibits a lower pairing temperature ($T_\mathrm{g}$ $\sim$ 48 K) than FeSe/SrTiO$_3$ and FeSe/LaFeO$_3$. Our findings disentangle the contributions of interfacial EPC from dimensionality on enhancing $T_\mathrm{g}$ in FeSe/oxide interfaces, underscoring the importance of interfacial EPC in $T_\mathrm{g}$ enhancement. This FeSe/VO$_x$ interface also provides a platform for studying the interfacial superconductivity.

Submitted 13 July, 2024; originally announced July 2024.

Comments: Published in Nano Letters, 11 pages, 4 figures, 1 table

arXiv:2407.00588 [pdf, other]

Forward and backward problems for coupled subdiffusion systems

Authors: Dian Feng, Yikan Liu, Shuai Lu

Abstract: In this article, we investigate both forward and backward problems for coupled systems of time-fractional diffusion equations, encompassing scenarios of strong coupling. For the forward problem, we establish the well-posedness of the system, leveraging the eigensystem of the corresponding elliptic system as the foundation. When considering the backward problem, specifically the determination of initial values through final time observations, we demonstrate a Lipschitz stability estimate, which is consistent with the stability observed in the case of a single equation. To numerically address this backward problem, we refer to the explicit formulation of Tikhonov regularization to devise a multi-channel neural network architecture. This innovative architecture offers a versatile approach, exhibiting its efficacy in multidimensional settings through numerical examples and its robustness in handling initial values that have not been trained.

Submitted 30 June, 2024; originally announced July 2024.

Comments: 26 pages, 7 figures

MSC Class: 35R11; 35K58; 35B44

arXiv:2406.16713 [pdf, other]

ShanghaiTech Mapping Robot is All You Need: Robot System for Collecting Universal Ground Vehicle Datasets

Authors: Bowen Xu, Xiting Zhao, Delin Feng, Yuanyuan Yang, Sören Schwertfeger

Abstract: This paper presents the ShanghaiTech Mapping Robot, a state-of-the-art unmanned ground vehicle (UGV) designed for collecting comprehensive multi-sensor datasets to support research in robotics, computer vision, and autonomous driving. The robot is equipped with a wide array of sensors including RGB cameras, RGB-D cameras, event-based cameras, IR cameras, LiDARs, mmWave radars, IMUs, ultrasonic range finders, and a GNSS RTK receiver. The sensor suite is integrated onto a specially designed mechanical structure with a centralized power system and a synchronization mechanism to ensure spatial and temporal alignment of the sensor data. A 16-node on-board computing cluster handles sensor control, data collection, and storage. We describe the hardware and software architecture of the robot in detail and discuss the calibration procedures for the various sensors. The capabilities of the platform are demonstrated through an extensive dataset collected in diverse real-world environments. To facilitate research, we make the dataset publicly available along with the associated robot sensor calibration data. Performance evaluations on a set of standard perception and localization tasks showcase the potential of the dataset to support developments in Robot Autonomy.

Submitted 24 June, 2024; originally announced June 2024.

Comments: Incomplete draft

arXiv:2406.09356 [pdf, other]

CMC-Bench: Towards a New Paradigm of Visual Signal Compression

Authors: Chunyi Li, Xiele Wu, Haoning Wu, Donghui Feng, Zicheng Zhang, Guo Lu, Xiongkuo Min, Xiaohong Liu, Guangtao Zhai, Weisi Lin

Abstract: Ultra-low bitrate image compression is a challenging and demanding topic. With the development of Large Multimodal Models (LMMs), a Cross Modality Compression (CMC) paradigm of Image-Text-Image has emerged. Compared with traditional codecs, this semantic-level compression can reduce image data size to 0.1\% or even lower, which has strong potential applications. However, CMC has certain defects in consistency with the original image and perceptual quality. To address this problem, we introduce CMC-Bench, a benchmark of the cooperative performance of Image-to-Text (I2T) and Text-to-Image (T2I) models for image compression. This benchmark covers 18,000 and 40,000 images respectively to verify 6 mainstream I2T and 12 T2I models, including 160,000 subjective preference scores annotated by human experts. At ultra-low bitrates, this paper proves that the combination of some I2T and T2I models has surpassed the most advanced visual signal codecs; meanwhile, it highlights where LMMs can be further optimized toward the compression task. We encourage LMM developers to participate in this test to promote the evolution of visual signal codec protocols.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.08124 [pdf, other]

Legend: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets

Authors: Duanyu Feng, Bowen Qin, Chen Huang, Youcheng Huang, Zheng Zhang, Wenqiang Lei

Abstract: The success of the reward model in distinguishing between responses with subtle safety differences depends critically on the high-quality preference dataset, which should capture the fine-grained nuances of harmful and harmless responses. This motivates the need to develop a dataset involving preference margins, which accurately quantify how harmless one response is compared to another. In this paper, we take the first step to propose an effective and cost-efficient framework to promote the margin-enhanced preference dataset development. Our framework, Legend, Leverages representation engineering to annotate preference datasets. It constructs the specific direction within the LLM's embedding space that represents safety. By leveraging this safety direction, Legend can then leverage the semantic distances of paired responses along this direction to annotate margins automatically. We experimentally demonstrate our effectiveness in both reward modeling and harmless alignment for LLMs. Legend also stands out for its efficiency, requiring only the inference time rather than additional training. This efficiency allows for easier implementation and scalability, making Legend particularly valuable for practical applications in aligning LLMs with safe conversations.

Submitted 12 June, 2024; originally announced June 2024.

Comments: Our code is available at https://github.com/colfeng/Legend

arXiv:2406.01931 [pdf, other]

Dishonesty in Helpful and Harmless Alignment

Authors: Youcheng Huang, Jingkun Tang, Duanyu Feng, Zheng Zhang, Wenqiang Lei, Jiancheng Lv, Anthony G. Cohn

Abstract: People tell lies when seeking rewards. Large language models (LLMs) are aligned to human values with reinforcement learning where they get rewards if they satisfy human preference. We find that this also induces dishonesty in helpful and harmless alignment where LLMs tell lies in generating harmless responses. Using the latest interpreting tools, we detect dishonesty, show how LLMs can be harmful if their honesty is increased, and analyze such conflicts at the parameter level. Given these preliminaries and the hypothesis that reward-seeking stimulates dishonesty, we theoretically show that the dishonesty can in turn decrease the alignment performances and augment reward-seeking alignment with representation regularization. Extensive results, including GPT-4 annotated win-rates, perplexities, and case studies, demonstrate that we can train more honest, helpful, and harmless LLMs. We will open-source all our code and results upon this paper's acceptance.

Submitted 5 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.00123 [pdf]

Correlation-aware Coarse-to-fine MLPs for Deformable Medical Image Registration

Authors: Mingyuan Meng, Dagan Feng, Lei Bi, Jinman Kim

Abstract: Deformable image registration is a fundamental step for medical image analysis. Recently, transformers have been used for registration and outperformed Convolutional Neural Networks (CNNs). Transformers can capture long-range dependence among image features, which has been shown to be beneficial for registration. However, due to the high computation/memory loads of self-attention, transformers are typically used at downsampled feature resolutions and cannot capture fine-grained long-range dependence at the full image resolution. This limits deformable registration as it necessitates precise dense correspondence between each image pixel. Multi-layer Perceptrons (MLPs) without self-attention are efficient in computation/memory usage, enabling the feasibility of capturing fine-grained long-range dependence at full resolution. Nevertheless, MLPs have not been extensively explored for image registration and lack the consideration of inductive bias crucial for medical registration tasks. In this study, we propose the first correlation-aware MLP-based registration network (CorrMLP) for deformable medical image registration. Our CorrMLP introduces a correlation-aware multi-window MLP block in a novel coarse-to-fine registration architecture, which captures fine-grained multi-range dependence to perform correlation-aware coarse-to-fine registration. Extensive experiments with seven public medical datasets show that our CorrMLP outperforms state-of-the-art deformable registration methods.

Submitted 12 June, 2024; v1 submitted 31 May, 2024; originally announced June 2024.

Comments: Accepted at CVPR2024 as Oral Presentation && Best Paper Candidate

Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 9645-9654

arXiv:2405.16462 [pdf, ps, other]

Global and local existence of solutions for nonlinear systems of time-fractional diffusion equations

Authors: Dian Feng, Masahiro Yamamoto

Abstract: In this paper, we consider initial-boundary value problems for two-component nonlinear systems of time-fractional diffusion equations with the homogeneous Neumann boundary condition and non-negative initial values. The main results are the existence of solutions global in time and the blow-up. Our approach involves the truncation of the nonlinear terms, which enables us to handle all locally Lipschitz continuous nonlinear terms, provided their sum is less than or equal to zero. By employing a comparison principle for the corresponding linear system, we also establish the non-negativity of the nonlinear system.

Submitted 26 May, 2024; originally announced May 2024.

Comments: 27 pages

arXiv:2405.14103 [pdf, other]

Online Self-Preferring Language Models

Authors: Yuanzhao Zhai, Zhuo Zhang, Kele Xu, Hanyang Peng, Yue Yu, Dawei Feng, Cheng Yang, Bo Ding, Huaimin Wang

Abstract: Aligning with human preference datasets has been critical to the success of large language models (LLMs). Reinforcement learning from human feedback (RLHF) employs a costly reward model to provide feedback for on-policy sampling responses. Recently, offline methods that directly fit responses with binary preferences in the dataset have emerged as alternatives. However, existing methods do not explicitly model preference strength information, which is crucial for distinguishing different response pairs. To overcome this limitation, we propose Online Self-Preferring (OSP) language models to learn from self-generated response pairs and self-judged preference strengths. For each prompt and corresponding self-generated responses, we introduce a ranked pairing method to construct multiple response pairs with preference strength information. We then propose the soft-preference cross-entropy loss to leverage such information. Empirically, we demonstrate that leveraging preference strength is crucial for avoiding overfitting and enhancing alignment performance. OSP achieves state-of-the-art alignment performance across various metrics in two widely used human preference datasets. OSP is parameter-efficient and more robust than the dominant online method, RLHF, when limited offline data are available and when generalizing to out-of-domain tasks. Moreover, OSP language models established by LLMs with proficiency in self-preferring can efficiently self-improve without external supervision.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 20 pages, 9 figures

arXiv:2405.12687 [pdf, other]

Large band-splitting in $g$-wave type altermagnet CrSb

Authors: Jianyang Ding, Zhicheng Jiang, Xiuhua Chen, Zicheng Tao, Zhengtai Liu, Jishan Liu, Tongrui Li, Jiayu Liu, Yichen Yang, Runfeng Zhang, Liwei Deng, Wenchuan Jing, Yu Huang, Yuming Shi, Shan Qiao, Yilin Wang, Yanfeng Guo, Donglai Feng, Dawei Shen

Abstract: Altermagnetism (AM), a newly discovered magnetic state, ingeniously integrates the properties of ferromagnetism and antiferromagnetism, representing a significant breakthrough in the field of magnetic materials. Despite experimental verification of some typical AM materials, such as MnTe and MnTe$_2$, the pursuit of AM materials that feature larger spin splitting and higher transition temperature is still essential. Here, our research focuses on CrSb, which possesses a Néel temperature of up to 700 K and giant spin splitting near the Fermi level ($E_F$). Utilizing high-resolution angle-resolved photoemission spectroscopy and density functional theory calculations, we meticulously map the three-dimensional electronic structure of CrSb. Our photoemission spectroscopic results on both (0001) and (10$\overline{1}$0) cleavages of CrSb collaboratively reveal unprecedented details on AM-induced band splitting, and subsequently pin down its unique bulk $g$-wave symmetry through quantitative analysis of the angular and photon-energy dependence of spin splitting. Moreover, the observed spin splitting reaches the magnitude of 0.93~eV near $E_F$, the most substantial among all confirmed AM materials. This study not only validates the nature of CrSb as a prototype $g$-wave like AM material but also underscores its pivotal role in pioneering applications in spintronics.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.09857 [pdf, other]

IGOT: Information Gain Optimized Tokenizer on Domain Adaptive Pretraining

Authors: Dawei Feng, Yihai Zhang, Zhixuan Xu

Abstract: Pretrained Large Language Models (LLMs) such as ChatGPT, Claude, etc. have demonstrated strong capabilities in various fields of natural language generation. However, there are still many problems when using LLMs in specialized domain-specific fields. When using generative AI to process downstream tasks, a common approach is to add new knowledge (e.g., private domain knowledge, cutting-edge information) to a pretrained model through continued training or fine-tuning. However, whether there is a universal paradigm for domain adaptation training is still an open question. In this article, we propose the Information Gain Optimized Tokenizer (IGOT), which analyzes the special token set of downstream tasks, constructs a new subset using a heuristic function $φ$ with the special token and its information gain to build a new domain-specific tokenizer, and continues pretraining on the downstream task data. We explored the many positive effects of this method's customized tokenizer on domain-adaptive pretraining and verified that this method can perform better than the ordinary method of just collecting data and fine-tuning. Based on our experiment, the continued pretraining process of IGOT with LLaMA-7B achieved 11.9\% token saving, 12.2\% training time saving, and 5.8\% maximum GPU VRAM usage saving; combined with the T5 model, we can even reach a 31.5\% training time saving, making porting general generative AI to specific domains more effective than before. In domain-specific tasks, supervised $IGOT_τ$ shows great performance in reducing both the convergence radius and convergence point during continued pretraining.

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2405.03124 [pdf, ps, other]

Dimension of homogeneous iterated function systems with algebraic translations

Authors: De-Jun Feng, Zhou Feng

Abstract: Let $μ$ be the self-similar measure associated with a homogeneous iterated function system $Φ= \{ λx + t_j \}_{j=1}^m$ on ${\Bbb R}$ and a probability vector $(p_{j})_{j=1}^m$, where $0\neq λ\in (-1,1)$ and $t_j\in {\Bbb R}$. Recently by modifying the arguments of Varjú (2019), Rapaport and Varjú (2024) showed that if $t_1,\ldots, t_m$ are rational numbers and $0<λ<1$, then $$ \dim μ=\min\Big \{ 1, \; \frac{\sum_{j=1}^m p_{j}\log p_{j}}{ \log |λ| }\Big\}$$ unless $Φ$ has exact overlaps. In this paper, we further show that the above equality holds in the case when $t_1,\ldots, t_m$ are algebraic numbers and $0<|λ|<1$. This is done by adapting and extending the ideas employed in the recent papers of Breuillard, Rapaport and Varjú.
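
Worked example (an illustration added here, not taken from the paper): the dimension formula in the abstract above can be checked on the classical middle-thirds Cantor measure, assuming $λ = 1/3$, $m = 2$, translations $t_1 = 0$, $t_2 = 2/3$ and weights $p_1 = p_2 = 1/2$. This system has rational translations and no exact overlaps, so the formula should apply:

$$ \sum_{j=1}^{2} p_j \log p_j = 2\cdot\tfrac12 \log\tfrac12 = -\log 2, \qquad \log|λ| = -\log 3, $$

$$ \dim μ = \min\Big\{ 1, \; \frac{-\log 2}{-\log 3} \Big\} = \frac{\log 2}{\log 3} \approx 0.6309, $$

which agrees with the well-known dimension of the Cantor measure.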

Submitted 5 May, 2024; originally announced May 2024.

MSC Class: 28A80; 42A85

arXiv:2404.19417 [pdf, other]

Physical Backdoor: Towards Temperature-based Backdoor Attacks in the Physical World

Authors: Wen Yin, Jian Lou, Pan Zhou, Yulai Xie, Dan Feng, Yuhua Sun, Tailai Zhang, Lichao Sun

Abstract: Backdoor attacks have been well-studied in visible light object detection (VLOD) in recent years. However, VLOD cannot effectively work in dark and temperature-sensitive scenarios. Instead, thermal infrared object detection (TIOD) is the most accessible and practical in such environments. In this paper, our team is the first to investigate the security vulnerabilities associated with TIOD in the context of backdoor attacks, spanning both the digital and physical realms. We introduce two novel types of backdoor attacks on TIOD, each offering unique capabilities: Object-affecting Attack and Range-affecting Attack. We conduct a comprehensive analysis of key factors influencing trigger design, which include temperature, size, material, and concealment. These factors, especially temperature, significantly impact the efficacy of backdoor attacks on TIOD. A thorough understanding of these factors will serve as a foundation for designing physical triggers and temperature controlling experiments. Our study includes extensive experiments conducted in both digital and physical environments. In the digital realm, we evaluate our approach using benchmark datasets for TIOD, achieving an Attack Success Rate (ASR) of up to 98.21%. In the physical realm, we test our approach in two real-world settings: a traffic intersection and a parking lot, using a thermal infrared camera. Here, we attain an ASR of up to 98.38%.

Submitted 30 April, 2024; originally announced April 2024.

Comments: To appear in CVPR 2024. 11 pages, 8 figures and 4 tables

arXiv:2404.18105 [pdf, other]

Tightly-Coupled VLP/INS Integrated Navigation by Inclination Estimation and Blockage Handling

Authors: Xiao Sun, Yuan Zhuang, Xiansheng Yang, Jianzhu Huai, Tianming Huang, Daquan Feng

Abstract: Visible Light Positioning (VLP) has emerged as a promising technology capable of delivering indoor localization with high accuracy. In VLP systems that use Photodiodes (PDs) as light receivers, the Received Signal Strength (RSS) is affected by the incidence angle of light, making the inclination of PDs a critical parameter in the positioning model. Currently, most studies assume the inclination to be constant, limiting the applications and positioning accuracy. Additionally, light blockages may severely interfere with the RSS measurements, but the literature has not explored blockage detection in real-world experiments. To address these problems, we propose a tightly coupled VLP/INS (Inertial Navigation System) integrated navigation system that uses graph optimization to account for varying PD inclinations and VLP blockages. We also discuss the possibility of simultaneously estimating the robot's pose and the locations of some unknown LEDs. Simulations and two groups of real-world experiments demonstrate the efficiency of our approach, achieving an average positioning accuracy of 10 cm during movement and inclination accuracy within 1 degree despite inclination changes and blockages.

Submitted 28 April, 2024; originally announced April 2024.
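
The dependence of RSS on PD inclination described in the abstract above can be illustrated with the textbook Lambertian line-of-sight model. This is only a sketch under stated assumptions: the listing does not give the paper's channel model, so the function name `lambertian_rss`, the default half-power angle, and the photodiode area are hypothetical, and field-of-view limits and optical filter gains are omitted.

```python
import math

def lambertian_rss(p_tx, distance, irradiance_angle, incidence_angle,
                   half_power_angle=math.radians(60), pd_area=1e-4):
    """Received optical power under a generic Lambertian LED model.

    Assumed model (not taken from the paper):
        P_r = P_t * (m + 1) * A / (2 * pi * d^2) * cos(phi)^m * cos(psi)
    where phi is the irradiance angle at the LED, psi is the incidence
    angle at the photodiode, and m is the Lambertian order.
    """
    # Lambertian order derived from the LED half-power semi-angle.
    m = -math.log(2) / math.log(math.cos(half_power_angle))
    # Light arriving from behind the photodiode plane contributes nothing.
    if incidence_angle >= math.pi / 2:
        return 0.0
    gain = (m + 1) * pd_area / (2 * math.pi * distance ** 2)
    return p_tx * gain * math.cos(irradiance_angle) ** m * math.cos(incidence_angle)

# Tilting the photodiode by 10 degrees directly under the LED scales the
# received power by cos(10 degrees), which a fixed-inclination model misses.
flat = lambertian_rss(1.0, 2.0, 0.0, 0.0)
tilted = lambertian_rss(1.0, 2.0, 0.0, math.radians(10))
print(tilted / flat)
```

In this simplified model an unmodelled tilt shows up as an apparent change in distance, which is one way to see why the abstract treats inclination as a quantity to be estimated rather than assumed constant.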

arXiv:2404.04932 [pdf, other]

Towards Understanding the Influence of Reward Margin on Preference Model Performance

Authors: Bowen Qin, Duanyu Feng, Xi Yang

Abstract: Reinforcement Learning from Human Feedback (RLHF) is a widely used framework for the training of language models. However, the process of using RLHF to develop a language model that is well-aligned presents challenges, especially when it comes to optimizing the reward model. Our research has found that existing reward models, when trained using the traditional ranking objective based on human preference data, often struggle to effectively distinguish between responses that are more or less favorable in real-world scenarios. To bridge this gap, our study introduces a novel method to estimate the preference differences without the need for detailed, exhaustive labels from human annotators. Our experimental results provide empirical evidence that incorporating margin values into the training process significantly improves the effectiveness of reward models. This comparative analysis not only demonstrates the superiority of our approach in terms of reward prediction accuracy but also highlights its effectiveness in practical applications.

Submitted 7 April, 2024; originally announced April 2024.
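
As a rough illustration of what "incorporating margin values" into a ranking objective can look like, the sketch below uses a common pairwise form, `-log(sigmoid(r_chosen - r_rejected - margin))`. This is an assumption for illustration only: the abstract above does not state the exact loss or how margins are estimated, and the function name `ranking_loss` is hypothetical.

```python
import math

def ranking_loss(r_chosen, r_rejected, margin=0.0):
    """Pairwise ranking loss with an optional additive margin (assumed form).

    Computes -log(sigmoid(r_chosen - r_rejected - margin)) in a numerically
    stable way. With margin=0 this reduces to the usual Bradley-Terry style
    ranking objective.
    """
    z = r_chosen - r_rejected - margin
    # -log(sigmoid(z)) = softplus(-z) = max(-z, 0) + log(1 + exp(-|z|))
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))

# A reward gap of 1.0 is penalized more once a margin of 1.0 is required.
print(ranking_loss(2.0, 1.0))              # no margin
print(ranking_loss(2.0, 1.0, margin=1.0))  # margin equal to the gap
```

With a positive margin, the reward model keeps receiving a training signal until the chosen response outscores the rejected one by at least that margin, rather than by any positive amount.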

arXiv:2404.04822 [pdf, ps, other]

Some Characterizations of TTC in Multiple-Object Reallocation Problems

Authors: Jacob Coreno, Di Feng

Abstract: This paper considers exchange of indivisible objects when agents are endowed with and desire bundles of objects. Agents are assumed to have lexicographic preferences over bundles. We show that the generalized Top Trading Cycles rule (TTC) is characterized by Pareto efficiency, balancedness, the weak endowment lower bound, and truncation-proofness (or drop strategy-proofness). In the classic Shapley-Scarf model, TTC is characterized by Pareto efficiency, individual rationality, and truncation-proofness. The proof is nonstandard and its novelty has independent methodological interest.

Submitted 19 May, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

Comments: 22 pages

arXiv:2404.04626 [pdf, ps, other]

Towards Analyzing and Understanding the Limitations of DPO: A Theoretical Perspective

Authors: Duanyu Feng, Bowen Qin, Chen Huang, Zheng Zhang, Wenqiang Lei

Abstract: Direct Preference Optimization (DPO), which derives reward signals directly from pairwise preference data, has shown its effectiveness on aligning Large Language Models (LLMs) with human preferences. Despite its widespread use across various tasks, DPO has been criticized for its sensitivity to the SFT's effectiveness and its hindrance to the learning capacity towards human-preferred responses, leading to less satisfactory performance. To overcome those limitations, a theoretical understanding of DPO is indispensable but still lacking. To this end, we take a step towards theoretically analyzing and understanding the limitations of DPO. Specifically, we provide an analytical framework using field theory to analyze the optimization process of DPO. By analyzing the gradient vector field of the DPO loss function, we find that the DPO loss function decreases the probability of producing human dispreferred data at a faster rate than it increases the probability of producing preferred data. This provides theoretical insights for understanding the limitations of DPO discovered in the related research experiments, thereby setting the foundation for its improvement.

Submitted 6 April, 2024; originally announced April 2024.

Comments: Draft version
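
The asymmetry described in the abstract above can be checked on a single preference pair with the standard DPO loss, `-log(sigmoid(beta * (log(pi_w / ref_w) - log(pi_l / ref_l))))`. The sketch below is a simplification and not the paper's analysis: it treats the two policy probabilities as free variables, and the function name `dpo_grads` and the example numbers are hypothetical.

```python
import math

def dpo_grads(pi_w, pi_l, ref_w, ref_l, beta=0.1):
    """Partial derivatives of the single-pair DPO loss.

    Returns (dL/d pi_w, dL/d pi_l), where pi_w and pi_l are the policy
    probabilities of the preferred and dispreferred responses and
    ref_w, ref_l are the corresponding reference-model probabilities.
    """
    u = beta * (math.log(pi_w / ref_w) - math.log(pi_l / ref_l))
    weight = beta / (1.0 + math.exp(u))  # beta * sigmoid(-u)
    # The shared weight is divided by each probability, so the smaller
    # probability receives the larger gradient magnitude.
    return -weight / pi_w, weight / pi_l

g_w, g_l = dpo_grads(pi_w=0.4, pi_l=0.1, ref_w=0.3, ref_l=0.3)
print(abs(g_l) / abs(g_w))  # equals pi_w / pi_l in this simplified setting
```

In this setting the ratio of gradient magnitudes is `pi_w / pi_l`, so once the dispreferred response is less likely than the preferred one, a gradient step pushes its probability down harder than it pushes the preferred probability up.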

arXiv:2404.04449 [pdf]

Self-referencing photothermal common-path interferometry to measure absorption of Si3N4 membranes for laser-light sails

Authors: Demeng Feng, Tanuj Kumar, Shenwei Yin, Merlin Mah, Phyo Lin, Margaret Fortman, Gabriel R. Jaffe, Chenghao Wan, Hongyan Mei, Yuzhe Xiao, Ron Synowicki, Ronald J. Warzoha, Victor W. Brar, Joseph J. Talghader, Mikhail A. Kats

Abstract: Laser-light sails are a spacecraft concept wherein lightweight "sails" are propelled to high speeds by lasers with high intensities. The sails must comprise materials with low optical loss, to minimize the risk of laser damage. Stoichiometric silicon nitride (Si$_3$N$_4$) is a candidate material with low loss in the near infrared, but the precise absorption coefficient has not been characterized in the membrane form-factor needed for sails. We use photothermal common-path interferometry (PCI), a sensitive pump-probe technique, to measure the absorption coefficient of stoichiometric and nonstoichiometric silicon nitride. To calibrate PCI measurements of membranes, we developed a self-referencing technique where a measurement is performed twice: once on a bare membrane, and a second time with a monolayer of graphene deposited on the membrane. The absorption of the sample with graphene can be measured by both PCI and more-conventional spectroscopic techniques, enabling the calibration of the PCI measurement. We find that with an absorption coefficient of (2.09 $\pm$ 0.76) $\times$ 10$^{-2}$ cm$^{-1}$ at 1064 nm, Si$_3$N$_4$ is a suitable laser-sail material for laser intensities as high as ~10 GW/m$^{2}$, which have been proposed for some laser-sail missions, while silicon-rich SiN$_x$ (x~1), with an absorption coefficient of 7.94 $\pm$ 0.50 cm$^{-1}$, is unlikely to survive such high laser intensities.

Submitted 5 April, 2024; originally announced April 2024.

Comments: Main text + supplementary

arXiv:2403.17671 [pdf]

Revealing the Microscopic Mechanism of Elementary Vortex Pinning in Superconductors

Authors: C. Chen, Y. Liu, Y. Chen, Y. N. Hu, T. Z. Zhang, D. Li, X. Wang, C. X. Wang, Z. Y. W. Lu, Y. H. Zhang, Q. L. Zhang, X. L. Dong, R. Wang, D. L. Feng, T. Zhang

Abstract: Vortex pinning is a crucial factor that determines the critical current of practical superconductors. However, the understanding of its underlying mechanism has long been phenomenological without a clear microscopic description. Here, using high-resolution scanning tunneling microscopy, we studied single vortex pinning induced by point defects in layered FeSe-based superconductors. We found that the defect-vortex interaction drives low-energy vortex bound states away from $E_F$, resulting in a mini gap which effectively lowers the energy of the vortex and causes the pinning. By measuring the local density-of-states, we directly obtained the elementary pinning energy and estimated the pinning force through the spatial gradient of the pinning energy. The results align with the bulk critical current measurement. We further show that a general microscopic quantum model that considers the defect-vortex interaction can well capture our observation. It indicates that the local pairing near the pinned vortex core is actually enhanced, which is beyond the traditional understanding that non-superconducting regions pin vortices. Our study thus reveals a general microscopic mechanism of vortex pinning in superconductors.

Submitted 26 March, 2024; originally announced March 2024.

Comments: 28 pages, 12 figures, Supplementary Materials included. Comments are welcome

arXiv:2403.07448 [pdf]

doi 10.1093/nsr/nwae194

Cuprate-like Electronic Structures in Infinite-Layer Nickelates with Substantial Hole Dopings

Authors: X. Ding, Y. Fan, X. X. Wang, C. H. Li, Z. T. An, J. H. Ye, S. L. Tang, M. Y. N. Lei, X. T. Sun, N. Guo, Z. H. Chen, S. Sangphet, Y. L. Wang, H. C. Xu, R. Peng, D. L. Feng

Abstract: The superconducting infinite-layer (IL) nickelates offer a new platform for investigating the long-standing problem of high-temperature superconductivity. Many models were proposed to understand its superconducting mechanisms based on the calculated electronic structure, and the multiple Fermi surfaces and multiple orbitals involved create complications and controversial conclusions. Over the past 5 years, the lack of direct measurements of the electronic structure has hindered the understanding of nickelate superconductors. Here we fill this gap by directly resolving the electronic structures of the parent compound LaNiO$_2$ and superconducting La$_{0.8}$Ca$_{0.2}$NiO$_2$ using angle-resolved photoemission spectroscopy (ARPES). We find that their Fermi surfaces consist of a quasi-two-dimensional (quasi-2D) hole pocket and a three-dimensional (3D) electron pocket at the Brillouin zone corner, whose volumes change upon Ca doping. The Fermi surface topology and band dispersion of the hole pocket closely resemble those observed in hole-doped cuprates. However, the cuprate-like band exhibits significantly higher hole doping in superconducting La$_{0.8}$Ca$_{0.2}$NiO$_2$ compared to superconducting cuprates, highlighting the disparities in the electronic states of the superconducting phase. Our observations highlight the novel aspects of the IL nickelates, and pave the way toward the microscopic understanding of the IL nickelate family and its superconductivity.

Submitted 5 June, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

Comments: 12 pages, 4 figures

Journal ref: National Science Review, nwae194 (2024)

arXiv:2403.06794 [pdf]

Closed-loop control of gamma oscillations in the brain connections through the transcranial stimulations

Authors: Xuan Zhang, Duoyu Feng, Djibrina Barry, Jiajia Li

Abstract: The reconstruction of brain neural network connections occurs not only during the infancy and early childhood stages of brain development, but also in patients with cognitive impairment in middle and old age under therapy with external stimulation, such as non-invasive repetitive transcranial magnetic stimulation (rTMS) and transcranial direct current stimulation (tDCS). However, it is still not clear how brain stimulation triggers and controls the reconstruction of neural network connections in the brain. This paper combines EEG data analysis and cortical neuronal network modeling methods. On the one hand, an E-I balanced cortical neural network model was constructed, and a long-lasting external stimulation in the form of sinusoidal-exponential TMS or square-wave tDCS was introduced into the network model to simulate the treatment process for the brain connections. On the other hand, by combining a Butterworth filter and a functional connectivity algorithm, the paper analyzes the relations between the attentional gamma oscillation responses and the brain connections based on publicly available EEGs recorded during the pre-tDCS and post-tDCS treatment phases. Firstly, the simulation results indicate that, during long-lasting external stimulations of tDCS/rTMS, the sustained gamma oscillation was found to trigger more release of BDNF from astrocytes, which participates in positively reshaping the excitatory neuronal network connections.

Submitted 11 March, 2024; originally announced March 2024.

arXiv:2403.02710 [pdf, other]

FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird's-Eye View and Perspective View

Authors: Jiawei Hou, Xiaoyan Li, Wenhao Guan, Gang Zhang, Di Feng, Yuheng Du, Xiangyang Xue, Jian Pu

Abstract: In autonomous driving, 3D occupancy prediction outputs voxel-wise status and semantic labels for more comprehensive understandings of 3D scenes compared with traditional perception tasks, such as 3D object detection and bird's-eye view (BEV) semantic segmentation. Recent researchers have extensively explored various aspects of this task, including view transformation techniques, ground-truth label generation, and elaborate network design, aiming to achieve superior performance. However, the inference speed, crucial for running on an autonomous vehicle, is neglected. To this end, a new method, dubbed FastOcc, is proposed. By carefully analyzing the network effect and latency from four parts, including the input image resolution, image backbone, view transformation, and occupancy prediction head, it is found that the occupancy prediction head holds considerable potential for accelerating the model while keeping its accuracy. Targeted at improving this component, the time-consuming 3D convolution network is replaced with a novel residual-like architecture, where features are mainly digested by a lightweight 2D BEV convolution network and compensated by integrating the 3D voxel features interpolated from the original image features. Experiments on the Occ3D-nuScenes benchmark demonstrate that our FastOcc achieves state-of-the-art results with a fast inference speed.

Submitted 5 March, 2024; originally announced March 2024.

Comments: Accepted by ICRA 2024

arXiv:2403.02472 [pdf, other]

OffensiveLang: A Community Based Implicit Offensive Language Dataset

Authors: Amit Das, Mostafa Rahgouy, Dongji Feng, Zheng Zhang, Tathagata Bhattacharya, Nilanjana Raychawdhary, Fatemeh Jamshidi, Vinija Jain, Aman Chadha, Mary Sandage, Lauramarie Pope, Gerry Dozier, Cheryl Seals

Abstract: The widespread presence of hateful languages on social media has resulted in adverse effects on societal well-being. As a result, addressing this issue with high priority has become very important. Hate speech or offensive languages exist in both explicit and implicit forms, with the latter being more challenging to detect. Current research in this domain encounters several challenges. Firstly, the existing datasets primarily rely on the collection of texts containing explicit offensive keywords, making it challenging to capture implicitly offensive contents that are devoid of these keywords. Secondly, common methodologies tend to focus solely on textual analysis, neglecting the valuable insights that community information can provide. In this research paper, we introduce a novel dataset OffensiveLang, a community based implicit offensive language dataset generated by ChatGPT 3.5 containing data for 38 different target groups. Despite limitations in generating offensive texts using ChatGPT due to ethical constraints, we present a prompt-based approach that effectively generates implicit offensive languages. To ensure data quality, we evaluate the dataset with human annotators. Additionally, we employ a prompt-based zero-shot method with ChatGPT and compare the detection results between human annotation and ChatGPT annotation. We utilize existing state-of-the-art models to see how effective they are in detecting such languages. The dataset is available here: https://github.com/AmitDasRup123/OffensiveLang

Submitted 17 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

arXiv:2402.18934 [pdf, other]

RELEAD: Resilient Localization with Enhanced LiDAR Odometry in Adverse Environments

Authors: Zhiqiang Chen, Hongbo Chen, Yuhua Qi, Shipeng Zhong, Dapeng Feng, Wu Jin, Weisong Wen, Ming Liu

Abstract: LiDAR-based localization is valuable for applications like mining surveys and underground facility maintenance. However, existing methods can struggle when dealing with uninformative geometric structures in challenging scenarios. This paper presents RELEAD, a LiDAR-centric solution designed to address scan-matching degradation. Our method enables degeneracy-free point cloud registration by solving constrained ESIKF updates in the front end and incorporates multisensor constraints, even when dealing with outlier measurements, through graph optimization based on Graduated Non-Convexity (GNC). Additionally, we propose a robust Incremental Fixed Lag Smoother (rIFL) for efficient GNC-based optimization. RELEAD has undergone extensive evaluation in degenerate scenarios and has outperformed existing state-of-the-art LiDAR-Inertial odometry and LiDAR-Visual-Inertial odometry methods.

Submitted 15 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

Journal ref: published in ICRA 2024

arXiv:2402.16749 [pdf, other]

MISC: Ultra-low Bitrate Image Semantic Compression Driven by Large Multimodal Model

Authors: Chunyi Li, Guo Lu, Donghui Feng, Haoning Wu, Zicheng Zhang, Xiaohong Liu, Guangtao Zhai, Weisi Lin, Wenjun Zhang

Abstract: With the evolution of storage and communication protocols, ultra-low bitrate image compression has become a highly demanding topic. However, existing compression algorithms must sacrifice either consistency with the ground truth or perceptual quality at ultra-low bitrate. In recent years, the rapid development of the Large Multimodal Model (LMM) has made it possible to balance these two goals. To solve this problem, this paper proposes a method called Multimodal Image Semantic Compression (MISC), which consists of an LMM encoder for extracting the semantic information of the image, a map encoder to locate the region corresponding to the semantic, an image encoder that generates an extremely compressed bitstream, and a decoder that reconstructs the image based on the above information. Experimental results show that our proposed MISC is suitable for compressing both traditional Natural Sense Images (NSIs) and emerging AI-Generated Images (AIGIs) content. It can achieve optimal consistency and perception results while saving 50% bitrate, which has strong potential applications in the next generation of storage and communication. The code will be released on https://github.com/lcysyzxdxc/MISC.

Submitted 17 April, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

Comments: 13 pages, 11 figures, 4 tables

arXiv:2402.16527 [pdf, other]

Defocus-integration Interferometric Scattering Microscopy for Speckle Suppression and Enhancing Nanoparticle Detection on Substrate

Authors: Nanfang Jiao, Shupei Lin, Delong Feng, Yong He, Xue-Wen Chen

Abstract: Direct optical detection and imaging of single nanoparticles on substrate in wide field underpin vast applications across different research fields. However, the speckles originating from the unavoidable random surface undulations of the substrate ultimately limit the size of the decipherable nanoparticles by the current optical techniques, including the ultrasensitive interferometric scattering microscopy (iSCAT). Here we report a defocus-integration iSCAT to suppress the speckle noise and to enhance the detection and imaging of single nanoparticles on ultra-flat glass substrate and silicon wafer. In particular, we discover distinct symmetry properties of the scattering phase between the nanoparticle and the surface undulations that cause the speckles. Consequently, we develop the defocus-integration technique to suppress the speckles. We experimentally achieve an enhancement of the signal to noise ratio by 6.9 dB for the nanoparticle detection. We demonstrate that the technique is generally applicable for nanoparticles of various materials and for both low and high refractive-index substrates.

Submitted 26 February, 2024; originally announced February 2024.

Comments: 5 pages, 4 figures

arXiv:2402.15999 [pdf]

Revelation of new magnetic domain wall category in the itinerant antiferromagnet Chromium

Authors: Yining Hu, Xu Wang, Chen Chen, Qingle Zhang, Dongming Zhao, Tianzhen Zhang, Chenxi Wang, Donglai Feng, Tong Zhang

Abstract: Conventional magnetic domain walls are characterized by the reorientation of local moments. However, what occurs at the boundary of itinerant magnets is largely unknown. Here, using spin-sensitive scanning tunneling microscopy, we investigated the microscopic boundaries of the spin-density-wave (SDW) state in a prototypical itinerant antiferromagnet, Cr. We find that at the boundary of two incommensurate SDW domains, the spins display finite-scale decay rather than reorientation. A novel double-Q SDW is generated with a second-order charge modulation. In commensurate SDW domains, a clear SDW gap is observed. Screw dislocations induce novel "half" vortices and anti-vortices that are connected by an antiphase domain wall. This domain wall is characterized by vanishing spin density, where intriguing SDW in-gap states emerge, resembling the Andreev bound states in superconductors. All these unique SDW boundary structures can be viewed as consequences of local interference of two SDWs, either with different Q or reversed phases. Therefore, our study reveals a new category of magnetic domain wall, the "interference wall", with a mechanism rooted in itinerant nature.

Submitted 25 February, 2024; originally announced February 2024.

Comments: 19 pages, 10 figures, supplementary materials included

arXiv:2402.12659 [pdf, other]

FinBen: A Holistic Financial Benchmark for Large Language Models

Authors: Qianqian Xie, Weiguang Han, Zhengyu Chen, Ruoyu Xiang, Xiao Zhang, Yueru He, Mengxi Xiao, Dong Li, Yongfu Dai, Duanyu Feng, Yijing Xu, Haoqiang Kang, Ziyan Kuang, Chenhan Yuan, Kailai Yang, Zheheng Luo, Tianlin Zhang, Zhiwei Liu, Guojun Xiong, Zhiyang Deng, Yuechen Jiang, Zhiyuan Yao, Haohang Li, Yangyang Yu, Gang Hu , et al. (9 additional authors not shown)

Abstract: LLMs have transformed NLP and shown promise in various fields, yet their potential in finance is underexplored due to a lack of comprehensive evaluation benchmarks, the rapid development of LLMs, and the complexity of financial tasks. In this paper, we introduce FinBen, the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks, covering seven critical aspects: information extraction (IE), textual analysis, question answering (QA), text generation, risk management, forecasting, and decision-making. FinBen offers several key innovations: a broader range of tasks and datasets, the first evaluation of stock trading, novel agent and Retrieval-Augmented Generation (RAG) evaluation, and three novel open-source evaluation datasets for text summarization, question answering, and stock trading. Our evaluation of 15 representative LLMs, including GPT-4, ChatGPT, and the latest Gemini, reveals several key findings: While LLMs excel in IE and textual analysis, they struggle with advanced reasoning and complex tasks like text generation and forecasting. GPT-4 excels in IE and stock trading, while Gemini is better at text generation and forecasting. Instruction-tuned LLMs improve textual analysis but offer limited benefits for complex tasks such as QA. FinBen has been used to host the first financial LLMs shared task at the FinNLP-AgentScen workshop during IJCAI-2024, attracting 12 teams. Their novel solutions outperformed GPT-4, showcasing FinBen's potential to drive innovation in financial LLMs. All datasets, results, and codes are released for the research community: https://github.com/The-FinAI/PIXIU.

Submitted 18 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

Comments: 26 pages, 11 figures

arXiv:2402.11790 [pdf, other]

CoLRIO: LiDAR-Ranging-Inertial Centralized State Estimation for Robotic Swarms

Authors: Shipeng Zhong, Hongbo Chen, Yuhua Qi, Dapeng Feng, Zhiqiang Chen, Jin Wu, Weisong Wen, Ming Liu

Abstract: Collaborative state estimation using different heterogeneous sensors is a fundamental prerequisite for robotic swarms operating in GPS-denied environments, posing a significant research challenge. In this paper, we introduce a centralized system to facilitate collaborative LiDAR-ranging-inertial state estimation, enabling robotic swarms to operate without the need for anchor deployment. The system efficiently distributes computationally intensive tasks to a central server, thereby reducing the computational burden on individual robots for local odometry calculations. The server back-end establishes a global reference by leveraging shared data and refining joint pose graph optimization through place recognition, global optimization techniques, and removal of outlier data to ensure precise and robust collaborative state estimation. Extensive evaluations of our system, utilizing both publicly available datasets and our custom datasets, demonstrate significant enhancements in the accuracy of collaborative SLAM estimates. Moreover, our system exhibits remarkable proficiency in large-scale missions, seamlessly enabling ten robots to collaborate effectively in performing SLAM tasks. In order to contribute to the research community, we will make our code open-source and accessible at https://github.com/PengYu-team/Co-LRIO.

Submitted 23 February, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

Journal ref: published in ICRA 2024

arXiv:2402.07405 [pdf, other]

Dólares or Dollars? Unraveling the Bilingual Prowess of Financial LLMs Between Spanish and English

Authors: Xiao Zhang, Ruoyu Xiang, Chenhan Yuan, Duanyu Feng, Weiguang Han, Alejandro Lopez-Lira, Xiao-Yang Liu, Sophia Ananiadou, Min Peng, Jimin Huang, Qianqian Xie

Abstract: Despite Spanish's pivotal role in the global finance industry, a pronounced gap exists in Spanish financial natural language processing (NLP) and application studies compared to English, especially in the era of large language models (LLMs). To bridge this gap, we unveil Toisón de Oro, the first bilingual framework that establishes instruction datasets, finetuned LLMs, and evaluation benchmark for financial LLMs in Spanish joint with English. We construct a rigorously curated bilingual instruction dataset including over 144K Spanish and English samples from 15 datasets covering 7 tasks. Harnessing this, we introduce FinMA-ES, an LLM designed for bilingual financial applications. We evaluate our model and existing LLMs using FLARE-ES, the first comprehensive bilingual evaluation benchmark with 21 datasets covering 9 tasks. The FLARE-ES benchmark results reveal a significant multilingual performance gap and bias in existing LLMs. FinMA-ES models surpass SOTA LLMs such as GPT-4 in Spanish financial tasks, due to strategic instruction tuning and leveraging data from diverse linguistic resources, highlighting the positive impact of cross-linguistic transfer. All our datasets, models, and benchmarks have been released.

Submitted 11 February, 2024; originally announced February 2024.

Comments: 10 pages, 2 figures

arXiv:2401.15398 [pdf, ps, other]

Resolvent analysis for predicting energetic structures in the far wake of a wind turbine

Authors: Dachuan Feng, Vikrant Gupta, Larry K. B. Li, Minping Wan

Abstract: A thorough understanding of the energetic flow structures that form in the far wake of a wind turbine is essential for accurate turbine wake modeling and wind farm performance estimation. We use resolvent analysis to explore such flow structures for a turbine operating in a neutral atmospheric boundary layer and validate our results against data-driven modes extracted through spectral proper orthogonal decomposition. Our results confirm that convective instabilities play a dominant role in generating turbulent kinetic energy (TKE) in the far wake. Additionally, we find evidence of the non-modal Orr mechanism contributing to TKE generation, particularly at low Strouhal numbers. The resolvent analysis method requires only the mean wake velocity and eddy viscosity profiles as inputs but can capture the energetic modes and TKE spectra in the far wake. In this specific application, the resolvent analysis method approximates the wake to be axisymmetric, which suggests that it can be paired with engineering wake models. Overall this study demonstrates the use of resolvent analysis as a viable tool for estimating TKE and for uncovering the mechanism of TKE generation.

Submitted 27 January, 2024; originally announced January 2024.

arXiv:2401.13267 [pdf, other]

Dual-modal Dynamic Traceback Learning for Medical Report Generation

Authors: Shuchang Ye, Mingyuan Meng, Mingjian Li, Dagan Feng, Jinman Kim

Abstract: With increasing reliance on medical imaging in clinical practices, automated report generation from medical images is in great demand. Existing report generation methods typically adopt an encoder-decoder deep learning framework to build a uni-directional image-to-report mapping. However, such a framework ignores the bi-directional mutual associations between images and reports, thus incurring difficulties in associating the intrinsic medical meanings between them. Recent generative representation learning methods have demonstrated the benefits of dual-modal learning from both image and text modalities. However, these methods exhibit two major drawbacks for medical report generation: 1) they tend to capture morphological information and have difficulties in capturing subtle pathological semantic information, and 2) they predict masked text relying on both unmasked images and text, inevitably degrading performance when inference is based solely on images. In this study, we propose a new report generation framework with dual-modal dynamic traceback learning (DTrace) to overcome the two identified drawbacks and enable dual-modal learning for medical report generation. To achieve this, our DTrace introduces a traceback mechanism to control the semantic validity of generated content via self-assessment. Further, our DTrace introduces a dynamic learning strategy to adapt to various proportions of image and text input, enabling report generation without reliance on textual input during inference. Extensive experiments on two well-benchmarked datasets (IU-Xray and MIMIC-CXR) show that our DTrace outperforms state-of-the-art medical report generation methods.

Submitted 6 March, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

arXiv:2401.12657 [pdf, ps, other]

Electronic and magnetic excitations in La$_3$Ni$_2$O$_7$

Authors: Xiaoyang Chen, Jaewon Choi, Zhicheng Jiang, Jiong Mei, Kun Jiang, Jie Li, Stefano Agrestini, Mirian Garcia-Fernandez, Xing Huang, Hualei Sun, Dawei Shen, Meng Wang, Jiangping Hu, Yi Lu, Ke-Jin Zhou, Donglai Feng

Abstract: The striking discovery of high-temperature superconductivity (HTSC) of 80 K in a bilayer nickelate La$_3$Ni$_2$O$_7$ under a moderately high pressure of about 14 GPa ignited a new wave of studying HTSC in nickelates. The properties of the parental phase at ambient pressure may contain key information on basic interactions therein and bosons that may mediate pairing giving birth to superconductivity. Moreover, the bilayer structure of La$_3$Ni$_2$O$_7$ may suggest a distinct minimal model in comparison to cuprate superconductors. Here using X-ray absorption spectroscopy and resonant inelastic X-ray scattering, we studied La$_3$Ni$_2$O$_7$ at ambient pressure, and found that Ni 3$d_{x^2-y^2}$, Ni 3$d_{z^2}$, and ligand oxygen 2$p$ orbitals dominate the low-energy physics with a small charge-transfer energy. Remarkably, well-defined optical-like magnetic excitations were found to soften into a quasi-static spin-density-wave ordering, evidencing the strong electronic correlations and rich magnetic properties. Based on a Heisenberg spin model, we found that the inter-layer effective magnetic superexchange interaction is much larger than the intra-layer ones, and proposed two viable magnetic structures. Our results set the foundation for further exploration of La$_3$Ni$_2$O$_7$ superconductor.

Submitted 23 January, 2024; originally announced January 2024.

arXiv:2401.12540 [pdf, other]

DREditor: An Time-efficient Approach for Building a Domain-specific Dense Retrieval Model

Authors: Chen Huang, Duanyu Feng, Wenqiang Lei, Jiancheng Lv

Abstract: Deploying dense retrieval models efficiently is becoming increasingly important across various industries. This is especially true for enterprise search services, where customizing search engines to meet the time demands of different enterprises in different domains is crucial. Motivated by this, we develop a time-efficient approach called DREditor to edit the matching rule of an off-the-shelf dense retrieval model to suit a specific domain. This is achieved by directly calibrating the output embeddings of the model using an efficient and effective linear mapping. This mapping is powered by an edit operator that is obtained by solving a specially constructed least squares problem. Compared to implicit rule modification via long-time finetuning, our experimental results show that DREditor provides significant advantages on different domain-specific datasets, dataset sources, retrieval models, and computing devices. It consistently enhances time efficiency by 100-300 times while maintaining comparable or even superior retrieval performance. In a broader context, we take the first step to introduce a novel embedding calibration approach for the retrieval task, filling the technical blank in the current field of embedding calibration. This approach also paves the way for building domain-specific dense retrieval models efficiently and inexpensively.

Submitted 23 January, 2024; originally announced January 2024.

Comments: 15 pages, 6 figures, Codes are available at https://github.com/huangzichun/DREditor

arXiv:2401.10501 [pdf]

Enhancing medical vision-language contrastive learning via inter-matching relation modelling

Authors: Mingjian Li, Mingyuan Meng, Michael Fulham, David Dagan Feng, Lei Bi, Jinman Kim

Abstract: Medical image representations can be learned through medical vision-language contrastive learning (mVLCL) where medical imaging reports are used as weak supervision through image-text alignment. These learned image representations can be transferred to and benefit various downstream medical vision tasks such as disease classification and segmentation. Recent mVLCL methods attempt to align image sub-regions and the report keywords as local-matchings. However, these methods aggregate all local-matchings via simple pooling operations while ignoring the inherent relations between them. These methods therefore fail to reason between local-matchings that are semantically related, e.g., local-matchings that correspond to the disease word and the location word (semantic-relations), and also fail to differentiate such clinically important local-matchings from others that correspond to less meaningful words, e.g., conjunction words (importance-relations). Hence, we propose a mVLCL method that models the inter-matching relations between local-matchings via a relation-enhanced contrastive learning framework (RECLF). In RECLF, we introduce a semantic-relation reasoning module (SRM) and an importance-relation reasoning module (IRM) to enable more fine-grained report supervision for image representation learning. We evaluated our method using four public benchmark datasets on four downstream tasks, including segmentation, zero-shot classification, supervised classification, and cross-modal retrieval. Our results demonstrated the superiority of our RECLF over the state-of-the-art mVLCL methods with consistent improvements across single-modal and cross-modal tasks. These results suggest that our RECLF, by modelling the inter-matching relations, can learn improved medical image representations with better generalization capabilities.

Submitted 19 January, 2024; originally announced January 2024.

Comments: 11 pages, 5 figures. Under review

arXiv:2401.05899 [pdf, other]

Optimistic Model Rollouts for Pessimistic Offline Policy Optimization

Authors: Yuanzhao Zhai, Yiying Li, Zijian Gao, Xudong Gong, Kele Xu, Dawei Feng, Ding Bo, Huaimin Wang

Abstract: Model-based offline reinforcement learning (RL) has made remarkable progress, offering a promising avenue for improving generalization with synthetic model rollouts. Existing works primarily focus on incorporating pessimism for policy optimization, usually via constructing a Pessimistic Markov Decision Process (P-MDP). However, the P-MDP discourages the policies from learning in out-of-distribution (OOD) regions beyond the support of offline datasets, which can under-utilize the generalization ability of dynamics models. In contrast, we propose constructing an Optimistic MDP (O-MDP). We initially observed the potential benefits of optimism brought by encouraging more OOD rollouts. Motivated by this observation, we present ORPO, a simple yet effective model-based offline RL framework. ORPO generates Optimistic model Rollouts for Pessimistic offline policy Optimization. Specifically, we train an optimistic rollout policy in the O-MDP to sample more OOD model rollouts. Then we relabel the sampled state-action pairs with penalized rewards and optimize the output policy in the P-MDP. Theoretically, we demonstrate that the performance of policies trained with ORPO can be lower-bounded in linear MDPs. Experimental results show that our framework significantly outperforms P-MDP baselines by a margin of 30%, achieving state-of-the-art performance on the widely-used benchmark. Moreover, ORPO exhibits notable advantages in problems that require generalization.

Submitted 11 January, 2024; originally announced January 2024.

arXiv:2401.00243 [pdf, other]

Uncertainty-Penalized Reinforcement Learning from Human Feedback with Diverse Reward LoRA Ensembles

Authors: Yuanzhao Zhai, Han Zhang, Yu Lei, Yue Yu, Kele Xu, Dawei Feng, Bo Ding, Huaimin Wang

Abstract: Reinforcement learning from human feedback (RLHF) emerges as a promising paradigm for aligning large language models (LLMs). However, a notable challenge in RLHF is overoptimization, where beyond a certain threshold, the pursuit of higher rewards leads to a decline in human preferences. In this paper, we observe the weakness of KL regularization which is commonly employed in existing RLHF methods to address overoptimization. To mitigate this limitation, we scrutinize the RLHF objective in the offline dataset and propose uncertainty-penalized RLHF (UP-RLHF), which incorporates uncertainty regularization during RL-finetuning. To enhance the uncertainty quantification abilities for reward models, we first propose a diverse low-rank adaptation (LoRA) ensemble by maximizing the nuclear norm of LoRA matrix concatenations. Then we optimize policy models utilizing penalized rewards, determined by both rewards and uncertainties provided by the diverse reward LoRA ensembles. Our experimental results, based on two real human preference datasets, showcase the effectiveness of diverse reward LoRA ensembles in quantifying reward uncertainty. Additionally, uncertainty regularization in UP-RLHF proves to be pivotal in mitigating overoptimization, thereby contributing to the overall performance.

Submitted 30 December, 2023; originally announced January 2024.

Comments: 10 pages, 5 figures

arXiv:2312.01319 [pdf, ps, other]

Erdős similarity problem via bi-Lipschitz embedding

Authors: De-jun Feng, Chun-Kit Lai, Ying Xiong

Abstract: The Erdős similarity conjecture asserted that an infinite set of real numbers cannot be affinely embedded into every measurable set of positive Lebesgue measure. The problem is still open, in particular for all fast decaying sequences. In this paper, we relax the problem to the bi-Lipschitz embedding and obtain some sharp criteria about the bi-Lipschitz Erdős similarity problem for strictly decreasing sequences.

Submitted 3 December, 2023; originally announced December 2023.

MSC Class: 28A78; 28A05; 30L05; 11K55

arXiv:2311.16707 [pdf]

Full-resolution MLPs Empower Medical Dense Prediction

Authors: Mingyuan Meng, Yuxin Xue, Dagan Feng, Lei Bi, Jinman Kim

Abstract: Dense prediction is a fundamental requirement for many medical vision tasks such as medical image restoration, registration, and segmentation. The most popular vision model, Convolutional Neural Networks (CNNs), has reached bottlenecks due to the intrinsic locality of convolution operations. Recently, transformers have been widely adopted for dense prediction for their capability to capture long-range visual dependence. However, due to the high computational complexity and large memory consumption of self-attention operations, transformers are usually used at downsampled feature resolutions. Such usage cannot effectively leverage the tissue-level textural information available only at the full image resolution. This textural information is crucial for medical dense prediction as it can differentiate the subtle human anatomy in medical images. In this study, we hypothesize that Multi-layer Perceptrons (MLPs) are superior alternatives to transformers in medical dense prediction where tissue-level details dominate the performance, as MLPs enable long-range dependence at the full image resolution. To validate our hypothesis, we develop a full-resolution hierarchical MLP framework that uses MLPs beginning from the full image resolution. We evaluate this framework with various MLP blocks on a wide range of medical dense prediction tasks including restoration, registration, and segmentation. Extensive experiments on six public well-benchmarked datasets show that, by simply using MLPs at full resolution, our framework outperforms its CNN and transformer counterparts and achieves state-of-the-art performance on various medical dense prediction tasks.

Submitted 28 November, 2023; originally announced November 2023.

Comments: Under Review

arXiv:2311.12935 [pdf, other]

Sampling-accelerated First-principles Prediction of Phonon Scattering Rates for Converged Thermal Conductivity and Radiative Properties

Authors: Ziqi Guo, Zherui Han, Dudong Feng, Guang Lin, Xiulin Ruan

Abstract: First-principles prediction of thermal conductivity and radiative properties is crucial. However, computing phonon scattering, especially for four-phonon scattering, could be prohibitively expensive, and the thermal conductivity even for silicon was still under-predicted and not converged in the literature. Here we propose a method to estimate scattering rates from a small sample of scattering processes using maximum likelihood estimation. The computational cost of estimating scattering rates and associated thermal conductivity and radiative properties is dramatically reduced by over 99%. This allows us to use an unprecedented q-mesh of 32*32*32 for silicon and achieve a converged thermal conductivity value that agrees much better with experiments. The accuracy and efficiency of our approach make it ideal for the high-throughput screening of materials for thermal and optical applications.

Submitted 21 November, 2023; originally announced November 2023.

arXiv:2310.15550 [pdf]

PET Synthesis via Self-supervised Adaptive Residual Estimation Generative Adversarial Network

Authors: Yuxin Xue, Lei Bi, Yige Peng, Michael Fulham, David Dagan Feng, Jinman Kim

Abstract: Positron emission tomography (PET) is a widely used, highly sensitive molecular imaging in clinical diagnosis. There is interest in reducing the radiation exposure from PET but also maintaining adequate image quality. Recent methods using convolutional neural networks (CNNs) to generate synthesized high-quality PET images from low-dose counterparts have been reported to be state-of-the-art for low-to-high image recovery methods. However, these methods are prone to exhibiting discrepancies in texture and structure between synthesized and real images. Furthermore, the distribution shift between low-dose PET and standard PET has not been fully investigated. To address these issues, we developed a self-supervised adaptive residual estimation generative adversarial network (SS-AEGAN). We introduce (1) An adaptive residual estimation mapping mechanism, AE-Net, designed to dynamically rectify the preliminary synthesized PET images by taking the residual map between the low-dose PET and synthesized output as the input, and (2) A self-supervised pre-training strategy to enhance the feature representation of the coarse generator. Our experiments with a public benchmark dataset of total-body PET images show that SS-AEGAN consistently outperformed the state-of-the-art synthesis methods with various dose reduction factors.

Submitted 24 October, 2023; originally announced October 2023.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2310.08826 [pdf, other]

Revisiting Multi-modal 3D Semantic Segmentation in Real-world Autonomous Driving

Authors: Feng Jiang, Chaoping Tu, Gang Zhang, Jun Li, Hanqing Huang, Junyu Lin, Di Feng, Jian Pu

Abstract: LiDAR and camera are two critical sensors for multi-modal 3D semantic segmentation and are supposed to be fused efficiently and robustly to promise safety in various real-world scenarios. However, existing multi-modal methods face two key challenges: 1) difficulty with efficient deployment and real-time execution; and 2) drastic performance degradation under weak calibration between LiDAR and cameras. To address these challenges, we propose CPGNet-LCF, a new multi-modal fusion framework extending the LiDAR-only CPGNet. CPGNet-LCF solves the first challenge by inheriting the easy deployment and real-time capabilities of CPGNet. For the second challenge, we introduce a novel weak calibration knowledge distillation strategy during training to improve the robustness against the weak calibration. CPGNet-LCF achieves state-of-the-art performance on the nuScenes and SemanticKITTI benchmarks. Remarkably, it can be easily deployed to run in 20ms per frame on a single Tesla V100 GPU using TensorRT TF16 mode. Furthermore, we benchmark performance over four weak calibration levels, demonstrating the robustness of our proposed approach.

Submitted 12 October, 2023; originally announced October 2023.

Comments: 7 pages, 3 figures

arXiv:2310.05620 [pdf, other]

LAiW: A Chinese Legal Large Language Models Benchmark

Authors: Yongfu Dai, Duanyu Feng, Jimin Huang, Haochen Jia, Qianqian Xie, Yifang Zhang, Weiguang Han, Wei Tian, Hao Wang

Abstract: General and legal domain LLMs have demonstrated strong performance in various tasks of LegalAI. However, the current evaluations of these LLMs in LegalAI are defined by the experts of computer science, lacking consistency with the logic of legal practice, making it difficult to judge their practical capabilities. To address this challenge, we are the first to build the Chinese legal LLMs benchmark LAiW, based on the logic of legal practice. To align with the thinking process of legal experts and legal practice (syllogism), we divide the legal capabilities of LLMs from easy to difficult into three levels: basic information retrieval, legal foundation inference, and complex legal application. Each level contains multiple tasks to ensure a comprehensive evaluation. Through automated evaluation of current general and legal domain LLMs on our benchmark, we indicate that these LLMs may not align with the logic of legal practice. LLMs seem to be able to directly acquire complex legal application capabilities but perform poorly in some basic tasks, which may pose obstacles to their practical application and acceptance by legal experts. To further confirm the complex legal application capabilities of current LLMs in legal application scenarios, we also incorporate human evaluation with legal experts. The results indicate that while LLMs may demonstrate strong performance, they still require reinforcement of legal logic.

Submitted 18 February, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

arXiv:2310.00566 [pdf, other]

Empowering Many, Biasing a Few: Generalist Credit Scoring through Large Language Models

Authors: Duanyu Feng, Yongfu Dai, Jimin Huang, Yifang Zhang, Qianqian Xie, Weiguang Han, Zhengyu Chen, Alejandro Lopez-Lira, Hao Wang

Abstract: In the financial industry, credit scoring is a fundamental element, shaping access to credit and determining the terms of loans for individuals and businesses alike. Traditional credit scoring methods, however, often grapple with challenges such as narrow knowledge scope and isolated evaluation of credit tasks. Our work posits that Large Language Models (LLMs) have great potential for credit scoring tasks, with strong generalization ability across multiple tasks. To systematically explore LLMs for credit scoring, we propose the first open-source comprehensive framework. We curate a novel benchmark covering 9 datasets with 14K samples, tailored for credit assessment and a critical examination of potential biases within LLMs, and the novel instruction tuning data with over 45k samples. We then propose the first Credit and Risk Assessment Large Language Model (CALM) by instruction tuning, tailored to the nuanced demands of various financial risk assessment tasks. We evaluate CALM, existing state-of-art (SOTA) methods, open source and closed source LLMs on the build benchmark. Our empirical results illuminate the capability of LLMs to not only match but surpass conventional models, pointing towards a future where credit scoring can be more inclusive, comprehensive, and unbiased. We contribute to the industry's transformation by sharing our pioneering instruction-tuning datasets, credit and risk assessment LLM, and benchmarks with the research community and the financial industry.

Submitted 17 February, 2024; v1 submitted 30 September, 2023; originally announced October 2023.

arXiv:2309.17344 [pdf, other]

Four phonon-dominated near-field radiation in weakly anharmonic polar materials

Authors: Dudong Feng, Xiaolong Yang, Zherui Han, Xiulin Ruan

Abstract: Inelastic scattering processes typically introduce friction among carriers and reduce the transport properties of photons, phonons, and electrons. However, we predict that in contrast to the role in reducing thermal conductivity, four-phonon scattering dominates near-field radiative heat transfer (NFRHT) in both boron arsenide (BAs) and boron antimonide. Including four-phonon scattering results in a nearly 400-fold increase in the total heat flux between two BAs thin-films compared to three-phonon scattering alone. This non-intuitive enhancement arises because the large number of NFRHT channels activated by four-phonon scattering outcompetes the effect of decreased coupling strength of surface phonon polaritons at the resonance frequency. Additionally, we point out that four-phonon scattering can decrease NFRHT in certain other systems.

Submitted 29 September, 2023; originally announced September 2023.

arXiv:2309.15679

Classification of skyrmionic textures and extraction of Hamiltonian parameters via machine learning

Authors: Dushuo Feng, Zhihao Guan, Xiaoping Wu, Yan Wu, Changsheng Song

Abstract: Classifying skyrmionic textures and extracting magnetic Hamiltonian parameters are fundamental and demanding endeavors within the field of two-dimensional (2D) spintronics. By using micromagnetic simulation and machine learning (ML) methods, we theoretically realize the recognition of nine skyrmionic textures and the mining of magnetic Hamiltonian parameters from massive spin texture images in the 2D Heisenberg model. For texture classification, a deep neural network (DNN) trained via transfer learning is proposed to distinguish nine different skyrmionic textures. For parameter extraction, based on the textures generated by different Heisenberg exchange stiffness (J), Dzyaloshinskii-Moriya strength (D), and anisotropy constant (K), we apply a multi-input single-output (MISO) deep learning model (handling both images and parameters) and a support vector regression (SVR) model (dealing with Fourier features) to extract the parameters embedded in the spin textures. The models for classification and extraction both achieve great results, with accuracies of 98% (DNN), 90% (MISO) and 80% (SVR). Importantly, via our ML methods, the skyrmionic textures with blurred phase boundaries can be effectively distinguished, and the concluded formation conditions of various skyrmionic textures, especially the skyrmion crystal, are consistent with previous reports. Besides, our models demonstrate the mapping relationship between spin texture images and magnetic parameters, which proves the feasibility of extracting microscopic mechanisms from experimental images and has guiding significance for the experiments of spintronics.

Submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.08980

Differential Modulation for Short Packet Transmission in URLLC

Authors: Canjian Zheng, Fu-Chun Zheng, Jingjing Luo, Pengcheng Zhu, Xiaohu You, Daquan Feng

Abstract: One key feature of ultra-reliable low-latency communications (URLLC) in 5G is to support short packet transmission (SPT). However, the pilot overhead in SPT for channel estimation is relatively high, especially in high Doppler environments. In this paper, we advocate the adoption of differential modulation to support ultra-low latency services, which can ease the channel estimation burden and reduce the power and bandwidth overhead incurred in traditional coherent modulation schemes. Specifically, we consider a multi-connectivity (MC) scheme employing differential modulation to enable URLLC services. The popular selection combining and maximal ratio combining schemes are respectively applied to explore the diversity gain in the MC scheme. A first-order autoregressive model is further utilized to characterize the time-varying nature of the channel. Theoretically, the maximum achievable rate and minimum achievable block error rate under ergodic fading channels with PSK inputs and perfect CSI are first derived by using the non-asymptotic information-theoretic bounds. The performance of SPT with differential modulation and MC schemes is then analysed by characterizing the effect of differential modulation and time-varying channels as a reduction in the effective SNR. Simulation results show that differential modulation does offer a significant advantage over the pilot-assisted coherent scheme for SPT, especially in high Doppler environments.

Submitted 16 September, 2023; originally announced September 2023.

Comments: 15 pages, 9 figures

Showing 1–50 of 439 results for author: Feng, D