Coupling of hole double quantum dot in planar germanium to a microwave cavity

Authors: Yuan Kang, Zong-Hu Li, Zhen-Zhen Kong, Fang-Ge Li, Tian-Yue Hao, Ze-Cheng Wei, Song-Yan Deng, Bao-Chuan Wang, Hai-Ou Li, Gui-Lei Wang, Guang-Can Guo, Gang Cao, Guo-Ping Guo

Abstract: In recent years, notable progress has been made in the study of hole qubits in planar germanium, and circuit quantum electrodynamics (circuit QED) has emerged as a promising approach for achieving long-range coupling and scaling up of qubits. Here, we demonstrate the coupling between holes in a planar germanium double quantum dot (DQD) and photons in a microwave cavity. Specifically, a real-time calibrated virtual gate method is developed to characterize this hybrid system, which in turn allows us to determine the typical parameters sequentially through single-parameter fitting instead of conventional multi-parameter fitting with additional uncertainty, and gives the hole-photon coupling rate of $g_0/2π$ = 21.7 MHz. This work is a step toward further research on hole-photon interactions and long-range qubit coupling in planar germanium. The experimental method developed in this work contributes to the more accurate and efficient characterization of hybrid cavity-QED systems.

Submitted 12 October, 2023; originally announced October 2023.

Comments: 11 pages, 4 figures

arXiv:2310.08134 [pdf, other]

Sensing-assisted Accurate and Fast Beam Management for Cellular-connected mmWave UAV Network

Authors: Yanpeng Cui, Qixun Zhang, Zhiyong Feng, Qin Wen, Ying Zhou, Zhiqing Wei, Ping Zhang

Abstract: Beam management, including initial access (IA) and beam tracking, is essential to the millimeter-wave Unmanned Aerial Vehicle (UAV) network. However, conventional communication-only and feedback-based schemes suffer a high delay and low accuracy of beam alignment since they only enable the receiver to passively hear the information of the transmitter from the radio domain. This paper presents a novel sensing-assisted beam management approach, the first solution that fully utilizes the information from the visual domain to improve communication performance. We employ both integrated sensing and communication and computer vision techniques and design an extended Kalman filtering method for beam tracking and prediction. Besides, we also propose a novel dual identity association solution to distinguish multiple UAVs in dynamic environments. Real-world experiments and numerical results show that the proposed solution outperforms the conventional methods in IA delay, association accuracy, tracking error, and communication performance.

Submitted 12 October, 2023; originally announced October 2023.

arXiv:2310.08082 [pdf, ps, other]

Jointly Optimized Global-Local Visual Localization of UAVs

Authors: Haoling Li, Jiuniu Wang, Zhiwei Wei, Wenjia Xu

Abstract: Navigation and localization of UAVs present a challenge when global navigation satellite systems (GNSS) are disrupted and unreliable. Traditional techniques, such as simultaneous localization and mapping (SLAM) and visual odometry (VO), exhibit certain limitations in furnishing absolute coordinates and mitigating error accumulation. Existing visual localization methods achieve autonomous visual localization without error accumulation by matching with ortho satellite images. However, doing so cannot guarantee real-time performance due to the complex matching process. To address these challenges, we propose a novel Global-Local Visual Localization (GLVL) network. Our GLVL network is a two-stage visual localization approach, combining a large-scale retrieval module that finds similar regions with the UAV flight scene, and a fine-grained matching module that localizes the precise UAV coordinate, enabling real-time and precise localization. The training process is jointly optimized in an end-to-end manner to further enhance the model capability. Experiments on six UAV flight scenes encompassing both texture-rich and texture-sparse regions demonstrate the ability of our model to achieve the real-time precise localization requirements of UAVs. Particularly, our method achieves a localization error of only 2.39 meters in 0.48 seconds in a village scene with sparse texture features.

Submitted 12 October, 2023; originally announced October 2023.

arXiv:2310.07401 [pdf, ps, other]

doi 10.1109/ICCC57788.2023.10233416

Integrated Sensing and Communication enabled Doppler Frequency Shift Estimation and Compensation

Authors: Jinzhu Jia, Zhiqing Wei, Ruiyun Zhang, Lin Wang

Abstract: Although millimeter wave technology provides low-latency and high-data-rate transmission, it causes severe Doppler Frequency Shift (DFS) in high-speed vehicular networks, which tremendously damages the communication performance. In this paper, we propose an Integrated Sensing and Communication (ISAC) enabled DFS estimation and compensation algorithm. Firstly, the DFS is coarsely estimated and compensated using radar detection. Then, the designed preamble sequence is used to accurately estimate and compensate DFS. In addition, an adaptive DFS estimator is designed to reduce the computational complexity. Compared with the traditional DFS estimation algorithm, the improvement of the proposed algorithm is verified in bit error rate and mean square error performance by simulation results.

Submitted 11 October, 2023; originally announced October 2023.

Comments: 6 pages, 8 figures, IEEE/CIC ICCC conference

Journal ref: 2023 IEEE/CIC International Conference on Communications in China (ICCC). IEEE, 2023: 1-6

arXiv:2310.07292 [pdf, ps, other]

doi 10.1109/JSEN.2023.3313422

Integrated Sensing and Communication Neighbor Discovery for MANET with Gossip Mechanism

Authors: Zhiqing Wei, Chenfei Li, Yanpeng Cui, Xu Chen, Zeyang Meng, Zhiyong Feng

Abstract: Mobile Ad hoc Network (MANET), supporting Machine-Type Communication (MTC), has a strong demand for rapid networking. Neighbor Discovery (ND) is a key initial step in configuring MANETs and faces a serious challenge in decreasing convergence time. Integrated Sensing and Communication (ISAC), as one of the potential key technologies in the 6th Generation (6G) mobile networks, can obtain the sensing data as prior information to accelerate ND convergence. In order to further reduce the convergence time of ND, this paper introduces the ISAC-enabled gossip mechanism into the ND algorithm. The prior information acquired by ISAC reduces the information redundancy brought by the gossip mechanism and thus decreases the probability of collision, which further improves convergence speed. The average number of discovered nodes within a given period is derived, which is applied as the critical metric to evaluate the performance of ND algorithms. The simulation results confirm the correctness of the theoretical derivation and show that the interplay between the prior mechanisms and the gossip mechanism significantly reduces the convergence time. In addition, to solve the problem of imperfect sensing information, reinforcement learning is applied. Under the constraints of the convergence condition, the non-Reply and non-Stop Algorithm based on Gossip and Q-learning (GQ-nRnS) proposed in this paper not only ensures the completeness of ND, but also maintains a high convergence speed of ND. Compared with the Q-learning-based ND algorithm (Q-ND), the average convergence time of the GQ-nRnS algorithm is reduced by about 66.4%.

Submitted 11 October, 2023; originally announced October 2023.

Comments: 12 pages, 16 figures, in IEEE Sensors Journal, 2023

arXiv:2310.07180 [pdf, other]

Integrated Sensing and Communication enabled Multiple Base Stations Cooperative Sensing Towards 6G

Authors: Zhiqing Wei, Wangjun Jiang, Zhiyong Feng, Huici Wu, Ning Zhang, Kaifeng Han, Ruizhong Xu, Ping Zhang

Abstract: Driven by the intelligent applications of sixth-generation (6G) mobile communication systems such as smart city and autonomous driving, which connect the physical and cyber space, the integrated sensing and communication (ISAC) brings a revolutionary change to the base stations (BSs) of 6G by integrating radar sensing and communication in the same hardware and wireless resource. However, with the requirements of long-range and accurate sensing in the applications of smart city and autonomous driving, the ISAC enabled single BS still has a limitation in the sensing range and accuracy. With the networked infrastructures of mobile communication systems, multi-BS cooperative sensing is a natural choice satisfying the requirement of long-range and accurate sensing. In this article, the framework of multi-BS cooperative sensing is proposed, breaking through the limitation of single-BS sensing. The enabling technologies, including unified ISAC performance metrics, ISAC signal design and optimization, interference management, and cooperative sensing algorithms, are introduced in detail. The performance evaluation results are provided to verify the effectiveness of multi-BS cooperative sensing schemes. With ISAC enabled multi-BS cooperative sensing (ISAC-MCS), the intelligent infrastructures connecting physical and cyber space can be established, ushering in the era of 6G promoting the intelligence of everything.

Submitted 24 November, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

Comments: 11 pages, 6 figures

Journal ref: IEEE Network 2023

arXiv:2310.07122 [pdf, ps, other]

Spectrum Sharing Towards Delay Deterministic Wireless Network: Delay Performance Analysis

Authors: Zhiqing Wei, Ling Zhang, Gaofeng Nie, Huici Wu, Ning Zhang, Zeyang Meng, Zhiyong Feng

Abstract: To accommodate Machine-type Communication (MTC) service, the wireless network needs to support low-delay and low-jitter data transmission, realizing delay deterministic wireless network. This paper analyzes the delay and jitter of the wireless network with and without spectrum sharing. When sharing the spectrum of the licensed network, the spectrum band of wireless network can be expanded, such that the delay and jitter of data transmission are reduced. The challenge of this research is to model the relation between the delay/jitter and the parameters such as node distribution, transmit power, and bandwidth, etc. To this end, this paper applies stochastic geometry and queueing theory to analyze the outage probability of the licensed network and the delay performance of the wireless network with and without spectrum sharing. By establishing the M/G/1 queueing model for the queueing of the Base Station (BS) in the wireless network, the downlink delay and jitter are derived. Monte Carlo simulation results show that the spectrum sharing reduces the delay and jitter without causing serious interference to the licensed network, which can lay a foundation for the application of spectrum sharing in delay deterministic wireless network supporting MTC service.

Submitted 10 October, 2023; originally announced October 2023.

Comments: 15 pages, 14 figures

MSC Class: 94A99 ACM Class: H.1.1

arXiv:2310.06401 [pdf, other]

ISAC 4D Imaging System Based on 5G Downlink Millimeter Wave Signal

Authors: Bohao Lu, Zhiqing Wei, Lin Wang, Ruiyun Zhang, Zhiyong Feng

Abstract: Integrated Sensing and Communication (ISAC) has become a key technology for the 5th generation (5G) and 6th generation (6G) wireless communications due to its high spectrum utilization efficiency. Utilizing infrastructure such as 5G Base Stations (BS) to realize environmental imaging and reconstruction is important for promoting the construction of smart cities. Current 4D imaging methods utilizing Frequency Modulated Continuous Wave (FMCW) based Fast Fourier Transform (FFT) are not suitable for ISAC scenarios due to the higher bandwidth occupation and lower resolution. We propose a 4D (3D-Coordinates, Velocity) imaging method with higher sensing accuracy based on 2D-FFT with 2D-MUSIC utilizing standard 5G Downlink (DL) millimeter wave (mmWave) signals. To improve the sensing precision we also design a transceiver antenna array element arrangement scheme based on MIMO virtual aperture technique. We further propose a target detection algorithm based on multi-dimensional Constant False Alarm (CFAR) detection, which optimizes the ISAC imaging signal processing flow and reduces the computational pressure of signal processing. Simulation results show that our proposed method has better imaging results. The code is publicly available at https://github.com/MrHaobolu/ISAC_4D_IMaging.git.

Submitted 1 December, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

arXiv:2310.06387 [pdf, other]

Jailbreak and Guard Aligned Language Models with Only Few In-Context Demonstrations

Authors: Zeming Wei, Yifei Wang, Ang Li, Yichuan Mo, Yisen Wang

Abstract: Large Language Models (LLMs) have shown remarkable success in various tasks, yet their safety and the risk of generating harmful content remain pressing concerns. In this paper, we delve into the potential of In-Context Learning (ICL) to modulate the alignment of LLMs. Specifically, we propose the In-Context Attack (ICA) which employs harmful demonstrations to subvert LLMs, and the In-Context Defense (ICD) which bolsters model resilience through examples that demonstrate refusal to produce harmful responses. We offer theoretical insights to elucidate how a limited set of in-context demonstrations can pivotally influence the safety alignment of LLMs. Through extensive experiments, we demonstrate the efficacy of ICA and ICD in respectively elevating and mitigating the success rates of jailbreaking prompts. Our findings illuminate the profound influence of ICL on LLM behavior, opening new avenues for improving the safety of LLMs.

Submitted 25 May, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

arXiv:2310.06382 [pdf, other]

Mutual Information Metrics for Uplink MIMO-OFDM Integrated Sensing and Communication System

Authors: Jinghui Piao, Zhiqing Wei, Xin Yuan, Xiaoyu Yang, Huici Wu, Zhiyong Feng

Abstract: As the uplink sensing has the advantage of easy implementation, it attracts great attention in integrated sensing and communication (ISAC) system. This paper presents an uplink ISAC system based on multi-input multi-output orthogonal frequency division multiplexing (MIMO-OFDM) technology. The mutual information (MI) is introduced as a unified metric to evaluate the performance of communication and sensing. In this paper, firstly, the upper and lower bounds of communication and sensing MI are derived in detail based on the interaction between communication and sensing. The ISAC waveform is then optimized by maximizing the weighted sum of sensing and communication MI. The Monte Carlo simulation results show that, compared with other waveform optimization schemes, the proposed ISAC scheme has the best overall performance.

Submitted 10 October, 2023; originally announced October 2023.

arXiv:2310.06285 [pdf, ps, other]

Fast Neighbor Discovery for Wireless Ad Hoc Network with Successive Interference Cancellation

Authors: Zhiqing Wei, Yueyue Liang, Zeyang Meng, Zhiyong Feng, Kaifeng Han, Huici Wu

Abstract: Neighbor discovery (ND) is a key step in wireless ad hoc network, which directly affects the efficiency of wireless networking. Improving the speed of ND has always been the goal of ND algorithms. The classical ND algorithms lose packets due to the collision of multiple packets, which greatly affects the speed of the ND algorithms. Traditional methods detect packet collision and implement retransmission when encountering packet loss. However, they do not solve the packet collision problem and the performance improvement of ND algorithms is limited. In this paper, the successive interference cancellation (SIC) technology is introduced into the ND algorithms to unpack multiple collision packets by distinguishing multiple packets in the power domain. Besides, the multi-packet reception (MPR) is further applied to reduce the probability of packet collision by distinguishing multiple received packets, thus further improving the speed of ND algorithms. Six ND algorithms, namely completely random algorithm (CRA), CRA based on SIC (CRA-SIC), CRA based on SIC and MPR (CRA-SIC-MPR), scan-based algorithm (SBA), SBA based on SIC (SBA-SIC), and SBA based on SIC and MPR (SBA-SIC-MPR), are theoretically analyzed and verified by simulation. The simulation results show that SIC and MPR reduce the ND time of SBA by 69.02% and CRA by 66.03% on average.

Submitted 9 October, 2023; originally announced October 2023.

Comments: 16 pages, 16 figures

MSC Class: 60B99; 94A99 ACM Class: C.2.2

arXiv:2310.05444 [pdf, other]

Waveform Design for MIMO-OFDM Integrated Sensing and Communication System: An Information Theoretical Approach

Authors: Zhiqing Wei, Jinghui Piao, Xin Yuan, Huici Wu, J. Andrew Zhang, Zhiyong Feng, Lin Wang, Ping Zhang

Abstract: Integrated sensing and communication (ISAC) is regarded as the enabling technology in the future 5th-Generation-Advanced (5G-A) and 6th-Generation (6G) mobile communication system. ISAC waveform design is critical in ISAC system. However, the difference of the performance metrics between sensing and communication brings challenges for the ISAC waveform design. This paper applies the unified performance metrics in information theory, namely mutual information (MI), to measure the communication and sensing performance in multicarrier ISAC system. In multi-input multi-output orthogonal frequency division multiplexing (MIMO-OFDM) ISAC system, we first derive the sensing and communication MI with subcarrier correlation and spatial correlation. Then, we propose optimal waveform designs for maximizing the sensing MI, communication MI and the weighted sum of sensing and communication MI, respectively. The optimization results are validated by Monte Carlo simulations. Our work provides effective closed-form expressions for waveform design, enabling the realization of MIMO-OFDM ISAC system with balanced performance in communication and sensing.

Submitted 9 October, 2023; originally announced October 2023.

arXiv:2310.02569 [pdf, other]

ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks

Authors: Zejun Li, Ye Wang, Mengfei Du, Qingwen Liu, Binhao Wu, Jiwen Zhang, Chengxing Zhou, Zhihao Fan, Jie Fu, Jingjing Chen, Xuanjing Huang, Zhongyu Wei

Abstract: Recent years have witnessed remarkable progress in the development of large vision-language models (LVLMs). Benefiting from the strong language backbones and efficient cross-modal alignment strategies, LVLMs exhibit surprising capabilities to perceive visual signals and perform visually grounded reasoning. However, the capabilities of LVLMs have not been comprehensively and quantitatively evaluated. Most existing multi-modal benchmarks require task-oriented input-output formats, posing great challenges to automatically assess the free-form text output of LVLMs. To effectively leverage the annotations available in existing benchmarks and reduce the manual effort required for constructing new benchmarks, we propose to re-formulate existing benchmarks into unified LVLM-compatible formats. Through systematic data collection and reformulation, we present the ReForm-Eval benchmark, offering substantial data for evaluating various capabilities of LVLMs. Based on ReForm-Eval, we conduct extensive experiments, thoroughly analyze the strengths and weaknesses of existing LVLMs, and identify the underlying factors. Our benchmark and evaluation framework will be open-sourced as a cornerstone for advancing the development of LVLMs.

Submitted 17 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

Comments: 38 pages, 11 figures, 24 tables

arXiv:2310.02555 [pdf, other]

doi 10.1109/TCCN.2024.3391307

Integrated Sensing and Communication Signal Processing Based on Compressed Sensing Over Unlicensed Spectrum Bands

Authors: Haotian Liu, Zhiqing Wei, Fengyun Li, Yuewei Lin, Hanyang Qu, Huici Wu, Zhiyong Feng

Abstract: As a promising key technology of 6th generation (6G) mobile communication system, integrated sensing and communication (ISAC) technology aims to make full use of spectrum resources to enable the functional integration of communication and sensing. The ISAC-enabled mobile communication system regularly operates in non-continuous spectrum bands due to crowded licensed frequency bands. However, the conventional sensing algorithms over non-continuous spectrum bands have disadvantages such as reduced peak-to-side lobe ratio (PSLR) and degraded anti-noise performance. Facing this challenge, we propose a high-precision ISAC signal processing algorithm based on compressed sensing (CS) in this paper. By integrating the resource block group (RBG) configuration information in 5th generation new radio (5G NR) and channel information matrices, we can dynamically and accurately obtain power estimation spectra. Moreover, we employ the fast iterative shrinkage-thresholding algorithm (FISTA) to address the reconstruction problem and utilize K-fold cross validation (KCV) to obtain optimal parameters. Simulation results show that the proposed algorithm has lower sidelobes or even zero sidelobes compared with conventional sensing algorithms. Meanwhile, compared with the improved 2D FFT algorithm and conventional 2D FFT algorithm, the proposed algorithms in this paper have a maximum improvement of 54.66 % and 84.36 % in range estimation accuracy, and 41.54 % and 97.09 % in velocity estimation accuracy, respectively.

Submitted 19 April, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

Comments: 15 pages, 12 figures, 7 tables

arXiv:2310.02361 [pdf, other]

doi 10.1145/3581783.3612147

Event-Enhanced Multi-Modal Spiking Neural Network for Dynamic Obstacle Avoidance

Authors: Yang Wang, Bo Dong, Yuji Zhang, Yunduo Zhou, Haiyang Mei, Ziqi Wei, Xin Yang

Abstract: Autonomous obstacle avoidance is of vital importance for an intelligent agent such as a mobile robot to navigate in its environment. Existing state-of-the-art methods train a spiking neural network (SNN) with deep reinforcement learning (DRL) to achieve energy-efficient and fast inference speed in complex/unknown scenes. These methods typically assume that the environment is static while the obstacles in real-world scenes are often dynamic. The movement of obstacles increases the complexity of the environment and poses a great challenge to the existing methods. In this work, we approach robust dynamic obstacle avoidance twofold. First, we introduce the neuromorphic vision sensor (i.e., event camera) to provide motion cues complementary to the traditional Laser depth data for handling dynamic obstacles. Second, we develop a DRL-based event-enhanced multimodal spiking actor network (EEM-SAN) that extracts information from motion events data via unsupervised representation learning and fuses Laser and event camera data with learnable thresholding. Experiments demonstrate that our EEM-SAN outperforms state-of-the-art obstacle avoidance methods by a significant margin, especially for dynamic obstacle avoidance.

Submitted 3 October, 2023; originally announced October 2023.

Comments: In Proceedings of the 31st ACM International Conference on Multimedia (ACM MM 2023)

arXiv:2310.01954 [pdf, other]

doi 10.1051/0004-6361/202346927

Performance of the joint LST-1 and MAGIC observations evaluated with Crab Nebula data

Authors: H. Abe, K. Abe, S. Abe, V. A. Acciari, A. Aguasca-Cabot, I. Agudo, N. Alvarez Crespo, T. Aniello, S. Ansoldi, L. A. Antonelli, C. Aramo, A. Arbet-Engels, C. Arcaro, M. Artero, K. Asano, P. Aubert, D. Baack, A. Babić, A. Baktash, A. Bamba, A. Baquero Larriva, L. Baroncelli, U. Barres de Almeida, J. A. Barrio, I. Batković, et al. (344 additional authors not shown)

Abstract: Aims. LST-1, the prototype of the Large-Sized Telescope for the upcoming Cherenkov Telescope Array Observatory, is concluding its commissioning in Observatorio del Roque de los Muchachos on the island of La Palma. The proximity of LST-1 (Large-Sized Telescope 1) to the two MAGIC (Major Atmospheric Gamma Imaging Cherenkov) telescopes permits observations of the same gamma-ray events with both systems. Methods. We describe the joint LST-1+MAGIC analysis pipeline and use simultaneous Crab Nebula observations and Monte Carlo simulations to assess the performance of the three-telescope system. The addition of the LST-1 telescope allows the recovery of events in which one of the MAGIC images is too dim to survive analysis quality cuts. Results. Thanks to the resulting increase in the collection area and stronger background rejection, we find a significant improvement in sensitivity, allowing the detection of 30% weaker fluxes in the energy range between 200 GeV and 3 TeV. The spectrum of the Crab Nebula, reconstructed in the energy range ~60 GeV to ~10 TeV, is in agreement with previous measurements.

Submitted 3 October, 2023; originally announced October 2023.

Comments: Accepted for publication in Astronomy & Astrophysics

Journal ref: A&A 680, A66 (2023)

arXiv:2309.14008 [pdf, other]

doi 10.1109/TVT.2023.3324436

Carrier Aggregation Enabled Integrated Sensing and Communication Signal Design and Processing

Authors: Zhiqing Wei, Haotian Liu, Xinyi Yang, Wangjun Jiang, Huici Wu, Xingwang Li, Zhiyong Feng

Abstract: The future mobile communication systems will support intelligent applications such as Internet of Vehicles (IoV) and Extended Reality (XR). Integrated Sensing and Communication (ISAC) is regarded as one of the key technologies satisfying the high data rate communication and highly accurate sensing for these intelligent applications in future mobile communication systems. With the explosive growth of wireless devices and services, the shortage of spectrum resources leads to the fragmentation of available frequency bands for ISAC systems, which degrades sensing performance. Facing the above challenges, this paper proposes a Carrier Aggregation (CA)-based ISAC signal aggregating high and low-frequency bands to improve the sensing performance, where the CA-based ISAC signal can use four different aggregated pilot structures for sensing. Then, an ISAC signal processing algorithm with Compressed Sensing (CS) is proposed and the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA) is used to solve the reconfiguration convex optimization problem. Finally, the Cramér-Rao Lower Bounds (CRLBs) are derived for the CA-based ISAC signal. Simulation results show that CA efficiently improves the accuracy of range and velocity estimation.

Submitted 28 November, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

Comments: 17 pages, 17 figures, already early access in IEEE Transactions on Vehicular Technology

arXiv:2309.11702 [pdf, other]

Incentivized Communication for Federated Bandits

Authors: Zhepei Wei, Chuanhao Li, Haifeng Xu, Hongning Wang

Abstract: Most existing works on federated bandits take it for granted that all clients are altruistic about sharing their data with the server for the collective good whenever needed. Despite their compelling theoretical guarantee on performance and communication efficiency, this assumption is overly idealistic and oftentimes violated in practice, especially when the algorithm is operated over self-interested clients, who are reluctant to share data without explicit benefits. Negligence of such self-interested behaviors can significantly affect the learning efficiency and even the practical operability of federated bandit learning. In light of this, we aim to spark new insights into this under-explored research area by formally introducing an incentivized communication problem for federated bandits, where the server shall motivate clients to share data by providing incentives. Without loss of generality, we instantiate this bandit problem with the contextual linear setting and propose the first incentivized communication protocol, namely, Inc-FedUCB, that achieves near-optimal regret with provable communication and incentive cost guarantees. Extensive empirical experiments on both synthetic and real-world datasets further validate the effectiveness of the proposed method across various environments.

Submitted 23 October, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

Comments: 25 pages, 4 figures. Accepted at NeurIPS 2023

arXiv:2309.11325 [pdf, other]

DISC-LawLLM: Fine-tuning Large Language Models for Intelligent Legal Services

Authors: Shengbin Yue, Wei Chen, Siyuan Wang, Bingxuan Li, Chenchen Shen, Shujun Liu, Yuxuan Zhou, Yao Xiao, Song Yun, Xuanjing Huang, Zhongyu Wei

Abstract: We propose DISC-LawLLM, an intelligent legal system utilizing large language models (LLMs) to provide a wide range of legal services. We adopt legal syllogism prompting strategies to construct supervised fine-tuning datasets in the Chinese Judicial domain and fine-tune LLMs with legal reasoning capability. We augment LLMs with a retrieval module to enhance models' ability to access and utilize external legal knowledge. A comprehensive legal benchmark, DISC-Law-Eval, is presented to evaluate intelligent legal systems from both objective and subjective dimensions. Quantitative and qualitative results on DISC-Law-Eval demonstrate the effectiveness of our system in serving various users across diverse legal scenarios. The detailed resources are available at https://github.com/FudanDISC/DISC-LawLLM.

Submitted 23 September, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

arXiv:2309.06431 [pdf, ps, other]

Limit theorems for critical faces above the vanishing threshold

Authors: Zifu Wei, Takashi Owada, D. Yogeshwaran

Abstract: We investigate convergence of point processes associated with critical faces for a Čech filtration built over a homogeneous Poisson point process in the $d$-dimensional flat torus. The convergence of our point process is established in terms of the $\mathcal M_0$-topology, when the connecting radius of a Čech complex decays to $0$, so slowly that critical faces are even less likely to occur than those in the regime of threshold for homological connectivity. We also obtain a series of limit theorems for positive and negative critical faces, all of which are considerably analogous to those for critical faces.

Submitted 21 September, 2023; v1 submitted 12 September, 2023; originally announced September 2023.

Comments: 21 pages; Minor notational and typographical edits made

MSC Class: 60D05. 60G55; 55U10

arXiv:2309.03408 [pdf, ps, other]

Linear Regularity and Strong CHIP of Closed Sets in Asplund Spaces

Authors: Zhou Wei, Michel Théra, Jen-Chih Yao

Abstract: In this paper, we mainly study linear regularity and two types of strong CHIP (given via Fréchet and limiting normal cones) for a collection of finitely many closed sets. We first prove characterizations of Asplund spaces in terms of linear regularity and intersection formulae of Fréchet normal cones. Several necessary conditions for linear regularity of closed sets are obtained via Fréchet/limiting normal cones in Asplund spaces. Then, we consider linear regularity for some special closed sets in convex-composite optimization and prove the equivalence result on linear regularity, strong Fréchet CHIP and property (G) so as to extend duality characterization for linear regularity of finitely many closed convex sets via strong CHIP and property (G) to the possibly non-convex case. As applications, we use these results on linear regularity and strong CHIP to study error bounds of inequality systems and give several dual criteria for error bounds via Fréchet normal cones and subdifferentials.

Submitted 6 September, 2023; originally announced September 2023.

arXiv:2309.03069 [pdf, other]

A New Smoothing Technique for Bang-Bang Optimal Control Problems

Authors: Kun Wang, Zheng Chen, Zhenyu Wei, Fangmin Lu, Jun Li

Abstract: Bang-bang control is ubiquitous for Optimal Control Problems (OCPs) where the constrained control variable appears linearly in the dynamics and cost function. Based on Pontryagin's Minimum Principle, the indirect method is widely used to numerically solve OCPs because it enables one to derive the theoretical structure of the optimal control. However, discontinuities in the bang-bang control structure may result in numerical difficulties for gradient-based indirect methods. In this case, smoothing or regularization procedures are usually applied to eliminate the discontinuities of bang-bang controls. Traditional smoothing or regularization procedures generally modify the cost function by adding a term depending on a small parameter, or introduce a small error into the state equation. Those procedures may complicate the numerical algorithms or degrade the convergence performance. To overcome these issues, we propose a bounded smooth function, called the normalized L2-norm function, to approximate the sign function in terms of the switching function. The resulting optimal control is smooth and can be readily embedded into the indirect method. Then, the simplicity and improved performance of the proposed method over some existing methods are numerically demonstrated by a minimal-time oscillator problem and a minimal-fuel low-thrust trajectory optimization problem that involves many revolutions.

Submitted 1 December, 2023; v1 submitted 6 September, 2023; originally announced September 2023.

Comments: This paper has been accepted for presentation at the 2024 AIAA Scitech

arXiv:2309.02519 [pdf, other]

Efficient Simulation of Low Temperature Physics in One-Dimensional Gapless Systems

Authors: Yuya Kusuki, Kotaro Tamaoka, Zixia Wei, Yasushi Yoneta

Abstract: We discuss the computational efficiency of the finite temperature simulation with the minimally entangled typical thermal states (METTS). To argue that METTS can be efficiently represented as matrix product states, we present an analytic upper bound for the average entanglement Renyi entropy of METTS for Renyi index $0<q\leq 1$. In particular, for 1D gapless systems described by CFTs, the upper bound scales as $\mathcal{O}(c N^0 \log β)$ where $c$ is the central charge and $N$ is the system size. Furthermore, we numerically find that the average Renyi entropy exhibits a universal behavior characterized by the central charge and is roughly given by half of the analytic upper bound. Based on these results, we show that METTS provide a significant speedup compared to employing the purification method to analyze thermal equilibrium states at low temperatures in 1D gapless systems.

Submitted 5 September, 2023; originally announced September 2023.

Comments: 6 pages, revtex

Report number: CALT-TH 2023-018, RIKEN-iTHEMS-Report-23, YITP-23-74

arXiv:2308.14346 [pdf, other]

DISC-MedLLM: Bridging General Large Language Models and Real-World Medical Consultation

Authors: Zhijie Bao, Wei Chen, Shengze Xiao, Kuang Ren, Jiaao Wu, Cheng Zhong, Jiajie Peng, Xuanjing Huang, Zhongyu Wei

Abstract: We propose DISC-MedLLM, a comprehensive solution that leverages Large Language Models (LLMs) to provide accurate and truthful medical response in end-to-end conversational healthcare services. To construct high-quality Supervised Fine-Tuning (SFT) datasets, we employ three strategies: utilizing medical knowledge-graphs, reconstructing real-world dialogues, and incorporating human-guided preference rephrasing. These datasets are instrumental in training DISC-MedLLM, surpassing existing medical LLMs in both single-turn and multi-turn consultation scenarios. Extensive experimental results demonstrate the effectiveness of the proposed model in bridging the gap between general language models and real-world medical consultation. Additionally, we release the constructed dataset and model weights to further contribute to research and development. Further details and resources can be found at https://github.com/FudanDISC/DISC-MedLLM

Submitted 28 August, 2023; originally announced August 2023.

Comments: Work in progress

arXiv:2308.11432 [pdf, other]

doi 10.1007/s11704-024-40231-1

A Survey on Large Language Model based Autonomous Agents

Authors: Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, Wayne Xin Zhao, Zhewei Wei, Ji-Rong Wen

Abstract: Autonomous agents have long been a prominent research focus in both academic and industry communities. Previous research in this field often focuses on training agents with limited knowledge within isolated environments, which diverges significantly from human learning processes, and thus makes it hard for the agents to achieve human-like decisions. Recently, through the acquisition of vast amounts of web knowledge, large language models (LLMs) have demonstrated remarkable potential in achieving human-level intelligence. This has sparked an upsurge in studies investigating LLM-based autonomous agents. In this paper, we present a comprehensive survey of these studies, delivering a systematic review of the field of LLM-based autonomous agents from a holistic perspective. More specifically, we first discuss the construction of LLM-based autonomous agents, for which we propose a unified framework that encompasses a majority of the previous work. Then, we present a comprehensive overview of the diverse applications of LLM-based autonomous agents in the fields of social science, natural science, and engineering. Finally, we delve into the evaluation strategies commonly used for LLM-based autonomous agents. Based on the previous studies, we also present several challenges and future directions in this field. To keep track of this field and continuously update our survey, we maintain a repository of relevant references at https://github.com/Paitesanshi/LLM-Agent-Survey.

Submitted 3 April, 2024; v1 submitted 22 August, 2023; originally announced August 2023.

Comments: 35 pages, 5 figures, 3 tables; accepted by Frontiers of Computer Science (FCS), doi: 10.1007/s11704-024-40231-1

arXiv:2308.11159 [pdf, other]

SwinV2DNet: Pyramid and Self-Supervision Compounded Feature Learning for Remote Sensing Images Change Detection

Authors: Dalong Zheng, Zebin Wu, Jia Liu, Zhihui Wei

Abstract: Among the current mainstream change detection networks, transformer is deficient in the ability to capture accurate low-level details, while convolutional neural network (CNN) is wanting in the capacity to understand global information and establish remote spatial relationships. Meanwhile, both of the widely used early fusion and late fusion frameworks are unable to learn complete change features well. Therefore, based on swin transformer V2 (Swin V2) and VGG16, we propose an end-to-end compounded dense network SwinV2DNet to inherit the advantages of both transformer and CNN and overcome the shortcomings of existing networks in feature learning. Firstly, it captures the change relationship features through the densely connected Swin V2 backbone, and provides the low-level pre-changed and post-changed features through a CNN branch. Based on these three change features, we accomplish accurate change detection results. Secondly, combined with transformer and CNN, we propose mixed feature pyramid (MFP) which provides inter-layer interaction information and intra-layer multi-scale information for complete feature learning. MFP is a plug and play module which is experimentally proven to be also effective in other change detection networks. Furthermore, we impose a self-supervision strategy to guide a new CNN branch, which solves the untrainable problem of the CNN branch and provides the semantic change information for the features of the encoder. The state-of-the-art (SOTA) change detection scores and fine-grained change maps were obtained compared with other advanced methods on four commonly used public remote sensing datasets. The code is available at https://github.com/DalongZ/SwinV2DNet.

Submitted 21 August, 2023; originally announced August 2023.

arXiv:2308.11114 [pdf, ps, other]

On Möbius functions from automorphic forms and a generalized Sarnak's conjecture

Authors: Zhining Wei, Shifan Zhao

Abstract: In this paper, we consider Möbius functions associated with two types of $L$-functions: Rankin-Selberg $L$-functions of symmetric powers of distinct holomorphic cusp forms and $L$-functions of Maass cusp forms. We show that these Möbius functions are weakly orthogonal to bounded sequences. As a direct corollary, a generalized Sarnak's conjecture holds for these two types of Möbius functions.

Submitted 21 August, 2023; originally announced August 2023.

Comments: 13 pages

MSC Class: 11F66; 11F30

arXiv:2308.10759 [pdf, other]

EALink: An Efficient and Accurate Pre-trained Framework for Issue-Commit Link Recovery

Authors: Chenyuan Zhang, Yanlin Wang, Zhao Wei, Yong Xu, Juhong Wang, Hui Li, Rongrong Ji

Abstract: Issue-commit links, as a type of software traceability links, play a vital role in various software development and maintenance tasks. However, they are typically deficient, as developers often forget or fail to create tags when making commits. Existing studies have deployed deep learning techniques, including pretrained models, to improve automatic issue-commit link recovery. Despite their promising performance, we argue that previous approaches have four main problems, hindering them from recovering links in large software projects. To overcome these problems, we propose an efficient and accurate pre-trained framework called EALink for issue-commit link recovery. EALink requires far fewer model parameters than existing pre-trained methods, bringing efficient training and recovery. Moreover, we design various techniques to improve the recovery accuracy of EALink. We construct a large-scale dataset and conduct extensive experiments to demonstrate the power of EALink. Results show that EALink outperforms the state-of-the-art methods by a large margin (15.23%-408.65%) on various evaluation metrics. Meanwhile, its training and inference overhead is orders of magnitude lower than existing methods.

Submitted 21 August, 2023; originally announced August 2023.

Comments: 13 pages, 6 figures, published to ASE

Journal ref: IEEE/ACM International Conference on Automated Software Engineering, 2023

arXiv:2308.09916 [pdf, other]

VI-Net: Boosting Category-level 6D Object Pose Estimation via Learning Decoupled Rotations on the Spherical Representations

Authors: Jiehong Lin, Zewei Wei, Yabin Zhang, Kui Jia

Abstract: Rotation estimation of high precision from an RGB-D object observation is a huge challenge in 6D object pose estimation, due to the difficulty of learning in the non-linear space of SO(3). In this paper, we propose a novel rotation estimation network, termed as VI-Net, to make the task easier by decoupling the rotation as the combination of a viewpoint rotation and an in-plane rotation. More specifically, VI-Net bases the feature learning on the sphere with two individual branches for the estimates of two factorized rotations, where a V-Branch is employed to learn the viewpoint rotation via binary classification on the spherical signals, while another I-Branch is used to estimate the in-plane rotation by transforming the signals to view from the zenith direction. To process the spherical signals, a Spherical Feature Pyramid Network is constructed based on a novel design of SPAtial Spherical Convolution (SPA-SConv), which settles the boundary problem of spherical signals via feature padding and realizes viewpoint-equivariant feature extraction by symmetric convolutional operations. We apply the proposed VI-Net to the challenging task of category-level 6D object pose estimation for predicting the poses of unknown objects without available CAD models; experiments on the benchmarking datasets confirm the efficacy of our method, which outperforms the existing ones by a large margin in the regime of high precision.

Submitted 19 August, 2023; originally announced August 2023.

Comments: Accepted by ICCV2023. Project Page: https://github.com/JiehongLin/VI-Net

arXiv:2308.09022 [pdf, other]

ARAI-MVSNet: A multi-view stereo depth estimation network with adaptive depth range and depth interval

Authors: Song Zhang, Wenjia Xu, Zhiwei Wei, Lili Zhang, Yang Wang, Junyi Liu

Abstract: Multi-View Stereo (MVS) is a fundamental problem in geometric computer vision which aims to reconstruct a scene using multi-view images with known camera parameters. However, the mainstream approaches represent the scene with a fixed all-pixel depth range and equal depth interval partition, which will result in inadequate utilization of depth planes and imprecise depth estimation. In this paper, we present a novel multi-stage coarse-to-fine framework to achieve adaptive all-pixel depth range and depth interval. We predict a coarse depth map in the first stage, then an Adaptive Depth Range Prediction module is proposed in the second stage to zoom in on the scene by leveraging the reference image and the obtained depth map in the first stage and predict a more accurate all-pixel depth range for the following stages. In the third and fourth stages, we propose an Adaptive Depth Interval Adjustment module to achieve adaptive variable interval partition for pixel-wise depth range. The depth interval distribution in this module is normalized by Z-score, which can allocate dense depth hypothesis planes around the potential ground truth depth value and vice versa to achieve more accurate depth estimation. Extensive experiments on four widely used benchmark datasets (DTU, TnT, BlendedMVS, ETH 3D) demonstrate that our model achieves state-of-the-art performance and yields competitive generalization ability. Particularly, our method achieves the highest Acc and Overall on the DTU dataset, while attaining the highest Recall and $F_{1}$-score on the Tanks and Temples intermediate and advanced dataset. Moreover, our method also achieves the lowest $e_{1}$ and $e_{3}$ on the BlendedMVS dataset and the highest Acc and $F_{1}$-score on the ETH 3D dataset, surpassing all listed methods. Project website: https://github.com/zs670980918/ARAI-MVSNet

Submitted 17 August, 2023; originally announced August 2023.

arXiv:2308.08788 [pdf]

Nature-inspired three-dimensional surface serration topologies enable silent flight by suppressing airfoil-turbulence interaction noise

Authors: Zixiao Wei, Stanley Wang, Sean Farris, Naga Chennuri, Ningping Wang, Stara Shinsato, Kahraman Demir, Maya Horii, Grace X. Gu

Abstract: As natural predators, owls fly with astonishing stealth due to the sophisticated serrated surface morphology of their feathers that produces advantageous flow characteristics and favorable boundary layer structures. Traditionally, these serrations are tailored for airfoil edges with simple two-dimensional patterns, limiting their effect on overall noise reduction while negotiating tradeoffs in aerodynamic performance. Here, we formulate new design strategies that can mitigate tradeoffs between noise reduction and aerodynamic performance by merging owl feather and cicada insect wing geometries to create a three-dimensional topology that features silent and efficient flight. Aeroacoustics and aerodynamics experimental results show that the application of our hybrid topology yields a reduction in overall sound pressure levels by up to 9.93% and an increase in propulsive efficiency by over 48.14% compared to benchmark designs. Computational fluid dynamics simulations reveal that the three-dimensional, owl-inspired surface serrations can enhance surface vorticity. The produced coherent vortex structures serve to suppress the source strength of dipole and quadrupole pressure sources at various Reynolds numbers, resulting in a universal noise reduction effect. Our work demonstrates how a bioinspired three-dimensional serration topology refines the turbulence-airfoil interaction mode and improves multiple functionalities of an aerodynamic surface to enable quieter and more fuel-efficient aerial vehicles.

Submitted 17 August, 2023; originally announced August 2023.

Comments: 33 pages

arXiv:2308.08101 [pdf]

The interface states in gate-all-around transistors (GAAFETs)

Authors: Yue-Yang Liu, Haoran Lu, Zirui Wang, Hui-Xiong Deng, Lang Zeng, Zhongming Wei, Jun-Wei Luo, Runsheng Wang

Abstract: The atomic-level structural detail and the quantum effects are becoming crucial to device performance as the emerging advanced transistors, representatively GAAFETs, are scaling down towards sub-3nm nodes. However, a multiscale simulation framework based on atomistic models and ab initio quantum simulation is still absent. Here, we propose such a simulation framework by fulfilling three challenging tasks, i.e., building atomistic all-around interfaces between semiconductor and amorphous gate-oxide, conducting large-scale first-principles calculations on the interface models containing up to 2796 atoms, and finally bridging the state-of-the-art atomic level calculation to commercial TCAD. With this framework, two unnoticed origins of interface states are demonstrated, and their tunability by changing channel size, orientation and geometry is confirmed. The quantitative study of interface states and their effects on device performance explains why the nanosheet channel is preferred in industry. We believe such a bottom-up framework is necessary and promising for the accurate simulation of emerging advanced transistors.

Submitted 15 August, 2023; originally announced August 2023.

arXiv:2308.06874 [pdf, ps, other]

Joint Data Collection and Sensor Positioning in Multi-UAV-Assisted Wireless Sensor Network

Authors: Mingyue Zhu, Zhiqing Wei, Chen Qiu, Wangjun Jiang, Huici Wu, Zhiying Feng

Abstract: Due to their high mobility and easy deployment, unmanned aerial vehicles (UAVs) have attracted much attention in the field of wireless communication and positioning. To meet the challenges of lack of infrastructure coverage, uncertain sensor position and large amount of sensing data collection in wireless sensor network (WSN), this paper presents an efficient joint data collection and sensor positioning scheme for WSN supported by multiple UAVs. Specifically, a UAV is set as the main UAV to collect data, and other UAVs are used as auxiliary UAVs for sensor positioning using time difference of arrival (TDoA). A mixed-integer non-convex optimization problem with uncertain sensor position is established. The goal is to minimize the average positioning error of all sensors by jointly optimizing the UAV trajectories, sensor transmission schedule and positioning observation points (POPs). To solve this optimization model, the original problem is decomposed into two sub-problems based on the path discrete method. Firstly, the block coordinate descent (BCD) and successive convex approximation (SCA) techniques are applied to iteratively optimize the trajectory of the main UAV and the sensor transmission schedule, so as to maximize the minimum amount of data uploaded by the sensor. Then, based on the trajectory of the main UAV, a particle swarm optimization (PSO)-based algorithm is designed to optimize the POPs of UAVs. Finally, the spline curve is applied to generate the trajectories of auxiliary UAVs. The simulation results show that the proposed scheme can meet the requirements of data collection and has a good positioning performance.

Submitted 13 August, 2023; originally announced August 2023.

arXiv:2308.06732 [pdf, ps, other]

UD-MAC: Delay Tolerant Multiple Access Control Protocol for Unmanned Aerial Vehicle Networks

Authors: Yingying Zou, Zhiqing Wei, Yanpeng Cui, Xinyi Liu, Zhiyong Feng

Abstract: In unmanned aerial vehicle (UAV) networks, high-capacity data transmission is of utmost importance for applications such as intelligent transportation, smart cities, and forest monitoring, which rely on the mobility of UAVs to collect and transmit large amounts of data, including video and image data. Due to the short flight time of UAVs, the network capacity will be reduced when they return to the ground unit for charging. Hence, we suggest that UAVs can apply a store-carry-and-forward (SCF) transmission mode to carry packets on their way back to the ground unit to improve network throughput. In this paper, we propose a novel protocol, named UAV delay-tolerant multiple access control (UD-MAC), which can support different transmission modes in UAV networks. We set a higher priority for SCF transmission and analyze the probability of being in SCF mode to derive network throughput. The simulation results show that the network throughput of UD-MAC is improved by 57% to 83% compared to VeMAC.

Submitted 13 August, 2023; originally announced August 2023.

arXiv:2308.06702 [pdf, ps, other]

doi 10.1109/TVT.2023.3304856

Symbol-level Integrated Sensing and Communication enabled Multiple Base Stations Cooperative Sensing

Authors: Zhiqing Wei, Ruizhong Xu, Zhiyong Feng, Huici Wu, Ning Zhang, Wangjun Jiang, Xiaoyu Yang

Abstract: With the support of integrated sensing and communication (ISAC) technology, mobile communication systems will integrate the function of wireless sensing, thereby facilitating new intelligent applications such as smart cities and intelligent transportation. Due to the limited sensing accuracy and sensing range of a single base station (BS), multi-BS cooperative sensing can be applied to realize high-accuracy, long-range and continuous sensing, exploiting the specific advantages of large-scale networked mobile communication systems. This paper proposes a cooperative sensing method suitable for mobile communication systems, which applies symbol-level sensing information fusion to estimate the location and velocity of a target. With the demodulation symbols obtained from the echo signals of multiple BSs, the phase features contained in the demodulation symbols are used in the fusion procedure, which realizes cooperative sensing with the synchronization level of a mobile communication system. Compared with the signal-level fusion in the area of distributed aperture coherence-synthetic radars, the requirement of synchronization is much lower. When the signal-to-noise ratio (SNR) is -5 dB, the evaluation shows that symbol-level multi-BS cooperative sensing effectively improves the accuracy of distance and velocity estimation of the target. Compared with single-BS sensing, the accuracy of distance and velocity estimation is improved by 40% and 72%, respectively. Compared with data-level multi-BS cooperative sensing based on maximum likelihood (ML) estimation, the accuracy of location and velocity estimation is improved by 12% and 63%, respectively. This work may provide a guideline for the design of multi-BS cooperative sensing systems that exploit the widely deployed networked mobile communication system.

Submitted 13 August, 2023; originally announced August 2023.

Comments: 15 pages, 17 figures, 2 tables

arXiv:2308.05754 [pdf, other]

SLAM for Multiple Extended Targets using 5G Signal

Authors: Wangjun Jiang, Zhiqing Wei, Zhiyong Feng

Abstract: 5th Generation (5G) mobile communication systems operating at around 28 GHz have the potential to be applied to simultaneous localization and mapping (SLAM). Most existing 5G SLAM studies estimate the environment as many point targets, instead of extended targets. In this paper, we focus on the performance analysis of 5G SLAM for multiple extended targets. To evaluate the mapping performance of multiple extended targets, a new mapping error metric, named extended targets generalized optimal sub-pattern assignment (ET-GOPSA), is proposed in this paper. Compared with the existing metrics, ET-GOPSA considers not only the accuracy error of target estimation, the cost of missing detection, and the cost of false detection, but also the cost of matching the estimated point with the extended target. To evaluate the performance of 5G signals in SLAM, we analyze and simulate the mapping error of 5G signal sensing by ET-GOPSA. Simulation results show that, under the condition of SNR = 10 dB, 5G signal sensing can barely meet the requirements of SLAM for multiple extended targets with a carrier frequency of 28 GHz, a bandwidth of 1.23 GHz, and an antenna size of 32.

Submitted 30 November, 2023; v1 submitted 4 August, 2023; originally announced August 2023.

arXiv:2308.05236 [pdf, other]

doi 10.1103/PhysRevB.108.L241302

Anyonic Mach-Zehnder interferometer on a single edge of a 2D electron gas

Authors: Navketan Batra, Zezhu Wei, Smitha Vishveshwara, D. E. Feldman

Abstract: Anyonic Fabry-Pérot and Mach-Zehnder interferometers have been proposed theoretically and implemented experimentally as tools to probe electric charges and statistics of anyons. The experimentally observed visibility of Aharonov-Bohm oscillations is maximal at a high transmission through an interferometer, but simple theoretical expressions for the electric currents and noises are only available at low visibility. We consider an alternative version of a Mach-Zehnder interferometer, in which anyons tunnel between co-propagating chiral channels on the edges of quantum Hall liquids at the filling factors $n/(2n+1)$. We find simple exact solutions for any transmission. The solutions allow a straightforward interpretation in terms of fractional charges and statistics.

Submitted 9 August, 2023; originally announced August 2023.

Comments: 6 pages, 2 figures

Journal ref: Phys. Rev. B 108, L241302 (2023)

arXiv:2308.03646 [pdf, other]

doi 10.1103/PhysRevB.109.L041104

Universal entanglement signatures of interface conformal field theories

Authors: Qicheng Tang, Zixia Wei, Yin Tang, Xueda Wen, W. Zhu

Abstract: An interface connecting two distinct conformal field theories hosts rich critical behaviors. In this work, we investigate the entanglement properties of such critical interface theories for probing the underlying universality. As inspired by holographic perspectives, we demonstrate vital features of various entanglement measures regarding such interfaces based on several paradigmatic lattice models. Crucially, for two subsystems adjacent at the interface, the mutual information and the reflected entropy exhibit identical leading logarithmic scaling, giving an effective interface central charge that takes the same value as the smaller central charge of the two conformal field theories. Our work demonstrates that the entanglement measure offers a powerful tool to explore the rich physics in critical interface theories.

Submitted 8 January, 2024; v1 submitted 7 August, 2023; originally announced August 2023.

Comments: 7 pages main text, 31 pages supplementary materials

Report number: YITP-23-100, RIKEN-iTHEMS-Report-23

Journal ref: Phys. Rev. B 109, L041104 (2024)

arXiv:2308.03072 [pdf, other]

Customizing Textile and Tactile Skins for Interactive Industrial Robots

Authors: Bo Ying Su, Zhongqi Wei, James McCann, Wenzhen Yuan, Changliu Liu

Abstract: Tactile skins made from textiles enhance robot-human interaction by localizing contact points and measuring contact forces. This paper presents a solution for rapidly fabricating, calibrating, and deploying these skins on industrial robot arms. The novel automated skin calibration procedure maps skin locations to robot geometry and calibrates contact force. Through experiments on a FANUC LR Mate 200id/7L industrial robot, we demonstrate that tactile skins made from textiles can be effectively used for human-robot interaction in industrial environments, and can provide unique opportunities in robot control and learning, making them a promising technology for enhancing robot perception and interaction.

Submitted 6 August, 2023; originally announced August 2023.

arXiv:2308.02782 [pdf]

doi 10.1364/OL.501622

Non-line-of-sight reconstruction via structure sparsity regularization

Authors: Duolan Huang, Quan Chen, Zhun Wei, Rui Chen

Abstract: Non-line-of-sight (NLOS) imaging allows for the imaging of objects around a corner, which enables potential applications in various fields such as autonomous driving, robotic vision, medical imaging, security monitoring, etc. However, the quality of reconstruction is challenged by low signal-to-noise ratio (SNR) measurements. In this study, we present a regularization method, referred to as structure sparsity (SS) regularization, for denoising in NLOS reconstruction. By exploiting the prior knowledge of structure sparseness, we incorporate nuclear norm penalization into the cost function of the directional light-cone transform (DLCT) model for NLOS imaging systems. This incorporation effectively integrates the neighborhood information associated with the directional albedo, thereby facilitating the denoising process. Subsequently, the reconstruction is achieved by optimizing a directional albedo model with SS regularization using the fast iterative shrinkage-thresholding algorithm. Notably, robust reconstruction of occluded objects is observed. Through comprehensive evaluations conducted on both synthetic and experimental datasets, we demonstrate that the proposed approach yields high-quality reconstructions, surpassing the state-of-the-art reconstruction algorithms, especially in scenarios involving short exposure and low SNR measurements.

Submitted 4 August, 2023; originally announced August 2023.

Comments: 8 pages, 5 figures

arXiv:2308.01266 [pdf, ps, other]

Deformations of cohesive modules on compact complex manifolds

Authors: Zhaoting Wei

Abstract: Cohesive modules give a dg-enhancement of the bounded derived category of coherent sheaves on a complex manifold via superconnections. In this paper we discuss the deformation theory of cohesive modules on compact complex manifolds. This generalizes the deformation theory of holomorphic vector bundles and coherent sheaves. We also develop the theory of Kuranishi maps and obstructions of deformations of cohesive modules and give some examples of unobstructed deformations.

Submitted 4 September, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

Comments: 34 pages, Proposition 5.2 in Section 5.3 modified

MSC Class: 18D20; 32G08; 32Q99; 53C05

arXiv:2308.00452 [pdf, other]

A Majority Invariant Approach to Patch Robustness Certification for Deep Learning Models

Authors: Qilin Zhou, Zhengyuan Wei, Haipeng Wang, W. K. Chan

Abstract: Patch robustness certification ensures no patch within a given bound on a sample can manipulate a deep learning model to predict a different label. However, existing techniques cannot certify samples that cannot meet their strict bars at the classifier or patch region levels. This paper proposes MajorCert. MajorCert first finds all possible label sets manipulatable by the same patch region on the same sample across the underlying classifiers, then enumerates their combinations element-wise, and finally checks whether the majority invariant of all these combinations is intact to certify samples.

Submitted 7 September, 2023; v1 submitted 1 August, 2023; originally announced August 2023.

Comments: 5 pages, 2 figures, accepted for inclusion in the ASE 2023 NIER track

arXiv:2307.15074 [pdf, other]

ISAC-NET: Model-driven Deep Learning for Integrated Passive Sensing and Communication

Authors: Wangjun Jiang, Dingyou Ma, Zhiqing Wei, Zhiyong Feng, Ping Zhang

Abstract: Recent advances in wireless communication with the enormous demands of sensing ability have given rise to the integrated sensing and communication (ISAC) technology, among which passive sensing plays an important role. The main challenge of passive sensing is how to achieve high sensing performance in the presence of communication demodulation errors. In this paper, we propose an ISAC network (ISAC-NET) that combines passive sensing with communication signal detection by using model-driven deep learning (DL). Unlike existing passive sensing algorithms that first demodulate the transmitted symbols and then obtain passive sensing results from the demodulated symbols, ISAC-NET obtains passive sensing results and communication demodulated symbols simultaneously. Different from the data-driven DL method, we adopt the block-by-block signal processing method that divides the ISAC-NET into the passive sensing module, signal detection module and channel reconstruction module. From the simulation results, ISAC-NET obtains better communication performance than the traditional signal demodulation algorithm, which is close to OAMP-Net2. Compared to the 2D-DFT algorithm, ISAC-NET demonstrates significantly enhanced sensing performance. In summary, ISAC-NET is a promising tool for passive sensing and communication in wireless communications.

Submitted 14 July, 2023; originally announced July 2023.

Comments: 29 pages, 11 figures

arXiv:2307.13162 [pdf, other]

Estimating Single-Node PageRank in $\tilde{O}\left(\min\{d_t, \sqrt{m}\}\right)$ Time

Authors: Hanzhi Wang, Zhewei Wei

Abstract: PageRank is a famous measure of graph centrality that has numerous applications in practice. The problem of computing a single node's PageRank has been the subject of extensive research over a decade. However, existing methods still incur large time complexities despite years of efforts. Even on undirected graphs, where PageRank scores hold several valuable properties, the problem of locally approximating the PageRank score of a target node remains a challenging task. Two commonly adopted techniques, Monte-Carlo based random walks and backward push, both cost $O(n)$ time in the worst-case scenario, which hinders existing methods from achieving a sublinear time complexity like $O(\sqrt{m})$ on an undirected graph with $n$ nodes and $m$ edges. In this paper, we focus on the problem of single-node PageRank computation on undirected graphs. We propose a novel algorithm, SetPush, for estimating single-node PageRank specifically on undirected graphs. With non-trivial analysis, we prove that our SetPush achieves the $\tilde{O}\left(\min\left\{d_t, \sqrt{m}\right\}\right)$ time complexity for estimating the target node $t$'s PageRank with constant relative error and constant failure probability on undirected graphs. We conduct comprehensive experiments to demonstrate the effectiveness of SetPush.

Submitted 26 July, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

Comments: Technical Report

Journal ref: PVLDB 2023

arXiv:2307.12591 [pdf, other]

SwinMM: Masked Multi-view with Swin Transformers for 3D Medical Image Segmentation

Authors: Yiqing Wang, Zihan Li, Jieru Mei, Zihao Wei, Li Liu, Chen Wang, Shengtian Sang, Alan Yuille, Cihang Xie, Yuyin Zhou

Abstract: Recent advancements in large-scale Vision Transformers have made significant strides in improving pre-trained models for medical image segmentation. However, these methods face a notable challenge in acquiring a substantial amount of pre-training data, particularly within the medical field. To address this limitation, we present Masked Multi-view with Swin Transformers (SwinMM), a novel multi-view pipeline for enabling accurate and data-efficient self-supervised medical image analysis. Our strategy harnesses the potential of multi-view information by incorporating two principal components. In the pre-training phase, we deploy a masked multi-view encoder devised to concurrently train masked multi-view observations through a range of diverse proxy tasks. These tasks span image reconstruction, rotation, contrastive learning, and a novel task that employs a mutual learning paradigm. This new task capitalizes on the consistency between predictions from various perspectives, enabling the extraction of hidden multi-view information from 3D medical data. In the fine-tuning stage, a cross-view decoder is developed to aggregate the multi-view information through a cross-attention block. Compared with the previous state-of-the-art self-supervised learning method Swin UNETR, SwinMM demonstrates a notable advantage on several medical image segmentation tasks. It allows for a smooth integration of multi-view information, significantly boosting both the accuracy and data-efficiency of the model. Code and models are available at https://github.com/UCSC-VLAA/SwinMM/.

Submitted 24 July, 2023; originally announced July 2023.

Comments: MICCAI 2023; project page: https://github.com/UCSC-VLAA/SwinMM/

arXiv:2307.11403 [pdf, other]

Channel Estimation for RIS-Aided MIMO Systems: A Partially Decoupled Atomic Norm Minimization Approach

Authors: Yonghui Chu, Zhiqiang Wei, Zai Yang, Derrick Wing Kwan Ng

Abstract: Channel estimation (CE) plays a key role in reconfigurable intelligent surface (RIS)-aided multiple-input multiple-output (MIMO) communication systems, while it poses a challenging task due to the passive nature of RIS and the cascaded channel structures. In this paper, a partially decoupled atomic norm minimization (PDANM) framework is proposed for CE of RIS-aided MIMO systems, which exploits the three-dimensional angular sparsity of the channel. In particular, PDANM partially decouples the differential angles at the RIS from other angles at the base station and user equipment, reducing the computational complexity compared with existing methods. A reweighted PDANM (RPDANM) algorithm is proposed to further improve CE accuracy, which iteratively refines CE through a specifically designed reweighting strategy. Building upon RPDANM, we propose an iterative approach named RPDANM with adaptive phase control (RPDANM-APC), which adaptively adjusts the RIS phases based on previously estimated channel parameters to facilitate CE, achieving superior CE accuracy while reducing training overhead. Numerical simulations demonstrate the superiority of our proposed approaches in terms of running time, CE accuracy, and training overhead. In particular, the RPDANM-APC approach can achieve higher CE accuracy than existing methods with less than 40 percent training overhead while reducing the running time by tens of times.

Submitted 25 August, 2023; v1 submitted 21 July, 2023; originally announced July 2023.

Comments: 35 pages, 9 figures. Part of this paper has been accepted by the 2023 IEEE Global Communications Conference (GLOBECOM)

arXiv:2307.10554 [pdf, other]

EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization

Authors: Peijie Dong, Lujun Li, Zimian Wei, Xin Niu, Zhiliang Tian, Hengyue Pan

Abstract: Mixed-Precision Quantization (MQ) can achieve a competitive accuracy-complexity trade-off for models. Conventional training-based search methods require time-consuming candidate training to search optimized per-layer bit-width configurations in MQ. Recently, some training-free approaches have presented various MQ proxies and significantly improved search efficiency. However, the correlation between these proxies and quantization accuracy is poorly understood. To address the gap, we first build the MQ-Bench-101, which involves different bit configurations and quantization results. Then, we observe that the existing training-free proxies exhibit weak correlations on the MQ-Bench-101. To efficiently seek superior proxies, we develop an automatic search of proxies framework for MQ via evolving algorithms. In particular, we devise an elaborate search space involving the existing proxies and perform an evolution search to discover the best correlated MQ proxy. We propose a diversity-prompting selection strategy and compatibility screening protocol to avoid premature convergence and improve search efficiency. In this way, our Evolving proxies for Mixed-precision Quantization (EMQ) framework allows the auto-generation of proxies without heavy tuning and expert knowledge. Extensive experiments on ImageNet with various ResNet and MobileNet families demonstrate that our EMQ obtains superior performance to state-of-the-art mixed-precision methods at a significantly reduced cost. The code will be released.

Submitted 19 July, 2023; originally announced July 2023.

Comments: Accepted by ICCV2023

arXiv:2307.09362 [pdf, other]

Disentangle then Parse: Night-time Semantic Segmentation with Illumination Disentanglement

Authors: Zhixiang Wei, Lin Chen, Tao Tu, Huaian Chen, Pengyang Ling, Yi Jin

Abstract: Most prior semantic segmentation methods have been developed for day-time scenes, while typically underperforming in night-time scenes due to insufficient and complicated lighting conditions. In this work, we tackle this challenge by proposing a novel night-time semantic segmentation paradigm, i.e., disentangle then parse (DTP). DTP explicitly disentangles night-time images into light-invariant reflectance and light-specific illumination components and then recognizes semantics based on their adaptive fusion. Concretely, the proposed DTP comprises two key components: 1) Instead of processing lighting-entangled features as in prior works, our Semantic-Oriented Disentanglement (SOD) framework enables the extraction of reflectance component without being impeded by lighting, allowing the network to consistently recognize the semantics under cover of varying and complicated lighting conditions. 2) Based on the observation that the illumination component can serve as a cue for some semantically confused regions, we further introduce an Illumination-Aware Parser (IAParser) to explicitly learn the correlation between semantics and lighting, and aggregate the illumination features to yield more precise predictions. Extensive experiments on the night-time segmentation task with various settings demonstrate that DTP significantly outperforms state-of-the-art methods. Furthermore, with negligible additional parameters, DTP can be directly used to benefit existing day-time methods for night-time segmentation.

Submitted 19 July, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

Comments: Accepted by ICCV2023

arXiv:2307.08437 [pdf, other]

Synthesis of single-crystalline LuN films

Authors: Guanhua Su, Shuling Xiang, Jiachang Bi, Fugang Qi, Peiyi Li, Shunda Zhang, Shaozhu Xiao, Ruyi Zhang, Zhiyang Wei, Yanwei Cao

Abstract: In the nitrogen-doped lutetium hydride (Lu-H-N) system, the presence of Lu-N chemical bonds plays a key role in the emergence of possible room-temperature superconductivity at near ambient pressure. However, because synthesizing single-crystalline LuN is a big challenge, the understanding of LuN is insufficient thus far. Here, we report on the epitaxial growth of single-crystalline LuN films. The crystal structures of LuN films were characterized by high-resolution X-ray diffraction. The measurement of low-temperature electrical transport indicates the LuN film is semiconducting from 300 to 2 K, yielding an activation gap of $\sim$ 0.02 eV. Interestingly, negative magnetoresistances can be observed below 12 K, which can result from the defects and magnetic impurities in LuN films. Our results uncover the electronic and magnetic properties of single-crystalline LuN films.

Submitted 17 July, 2023; originally announced July 2023.

arXiv:2307.08016 [pdf, other]

Breaking Down the Task: A Unit-Grained Hybrid Training Framework for Vision and Language Decision Making

Authors: Ruipu Luo, Jiwen Zhang, Zhongyu Wei

Abstract: Vision language decision making (VLDM) is a challenging multimodal task. The agent has to understand complex human instructions and complete compositional tasks involving environment navigation and object manipulation. However, the long action sequences involved in VLDM make the task difficult to learn. From an environment perspective, we find that task episodes can be divided into fine-grained units, each containing a navigation phase and an interaction phase. Since the environment within a unit stays unchanged, we propose a novel hybrid-training framework that enables active exploration in the environment and reduces the exposure bias. Such a framework leverages the unit-grained configurations and is model-agnostic. Specifically, we design a Unit-Transformer (UT) with an intrinsic recurrent state that maintains a unit-scale cross-modal memory. Through extensive experiments on the TEACH benchmark, we demonstrate that our proposed framework outperforms existing state-of-the-art methods in terms of all evaluation metrics. Overall, our work introduces a novel approach to tackling the VLDM task by breaking it down into smaller, manageable units and utilizing a hybrid-training framework. By doing so, we provide a more flexible and effective solution for multimodal decision making.

Submitted 16 July, 2023; originally announced July 2023.

Showing 201–250 of 976 results for author: Wei, Z