Machine Learning in Communications: A Road to Intelligent Transmission and Processing

Abstract: Prior to the era of artificial intelligence and big data, wireless communications primarily followed a conventional research route involving problem analysis, model building and calibration, algorithm design and tuning, and holistic and empirical verification. However, this methodology often encountered limitations when dealing with large-scale and complex problems and managing dynamic and massive data, resulting in inefficiencies and limited performance of traditional communication systems and methods. As such, wireless communications have embraced the revolutionary impact of artificial intelligence and machine learning, giving birth to more adaptive, efficient, and intelligent systems and algorithms. This technological shift opens a road to intelligent information transmission and processing. This overview article discusses the typical roles of machine learning in intelligent wireless communications, as well as its features, challenges, and practical considerations.

Submitted 16 July, 2024; originally announced July 2024.

Comments: Invited Article

arXiv:2407.02251 [pdf, other]

White-Box 3D-OMP-Transformer for ISAC

Authors: Bowen Zhang, Geoffrey Ye Li

Abstract: Transformers have found broad applications for their great ability to capture long-range dependency among the inputs using attention mechanisms. The recent success of transformers increases the need for mathematical interpretation of their underlying working mechanisms, leading to the development of a family of white-box transformer-like deep network architectures. However, designing white-box transformers with efficient three-dimensional (3D) attention is still an open challenge. In this work, we revisit the 3D-orthogonal matching pursuit (OMP) algorithm and demonstrate that the operation of 3D-OMP is analogous to a specific kind of transformer with 3D attention. Therefore, we build a white-box 3D-OMP-transformer by introducing additional learnable parameters to 3D-OMP. As a transformer, its 3D-attention can be mathematically interpreted from 3D-OMP; while as a variant of OMP, it can learn to improve the matching pursuit process from data. Besides, a transformer's performance can be improved by stacking more transformer blocks. To simulate this process, we design a cascaded 3D-OMP-Transformer with dynamic small-scale dictionaries, which can improve the performance of the 3D-OMP-Transformer with low costs. We evaluate the designed 3D-OMP-transformer in the multi-target detection task of integrated sensing and communications (ISAC). Experimental results show that the designed 3D-OMP-Transformer can outperform current baselines.

Submitted 2 July, 2024; originally announced July 2024.
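
For readers unfamiliar with the baseline the abstract builds on, the following is a minimal sketch of classical orthogonal matching pursuit in its ordinary one-dimensional, real-valued form. It is not the paper's 3D-OMP or its learnable transformer variant; the function names (`omp`, `solve_linear`, `dot`) and the toy dictionary are illustrative assumptions. Each iteration has a matching step (pick the atom most correlated with the residual) and a pursuit step (least-squares refit on the selected atoms).

```python
def dot(u, v):
    """Inner product of two equal-length real vectors."""
    return sum(a * b for a, b in zip(u, v))


def solve_linear(A, b):
    """Solve a small square system A x = b by Gauss-Jordan elimination
    with partial pivoting (hypothetical helper, adequate for tiny systems)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def omp(atoms, y, sparsity):
    """Classical OMP. `atoms` is a list of dictionary columns (unit norm
    assumed), `y` is the measurement, `sparsity` the number of iterations."""
    residual = y[:]
    support, coeffs = [], []
    for _ in range(sparsity):
        # Matching: score every atom by its correlation with the residual,
        # excluding atoms that were already selected.
        scores = [abs(dot(atom, residual)) for atom in atoms]
        for j in support:
            scores[j] = -1.0
        support.append(max(range(len(atoms)), key=lambda j: scores[j]))
        # Pursuit: least-squares fit of y on the selected atoms
        # via the normal equations G c = g.
        G = [[dot(atoms[i], atoms[j]) for j in support] for i in support]
        g = [dot(atoms[i], y) for i in support]
        coeffs = solve_linear(G, g)
        # Residual is y minus its projection onto the selected atoms.
        approx = [sum(c * atoms[j][t] for c, j in zip(coeffs, support))
                  for t in range(len(y))]
        residual = [yi - ai for yi, ai in zip(y, approx)]
    return support, coeffs, residual


# Toy dictionary with four unit-norm atoms in R^3 (made-up values).
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.6, 0.8, 0.0]]
y = [2.0, 0.0, -1.0]  # equals 2 * atoms[0] - 1 * atoms[2]
support, coeffs, residual = omp(atoms, y, sparsity=2)
```

In this sketch the correlation scores play the role that the abstract attributes to attention; how the paper extends this to three dimensions and adds learnable parameters is not reproduced here.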

arXiv:2407.01640 [pdf, other]

BADM: Batch ADMM for Deep Learning

Authors: Ouya Wang, Shenglong Zhou, Geoffrey Ye Li

Abstract: Stochastic gradient descent-based algorithms are widely used for training deep neural networks but often suffer from slow convergence. To address the challenge, we leverage the framework of the alternating direction method of multipliers (ADMM) to develop a novel data-driven algorithm, called batch ADMM (BADM). The fundamental idea of the proposed algorithm is to split the training data into batches, which is further divided into sub-batches where primal and dual variables are updated to generate global parameters through aggregation. We evaluate the performance of BADM across various deep learning tasks, including graph modelling, computer vision, image generation, and natural language processing. Extensive numerical experiments demonstrate that BADM achieves faster convergence and superior testing accuracy compared to other state-of-the-art optimizers.

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2406.09238 [pdf, other]

Near-Field Multiuser Communications based on Sparse Arrays

Authors: Kangjian Chen, Chenhao Qi, Geoffrey Ye Li, Octavia A. Dobre

Abstract: This paper considers near-field multiuser communications based on sparse arrays (SAs). First, for the uniform SAs (USAs), we analyze the beam gains of channel steering vectors, which shows that increasing the antenna spacings can effectively improve the spatial resolution of the antenna arrays to enhance the sum rate of multiuser communications. Then, we investigate nonuniform SAs (NSAs) to mitigate the high multiuser interference from the grating lobes of the USAs. To maximize the sum rate of near-field multiuser communications, we optimize the antenna positions of the NSAs, where a successive convex approximation-based antenna position optimization algorithm is proposed. Moreover, we find that the channels of both the USAs and the NSAs show uniform sparsity in the defined surrogate distance-angle (SD-A) domain. Based on the channel sparsity, an on-grid SD-A-domain orthogonal matching pursuit (SDA-OMP) algorithm is developed to estimate multiuser channels. To further improve the resolution of the SDA-OMP, we also design an off-grid SD-A-domain iterative super-resolution channel estimation algorithm. Simulation results demonstrate the superior performance of the proposed methods.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.02410 [pdf, ps, other]

Optimization of Rate-Splitting Multiple Access with Integrated Sensing and Backscatter Communication

Authors: Diluka Galappaththige, Shayan Zargari, Chintha Tellambura, Geoffrey Ye Li

Abstract: An integrated sensing and backscatter communication (ISABC) system is introduced herein. This system features a full-duplex (FD) base station (BS) that seamlessly merges sensing with backscatter communication and supports multiple users. Multiple access (MA) for the user is provided by employing rate-splitting multiple access (RSMA). RSMA, unlike other classical orthogonal and non-orthogonal MA schemes, splits messages into common and private streams. With RSMA, the set of common rate forms can be optimized to reduce interference. Optimized formulas are thus derived for communication rates for users, tags, and the BS's sensing rate, with the primary goal of enhancing the transmission efficiency of the BS. The optimization task involves minimizing the BS's overall transmission power by jointly optimizing the BS's beamforming vectors, the tag reflection coefficients, and user common rates. The alternating optimization method is employed to address this challenge. Concrete solutions are provided for the received beamformers, and semi-definite relaxation and slack-optimization techniques are adopted for transmit beamformers and reflection coefficients, respectively. For example, the proposed RSMA-assisted ISABC system achieves a 350% communication rate boost over a nonorthogonal multiple access-assisted ISABC, with only a 24% increase in transmit power, leveraging ten transmit/reception antennas at the BS.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 13 pages, 8 figures, Journal paper

arXiv:2406.00516 [pdf, other]

Deep Learning based Performance Testing for Analog Integrated Circuits

Authors: Jiawei Cao, Chongtao Guo, Hao Li, Zhigang Wang, Houjun Wang, Geoffrey Ye Li

Abstract: In this paper, we propose a deep learning based performance testing framework to minimize the number of required test modules while guaranteeing the accuracy requirement, where a test module corresponds to a combination of one circuit and one stimulus. First, we apply a deep neural network (DNN) to establish the mapping from the response of the circuit under test (CUT) in each module to all specifications to be tested. Then, the required test modules are selected by solving a 0-1 integer programming problem. Finally, the predictions from the selected test modules are combined by a DNN to form the specification estimations. The simulation results validate the proposed approach in terms of testing accuracy and cost.

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2405.01961 [pdf, other]

Rescale-Invariant Federated Reinforcement Learning for Resource Allocation in V2X Networks

Authors: Kaidi Xu, Shenglong Zhou, Geoffrey Ye Li

Abstract: Federated Reinforcement Learning (FRL) offers a promising solution to various practical challenges in resource allocation for vehicle-to-everything (V2X) networks. However, the data discrepancy among individual agents can significantly degrade the performance of FRL-based algorithms. To address this limitation, we exploit the node-wise invariance property of ReLU-activated neural networks, with the aim of reducing data discrepancy to improve learning performance. Based on this property, we introduce a backward rescale-invariant operation to develop a rescale-invariant FRL algorithm. Simulation results demonstrate that the proposed algorithm notably enhances both convergence speed and convergent performance.

Submitted 3 May, 2024; originally announced May 2024.
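
The node-wise invariance property mentioned in the abstract can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's backward rescale-invariant operation: it uses a hypothetical one-hidden-layer ReLU network with made-up weights and shows that scaling one hidden node's incoming weights and bias by a factor c > 0, while dividing its outgoing weights by c, leaves the network output unchanged, because relu(c * z) = c * relu(z) for positive c.

```python
def relu(z):
    """Rectified linear unit."""
    return max(0.0, z)


def forward(x, W1, b1, W2, b2):
    """One-hidden-layer ReLU network: output = W2 * relu(W1 * x + b1) + b2."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]


def rescale_node(W1, b1, W2, node, c):
    """Scale the incoming weights and bias of hidden unit `node` by c > 0
    and its outgoing weights by 1 / c; all other parameters are copied."""
    W1s = [[w * c for w in row] if i == node else row[:]
           for i, row in enumerate(W1)]
    b1s = [b * c if i == node else b for i, b in enumerate(b1)]
    W2s = [[w / c if j == node else w for j, w in enumerate(row)]
           for row in W2]
    return W1s, b1s, W2s


# Made-up parameters for a 2-input, 2-hidden-unit, 1-output network.
x = [1.0, -2.0]
W1 = [[0.5, -1.0], [1.5, 0.25]]
b1 = [0.1, -0.2]
W2 = [[2.0, -1.0]]
b2 = [0.3]

W1s, b1s, W2s = rescale_node(W1, b1, W2, node=0, c=4.0)
original = forward(x, W1, b1, W2, b2)
rescaled = forward(x, W1s, b1s, W2s, b2)
```

Two parameter sets that differ only by such rescalings represent the same function, which is why plain parameter averaging across agents can be misleading unless the rescaling ambiguity is handled.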

arXiv:2404.11941 [pdf, other]

Semantic Satellite Communications Based on Generative Foundation Model

Authors: Peiwen Jiang, Chao-Kai Wen, Xiao Li, Shi Jin, Geoffrey Ye Li

Abstract: Satellite communications can provide massive connections and seamless coverage, but they also face several challenges, such as rain attenuation, long propagation delays, and co-channel interference. To improve transmission efficiency and address severe scenarios, semantic communication has become a popular choice, particularly when equipped with foundation models (FMs). In this study, we introduce an FM-based semantic satellite communication framework, termed FMSAT. This framework leverages FM-based segmentation and reconstruction to significantly reduce bandwidth requirements and accurately recover semantic features under high noise and interference. Considering the high speed of satellites, an adaptive encoder-decoder is proposed to protect important features and avoid frequent retransmissions. Meanwhile, a well-received image can provide a reference for repairing damaged images under sudden attenuation. Since acknowledgment feedback is subject to long propagation delays when retransmission is unavoidable, a novel error detection method is proposed to roughly detect semantic errors at the regenerative satellite. With the proposed detectors at both the satellite and the gateway, the quality of the received images can be ensured. The simulation results demonstrate that the proposed method can significantly reduce bandwidth requirements, adapt to complex satellite scenarios, and protect semantic information with an acceptable transmission delay.

Submitted 18 April, 2024; originally announced April 2024.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2404.02648 [pdf, other]

A Universal Deep Neural Network for Signal Detection in Wireless Communication Systems

Authors: Khalid Albagami, Nguyen Van Huynh, Geoffrey Ye Li

Abstract: Recently, deep learning (DL) has been emerging as a promising approach for channel estimation and signal detection in wireless communications. The majority of the existing studies investigating the use of DL techniques in this domain focus on analysing channel impulse responses that are generated from only one channel distribution such as additive white Gaussian channel noise and Rayleigh channels. In practice, to cope with the dynamic nature of the wireless channel, DL methods must be re-trained on newly non-aged collected data which is costly, inefficient, and impractical. To tackle this challenge, this paper proposes a novel universal deep neural network (Uni-DNN) that can achieve high detection performance in various wireless environments without retraining the model. In particular, our proposed Uni-DNN model consists of a wireless channel classifier and a signal detector which are constructed by using DNNs. The wireless channel classifier enables the signal detector to generalise and perform optimally for multiple wireless channel distributions. In addition, to further improve the signal detection performance of the proposed model, convolutional neural network is employed. Extensive simulations using the orthogonal frequency division multiplexing scheme demonstrate that the bit error rate performance of our proposed solution can outperform conventional DL-based approaches as well as least square and minimum mean square error channel estimators in practical low pilot density scenarios.

Submitted 3 April, 2024; originally announced April 2024.

arXiv:2403.17810 [pdf, other]

Environment Reconstruction based on Multi-User Selection and Multi-Modal Fusion in ISAC

Authors: Bo Lin, Chuanbin Zhao, Feifei Gao, Geoffrey Ye Li

Abstract: Integrated sensing and communications (ISAC) has been deemed as a key technology for the sixth generation (6G) wireless communications systems. In this paper, we explore the inherent clustered nature of wireless users and design a multi-user based environment reconstruction scheme. Specifically, we first select users based on the estimation precision of channel's multipath, including the line-of-sight (LOS) and the non-line-of-sight (NLOS) paths, to enhance the accuracy of environment reconstruction. Then, we develop a fusion strategy that merges communications signalling with camera image to increase the accuracy and robustness of environment reconstruction. The simulation results demonstrate that the proposed algorithm can achieve a remarkable sensing accuracy of centimeter level, which is about 17 times better than the scheme without user selection. Meanwhile, the fusion of communications data and vision data leads to a threefold accuracy improvement over the image only method, especially under challenging weather conditions like raining and snowing.

Submitted 26 March, 2024; originally announced March 2024.

arXiv:2401.12345 [pdf, other]

Distributionally Robust Receive Beamforming

Authors: Shixiong Wang, Wei Dai, Geoffrey Ye Li

Abstract: This article investigates signal estimation in wireless transmission (i.e., receive beamforming) from the perspective of statistical machine learning, where the transmit signals may be from an integrated sensing and communication system; that is, 1) signals may be not only discrete constellation points but also arbitrary complex values; 2) signals may be spatially correlated. Particular attention is paid to handling various uncertainties such as the uncertainty of the transmit signal covariance, the uncertainty of the channel matrix, the uncertainty of the channel noise covariance, the existence of channel impulse noises, and the limited sample size of pilots. To proceed, a distributionally robust machine learning framework that is insensitive to the above uncertainties is proposed, which reveals that channel estimation is not a necessary operation. For optimal linear estimation, the proposed framework includes several existing beamformers as special cases such as diagonal loading and eigenvalue thresholding. For optimal nonlinear estimation, estimators are limited in reproducing kernel Hilbert spaces and neural network function spaces, and corresponding uncertainty-aware solutions (e.g., kernelized diagonal loading) are derived. In addition, we prove that the ridge and kernel ridge regression methods in machine learning are distributionally robust against diagonal perturbation in feature covariance.

Submitted 10 June, 2024; v1 submitted 22 January, 2024; originally announced January 2024.
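
As a reminder of why diagonal loading and ridge regression are naturally linked, the textbook forms of the two estimators are written out below. The notation is an assumption made for this note and is not taken from the paper: x is the received signal, s the transmit signal, R_x the receive covariance, R_{sx} the cross-covariance, and \epsilon, \lambda > 0 are the loading and regularization levels. The paper's own derivation and robustness proof are not reproduced.

```latex
% Linear MMSE estimate of a zero-mean signal s from x, and its
% diagonally loaded counterpart (R_x replaced by R_x + \epsilon I):
\hat{s}_{\mathrm{LMMSE}} = R_{sx} R_x^{-1} x ,
\qquad
\hat{s}_{\mathrm{DL}} = R_{sx} \left( R_x + \epsilon I \right)^{-1} x .

% Ridge regression with data matrix X and targets y:
\hat{w}_{\mathrm{ridge}}
  = \arg\min_{w} \; \lVert y - X w \rVert_2^2 + \lambda \lVert w \rVert_2^2
  = \left( X^{\top} X + \lambda I \right)^{-1} X^{\top} y .
```

In both cases the regularized solution is the unregularized one with a scaled identity added to the (sample) covariance, which is the kind of diagonal perturbation the abstract refers to.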

arXiv:2401.11195 [pdf, other]

doi 10.1109/TWC.2024.3351712

Triple-Refined Hybrid-Field Beam Training for mmWave Extremely Large-Scale MIMO

Authors: Kangjian Chen, Chenhao Qi, Octavia A. Dobre, Geoffrey Ye Li

Abstract: This paper investigates beam training for extremely large-scale multiple-input multiple-output systems. By considering both the near field and far field, a triple-refined hybrid-field beam training scheme is proposed, where high-accuracy estimates of channel parameters are obtained through three steps of progressive beam refinement. First, the hybrid-field beam gain (HFBG)-based first refinement method is developed. Based on the analysis of the HFBG, the first-refinement codebook is designed and the beam training is performed accordingly to narrow down the potential region of the channel path. Then, the maximum likelihood (ML)-based and principle of stationary phase (PSP)-based second refinement methods are developed. By exploiting the measurements of the beam training, the ML is used to estimate the channel parameters. To avoid the high computational complexity of ML, closed-form estimates of the channel parameters are derived according to the PSP. Moreover, the Gaussian approximation (GA)-based third refinement method is developed. The hybrid-field neighboring search is first performed to identify the potential region of the main lobe of the channel steering vector. Afterwards, by applying the GA, a least-squares estimator is developed to obtain the high-accuracy channel parameter estimation. Simulation results verify the effectiveness of the proposed scheme.

Submitted 20 January, 2024; originally announced January 2024.

Journal ref: IEEE Transactions on Wireless Communications, 2024

arXiv:2401.09904 [pdf, ps, other]

Distributed Task-Oriented Communication Networks with Multimodal Semantic Relay and Edge Intelligence

Authors: Jie Guo, Hao Chen, Bin Song, Yuhao Chi, Chau Yuen, Fei Richard Yu, Geoffrey Ye Li, Dusit Niyato

Abstract: In this article, we present a novel framework, named distributed task-oriented communication networks (DTCN), based on recent advances in multimodal semantic transmission and edge intelligence. In DTCN, the multimodal knowledge of semantic relays and the adaptive adjustment capability of edge intelligence can be integrated to improve task performance. Specifically, we propose the key techniques in the framework, such as semantic alignment and complement, a semantic relay scheme for deep joint source-channel relay coding, and collaborative device-server optimization and inference. Furthermore, a multimodal classification task is used as an example to demonstrate the benefits of the proposed DTCN over existing methods. Numerical results validate that DTCN can significantly improve the accuracy of classification tasks, even in harsh communication scenarios (e.g., low signal-to-noise regime), thanks to multimodal semantic relay and edge intelligence.

Submitted 19 January, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

Comments: 7 pages, 5 figures, 1 table, accepted by IEEE Communications Magazine

arXiv:2401.00859 [pdf, ps, other]

Federated Multi-View Synthesizing for Metaverse

Authors: Yiyu Guo, Zhijin Qin, Xiaoming Tao, Geoffrey Ye Li

Abstract: The metaverse is expected to provide immersive entertainment, education, and business applications. However, virtual reality (VR) transmission over wireless networks is data- and computation-intensive, making it critical to introduce novel solutions that meet stringent quality-of-service requirements. With recent advances in edge intelligence and deep learning, we have developed a novel multi-view synthesizing framework that can efficiently provide computation, storage, and communication resources for wireless content delivery in the metaverse. We propose a three-dimensional (3D)-aware generative model that uses collections of single-view images. These single-view images are transmitted to a group of users with overlapping fields of view, which avoids massive content transmission compared to transmitting tiles or whole 3D models. We then present a federated learning approach to guarantee an efficient learning process. The training performance can be improved by characterizing the vertical and horizontal data samples with a large latent feature space, while low-latency communication can be achieved with a reduced number of transmitted parameters during federated learning. We also propose a federated transfer learning framework to enable fast domain adaptation to different target domains. Simulation results have demonstrated the effectiveness of our proposed federated multi-view synthesizing framework for VR content delivery.

Submitted 18 December, 2023; originally announced January 2024.

arXiv:2311.15066 [pdf, other]

Beam Training and Tracking for Extremely Large-Scale MIMO Communications

Authors: Kangjian Chen, Chenhao Qi, Cheng-Xiang Wang, Geoffrey Ye Li

Abstract: In this paper, beam training and beam tracking are investigated for extremely large-scale multiple-input-multiple-output communication systems with partially-connected hybrid combining structures. Firstly, we propose a two-stage hybrid-field beam training scheme for both the near field and the far field. In the first stage, each subarray independently uses multiple far-field channel steering vectors to approximate near-field ones for analog combining. To find the codeword best fitting for the channel, digital combiners in the second stage are designed to combine the outputs of the analog combiners from the first stage. Then, based on the principle of stationary phase and the time-frequency duality, the expressions of subarray signals after analog combining are analytically derived and a beam refinement based on phase shifts of subarrays (BRPSS) scheme with closed-form solutions is proposed for high-resolution channel parameter estimation. Moreover, a low-complexity near-field beam tracking scheme is developed, where the kinematic model is adopted to characterize the channel variations and the extended Kalman filter is exploited for beam tracking. Simulation results verify the effectiveness of the proposed schemes.

Submitted 25 November, 2023; originally announced November 2023.

arXiv:2311.15062 [pdf, other]

Simultaneous Beam Training and Target Sensing in ISAC Systems with RIS

Authors: Kangjian Chen, Chenhao Qi, Octavia A. Dobre, Geoffrey Ye Li

Abstract: This paper investigates an integrated sensing and communication (ISAC) system with reconfigurable intelligent surface (RIS). Our simultaneous beam training and target sensing (SBTTS) scheme enables the base station to perform beam training with the user terminals (UTs) and the RIS, and simultaneously to sense the targets. Based on our findings, the energy of the echoes from the RIS is accumulated in the angle-delay domain while that from the targets is accumulated in the Doppler-delay domain. The SBTTS scheme can distinguish the RIS from the targets with the mixed echoes from the RIS and the targets. Then we propose a positioning and array orientation estimation (PAOE) scheme for both the line-of-sight channels and the non-line-of-sight channels based on the beam training results of SBTTS by developing a low-complexity two-dimensional fast search algorithm. Based on the SBTTS and PAOE schemes, we further compute the angle-of-arrival and angle-of-departure for the channels between the RIS and the UTs by exploiting the geometry relationship to accomplish the beam alignment of the ISAC system. Simulation results verify the effectiveness of the proposed schemes.

Submitted 25 November, 2023; originally announced November 2023.

arXiv:2311.15060 [pdf, ps, other]

Key Issues in Wireless Transmission for NTN-Assisted Internet of Things

Authors: Chenhao Qi, Jing Wang, Leyi Lyu, Lei Tan, Jinming Zhang, Geoffrey Ye Li

Abstract: Non-terrestrial networks (NTNs) have become appealing resolutions for seamless coverage in the next-generation wireless transmission, where a large number of Internet of Things (IoT) devices diversely distributed can be efficiently served. The explosively growing number of IoT devices brings a new challenge for massive connection. The long-distance wireless signal propagation in NTNs leads to severe path loss and large latency, where the accurate acquisition of channel state information (CSI) is another challenge, especially for fast-moving non-terrestrial base stations (NTBSs). Moreover, the scarcity of on-board resources of NTBSs is also a challenge for resource allocation. To this end, we investigate three key issues, where the existing schemes and emerging resolutions for these three key issues have been comprehensively presented. The first issue is to enable the massive connection by designing random access to establish the wireless link and multiple access to transmit data streams. The second issue is to accurately acquire CSI in various channel conditions by channel estimation and beam training, where orthogonal time frequency space modulation and dynamic codebooks are on focus. The third issue is to efficiently allocate the wireless resources, including power allocation, spectrum sharing, beam hopping, and beamforming. At the end of this article, some future research topics are identified.

Submitted 25 November, 2023; originally announced November 2023.

Comments: 7 pages, 6 figures

arXiv:2311.06498 [pdf, other]

Semantic Communication for Cooperative Perception based on Importance Map

Authors: Yucheng Sheng, Hao Ye, Le Liang, Shi Jin, Geoffrey Ye Li

Abstract: Cooperative perception, which has a broader perception field than single-vehicle perception, has played an increasingly important role in autonomous driving to conduct 3D object detection. Through vehicle-to-vehicle (V2V) communication technology, various connected automated vehicles (CAVs) can share their sensory information (LiDAR point clouds) for cooperative perception. We employ an importance map to extract significant semantic information and propose a novel cooperative perception semantic communication scheme with intermediate fusion. Meanwhile, our proposed architecture can be extended to the challenging time-varying multipath fading channel. To alleviate the distortion caused by the time-varying multipath fading, we adopt explicit orthogonal frequency-division multiplexing (OFDM) blocks combined with channel estimation and channel equalization. Simulation results demonstrate that our proposed model outperforms the traditional separate source-channel coding over various channel models. Moreover, a robustness study indicates that only part of semantic information is key to cooperative perception. Although our proposed model has only been trained over one specific channel, it has the ability to learn robust coded representations of semantic information that remain resilient to various channel models, demonstrating its generality and robustness.

Submitted 11 November, 2023; originally announced November 2023.

Comments: 13 pages, 22 figures; journal; submitted for possible publication
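
The abstract above mentions OFDM blocks combined with channel estimation and channel equalization. As a rough illustration of that generic idea only (not the authors' learned architecture), the sketch below performs per-subcarrier least-squares channel estimation from known pilots and then one-tap equalization. The channel values, pilots, and data symbols are hypothetical, and the example assumes a noiseless channel that stays constant between the pilot and data symbols.

```python
def ls_channel_estimate(rx_pilots, tx_pilots):
    """Per-subcarrier least-squares estimate: H[k] = Y[k] / X[k]."""
    return [y / x for y, x in zip(rx_pilots, tx_pilots)]


def one_tap_equalize(rx_data, h_est):
    """Undo the channel on each subcarrier by dividing by its estimate."""
    return [y / h for y, h in zip(rx_data, h_est)]


# Hypothetical frequency-domain channel on 4 subcarriers (assumed values).
channel = [0.8 + 0.3j, -0.5 + 0.9j, 1.1 - 0.2j, 0.4 + 0.4j]
pilots = [1 + 0j, -1 + 0j, 1 + 0j, -1 + 0j]    # known pilot symbols
data = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]      # QPSK-like data symbols

# Noiseless received symbols: each subcarrier sees a single complex gain.
rx_pilots = [h * p for h, p in zip(channel, pilots)]
rx_data = [h * d for h, d in zip(channel, data)]

h_est = ls_channel_estimate(rx_pilots, pilots)
recovered = one_tap_equalize(rx_data, h_est)
```

With noise, the same two steps apply, but the estimate and the recovered symbols would only be approximate.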

arXiv:2311.00071 [pdf, other]

Robust Waveform Design for Integrated Sensing and Communication

Authors: Shixiong Wang, Wei Dai, Haowei Wang, Geoffrey Ye Li

Abstract: Integrated sensing and communication (ISAC), which enables hardware, resources (e.g., spectra), and waveforms sharing, is becoming a key feature in future-generation communication systems. This paper investigates performance characterization and waveform design for ISAC systems when the underlying true communication channels are not accurately known. With uncertainty in a nominal communication channel, the nominal Pareto frontier of the sensing and communication performances cannot represent the true performance trade-off of a real-world operating ISAC system. Therefore, this paper portrays the robust (i.e., conservative) Pareto frontier considering the uncertainty in the communication channel. To be specific, the lower bound of the true (but unknown) Pareto frontier is investigated, technically by studying robust waveform design problems that find the best waveforms under the worst-case channels. The robust waveform design problems examined in this paper are shown to be non-convex and high-dimensional, which cannot be solved using existing optimization techniques. As such, we propose a computationally efficient solution framework to approximately solve them. Simulation results show that by solving the robust waveform design problems, the lower bound of the true but unknown Pareto frontier, which characterizes the sensing-communication performance trade-off under communication channel uncertainty, can be obtained.

Submitted 3 June, 2024; v1 submitted 31 October, 2023; originally announced November 2023.

Comments: Accepted by IEEE Transactions on Signal Processing; Source Codes: https://github.com/Spratm-Asleaf/Robust-Waveform

arXiv:2310.12343 [pdf, other]

New Environment Adaptation with Few Shots for OFDM Receiver and mmWave Beamforming

Authors: Ouya Wang, Shenglong Zhou, Geoffrey Ye Li

Abstract: Few-shot learning (FSL) enables adaptation to new tasks with only limited training data. In wireless communications, channel environments can vary drastically; therefore, FSL techniques can quickly adjust the transceiver accordingly. In this paper, we develop two FSL frameworks that fit in wireless transceiver design. Both frameworks are based on optimization programs that can be solved by well-known algorithms like the inexact alternating direction method of multipliers (iADMM) and the inexact alternating direction method (iADM). As examples, we demonstrate how the proposed two FSL frameworks are used for the OFDM receiver and beamforming (BF) for the millimeter wave (mmWave) system. The numerical experiments confirm their desirable performance in both applications compared to other popular approaches, such as transfer learning (TL) and model-agnostic meta-learning.

Submitted 18 October, 2023; originally announced October 2023.

arXiv:2310.09858 [pdf, other]

Federated Reinforcement Learning for Resource Allocation in V2X Networks

Authors: Kaidi Xu, Shenglong Zhou, Geoffrey Ye Li

Abstract: Resource allocation significantly impacts the performance of vehicle-to-everything (V2X) networks. Most existing algorithms for resource allocation are based on optimization or machine learning (e.g., reinforcement learning). In this paper, we explore resource allocation in a V2X network under the framework of federated reinforcement learning (FRL). On one hand, the usage of RL overcomes many challenges from the model-based optimization schemes. On the other hand, federated learning (FL) enables agents to deal with a number of practical issues, such as privacy, communication overhead, and exploration efficiency. The framework of FRL is then implemented by the inexact alternating direction method of multipliers (ADMM), where subproblems are solved approximately using policy gradients and accelerated by an adaptive step size calculated from their second moments. The developed algorithm, PASM, is proven to be convergent under mild conditions and has a nice numerical performance compared with some baseline methods for solving the resource allocation problem in a V2X network.

Submitted 15 October, 2023; originally announced October 2023.

Comments: Submitted to TWC

arXiv:2310.06246 [pdf, other]

Compression Ratio Learning and Semantic Communications for Video Imaging

Authors: Bowen Zhang, Zhijin Qin, Geoffrey Ye Li

Abstract: Camera sensors have been widely used in intelligent robotic systems. Developing camera sensors with high sensing efficiency has always been important to reduce the power, memory, and other related resources. Inspired by recent success on programmable sensors and deep optic methods, we design a novel video compressed sensing system with spatially-variant compression ratios, which achieves higher imaging quality than the existing snapshot compressed imaging methods with the same sensing costs. In this article, we also investigate the data transmission methods for programmable sensors, where the performance of communication systems is evaluated by the reconstructed images or videos rather than the transmission of sensor data itself. Usually, different reconstruction algorithms are designed for applications in high dynamic range imaging, video compressive sensing, or motion deblurring. This task-aware property inspires a semantic communication framework for programmable sensors. In this work, a policy-gradient based reinforcement learning method is introduced to achieve the explicit trade-off between the compression (or transmission) rate and the image distortion. Numerical results show the superiority of the proposed methods over existing baselines.

Submitted 9 October, 2023; originally announced October 2023.

arXiv:2309.17185 [pdf, other]

Meta Reinforcement Learning for Fast Spectrum Sharing in Vehicular Networks

Authors: Kai Huang, Le Liang, Shi Jin, Geoffrey Ye Li

Abstract: In this paper, we investigate the problem of fast spectrum sharing in vehicle-to-everything communication. In order to improve the spectrum efficiency of the whole system, the spectrum of vehicle-to-infrastructure links is reused by vehicle-to-vehicle links. To this end, we model it as a problem of deep reinforcement learning and tackle it with proximal policy optimization. A considerable number of interactions are often required for training an agent with good performance, so simulation-based training is commonly used in communication networks. Nevertheless, severe performance degradation may occur when the agent is directly deployed in the real world, even though it can perform well on the simulator, due to the reality gap between the simulation and the real environments. To address this issue, we make preliminary efforts by proposing an algorithm based on meta reinforcement learning. This algorithm enables the agent to rapidly adapt to a new task with the knowledge extracted from similar tasks, leading to fewer interactions and less training time. Numerical results show that our method achieves near-optimal performance and exhibits rapid convergence.

Submitted 29 September, 2023; originally announced September 2023.

Comments: This paper has been accepted by China Communications

arXiv:2308.16671 [pdf, other]

Communication-Efficient Decentralized Federated Learning via One-Bit Compressive Sensing

Authors: Shenglong Zhou, Kaidi Xu, Geoffrey Ye Li

Abstract: Decentralized federated learning (DFL) has gained popularity due to its practicality across various applications. Compared to the centralized version, training a shared model among a large number of nodes in DFL is more challenging, as there is no central server to coordinate the training process. Especially when distributed nodes suffer from limitations in communication or computational resources, DFL will experience extremely inefficient and unstable training. Motivated by these challenges, in this paper, we develop a novel algorithm based on the framework of the inexact alternating direction method (iADM). On one hand, our goal is to train a shared model with a sparsity constraint. This constraint enables us to leverage one-bit compressive sensing (1BCS), allowing transmission of one-bit information among neighbour nodes. On the other hand, communication between neighbour nodes occurs only at certain steps, reducing the number of communication rounds. Therefore, the algorithm exhibits notable communication efficiency. Additionally, as each node selects only a subset of neighbours to participate in the training, the algorithm is robust against stragglers. Moreover, complex items are computed only once for several consecutive steps and subproblems are solved inexactly using closed-form solutions, resulting in high computational efficiency. Finally, numerical experiments showcase the algorithm's effectiveness in both communication and computation.

Submitted 31 August, 2023; originally announced August 2023.
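
The abstract above relies on a sparsity constraint so that nodes can exchange one-bit information via one-bit compressive sensing (1BCS). The sketch below is only a minimal illustration of that measurement idea, not the authors' algorithm: a vector is hard-thresholded to its largest-magnitude entries, and only the signs of its linear measurements are kept. The function names, the parameter vector, and the measurement matrix are all hypothetical; in practice the matrix would typically be random and shared between nodes, and the receiver would run a 1BCS recovery step that is not shown here.

```python
def hard_threshold(vector, sparsity):
    """Keep the `sparsity` largest-magnitude entries and zero out the rest."""
    order = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)
    keep = set(order[:sparsity])
    return [v if i in keep else 0.0 for i, v in enumerate(vector)]


def one_bit_measure(matrix, vector):
    """Return sign(A x) with entries in {+1, -1}: one bit per measurement."""
    bits = []
    for row in matrix:
        projection = sum(a * x for a, x in zip(row, vector))
        bits.append(1 if projection >= 0 else -1)
    return bits


# Hypothetical local model parameters and measurement matrix (assumed values).
params = [0.1, -2.0, 0.05, 1.5, -0.2]
measurement_matrix = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 1, 1],
]

sparse_params = hard_threshold(params, sparsity=2)
bits = one_bit_measure(measurement_matrix, sparse_params)
```

Each transmitted measurement costs a single bit, which is the source of the communication saving the abstract refers to.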

arXiv:2308.13381 [pdf, ps, other]

Deep Unfolding-Based Channel Estimation for Wideband TeraHertz Near-Field Massive MIMO Systems

Authors: Jiabao Gao, Xiaoming Cheng, Geoffrey Ye Li

Abstract: The combination of Terahertz (THz) and massive multiple-input multiple-output (MIMO) is promising to meet the increasing data rate demand of future wireless communication systems thanks to the huge bandwidth and spatial degrees of freedom. However, unique channel features such as the near-field beam split effect make channel estimation particularly challenging in THz massive MIMO systems. On one hand, adopting the conventional angular domain transformation dictionary designed for low-frequency far-field channels will result in degraded channel sparsity and destroyed sparsity structure in the transformed domain. On the other hand, most existing compressive sensing-based channel estimation algorithms cannot achieve high performance and low complexity simultaneously. To alleviate these issues, in this paper, we first adopt frequency-dependent near-field dictionaries to maintain good channel sparsity and sparsity structure in the transformed domain under the near-field beam split effect. Then, a deep unfolding-based wideband THz massive MIMO channel estimation algorithm is proposed. In each iteration of the unitary approximate message passing-sparse Bayesian learning algorithm, the optimal update rule is learned by a deep neural network (DNN), whose structure is customized to effectively exploit the inherent channel patterns. Furthermore, a mixed training method based on novel designs of the DNN structure and the loss function is developed to effectively train data from different system configurations. Simulation results validate the superiority of the proposed algorithm in terms of performance, complexity, and robustness.

Submitted 25 August, 2023; originally announced August 2023.

arXiv:2308.04728 [pdf, ps, other]

Deep Plug-and-Play Prior for Multitask Channel Reconstruction in Massive MIMO Systems

Authors: Weixiao Wan, Wei Chen, Shiyue Wang, Geoffrey Ye Li, Bo Ai

Abstract: Scalability is a major concern in implementing deep learning (DL) based methods in wireless communication systems. Given various channel reconstruction tasks, applying one DL model for one specific task is costly in both model training and model storage. In this paper, we propose a novel unsupervised deep plug-and-play prior method for three channel reconstruction tasks in the downlink of massive multiple-input multiple-output (MIMO) systems, including channel estimation, antenna extrapolation and channel state information (CSI) feedback. The proposed method corresponding to these three channel reconstruction tasks employs a common DL model, which greatly reduces the overhead of model training and storage. Unlike general multi-task learning, the DL model of the proposed method does not require further fine-tuning for specific channel reconstruction tasks. Extensive experiments are conducted on the DeepMIMO dataset to demonstrate the convergence, performance, and storage overhead of the proposed method for the three channel reconstruction tasks.

Submitted 18 December, 2023; v1 submitted 9 August, 2023; originally announced August 2023.

arXiv:2308.04463 [pdf, other]

Weakly Semi-Supervised Detection in Lung Ultrasound Videos

Authors: Jiahong Ouyang, Li Chen, Gary Y. Li, Naveen Balaraju, Shubham Patil, Courosh Mehanian, Sourabh Kulhare, Rachel Millin, Kenton W. Gregory, Cynthia R. Gregory, Meihua Zhu, David O. Kessler, Laurie Malia, Almaz Dessie, Joni Rabiner, Di Coneybeare, Bo Shopsin, Andrew Hersh, Cristian Madar, Jeffrey Shupp, Laura S. Johnson, Jacob Avila, Kristin Dwyer, Peter Weimersheimer, Balasundar Raju, et al. (2 additional authors not shown)

Abstract: Frame-by-frame annotation of bounding boxes by clinical experts is often required to train fully supervised object detection models on medical video data. We propose a method for improving object detection in medical videos through weak supervision from video-level labels. More concretely, we aggregate individual detection predictions into video-level predictions and extend a teacher-student training strategy to provide additional supervision via a video-level loss. We also introduce improvements to the underlying teacher-student framework, including methods to improve the quality of pseudo-labels based on weak supervision and adaptive schemes to optimize knowledge transfer between the student and teacher networks. We apply this approach to the clinically important task of detecting lung consolidations (seen in respiratory infections such as COVID-19 pneumonia) in medical ultrasound videos. Experiments reveal that our framework improves detection accuracy and robustness compared to baseline semi-supervised models, and improves efficiency in data and annotation usage.

Submitted 7 August, 2023; originally announced August 2023.

Comments: IPMI 2023

arXiv:2307.16100 [pdf, other]

RIS-Enhanced Semantic Communications Adaptive to User Requirements

Authors: Peiwen Jiang, Chao-Kai Wen, Shi Jin, Geoffrey Ye Li

Abstract: Semantic communication significantly reduces required bandwidth by understanding the semantic meaning of the transmitted content. However, current deep learning-based semantic communication methods rely on joint source-channel coding design and end-to-end training, which limits their adaptability to new physical channels and user requirements. Reconfigurable intelligent surfaces (RIS) offer a solution by customizing channels in different environments. In this study, we propose the RIS-SC framework, which allocates semantic contents with varying levels of RIS assistance to satisfy the changing user requirements. It takes into account user movement and line-of-sight obstructions, enabling the RIS resource to protect important semantics in challenging channel conditions. The simulation results indicate reasonable task performance, but some semantic parts that have no effect on task performances are abandoned under severe channel conditions. To address this issue, a reconstruction method is also introduced to improve visual acceptance by inferring those missing semantic parts. Furthermore, the framework can adjust RIS resources in friendly channel conditions to save and allocate them efficiently among multiple users. Simulation results demonstrate the adaptability and efficiency of the RIS-SC framework across diverse channel conditions and user requirements.

Submitted 5 August, 2023; v1 submitted 29 July, 2023; originally announced July 2023.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2307.03246 [pdf, other]

Semantic-Aware Image Compressed Sensing

Authors: Bowen Zhang, Zhijin Qin, Geoffrey Ye Li

Abstract: Deep learning based image compressed sensing (CS) has achieved great success. However, existing CS systems mainly adopt a fixed measurement matrix to images, ignoring the fact that the optimal measurement numbers and bases are different for different images. To further improve the sensing efficiency, we propose a novel semantic-aware image CS system. In our system, the encoder first uses a fixed number of base CS measurements to sense different images. According to the base CS results, the encoder then employs a policy network to analyze the semantic information in images and determines the measurement matrix for different image areas. At the decoder side, a semantic-aware initial reconstruction network is developed to deal with the changes of measurement matrices used at the encoder. A rate-distortion training loss is further introduced to dynamically adjust the average compression ratio for the semantic-aware CS system and the policy network is trained jointly with the encoder and the decoder in an end-to-end manner by using some proxy functions. Numerical results show that the proposed semantic-aware image CS system is superior to the traditional ones with fixed measurement matrices.

Submitted 10 July, 2023; v1 submitted 6 July, 2023; originally announced July 2023.

Comments: Modified version

arXiv:2305.08303 [pdf, other]

Deep-Unfolding for Next-Generation Transceivers

Authors: Qiyu Hu, Yunlong Cai, Guangyi Zhang, Guanding Yu, Geoffrey Ye Li

Abstract: The stringent performance requirements of future wireless networks, such as ultra-high data rates, extremely high reliability and low latency, are spurring worldwide studies on defining the next-generation multiple-input multiple-output (MIMO) transceivers. For the design of advanced transceivers in wireless communications, optimization approaches often leading to iterative algorithms have achieved great success for MIMO transceivers. However, these algorithms generally require a large number of iterations to converge, which entails considerable computational complexity and often requires fine-tuning of various parameters. With the development of deep learning, approximating the iterative algorithms with deep neural networks (DNNs) can significantly reduce the computational time. However, DNNs typically lead to black-box solvers, which require large amounts of data and extensive training time. To further overcome these challenges, deep-unfolding has emerged which incorporates the benefits of both deep learning and iterative algorithms, by unfolding the iterative algorithm into a layer-wise structure analogous to DNNs. In this article, we first go through the framework of deep-unfolding for transceiver design with matrix parameters and its recent advancements. Then, some endeavors in applying deep-unfolding approaches in next-generation advanced transceiver design are presented. Moreover, some open issues for future research are highlighted.

Submitted 14 May, 2023; originally announced May 2023.

Comments: 16 pages, 6 figures

arXiv:2303.12479

Distributed Two-tier DRL Framework for Cell-Free Network: Association, Beamforming and Power Allocation

Authors: Kaiwen Yu, Chonghao Zhao, Gang Wu, Geoffrey Ye Li

Abstract: Intelligent wireless networks have long been expected to have self-configuration and self-optimization capabilities to adapt to various environments and demands. In this paper, we develop a novel distributed hierarchical deep reinforcement learning (DHDRL) framework with two-tier control networks in different timescales to optimize the long-term spectrum efficiency (SE) of the downlink cell-free multiple-input single-output (MISO) network, consisting of multiple distributed access points (AP) and user terminals (UT). To realize the proposed two-tier control strategy, we decompose the optimization problem into two sub-problems, AP-UT association (AUA) as well as beamforming and power allocation (BPA), resulting in a Markov decision process (MDP) and Partially Observable MDP (POMDP). The proposed method consists of two neural networks. At the system level, a distributed high-level neural network is introduced to optimize wireless network structure on a large timescale, while at the link level, a distributed low-level neural network is proposed to mitigate inter-AP interference and improve the transmission performance on a small timescale. Numerical results show that our method is effective for high-dimensional problems, in terms of spectrum efficiency, signaling overhead as well as satisfaction probability, and generalizes well to diverse multi-object problems.

Submitted 5 December, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

Comments: The paper has some updates

arXiv:2303.12335 [pdf, other]

Semantic Communication with Memory

Authors: Huiqiang Xie, Zhijin Qin, Geoffrey Ye Li

Abstract: While semantic communication succeeds in efficiently transmitting due to the strong capability to extract the essential semantic information, it is still far from the intelligent or human-like communications. In this paper, we introduce an essential component, memory, into semantic communications to mimic human communications. Particularly, we investigate a deep learning (DL) based semantic communication system with memory, named Mem-DeepSC, by considering the scenario question answer task. We exploit the universal Transformer based transceiver to extract the semantic information and introduce the memory module to process the context information. Moreover, we derive the relationship between the length of semantic signal and the channel noise to validate the possibility of dynamic transmission. Specifically, we propose two dynamic transmission methods to enhance the transmission reliability as well as to reduce the communication overhead by masking some unessential elements, which are recognized through training the model with mutual information. Numerical results show that the proposed Mem-DeepSC is superior to benchmarks in terms of answer accuracy and transmission efficiency, i.e., number of transmitted symbols.

Submitted 22 March, 2023; originally announced March 2023.

Comments: 12 pages

arXiv:2302.08645 [pdf, ps, other]

Semantic Communications with Variable-Length Coding for Extended Reality

Authors: Bowen Zhang, Zhijin Qin, Geoffrey Ye Li

Abstract: Wireless extended reality (XR) has attracted wide attention as a promising technology to improve users' mobility and quality of experience. However, the ultra-high data rate requirement of wireless XR has hindered its development for many years. To overcome this challenge, we develop a semantic communication framework, where semantically-unimportant information is highly-compressed or discarded in semantic coders, significantly improving the transmission efficiency. Besides, considering the fact that some source content may have a smaller amount of semantic information or have higher tolerance to channel noise, we propose a universal variable-length semantic-channel coding method. In particular, we first use a rate allocation network to estimate the best code length for semantic information and then adjust the coding process accordingly. By adopting some proxy functions, the whole framework is trained in an end-to-end manner. Numerical results show that our semantic system significantly outperforms traditional transmission methods and the proposed variable-length coding scheme is superior to the fixed-length coding methods.

Submitted 11 March, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

Comments: 1. Update the performance of VL-SCC in Fig. 8 under new rate allocation architecture. 2. Give a fair comparison between VL-SCC and SCC in Fig. 9. 3. Fix the typo of LDPC rate (1/3 changed to 2/3). 4. Reduce L=32 to 16, and update the bpp.

arXiv:2302.03861 [pdf]

doi 10.1002/mp.16703

SwinCross: Cross-modal Swin Transformer for Head-and-Neck Tumor Segmentation in PET/CT Images

Authors: Gary Y. Li, Junyu Chen, Se-In Jang, Kuang Gong, Quanzheng Li

Abstract: Radiotherapy (RT) combined with cetuximab is the standard treatment for patients with inoperable head and neck cancers. Segmentation of head and neck (H&N) tumors is a prerequisite for radiotherapy planning but a time-consuming process. In recent years, deep convolutional neural networks (DCNNs) have become the de facto standard for automated image segmentation. However, due to the expensive computational cost associated with enlarging the field of view in DCNNs, their ability to model long-range dependency is still limited, and this can result in sub-optimal segmentation performance for objects with background context spanning over long distances. On the other hand, Transformer models have demonstrated excellent capabilities in capturing such long-range information in several semantic segmentation tasks performed on medical images. Inspired by the recent success of Vision Transformers and advances in multi-modal image analysis, we propose a novel segmentation model, the Cross-Modal Swin Transformer (SwinCross), with a cross-modal attention (CMA) module to incorporate cross-modal feature extraction at multiple resolutions. To validate the effectiveness of the proposed method, we performed experiments on the HECKTOR 2021 challenge dataset and compared it with the nnU-Net (the backbone of the top-5 methods in HECKTOR 2021) and other state-of-the-art transformer-based methods such as UNETR and Swin UNETR. The proposed method is experimentally shown to outperform the compared methods thanks to the ability of the CMA module to capture better inter-modality complementary feature representations between PET and CT for the task of head-and-neck tumor segmentation.

Submitted 7 February, 2023; originally announced February 2023.

Comments: 9 pages, 3 figures. Med Phys. 2023

arXiv:2302.00461 [pdf, ps, other]

AMP-SBL Unfolding for Wideband MmWave Massive MIMO Channel Estimation

Authors: Jiabao Gao, Caijun Zhong, Geoffrey Ye Li

Abstract: In wideband millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) systems, channel estimation is challenging due to the hybrid analog-digital architecture, which compresses the received pilot signal and makes channel estimation a compressive sensing (CS) problem. However, existing high-performance CS algorithms usually suffer from high complexity. On the other hand, the beam squint effect caused by huge bandwidth and massive antennas will deteriorate estimation performance. In this paper, frequency-dependent angular dictionaries are first adopted to compensate for beam squint. Then, the expectation-maximization (EM)-based sparse Bayesian learning (SBL) algorithm is enhanced in two aspects, where the E-step in each iteration is implemented by approximate message passing (AMP) to reduce complexity while the M-step is realized by a deep neural network (DNN) to improve performance. In simulation, the proposed AMP-SBL unfolding-based channel estimator achieves satisfactory performance with low complexity.

Submitted 1 February, 2023; originally announced February 2023.

arXiv:2212.08533 [pdf, other]

Semantic Sensing and Communications for Ultimate Extended Reality

Authors: Bowen Zhang, Zhijin Qin, Yiyu Guo, Geoffrey Ye Li

Abstract: As a key technology in the metaverse, wireless ultimate extended reality (XR) has attracted extensive attention from both industry and academia. However, the stringent latency and ultra-high data rate requirements have hindered the development of wireless ultimate XR. Instead of transmitting the original source data bit-by-bit, semantic communications focus on the successful delivery of semantic information contained in the source, and have shown great potential in reducing the data traffic of wireless systems. Inspired by semantic communications, this article develops a joint semantic sensing, rendering, and communication framework for wireless ultimate XR. In particular, semantic sensing is used to improve the sensing efficiency by exploring the spatial-temporal distributions of semantic information. Semantic rendering is designed to reduce the costs on semantically-redundant pixels. Next, semantic communications are adopted for high data transmission efficiency in wireless ultimate XR. Then, two case studies are provided to demonstrate the effectiveness of the proposed framework. Finally, potential research directions are identified to boost the development of semantic-aware wireless ultimate XR.

Submitted 16 December, 2022; originally announced December 2022.

Comments: 7 pages, 5 figures, submitted for possible publication

arXiv:2212.07967 [pdf, ps, other]

Distributed-Training-and-Execution Multi-Agent Reinforcement Learning for Power Control in HetNet

Authors: Kaidi Xu, Nguyen Van Huynh, Geoffrey Ye Li

Abstract: In heterogeneous networks (HetNets), the overlap of small cells and the macro cell causes severe cross-tier interference. Although there exist some approaches to address this problem, they usually require global channel state information, which is hard to obtain in practice, and obtain a sub-optimal power allocation policy at high computational complexity. To overcome these limitations, we propose a multi-agent deep reinforcement learning (MADRL) based power control scheme for the HetNet, where each access point makes power control decisions independently based on local information. To promote cooperation among agents, we develop a penalty-based Q learning (PQL) algorithm for MADRL systems. By introducing regularization terms in the loss function, each agent tends to choose an experienced action with high reward when revisiting a state, and thus the policy updating speed slows down. In this way, an agent's policy can be learned by other agents more easily, resulting in a more efficient collaboration process. We then implement the proposed PQL in the considered HetNet and compare it with other distributed-training-and-execution (DTE) algorithms. Simulation results show that our proposed PQL can learn the desired power control policy from a dynamic environment where the locations of users change episodically and outperform existing DTE MADRL algorithms.

Submitted 15 December, 2022; originally announced December 2022.

arXiv:2212.06482 [pdf, other]

Over-The-Air Federated Learning Over Scalable Cell-free Massive MIMO

Authors: Houssem Sifaou, Geoffrey Ye Li

Abstract: Cell-free massive MIMO is emerging as a promising technology for future wireless communication systems, which is expected to offer uniform coverage and high spectral efficiency compared to classical cellular systems. We study in this paper how cell-free massive MIMO can support federated edge learning. Taking advantage of the additive nature of the wireless multiple access channel, over-the-air computation is exploited, where the clients send their local updates simultaneously over the same communication resource. This approach, known as over-the-air federated learning (OTA-FL), is proven to alleviate the communication overhead of federated learning over wireless networks. Considering channel correlation and only imperfect channel state information available at the central server, we propose a practical implementation of OTA-FL over cell-free massive MIMO. The convergence of the proposed implementation is studied analytically and experimentally, confirming the benefits of cell-free massive MIMO for OTA-FL.

Submitted 18 September, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

Comments: Accepted at IEEE Transactions on Wireless Communications
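The over-the-air computation idea summarized in the abstract above can be illustrated with a minimal sketch. It assumes an idealized noise-free channel with unit gains, so it ignores the channel correlation and imperfect channel state information that the paper itself considers; the function name `ota_aggregate` and the sample values are hypothetical and not taken from the paper.

```python
def ota_aggregate(local_updates):
    """Average client updates via an idealized over-the-air sum.

    Assumption: all clients transmit simultaneously and the multiple
    access channel adds their signals entry by entry, with no fading
    or noise. The server only ever sees the superimposed signal.
    """
    # Superposition on the shared channel: entrywise sum over clients.
    received = [sum(entries) for entries in zip(*local_updates)]
    # The server rescales the received sum by the number of clients.
    num_clients = len(local_updates)
    return [value / num_clients for value in received]


# Three hypothetical clients, each holding a two-dimensional local update.
updates = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
average = ota_aggregate(updates)
print(average)
```

The point of the sketch is that the server obtains the aggregate from one shared transmission rather than from one transmission per client, which is where the reduction in communication overhead comes from.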

arXiv:2212.04047 [pdf, ps, other]

Graph Neural Networks Meet Wireless Communications: Motivation, Applications, and Future Directions

Authors: Mengyuan Lee, Guanding Yu, Huaiyu Dai, Geoffrey Ye Li

Abstract: As an efficient graph analytical tool, graph neural networks (GNNs) have special properties that are particularly fit for the characteristics and requirements of wireless communications, exhibiting good potential for the advancement of next-generation wireless communications. This article aims to provide a comprehensive overview of the interplay between GNNs and wireless communications, including GNNs for wireless communications (GNN4Com) and wireless communications for GNNs (Com4GNN). In particular, we discuss GNN4Com based on how graphical models are constructed and introduce Com4GNN with corresponding incentives. We also highlight potential research directions to promote future research endeavors for GNNs in wireless communications.

Submitted 7 December, 2022; originally announced December 2022.

Comments: This paper is accepted by IEEE Wirel. Commun

arXiv:2211.15851 [pdf, ps, other]

CSI-PPPNet: A One-Sided One-for-All Deep Learning Framework for Massive MIMO CSI Feedback

Authors: Wei Chen, Weixiao Wan, Shiyue Wang, Peng Sun, Geoffrey Ye Li, Bo Ai

Abstract: To reduce multiuser interference and maximize the spectrum efficiency in frequency division duplexing massive multiple-input multiple-output (MIMO) systems, the downlink channel state information (CSI) estimated at the user equipment (UE) is required at the base station (BS). This paper presents a novel method for massive MIMO CSI feedback via a one-sided one-for-all deep learning framework. The CSI is compressed via linear projections at the UE, and is recovered at the BS using deep learning (DL) with plug-and-play priors (PPP). Instead of using handcrafted regularizers for the wireless channel responses, the proposed approach, namely CSI-PPPNet, exploits a DL-based denoiser in place of the proximal operator of the prior in an alternating optimization scheme. In this way, a DL model trained once for denoising can be repurposed for CSI recovery tasks with arbitrary compression ratio. The one-sided one-for-all framework reduces model storage space, relieves the burden of joint model training and model delivery, and could be applied at UEs with limited device memories and computation power. Extensive experiments over the open indoor and urban macro scenarios show the effectiveness and advantages of the proposed method.

Submitted 18 July, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

arXiv:2211.14866 [pdf, ps, other]

Spatially Sparse Precoding in Wideband Hybrid Terahertz Massive MIMO Systems

Authors: Jiabao Gao, Caijun Zhong, Geoffrey Ye Li, Joseph B. Soriaga, Arash Behboodi

Abstract: In terahertz (THz) massive multiple-input multiple-output (MIMO) systems, the combination of huge bandwidth and massive antennas results in severe beam split, thus making the conventional phase-shifter based hybrid precoding architecture ineffective. With the incorporation of true-time-delay (TTD) lines in the hardware implementation of the analog precoders, delay-phase precoding (DPP) emerges as a promising architecture to effectively overcome beam split. However, existing DPP approaches suffer from poor performance, high complexity, and weak robustness in practical THz channels. In this paper, we propose a novel DPP approach in wideband THz massive MIMO systems. First, the optimization problem is converted into a compressive sensing (CS) form, which can be solved by the extended spatially sparse precoding (SSP) algorithm. To compensate for beam split, frequency-dependent measurement matrices are introduced, which can be approximately realized by feasible phase and delay codebooks. Then, several efficient atom selection techniques are developed to further reduce the complexity of extended SSP. In simulation, the proposed DPP approach achieves superior performance, complexity, and robustness by using it alone or in combination with existing DPP approaches.

Submitted 27 November, 2022; originally announced November 2022.

arXiv:2210.11889 [pdf, ps, other]

0/1 Constrained Optimization Solving Sample Average Approximation for Chance Constrained Programming

Authors: Shenglong Zhou, Lili Pan, Naihua Xiu, Geoffrey Ye Li

Abstract: Sample average approximation (SAA) is a tractable approach for dealing with chance constrained programming, a challenging stochastic optimization problem. The constraint of SAA is characterized by the $0/1$ loss function which results in considerable complexities in devising numerical algorithms. Most existing methods have been devised based on reformulations of SAA, such as binary integer programming or relaxed problems. However, the development of viable methods to directly tackle SAA remains elusive, let alone providing theoretical guarantees. In this paper, we investigate a general $0/1$ constrained optimization, providing a new way to address SAA rather than its reformulations. Specifically, starting with deriving the Bouligand tangent and Fréchet normal cones of the $0/1$ constraint, we establish several optimality conditions. One of them can be equivalently expressed by a system of equations, enabling the development of a semismooth Newton-type algorithm. The algorithm demonstrates a locally superlinear or quadratic convergence rate under standard assumptions, along with nice numerical performance compared to several leading solvers.

Submitted 30 April, 2024; v1 submitted 21 October, 2022; originally announced October 2022.
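As a rough illustration of the SAA constraint described in the abstract above, the following sketch checks feasibility by counting, with the 0/1 loss, how many samples violate a constraint. It only evaluates the constraint and is not the semismooth Newton-type algorithm of the paper; the constraint function `g`, the sample values, and the names `step` and `saa_feasible` are hypothetical.

```python
def step(t):
    """0/1 loss: returns 1 for a positive argument and 0 otherwise."""
    return 1 if t > 0 else 0


def saa_feasible(x, samples, g, alpha):
    """Check the sample average approximation of a chance constraint.

    The chance constraint P(g(x, xi) <= 0) >= 1 - alpha is approximated
    by requiring that at most a fraction alpha of the samples satisfy
    g(x, xi) > 0.
    """
    violations = sum(step(g(x, xi)) for xi in samples)
    return violations <= alpha * len(samples)


# Hypothetical constraint: a sample xi violates it when xi exceeds x.
def g(x, xi):
    return xi - x


samples = [0.2, 0.5, 0.9, 1.4, 0.7, 0.3, 1.1, 0.6, 0.8, 0.4]
print(saa_feasible(1.0, samples, g, alpha=0.25))  # 2 of 10 samples violate
print(saa_feasible(0.5, samples, g, alpha=0.25))  # 6 of 10 samples violate
```

The count of violations is a discontinuous function of `x`, which is the source of the numerical difficulty that the abstract attributes to the 0/1 loss.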

arXiv:2210.00473 [pdf, other]

Wireless Semantic Transmission via Revising Modules in Conventional Communications

Authors: Peiwen Jiang, Chao-Kai Wen, Shi Jin, Geoffrey Ye Li

Abstract: Semantic communication has become a popular research area due to its high spectrum efficiency and error-correction performance. Some studies use deep learning to extract semantic features, which usually form end-to-end semantic communication systems and have difficulty adapting to varying wireless environments. Therefore, novel semantic-based coding methods and performance metrics have been investigated, and the designed semantic systems consist of various modules as in conventional communications but with improved functions. This article discusses recent achievements in state-of-the-art semantic communications exploiting the conventional modules in wireless systems. We demonstrate through two examples that the traditional hybrid automatic repeat request and modulation methods can be redesigned for novel semantic coding and metrics to further improve the performance of wireless semantic communications. At the end of this article, some open issues are identified.

Submitted 2 October, 2022; originally announced October 2022.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2209.02649 [pdf, other]

Learn to Adapt to New Environment from Past Experience and Few Pilot

Authors: Ouya Wang, Jiabao Gao, Geoffrey Ye Li

Abstract: In recent years, deep learning has been widely applied in communications and achieved remarkable performance improvement. Most of the existing works are based on data-driven deep learning, which requires a significant amount of training data for the communication model to adapt to new environments and consumes huge computing resources for collecting data and retraining the model. In this paper, we significantly reduce the required amount of training data for new environments by leveraging the learning experience from the known environments. To this end, we introduce few-shot learning to enable the communication model to generalize to new environments, which is realized by an attention-based method. With the attention network embedded into the deep learning-based communication model, environments with different power delay profiles can be learnt together in the training process, which is called the learning experience. By exploiting the learning experience, the communication model requires only a few pilot blocks to perform well in the new environment. Through an example of deep-learning-based channel estimation, we demonstrate that this novel design method achieves better performance than the existing data-driven approach designed for few-shot learning.

Submitted 2 September, 2022; originally announced September 2022.

Comments: 11 pages, 8 figures

arXiv:2208.11231 [pdf, other]

Exact Penalty Method for Federated Learning

Authors: Shenglong Zhou, Geoffrey Ye Li

Abstract: Federated learning has burgeoned recently in machine learning, giving rise to a variety of research topics. Popular optimization algorithms are based on the frameworks of the (stochastic) gradient descent methods or the alternating direction method of multipliers. In this paper, we deploy an exact penalty method to deal with federated learning and propose an algorithm, FedEPM, that tackles four critical issues in federated learning: communication efficiency, computational complexity, stragglers' effect, and data privacy. Moreover, it is proven to be convergent and shown to have high numerical performance.

Submitted 4 December, 2022; v1 submitted 23 August, 2022; originally announced August 2022.

arXiv:2208.01828 [pdf, other]

LEO Satellite-Enabled Grant-Free Random Access with MIMO-OTFS

Authors: Boxiao Shen, Yongpeng Wu, Wenjun Zhang, Geoffrey Ye Li, Jianping An, Chengwen Xing

Abstract: This paper investigates joint channel estimation and device activity detection in LEO satellite-enabled grant-free random access systems with large differential delay and Doppler shift. In addition, multiple-input multiple-output (MIMO) with orthogonal time-frequency space modulation (OTFS) is utilized to combat the dynamics of the terrestrial-satellite link. To simplify the computation process, we estimate the channel tensor in parallel along the delay dimension. Then, deep learning and the expectation-maximization approach are integrated into generalized approximate message passing with a cross-correlation-based Gaussian prior to capture the channel sparsity in the delay-Doppler-angle domain and learn the hyperparameters. Finally, active devices are detected by computing the energy of the estimated channel. Simulation results demonstrate that the proposed algorithms outperform conventional methods.

Submitted 2 August, 2022; originally announced August 2022.

Comments: This paper has been accepted for presentation at the IEEE GLOBECOM 2022. arXiv admin note: text overlap with arXiv:2202.13058

arXiv:2208.00714 [pdf, other]

Hybrid Precoding for Mixture Use of Phase Shifters and Switches in mmWave Massive MIMO

Authors: Chenhao Qi, Qiang Liu, Xianghao Yu, Geoffrey Ye Li

Abstract: A variable-phase-shifter (VPS) architecture with hybrid precoding for mixture use of phase shifters and switches is proposed for millimeter wave massive multiple-input multiple-output communications. For the VPS architecture, a hybrid precoding design (HPD) scheme, called VPS-HPD, is proposed to optimize the phases according to the channel state information by alternately optimizing the analog precoder and digital precoder. To reduce the computational complexity of the VPS-HPD scheme, a low-complexity HPD scheme for the VPS architecture (VPS-LC-HPD) including alternating optimization in three stages is then proposed, where each stage has a closed-form solution and can be efficiently implemented. To reduce the hardware complexity introduced by the large number of switches, we consider a group-connected VPS architecture and propose an HPD scheme, where the HPD problem is divided into multiple independent subproblems with each subproblem flexibly solved by the VPS-HPD or VPS-LC-HPD scheme. Simulation results verify the effectiveness of the proposed schemes and show that they can achieve satisfactory spectral efficiency performance with reduced computational complexity or hardware complexity.

Submitted 1 August, 2022; originally announced August 2022.

arXiv:2206.14383 [pdf, other]

Overview of Deep Learning-based CSI Feedback in Massive MIMO Systems

Authors: Jiajia Guo, Chao-Kai Wen, Shi Jin, Geoffrey Ye Li

Abstract: Many performance gains achieved by massive multiple-input and multiple-output depend on the accuracy of the downlink channel state information (CSI) at the transmitter (base station), which is usually obtained by estimating at the receiver (user terminal) and feeding back to the transmitter. The overhead of CSI feedback occupies substantial uplink bandwidth resources, especially when the number of the transmit antennas is large. Deep learning (DL)-based CSI feedback refers to CSI compression and reconstruction by a DL-based autoencoder and can greatly reduce feedback overhead. In this paper, a comprehensive overview of state-of-the-art research on this topic is provided, beginning with basic DL concepts widely used in CSI feedback and then categorizing and describing some existing DL-based feedback works. The focus is on novel neural network architectures and utilization of communication expert knowledge to improve CSI feedback accuracy. Works on bit-level CSI feedback and joint design of CSI feedback with other communication modules are also introduced, and some practical issues, including training dataset collection, online training, complexity, generalization, and standardization effect, are discussed. At the end of the paper, some challenges and potential research directions associated with DL-based CSI feedback in future wireless communication systems are identified.

Submitted 28 June, 2022; originally announced June 2022.

Comments: 28 pages, 33 figures, 6 tables. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2206.09379 [pdf, other]

0/1 Deep Neural Networks via Block Coordinate Descent

Authors: Hui Zhang, Shenglong Zhou, Geoffrey Ye Li, Naihua Xiu

Abstract: The step function is one of the simplest and most natural activation functions for deep neural networks (DNNs). As it counts 1 for positive variables and 0 for others, its intrinsic characteristics (e.g., discontinuity and no viable information of subgradients) have impeded its development for several decades. Even though there is an impressive body of work on designing DNNs with continuous activation functions that can be deemed as surrogates of the step function, the step function still possesses some advantageous properties, such as complete robustness to outliers and being capable of attaining the best learning-theoretic guarantee of predictive accuracy. Hence, in this paper, we aim to train DNNs with the step function used as an activation function (dubbed 0/1 DNNs). We first reformulate 0/1 DNNs as an unconstrained optimization problem and then solve it by a block coordinate descent (BCD) method. Moreover, we acquire closed-form solutions for sub-problems of BCD as well as its convergence properties. Furthermore, we also integrate $\ell_{2,0}$-regularization into 0/1 DNN to accelerate the training process and compress the network scale. As a result, the proposed algorithm has a desirable performance on classifying MNIST, FashionMNIST, Cifar10, and Cifar100 datasets.

Submitted 31 August, 2023; v1 submitted 19 June, 2022; originally announced June 2022.
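To make the step activation described in the abstract above concrete, here is a minimal sketch of the forward pass of one dense layer that uses it. It shows the activation only and does not reproduce the paper's block coordinate descent training; the weights, bias, input, and the function names `step` and `step_layer` are hypothetical.

```python
def step(t):
    """Step activation: 1 for a positive argument, 0 otherwise."""
    return 1.0 if t > 0 else 0.0


def step_layer(weights, bias, inputs):
    """Forward pass of one dense layer followed by the step activation."""
    outputs = []
    for row, b in zip(weights, bias):
        # Affine pre-activation for one neuron.
        pre_activation = sum(w * x for w, x in zip(row, inputs)) + b
        outputs.append(step(pre_activation))
    return outputs


# Hypothetical layer with two neurons and a two-dimensional input.
weights = [[1.0, -1.0], [0.5, 0.5]]
bias = [0.0, -2.0]
inputs = [2.0, 1.0]
print(step_layer(weights, bias, inputs))  # pre-activations are 1.0 and -0.5
```

Because `step` is piecewise constant, its derivative is zero wherever it is defined, so plain gradient-based training receives no useful signal through it, which is consistent with the abstract's remark about the lack of subgradient information.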

arXiv:2206.04011 [pdf, ps, other]

Robust Semantic Communications with Masked VQ-VAE Enabled Codebook

Authors: Qiyu Hu, Guangyi Zhang, Zhijin Qin, Yunlong Cai, Guanding Yu, Geoffrey Ye Li

Abstract: Although semantic communications have exhibited satisfactory performance for a large number of tasks, the impact of semantic noise and the robustness of the systems have not been well investigated. Semantic noise refers to the mismatch between the intended semantic symbols and the received ones, which causes tasks to fail. In this paper, we first propose a framework for robust end-to-end semantic communication systems to combat the semantic noise. In particular, we analyze sample-dependent and sample-independent semantic noise. To combat the semantic noise, adversarial training with weight perturbation is developed to incorporate the samples with semantic noise in the training dataset. Then, we propose to mask a portion of the input, where the semantic noise appears frequently, and design the masked vector quantized-variational autoencoder (VQ-VAE) with the noise-related masking strategy. We use a discrete codebook shared by the transmitter and the receiver for encoded feature representation. To further improve the system robustness, we develop a feature importance module (FIM) to suppress the noise-related and task-unrelated features. Thus, the transmitter simply needs to transmit the indices of these important task-related features in the codebook. Simulation results show that the proposed method can be applied in many downstream tasks and significantly improve the robustness against semantic noise with a remarkable reduction in transmission overhead.

Submitted 18 April, 2023; v1 submitted 8 June, 2022; originally announced June 2022.

Comments: 16 pages, 11 figures. arXiv admin note: text overlap with arXiv:2202.03338

Showing 1–50 of 184 results for author: Li, G Y