arXiv:2407.06373 [pdf]

Enhancing super-resolution ultrasound localisation through multi-frame deconvolution exploiting spatiotemporal coherence

Authors: Su Yan, Clotilde Vié, Marcelo Lerendegui, Herman Verinaz-Jadan, Jipeng Yan, Martina Tashkova, James Burn, Bingxue Wang, Gary Frost, Kevin G. Murphy, Meng-Xing Tang

Abstract: Super-resolution ultrasound imaging through microbubble (MB) localisation and tracking, also known as ultrasound localisation microscopy, allows non-invasive sub-diffraction resolution imaging of microvasculature in animals and humans. The number of MBs localised from the acquired contrast-enhanced ultrasound (CEUS) images and the localisation precision directly influence the quality of the resulting super-resolution microvasculature images. However, non-negligible noise present in the CEUS images can make localising MBs challenging. To enhance the MB localisation performance, we propose a Multi-Frame Deconvolution (MF-Decon) framework that can exploit the spatiotemporal coherence inherent in the CEUS data, with new spatial and temporal regularisers designed based on total variation (TV) and regularisation by denoising (RED). Based on the MF-Decon framework, we introduce two novel methods: MF-Decon with spatial and temporal TVs (MF-Decon+3DTV) and MF-Decon with spatial RED and temporal TV (MF-Decon+RED+TV). Results from in silico simulations indicate that our methods outperform two widely used methods using deconvolution or normalised cross-correlation across all evaluation metrics, including precision, recall, $F_1$ score, mean and standard localisation errors. In particular, our methods improve MB localisation precision by up to 39% and recall by up to 12%. Super-resolution microvasculature maps generated with our methods on a publicly available in vivo rat brain dataset show less noise, better contrast, higher resolution and more vessel structures.

Submitted 8 July, 2024; originally announced July 2024.

Comments: 26 pages, 1 table, 7 figures

arXiv:2406.11163 [pdf, other]

Explainable Bayesian Recurrent Neural Smoother to Capture Global State Evolutionary Correlations

Authors: Shi Yan, Yan Liang, Huayu Zhang, Le Zheng, Difan Zou, Binglu Wang

Abstract: Through integrating the evolutionary correlations across global states in the bidirectional recursion, an explainable Bayesian recurrent neural smoother (EBRNS) is proposed for offline data-assisted fixed-interval state smoothing. At first, the proposed model, containing global states in the evolutionary interval, is transformed into an equivalent model with bidirectional memory. This transformation incorporates crucial global state information with support for bi-directional recursive computation. For the transformed model, the joint state-memory-trend Bayesian filtering and smoothing frameworks are derived by introducing the bidirectional memory iteration mechanism and offline data into Bayesian estimation theory. The derived frameworks are implemented using the Gaussian approximation to ensure analytical properties and computational efficiency. Finally, the neural network modules within EBRNS and its two-stage training scheme are designed. Unlike most existing approaches that artificially combine deep learning and model-based estimation, the bidirectional recursion and internal gated structures of EBRNS are naturally derived from Bayesian estimation theory, explainably integrating prior model knowledge, online measurement, and offline data. Experiments on representative real-world datasets demonstrate that the high smoothing accuracy of EBRNS is accompanied by data efficiency and a lightweight parameter scale.

Submitted 16 June, 2024; originally announced June 2024.

arXiv:2405.11289 [pdf, other]

Diffusion Model Driven Test-Time Image Adaptation for Robust Skin Lesion Classification

Authors: Ming Hu, Siyuan Yan, Peng Xia, Feilong Tang, Wenxue Li, Peibo Duan, Lin Zhang, Zongyuan Ge

Abstract: Deep learning-based diagnostic systems have demonstrated potential in skin disease diagnosis. However, their performance can easily degrade on test domains due to distribution shifts caused by input-level corruptions, such as imaging equipment variability, brightness changes, and image blur. This will reduce the reliability of model deployment in real-world scenarios. Most existing solutions focus on adapting the source model through retraining on different target domains. Although effective, this retraining process is sensitive to the amount of data and the hyperparameter configuration for optimization. In this paper, we propose a test-time image adaptation method to enhance the accuracy of the model on test data by simultaneously updating and predicting test images. We modify the target test images by projecting them back to the source domain using a diffusion model. Specifically, we design a structure guidance module that adds refinement operations through low-pass filtering during reverse sampling, regularizing the diffusion to preserve structural information. Additionally, we introduce a self-ensembling scheme that automatically adjusts the reliance on adapted and unadapted inputs, enhancing adaptation robustness by rejecting inappropriate generative modeling results. To facilitate this study, we constructed the ISIC2019-C and Dermnet-C corruption robustness evaluation benchmarks. Extensive experiments on the proposed benchmarks demonstrate that our method makes the classifier more robust across various corruptions, architectures, and data regimes. Our datasets and code will be available at https://github.com/minghu0830/Skin-TTA_Diffusion.

Submitted 18 May, 2024; originally announced May 2024.

arXiv:2403.18621 [pdf, other]

doi 10.1109/TVT.2024.3420880

Performance Analysis of Integrated Sensing and Communication Networks with Blockage Effects

Authors: Zezhong Sun, Shi Yan, Ning Jiang, Jiaen Zhou, Mugen Peng

Abstract: Communication-sensing integration represents an up-and-coming area of research, enabling wireless networks to simultaneously perform communication and sensing tasks. However, in urban cellular networks, the blockage of buildings results in a complex signal propagation environment, affecting the performance analysis of integrated sensing and communication (ISAC) networks. To overcome this obstacle, this paper constructs a comprehensive framework considering building blockage and employs a distance-correlated blockage model to analyze interference from line of sight (LoS), non-line of sight (NLoS), and target reflection cascading (TRC) links. Using stochastic geometric theory, expressions for signal-to-interference-plus-noise ratio (SINR) and coverage probability for communication and sensing in the presence of blockage are derived, allowing for a comprehensive comparison under the same parameters. The research findings indicate that blockage can positively impact coverage, especially in enhancing communication performance. The analysis also suggests that there exists an optimal base station (BS) density when blockage is of the same order of magnitude as the BS density, maximizing communication or sensing coverage probability.

Submitted 2 July, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: This paper has been accepted by IEEE Transactions on Vehicular Technology

arXiv:2403.05808 [pdf, other]

Adaptive Multi-modal Fusion of Spatially Variant Kernel Refinement with Diffusion Model for Blind Image Super-Resolution

Authors: Junxiong Lin, Yan Wang, Zeng Tao, Boyang Wang, Qing Zhao, Haorang Wang, Xuan Tong, Xinji Mai, Yuxuan Lin, Wei Song, Jiawen Yu, Shaoqi Yan, Wenqiang Zhang

Abstract: Pre-trained diffusion models utilized for image generation encapsulate a substantial reservoir of a priori knowledge pertaining to intricate textures. Harnessing the potential of leveraging this a priori knowledge in the context of image super-resolution presents a compelling avenue. Nonetheless, prevailing diffusion-based methodologies presently overlook the constraints imposed by degradation information on the diffusion process. Furthermore, these methods fail to consider the spatial variability inherent in the estimated blur kernel, stemming from factors such as motion jitter and out-of-focus elements in open-environment scenarios. This oversight results in a notable deviation of the image super-resolution effect from fundamental realities. To address these concerns, we introduce a framework known as Adaptive Multi-modal Fusion of Spatially Variant Kernel Refinement with Diffusion Model for Blind Image Super-Resolution (SSR). Within the SSR framework, we propose a Spatially Variant Kernel Refinement (SVKR) module. SVKR estimates a Depth-Informed Kernel, which takes the depth information into account and is spatially variant. Additionally, SVKR enhances the accuracy of depth information acquired from LR images, allowing for mutual enhancement between the depth map and blur kernel estimates. Finally, we introduce the Adaptive Multi-Modal Fusion (AMF) module to align the information from three modalities: low-resolution images, depth maps, and blur kernels. This alignment can constrain the diffusion model to generate more authentic SR results.

Submitted 9 July, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

arXiv:2401.03002 [pdf, other]

Prompt-driven Latent Domain Generalization for Medical Image Classification

Authors: Siyuan Yan, Chi Liu, Zhen Yu, Lie Ju, Dwarikanath Mahapatra, Brigid Betz-Stablein, Victoria Mar, Monika Janda, Peter Soyer, Zongyuan Ge

Abstract: Deep learning models for medical image analysis easily suffer from distribution shifts caused by dataset artifact bias, camera variations, differences in the imaging station, etc., leading to unreliable diagnoses in real-world clinical settings. Domain generalization (DG) methods, which aim to train models on multiple domains to perform well on unseen domains, offer a promising direction to solve the problem. However, existing DG methods assume domain labels of each image are available and accurate, which is typically feasible for only a limited number of medical datasets. To address these challenges, we propose a novel DG framework for medical image classification without relying on domain labels, called Prompt-driven Latent Domain Generalization (PLDG). PLDG consists of unsupervised domain discovery and prompt learning. This framework first discovers pseudo domain labels by clustering the bias-associated style features, then leverages collaborative domain prompts to guide a Vision Transformer to learn knowledge from discovered diverse domains. To facilitate cross-domain knowledge learning between different prompts, we introduce a domain prompt generator that enables knowledge sharing between domain prompts and a shared prompt. A domain mixup strategy is additionally employed for more flexible decision margins and mitigates the risk of incorrect domain assignments. Extensive experiments on three medical image classification tasks and one debiasing task demonstrate that our method can achieve performance comparable or even superior to that of conventional DG algorithms without relying on domain labels. Our code will be publicly available once the paper is accepted.

Submitted 5 January, 2024; originally announced January 2024.

Comments: 10 pages

arXiv:2311.05477 [pdf, other]

Using ResNet to Utilize 4-class T2-FLAIR Slice Classification Based on the Cholinergic Pathways Hyperintensities Scale for Pathological Aging

Authors: Wei-Chun Kevin Tsai, Yi-Chien Liu, Ming-Chun Yu, Chia-Ju Chou, Sui-Hing Yan, Yang-Teng Fan, Yan-Hsiang Huang, Yen-Ling Chiu, Yi-Fang Chuang, Ran-Zan Wang, Yao-Chia Shih

Abstract: The Cholinergic Pathways Hyperintensities Scale (CHIPS) is a visual rating scale used to assess the extent of cholinergic white matter hyperintensities in T2-FLAIR images, serving as an indicator of dementia severity. However, the manual selection of four specific slices for rating throughout the entire brain is a time-consuming process. Our goal was to develop a deep learning-based model capable of automatically identifying the four slices relevant to CHIPS. To achieve this, we trained a 4-class slice classification model (BSCA) using the ADNI T2-FLAIR dataset (N=150) with the assistance of ResNet. Subsequently, we tested the model's performance on a local dataset (N=30). The results demonstrated the efficacy of our model, with an accuracy of 99.82% and an F1-score of 99.83%. This achievement highlights the potential impact of BSCA as an automatic screening tool, streamlining the selection of four specific T2-FLAIR slices that encompass white matter landmarks along the cholinergic pathways. Clinicians can leverage this tool to assess the risk of clinical dementia development efficiently.

Submitted 9 November, 2023; originally announced November 2023.

Comments: 8 pages, 2 figures, 2 tables

arXiv:2310.17523 [pdf, ps, other]

Adaptive Resource Management for Edge Network Slicing using Incremental Multi-Agent Deep Reinforcement Learning

Authors: Haiyuan Li, Yuelin Liu, Xueqing Zhou, Xenofon Vasilakos, Reza Nejabati, Shuangyi Yan, Dimitra Simeonidou

Abstract: Multi-access edge computing provides local resources in mobile networks as the essential means for meeting the demands of emerging ultra-reliable low-latency communications. At the edge, dynamic computing requests require advanced resource management for adaptive network slicing, including resource allocations, function scaling and load balancing to utilize only the necessary resources in resource-constrained networks. Recent solutions are designed for a static number of slices. Therefore, the painful process of optimization is required again with any update on the number of slices. In addition, these solutions intend to maximize instant rewards, neglecting long-term resource scheduling. Unlike these efforts, we propose an algorithmic approach based on multi-agent deep deterministic policy gradient (MADDPG) for optimizing resource management for edge network slicing. Our objective is two-fold: (i) maximizing long-term network slicing benefits in terms of delay and energy consumption, and (ii) adapting to slice number changes. Through simulations, we demonstrate that MADDPG outperforms benchmark solutions including a static slicing-based one from the literature, achieving stable and high long-term performance. Additionally, we leverage incremental learning to facilitate a dynamic number of edge slices, with enhanced performance compared to pre-trained base models. Remarkably, this approach yields superior reward performance while saving approximately 90% of training time costs.

Submitted 27 October, 2023; v1 submitted 26 October, 2023; originally announced October 2023.

arXiv:2310.17187 [pdf, other]

Explainable Gated Bayesian Recurrent Neural Network for Non-Markov State Estimation

Authors: Shi Yan, Yan Liang, Le Zheng, Mingyang Fan, Xiaoxu Wang, Binglu Wang

Abstract: The optimality of Bayesian filtering relies on the completeness of prior models, while deep learning holds a distinct advantage in learning models from offline data. Nevertheless, the current fusion of these two methodologies remains largely ad hoc, lacking a theoretical foundation. This paper presents a novel solution, namely an explainable gated Bayesian recurrent neural network specifically designed for state estimation under model mismatches. Firstly, we transform the non-Markov state-space model into an equivalent first-order Markov model with memory. It is a generalized transformation that overcomes the limitations of the first-order Markov property and enables recursive filtering. Secondly, by deriving a data-assisted joint state-memory-mismatch Bayesian filtering, we design a Bayesian gated framework that includes a memory update gate for capturing the temporal regularities in state evolution, a state prediction gate with the evolution mismatch compensation, and a state update gate with the observation mismatch compensation. The Gaussian approximation implementation of the filtering process within the gated framework is derived, taking into account the computational efficiency. Finally, the corresponding internal neural network structures and end-to-end training methods are designed. The Bayesian filtering theory enhances the interpretability of the proposed gated network, enabling the effective integration of offline data and prior models within functionally explicit gated units. In comprehensive experiments, including simulations and real-world datasets, the proposed gated network demonstrates superior estimation performance compared to benchmark filters and state-of-the-art deep learning filtering methods.

Submitted 7 March, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

arXiv:2309.15140 [pdf, other]

doi 10.1109/CVCI59596.2023.10397371

A Review on AI Algorithms for Energy Management in E-Mobility Services

Authors: Sen Yan, Maqsood Hussain Shah, Ji Li, Noel O'Connor, Mingming Liu

Abstract: E-mobility, or electric mobility, has emerged as a pivotal solution to address pressing environmental and sustainability concerns in the transportation sector. The depletion of fossil fuels, escalating greenhouse gas emissions, and the imperative to combat climate change underscore the significance of transitioning to electric vehicles (EVs). This paper seeks to explore the potential of artificial intelligence (AI) in addressing various challenges related to effective energy management in e-mobility systems (EMS). These challenges encompass critical factors such as range anxiety, charge rate optimization, and the longevity of energy storage in EVs. By analyzing existing literature, we delve into the role that AI can play in tackling these challenges and enabling efficient energy management in EMS. Our objectives are twofold: to provide an overview of the current state-of-the-art in this research domain and propose effective avenues for future investigations. Through this analysis, we aim to contribute to the advancement of sustainable and efficient e-mobility solutions, shaping a greener and more sustainable future for transportation.

Submitted 26 September, 2023; originally announced September 2023.

Comments: 8 pages, 4 tables, 1 figure

arXiv:2308.04278 [pdf, other]

Achieving Covert Communication With A Probabilistic Jamming Strategy

Authors: Xun Chen, Fujun Gao, Min Qiu, Jia Zhang, Feng Shu, Shihao Yan

Abstract: In this work, we consider a covert communication scenario, where a transmitter Alice communicates to a receiver Bob with the aid of a probabilistic and uninformed jammer against an adversary warden's detection. The transmission status and power of the jammer are random and follow some a priori probabilities. We first analyze the warden's detection performance as a function of the jammer's transmission probability, transmit power distribution, and Alice's transmit power. We then maximize the covert throughput from Alice to Bob subject to a covertness constraint, by designing the covert communication strategies from three different perspectives: Alice's perspective, the jammer's perspective, and the global perspective. Our analysis reveals that the minimum jamming power should not always be zero in the probabilistic jamming strategy, which is different from that in the continuous jamming strategy presented in the literature. In addition, we prove that the minimum jamming power should be the same as Alice's covert transmit power, depending on the covertness and average jamming power constraints. Furthermore, our results show that the probabilistic jamming can outperform the continuous jamming in terms of achieving a higher covert throughput under the same covertness and average jamming power constraints.

Submitted 29 August, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

arXiv:2308.02915 [pdf, other]

doi 10.1145/3581783.3612307

DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation

Authors: Qiaosong Qi, Le Zhuo, Aixi Zhang, Yue Liao, Fei Fang, Si Liu, Shuicheng Yan

Abstract: When hearing music, it is natural for people to dance to its rhythm. Automatic dance generation, however, is a challenging task due to the physical constraints of human motion and rhythmic alignment with target music. Conventional autoregressive methods introduce compounding errors during sampling and struggle to capture the long-term structure of dance sequences. To address these limitations, we present a novel cascaded motion diffusion model, DiffDance, designed for high-resolution, long-form dance generation. This model comprises a music-to-dance diffusion model and a sequence super-resolution diffusion model. To bridge the gap between music and motion for conditional generation, DiffDance employs a pretrained audio representation learning model to extract music embeddings and further align its embedding space to motion via contrastive loss. During training our cascaded diffusion model, we also incorporate multiple geometric losses to constrain the model outputs to be physically plausible and add a dynamic loss weight that adaptively changes over diffusion timesteps to facilitate sample diversity. Through comprehensive experiments performed on the benchmark dataset AIST++, we demonstrate that DiffDance is capable of generating realistic dance sequences that align effectively with the input music. These results are comparable to those achieved by state-of-the-art autoregressive methods.

Submitted 5 August, 2023; originally announced August 2023.

Comments: Accepted at ACM MM 2023

arXiv:2307.03028 [pdf, other]

doi 10.1109/TWC.2023.3293315

Performance Analysis and Approximate Message Passing Detection of Orthogonal Time Sequency Multiplexing Modulation

Authors: Zeping Sui, Shefeng Yan, Hongming Zhang, Sumei Sun, Yonghong Zeng, Lie-Liang Yang, Lajos Hanzo

Abstract: In orthogonal time sequency multiplexing (OTSM) modulation, the information symbols are conveyed in the delay-sequency domain upon exploiting the inverse Walsh Hadamard transform (IWHT). It has been shown that OTSM is capable of attaining a bit error ratio (BER) similar to that of orthogonal time-frequency space (OTFS) modulation at a lower complexity, owing to the saving of multiplication operations in the IWHT. Hence we provide its BER performance analysis and characterize its detection complexity. We commence by deriving its generalized input-output relationship and its unconditional pairwise error probability (UPEP). Then, its BER upper bound is derived in closed form under both ideal and imperfect channel estimation conditions, which is shown to be tight at moderate to high signal-to-noise ratios (SNRs). Moreover, a novel approximate message passing (AMP) aided OTSM detection framework is proposed. Specifically, to circumvent the high residual BER of the conventional AMP detector, we propose a vector AMP-based expectation-maximization (VAMP-EM) detector for performing joint data detection and noise variance estimation. The variance auto-tuning algorithm based on the EM algorithm is designed for the VAMP-EM detector to further improve the convergence performance. The simulation results illustrate that the VAMP-EM detector is capable of striking a more attractive BER vs. complexity trade-off than the state-of-the-art schemes as well as providing better convergence. Finally, we propose AMP and VAMP-EM turbo receivers for low-density parity-check (LDPC)-coded OTSM systems. It is demonstrated that our proposed VAMP-EM turbo receiver is capable of providing both BER and convergence performance improvements over the conventional AMP solution.

Submitted 6 July, 2023; originally announced July 2023.

Comments: Accepted in IEEE Transactions on Wireless Communications

arXiv:2306.00412 [pdf, ps, other]

Beamforming Design for IRS-and-UAV-Aided Two-Way Amplify-and-Forward Relay Networks in Maritime IoT

Authors: Xuehui Wang, Feng Shu, Yuanyuan Wu, Weiping Shi, Shihao Yan, Yifan Zhao, Qiankun Cheng, Zhongwen Sun, Jiangzhou Wang

Abstract: In this paper, an intelligent reflecting surface (IRS)-and-unmanned aerial vehicle (UAV)-assisted two-way amplify-and-forward (AF) relay network in maritime Internet of Things (IoT) is proposed, where ship1 ($\text{S}_1$) and ship2 ($\text{S}_2$) can be viewed as data collecting centers. To enhance the message exchange rate between $\text{S}_1$ and $\text{S}_2$, a problem of maximizing minimum rate is cast, where the variables, namely AF relay beamforming matrix and IRS phase shifts of two time slots, need to be optimized. To achieve a maximum rate, a low-complexity alternately iterative (AI) scheme based on zero forcing and successive convex approximation (LC-ZF-SCA) algorithm is presented. To obtain a significant rate enhancement, a high-performance AI method based on one step, semidefinite programming and penalty SCA (ONS-SDP-PSCA) is proposed. Simulation results show that, with the proposed LC-ZF-SCA and ONS-SDP-PSCA methods, the rate of the IRS-and-UAV-assisted AF relay network surpasses those of the networks with random phase and with AF relay only. Moreover, ONS-SDP-PSCA performs better than LC-ZF-SCA in terms of rate.

Submitted 24 June, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

arXiv:2305.12778 [pdf, other]

STAR-RIS-UAV Aided Coordinated Multipoint Cellular System for Multi-user Networks

Authors: Baihua Shi, Yang Wang, Danqi Li, Wenlong Cai, Jinyong Lin, Shuo Zhang, Weiping Shi, Shihao Yan, Feng Shu

Abstract: Unlike a conventional reconfigurable intelligent surface (RIS), a simultaneous transmitting and reflecting RIS (STAR-RIS) can both reflect and transmit the signals to the receiver. In this paper, to serve more ground users and increase the deployment flexibility, we investigate a wireless communication system aided by an unmanned aerial vehicle equipped with a STAR-RIS (STAR-RIS-UAV) for multi-user networks. Energy splitting (ES) and mode switching (MS) protocols are considered to control the reflection and transmission coefficients of STAR-RIS elements. To maximize the sum rate of the STAR-RIS-UAV aided coordinated multipoint cellular system for multi-user networks, the corresponding beamforming vectors as well as the transmission and reflection coefficient matrices are optimized. Specifically, instead of adopting the alternating optimization, we design an iteration method to optimize all variables for both ES and MS protocols at the same time. Simulation results reveal that the STAR-RIS-UAV aided wireless communication system has a much higher sum rate than the system with conventional RIS or without RIS. Furthermore, the proposed structure is more flexible than a fixed STAR-RIS and could greatly promote the sum rate.

Submitted 22 May, 2023; originally announced May 2023.

Comments: 10 pages, 6 figures

arXiv:2304.06783 [pdf, other]

A Distributionally Robust Approach to Regret Optimal Control using the Wasserstein Distance

Authors: Feras Al Taha, Shuhao Yan, Eilyan Bitar

Abstract: This paper proposes a distributionally robust approach to regret optimal control of discrete-time linear dynamical systems with quadratic costs subject to a stochastic additive disturbance on the state process. The underlying probability distribution of the disturbance process is unknown, but assumed to lie in a given ball of distributions defined in terms of the type-2 Wasserstein distance. In this framework, strictly causal linear disturbance feedback controllers are designed to minimize the worst-case expected regret. The regret incurred by a controller is defined as the difference between the cost it incurs in response to a realization of the disturbance process and the cost incurred by the optimal noncausal controller which has perfect knowledge of the disturbance process realization at the outset. Building on a well-established duality theory for optimal transport problems, we derive a reformulation of the minimax regret optimal control problem as a tractable semidefinite program. Using the equivalent dual reformulation, we characterize a worst-case distribution achieving the worst-case expected regret in relation to the distribution at the center of the Wasserstein ball. We compare the minimax regret optimal control design method with the distributionally robust optimal control approach using an illustrative example and numerical experiments.

Submitted 16 August, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

Comments: 8 pages, 3 figures, to appear in the proceedings of the 2023 IEEE Conference on Decision and Control (CDC)
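
The regret objective described in the abstract above can be written compactly. The following is only a sketch in notation that is assumed here rather than taken from the paper: $\pi$ is a strictly causal controller, $w$ a disturbance realization, $J(\pi, w)$ the quadratic cost, $\Pi_{\mathrm{c}}$ and $\Pi_{\mathrm{nc}}$ the causal and noncausal controller classes, $\hat{P}$ the distribution at the center of the Wasserstein ball, and $r$ its radius.

```latex
\begin{align}
  % Regret: cost of the causal controller minus the cost of the
  % optimal noncausal controller that knows w at the outset.
  R(\pi, w) &= J(\pi, w) - \min_{\pi' \in \Pi_{\mathrm{nc}}} J(\pi', w), \\
  % Distributionally robust design: minimize the worst-case expected
  % regret over all distributions in the type-2 Wasserstein ball.
  \min_{\pi \in \Pi_{\mathrm{c}}} \;
  & \sup_{P \,:\, W_2(P, \hat{P}) \le r} \;
    \mathbb{E}_{w \sim P}\bigl[ R(\pi, w) \bigr].
\end{align}
```

Setting $r = 0$ in this sketch collapses the ball to $\hat{P}$ alone, so the inner supremum reduces to the expected regret under the center distribution.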

arXiv:2304.05105 [pdf, other]

Learning-based Rigid Tube Model Predictive Control

Authors: Yulong Gao, Shuhao Yan, Jian Zhou, Mark Cannon, Alessandro Abate, Karl H. Johansson

Abstract: This paper is concerned with model predictive control (MPC) of discrete-time linear systems subject to bounded additive disturbance and mixed constraints on the state and input, where the true disturbance set is unknown. Unlike most existing work on robust MPC, we propose an algorithm incorporating online learning that builds on prior knowledge of the disturbance, i.e., a known but conservative disturbance set. We approximate the true disturbance set at each time step with a parameterised set, which is referred to as a quantified disturbance set, using disturbance realisations. A key novelty is that the parameterisation of these quantified disturbance sets enjoys desirable properties such that the quantified disturbance set and its corresponding rigid tube bounding disturbance propagation can be efficiently updated online. We provide statistical gaps between the true and quantified disturbance sets, based on which probabilistic recursive feasibility of MPC optimisation problems is discussed. Numerical simulations are provided to demonstrate the effectiveness of our proposed algorithm and compare it with conventional robust MPC algorithms.

Submitted 21 May, 2024; v1 submitted 11 April, 2023; originally announced April 2023.

Comments: 8 pages

arXiv:2302.10391 [pdf, other]

doi 10.3390/drones7050318

Low-Complexity Three-Dimensional AOA-Cross Geometric Center Localization Methods via Multi-UAV network

Authors: Baihua Shi, Yifan Li, Guilu Wu, Shihao Yan, Feng Shu

Abstract: Angle of arrival (AOA) is widely used to locate a wireless signal emitter in unmanned aerial vehicle (UAV) localization. Compared with received signal strength (RSS) and time of arrival (TOA), it has higher accuracy and is not sensitive to time synchronization of the distributed sensors. However, few works focus on the three-dimensional (3-D) scenario. Furthermore, although the maximum likelihood estimator (MLE) has relatively high performance, its computational complexity is very high, which makes it hard to employ in practical applications. This paper proposes two multiplane geometric center based methods for 3-D AOA in UAV positioning. The first method, called CIS, estimates the source position and the angle measurement noise at the same time by seeking the center of an inscribed sphere. Firstly, every sensor measures two angles, the azimuth angle and the elevation angle. Based on these, two planes are constructed. Then, the estimates of the source position and the angle noise are obtained by seeking the center and radius of the corresponding inscribed sphere. Removing the estimation of the radius yields the second algorithm, called MSD-LS. It cannot estimate the angle noise but has lower computational complexity. Theoretical analysis and simulation results show that the proposed methods can approach the Cramer-Rao lower bound (CRLB) and have lower complexity than MLE.

Submitted 16 May, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

Comments: 7 pages, 5 figures

Journal ref: Drones 2023, 7(5), 318

arXiv:2212.12055 [pdf, ps, other]

DRL-based Energy-Efficient Baseband Function Deployments for Service-Oriented Open RAN

Authors: Haiyuan Li, Amin Emami, Karcius Assis, Antonis Vafeas, Ruizhi Yang, Reza Nejabati, Shuangyi Yan, Dimitra Simeonidou

Abstract: Open Radio Access Network (Open RAN) has gained tremendous attention from industry and academia with decentralized baseband functions across multiple processing units located at different places. However, the ever-expanding scope of RANs, along with fluctuations in resource utilization across different locations and timeframes, necessitates the implementation of robust function management policies to minimize network energy consumption. Most recently developed strategies neglected the activation time and the required energy for the server activation process, while this process could offset the potential energy savings gained from server hibernation. Furthermore, user plane functions, which can be deployed on edge computing servers to provide low-latency services, have not been sufficiently considered. In this paper, a multi-agent deep reinforcement learning (DRL) based function deployment algorithm, coupled with a heuristic method, has been developed to minimize energy consumption while fulfilling multiple requests and adhering to latency and resource constraints. In an 8-MEC network, the DRL-based solution approaches the performance of the benchmark while offering up to 51% energy savings compared to existing approaches. In a larger network of 14-MEC, it maintains a 38% energy-saving advantage and ensures real-time response capabilities. Furthermore, this paper prototypes an Open RAN testbed to verify the feasibility of the proposed solution.

Submitted 4 October, 2023; v1 submitted 22 December, 2022; originally announced December 2022.

arXiv:2211.00222 [pdf, other]

SDMuse: Stochastic Differential Music Editing and Generation via Hybrid Representation

Authors: Chen Zhang, Yi Ren, Kejun Zhang, Shuicheng Yan

Abstract: While deep generative models have empowered music generation, it remains a challenging and under-explored problem to edit an existing musical piece at fine granularity. In this paper, we propose SDMuse, a unified Stochastic Differential Music editing and generation framework, which can not only compose a whole musical piece from scratch, but also modify existing musical pieces in many ways, such as combination, continuation, inpainting, and style transferring. The proposed SDMuse follows a two-stage pipeline to achieve music generation and editing on top of a hybrid representation including pianoroll and MIDI-event. In particular, SDMuse first generates/edits pianoroll by iteratively denoising through a stochastic differential equation (SDE) based on a diffusion model generative prior, and then refines the generated pianoroll and predicts MIDI-event tokens auto-regressively. We evaluate the music generated by our method on the ailabs1k7 pop music dataset in terms of quality and controllability on various music editing and generation tasks. Experimental results demonstrate the effectiveness of our proposed stochastic differential music editing and generation process, as well as the hybrid representations.

Submitted 2 November, 2022; v1 submitted 31 October, 2022; originally announced November 2022.

arXiv:2210.17496 [pdf, other]

doi 10.1109/TCI.2023.3288343

A Fast Automatic Method for Deconvoluting Macro X-ray Fluorescence Data Collected from Easel Paintings

Authors: Su Yan, Jun-Jie Huang, Herman Verinaz-Jadan, Nathan Daly, Catherine Higgitt, Pier Luigi Dragotti

Abstract: Macro X-ray Fluorescence (MA-XRF) scanning is increasingly widely used by researchers in heritage science to analyse easel paintings as one of a suite of non-invasive imaging techniques. The task of processing the resulting MA-XRF datacube in order to produce individual chemical element maps is called MA-XRF deconvolution. While there are several existing methods that have been proposed for MA-XRF deconvolution, they require a degree of manual intervention from the user that can affect the final results. The state-of-the-art AFRID approach can automatically deconvolute the datacube without user input, but it has a long processing time and does not exploit spatial dependency. In this paper, we propose two versions of a fast automatic deconvolution (FAD) method for MA-XRF datacubes collected from easel paintings with ADMM (alternating direction method of multipliers) and FISTA (fast iterative shrinkage-thresholding algorithm). The proposed FAD method not only automatically analyses the datacube and produces high-quality element distribution maps with spatial dependency considered, but also significantly reduces the running time. The results generated on the MA-XRF datacubes collected from two easel paintings from the National Gallery, London, verify the performance of the proposed FAD method.

Submitted 31 October, 2022; originally announced October 2022.

Comments: 13 pages, 13 figures

arXiv:2208.12923 [pdf, other]

Global RTK Positioning in Graphical State Space

Authors: Yihong Ge, Sudan Yan, Shaolin Lü, Cong Li

Abstract: This paper proposes a new method for RTK post-processing. Unlike the traditional forward-backward Kalman filter, our method builds the whole system equation on a graphical state space model and solves it by factor graph optimization. The position solution provided by the forward Kalman filter is used as the linearization points of the graphical state space model. Constant variables, such as double-difference ambiguity, exist as constants in the graphical state space model, not as time-series variables. Experimental results show that factor graph optimization with a graphical state space model is more effective than a Kalman filter with a traditional discrete-time state space model for the RTK post-processing problem.

Submitted 8 November, 2022; v1 submitted 26 August, 2022; originally announced August 2022.

arXiv:2205.03269 [pdf, ps, other]

Two Rapid Power Iterative DOA Estimators for UAV Emitter Using Massive/Ultra-massive Receive Array

Authors: Yiwen Chen, Feng Shu, Qijuan Jie, Xichao Zhan, Xuehui Wang, Zhongwen Sun, Shihao Yan, Wenlong Cai, Peng Zhang, Peng Chen

Abstract: To provide rapid direction finding (DF) for unmanned aerial vehicle (UAV) emitters in future wireless networks, a low-complexity direction of arrival (DOA) estimation architecture for massive multiple input multiple output (MIMO) receiver arrays is constructed. In this paper, we propose two strategies to address the extremely high complexity caused by eigenvalue decomposition of the received signal covariance matrix. Firstly, a rapid power-iterative rotational invariance (RPI-RI) method is proposed, which adopts the signal subspace generated by power iteration to get the final direction estimation through rotational invariance between subarrays. RPI-RI makes a significant complexity reduction at the cost of a substantial performance loss. In order to further reduce the complexity and provide a good directional measurement result, a rapid power-iterative polynomial rooting (RPI-PR) method is proposed, which utilizes the noise subspace combined with a polynomial solution method to get the optimal direction estimation. In addition, the influence of initial vector selection on convergence in the power iteration is analyzed, especially when the initial vector is orthogonal to the incident wave. Simulation results show that the two proposed methods outperform the conventional DOA estimation methods in terms of computational complexity. In particular, the RPI-PR method achieves more than two orders of magnitude lower complexity than conventional methods and achieves performance close to the CRLB. Moreover, it is verified that the initial vector and the relative error have a significant impact on the computational complexity.

Submitted 23 April, 2023; v1 submitted 6 May, 2022; originally announced May 2022.
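
The abstract above relies on power iteration as a cheaper substitute for a full eigenvalue decomposition. The block below is a minimal, generic sketch of textbook power iteration on a small real symmetric matrix. It is not the paper's RPI-RI or RPI-PR method, and the function name `power_iteration`, its parameters, and the 2x2 example matrix are hypothetical choices made for illustration.

```python
import math


def power_iteration(matrix, start, iterations=200, tol=1e-12):
    """Estimate one eigenpair of a real symmetric matrix by power iteration.

    Generic textbook sketch (assumption: small dense matrix given as a list
    of rows); it converges to the dominant eigenpair only if `start` has a
    component along the dominant eigenvector.
    """
    n = len(matrix)
    norm = math.sqrt(sum(x * x for x in start))
    if norm == 0.0:
        raise ValueError("start vector must be non-zero")
    v = [x / norm for x in start]
    eigenvalue = 0.0
    for _ in range(iterations):
        # Multiply by the matrix, then renormalise to unit length.
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("iterate collapsed to zero (start in null space)")
        v_next = [x / norm for x in w]
        # Rayleigh quotient of the unit vector gives the eigenvalue estimate.
        mv = [sum(matrix[i][j] * v_next[j] for j in range(n)) for i in range(n)]
        new_eigenvalue = sum(v_next[i] * mv[i] for i in range(n))
        converged = abs(new_eigenvalue - eigenvalue) < tol
        v, eigenvalue = v_next, new_eigenvalue
        if converged:
            break
    return eigenvalue, v


# Hypothetical example: eigenvalues are 3 (eigenvector [1, 1]) and 1 ([1, -1]).
A = [[2.0, 1.0], [1.0, 2.0]]
dominant_value, dominant_vector = power_iteration(A, [1.0, 0.0])
# A start vector orthogonal to the dominant eigenvector never reaches it.
stuck_value, _ = power_iteration(A, [1.0, -1.0])
```

In this toy case the second call stays on the smaller eigenvalue, which gives a rough picture of why the choice of initial vector matters for convergence.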

arXiv:2203.04568 [pdf, other]

PHTrans: Parallelly Aggregating Global and Local Representations for Medical Image Segmentation

Authors: Wentao Liu, Tong Tian, Weijin Xu, Huihua Yang, Xipeng Pan, Songlin Yan, Lemeng Wang

Abstract: The success of Transformer in computer vision has attracted increasing attention in the medical imaging community. Especially for medical image segmentation, many excellent hybrid architectures based on convolutional neural networks (CNNs) and Transformer have been presented and achieve impressive performance. However, most of these methods, which embed modular Transformer into CNNs, struggle to reach their full potential. In this paper, we propose a novel hybrid architecture for medical image segmentation called PHTrans, which parallelly hybridizes Transformer and CNN in main building blocks to produce hierarchical representations from global and local features and adaptively aggregate them, aiming to fully exploit their strengths to obtain better segmentation performance. Specifically, PHTrans follows the U-shaped encoder-decoder design and introduces the parallel hybrid module in deep stages, where convolution blocks and the modified 3D Swin Transformer learn local features and global dependencies separately, then a sequence-to-volume operation unifies the dimensions of the outputs to achieve feature aggregation. Extensive experimental results on both Multi-Atlas Labeling Beyond the Cranial Vault and Automated Cardiac Diagnosis Challenge datasets corroborate its effectiveness, consistently outperforming state-of-the-art methods. The code is available at: https://github.com/lseventeen/PHTrans.

Submitted 23 July, 2022; v1 submitted 9 March, 2022; originally announced March 2022.

Comments: 10 pages, 3 figures

Journal ref: MICCAI2022

arXiv:2203.00917 [pdf, other]

Machine Learning Methods for Inferring the Number of UAV Emitters via Massive MIMO Receive Array

Authors: Yifan Li, Feng Shu, Jinsong Hu, Shihao Yan, Haiwei Song, Weiqiang Zhu, Da Tian, Yaoliang Song, Jiangzhou Wang

Abstract: To provide important prior knowledge for the DOA estimation of UAV emitters in future wireless networks, we present a complete DOA preprocessing system for inferring the number of emitters via a massive MIMO receive array. Firstly, in order to eliminate the noise signals, two high-precision signal detectors, square root of maximum eigenvalue times minimum eigenvalue (SR-MME) and geometric mean (GM), are proposed. Compared to other detectors, SR-MME and GM can achieve a high detection probability while maintaining an extremely low false alarm probability. Secondly, if the existence of emitters is determined by the detectors, we need to further confirm their number. Therefore, we perform feature extraction on the eigenvalue sequence of the sample covariance matrix to construct a feature vector and propose a multi-layer neural network (ML-NN). Additionally, a support vector machine (SVM) and a naive Bayesian classifier (NBC) are also designed. The simulation results show that the machine learning-based methods can achieve good results in signal classification, especially neural networks, which can always maintain the classification accuracy above 70% with a massive MIMO receive array. Finally, we analyze the classical signal classification methods, the Akaike information criterion (AIC) and minimum description length (MDL). It is concluded that the two methods are not suitable for scenarios with massive MIMO arrays, and they also have much worse performance than machine learning-based classifiers.

Submitted 10 March, 2023; v1 submitted 2 March, 2022; originally announced March 2022.

arXiv:2202.05382 [pdf, other]

Give me a knee radiograph, I will tell you where the knee joint area is: a deep convolutional neural network adventure

Authors: Shi Yan, Taghi Ramazanian, Elham Sagheb, Walter K. Kremers, Vipin Chaudhary, Michael Taunton, Hilal Maradit Kremers, Ahmad P. Tafti

Abstract: Knee pain is undoubtedly the most common musculoskeletal symptom that impairs quality of life and confines mobility and functionality across all ages. Knee pain is clinically evaluated by routine radiographs, where the widespread adoption of radiographic images and their availability at low cost make them the principal component in the assessment of knee pain and knee pathologies, such as arthritis, trauma, and sport injuries. However, interpretation of the knee radiographs is still highly subjective, and overlapping structures within the radiographs and the large volume of images needing to be analyzed on a daily basis make interpretation challenging for both naive and experienced practitioners. There is thus a need to implement an artificial intelligence strategy to objectively and automatically interpret knee radiographs, facilitating triage of abnormal radiographs in a timely fashion. The current work proposes an accurate and effective pipeline for autonomous detection, localization, and classification of the knee joint area in plain radiographs, combining the You Only Look Once (YOLO v3) deep convolutional neural network with a large and fully-annotated knee radiographs dataset. The present work is expected to stimulate more interest from the deep learning computer vision community in this pragmatic and clinical application.

Submitted 10 February, 2022; originally announced February 2022.

Comments: 13 Pages, 4 Figures

arXiv:2111.08380 [pdf, other]

doi 10.1145/3474085.3475195

Video Background Music Generation with Controllable Music Transformer

Authors: Shangzhe Di, Zeren Jiang, Si Liu, Zhaokai Wang, Leyan Zhu, Zexin He, Hongming Liu, Shuicheng Yan

Abstract: In this work, we address the task of video background music generation. Some previous works achieve effective music generation but are unable to generate melodious music tailored to a particular video, and none of them considers the video-music rhythmic consistency. To generate the background music that matches the given video, we first establish the rhythmic relations between video and background music. In particular, we connect timing, motion speed, and motion saliency from video with beat, simu-note density, and simu-note strength from music, respectively. We then propose CMT, a Controllable Music Transformer that enables local control of the aforementioned rhythmic features and global control of the music genre and instruments. Objective and subjective evaluations show that the generated background music has achieved satisfactory compatibility with the input videos, and at the same time, impressive music quality. Code and models are available at https://github.com/wzk1015/video-bgm-generation.

Submitted 16 November, 2021; originally announced November 2021.

Comments: Accepted to ACM Multimedia 2021. Project website at https://wzk1015.github.io/cmt/

arXiv:2111.01544 [pdf]

Comprehensive and Clinically Accurate Head and Neck Organs at Risk Delineation via Stratified Deep Learning: A Large-scale Multi-Institutional Study

Authors: Dazhou Guo, Jia Ge, Xianghua Ye, Senxiang Yan, Yi Xin, Yuchen Song, Bing-shen Huang, Tsung-Min Hung, Zhuotun Zhu, Ling Peng, Yanping Ren, Rui Liu, Gong Zhang, Mengyuan Mao, Xiaohua Chen, Zhongjie Lu, Wenxiang Li, Yuzhen Chen, Lingyun Huang, Jing Xiao, Adam P. Harrison, Le Lu, Chien-Yu Lin, Dakai Jin, Tsung-Ying Ho

Abstract: Accurate organ at risk (OAR) segmentation is critical to reduce the radiotherapy post-treatment complications. Consensus guidelines recommend a set of more than 40 OARs in the head and neck (H&N) region; however, due to the predictable prohibitive labor-cost of this task, most institutions choose a substantially simplified protocol by delineating a smaller subset of OARs and neglecting the dose distributions associated with other OARs. In this work we propose a novel, automated and highly effective stratified OAR segmentation (SOARS) system using deep learning to precisely delineate a comprehensive set of 42 H&N OARs. SOARS stratifies 42 OARs into anchor, mid-level, and small & hard subcategories, with specifically derived neural network architectures for each category by neural architecture search (NAS) principles. We built SOARS models using 176 training patients in an internal institution and independently evaluated on 1327 external patients across six different institutions. It consistently outperformed other state-of-the-art methods by at least 3-5% in Dice score for each institutional evaluation (up to 36% relative error reduction in other metrics). More importantly, extensive multi-user studies evidently demonstrated that 98% of the SOARS predictions need only very minor or no revisions for direct clinical acceptance (saving 90% of radiation oncologists' workload), and their segmentation and dosimetric accuracy are within or smaller than the inter-user variation. These findings confirmed the strong clinical applicability of SOARS for the OAR delineation process in H&N cancer radiotherapy workflows, with improved efficiency, comprehensiveness, and quality.

Submitted 1 November, 2021; originally announced November 2021.
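
The abstract above reports segmentation quality as a Dice score. As a reminder of what that metric measures, the block below is a minimal sketch of the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), computed on sets of voxel coordinates. The function name `dice_score`, the convention of returning 1.0 for two empty masks, and the toy masks are assumptions made for illustration, not details from the paper.

```python
def dice_score(predicted_voxels, reference_voxels):
    """Dice coefficient between two segmentations given as voxel coordinates.

    Assumption: each mask is an iterable of hashable voxel indices.
    Two empty masks are treated as a perfect match (score 1.0).
    """
    predicted = set(predicted_voxels)
    reference = set(reference_voxels)
    total = len(predicted) + len(reference)
    if total == 0:
        return 1.0
    overlap = len(predicted & reference)
    return 2.0 * overlap / total


# Hypothetical toy masks on a 2x2 grid: they share two of three voxels each.
predicted_mask = {(0, 0), (0, 1), (1, 1)}
reference_mask = {(0, 1), (1, 1), (1, 0)}
toy_dice = dice_score(predicted_mask, reference_mask)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they do not overlap at all.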

arXiv:2109.09271 [pdf, ps, other]

DeepStationing: Thoracic Lymph Node Station Parsing in CT Scans using Anatomical Context Encoding and Key Organ Auto-Search

Authors: Dazhou Guo, Xianghua Ye, Jia Ge, Xing Di, Le Lu, Lingyun Huang, Guotong Xie, Jing Xiao, Zhongjie Liu, Ling Peng, Senxiang Yan, Dakai Jin

Abstract: Lymph node station (LNS) delineation from computed tomography (CT) scans is an indispensable step in radiation oncology workflow. High inter-user variabilities across oncologists and prohibitive labor costs motivated the automated approach. Previous works exploit anatomical priors to infer LNS based on predefined ad-hoc margins. However, without voxel-level supervision, the performance is severely limited. Because LNS is highly context-dependent (LNS boundaries are constrained by anatomical organs), we formulate it as a deep spatial and contextual parsing problem via encoded anatomical organs. This permits the deep network to better learn from both CT appearance and organ context. We develop a stratified referencing organ segmentation protocol that divides the organs into anchor and non-anchor categories and uses the former's predictions to guide the segmentation of the latter. We further develop an auto-search module to identify the key organs that opt for the optimal LNS parsing performance. Extensive four-fold cross-validation experiments on a dataset of 98 esophageal cancer patients (with the most comprehensive set of 12 LNSs + 22 organs in thoracic region to date) are conducted. Our LNS parsing model produces significant performance improvements, with an average Dice score of 81.1% +/- 6.1%, which is 5.0% and 19.2% higher than the pure CT-based deep model and the previous representative approach, respectively.

Submitted 19 September, 2021; originally announced September 2021.

arXiv:2107.09934 [pdf, other]

DOA Estimation for Hybrid Massive MIMO Systems using Mixed-ADCs: Performance Loss and Energy Efficiency

Authors: Baihua Shi, Qi Zhang, Rongen Dong, Qijuan Jie, Shihao Yan, Feng Shu, Jiangzhou Wang

Abstract: Due to the power consumption and high circuit cost in antenna arrays, the practical application of massive multiple-input multiple-output (MIMO) in the sixth generation (6G) and future wireless networks is still challenging. Employing low-resolution analog-to-digital converters (ADCs) and a hybrid analog and digital (HAD) structure are two low-cost choices with acceptable performance loss. In this paper, the combination of the mixed-ADC architecture and HAD structure employed at the receiver is proposed for direction of arrival (DOA) estimation, which will be applied to the beamforming tracking and alignment in 6G. By adopting the additive quantization noise model, the exact closed-form expression of the Cramér-Rao lower bound (CRLB) for the HAD architecture with mixed-ADCs is derived. Moreover, the closed-form expression of the performance loss factor is derived as a benchmark. In addition, to take power consumption into account, energy efficiency is also investigated in our paper. The numerical results reveal that the HAD structure with mixed-ADCs can significantly reduce the power consumption and hardware cost. Furthermore, that architecture is able to achieve a better trade-off between the performance loss and the power consumption. Finally, adopting 2-4 bits of resolution may be a good choice in practical massive MIMO systems.

Submitted 19 May, 2023; v1 submitted 21 July, 2021; originally announced July 2021.

Comments: 12 pages, 7 figures

arXiv:2107.02660 [pdf, ps, other]

doi 10.1109/TIP.2023.3309408

HybrUR: A Hybrid Physical-Neural Solution for Unsupervised Underwater Image Restoration

Authors: Shuaizheng Yan, Xingyu Chen, Zhengxing Wu, Min Tan, Junzhi Yu

Abstract: Robust vision restoration of underwater images remains a challenge. Owing to the lack of well-matched underwater and in-air images, unsupervised methods based on the cyclic generative adversarial framework have been widely investigated in recent years. However, when using an end-to-end unsupervised approach with only unpaired image data, mode collapse could occur, and the color correction of the restored images is usually poor. In this paper, we propose a data- and physics-driven unsupervised architecture to perform underwater image restoration from unpaired underwater and in-air images. For effective color correction and quality enhancement, an underwater image degeneration model must be explicitly constructed based on the optically unambiguous physics law. Thus, we employ the Jaffe-McGlamery degeneration theory to design a generator and use neural networks to model the process of underwater visual degeneration. Furthermore, we impose physical constraints on the scene depth and degeneration factors for backscattering estimation to avoid the vanishing gradient problem during the training of the hybrid physical-neural model. Experimental results show that the proposed method can be used to perform high-quality restoration of unconstrained underwater images without supervision. On multiple benchmarks, the proposed method outperforms several state-of-the-art supervised and unsupervised approaches. We demonstrate that our method yields encouraging results in real-world applications.

Submitted 6 October, 2023; v1 submitted 6 July, 2021; originally announced July 2021.

Comments: 13 pages, 9 figures

Journal ref: IEEE Transactions on Image Processing, vol. 32, pp. 5004-5016, 2023

arXiv:2106.07953 [pdf, other]

Learning to Compensate: A Deep Neural Network Framework for 5G Power Amplifier Compensation

Authors: Po-Yu Chen, Hao Chen, Yi-Min Tsai, Hsien-Kai Kuo, Hantao Huang, Hsin-Hung Chen, Sheng-Hong Yan, Wei-Lun Ou, Chia-Ming Cheng

Abstract: Owing to the complicated characteristics of 5G communication systems, designing RF components through mathematical modeling becomes a challenging obstacle. Moreover, such mathematical models need numerous manual adjustments for various specification requirements. In this paper, we present a learning-based framework to model and compensate Power Amplifiers (PAs) in 5G communication. In the proposed framework, Deep Neural Networks (DNNs) are used to learn the characteristics of the PAs, while corresponding Digital Pre-Distortions (DPDs) are also learned to compensate for the nonlinear and memory effects of PAs. On top of the framework, we further propose two frequency domain losses to guide the learning process to better optimize the target, compared to naive time domain Mean Square Error (MSE). The proposed framework serves as a drop-in replacement for the conventional approach. The proposed approach achieves an average of 56.7% reduction of nonlinear and memory effects, which converts to an average of 16.3% improvement over a carefully-designed mathematical model, and even reaches 34% enhancement in severe distortion scenarios.

Submitted 15 June, 2021; originally announced June 2021.

Comments: IEEE International Conference on Communications (ICC) 2021

arXiv:2103.04367 [pdf, other]

doi 10.1109/LSP.2021.3062777

Adaptive Detection of Dim Maneuvering Targets in Adjacent Range Cells

Authors: Sheng Yan, Pia Addabbo, Chengpeng Hao, Danilo Orlando

Abstract: This letter addresses the detection problem of dim maneuvering targets in the presence of range cell migration. Specifically, it is assumed that the moving target can appear in more than one range cell within the transmitted pulse train. Then, the Bayesian information criterion and the generalized likelihood ratio test design procedure are jointly exploited to come up with six adaptive decision schemes capable of estimating the range indices related to the target migration. The computational complexity of the proposed detectors is also studied and suitably reduced. Simulation results show the effectiveness of the newly proposed solutions also for a limited set of training data and in comparison with suitable counterparts.

Submitted 7 March, 2021; originally announced March 2021.

Comments: 5 pages

MSC Class: 62Cxx

arXiv:2101.07502 [pdf, other]

UAV-Enabled Cooperative Jamming for Covert Communications

Authors: Hangmei Rao, Sa Xiao, Shihao Yan, Jianquan Wang, Wanbin Tang

Abstract: This work employs an unmanned aerial vehicle (UAV) as a jammer to aid a covert communication from a transmitter Alice to a receiver Bob, where the UAV transmits artificial noise (AN) with random power to deliberately create interference to a warden Willie. In the considered system, the UAV's trajectory is critical to the covert communication performance, since the AN transmitted by the UAV also generates interference to Bob. To maximize the system performance, we formulate an optimization problem to jointly design the UAV's trajectory and Alice's transmit power. The formulated optimization problem is non-convex and is normally solved by a conventional iterative (CI) method, which requires multiple approximations based on Taylor expansions and an initialization on the UAV's trajectory. In order to eliminate these requirements, this work, for the first time, develops a geometric (GM) method to solve the optimization problem. By analyzing the covertness constraint, the GM method decouples the joint optimization into optimizing the UAV's trajectory and Alice's transmit power separately. Our examination shows that the GM method can significantly outperform the CI method in terms of achieving a higher average covert rate and the complexity of the GM method is lower than that of the CI method.

Submitted 25 February, 2022; v1 submitted 19 January, 2021; originally announced January 2021.

arXiv:2011.03726 [pdf, ps, other]

Intelligent Reflecting Surface (IRS)-Aided Covert Wireless Communications with Delay Constraint

Authors: Xiaobo Zhou, Shihao Yan, Qingqing Wu, Feng Shu, Derrick Wing Kwan Ng

Abstract: This work examines the performance gain achieved by deploying an intelligent reflecting surface (IRS) in covert communications. To this end, we formulate the joint design of the transmit power and the IRS reflection coefficients by taking into account the communication covertness for the cases with global channel state information (CSI) and without a warden's instantaneous CSI. For the case of global CSI, we first prove that perfect covertness is achievable with the aid of the IRS even for a single-antenna transmitter, which is impossible without an IRS. Then, we develop a penalty successive convex approximation (PSCA) algorithm to tackle the design problem. Considering the high complexity of the PSCA algorithm, we further propose a low-complexity two-stage algorithm, where analytical expressions for the transmit power and the IRS's reflection coefficients are derived. For the case without the warden's instantaneous CSI, we first derive the covertness constraint analytically facilitating the optimal phase shift design. Then, we consider three hardware-related constraints on the IRS's reflection amplitudes and determine their optimal designs together with the optimal transmit power. Our examination shows that significant performance gain can be achieved by deploying an IRS into covert communications.

Submitted 11 January, 2022; v1 submitted 7 November, 2020; originally announced November 2020.

arXiv:2010.09187 [pdf, other]

Neural Network Architectures for Location Estimation in the Internet of Things

Authors: Ihsan Ullah, Robert Malaney, Shihao Yan

Abstract: Artificial Intelligence (AI) solutions for wireless location estimation are likely to prevail in many real-world scenarios. In this work, we demonstrate for the first time how the Cramer-Rao upper bound on localization accuracy can facilitate efficient neural-network solutions for wireless location estimation. In particular, we demonstrate how the number of neurons for the network can be intelligently chosen, leading to AI location solutions that are not time-consuming to run and less likely to be plagued by over-fitting. Experimental verification of our approach is provided. Our new algorithms are directly applicable to location estimates in many scenarios including the Internet of Things, and vehicular networks where vehicular GPS coordinates are unreliable or need verifying. Our work represents the first successful AI solution for a communication problem whose neural-network design is based on fundamental information-theoretic constructs. We anticipate our approach will be useful for a wide range of communication problems beyond location estimation.

Submitted 18 October, 2020; originally announced October 2020.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2009.11071 [pdf, ps, other]

doi 10.1016/j.automatica.2022.110282

Stochastic output feedback MPC with intermittent observations

Authors: Shuhao Yan, Mark Cannon, Paul J. Goulart

Abstract: This paper designs a model predictive control (MPC) law for constrained linear systems with stochastic additive disturbances and noisy measurements, minimising a discounted cost subject to a discounted expectation constraint. It is assumed that sensor data is lost with a known probability. Taking into account the data losses modelled by a Bernoulli process, we parameterise the predicted control policy as an affine function of future observations and obtain a convex linear-quadratic optimal control problem. Constraint satisfaction and a discounted cost bound are ensured without imposing bounds on the distributions of the disturbance and noise inputs. In addition, the average long-run undiscounted closed loop cost is shown to be finite if the discount factor takes appropriate values. We analyse robustness of the proposed control law with respect to possible uncertainties in the arrival probability of sensor data and we bound the impact of these uncertainties on constraint satisfaction and the discounted cost. Numerical simulations are provided to illustrate these results.

Submitted 1 March, 2022; v1 submitted 21 September, 2020; originally announced September 2020.

Comments: 13 pages. arXiv admin note: text overlap with arXiv:2004.02591

arXiv:2007.07134 [pdf, ps, other]

doi 10.1109/TAC.2021.3128466

Stochastic MPC with Dynamic Feedback Gain Selection and Discounted Probabilistic Constraints

Authors: Shuhao Yan, Paul J. Goulart, Mark Cannon

Abstract: This paper considers linear discrete-time systems with additive disturbances, and designs a Model Predictive Control (MPC) law incorporating a dynamic feedback gain to minimise a quadratic cost function subject to a single chance constraint. The feedback gain is selected online and we provide two selection methods based on minimising upper bounds on predicted costs. The chance constraint is defined as a discounted sum of violation probabilities on an infinite horizon. By penalising violation probabilities close to the initial time and assigning violation probabilities in the far future with vanishingly small weights, this form of constraints allows for an MPC law with guarantees of recursive feasibility without a boundedness assumption on the disturbance. A computationally convenient MPC optimisation problem is formulated using Chebyshev's inequality and we introduce an online constraint-tightening technique to ensure recursive feasibility. The closed loop system is guaranteed to satisfy the chance constraint and a quadratic stability condition. With dynamic feedback gain selection, the closed loop cost is reduced and conservativeness of Chebyshev's inequality is mitigated. Also, a larger feasible set of initial conditions can be obtained. Numerical simulations are given to show these results.

Submitted 26 May, 2021; v1 submitted 14 July, 2020; originally announced July 2020.

Comments: 15 pages, 3 figures

arXiv:2007.01513 [pdf, other]

Joint Beam Training and Data Transmission Design for Covert Millimeter-Wave Communication

Authors: Jiayu Zhang, Min Li, Shihao Yan, Chunshan Liu, Xihan Chen, Minjian Zhao, Philip Whiting

Abstract: Covert communication prevents legitimate transmission from being detected by a warden while maintaining a certain covert rate at the intended user. Prior works have considered the design of covert communication over conventional low-frequency bands, but few works so far have explored the higher-frequency millimeter-wave (mmWave) spectrum. The directional nature of mmWave communication makes it attractive for covert transmission. However, how to establish such a directional link in a covert manner in the first place remains a significant challenge. In this paper, we consider a covert mmWave communication system, where legitimate parties Alice and Bob adopt a beam training approach for directional link establishment. Accounting for the training overhead, we develop a new design framework that jointly optimizes beam training duration, training power and data transmission power to maximize the effective throughput of the Alice-Bob link while ensuring the covertness constraint at warden Willie is met. We further propose a dual-decomposition successive convex approximation algorithm to solve the problem efficiently. Numerical studies demonstrate an interesting tradeoff among the key design parameters considered and also the necessity of joint design of beam training and data transmission for covert mmWave communication.

Submitted 11 July, 2020; v1 submitted 3 July, 2020; originally announced July 2020.

Comments: Submitted for possible journal publication

arXiv:2006.12159 [pdf, ps, other]

Covert Communications with Constrained Age of Information

Authors: Yida Wang, Shihao Yan, Weiwei Yang, Yueming Cai

Abstract: In this letter, we consider the requirement of information freshness in covert communications for the first time. With artificial noise (AN) generated from a full-duplex (FD) receiver, we formulate a covertness maximization problem under the average age of information (AoI) constraint to optimize the transmit probability of the information signal. In particular, the transmit probability not only represents the generation rate of the information signal but also represents the prior probability of the alternative hypothesis in covert communications, which builds up a bridge between information freshness and communication covertness. Our analysis shows that the best transmit probability is not always 0.5, which differs from the equal prior probabilities assumption in most related works on covert communications. Furthermore, the limitation of average AoI enlarges the transmit probability at the cost of the covertness reduction and leads to a positive lower bound on the information transmit power for non-zero covertness.

Submitted 29 July, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

Comments: 5 pages, 3 figures

arXiv:2006.01435 [pdf, other]

Recapture as You Want

Authors: Chen Gao, Si Liu, Ran He, Shuicheng Yan, Bo Li

Abstract: With the increasing prevalence and more powerful camera systems of mobile devices, people can conveniently take photos in their daily life, which naturally brings the demand for more intelligent photo post-processing techniques, especially on those portrait photos. In this paper, we present a portrait recapture method enabling users to easily edit their portrait to a desired posture/view, body figure and clothing style, which are very challenging to achieve since they require simultaneously performing non-rigid deformation of the human body, invisible body-parts reasoning and semantic-aware editing. We decompose the editing procedure into semantic-aware geometric and appearance transformation. In geometric transformation, a semantic layout map is generated that meets user demands to represent part-level spatial constraints and further guides the semantic-aware appearance transformation. In appearance transformation, we design two novel modules, Semantic-aware Attentive Transfer (SAT) and Layout Graph Reasoning (LGR), to conduct intra-part transfer and inter-part reasoning, respectively. The SAT module produces each human part by paying attention to the semantically consistent regions in the source portrait. It effectively addresses the non-rigid deformation issue and well preserves the intrinsic structure/appearance with rich texture details. The LGR module utilizes body skeleton knowledge to construct a layout graph that connects all relevant part features, where a graph reasoning mechanism is used to propagate information among part nodes to mine their relations. In this way, the LGR module infers invisible body parts and guarantees global coherence among all the parts. Extensive experiments on DeepFashion, Market-1501 and in-the-wild photos demonstrate the effectiveness and superiority of our approach. Video demo is at: https://youtu.be/vTyq9HL6jgw.

Submitted 2 June, 2020; originally announced June 2020.

Comments: 14 pages

arXiv:2004.03939 [pdf]

Image super-resolution reconstruction based on attention mechanism and feature fusion

Authors: Jiawen Lyn, Sen Yan

Abstract: To address the problems that convolutional neural networks neglect to capture the inherent attributes of natural images and extract features at only a single scale in image super-resolution reconstruction, a network structure based on attention mechanism and multi-scale feature fusion is proposed. By using the attention mechanism, the network can effectively integrate the non-local information and second-order features of the image, so as to improve the feature expression ability of the network. At the same time, convolution kernels of different scales are used to extract the multi-scale information of the image, so as to preserve the complete information characteristics at different scales. Experimental results show that the proposed method can achieve better performance than other representative super-resolution reconstruction algorithms in objective quantitative metrics and visual quality.

Submitted 8 April, 2020; originally announced April 2020.

arXiv:2003.04511 [pdf, other]

Mobility and Safety Benefits of Connectivity in CACC Vehicle Strings

Authors: Vamsi Vegamoor, Shaojie Yan, Sivakumar Rathinam, Swaroop Darbha

Abstract: In this paper, we re-examine the notion of string stability as it relates to safety by providing an upper bound on the maximum spacing error of any vehicle in a homogeneous platoon in terms of the input of the leading vehicle. We reinforce our previous work on lossy CACC platoons by accommodating burst-noise behavior in the V2V link. Further, through Monte Carlo type simulations, we demonstrate that connectivity can enhance traffic mobility and safety in a CACC string even when the deceleration capabilities of the vehicles in the platoon are heterogeneous.

Submitted 19 October, 2020; v1 submitted 9 March, 2020; originally announced March 2020.

Comments: Fixed some typos and references after acceptance to IEEE ITSC 2020

arXiv:2003.04046 [pdf, other]

doi 10.1109/IROS45743.2020.9340784

Efficiency and Equity are Both Essential: A Generalized Traffic Signal Controller with Deep Reinforcement Learning

Authors: Shengchao Yan, Jingwei Zhang, Daniel Büscher, Wolfram Burgard

Abstract: Traffic signal controllers play an essential role in today's traffic system. However, the majority of them are currently not sufficiently flexible or adaptive to generate optimal traffic schedules. In this paper we present an approach to learning policies for signal controllers using deep reinforcement learning aiming for optimized traffic flow. Our method uses a novel formulation of the reward function that simultaneously considers efficiency and equity. We furthermore present a general approach to find the bound for the proposed equity factor and we introduce the adaptive discounting approach that greatly stabilizes learning and helps to maintain a high flexibility of green light duration. The experimental evaluations on both simulated and real-world data demonstrate that our proposed algorithm achieves state-of-the-art performance (previously held by traditional non-learning methods) on a wide range of traffic situations.

Submitted 27 December, 2020; v1 submitted 9 March, 2020; originally announced March 2020.

Comments: Published as a conference paper at IROS 2020

arXiv:2002.08419 [pdf, ps, other]

Mode Selection and Resource Allocation in Sliced Fog Radio Access Networks: A Reinforcement Learning Approach

Authors: Hongyu Xiang, Mugen Peng, Yaohua Sun, Shi Yan

Abstract: The mode selection and resource allocation in fog radio access networks (F-RANs) have been advocated as key techniques to improve spectral and energy efficiency. In this paper, we investigate the joint optimization of mode selection and resource allocation in uplink F-RANs, where both traditional user equipments (UEs) and fog UEs are served by constructed network slice instances. The concerned optimization is formulated as a mixed-integer programming problem, and both the orthogonal and multiplexed subchannel allocation strategies are proposed to guarantee the slice isolation. Motivated by the development of machine learning, two reinforcement learning based algorithms are developed to solve the original high complexity problem under traditional and fog UEs' specific performance requirements. The basic idea of the proposals is to generate a good mode selection policy according to the immediate reward fed back by an environment. Simulation results validate the benefits of our proposed algorithms and show that a tradeoff between system power consumption and queue delay can be achieved.

Submitted 13 February, 2020; originally announced February 2020.

arXiv:2002.05485 [pdf, ps, other]

Deep Reinforcement Learning Based Mode Selection and Resource Allocation for Cellular V2X Communications

Authors: Xinran Zhang, Mugen Peng, Shi Yan, Yaohua Sun

Abstract: Cellular vehicle-to-everything (V2X) communication is crucial to support future diverse vehicular applications. However, for safety-critical applications, unstable vehicle-to-vehicle (V2V) links and high signalling overhead of centralized resource allocation approaches become bottlenecks. In this paper, we investigate a joint optimization problem of transmission mode selection and resource allocation for cellular V2X communications. In particular, the problem is formulated as a Markov decision process, and a deep reinforcement learning (DRL) based decentralized algorithm is proposed to maximize the sum capacity of vehicle-to-infrastructure users while meeting the latency and reliability requirements of V2V pairs. Moreover, considering the training limitation of local DRL models, a two-timescale federated DRL algorithm is developed to help obtain a robust model, wherein the graph theory based vehicle clustering algorithm is executed on a large timescale and, in turn, the federated learning algorithm is conducted on a small timescale. Simulation results show that the proposed DRL-based algorithm outperforms other decentralized baselines, and validate the superiority of the two-timescale federated DRL algorithm for newly activated V2V pairs.

Submitted 13 February, 2020; originally announced February 2020.

Comments: 12 pages, 11 figures, accepted by IEEE IoT Journal

arXiv:2002.05437 [pdf, ps, other]

Tradeoff between Ergodic Rate and Delivery Latency in Fog Radio Access Networks

Authors: Bonan Yin, Mugen Peng, Shi Yan, Chunjing Hu

Abstract: Wireless content caching has recently been considered as an efficient way in fog radio access networks (F-RANs) to alleviate the heavy burden on capacity-limited fronthaul links and reduce delivery latency. In this paper, an advanced minimal delay association policy is proposed to minimize latency while guaranteeing spectral efficiency in F-RANs. By utilizing stochastic geometry and queueing theory, closed-form expressions of successful delivery probability, average ergodic rate, and average delivery latency are derived, where both the traditional association policy based on accessing the base station with maximal received power and the proposed minimal delay association policy are concerned. Impacts of key operating parameters on the aforementioned performance metrics are explored. It is shown that the proposed association policy has a better delivery latency than the traditional association policy. Increasing the cache size of fog-computing based access points (F-APs) can more significantly reduce average delivery latency, compared with increasing the density of F-APs. Meanwhile, the latter comes at the expense of decreasing average ergodic rate. This implies that deploying a large cache size at F-APs, rather than a high density of F-APs, can promote performance effectively in F-RANs.

Submitted 13 February, 2020; originally announced February 2020.

arXiv:2001.00759 [pdf, ps, other]

UAV-Enabled Confidential Data Collection in Wireless Networks

Authors: Xiaobo Zhou, Shihao Yan, Min Li, Jun Li, Feng Shu

Abstract: This work, for the first time, considers confidential data collection in the context of unmanned aerial vehicle (UAV) wireless networks, where the scheduled ground sensor node (SN) intends to transmit confidential information to the UAV without being intercepted by other unscheduled ground SNs. Specifically, a full-duplex (FD) UAV collects data from each scheduled SN on the ground and generates artificial noise (AN) to prevent the scheduled SN's confidential information from being wiretapped by other unscheduled SNs. We first derive the reliability outage probability (ROP) and secrecy outage probability (SOP) of a considered fixed-rate transmission, based on which we formulate an optimization problem that maximizes the minimum average secrecy rate (ASR) subject to some specific constraints. We then transform the formulated optimization problem into a convex problem with the aid of first-order restrictive approximation technique and penalty method. The resultant problem is a generalized nonlinear convex programming (GNCP) and solving it directly still leads to a high complexity, which motivates us to further approximate this problem as a second-order cone program (SOCP) in order to reduce the computational complexity. Finally, we develop an iteration procedure based on penalty successive convex approximation (P-SCA) algorithm to pursue the solution to the formulated optimization problem. Our examination shows that the developed joint design achieves a significant performance gain compared to a benchmark scheme.

Submitted 3 January, 2020; originally announced January 2020.

Comments: 13 pages, 6 figures

arXiv:1912.12138 [pdf]

Convolutional Dictionary Pair Learning Network for Image Representation Learning

Authors: Zhao Zhang, Yulin Sun, Yang Wang, Zhengjun Zha, Shuicheng Yan, Meng Wang

Abstract: Both Dictionary Learning (DL) and Convolutional Neural Networks (CNN) are powerful image representation learning systems based on different mechanisms and principles; however, whether they can be seamlessly integrated to improve performance is worth exploring. To address this issue, we propose a novel generalized end-to-end representation learning architecture, dubbed Convolutional Dictionary Pair Learning Network (CDPL-Net) in this paper, which integrates the learning schemes of the CNN and dictionary pair learning into a unified framework. Generally, the architecture of CDPL-Net includes two convolutional/pooling layers and two dictionary pair learning (DPL) layers in the representation learning module. Besides, it uses two fully-connected layers as the multi-layer perceptron layer in the nonlinear classification module. In particular, the DPL layer can jointly formulate the discriminative synthesis and analysis representations driven by minimizing the batch-based reconstruction error over the flattened feature maps from the convolution/pooling layer. Moreover, the DPL layer uses the l1-norm on the analysis dictionary so that sparse representation can be delivered, and the embedding process will also be robust to noise. To speed up the training process of the DPL layer, the efficient stochastic gradient descent is used. Extensive simulations on real databases show that our CDPL-Net can deliver enhanced performance over other state-of-the-art methods.

Submitted 15 January, 2020; v1 submitted 16 December, 2019; originally announced December 2019.

Comments: Accepted by the 24th European Conference on Artificial Intelligence (ECAI 2020)

arXiv:1912.02037 [pdf, other]

AdversarialNAS: Adversarial Neural Architecture Search for GANs

Authors: Chen Gao, Yunpeng Chen, Si Liu, Zhenxiong Tan, Shuicheng Yan

Abstract: Neural Architecture Search (NAS) that aims to automate the procedure of architecture design has achieved promising results in many computer vision fields. In this paper, we propose an AdversarialNAS method specially tailored for Generative Adversarial Networks (GANs) to search for a superior generative model on the task of unconditional image generation. The AdversarialNAS is the first method that can search the architectures of generator and discriminator simultaneously in a differentiable manner. During searching, the designed adversarial search algorithm does not need to compute any extra metric to evaluate the performance of the searched architecture, and the search paradigm considers the relevance between the two network architectures and improves their mutual balance. Therefore, AdversarialNAS is very efficient and only takes 1 GPU day to search for a superior generative model in the proposed large search space ($10^{38}$). Experiments demonstrate the effectiveness and superiority of our method. The discovered generative model sets a new state-of-the-art FID score of $10.87$ and highly competitive Inception Score of $8.74$ on CIFAR-10. Its transferability is also proven by setting new state-of-the-art FID score of $26.98$ and Inception Score of $9.63$ on STL-10. Code is at: https://github.com/chengaopro/AdversarialNAS.

Submitted 8 April, 2020; v1 submitted 4 December, 2019; originally announced December 2019.

Comments: Accepted to CVPR 2020

Showing 1–50 of 61 results for author: Yan, S