-
Exact Fisher zeros and thermofield dynamics across a quantum critical point
Authors:
Yang Liu,
Songtai Lv,
Yuchen Meng,
Zefan Tan,
Erhai Zhao,
Haiyuan Zou
Abstract:
By setting the inverse temperature $β$ loose to occupy the complex plane, Michael E. Fisher showed that the zeros of the complex partition function $Z$, if approaching the real $β$ axis, reveal a thermodynamic phase transition. More recently, Fisher zeros have been used to mark the dynamical phase transition in quench dynamics. The success of Fisher zeros however seems limited, and it is unclear how they can be employed to shed light on quantum phase transitions or the non-unitary dynamics of open quantum systems. Here we answer this question by a comprehensive analysis of the (analytically continued) one-dimensional transverse field Ising model. We exhaust all the Fisher zeros to show that in the thermodynamic limit they congregate into a remarkably simple pattern in the form of continuous open or closed lines. These Fisher lines evolve smoothly as the coupling constant is tuned, and a qualitative change identifies the quantum critical point. By exploiting the connection between $Z$ and the thermofield double states, we obtain analytical expressions for the short- and long-time dynamics of the survival amplitude and the scaling of recurrence time at the quantum critical point. We further point out $Z$ can be realized and probed in monitored quantum circuits. The analytical results are corroborated by numerical tensor renormalization group which elevates the approach outlined here to a powerful tool for interacting quantum systems.
Submitted 8 July, 2024; v1 submitted 27 June, 2024;
originally announced June 2024.
-
HoneyGPT: Breaking the Trilemma in Terminal Honeypots with Large Language Model
Authors:
Ziyang Wang,
Jianzhou You,
Haining Wang,
Tianwei Yuan,
Shichao Lv,
Yang Wang,
Limin Sun
Abstract:
Honeypots, as a strategic cyber-deception mechanism designed to emulate authentic interactions and bait unauthorized entities, continue to struggle with balancing flexibility, interaction depth, and deceptive capability despite their evolution over decades. Often they also lack the capability of proactively adapting to an attacker's evolving tactics, which restricts the depth of engagement and subsequent information gathering. In this context, the emergent capabilities of large language models, in tandem with pioneering prompt-based engineering techniques, offer a transformative shift in the design and deployment of honeypot technologies. In this paper, we introduce HoneyGPT, a pioneering honeypot architecture based on ChatGPT, heralding a new era of intelligent honeypot solutions characterized by their cost-effectiveness, high adaptability, and enhanced interactivity, coupled with a predisposition for proactive attacker engagement. Furthermore, we present a structured prompt engineering framework that augments long-term interaction memory and robust security analytics. This framework, integrating chain-of-thought tactics attuned to honeypot contexts, enhances interactivity and deception, deepens security analytics, and ensures sustained engagement.
The evaluation of HoneyGPT includes two parts: a baseline comparison based on a collected dataset and a field evaluation in real scenarios for four weeks. The baseline comparison demonstrates HoneyGPT's remarkable ability to strike a balance among flexibility, interaction depth, and deceptive capability. The field evaluation further validates HoneyGPT's efficacy, showing its marked superiority in enticing attackers into more profound interactive engagements and capturing a wider array of novel attack vectors in comparison to existing honeypot technologies.
Submitted 3 June, 2024;
originally announced June 2024.
-
RELICS: a REactor neutrino LIquid xenon Coherent elastic Scattering experiment
Authors:
Chang Cai,
Guocai Chen,
Jiangyu Chen,
Rundong Fang,
Fei Gao,
Xiaoran Guo,
Jiheng Guo,
Tingyi He,
Chengjie Jia,
Gaojun Jin,
Yipin Jing,
Gaojun Ju,
Yang Lei,
Jiayi Li,
Kaihang Li,
Meng Li,
Minhua Li,
Shengchao Li,
Siyin Li,
Tao Li,
Qing Lin,
Jiajun Liu,
Minghao Liu,
Sheng Lv,
Guang Luo
, et al. (24 additional authors not shown)
Abstract:
Coherent elastic neutrino-nucleus scattering (CEvNS) provides a unique probe for neutrino properties Beyond the Standard Model (BSM) physics. REactor neutrino LIquid xenon Coherent Scattering experiment (RELICS), a proposed reactor neutrino program using liquid xenon time projection chamber (LXeTPC) technology, aims to investigate the CEvNS process of antineutrinos off xenon atomic nuclei. In this work, the design of the experiment is studied and optimized based on Monte Carlo (MC) simulations. To achieve a sufficiently low energy threshold for CEvNS detection, an ionization-only analysis channel is adopted for RELICS. A high emission rate of delayed electrons after a big ionization signal is the major background, leading to an analysis threshold of 120 photo-electrons in the CEvNS search. The second largest background, nuclear recoils induced by cosmic-ray neutrons, is suppressed via a passive water shield. The physics potential of RELICS is explored with a 32 kg-yr exposure at a baseline of 25 m from a reactor core with a 3 GW thermal power. In an energy range of 120 to 240 PE, we expect 4902.4 CEvNS and 1318.4 background events. The sensitivity of RELICS to the weak mixing angle is investigated at a low momentum transfer. Our study shows that RELICS can further improve the constraints on the non-standard neutrino interaction (NSI) compared to the current best results.
Submitted 12 June, 2024; v1 submitted 9 May, 2024;
originally announced May 2024.
-
4-connected 1-planar chordal graphs are Hamiltonian-connected
Authors:
Licheng Zhang,
Yuanqiu Huang,
Shengxiang Lv,
Fengming Dong
Abstract:
Tutte proved that 4-connected planar graphs are Hamiltonian. It is unknown if there is an analogous result on 1-planar graphs. In this paper, we characterize 4-connected 1-planar chordal graphs, and show that all such graphs are Hamiltonian-connected. A crucial tool used in our proof is a characteristic of 1-planar 4-trees.
Submitted 24 April, 2024;
originally announced April 2024.
-
Research progress on intelligent optimization techniques for energy-efficient design of ship hull forms
Authors:
Shuwei Zhu,
Siying Lv,
Kaifeng Chen,
Wei Fang,
Leilei Cao
Abstract:
The design optimization of ship hull form based on hydrodynamics theory and simulation-based design (SBD) technologies generally considers ship performance and energy efficiency performance as the design objective, which plays an important role in smart design and manufacturing of green ships. An optimal design of a sustainable energy system requires multidisciplinary tools to build ships with the least resistance and energy consumption. Through a systematic approach, this paper presents the research progress of energy-efficient design of ship hull forms based on intelligent optimization techniques. We discuss different methods involved in the optimization procedure, especially the latest developments of intelligent optimization algorithms and surrogate models. Moreover, current development trends and technical challenges of multidisciplinary design optimization and surrogate-assisted evolutionary algorithms for ship design are further analyzed. We explore the gaps and potential future directions, so as to pave the way towards the design of the next generation of more energy-efficient ship hull forms.
Submitted 9 March, 2024;
originally announced March 2024.
-
Many-body computing on Field Programmable Gate Arrays
Authors:
Songtai Lv,
Yang Liang,
Yuchen Meng,
Xiaochen Yao,
Jincheng Xu,
Yang Liu,
Qibin Zheng,
Haiyuan Zou
Abstract:
A new implementation of many-body calculations is of paramount importance in the field of computational physics. In this study, we leverage the capabilities of Field Programmable Gate Arrays (FPGAs) for conducting quantum many-body calculations. Through the design of appropriate schemes for Monte Carlo and tensor network methods, we effectively utilize the parallel processing capabilities provided by FPGAs. This has resulted in a remarkable tenfold speedup compared to CPU-based computation for a Monte Carlo algorithm. We also demonstrate, for the first time, the utilization of FPGA to accelerate a typical tensor network algorithm. Our findings unambiguously highlight the significant advantages of hardware implementation and pave the way for novel approaches to many-body calculations.
Submitted 9 February, 2024;
originally announced February 2024.
-
Electric Field Switching of Magnon Spin Current in a Compensated Ferrimagnet
Authors:
Kaili Li,
Lei Wang,
Yu Wang,
Yuanjun Guo,
Shuping Lv,
Yuewei He,
Weiwei Lin,
Tai Min,
Shaojie Hu,
Sen Yang,
Dezhen Xue,
Aqun Zheng,
Shuming Yang,
Xiangdong Ding
Abstract:
Manipulation of directional magnon propagation, known as magnon spin current, is essential for developing magnonic memory and logic devices featuring nonvolatile functionalities and ultralow power consumption. Magnon spin current can usually be modulated by magnetic field or current-induced spin torques. However, these approaches may lead to energy dissipation caused by Joule heating. Electric-field switching of magnon spin current without charge current is highly desired but very challenging to realize. By integrating magnonic and piezoelectric materials, we demonstrate manipulation of the magnon spin current generated by the spin Seebeck effect in the ferrimagnetic insulator Gd3Fe5O12 (GdIG) film on a piezoelectric substrate. We observe reversible electric-field switching of magnon polarization without applied charge current. Through strain-mediated magnetoelectric coupling, the electric field induces the magnetic compensation transition between two magnetic states of the GdIG, resulting in its magnetization reversal and the simultaneous switching of magnon spin current. Our work establishes a prototype material platform that paves the way for developing magnon logic devices characterized by all electric field reading and writing and reveals the underlying physics principles of their functions.
Submitted 25 November, 2023;
originally announced November 2023.
-
MBTFNet: Multi-Band Temporal-Frequency Neural Network For Singing Voice Enhancement
Authors:
Weiming Xu,
Zhouxuan Chen,
Zhili Tan,
Shubo Lv,
Runduo Han,
Wenjiang Zhou,
Weifeng Zhao,
Lei Xie
Abstract:
A typical neural speech enhancement (SE) approach mainly handles speech and noise mixtures, which is not optimal for singing voice enhancement scenarios. Music source separation (MSS) models treat vocals and various accompaniment components equally, which may reduce performance compared to a model that only considers vocal enhancement. In this paper, we propose a novel multi-band temporal-frequency neural network (MBTFNet) for singing voice enhancement, which particularly removes background music, noise and even backing vocals from singing recordings. MBTFNet combines inter and intra-band modeling for better processing of full-band signals. Dual-path modeling is introduced to expand the receptive field of the model. We propose an implicit personalized enhancement (IPE) stage based on signal-to-noise ratio (SNR) estimation, which further improves the performance of MBTFNet. Experiments show that our proposed model significantly outperforms several state-of-the-art SE and MSS models.
Submitted 6 October, 2023;
originally announced October 2023.
-
RIS-aided Near-Field MIMO Communications: Codebook and Beam Training Design
Authors:
Suyu Lv,
Yuanwei Liu,
Xiaodong Xu,
Arumugam Nallanathan,
A. Lee Swindlehurst
Abstract:
Downlink reconfigurable intelligent surface (RIS)-assisted multi-input-multi-output (MIMO) systems are considered with far-field, near-field, and hybrid-far-near-field channels. According to the angular or distance information contained in the received signals, 1) a distance-based codebook is designed for near-field MIMO channels, based on which a hierarchical beam training scheme is proposed to reduce the training overhead; 2) a combined angular-distance codebook is designed for mixed-far-near-field MIMO channels, based on which a two-stage beam training scheme is proposed to achieve alignment in the angular and distance domains separately. For maximizing the achievable rate while reducing the complexity, an alternating optimization algorithm is proposed to carry out the joint optimization iteratively. Specifically, the RIS coefficient matrix is optimized through the beam training process, the optimal combining matrix is obtained from the closed-form solution for the mean square error (MSE) minimization problem, and the active beamforming matrix is optimized by exploiting the relationship between the achievable rate and MSE. Numerical results reveal that: 1) the proposed beam training schemes achieve near-optimal performance with a significantly decreased training overhead; 2) compared to the angular-only far-field channel model, taking the additional distance information into consideration will effectively improve the achievable rate when carrying out beam design for near-field communications.
Submitted 30 September, 2023;
originally announced October 2023.
-
Personalized Federated Learning via Amortized Bayesian Meta-Learning
Authors:
Shiyu Liu,
Shaogao Lv,
Dun Zeng,
Zenglin Xu,
Hui Wang,
Yue Yu
Abstract:
Federated learning is a decentralized and privacy-preserving technique that enables multiple clients to collaborate with a server to learn a global model without exposing their private data. However, the presence of statistical heterogeneity among clients poses a challenge, as the global model may struggle to perform well on each client's specific task. To address this issue, we introduce a new perspective on personalized federated learning through Amortized Bayesian Meta-Learning. Specifically, we propose a novel algorithm called \emph{FedABML}, which employs hierarchical variational inference across clients. The global prior aims to capture representations of common intrinsic structures from heterogeneous clients, which can then be transferred to their respective tasks and aid in the generation of accurate client-specific approximate posteriors through a few local updates. Our theoretical analysis provides an upper bound on the average generalization error and guarantees the generalization performance on unseen data. Finally, several empirical results are implemented to demonstrate that \emph{FedABML} outperforms several competitive baselines.
Submitted 5 July, 2023;
originally announced July 2023.
-
Robust Graph Structure Learning with the Alignment of Features and Adjacency Matrix
Authors:
Shaogao Lv,
Gang Wen,
Shiyu Liu,
Linsen Wei,
Ming Li
Abstract:
To improve the robustness of graph neural networks (GNN), graph structure learning (GSL) has attracted great interest due to the pervasiveness of noise in graph data. Many approaches have been proposed for GSL to jointly learn a clean graph structure and corresponding representations. To extend the previous work, this paper proposes a novel regularized GSL approach, particularly with an alignment of feature information and graph information, which is motivated mainly by our derived lower bound of node-level Rademacher complexity for GNNs. Additionally, our proposed approach incorporates sparse dimensional reduction to leverage low-dimensional node features that are relevant to the graph structure. To evaluate the effectiveness of our approach, we conduct experiments on real-world graphs. The results demonstrate that our proposed GSL method outperforms several competitive baselines, especially in scenarios where the graph structures are heavily affected by noise. Overall, our research highlights the importance of integrating feature and graph information alignment in GSL, as inspired by our derived theoretical result, and showcases the superiority of our approach in handling noisy graph structures through comprehensive experiments on real-world datasets.
Submitted 5 July, 2023;
originally announced July 2023.
-
Autonomous Drone Racing: Time-Optimal Spatial Iterative Learning Control within a Virtual Tube
Authors:
Shuli Lv,
Yan Gao,
Jiaxing Che,
Quan Quan
Abstract:
It is often necessary for drones to complete delivery, photography, and rescue in the shortest time to increase efficiency. Many autonomous drone races provide platforms to pursue algorithms to finish races as quickly as possible for the above purpose. Unfortunately, existing methods often fail to keep training and racing time short in drone racing competitions. This motivates us to develop a highly efficient learning method by imitating the training experience of top racing drivers. Unlike traditional iterative learning control methods for accurate tracking, the proposed approach iteratively learns a trajectory online to finish the race as quickly as possible. Simulations and experiments using different models show that the proposed approach is model-free and is able to achieve the optimal result with low computation requirements. Furthermore, this approach surpasses some state-of-the-art methods in racing time on a benchmark drone racing platform. An experiment on a real quadcopter is also performed to demonstrate its effectiveness.
Submitted 28 June, 2023;
originally announced June 2023.
-
An Approach to Mismatched Disturbance Rejection Control for Continuous-Time Uncontrollable Systems
Authors:
Shichao Lv,
Hongdan Li,
Kai Peng,
Shihua Li,
Huanshui Zhang
Abstract:
This paper focuses on optimal mismatched disturbance rejection control for linear continuous-time uncontrollable systems. Different from previous studies, by introducing a new quadratic performance index to transform the mismatched disturbance rejection control into a linear quadratic tracking problem, the regulated state can track a reference trajectory and minimize the influence of disturbance. The necessary and sufficient conditions for the solvability and the disturbance rejection controller are obtained by solving a forward-backward differential equation over a finite horizon. A sufficient condition for system stability is obtained over an infinite horizon under a detectability condition. This paper details our novel approach for transforming disturbance rejection into a linear quadratic tracking problem. Two examples demonstrate the effectiveness of the proposed method.
Submitted 1 June, 2023;
originally announced June 2023.
-
DCCRN-KWS: an audio bias based model for noise robust small-footprint keyword spotting
Authors:
Shubo Lv,
Xiong Wang,
Sining Sun,
Long Ma,
Lei Xie
Abstract:
Real-world complex acoustic environments, especially those with a low signal-to-noise ratio (SNR), pose tremendous challenges to a keyword spotting (KWS) system. Inspired by the recent advances of neural speech enhancement and context bias in speech recognition, we propose a robust audio context bias based DCCRN-KWS model to address this challenge. We form the whole architecture as a multi-task learning framework for both denoising and keyword spotting, where the DCCRN encoder is connected with the KWS model. Aided by the denoising task, we further introduce an audio context bias module to leverage the real keyword samples and bias the network to better discriminate keywords in noisy conditions. Feature merge and complex context linear modules are also introduced to strengthen such discrimination and to effectively leverage contextual information, respectively. Experiments on a challenging internal dataset and the HIMIYA public dataset show that our DCCRN-KWS system is superior in performance, while an ablation study demonstrates the good design of the whole model.
Submitted 12 June, 2023; v1 submitted 20 May, 2023;
originally announced May 2023.
-
Stability and Generalization of lp-Regularized Stochastic Learning for GCN
Authors:
Shiyu Liu,
Linsen Wei,
Shaogao Lv,
Ming Li
Abstract:
Graph convolutional networks (GCN) are viewed as one of the most popular representations among the variants of graph neural networks over graph data and have shown powerful performance in empirical experiments. $\ell_2$-based graph smoothing enforces the global smoothness of GCN, while (soft) $\ell_1$-based sparse graph learning tends to promote signal sparsity to trade for discontinuity. This paper aims to quantify the trade-off of GCN between smoothness and sparsity, with the help of a general $\ell_p$-regularized $(1<p\leq 2)$ stochastic learning proposed within. While stability-based generalization analyses have been given in prior work for a second derivative objective function, our $\ell_p$-regularized learning scheme does not satisfy such a smooth condition. To tackle this issue, we propose a novel SGD proximal algorithm for GCNs with an inexact operator. For a single-layer GCN, we establish an explicit theoretical understanding of GCN with the $\ell_p$-regularized stochastic learning by analyzing the stability of our SGD proximal algorithm. We conduct multiple empirical experiments to validate our theoretical findings.
Submitted 19 June, 2023; v1 submitted 19 May, 2023;
originally announced May 2023.
-
Constructing nonorientable genus embedding of complete bipartite graph minus a matching
Authors:
Shengxiang Lv
Abstract:
$G_{m,n,k}$ is a subgraph of the complete bipartite graph $K_{m,n}$ with a $k$-matching removed. By a new method based on the embeddings of some $G_{m,n,k}$ with small $m,n,k$ and bipartite joins with small bipartite graphs, we construct the nonorientable genus embedding of $G_{m,n,k}$ for all $m,n\geq 3$ with $(m,n,k)\neq (5,4,4), (4,5,4),(5,5,5)$. Hence, we solve the cases $G_{n+1,n,n}$($n$ is even) and $G_{n,n,n}$, with the values of $n$ that have not been previously solved, i.e., $n\geq 6$. This completes previous work on the nonorientable genus of $G_{m,n,k}$.
Submitted 17 May, 2023;
originally announced May 2023.
-
Solving a class of zero-sum stopping game with regime switching
Authors:
Siyu Lv,
Xiao Yang
Abstract:
This paper studies a class of zero-sum stopping games in a regime switching model. A verification theorem as a sufficient criterion for Nash equilibria is established based on a set of variational inequalities (VIs). Under an appropriate regularity condition for solutions to the VIs, a suitable system of algebraic equations is derived via the so-called smooth-fit principle. Explicit Nash equilibrium stopping rules of threshold-type for the two players and the corresponding value function of the game in closed form are obtained. Numerical experiments are reported to demonstrate the dependence of the threshold levels on various model parameters. A reduction to the case with no regime switching is also presented as a comparison.
Submitted 28 March, 2023;
originally announced March 2023.
-
Two-stage Neural Network for ICASSP 2023 Speech Signal Improvement Challenge
Authors:
Mingshuai Liu,
Shubo Lv,
Zihan Zhang,
Runduo Han,
Xiang Hao,
Xianjun Xia,
Li Chen,
Yijian Xiao,
Lei Xie
Abstract:
In the ICASSP 2023 Speech Signal Improvement Challenge, we developed a dual-stage neural model which improves the quality of speech signals degraded by different distortions in a stage-wise divide-and-conquer fashion. Specifically, in the first stage, the speech improvement network focuses on recovering the missing components of the spectrum, while in the second stage, our model aims to further suppress noise, reverberation, and artifacts introduced by the first-stage model. Achieving 0.446 in the final score and 0.517 in the P.835 score, our system ranks 4th in the non-real-time track.
Submitted 14 March, 2023;
originally announced March 2023.
-
Stochastic maximum principle for hybrid optimal control problems under partial observation
Authors:
Siyu Lv,
Jie Xiong,
Wen Xu
Abstract:
This paper is concerned with a partially observed hybrid optimal control problem, where continuous dynamics and discrete events coexist and, in particular, the continuous dynamics can be observed while the discrete events, described by a Markov chain, are not directly available. This kind of problem is considered here for the first time in the literature and has wide applications in finance, management, engineering, and so on. There are three major contributions made in this paper: First, we develop a novel non-linear filtering method to convert the partially observed problem into a completely observed one. Our method relies on some delicate stochastic analysis techniques related to hybrid diffusions and is essentially different from the traditional filtering approaches. Second, we establish a new maximum principle based on the completely observed problem, whose two-dimensional state process consists of the continuous dynamics and the optimal filter. An important advantage of the maximum principle is that it takes a simple form and is convenient to implement. Finally, in order to illustrate the theoretical results, we solve a linear quadratic (LQ) example using the derived maximum principle to get some observable optimal controls.
Submitted 11 March, 2023;
originally announced March 2023.
-
Dynamic surface tension of the pure liquid-vapor interface subjected to the cyclic loads
Authors:
Zhiyong Yu,
Songtai Lv,
Xin Zhang,
Hongtao Liang,
Wei Xie,
Yang Yang
Abstract:
We demonstrate a methodology for computationally investigating the mechanical response of a pure molten lead surface system to lateral mechanical cyclic loads and try to answer the question: how does the dynamically driven liquid surface system follow the classical physics of the elastic-driven oscillation? The steady-state oscillation of the dynamic surface tension under cyclic load, including the excitation of high frequency vibration modes at different driving frequencies and amplitudes, was compared with the classical theory of a single-body driven damped oscillator. Under the highest studied frequency (50 GHz) and amplitude (5%) of the load, the increase of the (mean value) dynamic surface tension could reach ~5%. The peak and trough values of the instantaneous dynamic surface tension could reach (up to) 40% increase and (up to) 20% decrease compared to the equilibrium surface tension, respectively. The extracted generalized natural frequencies and the generalized damping constants seem to be intimately related to the intrinsic timescales of the atomic temporal-spatial correlation functions of the liquids both in the bulk region and in the outermost surface layers. These insights could be helpful for quantitative manipulation of the liquid surface tension using ultrafast shockwaves or laser pulses.
Submitted 27 November, 2022;
originally announced November 2022.
-
Fast Calibration for Computer Models with Massive Physical Observations
Authors:
Shurui Lv,
Yan Wang,
Jun Yu
Abstract:
Computer model calibration is a crucial step in building a reliable computer model. In the face of massive physical observations, a fast estimation for the calibration parameters is urgently needed. To alleviate the computational burden, we design a two-step algorithm to estimate the calibration parameters by employing the subsampling techniques. Compared with the current state-of-the-art calibration methods, the complexity of the proposed algorithm is greatly reduced without sacrificing too much accuracy. We prove the consistency and asymptotic normality of the proposed estimator. The form of the variance of the proposed estimation is also presented, which provides a natural way to quantify the uncertainty of the calibration parameters. The obtained results of two numerical simulations and two real-case studies demonstrate the advantages of the proposed method.
Submitted 23 November, 2022;
originally announced November 2022.
-
Signatures of quantum criticality in the complex inverse temperature plane
Authors:
Yang Liu,
Songtai Lv,
Yang Yang,
Haiyuan Zou
Abstract:
Concepts of complex partition functions and the Fisher zeros provide intrinsic statistical mechanisms for finite temperature and real time dynamical phase transitions. We extend the utility of these complexifications to quantum phase transitions. We exactly identify different Fisher zeros on lines or closed curves and elucidate their correspondence with domain-wall excitation or confined meson for the one-dimensional transverse field Ising model. The crossover behavior of Fisher zeros provides a fascinating picture for criticality near the quantum phase transition, where the excitation energy scales are quantitatively determined. We further confirm our results by tensor network calculation and demonstrate a clear signal of deconfined meson excitation from the breaking of the closed zero curves. Our results unambiguously show significant features of the Fisher zeros for a quantum phase transition and open up a new route to explore quantum criticality.
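As a minimal illustration of what a Fisher zero is, the sketch below uses a single spin in a field, which is not the transverse field Ising chain studied in the paper. The Hamiltonian H = -h σ^x with energy levels ±h, the function names, and the default h = 1 are all assumptions made for this example. Its partition function Z(β) = 2 cosh(βh) is strictly positive for real β and vanishes only on the imaginary β axis, at β = iπ(n + 1/2)/h.

```python
import cmath
import math


def partition_function(beta: complex, h: float = 1.0) -> complex:
    """Z(beta) = Tr exp(-beta H) for a toy two-level system with energies -h and +h."""
    return cmath.exp(beta * h) + cmath.exp(-beta * h)


def fisher_zeros(h: float = 1.0, count: int = 3) -> list:
    """Analytic zeros of 2*cosh(beta*h): beta_n = i*pi*(n + 1/2)/h for n = 0, 1, ..."""
    return [complex(0.0, math.pi * (n + 0.5) / h) for n in range(count)]


if __name__ == "__main__":
    for zero in fisher_zeros():
        # |Z| is zero up to floating-point rounding at each analytic zero.
        print(zero, abs(partition_function(zero)))
```

Because this toy system has only two levels, its zeros never approach the real β axis; in the many-body setting described in the abstract, the interest lies in how lines of zeros form and move in the thermodynamic limit.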
Submitted 1 November, 2022;
originally announced November 2022.
-
Robust Multitask Diffusion Normalized M-estimate Subband Adaptive Filtering Algorithm Over Adaptive Networks
Authors:
Wenjing Xu,
Haiquan Zhao,
Shaohui Lv
Abstract:
In recent years, the multitask diffusion least mean square (MD-LMS) algorithm has been extensively applied in the distributed parameter estimation and target tracking of multitask networks. However, its performance is mainly limited by two aspects, i.e., the correlated input signal and impulsive noise interference. To overcome these two limitations simultaneously, this paper first introduces the subband adaptive filter (SAF) into the multitask network. Then, a robust multitask diffusion normalized M-estimate subband adaptive filtering (MD-NMSAF) algorithm is proposed by solving the modified Huber function based global network optimization problem in a distributed manner, which endows the multitask network with strong decorrelation ability for correlated inputs and robustness to impulsive noise interference, and accelerates the convergence of the algorithm significantly. Compared with the robust multitask diffusion affine projection M-estimate (MD-APM) algorithm, the computational complexity of the proposed MD-NMSAF is greatly reduced. In addition, the stability condition and the analytical expressions of the theoretical transient and steady-state network mean square deviation (MSD) of the MD-NMSAF are also provided and verified through computer simulations. Simulation results under different input signals and impulsive noise environments fully demonstrate the performance advantages of the MD-NMSAF algorithm over some other competitors in terms of steady-state accuracy and tracking speed.
Submitted 20 October, 2022;
originally announced October 2022.
-
Spatial-DCCRN: DCCRN equipped with frame-level angle feature and hybrid filtering for multi-channel speech enhancement
Authors:
Shubo Lv,
Yihui Fu,
Yukai Jv,
Lei Xie,
Weixin Zhu,
Wei Rao,
Yannan Wang
Abstract:
Recently, multi-channel speech enhancement has drawn much interest due to the use of spatial information to distinguish target speech from interfering signals. To make full use of spatial information and neural network based masking estimation, we propose a multi-channel denoising neural network -- Spatial DCCRN. Firstly, we extend S-DCCRN to the multi-channel scenario, aiming at performing a cascaded sub-channel and full-channel processing strategy, which can model different channels separately. Moreover, instead of only adopting the multi-channel spectrum or concatenating the first channel's magnitude and IPD as the model's inputs, we apply an angle feature extraction module (AFE) to extract frame-level angle feature embeddings, which can help the model to clearly perceive spatial information. Finally, since the phenomenon of residual noise will be more serious when the noise and speech exist in the same time frequency (TF) bin, we particularly design a masking and mapping filtering method to substitute the traditional filter-and-sum operation, with the purpose of cascading coarse denoising, dereverberation and residual noise suppression. The proposed model, Spatial-DCCRN, has surpassed EaBNet, FasNet as well as several competitive models on the L3DAS22 Challenge dataset. Beyond the 3D scenario, Spatial-DCCRN also outperforms the state-of-the-art (SOTA) model MIMO-UNet by a large margin in multiple evaluation metrics on the multi-channel ConferencingSpeech2021 Challenge dataset. Ablation studies also demonstrate the effectiveness of different contributions.
Submitted 17 October, 2022;
originally announced October 2022.
-
Optomechanical Effects in Nanocavity-enhanced Resonant Raman Scattering of a Single Molecule
Authors:
Xuan-Ming Shen,
Yuan Zhang,
Shunping Zhang,
Yao Zhang,
Qiu-Shi Meng,
Guangchao Zheng,
Siyuan Lv,
Luxia Wang,
Roberto A. Boto,
Chongxin Shan,
Javier Aizpurua
Abstract:
In this article, we address the optomechanical effects in surface-enhanced resonant Raman scattering (SERRS) from a single molecule in a nano-particle on mirror (NPoM) nanocavity by developing a quantum master equation theory, which combines macroscopic quantum electrodynamics and electron-vibration interaction within the framework of open quantum system theory. We supplement the theory with electromagnetic simulations and time-dependent density functional theory calculations in order to study the SERRS of a methylene blue molecule in a realistic NPoM nanocavity. The simulations allow us not only to identify the conditions to achieve conventional optomechanical effects, such as vibrational pumping and non-linear scaling of Stokes and anti-Stokes scattering, but also to discover distinct behaviors, such as the saturation of exciton population, the emergence of Mollow triplet side-bands, and higher-order Raman scattering. All in all, our study might guide further investigations of optomechanical effects in resonant Raman scattering.
Submitted 5 October, 2022;
originally announced October 2022.
-
Efficient Long Sequential User Data Modeling for Click-Through Rate Prediction
Authors:
Qiwei Chen,
Yue Xu,
Changhua Pei,
Shanshan Lv,
Tao Zhuang,
Junfeng Ge
Abstract:
Recent studies on Click-Through Rate (CTR) prediction have reached new levels by modeling longer user behavior sequences. Among others, the two-stage methods stand out as the state-of-the-art (SOTA) solution for industrial applications. The two-stage methods first train a retrieval model to truncate the long behavior sequence beforehand and then use the truncated sequences to train a CTR model. However, the retrieval model and the CTR model are trained separately, so the retrieved subsequences in the CTR model are inaccurate, which degrades the final performance. In this paper, we propose an end-to-end paradigm to model long behavior sequences, which is able to achieve superior performance along with remarkable cost-efficiency compared to existing models. Our contribution is three-fold: First, we propose a hashing-based efficient target attention (TA) network named ETA-Net to enable end-to-end user behavior retrieval based on low-cost bit-wise operations. The proposed ETA-Net can reduce the complexity of standard TA by orders of magnitude for sequential data modeling. Second, we propose a general system architecture as one viable solution to deploy ETA-Net on industrial systems. In particular, ETA-Net has been deployed on the recommender system of Taobao, and brought a 1.8% lift on CTR and a 3.1% lift on Gross Merchandise Value (GMV) compared to the SOTA two-stage methods. Third, we conduct extensive experiments on both offline datasets and an online A/B test. The results verify that the proposed model outperforms existing CTR models considerably, in terms of both CTR prediction performance and online cost-efficiency. ETA-Net now serves the main traffic of Taobao, delivering services to hundreds of millions of users towards billions of items every day.
Submitted 25 September, 2022;
originally announced September 2022.
-
An Approach to Mismatched Disturbance Rejection Control for Uncontrollable Systems
Authors:
Shichao Lv,
Hongdan Li,
Kai Peng,
Huanshui Zhang
Abstract:
This study focuses on the problem of optimal mismatched disturbance rejection control for uncontrollable linear discrete-time systems. In contrast to previous studies, by introducing a quadratic performance index such that the regulated state can track a reference trajectory and minimize the effects of disturbances, mismatched disturbance rejection control is transformed into a linear quadratic tracking problem. The necessary and sufficient conditions for the solvability of this problem over a finite horizon and a disturbance rejection controller are derived by solving a forward-backward difference equation. In the case of an infinite horizon, a sufficient condition for the stabilization of the system is obtained under the detectable condition. This paper details our novel approach to disturbance rejection. Four examples are provided to demonstrate the effectiveness of the proposed method.
Submitted 14 September, 2022;
originally announced September 2022.
-
Linear quadratic leader-follower stochastic differential games for mean-field switching diffusions
Authors:
Siyu Lv,
Jie Xiong,
Xin Zhang
Abstract:
In this paper, we consider a linear quadratic (LQ) leader-follower stochastic differential game for regime switching diffusions with mean-field interactions. One of the salient features of this paper is that conditional mean-field terms are included in the state equation and cost functionals. Based on stochastic maximum principles (SMPs), the follower's problem and the leader's problem are solved sequentially and an open-loop Stackelberg equilibrium is obtained. Further, with the help of the so-called four-step scheme, the corresponding Hamiltonian systems for the two players are decoupled and then the open-loop Stackelberg equilibrium admits a state feedback representation if some new-type Riccati equations are solvable.
Submitted 30 July, 2022;
originally announced August 2022.
-
Local collective dynamics at equilibrium BCC crystal-melt interfaces
Authors:
Xin Zhang,
Wenliang Lu,
Zun Liang,
Yashen Wang,
Songtai Lv,
Hongtao Liang,
Brian B. Laird,
Yang Yang
Abstract:
We present a classical molecular-dynamics study of the collective dynamical properties of the coexisting liquid phase at equilibrium body-centered cubic (BCC) Fe crystal-melt interfaces. For the three interfacial orientations (100), (110), and (111), the collective dynamics are characterized through the calculation of the intermediate scattering functions, dynamical structure factors and density relaxation times in a sequential local region of interest. An anisotropic speed up of the collective dynamics in all three BCC crystal-melt interfacial orientations is observed. This trend differs significantly from the previously observed slowing down of the local collective dynamics at the liquid-vapor interface [Acta Mater 2020;198:281]. Examining the interfacial density relaxation times, we revisit the validity of the recently developed time-dependent Ginzburg-Landau (TDGL) theory for the solidification crystal-melt interface (CMI) kinetic coefficients, resulting in excellent agreement with both the magnitude and the kinetic anisotropy of the CMI kinetic coefficients measured from the non-equilibrium MD simulations.
Submitted 28 July, 2022;
originally announced July 2022.
-
Mismatched Disturbance Rejection Control for Second-Order Discrete-Time Systems
Authors:
Shichao Lv,
Kai Peng,
Hongxia Wang,
Huanshui Zhang
Abstract:
This paper is concerned with mismatched disturbance rejection control for second-order discrete-time systems. Different from previous work, the controllability of the system is applied to design the disturbance compensation gain, which does not require any coordinate transformations. Via this new idea, it is shown that disturbance in the regulated output is immediately and directly compensated in the case that the disturbance is known. When the disturbance is unknown, an extra generalized extended state observer is applied to design the controller. Two examples are given to show the effectiveness of the proposed methods. Numerical simulation shows that the designed controller has an excellent disturbance rejection effect when the disturbance is known. The example of a permanent-magnet direct current motor illustrates that the proposed control method for unknown disturbance rejection is effective.
Submitted 2 May, 2022;
originally announced May 2022.
-
WSSS4LUAD: Grand Challenge on Weakly-supervised Tissue Semantic Segmentation for Lung Adenocarcinoma
Authors:
Chu Han,
Xipeng Pan,
Lixu Yan,
Huan Lin,
Bingbing Li,
Su Yao,
Shanshan Lv,
Zhenwei Shi,
Jinhai Mai,
Jiatai Lin,
Bingchao Zhao,
Zeyan Xu,
Zhizhen Wang,
Yumeng Wang,
Yuan Zhang,
Huihui Wang,
Chao Zhu,
Chunhui Lin,
Lijian Mao,
Min Wu,
Luwen Duan,
Jingsong Zhu,
Dong Hu,
Zijie Fang,
Yang Chen
, et al. (18 additional authors not shown)
Abstract:
Lung cancer is the leading cause of cancer death worldwide, and adenocarcinoma (LUAD) is the most common subtype. Exploiting the potential value of histopathology images can promote precision medicine in oncology. Tissue segmentation is the basic upstream task of histopathology image analysis. Existing deep learning models have achieved superior segmentation performance but require sufficient pixel-level annotations, which are time-consuming and expensive to produce. To enrich the label resources of LUAD and to alleviate the annotation efforts, we organize this challenge, WSSS4LUAD, to call for outstanding weakly-supervised semantic segmentation (WSSS) techniques for histopathology images of LUAD. Participants have to design an algorithm to segment tumor epithelial, tumor-associated stroma and normal tissue with only patch-level labels. This challenge includes 10,091 patch-level annotations (the training set) and over 130 million labeled pixels (the validation and test sets), from 87 WSIs (67 from GDPH, 20 from TCGA). All the labels were generated by a pathologist-in-the-loop pipeline with the help of AI models and checked by the label review board. Among 532 registrations, 28 teams submitted results in the test phase with over 1,000 submissions. Finally, the first place team achieved mIoU of 0.8413 (tumor: 0.8389, stroma: 0.7931, normal: 0.8919). According to the technical reports of the top-tier teams, CAM is still the most popular approach in WSSS. Cutmix data augmentation has been widely adopted to generate more reliable samples. With the success of this challenge, we believe that WSSS approaches with patch-level annotations can be a complement to the traditional pixel annotations while reducing the annotation efforts. The entire dataset has been released to encourage more research on computational pathology in LUAD and more novel WSSS techniques.
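The reported per-class scores can be checked against the overall figure with a short calculation. The sketch below assumes that mIoU is the unweighted mean of the three per-class IoU values quoted above; the abstract does not state the averaging rule, and the variable and function names are illustrative.

```python
# Per-class IoU values quoted in the abstract for the first-place team.
per_class_iou = {"tumor": 0.8389, "stroma": 0.7931, "normal": 0.8919}


def mean_iou(scores: dict) -> float:
    """Unweighted mean over classes (an assumed averaging rule)."""
    return sum(scores.values()) / len(scores)


miou = mean_iou(per_class_iou)
print(f"{miou:.4f}")  # prints 0.8413
```

Under that assumption the three class scores reproduce the headline value of 0.8413 to four decimal places.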
Submitted 13 April, 2022; v1 submitted 13 April, 2022;
originally announced April 2022.
-
Predict the Rover Mobility over Soft Terrain using Articulated Wheeled Bevameter
Authors:
Wenyao Zhang,
Shipeng Lv,
Feng Xue,
Chen Yao,
Zheng Zhu,
Zhenzhong Jia
Abstract:
Robot mobility is critical for mission success, especially in soft or deformable terrains, where the complex wheel-soil interaction mechanics often lead to excessive wheel slip and sinkage, causing eventual mission failure. To improve the success rate, online mobility prediction using vision, infrared imaging, or model-based stochastic methods has been used in the literature. This paper proposes an on-board mobility prediction approach using an articulated wheeled bevameter that consists of a force-controlled arm and an instrumented bevameter (with force and vision sensors) as its end-effector. The proposed bevameter, which emulates the traditional terramechanics tests such as pressure-sinkage and shear experiments, can measure contact parameters ahead of the rover's body in real-time, and predict the slip and sinkage of supporting wheels over the probed region. Based on the predicted mobility, the rover can select a safer path in order to avoid dangerous regions such as those covered with quicksand. Compared to the literature, our proposed method can avoid the complicated terramechanics modeling and time-consuming stochastic prediction; it can also mitigate the inaccuracy issues arising in non-contact vision-based methods. We also conduct multiple experiments to validate the proposed approach.
Submitted 17 February, 2022;
originally announced February 2022.
-
S-DCCRN: Super Wide Band DCCRN with learnable complex feature for speech enhancement
Authors:
Shubo Lv,
Yihui Fu,
Mengtao Xing,
Jiayao Sun,
Lei Xie,
Jun Huang,
Yannan Wang,
Tao Yu
Abstract:
In speech enhancement, complex neural networks have shown promising performance due to their effectiveness in processing complex-valued spectra. Most of the recent speech enhancement approaches mainly focus on wide-band signals with a sampling rate of 16K Hz. However, research on super wide band (e.g., 32K Hz) or even full-band (48K Hz) denoising is still lacking due to the difficulty of modeling more frequency bands and particularly high frequency components. In this paper, we extend our previous deep complex convolution recurrent neural network (DCCRN) substantially to a super wide band version -- S-DCCRN, to perform speech denoising on speech of 32K Hz sampling rate. We first employ a cascaded sub-band and full-band processing module, which consists of two small-footprint DCCRNs -- one operates on the sub-band signal and one operates on the full-band signal, aiming at benefiting from both local and global frequency information. Moreover, instead of simply adopting the STFT feature as input, we use a complex feature encoder trained in an end-to-end manner to refine the information of different frequency bands. We also use a complex feature decoder to revert the feature to the time-frequency domain. Finally, a learnable spectrum compression method is adopted to adjust the energy of different frequency bands, which is beneficial for neural network learning. The proposed model, S-DCCRN, has surpassed PercepNet as well as several competitive models and achieves state-of-the-art performance in terms of speech quality and intelligibility. Ablation studies also demonstrate the effectiveness of different contributions.
Submitted 16 November, 2021;
originally announced November 2021.
-
SStaGCN: Simplified stacking based graph convolutional networks
Authors:
Jia Cai,
Zhilong Xiong,
Shaogao Lv
Abstract:
The graph convolutional network (GCN) is a powerful model studied broadly in various graph structural data learning tasks. However, to mitigate the over-smoothing phenomenon and deal with heterogeneous graph structural data, the design of the GCN model remains a crucial issue to be investigated. In this paper, we propose a novel GCN called SStaGCN (Simplified stacking based GCN) by utilizing the ideas of stacking and aggregation, which is an adaptive general framework for tackling heterogeneous graph data. Specifically, we first use the base models of stacking to extract the node features of a graph. Subsequently, aggregation methods such as mean, attention and voting techniques are employed to further enhance the ability of node feature extraction. Thereafter, the node features are considered as inputs and fed into a vanilla GCN model. Furthermore, a theoretical generalization bound analysis of the proposed model is explicitly given. Extensive experiments on $3$ public citation networks and another $3$ heterogeneous tabular datasets demonstrate the effectiveness and efficiency of the proposed approach over state-of-the-art GCNs. Notably, the proposed SStaGCN can efficiently mitigate the over-smoothing problem of GCN.
Submitted 16 November, 2021;
originally announced November 2021.
-
Uformer: A Unet based dilated complex & real dual-path conformer network for simultaneous speech enhancement and dereverberation
Authors:
Yihui Fu,
Yun Liu,
Jingdong Li,
Dawei Luo,
Shubo Lv,
Yukai Jv,
Lei Xie
Abstract:
Complex spectrum and magnitude are considered as two major features of speech enhancement and dereverberation. Traditional approaches always treat these two features separately, ignoring their underlying relationship. In this paper, we propose Uformer, a Unet based dilated complex & real dual-path conformer network in both the complex and magnitude domains for simultaneous speech enhancement and dereverberation. We exploit time attention (TA) and dilated convolution (DC) to leverage local and global contextual information and frequency attention (FA) to model dimensional information. These three sub-modules contained in the proposed dilated complex & real dual-path conformer module effectively improve the speech enhancement and dereverberation performance. Furthermore, a hybrid encoder and decoder are adopted to simultaneously model the complex spectrum and magnitude and promote the information interaction between the two domains. Encoder decoder attention is also applied to enhance the interaction between encoder and decoder. Our experimental results outperform all SOTA time and complex domain models objectively and subjectively. Specifically, Uformer reaches 3.6032 DNSMOS on the blind test set of the Interspeech 2021 DNS Challenge, which outperforms all top-performing models. We also carry out ablation experiments to identify which of the proposed sub-modules are most important.
Submitted 4 May, 2022; v1 submitted 10 November, 2021;
originally announced November 2021.
-
Kernel-based estimation for partially functional linear model: Minimax rates and randomized sketches
Authors:
Shaogao Lv,
Xin He,
Junhui Wang
Abstract:
This paper considers the partially functional linear model (PFLM) where all predictive features consist of a functional covariate and a high dimensional scalar vector. Over an infinite dimensional reproducing kernel Hilbert space, the proposed estimation for PFLM is a least squares approach with two mixed regularizations of a function-norm and an $\ell_1$-norm. Our main task in this paper is to establish the minimax rates for PFLM under the high dimensional setting, and the optimal minimax rates of estimation are established by using various techniques in empirical process theory for analyzing kernel classes. In addition, we propose an efficient numerical algorithm based on randomized sketches of the kernel matrix. Several numerical experiments are implemented to support our method and optimization strategy.
△ Less
Submitted 18 October, 2021;
originally announced October 2021.
-
Controller-and-Stopper Stochastic Differential Games with Regime Switching
Authors:
Siyu Lv
Abstract:
This paper is concerned with the controller-and-stopper stochastic differential game under a regime switching model in an infinite horizon. The state of the system consists of a number of diffusions coupled by a continuous-time finite-state Markov chain. There are two players, one called the controller and the other called the stopper, involved in the game. The goal is to find a saddle point for the two players up to the time that the stopper terminates the game. Based on the dynamic programming principle (DPP, for short), the lower and upper value functions are shown to be the viscosity supersolution and viscosity subsolution of the associated Hamilton-Jacobi-Bellman (HJB, for short) equation, respectively. Further, in view of the comparison principle for viscosity solutions, the lower and upper value functions coincide, which implies that the game admits a value. All the proofs in this paper are strikingly different from those for the case without regime switching.
Submitted 1 November, 2021; v1 submitted 21 September, 2021;
originally announced September 2021.
-
A New Rational Approach to the Square Root of 5
Authors:
Shenghui Su,
Jianhua Zheng,
Shuwang Lv
Abstract:
In this paper, the authors construct a new type of sequence, named an extra-super increasing sequence, and define the minimal super increasing sequence {a[0], a[1], ..., a[n]} and the minimal extra-super increasing sequence {z[0], z[1], ..., z[n]}. They find that there always exists a suitable n for which (z[n] / z[n-1] - a[n] / a[n-1]) = PHI, where PHI is the golden ratio conjugate, to a finite precision within the range of computer representation. Further, they derive the formula sqrt(5) = 2(z[n] / z[n-1] - a[n] / a[n-1]) + 1, where n corresponds to the demanded precision. Experiments demonstrate that approaching sqrt(5) through a term ratio difference is smoother and more expeditious than through a Taylor power series, and convince the authors that lim(n to infinity) (z[n] / z[n-1] - a[n] / a[n-1]) = PHI holds.
Submitted 7 September, 2021; v1 submitted 30 August, 2021;
originally announced August 2021.
-
A New Lever Function with Adequate Indeterminacy
Authors:
Shenghui Su,
Ping Luo,
Shuwang Lv,
Maozhi Xu
Abstract:
The key transform of the REESSE1+ asymmetrical cryptosystem is Ci = (Ai * W ^ l(i)) ^ d (% M) with l(i) in Omega = {5, 7, ..., 2n + 3} for i = 1, ..., n, where l(i) is called a lever function. In this paper, the authors give a simplified key transform Ci = Ai * W ^ l(i) (% M) with a new lever function l(i) from {1, ..., n} to Omega = {+/-5, +/-6, ..., +/-(n + 4)}, where "+/-" means the selection of the "+" or "-" sign. They discuss the necessity of the new l(i), namely that a simplified private key is insecure if the new l(i) is a constant rather than a one-to-one function. Further, they expound the sufficiency of the new l(i) from four aspects: (1) the indeterminacy of the new l(i), (2) the insufficient conditions for neutralizing the powers of W and W ^-1 even if Omega = {5, 6, ..., n + 4}, (3) verification by examples, and (4) the running times of the continued fraction attack and the W-parameter intersection attack, which are the two most efficient probabilistic polytime attacks so far. Finally, the authors detail the relation between a lever function and a random oracle.
Submitted 25 April, 2023; v1 submitted 30 August, 2021;
originally announced August 2021.
-
PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense Passage Retrieval
Authors:
Ruiyang Ren,
Shangwen Lv,
Yingqi Qu,
Jing Liu,
Wayne Xin Zhao,
QiaoQiao She,
Hua Wu,
Haifeng Wang,
Ji-Rong Wen
Abstract:
Recently, dense passage retrieval has become a mainstream approach to finding relevant information in various natural language processing tasks. A number of studies have been devoted to improving the widely adopted dual-encoder architecture. However, most of the previous studies only consider query-centric similarity relation when learning the dual-encoder retriever. In order to capture more comprehensive similarity relations, we propose a novel approach that leverages both query-centric and PAssage-centric sImilarity Relations (called PAIR) for dense passage retrieval. To implement our approach, we make three major technical contributions by introducing formal formulations of the two kinds of similarity relations, generating high-quality pseudo labeled data via knowledge distillation, and designing an effective two-stage training procedure that incorporates passage-centric similarity relation constraint. Extensive experiments show that our approach significantly outperforms previous state-of-the-art models on both MSMARCO and Natural Questions datasets.
Submitted 23 April, 2023; v1 submitted 12 August, 2021;
originally announced August 2021.
-
End-to-End User Behavior Retrieval in Click-Through Rate Prediction Model
Authors:
Qiwei Chen,
Changhua Pei,
Shanshan Lv,
Chao Li,
Junfeng Ge,
Wenwu Ou
Abstract:
Click-Through Rate (CTR) prediction is one of the core tasks in recommender systems (RS). It predicts a personalized click probability for each user-item pair. Recently, researchers have found that the performance of CTR models can be improved greatly by taking the user behavior sequence into consideration, especially the long-term user behavior sequence. A report on an e-commerce website shows that 23% of users have more than 1000 clicks during the past 5 months. Although numerous works focus on modeling sequential user behaviors, few can handle long-term user behavior sequences due to the strict inference time constraints in real-world systems. Two-stage methods have been proposed to push the limit for better performance. At the first stage, an auxiliary task is designed to retrieve the top-$k$ similar items from the long-term user behavior sequence. At the second stage, the classical attention mechanism is applied between the candidate item and the $k$ items selected in the first stage. However, an information gap arises between the retrieval stage and the main CTR task. This goal divergence can greatly diminish the performance gain from long-term user sequences. In this paper, inspired by Reformer, we propose a locality-sensitive hashing (LSH) method called ETA (End-to-end Target Attention), which can greatly reduce the training and inference cost and make end-to-end training with long-term user behavior sequences possible. Both offline and online experiments confirm the effectiveness of our model. We deploy ETA into a large-scale real-world e-commerce system and achieve an additional 3.1% improvement in GMV (Gross Merchandise Value) compared to a two-stage long user sequence CTR model.
Submitted 10 August, 2021;
originally announced August 2021.
-
S2Looking: A Satellite Side-Looking Dataset for Building Change Detection
Authors:
Li Shen,
Yao Lu,
Hao Chen,
Hao Wei,
Donghai Xie,
Jiabao Yue,
Rui Chen,
Shouye Lv,
Bitao Jiang
Abstract:
Building-change detection underpins many important applications, especially in the military and crisis-management domains. Recent methods used for change detection have shifted towards deep learning, which depends on the quality of its training data. The assembly of large-scale annotated satellite imagery datasets is therefore essential for global building-change surveillance. Existing datasets almost exclusively offer near-nadir viewing angles. This limits the range of changes that can be detected. By offering larger observation ranges, the scroll imaging mode of optical satellites presents an opportunity to overcome this restriction. This paper therefore introduces S2Looking, a building-change-detection dataset that contains large-scale side-looking satellite images captured at various off-nadir angles. The dataset consists of 5000 bitemporal image pairs of rural areas and more than 65,920 annotated instances of changes throughout the world. The dataset can be used to train deep-learning-based change-detection algorithms. It expands upon existing datasets by providing (1) larger viewing angles; (2) large illumination variances; and (3) the added complexity of rural images. To facilitate the use of the dataset, a benchmark task has been established, and preliminary tests suggest that deep-learning algorithms find the dataset significantly more challenging than the closest-competing near-nadir dataset, LEVIR-CD+. S2Looking may therefore promote important advances in existing building-change-detection algorithms. The dataset is available at https://github.com/S2Looking/.
Submitted 11 January, 2022; v1 submitted 19 July, 2021;
originally announced July 2021.
-
DCCRN+: Channel-wise Subband DCCRN with SNR Estimation for Speech Enhancement
Authors:
Shubo Lv,
Yanxin Hu,
Shimin Zhang,
Lei Xie
Abstract:
Deep complex convolution recurrent network (DCCRN), which extends CRN with a complex structure, has achieved superior performance in MOS evaluation in the Interspeech 2020 deep noise suppression challenge (DNS2020). This paper further extends DCCRN with the following significant revisions. We first extend the model to sub-band processing, where the bands are split and merged by learnable neural network filters instead of engineered FIR filters, leading to a faster noise suppressor trained in an end-to-end manner. Then the LSTM is substituted with a complex TF-LSTM to better model temporal dependencies along both the time and frequency axes. Moreover, instead of simply concatenating the output of each encoder layer to the input of the corresponding decoder layer, we use convolution blocks to first aggregate essential information from the encoder output before feeding it to the decoder layers. We specifically formulate the decoder with an extra a priori SNR estimation module to maintain good speech quality while removing noise. Finally, a post-processing module is adopted to further suppress the unnatural residual noise. The new model, named DCCRN+, has surpassed the original DCCRN as well as several competitive models in terms of PESQ and DNSMOS, and has achieved superior performance in the new Interspeech 2021 DNS challenge.
Submitted 16 June, 2021;
originally announced June 2021.
-
F-T-LSTM based Complex Network for Joint Acoustic Echo Cancellation and Speech Enhancement
Authors:
Shimin Zhang,
Yuxiang Kong,
Shubo Lv,
Yanxin Hu,
Lei Xie
Abstract:
With the increasing demand for audio communication and online conferencing, ensuring the robustness of Acoustic Echo Cancellation (AEC) under complicated acoustic scenarios including noise, reverberation and nonlinear distortion has become a top issue. Although some traditional methods consider nonlinear distortion, they are still inefficient for echo suppression, and their performance degrades when noise is present. In this paper, we present a real-time AEC approach using a complex neural network to better model the important phase information, and frequency-time-LSTMs (F-T-LSTM), which scan both the frequency and time axes, for better temporal modeling. Moreover, we utilize a modified SI-SNR as the cost function so that the model achieves better echo cancellation and noise suppression (NS) performance. With only 1.4M parameters, the proposed approach outperforms the AEC-challenge baseline by 0.27 in terms of Mean Opinion Score (MOS).
Submitted 16 June, 2021; v1 submitted 14 June, 2021;
originally announced June 2021.
-
Combining Supervised and Un-supervised Learning for Automatic Citrus Segmentation
Authors:
Heqing Huang,
Tongbin Huang,
Zhen Li,
Zhiwei Wei,
Shilei Lv
Abstract:
Citrus segmentation is a key step of automatic citrus picking. While most current image segmentation approaches achieve good results by pixel-wise segmentation, these supervised learning-based methods require a large amount of annotated data and do not consider the continuous temporal changes of citrus position in real-world applications. In this paper, we first train a simple CNN with a small number of labelled citrus images in a supervised manner, which can roughly predict the citrus location in each frame. Then, we extend a state-of-the-art unsupervised learning approach to pre-learn the potential movements of citrus between frames from unlabelled citrus videos. To take advantage of both networks, we employ a multimodal transformer to combine the supervised learned static information and the unsupervised learned movement information. The experimental results show that combining both networks allows the prediction accuracy to reach 88.3% IOU and 93.6% precision, outperforming the original supervised baseline by 1.2% and 2.4%, respectively. Compared with most existing citrus segmentation methods, our method uses a small amount of supervised data and a large amount of unsupervised data, while learning both the pixel-level location information and the temporal information of citrus changes to enhance the segmentation effect.
Submitted 4 May, 2021;
originally announced May 2021.
-
AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario
Authors:
Yihui Fu,
Luyao Cheng,
Shubo Lv,
Yukai Jv,
Yuxiang Kong,
Zhuo Chen,
Yanxin Hu,
Lei Xie,
Jian Wu,
Hui Bu,
Xin Xu,
Jun Du,
Jingdong Chen
Abstract:
In this paper, we present AISHELL-4, a sizable real-recorded Mandarin speech dataset collected by an 8-channel circular microphone array for speech processing in conference scenarios. The dataset consists of 211 recorded meeting sessions, each containing 4 to 8 speakers, with a total length of 120 hours. This dataset aims to bridge advanced research on multi-speaker processing and the practical application scenario in three aspects. With real recorded meetings, AISHELL-4 provides realistic acoustics and rich natural speech characteristics in conversation such as short pauses, speech overlap, quick speaker turns, noise, etc. Meanwhile, accurate transcription and speaker voice activity are provided for each meeting in AISHELL-4. This allows researchers to explore different aspects of meeting processing, ranging from individual tasks such as speech front-end processing, speech recognition and speaker diarization, to multi-modality modeling and joint optimization of relevant tasks. Given that most open source datasets for multi-speaker tasks are in English, AISHELL-4 is the only Mandarin dataset for conversational speech, providing additional value for data diversity in the speech community. We also release a PyTorch-based training and evaluation framework as a baseline system to promote reproducible research in this field.
Submitted 10 August, 2021; v1 submitted 8 April, 2021;
originally announced April 2021.
-
Communication-efficient Byzantine-robust distributed learning with statistical guarantee
Authors:
Xingcai Zhou,
Le Chang,
Pengfei Xu,
Shaogao Lv
Abstract:
Communication efficiency and robustness are two major issues in modern distributed learning frameworks. This is due to practical situations in which some computing nodes may have limited communication power or may behave adversarially. To address the two issues simultaneously, this paper develops two communication-efficient and robust distributed learning algorithms for convex problems. Our motivation is based on the surrogate likelihood framework and the median and trimmed mean operations. In particular, the proposed algorithms are provably robust against Byzantine failures, and also achieve optimal statistical rates for strongly convex losses and convex (non-smooth) penalties. For typical statistical models such as generalized linear models, our results show that statistical errors dominate optimization errors in finite iterations. Simulated and real data experiments are conducted to demonstrate the numerical performance of our algorithms.
Submitted 27 February, 2021;
originally announced March 2021.
-
Generalization bounds for graph convolutional neural networks via Rademacher complexity
Authors:
Shaogao Lv
Abstract:
This paper aims at studying the sample complexity of graph convolutional networks (GCNs), by providing tight upper bounds of Rademacher complexity for GCN models with a single hidden layer. Under regularity conditions, these derived complexity bounds explicitly depend on the largest eigenvalue of the graph convolution filter and the degree distribution of the graph. In addition, we provide a lower bound of Rademacher complexity for GCNs to show the optimality of our derived upper bounds. Taking two commonly used examples as representatives, we discuss the implications of our results for designing graph convolution filters and graph distribution.
Submitted 19 February, 2021;
originally announced February 2021.
-
DCCRN: Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement
Authors:
Yanxin Hu,
Yun Liu,
Shubo Lv,
Mengtao Xing,
Shimin Zhang,
Yihui Fu,
Jian Wu,
Bihong Zhang,
Lei Xie
Abstract:
Speech enhancement has benefited from the success of deep learning in terms of intelligibility and perceptual quality. Conventional time-frequency (TF) domain methods focus on predicting TF-masks or the speech spectrum, via a naive convolution neural network (CNN) or recurrent neural network (RNN). Some recent studies use a complex-valued spectrogram as a training target but train in a real-valued network, predicting the magnitude and phase components or the real and imaginary parts, respectively. In particular, the convolution recurrent network (CRN) integrates a convolutional encoder-decoder (CED) structure and long short-term memory (LSTM), which has been proven to be helpful for complex targets. In order to train the complex target more effectively, in this paper, we design a new network structure simulating complex-valued operations, called Deep Complex Convolution Recurrent Network (DCCRN), where both the CNN and RNN structures can handle complex-valued operations. The proposed DCCRN models are very competitive with previous networks on both objective and subjective metrics. With only 3.7M parameters, our DCCRN models submitted to the Interspeech 2020 Deep Noise Suppression (DNS) challenge ranked first for the real-time track and second for the non-real-time track in terms of Mean Opinion Score (MOS).
Submitted 22 September, 2020; v1 submitted 1 August, 2020;
originally announced August 2020.
-
Establishing Secrecy Region for Directional Modulation Scheme with Random Frequency Diverse Array
Authors:
Shengping Lv,
Jinsong Hu,
Youjia Chen,
Zhimeng Xu,
Zhizhang Chen
Abstract:
Random frequency diverse array (RFDA) based directional modulation (DM) was proposed as a promising technology in secure communications to achieve a precise transmission of confidential messages, and artificial noise (AN) was considered as an important helper in RFDA-DM. Compared with previous works that only focus on the spot of the desired receiver, in this work, we investigate a secrecy region around the desired receiver, that is, a specific range and angle resolution around the desired receiver. Firstly, the minimum number of antennas and the bandwidth needed to achieve a secrecy region are derived. Moreover, based on the lower bound of the secrecy capacity in RFDA-DM-AN scheme, we investigate the performance impact of AN on the secrecy capacity. From this work, we conclude that: 1) AN is not always beneficial to the secure transmission. Specifically, when the number of antennas is sufficiently large and the transmit power is smaller than a specified value, AN will reduce secrecy capacity due to the consumption of limited transmit power. 2) Increasing bandwidth will enlarge the set for randomly allocating frequencies and thus lead to a higher secrecy capacity. 3) The minimum number of antennas increases as the predefined secrecy transmission rate increases.
Submitted 9 July, 2020;
originally announced July 2020.