-
Neural-g: A Deep Learning Framework for Mixing Density Estimation
Authors:
Shijie Wang,
Saptarshi Chakraborty,
Qian Qin,
Ray Bai
Abstract:
Mixing (or prior) density estimation is an important problem in machine learning and statistics, especially in empirical Bayes $g$-modeling where accurately estimating the prior is necessary for making good posterior inferences. In this paper, we propose neural-$g$, a new neural network-based estimator for $g$-modeling. Neural-$g$ uses a softmax output layer to ensure that the estimated prior is a valid probability density. Under default hyperparameters, we show that neural-$g$ is very flexible and capable of capturing many unknown densities, including those with flat regions, heavy tails, and/or discontinuities. In contrast, existing methods struggle to capture all of these prior shapes. We provide justification for neural-$g$ by establishing a new universal approximation theorem regarding the capability of neural networks to learn arbitrary probability mass functions. To accelerate convergence of our numerical implementation, we utilize a weighted average gradient descent approach to update the network parameters. Finally, we extend neural-$g$ to multivariate prior density estimation. We illustrate the efficacy of our approach through simulations and analyses of real datasets. A software package to implement neural-$g$ is publicly available at https://github.com/shijiew97/neuralG.
Submitted 9 June, 2024;
originally announced June 2024.
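The softmax construction mentioned in the neural-$g$ abstract above can be illustrated with a small sketch. This is not the authors' implementation: the support grid, the Gaussian noise model, and the hand-picked logits (standing in for the output of a trained network) are all assumptions made for illustration. It shows only why a softmax layer yields a valid probability mass function over a grid, and how such weights enter a mixture likelihood.

```python
import math

def softmax(logits):
    """Map arbitrary real scores to non-negative weights that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normal_pdf(x, mean, sd=1.0):
    """Density of N(mean, sd^2) at x (assumed noise model, not from the abstract)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def marginal_log_likelihood(observations, grid, logits, sd=1.0):
    """Log-likelihood of the data under a discrete prior on `grid`."""
    weights = softmax(logits)
    total = 0.0
    for x in observations:
        mixture = sum(w * normal_pdf(x, t, sd) for w, t in zip(weights, grid))
        total += math.log(mixture)
    return total

# Hypothetical support points and logits; a real estimator would learn the logits.
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
logits = [0.1, -0.3, 2.0, 0.5, -1.2]
weights = softmax(logits)
ll = marginal_log_likelihood([0.3, -1.1, 2.4], grid, logits)
```

Because the weights come from a softmax, they are positive and sum to one whatever the logits are, so an optimizer can update the logits freely without a separate normalization constraint.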
-
FaceCom: Towards High-fidelity 3D Facial Shape Completion via Optimization and Inpainting Guidance
Authors:
Yinglong Li,
Hongyu Wu,
Xiaogang Wang,
Qingzhao Qin,
Yijiao Zhao,
Yong Wang,
Aimin Hao
Abstract:
We propose FaceCom, a method for 3D facial shape completion, which delivers high-fidelity results for incomplete facial inputs of arbitrary forms. Unlike end-to-end shape completion methods based on point clouds or voxels, our approach relies on a mesh-based generative network that is easy to optimize, enabling it to handle shape completion for irregular facial scans. We first train a shape generator on a mixed 3D facial dataset containing 2405 identities. Based on the incomplete facial input, we fit complete faces using an optimization approach under image inpainting guidance. The completion results are refined through a post-processing step. FaceCom demonstrates the ability to effectively and naturally complete facial scan data with varying missing regions and degrees of missing areas. Our method can be used in medical prosthetic fabrication and the registration of deficient scanning data. Our experimental results demonstrate that FaceCom achieves exceptional performance in fitting and shape completion tasks. The code is available at https://github.com/dragonylee/FaceCom.git.
Submitted 4 June, 2024;
originally announced June 2024.
-
Wrangling Data Issues to be Wrangled: Literature Review, Taxonomy, and Industry Case Study
Authors:
Qiaolin Qin,
Heng Li,
Ettore Merlo
Abstract:
Data quality is vital for user experience in products reliant on data. As solutions for data quality problems, researchers have developed various taxonomies for different types of issues. However, although some of the existing taxonomies are near-comprehensive, their over-complexity has limited their actionability in data issue solution development. Hence, recent researchers issued new sets of data issue categories that are more concise for better usability. Although more concise, modern data issue labeling's over-catering to the solution systems may sometimes cause the taxonomy's categories not to be mutually exclusive. Consequently, different categories sometimes overlap in determining the issue types, or the same categories have different definitions across studies. This hinders solution development and confounds issue detection. Therefore, based on observations from a literature review and feedback from our industry partner, we propose a comprehensive taxonomy of data quality issues along two distinct dimensions: the attribute dimension, which represents the intrinsic characteristics of the issues, and the outcome dimension, which indicates their manifestation. With the categories redefined, we labeled the reported data issues in our industry partner's data warehouse. The labeled issues provide us with a general idea of the distribution of each type of problem and of which types of issues require the most effort and care to deal with. Our work aims to address a widely generalizable taxonomy rule in modern data quality issue engineering and helps practitioners and researchers understand their data issues and estimate the efforts required for issue fixing.
Submitted 24 May, 2024;
originally announced May 2024.
-
FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models
Authors:
Liwen Zhang,
Weige Cai,
Zhaowei Liu,
Zhi Yang,
Wei Dai,
Yujie Liao,
Qianru Qin,
Yifei Li,
Xingyu Liu,
Zhiqiang Liu,
Zhoufan Zhu,
Anbo Wu,
Xin Guo,
Yun Chen
Abstract:
Large language models (LLMs) have demonstrated exceptional performance in various natural language processing tasks, yet their efficacy in more challenging and domain-specific tasks remains largely unexplored. This paper presents FinEval, a benchmark specifically designed for the financial domain knowledge in the LLMs. FinEval is a collection of high-quality multiple-choice questions covering Finance, Economy, Accounting, and Certificate. It includes 4,661 questions spanning 34 different academic subjects. To ensure a comprehensive model performance evaluation, FinEval employs a range of prompt types, including zero-shot and few-shot prompts, as well as answer-only and chain-of-thought prompts. Evaluating state-of-the-art Chinese and English LLMs on FinEval, the results show that only GPT-4 achieved an accuracy close to 70% in different prompt settings, indicating significant growth potential for LLMs in the financial domain knowledge. Our work offers a more comprehensive financial knowledge evaluation benchmark, utilizing data of mock exams and covering a wide range of evaluated LLMs.
Submitted 19 August, 2023;
originally announced August 2023.
-
Unstoppable Attack: Label-Only Model Inversion via Conditional Diffusion Model
Authors:
Rongke Liu,
Dong Wang,
Yizhi Ren,
Zhen Wang,
Kaitian Guo,
Qianqian Qin,
Xiaolei Liu
Abstract:
Model inversion attacks (MIAs) aim to recover private data from inaccessible training sets of deep learning models, posing a privacy threat. MIAs primarily focus on the white-box scenario where attackers have full access to the model's structure and parameters. However, practical applications are usually in black-box scenarios or label-only scenarios, i.e., the attackers can only obtain the output confidence vectors or labels by accessing the model. Therefore, the attack models in existing MIAs are difficult to effectively train with the knowledge of the target model, resulting in sub-optimal attacks. To the best of our knowledge, we pioneer the research of a powerful and practical attack model in the label-only scenario.
In this paper, we develop a novel MIA method, leveraging a conditional diffusion model (CDM) to recover representative samples under the target label from the training set. Two techniques are introduced: selecting an auxiliary dataset relevant to the target model task and using predicted labels as conditions to guide training CDM; and inputting target label, pre-defined guidance strength, and random noise into the trained attack model to generate and correct multiple results for final selection. This method is evaluated using Learned Perceptual Image Patch Similarity as a new metric and as a judgment basis for deciding the values of hyper-parameters. Experimental results show that this method can generate similar and accurate samples to the target label, outperforming generators of previous approaches.
Submitted 6 March, 2024; v1 submitted 17 July, 2023;
originally announced July 2023.
-
Securing Semantic Communications with Physical-layer Semantic Encryption and Obfuscation
Authors:
Qi Qin,
Yankai Rong,
Guoshun Nan,
Shaokang Wu,
Xuefei Zhang,
Qimei Cui,
Xiaofeng Tao
Abstract:
Deep learning based semantic communication (DLSC) systems have shown great potential for making wireless networks significantly more efficient by only transmitting the semantics of the data. However, the open nature of the wireless channel and the fragility of neural models make DLSC systems extremely vulnerable to various attacks. Traditional wireless physical layer key (PLK), which relies on reciprocal channel and randomness characteristics between two legitimate users, holds the promise of securing DLSC. The main challenge lies in generating secret keys in the static environment with ultra-low/zero rate. Different from prior efforts that use relays or reconfigurable intelligent surfaces (RIS) to manipulate wireless channels, this paper proposes a novel physical layer semantic encryption scheme by exploring the randomness of bilingual evaluation understudy (BLEU) scores in the field of machine translation, and additionally presents a novel semantic obfuscation mechanism to provide further physical layer protections. Specifically, 1) we calculate the BLEU scores and corresponding weights of the DLSC system. Then, we generate semantic keys (SKey) by feeding the weighted sum of the scores into a hash function. 2) Equipped with the SKey, our proposed subcarrier obfuscation is able to further secure semantic communications with a dynamic dummy data insertion mechanism. Experiments show the effectiveness of our method, especially in the static wireless environment.
Submitted 20 April, 2023;
originally announced April 2023.
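The key-generation step described in the semantic encryption abstract above (a weighted sum of BLEU scores fed into a hash function) can be sketched as follows. This is an illustration under stated assumptions, not the paper's scheme: the choice of SHA-256, the fixed-decimal quantization of the weighted sum, and the example scores and weights are all hypothetical. The abstract does not say how the two parties agree on the scores or how the sum is encoded before hashing.

```python
import hashlib

def semantic_key(bleu_scores, weights, decimals=4):
    """Derive a hex key from a weighted sum of BLEU scores.

    Assumptions (not from the source): SHA-256 as the hash function and
    rounding to `decimals` places so both parties hash identical bytes.
    """
    if len(bleu_scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    weighted_sum = sum(s * w for s, w in zip(bleu_scores, weights))
    quantized = f"{weighted_sum:.{decimals}f}"  # fixed-width text encoding
    return hashlib.sha256(quantized.encode("utf-8")).hexdigest()

# Hypothetical scores and weights for illustration only.
key = semantic_key([0.42, 0.37, 0.55], [0.5, 0.3, 0.2])
same_key = semantic_key([0.42, 0.37, 0.55], [0.5, 0.3, 0.2])
other_key = semantic_key([0.10, 0.37, 0.55], [0.5, 0.3, 0.2])
```

The quantization step is a design choice of this sketch: without it, tiny floating-point differences between the two ends would produce entirely different digests.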
-
Super-resolution Reconstruction of Single Image for Latent features
Authors:
Xin Wang,
Jing-Ke Yan,
Jing-Ye Cai,
Jian-Hua Deng,
Qin Qin,
Yao Cheng
Abstract:
Single-image super-resolution (SISR) typically focuses on restoring various degraded low-resolution (LR) images to a single high-resolution (HR) image. However, during SISR tasks, it is often challenging for models to simultaneously maintain high quality and rapid sampling while preserving diversity in details and texture features. This challenge can lead to issues such as model collapse, lack of rich details and texture features in the reconstructed HR images, and excessive time consumption for model sampling. To address these problems, this paper proposes a Latent Feature-oriented Diffusion Probability Model (LDDPM). First, we designed a conditional encoder capable of effectively encoding LR images, reducing the solution space for model image reconstruction and thereby improving the quality of the reconstructed images. We then employed a normalized flow and multimodal adversarial training, learning from complex multimodal distributions, to model the denoising distribution. Doing so boosts the generative modeling capabilities within a minimal number of sampling steps. Experimental comparisons of our proposed model with existing SISR methods on mainstream datasets demonstrate that our model reconstructs more realistic HR images and achieves better performance on multiple evaluation metrics, providing a fresh perspective for tackling SISR tasks.
Submitted 9 November, 2023; v1 submitted 16 November, 2022;
originally announced November 2022.
-
A Weakly Supervised Learning Framework for Salient Object Detection via Hybrid Labels
Authors:
Runmin Cong,
Qi Qin,
Chen Zhang,
Qiuping Jiang,
Shiqi Wang,
Yao Zhao,
Sam Kwong
Abstract:
Fully-supervised salient object detection (SOD) methods have made great progress, but such methods often rely on a large number of pixel-level annotations, which are time-consuming and labour-intensive. In this paper, we focus on a new weakly-supervised SOD task under hybrid labels, where the supervision labels include a large number of coarse labels generated by the traditional unsupervised method and a small number of real labels. To address the issues of label noise and quantity imbalance in this task, we design a new pipeline framework with three sophisticated training strategies. In terms of model framework, we decouple the task into label refinement sub-task and salient object detection sub-task, which cooperate with each other and train alternately. Specifically, the R-Net is designed as a two-stream encoder-decoder model equipped with Blender with Guidance and Aggregation Mechanisms (BGA), aiming to rectify the coarse labels for more reliable pseudo-labels, while the S-Net is a replaceable SOD network supervised by the pseudo labels generated by the current R-Net. Note that, we only need to use the trained S-Net for testing. Moreover, in order to guarantee the effectiveness and efficiency of network training, we design three training strategies, including alternate iteration mechanism, group-wise incremental mechanism, and credibility verification mechanism. Experiments on five SOD benchmarks show that our method achieves competitive performance against weakly-supervised/unsupervised methods both qualitatively and quantitatively.
Submitted 7 September, 2022;
originally announced September 2022.
-
Domain Shift-oriented Machine Anomalous Sound Detection Model Based on Self-Supervised Learning
Authors:
Jing-ke Yan,
Xin Wang,
Qin Wang,
Qin Qin,
Huang-he Li,
Peng-fei Ye,
Yue-ping He,
Jing Zeng
Abstract:
Thanks to the development of deep learning, research on machine anomalous sound detection based on self-supervised learning has made remarkable achievements. However, there are differences in the acoustic characteristics of the test set and the training set under different operating conditions of the same machine (domain shifts). It is challenging for the existing detection methods to learn domain-shift features stably with low computation overhead. To address these problems, we propose a domain shift-oriented machine anomalous sound detection model based on self-supervised learning (TranSelf-DyGCN) in this paper. Firstly, we design a time-frequency domain feature modeling network to capture global and local spatial and time-domain features, thus improving the stability of machine anomalous sound detection under domain shifts. Then, we adopt a Dynamic Graph Convolutional Network (DyGCN) to model the inter-dependence relationship between domain-shift features, enabling the model to perceive domain-shift features efficiently. Finally, we use a Domain Adaptive Network (DAN) to compensate for the performance decrease caused by domain shifts, making the model adapt to anomalous sound better in the self-supervised environment. The performance of the suggested model is validated on DCASE 2020 task 2 and DCASE 2022 task 2.
Submitted 7 September, 2022; v1 submitted 31 August, 2022;
originally announced August 2022.
-
Spectral Telescope: Convergence Rate Bounds for Random-Scan Gibbs Samplers Based on a Hierarchical Structure
Authors:
Qian Qin,
Guanyang Wang
Abstract:
Random-scan Gibbs samplers possess a natural hierarchical structure. The structure connects Gibbs samplers targeting higher dimensional distributions to those targeting lower dimensional ones. This leads to a quasi-telescoping property of their spectral gaps. Based on this property, we derive three new bounds on the spectral gaps and convergence rates of Gibbs samplers on general domains. The three bounds relate a chain's spectral gap to, respectively, the correlation structure of the target distribution, a class of random walk chains, and a collection of influence matrices. Notably, one of our results generalizes the technique of spectral independence, which has received considerable attention for its success on finite domains, to general state spaces. We illustrate our methods through a sampler targeting the uniform distribution on a corner of an $n$-cube.
Submitted 13 October, 2022; v1 submitted 24 August, 2022;
originally announced August 2022.
-
Reconfigurable MIMO towards Electro-magnetic Information Theory: Capacity Maximization Pattern Design
Authors:
Haonan Wang,
Ang Li,
Ya-feng Liu,
Qibo Qin,
Lingyang Song,
Yonghui Li
Abstract:
In this paper, we focus on the pattern reconfigurable multiple-input multiple-output (PR-MIMO), a technique that has the potential to bridge the gap between electro-magnetics and communications towards the emerging Electro-magnetic Information Theory (EIT). Specifically, we focus on the pattern design problem aimed at maximizing the channel capacity for reconfigurable MIMO communication systems, where we firstly introduce the matrix representation of PR-MIMO and further formulate a pattern design problem. We decompose the pattern design into two steps, i.e., the correlation modification process to optimize the correlation structure of the channel, followed by the power allocation process to improve the channel quality based on the optimized channel structure. For the correlation modification process, we propose a sequential optimization framework with eigenvalue decomposition to obtain near-optimal solutions. For the power allocation process, we provide a closed-form power allocation scheme to redistribute the transmission power among the modified subchannels. Numerical results show that the proposed pattern design scheme offers significant improvements over legacy MIMO systems, which motivates the application of PR-MIMO in wireless communication systems.
Submitted 11 March, 2022;
originally announced March 2022.
-
Preventing Timing Side-Channels via Security-Aware Just-In-Time Compilation
Authors:
Qi Qin,
JulianAndres JiYang,
Fu Song,
Taolue Chen,
Xinyu Xing
Abstract:
Recent work has shown that Just-In-Time (JIT) compilation can introduce timing side-channels to constant-time programs, which would otherwise be a principled and effective means to counter timing attacks. In this paper, we propose a novel approach to eliminate JIT-induced leaks from these programs. Specifically, we present an operational semantics and a formal definition of constant-time programs under JIT compilation, laying the foundation for reasoning about programs with JIT compilation. We then propose to eliminate JIT-induced leaks via a fine-grained JIT compilation for which we provide an automated approach to generate policies and a novel type system to show its soundness. We develop a tool, DeJITLeak, for Java based on our approach and implement the fine-grained JIT compilation in HotSpot. Experimental results show that DeJITLeak can effectively and efficiently eliminate JIT-induced leaks on three datasets used in side-channel detection.
Submitted 26 February, 2022;
originally announced February 2022.
-
Achievable Rate Maximization Pattern Design for Reconfigurable MIMO Antenna Array
Authors:
Haonan Wang,
Ang Li,
Ya-Feng Liu,
Qibo Qin,
Lingyang Song,
Yonghui Li
Abstract:
Reconfigurable multiple-input multiple-output can provide performance gains over traditional MIMO by reshaping the channels, i.e., introducing more channel realizations. In this paper, we focus on the achievable rate maximization pattern design for reconfigurable MIMO systems. Firstly, we introduce the matrix representation of pattern reconfigurable MIMO (PR-MIMO), based on which a pattern design problem is formulated. To further reveal the effect of the radiation pattern on the wireless channel, we consider pattern design for both the single-pattern case where the optimized radiation pattern is the same for all the antenna elements, and the multi-pattern case where different antenna elements can adopt different radiation patterns. For the single-pattern case, we show that the pattern design is equivalent to a redistribution of gains among all scattering paths, and an eigenvalue optimization based solution is obtained. For the multi-pattern case, we propose a sequential optimization framework with manifold optimization and eigenvalue decomposition to obtain near-optimal solutions. Numerical results validate the superiority of PR-MIMO systems over traditional MIMO in terms of achievable rate, and also show the effectiveness of the proposed solutions.
Submitted 13 February, 2023; v1 submitted 11 February, 2022;
originally announced February 2022.
-
Superpixel-Based Building Damage Detection from Post-earthquake Very High Resolution Imagery Using Deep Neural Networks
Authors:
Jun Wang,
Zhoujing Li,
Yixuan Qiao,
Qiming Qin,
Peng Gao,
Guotong Xie
Abstract:
Building damage detection after natural disasters like earthquakes is crucial for initiating effective emergency response actions. Remotely sensed very high spatial resolution (VHR) imagery can provide vital information due to its ability to map the affected buildings with high geometric precision. Many approaches have been developed to detect buildings damaged by earthquakes. However, little attention has been paid to exploiting rich features represented in VHR images using Deep Neural Networks (DNN). This paper presents a novel superpixel based approach combining DNN and a modified segmentation method, to detect damaged buildings from VHR imagery. Firstly, a modified Fast Scanning and Adaptive Merging method is extended to create initial over-segmentation. Secondly, the segments are merged based on the Region Adjacent Graph (RAG), considering an improved semantic similarity criterion composed of Local Binary Patterns (LBP) texture, spectral, and shape features. Thirdly, a pre-trained DNN using Stacked Denoising Auto-Encoders called SDAE-DNN is presented, to exploit the rich semantic features for building damage detection. Deep-layer feature abstraction of SDAE-DNN could boost detection accuracy through learning more intrinsic and discriminative features, which outperformed other methods using state-of-the-art alternative classifiers. We demonstrate the feasibility and effectiveness of our method using a subset of WorldView-2 imagery, in the complex urban areas of Bhaktapur, Nepal, which was affected by the Nepal Earthquake of April 25, 2015.
Submitted 30 September, 2022; v1 submitted 9 December, 2021;
originally announced December 2021.
-
Loop closure detection using local 3D deep descriptors
Authors:
Youjie Zhou,
Yiming Wang,
Fabio Poiesi,
Qi Qin,
Yi Wan
Abstract:
We present a simple yet effective method to address loop closure detection in simultaneous localisation and mapping using local 3D deep descriptors (L3Ds). L3Ds are emerging compact representations of patches extracted from point clouds that are learnt from data using a deep learning algorithm. We propose a novel overlap measure for loop detection by computing the metric error between points that correspond to mutually-nearest-neighbour descriptors after registering the loop candidate point cloud by its estimated relative pose. This novel approach enables us to accurately detect loops and estimate six degrees-of-freedom poses in the case of small overlaps. We compare our L3D-based loop closure approach with recent approaches on LiDAR data and achieve state-of-the-art loop closure detection accuracy. Additionally, we embed our loop closure approach in RESLAM, a recent edge-based SLAM system, and perform the evaluation on real-world RGBD-TUM and synthetic ICL datasets. Our approach enables RESLAM to achieve a better localisation accuracy compared to its original loop closure strategy. Our project page is available at github.com/yiming107/l3d_loop_closure.
Submitted 27 February, 2022; v1 submitted 31 October, 2021;
originally announced November 2021.
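The overlap measure in the loop closure abstract above relies on mutually-nearest-neighbour descriptor matches and the metric error between the matched points. A minimal sketch of that idea follows. It is not the authors' code: the brute-force search, the use of the mean Euclidean distance as the error, and the assumption that `points_a` has already been registered into the frame of `points_b` are all illustrative choices.

```python
import math

def nearest(query, candidates):
    """Index of the candidate closest to `query` (brute-force search)."""
    return min(range(len(candidates)),
               key=lambda j: math.dist(query, candidates[j]))

def mutual_nearest_pairs(desc_a, desc_b):
    """Pairs (i, j) where a[i] and b[j] are each other's nearest descriptor."""
    a_to_b = [nearest(d, desc_b) for d in desc_a]
    b_to_a = [nearest(d, desc_a) for d in desc_b]
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]

def mean_point_error(points_a, points_b, pairs):
    """Mean distance between matched points; assumes clouds are already registered."""
    if not pairs:
        return math.inf  # no mutual matches: treat as no measurable overlap
    return sum(math.dist(points_a[i], points_b[j]) for i, j in pairs) / len(pairs)

# Hypothetical 2D descriptors; real descriptors would be higher dimensional.
desc_a = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
desc_b = [(1.1, 1.0), (0.1, 0.0)]
pairs = mutual_nearest_pairs(desc_a, desc_b)
```

The third descriptor in `desc_a` has a nearest neighbour in `desc_b`, but that neighbour prefers a different descriptor, so the mutual check discards it. This filtering is what makes the match set usable when the two clouds overlap only partially.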
-
Text Classification with Novelty Detection
Authors:
Qi Qin,
Wenpeng Hu,
Bing Liu
Abstract:
This paper studies the problem of detecting novel or unexpected instances in text classification. In traditional text classification, the classes that appear in testing must have been seen in training. However, in many applications, this is not the case because in testing, we may see unexpected instances that are not from any of the training classes. In this paper, we propose a significantly more effective approach that converts the original problem to a pair-wise matching problem and then outputs how probable it is that two instances belong to the same class. Under this approach, we present two models. The more effective model uses two embedding matrices of a pair of instances as two channels of a CNN. The output probabilities from such pairs are used to judge whether a test instance is from a seen class or is novel/unexpected. Experimental results show that the proposed method substantially outperforms the state-of-the-art baselines.
Submitted 23 September, 2020;
originally announced September 2020.
-
Learning the Optimal Synchronization Rates in Distributed SDN Control Architectures
Authors:
Konstantinos Poularakis,
Qiaofeng Qin,
Liang Ma,
Sastry Kompella,
Kin K. Leung,
Leandros Tassiulas
Abstract:
Since the early development of Software-Defined Network (SDN) technology, researchers have been concerned with the idea of physical distribution of the control plane to address scalability and reliability challenges of centralized designs. However, having multiple controllers managing the network while maintaining a "logically-centralized" network view brings additional challenges. One such challenge is how to coordinate the management decisions made by the controllers which is usually achieved by disseminating synchronization messages in a peer-to-peer manner. While there exist many architectures and protocols to ensure synchronized network views and drive coordination among controllers, there is no systematic methodology for deciding the optimal frequency (or rate) of message dissemination. In this paper, we fill this gap by introducing the SDN synchronization problem: how often to synchronize the network views for each controller pair. We consider two different objectives; first, the maximization of the number of controller pairs that are synchronized, and second, the maximization of the performance of applications of interest which may be affected by the synchronization rate. Using techniques from knapsack optimization and learning theory, we derive algorithms with provable performance guarantees for each objective. Evaluation results demonstrate significant benefits over baseline schemes that synchronize all controller pairs at equal rate.
Submitted 25 January, 2019;
originally announced January 2019.
-
To Compress, or Not to Compress: Characterizing Deep Learning Model Compression for Embedded Inference
Authors:
Qing Qin,
Jie Ren,
Jialong Yu,
Ling Gao,
Hai Wang,
Jie Zheng,
Yansong Feng,
Jianbin Fang,
Zheng Wang
Abstract:
The recent advances in deep neural networks (DNNs) make them attractive for embedded systems. However, it can take a long time for DNNs to make an inference on resource-constrained computing devices. Model compression techniques can address the computation issue of deep inference on embedded devices. These techniques are highly attractive, as they do not rely on specialized hardware or on computation offloading, which is often infeasible due to privacy concerns or high latency. However, it remains unclear how model compression techniques perform across a wide range of DNNs. To design efficient embedded deep learning solutions, we need to understand their behaviors. This work develops a quantitative approach to characterize model compression techniques on a representative embedded deep learning architecture, the NVIDIA Jetson Tx2. We perform extensive experiments by considering 11 influential neural network architectures from the image classification and the natural language processing domains. We experimentally show how two mainstream compression techniques, data quantization and pruning, perform on these network architectures, and the implications of compression techniques for model storage size, inference time, energy consumption, and performance metrics. We demonstrate that there are opportunities to achieve fast deep inference on embedded systems, but one must carefully choose the compression settings. Our results provide insights on when and how to apply model compression techniques and guidelines for designing efficient embedded deep learning systems.
Submitted 21 October, 2018;
originally announced October 2018.
-
On the Iteration Complexity Analysis of Stochastic Primal-Dual Hybrid Gradient Approach with High Probability
Authors:
Linbo Qiao,
Tianyi Lin,
Qi Qin,
Xicheng Lu
Abstract:
In this paper, we propose a stochastic Primal-Dual Hybrid Gradient (PDHG) approach for solving a wide spectrum of regularized stochastic minimization problems, where the regularization term is composite with a linear function. It has been recognized that solving this kind of problem is challenging since the closed-form solution of the proximal mapping associated with the regularization term is not available due to the imposed linear composition, and the per-iteration cost of computing the full gradient of the expected objective function is extremely high when the number of input data samples is considerably large.
Our new approach overcomes these issues by exploring the special structure of the regularization term and sampling a few data points at each iteration. Rather than analyzing the convergence in expectation, we provide a detailed iteration complexity analysis, with high probability, for both uniformly and non-uniformly averaged iterates. This strongly supports the good practical performance of the proposed approach. Numerical experiments demonstrate the efficiency of stochastic PDHG, which outperforms other competing algorithms, as expected from the high-probability convergence analysis.
Submitted 1 February, 2018; v1 submitted 21 January, 2018;
originally announced January 2018.
-
Leveraging the Flow of Collective Attention for Computational Communication Research
Authors:
Cheng-Jun Wang,
Zhi-Cong Chen,
Qiang Qin,
Naipeng Chao
Abstract:
Human attention has become an increasingly important resource for our understanding of collective human behaviors in the age of information explosion. To better understand the flow of collective attention, we construct the attention flow network using anonymous smartphone data of 100,000 users in a major city of China. In the constructed network, nodes are websites visited by users, and links denote the switch of users between two websites. We quantify the flow of collective attention by computing the flow network statistics, such as flow impact, flow dissipation, and flow distance. The findings reveal a strong concentration and fragmentation of collective attention for smartphone users, while the duplication of attention across websites proves to be unfounded in mobile use. We further confirm the law of dissipation and the allometric scaling of flow impact. Surprisingly, there is a centralized flow structure, suggesting that websites with large traffic can easily control the circulated collective attention. Additionally, we find that flow network analysis can effectively explain the page views and sales volume of products. Finally, we discuss the benefits and limitations of using flow network analysis for computational communication research.
Submitted 21 October, 2017;
originally announced October 2017.
-
Bringing SDN to the Mobile Edge
Authors:
Konstantinos Poularakis,
Qiaofeng Qin,
Erich Nahum,
Miguel Rio,
Leandros Tassiulas
Abstract:
Nowadays, Software Defined Network (SDN) architectures and applications are revolutionizing the way wired networks are built and operated. However, little is known about the potential of this disruptive technology in wireless mobile networks. In fact, SDN is based on a centralized network control principle, while existing mobile network protocols place emphasis on the distribution of network resources and their management. Therefore, it is challenging to apply SDN ideas in the context of mobile networks. In this paper, we propose methods to overcome these challenges and make SDN more suitable for the mobile environment. Our main idea is to combine centralized SDN and distributed control in a hybrid design that takes the best of the two paradigms: (i) the global network view and control programmability of SDN and (ii) the robustness of distributed protocols. We discuss the pros and cons of each method and highlight them in an SDN prototype implementation built using off-the-shelf mobile devices.
Submitted 19 June, 2017;
originally announced June 2017.
-
Distributed Compressive Sensing Based Doubly Selective Channel Estimation for Large-Scale MIMO Systems
Authors:
Bo Gong,
Qibo Qin,
Xiang Ren,
Lin Gui,
Hanwen Luo,
Wen Chen
Abstract:
Doubly selective (DS) channel estimation in large-scale multiple-input multiple-output (MIMO) systems is a challenging problem due to the requirement of unaffordable pilot overheads and prohibitive complexity. In this paper, we propose a novel distributed compressive sensing (DCS) based channel estimation scheme to solve this problem. In the scheme, we introduce the basis expansion model (BEM) to reduce the required channel coefficients and pilot overheads. Owing to the common sparsity of all the transmit-receive antenna pairs in the delay domain, we estimate the BEM coefficients within the DCS framework, which has a simple linear structure with low complexity. Furthermore, a linear smoothing method is proposed to improve the estimation accuracy. Finally, we conduct various simulations to verify the validity of the proposed scheme and demonstrate its performance gains compared with conventional schemes.
Submitted 9 November, 2015;
originally announced November 2015.
-
Post-buckling Solutions of Hyper-elastic Beam by Canonical Dual Finite Element Method
Authors:
Kun Cai,
David Y. Gao,
Qing H. Qin
Abstract:
The post-buckling problem of a large deformed beam is analyzed using the canonical dual finite element method (CD-FEM). The feature of this method is to choose correctly the canonical dual stress so that the original non-convex potential energy functional is reformulated in a mixed complementary energy form with both displacement and stress fields, and a pure complementary energy is explicitly formulated in finite dimensional space. Based on the canonical duality theory and the associated triality theorem, a primal-dual algorithm is proposed, which can be used to find all possible solutions of this nonconvex post-buckling problem. Numerical results show that the global maximum of the pure-complementary energy leads to a stable buckled configuration of the beam, while the local extrema of the pure-complementary energy represent unstable deformation states. In particular, we discovered that the unstable buckled state is very sensitive to the number of total elements and the external loads. Theoretical results are verified through numerical examples, and some interesting phenomena in the post-bifurcation of this large deformed beam are observed.
Submitted 17 February, 2013;
originally announced February 2013.