subscribe to arXiv mailings

A Two-stage Evolutionary Framework For Multi-objective Optimization

Authors: Peng Chen, Jing Liang, Kangjia Qiao, Ponnuthurai Nagaratnam Suganthan, Xuanxuan Ban

Abstract: In the field of evolutionary multi-objective optimization, the approximation of the Pareto front (PF) is achieved by utilizing a collection of representative candidate solutions that exhibit desirable convergence and diversity. Although several multi-objective evolutionary algorithms (MOEAs) have been designed, they still have difficulties in keeping balance between convergence and diversity of po… ▽ More In the field of evolutionary multi-objective optimization, the approximation of the Pareto front (PF) is achieved by utilizing a collection of representative candidate solutions that exhibit desirable convergence and diversity. Although several multi-objective evolutionary algorithms (MOEAs) have been designed, they still have difficulties in keeping balance between convergence and diversity of population. To better solve multi-objective optimization problems (MOPs), this paper proposes a Two-stage Evolutionary Framework For Multi-objective Optimization (TEMOF). Literally, algorithms are divided into two stages to enhance the search capability of the population. During the initial half of evolutions, parental selection is exclusively conducted from the primary population. Additionally, we not only perform environmental selection on the current population, but we also establish an external archive to store individuals situated on the first PF. Subsequently, in the second stage, parents are randomly chosen either from the population or the archive. In the experiments, one classic MOEA and two state-of-the-art MOEAs are integrated into the framework to form three new algorithms. The experimental results demonstrate the superior and robust performance of the proposed framework across a wide range of MOPs. Besides, the winner among three new algorithms is compared with several existing MOEAs and shows better results. Meanwhile, we conclude the reasons that why the two-stage framework is effect for the existing benchmark functions. △ Less

Submitted 9 July, 2024; originally announced July 2024.

Comments: Accepted by the CEC conference of WCCI2024

arXiv:2405.05745 [pdf, other]

Efficient Pretraining Model based on Multi-Scale Local Visual Field Feature Reconstruction for PCB CT Image Element Segmentation

Authors: Chen Chen, Kai Qiao, Jie Yang, Jian Chen, Bin Yan

Abstract: Element segmentation is a key step in nondestructive testing of Printed Circuit Boards (PCB) based on Computed Tomography (CT) technology. In recent years, the rapid development of self-supervised pretraining technology can obtain general image features without labeled samples, and then use a small amount of labeled samples to solve downstream tasks, which has a good potential in PCB element segme… ▽ More Element segmentation is a key step in nondestructive testing of Printed Circuit Boards (PCB) based on Computed Tomography (CT) technology. In recent years, the rapid development of self-supervised pretraining technology can obtain general image features without labeled samples, and then use a small amount of labeled samples to solve downstream tasks, which has a good potential in PCB element segmentation. At present, Masked Image Modeling (MIM) pretraining model has been initially applied in PCB CT image element segmentation. However, due to the small and regular size of PCB elements such as vias, wires, and pads, the global visual field has redundancy for a single element reconstruction, which may damage the performance of the model. Based on this issue, we propose an efficient pretraining model based on multi-scale local visual field feature reconstruction for PCB CT image element segmentation (EMLR-seg). In this model, the teacher-guided MIM pretraining model is introduced into PCB CT image element segmentation for the first time, and a multi-scale local visual field extraction (MVE) module is proposed to reduce redundancy by focusing on local visual fields. At the same time, a simple 4-Transformer-blocks decoder is used. Experiments show that EMLR-seg can achieve 88.6% mIoU on the PCB CT image dataset we proposed, which exceeds 1.2% by the baseline model, and the training time is reduced by 29.6 hours, a reduction of 17.4% under the same experimental condition, which reflects the advantage of EMLR-seg in terms of performance and efficiency. △ Less

Submitted 9 May, 2024; originally announced May 2024.

arXiv:2307.01968 [pdf, other]

Muti-scale Graph Neural Network with Signed-attention for Social Bot Detection: A Frequency Perspective

Authors: Shuhao Shi, Kai Qiao, Zhengyan Wang, Jie Yang, Baojie Song, Jian Chen, Bin Yan

Abstract: The presence of a large number of bots on social media has adverse effects. The graph neural network (GNN) can effectively leverage the social relationships between users and achieve excellent results in detecting bots. Recently, more and more GNN-based methods have been proposed for bot detection. However, the existing GNN-based bot detection methods only focus on low-frequency information and se… ▽ More The presence of a large number of bots on social media has adverse effects. The graph neural network (GNN) can effectively leverage the social relationships between users and achieve excellent results in detecting bots. Recently, more and more GNN-based methods have been proposed for bot detection. However, the existing GNN-based bot detection methods only focus on low-frequency information and seldom consider high-frequency information, which limits the representation ability of the model. To address this issue, this paper proposes a Multi-scale with Signed-attention Graph Filter for social bot detection called MSGS. MSGS could effectively utilize both high and low-frequency information in the social graph. Specifically, MSGS utilizes a multi-scale structure to produce representation vectors at different scales. These representations are then combined using a signed-attention mechanism. Finally, multi-scale representations via MLP after polymerization to produce the final result. We analyze the frequency response and demonstrate that MSGS is a more flexible and expressive adaptive graph filter. MSGS can effectively utilize high-frequency information to alleviate the over-smoothing problem of deep GNNs. Experimental results on real-world datasets demonstrate that our method achieves better performance compared with several state-of-the-art social bot detection methods. △ Less

Submitted 4 July, 2023; originally announced July 2023.

Comments: 13 pages, 10 figures

arXiv:2306.13349 [pdf, other]

Multi-objective optimization based network control principles for identifying personalized drug targets with cancer

Authors: Jing Liang, Zhuo Hu, Zong-Wei Li, Kang-Jia Qiao, Wei-Feng Guo

Abstract: It is a big challenge to develop efficient models for identifying personalized drug targets (PDTs) from high-dimensional personalized genomic profile of individual patients. Recent structural network control principles have introduced a new approach to discover PDTs by selecting an optimal set of driver genes in personalized gene interaction network (PGIN). However, most of current methods only fo… ▽ More It is a big challenge to develop efficient models for identifying personalized drug targets (PDTs) from high-dimensional personalized genomic profile of individual patients. Recent structural network control principles have introduced a new approach to discover PDTs by selecting an optimal set of driver genes in personalized gene interaction network (PGIN). However, most of current methods only focus on controlling the system through a minimum driver-node set and ignore the existence of multiple candidate driver-node sets for therapeutic drug target identification in PGIN. Therefore, this paper proposed multi-objective optimization-based structural network control principles (MONCP) by considering minimum driver nodes and maximum prior-known drug-target information. To solve MONCP, a discrete multi-objective optimization problem is formulated with many constrained variables, and a novel evolutionary optimization model called LSCV-MCEA was developed by adapting a multi-tasking framework and a rankings-based fitness function method. With genomics data of patients with breast or lung cancer from The Cancer Genome Atlas database, the effectiveness of LSCV-MCEA was validated. The experimental results indicated that compared with other advanced methods, LSCV-MCEA can more effectively identify PDTs with the highest Area Under the Curve score for predicting clinically annotated combinatorial drugs. Meanwhile, LSCV-MCEA can more effectively solve MONCP than other evolutionary optimization methods in terms of algorithm convergence and diversity. Particularly, LSCV-MCEA can efficiently detect disease signals for individual patients with BRCA cancer. The study results show that multi-objective optimization can solve structural network control principles effectively and offer a new perspective for understanding tumor heterogeneity in cancer precision medicine. △ Less

Submitted 23 June, 2023; originally announced June 2023.

Comments: 15 pages, 8 figures; This work has been submitted to IEEE Transactions on Evolutionary Computation

MSC Class: None ACM Class: H.4.2; D.2.7

arXiv:2304.08239 [pdf, other]

RF-GNN: Random Forest Boosted Graph Neural Network for Social Bot Detection

Authors: Shuhao Shi, Kai Qiao, Jie Yang, Baojie Song, Jian Chen, Bin Yan

Abstract: The presence of a large number of bots on social media leads to adverse effects. Although Random forest algorithm is widely used in bot detection and can significantly enhance the performance of weak classifiers, it cannot utilize the interaction between accounts. This paper proposes a Random Forest boosted Graph Neural Network for social bot detection, called RF-GNN, which employs graph neural ne… ▽ More The presence of a large number of bots on social media leads to adverse effects. Although Random forest algorithm is widely used in bot detection and can significantly enhance the performance of weak classifiers, it cannot utilize the interaction between accounts. This paper proposes a Random Forest boosted Graph Neural Network for social bot detection, called RF-GNN, which employs graph neural networks (GNNs) as the base classifiers to construct a random forest, effectively combining the advantages of ensemble learning and GNNs to improve the accuracy and robustness of the model. Specifically, different subgraphs are constructed as different training sets through node sampling, feature selection, and edge dropout. Then, GNN base classifiers are trained using various subgraphs, and the remaining features are used for training Fully Connected Netural Network (FCN). The outputs of GNN and FCN are aligned in each branch. Finally, the outputs of all branches are aggregated to produce the final result. Moreover, RF-GNN is compatible with various widely-used GNNs for node classification. Extensive experimental results demonstrate that the proposed method obtains better performance than other state-of-the-art methods. △ Less

Submitted 13 April, 2023; originally announced April 2023.

Comments: 24 pages, 8 figures

arXiv:2302.06900 [pdf, other]

Over-Sampling Strategy in Feature Space for Graphs based Class-imbalanced Bot Detection

Authors: Shuhao Shi, Kai Qiao, Jie Yang, Baojie Song, Jian Chen, Bin Yan

Abstract: The presence of a large number of bots in Online Social Networks (OSN) leads to undesirable social effects. Graph neural networks (GNNs) are effective in detecting bots as they utilize user interactions. However, class-imbalanced issues can affect bot detection performance. To address this, we propose an over-sampling strategy for GNNs (OS-GNN) that generates samples for the minority class without… ▽ More The presence of a large number of bots in Online Social Networks (OSN) leads to undesirable social effects. Graph neural networks (GNNs) are effective in detecting bots as they utilize user interactions. However, class-imbalanced issues can affect bot detection performance. To address this, we propose an over-sampling strategy for GNNs (OS-GNN) that generates samples for the minority class without edge synthesis. First, node features are mapped to a feature space through neighborhood aggregation. Then, we generate samples for the minority class in the feature space. Finally, the augmented features are used to train the classifiers. This framework is general and can be easily extended into different GNN architectures. The proposed framework is evaluated using three real-world bot detection benchmark datasets, and it consistently exhibits superiority over the baselines. △ Less

Submitted 10 September, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

Comments: 5 pages, 4 figures

arXiv:2301.01123 [pdf, other]

MGTAB: A Multi-Relational Graph-Based Twitter Account Detection Benchmark

Authors: Shuhao Shi, Kai Qiao, Jian Chen, Shuai Yang, Jie Yang, Baojie Song, Linyuan Wang, Bin Yan

Abstract: The development of social media user stance detection and bot detection methods rely heavily on large-scale and high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, suppressing graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benc… ▽ More The development of social media user stance detection and bot detection methods rely heavily on large-scale and high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, suppressing graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built based on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain and user tweet features as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when introducing multiple relations. By analyzing experiment results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB. △ Less

Submitted 13 March, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

Comments: 14 pages, 7 figures

arXiv:2205.03753 [pdf, other]

doi 10.1007/s10489-023-05110-5

Select and Calibrate the Low-confidence: Dual-Channel Consistency based Graph Convolutional Networks

Authors: Shuhao Shi, Jian Chen, Kai Qiao, Shuai Yang, Linyuan Wang, Bin Yan

Abstract: The Graph Convolutional Networks (GCNs) have achieved excellent results in node classification tasks, but the model's performance at low label rates is still unsatisfactory. Previous studies in Semi-Supervised Learning (SSL) for graph have focused on using network predictions to generate soft pseudo-labels or instructing message propagation, which inevitably contains the incorrect prediction due t… ▽ More The Graph Convolutional Networks (GCNs) have achieved excellent results in node classification tasks, but the model's performance at low label rates is still unsatisfactory. Previous studies in Semi-Supervised Learning (SSL) for graph have focused on using network predictions to generate soft pseudo-labels or instructing message propagation, which inevitably contains the incorrect prediction due to the over-confident in the predictions. Our proposed Dual-Channel Consistency based Graph Convolutional Networks (DCC-GCN) uses dual-channel to extract embeddings from node features and topological structures, and then achieves reliable low-confidence and high-confidence samples selection based on dual-channel consistency. We further confirmed that the low-confidence samples obtained based on dual-channel consistency were low in accuracy, constraining the model's performance. Unlike previous studies ignoring low-confidence samples, we calibrate the feature embeddings of the low-confidence samples by using the neighborhood's high-confidence samples. Our experiments have shown that the DCC-GCN can more accurately distinguish between low-confidence and high-confidence samples, and can also significantly improve the accuracy of low-confidence samples. We conducted extensive experiments on the benchmark datasets and demonstrated that DCC-GCN is significantly better than state-of-the-art baselines at different label rates. △ Less

Submitted 7 May, 2022; originally announced May 2022.

Comments: 25 pages, 7 figures. Submitted to neucom

Journal ref: Applied Intelligence 2023

arXiv:2109.14159 [pdf, other]

doi 10.1007/s11063-022-11064-5

Adaptive Multi-layer Contrastive Graph Neural Networks

Authors: Shuhao Shi, Pengfei Xie, Xu Luo, Kai Qiao, Linyuan Wang, Jian Chen, Bin Yan

Abstract: We present Adaptive Multi-layer Contrastive Graph Neural Networks (AMC-GNN), a self-supervised learning framework for Graph Neural Network, which learns feature representations of sample data without data labels. AMC-GNN generates two graph views by data augmentation and compares different layers' output embeddings of Graph Neural Network encoders to obtain feature representations, which could be… ▽ More We present Adaptive Multi-layer Contrastive Graph Neural Networks (AMC-GNN), a self-supervised learning framework for Graph Neural Network, which learns feature representations of sample data without data labels. AMC-GNN generates two graph views by data augmentation and compares different layers' output embeddings of Graph Neural Network encoders to obtain feature representations, which could be used for downstream tasks. AMC-GNN could learn the importance weights of embeddings in different layers adaptively through the attention mechanism, and an auxiliary encoder is introduced to train graph contrastive encoders better. The accuracy is improved by maximizing the representation's consistency of positive pairs in the early layers and the final embedding space. Our experiments show that the results can be consistently improved by using the AMC-GNN framework, across four established graph benchmarks: Cora, Citeseer, Pubmed, DBLP citation network datasets, as well as four newly proposed datasets: Co-author-CS, Co-author-Physics, Amazon-Computers, Amazon-Photo. △ Less

Submitted 28 September, 2021; originally announced September 2021.

Comments: 16 pages,7 figures

Journal ref: Neural Processing Letters 55 (2021): 4757 - 4776

arXiv:2106.13984 [pdf, other]

ShapeEditer: a StyleGAN Encoder for Face Swapping

Authors: Shuai Yang, Kai Qiao

Abstract: In this paper, we propose a novel encoder, called ShapeEditor, for high-resolution, realistic and high-fidelity face exchange. First of all, in order to ensure sufficient clarity and authenticity, our key idea is to use an advanced pretrained high-quality random face image generator, i.e. StyleGAN, as backbone. Secondly, we design ShapeEditor, a two-step encoder, to make the swapped face integrate… ▽ More In this paper, we propose a novel encoder, called ShapeEditor, for high-resolution, realistic and high-fidelity face exchange. First of all, in order to ensure sufficient clarity and authenticity, our key idea is to use an advanced pretrained high-quality random face image generator, i.e. StyleGAN, as backbone. Secondly, we design ShapeEditor, a two-step encoder, to make the swapped face integrate the identity and attribute of the input faces. In the first step, we extract the identity vector of the source image and the attribute vector of the target image respectively; in the second step, we map the concatenation of identity vector and attribute vector into the $\mathcal{W+}$ potential space. In addition, for learning to map into the latent space of StyleGAN, we propose a set of self-supervised loss functions with which the training data do not need to be labeled manually. Extensive experiments on the test dataset show that the results of our method not only have a great advantage in clarity and authenticity than other state-of-the-art methods, but also reflect the sufficient integration of identity and attribute. △ Less

Submitted 26 June, 2021; originally announced June 2021.

Comments: 13 pages, 3 figures

arXiv:2106.01617 [pdf, other]

Improving the Transferability of Adversarial Examples with New Iteration Framework and Input Dropout

Authors: Pengfei Xie, Linyuan Wang, Ruoxi Qin, Kai Qiao, Shuhao Shi, Guoen Hu, Bin Yan

Abstract: Deep neural networks(DNNs) is vulnerable to be attacked by adversarial examples. Black-box attack is the most threatening attack. At present, black-box attack methods mainly adopt gradient-based iterative attack methods, which usually limit the relationship between the iteration step size, the number of iterations, and the maximum perturbation. In this paper, we propose a new gradient iteration fr… ▽ More Deep neural networks(DNNs) is vulnerable to be attacked by adversarial examples. Black-box attack is the most threatening attack. At present, black-box attack methods mainly adopt gradient-based iterative attack methods, which usually limit the relationship between the iteration step size, the number of iterations, and the maximum perturbation. In this paper, we propose a new gradient iteration framework, which redefines the relationship between the above three. Under this framework, we easily improve the attack success rate of DI-TI-MIM. In addition, we propose a gradient iterative attack method based on input dropout, which can be well combined with our framework. We further propose a multi dropout rate version of this method. Experimental results show that our best method can achieve attack success rate of 96.2\% for defense model on average, which is higher than the state-of-the-art gradient-based attacks. △ Less

Submitted 22 June, 2021; v1 submitted 3 June, 2021; originally announced June 2021.

arXiv:2105.11625 [pdf, other]

doi 10.3389/fnbot.2021.775688

Boosting-GNN: Boosting Algorithm for Graph Networks on Imbalanced Node Classification

Authors: S. Shi, Kai Qiao, Shuai Yang, L. Wang, J. Chen, Bin Yan

Abstract: The Graph Neural Network (GNN) has been widely used for graph data representation. However, the existing researches only consider the ideal balanced dataset, and the imbalanced dataset is rarely considered. Traditional methods such as resampling, reweighting, and synthetic samples that deal with imbalanced datasets are no longer applicable in GNN. This paper proposes an ensemble model called Boost… ▽ More The Graph Neural Network (GNN) has been widely used for graph data representation. However, the existing researches only consider the ideal balanced dataset, and the imbalanced dataset is rarely considered. Traditional methods such as resampling, reweighting, and synthetic samples that deal with imbalanced datasets are no longer applicable in GNN. This paper proposes an ensemble model called Boosting-GNN, which uses GNNs as the base classifiers during boosting. In Boosting-GNN, higher weights are set for the training samples that are not correctly classified by the previous classifier, thus achieving higher classification accuracy and better reliability. Besides, transfer learning is used to reduce computational cost and increase fitting ability. Experimental results indicate that the proposed Boosting-GNN model achieves better performance than GCN, GraphSAGE, GAT, SGC, N-GCN, and most advanced reweighting and resampling methods on synthetic imbalanced datasets, with an average performance improvement of 4.5% △ Less

Submitted 7 May, 2022; v1 submitted 24 May, 2021; originally announced May 2021.

Comments: 19 pages, 6 figures published on Frontiers in Neurorobotics

Journal ref: Frontiers in Neurorobotics, 15 (2021)

arXiv:2010.11535 [pdf]

Defense-guided Transferable Adversarial Attacks

Authors: Zifei Zhang, Kai Qiao, Jian Chen, Ningning Liang

Abstract: Though deep neural networks perform challenging tasks excellently, they are susceptible to adversarial examples, which mislead classifiers by applying human-imperceptible perturbations on clean inputs. Under the query-free black-box scenario, adversarial examples are hard to transfer to unknown models, and several methods have been proposed with the low transferability. To settle such issue, we de… ▽ More Though deep neural networks perform challenging tasks excellently, they are susceptible to adversarial examples, which mislead classifiers by applying human-imperceptible perturbations on clean inputs. Under the query-free black-box scenario, adversarial examples are hard to transfer to unknown models, and several methods have been proposed with the low transferability. To settle such issue, we design a max-min framework inspired by input transformations, which are benificial to both the adversarial attack and defense. Explicitly, we decrease loss values with inputs' affline transformations as a defense in the minimum procedure, and then increase loss values with the momentum iterative algorithm as an attack in the maximum procedure. To further promote transferability, we determine transformed values with the max-min theory. Extensive experiments on Imagenet demonstrate that our defense-guided transferable attacks achieve impressive increase on transferability. Experimentally, we show that our ASR of adversarial attack reaches to 58.38% on average, which outperforms the state-of-the-art method by 12.1% on the normally trained models and by 11.13% on the adversarially trained models. Additionally, we provide elucidative insights on the improvement of transferability, and our method is expected to be a benchmark for assessing the robustness of deep models. △ Less

Submitted 3 November, 2020; v1 submitted 22 October, 2020; originally announced October 2020.

arXiv:2003.11797 [pdf]

Neural encoding and interpretation for high-level visual cortices based on fMRI using image caption features

Authors: Kai Qiao, Chi Zhang, Jian Chen, Linyuan Wang, Li Tong, Bin Yan

Abstract: On basis of functional magnetic resonance imaging (fMRI), researchers are devoted to designing visual encoding models to predict the neuron activity of human in response to presented image stimuli and analyze inner mechanism of human visual cortices. Deep network structure composed of hierarchical processing layers forms deep network models by learning features of data on specific task through big… ▽ More On basis of functional magnetic resonance imaging (fMRI), researchers are devoted to designing visual encoding models to predict the neuron activity of human in response to presented image stimuli and analyze inner mechanism of human visual cortices. Deep network structure composed of hierarchical processing layers forms deep network models by learning features of data on specific task through big dataset. Deep network models have powerful and hierarchical representation of data, and have brought about breakthroughs for visual encoding, while revealing hierarchical structural similarity with the manner of information processing in human visual cortices. However, previous studies almost used image features of those deep network models pre-trained on classification task to construct visual encoding models. Except for deep network structure, the task or corresponding big dataset is also important for deep network models, but neglected by previous studies. Because image classification is a relatively fundamental task, it is difficult to guide deep network models to master high-level semantic representations of data, which causes into that encoding performance for high-level visual cortices is limited. In this study, we introduced one higher-level vision task: image caption (IC) task and proposed the visual encoding model based on IC features (ICFVEM) to encode voxels of high-level visual cortices. Experiment demonstrated that ICFVEM obtained better encoding performance than previous deep network models pre-trained on classification task. In addition, the interpretation of voxels was realized to explore the detailed characteristics of voxels based on the visualization of semantic words, and comparative analysis implied that high-level visual cortices behaved the correlative representation of image content. △ Less

Submitted 26 March, 2020; originally announced March 2020.

arXiv:2003.06105 [pdf]

BigGAN-based Bayesian reconstruction of natural images from human brain activity

Authors: Kai Qiao, Jian Chen, Linyuan Wang, Chi Zhang, Li Tong, Bin Yan

Abstract: In the visual decoding domain, visually reconstructing presented images given the corresponding human brain activity monitored by functional magnetic resonance imaging (fMRI) is difficult, especially when reconstructing viewed natural images. Visual reconstruction is a conditional image generation on fMRI data and thus generative adversarial network (GAN) for natural image generation is recently i… ▽ More In the visual decoding domain, visually reconstructing presented images given the corresponding human brain activity monitored by functional magnetic resonance imaging (fMRI) is difficult, especially when reconstructing viewed natural images. Visual reconstruction is a conditional image generation on fMRI data and thus generative adversarial network (GAN) for natural image generation is recently introduced for this task. Although GAN-based methods have greatly improved, the fidelity and naturalness of reconstruction are still unsatisfactory due to the small number of fMRI data samples and the instability of GAN training. In this study, we proposed a new GAN-based Bayesian visual reconstruction method (GAN-BVRM) that includes a classifier to decode categories from fMRI data, a pre-trained conditional generator to generate natural images of specified categories, and a set of encoding models and evaluator to evaluate generated images. GAN-BVRM employs the pre-trained generator of the prevailing BigGAN to generate masses of natural images, and selects the images that best matches with the corresponding brain activity through the encoding models as the reconstruction of the image stimuli. In this process, the semantic and detailed contents of reconstruction are controlled by decoded categories and encoding models, respectively. GAN-BVRM used the Bayesian manner to avoid contradiction between naturalness and fidelity from current GAN-based methods and thus can improve the advantages of GAN. Experimental results revealed that GAN-BVRM improves the fidelity and naturalness, that is, the reconstruction is natural and similar to the presented image stimuli. △ Less

Submitted 13 March, 2020; originally announced March 2020.

Comments: brain decoding; visual reconstruction; generative adversarial network (GAN); Bayesian framework. under review in Neuroscience of Elsevier

arXiv:2002.00179 [pdf]

AdvJND: Generating Adversarial Examples with Just Noticeable Difference

Authors: Zifei Zhang, Kai Qiao, Lingyun Jiang, Linyuan Wang, Bin Yan

Abstract: Compared with traditional machine learning models, deep neural networks perform better, especially in image classification tasks. However, they are vulnerable to adversarial examples. Adding small perturbations on examples causes a good-performance model to misclassify the crafted examples, without category differences in the human eyes, and fools deep models successfully. There are two requiremen… ▽ More Compared with traditional machine learning models, deep neural networks perform better, especially in image classification tasks. However, they are vulnerable to adversarial examples. Adding small perturbations on examples causes a good-performance model to misclassify the crafted examples, without category differences in the human eyes, and fools deep models successfully. There are two requirements for generating adversarial examples: the attack success rate and image fidelity metrics. Generally, perturbations are increased to ensure the adversarial examples' high attack success rate; however, the adversarial examples obtained have poor concealment. To alleviate the tradeoff between the attack success rate and image fidelity, we propose a method named AdvJND, adding visual model coefficients, just noticeable difference coefficients, in the constraint of a distortion function when generating adversarial examples. In fact, the visual subjective feeling of the human eyes is added as a priori information, which decides the distribution of perturbations, to improve the image quality of adversarial examples. We tested our method on the FashionMNIST, CIFAR10, and MiniImageNet datasets. Adversarial examples generated by our AdvJND algorithm yield gradient distributions that are similar to those of the original inputs. Hence, the crafted noise can be hidden in the original inputs, thus improving the attack concealment significantly. △ Less

Submitted 23 June, 2020; v1 submitted 1 February, 2020; originally announced February 2020.

arXiv:1909.07558 [pdf]

HAD-GAN: A Human-perception Auxiliary Defense GAN to Defend Adversarial Examples

Authors: Wanting Yu, Hongyi Yu, Lingyun Jiang, Mengli Zhang, Kai Qiao

Abstract: Adversarial examples reveal the vulnerability and unexplained nature of neural networks. Studying the defense of adversarial examples is of considerable practical importance. Most adversarial examples that misclassify networks are often undetectable by humans. In this paper, we propose a defense model to train the classifier into a human-perception classification model with shape preference. The p… ▽ More Adversarial examples reveal the vulnerability and unexplained nature of neural networks. Studying the defense of adversarial examples is of considerable practical importance. Most adversarial examples that misclassify networks are often undetectable by humans. In this paper, we propose a defense model to train the classifier into a human-perception classification model with shape preference. The proposed model comprising a texture transfer network (TTN) and an auxiliary defense generative adversarial networks (GAN) is called Human-perception Auxiliary Defense GAN (HAD-GAN). The TTN is used to extend the texture samples of a clean image and helps classifiers focus on its shape. GAN is utilized to form a training framework for the model and generate the necessary images. A series of experiments conducted on MNIST, Fashion-MNIST and CIFAR10 show that the proposed model outperforms the state-of-the-art defense methods for network robustness. The model also demonstrates a significant improvement on defense capability of adversarial examples. △ Less

Submitted 22 December, 2021; v1 submitted 16 September, 2019; originally announced September 2019.

arXiv:1907.11885 [pdf]

Effective and efficient ROI-wise visual encoding using an end-to-end CNN regression model and selective optimization

Authors: Kai Qiao, Chi Zhang, Jian Chen, Linyuan Wang, Li Tong, Bin Yan

Abstract: Recently, visual encoding based on functional magnetic resonance imaging (fMRI) have realized many achievements with the rapid development of deep network computation. Visual encoding model is aimed at predicting brain activity in response to presented image stimuli. Currently, visual encoding is accomplished mainly by firstly extracting image features through convolutional neural network (CNN) mo… ▽ More Recently, visual encoding based on functional magnetic resonance imaging (fMRI) have realized many achievements with the rapid development of deep network computation. Visual encoding model is aimed at predicting brain activity in response to presented image stimuli. Currently, visual encoding is accomplished mainly by firstly extracting image features through convolutional neural network (CNN) model pre-trained on computer vision task, and secondly training a linear regression model to map specific layer of CNN features to each voxel, namely voxel-wise encoding. However, the two-step manner model, essentially, is hard to determine which kind of well features are well linearly matched for beforehand unknown fMRI data with little understanding of human visual representation. Analogizing computer vision mostly related human vision, we proposed the end-to-end convolution regression model (ETECRM) in the region of interest (ROI)-wise manner to accomplish effective and efficient visual encoding. The end-to-end manner was introduced to make the model automatically learn better matching features to improve encoding performance. The ROI-wise manner was used to improve the encoding efficiency for many voxels. In addition, we designed the selective optimization including self-adapting weight learning and weighted correlation loss, noise regularization to avoid interfering of ineffective voxels in ROI-wise encoding. Experiment demonstrated that the proposed model obtained better predicting accuracy than the two-step manner of encoding models. Comparative analysis implied that end-to-end manner and large volume of fMRI data may drive the future development of visual encoding. △ Less

Submitted 27 July, 2019; originally announced July 2019.

Comments: under review in Computational Intelligence and Neuroscience

arXiv:1904.06026 [pdf]

Cycle-Consistent Adversarial GAN: the integration of adversarial attack and defense

Authors: Lingyun Jiang, Kai Qiao, Ruoxi Qin, Linyuan Wang, Jian Chen, Haibing Bu, Bin Yan

Abstract: In image classification of deep learning, adversarial examples where inputs intended to add small magnitude perturbations may mislead deep neural networks (DNNs) to incorrect results, which means DNNs are vulnerable to them. Different attack and defense strategies have been proposed to better research the mechanism of deep learning. However, those research in these networks are only for one aspect… ▽ More In image classification of deep learning, adversarial examples where inputs intended to add small magnitude perturbations may mislead deep neural networks (DNNs) to incorrect results, which means DNNs are vulnerable to them. Different attack and defense strategies have been proposed to better research the mechanism of deep learning. However, those research in these networks are only for one aspect, either an attack or a defense, not considering that attacks and defenses should be interdependent and mutually reinforcing, just like the relationship between spears and shields. In this paper, we propose Cycle-Consistent Adversarial GAN (CycleAdvGAN) to generate adversarial examples, which can learn and approximate the distribution of original instances and adversarial examples. For CycleAdvGAN, once the Generator and are trained, can generate adversarial perturbations efficiently for any instance, so as to make DNNs predict wrong, and recovery adversarial examples to clean instances, so as to make DNNs predict correct. We apply CycleAdvGAN under semi-white box and black-box settings on two public datasets MNIST and CIFAR10. Using the extensive experiments, we show that our method has achieved the state-of-the-art adversarial attack method and also efficiently improve the defense ability, which make the integration of adversarial attack and defense come true. In additional, it has improved attack effect only trained on the adversarial dataset generated by any kind of adversarial attack. △ Less

Submitted 12 April, 2019; originally announced April 2019.

Comments: 13 pages,7 tables, 1 figure

arXiv:1902.08793 [pdf]

A visual encoding model based on deep neural networks and transfer learning

Authors: Chi Zhang, Kai Qiao, Linyuan Wang, Li Tong, Guoen Hu, Ruyuan Zhang, Bin Yan

Abstract: Background: Building visual encoding models to accurately predict visual responses is a central challenge for current vision-based brain-machine interface techniques. To achieve high prediction accuracy on neural signals, visual encoding models should include precise visual features and appropriate prediction algorithms. Most existing visual encoding models employ hand-craft visual features (e.g.,… ▽ More Background: Building visual encoding models to accurately predict visual responses is a central challenge for current vision-based brain-machine interface techniques. To achieve high prediction accuracy on neural signals, visual encoding models should include precise visual features and appropriate prediction algorithms. Most existing visual encoding models employ hand-craft visual features (e.g., Gabor wavelets or semantic labels) or data-driven features (e.g., features extracted from deep neural networks (DNN)). They also assume a linear mapping between feature representation to brain activity. However, it remains unknown whether such linear mapping is sufficient for maximizing prediction accuracy. New Method: We construct a new visual encoding framework to predict cortical responses in a benchmark functional magnetic resonance imaging (fMRI) dataset. In this framework, we employ the transfer learning technique to incorporate a pre-trained DNN (i.e., AlexNet) and train a nonlinear mapping from visual features to brain activity. This nonlinear mapping replaces the conventional linear mapping and is supposed to improve prediction accuracy on brain activity. Results: The proposed framework can significantly predict responses of over 20% voxels in early visual areas (i.e., V1-lateral occipital region, LO) and achieve unprecedented prediction accuracy. Comparison with Existing Methods: Comparing to two conventional visual encoding models, we find that the proposed encoding model shows consistent higher prediction accuracy in all early visual areas, especially in relatively anterior visual areas (i.e., V4 and LO). Conclusions: Our work proposes a new framework to utilize pre-trained visual features and train non-linear mappings from visual features to brain activity. △ Less

Submitted 23 February, 2019; originally announced February 2019.

arXiv:1811.02172 [pdf, other]

Neural Phrase-to-Phrase Machine Translation

Authors: Jiangtao Feng, Lingpeng Kong, Po-Sen Huang, Chong Wang, Da Huang, Jiayuan Mao, Kan Qiao, Dengyong Zhou

Abstract: In this paper, we propose Neural Phrase-to-Phrase Machine Translation (NP$^2$MT). Our model uses a phrase attention mechanism to discover relevant input (source) segments that are used by a decoder to generate output (target) phrases. We also design an efficient dynamic programming algorithm to decode segments that allows the model to be trained faster than the existing neural phrase-based machine… ▽ More In this paper, we propose Neural Phrase-to-Phrase Machine Translation (NP$^2$MT). Our model uses a phrase attention mechanism to discover relevant input (source) segments that are used by a decoder to generate output (target) phrases. We also design an efficient dynamic programming algorithm to decode segments that allows the model to be trained faster than the existing neural phrase-based machine translation method by Huang et al. (2018). Furthermore, our method can naturally integrate with external phrase dictionaries during decoding. Empirical experiments show that our method achieves comparable performance with the state-of-the art methods on benchmark datasets. However, when the training and testing data are from different distributions or domains, our method performs better. △ Less

Submitted 6 November, 2018; originally announced November 2018.

arXiv:1801.05151 [pdf]

Constraint-free Natural Image Reconstruction from fMRI Signals Based on Convolutional Neural Network

Authors: Chi Zhang, Kai Qiao, Linyuan Wang, Li Tong, Ying Zeng, Bin Yan

Abstract: In recent years, research on decoding brain activity based on functional magnetic resonance imaging (fMRI) has made remarkable achievements. However, constraint-free natural image reconstruction from brain activity is still a challenge. The existing methods simplified the problem by using semantic prior information or just reconstructing simple images such as letters and digitals. Without semantic… ▽ More In recent years, research on decoding brain activity based on functional magnetic resonance imaging (fMRI) has made remarkable achievements. However, constraint-free natural image reconstruction from brain activity is still a challenge. The existing methods simplified the problem by using semantic prior information or just reconstructing simple images such as letters and digitals. Without semantic prior information, we present a novel method to reconstruct nature images from fMRI signals of human visual cortex based on the computation model of convolutional neural network (CNN). Firstly, we extracted the units output of viewed natural images in each layer of a pre-trained CNN as CNN features. Secondly, we transformed image reconstruction from fMRI signals into the problem of CNN feature visualizations by training a sparse linear regression to map from the fMRI patterns to CNN features. By iteratively optimization to find the matched image, whose CNN unit features become most similar to those predicted from the brain activity, we finally achieved the promising results for the challenging constraint-free natural image reconstruction. As there was no use of semantic prior information of the stimuli when training decoding model, any category of images (not constraint by the training set) could be reconstructed theoretically. We found that the reconstructed images resembled the natural stimuli, especially in position and shape. The experimental results suggest that hierarchical visual features can effectively express the visual perception process of human brain. △ Less

Submitted 16 January, 2018; originally announced January 2018.

arXiv:1801.00602 [pdf]

Accurate reconstruction of image stimuli from human fMRI based on the decoding model with capsule network architecture

Authors: Kai Qiao, Chi Zhang, Linyuan Wang, Bin Yan, Jian Chen, Lei Zeng, Li Tong

Abstract: In neuroscience, all kinds of computation models were designed to answer the open question of how sensory stimuli are encoded by neurons and conversely, how sensory stimuli can be decoded from neuronal activities. Especially, functional Magnetic Resonance Imaging (fMRI) studies have made many great achievements with the rapid development of the deep network computation. However, comparing with the… ▽ More In neuroscience, all kinds of computation models were designed to answer the open question of how sensory stimuli are encoded by neurons and conversely, how sensory stimuli can be decoded from neuronal activities. Especially, functional Magnetic Resonance Imaging (fMRI) studies have made many great achievements with the rapid development of the deep network computation. However, comparing with the goal of decoding orientation, position and object category from activities in visual cortex, accurate reconstruction of image stimuli from human fMRI is a still challenging work. In this paper, the capsule network (CapsNet) architecture based visual reconstruction (CNAVR) method is developed to reconstruct image stimuli. The capsule means containing a group of neurons to perform the better organization of feature structure and representation, inspired by the structure of cortical mini column including several hundred neurons in primates. The high-level capsule features in the CapsNet includes diverse features of image stimuli such as semantic class, orientation, location and so on. We used these features to bridge between human fMRI and image stimuli. We firstly employed the CapsNet to train the nonlinear mapping from image stimuli to high-level capsule features, and from high-level capsule features to image stimuli again in an end-to-end manner. After estimating the serviceability of each voxel by encoding performance to accomplish the selecting of voxels, we secondly trained the nonlinear mapping from dimension-decreasing fMRI data to high-level capsule features. Finally, we can predict the high-level capsule features with fMRI data, and reconstruct image stimuli with the CapsNet. We evaluated the proposed CNAVR method on the dataset of handwritten digital images, and exceeded about 10% than the accuracy of all existing state-of-the-art methods on the structural similarity index (SSIM). △ Less

Submitted 2 January, 2018; originally announced January 2018.

arXiv:1607.08707 [pdf]

Image Prediction for Limited-angle Tomography via Deep Learning with Convolutional Neural Network

Authors: Hanming Zhang, Liang Li, Kai Qiao, Linyuan Wang, Bin Yan, Lei Li, Guoen Hu

Abstract: Limited angle problem is a challenging issue in x-ray computed tomography (CT) field. Iterative reconstruction methods that utilize the additional prior can suppress artifacts and improve image quality, but unfortunately require increased computation time. An interesting way is to restrain the artifacts in the images reconstructed from the practical filtered back projection (FBP) method. Frikel an… ▽ More Limited angle problem is a challenging issue in x-ray computed tomography (CT) field. Iterative reconstruction methods that utilize the additional prior can suppress artifacts and improve image quality, but unfortunately require increased computation time. An interesting way is to restrain the artifacts in the images reconstructed from the practical filtered back projection (FBP) method. Frikel and Quinto have proved that the streak artifacts in FBP results could be characterized. It indicates that the artifacts created by FBP method have specific and similar characteristics in a stationary limited-angle scanning configuration. Based on this understanding, this work aims at developing a method to extract and suppress specific artifacts of FBP reconstructions for limited-angle tomography. A data-driven learning-based method is proposed based on a deep convolutional neural network. An end-to-end mapping between the FBP and artifact-free images is learned and the implicit features involving artifacts will be extracted and suppressed via nonlinear mapping. The qualitative and quantitative evaluations of experimental results indicate that the proposed method show a stable and prospective performance on artifacts reduction and detail recovery for limited angle tomography. The presented strategy provides a simple and efficient approach for improving image quality of the reconstruction results from limited projection data. △ Less

Submitted 29 July, 2016; originally announced July 2016.

Showing 1–24 of 24 results for author: Qiao, K