
Noisy Node Classification by Bi-level Optimization based Multi-teacher Distillation

Authors: Yujing Liu, Zongqian Wu, Zhengyu Lu, Ci Nie, Guoqiu Wen, Ping Hu, Xiaofeng Zhu

Abstract: Previous graph neural networks (GNNs) usually assume that the graph data comes with clean labels for representation learning, but this is not true in real applications. In this paper, we propose a new multi-teacher distillation method based on bi-level optimization (namely BO-NNC) to conduct noisy node classification on graph data. Specifically, we first employ multiple self-supervised learning methods to train diverse teacher models, and then aggregate their predictions through a teacher weight matrix. Furthermore, we design a new bi-level optimization strategy to dynamically adjust the teacher weight matrix based on the training progress of the student model. Finally, we design a label improvement module to improve the label quality. Extensive experimental results on real datasets show that our method achieves the best results compared to state-of-the-art methods.

Submitted 8 May, 2024; v1 submitted 27 April, 2024; originally announced April 2024.

arXiv:2404.11354 [pdf, other]

Distributed Fractional Bayesian Learning for Adaptive Optimization

Authors: Yaqun Yang, Jinlong Lei, Guanghui Wen, Yiguang Hong

Abstract: This paper considers a distributed adaptive optimization problem, where all agents only have access to their local cost functions with a common unknown parameter, whereas they mean to collaboratively estimate the true parameter and find the optimal solution over a connected network. A general mathematical framework for such a problem has not been studied yet. We aim to provide valuable insights for addressing parameter uncertainty in distributed optimization problems and simultaneously find the optimal solution. Thus, we propose a novel Prediction while Optimization scheme, which utilizes distributed fractional Bayesian learning through weighted averaging on the log-beliefs to update the beliefs of unknown parameters, and distributed gradient descent for renewing the estimation of the optimal solution. Then under suitable assumptions, we prove that all agents' beliefs and decision variables converge almost surely to the true parameter and the optimal solution under the true parameter, respectively. We further establish a sublinear convergence rate for the belief sequence. Finally, numerical experiments are implemented to corroborate the theoretical analysis.

Submitted 17 April, 2024; originally announced April 2024.

Comments: 16 pages, 6 figures

arXiv:2401.00283 [pdf, other]

Near-Space Communications: the Last Piece of 6G Space-Air-Ground-Sea Integrated Network Puzzle

Authors: Hongshan Liu, Tong Qin, Zhen Gao, Tianqi Mao, Keke Ying, Ziwei Wan, Li Qiao, Rui Na, Zhongxiang Li, Chun Hu, Yikun Mei, Tuan Li, Guanghui Wen, Lei Chen, Zhonghuai Wu, Ruiqi Liu, Gaojie Chen, Shuo Wang, Dezhi Zheng

Abstract: This article presents a comprehensive study on the emerging near-space communications (NS-COM) within the context of space-air-ground-sea integrated network (SAGSIN). Specifically, we firstly explore the recent technical developments of NS-COM, followed by the discussions about motivations behind integrating NS-COM into SAGSIN. To further demonstrate the necessity of NS-COM, a comparative analysis between the NS-COM network and other counterparts in SAGSIN is conducted, covering aspects of deployment, coverage, channel characteristics and unique problems of NS-COM network. Afterwards, the technical aspects of NS-COM, including channel modeling, random access, channel estimation, array-based beam management and joint network optimization, are examined in detail. Furthermore, we explore the potential applications of NS-COM, such as structural expansion in SAGSIN communication, civil aviation communication, remote and urgent communication, weather monitoring and carbon neutrality. Finally, some promising research avenues are identified, including stratospheric satellite (StratoSat) -to-ground direct links for mobile terminals, reconfigurable multiple-input multiple-output (MIMO) and holographic MIMO, federated learning in NS-COM networks, maritime communication, electromagnetic spectrum sensing and adversarial game, integrated sensing and communications, StratoSat-based radar detection and imaging, NS-COM assisted enhanced global navigation system, NS-COM assisted intelligent unmanned system and free space optical (FSO) communication. Overall, this paper highlights that the NS-COM plays an indispensable role in the SAGSIN puzzle, providing substantial performance and coverage enhancement to the traditional SAGSIN architecture.

Submitted 4 March, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

Comments: 28 pages, 8 figures, 2 tables

arXiv:2312.10920 [pdf, other]

Domain adaption and physical constrains transfer learning for shale gas production

Authors: Zhaozhong Yang, Liangjie Gou, Chao Min, Duo Yi, Xiaogang Li, Guoquan Wen

Abstract: Effective prediction of shale gas production is crucial for strategic reservoir development. However, in new shale gas blocks, two main challenges are encountered: (1) the occurrence of negative transfer due to insufficient data, and (2) the limited interpretability of deep learning (DL) models. To tackle these problems, we propose a novel transfer learning methodology that utilizes domain adaptation and physical constraints. This methodology effectively employs historical data from the source domain to reduce negative transfer from the data distribution perspective, while also using physical constraints to build a robust and reliable prediction model that integrates various types of data. The methodology starts by dividing the production data from the source domain into multiple subdomains, thereby enhancing data diversity. It then uses Maximum Mean Discrepancy (MMD) and global average distance measures to decide on the feasibility of transfer. Through domain adaptation, we integrate all transferable knowledge, resulting in a more comprehensive target model. Lastly, by incorporating drilling, completion, and geological data as physical constraints, we develop a hybrid model. This model, a combination of a multi-layer perceptron (MLP) and a Transformer (Transformer-MLP), is designed to maximize interpretability. Experimental validation in China's southwestern region confirms the method's effectiveness.

Submitted 17 December, 2023; originally announced December 2023.

arXiv:2312.06255 [pdf, ps, other]

Ensemble Interpretation: A Unified Method for Interpretable Machine Learning

Authors: Chao Min, Guoyong Liao, Guoquan Wen, Yingjun Li, Xing Guo

Abstract: To address the issues of stability and fidelity in interpretable learning, a novel interpretable methodology, ensemble interpretation, is presented in this paper which integrates multi-perspective explanation of various interpretation methods. On one hand, we define a unified paradigm to describe the common mechanism of different interpretation methods, and then integrate the multiple interpretation results to achieve more stable explanation. On the other hand, a supervised evaluation method based on prior knowledge is proposed to evaluate the explaining performance of an interpretation method. The experiment results show that the ensemble interpretation is more stable and more consistent with human experience and cognition. As an application, we use the ensemble interpretation for feature selection, and then the generalization performance of the corresponding learning model is significantly improved.

Submitted 11 December, 2023; originally announced December 2023.

arXiv:2310.00033 [pdf]

OriWheelBot: An origami-wheeled robot

Authors: Jie Liu, Zufeng Pang, Zhiyong Li, Guilin Wen, Zhoucheng Su, Junfeng He, Kaiyue Liu, Dezheng Jiang, Zenan Li, Shouyan Chen, Yang Tian, Yi Min Xie, Zhenpei Wang, Zhuangjian Liu

Abstract: Origami-inspired robots with multiple advantages, such as being lightweight, requiring less assembly, and exhibiting exceptional deformability, have received substantial and sustained attention. However, the existing origami-inspired robots are usually of limited functionalities and developing feature-rich robots is very challenging. Here, we report an origami-wheeled robot (OriWheelBot) with variable width and outstanding sand walking versatility. The OriWheelBot's ability to adjust wheel width over obstacles is achieved by origami wheels made of Miura origami. An improved version, called iOriWheelBot, is also developed to automatically judge the width of the obstacles. Three actions, namely direct pass, variable width pass, and direct return, will be carried out depending on the width of the channel between the obstacles. We have identified two motion mechanisms, i.e., sand-digging and sand-pushing, with the latter being more conducive to walking on the sand. We have systematically examined numerous sand walking characteristics, including carrying loads, climbing a slope, walking on a slope, and navigating sand pits, small rocks, and sand traps. The OriWheelBot can change its width by 40%, has a loading-carrying ratio of 66.7% on flat sand and can climb a 17-degree sand incline. The OriWheelBot can be useful for planetary subsurface exploration and disaster area rescue.

Submitted 29 September, 2023; originally announced October 2023.

Comments: 23 pages, 7 figures

arXiv:2307.02126 [pdf, other]

Robust Graph Structure Learning with the Alignment of Features and Adjacency Matrix

Authors: Shaogao Lv, Gang Wen, Shiyu Liu, Linsen Wei, Ming Li

Abstract: To improve the robustness of graph neural networks (GNN), graph structure learning (GSL) has attracted great interest due to the pervasiveness of noise in graph data. Many approaches have been proposed for GSL to jointly learn a clean graph structure and corresponding representations. To extend the previous work, this paper proposes a novel regularized GSL approach, particularly with an alignment of feature information and graph information, which is motivated mainly by our derived lower bound of node-level Rademacher complexity for GNNs. Additionally, our proposed approach incorporates sparse dimensional reduction to leverage low-dimensional node features that are relevant to the graph structure. To evaluate the effectiveness of our approach, we conduct experiments on real-world graphs. The results demonstrate that our proposed GSL method outperforms several competitive baselines, especially in scenarios where the graph structures are heavily affected by noise. Overall, our research highlights the importance of integrating feature and graph information alignment in GSL, as inspired by our derived theoretical result, and showcases the superiority of our approach in handling noisy graph structures through comprehensive experiments on real-world datasets.

Submitted 5 July, 2023; originally announced July 2023.

arXiv:2306.09648 [pdf, other]

Learning CO$_2$ plume migration in faulted reservoirs with Graph Neural Networks

Authors: Xin Ju, François P. Hamon, Gege Wen, Rayan Kanfar, Mauricio Araya-Polo, Hamdi A. Tchelepi

Abstract: Deep-learning-based surrogate models provide an efficient complement to numerical simulations for subsurface flow problems such as CO$_2$ geological storage. Accurately capturing the impact of faults on CO$_2$ plume migration remains a challenge for many existing deep learning surrogate models based on Convolutional Neural Networks (CNNs) or Neural Operators. We address this challenge with a graph-based neural model leveraging recent developments in the field of Graph Neural Networks (GNNs). Our model combines graph-based convolution Long-Short-Term-Memory (GConvLSTM) with a one-step GNN model, MeshGraphNet (MGN), to operate on complex unstructured meshes and limit temporal error accumulation. We demonstrate that our approach can accurately predict the temporal evolution of gas saturation and pore pressure in a synthetic reservoir with impermeable faults. Our results exhibit a better accuracy and a reduced temporal error accumulation compared to the standard MGN model. We also show the excellent generalizability of our algorithm to mesh configurations, boundary conditions, and heterogeneous permeability fields not included in the training set. This work highlights the potential of GNN-based methods to accurately and rapidly model subsurface flow with complex faults and fractures.

Submitted 16 June, 2023; originally announced June 2023.

arXiv:2304.10643 [pdf, other]

Activity Classification Using Unsupervised Domain Transfer from Body Worn Sensors

Authors: Chaitra Hedge, Gezheng Wen, Layne C. Price

Abstract: Activity classification has become a vital feature of wearable health tracking devices. As innovation in this field grows, wearable devices worn on different parts of the body are emerging. To perform activity classification on a new body location, labeled data corresponding to the new locations are generally required, but this is expensive to acquire. In this work, we present an innovative method to leverage an existing activity classifier, trained on Inertial Measurement Unit (IMU) data from a reference body location (the source domain), in order to perform activity classification on a new body location (the target domain) in an unsupervised way, i.e. without the need for classification labels at the new location. Specifically, given an IMU embedding model trained to perform activity classification at the source domain, we train an embedding model to perform activity classification at the target domain by replicating the embeddings at the source domain. This is achieved using simultaneous IMU measurements at the source and target domains. The replicated embeddings at the target domain are used by a classification model that has previously been trained on the source domain to perform activity classification at the target domain. We have evaluated the proposed methods on three activity classification datasets PAMAP2, MHealth, and Opportunity, yielding high F1 scores of 67.19%, 70.40% and 68.34%, respectively when the source domain is the wrist and the target domain is the torso.

Submitted 20 April, 2023; originally announced April 2023.

arXiv:2304.09352 [pdf, other]

Optimizing Carbon Storage Operations for Long-Term Safety

Authors: Yizheng Wang, Markus Zechner, Gege Wen, Anthony Louis Corso, John Michael Mern, Mykel J. Kochenderfer, Jef Karel Caers

Abstract: To combat global warming and mitigate the risks associated with climate change, carbon capture and storage (CCS) has emerged as a crucial technology. However, safely sequestering CO2 in geological formations for long-term storage presents several challenges. In this study, we address these issues by modeling the decision-making process for carbon storage operations as a partially observable Markov decision process (POMDP). We solve the POMDP using belief state planning to optimize injector and monitoring well locations, with the goal of maximizing stored CO2 while maintaining safety. Empirical results in simulation demonstrate that our approach is effective in ensuring safe long-term carbon storage operations. We showcase the flexibility of our approach by introducing three different monitoring strategies and examining their impact on decision quality. Additionally, we introduce a neural network surrogate model for the POMDP decision-making process to handle the complex dynamics of the multi-phase flow. We also investigate the effects of different fidelity levels of the surrogate model on decision qualities.

Submitted 18 April, 2023; originally announced April 2023.

arXiv:2212.13631

Proceedings of AAAI 2022 Fall Symposium: The Role of AI in Responding to Climate Challenges

Authors: Feras A. Batarseh, Priya L. Donti, Ján Drgoňa, Kristen Fletcher, Pierre-Adrien Hanania, Melissa Hatton, Srinivasan Keshav, Bran Knowles, Raphaela Kotsch, Sean McGinnis, Peetak Mitra, Alex Philp, Jim Spohrer, Frank Stein, Meghna Tare, Svitlana Volkov, Gege Wen

Abstract: Climate change is one of the most pressing challenges of our time, requiring rapid action across society. As artificial intelligence tools (AI) are rapidly deployed, it is therefore crucial to understand how they will impact climate action. On the one hand, AI can support applications in climate change mitigation (reducing or preventing greenhouse gas emissions), adaptation (preparing for the effects of a changing climate), and climate science. These applications have implications in areas ranging as widely as energy, agriculture, and finance. At the same time, AI is used in many ways that hinder climate action (e.g., by accelerating the use of greenhouse gas-emitting fossil fuels). In addition, AI technologies have a carbon and energy footprint themselves. This symposium brought together participants from across academia, industry, government, and civil society to explore these intersections of AI with climate change, as well as how each of these sectors can contribute to solutions.

Submitted 29 January, 2023; v1 submitted 27 December, 2022; originally announced December 2022.

arXiv:2212.10718 [pdf]

Interpretability and causal discovery of the machine learning models to predict the production of CBM wells after hydraulic fracturing

Authors: Chao Min, Guoquan Wen, Liangjie Gou, Xiaogang Li, Zhaozhong Yang

Abstract: Machine learning approaches are widely studied in the production prediction of CBM wells after hydraulic fracturing, but rarely used in practice due to the low generalization ability and the lack of interpretability. A novel methodology is proposed in this article to discover the latent causality from observed data, which is aimed at finding an indirect way to interpret the machine learning results. Based on the theory of causal discovery, a causal graph is derived with explicit input, output, treatment and confounding variables. Then, SHAP is employed to analyze the influence of the factors on the production capability, which indirectly interprets the machine learning models. The proposed method can capture the underlying nonlinear relationship between the factors and the output, which remedies the limitation of the traditional machine learning routines based on the correlation analysis of factors. The experiment on the CBM data shows that the relationship detected by the presented method between the production and the geological/engineering factors is consistent with the actual physical mechanism. Meanwhile, compared with traditional methods, the interpretable machine learning models have better performance in forecasting production capability, averaging 20% improvement in accuracy.

Submitted 20 December, 2022; originally announced December 2022.

arXiv:2211.11424 [pdf, other]

Modeling Hierarchical Structural Distance for Unsupervised Domain Adaptation

Authors: Yingxue Xu, Guihua Wen, Yang Hu, Pei Yang

Abstract: Unsupervised domain adaptation (UDA) aims to estimate a transferable model for unlabeled target domains by exploiting labeled source data. Optimal Transport (OT) based methods have recently been proven to be a promising solution for UDA with a solid theoretical foundation and competitive performance. However, most of these methods solely focus on domain-level OT alignment by leveraging the geometry of domains for domain-invariant features based on the global embeddings of images. However, global representations of images may destroy image structure, leading to the loss of local details that offer category-discriminative information. This study proposes an end-to-end Deep Hierarchical Optimal Transport method (DeepHOT), which aims to learn both domain-invariant and category-discriminative representations by mining hierarchical structural relations among domains. The main idea is to incorporate a domain-level OT and image-level OT into a unified OT framework, hierarchical optimal transport, to model the underlying geometry in both domain space and image space. In DeepHOT framework, an image-level OT serves as the ground distance metric for the domain-level OT, leading to the hierarchical structural distance. Compared with the ground distance of the conventional domain-level OT, the image-level OT captures structural associations among local regions of images that are beneficial to classification. In this way, DeepHOT, a unified OT framework, not only aligns domains by domain-level OT, but also enhances the discriminative power through image-level OT. Moreover, to overcome the limitation of high computational complexity, we propose a robust and efficient implementation of DeepHOT by approximating origin OT with sliced Wasserstein distance in image-level OT and accomplishing the mini-batch unbalanced domain-level OT.

Submitted 19 April, 2024; v1 submitted 21 November, 2022; originally announced November 2022.

Comments: accepted by TCVST, code: https://github.com/Innse/DeepHOT

arXiv:2210.17051 [pdf, other]

doi 10.1039/D2EE04204E

Real-time high-resolution CO$_2$ geological storage prediction using nested Fourier neural operators

Authors: Gege Wen, Zongyi Li, Qirui Long, Kamyar Azizzadenesheli, Anima Anandkumar, Sally M. Benson

Abstract: Carbon capture and storage (CCS) plays an essential role in global decarbonization. Scaling up CCS deployment requires accurate and high-resolution modeling of the storage reservoir pressure buildup and the gaseous plume migration. However, such modeling is very challenging at scale due to the high computational costs of existing numerical methods. This challenge leads to significant uncertainties in evaluating storage opportunities, which can delay the pace of large-scale CCS deployment. We introduce Nested Fourier Neural Operator (FNO), a machine-learning framework for high-resolution dynamic 3D CO2 storage modeling at a basin scale. Nested FNO produces forecasts at different refinement levels using a hierarchy of FNOs and speeds up flow prediction nearly 700,000 times compared to existing methods. By learning the solution operator for the family of governing partial differential equations, Nested FNO creates a general-purpose numerical simulator alternative for CO2 storage with diverse reservoir conditions, geological heterogeneity, and injection schemes. Our framework enables unprecedented real-time modeling and probabilistic simulations that can support the scale-up of global CCS deployment.

Submitted 1 June, 2023; v1 submitted 31 October, 2022; originally announced October 2022.

Journal ref: Energy & Environmental Science, 16(4), 1732-1741 (2023)

arXiv:2208.14447 [pdf, ps, other]

A further exploration of deep Multi-Agent Reinforcement Learning with Hybrid Action Space

Authors: Hongzhi Hua, Guixuan Wen, Kaigui Wu

Abstract: Research on extending deep reinforcement learning (DRL) to the multi-agent field has solved many complicated problems and made great achievements. However, almost all of these studies focus only on discrete or continuous action spaces, and few works have applied multi-agent deep reinforcement learning to real-world environment problems, which mostly have a hybrid action space. Therefore, in this paper, we propose two algorithms: deep multi-agent hybrid soft actor-critic (MAHSAC) and multi-agent hybrid deep deterministic policy gradients (MAHDDPG) to fill this gap. These two algorithms follow the centralized training and decentralized execution (CTDE) paradigm and can handle hybrid action space problems. Our experiments run on the multi-agent particle environment, which is an easy multi-agent particle world, along with some basic simulated physics. The experimental results show that these algorithms perform well.

Submitted 30 August, 2022; originally announced August 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2206.05108

arXiv:2206.05108 [pdf, ps, other]

Deep Multi-Agent Reinforcement Learning with Hybrid Action Spaces based on Maximum Entropy

Authors: Hongzhi Hua, Kaigui Wu, Guixuan Wen

Abstract: Multi-agent deep reinforcement learning has been applied to address a variety of complex problems with either discrete or continuous action spaces and achieved great success. However, most real-world environments cannot be described by only discrete action spaces or only continuous action spaces, and few works have applied deep reinforcement learning (DRL) to multi-agent problems with hybrid action spaces. Therefore, we propose a novel algorithm: Deep Multi-Agent Hybrid Soft Actor-Critic (MAHSAC) to fill this gap. This algorithm follows the centralized training but decentralized execution (CTDE) paradigm, and extends the Soft Actor-Critic algorithm (SAC) to handle hybrid action space problems in multi-agent environments based on maximum entropy. Our experiments run on an easy multi-agent particle world with a continuous observation and discrete action space, along with some basic simulated physics. The experimental results show that MAHSAC has good performance in training speed, stability, and anti-interference ability. At the same time, it outperforms the existing independent deep hybrid learning method in cooperative scenarios and competitive scenarios.

Submitted 10 June, 2022; originally announced June 2022.

arXiv:2205.10235 [pdf, other]

An Efficient Methodology to Identify Missing Tags in Large-Scale RFID Systems

Authors: Chu Chu, Rui Xu, Gang Li, Zhenbing Li, Guangjun Wen

Abstract: Radio frequency identification (RFID) has broad applications. One such application is to use RFID to track inventory in warehouses and retail stores. In this application, identifying the missing items in a timely manner is an ongoing engineering problem. A feasible solution to this problem is to map each tag to a time slot and verify the presence of a tag by comparing the status of the predicted time slot and the actual time slot. However, existing works are time inefficient because they only verify tags one by one in singleton slots but ignore the collision slots mapped by multiple tags. To accelerate the identification process, we use bit tracking to verify tags in collision slots and design two protocols accordingly. We first propose the Sequential String based Missing Tag Identification (SSMTI) protocol, which converts all time slots to collision slots and enables tags in each slot to reply to a designed string simultaneously. By using bit tracking to decode the combined string, the reader can verify multiple tags together. To improve the performance of SSMTI when most tags are missing, we further propose the Interactive String based Missing Tag Identification (ISMTI) protocol. ISMTI improves the strategies of designing strings for each collided tag so that the reader can verify more tags using shorter strings than SSMTI. Besides, ISMTI can dynamically adjust the verification mechanism according to the proportion of missing tags to maintain time efficiency. We also provide theoretical analysis for the proposed protocols to minimize execution time and evaluate their performance through extensive simulations. Compared with state-of-the-art solutions, the proposed SSMTI and ISMTI can reduce the time cost by as much as 39.74% and 68.87%, respectively.

Submitted 28 April, 2022; originally announced May 2022.

arXiv:2204.00306 [pdf, other]

Building Decision Forest via Deep Reinforcement Learning

Authors: Guixuan Wen, Kaigui Wu

Abstract: Ensemble learning methods whose base classifier is a decision tree usually belong to either the bagging or the boosting family. However, to the best of our knowledge, no previous work has built the ensemble classifier by maximizing long-term returns. This paper proposes a decision forest building method called MA-H-SAC-DF for binary classification via deep reinforcement learning. First, the building process is modeled as a decentralized partially observable Markov decision process, and a set of cooperative agents jointly constructs all base classifiers. Second, the global state and local observations are defined based on information about the parent node and the current location. Last, the state-of-the-art deep reinforcement method Hybrid SAC is extended to a multi-agent system under the CTDE architecture to find an optimal decision forest building policy. The experiments indicate that MA-H-SAC-DF has the same performance as random forest, AdaBoost, and GBDT on balanced datasets and outperforms them on imbalanced datasets.

Submitted 1 April, 2022; originally announced April 2022.

arXiv:2201.06778 [pdf, other]

Data-Driven Deep Learning Based Hybrid Beamforming for Aerial Massive MIMO-OFDM Systems with Implicit CSI

Authors: Zhen Gao, Minghui Wu, Chun Hu, Feifei Gao, Guanghui Wen, Dezhi Zheng, Jun Zhang

Abstract: In an aerial hybrid massive multiple-input multiple-output (MIMO) and orthogonal frequency division multiplexing (OFDM) system, how to design a spectral-efficient broadband multi-user hybrid beamforming with a limited pilot and feedback overhead is challenging. To this end, by modeling the key transmission modules as an end-to-end (E2E) neural network, this paper proposes a data-driven deep learning (DL)-based unified hybrid beamforming framework for both the time division duplex (TDD) and frequency division duplex (FDD) systems with implicit channel state information (CSI). For TDD systems, the proposed DL-based approach jointly models the uplink pilot combining and downlink hybrid beamforming modules as an E2E neural network. For FDD systems, we jointly model the downlink pilot transmission, uplink CSI feedback, and downlink hybrid beamforming modules as an E2E neural network. Different from conventional approaches separately processing different modules, the proposed solution simultaneously optimizes all modules with the sum rate as the optimization objective. Therefore, by perceiving the inherent property of air-to-ground massive MIMO-OFDM channel samples, the DL-based E2E neural network can establish the mapping function from the channel to the beamformer, so that the explicit channel reconstruction can be avoided with reduced pilot and feedback overhead. Besides, practical low-resolution phase shifters (PSs) introduce the quantization constraint, leading to the intractable gradient backpropagation when training the neural network. To mitigate the performance loss caused by the phase quantization error, we adopt the transfer learning strategy to further fine-tune the E2E neural network based on a pre-trained network that assumes the ideal infinite-resolution PSs. Numerical results show that our DL-based schemes have considerable advantages over state-of-the-art schemes.

Submitted 9 September, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

Comments: Accepted by IEEE Journal on Selected Areas in Communications

arXiv:2112.13508 [pdf]

doi 10.1007/s10586-024-04293-x

Duck swarm algorithm: theory, numerical optimization, and applications

Authors: Mengjian Zhang, Guihua Wen

Abstract: A swarm intelligence-based optimization algorithm, named Duck Swarm Algorithm (DSA), is proposed in this study, which is inspired by the food-searching and foraging behaviors of the duck swarm. Two rules are modeled from the food-finding and foraging behaviors of the duck, which correspond to the exploration and exploitation phases of the proposed DSA, respectively. The performance of the DSA is verified by using multiple CEC benchmark functions, where its statistical (best, mean, standard deviation, and average running-time) results are compared with seven well-known algorithms: Particle swarm optimization (PSO), Firefly algorithm (FA), Chicken swarm optimization (CSO), Grey wolf optimizer (GWO), Sine cosine algorithm (SCA), Marine-predators algorithm (MPA), and Archimedes optimization algorithm (AOA). Moreover, the Wilcoxon rank-sum test, Friedman test, and convergence curves of the comparison results are utilized to prove the superiority of the DSA against other algorithms. The results demonstrate that DSA is a high-performance optimization method in terms of convergence speed and exploration-exploitation balance for solving the numerical optimization problems. Also, DSA is applied for the optimal design of six engineering constrained optimization problems and the node optimization deployment task of the Wireless Sensor Network (WSN). Overall, the comparison results revealed that the DSA is a promising and very competitive algorithm for solving different optimization problems.

Submitted 1 June, 2024; v1 submitted 26 December, 2021; originally announced December 2021.

Journal ref: Cluster Computing, 2024

arXiv:2109.03697 [pdf, other]

U-FNO -- An enhanced Fourier neural operator-based deep-learning model for multiphase flow

Authors: Gege Wen, Zongyi Li, Kamyar Azizzadenesheli, Anima Anandkumar, Sally M. Benson

Abstract: Numerical simulation of multiphase flow in porous media is essential for many geoscience applications. Machine learning models trained with numerical simulation data can provide a faster alternative to traditional simulators. Here we present U-FNO, a novel neural network architecture for solving multiphase flow problems with superior accuracy, speed, and data efficiency. U-FNO is designed based on the newly proposed Fourier neural operator (FNO), which has shown excellent performance in single-phase flows. We extend the FNO-based architecture to a highly complex CO2-water multiphase problem with wide ranges of permeability and porosity heterogeneity, anisotropy, reservoir conditions, injection configurations, flow rates, and multiphase flow properties. The U-FNO architecture is more accurate in gas saturation and pressure buildup predictions than the original FNO and a state-of-the-art convolutional neural network (CNN) benchmark. Meanwhile, it has superior data utilization efficiency, requiring only a third of the training data to achieve accuracy equivalent to that of the CNN. U-FNO provides superior performance in highly heterogeneous geological formations and critically important applications such as gas saturation and pressure buildup "fronts" determination. The trained model can serve as a general-purpose alternative to routine numerical simulations of 2D-radial CO2 injection problems with significant speed-ups over traditional simulators.

Submitted 4 May, 2022; v1 submitted 3 September, 2021; originally announced September 2021.

arXiv:2106.06410 [pdf, other]

What Can Knowledge Bring to Machine Learning? -- A Survey of Low-shot Learning for Structured Data

Authors: Yang Hu, Adriane Chapman, Guihua Wen, Dame Wendy Hall

Abstract: Supervised machine learning has several drawbacks that make it difficult to use in many situations. Drawbacks include: heavy reliance on massive training data, limited generalizability and poor expressiveness of high-level semantics. Low-shot Learning attempts to address these drawbacks. Low-shot learning allows the model to obtain good predictive power with very little or no training data, where structured knowledge plays a key role as a high-level semantic representation of human. This article will review the fundamental factors of low-shot learning technologies, with a focus on the operation of structured knowledge under different low-shot conditions. We also introduce other techniques relevant to low-shot learning. Finally, we point out the limitations of low-shot learning, the prospects and gaps of industrial applications, and future research directions.

Submitted 11 June, 2021; originally announced June 2021.

Comments: 41 pages, 280 references

arXiv:2104.01795 [pdf, other]

doi 10.1016/j.advwatres.2021.104009

CCSNet: a deep learning modeling suite for CO$_2$ storage

Authors: Gege Wen, Catherine Hay, Sally M. Benson

Abstract: Numerical simulation is an essential tool for many applications involving subsurface flow and transport, yet often suffers from computational challenges due to the multi-physics nature, highly non-linear governing equations, inherent parameter uncertainties, and the need for high spatial resolutions to capture multi-scale heterogeneity. We developed CCSNet, a general-purpose deep-learning modeling suite that can act as an alternative to conventional numerical simulators for carbon capture and storage (CCS) problems where CO$_2$ is injected into saline aquifers in 2d-radial systems. CCSNet consists of a sequence of deep learning models producing all the outputs that a numerical simulator typically provides, including saturation distributions, pressure buildup, dry-out, fluid densities, mass balance, solubility trapping, and sweep efficiency. The results are 10$^3$ to 10$^4$ times faster than conventional numerical simulators. As an application of CCSNet illustrating the value of its high computational efficiency, we developed rigorous estimation techniques for the sweep efficiency and solubility trapping.

Submitted 5 April, 2021; originally announced April 2021.

arXiv:2009.01490 [pdf, ps, other]

Fixed-Time Cooperative Tracking Control for Double-Integrator Multi-Agent Systems: A Time-Based Generator Approach

Authors: Qiang Chen, Yu Zhao, Guanghui Wen, Guoqing Shi, Xinghuo Yu

Abstract: In this paper, both the fixed-time distributed consensus tracking and the fixed-time distributed average tracking problems for double-integrator-type multi-agent systems with bounded input disturbances are studied. Firstly, a new practical robust fixed-time sliding mode control method based on the time-based generator is proposed. Secondly, a fixed-time distributed consensus tracking observer for double-integrator-type multi-agent systems is designed to estimate the state disagreements between the leader and the followers under undirected and directed communication, respectively. Thirdly, a fixed-time distributed average tracking observer for double-integrator-type multi-agent systems is designed to measure the average value of reference signals under undirected communication. Note that both the observers for the distributed consensus tracking and the distributed average tracking are devised based on time-based generators and can be extended to that of high-order multi-agent systems trivially. Furthermore, by combining the fixed-time sliding mode control with the fixed-time observers, the fixed-time controllers are designed to solve the distributed consensus tracking and the distributed average tracking problems. Finally, a few numerical simulations are shown to verify the results.

Submitted 3 September, 2020; originally announced September 2020.

Comments: 11 pages, 10 figures

arXiv:2007.00339 [pdf, other]

Multi-Task Variational Information Bottleneck

Authors: Weizhu Qian, Bowei Chen, Yichao Zhang, Guanghui Wen, Franck Gechter

Abstract: Multi-task learning (MTL) is an important subject in machine learning and artificial intelligence. Its applications to computer vision, signal processing, and speech recognition are ubiquitous. Although this subject has attracted considerable attention recently, the performance and robustness of the existing models to different tasks have not been well balanced. This article proposes an MTL model based on the architecture of the variational information bottleneck (VIB), which can provide a more effective latent representation of the input features for the downstream tasks. Extensive observations on three public data sets under adversarial attacks show that the proposed model is competitive with the state-of-the-art algorithms in prediction accuracy. Experimental results suggest that combining the VIB and the task-dependent uncertainties is a very effective way to abstract valid information from the input features for accomplishing multiple tasks.

Submitted 1 March, 2021; v1 submitted 1 July, 2020; originally announced July 2020.

Comments: 10 pages

arXiv:2006.04648 [pdf, other]

doi 10.1109/TMM.2021.3082292

Graph-based Visual-Semantic Entanglement Network for Zero-shot Image Recognition

Authors: Yang Hu, Guihua Wen, Adriane Chapman, Pei Yang, Mingnan Luo, Yingxue Xu, Dan Dai, Wendy Hall

Abstract: Zero-shot learning uses semantic attributes to connect the search space of unseen objects. In recent years, although the deep convolutional network brings powerful visual modeling capabilities to the ZSL task, its visual features have severe pattern inertia and lack representation of semantic relationships, which leads to severe bias and ambiguity. In response to this, we propose the Graph-based Visual-Semantic Entanglement Network to conduct graph modeling of visual features, which is mapped to semantic attributes by using a knowledge graph. It contains several novel designs: (1) it establishes a multi-path entangled network with the convolutional neural network (CNN) and the graph convolutional network (GCN), which inputs the visual features from the CNN to the GCN to model the implicit semantic relations, after which the GCN feeds the graph-modeled information back to the CNN features; (2) it uses attribute word vectors as the target for the graph semantic modeling of the GCN, which forms a self-consistent regression for graph modeling and supervises the GCN to learn more personalized attribute relations; (3) it fuses and supplements the hierarchical visual-semantic features refined by graph modeling into visual embedding. Our method outperforms state-of-the-art approaches on multiple representative ZSL datasets (AwA2, CUB, and SUN) by promoting the semantic linkage modelling of visual features.

Submitted 11 June, 2021; v1 submitted 8 June, 2020; originally announced June 2020.

Comments: 15 pages, 11 figures, on IEEE Transactions on Multimedia

Journal ref: IEEE Transactions on Multimedia, 2021

arXiv:2005.06423 [pdf, other]

Multiple Attentional Pyramid Networks for Chinese Herbal Recognition

Authors: Yingxue Xu, Guihua Wen, Yang Hu, Mingnan Luo, Dan Dai, Yishan Zhuang, Wendy Hall

Abstract: Chinese herbs play a critical role in Traditional Chinese Medicine. Due to different recognition granularity, they can be recognized accurately only by professionals with much experience. It is expected that they can be recognized automatically using new techniques like machine learning. However, there is no Chinese herbal image dataset available. Simultaneously, there is no machine learning method which can deal with Chinese herbal image recognition well. Therefore, this paper begins with building a new standard Chinese-Herbs dataset. Subsequently, a new Attentional Pyramid Networks (APN) for Chinese herbal recognition is proposed, where both novel competitive attention and spatial collaborative attention are proposed and then applied. APN can adaptively model Chinese herbal images with different feature scales. Finally, a new framework for Chinese herbal recognition is proposed as a new application of APN. Experiments are conducted on our constructed dataset and validate the effectiveness of our methods.

Submitted 13 May, 2020; originally announced May 2020.

Comments: 14 pages, 8 figures

arXiv:1910.09657 [pdf, other]

doi 10.1016/j.ijggc.2020.103223

Multiphase flow prediction with deep neural networks

Authors: Gege Wen, Meng Tang, Sally M. Benson

Abstract: This paper proposes a deep neural network approach for predicting multiphase flow in heterogeneous domains with high computational efficiency. The deep neural network model is able to handle permeability heterogeneity in high dimensional systems, and can learn the interplay of viscous, gravity, and capillary forces from small data sets. Using the example of carbon dioxide (CO2) storage, we demonstrate that the model can generate highly accurate predictions of a CO2 saturation distribution given a permeability field, injection duration, injection rate, and injection location. The trained neural network model has an excellent ability to interpolate and to a limited extent, the ability to extrapolate beyond the training data ranges. To improve the prediction accuracy when the neural network model needs to extrapolate, we propose a transfer learning (fine-tuning) procedure that can quickly teach the neural network model new information without going through massive data collection and retraining. Based on this trained neural network model, a web-based tool is provided that allows users to perform CO2-water multiphase flow calculations online. With the tools provided in this paper, the deep neural network approach can provide a computationally efficient substitute for repetitive forward multiphase flow simulations, which can be adopted to the context of history matching and uncertainty quantification.

Submitted 21 October, 2019; originally announced October 2019.

arXiv:1904.12639 [pdf, other]

doi 10.1109/TCYB.2020.3034605

Inner-Imaging Networks: Put Lenses into Convolutional Structure

Authors: Yang Hu, Guihua Wen, Mingnan Luo, Dan Dai, Wenming Cao, Zhiwen Yu, Wendy Hall

Abstract: Despite the tremendous success in computer vision, deep convolutional networks suffer from serious computation costs and redundancies. Although previous works address this issue by enhancing diversities of filters, they have not considered the complementarity and the completeness of the internal structure of the convolutional network. To deal with these problems, a novel Inner-Imaging architecture is proposed in this paper, which allows relationships between channels to meet the above requirement. Specifically, we organize the channel signal points in groups using convolutional kernels to model both the intra-group and inter-group relationships simultaneously. The convolutional filter is a powerful tool for modeling spatial relations and organizing grouped signals, so the proposed methods map the channel signals onto a pseudo-image, like putting a lens into the internal structure of the convolution. Consequently, not only is the diversity of channels increased, but the complementarity and completeness can also be explicitly enhanced. The proposed architecture is lightweight and easy to implement. It provides an efficient self-organization strategy for convolutional networks so as to improve their efficiency and performance. Extensive experiments are conducted on multiple benchmark image recognition data sets including CIFAR, SVHN and ImageNet. Experimental results verify the effectiveness of the Inner-Imaging mechanism with the most popular convolutional networks as the backbones.

Submitted 27 August, 2021; v1 submitted 22 April, 2019; originally announced April 2019.

Comments: 14 pages, 10 figures, formal edition on IEEE Transactions on Cybernetics, 2021

arXiv:1904.09853 [pdf, other]

Stochastic Region Pooling: Make Attention More Expressive

Authors: Mingnan Luo, Guihua Wen, Yang Hu, Dan Dai, Yingxue Xu

Abstract: Global Average Pooling (GAP) is used by default on the channel-wise attention mechanism to extract channel descriptors. However, the simple global aggregation method of GAP tends to make the channel descriptors homogeneous, which weakens the detail distinction between feature maps, thus affecting the performance of the attention mechanism. In this work, we propose a novel method for channel-wise attention networks, called Stochastic Region Pooling (SRP), which makes the channel descriptors more representative and diverse by encouraging the feature map to have more or wider important feature responses. Also, SRP is a general method for attention mechanisms without any additional parameters or computation. It can be widely applied to attention networks without modifying the network structure. Experimental results on image recognition datasets including CIFAR-10/100, ImageNet and three fine-grained datasets (CUB-200-2011, Stanford Cars and Stanford Dogs) show that SRP brings significant performance improvements over efficient CNNs and achieves state-of-the-art results.

Submitted 22 April, 2019; originally announced April 2019.

arXiv:1812.09648 [pdf, other]

Chinese Herbal Recognition based on Competitive Attentional Fusion of Multi-hierarchies Pyramid Features

Authors: Yingxue Xu, Guihua Wen, Yang Hu, Mingnan Luo, Dan Dai, Yishan Zhuang

Abstract: Convolutional neural networks (CNNs) are successfully applied in image recognition tasks. In this study, we explore the approach of automatic herbal recognition with CNNs and first build the standard Chinese herbs datasets. According to the characteristics of herbal images, we propose the competitive attentional fusion pyramid networks to model the features of herbal images, which model the relationship of feature maps from different levels, and re-weight multi-level channels with a channel-wise attention mechanism. In this way, we can dynamically adjust the weight of feature maps from various layers, according to the visual characteristics of each herbal image. Moreover, we also introduce spatial attention to recalibrate the misaligned features caused by sampling in feature amalgamation. Extensive experiments are conducted on our proposed datasets and validate the superior performance of our proposed models. The Chinese herbs datasets will be released upon acceptance to facilitate the research of Chinese herbal recognition.

Submitted 22 December, 2018; originally announced December 2018.

Comments: New Datasets for Chinese Herbs Recognition

Journal ref: Pattern Recognition, 2021, 110: 107558

arXiv:1812.06847 [pdf, other]

Convolutional herbal prescription building method from multi-scale facial features

Authors: Huiqiang Liao, Guihua Wen, Yang Hu, Changjun Wang

Abstract: In Traditional Chinese Medicine (TCM), facial features are an important basis for diagnosis and treatment. A doctor of TCM can prescribe according to a patient's physical indicators such as face, tongue, voice, symptoms, and pulse. Previous works analyze and generate prescriptions according to symptoms. However, research mining the association between facial features and prescriptions has not yet been found. In this work, we try to use deep learning methods to mine the relationship between the patient's face and herbal prescriptions (TCM prescriptions), and propose to construct convolutional neural networks that generate TCM prescriptions according to the patient's face image. It is a novel and challenging task. In order to mine features from different granularities of faces, we design a multi-scale convolutional neural network based on three-grained face, which mines the patient's face information from the organs, local regions, and the entire face. Our experiments show that convolutional neural networks can learn relevant information from the face to prescribe, and that the multi-scale convolutional neural networks based on three-grained face perform better.

Submitted 17 December, 2018; originally announced December 2018.

arXiv:1807.08920 [pdf, other]

Competitive Inner-Imaging Squeeze and Excitation for Residual Network

Authors: Yang Hu, Guihua Wen, Mingnan Luo, Dan Dai, Jiajiong Ma, Zhiwen Yu

Abstract: Residual networks, which use a residual unit to supplement the identity mappings, enable very deep convolutional architectures to operate well. However, the residual architecture has been shown to be diverse and redundant, which may lead to inefficient modeling. In this work, we propose a competitive squeeze-excitation (SE) mechanism for the residual network. The re-scaling value for each channel in this structure is determined jointly by the residual and identity mappings, and this design enables us to expand the meaning of channel relationship modeling in residual blocks. Modeling the competition between residual and identity mappings causes the identity flow to control the complement of the residual feature maps for itself. Furthermore, we design a novel inner-imaging competitive SE block to shrink the consumption and re-image the global features of the intermediate network structure. By using the inner-imaging mechanism, we can model the channel-wise relations with convolution in the spatial domain. We carry out experiments on the CIFAR, SVHN, and ImageNet datasets, and the proposed method can challenge state-of-the-art results.

Submitted 22 December, 2018; v1 submitted 24 July, 2018; originally announced July 2018.

Comments: Code is available at https://github.com/scut-aitcm/Competitive-Inner-Imaging-SENet

arXiv:1806.06916 [pdf]

On sound insulation of pyramidal lattice sandwich structure

Authors: Jie Liu, Tingting Chen, Guilin Wen, Qixiang Qing, Ramin Sedaghati, Yi Min Xie

Abstract: The pyramidal lattice sandwich structure (PLSS) exhibits high stiffness and a high strength-to-weight ratio, which can be effectively utilized for designing lightweight load-bearing structures for vehicles ranging from ground to aerospace. While these structures provide a superior strength-to-weight ratio, their sound insulation capacity has not been well understood. The aim of this study is to develop numerical and experimental methods to fundamentally investigate the sound insulation property of the pyramidal lattice sandwich structure with solid trusses (PLSSST). A finite element model has been developed to predict the sound transmission loss (STL) of PLSSST, and simulation results have been compared with those obtained experimentally. Parametric studies are then performed using the validated finite element model to investigate the effect of different parameters in the pyramidal lattice sandwich structure with hollow trusses (PLSSHT), revealing that the pitching angle, the uniform thickness and the length of the hollow truss, and the lattice constant have considerable effects on the sound transmission loss. Finally, a design optimization strategy has been formulated to optimize PLSSHT in order to maximize STL while meeting mechanical property requirements. It has been shown that the STL of the optimal PLSSHT can be increased by almost 10% in the low-frequency band. The work reported here provides useful information for the noise reduction design of periodic lattice structures.

Submitted 5 June, 2018; originally announced June 2018.

arXiv:1805.11223 [pdf]

Video Anomaly Detection and Localization via Gaussian Mixture Fully Convolutional Variational Autoencoder

Authors: Yaxiang Fan, Gongjian Wen, Deren Li, Shaohua Qiu, Martin D. Levine

Abstract: We present a novel end-to-end partially supervised deep learning approach for video anomaly detection and localization using only normal samples. The insight that motivates this study is that the normal samples can be associated with at least one Gaussian component of a Gaussian Mixture Model (GMM), while anomalies do not belong to any Gaussian component. The method is based on the Gaussian Mixture Variational Autoencoder, which can learn feature representations of the normal samples as a Gaussian Mixture Model trained using deep learning. A Fully Convolutional Network (FCN) that does not contain a fully-connected layer is employed for the encoder-decoder structure to preserve relative spatial coordinates between the input image and the output feature map. Based on the joint probabilities of each of the Gaussian mixture components, we introduce a sample-energy-based method to score the anomaly of image test patches. A two-stream network framework is employed to combine the appearance and motion anomalies, using RGB frames for the former and dynamic flow images for the latter. We test our approach on two popular benchmarks (UCSD Dataset and Avenue Dataset). The experimental results verify the superiority of our method compared to the state of the art.

Submitted 28 May, 2018; originally announced May 2018.

arXiv:1803.00219 [pdf]

Tongue image constitution recognition based on Complexity Perception method

Authors: Jiajiong Ma, Guihua Wen, Yang Hu, Tianyuan Chang, Haibin Zeng, Lijun Jiang, Jianzeng Qin

Abstract: Background and Objective: In China, body constitution is highly related to the physiological and pathological functions of the human body and determines the tendency of disease, which is of great importance for treatment in clinical medicine. Tongue diagnosis, as a key part of Traditional Chinese Medicine inspection, is an important way to recognize the type of constitution. In order to deploy a tongue image constitution recognition system on non-invasive mobile devices and achieve fast, efficient, and accurate constitution recognition, an efficient method is required to deal with the challenges of this kind of complex environment. Methods: In this work, we perform tongue area detection, tongue area calibration, and constitution classification using methods based on deep convolutional neural networks. Because environmental conditions vary, the distribution of the pictures is uneven, which degrades classification performance. To solve this problem, we propose a method based on the complexity of individual instances that divides the dataset into two subsets and classifies them separately, which is capable of improving classification accuracy. To evaluate the performance of our proposed method, we conduct experiments on three sizes of tongue datasets, in which a deep convolutional neural network method and a traditional digital image analysis method are respectively applied to extract features from tongue images. The proposed method is combined with the base classifiers Softmax, SVM, and DecisionTree, respectively. Results: As the experimental results show, our proposed method improves the classification accuracy by 1.135% on average and achieves 59.99% constitution classification accuracy. Conclusions: Experimental results on three datasets show that our proposed method can effectively improve the classification accuracy of tongue constitution recognition.

Submitted 1 March, 2018; originally announced March 2018.

arXiv:1803.00185 [pdf]

Facial Expression Recognition Based on Complexity Perception Classification Algorithm

Authors: Tianyuan Chang, Guihua Wen, Yang Hu, JiaJiong Ma

Abstract: Facial expression recognition (FER) has always been a challenging issue in computer vision. The different expressions of emotion and uncontrolled environmental factors lead to inconsistencies in the complexity of FER and variability between expression categories, which is often overlooked in most facial expression recognition systems. In order to solve this problem effectively, we present a simple and efficient CNN model to extract facial features, and propose a complexity perception classification (CPC) algorithm for FER. The CPC algorithm divides the dataset into an easy classification sample subspace and a complex classification sample subspace by evaluating the complexity of facial features that are suitable for classification. The experimental results of our proposed algorithm on the Fer2013 and CK-plus datasets demonstrate the algorithm's effectiveness and superiority over other state-of-the-art approaches.

Submitted 28 February, 2018; originally announced March 2018.

arXiv:1802.02203 [pdf]

doi 10.1109/TCYB.2019.2909925

Automatic construction of Chinese herbal prescription from tongue image via CNNs and auxiliary latent therapy topics

Authors: Yang Hu, Guihua Wen, Huiqiang Liao, Changjun Wang, Dan Dai, Zhiwen Yu

Abstract: The tongue image provides important physical information about humans. It is of great importance for diagnoses and treatments in clinical medicine. Herbal prescriptions are simple, noninvasive, and have low side effects. Thus, they are widely applied in China. Studies on the automatic construction of herbal prescriptions based on tongue images have great significance for using deep learning to explore the relevance of tongue images to herbal prescriptions, and the technology can be applied to healthcare services in mobile medical systems. In order to adapt to tongue images taken in a variety of photographic environments and to construct herbal prescriptions, a neural network framework for prescription construction is designed. It includes single/double convolution channels and fully connected layers. Furthermore, an auxiliary therapy topic loss mechanism is proposed to model the therapy of Chinese doctors and alleviate the interference of sparse output labels on the diversity of results. The experiments use real-world tongue images and the corresponding prescriptions, and the results show that the method can generate prescriptions that are close to the real samples, which verifies the feasibility of the proposed method for the automatic construction of herbal prescriptions from tongue images. It also provides a reference for automatic herbal prescription construction from more physical information.

Submitted 6 May, 2019; v1 submitted 23 January, 2018; originally announced February 2018.

Comments: 17 pages, 10 figures

Journal ref: IEEE Transactions on Cybernetics ( Volume: 51, Issue: 2, Feb. 2021)

arXiv:1704.02090 [pdf, other]

Conceptualization Topic Modeling

Authors: Yi-Kun Tang, Xian-Ling Mao, Heyan Huang, Guihua Wen

Abstract: Recently, topic modeling has been widely used to discover the abstract topics in text corpora. Most of the existing topic models are based on the assumption of a three-layer hierarchical Bayesian structure, i.e. each document is modeled as a probability distribution over topics, and each topic is a probability distribution over words. However, this assumption is not optimal. Intuitively, it is more reasonable to assume that each topic is a probability distribution over concepts, and that each concept is in turn a probability distribution over words, i.e. adding a latent concept layer between the topic layer and the word layer of the traditional three-layer assumption. In this paper, we verify the proposed assumption by incorporating it into two representative topic models, obtaining two novel topic models. Extensive experiments were conducted on the proposed models and the corresponding baselines, and the results show that the proposed models significantly outperform the baselines in terms of case study and perplexity, which means the new assumption is more reasonable than the traditional one.

Submitted 7 April, 2017; originally announced April 2017.

Comments: 7 pages

arXiv:1704.02088 [pdf, ps, other]

Supervised Deep Hashing for Hierarchical Labeled Data

Authors: Dan Wang, Heyan Huang, Chi Lu, Bo-Si Feng, Liqiang Nie, Guihua Wen, Xian-Ling Mao

Abstract: Recently, hashing methods have been widely used in large-scale image retrieval. However, most existing hashing methods do not consider the hierarchical relation of labels, which means that they ignore the rich information stored in the hierarchy. Moreover, most previous works treat each bit in a hash code equally, which does not suit the scenario of hierarchical labeled data. In this paper, we propose a novel deep hashing method, called supervised hierarchical deep hashing (SHDH), to perform hash code learning for hierarchical labeled data. Specifically, we define a novel similarity formula for hierarchical labeled data by weighting each layer, and design a deep convolutional neural network to obtain a hash code for each data point. Extensive experiments on several real-world public datasets show that the proposed method outperforms the state-of-the-art baselines in the image retrieval task.

Submitted 12 September, 2017; v1 submitted 7 April, 2017; originally announced April 2017.

Comments: 9 pages

Showing 1–40 of 40 results for author: Wen, G