-
Equilibrium Strategies of Carbon Emission Reduction in Agricultural Product Supply Chain under Carbon Sink Trading
Authors:
Tingting Meng,
Yukun Cheng,
Xujin Pu,
Rui Li
Abstract:
As global climate change and environmental issues escalate, carbon reduction has emerged as a paramount global concern. Agriculture accounts for approximately 30% of global greenhouse gas emissions, making carbon reduction in this sector crucial for attaining global emission targets. Carbon sink trading serves as a supplementary mechanism to achieve carbon peaking and neutrality, helping to lower the rate of carbon emissions. However, practical projects and research in the field of carbon sink trading remain scarce. This work explores the cooperative models between farmers and retailers within the context of agricultural carbon sink trading, as well as the optimal carbon emission reduction efforts of both parties under different cooperative models. To this end, we examine three distinct cooperative frameworks: the decentralized, the Stackelberg, and the centralized models, each accompanied by a corresponding differential game model. The Hamilton-Jacobi-Bellman equation is used to derive the equilibrium strategies of each participant under these three cooperative models. Furthermore, we conduct numerical simulations to analyze the carbon emission reduction efforts of farmers and retailers, the carbon emission reduction level of the agricultural supply chain, and the overall profits of the supply chain. We also compare scenarios with and without carbon sink trading to provide a comprehensive assessment. The numerical results indicate that the centralized model excels in all aspects, followed by the Stackelberg model, with the decentralized model showing the weakest performance. Additionally, carbon sink trading can significantly increase the profits of the participants under each cooperative model.
Submitted 6 July, 2024;
originally announced July 2024.
-
Efficient Long-distance Latent Relation-aware Graph Neural Network for Multi-modal Emotion Recognition in Conversations
Authors:
Yuntao Shou,
Wei Ai,
Jiayi Du,
Tao Meng,
Haiyan Liu
Abstract:
The task of multi-modal emotion recognition in conversation (MERC) aims to analyze the genuine emotional state of each utterance based on the multi-modal information in the conversation, which is crucial for conversation understanding. Existing methods focus on using graph neural networks (GNN) to model conversational relationships and capture contextual latent semantic relationships. However, due to the complexity of GNN, existing methods cannot efficiently capture the potential dependencies between long-distance utterances, which limits the performance of MERC. In this paper, we propose an Efficient Long-distance Latent Relation-aware Graph Neural Network (ELR-GNN) for multi-modal emotion recognition in conversations. Specifically, we first use pre-extracted text, video and audio features as input to a Bi-LSTM to capture contextual semantic information and obtain low-level utterance features. Then, we use the low-level utterance features to construct a conversational emotion interaction graph. To efficiently capture the potential dependencies between long-distance utterances, we use the dilated generalized forward push algorithm to precompute the emotional propagation between global utterances and design an emotional relation-aware operator to capture the potential semantic associations between different utterances. Furthermore, we combine early fusion and adaptive late fusion mechanisms to fuse latent dependency information between speaker relationship information and context. Finally, we obtain high-level discourse features and feed them into an MLP for emotion prediction. Extensive experimental results show that ELR-GNN achieves state-of-the-art performance on the benchmark datasets IEMOCAP and MELD, with running times reduced by 52% and 35%, respectively.
Submitted 27 June, 2024;
originally announced July 2024.
-
Fed-Credit: Robust Federated Learning with Credibility Management
Authors:
Jiayan Chen,
Zhirong Qian,
Tianhui Meng,
Xitong Gao,
Tian Wang,
Weijia Jia
Abstract:
Aiming at privacy preservation, Federated Learning (FL) is an emerging machine learning approach enabling model training on decentralized devices or data sources. The learning mechanism of FL relies on aggregating parameter updates from individual clients. However, this process may pose a potential security risk due to the presence of malicious devices. Existing solutions are either costly due to the use of compute-intensive technology, or restrictive for reasons of strong assumptions such as the prior knowledge of the number of attackers and how they attack. Few methods consider both privacy constraints and uncertain attack scenarios. In this paper, we propose a robust FL approach based on the credibility management scheme, called Fed-Credit. Unlike previous studies, our approach does not require prior knowledge of the nodes and the data distribution. It maintains and employs a credibility set, which weighs the historical clients' contributions based on the similarity between the local models and global model, to adjust the global model update. The subtlety of Fed-Credit is that the time decay and attitudinal value factor are incorporated into the dynamic adjustment of the reputation weights and it boasts a computational complexity of O(n) (n is the number of the clients). We conducted extensive experiments on the MNIST and CIFAR-10 datasets under 5 types of attacks. The results exhibit superior accuracy and resilience against adversarial attacks, all while maintaining comparatively low computational complexity. Among these, on the Non-IID CIFAR-10 dataset, our algorithm exhibited performance enhancements of 19.5% and 14.5%, respectively, in comparison to the state-of-the-art algorithm when dealing with two types of data poisoning attacks.
Submitted 19 May, 2024;
originally announced May 2024.
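The Fed-Credit abstract above describes weighting client contributions by the similarity between local and global models, with a time-decayed credibility. The following is a minimal illustrative sketch of that general idea only. The update rule, the `decay` parameter, and the function names are assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def update_credibility(previous, local_model, global_model, decay=0.5):
    """Hypothetical rule: blend the old credibility with the current similarity.

    Negative similarities are clipped to zero so that a client pointing away
    from the global model gains no credibility in this round.
    """
    score = max(0.0, cosine(local_model, global_model))
    return decay * previous + (1.0 - decay) * score


def aggregate(local_models, credibility):
    """Credibility-weighted average of the clients' parameter vectors."""
    total = sum(credibility)
    if total == 0.0:
        raise ValueError("all credibility weights are zero")
    dim = len(local_models[0])
    return [
        sum(c * model[i] for c, model in zip(credibility, local_models)) / total
        for i in range(dim)
    ]
```

Each round costs one similarity computation per client, which is linear in the number of clients, in line with the O(n) complexity the abstract mentions.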
-
Revisiting Multimodal Emotion Recognition in Conversation from the Perspective of Graph Spectrum
Authors:
Tao Meng,
Fuchen Zhang,
Yuntao Shou,
Wei Ai,
Nan Yin,
Keqin Li
Abstract:
Efficiently capturing consistent and complementary semantic features in a multimodal conversation context is crucial for Multimodal Emotion Recognition in Conversation (MERC). Existing methods mainly use graph structures to model dialogue context semantic dependencies and employ Graph Neural Networks (GNN) to capture multimodal semantic features for emotion recognition. However, these methods are limited by some inherent characteristics of GNN, such as over-smoothing and low-pass filtering, resulting in the inability to learn long-distance consistency information and complementary information efficiently. Since consistency and complementarity information correspond to low-frequency and high-frequency information, respectively, this paper revisits the problem of multimodal emotion recognition in conversation from the perspective of the graph spectrum. Specifically, we propose a Graph-Spectrum-based Multimodal Consistency and Complementary collaborative learning framework GS-MCC. First, GS-MCC uses a sliding window to construct a multimodal interaction graph to model conversational relationships and uses efficient Fourier graph operators to extract long-distance high-frequency and low-frequency information, respectively. Then, GS-MCC uses contrastive learning to construct self-supervised signals that reflect complementarity and consistent semantic collaboration with high and low-frequency signals, thereby improving the ability of high and low-frequency information to reflect real emotions. Finally, GS-MCC inputs the collaborative high and low-frequency information into the MLP network and softmax function for emotion prediction. Extensive experiments have proven the superiority of the GS-MCC architecture proposed in this paper on two benchmark data sets.
Submitted 2 May, 2024; v1 submitted 27 April, 2024;
originally announced April 2024.
-
Revisiting Multi-modal Emotion Learning with Broad State Space Models and Probability-guidance Fusion
Authors:
Yuntao Shou,
Tao Meng,
Fuchen Zhang,
Nan Yin,
Keqin Li
Abstract:
Multi-modal Emotion Recognition in Conversation (MERC) has received considerable attention in various fields, e.g., human-computer interaction and recommendation systems. Most existing works perform feature disentanglement and fusion to extract emotional contextual information from multi-modal features for emotion classification. After revisiting the characteristics of MERC, we argue that long-range contextual semantic information should be extracted in the feature disentanglement stage and the inter-modal semantic information consistency should be maximized in the feature fusion stage. Among recent State Space Models (SSMs), Mamba can efficiently model long-distance dependencies. Therefore, in this work, we fully consider the above insights to further improve the performance of MERC. Specifically, on the one hand, in the feature disentanglement stage, we propose a Broad Mamba, which does not rely on a self-attention mechanism for sequence modeling, but uses state space models to compress emotional representation, and utilizes broad learning systems to explore the potential data distribution in broad space. Different from previous SSMs, we design a bidirectional SSM convolution to extract global context information. On the other hand, we design a multi-modal fusion strategy based on probability guidance to maximize the consistency of information between modalities. Experimental results show that the proposed method can overcome the computational and memory limitations of Transformer when modeling long-distance contexts, and has great potential to become a next-generation general architecture in MERC.
Submitted 2 May, 2024; v1 submitted 27 April, 2024;
originally announced April 2024.
-
Leveraging viscous Hamilton-Jacobi PDEs for uncertainty quantification in scientific machine learning
Authors:
Zongren Zou,
Tingwei Meng,
Paula Chen,
Jérôme Darbon,
George Em Karniadakis
Abstract:
Uncertainty quantification (UQ) in scientific machine learning (SciML) combines the strong predictive power of SciML with methods for quantifying the reliability of the learned models. However, two major challenges remain: limited interpretability and expensive training procedures. We provide a new interpretation for UQ problems by establishing a new theoretical connection between some Bayesian inference problems arising in SciML and viscous Hamilton-Jacobi partial differential equations (HJ PDEs). Namely, we show that the posterior mean and covariance can be recovered from the spatial gradient and Hessian of the solution to a viscous HJ PDE. As a first exploration of this connection, we specialize to Bayesian inference problems with linear models, Gaussian likelihoods, and Gaussian priors. In this case, the associated viscous HJ PDEs can be solved using Riccati ODEs, and we develop a new Riccati-based methodology that provides computational advantages when continuously updating the model predictions. Specifically, our Riccati-based approach can efficiently add or remove data points to the training set invariant to the order of the data and continuously tune hyperparameters. Moreover, neither update requires retraining on or access to previously incorporated data. We provide several examples from SciML involving noisy data and epistemic uncertainty to illustrate the potential advantages of our approach. In particular, this approach's amenability to data streaming applications demonstrates its potential for real-time inferences, which, in turn, allows for applications in which the predicted uncertainty is used to dynamically alter the learning process.
Submitted 12 April, 2024;
originally announced April 2024.
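The abstract above specializes to linear models with Gaussian likelihoods and Gaussian priors, where data points can be added or removed in any order without revisiting earlier data. The sketch below shows that order-invariance with the textbook conjugate update for a scalar model y = w·x + noise. It is not the paper's Riccati-based method, and the class and attribute names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScalarGaussianPosterior:
    """Posterior over a scalar weight w, stored in information form."""

    precision: float        # 1 / posterior variance
    shift: float            # precision * posterior mean
    noise_var: float = 1.0  # assumed known observation-noise variance

    @classmethod
    def from_prior(cls, mean, var, noise_var=1.0):
        return cls(precision=1.0 / var, shift=mean / var, noise_var=noise_var)

    def add(self, x, y):
        """Incorporate one observation (x, y); earlier data is not needed."""
        self.precision += x * x / self.noise_var
        self.shift += x * y / self.noise_var

    def remove(self, x, y):
        """Undo a previously added observation by subtracting its terms."""
        self.precision -= x * x / self.noise_var
        self.shift -= x * y / self.noise_var

    @property
    def mean(self):
        return self.shift / self.precision

    @property
    def variance(self):
        return 1.0 / self.precision
```

Because each observation only adds terms to two running sums, the result does not depend on the order of insertion, and removal is a subtraction of the same terms.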
-
Monotonic Paraphrasing Improves Generalization of Language Model Prompting
Authors:
Qin Liu,
Fei Wang,
Nan Xu,
Tianyi Yan,
Tao Meng,
Muhao Chen
Abstract:
The performance of large language models (LLMs) may vary with different prompts or instructions, even for the same task. One commonly recognized factor for this phenomenon is the model's familiarity with the given prompt or instruction, which is typically estimated by its perplexity. However, finding the prompt with the lowest perplexity is challenging, given the enormous space of possible prompting phrases. In this paper, we propose monotonic paraphrasing (MonoPara), an end-to-end decoding strategy that paraphrases given prompts or instructions into their lower perplexity counterparts based on an ensemble of a paraphrase LM for prompt (or instruction) rewriting, and a target LM (i.e., the prompt or instruction executor) that constrains the generation for lower perplexity. The ensemble decoding process can efficiently paraphrase the original prompt without altering its semantic meaning, while monotonically decreasing the perplexity of each generation as calculated by the target LM. We explore in detail both greedy and search-based decoding as two alternative decoding schemes of MonoPara. Notably, MonoPara does not require any training and can monotonically lower the perplexity of the paraphrased prompt or instruction, leading to improved performance of zero-shot LM prompting as evaluated on a wide selection of tasks. In addition, MonoPara is also shown to effectively improve LMs' generalization on perturbed and unseen task instructions.
Submitted 18 April, 2024; v1 submitted 24 March, 2024;
originally announced March 2024.
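The MonoPara abstract above uses perplexity as a proxy for a model's familiarity with a prompt. As a small illustration of that quantity only (not of the MonoPara decoding strategy), the sketch below computes perplexity from per-token log-probabilities and picks the lowest-perplexity candidate. The function names and the input format are assumptions, and the log-probabilities would in practice come from the target LM.

```python
import math


def perplexity(token_log_probs):
    """Perplexity is exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def pick_lowest_perplexity(candidates):
    """Return the prompt whose token log-probabilities give the lowest perplexity.

    candidates: dict mapping prompt text -> list of per-token log-probabilities.
    """
    return min(candidates, key=lambda prompt: perplexity(candidates[prompt]))
```

Scoring an explicit list of candidates like this is only feasible for a handful of paraphrases, which is the difficulty the abstract points to when it mentions the enormous space of possible prompting phrases.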
-
Few-Shot Image Classification and Segmentation as Visual Question Answering Using Vision-Language Models
Authors:
Tian Meng,
Yang Tao,
Ruilin Lyu,
Wuliang Yin
Abstract:
The task of few-shot image classification and segmentation (FS-CS) involves classifying and segmenting target objects in a query image, given only a few examples of the target classes. We introduce the Vision-Instructed Segmentation and Evaluation (VISE) method that transforms the FS-CS problem into the Visual Question Answering (VQA) problem, utilising Vision-Language Models (VLMs), and addresses it in a training-free manner. By enabling a VLM to interact with off-the-shelf vision models as tools, the proposed method is capable of classifying and segmenting target objects using only image-level labels. Specifically, chain-of-thought prompting and in-context learning guide the VLM to answer multiple-choice questions like a human; vision models such as YOLO and Segment Anything Model (SAM) assist the VLM in completing the task. The modular framework of the proposed method makes it easily extendable. Our approach achieves state-of-the-art performance on the Pascal-5i and COCO-20i datasets.
Submitted 15 March, 2024;
originally announced March 2024.
-
A Primal-dual hybrid gradient method for solving optimal control problems and the corresponding Hamilton-Jacobi PDEs
Authors:
Tingwei Meng,
Siting Liu,
Wuchen Li,
Stanley Osher
Abstract:
Optimal control problems are crucial in various domains, including path planning, robotics, and humanoid control, demonstrating their broad applicability. The connection between optimal control and Hamilton-Jacobi (HJ) partial differential equations (PDEs) underscores the need for solving HJ PDEs to address these control problems effectively. While numerous numerical methods exist for tackling HJ PDEs across different dimensions, this paper introduces an innovative optimization-based approach that reformulates optimal control problems and HJ PDEs into a saddle point problem using a Lagrange multiplier. Our method, based on the preconditioned primal-dual hybrid gradient (PDHG) method, offers a solution to HJ PDEs with first-order accuracy and unconditional numerical stability, enabling larger time steps and avoiding the limitations of explicit time discretization methods. Our approach has the ability to handle a wide variety of Hamiltonian functions, including those that are non-smooth and dependent on time and space, through a simplified saddle point formulation that facilitates easy and parallelizable updates. Furthermore, our framework extends to viscous HJ PDEs and stochastic optimal control problems, showcasing its versatility. Through a series of numerical examples, we demonstrate the method's effectiveness in managing diverse Hamiltonians and achieving efficient parallel computation, highlighting its potential for wide-ranging applications in optimal control and beyond.
Submitted 4 March, 2024;
originally announced March 2024.
-
Giant quantum oscillations in thermal transport in low-density metals via electron absorption of phonons
Authors:
B. Bermond,
R. Wawrzynczak,
S. Zherlitsyn,
T. Kotte,
T. Helm,
D. Gorbunov,
G. D. Gu,
Q. Li,
F. Janasz,
T. Meng,
F. Menges,
C. Felser,
J. Wosnitza,
Adolfo G. Grushin,
David Carpentier,
J. Gooth,
S. Galeski
Abstract:
Oscillations of conductance observed in strong magnetic fields are a striking manifestation of the quantum dynamics of charge carriers in solids. The large charge carrier density in typical metals sets the scale of oscillations in both electrical and thermal conductivity, which characterize the Fermi surface. In semimetals, thermal transport at low-charge carrier density is expected to be phonon dominated, yet several experiments observe giant quantum oscillations in thermal transport. This raises the question of whether there is an overarching mechanism leading to sizable oscillations that survives in phonon-dominated semimetals. In this work, we show that such a mechanism exists. It relies on the peculiar phase-space allowed for phonon scattering by electrons when only a few Landau levels are filled. Our measurements on the Dirac semimetal ZrTe5 support this counter-intuitive mechanism through observation of pronounced thermal quantum oscillations, since they occur in similar magnitude and phase in directions parallel and transverse to the magnetic field. Our phase-space argument applies to all low-density semimetals, topological or not, including graphene and bismuth. Our work illustrates that phonon absorption can be leveraged to reveal degrees of freedom through their imprint on longitudinal thermal transport.
Submitted 26 February, 2024;
originally announced February 2024.
-
Fast Butterfly-Core Community Search For Large Labeled Graphs
Authors:
JiaYi Du,
Yinghao Wu,
Wei Ai,
Tao Meng,
CanHao Xie,
KeQin Li
Abstract:
Community Search (CS) aims to identify densely interconnected subgraphs corresponding to query vertices within a graph. However, existing heterogeneous graph-based community search methods struggle to identify cross-group communities and suffer from efficiency issues, making them unsuitable for large graphs. This paper presents a fast community search model based on the Butterfly-Core Community (BCC) structure for heterogeneous graphs. The Random Walk with Restart (RWR) algorithm and butterfly degree are used to comprehensively evaluate the importance of vertices within communities, allowing leader vertices to be rapidly updated to maintain cross-group cohesion. Moreover, we devise a more efficient method for updating vertex distances, which minimizes vertex visits and enhances operational efficiency. Extensive experiments on several real-world temporal graphs demonstrate the effectiveness and efficiency of this solution.
Submitted 19 January, 2024;
originally announced January 2024.
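The abstract above names Random Walk with Restart (RWR) as one ingredient for scoring vertex importance. The following is a minimal sketch of plain RWR by power iteration on an unlabeled adjacency list. It ignores vertex labels and butterfly degree, and the function name, the `restart` value, and the handling of dangling vertices are assumptions rather than details from the paper.

```python
def random_walk_with_restart(adj, seed, restart=0.15, iters=100):
    """Approximate RWR scores relative to `seed` by power iteration.

    adj: dict mapping each vertex to a list of its out-neighbours.
    At every step the walker jumps back to `seed` with probability `restart`,
    otherwise it moves to a uniformly chosen neighbour.
    """
    score = {node: 0.0 for node in adj}
    score[seed] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in adj}
        for node, neighbours in adj.items():
            mass = (1.0 - restart) * score[node]
            if neighbours:
                share = mass / len(neighbours)
                for other in neighbours:
                    nxt[other] += share
            else:
                # Assumption: a walker stuck on a dangling vertex returns to the seed.
                nxt[seed] += mass
        nxt[seed] += restart  # restart mass; scores always sum to 1
        score = nxt
    return score
```

For example, `random_walk_with_restart({"a": ["b"], "b": ["a", "c"], "c": ["b"]}, "a")` returns a score per vertex, with vertices that are better connected to the seed scoring higher than remote ones.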
-
An Effective Index for Truss-based Community Search on Large Directed Graphs
Authors:
Wei Ai,
CanHao Xie,
Tao Meng,
Yinghao Wu,
KeQin Li
Abstract:
Community search is a derivative of community detection that enables online and personalized discovery of communities and has found extensive applications in massive real-world networks. However, community search on directed graphs has received comparatively little attention, even though substantial research has been carried out on undirected graphs. The recently proposed D-truss model has achieved good results in the quality of retrieved communities. However, existing D-truss-based work cannot perform efficient community searches on large graphs because it consumes too many computing resources to retrieve the maximal D-truss. To overcome this issue, we introduce an innovative merge relation known as D-truss-connected to capture the inherent density and cohesiveness of edges within D-truss. This relation allows us to partition all the edges in the original graph into a series of D-truss-connected classes. Then, we construct a concise and compact index, ConDTruss, based on D-truss-connected. Using ConDTruss, the efficiency of maximum D-truss retrieval is greatly improved, making it a theoretically optimal approach. Experimental evaluations conducted on large directed graphs confirm the effectiveness of our proposed method.
Submitted 19 January, 2024;
originally announced January 2024.
-
Deciphering Interphase Instability of Lithium Metal Batteries with Localized High-Concentration Electrolytes at Elevated Temperatures
Authors:
Tao Meng,
Shanshan Yang,
Yitong Peng,
Xiwei Lan,
Pingan Li,
Kangjia Hu,
Xianluo Hu
Abstract:
Lithium metal batteries (LMBs), when coupled with a localized high-concentration electrolyte and a high-voltage nickel-rich cathode, offer a solution to the increasing demand for high energy density and long cycle life. However, the aggressive electrode chemistry poses safety risks to LMBs at higher temperatures and cutoff voltages. Here, we decipher the interphase instability in LHCE-based LMBs with a Ni0.8Co0.1Mn0.1O2 cathode at elevated temperatures. Our findings reveal that the generation of fluorine radicals in the electrolyte induces the solvent decomposition and consequent chain reactions, thereby reconstructing the cathode electrolyte interphase (CEI) and degrading battery cyclability. As further evidenced, introducing an acid scavenger of dimethoxydimethylsilane (DODSi) significantly boosts CEI stability with suppressed microcracking. A Ni0.8Co0.1Mn0.1O2||Li cell with this DODSi-functionalized LHCE achieves an unprecedented capacity retention of 93.0% after 100 cycles at 80 °C. This research provides insights into electrolyte engineering for practical LMBs with high safety under extreme temperatures.
Submitted 11 January, 2024;
originally announced January 2024.
-
A Two-Stage Multimodal Emotion Recognition Model Based on Graph Contrastive Learning
Authors:
Wei Ai,
FuChen Zhang,
Tao Meng,
YunTao Shou,
HongEn Shao,
Keqin Li
Abstract:
In terms of human-computer interaction, it is becoming more and more important to correctly understand the user's emotional state in a conversation, so the task of multimodal emotion recognition (MER) started to receive more attention. However, existing emotion classification methods usually perform classification only once. Sentences are likely to be misclassified in a single round of classification. Previous work usually ignores the similarities and differences between different morphological features in the fusion process. To address the above issues, we propose a two-stage emotion recognition model based on graph contrastive learning (TS-GCL). First, we encode the original dataset with different preprocessing modalities. Second, a graph contrastive learning (GCL) strategy is introduced for these three modal data with other structures to learn similarities and differences within and between modalities. Finally, we use MLP twice to achieve the final emotion classification. This staged classification method can help the model to better focus on different levels of emotional information, thereby improving the performance of the model. Extensive experiments show that TS-GCL has superior performance on IEMOCAP and MELD datasets compared with previous methods.
Submitted 2 January, 2024;
originally announced January 2024.
-
Adversarial Representation with Intra-Modal and Inter-Modal Graph Contrastive Learning for Multimodal Emotion Recognition
Authors:
Yuntao Shou,
Tao Meng,
Wei Ai,
Keqin Li
Abstract:
With the release of a growing number of open-source emotion recognition datasets on social media platforms and the rapid development of computing resources, multimodal emotion recognition (MER) tasks have begun to receive widespread research attention. The MER task extracts and fuses complementary semantic information from different modalities in order to classify the speaker's emotions. However, existing feature fusion methods usually map the features of different modalities into the same feature space for information fusion, which cannot eliminate the heterogeneity between different modalities and therefore makes the subsequent learning of emotion class boundaries challenging. To tackle the above problems, we propose a novel Adversarial Representation with Intra-Modal and Inter-Modal Graph Contrastive for Multimodal Emotion Recognition (AR-IIGCN) method. Firstly, we input video, audio, and text features into a multi-layer perceptron (MLP) to map them into separate feature spaces. Secondly, we build a generator and a discriminator for the three modal features through adversarial representation, which can achieve information interaction between modalities and eliminate heterogeneity among modalities. Thirdly, we introduce contrastive graph representation learning to capture intra-modal and inter-modal complementary semantic information and learn intra-class and inter-class boundary information of emotion categories. Specifically, we construct a graph structure for the three modal features and perform contrastive representation learning on nodes with different emotions in the same modality and the same emotion in different modalities, which can improve the feature representation ability of nodes. Extensive experiments show that the AR-IIGCN method can significantly improve emotion recognition accuracy on the IEMOCAP and MELD datasets.
Submitted 27 December, 2023;
originally announced December 2023.
-
DER-GCN: Dialogue and Event Relation-Aware Graph Convolutional Neural Network for Multimodal Dialogue Emotion Recognition
Authors:
Wei Ai,
Yuntao Shou,
Tao Meng,
Keqin Li
Abstract:
With the continuous development of deep learning (DL), the task of multimodal dialogue emotion recognition (MDER), an essential branch of DL, has recently received extensive research attention. MDER aims to identify the emotional information contained in different modalities, e.g., text, video, and audio, in different dialogue scenes. However, existing research has focused on modeling contextual semantic information and dialogue relations between speakers while ignoring the impact of event relations on emotion. To tackle these issues, we propose a novel Dialogue and Event Relation-Aware Graph Convolutional Neural Network for Multimodal Emotion Recognition (DER-GCN) method. It models dialogue relations between speakers and captures latent event relation information. Specifically, we construct a weighted multi-relationship graph to simultaneously capture the dependencies between speakers and event relations in a dialogue. We also introduce a Self-Supervised Masked Graph Autoencoder (SMGAE) to improve the fusion representation ability of features and structures. Next, we design a new Multiple Information Transformer (MIT) to capture the correlation between different relations, which better fuses the multivariate information between relations. Finally, we propose a loss optimization strategy based on contrastive learning to enhance the representation learning ability of minority class features. We conduct extensive experiments on the IEMOCAP and MELD benchmark datasets, which verify the effectiveness of the DER-GCN model. The results demonstrate that our model significantly improves both the average accuracy and the F1 value of emotion recognition.
Submitted 16 December, 2023;
originally announced December 2023.
-
Deep Imbalanced Learning for Multimodal Emotion Recognition in Conversations
Authors:
Tao Meng,
Yuntao Shou,
Wei Ai,
Nan Yin,
Keqin Li
Abstract:
The main task of Multimodal Emotion Recognition in Conversations (MERC) is to identify the emotions in modalities, e.g., text, audio, image and video, which is a significant development direction for realizing machine intelligence. However, much of the data in MERC naturally exhibits an imbalanced distribution of emotion categories, and researchers ignore the negative impact of imbalanced data on emotion recognition. To tackle this problem, we systematically analyze it from three aspects: data augmentation, loss sensitivity, and sampling strategy, and propose the Class Boundary Enhanced Representation Learning (CBERL) model. Concretely, we first design a multimodal generative adversarial network to address the imbalanced distribution of emotion categories in raw data. Secondly, a deep joint variational autoencoder is proposed to fuse complementary semantic information across modalities and obtain discriminative feature representations. Finally, we implement a multi-task graph neural network with mask reconstruction and classification optimization to solve the problem of overfitting and underfitting in class boundary learning, and achieve cross-modal emotion recognition. We have conducted extensive experiments on the IEMOCAP and MELD benchmark datasets, and the results show that CBERL achieves a certain performance improvement in emotion recognition. In particular, on the minority-class fear and disgust emotion labels, our model improves the accuracy and F1 value by 10% to 20%.
Submitted 11 December, 2023;
originally announced December 2023.
-
A Comprehensive Survey on Multi-modal Conversational Emotion Recognition with Deep Learning
Authors:
Yuntao Shou,
Tao Meng,
Wei Ai,
Nan Yin,
Keqin Li
Abstract:
Multi-modal conversation emotion recognition (MCER) aims to recognize and track the speaker's emotional state using text, speech, and visual information in the conversation scene. Analyzing and studying MCER issues is significant to affective computing, intelligent recommendations, and human-computer interaction fields. Unlike the traditional single-utterance multi-modal emotion recognition or single-modal conversation emotion recognition, MCER is a more challenging problem that needs to deal with more complex emotional interaction relationships. The critical issue is learning consistency and complementary semantics for multi-modal feature fusion based on emotional interaction relationships. To solve this problem, people have conducted extensive research on MCER based on deep learning technology, but there is still a lack of systematic review of the modeling methods. Therefore, a timely and comprehensive overview of MCER's recent advances in deep learning is of great significance to academia and industry. In this survey, we provide a comprehensive overview of MCER modeling methods and roughly divide MCER methods into four categories, i.e., context-free modeling, sequential context modeling, speaker-differentiated modeling, and speaker-relationship modeling. In addition, we further discuss MCER's publicly available popular datasets, multi-modal feature extraction methods, application areas, existing challenges, and future development directions. We hope that our review can help MCER researchers understand the current research status in emotion recognition, provide some inspiration, and develop more efficient models.
Submitted 9 December, 2023;
originally announced December 2023.
-
Graph Information Bottleneck for Remote Sensing Segmentation
Authors:
Yuntao Shou,
Wei Ai,
Tao Meng
Abstract:
Remote sensing segmentation has a wide range of applications, including environmental protection and urban change detection. Despite the success of deep learning-based remote sensing segmentation methods (e.g., CNN and Transformer), they are not flexible enough to model irregular objects. In addition, existing graph contrastive learning methods usually maximize mutual information to keep the node representations consistent between different graph views, which may cause the model to learn task-independent redundant information. To tackle these problems, this paper treats images as graph structures and introduces a simple contrastive vision GNN (SC-ViG) architecture for remote sensing segmentation. Specifically, we construct a node-masked and edge-masked graph view to obtain an optimal graph structure representation, which can adaptively learn whether to mask nodes and edges. Furthermore, this paper introduces information bottleneck theory into graph contrastive learning to maximize task-related information while minimizing task-independent redundant information. Finally, we replace the convolutional module in UNet with the SC-ViG module to complete the segmentation and classification tasks of remote sensing images. Extensive experiments on publicly available real datasets demonstrate that our method outperforms state-of-the-art remote sensing image segmentation methods.
Submitted 5 December, 2023;
originally announced December 2023.
-
MedDM: LLM-executable clinical guidance tree for clinical decision-making
Authors:
Binbin Li,
Tianxin Meng,
Xiaoming Shi,
Jie Zhai,
Tong Ruan
Abstract:
Increasing emphasis is being placed on the importance of LLMs participating in clinical diagnosis decision-making. However, current medical LLMs have low specialization: they cannot provide specific medical advice and behave more like a medical Q&A system. In addition, there is no suitable clinical guidance tree dataset that can be used directly with LLMs. To address this issue, we first propose the LLM-executable clinical guidance tree (CGT), which can be directly used by large language models, and construct a medical diagnostic decision-making dataset (MedDM) from flowcharts in clinical practice guidelines. We propose an approach to screen flowcharts from medical literature, followed by their identification and conversion into standardized diagnostic decision trees. We constructed a knowledge base with 1202 decision trees, which came from 5000 pieces of medical literature and covered 12 hospital departments, including internal medicine, surgery, and psychiatry, and over 500 diseases. Moreover, we propose a method for reasoning on LLM-executable CGT and a Patient-LLM multi-turn dialogue framework.
Submitted 4 December, 2023;
originally announced December 2023.
-
CILF-CIAE: CLIP-driven Image-Language Fusion for Correcting Inverse Age Estimation
Authors:
Yuntao Shou,
Wei Ai,
Tao Meng,
Keqin Li
Abstract:
The age estimation task aims to predict the age of an individual by analyzing facial features in an image. The development of age estimation can improve the efficiency and accuracy of various applications (e.g., age verification and secure access control). In recent years, contrastive language-image pre-training (CLIP) has been widely used in various multimodal tasks and has made some progress in the field of age estimation. However, existing CLIP-based age estimation methods require high memory usage (quadratic complexity) when globally modeling images, and lack an error feedback mechanism to prompt the model about the quality of age prediction results. To tackle these issues, we propose a novel CLIP-driven Image-Language Fusion for Correcting Inverse Age Estimation (CILF-CIAE). Specifically, we first introduce the CLIP model to extract image features and text semantic information respectively, and map them into a highly semantically aligned high-dimensional feature space. Next, we design a new Transformer architecture (i.e., FourierFormer) to achieve channel evolution and spatial interaction of images, and to fuse image and text semantic information. Compared with the quadratic complexity of the attention mechanism, the proposed FourierFormer has log-linear complexity. To further narrow the semantic gap between image and text features, we utilize an efficient contrastive multimodal learning module that supervises the multimodal fusion process of FourierFormer through contrastive loss for image-text matching, thereby improving the interaction effect between different modalities. Finally, we introduce reversible age estimation, which uses end-to-end error feedback to reduce the error rate of age predictions. Through extensive experiments on multiple datasets, CILF-CIAE has achieved better age prediction results.
Submitted 1 July, 2024; v1 submitted 4 December, 2023;
originally announced December 2023.
-
Few-Shot Classification & Segmentation Using Large Language Models Agent
Authors:
Tian Meng,
Yang Tao,
Wuliang Yin
Abstract:
The task of few-shot image classification and segmentation (FS-CS) requires the classification and segmentation of target objects in a query image, given only a few examples of the target classes. We introduce a method that utilises large language models (LLMs) as an agent to address the FS-CS problem in a training-free manner. By making the LLM the task planner and off-the-shelf vision models the tools, the proposed method is capable of classifying and segmenting target objects using only image-level labels. Specifically, chain-of-thought prompting and in-context learning guide the LLM to observe support images like a human; vision models such as Segment Anything Model (SAM) and GPT-4Vision help the LLM understand spatial and semantic information at the same time. Ultimately, the LLM uses its summarizing and reasoning capabilities to classify and segment the query image. The proposed method's modular framework makes it easily extendable. Our approach achieves state-of-the-art performance on the Pascal-5i dataset.
Submitted 18 November, 2023;
originally announced November 2023.
-
Leveraging Hamilton-Jacobi PDEs with time-dependent Hamiltonians for continual scientific machine learning
Authors:
Paula Chen,
Tingwei Meng,
Zongren Zou,
Jérôme Darbon,
George Em Karniadakis
Abstract:
We address two major challenges in scientific machine learning (SciML): interpretability and computational efficiency. We increase the interpretability of certain learning processes by establishing a new theoretical connection between optimization problems arising from SciML and a generalized Hopf formula, which represents the viscosity solution to a Hamilton-Jacobi partial differential equation (HJ PDE) with time-dependent Hamiltonian. Namely, we show that when we solve certain regularized learning problems with integral-type losses, we actually solve an optimal control problem and its associated HJ PDE with time-dependent Hamiltonian. This connection allows us to reinterpret incremental updates to learned models as the evolution of an associated HJ PDE and optimal control problem in time, where all of the previous information is intrinsically encoded in the solution to the HJ PDE. As a result, existing HJ PDE solvers and optimal control algorithms can be reused to design new efficient training approaches for SciML that naturally coincide with the continual learning framework, while avoiding catastrophic forgetting. As a first exploration of this connection, we consider the special case of linear regression and leverage our connection to develop a new Riccati-based methodology for solving these learning problems that is amenable to continual learning applications. We also provide some corresponding numerical examples that demonstrate the potential computational and memory advantages our Riccati-based approach can provide.
Submitted 6 May, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Primal-dual hybrid gradient algorithms for computing time-implicit Hamilton-Jacobi equations
Authors:
Tingwei Meng,
Wenbo Hao,
Siting Liu,
Stanley J. Osher,
Wuchen Li
Abstract:
Hamilton-Jacobi (HJ) partial differential equations (PDEs) have diverse applications spanning physics, optimal control, game theory, and imaging sciences. This research introduces a first-order optimization-based technique for HJ PDEs, which formulates the time-implicit update of HJ PDEs as saddle point problems. We remark that the saddle point formulation for HJ equations is aligned with the primal-dual formulation of optimal transport and potential mean-field games (MFGs). This connection enables us to extend MFG techniques and design numerical schemes for solving HJ PDEs. We employ the primal-dual hybrid gradient (PDHG) method to solve the saddle point problems, benefiting from the simple structures that enable fast computations in updates. Remarkably, the method caters to a broader range of Hamiltonians, encompassing non-smooth and spatiotemporally dependent cases. The approach's effectiveness is verified through various numerical examples in both one and two dimensions, such as quadratic and $L^1$ Hamiltonians with spatial and time dependence.
Submitted 2 October, 2023;
originally announced October 2023.
-
Nonlinear response functions and disorder: the case of photogalvanic effect
Authors:
Konstantinos Ladovrechis,
Tobias Meng
Abstract:
We investigate the impact of impurity scattering on a generalized version of the circular photogalvanic effect (CPGE) in Weyl semimetals where the frequency detuning between the two orthogonally polarized beams is non-zero. Considering a minimal model with two Weyl nodes at different energies, we employ the self-consistent Born approximation (SCBA) to unravel the dependence of the associated complex correlation function on the strength of intra- and internode scattering, frequency detuning and energy difference between the two Weyl nodes. In the case of intranode scattering only, the optical response acquires Drude-like features, which we elucidate further by introducing an effective scattering strength. The Drude-like theory can even describe the response in the presence of strong internode scattering if the latter has a fixed proportionality factor to the intranode scattering. By properly adjusting the frequency detuning, we also find the imaginary part of the response function to be reminiscent of a "quantized CPGE-like" form, although the real part of the response function is in general finite, and the total optical response oscillates with time due to the finite frequency offset. We finally conclude with an outlook on possible experimental consequences.
Submitted 27 September, 2023;
originally announced September 2023.
-
Anomalous Shubnikov-de Haas effect and observation of the Bloch-Grüneisen temperature in the Dirac semimetal ZrTe5
Authors:
S. Galeski,
K. Araki,
O. K. Forslund,
R. Wawrzynczak,
H. F. Legg,
P. K. Sivakumar,
U. Miniotaite,
F. Elson,
M. Månsson,
C. Witteveen,
F. O. von Rohr,
A. Q. R. Baron,
D. Ishikawa,
Q. Li,
G. Gu,
L. X. Zhao,
W. L. Zhu,
G. F. Chen,
Y. Wang,
S. S. P. Parkin,
D. Gorbunov,
S. Zherlitsyn,
B. Vlaar,
D. H. Nguyen,
S. Paschen
, et al. (7 additional authors not shown)
Abstract:
The appearance of quantum oscillations (QO) in both thermodynamic and transport properties of metals at low temperatures is the most striking experimental consequence of the existence of a Fermi surface (FS). The frequency of these oscillations and the temperature dependence of their amplitude provide essential information about the FS topology and fermionic quasiparticle properties. Here, we report the observation of an anomalous suppression of the QO amplitude seen in resistivity (Shubnikov-de Haas effect) at sub-kelvin temperatures in ZrTe$_5$ samples with a single small FS sheet comprising less than 5% of the first Brillouin zone. By comparing these results with measurements of the magneto-acoustic QO and the recovery of the usual Lifshitz-Kosevich behavior of the Shubnikov-de Haas (SdH) effect in ZrTe$_5$ samples with a multi-sheet FS, we show that the suppression of the SdH effect originates from a decoupling of the electron liquid from the lattice. On crossing the so-called Bloch-Grüneisen temperature, $T_{BG}$, electron-phonon scattering becomes strongly suppressed and in the absence of Umklapp scattering the electronic liquid regains Galilean invariance. In addition, we show, using a combination of zero-field electrical conductivity and ultrasonic-absorption measurements, that entering this regime leads to an abrupt increase of electronic viscosity.
Submitted 31 January, 2024; v1 submitted 19 September, 2023;
originally announced September 2023.
-
Localization-Guided Track: A Deep Association Multi-Object Tracking Framework Based on Localization Confidence of Detections
Authors:
Ting Meng,
Chunyun Fu,
Mingguang Huang,
Xiyang Wang,
Jiawei He,
Tao Huang,
Wankai Shi
Abstract:
In the currently available literature, no tracking-by-detection (TBD) paradigm-based tracking method has considered the localization confidence of detection boxes. In most TBD-based methods, it is considered that objects of low detection confidence are highly occluded and thus it is a normal practice to directly disregard such objects or to reduce their priority in matching. In addition, appearance similarity is not a factor to consider for matching these objects. However, in terms of the detection confidence fusing classification and localization, objects of low detection confidence may have inaccurate localization but clear appearance; similarly, objects of high detection confidence may have inaccurate localization or unclear appearance; yet these objects are not further classified. In view of these issues, we propose Localization-Guided Track (LG-Track). Firstly, localization confidence is applied in MOT for the first time, with appearance clarity and localization accuracy of detection boxes taken into account, and an effective deep association mechanism is designed; secondly, based on the classification confidence and localization confidence, a more appropriate cost matrix can be selected and used; finally, extensive experiments have been conducted on the MOT17 and MOT20 datasets. The results show that our proposed method outperforms the compared state-of-the-art tracking methods. For the benefit of the community, our code has been made publicly available at https://github.com/mengting2023/LG-Track.
Submitted 18 September, 2023;
originally announced September 2023.
-
Safety Guaranteed Control for Spacecraft Inspection Mission
Authors:
Kun Wang,
Tao Meng,
Jiakun Lei,
Weijia Wang
Abstract:
This paper investigates the safety guaranteed problem in spacecraft inspection missions, considering multiple position obstacles and logical attitude forbidden zones. In order to address this issue, we propose a control strategy based on control barrier functions, summarized as "safety check on kinematics" and "velocity tracking on dynamics" approach. The proposed approach employs control barrier functions to describe the obstacles and to generate safe velocities via the solution of a quadratic programming problem. Subsequently, we design a proportional-like controller based on the generated velocity, which, despite its simplicity, can ensure safety even in the presence of velocity tracking errors. The stability and safety of the system are rigorously analyzed in this paper. Furthermore, to account for model uncertainties and external disturbances, we incorporate an immersion and invariance-based disturbance observer in our design. Finally, numerical simulations are performed to demonstrate the effectiveness of the proposed control strategy.
Submitted 8 June, 2023;
originally announced June 2023.
-
Adaptive Reduced-Attitude Control for Spacecraft Boresight Alignment with Safety Constraints and Accuracy Requirements
Authors:
Jiakun Lei,
Tao Meng,
Kun Wang,
Weijia Wang,
Shujian Sun,
Lei Wang
Abstract:
This paper investigates the boresight alignment control problem under safety constraints and performance requirements, involving pointing-forbidden constraint, attitude angular velocity limitation, and pointing accuracy requirement. Meanwhile, the parameter uncertainty issue is taken into account simultaneously. To address this problem, we propose a modified composite framework integrating the Artificial Potential Field (APF) methodology and the Prescribed Performance Control (PPC) scheme. The APF scheme ensures safety, while the PPC scheme is employed to realize an accuracy-guaranteed control. A Switched Prescribed Performance Function (SPPF) is proposed to facilitate the integration, which monitors various constraints and further establishes compatibility between safety and performance concerns by leveraging a special PPC freezing mechanism. To further address the parameter uncertainty, we introduce the Immersion-and-Invariance (I&I) adaptive control technique to derive an adaptive APF-PPC composite controller, further guaranteeing the closed-loop system's asymptotic convergence. Finally, numerical simulations are carried out to validate the effectiveness of the proposed scheme.
Submitted 18 September, 2023; v1 submitted 31 May, 2023;
originally announced May 2023.
-
Composite Triggered Intermittent Control for Constrained Spacecraft Attitude Tracking
Authors:
Jiakun Lei,
Tao Meng,
Kun Wang,
Weijia Wang,
Shujian Sun
Abstract:
This paper focuses on the spacecraft attitude control problem with intermittent actuator activation, taking into account the attitude rotation rate limitation and input saturation issue simultaneously. To address this problem, we first propose a composite event-trigger mechanism, which is composed of two state-dependent triggers that govern the activation and deactivation of actuators. Subsequently, by introducing the cascaded decomposition of the Backstepping control philosophy, the designed trigger mechanism is applied to the decomposed dynamical subsystem, providing a layered intermittent stabilization strategy. Further, the basic intermittent attitude controller is extended to a "constrained version" by introducing a strictly bounded virtual control law and an input saturation compensation auxiliary system. By analyzing the local boundedness of the system on each inter-event time interval, a uniformly, strictly decreasing upper boundary of the lumped system is further characterized, thereby completing the proof of the system's uniform ultimate boundedness (UUB). Finally, numerical simulation results are presented to demonstrate the effectiveness of the proposed scheme.
Submitted 31 May, 2023;
originally announced May 2023.
-
Adaptive Compatible Performance Control for Spacecraft Attitude Control under Motion Constraints with Guaranteed Accuracy
Authors:
Jiakun Lei,
Tao Meng,
Yang Zhu,
Kun Wang,
Weijia Wang
Abstract:
This paper focuses on the problem of spacecraft attitude control in the presence of time-varying parameter uncertainties and multiple constraints, accounting for angular velocity limitation, performance requirements, and input saturation. To tackle this problem, we propose a modified framework called Compatible Performance Control (CPC), which integrates the Prescribed Performance Control (PPC) scheme with a contradiction detection and alleviation strategy. Firstly, by introducing the Zeroing Barrier Function (ZBF) concept, we propose a detection strategy to judge the compatibility between the angular velocity constraint and the performance envelope constraint. Subsequently, we propose a projection operator-governed dynamical system with a varying upper bound to generate an appropriate bounded performance envelope-modification signal if a contradiction exists, thereby alleviating the contradiction and promoting compatibility within the system. Next, a dynamical filter technique is introduced to construct a bounded reference velocity signal to address the angular velocity limitation. Furthermore, we employ a time-varying gain technique to address the challenge posed by time-varying parameter uncertainties, further developing an adaptive strategy that exhibits robustness in disturbance rejection. By utilizing the proposed CPC scheme and time-varying gain adaptive strategy, we construct an adaptive CPC controller, which guarantees the ultimate boundedness of the system, with all constraints satisfied simultaneously during the whole control process. Finally, numerical simulation results are presented to show the effectiveness of the proposed framework.
Submitted 31 May, 2023;
originally announced May 2023.
-
Flock: Accurate network fault localization at scale
Authors:
Vipul Harsh,
Tong Meng,
Kapil Agrawal,
P. Brighten Godfrey
Abstract:
Inferring the root cause of failures among thousands of components in a data center network is challenging, especially for "gray" failures that are not reported directly by switches. Faults can be localized through end-to-end measurements, but past localization schemes are either too slow for large-scale networks or sacrifice accuracy. We describe Flock, a network fault localization algorithm and system that achieves both high accuracy and speed at datacenter scale. Flock uses a probabilistic graphical model (PGM) to achieve high accuracy, coupled with new techniques to dramatically accelerate inference in discrete-valued Bayesian PGMs. Large-scale simulations and experiments in a hardware testbed show Flock speeds up inference by >10000x compared to past PGM methods, and improves accuracy over the best previous datacenter fault localization approaches, reducing inference error by 1.19-11x on the same input telemetry, and by 1.2-55x after incorporating passive telemetry. We also prove Flock's inference is optimal in restricted settings.
Submitted 5 May, 2023;
originally announced May 2023.
-
Edge-selective extremal damping from topological heritage of dissipative Chern insulators
Authors:
Suraj S. Hegde,
Toni Ehmcke,
Tobias Meng
Abstract:
One of the most important practical hallmarks of topological matter is the presence of topologically protected, exponentially localised edge states at interfaces of regions characterised by unequal topological invariants. Here, we show that even when driven far from their equilibrium ground state, Chern insulators can inherit topological edge features from their parent Hamiltonian. In particular, we show that the asymptotic long-time approach of the non-equilibrium steady state, governed by a Lindblad Master equation, can exhibit edge-selective extremal damping. This phenomenon derives from edge states of non-Hermitian extensions of the parent Chern insulator Hamiltonian. The combination of (non-Hermitian) topology and dissipation hence makes it possible to design topologically robust, spatially localised damping patterns.
Submitted 29 December, 2023; v1 submitted 18 April, 2023;
originally announced April 2023.
-
You Only Need Two Detectors to Achieve Multi-Modal 3D Multi-Object Tracking
Authors:
Xiyang Wang,
Chunyun Fu,
Jiawei He,
Mingguang Huang,
Ting Meng,
Siyu Zhang,
Hangning Zhou,
Ziyao Xu,
Chi Zhang
Abstract:
In the classical tracking-by-detection (TBD) paradigm, detection and tracking are separately and sequentially conducted, and data association must be properly performed to achieve satisfactory tracking performance. In this paper, a new end-to-end multi-object tracking framework is proposed, which integrates object detection and multi-object tracking into a single model. The proposed tracking framework eliminates the complex data association process in the classical TBD paradigm, and requires no additional training. Secondly, the regression confidence of historical trajectories is investigated, and the possible states of a trajectory (weak object or strong object) in the current frame are predicted. Then, a confidence fusion module is designed to guide non-maximum suppression for trajectories and detections to achieve ordered and robust tracking. Thirdly, by integrating historical trajectory features, the regression performance of the detector is enhanced, which better reflects the occlusion and disappearance patterns of objects in the real world. Lastly, extensive experiments are conducted on the commonly used KITTI and Waymo datasets. The results show that the proposed framework can achieve robust tracking by using only a 2D detector and a 3D detector, and it is proven more accurate than many of the state-of-the-art TBD-based multi-modal tracking methods. The source codes of the proposed method are available at https://github.com/wangxiyang2022/YONTD-MOT.
Submitted 22 March, 2024; v1 submitted 17 April, 2023;
originally announced April 2023.
-
In-Context Operator Learning with Data Prompts for Differential Equation Problems
Authors:
Liu Yang,
Siting Liu,
Tingwei Meng,
Stanley J. Osher
Abstract:
This paper introduces a new neural-network-based approach, namely In-Context Operator Networks (ICON), to simultaneously learn operators from the prompted data and apply them to new questions during the inference stage, without any weight update. Existing methods are limited to using a neural network to approximate a specific equation solution or a specific operator, requiring retraining when switching to a new problem with different equations. By training a single neural network as an operator learner, we can not only get rid of retraining (even fine-tuning) the neural network for new problems, but also leverage the commonalities shared across operators so that only a few demos in the prompt are needed when learning a new operator. Our numerical results show the neural network's capability as a few-shot operator learner for diverse types of differential equation problems, including forward and inverse problems of ordinary differential equations (ODEs), partial differential equations (PDEs), and mean-field control (MFC) problems, and also show that it can generalize its learning capability to operators beyond the training distribution.
Submitted 19 September, 2023; v1 submitted 17 April, 2023;
originally announced April 2023.
-
Leveraging Multi-time Hamilton-Jacobi PDEs for Certain Scientific Machine Learning Problems
Authors:
Paula Chen,
Tingwei Meng,
Zongren Zou,
Jérôme Darbon,
George Em Karniadakis
Abstract:
Hamilton-Jacobi partial differential equations (HJ PDEs) have deep connections with a wide range of fields, including optimal control, differential games, and imaging sciences. By considering the time variable to be a higher dimensional quantity, HJ PDEs can be extended to the multi-time case. In this paper, we establish a novel theoretical connection between specific optimization problems arising in machine learning and the multi-time Hopf formula, which corresponds to a representation of the solution to certain multi-time HJ PDEs. Through this connection, we increase the interpretability of the training process of certain machine learning applications by showing that when we solve these learning problems, we also solve a multi-time HJ PDE and, by extension, its corresponding optimal control problem. As a first exploration of this connection, we develop the relation between the regularized linear regression problem and the Linear Quadratic Regulator (LQR). We then leverage our theoretical connection to adapt standard LQR solvers (namely, those based on the Riccati ordinary differential equations) to design new training approaches for machine learning. Finally, we provide some numerical examples that demonstrate the versatility and possible computational advantages of our Riccati-based approach in the context of continual learning, post-training calibration, transfer learning, and sparse dynamics identification.
Submitted 8 December, 2023; v1 submitted 22 March, 2023;
originally announced March 2023.
-
Quantum-Hall physics and three dimensions
Authors:
Johannes Gooth,
Stanislaw Galeski,
Tobias Meng
Abstract:
The discovery of the quantum Hall effect (QHE) in 1980 marked a turning point in condensed matter physics: given appropriate experimental conditions, the Hall conductivity σ_xy of a two-dimensional (2D) electron system is exactly quantized. But what happens to the QHE in three dimensions (3D)? Experiments over the past 40 years showed that some of the remarkable physics of the QHE, in particular plateau-like Hall conductivities σ_xy accompanied by minima in the longitudinal resistivity ρ_xx, can also be found in 3D materials. However, since typically ρ_xx remains finite and a quantitative relation between σ_xy and the conductance quantum e^2/h could not be established, the role of quantum Hall physics in 3D remains unsettled. Following a recent series of exciting experiments, the QHE in 3D has now returned to the centre stage. Here, we summarize the leap in understanding of 3D matter in magnetic fields emerging from these experiments.
Submitted 11 November, 2022;
originally announced November 2022.
-
Event-Triggered Intermittent Prescribed Performance Control for Spacecraft Attitude Reorientation
Authors:
Jiakun Lei,
Tao Meng,
Kun Wang,
Weijia Wang,
Zhonghe Jin
Abstract:
This paper focuses on how to realize spacecraft attitude control with guaranteed performance while significantly reducing the actuator acting frequency. The prescribed performance control (PPC) scheme is often employed for control with guaranteed performance. However, conventional PPC controllers are designed from the perspective of a continuous system, which contradicts the "discrete" control logic of actual spacecraft control systems and limits the practical value of the PPC scheme. In order to significantly lower the actuator acting frequency while still maintaining the desired performance, a composite event-trigger mechanism is proposed, turning off the actuator and eliminating unnecessary control output under appropriate conditions. Further, the proposed composite event-trigger mechanism is combined with a prescribed performance control scheme, constructing a complete controller structure for solving the presented issue. Meanwhile, a special performance function is designed to provide a mild transition process. Based on the proposed scheme, a specific backstepping controller is further developed as a validation. Finally, numerical simulation results are presented to validate the effectiveness of the proposed scheme.
Submitted 10 November, 2022;
originally announced November 2022.
-
Black hole mirages: electron lensing and Berry curvature effects in inhomogeneously tilted Weyl semimetals
Authors:
Andreas Haller,
Suraj Hegde,
Chen Xu,
Christophe De Beule,
Thomas L. Schmidt,
Tobias Meng
Abstract:
We study electronic transport in Weyl semimetals with spatially varying nodal tilt profiles. We find that the flow of electrons can be guided precisely by judiciously chosen tilt profiles. In a broad regime of parameters, we show that electron flow is described well by semiclassical equations of motion similar to the ones governing gravitational attraction. This analogy provides a physically transparent tool for designing tiltronic devices like electronic lenses. The analogy to gravity circumvents the notoriously difficult full-fledged description of inhomogeneous solids. A comparison to microscopic lattice simulations shows that it is only valid for trajectories sufficiently far from analogue black holes. We finally comment on the Berry curvature-driven transverse motion and relate the latter to spin precession physics.
Submitted 20 February, 2023; v1 submitted 28 October, 2022;
originally announced October 2022.
-
Spacecraft Attitude Pointing Control under Pointing Forbidden Constraints with Guaranteed Accuracy
Authors:
Jiakun Lei,
Tao Meng,
Weijia Wang,
Shujian Sun,
Heng Li,
Zhonghe Jin
Abstract:
This paper focuses on the attitude pointing control problem under pointing-forbidden constraints and performance constraints. The spacecraft is expected to align its sensor's boresight to a desired direction, while the terminal control accuracy and the attitude adjustment rapidity should also be guaranteed simultaneously. To resolve this problem, a switching controller structure is proposed in this paper based on the reduced-attitude representation, fusing the artificial potential field (APF) methodology and the Prescribed Performance Control (PPC) scheme together. Firstly, a novel artificial potential field is presented, and a particular function is designed for the mollification of the switching process, aiming at providing a smooth transition for the system status. Subsequently, we propose a special performance function, which can freeze the PPC part when necessary. In this way, the intrinsic contradiction between the fast attitude maneuver and forbidden direction avoidance is tackled. Further, an asynchronous switching strategy is designed to guarantee the system's stability. Based on these proposed components, a switching backstepping controller is developed, and a tracking differentiator (TD) is employed to generate a smooth approximation of differential signals. Numerical simulation results are presented to show the effectiveness of the proposed scheme.
Submitted 13 September, 2022;
originally announced September 2022.
-
Robust Control For Spacecraft Attitude Tracking Under Multiple Physical Limitations with Guaranteed Performance
Authors:
Jiakun Lei,
Tao Meng,
Weijia Wang,
Chengjin Yin,
Zhonghe Jin
Abstract:
This paper considers the prescribed performance control (PPC) of spacecraft attitude tracking under multiple physical constraints, focusing on the robust issues. A novel Barrier Lyapunov function is proposed to realize the guaranteed-performance control under angular velocity constraint without singularity. Additionally, an adaptive strategy for the performance function is presented to soften the constraint and quickly re-stabilize the system after severe disturbances occur, providing strong robustness. Further, an auxiliary system is designed to handle the input saturation issue, incorporating the actuator limitation into the system. Based on the proposed structure, a backstepping controller is developed accordingly using a double-layer PPC framework. Numerical simulation results are presented to validate the proposed controller framework's efficiency and robustness.
Submitted 13 September, 2022;
originally announced September 2022.
-
Gross-Neveu-Heisenberg criticality from $2+ε$ expansion
Authors:
Konstantinos Ladovrechis,
Shouryya Ray,
Tobias Meng,
Lukas Janssen
Abstract:
The Gross-Neveu-Heisenberg universality class describes a continuous quantum phase transition between a Dirac semimetal and an antiferromagnetic insulator. Such quantum critical points have originally been discussed in the context of Hubbard models on $π$-flux and honeycomb lattices, but more recently also in Bernal-stacked bilayer models, of potential relevance for bilayer graphene. Here, we demonstrate how the critical behavior of this fermionic universality class can be computed within an $ε$ expansion around the lower critical space-time dimension of two. This approach is complementary to the previously studied expansion around the upper critical dimension of four. The crucial technical novelty near the lower critical dimension is the presence of different four-fermion interaction channels at the critical point, which we take into account in a Fierz-complete way. By interpolating between the lower and upper critical dimensions, we obtain improved estimates for the critical exponents in 2+1 space-time dimensions. For the situation relevant to single-layer graphene, we find an unusually small leading-correction-to-scaling exponent, arising from the competition between different interaction channels. This suggests that corrections to scaling may need to be taken into account when comparing analytical estimates with numerical data from finite-size extrapolations.
Submitted 3 February, 2023; v1 submitted 6 September, 2022;
originally announced September 2022.
-
Engineering a pure Dirac regime in ZrTe$_5$
Authors:
Jorge I. Facio,
Elisabetta Nocerino,
Ion Cosma Fulga,
Rafal Wawrzynczak,
Joanna Brown,
Genda Gu,
Qiang Li,
Martin Mansson,
Yasmine Sassa,
Oleh Ivashko,
Martin v. Zimmermann,
Felix Mende,
Johannes Gooth,
Stanislaw Galeski,
Jeroen van den Brink,
Tobias Meng
Abstract:
Real-world topological semimetals typically exhibit Dirac and Weyl nodes that coexist with trivial Fermi pockets. This tends to mask the physics of the relativistic quasiparticles. Using the example of ZrTe$_5$, we show that strain provides a powerful tool for in-situ tuning of the band structure such that all trivial pockets are pushed far away from the Fermi energy, but only for a certain range of Van der Waals gaps. Our results naturally reconcile contradicting reports on the presence or absence of additional pockets in ZrTe$_5$, and provide a clear map of where to find a pure three-dimensional Dirac semimetallic phase in the structural parameter space of the material.
Submitted 5 December, 2022; v1 submitted 28 June, 2022;
originally announced June 2022.
-
Singularity-Avoidance Prescribed Performance Attitude Tracking of Spacecraft
Authors:
Jiakun Lei,
Tao Meng,
Weijia Wang,
Heng Li,
Zhonghe Jin
Abstract:
The attitude tracking problem with preassigned performance requirements has earned tremendous interest in recent years, and the Prescribed Performance Control (PPC) scheme is often adopted to tackle this problem. Nevertheless, traditional PPC schemes have inherent problems for which solutions are still lacking, such as the singularity problem when the state constraint is violated and the potential over-control problem when the state trajectory approaches the constraint boundary.
This paper proposes a Singularity-Avoidance Prescribed Performance Control scheme (SAPPC) to deal with these problems. A novel shear mapping-based error transformation is proposed to provide a globally non-singular error transformation procedure, while a time-varying constraint boundary is employed to exert appropriate constraint strength at different control stages, alleviating the potential instability caused by the over-control problem. Besides, a novel piece-wise reference performance function (RPF) is constructed to provide a relevant reference trajectory for the state responding signals, allowing precise control of the system's responding behavior. Based on the proposed SAPPC scheme, a backstepping controller is developed, with the predefined-time stability technique and the dynamic surface control technique employed to enhance the controller's robustness and performance. Finally, theoretical analysis and numerical simulation results are presented to validate the proposed control scheme's effectiveness and robustness.
Submitted 25 June, 2022;
originally announced June 2022.
-
Controllable Text Generation with Neurally-Decomposed Oracle
Authors:
Tao Meng,
Sidi Lu,
Nanyun Peng,
Kai-Wei Chang
Abstract:
We propose a general and efficient framework to control auto-regressive generation models with NeurAlly-Decomposed Oracle (NADO). Given a pre-trained base language model and a sequence-level boolean oracle function, we propose to decompose the oracle function into token-level guidance to steer the base model in text generation. Specifically, the token-level guidance is approximated by a neural model trained with examples sampled from the base model, demanding no additional auxiliary labeled data. Based on posterior regularization, we present the closed-form optimal solution to incorporate the token-level guidance into the base model for controllable generation. We further provide a theoretical analysis of how the approximation quality of NADO affects the controllable generation results. Experiments conducted on two applications: (1) text generation with lexical constraints and (2) machine translation with formality control demonstrate that our framework efficiently guides the base model towards the given oracle while maintaining high generation quality.
Submitted 20 October, 2022; v1 submitted 27 May, 2022;
originally announced May 2022.
-
On the Paradox of Learning to Reason from Data
Authors:
Honghua Zhang,
Liunian Harold Li,
Tao Meng,
Kai-Wei Chang,
Guy Van den Broeck
Abstract:
Logical reasoning is needed in a wide range of NLP tasks. Can a BERT model be trained end-to-end to solve logical reasoning problems presented in natural language? We attempt to answer this question in a confined problem space where there exists a set of parameters that perfectly simulates logical reasoning. We make observations that seem to contradict each other: BERT attains near-perfect accuracy on in-distribution test examples while failing to generalize to other data distributions over the exact same problem space. Our study provides an explanation for this paradox: instead of learning to emulate the correct reasoning function, BERT has in fact learned statistical features that inherently exist in logical reasoning problems. We also show that it is infeasible to jointly remove statistical features from data, illustrating the difficulty of learning to reason in general. Our result naturally extends to other neural models and unveils the fundamental difference between learning to reason and learning to achieve high performance on NLP benchmarks using statistical features.
Submitted 24 May, 2022; v1 submitted 23 May, 2022;
originally announced May 2022.
-
Signatures of a magnetic-field-induced Lifshitz transition in the ultra-quantum limit of the topological semimetal ZrTe$_5$
Authors:
S. Galeski,
H. F. Legg,
R. Wawrzyńczak,
T. Förster,
S. Zherlitsyn,
D. Gorbunov,
P. M. Lozano,
Q. Li,
G. D. Gu,
C. Felser,
J. Wosnitza,
T. Meng,
J Gooth
Abstract:
The quantum limit (QL) of an electron liquid, realised at strong magnetic fields, has long been proposed to host a wealth of strongly correlated states of matter. Electronic states in the QL are, for example, quasi-one dimensional (1D), which implies perfectly nested Fermi surfaces prone to instabilities. Whereas the QL typically requires unreachably strong magnetic fields, the topological semimetal ZrTe$_5$ has been shown to reach the QL at fields of only a few Tesla. Here, we characterize the QL of ZrTe$_5$ at fields up to 64 T by a combination of electrical-transport and ultrasound measurements. We find that the Zeeman effect in ZrTe$_5$ enables an efficient tuning of the 1D Landau band structure with magnetic field. This results in a Lifshitz transition to a 1D Weyl regime in which perfect charge neutrality can be achieved. Since no instability-driven phase transitions destabilise the 1D electron liquid for the investigated field strengths and temperatures, our analysis establishes ZrTe$_5$ as a thoroughly understood platform for potentially inducing more exotic interaction-driven phases at lower temperatures.
Submitted 25 April, 2022;
originally announced April 2022.
-
Revisiting Multi-Scale Feature Fusion for Semantic Segmentation
Authors:
Tianjian Meng,
Golnaz Ghiasi,
Reza Mahjourian,
Quoc V. Le,
Mingxing Tan
Abstract:
It is commonly believed that high internal resolution combined with expensive operations (e.g. atrous convolutions) are necessary for accurate semantic segmentation, resulting in slow speed and large memory usage. In this paper, we question this belief and demonstrate that neither high internal resolution nor atrous convolutions are necessary. Our intuition is that although segmentation is a dense per-pixel prediction task, the semantics of each pixel often depend on both nearby neighbors and far-away context; therefore, a more powerful multi-scale feature fusion network plays a critical role. Following this intuition, we revisit the conventional multi-scale feature space (typically capped at P5) and extend it to a much richer space, up to P9, where the smallest features are only 1/512 of the input size and thus have very large receptive fields. To process such a rich feature space, we leverage the recent BiFPN to fuse the multi-scale features. Based on these insights, we develop a simplified segmentation model, named ESeg, which has neither high internal resolution nor expensive atrous convolutions. Perhaps surprisingly, our simple method can achieve better accuracy with faster speed than prior art across multiple datasets. In real-time settings, ESeg-Lite-S achieves 76.0% mIoU on CityScapes [12] at 189 FPS, outperforming FasterSeg [9] (73.1% mIoU at 170 FPS). Our ESeg-Lite-L runs at 79 FPS and achieves 80.1% mIoU, largely closing the gap between real-time and high-performance segmentation models.
Submitted 14 June, 2022; v1 submitted 23 March, 2022;
originally announced March 2022.
-
DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection
Authors:
Yingwei Li,
Adams Wei Yu,
Tianjian Meng,
Ben Caine,
Jiquan Ngiam,
Daiyi Peng,
Junyang Shen,
Bo Wu,
Yifeng Lu,
Denny Zhou,
Quoc V. Le,
Alan Yuille,
Mingxing Tan
Abstract:
Lidars and cameras are critical sensors that provide complementary information for 3D detection in autonomous driving. While prevalent multi-modal methods simply decorate raw lidar point clouds with camera features and feed them directly to existing 3D detection models, our study shows that fusing camera features with deep lidar features instead of raw points can lead to better performance. However, as those features are often augmented and aggregated, a key challenge in fusion is how to effectively align the transformed features from two modalities. In this paper, we propose two novel techniques: InverseAug, which inverts geometric-related augmentations, e.g., rotation, to enable accurate geometric alignment between lidar points and image pixels, and LearnableAlign, which leverages cross-attention to dynamically capture the correlations between image and lidar features during fusion. Based on InverseAug and LearnableAlign, we develop a family of generic multi-modal 3D detection models named DeepFusion, which is more accurate than previous methods. For example, DeepFusion improves PointPillars, CenterPoint, and 3D-MAN baselines on Pedestrian detection by 6.7, 8.9, and 6.2 LEVEL_2 APH, respectively. Notably, our models achieve state-of-the-art performance on Waymo Open Dataset, and show strong model robustness against input corruptions and out-of-distribution data. Code will be publicly available at https://github.com/tensorflow/lingvo/tree/master/lingvo/.
Submitted 15 March, 2022;
originally announced March 2022.
-
SympOCnet: Solving optimal control problems with applications to high-dimensional multi-agent path planning problems
Authors:
Tingwei Meng,
Zhen Zhang,
Jérôme Darbon,
George Em Karniadakis
Abstract:
Solving high-dimensional optimal control problems in real-time is an important but challenging problem, with applications to multi-agent path planning problems, which have drawn increased attention given the growing popularity of drones in recent years. In this paper, we propose a novel neural network method called SympOCnet that applies the Symplectic network to solve high-dimensional optimal control problems with state constraints. We present several numerical results on path planning problems in two-dimensional and three-dimensional spaces. Specifically, we demonstrate that our SympOCnet can solve a problem with more than 500 dimensions in 1.5 hours on a single GPU, which shows the effectiveness and efficiency of SympOCnet. The proposed method is scalable and has the potential to solve truly high-dimensional path planning problems in real-time.
Submitted 14 January, 2022;
originally announced January 2022.