-
Software-Hardware Codesign for Efficient In-Memory Regular Pattern Matching
Authors:
Lingkun Kong,
Qixuan Yu,
Agnishom Chattopadhyay,
Alexis Le Glaunec,
Yi Huang,
Konstantinos Mamouras,
Kaiyuan Yang
Abstract:
Regular pattern matching is used in numerous application domains, including text processing, bioinformatics, and network security. Patterns are typically expressed with an extended syntax of regular expressions that include the computationally challenging construct of bounded iteration or counting, which describes the repetition of a pattern a fixed number of times. We develop a design for a specialized in-memory hardware architecture for NFA execution that integrates counter and bit vector elements. The design is inspired by the theoretical model of nondeterministic counter automata (NCA). A key feature of our approach is that we statically analyze regular expressions to determine bounds on the amount of memory needed for the occurrences of counting. The results of this analysis are used by a regex-to-hardware compiler in order to make an appropriate selection of counter or bit vector elements. We evaluate the performance of our hardware implementation on a simulator based on circuit parameters collected by SPICE simulation using a TSMC 28nm process. We find the usage of counter and bit vector quickly outperforms unfolding solutions by orders of magnitude with small counting quantifiers. Experiments concerning realistic workloads show up to 76% energy reduction and 58% area reduction in comparison to traditional in-memory NFA processors.
Submitted 12 September, 2022;
originally announced September 2022.
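The bit-vector idea in the abstract above can be illustrated in software. The following is a minimal sketch, not the paper's hardware design or compiler: it assumes the toy pattern `a.{n}` (an `a` followed by exactly n arbitrary characters) and tracks all simultaneously active counter values in one (n+1)-bit integer rather than unfolding the repetition into n separate NFA states. The function names are hypothetical.

```python
import re


def match_ends(text: str, n: int) -> list[int]:
    """Return the indices at which a match of a.{n} ends, in one pass."""
    mask = (1 << (n + 1)) - 1  # keep bits 0..n only
    active = 0  # bit k set: some thread has read 'a' plus k more characters
    ends = []
    for i, ch in enumerate(text):
        # Every live thread consumes one character, so its count grows by 1.
        active = (active << 1) & mask
        if ch == "a":
            active |= 1  # start a new thread with count 0
        if (active >> n) & 1:  # a thread reached count n: a match ends here
            ends.append(i)
    return ends


def match_ends_reference(text: str, n: int) -> list[int]:
    """Same result via Python's re module, using a lookahead for overlaps."""
    pattern = re.compile(r"(?=a.{%d})" % n, re.DOTALL)
    return [m.start() + n for m in pattern.finditer(text)]


print(match_ends("xaxxaxx", 2))
```

Each input character costs one shift and one mask regardless of n, which is the software analogue of replacing an unfolded chain of states with a single bit-vector element.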
-
On Detecting Nearby Nano-Hertz Gravitational Wave Sources via Pulsar Timing Arrays
Authors:
Xiao Guo,
Youjun Lu,
Qingjuan Yu
Abstract:
Massive binary black holes (MBBHs) in nearby galactic centers, if any, may be nano-Hertz gravitational wave (GW) sources for pulsar timing arrays (PTAs) to detect. Normally the target GWs for PTA experiments are approximated as plane waves because their sources are presumably located far away. For nearby GW sources, however, this approximation may be inaccurate due to the curved GW wave front and the changes in GW strength along the paths of PTA pulsar pulses. In this paper, we analyze the near-field effect in the PTA detection of nearby sources and find it is important if the source distance is less than a few tens of Mpc, and ignoring this effect may lead to a significant signal-to-noise underestimation, especially when the source distance is comparable to the pulsar distances. As examples, we assume a nano-Hertz MBBH source located at either the Galactic Center (GC) or the Large Magellanic Cloud (LMC) according to the observational constraints/hints on the MBBH parameter space, and estimate its detectability by current/future PTAs. We find that the GC MBBH may be detectable by the Square Kilometer Array (SKA) PTA. Detecting the LMC MBBH is challenging; however, if a number ($N\gtrsim10$) of stable millisecond pulsars can be found in the LMC center, the MBBH may be detectable via a PTA formed by these pulsars. We further illustrate the near-field effects on the PTA detection of an isotropic GW background contributed mainly by nearby GW sources, and the resulting angular correlation is similar to the Hellings-Downs curve.
Submitted 13 October, 2022; v1 submitted 12 September, 2022;
originally announced September 2022.
-
CNSNet: A Cleanness-Navigated-Shadow Network for Shadow Removal
Authors:
Qianhao Yu,
Naishan Zheng,
Jie Huang,
Feng Zhao
Abstract:
The key to shadow removal is recovering the contents of the shadow regions with the guidance of the non-shadow regions. Due to the inadequate long-range modeling, the CNN-based approaches cannot thoroughly investigate the information from the non-shadow regions. To solve this problem, we propose a novel cleanness-navigated-shadow network (CNSNet), with a shadow-oriented adaptive normalization (SOAN) module and a shadow-aware aggregation with transformer (SAAT) module based on the shadow mask. Under the guidance of the shadow mask, the SOAN module formulates the statistics from the non-shadow region and adaptively applies them to the shadow region for region-wise restoration. The SAAT module utilizes the shadow mask to precisely guide the restoration of each shadowed pixel by considering the highly relevant pixels from the shadow-free regions for global pixel-wise restoration. Extensive experiments on three benchmark datasets (ISTD, ISTD+, and SRD) show that our method achieves superior de-shadowing performance.
Submitted 5 September, 2022;
originally announced September 2022.
-
Coarse Retinal Lesion Annotations Refinement via Prototypical Learning
Authors:
Qinji Yu,
Kang Dang,
Ziyu Zhou,
Yongwei Chen,
Xiaowei Ding
Abstract:
Deep-learning-based approaches for retinal lesion segmentation often require an abundant amount of precise pixel-wise annotated data. However, coarse annotations such as circles or ellipses for outlining the lesion area can be six times more efficient than pixel-level annotation. Therefore, this paper proposes an annotation refinement network to convert a coarse annotation into a pixel-level segmentation mask. Our main novelty is the application of the prototype learning paradigm to enhance the generalization ability across different datasets or types of lesions. We also introduce a prototype weighing module to handle challenging cases where the lesion is overly small. The proposed method was trained on the publicly available IDRiD dataset and then generalized to the public DDR and our real-world private datasets. Experiments show that our approach substantially improved the initial coarse mask and outperformed the non-prototypical baseline by a large margin. Moreover, we demonstrate the usefulness of the prototype weighing module in both cross-dataset and cross-class settings.
Submitted 30 August, 2022;
originally announced August 2022.
-
Affine Deligne--Lusztig varieties with finite Coxeter parts
Authors:
Xuhua He,
Sian Nie,
Qingchao Yu
Abstract:
In this paper, we study affine Deligne--Lusztig varieties $X_w(b)$ when the finite part of the element $w$ in the Iwahori--Weyl group is a partial $σ$-Coxeter element. We show that such $w$ is a cordial element and $X_w(b) \neq \emptyset$ if and only if $b$ satisfies a certain Hodge--Newton indecomposability condition. The main result of this paper is that for such $w$ and $b$, $X_w(b)$ has a simple geometric structure: the $σ$-centralizer of $b$ acts transitively on the set of irreducible components of $X_w(b)$; and each irreducible component is an iterated fibration over a classical Deligne--Lusztig variety of Coxeter type, and the iterated fibers are either $\mathbb A^1$ or $\mathbb G_m$.
Submitted 30 August, 2022;
originally announced August 2022.
-
Prompt Tuning with Soft Context Sharing for Vision-Language Models
Authors:
Kun Ding,
Ying Wang,
Pengzhang Liu,
Qiang Yu,
Haojian Zhang,
Shiming Xiang,
Chunhong Pan
Abstract:
Vision-language models have recently shown great potential on many tasks in computer vision. Meanwhile, prior work demonstrates prompt tuning designed for vision-language models could acquire superior performance on few-shot image recognition compared to linear probe, a strong baseline. In practice, many few-shot tasks are inherently correlated, particularly within specialized domains. However, such information has previously been overlooked. Inspired by the fact that modeling task relationship by multi-task learning can usually boost performance, we propose a novel method SoftCPT (Soft Context Sharing for Prompt Tuning) to tune pre-trained vision-language models on multiple target few-shot tasks jointly. Specifically, we design a task-shared meta network to generate prompt context for each task using task name together with a learnable task context as input. The parameters of this meta network as well as the task context are tuned on the joint training set of all tasks. As such, the prompt context of all tasks will be shared in a soft manner. Extensive experiments across four multi-task few-shot datasets covering 44 tasks and 1593 categories demonstrate that SoftCPT significantly outperforms single-task prompt tuning methods, highlighting the effectiveness of multi-task learning for vision-language prompt tuning. Code is available at https://github.com/kding1225/softcpt.
Submitted 31 March, 2024; v1 submitted 29 August, 2022;
originally announced August 2022.
-
Towards Open Set Video Anomaly Detection
Authors:
Yuansheng Zhu,
Wentao Bao,
Qi Yu
Abstract:
Open Set Video Anomaly Detection (OpenVAD) aims to identify abnormal events from video data where both known anomalies and novel ones exist in testing. Unsupervised models learned solely from normal videos are applicable to any testing anomalies but suffer from a high false positive rate. In contrast, weakly supervised methods are effective in detecting known anomalies but could fail in an open world. We develop a novel weakly supervised method for the OpenVAD problem by integrating evidential deep learning (EDL) and normalizing flows (NFs) into a multiple instance learning (MIL) framework. Specifically, we propose to use graph neural networks and triplet loss to learn discriminative features for training the EDL classifier, where the EDL is capable of identifying the unknown anomalies by quantifying the uncertainty. Moreover, we develop an uncertainty-aware selection strategy to obtain clean anomaly instances and a NFs module to generate the pseudo anomalies. Our method is superior to existing approaches by inheriting the advantages of both the unsupervised NFs and the weakly-supervised MIL framework. Experimental results on multiple real-world video datasets show the effectiveness of our method.
Submitted 23 August, 2022;
originally announced August 2022.
-
Scaling Up Dynamic Graph Representation Learning via Spiking Neural Networks
Authors:
Jintang Li,
Zhouxin Yu,
Zulun Zhu,
Liang Chen,
Qi Yu,
Zibin Zheng,
Sheng Tian,
Ruofan Wu,
Changhua Meng
Abstract:
Recent years have seen a surge in research on dynamic graph representation learning, which aims to model temporal graphs that are dynamic and evolving constantly over time. However, current work typically models graph dynamics with recurrent neural networks (RNNs), making them suffer seriously from computation and memory overheads on large temporal graphs. So far, scalability of dynamic graph representation learning on large temporal graphs remains one of the major challenges. In this paper, we present a scalable framework, namely SpikeNet, to efficiently capture the temporal and structural patterns of temporal graphs. We explore a new direction in that we can capture the evolving dynamics of temporal graphs with spiking neural networks (SNNs) instead of RNNs. As a low-power alternative to RNNs, SNNs explicitly model graph dynamics as spike trains of neuron populations and enable spike-based propagation in an efficient way. Experiments on three large real-world temporal graph datasets demonstrate that SpikeNet outperforms strong baselines on the temporal node classification task with lower computational costs. Particularly, SpikeNet generalizes to a large temporal graph (2.7M nodes and 13.9M edges) with significantly fewer parameters and computation overheads. Our code is publicly available at https://github.com/EdisonLeeeee/SpikeNet.
Submitted 18 May, 2023; v1 submitted 15 August, 2022;
originally announced August 2022.
-
UPST-NeRF: Universal Photorealistic Style Transfer of Neural Radiance Fields for 3D Scene
Authors:
Yaosen Chen,
Qi Yuan,
Zhiqiang Li,
Yuegen Liu,
Wei Wang,
Chaoping Xie,
Xuming Wen,
Qien Yu
Abstract:
Photorealistic stylization of 3D scenes aims to generate photorealistic images from arbitrary novel views according to a given style image while ensuring consistency when rendering from different viewpoints. Some existing stylization methods with neural radiance fields can effectively predict stylized scenes by combining the features of the style image with multi-view images to train 3D scenes. However, these methods generate novel view images that contain objectionable artifacts. Besides, they cannot achieve universal photorealistic stylization for a 3D scene. Therefore, each new style image requires retraining a 3D scene representation network based on a neural radiance field. We propose a novel 3D scene photorealistic style transfer framework to address these issues. It can realize photorealistic 3D scene style transfer with a 2D style image. We first pre-train a 2D photorealistic style transfer network, which can perform photorealistic style transfer between any given content image and style image. Then, we use voxel features to optimize a 3D scene and get the geometric representation of the scene. Finally, we jointly optimize a hyper network to realize the scene photorealistic style transfer of arbitrary style images. In the transfer stage, we use a pre-trained 2D photorealistic network to constrain the photorealistic style of different views and different style images in the 3D scene. The experimental results show that our method not only realizes the 3D photorealistic style transfer of arbitrary style images but also outperforms the existing methods in terms of visual quality and consistency. Project page: https://semchan.github.io/UPST_NeRF.
Submitted 21 August, 2022; v1 submitted 15 August, 2022;
originally announced August 2022.
-
SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth Sampling
Authors:
Chenjian Gao,
Qian Yu,
Lu Sheng,
Yi-Zhe Song,
Dong Xu
Abstract:
Reconstructing a 3D shape based on a single sketch image is challenging due to the large domain gap between a sparse, irregular sketch and a regular, dense 3D shape. Existing works try to employ the global feature extracted from sketch to directly predict the 3D coordinates, but they usually suffer from losing fine details that are not faithful to the input sketch. Through analyzing the 3D-to-2D projection process, we notice that the density map that characterizes the distribution of 2D point clouds (i.e., the probability of points projected at each location of the projection plane) can be used as a proxy to facilitate the reconstruction process. To this end, we first translate a sketch via an image translation network to a more informative 2D representation that can be used to generate a density map. Next, a 3D point cloud is reconstructed via a two-stage probabilistic sampling process: first recovering the 2D points (i.e., the x and y coordinates) by sampling the density map; and then predicting the depth (i.e., the z coordinate) by sampling the depth values at the ray determined by each 2D point. Extensive experiments are conducted, and both quantitative and qualitative results show that our proposed approach significantly outperforms other baseline methods.
Submitted 25 December, 2022; v1 submitted 14 August, 2022;
originally announced August 2022.
-
Uncertainty Quantification for Traffic Forecasting: A Unified Approach
Authors:
Weizhu Qian,
Dalin Zhang,
Yan Zhao,
Kai Zheng,
James J. Q. Yu
Abstract:
Uncertainty is an essential consideration for time series forecasting tasks. In this work, we specifically focus on quantifying the uncertainty of traffic forecasting. To achieve this, we develop Deep Spatio-Temporal Uncertainty Quantification (DeepSTUQ), which can estimate both aleatoric and epistemic uncertainty. We first leverage a spatio-temporal model to model the complex spatio-temporal correlations of traffic data. Subsequently, two independent sub-neural networks maximizing the heterogeneous log-likelihood are developed to estimate aleatoric uncertainty. For estimating epistemic uncertainty, we combine the merits of variational inference and deep ensembling by integrating the Monte Carlo dropout and the Adaptive Weight Averaging re-training methods, respectively. Finally, we propose a post-processing calibration approach based on Temperature Scaling, which improves the model's generalization ability to estimate uncertainty. Extensive experiments are conducted on four public datasets, and the empirical results suggest that the proposed method outperforms state-of-the-art methods in terms of both point prediction and uncertainty quantification.
Submitted 11 August, 2022;
originally announced August 2022.
-
Searching for photon-ALPs mixing effects in AGN gamma-ray energy spectra
Authors:
Qixin Yu,
Dieter Horns
Abstract:
High energy gamma-rays propagating in external magnetic fields may convert into axion-like particles (ALPs). In this case, the observed gamma-ray spectra are modified by the resulting energy-dependent conversion probability. In this study, we use the energy spectra of 20 extra-galactic gamma-ray sources recorded during 10 years of Fermi-LAT observations. We define a test statistic based upon the likelihood ratio to test the hypothesis for a spectral model without vs. a model with photon-ALPs coupling. The conversion probability is calculated for fixed values of the mass and two-photon coupling of the pseudo-scalar particle while the external magnetic field is characterized by the additional free parameters length scale $s$ and average field strength $B$. As a consistency check and in order to extend the analysis to include very high energy gamma-ray data, another test statistic is defined with the $χ^2$ method. We find for 18 of the 20 sources a favorable fit, particularly for Markarian 421 and NGC 1275 a significant improvement, with the hypothesis of photon-ALPs coupling in likelihood analysis. The test statistics of the sources are combined and the significance is estimated to be $5.3~σ$ (test statistics summed in local maxima of all sources) and $6.0~σ$ (global maxima). The significance is estimated from dedicated simulations under the null hypotheses. The locally best-fitting values of $B$ and $s$ fall into the range that is expected for large scale magnetic fields present in relevant astrophysical environments.
Submitted 15 March, 2023; v1 submitted 29 July, 2022;
originally announced August 2022.
-
Adversarial Contrastive Learning via Asymmetric InfoNCE
Authors:
Qiying Yu,
Jieming Lou,
Xianyuan Zhan,
Qizhang Li,
Wangmeng Zuo,
Yang Liu,
Jingjing Liu
Abstract:
Contrastive learning (CL) has recently been applied to adversarial learning tasks. Such practice considers adversarial samples as additional positive views of an instance, and by maximizing their agreements with each other, yields better adversarial robustness. However, this mechanism can be potentially flawed, since adversarial perturbations may cause instance-level identity confusion, which can impede CL performance by pulling together different instances with separate identities. To address this issue, we propose to treat adversarial samples unequally when contrasted, with an asymmetric InfoNCE objective ($A-InfoNCE$) that allows discriminating considerations of adversarial samples. Specifically, adversaries are viewed as inferior positives that induce weaker learning signals, or as hard negatives exhibiting higher contrast to other negative samples. In the asymmetric fashion, the adverse impacts of conflicting objectives between CL and adversarial learning can be effectively mitigated. Experiments show that our approach consistently outperforms existing Adversarial CL methods across different finetuning schemes without additional computational cost. The proposed A-InfoNCE is also a generic form that can be readily extended to other CL methods. Code is available at https://github.com/yqy2001/A-InfoNCE.
Submitted 18 July, 2022;
originally announced July 2022.
-
A Simple Test-Time Method for Out-of-Distribution Detection
Authors:
Ke Fan,
Yikai Wang,
Qian Yu,
Da Li,
Yanwei Fu
Abstract:
Neural networks are known to produce over-confident predictions on input images, even when these images are out-of-distribution (OOD) samples. This limits the applications of neural network models in real-world scenarios, where OOD samples exist. Many existing approaches identify the OOD instances via exploiting various cues, such as finding irregular patterns in the feature space, logits space, gradient space or the raw space of images. In contrast, this paper proposes a simple Test-time Linear Training (ETLT) method for OOD detection. Empirically, we find that the probabilities of input images being out-of-distribution are surprisingly linearly correlated to the features extracted by neural networks. To be specific, many state-of-the-art OOD algorithms, although designed to measure reliability in different ways, actually lead to OOD scores mostly linearly related to their image features. Thus, by simply learning a linear regression model trained from the paired image features and inferred OOD scores at test-time, we can make a more precise OOD prediction for the test instances. We further propose an online variant of the proposed method, which achieves promising performance and is more practical in real-world applications. Remarkably, we improve FPR95 from $51.37\%$ to $12.30\%$ on CIFAR-10 datasets with maximum softmax probability as the base OOD detector. Extensive experiments on several benchmark datasets show the efficacy of ETLT for OOD detection task.
Submitted 17 July, 2022;
originally announced July 2022.
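The core step described in the abstract above, regressing inferred OOD scores on image features at test time, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the features and base scores are synthetic, ordinary least squares with a bias term stands in for whatever regression the paper uses, and `refine_ood_scores` is a hypothetical name.

```python
import numpy as np


def refine_ood_scores(features: np.ndarray, base_scores: np.ndarray) -> np.ndarray:
    """Fit base_scores ~ features by least squares and return the fitted scores."""
    # Append a constant column so the linear model has a bias term.
    design = np.hstack([features, np.ones((features.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, base_scores, rcond=None)
    return design @ weights


# Synthetic stand-ins (assumption): 200 test samples with 8-dimensional features
# and a base detector whose score is a noisy linear function of those features.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 8))
true_w = rng.normal(size=8)
clean_scores = features @ true_w
noisy_scores = clean_scores + 0.5 * rng.normal(size=200)

refined = refine_ood_scores(features, noisy_scores)
print(np.mean((noisy_scores - clean_scores) ** 2), np.mean((refined - clean_scores) ** 2))
```

When the base scores really are close to linear in the features, the fitted values average out per-sample noise, which is the intuition the abstract gives for why the regression can sharpen the base detector.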
-
Observation of magnetism induced topological edge state in antiferromagnetic topological insulator MnBi4Te7
Authors:
HaoKe Xu,
Mingqiang Gu,
Fucong Fei,
YiSheng Gu,
Dang Liu,
QiaoYan Yu,
ShaSha Xue,
XuHui Ning,
Bo Chen,
Hangkai Xie,
Zhen Zhu,
Dandan Guan,
Shiyong Wang,
Yaoyi Li,
Canhua Liu,
Qihang Liu,
Fengqi Song,
Hao Zheng,
Jinfeng Jia
Abstract:
Breaking time reversal symmetry in a topological insulator may lead to the quantum anomalous Hall effect and the axion insulator phase. MnBi4Te7 is a recently discovered antiferromagnetic topological insulator with TN ~12.5 K, which consists of alternately stacked magnetic (MnBi2Te4) and non-magnetic (Bi2Te3) layers. By means of scanning tunneling spectroscopy, we clearly observe the electronic state present at a step edge of a magnetic MnBi2Te4 layer but absent at non-magnetic Bi2Te3 layers at 4.5 K. Furthermore, we find that as the temperature rises above TN, the edge state vanishes, while the point defect induced state persists as the temperature increases. These results confirm the observation of magnetism induced edge states. Our analysis based on an axion insulator theory reveals the nontrivial topological nature of the observed edge state.
Submitted 16 July, 2022;
originally announced July 2022.
-
kMaX-DeepLab: k-means Mask Transformer
Authors:
Qihang Yu,
Huiyu Wang,
Siyuan Qiao,
Maxwell Collins,
Yukun Zhu,
Hartwig Adam,
Alan Yuille,
Liang-Chieh Chen
Abstract:
The rise of transformers in vision tasks not only advances network backbone designs, but also starts a brand-new page to achieve end-to-end image recognition (e.g., object detection and panoptic segmentation). Originating from Natural Language Processing (NLP), transformer architectures, consisting of self-attention and cross-attention, effectively learn long-range interactions between elements in a sequence. However, we observe that most existing transformer-based vision models simply borrow the idea from NLP, neglecting the crucial difference between languages and images, particularly the extremely large sequence length of spatially flattened pixel features. This subsequently impedes the learning in cross-attention between pixel features and object queries. In this paper, we rethink the relationship between pixels and object queries and propose to reformulate the cross-attention learning as a clustering process. Inspired by the traditional k-means clustering algorithm, we develop a k-means Mask Xformer (kMaX-DeepLab) for segmentation tasks, which not only improves the state-of-the-art, but also enjoys a simple and elegant design. As a result, our kMaX-DeepLab achieves a new state-of-the-art performance on COCO val set with 58.0% PQ, Cityscapes val set with 68.4% PQ, 44.0% AP, and 83.5% mIoU, and ADE20K val set with 50.9% PQ and 55.2% mIoU without test-time augmentation or external dataset. We hope our work can shed some light on designing transformers tailored for vision tasks. TensorFlow code and models are available at https://github.com/google-research/deeplab2. A PyTorch re-implementation is also available at https://github.com/bytedance/kmax-deeplab.
Submitted 10 July, 2023; v1 submitted 8 July, 2022;
originally announced July 2022.
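Reformulating cross-attention as clustering, as the abstract above describes, can be pictured with a single k-means-style update. This is a rough NumPy sketch under assumptions, not the released TensorFlow or PyTorch code: it assumes object queries act as cluster centers, each pixel is hard-assigned to the center with the highest affinity, and centers are updated residually with the features of their assigned pixels. The function name and the exact update rule are illustrative.

```python
import numpy as np


def kmeans_cross_attention_step(centers, pixel_keys, pixel_values):
    """One clustering-style update of N centers from HW pixel features.

    centers: (N, D), pixel_keys: (HW, D), pixel_values: (HW, D).
    """
    logits = centers @ pixel_keys.T  # (N, HW) center-to-pixel affinities
    # Hard assignment: argmax over the cluster axis, one winning center per pixel
    # (ordinary cross-attention would instead apply a softmax over the pixel axis).
    winners = logits.argmax(axis=0)  # (HW,)
    assignment = np.zeros_like(logits, dtype=float)
    assignment[winners, np.arange(logits.shape[1])] = 1.0
    # Residual update: each center accumulates the values of its assigned pixels.
    updated = centers + assignment @ pixel_values  # (N, D)
    return updated, assignment


centers = np.array([[1.0, 0.0], [0.0, 1.0]])
pixels = np.array([[2.0, 0.0], [0.0, 3.0], [1.0, 0.0]])
updated, assignment = kmeans_cross_attention_step(centers, pixels, pixels)
print(updated)
```

Because every pixel contributes to exactly one center, the assignment matrix can be read directly as a segmentation mask, one row per object query.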
-
Reheating constraints on modified single-field Natural Inflation models
Authors:
Hua Zhou,
Qing Yu,
Yu Pan,
Ruiyu Zhou,
Wei Cheng
Abstract:
In this paper, we discuss three modified single-field natural inflation models in detail, including Special generalized Natural Inflation model (SNI), Extended Natural Inflation model (ENI) and Natural Inflation inspired model (NII). We derive the analytical expression of the tensor-to-scalar ratio $r$ and the spectral index $n_s$ for those models. Then the reheating temperature $T_{re}$ and reheating duration $N_{re}$ are analytically derived. Moreover, considering the CMB constraints, the feasible space of the SNI model in $(n_s, r)$ plane is almost covered by that of the NII, which means the NII is more general than the SNI. In addition, there is no overlapping space between the ENI and the other two models in $(n_s, r)$ plane, which indicates that the ENI and the other two models exclude each other, and more accurate experiments can verify them. Furthermore, the reheating brings tighter constraints to the inflation models, but they still work for a different reheating universe. Considering the constraints of $n_s$, $r$, $N_k$ and choosing $T_{re}$ near the electroweak energy scale, one can find that the decay constants of the three models have no overlapping area and the effective equations of state $ω_{re}$ should be within $\frac{1}{4}\lesssim ω_{re} \lesssim \frac{4}{5}$ for the three models.
Submitted 20 June, 2022;
originally announced June 2022.
-
CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation
Authors:
Qihang Yu,
Huiyu Wang,
Dahun Kim,
Siyuan Qiao,
Maxwell Collins,
Yukun Zhu,
Hartwig Adam,
Alan Yuille,
Liang-Chieh Chen
Abstract:
We propose Clustering Mask Transformer (CMT-DeepLab), a transformer-based framework for panoptic segmentation designed around clustering. It rethinks the existing transformer architectures used in segmentation and detection; CMT-DeepLab considers the object queries as cluster centers, which fill the role of grouping the pixels when applied to segmentation. The clustering is computed with an alternating procedure, by first assigning pixels to the clusters by their feature affinity, and then updating the cluster centers and pixel features. Together, these operations comprise the Clustering Mask Transformer (CMT) layer, which produces cross-attention that is denser and more consistent with the final segmentation task. CMT-DeepLab improves the performance over prior art significantly by 4.4% PQ, achieving a new state-of-the-art of 55.7% PQ on the COCO test-dev set.
Submitted 17 June, 2022;
originally announced June 2022.
-
On the HI Content of MaNGA Major Merger Pairs
Authors:
Qingzheng Yu,
Taotao Fang,
Shuai Feng,
Bo Zhang,
C. Kevin Xu,
Yunting Wang,
Lei Hao
Abstract:
The role of HI content in galaxy interactions is still under debate. To study the HI content of galaxy pairs at different merging stages, we compile a sample of 66 major-merger galaxy pairs and 433 control galaxies from the SDSS-IV MaNGA IFU survey. In this study, we adopt kinematic asymmetry as a new effective indicator to describe the merging stage of galaxy pairs. With archival data from the HI-MaNGA survey and new observations from the Five-hundred-meter Aperture Spherical Radio Telescope (FAST), we investigate the differences in HI gas fraction ($f_{\text{HI}}$), star formation rate (SFR), and HI star formation efficiency ($\rm SFE_{\text{HI}}$) between the pair and control samples. Our results suggest that the HI gas fraction of major-merger pairs on average is marginally decreased by $\sim 15\%$ relative to isolated galaxies, implying mild HI depletion during galaxy interactions. Compared to isolated galaxies, pre-passage paired galaxies have similar $f_{\text{HI}}$, SFR and $\rm SFE_{\text{HI}}$, while pairs during pericentric passage have weakly decreased $f_{\text{HI}}$ ($-0.10\pm0.05$ dex), significantly enhanced SFR ($0.42\pm0.11$ dex) and $\rm SFE_{\text{HI}}$ ($0.48\pm0.12$ dex). When approaching the apocenter, paired galaxies show marginally decreased $f_{\text{HI}}$ ($-0.05\pm0.04$ dex), comparable SFR ($0.04\pm0.06$ dex) and $\rm SFE_{\text{HI}}$ ($0.08\pm0.08$ dex). We propose the marginally detected HI depletion may originate from the gas consumption in fuelling the enhanced $\rm H_2$ reservoir of galaxy pairs. In addition, new FAST observations also reveal an HI absorber ($N_{\text{HI}}\sim 4.7 \times 10^{21} \text{ cm}^{-2}$), which may suggest gas infalling and the triggering of AGN activity.
Submitted 13 June, 2022;
originally announced June 2022.
-
Balancing Bias and Variance for Active Weakly Supervised Learning
Authors:
Hitesh Sapkota,
Qi Yu
Abstract:
As a widely used weakly supervised learning scheme, modern multiple instance learning (MIL) models achieve competitive performance at the bag level. However, instance-level prediction, which is essential for many important applications, remains largely unsatisfactory. We propose to conduct novel active deep multiple instance learning that samples a small subset of informative instances for annotation, aiming to significantly boost the instance-level prediction. A variance regularized loss function is designed to properly balance the bias and variance of instance-level predictions, aiming to effectively accommodate the highly imbalanced instance distribution in MIL and other fundamental challenges. Instead of directly minimizing the variance regularized loss that is non-convex, we optimize a distributionally robust bag level likelihood as its convex surrogate. The robust bag likelihood provides a good approximation of the variance based MIL loss with a strong theoretical guarantee. It also automatically balances bias and variance, making it effective to identify the potentially positive instances to support active sampling. The robust bag likelihood can be naturally integrated with a deep architecture to support deep model training using mini-batches of positive-negative bag pairs. Finally, a novel P-F sampling function is developed that combines a probability vector and predicted instance scores, obtained by optimizing the robust bag likelihood. By leveraging the key MIL assumption, the sampling function can explore the most challenging bags and effectively detect their positive instances for annotation, which significantly improves the instance-level prediction. Experiments conducted over multiple real-world datasets clearly demonstrate the state-of-the-art instance-level prediction achieved by the proposed model.
Submitted 12 June, 2022;
originally announced June 2022.
-
A $Δ$-Machine Learning Approach for Force Fields, Illustrated by a CCSD(T) 4-body Correction to the MB-pol Water Potential
Authors:
Chen Qu,
Qi Yu,
Riccardo Conte,
Paul L. Houston,
Apurba Nandi,
Joel M. Bowman
Abstract:
$Δ$-Machine Learning ($Δ$-ML) has been shown to effectively and efficiently bring a low-level ML potential energy surface to CCSD(T) quality. Here we propose extending this approach to general force fields, which implicitly or explicitly contain many-body effects. After describing this general approach, we illustrate it for the MB-pol water potential which contains CCSD(T) 2-body and 3-body interactions but relies on the TTM4-F 4-body and higher body interactions. The 4-body MB-pol (TTM4-F) interaction fails at a very short range and for the water hexamer errors up to 0.84 kcal/mol are seen for some isomers, owing mainly to 4-body errors. We apply $Δ$-ML for the 4-body interaction, using a recent dataset of CCSD(T) 4-body energies that we used to develop a new water potential, q-AQUA. This 4-body correction is shown to improve the accuracy of the MB-pol potential for the relative energies of 8 isomers of the water hexamer as well as the harmonic frequencies. The new potential is robust in the very short range and so should be reliable for simulations at high pressure and/or high temperature.
Submitted 9 June, 2022;
originally announced June 2022.
-
New determination of $|V_{\rm cb}|$ using the three-loop QCD corrections for the $B\to D^{\ast}$ semi-leptonic decays
Authors:
Hua Zhou,
Qing Yu,
Xu-Chang Zheng,
Hai-Bing Fu,
Xing-Gang Wu
Abstract:
We present a new determination of the Cabibbo-Kobayashi-Maskawa matrix element $|V_{\rm cb}|$ by using the three-loop perturbative QCD corrections for the $B\to D^{\ast}$ semi-leptonic decay. The decay width of the $B\to D^{\ast}$ semi-leptonic decay can be factorized into a perturbatively calculable short-distance part and a non-perturbative but universal long-distance part. We adopt the principle of maximum conformality (PMC) single-scale setting approach to deal with the perturbative series so as to achieve a precise fixed-order prediction for the short-distance parameter $η_{A}$. By applying the PMC, an overall effective $α_s$ value is achieved by recursively using the renormalization group equation, which inversely results in a precise scale-invariant pQCD series. Such a scale-invariant series also provides a reliable basis for predicting the contributions from uncalculated perturbative terms. We then obtain $η_{A}=0.9225^{+0.0117}_{-0.0168}$, where the error is the squared average of those from $Δα_{s}(M_Z)=\pm0.0010$ and the uncertainties caused by the uncalculated higher-order perturbative terms. By using the data of $B\to D^{\ast}\ell\bar{ν}_{\ell}$, we finally obtain $|V_{\rm cb}|_{\rm PMC} =(40.60^{+0.53}_{-0.57})\times10^{-3}$, which is consistent with the PDG value within errors.
Submitted 6 January, 2023; v1 submitted 6 June, 2022;
originally announced June 2022.
-
Quantum calculations on a new CCSD(T) machine-learned PES reveal the leaky nature of gas-phase $trans$ and $gauche$ ethanol conformers
Authors:
Apurba Nandi,
Riccardo Conte,
Chen Qu,
Paul L. Houston,
Qi Yu,
Joel M. Bowman
Abstract:
Ethanol is a molecule of fundamental interest in combustion, astrochemistry, and condensed phase as a solvent. It is characterized by two methyl rotors and $trans$ ($anti$) and $gauche$ conformers, which are known to be very close in energy. Here we show that based on rigorous quantum calculations of the vibrational zero-point state, using a new ab initio potential energy surface (PES), the ground state resembles the $trans$ conformer but substantial delocalization to the $gauche$ conformer is present. This explains experimental issues about the identification and isolation of the two conformers. This "leak" effect is partially quenched when deuterating the OH group, which further demonstrates the need for a quantum mechanical approach. Diffusion Monte Carlo (DMC) and full-dimensional semiclassical dynamics calculations are employed. The new PES is obtained by means of a $Δ$-Machine learning approach starting from a pre-existing low level (LL) density functional theory (DFT) surface. This surface is brought to the CCSD(T) level of theory using a relatively small number of $ab$ $initio$ CCSD(T) energies. Agreement between the corrected PES and direct $ab$ $initio$ results for standard fidelity tests is excellent. One- and two-dimensional discrete variable representation calculations focusing on the $trans$-$gauche$ torsional motion are also reported, in reasonable agreement with the experiment.
Submitted 8 June, 2022; v1 submitted 5 June, 2022;
originally announced June 2022.
-
TubeFormer-DeepLab: Video Mask Transformer
Authors:
Dahun Kim,
Jun Xie,
Huiyu Wang,
Siyuan Qiao,
Qihang Yu,
Hong-Seok Kim,
Hartwig Adam,
In So Kweon,
Liang-Chieh Chen
Abstract:
We present TubeFormer-DeepLab, the first attempt to tackle multiple core video segmentation tasks in a unified manner. Different video segmentation tasks (e.g., video semantic/instance/panoptic segmentation) are usually considered as distinct problems. State-of-the-art models adopted in the separate communities have diverged, and radically different approaches dominate in each task. By contrast, we make a crucial observation that video segmentation tasks could be generally formulated as the problem of assigning different predicted labels to video tubes (where a tube is obtained by linking segmentation masks along the time axis) and the labels may encode different values depending on the target task. The observation motivates us to develop TubeFormer-DeepLab, a simple and effective video mask transformer model that is widely applicable to multiple video segmentation tasks. TubeFormer-DeepLab directly predicts video tubes with task-specific labels (either pure semantic categories, or both semantic categories and instance identities), which not only significantly simplifies video segmentation models, but also advances state-of-the-art results on multiple video segmentation benchmarks
Submitted 5 March, 2023; v1 submitted 30 May, 2022;
originally announced May 2022.
-
Improvement of all-optical Compton $γ$-rays source by reshaping colliding pulse
Authors:
Q. Yu,
Y. Zhang,
Q. Kong,
S. Kawata
Abstract:
All-optical Compton scattering is a remarkable method of generating a high-quality $γ$ radiation source. It is more easily achieved in experiments by employing a pulse based on a laser wakefield accelerator. The driving laser is reflected backward when the wakefield acceleration stage is over, and thus it naturally collides with the energetic electrons. To increase the reflected pulse intensity, a parabolic focusing plasma mirror is usually adopted instead of a flat reflecting target. However, a concave focusing mirror also degrades the monochromaticity and collimation of the emitted photon beam. We propose using a stepped focusing plasma mirror to reflect the driving pulse and overcome these issues. The pulse reflected by the stepped target has a larger longitudinal length and a relatively small intensity. This gives the emitted photon beam better monochromaticity and collimation, in addition to larger emitted energy and higher laser utilization efficiency. We confirm the robustness of the stepped focusing mirror reflecting regime through various numerical simulations.
Submitted 27 May, 2022;
originally announced May 2022.
-
The energy cost for flocking of active spins: the cusped dissipation maximum at the flocking transition
Authors:
Qiwei Yu,
Yuhai Tu
Abstract:
We study the energy cost of flocking in the active Ising model (AIM) and show that besides the energy cost for self-propelled motion, an additional energy dissipation is required to power the alignment of spins. We find that this additional alignment dissipation reaches its maximum at the flocking transition point in the form of a cusp with a discontinuous first derivative with respect to the control parameter. To understand this singular behavior, we analytically solve the two- and three-site AIM models and obtain the exact dependence of the alignment dissipation on the flocking order parameter and control parameter, which explains the cusped dissipation maximum at the flocking transition. Our results reveal a trade-off between the energy cost of the system and its performance measured by the flocking speed and sensitivity to external perturbations. This tradeoff relationship provides a new perspective for understanding the dynamics of natural flocks and designing optimal artificial flocking systems.
Submitted 23 December, 2022; v1 submitted 26 May, 2022;
originally announced May 2022.
-
The MD17 Datasets from the Perspective of Datasets for Gas-Phase "Small" Molecule Potentials
Authors:
Joel M. Bowman,
Chen Qu,
Riccardo Conte,
Apurba Nandi,
Paul L. Houston,
Qi Yu
Abstract:
There has been great progress in developing methods for machine-learned potential energy surfaces. There have also been important assessments of these methods by comparing so-called learning curves on datasets of electronic energies and forces, notably the MD17 database. The dataset for each molecule in this database generally consists of tens of thousands of energies and forces obtained from DFT direct dynamics at 500 K. We contrast the datasets from this database for three "small" molecules, ethanol, malonaldehyde, and glycine, with datasets we have generated with specific targets for the PESs in mind: a rigorous calculation of the zero-point energy and wavefunction, the tunneling splitting in malonaldehyde and in the case of glycine a description of all eight low-lying conformers. We found that the MD17 datasets are too limited for these targets. We also examine recent datasets for several PESs that describe small-molecule but complex chemical reactions. Finally, we introduce a new database, "QM-22", which contains datasets of molecules ranging from 4 to 15 atoms that extend to high energies and a large span of configurations.
Submitted 23 May, 2022;
originally announced May 2022.
-
6G Network AI Architecture for Everyone-Centric Customized Services
Authors:
Yang Yang,
Mulei Ma,
Hequan Wu,
Quan Yu,
Ping Zhang,
Xiaohu You,
Jianjun Wu,
Chenghui Peng,
Tak-Shing Peter Yum,
Sherman Shen,
Hamid Aghvami,
Geoffrey Y Li,
Jiangzhou Wang,
Guangyi Liu,
Peng Gao,
Xiongyan Tang,
Chang Cao,
John Thompson,
Kat-Kit Wong,
Shanzhi Chen,
Merouane Debbah,
Schahram Dustdar,
Frank Eliassen,
Tao Chen,
Xiangyang Duan
, et al. (29 additional authors not shown)
Abstract:
Mobile communication standards were developed for enhancing transmission and network performance by using more radio resources and improving spectrum and energy efficiency. How to effectively address diverse user requirements and guarantee everyone's Quality of Experience (QoE) remains an open problem. The Sixth Generation (6G) mobile systems will solve this problem by utilizing heterogenous network resources and pervasive intelligence to support everyone-centric customized services anywhere and anytime. In this article, we first coin the concept of Service Requirement Zone (SRZ) on the user side to characterize and visualize the integrated service requirements and preferences of specific tasks of individual users. On the system side, we further introduce the concept of User Satisfaction Ratio (USR) to evaluate the system's overall service ability of satisfying a variety of tasks with different SRZs. Then, we propose a network Artificial Intelligence (AI) architecture with integrated network resources and pervasive AI capabilities for supporting customized services with guaranteed QoEs. Finally, extensive simulations show that the proposed network AI architecture can consistently offer a higher USR performance than the cloud AI and edge AI architectures with respect to different task scheduling algorithms, random service requirements, and dynamic network conditions.
Submitted 6 December, 2023; v1 submitted 19 May, 2022;
originally announced May 2022.
-
Tight focusing proton beam with radius in nanometer scale generation based on channeled solid target
Authors:
Q. Yu
Abstract:
An efficient scheme for generating an ultra-tightly focused proton bunch with a nanometer-scale radius is proposed. A needlelike proton filament of transverse size in nanometer scale with the density of and charge quantity is obtained based on multi-dimension Particle-in-Cell (PIC) simulations. The regime is achieved by laser irradiation of a solid target with a pre-channeled density profile. The theoretical analysis shows that the transverse electric field changes dramatically from a defocusing dipole to a double-dipole structure when the initial target density distribution changes from uniform to pre-channeled. The inner dipole of the electric field tightly focuses the proton beam down to the nanometer scale. 3D simulations verify the scheme under realistic conditions. Various pre-channeled density profiles, including linear, parabolic and arbitrary steeped, prove to work well for the regime, which indicates the robustness and experimental feasibility of the scheme.
Submitted 16 May, 2022;
originally announced May 2022.
-
Light-shift-free and dead-zone-free atomic orientation based scalar magnetometry using a single amplitude-modulated beam
Authors:
Qianqian Yu,
Siqi Liu,
Chunqi Yuan,
Dong Sheng
Abstract:
Detection dead zones and heading errors induced by light shifts are two important problems in optically pumped scalar magnetometry. We introduce an atomic orientation based single-beam magnetometry scheme to simultaneously solve these problems, using a polarization-reversing and path-bending Herriott cavity. Here, a reflection mirror is inserted into the cavity to bend the optical paths in the middle, and divide them into two separated orthogonal regions to avoid the detection dead zone. Moreover, half-wave plates are added in the center of each optical region, so that the light polarization is flipped each time it passes the wave plates and the light shift effects are spatially averaged out. This operation is demonstrated to eliminate the unnoticed heading errors induced by ac light shifts. The methods developed in this paper are robust to use, and easy to be applied in other atomic devices.
Submitted 17 May, 2022;
originally announced May 2022.
-
Efficient Recovery of Low Rank Tensor via Triple Nonconvex Nonsmooth Rank Minimization
Authors:
Quan Yu
Abstract:
A tensor nuclear norm (TNN) based method for solving the tensor recovery problem was recently proposed, and it has achieved state-of-the-art performance. However, it may fail to produce a highly accurate solution since it tends to treat each frontal slice and each rank component of each frontal slice equally. In order to get a recovery with high accuracy, we propose a general and flexible rank relaxation function named double weighted nonconvex nonsmooth rank (DWNNR) relaxation function for efficiently solving the third order tensor recovery problem. The DWNNR relaxation function can be derived from the triple nonconvex nonsmooth rank (TNNR) relaxation function by setting the weight vector to be the hypergradient value of some concave function, thereby adaptively selecting the weight vector. To accelerate the proposed model, we develop the general inertial smoothing proximal gradient method. Furthermore, we prove that any limit point of the generated subsequence is a critical point. Combining the Kurdyka-Lojasiewicz (KL) property with some milder assumptions, we further give its global convergence guarantee. Experimental results on a practical tensor completion problem with both synthetic and real data demonstrate the efficiency and superior performance of the proposed algorithm.
Submitted 11 May, 2022;
originally announced May 2022.
-
Thermodynamics and quark condensates of three-flavor QCD at low temperature
Authors:
Jens O. Andersen,
Qing Yu,
Hua Zhou
Abstract:
We use three-flavor chiral perturbation theory ($χ$PT) to calculate the pressure, light and $s$-quark condensates of QCD in the confined phase at finite temperature to ${\cal O}(p^6)$ in the low-energy expansion. We also include electromagnetic effects to order $e^2$, where the electromagnetic coupling $e$ counts as order $p$. Our results for the pressure and the condensates suggest that $χ$PT converges very well for temperatures up to approximately 150 MeV. We combine $χ$PT and the Hadron Resonance Gas (HRG) model by adding heavier baryons and mesons. Our results are compared with lattice simulations and the agreement is very good for temperatures below 170 MeV, in contrast to the results from $χ$PT which agree with the lattice only up to $T\approx120$ MeV. Our value for the chiral crossover temperature is 160.1 MeV, which compares favorably to the lattice result of $157.3$ MeV.
Submitted 7 December, 2022; v1 submitted 6 May, 2022;
originally announced May 2022.
-
Spiking Graph Convolutional Networks
Authors:
Zulun Zhu,
Jiaying Peng,
Jintang Li,
Liang Chen,
Qi Yu,
Siqiang Luo
Abstract:
Graph Convolutional Networks (GCNs) achieve an impressive performance due to the remarkable representation ability in learning the graph information. However, GCNs, when implemented on a deep network, require expensive computation power, making them difficult to be deployed on battery-powered devices. In contrast, Spiking Neural Networks (SNNs), which perform a bio-fidelity inference process, offer an energy-efficient neural architecture. In this work, we propose SpikingGCN, an end-to-end framework that aims to integrate the embedding of GCNs with the biofidelity characteristics of SNNs. The original graph data are encoded into spike trains based on the incorporation of graph convolution. We further model biological information processing by utilizing a fully connected layer combined with neuron nodes. In a wide range of scenarios (e.g. citation networks, image graph classification, and recommender systems), our experimental results show that the proposed method could gain competitive performance against state-of-the-art approaches. Furthermore, we show that SpikingGCN on a neuromorphic chip can bring a clear advantage of energy efficiency into graph data analysis, which demonstrates its great potential to construct environment-friendly machine learning models.
Submitted 2 August, 2022; v1 submitted 5 May, 2022;
originally announced May 2022.
-
Gating-adapted Wavelet Multiresolution Analysis for Exposure Sequence Modeling in CTR prediction
Authors:
Xiaoxiao Xu,
Zhiwei Fang,
Qian Yu,
Ruoran Huang,
Chaosheng Fan,
Yong Li,
Yang He,
Changping Peng,
Zhangang Lin,
Jingping Shao
Abstract:
The exposure sequence is being actively studied for user interest modeling in Click-Through Rate (CTR) prediction. However, the existing methods for exposure sequence modeling bring extensive computational burden and neglect noise problems, resulting in excessive latency and limited performance in online recommenders. In this paper, we propose to address the high latency and noise problems via Gating-adapted wavelet multiresolution analysis (Gama), which can effectively denoise the extremely long exposure sequence and adaptively capture the implied multi-dimension user interest with linear computational complexity. This is the first attempt to integrate a non-parametric multiresolution analysis technique into deep neural networks to model the user exposure sequence. Extensive experiments on a large-scale benchmark dataset and a real production dataset confirm the effectiveness of Gama for exposure sequence modeling, especially in cold-start scenarios. Benefiting from its low latency and high effectiveness, Gama has been deployed in our real large-scale industrial recommender, successfully serving hundreds of millions of users.
Submitted 29 April, 2022;
originally announced April 2022.
-
All-optical Compton photon source generation via modulating pulse profile by convex focusing plasma mirror
Authors:
Q. Yu
Abstract:
A relatively efficient scheme to generate energetic $γ$-rays is all-optical Compton scattering based on wakefield acceleration and a plasma mirror, where a planar or concave solid target is usually employed to reflect the driving pulse. In the flat plasma mirror case, the colliding pulse radius is far larger than the radius of the accelerated electron beam, which leads to lower laser utilization efficiency. A concave plasma mirror can efficiently focus the reflected laser to match the electron beam radius, but the collimation of the produced photon beam is worse due to the intense transverse field and larger transverse wave vector of the colliding laser. Here, we propose a new mechanism to achieve Compton scattering by employing a convex plasma mirror. The convex plasma mirror increases the longitudinal length of the reflected laser in addition to providing a focusing effect, which ensures a longer colliding time and higher laser utilization efficiency. Importantly, it makes the transverse wave vector of the colliding laser smaller, which guarantees better photon bunch collimation.
Submitted 17 April, 2022;
originally announced April 2022.
-
A Survey of Video-based Action Quality Assessment
Authors:
Shunli Wang,
Dingkang Yang,
Peng Zhai,
Qing Yu,
Tao Suo,
Zhan Sun,
Ka Li,
Lihua Zhang
Abstract:
Human action recognition and analysis are in great demand and have important applications in video surveillance, video retrieval, and human-computer interaction. The task of human action quality evaluation requires an intelligent system to automatically and objectively evaluate the action completed by a human. An action quality assessment model can reduce the human and material resources spent on action evaluation and reduce subjectivity. In this paper, we provide a comprehensive survey of existing papers on video-based action quality assessment. Unlike human action recognition, the application scenario of action quality assessment is relatively narrow. Most of the existing work focuses on sports and medical care. We first introduce the definition and challenges of human action quality assessment. Then we present the existing datasets and evaluation metrics. In addition, we summarize the methods for sports and medical care by model category and publishing institution, according to the characteristics of the two fields. Finally, drawing on recent work, we discuss promising development directions in action quality assessment.
Submitted 20 April, 2022;
originally announced April 2022.
-
Multidimensional quantum calculation of the infrared spectra under polaritonic vibrational strong and ultrastrong coupling
Authors:
Qi Yu
Abstract:
Recent experiments and theory demonstrate that the ground state properties and chemical reactivity of molecules can be modified inside an optical cavity. Vibrational strong or ultrastrong coupling results in the formation of vibrational polaritons, which are usually observed through infrared (IR) spectra. Here, we provide a theoretical framework to conduct multidimensional quantum simulations of the infrared spectra when the molecule is interacting with cavity modes. Taking a single water molecule as an example, combined with accurate potential energy and dipole moment surfaces, our implemented cavity vibrational self-consistent field/virtual state configuration interaction (cVSCF/VCI) is shown to be able to provide quantitative predictions of the IR spectra when the molecule is inside or outside the cavity. The spectral signatures of resonance splittings and blue/red shifts of certain bands are found to be highly related to the frequency and polarization direction of the cavity modes. Further analyses of the simulated spectra show that polaritonic strong vibrational coupling greatly induces the coupling between the molecule's vibrational modes, indicating that intramolecular vibrational energy transfer may be significantly accelerated by the cavity.
Submitted 11 April, 2022;
originally announced April 2022.
-
q-AQUA: a many-body CCSD(T) water potential, including 4-body interactions, demonstrates the quantum nature of water from clusters to the liquid phase
Authors:
Qi Yu,
Chen Qu,
Paul L. Houston,
Riccardo Conte,
Apurba Nandi,
Joel M. Bowman
Abstract:
Many model potential energy surfaces (PESs) have been reported for water; however, none are strictly from "first principles". Here we report such a potential, based on a many-body representation at the CCSD(T) level of theory up to the ultimate 4-body interaction. The new PES is benchmarked for the isomers of the water hexamer for dissociation energies, harmonic frequencies, and unrestricted diffusion Monte Carlo (DMC) calculations of the zero-point energies of the Prism, Book, and Cage isomers. Dissociation energies of several isomers of the 20-mer agree well with recent benchmark energies. Exploratory DMC calculations on this cluster verify the robustness of the new PES for quantum simulations. The accuracy and speed of the new PES are demonstrated for standard condensed phase properties, i.e., the radial distribution function and the self-diffusion constant. Quantum effects are shown to be substantial for these observables and also needed to bring theory into an excellent agreement with experiment.
Submitted 4 April, 2022;
originally announced April 2022.
-
Exceptional fracture toughness of CrCoNi-based medium- and high-entropy alloys close to liquid helium temperatures
Authors:
Dong Liu,
Qin Yu,
Saurabh Kabra,
Ming Jiang,
Paul Forna-Kreutzer,
Ruopeng Zhang,
Madelyn Payne,
Flynn Walsh,
Bernd Gludovatz,
Mark Asta,
Andrew M. Minor,
Easo P. George,
Robert O. Ritchie
Abstract:
Medium- and high-entropy alloys based on the CrCoNi-system have been shown to display outstanding strength, tensile ductility and fracture toughness (damage-tolerance properties), especially at cryogenic temperatures. Here we examine the JIc and (back-calculated) KJIc fracture toughness values of the face-centered cubic, equiatomic CrCoNi and CrMnFeCoNi alloys at 20 K. At flow stress values of ~1.5 GPa, crack-initiation KJIc toughnesses were found to be exceptionally high, respectively 235 and 415 MPa√m for CrMnFeCoNi and CrCoNi, with the latter displaying a crack-growth toughness Kss exceeding 540 MPa√m after 2.25 mm of stable cracking, which to our knowledge is the highest such value ever reported. Characterization of the crack-tip regions in CrCoNi by scanning electron and transmission electron microscopy reveals deformation structures at 20 K that are quite distinct from those at higher temperatures and involve heterogeneous nucleation, but restricted growth, of stacking faults and fine nano-twins, together with transformation to the hexagonal close-packed phase. The coherent interfaces of these features can promote both the arrest and transmission of dislocations to generate respectively strength and ductility, which strongly contributes to sustained strain hardening. Indeed, we believe that these nominally single-phase, concentrated solid-solution alloys develop their fracture resistance through a progressive synergy of deformation mechanisms, including dislocation glide, stacking-fault formation, nano-twinning and eventually in situ phase transformation, all of which serve to extend continuous strain hardening, which simultaneously elevates strength and ductility (by delaying plastic instability), leading to truly exceptional resistance to fracture.
Submitted 4 April, 2022;
originally announced April 2022.
-
A Dynamic Meta-Learning Model for Time-Sensitive Cold-Start Recommendations
Authors:
Krishna Prasad Neupane,
Ervine Zheng,
Yu Kong,
Qi Yu
Abstract:
We present a novel dynamic recommendation model that focuses on users who interacted in the past but have recently become relatively inactive. Making effective recommendations to these time-sensitive cold-start users is critical to maintaining the user base of a recommender system. Because recent interactions are sparse, it is challenging to capture these users' current preferences precisely. Relying solely on their historical interactions may also lead to outdated recommendations misaligned with their recent interests. The proposed model leverages historical and current user-item interactions and dynamically factorizes a user's (latent) preference into time-specific and time-evolving representations that jointly affect user behaviors. These latent factors further interact with an optimized item embedding to achieve accurate and timely recommendations. Experiments on real-world data demonstrate the effectiveness of the proposed time-sensitive cold-start recommendation model.
Submitted 2 April, 2022;
originally announced April 2022.
-
Interpretable RNA Foundation Model from Unannotated Data for Highly Accurate RNA Structure and Function Predictions
Authors:
Jiayang Chen,
Zhihang Hu,
Siqi Sun,
Qingxiong Tan,
Yixuan Wang,
Qinze Yu,
Licheng Zong,
Liang Hong,
Jin Xiao,
Tao Shen,
Irwin King,
Yu Li
Abstract:
Non-coding RNA structure and function are essential to understanding various biological processes, such as cell signaling, gene expression, and post-transcriptional regulation. These are all among the core problems in the RNA field. With the rapid growth of sequencing technology, we have accumulated a massive amount of unannotated RNA sequences. On the other hand, expensive experimental observation yields only a limited amount of annotated data and 3D structures. Hence, it is still challenging to design computational methods for predicting their structures and functions. The lack of annotated data and systematic study causes inferior performance. To resolve the issue, we propose a novel RNA foundation model (RNA-FM) to take advantage of all the 23 million non-coding RNA sequences through self-supervised learning. Within this approach, we discover that the pre-trained RNA-FM could infer sequential and evolutionary information of non-coding RNAs without using any labels. Furthermore, we demonstrate RNA-FM's effectiveness by applying it to the downstream secondary/3D structure prediction, SARS-CoV-2 genome structure and evolution prediction, protein-RNA binding preference modeling, and gene expression regulation modeling. The comprehensive experiments show that the proposed method improves the RNA structural and functional modeling results significantly and consistently. Despite only being trained with unlabeled data, RNA-FM can serve as the foundational model for the field.
Submitted 7 August, 2022; v1 submitted 1 April, 2022;
originally announced April 2022.
-
BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers
Authors:
Zhiqi Li,
Wenhai Wang,
Hongyang Li,
Enze Xie,
Chonghao Sima,
Tong Lu,
Qiao Yu,
Jifeng Dai
Abstract:
3D visual perception tasks, including 3D detection and map segmentation based on multi-camera images, are essential for autonomous driving systems. In this work, we present a new framework termed BEVFormer, which learns unified BEV representations with spatiotemporal transformers to support multiple autonomous driving perception tasks. In a nutshell, BEVFormer exploits both spatial and temporal information by interacting with spatial and temporal space through predefined grid-shaped BEV queries. To aggregate spatial information, we design spatial cross-attention in which each BEV query extracts the spatial features from the regions of interest across camera views. For temporal information, we propose temporal self-attention to recurrently fuse the historical BEV information. Our approach achieves a new state-of-the-art 56.9% in terms of the NDS metric on the nuScenes test set, which is 9.0 points higher than the previous best methods and on par with the performance of LiDAR-based baselines. We further show that BEVFormer remarkably improves the accuracy of velocity estimation and the recall of objects under low visibility conditions. The code is available at https://github.com/zhiqi-li/BEVFormer.
Submitted 13 July, 2022; v1 submitted 31 March, 2022;
originally announced March 2022.
-
Variational corner transfer matrix renormalization group method for classical statistical models
Authors:
X. F. Liu,
Y. F. Fu,
W. Q. Yu,
J. F. Yu,
Z. Y. Xie
Abstract:
In the context of tensor network states, we for the first time reformulate the corner transfer matrix renormalization group (CTMRG) method into a variational bilevel optimization algorithm. The solution of the optimization problem corresponds to the fixed-point environment pursued in the conventional CTMRG method, from which the partition function of a classical statistical model, represented by an infinite tensor network, can be efficiently evaluated. The validity of this variational idea is demonstrated by the high-precision calculation of the residual entropy of the dimer model, and is further verified by investigating several typical phase transitions in classical spin models, where the obtained critical points and critical exponents all agree with the best known results in literature. Its extension to three-dimensional tensor networks or quantum lattice models is straightforward, as also discussed briefly.
Submitted 31 March, 2022;
originally announced March 2022.
-
One Dimensional Wormhole Corrosion in Metals
Authors:
Yang Yang,
Weiyue Zhou,
Sheng Yin,
Sarah Y. Wang,
Qin Yu,
Matthew J. Olszta,
Ya-Qian Zhang,
Steven E. Zeltmann,
Mingda Li,
Miaomiao Jin,
Daniel K. Schreiber,
Jim Ciston,
M. C. Scott,
John R. Scully,
Robert O. Ritchie,
Mark Asta,
Ju Li,
Michael P. Short,
Andrew M. Minor
Abstract:
Corrosion is a ubiquitous failure mode of materials in extreme environments. The more localized it is, the more difficult it is to detect and more deleterious its effects. Often, the progression of localized corrosion is accompanied by the evolution of porosity in materials, creating internal void-structures that facilitate the ingress of the external environment into the interior of the material, further accelerating the internal corrosion. Previously, the dominant morphology of such void-structures has been reported to be either three-dimensional (3D) or two-dimensional (2D). Here, we report a more localized form of corrosion, which we call 1D wormhole corrosion. Using electron tomography, we show multiple examples of this 1D and percolating morphology that manifests a significantly high aspect ratio differentiable from 2D and 3D corrosion. To understand the origin of this mechanism in a Ni-Cr alloy corroded by molten salt, we combined energy-filtered four-dimensional scanning transmission electron microscopy (EF-4D-STEM) and ab initio density functional theory (DFT) calculations to develop a vacancy mapping method with nanometer-resolution, identifying a remarkably high vacancy concentration in the diffusion-induced grain boundary migration (DIGM) zone, up to 100 times the equilibrium value at the melting point. These vacancy supersaturation regions act as the precursors of wormholes, and lead to the asymmetrical growth of voids along GBs. We show that similar 1D penetrating corrosion morphologies could also occur in other materials or corrosion conditions, implying the broad impact of this extremely localized corrosion mechanism. Deciphering the origins of 1D corrosion is an important step towards designing structural materials with enhanced corrosion resistance, and also offers new pathways to create ordered-porous materials for functional applications.
Submitted 30 March, 2022;
originally announced March 2022.
-
Bayesian Nonparametric Submodular Video Partition for Robust Anomaly Detection
Authors:
Hitesh Sapkota,
Qi Yu
Abstract:
Multiple-instance learning (MIL) provides an effective way to tackle the video anomaly detection problem by modeling it as a weakly supervised problem as the labels are usually only available at the video level while missing for frames due to expensive labeling cost. We propose to conduct novel Bayesian non-parametric submodular video partition (BN-SVP) to significantly improve MIL model training that can offer a highly reliable solution for robust anomaly detection in practical settings that include outlier segments or multiple types of abnormal events. BN-SVP essentially performs dynamic non-parametric hierarchical clustering with an enhanced self-transition that groups segments in a video into temporally consistent and semantically coherent hidden states that can be naturally interpreted as scenes. Each segment is assumed to be generated through a non-parametric mixture process that allows variations of segments within the same scenes to accommodate the dynamic and noisy nature of many real-world surveillance videos. The scene and mixture component assignment of BN-SVP also induces a pairwise similarity among segments, resulting in non-parametric construction of a submodular set function. Integrating this function with an MIL loss effectively exposes the model to a diverse set of potentially positive instances to improve its training. A greedy algorithm is developed to optimize the submodular function and support efficient model training. Our theoretical analysis ensures a strong performance guarantee of the proposed algorithm. The effectiveness of the proposed approach is demonstrated over multiple real-world anomaly video datasets with robust detection performance.
Submitted 24 March, 2022;
originally announced March 2022.
-
Multidimensional Belief Quantification for Label-Efficient Meta-Learning
Authors:
Deep Pandey,
Qi Yu
Abstract:
Optimization-based meta-learning offers a promising direction for few-shot learning that is essential for many real-world computer vision applications. However, learning from few samples introduces uncertainty, and quantifying model confidence for few-shot predictions is essential for many critical domains. Furthermore, few-shot tasks used in meta-training are usually sampled randomly from a task distribution for an iterative model update, leading to high labeling costs and computational overhead in meta-training. We propose a novel uncertainty-aware task selection model for label-efficient meta-learning. The proposed model formulates a multidimensional belief measure, which can quantify the known uncertainty and lower bound the unknown uncertainty of any given task. Our theoretical result establishes an important relationship between the conflicting belief and the incorrect belief. The theoretical result allows us to estimate the total uncertainty of a task, which provides a principled criterion for task selection. A novel multi-query task formulation is further developed to improve both the computational and labeling efficiency of meta-learning. Experiments conducted over multiple real-world few-shot image classification tasks demonstrate the effectiveness of the proposed model.
Submitted 23 March, 2022;
originally announced March 2022.
-
IAM: A Comprehensive and Large-Scale Dataset for Integrated Argument Mining Tasks
Authors:
Liying Cheng,
Lidong Bing,
Ruidan He,
Qian Yu,
Yan Zhang,
Luo Si
Abstract:
Traditionally, a debate requires a manual preparation process, including reading plenty of articles, selecting the claims, identifying the stances of the claims, seeking the evidence for the claims, etc. As AI debate has attracted more attention in recent years, it is worth exploring methods to automate the tedious process involved in the debating system. In this work, we introduce a comprehensive and large dataset named IAM, which can be applied to a series of argument mining tasks, including claim extraction, stance classification, evidence extraction, etc. Our dataset is collected from over 1k articles related to 123 topics. Nearly 70k sentences in the dataset are fully annotated based on their argument properties (e.g., claims, stances, evidence, etc.). We further propose two new integrated argument mining tasks associated with the debate preparation process: (1) claim extraction with stance classification (CESC) and (2) claim-evidence pair extraction (CEPE). We adopt a pipeline approach and an end-to-end method for each integrated task separately. Promising experimental results are reported to show the value and challenges of our proposed tasks, and to motivate future research on argument mining.
Submitted 16 July, 2022; v1 submitted 23 March, 2022;
originally announced March 2022.
-
OpenTAL: Towards Open Set Temporal Action Localization
Authors:
Wentao Bao,
Qi Yu,
Yu Kong
Abstract:
Temporal Action Localization (TAL) has experienced remarkable success under the supervised learning paradigm. However, existing TAL methods are rooted in the closed set assumption, which cannot handle the inevitable unknown actions in open-world scenarios. In this paper, we, for the first time, step toward the Open Set TAL (OSTAL) problem and propose a general framework, OpenTAL, based on Evidential Deep Learning (EDL). Specifically, OpenTAL consists of uncertainty-aware action classification, actionness prediction, and temporal location regression. With the proposed importance-balanced EDL method, classification uncertainty is learned by collecting categorical evidence mainly from important samples. To distinguish the unknown actions from background video frames, the actionness is learned by positive-unlabeled learning. The classification uncertainty is further calibrated by leveraging the guidance from the temporal localization quality. OpenTAL is general enough to adapt existing TAL models to open set scenarios, and experimental results on the THUMOS14 and ActivityNet1.3 benchmarks show the effectiveness of our method. The code and pre-trained models are released at https://www.rit.edu/actionlab/opental.
Submitted 9 March, 2022;
originally announced March 2022.
-
A Ta-TaS2 monolithic catalyst with robust and metallic interface for superior hydrogen evolution
Authors:
Qiangmin Yu,
Zhiyuan Zhang,
Siyao Qiu,
Yuting Luo,
Zhibo Liu,
Fengning Yang,
Heming Liu,
Shiyu Ge,
Xiaolong Zou,
Baofu Ding,
Wencai Ren,
Hui-Ming Cheng,
Chenghua Sun,
Bilu Liu
Abstract:
The use of highly active and robust catalysts is crucial for producing green hydrogen by water electrolysis as we strive to achieve global carbon neutrality. Noble metals like platinum are currently used in industry for the hydrogen evolution reaction (HER), but suffer from scarcity, high price, and unsatisfactory performance and stability at large current density, restricting their large-scale implementation. Here we report the synthesis of a new type of monolithic catalyst (MC) consisting of a metal disulfide (e.g., TaS2) catalyst vertically bonded to a conductive substrate of the same metal by strong covalent bonds. These features give the MC a mechanically robust and electrically near-zero-resistance interface, leading to an outstanding HER performance including rapid charge transfer and excellent durability, together with a low overpotential of 398 mV to achieve a current density of 2,000 mA cm$^{-2}$ as required by industry. The Ta-TaS2 MC has a negligible performance decay after 200 h of operation at large current densities. In light of its unique interface and the variety of metal elements that give the same structure, such monolithic materials may have broad uses besides catalysis.
Submitted 15 February, 2022;
originally announced February 2022.
-
GAMMA Challenge: Glaucoma grAding from Multi-Modality imAges
Authors:
Junde Wu,
Huihui Fang,
Fei Li,
Huazhu Fu,
Fengbin Lin,
Jiongcheng Li,
Lexing Huang,
Qinji Yu,
Sifan Song,
Xinxing Xu,
Yanyu Xu,
Wensai Wang,
Lingxiao Wang,
Shuai Lu,
Huiqi Li,
Shihua Huang,
Zhichao Lu,
Chubin Ou,
Xifei Wei,
Bingyuan Liu,
Riadh Kobbi,
Xiaoying Tang,
Li Lin,
Qiang Zhou,
Qiang Hu
, et al. (4 additional authors not shown)
Abstract:
Color fundus photography and Optical Coherence Tomography (OCT) are the two most cost-effective tools for glaucoma screening. Both imaging modalities have prominent biomarkers that indicate suspected glaucoma. Clinically, it is often recommended to take both screenings for a more accurate and reliable diagnosis. However, although numerous algorithms have been proposed based on fundus images or OCT volumes in computer-aided diagnosis, there are still few methods leveraging both modalities for glaucoma assessment. Inspired by the success of the Retinal Fundus Glaucoma Challenge (REFUGE) we held previously, we set up the Glaucoma grAding from Multi-Modality imAges (GAMMA) Challenge to encourage the development of fundus & OCT-based glaucoma grading. The primary task of the challenge is to grade glaucoma from both the 2D fundus images and 3D OCT scanning volumes. As part of GAMMA, we have publicly released a glaucoma annotated dataset with both 2D fundus color photography and 3D OCT volumes, which is the first multi-modality dataset for glaucoma grading. In addition, an evaluation framework is established to evaluate the performance of the submitted methods. During the challenge, 1272 results were submitted, and the top-10 teams were selected for the final stage. We analyze their results and summarize their methods in the paper. Since all these teams submitted their source code in the challenge, a detailed ablation study is also conducted to verify the effectiveness of the particular modules proposed. We find that many of the proposed techniques are practical for the clinical diagnosis of glaucoma. As the first in-depth study of fundus & OCT multi-modality glaucoma grading, we believe the GAMMA Challenge will be an essential starting point for future research.
Submitted 26 December, 2022; v1 submitted 14 February, 2022;
originally announced February 2022.