-
First Measurement of Solar $^8$B Neutrino Flux through Coherent Elastic Neutrino-Nucleus Scattering in PandaX-4T
Authors:
PandaX Collaboration,
Zihao Bo,
Wei Chen,
Xun Chen,
Yunhua Chen,
Zhaokan Cheng,
Xiangyi Cui,
Yingjie Fan,
Deqing Fang,
Zhixing Gao,
Lisheng Geng,
Karl Giboni,
Xunan Guo,
Xuyuan Guo,
Zichao Guo,
Chencheng Han,
Ke Han,
Changda He,
Jinrong He,
Di Huang,
Houqi Huang,
Junting Huang,
Ruquan Hou,
Yu Hou,
Xiangdong Ji
, et al. (77 additional authors not shown)
Abstract:
The PandaX-4T liquid xenon detector at the China Jinping Underground Laboratory is used to measure the solar $^8$B neutrino flux by detecting neutrinos through coherent scattering with xenon nuclei. Data samples requiring the coincidence of scintillation and ionization signals (paired), as well as unpaired ionization-only signals (US2), are selected with an energy threshold of approximately 1.1 keV (0.33 keV) nuclear recoil energy. Combining the commissioning run and the first science run of PandaX-4T, total exposures of 1.25 and 1.04 tonne$\cdot$year are collected for the paired and US2 data, respectively. After unblinding, 3 and 332 events are observed with an expectation of 2.8$\pm$0.5 and 251$\pm$32 background events, for the paired and US2 data, respectively. A combined analysis yields a best-fit $^8$B neutrino signal of 3.5 (75) events from the paired (US2) data sample, with $\sim$37\% uncertainty, and the background-only hypothesis is disfavored at 2.64$σ$ significance. This gives a solar $^8$B neutrino flux of ($8.4\pm3.1$)$\times$10$^6$ cm$^{-2}$s$^{-1}$, consistent with the standard solar model prediction. This is the first indication of solar $^8$B neutrino ``fog'' in a dark matter direct detection experiment.
Submitted 15 July, 2024;
originally announced July 2024.
-
Observation of exceptional line semimetal in three-dimensional non-Hermitian phononic crystals
Authors:
Yejian Hu,
Jien Wu,
Peidong Ye,
Weiyin Deng,
Jiuyang Lu,
Xueqin Huang,
Ziyu Wang,
Manzhu Ke,
Zhengyou Liu
Abstract:
Non-Hermitian topological phases, which exhibit unique features such as the skin effect and exceptional points originating from nontrivial band topologies in the complex plane, have attracted enormous attention in condensed-matter physics and metamaterials. Here we report the realization of an exceptional line semimetal in a three-dimensional non-Hermitian phononic crystal. A pair of exceptional rings with opposite topologies are connected by the drumhead bulk states in the first Brillouin zone. The exceptional rings not only possess wave-function topology and thus result in the drumhead surface states, but also host spectral topology and thereby give rise to the hybrid-order geometry-dependent skin effect in three dimensions. Our experimental results evidence the complete non-Hermitian bulk-boundary correspondence of the three-dimensional exceptional line semimetal, and may pave the way for designing non-Hermitian acoustic devices.
Submitted 4 July, 2024;
originally announced July 2024.
-
Start from Zero: Triple Set Prediction for Automatic Knowledge Graph Completion
Authors:
Wen Zhang,
Yajing Xu,
Peng Ye,
Zhiwei Huang,
Zezhong Xu,
Jiaoyan Chen,
Jeff Z. Pan,
Huajun Chen
Abstract:
Knowledge graph (KG) completion aims to find missing triples in a KG. Some tasks, such as link prediction and instance completion, have been proposed for KG completion. They are triple-level tasks in which some elements of a missing triple are given in order to predict the missing element of the triple. However, knowing some elements of the missing triple in advance is not always a realistic setting. In this paper, we propose a novel graph-level automatic KG completion task called Triple Set Prediction (TSP), which assumes that none of the elements in the missing triples is given. TSP is to predict a set of missing triples given a set of known triples. To properly and accurately evaluate this new task, we propose 4 evaluation metrics, including 3 classification metrics and 1 ranking metric, considering both the partial-open-world and the closed-world assumptions. Furthermore, to tackle the huge number of candidate triples for prediction, we propose a novel and efficient subgraph-based method, GPHT, that can predict the triple set quickly. To fairly compare the TSP results, we also propose two types of methods, RuleTensor-TSP and KGE-TSP, which apply the existing rule- and embedding-based methods for TSP as baselines. During experiments, we evaluate the proposed methods on two datasets extracted from Wikidata following the relation-similarity partial-open-world assumption proposed by us, and also create a complete family dataset to evaluate TSP results following the closed-world assumption. Results prove that the methods can successfully generate a set of missing triples and achieve reasonable scores on the new task, and GPHT performs better than the baselines with significantly shorter prediction time. The datasets and code for experiments are available at https://github.com/zjukg/GPHT-for-TSP.
Submitted 26 June, 2024;
originally announced June 2024.
-
BEACON: Benchmark for Comprehensive RNA Tasks and Language Models
Authors:
Yuchen Ren,
Zhiyuan Chen,
Lifeng Qiao,
Hongtai Jing,
Yuchen Cai,
Sheng Xu,
Peng Ye,
Xinzhu Ma,
Siqi Sun,
Hongliang Yan,
Dong Yuan,
Wanli Ouyang,
Xihui Liu
Abstract:
RNA plays a pivotal role in translating genetic instructions into functional outcomes, underscoring its importance in biological processes and disease mechanisms. Despite the emergence of numerous deep learning approaches for RNA, particularly universal RNA language models, there remains a significant lack of standardized benchmarks to assess the effectiveness of these methods. In this study, we introduce the first comprehensive RNA benchmark BEACON (\textbf{BE}nchm\textbf{A}rk for \textbf{CO}mprehensive R\textbf{N}A Task and Language Models). First, BEACON comprises 13 distinct tasks derived from extensive previous work covering structural analysis, functional studies, and engineering applications, enabling a comprehensive assessment of the performance of methods on various RNA understanding tasks. Second, we examine a range of models, including traditional approaches like CNNs, as well as advanced RNA foundation models based on language models, offering valuable insights into the task-specific performances of these models. Third, we investigate the vital RNA language model components from the tokenizer and positional encoding aspects. Notably, our findings emphasize the superiority of single nucleotide tokenization and the effectiveness of Attention with Linear Biases (ALiBi) over traditional positional encoding methods. Based on these insights, a simple yet strong baseline called BEACON-B is proposed, which can achieve outstanding performance with limited data and computational resources. The datasets and source code of our benchmark are available at https://github.com/terry-r123/RNABenchmark.
Submitted 14 June, 2024;
originally announced June 2024.
-
DualMamba: A Lightweight Spectral-Spatial Mamba-Convolution Network for Hyperspectral Image Classification
Authors:
Jiamu Sheng,
Jingyi Zhou,
Jiong Wang,
Peng Ye,
Jiayuan Fan
Abstract:
The effectiveness and efficiency of modeling complex spectral-spatial relations are both crucial for hyperspectral image (HSI) classification. Most existing methods based on CNNs and transformers still suffer from heavy computational burdens and have room for improvement in capturing the global-local spectral-spatial feature representation. To this end, we propose a novel lightweight parallel design called lightweight dual-stream Mamba-convolution network (DualMamba) for HSI classification. Specifically, parallel lightweight Mamba and CNN blocks are first developed to extract global and local spectral-spatial features. First, the cross-attention spectral-spatial Mamba module is proposed to leverage the global modeling of Mamba at linear complexity. Within this module, dynamic positional embedding is designed to enhance the spatial location information of visual sequences. The lightweight spectral/spatial Mamba blocks comprise an efficient scanning strategy and a lightweight Mamba design to efficiently extract global spectral-spatial features. The cross-attention spectral-spatial fusion is designed to learn cross-correlation and fuse spectral-spatial features. Second, the lightweight spectral-spatial residual convolution module is proposed with lightweight spectral and spatial branches to extract local spectral-spatial features through residual learning. Finally, the adaptive global-local fusion is proposed to dynamically combine global Mamba features and local convolution features for a global-local spectral-spatial representation. Compared with state-of-the-art HSI classification methods, experimental results demonstrate that DualMamba achieves significant classification accuracy on three public HSI datasets and a superior reduction in model parameters and floating point operations (FLOPs).
Submitted 11 June, 2024;
originally announced June 2024.
-
Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision
Authors:
Minglei Li,
Peng Ye,
Yongqi Huang,
Lin Zhang,
Tao Chen,
Tong He,
Jiayuan Fan,
Wanli Ouyang
Abstract:
Parameter-efficient fine-tuning (PEFT) has become increasingly important as foundation models continue to grow in both popularity and size. Adapters have been particularly well received due to their potential for parameter reduction and adaptability across diverse tasks. However, striking a balance between high efficiency and robust generalization across tasks remains a challenge for adapter-based methods. We analyze existing methods and find that: 1) parameter sharing is the key to reducing redundancy; 2) more tunable parameters, dynamic allocation, and block-specific design are keys to improving performance. Unfortunately, no previous work considers all these factors. Inspired by this insight, we introduce a novel framework named Adapter-X. First, a Sharing Mixture of Adapters (SMoA) module is proposed to fulfill token-level dynamic allocation, increased tunable parameters, and inter-block sharing at the same time. Second, some block-specific designs like Prompt Generator (PG) are introduced to further enhance the ability of adaptation. Extensive experiments across 2D image and 3D point cloud modalities demonstrate that Adapter-X represents a significant milestone as it is the first to outperform full fine-tuning in both 2D image and 3D point cloud modalities with significantly fewer parameters, i.e., only 0.20% and 1.88% of original trainable parameters for 2D and 3D classification tasks. Our code will be publicly available.
Submitted 5 June, 2024; v1 submitted 5 June, 2024;
originally announced June 2024.
-
Toeplitz non-liquids and Toeplitz braiding
Authors:
Boxi Li,
Yao Zhou,
Peng Ye
Abstract:
We study a class of $3$D non-liquid states called ``Toeplitz non-liquids''. These states consist of a stack of $2$D twisted $\mathbb{Z}_N$ topologically ordered layers along the $z$-direction; nearby layers are coupled while keeping translational symmetry along $z$. The effective field theory is described by infinite Chern-Simons (iCS) theory, with a coefficient matrix called ``$K$-matrix'' that is of block-tridiagonal Toeplitz matrix-type. With open boundary conditions (OBC) along the $z$-direction, certain $K$-matrices exhibit an exotic phenomenon called ``Toeplitz braiding'', where the mutual braiding statistical phase between two anyons at opposite boundaries oscillates and remains non-zero in the thermodynamic limit. As a necessary condition, this requires boundary zero modes in the $K$-matrix spectrum under OBC. A key example is the $K$-matrix resembling the Hamiltonian of the $1$D Su-Schrieffer-Heeger insulator. Since the gauge invariance of Chern-Simons theory guarantees integer quantized entries for $K$-matrices, no usual global symmetries are needed to protect these zero modes or Toeplitz braiding. In order to obtain the general theory, we categorize $K$-matrices that support Toeplitz braiding into three types and analyze the conditions for each. We further numerically study the analytical results for all types of $K$-matrices. For comparison, a trivial case is numerically shown, where the mutual statistical phase angle decays exponentially to zero in the thermodynamic limit.
Submitted 4 June, 2024;
originally announced June 2024.
-
FNP: Fourier Neural Processes for Arbitrary-Resolution Data Assimilation
Authors:
Kun Chen,
Tao Chen,
Peng Ye,
Hao Chen,
Kang Chen,
Tao Han,
Wanli Ouyang,
Lei Bai
Abstract:
Data assimilation is a vital component in modern global medium-range weather forecasting systems to obtain the best estimation of the atmospheric state by combining the short-term forecast and observations. Recently, AI-based data assimilation approaches have attracted increasing attention for their significant advantages over traditional techniques in terms of computational consumption. However, existing AI-based data assimilation methods can only handle observations with a specific resolution, lacking the compatibility and generalization ability to assimilate observations with other resolutions. Considering that complex real-world observations often have different resolutions, we propose the \textit{\textbf{Fourier Neural Processes}} (FNP) for \textit{arbitrary-resolution data assimilation} in this paper. Leveraging the efficiency of the designed modules and flexible structure of neural processes, FNP achieves state-of-the-art results in assimilating observations with varying resolutions, and also exhibits increasing advantages over the counterparts as the resolution and the amount of observations increase. Moreover, our FNP trained on a fixed resolution can directly handle the assimilation of observations with out-of-distribution resolutions and the observational information reconstruction task without additional fine-tuning, demonstrating its excellent generalization ability across data resolutions as well as across tasks.
Submitted 3 June, 2024;
originally announced June 2024.
-
$Δ$-DiT: A Training-Free Acceleration Method Tailored for Diffusion Transformers
Authors:
Pengtao Chen,
Mingzhu Shen,
Peng Ye,
Jianjian Cao,
Chongjun Tu,
Christos-Savvas Bouganis,
Yiren Zhao,
Tao Chen
Abstract:
Diffusion models are widely recognized for generating high-quality and diverse images, but their poor real-time performance has led to numerous acceleration works, primarily focusing on UNet-based structures. With the more successful results achieved by diffusion transformers (DiT), there is still a lack of exploration regarding the impact of DiT structure on generation, as well as the absence of an acceleration framework tailored to the DiT architecture. To tackle these challenges, we conduct an investigation into the correlation between DiT blocks and image generation. Our findings reveal that the front blocks of DiT are associated with the outline of the generated images, while the rear blocks are linked to the details. Based on this insight, we propose an overall training-free inference acceleration framework $Δ$-DiT: using a designed cache mechanism to accelerate the rear DiT blocks in the early sampling stages and the front DiT blocks in the later stages. Specifically, a DiT-specific cache mechanism called $Δ$-Cache is proposed, which considers the inputs of the previous sampling image and reduces the bias in the inference. Extensive experiments on PIXART-$α$ and DiT-XL demonstrate that $Δ$-DiT can achieve a $1.6\times$ speedup on the 20-step generation and even improve performance in most cases. In the scenario of 4-step consistent model generation and the more challenging $1.12\times$ acceleration, our method significantly outperforms existing methods. Our code will be publicly available.
Submitted 3 June, 2024;
originally announced June 2024.
-
Diagrammatic Representations of Topological Orders with Loop- and Membrane-like Excitations
Authors:
Yizhou Huang,
Zhi-Feng Zhang,
Peng Ye
Abstract:
In spacetime dimensions of 4D (i.e., 3+1D) and higher, topological orders exhibit spatially extended excitations like loops and membranes, which support diverse topological data characterizing braiding, fusion, and shrinking processes, despite the absence of anyons. Our understanding of these topological data remains less mature compared to 3D, where anyons have been extensively studied and can be fully described through diagrammatic representations. Inspired by recent advancements in field theory descriptions of higher-dimensional topological orders, this paper systematically constructs diagrammatic representations for 4D and 5D topological orders, generalizable to higher dimensions. We introduce elementary diagrams for fusion and shrinking processes, treating them as vectors in fusion and shrinking spaces, respectively, and build complex diagrams by combining these elementary diagrams. Within these vector spaces, we design unitary operations represented by \(F\)-, \(Δ\)-, and \(Δ^2\)-symbols to transform between different bases. We uncover \textit{pentagon equations} and \textit{(hierarchical) shrinking-fusion hexagon equations} that impose constraints on the legitimate forms of these unitary operations. We conjecture that all anomaly-free higher-dimensional topological orders must satisfy these conditions, with any violations indicating a quantum anomaly. This work opens promising avenues for future research, including the exploration of diagrammatic representations involving braiding and implications for noninvertible symmetries and Symmetry Topological Field Theory.
Submitted 23 June, 2024; v1 submitted 29 May, 2024;
originally announced May 2024.
-
EMR-Merging: Tuning-Free High-Performance Model Merging
Authors:
Chenyu Huang,
Peng Ye,
Tao Chen,
Tong He,
Xiangyu Yue,
Wanli Ouyang
Abstract:
The success of the pretrain-finetune paradigm has brought about the release of numerous model weights. In this case, merging models finetuned on different tasks to enable a single model with multi-task capabilities is gaining increasing attention for its practicability. Existing model merging methods usually suffer from (1) significant performance degradation or (2) requiring tuning by additional data or training. In this paper, we rethink and analyze the existing model merging paradigm. We discover that using a single model's weights can hardly simulate all the models' performance. To tackle this issue, we propose Elect, Mask & Rescale-Merging (EMR-Merging). We first (a) elect a unified model from all the model weights and then (b) generate extremely lightweight task-specific modulators, including masks and rescalers, to align the direction and magnitude between the unified model and each specific model, respectively. EMR-Merging is tuning-free, thus requiring no data availability or any additional training while showing impressive performance. We find that EMR-Merging shows outstanding performance compared to existing merging methods under different classical and newly-established settings, including merging different numbers of vision models (up to 30), NLP models, PEFT models, and multi-modal models.
Submitted 23 May, 2024;
originally announced May 2024.
-
Modeling and simulation of a mechanism for suppressing the flipping problem of a jumping robot
Authors:
Qi Li,
Liang Peng,
Zhiyuan Wu,
Pengda Ye,
Weitao Zhang,
Yi Xu,
Qing Shi
Abstract:
To solve the problem of stable jumping in micro-robots, we design a special mechanism: the elastic passive joint (EPJ). The EPJ can assist in achieving smooth jumping through its opening-closing process when the robot jumps. First, we introduce the composition and operating principle of the EPJ and perform dynamic modeling of the robot's jumping process. Then, to verify the effectiveness of the EPJ in controlling the robot's smooth jump, we design a simulation experiment based on MATLAB. Comparative experiments show that the EPJ can greatly adjust the angular velocity of the robot and increase its jump distance. Finally, we analyze each parameter of the EPJ and perform parameter optimization. After optimization, the EPJ achieves a completely flip-free jump of the robot, laying an important foundation for improving the mobility of micro-robots.
Submitted 20 May, 2024;
originally announced May 2024.
-
Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression
Authors:
Hancheng Ye,
Chong Yu,
Peng Ye,
Renqiu Xia,
Yansong Tang,
Jiwen Lu,
Tao Chen,
Bo Zhang
Abstract:
Recent Vision Transformer Compression (VTC) works mainly follow a two-stage scheme, where the importance score of each model unit is first evaluated or preset in each submodule, followed by the sparsity score evaluation according to the target sparsity constraint. Such a separate evaluation process induces the gap between importance and sparsity score distributions, thus causing high search costs for VTC. In this work, for the first time, we investigate how to integrate the evaluations of importance and sparsity scores into a single stage, searching the optimal subnets in an efficient manner. Specifically, we present OFB, a cost-efficient approach that simultaneously evaluates both importance and sparsity scores, termed Once for Both (OFB), for VTC. First, a bi-mask scheme is developed by entangling the importance score and the differentiable sparsity score to jointly determine the pruning potential (prunability) of each unit. Such a bi-mask search strategy is further used together with a proposed adaptive one-hot loss to realize the progressive-and-efficient search for the most important subnet. Finally, Progressive Masked Image Modeling (PMIM) is proposed to regularize the feature space to be more representative during the search process, which may be degraded by the dimension reduction. Extensive experiments demonstrate that OFB can achieve superior compression performance over state-of-the-art searching-based and pruning-based methods under various Vision Transformer architectures, meanwhile promoting search efficiency significantly, e.g., costing one GPU search day for the compression of DeiT-S on ImageNet-1K.
Submitted 23 March, 2024;
originally announced March 2024.
-
Solvent-Free Silsesquioxane Self-Welding for 3D Printing Multi-Refractive Index Glass Objects
Authors:
Piaoran Ye,
Zhihan Hong,
Douglas A. Loy,
Rongguang Liang
Abstract:
The growing interest in 3D printing of silica glass has spurred substantial research efforts. Our prior work utilizing a liquid silica resin (LSR) demonstrated high printing accuracy and resolution. However, the resin's sensitivity to moisture posed limitations, restricting the printing environment. On the other hand, polyhedral oligomeric silsesquioxane (POSS)-based materials offer excellent water stability and sinterless features. Yet, they suffer from relatively high shrinkage due to the presence of additional organic monomers. In this study, we present a polymeric silsesquioxane (PSQ) resin with reduced shrinkage, enhanced moisture stability, and the retention of sinterless features, providing a promising solution for achieving high-resolution 3D printing of glass objects. Leveraging the two-photon polymerization (2PP) method, we realized nanostructures with feature sizes below 80 nm. Moreover, we demonstrate the tunability of the refractive index by incorporating zirconium moieties into the resin, facilitating the fabrication of glass micro-optics with varying refractive indices. Importantly, the self-welding capability observed between two individual components provides a flexible approach for producing micro-optics with multiple components, each possessing distinct refractive indices. This research represents a significant advancement in the field of advanced glass manufacturing, paving the way for future applications in micro- and nano-scale glass objects.
Submitted 21 March, 2024;
originally announced March 2024.
-
Prompt-fused framework for Inductive Logical Query Answering
Authors:
Zezhong Xu,
Peng Ye,
Lei Liang,
Huajun Chen,
Wen Zhang
Abstract:
Answering logical queries on knowledge graphs (KG) poses a significant challenge for machine reasoning. The primary obstacle in this task stems from the inherent incompleteness of KGs. Existing research has predominantly focused on addressing the issue of missing edges in KGs, thereby neglecting another aspect of incompleteness: the emergence of new entities. Furthermore, most of the existing methods tend to reason over each logical operator separately, rather than comprehensively analyzing the query as a whole during the reasoning process. In this paper, we propose a query-aware prompt-fused framework named Pro-QE, which could incorporate existing query embedding methods and address the embedding of emerging entities through contextual information aggregation. Additionally, a query prompt, which is generated by encoding the symbolic query, is introduced to gather information relevant to the query from a holistic perspective. To evaluate the efficacy of our model in the inductive setting, we introduce two new challenging benchmarks. Experimental results demonstrate that our model successfully handles the issue of unseen entities in logical queries. Furthermore, the ablation study confirms the efficacy of the aggregator and prompt components.
Submitted 19 March, 2024;
originally announced March 2024.
-
New constraints on Triton's atmosphere from the 6 October 2022 stellar occultation
Authors:
Ye Yuan,
Chen Zhang,
Fan Li,
Jian Chen,
Yanning Fu,
Chunhai Bai,
Xing Gao,
Yong Wang,
Tuhong Zhong,
Yixing Gao,
Liang Wang,
Donghua Chen,
Yixing Zhang,
Yang Zhang,
Wenpeng Xie,
Shupi Zhang,
Ding Liu,
Jun Cao,
Xiangdong Yin,
Xiaojun Mo,
Jing Liu,
Xinru Han,
Tong Liu,
Yuqiang Chen,
Zhendong Gao
, et al. (25 additional authors not shown)
Abstract:
The atmosphere of Triton was probed directly by observing a ground-based stellar occultation on 6 October 2022. This rare event yielded 23 positive light curves collected from 13 separate observation stations contributing to our campaign. The significance of this event lies in its potential to directly validate the modest pressure fluctuation on Triton, a phenomenon not definitively verified by previous observations, including only five stellar occultations, and the Voyager 2 radio occultation in 1989. Using an approach consistent with a comparable study, we precisely determined a surface pressure of $14.07_{-0.13}^{+0.21}~\mathrm{μbar}$ in 2022. This new pressure rules out any significant monotonic variation in pressure between 2017 and 2022 through direct observations, as it is in alignment with the 2017 value. Additionally, both the pressures in 2017 and 2022 align with the 1989 value. This provides further support for the conclusion drawn from the previous volatile transport model simulation, which is consistent with the observed alignment between the pressures in 1989 and 2017; that is to say, the pressure fluctuation is modest. Moreover, this conclusion suggests the existence of a northern polar cap extended down to at least $45^\circ$N$-60^\circ$N and the presence of nitrogen between $30^\circ$S and $0^\circ$.
Submitted 24 March, 2024; v1 submitted 14 March, 2024;
originally announced March 2024.
-
Enhanced Sparsification via Stimulative Training
Authors:
Shengji Tang,
Weihao Lin,
Hancheng Ye,
Peng Ye,
Chong Yu,
Baopu Li,
Tao Chen
Abstract:
Sparsification-based pruning has been an important category in model compression. Existing methods commonly set sparsity-inducing penalty terms to suppress the importance of dropped weights, which is regarded as the suppressed sparsification paradigm. However, this paradigm inactivates the dropped parts of networks, causing capacity damage before pruning and thereby leading to performance degradation. To alleviate this issue, we first study and reveal the relative sparsity effect in emerging stimulative training and then propose a structured pruning framework, named STP, based on an enhanced sparsification paradigm which maintains the magnitude of dropped weights and enhances the expressivity of kept weights by self-distillation. Besides, to find an optimal architecture for the pruned network, we propose a multi-dimension architecture space and a knowledge distillation-guided exploration strategy. To reduce the huge capacity gap of distillation, we propose a subnet mutating expansion technique. Extensive experiments on various benchmarks indicate the effectiveness of STP. Specifically, without fine-tuning, our method consistently achieves superior performance at different budgets, especially under extremely aggressive pruning scenarios, e.g., retaining 95.11% of the Top-1 accuracy (72.43% versus 76.15%) while reducing FLOPs by 85% for ResNet-50 on ImageNet. Codes will be released soon.
Submitted 11 March, 2024;
originally announced March 2024.
-
MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer
Authors:
Jianjian Cao,
Peng Ye,
Shengze Li,
Chong Yu,
Yansong Tang,
Jiwen Lu,
Tao Chen
Abstract:
Vision-Language Transformers (VLTs) have shown great success recently, but they come with heavy computation costs, largely attributable to the large number of visual and language tokens. Existing token pruning research for compressing VLTs mainly follows a single-modality-based scheme yet ignores the critical role of aligning different modalities for guiding the token pruning process, causing the important tokens for one modality to be falsely pruned in another modality branch. Meanwhile, existing VLT pruning works also lack the flexibility to dynamically compress each layer based on different input samples. To this end, we propose a novel framework named Multimodal Alignment-Guided Dynamic Token Pruning (MADTP) for accelerating various VLTs. Specifically, we first introduce a well-designed Multi-modality Alignment Guidance (MAG) module that can align features of the same semantic concept from different modalities, to ensure the pruned tokens are less important for all modalities. We further design a novel Dynamic Token Pruning (DTP) module, which can adaptively adjust the token compression ratio in each layer based on different input instances. Extensive experiments on various benchmarks demonstrate that MADTP significantly reduces the computational complexity of various multimodal models while preserving competitive performance. Notably, when applied to the BLIP model on the NLVR2 dataset, MADTP can reduce the GFLOPs by 80% with less than 4% performance degradation.
Submitted 5 March, 2024;
originally announced March 2024.
-
Codebook-enabled Generative End-to-end Semantic Communication Powered by Transformer
Authors:
Peigen Ye,
Yaping Sun,
Shumin Yao,
Hao Chen,
Xiaodong Xu,
Shuguang Cui
Abstract:
Codebook-based generative semantic communication attracts increasing attention, since only indices are required to be transmitted when the codebook is shared between transmitter and receiver. However, due to the fact that the semantic relations among code vectors are not necessarily related to the distance of the corresponding code indices, the performance of the codebook-enabled semantic communication system is susceptible to the channel noise. Thus, how to improve the system robustness against the noise requires careful design. This paper proposes a robust codebook-assisted image semantic communication system, where semantic codec and codebook are first jointly constructed, and then vector-to-index transformer is designed guided by the codebook to eliminate the effects of channel noise, and achieve image generation. Thanks to the assistance of the high-quality codebook to the Transformer, the generated images at the receiver outperform those of the compared methods in terms of visual perception. In the end, numerical results and generated images demonstrate the advantages of the generative semantic communication method over JPEG+LDPC and traditional joint source channel coding (JSCC) methods.
Submitted 5 March, 2024; v1 submitted 22 January, 2024;
originally announced February 2024.
-
CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling
Authors:
Junchao Gong,
Lei Bai,
Peng Ye,
Wanghan Xu,
Na Liu,
Jianhua Dai,
Xiaokang Yang,
Wanli Ouyang
Abstract:
Precipitation nowcasting based on radar data plays a crucial role in extreme weather prediction and has broad implications for disaster management. Although progress has been made with deep learning, two key challenges of precipitation nowcasting are not well solved: (i) the modeling of complex precipitation system evolutions with different scales, and (ii) accurate forecasts for extreme precipitation. In this work, we propose CasCast, a cascaded framework composed of a deterministic and a probabilistic part to decouple the predictions for mesoscale precipitation distributions and small-scale patterns. Then, we explore training the cascaded framework at high resolution and conducting the probabilistic modeling in a low-dimensional latent space with a frame-wise-guided diffusion transformer, enhancing the optimization of extreme events while reducing computational costs. Extensive experiments on three benchmark radar precipitation datasets show that CasCast achieves competitive performance. In particular, CasCast significantly surpasses the baseline (up to +91.8%) for regional extreme-precipitation nowcasting.
Submitted 6 February, 2024;
originally announced February 2024.
-
ClipSAM: CLIP and SAM Collaboration for Zero-Shot Anomaly Segmentation
Authors:
Shengze Li,
Jianjian Cao,
Peng Ye,
Yuhan Ding,
Chongjun Tu,
Tao Chen
Abstract:
Recently, foundational models such as CLIP and SAM have shown promising performance for the task of Zero-Shot Anomaly Segmentation (ZSAS). However, both CLIP-based and SAM-based ZSAS methods still suffer from non-negligible key drawbacks: 1) CLIP primarily focuses on global feature alignment across different inputs, leading to imprecise segmentation of local anomalous parts; 2) SAM tends to generate numerous redundant masks without proper prompt constraints, resulting in complex post-processing requirements. In this work, we propose a CLIP and SAM collaboration framework called ClipSAM for ZSAS. The insight behind ClipSAM is to employ CLIP's semantic understanding capability for anomaly localization and rough segmentation, which is further used as the prompt constraints for SAM to refine the anomaly segmentation results. In detail, we introduce a crucial Unified Multi-scale Cross-modal Interaction (UMCI) module that lets language features interact with visual features at multiple scales of CLIP to reason about anomaly positions. Then, we design a novel Multi-level Mask Refinement (MMR) module, which utilizes the positional information as multi-level prompts for SAM to acquire hierarchical levels of masks and merges them. Extensive experiments validate the effectiveness of our approach, achieving the optimal segmentation performance on the MVTec-AD and VisA datasets.
Submitted 29 January, 2024; v1 submitted 23 January, 2024;
originally announced January 2024.
-
Lotto: Secure Participant Selection against Adversarial Servers in Federated Learning
Authors:
Zhifeng Jiang,
Peng Ye,
Shiqi He,
Wei Wang,
Ruichuan Chen,
Bo Li
Abstract:
In Federated Learning (FL), common privacy-enhancing techniques, such as secure aggregation and distributed differential privacy, rely on the critical assumption of an honest majority among participants to withstand various attacks. In practice, however, servers are not always trusted, and an adversarial server can strategically select compromised clients to create a dishonest majority, thereby undermining the system's security guarantees. In this paper, we present Lotto, an FL system that addresses this fundamental, yet underexplored issue by providing secure participant selection against an adversarial server. Lotto supports two selection algorithms: random and informed. To ensure random selection without a trusted server, Lotto enables each client to autonomously determine their participation using verifiable randomness. For informed selection, which is more vulnerable to manipulation, Lotto approximates the algorithm by employing random selection within a refined client pool. Our theoretical analysis shows that Lotto effectively aligns the proportion of server-selected compromised participants with the base rate of dishonest clients in the population. Large-scale experiments further reveal that Lotto achieves time-to-accuracy performance comparable to that of insecure selection methods, indicating a low computational overhead for secure selection.
Submitted 6 March, 2024; v1 submitted 5 January, 2024;
originally announced January 2024.
-
Higher-Order Cellular Automata Generated Symmetry-Protected Topological Phases and Detection Through Multi-Point Strange Correlators
Authors:
Jie-Yu Zhang,
Meng-Yuan Li,
Peng Ye
Abstract:
In computer and system sciences, higher-order cellular automata (HOCA) are a type of cellular automata that evolve over multiple time steps and generate complex patterns, which have various applications such as secret sharing schemes, data compression, and image encryption. In this paper, we introduce HOCA to quantum many-body physics and construct a series of symmetry-protected topological (SPT) phases of matter, in which symmetries are supported on a great variety of subsystems embedded in the SPT bulk. We call these phases HOCA-generated SPT (HGSPT) phases. Specifically, we show that HOCA can generate not only well-understood SPTs with symmetries supported on either regular (e.g., line-like subsystems in the 2D cluster model) or fractal subsystems, but also a large class of unexplored SPTs with symmetries supported on more choices of subsystems. One example is the mixed-subsystem SPT, which has either fractal and line-like subsystem symmetries simultaneously or two distinct types of fractal symmetries simultaneously. Another example is the chaotic SPT, in which chaotic-looking symmetries are significantly different from and thus cannot reduce to fractal or regular subsystem symmetries. We also introduce a new notation system to characterize HGSPTs. As the usual two-point strange correlators are trivial in most HGSPTs, we find that the nontrivial SPT orders can be detected by what we call multi-point strange correlators. We propose a universal procedure to design the spatial configuration of the multi-point strange correlators for a given HGSPT phase. Our HOCA programs and multi-point strange correlators pave the way for a unified paradigm to design, classify, and detect phases of matter with symmetries supported on a great variety of subsystems, and also provide a potentially useful perspective on surpassing the computational irreducibility of HOCA in a quantum mechanical way.
Submitted 28 January, 2024; v1 submitted 31 December, 2023;
originally announced January 2024.
-
Merging Vision Transformers from Different Tasks and Domains
Authors:
Peng Ye,
Chenyu Huang,
Mingzhu Shen,
Tao Chen,
Yongqi Huang,
Yuning Zhang,
Wanli Ouyang
Abstract:
This work aims to merge various Vision Transformers (ViTs) trained on different tasks (i.e., datasets with different object categories) or domains (i.e., datasets with the same categories but different environments) into one unified model that still performs well on each task or domain. Previous model merging works focus on either CNNs or NLP models, leaving ViT merging unexplored. To fill this gap, we first explore and find that existing model merging methods cannot handle the merging of whole ViT models well and leave room for improvement. To enable the merging of the whole ViT, we propose a simple-but-effective gating network that can both merge all kinds of layers (e.g., Embedding, Norm, Attention, and MLP) and select the suitable classifier. Specifically, the gating network is trained on unlabeled datasets from all the tasks (domains), and predicts the probability of which task (domain) the input belongs to for merging the models during inference. To further boost the performance of the merged model, especially when the difficulty of merging tasks increases, we design a novel metric of model weight similarity, and utilize it to realize controllable and combined weight merging. Comprehensive experiments on various newly established benchmarks validate the superiority of the proposed ViT merging framework for different tasks and domains. Our method can even merge more than 10 ViT models from different vision tasks with a negligible effect on the performance of each task.
Submitted 25 December, 2023;
originally announced December 2023.
-
Partial Fine-Tuning: A Successor to Full Fine-Tuning for Vision Transformers
Authors:
Peng Ye,
Yongqi Huang,
Chongjun Tu,
Minglei Li,
Tao Chen,
Tong He,
Wanli Ouyang
Abstract:
Fine-tuning pre-trained foundation models has gained significant popularity in various research fields. Existing methods for fine-tuning can be roughly divided into two categories, namely Parameter-Efficient Fine-Tuning and High-Performance Fine-Tuning. The former aims at improving efficiency, while the latter focuses on enhancing performance. Beyond these methods, we demonstrate that Partial Fine-Tuning can be an innovative and promising direction capable of concurrently enhancing both efficiency and accuracy. We first validate eight manually-defined partial fine-tuning strategies across various datasets and vision transformer architectures, and find that some partial fine-tuning strategies (e.g., ffn only or attention only) can achieve better performance with fewer tuned parameters than full fine-tuning, and that selecting appropriate layers is critical to partial fine-tuning. Thus, we propose a novel fine-tuned angle metric to guide the selection of appropriate layers for partial fine-tuning, making it flexible enough to be adapted to various scenarios for more practicable partial fine-tuning. Additionally, we show that partial fine-tuning can serve as a new dimension for Model Soups, improving both the model performance and generalization with fewer tuned parameters. Comprehensive experiments on a wide range of datasets and models validate the great potential of partial fine-tuning.
Submitted 25 December, 2023;
originally announced December 2023.
-
GanFinger: GAN-Based Fingerprint Generation for Deep Neural Network Ownership Verification
Authors:
Huali Ren,
Anli Yan,
Xiaojun Ren,
Pei-Gen Ye,
Chong-zhi Gao,
Zhili Zhou,
Jin Li
Abstract:
Deep neural networks (DNNs) are extensively employed in a wide range of application scenarios. Generally, training a commercially viable neural network requires significant amounts of data and computing resources, and it is easy for unauthorized users to use the networks illegally. Therefore, network ownership verification has become one of the most crucial steps in safeguarding digital assets. To verify the ownership of networks, the existing network fingerprinting approaches perform poorly in terms of efficiency, stealthiness, and discriminability. To address these issues, we propose a network fingerprinting approach, named GanFinger, to construct the network fingerprints based on the network behavior, which is characterized by network outputs of pairs of original examples and conferrable adversarial examples. Specifically, GanFinger leverages Generative Adversarial Networks (GANs) to effectively generate conferrable adversarial examples with imperceptible perturbations. These examples can exhibit identical outputs on copyrighted and pirated networks while producing different results on irrelevant networks. Moreover, to enhance the accuracy of fingerprint ownership verification, the network similarity is computed based on the accuracy-robustness distance of fingerprint examples' outputs. To evaluate the performance of GanFinger, we construct a comprehensive benchmark consisting of 186 networks with five network structures and four popular network post-processing techniques. The benchmark experiments demonstrate that GanFinger significantly outperforms the state of the art in efficiency, stealthiness, and discriminability. It generates fingerprints 6.57 times faster and boosts the ARUC value by 0.175, a relative improvement of about 26%.
Submitted 25 December, 2023;
originally announced December 2023.
-
Wideband Sample Rate Converter Using Cascaded Parallel-serial Structure for Synthetic Instrumentation
Authors:
Ruiyuan Ming,
Peng Ye,
Kuojun Yang,
Zhixiang Pan,
Li chen,
Xuetao Liu
Abstract:
A sample rate converter (SRC) is designed to adjust the sampling rate of digital signals flexibly for different application requirements in the broadband signal processing system. In this paper, a novel parallel-serial structure is proposed to improve the bandwidth and flexibility of SRC. The core of this structure is a parallel decimation filter followed by a serial counterpart: the parallel part is designed to process high sampling rate data streams, and the serial part provides high flexibility in decimation factor configuration. A typical combination of a cascaded integrator-comb (CIC) filter and a halfband filter is utilized in this structure, and the serial recursive loop which limits the processing ability of the CIC filter is transformed into a parallel-pipeline recursive structure. In addition, the symmetry property and zero coefficients of the halfband filter are exploited with the polyphase filter structure to reduce resource utilization and design complexity. Meanwhile, the decimation factor of the CIC filter can be adjusted flexibly over a wide range, which improves the system configuration flexibility. This parallel-serial SRC structure was implemented on a Xilinx KU115 series field programmable gate array (FPGA), and then applied in a synthetic instrument system. The experimental results demonstrate that the proposed scheme significantly improves the performance of SRC in bandwidth and flexibility.
Submitted 21 December, 2023;
originally announced December 2023.
-
Efficient Architecture Search via Bi-level Data Pruning
Authors:
Chongjun Tu,
Peng Ye,
Weihao Lin,
Hancheng Ye,
Chong Yu,
Tao Chen,
Baopu Li,
Wanli Ouyang
Abstract:
Improving the efficiency of Neural Architecture Search (NAS) is a challenging but significant task that has received much attention. Previous works mainly adopted the Differentiable Architecture Search (DARTS) and improved its search strategies or modules to enhance search efficiency. Recently, some methods have started considering data reduction for speedup, but they are not tightly coupled with the architecture search process, resulting in sub-optimal performance. To this end, this work pioneers an exploration into the critical role of dataset characteristics for DARTS bi-level optimization, and then proposes a novel Bi-level Data Pruning (BDP) paradigm that targets the weights and architecture levels of DARTS to enhance efficiency from a data perspective. Specifically, we introduce a new progressive data pruning strategy that utilizes supernet prediction dynamics as the metric, to gradually prune unsuitable samples for DARTS during the search. An effective automatic class balance constraint is also integrated into BDP, to suppress potential class imbalances resulting from data-efficient algorithms. Comprehensive evaluations on the NAS-Bench-201 search space, DARTS search space, and MobileNet-like search space validate that BDP reduces search costs by over 50% while achieving superior performance when applied to baseline DARTS. Besides, we demonstrate that BDP can harmoniously integrate with advanced DARTS variants, like PC-DARTS and β-DARTS, offering an approximately 2 times speedup with minimal performance compromises.
Submitted 20 December, 2023;
originally announced December 2023.
-
Rethinking of Feature Interaction for Multi-task Learning on Dense Prediction
Authors:
Jingdong Zhang,
Jiayuan Fan,
Peng Ye,
Bo Zhang,
Hancheng Ye,
Baopu Li,
Yancheng Cai,
Tao Chen
Abstract:
Existing works generally adopt the encoder-decoder structure for Multi-task Dense Prediction, where the encoder extracts the task-generic features, and multiple decoders generate task-specific features for predictions. We observe that low-level representations with rich details and high-level representations with abundant task information are not both involved in the multi-task interaction process. Additionally, low-quality and low-efficiency issues also exist in current multi-task learning architectures. In this work, we propose to learn a comprehensive intermediate feature globally from both task-generic and task-specific features, and we reveal an important fact: this intermediate feature, namely the bridge feature, is a good solution to the above issues. Based on this, we propose a novel Bridge-Feature-Centric Interaction (BRFI) method. A Bridge Feature Extractor (BFE) is designed for the generation of strong bridge features and Task Pattern Propagation (TPP) is applied to ensure high-quality task interaction participants. Then a Task-Feature Refiner (TFR) is developed to refine final task predictions with the well-learned knowledge from the bridge features. Extensive experiments are conducted on NYUD-v2 and PASCAL Context benchmarks, and the superior performance shows the proposed architecture is effective and powerful in promoting different dense prediction tasks simultaneously.
Submitted 20 December, 2023;
originally announced December 2023.
-
Towards an end-to-end artificial intelligence driven global weather forecasting system
Authors:
Kun Chen,
Lei Bai,
Fenghua Ling,
Peng Ye,
Tao Chen,
Jing-Jia Luo,
Hao Chen,
Yi Xiao,
Kang Chen,
Tao Han,
Wanli Ouyang
Abstract:
The weather forecasting system is important for science and society, and significant achievements have been made in applying artificial intelligence (AI) to medium-range weather forecasting. However, existing AI-based weather forecasting models rely on analysis or reanalysis products from traditional numerical weather prediction (NWP) systems as initial conditions for making predictions. Initial states are typically generated by traditional data assimilation components, which are computationally expensive and time-consuming. Here we present an AI-based data assimilation model, i.e., Adas, for global weather variables. By introducing the confidence matrix, Adas employs gated convolution to handle sparse observations and gated cross-attention for capturing the interactions between the background and observations. Further, we combine Adas with the advanced AI-based forecasting model (i.e., FengWu) to construct the first end-to-end AI-based global weather forecasting system: FengWu-Adas. We demonstrate that Adas can assimilate global observations to produce high-quality analysis, enabling the system to operate stably over the long term. Moreover, we are the first to apply the methods to real-world scenarios, which are more challenging and have considerable practical application potential. For the first time, forecasts based on AI-generated analyses achieve a skillful forecast lead time exceeding that of the IFS.
Submitted 8 April, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Measuring entanglement entropy and its topological signature for phononic systems
Authors:
Zhi-Kang Lin,
Yao Zhou,
Bin Jiang,
Bing-Quan Wu,
Li-Mei Chen,
Xiao-Yu Liu,
Li-Wei Wang,
Peng Ye,
Jian-Hua Jiang
Abstract:
Entanglement entropy is a fundamental concept with rising importance in different fields ranging from quantum information science and black holes to materials science. In complex materials and systems, entanglement entropy provides insight into the collective degrees of freedom that underlie the systems' complex behaviours. According to well-known predictions, the entanglement entropy exhibits area laws for systems with gapped excitations, whereas it follows the Gioev-Klich-Widom scaling law in gapless fermion systems. Furthermore, the entanglement spectrum provides salient characterizations of topological phases and phase transitions beyond the conventional paradigms. However, many of these fundamental predictions have not yet been confirmed in experiments due to the difficulties in measuring entanglement entropy in physical systems. Here, we report the experimental verification of the above predictions by probing the nonlocal correlations in phononic systems. From the pump-probe responses in phononic crystals, we obtain the entanglement entropy and entanglement spectrum for phononic systems with the fermion filling analog. With these measurements, we verify the Gioev-Klich-Widom scaling law of entanglement entropy for various quasiparticle dispersions in one and two dimensions. Moreover, we observe the salient signatures of topological phases in the entanglement spectrum and entanglement entropy, which unveil an unprecedented probe of topological phases without relying on the bulk-boundary correspondence. The progress here opens a frontier where entanglement entropy serves as an important experimental tool in the study of emergent phases and phase transitions, which can be generalized to non-Hermitian and other unconventional regimes.
Submitted 13 December, 2023;
originally announced December 2023.
-
Hydrogen-induced tunable remanent polarization in a perovskite nickelate
Authors:
Yifan Yuan,
Michele Kotiuga,
Tae Joon Park,
Yuanyuan Ni,
Arnob Saha,
Hua Zhou,
Jerzy T. Sadowski,
Abdullah Al-Mahboob,
Haoming Yu,
Kai Du,
Minning Zhu,
Sunbin Deng,
Ravindra S. Bisht,
Xiao Lyu,
Chung-Tse Michael Wu,
Peide D. Ye,
Abhronil Sengupta,
Sang-Wook Cheong,
Xiaoshan Xu,
Karin M. Rabe,
Shriram Ramanathan
Abstract:
Materials with field-tunable polarization are of broad interest to condensed matter sciences and solid-state device technologies. Here, using hydrogen (H) donor doping, we modify the room temperature metallic phase of a perovskite nickelate NdNiO3 into an insulating phase with both metastable dipolar polarization and space-charge polarization. We then demonstrate transient negative differential capacitance in thin film capacitors. The space-charge polarization caused by long-range movement and trapping of protons dominates when the electric field exceeds the threshold value. First-principles calculations suggest the polarization originates from the polar structure created by H doping. We find that polarization decays within ~1 second which is an interesting temporal regime for neuromorphic computing hardware design, and we implement the transient characteristics in a neural network to demonstrate unsupervised learning. These discoveries open new avenues for designing novel ferroelectric materials and electrets using light-ion doping.
Submitted 20 November, 2023;
originally announced November 2023.
-
Hyperfine Structure of Quantum Entanglement
Authors:
Liang-Hong Mo,
Yao Zhou,
Jia-Rui Sun,
Peng Ye
Abstract:
Quantum entanglement, crucial for understanding quantum many-body systems and quantum gravity, is commonly assessed through various measures such as von Neumann entropy, mutual information, and entanglement contour, each with its inherent limitations. In this work, we introduce the \textit{hyperfine structure of entanglement}, which dissects entanglement contours known as the fine structure into particle-number cumulants. This measure exhibits a set of universal properties with its significance in quantum information science. We apply it across diverse contexts: in Fermi gases, establishing connections to mutual information and interacting conformal field theory; in AdS$_3$/CFT$_2$ holographic duality, unveiling finer subregion-subregion duality and extending bulk reconstruction; and in Chern insulators, distinguishing between different quantum phases. Our findings suggest experimental accessibility, offering fresh insights into quantum entanglement across physical systems.
Submitted 11 June, 2024; v1 submitted 3 November, 2023;
originally announced November 2023.
-
Quantum Entanglement on Fractal Landscapes
Authors:
Yao Zhou,
Peng Ye
Abstract:
We explore the interplay of fractal geometry and quantum entanglement by analyzing the von Neumann entropy (known as entanglement entropy) and the entanglement contour in the scaling limit. Focusing on free-fermion quantum models known for their simplicity and effectiveness in studying highly entangled quantum systems, we uncover intriguing findings. For gapless ground states exhibiting a finite density of states at the chemical potential, we reveal a super-area law characterized by the presence of a logarithmic divergence in the entanglement entropy. This extends the well-established super-area law observed on translationally invariant Euclidean lattices where the Gioev-Klich-Widom conjecture regarding the asymptotic behavior of Toeplitz matrices holds significant influence. Furthermore, we observe the emergence of a self-similar and universal pattern termed an ``entanglement fractal'' in the entanglement contour data as we approach the scaling limit. Remarkably, this pattern bears resemblance to intricate Chinese paper-cutting designs. We provide general rules to artificially generate this fractal, offering insights into the universal scaling of entanglement entropy. Building upon the insights gained from the entanglement fractal, we explicitly elucidate the origin of the logarithmic divergence on fractals where translation symmetry is broken and the Widom conjecture is inapplicable. For gapped ground states, we observe that the entanglement entropy adheres to a generalized area law, with its dependence on the Hausdorff dimension of the boundary between complementary subsystems.
Submitted 26 March, 2024; v1 submitted 2 November, 2023;
originally announced November 2023.
-
Social Contract AI: Aligning AI Assistants with Implicit Group Norms
Authors:
Jan-Philipp Fränken,
Sam Kwok,
Peixuan Ye,
Kanishk Gandhi,
Dilip Arumugam,
Jared Moore,
Alex Tamkin,
Tobias Gerstenberg,
Noah D. Goodman
Abstract:
We explore the idea of aligning an AI assistant by inverting a model of users' (unknown) preferences from observed interactions. To validate our proposal, we run proof-of-concept simulations in the economic ultimatum game, formalizing user preferences as policies that guide the actions of simulated players. We find that the AI assistant accurately aligns its behavior to match standard policies from the economic literature (e.g., selfish, altruistic). However, the assistant's learned policies lack robustness and exhibit limited generalization in an out-of-distribution setting when confronted with a currency (e.g., grams of medicine) that was not included in the assistant's training distribution. Additionally, we find that when there is inconsistency in the relationship between language use and an unknown policy (e.g., an altruistic policy combined with rude language), the assistant's learning of the policy is slowed. Overall, our preliminary results suggest that developing simulation frameworks in which AI assistants need to infer preferences from diverse users can provide a valuable approach for studying practical alignment questions.
Submitted 3 December, 2023; v1 submitted 26 October, 2023;
originally announced October 2023.
-
Rethinking the BERT-like Pretraining for DNA Sequences
Authors:
Chaoqi Liang,
Weiqiang Bai,
Lifeng Qiao,
Yuchen Ren,
Jianle Sun,
Peng Ye,
Hongliang Yan,
Xinzhu Ma,
Wangmeng Zuo,
Wanli Ouyang
Abstract:
With the success of large-scale pretraining in NLP, there is an increasing trend of applying it to the domain of life sciences. In particular, pretraining methods based on DNA sequences have garnered growing attention due to their potential to capture generic information about genes. However, existing pretraining methods for DNA sequences largely rely on direct adoptions of BERT pretraining from NLP, lacking a comprehensive understanding and a specifically tailored approach. To address this research gap, we first conducted a series of exploratory experiments and gained several insightful observations: 1) In the fine-tuning phase of downstream tasks, when using K-mer overlapping tokenization instead of K-mer non-overlapping tokenization, both overlapping and non-overlapping pretraining weights show consistent performance improvement. 2) During the pre-training process, using K-mer overlapping tokenization quickly produces clear K-mer embeddings and reduces the loss to a very low level, while using K-mer non-overlapping tokenization results in less distinct embeddings and continuously decreases the loss. 3) Using overlapping tokenization causes the self-attention in the intermediate layers of pre-trained models to tend to overly focus on certain tokens, reflecting that these layers are not adequately optimized. In summary, overlapping tokenization can benefit the fine-tuning of downstream tasks but leads to inadequate pretraining with fast convergence. To unleash the pretraining potential, we introduce a novel approach called RandomMask, which gradually increases the task difficulty of BERT-like pretraining by continuously expanding its mask boundary, forcing the model to learn more knowledge. RandomMask is simple but effective, achieving top-tier performance on 26 of 28 datasets spanning 7 downstream tasks.
Submitted 11 October, 2023; v1 submitted 11 October, 2023;
originally announced October 2023.
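The overlapping versus non-overlapping K-mer tokenization contrasted in the abstract above can be illustrated with a minimal sketch. The function name `kmer_tokenize` and its parameters are hypothetical and are not taken from the paper's code; the sketch only shows how the two schemes slice a sequence, not how the authors build their vocabulary or masks.

```python
def kmer_tokenize(sequence: str, k: int, overlapping: bool = True) -> list[str]:
    """Split a DNA string into K-mers (illustrative helper, not the paper's API).

    overlapping=True slides the window by one base; overlapping=False slides
    by k and drops trailing bases that do not fill a whole K-mer.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    stride = 1 if overlapping else k
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


seq = "ATGCGTAC"
print(kmer_tokenize(seq, 3, overlapping=True))   # six tokens, each sharing two bases with its neighbour
print(kmer_tokenize(seq, 3, overlapping=False))  # two disjoint tokens; the trailing "AC" is dropped
```

With overlapping tokens, a masked K-mer shares most of its bases with its neighbours, which is one plausible reading of why the abstract reports fast convergence under that scheme.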
-
StructChart: Perception, Structuring, Reasoning for Visual Chart Understanding
Authors:
Renqiu Xia,
Bo Zhang,
Haoyang Peng,
Hancheng Ye,
Xiangchao Yan,
Peng Ye,
Botian Shi,
Yu Qiao,
Junchi Yan
Abstract:
Charts are common in literature across different scientific fields, conveying rich information easily accessible to readers. Current chart-related tasks focus on either chart perception, which refers to extracting information from the visual charts, or performing reasoning given the extracted data, e.g. in a tabular form. In this paper, we aim to establish a unified and label-efficient learning paradigm for joint perception and reasoning tasks, which can be generally applicable to different downstream tasks, beyond the question-answering task as specifically studied in peer works. Specifically, StructChart first reformulates the chart information from the popular tabular form (specifically linearized CSV) to the proposed Structured Triplet Representations (STR), which is better suited to reducing the task gap between chart perception and reasoning due to the employed structured information extraction for charts. We then propose a Structuring Chart-oriented Representation Metric (SCRM) to quantitatively evaluate the performance for the chart perception task. To enrich the dataset for training, we further explore the possibility of leveraging the Large Language Model (LLM), enhancing the chart diversity in terms of both chart visual style and its statistical information. Extensive experiments are conducted on various chart-related tasks, demonstrating the effectiveness and promising potential for a unified chart perception-reasoning paradigm to push the frontier of chart understanding.
Submitted 18 February, 2024; v1 submitted 20 September, 2023;
originally announced September 2023.
-
Tunable Circular Photogalvanic and Photovoltaic Effect in 2D Tellurium with Different Chirality
Authors:
Chang Niu,
Shouyuan Huang,
Neil Ghosh,
Pukun Tan,
Mingyi Wang,
Wenzhuo Wu,
Xianfan Xu,
Peide D. Ye
Abstract:
Chirality arises from the asymmetry of matter, where two counterparts are the mirror image of each other. The interaction between circularly polarized light and quantum materials is enhanced in chiral space groups due to the structural chirality. Tellurium (Te) possesses the simplest chiral crystal structure, with Te atoms covalently bonded into a spiral atomic chain (left- or right-handed) with a periodicity of three. Here, we investigate the tunable circular photo-electric responses in 2D Te field-effect transistors with different chirality, including the longitudinal circular photogalvanic effect induced by the radial spin texture (electron-spin polarization parallel to the electron momentum direction) and the circular photovoltaic effect induced by the chiral crystal structure (helical Te atomic chains). Our work demonstrates the controllable manipulation of the chirality degree of freedom in materials.
Submitted 12 September, 2023;
originally announced September 2023.
-
On the proportion of irreducible polynomials in unicritically generated semigroups
Authors:
Wade Hindes,
Reiyah Jacobs,
Benjamin Keller,
Albert Kim,
Peter Ye,
Aaron Zhou
Abstract:
Let $p$ be a prime number and let $S=\{x^p+c_1,\dots,x^p+c_r\}$ be a finite set of unicritical polynomials for some $c_1,\dots,c_r\in\mathbb{Z}$. Moreover, assume that $S$ contains at least one irreducible polynomial over $\mathbb{Q}$. Then we construct a large, explicit subset of irreducible polynomials within the semigroup generated by $S$ under composition; in fact, we show that this subset has positive asymptotic density within the full semigroup when we count polynomials by degree. In addition, when $p=2$ or $3$ we construct an infinite family of semigroups that break the local-global principle for irreducibility. To do this, we use a mix of algebraic and arithmetic techniques and results, including Runge's method, the elliptic curve Chabauty method, and Fermat's Last Theorem.
Submitted 27 August, 2023;
originally announced August 2023.
-
Boosting Residual Networks with Group Knowledge
Authors:
Shengji Tang,
Peng Ye,
Baopu Li,
Weihao Lin,
Tao Chen,
Tong He,
Chong Yu,
Wanli Ouyang
Abstract:
Recent research views residual networks from the new perspective of implicit ensemble models. From this view, previous methods such as stochastic depth and stimulative training have further improved the performance of the residual network by sampling and training its subnets. However, they both use the same supervision for all subnets of different capacities and neglect the valuable knowledge generated by subnets during training. In this manuscript, we mitigate the significant knowledge distillation gap caused by using the same kind of supervision and advocate leveraging the subnets to provide diverse knowledge. Based on this motivation, we propose a group knowledge based training framework for boosting the performance of residual networks. Specifically, we implicitly divide all subnets into hierarchical groups by subnet-in-subnet sampling, aggregate the knowledge of different subnets in each group during training, and exploit upper-level group knowledge to supervise lower-level subnet groups. Meanwhile, we also develop a subnet sampling strategy that naturally samples larger subnets, which are found to be more helpful than smaller subnets in boosting performance for hierarchical groups. Compared with typical subnet training and other methods, our method achieves the best efficiency and performance trade-offs on multiple datasets and network structures. The code is at https://github.com/tsj-001/AAAI24-GKT.
Submitted 14 December, 2023; v1 submitted 26 August, 2023;
originally announced August 2023.
-
Real-time frequency measurement based on parallel pipeline FFT for time-stretched acquisition system
Authors:
Ruiyuan Ming,
Peng Ye,
Kuojun Yang,
Zhixiang Pan,
ChenYang Li,
Chuang Huang
Abstract:
Real-time frequency measurement for non-repetitive and statistically rare signals is a challenging problem in the electronic measurement area, placing high demands on the bandwidth, sampling rate, data processing and transmission capabilities of the measurement system. The time-stretching sampling system overcomes the bandwidth limitation and sampling rate limitation of electronic digitizers, allowing continuous ultra-high-speed acquisition at refresh rates of billions of frames per second. However, processing the high sampling rate signals of hundreds of GHz is an extremely challenging task, which becomes the bottleneck of the real-time analysis for non-stationary signals. In this work, a real-time frequency measurement system is designed based on a parallel pipelined FFT structure. Tens of FFT channels are pipelined to process the incoming high sampling rate signals in sequence, and a simplified parabola fitting algorithm is implemented in the FFT channel to improve the frequency precision. The frequency results of these FFT channels are reorganized and finally uploaded to an industrial personal computer for visualization and offline data mining. A real-time transmission datapath is designed to provide a high throughput rate transmission, ensuring the frequency results are uploaded without interruption. Several experiments are performed to evaluate the designed real-time frequency measurement system; the input signal has a bandwidth of 4 GHz, and the repetition rate of frames is 22 MHz. Experimental results show that the frequency of the signal can be measured at a high sampling rate of 20 GSPS, and the frequency precision is better than 1 MHz.
Submitted 18 August, 2023;
originally announced August 2023.
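The idea of refining an FFT peak with a parabola fit, mentioned in the abstract above, can be sketched in software. This is the textbook three-point parabolic interpolation, assumed here for illustration; the abstract does not specify the authors' simplified hardware variant, and the naive DFT below merely stands in for a pipelined FFT channel. All function names are hypothetical.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2 (software stand-in for an FFT channel)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def parabolic_peak_frequency(samples, sample_rate):
    """Estimate the dominant frequency by fitting a parabola through the peak bin and its two neighbours."""
    mags = dft_magnitudes(samples)
    # Skip the DC bin and the last bin so that both neighbours always exist.
    k = max(range(1, len(mags) - 1), key=lambda i: mags[i])
    a, b, c = mags[k - 1], mags[k], mags[k + 1]
    denom = a - 2 * b + c
    # Vertex offset of the fitted parabola, in bins, relative to bin k.
    delta = 0.0 if denom == 0 else 0.5 * (a - c) / denom
    return (k + delta) * sample_rate / len(samples)


fs, n = 8000.0, 256  # illustrative values; the bin spacing is fs / n = 31.25 Hz
tone = [math.sin(2 * math.pi * 1050.0 * t / fs) for t in range(n)]
print(parabolic_peak_frequency(tone, fs))  # closer to 1050 Hz than the nearest bin centre at 1062.5 Hz
```

Without a window function the interpolation is biased, so a practical design would likely pair it with windowing or a bias correction; the sketch only shows why three magnitude samples suffice to go below the bin spacing.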
-
RFD-ECNet: Extreme Underwater Image Compression with Reference to Feature Dictionary
Authors:
Mengyao Li,
Liquan Shen,
Peng Ye,
Guorui Feng,
Zheyin Wang
Abstract:
Thriving underwater applications demand efficient extreme compression technology to realize the transmission of underwater images (UWIs) in very narrow underwater bandwidth. However, existing image compression methods achieve inferior performance on UWIs because they do not consider the characteristics of UWIs: (1) Multifarious underwater styles of color shift and distance-dependent clarity, caused by the unique underwater physical imaging; (2) Massive redundancy between different UWIs, caused by the fact that different UWIs contain several common ocean objects, which have plenty of similarities in structures and semantics. To remove redundancy among UWIs, we first construct an exhaustive underwater multi-scale feature dictionary to provide coarse-to-fine reference features for UWI compression. Subsequently, an extreme UWI compression network with reference to the feature dictionary (RFD-ECNet) is creatively proposed, which utilizes feature match and reference feature variant to significantly remove redundancy among UWIs. To align the multifarious underwater styles and improve the accuracy of feature match, an underwater style normalized block (USNB) is proposed, which utilizes underwater physical priors extracted from the underwater physical imaging model to normalize the underwater styles of dictionary features toward the input. Moreover, a reference feature variant module (RFVM) is designed to adaptively morph the reference features, improving the similarity between the reference and input features. Experimental results on four UWI datasets show that our RFD-ECNet is the first work that achieves a significant BD-rate saving of 31% over the most advanced VVC.
Submitted 16 August, 2023;
originally announced August 2023.
-
Experts Weights Averaging: A New General Training Scheme for Vision Transformers
Authors:
Yongqi Huang,
Peng Ye,
Xiaoshui Huang,
Sheng Li,
Tao Chen,
Tong He,
Wanli Ouyang
Abstract:
Structural re-parameterization is a general training scheme for Convolutional Neural Networks (CNNs), which achieves performance improvement without increasing inference cost. As Vision Transformers (ViTs) are gradually surpassing CNNs in various visual tasks, one may ask: does a training scheme specifically for ViTs exist that can also achieve performance improvement without increasing inference cost? Recently, Mixture-of-Experts (MoE) has attracted increasing attention, as it can efficiently scale up the capacity of Transformers at a fixed cost through sparsely activated experts. Considering that MoE can also be viewed as a multi-branch structure, can we utilize MoE to implement a ViT training scheme similar to structural re-parameterization? In this paper, we affirmatively answer these questions, with a new general training strategy for ViTs. Specifically, we decouple the training and inference phases of ViTs. During training, we replace some Feed-Forward Networks (FFNs) of the ViT with specially designed, more efficient MoEs that assign tokens to experts by random uniform partition, and perform Experts Weights Averaging (EWA) on these MoEs at the end of each iteration. After training, we convert each MoE into an FFN by averaging the experts, transforming the model back into the original ViT for inference. We further provide a theoretical analysis to show why and how it works. Comprehensive experiments across various 2D and 3D visual tasks, ViT architectures, and datasets validate the effectiveness and generalizability of the proposed training scheme. Besides, our training scheme can also be applied to improve performance when fine-tuning ViTs. Last but equally important, the proposed EWA technique can significantly improve the effectiveness of naive MoE in various 2D visual small datasets and 3D visual tasks.
Submitted 25 August, 2023; v1 submitted 11 August, 2023;
originally announced August 2023.
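The post-training step described in the abstract above, converting each MoE into a single FFN by averaging the experts, can be sketched as an element-wise parameter average. The helper `average_experts` and the flat-list parameter layout are assumptions made for illustration; the per-iteration EWA update rule used during training is not given in the abstract and is not modeled here.

```python
def average_experts(experts):
    """Average expert parameter dicts element-wise into one parameter dict.

    Each expert maps a parameter name to a flat list of floats; all experts
    are assumed to share the same names and shapes (hypothetical layout).
    """
    if not experts:
        raise ValueError("need at least one expert")
    count = len(experts)
    return {
        name: [sum(values) / count for values in zip(*(e[name] for e in experts))]
        for name in experts[0]
    }


# Two toy experts with the same parameter names and shapes.
experts = [
    {"w1": [1.0, 2.0], "b1": [0.0]},
    {"w1": [3.0, 6.0], "b1": [1.0]},
]
merged = average_experts(experts)
print(merged)  # {'w1': [2.0, 4.0], 'b1': [0.5]}
```

Because the merged block has the shape of one ordinary FFN, inference cost matches the original ViT, which is the property the abstract emphasizes.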
-
Continuum field theory of 3D topological orders with emergent fermions and braiding statistics
Authors:
Zhi-Feng Zhang,
Qing-Rui Wang,
Peng Ye
Abstract:
Universal topological data of topologically ordered phases can be captured by topological quantum field theory in continuous spacetime by taking the limit of low energies and long wavelengths. While previous continuum field-theoretical studies of topological orders in $3$D real space focus on either self-statistics, braiding statistics, shrinking rules, fusion rules or quantum dimensions, all topological data have yet to be systematically put together in a unified continuum field-theoretical framework. Here, we construct the topological $BF$ field theory with twisted terms (e.g., $AAdA$ and $AAB$) as well as a $K$-matrix $BB$ term, in order to simultaneously explore all such topological data and reach anomaly-free topological orders. Following the spirit of the famous $K$-matrix Chern-Simons theory of $2$D topological orders, we present general formulas and systematically show how the $K$-matrix $BB$ term confines topological excitations, and how the self-statistics of particles is transmuted between bosonic and fermionic. In order to reach anomaly-free topological orders, we explore, within the present continuum field-theoretical framework, how the principle of gauge invariance fundamentally influences possible realizations of topological data. More concretely, we present the topological actions of (i) particle-loop braidings with emergent fermions, (ii) multiloop braidings with emergent fermions, and (iii) Borromean-Rings braidings with emergent fermions, and calculate their universal topological data. Together with the previous efforts, our work paves the way toward a more systematic and complete continuum field-theoretical analysis of exotic topological properties of $3$D topological orders. Several interesting future directions are also discussed.
Submitted 8 August, 2023; v1 submitted 19 July, 2023;
originally announced July 2023.
-
Fusion rules and shrinking rules of topological orders in five dimensions
Authors:
Yizhou Huang,
Zhi-Feng Zhang,
Peng Ye
Abstract:
As part of a series of works on 5D (spacetime) topological orders, here we employ the path-integral formalism of 5D topological quantum field theory (TQFT) established in Zhang and Ye, JHEP 04 (2022) 138 to explore non-Abelian fusion rules, hierarchical shrinking rules and quantum dimensions of particle-like, loop-like and membrane-like topological excitations in 5D topological orders. To illustrate, we focus on a prototypical example of twisted $BF$ theories that comprise the twisted topological terms of the $BBA$ type. First, we classify topological excitations by establishing equivalence classes among all gauge-invariant Wilson operators. Then, we compute fusion rules from the path integral and find that fusion rules may be non-Abelian; that is, the fusion outcome can be a direct sum of distinct excitations. We further compute shrinking rules. In particular, we discover exotic hierarchical structures hidden in shrinking processes of 5D or higher: a membrane is shrunk into particles and loops, and the loops are subsequently shrunk into a direct sum of particles. We obtain the algebraic structure of shrinking coefficients and fusion coefficients. We compute the quantum dimensions of all excitations and find that sphere-like membranes and torus-like membranes differ not only by their shapes but also by their quantum dimensions. We further study the algebraic structure that determines anomaly-free conditions on fusion coefficients and shrinking coefficients. Besides $BBA$, we explore general properties of all twisted terms in $5$D. Together with braiding statistics reported before, the theoretical progress here paves the way toward characterizing and classifying topological orders in higher dimensions where topological excitations consist of both particles and spatially extended objects.
Submitted 21 November, 2023; v1 submitted 26 June, 2023;
originally announced June 2023.
-
Exploring Multi-Timestep Multi-Stage Diffusion Features for Hyperspectral Image Classification
Authors:
Jingyi Zhou,
Jiamu Sheng,
Jiayuan Fan,
Peng Ye,
Tong He,
Bin Wang,
Tao Chen
Abstract:
The effectiveness of spectral-spatial feature learning is crucial for the hyperspectral image (HSI) classification task. Diffusion models, as a new class of groundbreaking generative models, have the ability to learn both contextual semantics and textual details from the distinct timestep dimension, enabling the modeling of complex spectral-spatial relations in HSIs. However, existing diffusion-based HSI classification methods only utilize manually selected single-timestep single-stage features, limiting the full exploration and exploitation of rich contextual semantics and textual information hidden in the diffusion model. To address this issue, we propose a novel diffusion-based feature learning framework, called MTMSD, that explores Multi-Timestep Multi-Stage Diffusion features for HSI classification for the first time. Specifically, the diffusion model is first pretrained with unlabeled HSI patches to mine the connotation of unlabeled data, and is then used to extract the multi-timestep multi-stage diffusion features. To effectively and efficiently leverage multi-timestep multi-stage features, two strategies are further developed. One strategy is a class & timestep-oriented multi-stage feature purification module with the inter-class and inter-timestep prior for reducing the redundancy of multi-stage features and alleviating memory constraints. The other is a selective timestep feature fusion module with the guidance of global features to adaptively select different timestep features for integrating texture and semantics. Both strategies facilitate the generality and adaptability of the MTMSD framework for diverse patterns of different HSI data. Extensive experiments are conducted on four public HSI datasets, and the results demonstrate that our method outperforms state-of-the-art methods for HSI classification, especially on the challenging Houston 2018 dataset.
Submitted 3 June, 2024; v1 submitted 15 June, 2023;
originally announced June 2023.
-
Stimulative Training++: Go Beyond The Performance Limits of Residual Networks
Authors:
Peng Ye,
Tong He,
Shengji Tang,
Baopu Li,
Tao Chen,
Lei Bai,
Wanli Ouyang
Abstract:
Residual networks have shown great success and become indispensable in recent deep neural network models. In this work, we aim to re-investigate the training process of residual networks from a novel social psychology perspective of loafing, and further propose a new training scheme as well as three improved strategies for boosting residual networks beyond their performance limits. Previous research has suggested that residual networks can be considered as ensembles of shallow networks, which implies that the final performance of a residual network is influenced by a group of subnetworks. We identify a previously overlooked problem that is analogous to social loafing, where subnetworks within a residual network are prone to exert less effort when working as part of a group compared to working alone. We define this problem as \textit{network loafing}. Similar to the decreased individual productivity and overall performance as demonstrated in society, network loafing inevitably causes sub-par performance. Inspired by solutions from social psychology, we first propose a novel training scheme called stimulative training, which randomly samples a residual subnetwork and calculates the KL divergence loss between the sampled subnetwork and the given residual network for extra supervision. In order to unleash the potential of stimulative training, we further propose three simple-yet-effective strategies, including a novel KL- loss that only aligns the network logits direction, random smaller inputs for subnetworks, and inter-stage sampling rules. Comprehensive experiments and analysis verify the effectiveness of stimulative training as well as its three improved strategies.
Submitted 3 May, 2023;
originally announced May 2023.
-
Many-body physics of spontaneously broken higher-rank symmetry: from fractonic superfluids to dipolar Hubbard model
Authors:
Shuai A. Chen,
Peng Ye
Abstract:
Fractonic superfluids are exotic phases of matter in which bosons are subject to mobility constraints, resulting in features beyond those of conventional superfluids. These exotic phases arise from the spontaneous breaking of higher-rank symmetry (HRS) in many-body systems with higher-moment conservation, such as dipoles, quadrupoles, and angular moments. The aim of this paper is to introduce exciting developments in the theory of spontaneous symmetry breaking in such systems, which we refer to as ``many-fracton systems''. More specifically, we introduce exciting progress on general aspects of HRS, minimal model construction, realization of symmetry-breaking ground states, order parameter, off-diagonal long-range order (ODLRO), Noether currents with continuity equations, Gross-Pitaevskii equations, quantum fluctuations, Goldstone modes, specific heat, generalized Mermin-Wagner theorem, critical current, Landau criterion, symmetry defects, Kosterlitz-Thouless (KT)-like physics, hydrodynamics, and dipolar Hubbard model realization. The paper concludes with several future directions.
Submitted 12 August, 2023; v1 submitted 1 May, 2023;
originally announced May 2023.
-
Proceedings to the 25th International Workshop "What Comes Beyond the Standard Models", July 4 -- July 10, 2022, Bled, Slovenia
Authors:
R. Bernabei,
P. Belli,
A. Bussolotti,
V. Caracciolo,
R. Cerulli,
N. Ferrari,
A. Leoncini,
V. Merlo,
F. Montecchia,
F. Cappella,
A. dAngelo,
A. Incicchitti,
A. Mattei,
C. J. Dai,
X. H. Ma,
X. D. Sheng,
Z. P. Ye,
V. Beylin,
L. Bonora,
S. J. Brodsky,
Paul H. Frampton,
A. Ghoshal,
G. Lambiase,
S. Pal,
A. Paul
, et al. (29 additional authors not shown)
Abstract:
Proceedings for our meeting ``What comes beyond the Standard Models'', which covered a broad series of subjects.
Submitted 29 March, 2023;
originally announced March 2023.
-
Phase-field Simulations of Polarization Variations in Polycrystalline Hf0.5Zr0.5O2 based MFIM: Voltage-Dependence and Dynamics
Authors:
Revanth Koduru,
Imtiaz Ahmed,
Atanu K Saha,
Xiao Lyu,
Peide Ye,
Sumeet K. Gupta
Abstract:
In this work, we investigate the device-to-device variations in remanent polarization of Hafnium-Zirconium-Oxide (HZO) based Metal-Ferroelectric-Insulator-Metal (MFIM) stacks. We consider the effects of polycrystallinity in conjunction with multi-domain effects in HZO to understand the dependencies of variations on static and dynamic voltage stimuli using our 3D dynamic multi-grain phase-field simulation framework. We examine the trends in variations due to various design factors (set voltage, pulse amplitude, and pulse width) and correlate them to the dynamics of polarization switching and the underlying mechanisms. According to our analysis, variations exhibit a non-monotonic dependence on set voltage due to the interplay between voltage-dependent switching mechanisms and the polycrystalline structure. We further report that towards the higher end of the set voltages, collapsing of oppositely polarized domains can lead to an increase in variations. We also show that ferroelectric thickness scaling lowers the device-to-device variations. In addition, considering the dynamics of polarization switching, we highlight the key role of the voltage and temporal dependence of domain nucleation in dictating the trends in variations. Finally, we show that to reach a target mean polarization, using a pulse with lower amplitude for a longer duration results in lower variations compared to a higher-amplitude pulse for a shorter duration.
Submitted 27 March, 2023;
originally announced March 2023.