-
Supernova Pointing Capabilities of DUNE
Authors:
DUNE Collaboration,
A. Abed Abud,
B. Abi,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
B. Aimard,
F. Akbar,
K. Allison,
S. Alonso Monsalve,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
H. Amar,
P. Amedo,
J. Anderson,
D. A. Andrade
, et al. (1340 additional authors not shown)
Abstract:
The determination of the direction of a stellar core collapse via its neutrino emission is crucial for the identification of the progenitor for a multimessenger follow-up. A highly effective method of reconstructing supernova directions within the Deep Underground Neutrino Experiment (DUNE) is introduced. The supernova neutrino pointing resolution is studied by simulating and reconstructing electron-neutrino charged-current absorption on $^{40}$Ar and elastic scattering of neutrinos on electrons. Procedures to reconstruct individual interactions, including a newly developed technique called ``brems flipping'', as well as the burst direction from an ensemble of interactions are described. Performance of the burst direction reconstruction is evaluated for supernovae happening at a distance of 10 kpc for a specific supernova burst flux model. The pointing resolution is found to be 3.4 degrees at 68% coverage for a perfect interaction-channel classification and a fiducial mass of 40 kton, and 6.6 degrees for a 10 kton fiducial mass respectively. Assuming a 4% rate of charged-current interactions being misidentified as elastic scattering, DUNE's burst pointing resolution is found to be 4.3 degrees (8.7 degrees) at 68% coverage.
Submitted 14 July, 2024;
originally announced July 2024.
-
Inoculating solid-state homogeneous precipitation by impurity atoms through a spinodal decomposition like pathway
Authors:
Shiwei Pan,
Chunan Li,
Hanne-Sofie Søreide,
Dongdong Zhao,
Constantinos Hatzoglou,
Feng Qian,
Long-Qing Chen,
Yanjun Li
Abstract:
Solid-state homogeneous precipitation of nano-sized precipitates is one of the most effective processes to strengthen metal alloys, where the final density and size distribution of precipitates are largely controlled by the precipitation kinetics. Here, we report a strategy to inoculate the homogeneous precipitation of coherent precipitates to enhance the precipitation strengthening. Using the technologically important dilute Al-Zr alloys as an example, we demonstrate that an addition of a trace level of economical and readily available, non-L1$_{2}$ phase forming impurity atoms, X (X = Sn, Sb, Bi or Cd) and Si, can significantly enhance the diffusivity of Zr atoms and shift the precipitation of L1$_{2}$-structured Al$_{3}$Zr nanoparticles from the classical homogeneous nucleation and growth pathway to a nonclassical nucleation pathway: Al$_{3}$Zr forms through the spontaneous formation of nano-scale local concentration fluctuations of Zr atoms on Zr-X(-Si)-vacancy clusters, followed by a continuous increase of the concentration and chemical short-range ordering (CSRO). Such impurity-atom-induced heterogeneous nucleation, based on a "spinodal decomposition like" mechanism, dramatically accelerates the precipitation kinetics, leading to an order of magnitude higher number density of precipitates and a record high hardening efficiency of solute Zr atoms. By formulating the generalized selection principles for inoculating impurity elements, this inoculation strategy should be extendable to a broader range of materials to further explore the precipitation strengthening potentials.
Submitted 10 July, 2024;
originally announced July 2024.
-
AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-guided Self-masking
Authors:
Yuheng Li,
Tianyu Luan,
Yizhou Wu,
Shaoyan Pan,
Yenho Chen,
Xiaofeng Yang
Abstract:
Due to the scarcity of labeled data, self-supervised learning (SSL) has gained much attention in 3D medical image segmentation, by extracting semantic representations from unlabeled data. Among SSL strategies, masked image modeling (MIM) has shown effectiveness by reconstructing randomly masked images to learn detailed representations. However, conventional MIM methods require extensive training data to achieve good performance, which still poses a challenge for medical imaging. Since random masking uniformly samples all regions within medical images, it may overlook crucial anatomical regions and thus degrade the pretraining efficiency. We propose AnatoMask, a novel MIM method that leverages reconstruction loss to dynamically identify and mask out anatomically significant regions to improve pretraining efficacy. AnatoMask takes a self-distillation approach, where the model learns both how to find more significant regions to mask and how to reconstruct these masked regions. To avoid suboptimal learning, AnatoMask adjusts the pretraining difficulty progressively using a masking dynamics function. We have evaluated our method on 4 public datasets with multiple imaging modalities (CT, MRI, and PET). AnatoMask demonstrates superior performance and scalability compared to existing SSL methods. The code is available at https://github.com/ricklisz/AnatoMask.
Submitted 8 July, 2024;
originally announced July 2024.
-
Cubic interactions for massless and partially massless spin-1 and spin-2 fields
Authors:
Nicolas Boulanger,
Sebastian Garcia-Saenz,
Songsong Pan,
Lucas Traina
Abstract:
We perform a complete classification of the consistent two-derivative cubic couplings for a system containing an arbitrary number of massless spin-1, massless spin-2, and partially massless (PM) spin-2 fields in $D$-dimensional (anti-)de Sitter space. In addition to previously known results, we find a unique candidate mixing between spin-1 and PM spin-2 fields. We derive all the quadratic constraints on the structure constants of the theory, allowing for relative ``wrong-sign'' kinetic terms for any of the fields. In the particular case when the kinetic terms in each sector have no relative signs, we find that the unique consistent non-trivial theory is given by multiple independent copies of conformal gravity coupled to a Yang-Mills sector in $D=4$. Our results strengthen the well-known no-go theorems on the absence of mutual interactions for massless and PM spin-2 fields.
Submitted 8 July, 2024;
originally announced July 2024.
-
Image-Conditional Diffusion Transformer for Underwater Image Enhancement
Authors:
Xingyang Nie,
Su Pan,
Xiaoyu Zhai,
Shifei Tao,
Fengzhong Qu,
Biao Wang,
Huilin Ge,
Guojie Xiao
Abstract:
Underwater image enhancement (UIE) has attracted much attention owing to its importance for underwater operation and marine engineering. Motivated by the recent advance in generative models, we propose a novel UIE method based on image-conditional diffusion transformer (ICDT). Our method takes the degraded underwater image as the conditional input and converts it into latent space where ICDT is applied. ICDT replaces the conventional U-Net backbone in a denoising diffusion probabilistic model (DDPM) with a transformer, and thus inherits favorable properties such as scalability from transformers. Furthermore, we train ICDT with a hybrid loss function involving variances to achieve better log-likelihoods, which meanwhile significantly accelerates the sampling process. We experimentally assess the scalability of ICDTs and compare with prior works in UIE on the Underwater ImageNet dataset. Besides good scaling properties, our largest model, ICDT-XL/2, outperforms all comparison methods, achieving state-of-the-art (SOTA) quality of image enhancement.
Submitted 7 July, 2024;
originally announced July 2024.
-
Automatically Analyzing Performance Issues in Android Apps: How Far Are We?
Authors:
Dianshu Liao,
Shidong Pan,
Siyuan Yang,
Yitong Wang,
Yanjie Zhao,
Zhenchang Xing,
Xiaoyu Sun
Abstract:
Performance plays a critical role in ensuring the smooth operation of any mobile application, directly influencing user engagement and retention. Android applications are no exception. However, unlike functionality issues, performance issues are more challenging to discover as their root causes are sophisticated and typically emerge under specific payloads. To tackle this problem, researchers have dedicated substantial efforts to proposing automatic approaches for understanding, detecting, and resolving performance issues. Despite these endeavors, it still remains unknown what the status quo of Android performance analysis is, and whether existing approaches can indeed accurately reflect real performance issues. To fill this research gap, we conducted a systematic literature review followed by an explanatory study to explore relevant studies and real-world challenges. Our findings reveal that current tools have limited capabilities, covering only 17.50% of the performance issues. Additionally, existing datasets encompass only 27.50% of the issues and are very limited in size. We also show real-world issue patterns, underscoring the huge gap between the identified techniques and practical concerns. Furthermore, possible solutions are provided to guide future research towards achieving effective performance issue detection and resolution.
Submitted 6 July, 2024;
originally announced July 2024.
-
The Solution for the sequential task continual learning track of the 2nd Greater Bay Area International Algorithm Competition
Authors:
Sishun Pan,
Xixian Wu,
Tingmin Li,
Longfei Huang,
Mingxu Feng,
Zhonghua Wan,
Yang Yang
Abstract:
This paper presents a data-free, parameter-isolation-based continual learning algorithm we developed for the sequential task continual learning track of the 2nd Greater Bay Area International Algorithm Competition. The method learns an independent parameter subspace for each task within the network's convolutional and linear layers and freezes the batch normalization layers after the first task. Specifically, for the domain incremental setting, where all domains share a classification head, we freeze the shared classification head after the first task is completed, effectively solving the issue of catastrophic forgetting. Additionally, facing the challenge of domain incremental settings that do not provide a task identity, we designed a task identity inference strategy that selects an appropriate mask matrix for each sample. Furthermore, we introduced a gradient supplementation strategy to enhance the importance of unselected parameters for the current task, facilitating learning for new tasks. We also implemented an adaptive importance scoring strategy that dynamically adjusts the number of parameters to optimize single-task performance while reducing parameter usage. Moreover, considering the limitations of storage space and inference time, we designed a mask matrix compression strategy to save storage space and improve the speed of encryption and decryption of the mask matrix. Our approach does not require expanding the core network or using external auxiliary networks or data, and performs well under both task incremental and domain incremental settings. This solution ultimately won a second-place prize in the competition.
Submitted 6 July, 2024;
originally announced July 2024.
-
The Solution for the AIGC Inference Performance Optimization Competition
Authors:
Sishun Pan,
Haonan Xu,
Zhonghua Wan,
Yang Yang
Abstract:
In recent years, the rapid advancement of large-scale pre-trained language models based on transformer architectures has revolutionized natural language processing tasks. Among these, ChatGPT has gained widespread popularity, demonstrating human-level conversational abilities and attracting over 100 million monthly users by late 2022. Concurrently, Baidu's commercial deployment of the Ernie Wenxin model has significantly enhanced marketing effectiveness through AI-driven technologies. This paper focuses on optimizing high-performance inference for Ernie models, emphasizing GPU acceleration and leveraging the Paddle inference framework. We employ techniques such as Faster Transformer for efficient model processing, embedding layer pruning to reduce computational overhead, and FP16 half-precision inference for enhanced computational efficiency. Additionally, our approach integrates efficient data handling strategies using multi-process parallel processing to minimize latency. Experimental results demonstrate that our optimized solution achieves up to an 8.96x improvement in inference speed compared to standard methods, while maintaining competitive performance.
Submitted 6 July, 2024;
originally announced July 2024.
-
Classification of Power Quality Disturbances Using Resnet with Channel Attention Mechanism
Authors:
Su Pan,
Xingyang Nie,
Xiaoyu Zhai,
Biao Wang,
Huilin Ge,
Cheng He,
Zhenping Ding
Abstract:
The detection and classification of power quality disturbances (PQDs) are of significant importance for power systems. In response to this imperative, numerous intelligent diagnostic methods have been developed. However, existing identification methods usually concentrate on single-type signals or on complex signals with two types, rendering them susceptible to noisy labels and environmental effects. This study proposes a novel method for the classification of PQDs, termed ST-GSResNet, which utilizes the S-Transform and an improved residual neural network (ResNet) with a channel attention mechanism. The ST-GSResNet approach initially uses the S-Transform to transform a time-series signal into a 2D time-frequency image for feature enhancement. Then, an improved ResNet model is introduced, which employs grouped convolution instead of the traditional convolution operation. This improvement aims to facilitate learning with a block-diagonal structured sparsity on the channel dimension, so that highly correlated filters are learned in a more structured way in networks with filter groups. Because grouping significantly reduces the number of parameters in the network, the model becomes less prone to overfitting. Furthermore, the SE module concentrates on primary components, which enhances the model's robustness in recognition and immunity to noise. Experimental results demonstrate that, compared to existing deep learning models, our approach has advantages in computational efficiency and classification accuracy.
Submitted 2 July, 2024;
originally announced July 2024.
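As a rough illustration of the parameter reduction that the ST-GSResNet abstract above attributes to grouped convolution, the sketch below counts the weights of a 2D convolution with and without filter groups. The channel counts, kernel size, and group count are illustrative assumptions, not values taken from the paper, and the helper name `conv2d_weight_count` is hypothetical.

```python
def conv2d_weight_count(in_channels, out_channels, kernel_size, groups=1):
    """Count the weights (bias excluded) of a 2D convolution with `groups` filter groups.

    Each output filter only sees in_channels / groups input channels, so the
    weight count shrinks by a factor of `groups` compared with a standard convolution.
    """
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    return (in_channels // groups) * out_channels * kernel_size * kernel_size


# Illustrative layer sizes (assumptions, not the paper's configuration).
standard = conv2d_weight_count(64, 64, 3)            # ordinary convolution
grouped = conv2d_weight_count(64, 64, 3, groups=4)   # four filter groups
print(standard, grouped, standard // grouped)
```

With these assumed sizes the grouped layer holds a quarter of the weights of the standard layer, which is the block-diagonal sparsity on the channel dimension that the abstract refers to.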
-
A Large-scale Investigation of Semantically Incompatible APIs behind Compatibility Issues in Android Apps
Authors:
Shidong Pan,
Tianchen Guo,
Lihong Zhang,
Pei Liu,
Zhenchang Xing,
Xiaoyu Sun
Abstract:
Application Programming Interface (API) incompatibility is a long-standing issue in Android application development. The rapid evolution of Android APIs results in a significant number of API additions, removals, and changes between adjacent versions. Unfortunately, this high frequency of alterations may lead to compatibility issues, often without adequate notification to developers regarding these changes. Although researchers have proposed some work on detecting compatibility issues caused by changes in API signatures, they often overlook compatibility issues stemming from sophisticated semantic changes. In response to this challenge, we conducted a large-scale discovery of incompatible APIs in the Android Open Source Project (AOSP) by leveraging static analysis and pre-trained Large Language Models (LLMs) across adjacent versions. We systematically formulate the problem and propose a unified framework to detect incompatible APIs, especially for semantic changes. It's worth highlighting that our approach achieves a 0.83 F1-score in identifying semantically incompatible APIs in the Android framework. Ultimately, our approach detects 5,481 incompatible APIs spanning from version 4 to version 33. We further demonstrate its effectiveness in supplementing the state-of-the-art methods in detecting a broader spectrum of compatibility issues (+92.3%) that have been previously overlooked.
Submitted 26 June, 2024; v1 submitted 25 June, 2024;
originally announced June 2024.
-
MindSpore Quantum: A User-Friendly, High-Performance, and AI-Compatible Quantum Computing Framework
Authors:
Xusheng Xu,
Jiangyu Cui,
Zidong Cui,
Runhong He,
Qingyu Li,
Xiaowei Li,
Yanling Lin,
Jiale Liu,
Wuxin Liu,
Jiale Lu,
Maolin Luo,
Chufan Lyu,
Shijie Pan,
Mosharev Pavel,
Runqiu Shu,
Jialiang Tang,
Ruoqian Xu,
Shu Xu,
Kang Yang,
Fan Yu,
Qingguo Zeng,
Haiying Zhao,
Qiang Zheng,
Junyuan Zhou,
Xu Zhou
, et al. (14 additional authors not shown)
Abstract:
We introduce MindSpore Quantum, a pioneering hybrid quantum-classical framework with a primary focus on the design and implementation of noisy intermediate-scale quantum (NISQ) algorithms. Leveraging the robust support of MindSpore, an advanced open-source deep learning training/inference framework, MindSpore Quantum exhibits exceptional efficiency in the design and training of variational quantum algorithms on both CPU and GPU platforms, delivering remarkable performance. Furthermore, this framework places a strong emphasis on enhancing the operational efficiency of quantum algorithms when executed on real quantum hardware. This encompasses the development of algorithms for quantum circuit compilation and qubit mapping, crucial components for achieving optimal performance on quantum processors. In addition to the core framework, we introduce QuPack, a meticulously crafted quantum computing acceleration engine. QuPack significantly accelerates the simulation speed of MindSpore Quantum, particularly in variational quantum eigensolver (VQE), quantum approximate optimization algorithm (QAOA), and tensor network simulations, providing astonishing speed. This combination of cutting-edge technologies empowers researchers and practitioners to explore the frontiers of quantum computing with unprecedented efficiency and performance.
Submitted 10 July, 2024; v1 submitted 24 June, 2024;
originally announced June 2024.
-
The Championship-Winning Solution for the 5th CLVISION Challenge 2024
Authors:
Sishun Pan,
Tingmin Li,
Yang Yang
Abstract:
In this paper, we introduce our approach to the 5th CLVision Challenge, which presents distinctive challenges beyond traditional class incremental learning. Unlike standard settings, this competition features the recurrence of previously encountered classes and includes unlabeled data that may contain Out-of-Distribution (OOD) categories. Our approach is based on Winning Subnetworks, which allocate an independent parameter space to each task to address the catastrophic forgetting problem in class incremental learning. It employs three training strategies (supervised classification learning, unsupervised contrastive learning, and pseudo-label classification learning) to fully utilize the information in both labeled and unlabeled data, enhancing the classification performance of each subnetwork. Furthermore, during the inference stage, we have devised an interaction strategy between subnetworks, where the prediction for a specific class of a particular sample is the average of the logits across the different subnetworks corresponding to that class, leveraging the knowledge learned from different subnetworks on recurring classes to improve classification accuracy. These strategies can be simultaneously applied to the three scenarios of the competition, effectively solving the difficulties in the competition scenarios. Experimentally, our method ranks first in both the pre-selection and final evaluation stages, with an average accuracy of 0.4535 during the pre-selection stage and an average accuracy of 0.4805 during the final evaluation stage.
Submitted 24 June, 2024;
originally announced June 2024.
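The inference-stage interaction described in the CLVision abstract above (averaging, per class, the logits of the subnetworks that correspond to that class) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: each subnetwork is assumed to emit a dictionary mapping the class ids it was trained on to a logit, and the function name `ensemble_predict` and the example values are hypothetical.

```python
def ensemble_predict(per_subnet_logits):
    """Average each class's logit over the subnetworks that cover that class.

    per_subnet_logits: list of dicts, one per subnetwork, mapping class id -> logit
    (assumed representation). Returns (predicted_class, averaged_logits).
    """
    totals, counts = {}, {}
    for logits in per_subnet_logits:
        for cls, value in logits.items():
            totals[cls] = totals.get(cls, 0.0) + value
            counts[cls] = counts.get(cls, 0) + 1
    averaged = {cls: totals[cls] / counts[cls] for cls in totals}
    predicted = max(averaged, key=averaged.get)
    return predicted, averaged


# Three hypothetical subnetworks; classes 0, 1 and 2 each recur in two of them.
example = [
    {0: 2.0, 1: 0.5},
    {1: 1.5, 2: 1.0},
    {0: 1.0, 2: 3.0},
]
predicted, averaged = ensemble_predict(example)
print(predicted, averaged)
```

Averaging only over the subnetworks that have seen a class keeps subnetworks that never encountered it from diluting its score, which is one plausible reading of "corresponding to that class".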
-
LLM-Powered Explanations: Unraveling Recommendations Through Subgraph Reasoning
Authors:
Guangsi Shi,
Xiaofeng Deng,
Linhao Luo,
Lijuan Xia,
Lei Bao,
Bei Ye,
Fei Du,
Shirui Pan,
Yuxiao Li
Abstract:
Recommender systems are pivotal in enhancing user experiences across various web applications by analyzing the complicated relationships between users and items. Knowledge graphs (KGs) have been widely used to enhance the performance of recommender systems. However, KGs are known to be noisy and incomplete, which makes it hard to provide reliable explanations for recommendation results. An explainable recommender system is crucial for product development and subsequent decision-making. To address these challenges, we introduce a novel recommender that synergizes Large Language Models (LLMs) and KGs to enhance the recommendation and provide interpretable results. Specifically, we first harness the power of LLMs to augment KG reconstruction. LLMs comprehend and decompose user reviews into new triples that are added to the KG. In this way, we can enrich KGs with explainable paths that express user preferences. To enhance the recommendation on augmented KGs, we introduce a novel subgraph reasoning module that effectively measures the importance of nodes and discovers reasoning paths for recommendation. Finally, these reasoning paths are fed into the LLMs to generate interpretable explanations of the recommendation results. Our approach significantly enhances both the effectiveness and interpretability of recommender systems, especially in cross-selling scenarios where traditional methods falter. The effectiveness of our approach has been rigorously tested on four open real-world datasets, with our methods demonstrating a superior performance over contemporary state-of-the-art techniques by an average improvement of 12%. The application of our model in a multinational engineering and technology company's cross-selling recommendation system further underscores its practical utility and potential to redefine recommendation practices through improved accuracy and user trust.
Submitted 29 June, 2024; v1 submitted 22 June, 2024;
originally announced June 2024.
-
Adaptive Self-Supervised Consistency-Guided Diffusion Model for Accelerated MRI Reconstruction
Authors:
Mojtaba Safari,
Zach Eidex,
Shaoyan Pan,
Richard L. J. Qiu,
Xiaofeng Yang
Abstract:
Purpose: To propose a self-supervised deep learning-based compressed sensing MRI (DL-based CS-MRI) method named "Adaptive Self-Supervised Consistency Guided Diffusion Model (ASSCGD)" to accelerate data acquisition without requiring fully sampled datasets. Materials and Methods: We used the fastMRI multi-coil brain axial T2-weighted (T2-w) dataset from 1,376 cases and single-coil brain quantitative magnetization prepared 2 rapid acquisition gradient echoes (MP2RAGE) T1 maps from 318 cases to train and test our model. Robustness against domain shift was evaluated using two out-of-distribution (OOD) datasets: a multi-coil brain axial postcontrast T1-weighted (T1c) dataset from 50 cases and an axial T1-weighted (T1-w) dataset from 50 patients. Data were retrospectively subsampled at acceleration rates R in {2x, 4x, 8x}. ASSCGD partitions a random sampling pattern into two disjoint sets, ensuring data consistency during training. We compared our method with ReconFormer Transformer and SS-MRI, assessing performance using normalized mean squared error (NMSE), peak signal-to-noise ratio (PSNR), and structural similarity index (SSIM). Statistical tests included one-way analysis of variance (ANOVA) and multi-comparison Tukey's Honestly Significant Difference (HSD) tests. Results: ASSCGD preserved fine structures and brain abnormalities visually better than comparative methods at R = 8x for both multi-coil and single-coil datasets. It achieved the lowest NMSE at R in {4x, 8x}, and the highest PSNR and SSIM values at all acceleration rates for the multi-coil dataset. Similar trends were observed for the single-coil dataset, though SSIM values were comparable to ReconFormer at R in {2x, 8x}. These results were further confirmed by the voxel-wise correlation scatter plots. OOD results showed significant (p << 10^-5) improvements in undersampled image quality after reconstruction.
Submitted 21 June, 2024;
originally announced June 2024.
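The ASSCGD abstract above mentions partitioning a random sampling pattern into two disjoint sets. One simple way such a split could look is sketched below for a 1D binary mask over k-space lines. This is only an illustration under stated assumptions: the function name `split_sampling_mask`, the even split fraction, and the uniform random assignment are hypothetical choices and may differ from what the paper does.

```python
import random


def split_sampling_mask(mask, fraction=0.5, seed=0):
    """Split the sampled positions of a binary mask into two disjoint binary masks.

    mask: list of 0/1 flags marking which k-space lines were acquired.
    fraction: share of the acquired lines assigned to the first set (assumed value).
    """
    rng = random.Random(seed)
    sampled = [i for i, keep in enumerate(mask) if keep]
    rng.shuffle(sampled)
    cut = int(len(sampled) * fraction)
    first = set(sampled[:cut])
    set_a = [1 if i in first else 0 for i in range(len(mask))]
    set_b = [1 if (mask[i] and i not in first) else 0 for i in range(len(mask))]
    return set_a, set_b


# Hypothetical undersampling pattern with 8 acquired lines out of 16.
pattern = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
set_a, set_b = split_sampling_mask(pattern)
print(set_a)
print(set_b)
```

In self-supervised reconstruction schemes of this general kind, one subset typically feeds the network while the other is held out to score the prediction, so no fully sampled reference is needed; whether ASSCGD uses the two sets in exactly this way is not stated in the abstract.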
-
Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark
Authors:
Yili Wang,
Yixin Liu,
Xu Shen,
Chenyu Li,
Kaize Ding,
Rui Miao,
Ying Wang,
Shirui Pan,
Xin Wang
Abstract:
To build safe and reliable graph machine learning systems, unsupervised graph-level anomaly detection (GLAD) and unsupervised graph-level out-of-distribution (OOD) detection (GLOD) have received significant attention in recent years. Though those two lines of research indeed share the same objective, they have been studied independently in the community due to distinct evaluation setups, creating a gap that hinders the application and evaluation of methods from one to the other. To bridge the gap, in this work, we present a Unified Benchmark for unsupervised Graph-level OOD and anomaly Detection (UB-GOLD), a comprehensive evaluation framework that unifies GLAD and GLOD under the concept of generalized graph-level OOD detection. Our benchmark encompasses 35 datasets spanning four practical anomaly and OOD detection scenarios, facilitating the comparison of 16 representative GLAD/GLOD methods. We conduct multi-dimensional analyses to explore the effectiveness, generalizability, robustness, and efficiency of existing methods, shedding light on their strengths and limitations. Furthermore, we provide an open-source codebase (https://github.com/UB-GOLD/UB-GOLD) of UB-GOLD to foster reproducible research and outline potential directions for future investigations based on our insights.
Submitted 21 June, 2024;
originally announced June 2024.
-
A Recursive Relation for Bipartition Numbers
Authors:
Yen-Chi Roger Lin,
Shu-Yen Pan
Abstract:
We establish a recursive relation for the bipartition number $p_2(n)$ which might be regarded as an analogue of Euler's recursive relation for the partition number $p(n)$. Two proofs of the main result are given in this article. The first uses the generating function, and the second uses combinatorial objects (called ``symbols'') created by Lusztig for studying the representation theory of finite classical groups.
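For reference, Euler's recurrence for the partition number is $p(n) = \sum_{k \ge 1} (-1)^{k+1} \left[ p\left(n - \tfrac{k(3k-1)}{2}\right) + p\left(n - \tfrac{k(3k+1)}{2}\right) \right]$ with $p(0) = 1$ and $p(m) = 0$ for $m < 0$. The sketch below computes $p(n)$ this way and, assuming $p_2(n)$ counts ordered pairs of partitions of total size $n$, obtains $p_2(n)$ by plain convolution. It does not implement the new recurrence of the paper, which the abstract does not state; the function names are illustrative.

```python
def partition_numbers(limit):
    """Return [p(0), ..., p(limit)] using Euler's pentagonal number recurrence."""
    p = [0] * (limit + 1)
    p[0] = 1
    for n in range(1, limit + 1):
        total = 0
        k = 1
        # Generalized pentagonal numbers k(3k-1)/2 and k(3k+1)/2.
        while k * (3 * k - 1) // 2 <= n:
            sign = 1 if k % 2 == 1 else -1
            total += sign * p[n - k * (3 * k - 1) // 2]
            second = k * (3 * k + 1) // 2
            if second <= n:
                total += sign * p[n - second]
            k += 1
        p[n] = total
    return p


def bipartition_numbers(limit):
    """Return [p_2(0), ..., p_2(limit)] as the convolution of p with itself.

    Assumption: p_2(n) counts ordered pairs of partitions whose sizes sum to n.
    """
    p = partition_numbers(limit)
    return [sum(p[i] * p[n - i] for i in range(n + 1)) for n in range(limit + 1)]


if __name__ == "__main__":
    print(partition_numbers(10))
    print(bipartition_numbers(4))
```

The convolution costs quadratic time in the limit, which is the kind of overhead a direct recurrence for $p_2(n)$ would avoid.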
Submitted 20 June, 2024;
originally announced June 2024.
-
GenderAlign: An Alignment Dataset for Mitigating Gender Bias in Large Language Models
Authors:
Tao Zhang,
Ziqian Zeng,
Yuxiang Xiao,
Huiping Zhuang,
Cen Chen,
James Foulds,
Shimei Pan
Abstract:
Large Language Models (LLMs) are prone to generating content that exhibits gender biases, raising significant ethical concerns. Alignment, the process of fine-tuning LLMs to better align with desired behaviors, is recognized as an effective approach to mitigate gender biases. Although proprietary LLMs have made significant strides in mitigating gender bias, their alignment datasets are not publicly available. The commonly used and publicly available alignment dataset, HH-RLHF, still exhibits gender bias to some extent. There is a lack of publicly available alignment datasets specifically designed to address gender bias. Hence, we developed a new dataset named GenderAlign, aiming at mitigating a comprehensive set of gender biases in LLMs. This dataset comprises 8k single-turn dialogues, each paired with a "chosen" and a "rejected" response. Compared to the "rejected" responses, the "chosen" responses demonstrate lower levels of gender bias and higher quality. Furthermore, we categorized the gender biases in the "rejected" responses of GenderAlign into 4 principal categories. The experimental results show the effectiveness of GenderAlign in reducing gender bias in LLMs.
Submitted 19 June, 2024;
originally announced June 2024.
-
The Heterophilic Snowflake Hypothesis: Training and Empowering GNNs for Heterophilic Graphs
Authors:
Kun Wang,
Guibin Zhang,
Xinnan Zhang,
Junfeng Fang,
Xun Wu,
Guohao Li,
Shirui Pan,
Wei Huang,
Yuxuan Liang
Abstract:
Graph Neural Networks (GNNs) have become pivotal tools for a range of graph-based learning tasks. Notably, most current GNN architectures operate under the assumption of homophily, whether explicitly or implicitly. While this underlying assumption is frequently adopted, it is not universally applicable, which can result in potential shortcomings in learning effectiveness. In this paper, for the first time, we transfer the prevailing concept of ``one node one receptive field'' to the heterophilic graph. By constructing a proxy label predictor, we enable each node to possess a latent prediction distribution, which assists connected nodes in determining whether they should aggregate their associated neighbors. Ultimately, every node can have its own unique aggregation hop and pattern, much like each snowflake is unique and possesses its own characteristics. Based on observations, we introduce the Heterophily Snowflake Hypothesis and provide an effective solution to guide and facilitate research on heterophilic graphs and beyond. We conduct comprehensive experiments including (1) main results on 10 graphs with varying heterophily ratios across 10 backbones; (2) scalability on various deep GNN backbones (SGC, JKNet, etc.) across a wide range of depths (2, 4, 6, 8, 16, and 32 layers); (3) comparison with the conventional snowflake hypothesis; (4) efficiency comparison with existing graph pruning algorithms. Our observations show that our framework acts as a versatile operator for diverse tasks. It can be integrated into various GNN frameworks, boosting performance in-depth and offering an explainable approach to choosing the optimal network depth. The source code is available at https://github.com/bingreeky/HeteroSnoH.
Submitted 18 June, 2024;
originally announced June 2024.
-
Tilt stability of Ky-Fan $κ$-norm composite optimization
Authors:
Yulan Liu,
Shaohua Pan,
Wen Song
Abstract:
This paper concerns the tilt stability for the minimization of the sum of a twice continuously differentiable matrix-valued function and the Ky-Fan $κ$-norm. Using the expression of the second subderivative of the Ky-Fan $κ$-norm, we derive a verifiable criterion to identify the tilt stability of a local minimum for this class of nonconvex and nonsmooth problems. As a byproduct, a practical criterion is obtained for the tilt stable solution of the nuclear-norm regularized minimization.
Submitted 16 June, 2024;
originally announced June 2024.
-
Convergence of ZH-type nonmonotone descent method for Kurdyka-Łojasiewicz optimization problems
Authors:
Yitian Qian,
Ting Tao,
Shaohua Pan,
Houduo Qi
Abstract:
This note concerns a class of nonmonotone descent methods for minimizing a proper lower semicontinuous Kurdyka-Łojasiewicz (KL) function $Φ$, whose iterate sequence obeys the ZH-type nonmonotone decrease condition and a relative error condition. We prove that the iterate sequence converges to a critical point of $Φ$, and if $Φ$ has the KL property of exponent $θ\in(0,1)$ at this critical point, the convergence has a linear rate for $θ\in(0,1/2]$ and a sublinear rate of exponent $\frac{1-θ}{1-2θ}$ for $θ\in(1/2,1)$. Our results are the first to resolve the full convergence of the iterate sequence generated by the ZH-type nonmonotone descent method for nonconvex and nonsmooth optimization problems, and extend the full convergence of monotone descent methods for KL optimization problems to the ZH-type nonmonotone descent method.
Submitted 9 June, 2024;
originally announced June 2024.
-
Split-and-Fit: Learning B-Reps via Structure-Aware Voronoi Partitioning
Authors:
Yilin Liu,
Jiale Chen,
Shanshan Pan,
Daniel Cohen-Or,
Hao Zhang,
Hui Huang
Abstract:
We introduce a novel method for acquiring boundary representations (B-Reps) of 3D CAD models which involves a two-step process: it first applies a spatial partitioning, referred to as the ``split'', followed by a ``fit'' operation to derive a single primitive within each partition. Specifically, our partitioning aims to produce the classical Voronoi diagram of the set of ground-truth (GT) B-Rep primitives. In contrast to prior B-Rep constructions which were bottom-up, either via direct primitive fitting or point clustering, our Split-and-Fit approach is top-down and structure-aware, since a Voronoi partition explicitly reveals both the number of and the connections between the primitives. We design a neural network to predict the Voronoi diagram from an input point cloud or distance field via a binary classification. We show that our network, coined NVD-Net for neural Voronoi diagrams, can effectively learn Voronoi partitions for CAD models from training data and exhibits superior generalization capabilities. Extensive experiments and evaluation demonstrate that the resulting B-Reps, consisting of parametric surfaces, curves, and vertices, are more plausible than those obtained by existing alternatives, with significant improvements in reconstruction quality. Code will be released at https://github.com/yilinliu77/NVDNet.
Submitted 7 June, 2024;
originally announced June 2024.
-
Bayesian Inference for Spatial-temporal Non-Gaussian Data Using Predictive Stacking
Authors:
Soumyakanti Pan,
Lu Zhang,
Jonathan R. Bradley,
Sudipto Banerjee
Abstract:
Analysing non-Gaussian spatial-temporal data typically requires introducing spatial dependence in generalised linear models through the link function of an exponential family distribution. However, unlike in Gaussian likelihoods, inference is considerably encumbered by the inability to analytically integrate out the random effects and reduce the dimension of the parameter space. Iterative estimation algorithms struggle to converge due to the presence of weakly identified parameters. We devise an approach that obviates these issues by exploiting generalised conjugate multivariate distribution theory for exponential families, which enables exact sampling from analytically available posterior distributions conditional upon some fixed process parameters. More specifically, we expand upon the Diaconis-Ylvisaker family of conjugate priors to achieve analytically tractable posterior inference for spatially-temporally varying regression models conditional on some kernel parameters. Subsequently, we assimilate inference from these individual posterior distributions over a range of values of these parameters using Bayesian predictive stacking. We evaluate inferential performance on simulated data, compare with fully Bayesian inference using Markov chain Monte Carlo and apply our proposed method to analyse spatially-temporally referenced avian count data from the North American Breeding Bird Survey database.
Submitted 7 June, 2024;
originally announced June 2024.
-
A majorized PAM method with subspace correction for low-rank composite factorization model
Authors:
Ting Tao,
Yitian Qian,
Shaohua Pan
Abstract:
This paper concerns a class of low-rank composite factorization models arising from matrix completion. For this nonconvex and nonsmooth optimization problem, we propose a proximal alternating minimization algorithm (PAMA) with subspace correction, in which a subspace correction step is imposed on every proximal subproblem so as to guarantee that the corrected proximal subproblem has a closed-form solution. For this subspace correction PAMA, we prove the subsequence convergence of the iterate sequence, and establish the convergence of the whole iterate sequence and the column subspace sequences of factor pairs under the KL property of the objective function and a restrictive condition that holds automatically for the column $\ell_{2,0}$-norm function. Numerical comparison with the proximal alternating linearized minimization method on one-bit matrix completion problems indicates that PAMA reaches lower relative error in less time.
Submitted 6 June, 2024;
originally announced June 2024.
-
Coherent control of a triangular exchange-only spin qubit
Authors:
Edwin Acuna,
Joseph D. Broz,
Kaushal Shyamsundar,
Antonio B. Mei,
Colin P. Feeney,
Valerie Smetanka,
Tiffany Davis,
Kangmu Lee,
Maxwell D. Choi,
Brydon Boyd,
June Suh,
Wonill D. Ha,
Cameron Jennings,
Andrew S. Pan,
Daniel S. Sanchez,
Matthew D. Reed,
Jason R. Petta
Abstract:
We demonstrate coherent control of a three-electron exchange-only spin qubit with the quantum dots arranged in a close-packed triangular geometry. The device is tuned to confine one electron in each quantum dot, as evidenced by pairwise charge stability diagrams. Time-domain control of the exchange coupling is demonstrated and qubit performance is characterized using blind randomized benchmarking, with an average single-qubit gate fidelity F = 99.84%. The compact triangular device geometry can be readily scaled to larger two-dimensional quantum dot arrays with high connectivity.
Submitted 5 June, 2024;
originally announced June 2024.
-
Decision-focused Graph Neural Networks for Combinatorial Optimization
Authors:
Yang Liu,
Chuan Zhou,
Peng Zhang,
Shirui Pan,
Zhao Li,
Hongyang Chen
Abstract:
In recent years, there has been notable interest in investigating combinatorial optimization (CO) problems with neural-based frameworks. An emerging strategy to tackle these challenging problems involves the adoption of graph neural networks (GNNs) as an alternative to traditional algorithms, a subject that has attracted considerable attention. Despite the growing popularity of GNNs and traditional algorithm solvers in the realm of CO, there is limited research on their integrated use and the correlation between them within an end-to-end framework. The primary focus of our work is to formulate a more efficient and precise framework for CO by employing decision-focused learning on graphs. Additionally, we introduce a decision-focused framework that utilizes GNNs to address CO problems with auxiliary support. To realize an end-to-end approach, we have designed two cascaded modules: (a) a graph predictive model trained without supervision, and (b) a solver for quadratic unconstrained binary optimization. Empirical evaluations on classical CO tasks, including maximum cut (MaxCut), maximum independent set (MIS), and minimum vertex cover (MVC), demonstrate the superiority of our method over both the standalone GNN approach and classical methods.
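As a small illustration of the kind of objective such a solver handles, MaxCut can be written as an unconstrained quadratic objective over binary variables: with $x_i \in \{0,1\}$ marking the side of node $i$, an edge $(i,j)$ is cut exactly when $x_i + x_j - 2x_ix_j = 1$. The sketch below evaluates that objective and finds the optimum by exhaustive search on a toy graph. It is not the paper's solver or GNN pipeline, and the function names are illustrative.

```python
from itertools import product


def cut_value(edges, x):
    """Number of cut edges for a 0/1 assignment x.

    Each term x_i + x_j - 2*x_i*x_j equals 1 iff the endpoints differ.
    """
    return sum(x[i] + x[j] - 2 * x[i] * x[j] for i, j in edges)


def brute_force_maxcut(num_nodes, edges):
    """Exhaustively search all 2^n assignments (only viable for tiny graphs)."""
    best_x, best_val = None, -1
    for x in product((0, 1), repeat=num_nodes):
        val = cut_value(edges, x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val


if __name__ == "__main__":
    square = [(0, 1), (1, 2), (2, 3), (3, 0)]  # a 4-cycle
    print(brute_force_maxcut(4, square))
```

Exhaustive search is exponential in the number of nodes, which is why learned or heuristic solvers are of interest for larger instances.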
Submitted 9 June, 2024; v1 submitted 5 June, 2024;
originally announced June 2024.
-
MagiNet: Mask-Aware Graph Imputation Network for Incomplete Traffic Data
Authors:
Jianping Zhou,
Bin Lu,
Zhanyu Liu,
Siyu Pan,
Xuejun Feng,
Hua Wei,
Guanjie Zheng,
Xinbing Wang,
Chenghu Zhou
Abstract:
Due to detector malfunctions and communication failures, missing data is ubiquitous during the collection of traffic data. Therefore, it is of vital importance to impute the missing values to facilitate data analysis and decision-making for Intelligent Transportation Systems (ITS). However, existing imputation methods generally use zero pre-filling to initialize missing values, which inevitably introduces noise. Moreover, we observe prevalent over-smoothing interpolations, which fall short in revealing the intrinsic spatio-temporal correlations of incomplete traffic data. To this end, we propose Mask-Aware Graph imputation Network: MagiNet. Our method designs an adaptive mask spatio-temporal encoder to learn the latent representations of incomplete data, eliminating the reliance on pre-filling missing values. Furthermore, we devise a spatio-temporal decoder that stacks multiple blocks to capture the inherent spatial and temporal dependencies within incomplete traffic data, alleviating over-smoothing imputation. Extensive experiments demonstrate that our method outperforms state-of-the-art imputation methods on five real-world traffic datasets, yielding an average improvement of 4.31% in RMSE and 3.72% in MAPE.
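The two reported metrics can be sketched as follows, assuming they are evaluated only at positions that were masked out as missing and that MAPE is expressed in percent. The abstract does not specify the evaluation protocol, so the masking convention and function names here are assumptions for illustration.

```python
import math


def masked_rmse(truth, imputed, missing):
    """Root mean squared error over positions flagged as missing."""
    errors = [(t - p) ** 2 for t, p, m in zip(truth, imputed, missing) if m]
    return math.sqrt(sum(errors) / len(errors))


def masked_mape(truth, imputed, missing):
    """Mean absolute percentage error (in %) over missing positions.

    Positions with a zero ground-truth value are skipped to avoid
    division by zero (an assumption, not taken from the paper).
    """
    ratios = [abs((t - p) / t)
              for t, p, m in zip(truth, imputed, missing) if m and t != 0]
    return 100.0 * sum(ratios) / len(ratios)


if __name__ == "__main__":
    truth = [10.0, 20.0, 30.0, 40.0]
    imputed = [10.0, 22.0, 27.0, 40.0]
    missing = [False, True, True, False]
    print(masked_rmse(truth, imputed, missing))
    print(masked_mape(truth, imputed, missing))
```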
Submitted 5 June, 2024;
originally announced June 2024.
-
Augmented Commonsense Knowledge for Remote Object Grounding
Authors:
Bahram Mohammadi,
Yicong Hong,
Yuankai Qi,
Qi Wu,
Shirui Pan,
Javen Qinfeng Shi
Abstract:
The vision-and-language navigation (VLN) task requires an agent to perceive the surroundings, follow natural language instructions, and act in photo-realistic unseen environments. Most of the existing methods employ the entire image or object features to represent navigable viewpoints. However, these representations are insufficient for proper action prediction, especially for the REVERIE task, which uses concise high-level instructions, such as ``Bring me the blue cushion in the master bedroom''. To enhance these representations, we propose an augmented commonsense knowledge model (ACK) to leverage commonsense information as a spatio-temporal knowledge graph for improving agent navigation. Specifically, the proposed approach involves constructing a knowledge base by retrieving commonsense information from ConceptNet, followed by a refinement module to remove noisy and irrelevant knowledge. We further present ACK, which consists of knowledge graph-aware cross-modal and concept aggregation modules to enhance visual representation and visual-textual data alignment by integrating visible objects, commonsense knowledge, and concept history, which includes object and knowledge temporal information. Moreover, we add a new pipeline for the commonsense-based decision-making process which leads to more accurate local action prediction. Experimental results demonstrate that our proposed model noticeably outperforms the baseline and achieves state-of-the-art performance on the REVERIE benchmark.
Submitted 3 June, 2024;
originally announced June 2024.
-
Ta2Pd3Te5 topological thermometer
Authors:
Yupeng Li,
Anqi Wang,
Senyang Pan,
Dayu Yan,
Guang Yang,
Xingchen Guo,
Yu Hong,
Guangtong Liu,
Fanming Qu,
Zhijun Wang,
Tian Qian,
Jinglei Zhang,
Youguo Shi,
Li Lu,
Jie Shen
Abstract:
In recent decades, there has been a persistent pursuit of applications for surface/edge states in topological systems, driven by their dissipationless transport effects. However, there have been limited tangible breakthroughs in this field. This work demonstrates the remarkable properties of the topological insulator Ta2Pd3Te5 as a thermometer. This material exhibits a power-law correlation in temperature-dependent resistance at low temperatures, stemming from the Luttinger liquid behavior of its edge states, while exhibiting semiconductor behavior at high temperatures. The power-law behavior effectively addresses the issue of infinite resistance in semiconductor thermometers at ultra-low temperatures, thereby playing a crucial role in enabling efficient thermometry in refrigerators supporting millikelvin temperatures or below. By employing chemical doping, adjusting thickness, and controlling gate voltage, its power-law behavior and semiconductor behavior can be effectively modulated. This enables efficient thermometry spanning from millikelvin temperatures to room temperature, and allows for precise local temperature measurement. Furthermore, this thermometer exhibits excellent temperature sensitivity and resolution, and can be fine-tuned to show small magnetoresistance. In summary, the Ta2Pd3Te5 thermometer, also referred to as a topological thermometer, exhibits outstanding performance and significant potential for measuring a wider range of temperatures compared to conventional low-temperature thermometers.
Submitted 2 June, 2024;
originally announced June 2024.
-
A-SDM: Accelerating Stable Diffusion through Model Assembly and Feature Inheritance Strategies
Authors:
Jinchao Zhu,
Yuxuan Wang,
Siyuan Pan,
Pengfei Wan,
Di Zhang,
Gao Huang
Abstract:
The Stable Diffusion Model (SDM) is a prevalent and effective model for text-to-image (T2I) and image-to-image (I2I) generation. Despite various attempts at sampler optimization, model distillation, and network quantification, these approaches typically maintain the original network architecture. The extensive parameter scale and substantial computational demands have limited research into adjusting the model architecture. This study focuses on reducing redundant computation in SDM and optimizes the model through both tuning and tuning-free methods. 1) For the tuning method, first, we design a model assembly strategy to reconstruct a lightweight model while preserving performance through distillation. Second, to mitigate performance loss due to pruning, we incorporate multi-expert conditional convolution (ME-CondConv) into compressed UNets to enhance network performance by increasing capacity without sacrificing speed. Third, we validate the effectiveness of the multi-UNet switching method for improving network speed. 2) For the tuning-free method, we propose a feature inheritance strategy to accelerate inference by skipping local computations at the block, layer, or unit level within the network structure. We also examine multiple sampling modes for feature inheritance at the time-step level. Experiments demonstrate that both the proposed tuning and the tuning-free methods can improve the speed and performance of the SDM. The lightweight model reconstructed by the model assembly strategy increases generation speed by 22.4%, while the feature inheritance strategy enhances the SDM generation speed by 40.0%.
Submitted 17 June, 2024; v1 submitted 31 May, 2024;
originally announced June 2024.
-
Sign is Not a Remedy: Multiset-to-Multiset Message Passing for Learning on Heterophilic Graphs
Authors:
Langzhang Liang,
Sunwoo Kim,
Kijung Shin,
Zenglin Xu,
Shirui Pan,
Yuan Qi
Abstract:
Graph Neural Networks (GNNs) have gained significant attention as a powerful modeling and inference method, especially for homophilic graph-structured data. To empower GNNs in heterophilic graphs, where adjacent nodes exhibit dissimilar labels or features, Signed Message Passing (SMP) has been widely adopted. However, there is a lack of theoretical and empirical analysis regarding the limitations of SMP. In this work, we unveil some potential pitfalls of SMP and their remedies. We first identify two limitations of SMP: undesirable representation updates for multi-hop neighbors and vulnerability to oversmoothing. To overcome these challenges, we propose a novel message passing function called Multiset to Multiset GNN (M2M-GNN). Our theoretical analyses and extensive experiments demonstrate that M2M-GNN effectively alleviates the aforementioned limitations of SMP, yielding superior performance in comparison
Submitted 31 May, 2024;
originally announced May 2024.
-
Large Language Model Watermark Stealing With Mixed Integer Programming
Authors:
Zhaoxi Zhang,
Xiaomei Zhang,
Yanjun Zhang,
Leo Yu Zhang,
Chao Chen,
Shengshan Hu,
Asif Gill,
Shirui Pan
Abstract:
The Large Language Model (LLM) watermark is a newly emerging technique that shows promise in addressing concerns surrounding LLM copyright, monitoring AI-generated text, and preventing its misuse. The LLM watermark scheme commonly includes generating secret keys to partition the vocabulary into green and red lists, applying a perturbation to the logits of tokens in the green list to increase their sampling likelihood, thus facilitating watermark detection to identify AI-generated text if the proportion of green tokens exceeds a threshold. However, recent research indicates that watermarking methods using numerous keys are susceptible to removal attacks, such as token editing, synonym substitution, and paraphrasing, with robustness declining as the number of keys increases. Therefore, the state-of-the-art watermark schemes that employ fewer or single keys have been demonstrated to be more robust against text editing and paraphrasing. In this paper, we propose a novel green list stealing attack against the state-of-the-art LLM watermark scheme and systematically examine its vulnerability to this attack. We formalize the attack as a mixed integer programming problem with constraints. We evaluate our attack under a comprehensive threat model, including an extreme scenario where the attacker has no prior knowledge, lacks access to the watermark detector API, and possesses no information about the LLM's parameter settings or watermark injection/detection scheme. Extensive experiments on LLMs, such as OPT and LLaMA, demonstrate that our attack can successfully steal the green list and remove the watermark across all settings.
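The detection step described above (flag text when the proportion of green tokens exceeds a threshold) can be sketched as follows. This is a toy illustration only: the hash-based vocabulary partition, the key handling, the green-list ratio `gamma`, and the z-score threshold are all assumptions made here, not the scheme attacked in the paper or any specific published watermark.

```python
import hashlib
import math


def is_green(token, key, gamma=0.5):
    """Hypothetical partition: a keyed hash places a token in the green list
    with probability roughly gamma."""
    digest = hashlib.sha256((key + ":" + token).encode("utf-8")).digest()
    u = int.from_bytes(digest[:8], "big") / 2 ** 64  # pseudo-uniform in [0, 1)
    return u < gamma


def green_fraction(tokens, key, gamma=0.5):
    """Share of tokens that fall in the green list."""
    if not tokens:
        return 0.0
    return sum(is_green(t, key, gamma) for t in tokens) / len(tokens)


def z_score(tokens, key, gamma=0.5):
    """One-proportion z statistic against the no-watermark rate gamma."""
    n = len(tokens)
    if n == 0:
        return 0.0
    greens = sum(is_green(t, key, gamma) for t in tokens)
    return (greens - gamma * n) / math.sqrt(n * gamma * (1 - gamma))


def looks_watermarked(tokens, key, gamma=0.5, threshold=4.0):
    """Flag text whose green-token count is far above chance."""
    return z_score(tokens, key, gamma) > threshold


if __name__ == "__main__":
    vocab = ["tok%d" % i for i in range(200)]
    green_only = [t for t in vocab if is_green(t, "secret")]
    print(green_fraction(green_only, "secret"), looks_watermarked(green_only, "secret"))
```

Under this toy setup, an attacker who recovers the green list can rewrite text to push the green fraction back toward `gamma`, which is the sense in which stealing the list enables watermark removal.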
Submitted 30 May, 2024;
originally announced May 2024.
-
Overcoming Negative Transfer by Online Selection: Distant Domain Adaptation for Fault Diagnosis
Authors:
Ziyan Wang,
Mohamed Ragab,
Wenmian Yang,
Min Wu,
Sinno Jialin Pan,
Jie Zhang,
Zhenghua Chen
Abstract:
Unsupervised domain adaptation (UDA) has achieved remarkable success in fault diagnosis, bringing significant benefits to diverse industrial applications. While most UDA methods focus on cross-working condition scenarios where the source and target domains are notably similar, real-world applications often grapple with severe domain shifts. We coin the term `distant domain adaptation problem' to describe the challenge of adapting from a labeled source domain to a significantly disparate unlabeled target domain. This problem carries the risk of negative transfer, where extraneous knowledge from the source domain adversely affects the target domain performance. Unfortunately, conventional UDA methods often falter in mitigating this negative transfer, leading to suboptimal performance. In response to this challenge, we propose a novel Online Selective Adversarial Alignment (OSAA) approach. Central to OSAA is its ability to dynamically identify and exclude distant source samples via an online gradient masking approach, focusing primarily on source samples that closely resemble the target samples. Furthermore, recognizing the inherent complexities in bridging the source and target domains, we construct an intermediate domain to act as a transitional domain and ease the adaptation process. Lastly, we develop a class-conditional adversarial adaptation that learns domain-invariant representations while accounting for potential label distribution disparities between the domains. Through detailed experiments and ablation studies on two real-world datasets, we validate the superior performance of the OSAA method over state-of-the-art methods, underscoring its significant utility in practical scenarios with severe domain shifts.
Submitted 25 May, 2024;
originally announced May 2024.
-
Balancing User Preferences by Social Networks: A Condition-Guided Social Recommendation Model for Mitigating Popularity Bias
Authors:
Xin He,
Wenqi Fan,
Ruobing Wang,
Yili Wang,
Ying Wang,
Shirui Pan,
Xin Wang
Abstract:
Social recommendation models weave social interactions into their design to provide uniquely personalized recommendation results for users. However, social networks not only amplify the popularity bias in recommendation models, resulting in more frequent recommendation of hot items and fewer long-tail items, but also include a substantial amount of redundant information that contributes little to the model's performance. Existing social recommendation models fail to address the issues of popularity bias and the redundancy of social information, as they directly characterize social influence across the entire social network without making targeted adjustments. In this paper, we propose a Condition-Guided Social Recommendation Model (named CGSoRec) to mitigate the model's popularity bias by denoising the social network and adjusting the weights of users' social preferences. More specifically, CGSoRec first includes a Condition-Guided Social Denoising Model (CSD) to remove redundant social relations in the social network, capturing users' social preferences for items more precisely. Then, CGSoRec calculates users' social preferences based on the denoised social network and adjusts the weights in users' social preferences so that they can counteract the popularity bias present in the recommendation model. Finally, CGSoRec includes a Condition-Guided Diffusion Recommendation Model (CGD) that introduces the adjusted social preferences as conditions to steer the recommendation results in a debiased direction. Comprehensive experiments on three real-world datasets demonstrate the effectiveness of our proposed method. The code is available at https://github.com/hexin5515/CGSoRec.
Submitted 26 May, 2024;
originally announced May 2024.
-
ARC: A Generalist Graph Anomaly Detector with In-Context Learning
Authors:
Yixin Liu,
Shiyuan Li,
Yu Zheng,
Qingfeng Chen,
Chengqi Zhang,
Shirui Pan
Abstract:
Graph anomaly detection (GAD), which aims to identify abnormal nodes that differ from the majority within a graph, has garnered significant attention. However, current GAD methods necessitate training specific to each dataset, resulting in high training costs, substantial data requirements, and limited generalizability when being applied to new datasets and domains. To address these limitations, this paper proposes ARC, a generalist GAD approach that enables a ``one-for-all'' GAD model to detect anomalies across various graph datasets on-the-fly. Equipped with in-context learning, ARC can directly extract dataset-specific patterns from the target dataset using few-shot normal samples at the inference stage, without the need for retraining or fine-tuning on the target dataset. ARC comprises three components that are well-crafted for capturing universal graph anomaly patterns: 1) smoothness-based feature Alignment module that unifies the features of different datasets into a common and anomaly-sensitive space; 2) ego-neighbor Residual graph encoder that learns abnormality-related node embeddings; and 3) cross-attentive in-Context anomaly scoring module that predicts node abnormality by leveraging few-shot normal samples. Extensive experiments on multiple benchmark datasets from various domains demonstrate the superior anomaly detection performance, efficiency, and generalizability of ARC.
Submitted 26 May, 2024;
originally announced May 2024.
-
A Cross-Field Fusion Strategy for Drug-Target Interaction Prediction
Authors:
Hongzhi Zhang,
Xiuwen Gong,
Shirui Pan,
Jia Wu,
Bo Du,
Wenbin Hu
Abstract:
Drug-target interaction (DTI) prediction is a critical component of the drug discovery process. In the drug development engineering field, predicting novel drug-target interactions is extremely crucial. However, although existing methods have achieved high accuracy levels in predicting known drugs and drug targets, they fail to utilize global protein information during DTI prediction. This leads to an inability to effectively predict the interactions between novel drugs and their targets. To address this, a cross-field information fusion strategy is employed to acquire local and global protein information. Thus, we propose the siamese drug-target interaction prediction method SiamDTI, which utilizes a double-channel network structure for cross-field supervised learning. Experimental results on three benchmark datasets demonstrate that SiamDTI achieves higher accuracy levels than other state-of-the-art (SOTA) methods on novel drugs and targets. Additionally, SiamDTI's performance with known drugs and targets is comparable to that of SOTA approaches. The code is available at https://anonymous.4open.science/r/DDDTI-434D.
Submitted 23 May, 2024;
originally announced May 2024.
-
Regressor-free Molecule Generation to Support Drug Response Prediction
Authors:
Kun Li,
Xiuwen Gong,
Shirui Pan,
Jia Wu,
Bo Du,
Wenbin Hu
Abstract:
Drug response prediction (DRP) is a crucial phase in drug discovery, and the most important metric for its evaluation is the IC50 score. DRP results are heavily dependent on the quality of the generated molecules. Existing molecule generation methods typically employ classifier-based guidance, enabling sampling within the IC50 classification range. However, these methods fail to ensure the sampling space range's effectiveness, generating numerous ineffective molecules. Through experimental and theoretical study, we hypothesize that conditional generation based on the target IC50 score can obtain a more effective sampling space. As a result, we introduce regressor-free guidance molecule generation to ensure sampling within a more effective space and support DRP. Regressor-free guidance combines a diffusion model's score estimation with a regression controller model's gradient based on number labels. To effectively map regression labels between drugs and cell lines, we design a common-sense numerical knowledge graph that constrains the order of text representations. Experimental results on the real-world dataset for the DRP task demonstrate our method's effectiveness in drug discovery. The code is available at: https://anonymous.4open.science/r/RMCD-DBD1.
Submitted 23 May, 2024;
originally announced May 2024.
-
Gradient Transformation: Towards Efficient and Model-Agnostic Unlearning for Dynamic Graph Neural Networks
Authors:
He Zhang,
Bang Wu,
Xiangwen Yang,
Xingliang Yuan,
Chengqi Zhang,
Shirui Pan
Abstract:
Graph unlearning has emerged as an essential tool for safeguarding user privacy and mitigating the negative impacts of undesirable data. Meanwhile, the advent of dynamic graph neural networks (DGNNs) marks a significant advancement due to their superior capability in learning from dynamic graphs, which encapsulate spatial-temporal variations in diverse real-world applications (e.g., traffic forecasting). With the increasing prevalence of DGNNs, it becomes imperative to investigate the implementation of dynamic graph unlearning. However, current graph unlearning methodologies are designed for GNNs operating on static graphs and exhibit limitations, including operating in a pre-processing manner and imposing impractical resource demands. Furthermore, the adaptation of these methods to DGNNs presents non-trivial challenges, owing to the distinctive nature of dynamic graphs. To this end, we propose an effective, efficient, model-agnostic, and post-processing method to implement DGNN unlearning. Specifically, we first define the unlearning requests and formulate dynamic graph unlearning in the context of continuous-time dynamic graphs. After conducting a role analysis on the unlearning data, the remaining data, and the target DGNN model, we propose a method called Gradient Transformation and a loss function to map the unlearning request to the desired parameter update. Evaluations on six real-world datasets and state-of-the-art DGNN backbones demonstrate its effectiveness (e.g., limited performance drop or even clear improvement), efficiency (e.g., up to 7.23$\times$ speed-up), and potential advantages in handling future unlearning requests (e.g., up to 32.59$\times$ speed-up).
Submitted 23 May, 2024;
originally announced May 2024.
-
Graph Sparsification via Mixture of Graphs
Authors:
Guibin Zhang,
Xiangguo Sun,
Yanwei Yue,
Kun Wang,
Tianlong Chen,
Shirui Pan
Abstract:
Graph Neural Networks (GNNs) have demonstrated superior performance across various graph learning tasks but face significant computational challenges when applied to large-scale graphs. One effective approach to mitigate these challenges is graph sparsification, which involves removing non-essential edges to reduce computational overhead. However, previous graph sparsification methods often rely on a single global sparsity setting and uniform pruning criteria, failing to provide customized sparsification schemes for each node's complex local context. In this paper, we introduce Mixture-of-Graphs (MoG), leveraging the concept of Mixture-of-Experts (MoE), to dynamically select tailored pruning solutions for each node. Specifically, MoG incorporates multiple sparsifier experts, each characterized by unique sparsity levels and pruning criteria, and selects the appropriate experts for each node. Subsequently, MoG performs a mixture of the sparse graphs produced by different experts on the Grassmann manifold to derive an optimal sparse graph. One notable property of MoG is its entirely local nature, as it depends on the specific circumstances of each individual node. Extensive experiments on four large-scale OGB datasets and two superpixel datasets, equipped with five GNN backbones, demonstrate that MoG (I) identifies subgraphs at higher sparsity levels ($8.67\%\sim 50.85\%$), with performance equal to or better than the dense graph, (II) achieves $1.47-2.62\times$ speedup in GNN inference with negligible performance drop, and (III) boosts ``top-student'' GNN performance ($1.02\%\uparrow$ on RevGNN+\textsc{ogbn-proteins} and $1.74\%\uparrow$ on DeeperGCN+\textsc{ogbg-ppa}).
Submitted 23 May, 2024;
originally announced May 2024.
-
Large Language Models-guided Dynamic Adaptation for Temporal Knowledge Graph Reasoning
Authors:
Jiapu Wang,
Kai Sun,
Linhao Luo,
Wei Wei,
Yongli Hu,
Alan Wee-Chung Liew,
Shirui Pan,
Baocai Yin
Abstract:
Temporal Knowledge Graph Reasoning (TKGR) is the process of utilizing temporal information to capture complex relations within a Temporal Knowledge Graph (TKG) to infer new knowledge. Conventional methods in TKGR typically depend on deep learning algorithms or temporal logical rules. However, deep learning-based TKGRs often lack interpretability, whereas rule-based TKGRs struggle to effectively learn temporal rules that capture temporal patterns. Recently, Large Language Models (LLMs) have demonstrated extensive knowledge and remarkable proficiency in temporal reasoning. Consequently, the employment of LLMs for Temporal Knowledge Graph Reasoning (TKGR) has sparked increasing interest among researchers. Nonetheless, LLMs are known to function as black boxes, making it challenging to comprehend their reasoning process. Additionally, due to the resource-intensive nature of fine-tuning, promptly updating LLMs to integrate evolving knowledge within TKGs for reasoning is impractical. To address these challenges, in this paper, we propose a Large Language Models-guided Dynamic Adaptation (LLM-DA) method for reasoning on TKGs. Specifically, LLM-DA harnesses the capabilities of LLMs to analyze historical data and extract temporal logical rules. These rules unveil temporal patterns and facilitate interpretable reasoning. To account for the evolving nature of TKGs, a dynamic adaptation strategy is proposed to update the LLM-generated rules with the latest events. This ensures that the extracted rules always incorporate the most recent knowledge and better generalize to the predictions on future events. Experimental results show that without the need of fine-tuning, LLM-DA significantly improves the accuracy of reasoning over several common datasets, providing a robust framework for TKGR tasks.
Submitted 23 May, 2024;
originally announced May 2024.
-
Resilience Analysis of Multi-modal Logistics Service Network Through Robust Optimization with Budget-of-Uncertainty
Authors:
Yaxin Pang,
Shenle Pan,
Eric Ballot
Abstract:
Supply chain resilience analysis aims to identify the critical elements in the supply chain, measure its reliability, and analyze solutions for improving vulnerabilities. While methods such as stochastic approaches have been dominant, robust optimization, which is widely applied in robust planning under uncertainties without specific probability distributions, remains relatively underexplored for this research problem. This paper employs robust optimization with budget-of-uncertainty as a tool to analyze the resilience of multi-modal logistics service networks under time uncertainty. We examine the interactive effects of three critical factors: network size, disruption scale, and disruption degree. The computational experiments offer valuable managerial insights for practitioners and researchers.
Submitted 21 May, 2024;
originally announced May 2024.
-
Hi-GMAE: Hierarchical Graph Masked Autoencoders
Authors:
Chuang Liu,
Zelin Yao,
Yibing Zhan,
Xueqi Ma,
Dapeng Tao,
Jia Wu,
Wenbin Hu,
Shirui Pan,
Bo Du
Abstract:
Graph Masked Autoencoders (GMAEs) have emerged as a notable self-supervised learning approach for graph-structured data. Existing GMAE models primarily focus on reconstructing node-level information, categorizing them as single-scale GMAEs. This methodology, while effective in certain contexts, tends to overlook the complex hierarchical structures inherent in many real-world graphs. For instance, molecular graphs exhibit a clear hierarchical organization in the form of the atoms-functional groups-molecules structure. Hence, the inability of single-scale GMAE models to incorporate these hierarchical relationships often leads to their inadequate capture of crucial high-level graph information, resulting in a noticeable decline in performance. To address this limitation, we propose Hierarchical Graph Masked AutoEncoders (Hi-GMAE), a novel multi-scale GMAE framework designed to handle the hierarchical structures within graphs. First, Hi-GMAE constructs a multi-scale graph hierarchy through graph pooling, enabling the exploration of graph structures across different granularity levels. To ensure masking uniformity of subgraphs across these scales, we propose a novel coarse-to-fine strategy that initiates masking at the coarsest scale and progressively back-projects the mask to the finer scales. Furthermore, we integrate a gradual recovery strategy with the masking process to mitigate the learning challenges posed by completely masked subgraphs. Diverging from the standard graph neural network (GNN) used in GMAE models, Hi-GMAE modifies its encoder and decoder into hierarchical structures. This entails using GNN at the finer scales for detailed local graph analysis and employing a graph transformer at coarser scales to capture global information. Our experiments on 15 graph datasets consistently demonstrate that Hi-GMAE outperforms 17 state-of-the-art self-supervised competitors.
Submitted 17 May, 2024;
originally announced May 2024.
-
All Nodes are created Not Equal: Node-Specific Layer Aggregation and Filtration for GNN
Authors:
Shilong Wang,
Hao Wu,
Yifan Duan,
Guibin Zhang,
Guohao Li,
Yuxuan Liang,
Shirui Pan,
Kun Wang,
Yang Wang
Abstract:
The ever-designed Graph Neural Networks, though opening a promising path for the modeling of graph-structured data, unfortunately introduce two daunting obstacles to their deployment on devices. (I) Most existing GNNs are shallow, due mostly to the over-smoothing and gradient-vanishing problems that arise as they go deeper as convolutional architectures. (II) The vast majority of GNNs adhere to the homophily assumption, where the central node and its adjacent nodes share the same label. This assumption often poses challenges for many GNNs working with heterophilic graphs. Addressing the aforementioned issues has become a looming challenge in enhancing the robustness and scalability of GNN applications. In this paper, we take a comprehensive and systematic approach to overcoming the two aforementioned challenges for the first time. We propose a Node-Specific Layer Aggregation and Filtration architecture, termed NoSAF, a framework capable of filtering and processing information from each individual node. NoSAF introduces the concept of "All Nodes are Created Not Equal" into every layer of deep networks, aiming to provide a reliable information filter for each layer's nodes to sieve out information beneficial for the subsequent layer. By incorporating a dynamically updated codebank, NoSAF dynamically optimizes the information passed downwards at each layer. This effectively overcomes heterophilic issues and aids in deepening the network. To compensate for the information loss caused by the continuous filtering in NoSAF, we also propose NoSAF-D (Deep), which incorporates a compensation mechanism that replenishes information in every layer of the model, allowing NoSAF to perform meaningful computations even in very deep layers.
Submitted 13 May, 2024;
originally announced May 2024.
-
Don't Chase Your Tail! Missing Key Aspects Augmentation in Textual Vulnerability Descriptions of Long-tail Software through Feature Inference
Authors:
Linyi Han,
Shidong Pan,
Zhenchang Xing,
Jiamou Sun,
Sofonias Yitagesu,
Xiaowang Zhang,
Zhiyong Feng
Abstract:
Augmenting missing key aspects in Textual Vulnerability Descriptions (TVDs) for software with a large user base (referred to as non-long-tail software) has greatly advanced vulnerability analysis and software security research. However, these methods often overlook software instances that have a limited user base (referred to as long-tail software) due to limited TVDs, variations in software features, and domain-specific jargon, which hinders vulnerability analysis and software repairs. In this paper, we introduce a novel software feature inference framework designed to augment the missing key aspects of TVDs for long-tail software. Firstly, we tackle the issue of non-standard software names found in community-maintained vulnerability databases by cross-referencing government databases with Common Vulnerabilities and Exposures (CVEs). Next, we employ Large Language Models (LLMs) to generate the missing key aspects. However, the limited availability of historical TVDs restricts the variety of examples. To overcome this limitation, we utilize the Common Weakness Enumeration (CWE) to classify all TVDs and select cluster centers as representative examples. To ensure accuracy, we present Natural Language Inference (NLI) models specifically designed for long-tail software. These models identify and eliminate incorrect responses. Additionally, we use a wiki repository to provide explanations for proprietary terms. Our evaluations demonstrate that our approach significantly improves the accuracy of augmenting missing key aspects of TVDs for long-tail software from 0.27 to 0.56 (+107%). Interestingly, the accuracy of non-long-tail software also increases from 64% to 71%. As a result, our approach can be useful in various downstream tasks that require complete TVD information.
Submitted 12 May, 2024;
originally announced May 2024.
-
Implication of the velocity dispersion scalings on high-mass star formation in molecular clouds
Authors:
An-Xu Luo,
Hong-Li Liu,
Sheng-Li Qin,
Dong-Ting Yang,
Sirong Pan
Abstract:
This paper is aimed at exploring implications of velocity dispersion scalings on high-mass star formation in molecular clouds, including the scalings of Larson's linewidth--size ($σ$--$R$) and ratio--mass surface density ($\cal{L}$--$Σ$; here $\cal{L}$$=σ/R^{0.5}$). We have systematically analyzed the $σ$ parameter of 221 well-selected massive clumps, complemented with published samples of other hierarchical density structures of molecular clouds over spatial scales of 0.01--10 pc. Those massive clumps are classified into four phases: quiescent, protostellar, HII region, and PDR clumps in an evolutionary sequence. The velocity dispersion of clumps increases overall with the evolutionary sequence, reflecting enhanced stellar feedback in more evolved phases. The relations of $σ$--$R$ and $\cal{L}$--$Σ$ are weak with the clump sample alone, but become evident when combined with others spanning a much wider range of spatial scales. For $σ$--$R$, its tight relation indicates a kinematic connection between hierarchical density structures, supporting theoretical models of multiscale high-mass star formation. From the $\cal{L}$--$Σ$ relation, cloud structures are found to transition from an over-virial state ($α_\mathrm{vir} > 2$) to a sub-virial state ($α_\mathrm{vir} < 2$) as they become smaller and denser, indicating a possible shift in the governing force from turbulence to gravity. This implies that the multiscale physical process of high-mass star formation hinges on the self-gravity of sub-virial molecular clouds. However, the influence of turbulence may not be dismissed until large-scale clouds attain a sub-virial state. This is pending confirmation from future multiscale kinematic observations of molecular clouds with uniform observing settings.
Submitted 8 May, 2024;
originally announced May 2024.
-
What role of gravity, turbulence and magnetic fields in high-mass star formation clouds?
Authors:
An-Xu Luo,
Hong-Li Liu,
Guang-Xing Li,
Sirong Pan,
Dong-Ting Yang
Abstract:
To explore the potential role of gravity, turbulence and magnetic fields in high-mass star formation in molecular clouds, this study revisits the velocity dispersion--size ($σ$--$L$) and density--size ($ρ$--$L$) scalings and the associated turbulent energy spectrum using an extensive data sample. The sample includes various hierarchical density structures in high-mass star formation clouds, across scales of 0.01 to 100 pc. We observe $σ\propto L^{0.26}$ and $ρ\propto L^{-1.54}$ scalings, converging toward a virial equilibrium state. A nearly flat virial parameter--mass ($α_{\rm vir}-M$) distribution is seen across all density scales, with $α_{\rm vir}$ values centered around unity, suggesting a global equilibrium maintained by the interplay between gravity and turbulence across multiple scales. Our turbulent energy spectrum ($E(k)$) analysis, based on the $σ$--$L$ and $ρ$--$L$ scalings, yields a characteristic $E(k) \propto k^{-1.52}$. These findings indicate the potential significance of gravity, turbulence, and possibly magnetic fields in regulating the dynamics of molecular clouds and the high-mass star formation therein.
Submitted 8 May, 2024;
originally announced May 2024.
-
Constraining the mass-spectra in the presence of a light sterile neutrino from absolute mass-related observables
Authors:
Srubabati Goswami,
Debashis Pachhar,
Supriya Pan
Abstract:
The framework of three-flavor neutrino oscillation is well established, but results from the short-baseline experiments, such as the Liquid Scintillator Neutrino Detector (LSND) and MiniBooster Neutrino Experiment (MiniBooNE), hint at the potential existence of an additional light neutrino state characterized by a mass-squared difference of approximately $1\,\rm eV^2$. The new neutrino state is devoid of all Standard Model (SM) interactions and is commonly referred to as a 'sterile' state. In addition, a sterile neutrino with a mass-squared difference of $10^{-2}$ $\rm eV^2$ has been proposed to alleviate the tension between the results obtained from the Tokai to Kamioka (T2K) and the NuMI Off-axis $ν_e$ Appearance (NO$ν$A) experiments. Further, the non-observation of the predicted upturn in the solar neutrino spectra below 8 MeV can be explained by postulating an extra light sterile neutrino state with a mass-squared difference around $10^{-5} \rm eV^2$. The hypothesis of an additional light sterile neutrino state introduces four distinct mass spectra depending on the sign of the mass-squared difference. In this paper, we discuss the implications of the above scenarios on the observables that depend on the absolute mass of the neutrinos, namely the sum of the light neutrino masses $(Σ)$ from cosmology, the effective mass of the electron neutrino from beta decay $(m_β)$, and the effective Majorana mass $( m_{ββ})$ from neutrinoless double beta decay. We show that some scenarios can be disfavored by the current constraints on the above variables. The implications for the projected sensitivity of the Karlsruhe Tritium Neutrino Experiment (KATRIN) and future experiments like Project-8 and the next Enriched Xenon Observatory (nEXO) are discussed.
Submitted 7 May, 2024;
originally announced May 2024.
-
Nonnegative Matrix Factorization in Dimensionality Reduction: A Survey
Authors:
Farid Saberi-Movahed,
Kamal Berahman,
Razieh Sheikhpour,
Yuefeng Li,
Shirui Pan
Abstract:
Dimensionality Reduction plays a pivotal role in improving feature learning accuracy and reducing training time by eliminating redundant features, noise, and irrelevant data. Nonnegative Matrix Factorization (NMF) has emerged as a popular and powerful method for dimensionality reduction. Despite its extensive use, there remains a need for a comprehensive analysis of NMF in the context of dimensionality reduction. To address this gap, this paper presents a comprehensive survey of NMF, focusing on its applications in both feature extraction and feature selection. We introduce a classification of dimensionality reduction, enhancing understanding of the underlying concepts. Subsequently, we delve into a thorough summary of diverse NMF approaches used for feature extraction and selection. Furthermore, we discuss the latest research trends and potential future directions of NMF in dimensionality reduction, aiming to highlight areas that need further exploration and development.
Submitted 6 May, 2024;
originally announced May 2024.
-
A family of air-stable chalcogenide solid electrolytes in Li$_2$BMQ$_4$ (B = Ca, Sr and Ba; M = Si, Ge and Sn; Q = O, S and Se) systems
Authors:
Huican Mao,
Xiang Zhu,
Guangmao Li,
Jie Pang,
Junfeng Hao,
Liqi Wang,
Hailong Yu,
Youguo Shi,
Fan Wu,
Shilie Pan,
Ruijuan Xiao,
Hong Li,
Liquan Chen
Abstract:
Combining high-throughput first-principles calculations and experimental measurements, we have identified a novel family of fast lithium-ion chalcogenide conductors in Li$_2$BMQ$_4$ (2114, B = Ca, Sr and Ba; M = Si, Ge and Sn; Q = O, S and Se) systems. Our calculations demonstrate that most of the thermodynamically and kinetically stable sulfides and selenides in this new system exhibit ultralow Li$^+$ ion migration activation energies (0.16 eV to 0.56 eV) and considerable bandgaps varying between ~2 eV and 3.5 eV. We have successfully synthesized Li$_2$BaSnS$_4$ and Li$_2$SrSiS$_4$, and they exhibit excellent moisture stability through H$_2$S gas measurements. Electrochemical impedance measurements indicate that 2114 systems show the typical features of solid ionic conductors, with a room-temperature Li$^+$ conductivity close to 5$\times$10$^{-4}$ mS/cm, consistent with our molecular dynamics simulations. Furthermore, we have theoretically investigated the substitution of Cl$^-$ at the S$^{2-}$ site. The doped compounds display significantly higher conductivity, with an increase of about three orders of magnitude (up to a maximum of 0.72 mS/cm) compared to the undoped compounds. These findings offer valuable insights for the further exploration of potential chalcogenide solid electrolyte materials with robust air stability and enhanced ionic conductivity for practical applications in lithium-ion batteries.
Submitted 6 May, 2024;
originally announced May 2024.
-
A Survey on Diffusion Models for Time Series and Spatio-Temporal Data
Authors:
Yiyuan Yang,
Ming Jin,
Haomin Wen,
Chaoli Zhang,
Yuxuan Liang,
Lintao Ma,
Yi Wang,
Chenghao Liu,
Bin Yang,
Zenglin Xu,
Jiang Bian,
Shirui Pan,
Qingsong Wen
Abstract:
The study of time series is crucial for understanding trends and anomalies over time, enabling predictive insights across various sectors. Spatio-temporal data, on the other hand, is vital for analyzing phenomena in both space and time, providing a dynamic perspective on complex system interactions. Recently, diffusion models have seen widespread application in time series and spatio-temporal data mining. Not only do they enhance the generative and inferential capabilities for sequential and temporal data, but they also extend to other downstream tasks. In this survey, we comprehensively review the use of diffusion models in time series and spatio-temporal data, categorizing them by model category, task type, data modality, and practical application domain. In detail, we categorize diffusion models into unconditioned and conditioned types and discuss time series and spatio-temporal data separately. Unconditioned models, which operate unsupervised, are subdivided into probability-based and score-based models, serving predictive and generative tasks such as forecasting, anomaly detection, classification, and imputation. Conditioned models, on the other hand, utilize extra information to enhance performance and are similarly divided for both predictive and generative tasks. Our survey extensively covers their application in various fields, including healthcare, recommendation, climate, energy, audio, and transportation, providing a foundational understanding of how these models analyze and generate data. Through this structured overview, we aim to provide researchers and practitioners with a comprehensive understanding of diffusion models for time series and spatio-temporal data analysis, and to direct future innovations and applications by addressing traditional challenges and exploring innovative solutions within the diffusion model framework.
Submitted 11 June, 2024; v1 submitted 29 April, 2024;
originally announced April 2024.
-
Automating Zero-Shot Patch Porting for Hard Forks
Authors:
Shengyi Pan,
You Wang,
Zhongxin Liu,
Xing Hu,
Xin Xia,
Shanping Li
Abstract:
Forking is a typical way of code reuse, which provides a simple way for developers to create a software variant (denoted as a hard fork) by copying and modifying an existing codebase. Despite its benefits, forking also leads to duplicated effort in software maintenance. Developers need to port patches across the hard forks to address similar bugs or implement similar features. Due to the divergence between the source project and the hard fork, patch porting is complicated, requiring adaptation to different implementations of the same functionality. In this work, we take the first step to automate patch porting for hard forks under a zero-shot setting. We first conduct an empirical study of the patches ported from Vim to Neovim over the last ten years to investigate the necessity of patch porting and the potential flaws in the current practice. We then propose a large language model (LLM) based approach (namely PPatHF) to automatically port patches for hard forks on a function-wise basis. Specifically, PPatHF is composed of a reduction module and a porting module. Given the pre- and post-patch versions of a function from the reference project and the corresponding function from the target project, the reduction module first slims the input functions by removing code snippets less relevant to the patch. Then, the porting module leverages an LLM to apply the patch to the function from the target project. We evaluate PPatHF on 310 Neovim patches ported from Vim. The experimental results show that PPatHF outperforms the baselines significantly. Specifically, PPatHF can correctly port 131 (42.3%) patches and automate 57% of the manual edits required for the developer to port the patch.
Submitted 27 April, 2024;
originally announced April 2024.