Revisiting the Formulation of Charged Defect in Solids

Authors: Hanzhi Shang, Zeyu Jiang, Yiyang Sun, Damien West, Shengbai Zhang

Abstract: Defect physics is at the heart of microelectronics. By keeping track of the reference energy in total energy calculations, we explicitly show that the "potential alignment" correction vanishes, and the classic Makov-Payne correction yields accurate results. From linear response theory, we further formulate an accurate expression for the quadrupole correction. Application to numerous defects, including those in anisotropic materials, yields accurate formation energies in small supercells, and the historically slow convergence of the 2+ diamond vacancy is shown to be a result of slowly varying gap levels of the defect leading to a size-dependent dielectric constant.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.02005 [pdf, other]

An End-to-End Speech Summarization Using Large Language Model

Authors: Hengchao Shang, Zongyao Li, Jiaxin Guo, Shaojun Li, Zhiqiang Rao, Yuanchang Luo, Daimeng Wei, Hao Yang

Abstract: Abstractive Speech Summarization (SSum) aims to generate human-like text summaries from spoken content. It encounters difficulties in handling long speech input and capturing the intricate cross-modal mapping between long speech inputs and short text summaries. Research on large language models (LLMs) and multimodal information fusion has provided new insights for addressing these challenges. In this paper, we propose an end-to-end SSum model that utilizes Q-Former as a connector for the audio-text modality and employs LLMs to generate text summaries directly from speech features. We adopt a multi-stage training approach that includes LLM-based ASR and Text Summarization (TSum) tasks as auxiliary tasks. ASR tasks are used to align feature spaces and enhance the LLM's ability to handle longer speech. We then use a curriculum learning strategy to facilitate the model's transition from TSum to SSum. Finally, our model achieves competitive performance on the How-2 dataset.

Submitted 2 July, 2024; originally announced July 2024.

Comments: InterSpeech 2024

arXiv:2406.09180 [pdf, other]

Detection-Rate-Emphasized Multi-objective Evolutionary Feature Selection for Network Intrusion Detection

Authors: Zi-Hang Cheng, Haopu Shang, Chao Qian

Abstract: Network intrusion detection is one of the most important issues in the field of cyber security, and various machine learning techniques have been applied to build intrusion detection systems. However, since the number of features describing network connections is often large, and some features are redundant or noisy, feature selection is necessary in such scenarios and can improve both efficiency and accuracy. Recently, some researchers have focused on using multi-objective evolutionary algorithms (MOEAs) to select features. But usually, they only consider the number of features and classification accuracy as the objectives, resulting in unsatisfactory performance on a critical metric, detection rate. This causes many real attacks to be missed and brings huge losses to the network system. In this paper, we propose DR-MOFS to model the feature selection problem in network intrusion detection as a three-objective optimization problem, where the number of features, accuracy and detection rate are optimized simultaneously, and use MOEAs to solve it. Experiments on two popular network intrusion detection datasets, NSL-KDD and UNSW-NB15, show that in most cases the proposed method can outperform previous methods, i.e., lead to fewer features, higher accuracy and higher detection rate.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.08801 [pdf, other]

Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

Authors: Mingwang Xu, Hui Li, Qingkun Su, Hanlin Shang, Liwei Zhang, Ce Liu, Jingdong Wang, Yao Yao, Siyu Zhu

Abstract: The field of portrait image animation, driven by speech audio input, has experienced significant advancements in the generation of realistic and dynamic portraits. This research delves into the complexities of synchronizing facial movements and creating visually appealing, temporally consistent animations within the framework of diffusion-based methodologies. Moving away from traditional paradigms that rely on parametric models for intermediate facial representations, our innovative approach embraces the end-to-end diffusion paradigm and introduces a hierarchical audio-driven visual synthesis module to enhance the precision of alignment between audio inputs and visual outputs, encompassing lip, expression, and pose motion. Our proposed network architecture seamlessly integrates diffusion-based generative models, a UNet-based denoiser, temporal alignment techniques, and a reference network. The proposed hierarchical audio-driven visual synthesis offers adaptive control over expression and pose diversity, enabling more effective personalization tailored to different identities. Through a comprehensive evaluation that incorporates both qualitative and quantitative analyses, our approach demonstrates obvious enhancements in image and video quality, lip synchronization precision, and motion diversity. Further visualization and access to the source code can be found at: https://fudan-generative-vision.github.io/hallo.

Submitted 16 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

Comments: 20 pages

arXiv:2406.04791 [pdf, other]

Speaker-Smoothed kNN Speaker Adaptation for End-to-End ASR

Authors: Shaojun Li, Daimeng Wei, Hengchao Shang, Jiaxin Guo, ZongYao Li, Zhanglin Wu, Zhiqiang Rao, Yuanchang Luo, Xianghui He, Hao Yang

Abstract: Despite recent improvements in End-to-End Automatic Speech Recognition (E2E ASR) systems, the performance can degrade due to vocal characteristic mismatches between training and testing data, particularly with limited target speaker adaptation data. We propose a novel speaker adaptation approach, Speaker-Smoothed kNN, that leverages k-Nearest Neighbors (kNN) retrieval techniques to improve model output by finding correctly pronounced tokens from its pre-built datastore during the decoding phase. Moreover, we use x-vectors to dynamically adjust the kNN interpolation parameters to address the data sparsity issue. This approach was validated using the KeSpeech and MagicData corpora under in-domain and all-domain settings. Our method consistently performs comparably to fine-tuning without the associated performance degradation during speaker changes. Furthermore, in the all-domain setting, our method achieves state-of-the-art results, reducing the CER in both single-speaker and multi-speaker test scenarios.

Submitted 1 July, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

Comments: Accepted to Interspeech 2024

arXiv:2406.04754 [pdf, ps, other]

Global well-posedness and large time behavior for the Oldroyd-B model

Authors: Haifeng Shang

Abstract: This paper studies the global well-posedness and optimal decay estimates for the Oldroyd-B model in $\mathbb R^d$ ($d\geq2$). By utilizing the special structure of this system, we give a simplified proof of the global existence of solutions for the case of initial data small in critical Besov spaces and non-small coupling parameters. Moreover, the optimal decay rates of the solutions under a minimal smallness assumption on the initial data are established by making full use of the effect of the velocity dissipation and the damping mechanism.

Submitted 7 June, 2024; originally announced June 2024.

arXiv:2406.04745 [pdf, other]

Confidence-aware Contrastive Learning for Selective Classification

Authors: Yu-Chang Wu, Shen-Huan Lyu, Haopu Shang, Xiangyu Wang, Chao Qian

Abstract: Selective classification enables models to make predictions only when they are sufficiently confident, aiming to enhance safety and reliability, which is important in high-stakes scenarios. Previous methods mainly use deep neural networks and focus on modifying the architecture of classification layers to enable the model to estimate the confidence of its prediction. This work provides a generalization bound for selective classification, disclosing that optimizing feature layers helps improve the performance of selective classification. Inspired by this theory, we propose to explicitly improve the selective classification model at the feature level for the first time, leading to a novel Confidence-aware Contrastive Learning method for Selective Classification, CCL-SC, which similarizes the features of homogeneous instances and differentiates the features of heterogeneous instances, with the strength controlled by the model's confidence. The experimental results on typical datasets, i.e., CIFAR-10, CIFAR-100, CelebA, and ImageNet, show that CCL-SC achieves significantly lower selective risk than state-of-the-art methods, across almost all coverage degrees. Moreover, it can be combined with existing methods to bring further improvement.

Submitted 7 June, 2024; originally announced June 2024.

Comments: Accepted by ICML 2024

arXiv:2405.16989 [pdf, other]

Uncertainty Learning for High-dimensional Mean-variance Portfolio

Authors: Han Lin Shang, Ruike Wu, Yanrong Yang

Abstract: Accounting for uncertainty in data quality is important for accurate statistical inference. We aim to find an optimal conservative allocation for a large universe of assets in the mean-variance portfolio (MVP), which is the worst choice within the uncertainty in the data distribution. Unlike the low-dimensional MVP studied in Blanchet et al. (2022, Management Science), the large number of assets raises a challenging problem in quantifying the uncertainty, due to the large deviation of the sample covariance matrix from the population version. To overcome this difficulty, we propose a data-adaptive method to quantify the uncertainty with the help of a factor structure. Monte Carlo simulations are conducted to show the superiority of our method in high-dimensional cases: by avoiding the over-conservative results of Blanchet et al. (2022), our allocation is closer to the oracle version in terms of risk minimization and expected portfolio return control.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 28 pages, 2 figures, 4 tables

MSC Class: 91G10; 62P05

arXiv:2405.14744 [pdf, other]

Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View

Authors: Xuan Liu, Jie Zhang, Song Guo, Haoyang Shang, Chengxu Yang, Quanyan Zhu

Abstract: Large language models (LLMs) have been shown to face hallucination issues because the data they are trained on often contains human bias; whether this is reflected in the decision-making process of LLM agents remains under-explored. As LLM Agents are increasingly employed in intricate social environments, a pressing and natural question emerges: Can LLM Agents leverage hallucinations to mirror human cognitive biases, thus exhibiting irrational social intelligence? In this paper, we probe the irrational behavior among contemporary LLM agents by melding practical social science experiments with theoretical insights. Specifically, we propose CogMir, an open-ended Multi-LLM Agents framework that utilizes hallucination properties to assess and enhance LLM Agents' social intelligence through cognitive biases. Experimental results on CogMir subsets show that LLM Agents and humans exhibit high consistency in irrational and prosocial decision-making under uncertain conditions, underscoring the prosociality of LLM Agents as social entities, and highlighting the significance of hallucination properties. Additionally, the CogMir framework demonstrates its potential as a valuable platform for encouraging more research into the social intelligence of LLM Agents.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.09164 [pdf]

Rapidly Achieving Chemical Accuracy with Quantum Computing Enforced Language Model

Authors: Honghui Shang, Xiongzhi Zeng, Ming Gong, Yangju Wu, Shaojun Guo, Haoran Qian, Chen Zha, Zhijie Fan, Kai Yan, Xiaobo Zhu, Zhenyu Li, Yi Luo, Jian-Wei Pan, Jinlong Yang

Abstract: Finding the accurate ground-state energy of a many-body system has been a major challenge in quantum chemistry. The integration of classical and quantum computers has shed new light on resolving this outstanding problem. Here we propose QiankunNet-VQE, a transformer-based language model enforced with quantum computing to learn and generate quantum states. It has been implemented using up to 12 qubits and attains an accuracy level competitive with state-of-the-art classical methods. By leveraging both quantum and classical resources, this scheme overcomes the limitations of the variational quantum eigensolver (VQE) without the need for cumbersome error mitigation. Moreover, QiankunNet-VQE provides a different route to achieving a practical quantum advantage for solving the many-electron Schrödinger equation without requiring extremely precise preparation and measurement of the ground-state wavefunction on a quantum computer.

Submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.04904 [pdf, other]

Dependence-based fuzzy clustering of functional time series

Authors: Angel Lopez-Oriona, Ying Sun, Han Lin Shang

Abstract: Time series clustering is an important data mining task with a wide variety of applications. While most methods focus on time series taking values on the real line, very few works consider functional time series. However, functional objects frequently arise in many fields, such as actuarial science, demography or finance. Functional time series are indexed collections of infinite-dimensional curves viewed as random elements taking values in a Hilbert space. In this paper, the problem of clustering functional time series is addressed. To this aim, a distance between functional time series is introduced and used to construct a clustering procedure. The metric relies on a measure of serial dependence which can be seen as a natural extension of the classical quantile autocorrelation function to the functional setting. Since the dynamics of the series may vary over time, we adopt a fuzzy approach, which enables the procedure to locate each series into several clusters with different membership degrees. The resulting algorithm can group series generated from similar stochastic processes, reaching accurate results with series coming from a broad variety of functional models and requiring minimum hyperparameter tuning. Several simulation experiments show that the method exhibits a high clustering accuracy besides being computationally efficient. Two interesting applications involving high-frequency financial time series and age-specific mortality improvement rates illustrate the potential of the proposed approach.

Submitted 8 May, 2024; originally announced May 2024.

Comments: 43 pages, 5 figures, 10 tables. arXiv admin note: substantial text overlap with arXiv:2402.08687

MSC Class: 62R10

arXiv:2404.10542 [pdf, other]

Statistical analysis of pulsar flux density distribution

Authors: H. W. Xu, R. S. Zhao, Erbil Gugercinoglu, H. Liu, D. Li, P. Wang, C. H. Niu, C. Miao, X. Zhu, R. W. Tian, W. L. Li, S. D. Wang, Z. F. Tu, Q. J. Zhi, S. J. Dang, L. H. Shang, S. Xiao

Abstract: This study presents a comprehensive analysis of the spectral properties of 886 pulsars across a wide frequency range from 20 MHz to 343.5 GHz, including a total of 86 millisecond pulsars. The majority of the pulsars exhibit power-law behavior in their spectra, although some exceptions are observed. Five different spectral models, namely simple power-law, broken power-law, low-frequency turn-over, high-frequency cut-off, and double turn-over, were employed to explore the spectral behaviors. The average spectral index for pulsars modeled with a simple power-law is found to be -1.64 ± 0.80, consistent with previous studies. Additionally, significant correlations between the spectral index and characteristic parameters are observed, particularly in millisecond pulsars, while no strong correlation is observed in normal pulsars. Different models show variations in the most influential characteristic parameters associated with the spectral index, indicating diverse dominant radiation mechanisms in millisecond pulsars. Finally, this study identifies 22 pulsars of the Gigahertz-peaked Spectra (GPS) type for the first time based on the Akaike information criterion.

Submitted 16 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

Comments: 39 pages, 17 figures

arXiv:2404.10492 [pdf, other]

Efficient structural relaxation based on the random phase approximation: Applications to the water clusters

Authors: Muhammad N. Tahir, Honghui Shang, Jia Li, Xinguo Ren

Abstract: We report an improved implementation for evaluating the analytical gradients of the random phase approximation (RPA) electron-correlation energy based on atomic orbitals and the localized resolution of identity scheme. The more efficient RPA force calculations allow us to relax structures of medium-size water clusters. Particular attention is paid to the structures and energy orderings of the low-energy isomers of (H$_2$O)$_n$ clusters with $n=21$, 22, and 25. It is found that the energy ordering of the low-energy isomers of these water clusters is rather sensitive to how their structures are determined. For the five low-energy isomers of (H$_2$O)$_{25}$, the RPA energy ordering based on the RPA geometries is quite different from that based on the geometries relaxed by lower-level theories, in contrast with the situation of small water clusters like the water hexamer. The standard RPA underbinds the water clusters, and this underbinding behavior gets more pronounced as the complete basis set (CBS) limit is approached. The renormalized single excitation (rSE) correction remedies this underbinding, giving rise to a noticeable overbinding behavior at finite basis sets. However, as the CBS limit is approached, RPA+rSE yields an accuracy for the binding energies that is comparable to the best available double hybrid functionals, as demonstrated for the WATER27 test set.

Submitted 16 April, 2024; originally announced April 2024.

arXiv:2403.13340 [pdf, other]

Forecasting density-valued functional panel data

Authors: Cristian F. Jiménez-Varón, Ying Sun, Han Lin Shang

Abstract: We introduce a statistical method for modeling and forecasting functional panel data, where each element is a density. Density functions are nonnegative and have a constrained integral and thus do not constitute a linear vector space. We implement a centered log-ratio transformation to transform densities into unconstrained functions. These functions exhibit cross-sectional correlation and temporal dependence. Via a functional analysis of variance decomposition, we decompose the unconstrained functional panel data into a deterministic trend component and a time-varying residual component. To produce forecasts for the time-varying component, a functional time series forecasting method, based on the estimation of the long-range covariance, is implemented. By combining the forecasts of the time-varying residual component with the deterministic trend component, we obtain h-step-ahead forecast curves for multiple populations. Illustrated by age- and sex-specific life-table death counts in the United States, we apply our proposed method to generate forecasts of the life-table death counts for 51 states.

Submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.11430 [pdf, other]

A Novel Paradigm Boosting Translation Capabilities of Large Language Models

Authors: Jiaxin Guo, Hao Yang, Zongyao Li, Daimeng Wei, Hengchao Shang, Xiaoyu Chen

Abstract: This paper presents a study on strategies to enhance the translation capabilities of large language models (LLMs) in the context of machine translation (MT) tasks. The paper proposes a novel paradigm consisting of three stages: Secondary Pre-training using Extensive Monolingual Data, Continual Pre-training with Interlinear Text Format Documents, and Leveraging Source-Language Consistent Instruction for Supervised Fine-Tuning. Previous research on LLMs focused on various strategies for supervised fine-tuning (SFT), but their effectiveness has been limited. While traditional machine translation approaches rely on vast amounts of parallel bilingual data, our paradigm highlights the importance of using smaller sets of high-quality bilingual data. We argue that the focus should be on augmenting LLMs' cross-lingual alignment abilities during pre-training rather than solely relying on extensive bilingual data during SFT. Experiments conducted using the Llama2 model, particularly on Chinese-Llama2 after monolingual augmentation, demonstrate the improved translation capabilities of LLMs. A significant contribution of our approach lies in Stage 2: Continual Pre-training with Interlinear Text Format Documents, which requires less than 1B training data, making our method highly efficient. Additionally, in Stage 3, we observed that setting instructions consistent with the source language benefits the supervised fine-tuning process.
Experimental results demonstrate that our approach surpasses previous work and achieves superior performance compared to models such as NLLB-54B and GPT3.5-text-davinci-003, despite having a significantly smaller parameter count of only 7B or 13B. This achievement establishes our method as a pioneering strategy in the field of machine translation.

Submitted 15 April, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

Comments: Accepted in NAACL 2024

arXiv:2403.03574 [pdf, other]

Formation of limb-brightened radio jets by angle-dependent energy extraction from rapidly rotating black holes

Authors: Kouichi Hirotani, Hsien Shang, Ruben Krasnopolsky, Kenichi Nishikawa

Abstract: By general relativistic magnetohydrodynamic simulations, it is suggested that the rotational energy of a rapidly rotating black hole (BH) is preferentially extracted along the magnetic field lines threading the event horizon in the middle and lower latitudes. Applying this angle-dependent Poynting flux to the jet downstream, we demonstrate that the jets exhibit limb-brightened structures at various viewing angles, as observed from Mrk 501, M87, and Cyg A between 5 and 75 degrees, and that the limb-brightening is enhanced when the jet is collimated strongly. It is also found that the jet width perpendicular to the propagation direction shrinks at the projected distance of the altitude where the jet collimates from a conical shape (near the BH) to a parabolic one (in the jet). Comparing with the VLBI observations, we show this collimation takes place within the de-projected altitude of 100 Schwarzschild radii from the BH in the case of the M87 jet.

Submitted 6 March, 2024; originally announced March 2024.

Comments: 19 pages, 8 figures. The Astrophysical Journal in press

arXiv:2403.02118 [pdf, other]

Position: Towards Implicit Prompt For Text-To-Image Models

Authors: Yue Yang, Yuqi Lin, Hong Liu, Wenqi Shao, Runjian Chen, Hailong Shang, Yu Wang, Yu Qiao, Kaipeng Zhang, Ping Luo

Abstract: Recent text-to-image (T2I) models have had great success, and many benchmarks have been proposed to evaluate their performance and safety. However, they only consider explicit prompts while neglecting implicit prompts (which hint at a target without explicitly mentioning it). These prompts may evade safety constraints and pose potential threats to the applications of these models. This position paper highlights the current state of T2I models toward implicit prompts. We present a benchmark named ImplicitBench and conduct an investigation on the performance and impacts of implicit prompts with popular T2I models. Specifically, we design and collect more than 2,000 implicit prompts covering three aspects: General Symbols, Celebrity Privacy, and Not-Safe-For-Work (NSFW) Issues, and evaluate six well-known T2I models' capabilities under these implicit prompts. Experiment results show that (1) T2I models are able to accurately create various target symbols indicated by implicit prompts; (2) implicit prompts bring potential risks of privacy leakage for T2I models; and (3) NSFW constraints in most of the evaluated T2I models can be bypassed with implicit prompts. We call for increased attention to the potential and risks of implicit prompts in the T2I community and further investigation into the capabilities and impacts of implicit prompts, advocating for a balanced approach that harnesses their benefits while mitigating their risks.

Submitted 28 May, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

arXiv:2402.02529 [pdf, other]

A Unified Model for Bipolar Outflows from Young Stars: Kinematic and Mixing Structures in HH 30

Authors: Tsung-Han Ai, Chun-Fan Liu, Hsien Shang, Doug Johnstone, Ruben Krasnopolsky

Abstract: The young stellar source HH 30 is a textbook example of an ionic optical jet originating from a disk in an edge-on system shown by the HST. It has a remnant envelope in $^{12}$CO observed by ALMA. The optical jet is characterized by its narrow appearance, large line width at the base, and high temperature inferred from line diagnostics. Three featured structures can be identified, most evident in the transverse position--velocity diagrams: an extremely--high-velocity (EHV) wide-angle wind component with large spectral widths in the optical, a very--low-velocity (VLV) ambient surrounding medium seen in $^{12}$CO, and a low-velocity (LV) region traced by $^{12}$CO nested both in velocity and location between the primary wind and ambient environment. A layered cavity with multiple shells forms nested morphological and kinematic structures around the optical jet. The atomic gas originating from the innermost region of the disk attains a sufficient temperature and ionization to emit brightly in forbidden lines as an optical jet. The wide-angle portion expands, forming a low-density cavity. The filamentary $^{12}$CO encompassing the wind cavity is mixed and advected inward through the action of the magnetic interplay of the wide-angle wind with the molecular ambient medium. The magnetic interplay results in the layered shells penetrating deeply into the vast cavity of tenuous atomic wind material.
The HH 30 system is an ideal manifestation of the unified wind model of \citet{Shang_2020,Shang_2023}, with clearly distinguishable atomic and molecular species mixed through the atomic lightly ionized magnetized wind and the surrounding cold molecular ambient material.

Submitted 4 February, 2024; originally announced February 2024.

Comments: 19 pages, 9 figures, ApJ in press

arXiv:2401.16696 [pdf, ps, other]

Properties of chiral nucleon-nucleon interaction at N$^3$LO with high cutoffs studied by local projection

Authors: Haoyu Shang, Rongzhe Hu, Junchen Pei, Furong Xu

Abstract: The chiral nucleon-nucleon ($NN$) interaction at high cutoffs has been plagued by the presence of spurious bound states. In this work, the chiral $NN$ interaction at N$^3$LO is studied by the local projection method as the cutoff increases. The evolution of short-range behaviors of pion-exchange interactions and contact interactions is intuitively demonstrated. The $P$-channel potentials toward high cutoffs appear to be erratic at short ranges to compromise with phase shifts, while such erratic behaviors can be avoided in $S$ and $D$ channels. Furthermore, a chiral $NN$ interaction at N$^3$LO is studied at a cutoff of 700 MeV. The properties of deuteron and triton are testified with this interaction. Such a hard interaction is expected to provide an alternative choice for studies of short-range correlations and high density nuclear matter.

Submitted 29 January, 2024; originally announced January 2024.

Comments: 16 pages, 12 figures

arXiv:2401.13943 [pdf, other]

Is the age pension in Australia sustainable and fair? Evidence from forecasting the old-age dependency ratio using the Hamilton-Perry model

Authors: Sizhe Chen, Han Lin Shang, Yang Yang

Abstract: The age pension aims to assist eligible elderly Australians who meet specific age and residency criteria in maintaining basic living standards. In designing efficient pension systems, government policymakers seek to satisfy the expectations of the overall aging population in Australia. However, the population's unique demographic characteristics at the state and territory level are often overlooked due to the lack of available data. We use the Hamilton-Perry model, which requires minimum input, to model and forecast the evolution of age-specific populations at the state level. We also integrate the obtained sub-national demographic information to determine sustainable pension ages up to 2051. We also investigate pension welfare distribution in all states and territories to identify disadvantaged residents under the current pension system. Using the sub-national mortality data for Australia from 1971 to 2021 obtained from AHMD (2023), we implement the Hamilton-Perry model with the help of functional time series forecasting techniques. With forecasts of age-specific population sizes for each state and territory, we compute the old age dependency ratio to determine the nationwide sustainable pension age.

Submitted 24 January, 2024; originally announced January 2024.

Comments: 31 pages, 14 figures, 1 table

MSC Class: 62R10

arXiv:2401.05784 [pdf, other]

Covariance Function Estimation for High-Dimensional Functional Time Series with Dual Factor Structures

Authors: Chenlei Leng, Degui Li, Hanlin Shang, Yingcun Xia

Abstract: We propose a flexible dual functional factor model for modelling high-dimensional functional time series. In this model, a high-dimensional fully functional factor parametrisation is imposed on the observed functional processes, whereas a low-dimensional version (via series approximation) is assumed for the latent functional factors. We extend the classic principal component analysis technique for the estimation of a low-rank structure to the estimation of a large covariance matrix of random functions that satisfies a notion of (approximate) functional "low-rank plus sparse" structure; and generalise the matrix shrinkage method to functional shrinkage in order to estimate the sparse structure of functional idiosyncratic components. Under appropriate regularity conditions, we derive the large sample theory of the developed estimators, including the consistency of the estimated factors and functional factor loadings and the convergence rates of the estimated matrices of covariance functions measured by various (functional) matrix norms. Consistent selection of the number of factors and a data-driven rule to choose the shrinkage parameter are discussed. Simulation and empirical studies are provided to demonstrate the finite-sample performance of the developed model and estimation methodology.

Submitted 12 January, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

arXiv:2401.05700 [pdf, other]

R-BI: Regularized Batched Inputs enhance Incremental Decoding Framework for Low-Latency Simultaneous Speech Translation

Authors: Jiaxin Guo, Zhanglin Wu, Zongyao Li, Hengchao Shang, Daimeng Wei, Xiaoyu Chen, Zhiqiang Rao, Shaojun Li, Hao Yang

Abstract: Incremental Decoding is an effective framework that enables the use of an offline model in a simultaneous setting without modifying the original model, making it suitable for Low-Latency Simultaneous Speech Translation. However, this framework may introduce errors when the system outputs from incomplete input. To reduce these output errors, several strategies such as Hold-$n$, LA-$n$, and SP-$n$ can be employed, but the hyper-parameter $n$ needs to be carefully selected for optimal performance. Moreover, these strategies are more suitable for end-to-end systems than cascade systems. In our paper, we propose a new adaptable and efficient policy named "Regularized Batched Inputs". Our method stands out by enhancing input diversity to mitigate output errors. We suggest particular regularization techniques for both end-to-end and cascade systems. We conducted experiments on IWSLT Simultaneous Speech Translation (SimulST) tasks, which demonstrate that our approach achieves low latency while maintaining no more than 2 BLEU points loss compared to offline systems. Furthermore, our SimulST systems attained several new state-of-the-art results in various language directions.

Submitted 11 January, 2024; originally announced January 2024.

Comments: Preprint

arXiv:2401.05689 [pdf, other]

doi 10.1109/ICASSP49357.2023.10096194

UCorrect: An Unsupervised Framework for Automatic Speech Recognition Error Correction

Authors: Jiaxin Guo, Minghan Wang, Xiaosong Qiao, Daimeng Wei, Hengchao Shang, Zongyao Li, Zhengzhe Yu, Yinglu Li, Chang Su, Min Zhang, Shimin Tao, Hao Yang

Abstract: Error correction techniques have been used to refine the output sentences from automatic speech recognition (ASR) models and achieve a lower word error rate (WER). Previous works usually adopt end-to-end models and have a strong dependency on Pseudo Paired Data and Original Paired Data. But when only pre-training on Pseudo Paired Data, previous models have a negative effect on correction. While fine-tuning on Original Paired Data, the source side data must be transcribed by a well-trained ASR model, which takes a lot of time and is not universal. In this paper, we propose UCorrect, an unsupervised Detector-Generator-Selector framework for ASR Error Correction. UCorrect has no dependency on the training data mentioned before. The whole procedure is first to detect whether the character is erroneous, then to generate some candidate characters and finally to select the most confident one to replace the error character. Experiments on the public AISHELL-1 dataset and WenetSpeech dataset show the effectiveness of UCorrect for ASR error correction: 1) it achieves significant WER reduction, reaching 6.83% even without fine-tuning and 14.29% after fine-tuning; 2) it outperforms the popular NAR correction models by a large margin with a competitive low latency; and 3) it is a universal method, as it reduces all WERs of the ASR model with different decoding strategies and reduces all WERs of ASR models trained on different scale datasets.

Submitted 11 January, 2024; originally announced January 2024.

Comments: Accepted in ICASSP 2023

arXiv:2401.02882 [pdf, other]

SpatialVisVR: An Immersive, Multiplexed Medical Image Viewer With Contextual Similar-Patient Search

Authors: Jai Prakash Veerla, Partha Sai Guttikonda, Amir Hajighasemi, Jillur Rahman Saurav, Aarti Darji, Cody T. Reynolds, Mohamed Mohamed, Mohammad S. Nasr, Helen H. Shang, Jacob M. Luber

Abstract: In contemporary pathology, multiplexed immunofluorescence (mIF) and multiplex immunohistochemistry (mIHC) present both significant opportunities and challenges. These methodologies shed light on intricate tumor microenvironment interactions, emphasizing the need for intuitive visualization tools to analyze vast biological datasets effectively. As electronic health records (EHR) proliferate and physicians face increasing information overload, the integration of advanced technologies becomes imperative. SpatialVisVR emerges as a versatile VR platform tailored for comparing medical images, with adaptability for data privacy on embedded hardware. Clinicians can capture pathology slides in real-time via mobile devices, leveraging SpatialVisVR's deep learning algorithm to match and display similar mIF images. This interface supports the manipulation of up to 100 multiplexed protein channels, thereby assisting in immuno-oncology decision-making. Ultimately, SpatialVisVR aims to streamline diagnostic processes, advocating for a comprehensive and efficient approach to immuno-oncology research and treatment.

Submitted 11 May, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

arXiv:2312.14574 [pdf, other]

MMGPL: Multimodal Medical Data Analysis with Graph Prompt Learning

Authors: Liang Peng, Songyue Cai, Zongqian Wu, Huifang Shang, Xiaofeng Zhu, Xiaoxiao Li

Abstract: Prompt learning has demonstrated impressive efficacy in the fine-tuning of multimodal large models to a wide range of downstream tasks. Nonetheless, applying existing prompt learning methods for the diagnosis of neurological disorders still suffers from two issues: (i) existing methods typically treat all patches equally, despite the fact that only a small number of patches in neuroimaging are relevant to the disease, and (ii) they ignore the structural information inherent in the brain connection network which is crucial for understanding and diagnosing neurological disorders. To tackle these issues, we introduce a novel prompt learning model by learning graph prompts during the fine-tuning process of multimodal large models for diagnosing neurological disorders. Specifically, we first leverage GPT-4 to obtain relevant disease concepts and compute semantic similarity between these concepts and all patches. Secondly, we reduce the weight of irrelevant patches according to the semantic similarity between each patch and disease-related concepts. Moreover, we construct a graph among tokens based on these concepts and employ a graph convolutional network layer to extract the structural information of the graph, which is used to prompt the pre-trained multimodal large models for diagnosing neurological disorders. Extensive experiments demonstrate that our method achieves superior performance for neurological disorder diagnosis compared with state-of-the-art methods and is validated by clinicians.

Submitted 27 June, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

arXiv:2312.12587 [pdf, other]

Real-Time Diagnostic Integrity Meets Efficiency: A Novel Platform-Agnostic Architecture for Physiological Signal Compression

Authors: Neel R Vora, Amir Hajighasemi, Cody T. Reynolds, Amirmohammad Radmehr, Mohamed Mohamed, Jillur Rahman Saurav, Abdul Aziz, Jai Prakash Veerla, Mohammad S Nasr, Hayden Lotspeich, Partha Sai Guttikonda, Thuong Pham, Aarti Darji, Parisa Boodaghi Malidarreh, Helen H Shang, Jay Harvey, Kan Ding, Phuc Nguyen, Jacob M Luber

Abstract: Head-based signals such as EEG, EMG, EOG, and ECG collected by wearable systems will play a pivotal role in clinical diagnosis, monitoring, and treatment of important brain disorder diseases. However, the real-time transmission of the significant corpus of physiological signals over extended periods consumes substantial power and time, limiting the viability of battery-dependent physiological monitoring wearables. This paper presents a novel deep-learning framework employing a variational autoencoder (VAE) for physiological signal compression to reduce wearables' computational complexity and energy consumption. Our approach achieves an impressive compression ratio of 1:293 specifically for spectrogram data, surpassing state-of-the-art compression techniques such as JPEG2000, H.264, Discrete Cosine Transform (DCT), and Huffman Encoding, which do not excel in handling physiological signals. We validate the efficacy of the compressed algorithms using collected physiological signals from real patients in the Hospital and deploy the solution on commonly used embedded AI chips (i.e., ARM Cortex V8 and Jetson Nano). The proposed framework achieves a 91% seizure detection accuracy using XGBoost, confirming the approach's reliability, practicality, and scalability.

Submitted 4 January, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

arXiv:2312.04998 [pdf, other]

An Efficient Algorithm for Astrochemical Systems Using Stoichiometry Matrices

Authors: Kazutaka Motoyama, Ruben Krasnopolsky, Hsien Shang, Kento Aida, Eisaku Sakane

Abstract: Astrochemical simulations are a powerful tool for revealing chemical evolution in the interstellar medium. Astrochemical calculations require efficient processing of large matrices for the chemical networks. The large chemical reaction networks often present bottlenecks for computation because of time derivatives of chemical abundances. We propose an efficient algorithm using a stoichiometry matrix approach in which this time-consuming part is expressed as a loop, unlike the algorithm used in previous studies. Since stoichiometry matrices are sparse in general, the performances of simulations with our algorithm depend on which sparse-matrix storage format is used. We conducted a performance comparison experiment using the common storage formats, including the coordinate (COO) format, the compressed column storage (CCS) format, the compressed row storage (CRS) format, and the Sliced ELLPACK (SELL) format. Experimental results showed that the simulations with the CRS format are the most suitable for astrochemical simulations and about three times faster than those with the algorithm used in previous studies. In addition, our algorithm significantly reduces not only the computation time but also the compilation time. We also explore the beneficial effects of parallelization and sparse-matrix reordering in these algorithms.

Submitted 8 December, 2023; originally announced December 2023.

Comments: 10 pages, 7 figures, accepted for publication in ApJS

arXiv:2311.18477 [pdf, other]

Intraday foreign exchange rate volatility forecasting: univariate and multilevel functional GARCH models

Authors: Fearghal Kearney, Han Lin Shang, Yuqian Zhao

Abstract: This paper seeks to predict conditional intraday volatility in foreign exchange (FX) markets using functional Generalized AutoRegressive Conditional Heteroscedasticity (GARCH) models. We contribute to the existing functional GARCH-type models by accounting for the stylised features of long-range and cross-dependence through estimating the models with long-range dependent and multi-level functional principal component basis functions. Remarkably, we find that taking account of cross-dependency dynamics between the major currencies significantly improves intraday conditional volatility forecasting. Additionally, incorporating intraday bid-ask spread using a functional GARCH-X model adds explainability of long-range dependence and further enhances predictability. Intraday risk management applications are presented to highlight the practical economic benefits of our proposed approaches.

Submitted 30 November, 2023; originally announced November 2023.

Comments: 43 pages, 5 figures, 8 tables

MSC Class: 62R10

arXiv:2311.18200 [pdf, other]

INarIG: Iterative Non-autoregressive Instruct Generation Model For Word-Level Auto Completion

Authors: Hengchao Shang, Zongyao Li, Daimeng Wei, Jiaxin Guo, Minghan Wang, Xiaoyu Chen, Lizhi Lei, Hao Yang

Abstract: Computer-aided translation (CAT) aims to enhance human translation efficiency and is still important in scenarios where machine translation cannot meet quality requirements. One fundamental task within this field is Word-Level Auto Completion (WLAC). WLAC predicts a target word given a source sentence, translation context, and a human typed character sequence. Previous works either employ word classification models to exploit contextual information from both sides of the target word or directly disregard the dependencies from the right-side context. Furthermore, the key information, i.e. human typed sequences, is only used as prefix constraints in the decoding module. In this paper, we propose the INarIG (Iterative Non-autoregressive Instruct Generation) model, which constructs the human typed sequence into Instruction Unit and employs iterative decoding with subwords to fully utilize input information given in the task. Our model is more competent in dealing with low-frequency words (core scenario of this task), and achieves state-of-the-art results on the WMT22 and benchmark datasets, with a maximum increase of over 10% prediction accuracy.

Submitted 29 November, 2023; originally announced November 2023.

Comments: EMNLP2023

arXiv:2311.00401 [pdf, other]

A Spatial-Temporal Transformer based Framework For Human Pose Assessment And Correction in Education Scenarios

Authors: Wenyang Hu, Kai Liu, Libin Liu, Huiliang Shang

Abstract: Human pose assessment and correction play a crucial role in applications across various fields, including computer vision, robotics, sports analysis, healthcare, and entertainment. In this paper, we propose a Spatial-Temporal Transformer based Framework (STTF) for human pose assessment and correction in education scenarios such as physical exercises and science experiments. The framework comprises skeletal tracking, pose estimation, posture assessment, and posture correction modules to educate students with professional, quick-to-fix feedback. We also create a pose correction method to provide corrective feedback in the form of visual aids. We test the framework with our own dataset. It comprises (a) new recordings of five exercises, (b) existing recordings found on the internet of the same exercises, and (c) corrective feedback on the recordings by professional athletes and teachers. Results show that our model can effectively measure and comment on the quality of students' actions. The STTF leverages the power of transformer models to capture spatial and temporal dependencies in human poses, enabling accurate assessment and effective correction of students' movements.

Submitted 1 November, 2023; originally announced November 2023.

arXiv:2311.00370 [pdf]

Discovery of four pulsars in a pilot survey at intermediate Galactic latitudes with FAST

Authors: Q. J. Zhi, J. T. Bai, S. Dai, X. Xu, S. J. Dang, L. H. Shang, R. S. Zhao, D. Li, W. W. Zhu, N. Wang, J. P. Yuan, P. Wang, L. Zhang, Y. Feng, J. B. Wang, S. Q. Wang, Q. D. Wu, A. J. Dong, H. Yang, J. Tian, W. Q. Zhong, X. H. Luo, Miroslav D. Filipović, G. J. Qiao

Abstract: We present the discovery and timing results of four pulsars discovered in a pilot survey at intermediate Galactic latitudes with the Five-hundred-meter Aperture Spherical Telescope (FAST). Among these pulsars, two belong to the category of millisecond pulsars (MSPs) with spin periods of less than 20 ms. The other two fall under the classification of "mildly recycled" pulsars, with massive white dwarfs as companions. Remarkably, this small survey, covering an area of 4.7 deg$^2$, led to the discovery of four recycled pulsars. Such success underscores the immense potential of future surveys at intermediate Galactic latitudes. In order to assess the potential yield of MSPs, we conducted population simulations and found that both FAST and Parkes new phased array feed surveys, focusing on intermediate Galactic latitudes, have the capacity to uncover several hundred new MSPs.

Submitted 28 December, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

Comments: 7 pages, 4 figures, 2 tables, accepted to ApJ

arXiv:2310.16480 [pdf, other]

Exploring the Formation of Resistive Pseudodisks with the GPU Code Astaroth

Authors: Miikka S. Väisälä, Hsien Shang, Daniele Galli, Susana Lizano, Ruben Krasnopolsky

Abstract: Pseudodisks are dense structures formed perpendicular to the direction of the magnetic field during the gravitational collapse of a molecular cloud core. Numerical simulations of the formation of pseudodisks are usually computationally expensive with conventional CPU codes. To demonstrate the proof-of-concept of a fast computing method for this numerically costly problem, we explore the GPU-powered MHD code Astaroth, a 6th-order finite difference method with low adjustable finite resistivity implemented with sink particles. The formation of pseudodisks is physically and numerically robust and can be achieved with a simple and clean setup for this newly adopted numerical approach for science verification. The method's potential is illustrated by evidencing the dependence on the initial magnetic field strength of specific physical features accompanying the formation of pseudodisks, e.g. the occurrence of infall shocks and the variable behavior of the mass and magnetic flux accreted on the central object. As a performance test, we measure both weak and strong scaling of our implementation to find the most efficient way to use the code on a multi-GPU system. Once suitable physics and problem-specific implementations are realized, the GPU-accelerated code is an efficient option for 3-D magnetized collapse problems.

Submitted 25 October, 2023; originally announced October 2023.

Comments: 29 pages, 1 table, 15 figures, Accepted for publication in the Astrophysical Journal

arXiv:2310.09568 [pdf, other]

Wafer-scale Computing: Advancements, Challenges, and Future Perspectives

Authors: Yang Hu, Xinhan Lin, Huizheng Wang, Zhen He, Xingmao Yu, Jiahao Zhang, Qize Yang, Zheng Xu, Sihan Guan, Jiahao Fang, Haoran Shang, Xinru Tang, Xu Dai, Shaojun Wei, Shouyi Yin

Abstract: Nowadays, artificial intelligence (AI) technology with large models plays an increasingly important role in both academia and industry. It also brings a rapidly increasing demand for the computing power of the hardware. As the computing demand for AI continues to grow, the growth of hardware computing power has failed to keep up. This has become a significant factor restricting the development of AI. The augmentation of hardware computing power is mainly propelled by the escalation of transistor density and chip area. However, the former is impeded by the termination of Moore's Law and Dennard scaling, and the latter is significantly restricted by the challenge of disrupting the legacy fabrication equipment and process. In recent years, advanced packaging technologies that have gradually matured are increasingly used to implement bigger chips that integrate multiple chiplets, while still providing interconnections with chip-level density and bandwidth. Compared to conventional high-performance computing paradigms such as multi-accelerator and datacenter-scale computing, Wafer-scale Computing shows remarkable advantages in communication bandwidth, integration density, and programmability potential. Not surprisingly, disruptive Wafer-scale Computing also brings unprecedented design challenges for hardware architecture, design-system-technology co-optimization, power and cooling systems, and compiler tool chain. At present, there are no comprehensive surveys summarizing the current state and design insights of Wafer-scale Computing.
This paper aims to take the first step to help academia and industry review existing wafer-scale chips and essential technologies in a one-stop manner, so that people can conveniently grasp the basic knowledge and key points, understand the achievements and shortcomings of existing research, and contribute to this promising research direction.

Submitted 14 October, 2023; originally announced October 2023.

ACM Class: B.7.0; C.1

arXiv:2310.08439 [pdf, other]

TensorMD: Scalable Tensor-Diagram based Machine Learning Interatomic Potential on Heterogeneous Many-Core Processors

Authors: Xin Chen, Yucheng Ouyang, Xin Chen, Zhenchuan Chen, Rongfen Lin, Xingyu Gao, Lifang Wang, Fang Li, Yin Liu, Honghui Shang, Haifeng Song

Abstract: Molecular dynamics simulations have emerged as a potent tool for investigating the physical properties and kinetic behaviors of materials at the atomic scale, particularly in extreme conditions. Ab initio accuracy is now achievable with machine learning based interatomic potentials. With recent advancements in high-performance computing, highly accurate and large-scale simulations become feasible. This study introduces TensorMD, a new machine learning interatomic potential (MLIP) model that integrates physical principles and tensor diagrams. The tensor formalism provides a more efficient computation and greater flexibility for use with other scientific codes. Additionally, we proposed several portable optimization strategies and developed a highly optimized version for the new Sunway supercomputer. Our optimized TensorMD can achieve unprecedented performance on the new Sunway, enabling simulations of up to 52 billion atoms with a time-to-solution of 31 ps/step/atom, setting new records for HPC + AI + MD.

Submitted 12 October, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

arXiv:2308.14104 [pdf, other]

Towards Generalizable Neural Solvers for Vehicle Routing Problems via Ensemble with Transferrable Local Policy

Authors: Chengrui Gao, Haopu Shang, Ke Xue, Dong Li, Chao Qian

Abstract: Machine learning has been adapted to help solve NP-hard combinatorial optimization problems. One prevalent way is learning to construct solutions by deep neural networks, which has been receiving more and more attention due to the high efficiency and less requirement for expert knowledge. However, many neural construction methods for Vehicle Routing Problems (VRPs) focus on synthetic problem instances with specified node distributions and limited scales, leading to poor performance on real-world problems which usually involve complex and unknown node distributions together with large scales. To make neural VRP solvers more practical, we design an auxiliary policy that learns from the local transferable topological features, named local policy, and integrate it with a typical construction policy (which learns from the global information of VRP instances) to form an ensemble policy. With joint training, the aggregated policies perform cooperatively and complementarily to boost generalization. The experimental results on two well-known benchmarks, TSPLIB and CVRPLIB, of travelling salesman problem and capacitated VRP show that the ensemble policy significantly improves both cross-distribution and cross-scale generalization performance, and even performs well on real-world problems with several thousand nodes.

Submitted 5 May, 2024; v1 submitted 27 August, 2023; originally announced August 2023.

Comments: Accepted by IJCAI 2024

arXiv:2308.05494 [pdf, other]

ALMA Survey of Orion Planck Galactic Cold Clumps (ALMASOP): The Warm-Envelope Origin of Hot Corinos

Authors: Shih-Ying Hsu, Sheng-Yuan Liu, Doug Johnstone, Tie Liu, Leonardo Bronfman, Huei-Ru Vivien Chen, Somnath Dutta, David J. Eden, Neal J. Evans II, Naomi Hirano, Mika Juvela, Yi-Jehng Kuan, Woojin Kwon, Chin-Fei Lee, Chang Won Lee, Jeong-Eun Lee, Shanghuo Li, Chun-Fan Liu, Xunchuan Liu, Qiuyi Luo, Sheng-Li Qin, Mark G. Rawlings, Dipen Sahu, Patricio Sanhueza, Hsien Shang, et al. (2 additional authors not shown)

Abstract: Hot corinos are of great interest due to their richness in interstellar complex organic molecules (COMs) and the consequent potential prebiotic connection to solar-like planetary systems. Recent surveys have reported an increasing number of hot corino detections in Class 0/I protostars; however, the relationships between their physical properties and the hot-corino signatures remain elusive. In this study, our objective is to establish a general picture of the detectability of the hot corinos by identifying the origin of the hot-corino signatures in the sample of young stellar objects (YSOs) obtained from the Atacama Large Millimeter/submillimeter Array Survey of Orion Planck Galactic Cold Clumps (ALMASOP) project. We apply spectral energy distribution (SED) modeling to our sample and identify the physical parameters of the modeled YSOs directly, linking the detection of hot-corino signatures to the envelope properties of the YSOs. Imaging simulations of the methanol emission further support this scenario. We, therefore, posit that the observed COM emission originates from the warm inner envelopes of the sample YSOs, based on both the warm region size and the envelope density profile. The former is governed by the source luminosity and is additionally affected by the disk and cavity properties, while the latter is related to the evolutionary stages. This scenario provides a framework for detecting hot-corino signatures toward luminous Class 0 YSOs, with fewer detections observed toward similarly luminous Class I sources.

Submitted 11 August, 2023; v1 submitted 10 August, 2023; originally announced August 2023.

Comments: 28 pages, 11 figures

arXiv:2308.01454 [pdf, other]

TOI-4860 b, a short-period giant planet transiting an M3.5 dwarf

Authors: J. M. Almenara, X. Bonfils, E. M. Bryant, A. Jordán, G. Hébrard, E. Martioli, A. C. M. Correia, N. Astudillo-Defru, C. Cadieux, L. Arnold, É. Artigau, G. Á. Bakos, S. C. C. Barros, D. Bayliss, F. Bouchy, G. Boué, R. Brahm, A. Carmona, D. Charbonneau, D. R. Ciardi, R. Cloutier, M. Cointepas, N. J. Cook, N. B. Cowan, X. Delfosse, et al. (25 additional authors not shown)

Abstract: We report the discovery and characterisation of a giant transiting planet orbiting a nearby M3.5V dwarf (d = 80.4 pc, $G$ = 15.1 mag, $K$ = 11.2 mag, R$_\star$ = 0.358 $\pm$ 0.015 R$_\odot$, M$_\star$ = 0.340 $\pm$ 0.009 M$_\odot$). Using the photometric time series from TESS sectors 10, 36, 46, and 63 and near-infrared spectrophotometry from ExTrA, we measured a planetary radius of 0.77 $\pm$ 0.03 R$_J$ and an orbital period of 1.52 days. With high-resolution spectroscopy taken by the CFHT/SPIRou and ESO/ESPRESSO spectrographs, we refined the host star parameters ([Fe/H] = 0.27 $\pm$ 0.12) and measured the mass of the planet (0.273 $\pm$ 0.006 M$_J$). Based on these measurements, TOI-4860 b joins the small set of massive planets ($>$80 M$_E$) found around mid to late M dwarfs ($<$0.4 R$_\odot$), providing both an interesting challenge to planet formation theory and a favourable target for further atmospheric studies with transmission spectroscopy. We identified an additional signal in the radial velocity data that we attribute to an eccentric planet candidate ($e=0.66\pm0.09$) with an orbital period of $427\pm7$ days and a minimum mass of $1.66\pm 0.26$ M$_J$, but additional data would be needed to confirm this.

Submitted 12 January, 2024; v1 submitted 2 August, 2023; originally announced August 2023.

Comments: 16 pages, 14 figures, accepted for publication in A&A

arXiv:2307.12746 [pdf, other]

doi 10.1051/0004-6361/202346737

A high-resolution radio study of the L1551 IRS 5 and L1551 NE jets

Authors: A. Feeney-Johansson, S. J. D. Purser, T. P. Ray, C. Carrasco-González, A. Rodríguez-Kamenetzky, J. Eislöffel, J. Lim, R. Galván-Madrid, S. Lizano, L. F. Rodríguez, H. Shang, P. Ho, M. Hoare

Abstract: Using observations with e-MERLIN and the VLA, together with archival data from ALMA, we obtain high-resolution radio images of two binary YSOs: L1551 IRS 5 and L1551 NE, covering a wide range of frequencies from 5 - 336 GHz, and resolving emission from the radio jet on scales of only ~15 au. By comparing these observations to those from a previous epoch, it is shown that there is a high degree of variability in the free-free emission from the jets of these sources. In particular, the northern component of L1551 IRS 5 shows a remarkable decline in flux density of a factor of ~5, suggesting that the free-free emission of this source has almost disappeared. By fitting the spectra of the sources, the ionised mass-loss rates of the jets are derived and it is shown that there is significant variability of up to a factor of ~6 on timescales of ~20 years. Using radiative transfer modelling, we also obtained a model image for the jet of the southern component of L1551 IRS 5 to help study the inner region of the ionised high-density jet. The findings favour the X-wind model launched from a very small innermost region.

Submitted 24 July, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

Comments: 13 pages, 7 figures, accepted for publication in A&A

Journal ref: A&A 677, A97 (2023)

arXiv:2307.09343 [pdf, other]

Solving Schrödinger Equation with a Language Model

Authors: Honghui Shang, Chu Guo, Yangjun Wu, Zhenyu Li, Jinlong Yang

Abstract: Accurately solving the Schrödinger equation for intricate systems remains a prominent challenge in physical sciences. A paradigm-shifting approach to address this challenge involves the application of artificial intelligence techniques. In this study, we introduce a machine-learning model named QiankunNet, based on the transformer architecture employed in language models. By incorporating the attention mechanism, QiankunNet adeptly captures intricate quantum correlations, which enhances its expressive power. The autoregressive attribute of QiankunNet allows for the adoption of an exceedingly efficient sampling technique to estimate the total energy, facilitating the model training process. Additionally, performance of QiankunNet can be further improved via a pre-training process. This work not only demonstrates the power of artificial intelligence in quantum mechanics but also signifies a pivotal advancement in extending the boundary of systems which can be studied with a full-configuration-interaction accuracy.

Submitted 4 April, 2024; v1 submitted 18 July, 2023; originally announced July 2023.

arXiv:2307.01047 [pdf, other]

Cross-modal Place Recognition in Image Databases using Event-based Sensors

Authors: Xiang Ji, Jiaxin Wei, Yifu Wang, Huiliang Shang, Laurent Kneip

Abstract: Visual place recognition is an important problem towards global localization in many robotics tasks. One of the biggest challenges is that it may suffer from illumination or appearance changes in surrounding environments. Event cameras are interesting alternatives to frame-based sensors as their high dynamic range enables robust perception in difficult illumination conditions. However, current event-based place recognition methods only rely on event information, which restricts downstream applications of VPR. In this paper, we present the first cross-modal visual place recognition framework that is capable of retrieving regular images from a database given an event query. Our method demonstrates promising results with respect to the state-of-the-art frame-based and event-based methods on the Brisbane-Event-VPR dataset under different scenarios. We also verify the effectiveness of the combination of retrieval and classification, which can boost performance by a large margin.

Submitted 3 July, 2023; originally announced July 2023.

arXiv:2306.17019 [pdf, other]

Histopathology Slide Indexing and Search: Are We There Yet?

Authors: Helen H. Shang, Mohammad Sadegh Nasr, Jai Prakash Veerla, Parisa Boodaghi Malidarreh, MD Jillur Rahman Saurav, Amir Hajighasemi, Manfred Huber, Chace Moleta, Jitin Makker, Jacob M. Luber

Abstract: The search and retrieval of digital histopathology slides is an important task that has yet to be solved. In this case study, we investigate the clinical readiness of three state-of-the-art histopathology slide search engines, Yottixel, SISH, and RetCCL, on three patients with solid tumors. We provide a qualitative assessment of each model's performance in providing retrieval results that are reliable and useful to pathologists. We found that all three image search engines fail to produce consistently reliable results and have difficulties in capturing granular and subtle features of malignancy, limiting their diagnostic accuracy. Based on our findings, we also propose a minimal set of requirements to further advance the development of accurate and reliable histopathology image search engines for successful clinical adoption.

Submitted 4 January, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

arXiv:2306.16989 [pdf]

doi 10.12688/f1000research.139210.1

The State of Applying Artificial Intelligence to Tissue Imaging for Cancer Research and Early Detection

Authors: Michael Robben, Amir Hajighasemi, Mohammad Sadegh Nasr, Jai Prakesh Veerla, Anne M. Alsup, Biraaj Rout, Helen H. Shang, Kelli Fowlds, Parisa Boodaghi Malidarreh, Paul Koomey, MD Jillur Rahman Saurav, Jacob M. Luber

Abstract: Artificial intelligence represents a new frontier in human medicine that could save more lives and reduce the costs, thereby increasing accessibility. As a consequence, the rate of advancement of AI in cancer medical imaging and more particularly tissue pathology has exploded, opening it to ethical and technical questions that could impede its adoption into existing systems. In order to chart the path of AI in its application to cancer tissue imaging, we review current work and identify how it can improve cancer pathology diagnostics and research. In this review, we identify 5 core tasks that models are developed for, including regression, classification, segmentation, generation, and compression tasks. We address the benefits and challenges that such methods face, and how they can be adapted for use in cancer prevention and treatment. The studies looked at in this paper represent the beginning of this field and future experiments will build on the foundations that we highlight.

Submitted 29 June, 2023; originally announced June 2023.

Journal ref: F1000Research 2023, 12:1436

arXiv:2306.16705 [pdf, other]

NNQS-Transformer: an Efficient and Scalable Neural Network Quantum States Approach for Ab initio Quantum Chemistry

Authors: Yangjun Wu, Chu Guo, Yi Fan, Pengyu Zhou, Honghui Shang

Abstract: Neural network quantum state (NNQS) has emerged as a promising candidate for quantum many-body problems, but its practical applications are often hindered by the high cost of sampling and local energy calculation. We develop a high-performance NNQS method for ab initio electronic structure calculations. The major innovations include: (1) A transformer based architecture as the quantum wave function ansatz; (2) A data-centric parallelization scheme for the variational Monte Carlo (VMC) algorithm which preserves data locality and well adapts for different computing architectures; (3) A parallel batch sampling strategy which reduces the sampling cost and achieves good load balance; (4) A parallel local energy evaluation scheme which is both memory and computationally efficient; (5) Study of real chemical systems demonstrates both the superior accuracy of our method compared to state-of-the-art and the strong and weak scalability for large molecular systems with up to 120 spin orbitals.

Submitted 1 November, 2023; v1 submitted 29 June, 2023; originally announced June 2023.

Comments: Accepted by SC'23, fix Table1 CCSD references

arXiv:2306.07839 [pdf, other]

doi 10.3847/2041-8213/acdddf

ALMA Survey of Orion Planck Galactic Cold Clumps (ALMASOP): A forming quadruple system with continuum `ribbons' and intricate outflows

Authors: Qiu-yi Luo, Tie Liu, Aaron T. Lee, Stella S. R. Offner, James di Francesco, Doug Johnstone, Mika Juvela, Paul F. Goldsmith, Sheng-Li Qin, Xiaofeng Mai, Xun-chuan Liu, Patricio Sanhueza, Feng-Wei Xu, Ken'ichi Tatematsu, Somnath Dutta, Huei-Ru Vivien Chen, Shanghuo Li, Aiyuan Yang, Sheng-Yuan Liu, Chin-Fei Lee, Naomi Hirano, Chang Won Lee, Dipen Sahu, Hsien Shang, Shih-Ying Hsu, et al. (9 additional authors not shown)

Abstract: One of the most poorly understood aspects of low-mass star formation is how multiple-star systems are formed. Here we present the results of Atacama Large Millimeter/submillimeter Array (ALMA) Band-6 observations towards a forming quadruple protostellar system, G206.93-16.61E2, in the Orion B molecular cloud. ALMA 1.3 mm continuum emission reveals four compact objects, of which two are Class I young stellar objects (YSOs), and the other two are likely in prestellar phase. The 1.3 mm continuum emission also shows three asymmetric ribbon-like structures that are connected to the four objects, with lengths ranging from ~500 au to ~2200 au. By comparing our data with magneto-hydrodynamic (MHD) simulations, we suggest that these ribbons trace accretion flows and also function as gas bridges connecting the member protostars. Additionally, ALMA CO J=2-1 line emission reveals a complicated molecular outflow associated with G206.93-16.61E2 with arc-like structures suggestive of an outflow cavity viewed pole-on.

Submitted 13 June, 2023; originally announced June 2023.

Comments: The paper was accepted by APJL

arXiv:2306.06780 [pdf, other]

Multimodal Pathology Image Search Between H&E Slides and Multiplexed Immunofluorescent Images

Authors: Amir Hajighasemi, MD Jillur Rahman Saurav, Mohammad S Nasr, Jai Prakash Veerla, Aarti Darji, Parisa Boodaghi Malidarreh, Michael Robben, Helen H Shang, Jacob M Luber

Abstract: We present an approach for multimodal pathology image search, using dynamic time warping (DTW) on Variational Autoencoder (VAE) latent space that is fed into a ranked choice voting scheme to retrieve multiplexed immunofluorescent imaging (mIF) that is most similar to a query H&E slide. Through training the VAE and applying DTW, we align and compare mIF and H&E slides. Our method improves differential diagnosis and therapeutic decisions by integrating morphological H&E data with immunophenotyping from mIF, providing clinicians a rich perspective of disease states. This facilitates an understanding of the spatial relationships in tissue samples and could revolutionize the diagnostic process, enhancing precision and enabling personalized therapy selection. Our technique demonstrates feasibility using colorectal cancer and healthy tonsil samples. An exhaustive ablation study was conducted on a search engine designed to explore the correlation between multiplexed Immunofluorescence (mIF) and Hematoxylin and Eosin (H&E) staining, in order to validate its ability to map these distinct modalities into a unified vector space. Despite extreme class imbalance, the system demonstrated robustness and utility by returning similar results across various data features, which suggests potential for future use in multimodal histopathology data analysis.

Submitted 11 June, 2023; originally announced June 2023.

arXiv:2306.01318 [pdf, other]

Text Style Transfer Back-Translation

Authors: Daimeng Wei, Zhanglin Wu, Hengchao Shang, Zongyao Li, Minghan Wang, Jiaxin Guo, Xiaoyu Chen, Zhengzhe Yu, Hao Yang

Abstract: Back Translation (BT) is widely used in the field of machine translation, as it has been proved effective for enhancing translation quality. However, BT mainly improves the translation of inputs that share a similar style (to be more specific, translation-like inputs), since the source side of BT data is machine-translated. For natural inputs, BT brings only slight improvements and sometimes even adverse effects. To address this issue, we propose Text Style Transfer Back Translation (TST BT), which uses a style transfer model to modify the source side of BT data. By making the style of source-side text more natural, we aim to improve the translation of natural inputs. Our experiments on various language pairs, including both high-resource and low-resource ones, demonstrate that TST BT significantly improves translation performance against popular BT benchmarks. In addition, TST BT is proved to be effective in domain adaptation so this strategy can be regarded as a general data augmentation method. Our training code and text style transfer model are open-sourced.

Submitted 2 June, 2023; originally announced June 2023.

Comments: acl2023, 14 pages, 4 figures, 19 tables

arXiv:2305.19749 [pdf, other]

Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality

Authors: Cristian F. Jiménez-Varón, Ying Sun, Han Lin Shang

Abstract: We study the modeling and forecasting of high-dimensional functional time series (HDFTS), which can be cross-sectionally correlated and temporally dependent. We introduce a decomposition of the HDFTS into two distinct components: a deterministic component and a residual component that varies over time. The decomposition is derived through the estimation of two-way functional analysis of variance. A functional time series forecasting method, based on functional principal component analysis, is implemented to produce forecasts for the residual component. By combining the forecasts of the residual component with the deterministic component, we obtain forecast curves for multiple populations. We apply the model to age- and sex-specific mortality rates in the United States, France, and Japan, in which there are 51 states, 95 departments, and 47 prefectures, respectively. The proposed method is capable of delivering more accurate point and interval forecasts in forecasting multi-population mortality than several benchmark methods considered.

Submitted 13 February, 2024; v1 submitted 31 May, 2023; originally announced May 2023.

Comments: 31 pages, 6 figures

MSC Class: 62R10; 91D20

arXiv:2305.16531 [pdf, other]

Forecasting intraday financial time series with sieve bootstrapping and dynamic updating

Authors: Han Lin Shang, Kaiying Ji

Abstract: Intraday financial data often take the form of a collection of curves that can be observed sequentially over time, such as intraday stock price curves. These curves can be viewed as a time series of functions observed on equally spaced and dense grids. Due to the curse of dimensionality, high-dimensional data poses challenges from a statistical aspect; however, it also provides opportunities to analyze a rich source of information so that the dynamic changes within short-time intervals can be better understood. We consider a sieve bootstrap method of Paparoditis and Shang (2022) to construct one-day-ahead point and interval forecasts in a model-free way. As we sequentially observe new data, we also implement two dynamic updating methods to update point and interval forecasts for achieving improved accuracy. The forecasting methods are validated through an empirical study of 5-minute cumulative intraday returns of the S&P/ASX All Ordinaries Index.

Submitted 25 May, 2023; originally announced May 2023.

Comments: 25 pages, 10 figures, 2 tables

MSC Class: 62M10; 62M20

arXiv:2305.03893 [pdf]

Generalizability of PRS313 for breast cancer risk amongst non-Europeans in a Los Angeles biobank

Authors: Helen Shang, Yi Ding, Vidhya Venkateswaran, Kristin Boulier, Nikhita Kathuria-Prakash, Parisa Boodaghi Malidarreh, Jacob M. Luber, Bogdan Pasaniuc

Abstract: Polygenic risk scores (PRS) summarize the combined effect of common risk variants and are associated with breast cancer risk in patients without identifiable monogenic risk factors. One of the most well-validated PRSs in breast cancer to date is PRS313, which was developed from a Northern European biobank but has shown attenuated performance in non-European ancestries. We further investigate the generalizability of the PRS313 for American women of European (EA), African (AFR), Asian (EAA), and Latinx (HL) ancestry within one institution with a singular EHR system, genotyping platform, and quality control process. We found that the PRS313 achieved overlapping Areas under the ROC Curve (AUCs) in females of Latinx (AUC, 0.68; 95% CI, 0.65-0.71) and European ancestry (AUC, 0.70; 95% CI, 0.69-0.71) but lower AUCs for the AFR and EAA populations (AFR: AUC, 0.61; 95% CI, 0.56-0.65; EAA: AUC, 0.64; 95% CI, 0.60-0.680). While PRS313 is associated with Hormone Positive (HR+) disease in European Americans (OR, 1.42; 95% CI, 1.16-1.64), for Latinx females, it may be instead associated with Human Epidermal Growth Factor Receptor 2 (HER2+) disease (OR, 2.52; 95% CI, 1.35-4.70) although due to small numbers, additional studies are needed. In summary, we found that PRS313 was significantly associated with breast cancer but with attenuated accuracy in women of African and Asian descent within a singular health system in Los Angeles. Our work further highlights the need for additional validation in diverse cohorts prior to clinical implementation of polygenic risk scores.

Submitted 5 May, 2023; originally announced May 2023.

Comments: 27 pages, 2 figures

arXiv:2304.09423 [pdf, other]

ASM: Adaptive Skinning Model for High-Quality 3D Face Modeling

Authors: Kai Yang, Hong Shang, Tianyang Shi, Xinghan Chen, Jingkai Zhou, Zhongqian Sun, Wei Yang

Abstract: The research fields of parametric face models and 3D face reconstruction have been extensively studied. However, a critical question remains unanswered: how to tailor the face model for specific reconstruction settings. We argue that reconstruction with multi-view uncalibrated images demands a new model with stronger capacity. Our study shifts attention from data-dependent 3D Morphable Models (3DMM) to an understudied human-designed skinning model. We propose the Adaptive Skinning Model (ASM), which redefines the skinning model with more compact and fully tunable parameters. With extensive experiments, we demonstrate that ASM achieves significantly improved capacity over 3DMM, with the additional advantages of model size and easy implementation for new topology. We achieve state-of-the-art performance with ASM for multi-view reconstruction on the Florence MICC Coop benchmark. Our quantitative analysis demonstrates the importance of a high-capacity model for fully exploiting abundant information from multi-view input in reconstruction. Furthermore, our model with physical-semantic parameters can be directly utilized for real-world applications, such as in-game avatar creation. As a result, our work opens up a new research direction for parametric face models and facilitates future research on multi-view reconstruction.

Submitted 8 October, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

Comments: 18 pages

Showing 1–50 of 240 results for author: Shang, H