Associated Mersenne graphs

Abstract: In this paper, a new sub-family of hypercubes called the \textit{associated Mersenne graphs} $\mathcal{M}_{n}$ is introduced. The definition of associated Mersenne graphs is motivated by the Fibonacci-run graphs (Ö. Eğecioğlu, V. Iršič, 2021), extending run-constrained strings to circularly-run-constrained strings. The name of this new family of graphs comes from the interesting fact that $|V(\mathcal{M}_{n})|$ is equal to the $n$-th associated Mersenne number. Various interesting structural and enumerative properties of associated Mersenne graphs are investigated, including the analogue of the fundamental recursion, number of vertices and edges, radius, diameter, center, periphery and medianicity. Some future research directions and open problems concerning associated Mersenne graphs are also proposed.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08183 [pdf, other]

The white-light superflares from cool stars in GWAC triggers

Authors: Guang-Wei Li, Liang Wang, Hai-Long Yuan, Li-Ping Xin, Jing Wang, Chao Wu, Hua-Li Li, Hasitieer Haerken, Wei-Hua Wang, Hong-Bo Cai, Xu-Hui Han, Yang Xu, Lei Huang, Xiao-Meng Lu, Jian-Ying Bai, Xiang-Yu Wang, Zi-Gao Dai, En-Wei Liang, Jian-Yan Wei

Abstract: M-type stars are the ones that flare most frequently, but how large their maximum flare energy can be is still unknown. We present 163 flares from 162 individual M2 through L1-type stars that triggered the GWAC, with flare energies ranging from $10^{32.2}$ to $10^{36.4}$ erg. The flare amplitudes range from $\triangle G = 0.84$ to $\sim 10$ mag. Flare energy increases with stellar surface temperature ($T_{\rm eff}$) but both $\triangle G$ and equivalent duration $\log_{10}(ED)$ seem to be independent of $T_{\rm eff}$. Combining periods detected from light curves of TESS and K2, spectra from LAMOST, SDSS and the 2.16 m Telescope, and the Gaia DR3 data, we found that these GWAC flare stars are young. For the stars that have spectra, we found that these stars are in or very near to the saturation region, and $\log_{10}(L_{\rm H\alpha}/L_{\rm bol})$ is lower for M7-L1 stars than for M2-M6 stars. We also studied the relation between GWAC flare bolometric energy $E_{\rm bol}$ and stellar hemispherical area $S$, and found that $\log_{10}E_{\rm bol}$ (in erg) increases with increasing $S$ (in cm$^2$), and the maximum flare energy $\log_{10}E_{\rm bol, max} \geqslant \log_{10}S + 14.25$. For M7-L1 stars, there seem to be other factors limiting their maximum flare energies in addition to stellar hemispherical area.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 18 pages, 11 figures, 4 tables

arXiv:2407.08133 [pdf, other]

Nonverbal Interaction Detection

Authors: Jianan Wei, Tianfei Zhou, Yi Yang, Wenguan Wang

Abstract: This work addresses a new challenge of understanding human nonverbal interaction in social contexts. Nonverbal signals pervade virtually every communicative act. Our gestures, facial expressions, postures, gaze, even physical appearance all convey messages, without anything being said. Despite their critical role in social life, nonverbal signals receive very limited attention as compared to the linguistic counterparts, and existing solutions typically examine nonverbal cues in isolation. Our study marks the first systematic effort to enhance the interpretation of multifaceted nonverbal signals. First, we contribute a novel large-scale dataset, called NVI, which is meticulously annotated to include bounding boxes for humans and corresponding social groups, along with 22 atomic-level nonverbal behaviors under five broad interaction types. Second, we establish a new task NVI-DET for nonverbal interaction detection, which is formalized as identifying triplets in the form <individual, group, interaction> from images. Third, we propose a nonverbal interaction detection hypergraph (NVI-DEHR), a new approach that explicitly models high-order nonverbal interactions using hypergraphs. Central to the model is a dual multi-scale hypergraph that adeptly addresses individual-to-individual and group-to-group correlations across varying scales, facilitating interactional feature learning and eventually improving interaction prediction. Extensive experiments on NVI show that NVI-DEHR improves various baselines significantly in NVI-DET. It also exhibits leading performance on HOI-DET, confirming its versatility in supporting related tasks and strong generalization ability. We hope that our study will offer the community new avenues to explore nonverbal signals in more depth.

Submitted 14 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

Comments: ECCV 2024; Project page: https://github.com/weijianan1/NVI

arXiv:2407.06934 [pdf, ps, other]

Quantitative stability of the total $Q$-curvature near minimizing metrics

Authors: João Henrique Andrade, Tobias König, Jesse Ratzkin, Juncheng Wei

Abstract: Under appropriate positivity hypotheses, we prove quantitative estimates for the total $k$-th order $Q$-curvature functional near minimizing metrics on any smooth, closed $n$-dimensional Riemannian manifold for every integer $1 \leq k < \frac{n}{2}$. More precisely, we show that on a generic closed Riemannian manifold the distance to the minimizing set of metrics is controlled quadratically by the $Q$-curvature energy deficit, extending recent work by Engelstein, Neumayer and Spolaor in the case $k=1$. Next we prove, for any integer $1 \leq k< \frac{n}{2}$, the existence of an $n$-dimensional Riemannian manifold such that the $k$-th order $Q$-curvature deficit controls a higher power of the distance to the minimizing set. We believe that these degenerate examples are of independent interest and can be used for further development in the field.

Submitted 9 July, 2024; originally announced July 2024.

Comments: 43 pages. Comments welcome!

arXiv:2407.06591 [pdf, other]

Rate-Loss Regions for Polynomial Regression with Side Information

Authors: Jiahui Wei, Philippe Mary, Elsa Dupraz

Abstract: In the context of goal-oriented communications, this paper addresses the achievable rate versus generalization error region of a learning task applied on compressed data. The study focuses on the distributed setup where a source is compressed and transmitted through a noiseless channel to a receiver performing polynomial regression, aided by side information available at the decoder. The paper provides the asymptotic rate generalization error region, and extends the analysis to the non-asymptotic regime. Additionally, it investigates the asymptotic trade-off between polynomial regression and data reconstruction under communication constraints. The proposed achievable scheme is shown to achieve the minimum generalization error as well as the optimal rate-distortion region.

Submitted 9 July, 2024; originally announced July 2024.

Journal ref: International Zurich Seminar on Information and Communication (IZS), Mar 2024, Zurich, Switzerland

arXiv:2407.05034 [pdf]

GCON: Differentially Private Graph Convolutional Network via Objective Perturbation

Authors: Jianxin Wei, Yizheng Zhu, Xiaokui Xiao, Ergute Bao, Yin Yang, Kuntai Cai, Beng Chin Ooi

Abstract: Graph Convolutional Networks (GCNs) are a popular machine learning model with a wide range of applications in graph analytics, including healthcare, transportation, and finance. Similar to other neural networks, a GCN may memorize parts of the training data through its model weights. Thus, when the underlying graph data contains sensitive information such as interpersonal relationships, a GCN trained without privacy-protection measures could be exploited to extract private data, leading to potential violations of privacy regulations such as GDPR. To defend against such attacks, a promising approach is to train the GCN with differential privacy (DP), which is a rigorous framework that provides strong privacy protection by injecting random noise into the trained model weights. However, training a large graph neural network under DP is a highly challenging task. Existing solutions either introduce random perturbations in the graph topology, which leads to severe distortions of the network's message passing, or inject randomness into each neighborhood aggregation operation, which leads to a high noise scale when the GCN performs multiple levels of aggregations. Motivated by this, we propose GCON, a novel and effective solution for training GCNs with edge differential privacy. The main idea is to (i) convert the GCN training process into a convex optimization problem, and then (ii) apply the classic idea of perturbing the objective function to satisfy DP. Extensive experiments using multiple benchmark datasets demonstrate GCON's consistent and superior performance over existing solutions in a wide variety of settings.

Submitted 6 July, 2024; originally announced July 2024.

arXiv:2407.04297 [pdf, other]

HuntFUZZ: Enhancing Error Handling Testing through Clustering Based Fuzzing

Authors: Jin Wei, Ping Chen, Jun Dai, Xiaoyan Sun, Zhihao Zhang, Chang Xu, Yi Wanga

Abstract: Testing a program's capability to effectively handle errors is a significant challenge, given that program errors are relatively uncommon. To solve this, Software Fault Injection (SFI)-based fuzzing integrates SFI and traditional fuzzing, injecting and triggering errors for testing (error handling) code. However, we observe that current SFI-based fuzzing approaches have overlooked the correlation between paths housing error points. In fact, the execution paths of error points often share common paths. Nonetheless, fuzzers usually generate test cases repeatedly to test error points on commonly traversed paths. This practice can compromise the efficiency of the fuzzer(s). Thus, this paper introduces HuntFUZZ, a novel SFI-based fuzzing framework that addresses the issue of redundant testing of error points with correlated paths. Specifically, HuntFUZZ clusters these correlated error points and utilizes concolic execution to compute constraints only for common paths within each cluster. By doing so, we provide the fuzzer with efficient test cases to explore related error points with minimal redundancy. We evaluate HuntFUZZ on a diverse set of 42 applications, and HuntFUZZ successfully reveals 162 known bugs, with 62 of them being related to error handling. Additionally, due to its efficient error point detection method, HuntFUZZ discovers 7 unique zero-day bugs, which are all missed by existing fuzzers. Furthermore, we compare HuntFUZZ with 4 existing fuzzing approaches, including AFL, AFL++, AFLGo, and EH-FUZZ. Our evaluation confirms that HuntFUZZ can cover a broader range of error points, and it exhibits better performance in terms of bug finding speed.

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2407.04294 [pdf, other]

SQLaser: Detecting DBMS Logic Bugs with Clause-Guided Fuzzing

Authors: Jin Wei, Ping Chen, Kangjie Lu, Jun Dai, Xiaoyan Sun

Abstract: Database Management Systems (DBMSs) are vital components in modern data-driven systems. Their complexity often leads to logic bugs, which are implementation errors within the DBMSs that can lead to incorrect query results, data exposure, unauthorized access, etc., without necessarily causing visible system failures. Existing detection employs two strategies: rule-based bug detection and coverage-guided fuzzing. In general, rule specification itself is challenging; as a result, rule-based detection is limited to specific and simple rules. Coverage-guided fuzzing blindly explores code paths or blocks, many of which are unlikely to contain logic bugs; therefore, this strategy is cost-ineffective. In this paper, we design SQLaser, a SQL-clause-guided fuzzer for detecting logic bugs in DBMSs. Through a comprehensive examination of most existing logic bugs across four distinct DBMSs, excluding those causing system crashes, we have identified 35 logic bug patterns. These patterns manifest as certain SQL clause combinations that commonly result in logic bugs, and behind these clause combinations is a sequence of functions. We therefore model logic bug patterns as error-prone function chains (i.e., sequences of functions). We further develop a directed fuzzer with a new path-to-path distance-calculation mechanism for effectively testing these chains and discovering additional logic bugs. This mechanism enables SQLaser to swiftly navigate to target sites and uncover potential bugs emerging from these paths. Our evaluation, conducted on SQLite, MySQL, PostgreSQL, and TiDB, demonstrates that SQLaser significantly accelerates bug discovery compared to other fuzzing approaches, reducing detection time by approximately 60%.

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2407.02547 [pdf, other]

Domain Generalizable Knowledge Tracing via Concept Aggregation and Relation-Based Attention

Authors: Yuquan Xie, Wanqi Yang, Jinyu Wei, Ming Yang, Yang Gao

Abstract: Knowledge Tracing (KT) is a critical task in online education systems, aiming to monitor students' knowledge states throughout a learning period. Common KT approaches involve predicting the probability of a student correctly answering the next question based on their exercise history. However, these methods often suffer from performance degradation when faced with the scarcity of student interactions in new education systems. To address this, we leverage student interactions from existing education systems to mitigate performance degradation caused by limited training data. Nevertheless, these interactions exhibit significant differences since they are derived from different education systems. To address this issue, we propose a domain generalization approach for knowledge tracing, where existing education systems are considered source domains, and new education systems with limited data are considered target domains. Additionally, we design a domain-generalizable knowledge tracing framework (DGKT) that can be applied to any KT model. Specifically, we present a concept aggregation approach designed to reduce conceptual disparities within sequences of student interactions from diverse domains. To further mitigate domain discrepancies, we introduce a novel normalization module called Sequence Instance Normalization (SeqIN). Moreover, to fully leverage exercise information, we propose a new knowledge tracing model tailored for the domain generalization KT task, named Domain-Generalizable Relation-based Knowledge Tracing (DGRKT). Extensive experiments across five benchmark datasets demonstrate that the proposed method performs well despite limited training data.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.01937 [pdf, other]

Efficient-Empathy: Towards Efficient and Effective Selection of Empathy Data

Authors: Linzhuang Sun, Hao Liang, Jingxuan Wei, Linkun Sun, Bihui Yu, Bin Cui, Wentao Zhang

Abstract: In recent years, with the rapid advancements in large language models (LLMs), achieving excellent empathetic response capability has become a crucial prerequisite. Consequently, managing and understanding large-scale video datasets has gained increasing importance. However, empathetic data are typically trained without any quality selection, leading to inefficient data usage and wasted computational resources. Additionally, using raw data can result in low performance in empathetic dialogues. In this work, we present Efficient-Empathy, a sensibility and rationality score-based data selection algorithm that automatically selects sensibility and rationality data while discarding low-quality data. With only the sensibility data (59% of the full dataset), our trained sensibility model efficiently achieves state-of-the-art (SoTA) performance. Furthermore, with multiple data selection hyperparameters, the sensibility model demonstrates SoTA performance, showcasing the robustness of our method. By integrating sensibility and rationality data with a MoE structure, we achieve even higher performance, demonstrating the effectiveness of our Efficient-Empathy algorithm.

Submitted 9 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.00639 [pdf, other]

GRB 221009A/SN 2022xiw: A Supernova Obscured by a Gamma-Ray Burst Afterglow?

Authors: De-Feng Kong, Xiang-Gao Wang, WeiKang Zheng, Hou-Jun Lü, L. P. Xin, Da-Bin Lin, Jia-Xin Cao, Ming-Xuan Lu, B. Ren, Edgar P. Vidal, J. Y. Wei, En-Wei Liang, Alexei V. Filippenko

Abstract: We present optical photometry for the afterglow of GRB 221009A, in some respects the most extraordinary gamma-ray burst (GRB) ever observed. A good-quality R-band light curve is obtained, covering 0.32-19.57 days since the Fermi-GBM trigger. We find that a weak bump emerges from the declining afterglow at $t \approx 11$ days; a supernova (SN) may be responsible. We use a smooth broken power-law and $^{56}\mathrm{Ni}$ model to fit the light curve. The best-fitting results reveal that the SN ejected a total mass of $M_\mathrm{ej} = 3.70 M_\odot$, a $^{56}\mathrm{Ni}$ mass of $M_\mathrm{Ni} = 0.23 M_\odot$, and a kinetic energy of $E_\mathrm{SN,K} = 2.35 \times 10^{52} \mathrm{erg}$. We also compare GRB 221009A with other GRB-SN events based on a GRB-associated SN sample, and find that only SN 2003lw and SN 2011kl can be obviously revealed in the afterglow of GRB 221009A by setting these objects at its distance. This suggests that a supernova (SN 2022xiw) is possibly obscured by the brighter afterglow emission from GRB 221009A.

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2407.00088 [pdf, other]

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge

Authors: Jianyu Wei, Shijie Cao, Ting Cao, Lingxiao Ma, Lei Wang, Yanyong Zhang, Mao Yang

Abstract: The deployment of Large Language Models (LLMs) on edge devices is increasingly important to enhance on-device intelligence. Weight quantization is crucial for reducing the memory footprint of LLMs on devices. However, low-bit LLMs necessitate mixed precision matrix multiplication (mpGEMM) of low precision weights and high precision activations during inference. Existing systems, lacking native support for mpGEMM, resort to dequantizing weights for high precision computation. Such an indirect way can lead to a significant inference overhead. In this paper, we introduce T-MAC, an innovative lookup table (LUT)-based method designed for efficient low-bit LLM (i.e., weight-quantized LLM) inference on CPUs. T-MAC directly supports mpGEMM without dequantization, while simultaneously eliminating multiplications and reducing additions required. Specifically, T-MAC transforms the traditional data-type-centric multiplication to bit-wise table lookup, and enables a unified and scalable mpGEMM solution. Our LUT-based kernels scale linearly to the weight bit-width. Evaluated on low-bit Llama and BitNet models, T-MAC demonstrates up to 4x increase in throughput and 70% reduction in energy consumption compared to llama.cpp. For BitNet-b1.58-3B, T-MAC delivers a token generation throughput of 30 tokens/s with a single core and 71 tokens/s with eight cores on M2-Ultra, and 11 tokens/s on lower-end devices like Raspberry Pi 5, which significantly exceeds the adult average reading speed. T-MAC, with its LUT-based computing paradigm, paves the way for the practical deployment of low-bit LLMs on resource-constrained edge devices without compromising computational efficiency. The system is open-sourced at https://github.com/microsoft/T-MAC.

Submitted 25 June, 2024; originally announced July 2024.

arXiv:2407.00005 [pdf, other]

Dual-pronged deep learning preprocessing on heterogeneous platforms with CPU, GPU and CSD

Authors: Jia Wei, Xingjun Zhang, Witold Pedrycz, Longxiang Wang, Jie Zhao

Abstract: Most existing data preprocessing is done at the CPU. Although some studies use techniques such as multi-processing and double buffering to accelerate CPU preprocessing, CPU computational speed and storage bandwidth still limit the processing speed. Other studies try to use intelligent data storage devices, such as computational storage devices, to complete data preprocessing instead of CPUs. The current studies use only one device to complete data preprocessing operations, which cannot fully overlap data preprocessing and accelerator computation time. To fully exploit the independence and high bandwidth of the novel CSD, this paper proposes an advanced, highly parallel dual-pronged data preprocessing algorithm (DDLP) that significantly improves the execution efficiency and computational overlap between heterogeneous devices. DDLP enables the CPU and CSD to start data preprocessing operations from both ends of the dataset separately. Meanwhile, we propose two adaptive dynamic selection strategies to make DDLP control the GPU to automatically read data from different sources. We achieve sufficient computational overlap between CSD data preprocessing and CPU preprocessing, GPU computation, and GPU data reading. In addition, DDLP leverages GPU Direct Storage technology to enable efficient SSD-to-GPU data transfer. DDLP reduces the usage of expensive CPU and DRAM resources, reduces the number of SSD-to-GPU data transfers, and improves the energy efficiency of preprocessing while reducing the overall preprocessing and training time. Extensive experimental results show that DDLP can improve learning speed by up to 23.5% on the ImageNet dataset while reducing energy consumption by 19.7% and CPU and DRAM usage by 37.6%. DDLP also improves learning speed by up to 27.6% on the Cifar-10 dataset.

Submitted 17 April, 2024; originally announced July 2024.

arXiv:2406.19966 [pdf, other]

Simulating Financial Market via Large Language Model based Agents

Authors: Shen Gao, Yuntao Wen, Minghang Zhu, Jianing Wei, Yuhan Cheng, Qunzi Zhang, Shuo Shang

Abstract: Most economic theories typically assume that financial market participants are fully rational individuals and use mathematical models to simulate human behavior in financial markets. However, human behavior is often not entirely rational and is challenging to predict accurately with mathematical models. In this paper, we propose \textbf{A}gent-based \textbf{S}imulated \textbf{F}inancial \textbf{M}arket (ASFM), which first constructs a simulated stock market with a real order matching system. Then, we propose a large language model based agent as the stock trader, which contains the profile, observation, and tool-learning based action module. The trading agent can comprehensively understand current market dynamics and financial policy information, and make decisions that align with their trading strategy. In the experiments, we first verify that the reactions of our ASFM are consistent with the real stock market in two controllable scenarios. In addition, we also conduct experiments in two popular economics research directions, and we find that conclusions drawn in our ASFM align with the preliminary findings in economics research. Based on these observations, we believe our proposed ASFM provides a new paradigm for economic research.

Submitted 28 June, 2024; originally announced June 2024.

arXiv:2406.19654 [pdf, other]

A lensed FRB candidate in the first CHIME/FRB Catalogue and its potential implications

Authors: Chenming Chang, Songbo Zhang, Di Xiao, Zhenfan Tang, Ye Li, Junjie Wei, Xuefeng Wu

Abstract: Fast radio bursts (FRBs) are immensely energetic radio pulses with durations of milliseconds. Given their high all-sky rate, the probability of an FRB being lensed by an intervening massive object is non-negligible. In this study, we search for possible lensing candidates within the first Canadian Hydrogen Intensity Mapping Experiment FRB catalogue using an autocorrelation algorithm and verification through signal simulations. We identify FRB 20190308C as a lensed candidate with a significance of 3.4 sigma. Furthermore, we constrain the mass of the lensing object using the Chang-Refsdal lens model, based on the flux ratio and time delay between the substructures of FRB 20190308C. Future long-term and high-precision observations are expected to reveal more lensed FRBs.

Submitted 28 June, 2024; originally announced June 2024.

Comments: 5 pages, 4 figures, submitted

arXiv:2406.18883 [pdf, other]

Galaxies Ages with Redshift z=2 to 4: Stellar Population Synthesis for Candidates in FourStar Galaxy Evolution Survey

Authors: Chong-yu Gao, Martin Lopez-Corredoira, Jun-jie Wei

Abstract: Observations of a large number of massive galaxies with relatively old populations found at high redshifts are challenging galaxy formation scenarios within standard cosmology. Precise determinations of the average age of these galaxies would be useful for the discussion of this problem. Here we carry out a better constraint of the age of 200 V-shaped SED non-AGN galaxies at redshifts $2<z<4$ of the catalog of FourStar Galaxy Evolution Survey, identified by a V-shape in their spectral energy distribution (SED) with a Lyman and a Balmer break. SED fitting includes a main stellar population in addition to a residual younger population and extinction. The galaxies are younger at higher redshift on average. However, for the galaxies with $z>2.5$, we do not see a significant evolution of their average age, with all average ages of galaxies mostly remaining between 1 and 2 Gyr. Our research finds that most massive galaxies ($\sim 10^{10} M_\odot$) are older (typically $>\sim 1$ Gyr old) and formed earlier than less massive galaxies in our sample.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 8 pages, 8 figures, 1 table, accepted for publication in ApJ

arXiv:2406.13631 [pdf, other]

On AI-Inspired UI-Design

Authors: Jialiang Wei, Anne-Lise Courbis, Thomas Lambolais, Gérard Dray, Walid Maalej

Abstract: The Graphical User Interface (or simply UI) is a primary means of interaction between users and their devices. In this paper, we discuss three major complementary approaches to using Artificial Intelligence (AI) to help app designers create better, more diverse, and more creative UIs for mobile apps. First, designers can prompt a Large Language Model (LLM) like GPT to directly generate and adjust one or multiple UIs. Second, a Vision-Language Model (VLM) enables designers to effectively search a large screenshot dataset, e.g. from apps published in app stores. The third approach is to train a Diffusion Model (DM) specifically designed to generate app UIs as inspirational images. We discuss how AI should be used, in general, to inspire and assist creative app design rather than automating it.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.12787 [pdf, other]

Generating Educational Materials with Different Levels of Readability using LLMs

Authors: Chieh-Yang Huang, Jing Wei, Ting-Hao 'Kenneth' Huang

Abstract: This study introduces the leveled-text generation task, aiming to rewrite educational materials to specific readability levels while preserving meaning. We assess the capability of GPT-3.5, LLaMA-2 70B, and Mixtral 8x7B, to generate content at various readability levels through zero-shot and few-shot prompting. Evaluating 100 processed educational materials reveals that few-shot prompting significantly improves performance in readability manipulation and information preservation. LLaMA-2 70B performs better in achieving the desired difficulty range, while GPT-3.5 maintains original meaning. However, manual inspection highlights concerns such as misinformation introduction and inconsistent edit distribution. These findings emphasize the need for further research to ensure the quality of generated educational content.

Submitted 18 June, 2024; originally announced June 2024.

Comments: In2Writing 2024

arXiv:2406.11693 [pdf, ps, other]

Wake dynamics of wind turbines in unsteady streamwise flow conditions

Authors: Nathaniel J. Wei, Adnan El Makdah, JiaCheng Hu, Frieder Kaiser, David E. Rival, John O. Dabiri

Abstract: The unsteady flow physics of wind-turbine wakes under dynamic forcing conditions are critical to the modeling and control of wind farms for optimal power density. Unsteady forcing in the streamwise direction may be generated by unsteady inflow conditions in the atmospheric boundary layer, dynamic induction control of the turbine, or streamwise surge motions of a floating offshore wind turbine due to floating-platform oscillations. This study seeks to identify the dominant flow mechanisms in unsteady wakes forced by a periodic upstream inflow condition. A theoretical framework for the problem is derived, which describes traveling-wave undulations in the wake radius and streamwise velocity. These dynamics encourage the aggregation of tip vortices into large structures that are advected along in the wake. Flow measurements in the wake of a periodically surging turbine were obtained in an optically accessible towing-tank facility, with an average diameter-based Reynolds number of 300,000 and with surge-velocity amplitudes of up to 40% of the mean inflow velocity. Qualitative agreement between trends in the measurements and model predictions is observed, supporting the validity of the theoretical analyses. The experiments also demonstrate large enhancements in the recovery of the wake relative to the steady-flow case, with wake-length reductions of up to 46.5% and improvements in the available power at 10 diameters downstream of up to 15.7%. These results provide fundamental insights into the dynamics of unsteady wakes and serve as additional evidence that unsteady fluid mechanics can be leveraged to increase the power density of wind farms.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.10956 [pdf, other]

Robust Channel Learning for Large-Scale Radio Speaker Verification

Authors: Wenhao Yang, Jianguo Wei, Wenhuan Lu, Lei Li, Xugang Lu

Abstract: Recent research in speaker verification has increasingly focused on achieving robust and reliable recognition under challenging channel conditions and noisy environments. Identifying speakers in radio communications is particularly difficult due to inherent limitations such as constrained bandwidth and pervasive noise interference. To address this issue, we present a Channel Robust Speaker Learning (CRSL) framework that enhances the robustness of the current speaker verification pipeline, considering data source, data augmentation, and the efficiency of model transfer processes. Our framework introduces an augmentation module that mitigates bandwidth variations in radio speech datasets by manipulating the bandwidth of training inputs. It also addresses unknown noise by introducing noise within the manifold space. Additionally, we propose an efficient fine-tuning method that reduces the need for extensive additional training time and large amounts of data. Moreover, we develop a toolkit for assembling a large-scale radio speech corpus and establish a benchmark specifically tailored for radio scenario speaker verification studies. Experimental results demonstrate that our proposed methodology effectively enhances performance and mitigates degradation caused by radio transmission in speaker verification tasks. The code will be available on Github.

Submitted 16 June, 2024; originally announced June 2024.

Comments: 12 pages, 11 figures

arXiv:2406.10938 [pdf, other]

doi 10.14778/3665844.3665854

DET-LSH: A Locality-Sensitive Hashing Scheme with Dynamic Encoding Tree for Approximate Nearest Neighbor Search

Authors: Jiuqi Wei, Botao Peng, Xiaodong Lee, Themis Palpanas

Abstract: Locality-sensitive hashing (LSH) is a well-known solution for approximate nearest neighbor (ANN) search in high-dimensional spaces due to its robust theoretical guarantee on query accuracy. Traditional LSH-based methods mainly focus on improving the efficiency and accuracy of the query phase by designing different query strategies, but pay little attention to improving the efficiency of the indexing phase. They typically fine-tune existing data-oriented partitioning trees to index data points and support their query strategies. However, their strategy to directly partition the multi-dimensional space is time-consuming, and performance degrades as the space dimensionality increases. In this paper, we design an encoding-based tree called Dynamic Encoding Tree (DE-Tree) to improve the indexing efficiency and support efficient range queries based on Euclidean distance. Based on DE-Tree, we propose a novel LSH scheme called DET-LSH. DET-LSH adopts a novel query strategy, which performs range queries in multiple independent index DE-Trees to reduce the probability of missing exact NN points, thereby improving the query accuracy. Our theoretical studies show that DET-LSH enjoys probabilistic guarantees on query accuracy. Extensive experiments on real-world datasets demonstrate the superiority of DET-LSH over the state-of-the-art LSH-based methods on both efficiency and accuracy. While achieving better query accuracy than competitors, DET-LSH achieves up to 6x speedup in indexing time and 2x speedup in query time over the state-of-the-art LSH-based methods. This paper was published in PVLDB 2024.

Submitted 16 June, 2024; originally announced June 2024.

Journal ref: PVLDB, 17(9): 2241 - 2254, 2024
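
For readers unfamiliar with the baseline idea, the sketch below shows a minimal random-projection LSH index for Euclidean distance: each table hashes a point by quantising a few Gaussian projections, and a query ranks only the points that share a bucket with it. This is the classic bucket-based scheme, not the DE-Tree or DET-LSH design described above; the class name `EuclideanLSH` and all parameter values are assumptions chosen for illustration.

```python
import math
import random
from collections import defaultdict


class EuclideanLSH:
    """Toy multi-table LSH index using quantised Gaussian random projections."""

    def __init__(self, dim, num_tables=4, hashes_per_table=3, width=4.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.tables = []
        for _ in range(num_tables):
            # Each hash function is a random direction plus a random offset.
            funcs = [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
                for _ in range(hashes_per_table)
            ]
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, point):
        # Concatenate the quantised projections into one bucket key.
        return tuple(
            math.floor((sum(a_i * p_i for a_i, p_i in zip(a, point)) + b) / self.width)
            for a, b in funcs
        )

    def add(self, point):
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, point)].append(point)

    def query(self, point):
        # Gather candidates from the matching bucket of every table,
        # then rank them by exact Euclidean distance.
        candidates = set()
        for funcs, buckets in self.tables:
            candidates.update(buckets.get(self._key(funcs, point), []))
        return min(candidates, key=lambda c: math.dist(c, point), default=None)


index = EuclideanLSH(dim=2)
for p in [(0.0, 0.0), (1.0, 1.0), (50.0, 50.0)]:
    index.add(p)
nearest = index.query((1.0, 1.0))
```

Points are stored as tuples so they can be deduplicated in a set. Using several independent tables lowers the chance that a true neighbour is missed because it fell into a different bucket in one table.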

arXiv:2406.10857 [pdf, other]

An LLM-enhanced Multi-objective Evolutionary Search for Autonomous Driving Test Scenario Generation

Authors: Haoxiang Tian, Xingshuo Han, Guoquan Wu, Yuan Zhou, Shuo Li, Jun Wei, Dan Ye, Wei Wang, Tianwei Zhang

Abstract: The safety of Autonomous Driving Systems (ADSs) is critically important for the deployment of autonomous vehicles (AVs). Therefore, ADSs must be evaluated thoroughly before their release and deployment to the public. How to generate diverse safety-critical test scenarios is a key task for ADS testing. This paper proposes LEADE, an LLM-enhanced scenario generation approach for ADS testing, which adopts an LLM-enhanced adaptive evolutionary search to generate safety-critical and diverse test scenarios. LEADE leverages the LLM's ability in program understanding to better comprehend the scenario generation task and to generate high-quality first-generation scenarios. LEADE adopts an adaptive multi-objective genetic algorithm to search for diverse safety-critical scenarios. To guide the search away from local optima, LEADE formulates the evolutionary search as a QA task, which leverages the LLM's ability in quantitative reasoning to generate differential seed scenarios that break out of locally optimal solutions. We implement and evaluate LEADE on an industrial-grade full-stack ADS platform, Baidu Apollo. Experimental results show that LEADE can effectively and efficiently generate safety-critical scenarios and expose 10 diverse safety violations of Apollo. It outperforms two state-of-the-art search-based ADS testing techniques by identifying 4 new types of safety-critical scenarios on the same roads.

Submitted 16 June, 2024; originally announced June 2024.

Comments: 12 pages

arXiv:2406.09772 [pdf, other]

Accelerated Over-Relaxation Heavy-Ball Methods with Provable Acceleration and Global Convergence

Authors: Jingrong Wei, Long Chen

Abstract: The heavy-ball momentum method has gained widespread popularity for accelerating gradient descent by incorporating a momentum term. Recent studies have conclusively shown that the heavy-ball method cannot achieve an accelerated convergence rate for general smooth strongly convex optimization problems. This work introduces the Accelerated Over-Relaxation Heavy-Ball (AOR-HB) method, a novel approach that represents the first heavy-ball method to demonstrate provable global and accelerated convergence for smooth strongly convex optimization. The key innovation of the AOR-HB method lies in the application of an over-relaxation technique to the gradient term. This novel approach enables the method to be applied to min-max problems and meet optimal lower complexity bounds. This breakthrough addresses a long-standing theoretical gap in heavy-ball momentum methods and paves the way for developing accelerated methods that transcend the boundaries of convex optimization to non-convex optimization. Numerical experiments validate the effectiveness of the proposed algorithms, with their performance matching that of other leading first-order optimization methods.

Submitted 14 June, 2024; originally announced June 2024.
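
For orientation, the classical (Polyak) heavy-ball iteration that this line of work starts from can be sketched as below. This is the standard update $x_{k+1} = x_k - \alpha \nabla f(x_k) + \beta (x_k - x_{k-1})$, not the AOR-HB method of the paper; the quadratic test problem and the function names are assumptions for illustration. The step size and momentum are the textbook choices for a quadratic with curvature bounds $\mu$ and $L$.

```python
import math


def heavy_ball(grad, x0, step, momentum, iters):
    """Classical heavy-ball: gradient step plus momentum * (x_k - x_{k-1})."""
    x_prev = list(x0)
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x_next = [
            xi - step * gi + momentum * (xi - xpi)
            for xi, gi, xpi in zip(x, g, x_prev)
        ]
        x_prev, x = x, x_next
    return x


# Hypothetical test problem: f(x) = 0.5 * (x0^2 + 10 * x1^2), minimiser at the origin.
def quad_grad(x):
    return [1.0 * x[0], 10.0 * x[1]]


mu, L = 1.0, 10.0  # smallest and largest curvature of the quadratic
step = 4.0 / (math.sqrt(L) + math.sqrt(mu)) ** 2
momentum = ((math.sqrt(L) - math.sqrt(mu)) / (math.sqrt(L) + math.sqrt(mu))) ** 2

result = heavy_ball(quad_grad, [5.0, -3.0], step, momentum, 200)
```

On quadratics these parameters give the accelerated linear rate; the abstract's point is that this guarantee does not carry over to general smooth strongly convex functions for the classical method.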

arXiv:2406.08698 [pdf, other]

Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

Abstract: In this work we search for signals generated by ultra-heavy dark matter in data from the Large High Altitude Air Shower Observatory (LHAASO). We look for possible gamma rays from dark matter annihilation or decay in 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter, as they have low fluxes of astrophysical $γ$-ray background while containing large amounts of dark matter. By analyzing more than 700 days of observational data at LHAASO, no significant dark matter signal from 1 TeV to 1 EeV is detected. Accordingly we derive the most stringent constraints on the ultra-heavy dark matter annihilation cross-section up to EeV. The constraints on the lifetime of dark matter in decay mode are also derived.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 17 pages, 12 figures, accepted by PRL

arXiv:2406.08168 [pdf, ps, other]

Global Tests for Smoothed Functions in Mean Field Variational Additive Models

Authors: Mark J. Meyer, Junyi Wei

Abstract: Variational regression methods are an increasingly popular tool for their efficient estimation of complex. Given the mixed model representation of penalized effects, additive regression models with smoothed effects and scalar-on-function regression models can be fit relatively efficiently in a variational framework. However, inferential procedures for smoothed and functional effects in such a context are limited. We demonstrate that by using the Mean Field Variational Bayesian (MFVB) approximation to the additive model and the subsequent Coordinate Ascent Variational Inference (CAVI) algorithm, we can obtain a form of the estimated effects required of a Frequentist test for semiparametric curves. We establish MFVB approximations and CAVI algorithms for both Gaussian and binary additive models with an arbitrary number of smoothed and functional effects. We then derive a global testing framework for smoothed and functional effects. Our empirical study demonstrates that the test maintains good Frequentist properties in the variational framework and can be used to directly test results from a converged, MFVB approximation and CAVI algorithm. We illustrate the applicability of this approach in a wide range of data illustrations.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.06559 [pdf, other]

Harnessing Business and Media Insights with Large Language Models

Authors: Yujia Bao, Ankit Parag Shah, Neeru Narang, Jonathan Rivers, Rajeev Maksey, Lan Guan, Louise N. Barrere, Shelley Evenson, Rahul Basole, Connie Miao, Ankit Mehta, Fabien Boulay, Su Min Park, Natalie E. Pearson, Eldhose Joy, Tiger He, Sumiran Thakur, Koustav Ghosal, Josh On, Phoebe Morrison, Tim Major, Eva Siqi Wang, Gina Escobar, Jiaheng Wei, Tharindu Cyril Weerasooriya, et al. (8 additional authors not shown)

Abstract: This paper introduces Fortune Analytics Language Model (FALM). FALM empowers users with direct access to comprehensive business analysis, including market trends, company performance metrics, and expert insights. Unlike generic LLMs, FALM leverages a curated knowledge base built from professional journalism, enabling it to deliver precise and in-depth answers to intricate business questions. Users can further leverage natural language queries to directly visualize financial data, generating insightful charts and graphs to understand trends across diverse business sectors clearly. FALM fosters user trust and ensures output accuracy through three novel methods: 1) Time-aware reasoning guarantees accurate event registration and prioritizes recent updates. 2) Thematic trend analysis explicitly examines topic evolution over time, providing insights into emerging business landscapes. 3) Content referencing and task decomposition enhance answer fidelity and data visualization accuracy. We conduct both automated and human evaluations, demonstrating FALM's significant performance improvements over baseline methods while prioritizing responsible AI practices. These benchmarks establish FALM as a cutting-edge LLM in the business and media domains, with exceptional accuracy and trustworthiness.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.05688 [pdf, other]

Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions

Authors: Cheng Tan, Dongxin Lyu, Siyuan Li, Zhangyang Gao, Jingxuan Wei, Siqi Ma, Zicheng Liu, Stan Z. Li

Abstract: Large Language Models (LLMs) have demonstrated wide-ranging applications across various fields and have shown significant potential in the academic peer-review process. However, existing applications are primarily limited to static review generation based on submitted papers, which fail to capture the dynamic and iterative nature of real-world peer reviews. In this paper, we reformulate the peer-review process as a multi-turn, long-context dialogue, incorporating distinct roles for authors, reviewers, and decision makers. We construct a comprehensive dataset containing over 26,841 papers with 92,017 reviews collected from multiple sources, including the top-tier conference and prestigious journal. This dataset is meticulously designed to facilitate the applications of LLMs for multi-turn dialogues, effectively simulating the complete peer-review process. Furthermore, we propose a series of metrics to evaluate the performance of LLMs for each role under this reformulated peer-review setting, ensuring fair and comprehensive evaluations. We believe this work provides a promising perspective on enhancing the LLM-driven peer-review process by incorporating dynamic, role-based interactions. It aligns closely with the iterative and interactive nature of real-world academic peer review, offering a robust foundation for future research and development in this area. We open-source the dataset at https://github.com/chengtan9907/ReviewMT.

Submitted 9 June, 2024; originally announced June 2024.

Comments: Under review

arXiv:2406.03880 [pdf, other]

Memorization in deep learning: A survey

Authors: Jiaheng Wei, Yanjun Zhang, Leo Yu Zhang, Ming Ding, Chao Chen, Kok-Leong Ong, Jun Zhang, Yang Xiang

Abstract: Deep Learning (DL) powered by Deep Neural Networks (DNNs) has revolutionized various domains, yet understanding the intricacies of DNN decision-making and learning processes remains a significant challenge. Recent investigations have uncovered an interesting memorization phenomenon in which DNNs tend to memorize specific details from examples rather than learning general patterns, affecting model generalization, security, and privacy. This raises critical questions about the nature of generalization in DNNs and their susceptibility to security breaches. In this survey, we present a systematic framework to organize memorization definitions based on the generalization and security/privacy domains and summarize memorization evaluation methods at both the example and model levels. Through a comprehensive literature review, we explore DNN memorization behaviors and their impacts on security and privacy. We also introduce privacy vulnerabilities caused by memorization and the phenomenon of forgetting and explore its connection with memorization. Furthermore, we spotlight various applications leveraging memorization and forgetting mechanisms, including noisy label learning, privacy preservation, and model enhancement. This survey offers the first-in-kind understanding of memorization in DNNs, providing insights into its challenges and opportunities for enhancing AI development while addressing critical ethical concerns.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2406.03799 [pdf]

Enhanced Semantic Segmentation Pipeline for WeatherProof Dataset Challenge

Authors: Nan Zhang, Xidan Zhang, Jianing Wei, Fangjun Wang, Zhiming Tan

Abstract: This report describes the winning solution to the WeatherProof Dataset Challenge (CVPR 2024 UG2+ Track 3). Details regarding the challenge are available at https://cvpr2024ug2challenge.github.io/track3.html. We propose an enhanced semantic segmentation pipeline for this challenge. Firstly, we improve semantic segmentation models, using backbone pretrained with Depth Anything to improve UperNet model and SETRMLA model, and adding language guidance based on both weather and category information to InternImage model. Secondly, we introduce a new dataset WeatherProofExtra with wider viewing angle and employ data augmentation methods, including adverse weather and super-resolution. Finally, effective training strategies and ensemble method are applied to improve final performance further. Our solution is ranked 1st on the final leaderboard. Code will be available at https://github.com/KaneiGi/WeatherProofChallenge.

Submitted 6 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

arXiv:2406.01839 [pdf, other]

doi 10.1016/j.nima.2023.168685

Simulation of DAMPE silicon microstrip detectors in the $\rm Allpix^{2}$ framework

Authors: Yu-Xin Cui, Xiang Li, Shen Wang, Chuan Yue, Qiang Wan, Shi-Jun Lei, Guan-Wen Yuan, Yi-Ming Hu, Jia-Ju Wei, Jian-Hua Guo

Abstract: Silicon strip detectors have been widely utilized in space experiments for gamma-ray and cosmic-ray detections thanks to their high spatial resolution and stable performance. For a silicon micro-strip detector, the Monte Carlo simulation is recognized as a practical and cost-effective approach to verify the detector performance. In this study, a technique for the simulation of the silicon micro-strip detector with the $\rm Allpix^{2}$ framework is developed. By incorporating the electric field into the particle transport simulation based on Geant4, this framework could precisely emulate the carrier drift in the silicon micro-strip detector. The simulation results are validated using the beam test data as well as the flight data of the DAMPE experiment, which suggests that the $\rm Allpix^{2}$ framework is a powerful tool to obtain the performance of the silicon micro-strip detector.

Submitted 3 June, 2024; originally announced June 2024.

Journal ref: Nuclear Instruments and Methods in Physics Research A 1057 (2023) 168685

arXiv:2406.00993 [pdf]

Detection of Acetone as a Gas Biomarker for Diabetes Based on Gas Sensor Technology

Authors: Jiaming Wei, Tong Liu, Jipeng Huang, Xiaowei Li, Yurui Qi, Gangyin Luo

Abstract: With the continuous development and improvement of medical services, there is a growing demand for improving diabetes diagnosis. Exhaled breath analysis, characterized by its speed, convenience, and non-invasive nature, is leading the trend in diagnostic development. Studies have shown that the acetone levels in the breath of diabetes patients are higher than normal, making acetone a basis for diabetes breath analysis. This provides a more readily accepted method for early diabetes prevention and monitoring. Addressing issues such as the invasive nature, disease transmission risks, and complexity of diabetes testing, this study aims to design a diabetes gas biomarker acetone detection system centered around a sensor array using gas sensors and pattern recognition algorithms. The research covers sensor selection, sensor preparation, circuit design, data acquisition and processing, and detection model establishment to accurately identify acetone. Titanium dioxide was chosen as the nano gas-sensitive material to prepare the acetone gas sensor, with data collection conducted using STM32. Filtering was applied to process the raw sensor data, followed by feature extraction using principal component analysis. A recognition model based on support vector machine algorithm was used for qualitative identification of gas samples, while a recognition model based on backpropagation neural network was employed for quantitative detection of gas sample concentrations. Experimental results demonstrated recognition accuracies of 96% and 97.5% for acetone-ethanol and acetone-methanol mixed gases, and 90% for ternary acetone, ethanol, and methanol mixed gases.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 9 pages, 14 figures

arXiv:2406.00948 [pdf]

Real-space tilting method for atomic resolution STEM imaging of nanocrystalline materials

Authors: Jiake Wei, Zhangze Xu, Wenjie Shen, Bin Feng, Ryo Ishikawa, Naoya Shibata, Yuichi Ikuhara, Xuedong Bai

Abstract: Atomic-resolution scanning transmission electron microscopy (STEM) characterization requires precise tilting of the specimen to a high-symmetry zone axis, which is usually done in reciprocal space by following the diffraction patterns. However, for small-sized nanocrystalline materials, the diffraction patterns are too faint to guide the tilting process. Here, a simple and effective tilting method is developed based on the diffraction contrast change of the shadow image in the Ronchigram. We can calculate the misorientation angle of the specimen and tilt it to the zone axis based on the position of the shadow image with lowest intensity. This method requires no prior knowledge of the sample, and the maximum misorientation angle we can correct is greater than ±6.9 degrees with sub-mrad accuracy. It is performed in real space, without recording the diffraction patterns of the specimens, and can therefore be applied effectively to nanocrystalline materials. Combined with scripting to control the microscope, we can automatically tilt the sample to the zone axis under low-dose conditions (<0.17 e⁻/Å²/s), which could facilitate the imaging of beam-sensitive materials such as zeolites or metal organic frameworks. This automated tilting method could contribute to the atomic-scale characterization of nanocrystalline materials by STEM imaging.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2405.20834 [pdf, other]

Retrieval Meets Reasoning: Even High-school Textbook Knowledge Benefits Multimodal Reasoning

Authors: Cheng Tan, Jingxuan Wei, Linzhuang Sun, Zhangyang Gao, Siyuan Li, Bihui Yu, Ruifeng Guo, Stan Z. Li

Abstract: Large language models equipped with retrieval-augmented generation (RAG) represent a burgeoning field aimed at enhancing answering capabilities by leveraging external knowledge bases. Although the application of RAG with language-only models has been extensively explored, its adaptation into multimodal vision-language models remains nascent. Going beyond mere answer generation, the primary goal of multimodal RAG is to cultivate the models' ability to reason in response to relevant queries. To this end, we introduce a novel multimodal RAG framework named RMR (Retrieval Meets Reasoning). The RMR framework employs a bi-modal retrieval module to identify the most relevant question-answer pairs, which then serve as scaffolds for the multimodal reasoning process. This training-free approach not only encourages the model to engage deeply with the reasoning processes inherent in the retrieved content but also facilitates the generation of answers that are precise and richly interpretable. Surprisingly, utilizing solely the ScienceQA dataset, collected from elementary and high school science curricula, RMR significantly boosts the performance of various vision-language models across a spectrum of benchmark datasets, including A-OKVQA, MMBench, and SEED. These outcomes highlight the substantial potential of our multimodal retrieval and reasoning mechanism to improve the reasoning capabilities of vision-language models.

Submitted 31 May, 2024; originally announced May 2024.

Comments: Under review

arXiv:2405.19592 [pdf, other]

Why Larger Language Models Do In-context Learning Differently?

Authors: Zhenmei Shi, Junyi Wei, Zhuoyan Xu, Yingyu Liang

Abstract: Large language models (LLMs) have emerged as a powerful tool for AI, with the key ability of in-context learning (ICL), where they can perform well on unseen tasks based on a brief series of task examples without any adjustment to the model parameters. One interesting and puzzling recent observation is that models of different scales may have different ICL behaviors: larger models tend to be more sensitive to noise in the test context. This work studies this observation theoretically, aiming to improve the understanding of LLMs and ICL. We analyze two stylized settings: (1) linear regression with one-layer single-head linear transformers and (2) parity classification with two-layer multi-head attention transformers (non-linear data and non-linear model). In both settings, we give closed-form optimal solutions and find that smaller models emphasize important hidden features while larger ones cover more hidden features; thus, smaller models are more robust to noise while larger ones are more easily distracted, leading to different ICL behaviors. This sheds light on where transformers pay attention and how that affects ICL. Preliminary experimental results on large base and chat models provide positive support for our analysis.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.19266 [pdf, other]

PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications

Authors: Dingkang Yang, Jinjie Wei, Dongling Xiao, Shunli Wang, Tong Wu, Gang Li, Mingcheng Li, Shuaibing Wang, Jiawei Chen, Yue Jiang, Qingyao Xu, Ke Li, Peng Zhai, Lihua Zhang

Abstract: Developing intelligent pediatric consultation systems offers promising prospects for improving diagnostic efficiency, especially in China, where healthcare resources are scarce. Despite recent advances in Large Language Models (LLMs) for Chinese medicine, their performance is sub-optimal in pediatric applications due to inadequate instruction data and vulnerable training procedures. To address these issues, this paper builds PedCorpus, a high-quality dataset of over 300,000 multi-task instructions from pediatric textbooks, guidelines, and knowledge graph resources to fulfil diverse diagnostic demands. Building on the well-designed PedCorpus, we propose PediatricsGPT, the first Chinese pediatric LLM assistant built on a systematic and robust training pipeline. In the continuous pre-training phase, we introduce a hybrid instruction pre-training mechanism to mitigate the internal-injected knowledge inconsistency of LLMs for medical domain adaptation. Next, full-parameter Supervised Fine-Tuning (SFT) is utilized to incorporate the general medical knowledge schema into the models. After that, we devise a direct following preference optimization to enhance the generation of pediatrician-like humanistic responses. In the parameter-efficient secondary SFT phase, a mixture of universal-specific experts strategy is presented to resolve the competency conflict between medical generalist and pediatric expertise mastery.
Extensive results based on the metrics, GPT-4, and doctor evaluations on distinct doctor downstream tasks show that PediatricsGPT consistently outperforms previous Chinese medical LLMs. Our model and dataset will be open-sourced for community development.

Submitted 3 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

Comments: A Technical Report on a Chinese Medical Large Language Model

arXiv:2405.16849 [pdf, other]

Sync4D: Video Guided Controllable Dynamics for Physics-Based 4D Generation

Authors: Zhoujie Fu, Jiacheng Wei, Wenhao Shen, Chaoyue Song, Xiaofeng Yang, Fayao Liu, Xulei Yang, Guosheng Lin

Abstract: In this work, we introduce a novel approach for creating controllable dynamics in 3D-generated Gaussians using casually captured reference videos. Our method transfers the motion of objects from reference videos to a variety of generated 3D Gaussians across different categories, ensuring precise and customizable motion transfer. We achieve this by employing blend skinning-based non-parametric shape reconstruction to extract the shape and motion of reference objects. This process involves segmenting the reference objects into motion-related parts based on skinning weights and establishing shape correspondences with generated target shapes. To address shape and temporal inconsistencies prevalent in existing methods, we integrate physical simulation, driving the target shapes with matched motion. This integration is optimized through a displacement loss to ensure reliable and genuine dynamics. Our approach supports diverse reference inputs, including humans, quadrupeds, and articulated objects, and can generate dynamics of arbitrary length, providing enhanced fidelity and applicability. Unlike methods heavily reliant on diffusion video generation models, our technique offers specific and high-quality motion transfer, maintaining both shape integrity and temporal consistency.

Submitted 7 July, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

Comments: Our project page: https://sync4dphys.github.io/

arXiv:2405.16209 [pdf]

Analytical photoresponses of Schottky contact MoS2 phototransistors

Authors: Jianyong Wei, Yumeng Liu, Yizhuo Wang, Kai Li, Zhentao Lian, Maosong Xie, Xinhan Yang, Seyed Saleh Mousavi Khaleghi, Fuxing Dai, Weida Hu, Xuejiao Gao, Rui Yang, Yaping Dan

Abstract: High-gain photodetectors based on two-dimensional (2D) semiconductors, in particular those in photoconductive mode, have been extensively investigated in the past decade. However, the classical photoconductive theory was derived from two misplaced assumptions. In this work, we established an explicit analytical device model for Schottky contact MoS2 phototransistors that fits well with experimental data. From the fitting results, we found that the Richardson constant of the MoS2 Schottky contact is temperature dependent, indicating that the Schottky contacts for the 2D material are best described by the mixed thermionic emission and diffusion model. Based on this device model, we further established an analytical photoresponse for the few-layer MoS2 phototransistors, from which we found the voltage distribution on the two Schottky contacts and the channel, and extracted the minority carrier recombination lifetimes. The lifetimes are comparable with the values found from transient photoluminescence measurements, which therefore validates our analytical photoresponses for Schottky contact 2D semiconducting phototransistors.

Submitted 25 May, 2024; originally announced May 2024.

Comments: 15 pages, 6 figures

arXiv:2405.16191 [pdf, other]

Rocket Landing Control with Grid Fins and Path-following using MPC

Authors: Junhao Yu, Jiarun Wei

Abstract: In this project, we attempt to optimize the landing trajectory of a rocket. The goal is to minimize the total fuel consumption during the landing process using different techniques. Once the optimal and feasible trajectory is generated using a batch approach, we attempt to follow the path using a Model Predictive Control (MPC) based algorithm, called Trajectory Optimizing Path following Estimation from Demonstration (TOPED), in order to generalize to similar initial states and models, where we introduce a novel cost function for the MPC to solve. We further show that TOPED can follow a demonstration trajectory well in practice under model mismatch and different initial states.

Submitted 25 May, 2024; originally announced May 2024.

arXiv:2405.15995 [pdf, other]

Efficient Temporal Action Segmentation via Boundary-aware Query Voting

Authors: Peiyao Wang, Yuewei Lin, Erik Blasch, Jie Wei, Haibin Ling

Abstract: Although the performance of Temporal Action Segmentation (TAS) has improved in recent years, achieving promising results often comes with a high computational cost due to dense inputs, complex model structures, and resource-intensive post-processing requirements. To improve the efficiency while keeping the performance, we present a novel perspective centered on per-segment classification. By harnessing the capabilities of Transformers, we tokenize each video segment as an instance token, endowed with intrinsic instance segmentation. To realize efficient action segmentation, we introduce BaFormer, a boundary-aware Transformer network. It employs instance queries for instance segmentation and a global query for class-agnostic boundary prediction, yielding continuous segment proposals. During inference, BaFormer employs a simple yet effective voting strategy to classify boundary-wise segments based on instance segmentation. Remarkably, as a single-stage approach, BaFormer significantly reduces the computational costs, utilizing only 6% of the running time compared to the state-of-the-art method DiffAct, while producing better or comparable accuracy over several popular benchmarks. The code for this project is publicly available at https://github.com/peiyao-w/BaFormer.
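The idea of classifying boundary-delimited segments by voting can be illustrated with a minimal sketch. It assumes per-frame class predictions and a list of predicted boundary positions are already available; the function name `vote_segments`, its inputs, and the plain majority rule are hypothetical choices for illustration, not BaFormer's actual implementation, which votes over its instance-segmentation outputs.

```python
from collections import Counter


def vote_segments(frame_labels, boundaries):
    """Assign one label per boundary-delimited segment by majority vote.

    frame_labels: per-frame class ids (hypothetical input).
    boundaries: sorted frame indices where a new segment starts (excluding 0).
    Returns a list of (start, end, label) tuples with end exclusive.
    """
    edges = [0] + list(boundaries) + [len(frame_labels)]
    segments = []
    for start, end in zip(edges[:-1], edges[1:]):
        if start >= end:
            continue  # skip empty segments caused by duplicate boundaries
        # The most frequent frame label inside the segment wins the vote.
        winner, _ = Counter(frame_labels[start:end]).most_common(1)[0]
        segments.append((start, end, winner))
    return segments
```

For example, `vote_segments([0, 0, 1, 0, 2, 2, 2, 1], [4])` labels the first four frames as class 0 and the last four as class 2, smoothing out the two isolated frame-level disagreements.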

Submitted 24 May, 2024; originally announced May 2024.

Comments: 17 pages, 8 figures, 11 tables

arXiv:2405.12851 [pdf]

Ultrafast Broadband Strong-Field Tunnelling in Asymmetric Nanogaps for Time-Resolved Nanoscopy

Authors: Haoqing Ning, Marios Maimaris, Jiewen Wei, Emilie Gérouville, Evangelos Moutoulas, Zhu Meng, Clement Ferchaud, Dmitry Maslennikov, Navendu Mondal, Tong Wang, Colin Chow, Aleksandar P. Ivanov, Joshua B. Edel, Saif A. Haque, Misha Ivanov, Jon P. Marangos, Dimitra G. Georgiadou, Artem A. Bakulin

Abstract: Femtosecond-fast and nanometre-size pulses of electrons are emerging as unique probes for ultrafast dynamics at the nanoscale. Presently, such pulses are achievable only in highly sophisticated ultrafast electron microscopes or equally complex setups involving few-cycle-pulsed lasers with stable carrier-envelope phase (CEP) and nanotip probes. Here, we show that the generation of femtosecond pulses of nanoscale tunnelling electrons can be achieved in any ultrafast optical laboratory, using any (deep-UV to mid-IR) femtosecond laser in combination with photosensitive asymmetric nanogap (PAN) diodes fabricated via easy-to-scale adhesion lithography. The dominant mechanism producing tunnelling electrons in PANs is strong-field emission, which is easily achievable without CEP locking or external bias voltage. We employ PANs to demonstrate ultrafast nanoscopy of metal-halide perovskite quantum dots immobilised inside a 10-nm Al/Au nanogap and to characterise laser pulses across the entire optical region (266-6700 nm). Short electron pulses in PANs open the way towards scalable on-chip femtosecond electron measurements and novel design approaches for integrated ultrafast sensing nanodevices.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.12665 [pdf, other]

Age of massive galaxies at redshift 8

Authors: M. Lopez-Corredoira, F. Melia, J. -J. Wei, C. -Y. Gao

Abstract: Recent James Webb Space Telescope (JWST) data analyses have shown that massive red galaxies existed at redshifts $z>6$, a discovery that is difficult to understand in the context of standard cosmology ($Λ$CDM). Here we analyze these observations more deeply by fitting a stellar population model to the optical and near-infrared photometric data. These fits include a main stellar population in addition to a residual younger population and with the same extinction for both (a lower extinction for the younger population is unphysical). Extra stellar populations or the inclusion of an AGN component do not significantly improve the fits. These galaxies are being viewed at very high redshifts, with an average $\langle z\rangle \approx 8.2$, when the $Λ$CDM Universe was only $\approx 600$ Myr old. This result conflicts with the inferred ages of these galaxies, however, which were on average between 0.9 and 2.4 Gyr old within 95% CL. Given the sequence of star formation and galaxy assembly in the standard model, these galaxies should instead be even younger than 290 Myr on average, for which our analysis assigns a probability of only $<3\times 10^{-4}$ ($\gtrsim 3.6σ$ tension). This outcome may indicate the need to consider non-standard cosmologies. Nevertheless, our conclusions result from several approximations in stellar astrophysics and extinction, so they should be taken with a grain of salt. Further research is necessary to corroborate the possible existence of galaxies older than the $Λ$CDM universe at their observed redshifts.
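The quoted cosmic age of roughly 600 Myr at $\langle z\rangle \approx 8.2$ can be checked with the closed-form age of a flat $Λ$CDM universe, $t(z) = \frac{2}{3 H_0 \sqrt{Ω_Λ}}\,\mathrm{asinh}\!\left(\sqrt{Ω_Λ/Ω_m}\,(1+z)^{-3/2}\right)$, which neglects radiation. The sketch below is a rough cross-check only; the parameter values ($H_0 = 67.4$ km/s/Mpc, $Ω_m = 0.315$) are assumed Planck-like defaults and are not stated in the abstract.

```python
import math


def lcdm_age_gyr(z, h0=67.4, omega_m=0.315):
    """Age of a flat LambdaCDM universe at redshift z, in Gyr.

    Radiation is neglected; h0 (km/s/Mpc) and omega_m are assumed
    Planck-like values, not parameters taken from the abstract.
    """
    omega_l = 1.0 - omega_m          # flatness: Omega_Lambda = 1 - Omega_m
    hubble_time_gyr = 977.8 / h0     # 1/H0 in Gyr when H0 is in km/s/Mpc
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return hubble_time_gyr * 2.0 / (3.0 * math.sqrt(omega_l)) * math.asinh(x)


print(f"{lcdm_age_gyr(8.2) * 1000:.0f} Myr")
```

With these assumed parameters the function gives an age of about 0.6 Gyr at $z = 8.2$ and about 13.8 Gyr today, in line with the figure quoted in the abstract.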

Submitted 22 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

Comments: 16 pages, 8 figures, accepted to be published in ApJ

arXiv:2405.12571 [pdf, other]

iHERO: Interactive Human-oriented Exploration and Supervision Under Scarce Communication

Authors: Zhuoli Tian, Yuyang Zhang, Jinsheng Wei, Meng Guo

Abstract: Exploration of unknown scenes before human entry is essential for safety and efficiency in numerous scenarios, e.g., subterranean exploration, reconnaissance, search and rescue missions. Fleets of autonomous robots are particularly suitable for this task, via concurrent exploration, multi-sensory perception and autonomous navigation. Communication among the robots, however, can be severely restricted to only close-range exchange via ad-hoc networks. Although some recent works have addressed the problem of collaborative exploration under restricted communication, the crucial role of the human operator has been mostly neglected. Indeed, the operator may: (i) require timely updates regarding the exploration progress and fleet status; (ii) prioritize certain regions; and (iii) dynamically move within the explored area. To facilitate these requests, this work proposes an interactive human-oriented online coordination framework for collaborative exploration and supervision under scarce communication (iHERO). The robots switch smoothly and optimally among fast exploration, intermittent exchange of map and sensory data, and return to the operator for status update. It is ensured that these requests are fulfilled online interactively with a pre-specified latency. Extensive large-scale human-in-the-loop simulations and hardware experiments are performed over numerous challenging scenes, which demonstrate its performance in terms of explored area and efficiency, and validate its potential applicability to real-world scenarios.
The videos are available at https://zl-tian.github.io/iHERO/.

Submitted 7 June, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

Comments: Accepted at RSS 2024

arXiv:2405.12504 [pdf, other]

SN 2019tua: A Type IIb Supernova with Multiple Bumps in the Light Curves

Authors: Xin-Bo Huang, Xiang-Gao Wang, Long Li, Li-Ping Xin, Jing Wang, Tian-Ci Zheng, Qi Wang, Hui-Ya Liu, Zi-Min Zhou, Xiao-Meng Lu, Jian-Yan Wei, En-Wei Liang

Abstract: We present photometric and spectroscopic observations and analysis of the type IIb supernova (SN) SN 2019tua, which exhibits multiple bumps in its declining light curves between 40 and 65 days after discovery. SN 2019tua shows a time to peak of about 25 days, similar to other type IIb SNe. Our observations indicate a decrease in its brightness of about 1 magnitude in the 60 days after the peak. At about days 50 and 60, its multiband light curves exhibit bumpy behavior. The complex luminosity evolution of SN 2019tua could not be well modeled with a single currently popular energy source model, e.g., radioactive decay of $^{56}$Ni, magnetar, interaction between the ejecta and a circumstellar shell. Even though the magnetar model has a smaller $χ^2 / \text{dof}$ value, the complex changes in SN 2019tua's brightness suggest that more than one physical process might be involved. We propose a hybrid CSM interaction plus $^{56}$Ni model to explain the bolometric light curve (LC) of SN 2019tua. The fitting results show that the ejecta mass $M_{\rm ej} \approx 2.4~M_\odot$, the total CSM mass $M_{\rm CSM} \approx 1.0~M_\odot$, and the $^{56}$Ni mass $M_{\rm Ni} \approx 0.4~M_\odot$. The total kinetic energy of the ejecta is $E_k\approx 0.5 \times 10^{51}\rm~erg$. Pre-existing multiple shells suggest that the progenitor of SN 2019tua experienced mass ejections within approximately $6 - 44$ years prior to the explosion.

Submitted 23 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

Comments: Accepted for publication in ApJ. 24 pages, 17 figures, 6 tables

arXiv:2405.12031 [pdf, other]

Neighborhood Attention Transformer with Progressive Channel Fusion for Speaker Verification

Authors: Nian Li, Jianguo Wei

Abstract: Transformer-based architectures for speaker verification typically require more training data than ECAPA-TDNN. Therefore, recent work has generally been trained on VoxCeleb1&2. We propose a backbone network based on self-attention, which can achieve competitive results when trained on VoxCeleb2 alone. The network alternates between neighborhood attention and global attention to capture local and global features, then aggregates features of different hierarchical levels, and finally performs attentive statistics pooling. Additionally, we employ a progressive channel fusion strategy to expand the receptive field in the channel dimension as the network deepens. We trained the proposed PCF-NAT model on VoxCeleb2 and evaluated it on VoxCeleb1 and the validation sets of VoxSRC. The EER and minDCF of the shallow PCF-NAT are on average more than 20% lower than those of similarly sized ECAPA-TDNN. Deep PCF-NAT achieves an EER lower than 0.5% on VoxCeleb1-O. The code and models are publicly available at https://github.com/ChenNan1996/PCF-NAT.

Submitted 29 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: 8 pages, 2 figures, 3 tables; added github link

arXiv:2405.11826 [pdf, other]

Data quality control system and long-term performance monitor of the LHAASO-KM2A

Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen, et al. (263 additional authors not shown)

Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To ensure the reliability of the LHAASO-KM2A data, a three-level quality control system has been established. It is used to monitor the status of detector units, stability of reconstructed parameters and the performance of the array based on observations of the Crab Nebula and Moon shadow. This paper introduces the control system and its application to the LHAASO-KM2A data collected from August 2021 to July 2023. During this period, the pointing and angular resolution of the array were stable. From the observations of the Moon shadow and Crab Nebula, the results achieved using the two methods are consistent with each other. According to the observation of the Crab Nebula at energies from 25 TeV to 100 TeV, the time averaged pointing errors are estimated to be $-0.003^{\circ} \pm 0.005^{\circ}$ and $0.001^{\circ} \pm 0.006^{\circ}$ in the R.A. and Dec directions, respectively.

Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: 15 pages, 9 figures

arXiv:2405.10663 [pdf, ps, other]

Instability of Circumnuclear Gas Supply as An Origin of "Changing-look" Phenomenon of Supermassive Blackholes

Authors: J. Wang, D. W. Xu, Xinwu Cao, C. Gao, C. H. Xie, J. Y. Wei

Abstract: The origin of the "Changing-look" (CL) phenomenon in supermassive black holes (SMBHs) remains an open issue. This study aims to shed light on this phenomenon by focusing on a sample that encompasses all known repeating CL active galactic nuclei (AGNs). Through the identification of a characteristic time scale for the CL phenomenon, it was observed that larger SMBHs possess shorter characteristic timescales, while smaller SMBHs exhibit longer timescales. These findings reveal a significant contrast to the traditional AGN variability that has been adequately explained by the AGN's disk instability model. This stark discrepancy highlights a distinct origin of the CL phenomenon, distinguishing it from traditional AGN variability. By properly predicting the characteristic time scale and its dependence on SMBH mass, we propose that the CL phenomenon is likely a result of a variation in accretion rate caused by a sudden change in the supply of circumnuclear gas during the transition between active and passive SMBH fueling stages.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 14 pages, 4 figures and 2 tables, accepted by ApJ

arXiv:2405.08582 [pdf, other]

doi 10.1145/3626772.3657736

Treatment Effect Estimation for User Interest Exploration on Recommender Systems

Authors: Jiaju Chen, Wenjie Wang, Chongming Gao, Peng Wu, Jianxiong Wei, Qingsong Hua

Abstract: Recommender systems learn personalized user preferences from user feedback like clicks. However, user feedback is usually biased towards partially observed interests, leaving many users' hidden interests unexplored. Existing approaches typically mitigate the bias, increase recommendation diversity, or use bandit algorithms to balance exploration-exploitation trade-offs. Nevertheless, they fail to consider the potential rewards of recommending different categories of items and lack the global scheduling of allocating top-N recommendations to categories, leading to suboptimal exploration. In this work, we propose an Uplift model-based Recommender (UpliftRec) framework, which regards top-N recommendation as a treatment optimization problem. UpliftRec estimates the treatment effects, i.e., the click-through rate (CTR) under different category exposure ratios, by using observational user feedback. UpliftRec calculates group-level treatment effects to discover users' hidden interests with high CTR rewards and leverages inverse propensity weighting to alleviate confounder bias. Thereafter, UpliftRec adopts a dynamic programming method to calculate the optimal treatment for overall CTR maximization. We implement UpliftRec on different backend models and conduct extensive experiments on three datasets. The empirical results validate the effectiveness of UpliftRec in discovering users' hidden interests while achieving superior recommendation accuracy.
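Inverse propensity weighting, mentioned above as the tool for alleviating confounder bias, can be sketched generically as follows. This is a textbook self-normalized IPW estimator, not UpliftRec's actual estimator; the function name `ipw_ctr`, the discrete treatment labels (standing in for category exposure ratios), and the assumption that propensities are known and positive are all illustrative.

```python
def ipw_ctr(clicks, treatments, propensities, target):
    """Self-normalized inverse-propensity-weighted CTR for one treatment.

    clicks: 0/1 outcomes per logged impression.
    treatments: the treatment each impression actually received
        (hypothetically, a discretized category exposure ratio).
    propensities: probability of receiving the observed treatment
        (assumed known and strictly positive).
    target: the treatment whose CTR is being estimated.
    """
    weighted_clicks = 0.0
    weight_total = 0.0
    for y, t, p in zip(clicks, treatments, propensities):
        if t == target:
            w = 1.0 / p              # rarely assigned treatments count more
            weighted_clicks += w * y
            weight_total += w
    # NaN signals that the target treatment never appears in the log.
    return weighted_clicks / weight_total if weight_total > 0 else float("nan")
```

The self-normalized form divides by the sum of weights rather than the sample size, which trades a small bias for lower variance when some propensities are small.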

Submitted 14 May, 2024; originally announced May 2024.

Comments: Accepted to SIGIR 2024

arXiv:2405.07691 [pdf, other]

Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

Abstract: The first source catalog of the Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma-ray source, 1LHAASO J1219+2915. In this paper, a further detailed study of the spectral and temporal behavior of this point-like source has been carried out. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) is compatible with NGC 4278 within $\sim0.03$ degree. Variation analysis shows an indication of variability at the level of a few months in the TeV band, which is consistent with low frequency observations. Based on these observations, we report the detection of TeV $γ$-ray emissions from this low-luminosity AGN NGC 4278. The observations by LHAASO-WCDA during the active period have a significance level of 8.8\,$σ$ with best-fit photon spectral index $\varGamma=2.56\pm0.14$ and a flux $f_{1-10\,\rm{TeV}}=(7.0\pm1.1_{\rm{sta}}\pm0.35_{\rm{syst}})\times10^{-13}\,\rm{photons\,cm^{-2}\,s^{-1}}$, or approximately $5\%$ of the Crab Nebula. The discovery of VHE emission from NGC 4278 indicates that the compact, weak radio jet can efficiently accelerate particles and emit TeV photons.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures

arXiv:2405.07690 [pdf, ps, other]

Convergence analysis of three semi-discrete numerical schemes for nonlocal geometric flows including perimeter terms

Authors: Jiang Wei, Su Chunmei, Zhang Ganghui

Abstract: We present and analyze three distinct semi-discrete schemes for solving nonlocal geometric flows incorporating perimeter terms. These schemes are based on the finite difference method, the finite element method, and the finite element method with a specific tangential motion. We offer rigorous proofs of quadratic convergence under $H^1$-norm for the first scheme and linear convergence under $H^1$-norm for the latter two schemes. All error estimates rely on the observation that the error of the nonlocal term can be controlled by the error of the local term. Furthermore, we explore the relationship between the convergence under $L^\infty$-norm and manifold distance. Extensive numerical experiments are conducted to verify the convergence analysis, and demonstrate the accuracy of our schemes under various norms for different types of nonlocal flows.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 34 pages, 9 figures

MSC Class: 65M60; 65M12; 35K55

arXiv:2405.06841 [pdf, other]

Bridging the Gap: Protocol Towards Fair and Consistent Affect Analysis

Authors: Guanyu Hu, Eleni Papadopoulou, Dimitrios Kollias, Paraskevi Tzouveli, Jie Wei, Xinyu Yang

Abstract: The increasing integration of machine learning algorithms in daily life underscores the critical need for fairness and equity in their deployment. As these technologies play a pivotal role in decision-making, addressing biases across diverse subpopulation groups, including age, gender, and race, becomes paramount. Automatic affect analysis, at the intersection of physiology, psychology, and machine learning, has seen significant development. However, existing databases and methodologies lack uniformity, leading to biased evaluations. This work addresses these issues by analyzing six affective databases, annotating demographic attributes, and proposing a common protocol for database partitioning. Emphasis is placed on fairness in evaluations. Extensive experiments with baseline and state-of-the-art methods demonstrate the impact of these changes, revealing the inadequacy of prior assessments. The findings underscore the importance of considering demographic attributes in affect analysis research and provide a foundation for more equitable methodologies. Our annotations, code and pre-trained models are available at: https://github.com/dkollias/Fair-Consistent-Affect-Analysis

Submitted 16 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

Comments: accepted at IEEE FG 2024

Showing 1–50 of 1,292 results for author: Wei, J