Intelligent Cross-Organizational Process Mining: A Survey and New Perspectives

Authors: Yiyuan Yang, Zheshun Wu, Yong Chu, Zhenghua Chen, Zenglin Xu, Qingsong Wen

Abstract: Process mining, as a high-level field in data mining, plays a crucial role in enhancing operational efficiency and decision-making across organizations. In this survey paper, we delve into the growing significance and ongoing trends in the field of process mining, advocating a specific viewpoint on its contents, application, and development in modern businesses and process management, particularly in cross-organizational settings. We first summarize the framework of process mining, common industrial applications, and the latest advances combined with artificial intelligence, such as workflow optimization, compliance checking, and performance analysis. Then, we propose a holistic framework for intelligent process analysis and outline initial methodologies in cross-organizational settings, highlighting both challenges and opportunities. This particular perspective aims to revolutionize process mining by leveraging artificial intelligence to offer sophisticated solutions for complex, multi-organizational data analysis. By integrating advanced machine learning techniques, we can enhance predictive capabilities, streamline processes, and facilitate real-time decision-making. Furthermore, we pinpoint avenues for future investigations within the research community, encouraging the exploration of innovative algorithms, data integration strategies, and privacy-preserving methods to fully harness the potential of process mining in diverse, interconnected business environments.

Submitted 15 July, 2024; originally announced July 2024.

Comments: Under review; 13 pages, 7 figures, 2 tables

arXiv:2406.17551 [pdf, other]

Sharing tripartite nonlocality sequentially using only projective measurements

Authors: Yiyang Xu, Hao Sun, Fenzhuo Guo, Haifeng Dong, Qiaoyan Wen

Abstract: Bell nonlocality is a valuable resource in quantum information processing tasks. Scientists are interested in whether a single entangled state can generate a long sequence of nonlocal correlations. Previous work has accomplished sequential tripartite nonlocality sharing through unsharp measurements. In this paper, we investigate the sharing of tripartite nonlocality using only projective measurements and sharing classical randomness. For the generalized GHZ state, we have demonstrated that using unbiased measurement choices, two Charlies can share the standard tripartite nonlocality with a single Alice and a single Bob, while at most one Charlie can share the genuine tripartite nonlocality with a single Alice and a single Bob. However, with biased measurement choices, the number of Charlies sharing the genuine tripartite nonlocality can be increased to two. Nonetheless, we find that using biased measurements does not increase the number of sequential observers sharing the standard tripartite nonlocality. Moreover, we provide the feasible range of double violation for the parameters of the measurement combination probability with respect to the state.

Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

Comments: 11 pages, 7 figures

arXiv:2406.13885 [pdf, other]

Knowledge Tagging System on Math Questions via LLMs with Flexible Demonstration Retriever

Authors: Hang Li, Tianlong Xu, Jiliang Tang, Qingsong Wen

Abstract: Knowledge tagging for questions plays a crucial role in contemporary intelligent educational applications, including learning progress diagnosis, practice question recommendations, and course content organization. Traditionally, these annotations are always conducted by pedagogical experts, as the task requires not only a strong semantic understanding of both question stems and knowledge definitions but also deep insights into connecting question-solving logic with corresponding knowledge concepts. With the recent emergence of advanced text encoding algorithms, such as pre-trained language models, many researchers have developed automatic knowledge tagging systems based on calculating the semantic similarity between the knowledge and question embeddings. In this paper, we explore automating the task using Large Language Models (LLMs), in response to the inability of prior encoding-based methods to deal with the hard cases which involve strong domain knowledge and complicated concept definitions. By showing the strong performance of zero- and few-shot results over math questions knowledge tagging tasks, we demonstrate LLMs' great potential in conquering the challenges faced by prior methods. Furthermore, by proposing a reinforcement learning-based demonstration retriever, we successfully exploit the great potential of different-sized LLMs in achieving better performance results while keeping the in-context demonstration usage efficiency high.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 13 pages, 6 figures

arXiv:2406.13434 [pdf, other]

Tactile Aware Dynamic Obstacle Avoidance in Crowded Environment with Deep Reinforcement Learning

Authors: Yung Chuen Ng, Qi Wen, Lim, Chun Ye Tan, Zhen Hao Gan, Meng Yee, Chuah

Abstract: Mobile robots operating in crowded environments require the ability to navigate among humans and surrounding obstacles efficiently while adhering to safety standards and socially compliant mannerisms. This scale of the robot navigation problem may be classified as both a local path planning and trajectory optimization problem. This work presents an array of force sensors that act as a tactile layer to complement the use of a LiDAR for the purpose of inducing awareness of contact with any surrounding objects within immediate vicinity of a mobile robot undetected by LiDARs. By incorporating the tactile layer, the robot can take more risks in its movements and possibly go right up to an obstacle or wall, and gently squeeze past it. In addition, we built up a simulation platform via Pybullet which integrates Robot Operating System (ROS) and reinforcement learning (RL) together. A touch-aware neural network model was trained on it to create an RL-based local path planner for dynamic obstacle avoidance. Our proposed method was demonstrated successfully on an omni-directional mobile robot who was able to navigate in a crowded environment with high agility and versatility in movement, while not being overly sensitive to nearby obstacles-not-in-contact.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.12747 [pdf, other]

TSI-Bench: Benchmarking Time Series Imputation

Authors: Wenjie Du, Jun Wang, Linglong Qian, Yiyuan Yang, Fanxing Liu, Zepu Wang, Zina Ibrahim, Haoxin Liu, Zhiyuan Zhao, Yingjie Zhou, Wenjia Wang, Kaize Ding, Yuxuan Liang, B. Aditya Prakash, Qingsong Wen

Abstract: Effective imputation is a crucial preprocessing step for time series analysis. Despite the development of numerous deep learning algorithms for time series imputation, the community lacks standardized and comprehensive benchmark platforms to effectively evaluate imputation performance across different settings. Moreover, although many deep learning forecasting algorithms have demonstrated excellent performance, whether their modeling achievements can be transferred to time series imputation tasks remains unexplored. To bridge these gaps, we develop TSI-Bench, the first (to our knowledge) comprehensive benchmark suite for time series imputation utilizing deep learning techniques. The TSI-Bench pipeline standardizes experimental settings to enable fair evaluation of imputation algorithms and identification of meaningful insights into the influence of domain-appropriate missingness ratios and patterns on model performance. Furthermore, TSI-Bench innovatively provides a systematic paradigm to tailor time series forecasting algorithms for imputation purposes. Our extensive study across 34,804 experiments, 28 algorithms, and 8 datasets with diverse missingness scenarios demonstrates TSI-Bench's effectiveness in diverse downstream tasks and potential to unlock future directions in time series imputation research and analysis. The source code and experiment logs are available at https://github.com/WenjieDu/AwesomeImputation.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.11903 [pdf, other]

A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges

Authors: Yuqi Nie, Yaxuan Kong, Xiaowen Dong, John M. Mulvey, H. Vincent Poor, Qingsong Wen, Stefan Zohren

Abstract: Recent advances in large language models (LLMs) have unlocked novel opportunities for machine learning applications in the financial domain. These models have demonstrated remarkable capabilities in understanding context, processing vast amounts of data, and generating human-preferred contents. In this survey, we explore the application of LLMs on various financial tasks, focusing on their potential to transform traditional practices and drive innovation. We provide a discussion of the progress and advantages of LLMs in financial contexts, analyzing their advanced technologies as well as prospective capabilities in contextual understanding, transfer learning flexibility, complex emotion detection, etc. We then highlight this survey for categorizing the existing literature into key application areas, including linguistic tasks, sentiment analysis, financial time series, financial reasoning, agent-based modeling, and other applications. For each application area, we delve into specific methodologies, such as textual analysis, knowledge-based analysis, forecasting, data augmentation, planning, decision support, and simulations. Furthermore, a comprehensive collection of datasets, model assets, and useful codes associated with mainstream applications are presented as resources for the researchers and practitioners. Finally, we outline the challenges and opportunities for future research, particularly emphasizing a number of distinctive aspects in this field. We hope our work can help facilitate the adoption and further development of LLMs in the financial sector.

Submitted 15 June, 2024; originally announced June 2024.

arXiv:2406.10252 [pdf, other]

AutoSurvey: Large Language Models Can Automatically Write Surveys

Authors: Yidong Wang, Qi Guo, Wenjin Yao, Hongbo Zhang, Xin Zhang, Zhen Wu, Meishan Zhang, Xinyu Dai, Min Zhang, Qingsong Wen, Wei Ye, Shikun Zhang, Yue Zhang

Abstract: This paper introduces AutoSurvey, a speedy and well-organized methodology for automating the creation of comprehensive literature surveys in rapidly evolving fields like artificial intelligence. Traditional survey paper creation faces challenges due to the vast volume and complexity of information, prompting the need for efficient survey methods. While large language models (LLMs) offer promise in automating this process, challenges such as context window limitations, parametric knowledge constraints, and the lack of evaluation benchmarks remain. AutoSurvey addresses these challenges through a systematic approach that involves initial retrieval and outline generation, subsection drafting by specialized LLMs, integration and refinement, and rigorous evaluation and iteration. Our contributions include a comprehensive solution to the survey problem, a reliable evaluation method, and experimental validation demonstrating AutoSurvey's effectiveness. We open our resources at https://github.com/AutoSurveys/AutoSurvey.

Submitted 17 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

arXiv:2406.08627 [pdf, other]

Time-MMD: A New Multi-Domain Multimodal Dataset for Time Series Analysis

Authors: Haoxin Liu, Shangqing Xu, Zhiyuan Zhao, Lingkai Kong, Harshavardhan Kamarthi, Aditya B. Sasanur, Megha Sharma, Jiaming Cui, Qingsong Wen, Chao Zhang, B. Aditya Prakash

Abstract: Time series data are ubiquitous across a wide range of real-world domains. While real-world time series analysis (TSA) requires human experts to integrate numerical series data with multimodal domain-specific knowledge, most existing TSA models rely solely on numerical data, overlooking the significance of information beyond numerical series. This oversight is due to the untapped potential of textual series data and the absence of a comprehensive, high-quality multimodal dataset. To overcome this obstacle, we introduce Time-MMD, the first multi-domain, multimodal time series dataset covering 9 primary data domains. Time-MMD ensures fine-grained modality alignment, eliminates data contamination, and provides high usability. Additionally, we develop MM-TSFlib, the first multimodal time-series forecasting (TSF) library, seamlessly pipelining multimodal TSF evaluations based on Time-MMD for in-depth analyses. Extensive experiments conducted on Time-MMD through MM-TSFlib demonstrate significant performance enhancements by extending unimodal TSF to multimodality, evidenced by over 15% mean squared error reduction in general, and up to 40% in domains with rich textual data. More importantly, our datasets and library revolutionize broader applications, impacts, research topics to advance TSA. The dataset and library are available at https://github.com/AdityaLab/Time-MMD and https://github.com/AdityaLab/MM-TSFlib.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.08487 [pdf, other]

Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models

Authors: Yi-Fan Zhang, Qingsong Wen, Chaoyou Fu, Xue Wang, Zhang Zhang, Liang Wang, Rong Jin

Abstract: Seeing clearly with high resolution is a foundation of Large Multimodal Models (LMMs), which has been proven to be vital for visual perception and reasoning. Existing works usually employ a straightforward resolution upscaling method, where the image consists of global and local branches, with the latter being the sliced image patches but resized to the same resolution as the former. This means that higher resolution requires more local patches, resulting in exorbitant computational expenses, and meanwhile, the dominance of local image tokens may diminish the global context. In this paper, we dive into the problems and propose a new framework as well as an elaborate optimization strategy. Specifically, we extract contextual information from the global view using a mixture of adapters, based on the observation that different adapters excel at different tasks. With regard to local patches, learnable query embeddings are introduced to reduce image tokens; the most important tokens accounting for the user question will be further selected by a similarity-based selector. Our empirical results demonstrate a 'less is more' pattern, where utilizing fewer but more informative local image tokens leads to improved performance. Besides, a significant challenge lies in the training strategy, as simultaneous end-to-end training of the global mining block and local compression block does not yield optimal results. We thus advocate for an alternating training way, ensuring balanced learning between global and local aspects. Finally, we also introduce a challenging dataset with high requirements for image detail, enhancing the training of the local compression layer. The proposed method, termed LMM with Sophisticated Tasks, Local image compression, and Mixture of global Experts (SliME), achieves leading performance across various benchmarks with only 2 million training data.

Submitted 13 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

Comments: Project page: https://github.com/yfzhang114/SliME

arXiv:2406.03710 [pdf, other]

TwinS: Revisiting Non-Stationarity in Multivariate Time Series Forecasting

Authors: Jiaxi Hu, Qingsong Wen, Sijie Ruan, Li Liu, Yuxuan Liang

Abstract: Recently, multivariate time series forecasting tasks have garnered increasing attention due to their significant practical applications, leading to the emergence of various deep forecasting models. However, real-world time series exhibit pronounced non-stationary distribution characteristics. These characteristics are not solely limited to time-varying statistical properties highlighted by non-stationary Transformer but also encompass three key aspects: nested periodicity, absence of periodic distributions, and hysteresis among time variables. In this paper, we begin by validating this theory through wavelet analysis and propose the Transformer-based TwinS model, which consists of three modules to address the non-stationary periodic distributions: Wavelet Convolution, Period-Aware Attention, and Channel-Temporal Mixed MLP. Specifically, the Wavelet Convolution models nested periods by scaling the convolution kernel size like wavelet transform. The Period-Aware Attention guides attention computation by generating period relevance scores through a convolutional sub-network. The Channel-Temporal Mixed MLP captures the overall relationships between time series through channel-time mixing learning. TwinS achieves SOTA performance compared to mainstream TS models, with a maximum improvement in MSE of 25.8% over PatchTST.

Submitted 14 July, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

arXiv:2406.00317 [pdf, other]

Combining Experimental and Historical Data for Policy Evaluation

Authors: Ting Li, Chengchun Shi, Qianglin Wen, Yang Sui, Yongli Qin, Chunbo Lai, Hongtu Zhu

Abstract: This paper studies policy evaluation with multiple data sources, especially in scenarios that involve one experimental dataset with two arms, complemented by a historical dataset generated under a single control arm. We propose novel data integration methods that linearly integrate base policy value estimators constructed based on the experimental and historical data, with weights optimized to minimize the mean square error (MSE) of the resulting combined estimator. We further apply the pessimistic principle to obtain more robust estimators, and extend these developments to sequential decision making. Theoretically, we establish non-asymptotic error bounds for the MSEs of our proposed estimators, and derive their oracle, efficiency and robustness properties across a broad spectrum of reward shift scenarios. Numerical experiments and real-data-based analyses from a ridesharing company demonstrate the superior performance of the proposed estimators.

Submitted 1 June, 2024; originally announced June 2024.
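
Illustrative note on the entry above: the abstract describes linearly combining estimators with weights chosen to minimize MSE. The sketch below is a generic textbook version of that idea, not the authors' estimator. It assumes two independent estimators, an unbiased experimental one and a possibly biased historical one, with known variances and bias; the function names `optimal_weight` and `combined_mse` are hypothetical.

```python
def combined_mse(w, var_exp, var_hist, bias_hist):
    """MSE of (1 - w) * exp_estimate + w * hist_estimate.

    Assumes the two estimators are independent, the experimental one is
    unbiased, and the historical one has bias `bias_hist` (an assumption
    standing in for a reward shift between the data sources).
    """
    variance = (1 - w) ** 2 * var_exp + w ** 2 * var_hist
    squared_bias = (w * bias_hist) ** 2
    return variance + squared_bias


def optimal_weight(var_exp, var_hist, bias_hist):
    """Weight on the historical estimator that minimizes combined_mse.

    Setting the derivative of the quadratic in w to zero gives
    w* = var_exp / (var_exp + var_hist + bias_hist**2).
    """
    return var_exp / (var_exp + var_hist + bias_hist ** 2)


if __name__ == "__main__":
    for bias in (0.0, 0.5, 10.0):
        w = optimal_weight(var_exp=1.0, var_hist=0.5, bias_hist=bias)
        print(bias, w, combined_mse(w, 1.0, 0.5, bias))
```

Under these assumptions the weight on the historical estimator shrinks toward zero as its bias grows, so the combination falls back to the experimental estimator when the two sources disagree strongly.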

arXiv:2405.18910 [pdf, other]

Predicting Parking Availability in Singapore with Cross-Domain Data: A New Dataset and A Data-Driven Approach

Authors: Huaiwu Zhang, Yutong Xia, Siru Zhong, Kun Wang, Zekun Tong, Qingsong Wen, Roger Zimmermann, Yuxuan Liang

Abstract: The increasing number of vehicles highlights the need for efficient parking space management. Predicting real-time Parking Availability (PA) can help mitigate traffic congestion and the corresponding social problems, which is a pressing issue in densely populated cities like Singapore. In this study, we aim to collectively predict future PA across Singapore with complex factors from various domains. The contributions in this paper are listed as follows: (1) A New Dataset: We introduce the SINPA dataset, containing a year's worth of PA data from 1,687 parking lots in Singapore, enriched with various spatial and temporal factors. (2) A Data-Driven Approach: We present DeepPA, a novel deep-learning framework, to collectively and efficiently predict future PA across thousands of parking lots. (3) Extensive Experiments and Deployment: DeepPA demonstrates a 9.2% reduction in prediction error for up to 3-hour forecasts compared to existing advanced models. Furthermore, we implement DeepPA in a practical web-based platform to provide real-time PA predictions to aid drivers and inform urban planning for the governors in Singapore. We release the dataset and source code at https://github.com/yoshall/SINPA.

Submitted 29 May, 2024; originally announced May 2024.

Comments: Accepted by IJCAI 2024 (Multi-Year Track On AI And Social Good with ~20% acceptance rate)

arXiv:2405.16312 [pdf, other]

Time-SSM: Simplifying and Unifying State Space Models for Time Series Forecasting

Authors: Jiaxi Hu, Disen Lan, Ziyu Zhou, Qingsong Wen, Yuxuan Liang

Abstract: State Space Models (SSMs) have emerged as a potent tool in sequence modeling tasks in recent years. These models approximate continuous systems using a set of basis functions and discretize them to handle input data, making them well-suited for modeling time series data collected at specific frequencies from continuous systems. Despite its potential, the application of SSMs in time series forecasting remains underexplored, with most existing models treating SSMs as a black box for capturing temporal or channel dependencies. To address this gap, this paper proposes a novel theoretical framework termed Dynamic Spectral Operator, offering more intuitive and general guidance on applying SSMs to time series data. Building upon our theory, we introduce Time-SSM, a novel SSM-based foundation model with only one-seventh of the parameters compared to Mamba. Various experiments validate both our theoretical framework and the superior performance of Time-SSM.

Submitted 14 July, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

Comments: arXiv admin note: text overlap with arXiv:2402.11463

arXiv:2405.15145 [pdf, other]

CulturePark: Boosting Cross-cultural Understanding in Large Language Models

Authors: Cheng Li, Damien Teney, Linyi Yang, Qingsong Wen, Xing Xie, Jindong Wang

Abstract: Cultural bias is pervasive in many large language models (LLMs), largely due to the deficiency of data representative of different cultures. Typically, cultural datasets and benchmarks are constructed either by extracting subsets of existing datasets or by aggregating from platforms such as Wikipedia and social media. However, these approaches are highly dependent on real-world data and human annotations, making them costly and difficult to scale. Inspired by cognitive theories on social communication, this paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection. CulturePark simulates cross-cultural human communication with LLM-based agents playing roles in different cultures. It generates high-quality cross-cultural dialogues encapsulating human beliefs, norms, and customs. Using CulturePark, we generated 41,000 cultural samples to fine-tune eight culture-specific LLMs. We evaluated these models across three downstream tasks: content moderation, cultural alignment, and cultural education. Results show that for content moderation, our GPT-3.5-based models either match or outperform GPT-4 on datasets. Regarding cultural alignment, our models surpass GPT-4 on Hofstede's VSM 13 framework. Furthermore, for cultural education of human participants, our models demonstrate superior outcomes in both learning efficacy and user experience compared to GPT-4. CulturePark proves an important step in addressing cultural bias and advancing the democratization of AI, highlighting the critical role of culturally inclusive data in model training.

Submitted 23 May, 2024; originally announced May 2024.

Comments: Technical report; 28 pages

arXiv:2405.14252 [pdf, other]

Time-FFM: Towards LM-Empowered Federated Foundation Model for Time Series Forecasting

Authors: Qingxiang Liu, Xu Liu, Chenghao Liu, Qingsong Wen, Yuxuan Liang

Abstract: Unlike natural language processing and computer vision, the development of Foundation Models (FMs) for time series forecasting is blocked due to data scarcity. While recent efforts are focused on building such FMs by unlocking the potential of language models (LMs) for time series analysis, dedicated parameters for various downstream forecasting tasks need training, which hinders the common knowledge sharing across domains. Moreover, data owners may hesitate to share the access to local data due to privacy concerns and copyright protection, which makes it impossible to simply construct a FM on cross-domain training instances. To address these issues, we propose Time-FFM, a Federated Foundation Model for Time series forecasting by leveraging pretrained LMs. Specifically, we begin by transforming time series into the modality of text tokens. To bootstrap LMs for time series reasoning, we propose a prompt adaption module to determine domain-customized prompts dynamically instead of artificially. Given the data heterogeneity across domains, we design a personalized federated training strategy by learning global encoders and local prediction heads. Our comprehensive experiments indicate that Time-FFM outperforms state-of-the-arts and promises effective few-shot and zero-shot forecaster.

Submitted 25 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.10959 [pdf, other]

Foundation Models for Education: Promises and Prospects

Authors: Tianlong Xu, Richard Tong, Jing Liang, Xing Fan, Haoyang Li, Qingsong Wen

Abstract: With the advent of foundation models like ChatGPT, educators are excited about the transformative role that AI might play in propelling the next education revolution. The developing speed and the profound impact of foundation models in various industries force us to think deeply about the changes they will make to education, a domain that is critically important for the future of humans. In this paper, we discuss the strengths of foundation models, such as personalized learning, education inequality, and reasoning capabilities, as well as the development of agent architecture tailored for education, which integrates AI agents with pedagogical frameworks to create adaptive learning environments. Furthermore, we highlight the risks and opportunities of AI overreliance and creativity. Lastly, we envision a future where foundation models in education harmonize human and AI capabilities, fostering a dynamic, inclusive, and adaptive educational ecosystem.

Submitted 8 April, 2024; originally announced May 2024.

Comments: Accepted by IEEE Intelligent Systems

arXiv:2405.10800 [pdf, other]

Heterogeneity-Informed Meta-Parameter Learning for Spatiotemporal Time Series Forecasting

Authors: Zheng Dong, Renhe Jiang, Haotian Gao, Hangchen Liu, Jinliang Deng, Qingsong Wen, Xuan Song

Abstract: Spatiotemporal time series forecasting plays a key role in a wide range of real-world applications. While significant progress has been made in this area, fully capturing and leveraging spatiotemporal heterogeneity remains a fundamental challenge. Therefore, we propose a novel Heterogeneity-Informed Meta-Parameter Learning scheme. Specifically, our approach implicitly captures spatiotemporal heterogeneity through learning spatial and temporal embeddings, which can be viewed as a clustering process. Then, a novel spatiotemporal meta-parameter learning paradigm is proposed to learn spatiotemporal-specific parameters from meta-parameter pools, which is informed by the captured heterogeneity. Based on these ideas, we develop a Heterogeneity-Informed Spatiotemporal Meta-Network (HimNet) for spatiotemporal time series forecasting. Extensive experiments on five widely-used benchmarks demonstrate our method achieves state-of-the-art performance while exhibiting superior interpretability. Our code is available at https://github.com/XDZhelheim/HimNet.

Submitted 17 May, 2024; originally announced May 2024.

Comments: Accepted by KDD'24 Research Track

arXiv:2405.05614 [pdf, other]

doi 10.1016/j.imavis.2024.104924

Depth Awakens: A Depth-perceptual Attention Fusion Network for RGB-D Camouflaged Object Detection

Authors: Xinran Liu, Lin Qi, Yuxuan Song, Qi Wen

Abstract: Camouflaged object detection (COD) presents a persistent challenge in accurately identifying objects that seamlessly blend into their surroundings. However, most existing COD models overlook the fact that visual systems operate within a genuine 3D environment. The scene depth inherent in a single 2D image provides rich spatial clues that can assist in the detection of camouflaged objects. Therefore, we propose a novel depth-perception attention fusion network that leverages the depth map as an auxiliary input to enhance the network's ability to perceive 3D information, which is typically challenging for the human eye to discern from 2D images. The network uses a trident-branch encoder to extract chromatic and depth information and their communications. Recognizing that certain regions of a depth map may not effectively highlight the camouflaged object, we introduce a depth-weighted cross-attention fusion module to dynamically adjust the fusion weights on depth and RGB feature maps. To keep the model simple without compromising effectiveness, we design a straightforward feature aggregation decoder that adaptively fuses the enhanced aggregated features. Experiments demonstrate the significant superiority of our proposed method over other state-of-the-art methods, which further validates the contribution of depth information in camouflaged object detection. The code will be available at https://github.com/xinran-liu00/DAF-Net.

Submitted 9 May, 2024; originally announced May 2024.

Journal ref: Image and Vision Computing, 143:104924, 2024

arXiv:2405.04303 [pdf, other]

Progressive Quantum Algorithm for Quantum Alternating Operator Ansatz

Authors: Xiao-Hui Ni, Yan-Qi Song, Ling-Xiao Li, Su-Juan Qin, Fei Gao, Qiao-Yan Wen

Abstract: Recently, Hadfield has proposed a novel Quantum Alternating Operator Ansatz (QAOA+) to tackle Constrained Combinatorial Optimization Problems (CCOPs), and it has wide applications. However, the large requirement of multi-qubit controlled gates in QAOA+ limits its applications in solving larger-scale CCOPs. To mitigate the resource overhead of QAOA+, we introduce an approach termed Progressive Quantum Algorithm (PQA). In this paper, the concept and performance of PQA are introduced focusing on the Maximal Independent Set (MIS) problem. PQA aims to yield the solution of the target graph $G$ with fewer resources by solving the MIS problem on a desired derived subgraph that has the same MIS solution as $G$ but has a much smaller graph size. To construct such a desired subgraph, PQA gradually and regularly expands the graph size starting from a well-designed initial subgraph. After each expansion, PQA solves the MIS problem on the current subgraph using QAOA+ and estimates whether the current graph has the same MIS solution as the target graph. PQA repeats the graph expansion and solving process until reaching the stop condition. In our simulations, the performance of PQA is benchmarked on Erdős-Rényi (ER) and regular graphs. The simulation results suggest that PQA achieves a higher average approximation ratio (AAR) and significant quantum resource savings compared with directly solving the original problem using QAOA+ (DS-QAOA+) at the same level depth $p$.
Remarkably, the AAR obtained by PQA is $12.9305\%$ ($4.8645\%$) higher than DS-QAOA+ on ER (regular) graphs, and the average number of multi-qubit gates (qubits) consumed by PQA is 1/3 (1/2) of that of DS-QAOA+. The remarkable efficiency of PQA makes it possible to solve larger-scale CCOPs on the current quantum devices.

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.01510 [pdf, other]

Reverse Influential Community Search Over Social Networks (Technical Report)

Authors: Qi Wen, Nan Zhang, Yutong Ye, Xiang Lian, Mingsong Chen

Abstract: As an important fundamental task of numerous real-world applications such as social network analysis and online advertising/marketing, several prior works studied influential community search, which retrieves a community with high structural cohesiveness and maximum influences on other users in social networks. However, previous works usually considered the influences of the community on arbitrary users in social networks, rather than specific groups (e.g., customer groups, or senior communities). Inspired by this, we propose a novel Reverse Influential Community Search (RICS) problem, which obtains a seed community with the maximum influence on a user-specified target community, satisfying both structural and keyword constraints. To efficiently tackle the RICS problem, we design effective pruning strategies to filter out false alarms of candidate seed communities, and propose an effective index mechanism to facilitate the community retrieval. We also formulate and tackle an RICS variant, named Relaxed Reverse Influential Community Search (R2ICS), which returns a subgraph with the relaxed structural constraints and having the maximum influence on a user-specified target community. Comprehensive experiments have been conducted to verify the efficiency and effectiveness of our RICS and R2ICS approaches on both real-world and synthetic social networks under various parameter settings.

Submitted 7 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

arXiv:2404.18886 [pdf, other]

A Survey on Diffusion Models for Time Series and Spatio-Temporal Data

Authors: Yiyuan Yang, Ming Jin, Haomin Wen, Chaoli Zhang, Yuxuan Liang, Lintao Ma, Yi Wang, Chenghao Liu, Bin Yang, Zenglin Xu, Jiang Bian, Shirui Pan, Qingsong Wen

Abstract: The study of time series is crucial for understanding trends and anomalies over time, enabling predictive insights across various sectors. Spatio-temporal data, on the other hand, is vital for analyzing phenomena in both space and time, providing a dynamic perspective on complex system interactions. Recently, diffusion models have seen widespread application in time series and spatio-temporal data mining. Not only do they enhance the generative and inferential capabilities for sequential and temporal data, but they also extend to other downstream tasks. In this survey, we comprehensively and thoroughly review the use of diffusion models in time series and spatio-temporal data, categorizing them by model category, task type, data modality, and practical application domain. In detail, we categorize diffusion models into unconditioned and conditioned types and discuss time series and spatio-temporal data separately. Unconditioned models, which operate unsupervised, are subdivided into probability-based and score-based models, serving predictive and generative tasks such as forecasting, anomaly detection, classification, and imputation. Conditioned models, on the other hand, utilize extra information to enhance performance and are similarly divided for both predictive and generative tasks. Our survey extensively covers their application in various fields, including healthcare, recommendation, climate, energy, audio, and transportation, providing a foundational understanding of how these models analyze and generate data.
Through this structured overview, we aim to provide researchers and practitioners with a comprehensive understanding of diffusion models for time series and spatio-temporal data analysis, aiming to direct future innovations and applications by addressing traditional challenges and exploring innovative solutions within the diffusion model framework.

Submitted 11 June, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

Comments: Ongoing work & Under review; 27 pages, 8 figures, 2 tables; Github Repo: https://github.com/yyysjz1997/Awesome-TimeSeries-SpatioTemporal-Diffusion-Model

arXiv:2404.11269 [pdf, other]

DACAD: Domain Adaptation Contrastive Learning for Anomaly Detection in Multivariate Time Series

Authors: Zahra Zamanzadeh Darban, Yiyuan Yang, Geoffrey I. Webb, Charu C. Aggarwal, Qingsong Wen, Mahsa Salehi

Abstract: In time series anomaly detection (TSAD), the scarcity of labeled data poses a challenge to the development of accurate models. Unsupervised domain adaptation (UDA) offers a solution by leveraging labeled data from a related domain to detect anomalies in an unlabeled target domain. However, existing UDA methods assume consistent anomalous classes across domains. To address this limitation, we propose a novel Domain Adaptation Contrastive learning model for Anomaly Detection in multivariate time series (DACAD), combining UDA with contrastive learning. DACAD utilizes an anomaly injection mechanism that enhances generalization across unseen anomalous classes, improving adaptability and robustness. Additionally, our model employs supervised contrastive loss for the source domain and self-supervised contrastive triplet loss for the target domain, ensuring comprehensive feature representation learning and domain-invariant feature extraction. Finally, an effective Centre-based Entropy Classifier (CEC) accurately learns normal boundaries in the source domain. Extensive evaluations on multiple real-world datasets and a synthetic dataset highlight DACAD's superior performance in transferring knowledge across domains and mitigating the challenge of limited labeled data in TSAD.

Submitted 11 July, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

Comments: 11 pages, 3 figures, 6 tables

arXiv:2404.06397 [pdf, other]

New Contributions to $b \to s γ$ in Minimal G2HDM

Authors: Che Hao Liu, Van Que Tran, Qiaoyi Wen, Fanrong Xu, Tzu-Chiang Yuan

Abstract: We study the flavor-changing bottom quark radiative decay $b \to s γ$ induced at one-loop level within the minimal gauged two-Higgs-doublet model (G2HDM). Among the three new contributions to this rare process in G2HDM, we find that only the charged Higgs $\mathcal{H^\pm}$ contribution can be constrained by the current global fit data in $B$-physics. The other two contributions, from the complex vectorial dark matter $\mathcal{W}$ and dark Higgs $\mathcal{D}$, are not sensitive to the current data. Combining with theoretical constraints imposed on the scalar potential and electroweak precision data for the oblique parameters, we exclude mass regions $m_{\mathcal{H}^\pm} \lesssim 250$ GeV and $m_{\mathcal{D}} \lesssim 100$ GeV at the 95\% confidence level.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 35 pages, 8 figures

arXiv:2404.01606 [pdf]

Haina Storage: A Decentralized Secure Storage Framework Based on Improved Blockchain Structure

Authors: Zijian Zhou, Caimei Wang, Xiaoheng Deng, Jianhao Lu, Qilue Wen, Chen Zhang, Hong Li

Abstract: Although decentralized storage technology based on the blockchain can effectively realize secure data storage on cloud services, there are still some problems in the existing schemes, such as low storage capacity and low efficiency. To address these issues, we propose a novel decentralized storage framework, which mainly includes four aspects: (1) We propose a Bi-direction Circular Linked Chain Structure (BCLCS), which improves data's storage capacity and applicability in decentralized storage. (2) A Proof of Resources (PoR) decision model is proposed. By introducing the network environment as an essential evaluation parameter of storage right decision, the energy and time consumption of decision-making are reduced, and the fairness of decision-making is improved. (3) A chain structure dynamic locking mechanism (CSDLM) is designed to realize anti-traverse and access control. (4) A Bi-directional data Access Mechanism (BDAM) is proposed, which improves the efficiency of data access and acquisition in decentralized storage mode. The experimental results show that the framework significantly mitigates the shortcomings of current decentralized storage.

Submitted 1 April, 2024; originally announced April 2024.

Comments: 24 pages, 21 figures

arXiv:2403.18105 [pdf, other]

Large Language Models for Education: A Survey and Outlook

Authors: Shen Wang, Tianlong Xu, Hang Li, Chaoli Zhang, Joleen Liang, Jiliang Tang, Philip S. Yu, Qingsong Wen

Abstract: The advent of Large Language Models (LLMs) has brought in a new era of possibilities in the realm of education. This survey paper summarizes the various technologies of LLMs in educational settings from multifaceted perspectives, encompassing student and teacher assistance, adaptive learning, and commercial tools. We systematically review the technological advancements in each perspective, organize related datasets and benchmarks, and identify the risks and challenges associated with deploying LLMs in education. Furthermore, we outline future research opportunities, highlighting the potential promising directions. Our survey aims to provide a comprehensive technological picture for educators, researchers, and policymakers to harness the power of LLMs to revolutionize educational practices and foster a more effective personalized learning environment.

Submitted 1 April, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

arXiv:2403.17285 [pdf, other]

An Analysis of Switchback Designs in Reinforcement Learning

Authors: Qianglin Wen, Chengchun Shi, Ying Yang, Niansheng Tang, Hongtu Zhu

Abstract: This paper offers a detailed investigation of switchback designs in A/B testing, which alternate between baseline and new policies over time. Our aim is to thoroughly evaluate the effects of these designs on the accuracy of their resulting average treatment effect (ATE) estimators. We propose a novel "weak signal analysis" framework, which substantially simplifies the calculations of the mean squared errors (MSEs) of these ATEs in Markov decision process environments. Our findings suggest that (i) when the majority of reward errors are positively correlated, the switchback design is more efficient than the alternating-day design, which switches policies on a daily basis. Additionally, increasing the frequency of policy switches tends to reduce the MSE of the ATE estimator. (ii) When the errors are uncorrelated, however, all these designs become asymptotically equivalent. (iii) In cases where the majority of errors are negatively correlated, the alternating-day design becomes the optimal choice. These insights are crucial, offering guidelines for practitioners on designing experiments in A/B testing. Our analysis accommodates a variety of policy value estimators, including model-based estimators, least squares temporal difference learning estimators, and double reinforcement learning estimators, thereby offering a comprehensive understanding of optimal design strategies for policy evaluation in reinforcement learning.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.17281 [pdf, other]

Automate Knowledge Concept Tagging on Math Questions with LLMs

Authors: Hang Li, Tianlong Xu, Jiliang Tang, Qingsong Wen

Abstract: Knowledge concept tagging for questions plays a crucial role in contemporary intelligent educational applications, including learning progress diagnosis, practice question recommendations, and course content organization. Traditionally, these annotations have been conducted manually with help from pedagogical experts, as the task requires not only a strong semantic understanding of both question stems and knowledge definitions but also deep insights into connecting question-solving logic with corresponding knowledge concepts. In this paper, we explore automating the tagging task using Large Language Models (LLMs), in response to the inability of prior manual methods to meet the rapidly growing demand for concept tagging in questions posed by advanced educational applications. Moreover, the zero/few-shot learning capability of LLMs makes them well-suited for application in educational scenarios, which often face challenges in collecting large-scale, expertise-annotated datasets. By conducting extensive experiments with a variety of representative LLMs, we demonstrate that LLMs are a promising tool for concept tagging in math questions. Furthermore, through case studies examining the results from different LLMs, we draw some empirical conclusions about the key factors for success in applying LLMs to the automatic concept tagging task.

Submitted 25 March, 2024; originally announced March 2024.

Comments: 7 pages, 2 figures

arXiv:2403.16831 [pdf, other]

UrbanVLP: Multi-Granularity Vision-Language Pretraining for Urban Region Profiling

Authors: Xixuan Hao, Wei Chen, Yibo Yan, Siru Zhong, Kun Wang, Qingsong Wen, Yuxuan Liang

Abstract: Urban region profiling aims to learn a low-dimensional representation of a given urban area while preserving its characteristics, such as demographics, infrastructure, and economic activities, for urban planning and development. However, prevalent pretrained models, particularly those reliant on satellite imagery, face dual challenges. Firstly, concentrating solely on macro-level patterns from satellite data may introduce bias, lacking nuanced details at micro levels, such as architectural details at a place. Secondly, the lack of interpretability in pretrained models limits their utility in providing transparent evidence for urban planning. In response to these issues, we devise a novel framework entitled UrbanVLP based on Vision-Language Pretraining. Our UrbanVLP seamlessly integrates multi-granularity information from both macro (satellite) and micro (street-view) levels, overcoming the limitations of prior pretrained models. Moreover, it introduces automatic text generation and calibration, elevating interpretability in downstream applications by producing high-quality text descriptions of urban imagery. Rigorous experiments conducted across six urban indicator prediction tasks underscore its superior performance.

Submitted 29 May, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: Preprint

arXiv:2403.14949 [pdf, other]

Addressing Concept Shift in Online Time Series Forecasting: Detect-then-Adapt

Authors: YiFan Zhang, Weiqi Chen, Zhaoyang Zhu, Dalin Qin, Liang Sun, Xue Wang, Qingsong Wen, Zhang Zhang, Liang Wang, Rong Jin

Abstract: Online updating of time series forecasting models aims to tackle the challenge of concept drifting by adjusting forecasting models based on streaming data. While numerous algorithms have been developed, most of them focus on model design and updating. In practice, many of these methods struggle with continuous performance regression in the face of accumulated concept drifts over time. To address this limitation, we present a novel approach, Concept Drift Detection anD Adaptation (D3A), that first detects drifting concepts and then aggressively adapts the current model to the drifted concepts after the detection for rapid adaption. To best harness the utility of historical data for model adaptation, we propose a data augmentation strategy introducing Gaussian noise into existing training instances. It helps mitigate the data distribution gap, a critical factor contributing to train-test performance inconsistency. The significance of our data augmentation process is verified by our theoretical analysis. Our empirical studies across six datasets demonstrate the effectiveness of D3A in improving model adaptation capability. Notably, compared to a simple Temporal Convolutional Network (TCN) baseline, D3A reduces the average Mean Squared Error (MSE) by $43.9\%$. For the state-of-the-art (SOTA) model, the MSE is reduced by $33.3\%$.

Submitted 22 March, 2024; originally announced March 2024.

Comments: 7 figures, 14 pages. arXiv admin note: text overlap with arXiv:2309.12659

arXiv:2403.14735 [pdf, other]

doi 10.1145/3637528.3671451

Foundation Models for Time Series Analysis: A Tutorial and Survey

Authors: Yuxuan Liang, Haomin Wen, Yuqi Nie, Yushan Jiang, Ming Jin, Dongjin Song, Shirui Pan, Qingsong Wen

Abstract: Time series analysis stands as a focal point within the data mining community, serving as a cornerstone for extracting valuable insights crucial to a myriad of real-world applications. Recent advances in Foundation Models (FMs) have fundamentally reshaped the paradigm of model design for time series analysis, boosting various downstream tasks in practice. These innovative approaches often leverage pre-trained or fine-tuned FMs to harness generalized knowledge tailored for time series analysis. This survey aims to furnish a comprehensive and up-to-date overview of FMs for time series analysis. While prior surveys have predominantly focused on either application or pipeline aspects of FMs in time series analysis, they have often lacked an in-depth understanding of the underlying mechanisms that elucidate why and how FMs benefit time series analysis. To address this gap, our survey adopts a methodology-centric classification, delineating various pivotal elements of time-series FMs, including model architectures, pre-training techniques, adaptation methods, and data modalities. Overall, this survey serves to consolidate the latest advancements in FMs pertinent to time series analysis, accentuating their theoretical underpinnings, recent strides in development, and avenues for future exploration.

Submitted 18 June, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

Comments: In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'24)

arXiv:2403.14689 [pdf, other]

Developing and Deploying Industry Standards for Artificial Intelligence in Education (AIED): Challenges, Strategies, and Future Directions

Authors: Richard Tong, Haoyang Li, Joleen Liang, Qingsong Wen

Abstract: The adoption of Artificial Intelligence in Education (AIED) holds the promise of revolutionizing educational practices by offering personalized learning experiences, automating administrative and pedagogical tasks, and reducing the cost of content creation. However, the lack of standardized practices in the development and deployment of AIED solutions has led to fragmented ecosystems, which presents challenges in interoperability, scalability, and ethical governance. This article aims to address the critical need to develop and implement industry standards in AIED, offering a comprehensive analysis of the current landscape, challenges, and strategic approaches to overcome these obstacles. We begin by examining the various applications of AIED in various educational settings and identify key areas lacking in standardization, including system interoperability, ontology mapping, data integration, evaluation, and ethical governance. Then, we propose a multi-tiered framework for establishing robust industry standards for AIED. In addition, we discuss methodologies for the iterative development and deployment of standards, incorporating feedback loops from real-world applications to refine and adapt standards over time. The paper also highlights the role of emerging technologies and pedagogical theories in shaping future standards for AIED. Finally, we outline a strategic roadmap for stakeholders to implement these standards, fostering a cohesive and ethical AIED ecosystem.
By establishing comprehensive industry standards, such as those by IEEE Artificial Intelligence Standards Committee (AISC) and International Organization for Standardization (ISO), we can accelerate and scale AIED solutions to improve educational outcomes, ensuring that technological advances align with the principles of inclusivity, fairness, and educational excellence.

Submitted 25 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

Comments: 12 pages

arXiv:2403.14151 [pdf, other]

Deep Learning for Trajectory Data Management and Mining: A Survey and Beyond

Authors: Wei Chen, Yuxuan Liang, Yuanshao Zhu, Yanchuan Chang, Kang Luo, Haomin Wen, Lei Li, Yanwei Yu, Qingsong Wen, Chao Chen, Kai Zheng, Yunjun Gao, Xiaofang Zhou, Yu Zheng

Abstract: Trajectory computing is a pivotal domain encompassing trajectory data management and mining, garnering widespread attention due to its crucial role in various practical applications such as location services, urban traffic, and public safety. Traditional methods, focusing on simplistic spatio-temporal features, face challenges of complex calculations, limited scalability, and inadequate adaptability to real-world complexities. In this paper, we present a comprehensive review of the development and recent advances in deep learning for trajectory computing (DL4Traj). We first define trajectory data and provide a brief overview of widely-used deep learning models. Systematically, we explore deep learning applications in trajectory management (pre-processing, storage, analysis, and visualization) and mining (trajectory-related forecasting, trajectory-related recommendation, trajectory classification, travel time estimation, anomaly detection, and mobility generation). Notably, we encapsulate recent advancements in Large Language Models (LLMs) that hold the potential to augment trajectory computing. Additionally, we summarize application scenarios, public datasets, and toolkits. Finally, we outline current challenges in DL4Traj research and propose future directions. Relevant papers and open-source resources have been collated and are continuously updated at the DL4Traj Repo: https://github.com/yoshall/Awesome-Trajectory-Computing.

Submitted 21 March, 2024; originally announced March 2024.

Comments: 25 pages, 12 figures, 5 tables

arXiv:2403.09953 [pdf, other]

Online GNN Evaluation Under Test-time Graph Distribution Shifts

Authors: Xin Zheng, Dongjin Song, Qingsong Wen, Bo Du, Shirui Pan

Abstract: Evaluating the performance of a well-trained GNN model on real-world graphs is a pivotal step for reliable GNN online deployment and serving. Due to a lack of test node labels and unknown potential training-test graph data distribution shifts, conventional model evaluation encounters limitations in calculating performance metrics (e.g., test error) and measuring graph data-level discrepancies, particularly when the training graph used for developing GNNs remains unobserved during test time. In this paper, we study a new research problem, online GNN evaluation, which aims to provide valuable insights into the well-trained GNNs' ability to effectively generalize to real-world unlabeled graphs under the test-time graph distribution shifts. Concretely, we develop an effective learning behavior discrepancy score, dubbed LeBeD, to estimate the test-time generalization errors of well-trained GNN models. Through a novel GNN re-training strategy with a parameter-free optimality criterion, the proposed LeBeD comprehensively integrates learning behavior discrepancies from both node prediction and structure reconstruction perspectives. This enables the effective evaluation of the well-trained GNNs' ability to capture test node semantics and structural representations, making it an expressive metric for estimating the generalization error in online GNN evaluation. Extensive experiments on real-world test graphs under diverse graph distribution shifts verify the effectiveness of the proposed method, revealing its strong correlation with ground-truth test errors on various well-trained GNN models.

Submitted 14 March, 2024; originally announced March 2024.

Comments: Accepted by ICLR-2024

arXiv:2403.09318 [pdf, other]

A Hierarchical Fused Quantum Fuzzy Neural Network for Image Classification

Authors: Sheng-Yao Wu, Run-Ze Li, Yan-Qi Song, Su-Juan Qin, Qiao-Yan Wen, Fei Gao

Abstract: Neural networks are a powerful learning paradigm for data feature learning in the era of big data. However, most neural network models are deterministic models that ignore the uncertainty of data. Fuzzy neural networks are proposed to address this problem. FDNN is a hierarchical deep neural network that derives information from both fuzzy and neural representations; these representations are then fused to form the representation to be classified. FDNN performs well on uncertain data classification tasks. In this paper, we propose a novel hierarchical fused quantum fuzzy neural network (HQFNN). Different from classical FDNN, HQFNN uses quantum neural networks to learn fuzzy membership functions in the fuzzy neural network. We conduct simulated experiments on two types of datasets (Dirty-MNIST and 15-Scene); the results show that the proposed model can outperform several existing methods. In addition, we demonstrate the robustness of the proposed quantum circuit.

Submitted 14 March, 2024; originally announced March 2024.

arXiv:2403.05262 [pdf, other]

Debiasing Multimodal Large Language Models

Authors: Yi-Fan Zhang, Weichen Yu, Qingsong Wen, Xue Wang, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan

Abstract: In the realms of computer vision and natural language processing, Large Vision-Language Models (LVLMs) have become indispensable tools, proficient in generating textual descriptions based on visual inputs. Despite their advancements, our investigation reveals a noteworthy bias in the generated content, where the output is primarily influenced by the underlying Large Language Models (LLMs) prior rather than the input image. Our empirical experiments underscore the persistence of this bias, as LVLMs often provide confident answers even in the absence of relevant images or given incongruent visual input. To rectify these biases and redirect the model's focus toward vision information, we introduce two simple, training-free strategies. Firstly, for tasks such as classification or multi-choice question-answering (QA), we propose a "calibration" step through affine transformation to adjust the output distribution. This "Post-Hoc debias" approach ensures uniform scores for each answer when the image is absent, serving as an effective regularization technique to alleviate the influence of LLM priors. For more intricate open-ended generation tasks, we extend this method to "Debias sampling", drawing inspiration from contrastive decoding methods. Furthermore, our investigation sheds light on the instability of LVLMs across various decoding configurations. Through systematic exploration of different settings, we significantly enhance performance, surpassing reported results and raising concerns about the fairness of existing evaluations. Comprehensive experiments substantiate the effectiveness of our proposed strategies in mitigating biases. These strategies not only prove beneficial in minimizing hallucinations but also contribute to the generation of more helpful and precise illustrations.

Submitted 27 March, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

Comments: 38 pages, 17 figures

arXiv:2402.16913 [pdf, other]

PDETime: Rethinking Long-Term Multivariate Time Series Forecasting from the perspective of partial differential equations

Authors: Shiyi Qi, Zenglin Xu, Yiduo Li, Liangjian Wen, Qingsong Wen, Qifan Wang, Yuan Qi

Abstract: Recent advancements in deep learning have led to the development of various models for long-term multivariate time-series forecasting (LMTF), many of which have shown promising results. Generally, the focus has been on historical-value-based models, which rely on past observations to predict future series. Notably, a new trend has emerged with time-index-based models, offering a more nuanced understanding of the continuous dynamics underlying time series. Unlike these two types of models that aggregate the information of spatial domains or temporal domains, in this paper, we consider multivariate time series as spatiotemporal data regularly sampled from a continuous dynamical system, which can be represented by partial differential equations (PDEs), with the spatial domain being fixed. Building on this perspective, we present PDETime, a novel LMTF model inspired by the principles of Neural PDE solvers, following the encoding-integration-decoding operations. Our extensive experimentation across seven diverse real-world LMTF datasets reveals that PDETime not only adapts effectively to the intrinsic spatiotemporal nature of the data but also sets new benchmarks, achieving state-of-the-art results.

Submitted 25 February, 2024; originally announced February 2024.

arXiv:2402.15836 [pdf, other]

Entanglement islands and cutoff branes from path-integral optimization

Authors: Ashish Chandra, Zhengjiang Li, Qiang Wen

Abstract: Recently it was proposed that the AdS/BCFT correspondence can be simulated by a holographic Weyl transformed CFT$_2$, where the cut-off brane plays the role of the Karch-Randall (KR) brane \cite{Basu:2022crn}. In this paper, we focus on the Weyl transformation that optimizes the path integral computation of the reduced density matrix for a single interval in a holographic CFT$_2$. When we take the limit that one of the endpoints of the interval goes to infinity (a half line), such a holographic Weyl transformed CFT$_2$ matches the AdS/BCFT configuration for a BCFT with one boundary. Without taking the limit, the induced cutoff brane becomes a circle passing through the two endpoints of the interval. We assume that the cutoff brane also plays the same role as the KR brane in AdS/BCFT, hence the path-integral-optimized purification for the interval is in the island phase. This explains the appearance of negative mutual information observed in \cite{Camargo:2022mme}. We check that the entanglement entropy and the balanced partial entanglement entropy (BPE) calculated via the island formulas exactly match the RT formula and the entanglement wedge cross-section (EWCS), which are allowed to anchor on the cutoff brane.

Submitted 24 June, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

Comments: 43 pages, comments welcome; V2 major revision; minor revision, JHEP version

arXiv:2402.14601 [pdf, other]

Bringing Generative AI to Adaptive Learning in Education

Authors: Hang Li, Tianlong Xu, Chaoli Zhang, Eason Chen, Jing Liang, Xing Fan, Haoyang Li, Jiliang Tang, Qingsong Wen

Abstract: The recent surge in generative AI technologies, such as large language models and diffusion models, has boosted the development of AI applications in various domains, including science, finance, and education. Concurrently, adaptive learning, a concept that has gained substantial interest in the educational sphere, has proven its efficacy in enhancing students' learning efficiency. In this position paper, we aim to shed light on the intersectional studies of these two methods, which combine generative AI with adaptive learning concepts. By presenting discussions about the benefits, challenges, and potentials in this field, we argue that this union will contribute significantly to the development of the next-stage learning format in education.

Submitted 28 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: 14 pages, 7 figures

arXiv:2402.11887 [pdf, other]

Generative Semi-supervised Graph Anomaly Detection

Authors: Hezhe Qiao, Qingsong Wen, Xiaoli Li, Ee-Peng Lim, Guansong Pang

Abstract: This work considers a practical semi-supervised graph anomaly detection (GAD) scenario, where part of the nodes in a graph are known to be normal, contrasting to the extensively explored unsupervised setting with a fully unlabeled graph. We reveal that having access to the normal nodes, even just a small percentage of normal nodes, helps enhance the detection performance of existing unsupervised GAD methods when they are adapted to the semi-supervised setting. However, their utilization of these normal nodes is limited. In this paper, we propose a novel Generative GAD approach (namely GGAD) for the semi-supervised scenario to better exploit the normal nodes. The key idea is to generate pseudo anomaly nodes, referred to as 'outlier nodes', for providing effective negative node samples in training a discriminative one-class classifier. The main challenge here lies in the lack of ground truth information about real anomaly nodes. To address this challenge, GGAD is designed to leverage two important priors about the anomaly nodes -- asymmetric local affinity and egocentric closeness -- to generate reliable outlier nodes that assimilate anomaly nodes in both graph structure and feature representations. Comprehensive experiments on six real-world GAD datasets are performed to establish a benchmark for semi-supervised GAD and show that GGAD substantially outperforms state-of-the-art unsupervised and semi-supervised GAD methods with varying numbers of training normal nodes. Code will be made available at https://github.com/mala-lab/GGAD.

Submitted 28 May, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

Comments: 20 pages, 11 figures

arXiv:2402.11463 [pdf, other]

Attractor Memory for Long-Term Time Series Forecasting: A Chaos Perspective

Authors: Jiaxi Hu, Yuehong Hu, Wei Chen, Ming Jin, Shirui Pan, Qingsong Wen, Yuxuan Liang

Abstract: In long-term time series forecasting (LTSF) tasks, an increasing number of models have acknowledged that discrete time series originate from continuous dynamic systems and have attempted to model their dynamical structures. Recognizing the chaotic nature of real-world data, our model, Attraos, incorporates chaos theory into LTSF, perceiving real-world time series as observations from unknown high-dimensional chaotic dynamic systems. Under the concept of attractor invariance, Attraos utilizes non-parametric Phase Space Reconstruction embedding and the proposed multi-scale dynamic memory unit to memorize historical dynamics structure and predicts by a frequency-enhanced local evolution strategy. Detailed theoretical analysis and abundant empirical evidence consistently show that Attraos outperforms various LTSF methods on mainstream LTSF datasets and chaotic datasets with only one-twelfth of the parameters compared to PatchTST.

Submitted 14 July, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

arXiv:2402.05956 [pdf, other]

Pathformer: Multi-scale Transformers with Adaptive Pathways for Time Series Forecasting

Authors: Peng Chen, Yingying Zhang, Yunyao Cheng, Yang Shu, Yihang Wang, Qingsong Wen, Bin Yang, Chenjuan Guo

Abstract: Transformers for time series forecasting mainly model time series from limited or fixed scales, making it challenging to capture different characteristics spanning various scales. We propose Pathformer, a multi-scale Transformer with adaptive pathways. It integrates both temporal resolution and temporal distance for multi-scale modeling. Multi-scale division divides the time series into different temporal resolutions using patches of various sizes. Based on the division of each scale, dual attention is performed over these patches to capture global correlations and local details as temporal dependencies. We further enrich the multi-scale Transformer with adaptive pathways, which adaptively adjust the multi-scale modeling process based on the varying temporal dynamics of the input, improving the accuracy and generalization of Pathformer. Extensive experiments on eleven real-world datasets demonstrate that Pathformer not only achieves state-of-the-art performance by surpassing all current models but also exhibits stronger generalization abilities under various transfer scenarios. The code is made available at https://github.com/decisionintelligence/pathformer.

Submitted 6 March, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

Comments: Accepted by the 12th International Conference on Learning Representations (ICLR 2024)

arXiv:2402.04059 [pdf, other]

Deep Learning for Multivariate Time Series Imputation: A Survey

Authors: Jun Wang, Wenjie Du, Wei Cao, Keli Zhang, Wenjia Wang, Yuxuan Liang, Qingsong Wen

Abstract: The ubiquitous missing values cause the multivariate time series data to be partially observed, destroying the integrity of time series and hindering the effective time series data analysis. Recently, deep learning imputation methods have demonstrated remarkable success in elevating the quality of corrupted time series data, subsequently enhancing performance in downstream tasks. In this paper, we conduct a comprehensive survey on the recently proposed deep learning imputation methods. First, we propose a taxonomy for the reviewed methods, and then provide a structured review of these methods by highlighting their strengths and limitations. We also conduct empirical experiments to study different methods and compare their enhancement for downstream tasks. Finally, the open issues for future research on multivariate time series imputation are pointed out. All code and configurations of this work, including a regularly maintained multivariate time series imputation paper list, can be found in the GitHub repository https://github.com/WenjieDu/Awesome_Imputation.

Submitted 6 February, 2024; originally announced February 2024.

Comments: 9 pages, 1 figure, 5 tables, 58 referred papers

arXiv:2402.02713 [pdf, other]

Position: What Can Large Language Models Tell Us about Time Series Analysis

Authors: Ming Jin, Yifan Zhang, Wei Chen, Kexin Zhang, Yuxuan Liang, Bin Yang, Jindong Wang, Shirui Pan, Qingsong Wen

Abstract: Time series analysis is essential for comprehending the complexities inherent in various real-world systems and applications. Although large language models (LLMs) have recently made significant strides, the development of artificial general intelligence (AGI) equipped with time series analysis capabilities remains in its nascent phase. Most existing time series models heavily rely on domain knowledge and extensive model tuning, predominantly focusing on prediction tasks. In this paper, we argue that current LLMs have the potential to revolutionize time series analysis, thereby promoting efficient decision-making and advancing towards a more universal form of time series analytical intelligence. Such advancement could unlock a wide range of possibilities, including time series modality switching and question answering. We encourage researchers and practitioners to recognize the potential of LLMs in advancing time series analysis and emphasize the need for trust in these related efforts. Furthermore, we detail the seamless integration of time series analysis with existing LLM technologies and outline promising avenues for future research.

Submitted 1 June, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

Comments: Accepted by the 41st International Conference on Machine Learning (ICML 2024)

arXiv:2402.02032 [pdf, other]

RobustTSF: Towards Theory and Design of Robust Time Series Forecasting with Anomalies

Authors: Hao Cheng, Qingsong Wen, Yang Liu, Liang Sun

Abstract: Time series forecasting is an important and forefront task in many real-world applications. However, most time series forecasting techniques assume that the training data is clean without anomalies. This assumption is unrealistic since the collected time series data can be contaminated in practice. The forecasting model will be inferior if it is directly trained by time series with anomalies. Thus it is essential to develop methods to automatically learn a robust forecasting model from the contaminated data. In this paper, we first statistically define three types of anomalies, then theoretically and experimentally analyze the loss robustness and sample robustness when these anomalies exist. Based on our analyses, we propose a simple and efficient algorithm to learn a robust forecasting model. Extensive experiments show that our method is highly robust and outperforms all existing approaches. The code is available at https://github.com/haochenglouis/RobustTSF.

Submitted 3 February, 2024; originally announced February 2024.

Comments: Accepted by the 12th International Conference on Learning Representations (ICLR 2024)

arXiv:2401.11552 [pdf, other]

Correlating $B\to K^{(\ast)} ν\barν$ and flavor anomalies in SMEFT

Authors: Feng-Zhi Chen, Qiaoyi Wen, Fanrong Xu

Abstract: The recent measurement of $\mathcal{B}(B^+\to K^+ν\barν)$ by Belle-II reveals a $2.8~σ$ deviation from the Standard Model (SM) prediction. Combining this with a prior Belle measurement of $\mathcal{B}(B^{0}\to K^{\ast0}ν\barν)$, the upper bound of the ratio $\mathcal{B}(B^{0}\to K^{\ast0}ν\barν)/\mathcal{B}(B^+\to K^+ν\barν)$ is notably smaller than the SM prediction. In this work, tensions are solved within the framework of Standard Model Effective Field Theory (SMEFT). Flavor observables, described by Low-Energy Effective Field Theory (LEFT) operators, are interconnected by SMEFT at the electroweak scale. Utilizing a set of only four SMEFT operators, the FCNC process $b\to sν\barν$ is correlated with $b\to s\ell^+\ell^-$, $b\to u_i\ell\barν$, $u_j\to s\ell\barν$, $u_j\to u_iν\barν$, and $u_j\to u_i\ell^+\ell^-$. Subsequently, we obtain the latest ranges of Wilson coefficients for these four operators through a global fit that accommodates flavor anomalies such as $R_{K^{(\ast)}}$, $R_{D^{(\ast)}}$, and $\mathcal{B}(B\to K^{(\ast)}ν\barν)$. Our findings reveal that predictions for $\mathcal{B}(B^+\to τ^+ν_τ)$ and $\mathcal{B}(D_s^+\to τ^+ν_τ)$ align well with measured values from Belle and BESIII, based on the fitted coefficients. The predicted branching fraction for $B^0\to K^{\ast0}ν\barν$ is $(1.42\pm 0.74)\times 10^{-5}$, closely approaching the current experimental upper limit. Anticipation surrounds the rare decay $B_s\to τ^+ τ^-$, expected in the near future with a branching fraction on the order of $10^{-4}$.

Submitted 21 January, 2024; originally announced January 2024.

Comments: 24 pages, 2 figures, 3 tables

arXiv:2401.08552 [pdf, other]

Explaining Time Series via Contrastive and Locally Sparse Perturbations

Authors: Zichuan Liu, Yingying Zhang, Tianchun Wang, Zefan Wang, Dongsheng Luo, Mengnan Du, Min Wu, Yi Wang, Chunlin Chen, Lunting Fan, Qingsong Wen

Abstract: Explaining multivariate time series is a compound challenge, as it requires identifying important locations in the time series and matching complex temporal patterns. Although previous saliency-based methods addressed the challenges, their perturbation may not alleviate the distribution shift issue, which is inevitable especially in heterogeneous samples. We present ContraLSP, a locally sparse model that introduces counterfactual samples to build uninformative perturbations but keeps distribution using contrastive learning. Furthermore, we incorporate sample-specific sparse gates to generate more binary-skewed and smooth masks, which easily integrate temporal trends and select the salient features parsimoniously. Empirical studies on both synthetic and real-world datasets show that ContraLSP outperforms state-of-the-art models, demonstrating a substantial improvement in explanation quality for time series data. The source code is available at https://github.com/zichuan-liu/ContraLSP.

Submitted 28 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

Comments: Accepted by International Conference on Learning Representations (ICLR 2024)

arXiv:2401.07471 [pdf, other]

Partial entanglement network and bulk geometry reconstruction in AdS/CFT

Authors: Jiong Lin, Yizhou Lu, Qiang Wen

Abstract: In the context of Anti-de Sitter / Conformal Field Theory (AdS/CFT) correspondence, we present a general scheme to reconstruct bulk geometric quantities in terms of a specific measure of the entanglement structure on the boundary CFT, the partial entanglement entropy (PEE). The PEE between any two points $\mathcal{I}(\vec x, \vec y)$ is the fundamental building block of the PEE structure. It can be geometrized into a bulk geodesic connecting the two boundary points $\vec x$ and $\vec y$, which we refer to as the PEE thread. Thus, we have a network of the PEE threads in the bulk with a density of the threads determined by the boundary PEE structure \cite{Lin:2023rbd}. We demonstrate that, for any static boundary region $A$, the homologous surface $Σ_{A}$ that has the minimal flux of the PEE threads passing through it is exactly the Ryu-Takayanagi (RT) surface of $A$, and the minimal flux coincides with the holographic entanglement entropy of $A$. Furthermore, we show that the strength of the PEE flux at any bulk point along any direction is $1/4G$. Based on this observation, we prove that any area element in the bulk can be reconstructed by the PEE threads passing through it, which corresponds to a set of two-point PEEs on the CFT.

Submitted 23 January, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

Comments: Comments welcome!

arXiv:2401.05012 [pdf, other]

HiMTM: Hierarchical Multi-Scale Masked Time Series Modeling for Long-Term Forecasting

Authors: Shubao Zhao, Ming Jin, Zhaoxiang Hou, Chengyi Yang, Zengxiang Li, Qingsong Wen, Yi Wang

Abstract: Time series forecasting is crucial and challenging in the real world. The recent surge in interest regarding time series foundation models, which cater to a diverse array of downstream tasks, is noteworthy. However, existing methods often overlook the multi-scale nature of time series, an aspect crucial for precise forecasting. To bridge this gap, we propose HiMTM, a hierarchical multi-scale masked time series modeling method designed for long-term forecasting. Specifically, it comprises four integral components: (1) hierarchical multi-scale transformer (HMT) to capture temporal information at different scales; (2) decoupled encoder-decoder (DED) forces the encoder to focus on feature extraction, while the decoder to focus on pretext tasks; (3) multi-scale masked reconstruction (MMR) provides multi-stage supervision signals for pre-training; (4) cross-scale attention fine-tuning (CSA-FT) to capture dependencies between different scales for forecasting. Collectively, these components enhance multi-scale feature extraction capabilities in masked time series modeling and contribute to improved prediction accuracy. We conduct extensive experiments on 7 mainstream datasets to prove that HiMTM has obvious advantages over contemporary self-supervised and end-to-end learning methods. The effectiveness of HiMTM is further showcased by its application in the industry of natural gas demand forecasting.

Submitted 10 January, 2024; originally announced January 2024.

arXiv:2312.11672 [pdf, other]

A Quantum Federated Learning Framework for Classical Clients

Authors: Yanqi Song, Yusen Wu, Shengyao Wu, Dandan Li, Qiaoyan Wen, Sujuan Qin, Fei Gao

Abstract: Quantum Federated Learning (QFL) enables collaborative training of a Quantum Machine Learning (QML) model among multiple clients possessing quantum computing capabilities, without the need to share their respective local data. However, the limited availability of quantum computing resources poses a challenge for each client to acquire quantum computing capabilities. This raises a natural question: Can quantum computing capabilities be deployed on the server instead? In this paper, we propose a QFL framework specifically designed for classical clients, referred to as CC-QFL, in response to this question. In each iteration, the collaborative training of the QML model is assisted by the shadow tomography technique, eliminating the need for quantum computing capabilities of clients. Specifically, the server constructs a classical representation of the QML model and transmits it to the clients. The clients encode their local data onto observables and use this classical representation to calculate local gradients. These local gradients are then utilized to update the parameters of the QML model. We evaluate the effectiveness of our framework through extensive numerical simulations using handwritten digit images from the MNIST dataset. Our framework provides valuable insights into QFL, particularly in scenarios where quantum computing resources are scarce.

Submitted 18 December, 2023; originally announced December 2023.

arXiv:2312.10488 [pdf, ps, other]

Non-Markovian Dynamics of Time-Fractional Open Quantum Systems

Authors: Dongmei Wei, Hailing Liu, Yongmei Li, Sujuan Qin, Qiaoyan Wen, Fei Gao

Abstract: Applications of Time-Fractional Schrödinger Equations (TFSEs) to quantum processes are instructive for understanding and describing the time behavior of real physical systems. By applying three popular TFSEs, namely Naber's TFSE I, Naber's TFSE II, and XGF's TFSE, to a basic open system model of a two-level system (qubit) coupled resonantly to a dissipative environment, we solve exactly for Time-Fractional Single Qubit Open Systems (TFSQOSs). However, the three TFSEs perform badly for the following reasons. On the one hand, in the respective frameworks of the three TFSEs, the total probability for obtaining the system in a single-qubit state is not equal to one over time at fractional order, implying that time-fractional quantum mechanics violates quantum mechanical probability conservation. On the other hand, the latter two TFSEs are not capable of describing the non-Markovian dynamics of the system at all fractional orders, but only at some fractional orders. To address this, we introduce a well-performing TFSE by constructing a new analytic continuation of time combined with the conformable fractional derivative, in which, for all fractional orders, not only does the total probability for the system equal one at all times but the non-Markovian features can also be observed throughout the time evolution of the system. Furthermore, we study the performance of the four TFSEs applied to an open system model of two isolated qubits, each locally interacting with its dissipative environment. By deriving the exact solutions for time-fractional two-qubit open systems, we show that our TFSE still possesses the above two advantages compared with the other three TFSEs.

Submitted 16 December, 2023; originally announced December 2023.

Showing 1–50 of 286 results for author: Wen, Q