Showing 1–50 of 230 results for author: Tao, X

  1. arXiv:2407.09068  [pdf, other]

    cs.RO

    Fast and Accurate Multi-Agent Trajectory Prediction For Crowded Unknown Scenes

    Authors: Xiuye Tao, Huiping Li, Bin Liang, Yang Shi, Demin Xu

    Abstract: This paper studies the problem of multi-agent trajectory prediction in crowded unknown environments. A novel energy function optimization-based framework is proposed to generate prediction trajectories. Firstly, a new energy function is designed for easier optimization. Secondly, an online optimization pipeline for calculating parameters and agents' velocities is developed. In this pipeline, we fi…

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2406.11181  [pdf, other]

    physics.optics physics.ao-ph

    General Scintillation for Gaussian Beam Propagating through Oceanic Turbulence and UWOC System Performance Evaluation

    Authors: Yuxuan Li, Xiang Yi, Xinyue Tao, Ata Yalçın, Mingjian Cheng, Lu Zhang

    Abstract: In this paper, we derive a general and exact closed-form expression of scintillation index (SI) for a Gaussian beam propagating through weak oceanic turbulence, based on the general oceanic turbulence optical power spectrum (OTOPS) and the Rytov theory. Our universal expression not only includes existing Rytov variances but also accounts for actual cases where the Kolmogorov microscale is non-zero…

    Submitted 16 June, 2024; originally announced June 2024.

  3. arXiv:2406.10469  [pdf, other]

    eess.IV cs.CV cs.MM

    Object-Attribute-Relation Representation based Video Semantic Communication

    Authors: Qiyuan Du, Yiping Duan, Qianqian Yang, Xiaoming Tao, Mérouane Debbah

    Abstract: With the rapid growth of multimedia data volume, there is an increasing need for efficient video transmission in applications such as virtual reality and future video streaming services. Semantic communication is emerging as a vital technique for ensuring efficient and reliable transmission in low-bandwidth, high-noise settings. However, most current approaches focus on joint source-channel coding…

    Submitted 14 June, 2024; originally announced June 2024.

  4. arXiv:2406.04277  [pdf, other]

    cs.CV

    VideoTetris: Towards Compositional Text-to-Video Generation

    Authors: Ye Tian, Ling Yang, Haotian Yang, Yuan Gao, Yufan Deng, Jingmin Chen, Xintao Wang, Zhaochen Yu, Xin Tao, Pengfei Wan, Di Zhang, Bin Cui

    Abstract: Diffusion models have demonstrated great success in text-to-video (T2V) generation. However, existing methods may face challenges when handling complex (long) video generation scenarios that involve multiple objects or dynamic changes in object numbers. To address these limitations, we propose VideoTetris, a novel framework that enables compositional T2V generation. Specifically, we propose spatio…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Code: https://github.com/YangLing0818/VideoTetris

  5. arXiv:2405.19226  [pdf, other]

    cs.CV cs.MM

    ContextBLIP: Doubly Contextual Alignment for Contrastive Image Retrieval from Linguistically Complex Descriptions

    Authors: Honglin Lin, Siyu Li, Guoshun Nan, Chaoyue Tang, Xueting Wang, Jingxin Xu, Rong Yankai, Zhili Zhou, Yutong Gao, Qimei Cui, Xiaofeng Tao

    Abstract: Image retrieval from contextual descriptions (IRCD) aims to identify an image within a set of minimally contrastive candidates based on linguistically complex text. Despite the success of VLMs, they still significantly lag behind human performance in IRCD. The main challenges lie in aligning key contextual cues in two modalities, where these subtle cues are concealed in tiny areas of multiple cont…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted in ACL 2024 Findings

  6. arXiv:2405.15403  [pdf, other]

    cs.LG stat.ML

    Fine-Grained Dynamic Framework for Bias-Variance Joint Optimization on Data Missing Not at Random

    Authors: Mingming Ha, Xuewen Tao, Wenfang Lin, Qionxu Ma, Wujiang Xu, Linxun Chen

    Abstract: In most practical applications such as recommendation systems, display advertising, and so forth, the collected data often contains missing values and those missing values are generally missing-not-at-random, which deteriorates the prediction performance of models. Some existing estimators and regularizers attempt to achieve unbiased estimation to improve the predictive performance. However, varia…

    Submitted 24 May, 2024; originally announced May 2024.

  7. arXiv:2405.15321  [pdf, other]

    cs.CV

    SG-Adapter: Enhancing Text-to-Image Generation with Scene Graph Guidance

    Authors: Guibao Shen, Luozhou Wang, Jiantao Lin, Wenhang Ge, Chaozhe Zhang, Xin Tao, Yuan Zhang, Pengfei Wan, Zhongyuan Wang, Guangyong Chen, Yijun Li, Ying-Cong Chen

    Abstract: Recent advancements in text-to-image generation have been propelled by the development of diffusion models and multi-modality learning. However, since text is typically represented sequentially in these models, it often falls short in providing accurate contextualization and structural control. So the generated images do not consistently align with human expectations, especially in complex scenari…

    Submitted 24 May, 2024; originally announced May 2024.

  8. arXiv:2405.10514  [pdf, other]

    cs.IT eess.SP

    Secrecy Performance Analysis of Multi-Functional RIS-Assisted NOMA Networks

    Authors: Yingjie Pei, Wanli Ni, Jin Xu, Xinwei Yue, Xiaofeng Tao, Dusit Niyato

    Abstract: Although reconfigurable intelligent surface (RIS) can improve the secrecy communication performance of wireless users, it still faces challenges such as limited coverage and double-fading effect. To address these issues, in this paper, we utilize a novel multi-functional RIS (MF-RIS) to enhance the secrecy performance of wireless users, and investigate the physical layer secrecy problem in non-ort…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: 14 pages, 9 figures, submitted to IEEE transactions on wireless communication

  9. arXiv:2405.08096  [pdf, other]

    eess.AS cs.SD

    Semantic MIMO Systems for Speech-to-Text Transmission

    Authors: Zhenzi Weng, Zhijin Qin, Huiqiang Xie, Xiaoming Tao, Khaled B. Letaief

    Abstract: Semantic communications have been utilized to execute numerous intelligent tasks by transmitting task-related semantic information instead of bits. In this article, we propose a semantic-aware speech-to-text transmission system for the single-user multiple-input multiple-output (MIMO) and multi-user MIMO communication scenarios, named SAC-ST. Particularly, a semantic communication system to serve…

    Submitted 13 May, 2024; originally announced May 2024.

  10. arXiv:2405.05795  [pdf, other]

    cs.LG

    Enhancing Suicide Risk Detection on Social Media through Semi-Supervised Deep Label Smoothing

    Authors: Matthew Squires, Xiaohui Tao, Soman Elangovan, U Rajendra Acharya, Raj Gururajan, Haoran Xie, Xujuan Zhou

    Abstract: Suicide is a prominent issue in society. Unfortunately, many people at risk for suicide do not receive the support required. Barriers to people receiving support include social stigma and lack of access to mental health care. With the popularity of social media, people have turned to online forums, such as Reddit to express their feelings and seek support. This provides the opportunity to support…

    Submitted 9 May, 2024; originally announced May 2024.

  11. arXiv:2405.00181  [pdf, other]

    cs.CV cs.AI

    Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly

    Authors: Hang Du, Sicheng Zhang, Binzhu Xie, Guoshun Nan, Jiayang Zhang, Junrui Xu, Hangyu Liu, Sicong Leng, Jiangming Liu, Hehe Fan, Dajiu Huang, Jing Feng, Linli Chen, Can Zhang, Xuhuan Li, Hao Zhang, Jianhang Chen, Qimei Cui, Xiaofeng Tao

    Abstract: Video anomaly understanding (VAU) aims to automatically comprehend unusual occurrences in videos, thereby enabling various applications such as traffic surveillance and industrial manufacturing. While existing VAU benchmarks primarily concentrate on anomaly detection and localization, our focus is on more practicality, prompting us to raise the following crucial questions: "what anomaly occurred?"…

    Submitted 6 May, 2024; v1 submitted 30 April, 2024; originally announced May 2024.

    Comments: Accepted in CVPR2024, Codebase: https://github.com/fesvhtr/CUVA

  12. arXiv:2404.16913  [pdf, other]

    cs.LG cs.AI eess.IV

    DE-CGAN: Boosting rTMS Treatment Prediction with Diversity Enhancing Conditional Generative Adversarial Networks

    Authors: Matthew Squires, Xiaohui Tao, Soman Elangovan, Raj Gururajan, Haoran Xie, Xujuan Zhou, Yuefeng Li, U Rajendra Acharya

    Abstract: Repetitive Transcranial Magnetic Stimulation (rTMS) is a well-supported, evidence-based treatment for depression. However, patterns of response to this treatment are inconsistent. Emerging evidence suggests that artificial intelligence can predict rTMS treatment outcomes for most patients using fMRI connectivity features. While these models can reliably predict treatment outcomes for many patients…

    Submitted 25 April, 2024; originally announced April 2024.

  13. arXiv:2404.16687  [pdf, other]

    cs.CV

    NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, Haoning Wu, Yixuan Gao, Yuqin Cao, Zicheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, et al. (89 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Conte…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  14. arXiv:2404.16076  [pdf, other]

    cs.SI cs.AI cs.CL cs.LG

    Semantic Evolvement Enhanced Graph Autoencoder for Rumor Detection

    Authors: Xiang Tao, Liang Wang, Qiang Liu, Shu Wu, Liang Wang

    Abstract: Due to the rapid spread of rumors on social media, rumor detection has become an extremely important challenge. Recently, numerous rumor detection models which utilize textual information and the propagation structure of events have been proposed. However, these methods overlook the importance of semantic evolvement information of event in propagation process, which is often challenging to be trul…

    Submitted 24 April, 2024; originally announced April 2024.

  15. arXiv:2404.11001  [pdf]

    cond-mat.supr-con

    Modulation of the Octahedral Structure and Potential Superconductivity of La$_3$Ni$_2$O$_7$ through Strain Engineering

    Authors: Zihao Huo, Zhihui Luo, Peng Zhang, Aiqin Yang, Zhengtao Liu, Xiangru Tao, Zihan Zhang, Shumin Guo, Qiwen Jiang, Wenxuan Chen, Dao-Xin Yao, Defang Duan, Tian Cui

    Abstract: The recent transport measurement of La$_3$Ni$_2$O$_7$ uncover a "right-triangle" shape of the superconducting dome in the pressure-temperature (P-T) phase diagram. Motivated by this, we perform theoretical first-principles studies of La$_3$Ni$_2$O$_7$ with the pressure ranging from 0 to 100 GPa. Notably, we reveal a pressure dependence of the Ni-$d_{z^2}$ electron density at the Fermi energy (…

    Submitted 8 July, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  16. arXiv:2404.09619  [pdf, other]

    cs.CV cs.AI

    UNIAA: A Unified Multi-modal Image Aesthetic Assessment Baseline and Benchmark

    Authors: Zhaokun Zhou, Qiulin Wang, Bin Lin, Yiwei Su, Rui Chen, Xin Tao, Amin Zheng, Li Yuan, Pengfei Wan, Di Zhang

    Abstract: As an alternative to expensive expert evaluation, Image Aesthetic Assessment (IAA) stands out as a crucial task in computer vision. However, traditional IAA methods are typically constrained to a single data source or task, restricting the universality and broader application. In this work, to better align with human aesthetics, we propose a Unified Multi-modal Image Aesthetic Assessment (UNIAA) f…

    Submitted 15 April, 2024; originally announced April 2024.

  17. arXiv:2404.07484  [pdf]

    cs.MM cs.AI

    Multimodal Emotion Recognition by Fusing Video Semantic in MOOC Learning Scenarios

    Authors: Yuan Zhang, Xiaomei Tao, Hanxu Ai, Tao Chen, Yanling Gan

    Abstract: In the Massive Open Online Courses (MOOC) learning scenario, the semantic information of instructional videos has a crucial impact on learners' emotional state. Learners mainly acquire knowledge by watching instructional videos, and the semantic information in the videos directly affects learners' emotional states. However, few studies have paid attention to the potential influence of the semantic…

    Submitted 11 April, 2024; originally announced April 2024.

  18. arXiv:2404.06756  [pdf, other]

    cs.LG cs.AI

    CrimeAlarm: Towards Intensive Intent Dynamics in Fine-grained Crime Prediction

    Authors: Kaixi Hu, Lin Li, Qing Xie, Xiaohui Tao, Guandong Xu

    Abstract: Granularity and accuracy are two crucial factors for crime event prediction. Within fine-grained event classification, multiple criminal intents may alternately exhibit in preceding sequential events, and progress differently in next. Such intensive intent dynamics makes training models hard to capture unobserved intents, and thus leads to sub-optimal generalization performance, especially in the…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Accepted by DASFAA 2024

  19. arXiv:2404.06692  [pdf, other]

    cs.CV

    Perception-Oriented Video Frame Interpolation via Asymmetric Blending

    Authors: Guangyang Wu, Xin Tao, Changlin Li, Wenyi Wang, Xiaohong Liu, Qingqing Zheng

    Abstract: Previous methods for Video Frame Interpolation (VFI) have encountered challenges, notably the manifestation of blur and ghosting effects. These issues can be traced back to two pivotal factors: unavoidable motion errors and misalignment in supervision. In practice, motion estimates often prove to be error-prone, resulting in misaligned features. Furthermore, the reconstruction loss tends to bring…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024

  20. arXiv:2404.05386  [pdf, other]

    cs.IR

    MealRec$^+$: A Meal Recommendation Dataset with Meal-Course Affiliation for Personalization and Healthiness

    Authors: Ming Li, Lin Li, Xiaohui Tao, Jimmy Xiangji Huang

    Abstract: Meal recommendation, as a typical health-related recommendation task, contains complex relationships between users, courses, and meals. Among them, meal-course affiliation associates user-meal and user-course interactions. However, an extensive literature review demonstrates that there is a lack of publicly available meal recommendation datasets including meal-course affiliation. Meal recommendati…

    Submitted 27 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted by SIGIR 2024

  21. arXiv:2403.20193  [pdf, other]

    cs.CV

    Motion Inversion for Video Customization

    Authors: Luozhou Wang, Guibao Shen, Yixun Liang, Xin Tao, Pengfei Wan, Di Zhang, Yijun Li, Yingcong Chen

    Abstract: In this research, we present a novel approach to motion customization in video generation, addressing the widespread gap in the thorough exploration of motion representation within video generative models. Recognizing the unique challenges posed by video's spatiotemporal nature, our method introduces Motion Embeddings, a set of explicit, temporally coherent one-dimensional embeddings derived from…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: Project Page: https://wileewang.github.io/MotionInversion/

  22. arXiv:2403.17735  [pdf, other]

    cs.AI

    Out-of-distribution Rumor Detection via Test-Time Adaptation

    Authors: Xiang Tao, Mingqing Zhang, Qiang Liu, Shu Wu, Liang Wang

    Abstract: Due to the rapid spread of rumors on social media, rumor detection has become an extremely important challenge. Existing methods for rumor detection have achieved good performance, as they have collected enough corpus from the same data distribution for model training. However, significant distribution shifts between the training data and real-world test data occur due to differences in news topic…

    Submitted 26 March, 2024; originally announced March 2024.

  23. arXiv:2403.15758  [pdf, ps, other]

    math.CA

    An endpoint estimate for the maximal Calderón commutator with rough kernel

    Authors: Guoen Hu, Xudong Lai, Xiangxing Tao, Qingying Xue

    Abstract: In this paper, the authors consider the endpoint estimates for the maximal Calderón commutator defined by $$T_{Ω,\,a}^*f(x)=\sup_{ε>0}\Big|\int_{|x-y|>ε}\frac{Ω(x-y)}{|x-y|^{d+1}} \big(a(x)-a(y)\big)f(y)dy\Big|,$$ where $Ω$ is homogeneous of degree zero, integrable on $S^{d-1}$ and has vanishing moment of order one, $a$ be a function on $\mathbb{R}^d$ such that…

    Submitted 14 April, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

    Comments: 25 pages

    MSC Class: 42B20

  24. arXiv:2403.15283  [pdf, other]

    cond-mat.supr-con cond-mat.mtrl-sci

    Discovery of superconductivity in technetium-borides at moderate pressures

    Authors: Xiangru Tao, Aiqin Yang, Yundi Quan, Biao Wan, Shuxiang Yang, Peng Zhang

    Abstract: Advances in theoretical calculations boosted the searches for high temperature superconductors, such as sulfur hydrides and rare-earth polyhydrides. However, the required extremely high pressures for stabilizing these superconductors handicapped further implementations. Based upon thorough structural searches, we identified series of unprecedented superconducting technetium-borides at moderate pre…

    Submitted 22 March, 2024; originally announced March 2024.

  25. arXiv:2403.15234  [pdf, other]

    cs.CV

    Shadow Generation for Composite Image Using Diffusion model

    Authors: Qingyang Liu, Junqi You, Jianting Wang, Xinhao Tao, Bo Zhang, Li Niu

    Abstract: In the realm of image composition, generating realistic shadow for the inserted foreground remains a formidable challenge. Previous works have developed image-to-image translation models which are trained on paired training data. However, they are struggling to generate shadows with accurate shapes and intensities, hindered by data scarcity and inherent task complexity. In this paper, we resort to…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: accepted by CVPR2024

  26. arXiv:2403.12372  [pdf, other]

    cs.LG

    Learning Transferable Time Series Classifier with Cross-Domain Pre-training from Language Model

    Authors: Mingyue Cheng, Xiaoyu Tao, Qi Liu, Hao Zhang, Yiheng Chen, Chenyi Lei

    Abstract: Advancements in self-supervised pre-training (SSL) have significantly advanced the field of learning transferable time series representations, which can be very useful in enhancing the downstream task. Despite being effective, most existing works struggle to achieve cross-domain SSL pre-training, missing valuable opportunities to integrate patterns and features from different domains. The main cha…

    Submitted 18 March, 2024; originally announced March 2024.

  27. arXiv:2403.09222  [pdf, other]

    eess.SP

    A Robust Semantic Communication System for Image

    Authors: Xiang Peng, Zhijin Qin, Xiaoming Tao, Jianhua Lu, Khaled B. Letaief

    Abstract: Semantic communications have gained significant attention as a promising approach to address the transmission bottleneck, especially with the continuous development of 6G techniques. Distinct from the well investigated physical channel impairments, this paper focuses on semantic impairments in image, particularly those arising from adversarial perturbations. Specifically, we propose a novel metric…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 6 pages

  28. arXiv:2403.09157  [pdf, ps, other]

    eess.IV cs.CV

    VM-UNET-V2 Rethinking Vision Mamba UNet for Medical Image Segmentation

    Authors: Mingya Zhang, Yue Yu, Limei Gu, Tingsheng Lin, Xianping Tao

    Abstract: In the field of medical image segmentation, models based on both CNN and Transformer have been thoroughly investigated. However, CNNs have limited modeling capabilities for long-range dependencies, making it challenging to exploit the semantic information within images fully. On the other hand, the quadratic computational complexity poses a challenge for Transformers. Recently, State Space Models…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 12 pages, 4 figures

  29. arXiv:2403.02910  [pdf, other]

    cs.CV cs.AI

    ImgTrojan: Jailbreaking Vision-Language Models with ONE Image

    Authors: Xijia Tao, Shuai Zhong, Lei Li, Qi Liu, Lingpeng Kong

    Abstract: There has been an increasing interest in the alignment of large language models (LLMs) with human values. However, the safety issues of their integration with a vision module, or vision language models (VLMs), remain relatively underexplored. In this paper, we propose a novel jailbreaking attack against VLMs, aiming to bypass their safety barrier when a user inputs harmful instructions. A scenario…

    Submitted 5 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  30. arXiv:2402.17417  [pdf, other]

    cs.CV

    CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification

    Authors: Haoran Lai, Qingsong Yao, Zihang Jiang, Rongsheng Wang, Zhiyang He, Xiaodong Tao, S. Kevin Zhou

    Abstract: The advancement of Zero-Shot Learning in the medical domain has been driven forward by using pre-trained models on large-scale image-text pairs, focusing on image-text alignment. However, existing methods primarily rely on cosine similarity for alignment, which may not fully capture the complex relationship between medical images and reports. To address this gap, we introduce a novel approach call…

    Submitted 24 March, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  31. arXiv:2402.14718  [pdf, other]

    quant-ph hep-ex

    Quantum Annealing Inspired Algorithms for Track Reconstruction at High Energy Colliders

    Authors: Hideki Okawa, Qing-Guo Zeng, Xian-Zhe Tao, Man-Hong Yung

    Abstract: Charged particle reconstruction or track reconstruction is one of the most crucial components of pattern recognition in high energy collider physics. It is known for enormous consumption of the computing resources, especially when the particle multiplicity is high. This would indeed be the conditions at future colliders such as the High Luminosity Large Hadron Collider and Super Proton Proton Coll…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: 10 pages, 4 figures

  32. arXiv:2402.13471  [pdf]

    cond-mat.mtrl-sci physics.app-ph

    Thermal transport in a 2D amorphous material

    Authors: Yuxi Wang, Xingxing Zhang, Wujuan Yan, Nianjie Liang, Haiyu He, Xinwei Tao, Ang Li, Fuwei Yang, Buxuan Li, Te-Huan Liu, Jia Zhu, Wu Zhou, Wei Wang, Lin Zhou, Bai Song

    Abstract: Two-dimensional (2D) crystals proved revolutionary soon after graphene was discovered in 2004. However, 2D amorphous materials only became accessible in 2020 and remain largely unexplored. In particular, the thermophysical properties of amorphous materials are of great interest upon transition from 3D to 2D. Here, we probe thermal transport in 2D amorphous carbon. A cross-plane thermal conductivit…

    Submitted 22 March, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

  33. arXiv:2402.13073  [pdf, other]

    eess.SP

    Towards Intelligent Communications: Large Model Empowered Semantic Communications

    Authors: Huiqiang Xie, Zhijin Qin, Xiaoming Tao, Zhu Han

    Abstract: Deep learning enabled semantic communications have shown great potential to significantly improve transmission efficiency and alleviate spectrum scarcity, by effectively exchanging the semantics behind the data. Recently, the emergence of large models, boasting billions of parameters, has unveiled remarkable human-like intelligence, offering a promising avenue for advancing semantic communication…

    Submitted 19 March, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: 7 pages, 6 figures

  34. arXiv:2402.12398  [pdf, other]

    cs.LG

    Primary and Secondary Factor Consistency as Domain Knowledge to Guide Happiness Computing in Online Assessment

    Authors: Xiaohua Wu, Lin Li, Xiaohui Tao, Frank Xing, Jingling Yuan

    Abstract: Happiness computing based on large-scale online web data and machine learning methods is an emerging research topic that underpins a range of issues, from personal growth to social stability. Many advanced Machine Learning (ML) models with explanations are used to compute the happiness online assessment while maintaining high accuracy of results. However, domain knowledge constraints, such as the…

    Submitted 17 February, 2024; originally announced February 2024.

    Comments: 12 pages

  35. arXiv:2402.10097  [pdf, other]

    cs.LG cs.NI

    Adaptive Federated Learning in Heterogeneous Wireless Networks with Independent Sampling

    Authors: Jiaxiang Geng, Yanzhao Hou, Xiaofeng Tao, Juncheng Wang, Bing Luo

    Abstract: Federated Learning (FL) algorithms commonly sample a random subset of clients to address the straggler issue and improve communication efficiency. While recent works have proposed various client sampling methods, they have limitations in joint system and data heterogeneity design, which may not align with practical heterogeneous wireless networks. In this work, we advocate a new independent client…

    Submitted 13 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: 6 pages, 5 figures, accepted for publication in IEEE International Conference on Communications (ICC)

  36. arXiv:2402.07225  [pdf, other]

    cs.LG

    Rethinking Graph Masked Autoencoders through Alignment and Uniformity

    Authors: Liang Wang, Xiang Tao, Qiang Liu, Shu Wu, Liang Wang

    Abstract: Self-supervised learning on graphs can be bifurcated into contrastive and generative methods. Contrastive methods, also known as graph contrastive learning (GCL), have dominated graph self-supervised learning in the past few years, but the recent advent of graph masked autoencoder (GraphMAE) rekindles the momentum behind generative methods. Despite the empirical success of GraphMAE, there is still…

    Submitted 11 February, 2024; originally announced February 2024.

    Comments: Accepted by AAAI 2024

  37. arXiv:2402.03916  [pdf, other]

    cs.IR cs.CL

    Can Large Language Models Detect Rumors on Social Media?

    Authors: Qiang Liu, Xiang Tao, Junfei Wu, Shu Wu, Liang Wang

    Abstract: In this work, we investigate to use Large Language Models (LLMs) for rumor detection on social media. However, it is challenging for LLMs to reason over the entire propagation information on social media, which contains news contents and numerous comments, due to LLMs may not concentrate on key clues in the complex propagation information, and have trouble in reasoning when facing massive and redu…

    Submitted 8 February, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  38. arXiv:2402.02950  [pdf, other]

    cs.CR eess.SP

    Semantic Entropy Can Simultaneously Benefit Transmission Efficiency and Channel Security of Wireless Semantic Communications

    Authors: Yankai Rong, Guoshun Nan, Minwei Zhang, Sihan Chen, Songtao Wang, Xuefei Zhang, Nan Ma, Shixun Gong, Zhaohui Yang, Qimei Cui, Xiaofeng Tao, Tony Q. S. Quek

    Abstract: Recently proliferated deep learning-based semantic communications (DLSC) focus on how transmitted symbols efficiently convey a desired meaning to the destination. However, the sensitivity of neural models and the openness of wireless channels cause the DLSC system to be extremely fragile to various malicious attacks. This inspires us to ask a question: "Can we further exploit the advantages of tra…

    Submitted 6 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: 13 pages, 12 figures

  39. arXiv:2401.17575  [pdf, other]

    eess.SP

    Can We Improve Channel Reciprocity via Loop-back Compensation for RIS-assisted Physical Layer Key Generation

    Authors: Ningya Xu, Guoshun Nan, Xiaofeng Tao, Na Li, Pengxuan Mao, Tianyuan Yang

    Abstract: Reconfigurable intelligent surface (RIS) facilitates the extraction of unpredictable channel features for physical layer key generation (PKG), securing communications among legitimate users with symmetric keys. Previous works have demonstrated that channel reciprocity plays a crucial role in generating symmetric keys in PKG systems, whereas, in reality, reciprocity is greatly affected by hardware…

    Submitted 30 April, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: Accepted by ICC 2024

  40. arXiv:2401.15444  [pdf, other]

    cs.LG

    Towards Causal Classification: A Comprehensive Study on Graph Neural Networks

    Authors: Simi Job, Xiaohui Tao, Taotao Cai, Lin Li, Haoran Xie, Jianming Yong

    Abstract: The exploration of Graph Neural Networks (GNNs) for processing graph-structured data has expanded, particularly their potential for causal analysis due to their universal approximation capabilities. Anticipated to significantly enhance common graph-based tasks such as classification and prediction, the development of a causally enhanced GNN framework is yet to be thoroughly investigated. Addressin…

    Submitted 27 January, 2024; originally announced January 2024.

  41. arXiv:2401.13425  [pdf, ps, other]

    cond-mat.mtrl-sci

    Two-dimensional ferromagnetic semiconductor Cr2XP: First-principles calculations and Monte Carlo simulations

    Authors: Xiao-Ping Wei, Lan-Lan Du, Jiang-Liu Meng, Xiaoma Tao

    Abstract: According to the Mermin Wagner theorem, two-dimensional material is difficult to have the Curie temperature above room temperature. By using the method of band engineering, we design a promising two-dimensional ferromagnetic semiconductor Cr2XP (X=P, As, Sb) with large magnetization, high Curie temperature and sizable band gap. The formation of gap is discussed in terms of the hybridizations, occu…

    Submitted 24 January, 2024; originally announced January 2024.

  42. arXiv:2401.12483  [pdf, other]

    cs.IR

    Persona-centric Metamorphic Relation guided Robustness Evaluation for Multi-turn Dialogue Modelling

    Authors: Yanbing Chen, Lin Li, Xiaohui Tao, Dong Zhou

    Abstract: Recently there has been significant progress in the field of dialogue system thanks to the introduction of training paradigms such as fine-tune and prompt learning. Persona can function as the prior knowledge for maintaining the personality consistency of dialogue systems, which makes it perform well on accuracy. Nonetheless, the conventional reference-based evaluation method falls short in captur…

    Submitted 22 January, 2024; originally announced January 2024.

  43. arXiv:2401.00859  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    Federated Multi-View Synthesizing for Metaverse

    Authors: Yiyu Guo, Zhijin Qin, Xiaoming Tao, Geoffrey Ye Li

    Abstract: The metaverse is expected to provide immersive entertainment, education, and business applications. However, virtual reality (VR) transmission over wireless networks is data- and computation-intensive, making it critical to introduce novel solutions that meet stringent quality-of-service requirements. With recent advances in edge intelligence and deep learning, we have developed a novel multi-view…

    Submitted 18 December, 2023; originally announced January 2024.

  44. arXiv:2312.16418  [pdf, other]

    cs.LG cs.AI cs.SI

    Refining Latent Homophilic Structures over Heterophilic Graphs for Robust Graph Convolution Networks

    Authors: Chenyang Qiu, Guoshun Nan, Tianyu Xiong, Wendi Deng, Di Wang, Zhiyang Teng, Lijuan Sun, Qimei Cui, Xiaofeng Tao

    Abstract: Graph convolution networks (GCNs) are extensively utilized in various graph tasks to mine knowledge from spatial data. Our study marks the pioneering attempt to quantitatively investigate the GCN robustness over omnipresent heterophilic graphs for node classification. We uncover that the predominant vulnerability is caused by the structural out-of-distribution (OOD) issue. This finding motivates u…

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: To appear in the proceedings of AAAI-2024

  45. arXiv:2312.16023  [pdf, other]

    cs.CL cs.MM

    DocMSU: A Comprehensive Benchmark for Document-level Multimodal Sarcasm Understanding

    Authors: Hang Du, Guoshun Nan, Sicheng Zhang, Binzhu Xie, Junrui Xu, Hehe Fan, Qimei Cui, Xiaofeng Tao, Xudong Jiang

    Abstract: Multimodal Sarcasm Understanding (MSU) has a wide range of applications in the news field such as public opinion analysis and forgery detection. However, existing MSU benchmarks and approaches usually focus on sentence-level MSU. In document-level news, sarcasm clues are sparse or small and are often concealed in long text. Moreover, compared to sentence-level comments like tweets, which mainly fo…

    Submitted 26 December, 2023; originally announced December 2023.

  46. arXiv:2312.13316  [pdf, other]

    cs.CV

    ECAMP: Entity-centered Context-aware Medical Vision Language Pre-training

    Authors: Rongsheng Wang, Qingsong Yao, Haoran Lai, Zhiyang He, Xiaodong Tao, Zihang Jiang, S. Kevin Zhou

    Abstract: Despite significant advancements in medical vision-language pre-training, existing methods have largely overlooked the inherent entity-specific context within radiology reports and the complex cross-modality contextual relationships between text and images. To close this gap, we propose a novel Entity-centered Context-aware Medical Vision-language Pre-training (ECAMP) framework, which is designed…

    Submitted 19 March, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

  47. arXiv:2312.13305  [pdf, other]

    cs.CV

    DVIS++: Improved Decoupled Framework for Universal Video Segmentation

    Authors: Tao Zhang, Xingye Tian, Yikang Zhou, Shunping Ji, Xuebo Wang, Xin Tao, Yuan Zhang, Pengfei Wan, Zhongyuan Wang, Yu Wu

    Abstract: We present the Decoupled VIdeo Segmentation (DVIS) framework, a novel approach for the challenging task of universal video segmentation, including video instance segmentation (VIS), video semantic segmentation (VSS), and video panoptic segmentation (VPS). Unlike previous methods that model video segmentation in an end-to-end manner, our approach decouples video segmentat…

    Submitted 19 December, 2023; originally announced December 2023.

  48. arXiv:2312.12338  [pdf, other]

    cs.CY

    Smart Connected Farms and Networked Farmers to Tackle Climate Challenges Impacting Agricultural Production

    Authors: Behzad J. Balabaygloo, Barituka Bekee, Samuel W. Blair, Suzanne Fey, Fateme Fotouhi, Ashish Gupta, Kevin Menke, Anusha Vangala, Jorge C. M. Palomares, Aaron Prestholt, Vishesh K. Tanwar, Xu Tao, Matthew E. Carroll, Sajal Das, Gil Depaula, Peter Kyveryga, Soumik Sarkar, Michelle Segovia, Simone Sylvestri, Corinne Valdivia, Asheesh K. Singh

    Abstract: To meet the grand challenges of agricultural production, including climate change impacts on crop production, a tight integration of social science, technology and agriculture experts, including farmers, is needed. There are rapid advances in information and communication technology, precision agriculture and data analytics, which are creating a fertile field for the creation of smart connected farm…

    Submitted 19 December, 2023; originally announced December 2023.

  49. arXiv:2312.12148  [pdf, other]

    cs.CL

    Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment

    Authors: Lingling Xu, Haoran Xie, Si-Zhao Joe Qin, Xiaohui Tao, Fu Lee Wang

    Abstract: With the continuous growth in the number of parameters of transformer-based pretrained language models (PLMs), particularly the emergence of large language models (LLMs) with billions of parameters, many natural language processing (NLP) tasks have demonstrated remarkable success. However, the enormous size and computational demands of these models pose significant challenges for adapting them to…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: 20 pages, 4 figures

  50. arXiv:2312.11391  [pdf, other]

    cs.AI cs.GT cs.LG

    FedCompetitors: Harmonious Collaboration in Federated Learning with Competing Participants

    Authors: Shanli Tan, Hao Cheng, Xiaohu Wu, Han Yu, Tiantian He, Yew-Soon Ong, Chongjun Wang, Xiaofeng Tao

    Abstract: Federated learning (FL) provides a privacy-preserving approach for collaborative training of machine learning models. Given the potential data heterogeneity, it is crucial to select appropriate collaborators for each FL participant (FL-PT) based on data complementarity. Recent studies have addressed this challenge. Similarly, it is imperative to consider the inter-individual relationships among FL…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI-2024