Showing 1–50 of 147 results for author: Cui, X

  1. arXiv:2407.08906  [pdf, other]

    cs.CV cs.AI cs.GR

    AirSketch: Generative Motion to Sketch

    Authors: Hui Xian Grace Lim, Xuanming Cui, Yogesh S Rawat, Ser-Nam Lim

    Abstract: Illustration is a fundamental mode of human expression and communication. Certain types of motion that accompany speech can provide this illustrative mode of communication. While Augmented and Virtual Reality technologies (AR/VR) have introduced tools for producing drawings with hand motions (air drawing), they typically require costly hardware and additional digital markers, thereby limiting thei…

    Submitted 11 July, 2024; originally announced July 2024.

  2. arXiv:2406.08772  [pdf, other]

    cs.CV cs.CL

    MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs

    Authors: Xuannan Liu, Zekun Li, Peipei Li, Shuhan Xia, Xing Cui, Linzhi Huang, Huaibo Huang, Weihong Deng, Zhaofeng He

    Abstract: Current multimodal misinformation detection (MMD) methods often assume a single source and type of forgery for each sample, which is insufficient for real-world scenarios where multiple forgery sources coexist. The lack of a benchmark for mixed-source misinformation has hindered progress in this field. To address this, we introduce MMFakeBench, the first comprehensive benchmark for mixed-source MM…

    Submitted 12 June, 2024; originally announced June 2024.

  3. arXiv:2406.04834  [pdf, other]

    cs.CL

    Annotating FrameNet via Structure-Conditioned Language Generation

    Authors: Xinyue Cui, Swabha Swayamdipta

    Abstract: Despite the remarkable generative capabilities of language models in producing naturalistic language, their effectiveness on explicit manipulation and generation of linguistic structures remains understudied. In this paper, we investigate the task of generating new sentences preserving a given semantic structure, following the FrameNet formalism. We propose a framework to produce novel frame-semant…

    Submitted 24 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted to ACL 2024

  4. arXiv:2406.00456  [pdf, other]

    cs.LG cs.AI cs.CL

    Mix-of-Granularity: Optimize the Chunking Granularity for Retrieval-Augmented Generation

    Authors: Zijie Zhong, Hanwen Liu, Xiaoya Cui, Xiaofan Zhang, Zengchang Qin

    Abstract: Integrating information from different reference data sources is a major challenge for Retrieval-Augmented Generation (RAG) systems because each knowledge source adopts a unique data structure and follows different conventions. Retrieving from multiple knowledge sources with one fixed strategy usually leads to under-exploitation of information. To mitigate this drawback, inspired by Mix-of-Expert,…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: 17 pages, 6 figures and 8 tables

  5. arXiv:2406.00432  [pdf, other]

    cs.CV

    Localize, Understand, Collaborate: Semantic-Aware Dragging via Intention Reasoner

    Authors: Xing Cui, Peipei Li, Zekun Li, Xuannan Liu, Yueying Zou, Zhaofeng He

    Abstract: Flexible and accurate drag-based editing is a challenging task that has recently garnered significant attention. Current methods typically model this problem as automatically learning "how to drag" through point dragging and often produce one deterministic estimation, which presents two key limitations: 1) Overlooking the inherently ill-posed nature of drag-based editing, where multiple results…

    Submitted 1 June, 2024; originally announced June 2024.

  6. Node Injection Attack Based on Label Propagation Against Graph Neural Network

    Authors: Peican Zhu, Zechen Pan, Keke Tang, Xiaodong Cui, Jinhuan Wang, Qi Xuan

    Abstract: Graph Neural Network (GNN) has achieved remarkable success in various graph learning tasks, such as node classification, link prediction and graph classification. The key to the success of GNN lies in its effective structure information representation through neighboring aggregation. However, the attacker can easily perturb the aggregation process through injecting fake nodes, which reveals that G…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted by TCSS; DOI: 10.1109/TCSS.2024.3395794

  7. arXiv:2405.13381  [pdf]

    cs.LG

    Optimizing Search Advertising Strategies: Integrating Reinforcement Learning with Generalized Second-Price Auctions for Enhanced Ad Ranking and Bidding

    Authors: Chang Zhou, Yang Zhao, Jin Cao, Yi Shen, Xiaoling Cui, Chiyu Cheng

    Abstract: This paper explores the integration of strategic optimization methods in search advertising, focusing on ad ranking and bidding mechanisms within E-commerce platforms. By employing a combination of reinforcement learning and evolutionary strategies, we propose a dynamic model that adjusts to varying user interactions and optimizes the balance between advertiser cost, user relevance, and platform r…

    Submitted 29 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: Accepted by 2024 5th International Conference on Electronic communication and Artificial Intelligence (ICECAI 2024)

  8. arXiv:2405.00556  [pdf, other]

    cs.LG

    Swarm Learning: A Survey of Concepts, Applications, and Trends

    Authors: Elham Shammar, Xiaohui Cui, Mohammed A. A. Al-qaness

    Abstract: Deep learning models have raised privacy and security concerns due to their reliance on large datasets on central servers. As the number of Internet of Things (IoT) devices increases, artificial intelligence (AI) will be crucial for resource management, data processing, and knowledge acquisition. To address those issues, federated learning (FL) has introduced a novel approach to building a versati…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 31 pages

    ACM Class: C.2.4, I.2.11

  9. arXiv:2403.06529  [pdf, other]

    cs.CV

    Confidence-Aware RGB-D Face Recognition via Virtual Depth Synthesis

    Authors: Zijian Chen, Mei Wang, Weihong Deng, Hongzhi Shi, Dongchao Wen, Yingjie Zhang, Xingchen Cui, Jian Zhao

    Abstract: 2D face recognition encounters challenges in unconstrained environments due to varying illumination, occlusion, and pose. Recent studies focus on RGB-D face recognition to improve robustness by incorporating depth information. However, collecting sufficient paired RGB-D training data is expensive and time-consuming, hindering wide deployment. In this work, we first construct a diverse depth datase…

    Submitted 16 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: 9 pages, 5 figures

  10. arXiv:2403.06452  [pdf, other]

    cs.CV

    Text2QR: Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation

    Authors: Guangyang Wu, Xiaohong Liu, Jun Jia, Xuehao Cui, Guangtao Zhai

    Abstract: In the digital era, QR codes serve as a linchpin connecting virtual and physical realms. Their pervasive integration across various applications highlights the demand for aesthetically pleasing codes without compromised scannability. However, prevailing methods grapple with the intrinsic challenge of balancing customization and scannability. Notably, stable-diffusion models have ushered in an epoc…

    Submitted 12 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  11. arXiv:2403.01988  [pdf, other]

    cs.CL

    FakeNewsGPT4: Advancing Multimodal Fake News Detection through Knowledge-Augmented LVLMs

    Authors: Xuannan Liu, Peipei Li, Huaibo Huang, Zekun Li, Xing Cui, Jiahao Liang, Lixiong Qin, Weihong Deng, Zhaofeng He

    Abstract: The massive generation of multimodal fake news exhibits substantial distribution discrepancies, prompting the need for generalized detectors. However, the insulated nature of training within specific domains restricts the capability of classical detectors to obtain open-world facts. In this paper, we propose FakeNewsGPT4, a novel framework that augments Large Vision-Language Models (LVLMs) with fo…

    Submitted 4 March, 2024; originally announced March 2024.

  12. Taking Second-life Batteries from Exhausted to Empowered using Experiments, Data Analysis, and Health Estimation

    Authors: Xiaofan Cui, Muhammad Aadil Khan, Gabriele Pozzato, Surinder Singh, Ratnesh Sharma, Simona Onori

    Abstract: The reuse of retired electric vehicle batteries in grid energy storage offers environmental and economic benefits. This study concentrates on health monitoring algorithms for retired batteries deployed in grid storage. Over 15 months of testing, we collect, analyze, and publicize a dataset of second-life batteries, implementing a cycling protocol simulating grid energy storage load profiles within…

    Submitted 8 June, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: 16 pages, 8 figures

  13. arXiv:2402.17110  [pdf, other]

    cs.LG cs.CL

    Sinkhorn Distance Minimization for Knowledge Distillation

    Authors: Xiao Cui, Yulei Qin, Yuting Gao, Enwei Zhang, Zihan Xu, Tong Wu, Ke Li, Xing Sun, Wengang Zhou, Houqiang Li

    Abstract: Knowledge distillation (KD) has been widely adopted to compress large language models (LLMs). Existing KD methods investigate various divergence measures including the Kullback-Leibler (KL), reverse Kullback-Leibler (RKL), and Jensen-Shannon (JS) divergences. However, due to limitations inherent in their assumptions and definitions, these measures fail to deliver effective supervision when few dis…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted by COLING 2024

  14. arXiv:2402.15607  [pdf, other]

    cs.LG

    How Do Nonlinear Transformers Learn and Generalize in In-Context Learning?

    Authors: Hongkang Li, Meng Wang, Songtao Lu, Xiaodong Cui, Pin-Yu Chen

    Abstract: Transformer-based large language models have displayed impressive in-context learning capabilities, where a pre-trained model can handle new tasks without fine-tuning by simply augmenting the query with some input-output examples from that task. Despite the empirical success, the mechanics of how to train a Transformer to achieve ICL and the corresponding ICL capacity is mostly elusive due to the…

    Submitted 16 June, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: ICML 2024

  15. arXiv:2402.11302  [pdf, other]

    cs.IR

    Knowledge Graph-based Session Recommendation with Adaptive Propagation

    Authors: Yu Wang, Amin Javari, Janani Balaji, Walid Shalaby, Tyler Derr, Xiquan Cui

    Abstract: Session-based recommender systems (SBRSs) predict users' next interacted items based on their historical activities. While most SBRSs capture purchasing intentions locally within each session, capturing items' global information across different sessions is crucial in characterizing their general properties. Previous works capture this cross-session information by constructing graphs and incorpora…

    Submitted 17 February, 2024; originally announced February 2024.

  16. arXiv:2401.12999  [pdf, other]

    physics.chem-ph cs.AI cs.LG

    Quantum-Inspired Machine Learning for Molecular Docking

    Authors: Runqiu Shu, Bowen Liu, Zhaoping Xiong, Xiaopeng Cui, Yunting Li, Wei Cui, Man-Hong Yung, Nan Qiao

    Abstract: Molecular docking is an important tool for structure-based drug design, accelerating the efficiency of drug development. Complex and dynamic binding processes between proteins and small molecules require searching and sampling over a wide spatial range. Traditional docking by searching for possible binding sites and conformations is computationally complex and results poorly under blind docking. Q…

    Submitted 21 February, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

  17. arXiv:2401.07572  [pdf, other]

    cs.CV cs.CL

    Exploiting GPT-4 Vision for Zero-shot Point Cloud Understanding

    Authors: Qi Sun, Xiao Cui, Wengang Zhou, Houqiang Li

    Abstract: In this study, we tackle the challenge of classifying the object category in point clouds, which previous works like PointCLIP struggle to address due to the inherent limitations of the CLIP architecture. Our approach leverages GPT-4 Vision (GPT-4V) to overcome these challenges by employing its advanced generative abilities, enabling a more adaptive and robust classification process. We adapt the…

    Submitted 15 January, 2024; originally announced January 2024.

  18. arXiv:2401.06980  [pdf, other]

    cs.CL cs.LG stat.ML

    Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization

    Authors: A F M Saif, Xiaodong Cui, Han Shen, Songtao Lu, Brian Kingsbury, Tianyi Chen

    Abstract: In this paper, we present a novel bilevel optimization-based training approach to training acoustic models for automatic speech recognition (ASR) tasks that we term bi-level joint unsupervised and supervised training (BL-JUST). BL-JUST employs a lower and upper level optimization with an unsupervised loss and a supervised loss respectively, leveraging recent advances in penalty-based bilevel op…

    Submitted 13 January, 2024; originally announced January 2024.

    Comments: This paper has been accepted in ICASSP-2024 conference

  19. arXiv:2401.02651  [pdf, other]

    cs.CV

    Benchmarking PathCLIP for Pathology Image Analysis

    Authors: Sunyi Zheng, Xiaonan Cui, Yuxuan Sun, Jingxiong Li, Honglin Li, Yunlong Zhang, Pingyi Chen, Xueping Jing, Zhaoxiang Ye, Lin Yang

    Abstract: Accurate image classification and retrieval are of importance for clinical diagnosis and treatment decision-making. The recent contrastive language-image pretraining (CLIP) model has shown remarkable proficiency in understanding natural images. Drawing inspiration from CLIP, PathCLIP is specifically designed for pathology image analysis, utilizing over 200,000 image and text pairs in training. Whi…

    Submitted 12 June, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

  20. arXiv:2401.01575  [pdf, other]

    cs.CV

    Enhancing Generalization of Invisible Facial Privacy Cloak via Gradient Accumulation

    Authors: Xuannan Liu, Yaoyao Zhong, Weihong Deng, Hongzhi Shi, Xingchen Cui, Yunfeng Yin, Dongchao Wen

    Abstract: The blooming of social media and face recognition (FR) systems has increased people's concern about privacy and security. A new type of adversarial privacy cloak (class-universal) can be applied to all the images of regular users, to prevent malicious FR systems from acquiring their identity information. In this work, we discover the optimization dilemma in the existing methods -- the local optima…

    Submitted 3 January, 2024; originally announced January 2024.

  21. arXiv:2312.16881  [pdf, other]

    cs.CV

    Exploring 3D-aware Lifespan Face Aging via Disentangled Shape-Texture Representations

    Authors: Qianrui Teng, Rui Wang, Xing Cui, Peipei Li, Zhaofeng He

    Abstract: Existing face aging methods often focus on modeling either texture aging or using an entangled shape-texture representation to achieve face aging. However, shape and texture are two distinct factors that mutually affect the human face aging process. In this paper, we propose 3D-STD, a novel 3D-aware Shape-Texture Disentangled face aging network that explicitly disentangles the facial image into sh…

    Submitted 28 December, 2023; originally announced December 2023.

  22. arXiv:2312.15385  [pdf, other]

    q-fin.MF cs.LG q-fin.PM

    Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning

    Authors: Xiangyu Cui, Xun Li, Yun Shi, Si Zhao

    Abstract: This paper studies a discrete-time mean-variance model based on reinforcement learning. Compared with its continuous-time counterpart in \cite{zhou2020mv}, the discrete-time model makes more general assumptions about the asset's return distribution. Using entropy to measure the cost of exploration, we derive the optimal investment strategy, whose density function is also Gaussian type. Additionall…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: arXiv admin note: text overlap with arXiv:1904.11392 by other authors

  23. arXiv:2312.14407  [pdf, other]

    cs.CV

    AdvCloak: Customized Adversarial Cloak for Privacy Protection

    Authors: Xuannan Liu, Yaoyao Zhong, Xing Cui, Yuhang Zhang, Peipei Li, Weihong Deng

    Abstract: With extensive face images being shared on social media, there has been a notable escalation in privacy concerns. In this paper, we propose AdvCloak, an innovative framework for privacy protection using generative models. AdvCloak is designed to automatically customize class-wise adversarial masks that can maintain superior image-level naturalness while providing enhanced feature-level generalizat…

    Submitted 21 December, 2023; originally announced December 2023.

  24. arXiv:2312.03777  [pdf, other]

    cs.CV

    On the Robustness of Large Multimodal Models Against Image Adversarial Attacks

    Authors: Xuanming Cui, Alejandro Aparcedo, Young Kyun Jang, Ser-Nam Lim

    Abstract: Recent advances in instruction tuning have led to the development of State-of-the-Art Large Multimodal Models (LMMs). Given the novelty of these models, the impact of visual adversarial attacks on LMMs has not been thoroughly examined. We conduct a comprehensive study of the robustness of various LMMs against different adversarial attacks, evaluated across tasks including image classification, ima…

    Submitted 8 December, 2023; v1 submitted 5 December, 2023; originally announced December 2023.

  25. arXiv:2311.15040  [pdf, other]

    cs.CV

    InstaStyle: Inversion Noise of a Stylized Image is Secretly a Style Adviser

    Authors: Xing Cui, Zekun Li, Pei Pei Li, Huaibo Huang, Xuannan Liu, Zhaofeng He

    Abstract: Stylized text-to-image generation focuses on creating images from textual descriptions while adhering to a style specified by a few reference images. However, subtle style variations within different reference images can hinder the model from accurately learning the target style. In this paper, we propose InstaStyle, a novel approach that excels in generating high-fidelity stylized images with onl…

    Submitted 12 July, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

    Comments: Accepted by European Conference on Computer Vision (ECCV 2024). Project page: https://cuixing100876.github.io/instastyle.github.io/

  26. arXiv:2311.12727  [pdf, other]

    cs.LG cs.CL

    Soft Random Sampling: A Theoretical and Empirical Analysis

    Authors: Xiaodong Cui, Ashish Mittal, Songtao Lu, Wei Zhang, George Saon, Brian Kingsbury

    Abstract: Soft random sampling (SRS) is a simple yet effective approach for efficient training of large-scale deep neural networks when dealing with massive data. SRS selects a subset uniformly at random with replacement from the full data set in each epoch. In this paper, we conduct a theoretical and empirical analysis of SRS. First, we analyze its sampling dynamics including data coverage and occupancy. N…

    Submitted 23 November, 2023; v1 submitted 21 November, 2023; originally announced November 2023.

  27. arXiv:2311.07871  [pdf, other]

    cs.CV

    Dual-channel Prototype Network for few-shot Classification of Pathological Images

    Authors: Hao Quan, Xinjia Li, Dayu Hu, Tianhang Nan, Xiaoyu Cui

    Abstract: In pathology, the rarity of certain diseases and the complexity in annotating pathological images significantly hinder the creation of extensive, high-quality datasets. This limitation impedes the progress of deep learning-assisted diagnostic systems in pathology. Consequently, it becomes imperative to devise a technology that can discern new disease categories from a minimal number of annotated e…

    Submitted 13 November, 2023; originally announced November 2023.

  28. arXiv:2310.18127  [pdf, other]

    cs.LG cs.AI cs.CL

    Ask more, know better: Reinforce-Learned Prompt Questions for Decision Making with Large Language Models

    Authors: Xue Yan, Yan Song, Xinyu Cui, Filippos Christianos, Haifeng Zhang, David Henry Mguni, Jun Wang

    Abstract: Large language models (LLMs) demonstrate their promise in tackling complicated practical challenges by combining action-based policies with chain of thought (CoT) reasoning. Having high-quality prompts on hand, however, is vital to the framework's effectiveness. Currently, these prompts are handcrafted utilising extensive human labor, resulting in CoT policies that frequently fail to generalise. H…

    Submitted 28 February, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

  29. Bidirectional Knowledge Reconfiguration for Lightweight Point Cloud Analysis

    Authors: Peipei Li, Xing Cui, Yibo Hu, Man Zhang, Ting Yao, Tao Mei

    Abstract: Point cloud analysis faces computational system overhead, limiting its application on mobile or edge devices. Directly employing small models may result in a significant drop in performance since it is difficult for a small model to adequately capture local structure and global shape information simultaneously, which are essential clues for point cloud analysis. This paper explores feature distill…

    Submitted 8 October, 2023; originally announced October 2023.

    Comments: Accepted by IEEE Transactions on Multimedia (TMM)

    Journal ref: IEEE Transactions on Multimedia (Early Access), 02 October 2023

  30. arXiv:2309.06533  [pdf, other]

    cs.IR cs.AI cs.LG

    Hierarchical Multi-Task Learning Framework for Session-based Recommendations

    Authors: Sejoon Oh, Walid Shalaby, Amir Afsharinejad, Xiquan Cui

    Abstract: While session-based recommender systems (SBRSs) have shown superior recommendation performance, multi-task learning (MTL) has been adopted by SBRSs to enhance their prediction accuracy and generalizability further. Hierarchical MTL (H-MTL) sets a hierarchical structure between prediction tasks and feeds outputs from auxiliary tasks to main tasks. This hierarchy leads to richer input features for m…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: Accepted at the 6th Workshop on Online Recommender Systems and User Modeling @ ACM RecSys 2023

  31. arXiv:2309.03548  [pdf, other]

    cs.CV

    Trash to Treasure: Low-Light Object Detection via Decomposition-and-Aggregation

    Authors: Xiaohan Cui, Long Ma, Tengyu Ma, Jinyuan Liu, Xin Fan, Risheng Liu

    Abstract: Object detection in low-light scenarios has attracted much attention in the past few years. A mainstream and representative scheme introduces enhancers as the pre-processing for regular detectors. However, because of the disparity in task objectives between the enhancer and detector, this paradigm cannot shine at its best ability. In this work, we try to arouse the potential of enhancer + detector…

    Submitted 7 September, 2023; originally announced September 2023.

  32. arXiv:2308.14533  [pdf, other]

    cs.CL

    A Multi-Task Semantic Decomposition Framework with Task-specific Pre-training for Few-Shot NER

    Authors: Guanting Dong, Zechen Wang, Jinxu Zhao, Gang Zhao, Daichi Guo, Dayuan Fu, Tingfeng Hui, Chen Zeng, Keqing He, Xuefeng Li, Liwen Wang, Xinyue Cui, Weiran Xu

    Abstract: The objective of few-shot named entity recognition is to identify named entities with limited labeled instances. Previous works have primarily focused on optimizing the traditional token-wise classification framework, while neglecting the exploration of information based on NER data characteristics. To address this issue, we propose a Multi-Task Semantic Decomposition Framework via Joint Task-spec…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: Accepted by CIKM 2023 (Oral Presentation)

  33. arXiv:2308.14064  [pdf]

    cs.CV

    Multi-model fusion for Aerial Vision and Dialog Navigation based on human attention aids

    Authors: Xinyi Wang, Xuan Cui, Danxu Li, Fang Liu, Licheng Jiao

    Abstract: Drones have been widely used in many areas of our daily lives. It relieves people of the burden of holding a controller all the time and makes drone control easier to use for people with disabilities or occupied hands. However, the control of aerial robots is more complicated compared to normal robots due to factors such as uncontrollable height. Therefore, it is crucial to develop an intelligent…

    Submitted 27 August, 2023; originally announced August 2023.

    Comments: 4 pages, 1 figure

  34. arXiv:2308.13760  [pdf, other]

    cs.AI cs.CL cs.IR

    How Can Context Help? Exploring Joint Retrieval of Passage and Personalized Context

    Authors: Hui Wan, Hongkang Li, Songtao Lu, Xiaodong Cui, Marina Danilevsky

    Abstract: The integration of external personalized context information into document-grounded conversational systems has significant potential business value, but has not been well-studied. Motivated by the concept of personalized context-aware document-grounded conversational systems, we introduce the task of context-aware passage retrieval. We also construct a dataset specifically curated for this purpose…

    Submitted 26 August, 2023; originally announced August 2023.

  35. arXiv:2307.00769  [pdf, other]

    cs.CL

    CollabKG: A Learnable Human-Machine-Cooperative Information Extraction Toolkit for (Event) Knowledge Graph Construction

    Authors: Xiang Wei, Yufeng Chen, Ning Cheng, Xingyu Cui, Jinan Xu, Wenjuan Han

    Abstract: In order to construct or extend entity-centric and event-centric knowledge graphs (KG and EKG), the information extraction (IE) annotation toolkit is essential. However, existing IE toolkits have several non-trivial problems, such as not supporting multi-tasks, not supporting automatic updates. In this work, we present CollabKG, a learnable human-machine-cooperative IE toolkit for KG and EKG const…

    Submitted 3 July, 2023; originally announced July 2023.

  36. arXiv:2306.16834  [pdf, ps, other]

    astro-ph.IM cs.AI

    Intelligence of Astronomical Optical Telescope: Present Status and Future Perspectives

    Authors: Kang Huang, Tianzhu Hu, Jingyi Cai, Xiushan Pang, Yonghui Hou, Yong Zhang, Huaiqing Wang, Xiangqun Cui

    Abstract: Artificial intelligence technology has been widely used in astronomy, and new artificial intelligence technologies and application scenarios are constantly emerging. There have been a large number of papers reviewing the application of artificial intelligence technology in astronomy. However, relevant articles seldom mention telescope intelligence separately, and it is difficult to understand the…

    Submitted 16 January, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

    Comments: 41 pages, 10 figures; for questions or comments, please email tzhu@niaot.ac.cn

    ACM Class: J.7

  37. Local Boosting for Weakly-Supervised Learning

    Authors: Rongzhi Zhang, Yue Yu, Jiaming Shen, Xiquan Cui, Chao Zhang

    Abstract: Boosting is a commonly used technique to enhance the performance of a set of base models by combining them into a strong ensemble model. Though widely adopted, boosting is typically used in supervised learning where the data is labeled accurately. However, in weakly supervised learning, where most of the data is labeled through weak and noisy sources, it remains nontrivial to design effective boos…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Accepted by KDD 2023 Research Track

  38. arXiv:2305.17600  [pdf, other]

    cs.LG cs.CV cs.GT cs.RO math.OC

    NashFormer: Leveraging Local Nash Equilibria for Semantically Diverse Trajectory Prediction

    Authors: Justin Lidard, Oswin So, Yanxia Zhang, Jonathan DeCastro, Xiongyi Cui, Xin Huang, Yen-Ling Kuo, John Leonard, Avinash Balachandran, Naomi Leonard, Guy Rosman

    Abstract: Interactions between road agents present a significant challenge in trajectory prediction, especially in cases involving multiple agents. Because existing diversity-aware predictors do not account for the interactive nature of multi-agent predictions, they may miss these important interaction outcomes. In this paper, we propose NashFormer, a framework for trajectory prediction that leverages game-…

    Submitted 11 November, 2023; v1 submitted 27 May, 2023; originally announced May 2023.

    Comments: 8 pages, 6 figures

  39. arXiv:2305.16967  [pdf, other]

    cs.CL

    Evaluating Open-Domain Dialogues in Latent Space with Next Sentence Prediction and Mutual Information

    Authors: Kun Zhao, Bohao Yang, Chenghua Lin, Wenge Rong, Aline Villavicencio, Xiaohui Cui

    Abstract: The long-standing one-to-many issue of the open-domain dialogues poses significant challenges for automatic evaluation methods, i.e., there may be multiple suitable responses which differ in semantics for a given conversational context. To tackle this challenge, we propose a novel learning-based automatic evaluation metric (CMN), which can robustly evaluate open-domain dialogues by augmenting Cond…

    Submitted 10 June, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023

  40. arXiv:2303.11108  [pdf, other]

    cs.CV

    CHATEDIT: Towards Multi-turn Interactive Facial Image Editing via Dialogue

    Authors: Xing Cui, Zekun Li, Peipei Li, Yibo Hu, Hailin Shi, Zhaofeng He

    Abstract: This paper explores interactive facial image editing via dialogue and introduces the ChatEdit benchmark dataset for evaluating image editing and conversation abilities in this context. ChatEdit is constructed from the CelebA-HQ dataset, incorporating annotated multi-turn dialogues corresponding to user edit requests on the images. The dataset is challenging, as it requires the system to dynamicall…

    Submitted 16 October, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: Accepted to EMNLP 2023 (Main Conference)

  41. arXiv:2302.14120  [pdf, other]

    eess.AS cs.SD

    Diagonal State Space Augmented Transformers for Speech Recognition

    Authors: George Saon, Ankit Gupta, Xiaodong Cui

    Abstract: We improve on the popular conformer architecture by replacing the depthwise temporal convolutions with diagonal state space (DSS) models. DSS is a recently introduced variant of linear RNNs obtained by discretizing a linear dynamical system with a diagonal state transition matrix. DSS layers project the input sequence onto a space of orthogonal polynomials where the choice of basis functions, metr…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: to be presented at ICASSP 2023

  42. arXiv:2302.13610   

    cs.CL

    A Prototypical Semantic Decoupling Method via Joint Contrastive Learning for Few-Shot Name Entity Recognition

    Authors: Guanting Dong, Zechen Wang, Liwen Wang, Daichi Guo, Dayuan Fu, Yuxiang Wu, Chen Zeng, Xuefeng Li, Tingfeng Hui, Keqing He, Xinyue Cui, Qixiang Gao, Weiran Xu

    Abstract: Few-shot named entity recognition (NER) aims at identifying named entities based on only few labeled instances. Most existing prototype-based sequence labeling models tend to memorize entity mentions which would be easily confused by close prototypes. In this paper, we proposed a Prototypical Semantic Decoupling method via joint Contrastive learning (PSDC) for few-shot NER. Specifically, we decoup…

    Submitted 12 April, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: we want to revise our paper and upload this article in few days

  43. arXiv:2302.13584  [pdf, other]

    cs.CL

    Revisit Out-Of-Vocabulary Problem for Slot Filling: A Unified Contrastive Frameword with Multi-level Data Augmentations

    Authors: Daichi Guo, Guanting Dong, Dayuan Fu, Yuxiang Wu, Chen Zeng, Tingfeng Hui, Liwen Wang, Xuefeng Li, Zechen Wang, Keqing He, Xinyue Cui, Weiran Xu

    Abstract: In real dialogue scenarios, the existing slot filling model, which tends to memorize entity patterns, generalizes significantly worse when facing Out-of-Vocabulary (OOV) problems. To address this issue, we propose an OOV robust slot filling model based on multi-level data augmentations to solve the OOV problem from both word and slot perspectives. We present a unified contrastive learning fr…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: 5 pages, 3 figures, published to ICASSP 2023

  44. arXiv:2302.10205  [pdf, other]

    cs.CL

    ChatIE: Zero-Shot Information Extraction via Chatting with ChatGPT

    Authors: Xiang Wei, Xingyu Cui, Ning Cheng, Xiaobin Wang, Xin Zhang, Shen Huang, Pengjun Xie, Jinan Xu, Yufeng Chen, Meishan Zhang, Yong Jiang, Wenjuan Han

    Abstract: Zero-shot information extraction (IE) aims to build IE systems from unannotated text. It is challenging because it involves little human intervention. Challenging but worthwhile, zero-shot IE reduces the time and effort that data labeling takes. Recent efforts on large language models (LLMs, e.g., GPT-3, ChatGPT) show promising performance in zero-shot settings, thus inspiring us to explore promp…

    Submitted 27 May, 2024; v1 submitted 20 February, 2023; originally announced February 2023.

  45. arXiv:2302.05621  [pdf, other]

    cs.CV

    Dive into the Resolution Augmentations and Metrics in Low Resolution Face Recognition: A Plain yet Effective New Baseline

    Authors: Xu Ling, Yichen Lu, Wenqi Xu, Weihong Deng, Yingjie Zhang, Xingchen Cui, Hongzhi Shi, Dongchao Wen

    Abstract: Although deep learning has significantly improved Face Recognition (FR), dramatic performance deterioration may occur when processing Low Resolution (LR) faces. To alleviate this, approaches based on a unified feature space have been proposed, at the cost of performance under High Resolution (HR) circumstances. To deal with the huge domain gap between HR and LR domains and achieve the best performance on both domains, we firs…

    Submitted 11 February, 2023; originally announced February 2023.

    Comments: AAAI 2023 R2HCAI Workshop

  46. arXiv:2302.01642  [pdf, other]

    cs.CV cs.AI

    Cluster-CAM: Cluster-Weighted Visual Interpretation of CNNs' Decision in Image Classification

    Authors: Zhenpeng Feng, Hongbing Ji, Milos Dakovic, Xiyang Cui, Mingzhe Zhu, Ljubisa Stankovic

    Abstract: Despite the tremendous success of convolutional neural networks (CNNs) in computer vision, the mechanism of CNNs still lacks clear interpretation. Currently, class activation mapping (CAM), a well-known visualization technique for interpreting CNNs' decisions, has drawn increasing attention. Gradient-based CAMs are efficient, but their performance is heavily affected by gradient vanishing and exploding. In…

    Submitted 3 February, 2023; originally announced February 2023.

    Comments: 10 pages

  47. arXiv:2301.13455  [pdf, other]

    cs.CL

    ZhichunRoad at Amazon KDD Cup 2022: MultiTask Pre-Training for E-Commerce Product Search

    Authors: Xuange Cui, Wei Xiong, Songlin Wang

    Abstract: In this paper, we propose a robust multilingual model to improve the quality of search results. Our model not only leverages the processed class-balanced dataset, but also benefits from multitask pre-training that leads to more general representations. In the pre-training stage, we adopt an MLM task, a classification task and a contrastive learning task to achieve considerable performance. In the fine-tuning stage…

    Submitted 31 January, 2023; originally announced January 2023.

    Comments: KDD Cup Workshop @ KDD 2022

  48. arXiv:2212.14672  [pdf]

    cs.CY cs.CL cs.SI

    Twitter's Agenda-Setting Role: A Study of Twitter Strategy for Political Diversion

    Authors: Yuyang Chen, Xiaoyu Cui, Yunjie Song, Manli Wu

    Abstract: This study verified the effectiveness of Donald Trump's Twitter campaign in guiding agenda-setting and deflecting political risk, examined Trump's Twitter communication strategy, and explored the communication effects of his tweet content during the Covid-19 pandemic. We collected all tweets posted by Trump on the Twitter platform from January 1, 2020 to December 31, 2020. We used Ordinary Least Squ…

    Submitted 16 December, 2022; originally announced December 2022.

    Comments: 14 pages, 6 tables

  49. arXiv:2212.01054  [pdf, other]

    cs.CV cs.AI

    Model and Data Agreement for Learning with Noisy Labels

    Authors: Yuhang Zhang, Weihong Deng, Xingchen Cui, Yunfeng Yin, Hongzhi Shi, Dongchao Wen

    Abstract: Learning with noisy labels is a vital topic for practical deep learning as models should be robust to noisy open-world datasets in the wild. The state-of-the-art noisy label learning approach JoCoR fails when faced with a large ratio of noisy labels. Moreover, selecting small-loss samples can also cause error accumulation as once the noisy samples are mistakenly selected as small-loss samples, the…

    Submitted 24 December, 2022; v1 submitted 2 December, 2022; originally announced December 2022.

    Comments: Accepted by AAAI2023 Workshop

  50. Hybrid MBlur: A Systematic Approach to Augment Rasterization with Ray Tracing for Rendering Motion Blur in Games

    Authors: Yu Wei Tan, Xiaohan Cui, Anand Bhojan

    Abstract: Motion blur is commonly used in game cinematics to achieve photorealism by modelling the behaviour of the camera shutter and simulating its effect associated with the relative motion of scene objects. A common real-time post-process approach is spatial sampling, where the directional blur of a moving object is rendered by integrating its colour based on velocity information within a single frame.…

    Submitted 11 October, 2022; originally announced October 2022.

    ACM Class: I.3