Showing 1–50 of 295 results for author: Wei, L

  1. arXiv:2407.10474  [pdf, other]

    cs.MM

    Multi-source Knowledge Enhanced Graph Attention Networks for Multimodal Fact Verification

    Authors: Han Cao, Lingwei Wei, Wei Zhou, Songlin Hu

    Abstract: Multimodal fact verification is an under-explored and emerging field that has gained increasing attention in recent years. The goal is to assess the veracity of claims that involve multiple modalities by analyzing the retrieved evidence. The main challenge in this area is to effectively fuse features from different modalities to learn meaningful multimodal representations. To this end, we propose…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted by ICME 2024

  2. arXiv:2407.09894  [pdf, other]

    cs.SI cs.AI cs.CL

    Transferring Structure Knowledge: A New Task to Fake news Detection Towards Cold-Start Propagation

    Authors: Lingwei Wei, Dou Hu, Wei Zhou, Songlin Hu

    Abstract: Many fake news detection studies have achieved promising performance by extracting effective semantic and structure features from both content and propagation trees. However, it is challenging to apply them to practical situations, especially when using the trained propagation-based models to detect news with no propagation data. Towards this scenario, we study a new task named cold-start fake new…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: ICASSP 2024

  3. arXiv:2407.08961  [pdf]

    eess.IV cs.CV

    Tissue-Contrastive Semi-Masked Autoencoders for Segmentation Pretraining on Chest CT

    Authors: Jie Zheng, Ru Wen, Haiqin Hu, Lina Wei, Kui Su, Wei Chen, Chen Liu, Jun Wang

    Abstract: Existing Masked Image Modeling (MIM) depends on a spatial patch-based masking-reconstruction strategy to perceive objects' features from unlabeled images, which may face two limitations when applied to chest CT: 1) inefficient feature learning due to complex anatomical details presented in CT images, and 2) suboptimal knowledge transfer owing to input disparity between upstream and downstream model…

    Submitted 11 July, 2024; originally announced July 2024.

  4. arXiv:2407.07589  [pdf]

    cs.RO

    MSC-LIO: An MSCKF-Based LiDAR-Inertial Odometry with Same-Plane-Point Tracking

    Authors: Tisheng Zhang, Man Yuan, Linfu Wei, Hailiang Tang, Xiaoji Niu

    Abstract: The multi-state constraint Kalman filter (MSCKF) has been proven to be more efficient than graph optimization for visual-based odometry while with similar accuracy. However, it has not yet been properly considered and studied for LiDAR-based odometry. In this paper, we propose a novel tightly coupled LiDAR-inertial odometry based on the MSCKF framework, named MSC-LIO. An efficient LiDAR same-plane…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 9 pages

  5. arXiv:2407.06494  [pdf, other]

    cs.LG cs.AI

    A Generative Approach to Control Complex Physical Systems

    Authors: Long Wei, Peiyan Hu, Ruiqi Feng, Haodong Feng, Yixuan Du, Tao Zhang, Rui Wang, Yue Wang, Zhi-Ming Ma, Tailin Wu

    Abstract: Controlling the evolution of complex physical systems is a fundamental task across science and engineering. Classical techniques suffer from limited applicability or huge computational costs. On the other hand, recent deep learning and reinforcement learning-based approaches often struggle to optimize long-term control sequences under the constraints of system dynamics. In this work, we introduce…

    Submitted 8 July, 2024; originally announced July 2024.

  6. arXiv:2407.05610  [pdf, other]

    cs.CV

    Described Spatial-Temporal Video Detection

    Authors: Wei Ji, Xiangyan Liu, Yingfei Sun, Jiajun Deng, You Qin, Ammar Nuwanna, Mengyao Qiu, Lina Wei, Roger Zimmermann

    Abstract: Detecting visual content on language expression has become an emerging topic in the community. However, in the video domain, the existing setting, i.e., spatial-temporal video grounding (STVG), is formulated to only detect one pre-existing object in each frame, ignoring the fact that language descriptions can involve none or multiple entities within a video. In this work, we advance the STVG to a…

    Submitted 8 July, 2024; originally announced July 2024.

  7. arXiv:2407.00431  [pdf, other]

    cs.CV

    Location embedding based pairwise distance learning for fine-grained diagnosis of urinary stones

    Authors: Qiangguo Jin, Jiapeng Huang, Changming Sun, Hui Cui, Ping Xuan, Ran Su, Leyi Wei, Yu-Jie Wu, Chia-An Wu, Henry B. L. Duh, Yueh-Hsun Lu

    Abstract: The precise diagnosis of urinary stones is crucial for devising effective treatment strategies. The diagnostic process, however, is often complicated by the low contrast between stones and surrounding tissues, as well as the variability in stone locations across different patients. To address this issue, we propose a novel location embedding based pairwise distance learning network (LEPD-Net) that…

    Submitted 29 June, 2024; originally announced July 2024.

    Journal ref: MICCAI 2024

  8. arXiv:2406.07979  [pdf, other]

    cs.LG cs.AI cs.IR

    Heuristic Learning with Graph Neural Networks: A Unified Framework for Link Prediction

    Authors: Juzheng Zhang, Lanning Wei, Zhen Xu, Quanming Yao

    Abstract: Link prediction is a fundamental task in graph learning, inherently shaped by the topology of the graph. While traditional heuristics are grounded in graph topology, they encounter challenges in generalizing across diverse graphs. Recent research efforts have aimed to leverage the potential of heuristics, yet a unified formulation accommodating both local and global heuristics remains undiscovered…

    Submitted 14 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024

  9. arXiv:2406.06140  [pdf, other]

    cs.CL cs.LG

    Can I understand what I create? Self-Knowledge Evaluation of Large Language Models

    Authors: Zhiquan Tan, Lai Wei, Jindong Wang, Xing Xie, Weiran Huang

    Abstract: Large language models (LLMs) have achieved remarkable progress in linguistic tasks, necessitating robust evaluation frameworks to understand their capabilities and limitations. Inspired by Feynman's principle of understanding through creation, we introduce a self-knowledge evaluation framework that is easy to implement, evaluating models on their ability to comprehend and respond to self-generated…

    Submitted 10 June, 2024; originally announced June 2024.

  10. arXiv:2406.05510  [pdf, other]

    cs.LG cs.CL

    Representation Learning with Conditional Information Flow Maximization

    Authors: Dou Hu, Lingwei Wei, Wei Zhou, Songlin Hu

    Abstract: This paper proposes an information-theoretic representation learning framework, named conditional information flow maximization, to extract noise-invariant sufficient representations for the input data and target task. It promotes the learned representations have good feature uniformity and sufficient predictive ability, which can enhance the generalization of pre-trained language models (PLMs) fo…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 16 pages, accepted to ACL 2024 (main conference)

  11. arXiv:2406.04675  [pdf, other]

    cs.CV

    OVMR: Open-Vocabulary Recognition with Multi-Modal References

    Authors: Zehong Ma, Shiliang Zhang, Longhui Wei, Qi Tian

    Abstract: The challenge of open-vocabulary recognition lies in the model has no clue of new categories it is applied to. Existing works have proposed different methods to embed category cues into the model, e.g., through few-shot fine-tuning, providing category names or textual descriptions to Vision-Language Models. Fine-tuning is time-consuming and degrades the generalization capability. Textual descriptio…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: CVPR2024

  12. arXiv:2405.11548  [pdf, other]

    cs.LG stat.AP

    Adaptive Online Experimental Design for Causal Discovery

    Authors: Muhammad Qasim Elahi, Lai Wei, Murat Kocaoglu, Mahsa Ghasemi

    Abstract: Causal discovery aims to uncover cause-and-effect relationships encoded in causal graphs by leveraging observational, interventional data, or their combination. The majority of existing causal discovery methods are developed assuming infinite interventional data. We focus on data interventional efficiency and formalize causal discovery from the perspective of online learning, inspired by pure expl…

    Submitted 22 June, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

    Comments: To appear in Proceedings of ICML 24

  13. arXiv:2405.10496  [pdf, other]

    cs.IT eess.SP

    Electromagnetic Information Theory for Holographic MIMO Communications

    Authors: Li Wei, Tierui Gong, Chongwen Huang, Zhaoyang Zhang, Wei E. I. Sha, Zhi Ning Chen, Linglong Dai, Merouane Debbah, Chau Yuen

    Abstract: Holographic multiple-input multiple-output (HMIMO) utilizes a compact antenna array to form a nearly continuous aperture, thereby enhancing higher capacity and more flexible configurations compared with conventional MIMO systems, making it attractive in current scientific research. Key questions naturally arise regarding the potential of HMIMO to surpass Shannon's theoretical limits and how far it…

    Submitted 25 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

  14. arXiv:2405.08322  [pdf, other]

    cs.CV

    StraightPCF: Straight Point Cloud Filtering

    Authors: Dasith de Silva Edirimuni, Xuequan Lu, Gang Li, Lei Wei, Antonio Robles-Kelly, Hongdong Li

    Abstract: Point cloud filtering is a fundamental 3D vision task, which aims to remove noise while recovering the underlying clean surfaces. State-of-the-art methods remove noise by moving noisy points along stochastic trajectories to the clean surfaces. These methods often require regularization within the training objective and/or during post-processing, to ensure fidelity. In this paper, we introduce Stra…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: This paper has been accepted to the IEEE/CVF CVPR Conference, 2024

  15. arXiv:2405.07065  [pdf, other]

    cs.HC

    LogoMotion: Visually Grounded Code Generation for Content-Aware Animation

    Authors: Vivian Liu, Rubaiat Habib Kazi, Li-Yi Wei, Matthew Fisher, Timothy Langlois, Seth Walker, Lydia Chilton

    Abstract: Animated logos are a compelling and ubiquitous way individuals and brands represent themselves online. Manually authoring these logos can require significant artistic skill and effort. To help novice designers animate logos, design tools currently offer templates and animation presets. However, these solutions can be limited in their expressive range. Large language models have the potential to he…

    Submitted 11 May, 2024; originally announced May 2024.

  16. arXiv:2404.19750  [pdf, other]

    cs.IT eess.SP

    A Joint Communication and Computation Design for Distributed RISs Assisted Probabilistic Semantic Communication in IIoT

    Authors: Zhouxiang Zhao, Zhaohui Yang, Chongwen Huang, Li Wei, Qianqian Yang, Caijun Zhong, Wei Xu, Zhaoyang Zhang

    Abstract: In this paper, the problem of spectral-efficient communication and computation resource allocation for distributed reconfigurable intelligent surfaces (RISs) assisted probabilistic semantic communication (PSC) in industrial Internet-of-Things (IIoT) is investigated. In the considered model, multiple RISs are deployed to serve multiple users, while PSC adopts compute-then-transmit protocol to reduc…

    Submitted 30 April, 2024; originally announced April 2024.

  17. arXiv:2404.12509  [pdf, other]

    cs.GR cs.AI cs.CV cs.LG

    Compositional Neural Textures

    Authors: Peihan Tu, Li-Yi Wei, Matthias Zwicker

    Abstract: Texture plays a vital role in enhancing visual richness in both real photographs and computer-generated imagery. However, the process of editing textures often involves laborious and repetitive manual adjustments of textons, which are the small, recurring local patterns that define textures. In this work, we introduce a fully unsupervised approach for representing textures using a compositional ne…

    Submitted 18 April, 2024; originally announced April 2024.

  18. arXiv:2404.06563  [pdf, other]

    cs.DB cs.LG cs.MM

    Demonstration of MaskSearch: Efficiently Querying Image Masks for Machine Learning Workflows

    Authors: Lindsey Linxi Wei, Chung Yik Edward Yeung, Hongjian Yu, Jingchuan Zhou, Dong He, Magdalena Balazinska

    Abstract: We demonstrate MaskSearch, a system designed to accelerate queries over databases of image masks generated by machine learning models. MaskSearch formalizes and accelerates a new category of queries for retrieving images and their corresponding masks based on mask properties, which support various applications, from identifying spurious correlations learned by models to exploring discrepancies bet…

    Submitted 9 April, 2024; originally announced April 2024.

  19. arXiv:2404.04810  [pdf, other]

    cond-mat.mtrl-sci cs.LG

    AlphaCrystal-II: Distance matrix based crystal structure prediction using deep learning

    Authors: Yuqi Song, Rongzhi Dong, Lai Wei, Qin Li, Jianjun Hu

    Abstract: Computational prediction of stable crystal structures has a profound impact on the large-scale discovery of novel functional materials. However, predicting the crystal structure solely from a material's composition or formula is a promising yet challenging task, as traditional ab initio crystal structure prediction (CSP) methods rely on time-consuming global searches and first-principles free ener…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: 16 pages

  20. arXiv:2403.19600  [pdf, other]

    cs.CV

    Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model

    Authors: Zhicai Wang, Longhui Wei, Tan Wang, Heyu Chen, Yanbin Hao, Xiang Wang, Xiangnan He, Qi Tian

    Abstract: Text-to-image (T2I) generative models have recently emerged as a powerful tool, enabling the creation of photo-realistic images and giving rise to a multitude of applications. However, the effective integration of T2I models into fundamental image classification tasks remains an open question. A prevalent strategy to bolster image classification performance is through augmenting the training set w…

    Submitted 28 March, 2024; originally announced March 2024.

  21. arXiv:2403.13244  [pdf]

    cs.CL cs.AI

    Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model

    Authors: Peng Zhou, Jianmin Wang, Chunyan Li, Zixu Wang, Yiping Liu, Siqi Sun, Jianxin Lin, Leyi Wei, Xibao Cai, Houtim Lai, Wei Liu, Longyue Wang, Xiangxiang Zeng

    Abstract: While various models and computational tools have been proposed for structure and property analysis of molecules, generating molecules that conform to all desired structures and properties remains a challenge. Here, we introduce a multi-constraint molecular generation large language model, TSMMG, which, akin to a student, incorporates knowledge from various small models and tools, namely, the 'tea…

    Submitted 10 July, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: 37 pages, 10 figures

  22. Inter- and intra-uncertainty based feature aggregation model for semi-supervised histopathology image segmentation

    Authors: Qiangguo Jin, Hui Cui, Changming Sun, Yang Song, Jiangbin Zheng, Leilei Cao, Leyi Wei, Ran Su

    Abstract: Acquiring pixel-level annotations is often limited in applications such as histology studies that require domain expertise. Various semi-supervised learning approaches have been developed to work with limited ground truth annotations, such as the popular teacher-student models. However, hierarchical prediction uncertainty within the student model (intra-uncertainty) and image prediction uncertaint…

    Submitted 19 March, 2024; originally announced March 2024.

    Journal ref: Expert Systems with Applications, 2024, 238: 122093

  23. Near-Field Channel Modeling for Holographic MIMO Communications

    Authors: Tierui Gong, Li Wei, Chongwen Huang, George C. Alexandropoulos, Mérouane Debbah, Chau Yuen

    Abstract: Empowered by the latest progress on innovative metamaterials/metasurfaces and advanced antenna technologies, holographic multiple-input multiple-output (H-MIMO) emerges as a promising technology to fulfill the extreme goals of the sixth-generation (6G) wireless networks. The antenna arrays utilized in H-MIMO comprise massive (possibly to extreme extent) numbers of antenna elements, densely spaced…

    Submitted 16 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: double column, 9 pages, 3 figures, 2 tables, accepted by IEEE Wireless Communications Magazine

  24. arXiv:2403.09167  [pdf, other]

    cs.CL

    Dial-insight: Fine-tuning Large Language Models with High-Quality Domain-Specific Data Preventing Capability Collapse

    Authors: Jianwei Sun, Chaoyang Mei, Linlin Wei, Kaiyu Zheng, Na Liu, Ming Cui, Tianyi Li

    Abstract: The efficacy of large language models (LLMs) is heavily dependent on the quality of the underlying data, particularly within specialized domains. A common challenge when fine-tuning LLMs for domain-specific applications is the potential degradation of the model's generalization capabilities. To address these issues, we propose a two-stage approach for the construction of production prompts designe…

    Submitted 14 March, 2024; originally announced March 2024.

  25. arXiv:2403.09101  [pdf, other]

    cs.LG cs.CR cs.CV

    Soften to Defend: Towards Adversarial Robustness via Self-Guided Label Refinement

    Authors: Daiwei Yu, Zhuorong Li, Lina Wei, Canghong Jin, Yun Zhang, Sixian Chan

    Abstract: Adversarial training (AT) is currently one of the most effective ways to obtain the robustness of deep neural networks against adversarial attacks. However, most AT methods suffer from robust overfitting, i.e., a significant generalization gap in adversarial robustness between the training and testing curves. In this paper, we first identify a connection between robust overfitting and the excessiv…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  26. arXiv:2403.08787  [pdf]

    cs.CV cs.LG

    Multi-view Subspace Clustering via An Adaptive Consensus Graph Filter

    Authors: Lai Wei, Shanshan Song

    Abstract: Multiview subspace clustering (MVSC) has attracted an increasing amount of attention in recent years. Most existing MVSC methods first collect complementary information from different views and consequently derive a consensus reconstruction coefficient matrix to indicate the subspace structure of a multi-view data set. In this paper, we initially assume the existence of a consensus reconstruction…

    Submitted 29 January, 2024; originally announced March 2024.

  27. arXiv:2403.08630  [pdf, other]

    stat.ME cs.LG

    Leveraging Non-Decimated Wavelet Packet Features and Transformer Models for Time Series Forecasting

    Authors: Guy P Nason, James L. Wei

    Abstract: This article combines wavelet analysis techniques with machine learning methods for univariate time series forecasting, focusing on three main contributions. Firstly, we consider the use of Daubechies wavelets with different numbers of vanishing moments as input features to both non-temporal and temporal forecasting methods, by selecting these numbers during the cross-validation phase. Secondly, w…

    Submitted 13 March, 2024; originally announced March 2024.

    MSC Class: 62M10; 62M45

  28. arXiv:2403.08339  [pdf, other]

    cs.IT eess.SP

    Low-Complexity Beam Training for Multi-RIS-Assisted Multi-User Communications

    Authors: Yuan Xu, Chongwen Huang, Li Wei, Zhaohui Yang, Xiaoming Chen, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah

    Abstract: In this paper, we investigate the beam training problem in the multi-user millimeter wave (mmWave) communication system, where multiple reconfigurable intelligent surfaces (RISs) are deployed to improve the coverage and the achievable rate. However, existing beam training techniques in mmWave systems suffer from the high complexity (i.e., exponential order) and low identification accuracy. To addr…

    Submitted 9 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  29. arXiv:2403.07764  [pdf, other]

    cs.CV

    Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model

    Authors: Yuxuan Zhang, Lifu Wei, Qing Zhang, Yiren Song, Jiaming Liu, Huaxia Li, Xu Tang, Yao Hu, Haibo Zhao

    Abstract: Current makeup transfer methods are limited to simple makeup styles, making them difficult to apply in real-world scenarios. In this paper, we introduce Stable-Makeup, a novel diffusion-based makeup transfer method capable of robustly transferring a wide range of real-world makeup, onto user-provided faces. Stable-Makeup is based on a pre-trained diffusion model and utilizes a Detail-Preserving (D…

    Submitted 12 March, 2024; originally announced March 2024.

  30. arXiv:2403.06074  [pdf, other]

    cs.IT eess.SP

    Hashing Beam Training for Near-Field Communications

    Authors: Yuan Xu, Li Wei, Chongwen Huang, Chen Zhu, Zhaohui Yang, Jun Yang, Jiguang He, Zhaoyang Zhang, Mérouane Debbah

    Abstract: In this paper, we investigate the millimeter-wave (mmWave) near-field beam training problem to find the correct beam direction. In order to address the high complexity and low identification accuracy of existing beam training techniques, we propose an efficient hashing multi-arm beam (HMB) training scheme for the near-field scenario. Specifically, we first design a set of sparse bases based on the…

    Submitted 9 April, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.04913

  31. arXiv:2403.06073  [pdf, other]

    cs.IT eess.SP

    Stochastic Geometry Analysis for Distributed RISs-Assisted mmWave Communications

    Authors: Yuan Xu, Li Wei, Chongwen Huang, Yongxu Zhu, Zhaohui Yang, Jun Yang, Jiguang He, Zhaoyang Zhang, Mérouane Debbah

    Abstract: Millimeter wave (mmWave) has attracted considerable attention due to its wide bandwidth and high frequency. However, it is highly susceptible to blockages, resulting in significant degradation of the coverage and the sum rate. A promising approach is deploying distributed reconfigurable intelligent surfaces (RISs), which can establish extra communication links. In this paper, we investigate the im…

    Submitted 9 April, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2402.06154

  32. arXiv:2403.06066  [pdf]

    eess.IV cs.CV cs.LG

    CausalCellSegmenter: Causal Inference inspired Diversified Aggregation Convolution for Pathology Image Segmentation

    Authors: Dawei Fan, Yifan Gao, Jiaming Yu, Yanping Chen, Wencheng Li, Chuancong Lin, Kaibin Li, Changcai Yang, Riqing Chen, Lifang Wei

    Abstract: Deep learning models have shown promising performance for cell nucleus segmentation in the field of pathology image analysis. However, training a robust model from multiple domains remains a great challenge for cell nucleus segmentation. Additionally, the shortcomings of background noise, highly overlapping between cell nucleus, and blurred edges often lead to poor performance. To address these ch…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: 10 pages, 5 figures, 2 tables, MICCAI

  33. arXiv:2403.02074  [pdf, other]

    cs.CV cs.AI

    Modality-Aware and Shift Mixer for Multi-modal Brain Tumor Segmentation

    Authors: Zhongzhen Huang, Linda Wei, Shaoting Zhang, Xiaofan Zhang

    Abstract: Combining images from multi-modalities is beneficial to explore various information in computer vision, especially in the medical domain. As an essential part of clinical diagnosis, multi-modal brain tumor segmentation aims to delineate the malignant entity involving multiple modalities. Although existing methods have shown remarkable performance in the task, the information exchange for cross-sca…

    Submitted 4 March, 2024; originally announced March 2024.

  34. arXiv:2402.13589  [pdf, other]

    cs.HC

    Affective Computing for Healthcare: Recent Trends, Applications, Challenges, and Beyond

    Authors: Yuanyuan Liu, Ke Wang, Lin Wei, Jingying Chen, Yibing Zhan, Dapeng Tao, Zhe Chen

    Abstract: Affective computing, which aims to recognize, interpret, and understand human emotions, provides benefits in healthcare, such as improving patient care and enhancing doctor-patient communication. However, there is a noticeable absence of a comprehensive summary of recent advancements in affective computing for healthcare, which could pose difficulties for researchers entering this field. To addres…

    Submitted 21 February, 2024; originally announced February 2024.

  35. arXiv:2402.11641  [pdf, other]

    cs.LG

    Towards Versatile Graph Learning Approach: from the Perspective of Large Language Models

    Authors: Lanning Wei, Jun Gao, Huan Zhao, Quanming Yao

    Abstract: Graph-structured data are commonly used and have wide application scenarios in the real world. For these diverse applications, the vast variety of learning tasks, graph domains, and complex graph learning procedures present challenges for human experts when designing versatile graph learning approaches. Facing these challenges, large language models (LLMs) offer a potential solution due to the…

    Submitted 23 February, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  36. arXiv:2401.17139  [pdf, other]

    cs.LG cs.AI cs.CL cs.IT

    Large Language Model Evaluation via Matrix Entropy

    Authors: Lai Wei, Zhiquan Tan, Chenghai Li, Jindong Wang, Weiran Huang

    Abstract: Large language models (LLMs) have revolutionized the field of natural language processing, extending their strong capabilities into multi-modal domains. Thus, it is vital to define proper and diversified metrics for the evaluation of LLMs. In this paper, we introduce matrix entropy, a novel metric rooted in information theory and geometry principles to quantify the data compression proficiency i…

    Submitted 30 January, 2024; originally announced January 2024.

  37. arXiv:2401.14446  [pdf, other]

    cs.CY cs.AI cs.CR

    Black-Box Access is Insufficient for Rigorous AI Audits

    Authors: Stephen Casper, Carson Ezell, Charlotte Siegmann, Noam Kolt, Taylor Lynn Curtis, Benjamin Bucknall, Andreas Haupt, Kevin Wei, Jérémy Scheurer, Marius Hobbhahn, Lee Sharkey, Satyapriya Krishna, Marvin Von Hagen, Silas Alberti, Alan Chan, Qinyi Sun, Michael Gerovitch, David Bau, Max Tegmark, David Krueger, Dylan Hadfield-Menell

    Abstract: External audits of AI systems are increasingly recognized as a key mechanism for AI governance. The effectiveness of an audit, however, depends on the degree of access granted to auditors. Recent audits of state-of-the-art AI systems have primarily relied on black-box access, in which auditors can only query the system and observe its outputs. However, white-box access to the system's inner workin…

    Submitted 29 May, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: FAccT 2024

    Journal ref: The 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT '24), June 3-6, 2024, Rio de Janeiro, Brazil

  38. arXiv:2401.13171  [pdf, other]

    cs.LG cs.AI cs.CE

    Compositional Generative Inverse Design

    Authors: Tailin Wu, Takashi Maruyama, Long Wei, Tao Zhang, Yilun Du, Gianluca Iaccarino, Jure Leskovec

    Abstract: Inverse design, where we seek to design input variables in order to optimize an underlying objective function, is an important problem that arises across fields such as mechanical engineering to aerospace engineering. Inverse design is typically formulated as an optimization problem, with recent works leveraging optimization across learned dynamics models. However, as models are optimized they ten…

    Submitted 11 March, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: ICLR 2024 spotlight. 30 pages, 17 figures

  39. arXiv:2401.13138  [pdf, other]

    cs.CY cs.AI

    Visibility into AI Agents

    Authors: Alan Chan, Carson Ezell, Max Kaufmann, Kevin Wei, Lewis Hammond, Herbie Bradley, Emma Bluemke, Nitarshan Rajkumar, David Krueger, Noam Kolt, Lennart Heim, Markus Anderljung

    Abstract: Increased delegation of commercial, scientific, governmental, and personal activities to AI agents -- systems capable of pursuing complex goals with limited supervision -- may exacerbate existing societal risks and introduce new risks. Understanding and mitigating these risks involves critically evaluating existing governance structures, revising and adapting these structures where needed, and ens…

    Submitted 17 May, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: Accepted to ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT 2024)

  40. arXiv:2401.09895  [pdf]

    cs.CV

    Skeleton-Guided Instance Separation for Fine-Grained Segmentation in Microscopy

    Authors: Jun Wang, Chengfeng Zhou, Zhaoyan Ming, Lina Wei, Xudong Jiang, Dahong Qian

    Abstract: One of the fundamental challenges in microscopy (MS) image analysis is instance segmentation (IS), particularly when segmenting cluster regions where multiple objects of varying sizes and shapes may be connected or even overlapped in arbitrary orientations. Existing IS methods usually fail in handling such scenarios, as they rely on coarse instance representations such as keypoints and horizontal…

    Submitted 19 January, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

  41. arXiv:2401.05631  [pdf, other]

    cs.HC cs.AI cs.CL cs.GR

    DrawTalking: Building Interactive Worlds by Sketching and Speaking

    Authors: Karl Toby Rosenberg, Rubaiat Habib Kazi, Li-Yi Wei, Haijun Xia, Ken Perlin

    Abstract: We introduce DrawTalking, an approach to building and controlling interactive worlds by sketching and speaking. It emphasizes user control and flexibility, and gives programming-like capability without requiring code. We built a prototype to demonstrate it. An early open-ended study shows the mechanics resonate and are applicable to many creative-exploratory use cases, with the potential to inspir…

    Submitted 22 April, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    ACM Class: H.5.2; D.2.2; I.2.7; D.1.7; H.5.1

  42. arXiv:2401.03968  [pdf, other]

    q-bio.QM cs.LG q-bio.GN

    scDiffusion: conditional generation of high-quality single-cell data using diffusion model

    Authors: Erpai Luo, Minsheng Hao, Lei Wei, Xuegong Zhang

    Abstract: Single-cell RNA sequencing (scRNA-seq) data are important for studying the laws of life at single-cell level. However, it is still challenging to obtain enough high-quality scRNA-seq data. To mitigate the limited availability of data, generative models have been proposed to computationally generate synthetic scRNA-seq data. Nevertheless, the data generated with current models are not very realisti…

    Submitted 4 March, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

  43. arXiv:2401.03105  [pdf, other]

    cs.CV cs.MM

    Incorporating Visual Experts to Resolve the Information Loss in Multimodal Large Language Models

    Authors: Xin He, Longhui Wei, Lingxi Xie, Qi Tian

    Abstract: Multimodal Large Language Models (MLLMs) are experiencing rapid growth, yielding a plethora of noteworthy contributions in recent months. The prevailing trend involves adopting data-driven methodologies, wherein diverse instruction-following datasets are collected. However, a prevailing challenge persists in these approaches, specifically in relation to the limited visual perception ability, as CL…

    Submitted 13 January, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

  44. arXiv:2312.13933  [pdf, other]

    cs.CL cs.LG

    Structured Probabilistic Coding

    Authors: Dou Hu, Lingwei Wei, Yaxin Liu, Wei Zhou, Songlin Hu

    Abstract: This paper presents a new supervised representation learning framework, namely structured probabilistic coding (SPC), to learn compact and informative representations from input related to the target task. SPC is an encoder-only probabilistic coding technology with a structured regularization from the target space. It can enhance the generalization ability of pre-trained language models for better…

    Submitted 2 May, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: 11 pages, accepted by AAAI 2024 (Oral)

  45. arXiv:2312.03628  [pdf, other]

    cs.CV

    Boosting Segment Anything Model Towards Open-Vocabulary Learning

    Authors: Xumeng Han, Longhui Wei, Xuehui Yu, Zhiyang Dou, Xin He, Kuiran Wang, Zhenjun Han, Qi Tian

    Abstract: The recent Segment Anything Model (SAM) has emerged as a new paradigmatic vision foundation model, showcasing potent zero-shot generalization and flexible prompting. Despite SAM finding applications and adaptations in various domains, its primary limitation lies in the inability to grasp object semantics. In this paper, we present Sambor to seamlessly integrate SAM with the open-vocabulary object…

    Submitted 6 December, 2023; originally announced December 2023.

  46. arXiv:2312.01809  [pdf]

    cs.RO

    SE-LIO: Semantics-enhanced Solid-State-LiDAR-Inertial Odometry for Tree-rich Environments

    Authors: Tisheng Zhang, Linfu Wei, Hailiang Tang, Liqiang Wang, Man Yuan, Xiaoji Niu

    Abstract: In this letter, we propose a semantics-enhanced solid-state-LiDAR-inertial odometry (SE-LIO) in tree-rich environments. Multiple LiDAR frames are first merged and compensated with the inertial navigation system (INS) to increase the point-cloud coverage, thus improving the accuracy of semantic segmentation. The unstructured point clouds, such as tree leaves and dynamic objects, are then removed wi…

    Submitted 4 December, 2023; originally announced December 2023.

  47. arXiv:2311.17355  [pdf, other]

    cs.CL

    Are Large Language Models Good Fact Checkers: A Preliminary Study

    Authors: Han Cao, Lingwei Wei, Mengyang Chen, Wei Zhou, Songlin Hu

    Abstract: Recently, Large Language Models (LLMs) have drawn significant attention due to their outstanding reasoning capabilities and extensive knowledge repository, positioning them as superior in handling various natural language processing tasks compared to other language models. In this paper, we present a preliminary investigation into the potential of LLMs in fact-checking. This study aims to comprehe…

    Submitted 29 November, 2023; originally announced November 2023.

  48. arXiv:2311.14981  [pdf, other]

    cs.CV

    Multi-task Planar Reconstruction with Feature Warping Guidance

    Authors: Luan Wei, Anna Hilsmann, Peter Eisert

    Abstract: Piece-wise planar 3D reconstruction simultaneously segments plane instances and recovers their 3D plane parameters from an image, which is particularly useful for indoor or man-made environments. Efficient reconstruction of 3D planes coupled with semantic predictions offers advantages for a wide range of applications requiring scene understanding and concurrent spatial mapping. However, most exist…

    Submitted 21 December, 2023; v1 submitted 25 November, 2023; originally announced November 2023.

    Comments: For code, see https://github.com/fraunhoferhhi/SOLOPlanes

    Journal ref: VISAPP 2024

  49. arXiv:2311.13614  [pdf, other]

    cs.CV cs.AI

    HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data

    Authors: Qifan Yu, Juncheng Li, Longhui Wei, Liang Pang, Wentao Ye, Bosheng Qin, Siliang Tang, Qi Tian, Yueting Zhuang

    Abstract: Multi-modal Large Language Models (MLLMs) tuned on machine-generated instruction-following data have demonstrated remarkable performance in various multi-modal understanding and generation tasks. However, the hallucinations inherent in machine-generated data, which could lead to hallucinatory outputs in MLLMs, remain under-explored. This work aims to investigate various hallucinations (i.e., objec…

    Submitted 24 March, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: Accepted by CVPR 2024

  50. arXiv:2311.12699  [pdf, other]

    cs.CL cs.AI cs.CY

    Can Large Language Models Understand Content and Propagation for Misinformation Detection: An Empirical Study

    Authors: Mengyang Chen, Lingwei Wei, Han Cao, Wei Zhou, Songlin Hu

    Abstract: Large Language Models (LLMs) have garnered significant attention for their powerful ability in natural language understanding and reasoning. In this paper, we present a comprehensive empirical study to explore the performance of LLMs on misinformation detection tasks. This study stands as the pioneering investigation into the understanding capabilities of multiple LLMs regarding both content and p…

    Submitted 21 November, 2023; originally announced November 2023.