Showing 1–50 of 128 results for author: Meng, C

  1. arXiv:2407.02398  [pdf, other]

    cs.CV

    Consistency Flow Matching: Defining Straight Flows with Velocity Consistency

    Authors: Ling Yang, Zixiang Zhang, Zhilong Zhang, Xingchao Liu, Minkai Xu, Wentao Zhang, Chenlin Meng, Stefano Ermon, Bin Cui

    Abstract: Flow matching (FM) is a general framework for defining probability paths via Ordinary Differential Equations (ODEs) to transform between noise and data samples. Recent approaches attempt to straighten these flow trajectories to generate high-quality samples with fewer function evaluations, typically through iterative rectification methods or optimal transport solutions. In this paper, we introduce…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Code: https://github.com/YangLing0818/consistency_flow_matching

  2. arXiv:2407.01521  [pdf, other]

    cs.LG cs.AI cs.CV

    Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing

    Authors: Bingliang Zhang, Wenda Chu, Julius Berner, Chenlin Meng, Anima Anandkumar, Yang Song

    Abstract: Diffusion models have recently achieved success in solving Bayesian inverse problems with learned data priors. Current methods build on top of the diffusion sampling process, where each denoising step makes small modifications to samples from the previous step. However, this process struggles to correct errors from earlier sampling steps, leading to worse performance in complicated nonlinear inver…

    Submitted 1 July, 2024; originally announced July 2024.

  3. arXiv:2407.00029  [pdf, other]

    cs.DC

    Distributed Inference Performance Optimization for LLMs on CPUs

    Authors: Pujiang He, Shan Zhou, Changqing Li, Wenhuan Huang, Weifei Yu, Duyi Wang, Chen Meng, Sheng Gui

    Abstract: Large language models (LLMs) hold tremendous potential for addressing numerous real-world challenges, yet they typically demand significant computational resources and memory. Deploying LLMs onto a resource-limited hardware device with restricted memory capacity presents considerable challenges. Distributed computing emerges as a prevalent strategy to mitigate single-node memory constraints and ex…

    Submitted 16 May, 2024; originally announced July 2024.

    Comments: 4 pages, 3 figures, Practical ML for Low Resource Settings Workshop @ ICLR 2024

  4. arXiv:2406.14288  [pdf, other]

    cs.LG cs.AI

    Revisiting Modularity Maximization for Graph Clustering: A Contrastive Learning Perspective

    Authors: Yunfei Liu, Jintang Li, Yuehe Chen, Ruofan Wu, Ericbk Wang, Jing Zhou, Sheng Tian, Shuheng Shen, Xing Fu, Changhua Meng, Weiqiang Wang, Liang Chen

    Abstract: Graph clustering, a fundamental and challenging task in graph mining, aims to classify nodes in a graph into several disjoint clusters. In recent years, graph contrastive learning (GCL) has emerged as a dominant line of research in graph clustering and advances the new state-of-the-art. However, GCL-based methods heavily rely on graph augmentations and contrastive schemes, which may potentially in…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: KDD 2024 research track. Code available at https://github.com/EdisonLeeeee/MAGI

  5. arXiv:2406.14250  [pdf, other]

    cs.CV cs.HC

    E-ANT: A Large-Scale Dataset for Efficient Automatic GUI NavigaTion

    Authors: Ke Wang, Tianyu Xia, Zhangxuan Gu, Yi Zhao, Shuheng Shen, Changhua Meng, Weiqiang Wang, Ke Xu

    Abstract: Online GUI navigation on mobile devices has drawn a lot of attention in recent years since it contributes to many real-world applications. With the rapid development of large language models (LLM), multimodal large language models (MLLM) have tremendous potential on this task. However, existing MLLMs need high-quality data to improve their ability to make the correct navigation decisions according…

    Submitted 1 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: 9 pages, 5 figures, Under review

  6. arXiv:2405.05600  [pdf, other]

    cs.IR cs.CL

    Can We Use Large Language Models to Fill Relevance Judgment Holes?

    Authors: Zahra Abbasiantaeb, Chuan Meng, Leif Azzopardi, Mohammad Aliannejadi

    Abstract: Incomplete relevance judgments limit the re-usability of test collections. When new systems are compared against previous systems used to build the pool of judged documents, they often do so at a disadvantage due to the "holes" in the test collection (i.e., pockets of un-assessed documents returned by the new system). In this paper, we take initial steps towards extending existing test collections b…

    Submitted 9 May, 2024; originally announced May 2024.

  7. arXiv:2404.18185  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG

    Ranked List Truncation for Large Language Model-based Re-Ranking

    Authors: Chuan Meng, Negar Arabzadeh, Arian Askari, Mohammad Aliannejadi, Maarten de Rijke

    Abstract: We study ranked list truncation (RLT) from a novel "retrieve-then-re-rank" perspective, where we optimize re-ranking by truncating the retrieved list (i.e., trim re-ranking candidates). RLT is crucial for re-ranking as it can improve re-ranking efficiency by sending variable-length candidate lists to a re-ranker on a per-query basis. It also has the potential to improve re-ranking effectiveness. D…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: Accepted for publication as a long paper at SIGIR 2024

    ACM Class: H.3.3

  8. arXiv:2404.01012  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG

    Query Performance Prediction using Relevance Judgments Generated by Large Language Models

    Authors: Chuan Meng, Negar Arabzadeh, Arian Askari, Mohammad Aliannejadi, Maarten de Rijke

    Abstract: Query performance prediction (QPP) aims to estimate the retrieval quality of a search system for a query without human relevance judgments. Previous QPP methods typically return a single scalar value and do not require the predicted values to approximate a specific information retrieval (IR) evaluation measure, leading to certain drawbacks: (i) a single scalar is insufficient to accurately represe…

    Submitted 17 June, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    ACM Class: H.3.3

  9. arXiv:2404.00732  [pdf, other]

    cs.GT cs.CY

    An Abundance of Katherines: The Game Theory of Baby Naming

    Authors: Katy Blumer, Kate Donahue, Katie Fritz, Kate Ivanovich, Katherine Lee, Katie Luo, Cathy Meng, Katie Van Koevering

    Abstract: In this paper, we study the highly competitive arena of baby naming. Through making several Extremely Reasonable Assumptions (namely, that parents are myopic, perfectly knowledgeable agents who pick a name based solely on its uniqueness), we create a model which is not only tractable and clean, but also perfectly captures the real world. We then extend our investigation with numerical experiments,…

    Submitted 1 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

    Comments: Accepted at SIGBOVIK 2024

  10. arXiv:2403.15183  [pdf, other]

    cs.RO

    CRPlace: Camera-Radar Fusion with BEV Representation for Place Recognition

    Authors: Shaowei Fu, Yifan Duan, Yao Li, Chengzhen Meng, Yingjie Wang, Jianmin Ji, Yanyong Zhang

    Abstract: The integration of complementary characteristics from camera and radar data has emerged as an effective approach in 3D object detection. However, such fusion-based methods remain unexplored for place recognition, an equally important task for autonomous systems. Given that place recognition relies on the similarity between a query scene and the corresponding candidate scene, the stationary backgro…

    Submitted 22 March, 2024; originally announced March 2024.

  11. arXiv:2403.14221  [pdf, other]

    cs.CL

    Improving the Robustness of Large Language Models via Consistency Alignment

    Authors: Yukun Zhao, Lingyong Yan, Weiwei Sun, Guoliang Xing, Shuaiqiang Wang, Chong Meng, Zhicong Cheng, Zhaochun Ren, Dawei Yin

    Abstract: Large language models (LLMs) have shown tremendous success in following user instructions and generating helpful responses. Nevertheless, their robustness is still far from optimal, as they may generate significantly inconsistent responses due to minor changes in the verbalized instructions. Recent literature has explored this inconsistency issue, highlighting the importance of continued improveme…

    Submitted 22 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: Accepted by LREC-COLING 2024

  12. arXiv:2403.04703  [pdf, other]

    cs.RO

    mmPlace: Robust Place Recognition with Intermediate Frequency Signal of Low-cost Single-chip Millimeter Wave Radar

    Authors: Chengzhen Meng, Yifan Duan, Chenming He, Dequan Wang, Xiaoran Fan, Yanyong Zhang

    Abstract: Place recognition is crucial for tasks like loop-closure detection and re-localization. Single-chip millimeter wave radar (single-chip radar in short) emerges as a low-cost sensor option for place recognition, with the advantage of insensitivity to degraded visual environments. However, it encounters two challenges. Firstly, sparse point cloud from single-chip radar leads to poor performance when…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: 8 pages, 8 figures

  13. arXiv:2403.00829  [pdf, other]

    cs.AI cs.CL

    TroubleLLM: Align to Red Team Expert

    Authors: Zhuoer Xu, Jianping Zhang, Shiwen Cui, Changhua Meng, Weiqiang Wang

    Abstract: Large Language Models (LLMs) have become the state-of-the-art solutions for a variety of natural language tasks and are integrated into real-world applications. However, LLMs can be potentially harmful in manifesting undesirable safety issues like social biases and toxic content. It is imperative to assess their safety issues before deployment. However, the quality and diversity of test prompts generated…

    Submitted 27 February, 2024; originally announced March 2024.

  14. arXiv:2402.12962  [pdf, other]

    cs.SE

    Multi-Level ML Based Burst-Aware Autoscaling for SLO Assurance and Cost Efficiency

    Authors: Chunyang Meng, Haogang Tong, Tianyang Wu, Maolin Pan, Yang Yu

    Abstract: Autoscaling is a technology to automatically scale the resources provided to applications without human intervention to guarantee runtime Quality of Service (QoS) while saving costs. However, user-facing cloud applications serve dynamic workloads that often exhibit variability and contain bursts, posing challenges to autoscaling for maintaining QoS within Service-Level Objectives (SLOs). Conser…

    Submitted 20 February, 2024; originally announced February 2024.

  15. arXiv:2402.11633  [pdf, other]

    cs.CL

    Self-seeding and Multi-intent Self-instructing LLMs for Generating Intent-aware Information-Seeking dialogs

    Authors: Arian Askari, Roxana Petcu, Chuan Meng, Mohammad Aliannejadi, Amin Abolghasemi, Evangelos Kanoulas, Suzan Verberne

    Abstract: Identifying user intents in information-seeking dialogs is crucial for a system to meet users' information needs. Intent prediction (IP) is challenging and demands sufficient dialogs with human-labeled intents for training. However, manually annotating intents is resource-intensive. While large language models (LLMs) have been shown to be effective in generating synthetic data, there is no study o…

    Submitted 18 February, 2024; originally announced February 2024.

  16. arXiv:2401.13934  [pdf, other]

    cs.CV

    MambaMorph: a Mamba-based Framework for Medical MR-CT Deformable Registration

    Authors: Tao Guo, Yinuo Wang, Shihao Shu, Diansheng Chen, Zhouping Tang, Cai Meng, Xiangzhi Bai

    Abstract: Capturing voxel-wise spatial correspondence across distinct modalities is crucial for medical image analysis. However, current registration approaches are not practical enough in terms of registration accuracy and clinical applicability. In this paper, we introduce MambaMorph, a novel multi-modality deformable registration framework. Specifically, MambaMorph utilizes a Mamba-based registration mod…

    Submitted 12 March, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

  17. arXiv:2401.11708  [pdf, other]

    cs.CV cs.AI cs.LG

    Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs

    Authors: Ling Yang, Zhaochen Yu, Chenlin Meng, Minkai Xu, Stefano Ermon, Bin Cui

    Abstract: Diffusion models have exhibited exceptional performance in text-to-image generation and editing. However, existing methods often face challenges when handling complex text prompts that involve multiple objects with multiple attributes and relationships. In this paper, we propose a brand new training-free text-to-image generation/editing framework, namely Recaption, Plan and Generate (RPG), harnessin…

    Submitted 5 May, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: ICML 2024. Project: https://github.com/YangLing0818/RPG-DiffusionMaster

  18. arXiv:2312.03606  [pdf, other]

    cs.CV cs.AI cs.LG

    DiffusionSat: A Generative Foundation Model for Satellite Imagery

    Authors: Samar Khanna, Patrick Liu, Linqi Zhou, Chenlin Meng, Robin Rombach, Marshall Burke, David Lobell, Stefano Ermon

    Abstract: Diffusion models have achieved state-of-the-art results on many modalities including images, speech, and video. However, existing models are not tailored to support remote sensing data, which is widely used in important applications including environmental monitoring and crop-yield prediction. Satellite images are significantly different from natural images -- they can be multi-spectral, irregular…

    Submitted 25 May, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: Published at ICLR 2024

  19. arXiv:2311.17082  [pdf, other]

    cs.CV stat.ML

    DreamPropeller: Supercharge Text-to-3D Generation with Parallel Sampling

    Authors: Linqi Zhou, Andy Shih, Chenlin Meng, Stefano Ermon

    Abstract: Recent methods such as Score Distillation Sampling (SDS) and Variational Score Distillation (VSD) using 2D diffusion models for text-to-3D generation have demonstrated impressive generation quality. However, the long generation time of such algorithms significantly degrades the user experience. To tackle this problem, we propose DreamPropeller, a drop-in acceleration algorithm that can be wrapped…

    Submitted 20 May, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: Github repo: https://github.com/alexzhou907/DreamPropeller; Project page: https://alexzhou907.github.io/dreampropeller_page/

  20. arXiv:2311.16605  [pdf, other]

    cs.LG cs.AI

    LasTGL: An Industrial Framework for Large-Scale Temporal Graph Learning

    Authors: Jintang Li, Jiawang Dan, Ruofan Wu, Jing Zhou, Sheng Tian, Yunfei Liu, Baokun Wang, Changhua Meng, Weiqiang Wang, Yuchang Zhu, Liang Chen, Zibin Zheng

    Abstract: Over the past few years, graph neural networks (GNNs) have become powerful and practical tools for learning on (static) graph-structured data. However, many real-world applications, such as social networks and e-commerce, involve temporal graphs where nodes and edges are dynamically evolving. Temporal graph neural networks (TGNNs) have progressively emerged as an extension of GNNs to address time-e…

    Submitted 30 November, 2023; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: Preprint; Work in progress

  21. arXiv:2311.15241  [pdf, other]

    cs.CV cs.RO

    CalibFormer: A Transformer-based Automatic LiDAR-Camera Calibration Network

    Authors: Yuxuan Xiao, Yao Li, Chengzhen Meng, Xingchen Li, Jianmin Ji, Yanyong Zhang

    Abstract: The fusion of LiDARs and cameras has been increasingly adopted in autonomous driving for perception tasks. The performance of such fusion-based algorithms largely depends on the accuracy of sensor calibration, which is challenging due to the difficulty of identifying common features across different data modalities. Previously, many calibration methods involved specific targets and/or manual inter…

    Submitted 17 March, 2024; v1 submitted 26 November, 2023; originally announced November 2023.

  22. arXiv:2311.08190  [pdf, other]

    eess.IV cs.CV cs.LG

    SAMIHS: Adaptation of Segment Anything Model for Intracranial Hemorrhage Segmentation

    Authors: Yinuo Wang, Kai Chen, Weimin Yuan, Cai Meng, XiangZhi Bai

    Abstract: Segment Anything Model (SAM), a vision foundation model trained on large-scale annotations, has recently continued raising awareness within medical image segmentation. Despite the impressive capabilities of SAM on natural scenes, it struggles with performance decline when confronted with medical images, especially those involving blurry boundaries and highly irregular regions of low contrast. In t…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: 5 pages, 3 figures, 2 tables

  23. arXiv:2311.04287  [pdf, other]

    cs.CV cs.LG

    Holistic Evaluation of Text-To-Image Models

    Authors: Tony Lee, Michihiro Yasunaga, Chenlin Meng, Yifan Mai, Joon Sung Park, Agrim Gupta, Yunzhi Zhang, Deepak Narayanan, Hannah Benita Teufel, Marco Bellagente, Minguk Kang, Taesung Park, Jure Leskovec, Jun-Yan Zhu, Li Fei-Fei, Jiajun Wu, Stefano Ermon, Percy Liang

    Abstract: The stunning qualitative improvement of recent text-to-image models has led to their widespread attention and adoption. However, we lack a comprehensive quantitative understanding of their capabilities and risks. To fill this gap, we introduce a new benchmark, Holistic Evaluation of Text-to-Image Models (HEIM). Whereas previous evaluations focus mostly on text-image alignment and image quality, we…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: NeurIPS 2023. First three authors contributed equally

  24. arXiv:2311.00886  [pdf, other]

    cs.LG

    COSTAR: Improved Temporal Counterfactual Estimation with Self-Supervised Learning

    Authors: Chuizheng Meng, Yihe Dong, Sercan Ö. Arık, Yan Liu, Tomas Pfister

    Abstract: Estimation of temporal counterfactual outcomes from observed history is crucial for decision-making in many domains such as healthcare and e-commerce, particularly when randomized controlled trials (RCTs) suffer from high cost or impracticality. For real-world datasets, modeling time-dependent confounders is challenging due to complex dynamics, long-range dependencies and both past treatments and…

    Submitted 12 February, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

  25. arXiv:2310.17918  [pdf, other]

    cs.CL cs.AI

    Knowing What LLMs DO NOT Know: A Simple Yet Effective Self-Detection Method

    Authors: Yukun Zhao, Lingyong Yan, Weiwei Sun, Guoliang Xing, Chong Meng, Shuaiqiang Wang, Zhicong Cheng, Zhaochun Ren, Dawei Yin

    Abstract: Large Language Models (LLMs) have shown great potential in Natural Language Processing (NLP) tasks. However, recent literature reveals that LLMs generate nonfactual responses intermittently, which impedes the LLMs' reliability for further utilization. In this paper, we propose a novel self-detection method to detect which questions that an LLM does not know that are prone to generate nonfactual res…

    Submitted 21 March, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

    Comments: Accepted by NAACL 2024

  26. arXiv:2310.16834  [pdf, other]

    stat.ML cs.CL cs.LG

    Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution

    Authors: Aaron Lou, Chenlin Meng, Stefano Ermon

    Abstract: Despite their groundbreaking performance for many generative modeling tasks, diffusion models have fallen short on discrete data domains such as natural language. Crucially, standard diffusion models rely on the well-established theory of score matching, but efforts to generalize this to discrete structures have not yielded the same empirical gains. In this work, we bridge this gap by proposing sc…

    Submitted 6 June, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

    Comments: ICML 2024 Oral. Code at https://github.com/louaaron/Score-Entropy-Discrete-Diffusion

  27. arXiv:2310.16319  [pdf, other]

    cs.CL

    DiQAD: A Benchmark Dataset for End-to-End Open-domain Dialogue Assessment

    Authors: Yukun Zhao, Lingyong Yan, Weiwei Sun, Chong Meng, Shuaiqiang Wang, Zhicong Cheng, Zhaochun Ren, Dawei Yin

    Abstract: Dialogue assessment plays a critical role in the development of open-domain dialogue systems. Existing work is incapable of providing an end-to-end and human-epistemic assessment dataset, while they only provide sub-metrics like coherence or the dialogues are conversed between annotators far from real user settings. In this paper, we release a large-scale dialogue quality assessment dataset (DiQA…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: Accepted to Findings of EMNLP 2023

  28. arXiv:2310.11664  [pdf, other]

    cs.LG cs.AI

    Hetero$^2$Net: Heterophily-aware Representation Learning on Heterogeneous Graphs

    Authors: Jintang Li, Zheng Wei, Jiawang Dan, Jing Zhou, Yuchang Zhu, Ruofan Wu, Baokun Wang, Zhang Zhen, Changhua Meng, Hong Jin, Zibin Zheng, Liang Chen

    Abstract: Real-world graphs are typically complex, exhibiting heterogeneity in the global structure, as well as strong heterophily within local neighborhoods. While a growing body of literature has revealed the limitations of common graph neural networks (GNNs) in handling homogeneous graphs with heterophily, little work has been conducted on investigating the heterophily properties in the context of hetero…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: Preprint

  29. arXiv:2310.11281  [pdf, other]

    cs.LG

    Self-supervision meets kernel graph neural models: From architecture to augmentations

    Authors: Jiawang Dan, Ruofan Wu, Yunpeng Liu, Baokun Wang, Changhua Meng, Tengfei Liu, Tianyi Zhang, Ningtao Wang, Xing Fu, Qi Li, Weiqiang Wang

    Abstract: Graph representation learning has now become the de facto standard when handling graph-structured data, with the framework of message-passing graph neural networks (MPNN) being the most prevailing algorithmic tool. Despite its popularity, the family of MPNNs suffers from several drawbacks such as limited transparency and expressivity. Recently, the idea of designing neural models on graphs using the theor…

    Submitted 17 October, 2023; originally announced October 2023.

  30. arXiv:2310.00413  [pdf, other]

    cs.CV cs.LG eess.IV

    SSIF: Learning Continuous Image Representation for Spatial-Spectral Super-Resolution

    Authors: Gengchen Mai, Ni Lao, Weiwei Sun, Yuchi Ma, Jiaming Song, Chenlin Meng, Hongxu Ma, Jinmeng Rao, Ziyuan Li, Stefano Ermon

    Abstract: Existing digital sensors capture images at fixed spatial and spectral resolutions (e.g., RGB, multispectral, and hyperspectral images), and each combination requires bespoke machine learning models. Neural Implicit Functions partially overcome the spatial resolution challenge by representing an image in a resolution-independent way. However, they still operate at fixed, pre-defined spectral resolu…

    Submitted 30 September, 2023; originally announced October 2023.

    MSC Class: 68T07; 68T45 ACM Class: I.4.10; I.2.10; I.4.6

  31. arXiv:2309.00859  [pdf, other]

    cs.SE

    DeepScaler: Holistic Autoscaling for Microservices Based on Spatiotemporal GNN with Adaptive Graph Learning

    Authors: Chunyang Meng, Shijie Song, Haogang Tong, Maolin Pan, Yang Yu

    Abstract: Autoscaling functions provide the foundation for achieving elasticity in the modern cloud computing paradigm. They enable dynamic provisioning or de-provisioning of resources for cloud software services and applications without human intervention to adapt to workload fluctuations. However, autoscaling microservices is challenging due to various factors. In particular, complex, time-varying service depe…

    Submitted 2 September, 2023; originally announced September 2023.

    Comments: To be published in the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE 2023)

  32. arXiv:2309.00169  [pdf, other]

    eess.AS cs.LG cs.SD

    RepCodec: A Speech Representation Codec for Speech Tokenization

    Authors: Zhichao Huang, Chutong Meng, Tom Ko

    Abstract: With recent rapid growth of large language models (LLMs), discrete speech tokenization has played an important role for injecting speech into LLMs. However, this discretization gives rise to a loss of information, consequently impairing overall performance. To improve the performance of these discrete speech tokens, we present RepCodec, a novel speech representation codec for semantic speech token…

    Submitted 6 June, 2024; v1 submitted 31 August, 2023; originally announced September 2023.

  33. arXiv:2308.12061  [pdf, other]

    cs.CV cs.LG

    HarvestNet: A Dataset for Detecting Smallholder Farming Activity Using Harvest Piles and Remote Sensing

    Authors: Jonathan Xu, Amna Elmustafa, Liya Weldegebriel, Emnet Negash, Richard Lee, Chenlin Meng, Stefano Ermon, David Lobell

    Abstract: Small farms contribute to a large share of the productive land in developing countries. In regions such as sub-Saharan Africa, where 80% of farms are small (under 2 ha in size), the task of mapping smallholder cropland is an important part of tracking sustainability measures such as crop productivity. However, the visually diverse and nuanced appearance of small farms has limited the effectivenes…

    Submitted 5 March, 2024; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: submitted to AAAI24

  34. arXiv:2308.11198  [pdf, other]

    cs.CV

    Novel-view Synthesis and Pose Estimation for Hand-Object Interaction from Sparse Views

    Authors: Wentian Qu, Zhaopeng Cui, Yinda Zhang, Chenyu Meng, Cuixia Ma, Xiaoming Deng, Hongan Wang

    Abstract: Hand-object interaction understanding and the barely addressed novel view synthesis are highly desired in the immersive communication, whereas it is challenging due to the high deformation of hand and heavy occlusions between hand and object. In this paper, we propose a neural rendering and pose estimation system for hand-object interaction from sparse views, which can also enable 3D hand-object i…

    Submitted 22 August, 2023; originally announced August 2023.

  35. arXiv:2308.09966  [pdf, other]

    cs.IR

    Time-aligned Exposure-enhanced Model for Click-Through Rate Prediction

    Authors: Hengyu Zhang, Chang Meng, Wei Guo, Huifeng Guo, Jieming Zhu, Guangpeng Zhao, Ruiming Tang, Xiu Li

    Abstract: Click-Through Rate (CTR) prediction, crucial in applications like recommender systems and online advertising, involves ranking items based on the likelihood of user clicks. User behavior sequence modeling has marked progress in CTR prediction, which extracts users' latent interests from their historical behavior sequences to facilitate accurate CTR prediction. Recent research explores using implic…

    Submitted 19 August, 2023; originally announced August 2023.

    Comments: 11 pages, 5 figures

  36. arXiv:2308.07625  [pdf, other]

    cs.CV cs.CR cs.LG

    Backpropagation Path Search On Adversarial Transferability

    Authors: Zhuoer Xu, Zhangxuan Gu, Jianping Zhang, Shiwen Cui, Changhua Meng, Weiqiang Wang

    Abstract: Deep neural networks are vulnerable to adversarial examples, dictating the imperativeness to test the model's robustness before deployment. Transfer-based attackers craft adversarial examples against surrogate models and transfer them to victim models deployed in the black-box situation. To enhance the adversarial transferability, structure-based attackers adjust the backpropagation path to avoid…

    Submitted 15 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV2023

  37. Parallel Knowledge Enhancement based Framework for Multi-behavior Recommendation

    Authors: Chang Meng, Chenhao Zhai, Yu Yang, Hengyu Zhang, Xiu Li

    Abstract: Multi-behavior recommendation algorithms aim to leverage the multiplex interactions between users and items to learn users' latent preferences. Recent multi-behavior recommendation frameworks contain two steps: fusion and prediction. In the fusion step, advanced neural networks are used to model the hierarchical correlations between user behaviors. In the prediction step, multiple signals are util…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: Accepted by CIKM 2023

  38. arXiv:2307.10875  [pdf, other]

    cs.CV cs.CR cs.LG

    Risk-optimized Outlier Removal for Robust 3D Point Cloud Classification

    Authors: Xinke Li, Junchi Lu, Henghui Ding, Changsheng Sun, Joey Tianyi Zhou, Chee Yeow Meng

    Abstract: With the growth of 3D sensing technology, deep learning systems for 3D point clouds have become increasingly important, especially in applications like autonomous vehicles where safety is a primary concern. However, there are also growing concerns about the reliability of these systems when they encounter noisy point clouds, whether occurring naturally or introduced with malicious intent. This paper…

    Submitted 1 January, 2024; v1 submitted 20 July, 2023; originally announced July 2023.

  39. arXiv:2307.04817  [pdf]

    physics.ao-ph cs.LG

    A physics-constrained machine learning method for mapping gapless land surface temperature

    Authors: Jun Ma, Huanfeng Shen, Menghui Jiang, Liupeng Lin, Chunlei Meng, Chao Zeng, Huifang Li, Penghai Wu

    Abstract: More accurate, spatio-temporally, and physically consistent LST estimation has been a main interest in Earth system research. Developing physics-driven mechanism models and data-driven machine learning (ML) models are two major paradigms for gapless LST estimation, which have their respective advantages and disadvantages. In this paper, a physics-constrained ML model, which combines the strengths…

    Submitted 2 July, 2023; originally announced July 2023.

  40. arXiv:2306.08257  [pdf, other]

    cs.CV cs.CR

    On the Robustness of Latent Diffusion Models

    Authors: Jianping Zhang, Zhuoer Xu, Shiwen Cui, Changhua Meng, Weibin Wu, Michael R. Lyu

    Abstract: Latent diffusion models achieve state-of-the-art performance on a variety of generative tasks, such as image synthesis and image editing. However, the robustness of latent diffusion models is not well studied. Previous works only focus on the adversarial attacks against the encoder or the output image under white-box settings, regardless of the denoising process. Therefore, in this paper, we aim t…

    Submitted 14 June, 2023; originally announced June 2023.

  41. arXiv:2306.06581  [pdf, other]

    stat.ML cs.DS cs.LG math.OC

    Importance Sparsification for Sinkhorn Algorithm

    Authors: Mengyu Li, Jun Yu, Tao Li, Cheng Meng

    Abstract: The Sinkhorn algorithm has been used pervasively to approximate the solution to optimal transport (OT) and unbalanced optimal transport (UOT) problems. However, its practical application is limited due to the high computational complexity. To alleviate the computational burden, we propose a novel importance sparsification method, called Spar-Sink, to efficiently approximate entropy-regularized OT and…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: Accepted by Journal of Machine Learning Research

  42. arXiv:2305.19306  [pdf, other]

    cs.NE cs.AI cs.LG

    A Graph is Worth 1-bit Spikes: When Graph Contrastive Learning Meets Spiking Neural Networks

    Authors: Jintang Li, Huizhe Zhang, Ruofan Wu, Zulun Zhu, Baokun Wang, Changhua Meng, Zibin Zheng, Liang Chen

    Abstract: While contrastive self-supervised learning has become the de-facto learning paradigm for graph neural networks, the pursuit of higher task accuracy requires a larger hidden dimensionality to learn informative and discriminative full-precision representations, raising concerns about computation, memory footprint, and energy consumption burden (largely overlooked) for real-world applications. This w…

    Submitted 19 February, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted to ICLR 2024; Code is available at https://github.com/EdisonLeeeee/SpikeGCL

  43. arXiv:2305.13573  [pdf, other]

    cs.LG cs.SI

    SAD: Semi-Supervised Anomaly Detection on Dynamic Graphs

    Authors: Sheng Tian, Jihai Dong, Jintang Li, Wenlong Zhao, Xiaolong Xu, Baokun Wang, Bowen Song, Changhua Meng, Tianyi Zhang, Liang Chen

    Abstract: Anomaly detection aims to distinguish abnormal instances that deviate significantly from the majority of benign ones. As instances that appear in the real world are naturally connected and can be represented with graphs, graph neural networks become increasingly popular in tackling the anomaly detection problem. Despite the promising results, research on anomaly detection has almost exclusively fo…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted to IJCAI'23. Code will be available at https://github.com/D10Andy/SAD

  44. arXiv:2305.10923  [pdf, other]

    cs.IR cs.CL cs.LG

    Query Performance Prediction: From Ad-hoc to Conversational Search

    Authors: Chuan Meng, Negar Arabzadeh, Mohammad Aliannejadi, Maarten de Rijke

    Abstract: Query performance prediction (QPP) is a core task in information retrieval. The QPP task is to predict the retrieval quality of a search system for a query without relevance judgments. Research has shown the effectiveness and usefulness of QPP for ad-hoc search. Recent years have witnessed considerable progress in conversational search (CS). Effective QPP could help a CS system to decide an approp…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted for publication at SIGIR 2023

    ACM Class: H.3.3

  45. arXiv:2305.10825  [pdf, other]

    cs.CV

    DiffUTE: Universal Text Editing Diffusion Model

    Authors: Haoxing Chen, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Xing Zheng, Yaohui Li, Changhua Meng, Huijia Zhu, Weiqiang Wang

    Abstract: Diffusion model based language-guided image editing has achieved great success recently. However, existing state-of-the-art diffusion models struggle with rendering correct text and text style during generation. To tackle this problem, we propose a universal self-supervised text editing diffusion model (DiffUTE), which aims to replace or modify words in the source image with another one while main…

    Submitted 18 October, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted by NeurIPS'2023

  46. arXiv:2305.10673  [pdf, other]

    cs.LG cs.AI

    Less Can Be More: Unsupervised Graph Pruning for Large-scale Dynamic Graphs

    Authors: Jintang Li, Sheng Tian, Ruofan Wu, Liang Zhu, Wenlong Zhao, Changhua Meng, Liang Chen, Zibin Zheng, Hongzhi Yin

    Abstract: The prevalence of large-scale graphs poses great challenges in time and storage for training and deploying graph neural networks (GNNs). Several recent works have explored solutions for pruning the large original graph into a small and highly-informative one, such that training and inference on the pruned and large graphs have comparable performance. Although empirically effective, current researc…

    Submitted 17 May, 2023; originally announced May 2023.

    Comments: Preprint

  47. arXiv:2305.09699  [pdf, other]

    cs.CV

    Mobile User Interface Element Detection Via Adaptively Prompt Tuning

    Authors: Zhangxuan Gu, Zhuoer Xu, Haoxing Chen, Jun Lan, Changhua Meng, Weiqiang Wang

    Abstract: Recent object detection approaches rely on pretrained vision-language models for image-text alignment. However, they fail to detect the Mobile User Interface (MUI) element since it contains additional OCR information, which describes its content and function but is often ignored. In this paper, we develop a new MUI element detection dataset named MUI-zh and propose an Adaptively Prompt Tuning (APT…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: Accepted by CVPR23

  48. arXiv:2303.17395  [pdf, other]

    eess.AS cs.CL cs.MM cs.SD

    WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research

    Authors: Xinhao Mei, Chutong Meng, Haohe Liu, Qiuqiang Kong, Tom Ko, Chengqi Zhao, Mark D. Plumbley, Yuexian Zou, Wenwu Wang

    Abstract: The advancement of audio-language (AL) multimodal learning tasks has been significant in recent years. However, researchers face challenges due to the costly and time-consuming collection process of existing audio-language datasets, which are limited in size. To address this data scarcity issue, we introduce WavCaps, the first large-scale weakly-labelled audio captioning dataset, comprising approx…

    Submitted 30 March, 2023; originally announced March 2023.

    Comments: 12 pages

  49. arXiv:2303.03933  [pdf, other]

    cs.LG cs.AI cs.DC

    DEDGAT: Dual Embedding of Directed Graph Attention Networks for Detecting Financial Risk

    Authors: Jiafu Wu, Mufeng Yao, Dong Wu, Mingmin Chi, Baokun Wang, Ruofan Wu, Xin Fu, Changhua Meng, Weiqiang Wang

    Abstract: Graph representation plays an important role in the field of financial risk control, where the relationship among users can be constructed in a graph manner. In practical scenarios, the relationships between nodes in risk control tasks are bidirectional, e.g., merchants having both revenue and expense behaviors. Graph neural networks designed for undirected graphs usually aggregate discriminative…

    Submitted 6 March, 2023; originally announced March 2023.

  50. arXiv:2303.02604  [pdf]

    cs.RO

    Two-Stage Grasping: A New Bin Picking Framework for Small Objects

    Authors: Hanwen Cao, Jianshu Zhou, Junda Huang, Yichuan Li, Ng Cheng Meng, Rui Cao, Qi Dou, Yunhui Liu

    Abstract: This paper proposes a novel bin picking framework, two-stage grasping, aiming at precise grasping of cluttered small objects. Object density estimation and rough grasping are conducted in the first stage. Fine segmentation, detection, grasping, and pushing are performed in the second stage. A small object bin picking system has been realized to exhibit the concept of two-stage grasping. Experiment…

    Submitted 6 May, 2023; v1 submitted 5 March, 2023; originally announced March 2023.

    Comments: ICRA 2023