Showing 1–50 of 149 results for author: Dong, B

  1. arXiv:2407.09521  [pdf, other]

    cs.CV cs.NE

    Apprenticeship-Inspired Elegance: Synergistic Knowledge Distillation Empowers Spiking Neural Networks for Efficient Single-Eye Emotion Recognition

    Authors: Yang Wang, Haiyang Mei, Qirui Bao, Ziqi Wei, Mike Zheng Shou, Haizhou Li, Bo Dong, Xin Yang

    Abstract: We introduce a novel multimodality synergistic knowledge distillation scheme tailored for efficient single-eye emotion recognition tasks. This method allows a lightweight, unimodal student spiking neural network (SNN) to extract rich knowledge from an event-frame multimodal teacher network. The core strength of this approach is its ability to utilize the ample, coarser temporal cues found in conven… ▽ More

    Submitted 20 June, 2024; originally announced July 2024.

    Comments: Accepted by IJCAI 2024

  2. arXiv:2406.10878  [pdf, other]

    cs.AI cs.CL

    Demonstration Notebook: Finding the Most Suited In-Context Learning Example from Interactions

    Authors: Yiming Tang, Bin Dong

    Abstract: Large language models (LLMs) benefit greatly from prompt engineering, with in-context learning standing as a pivotal technique. While former approaches have provided various ways to construct the demonstrations used for in-context learning, they often ignore the inherent heterogeneity within datasets, applying the same demonstrations to all reasoning questions. We observed that the effectiveness o… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  3. arXiv:2405.05714  [pdf, other]

    cs.CV cs.LG

    Estimating Noisy Class Posterior with Part-level Labels for Noisy Label Learning

    Authors: Rui Zhao, Bin Shi, Jianfei Ruan, Tianze Pan, Bo Dong

    Abstract: In noisy label learning, estimating noisy class posteriors plays a fundamental role for developing consistent classifiers, as it forms the basis for estimating clean class posteriors and the transition matrix. Existing methods typically learn noisy class posteriors by training a classification model with noisy labels. However, when labels are incorrect, these models may be misled to overemphasize… ▽ More

    Submitted 2 July, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: CVPR 2024

  4. arXiv:2405.03136  [pdf, other]

    cs.CR

    FOBNN: Fast Oblivious Binarized Neural Network Inference

    Authors: Xin Chen, Zhili Chen, Benchang Dong, Shiwen Wei, Lin Chen, Daojing He

    Abstract: The superior performance of deep learning has propelled the rise of Deep Learning as a Service, enabling users to transmit their private data to service providers for model execution and inference retrieval. Nevertheless, the primary concern remains safeguarding the confidentiality of sensitive user data while optimizing the efficiency of secure protocols. To address this, we develop a fast oblivi… ▽ More

    Submitted 5 May, 2024; originally announced May 2024.

  5. arXiv:2404.16331  [pdf, other]

    cs.CV cs.AI

    IMWA: Iterative Model Weight Averaging Benefits Class-Imbalanced Learning Tasks

    Authors: Zitong Huang, Ze Chen, Bowen Dong, Chaoqi Liang, Erjin Zhou, Wangmeng Zuo

    Abstract: Model Weight Averaging (MWA) is a technique that seeks to enhance model's performance by averaging the weights of multiple trained models. This paper first empirically finds that 1) the vanilla MWA can benefit the class-imbalanced learning, and 2) performing model averaging in the early epochs of training yields a greater performance improvement than doing that in later epochs. Inspired by these t… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

  6. arXiv:2404.02823  [pdf, other]

    cs.CL cs.AI cs.LG

    Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models

    Authors: Haoran Sun, Lixin Liu, Junjie Li, Fengyu Wang, Baohua Dong, Ran Lin, Ruohui Huang

    Abstract: The ability of large language models (LLMs) to follow instructions is crucial to real-world applications. Despite recent advances, several studies have highlighted that LLMs struggle when faced with challenging instructions, especially those that include complex constraints, hindering their effectiveness in various tasks. To address this challenge, we introduce Conifer, a novel instruction tuning… ▽ More

    Submitted 3 April, 2024; originally announced April 2024.

  7. arXiv:2403.16463  [pdf, other]

    cs.CL

    Few-shot Named Entity Recognition via Superposition Concept Discrimination

    Authors: Jiawei Chen, Hongyu Lin, Xianpei Han, Yaojie Lu, Shanshan Jiang, Bin Dong, Le Sun

    Abstract: Few-shot NER aims to identify entities of target types with only limited number of illustrative instances. Unfortunately, few-shot NER is severely challenged by the intrinsic precise generalization problem, i.e., it is hard to accurately determine the desired target type due to the ambiguity stemming from information deficiency. In this paper, we propose Superposition Concept Discriminator (SuperC… ▽ More

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Accepted to LREC-COLING 2024

  8. arXiv:2403.13310  [pdf, other]

    cs.IR cs.LG cs.LO

    A Semantic Search Engine for Mathlib4

    Authors: Guoxiong Gao, Haocheng Ju, Jiedong Jiang, Zihan Qin, Bin Dong

    Abstract: The interactive theorem prover, Lean, enables the verification of formal mathematical proofs and is backed by an expanding community. Central to this ecosystem is its mathematical library, mathlib4, which lays the groundwork for the formalization of an expanding range of mathematical theories. However, searching for theorems in mathlib4 can be challenging. To successfully search in mathlib4, users… ▽ More

    Submitted 20 March, 2024; originally announced March 2024.

  9. arXiv:2403.00030  [pdf, other]

    cs.SI cs.AI cs.CR cs.LG

    GraphPub: Generation of Differential Privacy Graph with High Availability

    Authors: Wanghan Xu, Bin Shi, Ao Liu, Jiqiang Zhang, Bo Dong

    Abstract: In recent years, with the rapid development of graph neural networks (GNN), more and more graph datasets have been published for GNN tasks. However, when an upstream data owner publishes graph data, there are often many privacy concerns, because many real-world graph data contain sensitive information like person's friend list. Differential privacy (DP) is a common method to protect privacy, but d… ▽ More

    Submitted 5 March, 2024; v1 submitted 28 February, 2024; originally announced March 2024.

  10. arXiv:2402.16674  [pdf, other]

    cs.CV

    ConSept: Continual Semantic Segmentation via Adapter-based Vision Transformer

    Authors: Bowen Dong, Guanglei Yang, Wangmeng Zuo, Lei Zhang

    Abstract: In this paper, we delve into the realm of vision transformers for continual semantic segmentation, a problem that has not been sufficiently explored in previous literature. Empirical investigations on the adaptation of existing frameworks to vanilla ViT reveal that incorporating visual adapters into ViTs or fine-tuning ViTs with distillation terms is advantageous for enhancing the segmentation cap… ▽ More

    Submitted 26 February, 2024; originally announced February 2024.

  11. arXiv:2402.05044  [pdf, other]

    cs.CL cs.AI cs.CR cs.LG

    SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models

    Authors: Lijun Li, Bowen Dong, Ruohui Wang, Xuhao Hu, Wangmeng Zuo, Dahua Lin, Yu Qiao, Jing Shao

    Abstract: In the rapidly evolving landscape of Large Language Models (LLMs), ensuring robust safety measures is paramount. To meet this crucial need, we propose SALAD-Bench, a safety benchmark specifically designed for evaluating LLMs, attack, and defense methods. Distinguished by its breadth, SALAD-Bench transcends conventional benchmarks through its large scale, rich diversity, intricate taxonomy s… ▽ More

    Submitted 7 June, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: Accepted at ACL 2024 Findings

  12. arXiv:2402.01441  [pdf, ps, other]

    q-fin.TR cs.LG

    Learning the Market: Sentiment-Based Ensemble Trading Agents

    Authors: Andrew Ye, James Xu, Yi Wang, Yifan Yu, Daniel Yan, Ryan Chen, Bosheng Dong, Vipin Chaudhary, Shuai Xu

    Abstract: We propose the integration of sentiment analysis and deep-reinforcement learning ensemble algorithms for stock trading, and design a strategy capable of dynamically altering its employed agent given concurrent market sentiment. In particular, we create a simple-yet-effective method for extracting news sentiment and combine this with general improvements upon existing works, resulting in automated… ▽ More

    Submitted 2 February, 2024; originally announced February 2024.

  13. EchoWrist: Continuous Hand Pose Tracking and Hand-Object Interaction Recognition Using Low-Power Active Acoustic Sensing On a Wristband

    Authors: Chi-Jung Lee, Ruidong Zhang, Devansh Agarwal, Tianhong Catherine Yu, Vipin Gunda, Oliver Lopez, James Kim, Sicheng Yin, Boao Dong, Ke Li, Mose Sakashita, Francois Guimbretiere, Cheng Zhang

    Abstract: Our hands serve as a fundamental means of interaction with the world around us. Therefore, understanding hand poses and interaction context is critical for human-computer interaction. We present EchoWrist, a low-power wristband that continuously estimates 3D hand pose and recognizes hand-object interactions using active acoustic sensing. EchoWrist is equipped with two speakers emitting inaudible s… ▽ More

    Submitted 29 March, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

  14. arXiv:2311.09622  [pdf]

    cs.RO

    Homography Initialization and Dynamic Weighting Algorithm Based on a Downward-Looking Camera and IMU

    Authors: Bo Dong, Yongkang Tao, Deng Peng, Zhigang Fu

    Abstract: In recent years, the technology in visual-inertial odometry (VIO) has matured considerably and has been widely used in many applications. However, we still encounter challenges when applying VIO to a micro air vehicle (MAV) equipped with a downward-looking camera. Specifically, VIO cannot compute the correct initialization results during take-off and the cumulative drift is large when the MAV is f… ▽ More

    Submitted 16 November, 2023; originally announced November 2023.

  15. arXiv:2311.01918  [pdf, other]

    cs.CL cs.AI cs.LG

    Large Language Models Illuminate a Progressive Pathway to Artificial Healthcare Assistant: A Review

    Authors: Mingze Yuan, Peng Bao, Jiajia Yuan, Yunhao Shen, Zifan Chen, Yi Xie, Jie Zhao, Yang Chen, Li Zhang, Lin Shen, Bin Dong

    Abstract: With the rapid development of artificial intelligence, large language models (LLMs) have shown promising capabilities in mimicking human-level language comprehension and reasoning. This has sparked significant interest in applying LLMs to enhance various aspects of healthcare, ranging from medical education to clinical decision support. However, medicine involves multifaceted data modalities and n… ▽ More

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: 24 pages, 1 figure, 3 tables

  16. arXiv:2311.00502  [pdf, other]

    cs.LG cs.AI cs.CL

    Efficient LLM Inference on CPUs

    Authors: Haihao Shen, Hanwen Chang, Bo Dong, Yu Luo, Hengyu Meng

    Abstract: Large language models (LLMs) have demonstrated remarkable performance and tremendous potential across a wide range of tasks. However, deploying these models has been challenging due to the astronomical amount of model parameters, which requires a demand for large memory capacity and high memory bandwidth. In this paper, we propose an effective approach that can make the deployment of LLMs more eff… ▽ More

    Submitted 7 December, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: NeurIPS'2023 on Efficient Natural Language and Speech Processing

  17. arXiv:2310.14201  [pdf, other]

    cs.LG math.OC

    Prompt Engineering Through the Lens of Optimal Control

    Authors: Yifan Luo, Yiming Tang, Chengfeng Shen, Zhennan Zhou, Bin Dong

    Abstract: Prompt Engineering (PE) has emerged as a critical technique for guiding Large Language Models (LLMs) in solving intricate tasks. Its importance is highlighted by its potential to significantly enhance the efficiency and effectiveness of human-machine interaction. As tasks grow increasingly complex, recent advanced PE methods have extended beyond the limitations of single-round interactions to embr… ▽ More

    Submitted 3 November, 2023; v1 submitted 22 October, 2023; originally announced October 2023.

  18. A Geometrical Approach to Evaluate the Adversarial Robustness of Deep Neural Networks

    Authors: Yang Wang, Bo Dong, Ke Xu, Haiyin Piao, Yufei Ding, Baocai Yin, Xin Yang

    Abstract: Deep Neural Networks (DNNs) are widely used for computer vision tasks. However, it has been shown that deep models are vulnerable to adversarial attacks, i.e., their performances drop when imperceptible perturbations are made to the original inputs, which may further degrade the following visual tasks or introduce new problems such as data and privacy security. Hence, metrics for evaluating the ro… ▽ More

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: ACM Transactions on Multimedia Computing, Communications, and Applications (ACM TOMM)

  19. arXiv:2310.06201  [pdf, other]

    cs.CL

    Compressing Context to Enhance Inference Efficiency of Large Language Models

    Authors: Yucheng Li, Bo Dong, Chenghua Lin, Frank Guerin

    Abstract: Large language models (LLMs) achieved remarkable performance across various tasks. However, they face challenges in managing long documents and extended conversations, due to significantly increased computational requirements, both in memory and inference time, and potential context truncation when the input exceeds the LLM's fixed context length. This paper proposes a method called Selective Cont… ▽ More

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023. arXiv admin note: substantial text overlap with arXiv:2304.12102; text overlap with arXiv:2303.11076 by other authors

  20. In the Blink of an Eye: Event-based Emotion Recognition

    Authors: Haiwei Zhang, Jiqing Zhang, Bo Dong, Pieter Peers, Wenwei Wu, Xiaopeng Wei, Felix Heide, Xin Yang

    Abstract: We introduce a wearable single-eye emotion recognition device and a real-time approach to recognizing emotions from partial observations of an emotion that is robust to changes in lighting conditions. At the heart of our method is a bio-inspired event-based camera setup and a newly designed lightweight Spiking Eye Emotion Network (SEEN). Compared to conventional cameras, event-based cameras offer… ▽ More

    Submitted 6 October, 2023; originally announced October 2023.

    Journal ref: Special Interest Group for Computer GRAPHICS,2023

  21. Event-Enhanced Multi-Modal Spiking Neural Network for Dynamic Obstacle Avoidance

    Authors: Yang Wang, Bo Dong, Yuji Zhang, Yunduo Zhou, Haiyang Mei, Ziqi Wei, Xin Yang

    Abstract: Autonomous obstacle avoidance is of vital importance for an intelligent agent such as a mobile robot to navigate in its environment. Existing state-of-the-art methods train a spiking neural network (SNN) with deep reinforcement learning (DRL) to achieve energy-efficient and fast inference speed in complex/unknown scenes. These methods typically assume that the environment is static while the obsta… ▽ More

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: In Proceedings of the 31st ACM International Conference on Multimedia (ACM MM 2023)

  22. arXiv:2309.17175  [pdf, other]

    cs.CV

    TextField3D: Towards Enhancing Open-Vocabulary 3D Generation with Noisy Text Fields

    Authors: Tianyu Huang, Yihan Zeng, Bowen Dong, Hang Xu, Songcen Xu, Rynson W. H. Lau, Wangmeng Zuo

    Abstract: Recent works learn 3D representation explicitly under text-3D guidance. However, limited text-3D data restricts the vocabulary scale and text control of generations. Generators may easily fall into a stereotype concept for certain text prompts, thus losing open-vocabulary generation ability. To tackle this issue, we introduce a conditional 3D generative model, namely TextField3D. Specifically, rat… ▽ More

    Submitted 14 March, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: Accepted by ICLR 2024

  23. arXiv:2309.09574  [pdf, other]

    cs.LG math-ph math.OC physics.ao-ph

    Latent assimilation with implicit neural representations for unknown dynamics

    Authors: Zhuoyuan Li, Bin Dong, Pingwen Zhang

    Abstract: Data assimilation is crucial in a wide range of applications, but it often faces challenges such as high computational costs due to data dimensionality and incomplete understanding of underlying mechanisms. To address these challenges, this study presents a novel assimilation framework, termed Latent Assimilation with Implicit Neural Representations (LAINR). By introducing Spherical Implicit Neura… ▽ More

    Submitted 22 March, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: 40 pages

    MSC Class: 68T07; 49N45; 33C55

  24. arXiv:2308.12060  [pdf, other]

    cs.CL cs.AI

    FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base Question Answering

    Authors: Zhenyu Li, Sunqi Fan, Yu Gu, Xiuxing Li, Zhichao Duan, Bowen Dong, Ning Liu, Jianyong Wang

    Abstract: Knowledge base question answering (KBQA) is a critical yet challenging task due to the vast number of entities within knowledge bases and the diversity of natural language questions posed by users. Unfortunately, the performance of most KBQA models tends to decline significantly in real-world scenarios where high-quality annotated data is insufficient. To mitigate the burden associated with manual… ▽ More

    Submitted 26 January, 2024; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: Accepted as AAAI-24 Oral paper; Knowledge Base Question Answering; Large Language Model; Data Generation; Few-Shot & Zero-Shot

  25. arXiv:2308.11355  [pdf, ps, other]

    math.AG cs.LG math.RT

    Machine learning assisted exploration for affine Deligne-Lusztig varieties

    Authors: Bin Dong, Xuhua He, Pengfei Jin, Felix Schremmer, Qingchao Yu

    Abstract: This paper presents a novel, interdisciplinary study that leverages a Machine Learning (ML) assisted framework to explore the geometry of affine Deligne-Lusztig varieties (ADLV). The primary objective is to investigate the nonemptiness pattern, dimension and enumeration of irreducible components of ADLV. Our proposed framework demonstrates a recursive pipeline of data generation, model training, p… ▽ More

    Submitted 22 August, 2023; originally announced August 2023.

    Comments: 36 pages

    MSC Class: 22E35; 22E67

  26. arXiv:2308.07392  [pdf, other]

    cs.CV

    A Unified Query-based Paradigm for Camouflaged Instance Segmentation

    Authors: Bo Dong, Jialun Pei, Rongrong Gao, Tian-Zhu Xiang, Shuo Wang, Huan Xiong

    Abstract: Due to the high similarity between camouflaged instances and the background, the recently proposed camouflaged instance segmentation (CIS) faces challenges in accurate localization and instance segmentation. To this end, inspired by query-based transformers, we propose a unified query-based multi-task learning framework for camouflaged instance segmentation, termed UQFormer, which builds a set of… ▽ More

    Submitted 29 August, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: This paper has been accepted by ACM MM2023

  27. arXiv:2308.00891  [pdf, other]

    cs.DC

    PROV-IO+: A Cross-Platform Provenance Framework for Scientific Data on HPC Systems

    Authors: Runzhou Han, Mai Zheng, Suren Byna, Houjun Tang, Bin Dong, Dong Dai, Yong Chen, Dongkyun Kim, Joseph Hassoun, David Thorsley, Matthew Wolf

    Abstract: Data provenance, or data lineage, describes the life cycle of data. In scientific workflows on HPC systems, scientists often seek diverse provenance (e.g., origins of data products, usage patterns of datasets). Unfortunately, existing provenance solutions cannot address the challenges due to their incompatible provenance models and/or system implementations. In this paper, we analyze four represen… ▽ More

    Submitted 1 August, 2023; originally announced August 2023.

  28. arXiv:2308.00507  [pdf, other]

    eess.IV cs.CV cs.LG

    Improved Prognostic Prediction of Pancreatic Cancer Using Multi-Phase CT by Integrating Neural Distance and Texture-Aware Transformer

    Authors: Hexin Dong, Jiawen Yao, Yuxing Tang, Mingze Yuan, Yingda Xia, Jian Zhou, Hong Lu, Jingren Zhou, Bin Dong, Le Lu, Li Zhang, Zaiyi Liu, Yu Shi, Ling Zhang

    Abstract: Pancreatic ductal adenocarcinoma (PDAC) is a highly lethal cancer in which the tumor-vascular involvement greatly affects the resectability and, thus, overall survival of patients. However, current prognostic prediction methods fail to explicitly and accurately investigate relationships between the tumor and nearby important vessels. This paper proposes a novel learnable neural distance that descr… ▽ More

    Submitted 13 September, 2023; v1 submitted 1 August, 2023; originally announced August 2023.

    Comments: MICCAI 2023

  29. arXiv:2307.14395  [pdf, other]

    cs.LG cs.AI

    Learning to simulate partially known spatio-temporal dynamics with trainable difference operators

    Authors: Xiang Huang, Zhuoyuan Li, Hongsheng Liu, Zidong Wang, Hongye Zhou, Bin Dong, Bei Hua

    Abstract: Recently, using neural networks to simulate spatio-temporal dynamics has received a lot of attention. However, most existing methods adopt pure data-driven black-box models, which have limited accuracy and interpretability. By combining trainable difference operators with black-box models, we propose a new hybrid architecture explicitly embedded with partial prior knowledge of the underlying PDEs… ▽ More

    Submitted 26 July, 2023; originally announced July 2023.

  30. arXiv:2307.04525  [pdf, other]

    eess.IV cs.CV cs.LG

    Cluster-Induced Mask Transformers for Effective Opportunistic Gastric Cancer Screening on Non-contrast CT Scans

    Authors: Mingze Yuan, Yingda Xia, Xin Chen, Jiawen Yao, Junli Wang, Mingyan Qiu, Hexin Dong, Jingren Zhou, Bin Dong, Le Lu, Li Zhang, Zaiyi Liu, Ling Zhang

    Abstract: Gastric cancer is the third leading cause of cancer-related mortality worldwide, but no guideline-recommended screening test exists. Existing methods can be invasive, expensive, and lack sensitivity to identify early-stage gastric cancer. In this study, we explore the feasibility of using a deep learning approach on non-contrast CT scans for gastric cancer detection. We propose a novel cluster-ind… ▽ More

    Submitted 15 July, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

    Comments: MICCAI 2023

  31. arXiv:2306.17799  [pdf, other]

    cs.CV cs.SD eess.AS

    A Low-rank Matching Attention based Cross-modal Feature Fusion Method for Conversational Emotion Recognition

    Authors: Yuntao Shou, Xiangyong Cao, Deyu Meng, Bo Dong, Qinghua Zheng

    Abstract: Conversational emotion recognition (CER) is an important research topic in human-computer interactions. Although deep learning (DL) based CER approaches have achieved excellent performance, existing cross-modal feature fusion methods used in these DL-based approaches either ignore the intra-modal and inter-modal emotional interaction or have high computational complexity. To address these issues,… ▽ More

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: 10 pages, 4 figures

  32. arXiv:2306.16601  [pdf, other]

    cs.LG cs.AI cs.CL

    An Efficient Sparse Inference Software Accelerator for Transformer-based Language Models on CPUs

    Authors: Haihao Shen, Hengyu Meng, Bo Dong, Zhe Wang, Ofir Zafrir, Yi Ding, Yu Luo, Hanwen Chang, Qun Gao, Ziheng Wang, Guy Boudoukh, Moshe Wasserblat

    Abstract: In recent years, Transformer-based language models have become the standard approach for natural language processing tasks. However, stringent throughput and latency requirements in industrial applications are limiting their adoption. To mitigate the gap, model compression techniques such as structured pruning are being used to improve inference efficiency. However, most existing neural network in… ▽ More

    Submitted 28 June, 2023; originally announced June 2023.

  33. arXiv:2306.01499  [pdf, other]

    cs.CL cs.LG

    Can LLMs like GPT-4 outperform traditional AI tools in dementia diagnosis? Maybe, but not today

    Authors: Zhuo Wang, Rongzhen Li, Bowen Dong, Jie Wang, Xiuxing Li, Ning Liu, Chenhui Mao, Wei Zhang, Liling Dong, Jing Gao, Jianyong Wang

    Abstract: Recent investigations show that large language models (LLMs), specifically GPT-4, not only have remarkable capabilities in common Natural Language Processing (NLP) tasks but also exhibit human-level performance on various professional and academic benchmarks. However, whether GPT-4 can be directly used in practical applications and replace traditional artificial intelligence (AI) tools in speciali… ▽ More

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: 16 pages, 6 figures

  34. arXiv:2305.17871  [pdf, other]

    eess.IV cs.CV cs.LG

    PropNet: Propagating 2D Annotation to 3D Segmentation for Gastric Tumors on CT Scans

    Authors: Zifan Chen, Jiazheng Li, Jie Zhao, Yiting Liu, Hongfeng Li, Bin Dong, Lei Tang, Li Zhang

    Abstract: **Background:** Accurate 3D CT scan segmentation of gastric tumors is pivotal for diagnosis and treatment. The challenges lie in the irregular shapes, blurred boundaries of tumors, and the inefficiency of existing methods. **Purpose:** We conducted a study to introduce a model, utilizing human-guided knowledge and unique modules, to address the challenges of 3D tumor segmentation. **Methods:**… ▽ More

    Submitted 28 May, 2023; originally announced May 2023.

  35. arXiv:2305.15907  [pdf, other]

    cs.LG

    Double Descent of Discrepancy: A Task-, Data-, and Model-Agnostic Phenomenon

    Authors: Yifan Luo, Bin Dong

    Abstract: In this paper, we studied two identically-trained neural networks (i.e. networks with the same architecture, trained on the same dataset using the same algorithm, but with different initialization) and found that their outputs discrepancy on the training dataset exhibits a "double descent" phenomenon. We demonstrated through extensive experiments across various tasks, datasets, and network archite… ▽ More

    Submitted 25 May, 2023; originally announced May 2023.

  36. arXiv:2305.09144  [pdf, other]

    cs.CL cs.AI

    Retentive or Forgetful? Diving into the Knowledge Memorizing Mechanism of Language Models

    Authors: Boxi Cao, Qiaoyu Tang, Hongyu Lin, Shanshan Jiang, Bin Dong, Xianpei Han, Jiawei Chen, Tianshu Wang, Le Sun

    Abstract: Memory is one of the most essential cognitive functions serving as a repository of world knowledge and episodes of activities. In recent years, large-scale pre-trained language models have shown remarkable memorizing ability. On the contrary, vanilla neural networks without pre-training have been long observed suffering from the catastrophic forgetting problem. To investigate such a retentive-forg… ▽ More

    Submitted 13 March, 2024; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: Accepted by LREC-COLING 2024

  37. arXiv:2305.07965  [pdf, other]

    cs.HC

    Using a virtual reality interview simulator to explore factors influencing people's behavior

    Authors: Xinyi Luo, Yuyang Wang, Lik-Hang Lee, Zihan Xing, Shan Jin, Boya Dong, Yuanyi Hu, Zeming Chen, Jing Yan, Pan Hui

    Abstract: Virtual reality interview simulator (VRIS) provides an effective and manageable approach for candidates prone to being very nervous during interviews, yet, the major anxiety-inducing elements remain unknown. During an interview, the anxiety levels, overall experience, and performance of interviewees might be affected by various circumstances. By analyzing electrodermal activity and questionnaire,… ▽ More

    Submitted 16 May, 2023; v1 submitted 13 May, 2023; originally announced May 2023.

    Comments: 12 pages, 4 pictures, 9 tables

  38. arXiv:2305.05991  [pdf, other]

    cs.CV eess.IV

    DMNR: Unsupervised De-noising of Point Clouds Corrupted by Airborne Particles

    Authors: Chu Chen, Yanqi Ma, Bingcheng Dong, Junjie Cao

    Abstract: LiDAR sensors are critical for autonomous driving and robotics applications due to their ability to provide accurate range measurements and their robustness to lighting conditions. However, airborne particles, such as fog, rain, snow, and dust, will degrade its performance and it is inevitable to encounter these inclement environmental conditions outdoors. It would be a straightforward approach to… ▽ More

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: 8 pages, 6 figures, 15 references, submitted paper

  39. U-NEED: A Fine-grained Dataset for User Needs-Centric E-commerce Conversational Recommendation

    Authors: Yuanxing Liu, Weinan Zhang, Baohua Dong, Yan Fan, Hang Wang, Fan Feng, Yifan Chen, Ziyu Zhuang, Hengbin Cui, Yongbin Li, Wanxiang Che

    Abstract: Conversational recommender systems (CRSs) aim to understand the information needs and preferences expressed in a dialogue to recommend suitable items to the user. Most of the existing conversational recommendation datasets are synthesized or simulated with crowdsourcing, which has a large gap with real-world scenarios. To bridge the gap, previous work contributes a dataset E-ConvRec, based on pre-… ▽ More

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: SIGIR23 Resource Track

  40. arXiv:2304.08384  [pdf, other]

    cs.CV eess.IV

    Unsupervised Image Denoising with Score Function

    Authors: Yutong Xie, Mingze Yuan, Bin Dong, Quanzheng Li

    Abstract: Though achieving excellent performance in some cases, current unsupervised learning methods for single image denoising usually have constraints in applications. In this paper, we propose a new approach which is more general and applicable to complicated noise models. Utilizing the property of score function, the gradient of logarithmic probability, we define a solving system for denoising. Once th… ▽ More

    Submitted 17 April, 2023; originally announced April 2023.

  41. arXiv:2304.00212  [pdf, other]

    cs.CV cs.LG

    Devil is in the Queries: Advancing Mask Transformers for Real-world Medical Image Segmentation and Out-of-Distribution Localization

    Authors: Mingze Yuan, Yingda Xia, Hexin Dong, Zifan Chen, Jiawen Yao, Mingyan Qiu, Ke Yan, Xiaoli Yin, Yu Shi, Xin Chen, Zaiyi Liu, Bin Dong, Jingren Zhou, Le Lu, Ling Zhang, Li Zhang

    Abstract: Real-world medical image segmentation has tremendous long-tailed complexity of objects, among which tail conditions correlate with relatively rare diseases and are clinically significant. A trustworthy medical AI algorithm should demonstrate its effectiveness on tail conditions to avoid clinically dangerous damage in these out-of-distribution (OOD) cases. In this paper, we adopt the concept of obj… ▽ More

    Submitted 31 March, 2023; originally announced April 2023.

    Comments: CVPR 2023 Highlight

  42. arXiv:2303.16421  [pdf, other]

    cs.CL

    ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models

    Authors: Ning Bian, Xianpei Han, Le Sun, Hongyu Lin, Yaojie Lu, Ben He, Shanshan Jiang, Bin Dong

    Abstract: Large language models (LLMs) have made significant progress in NLP. However, their ability to memorize, represent, and leverage commonsense knowledge has been a well-known pain point. In this paper, we specifically focus on ChatGPT, a widely used and easily accessible LLM, and ask the following questions: (1) Can ChatGPT effectively answer commonsense questions? (2) Is ChatGPT aware of the underly…

    Submitted 19 April, 2024; v1 submitted 28 March, 2023; originally announced March 2023.

    Comments: Accepted by LREC-COLING 2024

  43. arXiv:2303.06547

    cs.CV

    Towards Universal Vision-language Omni-supervised Segmentation

    Authors: Bowen Dong, Jiaxi Gu, Jianhua Han, Hang Xu, Wangmeng Zuo

    Abstract: Existing open-world universal segmentation approaches usually leverage CLIP and pre-computed proposal masks to treat open-world segmentation tasks as proposal classification. However, 1) these works cannot handle universal segmentation in an end-to-end manner, and 2) the limited scale of panoptic datasets restricts the open-world segmentation ability on things classes. In this paper, we present Vi…

    Submitted 11 March, 2023; originally announced March 2023.

  44. A Comparative Study of Deep Learning and Iterative Algorithms for Joint Channel Estimation and Signal Detection in OFDM Systems

    Authors: Haocheng Ju, Haimiao Zhang, Lin Li, Xiao Li, Bin Dong

    Abstract: Joint channel estimation and signal detection (JCESD) is crucial in orthogonal frequency division multiplexing (OFDM) systems, but traditional algorithms perform poorly in low signal-to-noise ratio (SNR) scenarios. Deep learning (DL) methods have been investigated, but concerns regarding computational expense and lack of validation in low-SNR settings remain. Hence, the development of a robust and…

    Submitted 20 June, 2024; v1 submitted 7 March, 2023; originally announced March 2023.

    Comments: Code is available at https://github.com/j991222/MIMO_JCESD

    Journal ref: Signal Processing 223 (2024), 109554

  45. EDMAE: An Efficient Decoupled Masked Autoencoder for Standard View Identification in Pediatric Echocardiography

    Authors: Yiman Liu, Xiaoxiang Han, Tongtong Liang, Bin Dong, Jiajun Yuan, Menghan Hu, Qiaohong Liu, Jiangang Chen, Qingli Li, Yuqi Zhang

    Abstract: This paper introduces the Efficient Decoupled Masked Autoencoder (EDMAE), a novel self-supervised method for recognizing standard views in pediatric echocardiography. EDMAE introduces a new proxy task based on the encoder-decoder structure. The EDMAE encoder is composed of a teacher and a student encoder. The teacher encoder extracts the potential representation of the masked image blocks, while t…

    Submitted 3 August, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: 15 pages, 5 figures, 8 tables, Published in Biomedical Signal Processing and Control

    Journal ref: Biomedical Signal Processing and Control 86 (2023) 105280

  46. arXiv:2302.03488

    cs.CL cs.AI

    APAM: Adaptive Pre-training and Adaptive Meta Learning in Language Model for Noisy Labels and Long-tailed Learning

    Authors: Sunyi Chi, Bo Dong, Yiming Xu, Zhenyu Shi, Zheng Du

    Abstract: Practical natural language processing (NLP) tasks are commonly long-tailed with noisy labels. These problems challenge the generalization and robustness of complex models such as Deep Neural Networks (DNNs). Some commonly used resampling techniques, such as oversampling or undersampling, could easily lead to overfitting. It is increasingly popular to learn the data weights leveraging a small amount of…

    Submitted 2 May, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

  47. arXiv:2302.02398

    cs.CV

    Diffusion Model for Generative Image Denoising

    Authors: Yutong Xie, Minne Yuan, Bin Dong, Quanzheng Li

    Abstract: In supervised learning for image denoising, paired clean and noisy images are usually collected or synthesised to train a denoising model. L2 norm loss or other distance functions are used as the objective function for training. This often leads to an over-smooth result with fewer image details. In this paper, we regard the denoising task as a problem of estimating the posterior distributi…

    Submitted 5 February, 2023; originally announced February 2023.

  48. arXiv:2302.00918

    cs.CV

    Visual Realism Assessment for Face-swap Videos

    Authors: Xianyun Sun, Beibei Dong, Caiyong Wang, Bo Peng, Jing Dong

    Abstract: Deep-learning based face-swap videos, also known as deep fakes, are becoming more and more realistic and deceptive. The malicious usage of these face-swap videos has caused wide concern. The research community has been focusing on the automatic detection of these fake videos, but the assessment of their visual realism, as perceived by human eyes, is still an unexplored dimension. Visual realism a…

    Submitted 30 July, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

    Comments: Accepted by ICIG 2023

  49. arXiv:2301.04648

    cs.CV

    Head-Free Lightweight Semantic Segmentation with Linear Transformer

    Authors: Bo Dong, Pichao Wang, Fan Wang

    Abstract: Existing semantic segmentation works have mainly focused on designing effective decoders; however, the computational load introduced by the overall structure has long been ignored, which hinders their application on resource-constrained hardware. In this paper, we propose a head-free lightweight architecture specifically for semantic segmentation, named Adaptive Frequency Transformer. It ad…

    Submitted 11 January, 2023; originally announced January 2023.

    Comments: Accepted by AAAI2023; codes and models are available at https://github.com/dongbo811/AFFormer

  50. arXiv:2212.02190

    cs.CV

    L2SR: Learning to Sample and Reconstruct for Accelerated MRI via Reinforcement Learning

    Authors: Pu Yang, Bin Dong

    Abstract: Magnetic Resonance Imaging (MRI) is a widely used medical imaging technique, but its long acquisition time can be a limiting factor in clinical settings. To address this issue, researchers have been exploring ways to reduce the acquisition time while maintaining the reconstruction quality. Previous works have focused on finding either sparse samplers with a fixed reconstructor or finding reconstru…

    Submitted 6 April, 2024; v1 submitted 5 December, 2022; originally announced December 2022.