Showing 101–150 of 338 results for author: Cai, D

  1. arXiv:2210.09946  [pdf, other]

    cs.MM cs.AI cs.LG

    MMGA: Multimodal Learning with Graph Alignment

    Authors: Xuan Yang, Quanjin Tao, Xiao Feng, Donghong Cai, Xiang Ren, Yang Yang

    Abstract: Multimodal pre-training breaks down the modality barriers and allows the individual modalities to be mutually augmented with information, resulting in significant advances in representation learning. However, graph modality, as a very general and important form of data, cannot be easily interacted with other modalities because of its non-regular nature. In this paper, we propose MMGA (Multimodal l…

    Submitted 31 October, 2022; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: Please contact xuany@zju.edu.cn for the dataset

  2. arXiv:2210.09773  [pdf, other]

    cs.CL cs.AI

    Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation

    Authors: Deng Cai, Xin Li, Jackie Chun-Sing Ho, Lidong Bing, Wai Lam

    Abstract: We introduce a new method to improve existing multilingual sentence embeddings with Abstract Meaning Representation (AMR). Compared with the original textual input, AMR is a structured semantic representation that presents the core concepts and relations in a sentence explicitly and unambiguously. It also helps reduce surface variations across different expressions and languages. Unlike most prior…

    Submitted 18 October, 2022; originally announced October 2022.

    Comments: EMNLP2022

  3. arXiv:2210.02719  [pdf, other]

    cs.LG

    Continuous Diagnosis and Prognosis by Controlling the Update Process of Deep Neural Networks

    Authors: Chenxi Sun, Hongyan Li, Moxian Song, Derun Cai, Baofeng Zhang, Shenda Hong

    Abstract: Continuous diagnosis and prognosis are essential for intensive care patients. It can provide more opportunities for timely treatment and rational resource allocation, especially for sepsis, a main cause of death in ICU, and COVID-19, a new worldwide epidemic. Although deep learning methods have shown their great superiority in many medical tasks, they tend to catastrophically forget, over fit, and…

    Submitted 6 October, 2022; originally announced October 2022.

    Comments: 41 pages, 15 figures

  4. arXiv:2210.01534  [pdf, other]

    stat.ML cs.LG stat.CO

    Multi-fidelity Monte Carlo: a pseudo-marginal approach

    Authors: Diana Cai, Ryan P. Adams

    Abstract: Markov chain Monte Carlo (MCMC) is an established approach for uncertainty quantification and propagation in scientific applications. A key challenge in applying MCMC to scientific domains is computation: the target density of interest is often a function of expensive computations, such as a high-fidelity physical simulation, an intractable integral, or a slowly-converging iterative algorithm. Thu…

    Submitted 4 October, 2022; originally announced October 2022.

    Comments: 22 pages, 7 figures

  5. arXiv:2209.12028  [pdf, other]

    cs.CV

    Towards Explainable 3D Grounded Visual Question Answering: A New Benchmark and Strong Baseline

    Authors: Lichen Zhao, Daigang Cai, Jing Zhang, Lu Sheng, Dong Xu, Rui Zheng, Yinjie Zhao, Lipeng Wang, Xibo Fan

    Abstract: Recently, 3D vision-and-language tasks have attracted increasing research interest. Compared to other vision-and-language tasks, the 3D visual question answering (VQA) task is less exploited and is more susceptible to language priors and co-reference ambiguity. Meanwhile, a couple of recently proposed 3D VQA datasets do not well support 3D VQA task due to their limited scale and annotation methods…

    Submitted 24 September, 2022; originally announced September 2022.

    Comments: 13 pages, 10 figures

  6. arXiv:2209.11348  [pdf, other]

    quant-ph

    A Depth-Progressive Initialization Strategy for Quantum Approximate Optimization Algorithm

    Authors: Xinwei Lee, Ningyi Xie, Yoshiyuki Saito, Dongsheng Cai, Nobuyoshi Asai

    Abstract: The quantum approximate optimization algorithm (QAOA) is known for its capability and universality in solving combinatorial optimization problems on near-term quantum devices. The results yielded by QAOA depend strongly on its initial variational parameters. Hence, parameters selection for QAOA becomes an active area of research as bad initialization might deteriorate the quality of the results, e…

    Submitted 27 September, 2022; v1 submitted 22 September, 2022; originally announced September 2022.

    Comments: 10 pages, 4 figures

  7. arXiv:2208.13433  [pdf, other]

    cs.CV cs.LG

    Towards In-distribution Compatibility in Out-of-distribution Detection

    Authors: Boxi Wu, Jie Jiang, Haidong Ren, Zifan Du, Wenxiao Wang, Zhifeng Li, Deng Cai, Xiaofei He, Binbin Lin, Wei Liu

    Abstract: Deep neural network, despite its remarkable capability of discriminating targeted in-distribution samples, shows poor performance on detecting anomalous out-of-distribution data. To address this defect, state-of-the-art solutions choose to train deep networks on an auxiliary dataset of outliers. Various training criteria for these auxiliary outliers are proposed based on heuristic intuitions. Howe…

    Submitted 29 August, 2022; originally announced August 2022.

  8. arXiv:2208.03624  [pdf, other]

    cs.CV

    Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph

    Authors: Honghui Yang, Zili Liu, Xiaopei Wu, Wenxiao Wang, Wei Qian, Xiaofei He, Deng Cai

    Abstract: Two-stage detectors have gained much popularity in 3D object detection. Most two-stage 3D detectors utilize grid points, voxel grids, or sampled keypoints for RoI feature extraction in the second stage. Such methods, however, are inefficient in handling unevenly distributed and sparse outdoor points. This paper solves this problem in three aspects. 1) Dynamic Point Aggregation. We propose the patc…

    Submitted 6 August, 2022; originally announced August 2022.

    Comments: ECCV 2022, Oral

  9. arXiv:2208.02129  [pdf, other]

    cs.CV

    SC6D: Symmetry-agnostic and Correspondence-free 6D Object Pose Estimation

    Authors: Dingding Cai, Janne Heikkilä, Esa Rahtu

    Abstract: This paper presents an efficient symmetry-agnostic and correspondence-free framework, referred to as SC6D, for 6D object pose estimation from a single monocular RGB image. SC6D requires neither the 3D CAD model of the object nor any prior knowledge of the symmetries. The pose estimation is decomposed into three sub-tasks: a) object 3D rotation representation learning and matching; b) estimation of…

    Submitted 18 September, 2022; v1 submitted 3 August, 2022; originally announced August 2022.

    Comments: 3DV 2022

  10. Accelerating Vertical Federated Learning

    Authors: Dongqi Cai, Tao Fan, Yan Kang, Lixin Fan, Mengwei Xu, Shangguang Wang, Qiang Yang

    Abstract: Privacy, security and data governance constraints rule out a brute force process in the integration of cross-silo data, which inherits the development of the Internet of Things. Federated learning is proposed to ensure that all parties can collaboratively complete the training task while the data is not out of the local. Vertical federated learning is a specialization of federated learning for dis…

    Submitted 21 January, 2024; v1 submitted 23 July, 2022; originally announced July 2022.

  11. arXiv:2207.10498  [pdf, other]

    cs.CV

    Towards Efficient Adversarial Training on Vision Transformers

    Authors: Boxi Wu, Jindong Gu, Zhifeng Li, Deng Cai, Xiaofei He, Wei Liu

    Abstract: Vision Transformer (ViT), as a powerful alternative to Convolutional Neural Network (CNN), has received much attention. Recent work showed that ViTs are also vulnerable to adversarial examples like CNNs. To build robust ViTs, an intuitive way is to apply adversarial training since it has been shown as one of the most effective ways to accomplish robust CNNs. However, one major limitation of advers…

    Submitted 21 July, 2022; originally announced July 2022.

  12. arXiv:2207.08531  [pdf, other]

    cs.CV

    DID-M3D: Decoupling Instance Depth for Monocular 3D Object Detection

    Authors: Liang Peng, Xiaopei Wu, Zheng Yang, Haifeng Liu, Deng Cai

    Abstract: Monocular 3D detection has drawn much attention from the community due to its low cost and setup simplicity. It takes an RGB image as input and predicts 3D boxes in the 3D space. The most challenging sub-task lies in the instance depth estimation. Previous works usually use a direct estimation method. However, in this paper we point out that the instance depth on the RGB image is non-intuitive. It…

    Submitted 22 July, 2022; v1 submitted 18 July, 2022; originally announced July 2022.

    Comments: ECCV 2022

  13. arXiv:2207.08265   

    eess.IV cs.CV

    MLP-GAN for Brain Vessel Image Segmentation

    Authors: Bin Xie, Hao Tang, Bin Duan, Dawen Cai, Yan Yan

    Abstract: Brain vessel image segmentation can be used as a promising biomarker for better prevention and treatment of different diseases. One successful approach is to consider the segmentation as an image-to-image translation task and perform a conditional Generative Adversarial Network (cGAN) to learn a transformation between two distributions. In this paper, we present a novel multi-view approach, MLP-GA…

    Submitted 26 October, 2022; v1 submitted 17 July, 2022; originally announced July 2022.

    Comments: Resubmit a conference

  14. arXiv:2206.09103  [pdf, other]

    eess.AS cs.CR

    Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems

    Authors: Danwei Cai, Zexin Cai, Ming Li

    Abstract: An automatic speaker verification system aims to verify the speaker identity of a speech signal. However, a voice conversion system could manipulate a person's speech signal to make it sound like another speaker's voice and deceive the speaker verification system. Most countermeasures for voice conversion-based spoofing attacks are designed to discriminate bona fide speech from spoofed speech for…

    Submitted 31 October, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

  15. arXiv:2206.07956  [pdf, other]

    cs.SD cs.CL eess.AS

    Automatic Prosody Annotation with Pre-Trained Text-Speech Model

    Authors: Ziqian Dai, Jianwei Yu, Yan Wang, Nuo Chen, Yanyao Bian, Guangzhi Li, Deng Cai, Dong Yu

    Abstract: Prosodic boundary plays an important role in text-to-speech synthesis (TTS) in terms of naturalness and readability. However, the acquisition of prosodic boundary labels relies on manual annotation, which is costly and time-consuming. In this paper, we propose to automatically extract prosodic boundary labels from text-audio data via a neural text-speech model with pre-trained audio encoders. This…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: accepted by INTERSPEECH2022

  16. arXiv:2206.02369  [pdf, other]

    cs.CL

    Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation

    Authors: Jin Xu, Xiaojiang Liu, Jianhao Yan, Deng Cai, Huayang Li, Jian Li

    Abstract: While large-scale neural language models, such as GPT2 and BART, have achieved impressive results on various text generation tasks, they tend to get stuck in undesirable sentence-level loops with maximization-based decoding algorithms (e.g., greedy search). This phenomenon is counter-intuitive since there are few consecutive sentence-level repetitions in human corpora (e.g., 0.02% in Wik…

    Submitted 9 October, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: Accepted by NeurIPS 2022. Code is released at https://github.com/Jxu-Thu/DITTO

  17. arXiv:2206.02102  [pdf, other]

    cs.LG cs.AI cs.CV math.NA

    AUTM Flow: Atomic Unrestricted Time Machine for Monotonic Normalizing Flows

    Authors: Difeng Cai, Yuliang Ji, Huan He, Qiang Ye, Yuanzhe Xi

    Abstract: Nonlinear monotone transformations are used extensively in normalizing flows to construct invertible triangular mappings from simple distributions to complex ones. In existing literature, monotonicity is usually enforced by restricting function classes or model parameters and the inverse transformation is often approximated by root-finding algorithms as a closed-form inverse is unavailable. In thi…

    Submitted 5 June, 2022; originally announced June 2022.

    Comments: 20 pages, 3 figures

    MSC Class: 68T07 ACM Class: I.5.1; I.2.6

  18. arXiv:2206.01885  [pdf, other]

    math.NA

    Data-driven Construction of Hierarchical Matrices with Nested Bases

    Authors: Difeng Cai, Hua Huang, Edmond Chow, Yuanzhe Xi

    Abstract: Hierarchical matrices provide a powerful representation for significantly reducing the computational complexity associated with dense kernel matrices. For general kernel functions, interpolation-based methods are widely used for the efficient construction of hierarchical matrices. In this paper, we present a fast hierarchical data reduction (HiDR) procedure with $O(n)$ complexity for the memory-ef…

    Submitted 3 June, 2022; originally announced June 2022.

    Comments: 26 pages, 20 figures

    MSC Class: 15A23 (Primary); 68W25; 65D40 (Secondary)

  19. arXiv:2205.10162  [pdf, other]

    cs.LG

    FedAdapter: Efficient Federated Learning for Modern NLP

    Authors: Dongqi Cai, Yaozong Wu, Shangguang Wang, Felix Xiaozhu Lin, Mengwei Xu

    Abstract: Transformer-based pre-trained models have revolutionized NLP for superior performance and generality. Fine-tuning pre-trained models for downstream tasks often requires private data, for which federated learning is the de-facto approach (i.e., FedNLP). However, our measurements show that FedNLP is prohibitively slow due to the large model sizes and the resultant high network/computation cost. Towa…

    Submitted 8 May, 2023; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: Accepted by MobiCom 2023

  20. Neural Collapse Inspired Attraction-Repulsion-Balanced Loss for Imbalanced Learning

    Authors: Liang Xie, Yibo Yang, Deng Cai, Xiaofei He

    Abstract: Class imbalance distribution widely exists in real-world engineering. However, the mainstream optimization algorithms that seek to minimize error will trap the deep learning model in sub-optimums when facing extreme class imbalance. It seriously harms the classification precision, especially on the minor classes. The essential reason is that the gradients of the classifier weights are imbalanced a…

    Submitted 21 February, 2023; v1 submitted 19 April, 2022; originally announced April 2022.

    Comments: 25 pages, 5 figures, accepted by Neurocomputing

  21. arXiv:2203.14957  [pdf, other]

    cs.CV

    Frame-wise Action Representations for Long Videos via Sequence Contrastive Learning

    Authors: Minghao Chen, Fangyun Wei, Chong Li, Deng Cai

    Abstract: Prior works on action representation learning mainly focus on designing various architectures to extract the global representations for short video clips. In contrast, many practical applications such as video alignment have strong demand for learning dense representations for long videos. In this paper, we introduce a novel contrastive action representation learning (CARL) framework to learn fram…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: Accepted by CVPR 2022

  22. arXiv:2203.12644  [pdf, other]

    cs.CL cs.LG

    Linearizing Transformer with Key-Value Memory

    Authors: Yizhe Zhang, Deng Cai

    Abstract: Efficient transformer variants with linear time complexity have been developed to mitigate the quadratic computational overhead of the vanilla transformer. Among them are low-rank projection methods such as Linformer and kernel-based Transformers. Despite their unique merits, they usually suffer from a performance drop comparing with the vanilla transformer on many sequence generation tasks, and o…

    Submitted 12 October, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

    Comments: EMNLP2022. The two authors contributed equally

  23. arXiv:2203.10350  [pdf, other]

    cs.CV

    CLRNet: Cross Layer Refinement Network for Lane Detection

    Authors: Tu Zheng, Yifei Huang, Yang Liu, Wenjian Tang, Zheng Yang, Deng Cai, Xiaofei He

    Abstract: Lane is critical in the vision navigation system of the intelligent vehicle. Naturally, lane is a traffic sign with high-level semantics, whereas it owns the specific local pattern which needs detailed low-level features to localize accurately. Using different feature levels is of great importance for accurate lane detection, but it is still under-explored. In this work, we present Cross Layer Ref…

    Submitted 19 March, 2022; originally announced March 2022.

    Comments: CVPR2022 Acceptance

  24. arXiv:2203.09780  [pdf, other]

    cs.CV

    Sparse Fuse Dense: Towards High Quality 3D Detection with Depth Completion

    Authors: Xiaopei Wu, Liang Peng, Honghui Yang, Liang Xie, Chenxi Huang, Chengqi Deng, Haifeng Liu, Deng Cai

    Abstract: Current LiDAR-only 3D detection methods inevitably suffer from the sparsity of point clouds. Many multi-modal methods are proposed to alleviate this issue, while different representations of images and point clouds make it difficult to fuse them, resulting in suboptimal performance. In this paper, we present a novel multi-modal framework SFD (Sparse Fuse Dense), which utilizes pseudo point clouds…

    Submitted 4 July, 2022; v1 submitted 18 March, 2022; originally announced March 2022.

    Comments: Accepted by CVPR 2022 (Oral)

  25. arXiv:2203.08332  [pdf, other]

    cs.CV

    WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection

    Authors: Liang Peng, Senbo Yan, Boxi Wu, Zheng Yang, Xiaofei He, Deng Cai

    Abstract: Monocular 3D object detection is one of the most challenging tasks in 3D scene understanding. Due to the ill-posed nature of monocular imagery, existing monocular 3D detection methods highly rely on training with the manually annotated 3D box labels on the LiDAR point clouds. This annotation process is very laborious and expensive. To dispense with the reliance on 3D box labels, in this paper we e…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: Accepted by ICLR 2022

  26. arXiv:2203.02309  [pdf, other]

    physics.ins-det astro-ph.CO hep-ex nucl-ex

    A Next-Generation Liquid Xenon Observatory for Dark Matter and Neutrino Physics

    Authors: J. Aalbers, K. Abe, V. Aerne, F. Agostini, S. Ahmed Maouloud, D. S. Akerib, D. Yu. Akimov, J. Akshat, A. K. Al Musalhi, F. Alder, S. K. Alsum, L. Althueser, C. S. Amarasinghe, F. D. Amaro, A. Ames, T. J. Anderson, B. Andrieu, N. Angelides, E. Angelino, J. Angevaare, V. C. Antochi, D. Antón Martin, B. Antunovic, E. Aprile, H. M. Araújo, et al. (572 additional authors not shown)

    Abstract: The nature of dark matter and properties of neutrinos are among the most pressing issues in contemporary particle physics. The dual-phase xenon time-projection chamber is the leading technology to cover the available parameter space for Weakly Interacting Massive Particles (WIMPs), while featuring extensive sensitivity to many alternative dark matter candidates. These detectors can also study neut…

    Submitted 4 March, 2022; originally announced March 2022.

    Comments: 77 pages, 40 figures, 1262 references

    Report number: INT-PUB-22-003

    Journal ref: J. Phys. G: Nucl. Part. Phys. 50 (2023) 013001

  27. arXiv:2203.01072  [pdf, other]

    cs.CV

    OVE6D: Object Viewpoint Encoding for Depth-based 6D Object Pose Estimation

    Authors: Dingding Cai, Janne Heikkilä, Esa Rahtu

    Abstract: This paper proposes a universal framework, called OVE6D, for model-based 6D object pose estimation from a single depth image and a target object mask. Our model is trained using purely synthetic data rendered from ShapeNet, and, unlike most of the existing methods, it generalizes well on new real-world objects without any fine-tuning. We achieve this by decomposing the 6D pose into viewpoint, in-p…

    Submitted 7 April, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

    Comments: CVPR 2022

  28. arXiv:2203.00825  [pdf, other]

    cs.NI eess.SY

    Towards Effective Resource Procurement in MEC: a Resource Re-selling Framework

    Authors: Marie Siew, Shikhar Sharma, Kun Guo, Desmond Cai, Wanli Wen, Carlee Joe-Wong, Tony Q. S. Quek

    Abstract: On-demand and resource reservation pricing models have been widely used in cloud computing, catering to different user requirements. Nevertheless, in Multi-Access Edge Computing (MEC), as the edge has limited resources compared to the cloud, on-demand users may not get their jobs served on time, or at all, if too many resources were reserved by reservation plan users. Concurrently, reservation pla…

    Submitted 8 November, 2023; v1 submitted 1 March, 2022; originally announced March 2022.

    Comments: Accepted at IEEE Transactions on Services Computing

  29. arXiv:2202.02976  [pdf, other]

    cs.CL cs.AI cs.LG

    Measuring and Reducing Model Update Regression in Structured Prediction for NLP

    Authors: Deng Cai, Elman Mansimov, Yi-An Lai, Yixuan Su, Lei Shu, Yi Zhang

    Abstract: Recent advance in deep learning has led to the rapid adoption of machine learning-based NLP models in a wide range of applications. Despite the continuous gain in accuracy, backward compatibility is also an important aspect for industrial applications, yet it received little research attention. Backward compatibility requires that the new model does not regress on cases that were correctly handled…

    Submitted 8 October, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: NeurIPS2022

  30. arXiv:2202.01110  [pdf, other]

    cs.CL

    A Survey on Retrieval-Augmented Text Generation

    Authors: Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu

    Abstract: Recently, retrieval-augmented text generation attracted increasing attention of the computational linguistics community. Compared with conventional generation models, retrieval-augmented text generation has remarkable advantages and particularly has achieved state-of-the-art performance in many NLP tasks. This paper aims to conduct a survey about retrieval-augmented text generation. It firstly hig…

    Submitted 13 February, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

    Comments: all authors contributed equally

  31. Synthetic topology and Floquet dynamic quantum phase transition in a periodically driven Raman lattice

    Authors: De-Huan Cai, Wei Yi

    Abstract: Stimulated by the recent progress in engineering topological band structures in cold atomic gases, we study the dynamic topological phenomena for atoms loaded in a periodically driven optical lattice. When the frequency of the periodic modulation is low, the time-dependent Hamiltonian can be mapped to a two-dimensional topological insulator, with the discretized frequency components playing the ro…

    Submitted 28 April, 2022; v1 submitted 30 December, 2021; originally announced December 2021.

    Journal ref: Phys. Rev. A 105, 042812(2022)

  32. TopNet: Learning from Neural Topic Model to Generate Long Stories

    Authors: Yazheng Yang, Boyuan Pan, Deng Cai, Huan Sun

    Abstract: Long story generation (LSG) is one of the coveted goals in natural language processing. Different from most text generation tasks, LSG requires to output a long story of rich content based on a much shorter text input, and often suffers from information sparsity. In this paper, we propose TopNet to alleviate this problem, by leveraging the recent advances in neural topic modeling to obtain…

    Submitted 14 December, 2021; originally announced December 2021.

    Comments: KDD2021, 9 pages

    Journal ref: Yang, Yazheng, Boyuan Pan, Deng Cai, and Huan Sun. "TopNet: Learning from Neural Topic Model to Generate Long Stories." In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pp. 1997-2005. 2021

  33. arXiv:2112.02353  [pdf, other]

    cs.CV cs.LG

    Label Hierarchy Transition: Delving into Class Hierarchies to Enhance Deep Classifiers

    Authors: Renzhen Wang, De cai, Kaiwen Xiao, Xixi Jia, Xiao Han, Deyu Meng

    Abstract: Hierarchical classification aims to sort the object into a hierarchical structure of categories. For example, a bird can be categorized according to a three-level hierarchy of order, family, and species. Existing methods commonly address hierarchical classification by decoupling it into a series of multi-class classification tasks. However, such a multi-task learning strategy fails to fully exploi…

    Submitted 31 October, 2023; v1 submitted 4 December, 2021; originally announced December 2021.

  34. arXiv:2111.15464  [pdf, other]

    cs.IT cs.LG eess.SP

    Energy-Efficient Design for a NOMA assisted STAR-RIS Network with Deep Reinforcement Learning

    Authors: Yi Guo, Fang Fang, Donghong Cai, Zhiguo Ding

    Abstract: Simultaneous transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs) has been considered as a promising auxiliary device to enhance the performance of the wireless network, where users located at the different sides of the surfaces can be simultaneously served by the transmitting and reflecting signals. In this paper, the energy efficiency (EE) maximization problem for a non-or…

    Submitted 30 November, 2021; originally announced November 2021.

  35. arXiv:2111.10342  [pdf, other]

    cs.IR cs.LG

    GRecX: An Efficient and Unified Benchmark for GNN-based Recommendation

    Authors: Desheng Cai, Jun Hu, Quan Zhao, Shengsheng Qian, Quan Fang, Changsheng Xu

    Abstract: In this paper, we present GRecX, an open-source TensorFlow framework for benchmarking GNN-based recommendation models in an efficient and unified way. GRecX consists of core libraries for building GNN-based recommendation benchmarks, as well as the implementations of popular GNN-based recommendation models. The core libraries provide essential components for building efficient and unified benchmar…

    Submitted 22 February, 2022; v1 submitted 19 November, 2021; originally announced November 2021.

  36. arXiv:2110.06612  [pdf, other]

    cs.CL cs.AI

    Exploring Dense Retrieval for Dialogue Response Selection

    Authors: Tian Lan, Deng Cai, Yan Wang, Yixuan Su, Heyan Huang, Xian-Ling Mao

    Abstract: Recent progress in deep learning has continuously improved the accuracy of dialogue response selection. In particular, sophisticated neural network architectures are leveraged to capture the rich interactions between dialogue context and response candidates. While remarkably effective, these models also bring in a steep increase in computational cost. Consequently, such models can only be used as…

    Submitted 25 April, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

    Comments: 11 pages, 4 figures, 6 tables

  37. arXiv:2109.15196  [pdf, other]

    cs.CL cs.AI

    Multilingual AMR Parsing with Noisy Knowledge Distillation

    Authors: Deng Cai, Xin Li, Jackie Chun-Sing Ho, Lidong Bing, Wai Lam

    Abstract: We study multilingual AMR parsing from the perspective of knowledge distillation, where the aim is to learn and improve a multilingual AMR parser by using an existing English parser as its teacher. We constrain our exploration in a strict multilingual setting: there is but one model to parse all different languages including English. We identify that noisy input and precise output are the key to s…

    Submitted 13 October, 2021; v1 submitted 30 September, 2021; originally announced September 2021.

    Comments: EMNLP21 (findings)

  38. arXiv:2109.14739  [pdf, other]

    cs.CL

    Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

    Authors: Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai, Yi Zhang

    Abstract: Pre-trained language models have been recently shown to benefit task-oriented dialogue (TOD) systems. Despite their success, existing methods often formulate this task as a cascaded generation problem which can lead to error accumulation across different sub-tasks and greater data annotation overhead. In this study, we present PPTOD, a unified plug-and-play model for task-oriented dialogue. In add…

    Submitted 1 March, 2022; v1 submitted 29 September, 2021; originally announced September 2021.

    Comments: Camera-ready for ACL2022 main conference

  39. Scaling properties of scale-free networks in degree-thresholding renormalization flows

    Authors: Dan Chen, Defu Cai, Housheng Su

    Abstract: We study the statistical properties of observables of scale-free networks in the degree-thresholding renormalization (DTR) flows. For BA scale-free networks with different sizes, we find that their structural and dynamical observables have similar scaling behavior in the DTR flow. The finite-size scaling analysis confirms this view and reveals a scaling function with a single scaling exponent that…

    Submitted 28 November, 2022; v1 submitted 25 September, 2021; originally announced September 2021.

    Journal ref: 2023, IEEE Transactions on Network Science and Engineering

  40. arXiv:2109.09059  [pdf]

    physics.app-ph physics.class-ph

    A simple transcendental travelling wave solution and stability study for the thermophoretic motion with variable heat transmission factors on substrate-supported grapheme sheet

    Authors: Yue Chan, Daoju Cai, Kaisheng Cai, Shern-Long Lee, Rumiao Lin, Yong Ren

    Abstract: Manually tailored wrinkled graphene sheets hold great promise in fabricating smart solid-state devices. In this paper, we employ an energy method to transform the original third-order partial differential equation (pde), i.e. Eq. (1) into the first-order pde, i.e. Eq. (8) for the thermophoretic motion of substrate-supported graphene sheets, which can be solved in terms of semi-group and transcende…

    Submitted 19 September, 2021; originally announced September 2021.

    Comments: 8 pages, 5 figures, conference

  41. arXiv:2109.02905  [pdf, other]

    cs.CL

    Exploiting Reasoning Chains for Multi-hop Science Question Answering

    Authors: Weiwen Xu, Yang Deng, Huihui Zhang, Deng Cai, Wai Lam

    Abstract: We propose a novel Chain Guided Retriever-reader (CGR) framework to model the reasoning chain for multi-hop Science Question Answering. Our framework is capable of performing explainable reasoning without the need of any corpus-specific annotations, such as the ground-truth reasoning chain, or human-annotated entity mentions. Specifically, we first generate reasoning chains from a semantic g…

    Submitted 7 September, 2021; originally announced September 2021.

    Comments: 14 pages, Findings of EMNLP 2021

  42. arXiv:2109.02853  [pdf, other]

    eess.AS cs.SD

    The DKU-DukeECE System for the Self-Supervision Speaker Verification Task of the 2021 VoxCeleb Speaker Recognition Challenge

    Authors: Danwei Cai, Ming Li

    Abstract: This report describes the submission of the DKU-DukeECE team to the self-supervision speaker verification task of the 2021 VoxCeleb Speaker Recognition Challenge (VoxSRC). Our method employs an iterative labeling framework to learn self-supervised speaker representation based on a deep neural network (DNN). The framework starts with training a self-supervision speaker embedding network by maximizi…

    Submitted 7 September, 2021; originally announced September 2021.

    Comments: arXiv admin note: text overlap with arXiv:2010.14751

  43. arXiv:2109.02002  [pdf, other]

    eess.AS cs.SD

    The DKU-DukeECE-Lenovo System for the Diarization Task of the 2021 VoxCeleb Speaker Recognition Challenge

    Authors: Weiqing Wang, Danwei Cai, Qingjian Lin, Lin Yang, Junjie Wang, Jin Wang, Ming Li

    Abstract: This report describes the submission of the DKU-DukeECE-Lenovo team to the VoxCeleb Speaker Recognition Challenge (VoxSRC) 2021 track 4. Our system including a voice activity detection (VAD) model, a speaker embedding model, two clustering-based speaker diarization systems with different similarity measurements, two different overlapped speech detection (OSD) models, and a target-speaker voice act…

    Submitted 6 September, 2021; v1 submitted 5 September, 2021; originally announced September 2021.

  44. arXiv:2108.13858  [pdf, other]

    cs.LG cs.AI

    GRP-FED: Addressing Client Imbalance in Federated Learning via Global-Regularized Personalization

    Authors: Yen-Hsiu Chou, Shenda Hong, Chenxi Sun, Derun Cai, Moxian Song, Hongyan Li

    Abstract: Since real-world data is long-tailed, it is challenging for Federated Learning (FL) to train across decentralized clients in practical applications. We present Global-Regularized Personalization (GRP-FED) to tackle the data imbalance issue by considering a single global model and multiple local models for each client. With adaptive aggregation, the global model treats multiple clients f…

    Submitted 31 August, 2021; originally announced August 2021.

    Comments: (FL-ICML'21) International Workshop on Federated Learning for User Privacy and Data Confidentiality in Conjunction with ICML 2021

  45. arXiv:2108.13048  [pdf, other]

    cs.CL cs.SD eess.AS

    ASR-GLUE: A New Multi-task Benchmark for ASR-Robust Natural Language Understanding

    Authors: Lingyun Feng, Jianwei Yu, Deng Cai, Songxiang Liu, Haitao Zheng, Yan Wang

    Abstract: Language understanding in speech-based systems has attracted much attention in recent years with the growing demand for voice interface applications. However, the robustness of natural language understanding (NLU) systems to errors introduced by automatic speech recognition (ASR) is under-examined. In this paper, we propose…

    Submitted 16 March, 2022; v1 submitted 30 August, 2021; originally announced August 2021.

  46. arXiv:2108.07744  [pdf, other]

    quant-ph

    An Iterative Improvement Method for HHL algorithm for Solving Linear System of Equations

    Authors: Yoshiyuki Saito, Xinwei Lee, Dongsheng Cai, Nobuyoshi Asai

    Abstract: We propose an iterative improvement method for the Harrow-Hassidim-Lloyd (HHL) algorithm to solve a linear system of equations. This is a quantum-classical hybrid algorithm. Accuracy is essential when solving a linear system of equations. However, the accuracy of the HHL algorithm is limited by the number of quantum bits used to express the eigenvalues of the matrix. Our iterative method improve…

    Submitted 17 August, 2021; originally announced August 2021.

    Comments: 7 pages, 7 figures

  47. Parameters Fixing Strategy for Quantum Approximate Optimization Algorithm

    Authors: Xinwei Lee, Yoshiyuki Saito, Dongsheng Cai, Nobuyoshi Asai

    Abstract: The quantum approximate optimization algorithm (QAOA) has numerous promising applications in solving combinatorial optimization problems on near-term Noisy Intermediate-Scale Quantum (NISQ) devices. QAOA has a quantum-classical hybrid structure. Its quantum part consists of a parameterized alternating operator ansatz, and its classical part comprises an optimization algorithm, which optimiz…

    Submitted 11 August, 2021; originally announced August 2021.

    Comments: 7 pages, 5 figures, accepted in the IEEE International Conference on Quantum Computing and Engineering

  48. arXiv:2108.00154  [pdf, other]

    cs.CV cs.LG

    CrossFormer: A Versatile Vision Transformer Hinging on Cross-scale Attention

    Authors: Wenxiao Wang, Lu Yao, Long Chen, Binbin Lin, Deng Cai, Xiaofei He, Wei Liu

    Abstract: Transformers have made great progress in dealing with computer vision tasks. However, existing vision transformers do not yet possess the ability of building the interactions among features of different scales, which is perceptually important to visual inputs. The reasons are two-fold: (1) Input embeddings of each layer are equal-scale, so no cross-scale feature can be extracted; (2) to lower the…

    Submitted 8 October, 2021; v1 submitted 31 July, 2021; originally announced August 2021.

    Comments: 15 pages, 4 figures, and 9 tables

  49. arXiv:2107.06341  [pdf, ps, other]

    math.NA

    Hybrid A Posteriori Error Estimators for Conforming Finite Element Approximations to Stationary Convection-Diffusion-Reaction equations

    Authors: Difeng Cai, Zhiqiang Cai

    Abstract: We consider the a posteriori error estimation for convection-diffusion-reaction equations in both diffusion-dominated and convection/reaction-dominated regimes. We present an explicit hybrid estimator, which, in each regime, is proved to be reliable and efficient with constants independent of the parameters in the underlying problem. For convection-dominated problems, the norm introduced by Verfü…

    Submitted 15 July, 2021; v1 submitted 13 July, 2021; originally announced July 2021.

  50. arXiv:2106.05517  [pdf, other]

    cs.CV

    Learning to Affiliate: Mutual Centralized Learning for Few-shot Classification

    Authors: Yang Liu, Weifeng Zhang, Chao Xiang, Tu Zheng, Deng Cai, Xiaofei He

    Abstract: Few-shot learning (FSL) aims to learn a classifier that can be easily adapted to accommodate new tasks not seen during training, given only a few examples. To handle the limited-data problem in few-shot regimes, recent methods tend to collectively use a set of local features to densely represent an image instead of using a mixed global feature. They generally explore a unidirectional query-to-supp…

    Submitted 18 March, 2022; v1 submitted 10 June, 2021; originally announced June 2021.

    Comments: CVPR 2022