Imaging Interiors: An Implicit Solution to Electromagnetic Inverse Scattering Problems

Authors: Ziyuan Luo, Boxin Shi, Haoliang Li, Renjie Wan

Abstract: Electromagnetic Inverse Scattering Problems (EISP) have gained wide applications in computational imaging. By solving EISP, the internal relative permittivity of the scatterer can be non-invasively determined based on the scattered electromagnetic fields. Despite previous efforts to address EISP, achieving better solutions to this problem has remained elusive, due to the challenges posed by inversion and discretization. This paper tackles those challenges in EISP via an implicit approach. By representing the scatterer's relative permittivity as a continuous implicit representation, our method is able to address the low-resolution problems arising from discretization. Further, optimizing this implicit representation within a forward framework allows us to conveniently circumvent the challenges posed by inverse estimation. Our approach outperforms existing methods on standard benchmark datasets. Project page: https://luo-ziyuan.github.io/Imaging-Interiors

Submitted 12 July, 2024; originally announced July 2024.

Comments: 33 pages, accepted by ECCV 2024 non-camera-ready version

arXiv:2407.07735 [pdf, other]

Protecting NeRFs' Copyright via Plug-And-Play Watermarking Base Model

Authors: Qi Song, Ziyuan Luo, Ka Chun Cheung, Simon See, Renjie Wan

Abstract: Neural Radiance Fields (NeRFs) have become a key method for 3D scene representation. With the rising prominence and influence of NeRF, safeguarding its intellectual property has become increasingly important. In this paper, we propose NeRFProtector, which adopts a plug-and-play strategy to protect NeRF's copyright during its creation. NeRFProtector utilizes a pre-trained watermarking base model, enabling NeRF creators to embed binary messages directly while creating their NeRF. Our plug-and-play property ensures NeRF creators can flexibly choose NeRF variants without excessive modifications. Leveraging our newly designed progressive distillation, we demonstrate performance on par with several leading-edge neural rendering methods. Our project is available at: https://qsong2001.github.io/NeRFProtector.

Submitted 10 July, 2024; originally announced July 2024.

Comments: Accepted by ECCV2024

arXiv:2407.06838 [pdf, other]

Event Trojan: Asynchronous Event-based Backdoor Attacks

Authors: Ruofei Wang, Qing Guo, Haoliang Li, Renjie Wan

Abstract: As asynchronous event data is more frequently engaged in various vision tasks, the risk of backdoor attacks becomes more evident. However, research into the potential risk associated with backdoor attacks in asynchronous event data has been scarce, leaving related tasks vulnerable to potential threats. This paper uncovers the possibility of directly poisoning event data streams by proposing the Event Trojan framework, which includes two kinds of triggers, i.e., immutable and mutable triggers. Specifically, our two types of event triggers are based on a sequence of simulated event spikes, which can be easily incorporated into any event stream to initiate backdoor attacks. Additionally, for the mutable trigger, we design an adaptive learning mechanism to maximize its aggressiveness. To improve the stealthiness, we introduce a novel loss function that constrains the generated contents of mutable triggers, minimizing the difference between triggers and original events while maintaining effectiveness. Extensive experiments on public event datasets show the effectiveness of the proposed backdoor triggers. We hope that this paper can draw greater attention to the potential threats posed by backdoor attacks on event-based tasks. Our code is available at https://github.com/rfww/EventTrojan.

Submitted 14 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

Comments: Accepted by ECCV2024

arXiv:2406.02540 [pdf, other]

ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation

Authors: Tianchen Zhao, Tongcheng Fang, Enshu Liu, Rui Wan, Widyadewi Soedarmadji, Shiyao Li, Zinan Lin, Guohao Dai, Shengen Yan, Huazhong Yang, Xuefei Ning, Yu Wang

Abstract: Diffusion transformers (DiTs) have exhibited remarkable performance in visual generation tasks, such as generating realistic images or videos based on textual instructions. However, larger model sizes and multi-frame processing for video generation lead to increased computational and memory costs, posing challenges for practical deployment on edge devices. Post-Training Quantization (PTQ) is an effective method for reducing memory costs and computational complexity. When quantizing diffusion transformers, we find that applying existing diffusion quantization methods designed for U-Net faces challenges in preserving quality. After analyzing the major challenges for quantizing diffusion transformers, we design an improved quantization scheme, ViDiT-Q (Video and Image Diffusion Transformer Quantization), to address these issues. Furthermore, we identify that highly sensitive layers and timesteps hinder quantization for lower bit-widths. To tackle this, we improve ViDiT-Q with a novel metric-decoupled mixed-precision quantization method (ViDiT-Q-MP). We validate the effectiveness of ViDiT-Q across a variety of text-to-image and video models. While baseline quantization methods fail at W8A8 and produce unreadable content at W4A8, ViDiT-Q achieves lossless W8A8 quantization. ViDiT-Q-MP achieves W4A8 with negligible visual quality degradation, resulting in a 2.5x memory optimization and a 1.5x latency speedup.

Submitted 30 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

Comments: Project Page: https://a-suozhang.xyz/viditq.github.io/

arXiv:2405.12438 [pdf, other]

doi 10.1145/3635636.3664260

CoCo Matrix: Taxonomy of Cognitive Contributions in Co-writing with Intelligent Agents

Authors: Ruyuan Wan, Simret Gebreegziabhe, Toby Jia-Jun Li, Karla Badillo-Urquiola

Abstract: In recent years, there has been a growing interest in employing intelligent agents in writing. Previous work emphasizes the evaluation of the quality of the end product, whether it was coherent and polished, overlooking the journey that led to the product, which is an invaluable dimension of the creative process. To understand how to recognize human efforts in co-writing with intelligent writing systems, we adapt Flower and Hayes' cognitive process theory of writing and propose CoCo Matrix, a two-dimensional taxonomy of entropy and information gain, to depict the new human-agent co-writing model. We define four quadrants and situate thirty-four published systems within the taxonomy. Our research found that low entropy and high information gain systems are under-explored, yet offer promising future directions in writing tasks that benefit from the agent's divergent planning and the human's focused translation. CoCo Matrix not only categorizes different writing systems but also deepens our understanding of the cognitive processes in human-agent co-writing. By analyzing minimal changes in the writing process, CoCo Matrix serves as a proxy for the writer's mental model, allowing writers to reflect on their contributions. This reflection is facilitated through the measured metrics of information gain and entropy, which provide insights irrespective of the writing system used.

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2404.19247 [pdf, ps, other]

Improved AutoEncoder with LSTM module and KL divergence

Authors: Wei Huang, Bingyang Zhang, Kaituo Zhang, Hua Gao, Rongchun Wan

Abstract: The task of anomaly detection is to separate anomalous data from normal data in the dataset. Models such as the deep convolutional autoencoder (CAE) network and the deep support vector data description (SVDD) model have been universally employed and have demonstrated significant success in detecting anomalies. However, the over-reconstruction ability of the CAE network for anomalous data can easily lead to a high false negative rate in detecting anomalous data. On the other hand, the deep SVDD model has the drawback of feature collapse, which leads to a decrease in detection accuracy for anomalies. To address these problems, we propose the Improved AutoEncoder with LSTM module and Kullback-Leibler divergence (IAE-LSTM-KL) model in this paper. An LSTM network is added after the encoder to memorize feature representations of normal data. Meanwhile, the phenomenon of feature collapse can also be mitigated by penalizing the featured input to the SVDD module via KL divergence. The efficacy of the IAE-LSTM-KL model is validated through experiments on both synthetic and real-world datasets. Experimental results show that the IAE-LSTM-KL model yields higher detection accuracy for anomalies. In addition, it is also found that the IAE-LSTM-KL model demonstrates enhanced robustness to contaminated outliers in the dataset.

Submitted 30 April, 2024; originally announced April 2024.

arXiv:2402.12184 [pdf, other]

Colorizing Monochromatic Radiance Fields

Authors: Yean Cheng, Renjie Wan, Shuchen Weng, Chengxuan Zhu, Yakun Chang, Boxin Shi

Abstract: Though Neural Radiance Fields (NeRF) can produce colorful 3D representations of the world by using a set of 2D images, such ability becomes non-existent when only monochromatic images are provided. Since color is necessary in representing the world, reproducing color from monochromatic radiance fields becomes crucial. To achieve this goal, instead of manipulating the monochromatic radiance fields directly, we consider it as a representation-prediction task in the Lab color space. By first constructing the luminance and density representation using monochromatic images, our prediction stage can recreate color representation on the basis of an image colorization module. We then reproduce a colorful implicit model through the representation of luminance, density, and color. Extensive experiments have been conducted to validate the effectiveness of our approaches. Our project page: https://liquidammonia.github.io/color-nerf.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2401.08195 [pdf, ps, other]

Three classes of propagation rules for GRS and EGRS codes and their applications to EAQECCs

Authors: Ruhao Wan, Shixin Zhu

Abstract: In this paper, we study the Hermitian hulls of (extended) generalized Reed-Solomon (GRS and EGRS) codes over finite fields. For a given class of (extended) GRS codes, by increasing the length, increasing the dimensions and increasing both the length and the dimensions, we obtain three new classes of (extended) GRS codes with Hermitian hulls of arbitrary dimensions. Furthermore, we obtain several new classes of $q^2$-ary maximum distance separable (MDS) codes with Hermitian hulls of arbitrary dimensions. The dimension of these MDS codes can be taken from $1$ to $\frac{n}{2}$. By propagation rules, the parameters of the obtained code can be more flexible. As an application, many new (MDS) entanglement-assisted quantum error correction codes (EAQECCs) can be constructed from previously known (extended) GRS codes. We derive three new propagation rules on (MDS) EAQECCs constructed from (extended) GRS codes. Finally, we present several new classes of (MDS) EAQECCs with flexible parameters. Notably, the distance parameters of our codes can range from $2$ to $\frac{n+2}{2}$.

Submitted 16 January, 2024; originally announced January 2024.

Comments: 23 pages, 5 tables

ACM Class: E.4

arXiv:2401.02031 [pdf, other]

Spy-Watermark: Robust Invisible Watermarking for Backdoor Attack

Authors: Ruofei Wang, Renjie Wan, Zongyu Guo, Qing Guo, Rui Huang

Abstract: Backdoor attack aims to deceive a victim model when facing backdoor instances while maintaining its performance on benign data. Current methods use manual patterns or special perturbations as triggers, while they often overlook the robustness against data corruption, making backdoor attacks easy to defend in practice. To address this issue, we propose a novel backdoor attack method named Spy-Watermark, which remains effective when facing data collapse and backdoor defense. Therein, we introduce a learnable watermark embedded in the latent domain of images, serving as the trigger. Then, we search for a watermark that can withstand collapse during image decoding, cooperating with several anti-collapse operations to further enhance the resilience of our trigger against data corruption. Extensive experiments are conducted on CIFAR10, GTSRB, and ImageNet datasets, demonstrating that Spy-Watermark overtakes ten state-of-the-art methods in terms of robustness and stealthiness.

Submitted 3 January, 2024; originally announced January 2024.

Comments: Accepted by ICASSP2024

arXiv:2312.15595 [pdf, other]

Zero-Inflated Bandits

Authors: Haoyu Wei, Runzhe Wan, Lei Shi, Rui Song

Abstract: Many real applications of bandits have sparse non-zero rewards, leading to slow learning rates. Careful distribution modeling that utilizes problem-specific structures is known to be critical to estimation efficiency in the statistics literature, yet is under-explored in bandits. To fill the gap, we initiate the study of zero-inflated bandits, where the reward is modeled as a classic semi-parametric distribution called the zero-inflated distribution. We carefully design Upper Confidence Bound (UCB) and Thompson Sampling (TS) algorithms for this specific structure. Our algorithms are suitable for a very general class of reward distributions, operating under tail assumptions that are considerably less stringent than the typical sub-Gaussian requirements. Theoretically, we derive the regret bounds for both the UCB and TS algorithms for the multi-armed bandit, showing that they can achieve rate-optimal regret when the reward distribution is sub-Gaussian. The superior empirical performance of the proposed methods is shown via extensive numerical studies.

Submitted 24 December, 2023; originally announced December 2023.

arXiv:2312.12871 [pdf, other]

Effect Size Estimation for Duration Recommendation in Online Experiments: Leveraging Hierarchical Models and Objective Utility Approaches

Authors: Yu Liu, Runzhe Wan, James McQueen, Doug Hains, Jinxiang Gu, Rui Song

Abstract: The selection of the assumed effect size (AES) critically determines the duration of an experiment, and hence its accuracy and efficiency. Traditionally, experimenters determine AES based on domain knowledge. However, this method becomes impractical for online experimentation services managing numerous experiments, and a more automated approach is hence in great demand. We initiate the study of data-driven AES selection for online experimentation services by introducing two solutions. The first employs a three-layer Gaussian Mixture Model considering the heteroskedasticity across experiments, and it seeks to estimate the true expected effect size among positive experiments. The second method, grounded in utility theory, aims to determine the optimal effect size by striking a balance between the experiment's cost and the precision of decision-making. Through comparisons with baseline methods using both simulated and real data, we showcase the superior performance of the proposed approaches.

Submitted 17 April, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

arXiv:2310.18715 [pdf, other]

Robust Offline Reinforcement learning with Heavy-Tailed Rewards

Authors: Jin Zhu, Runzhe Wan, Zhengling Qi, Shikai Luo, Chengchun Shi

Abstract: This paper endeavors to augment the robustness of offline reinforcement learning (RL) in scenarios laden with heavy-tailed rewards, a prevalent circumstance in real-world applications. We propose two algorithmic frameworks, ROAM and ROOM, for robust off-policy evaluation and offline policy optimization (OPO), respectively. Central to our frameworks is the strategic incorporation of the median-of-means method with offline RL, enabling straightforward uncertainty estimation for the value function estimator. This not only adheres to the principle of pessimism in OPO but also adeptly manages heavy-tailed rewards. Theoretical results and extensive experiments demonstrate that our two frameworks outperform existing methods when the logged dataset exhibits heavy-tailed reward distributions. The implementation of the proposal is available at https://github.com/Mamba413/ROOM.

Submitted 30 March, 2024; v1 submitted 28 October, 2023; originally announced October 2023.

Comments: 23 pages, 6 figures. Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS) 2024

arXiv:2310.00214 [pdf, ps, other]

Quantum MDS Codes with length $n \equiv 0, 1 \pmod{\frac{q\pm1}{2}}$

Authors: Ruhao Wan

Abstract: An important family of quantum codes is the quantum maximum-distance-separable (MDS) codes. In this paper, we construct some new classes of quantum MDS codes by generalized Reed-Solomon (GRS) codes and Hermitian construction. In addition, the length $n$ of most of the quantum MDS codes we constructed satisfies $n \equiv 0, 1 \pmod{\frac{q\pm1}{2}}$, which is different from previously known code lengths. At the same time, the quantum MDS codes we construct have large minimum distances that are greater than $q/2+1$.

Submitted 29 September, 2023; originally announced October 2023.

Comments: 21 pages, 2 tables

MSC Class: 81P70

arXiv:2309.12708 [pdf, other]

PointSSC: A Cooperative Vehicle-Infrastructure Point Cloud Benchmark for Semantic Scene Completion

Authors: Yuxiang Yan, Boda Liu, Jianfei Ai, Qinbu Li, Ru Wan, Jian Pu

Abstract: Semantic Scene Completion (SSC) aims to jointly generate space occupancies and semantic labels for complex 3D scenes. Most existing SSC models focus on volumetric representations, which are memory-inefficient for large outdoor spaces. Point clouds provide a lightweight alternative but existing benchmarks lack outdoor point cloud scenes with semantic labels. To address this, we introduce PointSSC, the first cooperative vehicle-infrastructure point cloud benchmark for semantic scene completion. These scenes exhibit long-range perception and minimal occlusion. We develop an automated annotation pipeline leveraging Semantic Segment Anything to efficiently assign semantics. To benchmark progress, we propose a LiDAR-based model with a Spatial-Aware Transformer for global and local feature extraction and a Completion and Segmentation Cooperative Module for joint completion and segmentation. PointSSC provides a challenging testbed to drive advances in semantic point cloud completion for real-world navigation. The code and datasets are available at https://github.com/yyxssm/PointSSC.

Submitted 6 March, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

Comments: ICRA2024, oral & poster

arXiv:2309.02702 [pdf, other]

Gene-induced Multimodal Pre-training for Image-omic Classification

Authors: Ting Jin, Xingran Xie, Renjie Wan, Qingli Li, Yan Wang

Abstract: Histology analysis of the tumor micro-environment integrated with genomic assays is the gold standard for most cancers in modern medicine. This paper proposes a Gene-induced Multimodal Pre-training (GiMP) framework, which jointly incorporates genomics and Whole Slide Images (WSIs) for classification tasks. Our work aims at dealing with the main challenges of multi-modality image-omic classification w.r.t. (1) the patient-level feature extraction difficulties from gigapixel WSIs and tens of thousands of genes, and (2) effective fusion considering high-order relevance modeling. Concretely, we first propose a group multi-head self-attention gene encoder to capture global structured features in gene expression cohorts. We design a masked patch modeling paradigm (MPM) to capture the latent pathological characteristics of different tissues. The mask strategy is randomly masking a fixed-length contiguous subsequence of patch embeddings of a WSI. Finally, we combine the classification tokens of paired modalities and propose a triplet learning module to learn high-order relevance and discriminative patient-level information. After pre-training, a simple fine-tuning can be adopted to obtain the classification results. Experimental results on the TCGA dataset show the superiority of our network architectures and our pre-training framework, achieving 99.47% in accuracy for image-omic classification. The code is publicly available at https://github.com/huangwudiduan/GIMP.

Submitted 6 September, 2023; originally announced September 2023.

arXiv:2308.03990 [pdf, ps, other]

NEOLAF, an LLM-powered neural-symbolic cognitive architecture

Authors: Richard Jiarui Tong, Cassie Chen Cao, Timothy Xueqian Lee, Guodong Zhao, Ray Wan, Feiyue Wang, Xiangen Hu, Robin Schmucker, Jinsheng Pan, Julian Quevedo, Yu Lu

Abstract: This paper presents the Never Ending Open Learning Adaptive Framework (NEOLAF), an integrated neural-symbolic cognitive architecture that models and constructs intelligent agents. The NEOLAF framework is a superior approach to constructing intelligent agents than both the pure connectionist and pure symbolic approaches due to its explainability, incremental learning, efficiency, collaborative and distributed learning, human-in-the-loop enablement, and self-improvement. The paper further presents a compelling experiment where a NEOLAF agent, built as a problem-solving agent, is fed with complex math problems from the open-source MATH dataset. The results demonstrate NEOLAF's superior learning capability and its potential to revolutionize the field of cognitive architectures and self-improving adaptive instructional systems.

Submitted 7 August, 2023; originally announced August 2023.

arXiv:2307.14489 [pdf, other]

SuperInpaint: Learning Detail-Enhanced Attentional Implicit Representation for Super-resolutional Image Inpainting

Authors: Canyu Zhang, Qing Guo, Xiaoguang Li, Renjie Wan, Hongkai Yu, Ivor Tsang, Song Wang

Abstract: In this work, we introduce a challenging image restoration task, referred to as SuperInpaint, which aims to reconstruct missing regions in low-resolution images and generate completed images with arbitrarily higher resolutions. We have found that this task cannot be effectively addressed by stacking state-of-the-art super-resolution and image inpainting methods as they amplify each other's flaws, leading to noticeable artifacts. To overcome these limitations, we propose the detail-enhanced attentional implicit representation (DEAR) that can achieve SuperInpaint with a single model, resulting in high-quality completed images with arbitrary resolutions. Specifically, we use a deep convolutional network to extract the latent embedding of an input image and then enhance the high-frequency components of the latent embedding via an adaptive high-pass filter. This leads to detail-enhanced semantic embedding. We further feed the semantic embedding into an unmask-attentional module that suppresses embeddings from ineffective masked pixels. Additionally, we extract a pixel-wise importance map that indicates which pixels should be used for image reconstruction. Given the coordinates of a pixel we want to reconstruct, we first collect its neighboring pixels in the input image and extract their detail-enhanced semantic embeddings, unmask-attentional semantic embeddings, importance values, and spatial distances to the desired pixel. Then, we feed all the above terms into an implicit representation and generate the color of the specified pixel. To evaluate our method, we extend three existing datasets for this new task and build 18 meaningful baselines using SOTA inpainting and super-resolution methods. Extensive experimental results demonstrate that our method outperforms all existing methods by a significant margin on four widely used metrics.

Submitted 26 July, 2023; originally announced July 2023.

arXiv:2307.11526 [pdf, other]

CopyRNeRF: Protecting the CopyRight of Neural Radiance Fields

Authors: Ziyuan Luo, Qing Guo, Ka Chun Cheung, Simon See, Renjie Wan

Abstract: Neural Radiance Fields (NeRF) have the potential to be a major representation of media. Since training a NeRF has never been an easy task, the protection of its model copyright should be a priority. In this paper, by analyzing the pros and cons of possible copyright protection solutions, we propose to protect the copyright of NeRF models by replacing the original color representation in NeRF with a watermarked color representation. Then, a distortion-resistant rendering scheme is designed to guarantee robust message extraction in 2D renderings of NeRF. Our proposed method can directly protect the copyright of NeRF models while maintaining high rendering quality and bit accuracy compared with alternative solutions.

Submitted 29 July, 2023; v1 submitted 21 July, 2023; originally announced July 2023.

Comments: 11 pages, 6 figures, accepted by ICCV 2023 non-camera-ready version

arXiv:2307.04122 [pdf, other]

Enhancing Low-Light Images Using Infrared-Encoded Images

Authors: Shulin Tian, Yufei Wang, Renjie Wan, Wenhan Yang, Alex C. Kot, Bihan Wen

Abstract: The low-light image enhancement task is essential yet challenging as it is intrinsically ill-posed. Previous arts mainly focus on the low-light images captured in the visible spectrum using pixel-wise loss, which limits the capacity of recovering the brightness, contrast, and texture details due to the small number of incoming photons. In this work, we propose a novel approach to increase the visibility of images captured under low-light environments by removing the in-camera infrared (IR) cut-off filter, which allows for the capture of more photons and results in improved signal-to-noise ratio due to the inclusion of information from the IR spectrum. To verify the proposed strategy, we collect a paired dataset of low-light images captured without the IR cut-off filter, with corresponding long-exposure reference images with an external filter. The experimental results on the proposed dataset demonstrate the effectiveness of the proposed method, showing better performance quantitatively and qualitatively. The dataset and code are publicly available at https://wyf0912.github.io/ELIEI/

Submitted 9 July, 2023; originally announced July 2023.

Comments: The first two authors contribute equally. The work is accepted by ICIP 2023

arXiv:2306.11503 [pdf, other]

The Age of Synthetic Realities: Challenges and Opportunities

Authors: João Phillipe Cardenuto, Jing Yang, Rafael Padilha, Renjie Wan, Daniel Moreira, Haoliang Li, Shiqi Wang, Fernanda Andaló, Sébastien Marcel, Anderson Rocha

Abstract: Synthetic realities are digital creations or augmentations that are contextually generated through the use of Artificial Intelligence (AI) methods, leveraging extensive amounts of data to construct new narratives or realities, regardless of the intent to deceive. In this paper, we delve into the concept of synthetic realities and their implications for Digital Forensics and society at large within the rapidly advancing field of AI. We highlight the crucial need for the development of forensic techniques capable of identifying harmful synthetic creations and distinguishing them from reality. This is especially important in scenarios involving the creation and dissemination of fake news, disinformation, and misinformation. Our focus extends to various forms of media, such as images, videos, audio, and text, as we examine how synthetic realities are crafted and explore approaches to detecting these malicious creations. Additionally, we shed light on the key research challenges that lie ahead in this area. This study is of paramount importance due to the rapid progress of AI generative techniques and their impact on the fundamental principles of Forensic Science.

Submitted 9 June, 2023; originally announced June 2023.

arXiv:2305.15070 [pdf, other]

Annotation Imputation to Individualize Predictions: Initial Studies on Distribution Dynamics and Model Predictions

Authors: London Lowmanstone, Ruyuan Wan, Risako Owan, Jaehyung Kim, Dongyeop Kang

Abstract: Annotating data via crowdsourcing is time-consuming and expensive. Due to these costs, dataset creators often have each annotator label only a small subset of the data. This leads to sparse datasets with examples that are marked by few annotators. The downside of this process is that if an annotator doesn't get to label a particular example, their perspective on it is missed. This is especially concerning for subjective NLP datasets where there is no single correct label: people may have different valid opinions. Thus, we propose using imputation methods to generate the opinions of all annotators for all examples, creating a dataset that does not leave out any annotator's view. We then train and prompt models, using data from the imputed dataset, to make predictions about the distribution of responses and individual annotations. In our analysis of the results, we found that the choice of imputation method significantly impacts soft label changes and distribution. While the imputation introduces noise in the prediction of the original dataset, it has shown potential in enhancing shots for prompts, particularly for low-response-rate annotators. We have made all of our code and data publicly available.

Submitted 5 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: NLPerspectives - 2nd Workshop on Perspectivist Approaches to NLP, 39 pages, 13 figures, 13 tables

Journal ref: 2nd Workshop on Perspectivist Approaches to NLP 2023

arXiv:2304.11393 [pdf, other]

Knowledge Distillation from 3D to Bird's-Eye-View for LiDAR Semantic Segmentation

Authors: Feng Jiang, Heng Gao, Shoumeng Qiu, Haiqiang Zhang, Ru Wan, Jian Pu

Abstract: LiDAR point cloud segmentation is one of the most fundamental tasks for autonomous driving scene understanding. However, it is difficult for existing models to achieve both high inference speed and accuracy simultaneously. For example, voxel-based methods perform well in accuracy, while Bird's-Eye-View (BEV)-based methods can achieve real-time inference. To overcome this issue, we develop an effective 3D-to-BEV knowledge distillation method that transfers rich knowledge from 3D voxel-based models to BEV-based models. Our framework mainly consists of two modules: the voxel-to-pillar distillation module and the label-weight distillation module. Voxel-to-pillar distillation distills sparse 3D features to BEV features for middle layers to make the BEV-based model aware of more structural and geometric information. Label-weight distillation helps the model pay more attention to regions with more height information. Finally, we conduct experiments on the SemanticKITTI dataset and Paris-Lille-3D. The results on SemanticKITTI show more than 5% improvement on the test set, especially for classes such as motorcycle and person, with more than 15% improvement. The code can be accessed at https://github.com/fengjiang5/Knowledge-Distillation-from-Cylinder3D-to-PolarNet.

Submitted 22 April, 2023; originally announced April 2023.

Comments: ICME 2023 Accepted

arXiv:2304.00420 [pdf, other]

Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring

Authors: Runzhe Wan, Yu Liu, James McQueen, Doug Hains, Rui Song

Abstract: With the growing needs of online A/B testing to support the innovation in industry, the opportunity cost of running an experiment becomes non-negligible. Therefore, there is an increasing demand for an efficient continuous monitoring service that allows early stopping when appropriate. Classic statistical methods focus on hypothesis testing and are mostly developed for traditional high-stake problems such as clinical trials, while experiments at online service companies typically have very different features and focuses. Motivated by the real needs, in this paper, we introduce a novel framework that we developed in Amazon to maximize customer experience and control opportunity cost. We formulate the problem as a Bayesian optimal sequential decision making problem that has a unified utility function. We discuss extensively practical design choices and considerations. We further introduce how to solve the optimal decision rule via Reinforcement Learning and scale the solution. We show the effectiveness of this novel approach compared with existing methods via a large-scale meta-analysis on experiments in Amazon.

Submitted 1 April, 2023; originally announced April 2023.

arXiv:2302.13251 [pdf, other]

Unsupervised Domain Adaptation for Low-dose CT Reconstruction via Bayesian Uncertainty Alignment

Authors: Kecheng Chen, Jie Liu, Renjie Wan, Victor Ho-Fun Lee, Varut Vardhanabhuti, Hong Yan, Haoliang Li

Abstract: Low-dose computed tomography (LDCT) image reconstruction techniques can reduce patient radiation exposure while maintaining acceptable imaging quality. Deep learning is widely used in this problem, but the performance of testing data (a.k.a. target domain) is often degraded in clinical scenarios due to the variations that were not encountered in training data (a.k.a. source domain). Unsupervised domain adaptation (UDA) of LDCT reconstruction has been proposed to solve this problem through distribution alignment. However, existing UDA methods fail to explore the usage of uncertainty quantification, which is crucial for reliable intelligent medical systems in clinical scenarios with unexpected variations. Moreover, existing direct alignment for different patients would lead to content mismatch issues. To address these issues, we propose to leverage a probabilistic reconstruction framework to conduct a joint discrepancy minimization between source and target domains in both the latent and image spaces. In the latent space, we devise a Bayesian uncertainty alignment to reduce the epistemic gap between the two domains. This approach reduces the uncertainty level of target domain data, making it more likely to render well-reconstructed results on target domains. In the image space, we propose a sharpness-aware distribution alignment to achieve a match of second-order information, which can ensure that the reconstructed images from the target domain have similar sharpness to normal-dose CT images from the source domain. Experimental results on two simulated datasets and one clinical low-dose imaging dataset show that our proposed method outperforms other methods in quantitative and visualized performance.

Submitted 2 June, 2024; v1 submitted 26 February, 2023; originally announced February 2023.

Comments: Accepted by IEEE Transactions on Neural Networks and Learning Systems

arXiv:2302.06169 [pdf, ps, other]

New Quantum MDS codes from Hermitian self-orthogonal generalized Reed-Solomon codes

Authors: Ruhao Wan, Shixin Zhu

Abstract: Quantum maximum-distance-separable (MDS for short) codes are an important class of quantum codes. In this paper, by using Hermitian self-orthogonal generalized Reed-Solomon (GRS for short) codes, we construct five new classes of $q$-ary quantum MDS codes with minimum distance larger than $q/2+1$. Furthermore, the parameters of our quantum MDS codes cannot be obtained from the previous constructions.

Submitted 9 July, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

Comments: 19 pages, 3 tables

MSC Class: 94B05; 81P70
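As a reading aid for the abstract above, here is a minimal sketch of what "quantum MDS" means in terms of parameters. It relies on the standard quantum Singleton bound, $k \le n - 2(d-1)$ for an $[[n,k,d]]_q$ code, with equality defining a quantum MDS code. The function names are hypothetical and this is not the authors' construction; it only checks parameter triples.

```python
def quantum_singleton_slack(n, k, d):
    """Return n - k - 2*(d - 1); zero means the quantum Singleton bound is met."""
    return n - k - 2 * (d - 1)


def is_quantum_mds(n, k, d):
    """True if [[n, k, d]] attains the quantum Singleton bound with equality."""
    slack = quantum_singleton_slack(n, k, d)
    if slack < 0:
        raise ValueError("parameters violate the quantum Singleton bound")
    return slack == 0


def distance_exceeds_half_q(d, q):
    """Check d > q/2 + 1 using integers only (equivalent to 2d > q + 2)."""
    return 2 * d > q + 2
```

For example, the well-known $[[5,1,3]]$ code meets the bound with equality, while the $[[7,1,3]]$ Steane code has a slack of 2.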

arXiv:2302.05746 [pdf, other]

Removing Image Artifacts From Scratched Lens Protectors

Authors: Yufei Wang, Renjie Wan, Wenhan Yang, Bihan Wen, Lap-Pui Chau, Alex C. Kot

Abstract: A protector is placed in front of the camera lens for mobile devices to avoid damage, while the protector itself can be easily scratched accidentally, especially for plastic ones. The artifacts appear in a wide variety of patterns, making it difficult to see through them clearly. Removing image artifacts from the scratched lens protector is inherently challenging due to the occasional flare artifacts and the co-occurring interference within mixed artifacts. Though different methods have been proposed for some specific distortions, they seldom consider such inherent challenges. In our work, we consider the inherent challenges in a unified framework with two cooperative modules, which facilitate the performance boost of each other. We also collect a new dataset from the real world to facilitate training and evaluation purposes. The experimental results demonstrate that our method outperforms the baselines qualitatively and quantitatively. The code and datasets will be released after acceptance.

Submitted 14 February, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

Comments: Accepted by ISCAS 2023

arXiv:2302.01543 [pdf, other]

Multiplier Bootstrap-based Exploration

Authors: Runzhe Wan, Haoyu Wei, Branislav Kveton, Rui Song

Abstract: Despite the great interest in the bandit problem, designing efficient algorithms for complex models remains challenging, as there is typically no analytical way to quantify uncertainty. In this paper, we propose Multiplier Bootstrap-based Exploration (MBE), a novel exploration strategy that is applicable to any reward model amenable to weighted loss minimization. We prove both instance-dependent and instance-independent rate-optimal regret bounds for MBE in sub-Gaussian multi-armed bandits. With extensive simulation and real data experiments, we show the generality and adaptivity of MBE.

Submitted 2 February, 2023; originally announced February 2023.
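The idea of exploring through weighted loss minimization, as described in the MBE abstract above, can be illustrated with a rough sketch for a Bernoulli multi-armed bandit. This is a simplified illustration and not the authors' algorithm: the Exponential(1) multiplier distribution, the two pseudo-rewards per arm, and all function names are assumptions made here for the example.

```python
import random


def weighted_mean(rewards, weights):
    """Minimizer of sum_i w_i * (r_i - mu)^2 over mu: the weighted average."""
    return sum(w * r for w, r in zip(weights, rewards)) / sum(weights)


def bootstrap_index(rewards, rng, pseudo=(0.0, 1.0)):
    """One multiplier-bootstrap draw of an arm's mean reward.

    `pseudo` rewards are a hypothetical device that keeps the draw random
    even when the arm has few or no real observations.
    """
    data = list(rewards) + list(pseudo)
    # Assumed multipliers: Exponential(1), i.e. positive with mean 1.
    weights = [rng.expovariate(1.0) for _ in data]
    return weighted_mean(data, weights)


def run_bandit(arm_means, rounds, seed=0):
    """Simulate a Bernoulli bandit; return the number of pulls per arm."""
    rng = random.Random(seed)
    history = [[] for _ in arm_means]
    for _ in range(rounds):
        # Re-solve the randomly weighted problem for every arm, then act greedily.
        scores = [bootstrap_index(h, rng) for h in history]
        arm = max(range(len(arm_means)), key=scores.__getitem__)
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        history[arm].append(reward)
    return [len(h) for h in history]
```

Calling `run_bandit([0.2, 0.8], 200)` returns the pull counts per arm. The randomness of the multipliers plays the role that a confidence bonus or posterior sample plays in other exploration strategies.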

arXiv:2301.13152 [pdf, other]

STEEL: Singularity-aware Reinforcement Learning

Authors: Xiaohong Chen, Zhengling Qi, Runzhe Wan

Abstract: Batch reinforcement learning (RL) aims at leveraging pre-collected data to find an optimal policy that maximizes the expected total rewards in a dynamic environment. The existing methods require absolutely continuous assumption (e.g., there do not exist non-overlapping regions) on the distribution induced by target policies with respect to the data distribution over either the state or action or both. We propose a new batch RL algorithm that allows for singularity for both state and action spaces (e.g., existence of non-overlapping regions between offline data distribution and the distribution induced by the target policies) in the setting of an infinite-horizon Markov decision process with continuous states and actions. We call our algorithm STEEL: SingulariTy-awarE rEinforcement Learning. Our algorithm is motivated by a new error analysis on off-policy evaluation, where we use maximum mean discrepancy, together with distributionally robust optimization, to characterize the error of off-policy evaluation caused by the possible singularity and to enable model extrapolation. By leveraging the idea of pessimism and under some technical conditions, we derive a first finite-sample regret guarantee for our proposed algorithm under singularity. Compared with existing algorithms, by requiring only minimal data-coverage assumption, STEEL improves the applicability and robustness of batch RL. In addition, a two-step adaptive STEEL, which is nearly tuning-free, is proposed. Extensive simulation studies and one (semi)-real experiment on personalized pricing demonstrate the superior performance of our methods in dealing with possible singularity in batch RL.

Submitted 25 June, 2024; v1 submitted 30 January, 2023; originally announced January 2023.

arXiv:2301.07301 [pdf, other]

PTA-Det: Point Transformer Associating Point cloud and Image for 3D Object Detection

Authors: Rui Wan, Tianyun Zhao, Wei Zhao

Abstract: In autonomous driving, 3D object detection based on multi-modal data has become an indispensable approach when facing complex environments around the vehicle. During multi-modal detection, LiDAR and camera are simultaneously applied for capturing and modeling. However, due to the intrinsic discrepancies between the LiDAR point and camera image, the fusion of the data for object detection encounters a series of problems. Most multi-modal detection methods perform even worse than LiDAR-only methods. In this investigation, we propose a method named PTA-Det to improve the performance of multi-modal detection. Accompanied by PTA-Det, a Pseudo Point Cloud Generation Network is proposed, which can convert image information including texture and semantic features by pseudo points. Thereafter, through a transformer-based Point Fusion Transition (PFT) module, the features of LiDAR points and pseudo points from image can be deeply fused under a unified point-based representation. The combination of these modules can conquer the major obstacle in feature fusion across modalities and realizes a complementary and discriminative representation for proposal generation. Extensive experiments on the KITTI dataset show the PTA-Det achieves a competitive result and support its effectiveness.

Submitted 17 January, 2023; originally announced January 2023.

arXiv:2301.05036 [pdf, other]

Everyone's Voice Matters: Quantifying Annotation Disagreement Using Demographic Information

Authors: Ruyuan Wan, Jaehyung Kim, Dongyeop Kang

Abstract: In NLP annotation, it is common to have multiple annotators label the text and then obtain the ground truth labels based on the agreement of major annotators. However, annotators are individuals with different backgrounds, and minors' opinions should not be simply ignored. As annotation tasks become subjective and topics are controversial in modern NLP tasks, we need NLP systems that can represent people's diverse voices on subjective matters and predict the level of diversity. This paper examines whether the text of the task and annotators' demographic background information can be used to estimate the level of disagreement among annotators. Particularly, we extract disagreement labels from the annotators' voting histories in the five subjective datasets, and then fine-tune language models to predict annotators' disagreement. Our results show that knowing annotators' demographic information, like gender, ethnicity, and education level, helps predict disagreements. In order to distinguish the disagreement from the inherent controversy from text content and the disagreement in the annotators' different perspectives, we simulate everyone's voices with different combinations of annotators' artificial demographics and examine its variance of the finetuned disagreement predictor. Our paper aims to improve the annotation process for more efficient and inclusive NLP systems through a novel disagreement prediction mechanism. Our code and dataset are publicly available.

Submitted 12 January, 2023; originally announced January 2023.

arXiv:2212.14580 [pdf, ps, other]

Heterogeneous Synthetic Learner for Panel Data

Authors: Ye Shen, Runzhe Wan, Hengrui Cai, Rui Song

Abstract: In the new era of personalization, learning the heterogeneous treatment effect (HTE) becomes an inevitable trend with numerous applications. Yet, most existing HTE estimation methods focus on independently and identically distributed observations and cannot handle the non-stationarity and temporal dependency in the common panel data setting. The treatment evaluators developed for panel data, on the other hand, typically ignore the individualized information. To fill the gap, in this paper, we initialize the study of HTE estimation in panel data. Under different assumptions for HTE identifiability, we propose the corresponding heterogeneous one-side and two-side synthetic learner, namely H1SL and H2SL, by leveraging the state-of-the-art HTE estimator for non-panel data and generalizing the synthetic control method that allows flexible data generating process. We establish the convergence rates of the proposed estimators. The superior performance of the proposed methods over existing ones is demonstrated by extensive numerical studies.

Submitted 29 January, 2023; v1 submitted 30 December, 2022; originally announced December 2022.

arXiv:2212.12845 [pdf, ps, other]

Mining the Factor Zoo: Estimation of Latent Factor Models with Sufficient Proxies

Authors: Runzhe Wan, Yingying Li, Wenbin Lu, Rui Song

Abstract: Latent factor model estimation typically relies on either using domain knowledge to manually pick several observed covariates as factor proxies, or purely conducting multivariate analysis such as principal component analysis. However, the former approach may suffer from bias while the latter cannot incorporate additional information. We propose to bridge these two approaches while allowing the number of factor proxies to diverge, and hence make the latent factor model estimation robust, flexible, and statistically more accurate. As a bonus, the number of factors is also allowed to grow. At the heart of our method is a penalized reduced rank regression to combine information. To further deal with heavy-tailed data, a computationally attractive penalized robust reduced rank regression method is proposed. We establish faster rates of convergence compared with the benchmark. Extensive simulations and real examples are used to illustrate the advantages.

Submitted 2 January, 2023; v1 submitted 24 December, 2022; originally announced December 2022.

arXiv:2211.01553 [pdf, other]

User or Labor: An Interaction Framework for Human-Machine Relationships in NLP

Authors: Ruyuan Wan, Naome Etori, Karla Badillo-Urquiola, Dongyeop Kang

Abstract: The bridging research between Human-Computer Interaction and Natural Language Processing is developing quickly these years. However, there is still a lack of formative guidelines to understand the human-machine interaction in the NLP loop. When researchers crossing the two fields talk about humans, they may imply a user or labor. Regarding a human as a user, the human is in control, and the machine is used as a tool to achieve the human's goals. Considering a human as a laborer, the machine is in control, and the human is used as a resource to achieve the machine's goals. Through a systematic literature review and thematic analysis, we present an interaction framework for understanding human-machine relationships in NLP. In the framework, we propose four types of human-machine interactions: Human-Teacher and Machine-Learner, Machine-Leading, Human-Leading, and Human-Machine Collaborators. Our analysis shows that the type of interaction is not fixed but can change across tasks as the relationship between the human and the machine develops. We also discuss the implications of this framework for the future of NLP and human-machine relationships.

Submitted 2 November, 2022; originally announced November 2022.

arXiv:2210.10562 [pdf, ps, other]

Research on Hermitian self-dual codes, GRS codes and EGRS codes

Authors: Ruhao Wan, Shixin Zhu

Abstract: MDS self-dual codes have nice algebraic structures, theoretical significance and practical implications. In this paper, we present three classes of $q^2$-ary Hermitian self-dual (extended) generalized Reed-Solomon codes with different code locators. Combining the results in Ball et al. (Designs, Codes and Cryptography, 89: 811-821, 2021), we show that if the code locators do not contain zero, $q^2$-ary Hermitian self-dual (extended) GRS codes of length $\geq 2q\ (q>2)$ do not exist. Under certain conditions, we prove Conjecture 3.7 and Conjecture 3.13 proposed by Guo and Li et al. (IEEE Communications Letters, 25(4): 1062-1065, 2021).

Submitted 14 December, 2022; v1 submitted 19 October, 2022; originally announced October 2022.

Comments: 18 pages

MSC Class: 94B05; 81P70

arXiv:2209.12254 [pdf, other]

From One to Many: Dynamic Cross Attention Networks for LiDAR and Camera Fusion

Authors: Rui Wan, Shuangjie Xu, Wei Wu, Xiaoyi Zou, Tongyi Cao

Abstract: LiDAR and cameras are two complementary sensors for 3D perception in autonomous driving. LiDAR point clouds have accurate spatial and geometry information, while RGB images provide textural and color data for context reasoning. To exploit LiDAR and cameras jointly, existing fusion methods tend to align each 3D point to only one projected image pixel based on calibration, namely one-to-one mapping. However, the performance of these approaches highly relies on the calibration quality, which is sensitive to the temporal and spatial synchronization of sensors. Therefore, we propose a Dynamic Cross Attention (DCA) module with a novel one-to-many cross-modality mapping that learns multiple offsets from the initial projection towards the neighborhood and thus develops tolerance to calibration error. Moreover, a dynamic query enhancement is proposed to perceive the model-independent calibration, which further strengthens DCA's tolerance to the initial misalignment. The whole fusion architecture named Dynamic Cross Attention Network (DCAN) exploits multi-level image features and adapts to multiple representations of point clouds, which allows DCA to serve as a plug-in fusion module. Extensive experiments on nuScenes and KITTI prove DCA's effectiveness. The proposed DCAN outperforms state-of-the-art methods on the nuScenes detection challenge.

Submitted 25 September, 2022; originally announced September 2022.

arXiv:2207.11744 [pdf, ps, other]

New MDS self-dual codes over finite fields $\F_{r^2}$

Authors: Ruhao Wan, Yang Li, Shixin Zhu

Abstract: MDS self-dual codes have nice algebraic structures and are uniquely determined by lengths. Recently, the construction of MDS self-dual codes of new lengths has become an important and hot issue in coding theory. In this paper, we develop the existing theory and construct six new classes of MDS self-dual codes. Together with our constructions, the proportion of all known MDS self-dual codes relative to possible MDS self-dual codes generally exceeds 57%. As far as we know, this is the largest known ratio. Moreover, some new families of MDS self-orthogonal codes and MDS almost self-dual codes are also constructed.

Submitted 3 October, 2022; v1 submitted 24 July, 2022; originally announced July 2022.

Comments: 16 pages, 3 tables

MSC Class: 94B05; 81P70 ACM Class: E.4

arXiv:2207.04232 [pdf, ps, other]

Construction of MDS self-dual codes from generalized Reed-Solomon codes

Authors: Ruhao Wan, Shixin Zhu, Jin Li

Abstract: MDS codes and self-dual codes are important families of classical codes in coding theory. It is of interest to investigate MDS self-dual codes. The existence of MDS self-dual codes over the finite field $F_q$ is completely solved when $q$ is even. In this paper, for finite fields of odd characteristic, we construct some new classes of MDS self-dual codes by (extended) generalized Reed-Solomon codes.

Submitted 27 August, 2022; v1 submitted 9 July, 2022; originally announced July 2022.

Comments: 24 pages, 2 tables

MSC Class: 94B05; 81P70 ACM Class: E.4

arXiv:2206.06615 [pdf, ps, other]

MDS Codes with Euclidean and Hermitian Hulls of Flexible Dimensions and Their Applications to EAQECCs

Authors: Yang Li, Ruhao Wan, Shixin Zhu

Abstract: The hull of a linear code is the intersection of itself with its dual code with respect to a certain inner product. Both Euclidean and Hermitian hulls are of theoretical and practical significance. In this paper, we construct several new classes of MDS codes via (extended) generalized Reed-Solomon (GRS) codes and determine their Euclidean or Hermitian hulls. Specifically, four new classes of MDS codes with Hermitian hulls of flexible dimensions and six new classes of MDS codes with Euclidean hulls of flexible dimensions are constructed. For the former, we further construct four new classes of entanglement-assisted quantum error-correcting codes (EAQECCs) and four new classes of MDS EAQECCs of length $n>q+1$. For the latter, we also give some examples on Euclidean self-orthogonal and one-dimensional Euclidean hull MDS codes.

Submitted 5 October, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

Comments: 25 pages, 5 tables

MSC Class: 94B05; 81P70

arXiv:2204.12148 [pdf, other]

Morest: Model-based RESTful API Testing with Execution Feedback

Authors: Yi Liu, Yuekang Li, Gelei Deng, Yang Liu, Ruiyuan Wan, Runchao Wu, Dandan Ji, Shiheng Xu, Minli Bao

Abstract: RESTful APIs are arguably the most popular endpoints for accessing Web services. Blackbox testing is one of the emerging techniques for ensuring the reliability of RESTful APIs. The major challenge in testing RESTful APIs is the need for correct sequences of API operation calls for in-depth testing. To build meaningful operation call sequences, researchers have proposed techniques to learn and utilize the API dependencies based on OpenAPI specifications. However, these techniques either lack the overall awareness of how all the APIs are connected or the flexibility of adaptively fixing the learned knowledge. In this paper, we propose Morest, a model-based RESTful API testing technique that builds and maintains a dynamically updating RESTful-service Property Graph (RPG) to model the behaviors of RESTful-services and guide the call sequence generation. We empirically evaluated Morest and the results demonstrate that Morest can successfully request an average of 152.66%-232.45% more API operations, cover 26.16%-103.24% more lines of code, and detect 40.64%-215.94% more bugs than state-of-the-art techniques. In total, we applied Morest to 6 real-world projects and found 44 bugs (13 of them cannot be detected by existing approaches). Specifically, 2 of the confirmed bugs are from Bitbucket, a famous code management service with more than 6 million users.

Submitted 26 April, 2022; originally announced April 2022.

Journal ref: 44th International Conference on Software Engineering (ICSE 2022)

arXiv:2204.12088 [pdf, ps, other]

A physics-informed deep neural network for surrogate modeling in classical elasto-plasticity

Authors: Mahdad Eghbalian, Mehdi Pouragha, Richard Wan

Abstract: In this work, we present a deep neural network architecture that can efficiently approximate classical elasto-plastic constitutive relations. The network is enriched with crucial physics aspects of classical elasto-plasticity, including additive decomposition of strains into elastic and plastic parts, and nonlinear incremental elasticity. This leads to a Physics-Informed Neural Network (PINN) surrogate model named here as Elasto-Plastic Neural Network (EPNN). Detailed analyses show that embedding these physics into the architecture of the neural network facilitates a more efficient training of the network with less training data, while also enhancing the extrapolation capability for loading regimes outside the training data. The architecture of EPNN is model and material-independent, i.e. it can be adapted to a wide range of elasto-plastic material types, including geomaterials and metals; and experimental data can potentially be directly used in training the network. To demonstrate the robustness of the proposed architecture, we adapt its general framework to the elasto-plastic behavior of sands. We use synthetic data generated from material point simulations based on a relatively advanced dilatancy-based constitutive model for granular materials to train the neural network. The superiority of EPNN over regular neural network architectures is explored through predicting unseen strain-controlled loading paths for sands with different initial densities.

Submitted 26 April, 2022; originally announced April 2022.

Comments: 53 pages, 30 figures, preprint submitted to Elsevier

MSC Class: 74C05; 65N99 ACM Class: J.2; I.6.5

arXiv:2202.13234 [pdf, other]

Safe Exploration for Efficient Policy Evaluation and Comparison

Authors: Runzhe Wan, Branislav Kveton, Rui Song

Abstract: High-quality data plays a central role in ensuring the accuracy of policy evaluation. This paper initiates the study of efficient and safe data collection for bandit policy evaluation. We formulate the problem and investigate several of its representative variants. For each variant, we analyze its statistical properties, derive the corresponding exploration policy, and design an efficient algorithm for computing it. Both theoretical analysis and experiments support the usefulness of the proposed methods.

Submitted 18 June, 2022; v1 submitted 26 February, 2022; originally announced February 2022.

arXiv:2202.13227 [pdf, other]

Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework

Authors: Runzhe Wan, Lin Ge, Rui Song

Abstract: Online learning in large-scale structured bandits is known to be challenging due to the curse of dimensionality. In this paper, we propose a unified meta-learning framework for a general class of structured bandit problems where the parameter space can be factorized to item-level. The novel bandit algorithm is general enough to be applied to many popular problems, scalable to the huge parameter and action spaces, and robust to the specification of the generalization model. At the core of this framework is a Bayesian hierarchical model that allows information sharing among items via their features, upon which we design a meta Thompson sampling algorithm. Three representative examples are discussed thoroughly. Both theoretical analysis and numerical results support the usefulness of the proposed method.

Submitted 26 February, 2022; originally announced February 2022.

arXiv:2202.10574 [pdf, other]

A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets

Authors: Chengchun Shi, Runzhe Wan, Ge Song, Shikai Luo, Rui Song, Hongtu Zhu

Abstract: The two-sided markets such as ride-sharing companies often involve a group of subjects who are making sequential decisions across time and/or location. With the rapid development of smart phones and internet of things, they have substantially transformed the transportation landscape of human beings. In this paper we consider large-scale fleet management in ride-sharing companies that involve multiple units in different areas receiving sequences of products (or treatments) over time. Major technical challenges, such as policy evaluation, arise in those studies because (i) spatial and temporal proximities induce interference between locations and times; and (ii) the large number of locations results in the curse of dimensionality. To address both challenges simultaneously, we introduce a multi-agent reinforcement learning (MARL) framework for carrying out policy evaluation in these studies. We propose novel estimators for mean outcomes under different products that are consistent despite the high-dimensionality of state-action space. The proposed estimator works favorably in simulation experiments. We further illustrate our method using a real dataset obtained from a two-sided marketplace company to evaluate the effects of applying different subsidizing policies. A Python implementation of our proposed method is available at https://github.com/RunzheStat/CausalMARL.

Submitted 26 March, 2023; v1 submitted 21 February, 2022; originally announced February 2022.

arXiv:2201.05972 [pdf, other]

Sparse Cross-scale Attention Network for Efficient LiDAR Panoptic Segmentation

Authors: Shuangjie Xu, Rui Wan, Maosheng Ye, Xiaoyi Zou, Tongyi Cao

Abstract: Two major challenges of 3D LiDAR Panoptic Segmentation (PS) are that point clouds of an object are surface-aggregated and thus hard to model the long-range dependency especially for large instances, and that objects are too close to separate each other. Recent literature addresses these problems by time-consuming grouping processes such as dual-clustering, mean-shift offsets, etc., or by bird-eye-view (BEV) dense centroid representation that downplays geometry. However, the long-range geometry relationship has not been sufficiently modeled by local feature learning from the above methods. To this end, we present SCAN, a novel sparse cross-scale attention network to first align multi-scale sparse features with global voxel-encoded attention to capture the long-range relationship of instance context, which can boost the regression accuracy of the over-segmented large objects. For the surface-aggregated points, SCAN adopts a novel sparse class-agnostic representation of instance centroids, which can not only maintain the sparsity of aligned features to solve the under-segmentation on small objects, but also reduce the computation amount of the network through sparse convolution. Our method outperforms previous methods by a large margin in the SemanticKITTI dataset for the challenging 3D PS task, achieving 1st place with a real-time inference speed.

Submitted 16 January, 2022; originally announced January 2022.

Comments: Accepted by the Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22)

arXiv:2201.03145 [pdf, other]

Enhancing Low-Light Images in Real World via Cross-Image Disentanglement

Authors: Lanqing Guo, Renjie Wan, Wenhan Yang, Alex Kot, Bihan Wen

Abstract: Images captured in the low-light condition suffer from low visibility and various imaging artifacts, e.g., real noise. Existing supervised enlightening algorithms require a large set of pixel-aligned training image pairs, which are hard to prepare in practice. Though weakly-supervised or unsupervised methods can alleviate such challenges without using paired training images, some real-world artifacts inevitably get falsely amplified because of the lack of corresponding supervision. In this paper, instead of using perfectly aligned images for training, we creatively employ the misaligned real-world images as the guidance, which are considerably easier to collect. Specifically, we propose a Cross-Image Disentanglement Network (CIDN) to separately extract cross-image brightness and image-specific content features from low/normal-light images. Based on that, CIDN can simultaneously correct the brightness and suppress image artifacts in the feature domain, which largely increases the robustness to the pixel shifts. Furthermore, we collect a new low-light image enhancement dataset consisting of misaligned training images with real-world corruptions. Experimental results show that our model achieves state-of-the-art performances on both the newly proposed dataset and other popular low-light datasets.

Submitted 7 July, 2022; v1 submitted 9 January, 2022; originally announced January 2022.

ACM Class: I.4.3; I.4.4

arXiv:2111.08318 [pdf, other]

DRINet++: Efficient Voxel-as-point Point Cloud Segmentation

Authors: Maosheng Ye, Rui Wan, Shuangjie Xu, Tongyi Cao, Qifeng Chen

Abstract: Recently, many approaches have been proposed through single or multiple representations to improve the performance of point cloud semantic segmentation. However, these works do not maintain a good balance among performance, efficiency, and memory consumption. To address these issues, we propose DRINet++ that extends DRINet by enhancing the sparsity and geometric properties of a point cloud with a voxel-as-point principle. To improve efficiency and performance, DRINet++ mainly consists of two modules: Sparse Feature Encoder and Sparse Geometry Feature Enhancement. The Sparse Feature Encoder extracts the local context information for each point, and the Sparse Geometry Feature Enhancement enhances the geometric properties of a sparse point cloud via multi-scale sparse projection and attentive multi-scale fusion. In addition, we propose deep sparse supervision in the training phase to help convergence and alleviate the memory consumption problem. Our DRINet++ achieves state-of-the-art outdoor point cloud segmentation on both SemanticKITTI and nuScenes datasets while running significantly faster and consuming less memory.

Submitted 16 November, 2021; originally announced November 2021.

arXiv:2110.06753 [pdf, other]

doi 10.1109/TIFS.2022.3158551

Learning Meta Pattern for Face Anti-Spoofing

Authors: Rizhao Cai, Zhi Li, Renjie Wan, Haoliang Li, Yongjian Hu, Alex Chichung Kot

Abstract: Face Anti-Spoofing (FAS) is essential to secure face recognition systems and has been extensively studied in recent years. Although deep neural networks (DNNs) for the FAS task have achieved promising results in intra-dataset experiments with similar distributions of training and testing data, the DNNs' generalization ability is limited under the cross-domain scenarios with different distributions of training and testing data. To improve the generalization ability, recent hybrid methods have been explored to extract task-aware handcrafted features (e.g., Local Binary Pattern) as discriminative information for the input of DNNs. However, the handcrafted feature extraction relies on experts' domain knowledge, and how to choose appropriate handcrafted features is underexplored. To this end, we propose a learnable network to extract Meta Pattern (MP) in our learning-to-learn framework. By replacing handcrafted features with the MP, the discriminative information from MP is capable of learning a more generalized model. Moreover, we devise a two-stream network to hierarchically fuse the input RGB image and the extracted MP by using our proposed Hierarchical Fusion Module (HFM). We conduct comprehensive experiments and show that our MP outperforms the compared handcrafted features. Also, our proposed method with HFM and the MP can achieve state-of-the-art performance on two different domain generalization evaluation benchmarks.

Submitted 17 May, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

Comments: Accepted by IEEE Transactions on Information Forensics and Security (https://ieeexplore.ieee.org.remotexs.ntu.edu.sg/document/9732458). Source code available at https://github.com/RizhaoCai/MetaPattern_FAS

Journal ref: IEEE Transactions on Information Forensics and Security, vol. 17, pp. 1201-1213, 2022

arXiv:2109.05923 [pdf, other]

Low-Light Image Enhancement with Normalizing Flow

Authors: Yufei Wang, Renjie Wan, Wenhan Yang, Haoliang Li, Lap-Pui Chau, Alex C. Kot

Abstract: To enhance low-light images to normally-exposed ones is highly ill-posed, namely that the mapping relationship between them is one-to-many. Previous works based on the pixel-wise reconstruction losses and deterministic processes fail to capture the complex conditional distribution of normally exposed images, which results in improper brightness, residual noise, and artifacts. In this paper, we investigate modeling this one-to-many relationship via a proposed normalizing flow model. An invertible network takes the low-light images/features as the condition and learns to map the distribution of normally exposed images into a Gaussian distribution. In this way, the conditional distribution of the normally exposed images can be well modeled, and the enhancement process, i.e., the other inference direction of the invertible network, is equivalent to being constrained by a loss function that better describes the manifold structure of natural images during the training. The experimental results on the existing benchmark datasets show our method achieves better quantitative and qualitative results, obtaining better-exposed illumination, less noise and artifact, and richer colors.

Submitted 13 September, 2021; originally announced September 2021.

arXiv:2108.06422 [pdf, other]

Metadata-based Multi-Task Bandits with Bayesian Hierarchical Models

Authors: Runzhe Wan, Lin Ge, Rui Song

Abstract: How to explore efficiently is a central problem in multi-armed bandits. In this paper, we introduce the metadata-based multi-task bandit problem, where the agent needs to solve a large number of related multi-armed bandit tasks and can leverage some task-specific features (i.e., metadata) to share knowledge across tasks. As a general framework, we propose to capture task relations through the lens of Bayesian hierarchical models, upon which a Thompson sampling algorithm is designed to efficiently learn task relations, share information, and minimize the cumulative regrets. Two concrete examples for Gaussian bandits and Bernoulli bandits are carefully analyzed. The Bayes regret for Gaussian bandits clearly demonstrates the benefits of information sharing with our algorithm. The proposed method is further supported by extensive experiments.

Submitted 13 August, 2021; originally announced August 2021.

arXiv:2105.13218 [pdf, other]

Pattern Transfer Learning for Reinforcement Learning in Order Dispatching

Authors: Runzhe Wan, Sheng Zhang, Chengchun Shi, Shikai Luo, Rui Song

Abstract: Order dispatch is one of the central problems to ride-sharing platforms. Recently, value-based reinforcement learning algorithms have shown promising performance on this problem. However, in real-world applications, the non-stationarity of the demand-supply system poses challenges to re-utilizing data generated in different time periods to learn the value function. In this work, motivated by the fact that the relative relationship between the values of some states is largely stable across various environments, we propose a pattern transfer learning framework for value-based reinforcement learning in the order dispatch problem. Our method efficiently captures the value patterns by incorporating a concordance penalty. The superior performance of the proposed method is supported by experiments.

Submitted 18 June, 2021; v1 submitted 27 May, 2021; originally announced May 2021.

Comments: Spotlight paper, RL4ITS, IJCAI-21

Journal ref: RL4ITS, IJCAI, 2021

Showing 1–50 of 68 results for author: Wan, R