
Showing 1–50 of 90 results for author: Guo, F

  1. arXiv:2406.17326  [pdf, other]

    cs.AI

    The State-Action-Reward-State-Action Algorithm in Spatial Prisoner's Dilemma Game

    Authors: Lanyu Yang, Dongchun Jiang, Fuqiang Guo, Mingjian Fu

    Abstract: Cooperative behavior is prevalent in both human society and nature. Understanding the emergence and maintenance of cooperation among self-interested individuals remains a significant challenge in evolutionary biology and social sciences. Reinforcement learning (RL) provides a suitable framework for studying evolutionary game theory as it can adapt to environmental changes and maximize expected ben…

    Submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2406.14080  [pdf, other]

    cs.CV cs.GR

    CMTNet: Convolutional Meets Transformer Network for Hyperspectral Images Classification

    Authors: Faxu Guo, Quan Feng, Sen Yang, Wanxia Yang

    Abstract: Hyperspectral remote sensing (HIS) enables the detailed capture of spectral information from the Earth's surface, facilitating precise classification and identification of surface crops due to its superior spectral diagnostic capabilities. However, current convolutional neural networks (CNNs) focus on local features in hyperspectral data, leading to suboptimal performance when classifying intricat…

    Submitted 20 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: 15 pages, 11 figures

    ACM Class: I.4.6

  3. arXiv:2406.13724  [pdf, other]

    cs.AI

    Heterogeneous Graph Neural Networks with Post-hoc Explanations for Multi-modal and Explainable Land Use Inference

    Authors: Xuehao Zhai, Junqi Jiang, Adam Dejl, Antonio Rago, Fangce Guo, Francesca Toni, Aruna Sivakumar

    Abstract: Urban land use inference is a critically important task that aids in city planning and policy-making. Recently, the increased use of sensor and location technologies has facilitated the collection of multi-modal mobility data, offering valuable insights into daily activity patterns. Many studies have adopted advanced data-driven techniques to explore the potential of these multi-modal mobility dat…

    Submitted 19 June, 2024; originally announced June 2024.

  4. arXiv:2406.10673  [pdf, other]

    cs.CV

    SemanticMIM: Marring Masked Image Modeling with Semantics Compression for General Visual Representation

    Authors: Yike Yuan, Huanzhang Dou, Fengjun Guo, Xi Li

    Abstract: This paper presents a neat yet effective framework, named SemanticMIM, to integrate the advantages of masked image modeling (MIM) and contrastive learning (CL) for general visual representation. We conduct a thorough comparative analysis between CL and MIM, revealing that their complementary advantages fundamentally stem from two distinct phases, i.e., compression and reconstruction. Specificall…

    Submitted 15 June, 2024; originally announced June 2024.

  5. arXiv:2406.05343  [pdf, other]

    cs.AI cs.CL

    M3GIA: A Cognition Inspired Multilingual and Multimodal General Intelligence Ability Benchmark

    Authors: Wei Song, Yadong Li, Jianhua Xu, Guowei Wu, Lingfeng Ming, Kexin Yi, Weihua Luo, Houyi Li, Yi Du, Fangda Guo, Kaicheng Yu

    Abstract: As recent multi-modality large language models (MLLMs) have shown formidable proficiency on various complex tasks, there has been increasing attention on debating whether these models could eventually mirror human intelligence. However, existing benchmarks mainly focus on evaluating solely on task performance, such as the accuracy of identifying the attribute of an object. Combining well-developed…

    Submitted 14 June, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

  6. arXiv:2405.05928  [pdf]

    cs.HC

    Moderating Embodied Cyber Threats Using Generative AI

    Authors: Keyan Guo, Freeman Guo, Hongxin Hu

    Abstract: The advancement in computing and hardware, like spatial computing and VR headsets (e.g., Apple's Vision Pro) [1], has boosted the popularity of social VR platforms (VRChat, Rec Room, Meta HorizonWorlds) [2, 3, 4]. Unlike traditional digital interactions, social VR allows for more immersive experiences, with avatars that mimic users' real-time movements and enable physical-like interactions. Howeve…

    Submitted 23 April, 2024; originally announced May 2024.

    Comments: This is an accepted position statement of CHI 2024 Workshop (Novel Approaches for Understanding and Mitigating Emerging New Harms in Immersive and Embodied Virtual Spaces: A Workshop at CHI 2024)

  7. arXiv:2404.11960  [pdf, other]

    cs.IR cs.AI

    Generating Diverse Criteria On-the-Fly to Improve Point-wise LLM Rankers

    Authors: Fang Guo, Wenyu Li, Honglei Zhuang, Yun Luo, Yafu Li, Qi Zhu, Le Yan, Yue Zhang

    Abstract: The most recent pointwise Large Language Model (LLM) rankers have achieved remarkable ranking results. However, these rankers are hindered by two major drawbacks: (1) they fail to follow a standardized comparison guidance during the ranking process, and (2) they struggle with comprehensive considerations when dealing with complicated passages. To address these shortcomings, we propose to build a r…

    Submitted 8 June, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  8. arXiv:2404.00885  [pdf, other]

    cs.LG

    Modeling Output-Level Task Relatedness in Multi-Task Learning with Feedback Mechanism

    Authors: Xiangming Xi, Feng Gao, Jun Xu, Fangtai Guo, Tianlei Jin

    Abstract: Multi-task learning (MTL) is a paradigm that simultaneously learns multiple tasks by sharing information at different levels, enhancing the performance of each individual task. While previous research has primarily focused on feature-level or parameter-level task relatedness, and proposed various model architectures and learning algorithms to improve learning performance, we aim to explore output-…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: submitted to CDC2024

  9. arXiv:2402.07915  [pdf]

    cs.HC cs.LG

    Research on Older Adults' Interaction with E-Health Interface Based on Explainable Artificial Intelligence

    Authors: Xueting Huang, Zhibo Zhang, Fusen Guo, Xianghao Wang, Kun Chi, Kexin Wu

    Abstract: This paper proposed a comprehensive mixed-methods framework with varied samples of older adults, including user experience, usability assessments, and in-depth interviews with the integration of Explainable Artificial Intelligence (XAI) methods. The experience of older adults' interaction with the Ehealth interface is collected through interviews and transformed into operatable databases whereas X…

    Submitted 1 February, 2024; originally announced February 2024.

  10. arXiv:2402.00904  [pdf, ps, other]

    cs.LG cs.AI

    Graph Domain Adaptation: Challenges, Progress and Prospects

    Authors: Boshen Shi, Yongqing Wang, Fangda Guo, Bingbing Xu, Huawei Shen, Xueqi Cheng

    Abstract: As graph representation learning often suffers from label scarcity problems in real-world applications, researchers have proposed graph domain adaptation (GDA) as an effective knowledge-transfer paradigm across graphs. In particular, to enhance model performance on target graphs with specific tasks, GDA introduces a bunch of task-related graphs as source graphs and adapts the knowledge learnt from…

    Submitted 31 January, 2024; originally announced February 2024.

  11. arXiv:2401.12895  [pdf, other]

    cs.SI cs.GR

    ESC: Edge-attributed Skyline Community Search in Large-scale Bipartite Graphs

    Authors: Fangda Guo, Xuanpu Luo, Yanghao Liu, Guoxin Chen, Yongqing Wang, Huawei Shen, Xueqi Cheng

    Abstract: Due to the ability of modeling relationships between two different types of entities, bipartite graphs are naturally employed in many real-world applications. Community Search in bipartite graphs is a fundamental problem and has gained much attention. However, existing studies focus on measuring the structural cohesiveness between two sets of vertices, while either completely ignoring the edge att…

    Submitted 23 January, 2024; originally announced January 2024.

  12. arXiv:2401.08345  [pdf, other]

    cs.CV

    Multi-view Distillation based on Multi-modal Fusion for Few-shot Action Recognition(CLIP-$\mathrm{M^2}$DF)

    Authors: Fei Guo, YiKang Wang, Han Qi, WenPing Jin, Li Zhu

    Abstract: In recent years, few-shot action recognition has attracted increasing attention. It generally adopts the paradigm of meta-learning. In this field, overcoming the overlapping distribution of classes and outliers is still a challenging problem based on limited samples. We believe the combination of Multi-modal and Multi-view can improve this issue depending on information complementarity. Therefore,…

    Submitted 16 January, 2024; originally announced January 2024.

  13. arXiv:2401.01896  [pdf]

    cs.CR cs.LG eess.SP

    Reputation-Based Federated Learning Defense to Mitigate Threats in EEG Signal Classification

    Authors: Zhibo Zhang, Pengfei Li, Ahmed Y. Al Hammadi, Fusen Guo, Ernesto Damiani, Chan Yeob Yeun

    Abstract: This paper presents a reputation-based threat mitigation framework that defends potential security threats in electroencephalogram (EEG) signal classification during model aggregation of Federated Learning. While EEG signal analysis has attracted attention because of the emergence of brain-computer interface (BCI) technology, it is difficult to create efficient learning models for EEG analysis bec…

    Submitted 22 October, 2023; originally announced January 2024.

  14. arXiv:2312.07285  [pdf, other]

    cs.LG stat.ML

    Forced Exploration in Bandit Problems

    Authors: Han Qi, Fei Guo, Li Zhu

    Abstract: The multi-armed bandit (MAB) is a classical sequential decision problem. Most work requires assumptions about the reward distribution (e.g., bounded), while practitioners may have difficulty obtaining information about these distributions to design models for their problems, especially in non-stationary MAB problems. This paper aims to design a multi-armed bandit algorithm that can be implemented w…

    Submitted 12 December, 2023; v1 submitted 12 December, 2023; originally announced December 2023.

  15. arXiv:2312.02694  [pdf, other]

    cs.CV

    UPOCR: Towards Unified Pixel-Level OCR Interface

    Authors: Dezhi Peng, Zhenhua Yang, Jiaxin Zhang, Chongyu Liu, Yongxin Shi, Kai Ding, Fengjun Guo, Lianwen Jin

    Abstract: In recent years, the optical character recognition (OCR) field has been proliferating with plentiful cutting-edge approaches for a wide spectrum of tasks. However, these approaches are task-specifically designed with divergent paradigms, architectures, and training strategies, which significantly increases the complexity of research and maintenance and hinders the fast deployment in applications.…

    Submitted 5 December, 2023; originally announced December 2023.

  16. arXiv:2312.01083  [pdf, other]

    cs.CV

    Consistency Prototype Module and Motion Compensation for Few-Shot Action Recognition (CLIP-CP$\mathbf{M^2}$C)

    Authors: Fei Guo, Li Zhu, YiKang Wang, Han Qi

    Abstract: Recently, few-shot action recognition has significantly progressed by learning the feature discriminability and designing suitable comparison methods. Still, there are the following restrictions. (a) Previous works are mainly based on visual mono-modal. Although some multi-modal works use labels as supplementary to construct prototypes of support videos, they can not use this information for query…

    Submitted 2 December, 2023; originally announced December 2023.

  17. arXiv:2311.10453  [pdf, other]

    cs.RO

    A Fingertip Sensor and Algorithms for Pre-touch Distance Ranging and Material Detection in Robotic Grasping

    Authors: Cheng Fang, Di Wang, Fengzhi Guo, Jun Zou, Dezhen Song

    Abstract: To enhance robotic grasping capabilities, we are developing new contactless fingertip sensors to measure distance in close proximity and simultaneously detect the type of material and the interior structure. These sensors are referred to as pre-touch dual-modal and dual-mechanism (PDM$^2$) sensors, and they operate using both pulse-echo ultrasound (US) and optoacoustic (OA) modalities. We present…

    Submitted 17 November, 2023; originally announced November 2023.

  18. arXiv:2311.08919  [pdf, other]

    cs.SI

    FCS-HGNN: Flexible Multi-type Community Search in Heterogeneous Information Networks

    Authors: Guoxin Chen, Fangda Guo, Yongqing Wang, Yanghao Liu, Peiying Yu, Huawei Shen, Xueqi Cheng

    Abstract: Community search is a personalized community discovery problem designed to identify densely connected subgraphs containing the query node. Recently, community search in heterogeneous information networks (HINs) has received considerable attention. Existing methods typically focus on modeling relationships in HINs through predefined meta-paths or user-specified relational constraints. However, meta…

    Submitted 2 March, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Ongoing Work

  19. arXiv:2310.14532  [pdf, other]

    cs.CV

    Practical Deep Dispersed Watermarking with Synchronization and Fusion

    Authors: Hengchang Guo, Qilong Zhang, Junwei Luo, Feng Guo, Wenbin Zhang, Xiaodong Su, Minglei Li

    Abstract: Deep learning based blind watermarking works have gradually emerged and achieved impressive performance. However, previous deep watermarking studies mainly focus on fixed low-resolution images while paying less attention to arbitrary resolution images, especially widespread high-resolution images nowadays. Moreover, most works usually demonstrate robustness against typical non-geometric attacks (\…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: Accepted by ACM MM 2023

  20. Causality and Independence Enhancement for Biased Node Classification

    Authors: Guoxin Chen, Yongqing Wang, Fangda Guo, Qinglang Guo, Jiangli Shao, Huawei Shen, Xueqi Cheng

    Abstract: Most existing methods that address out-of-distribution (OOD) generalization for node classification on graphs primarily focus on a specific type of data biases, such as label selection bias or structural bias. However, anticipating the type of bias in advance is extremely challenging, and designing models solely for one specific type may not necessarily improve overall generalization performance.…

    Submitted 4 November, 2023; v1 submitted 14 October, 2023; originally announced October 2023.

    Comments: 10 pages, 5 figures, accepted by CIKM2023

  21. arXiv:2310.05502  [pdf, other]

    cs.CL

    XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners

    Authors: Yun Luo, Zhen Yang, Fandong Meng, Yingjie Li, Fang Guo, Qinglin Qi, Jie Zhou, Yue Zhang

    Abstract: Active learning (AL), which aims to construct an effective training set by iteratively curating the most formative unlabeled data for annotation, has been widely used in low-resource tasks. Most active learning techniques in classification rely on the model's uncertainty or disagreement to choose unlabeled data, suffering from the problem of over-confidence in superficial patterns and a lack of ex…

    Submitted 14 March, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted by NAACL 2024

  22. arXiv:2310.03268  [pdf, other]

    cs.IT eess.SY

    On the Distribution of SINR for Cell-Free Massive MIMO Systems

    Authors: Baolin Chong, Fengqian Guo, Hancheng Lu, Langtian Qin

    Abstract: Cell-free (CF) massive multiple-input multiple-output (mMIMO) has been considered as a potential technology for Beyond 5G communication systems. However, the performance of CF mMIMO systems has not been well studied. Most existing analytical work on CF mMIMO systems is based on the expected signal-to-interference-plus-noise ratio (SINR). The statistical characteristics of the SINR, which is critic…

    Submitted 4 October, 2023; originally announced October 2023.

  23. arXiv:2308.13244  [pdf, other]

    cs.SI cs.DB

    Significant-attributed Community Search in Heterogeneous Information Networks

    Authors: Yanghao Liu, Fangda Guo, Bingbing Xu, Peng Bao, Huawei Shen, Xueqi Cheng

    Abstract: Community search is a personalized community discovery problem aimed at finding densely-connected subgraphs containing the query vertex. In particular, the search for communities with high-importance vertices has recently received a great deal of attention. However, existing works mainly focus on conventional homogeneous networks where vertices are of the same type, but are not applicable to heter…

    Submitted 25 August, 2023; originally announced August 2023.

    Comments: 14 pages, 11 figures

  24. arXiv:2308.00356  [pdf, other]

    cs.CV

    Deep Image Harmonization with Globally Guided Feature Transformation and Relation Distillation

    Authors: Li Niu, Linfeng Tan, Xinhao Tao, Junyan Cao, Fengjun Guo, Teng Long, Liqing Zhang

    Abstract: Given a composite image, image harmonization aims to adjust the foreground illumination to be consistent with background. Previous methods have explored transforming foreground features to achieve competitive performance. In this work, we show that using global information to guide foreground feature transformation could achieve significant improvement. Besides, we propose to transfer the foregrou…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV 2023

  25. arXiv:2307.11341  [pdf, other]

    cs.AI cs.DL cs.SI

    OpenGDA: Graph Domain Adaptation Benchmark for Cross-network Learning

    Authors: Boshen Shi, Yongqing Wang, Fangda Guo, Jiangli Shao, Huawei Shen, Xueqi Cheng

    Abstract: Graph domain adaptation models are widely adopted in cross-network learning tasks, with the aim of transferring labeling or structural knowledge. Currently, there mainly exist two limitations in evaluating graph domain adaptation models. On one side, they are primarily tested for the specific cross-network node classification task, leaving tasks at edge-level and graph-level largely under-explored…

    Submitted 21 July, 2023; originally announced July 2023.

    Comments: Under Review

  26. arXiv:2307.01985  [pdf, other]

    cs.CV cs.DM

    Task-Specific Alignment and Multiple Level Transformer for Few-Shot Action Recognition

    Authors: Fei Guo, Li Zhu, YiWang Wang, Jing Sun

    Abstract: In the research field of few-shot learning, the main difference between image-based and video-based is the additional temporal dimension. In recent years, some works have used the Transformer to deal with frames, then get the attention feature and the enhanced prototype, and the results are competitive. However, some video frames may relate little to the action, and only using single frame-level o…

    Submitted 30 November, 2023; v1 submitted 4 July, 2023; originally announced July 2023.

  27. arXiv:2307.00954  [pdf, other]

    cs.CV eess.IV

    HODINet: High-Order Discrepant Interaction Network for RGB-D Salient Object Detection

    Authors: Kang Yi, Jing Xu, Xiao Jin, Fu Guo, Yan-Feng Wu

    Abstract: RGB-D salient object detection (SOD) aims to detect the prominent regions by jointly modeling RGB and depth information. Most RGB-D SOD methods apply the same type of backbones and fusion modules to identically learn the multimodality and multistage features. However, these features contribute differently to the final saliency results, which raises two issues: 1) how to model discrepant characteri…

    Submitted 3 July, 2023; originally announced July 2023.

  28. arXiv:2306.15656  [pdf, other]

    cs.LG cs.AI cs.CC cs.CL cs.MS

    SparseOptimizer: Sparsify Language Models through Moreau-Yosida Regularization and Accelerate via Compiler Co-design

    Authors: Fu-Ming Guo

    Abstract: This paper introduces SparseOptimizer, a novel deep learning optimizer that exploits Moreau-Yosida regularization to naturally induce sparsity in large language models such as BERT, ALBERT and GPT. Key to the design of SparseOptimizer is an embedded shrinkage operator, which imparts sparsity directly within the optimization process. This operator, backed by a sound theoretical framework, includes…

    Submitted 18 July, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

  29. arXiv:2306.11977  [pdf]

    eess.IV cs.CV

    Encoding Enhanced Complex CNN for Accurate and Highly Accelerated MRI

    Authors: Zimeng Li, Sa Xiao, Cheng Wang, Haidong Li, Xiuchao Zhao, Caohui Duan, Qian Zhou, Qiuchen Rao, Yuan Fang, Junshuai Xie, Lei Shi, Fumin Guo, Chaohui Ye, Xin Zhou

    Abstract: Magnetic resonance imaging (MRI) using hyperpolarized noble gases provides a way to visualize the structure and function of human lung, but the long imaging time limits its broad research and clinical applications. Deep learning has demonstrated great potential for accelerating MRI by reconstructing images from undersampled data. However, most existing deep convolutional neural networks (CNN) direc…

    Submitted 13 November, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

  30. arXiv:2306.05749  [pdf, other]

    cs.CV

    DocAligner: Annotating Real-world Photographic Document Images by Simply Taking Pictures

    Authors: Jiaxin Zhang, Bangdong Chen, Hiuyi Cheng, Fengjun Guo, Kai Ding, Lianwen Jin

    Abstract: Recently, there has been a growing interest in research concerning document image analysis and recognition in photographic scenarios. However, the lack of labeled datasets for this emerging challenge poses a significant obstacle, as manual annotation can be time-consuming and impractical. To tackle this issue, we present DocAligner, a novel method that streamlines the manual annotation process to…

    Submitted 12 June, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

  31. arXiv:2306.02107  [pdf, other]

    cs.IT eess.SY

    Achievable Sum Rate Optimization on NOMA-aided Cell-Free Massive MIMO with Finite Blocklength Coding

    Authors: Baolin Chong, Hancheng Lu, Yuang Chen, Langtian Qin, Fengqian Guo

    Abstract: Non-orthogonal multiple access (NOMA)-aided cell-free massive multiple-input multiple-output (CFmMIMO) has been considered as a promising technology to fulfill strict quality of service requirements for ultra-reliable low-latency communications (URLLC). However, finite blocklength coding (FBC) in URLLC makes it challenging to achieve the optimal performance in the NOMA-aided CFmMIMO system. In thi…

    Submitted 25 March, 2024; v1 submitted 3 June, 2023; originally announced June 2023.

  32. arXiv:2304.10088  [pdf, other]

    eess.AS cs.CR cs.SD

    Towards the Universal Defense for Query-Based Audio Adversarial Attacks

    Authors: Feng Guo, Zheng Sun, Yuxuan Chen, Lei Ju

    Abstract: Recently, studies show that deep learning-based automatic speech recognition (ASR) systems are vulnerable to adversarial examples (AEs), which add a small amount of noise to the original audio examples. These AE attacks pose new challenges to deep learning security and have raised significant concerns about deploying ASR systems and devices. The existing defense methods are either limited in appli…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: Submitted to Cybersecurity journal

  33. arXiv:2304.08811  [pdf, other]

    cs.CR cs.LG cs.SD eess.AS

    Towards the Transferable Audio Adversarial Attack via Ensemble Methods

    Authors: Feng Guo, Zheng Sun, Yuxuan Chen, Lei Ju

    Abstract: In recent years, deep learning (DL) models have achieved significant progress in many domains, such as autonomous driving, facial recognition, and speech recognition. However, the vulnerability of deep learning models to adversarial attacks has raised serious concerns in the community because of their insufficient robustness and generalization. Also, transferable attacks have become a prominent me…

    Submitted 18 April, 2023; originally announced April 2023.

    Comments: Submitted to Cybersecurity journal 2023

  34. arXiv:2303.17354  [pdf, other]

    cs.CV cs.LG

    ISSTAD: Incremental Self-Supervised Learning Based on Transformer for Anomaly Detection and Localization

    Authors: Wenping Jin, Fei Guo, Li Zhu

    Abstract: In the realm of machine learning, the study of anomaly detection and localization within image data has gained substantial traction, particularly for practical applications such as industrial defect detection. While the majority of existing methods predominantly use Convolutional Neural Networks (CNN) as their primary network architecture, we introduce a novel approach based on the Transformer bac…

    Submitted 28 April, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

  35. The Application of Driver Models in the Safety Assessment of Autonomous Vehicles: A Survey

    Authors: Cheng Wang, Fengwei Guo, Ruilin Yu, Luyao Wang, Yuxin Zhang

    Abstract: Driver models play a vital role in developing and verifying autonomous vehicles (AVs). Previously, they are mainly applied in traffic flow simulation to model driver behavior. With the development of AVs, driver models attract much attention again due to their potential contributions to AV safety assessment. The simulation-based testing method is an effective measure to accelerate AV testing due t…

    Submitted 4 August, 2023; v1 submitted 26 March, 2023; originally announced March 2023.

  36. arXiv:2303.04940  [pdf, other]

    cs.CV

    Non-aligned supervision for Real Image Dehazing

    Authors: Junkai Fan, Fei Guo, Jianjun Qian, Xiang Li, Jun Li, Jian Yang

    Abstract: Removing haze from real-world images is challenging due to unpredictable weather conditions, resulting in the misalignment of hazy and clear image pairs. In this paper, we propose an innovative dehazing framework that operates under non-aligned supervision. This framework is grounded in the atmospheric scattering model, and consists of three interconnected networks: dehazing, airlight, and transmi…

    Submitted 5 January, 2024; v1 submitted 8 March, 2023; originally announced March 2023.

  37. arXiv:2209.15368  [pdf, other]

    cs.CV

    Inharmonious Region Localization by Magnifying Domain Discrepancy

    Authors: Jing Liang, Li Niu, Penghao Wu, Fengjun Guo, Teng Long

    Abstract: Inharmonious region localization aims to localize the region in a synthetic image which is incompatible with surrounding background. The inharmony issue is mainly attributed to the color and illumination inconsistency produced by image editing techniques. In this work, we tend to transform the input image to another color space to magnify the domain discrepancy between inharmonious region and back…

    Submitted 30 September, 2022; originally announced September 2022.

  38. arXiv:2209.08712  [pdf, ps, other]

    cs.IT

    Systematic Constructions of Bent-Negabent Functions, 2-Rotation Symmetric Bent-Negabent Functions and Their Duals

    Authors: Fei Guo, Zilong Wang, Guang Gong

    Abstract: Bent-negabent functions have many important properties for their application in cryptography since they have the flat absolute spectrum under the both Walsh-Hadamard transform and nega-Hadamard transform. In this paper, we present four new systematic constructions of bent-negabent functions on $4k, 8k, 4k+2$ and $8k+2$ variables, respectively, by modifying the truth tables of two classes of quadra…

    Submitted 18 September, 2022; originally announced September 2022.

  39. arXiv:2208.09197  [pdf, other]

    cs.CV

    EAA-Net: Rethinking the Autoencoder Architecture with Intra-class Features for Medical Image Segmentation

    Authors: Shiqiang Ma, Xuejian Li, Jijun Tang, Fei Guo

    Abstract: Automatic image segmentation technology is critical to the visual analysis. The autoencoder architecture has satisfying performance in various image segmentation tasks. However, autoencoders based on convolutional neural networks (CNN) seem to encounter a bottleneck in improving the accuracy of semantic segmentation. Increasing the inter-class distance between foreground and background is an inher…

    Submitted 19 August, 2022; originally announced August 2022.

  40. arXiv:2208.08678  [pdf, other]

    cs.CL cs.AI

    Mere Contrastive Learning for Cross-Domain Sentiment Analysis

    Authors: Yun Luo, Fang Guo, Zihan Liu, Yue Zhang

    Abstract: Cross-domain sentiment analysis aims to predict the sentiment of texts in the target domain using the model trained on the source domain to cope with the scarcity of labeled data. Previous studies are mostly cross-entropy-based methods for the task, which suffer from instability and poor generalization. In this paper, we explore contrastive learning on the cross-domain sentiment analysis task. We…

    Submitted 18 August, 2022; originally announced August 2022.

  41. arXiv:2207.11515  [pdf, other]

    cs.CV

    Marior: Margin Removal and Iterative Content Rectification for Document Dewarping in the Wild

    Authors: Jiaxin Zhang, Canjie Luo, Lianwen Jin, Fengjun Guo, Kai Ding

    Abstract: Camera-captured document images usually suffer from perspective and geometric deformations. It is of great value to rectify them when considering poor visual aesthetics and the deteriorated performance of OCR systems. Recent learning-based methods intensively focus on the accurately cropped document image. However, this might not be sufficient for overcoming practical challenges, including documen…

    Submitted 23 July, 2022; originally announced July 2022.

    Comments: This paper has been accepted by ACM Multimedia 2022

  42. arXiv:2207.10273  [pdf, other]

    cs.CV

    Don't Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Context

    Authors: Chongyu Liu, Lianwen Jin, Yuliang Liu, Canjie Luo, Bangdong Chen, Fengjun Guo, Kai Ding

    Abstract: Text removal has attracted increasing attention due to its various applications on privacy protection, document restoration, and text editing. It has shown significant progress with deep neural network. However, most of the existing methods often generate inconsistent results for complex background. To address this issue, we propose a Contextual-guided Text Removal Network, termed as CTRNet. CTR…

    Submitted 20 July, 2022; originally announced July 2022.

  43. Unsupervised Key Event Detection from Massive Text Corpora

    Authors: Yunyi Zhang, Fang Guo, Jiaming Shen, Jiawei Han

    Abstract: Automated event detection from news corpora is a crucial task towards mining fast-evolving structured knowledge. As real-world events have different granularities, from the top-level themes to key events and then to event mentions corresponding to concrete actions, there are generally two lines of research: (1) theme detection identifies from a news corpus major themes (e.g., "2019 Hong Kong Prote…

    Submitted 3 July, 2022; v1 submitted 8 June, 2022; originally announced June 2022.

    Comments: Accepted to KDD 2022 Research Track

  44. arXiv:2205.15290  [pdf, other]

    cs.CV cs.AI cs.LG

    Zero-Shot and Few-Shot Learning for Lung Cancer Multi-Label Classification using Vision Transformer

    Authors: Fu-Ming Guo, Yingfang Fan

    Abstract: Lung cancer is the leading cause of cancer-related death worldwide. Lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) are the most common histologic subtypes of non-small-cell lung cancer (NSCLC). Histology is an essential tool for lung cancer diagnosis. Pathologists make classifications according to the dominant subtypes. Although morphology remains the standard for diagnosis, si…

    Submitted 31 May, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

  45. arXiv:2204.12185  [pdf, other]

    cs.CV

    TranSiam: Fusing Multimodal Visual Features Using Transformer for Medical Image Segmentation

    Authors: Xuejian Li, Shiqiang Ma, Jijun Tang, Fei Guo

    Abstract: Automatic segmentation of medical images based on multi-modality is an important topic for disease diagnosis. Although the convolutional neural network (CNN) has been proven to have excellent performance in image segmentation tasks, it is difficult to obtain global information. The lack of global information will seriously affect the accuracy of the segmentation results of the lesion area. In addi…

    Submitted 26 April, 2022; originally announced April 2022.

  46. arXiv:2203.13420  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    Automatic Song Translation for Tonal Languages

    Authors: Fenfei Guo, Chen Zhang, Zhirui Zhang, Qixin He, Kejun Zhang, Jun Xie, Jordan Boyd-Graber

    Abstract: This paper develops automatic song translation (AST) for tonal languages and addresses the unique challenge of aligning words' tones with melody of a song in addition to conveying the original meaning. We propose three criteria for effective AST -- preserving meaning, singability and intelligibility -- and design metrics for these criteria. We develop a new benchmark for English--Mandarin song tra…

    Submitted 24 March, 2022; originally announced March 2022.

    Comments: Accepted at Findings of ACL 2022, 15 pages, 4 Tables and 10 Figures

  47. arXiv:2203.12835  [pdf, other]

    cs.CV

    Industrial Style Transfer with Large-scale Geometric Warping and Content Preservation

    Authors: Jinchao Yang, Fei Guo, Shuo Chen, Jun Li, Jian Yang

    Abstract: We propose a novel style transfer method to quickly create a new visual product with a nice appearance for industrial designers' reference. Given a source product, a target product, and an art style image, our method produces a neural warping field that warps the source shape to imitate the geometric style of the target and a neural texture transformation network that transfers the artistic style…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: Accepted at CVPR 2022

  48. arXiv:2202.01934  [pdf, other]

    cs.LG

    Smartphone-based Hard-braking Event Detection at Scale for Road Safety Services

    Authors: Luyang Liu, David Racz, Kara Vaillancourt, Julie Michelman, Matt Barnes, Stefan Mellem, Paul Eastham, Bradley Green, Charles Armstrong, Rishi Bal, Shawn O'Banion, Feng Guo

    Abstract: Road crashes are the sixth leading cause of lost disability-adjusted life-years (DALYs) worldwide. One major challenge in traffic safety research is the sparsity of crashes, which makes it difficult to achieve a fine-grain understanding of crash causations and predict future crash risk in a timely manner. Hard-braking events have been widely used as a safety surrogate due to their relatively high…

    Submitted 3 February, 2022; originally announced February 2022.

  49. arXiv:2112.09220  [pdf, other]

    cs.CV cs.AI cs.LG

    Sim2Real Docs: Domain Randomization for Documents in Natural Scenes using Ray-traced Rendering

    Authors: Nikhil Maddikunta, Huijun Zhao, Sumit Keswani, Alfy Samuel, Fu-Ming Guo, Nishan Srishankar, Vishwa Pardeshi, Austin Huang

    Abstract: In the past, computer vision systems for digitized documents could rely on systematically captured, high-quality scans. Today, transactions involving digital documents are more likely to start as mobile phone photo uploads taken by non-professionals. As such, computer vision for document automation must now account for documents captured in natural scene contexts. An additional challenge is that t…

    Submitted 16 December, 2021; originally announced December 2021.

    Comments: Accepted to NeurIPS 2021 Data Centric AI (DCAI) Workshop

  50. Probabilistic Spatial Distribution Prior Based Attentional Keypoints Matching Network

    Authors: Xiaoming Zhao, Jingmeng Liu, Xingming Wu, Weihai Chen, Fanghong Guo, Zhengguo Li

    Abstract: Keypoints matching is a pivotal component for many image-relevant applications such as image stitching, visual simultaneous localization and mapping (SLAM), and so on. Both handcrafted-based and recently emerged deep learning-based keypoints matching methods merely rely on keypoints and local features, while losing sight of other available sensors such as inertial measurement unit (IMU) in the abo…

    Submitted 23 November, 2021; v1 submitted 17 November, 2021; originally announced November 2021.