Showing 1–28 of 28 results for author: Hua, T

  1. arXiv:2404.12444  [pdf, other]

    cs.CL cs.AI

    mOthello: When Do Cross-Lingual Representation Alignment and Cross-Lingual Transfer Emerge in Multilingual Models?

    Authors: Tianze Hua, Tian Yun, Ellie Pavlick

    Abstract: Many pretrained multilingual models exhibit cross-lingual transfer ability, which is often attributed to a learned language-neutral representation during pretraining. However, it remains unclear what factors contribute to the learning of a language-neutral representation, and whether the learned language-neutral representation suffices to facilitate cross-lingual transfer. We propose a synthetic t…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: Accepted at Findings of NAACL 2024. Project Webpage: https://multilingual-othello.github.io/

  2. arXiv:2404.07972  [pdf, other]

    cs.AI cs.CL

    OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

    Authors: Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu

    Abstract: Autonomous agents that accomplish complex computer tasks with minimal human interventions have the potential to transform human-computer interaction, significantly enhancing accessibility and productivity. However, existing benchmarks either lack an interactive environment or are limited to environments specific to certain applications or domains, failing to reflect the diverse and complex nature…

    Submitted 30 May, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

    Comments: 51 pages, 21 figures

  3. arXiv:2403.19473  [pdf, other]

    cs.CV

    Benchmarking Implicit Neural Representation and Geometric Rendering in Real-Time RGB-D SLAM

    Authors: Tongyan Hua, Lin Wang

    Abstract: Implicit neural representation (INR), in combination with geometric rendering, has recently been employed in real-time dense RGB-D SLAM. Despite active research endeavors being made, there lacks a unified protocol for fair evaluation, impeding the evolution of this area. In this work, we establish, to our knowledge, the first open-source benchmark framework to evaluate the performance of a wide sp…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  4. arXiv:2403.12504  [pdf, other]

    cs.RO

    TON-VIO: Online Time Offset Modeling Networks for Robust Temporal Alignment in High Dynamic Motion VIO

    Authors: Chaoran Xiong, Guoqing Liu, Qi Wu, Songpengcheng Xia, Tong Hua, Kehui Ma, Zhen Sun, Yan Xiang, Ling Pei

    Abstract: Temporal misalignment (time offset) between sensors is common in low cost visual-inertial odometry (VIO) systems. Such temporal misalignment introduces inconsistent constraints for state estimation, leading to a significant positioning drift especially in high dynamic motion scenarios. In this article, we focus on online temporal calibration to reduce the positioning drift caused by the time offse…

    Submitted 19 March, 2024; originally announced March 2024.

  5. arXiv:2402.00646  [pdf, ps, other]

    cs.IT eess.SP

    Cell-Free Massive MIMO SWIPT with Beyond Diagonal Reconfigurable Intelligent Surfaces

    Authors: Thien Duc Hua, Mohammadali Mohammadi, Hien Quoc Ngo, Michail Matthaiou

    Abstract: This paper investigates the integration of beyond-diagonal reconfigurable intelligent surfaces (BD-RISs) into cell-free massive multiple-input multiple-output (CF-mMIMO) systems, focusing on applications involving simultaneous wireless information and power transfer (SWIPT). The system supports concurrently two user groups: information users (IUs) and energy users (EUs). A BD-RIS is employed to en…

    Submitted 1 February, 2024; originally announced February 2024.

  6. arXiv:2401.03203  [pdf, other]

    cs.CV

    Hi-Map: Hierarchical Factorized Radiance Field for High-Fidelity Monocular Dense Mapping

    Authors: Tongyan Hua, Haotian Bai, Zidong Cao, Ming Liu, Dacheng Tao, Lin Wang

    Abstract: In this paper, we introduce Hi-Map, a novel monocular dense mapping approach based on Neural Radiance Field (NeRF). Hi-Map is exceptional in its capacity to achieve efficient and high-fidelity mapping using only posed RGB inputs. Our method eliminates the need for external depth priors derived from e.g., a depth estimation model. Our key idea is to represent the scene as a hierarchical feature gri…

    Submitted 6 January, 2024; originally announced January 2024.

  7. arXiv:2312.02191  [pdf, other]

    cs.CV cs.AI

    Prompt Tuning for Zero-shot Compositional Learning

    Authors: Lingyu Zhang, Ting Hua, Yilin Shen, Hongxia Jin

    Abstract: Open World Compositional Zero-Shot Learning (OW-CZSL) is known to be an extremely challenging task, which aims to recognize unseen compositions formed from seen attributes and objects without any prior assumption of the output space. In order to achieve this goal, a model has to be "smart" and "knowledgeable". To be smart, a model should be good at reasoning the interactions between attributes and…

    Submitted 2 December, 2023; originally announced December 2023.

  8. arXiv:2311.04477  [pdf, other]

    cs.RO

    PLV-IEKF: Consistent Visual-Inertial Odometry using Points, Lines, and Vanishing Points

    Authors: Tong Hua, Tao Li, Liang Pang, Guoqing Liu, Wencheng Xuanyuan, Chang Shu, Ling Pei

    Abstract: In this paper, we propose an Invariant Extended Kalman Filter (IEKF) based Visual-Inertial Odometry (VIO) using multiple features in man-made environments. Conventional EKF-based VIO usually suffers from system inconsistency and angular drift that naturally occurs in feature-based methods. However, in man-made environments, notable structural regularities, such as lines and vanishing points, offer…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: ROBIO 2023

  9. arXiv:2310.10634  [pdf, other]

    cs.CL cs.AI

    OpenAgents: An Open Platform for Language Agents in the Wild

    Authors: Tianbao Xie, Fan Zhou, Zhoujun Cheng, Peng Shi, Luoxuan Weng, Yitao Liu, Toh Jing Hua, Junning Zhao, Qian Liu, Che Liu, Leo Z. Liu, Yiheng Xu, Hongjin Su, Dongchan Shin, Caiming Xiong, Tao Yu

    Abstract: Language agents show potential in being capable of utilizing natural language for varied and intricate tasks in diverse environments, particularly when built upon large language models (LLMs). Current language agent frameworks aim to facilitate the construction of proof-of-concept language agents while neglecting the non-expert user access to agents and paying little attention to application-level…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: 34 pages, 8 figures

  10. arXiv:2306.06815  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    TrojLLM: A Black-box Trojan Prompt Attack on Large Language Models

    Authors: Jiaqi Xue, Mengxin Zheng, Ting Hua, Yilin Shen, Yepeng Liu, Ladislau Boloni, Qian Lou

    Abstract: Large Language Models (LLMs) are progressively being utilized as machine learning services and interface tools for various applications. However, the security implications of LLMs, particularly in relation to adversarial and Trojan attacks, remain insufficiently examined. In this paper, we propose TrojLLM, an automatic and black-box framework to effectively generate universal and stealthy triggers…

    Submitted 30 October, 2023; v1 submitted 11 June, 2023; originally announced June 2023.

    Comments: Accepted by NeurIPS'23

  11. arXiv:2306.00579  [pdf, other]

    cs.CV

    FMapping: Factorized Efficient Neural Field Mapping for Real-Time Dense RGB SLAM

    Authors: Tongyan Hua, Haotian Bai, Zidong Cao, Lin Wang

    Abstract: In this paper, we introduce FMapping, an efficient neural field mapping framework that facilitates the continuous estimation of a colorized point cloud map in real-time dense RGB SLAM. To achieve this challenging goal without depth, a hurdle is how to improve efficiency and reduce the mapping uncertainty of the RGB SLAM system. To this end, we first build up a theoretical analysis by decomposing t…

    Submitted 1 June, 2023; originally announced June 2023.

  12. arXiv:2304.06027  [pdf, other]

    cs.CV cs.AI cs.LG

    Continual Diffusion: Continual Customization of Text-to-Image Diffusion with C-LoRA

    Authors: James Seale Smith, Yen-Chang Hsu, Lingyu Zhang, Ting Hua, Zsolt Kira, Yilin Shen, Hongxia Jin

    Abstract: Recent works demonstrate a remarkable ability to customize text-to-image diffusion models while only providing a few example images. What happens if you try to customize such models using multiple, fine-grained concepts in a sequential (i.e., continual) manner? In our work, we show that recent state-of-the-art customization of text-to-image models suffer from catastrophic forgetting when new conce…

    Submitted 2 May, 2024; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: Transactions on Machine Learning Research (TMLR) 2024

  13. arXiv:2303.07668  [pdf, ps, other]

    cs.RO

    PIEKF-VIWO: Visual-Inertial-Wheel Odometry using Partial Invariant Extended Kalman Filter

    Authors: Tong Hua, Tao Li, Ling Pei

    Abstract: Invariant Extended Kalman Filter (IEKF) has been successfully applied in Visual-inertial Odometry (VIO) as an advanced achievement of Kalman filter, showing great potential in sensor fusion. In this paper, we propose partial IEKF (PIEKF), which only incorporates rotation-velocity state into the Lie group structure and apply it for Visual-Inertial-Wheel Odometry (VIWO) to improve positioning accura…

    Submitted 14 March, 2023; originally announced March 2023.

  14. arXiv:2302.08890  [pdf, other]

    cs.CV

    Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks

    Authors: Xu Zheng, Yexin Liu, Yunfan Lu, Tongyan Hua, Tianbo Pan, Weiming Zhang, Dacheng Tao, Lin Wang

    Abstract: Event cameras are bio-inspired sensors that capture the per-pixel intensity changes asynchronously and produce event streams encoding the time, pixel position, and polarity (sign) of the intensity changes. Event cameras possess a myriad of advantages over canonical frame-based cameras, such as high temporal resolution, high dynamic range, low latency, etc. Being capable of capturing information in…

    Submitted 11 April, 2024; v1 submitted 17 February, 2023; originally announced February 2023.

  15. arXiv:2211.09718  [pdf, other]

    cs.CL cs.LG

    Numerical Optimizations for Weighted Low-rank Estimation on Language Model

    Authors: Ting Hua, Yen-Chang Hsu, Felicity Wang, Qian Lou, Yilin Shen, Hongxia Jin

    Abstract: Singular value decomposition (SVD) is one of the most popular compression methods that approximate a target matrix with smaller matrices. However, standard SVD treats the parameters within the matrix with equal importance, which is a simple but unrealistic assumption. The parameters of a trained neural network model may affect task performance unevenly, which suggests non-equal importance among th…

    Submitted 15 December, 2022; v1 submitted 1 November, 2022; originally announced November 2022.

    Comments: long paper EMNLP 2022

    Journal ref: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

  16. arXiv:2207.00112  [pdf, other]

    cs.LG cs.AI cs.CL

    Language model compression with weighted low-rank factorization

    Authors: Yen-Chang Hsu, Ting Hua, Sungen Chang, Qian Lou, Yilin Shen, Hongxia Jin

    Abstract: Factorizing a large matrix into small matrices is a popular strategy for model compression. Singular value decomposition (SVD) plays a vital role in this compression strategy, approximating a learned matrix with fewer parameters. However, SVD minimizes the squared error toward reconstructing the original matrix without gauging the importance of the parameters, potentially giving a larger reconstru…

    Submitted 30 June, 2022; originally announced July 2022.

    Comments: ICLR 2022

  17. arXiv:2203.12054  [pdf, other]

    cs.CV cs.AI

    Self-supervision through Random Segments with Autoregressive Coding (RandSAC)

    Authors: Tianyu Hua, Yonglong Tian, Sucheng Ren, Michalis Raptis, Hang Zhao, Leonid Sigal

    Abstract: Inspired by the success of self-supervised autoregressive representation learning in natural language (GPT and its variants), and advances in recent visual architecture design with Vision Transformers (ViTs), in this paper, we explore the effect various design choices have on the success of applying such training strategies for visual feature learning. Specifically, we introduce a novel strategy t…

    Submitted 25 October, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

  18. Hyperparameter-free Continuous Learning for Domain Classification in Natural Language Understanding

    Authors: Ting Hua, Yilin Shen, Changsheng Zhao, Yen-Chang Hsu, Hongxia Jin

    Abstract: Domain classification is the fundamental task in natural language understanding (NLU), which often requires fast accommodation to new emerging domains. This constraint makes it impossible to retrain all previous domains, even if they are accessible to the new model. Most existing continual learning approaches suffer from low accuracy and performance fluctuation, especially when the distributions o…

    Submitted 4 January, 2022; originally announced January 2022.

    Journal ref: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2669–2678

  19. Automatic Mixed-Precision Quantization Search of BERT

    Authors: Changsheng Zhao, Ting Hua, Yilin Shen, Qian Lou, Hongxia Jin

    Abstract: Pre-trained language models such as BERT have shown remarkable effectiveness in various natural language processing tasks. However, these models usually contain millions of parameters, which prevents them from practical deployment on resource-constrained devices. Knowledge distillation, Weight pruning, and Quantization are known to be the main directions in model compression. However, compact mode…

    Submitted 30 December, 2021; originally announced December 2021.

    Journal ref: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI, 2021

  20. arXiv:2106.12378  [pdf, other]

    cs.CV cs.LG

    Co-advise: Cross Inductive Bias Distillation

    Authors: Sucheng Ren, Zhengqi Gao, Tianyu Hua, Zihui Xue, Yonglong Tian, Shengfeng He, Hang Zhao

    Abstract: Transformers recently are adapted from the community of natural language processing as a promising substitute of convolution-based neural networks for visual learning tasks. However, its supremacy degenerates given an insufficient amount of training data (e.g., ImageNet). To make it into practical utility, we propose a novel distillation-based method to train vision transformers. Unlike previous w…

    Submitted 23 June, 2021; originally announced June 2021.

  21. arXiv:2106.11059  [pdf, other]

    cs.LG

    Improving Multi-Modal Learning with Uni-Modal Teachers

    Authors: Chenzhuang Du, Tingle Li, Yichen Liu, Zixin Wen, Tianyu Hua, Yue Wang, Hang Zhao

    Abstract: Learning multi-modal representations is an essential step towards real-world robotic applications, and various multi-modal fusion models have been developed for this purpose. However, we observe that existing models, whose objectives are mostly based on joint training, often suffer from learning inferior representations of each modality. We name this problem Modality Failure, and hypothesize that…

    Submitted 21 June, 2021; originally announced June 2021.

  22. arXiv:2105.00470  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    On Feature Decorrelation in Self-Supervised Learning

    Authors: Tianyu Hua, Wenxiao Wang, Zihui Xue, Sucheng Ren, Yue Wang, Hang Zhao

    Abstract: In self-supervised representation learning, a common idea behind most of the state-of-the-art approaches is to enforce the robustness of the representations to predefined augmentations. A potential issue of this idea is the existence of completely collapsed solutions (i.e., constant features), which are typically avoided implicitly by carefully chosen implementation details. In this work, we study…

    Submitted 25 August, 2021; v1 submitted 2 May, 2021; originally announced May 2021.

    Comments: ICCV 2021 Oral. The first two authors contribute equally

  23. arXiv:2104.00356  [pdf, other]

    cs.CV

    Exploiting Relationship for Complex-scene Image Generation

    Authors: Tianyu Hua, Hongdong Zheng, Yalong Bai, Wei Zhang, Xiao-Ping Zhang, Tao Mei

    Abstract: The significant progress on Generative Adversarial Networks (GANs) has facilitated realistic single-object image generation based on language input. However, complex-scene generation (with various interactions among multiple objects) still suffers from messy layouts and object distortions, due to diverse configurations in layouts and appearances. Prior methods are mostly object-driven and ignore t…

    Submitted 1 April, 2021; originally announced April 2021.

  24. arXiv:2005.04073  [pdf, other]

    cs.LG q-bio.GN stat.ML

    Multi-Instance Multi-Label Learning for Gene Mutation Prediction in Hepatocellular Carcinoma

    Authors: Kaixin Xu, Ziyuan Zhao, Jiapan Gu, Zeng Zeng, Chan Wan Ying, Lim Kheng Choon, Thng Choon Hua, Pierce KH Chow

    Abstract: Gene mutation prediction in hepatocellular carcinoma (HCC) is of great diagnostic and prognostic value for personalized treatments and precision medicine. In this paper, we tackle this problem with multi-instance multi-label learning to address the difficulties on label correlations, label representations, etc. Furthermore, an effective oversampling strategy is applied for data imbalance. Experime…

    Submitted 8 May, 2020; originally announced May 2020.

    Comments: Accepted version to be published in the 42nd IEEE Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2020, Montreal, Canada

    Journal ref: 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC)

  25. arXiv:2005.04069  [pdf, other]

    q-bio.QM cs.CV eess.IV q-bio.GN

    Multi-Phase Cross-modal Learning for Noninvasive Gene Mutation Prediction in Hepatocellular Carcinoma

    Authors: Jiapan Gu, Ziyuan Zhao, Zeng Zeng, Yuzhe Wang, Zhengyiren Qiu, Bharadwaj Veeravalli, Brian Kim Poh Goh, Glenn Kunnath Bonney, Krishnakumar Madhavan, Chan Wan Ying, Lim Kheng Choon, Thng Choon Hua, Pierce KH Chow

    Abstract: Hepatocellular carcinoma (HCC) is the most common type of primary liver cancer and the fourth most common cause of cancer-related death worldwide. Understanding the underlying gene mutations in HCC provides great prognostic value for treatment planning and targeted therapy. Radiogenomics has revealed an association between non-invasive imaging features and molecular genomics. However, imaging feat…

    Submitted 8 May, 2020; originally announced May 2020.

    Comments: Accepted version to be published in the 42nd IEEE Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2020, Montreal, Canada

    Journal ref: 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC)

  26. arXiv:1911.07736  [pdf, other]

    cs.CV cs.LG eess.IV

    Modeling Gestalt Visual Reasoning on the Raven's Progressive Matrices Intelligence Test Using Generative Image Inpainting Techniques

    Authors: Tianyu Hua, Maithilee Kunda

    Abstract: Psychologists recognize Raven's Progressive Matrices as a very effective test of general human intelligence. While many computational models have been developed by the AI community to investigate different forms of top-down, deliberative reasoning on the test, there has been less research on bottom-up perceptual processes, like Gestalt image completion, that are also critical in human test perform…

    Submitted 26 November, 2019; v1 submitted 18 November, 2019; originally announced November 2019.

  27. arXiv:1708.07928  [pdf, ps, other]

    math.CO cs.DM

    Mahonian STAT on rearrangement class of words

    Authors: Shishuo Fu, Ting Hua, Vincent Vajnovszki

    Abstract: In 2000, Babson and Steingrímsson generalized the notion of permutation patterns to the so-called vincular patterns, and they showed that many Mahonian statistics can be expressed as sums of vincular pattern occurrence statistics. STAT is one such Mahonian statistic discovered by them. In 2016, Kitaev and the third author introduced a words analogue of STAT and proved a joint equidistribution r…

    Submitted 26 August, 2017; originally announced August 2017.

    Comments: 11 pages

    MSC Class: 05A05; 05A19

  28. arXiv:1402.7035  [pdf, ps, other]

    cs.SI cs.CY physics.soc-ph

    'Beating the news' with EMBERS: Forecasting Civil Unrest using Open Source Indicators

    Authors: Naren Ramakrishnan, Patrick Butler, Sathappan Muthiah, Nathan Self, Rupinder Khandpur, Parang Saraf, Wei Wang, Jose Cadena, Anil Vullikanti, Gizem Korkmaz, Chris Kuhlman, Achla Marathe, Liang Zhao, Ting Hua, Feng Chen, Chang-Tien Lu, Bert Huang, Aravind Srinivasan, Khoa Trinh, Lise Getoor, Graham Katz, Andy Doyle, Chris Ackermann, Ilya Zavorin, Jim Ford, et al. (5 additional authors not shown)

    Abstract: We describe the design, implementation, and evaluation of EMBERS, an automated, 24x7 continuous system for forecasting civil unrest across 10 countries of Latin America using open source indicators such as tweets, news sources, blogs, economic indicators, and other data sources. Unlike retrospective studies, EMBERS has been making forecasts into the future since Nov 2012 which have been (and conti…

    Submitted 27 February, 2014; v1 submitted 27 February, 2014; originally announced February 2014.

    ACM Class: K.4.1; J.4; I.2.7