Showing 1–39 of 39 results for author: Zhuo, T

  1. arXiv:2407.08956  [pdf, other]

    cs.CR cs.SE

    DeCE: Deceptive Cross-Entropy Loss Designed for Defending Backdoor Attacks

    Authors: Guang Yang, Yu Zhou, Xiang Chen, Xiangyu Zhang, Terry Yue Zhuo, David Lo, Taolue Chen

    Abstract: Code Language Models (CLMs), particularly those leveraging deep learning, have achieved significant success in the code intelligence domain. However, the issue of security, particularly backdoor attacks, is often overlooked in this process. Previous research has focused on designing backdoor attacks for CLMs, but effective defenses have not been adequately addressed. In particular, existing defens…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Under Review; Waiting for updates

  2. arXiv:2406.15877  [pdf, other]

    cs.SE cs.AI cs.CL

    BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

    Authors: Terry Yue Zhuo, Minh Chien Vu, Jenny Chim, Han Hu, Wenhao Yu, Ratnadira Widyasari, Imam Nur Bani Yusuf, Haolan Zhan, Junda He, Indraneil Paul, Simon Brunner, Chen Gong, Thong Hoang, Armel Randy Zebaze, Xiaoheng Hong, Wen-Ding Li, Jean Kaddour, Ming Xu, Zhihan Zhang, Prateek Yadav, Naman Jain, Alex Gu, Zhoujun Cheng, Jiawei Liu, Qian Liu, et al. (8 additional authors not shown)

    Abstract: Automated software engineering has been greatly empowered by the recent advances in Large Language Models (LLMs) for programming. While current benchmarks have shown that LLMs can perform various software engineering tasks like human developers, the majority of their evaluations are limited to short and self-contained algorithmic tasks. Solving challenging and practical programming tasks requires…

    Submitted 26 June, 2024; v1 submitted 22 June, 2024; originally announced June 2024.

    Comments: 44 pages, 14 figures, 7 tables, built with love by the BigCode community :)

  3. arXiv:2405.18955  [pdf, other]

    cs.CV

    RGB-T Object Detection via Group Shuffled Multi-receptive Attention and Multi-modal Supervision

    Authors: Jinzhong Wang, Xuetao Tian, Shun Dai, Tao Zhuo, Haorui Zeng, Hongjuan Liu, Jiaqi Liu, Xiuwei Zhang, Yanning Zhang

    Abstract: Multispectral object detection, utilizing both visible (RGB) and thermal infrared (T) modals, has garnered significant attention for its robust performance across diverse weather and lighting conditions. However, effectively exploiting the complementarity between RGB-T modals while maintaining efficiency remains a critical challenge. In this paper, a very simple Group Shuffled Multi-receptive Atte…

    Submitted 29 May, 2024; originally announced May 2024.

  4. arXiv:2404.15247  [pdf, other]

    cs.CL cs.AI cs.LG cs.SE

    XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts

    Authors: Yifeng Ding, Jiawei Liu, Yuxiang Wei, Terry Yue Zhuo, Lingming Zhang

    Abstract: We introduce XFT, a simple yet powerful training scheme, by simply merging upcycled Mixture-of-Experts (MoE) to unleash the performance limit of instruction-tuned code Large Language Models (LLMs). While vanilla sparse upcycling fails to improve instruction tuning, XFT introduces a shared expert mechanism with a novel routing weight normalization strategy into sparse upcycling, which significantly…

    Submitted 6 June, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  5. arXiv:2404.00399  [pdf, other]

    cs.CL cs.AI cs.LG

    Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

    Authors: Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, et al. (20 additional authors not shown)

    Abstract: Pretrained language models underpin several AI applications, but their high computational cost for training limits accessibility. Initiatives such as BLOOM and StarCoder aim to democratize access to pretrained models for collaborative community development. However, such existing models face challenges: limited multilingual capabilities, continual pretraining causing catastrophic forgetting, where…

    Submitted 23 April, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: Preprint

  6. arXiv:2403.02784  [pdf, other]

    cs.CV

    DDF: A Novel Dual-Domain Image Fusion Strategy for Remote Sensing Image Semantic Segmentation with Unsupervised Domain Adaptation

    Authors: Lingyan Ran, Lushuang Wang, Tao Zhuo, Yinghui Xing

    Abstract: Semantic segmentation of remote sensing images is a challenging and hot issue due to the large amount of unlabeled data. Unsupervised domain adaptation (UDA) has proven to be advantageous in incorporating unclassified information from the target domain. However, independently fine-tuning UDA models on the source and target domains has a limited effect on the outcome. This paper proposes a hybrid t…

    Submitted 5 March, 2024; originally announced March 2024.

  7. arXiv:2402.19173  [pdf, other]

    cs.SE cs.AI

    StarCoder 2 and The Stack v2: The Next Generation

    Authors: Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, et al. (41 additional authors not shown)

    Abstract: The BigCode project, an open-scientific collaboration focused on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder2. In partnership with Software Heritage (SWH), we build The Stack v2 on top of the digital commons of their source code archive. Alongside the SWH repositories spanning 619 programming languages, we carefully select other high-quality data…

    Submitted 29 February, 2024; originally announced February 2024.

  8. arXiv:2401.15987  [pdf, other]

    cs.CV

    Hand-Centric Motion Refinement for 3D Hand-Object Interaction via Hierarchical Spatial-Temporal Modeling

    Authors: Yuze Hao, Jianrong Zhang, Tao Zhuo, Fuan Wen, Hehe Fan

    Abstract: Hands are the main medium when people interact with the world. Generating proper 3D motion for hand-object interaction is vital for applications such as virtual reality and robotics. Although grasp tracking or object manipulation synthesis can produce coarse hand motion, this kind of motion is inevitably noisy and full of jitter. To address this problem, we propose a data-driven method for coarse…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: Accepted to AAAI 2024

  9. arXiv:2401.00788  [pdf, other]

    cs.CL cs.AI cs.SE

    Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models

    Authors: Terry Yue Zhuo, Armel Zebaze, Nitchakarn Suppattarachai, Leandro von Werra, Harm de Vries, Qian Liu, Niklas Muennighoff

    Abstract: The high cost of full-parameter fine-tuning (FFT) of Large Language Models (LLMs) has led to a series of parameter-efficient fine-tuning (PEFT) methods. However, it remains unclear which methods provide the best cost-performance trade-off at different model scales. We introduce Astraios, a suite of 28 instruction-tuned OctoCoder models using 7 tuning methods and 4 model sizes up to 16 billion para…

    Submitted 1 January, 2024; originally announced January 2024.

    Comments: 25 pages (12 main), 19 figures, 8 tables

  10. arXiv:2312.05562  [pdf, other]

    cs.SE

    Chain-of-Thought in Neural Code Generation: From and For Lightweight Language Models

    Authors: Guang Yang, Yu Zhou, Xiang Chen, Xiangyu Zhang, Terry Yue Zhuo, Taolue Chen

    Abstract: Large Language Models (LLMs) have demonstrated remarkable potential in code generation. The integration of Chain of Thought (CoT) reasoning can further boost their performance. However, current CoT methods often require manual writing or LLMs with over 100 billion parameters to generate, impeding their applicability in resource-constrained scenarios. In this study, we investigate lightweight Langu…

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: UNDER REVIEW

  11. Can ChatGPT Perform Reasoning Using the IRAC Method in Analyzing Legal Scenarios Like a Lawyer?

    Authors: Xiaoxi Kang, Lizhen Qu, Lay-Ki Soon, Adnan Trakic, Terry Yue Zhuo, Patrick Charles Emerton, Genevieve Grant

    Abstract: Large Language Models (LLMs), such as ChatGPT, have drawn a lot of attention recently in the legal domain due to their emergent ability to tackle a variety of legal tasks. However, it is still unknown if LLMs are able to analyze a legal case and perform reasoning in the same manner as lawyers. Therefore, we constructed a novel corpus consisting of scenarios pertaining to Contract Acts Malaysia and Aus…

    Submitted 2 November, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Findings

    Report number: 2023.findings-emnlp.929

    Journal ref: 2023.findings-emnlp.929

  12. arXiv:2310.10417  [pdf, other]

    cs.CV cs.LG

    Prior-Free Continual Learning with Unlabeled Data in the Wild

    Authors: Tao Zhuo, Zhiyong Cheng, Hehe Fan, Mohan Kankanhalli

    Abstract: Continual Learning (CL) aims to incrementally update a trained model on new tasks without forgetting the acquired knowledge of old ones. Existing CL methods usually reduce forgetting with task priors, i.e., using task identity or a subset of previously seen samples for model training. However, these methods would be infeasible when such priors are unknown in real-world applications. To address this…

    Submitted 16 October, 2023; originally announced October 2023.

  13. arXiv:2309.08674  [pdf, other]

    cs.CL cs.AI

    Fake News Detectors are Biased against Texts Generated by Large Language Models

    Authors: Jinyan Su, Terry Yue Zhuo, Jonibek Mansurov, Di Wang, Preslav Nakov

    Abstract: The spread of fake news has emerged as a critical challenge, undermining trust and posing threats to society. In the era of Large Language Models (LLMs), the capability to generate believable fake content has intensified these concerns. In this study, we present a novel paradigm to evaluate fake news detectors in scenarios involving both human-written and LLM-generated misinformation. Intriguingly…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: The first two authors contributed equally

  14. arXiv:2309.07804  [pdf, other]

    cs.SE cs.CL

    Pop Quiz! Do Pre-trained Code Models Possess Knowledge of Correct API Names?

    Authors: Terry Yue Zhuo, Xiaoning Du, Zhenchang Xing, Jiamou Sun, Haowei Quan, Li Li, Liming Zhu

    Abstract: Recent breakthroughs in pre-trained code models, such as CodeBERT and Codex, have shown their superior performance in various downstream tasks. The correctness and unambiguity of API usage among these code models are crucial for achieving desirable program functionalities, requiring them to learn various API fully qualified names structurally and semantically. Recent studies reveal that even state…

    Submitted 14 September, 2023; originally announced September 2023.

  15. arXiv:2308.07124  [pdf, other]

    cs.CL cs.AI

    OctoPack: Instruction Tuning Code Large Language Models

    Authors: Niklas Muennighoff, Qian Liu, Armel Zebaze, Qinkai Zheng, Binyuan Hui, Terry Yue Zhuo, Swayam Singh, Xiangru Tang, Leandro von Werra, Shayne Longpre

    Abstract: Finetuning large language models (LLMs) on instructions leads to vast performance improvements on natural language tasks. We apply instruction tuning using code, leveraging the natural structure of Git commits, which pair code changes with human instructions. We compile CommitPack: 4 terabytes of Git commits across 350 programming languages. We benchmark CommitPack against other natural and synthe…

    Submitted 18 February, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: 60 pages (9 main), 40 figures, 19 tables

  16. arXiv:2307.12328  [pdf, other]

    cs.SE

    A First Look at On-device Models in iOS Apps

    Authors: Han Hu, Yujin Huang, Qiuyuan Chen, Terry Yue Zhuo, Chunyang Chen

    Abstract: Powered by the rising popularity of deep learning techniques on smartphones, on-device deep learning models are being used in vital fields like finance, social media, and driving assistance. Because of the transparency of the Android platform and the on-device models inside, on-device models on Android smartphones have been proven to be extremely vulnerable. However, due to the challenge in ac…

    Submitted 27 July, 2023; v1 submitted 23 July, 2023; originally announced July 2023.

    Comments: 30 pages, 7 pages, journal paper

  17. arXiv:2306.05540  [pdf, other]

    cs.CL cs.AI

    DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text

    Authors: Jinyan Su, Terry Yue Zhuo, Di Wang, Preslav Nakov

    Abstract: With the rapid progress of large language models (LLMs) and the huge amount of text they generated, it becomes more and more impractical to manually distinguish whether a text is machine-generated. Given the growing use of LLMs in social media and education, it prompts us to develop methods to detect machine-generated text, preventing malicious usage such as plagiarism, misinformation, and propaga…

    Submitted 23 May, 2023; originally announced June 2023.

    Comments: machine-generated text, large language models, LLMs, zero-shot

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

  18. arXiv:2305.19915  [pdf, other]

    cs.CL cs.AI cs.SE

    Source Code Data Augmentation for Deep Learning: A Survey

    Authors: Terry Yue Zhuo, Zhou Yang, Zhensu Sun, Yufei Wang, Li Li, Xiaoning Du, Zhenchang Xing, David Lo

    Abstract: The increasingly popular adoption of deep learning models in many critical source code tasks motivates the development of data augmentation (DA) techniques to enhance training data and improve various capabilities (e.g., robustness and generalizability) of these models. Although a series of DA methods have been proposed and tailored for source code models, there lacks a comprehensive survey and ex…

    Submitted 13 November, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: ongoing work; 89 publications

  19. arXiv:2305.17497  [pdf, other]

    cs.CL

    FACTUAL: A Benchmark for Faithful and Consistent Textual Scene Graph Parsing

    Authors: Zhuang Li, Yuyang Chai, Terry Yue Zhuo, Lizhen Qu, Gholamreza Haffari, Fei Li, Donghong Ji, Quan Hung Tran

    Abstract: Textual scene graph parsing has become increasingly important in various vision-language applications, including image caption evaluation and image retrieval. However, existing scene graph parsers that convert image captions into scene graphs often suffer from two types of errors. First, the generated scene graphs fail to capture the true semantics of the captions or the corresponding images, resu…

    Submitted 1 June, 2023; v1 submitted 27 May, 2023; originally announced May 2023.

    Comments: 9 pages, ACL 2023 (findings)

  20. arXiv:2305.13622  [pdf, other]

    cs.CV

    Continual Learning with Strong Experience Replay

    Authors: Tao Zhuo, Zhiyong Cheng, Zan Gao, Hehe Fan, Mohan Kankanhalli

    Abstract: Continual Learning (CL) aims at incrementally learning new tasks without forgetting the knowledge acquired from old ones. Experience Replay (ER) is a simple and effective rehearsal-based strategy, which optimizes the model with current training data and a subset of old samples stored in a memory buffer. To further reduce forgetting, recent approaches extend ER with various techniques, such as mode…

    Submitted 3 December, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

  21. arXiv:2305.06161  [pdf, other]

    cs.CL cs.AI cs.PL cs.SE

    StarCoder: may the source be with you!

    Authors: Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, et al. (42 additional authors not shown)

    Abstract: The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities and fast large-batch inference enabled by multi-query attention. StarCoderBase is trained on 1 trillion tokens sourced from The Stack, a large colle…

    Submitted 13 December, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

  22. arXiv:2304.14317  [pdf, other]

    cs.AI cs.CL cs.SE

    ICE-Score: Instructing Large Language Models to Evaluate Code

    Authors: Terry Yue Zhuo

    Abstract: Recent advancements in the field of natural language generation have facilitated the use of large language models to assess the quality of generated text. Although these models have shown promising results in tasks such as machine translation and summarization, their applicability in code intelligence tasks remains limited without human involvement. The complexity of programming concepts required…

    Submitted 22 January, 2024; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: Accepted to Findings of EACL 2024

  23. arXiv:2302.04116  [pdf, other]

    cs.CR cs.AI cs.CL

    Training-free Lexical Backdoor Attacks on Language Models

    Authors: Yujin Huang, Terry Yue Zhuo, Qiongkai Xu, Han Hu, Xingliang Yuan, Chunyang Chen

    Abstract: Large-scale language models have achieved tremendous success across various natural language processing (NLP) applications. Nevertheless, language models are vulnerable to backdoor attacks, which inject stealthy triggers into models for steering them to undesirable behaviors. Most existing backdoor attacks, such as data poisoning, require further (re)training or fine-tuning language models to lear…

    Submitted 8 February, 2023; originally announced February 2023.

    Comments: Accepted to International World Wide Web Conference 2023, Security, Privacy & Trust Track

  24. arXiv:2301.12868  [pdf, other]

    cs.CL

    On Robustness of Prompt-based Semantic Parsing with Large Pre-trained Language Model: An Empirical Study on Codex

    Authors: Terry Yue Zhuo, Zhuang Li, Yujin Huang, Fatemeh Shiri, Weiqing Wang, Gholamreza Haffari, Yuan-Fang Li

    Abstract: Semantic parsing is a technique aimed at constructing a structured representation of the meaning of a natural-language question. Recent advancements in few-shot language models trained on code have demonstrated superior performance in generating these representations compared to traditional unimodal language models, which are trained on downstream tasks. Despite these advancements, existing fine-t…

    Submitted 9 March, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

    Comments: Accepted at EACL2023 (main)

  25. arXiv:2301.12867  [pdf, other]

    cs.CL cs.SE

    Red teaming ChatGPT via Jailbreaking: Bias, Robustness, Reliability and Toxicity

    Authors: Terry Yue Zhuo, Yujin Huang, Chunyang Chen, Zhenchang Xing

    Abstract: Recent breakthroughs in natural language processing (NLP) have permitted the synthesis and comprehension of coherent text in an open-ended way, therefore translating the theoretical algorithms into practical applications. The large language models (LLMs) have significantly impacted businesses such as report summarization software and copywriters. Observations indicate, however, that LLMs may exhib…

    Submitted 29 May, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

    Comments: Technical Report

  26. arXiv:2301.03988  [pdf, other]

    cs.SE cs.AI cs.LG

    SantaCoder: don't reach for the stars!

    Authors: Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, Mayank Mishra, Alex Gu, Manan Dey, Logesh Kumar Umapathi, Carolyn Jane Anderson, Yangtian Zi, Joel Lamy Poirier, Hailey Schoelkopf, Sergey Troshin, Dmitry Abulkhanov, Manuel Romero, Michael Lappert, Francesco De Toni, Bernardo García del Río, Qian Liu, Shamik Bose, Urvashi Bhattacharyya, Terry Yue Zhuo, et al. (16 additional authors not shown)

    Abstract: The BigCode project is an open-scientific collaboration working on the responsible development of large language models for code. This tech report describes the progress of the collaboration until December 2022, outlining the current state of the Personally Identifiable Information (PII) redaction pipeline, the experiments conducted to de-risk the model architecture, and the experiments investigat…

    Submitted 24 February, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

  27. arXiv:2210.05556  [pdf, other]

    cs.CV cs.CL

    ViLPAct: A Benchmark for Compositional Generalization on Multimodal Human Activities

    Authors: Terry Yue Zhuo, Yaqing Liao, Yuecheng Lei, Lizhen Qu, Gerard de Melo, Xiaojun Chang, Yazhou Ren, Zenglin Xu

    Abstract: We introduce ViLPAct, a novel vision-language benchmark for human activity planning. It is designed for a task where embodied AI agents can reason and forecast future actions of humans based on video clips about their initial activities and intents in text. The dataset consists of 2.9k videos from Charades extended with intents via crowdsourcing, a multi-choice question test set, and four strong…

    Submitted 9 March, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: Accepted at EACL2023 (Findings)

  28. arXiv:2209.07351  [pdf, other]

    cs.CL

    Rethinking Round-Trip Translation for Machine Translation Evaluation

    Authors: Terry Yue Zhuo, Qiongkai Xu, Xuanli He, Trevor Cohn

    Abstract: Automatic evaluation on low-resource language translation suffers from a deficiency of parallel corpora. Round-trip translation could serve as a clever and straightforward technique to alleviate the requirement of the parallel evaluation corpus. However, there was an observation of obscure correlations between the evaluation scores by forward and round-trip translations in the era of statistic…

    Submitted 15 May, 2023; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: Accepted to Findings of ACL 2023

  29. arXiv:2208.07493  [pdf, other]

    cs.CV

    Temporal Action Localization with Multi-temporal Scales

    Authors: Zan Gao, Xinglei Cui, Tao Zhuo, Zhiyong Cheng, An-An Liu, Meng Wang, Shenyong Chen

    Abstract: Temporal action localization plays an important role in video analysis, which aims to localize and classify actions in untrimmed videos. The previous methods often predict actions on a feature space of a single-temporal scale. However, the temporal features of a low-level scale lack enough semantics for action classification while a high-level scale cannot provide rich details of the action bounda…

    Submitted 15 August, 2022; originally announced August 2022.

  30. arXiv:2205.13720  [pdf, other]

    cs.CV cs.AI cs.LG

    Effective Abstract Reasoning with Dual-Contrast Network

    Authors: Tao Zhuo, Mohan Kankanhalli

    Abstract: As a step towards improving the abstract reasoning capability of machines, we aim to solve Raven's Progressive Matrices (RPM) with neural networks, since solving RPM puzzles is highly correlated with human intelligence. Unlike previous methods that use auxiliary annotations or assume hidden rules to produce appropriate feature representation, we only use the ground truth answer of each question fo…

    Submitted 26 May, 2022; originally announced May 2022.

    Comments: Published on ICLR 2021

  31. arXiv:2203.10854  [pdf, other]

    cs.CL

    Paraphrasing Techniques for Maritime QA system

    Authors: Fatemeh Shiri, Terry Yue Zhuo, Zhuang Li, Van Nguyen, Shirui Pan, Weiqing Wang, Reza Haffari, Yuan-Fang Li

    Abstract: There has been an increasing interest in incorporating Artificial Intelligence (AI) into Defence and military systems to complement and augment human intelligence and capabilities. However, much work still needs to be done toward achieving an effective human-machine partnership. This work is aimed at enhancing human-machine communications by developing a capability for automatically translating hu…

    Submitted 9 March, 2023; v1 submitted 21 March, 2022; originally announced March 2022.

    Comments: 8 pages. The first three authors contribute equally

  32. arXiv:2109.10011  [pdf, other]

    cs.CV cs.AI cs.LG

    Unsupervised Abstract Reasoning for Raven's Problem Matrices

    Authors: Tao Zhuo, Qiang Huang, Mohan Kankanhalli

    Abstract: Raven's Progressive Matrices (RPM) is highly correlated with human intelligence, and it has been widely used to measure the abstract reasoning ability of humans. In this paper, to study the abstract reasoning capability of deep neural networks, we propose the first unsupervised learning method for solving RPM problems. Since the ground truth labels are not allowed, we design a pseudo target based…

    Submitted 21 September, 2021; originally announced September 2021.

    Comments: Accepted by TIP

  33. PyArmadillo: a streamlined linear algebra library for Python

    Authors: Jason Rumengan, Terry Yue Zhuo, Conrad Sanderson

    Abstract: PyArmadillo is a linear algebra library for the Python language, with the aim of closely mirroring the programming interface of the widely used Armadillo C++ library, which in turn is deliberately similar to Matlab. PyArmadillo hence facilitates algorithm prototyping with Matlab-like syntax directly in Python, and relatively straightforward conversion of PyArmadillo-based Python code into performa…

    Submitted 20 October, 2021; v1 submitted 22 April, 2021; originally announced April 2021.

    MSC Class: 15-04; 62-04; 65-04; 68-04 ACM Class: G.4; D.3; D.2.3

    Journal ref: Journal of Open Source Software, Vol. 6, No. 66, 2021

  34. arXiv:2002.01646  [pdf, other]

    cs.CV

    Solving Raven's Progressive Matrices with Neural Networks

    Authors: Tao Zhuo, Mohan Kankanhalli

    Abstract: Raven's Progressive Matrices (RPM) have been widely used for Intelligence Quotient (IQ) test of humans. In this paper, we aim to solve RPM with neural networks in both supervised and unsupervised manners. First, we investigate strategies to reduce over-fitting in supervised learning. We suggest the use of a neural network with deep layers and pre-training on large-scale datasets to improve model g…

    Submitted 6 February, 2020; v1 submitted 5 February, 2020; originally announced February 2020.

  35. arXiv:1908.10717  [pdf, other]

    cs.CV

    Fast Video Object Segmentation via Mask Transfer Network

    Authors: Tao Zhuo, Zhiyong Cheng, Mohan Kankanhalli

    Abstract: Accuracy and processing speed are two important factors that affect the use of video object segmentation (VOS) in real applications. With the advanced techniques of deep neural networks, the accuracy has been significantly improved, however, the speed is still far below the real-time needs because of the complicated network design, such as the requirement of the first frame fine-tuning step. To ov…

    Submitted 28 August, 2019; originally announced August 2019.

  36. arXiv:1908.10700  [pdf, other]

    cs.CV

    Explainable Video Action Reasoning via Prior Knowledge and State Transitions

    Authors: Tao Zhuo, Zhiyong Cheng, Peng Zhang, Yongkang Wong, Mohan Kankanhalli

    Abstract: Human action analysis and understanding in videos is an important and challenging task. Although substantial progress has been made in past years, the explainability of existing methods is still limited. In this work, we propose a novel action reasoning framework that uses prior knowledge to explain semantic-level observations of video state changes. Our method takes advantage of both classical re…

    Submitted 28 August, 2019; originally announced August 2019.

  37. arXiv:1810.03783  [pdf, other]

    cs.CV cs.LG

    Unsupervised Online Video Object Segmentation with Motion Property Understanding

    Authors: Tao Zhuo, Zhiyong Cheng, Peng Zhang, Yongkang Wong, Mohan Kankanhalli

    Abstract: Unsupervised video object segmentation aims to automatically segment moving objects over an unconstrained video without any user annotation. So far, only few unsupervised online methods have been reported in literature and their performance is still far from satisfactory, because the complementary information from future frames cannot be processed under online setting. To solve this challenging pr…

    Submitted 5 August, 2019; v1 submitted 8 October, 2018; originally announced October 2018.

  38. arXiv:1712.09093  [pdf]

    cs.CV

    Brain Tumor Segmentation Based on Refined Fully Convolutional Neural Networks with A Hierarchical Dice Loss

    Authors: Jiachi Zhang, Xiaolei Shen, Tianqi Zhuo, Hong Zhou

    Abstract: As a basic task in computer vision, semantic segmentation can provide fundamental information for object detection and instance segmentation to help the artificial intelligence better understand real world. Since the proposal of fully convolutional neural network (FCNN), it has been widely used in semantic segmentation because of its high accuracy of pixel-wise classification as well as high preci…

    Submitted 12 February, 2018; v1 submitted 25 December, 2017; originally announced December 2017.

    Comments: 14 pages, 7 figures, 6 tables

  39. arXiv:1409.7307  [pdf, other]

    cs.CV

    Image Classification with A Deep Network Model based on Compressive Sensing

    Authors: Yufei Gan, Tong Zhuo, Chu He

    Abstract: To simplify the parameter of the deep learning network, a cascaded compressive sensing model "CSNet" is implemented for image classification. Firstly, we use cascaded compressive sensing network to learn feature from the data. Secondly, CSNet generates the feature by binary hashing and block-wise histograms. Finally, a linear SVM classifier is used to classify these features. The experiments on th…

    Submitted 25 September, 2014; originally announced September 2014.