Showing 1–24 of 24 results for author: Fung, Y R

  1. arXiv:2406.14137  [pdf, other]

    cs.CL

    MACAROON: Training Vision-Language Models To Be Your Engaged Partners

    Authors: Shujin Wu, Yi R. Fung, Sha Li, Yixin Wan, Kai-Wei Chang, Heng Ji

    Abstract: Large vision-language models (LVLMs), while proficient in following instructions and responding to diverse questions, invariably generate detailed responses even when questions are ambiguous or unanswerable, leading to hallucinations and bias issues. Thus, it is essential for LVLMs to proactively engage with humans to ask for clarifications or additional information for better responses. In this s…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: The code will be made public at https://github.com/ShujinWu-0814/MACAROON

  2. arXiv:2403.12027  [pdf, other]

    cs.CL cs.AI cs.CV

    From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models

    Authors: Kung-Hsiang Huang, Hou Pong Chan, Yi R. Fung, Haoyi Qiu, Mingyang Zhou, Shafiq Joty, Shih-Fu Chang, Heng Ji

    Abstract: Data visualization in the form of charts plays a pivotal role in data analysis, offering critical insights and aiding in informed decision-making. Automatic chart understanding has witnessed significant advancements with the rise of large foundation models in recent years. Foundation models, such as large language models, have revolutionized various natural language processing tasks and are increa…

    Submitted 25 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

  3. arXiv:2402.11943  [pdf, other]

    cs.CL

    LEMMA: Towards LVLM-Enhanced Multimodal Misinformation Detection with External Knowledge Augmentation

    Authors: Keyang Xuan, Li Yi, Fan Yang, Ruochen Wu, Yi R. Fung, Heng Ji

    Abstract: The rise of multimodal misinformation on social platforms poses significant challenges for individuals and societies. Its increased credibility and broader impact compared to textual misinformation make detection complex, requiring robust reasoning across diverse media types and profound knowledge for accurate verification. The emergence of Large Vision Language Model (LVLM) offers a potential sol…

    Submitted 20 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  4. arXiv:2402.11060  [pdf, other]

    cs.CL cs.AI cs.IR

    Persona-DB: Efficient Large Language Model Personalization for Response Prediction with Collaborative Data Refinement

    Authors: Chenkai Sun, Ke Yang, Revanth Gangi Reddy, Yi R. Fung, Hou Pong Chan, ChengXiang Zhai, Heng Ji

    Abstract: The increasing demand for personalized interactions with large language models (LLMs) calls for the development of methodologies capable of accurately and efficiently identifying user opinions and preferences. Retrieval augmentation emerges as an effective strategy, as it can accommodate a vast number of users without the costs from fine-tuning. Existing research, however, has largely focused on e…

    Submitted 16 February, 2024; originally announced February 2024.

  5. arXiv:2401.00812  [pdf, other]

    cs.CL

    If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents

    Authors: Ke Yang, Jiateng Liu, John Wu, Chaoqi Yang, Yi R. Fung, Sha Li, Zixuan Huang, Xu Cao, Xingyao Wang, Yiquan Wang, Heng Ji, Chengxiang Zhai

    Abstract: The prominent large language models (LLMs) of today differ from past language models not only in size, but also in the fact that they are trained on a combination of natural language and formal language (code). As a medium between humans and computers, code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity. In this survey…

    Submitted 8 January, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

  6. arXiv:2312.10160  [pdf, other]

    cs.CL

    Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning

    Authors: Kung-Hsiang Huang, Mingyang Zhou, Hou Pong Chan, Yi R. Fung, Zhenhailong Wang, Lingyu Zhang, Shih-Fu Chang, Heng Ji

    Abstract: Recent advancements in large vision-language models (LVLMs) have led to significant progress in generating natural language descriptions for visual content and thus enhancing various applications. One issue with these powerful models is that they sometimes produce texts that are factually inconsistent with the visual input. While there has been some effort to mitigate such inconsistencies in natur…

    Submitted 30 May, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: ACL 2024 Findings

  7. arXiv:2311.09677  [pdf, other]

    cs.CL

    R-Tuning: Instructing Large Language Models to Say `I Don't Know'

    Authors: Hanning Zhang, Shizhe Diao, Yong Lin, Yi R. Fung, Qing Lian, Xingyao Wang, Yangyi Chen, Heng Ji, Tong Zhang

    Abstract: Large language models (LLMs) have revolutionized numerous domains with their impressive performance but still face their challenges. A predominant issue is the propensity for these models to generate non-existent facts, a concern termed hallucination. Our research is motivated by the observation that previous instruction tuning methods force the model to complete a sentence no matter whether the m…

    Submitted 6 June, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: NAACL 2024

  8. arXiv:2310.20633  [pdf, other]

    cs.CL

    Defining a New NLP Playground

    Authors: Sha Li, Chi Han, Pengfei Yu, Carl Edwards, Manling Li, Xingyao Wang, Yi R. Fung, Charles Yu, Joel R. Tetreault, Eduard H. Hovy, Heng Ji

    Abstract: The recent explosion of performance of large language models (LLMs) has changed the field of Natural Language Processing (NLP) more abruptly and seismically than any other shift in the field's 80-year history. This has resulted in concerns that the field will become homogenized and resource-intensive. The new status quo has put many academic researchers, especially PhD students, at a disadvantage.…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: EMNLP Findings 2023 "Theme Track: Large Language Models and the Future of NLP"

  9. arXiv:2310.13297  [pdf, other]

    cs.CL cs.AI cs.LG

    Decoding the Silent Majority: Inducing Belief Augmented Social Graph with Large Language Model for Response Forecasting

    Authors: Chenkai Sun, Jinning Li, Yi R. Fung, Hou Pong Chan, Tarek Abdelzaher, ChengXiang Zhai, Heng Ji

    Abstract: Automatic response forecasting for news media plays a crucial role in enabling content producers to efficiently predict the impact of news releases and prevent unexpected negative outcomes such as social conflict and moral injury. To effectively forecast responses, it is essential to develop measures that leverage the social dynamics and contextual information surrounding individuals, especially i…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP 2023 Main Conference

  10. arXiv:2309.17428  [pdf, other]

    cs.CL cs.AI cs.LG

    CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets

    Authors: Lifan Yuan, Yangyi Chen, Xingyao Wang, Yi R. Fung, Hao Peng, Heng Ji

    Abstract: Large language models (LLMs) are often augmented with tools to solve complex tasks. By generating code snippets and executing them through task-specific Application Programming Interfaces (APIs), they can offload certain functions to dedicated external modules, such as image encoding and performing calculations. However, most existing approaches to augment LLMs with tools are constrained by genera…

    Submitted 13 March, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: Accepted to ICLR 2024. Code is available at https://github.com/lifan-yuan/CRAFT

  11. arXiv:2305.18641  [pdf, other]

    cs.CL cs.CV

    Enhanced Chart Understanding in Vision and Language Task via Cross-modal Pre-training on Plot Table Pairs

    Authors: Mingyang Zhou, Yi R. Fung, Long Chen, Christopher Thomas, Heng Ji, Shih-Fu Chang

    Abstract: Building cross-model intelligence that can understand charts and communicate the salient information hidden behind them is an appealing challenge in the vision and language (V+L) community. The capability to uncover the underlined table data of chart figures is a critical key to automatic chart understanding. We introduce ChartT5, a V+L model that learns how to interpret table information from char…

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: Accepted by Findings of ACL 2023

  12. arXiv:2305.14318  [pdf, other]

    cs.CL

    CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models

    Authors: Cheng Qian, Chi Han, Yi R. Fung, Yujia Qin, Zhiyuan Liu, Heng Ji

    Abstract: Large Language Models (LLMs) have made significant progress in utilizing tools, but their ability is limited by API availability and the instability of implicit reasoning, particularly when both planning and execution are involved. To overcome these limitations, we propose CREATOR, a novel framework that enables LLMs to create their own tools using documentation and code realization. CREATOR disen…

    Submitted 21 June, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Findings of EMNLP 2023

  13. arXiv:2304.08354  [pdf, other]

    cs.CL cs.AI cs.LG

    Tool Learning with Foundation Models

    Authors: Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Yufei Huang, Chaojun Xiao, Chi Han, Yi Ren Fung, Yusheng Su, Huadong Wang, Cheng Qian, Runchu Tian, Kunlun Zhu, Shihao Liang, Xingyu Shen, Bokai Xu, Zhen Zhang, Yining Ye, Bowen Li, Ziwei Tang, Jing Yi, Yuzhang Zhu, et al. (16 additional authors not shown)

    Abstract: Humans possess an extraordinary ability to create and utilize tools, allowing them to overcome physical limitations and explore new frontiers. With the advent of foundation models, AI systems have the potential to be equally adept in tool use as humans. This paradigm, i.e., tool learning with foundation models, combines the strengths of specialized tools and foundation models to achieve enhanced a…

    Submitted 15 June, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

  14. arXiv:2303.14337  [pdf, other]

    cs.CL

    SmartBook: AI-Assisted Situation Report Generation for Intelligence Analysts

    Authors: Revanth Gangi Reddy, Daniel Lee, Yi R. Fung, Khanh Duy Nguyen, Qi Zeng, Manling Li, Ziqi Wang, Clare Voss, Heng Ji

    Abstract: Timely and comprehensive understanding of emerging events is crucial for effective decision-making; automating situation report generation can significantly reduce the time, effort, and cost for intelligence analysts. In this work, we identify intelligence analysts' practices and preferences for AI assistance in situation report generation to guide the design strategies for an effective, trust-bui…

    Submitted 27 May, 2024; v1 submitted 24 March, 2023; originally announced March 2023.

    Comments: Preprint

  15. arXiv:2303.13775  [pdf, other]

    cs.DC cs.LG

    GSplit: Scaling Graph Neural Network Training on Large Graphs via Split-Parallelism

    Authors: Sandeep Polisetty, Juelin Liu, Kobi Falus, Yi Ren Fung, Seung-Hwan Lim, Hui Guan, Marco Serafini

    Abstract: Graph neural networks (GNNs), an emerging class of machine learning models for graphs, have gained popularity for their superior performance in various graph analytical tasks. Mini-batch training is commonly used to train GNNs on large graphs, and data parallelism is the standard approach to scale mini-batch training across multiple GPUs. One of the major performance costs in GNN training is the l…

    Submitted 27 June, 2024; v1 submitted 23 March, 2023; originally announced March 2023.

  16. arXiv:2210.08604  [pdf, other]

    cs.CL cs.AI

    NormSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations On-the-Fly

    Authors: Yi R. Fung, Tuhin Chakraborty, Hao Guo, Owen Rambow, Smaranda Muresan, Heng Ji

    Abstract: Norm discovery is important for understanding and reasoning about the acceptable behaviors and potential violations in human communication and interactions. We introduce NormSAGE, a framework for addressing the novel task of conversation-grounded multi-lingual, multi-cultural norm discovery, based on language model prompting and self-verification. NormSAGE leverages the expressiveness and implicit…

    Submitted 13 January, 2024; v1 submitted 16 October, 2022; originally announced October 2022.

  17. arXiv:2203.05967  [pdf, other]

    cs.SI cs.CL

    A Weibo Dataset for the 2022 Russo-Ukrainian Crisis

    Authors: Yi R. Fung, Heng Ji

    Abstract: Online social networks such as Twitter and Weibo play an important role in how people stay informed and exchange reactions. Each crisis encompasses a new opportunity to study the portability of models for various tasks (e.g., information extraction, complex event understanding, misinformation detection, etc.), due to differences in domain, entities, and event types. We present the Russia-Ukraine C…

    Submitted 9 March, 2022; originally announced March 2022.

    Comments: Russia-Ukraine Crisis, Weibo Dataset

  18. arXiv:2112.08544  [pdf, other]

    cs.CL cs.AI

    NewsClaims: A New Benchmark for Claim Detection from News with Attribute Knowledge

    Authors: Revanth Gangi Reddy, Sai Chetan, Zhenhailong Wang, Yi R. Fung, Kathryn Conger, Ahmed Elsayed, Martha Palmer, Preslav Nakov, Eduard Hovy, Kevin Small, Heng Ji

    Abstract: Claim detection and verification are crucial for news understanding and have emerged as promising technologies for mitigating misinformation and disinformation in the news. However, most existing work has focused on claim sentence analysis while overlooking additional crucial attributes (e.g., the claimer and the main object associated with the claim). In this work, we present NewsClaims, a new be…

    Submitted 23 November, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: Accepted at EMNLP 2022

  19. arXiv:2011.13406  [pdf, other]

    cs.CV

    Learning from Lexical Perturbations for Consistent Visual Question Answering

    Authors: Spencer Whitehead, Hui Wu, Yi Ren Fung, Heng Ji, Rogerio Feris, Kate Saenko

    Abstract: Existing Visual Question Answering (VQA) models are often fragile and sensitive to input variations. In this paper, we propose a novel approach to address this issue based on modular networks, which creates two questions related by linguistic perturbations and regularizes the visual reasoning process between them to be consistent during training. We show that our framework markedly improves consis…

    Submitted 22 December, 2020; v1 submitted 26 November, 2020; originally announced November 2020.

    Comments: 14 pages, 8 figures

  20. arXiv:2007.00576  [pdf, other]

    cs.CL cs.AI

    COVID-19 Literature Knowledge Graph Construction and Drug Repurposing Report Generation

    Authors: Qingyun Wang, Manling Li, Xuan Wang, Nikolaus Parulian, Guangxing Han, Jiawei Ma, Jingxuan Tu, Ying Lin, Haoran Zhang, Weili Liu, Aabhas Chauhan, Yingjun Guan, Bangzheng Li, Ruisong Li, Xiangchen Song, Yi R. Fung, Heng Ji, Jiawei Han, Shih-Fu Chang, James Pustejovsky, Jasmine Rah, David Liem, Ahmed Elsayed, Martha Palmer, Clare Voss, et al. (2 additional authors not shown)

    Abstract: To combat COVID-19, both clinicians and scientists need to digest vast amounts of relevant biomedical knowledge in scientific literature to understand the disease mechanism and related biological functions. We have developed a novel and comprehensive knowledge discovery framework, COVID-KG to extract fine-grained multimedia knowledge elements (entities and their visual chemical structures, relatio…

    Submitted 11 May, 2021; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: 12 pages, Accepted by Proceedings of 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics System Demonstrations, for resources see http://blender.cs.illinois.edu/covid19/, for video see http://159.89.180.81/demo/covid/Covid-KG_DemoVideo.mp4, for slides see https://eaglew.github.io/files/Covid-KG_DemoVideo_with_ethics.pdf

  21. arXiv:1909.06427  [pdf, other]

    cs.AI

    Responsive Planning and Recognition for Closed-Loop Interaction

    Authors: Richard G. Freedman, Yi Ren Fung, Roman Ganchin, Shlomo Zilberstein

    Abstract: Many intelligent systems currently interact with others using at least one of fixed communication inputs or preset responses, resulting in rigid interaction experiences and extensive efforts developing a variety of scenarios for the system. Fixed inputs limit the natural behavior of the user in order to effectively communicate, and preset responses prevent the system from adapting to the current s…

    Submitted 13 September, 2019; originally announced September 2019.

    Comments: Accepted for presentation at the AAAI 2019 Fall Symposium Series, in the symposium for Artificial Intelligence and Human-Robot Interaction for Service Robots in Human Environments

    Report number: AI-HRI/2019/24

  22. arXiv:1906.04231  [pdf, other]

    eess.IV cs.CV

    Alzheimer's Disease Brain MRI Classification: Challenges and Insights

    Authors: Yi Ren Fung, Ziqiang Guan, Ritesh Kumar, Joie Yeahuay Wu, Madalina Fiterau

    Abstract: In recent years, many papers have reported state-of-the-art performance on Alzheimer's Disease classification with MRI scans from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset using convolutional neural networks. However, we discover that when we split that data into training and testing sets at the subject level, we are not able to obtain similar performance, bringing the validit…

    Submitted 10 June, 2019; originally announced June 2019.

    Comments: 5 pages, 2 figures, IJCAI ARIAL workshop paper

  23. arXiv:1904.08930  [pdf, other]

    cs.LG stat.ML

    FLARe: Forecasting by Learning Anticipated Representations

    Authors: Surya Teja Devarakonda, Joie Yeahuay Wu, Yi Ren Fung, Madalina Fiterau

    Abstract: Computational models that forecast the progression of Alzheimer's disease at the patient level are extremely useful tools for identifying high risk cohorts for early intervention and treatment planning. The state-of-the-art work in this area proposes models that forecast by using latent representations extracted from the longitudinal data across multiple modalities, including volumetric informatio…

    Submitted 26 December, 2019; v1 submitted 17 April, 2019; originally announced April 2019.

    Report number: PMLR 106:53-65

  24. arXiv:1904.07950  [pdf, other]

    cs.CV

    A Comprehensive Study of Alzheimer's Disease Classification Using Convolutional Neural Networks

    Authors: Ziqiang Guan, Ritesh Kumar, Yi Ren Fung, Yeahuay Wu, Madalina Fiterau

    Abstract: A plethora of deep learning models have been developed for the task of Alzheimer's disease classification from brain MRI scans. Many of these models report high performance, achieving three-class classification accuracy of up to 95%. However, it is common for these studies to draw performance comparisons between models that are trained on different subsets of a dataset or use varying imaging prepr…

    Submitted 16 April, 2019; originally announced April 2019.