Showing 1–50 of 72 results for author: Ning, Y

  1. arXiv:2407.11034  [pdf]

    cs.LG

    Bridging Data Gaps in Healthcare: A Scoping Review of Transfer Learning in Biomedical Data Analysis

    Authors: Siqi Li, Xin Li, Kunyu Yu, Di Miao, Mingcheng Zhu, Mengying Yan, Yuhe Ke, Danny D'Agostino, Yilin Ning, Qiming Wu, Ziwen Wang, Yuqing Shang, Molei Liu, Chuan Hong, Nan Liu

    Abstract: Clinical and biomedical research in low-resource settings often faces significant challenges due to the need for high-quality data with sufficient sample sizes to construct effective models. These constraints hinder robust model training and prompt researchers to seek methods for leveraging existing knowledge from related studies to support new research efforts. Transfer learning (TL), a machine l…

    Submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2406.12449  [pdf]

    cs.AI

    Retrieval-Augmented Generation for Generative Artificial Intelligence in Medicine

    Authors: Rui Yang, Yilin Ning, Emilia Keppo, Mingxuan Liu, Chuan Hong, Danielle S Bitterman, Jasmine Chiat Ling Ong, Daniel Shu Wei Ting, Nan Liu

    Abstract: Generative artificial intelligence (AI) has brought revolutionary innovations in various fields, including medicine. However, it also exhibits limitations. In response, retrieval-augmented generation (RAG) provides a potential solution, enabling models to generate more accurate contents by leveraging the retrieval of external knowledge. With the rapid advancement of generative AI, RAG can pave the…

    Submitted 18 June, 2024; originally announced June 2024.

  3. arXiv:2406.10492  [pdf, other]

    cs.CL cs.LG

    Large Language Models as Event Forecasters

    Authors: Libo Zhang, Yue Ning

    Abstract: Key elements of human events are extracted as quadruples that consist of subject, relation, object, and timestamp. This representation can be extended to a quintuple by adding a fifth element: a textual summary that briefly describes the event. These quadruples or quintuples, when organized within a specific domain, form a temporal knowledge graph (TKG). Current learning frameworks focus on a few…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 10 pages, 3 figures, 10 tables

  4. arXiv:2406.09455  [pdf, other]

    cs.CV cs.AI cs.CL

    Pandora: Towards General World Model with Natural Language Actions and Video States

    Authors: Jiannan Xiang, Guangyi Liu, Yi Gu, Qiyue Gao, Yuting Ning, Yuheng Zha, Zeyu Feng, Tianhua Tao, Shibo Hao, Yemin Shi, Zhengzhong Liu, Eric P. Xing, Zhiting Hu

    Abstract: World models simulate future states of the world in response to different actions. They facilitate interactive content creation and provide a foundation for grounded, long-horizon reasoning. Current foundation models do not fully meet the capabilities of general world models: large language models (LLMs) are constrained by their reliance on language modality and their limited understanding of the…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Website: https://world-model.maitrix.org/

  5. arXiv:2406.01276  [pdf, other]

    cs.CL

    EduNLP: Towards a Unified and Modularized Library for Educational Resources

    Authors: Zhenya Huang, Yuting Ning, Longhu Qin, Shiwei Tong, Shangzi Xue, Tong Xiao, Xin Lin, Jiayu Liu, Qi Liu, Enhong Chen, Shijing Wang

    Abstract: Educational resource understanding is vital to online learning platforms, which have demonstrated growing applications recently. However, researchers and developers always struggle with using existing general natural language toolkits or domain-specific models. The issue raises a need to develop an effective and easy-to-use one that benefits AI education-related research and applications. To bridg…

    Submitted 4 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  6. arXiv:2405.17921  [pdf]

    cs.AI cs.CY

    Towards Clinical AI Fairness: Filling Gaps in the Puzzle

    Authors: Mingxuan Liu, Yilin Ning, Salinelat Teixayavong, Xiaoxuan Liu, Mayli Mertens, Yuqing Shang, Xin Li, Di Miao, Jie Xu, Daniel Shu Wei Ting, Lionel Tim-Ee Cheng, Jasmine Chiat Ling Ong, Zhen Ling Teo, Ting Fang Tan, Narrendar RaviChandran, Fei Wang, Leo Anthony Celi, Marcus Eng Hock Ong, Nan Liu

    Abstract: The ethical integration of Artificial Intelligence (AI) in healthcare necessitates addressing fairness, a concept that is highly context-specific across medical fields. Extensive studies have been conducted to expand the technical components of AI fairness, while tremendous calls for AI fairness have been raised from healthcare. Despite this, a significant disconnect persists between technical adva…

    Submitted 28 May, 2024; originally announced May 2024.

  7. arXiv:2405.03299  [pdf, other]

    cs.CR cs.DC

    DarkFed: A Data-Free Backdoor Attack in Federated Learning

    Authors: Minghui Li, Wei Wan, Yuxuan Ning, Shengshan Hu, Lulu Xue, Leo Yu Zhang, Yichen Wang

    Abstract: Federated learning (FL) has been demonstrated to be susceptible to backdoor attacks. However, existing academic studies on FL backdoor attacks rely on a high proportion of real clients with main task-related data, which is impractical. In the context of real-world industrial scenarios, even the simplest defense suffices to defend against the state-of-the-art attack, 3DFed. A practical FL backdoor…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: This paper has been accepted by IJCAI 2024

  8. arXiv:2404.16223  [pdf, other]

    cs.CV eess.IV

    Deep RAW Image Super-Resolution. A NTIRE 2024 Challenge Survey

    Authors: Marcos V. Conde, Florin-Alexandru Vasluianu, Radu Timofte, Jianxing Zhang, Jia Li, Fan Wang, Xiaopeng Li, Zikun Liu, Hyunhee Park, Sejun Song, Changho Kim, Zhijuan Huang, Hongyuan Yu, Cheng Wan, Wending Xiang, Jiamin Lin, Hang Zhong, Qiaosong Zhang, Yue Sun, Xuanwu Yin, Kunlong Zuo, Senyan Xu, Siyuan Jiang, Zhijing Sun, Jiaying Zhu, et al. (10 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 RAW Image Super-Resolution Challenge, highlighting the proposed solutions and results. New methods for RAW Super-Resolution could be essential in modern Image Signal Processing (ISP) pipelines; however, this problem is not as explored as in the RGB domain. The goal of this challenge is to upscale RAW Bayer images by 2x, considering unknown degradations such as nois…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 - NTIRE Workshop

  9. arXiv:2403.20085  [pdf, other]

    cs.RO

    OmniNxt: A Fully Open-source and Compact Aerial Robot with Omnidirectional Visual Perception

    Authors: Peize Liu, Chen Feng, Yang Xu, Yan Ning, Hao Xu, Shaojie Shen

    Abstract: Adopting omnidirectional Field of View (FoV) cameras in aerial robots vastly improves perception ability, significantly advancing aerial robotics' capabilities in inspection, reconstruction, and rescue tasks. However, such sensors also elevate system complexity, e.g., hardware design, and corresponding algorithm, which limits researchers from utilizing aerial robots with omnidirectional FoV in th…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: Submitted to IROS2024. Open source: https://github.com/HKUST-Aerial-Robotics/OmniNxt. Project page: https://hkust-aerial-robotics.github.io/OmniNxt/

  10. arXiv:2403.17708  [pdf, other]

    cs.CV cs.HC cs.MM

    Panonut360: A Head and Eye Tracking Dataset for Panoramic Video

    Authors: Yutong Xu, Junhao Du, Jiahe Wang, Yuwei Ning, Sihan Zhou, Yang Cao

    Abstract: With the rapid development and widespread application of VR/AR technology, maximizing the quality of immersive panoramic video services that match users' personal preferences and habits has become a long-standing challenge. Understanding the saliency region where users focus, based on data collected with HMDs, can promote multimedia encoding, transmission, and quality assessment. At the same time,…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: 7 pages, accepted at ACM MMSys'24

  11. arXiv:2403.06999  [pdf]

    cs.LG cs.AI cs.CY

    Survival modeling using deep learning, machine learning and statistical methods: A comparative analysis for predicting mortality after hospital admission

    Authors: Ziwen Wang, Jin Wee Lee, Tanujit Chakraborty, Yilin Ning, Mingxuan Liu, Feng Xie, Marcus Eng Hock Ong, Nan Liu

    Abstract: Survival analysis is essential for studying time-to-event outcomes and providing a dynamic understanding of the probability of an event occurring over time. Various survival analysis techniques, from traditional statistical models to state-of-the-art machine learning algorithms, support healthcare intervention and policy decisions. However, there remains ongoing discussion about their comparative…

    Submitted 4 March, 2024; originally announced March 2024.

  12. arXiv:2403.05235  [pdf]

    cs.LG cs.AI cs.CY

    Fairness-Aware Interpretable Modeling (FAIM) for Trustworthy Machine Learning in Healthcare

    Authors: Mingxuan Liu, Yilin Ning, Yuhe Ke, Yuqing Shang, Bibhas Chakraborty, Marcus Eng Hock Ong, Roger Vaughan, Nan Liu

    Abstract: The escalating integration of machine learning in high-stakes fields such as healthcare raises substantial concerns about model fairness. We propose an interpretable framework - Fairness-Aware Interpretable Modeling (FAIM), to improve model fairness without compromising performance, featuring an interactive interface to identify a "fairer" model from a set of high-performing models and promoting t…

    Submitted 8 March, 2024; originally announced March 2024.

  13. arXiv:2403.05229  [pdf]

    cs.AI

    Developing Federated Time-to-Event Scores Using Heterogeneous Real-World Survival Data

    Authors: Siqi Li, Yuqing Shang, Ziwen Wang, Qiming Wu, Chuan Hong, Yilin Ning, Di Miao, Marcus Eng Hock Ong, Bibhas Chakraborty, Nan Liu

    Abstract: Survival analysis serves as a fundamental component in numerous healthcare applications, where the determination of the time to specific events (such as the onset of a certain disease or death) for patients is crucial for clinical decision-making. Scoring systems are widely used for swift and efficient risk prediction. However, existing methods for constructing survival scores presume that data or…

    Submitted 8 March, 2024; originally announced March 2024.

  14. arXiv:2402.12852  [pdf, other]

    cs.LG

    CCFC++: Enhancing Federated Clustering through Feature Decorrelation

    Authors: Jie Yan, Jing Liu, Yi-Zi Ning, Zhong-Yuan Zhang

    Abstract: In federated clustering, multiple data-holding clients collaboratively group data without exchanging raw data. This field has seen notable advancements through its marriage with contrastive learning, exemplified by Cluster-Contrastive Federated Clustering (CCFC). However, CCFC suffers from heterogeneous data across clients, leading to poor and unrobust performance. Our study conducts both empirica…

    Submitted 20 February, 2024; originally announced February 2024.

  15. arXiv:2402.06861  [pdf, other]

    cs.AI

    UrbanKGent: A Unified Large Language Model Agent Framework for Urban Knowledge Graph Construction

    Authors: Yansong Ning, Hao Liu

    Abstract: Urban knowledge graph has recently worked as an emerging building block to distill critical knowledge from multi-sourced urban data for diverse urban application scenarios. Despite its promising benefits, urban knowledge graph construction (UrbanKGC) still heavily relies on manual effort, hindering its potential advancement. This paper presents UrbanKGent, a unified large language model agent fram…

    Submitted 9 February, 2024; originally announced February 2024.

    Comments: Under review

  16. arXiv:2401.03077  [pdf, other]

    cs.LG

    A Topology-aware Graph Coarsening Framework for Continual Graph Learning

    Authors: Xiaoxue Han, Zhuo Feng, Yue Ning

    Abstract: Continual learning on graphs tackles the problem of training a graph neural network (GNN) where graph data arrive in a streaming fashion and the model tends to forget knowledge from previous tasks when updating with new data. Traditional continual learning strategies such as Experience Replay can be adapted to streaming graphs; however, these methods often face challenges such as inefficiency in p…

    Submitted 5 January, 2024; originally announced January 2024.

  17. arXiv:2312.11026  [pdf, other]

    cs.LG cs.CR cs.DC

    MISA: Unveiling the Vulnerabilities in Split Federated Learning

    Authors: Wei Wan, Yuxuan Ning, Shengshan Hu, Lulu Xue, Minghui Li, Leo Yu Zhang, Hai Jin

    Abstract: Federated learning (FL) and split learning (SL) are prevailing distributed paradigms in recent years. They both enable shared global model training while keeping data localized on users' devices. The former excels in parallel execution capabilities, while the latter enjoys low dependence on edge computing resources and strong privacy protection. Split federated learning…

    Submitted 19 December, 2023; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: This paper has been accepted by the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024)

  18. arXiv:2311.08747  [pdf, other]

    cs.CV

    Improved Dense Nested Attention Network Based on Transformer for Infrared Small Target Detection

    Authors: Chun Bao, Jie Cao, Yaqian Ning, Tianhua Zhao, Zhijun Li, Zechen Wang, Li Zhang, Qun Hao

    Abstract: Infrared small target detection based on deep learning offers unique advantages in separating small targets from complex and dynamic backgrounds. However, the features of infrared small targets gradually weaken as the depth of convolutional neural network (CNN) increases. To address this issue, we propose a novel method for detecting infrared small targets called improved dense nested attention ne…

    Submitted 17 January, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

  19. arXiv:2311.07237  [pdf, other]

    cs.CL cs.AI

    In Search of the Long-Tail: Systematic Generation of Long-Tail Inferential Knowledge via Logical Rule Guided Search

    Authors: Huihan Li, Yuting Ning, Zeyi Liao, Siyuan Wang, Xiang Lorraine Li, Ximing Lu, Wenting Zhao, Faeze Brahman, Yejin Choi, Xiang Ren

    Abstract: State-of-the-art LLMs outperform humans on reasoning tasks such as Natural Language Inference. Recent works evaluating LLMs note a marked performance drop on input data from the low-probability distribution, i.e., the long tail. Therefore, we focus on systematically generating statements involving long-tail inferential knowledge for more effective evaluation of LLMs in the reasoning space. We first…

    Submitted 27 February, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

  20. arXiv:2311.03417  [pdf]

    cs.LG cs.AI

    Federated Learning for Clinical Structured Data: A Benchmark Comparison of Engineering and Statistical Approaches

    Authors: Siqi Li, Di Miao, Qiming Wu, Chuan Hong, Danny D'Agostino, Xin Li, Yilin Ning, Yuqing Shang, Huazhu Fu, Marcus Eng Hock Ong, Hamed Haddadi, Nan Liu

    Abstract: Federated learning (FL) has shown promising potential in safeguarding data privacy in healthcare collaborations. While the term "FL" was originally coined by the engineering community, the statistical field has also explored similar privacy-preserving algorithms. Statistical FL algorithms, however, remain considerably less recognized than their engineering counterparts. Our goal was to bridge the…

    Submitted 6 November, 2023; originally announced November 2023.

  21. arXiv:2311.02107  [pdf]

    cs.LG cs.AI cs.CY

    Generative Artificial Intelligence in Healthcare: Ethical Considerations and Assessment Checklist

    Authors: Yilin Ning, Salinelat Teixayavong, Yuqing Shang, Julian Savulescu, Vaishaanth Nagaraj, Di Miao, Mayli Mertens, Daniel Shu Wei Ting, Jasmine Chiat Ling Ong, Mingxuan Liu, Jiuwen Cao, Michael Dunn, Roger Vaughan, Marcus Eng Hock Ong, Joseph Jao-Yiu Sung, Eric J Topol, Nan Liu

    Abstract: The widespread use of ChatGPT and other emerging technology powered by generative artificial intelligence (GenAI) has drawn much attention to potential ethical issues, especially in high-stakes applications such as healthcare, but ethical discussions are yet to translate into operationalisable solutions. Furthermore, ongoing ethical discussions often neglect other types of GenAI that have been use…

    Submitted 23 February, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

  22. arXiv:2310.12350  [pdf, other]

    cs.LG

    Equipping Federated Graph Neural Networks with Structure-aware Group Fairness

    Authors: Nan Cui, Xiuling Wang, Wendy Hui Wang, Violet Chen, Yue Ning

    Abstract: Graph Neural Networks (GNNs) have been widely used for various types of graph data processing and analytical tasks in different domains. Training GNNs over centralized graph data can be infeasible due to privacy concerns and regulatory restrictions. Thus, federated learning (FL) becomes a trending solution to address this challenge in a distributed learning paradigm. However, as GNNs may inherit h…

    Submitted 13 May, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

  23. arXiv:2310.09672  [pdf, other]

    cs.LG

    Towards Semi-Structured Automatic ICD Coding via Tree-based Contrastive Learning

    Authors: Chang Lu, Chandan K. Reddy, Ping Wang, Yue Ning

    Abstract: Automatic coding of International Classification of Diseases (ICD) is a multi-label text categorization task that involves extracting disease or procedure codes from clinical notes. Despite the application of state-of-the-art natural language processing (NLP) techniques, there are still challenges including limited availability of data due to privacy constraints and the high variability of clinica…

    Submitted 14 October, 2023; originally announced October 2023.

    Comments: Accepted by NeurIPS 2023

  24. arXiv:2307.11758  [pdf, other]

    cs.RO

    A Comprehensive Introduction of Visual-Inertial Navigation

    Authors: Yangyang Ning

    Abstract: In this article, a tutorial introduction to visual-inertial navigation (VIN) is presented. Visual and inertial perception are two complementary sensing modalities. Cameras and inertial measurement units (IMU) are the corresponding sensors for these two modalities. The low cost and light weight of camera-IMU sensor combinations make them ubiquitous in robotic navigation. Visual-inertial Navigation i…

    Submitted 27 June, 2023; originally announced July 2023.

    Comments: 35 pages, 10 figures

  25. arXiv:2306.11443  [pdf, other]

    cs.AI cs.LG

    UUKG: Unified Urban Knowledge Graph Dataset for Urban Spatiotemporal Prediction

    Authors: Yansong Ning, Hao Liu, Hao Wang, Zhenyu Zeng, Hui Xiong

    Abstract: Accurate Urban SpatioTemporal Prediction (USTP) is of great importance to the development and operation of the smart city. As an emerging building block, multi-sourced urban data are usually integrated as urban knowledge graphs (UrbanKGs) to provide critical knowledge for urban spatiotemporal prediction models. However, existing UrbanKGs are often tailored for specific downstream prediction tasks…

    Submitted 22 October, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023 Track on Datasets and Benchmarks

  26. arXiv:2306.10512  [pdf, other]

    cs.CL

    Efficiently Measuring the Cognitive Ability of LLMs: An Adaptive Testing Perspective

    Authors: Yan Zhuang, Qi Liu, Yuting Ning, Weizhe Huang, Rui Lv, Zhenya Huang, Guanhao Zhao, Zheng Zhang, Qingyang Mao, Shijin Wang, Enhong Chen

    Abstract: Large language models (LLMs), like ChatGPT, have shown some human-like cognitive abilities. For comparing these abilities of different models, several benchmarks (i.e. sets of standard test questions) from different fields (e.g., Literature, Biology and Psychology) are often adopted and the test results under traditional metrics such as accuracy, recall and F1, are reported. However, such way for…

    Submitted 28 October, 2023; v1 submitted 18 June, 2023; originally announced June 2023.

  27. arXiv:2306.10351  [pdf, other]

    cs.LG cs.AI cs.CR

    Bkd-FedGNN: A Benchmark for Classification Backdoor Attacks on Federated Graph Neural Network

    Authors: Fan Liu, Siqi Lai, Yansong Ning, Hao Liu

    Abstract: Federated Graph Neural Network (FedGNN) has recently emerged as a rapidly growing research topic, as it integrates the strengths of graph neural networks and federated learning to enable advanced machine learning applications without direct access to sensitive data. Despite its advantages, the distributed nature of FedGNN introduces additional vulnerabilities, particularly backdoor attacks stemmin…

    Submitted 17 June, 2023; originally announced June 2023.

  28. arXiv:2304.13493  [pdf]

    cs.CY cs.AI

    Towards clinical AI fairness: A translational perspective

    Authors: Mingxuan Liu, Yilin Ning, Salinelat Teixayavong, Mayli Mertens, Jie Xu, Daniel Shu Wei Ting, Lionel Tim-Ee Cheng, Jasmine Chiat Ling Ong, Zhen Ling Teo, Ting Fang Tan, Ravi Chandran Narrendar, Fei Wang, Leo Anthony Celi, Marcus Eng Hock Ong, Nan Liu

    Abstract: Artificial intelligence (AI) has demonstrated the ability to extract insights from data, but the issue of fairness remains a concern in high-stakes fields such as healthcare. Despite extensive discussion and efforts in algorithm development, AI fairness and clinical concerns have not been adequately addressed. In this paper, we discuss the misalignment between technical and clinical perspectives o…

    Submitted 26 April, 2023; originally announced April 2023.

  29. Federated and distributed learning applications for electronic health records and structured medical data: A scoping review

    Authors: Siqi Li, Pinyan Liu, Gustavo G. Nascimento, Xinru Wang, Fabio Renato Manzolli Leite, Bibhas Chakraborty, Chuan Hong, Yilin Ning, Feng Xie, Zhen Ling Teo, Daniel Shu Wei Ting, Hamed Haddadi, Marcus Eng Hock Ong, Marco Aurélio Peres, Nan Liu

    Abstract: Federated learning (FL) has gained popularity in clinical research in recent years to facilitate privacy-preserving collaboration. Structured data, one of the most prevalent forms of clinical data, has experienced significant growth in volume concurrently, notably with the widespread adoption of electronic health records in clinical practice. This review examines FL applications on structured medi…

    Submitted 14 April, 2023; originally announced April 2023.

  30. arXiv:2304.03779  [pdf]

    cs.LG cs.AI cs.CY

    A roadmap to fair and trustworthy prediction model validation in healthcare

    Authors: Yilin Ning, Victor Volovici, Marcus Eng Hock Ong, Benjamin Alan Goldstein, Nan Liu

    Abstract: A prediction model is most useful if it generalizes beyond the development data with external validations, but to what extent should it generalize remains unclear. In practice, prediction models are externally validated using data from very different settings, including populations from other health systems or countries, with predictably poor results. This may not be a fair reflection of the perfo…

    Submitted 7 April, 2023; originally announced April 2023.

    Comments: 12 pages, 2 figures

  31. arXiv:2303.07830  [pdf]

    q-bio.NC cs.AI

    Emergent Bio-Functional Similarities in a Cortical-Spike-Train-Decoding Spiking Neural Network Facilitate Predictions of Neural Computation

    Authors: Tengjun Liu, Yansong Chua, Yiwei Zhang, Yuxiao Ning, Pengfu Liu, Guihua Wan, Zijun Wan, Shaomin Zhang, Weidong Chen

    Abstract: Despite its better bio-plausibility, goal-driven spiking neural network (SNN) has not achieved applicable performance for classifying biological spike trains, and showed little bio-functional similarities compared to traditional artificial neural networks. In this study, we proposed the motorSRNN, a recurrent SNN topologically inspired by the neural motor circuit of primates. By employing the moto…

    Submitted 14 March, 2023; originally announced March 2023.

  32. arXiv:2303.00282  [pdf]

    cs.LG cs.AI cs.CR

    FedScore: A privacy-preserving framework for federated scoring system development

    Authors: Siqi Li, Yilin Ning, Marcus Eng Hock Ong, Bibhas Chakraborty, Chuan Hong, Feng Xie, Han Yuan, Mingxuan Liu, Daniel M. Buckland, Yong Chen, Nan Liu

    Abstract: We propose FedScore, a privacy-preserving federated learning framework for scoring system generation across multiple sites to facilitate cross-institutional collaborations. The FedScore framework includes five modules: federated variable ranking, federated variable transformation, federated score derivation, federated model selection and federated model evaluation. To illustrate usage and assess F…

    Submitted 1 March, 2023; originally announced March 2023.

  33. arXiv:2302.04643  [pdf, other]

    cs.CL

    A Novel Approach for Auto-Formulation of Optimization Problems

    Authors: Yuting Ning, Jiayu Liu, Longhu Qin, Tong Xiao, Shangzi Xue, Zhenya Huang, Qi Liu, Enhong Chen, Jinze Wu

    Abstract: In the Natural Language for Optimization (NL4Opt) NeurIPS 2022 competition, competitors focus on improving the accessibility and usability of optimization solvers, with the aim of subtask 1: recognizing the semantic entities that correspond to the components of the optimization problem; subtask 2: generating formulations for the optimization problem. In this paper, we present the solution of our t…

    Submitted 9 February, 2023; originally announced February 2023.

  34. arXiv:2301.07558  [pdf, other]

    cs.CL

    Towards a Holistic Understanding of Mathematical Questions with Contrastive Pre-training

    Authors: Yuting Ning, Zhenya Huang, Xin Lin, Enhong Chen, Shiwei Tong, Zheng Gong, Shijin Wang

    Abstract: Understanding mathematical questions effectively is a crucial task, which can benefit many applications, such as difficulty estimation. Researchers have drawn much attention to designing pre-training models for question representations due to the scarcity of human annotations (e.g., labeling difficulty). However, unlike general free-format texts (e.g., user comments), mathematical questions are ge…

    Submitted 18 January, 2023; originally announced January 2023.

    Comments: Accepted by AAAI 2023

  35. arXiv:2212.08370  [pdf]

    cs.LG

    Shapley variable importance cloud for machine learning models

    Authors: Yilin Ning, Mingxuan Liu, Nan Liu

    Abstract: Current practice in interpretable machine learning often focuses on explaining the final model trained from data, e.g., by using the Shapley additive explanations (SHAP) method. The recently developed Shapley variable importance cloud (ShapleyVIC) extends the current practice to a group of "nearly optimal models" to provide comprehensive and robust variable importance assessments, with estimated u…

    Submitted 16 December, 2022; originally announced December 2022.

  36. Rega-Net: Retina Gabor Attention for Deep Convolutional Neural Networks

    Authors: Chun Bao, Jie Cao, Yaqian Ning, Yang Cheng, Qun Hao

    Abstract: Extensive research works demonstrate that the attention mechanism in convolutional neural networks (CNNs) effectively improves accuracy. Nevertheless, few works design attention mechanisms using large receptive fields. In this work, we propose a novel attention method named Rega-net to increase CNN accuracy by enlarging the receptive field. Inspired by the mechanism of the human retina, we design…

    Submitted 3 March, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

  37. Handling missing values in healthcare data: A systematic review of deep learning-based imputation techniques

    Authors: Mingxuan Liu, Siqi Li, Han Yuan, Marcus Eng Hock Ong, Yilin Ning, Feng Xie, Seyed Ehsan Saffari, Victor Volovici, Bibhas Chakraborty, Nan Liu

    Abstract: Objective: The proper handling of missing values is critical to delivering reliable estimates and decisions, especially in high-stakes fields such as clinical research. The increasing diversity and complexity of data have led many researchers to develop deep learning (DL)-based imputation techniques. We conducted a systematic review to evaluate the use of these techniques, with a particular focus…

    Submitted 15 October, 2022; originally announced October 2022.

  38. arXiv:2206.04050  [pdf]

    cs.LG cs.HC

    Balanced background and explanation data are needed in explaining deep learning models with SHAP: An empirical study on clinical decision making

    Authors: Mingxuan Liu, Yilin Ning, Han Yuan, Marcus Eng Hock Ong, Nan Liu

    Abstract: Objective: Shapley additive explanations (SHAP) is a popular post-hoc technique for explaining black box models. While the impact of data imbalance on predictive models has been extensively studied, it remains largely unknown with respect to SHAP-based model explanations. This study sought to investigate the effects of data imbalance on SHAP explanations for deep learning models, and to propose a…

    Submitted 8 June, 2022; originally announced June 2022.

  39. arXiv:2206.02791  [pdf, other]

    cs.LG

    Instance-Dependent Label-Noise Learning with Manifold-Regularized Transition Matrix Estimation

    Authors: De Cheng, Tongliang Liu, Yixiong Ning, Nannan Wang, Bo Han, Gang Niu, Xinbo Gao, Masashi Sugiyama

    Abstract: In label-noise learning, estimating the transition matrix has attracted more and more attention as the matrix plays an important role in building statistically consistent classifiers. However, it is very challenging to estimate the transition matrix T(x), where x denotes the instance, because it is unidentifiable under the instance-dependent noise (IDN). To address this problem, we have noticed tha…

    Submitted 6 June, 2022; originally announced June 2022.

    Comments: Accepted by CVPR 2022

  40. arXiv:2204.09726  [pdf, other]

    cs.RO

    Coverage Control for a Multi-robot Team with Heterogeneous Capabilities using Block Coordinate Descent (BCD) Method

    Authors: Yung Yu Andy Yiu, Ying Hing Yim, Yan Ning, Zikai Wang, Ling Shi

    Abstract: In this paper, we propose a coverage control system for a multi-robot team with heterogeneous capabilities to patrol or monitor a bounded environment. The capability could be defined as any criterion of robots like remaining power or mobile speed, depending on the purpose. The proposed control system aims to allocate different portions of the environment to the robots according to their capabiliti…

    Submitted 20 April, 2022; originally announced April 2022.

    Comments: 7 pages, 5 figures, accepted by ICCA2022

  41. arXiv:2204.07010  [pdf, other]

    cs.CL

    Anti-Asian Hate Speech Detection via Data Augmented Semantic Relation Inference

    Authors: Jiaxuan Li, Yue Ning

    Abstract: With the spreading of hate speech on social media in recent years, automatic detection of hate speech is becoming a crucial task and has attracted attention from various communities. This task aims to recognize online posts (e.g., tweets) that contain hateful information. The peculiarities of languages in social media, such as short and poorly written content, lead to the difficulty of learning se…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: To appear in Proceedings of the 16th International AAAI Conference on Web and Social Media (ICWSM)

  42. Multi-Label Clinical Time-Series Generation via Conditional GAN

    Authors: Chang Lu, Chandan K. Reddy, Ping Wang, Dong Nie, Yue Ning

    Abstract: In recent years, deep learning has been successfully adopted in a wide range of applications related to electronic health records (EHRs) such as representation learning and clinical event prediction. However, due to privacy constraints, limited access to EHR becomes a bottleneck for deep learning research. To mitigate these concerns, generative adversarial networks (GANs) have been successfully us…

    Submitted 31 August, 2023; v1 submitted 10 April, 2022; originally announced April 2022.

    Comments: © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  43. arXiv:2202.08407  [pdf]

    cs.LG

    AutoScore-Ordinal: An interpretable machine learning framework for generating scoring models for ordinal outcomes

    Authors: Seyed Ehsan Saffari, Yilin Ning, Xie Feng, Bibhas Chakraborty, Victor Volovici, Roger Vaughan, Marcus Eng Hock Ong, Nan Liu

    Abstract: Background: Risk prediction models are useful tools in clinical decision-making which help with risk stratification and resource allocations and may lead to a better health care for patients. AutoScore is a machine learning-based automatic clinical score generator for binary outcomes. This study aims to expand the AutoScore framework to provide a tool for interpretable risk prediction for ordinal…

    Submitted 16 February, 2022; originally announced February 2022.

  44. arXiv:2201.03291  [pdf]

    cs.LG

    A novel interpretable machine learning system to generate clinical risk scores: An application for predicting early mortality or unplanned readmission in a retrospective cohort study

    Authors: Yilin Ning, Siqi Li, Marcus Eng Hock Ong, Feng Xie, Bibhas Chakraborty, Daniel Shu Wei Ting, Nan Liu

    Abstract: Risk scores are widely used for clinical decision making and commonly generated from logistic regression models. Machine-learning-based methods may work well for identifying important predictors, but such 'black box' variable selection limits interpretability, and variable importance evaluated from a single model can be biased. We propose a robust and interpretable variable selection approach usin…

    Submitted 10 January, 2022; originally announced January 2022.

  45. arXiv:2112.12909  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Optimal Variable Clustering for High-Dimensional Matrix Valued Data

    Authors: Inbeom Lee, Siyi Deng, Yang Ning

    Abstract: Matrix valued data has become increasingly prevalent in many applications. Most of the existing clustering methods for this type of data are tailored to the mean model and do not account for the dependence structure of the features, which can be very informative, especially in high-dimensional settings or when mean information is not available. To extract the information from the dependence struct…

    Submitted 6 December, 2023; v1 submitted 23 December, 2021; originally announced December 2021.

  46. arXiv:2112.06345  [pdf, other]

    cs.LG cs.AI

    A Survey on Societal Event Forecasting with Deep Learning

    Authors: Songgaojun Deng, Yue Ning

    Abstract: Population-level societal events, such as civil unrest and crime, often have a significant impact on our daily life. Forecasting such events is of great importance for decision-making and resource allocation. Event prediction has traditionally been challenging due to the lack of knowledge regarding the true causes and underlying mechanisms of event occurrence. In recent years, research on event fo…

    Submitted 12 December, 2021; originally announced December 2021.

    Comments: 31 pages, 12 figures, 4 tables

    MSC Class: 68T07

  47. arXiv:2112.05695  [pdf, other]

    cs.LG cs.AI

    Causal Knowledge Guided Societal Event Forecasting

    Authors: Songgaojun Deng, Huzefa Rangwala, Yue Ning

    Abstract: Data-driven societal event forecasting methods exploit relevant historical information to predict future events. These methods rely on historical labeled data and cannot accurately predict events when data are limited or of poor quality. Studying causal effects between events goes beyond correlation analysis and can contribute to a more robust prediction of events. However, incorporating causality…

    Submitted 10 December, 2021; originally announced December 2021.

    Comments: 17 pages, 19 figures

    MSC Class: 68T07

  48. arXiv:2112.05195  [pdf, other]

    cs.LG

    Context-aware Health Event Prediction via Transition Functions on Dynamic Disease Graphs

    Authors: Chang Lu, Tian Han, Yue Ning

    Abstract: With the wide application of electronic health records (EHR) in healthcare facilities, health event prediction with deep learning has gained more and more attention. A common feature of EHR data used for deep-learning-based predictions is historical diagnoses. Existing work mainly regards a diagnosis as an independent disease and does not consider clinical relations among diseases in a visit. Many…

    Submitted 15 December, 2021; v1 submitted 9 December, 2021; originally announced December 2021.

    Comments: This paper is accepted by AAAI 2022

  49. arXiv:2110.02484  [pdf]

    cs.LG cs.HC

    Shapley variable importance clouds for interpretable machine learning

    Authors: Yilin Ning, Marcus Eng Hock Ong, Bibhas Chakraborty, Benjamin Alan Goldstein, Daniel Shu Wei Ting, Roger Vaughan, Nan Liu

    Abstract: Interpretable machine learning has been focusing on explaining final models that optimize performance. The current state-of-the-art is the Shapley additive explanations (SHAP) that locally explains variable impact on individual predictions, and it is recently extended for a global assessment across the dataset. Recently, Dong and Rudin proposed to extend the investigation to models from the same c…

    Submitted 5 October, 2021; originally announced October 2021.

  50. Deep learning for temporal data representation in electronic health records: A systematic review of challenges and methodologies

    Authors: Feng Xie, Han Yuan, Yilin Ning, Marcus Eng Hock Ong, Mengling Feng, Wynne Hsu, Bibhas Chakraborty, Nan Liu

    Abstract: Objective: Temporal electronic health records (EHRs) can be a wealth of information for secondary uses, such as clinical events prediction or chronic disease management. However, challenges exist for temporal data representation. We therefore sought to identify these challenges and evaluate novel methodologies for addressing them through a systematic examination of deep learning solutions. Metho…

    Submitted 21 July, 2021; originally announced July 2021.