Showing 1–50 of 71 results for author: Ou, Y

  1. arXiv:2406.18532  [pdf, other]

    cs.CL cs.AI cs.LG

    Symbolic Learning Enables Self-Evolving Agents

    Authors: Wangchunshu Zhou, Yixin Ou, Shengwei Ding, Long Li, Jialong Wu, Tiannan Wang, Jiamin Chen, Shuai Wang, Xiaohua Xu, Ningyu Zhang, Huajun Chen, Yuchen Eleanor Jiang

    Abstract: The AI community has been exploring a pathway to artificial general intelligence (AGI) by developing "language agents", which are complex large language model (LLM) pipelines involving both prompting techniques and tool usage methods. While language agents have demonstrated impressive capabilities for many real-world tasks, a fundamental limitation of current language agents research is that the…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Code available at https://github.com/aiwaves-cn/agents

  2. arXiv:2406.15050  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Tri-VQA: Triangular Reasoning Medical Visual Question Answering for Multi-Attribute Analysis

    Authors: Lin Fan, Xun Gong, Cenyang Zheng, Yafei Ou

    Abstract: The intersection of medical Visual Question Answering (Med-VQA) is a challenging research topic with advantages including patient engagement and clinical expert involvement for second opinions. However, existing Med-VQA methods based on joint embedding fail to explain whether their provided results are based on correct reasoning or coincidental answers, which undermines the credibility of VQA answ…

    Submitted 21 June, 2024; originally announced June 2024.

    ACM Class: I.2.7; I.2.10; J.3

  3. arXiv:2406.10569  [pdf, other]

    cs.LG cs.CV

    MDA: An Interpretable Multi-Modal Fusion with Missing Modalities and Intrinsic Noise

    Authors: Lin Fan, Yafei Ou, Cenyang Zheng, Pengyu Dai, Tamotsu Kamishima, Masayuki Ikebe, Kenji Suzuki, Xun Gong

    Abstract: Multi-modal fusion is crucial in medical data research, enabling a comprehensive understanding of diseases and improving diagnostic performance by combining diverse modalities. However, multi-modal fusion faces challenges, including capturing interactions between modalities, addressing missing modalities, handling erroneous modal information, and ensuring interpretability. Many existing researcher…

    Submitted 15 June, 2024; originally announced June 2024.

    ACM Class: I.5.2; I.2.7; I.2.10; J.3

  4. arXiv:2405.05502  [pdf, other]

    cs.CV cs.CR cs.LG

    Towards Accurate and Robust Architectures via Neural Architecture Search

    Authors: Yuwei Ou, Yuqi Feng, Yanan Sun

    Abstract: To defend deep neural networks from adversarial attacks, adversarial training has been drawing increasing attention for its effectiveness. However, the accuracy and robustness resulting from the adversarial training are limited by the architecture, because adversarial training improves accuracy and robustness by adjusting the weight connection affiliated to the architecture. In this work, we propo…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted by CVPR2024. arXiv admin note: substantial text overlap with arXiv:2212.14049

  5. arXiv:2404.18550  [pdf, other]

    cs.LG cs.HC

    IncidentResponseGPT: Generating Traffic Incident Response Plans with Generative Artificial Intelligence

    Authors: Artur Grigorev, Adriana-Simona Mihaita, Khaled Saleh, Yuming Ou

    Abstract: Traffic congestion due to road incidents poses a significant challenge in urban environments, leading to increased pollution, economic losses, and traffic congestion. Efficiently managing these incidents is imperative for mitigating their adverse effects; however, the complexity of urban traffic systems and the variety of potential incidents represent a considerable obstacle. This paper introduces…

    Submitted 29 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  6. arXiv:2404.10980  [pdf, other]

    cs.CV cs.LG

    Hyper Evidential Deep Learning to Quantify Composite Classification Uncertainty

    Authors: Changbin Li, Kangshuo Li, Yuzhe Ou, Lance M. Kaplan, Audun Jøsang, Jin-Hee Cho, Dong Hyun Jeong, Feng Chen

    Abstract: Deep neural networks (DNNs) have been shown to perform well on exclusive, multi-class classification tasks. However, when different classes have similar visual features, it becomes challenging for human annotators to differentiate them. This scenario necessitates the use of composite class labels. In this paper, we propose a novel framework called Hyper-Evidential Neural Network (HENN) that explic…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: In Proceedings of The Twelfth International Conference on Learning Representations, ICLR 2024

  7. arXiv:2404.05888  [pdf, other]

    cs.RO

    A Realistic Surgical Simulator for Non-Rigid and Contact-Rich Manipulation in Surgeries with the da Vinci Research Kit

    Authors: Yafei Ou, Sadra Zargarzadeh, Paniz Sedighi, Mahdi Tavakoli

    Abstract: Realistic real-time surgical simulators play an increasingly important role in surgical robotics research, such as surgical robot learning and automation, and surgical skills assessment. Although there are a number of existing surgical simulators for research, they generally lack the ability to simulate the diverse types of objects and contact-rich manipulation tasks typically present in surgeries…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: 7 pages, 21st International Conference on Ubiquitous Robots (UR 2024), accepted

  8. arXiv:2403.13547  [pdf, other]

    cs.LG eess.SY

    Enhancing Traffic Incident Management with Large Language Models: A Hybrid Machine Learning Approach for Severity Classification

    Authors: Artur Grigorev, Khaled Saleh, Yuming Ou, Adriana-Simona Mihaita

    Abstract: This research showcases the innovative integration of Large Language Models into machine learning workflows for traffic incident management, focusing on the classification of incident severity using accident reports. By leveraging features generated by modern language models alongside conventional data extracted from incident reports, our research demonstrates improvements in the accuracy of sever…

    Submitted 29 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  9. arXiv:2403.03101  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG cs.MA

    KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents

    Authors: Yuqi Zhu, Shuofei Qiao, Yixin Ou, Shumin Deng, Ningyu Zhang, Shiwei Lyu, Yue Shen, Lei Liang, Jinjie Gu, Huajun Chen

    Abstract: Large Language Models (LLMs) have demonstrated great potential in complex reasoning tasks, yet they fall short when tackling more sophisticated challenges, especially when interacting with environments through generating executable actions. This inadequacy primarily stems from the lack of built-in action knowledge in language agents, which fails to effectively guide the planning trajectories durin…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Work in progress. Project page: https://zjunlp.github.io/project/KnowAgent/ Code: https://github.com/zjunlp/KnowAgent

  10. Halo Reduction in Display Systems through Smoothed Local Histogram Equalization and Human Visual System Modeling

    Authors: Prasoon Ambalathankandy, Yafei Ou, Masayuki Ikebe

    Abstract: Halo artifacts significantly impact display quality. We propose a method to reduce halos in Local Histogram Equalization (LHE) algorithms by separately addressing dark and light variants. This approach results in visually natural images by exploring the relationship between lateral inhibition and halo artifacts in the human visual system.

    Submitted 9 February, 2024; originally announced February 2024.

    ACM Class: I.4.3

  11. arXiv:2402.04587  [pdf, other]

    cs.CV

    Sparse Anatomical Prompt Semi-Supervised Learning with Masked Image Modeling for CBCT Tooth Segmentation

    Authors: Pengyu Dai, Yafei Ou, Yang Liu, Yue Zhao

    Abstract: Accurate tooth identification and segmentation in Cone Beam Computed Tomography (CBCT) dental images can significantly enhance the efficiency and precision of manual diagnoses performed by dentists. However, existing segmentation methods are mainly developed based on large data volumes training, on which their annotations are extremely time-consuming. Meanwhile, the teeth of each class in CBCT den…

    Submitted 7 February, 2024; originally announced February 2024.

    ACM Class: I.4.6

  12. A Psychological Study: Importance of Contrast and Luminance in Color to Grayscale Mapping

    Authors: Prasoon Ambalathankandy, Yafei Ou, Sae Kaneko, Masayuki Ikebe

    Abstract: Grayscale images are essential in image processing and computer vision tasks. They effectively emphasize luminance and contrast, highlighting important visual features, while also being easily compatible with other algorithms. Moreover, their simplified representation makes them efficient for storage and transmission purposes. While preserving contrast is important for maintaining visual quality,…

    Submitted 6 February, 2024; originally announced February 2024.

    ACM Class: I.4.3

  13. arXiv:2402.03049  [pdf, other]

    cs.CL cs.AI cs.HC cs.IR cs.LG

    EasyInstruct: An Easy-to-use Instruction Processing Framework for Large Language Models

    Authors: Yixin Ou, Ningyu Zhang, Honghao Gui, Ziwen Xu, Shuofei Qiao, Yida Xue, Runnan Fang, Kangwei Liu, Lei Li, Zhen Bi, Guozhou Zheng, Huajun Chen

    Abstract: In recent years, instruction tuning has gained increasing attention and emerged as a crucial technique to enhance the capabilities of Large Language Models (LLMs). To construct high-quality instruction datasets, many instruction processing approaches have been proposed, aiming to achieve a delicate balance between data quantity and data quality. Nevertheless, due to inconsistencies that persist am…

    Submitted 23 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: ACL 2024 System Demonstrations; Project website: https://zjunlp.github.io/project/EasyInstruct Code: https://github.com/zjunlp/EasyInstruct Video: https://youtu.be/rfQOWYfziFo Demo: https://huggingface.co/spaces/zjunlp/EasyInstruct

  14. arXiv:2402.02079  [pdf, other]

    cs.IR cs.AI

    Prototypical Contrastive Learning through Alignment and Uniformity for Recommendation

    Authors: Yangxun Ou, Lei Chen, Fenglin Pan, Yupeng Wu

    Abstract: Graph Collaborative Filtering (GCF), one of the most widely adopted recommendation system methods, effectively captures intricate relationships between user and item interactions. Graph Contrastive Learning (GCL) based GCF has gained significant attention as it leverages self-supervised techniques to extract valuable signals from real-world scenarios. However, many methods usually learn the instan…

    Submitted 3 February, 2024; originally announced February 2024.

  15. arXiv:2401.15496  [pdf, other]

    cs.CL cs.AI cs.LG

    Baichuan2-Sum: Instruction Finetune Baichuan2-7B Model for Dialogue Summarization

    Authors: Jianfei Xiao, Yancan Chen, Yimin Ou, Hanyi Yu, Kai Shu, Yiyong Xiao

    Abstract: Large language models (LLMs) like Llama, Baichuan and Bloom models show remarkable ability with instruction fine-tuning in many natural language tasks. Nevertheless, for the dialogue summarization task, which aims to generate summaries for different roles in dialogue, most of the state-of-the-art methods are conducted on small models (e.g., BART and BERT). Existing methods try to add task specified optimi…

    Submitted 3 April, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

  16. arXiv:2311.05371  [pdf, other]

    cs.CV cs.AI

    Training Robust Deep Physiological Measurement Models with Synthetic Video-based Data

    Authors: Yuxuan Ou, Yuzhe Zhang, Yuntang Wang, Shwetak Patel, Daniel McDuff, Yuzhe Yang, Xin Liu

    Abstract: Recent advances in supervised deep learning techniques have demonstrated the possibility to remotely measure human physiological vital signs (e.g., photoplethysmograph, heart rate) just from facial videos. However, the performance of these methods heavily relies on the availability and diversity of real labeled data. Yet, collecting large-scale real-world data with high-quality labels is typically…

    Submitted 15 November, 2023; v1 submitted 9 November, 2023; originally announced November 2023.

  17. arXiv:2310.20155  [pdf]

    physics.chem-ph cs.AI

    MLatom 3: Platform for machine learning-enhanced computational chemistry simulations and workflows

    Authors: Pavlo O. Dral, Fuchun Ge, Yi-Fan Hou, Peikun Zheng, Yuxinxin Chen, Mario Barbatti, Olexandr Isayev, Cheng Wang, Bao-Xin Xue, Max Pinheiro Jr, Yuming Su, Yiheng Dai, Yangtao Chen, Lina Zhang, Shuang Zhang, Arif Ullah, Quanhao Zhang, Yanchi Ou

    Abstract: Machine learning (ML) is increasingly becoming a common tool in computational chemistry. At the same time, the rapid development of ML methods requires a flexible software framework for designing custom workflows. MLatom 3 is a program package designed to leverage the power of ML to enhance typical computational chemistry simulations and to create complex workflows. This open-source package provid…

    Submitted 30 October, 2023; originally announced October 2023.

  18. arXiv:2310.13574  [pdf, other]

    eess.IV cs.CV cs.LG

    Progressive Dual Priori Network for Generalized Breast Tumor Segmentation

    Authors: Li Wang, Lihui Wang, Zixiang Kuai, Lei Tang, Yingfeng Ou, Chen Ye, Yuemin Zhu

    Abstract: To promote the generalization ability of breast tumor segmentation models, as well as to improve the segmentation performance for breast tumors with smaller size, low-contrast and irregular shape, we propose a progressive dual priori network (PDPNet) to segment breast tumors from dynamic enhanced magnetic resonance images (DCE-MRI) acquired at different centers. The PDPNet first cropped tumor regi…

    Submitted 16 June, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

    Comments: 14 pages, 12 figures

    Journal ref: IEEE Journal of Biomedical and Health Informatics, 2024

  19. arXiv:2310.09444  [pdf, other]

    cs.CV

    Tackling Heterogeneity in Medical Federated learning via Vision Transformers

    Authors: Erfan Darzi, Yiqing Shen, Yangming Ou, Nanna M. Sijtsema, P. M. A. van Ooijen

    Abstract: Optimization-based regularization methods have been effective in addressing the challenges posed by data heterogeneity in medical federated learning, particularly in improving the performance of underrepresented clients. However, these methods often lead to lower overall model accuracy and slower convergence rates. In this paper, we demonstrate that using Vision Transformers can substantially impr…

    Submitted 15 November, 2023; v1 submitted 13 October, 2023; originally announced October 2023.

  20. arXiv:2310.02031  [pdf, other]

    cs.CL cs.AI cs.CE cs.LG cs.RO

    OceanGPT: A Large Language Model for Ocean Science Tasks

    Authors: Zhen Bi, Ningyu Zhang, Yida Xue, Yixin Ou, Daxiong Ji, Guozhou Zheng, Huajun Chen

    Abstract: Ocean science, which delves into the oceans that are reservoirs of life and biodiversity, is of great significance given that oceans cover over 70% of our planet's surface. Recently, advances in Large Language Models (LLMs) have transformed the paradigm in science. Despite the success in other domains, current LLMs often fall short in catering to the needs of domain experts like oceanographers, an…

    Submitted 23 May, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: ACL2024. Project Website: https://oceangpt.zjukg.cn/

  21. arXiv:2307.04356  [pdf, other]

    cs.NE cs.CV

    InfLoR-SNN: Reducing Information Loss for Spiking Neural Networks

    Authors: Yufei Guo, Yuanpei Chen, Liwen Zhang, Xiaode Liu, Xinyi Tong, Yuanyuan Ou, Xuhui Huang, Zhe Ma

    Abstract: The Spiking Neural Network (SNN) has attracted more and more attention recently. It adopts binary spike signals to transmit information. Benefitting from the information passing paradigm of SNNs, the multiplications of activations and weights can be replaced by additions, which are more energy-efficient. However, its "Hard Reset" mechanism for the firing activity would ignore the difference among…

    Submitted 17 August, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

    Comments: Accepted by ECCV2022

  22. arXiv:2307.00583  [pdf, other]

    eess.IV cs.CV

    A region and category confidence-based multi-task network for carotid ultrasound image segmentation and classification

    Authors: Haitao Gan, Ran Zhou, Yanghan Ou, Furong Wang, Xinyao Cheng, Aaron Fenster

    Abstract: The segmentation and classification of carotid plaques in ultrasound images play important roles in the treatment of atherosclerosis and assessment for the risk of stroke. Although deep learning methods have been used for carotid plaque segmentation and classification, two-stage methods will increase the complexity of the overall analysis and the existing multi-task methods ignored the relationshi…

    Submitted 18 November, 2023; v1 submitted 2 July, 2023; originally announced July 2023.

  23. Sim-to-Real Surgical Robot Learning and Autonomous Planning for Internal Tissue Points Manipulation using Reinforcement Learning

    Authors: Yafei Ou, Mahdi Tavakoli

    Abstract: Indirect simultaneous positioning (ISP), where internal tissue points are placed at desired locations indirectly through the manipulation of boundary points, is a type of subtask frequently performed in robotic surgeries. Although challenging due to complex tissue dynamics, automating the task can potentially reduce the workload of surgeons. This paper presents a sim-to-real framework for learning…

    Submitted 24 June, 2023; originally announced June 2023.

    Comments: 8 pages, 8 figures

    Journal ref: IEEE Robotics and Automation Letters, vol. 8, no. 5, pp. 2502-2509, May 2023

  24. arXiv:2306.00526  [pdf, other]

    cs.CL cs.AI cs.CV

    Layout and Task Aware Instruction Prompt for Zero-shot Document Image Question Answering

    Authors: Wenjin Wang, Yunhao Li, Yixin Ou, Yin Zhang

    Abstract: Layout-aware pre-trained models have achieved significant progress on document image question answering. They introduce extra learnable modules into existing language models to capture layout information within document images from text bounding box coordinates obtained by OCR tools. However, extra modules necessitate pre-training on extensive document images. This prevents these methods from direc…

    Submitted 7 September, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: Add the LATIN-Tuning for Alpaca. Code is available at https://github.com/WenjinW/LATIN-Prompt

  25. arXiv:2305.13168  [pdf, other]

    cs.CL cs.AI cs.DB cs.IR cs.LG

    LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities

    Authors: Yuqi Zhu, Xiaohan Wang, Jing Chen, Shuofei Qiao, Yixin Ou, Yunzhi Yao, Shumin Deng, Huajun Chen, Ningyu Zhang

    Abstract: This paper presents an exhaustive quantitative and qualitative evaluation of Large Language Models (LLMs) for Knowledge Graph (KG) construction and reasoning. We engage in experiments across eight diverse datasets, focusing on four representative tasks encompassing entity and relation extraction, event extraction, link prediction, and question-answering, thereby thoroughly exploring LLMs' performa…

    Submitted 22 February, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Work in progress

  26. A Deep Registration Method for Accurate Quantification of Joint Space Narrowing Progression in Rheumatoid Arthritis

    Authors: Haolin Wang, Yafei Ou, Wanxuan Fang, Prasoon Ambalathankandy, Naoto Goto, Gen Ota, Masayuki Ikebe, Tamotsu Kamishima

    Abstract: Rheumatoid arthritis (RA) is a chronic autoimmune inflammatory disease that results in progressive articular destruction and severe disability. Joint space narrowing (JSN) progression has been regarded as an important indicator for RA progression and has received sustained attention. In the diagnosis and monitoring of RA, radiology plays a crucial role to monitor joint space. A new framework for m…

    Submitted 26 April, 2023; originally announced April 2023.

    Comments: 11 pages, 9 figures, 7 tables

    MSC Class: 68T45 ACM Class: I.4

  27. arXiv:2304.09324  [pdf, other]

    eess.IV cs.CV

    Computer-Vision Benchmark Segment-Anything Model (SAM) in Medical Images: Accuracy in 12 Datasets

    Authors: Sheng He, Rina Bao, Jingpeng Li, Jeffrey Stout, Atle Bjornerud, P. Ellen Grant, Yangming Ou

    Abstract: Background: The segment-anything model (SAM), introduced in April 2023, shows promise as a benchmark model and a universal solution to segment various natural images. It comes without previously-required re-training or fine-tuning specific to each new dataset. Purpose: To test SAM's accuracy in various medical image segmentation tasks and investigate potential factors that may affect its accurac…

    Submitted 5 May, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: Technical Report

  28. arXiv:2304.08915  [pdf, other]

    cs.NE cs.LG

    Differentiable Genetic Programming for High-dimensional Symbolic Regression

    Authors: Peng Zeng, Xiaotian Song, Andrew Lensen, Yuwei Ou, Yanan Sun, Mengjie Zhang, Jiancheng Lv

    Abstract: Symbolic regression (SR) is the process of discovering hidden relationships from data with mathematical expressions, which is considered an effective way to reach interpretable machine learning (ML). Genetic programming (GP) has been the dominator in solving SR problems. However, as the scale of SR problems increases, GP often poorly demonstrates and cannot effectively address the real-world high-…

    Submitted 18 April, 2023; originally announced April 2023.

  29. arXiv:2304.01401  [pdf, other]

    eess.IV cs.CV

    U-Netmer: U-Net meets Transformer for medical image segmentation

    Authors: Sheng He, Rina Bao, P. Ellen Grant, Yangming Ou

    Abstract: The combination of the U-Net based deep learning models and Transformer is a new trend for medical image segmentation. U-Net can extract the detailed local semantic and texture information and Transformer can learn the long-range dependencies among pixels in the input image. However, directly adapting the Transformer for segmentation has a "token-flatten" problem (flattens the local patches into 1D…

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: 10 pages, 5 figures, under review

  30. arXiv:2303.16434  [pdf, other]

    cs.AI cs.CL

    TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

    Authors: Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, Yu Liu, Yang Ou, Shuai Lu, Lei Ji, Shaoguang Mao, Yun Wang, Linjun Shou, Ming Gong, Nan Duan

    Abstract: Artificial Intelligence (AI) has made incredible progress recently. On the one hand, advanced foundation models like ChatGPT can offer powerful conversation, in-context learning and code generation abilities on a broad range of open-domain tasks. They can also generate high-level solution outlines for domain-specific tasks based on the common sense knowledge they have acquired. However, they still…

    Submitted 28 March, 2023; originally announced March 2023.

  31. arXiv:2301.00503  [pdf, other]

    cs.CL cs.AI cs.DB cs.IR cs.LG

    A Concept Knowledge Graph for User Next Intent Prediction at Alipay

    Authors: Yacheng He, Qianghuai Jia, Lin Yuan, Ruopeng Li, Yixin Ou, Ningyu Zhang

    Abstract: This paper illustrates the technologies of user next intent prediction with a concept knowledge graph. The system has been deployed on the Web at Alipay, serving more than 100 million daily active users. To explicitly characterize user intent, we propose AlipayKG, which is an offline concept knowledge graph in the Life-Service domain modeling the historical behaviors of users, the rich content int…

    Submitted 14 March, 2023; v1 submitted 1 January, 2023; originally announced January 2023.

    Comments: Accepted by WWW 2023 poster

  32. arXiv:2212.14049  [pdf, other]

    cs.LG cs.AI cs.CR

    Differentiable Search of Accurate and Robust Architectures

    Authors: Yuwei Ou, Xiangning Xie, Shangce Gao, Yanan Sun, Kay Chen Tan, Jiancheng Lv

    Abstract: Deep neural networks (DNNs) are found to be vulnerable to adversarial attacks, and various methods have been proposed for the defense. Among these methods, adversarial training has been drawing increasing attention because of its simplicity and effectiveness. However, the performance of the adversarial training is greatly limited by the architectures of target DNNs, which often makes the resulting…

    Submitted 2 January, 2023; v1 submitted 28 December, 2022; originally announced December 2022.

  33. arXiv:2212.09597  [pdf, other]

    cs.CL cs.AI cs.CV cs.IR cs.LG

    Reasoning with Language Model Prompting: A Survey

    Authors: Shuofei Qiao, Yixin Ou, Ningyu Zhang, Xiang Chen, Yunzhi Yao, Shumin Deng, Chuanqi Tan, Fei Huang, Huajun Chen

    Abstract: Reasoning, as an essential ability for complex problem-solving, can provide back-end support for various real-world applications, such as medical diagnosis, negotiation, etc. This paper provides a comprehensive survey of cutting-edge research on reasoning with language model prompting. We introduce research works with comparisons and summaries and provide systematic resources to help beginners. We…

    Submitted 18 September, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: ACL 2023, 24 pages, add references of theoretical analysis

  34. arXiv:2212.09206  [pdf, other]

    eess.IV cs.CV

    Segmentation Ability Map: Interpret deep features for medical image segmentation

    Authors: Sheng He, Yanfang Feng, P. Ellen Grant, Yangming Ou

    Abstract: Deep convolutional neural networks (CNNs) have been widely used for medical image segmentation. In most studies, only the output layer is exploited to compute the final segmentation results and the hidden representations of the deep learned features have not been well understood. In this paper, we propose a prototype segmentation (ProtoSeg) method to compute a binary segmentation map based on deep…

    Submitted 18 December, 2022; originally announced December 2022.

    Journal ref: Medical Image Analysis, 2023

  35. arXiv:2212.08883  [pdf, other]

    cs.LG cs.CV

    Modeling Global Distribution for Federated Learning with Label Distribution Skew

    Authors: Tao Sheng, Chengchao Shen, Yuan Liu, Yeyu Ou, Zhe Qu, Jianxin Wang

    Abstract: Federated learning achieves joint training of deep models by connecting decentralized data sources, which can significantly mitigate the risk of privacy leakage. However, in a more general case, the distributions of labels among clients are different, called "label distribution skew". Directly applying conventional federated learning without consideration of label distribution skew issue signifi…

    Submitted 17 December, 2022; originally announced December 2022.

  36. arXiv:2208.10496  [pdf, ps, other]

    cs.LG cs.AI cs.NI

    Representation Learning of Knowledge Graph for Wireless Communication Networks

    Authors: Shiwen He, Yeyu Ou, Liangpeng Wang, Hang Zhan, Peng Ren, Yongming Huang

    Abstract: With the application of the fifth-generation wireless communication technologies, more smart terminals are being used and generating huge amounts of data, which has prompted extensive research on how to handle and utilize these wireless data. Researchers currently focus on the research on the upper-layer application data or studying the intelligent transmission methods concerning a specific proble…

    Submitted 22 August, 2022; originally announced August 2022.

  37. arXiv:2206.01741  [pdf, other]

    eess.IV cs.CV

    Patcher: Patch Transformers with Mixture of Experts for Precise Medical Image Segmentation

    Authors: Yanglan Ou, Ye Yuan, Xiaolei Huang, Stephen T. C. Wong, John Volpi, James Z. Wang, Kelvin Wong

    Abstract: We present a new encoder-decoder Vision Transformer architecture, Patcher, for medical image segmentation. Unlike standard Vision Transformers, it employs Patcher blocks that segment an image into large patches, each of which is further divided into small patches. Transformers are applied to the small patches within a large patch, which constrains the receptive field of each pixel. We intentionall…

    Submitted 29 May, 2023; v1 submitted 3 June, 2022; originally announced June 2022.

    Comments: MICCAI 2022

  38. arXiv:2205.12646  [pdf, other]

    cs.CV cs.AI

    UniInst: Unique Representation for End-to-End Instance Segmentation

    Authors: Yimin Ou, Rui Yang, Lufan Ma, Yong Liu, Jiangpeng Yan, Shang Xu, Chengjie Wang, Xiu Li

    Abstract: Existing instance segmentation methods have achieved impressive performance but still suffer from a common dilemma: redundant representations (e.g., multiple boxes, grids, and anchor points) are inferred for one instance, which leads to multiple duplicated predictions. Thus, mainstream methods usually rely on a hand-designed non-maximum suppression (NMS) post-processing step to select the optimal…

    Submitted 17 September, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: This paper will appear in Neurocomputing. Code: https://github.com/b03505036/UniInst

  39. A Sub-pixel Accurate Quantification of Joint Space Narrowing Progression in Rheumatoid Arthritis

    Authors: Yafei Ou, Prasoon Ambalathankandy, Ryunosuke Furuya, Seiya Kawada, Tianyu Zeng, Yujie An, Tamotsu Kamishima, Kenichi Tamura, Masayuki Ikebe

    Abstract: Rheumatoid arthritis (RA) is a chronic autoimmune disease that primarily affects peripheral synovial joints, like fingers, wrist and feet. Radiology plays a critical role in the diagnosis and monitoring of RA. Limited by the current spatial resolution of radiographic imaging, joint space narrowing (JSN) progression of RA with the same reason above can be less than one pixel per year with universal…

    Submitted 1 November, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

  40. arXiv:2204.11621  [pdf, other]

    cs.RO

    Semantic Geometric Fusion Multi-object Tracking and Lidar Odometry in Dynamic Environment

    Authors: Tingchen Ma, Yongsheng Ou

    Abstract: The SLAM system based on static scene assumption will introduce huge estimation errors when moving objects appear in the field of view. This paper proposes a novel multi-object dynamic lidar odometry (MLO) based on semantic object detection technology to solve this problem. The MLO system can provide reliable localization of robot and semantic objects and build long-term static maps in complex dyn…

    Submitted 2 March, 2023; v1 submitted 25 April, 2022; originally announced April 2022.

  41. arXiv:2204.06598  [pdf, other]

    cs.CV

    Deep Relation Learning for Regression and Its Application to Brain Age Estimation

    Authors: Sheng He, Yanfang Feng, P. Ellen Grant, Yangming Ou

    Abstract: Most deep learning models for temporal regression directly output the estimation based on single input images, ignoring the relationships between different images. In this paper, we propose deep relation learning for regression, aiming to learn different relations between a pair of input images. Four non-linear relations are considered: "cumulative relation", "relative relation", "maximal relation…

    Submitted 13 April, 2022; originally announced April 2022.

    Journal ref: IEEE Transactions on Medical Imaging. 2022

  42. arXiv:2109.03220

    cs.LG

    Revisiting Recursive Least Squares for Training Deep Neural Networks

    Authors: Chunyuan Zhang, Qi Song, Hui Zhou, Yigui Ou, Hongyao Deng, Laurence Tianruo Yang

    Abstract: Recursive least squares (RLS) algorithms were once widely used for training small-scale neural networks, due to their fast convergence. However, previous RLS algorithms are unsuitable for training deep neural networks (DNNs), since they have high computational complexity and too many preconditions. In this paper, to overcome these drawbacks, we propose three novel RLS optimization algorithms for t…

    Submitted 7 September, 2021; originally announced September 2021.

    Comments: 12 pages, 5 figures, IEEE Transactions on Neural Networks and Learning Systems under review

    MSC Class: 68T07 ACM Class: K.3.2

  43. arXiv:2109.01663

    cs.CV eess.IV

    Global-Local Transformer for Brain Age Estimation

    Authors: Sheng He, P. Ellen Grant, Yangming Ou

    Abstract: Deep learning can provide rapid brain age estimation based on brain magnetic resonance imaging (MRI). However, most studies use one neural network to extract the global information from the whole input image, ignoring the local fine-grained details. In this paper, we propose a global-local transformer, which consists of a global-pathway to extract the global-context information from the whole inpu…

    Submitted 2 September, 2021; originally announced September 2021.

    Comments: To appear: IEEE Transactions on Medical Imaging

  44. arXiv:2107.03029

    cs.IT

    An Overview on the Application of Graph Neural Networks in Wireless Networks

    Authors: S. He, S. Xiong, Y. Ou, J. Zhang, J. Wang, Y. Huang, Y. Zhang

    Abstract: In recent years, with the rapid enhancement of computing power, deep learning methods have been widely applied in wireless networks and achieved impressive performance. To effectively exploit the information of graph-structured data as well as contextual information, graph neural networks (GNNs) have been introduced to address a series of optimization problems of wireless networks. In this overvie…

    Submitted 17 November, 2021; v1 submitted 7 July, 2021; originally announced July 2021.

    Comments: 17 pages, 14 figures, submitted to the IEEE Open Journal of the Communications Society. Manuscript received Jul. 29, 2021; revised Sept. 15, 2021; Oct. 25, 2021; accepted Nov. 12, 2021

  45. arXiv:2107.01181

    cs.CV cs.AI

    Visual Relationship Forecasting in Videos

    Authors: Li Mi, Yangjun Ou, Zhenzhong Chen

    Abstract: Real-world scenarios often require the anticipation of object interactions in an unknown future, which would assist the decision-making process of both humans and agents. To meet this challenge, we present a new task named Visual Relationship Forecasting (VRF) in videos to explore the prediction of visual relationships in a reasoning manner. Specifically, given a subject-object pair with H existing f…

    Submitted 2 July, 2021; originally announced July 2021.

  46. arXiv:2104.13917

    eess.IV cs.CV

    LambdaUNet: 2.5D Stroke Lesion Segmentation of Diffusion-weighted MR Images

    Authors: Yanglan Ou, Ye Yuan, Xiaolei Huang, Kelvin Wong, John Volpi, James Z. Wang, Stephen T. C. Wong

    Abstract: Diffusion-weighted (DW) magnetic resonance imaging is essential for the diagnosis and treatment of ischemic stroke. DW images (DWIs) are usually acquired in multi-slice settings where lesion areas in two consecutive 2D slices are highly discontinuous due to large slice thickness and sometimes even slice gaps. Therefore, although DWIs contain rich 3D information, they cannot be treated as regular 3…

    Submitted 29 May, 2023; v1 submitted 28 April, 2021; originally announced April 2021.

  47. arXiv:2104.13763

    cs.CV

    LGA-RCNN: Loss-Guided Attention for Object Detection

    Authors: Xin Yi, Jiahao Wu, Bo Ma, Yangtong Ou, Longyao Liu

    Abstract: Object detection is widely studied in the computer vision field. In recent years, several representative deep learning based detection methods, along with solid benchmarks, have been proposed, boosting the development of related research. However, existing detection methods still suffer from undesirable performance under challenges such as camouflage, blur, inter-class similarity, intra-class variance an…

    Submitted 12 May, 2021; v1 submitted 28 April, 2021; originally announced April 2021.

  48. arXiv:2104.09452

    stat.ML cs.LG stat.AP stat.ME

    Epsilon Consistent Mixup: Structural Regularization with an Adaptive Consistency-Interpolation Tradeoff

    Authors: Vincent Pisztora, Yanglan Ou, Xiaolei Huang, Francesca Chiaromonte, Jia Li

    Abstract: In this paper we propose $ε$-Consistent Mixup ($ε$mu). $ε$mu is a data-based structural regularization technique that combines Mixup's linear interpolation with consistency regularization in the Mixup direction, by compelling a simple adaptive tradeoff between the two. This learnable combination of consistency and interpolation induces a more flexible structure on the evolution of the response acr…

    Submitted 29 September, 2021; v1 submitted 19 April, 2021; originally announced April 2021.

  49. arXiv:2104.06472

    cs.IT

    Mitigating Hand Blockage with Non-Directional Beamforming Codebooks

    Authors: Vasanthan Raghavan, Ricardo A. Motos, M. Ali Tassoudji, Yu-Chin Ou, Ozge H. Koymen, Junyi Li

    Abstract: Hand blockage leads to significant performance impairments at millimeter wave carrier frequencies. A number of prior works have characterized the loss in signal strength with the hand using studies with horn antennas and form-factor user equipments (UEs). However, the impact of the hand on the effective phase response seen by the antenna elements has not been studied so far. Towards this goal, we…

    Submitted 13 April, 2021; originally announced April 2021.

    Comments: 28 pages, 8 figures

  50. arXiv:2103.15316

    cs.CL cs.AI cs.LG

    Whitening Sentence Representations for Better Semantics and Faster Retrieval

    Authors: Jianlin Su, Jiarun Cao, Weijie Liu, Yangyiwen Ou

    Abstract: Pre-training models such as BERT have achieved great success in many natural language processing tasks. However, how to obtain better sentence representations from these pre-training models is still worth exploring. Previous work has shown that the anisotropy problem is a critical bottleneck for BERT-based sentence representation, which hinders the model from fully utilizing the underlying semanti…

    Submitted 28 March, 2021; originally announced March 2021.

    Comments: The source code of this paper is available at https://github.com/bojone/BERT-whitening