Showing 1–50 of 78 results for author: Zou, K

  1. arXiv:2406.18321  [pdf, other]

    cs.CL cs.AI

    MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data

    Authors: Meng Fang, Xiangpeng Wan, Fei Lu, Fei Xing, Kai Zou

    Abstract: Large language models (LLMs) have significantly advanced natural language understanding and demonstrated strong problem-solving abilities. Despite these successes, most LLMs still struggle with solving mathematical problems due to the intricate reasoning required. This paper investigates the mathematical problem-solving capabilities of LLMs using the newly developed "MathOdyssey" dataset. The data…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2406.16942  [pdf, other]

    eess.IV cs.AI cs.CV

    Enhancing Diagnostic Reliability of Foundation Model with Uncertainty Estimation in OCT Images

    Authors: Yuanyuan Peng, Aidi Lin, Meng Wang, Tian Lin, Ke Zou, Yinglin Cheng, Tingkun Shi, Xulong Liao, Lixia Feng, Zhen Liang, Xinjian Chen, Huazhu Fu, Haoyu Chen

    Abstract: Inability to express the confidence level and detect unseen classes has limited the clinical implementation of artificial intelligence in the real-world. We developed a foundation model with uncertainty estimation (FMUE) to detect 11 retinal conditions on optical coherence tomography (OCT). In the internal test set, FMUE achieved a higher F1 score of 96.76% than two state-of-the-art algorithms, RE…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: All codes are available at https://github.com/yuanyuanpeng0129/FMUE

  3. arXiv:2406.12479  [pdf, other]

    cs.CV cs.AI

    RS-GPT4V: A Unified Multimodal Instruction-Following Dataset for Remote Sensing Image Understanding

    Authors: Linrui Xu, Ling Zhao, Wang Guo, Qiujun Li, Kewang Long, Kaiqi Zou, Yuhan Wang, Haifeng Li

    Abstract: The remote sensing image intelligence understanding model is undergoing a new profound paradigm shift which has been promoted by multi-modal large language model (MLLM), i.e. from the paradigm learning a domain model (LaDM) shifts to paradigm learning a pre-trained general foundation model followed by an adaptive domain model (LaGD). Under the new LaGD paradigm, the old datasets, which have led to…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 14 pages, 6 figures, 4 tables

  4. arXiv:2406.09317  [pdf, other]

    eess.IV cs.CV

    Common and Rare Fundus Diseases Identification Using Vision-Language Foundation Model with Knowledge of Over 400 Diseases

    Authors: Meng Wang, Tian Lin, Aidi Lin, Kai Yu, Yuanyuan Peng, Lianyu Wang, Cheng Chen, Ke Zou, Huiyu Liang, Man Chen, Xue Yao, Meiqin Zhang, Binwei Huang, Chaoxin Zheng, Peixin Zhang, Wei Chen, Yilong Luo, Yifan Chen, Honghe Xia, Tingkun Shi, Qi Zhang, Jinming Guo, Xiaolin Chen, Jingcheng Wang, Yih Chung Tham, et al. (24 additional authors not shown)

    Abstract: Previous foundation models for retinal images were pre-trained with limited disease categories and knowledge base. Here we introduce RetiZero, a vision-language foundation model that leverages knowledge from over 400 fundus diseases. To RetiZero's pre-training, we compiled 341,896 fundus images paired with text descriptions, sourced from public datasets, ophthalmic literature, and online resources…

    Submitted 30 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  5. arXiv:2406.08835  [pdf, other]

    cs.SD eess.AS

    A Single-Step Non-Autoregressive Automatic Speech Recognition Architecture with High Accuracy and Inference Speed

    Authors: Ziyang Zhuang, Chenfeng Miao, Kun Zou, Shuai Gong, Ming Fang, Tao Wei, Zijian Li, Wei Hu, Shaojun Wang, Jing Xiao

    Abstract: Non-autoregressive (NAR) automatic speech recognition (ASR) models predict tokens independently and simultaneously, bringing high inference speed. However, there is still a gap in the accuracy of the NAR models compared to the autoregressive (AR) models. To further narrow the gap between the NAR and AR models, we propose a single-step NAR ASR architecture with high accuracy and inference speed, ca…

    Submitted 13 June, 2024; originally announced June 2024.

  6. arXiv:2405.18167  [pdf, other]

    eess.IV cs.CV

    Confidence-aware multi-modality learning for eye disease screening

    Authors: Ke Zou, Tian Lin, Zongbo Han, Meng Wang, Xuedong Yuan, Haoyu Chen, Changqing Zhang, Xiaojing Shen, Huazhu Fu

    Abstract: Multi-modal ophthalmic image classification plays a key role in diagnosing eye diseases, as it integrates information from different sources to complement their respective performances. However, recent improvements have mainly focused on accuracy, often neglecting the importance of confidence and robustness in predictions for diverse modalities. In this study, we propose a novel multi-modality evi…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 27 pages, 7 figures, 9 tables

  7. arXiv:2405.16910  [pdf, other]

    cond-mat.supr-con cond-mat.str-el

    Temperature evolution of the Fermi surface of the FeSe monolayer on STO

    Authors: Khalil Zakeri, Ryan Roemer, Ke Zou

    Abstract: The origin of superconductivity in FeSe monolayer on SrTiO$_3$ belongs to one of the unresolved mysteries in condensed-matter physics. Here by investigation of the temperature evolution of the dynamic charge response of FeSe/SrTiO$_3$ we demonstrate that the response of the monolayer itself is nearly temperature independent. This indicates a constant Fermi surface over a wide range of temperature,…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 7 Pages, 3 Figures

  8. arXiv:2405.16573  [pdf, other]

    cs.CV

    FRCNet Frequency and Region Consistency for Semi-supervised Medical Image Segmentation

    Authors: Along He, Tao Li, Yanlin Wu, Ke Zou, Huazhu Fu

    Abstract: Limited labeled data hinder the application of deep learning in medical domain. In clinical practice, there are sufficient unlabeled data that are not effectively used, and semi-supervised learning (SSL) is a promising way for leveraging these unlabeled data. However, existing SSL methods ignore frequency domain and region-level information and it is important for lesion regions located at low fre…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: MICCAI 2024 Early Accept

  9. arXiv:2405.16102  [pdf, other]

    eess.IV cs.CV

    Reliable Source Approximation: Source-Free Unsupervised Domain Adaptation for Vestibular Schwannoma MRI Segmentation

    Authors: Hongye Zeng, Ke Zou, Zhihao Chen, Rui Zheng, Huazhu Fu

    Abstract: Source-Free Unsupervised Domain Adaptation (SFUDA) has recently become a focus in the medical image domain adaptation, as it only utilizes the source model and does not require annotated target data. However, current SFUDA approaches cannot tackle the complex segmentation task across different MRI sequences, such as the vestibular schwannoma segmentation. To address this problem, we proposed Relia…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: Early accepted by MICCAI 2024

  10. arXiv:2405.04294  [pdf, other]

    cs.AI

    Enhancing the Efficiency and Accuracy of Underlying Asset Reviews in Structured Finance: The Application of Multi-agent Framework

    Authors: Xiangpeng Wan, Haicheng Deng, Kai Zou, Shiqi Xu

    Abstract: Structured finance, which involves restructuring diverse assets into securities like MBS, ABS, and CDOs, enhances capital market efficiency but presents significant due diligence challenges. This study explores the integration of artificial intelligence (AI) with traditional asset review processes to improve efficiency and accuracy in structured finance. Using both open-sourced and close-sourced l…

    Submitted 7 May, 2024; originally announced May 2024.

  11. arXiv:2404.06798  [pdf, other]

    cs.CV

    MedRG: Medical Report Grounding with Multi-modal Large Language Model

    Authors: Ke Zou, Yang Bai, Zhihao Chen, Yang Zhou, Yidi Chen, Kai Ren, Meng Wang, Xuedong Yuan, Xiaojing Shen, Huazhu Fu

    Abstract: Medical Report Grounding is pivotal in identifying the most relevant regions in medical images based on a given phrase query, a critical aspect in medical image analysis and radiological diagnosis. However, prevailing visual grounding approaches necessitate the manual extraction of key phrases from medical reports, imposing substantial burdens on both system efficiency and physicians. In this pape…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 12 pages, 4 figures

  12. arXiv:2403.18388  [pdf, other]

    cs.AI cs.CV

    FTBC: Forward Temporal Bias Correction for Optimizing ANN-SNN Conversion

    Authors: Xiaofeng Wu, Velibor Bojkovic, Bin Gu, Kun Suo, Kai Zou

    Abstract: Spiking Neural Networks (SNNs) offer a promising avenue for energy-efficient computing compared with Artificial Neural Networks (ANNs), closely mirroring biological neural processes. However, this potential comes with inherent challenges in directly training SNNs through spatio-temporal backpropagation -- stemming from the temporal dynamics of spiking neurons and their discrete signal processing -…

    Submitted 27 March, 2024; originally announced March 2024.

  13. arXiv:2402.11211  [pdf, other]

    eess.IV cs.CV

    Training-free image style alignment for self-adapting domain shift on handheld ultrasound devices

    Authors: Hongye Zeng, Ke Zou, Zhihao Chen, Yuchong Gao, Hongbo Chen, Haibin Zhang, Kang Zhou, Meng Wang, Rick Siow Mong Goh, Yong Liu, Chang Jiang, Rui Zheng, Huazhu Fu

    Abstract: Handheld ultrasound devices face usage limitations due to user inexperience and cannot benefit from supervised deep learning without extensive expert annotations. Moreover, the models trained on standard ultrasound device data are constrained by training data distribution and perform poorly when directly applied to handheld device data. In this study, we propose the Training-free Image Style Align…

    Submitted 17 February, 2024; originally announced February 2024.

  14. arXiv:2401.07502  [pdf, other]

    cs.CV

    Compositional Oil Spill Detection Based on Object Detector and Adapted Segment Anything Model from SAR Images

    Authors: Wenhui Wu, Man Sing Wong, Xinyu Yu, Guoqiang Shi, Coco Yin Tung Kwok, Kang Zou

    Abstract: Semantic segmentation-based methods have attracted extensive attention in oil spill detection from SAR images. However, the existing approaches require a large number of finely annotated segmentation samples in the training stage. To alleviate this issue, we propose a composite oil spill detection framework, SAM-OIL, comprising an object detector (e.g., YOLOv8), an adapted Segment Anything Model (…

    Submitted 15 January, 2024; originally announced January 2024.

    Comments: 5 pages, 4 figures

  15. arXiv:2312.03042  [pdf, other]

    cs.CL cs.AI

    Inherent limitations of LLMs regarding spatial information

    Authors: He Yan, Xinyao Hu, Xiangpeng Wan, Chengyu Huang, Kai Zou, Shiqi Xu

    Abstract: Despite the significant advancements in natural language processing capabilities demonstrated by large language models such as ChatGPT, their proficiency in comprehending and processing spatial information, especially within the domains of 2D and 3D route planning, remains notably underdeveloped. This paper investigates the inherent limitations of ChatGPT and similar models in spatial reasoning an…

    Submitted 5 December, 2023; originally announced December 2023.

  16. arXiv:2310.18827  [pdf, other]

    cs.CL cs.AI

    All Things Considered: Detecting Partisan Events from News Media with Cross-Article Comparison

    Authors: Yujian Liu, Xinliang Frederick Zhang, Kaijian Zou, Ruihong Huang, Nick Beauchamp, Lu Wang

    Abstract: Public opinion is shaped by the information news media provide, and that information in turn may be shaped by the ideological preferences of media outlets. But while much attention has been devoted to media bias via overt ideological language or topic selection, a more unobtrusive way in which the media shape opinion is via the strategic inclusion or omission of partisan events that may support on…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: EMNLP'23 Main Conference

  17. arXiv:2310.18768  [pdf, other]

    cs.CL

    Crossing the Aisle: Unveiling Partisan and Counter-Partisan Events in News Reporting

    Authors: Kaijian Zou, Xinliang Frederick Zhang, Winston Wu, Nick Beauchamp, Lu Wang

    Abstract: News media is expected to uphold unbiased reporting. Yet they may still affect public opinion by selectively including or omitting events that support or contradict their ideological positions. Prior work in NLP has only studied media bias via linguistic style and word usage. In this paper, we study to which degree media balances news reporting and affects consumers through event inclusion or omis…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: EMNLP'23 Findings

  18. arXiv:2310.13800  [pdf, other]

    cs.CL

    Evaluation Metrics in the Era of GPT-4: Reliably Evaluating Large Language Models on Sequence to Sequence Tasks

    Authors: Andrea Sottana, Bin Liang, Kai Zou, Zheng Yuan

    Abstract: Large Language Models (LLMs) evaluation is a patchy and inconsistent landscape, and it is becoming clear that the quality of automatic evaluation metrics is not keeping up with the pace of development of generative models. We aim to improve the understanding of current models' performance by providing a preliminary and hybrid evaluation on a range of open and closed-source generative LLMs on three…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP 2023

  19. arXiv:2310.12111  [pdf, other]

    eess.AS cs.AI

    DASA: Difficulty-Aware Semantic Augmentation for Speaker Verification

    Authors: Yuanyuan Wang, Yang Zhang, Zhiyong Wu, Zhihan Yang, Tao Wei, Kun Zou, Helen Meng

    Abstract: Data augmentation is vital to the generalization ability and robustness of deep neural networks (DNNs) models. Existing augmentation methods for speaker verification manipulate the raw signal, which are time-consuming and the augmented samples lack diversity. In this paper, we present a novel difficulty-aware semantic augmentation (DASA) approach for speaker verification, which can generate divers…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: Accepted by ICASSP 2023

  20. arXiv:2310.03170  [pdf]

    cond-mat.supr-con

    Critical Role of Disorder for Superconductivity in the Series of Epitaxial Ti(O,N) Films

    Authors: Fengmiao Li, Oliver Dicks, Myung-Geun Han, Solveig Aamlid, Giorgio Levy, Ronny Sutarto, Chong Liu, Hsiang-Hsi Kung, Oleksandr Foyevstov, Simon Godin, Bruce A. Davidson, Andrea Damascelli, Yimei Zhu, Christoph Heil, Ilya Elfimov, George A. Sawatzky, Ke Zou

    Abstract: Experimental manipulation of superconductivity is of paramount importance, not only for practical applications but also for identifying the key factors involved in electron pairing. In this work, we have undertaken a meticulous study of the superconductivity in a series of titanium compounds with a rocksalt structure, synthesized as epitaxial films. We find that substituting nitrogen (N) for oxyge…

    Submitted 4 October, 2023; originally announced October 2023.

  21. arXiv:2308.11213  [pdf]

    physics.optics

    Temporally and Longitudinally Tailored Dynamic Space-Time Wave Packets

    Authors: Xinzhou Su, Kaiheng Zou, Huibin Zhou, Hao Song, Yingning Wang, Ruoyu Zeng, Zile Jiang, Yuxiang Duan, Maxim Karpov, Tobias J. Kippenberg, Moshe Tur, Demetrios N. Christodoulides, Alan E. Willner

    Abstract: In general, space-time wave packets with correlations between transverse spatial fields and temporal frequency spectra can lead to unique spatiotemporal dynamics, thus enabling control of the instantaneous light properties. However, spatiotemporal dynamics generated in previous approaches manifest themselves at a given propagation distance yet not arbitrarily tailored longitudinally. Here, we prop…

    Submitted 22 August, 2023; originally announced August 2023.

  22. arXiv:2307.04981  [pdf, other]

    cs.CV

    A Multi-view Impartial Decision Network for Frontotemporal Dementia Diagnosis

    Authors: Guoyao Deng, Ke Zou, Meng Wang, Xuedong Yuan, Sancong Ying, Huazhu Fu

    Abstract: Frontotemporal Dementia (FTD) diagnosis has been successfully progress using deep learning techniques. However, current FTD identification methods suffer from two limitations. Firstly, they do not exploit the potential of multi-view functional magnetic resonance imaging (fMRI) for classifying FTD. Secondly, they do not consider the reliability of the multi-view FTD diagnosis. To address these limi…

    Submitted 10 July, 2023; originally announced July 2023.

  23. arXiv:2307.04973  [pdf, other]

    cs.CV

    SAM-U: Multi-box prompts triggered uncertainty estimation for reliable SAM in medical image

    Authors: Guoyao Deng, Ke Zou, Kai Ren, Meng Wang, Xuedong Yuan, Sancong Ying, Huazhu Fu

    Abstract: Recently, Segmenting Anything has taken an important step towards general artificial intelligence. At the same time, its reliability and fairness have also attracted great attention, especially in the field of health care. In this study, we propose multi-box prompts triggered uncertainty estimation for SAM cues to demonstrate the reliability of segmented lesions or tissues. We estimate the distrib…

    Submitted 10 July, 2023; originally announced July 2023.

  24. arXiv:2304.03981  [pdf, other]

    cs.LG cs.CV

    Uncertainty-inspired Open Set Learning for Retinal Anomaly Identification

    Authors: Meng Wang, Tian Lin, Lianyu Wang, Aidi Lin, Ke Zou, Xinxing Xu, Yi Zhou, Yuanyuan Peng, Qingquan Meng, Yiming Qian, Guoyao Deng, Zhiqun Wu, Junhong Chen, Jianhong Lin, Mingzhi Zhang, Weifang Zhu, Changqing Zhang, Daoqiang Zhang, Rick Siow Mong Goh, Yong Liu, Chi Pui Pang, Xinjian Chen, Haoyu Chen, Huazhu Fu

    Abstract: Failure to recognize samples from the classes unseen during training is a major limitation of artificial intelligence in the real-world implementation for recognition and classification of retinal anomalies. We established an uncertainty-inspired open-set (UIOS) model, which was trained with fundus images of 9 retinal conditions. Besides assessing the probability of each category, UIOS also calcul…

    Submitted 29 August, 2023; v1 submitted 8 April, 2023; originally announced April 2023.

  25. 4D Facial Expression Diffusion Model

    Authors: Kaifeng Zou, Sylvain Faisan, Boyang Yu, Sébastien Valette, Hyewon Seo

    Abstract: Facial expression generation is one of the most challenging and long-sought aspects of character animation, with many interesting applications. The challenging task, traditionally having relied heavily on digital craftspersons, remains yet to be explored. In this paper, we introduce a generative framework for generating 3D facial expression sequences (i.e. 4D faces) that can be conditioned on diff…

    Submitted 15 April, 2024; v1 submitted 29 March, 2023; originally announced March 2023.

  26. Federated Uncertainty-Aware Aggregation for Fundus Diabetic Retinopathy Staging

    Authors: Meng Wang, Lianyu Wang, Xinxing Xu, Ke Zou, Yiming Qian, Rick Siow Mong Goh, Yong Liu, Huazhu Fu

    Abstract: Deep learning models have shown promising performance in the field of diabetic retinopathy (DR) staging. However, collaboratively training a DR staging model across multiple institutions remains a challenge due to non-iid data, client reliability, and confidence evaluation of the prediction. To address these issues, we propose a novel federated uncertainty-aware aggregation paradigm (FedUAA), whic…

    Submitted 22 July, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Report number: 978-3-031-43894-3

    Journal ref: Medical Image Computing and Computer Assisted Intervention(MICCAI 2023)

  27. arXiv:2303.10049  [pdf, other]

    cs.CV

    Uncertainty-informed Mutual Learning for Joint Medical Image Classification and Segmentation

    Authors: Kai Ren, Ke Zou, Xianjie Liu, Yidi Chen, Xuedong Yuan, Xiaojing Shen, Meng Wang, Huazhu Fu

    Abstract: Classification and segmentation are crucial in medical image analysis as they enable accurate diagnosis and disease monitoring. However, current methods often prioritize the mutual learning features and shared model parameters, while neglecting the reliability of features and performances. In this paper, we propose a novel Uncertainty-informed Mutual Learning (UML) framework for reliable and inter…

    Submitted 2 August, 2023; v1 submitted 17 March, 2023; originally announced March 2023.

    Comments: 13 pages

  28. arXiv:2303.09790  [pdf, other]

    eess.IV cs.CV

    Reliable Multimodality Eye Disease Screening via Mixture of Student's t Distributions

    Authors: Ke Zou, Tian Lin, Xuedong Yuan, Haoyu Chen, Xiaojing Shen, Meng Wang, Huazhu Fu

    Abstract: Multimodality eye disease screening is crucial in ophthalmology as it integrates information from diverse sources to complement their respective performances. However, the existing methods are weak in assessing the reliability of each unimodality, and directly fusing an unreliable modality may cause screening errors. To address this issue, we introduce a novel multimodality evidential fusion pipel…

    Submitted 29 August, 2023; v1 submitted 17 March, 2023; originally announced March 2023.

    Comments: MICCAI 2023 (Early accept):11 pages, 4 figures

  29. arXiv:2302.08119  [pdf, other]

    eess.IV cs.CV

    A Review of Uncertainty Estimation and its Application in Medical Imaging

    Authors: Ke Zou, Zhihao Chen, Xuedong Yuan, Xiaojing Shen, Meng Wang, Huazhu Fu

    Abstract: The use of AI systems in healthcare for the early screening of diseases is of great clinical importance. Deep learning has shown great promise in medical imaging, but the reliability and trustworthiness of AI systems limit their deployment in real clinical scenes, where patient safety is at stake. Uncertainty estimation plays a pivotal role in producing a confidence evaluation along with the predi…

    Submitted 15 May, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: 11 pages, 3 figures, 3 tables

  30. arXiv:2301.12798  [pdf, other]

    cs.CV

    Reliable Federated Disentangling Network for Non-IID Domain Feature

    Authors: Meng Wang, Kai Yu, Chun-Mei Feng, Yiming Qian, Ke Zou, Lianyu Wang, Rick Siow Mong Goh, Yong Liu, Huazhu Fu

    Abstract: Federated learning (FL), as an effective decentralized distributed learning approach, enables multiple institutions to jointly train a model without sharing their local data. However, the domain feature shift caused by different acquisition devices/clients substantially degrades the performance of the FL model. Furthermore, most existing FL approaches aim to improve accuracy without considering re…

    Submitted 19 September, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

  31. arXiv:2301.00349  [pdf, other]

    eess.IV cs.CV

    Towards Reliable Medical Image Segmentation by utilizing Evidential Calibrated Uncertainty

    Authors: Ke Zou, Yidi Chen, Ling Huang, Xuedong Yuan, Xiaojing Shen, Meng Wang, Rick Siow Mong Goh, Yong Liu, Huazhu Fu

    Abstract: Medical image segmentation is critical for disease diagnosis and treatment assessment. However, concerns regarding the reliability of segmentation regions persist among clinicians, mainly attributed to the absence of confidence assessment, robustness, and calibration to accuracy. To address this, we introduce DEviS, an easily implementable foundational model that seamlessly integrates into various…

    Submitted 13 April, 2024; v1 submitted 1 January, 2023; originally announced January 2023.

    Comments: 34 pages, 11 figures

  32. arXiv:2212.00330   

    eess.IV cs.CV

    Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images

    Authors: Meng Wang, Kai Yu, Chun-Mei Feng, Ke Zou, Yanyu Xu, Qingquan Meng, Rick Siow Mong Goh, Yong Liu, Huazhu Fu

    Abstract: Focusing on the complicated pathological features, such as blurred boundaries, severe scale differences between symptoms, background noise interference, etc., in the task of retinal edema lesions joint segmentation from OCT images and enabling the segmentation results more reliable. In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network, which can provide accur…

    Submitted 1 January, 2024; v1 submitted 1 December, 2022; originally announced December 2022.

    Comments: Improving algorithm

  33. arXiv:2210.14793  [pdf, other]

    cs.CV

    M$^3$ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design

    Authors: Hanxue Liang, Zhiwen Fan, Rishov Sarkar, Ziyu Jiang, Tianlong Chen, Kai Zou, Yu Cheng, Cong Hao, Zhangyang Wang

    Abstract: Multi-task learning (MTL) encapsulates multiple learned tasks in a single model and often lets those tasks learn better jointly. However, when deploying MTL onto those real-world systems that are often resource-constrained or latency-sensitive, two prominent challenges arise: (i) during training, simultaneously optimizing all tasks is often difficult due to gradient conflicts across tasks; (ii) at…

    Submitted 26 October, 2022; originally announced October 2022.

  34. Roadmap on spatiotemporal light fields

    Authors: Yijie Shen, Qiwen Zhan, Logan G. Wright, Demetrios N. Christodoulides, Frank W. Wise, Alan E. Willner, Zhe Zhao, Kai-heng Zou, Chen-Ting Liao, Carlos Hernández-García, Margaret Murnane, Miguel A. Porras, Andy Chong, Chenhao Wan, Konstantin Y. Bliokh, Murat Yessenov, Ayman F. Abouraddy, Liang Jie Wong, Michael Go, Suraj Kumar, Cheng Guo, Shanhui Fan, Nikitas Papasimakis, Nikolay I. Zheludev, Lu Chen, et al. (20 additional authors not shown)

    Abstract: Spatiotemporal sculpturing of light pulse with ultimately sophisticated structures represents the holy grail of the human everlasting pursue of ultrafast information transmission and processing as well as ultra-intense energy concentration and extraction. It also holds the key to unlock new extraordinary fundamental physical effects. Traditionally, spatiotemporal light pulses are always treated as…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: This is the version of the article before peer review or editing, as submitted by an author to Journal of Optics. IOP Publishing Ltd is not responsible for any errors or omissions in this version of the manuscript or any version derived from it

  35. arXiv:2207.06642  [pdf]

    cond-mat.mtrl-sci physics.comp-ph

    Realistic simulation of reflection high-energy electron diffraction patterns for two-dimensional lattices using Ewald construction

    Authors: Chong Liu, Kai Chang, Ke Zou

    Abstract: Reflection high-energy electron diffraction (RHEED) is a powerful tool for characterizing crystal surface structures. However, the setup geometry leads to distorted and complicated patterns, which are not straightforward to link to the real-space structures. A program with a graphical user interface is provided here to simulate the RHEED patterns. Following the Ewald construction in the kinematic…

    Submitted 24 August, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

    Comments: 15 pages, 5 figures. This article may be downloaded for personal use only. Any other use requires prior permission of the author and AIP Publishing. This article appeared in Journal of Vacuum Science & Technology B

    Journal ref: Journal of Vacuum Science & Technology B 40 (2022) 054002

  36. arXiv:2207.00592  [pdf, other]

    cs.DC cs.NI

    Dissecting Service Mesh Overheads

    Authors: Xiangfeng Zhu, Guozhen She, Bowen Xue, Yu Zhang, Yongsu Zhang, Xuan Kelvin Zou, Xiongchun Duan, Peng He, Arvind Krishnamurthy, Matthew Lentz, Danyang Zhuo, Ratul Mahajan

    Abstract: Service meshes play a central role in the modern application ecosystem by providing an easy and flexible way to connect different services that form a distributed application. However, because of the way they interpose on application traffic, they can substantially increase application latency and resource consumption. We develop a decompositional approach and a tool, called MeshInsight, to system…

    Submitted 2 July, 2022; originally announced July 2022.

  37. arXiv:2206.09309  [pdf, other]

    eess.IV cs.CV

    TBraTS: Trusted Brain Tumor Segmentation

    Authors: Ke Zou, Xuedong Yuan, Xiaojing Shen, Meng Wang, Huazhu Fu

    Abstract: Despite recent improvements in the accuracy of brain tumor segmentation, the results still exhibit low levels of confidence and robustness. Uncertainty estimation is one effective way to change this situation, as it provides a measure of confidence in the segmentation results. In this paper, we propose a trusted brain tumor segmentation network which can generate robust segmentation results and re…

    Submitted 28 July, 2022; v1 submitted 18 June, 2022; originally announced June 2022.

    Comments: 11 pages, 4 figures, Accepted by MICCAI 2022

  38. arXiv:2203.13005  [pdf, other]

    cs.DC

    GX-Plug: a Middleware for Plugging Accelerators to Distributed Graph Processing

    Authors: Kai Zou, Xike Xie, Qi Li, Deyu Kong

    Abstract: Recently, research communities highlight the necessity of formulating a scalability continuum for large-scale graph processing, which gains the scale-out benefits from distributed graph systems, and the scale-up benefits from high-performance accelerators. To this end, we propose a middleware, called the GX-plug, for the ease of integrating the merits of both. As a middleware, the GX-plug is versa…

    Submitted 31 March, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

    Comments: 13 pages

  39. arXiv:2203.03367  [pdf, other]

    cs.IR cs.CL

    Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval

    Authors: Dingkun Long, Qiong Gao, Kuan Zou, Guangwei Xu, Pengjun Xie, Ruijie Guo, Jian Xu, Guanjun Jiang, Luxi Xing, Ping Yang

    Abstract: Passage retrieval is a fundamental task in information retrieval (IR) research, which has drawn much attention recently. In the English field, the availability of large-scale annotated dataset (e.g, MS MARCO) and the emergence of deep pre-trained language models (e.g, BERT) has resulted in a substantial improvement of existing passage retrieval systems. However, in the Chinese field, especially fo…

    Submitted 24 April, 2022; v1 submitted 7 March, 2022; originally announced March 2022.

    Comments: SIGIR 2022 Resource Track

  40. arXiv:2202.13238  [pdf, other]

    math.RT math.AG math.NT

    The categorical form of Fargues' conjecture for tori

    Authors: Konrad Zou

    Abstract: We prove the main conjecture of arXiv:2102.13459 for integral coefficients in the case of tori. Along the way we prove that the spectral action as constructed in that manuscript is compatible with the action of the excursion algebra and preserves the grading by $π_1(G)_Q$ on both sides. We additionally develop a (non-solidified) version of condensed group (co)homology and show that many constructi…

    Submitted 26 February, 2022; originally announced February 2022.

    Report number: MPIM-Bonn-2022

  41. DSNet: Dynamic Skin Deformation Prediction by Recurrent Neural Network

    Authors: Hyewon Seo, Kaifeng Zou, Frederic Cordier

    Abstract: Skin dynamics contributes to the enriched realism of human body models in rendered scenes. Traditional methods rely on physics-based simulations to accurately reproduce the dynamic behavior of soft tissues. Due to the model complexity and thus the heavy computation, however, they do not directly offer practical solutions to domains where real-time performance is desirable. The quality shapes obtai…

    Submitted 26 November, 2021; originally announced January 2022.

    Journal ref: Lecture Notes in Computer Science, Springer, 2021, 13002, pp. 365-377

  42. arXiv:2201.05307  [pdf, other]

    cs.CV cs.LG

    Unsupervised Temporal Video Grounding with Deep Semantic Clustering

    Authors: Daizong Liu, Xiaoye Qu, Yinzhen Wang, Xing Di, Kai Zou, Yu Cheng, Zichuan Xu, Pan Zhou

    Abstract: Temporal video grounding (TVG) aims to localize a target segment in a video according to a given sentence query. Though respectable works have made decent achievements in this task, they severely rely on abundant video-query paired data, which is expensive and time-consuming to collect in real-world scenarios. In this paper, we explore whether a video grounding model can be learned without any pai…

    Submitted 14 January, 2022; originally announced January 2022.

    Comments: Accepted by AAAI2022

  43. Disentangled representations: towards interpretation of sex determination from hip bone

    Authors: Kaifeng Zou, Sylvain Faisan, Fabrice Heitz, Marie Epain, Pierre Croisille, Laurent Fanton, Sébastien Valette

    Abstract: By highlighting the regions of the input image that contribute the most to the decision, saliency maps have become a popular method to make neural networks interpretable. In medical imaging, they are particularly well-suited to explain neural networks in the context of abnormality localization. However, from our experiments, they are less suited to classification problems where the features that a…

    Submitted 17 December, 2021; originally announced December 2021.

    Journal ref: The Visual Computer (2023)

  44. arXiv:2105.09596  [pdf, other]

    cs.CV cs.AI

    AGSFCOS: Based on attention mechanism and Scale-Equalizing pyramid network of object detection

    Authors: Li Wang, Wei Xiang, Ruhui Xue, Kaida Zou, Laili Zhu

    Abstract: Recently, the anchor-free object detection model has shown great potential for accuracy and speed to exceed anchor-based object detection. Therefore, two issues are mainly studied in this article: (1) How to let the backbone network in the anchor-free object detection model learn feature extraction? (2) How to make better use of the feature pyramid network? In order to solve the above problems, Ex…

    Submitted 20 May, 2021; originally announced May 2021.

    Comments: 9 pages, 9 figures

  45. arXiv:2103.14139  [pdf, other]

    physics.app-ph physics.optics

    Inverse-designed multi-dimensional silicon photonic transmitters

    Authors: Ki Youl Yang, Alexander D. White, Farshid Ashtiani, Chinmay Shirpurkar, Srinivas V. Pericherla, Lin Chang, Hao Song, Kaiheng Zou, Huibin Zhou, Kai Pang, Joshua Yang, Melissa A. Guidry, Daniil M. Lukin, Han Hao, Lawrence Trask, Geun Ho Ahn, Andy Netherton, Travis C. Briles, Jordan R. Stone, Lior Rechtman, Jeffery S. Stone, Kasper Van Gasse, Jinhie L. Skarda, Logan Su, Dries Vercruysse, et al. (11 additional authors not shown)

    Abstract: Modern microelectronic processors have migrated towards parallel computing architectures with many-core processors. However, such expansion comes with diminishing returns exacted by the high cost of data movement between individual processors. The use of optical interconnects has burgeoned as a promising technology that can address the limits of this data transfer. While recent pushes to enhance o…

    Submitted 10 October, 2021; v1 submitted 25 March, 2021; originally announced March 2021.

    Comments: Fig.2-4 present new experimental results -- (i) demonstration of a broadband, low cross-talk multiplexer, (ii) a silicon photonic mode-division multiplexing with a chip-scale soliton microcomb source, and (iii) a chip-to-chip optical interconnect using a multimode-matched fibre and inverse-designed beam couplers

  46. arXiv:2101.09967  [pdf]

    physics.optics eess.SP

    Turbulence-Resilient Coherent Free-Space Optical Communications using Automatic Power-Efficient Pilot-Assisted Optoelectronic Beam Mixing of Many Modes

    Authors: Runzhou Zhang, Nanzhe Hu, Huibin Zhou, Kaiheng Zou, Xinzhou Su, Yiyu Zhou, Haoqian Song, Kai Pang, Hao Song, Amir Minoofar, Zhe Zhao, Cong Liu, Karapet Manukyan, Ahmed Almaiman, Brittany Lynn, Robert W. Boyd, Moshe Tur, Alan E. Willner

    Abstract: Atmospheric turbulence generally limits free-space optical (FSO) communications, and this problem is severely exacerbated when implementing highly sensitive and spectrally efficient coherent detection. Specifically, turbulence induces power coupling from the transmitted Gaussian mode to higher-order Laguerre-Gaussian (LG) modes, resulting in a significant decrease of the power that mixes with a si…

    Submitted 25 January, 2021; originally announced January 2021.

  47. arXiv:2012.06730  [pdf, other]

    quant-ph physics.optics

    Fractal superconducting nanowires detect infrared single photons with 84% system detection efficiency, 1.02 polarization sensitivity, and 20.8 ps timing resolution

    Authors: Yun Meng, Kai Zou, Nan Hu, Liang Xu, Xiaojian Lan, Stephan Steinhauer, Samuel Gyger, Val Zwiller, Xiaolong Hu

    Abstract: The near-unity system detection efficiency (SDE) and excellent timing resolution of superconducting nanowire single-photon detectors (SNSPDs), combined with their other merits, have enabled many classical and quantum photonic applications. However, the prevalent design based on meandering nanowires makes SDE dependent on the polarization states of the incident photons; for unpolarized light, the m…

    Submitted 31 March, 2022; v1 submitted 12 December, 2020; originally announced December 2020.

    Comments: 8 pages, 4 figures

  48. Controlling the electrical and magnetic ground states by doping in the complete phase diagram of titanate Eu1-xLaxTiO3 thin films

    Authors: Hyungki Shin, Chong Liu, Fengmiao Li, Ronny Sutarto, Bruce A. Davidson, Ke Zou

    Abstract: EuTiO3, a band insulator, and LaTiO3, a Mott insulator, are both antiferromagnetic with transition temperatures ~ 5.5 K and ~ 160 K, respectively. Here, we report the synthesis of Eu1-xLaxTiO3 thin films with x = 0 to 1 by oxide molecular beam epitaxy. The films in the full range have high crystalline quality and show no phase segregation, allowing us to carry out transport measurements to study thei…

    Submitted 19 May, 2020; originally announced May 2020.

    Journal ref: Physical Review B, 2020

  49. arXiv:2003.09916  [pdf, other]

    physics.ins-det physics.app-ph physics.optics quant-ph

    A platform for high performance photon correlation measurements

    Authors: Iman Esmaeil Zadeh, Johannes W. N. Los, Ronan B. M. Gourgues, Jin Chang, Ali W. Elshaari, Julien Zichi, Yuri J. van Staaden, Jeroen Swens, Nima Kalhor, Antonio Guardiani, Yun Meng, Kai Zou, Sergiy Dobrovolskiy, Andreas W. Fognini, Dennis R. Schaart, Dan Dalacu, Philip J. Poole, Michael E. Reimer, Xiaolong Hu, Silvania F. Pereira, Val Zwiller, Sander N. Dorenbos

    Abstract: A broad range of scientific and industrial disciplines require precise optical measurements at very low light levels. Single-photon detectors combining high efficiency and high time resolution are pivotal in such experiments. By using relatively thick films of NbTiN (8-11 nm) and improving the pattern fidelity of the nano-structure of the superconducting nanowire single-photon detectors (SNSPD),…

    Submitted 22 March, 2020; originally announced March 2020.

  50. Tuning stoichiometry and its impact on superconductivity of monolayer and multilayer FeSe on SrTiO3

    Authors: Chong Liu, Ke Zou

    Abstract: Synthesis of monolayer FeSe on SrTiO3, with greatly enhanced superconductivity compared to bulk FeSe, remains difficult. Lengthy annealing within a certain temperature window is always required to achieve superconducting samples as reported by different groups around the world, but the mechanism of annealing in inducing superconductivity has not been elucidated. We grow FeSe films on SrTiO3 by mol…

    Submitted 14 February, 2020; originally announced February 2020.

    Comments: 14 pages, 5 figures

    Journal ref: Phys. Rev. B 101, 140502 (2020)