Showing 1–50 of 76 results for author: Adeli, E

  1. arXiv:2407.00316

    cs.CV

    OccFusion: Rendering Occluded Humans with Generative Diffusion Priors

    Authors: Adam Sun, Tiange Xiang, Scott Delp, Li Fei-Fei, Ehsan Adeli

    Abstract: Most existing human rendering methods require every part of the human to be fully visible throughout the input video. However, this assumption does not hold in real-life settings where obstructions are common, resulting in only partial visibility of the human. Considering this, we present OccFusion, an approach that utilizes efficient 3D Gaussian splatting supervised by pretrained 2D diffusion mod…

    Submitted 29 June, 2024; originally announced July 2024.

  2. arXiv:2406.01662

    cs.CV cs.AI

    Few-Shot Classification of Interactive Activities of Daily Living (InteractADL)

    Authors: Zane Durante, Robathan Harries, Edward Vendrow, Zelun Luo, Yuta Kyuragi, Kazuki Kozuka, Li Fei-Fei, Ehsan Adeli

    Abstract: Understanding Activities of Daily Living (ADLs) is a crucial step for different applications including assistive robots, smart homes, and healthcare. However, to date, few benchmarks and methods have focused on complex ADLs, especially those involving multi-person interactions in home environments. In this paper, we propose a new dataset and benchmark, InteractADL, for understanding complex ADLs t…

    Submitted 3 June, 2024; originally announced June 2024.

  3. arXiv:2404.13798

    cs.CV

    Enforcing Conditional Independence for Fair Representation Learning and Causal Image Generation

    Authors: Jensen Hwa, Qingyu Zhao, Aditya Lahiri, Adnan Masood, Babak Salimi, Ehsan Adeli

    Abstract: Conditional independence (CI) constraints are critical for defining and evaluating fairness in machine learning, as well as for learning unconfounded or causal representations. Traditional methods for ensuring fairness either blindly learn invariant features with respect to a protected variable (e.g., race when classifying sex from face images) or enforce CI relative to the protected attribute onl…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: To appear at the 2024 IEEE CVPR Workshop on Fair, Data-Efficient, and Trusted Computer Vision

  4. arXiv:2404.02242

    cs.CV

    Towards Robust 3D Pose Transfer with Adversarial Learning

    Authors: Haoyu Chen, Hao Tang, Ehsan Adeli, Guoying Zhao

    Abstract: 3D pose transfer that aims to transfer the desired pose to a target mesh is one of the most challenging 3D generation tasks. Previous attempts rely on well-defined parametric human models or skeletal joints as driving pose sources. However, to obtain those clean pose sources, cumbersome but necessary pre-processing pipelines are inevitable, hindering implementations of the real-time applications.…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  5. arXiv:2402.05929

    cs.AI cs.LG cs.RO

    An Interactive Agent Foundation Model

    Authors: Zane Durante, Bidipta Sarkar, Ran Gong, Rohan Taori, Yusuke Noda, Paul Tang, Ehsan Adeli, Shrinidhi Kowshika Lakshmikanth, Kevin Schulman, Arnold Milstein, Demetri Terzopoulos, Ade Famoti, Noboru Kuno, Ashley Llorens, Hoi Vo, Katsu Ikeuchi, Li Fei-Fei, Jianfeng Gao, Naoki Wake, Qiuyuan Huang

    Abstract: The development of artificial intelligence systems is transitioning from creating static, task-specific models to dynamic, agent-based systems capable of performing well in a wide range of applications. We propose an Interactive Agent Foundation Model that uses a novel multi-task agent training paradigm for training AI agents across a wide range of domains, datasets, and tasks. Our training paradi…

    Submitted 17 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  6. arXiv:2401.00431

    cs.CV

    Wild2Avatar: Rendering Humans Behind Occlusions

    Authors: Tiange Xiang, Adam Sun, Scott Delp, Kazuki Kozuka, Li Fei-Fei, Ehsan Adeli

    Abstract: Rendering the visual appearance of moving humans from occluded monocular videos is a challenging task. Most existing research renders 3D humans under ideal conditions, requiring a clear and unobstructed scene. Those methods cannot be used to render humans in real-world scenes where obstacles may block the camera's view and lead to partial occlusions. In this work, we present Wild2Avatar, a neural…

    Submitted 31 December, 2023; originally announced January 2024.

  7. arXiv:2312.13783

    cs.CV cs.AI cs.LG

    Few Shot Part Segmentation Reveals Compositional Logic for Industrial Anomaly Detection

    Authors: Soopil Kim, Sion An, Philip Chikontwe, Myeongkyun Kang, Ehsan Adeli, Kilian M. Pohl, Sang Hyun Park

    Abstract: Logical anomalies (LA) refer to data violating underlying logical constraints e.g., the quantity, arrangement, or composition of components within an image. Detecting accurately such anomalies requires models to reason about various component types through segmentation. However, curation of pixel-level annotations for semantic segmentation is both time-consuming and expensive. Although there are s…

    Submitted 15 April, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: Accepted at AAAI 2024

  8. arXiv:2311.02247

    cs.LG

    PRISM: Progressive Restoration for Scene Graph-based Image Manipulation

    Authors: Pavel Jahoda, Azade Farshad, Yousef Yeganeh, Ehsan Adeli, Nassir Navab

    Abstract: Scene graphs have emerged as accurate descriptive priors for image generation and manipulation tasks, however, their complexity and diversity of the shapes and relations of objects in data make it challenging to incorporate them into the models and generate high-quality results. To address these challenges, we propose PRISM, a novel progressive multi-head image manipulation approach to improve the…

    Submitted 3 November, 2023; originally announced November 2023.

  9. arXiv:2310.07781

    cs.CV

    3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers

    Authors: Jieneng Chen, Jieru Mei, Xianhang Li, Yongyi Lu, Qihang Yu, Qingyue Wei, Xiangde Luo, Yutong Xie, Ehsan Adeli, Yan Wang, Matthew Lungren, Lei Xing, Le Lu, Alan Yuille, Yuyin Zhou

    Abstract: Medical image segmentation plays a crucial role in advancing healthcare systems for disease diagnosis and treatment planning. The u-shaped architecture, popularly known as U-Net, has proven highly successful for various medical image segmentation tasks. However, U-Net's convolution-based operations inherently limit its ability to model long-range dependencies effectively. To address these limitati…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: Code and models are available at https://github.com/Beckschen/3D-TransUNet

  10. arXiv:2310.04630

    eess.IV cs.CV

    Metadata-Conditioned Generative Models to Synthesize Anatomically-Plausible 3D Brain MRIs

    Authors: Wei Peng, Tomas Bosschieter, Jiahong Ouyang, Robert Paul, Ehsan Adeli, Qingyu Zhao, Kilian M. Pohl

    Abstract: Generative AI models hold great potential in creating synthetic brain MRIs that advance neuroimaging studies by, for example, enriching data diversity. However, the mainstay of AI research only focuses on optimizing the visual quality (such as signal-to-noise ratio) of the synthetic MRIs while lacking insights into their relevance to neuroscience. To gain these insights with respect to T1-weighted…

    Submitted 6 October, 2023; originally announced October 2023.

  11. arXiv:2310.00213

    cs.CV

    LSOR: Longitudinally-Consistent Self-Organized Representation Learning

    Authors: Jiahong Ouyang, Qingyu Zhao, Ehsan Adeli, Wei Peng, Greg Zaharchuk, Kilian M. Pohl

    Abstract: Interpretability is a key issue when applying deep learning models to longitudinal brain MRIs. One way to address this issue is by visualizing the high-dimensional latent spaces generated by deep learning via self-organizing maps (SOM). SOM separates the latent space into clusters and then maps the cluster centers to a discrete (typically 2D) grid preserving the high-dimensional relationship betwe…

    Submitted 29 September, 2023; originally announced October 2023.

    Journal ref: International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) 2023

  12. arXiv:2308.04622

    cs.CV

    Rendering Humans from Object-Occluded Monocular Videos

    Authors: Tiange Xiang, Adam Sun, Jiajun Wu, Ehsan Adeli, Li Fei-Fei

    Abstract: 3D understanding and rendering of moving humans from monocular videos is a challenging task. Despite recent progress, the task remains difficult in real-world scenarios, where obstacles may block the camera view and cause partial occlusions in the captured videos. Existing methods cannot handle such defects due to two reasons. First, the standard rendering strategy relies on point-point mapping, w…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: ICCV 2023, project page: https://cs.stanford.edu/~xtiange/projects/occnerf/

  13. arXiv:2307.13108

    cs.LG cs.AI eess.IV q-bio.NC

    An Explainable Geometric-Weighted Graph Attention Network for Identifying Functional Networks Associated with Gait Impairment

    Authors: Favour Nerrise, Qingyu Zhao, Kathleen L. Poston, Kilian M. Pohl, Ehsan Adeli

    Abstract: One of the hallmark symptoms of Parkinson's Disease (PD) is the progressive loss of postural reflexes, which eventually leads to gait difficulties and balance problems. Identifying disruptions in brain function associated with gait impairment could be crucial in better understanding PD motor progression, thus advancing the development of more effective and personalized therapeutics. In this work,…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: Accepted by the 26th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2023). MICCAI Student-Author Registration (STAR) Award. 11 pages, 2 figures, 1 table, appendix. Source Code: https://github.com/favour-nerrise/xGW-GAT

  14. arXiv:2306.01623

    cs.CV cs.AI cs.LG

    HomE: Homography-Equivariant Video Representation Learning

    Authors: Anirudh Sriram, Adrien Gaidon, Jiajun Wu, Juan Carlos Niebles, Li Fei-Fei, Ehsan Adeli

    Abstract: Recent advances in self-supervised representation learning have enabled more efficient and robust model performance without relying on extensive labeled data. However, most works are still focused on images, with few working on videos and even fewer on multi-view videos, where more powerful inductive biases can be leveraged for self-supervision. In this work, we propose a novel method for represen…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: 10 pages, 4 figures, 4 tables

  15. arXiv:2304.14572

    cs.CV cs.AI

    SCOPE: Structural Continuity Preservation for Medical Image Segmentation

    Authors: Yousef Yeganeh, Azade Farshad, Goktug Guevercin, Amr Abu-zer, Rui Xiao, Yongjian Tang, Ehsan Adeli, Nassir Navab

    Abstract: Although the preservation of shape continuity and physiological anatomy is a natural assumption in the segmentation of medical images, it is often neglected by deep learning methods that mostly aim for the statistical modeling of input data as pixels rather than interconnected structures. In biological structures, however, organs are not separate entities; for example, in reality, a severed vessel…

    Submitted 27 April, 2023; originally announced April 2023.

  16. arXiv:2304.14571

    cs.CV cs.AI

    DIAMANT: Dual Image-Attention Map Encoders For Medical Image Segmentation

    Authors: Yousef Yeganeh, Azade Farshad, Peter Weinberger, Seyed-Ahmad Ahmadi, Ehsan Adeli, Nassir Navab

    Abstract: Although purely transformer-based architectures showed promising performance in many computer vision tasks, many hybrid models consisting of CNN and transformer blocks are introduced to fit more specialized tasks. Nevertheless, despite the performance gain of both pure and hybrid transformer-based architectures compared to CNNs in medical imaging segmentation, their high training cost and complexi…

    Submitted 27 April, 2023; originally announced April 2023.

  17. arXiv:2304.12470

    cs.CV

    Vision-based Estimation of Fatigue and Engagement in Cognitive Training Sessions

    Authors: Yanchen Wang, Adam Turnbull, Yunlong Xu, Kathi Heffner, Feng Vankee Lin, Ehsan Adeli

    Abstract: Computerized cognitive training (CCT) is a scalable, well-tolerated intervention that has promise for slowing cognitive decline. Outcomes from CCT are limited by a lack of effective engagement, which is decreased by factors such as mental fatigue, particularly in older adults at risk for dementia. There is a need for scalable, automated measures that can monitor mental fatigue during CCT. Here, we…

    Submitted 15 November, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

    Comments: 23 pages, 6 figures

  18. arXiv:2211.14830

    eess.IV cs.CV

    Medical Image Segmentation Review: The success of U-Net

    Authors: Reza Azad, Ehsan Khodapanah Aghdam, Amelie Rauland, Yiwei Jia, Atlas Haddadi Avval, Afshin Bozorgpour, Sanaz Karimijafarbigloo, Joseph Paul Cohen, Ehsan Adeli, Dorit Merhof

    Abstract: Automatic medical image segmentation is a crucial topic in the medical domain and successively a critical counterpart in the computer-aided diagnosis paradigm. U-Net is the most widespread image segmentation architecture due to its flexibility, optimized modular design, and success in all medical image modalities. Over the years, the U-Net model achieved tremendous attention from academic and indu…

    Submitted 27 November, 2022; originally announced November 2022.

    Comments: Submitted to the IEEE Transactions on Pattern Analysis and Machine Intelligence Journal

  19. arXiv:2211.07363

    q-bio.NC cs.AI cs.LG

    Joint Graph Convolution for Analyzing Brain Structural and Functional Connectome

    Authors: Yueting Li, Qingyue Wei, Ehsan Adeli, Kilian M. Pohl, Qingyu Zhao

    Abstract: The white-matter (micro-)structural architecture of the brain promotes synchrony among neuronal populations, giving rise to richly patterned functional connections. A fundamental problem for systems neuroscience is determining the best way to relate structural and functional networks quantified by diffusion tensor imaging and resting-state functional MRI. As one of the state-of-the-art approaches…

    Submitted 27 October, 2022; originally announced November 2022.

  20. arXiv:2208.14023

    cs.CV

    SoMoFormer: Multi-Person Pose Forecasting with Transformers

    Authors: Edward Vendrow, Satyajit Kumar, Ehsan Adeli, Hamid Rezatofighi

    Abstract: Human pose forecasting is a challenging problem involving complex human body motion and posture dynamics. In cases that there are multiple people in the environment, one's motion may also be influenced by the motion and dynamic movements of others. Although there are several previous works targeting the problem of multi-person dynamic pose forecasting, they often model the entire pose sequence as…

    Submitted 30 August, 2022; originally announced August 2022.

    Comments: 10 pages, 6 figures. Submitted to WACV 2023. Our method was submitted to the SoMoF benchmark leaderboard dated March 2022. See https://somof.stanford.edu/result/217/

  21. arXiv:2208.10077

    cs.CV cs.AI

    Identifying Auxiliary or Adversarial Tasks Using Necessary Condition Analysis for Adversarial Multi-task Video Understanding

    Authors: Stephen Su, Samuel Kwong, Qingyu Zhao, De-An Huang, Juan Carlos Niebles, Ehsan Adeli

    Abstract: There has been an increasing interest in multi-task learning for video understanding in recent years. In this work, we propose a generalized notion of multi-task learning by incorporating both auxiliary tasks that the model should perform well on and adversarial tasks that the model should not perform well on. We employ Necessary Condition Analysis (NCA) as a data-driven approach for deciding what…

    Submitted 22 August, 2022; originally announced August 2022.

  22. Multiple Instance Neuroimage Transformer

    Authors: Ayush Singla, Qingyu Zhao, Daniel K. Do, Yuyin Zhou, Kilian M. Pohl, Ehsan Adeli

    Abstract: For the first time, we propose using a multiple instance learning based convolution-free transformer model, called Multiple Instance Neuroimage Transformer (MINiT), for the classification of T1-weighted (T1w) MRIs. We first present several variants of transformer models adopted for neuroimages. These models extract non-overlapping 3D blocks from the input volume and perform multi-headed self-attent…

    Submitted 19 August, 2022; originally announced August 2022.

  23. arXiv:2208.00713

    eess.IV cs.CV cs.LG

    TransDeepLab: Convolution-Free Transformer-based DeepLab v3+ for Medical Image Segmentation

    Authors: Reza Azad, Moein Heidari, Moein Shariatnia, Ehsan Khodapanah Aghdam, Sanaz Karimijafarbigloo, Ehsan Adeli, Dorit Merhof

    Abstract: Convolutional neural networks (CNNs) have been the de facto standard in a diverse set of computer vision tasks for many years. Especially, deep neural networks based on seminal architectures such as U-shaped models with skip-connections or atrous convolution with pyramid pooling have been tailored to a wide range of medical image analysis tasks. The main advantage of such architectures is that the…

    Submitted 1 August, 2022; originally announced August 2022.

  24. arXiv:2207.14349

    cs.LG

    Bridging the Gap between Deep Learning and Hypothesis-Driven Analysis via Permutation Testing

    Authors: Magdalini Paschali, Qingyu Zhao, Ehsan Adeli, Kilian M. Pohl

    Abstract: A fundamental approach in neuroscience research is to test hypotheses based on neuropsychological and behavioral measures, i.e., whether certain factors (e.g., related to life events) are associated with an outcome (e.g., depression). In recent years, deep learning has become a potential alternative approach for conducting such analyses by predicting an outcome from a collection of factors and ide…

    Submitted 28 July, 2022; originally announced July 2022.

    Comments: Accepted at the 5th workshop on PRedictive Intelligence in Medicine (PRIME 2022) - MICCAI 2022

  25. arXiv:2207.04607

    cs.LG cs.CV

    A Penalty Approach for Normalizing Feature Distributions to Build Confounder-Free Models

    Authors: Anthony Vento, Qingyu Zhao, Robert Paul, Kilian M. Pohl, Ehsan Adeli

    Abstract: Translating machine learning algorithms into clinical applications requires addressing challenges related to interpretability, such as accounting for the effect of confounding variables (or metadata). Confounding variables affect the relationship between input training data and target outputs. When we train a model on such data, confounding variables will bias the distribution of the learned featu…

    Submitted 11 July, 2022; originally announced July 2022.

  26. arXiv:2207.00106

    cs.CV cs.LG eess.IV

    GaitForeMer: Self-Supervised Pre-Training of Transformers via Human Motion Forecasting for Few-Shot Gait Impairment Severity Estimation

    Authors: Mark Endo, Kathleen L. Poston, Edith V. Sullivan, Li Fei-Fei, Kilian M. Pohl, Ehsan Adeli

    Abstract: Parkinson's disease (PD) is a neurological disorder that has a variety of observable motor-related symptoms such as slow movement, tremor, muscular rigidity, and impaired posture. PD is typically diagnosed by evaluating the severity of motor impairments according to scoring systems such as the Movement Disorder Society Unified Parkinson's Disease Rating Scale (MDS-UPDRS). Automated severity predic…

    Submitted 30 June, 2022; originally announced July 2022.

    Comments: Accepted as a conference paper at MICCAI (Medical Image Computing and Computer Assisted Intervention) 2022

  27. arXiv:2206.07087

    cs.LG

    Combining Counterfactuals With Shapley Values To Explain Image Models

    Authors: Aditya Lahiri, Kamran Alipour, Ehsan Adeli, Babak Salimi

    Abstract: With the widespread use of sophisticated machine learning models in sensitive applications, understanding their decision-making has become an essential task. Models trained on tabular data have witnessed significant progress in explanations of their underlying decision making processes by virtue of having a small number of discrete features. However, applying these methods to high-dimensional inpu…

    Submitted 14 June, 2022; originally announced June 2022.

    Journal ref: ICML 2022 Workshop on Responsible Decision Making in Dynamic Environments

  28. arXiv:2206.05257

    cs.CV cs.AI

    Explaining Image Classifiers Using Contrastive Counterfactuals in Generative Latent Spaces

    Authors: Kamran Alipour, Aditya Lahiri, Ehsan Adeli, Babak Salimi, Michael Pazzani

    Abstract: Despite their high accuracies, modern complex image classifiers cannot be trusted for sensitive tasks due to their unknown decision-making process and potential biases. Counterfactual explanations are very effective in providing transparency for these black-box algorithms. Nevertheless, generating counterfactuals that can have a consistent impact on classifier outputs and yet expose interpretable…

    Submitted 10 June, 2022; originally announced June 2022.

  29. arXiv:2206.03891

    cs.CV cs.AI cs.CR cs.LG eess.IV

    PrivHAR: Recognizing Human Actions From Privacy-preserving Lens

    Authors: Carlos Hinojosa, Miguel Marquez, Henry Arguello, Ehsan Adeli, Li Fei-Fei, Juan Carlos Niebles

    Abstract: The accelerated use of digital cameras prompts an increasing concern about privacy and security, particularly in applications such as action recognition. In this paper, we propose an optimizing framework to provide robust visual privacy protection along the human action recognition pipeline. Our framework parameterizes the camera lens to successfully degrade the quality of the videos to inhibit pr…

    Submitted 29 January, 2023; v1 submitted 8 June, 2022; originally announced June 2022.

    Comments: Oral paper presented at European Conference on Computer Vision (ECCV) 2022, in Tel Aviv, Israel

    Journal ref: Computer Vision--ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23--27, 2022, Proceedings, Part IV

  30. arXiv:2205.04599

    cs.LG cs.AI

    Affective Medical Estimation and Decision Making via Visualized Learning and Deep Learning

    Authors: Mohammad Eslami, Solale Tabarestani, Ehsan Adeli, Glyn Elwyn, Tobias Elze, Mengyu Wang, Nazlee Zebardast, Nassir Navab, Malek Adjouadi

    Abstract: With the advent of sophisticated machine learning (ML) techniques and the promising results they yield, especially in medical applications, where they have been investigated for different tasks to enhance the decision-making process. Since visualization is such an effective tool for human comprehension, memorization, and judgment, we have presented a first-of-its-kind estimation approach we refer…

    Submitted 9 May, 2022; originally announced May 2022.

  31. arXiv:2204.09501

    cs.LG cs.AI cs.CE

    An advanced spatio-temporal convolutional recurrent neural network for storm surge predictions

    Authors: Ehsan Adeli, Luning Sun, Jianxun Wang, Alexandros A. Taflanidis

    Abstract: In this research paper, we study the capability of artificial neural network models to emulate storm surge based on the storm track/size/intensity history, leveraging a database of synthetic storm simulations. Traditionally, Computational Fluid Dynamics solvers are employed to numerically solve the storm surge governing equations that are Partial Differential Equations and are generally very costl…

    Submitted 18 April, 2022; originally announced April 2022.

  32. arXiv:2204.02943

    cs.CV

    Intervertebral Disc Labeling With Learning Shape Information, A Look Once Approach

    Authors: Reza Azad, Moein Heidari, Julien Cohen-Adad, Ehsan Adeli, Dorit Merhof

    Abstract: Accurate and automatic segmentation of intervertebral discs from medical images is a critical task for the assessment of spine-related diseases such as osteoporosis, vertebral fractures, and intervertebral disc herniation. To date, various approaches have been developed in the literature which routinely relies on detecting the discs as the primary step. A disadvantage of many cohort studies is tha…

    Submitted 6 April, 2022; originally announced April 2022.

  33. arXiv:2111.02823

    cs.LG cs.AI cs.CE

    Convolutional generative adversarial imputation networks for spatio-temporal missing data in storm surge simulations

    Authors: Ehsan Adeli, Jize Zhang, Alexandros A. Taflanidis

    Abstract: Imputation of missing data is a task that plays a vital role in a number of engineering and science applications. Often such missing data arise in experimental observations from limitations of sensors or post-processing transformation errors. Other times they arise from numerical and algorithmic constraints in computer simulations. One such instance and the application emphasis of this paper are n…

    Submitted 26 November, 2021; v1 submitted 2 November, 2021; originally announced November 2021.

  34. arXiv:2108.07258

    cs.LG cs.AI cs.CY

    On the Opportunities and Risks of Foundation Models

    Authors: Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, et al. (89 additional authors not shown)

    Abstract: AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their cap…

    Submitted 12 July, 2022; v1 submitted 16 August, 2021; originally announced August 2021.

    Comments: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Report page with citation guidelines: https://crfm.stanford.edu/report.html

  35. arXiv:2107.04724

    q-bio.NC cs.LG eess.IV

    Longitudinal Correlation Analysis for Decoding Multi-Modal Brain Development

    Authors: Qingyu Zhao, Ehsan Adeli, Kilian M. Pohl

    Abstract: Starting from childhood, the human brain restructures and rewires throughout life. Characterizing such complex brain development requires effective analysis of longitudinal and multi-modal neuroimaging data. Here, we propose such an analysis approach named Longitudinal Correlation Analysis (LCA). LCA couples the data of two modalities by first reducing the input from each modality to a latent repr…

    Submitted 9 July, 2021; originally announced July 2021.

  36. arXiv:2106.06047

    cs.LG cs.CV

    Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning

    Authors: Liangqiong Qu, Yuyin Zhou, Paul Pu Liang, Yingda Xia, Feifei Wang, Ehsan Adeli, Li Fei-Fei, Daniel Rubin

    Abstract: Federated learning is an emerging research paradigm enabling collaborative training of machine learning models among different organizations while keeping data private at each institution. Despite recent progress, there remain fundamental challenges such as the lack of convergence and the potential for catastrophic forgetting across real-world heterogeneous devices. In this paper, we demonstrate t…

    Submitted 13 April, 2022; v1 submitted 10 June, 2021; originally announced June 2021.

    Comments: Published as a conference paper at CVPR 2022

  37. arXiv:2105.05226

    cs.CV

    Home Action Genome: Cooperative Compositional Action Understanding

    Authors: Nishant Rai, Haofeng Chen, Jingwei Ji, Rishi Desai, Kazuki Kozuka, Shun Ishizaka, Ehsan Adeli, Juan Carlos Niebles

    Abstract: Existing research on action recognition treats activities as monolithic events occurring in videos. Recently, the benefits of formulating actions as a combination of atomic-actions have shown promise in improving action understanding with the emergence of datasets containing such annotations, allowing us to learn representations capturing this information. However, there remains a lack of studies…

    Submitted 11 May, 2021; originally announced May 2021.

    Comments: CVPR '21

  38. arXiv:2104.14764

    cs.CV

    CoCon: Cooperative-Contrastive Learning

    Authors: Nishant Rai, Ehsan Adeli, Kuan-Hui Lee, Adrien Gaidon, Juan Carlos Niebles

    Abstract: Labeling videos at scale is impractical. Consequently, self-supervised visual representation learning is key for efficient video analysis. Recent success in learning image representations suggests contrastive learning is a promising framework to tackle this challenge. However, when applied to real-world videos, contrastive learning may unknowingly lead to the separation of instances that contain s…

    Submitted 30 April, 2021; originally announced April 2021.

  39. arXiv:2104.09052

    cs.LG

    Metadata Normalization

    Authors: Mandy Lu, Qingyu Zhao, Jiequan Zhang, Kilian M. Pohl, Li Fei-Fei, Juan Carlos Niebles, Ehsan Adeli

    Abstract: Batch Normalization (BN) and its variants have delivered tremendous success in combating the covariate shift induced by the training step of deep learning methods. While these techniques normalize feature distributions by standardizing with batch statistics, they do not correct the influence on features from extraneous variables or multiple distributions. Such extra variables, referred to as metad…

    Submitted 5 May, 2021; v1 submitted 19 April, 2021; originally announced April 2021.

    Comments: Accepted to CVPR 2021. Project page: https://mml.stanford.edu/MDN/

  40. TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild

    Authors: Vida Adeli, Mahsa Ehsanpour, Ian Reid, Juan Carlos Niebles, Silvio Savarese, Ehsan Adeli, Hamid Rezatofighi

    Abstract: Joint forecasting of human trajectory and pose dynamics is a fundamental building block of various applications ranging from robotics and autonomous driving to surveillance systems. Predicting body dynamics requires capturing subtle information embedded in the humans' interactions with each other and with the objects present in the scene. In this paper, we propose a novel TRajectory and POse Dynam…

    Submitted 27 August, 2021; v1 submitted 8 April, 2021; originally announced April 2021.

    Journal ref: IEEE/CVF International Conference on Computer Vision, pp. 13390-13400. 2021

  41. arXiv:2103.03840

    cs.CV

    Self-Supervised Longitudinal Neighbourhood Embedding

    Authors: Jiahong Ouyang, Qingyu Zhao, Ehsan Adeli, Edith V Sullivan, Adolf Pfefferbaum, Greg Zaharchuk, Kilian M Pohl

    Abstract: Longitudinal MRIs are often used to capture the gradual deterioration of brain structure and function caused by aging or neurological diseases. Analyzing this data via machine learning generally requires a large number of ground-truth labels, which are often missing or expensive to obtain. Reducing the need for labels, we propose a self-supervised strategy for representation learning named Longitu…

    Submitted 17 June, 2021; v1 submitted 5 March, 2021; originally announced March 2021.

    Comments: Provisionally accepted by Medical Image Computing and Computer Assisted Intervention (MICCAI) 2021

  42. arXiv:2102.11456

    cs.CV

    Representation Disentanglement for Multi-modal brain MR Analysis

    Authors: Jiahong Ouyang, Ehsan Adeli, Kilian M. Pohl, Qingyu Zhao, Greg Zaharchuk

    Abstract: Multi-modal MRIs are widely used in neuroimaging applications since different MR sequences provide complementary information about brain structures. Recent works have suggested that multi-modal deep learning analysis can benefit from explicitly disentangling anatomical (shape) and modality (appearance) information into separate image presentations. In this work, we challenge mainstream strategies…

    Submitted 11 June, 2021; v1 submitted 22 February, 2021; originally announced February 2021.

    Comments: Accepted by Information Processing in Medical Imaging (IPMI) 2021

  43. arXiv:2102.08239

    eess.IV cs.CV cs.LG

    Going Beyond Saliency Maps: Training Deep Models to Interpret Deep Models

    Authors: Zixuan Liu, Ehsan Adeli, Kilian M. Pohl, Qingyu Zhao

    Abstract: Interpretability is a critical factor in applying complex deep learning models to advance the understanding of brain disorders in neuroimaging studies. To interpret the decision process of a trained classifier, existing techniques typically rely on saliency maps to quantify the voxel-wise or feature-level importance for classification through partial derivatives. Despite providing some level of lo…

    Submitted 25 June, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

  44. arXiv:2102.04306

    cs.CV

    TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

    Authors: Jieneng Chen, Yongyi Lu, Qihang Yu, Xiangde Luo, Ehsan Adeli, Yan Wang, Le Lu, Alan L. Yuille, Yuyin Zhou

    Abstract: Medical image segmentation is an essential prerequisite for developing healthcare systems, especially for disease diagnosis and treatment planning. On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard and achieved tremendous success. However, due to the intrinsic locality of convolution operations, U-Net generally demonstrate…

    Submitted 8 February, 2021; originally announced February 2021.

    Comments: 13 pages, 3 figures

  45. arXiv:2101.04793

    eess.IV cs.CV

    Generative Adversarial U-Net for Domain-free Medical Image Augmentation

    Authors: Xiaocong Chen, Yun Li, Lina Yao, Ehsan Adeli, Yu Zhang

    Abstract: The shortage of annotated medical images is one of the biggest challenges in the field of medical image computing. Without a sufficient number of training samples, deep learning based models are very likely to suffer from the over-fitting problem. The common solution is image manipulation such as image rotation, cropping, or resizing. Those methods can help relieve the over-fitting problem as more tra…

    Submitted 12 January, 2021; originally announced January 2021.

  46. arXiv:2011.08652

    cs.CV

    3D CNNs with Adaptive Temporal Feature Resolutions

    Authors: Mohsen Fayyaz, Emad Bahrami, Ali Diba, Mehdi Noroozi, Ehsan Adeli, Luc Van Gool, Juergen Gall

    Abstract: While state-of-the-art 3D Convolutional Neural Networks (CNN) achieve very good results on action recognition datasets, they are computationally very expensive and require many GFLOPs. While the GFLOPs of a 3D CNN can be decreased by reducing the temporal feature resolution within the network, there is no setting that is optimal for all input clips. In this work, we therefore introduce a different…

    Submitted 11 August, 2021; v1 submitted 17 November, 2020; originally announced November 2020.

    Comments: CVPR 2021

  47. arXiv:2007.08920

    cs.CV cs.LG eess.IV

    Vision-based Estimation of MDS-UPDRS Gait Scores for Assessing Parkinson's Disease Motor Severity

    Authors: Mandy Lu, Kathleen Poston, Adolf Pfefferbaum, Edith V. Sullivan, Li Fei-Fei, Kilian M. Pohl, Juan Carlos Niebles, Ehsan Adeli

    Abstract: Parkinson's disease (PD) is a progressive neurological disorder primarily affecting motor function resulting in tremor at rest, rigidity, bradykinesia, and postural instability. The physical severity of PD impairments can be quantified through the Movement Disorder Society Unified Parkinson's Disease Rating Scale (MDS-UPDRS), a widely used clinical rating scale. Accurate and quantitative assessmen…

    Submitted 17 July, 2020; originally announced July 2020.

    Comments: Accepted as a conference paper at MICCAI (Medical Image Computing and Computer Assisted Intervention), Lima, Peru, October 2020. 11 pages, LaTeX

  48. arXiv:2007.06843

    cs.CV

    Socially and Contextually Aware Human Motion and Pose Forecasting

    Authors: Vida Adeli, Ehsan Adeli, Ian Reid, Juan Carlos Niebles, Hamid Rezatofighi

    Abstract: Smooth and seamless robot navigation while interacting with humans depends on predicting human movements. Forecasting such human dynamics often involves modeling human trajectories (global motion) or detailed body joint movements (local motion). Prior work typically tackled local and global human movements separately. In this paper, we propose a novel framework to tackle both tasks of human motion…

    Submitted 14 July, 2020; originally announced July 2020.

    Comments: Accepted in RA-L and IROS

  49. arXiv:2006.06930

    cs.LG cs.CV stat.ML

    Longitudinal Self-Supervised Learning

    Authors: Qingyu Zhao, Zixuan Liu, Ehsan Adeli, Kilian M. Pohl

    Abstract: Machine learning analysis of longitudinal neuroimaging data is typically based on supervised learning, which requires a large number of ground-truth labels to be informative. As ground-truth labels are often missing or expensive to obtain in neuroscience, we avoid them in our analysis by combining factor disentanglement with self-supervised learning to identify changes and consistencies across the m…

    Submitted 26 June, 2021; v1 submitted 11 June, 2020; originally announced June 2020.

  50. arXiv:2005.07462

    eess.IV cs.CV cs.LG

    MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT Prostate Segmentation via Online Sampling

    Authors: Kelei He, Chunfeng Lian, Ehsan Adeli, Jing Huo, Yang Gao, Bing Zhang, Junfeng Zhang, Dinggang Shen

    Abstract: Fully convolutional networks (FCNs), including UNet and VNet, are widely-used network architectures for semantic segmentation in recent studies. However, conventional FCN is typically trained by the cross-entropy or Dice loss, which only calculates the error between predictions and ground-truth labels for pixels individually. This often results in non-smooth neighborhoods in the predicted segmenta…

    Submitted 23 January, 2021; v1 submitted 15 May, 2020; originally announced May 2020.