arXiv:2406.19225 [pdf, other]

ProtoGMM: Multi-prototype Gaussian-Mixture-based Domain Adaptation Model for Semantic Segmentation

Authors: Nazanin Moradinasab, Laura S. Shankman, Rebecca A. Deaton, Gary K. Owens, Donald E. Brown

Abstract: Domain adaptive semantic segmentation aims to generate accurate and dense predictions for an unlabeled target domain by leveraging a supervised model trained on a labeled source domain. The prevalent self-training approach involves retraining the dense discriminative classifier of $p(class|pixel feature)$ using the pseudo-labels from the target domain. While many methods focus on mitigating the issue of noisy pseudo-labels, they often overlook the underlying data distribution $p(pixel feature|class)$ in both the source and target domains. To address this limitation, we propose the multi-prototype Gaussian-Mixture-based (ProtoGMM) model, which incorporates the GMM into contrastive losses to perform guided contrastive learning. Contrastive losses are commonly executed in the literature using memory banks, which can lead to class biases due to underrepresented classes. Furthermore, memory banks often have fixed capacities, potentially restricting the model's ability to capture diverse representations of the target/source domains. An alternative approach is to use global class prototypes (i.e. averaged features per category). However, the global prototypes are based on the unimodal distribution assumption per class, disregarding within-class variation. To address these challenges, we propose the ProtoGMM model. This novel approach involves estimating the underlying multi-prototype source distribution by utilizing the GMM on the feature space of the source samples. The components of the GMM model act as representative prototypes. To achieve increased intra-class semantic similarity, decreased inter-class similarity, and domain alignment between the source and target domains, we employ multi-prototype contrastive learning between source distribution and target samples. The experiments show the effectiveness of our method on UDA benchmarks.

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2403.19831 [pdf, other]

TASR: A Novel Trust-Aware Stackelberg Routing Algorithm to Mitigate Traffic Congestion

Authors: Doris E. M. Brown, Venkata Sriram Siddhardh Nadendla, Sajal K. Das

Abstract: Stackelberg routing platforms (SRP) reduce congestion in one-shot traffic networks by proposing optimal route recommendations to selfish travelers. Traditionally, Stackelberg routing is cast as a partial control problem where a fraction of traveler flow complies with route recommendations, while the remaining respond as selfish travelers. In this paper, a novel Stackelberg routing framework is formulated where the agents exhibit probabilistic compliance by accepting SRP's route recommendations with a trust probability. A greedy Trust-Aware Stackelberg Routing algorithm (in short, TASR) is proposed for SRP to compute unique path recommendations to each traveler flow with a unique demand. Simulation experiments are designed with random travel demands with diverse trust values on real road networks such as Sioux Falls, Chicago Sketch, and Sydney networks for both single-commodity and multi-commodity flows. The performance of TASR is compared with state-of-the-art Stackelberg routing methods in terms of traffic congestion and trust dynamics over repeated interaction between the SRP and the travelers. Results show that TASR improves network congestion without causing a significant reduction in trust towards the SRP, when compared to most well-known Stackelberg routing strategies.

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2312.01435 [pdf, other]

Automatic Report Generation for Histopathology images using pre-trained Vision Transformers and BERT

Authors: Saurav Sengupta, Donald E. Brown

Abstract: Deep learning for histopathology has been successfully used for disease classification, image segmentation and more. However, combining image and text modalities using current state-of-the-art (SOTA) methods has been a challenge due to the high resolution of histopathology images. Automatic report generation for histopathology images is one such challenge. In this work, we show that using an existing pre-trained Vision Transformer (ViT) to encode 4096x4096 sized patches of the Whole Slide Image (WSI) and a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model for language modeling-based decoder for report generation, we can build a performant and portable report generation mechanism that takes into account the whole high resolution image. Our method allows us to not only generate and evaluate captions that describe the image, but also helps us classify the image into tissue types and the gender of the patient as well. Our best performing model achieves an 89.52% accuracy in Tissue Type classification with a BLEU-4 score of 0.12 in our caption generation task.

Submitted 15 March, 2024; v1 submitted 3 December, 2023; originally announced December 2023.

Comments: Accepted at IEEE ISBI 2024. arXiv admin note: substantial text overlap with arXiv:2311.06176

arXiv:2311.06176 [pdf, other]

Automatic Report Generation for Histopathology images using pre-trained Vision Transformers

Authors: Saurav Sengupta, Donald E. Brown

Abstract: Deep learning for histopathology has been successfully used for disease classification, image segmentation and more. However, combining image and text modalities using current state-of-the-art methods has been a challenge due to the high resolution of histopathology images. Automatic report generation for histopathology images is one such challenge. In this work, we show that using an existing pre-trained Vision Transformer in a two-step process of first using it to encode 4096x4096 sized patches of the Whole Slide Image (WSI) and then using it as the encoder and an LSTM decoder for report generation, we can build a fairly performant and portable report generation mechanism that takes into account the whole of the high resolution image, instead of just the patches. We are also able to use representations from an existing powerful pre-trained hierarchical vision transformer and show its usefulness in not just zero shot classification but also for report generation.

Submitted 13 November, 2023; v1 submitted 10 November, 2023; originally announced November 2023.

Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 9 pages

arXiv:2309.03744 [pdf, other]

Label-efficient Contrastive Learning-based model for nuclei detection and classification in 3D Cardiovascular Immunofluorescent Images

Authors: Nazanin Moradinasab, Rebecca A. Deaton, Laura S. Shankman, Gary K. Owens, Donald E. Brown

Abstract: Recently, deep learning-based methods achieved promising performance in nuclei detection and classification applications. However, training deep learning-based methods requires a large amount of pixel-wise annotated data, which is time-consuming and labor-intensive, especially in 3D images. An alternative approach is to adapt weak-annotation methods, such as labeling each nucleus with a point, but this method does not extend from 2D histopathology images (for which it was originally developed) to 3D immunofluorescent images. The reason is that 3D images contain multiple channels (z-axis) for nuclei and different markers separately, which makes training using point annotations difficult. To address this challenge, we propose the Label-efficient Contrastive learning-based (LECL) model to detect and classify various types of nuclei in 3D immunofluorescent images. Previous methods use Maximum Intensity Projection (MIP) to convert immunofluorescent images with multiple slices to 2D images, which can cause signals from different z-stacks to falsely appear associated with each other. To overcome this, we devised an Extended Maximum Intensity Projection (EMIP) approach that addresses issues using MIP. Furthermore, we performed a Supervised Contrastive Learning (SCL) approach for weakly supervised settings. We conducted experiments on cardiovascular datasets and found that our proposed framework is effective and efficient in detecting and classifying various types of nuclei in 3D immunofluorescent images.

Submitted 14 January, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

Comments: 11 pages, 5 figures, MICCAI Workshop Conference 2023

arXiv:2210.02490 [pdf, other]

doi 10.1109/BIBM55620.2022.9994851

Analyzing historical diagnosis code data from NIH N3C and RECOVER Programs using deep learning to determine risk factors for Long Covid

Authors: Saurav Sengupta, Johanna Loomba, Suchetha Sharma, Donald E. Brown, Lorna Thorpe, Melissa A Haendel, Christopher G Chute, Stephanie Hong

Abstract: Post-acute sequelae of SARS-CoV-2 infection (PASC) or Long COVID is an emerging medical condition that has been observed in several patients with a positive diagnosis for COVID-19. Historical Electronic Health Records (EHR) like diagnosis codes, lab results and clinical notes have been analyzed using deep learning and have been used to predict future clinical events. In this paper, we propose an interpretable deep learning approach to analyze historical diagnosis code data from the National COVID Cohort Collective (N3C) to find the risk factors contributing to developing Long COVID. Using our deep learning approach, we are able to predict if a patient is suffering from Long COVID from a temporally ordered list of diagnosis codes up to 45 days post the first COVID positive test or diagnosis for each patient, with an accuracy of 70.48%. We are then able to examine the trained model using Gradient-weighted Class Activation Mapping (GradCAM) to give each input diagnosis a score. The highest-scoring diagnoses were deemed to be the most important for making the correct prediction for a patient. We also propose a way to summarize these top diagnoses for each patient in our cohort and look at their temporal trends to determine which codes contribute towards a positive Long COVID diagnosis.

Submitted 5 October, 2022; originally announced October 2022.

arXiv:2208.00098 [pdf, other]

Weakly Supervised Deep Instance Nuclei Detection using Points Annotation in 3D Cardiovascular Immunofluorescent Images

Authors: Nazanin Moradinasab, Yash Sharma, Laura S. Shankman, Gary K. Owens, Donald E. Brown

Abstract: Two major causes of death in the United States and worldwide are stroke and myocardial infarction. The underlying cause of both is thrombi released from ruptured or eroded unstable atherosclerotic plaques that occlude vessels in the heart (myocardial infarction) or the brain (stroke). Clinical studies show that plaque composition plays a more important role than lesion size in plaque rupture or erosion events. To determine the plaque composition, various cell types in 3D cardiovascular immunofluorescent images of plaque lesions are counted. However, counting these cells manually is expensive, time-consuming, and prone to human error. These challenges of manual counting motivate the need for an automated approach to localize and count the cells in images. The purpose of this study is to develop an automatic approach to accurately detect and count cells in 3D immunofluorescent images with minimal annotation effort. In this study, we used a weakly supervised learning approach to train the HoVer-Net segmentation model using point annotations to detect nuclei in fluorescent images. The advantage of using point annotations is that they require less effort as opposed to pixel-wise annotation. To train the HoVer-Net model using point annotations, we adopted a popularly used cluster labeling approach to transform point annotations into accurate binary masks of cell nuclei. Traditionally, these approaches have generated binary masks from point annotations, leaving a region around the object unlabeled (which is typically ignored during model training). However, these areas may contain important information that helps determine the boundary between cells. Therefore, we used the entropy minimization loss function in these areas to encourage the model to output more confident predictions on the unlabeled areas. Our comparison studies indicate that the HoVer-Net model trained using our weakly ...

Submitted 29 July, 2022; originally announced August 2022.

arXiv:2206.14437 [pdf, other]

MaNi: Maximizing Mutual Information for Nuclei Cross-Domain Unsupervised Segmentation

Authors: Yash Sharma, Sana Syed, Donald E. Brown

Abstract: In this work, we propose a mutual information (MI) based unsupervised domain adaptation (UDA) method for the cross-domain nuclei segmentation. Nuclei vary substantially in structure and appearances across different cancer types, leading to a drop in performance of deep learning models when trained on one cancer type and tested on another. This domain shift becomes even more critical as accurate segmentation and quantification of nuclei is an essential histopathology task for the diagnosis/prognosis of patients and annotating nuclei at the pixel level for new cancer types demands extensive effort by medical experts. To address this problem, we maximize the MI between labeled source cancer type data and unlabeled target cancer type data for transferring nuclei segmentation knowledge across domains. We use the Jensen-Shannon divergence bound, requiring only one negative pair per positive pair for MI maximization. We evaluate our set-up for multiple modeling frameworks and on different datasets comprising over 20 cancer-type domain shifts and demonstrate competitive performance. All the recently proposed approaches consist of multiple components for improving the domain adaptation, whereas our proposed module is light and can be easily incorporated into other methods (Implementation: https://github.com/YashSharma/MaNi).

Submitted 29 June, 2022; originally announced June 2022.

Comments: Accepted at MICCAI 2022

arXiv:2204.12432 [pdf, other]

Encoding Cardiopulmonary Exercise Testing Time Series as Images for Classification using Convolutional Neural Network

Authors: Yash Sharma, Nick Coronato, Donald E. Brown

Abstract: Exercise testing has been available for more than a half-century and is a remarkably versatile tool for diagnostic and prognostic information of patients for a range of diseases, especially cardiovascular and pulmonary. With rapid advancements in technology, wearables, and learning algorithms in the last decade, its scope has evolved. Specifically, Cardiopulmonary exercise testing (CPX) is one of the most commonly used laboratory tests for objective evaluation of exercise capacity and performance levels in patients. CPX provides a non-invasive, integrative assessment of the pulmonary, cardiovascular, and skeletal muscle systems involving the measurement of gas exchanges. However, its assessment is challenging, requiring the individual to process multiple time series data points, leading to simplification to peak values and slopes. But this simplification can discard the valuable trend information present in these time series. In this work, we encode the time series as images using the Gramian Angular Field and Markov Transition Field and use them with a convolutional neural network and attention pooling approach for the classification of heart failure and metabolic syndrome patients. Using GradCAMs, we highlight the discriminative features identified by the model.

Submitted 26 April, 2022; originally announced April 2022.

Comments: Accepted in NeurIPS 2021 - MLPH Workshop; EMBC 2022. Code: https://github.com/YashSharma/MultivariateTimeSeries

arXiv:2112.01537 [pdf, other]

Improving mathematical questioning in teacher training

Authors: Debajyoti Datta, Maria Phillips, James P Bywater, Jennifer Chiu, Ginger S. Watson, Laura E. Barnes, Donald E Brown

Abstract: High-fidelity, AI-based simulated classroom systems enable teachers to rehearse effective teaching strategies. However, dialogue-oriented open-ended conversations such as teaching a student about scale factors can be difficult to model. This paper builds a text-based interactive conversational agent to help teachers practice mathematical questioning skills based on the well-known Instructional Quality Assessment. We take a human-centered approach to designing our system, relying on advances in deep learning, uncertainty quantification, and natural language processing while acknowledging the limitations of conversational agents for specific pedagogical needs. Using experts' input directly during the simulation, we demonstrate how conversation success rate and high user satisfaction can be achieved.

Submitted 6 December, 2021; v1 submitted 2 December, 2021; originally announced December 2021.

Comments: Accepted to appear at the NeurIPS 2021 Human Centered AI Workshop (HCAI). Data collection process for this data is described here arXiv:2112.00985

arXiv:2112.00985 [pdf, other]

Evaluation of mathematical questioning strategies using data collected through weak supervision

Authors: Debajyoti Datta, Maria Phillips, James P Bywater, Jennifer Chiu, Ginger S. Watson, Laura E. Barnes, Donald E Brown

Abstract: A large body of research demonstrates how teachers' questioning strategies can improve student learning outcomes. However, developing new scenarios is challenging because of the lack of training data for a specific scenario and the costs associated with labeling. This paper presents a high-fidelity, AI-based classroom simulator to help teachers rehearse research-based mathematical questioning skills. Using a human-in-the-loop approach, we collected a high-quality training dataset for a mathematical questioning scenario. Using recent advances in uncertainty quantification, we evaluated our conversational agent for usability and analyzed the practicality of incorporating a human-in-the-loop approach for data collection and system evaluation for a mathematical questioning scenario.

Submitted 2 December, 2021; originally announced December 2021.

Comments: Accepted to appear at the NeurIPS 2021 Workshop on Math AI for Education (MATHAI4ED)

arXiv:2106.11077 [pdf, other]

Toward a Knowledge Discovery Framework for Data Science Job Market in the United States

Authors: Mojtaba Heidarysafa, Kamran Kowsari, Masoud Bashiri, Donald E. Brown

Abstract: The growth of the data science field requires better tools to understand such a fast-paced growing domain. Moreover, individuals from different backgrounds became interested in following a career as data scientists. Therefore, providing a quantitative guide for individuals and organizations to understand the skills required in the job market would be crucial. This paper introduces a framework to analyze the job market for data science-related jobs within the US while providing an interface to access insights in this market. The proposed framework includes three sub-modules allowing continuous data collection, information extraction, and a web-based dashboard visualization to investigate the spatial and temporal distribution of data science-related jobs and skills. The result of this work shows important skills for the main branches of data science jobs and attempts to provide a skill-based definition of these data science branches. The current version of this application is deployed on the web and allows individuals and institutes to investigate skills required for data science positions through the industry lens.

Submitted 20 July, 2021; v1 submitted 14 June, 2021; originally announced June 2021.

arXiv:2106.07068 [pdf, other]

HistoTransfer: Understanding Transfer Learning for Histopathology

Authors: Yash Sharma, Lubaina Ehsan, Sana Syed, Donald E. Brown

Abstract: Advancement in digital pathology and artificial intelligence has enabled deep learning-based computer vision techniques for automated disease diagnosis and prognosis. However, WSIs present unique computational and algorithmic challenges. WSIs are gigapixel-sized, making them infeasible to be used directly for training deep neural networks. Hence, for modeling, a two-stage approach is adopted: Patch representations are extracted first, followed by the aggregation for WSI prediction. These approaches require detailed pixel-level annotations for training the patch encoder. However, obtaining these annotations is time-consuming and tedious for medical experts. Transfer learning is used to address this gap and deep learning architectures pre-trained on ImageNet are used for generating patch-level representation. Even though ImageNet differs significantly from histopathology data, pre-trained networks have been shown to perform impressively on histopathology data. Also, progress in self-supervised and multi-task learning coupled with the release of multiple histopathology data has led to the release of histopathology-specific networks. In this work, we compare the performance of features extracted from networks trained on ImageNet and histopathology data. We use an attention pooling network over these extracted features for slide-level aggregation. We investigate if features learned using more complex networks lead to gain in performance. We use a simple top-k sampling approach for fine-tuning framework and study the representation similarity between frozen and fine-tuned networks using Centered Kernel Alignment. Further, to examine if intermediate block representation is better suited for feature extraction and ImageNet architectures are unnecessarily large for histopathology, we truncate the blocks of ResNet18 and DenseNet121 and examine the performance.

Submitted 13 June, 2021; originally announced June 2021.

Comments: Accepted at IEEE International Conference on Biomedical and Health Informatics (BHI'21). arXiv admin note: text overlap with arXiv:2103.10626

arXiv:2103.10626 [pdf, other]

Cluster-to-Conquer: A Framework for End-to-End Multi-Instance Learning for Whole Slide Image Classification

Authors: Yash Sharma, Aman Shrivastava, Lubaina Ehsan, Christopher A. Moskaluk, Sana Syed, Donald E. Brown

Abstract: In recent years, the availability of digitized Whole Slide Images (WSIs) has enabled the use of deep learning-based computer vision techniques for automated disease diagnosis. However, WSIs present unique computational and algorithmic challenges. WSIs are gigapixel-sized ($\sim$100K pixels), making them infeasible to be used directly for training deep neural networks. Also, often only slide-level labels are available for training as detailed annotations are tedious and can be time-consuming for experts. Approaches using multiple-instance learning (MIL) frameworks have been shown to overcome these challenges. Current state-of-the-art approaches divide the learning framework into two decoupled parts: a convolutional neural network (CNN) for encoding the patches followed by an independent aggregation approach for slide-level prediction. In this approach, the aggregation step has no bearing on the representations learned by the CNN encoder. We have proposed an end-to-end framework that clusters the patches from a WSI into $k$ groups, samples $k'$ patches from each group for training, and uses an adaptive attention mechanism for slide level prediction; Cluster-to-Conquer (C2C). We have demonstrated that dividing a WSI into clusters can improve the model training by exposing it to diverse discriminative features extracted from the patches. We regularized the clustering mechanism by introducing a KL-divergence loss between the attention weights of patches in a cluster and the uniform distribution. The framework is optimized end-to-end on slide-level cross-entropy, patch-level cross-entropy, and KL-divergence loss (Implementation: https://github.com/YashSharma/C2C).

Submitted 13 June, 2021; v1 submitted 19 March, 2021; originally announced March 2021.

Comments: Accepted at MIDL, 2021 - https://openreview.net/forum?id=7i1-2oKIELU

arXiv:2101.05326 [pdf, other]

Advancing Eosinophilic Esophagitis Diagnosis and Phenotype Assessment with Deep Learning Computer Vision

Authors: William Adorno III, Alexis Catalano, Lubaina Ehsan, Hans Vitzhum von Eckstaedt, Barrett Barnes, Emily McGowan, Sana Syed, Donald E. Brown

Abstract: Eosinophilic Esophagitis (EoE) is an inflammatory esophageal disease which is increasing in prevalence. The diagnostic gold-standard involves manual review of a patient's biopsy tissue sample by a clinical pathologist for the presence of 15 or greater eosinophils within a single high-power field (400x magnification). Diagnosing EoE can be a cumbersome process with added difficulty for assessing the severity and progression of disease. We propose an automated approach for quantifying eosinophils using deep image segmentation. A U-Net model and post-processing system are applied to generate eosinophil-based statistics that can diagnose EoE as well as describe disease severity and progression. These statistics are captured in biopsies at the initial EoE diagnosis and are then compared with patient metadata: clinical and treatment phenotypes. The goal is to find linkages that could potentially guide treatment plans for new patients at their initial disease diagnosis. A deep image classification model is further applied to discover features other than eosinophils that can be used to diagnose EoE. This is the first study to utilize a deep learning computer vision approach for EoE diagnosis and to provide an automated process for tracking disease severity and progression.

Submitted 13 January, 2021; originally announced January 2021.

Comments: This paper contains 12 pages, 9 figures, and 7 tables

arXiv:2005.05516 [pdf, other]

Framing Effects on Strategic Information Design under Receiver Distrust and Unknown State

Authors: Doris E. M. Brown, Venkata Sriram Siddhardh Nadendla

Abstract: Strategic information design is a framework where a sender designs information strategically to steer its receiver's decision towards a desired choice. Traditionally, such frameworks have always assumed that the sender and the receiver comprehend the state of the choice environment, and that the receiver always trusts the sender's signal. This paper deviates from these assumptions and re-investigates strategic information design in the presence of a distrustful receiver and when both sender and receiver cannot observe/comprehend the environment state space. Specifically, we assume that both sender and receiver have access to non-identical beliefs about choice rewards (with the sender's belief being more accurate), but not the environment state that determines these rewards. Furthermore, given that the receiver does not trust the sender, we also assume that the receiver updates its prior in a non-Bayesian manner. We evaluate the Stackelberg equilibrium and investigate effects of information framing (i.e. send complete signal, or just expected value of the signal) on the equilibrium. Furthermore, we also investigate trust dynamics at the receiver, under the assumption that the receiver minimizes regret in hindsight. Simulation results are presented to illustrate signaling effects and trust dynamics in strategic information design.

Submitted 21 July, 2021; v1 submitted 11 May, 2020; originally announced May 2020.

Comments: 12 pages, 5 figures; This is a working draft, and can potentially have errors. Any feedback will be greatly appreciated, and will be acknowledged in the subsequent versions

arXiv:2005.03868 [pdf, other]

Hierarchical Deep Convolutional Neural Networks for Multi-category Diagnosis of Gastrointestinal Disorders on Histopathological Images

Authors: Rasoul Sali, Sodiq Adewole, Lubaina Ehsan, Lee A. Denson, Paul Kelly, Beatrice C. Amadi, Lori Holtz, Syed Asad Ali, Sean R. Moore, Sana Syed, Donald E. Brown

Abstract: Deep convolutional neural networks (CNNs) have been successful for a wide range of computer vision tasks, including image classification. A specific area of the application lies in digital pathology for pattern recognition in the tissue-based diagnosis of gastrointestinal (GI) diseases. This domain can utilize CNNs to translate histopathological images into precise diagnostics. This is challenging since these complex biopsies are heterogeneous and require multiple levels of assessment. This is mainly due to structural similarities in different parts of the GI tract and shared features among different gut diseases. Addressing this problem with a flat model that assumes all classes (parts of the gut and their diseases) are equally difficult to distinguish leads to an inadequate assessment of each class. Since the hierarchical model restricts classification error to each sub-class, it leads to a more informative model than a flat model. In this paper, we propose to apply the hierarchical classification of biopsy images from different parts of the GI tract and the receptive diseases within each. We embedded a class hierarchy into the plain VGGNet to take advantage of its layers' hierarchical structure. The proposed model was evaluated using an independent set of image patches from 373 whole slide images. The results indicate that the hierarchical model can achieve better results than the flat model for multi-category diagnosis of GI disorders using histopathological images.

Submitted 6 August, 2020; v1 submitted 8 May, 2020; originally announced May 2020.

Comments: accepted at IEEE International Conference on Healthcare Informatics (ICHI 2020)

arXiv:2004.06518 [pdf, other]

doi 10.1007/978-3-030-63128-4_26

Gender Detection on Social Networks using Ensemble Deep Learning

Authors: Kamran Kowsari, Mojtaba Heidarysafa, Tolu Odukoya, Philip Potter, Laura E. Barnes, Donald E. Brown

Abstract: Analyzing the ever-increasing volume of posts on social media sites such as Facebook and Twitter requires improved information processing methods for profiling authorship. Document classification is central to this task, but the performance of traditional supervised classifiers has degraded as the volume of social media has increased. This paper addresses this problem in the context of gender detection through ensemble classification that employs multi-model deep learning architectures to generate specialized understanding from different feature spaces.

Submitted 9 September, 2020; v1 submitted 13 April, 2020; originally announced April 2020.

arXiv:2002.00223 [pdf, other]

Dialogue-Based Simulation For Cultural Awareness Training

Authors: Sodiq Adewole, Erfaneh Gharavi, Benjamin Shpringer, Martin Bolger, Vaibhav Sharma, Sung Ming Yang, Donald E. Brown

Abstract: Existing simulations designed for cultural and interpersonal skill training rely on pre-defined responses with a menu option selection interface. Using a multiple-choice interface and restricting trainees' responses may limit the trainees' ability to apply the lessons in real life situations. These systems also use a simplistic evaluation model, where trainees' selected options are marked as either correct or incorrect. This model may not capture sufficient information that could drive an adaptive feedback mechanism to improve trainees' cultural awareness. This paper describes the design of a dialogue-based simulation for cultural awareness training. The simulation is built around a disaster management scenario involving a joint coalition between the US and the Chinese armies. Trainees were able to engage in realistic dialogue with the Chinese agent. Their responses, at different points, get evaluated by different multi-label classification models. Based on training on our dataset, the models score the trainees' responses for cultural awareness in the Chinese culture. Trainees also get feedback that informs the cultural appropriateness of their responses.
The result of this work showed the following: i) A feature-based evaluation model improves the design, modeling and computation of dialogue-based training simulation systems; ii) Output from current automatic speech recognition (ASR) systems gave comparable end results compared with the output from manual transcription; iii) A multi-label classification model trained as a cultural expert gave results which were comparable with scores assigned by human annotators.

Submitted 18 October, 2021; v1 submitted 1 February, 2020; originally announced February 2020.

arXiv:1912.03804 [pdf, other]

Women in ISIS Propaganda: A Natural Language Processing Analysis of Topics and Emotions in a Comparison with Mainstream Religious Group

Authors: Mojtaba Heidarysafa, Kamran Kowsari, Tolu Odukoya, Philip Potter, Laura E. Barnes, Donald E. Brown

Abstract: Online propaganda is central to the recruitment strategies of extremist groups and in recent years these efforts have increasingly extended to women. To investigate ISIS' approach to targeting women in their online propaganda and uncover implications for counterterrorism, we rely on text mining and natural language processing (NLP). Specifically, we extract articles published in Dabiq and Rumiyah (ISIS's online English language publications) to identify prominent topics. To identify similarities or differences between these texts and those produced by non-violent religious groups, we extend the analysis to articles from a Catholic forum dedicated to women. We also perform an emotional analysis of both of these resources to better understand the emotional components of propaganda. We rely on Depechemood (a lexical-base emotion analysis method) to detect emotions most likely to be evoked in readers of these materials. The findings indicate that the emotional appeal of ISIS and Catholic materials are similar

Submitted 8 December, 2019; originally announced December 2019.

arXiv:1910.03084 [pdf, other]

CeliacNet: Celiac Disease Severity Diagnosis on Duodenal Histopathological Images Using Deep Residual Networks

Authors: Rasoul Sali, Lubaina Ehsan, Kamran Kowsari, Marium Khan, Christopher A. Moskaluk, Sana Syed, Donald E. Brown

Abstract: Celiac Disease (CD) is a chronic autoimmune disease that affects the small intestine in genetically predisposed children and adults. Gluten exposure triggers an inflammatory cascade which leads to compromised intestinal barrier function. If this enteropathy is unrecognized, this can lead to anemia, decreased bone density, and, in longstanding cases, intestinal cancer. The prevalence of the disorder is 1% in the United States. An intestinal (duodenal) biopsy is considered the "gold standard" for diagnosis. Mild CD might go unnoticed due to non-specific clinical symptoms or mild histologic features. In our current work, we trained a model based on deep residual networks to diagnose CD severity using a histological scoring system called the modified Marsh score. The proposed model was evaluated using an independent set of 120 whole slide images from 15 CD patients and achieved an AUC greater than 0.96 in all classes. These results demonstrate the diagnostic power of the proposed model for CD severity classification using histological images.

Submitted 7 October, 2019; originally announced October 2019.

Comments: accepted at IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM 2019)

arXiv:1909.01963 [pdf, other]

Self-Attentive Adversarial Stain Normalization

Authors: Aman Shrivastava, Will Adorno, Yash Sharma, Lubaina Ehsan, S. Asad Ali, Sean R. Moore, Beatrice C. Amadi, Paul Kelly, Sana Syed, Donald E. Brown

Abstract: Hematoxylin and Eosin (H&E) stained Whole Slide Images (WSIs) are utilized for biopsy visualization-based diagnostic and prognostic assessment of diseases. Variation in the H&E staining process across different lab sites can lead to significant variations in biopsy image appearance. These variations introduce an undesirable bias when the slides are examined by pathologists or used for training deep learning models. To reduce this bias, slides need to be translated to a common domain of stain appearance before analysis. We propose a Self-Attentive Adversarial Stain Normalization (SAASN) approach for the normalization of multiple stain appearances to a common domain. This unsupervised generative adversarial approach includes a self-attention mechanism for synthesizing images with finer detail while preserving the structural consistency of the biopsy features during translation. SAASN demonstrates consistent and superior performance compared to other popular stain normalization techniques on H&E stained duodenal biopsy image data.

Submitted 22 November, 2020; v1 submitted 4 September, 2019; originally announced September 2019.

Comments: Accepted at AIDP (ICPR 2021)

arXiv:1908.03272 [pdf, other]

Deep Learning for Visual Recognition of Environmental Enteropathy and Celiac Disease

Authors: Aman Shrivastava, Karan Kant, Saurav Sengupta, Sung-Jun Kang, Marium Khan, Asad Ali, Sean R. Moore, Beatrice C. Amadi, Paul Kelly, Donald E. Brown, Sana Syed

Abstract: Physicians use biopsies to distinguish between different but histologically similar enteropathies. The range of syndromes and pathologies that could cause different gastrointestinal conditions makes this a difficult problem. Recently, deep learning has been used successfully in helping diagnose cancerous tissues in histopathological images. These successes motivated the research presented in this paper, which describes a deep learning approach that distinguishes between Celiac Disease (CD) and Environmental Enteropathy (EE) and normal tissue from digitized duodenal biopsies. Experimental results show accuracies of over 90% for this approach. We also look into interpreting the neural network model using Gradient-weighted Class Activation Mappings and filter activations on input images to understand the visual explanations for the decisions made by the model.

Submitted 8 August, 2019; originally announced August 2019.

arXiv:1904.08067 [pdf, other]

doi 10.3390/info10040150

Text Classification Algorithms: A Survey

Authors: Kamran Kowsari, Kiana Jafari Meimandi, Mojtaba Heidarysafa, Sanjana Mendu, Laura E. Barnes, Donald E. Brown

Abstract: In recent years, there has been an exponential growth in the number of complex documents and texts that require a deeper understanding of machine learning methods to be able to accurately classify texts in many applications. Many machine learning approaches have achieved surpassing results in natural language processing. The success of these learning algorithms relies on their capacity to understand complex models and non-linear relationships within data. However, finding suitable structures, architectures, and techniques for text classification is a challenge for researchers. In this paper, a brief overview of text classification algorithms is discussed. This overview covers different text feature extractions, dimensionality reduction methods, existing algorithms and techniques, and evaluation methods. Finally, the limitations of each technique and their application in real-world problems are discussed.

Submitted 20 May, 2020; v1 submitted 16 April, 2019; originally announced April 2019.

arXiv:1904.05773 [pdf, other]

Diagnosis of Celiac Disease and Environmental Enteropathy on Biopsy Images Using Color Balancing on Convolutional Neural Networks

Authors: Kamran Kowsari, Rasoul Sali, Marium N. Khan, William Adorno, S. Asad Ali, Sean R. Moore, Beatrice C. Amadi, Paul Kelly, Sana Syed, Donald E. Brown

Abstract: Celiac Disease (CD) and Environmental Enteropathy (EE) are common causes of malnutrition and adversely impact normal childhood development. CD is an autoimmune disorder that is prevalent worldwide and is caused by an increased sensitivity to gluten. Gluten exposure destroys the small intestinal epithelial barrier, resulting in nutrient mal-absorption and childhood under-nutrition. EE also results in barrier dysfunction but is thought to be caused by an increased vulnerability to infections. EE has been implicated as the predominant cause of under-nutrition, oral vaccine failure, and impaired cognitive development in low-and-middle-income countries. Both conditions require a tissue biopsy for diagnosis, and a major challenge of interpreting clinical biopsy images to differentiate between these gastrointestinal diseases is the striking histopathologic overlap between them. In the current study, we propose a convolutional neural network (CNN) to classify duodenal biopsy images from subjects with CD, EE, and healthy controls. We evaluated the performance of our proposed model using a large cohort containing 1000 biopsy images. Our evaluations show that the proposed model achieves an area under ROC of 0.99, 1.00, and 0.97 for CD, EE, and healthy controls, respectively. These results demonstrate the discriminative power of the proposed model in duodenal biopsies classification.

Submitted 9 October, 2019; v1 submitted 10 April, 2019; originally announced April 2019.

arXiv:1811.06193 [pdf, other]

From Videos to URLs: A Multi-Browser Guide To Extract User's Behavior with Optical Character Recognition

Authors: Mojtaba Heidarysafa, James Reed, Kamran Kowsari, April Celeste R. Leviton, Janet I. Warren, Donald E. Brown

Abstract: Tracking users' activities on the World Wide Web (WWW) allows researchers to analyze each user's internet behavior as time passes and for the amount of time spent on a particular domain. This analysis can be used in research design, as researchers may access their participants' behaviors while browsing the web. Web search behavior has been a subject of interest because of its real-world applications in marketing, digital advertisement, and identifying potential threats online. In this paper, we present an image-processing based method to extract domains which are visited by a participant over multiple browsers during a lab session. This method could provide another way to collect users' activities during an online session given that the session recorder collected the data. The method can also be used to collect the textual content of web-pages that an individual visits for later analysis.

Submitted 19 May, 2020; v1 submitted 15 November, 2018; originally announced November 2018.

arXiv:1811.00981 [pdf, other]

Exploring Gaze Behavior to Assess Performance in Digital Game-Based Learning Systems

Authors: Brian An, Inki Kim, Erfan Pakdamanian, Donald E. Brown

Abstract: The recent growth of sophisticated digital gaming technologies has spawned an \$8.1B industry around using these games for pedagogical purposes. Though Digital Game-Based Learning Systems have been adopted by industries ranging from military to medical applications, these systems continue to rely on traditional measures of explicit interactions to gauge player performance which can be subject to guessing and other factors unrelated to actual performance. This study presents a novel implicit eye-tracking based metric for digital game-based learning environments. The proposed metric introduces a weighted eye-tracking measure of traditional in-game scoring to consider the mental schema of a player's decision making. In order to validate the efficacy of this metric, we conducted an experiment with 25 participants playing a game designed to evaluate Chinese cultural competency and communication. This experiment showed strong correlation between the novel eye-tracking performance metric and traditional measures of in-game performance.

Submitted 2 November, 2018; originally announced November 2018.

arXiv:1810.07382 [pdf, other]

doi 10.1109/ICMLA.2018.00235

Analysis of Railway Accidents' Narratives Using Deep Learning

Authors: Mojtaba Heidarysafa, Kamran Kowsari, Laura E. Barnes, Donald E. Brown

Abstract: Automatic understanding of domain specific texts in order to extract useful relationships for later use is a non-trivial task. One such relationship would be between railroad accidents' causes and their corresponding descriptions in reports. From 2001 to 2016 rail accidents in the U.S. cost more than $4.6B. Railroads involved in accidents are required to submit an accident report to the Federal Railroad Administration (FRA). These reports contain a variety of fixed field entries including primary cause of the accidents (a coded variable with 389 values) as well as a narrative field which is a short text description of the accident. Although these narratives provide more information than a fixed field entry, the terminologies used in these reports are not easy to understand by a non-expert reader. Therefore, providing an assisting method to fill in the primary cause from such domain specific texts (narratives) would help to label the accidents with more accuracy. Another important question for transportation safety is whether the reported accident cause is consistent with narrative description. To address these questions, we applied deep learning methods together with powerful word embeddings such as Word2Vec and GloVe to classify accident cause values for the primary cause field using the text in the narratives. The results show that such approaches can both accurately classify accident causes based on report narratives and find important inconsistencies in accident reporting.

Submitted 20 May, 2020; v1 submitted 17 October, 2018; originally announced October 2018.

Comments: accepted in IEEE International Conference on Machine Learning and Applications (IEEE ICMLA)

arXiv:1808.08121 [pdf]

doi 10.18178/ijmlc.2018.8.4.703

An Improvement of Data Classification Using Random Multimodel Deep Learning (RMDL)

Authors: Mojtaba Heidarysafa, Kamran Kowsari, Donald E. Brown, Kiana Jafari Meimandi, Laura E. Barnes

Abstract: The exponential growth in the number of complex datasets every year requires more enhancement in machine learning methods to provide robust and accurate data classification. Lately, deep learning approaches have achieved surpassing results in comparison to previous machine learning algorithms. However, finding the suitable structure for these models has been a challenge for researchers. This paper introduces Random Multimodel Deep Learning (RMDL): a new ensemble, deep learning approach for classification. RMDL solves the problem of finding the best deep learning structure and architecture while simultaneously improving robustness and accuracy through ensembles of deep learning architectures. In short, RMDL trains multiple randomly generated models of Deep Neural Network (DNN), Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) in parallel and combines their results to produce better results than any of those models individually. In this paper, we describe the RMDL model and compare the results for image and text classification as well as face recognition. We used MNIST and CIFAR-10 datasets as ground truth datasets for image classification and WOS, Reuters, IMDB, and 20newsgroup datasets for text classification. Lastly, we used the ORL dataset to compare the model performance on the face recognition task.

Submitted 22 August, 2018; originally announced August 2018.

Comments: published in International Journal of Machine Learning and Computing (IJMLC). arXiv admin note: substantial text overlap with arXiv:1805.01890

arXiv:1805.01890 [pdf, other]

doi 10.1145/3206098.3206111

RMDL: Random Multimodel Deep Learning for Classification

Authors: Kamran Kowsari, Mojtaba Heidarysafa, Donald E. Brown, Kiana Jafari Meimandi, Laura E. Barnes

Abstract: The continually increasing number of complex datasets each year necessitates ever improving machine learning methods for robust and accurate categorization of these data. This paper introduces Random Multimodel Deep Learning (RMDL): a new ensemble, deep learning approach for classification. Deep learning models have achieved state-of-the-art results across many domains. RMDL solves the problem of finding the best deep learning structure and architecture while simultaneously improving robustness and accuracy through ensembles of deep learning architectures. RMDL can accept as input a variety of data, including text, video, images, and symbolic data. This paper describes RMDL and shows test results for image and text data including MNIST, CIFAR-10, WOS, Reuters, IMDB, and 20newsgroup. These test results show that RMDL produces consistently better performance than standard methods over a broad range of data types and classification problems.

Submitted 31 May, 2018; v1 submitted 3 May, 2018; originally announced May 2018.

Comments: Best Paper award ACM ICISDM

arXiv:1711.06976 [pdf, other]

doi 10.1109/ACCESS.2019.2926040

MIT Advanced Vehicle Technology Study: Large-Scale Naturalistic Driving Study of Driver Behavior and Interaction with Automation

Authors: Lex Fridman, Daniel E. Brown, Michael Glazer, William Angell, Spencer Dodd, Benedikt Jenik, Jack Terwilliger, Aleksandr Patsekin, Julia Kindelsberger, Li Ding, Sean Seaman, Alea Mehler, Andrew Sipperley, Anthony Pettinato, Bobbie Seppelt, Linda Angell, Bruce Mehler, Bryan Reimer

Abstract: For the foreseeable future, human beings will likely remain an integral part of the driving task, monitoring the AI system as it performs anywhere from just over 0% to just under 100% of the driving. The governing objectives of the MIT Autonomous Vehicle Technology (MIT-AVT) study are to (1) undertake large-scale real-world driving data collection that includes high-definition video to fuel the development of deep learning based internal and external perception systems, (2) gain a holistic understanding of how human beings interact with vehicle automation technology by integrating video data with vehicle state data, driver characteristics, mental models, and self-reported experiences with technology, and (3) identify how technology and other factors related to automation adoption and use can be improved in ways that save lives. In pursuing these objectives, we have instrumented 23 Tesla Model S and Model X vehicles, 2 Volvo S90 vehicles, 2 Range Rover Evoque, and 2 Cadillac CT6 vehicles for both long-term (over a year per driver) and medium term (one month per driver) naturalistic driving data collection. Furthermore, we are continually developing new methods for analysis of the massive-scale dataset collected from the instrumented vehicle fleet. The recorded data streams include IMU, GPS, CAN messages, and high-definition video streams of the driver face, the driver cabin, the forward roadway, and the instrument cluster (on select vehicles). The study is on-going and growing.
To date, we have 122 participants, 15,610 days of participation, 511,638 miles, and 7.1 billion video frames. This paper presents the design of the study, the data collection hardware, the processing of the data, and the computer vision algorithms currently being used to extract actionable knowledge from the data.

Submitted 14 August, 2019; v1 submitted 19 November, 2017; originally announced November 2017.

Journal ref: IEEE Access, vol. 7, pp. 102021-102038, 2019

arXiv:1709.08267 [pdf, other]

doi 10.1109/ICMLA.2017.0-134

HDLTex: Hierarchical Deep Learning for Text Classification

Authors: Kamran Kowsari, Donald E. Brown, Mojtaba Heidarysafa, Kiana Jafari Meimandi, Matthew S. Gerber, Laura E. Barnes

Abstract: The continually increasing number of documents produced each year necessitates ever improving information processing methods for searching, retrieving, and organizing text. Central to these information processing methods is document classification, which has become an important application for supervised learning. Recently the performance of these traditional classifiers has degraded as the number of documents has increased. This is because along with this growth in the number of documents has come an increase in the number of categories. This paper approaches this problem differently from current document classification methods that view the problem as multi-class classification. Instead we perform hierarchical classification using an approach we call Hierarchical Deep Learning for Text classification (HDLTex). HDLTex employs stacks of deep learning architectures to provide specialized understanding at each level of the document hierarchy.

Submitted 6 October, 2017; v1 submitted 24 September, 2017; originally announced September 2017.

Comments: ICMLA 2017

arXiv:1611.01170 [pdf, other]

PrivLogit: Efficient Privacy-preserving Logistic Regression by Tailoring Numerical Optimizers

Authors: Wei Xie, Yang Wang, Steven M. Boker, Donald E. Brown

Abstract: Safeguarding privacy in machine learning is highly desirable, especially in collaborative studies across many organizations. Privacy-preserving distributed machine learning (based on cryptography) is a popular way to solve the problem. However, existing cryptographic protocols still incur excess computational overhead. Here, we make a novel observation that this is partially due to naive adoption of mainstream numerical optimization (e.g., Newton method) and failing to tailor it for secure computing. This work presents a contrasting perspective: customizing numerical optimization specifically for secure settings. We propose a seemingly less-favorable optimization method that can in fact significantly accelerate privacy-preserving logistic regression. Leveraging this new method, we propose two new secure protocols for conducting logistic regression in a privacy-preserving and distributed manner. Extensive theoretical and empirical evaluations prove the competitive performance of our two secure proposals without compromising accuracy or privacy: with speedups of up to 2.3x and 8.1x, respectively, over state-of-the-art; and even faster as data scales up. Such drastic speedup is on top of and in addition to performance improvements from existing (and future) state-of-the-art cryptography. Our work provides a new way towards efficient and practical privacy-preserving logistic regression for large-scale studies which are common for modern science.

Submitted 3 November, 2016; originally announced November 2016.

Comments: 24 pages, 4 figures. Work done and circulated since 2015

arXiv:1511.07035 [pdf, other]

Detecting Road Surface Wetness from Audio: A Deep Learning Approach

Authors: Irman Abdić, Lex Fridman, Erik Marchi, Daniel E Brown, William Angell, Bryan Reimer, Björn Schuller

Abstract: We introduce a recurrent neural network architecture for automated road surface wetness detection from audio of tire-surface interaction. The robustness of our approach is evaluated on 785,826 bins of audio that span an extensive range of vehicle speeds, noises from the environment, road surface types, and pavement conditions including international roughness index (IRI) values from 25 in/mi to 1400 in/mi. The training and evaluation of the model are performed on different roads to minimize the impact of environmental and other external factors on the accuracy of the classification. We achieve an unweighted average recall (UAR) of 93.2% across all vehicle speeds including 0 mph. The classifier still works at 0 mph because the discriminating signal is present in the sound of other vehicles driving by.

Submitted 4 December, 2015; v1 submitted 22 November, 2015; originally announced November 2015.

Comments: Under review in IEEE Signal Processing Letters

arXiv:1510.06113 [pdf, other]

doi 10.1016/j.patrec.2016.02.011

Automated Synchronization of Driving Data Using Vibration and Steering Events

Authors: Lex Fridman, Daniel E Brown, William Angell, Irman Abdić, Bryan Reimer, Hae Young Noh

Abstract: We propose a method for automated synchronization of vehicle sensors useful for the study of multi-modal driver behavior and for the design of advanced driver assistance systems. Multi-sensor decision fusion relies on synchronized data streams in (1) the offline supervised learning context and (2) the online prediction context. In practice, such data streams are often out of sync due to the absence of a real-time clock, use of multiple recording devices, or improper thread scheduling and data buffer management. Cross-correlation of accelerometer, telemetry, audio, and dense optical flow from three video sensors is used to achieve an average synchronization error of 13 milliseconds. The insight underlying the effectiveness of the proposed approach is that the described sensors capture overlapping aspects of vehicle vibrations and vehicle steering allowing the cross-correlation function to serve as a way to compute the delay shift in each sensor. Furthermore, we show the decrease in synchronization error as a function of the duration of the data stream.

Submitted 1 March, 2016; v1 submitted 20 October, 2015; originally announced October 2015.

Comments: Accepted for Publication in Elsevier Pattern Recognition Letters

Showing 1–35 of 35 results for author: Brown, D E