subscribe to arXiv mailings

From Protoscience to Epistemic Monoculture: How Benchmarking Set the Stage for the Deep Learning Revolution

Authors: Bernard J. Koch, David Peterson

Abstract: Over the past decade, AI research has focused heavily on building ever-larger deep learning models. This approach has simultaneously unlocked incredible achievements in science and technology, and hindered AI from overcoming long-standing limitations with respect to explainability, ethical harms, and environmental efficiency. Drawing on qualitative interviews and computational analyses, our three-… ▽ More Over the past decade, AI research has focused heavily on building ever-larger deep learning models. This approach has simultaneously unlocked incredible achievements in science and technology, and hindered AI from overcoming long-standing limitations with respect to explainability, ethical harms, and environmental efficiency. Drawing on qualitative interviews and computational analyses, our three-part history of AI research traces the creation of this "epistemic monoculture" back to a radical reconceptualization of scientific progress that began in the late 1980s. In the first era of AI research (1950s-late 1980s), researchers and patrons approached AI as a "basic" science that would advance through autonomous exploration and organic assessments of progress (e.g., peer-review, theoretical consensus). The failure of this approach led to a retrenchment of funding in the 1980s. Amid this "AI Winter," an intervention by the U.S. government reoriented the field towards measurable progress on tasks of military and commercial interest. A new evaluation system called "benchmarking" provided an objective way to quantify progress on tasks by focusing exclusively on increasing predictive accuracy on example datasets. Distilling science down to verifiable metrics clarified the roles of scientists, allowed the field to rapidly integrate talent, and provided clear signals of significance and progress. But history has also revealed a tradeoff to this streamlined approach to science: the consolidation around external interests and inherent conservatism of benchmarking has disincentivized exploration beyond scaling monoculture. In the discussion, we explain how AI's monoculture offers a compelling challenge to the belief that basic, exploration-driven research is needed for scientific progress. Implications for the spread of AI monoculture to other sciences in the era of generative AI are also discussed. △ Less

Submitted 10 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

arXiv:2402.03654 [pdf]

doi 10.5121/csit.2024.140208

Reviewing FID and SID Metrics on Generative Adversarial Networks

Authors: Ricardo de Deijn, Aishwarya Batra, Brandon Koch, Naseef Mansoor, Hema Makkena

Abstract: The growth of generative adversarial network (GAN) models has increased the ability of image processing and provides numerous industries with the technology to produce realistic image transformations. However, with the field being recently established there are new evaluation metrics that can further this research. Previous research has shown the Fréchet Inception Distance (FID) to be an effective… ▽ More The growth of generative adversarial network (GAN) models has increased the ability of image processing and provides numerous industries with the technology to produce realistic image transformations. However, with the field being recently established there are new evaluation metrics that can further this research. Previous research has shown the Fréchet Inception Distance (FID) to be an effective metric when testing these image-to-image GANs in real-world applications. Signed Inception Distance (SID), a founded metric in 2023, expands on FID by allowing unsigned distances. This paper uses public datasets that consist of façades, cityscapes, and maps within Pix2Pix and CycleGAN models. After training these models are evaluated on both inception distance metrics which measure the generating performance of the trained models. Our findings indicate that usage of the metric SID incorporates an efficient and effective metric to complement, or even exceed the ability shown using the FID for the image-to-image GANs △ Less

Submitted 5 February, 2024; originally announced February 2024.

Comments: 14 pages 9 figures 1 table Included in IOTBS, NLTM, AIMLA, DBDM - 2024 Conference Proceedings Editor: David C. Wyld et al

Journal ref: CS & IT - CSCP (2024) 111-124

arXiv:2311.13028 [pdf, other]

DMLR: Data-centric Machine Learning Research -- Past, Present and Future

Authors: Luis Oala, Manil Maskey, Lilith Bat-Leah, Alicia Parrish, Nezihe Merve Gürel, Tzu-Sheng Kuo, Yang Liu, Rotem Dror, Danilo Brajovic, Xiaozhe Yao, Max Bartolo, William A Gaviria Rojas, Ryan Hileman, Rainier Aliment, Michael W. Mahoney, Meg Risdal, Matthew Lease, Wojciech Samek, Debojyoti Dutta, Curtis G Northcutt, Cody Coleman, Braden Hancock, Bernard Koch, Girmaw Abebe Tadesse, Bojan Karlaš , et al. (13 additional authors not shown)

Abstract: Drawing from discussions at the inaugural DMLR workshop at ICML 2023 and meetings prior, in this report we outline the relevance of community engagement and infrastructure development for the creation of next-generation public datasets that will advance machine learning science. We chart a path forward as a collective effort to sustain the creation and maintenance of these datasets and methods tow… ▽ More Drawing from discussions at the inaugural DMLR workshop at ICML 2023 and meetings prior, in this report we outline the relevance of community engagement and infrastructure development for the creation of next-generation public datasets that will advance machine learning science. We chart a path forward as a collective effort to sustain the creation and maintenance of these datasets and methods towards positive scientific, societal and business impact. △ Less

Submitted 1 June, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

Comments: Published in the Journal of Data-centric Machine Learning Research (DMLR) at https://data.mlr.press/assets/pdf/v01-5.pdf

arXiv:2308.08192 [pdf, other]

doi 10.1016/j.eswa.2023.120147

Automatic Vision-Based Parking Slot Detection and Occupancy Classification

Authors: Ratko Grbić, Brando Koch

Abstract: Parking guidance information (PGI) systems are used to provide information to drivers about the nearest parking lots and the number of vacant parking slots. Recently, vision-based solutions started to appear as a cost-effective alternative to standard PGI systems based on hardware sensors mounted on each parking slot. Vision-based systems provide information about parking occupancy based on images… ▽ More Parking guidance information (PGI) systems are used to provide information to drivers about the nearest parking lots and the number of vacant parking slots. Recently, vision-based solutions started to appear as a cost-effective alternative to standard PGI systems based on hardware sensors mounted on each parking slot. Vision-based systems provide information about parking occupancy based on images taken by a camera that is recording a parking lot. However, such systems are challenging to develop due to various possible viewpoints, weather conditions, and object occlusions. Most notably, they require manual labeling of parking slot locations in the input image which is sensitive to camera angle change, replacement, or maintenance. In this paper, the algorithm that performs Automatic Parking Slot Detection and Occupancy Classification (APSD-OC) solely on input images is proposed. Automatic parking slot detection is based on vehicle detections in a series of parking lot images upon which clustering is applied in bird's eye view to detect parking slots. Once the parking slots positions are determined in the input image, each detected parking slot is classified as occupied or vacant using a specifically trained ResNet34 deep classifier. The proposed approach is extensively evaluated on well-known publicly available datasets (PKLot and CNRPark+EXT), showing high efficiency in parking slot detection and robustness to the presence of illegal parking or passing vehicles. Trained classifier achieves high accuracy in parking slot occupancy classification. △ Less

Submitted 16 August, 2023; originally announced August 2023.

Comments: 39 pages, 8 figures, 9 tables

Journal ref: Expert Systems with Applications, Volume 225, 1 September 2023, 120147

arXiv:2308.06944 [pdf, other]

doi 10.1016/j.imavis.2024.104900

One-shot lip-based biometric authentication: extending behavioral features with authentication phrase information

Authors: Brando Koch, Ratko Grbić

Abstract: Lip-based biometric authentication (LBBA) is an authentication method based on a person's lip movements during speech in the form of video data captured by a camera sensor. LBBA can utilize both physical and behavioral characteristics of lip movements without requiring any additional sensory equipment apart from an RGB camera. State-of-the-art (SOTA) approaches use one-shot learning to train deep… ▽ More Lip-based biometric authentication (LBBA) is an authentication method based on a person's lip movements during speech in the form of video data captured by a camera sensor. LBBA can utilize both physical and behavioral characteristics of lip movements without requiring any additional sensory equipment apart from an RGB camera. State-of-the-art (SOTA) approaches use one-shot learning to train deep siamese neural networks which produce an embedding vector out of these features. Embeddings are further used to compute the similarity between an enrolled user and a user being authenticated. A flaw of these approaches is that they model behavioral features as style-of-speech without relation to what is being said. This makes the system vulnerable to video replay attacks of the client speaking any phrase. To solve this problem we propose a one-shot approach which models behavioral features to discriminate against what is being said in addition to style-of-speech. We achieve this by customizing the GRID dataset to obtain required triplets and training a siamese neural network based on 3D convolutions and recurrent neural network layers. A custom triplet loss for batch-wise hard-negative mining is proposed. Obtained results using an open-set protocol are 3.2% FAR and 3.8% FRR on the test set of the customized GRID dataset. Additional analysis of the results was done to quantify the influence and discriminatory power of behavioral and physical features for LBBA. △ Less

Submitted 14 August, 2023; originally announced August 2023.

Comments: 28 pages, 10 figures, 7 tables

Journal ref: Image and Vision Computing, 142, 2024

arXiv:2112.12294 [pdf, other]

Run-of-Mine Stockyard Recovery Scheduling and Optimisation for Multiple Reclaimers

Authors: Hirad Assimi, Ben Koch, Chris Garcia, Markus Wagner, Frank Neumann

Abstract: Stockpiles are essential in the mining value chain, assisting in maximising value and production. Quality control of taken minerals from the stockpiles is a major concern for stockpile managers where failure to meet some requirements can lead to losing money. This problem was recently investigated using a single reclaimer, and basic assumptions. This study extends the approach to consider multiple… ▽ More Stockpiles are essential in the mining value chain, assisting in maximising value and production. Quality control of taken minerals from the stockpiles is a major concern for stockpile managers where failure to meet some requirements can lead to losing money. This problem was recently investigated using a single reclaimer, and basic assumptions. This study extends the approach to consider multiple reclaimers in preparing for short and long-term deliveries. The engagement of multiple reclaimers complicates the problem in terms of their interaction in preparing a delivery simultaneously and safety distancing of reclaimers. We also consider more realistic settings, such as handling different minerals with different types of reclaimers. We propose methods that construct a solution step by step to meet precedence constraints for all reclaimers in the stockyard. We study various instances of the problem using greedy algorithms, Ant Colony Optimisation (ACO), and propose an integrated local search method determining an efficient schedule. We fine-tune and compare the algorithms and show that the ACO combined with local search can yield efficient solutions. △ Less

Submitted 22 December, 2021; originally announced December 2021.

arXiv:2112.01716 [pdf, other]

Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research

Authors: Bernard Koch, Emily Denton, Alex Hanna, Jacob G. Foster

Abstract: Benchmark datasets play a central role in the organization of machine learning research. They coordinate researchers around shared research problems and serve as a measure of progress towards shared goals. Despite the foundational role of benchmarking practices in this field, relatively little attention has been paid to the dynamics of benchmark dataset use and reuse, within or across machine lear… ▽ More Benchmark datasets play a central role in the organization of machine learning research. They coordinate researchers around shared research problems and serve as a measure of progress towards shared goals. Despite the foundational role of benchmarking practices in this field, relatively little attention has been paid to the dynamics of benchmark dataset use and reuse, within or across machine learning subcommunities. In this paper, we dig into these dynamics. We study how dataset usage patterns differ across machine learning subcommunities and across time from 2015-2020. We find increasing concentration on fewer and fewer datasets within task communities, significant adoption of datasets from other tasks, and concentration across the field on datasets that have been introduced by researchers situated within a small number of elite institutions. Our results have implications for scientific evaluation, AI ethics, and equity/access within the field. △ Less

Submitted 3 December, 2021; originally announced December 2021.

Comments: 35th Conference on Neural Information Processing Systems (NeurIPS 2021), Sydney, Australia

arXiv:2110.04442 [pdf, other]

A Primer on Deep Learning for Causal Inference

Authors: Bernard Koch, Tim Sainburg, Pablo Geraldo, Song Jiang, Yizhou Sun, Jacob Gates Foster

Abstract: This review systematizes the emerging literature for causal inference using deep neural networks under the potential outcomes framework. It provides an intuitive introduction on how deep learning can be used to estimate/predict heterogeneous treatment effects and extend causal inference to settings where confounding is non-linear, time varying, or encoded in text, networks, and images. To maximize… ▽ More This review systematizes the emerging literature for causal inference using deep neural networks under the potential outcomes framework. It provides an intuitive introduction on how deep learning can be used to estimate/predict heterogeneous treatment effects and extend causal inference to settings where confounding is non-linear, time varying, or encoded in text, networks, and images. To maximize accessibility, we also introduce prerequisite concepts from causal inference and deep learning. The survey differs from other treatments of deep learning and causal inference in its sharp focus on observational causal estimation, its extended exposition of key algorithms, and its detailed tutorials for implementing, training, and selecting among deep estimators in Tensorflow 2 available at github.com/kochbj/Deep-Learning-for-Causal-Inference. △ Less

Submitted 28 November, 2023; v1 submitted 8 October, 2021; originally announced October 2021.

Comments: Forthcoming in Sociological Methods and Research

arXiv:1909.09138 [pdf]

Uncovering Sociological Effect Heterogeneity using Machine Learning

Authors: Jennie E. Brand, Jiahui Xu, Bernard Koch, Pablo Geraldo

Abstract: Individuals do not respond uniformly to treatments, events, or interventions. Sociologists routinely partition samples into subgroups to explore how the effects of treatments vary by covariates like race, gender, and socioeconomic status. In so doing, analysts determine the key subpopulations based on theoretical priors. Data-driven discoveries are also routine, yet the analyses by which sociologi… ▽ More Individuals do not respond uniformly to treatments, events, or interventions. Sociologists routinely partition samples into subgroups to explore how the effects of treatments vary by covariates like race, gender, and socioeconomic status. In so doing, analysts determine the key subpopulations based on theoretical priors. Data-driven discoveries are also routine, yet the analyses by which sociologists typically go about them are problematic and seldom move us beyond our expectations, and biases, to explore new meaningful subgroups. Emerging machine learning methods allow researchers to explore sources of variation that they may not have previously considered, or envisaged. In this paper, we use causal trees to recursively partition the sample and uncover sources of treatment effect heterogeneity. We use honest estimation, splitting the sample into a training sample to grow the tree and an estimation sample to estimate leaf-specific effects. Assessing a central topic in the social inequality literature, college effects on wages, we compare what we learn from conventional approaches for exploring variation in effects to causal trees. Given our use of observational data, we use leaf-specific matching and sensitivity analyses to address confounding and offer interpretations of effects based on observed and unobserved heterogeneity. We encourage researchers to follow similar practices in their work on variation in sociological effects. △ Less

Submitted 18 September, 2019; originally announced September 2019.

Showing 1–9 of 9 results for author: Koch, B