More Distinctively Black and Feminine Faces Lead to Increased Stereotyping in Vision-Language Models

Authors: Messi H. J. Lee, Jacob M. Montgomery, Calvin K. Lai

Abstract: Vision Language Models (VLMs), exemplified by GPT-4V, adeptly integrate text and vision modalities. This integration enhances Large Language Models' ability to mimic human perception, allowing them to process image inputs. Despite VLMs' advanced capabilities, however, there is a concern that VLMs inherit biases of both modalities in ways that make biases more pervasive and difficult to mitigate. Our study explores how VLMs perpetuate homogeneity bias and trait associations with regards to race and gender. When prompted to write stories based on images of human faces, GPT-4V describes subordinate racial and gender groups with greater homogeneity than dominant groups and relies on distinct, yet generally positive, stereotypes. Importantly, VLM stereotyping is driven by visual cues rather than group membership alone such that faces that are rated as more prototypically Black and feminine are subject to greater stereotyping. These findings suggest that VLMs may associate subtle visual cues related to racial and gender groups with stereotypes in ways that could be challenging to mitigate. We explore the underlying reasons behind this behavior and discuss its implications and emphasize the importance of addressing these biases as VLMs come to mirror human perception.

Submitted 21 May, 2024; originally announced July 2024.

arXiv:2407.04970 [pdf, other]

Idiographic Personality Gaussian Process for Psychological Assessment

Authors: Yehu Chen, Muchen Xi, Jacob Montgomery, Joshua Jackson, Roman Garnett

Abstract: We develop a novel measurement framework based on a Gaussian process coregionalization model to address a long-lasting debate in psychometrics: whether psychological features like personality share a common structure across the population, vary uniquely for individuals, or some combination. We propose the idiographic personality Gaussian process (IPGP) framework, an intermediate model that accommodates both shared trait structure across a population and "idiographic" deviations for individuals. IPGP leverages the Gaussian process coregionalization model to handle the grouped nature of battery responses, but adjusted to non-Gaussian ordinal data. We further exploit stochastic variational inference for efficient latent factor estimation required for idiographic modeling at scale. Using synthetic and real data, we show that IPGP improves both prediction of actual responses and estimation of individualized factor structures relative to existing benchmarks. In a third study, we show that IPGP also identifies unique clusters of personality taxonomies in real-world data, displaying great potential in advancing individualized approaches to psychological diagnosis and treatment.

Submitted 6 July, 2024; originally announced July 2024.

Comments: 9 pages, 4 figures

arXiv:2405.15023 [pdf, other]

ROB 204: Introduction to Human-Robot Systems at the University of Michigan, Ann Arbor

Authors: Leia Stirling, Joseph Montgomery, Mark Draelos, Christoforos Mavrogiannis, Lionel P. Robert Jr., Odest Chadwicke Jenkins

Abstract: The University of Michigan Robotics program focuses on the study of embodied intelligence that must sense, reason, act, and work with people to improve quality of life and productivity equitably across society. ROB 204, part of the core curriculum towards the undergraduate degree in Robotics, introduces students to topics that enable conceptually designing a robotic system to address users' needs from a sociotechnical context. Students are introduced to human-robot interaction (HRI) concepts and the process for socially-engaged design with a Learn-Reinforce-Integrate approach. In this paper, we discuss the course topics and our teaching methodology, and provide recommendations for delivering this material. Overall, students leave the course with a new understanding and appreciation for how human capabilities can inform requirements for a robotics system, how humans can interact with a robot, and how to assess the usability of robotic systems.

Submitted 23 May, 2024; originally announced May 2024.

Comments: Presented at the Designing an Intro to HRI Course Workshop at HRI 2024 (arXiv:2403.05588)

Report number: HRI101/2024/2

arXiv:2401.08495 [pdf, other]

doi: 10.1145/3630106.3658975

Large Language Models Portray Socially Subordinate Groups as More Homogeneous, Consistent with a Bias Observed in Humans

Authors: Messi H. J. Lee, Jacob M. Montgomery, Calvin K. Lai

Abstract: Large language models (LLMs) are becoming pervasive in everyday life, yet their propensity to reproduce biases inherited from training data remains a pressing concern. Prior investigations into bias in LLMs have focused on the association of social groups with stereotypical attributes. However, this is only one form of human bias such systems may reproduce. We investigate a new form of bias in LLMs that resembles a social psychological phenomenon where socially subordinate groups are perceived as more homogeneous than socially dominant groups. We had ChatGPT, a state-of-the-art LLM, generate texts about intersectional group identities and compared those texts on measures of homogeneity. We consistently found that ChatGPT portrayed African, Asian, and Hispanic Americans as more homogeneous than White Americans, indicating that the model described racial minority groups with a narrower range of human experience. ChatGPT also portrayed women as more homogeneous than men, but these differences were small. Finally, we found that the effect of gender differed across racial/ethnic groups such that the effect of gender was consistent within African and Hispanic Americans but not within Asian and White Americans. We argue that the tendency of LLMs to describe groups as less diverse risks perpetuating stereotypes and discriminatory behavior.

Submitted 25 April, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

Comments: Forthcoming at ACM Conference on Fairness, Accountability, and Transparency (FAccT) 2024

arXiv:2304.01915 [pdf, other]

Fog Device-as-a-Service (FDaaS): A Framework for Service Deployment in Public Fog Environments

Authors: Sudheer Kumar Battula, Saurabh Garg, James Montgomery, Ranesh Naha

Abstract: Meeting the requirements of future services with time sensitivity and handling sudden load spikes of the services in Fog computing environments are challenging tasks due to the lack of publicly available Fog nodes and their characteristics. Researchers have assumed that the traditional autoscaling techniques, with lightweight virtualisation technology (containers), can be used to provide autoscaling features in Fog computing environments, few researchers have built the platform by exploiting the default autoscaling techniques of the containerisation orchestration tools or systems. However, the adoption of these techniques alone, in a publicly available Fog infrastructure, does not guarantee Quality of Service (QoS) due to the heterogeneity of Fog devices and their characteristics, such as frequent resource changes and high mobility. To tackle this challenge, in this work we developed a Fog as a Service (FaaS) framework that can create, configure and manage the containers which are running on the Fog devices to deploy services. This work presents the key techniques and algorithms which are responsible for handling sudden load spikes of the services to meet the QoS of the application. This work provides an evaluation by comparing it with existing techniques under real scenarios. The experiment results show that our proposed approach maximises the satisfied service requests by an average of 1.9 times in different scenarios.

Submitted 3 March, 2024; v1 submitted 1 March, 2023; originally announced April 2023.

Comments: 10 Pages, 13 Figures

arXiv:2303.04217 [pdf, other]

AI for Science: An Emerging Agenda

Authors: Philipp Berens, Kyle Cranmer, Neil D. Lawrence, Ulrike von Luxburg, Jessica Montgomery

Abstract: This report documents the programme and the outcomes of Dagstuhl Seminar 22382 "Machine Learning for Science: Bridging Data-Driven and Mechanistic Modelling". Today's scientific challenges are characterised by complexity. Interconnected natural, technological, and human systems are influenced by forces acting across time- and spatial-scales, resulting in complex interactions and emergent behaviours. Understanding these phenomena -- and leveraging scientific advances to deliver innovative solutions to improve society's health, wealth, and well-being -- requires new ways of analysing complex systems. The transformative potential of AI stems from its widespread applicability across disciplines, and will only be achieved through integration across research domains. AI for science is a rendezvous point. It brings together expertise from AI and application domains; combines modelling knowledge with engineering know-how; and relies on collaboration across disciplines and between humans and machines. Alongside technical advances, the next wave of progress in the field will come from building a community of machine learning researchers, domain experts, citizen scientists, and engineers working together to design and deploy effective AI tools. This report summarises the discussions from the seminar and provides a roadmap to suggest how different communities can collaborate to deliver a new wave of progress in AI and its application for scientific discovery.

Submitted 7 March, 2023; originally announced March 2023.

arXiv:2205.09488 [pdf]

PSI Draft Specification

Authors: Mark Reid, James Montgomery, Barry Drake, Avraham Ruderman

Abstract: This document presents the draft specification for delivering machine learning services over HTTP, developed as part of the Protocols and Structures for Inference project, which concluded in 2013. It presents the motivation for providing machine learning as a service, followed by a description of the essential and optional components of such a service.

Submitted 1 May, 2022; originally announced May 2022.

Comments: Software specification for PSI machine learning web services. 42 pages, 2 figures

arXiv:2107.04140 [pdf, other]

First-Generation Inference Accelerator Deployment at Facebook

Authors: Michael Anderson, Benny Chen, Stephen Chen, Summer Deng, Jordan Fix, Michael Gschwind, Aravind Kalaiah, Changkyu Kim, Jaewon Lee, Jason Liang, Haixin Liu, Yinghai Lu, Jack Montgomery, Arun Moorthy, Satish Nadathur, Sam Naghshineh, Avinash Nayak, Jongsoo Park, Chris Petersen, Martin Schatz, Narayanan Sundaram, Bangsheng Tang, Peter Tang, Amy Yang, Jiecao Yu, et al. (90 additional authors not shown)

Abstract: In this paper, we provide a deep dive into the deployment of inference accelerators at Facebook. Many of our ML workloads have unique characteristics, such as sparse memory accesses, large model sizes, as well as high compute, memory and network bandwidth requirements. We co-designed a high-performance, energy-efficient inference accelerator platform based on these requirements. We describe the inference accelerator platform ecosystem we developed and deployed at Facebook: both hardware, through Open Compute Platform (OCP), and software framework and tooling, through Pytorch/Caffe2/Glow. A characteristic of this ecosystem from the start is its openness to enable a variety of AI accelerators from different vendors. This platform, with six low-power accelerator cards alongside a single-socket host CPU, allows us to serve models of high complexity that cannot be easily or efficiently run on CPUs. We describe various performance optimizations, at both platform and accelerator level, which enables this platform to serve production traffic at Facebook. We also share deployment challenges, lessons learned during performance optimization, as well as provide guidance for future inference hardware co-design.

Submitted 4 August, 2021; v1 submitted 8 July, 2021; originally announced July 2021.

arXiv:2106.11051 [pdf, other]

Towards Better Shale Gas Production Forecasting Using Transfer Learning

Authors: Omar S. Alolayan, Samuel J. Raymond, Justin B. Montgomery, John R. Williams

Abstract: Deep neural networks can generate more accurate shale gas production forecasts in counties with a limited number of sample wells by utilizing transfer learning. This paper provides a way of transferring the knowledge gained from other deep neural network models trained on adjacent counties into the county of interest. The paper uses data from more than 6000 shale gas wells across 17 counties from Texas Barnett and Pennsylvania Marcellus shale formations to test the capabilities of transfer learning. The results reduce the forecasting error between 11% and 47% compared to the widely used Arps decline curve model.

Submitted 21 June, 2021; originally announced June 2021.

arXiv:2006.09900 [pdf, other]

GPIRT: A Gaussian Process Model for Item Response Theory

Authors: JBrandon Duck-Mayr, Roman Garnett, Jacob M. Montgomery

Abstract: The goal of item response theoretic (IRT) models is to provide estimates of latent traits from binary observed indicators and at the same time to learn the item response functions (IRFs) that map from latent trait to observed response. However, in many cases observed behavior can deviate significantly from the parametric assumptions of traditional IRT models. Nonparametric IRT models overcome these challenges by relaxing assumptions about the form of the IRFs, but standard tools are unable to simultaneously estimate flexible IRFs and recover ability estimates for respondents. We propose a Bayesian nonparametric model that solves this problem by placing Gaussian process priors on the latent functions defining the IRFs. This allows us to simultaneously relax assumptions about the shape of the IRFs while preserving the ability to estimate latent traits. This in turn allows us to easily extend the model to further tasks such as active learning. GPIRT therefore provides a simple and intuitive solution to several longstanding problems in the IRT literature.

Submitted 17 June, 2020; originally announced June 2020.

Journal ref: Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI), PMLR 124 (2020) 520-529

arXiv:1806.07629 [pdf, other]

User's Privacy in Recommendation Systems Applying Online Social Network Data, A Survey and Taxonomy

Authors: Erfan Aghasian, Saurabh Garg, James Montgomery

Abstract: Recommender systems have become an integral part of many social networks and extract knowledge from a user's personal and sensitive data both explicitly, with the user's knowledge, and implicitly. This trend has created major privacy concerns as users are mostly unaware of what data and how much data is being used and how securely it is used. In this context, several works have been done to address privacy concerns for usage in online social network data and by recommender systems. This paper surveys the main privacy concerns, measurements and privacy-preserving techniques used in large-scale online social networks and recommender systems. It is based on historical works on security, privacy-preserving, statistical modeling, and datasets to provide an overview of the technical difficulties and problems associated with privacy preserving in online social networks.

Submitted 20 June, 2018; originally announced June 2018.

Comments: 26 pages, IET book chapter on big data recommender systems

arXiv:1805.00907 [pdf, other]

Glow: Graph Lowering Compiler Techniques for Neural Networks

Authors: Nadav Rotem, Jordan Fix, Saleem Abdulrasool, Garret Catron, Summer Deng, Roman Dzhabarov, Nick Gibson, James Hegeman, Meghan Lele, Roman Levenstein, Jack Montgomery, Bert Maher, Satish Nadathur, Jakob Olesen, Jongsoo Park, Artem Rakhov, Misha Smelyanskiy, Man Wang

Abstract: This paper presents the design of Glow, a machine learning compiler for heterogeneous hardware. It is a pragmatic approach to compilation that enables the generation of highly optimized code for multiple targets. Glow lowers the traditional neural network dataflow graph into a two-phase strongly-typed intermediate representation. The high-level intermediate representation allows the optimizer to perform domain-specific optimizations. The lower-level instruction-based address-only intermediate representation allows the compiler to perform memory-related optimizations, such as instruction scheduling, static memory allocation and copy elimination. At the lowest level, the optimizer performs machine-specific code generation to take advantage of specialized hardware features. Glow features a lowering phase which enables the compiler to support a high number of input operators as well as a large number of hardware targets by eliminating the need to implement all operators on all targets. The lowering phase is designed to reduce the input space and allow new hardware backends to focus on a small number of linear algebra primitives.

Submitted 3 April, 2019; v1 submitted 2 May, 2018; originally announced May 2018.

arXiv:1804.05502 [pdf, other]

Automatic Rain and Cicada Chorus Filtering of Bird Acoustic Data

Authors: Alexander Brown, Saurabh Garg, James Montgomery

Abstract: Recording and analysing environmental audio recordings has become a common approach for monitoring the environment. A current problem with performing analyses of environmental recordings is interference from noise that can mask sounds of interest. This makes detecting these sounds more difficult and can require additional resources. While some work has been done to remove stationary noise from environmental recordings, there has been little effort to remove noise from non-stationary sources, such as rain, wind, engines, and animal vocalisations that are not of interest. In this paper, we address the challenge of filtering noise from rain and cicada choruses from recordings containing bird sound. We improve upon previously established classification approaches using acoustic indices and Mel Frequency Cepstral Coefficients (MFCCs) as acoustic features to detect these noise sources, approaching the problem with the motivation of removing these sounds. We investigate the use of acoustic indices, and machine learning classifiers to find the most effective filters. The approach we use enables users to set thresholds to increase or decrease the sensitivity of classification, based on the prediction probability outputted by classifiers. We also propose a novel approach to remove cicada choruses using band-pass filters. Our threshold-based approach (Random Forest with Acoustic Indices and Mel Frequency Cepstral Coefficients (MFCCs)) for rain detection achieves an AUC of 0.9881 and is more accurate than existing approaches when set to the same sensitivities. We also detect cicada choruses in our training set with 100% accuracy using 10-fold cross-validation. Our cicada filtering approach greatly increased the median signal to noise ratios of affected recordings from 0.53 for unfiltered audio to 1.86 for audio filtered by both the cicada filter and a stationary noise filter.

Submitted 16 April, 2018; originally announced April 2018.

Comments: 18 pages, 10 figures

arXiv:1802.00535 [pdf, other]

Scalable Preprocessing of High Volume Bird Acoustic Data

Authors: Alexander Brown, Saurabh Garg, James Montgomery

Abstract: In this work, we examine the problem of efficiently preprocessing high volume bird acoustic data. We combine several existing preprocessing steps including noise reduction approaches into a single efficient pipeline by examining each process individually. We then utilise a distributed computing architecture to improve execution time. Using a master-slave model with data parallelisation, we developed a near-linear automated scalable system, capable of preprocessing bird acoustic recordings 21.76 times faster with 32 cores over 8 virtual machines, compared to a serial process. This work contributes to the research area of bioacoustic analysis, which is currently very active because of its potential to monitor animals quickly at low cost. Overcoming noise interference is a significant challenge in many bioacoustic studies, and the volume of data in these studies is increasing. Our work makes large scale bird acoustic analyses more feasible by parallelising important bird acoustic processing tasks to significantly reduce execution times.

Submitted 1 February, 2018; originally announced February 2018.

Comments: 28 pages, 20 figures

Showing 1–14 of 14 results for author: Montgomery, J