subscribe to arXiv mailings

doi 10.1109/MSP.2021.3106615

Integrating Psychometrics and Computing Perspectives on Bias and Fairness in Affective Computing: A Case Study of Automated Video Interviews

Authors: Brandon M Booth, Louis Hickman, Shree Krishna Subburaj, Louis Tay, Sang Eun Woo, Sidney K. DMello

Abstract: We provide a psychometric-grounded exposition of bias and fairness as applied to a typical machine learning pipeline for affective computing. We expand on an interpersonal communication framework to elucidate how to identify sources of bias that may arise in the process of inferring human emotions and other psychological constructs from observed behavior. Various methods and metrics for measuring… ▽ More We provide a psychometric-grounded exposition of bias and fairness as applied to a typical machine learning pipeline for affective computing. We expand on an interpersonal communication framework to elucidate how to identify sources of bias that may arise in the process of inferring human emotions and other psychological constructs from observed behavior. Various methods and metrics for measuring fairness and bias are discussed along with pertinent implications within the United States legal context. We illustrate how to measure some types of bias and fairness in a case study involving automatic personality and hireability inference from multimodal data collected in video interviews for mock job applications. We encourage affective computing researchers and practitioners to encapsulate bias and fairness in their research processes and products and to consider their role, agency, and responsibility in promoting equitable and just systems. △ Less

Submitted 4 May, 2023; originally announced May 2023.

Comments: 21 pages, 4 figures

Journal ref: IEEE Signal Processing Magazine 38.6 (2021): 84-95

arXiv:2304.13933 [pdf]

doi 10.1111/peps.12593

Oversampling Higher-Performing Minorities During Machine Learning Model Training Reduces Adverse Impact Slightly but Also Reduces Model Accuracy

Authors: Louis Hickman, Jason Kuruzovich, Vincent Ng, Kofi Arhin, Danielle Wilson

Abstract: Organizations are increasingly adopting machine learning (ML) for personnel assessment. However, concerns exist about fairness in designing and implementing ML assessments. Supervised ML models are trained to model patterns in data, meaning ML models tend to yield predictions that reflect subgroup differences in applicant attributes in the training data, regardless of the underlying cause of subgr… ▽ More Organizations are increasingly adopting machine learning (ML) for personnel assessment. However, concerns exist about fairness in designing and implementing ML assessments. Supervised ML models are trained to model patterns in data, meaning ML models tend to yield predictions that reflect subgroup differences in applicant attributes in the training data, regardless of the underlying cause of subgroup differences. In this study, we systematically under- and oversampled minority (Black and Hispanic) applicants to manipulate adverse impact ratios in training data and investigated how training data adverse impact ratios affect ML model adverse impact and accuracy. We used self-reports and interview transcripts from job applicants (N = 2,501) to train 9,702 ML models to predict screening decisions. We found that training data adverse impact related linearly to ML model adverse impact. However, removing adverse impact from training data only slightly reduced ML model adverse impact and tended to negatively affect ML model accuracy. We observed consistent effects across self-reports and interview transcripts, whether oversampling real (i.e., bootstrapping) or synthetic observations. As our study relied on limited predictor sets from one organization, the observed effects on adverse impact may be attenuated among more accurate ML models. △ Less

Submitted 26 April, 2023; originally announced April 2023.

Comments: forthcoming in Personnel Psychology 2 figures

arXiv:2206.08287 [pdf]

Definition drives design: Disability models and mechanisms of bias in AI technologies

Authors: Denis Newman-Griffis, Jessica Sage Rauchberg, Rahaf Alharbi, Louise Hickman, Harry Hochheiser

Abstract: The increasing deployment of artificial intelligence (AI) tools to inform decision making across diverse areas including healthcare, employment, social benefits, and government policy, presents a serious risk for disabled people, who have been shown to face bias in AI implementations. While there has been significant work on analysing and mitigating algorithmic bias, the broader mechanisms of how… ▽ More The increasing deployment of artificial intelligence (AI) tools to inform decision making across diverse areas including healthcare, employment, social benefits, and government policy, presents a serious risk for disabled people, who have been shown to face bias in AI implementations. While there has been significant work on analysing and mitigating algorithmic bias, the broader mechanisms of how bias emerges in AI applications are not well understood, hampering efforts to address bias where it begins. In this article, we illustrate how bias in AI-assisted decision making can arise from a range of specific design decisions, each of which may seem self-contained and non-biasing when considered separately. These design decisions include basic problem formulation, the data chosen for analysis, the use the AI technology is put to, and operational design elements in addition to the core algorithmic design. We draw on three historical models of disability common to different decision-making settings to demonstrate how differences in the definition of disability can lead to highly distinct decisions on each of these aspects of design, leading in turn to AI technologies with a variety of biases and downstream effects. We further show that the potential harms arising from inappropriate definitions of disability in fundamental design stages are further amplified by a lack of transparency and disabled participation throughout the AI design process. Our analysis provides a framework for critically examining AI technologies in decision-making contexts and guiding the development of a design praxis for disability-related AI analytics. We put forth this article to provide key questions to facilitate disability-led design and participatory development to produce more fair and equitable AI technologies in disability-related contexts. △ Less

Submitted 23 November, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

Comments: 38 pages, 1 figure, 2 tables. Keywords: artificial intelligence; critical disability studies; information and communication technologies; data analytics; data science; fairness, accountability, transparency, and ethics

arXiv:2008.02449 [pdf, other]

Studying Politeness across Cultures Using English Twitter and Mandarin Weibo

Authors: Mingyang Li, Louis Hickman, Louis Tay, Lyle Ungar, Sharath Chandra Guntuku

Abstract: Modeling politeness across cultures helps to improve intercultural communication by uncovering what is considered appropriate and polite. We study the linguistic features associated with politeness across US English and Mandarin Chinese. First, we annotate 5,300 Twitter posts from the US and 5,300 Sina Weibo posts from China for politeness scores. Next, we develop an English and Chinese politeness… ▽ More Modeling politeness across cultures helps to improve intercultural communication by uncovering what is considered appropriate and polite. We study the linguistic features associated with politeness across US English and Mandarin Chinese. First, we annotate 5,300 Twitter posts from the US and 5,300 Sina Weibo posts from China for politeness scores. Next, we develop an English and Chinese politeness feature set, `PoliteLex'. Combining it with validated psycholinguistic dictionaries, we then study the correlations between linguistic features and perceived politeness across cultures. We find that on Mandarin Weibo, future-focusing conversations, identifying with a group affiliation, and gratitude are considered to be more polite than on English Twitter. Death-related taboo topics, lack of or poor choice of pronouns, and informal language are associated with higher impoliteness on Mandarin Weibo compared to English Twitter. Finally, we build language-based machine learning models to predict politeness with an F1 score of 0.886 on Mandarin Weibo and a 0.774 on English Twitter. △ Less

Submitted 24 August, 2020; v1 submitted 6 August, 2020; originally announced August 2020.

Comments: Accepted for CSCW 2020. To be published in PACM HCI

Showing 1–4 of 4 results for author: Hickman, L