-
Data Quality in Crowdsourcing and Spamming Behavior Detection
Authors:
Yang Ba,
Michelle V. Mancenido,
Erin K. Chiou,
Rong Pan
Abstract:
As crowdsourcing emerges as an efficient and cost-effective method for obtaining labels for machine learning datasets, it is important to assess the quality of crowd-provided data, so as to improve analysis performance and reduce biases in subsequent machine learning tasks. Given the lack of ground truth in most cases of crowdsourcing, we refer to data quality as annotators' consistency and credib…
▽ More
As crowdsourcing emerges as an efficient and cost-effective method for obtaining labels for machine learning datasets, it is important to assess the quality of crowd-provided data, so as to improve analysis performance and reduce biases in subsequent machine learning tasks. Given the lack of ground truth in most cases of crowdsourcing, we refer to data quality as annotators' consistency and credibility. Unlike the simple scenarios where Kappa coefficient and intraclass correlation coefficient usually can apply, online crowdsourcing requires dealing with more complex situations. We introduce a systematic method for evaluating data quality and detecting spamming threats via variance decomposition, and we classify spammers into three categories based on their different behavioral patterns. A spammer index is proposed to assess entire data consistency and two metrics are developed to measure crowd worker's credibility by utilizing the Markov chain and generalized random effects models. Furthermore, we showcase the practicality of our techniques and their advantages by applying them on a face verification task with both simulation and real-world data collected from two crowdsourcing platforms.
△ Less
Submitted 3 April, 2024;
originally announced April 2024.
-
PADTHAI-MM: A Principled Approach for Designing Trustable, Human-centered AI systems using the MAST Methodology
Authors:
Nayoung Kim,
Myke C. Cohen,
Yang Ba,
Anna Pan,
Shawaiz Bhatti,
Pouria Salehi,
James Sung,
Erik Blasch,
Michelle V. Mancenido,
Erin K. Chiou
Abstract:
Designing for AI trustworthiness is challenging, with a lack of practical guidance despite extensive literature on trust. The Multisource AI Scorecard Table (MAST), a checklist rating system, addresses this gap in designing and evaluating AI-enabled decision support systems. We propose the Principled Approach for Designing Trustable Human-centered AI systems using MAST Methodology (PADTHAI-MM), a…
▽ More
Designing for AI trustworthiness is challenging, with a lack of practical guidance despite extensive literature on trust. The Multisource AI Scorecard Table (MAST), a checklist rating system, addresses this gap in designing and evaluating AI-enabled decision support systems. We propose the Principled Approach for Designing Trustable Human-centered AI systems using MAST Methodology (PADTHAI-MM), a nine-step framework what we demonstrate through the iterative design of a text analysis platform called the REporting Assistant for Defense and Intelligence Tasks (READIT). We designed two versions of READIT, high-MAST including AI context and explanations, and low-MAST resembling a "black box" type system. Participant feedback and state-of-the-art AI knowledge was integrated in the design process, leading to a redesigned prototype tested by participants in an intelligence reporting task. Results show that MAST-guided design can improve trust perceptions, and that MAST criteria can be linked to performance, process, and purpose information, providing a practical and theory-informed basis for AI system design.
△ Less
Submitted 24 January, 2024;
originally announced January 2024.
-
Evaluating Trustworthiness of AI-Enabled Decision Support Systems: Validation of the Multisource AI Scorecard Table (MAST)
Authors:
Pouria Salehi,
Yang Ba,
Nayoung Kim,
Ahmadreza Mosallanezhad,
Anna Pan,
Myke C. Cohen,
Yixuan Wang,
Jieqiong Zhao,
Shawaiz Bhatti,
James Sung,
Erik Blasch,
Michelle V. Mancenido,
Erin K. Chiou
Abstract:
The Multisource AI Scorecard Table (MAST) is a checklist tool based on analytic tradecraft standards to inform the design and evaluation of trustworthy AI systems. In this study, we evaluate whether MAST is associated with people's trust perceptions in AI-enabled decision support systems (AI-DSSs). Evaluating trust in AI-DSSs poses challenges to researchers and practitioners. These challenges incl…
▽ More
The Multisource AI Scorecard Table (MAST) is a checklist tool based on analytic tradecraft standards to inform the design and evaluation of trustworthy AI systems. In this study, we evaluate whether MAST is associated with people's trust perceptions in AI-enabled decision support systems (AI-DSSs). Evaluating trust in AI-DSSs poses challenges to researchers and practitioners. These challenges include identifying the components, capabilities, and potential of these systems, many of which are based on the complex deep learning algorithms that drive DSS performance and preclude complete manual inspection. We developed two interactive, AI-DSS test environments using the MAST criteria. One emulated an identity verification task in security screening, and another emulated a text summarization system to aid in an investigative reporting task. Each test environment had one version designed to match low-MAST ratings, and another designed to match high-MAST ratings, with the hypothesis that MAST ratings would be positively related to the trust ratings of these systems. A total of 177 subject matter experts were recruited to interact with and evaluate these systems. Results generally show higher MAST ratings for the high-MAST conditions compared to the low-MAST groups, and that measures of trust perception are highly correlated with the MAST ratings. We conclude that MAST can be a useful tool for designing and evaluating systems that will engender high trust perceptions, including AI-DSS that may be used to support visual screening and text summarization tasks. However, higher MAST ratings may not translate to higher joint performance.
△ Less
Submitted 29 November, 2023;
originally announced November 2023.
-
Robust Stance Detection: Understanding Public Perceptions in Social Media
Authors:
Nayoung Kim,
David Mosallanezhad,
Lu Cheng,
Michelle V. Mancenido,
Huan Liu
Abstract:
The abundance of social media data has presented opportunities for accurately determining public and group-specific stances around policy proposals or controversial topics. In contrast with sentiment analysis which focuses on identifying prevailing emotions, stance detection identifies precise positions (i.e., supportive, opposing, neutral) relative to a well-defined topic, such as perceptions tow…
▽ More
The abundance of social media data has presented opportunities for accurately determining public and group-specific stances around policy proposals or controversial topics. In contrast with sentiment analysis which focuses on identifying prevailing emotions, stance detection identifies precise positions (i.e., supportive, opposing, neutral) relative to a well-defined topic, such as perceptions toward specific global health interventions during the COVID-19 pandemic. Traditional stance detection models, while effective within their specific domain (e.g., attitudes towards masking protocols during COVID-19), often lag in performance when applied to new domains and topics due to changes in data distribution. This limitation is compounded by the scarcity of domain-specific, labeled datasets, which are expensive and labor-intensive to create. A solution we present in this paper combines counterfactual data augmentation with contrastive learning to enhance the robustness of stance detection across domains and topics of interest. We evaluate the performance of current state-of-the-art stance detection models, including a prompt-optimized large language model, relative to our proposed framework succinctly called STANCE-C3 (domain-adaptive Cross-target STANCE detection via Contrastive learning and Counterfactual generation). Empirical evaluations demonstrate STANCE-C3's consistent improvements over the baseline models with respect to accuracy across domains and varying focal topics. Despite the increasing prevalence of general-purpose models such as generative AI, specialized models such as STANCE-C3 provide utility in safety-critical domains wherein precision is highly valued, especially when a nuanced understanding of the concerns of different population segments could result in crafting more impactful public policies.
△ Less
Submitted 1 July, 2024; v1 submitted 26 September, 2023;
originally announced September 2023.
-
Domain Adaptive Fake News Detection via Reinforcement Learning
Authors:
Ahmadreza Mosallanezhad,
Mansooreh Karami,
Kai Shu,
Michelle V. Mancenido,
Huan Liu
Abstract:
With social media being a major force in information consumption, accelerated propagation of fake news has presented new challenges for platforms to distinguish between legitimate and fake news. Effective fake news detection is a non-trivial task due to the diverse nature of news domains and expensive annotation costs. In this work, we address the limitations of existing automated fake news detect…
▽ More
With social media being a major force in information consumption, accelerated propagation of fake news has presented new challenges for platforms to distinguish between legitimate and fake news. Effective fake news detection is a non-trivial task due to the diverse nature of news domains and expensive annotation costs. In this work, we address the limitations of existing automated fake news detection models by incorporating auxiliary information (e.g., user comments and user-news interactions) into a novel reinforcement learning-based model called \textbf{RE}inforced \textbf{A}daptive \textbf{L}earning \textbf{F}ake \textbf{N}ews \textbf{D}etection (REAL-FND). REAL-FND exploits cross-domain and within-domain knowledge that makes it robust in a target domain, despite being trained in a different source domain. Extensive experiments on real-world datasets illustrate the effectiveness of the proposed model, especially when limited labeled data is available in the target domain.
△ Less
Submitted 16 February, 2022;
originally announced February 2022.
-
"Let's Eat Grandma": Does Punctuation Matter in Sentence Representation?
Authors:
Mansooreh Karami,
Ahmadreza Mosallanezhad,
Michelle V Mancenido,
Huan Liu
Abstract:
Neural network-based embeddings have been the mainstream approach for creating a vector representation of the text to capture lexical and semantic similarities and dissimilarities. In general, existing encoding methods dismiss the punctuation as insignificant information; consequently, they are routinely treated as a predefined token/word or eliminated in the pre-processing phase. However, punctua…
▽ More
Neural network-based embeddings have been the mainstream approach for creating a vector representation of the text to capture lexical and semantic similarities and dissimilarities. In general, existing encoding methods dismiss the punctuation as insignificant information; consequently, they are routinely treated as a predefined token/word or eliminated in the pre-processing phase. However, punctuation could play a significant role in the semantics of the sentences, as in "Let's eat\hl{,} grandma" and "Let's eat grandma". We hypothesize that a punctuation-aware representation model would affect the performance of the downstream tasks. Thereby, we propose a model-agnostic method that incorporates both syntactic and contextual information to improve the performance of the sentiment classification task. We corroborate our findings by conducting experiments on publicly available datasets and provide case studies that our model generates representations with respect to the punctuation in the sentence.
△ Less
Submitted 7 August, 2022; v1 submitted 10 December, 2020;
originally announced January 2021.
-
Toward Privacy and Utility Preserving Image Representation
Authors:
Ahmadreza Mosallanezhad,
Yasin N. Silva,
Michelle V. Mancenido,
Huan Liu
Abstract:
Face images are rich data items that are useful and can easily be collected in many applications, such as in 1-to-1 face verification tasks in the domain of security and surveillance systems. Multiple methods have been proposed to protect an individual's privacy by perturbing the images to remove traces of identifiable information, such as gender or race. However, significantly less attention has…
▽ More
Face images are rich data items that are useful and can easily be collected in many applications, such as in 1-to-1 face verification tasks in the domain of security and surveillance systems. Multiple methods have been proposed to protect an individual's privacy by perturbing the images to remove traces of identifiable information, such as gender or race. However, significantly less attention has been given to the problem of protecting images while maintaining optimal task utility. In this paper, we study the novel problem of creating privacy-preserving image representations with respect to a given utility task by proposing a principled framework called the Adversarial Image Anonymizer (AIA). AIA first creates an image representation using a generative model, then enhances the learned image representations using adversarial learning to preserve privacy and utility for a given task. Experiments were conducted on a publicly available data set to demonstrate the effectiveness of AIA as a privacy-preserving mechanism for face images.
△ Less
Submitted 17 October, 2020; v1 submitted 29 September, 2020;
originally announced September 2020.