The Future of Research on Social Technologies: CCC Workshop Visioning Report

Authors: Motahhare Eslami, Eric Gilbert, Sarita Schoenebeck, Eric P. S. Baumer, Eshwar Chandrasekharan, Michelle De Mooy, Karrie Karahalios, David Karger, Tressie McMillan Cottom, Andrés Monroy-Hernández, Loren Terveen, John Wihbey

Abstract: Social technologies are the systems, interfaces, features, infrastructures, and architectures that allow people to interact with each other online. These technologies dramatically shape the fabric of our everyday lives, from the information we consume to the people we interact with to the foundations of our culture and politics. While the benefits of social technologies are well documented, the harms, too, have cast a long shadow. To address widespread problems like harassment, disinformation, information access, and mental health concerns, we need to rethink the foundations of how social technologies are designed, sustained, and governed. This report is based on discussions at the Computing Community Consortium Workshop, The Future of Research on Social Technologies, that was held November 2-3, 2023 in Washington, DC. The visioning workshop came together to focus on two questions. What should we know about social technologies, and what is needed to get there? The workshop brought together over 50 information and computer scientists, social scientists, communication and journalism scholars, and policy experts. We used a discussion format, with one day of guiding topics and a second day using an unconference model where participants created discussion topics. The interdisciplinary group of attendees discussed gaps in existing scholarship and the methods, resources, access, and collective effort needed to address those gaps. We also discussed approaches for translating scholarship for various audiences including citizens, funders, educators, industry professionals, and policymakers. This report presents a synthesis of major themes during our discussions. The themes presented are not a summary of what we know already, they are an exploration of what we do not know enough about, and what we should spend more effort and investment on in the coming years.

Submitted 16 April, 2024; originally announced April 2024.

arXiv:2403.07150 [pdf]

Breaking Political Filter Bubbles via Social Comparison

Authors: Nouran Soliman, Motahhare Eslami, Karrie Karahalios

Abstract: Online social platforms allow users to filter out content they do not like. According to selective exposure theory, people tend to view content they agree with more to get more self-assurance. This causes people to live in ideological filter bubbles. We report on a user study that encourages users to break the political filter bubble of their Twitter feed by reading more diverse viewpoints through social comparison. The user study is conducted using political-bias analyzing and Twitter-mirroring tools to compare the political slant of what a user reads and what other Twitter users read about a topic, and in general. The results show that social comparison can have a great impact on users' reading behavior by motivating them to read viewpoints from the opposing political party.

Submitted 11 March, 2024; originally announced March 2024.

Comments: * Both of the first two authors contributed equally to this work

arXiv:2309.07287 [pdf, other]

Enhancing Child Vocalization Classification with Phonetically-Tuned Embeddings for Assisting Autism Diagnosis

Authors: Jialu Li, Mark Hasegawa-Johnson, Karrie Karahalios

Abstract: The assessment of children at risk of autism typically involves a clinician observing, taking notes, and rating children's behaviors. A machine learning model that can label adult and child audio may largely save labor in coding children's behaviors, helping clinicians capture critical events and better communicate with parents. In this study, we leverage Wav2Vec 2.0 (W2V2), pre-trained on 4300-hour of home audio of children under 5 years old, to build a unified system for tasks of clinician-child speaker diarization and vocalization classification (VC). To enhance children's VC, we build a W2V2 phoneme recognition system for children under 4 years old, and we incorporate its phonetically-tuned embeddings as auxiliary features or recognize pseudo phonetic transcripts as an auxiliary task. We test our method on two corpora (Rapid-ABC and BabbleCor) and obtain consistent improvements. Additionally, we outperform the state-of-the-art performance on the reproducible subset of BabbleCor. Code available at https://huggingface.co/lijialudew

Submitted 5 June, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

Comments: Accepted to Interspeech 2024

arXiv:2302.00832 [pdf, other]

doi 10.1145/3544548.3581252

Inform the uninformed: Improving Online Informed Consent Reading with an AI-Powered Chatbot

Authors: Ziang Xiao, Tiffany Wenting Li, Karrie Karahalios, Hari Sundaram

Abstract: Informed consent is a core cornerstone of ethics in human subject research. Through the informed consent process, participants learn about the study procedure, benefits, risks, and more to make an informed decision. However, recent studies showed that current practices might lead to uninformed decisions and expose participants to unknown risks, especially in online studies. Without the researcher's presence and guidance, online participants must read a lengthy form on their own with no answers to their questions. In this paper, we examined the role of an AI-powered chatbot in improving informed consent online. By comparing the chatbot with form-based interaction, we found the chatbot improved consent form reading, promoted participants' feelings of agency, and closed the power gap between the participant and the researcher. Our exploratory analysis further revealed the altered power dynamic might eventually benefit study response quality. We discussed design implications for creating AI-powered chatbots to offer effective informed consent in broader settings.

Submitted 1 February, 2023; originally announced February 2023.

Comments: Accepted by CHI 2023

arXiv:2210.08974 [pdf]

Coordinated Science Laboratory 70th Anniversary Symposium: The Future of Computing

Authors: Klara Nahrstedt, Naresh Shanbhag, Vikram Adve, Nancy Amato, Romit Roy Choudhury, Carl Gunter, Nam Sung Kim, Olgica Milenkovic, Sayan Mitra, Lav Varshney, Yurii Vlasov, Sarita Adve, Rashid Bashir, Andreas Cangellaris, James DiCarlo, Katie Driggs-Campbell, Nick Feamster, Mattia Gazzola, Karrie Karahalios, Sanmi Koyejo, Paul Kwiat, Bo Li, Negar Mehr, Ravish Mehra, Andrew Miller, et al. (3 additional authors not shown)

Abstract: In 2021, the Coordinated Science Laboratory CSL, an Interdisciplinary Research Unit at the University of Illinois Urbana-Champaign, hosted the Future of Computing Symposium to celebrate its 70th anniversary. CSL's research covers the full computing stack, computing's impact on society and the resulting need for social responsibility. In this white paper, we summarize the major technological points, insights, and directions that speakers brought forward during the Future of Computing Symposium. Participants discussed topics related to new computing paradigms, technologies, algorithms, behaviors, and research challenges to be expected in the future. The symposium focused on new computing paradigms that are going beyond traditional computing and the research needed to support their realization. These needs included stressing security and privacy, the end to end human cyber physical systems and with them the analysis of the end to end artificial intelligence needs. Furthermore, advances that enable immersive environments for users, the boundaries between humans and machines will blur and become seamless. Particular integration challenges were made clear in the final discussion on the integration of autonomous driving, robo taxis, pedestrians, and future cities. Innovative approaches were outlined to motivate the next generation of researchers to work on these challenges. The discussion brought out the importance of considering not just individual research areas, but innovations at the intersections between computing research efforts and relevant application domains, such as health care, transportation, energy systems, and manufacturing.

Submitted 4 October, 2022; originally announced October 2022.

arXiv:2205.10977 [pdf, other]

What should I Ask: A Knowledge-driven Approach for Follow-up Questions Generation in Conversational Surveys

Authors: Yubin Ge, Ziang Xiao, Jana Diesner, Heng Ji, Karrie Karahalios, Hari Sundaram

Abstract: Generating follow-up questions on the fly could significantly improve conversational survey quality and user experiences by enabling a more dynamic and personalized survey structure. In this paper, we proposed a novel task for knowledge-driven follow-up question generation in conversational surveys. We constructed a new human-annotated dataset of human-written follow-up questions with dialogue history and labeled knowledge in the context of conversational surveys. Along with the dataset, we designed and validated a set of reference-free Gricean-inspired evaluation metrics to systematically evaluate the quality of generated follow-up questions. We then propose a two-staged knowledge-driven model for the task, which generates informative and coherent follow-up questions by using knowledge to steer the generation process. The experiments demonstrate that compared to GPT-based baseline models, our two-staged model generates more informative, coherent, and clear follow-up questions.

Submitted 13 October, 2023; v1 submitted 22 May, 2022; originally announced May 2022.

arXiv:2102.07070 [pdf, other]

Deconstructing Categorization in Visualization Recommendation: A Taxonomy and Comparative Study

Authors: Doris Jung-Lin Lee, Vidya Setlur, Melanie Tory, Karrie Karahalios, Aditya Parameswaran

Abstract: Visualization recommendation (VisRec) systems provide users with suggestions for potentially interesting and useful next steps during exploratory data analysis. These recommendations are typically organized into categories based on their analytical actions, i.e., operations employed to transition from the current exploration state to a recommended visualization. However, despite the emergence of a plethora of VisRec systems in recent work, the utility of the categories employed by these systems in analytical workflows has not been systematically investigated. Our paper explores the efficacy of recommendation categories by formalizing a taxonomy of common categories and developing a system, Frontier, that implements these categories. Using Frontier, we evaluate workflow strategies adopted by users and how categories influence those strategies. Participants found recommendations that add attributes to enhance the current visualization and recommendations that filter to sub-populations to be comparatively most useful during data exploration. Our findings pave the way for next-generation VisRec systems that are adaptive and personalized via carefully chosen, effective recommendation categories.

Submitted 14 February, 2021; originally announced February 2021.

Comments: 10 pages. This work has been submitted to IEEE TVCG. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:1910.00757 [pdf, other]

doi 10.1145/3359222

Quantifying Voter Biases in Online Platforms: An Instrumental Variable Approach

Authors: Himel Dev, Karrie Karahalios, Hari Sundaram

Abstract: In content-based online platforms, use of aggregate user feedback (say, the sum of votes) is commonplace as the "gold standard" for measuring content quality. Use of vote aggregates, however, is at odds with the existing empirical literature, which suggests that voters are susceptible to different biases -- reputation (e.g., of the poster), social influence (e.g., votes thus far), and position (e.g., answer position). Our goal is to quantify, in an observational setting, the degree of these biases in online platforms. Specifically, what are the causal effects of different impression signals -- such as the reputation of the contributing user, aggregate vote thus far, and position of content -- on a participant's vote on content? We adopt an instrumental variable (IV) framework to answer this question. We identify a set of candidate instruments, carefully analyze their validity, and then use the valid instruments to reveal the effects of the impression signals on votes. Our empirical study using log data from Stack Exchange websites shows that the bias estimates from our IV approach differ from the bias estimates from the ordinary least squares (OLS) method. In particular, OLS underestimates reputation bias (1.6--2.2x for gold badges) and position bias (up to 1.9x for the initial position) and overestimates social influence bias (1.8--2.3x for initial votes). The implications of our work include: redesigning user interface to avoid voter biases; making changes to platforms' policy to mitigate voter biases; detecting other forms of biases in online platforms.

Submitted 1 October, 2019; originally announced October 2019.

Comments: The 22nd ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW), 2019

Journal ref: Proceedings of the ACM on Human Computer Interaction, Vol. 3, No. CSCW, Article 120. Publication date: November 2019

arXiv:1811.07977 [pdf, other]

ShapeSearch: A Flexible and Efficient System for Shape-based Exploration of Trendlines

Authors: Tarique Siddiqui, Zesheng Wang, Paul Luh, Karrie Karahalios, Aditya Parameswaran

Abstract: Identifying trendline visualizations with desired patterns is a common and fundamental data exploration task. Existing visual analytics tools offer limited flexibility and expressiveness for such tasks, especially when the pattern of interest is under-specified and approximate, and do not scale well when the pattern searching needs are ad-hoc, as is often the case. We propose ShapeSearch, an efficient and flexible pattern-searching tool, that enables the search for desired patterns via multiple mechanisms: sketch, natural-language, and visual regular expressions. We develop a novel shape querying algebra, with a minimal set of primitives and operators that can express a large number of ShapeSearch queries, and design a natural-language and regex-based parser to automatically parse and translate user queries to the algebra representation. To execute these queries within interactive response times, ShapeSearch uses a fast shape algebra-based execution engine with query-aware optimizations, and perceptually-aware scoring methodologies. We present a thorough evaluation of the system, including a general-purpose user study, a case study involving genomic data analysis, as well as performance experiments, comparing against state-of-the-art time series shape matching approaches---that together demonstrate the usability and scalability of ShapeSearch.

Submitted 29 January, 2020; v1 submitted 19 November, 2018; originally announced November 2018.

arXiv:1801.03829 [pdf, other]

Characterizing Scalability Issues in Spreadsheet Software using Online Forums

Authors: Kelly Mack, John Lee, Kevin Chang, Karrie Karahalios, Aditya Parameswaran

Abstract: In traditional usability studies, researchers talk to users of tools to understand their needs and challenges. Insights gained via such interviews offer context, detail, and background. Due to costs in time and money, we are beginning to see a new form of tool interrogation that prioritizes scale, cost, and breadth by utilizing existing data from online forums. In this case study, we set out to apply this method of using online forum data to a specific issue---challenges that users face with Excel spreadsheets. Spreadsheets are a versatile and powerful processing tool if used properly. However, with versatility and power come errors, from both users and the software, which make using spreadsheets less effective. By scraping posts from the website Reddit, we collected a dataset of questions and complaints about Excel. Specifically, we explored and characterized the issues users were facing with spreadsheet software in general, and in particular, as resulting from a large amount of data in their spreadsheets. We discuss the implications of our findings on the design of next-generation spreadsheet software.

Submitted 30 January, 2018; v1 submitted 11 January, 2018; originally announced January 2018.

arXiv:1710.00763 [pdf, other]

doi 10.1109/TVCG.2019.2934666

You can't always sketch what you want: Understanding Sensemaking in Visual Query Systems

Authors: Doris Jung-Lin Lee, John Lee, Tarique Siddiqui, Jaewoo Kim, Karrie Karahalios, Aditya Parameswaran

Abstract: Visual query systems (VQSs) empower users to interactively search for line charts with desired visual patterns, typically specified using intuitive sketch-based interfaces. Despite decades of past work on VQSs, these efforts have not translated to adoption in practice, possibly because VQSs are largely evaluated in unrealistic lab-based settings. To remedy this gap in adoption, we collaborated with experts from three diverse domains---astronomy, genetics, and material science---via a year-long user-centered design process to develop a VQS that supports their workflow and analytical needs, and evaluate how VQSs can be used in practice. Our study results reveal that ad-hoc sketch-only querying is not as commonly used as prior work suggests, since analysts are often unable to precisely express their patterns of interest. In addition, we characterize three essential sensemaking processes supported by our enhanced VQS. We discover that participants employ all three processes, but in different proportions, depending on the analytical needs in each domain. Our findings suggest that all three sensemaking processes must be integrated in order to make future VQSs useful for a wide range of analytical inquiries.

Submitted 3 October, 2019; v1 submitted 2 October, 2017; originally announced October 2017.

Comments: Accepted for presentation at IEEE VAST 2019, to be held October 20-25 in Vancouver, Canada. Paper will also be published in a special issue of IEEE Transactions on Visualization and Computer Graphics (TVCG) IEEE VIS (InfoVis/VAST/SciVis) 2019 ACM 2012 CCS - Human-centered computing, Visualization, Visualization design and evaluation methods

arXiv:1704.01347 [pdf, ps, other]

doi 10.1145/2998181.2998321

Quantifying Search Bias: Investigating Sources of Bias for Political Searches in Social Media

Authors: Juhi Kulshrestha, Motahhare Eslami, Johnnatan Messias, Muhammad Bilal Zafar, Saptarshi Ghosh, Krishna P. Gummadi, Karrie Karahalios

Abstract: Search systems in online social media sites are frequently used to find information about ongoing events and people. For topics with multiple competing perspectives, such as political events or political candidates, bias in the top ranked results significantly shapes public opinion. However, bias does not emerge from an algorithm alone. It is important to distinguish between the bias that arises from the data that serves as the input to the ranking system and the bias that arises from the ranking system itself. In this paper, we propose a framework to quantify these distinct biases and apply this framework to politics-related queries on Twitter. We found that both the input data and the ranking system contribute significantly to produce varying amounts of bias in the search results and in different ways. We discuss the consequences of these biases and possible mechanisms to signal this bias in social media search systems' interfaces.

Submitted 5 April, 2017; originally announced April 2017.

Comments: In Proceedings of ACM Conference on Computer Supported Cooperative Work & Social Computing (CSCW), Portland, USA, February 2017

arXiv:1604.03583 [pdf, other]

Effortless Data Exploration with zenvisage: An Expressive and Interactive Visual Analytics System

Authors: Tarique Siddiqui, Albert Kim, John Lee, Karrie Karahalios, Aditya Parameswaran

Abstract: Data visualization is by far the most commonly used mechanism to explore data, especially by novice data analysts and data scientists. And yet, current visual analytics tools are rather limited in their ability to guide data scientists to interesting or desired visualizations: the process of visual data exploration remains cumbersome and time-consuming. We propose zenvisage, a platform for effortlessly visualizing interesting patterns, trends, or insights from large datasets. We describe zenvisage's general purpose visual query language, ZQL ("zee-quel") for specifying the desired visual trend, pattern, or insight - ZQL draws from use-cases in a variety of domains, including biology, mechanical engineering, climate science, and commerce. We formalize the expressiveness of ZQL via a visual exploration algebra, and demonstrate that ZQL is at least as expressive as that algebra. While analysts are free to use ZQL directly, we also expose ZQL via a visual specification interface that we describe in this paper. We then describe our architecture and optimizations, preliminary experiments in supporting and optimizing for ZQL queries in our initial zenvisage prototype, and a user study to evaluate whether data scientists are able to effectively use zenvisage for real applications.

Submitted 4 January, 2018; v1 submitted 12 April, 2016; originally announced April 2016.

Comments: Tech Report

Showing 1–13 of 13 results for author: Karahalios, K