subscribe to arXiv mailings

Trusting the Search: Unraveling Human Trust in Health Information from Google and ChatGPT

Authors: Xin Sun, Rongjun Ma, Xiaochang Zhao, Zhuying Li, Janne Lindqvist, Abdallah El Ali, Jos A. Bosch

Abstract: People increasingly rely on online sources for health information seeking due to their convenience and timeliness, traditionally using search engines like Google as the primary search agent. Recently, the emergence of generative Artificial Intelligence (AI) has made Large Language Model (LLM) powered conversational agents such as ChatGPT a viable alternative for health information search. However,… ▽ More People increasingly rely on online sources for health information seeking due to their convenience and timeliness, traditionally using search engines like Google as the primary search agent. Recently, the emergence of generative Artificial Intelligence (AI) has made Large Language Model (LLM) powered conversational agents such as ChatGPT a viable alternative for health information search. However, while trust is crucial for adopting the online health advice, the factors influencing people's trust judgments in health information provided by LLM-powered conversational agents remain unclear. To address this, we conducted a mixed-methods, within-subjects lab study (N=21) to explore how interactions with different agents (ChatGPT vs. Google) across three health search tasks influence participants' trust judgments of the search results as well as the search agents themselves. Our key findings showed that: (a) participants' trust levels in ChatGPT were significantly higher than Google in the context of health information seeking; (b) there is a significant correlation between trust in health-related information and trust in the search agent, however only for Google; (c) the type of search tasks did not affect participants' perceived trust; and (d) participants' prior knowledge, the style of information presentation, and the interactive manner of using search agents were key determinants of trust in the health-related information. Our study taps into differences in trust perceptions when using traditional search engines compared to LLM-powered conversational agents. We highlight the potential role LLMs play in health-related information-seeking contexts, where they excel as stepping stones for further search. We contribute key factors and considerations for ensuring effective and reliable personal health information seeking in the age of generative AI. △ Less

Submitted 14 March, 2024; originally announced March 2024.

Comments: 24 pages

ACM Class: F.2.2, I.2.7

arXiv:2402.16688 [pdf, ps, other]

On the connection between Noise-Contrastive Estimation and Contrastive Divergence

Authors: Amanda Olmin, Jakob Lindqvist, Lennart Svensson, Fredrik Lindsten

Abstract: Noise-contrastive estimation (NCE) is a popular method for estimating unnormalised probabilistic models, such as energy-based models, which are effective for modelling complex data distributions. Unlike classical maximum likelihood (ML) estimation that relies on importance sampling (resulting in ML-IS) or MCMC (resulting in contrastive divergence, CD), NCE uses a proxy criterion to avoid the need… ▽ More Noise-contrastive estimation (NCE) is a popular method for estimating unnormalised probabilistic models, such as energy-based models, which are effective for modelling complex data distributions. Unlike classical maximum likelihood (ML) estimation that relies on importance sampling (resulting in ML-IS) or MCMC (resulting in contrastive divergence, CD), NCE uses a proxy criterion to avoid the need for evaluating an often intractable normalisation constant. Despite apparent conceptual differences, we show that two NCE criteria, ranking NCE (RNCE) and conditional NCE (CNCE), can be viewed as ML estimation methods. Specifically, RNCE is equivalent to ML estimation combined with conditional importance sampling, and both RNCE and CNCE are special cases of CD. These findings bridge the gap between the two method classes and allow us to apply techniques from the ML-IS and CD literature to NCE, offering several advantageous extensions. △ Less

Submitted 26 February, 2024; originally announced February 2024.

Comments: Accepted to AISTATS 2024

arXiv:2307.14012 [pdf, other]

MCMC-Correction of Score-Based Diffusion Models for Model Composition

Authors: Anders Sjöberg, Jakob Lindqvist, Magnus Önnheim, Mats Jirstrand, Lennart Svensson

Abstract: Diffusion models can be parameterised in terms of either a score or an energy function. An energy parameterisation is appealing since it enables an extended sampling procedure with a Metropolis--Hastings (MH) correction step, based on the change in total energy in the proposed samples. Improved sampling is important for model compositions, where off-the-shelf models are combined with each other, i… ▽ More Diffusion models can be parameterised in terms of either a score or an energy function. An energy parameterisation is appealing since it enables an extended sampling procedure with a Metropolis--Hastings (MH) correction step, based on the change in total energy in the proposed samples. Improved sampling is important for model compositions, where off-the-shelf models are combined with each other, in order to sample from new distributions. For model composition, score-based diffusions have the advantages that they are popular and that many pre-trained models are readily available. However, this parameterisation does not, in general, define an energy, and the MH acceptance probability is therefore unavailable, and generally ill-defined. We propose keeping the score parameterisation and computing an acceptance probability inspired by energy-based models through line integration of the score function. This allows us to reuse existing diffusion models and still combine the reverse process with various Markov-Chain Monte Carlo (MCMC) methods. We evaluate our method using numerical experiments and find that score-parameterised versions of the MCMC samplers can achieve similar improvements to the corresponding energy parameterisation. △ Less

Submitted 10 July, 2024; v1 submitted 26 July, 2023; originally announced July 2023.

arXiv:2210.04569 [pdf, other]

Systematic Evaluation and User Study of Privacy of Default Apps in Apple's Mobile Ecosystem

Authors: Amel Bourdoucen, Janne Lindqvist

Abstract: Users need to configure default apps when they first start using their devices. The privacy configurations of the default apps do not always match what users think they have initially enabled. We first systematically evaluated the privacy configurations of default apps. We discovered serious issues with the documentation of the default apps. Based on these findings, we explored users' experiences… ▽ More Users need to configure default apps when they first start using their devices. The privacy configurations of the default apps do not always match what users think they have initially enabled. We first systematically evaluated the privacy configurations of default apps. We discovered serious issues with the documentation of the default apps. Based on these findings, we explored users' experiences with an interview study (N=15). Our findings from both studies show that: the instructions of setting privacy configurations of default apps are vague and lack required steps; users were unable to disable default apps from accessing their personal information; users assumed they were being tracked by some default apps; default apps may cause tensions in family relationships because of information sharing. Our results illuminate on the privacy and security implications of configuring the privacy of default apps and how users perceive and understand the mobile ecosystem. △ Less

Submitted 10 October, 2022; originally announced October 2022.

Comments: 23 pages, 1 Figure

arXiv:2204.08335 [pdf, other]

doi 10.1007/978-981-99-1642-9_17

Active Learning with Weak Supervision for Gaussian Processes

Authors: Amanda Olmin, Jakob Lindqvist, Lennart Svensson, Fredrik Lindsten

Abstract: Annotating data for supervised learning can be costly. When the annotation budget is limited, active learning can be used to select and annotate those observations that are likely to give the most gain in model performance. We propose an active learning algorithm that, in addition to selecting which observation to annotate, selects the precision of the annotation that is acquired. Assuming that an… ▽ More Annotating data for supervised learning can be costly. When the annotation budget is limited, active learning can be used to select and annotate those observations that are likely to give the most gain in model performance. We propose an active learning algorithm that, in addition to selecting which observation to annotate, selects the precision of the annotation that is acquired. Assuming that annotations with low precision are cheaper to obtain, this allows the model to explore a larger part of the input space, with the same annotation budget. We build our acquisition function on the previously proposed BALD objective for Gaussian Processes, and empirically demonstrate the gains of being able to adjust the annotation precision in the active learning loop. △ Less

Submitted 9 June, 2023; v1 submitted 18 April, 2022; originally announced April 2022.

Journal ref: In: ICONIP. Communications in Computer and Information Science, vol 1792. Springer, Singapore (2023)

arXiv:2002.11531 [pdf, ps, other]

doi 10.1109/MLSP49062.2020.9231703

A general framework for ensemble distribution distillation

Authors: Jakob Lindqvist, Amanda Olmin, Fredrik Lindsten, Lennart Svensson

Abstract: Ensembles of neural networks have been shown to give better performance than single networks, both in terms of predictions and uncertainty estimation. Additionally, ensembles allow the uncertainty to be decomposed into aleatoric (data) and epistemic (model) components, giving a more complete picture of the predictive uncertainty. Ensemble distillation is the process of compressing an ensemble into… ▽ More Ensembles of neural networks have been shown to give better performance than single networks, both in terms of predictions and uncertainty estimation. Additionally, ensembles allow the uncertainty to be decomposed into aleatoric (data) and epistemic (model) components, giving a more complete picture of the predictive uncertainty. Ensemble distillation is the process of compressing an ensemble into a single model, often resulting in a leaner model that still outperforms the individual ensemble members. Unfortunately, standard distillation erases the natural uncertainty decomposition of the ensemble. We present a general framework for distilling both regression and classification ensembles in a way that preserves the decomposition. We demonstrate the desired behaviour of our framework and show that its predictive performance is on par with standard distillation. △ Less

Submitted 8 January, 2021; v1 submitted 26 February, 2020; originally announced February 2020.

Journal ref: 2020 IEEE 30th International Workshop on Machine Learning for Signal Processing (MLSP), Espoo, Finland, 2020, pp. 1-6

arXiv:1812.09410 [pdf, other]

Quantifying the Security of Recognition Passwords: Gestures and Signatures

Authors: Can Liu, Shridatt Sugrim, Gradeigh D. Clark, Janne Lindqvist

Abstract: Gesture and signature passwords are two-dimensional figures created by drawing on the surface of a touchscreen with one or more fingers. Prior results about their security have used resilience to either shoulder surfing, a human observation attack, or dictionary attacks. These evaluations restrict generalizability since the results are: non-comparable to other password systems (e.g. PINs), harder… ▽ More Gesture and signature passwords are two-dimensional figures created by drawing on the surface of a touchscreen with one or more fingers. Prior results about their security have used resilience to either shoulder surfing, a human observation attack, or dictionary attacks. These evaluations restrict generalizability since the results are: non-comparable to other password systems (e.g. PINs), harder to reproduce, and attacker-dependent. Strong statements about the security of a password system use an analysis of the statistical distribution of the password space, which models a best-case attacker who guesses passwords in order of most likely to least likely. Estimating the distribution of recognition passwords is challenging because many different trials need to map to one password. In this paper, we solve this difficult problem by: (1) representing a recognition password of continuous data as a discrete alphabet set, and (2) estimating the password distribution through modeling the unseen passwords. We use Symbolic Aggregate approXimation (SAX) to represent time series data as symbols and develop Markov chains to model recognition passwords. We use a partial guessing metric, which demonstrates how many guesses an attacker needs to crack a percentage of the entire space, to compare the security of the distributions for gestures, signatures, and Android unlock patterns. We found the lower bounds of the partial guessing metric of gestures and signatures are much higher than the upper bound of the partial guessing metric of Android unlock patterns. △ Less

Submitted 21 December, 2018; originally announced December 2018.

arXiv:1710.06932 [pdf, other]

Transforming Speed Sequences into Road Rays on the Map with Elastic Pathing

Authors: Xianyi Gao, Bernhard Firner, Shridatt Sugrim, Victor Kaiser-Pendergrast, Yulong Yang, Janne Lindqvist

Abstract: Advances in technology have provided ways to monitor and measure driving behavior. Recently, this technology has been applied to usage-based automotive insurance policies that offer reduced insurance premiums to policy holders who opt-in to automotive monitoring. Several companies claim to measure only speed data, which they further claim preserves privacy. However, we have developed an algorithm… ▽ More Advances in technology have provided ways to monitor and measure driving behavior. Recently, this technology has been applied to usage-based automotive insurance policies that offer reduced insurance premiums to policy holders who opt-in to automotive monitoring. Several companies claim to measure only speed data, which they further claim preserves privacy. However, we have developed an algorithm - elastic pathing - that successfully tracks drivers' locations from speed data. The algorithm tracks drivers by assuming a start position, such as the driver's home address (which is typically known to insurance companies), and then estimates the possible routes by fitting the speed data to map data. To demonstrate the algorithm's real-world applicability, we evaluated its performance with driving datasets from central New Jersey and Seattle, Washington, representing suburban and urban areas. We are able to estimate destinations with error within 250 meters for 17% of the traces and within 500 meters for 24% of the traces in the New Jersey dataset, and with error within 250 and 500 meters for 15.5% and 27.5% of the traces, respectively, in the Seattle dataset. Our work shows that these insurance schemes enable a substantial breach of privacy. △ Less

Submitted 18 October, 2017; originally announced October 2017.

Comments: This article supersedes both arXiv:1401.0052 and the earlier work published in Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp'14). Please see http://elasticpathing.org for further information

arXiv:1503.02377 [pdf, other]

Of Two Minds, Multiple Addresses, and One History: Characterizing Opinions, Knowledge, and Perceptions of Bitcoin Across Groups

Authors: Xianyi Gao, Gradeigh D. Clark, Janne Lindqvist

Abstract: Digital currencies represent a new method for exchange and investment that differs strongly from any other fiat money seen throughout history. A digital currency makes it possible to perform all financial transactions without the intervention of a third party to act as an arbiter of verification; payments can be made between two people with degrees of anonymity, across continents, at any denominat… ▽ More Digital currencies represent a new method for exchange and investment that differs strongly from any other fiat money seen throughout history. A digital currency makes it possible to perform all financial transactions without the intervention of a third party to act as an arbiter of verification; payments can be made between two people with degrees of anonymity, across continents, at any denomination, and without any transaction fees going to a central authority. The most successful example of this is Bitcoin, introduced in 2008, which has experienced a recent boom of popularity, media attention, and investment. With this surge of attention, we became interested in finding out how people both inside and outside the Bitcoin community perceive Bitcoin -- what do they think of it, how do they feel, and how knowledgeable they are. Towards this end, we conducted the first interview study (N = 20) with participants to discuss Bitcoin and other related financial topics. Some of our major findings include: not understanding how Bitcoin works is not a barrier for entry, although non-user participants claim it would be for them and that user participants are in a state of cognitive dissonance concerning the role of governments in the system. Our findings, overall, contribute to knowledge concerning Bitcoin and attitudes towards digital currencies in general. △ Less

Submitted 9 March, 2015; originally announced March 2015.

arXiv:1408.6010 [pdf, other]

Engineering Gesture-Based Authentication Systems

Authors: Gradeigh D. Clark, Janne Lindqvist

Abstract: Gestures are a topic of increasing interest in authentication and successful implementation as a security layer requires reliable gesture recognition. So far much work focuses on new ways to recognize gestures, leaving discussion on the viability of recognition in an authentication scheme to the background. It is unclear how recognition should be deployed for practical and robust real-world auth… ▽ More Gestures are a topic of increasing interest in authentication and successful implementation as a security layer requires reliable gesture recognition. So far much work focuses on new ways to recognize gestures, leaving discussion on the viability of recognition in an authentication scheme to the background. It is unclear how recognition should be deployed for practical and robust real-world authentication. In this article, we analyze the effectiveness of different approaches to recognizing gestures and the potential for use in secure gesture-based authentication systems. △ Less

Submitted 26 August, 2014; originally announced August 2014.

arXiv:1403.1910 [pdf, other]

Text Entry Method Affects Password Security

Authors: Yulong Yang, Janne Lindqvist, Antti Oulasvirta

Abstract: Text-based passwords continue to be the prime form of authentication to computer systems. Today, they are increasingly created and used with mobile text entry methods, such as touchscreens and mobile keyboards, in addition to traditional physical keyboards. This raises a foundational question for usable security: whether text entry methods affect password generation and password security. This pap… ▽ More Text-based passwords continue to be the prime form of authentication to computer systems. Today, they are increasingly created and used with mobile text entry methods, such as touchscreens and mobile keyboards, in addition to traditional physical keyboards. This raises a foundational question for usable security: whether text entry methods affect password generation and password security. This paper presents results from a between-group study with 63 participants, in which each group generated passwords for multiple virtual accounts using a different text entry method. Participants were also asked to recall their passwords afterwards. We applied analysis of structures and probabilities, with standard and recent security metrics and also performed cracking attacks on the collected data. The results show a significant effect of text entry methods on passwords. In particular, one of the experimental groups created passwords with significantly more lowercase letters per password than the control group ($t(60) = 2.99, p = 0.004$). The choices for character types in each group were also significantly different ($p=0.048, FET$). Our cracking attacks consequently expose significantly different resistance across groups ($p=0.031, FET$) and text entry method vulnerabilities. Our findings contribute to the understanding of password security in the context of usable interfaces. △ Less

Submitted 7 March, 2014; originally announced March 2014.

arXiv:1401.0561 [pdf, other]

User-Generated Free-Form Gestures for Authentication: Security and Memorability

Authors: Michael Sherman, Gradeigh Clark, Yulong Yang, Shridatt Sugrim, Arttu Modig, Janne Lindqvist, Antti Oulasvirta, Teemu Roos

Abstract: This paper studies the security and memorability of free-form multitouch gestures for mobile authentication. Towards this end, we collected a dataset with a generate-test-retest paradigm where participants (N=63) generated free-form gestures, repeated them, and were later retested for memory. Half of the participants decided to generate one-finger gestures, and the other half generated multi-finge… ▽ More This paper studies the security and memorability of free-form multitouch gestures for mobile authentication. Towards this end, we collected a dataset with a generate-test-retest paradigm where participants (N=63) generated free-form gestures, repeated them, and were later retested for memory. Half of the participants decided to generate one-finger gestures, and the other half generated multi-finger gestures. Although there has been recent work on template-based gestures, there are yet no metrics to analyze security of either template or free-form gestures. For example, entropy-based metrics used for text-based passwords are not suitable for capturing the security and memorability of free-form gestures. Hence, we modify a recently proposed metric for analyzing information capacity of continuous full-body movements for this purpose. Our metric computed estimated mutual information in repeated sets of gestures. Surprisingly, one-finger gestures had higher average mutual information. Gestures with many hard angles and turns had the highest mutual information. The best-remembered gestures included signatures and simple angular shapes. We also implemented a multitouch recognizer to evaluate the practicality of free-form gestures in a real authentication system and how they perform against shoulder surfing attacks. We conclude the paper with strategies for generating secure and memorable free-form gestures, which present a robust method for mobile authentication. △ Less

Submitted 2 January, 2014; originally announced January 2014.

arXiv:1401.0052 [pdf, other]

Elastic Pathing: Your Speed is Enough to Track You

Authors: Bernhard Firner, Shridatt Sugrim, Yulong Yang, Janne Lindqvist

Abstract: Today people increasingly have the opportunity to opt-in to "usage-based" automotive insurance programs for reducing insurance premiums. In these programs, participants install devices in their vehicles that monitor their driving behavior, which raises some privacy concerns. Some devices collect fine-grained speed data to monitor driving habits. Companies that use these devices claim that their ap… ▽ More Today people increasingly have the opportunity to opt-in to "usage-based" automotive insurance programs for reducing insurance premiums. In these programs, participants install devices in their vehicles that monitor their driving behavior, which raises some privacy concerns. Some devices collect fine-grained speed data to monitor driving habits. Companies that use these devices claim that their approach is privacy-preserving because speedometer measurements do not have physical locations. However, we show that with knowledge of the user's home location, as the insurance companies have, speed data is sufficient to discover driving routes and destinations when trip data is collected over a period of weeks. To demonstrate the real-world applicability of our approach we applied our algorithm, elastic pathing, to data collected over hundreds of driving trips occurring over several months. With this data and our approach, we were able to predict trip destinations to within 250 meters of ground truth in 10% of the traces and within 500 meters in 20% of the traces. This result, combined with the amount of speed data that is being collected by insurance companies, constitutes a substantial breach of privacy because a person's regular driving pattern can be deduced with repeated examples of the same paths with just a few weeks of monitoring. △ Less

Submitted 30 December, 2013; originally announced January 2014.

Report number: Technical Report - Rutgers University - WINLAB - TR-429

Showing 1–13 of 13 results for author: Lindqvist, J