Sexism Detection on a Data Diet

Authors: Rabiraj Bandyopadhyay, Dennis Assenmacher, Jose M. Alonso Moral, Claudia Wagner

Abstract: There is an increase in the proliferation of online hate commensurate with the rise in the usage of social media. In response, there is also a significant advancement in the creation of automated tools aimed at identifying harmful text content using approaches grounded in Natural Language Processing and Deep Learning. Although it is known that training Deep Learning models requires a substantial amount of annotated data, a recent line of work suggests that models trained on specific subsets of the data still retain performance comparable to the model that was trained on the full dataset. In this work, we show how we can leverage influence scores to estimate the importance of a data point while training a model and design a pruning strategy applied to the case of sexism detection. We evaluate the model performance trained on data pruned with different pruning strategies on three out-of-domain datasets and find that, in accordance with other work, a large fraction of instances can be removed without significant performance drop. However, we also discover that the strategies for pruning data, previously successful in Natural Language Inference tasks, do not readily apply to the detection of harmful content and instead amplify the already prevalent class imbalance even more, leading in the worst case to a complete absence of the hateful class.

Submitted 7 June, 2024; originally announced June 2024.

Comments: Accepted at ACM WebSci 2024 Workshop in DHOW: Diffusion of Harmful Content on Online Web Workshop

arXiv:2405.10068 [pdf, other]

MrRegNet: Multi-resolution Mask Guided Convolutional Neural Network for Medical Image Registration with Large Deformations

Authors: Ruizhe Li, Grazziela Figueredo, Dorothee Auer, Christian Wagner, Xin Chen

Abstract: Deformable image registration (alignment) is highly sought after in numerous clinical applications, such as computer aided diagnosis and disease progression analysis. Deep Convolutional Neural Network (DCNN)-based image registration methods have demonstrated advantages in terms of registration accuracy and computational speed. However, while most methods excel at global alignment, they often perform worse in aligning local regions. To address this challenge, this paper proposes a mask-guided encoder-decoder DCNN-based image registration method, named MrRegNet. This approach employs a multi-resolution encoder for feature extraction and subsequently estimates multi-resolution displacement fields in the decoder to handle the substantial deformation of images. Furthermore, segmentation masks are employed to direct the model's attention toward aligning local regions. The results show that the proposed method outperforms traditional methods like Demons and a well-known deep learning method, VoxelMorph, on a public 3D brain MRI dataset (OASIS) and a local 2D brain MRI dataset with large deformations. Importantly, the image alignment accuracies are significantly improved at local regions guided by segmentation masks. GitHub link: https://github.com/ruizhe-l/MrRegNet.

Submitted 16 May, 2024; originally announced May 2024.

Comments: Accepted for publication at IEEE International Symposium on Biomedical Imaging (ISBI) 2024

arXiv:2405.08562 [pdf, other]

doi 10.1177/08944393241258771

The Unseen Targets of Hate -- A Systematic Review of Hateful Communication Datasets

Authors: Zehui Yu, Indira Sen, Dennis Assenmacher, Mattia Samory, Leon Fröhling, Christina Dahn, Debora Nozza, Claudia Wagner

Abstract: Machine learning (ML)-based content moderation tools are essential to keep online spaces free from hateful communication. Yet, ML tools can only be as capable as the quality of the data they are trained on allows them. While there is increasing evidence that they underperform in detecting hateful communications directed towards specific identities and may discriminate against them, we know surprisingly little about the provenance of such bias. To fill this gap, we present a systematic review of the datasets for the automated detection of hateful communication introduced over the past decade, and unpack the quality of the datasets in terms of the identities that they embody: those of the targets of hateful communication that the data curators focused on, as well as those unintentionally included in the datasets. We find, overall, a skewed representation of selected target identities and mismatches between the targets that research conceptualizes and ultimately includes in datasets. Yet, by contextualizing these findings in the language and location of origin of the datasets, we highlight a positive trend towards the broadening and diversification of this research space.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 20 pages, 14 figures

arXiv:2403.12308 [pdf, other]

Gradient-based Fuzzy System Optimisation via Automatic Differentiation -- FuzzyR as a Use Case

Authors: Chao Chen, Christian Wagner, Jonathan M. Garibaldi

Abstract: Since their introduction, fuzzy sets and systems have become an important area of research known for its versatility in modelling, knowledge representation and reasoning, and increasingly its potential within the context of explainable AI. While the applications of fuzzy systems are diverse, there has been comparatively little advancement in their design from a machine learning perspective. In other words, while representations such as neural networks have benefited from a boom in learning capability driven by an increase in computational performance in combination with advances in their training mechanisms and available tools, in particular gradient descent, the impact on fuzzy system design has been limited. In this paper, we discuss gradient-descent-based optimisation of fuzzy systems, focussing in particular on automatic differentiation -- crucial to neural network learning -- with a view to free fuzzy system designers from intricate derivative computations, allowing for more focus on the functional and explainability aspects of their design. As a starting point, we present a use case in FuzzyR which demonstrates how current fuzzy inference system implementations can be adjusted to leverage powerful features of automatic differentiation tool sets, discussing its potential for the future of fuzzy system design.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2312.14979 [pdf, other]

Stacked tensorial neural networks for reduced-order modeling of a parametric partial differential equation

Authors: Caleb G. Wagner

Abstract: Tensorial neural networks (TNNs) combine the successes of multilinear algebra with those of deep learning to enable extremely efficient reduced-order models of high-dimensional problems. Here, I describe a deep neural network architecture that fuses multiple TNNs into a larger network, intended to solve a broader class of problems than a single TNN. I evaluate this architecture, referred to as a "stacked tensorial neural network" (STNN), on a parametric PDE with three independent variables and three parameters. The three parameters correspond to one PDE coefficient and two quantities describing the domain geometry. The STNN provides an accurate reduced-order description of the solution manifold over a wide range of parameters. There is also evidence of meaningful generalization to parameter values outside its training data. Finally, while the STNN architecture is relatively simple and problem agnostic, it can be regularized to incorporate problem-specific features like symmetries and physical modeling assumptions.

Submitted 21 December, 2023; originally announced December 2023.

arXiv:2311.04559 [pdf, other]

Individual and gender inequality in computer science: A career study of cohorts from 1970 to 2000

Authors: Haiko Lietz, Mohsen Jadidi, Daniel Kostic, Milena Tsvetkova, Claudia Wagner

Abstract: Inequality prevails in science. Individual inequality means that most perish quickly and only a few are successful, while gender inequality implies that there are differences in achievements for women and men. Using large-scale bibliographic data and following a computational approach, we study the evolution of individual and gender inequality for cohorts from 1970 to 2000 in the whole field of computer science as it grows and becomes a team-based science. We find that individual inequality in productivity (publications) increases over a scholar's career but is historically invariant, while individual inequality in impact (citations), albeit larger, is stable across cohorts and careers. Gender inequality prevails regarding productivity, but there is no evidence for differences in impact. The Matthew Effect is shown to accumulate advantages to early achievements and to become stronger over the decades, indicating the rise of a "publish or perish" imperative. Only some authors manage to reap the benefits that publishing in teams promises. The Matthew Effect then amplifies initial differences and propagates the gender gap. Women continue to fall behind because they continue to be at a higher risk of dropping out for reasons that have nothing to do with early-career achievements or social support. Our findings suggest that mentoring programs for women to improve their social-networking skills can help to reduce gender inequality.

Submitted 8 November, 2023; originally announced November 2023.

Comments: To be published in Quantitative Science Studies. 33 pages, 7 figures, 3 tables

ACM Class: K.4.3; K.7.0; J.4

arXiv:2311.04007 [pdf, other]

The Energy Prediction Smart-Meter Dataset: Analysis of Previous Competitions and Beyond

Authors: Direnc Pekaslan, Jose Maria Alonso-Moral, Kasun Bandara, Christoph Bergmeir, Juan Bernabe-Moreno, Robert Eigenmann, Nils Einecke, Selvi Ergen, Rakshitha Godahewa, Hansika Hewamalage, Jesus Lago, Steffen Limmer, Sven Rebhan, Boris Rabinovich, Dilini Rajapasksha, Heda Song, Christian Wagner, Wenlong Wu, Luis Magdalena, Isaac Triguero

Abstract: This paper presents the real-world smart-meter dataset and offers an analysis of solutions derived from the Energy Prediction Technical Challenges, focusing primarily on two key competitions: the IEEE Computational Intelligence Society (IEEE-CIS) Technical Challenge on Energy Prediction from Smart Meter data in 2020 (named EP) and its follow-up challenge at the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) in 2021 (named as XEP). These competitions focus on accurate energy consumption forecasting and the importance of interpretability in understanding the underlying factors. The challenge aims to predict monthly and yearly estimated consumption for households, addressing the accurate billing problem with limited historical smart meter data. The dataset comprises 3,248 smart meters, with varying data availability ranging from a minimum of one month to a year. This paper delves into the challenges, solutions and analysing issues related to the provided real-world smart meter data, developing accurate predictions at the household level, and introducing evaluation criteria for assessing interpretability. Additionally, this paper discusses aspects beyond the competitions: opportunities for energy disaggregation and pattern detection applications at the household level, significance of communicating energy-driven factors for optimised billing, and emphasising the importance of responsible AI and data privacy considerations. These aspects provide insights into the broader implications and potential advancements in energy consumption prediction. Overall, these competitions provide a dataset for residential energy research and serve as a catalyst for exploring accurate forecasting, enhancing interpretability, and driving progress towards the discussion of various aspects such as energy disaggregation, demand response programs or behavioural interventions.

Submitted 7 November, 2023; originally announced November 2023.

arXiv:2311.01270 [pdf, other]

People Make Better Edits: Measuring the Efficacy of LLM-Generated Counterfactually Augmented Data for Harmful Language Detection

Authors: Indira Sen, Dennis Assenmacher, Mattia Samory, Isabelle Augenstein, Wil van der Aalst, Claudia Wagner

Abstract: NLP models are used in a variety of critical social computing tasks, such as detecting sexist, racist, or otherwise hateful content. Therefore, it is imperative that these models are robust to spurious features. Past work has attempted to tackle such spurious features using training data augmentation, including Counterfactually Augmented Data (CADs). CADs introduce minimal changes to existing training data points and flip their labels; training on them may reduce model dependency on spurious features. However, manually generating CADs can be time-consuming and expensive. Hence in this work, we assess if this task can be automated using generative NLP models. We automatically generate CADs using Polyjuice, ChatGPT, and Flan-T5, and evaluate their usefulness in improving model robustness compared to manually-generated CADs. By testing both model performance on multiple out-of-domain test sets and individual data point efficacy, our results show that while manual CADs are still the most effective, CADs generated by ChatGPT come a close second. One key reason for the lower performance of automated methods is that the changes they introduce are often insufficient to flip the original label.

Submitted 25 February, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

Comments: Preprint of EMNLP'23 paper

arXiv:2307.10198 [pdf]

Has China caught up to the US in AI research? An exploration of mimetic isomorphism as a model for late industrializers

Authors: Chao Min, Yi Zhao, Yi Bu, Ying Ding, Caroline S. Wagner

Abstract: Artificial Intelligence (AI), a cornerstone of 21st-century technology, has seen remarkable growth in China. In this paper, we examine China's AI development process, demonstrating that it is characterized by rapid learning and differentiation, surpassing the export-oriented growth propelled by Foreign Direct Investment seen in earlier Asian industrializers. Our data indicates that China currently leads the USA in the volume of AI-related research papers. However, when we delve into the quality of these papers based on specific metrics, the USA retains a slight edge. Nevertheless, the pace and scale of China's AI development remain noteworthy. We attribute China's accelerated AI progress to several factors, including global trends favoring open access to algorithms and research papers, contributions from China's broad diaspora and returnees, and relatively lax data protection policies. In the vein of our research, we have developed a novel measure for gauging China's imitation of US research. Our analysis shows that by 2018, the time lag between China and the USA in addressing AI research topics had evaporated. This finding suggests that China has effectively bridged a significant knowledge gap and could potentially be setting out on an independent research trajectory. While this study compares China and the USA exclusively, it's important to note that research collaborations between these two nations have resulted in more highly cited work than those produced by either country independently. This underscores the power of international cooperation in driving scientific progress in AI.

Submitted 11 July, 2023; originally announced July 2023.

arXiv:2307.02863 [pdf]

ValiText -- a unified validation framework for computational text-based measures of social constructs

Authors: Lukas Birkenmaier, Claudia Wagner, Clemens Lechner

Abstract: Guidance on how to validate computational text-based measures of social constructs is fragmented. While researchers generally acknowledge the importance of validating text-based measures, they often lack a shared vocabulary and a unified framework to do so. This paper introduces ValiText, a new validation framework designed to assist scholars in validly measuring social constructs in textual data. The framework is built on a conceptual foundation of validity in the social sciences, strengthened by an empirical review of validation practices in the social sciences and consultations with experts. Ultimately, ValiText prescribes researchers to demonstrate three types of validation evidence: substantive evidence (outlining the theoretical underpinning of the measure), structural evidence (examining the properties of the text model and its output) and external evidence (testing for how the measure relates to independent information). The framework is further supplemented by a checklist of validation steps, offering practical guidance in the form of documentation sheets that guide researchers in the validation process.

Submitted 10 June, 2024; v1 submitted 6 July, 2023; originally announced July 2023.

arXiv:2307.01918 [pdf, other]

Computational Reproducibility in Computational Social Science

Authors: David Schoch, Chung-hong Chan, Claudia Wagner, Arnim Bleier

Abstract: Replication crises have shaken the scientific landscape during the last decade. As potential solutions, open science practices were heavily discussed and have been implemented with varying success in different disciplines. We argue that computational-x disciplines, such as computational social science, are also susceptible to the symptoms of the crises, but in terms of reproducibility. We expand the binary definition of reproducibility into a tier system which allows increasing levels of reproducibility based on external verifiability to counteract the practice of open-washing. We provide solutions for barriers in Computational Social Science that hinder researchers from obtaining the highest level of reproducibility, including the use of alternate data sources and considering reproducibility proactively.

Submitted 4 October, 2023; v1 submitted 4 July, 2023; originally announced July 2023.

Comments: v1: Working Paper; v2: fixed missing citation in text; v3: fixed some minor errors and formatting; v4: shortened paper

arXiv:2302.11885 [pdf, other]

The Joint Weighted Average (JWA) Operator

Authors: Stephen B. Broomell, Christian Wagner

Abstract: Information aggregation is a vital tool for human and machine decision making in the presence of uncertainty. Traditionally, approaches to aggregation broadly diverge into two categories, those which attribute a worth or weight to information sources and those which attribute said worth to the evidence arising from said sources. The latter is pervasive in the physical sciences, underpinning linear order statistics and enabling non-linear aggregation. The former is popular in the social sciences, providing interpretable insight on the sources. While prior work has identified the need to apply both approaches simultaneously, it has yet to conceptually integrate both approaches and provide a semantic interpretation of the arising aggregation approach. Here, we conceptually integrate both approaches in a novel joint weighted averaging operator. We leverage compositional geometry to underpin this integration, showing how it provides a systematic basis for the combination of weighted aggregation operators--which has thus far not been considered in the literature. We proceed to show how the resulting operator systematically integrates a priori beliefs about the worth of both sources and evidence, reflecting the semantic integration of both weighting strategies. We conclude and highlight the potential of the operator across disciplines, from machine learning to psychology.

Submitted 2 May, 2024; v1 submitted 23 February, 2023; originally announced February 2023.

arXiv:2302.00546 [pdf, other]

You are a Bot! -- Studying the Development of Bot Accusations on Twitter

Authors: Dennis Assenmacher, Leon Fröhling, Claudia Wagner

Abstract: The characterization and detection of bots with their presumed ability to manipulate society on social media platforms have been subject to many research endeavors over the last decade. In the absence of ground truth data (i.e., accounts that are labeled as bots by experts or self-declare their automated nature), researchers interested in the characterization and detection of bots may want to tap into the wisdom of the crowd. But how many people need to accuse another user as a bot before we can assume that the account is most likely automated? And more importantly, are bot accusations on social media at all a valid signal for the detection of bots? Our research presents the first large-scale study of bot accusations on Twitter and shows how the term bot became an instrument of dehumanization in social media conversations since it is predominantly used to deny the humanness of conversation partners. Consequently, bot accusations on social media should not be naively used as a signal to train or test bot detection models.

Submitted 31 March, 2024; v1 submitted 1 February, 2023; originally announced February 2023.

Comments: 11 pages, 7 figures

arXiv:2208.12400 [pdf, ps, other]

Synthesis of Distributed Agreement-Based Systems with Efficiently-Decidable Verification (Extended Version)

Authors: Nouraldin Jaber, Christopher Wagner, Swen Jacobs, Milind Kulkarni, Roopsha Samanta

Abstract: Distributed agreement-based (DAB) systems use common distributed agreement protocols such as leader election and consensus as building blocks for their target functionality. While automated verification for DAB systems is undecidable in general, recent work identifies a large class of DAB systems for which verification is efficiently-decidable. Unfortunately, the conditions characterizing such a class can be opaque and non-intuitive, and can pose a significant challenge to system designers trying to model their systems in this class. In this paper, we present a synthesis-driven tool, Cinnabar, to help system designers building DAB systems "fit" their intended designs into an efficiently-decidable class. In particular, starting from an initial sketch provided by the designer, Cinnabar generates sketch completions using a counterexample-guided procedure. The core technique relies on a compact encoding of a set of related counterexamples. We demonstrate Cinnabar's effectiveness by successfully and efficiently synthesizing completions for a variety of interesting DAB systems.

Submitted 14 January, 2023; v1 submitted 25 August, 2022; originally announced August 2022.

Comments: TACAS 2023

arXiv:2206.00268 [pdf, other]

The Hipster Paradox in Electronic Dance Music: How Musicians Trade Mainstream Success off against Alternative Status

Authors: Mohsen Jadidi, Haiko Lietz, Mattia Samory, Claudia Wagner

Abstract: The hipster paradox in Electronic Dance Music is the phenomenon that commercial success is collectively considered illegitimate while serious and aspiring professional musicians strive for it. We study this behavioral dilemma using digital traces of performing live and releasing music as they are stored in the Resident Advisor, Juno Download, and Discogs databases from 2001 to 2018. We construct network snapshots following a formal sociological approach based on bipartite networks, and we use network positions to explain success in regression models of artistic careers. We find evidence for a structural trade-off among success and autonomy. Musicians in EDM embed into exclusive performance-based communities for autonomy but, in earlier career stages, seek the mainstream for commercial success. Our approach highlights how Computational Social Science can benefit from a close connection of data analysis and theory.

Submitted 1 June, 2022; originally announced June 2022.

Comments: 16th International Conference on Web and Social Media

arXiv:2205.06322 [pdf, other]

Bounded Verification of Doubly-Unbounded Distributed Agreement-Based Systems

Authors: Christopher Wagner, Nouraldin Jaber, Roopsha Samanta

Abstract: The ubiquity of distributed agreement protocols, such as consensus, has galvanized interest in verification of such protocols as well as applications built on top of them. The complexity and unboundedness of such systems, however, makes their verification onerous in general, and, particularly prohibitive for full automation. An exciting, recent breakthrough reveals that, through careful modeling, it becomes possible for verification of interesting distributed agreement-based (DAB) systems, that are unbounded in the number of processes, to be reduced to model checking of small, finite-state systems. It is an open question if such reductions are also possible for DAB systems that are doubly-unbounded, in particular, DAB systems that additionally have unbounded data domains. We answer this question in the affirmative in this work for models of DAB systems, thereby broadening the class of DAB systems which can be automatically verified. We present a new symmetry-based reduction and develop a tool, Venus, that can efficiently verify sophisticated DAB system models.

Submitted 12 May, 2022; originally announced May 2022.

arXiv:2205.06048 [pdf, other]

doi 10.1145/3501247.3531583

Link recommendations: Their impact on network structure and minorities

Authors: Antonio Ferrara, Lisette Espín-Noboa, Fariba Karimi, Claudia Wagner

Abstract: Network-based people recommendation algorithms are widely employed on the Web to suggest new connections in social media or professional platforms. While such recommendations bring people together, the feedback loop between the algorithms and the changes in network structure may exacerbate social biases. These biases include rich-get-richer effects, filter bubbles, and polarization. However, social networks are diverse complex systems and recommendations may affect them differently, depending on their structural properties. In this work, we explore five people recommendation algorithms by systematically applying them over time to different synthetic networks. In particular, we measure to what extent these recommendations change the structure of bi-populated networks and show how these changes affect the minority group. Our systematic experimentation helps to better understand when link recommendation algorithms are beneficial or harmful to minority groups in social networks. In particular, our findings suggest that, while all algorithms tend to close triangles and increase cohesion, all algorithms except Node2Vec are prone to favor and suggest nodes with high in-degree. Furthermore, we found that, especially when both classes are heterophilic, recommendation algorithms can reduce the visibility of minorities.

Submitted 12 May, 2022; originally announced May 2022.

Comments: 11 pages, accepted at the WebSci'22 conference

arXiv:2205.04238 [pdf, other]

Counterfactually Augmented Data and Unintended Bias: The Case of Sexism and Hate Speech Detection

Authors: Indira Sen, Mattia Samory, Claudia Wagner, Isabelle Augenstein

Abstract: Counterfactually Augmented Data (CAD) aims to improve out-of-domain generalizability, an indicator of model robustness. The improvement is credited with promoting core features of the construct over spurious artifacts that happen to correlate with it. Yet, over-relying on core features may lead to unintended model bias. Especially, construct-driven CAD -- perturbations of core features -- may induce models to ignore the context in which core features are used. Here, we test models for sexism and hate speech detection on challenging data: non-hateful and non-sexist usage of identity and gendered terms. In these hard cases, models trained on CAD, especially construct-driven CAD, show higher false-positive rates than models trained on the original, unperturbed data. Using a diverse set of CAD -- construct-driven and construct-agnostic -- reduces such unintended bias.

Submitted 9 May, 2022; originally announced May 2022.

Comments: Accepted to NAACL'22 as a short paper

arXiv:2205.03028 [pdf, other]

Quantification of Robotic Surgeries with Vision-Based Deep Learning

Authors: Dani Kiyasseh, Runzhuo Ma, Taseen F. Haque, Jessica Nguyen, Christian Wagner, Animashree Anandkumar, Andrew J. Hung

Abstract: Surgery is a high-stakes domain where surgeons must navigate critical anatomical structures and actively avoid potential complications while achieving the main task at hand. Such surgical activity has been shown to affect long-term patient outcomes. To better understand this relationship, whose mechanics remain unknown for the majority of surgical procedures, we hypothesize that the core elements of surgery must first be quantified in a reliable, objective, and scalable manner. We believe this is a prerequisite for the provision of surgical feedback and modulation of surgeon performance in pursuit of improved patient outcomes. To holistically quantify surgeries, we propose a unified deep learning framework, entitled Roboformer, which operates exclusively on videos recorded during surgery to independently achieve multiple tasks: surgical phase recognition (the what of surgery), gesture classification and skills assessment (the how of surgery). We validated our framework on four video-based datasets of two commonly-encountered types of steps (dissection and suturing) within minimally-invasive robotic surgeries. We demonstrated that our framework can generalize well to unseen videos, surgeons, medical centres, and surgical procedures. We also found that our framework, which naturally lends itself to explainable findings, identified relevant information when achieving a particular task. These findings are likely to instill surgeons with more confidence in our framework's behaviour, increasing the likelihood of clinical adoption, and thus paving the way for more targeted surgical feedback.

Submitted 6 May, 2022; originally announced May 2022.

arXiv:2204.10836 [pdf, other]

doi 10.1038/s41467-022-33407-5

Federated Learning Enables Big Data for Rare Cancer Boundary Detection

Authors: Sarthak Pati, Ujjwal Baid, Brandon Edwards, Micah Sheller, Shih-Han Wang, G Anthony Reina, Patrick Foley, Alexey Gruzdev, Deepthi Karkada, Christos Davatzikos, Chiharu Sako, Satyam Ghodasara, Michel Bilello, Suyash Mohan, Philipp Vollmuth, Gianluca Brugnara, Chandrakanth J Preetha, Felix Sahm, Klaus Maier-Hein, Maximilian Zenk, Martin Bendszus, Wolfgang Wick, Evan Calabrese, Jeffrey Rudie, Javier Villanueva-Meyer, et al. (254 additional authors not shown)

Abstract: Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train accurate and generalizable ML models, by only sharing numerical model updates. Here we present findings from the largest FL study to-date, involving data from 71 healthcare institutions across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, utilizing the largest dataset of such patients ever used in the literature (25,256 MRI scans from 6,314 patients). We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent. We anticipate our study to: 1) enable more studies in healthcare informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further quantitative analyses for glioblastoma via performance optimization of our consensus model for eventual public release, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.

Submitted 25 April, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

Comments: federated learning, deep learning, convolutional neural network, segmentation, brain tumor, glioma, glioblastoma, FeTS, BraTS

arXiv:2204.07728 [pdf, other]

doi 10.46298/lmcs-19(4:14)2023

FTMPST: Fault-Tolerant Multiparty Session Types

Authors: Kirstin Peters, Uwe Nestmann, Christoph Wagner

Abstract: Multiparty session types are designed to abstractly capture the structure of communication protocols and verify behavioural properties. One important such property is progress, i.e., the absence of deadlock. Distributed algorithms often resemble multiparty communication protocols. But proving their properties, in particular termination that is closely related to progress, can be elaborate. Since distributed algorithms are often designed to cope with faults, a first step towards using session types to verify distributed algorithms is to integrate fault-tolerance. We extend multiparty session types to cope with system failures such as unreliable communication and process crashes. Moreover, we augment the semantics of processes by failure patterns that can be used to represent system requirements (as, e.g., failure detectors). To illustrate our approach we analyse a variant of the well-known rotating coordinator algorithm by Chandra and Toueg.

Submitted 24 November, 2023; v1 submitted 16 April, 2022; originally announced April 2022.

Journal ref: Logical Methods in Computer Science, Volume 19, Issue 4 (November 27, 2023) lmcs:10424

arXiv:2202.03202 [pdf]

doi 10.1371/journal.pone.0261624

One-Year In: COVID-19 Research at the International Level in CORD-19 Data

Authors: Caroline S. Wagner, Xiaojing Cai, Yi Zhang, Caroline V. Fry

Abstract: The appearance of a novel coronavirus in late 2019 radically changed the community of researchers working on coronaviruses since the 2002 SARS epidemic. In 2020, coronavirus-related publications grew by 20 times over the previous two years, with 130,000 more researchers publishing on related topics. The United States, the United Kingdom and China led dozens of nations working on coronavirus prior to the pandemic, but leadership consolidated among these three nations in 2020, which collectively accounted for 50% of all papers, garnering well more than 60% of citations. China took an early lead on COVID-19 research, but dropped rapidly in production and international participation through the year. Europe showed an opposite pattern, beginning slowly in publications but growing in contributions during the year. The share of internationally collaborative publications dropped from pre-pandemic rates; single-authored publications grew. For all nations, including China, the number of publications about COVID track closely with the outbreak of COVID-19 cases. Lower-income nations participate very little in COVID-19 research in 2020. Topic maps of internationally collaborative work show the rise of patient care and public health clusters, two topics that were largely absent from coronavirus research in the two years prior to 2020. Findings are consistent with global science as a self-organizing system operating on a reputation-based dynamic.

Submitted 1 February, 2022; originally announced February 2022.

Comments: 39 pages, 8 figures, Appendix

arXiv:2202.00781 [pdf]

A discussion of measuring the top-1 percent most-highly cited publications: Quality and impact of Chinese papers

Authors: Caroline S. Wagner, Lin Zhang, Loet Leydesdorff

Abstract: The top 1 percent most highly cited articles are watched closely as the vanguards of the sciences. Using Web of Science data, one can find that China had overtaken the USA in the relative participation in the top 1 percent in 2019, after outcompeting the EU on this indicator in 2015. However, this finding contrasts with repeated reports of Western agencies that the quality of Chinese output in science is lagging other advanced nations, even as it has caught up in numbers of articles. The difference between the results presented here and the previous results depends mainly upon field normalizations, which classify source journals by discipline. Average citation rates of these subsets are commonly used as a baseline so that one can compare among disciplines. However, the expected value of the top 1 percent of a sample of N papers is N/100, ceteris paribus. Using the average citation rates as expected values, errors are introduced by using the mean of highly skewed distributions and a specious precision in the delineations of the subsets. Classifications can be used for the decomposition, but not for the normalization. When the data is thus decomposed, the USA ranks ahead of China in biomedical fields such as virology. Although the number of papers is smaller, China outperforms the US in the field of Business and Finance in the Social Sciences Citation Index when p is less than .05. Using percentile ranks, subsets other than indexing based classifications can be tested for the statistical significance of differences among them.

Submitted 1 February, 2022; originally announced February 2022.

Comments: 26 pages, 8 figures accepted for publication in Scientometrics
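
The baseline stated in the abstract above (an expected N/100 papers in the top 1 percent of a sample of N papers) can be illustrated with a short sketch. This is not the authors' procedure: the function names and the normal approximation to the binomial used for the z-score are assumptions chosen purely for illustration.

```python
import math


def expected_top_count(n_papers, share=0.01):
    """Expected number of papers in the top `share` of a sample of n_papers."""
    return n_papers * share


def top_share_z_score(observed_top, n_papers, share=0.01):
    """z-score of an observed top-share count against the N * share baseline.

    Assumption: each paper independently lands in the top share with
    probability `share`, so the count is approximately normal with
    variance N * share * (1 - share).
    """
    expected = expected_top_count(n_papers, share)
    std_dev = math.sqrt(n_papers * share * (1.0 - share))
    if std_dev == 0:
        raise ValueError("z-score is undefined for an empty or degenerate sample")
    return (observed_top - expected) / std_dev


# Hypothetical numbers: 1,000 papers, 20 of them in the global top 1 percent.
z = top_share_z_score(observed_top=20, n_papers=1000)
```

A positive z-score means the sample contributes more top-1-percent papers than the N/100 baseline predicts; whether the difference is significant depends on the threshold chosen.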

arXiv:2202.00453 [pdf]

Changes in co-publication patterns among China, the European Union (28) and the United States of America, 2016-2021

Authors: Caroline S. Wagner, Xiaojing Cai

Abstract: The COVID-19 global pandemic starting in January 2020 disrupted international collaborations in scholarly exchange, reducing mobility and connections across the globe. An examination of Web of Science-indexed publications from China, the European Union-28 and the United States of America shows a drop in publications numbers coming from the EU-28 and the United States in 2021. Importantly, cooperation between China and the United States drops without a corresponding drop between China and the EU-28. Moreover, the drop in China-USA cooperation can be seen beginning in 2019, before the pandemic, at a time when political tensions around science, technology, and innovation arose, with the United States claiming that China was violating intellectual property norms. The patterns suggest that political tensions, more than the pandemic, influenced the drop in China-USA cooperation.

Submitted 11 February, 2022; v1 submitted 1 February, 2022; originally announced February 2022.

Comments: 11 pages, 4 figures, 4 tables

arXiv:2110.00072 [pdf, other]

doi 10.1038/s41598-022-05434-1

Inequality and Inequity in Network-based Ranking and Recommendation Algorithms

Authors: Lisette Espín-Noboa, Claudia Wagner, Markus Strohmaier, Fariba Karimi

Abstract: Though algorithms promise many benefits including efficiency, objectivity and accuracy, they may also introduce or amplify biases. Here we study two well-known algorithms, namely PageRank and Who-to-Follow (WTF), and show to what extent their ranks produce inequality and inequity when applied to directed social networks. To this end, we propose a directed network model with preferential attachment and homophily (DPAH) and demonstrate the influence of network structure on the rank distributions of these algorithms. Our main findings suggest that (i) inequality is positively correlated with inequity, (ii) inequality is driven by the interplay between preferential attachment, homophily, node activity and edge density, and (iii) inequity is driven by the interplay between homophily and minority size. In particular, these two algorithms reduce, replicate and amplify the representation of minorities in top ranks when majorities are homophilic, neutral and heterophilic, respectively. Moreover, when this representation is reduced, minorities may improve their visibility in the rank by connecting strategically in the network. For instance, by increasing their out-degree or homophily when majorities are also homophilic. These findings shed light on the social and algorithmic mechanisms that hinder equality and equity in network-based ranking and recommendation algorithms.

Submitted 22 July, 2022; v1 submitted 30 September, 2021; originally announced October 2021.

Comments: 23 pages, 7 figures and 3 tables in main manuscript. Includes supplementary material

Journal ref: Sci Rep 12, 2012 (2022)

arXiv:2109.07022 [pdf, other]

How Does Counterfactually Augmented Data Impact Models for Social Computing Constructs?

Authors: Indira Sen, Mattia Samory, Fabian Floeck, Claudia Wagner, Isabelle Augenstein

Abstract: As NLP models are increasingly deployed in socially situated settings such as online abusive content detection, it is crucial to ensure that these models are robust. One way of improving model robustness is to generate counterfactually augmented data (CAD) for training models that can better learn to distinguish between core features and data artifacts. While models trained on this type of data have shown promising out-of-domain generalizability, it is still unclear what the sources of such improvements are. We investigate the benefits of CAD for social NLP models by focusing on three social computing constructs -- sentiment, sexism, and hate speech. Assessing the performance of models trained with and without CAD across different types of datasets, we find that while models trained on CAD show lower in-domain performance, they generalize better out-of-domain. We unpack this apparent discrepancy using machine explanations and find that CAD reduces model reliance on spurious features. Leveraging a novel typology of CAD to analyze their relationship with model performance, we find that CAD which acts on the construct directly or a diverse set of CAD leads to higher performance.

Submitted 14 September, 2021; originally announced September 2021.

Comments: Preprint of a paper accepted to EMNLP 2021

arXiv:2108.01659 [pdf]

Image Augmentation Using a Task Guided Generative Adversarial Network for Age Estimation on Brain MRI

Authors: Ruizhe Li, Matteo Bastiani, Dorothee Auer, Christian Wagner, Xin Chen

Abstract: Brain age estimation based on magnetic resonance imaging (MRI) is an active research area in early diagnosis of some neurodegenerative diseases (e.g. Alzheimer, Parkinson, Huntington, etc.) for elderly people or brain underdevelopment for the young group. Deep learning methods have achieved the state-of-the-art performance in many medical image analysis tasks, including brain age estimation. However, the performance and generalisability of the deep learning model are highly dependent on the quantity and quality of the training data set. Both collecting and annotating brain MRI data are extremely time-consuming. In this paper, to overcome the data scarcity problem, we propose a generative adversarial network (GAN) based image synthesis method. Different from the existing GAN-based methods, we integrate a task-guided branch (a regression model for age estimation) to the end of the generator in GAN. By adding a task-guided loss to the conventional GAN loss, the learned low-dimensional latent space and the synthesised images are more task-specific. It helps to boost the performance of the down-stream task by combining the synthesised images and real images for model training. The proposed method was evaluated on a public brain MRI data set for age estimation. Our proposed method outperformed (statistically significant) a deep convolutional neural network based regression model and the GAN-based image synthesis method without the task-guided branch. More importantly, it enables the identification of age-related brain regions in the image space. The code is available on GitHub (https://github.com/ruizhe-l/tgb-gan).

Submitted 3 August, 2021; originally announced August 2021.

Comments: Accepted for publication at 25th Annual Conference on Medical Image Understanding and Analysis (MIUA 2021)

arXiv:2104.07245 [pdf, other]

doi 10.1109/TAI.2023.3234930

Towards Handling Uncertainty-at-Source in AI -- A Review and Next Steps for Interval Regression

Authors: Shaily Kabir, Christian Wagner, Zack Ellerby

Abstract: Most of statistics and AI draw insights through modelling discord or variance between sources of information (i.e., inter-source uncertainty). Increasingly, however, research is focusing upon uncertainty arising at the level of individual measurements (i.e., within- or intra-source), such as for a given sensor output or human response. Here, adopting intervals rather than numbers as the fundamental data-type provides an efficient, powerful, yet challenging way forward -- offering systematic capture of uncertainty-at-source, increasing informational capacity, and ultimately potential for insight. Following recent progress in the capture of interval-valued data, including from human participants, conducting machine learning directly upon intervals is a crucial next step. This paper focuses on linear regression for interval-valued data as a recent growth area, providing an essential foundation for broader use of intervals in AI. We conduct an in-depth analysis of state-of-the-art methods, elucidating their behaviour, advantages, and pitfalls when applied to datasets with different properties. Specific emphasis is given to the challenge of preserving mathematical coherence -- i.e., ensuring that models maintain fundamental mathematical properties of intervals throughout -- and the paper puts forward extensions to an existing approach to guarantee this. Carefully designed experiments, using both synthetic and real-world data, are conducted -- with findings presented alongside novel visualizations for interval-valued regression outputs, designed to maximise model interpretability. Finally, the paper makes recommendations concerning method suitability for data sets with specific properties and highlights remaining challenges and important next steps for developing AI with the capacity to handle uncertainty-at-source.

Submitted 27 February, 2023; v1 submitted 15 April, 2021; originally announced April 2021.

arXiv:2102.04119 [pdf, other]

doi 10.1145/3450614.3463291

The FairCeptron: A Framework for Measuring Human Perceptions of Algorithmic Fairness

Authors: Georg Ahnert, Ivan Smirnov, Florian Lemmerich, Claudia Wagner, Markus Strohmaier

Abstract: Measures of algorithmic fairness often do not account for human perceptions of fairness that can substantially vary between different sociodemographics and stakeholders. The FairCeptron framework is an approach for studying perceptions of fairness in algorithmic decision making such as in ranking or classification. It supports (i) studying human perceptions of fairness and (ii) comparing these human perceptions with measures of algorithmic fairness. The framework includes fairness scenario generation, fairness perception elicitation and fairness perception analysis. We demonstrate the FairCeptron framework by applying it to a hypothetical university admission context where we collect human perceptions of fairness in the presence of minorities. An implementation of the FairCeptron framework is openly available, and it can easily be adapted to study perceptions of algorithmic fairness in other application contexts. We hope our work paves the way towards elevating the role of studies of human fairness perceptions in the process of designing algorithmic decision making systems.

Submitted 8 February, 2021; originally announced February 2021.

Comments: For source code of the implementation, see https://github.com/cssh-rwth/fairceptron

arXiv:2012.15112 [pdf, other]

Web Routineness and Limits of Predictability: Investigating Demographic and Behavioral Differences Using Web Tracking Data

Authors: Juhi Kulshrestha, Marcos Oliveira, Orkut Karacalik, Denis Bonnay, Claudia Wagner

Abstract: Understanding human activities and movements on the Web is not only important for computational social scientists but can also offer valuable guidance for the design of online systems for recommendations, caching, advertising, and personalization. In this work, we demonstrate that people tend to follow routines on the Web, and these repetitive patterns of web visits increase their browsing behavior's achievable predictability. We present an information-theoretic framework for measuring the uncertainty and theoretical limits of predictability of human mobility on the Web. We systematically assess the impact of different design decisions on the measurement. We apply the framework to a web tracking dataset of German internet users. Our empirical results highlight that individuals' routines on the Web make their browsing behavior predictable to 85% on average, though the value varies across individuals. We observe that these differences in the users' predictabilities can be explained to some extent by their demographic and behavioral attributes.

Submitted 30 December, 2020; originally announced December 2020.

Comments: 12 pages, 8 figures. To be published in the proceedings of the International AAAI Conference on Web and Social Media (ICWSM) 2021

arXiv:2011.08591 [pdf]

Are University Rankings Statistically Significant? A Comparison among Chinese Universities and with the USA

Authors: Loet Leydesdorff, Caroline S. Wagner, Lin Zhang

Abstract: Purpose: We address the question of whether differences are statistically significant in the rankings of universities. We propose methods measuring the statistical significance among different universities and illustrate the results by empirical data. Design/methodology/approach: Based on z-testing and overlapping confidence intervals, and using data about 205 Chinese universities included in the Leiden Rankings 2020, we argue that three main groups of Chinese research universities can be distinguished. Findings: When the sample of 205 Chinese universities is merged with the 197 US universities included in Leiden Rankings 2020, the results similarly indicate three main groups: high, middle, low. Using this data (Leiden Rankings and Web-of-Science), the z-scores of the Chinese universities are significantly below those of the US universities albeit with some overlap. Research limitations: We show empirically that differences in ranking may be due to changes in the data, the models, or the modeling effects on the data. The scientometric groupings are not always stable when we use different methods. R&D policy implications: Differences among universities can be tested for their statistical significance. The statistics relativize the values of decimals in the rankings. One can operate with a scheme of low/middle/high in policy debates and leave the more fine-grained rankings of individual universities to operational management and local settings. Originality/value: In the discussion about the rankings of universities, the question of whether differences are statistically significant, is, in our opinion, insufficiently addressed.

Submitted 17 November, 2020; originally announced November 2020.

arXiv:2011.07693 [pdf, other]

Measuring agreement on linguistic expressions in medical treatment scenarios

Authors: J Navarro, C Wagner, Uwe Aickelin, L Green, R Ashford

Abstract: Quality of life assessment represents a key process of deciding treatment success and viability. As such, patients' perceptions of their functional status and well-being are important inputs for impairment assessment. Given that patient-completed questionnaires are often used to assess patient status and determine future treatment options, it is important to know the level of agreement of the words used by patients and different groups of medical professionals. In this paper, we propose a measure called the Agreement Ratio which provides a ratio of overall agreement when modelling words through Fuzzy Sets (FSs). The measure has been specifically designed for assessing this agreement in fuzzy sets which are generated from data such as patient responses. The measure relies on using the Jaccard Similarity Measure for comparing the different levels of agreement in the FSs generated.

Submitted 15 November, 2020; originally announced November 2020.

Comments: IEEE Symposium on Computational Intelligence, 6-9 Dec 2016, Athens, Greece

arXiv:2010.15534 [pdf, other]

doi 10.1145/3328905.3332506

Poster: Benchmarking Financial Data Feed Systems

Authors: Manuel Coenen, Christoph Wagner, Alexander Echler, Sebastian Frischbier

Abstract: Data-driven solutions for the investment industry require event-based backend systems to process high-volume financial data feeds with low latency, high throughput, and guaranteed delivery modes. At vwd we process an average of 18 billion incoming event notifications from 500+ data sources for 30 million symbols per day and peak rates of 1+ million notifications per second using custom-built platforms that keep audit logs of every event. We currently assess modern open source event-processing platforms such as Kafka, NATS, Redis, Flink or Storm for use in our ticker plant to reduce the maintenance effort for cross-cutting concerns and leverage hybrid deployment models. For comparability and repeatability we benchmark candidates with a standardized workload we derived from our real data feeds. We have enhanced an existing light-weight open source benchmarking tool in its processing, logging, and reporting capabilities to cope with our workloads. The resulting tool wrench can simulate workloads or replay snapshots in volume and dynamics like those we process in our ticker plant. We provide the tool as open source. As part of ongoing work we contribute details on (a) our workload and requirements for benchmarking candidate platforms for financial feed processing; (b) the current state of the tool wrench.

Submitted 29 October, 2020; originally announced October 2020.

Comments: Authors' version of the accepted submission; final version published by ACM as part of the proceedings of DEBS '19: The 13th ACM International Conference on Distributed and Event-based Systems (DEBS '19); 2 pages, 2 figures

arXiv:2010.08473 [pdf, other]

SMAC: Symbiotic Multi-Agent Construction

Authors: Caleb Wagner, Neel Dhanaraj, Trevor Rizzo, Josue Contreras, Hannan Liang, Gregory Lewin, Carlo Pinciroli

Abstract: We present a novel concept of a heterogeneous, distributed platform for autonomous 3D construction. The platform is composed of two types of robots acting in a coordinated and complementary fashion: (i) A collection of communicating smart construction blocks behaving as a form of growable smart matter, and capable of planning and monitoring their own state and the construction progress; and (ii) A team of inchworm-shaped builder robots designed to navigate and modify the 3D structure, following the guidance of the smart blocks. We describe the design of the hardware and introduce algorithms for navigation and construction that support a wide class of 3D structures. We demonstrate the capabilities of our concept and characterize its performance through simulations and real-robot experiments.

Submitted 16 October, 2020; originally announced October 2020.

Comments: 8 pages, submitted to RAL-IROS2021

arXiv:2009.08456 [pdf]

Capturing Richer Information -- On Establishing the Validity of an Interval-Valued Survey Response Mode

Authors: Zack Ellerby, Christian Wagner, Stephen Broomell

Abstract: Obtaining quantitative survey responses that are both accurate and informative is crucial to a wide range of fields. Traditional and ubiquitous response formats such as Likert and Visual Analogue Scales require condensation of responses into discrete point values - but sometimes a range of options may better represent the correct answer. In this paper, we propose an efficient interval-valued response mode, whereby responses are made by marking an ellipse along a continuous scale. We discuss its potential to capture and quantify valuable information that would be lost using conventional approaches, while preserving a high degree of response-efficiency. The information captured by the response interval may represent a possible response range - i.e., a conjunctive set, such as the real numbers between three and six. Alternatively, it may reflect uncertainty in respect to a distinct response - i.e., a disjunctive set, such as a confidence interval. We then report a validation study, utilizing our recently introduced open-source software (DECSYS) to explore how interval-valued survey responses reflect experimental manipulations of several factors hypothesised to influence interval width, across multiple contexts. Results consistently indicate that respondents used interval widths effectively, and subjective participant feedback was also positive. We present this as initial empirical evidence for the efficacy and value of interval-valued response capture.
Interestingly, our results also provide insight into respondents' reasoning about the different aforementioned types of intervals - we replicate a tendency towards overconfidence for those representing epistemic uncertainty (i.e., disjunctive sets), but find intervals representing inherent range (i.e., conjunctive sets) to be well-calibrated.

Submitted 10 March, 2021; v1 submitted 16 September, 2020; originally announced September 2020.

Comments: 59 pages, 12 figures, submitted to Behavior Research Methods

arXiv:2009.01489 [pdf, other]

HACCLE: Metaprogramming for Secure Multi-Party Computation -- Extended Version

Authors: Yuyan Bao, Kirshanthan Sundararajah, Raghav Malik, Qianchuan Ye, Christopher Wagner, Nouraldin Jaber, Fei Wang, Mohammad Hassan Ameri, Donghang Lu, Alexander Seto, Benjamin Delaware, Roopsha Samanta, Aniket Kate, Christina Garman, Jeremiah Blocki, Pierre-David Letourneau, Benoit Meister, Jonathan Springer, Tiark Rompf, Milind Kulkarni

Abstract: Cryptographic techniques have the potential to enable distrusting parties to collaborate in fundamentally new ways, but their practical implementation poses numerous challenges. An important class of such cryptographic techniques is known as Secure Multi-Party Computation (MPC). Developing Secure MPC applications in realistic scenarios requires extensive knowledge spanning multiple areas of cryptography and systems. And while the steps to arrive at a solution for a particular application are often straightforward, it remains difficult to make the implementation efficient, and tedious to apply those same steps to a slightly different application from scratch. Hence, it is an important problem to design platforms for implementing Secure MPC applications with minimum effort and using techniques accessible to non-experts in cryptography. In this paper, we present the HACCLE (High Assurance Compositional Cryptography: Languages and Environments) toolchain, specifically targeted to MPC applications. HACCLE contains an embedded domain-specific language Harpoon, for software developers without cryptographic expertise to write MPC-based programs, and uses Lightweight Modular Staging (LMS) for code generation. Harpoon programs are compiled into acyclic circuits represented in HACCLE's Intermediate Representation (HIR) that serves as an abstraction over different cryptographic protocols such as secret sharing, homomorphic encryption, or garbled circuits. Implementations of different cryptographic protocols serve as different backends of our toolchain.
The extensible design of HIR allows cryptographic experts to plug in new primitives and protocols to realize computation. And the use of standard metaprogramming techniques lowers the development effort significantly.

Submitted 30 September, 2021; v1 submitted 3 September, 2020; originally announced September 2020.

arXiv:2004.12764 [pdf, other]

"Call me sexist, but...": Revisiting Sexism Detection Using Psychological Scales and Adversarial Samples

Authors: Mattia Samory, Indira Sen, Julian Kohne, Fabian Floeck, Claudia Wagner

Abstract: Research has focused on automated methods to effectively detect sexism online. Although overt sexism seems easy to spot, its subtle forms and manifold expressions are not. In this paper, we outline the different dimensions of sexism by grounding them in their implementation in psychological scales. From the scales, we derive a codebook for sexism in social media, which we use to annotate existing and novel datasets, surfacing their limitations in breadth and validity with respect to the construct of sexism. Next, we leverage the annotated datasets to generate adversarial examples, and test the reliability of sexism detection methods. Results indicate that current machine learning models pick up on a very narrow set of linguistic markers of sexism and do not generalize well to out-of-domain examples. Yet, including diverse data and adversarial examples at training time results in models that generalize better and that are more robust to artifacts of data collection. By providing a scale-based codebook and insights regarding the shortcomings of the state-of-the-art, we hope to contribute to the development of better and broader models for sexism detection, including reflections on theory-driven approaches to data collection.

Submitted 2 June, 2021; v1 submitted 27 April, 2020; originally announced April 2020.

Comments: Indira Sen and Julian Kohne contributed equally to this work

Journal ref: Proceedings of the 15th International AAAI Conference on Web and Social Media (ICWSM), 2021

arXiv:2004.07995 [pdf, other]

A generic ensemble based deep convolutional neural network for semi-supervised medical image segmentation

Authors: Ruizhe Li, Dorothee Auer, Christian Wagner, Xin Chen

Abstract: Deep learning based image segmentation has achieved the state-of-the-art performance in many medical applications such as lesion quantification, organ detection, etc. However, most of the methods rely on supervised learning, which requires a large set of high-quality labeled data. Data annotation is generally an extremely time-consuming process. To address this problem, we propose a generic semi-supervised learning framework for image segmentation based on a deep convolutional neural network (DCNN). An encoder-decoder based DCNN is initially trained using a few annotated training samples. This initially trained model is then copied into sub-models and improved iteratively using random subsets of unlabeled data with pseudo labels generated from models trained in the previous iteration. The number of sub-models is gradually decreased to one in the final iteration. We evaluate the proposed method on a public grand-challenge dataset for skin lesion segmentation. Our method is able to significantly improve beyond fully supervised model learning by incorporating unlabeled data.

Submitted 16 April, 2020; originally announced April 2020.

Comments: Accepted for publication at IEEE International Symposium on Biomedical Imaging (ISBI) 2020

Journal ref: 2020 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2020)

arXiv:2004.04896 [pdf, other]

doi 10.1007/978-3-030-53288-8_15

Parameterized Verification of Systems with Global Synchronization and Guards

Authors: Nouraldin Jaber, Swen Jacobs, Christopher Wagner, Milind Kulkarni, Roopsha Samanta

Abstract: Inspired by distributed applications that use consensus or other agreement protocols for global coordination, we define a new computational model for parameterized systems that is based on a general global synchronization primitive and allows for global transition guards. Our model generalizes many existing models in the literature, including broadcast protocols and guarded protocols. We show that reachability properties are decidable for systems without guards, and give sufficient conditions under which they remain decidable in the presence of guards. Furthermore, we investigate cutoffs for reachability properties and provide sufficient conditions for small cutoffs in a number of cases that are inspired by our target applications.

Submitted 5 May, 2021; v1 submitted 9 April, 2020; originally announced April 2020.

Comments: Conference version published at CAV 2020; this version contains a correction of guard-compatibility conditions C2.1 and C2.2

Journal ref: Lecture Notes in Computer Science, vol 12224. Springer (2020)

arXiv:2004.04613 [pdf, ps, other]

QuickSilver: A Modeling and Parameterized Verification Framework for Systems with Distributed Agreement (Extended Version)

Authors: Nouraldin Jaber, Christopher Wagner, Swen Jacobs, Milind Kulkarni, Roopsha Samanta

Abstract: The last decade has sparked several valiant efforts in deductive verification of distributed agreement protocols such as consensus and leader election. Oddly, there have been far fewer verification efforts that go beyond the core protocols and target applications that are built on top of agreement protocols. This is unfortunate, as agreement-based distributed services such as data stores, locks, and ledgers are ubiquitous and potentially permit modular, scalable verification approaches that mimic their modular design. We address this need for verification of distributed agreement-based systems through our novel modeling and verification framework, QuickSilver, that is not only modular, but also fully automated. The key enabling feature of QuickSilver is our encoding of abstractions of verified agreement protocols that facilitates modular, decidable, and scalable automated verification. We demonstrate the potential of QuickSilver by modeling and efficiently verifying a series of tricky case studies, adapted from real-world applications, such as a data store, a lock service, a surveillance system, a pathfinding algorithm for mobile robots, and more.

Submitted 13 September, 2021; v1 submitted 9 April, 2020; originally announced April 2020.

Comments: Accepted at OOPSLA 2021

arXiv:2002.11952 [pdf, other]

doi 10.1126/sciadv.abb6987

Autonomous robotic nanofabrication with reinforcement learning

Authors: Philipp Leinen, Malte Esders, Kristof T. Schütt, Christian Wagner, Klaus-Robert Müller, F. Stefan Tautz

Abstract: The ability to handle single molecules as effectively as macroscopic building-blocks would enable the construction of complex supramolecular structures inaccessible to self-assembly. The fundamental challenges obstructing this goal are the uncontrolled variability and poor observability of atomic-scale conformations. Here, we present a strategy to work around both obstacles, and demonstrate autonomous robotic nanofabrication by manipulating single molecules. Our approach employs reinforcement learning (RL), which finds solution strategies even in the face of large uncertainty and sparse feedback. We demonstrate the potential of our RL approach by removing molecules autonomously with a scanning probe microscope from a supramolecular structure -- an exemplary task of subtractive manufacturing at the nanoscale. Our RL agent reaches an excellent performance, enabling us to automate a task which previously had to be performed by a human. We anticipate that our work opens the way towards autonomous agents for the robotic construction of functional supramolecular structures with speed, precision and perseverance beyond our current capabilities.

Submitted 1 October, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

Comments: 3 figures

Journal ref: Sci. Adv. 6, eabb6987 (2020)

arXiv:2001.09762 [pdf, other]

Bias in Data-driven AI Systems -- An Introductory Survey

Authors: Eirini Ntoutsi, Pavlos Fafalios, Ujwal Gadiraju, Vasileios Iosifidis, Wolfgang Nejdl, Maria-Esther Vidal, Salvatore Ruggieri, Franco Turini, Symeon Papadopoulos, Emmanouil Krasanakis, Ioannis Kompatsiaris, Katharina Kinder-Kurlanda, Claudia Wagner, Fariba Karimi, Miriam Fernandez, Harith Alani, Bettina Berendt, Tina Kruegel, Christian Heinze, Klaus Broelemann, Gjergji Kasneci, Thanassis Tiropanis, Steffen Staab

Abstract: AI-based systems are widely employed nowadays to make decisions that have far-reaching impacts on individuals and society. Their decisions might affect everyone, everywhere and anytime, entailing concerns about potential human rights issues. Therefore, it is necessary to move beyond traditional AI algorithms optimized for predictive performance and embed ethical and legal principles in their design, training and deployment to ensure social good while still benefiting from the huge potential of the AI technology. The goal of this survey is to provide a broad multi-disciplinary overview of the area of bias in AI systems, focusing on technical challenges and solutions as well as to suggest new research directions towards approaches well-grounded in a legal frame. In this survey, we focus on data-driven AI, as a large part of AI is powered nowadays by (big) data and powerful Machine Learning (ML) algorithms. Unless otherwise specified, we use the general term bias to describe problems related to the gathering or processing of data that might result in prejudiced decisions on the basis of demographic features like race, sex, etc.

Submitted 14 January, 2020; originally announced January 2020.

Comments: 19 pages, 1 figure

arXiv:1910.00703 [pdf, other]

Exploring how Component Factors and their Uncertainty Affect Judgements of Risk in Cyber-Security

Authors: Zack Ellerby, Josie McCulloch, Melanie Wilson, Christian Wagner

Abstract: Subjective judgements from experts provide essential information when assessing and modelling threats in respect to cyber-physical systems. For example, the vulnerability of individual system components can be described using multiple factors, such as complexity, technological maturity, and the availability of tools to aid an attack. Such information is useful for determining attack risk, but much of it is challenging to acquire automatically and instead must be collected through expert assessments. However, most experts inherently carry some degree of uncertainty in their assessments. For example, it is impossible to be certain precisely how many tools are available to aid an attack. Traditional methods of capturing subjective judgements through choices such as high, medium or low do not enable experts to quantify their uncertainty. However, it is important to measure the range of uncertainty surrounding responses in order to appropriately inform system vulnerability analysis. We use a recently introduced interval-valued response-format to capture uncertainty in experts' judgements and employ inferential statistical approaches to analyse the data. We identify key attributes that contribute to hop vulnerability in cyber-systems and demonstrate the value of capturing the uncertainty around these attributes. We find that this uncertainty is not only predictive of uncertainty in the overall vulnerability of a given system component, but also significantly informs ratings of overall component vulnerability itself.
We propose that these methods and associated insights can be employed in real world situations, including vulnerability assessments of cyber-physical systems, which are becoming increasingly complex and integrated into society, making them particularly susceptible to uncertainty in assessment.

Submitted 30 September, 2019; originally announced October 2019.

Comments: International Conference on Critical Information Infrastructures Security (CRITIS) 2019

arXiv:1909.04468 [pdf]

doi 10.1093/scipol/scab036

Democracy, Complexity, and Science: Exploring Structural Sources of National Scientific Performance

Authors: Travis A. Whetsell, Koen Jonkers, Ana-Maria Dimand, Jeroen Baas, Caroline S. Wagner

Abstract: Scholars have long hypothesized that democratic forms of government are more compatible with scientific advancement. However, empirical analysis testing the democracy-science compatibility hypothesis remains underdeveloped. This article explores the effect of democratic governance on scientific performance using panel data on 124 countries between 2007 and 2017. We find evidence supporting the democracy-science hypothesis. Further, using both internal and external measures of complexity, we estimate the effects of complexity as a moderating factor in the democracy-science connection. The results show differential main effects of economic complexity, globalization, and international collaboration on scientific performance, as well as significant interaction effects that moderate the effect of democracy on scientific performance. The findings show the significance of democratic governance and complex systems in national scientific performance.

Submitted 12 August, 2021; v1 submitted 10 September, 2019; originally announced September 2019.

Comments: Early work published in the Proceedings of the 17th International Conference of the International Society for Scientometrics and Informetrics (pp. 756-761). Science and Public Policy, 2021

arXiv:1908.06510 [pdf, ps, other]

Taming Concurrency for Verification Using Multiparty Session Types (Technical Report)

Authors: Kirstin Peters, Christoph Wagner, Uwe Nestmann

Abstract: The additional complexity caused by concurrently communicating processes in distributed systems renders the verification of such systems a very hard problem. Multiparty session types were developed to govern communication and concurrency in distributed systems. As such, they provide an efficient verification method w.r.t. properties about communication and concurrency, like communication safety or progress. However, they do not support the analysis of properties that require the consideration of concrete runs or concrete values of variables. We sequentialise well-typed systems of processes guided by the structure of their global type to obtain interaction-free abstractions thereof. Without interaction, concurrency in the system is reduced to sequential and completely independent parallel compositions. In such abstractions, the verification of properties such as data-based termination that are not covered by multiparty session types, but rely on concrete runs or values of variables, becomes significantly more efficient.

Submitted 18 August, 2019; originally announced August 2019.

Comments: This technical report provides proofs and additional materials for a paper (with the same title) at ICTAC'19

arXiv:1907.08228 [pdf, other]

TED-On: A Total Error Framework for Digital Traces of Human Behavior on Online Platforms

Authors: Indira Sen, Fabian Floeck, Katrin Weller, Bernd Weiss, Claudia Wagner

Abstract: People's activities and opinions recorded as digital traces online, especially on social media and other web-based platforms, offer increasingly informative pictures of the public. They promise to allow inferences about populations beyond the users of the platforms on which the traces are recorded, representing real potential for the Social Sciences and a complement to survey-based research. But the use of digital traces brings its own complexities and new error sources to the research enterprise. Recently, researchers have begun to discuss the errors that can occur when digital traces are used to learn about humans and social phenomena. This article synthesizes this discussion and proposes a systematic way to categorize potential errors, inspired by the Total Survey Error (TSE) Framework developed for survey methodology. We introduce a conceptual framework to diagnose, understand, and document errors that may occur in studies based on such digital traces. While there are clear parallels to the well-known error sources in the TSE framework, the new "Total Error Framework for Digital Traces of Human Behavior on Online Platforms" (TED-On) identifies several types of error that are specific to the use of digital traces. By providing a standard vocabulary to describe these errors, the proposed framework is intended to advance communication and research concerning the use of digital traces in scientific social research.

Submitted 3 June, 2021; v1 submitted 18 July, 2019; originally announced July 2019.

Comments: 20 pages, 2 figures, Longer version of paper set to appear in Public Opinion Quarterly. Updating terminology

arXiv:1907.04679 [pdf, other]

Measuring Inter-group Agreement on zSlice Based General Type-2 Fuzzy Sets

Authors: Javier Navarro, Christian Wagner

Abstract: Recently, there has been much research into modelling of uncertainty in human perception through Fuzzy Sets (FSs). Most of this research has focused on allowing respondents to express their (intra) uncertainty using intervals. Here, depending on the technique used and types of uncertainties being modelled, different types of FSs can be obtained (e.g., Type-1, Interval Type-2, General Type-2). Arguably, one of the most flexible techniques is the Interval Agreement Approach (IAA) as it allows modelling the perception of all respondents without making assumptions such as outlier removal or predefined membership function types (e.g. Gaussian). A key aspect in the analysis of interval-valued data and indeed, IAA based agreement models of said data, is to determine the position and strengths of agreement across all the sources/participants. While previously, the Agreement Ratio was proposed to measure the strength of agreement in fuzzy set based models of interval data, said measure has only been applicable to type-1 fuzzy sets. In this paper, we extend the Agreement Ratio to capture the degree of inter-group agreement modelled by a General Type-2 Fuzzy Set when using the IAA. This measure relies on using a similarity measure to quantitatively express the relation between the different levels of agreement in a given FS. Synthetic examples are provided in order to demonstrate both behaviour and calculation of the measure.
Finally, an application to real-world data is provided in order to show the potential of this measure to assess the divergence of opinions for ambiguous concepts when heterogeneous groups of participants are involved.

Submitted 9 July, 2019; originally announced July 2019.

arXiv:1903.07730 [pdf]

doi 10.1093/scipol/scz048

Between Promise and Performance: Science and Technology Policy Implementation through Network Governance

Authors: Travis A. Whetsell, Michael J. Leiblein, Caroline S. Wagner

Abstract: This research analyzes the effects of U.S. science and technology policy on the technological performance of organizations in a global strategic alliance network. During the mid-1980s the U.S. semiconductor industry appeared to be collapsing. Industry leaders and policymakers moved to support and protect U.S. firms by creating a program called Sematech. While many scholars regard Sematech as a success, how the program succeeded remains unclear. This study re-contextualizes Sematech as a network administrative organization which lowered cooperation costs and enhanced resource combination for innovation at the cutting edge. This study combines network analysis and longitudinal regression techniques to test the effects of public policy on organizational network position and technological performance in an unbalanced panel of semiconductor firms between 1986 and 2001. This research suggests governments might achieve policy through inter-organizational innovations aimed at the development and administration of robust governance networks.

Submitted 13 August, 2019; v1 submitted 18 March, 2019; originally announced March 2019.

Comments: Science and Public Policy, forthcoming; 40 pages, 4 figures, 4 tables

arXiv:1902.01642 [pdf]

Agent-Based Simulation Modelling for Reflecting on Consequences of Digital Mental Health

Authors: Daniel Stroud, Christian Wagner, Peer-Olaf Siebers

Abstract: The premise of this working paper is based around agent-based simulation models and how to go about creating them from given incomplete information. Agent-based simulations are stochastic simulations that revolve around groups of agents that each have their own characteristics and can make decisions. Such simulations can be used to emulate real-life situations and to create hypothetical situations without the need for prior real-world testing. Here we describe the development of an agent-based simulation model for studying future digital mental health scenarios. An incomplete conceptual model has been used as the basis for this development. To define differences in responses to stimuli we employed fuzzy decision-making logic. The model has been implemented but has not yet been used for structured experimentation. This is planned as our next step.

Submitted 5 February, 2019; originally announced February 2019.

Comments: 16 pages, 18 figures, 3 tables, working paper

arXiv:1810.07812 [pdf]

doi 10.3389/frma.2018.00010

Openness and Impact of Leading Scientific Countries

Authors: Caroline S. Wagner, Travis Whetsell, Jeroen Baas, Koen Jonkers

Abstract: The rapid rise of international collaboration over the past three decades, demonstrated in coauthorship of scientific articles, raises the question of whether countries benefit from cooperative science and how this might be measured. We develop and compare measures to ask this question. For all source publications in 2013, we obtained from Elsevier national level full and fractional paper counts as well as accompanying field-weighted citation counts. Then we collected information from Elsevier on the percent of all internationally coauthored papers for each country, as well as Organization for Economic Cooperation and Development (OECD) measures of the international mobility of the scientific workforce in 2013, and conducted a principal component analysis that produced an openness index. We added data from the OECD on government budget allocation on research and development for 2011 to tie in the public spending that contributed to the 2013 output. We found that openness among advanced science systems is strongly correlated with impact: the more internationally engaged a nation is in terms of coauthorships and researcher mobility, the higher the impact of scientific work. The results have important implications for policy making around investment, as well as the flows of students, researchers, and technical workers.

Submitted 17 October, 2018; originally announced October 2018.

Journal ref: Wagner, C. S., Whetsell, T., Baas, J., & Jonkers, K. (2018). Openness and impact of leading scientific countries. Frontiers in Research Metrics and Analytics, 3, 10

Showing 1–50 of 109 results for author: Wagner, C