arXiv:2406.13523 [pdf, other]

Measurement of the Crystallization and Phase Transition of Niobium Dioxide Thin-Films for Neuromorphic Computing Applications Using a Tube Furnace Optical Transmission System

Authors: Zachary R. Robinson, Karsten Beckmann, James Michels, Vincent Daviero, Elizabeth A. Street, Fiona Lorenzen, Matthew C. Sullivan, Nathaniel Cady, Alexander Kozen, Marc Currie

Abstract: Significant research has focused on low-power stochastic devices built from memristive materials. These devices foster neuromorphic approaches to computational efficiency enhancement in merged biomimetic and CMOS architectures due to their ability to phase transition from a dielectric to a metal at an increased temperature. Niobium dioxide has a volatile memristive phase change that occurs at $\sim$800$^\circ$C, which makes it an ideal candidate for future neuromorphic electronics. A straightforward optical system has been developed on a horizontal tube furnace for in situ spectral measurements as an as-grown Nb$_2$O$_5$ film is annealed and ultimately crystallizes as NbO$_2$. The system measures the changing spectral transmissivity of Nb$_2$O$_5$ as it undergoes both reduction and crystallization processes. We were also able to measure the transition from metallic-to-non-metallic NbO$_2$ during the cooldown phase, which is shown to occur about 100$^\circ$C lower on a sapphire substrate than on fused silica. After annealing, the material properties of the Nb$_2$O$_5$ and NbO$_2$ were assessed via X-ray photoelectron spectroscopy, X-ray diffraction, and 4-point resistivity, confirming that we have made crystalline NbO$_2$.

Submitted 11 July, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

arXiv:2405.08526 [pdf, other]

Why Larp?! A Synthesis Paper on Live Action Roleplay in Relation to HCI Research and Practice

Authors: Karin Johansson, Raquel Breejon Robinson, Jon Back, Sarah Lynne Bowman, James Fey, Elena Márquez Segura, Annika Waern, Katherine Isbister

Abstract: Live action roleplay (larp) has a wide range of applications, and can be relevant in relation to HCI. While there has been research about larp in relation to topics such as embodied interaction, playfulness and futuring published in HCI venues since the early 2000s, there is not yet a compilation of this knowledge. In this paper, we synthesise knowledge about larp and larp-adjacent work within the domain of HCI. We present a practitioner overview from an expert group of larp researchers, the results of a literature review, and highlight particular larp research exemplars which all work together to showcase the diverse set of ways that larp can be utilised in relation to HCI topics and research. This paper identifies the need for further discussions toward establishing best practices for utilising larp in relation to HCI research, as well as advocating for increased engagement with larps outside academia.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.05376 [pdf, other]

Kreyòl-MT: Building MT for Latin American, Caribbean and Colonial African Creole Languages

Authors: Nathaniel R. Robinson, Raj Dabre, Ammon Shurtz, Rasul Dent, Onenamiyi Onesi, Claire Bizon Monroc, Loïc Grobol, Hasan Muhammad, Ashi Garg, Naome A. Etori, Vijay Murari Tiyyala, Olanrewaju Samuel, Matthew Dean Stutzman, Bismarck Bamfo Odoom, Sanjeev Khudanpur, Stephen D. Richardson, Kenton Murray

Abstract: A majority of language technologies are tailored for a small number of high-resource languages, while relatively many low-resource languages are neglected. One such group, Creole languages, have long been marginalized in academic study, though their speakers could benefit from machine translation (MT). These languages are predominantly used in much of Latin America, Africa and the Caribbean. We present the largest cumulative dataset to date for Creole language MT, including 14.5M unique Creole sentences with parallel translations -- 11.6M of which we release publicly, and the largest bitexts gathered to date for 41 languages -- the first ever for 21. In addition, we provide MT models supporting all 41 Creole languages in 172 translation directions. Given our diverse dataset, we produce a model for Creole language MT exposed to more genre diversity than ever before, which outperforms a genre-specific Creole MT model on its own benchmark for 26 of 34 translation directions.

Submitted 13 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

Comments: NAACL 2024

arXiv:2404.01848 [pdf, other]

doi 10.1145/3613905.3644063

"That's Not Good Science!": An Argument for the Thoughtful Use of Formative Situations in Research through Design

Authors: Raquel B Robinson, Anya Osborne, Chen Ji, James Collin Fey, Ella Dagan, Katherine Isbister

Abstract: Most currently accepted approaches to evaluating Research through Design (RtD) presume that design prototypes are finalized and ready for robust testing in laboratory or in-the-wild settings. However, it is also valuable to assess designs at intermediate phases with mid-fidelity prototypes, not just to inform an ongoing design process, but also to glean knowledge of broader use to the research community. We propose 'formative situations' as a frame for examining mid-fidelity prototypes-in-process in this way. We articulate a set of criteria to help the community better assess the rigor of formative situations, in the service of opening conversation about establishing formative situations as a valuable contribution type within the RtD community.

Submitted 2 April, 2024; originally announced April 2024.

Comments: 8 pages, 1 figure

arXiv:2403.13169 [pdf, other]

Wav2Gloss: Generating Interlinear Glossed Text from Speech

Authors: Taiqi He, Kwanghee Choi, Lindia Tjuatja, Nathaniel R. Robinson, Jiatong Shi, Shinji Watanabe, Graham Neubig, David R. Mortensen, Lori Levin

Abstract: Thousands of the world's languages are in danger of extinction--a tremendous threat to cultural identities and human language diversity. Interlinear Glossed Text (IGT) is a form of linguistic annotation that can support documentation and resource creation for these languages' communities. IGT typically consists of (1) transcriptions, (2) morphological segmentation, (3) glosses, and (4) free translations to a majority language. We propose Wav2Gloss: a task in which these four annotation components are extracted automatically from speech, and introduce the first dataset to this end, Fieldwork: a corpus of speech with all these annotations, derived from the work of field linguists, covering 37 languages, with standard formatting, and train/dev/test splits. We provide various baselines to lay the groundwork for future research on IGT generation from speech, such as end-to-end versus cascaded, monolingual versus multilingual, and single-task versus multi-task approaches.

Submitted 5 June, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

Comments: ACL 2024 camera ready version

arXiv:2402.01582 [pdf]

Automating Sound Change Prediction for Phylogenetic Inference: A Tukanoan Case Study

Authors: Kalvin Chang, Nathaniel R. Robinson, Anna Cai, Ting Chen, Annie Zhang, David R. Mortensen

Abstract: We describe a set of new methods to partially automate linguistic phylogenetic inference given (1) cognate sets with their respective protoforms and sound laws, (2) a mapping from phones to their articulatory features and (3) a typological database of sound changes. We train a neural network on these sound change data to weight articulatory distances between phones and predict intermediate sound change steps between historical protoforms and their modern descendants, replacing a linguistic expert in part of a parsimony-based phylogenetic inference algorithm. In our best experiments on Tukanoan languages, this method produces trees with a Generalized Quartet Distance of 0.12 from a tree that used expert annotations, a significant improvement over other semi-automated baselines. We discuss potential benefits and drawbacks to our neural approach and parsimony-based tree prediction. We also experiment with a minimal generalization learner for automatic sound law induction, finding it comparably effective to sound laws from expert annotation. Our code is publicly available at https://github.com/cmu-llab/aiscp.

Submitted 2 February, 2024; originally announced February 2024.

Comments: Accepted to LChange 2023

arXiv:2401.05818 [pdf, ps, other]

doi 10.1145/3613905.3644051

How to write a CHI paper (asking for a friend)

Authors: Raquel Robinson, Alberto Alvarez, Elisa Mekler

Abstract: Writing and genre conventions are extant to any scientific community, and CHI is no different. In this paper, we present the early phases of an AI tool we created called KITSUNE, which supports authors in placing their work into the format of a CHI paper, taking into account many conventions that are ever-present in CHI papers. We describe the development of the tool with the intent to promote discussion around how writing conventions are upheld and unquestioned by the CHI community, and how this translates to the work produced. In addition, we bring up questions surrounding how the introduction of LLMs into academic writing fundamentally changes how conventions will be upheld now and in the future.

Submitted 11 January, 2024; originally announced January 2024.

Comments: 8 pages

Journal ref: Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems

arXiv:2309.07423 [pdf, other]

ChatGPT MT: Competitive for High- (but not Low-) Resource Languages

Authors: Nathaniel R. Robinson, Perez Ogayo, David R. Mortensen, Graham Neubig

Abstract: Large language models (LLMs) implicitly learn to perform a range of language tasks, including machine translation (MT). Previous studies explore aspects of LLMs' MT capabilities. However, there exist a wide variety of languages for which recent LLM MT performance has never before been evaluated. Without published experimental evidence on the matter, it is difficult for speakers of the world's diverse languages to know how and whether they can use LLMs for their languages. We present the first experimental evidence for an expansive set of 204 languages, along with MT cost analysis, using the FLORES-200 benchmark. Trends reveal that GPT models approach or exceed traditional MT model performance for some high-resource languages (HRLs) but consistently lag for low-resource languages (LRLs), under-performing traditional MT for 84.1% of languages we covered. Our analysis reveals that a language's resource level is the most important feature in determining ChatGPT's relative ability to translate it, and suggests that ChatGPT is especially disadvantaged for LRLs and African languages.

Submitted 14 September, 2023; originally announced September 2023.

Comments: 27 pages, 9 figures, 14 tables

arXiv:2209.06295 [pdf, other]

Data-adaptive Transfer Learning for Translation: A Case Study in Haitian and Jamaican

Authors: Nathaniel R. Robinson, Cameron J. Hogan, Nancy Fulda, David R. Mortensen

Abstract: Multilingual transfer techniques often improve low-resource machine translation (MT). Many of these techniques are applied without considering data characteristics. We show in the context of Haitian-to-English translation that transfer effectiveness is correlated with amount of training data and relationships between knowledge-sharing languages. Our experiments suggest that for some languages beyond a threshold of authentic data, back-translation augmentation methods are counterproductive, while cross-lingual transfer from a sufficiently related language is preferred. We complement this finding by contributing a rule-based French-Haitian orthographic and syntactic engine and a novel method for phonological embedding. When used with multilingual techniques, orthographic transformation makes statistically significant improvements over conventional methods. And in very low-resource Jamaican MT, code-switching with a transfer language for orthographic resemblance yields a 6.63 BLEU point advantage.

Submitted 13 September, 2022; originally announced September 2022.

arXiv:2111.08088 [pdf, other]

Assessing gender bias in medical and scientific masked language models with StereoSet

Authors: Robert Robinson

Abstract: NLP systems use language models such as Masked Language Models (MLMs) that are pre-trained on large quantities of text such as Wikipedia to create representations of language. BERT is a powerful and flexible general-purpose MLM system developed using unlabeled text. Pre-training on large quantities of text also has the potential to transparently embed the cultural and social biases found in the source text into the MLM system. This study aims to compare biases in general-purpose and medical MLMs with the StereoSet bias assessment tool. The general-purpose MLMs showed significant bias overall, with BERT scoring 57 and RoBERTa scoring 61. The category of gender bias is where the best performances were found, with 63 for BERT and 73 for RoBERTa. Performances for profession, race, and religion were similar to the overall bias scores for the general-purpose MLMs. Medical MLMs showed more bias in all categories than the general-purpose MLMs except for SciBERT, which showed a race bias score of 55, which was superior to the race bias score of 53 for BERT. More gender (Medical 54-58 vs. General 63-73) and religious (46-54 vs. 58) biases were found with medical MLMs. This evaluation of four medical MLMs for stereotyped assessments about race, gender, religion, and profession showed inferior performance to general-purpose MLMs. These medically focused MLMs differ considerably in training source data, which is likely the root cause of the differences in ratings for stereotyped biases from the StereoSet tool.

Submitted 15 November, 2021; originally announced November 2021.

Comments: 5 pages, 1 table

arXiv:2009.04110 [pdf, other]

doi 10.1371/journal.pone.0243243

Real-time Plant Health Assessment Via Implementing Cloud-based Scalable Transfer Learning On AWS DeepLens

Authors: Asim Khan, Umair Nawaz, Anwaar Ulhaq, Randall W. Robinson

Abstract: In the agriculture sector, control of plant leaf diseases is crucial as it influences the quality and production of plant species with an impact on the economy of any country. Therefore, automated identification and classification of plant leaf disease at an early stage is essential to reduce economic loss and to conserve the specific species. Previously, to detect and classify plant leaf disease, various Machine Learning models have been proposed; however, they lack usability due to hardware incompatibility, limited scalability and inefficiency in practical usage. Our proposed DeepLens Classification and Detection Model (DCDM) approach deals with such limitations by introducing automated detection and classification of the leaf diseases in fruits (apple, grapes, peach and strawberry) and vegetables (potato and tomato) via scalable transfer learning on AWS SageMaker and importing it on AWS DeepLens for real-time practical usability. Cloud integration provides scalability and ubiquitous access to our approach. Our experiments on an extensive image data set of healthy and unhealthy leaves of fruits and vegetables showed an accuracy of 98.78% with a real-time diagnosis of plant leaf diseases. We used forty thousand images for the training of the deep learning model and then evaluated it on ten thousand images. The process of testing an image for disease diagnosis and classification using AWS DeepLens on average took 0.349s, providing disease information to the user in less than a second.

Submitted 10 September, 2020; v1 submitted 9 September, 2020; originally announced September 2020.

Comments: 10 Pages, 12 Figures and 6 Tables

arXiv:2006.16741 [pdf, other]

doi 10.1007/978-3-030-59728-3_69

Image-level Harmonization of Multi-Site Data using Image-and-Spatial Transformer Networks

Authors: R. Robinson, Q. Dou, D. C. Castro, K. Kamnitsas, M. de Groot, R. M. Summers, D. Rueckert, B. Glocker

Abstract: We investigate the use of image-and-spatial transformer networks (ISTNs) to tackle domain shift in multi-site medical imaging data. Commonly, domain adaptation (DA) is performed with little regard for explainability of the inter-domain transformation and is often conducted at the feature-level in the latent space. We employ ISTNs for DA at the image-level which constrains transformations to explainable appearance and shape changes. As proof-of-concept we demonstrate that ISTNs can be trained adversarially on a classification problem with simulated 2D data. For real-data validation, we construct two 3D brain MRI datasets from the Cam-CAN and UK Biobank studies to investigate domain shift due to acquisition and population differences. We show that age regression and sex classification models trained on ISTN output improve generalization when training on data from one and testing on the other site.

Submitted 30 June, 2020; originally announced June 2020.

Comments: Accepted at MICCAI 2020

Journal ref: Medical Image Computing and Computer-Assisted Intervention (2020), pp. 710-719, LNCS 12267

arXiv:1910.04597 [pdf, other]

Machine Learning with Multi-Site Imaging Data: An Empirical Study on the Impact of Scanner Effects

Authors: Ben Glocker, Robert Robinson, Daniel C. Castro, Qi Dou, Ender Konukoglu

Abstract: This is an empirical study to investigate the impact of scanner effects when using machine learning on multi-site neuroimaging data. We utilize structural T1-weighted brain MRI obtained from two different studies, Cam-CAN and UK Biobank. For the purpose of our investigation, we construct a dataset consisting of brain scans from 592 age- and sex-matched individuals, 296 subjects from each original study. Our results demonstrate that even after careful pre-processing with state-of-the-art neuroimaging pipelines a classifier can easily distinguish between the origin of the data with very high accuracy. Our analysis on the example application of sex classification suggests that current approaches to harmonize data are unable to remove scanner-specific bias leading to overly optimistic performance estimates and poor generalization. We conclude that multi-site data harmonization remains an open challenge and particular care needs to be taken when using such data with advanced machine learning methods for predictive modelling.

Submitted 10 October, 2019; originally announced October 2019.

Comments: Presented at the Medical Imaging meets NeurIPS Workshop 2019

arXiv:1901.09351 [pdf, other]

Automated Quality Control in Image Segmentation: Application to the UK Biobank Cardiac MR Imaging Study

Authors: Robert Robinson, Vanya V. Valindria, Wenjia Bai, Ozan Oktay, Bernhard Kainz, Hideaki Suzuki, Mihir M. Sanghvi, Nay Aung, José Miguel Paiva, Filip Zemrak, Kenneth Fung, Elena Lukaschuk, Aaron M. Lee, Valentina Carapella, Young Jin Kim, Stefan K. Piechnik, Stefan Neubauer, Steffen E. Petersen, Chris Page, Paul M. Matthews, Daniel Rueckert, Ben Glocker

Abstract: Background: The trend towards large-scale studies including population imaging poses new challenges in terms of quality control (QC). This is a particular issue when automatic processing tools, e.g. image segmentation methods, are employed to derive quantitative measures or biomarkers for later analyses. Manual inspection and visual QC of each segmentation isn't feasible at large scale. However, it's important to be able to automatically detect when a segmentation method fails so as to avoid inclusion of wrong measurements into subsequent analyses which could lead to incorrect conclusions. Methods: To overcome this challenge, we explore an approach for predicting segmentation quality based on Reverse Classification Accuracy (RCA), which enables us to discriminate between successful and failed segmentations on a per-case basis. We validate this approach on a new, large-scale manually-annotated set of 4,800 cardiac magnetic resonance scans. We then apply our method to a large cohort of 7,250 cardiac MRI on which we have performed manual QC. Results: We report results used for predicting segmentation quality metrics including Dice Similarity Coefficient (DSC) and surface-distance measures. As initial validation, we present data for 400 scans demonstrating 99% accuracy for classifying low and high quality segmentations using predicted DSC scores. As further validation we show high correlation between real and predicted scores and 95% classification accuracy on 4,800 scans for which manual segmentations were available. We mimic real-world application of the method on 7,250 cardiac MRI where we show good agreement between predicted quality metrics and manual visual QC scores. Conclusions: We show that RCA has the potential for accurate and fully automatic segmentation QC on a per-case basis in the context of large-scale population imaging as in the UK Biobank Imaging Study.

Submitted 27 January, 2019; originally announced January 2019.

Comments: 14 pages, 7 figures, Journal of Cardiovascular Magnetic Resonance

arXiv:1806.06244 [pdf, other]

Real-time Prediction of Segmentation Quality

Authors: Robert Robinson, Ozan Oktay, Wenjia Bai, Vanya Valindria, Mihir Sanghvi, Nay Aung, José Paiva, Filip Zemrak, Kenneth Fung, Elena Lukaschuk, Aaron Lee, Valentina Carapella, Young Jin Kim, Bernhard Kainz, Stefan Piechnik, Stefan Neubauer, Steffen Petersen, Chris Page, Daniel Rueckert, Ben Glocker

Abstract: Recent advances in deep learning based image segmentation methods have enabled real-time performance with human-level accuracy. However, occasionally even the best method fails due to low image quality, artifacts or unexpected behaviour of black box algorithms. Being able to predict segmentation quality in the absence of ground truth is of paramount importance in clinical practice, but also in large-scale studies to avoid the inclusion of invalid data in subsequent analysis. In this work, we propose two approaches of real-time automated quality control for cardiovascular MR segmentations using deep learning. First, we train a neural network on 12,880 samples to predict Dice Similarity Coefficients (DSC) on a per-case basis. We report a mean average error (MAE) of 0.03 on 1,610 test samples and 97% binary classification accuracy for separating low and high quality segmentations. Secondly, in the scenario where no manually annotated data is available, we train a network to predict DSC scores from estimated quality obtained via a reverse testing strategy. We report an MAE=0.14 and 91% binary classification accuracy for this case. Predictions are obtained in real-time which, when combined with real-time segmentation methods, enables instant feedback on whether an acquired scan is analysable while the patient is still in the scanner. This further enables new applications of optimising image acquisition towards best possible analysis results.

Submitted 16 June, 2018; originally announced June 2018.

Comments: Accepted at MICCAI 2018

arXiv:1805.01392 [pdf]

Prevalence of web trackers on hospital websites in Illinois

Authors: Robert Robinson

Abstract: Web tracking technologies are pervasive and operated by a few large technology companies. This technology, and the use of the collected data, has been implicated in influencing elections, fake news, discrimination, and even health decisions. Little is known about how this technology is deployed on hospital or other health-related websites. The websites of the 210 public hospitals in the state of Illinois, USA were evaluated with a web tracker identification tool. Web trackers were identified on 94% of hospital websites, with an average of 3.5 trackers on the websites of general hospitals. The websites of smaller critical access hospitals used an average of 2 web trackers. The most common web tracker identified was Google Analytics, found on 74% of Illinois hospital websites. Of the web trackers discovered, 88% were operated by Google and 26% by Facebook. In light of revelations about how web browsing profiles have been used and misused, search bubbles, and the potential for algorithmic discrimination, hospital leadership and policy makers must carefully consider if it is appropriate to use third-party tracking technology on hospital websites.

Submitted 3 May, 2018; originally announced May 2018.

Comments: 7 pages, 1 table, 2 figures

arXiv:1802.04159 [pdf]

Urban vs. rural divide in HTTPS implementation for hospital websites in Illinois

Authors: Robert Robinson

Abstract: The Hypertext Transfer Protocol Secure (HTTPS) communications protocol is used to secure traffic between a web browser and server. This technology can significantly reduce the risk of interception and manipulation of web information for nefarious purposes such as identity theft. Deployment of HTTPS has reached about 50% of all websites. Little is known about HTTPS implementation for hospital websites. To investigate the prevalence of HTTPS implementation, we analyzed the websites of the 210 public hospitals in the state of Illinois, USA. HTTPS was implemented to industry standards for 54% of all hospital websites in Illinois. Geographical analysis showed an urban vs. rural digital divide with 60% of urban hospitals and 40% of rural hospitals implementing HTTPS.

Submitted 9 February, 2018; originally announced February 2018.

Comments: 5 pages, 1 table. arXiv admin note: text overlap with arXiv:1712.05376

arXiv:1712.05376 [pdf]

Prevalence of DNSSEC for hospital websites in Illinois

Authors: Robert Robinson

Abstract: The domain name system translates human-friendly web addresses to a computer-readable internet protocol address. This basic infrastructure is insecure and can be manipulated. Deployment of technology to secure the DNS system has been slow, reaching about 20% of all websites based in the USA. Little is known about the efforts hospitals and health systems make to secure the domain name system for their websites. To investigate the prevalence of implementing Domain Name System Security Extensions (DNSSEC), we analyzed the websites of the 210 public hospitals in the state of Illinois, USA. Only one Illinois hospital website was found to have implemented DNSSEC by December, 2017.

Submitted 14 December, 2017; originally announced December 2017.

Comments: 4 pages

arXiv:1705.01224 [pdf, ps, other]

Topological containment of the 5-clique minus an edge in 4-connected graphs

Authors: Rebecca Robinson, Graham Farr

Abstract: The topological containment problem is known to be polynomial-time solvable for any fixed pattern graph $H$, but good characterisations have been found for only a handful of non-trivial pattern graphs. The complete graph on five vertices, $K_5$, is one pattern graph for which a characterisation has not been found. The discovery of such a characterisation would be of particular interest, due to the Hajós Conjecture. One step towards this may be to find a good characterisation of graphs that do not topologically contain the simpler pattern graph $K_5^-$, obtained by removing a single edge from $K_5$. This paper makes progress towards achieving this, by showing that every 4-connected graph must contain a $K_5^-$-subdivision.

Submitted 3 May, 2017; v1 submitted 2 May, 2017; originally announced May 2017.

Comments: 26 pages, 14 figures

ACM Class: G.2.2

arXiv:1604.03518 [pdf, other]

DTM: Deformable Template Matching

Authors: Hyungtae Lee, Heesung Kwon, Ryan M. Robinson, William D. Nothwang

Abstract: A novel template matching algorithm that can incorporate the concept of deformable parts is presented in this paper. Unlike the deformable part model (DPM) employed in object recognition, the proposed template-matching approach, called Deformable Template Matching (DTM), does not require a training step. Instead, deformation is achieved by a set of predefined basic rules (e.g. the left sub-patch cannot pass across the right patch). Experimental evaluation of this new method using the PASCAL VOC 07 dataset demonstrated substantial performance improvement over conventional template matching algorithms. Additionally, to confirm the applicability of DTM, the concept is applied to the generation of a rotation-invariant SIFT descriptor. Experimental evaluation employing deformable matching of SIFT features shows an increased number of matching features compared to conventional SIFT matching.

Submitted 12 April, 2016; originally announced April 2016.

arXiv:1511.03183 [pdf, other]

Dynamic Belief Fusion for Object Detection

Authors: Hyungtae Lee, Heesung Kwon, Ryan M. Robinson, William D. Nothwang, Amar M. Marathe

Abstract: A novel approach for the fusion of heterogeneous object detection methods is proposed. In order to effectively integrate the outputs of multiple detectors, the level of ambiguity in each individual detection score is estimated using the precision/recall relationship of the corresponding detector. The main contribution of the proposed work is a novel fusion method, called Dynamic Belief Fusion (DBF), which dynamically assigns probabilities to hypotheses (target, non-target, intermediate state (target or non-target)) based on confidence levels in the detection results conditioned on the prior performance of individual detectors. In DBF, a joint basic probability assignment, optimally fusing information from all detectors, is determined by Dempster's combination rule, and is easily reduced to a single fused detection score. Experiments on ARL and PASCAL VOC 07 datasets demonstrate that the detection accuracy of DBF is considerably greater than conventional fusion approaches as well as individual detectors used for the fusion.

Submitted 10 November, 2015; originally announced November 2015.

Comments: 8 pages, 6 figures, 28 references. arXiv admin note: text overlap with arXiv:1502.07643
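
The combination step named in the abstract above can be illustrated with a minimal sketch of Dempster's combination rule over the three hypotheses it lists (target, non-target, and the intermediate "target or non-target" state). This is not the authors' implementation: the function name `dempster_combine`, the dictionary representation, and the mass values are assumptions chosen for illustration, and the DBF-specific steps (deriving masses from precision/recall curves, reducing the result to a single score) are not shown because the abstract does not specify them.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two basic probability assignments with Dempster's rule.

    Each assignment maps a frozenset of hypotheses to a mass; the masses
    in one assignment are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence accumulates on the intersection.
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Disjoint hypotheses contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalize by the non-conflicting mass.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


TARGET = frozenset({"target"})
NON_TARGET = frozenset({"non-target"})
UNCERTAIN = TARGET | NON_TARGET  # intermediate state: target or non-target

# Hypothetical masses for two detectors (illustrative values only).
detector_a = {TARGET: 0.6, NON_TARGET: 0.1, UNCERTAIN: 0.3}
detector_b = {TARGET: 0.5, NON_TARGET: 0.2, UNCERTAIN: 0.3}

fused = dempster_combine(detector_a, detector_b)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

With these made-up inputs the conflicting mass is 0.17, so the remaining masses are divided by 0.83 and the fused assignment again sums to 1.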

arXiv:1502.07643

Dynamic Belief Fusion for Object Detection

Authors: Ryan Robinson

Abstract: A novel approach for the fusion of detection scores from disparate object detection methods is proposed. In order to effectively integrate the outputs of multiple detectors, the level of ambiguity in each individual detection score (called "uncertainty") is estimated using the precision/recall relationship of the corresponding detector. The proposed fusion method, called Dynamic Belief Fusion (DBF), dynamically assigns basic probabilities to propositions (target, non-target, uncertain) based on confidence levels in the detection results of individual approaches. A joint basic probability assignment, containing information from all detectors, is determined using Dempster's combination rule, and is easily reduced to a single fused detection score. Experiments on ARL and PASCAL VOC 07 datasets demonstrate that the detection accuracy of DBF is considerably greater than conventional fusion approaches as well as state-of-the-art individual detectors.

Submitted 11 November, 2015; v1 submitted 26 February, 2015; originally announced February 2015.

Comments: The paper has been withdrawn and an updated paper has been uploaded by a co-author: http://arxiv.org/pdf/1511.03183.pdf

arXiv:1407.4095 [pdf, other]

doi 10.1007/s10107-015-0922-1

Positive semidefinite rank

Authors: Hamza Fawzi, João Gouveia, Pablo A. Parrilo, Richard Z. Robinson, Rekha R. Thomas

Abstract: Let M be a p-by-q matrix with nonnegative entries. The positive semidefinite rank (psd rank) of M is the smallest integer k for which there exist positive semidefinite matrices $A_i, B_j$ of size $k \times k$ such that $M_{ij} = \text{trace}(A_i B_j)$. The psd rank has many appealing geometric interpretations, including semidefinite representations of polyhedra and information-theoretic applications. In this paper we develop and survey the main mathematical properties of psd rank, including its geometry, relationships with other rank notions, and computational and algorithmic aspects.

Submitted 15 July, 2014; originally announced July 2014.

Comments: 35 pages

Journal ref: Mathematical Programming 153(1) 133-177, 2015
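
The definition in the abstract above can be made concrete with a small numerical sketch. It builds a nonnegative 3-by-2 matrix from $2 \times 2$ positive semidefinite factors via $M_{ij} = \text{trace}(A_i B_j)$. The vectors and the helper names `outer` and `trace_product` are assumptions chosen for illustration and are not taken from the paper. The factors are rank-one matrices $v v^T$ and $w w^T$, for which $\text{trace}(v v^T w w^T) = (v \cdot w)^2 \ge 0$.

```python
def outer(v):
    """Return the rank-one positive semidefinite matrix v v^T."""
    return [[a * b for b in v] for a in v]


def trace_product(A, B):
    """Return trace(A B) for two square matrices of the same size."""
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))


# Hypothetical generating vectors (illustrative values only).
vs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
ws = [(1.0, 0.0), (1.0, -1.0)]

A = [outer(v) for v in vs]  # three 2x2 psd matrices A_i
B = [outer(w) for w in ws]  # two 2x2 psd matrices B_j

# M_ij = trace(A_i B_j), which is nonnegative by construction.
M = [[trace_product(A_i, B_j) for B_j in B] for A_i in A]
for row in M:
    print(row)
```

Because the factors have size 2, the resulting matrix has psd rank at most 2. A psd rank of 1 would force $M_{ij} = a_i b_j$, a matrix of ordinary rank 1, and this $M$ has rank 2, so its psd rank is exactly 2.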

arXiv:1312.1121 [pdf, other]

Interpreting random forest classification models using a feature contribution method

Authors: Anna Palczewska, Jan Palczewski, Richard Marchese Robinson, Daniel Neagu

Abstract: Model interpretation is one of the key aspects of the model evaluation process. The explanation of the relationship between model variables and outputs is relatively easy for statistical models, such as linear regressions, thanks to the availability of model parameters and their statistical significance. For "black box" models, such as random forest, this information is hidden inside the model structure. This work presents an approach for computing feature contributions for random forest classification models. It allows for the determination of the influence of each variable on the model prediction for an individual instance. By analysing feature contributions for a training dataset, the most significant variables can be determined and their typical contribution towards predictions made for individual classes, i.e., class-specific feature contribution "patterns", are discovered. These patterns represent a standard behaviour of the model and allow for an additional assessment of the model reliability for new data. Interpretation of feature contributions for two UCI benchmark datasets shows the potential of the proposed methodology. The robustness of results is demonstrated through an extensive analysis of feature contributions calculated for a large number of generated random forest models.

Submitted 4 December, 2013; originally announced December 2013.

ACM Class: I.5.2; I.2.1; J.3

arXiv:1311.0574 [pdf, ps, other]

Search strategies for developing characterizations of graphs without small wheel subdivisions

Authors: Rebecca Robinson, Graham Farr

Abstract: Practical algorithms for solving the Subgraph Homeomorphism Problem are known for only a few small pattern graphs: among these are the wheel graphs with four, five, six, and seven spokes. The length and difficulty of the proofs leading to these algorithms increase greatly as the size of the pattern graph increases. Proving a result for the wheel with six spokes requires extensive case analysis on many small graphs, and even more such analysis is needed for the wheel with seven spokes. This paper describes algorithms and programs used to automate the generation and testing of the graphs that arise as cases in these proofs. The main algorithm given may be useful in a more general context, for developing other characterizations of SHP-related properties.

Submitted 3 November, 2013; originally announced November 2013.

Comments: 22 pages, 4 figures

Report number: Technical report 2009/241, Clayton School of Information Technology, Monash University

ACM Class: G.2.2

arXiv:1311.0573 [pdf, ps, other]

doi 10.1016/j.disc.2014.03.014

Graphs with no 7-wheel subdivision

Authors: Rebecca Robinson, Graham Farr

Abstract: The subgraph homeomorphism problem, SHP($H$), has been shown to be polynomial-time solvable for any fixed pattern graph $H$, but practical algorithms have been developed only for a few specific pattern graphs. Among these are the wheels with four, five, and six spokes. This paper examines the subgraph homeomorphism problem where the pattern graph is a wheel with seven spokes, and gives a result that describes graphs with no $W_{7}$-subdivision, showing how they can be built up, using certain operations, from smaller `pieces' that meet certain conditions. We also discuss algorithmic aspects of the problem.

Submitted 16 December, 2013; v1 submitted 3 November, 2013; originally announced November 2013.

Comments: 97 pages, 47 figures

Report number: Technical report number 2012/270, Clayton School of Information Technology, Monash University

ACM Class: G.2.2

Showing 1–26 of 26 results for author: Robinson, R