subscribe to arXiv mailings

ResearchBot: Bridging the Gap between Academic Research and Practical Programming Communities

Authors: Sahar Farzanehpour, Swetha Rajeev, Huayu Liang, Ritvik Prabhu, Chris Brown

Abstract: Software developers commonly rely on platforms like Stack Overflow for problem-solving and learning. However, academic research is an untapped resource that could greatly benefit industry practitioners. The challenge lies in connecting the innovative insights from academia to real-world problems faced by developers. This project introduces ResearchBot, a tool designed to bridge this academia-indus… ▽ More Software developers commonly rely on platforms like Stack Overflow for problem-solving and learning. However, academic research is an untapped resource that could greatly benefit industry practitioners. The challenge lies in connecting the innovative insights from academia to real-world problems faced by developers. This project introduces ResearchBot, a tool designed to bridge this academia-industry gap. ResearchBot employs a modular approach, encompassing understanding questions, curating queries to obtain relevant papers in the CrossRef repository, summarizing paper content and finally answering user questions based on paper summaries. The core objective of ResearchBot is to democratize access to academic knowledge for industry professionals. By providing concise summaries of cutting-edge research directly in response to SE-related questions, ResearchBot facilitates the application of academic insights to practical contexts. Ultimately, it aims to bridge the gap between academia and industry, using research evidence to support learning and decision-making in software development. △ Less

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2406.14724 [pdf, other]

An Exploratory Mixed-Methods Study on General Data Protection Regulation (GDPR) Compliance in Open-Source Software

Authors: Lucas Franke, Huayu Liang, Sahar Farzanehpour, Aaron Brantly, James C. Davis, Chris Brown

Abstract: Background: Governments worldwide are considering data privacy regulations. These laws, e.g. the European Union's General Data Protection Regulation (GDPR), require software developers to meet privacy-related requirements when interacting with users' data. Prior research describes the impact of such laws on software development, but only for commercial software. Open-source software is commonly in… ▽ More Background: Governments worldwide are considering data privacy regulations. These laws, e.g. the European Union's General Data Protection Regulation (GDPR), require software developers to meet privacy-related requirements when interacting with users' data. Prior research describes the impact of such laws on software development, but only for commercial software. Open-source software is commonly integrated into regulated software, and thus must be engineered or adapted for compliance. We do not know how such laws impact open-source software development. Aims: To understand how data privacy laws affect open-source software development. We studied the European Union's GDPR, the most prominent such law. We investigated how GDPR compliance activities influence OSS developer activity (RQ1), how OSS developers perceive fulfilling GDPR requirements (RQ2), the most challenging GDPR requirements to implement (RQ3), and how OSS developers assess GDPR compliance (RQ4). Method: We distributed an online survey to explore perceptions of GDPR implementations from open-source developers (N=56). We further conducted a repository mining study to analyze development metrics on pull requests (N=31462) submitted to open-source GitHub repositories. Results: GDPR policies complicate open-source development processes and introduce challenges for developers, primarily regarding the management of users' data, implementation costs and time, and assessments of compliance. Moreover, we observed negative perceptions of GDPR from open-source developers and significant increases in development activity, in particular metrics related to coding and reviewing activity, on GitHub pull requests related to GDPR compliance. Conclusions: Our findings motivate policy-related resources and automated tools to support data privacy regulation implementation and compliance efforts in open-source software. △ Less

Submitted 20 June, 2024; originally announced June 2024.

Comments: In the proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM'24)

arXiv:2405.05241 [pdf, other]

BenthicNet: A global compilation of seafloor images for deep learning applications

Authors: Scott C. Lowe, Benjamin Misiuk, Isaac Xu, Shakhboz Abdulazizov, Amit R. Baroi, Alex C. Bastos, Merlin Best, Vicki Ferrini, Ariell Friedman, Deborah Hart, Ove Hoegh-Guldberg, Daniel Ierodiaconou, Julia Mackin-McLaughlin, Kathryn Markey, Pedro S. Menandro, Jacquomo Monk, Shreya Nemani, John O'Brien, Elizabeth Oh, Luba Y. Reshitnyk, Katleen Robert, Chris M. Roelfsema, Jessica A. Sameoto, Alexandre C. G. Schimel, Jordan A. Thomson , et al. (4 additional authors not shown)

Abstract: Advances in underwater imaging enable the collection of extensive seafloor image datasets that are necessary for monitoring important benthic ecosystems. The ability to collect seafloor imagery has outpaced our capacity to analyze it, hindering expedient mobilization of this crucial environmental information. Recent machine learning approaches provide opportunities to increase the efficiency with… ▽ More Advances in underwater imaging enable the collection of extensive seafloor image datasets that are necessary for monitoring important benthic ecosystems. The ability to collect seafloor imagery has outpaced our capacity to analyze it, hindering expedient mobilization of this crucial environmental information. Recent machine learning approaches provide opportunities to increase the efficiency with which seafloor image datasets are analyzed, yet large and consistent datasets necessary to support development of such approaches are scarce. Here we present BenthicNet: a global compilation of seafloor imagery designed to support the training and evaluation of large-scale image recognition models. An initial set of over 11.4 million images was collected and curated to represent a diversity of seafloor environments using a representative subset of 1.3 million images. These are accompanied by 2.6 million annotations translated to the CATAMI scheme, which span 190,000 of the images. A large deep learning model was trained on this compilation and preliminary results suggest it has utility for automating large and small-scale image analysis tasks. The compilation and model are made openly available for use by the scientific community at https://doi.org/10.20383/103.0614. △ Less

Submitted 11 July, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

arXiv:2404.12048 [pdf, other]

Symbolic Computation for All the Fun

Authors: Chad E. Brown, Mikoláš Janota, Mirek Olšák

Abstract: Motivated by the recent 10 million dollar AIMO challenge, this paper targets the problem of finding all functions conforming to a given specification. This is a popular problem at mathematical competitions and it brings about a number of challenges, primarily, synthesizing the possible solutions and proving that no other solutions exist. Often, there are infinitely many solutions and then the set… ▽ More Motivated by the recent 10 million dollar AIMO challenge, this paper targets the problem of finding all functions conforming to a given specification. This is a popular problem at mathematical competitions and it brings about a number of challenges, primarily, synthesizing the possible solutions and proving that no other solutions exist. Often, there are infinitely many solutions and then the set of solutions has to be captured symbolically. We propose an approach to solving this problem and evaluate it on a set of problems that appeared in mathematical competitions and olympics. △ Less

Submitted 23 June, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.07142 [pdf, ps, other]

Bridging Gaps, Building Futures: Advancing Software Developer Diversity and Inclusion Through Future-Oriented Research

Authors: Sonja M. Hyrynsalmi, Sebastian Baltes, Chris Brown, Rafael Prikladnicki, Gema Rodriguez-Perez, Alexander Serebrenik, Jocelyn Simmonds, Bianca Trinkenreich, Yi Wang, Grischa Liebel

Abstract: Software systems are responsible for nearly all aspects of modern life and society. However, the demographics of software development teams that are tasked with designing and maintaining these software systems rarely match the demographics of users. As the landscape of software engineering (SE) evolves due to technological innovations, such as the rise of automated programming assistants powered b… ▽ More Software systems are responsible for nearly all aspects of modern life and society. However, the demographics of software development teams that are tasked with designing and maintaining these software systems rarely match the demographics of users. As the landscape of software engineering (SE) evolves due to technological innovations, such as the rise of automated programming assistants powered by artificial intelligence (AI) and machine learning, more effort is needed to promote software developer diversity and inclusion (SDDI) to ensure inclusive work environments for development teams and usable software for diverse populations. To this end, we present insights from SE researchers and practitioners on challenges and solutions regarding diversity and inclusion in SE. Based on these findings, we share potential utopian and dystopian visions of the future and provide future research directions and implications for academia and industry to promote SDDI in the age of AI-driven SE. △ Less

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.01761 [pdf, other]

A Formal Proof of R(4,5)=25

Authors: Thibault Gauthier, Chad E. Brown

Abstract: In 1995, McKay and Radziszowski proved that the Ramsey number R(4,5) is equal to 25. Their proof relies on a combination of high-level arguments and computational steps. The authors have performed the computational parts of the proof with different implementations in order to reduce the possibility of an error in their programs. In this work, we prove this theorem in the interactive theorem prover… ▽ More In 1995, McKay and Radziszowski proved that the Ramsey number R(4,5) is equal to 25. Their proof relies on a combination of high-level arguments and computational steps. The authors have performed the computational parts of the proof with different implementations in order to reduce the possibility of an error in their programs. In this work, we prove this theorem in the interactive theorem prover HOL4 limiting the uncertainty to the small HOL4 kernel. Instead of verifying their algorithms directly, we rely on the HOL4 interface to MiniSat SAT to prove gluing lemmas. To reduce the number of such lemmas and thus make the computational part of the proof feasible, we implement a generalization algorithm. We verify that its output covers all the possible cases by implementing a custom SAT-solver extended with a graph isomorphism checker. △ Less

Submitted 12 June, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

arXiv:2401.14629 [pdf, ps, other]

A First Look at the General Data Protection Regulation (GDPR) in Open-Source Software

Authors: Lucas Franke, Huayu Liang, Aaron Brantly, James C Davis, Chris Brown

Abstract: This poster describes work on the General Data Protection Regulation (GDPR) in open-source software. Although open-source software is commonly integrated into regulated software, and thus must be engineered or adapted for compliance, we do not know how such laws impact open-source software development. We surveyed open-source developers (N=47) to understand their experiences and perceptions of G… ▽ More This poster describes work on the General Data Protection Regulation (GDPR) in open-source software. Although open-source software is commonly integrated into regulated software, and thus must be engineered or adapted for compliance, we do not know how such laws impact open-source software development. We surveyed open-source developers (N=47) to understand their experiences and perceptions of GDPR. We learned many engineering challenges, primarily regarding the management of users' data and assessments of compliance. We call for improved policy-related resources, especially tools to support data privacy regulation implementation and compliance in open-source software. △ Less

Submitted 25 January, 2024; originally announced January 2024.

Comments: 2 page extended abstract for ICSE-Poster 2024

arXiv:2306.16757 [pdf, other]

Exploiting Strict Constraints in the Cylindrical Algebraic Covering

Authors: Philipp Bär, Jasper Nalbach, Erika Ábrahám, Christopher W. Brown

Abstract: One of the few available complete methods for checking the satisfiability of sets of polynomial constraints over the reals is the cylindrical algebraic covering (CAlC) method. In this paper, we propose an extension for this method to exploit the strictness of input constraints for reducing the computational effort. We illustrate the concepts on a multidimensional example and provide experimental r… ▽ More One of the few available complete methods for checking the satisfiability of sets of polynomial constraints over the reals is the cylindrical algebraic covering (CAlC) method. In this paper, we propose an extension for this method to exploit the strictness of input constraints for reducing the computational effort. We illustrate the concepts on a multidimensional example and provide experimental results to evaluate the usefulness of our proposed extension. △ Less

Submitted 29 June, 2023; originally announced June 2023.

arXiv:2306.11722 [pdf, other]

doi 10.1145/3636515

Automated Grading and Feedback Tools for Programming Education: A Systematic Review

Authors: Marcus Messer, Neil C. C. Brown, Michael Kölling, Miaojing Shi

Abstract: We conducted a systematic literature review on automated grading and feedback tools for programming education. We analysed 121 research papers from 2017 to 2021 inclusive and categorised them based on skills assessed, approach, language paradigm, degree of automation and evaluation techniques. Most papers assess the correctness of assignments in object-oriented languages. Typically, these to… ▽ More We conducted a systematic literature review on automated grading and feedback tools for programming education. We analysed 121 research papers from 2017 to 2021 inclusive and categorised them based on skills assessed, approach, language paradigm, degree of automation and evaluation techniques. Most papers assess the correctness of assignments in object-oriented languages. Typically, these tools use a dynamic technique, primarily unit testing, to provide grades and feedback to the students or static analysis techniques to compare a submission with a reference solution or with a set of correct student submissions. However, these techniques' feedback is often limited to whether the unit tests have passed or failed, the expected and actual output, or how they differ from the reference solution. Furthermore, few tools assess the maintainability, readability or documentation of the source code, with most using static analysis techniques, such as code quality metrics, in conjunction with grading correctness. Additionally, we found that most tools offered fully automated assessment to allow for near-instantaneous feedback and multiple resubmissions, which can increase student satisfaction and provide them with more opportunities to succeed. In terms of techniques used to evaluate the tools' performance, most papers primarily use student surveys or compare the automatic assessment tools to grades or feedback provided by human graders. However, because the evaluation dataset is frequently unavailable, it is more difficult to reproduce results and compare tools to a collection of common assignments. △ Less

Submitted 5 December, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

Comments: Accepted version of the manuscript

arXiv:2306.09909 [pdf, other]

Neural Volumetric Reconstruction for Coherent Synthetic Aperture Sonar

Authors: Albert W. Reed, Juhyeon Kim, Thomas Blanford, Adithya Pediredla, Daniel C. Brown, Suren Jayasuriya

Abstract: Synthetic aperture sonar (SAS) measures a scene from multiple views in order to increase the resolution of reconstructed imagery. Image reconstruction methods for SAS coherently combine measurements to focus acoustic energy onto the scene. However, image formation is typically under-constrained due to a limited number of measurements and bandlimited hardware, which limits the capabilities of exist… ▽ More Synthetic aperture sonar (SAS) measures a scene from multiple views in order to increase the resolution of reconstructed imagery. Image reconstruction methods for SAS coherently combine measurements to focus acoustic energy onto the scene. However, image formation is typically under-constrained due to a limited number of measurements and bandlimited hardware, which limits the capabilities of existing reconstruction methods. To help meet these challenges, we design an analysis-by-synthesis optimization that leverages recent advances in neural rendering to perform coherent SAS imaging. Our optimization enables us to incorporate physics-based constraints and scene priors into the image formation process. We validate our method on simulation and experimental results captured in both air and water. We demonstrate both quantitatively and qualitatively that our method typically produces superior reconstructions than existing approaches. We share code and data for reproducibility. △ Less

Submitted 16 June, 2023; originally announced June 2023.

arXiv:2306.06115 [pdf, other]

The TeamPlay Project: Analysing and Optimising Time, Energy, and Security for Cyber-Physical Systems

Authors: Benjamin Rouxel, Christopher Brown, Emad Ebeid, Kerstin Eder, Heiko Falk, Clemens Grelck, Jesper Holst, Shashank Jadhav, Yoann Marquer, Marcos Martinez De Alejandro, Kris Nikov, Ali Sahafi, Ulrik Pagh Schultz Lundquist, Adam Seewald, Vangelis Vassalos, Simon Wegener, Olivier Zendra

Abstract: Non-functional properties, such as energy, time, and security (ETS) are becoming increasingly important in Cyber-Physical Systems (CPS) programming. This article describes TeamPlay, a research project funded under the EU Horizon 2020 programme between January 2018 and June 2021. TeamPlay aimed to provide the system designer with a toolchain for developing embedded applications where ETS properties… ▽ More Non-functional properties, such as energy, time, and security (ETS) are becoming increasingly important in Cyber-Physical Systems (CPS) programming. This article describes TeamPlay, a research project funded under the EU Horizon 2020 programme between January 2018 and June 2021. TeamPlay aimed to provide the system designer with a toolchain for developing embedded applications where ETS properties are first-class citizens, allowing the developer to reflect directly on energy, time and security properties at the source code level. In this paper we give an overview of the TeamPlay methodology, introduce the challenges and solutions of our approach and summarise the results achieved. Overall, applying our TeamPlay methodology led to an improvement of up to 18% performance and 52% energy usage over traditional approaches. △ Less

Submitted 1 June, 2023; originally announced June 2023.

Journal ref: Design, Automation and Test in Europe, Apr 2023, Antwerp, Belgium

arXiv:2306.03289 [pdf, other]

doi 10.1109/VL-HCC57772.2023.00018

Exploring the Role of AI Assistants in Computer Science Education: Methods, Implications, and Instructor Perspectives

Authors: Tianjia Wang, Daniel Vargas-Díaz, Chris Brown, Yan Chen

Abstract: The use of AI assistants, along with the challenges they present, has sparked significant debate within the community of computer science education. While these tools demonstrate the potential to support students' learning and instructors' teaching, they also raise concerns about enabling unethical uses by students. Previous research has suggested various strategies aimed at addressing these issue… ▽ More The use of AI assistants, along with the challenges they present, has sparked significant debate within the community of computer science education. While these tools demonstrate the potential to support students' learning and instructors' teaching, they also raise concerns about enabling unethical uses by students. Previous research has suggested various strategies aimed at addressing these issues. However, they concentrate on the introductory programming courses and focus on one specific type of problem. The present research evaluated the performance of ChatGPT, a state-of-the-art AI assistant, at solving 187 problems spanning three distinct types that were collected from six undergraduate computer science. The selected courses covered different topics and targeted different program levels. We then explored methods to modify these problems to adapt them to ChatGPT's capabilities to reduce potential misuse by students. Finally, we conducted semi-structured interviews with 11 computer science instructors. The aim was to gather their opinions on our problem modification methods, understand their perspectives on the impact of AI assistants on computer science education, and learn their strategies for adapting their courses to leverage these AI capabilities for educational improvement. The results revealed issues ranging from academic fairness to long-term impact on students' mental models. From our results, we derived design implications and recommended tools to help instructors design and create future course material that could more effectively adapt to AI assistants' capabilities. △ Less

Submitted 12 November, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

Journal ref: 2023 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), Washington, DC, USA, 2023, pp. 92-102

arXiv:2305.07903 [pdf, ps, other]

Translating SUMO-K to Higher-Order Set Theory

Authors: Chad Brown, Adam Pease, Josef Urban

Abstract: We describe a translation from a fragment of SUMO (SUMO-K) into higher-order set theory. The translation provides a formal semantics for portions of SUMO which are beyond first-order and which have previously only had an informal interpretation. It also for the first time embeds a large common-sense ontology into a very secure interactive theorem proving system. We further extend our previous work… ▽ More We describe a translation from a fragment of SUMO (SUMO-K) into higher-order set theory. The translation provides a formal semantics for portions of SUMO which are beyond first-order and which have previously only had an informal interpretation. It also for the first time embeds a large common-sense ontology into a very secure interactive theorem proving system. We further extend our previous work in finding contradictions in SUMO from first order constructs to include a portion of SUMO's higher order constructs. Finally, using the translation, we can create problems that can be proven using higher-order interactive and automated theorem provers. This is tested in several systems and can be used to form a corpus of higher-order common-sense reasoning problems. △ Less

Submitted 13 May, 2023; originally announced May 2023.

Comments: 17 pages including references

arXiv:2305.06328 [pdf, other]

Suggestion Bot: Analyzing the Impact of Automated Suggested Changes on Code Reviews

Authors: Nivishree Palvannan, Chris Brown

Abstract: Peer code reviews are crucial for maintaining the quality of the code in software repositories. Developers have introduced a number of software bots to help with the code review process. Despite the benefits of automating code review tasks, many developers face challenges interacting with these bots due to non-comprehensive feedback and disruptive notifications. In this paper, we analyze how incor… ▽ More Peer code reviews are crucial for maintaining the quality of the code in software repositories. Developers have introduced a number of software bots to help with the code review process. Despite the benefits of automating code review tasks, many developers face challenges interacting with these bots due to non-comprehensive feedback and disruptive notifications. In this paper, we analyze how incorporating a bot in software development cycle will decrease turnaround time of pull request. We created a bot called SUGGESTION BOT to automatically review the code base using GitHub's suggested changes functionality in order to solve this issue. A preliminary comparative empirical investigation between the utilization of this bot and manual review procedures was also conducted in this study. We evaluate SUGGESTION BOT concerning its impact on review time and also analyze whether the comments given by the bot are clear and useful for users. Our results provide implications for the design of future systems and improving human-bot interactions for code review. △ Less

Submitted 10 May, 2023; originally announced May 2023.

arXiv:2304.02986 [pdf, ps, other]

A Mathematical Benchmark for Inductive Theorem Provers

Authors: Thibault Gauthier, Chad E. Brown, Mikolas Janota, Josef Urban

Abstract: We present a benchmark of 29687 problems derived from the On-Line Encyclopedia of Integer Sequences (OEIS). Each problem expresses the equivalence of two syntactically different programs generating the same OEIS sequence. Such programs were conjectured by a learning-guided synthesis system using a language with looping operators. The operators implement recursion, and thus many of the proofs requi… ▽ More We present a benchmark of 29687 problems derived from the On-Line Encyclopedia of Integer Sequences (OEIS). Each problem expresses the equivalence of two syntactically different programs generating the same OEIS sequence. Such programs were conjectured by a learning-guided synthesis system using a language with looping operators. The operators implement recursion, and thus many of the proofs require induction on natural numbers. The benchmark contains problems of varying difficulty from a wide area of mathematical domains. We believe that these characteristics will make it an effective judge for the progress of inductive theorem provers in this domain for years to come. △ Less

Submitted 6 April, 2023; originally announced April 2023.

arXiv:2303.11936 [pdf, other]

Clustering US Counties to Find Patterns Related to the COVID-19 Pandemic

Authors: Cora Brown, Sarah Milstein, Tianyi Sun, Cooper Zhao

Abstract: When COVID-19 first started spreading and quarantine was implemented, the Society for Industrial and Applied Mathematics (SIAM) Student Chapter at the University of Minnesota-Twin Cities began a collaboration with Ecolab to use our skills as data scientists and mathematicians to extract useful insights from relevant data relating to the pandemic. This collaboration consisted of multiple groups wor… ▽ More When COVID-19 first started spreading and quarantine was implemented, the Society for Industrial and Applied Mathematics (SIAM) Student Chapter at the University of Minnesota-Twin Cities began a collaboration with Ecolab to use our skills as data scientists and mathematicians to extract useful insights from relevant data relating to the pandemic. This collaboration consisted of multiple groups working on different projects. In this write-up we focus on using clustering techniques to help us find groups of similar counties in the US and use that to help us understand the pandemic. Our team for this project consisted of University of Minnesota students Cora Brown, Sarah Milstein, Tianyi Sun, and Cooper Zhao, with help from Ecolab Data Scientist Jimmy Broomfield and University of Minnesota student Skye Ke. In the sections below we describe all of the work done for this project. In Section 2, we list the data we gathered, as well as the feature engineering we performed. In Section 3, we describe the metrics we used for evaluating our models. In Section 4, we explain the methods we used for interpreting the results of our various clustering approaches. In Section 5, we describe the different clustering methods we implemented. In Section 6, we present the results of our clustering techniques and provide relevant interpretation. Finally, in Section 7, we provide some concluding remarks comparing the different clustering methods. △ Less

Submitted 19 March, 2023; originally announced March 2023.

arXiv:2301.07799 [pdf, other]

doi 10.1016/j.neunet.2023.01.007

A Domain-Agnostic Approach for Characterization of Lifelong Learning Systems

Authors: Megan M. Baker, Alexander New, Mario Aguilar-Simon, Ziad Al-Halah, Sébastien M. R. Arnold, Ese Ben-Iwhiwhu, Andrew P. Brna, Ethan Brooks, Ryan C. Brown, Zachary Daniels, Anurag Daram, Fabien Delattre, Ryan Dellana, Eric Eaton, Haotian Fu, Kristen Grauman, Jesse Hostetler, Shariq Iqbal, Cassandra Kent, Nicholas Ketz, Soheil Kolouri, George Konidaris, Dhireesha Kudithipudi, Erik Learned-Miller, Seungwon Lee , et al. (22 additional authors not shown)

Abstract: Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events, where the input distributions and tasks encountered by the deployed systems will not be limited to the original training context, and systems will instead need to adapt to novel distributions and tasks while deployed. This critical gap may be addressed through th… ▽ More Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events, where the input distributions and tasks encountered by the deployed systems will not be limited to the original training context, and systems will instead need to adapt to novel distributions and tasks while deployed. This critical gap may be addressed through the development of "Lifelong Learning" systems that are capable of 1) Continuous Learning, 2) Transfer and Adaptation, and 3) Scalability. Unfortunately, efforts to improve these capabilities are typically treated as distinct areas of research that are assessed independently, without regard to the impact of each separate capability on other aspects of the system. We instead propose a holistic approach, using a suite of metrics and an evaluation framework to assess Lifelong Learning in a principled way that is agnostic to specific domains or system techniques. Through five case studies, we show that this suite of metrics can inform the development of varied and complex Lifelong Learning systems. We highlight how the proposed suite of metrics quantifies performance trade-offs present during Lifelong Learning system development - both the widely discussed Stability-Plasticity dilemma and the newly proposed relationship between Sample Efficient and Robust Learning. Further, we make recommendations for the formulation and use of metrics to guide the continuing development of Lifelong Learning systems and assess their progress in the future. △ Less

Submitted 18 January, 2023; originally announced January 2023.

Comments: To appear in Neural Networks

arXiv:2212.09309 [pdf, other]

doi 10.1016/j.jsc.2023.102288

Levelwise construction of a single cylindrical algebraic cell

Authors: Jasper Nalbach, Erika Ábrahám, Philippe Specht, Christopher W. Brown, James H. Davenport, Matthew England

Abstract: Satisfiability Modulo Theories (SMT) solvers check the satisfiability of quantifier-free first-order logic formulas. We consider the theory of non-linear real arithmetic where the formulae are logical combinations of polynomial constraints. Here a commonly used tool is the Cylindrical Algebraic Decomposition (CAD) to decompose real space into cells where the constraints are truth-invariant through… ▽ More Satisfiability Modulo Theories (SMT) solvers check the satisfiability of quantifier-free first-order logic formulas. We consider the theory of non-linear real arithmetic where the formulae are logical combinations of polynomial constraints. Here a commonly used tool is the Cylindrical Algebraic Decomposition (CAD) to decompose real space into cells where the constraints are truth-invariant through the use of projection polynomials. An improved approach is to repackage the CAD theory into a search-based algorithm: one that guesses sample points to satisfy the formula, and generalizes guesses that conflict constraints to cylindrical cells around samples which are avoided in the continuing search. Such an approach can lead to a satisfying assignment more quickly, or conclude unsatisfiability with fewer cells. A notable example of this approach is Jovanović and de Moura's NLSAT algorithm. Since these cells are produced locally to a sample we might need fewer projection polynomials than the traditional CAD projection. The original NLSAT algorithm reduced the set a little; while Brown's single cell construction reduced it much further still. However, the shape and size of the cell produced depends on the order in which the polynomials are considered. This paper proposes a method to construct such cells levelwise, i.e. built level-by-level according to a variable ordering. We still use a reduced number of projection polynomials, but can now consider a variety of different reductions and use heuristics to select the projection polynomials in order to optimise the shape of the cell under construction. We formulate all the necessary theory as a proof system: while not a common presentation for work in this field, it allows an elegant decoupling of heuristics from the algorithm and its proof of correctness. △ Less

Submitted 20 July, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

Journal ref: Journal of Symbolic Computation, Volume 123, Article Number 102288. Elsevier, 2024

arXiv:2211.13239 [pdf, other]

Relating Regularization and Generalization through the Intrinsic Dimension of Activations

Authors: Bradley C. A. Brown, Jordan Juravsky, Anthony L. Caterini, Gabriel Loaiza-Ganem

Abstract: Given a pair of models with similar training set performance, it is natural to assume that the model that possesses simpler internal representations would exhibit better generalization. In this work, we provide empirical evidence for this intuition through an analysis of the intrinsic dimension (ID) of model activations, which can be thought of as the minimal number of factors of variation in the… ▽ More Given a pair of models with similar training set performance, it is natural to assume that the model that possesses simpler internal representations would exhibit better generalization. In this work, we provide empirical evidence for this intuition through an analysis of the intrinsic dimension (ID) of model activations, which can be thought of as the minimal number of factors of variation in the model's representation of the data. First, we show that common regularization techniques uniformly decrease the last-layer ID (LLID) of validation set activations for image classification models and show how this strongly affects generalization performance. We also investigate how excessive regularization decreases a model's ability to extract features from data in earlier layers, leading to a negative effect on validation accuracy even while LLID continues to decrease and training accuracy remains near-perfect. Finally, we examine the LLID over the course of training of models that exhibit grokking. We observe that well after training accuracy saturates, when models ``grok'' and validation accuracy suddenly improves from random to perfect, there is a co-occurent sudden drop in LLID, thus providing more insight into the dynamics of sudden generalization. △ Less

Submitted 23 November, 2022; originally announced November 2022.

Comments: NeurIPS 2022 OPT and HITY workshops

arXiv:2207.02862 [pdf, other]

Verifying the Union of Manifolds Hypothesis for Image Data

Authors: Bradley C. A. Brown, Anthony L. Caterini, Brendan Leigh Ross, Jesse C. Cresswell, Gabriel Loaiza-Ganem

Abstract: Deep learning has had tremendous success at learning low-dimensional representations of high-dimensional data. This success would be impossible if there was no hidden low-dimensional structure in data of interest; this existence is posited by the manifold hypothesis, which states that the data lies on an unknown manifold of low intrinsic dimension. In this paper, we argue that this hypothesis does… ▽ More Deep learning has had tremendous success at learning low-dimensional representations of high-dimensional data. This success would be impossible if there was no hidden low-dimensional structure in data of interest; this existence is posited by the manifold hypothesis, which states that the data lies on an unknown manifold of low intrinsic dimension. In this paper, we argue that this hypothesis does not properly capture the low-dimensional structure typically present in image data. Assuming that data lies on a single manifold implies intrinsic dimension is identical across the entire data space, and does not allow for subregions of this space to have a different number of factors of variation. To address this deficiency, we consider the union of manifolds hypothesis, which states that data lies on a disjoint union of manifolds of varying intrinsic dimensions. We empirically verify this hypothesis on commonly-used image datasets, finding that indeed, observed data lies on a disconnected set and that intrinsic dimension is not constant. We also provide insights into the implications of the union of manifolds hypothesis in deep learning, both supervised and unsupervised, showing that designing models with an inductive bias for this structure improves performance across classification and generative modelling tasks. Our code is available at https://github.com/layer6ai-labs/UoMH. △ Less

Submitted 2 March, 2023; v1 submitted 6 July, 2022; originally announced July 2022.

Comments: ICLR 2023

arXiv:2206.07796 [pdf, other]

FixEval: Execution-based Evaluation of Program Fixes for Programming Problems

Authors: Md Mahim Anjum Haque, Wasi Uddin Ahmad, Ismini Lourentzou, Chris Brown

Abstract: The complexity of modern software has led to a drastic increase in the time and cost associated with detecting and rectifying software bugs. In response, researchers have explored various methods to automatically generate fixes for buggy code. However, due to the large combinatorial space of possible fixes for any given bug, few tools and datasets are available to evaluate model-generated fixes ef… ▽ More The complexity of modern software has led to a drastic increase in the time and cost associated with detecting and rectifying software bugs. In response, researchers have explored various methods to automatically generate fixes for buggy code. However, due to the large combinatorial space of possible fixes for any given bug, few tools and datasets are available to evaluate model-generated fixes effectively. To address this issue, we introduce FixEval, a benchmark comprising of buggy code submissions to competitive programming problems and their corresponding fixes. FixEval offers an extensive collection of unit tests to evaluate the correctness of model-generated program fixes and assess further information regarding time, memory constraints, and acceptance based on a verdict. We consider two Transformer language models pretrained on programming languages as our baseline and compare them using match-based and execution-based evaluation metrics. Our experiments show that match-based metrics do not reflect model-generated program fixes accurately. At the same time, execution-based methods evaluate programs through all cases and scenarios designed explicitly for that solution. Therefore, we believe FixEval provides a step towards real-world automatic bug fixing and model-generated code evaluation. The dataset and models are open-sourced at https://github.com/mahimanzum/FixEval. △ Less

Submitted 30 March, 2023; v1 submitted 15 June, 2022; originally announced June 2022.

arXiv:2205.06640 [pdf, ps, other]

Lash 1.0 (System Description)

Authors: Chad E. Brown, Cezary Kaliszyk

Abstract: Lash is a higher-order automated theorem prover created as a fork of the theorem prover Satallax. The basic underlying calculus of Satallax is a ground tableau calculus whose rules only use shallow information about the terms and formulas taking part in the rule. Lash uses new, efficient C representations of vital structures and operations. Most importantly, Lash uses a C representation of (normal… ▽ More Lash is a higher-order automated theorem prover created as a fork of the theorem prover Satallax. The basic underlying calculus of Satallax is a ground tableau calculus whose rules only use shallow information about the terms and formulas taking part in the rule. Lash uses new, efficient C representations of vital structures and operations. Most importantly, Lash uses a C representation of (normal) terms with perfect sharing along with a C implementation of normalizing substitutions. We describe the ways in which Lash differs from Satallax and the performance improvement of Lash over Satallax when used with analogous flag settings. With a 10s timeout Lash outperforms Satallax on a collection TH0 problems from the TPTP. We conclude with ideas for continuing the development of Lash. △ Less

Submitted 13 May, 2022; originally announced May 2022.

Journal ref: IJCAR 2022 Conference Submission

arXiv:2204.08027 [pdf, other]

Attention Mechanism based Cognition-level Scene Understanding

Authors: Xuejiao Tang, Tai Le Quy, Eirini Ntoutsi, Kea Turner, Vasile Palade, Israat Haque, Peng Xu, Chris Brown, Wenbin Zhang

Abstract: Given a question-image input, the Visual Commonsense Reasoning (VCR) model can predict an answer with the corresponding rationale, which requires inference ability from the real world. The VCR task, which calls for exploiting the multi-source information as well as learning different levels of understanding and extensive commonsense knowledge, is a cognition-level scene understanding task. The VCR… ▽ More Given a question-image input, the Visual Commonsense Reasoning (VCR) model can predict an answer with the corresponding rationale, which requires inference ability from the real world. The VCR task, which calls for exploiting the multi-source information as well as learning different levels of understanding and extensive commonsense knowledge, is a cognition-level scene understanding task. The VCR task has aroused researchers' interest due to its wide range of applications, including visual question answering, automated vehicle systems, and clinical decision support. Previous approaches to solving the VCR task generally rely on pre-training or exploiting memory with long dependency relationship encoded models. However, these approaches suffer from a lack of generalizability and losing information in long sequences. In this paper, we propose a parallel attention-based cognitive VCR network PAVCR, which fuses visual-textual information efficiently and encodes semantic information in parallel to enable the model to capture rich information for cognition-level inference. Extensive experiments show that the proposed model yields significant improvements over existing methods on the benchmark VCR dataset. Moreover, the proposed model provides intuitive interpretation into visual commonsense reasoning. △ Less

Submitted 18 April, 2022; v1 submitted 17 April, 2022; originally announced April 2022.

Comments: arXiv admin note: text overlap with arXiv:2108.02924, arXiv:2107.01671

arXiv:2204.03323 [pdf, other]

Multi-Sample $ζ$-mixup: Richer, More Realistic Synthetic Samples from a $p$-Series Interpolant

Authors: Kumar Abhishek, Colin J. Brown, Ghassan Hamarneh

Abstract: Modern deep learning training procedures rely on model regularization techniques such as data augmentation methods, which generate training samples that increase the diversity of data and richness of label information. A popular recent method, mixup, uses convex combinations of pairs of original samples to generate new samples. However, as we show in our experiments, mixup can produce undesirable… ▽ More Modern deep learning training procedures rely on model regularization techniques such as data augmentation methods, which generate training samples that increase the diversity of data and richness of label information. A popular recent method, mixup, uses convex combinations of pairs of original samples to generate new samples. However, as we show in our experiments, mixup can produce undesirable synthetic samples, where the data is sampled off the manifold and can contain incorrect labels. We propose $ζ$-mixup, a generalization of mixup with provably and demonstrably desirable properties that allows convex combinations of $N \geq 2$ samples, leading to more realistic and diverse outputs that incorporate information from $N$ original samples by using a $p$-series interpolant. We show that, compared to mixup, $ζ$-mixup better preserves the intrinsic dimensionality of the original datasets, which is a desirable property for training generalizable models. Furthermore, we show that our implementation of $ζ$-mixup is faster than mixup, and extensive evaluation on controlled synthetic and 24 real-world natural and medical image classification datasets shows that $ζ$-mixup outperforms mixup and traditional data augmentation techniques. △ Less

Submitted 7 April, 2022; originally announced April 2022.

Comments: 21 pages, 5 figures

arXiv:2202.08981 [pdf, other]

A Summary of the ComParE COVID-19 Challenges

Authors: Harry Coppock, Alican Akman, Christian Bergler, Maurice Gerczuk, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Jing Han, Shahin Amiriparian, Alice Baird, Lukas Stappen, Sandra Ottl, Panagiotis Tzirakis, Anton Batliner, Cecilia Mascolo, Björn W. Schuller

Abstract: The COVID-19 pandemic has caused massive humanitarian and economic damage. Teams of scientists from a broad range of disciplines have searched for methods to help governments and communities combat the disease. One avenue from the machine learning field which has been explored is the prospect of a digital mass test which can detect COVID-19 from infected individuals' respiratory sounds. We present… ▽ More The COVID-19 pandemic has caused massive humanitarian and economic damage. Teams of scientists from a broad range of disciplines have searched for methods to help governments and communities combat the disease. One avenue from the machine learning field which has been explored is the prospect of a digital mass test which can detect COVID-19 from infected individuals' respiratory sounds. We present a summary of the results from the INTERSPEECH 2021 Computational Paralinguistics Challenges: COVID-19 Cough, (CCS) and COVID-19 Speech, (CSS). △ Less

Submitted 17 February, 2022; originally announced February 2022.

Comments: 18 pages, 13 figures

arXiv:2112.08539 [pdf, other]

Implicit Neural Representations for Deconvolving SAS Images

Authors: Albert Reed, Thomas Blanford, Daniel C. Brown, Suren Jayasuriya

Abstract: Synthetic aperture sonar (SAS) image resolution is constrained by waveform bandwidth and array geometry. Specifically, the waveform bandwidth determines a point spread function (PSF) that blurs the locations of point scatterers in the scene. In theory, deconvolving the reconstructed SAS image with the scene PSF restores the original distribution of scatterers and yields sharper reconstructions. Ho… ▽ More Synthetic aperture sonar (SAS) image resolution is constrained by waveform bandwidth and array geometry. Specifically, the waveform bandwidth determines a point spread function (PSF) that blurs the locations of point scatterers in the scene. In theory, deconvolving the reconstructed SAS image with the scene PSF restores the original distribution of scatterers and yields sharper reconstructions. However, deconvolution is an ill-posed operation that is highly sensitive to noise. In this work, we leverage implicit neural representations (INRs), shown to be strong priors for the natural image space, to deconvolve SAS images. Importantly, our method does not require training data, as we perform our deconvolution through an analysis-bysynthesis optimization in a self-supervised fashion. We validate our method on simulated SAS data created with a point scattering model and real data captured with an in-air circular SAS. This work is an important first step towards applying neural networks for SAS image deconvolution. △ Less

Submitted 15 December, 2021; originally announced December 2021.

arXiv:2106.15523 [pdf, other]

Sounds of COVID-19: exploring realistic performance of audio-based digital testing

Authors: Jing Han, Tong Xia, Dimitris Spathis, Erika Bondareva, Chloë Brown, Jagmohan Chauhan, Ting Dang, Andreas Grammenos, Apinan Hasthanasombat, Andres Floto, Pietro Cicuta, Cecilia Mascolo

Abstract: Researchers have been battling with the question of how we can identify Coronavirus disease (COVID-19) cases efficiently, affordably and at scale. Recent work has shown how audio based approaches, which collect respiratory audio data (cough, breathing and voice) can be used for testing, however there is a lack of exploration of how biases and methodological decisions impact these tools' performanc… ▽ More Researchers have been battling with the question of how we can identify Coronavirus disease (COVID-19) cases efficiently, affordably and at scale. Recent work has shown how audio based approaches, which collect respiratory audio data (cough, breathing and voice) can be used for testing, however there is a lack of exploration of how biases and methodological decisions impact these tools' performance in practice. In this paper, we explore the realistic performance of audio-based digital testing of COVID-19. To investigate this, we collected a large crowdsourced respiratory audio dataset through a mobile app, alongside recent COVID-19 test result and symptoms intended as a ground truth. Within the collected dataset, we selected 5,240 samples from 2,478 participants and split them into different participant-independent sets for model development and validation. Among these, we controlled for potential confounding factors (such as demographics and language). The unbiased model takes features extracted from breathing, coughs, and voice signals as predictors and yields an AUC-ROC of 0.71 (95\% CI: 0.65$-$0.77). We further explore different unbalanced distributions to show how biases and participant splits affect performance. Finally, we discuss how the realistic model presented could be integrated in clinical practice to realize continuous, ubiquitous, sustainable and affordable testing at population scale. △ Less

Submitted 29 June, 2021; originally announced June 2021.

arXiv:2105.03568 [pdf, other]

ChaRRNets: Channel Robust Representation Networks for RF Fingerprinting

Authors: Carter N. Brown, Enrico Mattei, Andrew Draganov

Abstract: We present complex-valued Convolutional Neural Networks (CNNs) for RF fingerprinting that go beyond translation invariance and appropriately account for the inductive bias with respect to multipath propagation channels, a phenomenon that is specific to the fields of wireless signal processing and communications. We focus on the problem of fingerprinting wireless IoT devices in-the-wild using Deep… ▽ More We present complex-valued Convolutional Neural Networks (CNNs) for RF fingerprinting that go beyond translation invariance and appropriately account for the inductive bias with respect to multipath propagation channels, a phenomenon that is specific to the fields of wireless signal processing and communications. We focus on the problem of fingerprinting wireless IoT devices in-the-wild using Deep Learning (DL) techniques. Under these real-world conditions, the multipath environments represented in the train and test sets will be different. These differences are due to the physics governing the propagation of wireless signals, as well as the limitations of practical data collection campaigns. Our approach follows a group-theoretic framework, leverages prior work on DL on manifold-valued data, and extends this prior work to the wireless signal processing domain. We introduce the Lie group of transformations that a signal experiences under the multipath propagation model and define operations that are equivariant and invariant to the frequency response of a Finite Impulse Response (FIR) filter to build a ChaRRNet. We present results using synthetic and real-world datasets, and we benchmark against a strong baseline model, that show the efficacy of our approach. Our results provide evidence of the benefits of incorporating appropriate wireless domain biases into DL models. We hope to spur new work in the area of robust RF machine learning, as the 5G revolution increases demand for enhanced security mechanisms. △ Less

Submitted 7 May, 2021; originally announced May 2021.

arXiv:2104.09693 [pdf, other]

Demystifying Regular Expression Bugs: A comprehensive study on regular expression bug causes, fixes, and testing

Authors: Peipei Wang, Chris Brown, Jamie A. Jennings, Kathryn T. Stolee

Abstract: Regular expressions cause string-related bugs and open security vulnerabilities for DOS attacks. However, beyond ReDoS (Regular expression Denial of Service), little is known about the extent to which regular expression issues affect software development and how these issues are addressed in practice. We conduct an empirical study of 356 merged regex-related pull request bugs from Apache, Mozilla,… ▽ More Regular expressions cause string-related bugs and open security vulnerabilities for DOS attacks. However, beyond ReDoS (Regular expression Denial of Service), little is known about the extent to which regular expression issues affect software development and how these issues are addressed in practice. We conduct an empirical study of 356 merged regex-related pull request bugs from Apache, Mozilla, Facebook, and Google GitHub repositories. We identify and classify the nature of the regular expression problems, the fixes, and the related changes in the test code. The most important findings in this paper are as follows: 1) incorrect regular expression behavior is the dominant root cause of regular expression bugs (165/356, 46.3%). The remaining root causes are incorrect API usage (9.3%) and other code issues that require regular expression changes in the fix (29.5%), 2) fixing regular expression bugs is nontrivial as it takes more time and more lines of code to fix them compared to the general pull requests, 3) most (51%) of the regex-related pull requests do not contain test code changes. Certain regex bug types (e.g., compile error, performance issues, regex representation) are less likely to include test code changes than others, and 4) the dominant type of test code changes in regex-related pull requests is test case addition (75%). The results of this study contribute to a broader understanding of the practical problems faced by developers when using, fixing, and testing regular expressions. △ Less

Submitted 19 April, 2021; originally announced April 2021.

arXiv:2104.00086 [pdf, other]

An Online Survey on the Perception of Mediated Social Touch Interaction and Device Design

Authors: Carine Rognon, Taylor Bunge, Meiyuzi Gao, Chip Connor, Benjamin Stephens-Fripp, Casey Brown, Ali Israr

Abstract: Social touch is essential for our social interactions, communication, and well-being. It has been shown to reduce anxiety and loneliness; and is a key channel to transmit emotions for which words are not sufficient, such as love, sympathy, reassurance, etc. However, direct physical contact is not always possible due to being remotely located, interacting in a virtual environment, or as a result of… ▽ More Social touch is essential for our social interactions, communication, and well-being. It has been shown to reduce anxiety and loneliness; and is a key channel to transmit emotions for which words are not sufficient, such as love, sympathy, reassurance, etc. However, direct physical contact is not always possible due to being remotely located, interacting in a virtual environment, or as a result of a health issue. Mediated social touch enables physical interactions, despite the distance, by transmitting the haptic cues that constitute social touch through devices. As this technology is fairly new, the users' needs and their expectations on a device design and its features are unclear, as well as who would use this technology, and in which conditions. To better understand these aspects of the mediated interaction, we conducted an online survey on 258 respondents located in the USA. Results give insights on the type of interactions and device features that the US population would like to use. △ Less

Submitted 31 March, 2021; originally announced April 2021.

arXiv:2103.09685 [pdf, other]

Nudging Students Toward Better Software Engineering Behaviors

Authors: Chris Brown, Chris Parnin

Abstract: Student experiences in large undergraduate Computer Science courses are increasingly impacted by automated systems. Bots, or agents of software automation, are useful for efficiently grading and generating feedback. Current efforts at automation in CS education focus on supporting instructional tasks, but do not address student struggles due to poor behaviors, such as procrastination. In this pape… ▽ More Student experiences in large undergraduate Computer Science courses are increasingly impacted by automated systems. Bots, or agents of software automation, are useful for efficiently grading and generating feedback. Current efforts at automation in CS education focus on supporting instructional tasks, but do not address student struggles due to poor behaviors, such as procrastination. In this paper, we explore using bots to improve the software engineering behaviors of students using developer recommendation choice architectures, a framework incorporating behavioral science concepts in recommendations to improve the actions of programmers. We implemented this framework in class-bot, a novel system designed to nudge students to make better choices while working on programming assignments. This work presents a preliminary evaluation integrating this tool in an introductory programming course. Our results show that class-bot is beneficial for improving student development behaviors increasing code quality and productivity. △ Less

Submitted 17 March, 2021; originally announced March 2021.

Comments: 5 pages, 1 figure, BotSE 2021

arXiv:2103.02465 [pdf]

Arthroscopic Multi-Spectral Scene Segmentation Using Deep Learning

Authors: Shahnewaz Ali, Dr. Yaqub Jonmohamadi, Yu Takeda, Jonathan Roberts, Ross Crawford, Cameron Brown, Dr. Ajay K. Pandey

Abstract: Knee arthroscopy is a minimally invasive surgical (MIS) procedure which is performed to treat knee-joint ailment. Lack of visual information of the surgical site obtained from miniaturized cameras make this surgical procedure more complex. Knee cavity is a very confined space; therefore, surgical scenes are captured at close proximity. Insignificant context of knee atlas often makes them unrecogni… ▽ More Knee arthroscopy is a minimally invasive surgical (MIS) procedure which is performed to treat knee-joint ailment. Lack of visual information of the surgical site obtained from miniaturized cameras make this surgical procedure more complex. Knee cavity is a very confined space; therefore, surgical scenes are captured at close proximity. Insignificant context of knee atlas often makes them unrecognizable as a consequence unintentional tissue damage often occurred and shows a long learning curve to train new surgeons. Automatic context awareness through labeling of the surgical site can be an alternative to mitigate these drawbacks. However, from the previous studies, it is confirmed that the surgical site exhibits several limitations, among others, lack of discriminative contextual information such as texture and features which drastically limits this vision task. Additionally, poor imaging conditions and lack of accurate ground-truth labels are also limiting the accuracy. To mitigate these limitations of knee arthroscopy, in this work we proposed a scene segmentation method that successfully segments multi structures. △ Less

Submitted 3 March, 2021; originally announced March 2021.

arXiv:2102.13468 [pdf, other]

The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 Cough, COVID-19 Speech, Escalation & Primates

Authors: Björn W. Schuller, Anton Batliner, Christian Bergler, Cecilia Mascolo, Jing Han, Iulia Lefter, Heysem Kaya, Shahin Amiriparian, Alice Baird, Lukas Stappen, Sandra Ottl, Maurice Gerczuk, Panagiotis Tzirakis, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Leon J. M. Rothkrantz, Joeri Zwerts, Jelle Treep, Casper Kaandorp

Abstract: The INTERSPEECH 2021 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the COVID-19 Cough and COVID-19 Speech Sub-Challenges, a binary classification on COVID-19 infection has to be made based on coughing sounds and speech; in the Escalation SubChallenge, a three-way assessment of the level of es… ▽ More The INTERSPEECH 2021 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the COVID-19 Cough and COVID-19 Speech Sub-Challenges, a binary classification on COVID-19 infection has to be made based on coughing sounds and speech; in the Escalation SubChallenge, a three-way assessment of the level of escalation in a dialogue is featured; and in the Primates Sub-Challenge, four species vs background need to be classified. We describe the Sub-Challenges, baseline feature extraction, and classifiers based on the 'usual' COMPARE and BoAW features as well as deep unsupervised representation learning using the AuDeep toolkit, and deep feature extraction from pre-trained CNNs using the Deep Spectrum toolkit; in addition, we add deep end-to-end sequential modelling, and partially linguistic analysis. △ Less

Submitted 24 February, 2021; originally announced February 2021.

Comments: 5 pages

MSC Class: 68 ACM Class: I.2.7; I.5.0; J.3

arXiv:2102.05225 [pdf, other]

doi 10.1109/ICASSP39728.2021.9414576

Exploring Automatic COVID-19 Diagnosis via voice and symptoms from Crowdsourced Data

Authors: Jing Han, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Cecilia Mascolo

Abstract: The development of fast and accurate screening tools, which could facilitate testing and prevent more costly clinical tests, is key to the current pandemic of COVID-19. In this context, some initial work shows promise in detecting diagnostic signals of COVID-19 from audio sounds. In this paper, we propose a voice-based framework to automatically detect individuals who have tested positive for COVI… ▽ More The development of fast and accurate screening tools, which could facilitate testing and prevent more costly clinical tests, is key to the current pandemic of COVID-19. In this context, some initial work shows promise in detecting diagnostic signals of COVID-19 from audio sounds. In this paper, we propose a voice-based framework to automatically detect individuals who have tested positive for COVID-19. We evaluate the performance of the proposed framework on a subset of data crowdsourced from our app, containing 828 samples from 343 participants. By combining voice signals and reported symptoms, an AUC of $0.79$ has been attained, with a sensitivity of $0.68$ and a specificity of $0.82$. We hope that this study opens the door to rapid, low-cost, and convenient pre-screening tools to automatically detect the disease. △ Less

Submitted 9 February, 2021; originally announced February 2021.

Comments: 5 pages, 3 figures, 2 tables, Accepted for publication at ICASSP 2021

arXiv:2101.12355 [pdf, other]

A Petri Dish for Histopathology Image Analysis

Authors: Jerry Wei, Arief Suriawinata, Bing Ren, Xiaoying Liu, Mikhail Lisovsky, Louis Vaickus, Charles Brown, Michael Baker, Naofumi Tomita, Lorenzo Torresani, Jason Wei, Saeed Hassanpour

Abstract: With the rise of deep learning, there has been increased interest in using neural networks for histopathology image analysis, a field that investigates the properties of biopsy or resected specimens traditionally manually examined under a microscope by pathologists. However, challenges such as limited data, costly annotation, and processing high-resolution and variable-size images make it difficul… ▽ More With the rise of deep learning, there has been increased interest in using neural networks for histopathology image analysis, a field that investigates the properties of biopsy or resected specimens traditionally manually examined under a microscope by pathologists. However, challenges such as limited data, costly annotation, and processing high-resolution and variable-size images make it difficult to quickly iterate over model designs. Throughout scientific history, many significant research directions have leveraged small-scale experimental setups as petri dishes to efficiently evaluate exploratory ideas. In this paper, we introduce a minimalist histopathology image analysis dataset (MHIST), an analogous petri dish for histopathology image analysis. MHIST is a binary classification dataset of 3,152 fixed-size images of colorectal polyps, each with a gold-standard label determined by the majority vote of seven board-certified gastrointestinal pathologists and annotator agreement level. MHIST occupies less than 400 MB of disk space, and a ResNet-18 baseline can be trained to convergence on MHIST in just 6 minutes using 3.5 GB of memory on a NVIDIA RTX 3090. As example use cases, we use MHIST to study natural questions such as how dataset size, network depth, transfer learning, and high-disagreement examples affect model performance. By introducing MHIST, we hope to not only help facilitate the work of current histopathology imaging researchers, but also make the field more-accessible to the general community. Our dataset is available at https://bmirds.github.io/MHIST. △ Less

Submitted 27 March, 2021; v1 submitted 28 January, 2021; originally announced January 2021.

Comments: In proceedings of Artificial Intelligence in Medicine (AIME) 2021

arXiv:2012.05983 [pdf, other]

Towards Neural Programming Interfaces

Authors: Zachary C. Brown, Nathaniel Robinson, David Wingate, Nancy Fulda

Abstract: It is notoriously difficult to control the behavior of artificial neural networks such as generative neural language models. We recast the problem of controlling natural language generation as that of learning to interface with a pretrained language model, just as Application Programming Interfaces (APIs) control the behavior of programs by altering hyperparameters. In this new paradigm, a special… ▽ More It is notoriously difficult to control the behavior of artificial neural networks such as generative neural language models. We recast the problem of controlling natural language generation as that of learning to interface with a pretrained language model, just as Application Programming Interfaces (APIs) control the behavior of programs by altering hyperparameters. In this new paradigm, a specialized neural network (called a Neural Programming Interface or NPI) learns to interface with a pretrained language model by manipulating the hidden activations of the pretrained model to produce desired outputs. Importantly, no permanent changes are made to the weights of the original model, allowing us to re-purpose pretrained models for new tasks without overwriting any aspect of the language model. We also contribute a new data set construction algorithm and GAN-inspired loss function that allows us to train NPI models to control outputs of autoregressive transformers. In experiments against other state-of-the-art approaches, we demonstrate the efficacy of our methods using OpenAI's GPT-2 model, successfully controlling noun selection, topic aversion, offensive speech filtering, and other aspects of language while largely maintaining the controlled model's fluency under deterministic settings. △ Less

Submitted 17 February, 2021; v1 submitted 10 December, 2020; originally announced December 2020.

Comments: 24 pages total (13 for main paper and references, 11 for Appendix 1), accepted for publication in Advances in Neural Information Processing Systems 33 (NeurIPS 2020)

Journal ref: Neural Information Processing Systems 33 (2020) 17416-17428

arXiv:2012.00878 [pdf]

doi 10.1021/acsnano.1c00079

Neural network-based on-chip spectroscopy using a scalable plasmonic encoder

Authors: Calvin Brown, Artem Goncharov, Zachary Ballard, Mason Fordham, Ashley Clemens, Yunzhe Qiu, Yair Rivenson, Aydogan Ozcan

Abstract: Conventional spectrometers are limited by trade-offs set by size, cost, signal-to-noise ratio (SNR), and spectral resolution. Here, we demonstrate a deep learning-based spectral reconstruction framework, using a compact and low-cost on-chip sensing scheme that is not constrained by the design trade-offs inherent to grating-based spectroscopy. The system employs a plasmonic spectral encoder chip co… ▽ More Conventional spectrometers are limited by trade-offs set by size, cost, signal-to-noise ratio (SNR), and spectral resolution. Here, we demonstrate a deep learning-based spectral reconstruction framework, using a compact and low-cost on-chip sensing scheme that is not constrained by the design trade-offs inherent to grating-based spectroscopy. The system employs a plasmonic spectral encoder chip containing 252 different tiles of nanohole arrays fabricated using a scalable and low-cost imprint lithography method, where each tile has a unique geometry and, thus, a unique optical transmission spectrum. The illumination spectrum of interest directly impinges upon the plasmonic encoder, and a CMOS image sensor captures the transmitted light, without any lenses, gratings, or other optical components in between, making the entire hardware highly compact, light-weight and field-portable. A trained neural network then reconstructs the unknown spectrum using the transmitted intensity information from the spectral encoder in a feed-forward and non-iterative manner. Benefiting from the parallelization of neural networks, the average inference time per spectrum is ~28 microseconds, which is orders of magnitude faster compared to other computational spectroscopy approaches. When blindly tested on unseen new spectra (N = 14,648) with varying complexity, our deep-learning based system identified 96.86% of the spectral peaks with an average peak localization error, bandwidth error, and height error of 0.19 nm, 0.18 nm, and 7.60%, respectively. This system is also highly tolerant to fabrication defects that may arise during the imprint lithography process, which further makes it ideal for applications that demand cost-effective, field-portable and sensitive high-resolution spectroscopy tools. △ Less

Submitted 1 December, 2020; originally announced December 2020.

Comments: 18 pages, 6 figures

Journal ref: ACS Nano (2021)

arXiv:2009.13698 [pdf, other]

Learn like a Pathologist: Curriculum Learning by Annotator Agreement for Histopathology Image Classification

Authors: Jerry Wei, Arief Suriawinata, Bing Ren, Xiaoying Liu, Mikhail Lisovsky, Louis Vaickus, Charles Brown, Michael Baker, Mustafa Nasir-Moin, Naofumi Tomita, Lorenzo Torresani, Jason Wei, Saeed Hassanpour

Abstract: Applying curriculum learning requires both a range of difficulty in data and a method for determining the difficulty of examples. In many tasks, however, satisfying these requirements can be a formidable challenge. In this paper, we contend that histopathology image classification is a compelling use case for curriculum learning. Based on the nature of histopathology images, a range of difficulty… ▽ More Applying curriculum learning requires both a range of difficulty in data and a method for determining the difficulty of examples. In many tasks, however, satisfying these requirements can be a formidable challenge. In this paper, we contend that histopathology image classification is a compelling use case for curriculum learning. Based on the nature of histopathology images, a range of difficulty inherently exists among examples, and, since medical datasets are often labeled by multiple annotators, annotator agreement can be used as a natural proxy for the difficulty of a given example. Hence, we propose a simple curriculum learning method that trains on progressively-harder images as determined by annotator agreement. We evaluate our hypothesis on the challenging and clinically-important task of colorectal polyp classification. Whereas vanilla training achieves an AUC of 83.7% for this task, a model trained with our proposed curriculum learning approach achieves an AUC of 88.2%, an improvement of 4.5%. Our work aims to inspire researchers to think more creatively and rigorously when choosing contexts for applying curriculum learning. △ Less

Submitted 28 September, 2020; originally announced September 2020.

arXiv:2006.05919 [pdf, other]

doi 10.1145/3394486.3412865

Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data

Authors: Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Jing Han, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Cecilia Mascolo

Abstract: Audio signals generated by the human body (e.g., sighs, breathing, heart, digestion, vibration sounds) have routinely been used by clinicians as indicators to diagnose disease or assess disease progression. Until recently, such signals were usually collected through manual auscultation at scheduled visits. Research has now started to use digital technology to gather bodily sounds (e.g., from digit… ▽ More Audio signals generated by the human body (e.g., sighs, breathing, heart, digestion, vibration sounds) have routinely been used by clinicians as indicators to diagnose disease or assess disease progression. Until recently, such signals were usually collected through manual auscultation at scheduled visits. Research has now started to use digital technology to gather bodily sounds (e.g., from digital stethoscopes) for cardiovascular or respiratory examination, which could then be used for automatic analysis. Some initial work shows promise in detecting diagnostic signals of COVID-19 from voice and coughs. In this paper we describe our data analysis over a large-scale crowdsourced dataset of respiratory sounds collected to aid diagnosis of COVID-19. We use coughs and breathing to understand how discernible COVID-19 sounds are from those in asthma or healthy controls. Our results show that even a simple binary machine learning classifier is able to classify correctly healthy and COVID-19 sounds. We also show how we distinguish a user who tested positive for COVID-19 and has a cough from a healthy user with a cough, and users who tested positive for COVID-19 and have a cough from users with asthma and a cough. Our models achieve an AUC of above 80% across all tasks. These results are preliminary and only scratch the surface of the potential of this type of data and audio-based machine learning. This work opens the door to further investigation of how automatically analysed respiratory patterns could be used as pre-screening signals to aid COVID-19 diagnosis. △ Less

Submitted 18 January, 2021; v1 submitted 10 June, 2020; originally announced June 2020.

Comments: 9 pages, 6 figures, 2 tables, Accepted for publication at KDD'20 (Health Day)

arXiv:2004.06997 [pdf, ps, other]

Prolog Technology Reinforcement Learning Prover

Authors: Zsolt Zombori, Josef Urban, Chad E. Brown

Abstract: We present a reinforcement learning toolkit for experiments with guiding automated theorem proving in the connection calculus. The core of the toolkit is a compact and easy to extend Prolog-based automated theorem prover called plCoP. plCoP builds on the leanCoP Prolog implementation and adds learning-guided Monte-Carlo Tree Search as done in the rlCoP system. Other components include a Python int… ▽ More We present a reinforcement learning toolkit for experiments with guiding automated theorem proving in the connection calculus. The core of the toolkit is a compact and easy to extend Prolog-based automated theorem prover called plCoP. plCoP builds on the leanCoP Prolog implementation and adds learning-guided Monte-Carlo Tree Search as done in the rlCoP system. Other components include a Python interface to plCoP and machine learners, and an external proof checker that verifies the validity of plCoP proofs. The toolkit is evaluated on two benchmarks and we demonstrate its extendability by two additions: (1) guidance is extended to reduction steps and (2) the standard leanCoP calculus is extended with rewrite steps and their learned guidance. We argue that the Prolog setting is suitable for combining statistical and symbolic learning methods. The complete toolkit is publicly released. △ Less

Submitted 15 April, 2020; originally announced April 2020.

arXiv:2004.06149 [pdf, other]

Local Model Feature Transformations

Authors: CScott Brown

Abstract: Local learning methods are a popular class of machine learning algorithms. The basic idea for the entire cadre is to choose some non-local model family, to train many of them on small sections of neighboring data, and then to `stitch' the resulting models together in some way. Due to the limits of constraining a training dataset to a small neighborhood, research on locally-learned models has large… ▽ More Local learning methods are a popular class of machine learning algorithms. The basic idea for the entire cadre is to choose some non-local model family, to train many of them on small sections of neighboring data, and then to `stitch' the resulting models together in some way. Due to the limits of constraining a training dataset to a small neighborhood, research on locally-learned models has largely been restricted to simple model families. Also, since simple model families have no complex structure by design, this has limited use of the individual local models to predictive tasks. We hypothesize that, using a sufficiently complex local model family, various properties of the individual local models, such as their learned parameters, can be used as features for further learning. This dissertation improves upon the current state of research and works toward establishing this hypothesis by investigating algorithms for localization of more complex model families and by studying their applications beyond predictions as a feature extraction mechanism. We summarize this generic technique of using local models as a feature extraction step with the term ``local model feature transformations.'' In this document, we extend the local modeling paradigm to Gaussian processes, orthogonal quadric models and word embedding models, and extend the existing theory for localized linear classifiers. We then demonstrate applications of local model feature transformations to epileptic event classification from EEG readings, activity monitoring via chest accelerometry, 3D surface reconstruction, 3D point cloud segmentation, handwritten digit classification and event detection from Twitter feeds. △ Less

Submitted 13 April, 2020; originally announced April 2020.

arXiv:1912.02636 [pdf, other]

doi 10.1145/3372885.3373827

Exploration of Neural Machine Translation in Autoformalization of Mathematics in Mizar

Authors: Qingxiang Wang, Chad Brown, Cezary Kaliszyk, Josef Urban

Abstract: In this paper we share several experiments trying to automatically translate informal mathematics into formal mathematics. In our context informal mathematics refers to human-written mathematical sentences in the LaTeX format; and formal mathematics refers to statements in the Mizar language. We conducted our experiments against three established neural network-based machine translation models tha… ▽ More In this paper we share several experiments trying to automatically translate informal mathematics into formal mathematics. In our context informal mathematics refers to human-written mathematical sentences in the LaTeX format; and formal mathematics refers to statements in the Mizar language. We conducted our experiments against three established neural network-based machine translation models that are known to deliver competitive results on translating between natural languages. To train these models we also prepared four informal-to-formal datasets. We compare and analyze our results according to whether the model is supervised or unsupervised. In order to augment the data available for auto-formalization and improve the results, we develop a custom type-elaboration mechanism and integrate it in the supervised translation. △ Less

Submitted 13 December, 2019; v1 submitted 5 December, 2019; originally announced December 2019.

Comments: The 9th ACM SIGPLAN International Conference on Certified Programs and Proofs

arXiv:1912.01525 [pdf, ps, other]

Self-Learned Formula Synthesis in Set Theory

Authors: Chad E. Brown, Thibault Gauthier

Abstract: A reinforcement learning algorithm accomplishes the task of synthesizing a set-theoretical formula that evaluates to given truth values for given assignments. A reinforcement learning algorithm accomplishes the task of synthesizing a set-theoretical formula that evaluates to given truth values for given assignments. △ Less

Submitted 3 December, 2019; originally announced December 2019.

arXiv:1911.04873 [pdf, ps, other]

Can Neural Networks Learn Symbolic Rewriting?

Authors: Bartosz Piotrowski, Josef Urban, Chad E. Brown, Cezary Kaliszyk

Abstract: This work investigates if the current neural architectures are adequate for learning symbolic rewriting. Two kinds of data sets are proposed for this research -- one based on automated proofs and the other being a synthetic set of polynomial terms. The experiments with use of the current neural machine translation models are performed and its results are discussed. Ideas for extending this line of… ▽ More This work investigates if the current neural architectures are adequate for learning symbolic rewriting. Two kinds of data sets are proposed for this research -- one based on automated proofs and the other being a synthetic set of polynomial terms. The experiments with use of the current neural machine translation models are performed and its results are discussed. Ideas for extending this line of research are proposed, and its relevance is motivated. △ Less

Submitted 26 May, 2020; v1 submitted 7 November, 2019; originally announced November 2019.

arXiv:1910.11370 [pdf]

A perspective on Microscopy Metadata: data provenance and quality control

Authors: Maximiliaan Huisman, Mathias Hammer, Alex Rigano, Ulrike Boehm, James J. Chambers, Nathalie Gaudreault, Alison J. North, Jaime A. Pimentel, Damir Sudar, Peter Bajcsy, Claire M. Brown, Alexander D. Corbett, Orestis Faklaris, Judith Lacoste, Alex Laude, Glyn Nelson, Roland Nitschke, David Grunwald, Caterina Strambio-De-Castillia

Abstract: The application of microscopy in biomedical research has come a long way since Antonie van Leeuwenhoek discovered unicellular organisms. Countless innovations have positioned light microscopy as a cornerstone of modern biology and a method of choice for connecting omics datasets to their biological and clinical correlates. Still, regardless of how convincing published imaging data looks, it does n… ▽ More The application of microscopy in biomedical research has come a long way since Antonie van Leeuwenhoek discovered unicellular organisms. Countless innovations have positioned light microscopy as a cornerstone of modern biology and a method of choice for connecting omics datasets to their biological and clinical correlates. Still, regardless of how convincing published imaging data looks, it does not always convey meaningful information about the conditions in which it was acquired, processed, and analyzed. Adequate record-keeping, reporting, and quality control are therefore essential to ensure experimental rigor and data fidelity, allow experiments to be reproducibly repeated, and promote the proper evaluation, interpretation, comparison, and re-use. To this end, microscopy images should be accompanied by complete descriptions detailing experimental procedures, biological samples, microscope hardware specifications, image acquisition parameters, and image analysis procedures, as well as metrics accounting for instrument performance and calibration. However, universal, community-accepted Microscopy Metadata standards and reporting specifications that would result in Findable Accessible Interoperable and Reproducible (FAIR) microscopy data have not yet been established. To understand this shortcoming and to propose a way forward, here we provide an overview of the nature of microscopy metadata and its importance for fostering data quality, reproducibility, scientific rigor, and sharing value in light microscopy. The proposal for tiered Microscopy Metadata Specifications that extend the OME Data Model put forth by the 4D Nucleome Initiative and by Bioimaging North America [1-3] as well as a suite of three complementary and interoperable tools are being developed to facilitate the process of image data documentation and are presented in related manuscripts [4-6]. △ Less

Submitted 31 May, 2021; v1 submitted 24 October, 2019; originally announced October 2019.

arXiv:1908.04295 [pdf, ps, other]

Interactive coin offerings

Authors: Jason Teutsch, Vitalik Buterin, Christopher Brown

Abstract: Ethereum has emerged as a dynamic platform for exchanging cryptocurrency tokens. While token crowdsales cannot simultaneously guarantee buyers both certainty of valuation and certainty of participation, we show that if each token buyer specifies a desired purchase quantity at each valuation then everyone can successfully participate. Our implementation introduces smart contract techniques which re… ▽ More Ethereum has emerged as a dynamic platform for exchanging cryptocurrency tokens. While token crowdsales cannot simultaneously guarantee buyers both certainty of valuation and certainty of participation, we show that if each token buyer specifies a desired purchase quantity at each valuation then everyone can successfully participate. Our implementation introduces smart contract techniques which recruit outside participants in order to circumvent computational complexity barriers. △ Less

Submitted 12 August, 2019; originally announced August 2019.

MSC Class: 91B26; 68Q25

arXiv:1907.08368 [pdf, ps, other]

doi 10.1007/978-3-030-23250-4_4

A Tale of Two Set Theories

Authors: Chad E. Brown, Karol Pąk

Abstract: We describe the relationship between two versions of Tarski-Grothendieck set theory: the first-order set theory of Mizar and the higher-order set theory of Egal. We show how certain higher-order terms and propositions in Egal have equivalent first-order presentations. We then prove Tarski's Axiom A (an axiom in Mizar) in Egal and construct a Grothendieck Universe operator (a primitive with axioms… ▽ More We describe the relationship between two versions of Tarski-Grothendieck set theory: the first-order set theory of Mizar and the higher-order set theory of Egal. We show how certain higher-order terms and propositions in Egal have equivalent first-order presentations. We then prove Tarski's Axiom A (an axiom in Mizar) in Egal and construct a Grothendieck Universe operator (a primitive with axioms in Egal) in Mizar. △ Less

Submitted 18 July, 2019; originally announced July 2019.

arXiv:1905.10481 [pdf, other]

doi 10.1145/3307650.3322253

Asymptotic Improvements to Quantum Circuits via Qutrits

Authors: Pranav Gokhale, Jonathan M. Baker, Casey Duckering, Natalie C. Brown, Kenneth R. Brown, Frederic T. Chong

Abstract: Quantum computation is traditionally expressed in terms of quantum bits, or qubits. In this work, we instead consider three-level qu$trits$. Past work with qutrits has demonstrated only constant factor improvements, owing to the $\log_2(3)$ binary-to-ternary compression factor. We present a novel technique using qutrits to achieve a logarithmic depth (runtime) decomposition of the Generalized Toff… ▽ More Quantum computation is traditionally expressed in terms of quantum bits, or qubits. In this work, we instead consider three-level qu$trits$. Past work with qutrits has demonstrated only constant factor improvements, owing to the $\log_2(3)$ binary-to-ternary compression factor. We present a novel technique using qutrits to achieve a logarithmic depth (runtime) decomposition of the Generalized Toffoli gate using no ancilla--a significant improvement over linear depth for the best qubit-only equivalent. Our circuit construction also features a 70x improvement in two-qudit gate count over the qubit-only equivalent decomposition. This results in circuit cost reductions for important algorithms like quantum neurons and Grover search. We develop an open-source circuit simulator for qutrits, along with realistic near-term noise models which account for the cost of operating qutrits. Simulation results for these noise models indicate over 90% mean reliability (fidelity) for our circuit construction, versus under 30% for the qubit-only baseline. These results suggest that qutrits offer a promising path towards scaling quantum computation. △ Less

Submitted 24 May, 2019; originally announced May 2019.

Comments: In ISCA '19: 46th International Symposium on Computer Architecture, 13 pages, 11 figures

arXiv:1904.09193 [pdf, ps, other]

Cantor-Bernstein implies Excluded Middle

Authors: Cécilia Pradic, Chad E. Brown

Abstract: We prove in constructive logic that the statement of the Cantor-Bernstein theorem implies excluded middle. This establishes that the Cantor-Bernstein theorem can only be proven assuming the full power of classical logic. The key ingredient is a theorem of Martín Escardó stating that quantification over a particular subset of the Cantor space $2^{\mathbb{N}}$, the so-called one-point compactifica… ▽ More We prove in constructive logic that the statement of the Cantor-Bernstein theorem implies excluded middle. This establishes that the Cantor-Bernstein theorem can only be proven assuming the full power of classical logic. The key ingredient is a theorem of Martín Escardó stating that quantification over a particular subset of the Cantor space $2^{\mathbb{N}}$, the so-called one-point compactification of $\mathbb{N}$, preserves decidable predicates. △ Less

Submitted 15 August, 2022; v1 submitted 19 April, 2019; originally announced April 2019.

Comments: 6pp / update: corrected an error in the intro wrt applicability of Thm 1, typos, added a couple of acks and a ref

MSC Class: 03F55

arXiv:1903.02539 [pdf, ps, other]

doi 10.1007/978-3-030-29436-6_8

GRUNGE: A Grand Unified ATP Challenge

Authors: Chad E. Brown, Thibault Gauthier, Cezary Kaliszyk, Geoff Sutcliffe, Josef Urban

Abstract: This paper describes a large set of related theorem proving problems obtained by translating theorems from the HOL4 standard library into multiple logical formalisms. The formalisms are in higher-order logic (with and without type variables) and first-order logic (possibly with multiple types, and possibly with type variables). The resultant problem sets allow us to run automated theorem provers t… ▽ More This paper describes a large set of related theorem proving problems obtained by translating theorems from the HOL4 standard library into multiple logical formalisms. The formalisms are in higher-order logic (with and without type variables) and first-order logic (possibly with multiple types, and possibly with type variables). The resultant problem sets allow us to run automated theorem provers that support different logical formats on corresponding problems, and compare their performances. This also results in a new "grand unified" large theory benchmark that emulates the ITP/ATP hammer setting, where systems and metasystems can use multiple ATP formalisms in complementary ways, and jointly learn from the accumulated knowledge. △ Less

Submitted 19 November, 2019; v1 submitted 6 March, 2019; originally announced March 2019.

Comments: CADE 27 -- 27th International Conference on Automated Deduction

Showing 1–50 of 68 results for author: Brown, C