-
Distance Critical Graphs
Authors:
Joshua Cooper,
Gabrielle Tauscheck
Abstract:
In 1971, Graham and Pollak provided a formula for the determinant of the distance matrix of any tree on $n$ vertices. Yan and Yeh reproved this by exploiting the fact that pendant vertices can be deleted from trees without changing the remaining entries of the distance matrix. Considering failures of their argument to generalize invites the question: which graphs have the property that deleting an…
▽ More
In 1971, Graham and Pollak provided a formula for the determinant of the distance matrix of any tree on $n$ vertices. Yan and Yeh reproved this by exploiting the fact that pendant vertices can be deleted from trees without changing the remaining entries of the distance matrix. Considering failures of their argument to generalize invites the question: which graphs have the property that deleting any one vertex results in a change to some pairwise distance? We refer to such worst-case graphs as ``distance critical''. This work explores the structural properties of distance critical graphs, preservation of distance-criticality by products, and the nature of extremal distance critical graphs. We end with a few open questions.
△ Less
Submitted 15 May, 2024;
originally announced May 2024.
-
Materials science in the era of large language models: a perspective
Authors:
Ge Lei,
Ronan Docherty,
Samuel J. Cooper
Abstract:
Large Language Models (LLMs) have garnered considerable interest due to their impressive natural language capabilities, which in conjunction with various emergent properties make them versatile tools in workflows ranging from complex code generation to heuristic finding for combinatorial problems. In this paper we offer a perspective on their applicability to materials science research, arguing th…
▽ More
Large Language Models (LLMs) have garnered considerable interest due to their impressive natural language capabilities, which in conjunction with various emergent properties make them versatile tools in workflows ranging from complex code generation to heuristic finding for combinatorial problems. In this paper we offer a perspective on their applicability to materials science research, arguing their ability to handle ambiguous requirements across a range of tasks and disciplines mean they could be a powerful tool to aid researchers. We qualitatively examine basic LLM theory, connecting it to relevant properties and techniques in the literature before providing two case studies that demonstrate their use in task automation and knowledge extraction at-scale. At their current stage of development, we argue LLMs should be viewed less as oracles of novel insight, and more as tireless workers that can accelerate and unify exploration across domains. It is our hope that this paper can familiarise material science researchers with the concepts needed to leverage these tools in their own research.
△ Less
Submitted 11 March, 2024;
originally announced March 2024.
-
Blue and Green-Mode Energy-Efficient Chemiresistive Sensor Array Realized by Rapid Ensemble Learning
Authors:
Zeheng Wang,
James Cooper,
Muhammad Usman,
Timothy van der Laan
Abstract:
The rapid advancement of Internet of Things (IoT) necessitates the development of optimized Chemiresistive Sensor (CRS) arrays that are both energy-efficient and capable. This study introduces a novel optimization strategy that employs a rapid ensemble learning-based model committee approach to achieve these goals. Utilizing machine learning models such as Elastic Net Regression, Random Forests, a…
▽ More
The rapid advancement of Internet of Things (IoT) necessitates the development of optimized Chemiresistive Sensor (CRS) arrays that are both energy-efficient and capable. This study introduces a novel optimization strategy that employs a rapid ensemble learning-based model committee approach to achieve these goals. Utilizing machine learning models such as Elastic Net Regression, Random Forests, and XGBoost, among others, the strategy identifies the most impactful sensors in a CRS array for accurate classification: A weighted voting mechanism is introduced to aggregate the models' opinions in sensor selection, thereby setting up wo distinct working modes, termed "Blue" and "Green". The Blue mode operates with all sensors for maximum detection capability, while the Green mode selectively activates only key sensors, significantly reducing energy consumption without compromising detection accuracy. The strategy is validated through theoretical calculations and Monte Carlo simulations, demonstrating its effectiveness and accuracy. The proposed optimization strategy not only elevates the detection capability of CRS arrays but also brings it closer to theoretical limits, promising significant implications for the development of low-cost, easily fabricable next-generation IoT sensor terminals.
△ Less
Submitted 3 March, 2024;
originally announced March 2024.
-
Toward TransfORmers: Revolutionizing the Solution of Mixed Integer Programs with Transformers
Authors:
Joshua F. Cooper,
Seung Jin Choi,
I. Esra Buyuktahtakin
Abstract:
In this study, we introduce an innovative deep learning framework that employs a transformer model to address the challenges of mixed-integer programs, specifically focusing on the Capacitated Lot Sizing Problem (CLSP). Our approach, to our knowledge, is the first to utilize transformers to predict the binary variables of a mixed-integer programming (MIP) problem. Specifically, our approach harnes…
▽ More
In this study, we introduce an innovative deep learning framework that employs a transformer model to address the challenges of mixed-integer programs, specifically focusing on the Capacitated Lot Sizing Problem (CLSP). Our approach, to our knowledge, is the first to utilize transformers to predict the binary variables of a mixed-integer programming (MIP) problem. Specifically, our approach harnesses the encoder decoder transformer's ability to process sequential data, making it well-suited for predicting binary variables indicating production setup decisions in each period of the CLSP. This problem is inherently dynamic, and we need to handle sequential decision making under constraints. We present an efficient algorithm in which CLSP solutions are learned through a transformer neural network. The proposed post-processed transformer algorithm surpasses the state-of-the-art solver, CPLEX and Long Short-Term Memory (LSTM) in solution time, optimal gap, and percent infeasibility over 240K benchmark CLSP instances tested. After the ML model is trained, conducting inference on the model, reduces the MIP into a linear program (LP). This transforms the ML-based algorithm, combined with an LP solver, into a polynomial-time approximation algorithm to solve a well-known NP-Hard problem, with almost perfect solution quality.
△ Less
Submitted 24 May, 2024; v1 submitted 20 February, 2024;
originally announced February 2024.
-
SAMBA: A Trainable Segmentation Web-App with Smart Labelling
Authors:
Ronan Docherty,
Isaac Squires,
Antonis Vamvakeros,
Samuel J. Cooper
Abstract:
Segmentation is the assigning of a semantic class to every pixel in an image and is a prerequisite for various statistical analysis tasks in materials science, like phase quantification, physics simulations or morphological characterization. The wide range of length scales, imaging techniques and materials studied in materials science means any segmentation algorithm must generalise to unseen data…
▽ More
Segmentation is the assigning of a semantic class to every pixel in an image and is a prerequisite for various statistical analysis tasks in materials science, like phase quantification, physics simulations or morphological characterization. The wide range of length scales, imaging techniques and materials studied in materials science means any segmentation algorithm must generalise to unseen data and support abstract, user-defined semantic classes. Trainable segmentation is a popular interactive segmentation paradigm where a classifier is trained to map from image features to user drawn labels. SAMBA is a trainable segmentation tool that uses Meta's Segment Anything Model (SAM) for fast, high-quality label suggestions and a random forest classifier for robust, generalizable segmentations. It is accessible in the browser (https://www.sambasegment.com/) without the need to download any external dependencies. The segmentation backend is run in the cloud, so does not require the user to have powerful hardware.
△ Less
Submitted 7 December, 2023;
originally announced December 2023.
-
A Comparative Analysis of Text-to-Image Generative AI Models in Scientific Contexts: A Case Study on Nuclear Power
Authors:
Veda Joynt,
Jacob Cooper,
Naman Bhargava,
Katie Vu,
O Hwang Kwon,
Todd R. Allen,
Aditi Verma,
Majdi I. Radaideh
Abstract:
In this work, we propose and assess the potential of generative artificial intelligence (AI) to generate public engagement around potential clean energy sources. Such an application could increase energy literacy -- an awareness of low-carbon energy sources among the public therefore leading to increased participation in decision-making about the future of energy systems. We explore the use of gen…
▽ More
In this work, we propose and assess the potential of generative artificial intelligence (AI) to generate public engagement around potential clean energy sources. Such an application could increase energy literacy -- an awareness of low-carbon energy sources among the public therefore leading to increased participation in decision-making about the future of energy systems. We explore the use of generative AI to communicate technical information about low-carbon energy sources to the general public, specifically in the realm of nuclear energy. We explored 20 AI-powered text-to-image generators and compared their individual performances on general and scientific nuclear-related prompts. Of these models, DALL-E, DreamStudio, and Craiyon demonstrated promising performance in generating relevant images from general-level text related to nuclear topics. However, these models fall short in three crucial ways: (1) they fail to accurately represent technical details of energy systems; (2) they reproduce existing biases surrounding gender and work in the energy sector; and (3) they fail to accurately represent indigenous landscapes -- which have historically been sites of resource extraction and waste deposition for energy industries. This work is performed to motivate the development of specialized generative tools and their captions to improve energy literacy and effectively engage the public with low-carbon energy sources.
△ Less
Submitted 2 December, 2023;
originally announced December 2023.
-
White paper on cybersecurity in the healthcare sector. The HEIR solution
Authors:
Konstantinos Lampropoulos,
Apostolis Zarras,
Eftychia Lakka,
Polyanthi Barmpaki,
Kostas Drakonakis,
Manos Athanatos,
Herve Debar,
Andreas Alexopoulos,
Aristeidis Sotiropoulos,
George Tsakirakis,
Nikos Dimakopoulos,
Dimitris Tsolovos,
Matthias Pocs,
Michalis Smyrlis,
Ioannis Basdekis,
Georgios Spanoudakis,
Ovidiu Mihaila,
Bogdan Prelipcean,
Eliot Salant,
Sotiris Athanassopoulos,
Petros Papachristou,
Ioannis Ladakis,
John Chang,
Evangelos Floros,
Konstantinos Smyrlis
, et al. (7 additional authors not shown)
Abstract:
The healthcare sector is increasingly vulnerable to cyberattacks due to its growing digitalization. Patient data, including medical records and financial information, are at risk, potentially leading to identity theft and patient safety concerns. The European Union and other organizations identify key areas for healthcare system improvement, yet the industry still grapples with inadequate security…
▽ More
The healthcare sector is increasingly vulnerable to cyberattacks due to its growing digitalization. Patient data, including medical records and financial information, are at risk, potentially leading to identity theft and patient safety concerns. The European Union and other organizations identify key areas for healthcare system improvement, yet the industry still grapples with inadequate security practices. In response, the HEIR project offers a comprehensive cybersecurity approach, promoting security features from various regulatory frameworks and introducing tools such as the Secure Healthcare Framework and Risk Assessment for Medical Applications (RAMA). These measures aim to enhance digital health security and protect sensitive patient data while facilitating secure data access and privacy-aware techniques. In a rapidly evolving threat landscape, HEIR presents a promising framework for healthcare cybersecurity.
△ Less
Submitted 16 October, 2023;
originally announced October 2023.
-
HRTF upsampling with a generative adversarial network using a gnomonic equiangular projection
Authors:
Aidan O. T. Hogg,
Mads Jenkins,
He Liu,
Isaac Squires,
Samuel J. Cooper,
Lorenzo Picinali
Abstract:
An individualised head-related transfer function (HRTF) is very important for creating realistic virtual reality (VR) and augmented reality (AR) environments. However, acoustically measuring high-quality HRTFs requires expensive equipment and an acoustic lab setting. To overcome these limitations and to make this measurement more efficient HRTF upsampling has been exploited in the past where a hig…
▽ More
An individualised head-related transfer function (HRTF) is very important for creating realistic virtual reality (VR) and augmented reality (AR) environments. However, acoustically measuring high-quality HRTFs requires expensive equipment and an acoustic lab setting. To overcome these limitations and to make this measurement more efficient HRTF upsampling has been exploited in the past where a high-resolution HRTF is created from a low-resolution one. This paper demonstrates how generative adversarial networks (GANs) can be applied to HRTF upsampling. We propose a novel approach that transforms the HRTF data for direct use with a convolutional super-resolution generative adversarial network (SRGAN). This new approach is benchmarked against three baselines: barycentric upsampling, spherical harmonic (SH) upsampling and an HRTF selection approach. Experimental results show that the proposed method outperforms all three baselines in terms of log-spectral distortion (LSD) and localisation performance using perceptual models when the input HRTF is sparse (less than 20 measured positions).
△ Less
Submitted 27 February, 2024; v1 submitted 9 June, 2023;
originally announced June 2023.
-
A Generalization of the Graham-Pollak Tree Theorem to Steiner Distance
Authors:
Joshua Cooper,
Gabrielle Tauscheck
Abstract:
Graham and Pollak showed that the determinant of the distance matrix of a tree $T$ depends only on the number of vertices of $T$. Graphical distance, a function of pairs of vertices, can be generalized to ``Steiner distance'' of sets $S$ of vertices of arbitrary size, by defining it to be the fewest edges in any connected subgraph containing all of $S$. Here, we show that the same is true for tree…
▽ More
Graham and Pollak showed that the determinant of the distance matrix of a tree $T$ depends only on the number of vertices of $T$. Graphical distance, a function of pairs of vertices, can be generalized to ``Steiner distance'' of sets $S$ of vertices of arbitrary size, by defining it to be the fewest edges in any connected subgraph containing all of $S$. Here, we show that the same is true for trees' {\em Steiner distance hypermatrix} of all odd orders, whereas the theorem of Graham-Pollak concerns order $2$. We conjecture that the statement holds for all even orders as well.
△ Less
Submitted 31 May, 2023;
originally announced June 2023.
-
Two approaches to inpainting microstructure with deep convolutional generative adversarial networks
Authors:
Isaac Squires,
Samuel J. Cooper,
Amir Dahari,
Steve Kench
Abstract:
Imaging is critical to the characterisation of materials. However, even with careful sample preparation and microscope calibration, imaging techniques are often prone to defects and unwanted artefacts. This is particularly problematic for applications where the micrograph is to be used for simulation or feature analysis, as defects are likely to lead to inaccurate results. Microstructural inpainti…
▽ More
Imaging is critical to the characterisation of materials. However, even with careful sample preparation and microscope calibration, imaging techniques are often prone to defects and unwanted artefacts. This is particularly problematic for applications where the micrograph is to be used for simulation or feature analysis, as defects are likely to lead to inaccurate results. Microstructural inpainting is a method to alleviate this problem by replacing occluded regions with synthetic microstructure with matching boundaries. In this paper we introduce two methods that use generative adversarial networks to generate contiguous inpainted regions of arbitrary shape and size by learning the microstructural distribution from the unoccluded data. We find that one benefits from high speed and simplicity, whilst the other gives smoother boundaries at the inpainting border. We also outline the development of a graphical user interface that allows users to utilise these machine learning methods in a 'no-code' environment.
△ Less
Submitted 13 October, 2022;
originally announced October 2022.
-
MicroLib: A library of 3D microstructures generated from 2D micrographs using SliceGAN
Authors:
Steve Kench,
Isaac Squires,
Amir Dahari,
Samuel J Cooper
Abstract:
3D microstructural datasets are commonly used to define the geometrical domains used in finite element modelling. This has proven a useful tool for understanding how complex material systems behave under applied stresses, temperatures and chemical conditions. However, 3D imaging of materials is challenging for a number of reasons, including limited field of view, low resolution and difficult sampl…
▽ More
3D microstructural datasets are commonly used to define the geometrical domains used in finite element modelling. This has proven a useful tool for understanding how complex material systems behave under applied stresses, temperatures and chemical conditions. However, 3D imaging of materials is challenging for a number of reasons, including limited field of view, low resolution and difficult sample preparation. Recently, a machine learning method, SliceGAN, was developed to statistically generate 3D microstructural datasets of arbitrary size using a single 2D input slice as training data. In this paper, we present the results from applying SliceGAN to 87 different microstructures, ranging from biological materials to high-strength steels. To demonstrate the accuracy of the synthetic volumes created by SliceGAN, we compare three microstructural properties between the 2D training data and 3D generations, which show good agreement. This new microstructure library both provides valuable 3D microstructures that can be used in models, and also demonstrates the broad applicability of the SliceGAN algorithm.
△ Less
Submitted 12 October, 2022;
originally announced October 2022.
-
BioSimulators: a central registry of simulation engines and services for recommending specific tools
Authors:
Bilal Shaikh,
Lucian P. Smith,
Dan Vasilescu,
Gnaneswara Marupilla,
Michael Wilson,
Eran Agmon,
Henry Agnew,
Steven S. Andrews,
Azraf Anwar,
Moritz E. Beber,
Frank T. Bergmann,
David Brooks,
Lutz Brusch,
Laurence Calzone,
Kiri Choi,
Joshua Cooper,
John Detloff,
Brian Drawert,
Michel Dumontier,
G. Bard Ermentrout,
James R. Faeder,
Andrew P. Freiburger,
Fabian Fröhlich,
Akira Funahashi,
Alan Garny
, et al. (46 additional authors not shown)
Abstract:
Computational models have great potential to accelerate bioscience, bioengineering, and medicine. However, it remains challenging to reproduce and reuse simulations, in part, because the numerous formats and methods for simulating various subsystems and scales remain siloed by different software tools. For example, each tool must be executed through a distinct interface. To help investigators find…
▽ More
Computational models have great potential to accelerate bioscience, bioengineering, and medicine. However, it remains challenging to reproduce and reuse simulations, in part, because the numerous formats and methods for simulating various subsystems and scales remain siloed by different software tools. For example, each tool must be executed through a distinct interface. To help investigators find and use simulation tools, we developed BioSimulators (https://biosimulators.org), a central registry of the capabilities of simulation tools and consistent Python, command-line, and containerized interfaces to each version of each tool. The foundation of BioSimulators is standards, such as CellML, SBML, SED-ML, and the COMBINE archive format, and validation tools for simulation projects and simulation tools that ensure these standards are used consistently. To help modelers find tools for particular projects, we have also used the registry to develop recommendation services. We anticipate that BioSimulators will help modelers exchange, reproduce, and combine simulations.
△ Less
Submitted 13 March, 2022;
originally announced March 2022.
-
Quantifying the Computational Capability of a Nanomagnetic Reservoir Computing Platform with Emergent Magnetization Dynamics
Authors:
Ian T Vidamour,
Matthew O A Ellis,
David Griffin,
Guru Venkat,
Charles Swindells,
Richard W S Dawidek,
Thomas J Broomhall,
Nina-Juliane Steinke,
Joshaniel F K Cooper,
Francisco Maccherozzi,
Sarnjeet S Dhesi,
Susan Stepney,
Eleni Vasilaki,
Dan A Allwood,
Thomas J Hayward
Abstract:
Devices based on arrays of interconnected magnetic nano-rings with emergent magnetization dynamics have recently been proposed for use in reservoir computing applications, but for them to be computationally useful it must be possible to optimise their dynamical responses. Here, we use a phenomenological model to demonstrate that such reservoirs can be optimised for classification tasks by tuning h…
▽ More
Devices based on arrays of interconnected magnetic nano-rings with emergent magnetization dynamics have recently been proposed for use in reservoir computing applications, but for them to be computationally useful it must be possible to optimise their dynamical responses. Here, we use a phenomenological model to demonstrate that such reservoirs can be optimised for classification tasks by tuning hyperparameters that control the scaling and input rate of data into the system using rotating magnetic fields. We use task-independent metrics to assess the rings' computational capabilities at each set of these hyperparameters and show how these metrics correlate directly to performance in spoken and written digit recognition tasks. We then show that these metrics, and performance in tasks, can be further improved by expanding the reservoir's output to include multiple, concurrent measures of the ring arrays magnetic states.
△ Less
Submitted 31 January, 2022; v1 submitted 29 November, 2021;
originally announced November 2021.
-
Fusion of complementary 2D and 3D mesostructural datasets using generative adversarial networks
Authors:
Amir Dahari,
Steve Kench,
Isaac Squires,
Samuel J. Cooper
Abstract:
Modelling the impact of a material's mesostructure on device level performance typically requires access to 3D image data containing all the relevant information to define the geometry of the simulation domain. This image data must include sufficient contrast between phases to distinguish each material, be of high enough resolution to capture the key details, but also have a large enough field-of-…
▽ More
Modelling the impact of a material's mesostructure on device level performance typically requires access to 3D image data containing all the relevant information to define the geometry of the simulation domain. This image data must include sufficient contrast between phases to distinguish each material, be of high enough resolution to capture the key details, but also have a large enough field-of-view to be representative of the material in general. It is rarely possible to obtain data with all of these properties from a single imaging technique. In this paper, we present a method for combining information from pairs of distinct but complementary imaging techniques in order to accurately reconstruct the desired multi-phase, high resolution, representative, 3D images. Specifically, we use deep convolutional generative adversarial networks to implement super-resolution, style transfer and dimensionality expansion. To demonstrate the widespread applicability of this tool, two pairs of datasets are used to validate the quality of the volumes generated by fusing the information from paired imaging techniques. Three key mesostructural metrics are calculated in each case to show the accuracy of this method. Having confidence in the accuracy of our method, we then demonstrate its power by applying to a real data pair from a lithium ion battery electrode, where the required 3D high resolution image data is not available anywhere in the literature. We believe this approach is superior to previously reported statistical material reconstruction methods both in terms of its fidelity and ease of use. Furthermore, much of the data required to train this algorithm already exists in the literature, waiting to be combined. As such, our open-access code could precipitate a step change by generating the hard to obtain high quality image volumes necessary to simulate behaviour at the mesoscale.
△ Less
Submitted 30 September, 2022; v1 submitted 21 October, 2021;
originally announced October 2021.
-
Hoechst Is All You Need: Lymphocyte Classification with Deep Learning
Authors:
Jessica Cooper,
In Hwa Um,
Ognjen Arandjelović,
David J Harrison
Abstract:
Multiplex immunofluorescence and immunohistochemistry benefit patients by allowing cancer pathologists to identify several proteins expressed on the surface of cells, enabling cell classification, better understanding of the tumour micro-environment, more accurate diagnoses, prognoses, and tailored immunotherapy based on the immune status of individual patients. However, they are expensive and tim…
▽ More
Multiplex immunofluorescence and immunohistochemistry benefit patients by allowing cancer pathologists to identify several proteins expressed on the surface of cells, enabling cell classification, better understanding of the tumour micro-environment, more accurate diagnoses, prognoses, and tailored immunotherapy based on the immune status of individual patients. However, they are expensive and time consuming processes which require complex staining and imaging techniques by expert technicians. Hoechst staining is much cheaper and easier to perform, but is not typically used in this case as it binds to DNA rather than to the proteins targeted by immunofluorescent techniques, and it was not previously thought possible to differentiate cells expressing these proteins based only on DNA morphology. In this work we show otherwise, training a deep convolutional neural network to identify cells expressing three proteins (T lymphocyte markers CD3 and CD8, and the B lymphocyte marker CD20) with greater than 90% precision and recall, from Hoechst 33342 stained tissue only. Our model learns previously unknown morphological features associated with expression of these proteins which can be used to accurately differentiate lymphocyte subtypes for use in key prognostic metrics such as assessment of immune cell infiltration,and thereby predict and improve patient outcomes without the need for costly multiplex immunofluorescence.
△ Less
Submitted 16 July, 2021; v1 submitted 9 July, 2021;
originally announced July 2021.
-
Geometric vs Algebraic Nullity for Hyperpaths
Authors:
Joshua Cooper,
Grant Fickes
Abstract:
We consider the question of how the eigenvarieties of a hypergraph relate to the algebraic multiplicities of their corresponding eigenvalues. Specifically, we (1) fully describe the irreducible components of the zero-eigenvariety of a loose $3$-hyperpath (its "nullvariety"), (2) use recent results of Bao-Fan-Wang-Zhu to compute the corresponding algebraic multiplicity of zero (its "nullity"), and…
▽ More
We consider the question of how the eigenvarieties of a hypergraph relate to the algebraic multiplicities of their corresponding eigenvalues. Specifically, we (1) fully describe the irreducible components of the zero-eigenvariety of a loose $3$-hyperpath (its "nullvariety"), (2) use recent results of Bao-Fan-Wang-Zhu to compute the corresponding algebraic multiplicity of zero (its "nullity"), and then (3) for this special class of hypergraphs, verify a conjecture of Hu-Ye about the relationship between the geometric (multi-)dimension of the nullvariety and the nullity.
△ Less
Submitted 4 March, 2022; v1 submitted 3 July, 2021;
originally announced July 2021.
-
Sampling Permutations for Shapley Value Estimation
Authors:
Rory Mitchell,
Joshua Cooper,
Eibe Frank,
Geoffrey Holmes
Abstract:
Game-theoretic attribution techniques based on Shapley values are used to interpret black-box machine learning models, but their exact calculation is generally NP-hard, requiring approximation methods for non-trivial models. As the computation of Shapley values can be expressed as a summation over a set of permutations, a common approach is to sample a subset of these permutations for approximatio…
▽ More
Game-theoretic attribution techniques based on Shapley values are used to interpret black-box machine learning models, but their exact calculation is generally NP-hard, requiring approximation methods for non-trivial models. As the computation of Shapley values can be expressed as a summation over a set of permutations, a common approach is to sample a subset of these permutations for approximation. Unfortunately, standard Monte Carlo sampling methods can exhibit slow convergence, and more sophisticated quasi-Monte Carlo methods have not yet been applied to the space of permutations. To address this, we investigate new approaches based on two classes of approximation methods and compare them empirically. First, we demonstrate quadrature techniques in a RKHS containing functions of permutations, using the Mallows kernel in combination with kernel herding and sequential Bayesian quadrature. The RKHS perspective also leads to quasi-Monte Carlo type error bounds, with a tractable discrepancy measure defined on permutations. Second, we exploit connections between the hypersphere $\mathbb{S}^{d-2}$ and permutations to create practical algorithms for generating permutation samples with good properties. Experiments show the above techniques provide significant improvements for Shapley value estimates over existing methods, converging to a smaller RMSE in the same number of model evaluations.
△ Less
Submitted 3 February, 2022; v1 submitted 25 April, 2021;
originally announced April 2021.
-
Believe The HiPe: Hierarchical Perturbation for Fast, Robust, and Model-Agnostic Saliency Mapping
Authors:
Jessica Cooper,
Ognjen Arandjelović,
David J Harrison
Abstract:
Understanding the predictions made by Artificial Intelligence (AI) systems is becoming more and more important as deep learning models are used for increasingly complex and high-stakes tasks. Saliency mapping -- a popular visual attribution method -- is one important tool for this, but existing formulations are limited by either computational cost or architectural constraints. We therefore propose…
▽ More
Understanding the predictions made by Artificial Intelligence (AI) systems is becoming more and more important as deep learning models are used for increasingly complex and high-stakes tasks. Saliency mapping -- a popular visual attribution method -- is one important tool for this, but existing formulations are limited by either computational cost or architectural constraints. We therefore propose Hierarchical Perturbation, a very fast and completely model-agnostic method for interpreting model predictions with robust saliency maps. Using standard benchmarks and datasets, we show that our saliency maps are of competitive or superior quality to those generated by existing model-agnostic methods -- and are over 20 times faster to compute.
△ Less
Submitted 11 April, 2022; v1 submitted 22 February, 2021;
originally announced March 2021.
-
Generating 3D structures from a 2D slice with GAN-based dimensionality expansion
Authors:
Steve Kench,
Samuel J. Cooper
Abstract:
Generative adversarial networks (GANs) can be trained to generate 3D image data, which is useful for design optimisation. However, this conventionally requires 3D training data, which is challenging to obtain. 2D imaging techniques tend to be faster, higher resolution, better at phase identification and more widely available. Here, we introduce a generative adversarial network architecture, SliceG…
▽ More
Generative adversarial networks (GANs) can be trained to generate 3D image data, which is useful for design optimisation. However, this conventionally requires 3D training data, which is challenging to obtain. 2D imaging techniques tend to be faster, higher resolution, better at phase identification and more widely available. Here, we introduce a generative adversarial network architecture, SliceGAN, which is able to synthesise high fidelity 3D datasets using a single representative 2D image. This is especially relevant for the task of material microstructure generation, as a cross-sectional micrograph can contain sufficient information to statistically reconstruct 3D samples. Our architecture implements the concept of uniform information density, which both ensures that generated volumes are equally high quality at all points in space, and that arbitrarily large volumes can be generated. SliceGAN has been successfully trained on a diverse set of materials, demonstrating the widespread applicability of this tool. The quality of generated micrographs is shown through a statistical comparison of synthetic and real datasets of a battery electrode in terms of key microstructural metrics. Finally, we find that the generation time for a $10^8$ voxel volume is on the order of a few seconds, yielding a path for future studies into high-throughput microstructural optimisation.
△ Less
Submitted 10 February, 2021;
originally announced February 2021.
-
Recurrence Ranks and Moment Sequences
Authors:
Joshua Cooper,
Grant Fickes
Abstract:
We introduce the "moment rank" and "unitary rank" of numerical sequences, close relatives of linear-recursive order. We show that both parameters can be characterized by a broad set of criteria involving moments of measures, types of recurrence relations, Hankel matrix factorizations, Waring rank, analytic properties of generating functions, and algebraic properties of polynomial ideals. In the pr…
▽ More
We introduce the "moment rank" and "unitary rank" of numerical sequences, close relatives of linear-recursive order. We show that both parameters can be characterized by a broad set of criteria involving moments of measures, types of recurrence relations, Hankel matrix factorizations, Waring rank, analytic properties of generating functions, and algebraic properties of polynomial ideals. In the process, we solve the "complex finite-atomic" and "integral finite-atomic" moment problems: which sequences arise as the moments of a finite-atomic complex-/integer-valued measures on $\mathbb{C}$?
△ Less
Submitted 1 January, 2021;
originally announced January 2021.
-
Pores for thought: The use of generative adversarial networks for the stochastic reconstruction of 3D multi-phase electrode microstructures with periodic boundaries
Authors:
Andrea Gayon-Lombardo,
Lukas Mosser,
Nigel P. Brandon,
Samuel J. Cooper
Abstract:
The generation of multiphase porous electrode microstructures is a critical step in the optimisation of electrochemical energy storage devices. This work implements a deep convolutional generative adversarial network (DC-GAN) for generating realistic n-phase microstructural data. The same network architecture is successfully applied to two very different three-phase microstructures: A lithium-ion…
▽ More
The generation of multiphase porous electrode microstructures is a critical step in the optimisation of electrochemical energy storage devices. This work implements a deep convolutional generative adversarial network (DC-GAN) for generating realistic n-phase microstructural data. The same network architecture is successfully applied to two very different three-phase microstructures: A lithium-ion battery cathode and a solid oxide fuel cell anode. A comparison between the real and synthetic data is performed in terms of the morphological properties (volume fraction, specific surface area, triple-phase boundary) and transport properties (relative diffusivity), as well as the two-point correlation function. The results show excellent agreement between for datasets and they are also visually indistinguishable. By modifying the input to the generator, we show that it is possible to generate microstructure with periodic boundaries in all three directions. This has the potential to significantly reduce the simulated volume required to be considered representative and therefore massively reduce the computational cost of the electrochemical simulations necessary to predict the performance of a particular microstructure during optimisation.
△ Less
Submitted 4 May, 2020; v1 submitted 17 February, 2020;
originally announced March 2020.
-
Matrices of optimal tree-depth and a row-invariant parameterized algorithm for integer programming
Authors:
Timothy F. N. Chan,
Jacob W. Cooper,
Martin Koutecky,
Daniel Kral,
Kristyna Pekarkova
Abstract:
A long line of research on fixed parameter tractability of integer programming culminated with showing that integer programs with n variables and a constraint matrix with dual tree-depth d and largest entry D are solvable in time g(d,D)poly(n) for some function g. However, the dual tree-depth of a constraint matrix is not preserved by row operations, i.e., a given integer program can be equivalent…
▽ More
A long line of research on fixed parameter tractability of integer programming culminated with showing that integer programs with n variables and a constraint matrix with dual tree-depth d and largest entry D are solvable in time g(d,D)poly(n) for some function g. However, the dual tree-depth of a constraint matrix is not preserved by row operations, i.e., a given integer program can be equivalent to another with a smaller dual tree-depth, and thus does not reflect its geometric structure.
We prove that the minimum dual tree-depth of a row-equivalent matrix is equal to the branch-depth of the matroid defined by the columns of the matrix. We design a fixed parameter algorithm for computing branch-depth of matroids represented over a finite field and a fixed parameter algorithm for computing a row-equivalent matrix with minimum dual tree-depth. Finally, we use these results to obtain an algorithm for integer programming running in time g(d*,D)poly(n) where d* is the branch-depth of the constraint matrix; the branch-depth cannot be replaced by the more permissive notion of branch-width.
△ Less
Submitted 31 January, 2022; v1 submitted 15 July, 2019;
originally announced July 2019.
-
Characteristic Power Series of Graph Limits
Authors:
Joshua N. Cooper
Abstract:
In this note, we show how to obtain a "characteristic power series" of graphons -- infinite limits of dense graphs -- as the limit of normalized reciprocal characteristic polynomials. This leads to a new characterization of graph quasi-randomness and another perspective on spectral theory for graphons, a complete description of the function in terms of the spectrum of the graphon as a self-adjoint…
▽ More
In this note, we show how to obtain a "characteristic power series" of graphons -- infinite limits of dense graphs -- as the limit of normalized reciprocal characteristic polynomials. This leads to a new characterization of graph quasi-randomness and another perspective on spectral theory for graphons, a complete description of the function in terms of the spectrum of the graphon as a self-adjoint kernel operator. Interestingly, while we apply a standard regularization to classical determinants, it is unclear how necessary this is.
△ Less
Submitted 18 September, 2022; v1 submitted 13 June, 2019;
originally announced June 2019.
-
A Development of Continuous-Time Transfer Entropy
Authors:
Joshua N. Cooper,
Christopher D. Edgar
Abstract:
Transfer entropy (TE) was introduced by Schreiber in 2000 as a measurement of the predictive capacity of one stochastic process with respect to another. Originally stated for discrete time processes, we expand the theory in line with recent work of Spinney, Prokopenko, and Lizier to define TE for stochastic processes indexed over a compact interval taking values in a Polish state space. We provide…
▽ More
Transfer entropy (TE) was introduced by Schreiber in 2000 as a measurement of the predictive capacity of one stochastic process with respect to another. Originally stated for discrete time processes, we expand the theory in line with recent work of Spinney, Prokopenko, and Lizier to define TE for stochastic processes indexed over a compact interval taking values in a Polish state space. We provide a definition for continuous time TE using the Radon-Nikodym Theorem, random measures, and projective limits of probability spaces. As our main result, we provide necessary and sufficient conditions to obtain this definition as a limit of discrete time TE, as well as illustrate its application via an example involving Poisson point processes. As a derivative of continuous time TE, we also define the transfer entropy rate between two processes and show that (under mild assumptions) their stationarity implies a constant rate. We also investigate TE between homogeneous Markov jump processes and discuss some open problems and possible future directions.
△ Less
Submitted 30 June, 2019; v1 submitted 15 May, 2019;
originally announced May 2019.
-
Understanding Ancient Coin Images
Authors:
Jessica Cooper,
Ognjen Arandjelovic
Abstract:
In recent years, a range of problems within the broad umbrella of automatic, computer vision based analysis of ancient coins has been attracting an increasing amount of attention. Notwithstanding this research effort, the results achieved by the state of the art in the published literature remain poor and far from sufficiently well performing for any practical purpose. In the present paper we pres…
▽ More
In recent years, a range of problems within the broad umbrella of automatic, computer vision based analysis of ancient coins has been attracting an increasing amount of attention. Notwithstanding this research effort, the results achieved by the state of the art in the published literature remain poor and far from sufficiently well performing for any practical purpose. In the present paper we present a series of contributions which we believe will benefit the interested community. Firstly, we explain that the approach of visual matching of coins, universally adopted in all existing published papers on the topic, is not of practical interest because the number of ancient coin types exceeds by far the number of those types which have been imaged, be it in digital form (e.g. online) or otherwise (traditional film, in print, etc.). Rather, we argue that the focus should be on the understanding of the semantic content of coins. Hence, we describe a novel method which uses real-world multimodal input to extract and associate semantic concepts with the correct coin images and then using a novel convolutional neural network learn the appearance of these concepts. Empirical evidence on a real-world and by far the largest data set of ancient coins, we demonstrate highly promising results.
△ Less
Submitted 13 March, 2019; v1 submitted 6 March, 2019;
originally announced March 2019.
-
Inverse Kinematics and Sensitivity Minimization of an n-Stack Stewart Platform
Authors:
David Balaban,
John Cooper,
Erik Komendera
Abstract:
An autonomous system is presented to solve the problem of in space assembly, which can be used to further the NASA goal of deep space exploration. Of particular interest is the assembly of large truss structures, which requires precise and dexterous movement in a changing environment. A prototype of an autonomous manipulator called "Assemblers" was fabricated from an aggregation of Stewart Platfor…
▽ More
An autonomous system is presented to solve the problem of in space assembly, which can be used to further the NASA goal of deep space exploration. Of particular interest is the assembly of large truss structures, which requires precise and dexterous movement in a changing environment. A prototype of an autonomous manipulator called "Assemblers" was fabricated from an aggregation of Stewart Platform robots for the purpose of researching autonomous in space assembly capabilities. The forward kinematics for an Assembler is described by the set of translations and rotation angles for each component Stewart Platform, from which the position and orientation of the end effector are simple to calculate. However, selecting inverse kinematic poses, defined by the translations and rotation angles, for the Assembler requires coordination between each Stewart Platform and is an underconstrained non-linear optimization problem. For assembly tasks, it is ideal that the pose selected has the least sensitivity to disturbances possible. A method of sensitivity reduction is proposed by minimizing the Frobenius Norm (FN) of the Jacobian for the forward kinematics. The effectiveness of the FN method will be demonstrated through a Monte Carlo simulation method to model random motion internal to the structure.
△ Less
Submitted 12 November, 2018;
originally announced November 2018.
-
Weak regularity and finitely forcible graph limits
Authors:
Jacob W. Cooper,
Tomas Kaiser,
Daniel Kral,
Jonathan A. Noel
Abstract:
Graphons are analytic objects representing limits of convergent sequences of graphs. Lovász and Szegedy conjectured that every finitely forcible graphon, i.e. any graphon determined by finitely many graph densities, has a simple structure. In particular, one of their conjectures would imply that every finitely forcible graphon has a weak $\varepsilon$-regular partition with the number of parts bou…
▽ More
Graphons are analytic objects representing limits of convergent sequences of graphs. Lovász and Szegedy conjectured that every finitely forcible graphon, i.e. any graphon determined by finitely many graph densities, has a simple structure. In particular, one of their conjectures would imply that every finitely forcible graphon has a weak $\varepsilon$-regular partition with the number of parts bounded by a polynomial in $\varepsilon^{-1}$. We construct a finitely forcible graphon $W$ such that the number of parts in any weak $\varepsilon$-regular partition of $W$ is at least exponential in $\varepsilon^{-2}/2^{5\log^*\varepsilon^{-2}}$. This bound almost matches the known upper bound for graphs and, in a certain sense, is the best possible for graphons.
△ Less
Submitted 26 August, 2016; v1 submitted 30 June, 2015;
originally announced July 2015.
-
Coloring so that no Pythagorean Triple is Monochromatic
Authors:
Joshua Cooper,
Ralph Overstreet
Abstract:
We address the question of the "partition regularity" of the Pythagorean equation a^2+b^2=c^2; in particular, can the natural numbers be assigned a 2-coloring, so that no Pythagorean triple (i.e., a solution to the equation) is monochromatic? We prove that the hypergraph of Pythagorean triples can contain no Steiner triple systems, a natural obstruction to 2-colorability. Then, after transforming…
▽ More
We address the question of the "partition regularity" of the Pythagorean equation a^2+b^2=c^2; in particular, can the natural numbers be assigned a 2-coloring, so that no Pythagorean triple (i.e., a solution to the equation) is monochromatic? We prove that the hypergraph of Pythagorean triples can contain no Steiner triple systems, a natural obstruction to 2-colorability. Then, after transforming the question into one about 3-CNF satisfiability and applying some reductions, a SAT solver is used to find a 2-coloring for {1,...,7664}. Work continues as we seek to improve the reductions and extend the computation.
△ Less
Submitted 8 May, 2015;
originally announced May 2015.
-
Density dichotomy in random words
Authors:
Joshua Cooper,
Danny Rorabaugh
Abstract:
Word $W$ is said to encounter word $V$ provided there is a homomorphism $φ$ mapping letters to nonempty words so that $φ(V)$ is a substring of $W$. For example, taking $φ$ such that $φ(h)=c$ and $φ(u)=ien$, we see that "science" encounters "huh" since $cienc=φ(huh)$. The density of $V$ in $W$, $δ(V,W)$, is the proportion of substrings of $W$ that are homomorphic images of $V$. So the density of "h…
▽ More
Word $W$ is said to encounter word $V$ provided there is a homomorphism $φ$ mapping letters to nonempty words so that $φ(V)$ is a substring of $W$. For example, taking $φ$ such that $φ(h)=c$ and $φ(u)=ien$, we see that "science" encounters "huh" since $cienc=φ(huh)$. The density of $V$ in $W$, $δ(V,W)$, is the proportion of substrings of $W$ that are homomorphic images of $V$. So the density of "huh" in "science" is $2/{8 \choose 2}$. A word is doubled if every letter that appears in the word appears at least twice.
The dichotomy: Let $V$ be a word over any alphabet, $Σ$ a finite alphabet with at least 2 letters, and $W_n \in Σ^n$ chosen uniformly at random. Word $V$ is doubled if and only if $\mathbb{E}(δ(V,W_n)) \rightarrow 0$ as $n \rightarrow \infty$.
We further explore convergence for nondoubled words and concentration of the limit distribution for doubled words around its mean.
△ Less
Submitted 17 October, 2016; v1 submitted 16 April, 2015;
originally announced April 2015.
-
One file to share them all: Using the COMBINE Archive and the OMEX format to share all information about a modeling project
Authors:
Frank T. Bergmann,
Richard Adams,
Stuart Moodie,
Jonathan Cooper,
Mihai Glont,
Martin Golebiewski,
Michael Hucka,
Camille Laibe,
Andrew K. Miller,
David P. Nickerson,
Brett G. Olivier,
Nicolas Rodriguez,
Herbert M. Sauro,
Martin Scharm,
Stian Soiland-Reyes,
Dagmar Waltemath,
Florent Yvon,
Nicolas Le Novère
Abstract:
Background: With the ever increasing use of computational models in the biosciences, the need to share models and reproduce the results of published studies efficiently and easily is becoming more important. To this end, various standards have been proposed that can be used to describe models, simulations, data or other essential information in a consistent fashion. These constitute various separa…
▽ More
Background: With the ever increasing use of computational models in the biosciences, the need to share models and reproduce the results of published studies efficiently and easily is becoming more important. To this end, various standards have been proposed that can be used to describe models, simulations, data or other essential information in a consistent fashion. These constitute various separate components required to reproduce a given published scientific result.
Results: We describe the Open Modeling EXchange format (OMEX). Together with the use of other standard formats from the Computational Modeling in Biology Network (COMBINE), OMEX is the basis of the COMBINE Archive, a single file that supports the exchange of all the information necessary for a modeling and simulation experiment in biology. An OMEX file is a ZIP container that includes a manifest file, listing the content of the archive, an optional metadata file adding information about the archive and its content, and the files describing the model. The content of a COMBINE Archive consists of files encoded in COMBINE standards whenever possible, but may include additional files defined by an Internet Media Type. Several tools that support the COMBINE Archive are available, either as independent libraries or embedded in modeling software.
Conclusions: The COMBINE Archive facilitates the reproduction of modeling and simulation experiments in biology by embedding all the relevant information in one file. Having all the information stored and exchanged at once also helps in building activity logs and audit trails. We anticipate that the COMBINE Archive will become a significant help for modellers, as the domain moves to larger, more complex experiments such as multi-scale models of organs, digital organisms, and bioengineering.
△ Less
Submitted 30 September, 2014; v1 submitted 18 July, 2014;
originally announced July 2014.
-
Counting independent sets in hypergraphs
Authors:
Jeff Cooper,
Kunal Dutta,
Dhruv Mubayi
Abstract:
Let $G$ be a triangle-free graph with $n$ vertices and average degree $t$. We show that $G$ contains at least \[ e^{(1-n^{-1/12})\frac{1}{2}\frac{n}{t}\ln t (\frac{1}{2}\ln t-1)} \] independent sets. This improves a recent result of the first and third authors \cite{countingind}. In particular, it implies that as $n \to \infty$, every triangle-free graph on $n$ vertices has at least…
▽ More
Let $G$ be a triangle-free graph with $n$ vertices and average degree $t$. We show that $G$ contains at least \[ e^{(1-n^{-1/12})\frac{1}{2}\frac{n}{t}\ln t (\frac{1}{2}\ln t-1)} \] independent sets. This improves a recent result of the first and third authors \cite{countingind}. In particular, it implies that as $n \to \infty$, every triangle-free graph on $n$ vertices has at least $e^{(c_1-o(1)) \sqrt{n} \ln n}$ independent sets, where $c_1 = \sqrt{\ln 2}/4 = 0.208138..$. Further, we show that for all $n$, there exists a triangle-free graph with $n$ vertices which has at most $e^{(c_2+o(1))\sqrt{n}\ln n}$ independent sets, where $c_2 = 1+\ln 2 = 1.693147..$. This disproves a conjecture from \cite{countingind}.
Let $H$ be a $(k+1)$-uniform linear hypergraph with $n$ vertices and average degree $t$. We also show that there exists a constant $c_k$ such that the number of independent sets in $H$ is at least \[ e^{c_{k} \frac{n}{t^{1/k}}\ln^{1+1/k}{t}}. \] This is tight apart from the constant $c_k$ and generalizes a result of Duke, Lefmann, and Rödl \cite{uncrowdedrodl}, which guarantees the existence of an independent set of size $Ω(\frac{n}{t^{1/k}} \ln^{1/k}t)$. Both of our lower bounds follow from a more general statement, which applies to hereditary properties of hypergraphs.
△ Less
Submitted 25 October, 2013; v1 submitted 24 October, 2013;
originally announced October 2013.
-
Graham's Tree Reconstruction Conjecture and a Waring-Type Problem on Partitions
Authors:
Joshua Cooper,
Bill Kay,
Anton Swifton
Abstract:
Suppose $G$ is a tree. Graham's "Tree Reconstruction Conjecture" states that $G$ is uniquely determined by the integer sequence $|G|$, $|L(G)|$, $|L(L(G))|$, $|L(L(L(G)))|$, $\ldots$, where $L(H)$ denotes the line graph of the graph $H$. Little is known about this question apart from a few simple observations. We show that the number of trees on $n$ vertices which can be distinguished by their ass…
▽ More
Suppose $G$ is a tree. Graham's "Tree Reconstruction Conjecture" states that $G$ is uniquely determined by the integer sequence $|G|$, $|L(G)|$, $|L(L(G))|$, $|L(L(L(G)))|$, $\ldots$, where $L(H)$ denotes the line graph of the graph $H$. Little is known about this question apart from a few simple observations. We show that the number of trees on $n$ vertices which can be distinguished by their associated integer sequences is $e^{Ω((\log n)^{3/2})}$. The proof strategy involves constructing a large collection of caterpillar graphs using partitions arising from the Prouhet-Tarry-Escott problem.
△ Less
Submitted 23 August, 2017; v1 submitted 2 September, 2011;
originally announced September 2011.
-
Deterministic Random Walks on Regular Trees
Authors:
Joshua Cooper,
Benjamin Doerr,
Tobias Friedrich,
Joel Spencer
Abstract:
Jim Propp's rotor router model is a deterministic analogue of a random walk on a graph. Instead of distributing chips randomly, each vertex serves its neighbors in a fixed order.
Cooper and Spencer (Comb. Probab. Comput. (2006)) show a remarkable similarity of both models. If an (almost) arbitrary population of chips is placed on the vertices of a grid $\Z^d$ and does a simultaneous walk in the…
▽ More
Jim Propp's rotor router model is a deterministic analogue of a random walk on a graph. Instead of distributing chips randomly, each vertex serves its neighbors in a fixed order.
Cooper and Spencer (Comb. Probab. Comput. (2006)) show a remarkable similarity of both models. If an (almost) arbitrary population of chips is placed on the vertices of a grid $\Z^d$ and does a simultaneous walk in the Propp model, then at all times and on each vertex, the number of chips on this vertex deviates from the expected number the random walk would have gotten there by at most a constant. This constant is independent of the starting configuration and the order in which each vertex serves its neighbors.
This result raises the question if all graphs do have this property. With quite some effort, we are now able to answer this question negatively. For the graph being an infinite $k$-ary tree ($k \ge 3$), we show that for any deviation $D$ there is an initial configuration of chips such that after running the Propp model for a certain time there is a vertex with at least $D$ more chips than expected in the random walk model. However, to achieve a deviation of $D$ it is necessary that at least $\exp(Ω(D^2))$ vertices contribute by being occupied by a number of chips not divisible by $k$ at a certain time.
△ Less
Submitted 7 June, 2010;
originally announced June 2010.
-
De Bruijn Cycles for Covering Codes
Authors:
Fan Chung,
Joshua N. Cooper
Abstract:
A de Bruijn covering code is a q-ary string S so that every q-ary string is at most R symbol changes from some n-word appearing consecutively in S. We introduce these codes and prove that they can have length close to the smallest possible covering code. The proof employs tools from field theory, probability, and linear algebra. We also prove a number of ``spectral'' results on de Bruijn coverin…
▽ More
A de Bruijn covering code is a q-ary string S so that every q-ary string is at most R symbol changes from some n-word appearing consecutively in S. We introduce these codes and prove that they can have length close to the smallest possible covering code. The proof employs tools from field theory, probability, and linear algebra. We also prove a number of ``spectral'' results on de Bruijn covering codes. Included is a table of the best known bounds on the lengths of small binary de Bruijn covering codes, up to R=11 and n=13, followed by several open questions in this area.
△ Less
Submitted 28 October, 2003; v1 submitted 23 October, 2003;
originally announced October 2003.
-
Asymmetric binary covering codes
Authors:
Joshua N. Cooper,
Robert B. Ellis,
Andrew B. Kahng
Abstract:
An asymmetric binary covering code of length n and radius R is a subset C of the n-cube Q_n such that every vector x in Q_n can be obtained from some vector c in C by changing at most R 1's of c to 0's, where R is as small as possible. K^+(n,R) is defined as the smallest size of such a code. We show K^+(n,R) is of order 2^n/n^R for constant R, using an asymmetric sphere-covering bound and probab…
▽ More
An asymmetric binary covering code of length n and radius R is a subset C of the n-cube Q_n such that every vector x in Q_n can be obtained from some vector c in C by changing at most R 1's of c to 0's, where R is as small as possible. K^+(n,R) is defined as the smallest size of such a code. We show K^+(n,R) is of order 2^n/n^R for constant R, using an asymmetric sphere-covering bound and probabilistic methods. We show K^+(n,n-R')=R'+1 for constant coradius R' iff n>=R'(R'+1)/2. These two results are extended to near-constant R and R', respectively. Various bounds on K^+ are given in terms of the total number of 0's or 1's in a minimal code. The dimension of a minimal asymmetric linear binary code ([n,R]^+ code) is determined to be min(0,n-R). We conclude by discussing open problems and techniques to compute explicit values for K^+, giving a table of best known bounds.
△ Less
Submitted 4 September, 2003;
originally announced September 2003.