-
Interpretability analysis on a pathology foundation model reveals biologically relevant embeddings across modalities
Authors:
Nhat Le,
Ciyue Shen,
Chintan Shah,
Blake Martin,
Daniel Shenker,
Harshith Padigela,
Jennifer Hipp,
Sean Grullon,
John Abel,
Harsha Vardhan Pokkalla,
Dinkar Juyal
Abstract:
Mechanistic interpretability has been explored in detail for large language models (LLMs). For the first time, we provide a preliminary investigation with similar interpretability methods for medical imaging. Specifically, we analyze the features from a ViT-Small encoder obtained from a pathology Foundation Model via application to two datasets: one dataset of pathology images, and one dataset of…
▽ More
Mechanistic interpretability has been explored in detail for large language models (LLMs). For the first time, we provide a preliminary investigation with similar interpretability methods for medical imaging. Specifically, we analyze the features from a ViT-Small encoder obtained from a pathology Foundation Model via application to two datasets: one dataset of pathology images, and one dataset of pathology images paired with spatial transcriptomics. We discover an interpretable representation of cell and tissue morphology, along with gene expression within the model embedding space. Our work paves the way for further exploration around interpretable feature dimensions and their utility for medical and clinical applications.
△ Less
Submitted 15 July, 2024;
originally announced July 2024.
-
A framework for developing a knowledge management platform
Authors:
Marie Lisandra Zepeda Mendoza,
Sonali Agarwal,
James A. Blackshaw,
Vanesa Bol,
Audrey Fazzi,
Filippo Fiorini,
Amy Louise Foreman,
Nancy George,
Brett R. Johnson,
Brian Martin,
Dave McComb,
Euphemia Mutasa-Gottgens,
Helen Parkinson,
Martin Romacker,
Rolf Russell,
Valérien Ségard,
Shawn Zheng Kai Tan,
Wei Kheng Teh,
F. P. Winstanley,
Benedict Wong,
Adrian M. Smith
Abstract:
Knowledge management (KM) involves collecting, organizing, storing, and disseminating information to improve decision-making, innovation, and performance. Implementing KM at scale has become essential for organizations to effectively leverage vast accessible data. This paper is a compilation of concepts that emerged from KM workshops hosted by EMBL-EBI, attended by SMEs and industry. We provide gu…
▽ More
Knowledge management (KM) involves collecting, organizing, storing, and disseminating information to improve decision-making, innovation, and performance. Implementing KM at scale has become essential for organizations to effectively leverage vast accessible data. This paper is a compilation of concepts that emerged from KM workshops hosted by EMBL-EBI, attended by SMEs and industry. We provide guidance on envisioning, executing, evaluating, and evolving knowledge management platforms. We emphasize essential considerations such as setting knowledge domain boundaries and measuring success, as well as the importance of making knowledge accessible for downstream applications and non-computational users and highlights necessary personal and organizational skills for success. We stress the importance of collaboration and the need for convergence on shared principles and commitment to provide or seek resources to advance KM. The community is invited to join the journey of KM and contribute to the advancement of the field by applying and improving on the guidelines described.
△ Less
Submitted 18 June, 2024;
originally announced June 2024.
-
PLUTO: Pathology-Universal Transformer
Authors:
Dinkar Juyal,
Harshith Padigela,
Chintan Shah,
Daniel Shenker,
Natalia Harguindeguy,
Yi Liu,
Blake Martin,
Yibo Zhang,
Michael Nercessian,
Miles Markey,
Isaac Finberg,
Kelsey Luu,
Daniel Borders,
Syed Ashar Javed,
Emma Krause,
Raymond Biju,
Aashish Sood,
Allen Ma,
Jackson Nyman,
John Shamshoian,
Guillaume Chhor,
Darpan Sanghavi,
Marc Thibault,
Limin Yu,
Fedaa Najdawi
, et al. (8 additional authors not shown)
Abstract:
Pathology is the study of microscopic inspection of tissue, and a pathology diagnosis is often the medical gold standard to diagnose disease. Pathology images provide a unique challenge for computer-vision-based analysis: a single pathology Whole Slide Image (WSI) is gigapixel-sized and often contains hundreds of thousands to millions of objects of interest across multiple resolutions. In this wor…
▽ More
Pathology is the study of microscopic inspection of tissue, and a pathology diagnosis is often the medical gold standard to diagnose disease. Pathology images provide a unique challenge for computer-vision-based analysis: a single pathology Whole Slide Image (WSI) is gigapixel-sized and often contains hundreds of thousands to millions of objects of interest across multiple resolutions. In this work, we propose PathoLogy Universal TransfOrmer (PLUTO): a light-weight pathology FM that is pre-trained on a diverse dataset of 195 million image tiles collected from multiple sites and extracts meaningful representations across multiple WSI scales that enable a large variety of downstream pathology tasks. In particular, we design task-specific adaptation heads that utilize PLUTO's output embeddings for tasks which span pathology scales ranging from subcellular to slide-scale, including instance segmentation, tile classification, and slide-level prediction. We compare PLUTO's performance to other state-of-the-art methods on a diverse set of external and internal benchmarks covering multiple biologically relevant tasks, tissue types, resolutions, stains, and scanners. We find that PLUTO matches or outperforms existing task-specific baselines and pathology-specific foundation models, some of which use orders-of-magnitude larger datasets and model sizes when compared to PLUTO. Our findings present a path towards a universal embedding to power pathology image analysis, and motivate further exploration around pathology foundation models in terms of data diversity, architectural improvements, sample efficiency, and practical deployability in real-world applications.
△ Less
Submitted 13 May, 2024;
originally announced May 2024.
-
NOISe: Nuclei-Aware Osteoclast Instance Segmentation for Mouse-to-Human Domain Transfer
Authors:
Sai Kumar Reddy Manne,
Brendan Martin,
Tyler Roy,
Ryan Neilson,
Rebecca Peters,
Meghana Chillara,
Christine W. Lary,
Katherine J. Motyl,
Michael Wan
Abstract:
Osteoclast cell image analysis plays a key role in osteoporosis research, but it typically involves extensive manual image processing and hand annotations by a trained expert. In the last few years, a handful of machine learning approaches for osteoclast image analysis have been developed, but none have addressed the full instance segmentation task required to produce the same output as that of th…
▽ More
Osteoclast cell image analysis plays a key role in osteoporosis research, but it typically involves extensive manual image processing and hand annotations by a trained expert. In the last few years, a handful of machine learning approaches for osteoclast image analysis have been developed, but none have addressed the full instance segmentation task required to produce the same output as that of the human expert led process. Furthermore, none of the prior, fully automated algorithms have publicly available code, pretrained models, or annotated datasets, inhibiting reproduction and extension of their work. We present a new dataset with ~2*10^5 expert annotated mouse osteoclast masks, together with a deep learning instance segmentation method which works for both in vitro mouse osteoclast cells on plastic tissue culture plates and human osteoclast cells on bone chips. To our knowledge, this is the first work to automate the full osteoclast instance segmentation task. Our method achieves a performance of 0.82 mAP_0.5 (mean average precision at intersection-over-union threshold of 0.5) in cross validation for mouse osteoclasts. We present a novel nuclei-aware osteoclast instance segmentation training strategy (NOISe) based on the unique biology of osteoclasts, to improve the model's generalizability and boost the mAP_0.5 from 0.60 to 0.82 on human osteoclasts. We publish our annotated mouse osteoclast image dataset, instance segmentation models, and code at github.com/michaelwwan/noise to enable reproducibility and to provide a public tool to accelerate osteoporosis research.
△ Less
Submitted 15 April, 2024;
originally announced April 2024.
-
Graph Homomorphism, Monotone Classes and Bounded Pathwidth
Authors:
Tala Eagling-Vose,
Barnaby Martin,
Daniel Paulusma,
Mark Siggers,
Siani Smith
Abstract:
A recent paper describes a framework for studying the computational complexity of graph problems on monotone classes, that is those omitting a set of graphs as a subgraph. If the problems lie in the framework, and many do, then the computational complexity can be described for all monotone classes defined by a finite set of omitted subgraphs. It is known that certain homomorphism problems, e.g.…
▽ More
A recent paper describes a framework for studying the computational complexity of graph problems on monotone classes, that is those omitting a set of graphs as a subgraph. If the problems lie in the framework, and many do, then the computational complexity can be described for all monotone classes defined by a finite set of omitted subgraphs. It is known that certain homomorphism problems, e.g. $C_5$-Colouring, do not sit in the framework. By contrast, we show that the more general problem of Graph Homomorphism does sit in the framework.
The original framework had examples where hard versus easy were NP-complete versus P, or at least quadratic versus almost linear. We give the first example of a problem in the framework such that hardness is in the polynomial hierarchy above NP. Considering a variant of the colouring game as studied by Bodlaender, we show that with the restriction of bounded alternation, the list version of this problem is contained in the framework. The hard cases are $Π_{2k}^\mathrm{P}$-complete and the easy cases are in P.
The cases in P comprise those classes for which the pathwidth is bounded. Bodlaender explains that Sequential $3$-Colouring Construction Game is in P on classes with bounded vertex separation number, which coincides with bounded pathwidth on unordered graphs. However, these graphs are ordered with a playing order for the two players, which corresponds to a prefix pattern in a quantified formula. We prove that Sequential $3$-Colouring Construction Game is Pspace-complete on some class of bounded pathwidth, using a celebrated result of Atserias and Oliva.
We consider several locally constrained variants of the homomorphism problem. Like $C_5$-Colouring, none of these is in the framework. However, when we consider the bounded-degree restrictions, we prove that each of these problems is in our framework.
△ Less
Submitted 1 March, 2024;
originally announced March 2024.
-
A Hormetic Approach to the Value-Loading Problem: Preventing the Paperclip Apocalypse?
Authors:
Nathan I. N. Henry,
Mangor Pedersen,
Matt Williams,
Jamin L. B. Martin,
Liesje Donkin
Abstract:
The value-loading problem is a significant challenge for researchers aiming to create artificial intelligence (AI) systems that align with human values and preferences. This problem requires a method to define and regulate safe and optimal limits of AI behaviors. In this work, we propose HALO (Hormetic ALignment via Opponent processes), a regulatory paradigm that uses hormetic analysis to regulate…
▽ More
The value-loading problem is a significant challenge for researchers aiming to create artificial intelligence (AI) systems that align with human values and preferences. This problem requires a method to define and regulate safe and optimal limits of AI behaviors. In this work, we propose HALO (Hormetic ALignment via Opponent processes), a regulatory paradigm that uses hormetic analysis to regulate the behavioral patterns of AI. Behavioral hormesis is a phenomenon where low frequencies of a behavior have beneficial effects, while high frequencies are harmful. By modeling behaviors as allostatic opponent processes, we can use either Behavioral Frequency Response Analysis (BFRA) or Behavioral Count Response Analysis (BCRA) to quantify the hormetic limits of repeatable behaviors. We demonstrate how HALO can solve the 'paperclip maximizer' scenario, a thought experiment where an unregulated AI tasked with making paperclips could end up converting all matter in the universe into paperclips. Our approach may be used to help create an evolving database of 'values' based on the hedonic calculus of repeatable behaviors with decreasing marginal utility. This positions HALO as a promising solution for the value-loading problem, which involves embedding human-aligned values into an AI system, and the weak-to-strong generalization problem, which explores whether weak models can supervise stronger models as they become more intelligent. Hence, HALO opens several research avenues that may lead to the development of a computational value system that allows an AI algorithm to learn whether the decisions it makes are right or wrong.
△ Less
Submitted 13 February, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
-
HiER: Highlight Experience Replay for Boosting Off-Policy Reinforcement Learning Agents
Authors:
Dániel Horváth,
Jesús Bujalance Martín,
Ferenc Gábor Erd��s,
Zoltán Istenes,
Fabien Moutarde
Abstract:
Even though reinforcement-learning-based algorithms achieved superhuman performance in many domains, the field of robotics poses significant challenges as the state and action spaces are continuous, and the reward function is predominantly sparse. Furthermore, on many occasions, the agent is devoid of access to any form of demonstration. Inspired by human learning, in this work, we propose a metho…
▽ More
Even though reinforcement-learning-based algorithms achieved superhuman performance in many domains, the field of robotics poses significant challenges as the state and action spaces are continuous, and the reward function is predominantly sparse. Furthermore, on many occasions, the agent is devoid of access to any form of demonstration. Inspired by human learning, in this work, we propose a method named highlight experience replay (HiER) that creates a secondary highlight replay buffer for the most relevant experiences. For the weights update, the transitions are sampled from both the standard and the highlight experience replay buffer. It can be applied with or without the techniques of hindsight experience replay (HER) and prioritized experience replay (PER). Our method significantly improves the performance of the state-of-the-art, validated on 8 tasks of three robotic benchmarks. Furthermore, to exploit the full potential of HiER, we propose HiER+ in which HiER is enhanced with an arbitrary data collection curriculum learning method. Our implementation, the qualitative results, and a video presentation are available on the project site: http://www.danielhorvath.eu/hier/.
△ Less
Submitted 9 July, 2024; v1 submitted 14 December, 2023;
originally announced December 2023.
-
Respiratory Anomaly Detection using Reflected Infrared Light-wave Signals
Authors:
Md Zobaer Islam,
Brenden Martin,
Carly Gotcher,
Tyler Martinez,
John F. O'Hara,
Sabit Ekin
Abstract:
In this study, we present a non-contact respiratory anomaly detection method using incoherent light-wave signals reflected from the chest of a mechanical robot that can breathe like human beings. In comparison to existing radar and camera-based sensing systems for vitals monitoring, this technology uses only a low-cost ubiquitous infrared light source and sensor. This light-wave sensing system rec…
▽ More
In this study, we present a non-contact respiratory anomaly detection method using incoherent light-wave signals reflected from the chest of a mechanical robot that can breathe like human beings. In comparison to existing radar and camera-based sensing systems for vitals monitoring, this technology uses only a low-cost ubiquitous infrared light source and sensor. This light-wave sensing system recognizes different breathing anomalies from the variations of light intensity reflected from the chest of the robot within a 0.5m-1.5m range with an average classification accuracy of up to 96.6% using machine learning.
△ Less
Submitted 22 April, 2024; v1 submitted 2 November, 2023;
originally announced November 2023.
-
Music Augmentation and Denoising For Peak-Based Audio Fingerprinting
Authors:
Kamil Akesbi,
Dorian Desblancs,
Benjamin Martin
Abstract:
Audio fingerprinting is a well-established solution for song identification from short recording excerpts. Popular methods rely on the extraction of sparse representations, generally spectral peaks, and have proven to be accurate, fast, and scalable to large collections. However, real-world applications of audio identification often happen in noisy environments, which can cause these systems to fa…
▽ More
Audio fingerprinting is a well-established solution for song identification from short recording excerpts. Popular methods rely on the extraction of sparse representations, generally spectral peaks, and have proven to be accurate, fast, and scalable to large collections. However, real-world applications of audio identification often happen in noisy environments, which can cause these systems to fail. In this work, we tackle this problem by introducing and releasing a new audio augmentation pipeline that adds noise to music snippets in a realistic way, by stochastically mimicking real-world scenarios. We then propose and release a deep learning model that removes noisy components from spectrograms in order to improve peak-based fingerprinting systems' accuracy. We show that the addition of our model improves the identification performance of commonly used audio fingerprinting systems, even under noisy conditions.
△ Less
Submitted 29 October, 2023; v1 submitted 20 October, 2023;
originally announced October 2023.
-
An Orbital Solution for WASP-12 b: Updated Ephemeris and Evidence for Decay Leveraging Citizen Science Data
Authors:
Avinash S. Nediyedath,
Martin J. Fowler,
A. Norris,
Shivaraj R. Maidur,
Kyle A. Pearson,
S. Dixon,
P. Lewin,
Andre O. Kovacs,
A. Odasso,
K. Davis,
M. Primm,
P. Das,
Bryan E. Martin,
D. Lalla
Abstract:
NASA Citizen Scientists have used Exoplanet Transit Interpretation Code (EXOTIC) to reduce 40 sets of time-series images of WASP-12 taken by privately owned telescopes and a 6-inch telescope operated by the Center for Astrophysics | Harvard & Smithsonian MicroObservatory (MOBs). Of these sets, 24 result in clean transit light curves of WASP-12 b which are included in the NASA Exoplanet Watch websi…
▽ More
NASA Citizen Scientists have used Exoplanet Transit Interpretation Code (EXOTIC) to reduce 40 sets of time-series images of WASP-12 taken by privately owned telescopes and a 6-inch telescope operated by the Center for Astrophysics | Harvard & Smithsonian MicroObservatory (MOBs). Of these sets, 24 result in clean transit light curves of WASP-12 b which are included in the NASA Exoplanet Watch website. We use priors from the NASA Exoplanet Archive to calculate the ephemeris of the planet and combine it with ETD (Exoplanet Transit Database), ExoClock, and TESS (Transiting Exoplanet Survey Satellite) observations. Combining these datasets gives an updated ephemeris for the WASP-12 b system of 2454508.97923 +/- 0.000051 BJDTDB with an orbital period of 1.09141935 +/- 2.16e-08 days which can be used to inform the efficient scheduling of future space telescope observations. The orbital decay of the planet was found to be -6.89e-10 +/- 4.01e-11 days/epoch. These results show the benefits of long-term observations by amateur astronomers that citizen scientists can analyze to augment the field of Exoplanet research.
△ Less
Submitted 10 November, 2023; v1 submitted 30 June, 2023;
originally announced June 2023.
-
Complexity Framework for Forbidden Subgraphs IV: The Steiner Forest Problem
Authors:
Hans L. Bodlaender,
Matthew Johnson,
Barnaby Martin,
Jelle J. Oostveen,
Sukanya Pandey,
Daniel Paulusma,
Siani Smith,
Erik Jan van Leeuwen
Abstract:
We study Steiner Forest on $H$-subgraph-free graphs, that is, graphs that do not contain some fixed graph $H$ as a (not necessarily induced) subgraph. We are motivated by a recent framework that completely characterizes the complexity of many problems on $H$-subgraph-free graphs. However, in contrast to e.g. the related Steiner Tree problem, Steiner Forest falls outside this framework. Hence, the…
▽ More
We study Steiner Forest on $H$-subgraph-free graphs, that is, graphs that do not contain some fixed graph $H$ as a (not necessarily induced) subgraph. We are motivated by a recent framework that completely characterizes the complexity of many problems on $H$-subgraph-free graphs. However, in contrast to e.g. the related Steiner Tree problem, Steiner Forest falls outside this framework. Hence, the complexity of Steiner Forest on $H$-subgraph-free graphs remained tantalizingly open. In this paper, we make significant progress towards determining the complexity of Steiner Forest on $H$-subgraph-free graphs. Our main results are four novel polynomial-time algorithms for different excluded graphs $H$ that are central to further understand its complexity. Along the way, we study the complexity of Steiner Forest for graphs with a small $c$-deletion set, that is, a small set $S$ of vertices such that each component of $G-S$ has size at most $c$. Using this parameter, we give two noteworthy algorithms that we later employ as subroutines. First, we prove Steiner Forest is FPT parameterized by $|S|$ when $c=1$ (i.e. the vertex cover number). Second, we prove Steiner Forest is polynomial-time solvable for graphs with a 2-deletion set of size at most 2. The latter result is tight, as the problem is NP-complete for graphs with a 3-deletion set of size 2.
△ Less
Submitted 15 October, 2023; v1 submitted 2 May, 2023;
originally announced May 2023.
-
Complexity Framework for Forbidden Subgraphs III: When Problems are Tractable on Subcubic Graphs
Authors:
Matthew Johnson,
Barnaby Martin,
Sukanya Pandey,
Daniël Paulusma,
Siani Smith,
Erik Jan van Leeuwen
Abstract:
For any finite set $\mathcal{H} = \{H_1,\ldots,H_p\}$ of graphs, a graph is $\mathcal{H}$-subgraph-free if it does not contain any of $H_1,\ldots,H_p$ as a subgraph. In recent work, meta-classifications have been studied: these show that if graph problems satisfy certain prescribed conditions, their complexity is determined on classes of $\mathcal{H}$-subgraph-free graphs. We continue this work an…
▽ More
For any finite set $\mathcal{H} = \{H_1,\ldots,H_p\}$ of graphs, a graph is $\mathcal{H}$-subgraph-free if it does not contain any of $H_1,\ldots,H_p$ as a subgraph. In recent work, meta-classifications have been studied: these show that if graph problems satisfy certain prescribed conditions, their complexity is determined on classes of $\mathcal{H}$-subgraph-free graphs. We continue this work and focus on problems that have polynomial-time solutions on classes that have bounded treewidth or maximum degree at most~$3$ and examine their complexity on $H$-subgraph-free graph classes where $H$ is a connected graph. With this approach, we obtain comprehensive classifications for (Independent) Feedback Vertex Set, Connected Vertex Cover, Colouring and Matching Cut. This resolves a number of open problems.
We highlight that, to establish that Independent Feedback Vertex Set belongs to this collection of problems, we first show that it can be solved in polynomial time on graphs of maximum degree $3$. We demonstrate that, with the exception of the complete graph on four vertices, each graph in this class has a minimum size feedback vertex set that is also an independent set.
△ Less
Submitted 1 May, 2023;
originally announced May 2023.
-
GPT-4 Technical Report
Authors:
OpenAI,
Josh Achiam,
Steven Adler,
Sandhini Agarwal,
Lama Ahmad,
Ilge Akkaya,
Florencia Leoni Aleman,
Diogo Almeida,
Janko Altenschmidt,
Sam Altman,
Shyamal Anadkat,
Red Avila,
Igor Babuschkin,
Suchir Balaji,
Valerie Balcom,
Paul Baltescu,
Haiming Bao,
Mohammad Bavarian,
Jeff Belgum,
Irwan Bello,
Jake Berdine,
Gabriel Bernadett-Shapiro,
Christopher Berner,
Lenny Bogdonoff,
Oleg Boiko
, et al. (256 additional authors not shown)
Abstract:
We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…
▽ More
We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4.
△ Less
Submitted 4 March, 2024; v1 submitted 15 March, 2023;
originally announced March 2023.
-
Noncontact Respiratory Anomaly Detection Using Infrared Light-Wave Sensing
Authors:
Md Zobaer Islam,
Brenden Martin,
Carly Gotcher,
Tyler Martinez,
John F. O'Hara,
Sabit Ekin
Abstract:
Human respiratory rate and its pattern convey essential information about the physical and psychological states of the subject. Abnormal breathing can indicate fatal health issues leading to further diagnosis and treatment. Wireless light-wave sensing (LWS) using incoherent infrared light shows promise in safe, discreet, efficient, and non-invasive human breathing monitoring without raising privac…
▽ More
Human respiratory rate and its pattern convey essential information about the physical and psychological states of the subject. Abnormal breathing can indicate fatal health issues leading to further diagnosis and treatment. Wireless light-wave sensing (LWS) using incoherent infrared light shows promise in safe, discreet, efficient, and non-invasive human breathing monitoring without raising privacy concerns. The respiration monitoring system needs to be trained on different types of breathing patterns to identify breathing anomalies.The system must also validate the collected data as a breathing waveform, discarding any faulty data caused by external interruption, user movement, or system malfunction. To address these needs, this study simulated normal and different types of abnormal respiration using a robot that mimics human breathing patterns. Then, time-series respiration data were collected using infrared light-wave sensing technology. Three machine learning algorithms, decision tree, random forest and XGBoost, were applied to detect breathing anomalies and faulty data. Model performances were evaluated through cross-validation, assessing classification accuracy, precision and recall scores. The random forest model achieved the highest classification accuracy of 96.75% with data collected at a 0.5m distance. In general, ensemble models like random forest and XGBoost performed better than a single model in classifying the data collected at multiple distances from the light-wave sensing setup.
△ Less
Submitted 16 April, 2024; v1 submitted 9 January, 2023;
originally announced January 2023.
-
Complexity Framework for Forbidden Subgraphs II: Edge Subdivision and the "H"-graphs
Authors:
Vadim Lozin,
Barnaby Martin,
Sukanya Pandey,
Daniel Paulusma,
Mark Siggers,
Siani Smith,
Erik Jan van Leeuwen
Abstract:
For a fixed set ${\cal H}$ of graphs, a graph $G$ is ${\cal H}$-subgraph-free if $G$ does not contain any $H \in {\cal H}$ as a (not necessarily induced) subgraph. A recently proposed framework gives a complete classification on ${\cal H}$-subgraph-free graphs (for finite sets ${\cal H}$) for problems that are solvable in polynomial time on graph classes of bounded treewidth, NP-complete on subcub…
▽ More
For a fixed set ${\cal H}$ of graphs, a graph $G$ is ${\cal H}$-subgraph-free if $G$ does not contain any $H \in {\cal H}$ as a (not necessarily induced) subgraph. A recently proposed framework gives a complete classification on ${\cal H}$-subgraph-free graphs (for finite sets ${\cal H}$) for problems that are solvable in polynomial time on graph classes of bounded treewidth, NP-complete on subcubic graphs, and whose NP-hardness is preserved under edge subdivision. While a lot of problems satisfy these conditions, there are also many problems that do not satisfy all three conditions and for which the complexity in ${\cal H}$-subgraph-free graphs is unknown. We study problems for which only the first two conditions of the framework hold (they are solvable in polynomial time on classes of bounded treewidth and NP-complete on subcubic graphs, but NP-hardness is not preserved under edge subdivision). In particular, we make inroads into the classification of the complexity of four such problems: Hamilton Cycle, $k$-Induced Disjoint Paths, $C_5$-Colouring and Star $3$-Colouring. Although we do not complete the classifications, we show that the boundary between polynomial time and NP-complete differs among our problems and also from problems that do satisfy all three conditions of the framework, in particular when we forbid certain subdivisions of the ``H''-graph (the graph that looks like the letter ``H''). Hence, we exhibit a rich complexity landscape among problems for ${\cal H}$-subgraph-free graph classes.
△ Less
Submitted 4 May, 2024; v1 submitted 25 November, 2022;
originally announced November 2022.
-
Complexity Framework For Forbidden Subgraphs I: The Framework
Authors:
Matthew Johnson,
Barnaby Martin,
Jelle J. Oostveen,
Sukanya Pandey,
Daniël Paulusma,
Siani Smith,
Erik Jan van Leeuwen
Abstract:
For any particular class of graphs, algorithms for computational problems restricted to the class often rely on structural properties that depend on the specific problem at hand. This begs the question if a large set of such results can be explained by some common problem conditions. We propose such conditions for $HH$-subgraph-free graphs. For a set of graphs $HH$, a graph $G$ is $HH$-subgraph-fr…
▽ More
For any particular class of graphs, algorithms for computational problems restricted to the class often rely on structural properties that depend on the specific problem at hand. This begs the question if a large set of such results can be explained by some common problem conditions. We propose such conditions for $HH$-subgraph-free graphs. For a set of graphs $HH$, a graph $G$ is $HH$-subgraph-free if $G$ does not contain any of graph from $H$ as a subgraph. Our conditions are easy to state. A graph problem must be efficiently solvable on graphs of bounded treewidth, computationally hard on subcubic graphs, and computational hardness must be preserved under edge subdivision of subcubic graphs. Our meta-classification says that if a graph problem satisfies all three conditions, then for every finite set $HH$, it is ``efficiently solvable'' on $HH$-subgraph-free graphs if $HH$ contains a disjoint union of one or more paths and subdivided claws, and is ``computationally hard'' otherwise. We illustrate the broad applicability of our meta-classification by obtaining a dichotomy between polynomial-time solvability and NP-completeness for many well-known partitioning, covering and packing problems, network design problems and width parameter problems. For other problems, we obtain a dichotomy between almost-linear-time solvability and having no subquadratic-time algorithm (conditioned on some hardness hypotheses). The proposed framework thus gives a simple pathway to determine the complexity of graph problems on $HH$-subgraph-free graphs. This is confirmed even more by the fact that along the way, we uncover and resolve several open questions from the literature.
△ Less
Submitted 20 July, 2023; v1 submitted 23 November, 2022;
originally announced November 2022.
-
Edge Multiway Cut and Node Multiway Cut are NP-complete on subcubic graphs
Authors:
Matthew Johnson,
Barnaby Martin,
Siani Smith,
Sukanya Pandey,
Daniel Paulusma,
Erik Jan van Leeuwen
Abstract:
We show that Edge Multiway Cut (also called Multiterminal Cut) and Node Multiway Cut are NP-complete on graphs of maximum degree $3$ (also known as subcubic graphs). This improves on a previous degree bound of $11$. Our NP-completeness result holds even for subcubic graphs that are planar.
We show that Edge Multiway Cut (also called Multiterminal Cut) and Node Multiway Cut are NP-complete on graphs of maximum degree $3$ (also known as subcubic graphs). This improves on a previous degree bound of $11$. Our NP-completeness result holds even for subcubic graphs that are planar.
△ Less
Submitted 9 February, 2024; v1 submitted 22 November, 2022;
originally announced November 2022.
-
Exchanging Keys with Authentication and Identity Protection for Secure Voice Communication without Side-channel
Authors:
Piotr Krasnowski,
Jerome Lebrun,
Bruno Martin
Abstract:
Motivated by an increasing need for privacy-preserving voice communications, we investigate here the original idea of sending encrypted data and speech in the form of pseudo-speech signals in the audio domain. Being less constrained than military ``Crypto Phones'' and allowing genuine public evaluation, this approach is quite promising for public unsecured voice communication infrastructures, such…
▽ More
Motivated by an increasing need for privacy-preserving voice communications, we investigate here the original idea of sending encrypted data and speech in the form of pseudo-speech signals in the audio domain. Being less constrained than military ``Crypto Phones'' and allowing genuine public evaluation, this approach is quite promising for public unsecured voice communication infrastructures, such as 3G cellular network and VoIP.A cornerstone of secure voice communications is the authenticated exchange of cryptographic keys with sole resource the voice channel, and neither Public Key Infrastructure (PKI) nor Certificate Authority (CA). In this paper, we detail our new robust double authentication mechanism based on signatures and Short Authentication Strings (SAS) ensuring strong authentication between the users while mitigating errors caused by unreliable voice channels and also identity protection against passive eavesdroppers. As symbolic model, our protocol has been formally proof-checked for security and fully validated by Tamarin Prover.
△ Less
Submitted 14 November, 2022;
originally announced November 2022.
-
Complexity Classification Transfer for CSPs via Algebraic Products
Authors:
Manuel Bodirsky,
Peter Jonsson,
Barnaby Martin,
Antoine Mottet,
Žaneta Semanišinová
Abstract:
We study the complexity of infinite-domain constraint satisfaction problems: our basic setting is that a complexity classification for the CSPs of first-order expansions of a structure $\mathfrak A$ can be transferred to a classification of the CSPs of first-order expansions of another structure $\mathfrak B$. We exploit a product of structures (the algebraic product) that corresponds to the produ…
▽ More
We study the complexity of infinite-domain constraint satisfaction problems: our basic setting is that a complexity classification for the CSPs of first-order expansions of a structure $\mathfrak A$ can be transferred to a classification of the CSPs of first-order expansions of another structure $\mathfrak B$. We exploit a product of structures (the algebraic product) that corresponds to the product of the respective polymorphism clones and present a complete complexity classification of the CSPs for first-order expansions of the $n$-fold algebraic power of $(\mathbb{Q};<)$. This is proved by various algebraic and logical methods in combination with knowledge of the polymorphisms of the tractable first-order expansions of $(\mathbb{Q};<)$ and explicit descriptions of the expressible relations in terms of syntactically restricted first-order formulas. By combining our classification result with general classification transfer techniques, we obtain surprisingly strong new classification results for highly relevant formalisms such as Allen's Interval Algebra, the $n$-dimensional Block Algebra, and the Cardinal Direction Calculus, even if higher-arity relations are allowed. Our results confirm the infinite-domain tractability conjecture for classes of structures that have been difficult to analyse with older methods. For the special case of structures with binary signatures, the results can be substantially strengthened and tightly connected to Ord-Horn formulas; this solves several longstanding open problems from the AI literature.
△ Less
Submitted 7 June, 2024; v1 submitted 7 November, 2022;
originally announced November 2022.
-
CyRSoXS: A GPU-accelerated virtual instrument for Polarized Resonant Soft X-ray Scattering (P-RSoXS)
Authors:
Kumar Saurabh,
Peter J. Dudenas,
Eliot Gann,
Veronica G. Reynolds,
Subhrangsu Mukherjee,
Daniel Sunday,
Tyler B. Martin,
Peter A. Beaucage,
Michael L. Chabinyc,
Dean M. DeLongchamp,
Adarsh Krishnamurthy,
Baskar Ganapathysubramanian
Abstract:
Polarized Resonant Soft X-ray scattering (P-RSoXS) has emerged as a powerful synchrotron-based tool that combines principles of X-ray scattering and X-ray spectroscopy. P-RSoXS provides unique sensitivity to molecular orientation and chemical heterogeneity in soft materials such as polymers and biomaterials. Quantitative extraction of orientation information from P-RSoXS pattern data is challengin…
▽ More
Polarized Resonant Soft X-ray scattering (P-RSoXS) has emerged as a powerful synchrotron-based tool that combines principles of X-ray scattering and X-ray spectroscopy. P-RSoXS provides unique sensitivity to molecular orientation and chemical heterogeneity in soft materials such as polymers and biomaterials. Quantitative extraction of orientation information from P-RSoXS pattern data is challenging because the scattering processes originate from sample properties that must be represented as energy-dependent three-dimensional tensors with heterogeneities at nanometer to sub-nanometer length scales. We overcome this challenge by developing an open-source virtual instrument that uses GPUs to simulate P-RSoXS patterns from real-space material representations with nanoscale resolution. Our computational framework CyRSoXS (https://github.com/usnistgov/cyrsoxs) is designed to maximize GPU performance. We demonstrate the accuracy and robustness of our approach by validating against an extensive set of test cases, which include both analytical solutions and numerical comparisons, demonstrating a speedup of over three orders relative to the current state-of-the-art simulation software. Such fast simulations open up a variety of applications that were previously computationally infeasible, including (a) pattern fitting, (b) co-simulation with the physical instrument for operando analytics, data exploration, and decision support, (c) data creation and integration into machine learning workflows, and (d) utilization in multi-modal data assimilation approaches. Finally, we abstract away the complexity of the computational framework from the end-user by exposing CyRSoXS to Python using Pybind. This eliminates I/O requirements for large-scale parameter exploration and inverse design, and democratizes usage by enabling seamless integration with a Python ecosystem (https://github.com/usnistgov/nrss).
△ Less
Submitted 26 September, 2022;
originally announced September 2022.
-
Ontologizing Health Systems Data at Scale: Making Translational Discovery a Reality
Authors:
Tiffany J. Callahan,
Adrianne L. Stefanski,
Jordan M. Wyrwa,
Chenjie Zeng,
Anna Ostropolets,
Juan M. Banda,
William A. Baumgartner Jr.,
Richard D. Boyce,
Elena Casiraghi,
Ben D. Coleman,
Janine H. Collins,
Sara J. Deakyne-Davies,
James A. Feinstein,
Melissa A. Haendel,
Asiyah Y. Lin,
Blake Martin,
Nicolas A. Matentzoglu,
Daniella Meeker,
Justin Reese,
Jessica Sinclair,
Sanya B. Taneja,
Katy E. Trinkley,
Nicole A. Vasilevsky,
Andrew Williams,
Xingman A. Zhang
, et al. (7 additional authors not shown)
Abstract:
Background: Common data models solve many challenges of standardizing electronic health record (EHR) data, but are unable to semantically integrate all the resources needed for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide computable representations of biological knowledge and enable the integration of heterogeneous data. However, mapping EHR data to OB…
▽ More
Background: Common data models solve many challenges of standardizing electronic health record (EHR) data, but are unable to semantically integrate all the resources needed for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide computable representations of biological knowledge and enable the integration of heterogeneous data. However, mapping EHR data to OBO ontologies requires significant manual curation and domain expertise. Objective: We introduce OMOP2OBO, an algorithm for mapping Observational Medical Outcomes Partnership (OMOP) vocabularies to OBO ontologies. Results: Using OMOP2OBO, we produced mappings for 92,367 conditions, 8611 drug ingredients, and 10,673 measurement results, which covered 68-99% of concepts used in clinical practice when examined across 24 hospitals. When used to phenotype rare disease patients, the mappings helped systematically identify undiagnosed patients who might benefit from genetic testing. Conclusions: By aligning OMOP vocabularies to OBO ontologies our algorithm presents new opportunities to advance EHR-based deep phenotyping.
△ Less
Submitted 30 January, 2023; v1 submitted 10 September, 2022;
originally announced September 2022.
-
Few Induced Disjoint Paths for $H$-Free Graphs
Authors:
Barnaby Martin,
Daniël Paulusma,
Siani Smith,
Erik Jan van Leeuwen
Abstract:
Paths $P^1,\ldots,P^k$ in a graph $G=(V,E)$ are mutually induced if any two distinct $P^i$ and $P^j$ have neither common vertices nor adjacent vertices. For a fixed integer $k$, the $k$-Induced Disjoint Paths problem is to decide if a graph $G$ with $k$ pairs of specified vertices $(s_i,t_i)$ contains $k$ mutually induced paths $P^i$ such that each $P^i$ starts from $s_i$ and ends at $t_i$. Wherea…
▽ More
Paths $P^1,\ldots,P^k$ in a graph $G=(V,E)$ are mutually induced if any two distinct $P^i$ and $P^j$ have neither common vertices nor adjacent vertices. For a fixed integer $k$, the $k$-Induced Disjoint Paths problem is to decide if a graph $G$ with $k$ pairs of specified vertices $(s_i,t_i)$ contains $k$ mutually induced paths $P^i$ such that each $P^i$ starts from $s_i$ and ends at $t_i$. Whereas the non-induced version is well-known to be polynomial-time solvable for every fixed integer $k$, a classical result from the literature states that even $2$-Induced Disjoint Paths is NP-complete. We prove new complexity results for $k$-Induced Disjoint Paths if the input is restricted to $H$-free graphs, that is, graphs without a fixed graph $H$ as an induced subgraph. We compare our results with a complexity dichotomy for Induced Disjoint Paths, the variant where $k$ is part of the input.
△ Less
Submitted 13 June, 2022; v1 submitted 7 March, 2022;
originally announced March 2022.
-
Induced Disjoint Paths and Connected Subgraphs for $H$-Free Graphs
Authors:
Barnaby Martin,
Daniël Paulusma,
Siani Smith,
Erik Jan van Leeuwen
Abstract:
Paths $P_1,\ldots, P_k$ in a graph $G=(V,E)$ are mutually induced if any two distinct $P_i$ and $P_j$ have neither common vertices nor adjacent vertices. The Induced Disjoint Paths problem is to decide if a graph $G$ with $k$ pairs of specified vertices $(s_i,t_i)$ contains $k$ mutually induced paths $P_i$ such that each $P_i$ starts from $s_i$ and ends at $t_i$. This is a classical graph problem…
▽ More
Paths $P_1,\ldots, P_k$ in a graph $G=(V,E)$ are mutually induced if any two distinct $P_i$ and $P_j$ have neither common vertices nor adjacent vertices. The Induced Disjoint Paths problem is to decide if a graph $G$ with $k$ pairs of specified vertices $(s_i,t_i)$ contains $k$ mutually induced paths $P_i$ such that each $P_i$ starts from $s_i$ and ends at $t_i$. This is a classical graph problem that is NP-complete even for $k=2$. We introduce a natural generalization, Induced Disjoint Connected Subgraphs: instead of connecting pairs of terminals, we must connect sets of terminals. We give almost-complete dichotomies of the computational complexity of both problems for H-free graphs, that is, graphs that do not contain some fixed graph H as an induced subgraph. Finally, we give a complete classification of the complexity of the second problem if the number k of terminal sets is fixed, that is, not part of the input.
△ Less
Submitted 17 July, 2022; v1 submitted 23 February, 2022;
originally announced February 2022.
-
STIR$^2$: Reward Relabelling for combined Reinforcement and Imitation Learning on sparse-reward tasks
Authors:
Jesus Bujalance Martin,
Fabien Moutarde
Abstract:
In the search for more sample-efficient reinforcement-learning (RL) algorithms, a promising direction is to leverage as much external off-policy data as possible. For instance, expert demonstrations. In the past, multiple ideas have been proposed to make good use of the demonstrations added to the replay buffer, such as pretraining on demonstrations only or minimizing additional cost functions. We…
▽ More
In the search for more sample-efficient reinforcement-learning (RL) algorithms, a promising direction is to leverage as much external off-policy data as possible. For instance, expert demonstrations. In the past, multiple ideas have been proposed to make good use of the demonstrations added to the replay buffer, such as pretraining on demonstrations only or minimizing additional cost functions. We present a new method, able to leverage both demonstrations and episodes collected online in any sparse-reward environment with any off-policy algorithm. Our method is based on a reward bonus given to demonstrations and successful episodes (via relabeling), encouraging expert imitation and self-imitation. Our experiments focus on several robotic-manipulation tasks across two different simulation environments. We show that our method based on reward relabeling improves the performance of the base algorithm (SAC and DDPG) on these tasks. Finally, our best algorithm STIR$^2$ (Self and Teacher Imitation by Reward Relabeling), which integrates into our method multiple improvements from previous works, is more data-efficient than all baselines.
△ Less
Submitted 28 February, 2023; v1 submitted 11 January, 2022;
originally announced January 2022.
-
Colouring Generalized Claw-Free Graphs and Graphs of Large Girth: Bounding the Diameter
Authors:
Barnaby Martin,
Daniel Paulusma,
Siani Smith
Abstract:
For a fixed integer, the $k$-Colouring problem is to decide if the vertices of a graph can be coloured with at most $k$ colours for an integer $k$, such that no two adjacent vertices are coloured alike. A graph $G$ is $H$-free if $G$ does not contain $H$ as an induced subgraph. It is known that for all $k\geq 3$, the $k$-Colouring problem is NP-complete for $H$-free graphs if $H$ contains an induc…
▽ More
For a fixed integer, the $k$-Colouring problem is to decide if the vertices of a graph can be coloured with at most $k$ colours for an integer $k$, such that no two adjacent vertices are coloured alike. A graph $G$ is $H$-free if $G$ does not contain $H$ as an induced subgraph. It is known that for all $k\geq 3$, the $k$-Colouring problem is NP-complete for $H$-free graphs if $H$ contains an induced claw or cycle. The case where $H$ contains a cycle follows from the known result that the problem is NP-complete even for graphs of arbitrarily large fixed girth. We examine to what extent the situation may change if in addition the input graph has bounded diameter.
△ Less
Submitted 23 November, 2021;
originally announced November 2021.
-
Learning from demonstrations with SACR2: Soft Actor-Critic with Reward Relabeling
Authors:
Jesus Bujalance Martin,
Raphael Chekroun,
Fabien Moutarde
Abstract:
During recent years, deep reinforcement learning (DRL) has made successful incursions into complex decision-making applications such as robotics, autonomous driving or video games. Off-policy algorithms tend to be more sample-efficient than their on-policy counterparts, and can additionally benefit from any off-policy data stored in the replay buffer. Expert demonstrations are a popular source for…
▽ More
During recent years, deep reinforcement learning (DRL) has made successful incursions into complex decision-making applications such as robotics, autonomous driving or video games. Off-policy algorithms tend to be more sample-efficient than their on-policy counterparts, and can additionally benefit from any off-policy data stored in the replay buffer. Expert demonstrations are a popular source for such data: the agent is exposed to successful states and actions early on, which can accelerate the learning process and improve performance. In the past, multiple ideas have been proposed to make good use of the demonstrations in the buffer, such as pretraining on demonstrations only or minimizing additional cost functions. We carry on a study to evaluate several of these ideas in isolation, to see which of them have the most significant impact. We also present a new method for sparse-reward tasks, based on a reward bonus given to demonstrations and successful episodes. First, we give a reward bonus to the transitions coming from demonstrations to encourage the agent to match the demonstrated behaviour. Then, upon collecting a successful episode, we relabel its transitions with the same bonus before adding them to the replay buffer, encouraging the agent to also match its previous successes. The base algorithm for our experiments is the popular Soft Actor-Critic (SAC), a state-of-the-art off-policy algorithm for continuous action spaces. Our experiments focus on manipulation robotics, specifically on a 3D reaching task for a robotic arm in simulation. We show that our method SACR2 based on reward relabeling improves the performance on this task, even in the absence of demonstrations.
△ Less
Submitted 3 December, 2021; v1 submitted 27 October, 2021;
originally announced October 2021.
-
Predictive Geological Mapping with Convolution Neural Network Using Statistical Data Augmentation on a 3D Model
Authors:
Cedou Matthieu,
Gloaguen Erwan,
Blouin Martin,
Caté Antoine,
Paiement Jean-Philippe,
Tirdad Shiva
Abstract:
Airborne magnetic data are commonly used to produce preliminary geological maps. Machine learning has the potential to partly fulfill this task rapidly and objectively, as geological mapping is comparable to a semantic segmentation problem. Because this method requires a high-quality dataset, we developed a data augmentation workflow that uses a 3D geological and magnetic susceptibility model as i…
▽ More
Airborne magnetic data are commonly used to produce preliminary geological maps. Machine learning has the potential to partly fulfill this task rapidly and objectively, as geological mapping is comparable to a semantic segmentation problem. Because this method requires a high-quality dataset, we developed a data augmentation workflow that uses a 3D geological and magnetic susceptibility model as input. The workflow uses soft-constrained Multi-Point Statistics, to create many synthetic 3D geological models, and Sequential Gaussian Simulation algorithms, to populate the models with the appropriate magnetic distribution. Then, forward modeling is used to compute the airborne magnetic responses of the synthetic models, which are associated with their counterpart surficial lithologies. A Gated Shape Convolutional Neural Network algorithm was trained on a generated synthetic dataset to perform geological mapping of airborne magnetic data and detect lithological contacts. The algorithm also provides attention maps highlighting the structures at different scales, and clustering was applied to its high-level features to do a semi-supervised segmentation of the area. The validation conducted on a portion of the synthetic dataset and data from adjacent areas shows that the methodology is suitable to segment the surficial geology using airborne magnetic data. Especially, the clustering shows a good segmentation of the magnetic anomalies into a pertinent geological map. Moreover, the first attention map isolates the structures at low scales and shows a pertinent representation of the original data. Thus, our method can be used to produce preliminary geological maps of good quality and new representations of any area where a geological and petrophysical 3D model exists, or in areas sharing the same geological context, using airborne magnetic data only.
△ Less
Submitted 27 October, 2021;
originally announced October 2021.
-
The complexity of quantified constraints: collapsibility, switchability and the algebraic formulation
Authors:
Catarina Carvalho,
Florent Madelaine,
Barnaby Martin,
Dmitriy Zhuk
Abstract:
Let A be an idempotent algebra on a finite domain. By mediating between results of Chen and Zhuk, we argue that if A satisfies the polynomially generated powers property (PGP) and B is a constraint language invariant under A (that is, in Inv(A)), then QCSP(B) is in NP. In doing this we study the special forms of PGP, switchability and collapsibility, in detail, both algebraically and logically, ad…
▽ More
Let A be an idempotent algebra on a finite domain. By mediating between results of Chen and Zhuk, we argue that if A satisfies the polynomially generated powers property (PGP) and B is a constraint language invariant under A (that is, in Inv(A)), then QCSP(B) is in NP. In doing this we study the special forms of PGP, switchability and collapsibility, in detail, both algebraically and logically, addressing various questions such as decidability on the way.
We then prove a complexity-theoretic converse in the case of infinite constraint languages encoded in propositional logic, that if Inv(A) satisfies the exponentially generated powers property (EGP), then QCSP(Inv(A)) is co-NP-hard. Since Zhuk proved that only PGP and EGP are possible, we derive a full dichotomy for the QCSP, justifying what we term the Revised Chen Conjecture. This result becomes more significant now the original Chen Conjecture is known to be false.
Switchability was introduced by Chen as a generalisation of the already-known collapsibility. For three-element domain algebras A that are switchable and omit a G-set, we prove that, for every finite subset D of Inv(A), Pol(D) is collapsible. The significance of this is that, for QCSP on finite structures (over a three-element domain), all QCSP tractability (in P) explained by switchability is already explained by collapsibility.
△ Less
Submitted 24 June, 2021;
originally announced June 2021.
-
Optimizing Graph Transformer Networks with Graph-based Techniques
Authors:
Loc Hoang,
Udit Agarwal,
Gurbinder Gill,
Roshan Dathathri,
Abhik Seal,
Brian Martin,
Keshav Pingali
Abstract:
Graph transformer networks (GTN) are a variant of graph convolutional networks (GCN) that are targeted to heterogeneous graphs in which nodes and edges have associated type information that can be exploited to improve inference accuracy. GTNs learn important metapaths in the graph, create weighted edges for these metapaths, and use the resulting graph in a GCN. Currently, the only available implem…
▽ More
Graph transformer networks (GTN) are a variant of graph convolutional networks (GCN) that are targeted to heterogeneous graphs in which nodes and edges have associated type information that can be exploited to improve inference accuracy. GTNs learn important metapaths in the graph, create weighted edges for these metapaths, and use the resulting graph in a GCN. Currently, the only available implementation of GTNs uses dense matrix multiplication to find metapaths. Unfortunately, the space overhead of this approach can be large, so in practice it is used only for small graphs. In addition, the matrix-based implementation is not fine-grained enough to use random-walk based methods to optimize metapath finding. In this paper, we present a graph-based formulation and implementation of the GTN metapath finding problem. This graph-based formulation has two advantages over the matrix-based approach. First, it is more space efficient than the original GTN implementation and more compute-efficient for metapath sizes of practical interest. Second, it permits us to implement a sampling method that reduces the number of metapaths that must be enumerated, allowing the implementation to be used for larger graphs and larger metapath sizes. Experimental results show that our implementation is $6.5\times$ faster than the original GTN implementation on average for a metapath length of 4, and our sampling implementation is $155\times$ faster on average than this implementation without compromising on the accuracy of the GTN.
△ Less
Submitted 15 June, 2021;
originally announced June 2021.
-
Towards a Sample Efficient Reinforcement Learning Pipeline for Vision Based Robotics
Authors:
Maxence Mahe,
Pierre Belamri,
Jesus Bujalance Martin
Abstract:
Deep Reinforcement learning holds the guarantee of empowering self-ruling robots to master enormous collections of conduct abilities with negligible human mediation. The improvements brought by this technique enables robots to perform difficult tasks such as grabbing or reaching targets. Nevertheless, the training process is still time consuming and tedious especially when learning policies only w…
▽ More
Deep Reinforcement learning holds the guarantee of empowering self-ruling robots to master enormous collections of conduct abilities with negligible human mediation. The improvements brought by this technique enables robots to perform difficult tasks such as grabbing or reaching targets. Nevertheless, the training process is still time consuming and tedious especially when learning policies only with RGB camera information. This way of learning is capital to transfer the task from simulation to the real world since the only external source of information for the robot in real life is video. In this paper, we study how to limit the time taken for training a robotic arm with 6 Degrees Of Freedom (DOF) to reach a ball from scratch by assembling a pipeline as efficient as possible. The pipeline is divided into two parts: the first one is to capture the relevant information from the RGB video with a Computer Vision algorithm. The second one studies how to train faster a Deep Reinforcement Learning algorithm in order to make the robotic arm reach the target in front of him. Follow this link to find videos and plots in higher resolution: \url{https://drive.google.com/drive/folders/1_lRlDSoPzd_GTcVrxNip10o_lm-_DPdn?usp=sharing}
△ Less
Submitted 20 May, 2021;
originally announced May 2021.
-
Disjoint Paths and Connected Subgraphs for H-Free Graphs
Authors:
Walter Kern,
Barnaby Martin,
Daniël Paulusma,
Siani Smith,
Erik Jan van Leeuwen
Abstract:
The well-known Disjoint Paths problem is to decide if a graph contains k pairwise disjoint paths, each connecting a different terminal pair from a set of k distinct pairs. We determine, with an exception of two cases, the complexity of the Disjoint Paths problem for $H$-free graphs. If $k$ is fixed, we obtain the $k$-Disjoint Paths problem, which is known to be polynomial-time solvable on the clas…
▽ More
The well-known Disjoint Paths problem is to decide if a graph contains k pairwise disjoint paths, each connecting a different terminal pair from a set of k distinct pairs. We determine, with an exception of two cases, the complexity of the Disjoint Paths problem for $H$-free graphs. If $k$ is fixed, we obtain the $k$-Disjoint Paths problem, which is known to be polynomial-time solvable on the class of all graphs for every $k \geq 1$. The latter does no longer hold if we need to connect vertices from terminal sets instead of terminal pairs. We completely classify the complexity of $k$-Disjoint Connected Subgraphs for $H$-free graphs, and give the same almost-complete classification for Disjoint Connected Subgraphs for $H$-free graphs as for Disjoint Paths.
△ Less
Submitted 13 May, 2021;
originally announced May 2021.
-
Partitioning H-Free Graphs of Bounded Diameter
Authors:
Christoph Brause,
Petr Golovach,
Barnaby Martin,
Daniël Paulusma,
Siani Smith
Abstract:
A natural way of increasing our understanding of NP-complete graph problems is to restrict the input to a special graph class. Classes of $H$-free graphs, that is, graphs that do not contain some graph $H$ as an induced subgraph, have proven to be an ideal testbed for such a complexity study. However, if the forbidden graph $H$ contains a cycle or claw, then these problems often stay NP-complete.…
▽ More
A natural way of increasing our understanding of NP-complete graph problems is to restrict the input to a special graph class. Classes of $H$-free graphs, that is, graphs that do not contain some graph $H$ as an induced subgraph, have proven to be an ideal testbed for such a complexity study. However, if the forbidden graph $H$ contains a cycle or claw, then these problems often stay NP-complete. A recent complexity study on the $k$-Colouring problem shows that we may still obtain tractable results if we also bound the diameter of the $H$-free input graph. We continue this line of research by initiating a complexity study on the impact of bounding the diameter for a variety of classical vertex partitioning problems restricted to $H$-free graphs. We prove that bounding the diameter does not help for Independent Set, but leads to new tractable cases for problems closely related to 3-Colouring. That is, we show that Near-Bipartiteness, Independent Feedback Vertex Set, Independent Odd Cycle Transversal, Acyclic 3-Colouring and Star 3-Colouring are all polynomial-time solvable for chair-free graphs of bounded diameter. To obtain these results we exploit a new structural property of 3-colourable chair-free graphs.
△ Less
Submitted 18 April, 2022; v1 submitted 10 May, 2021;
originally announced May 2021.
-
Acyclic, Star, and Injective Colouring: Bounding the Diameter
Authors:
Christoph Brause,
Petr Golovach,
Barnaby Martin,
Pascal Ochem,
Daniël Paulusma,
Siani Smith
Abstract:
We examine the effect of bounding the diameter for well-studied variants of the Colouring problem. A colouring is acyclic, star, or injective if any two colour classes induce a forest, star forest or disjoint union of vertices and edges, respectively. The corresponding decision problems are Acyclic Colouring, Star Colouring and Injective Colouring. The last problem is also known as $L(1,1)$-Labell…
▽ More
We examine the effect of bounding the diameter for well-studied variants of the Colouring problem. A colouring is acyclic, star, or injective if any two colour classes induce a forest, star forest or disjoint union of vertices and edges, respectively. The corresponding decision problems are Acyclic Colouring, Star Colouring and Injective Colouring. The last problem is also known as $L(1,1)$-Labelling and we also consider the framework of $L(a,b)$-Labelling. We prove a number of (almost-)complete complexity classifications. In particular, we show that for graphs of diameter at most $d$, Acyclic $3$-Colouring is polynomial-time solvable if $d\leq 2$ but NP-complete if $d\geq 4$, and Star $3$-Colouring is polynomial-time solvable if $d\leq 3$ but NP-complete for $d\geq 8$. As far as we are aware, Star $3$-Colouring is the first problem that exhibits a complexity jump for some $d\geq 3$. Our third main result is that $L(1,2)$-Labelling is NP-complete for graphs of diameter $2$; we relate the latter problem to a special case of Hamiltonian Path.
△ Less
Submitted 8 September, 2021; v1 submitted 21 April, 2021;
originally announced April 2021.
-
QCSP on Reflexive Tournaments
Authors:
Benoit Larose,
Petar Markovic,
Barnaby Martin,
Daniel Paulusma,
Siani Smith,
Stanislav Zivny
Abstract:
We give a complexity dichotomy for the Quantified Constraint Satisfaction Problem QCSP(H) when H is a reflexive tournament. It is well-known that reflexive tournaments can be split into a sequence of strongly connected components H_1,...,H_n so that there exists an edge from every vertex of H_i to every vertex of H_j if and only if i<j. We prove that if H has both its initial and final strongly co…
▽ More
We give a complexity dichotomy for the Quantified Constraint Satisfaction Problem QCSP(H) when H is a reflexive tournament. It is well-known that reflexive tournaments can be split into a sequence of strongly connected components H_1,...,H_n so that there exists an edge from every vertex of H_i to every vertex of H_j if and only if i<j. We prove that if H has both its initial and final strongly connected component (possibly equal) of size 1, then QCSP(H) is in NL and otherwise QCSP(H) is NP-hard.
△ Less
Submitted 28 December, 2021; v1 submitted 21 April, 2021;
originally announced April 2021.
-
The complete classification for quantified equality constraints
Authors:
Dmitriy Zhuk,
Barnaby Martin,
Michal Wrona
Abstract:
We prove that QCSP$(\mathbb{N};x=y\rightarrow y=z)$ is PSpace-complete, settling a question open for more than ten years. This completes the complexity classification for the QCSP over equality languages as a trichotomy between Logspace, NP-complete and PSpace-complete. We additionally settle the classification for bounded alternation QCSP$(Γ)$, for $Γ$ an equality language. Such problems are eith…
▽ More
We prove that QCSP$(\mathbb{N};x=y\rightarrow y=z)$ is PSpace-complete, settling a question open for more than ten years. This completes the complexity classification for the QCSP over equality languages as a trichotomy between Logspace, NP-complete and PSpace-complete. We additionally settle the classification for bounded alternation QCSP$(Γ)$, for $Γ$ an equality language. Such problems are either in Logspace, NP-complete, co-NP-complete or rise in complexity in the Polynomial Hierarchy.
△ Less
Submitted 4 August, 2022; v1 submitted 1 April, 2021;
originally announced April 2021.
-
Introducing a Novel Data over Voice Technique for Secure Voice Communication
Authors:
Piotr Krasnowski,
Jerome Lebrun,
Bruno Martin
Abstract:
The current increasing need for privacy-preserving voice communications is leading to new ideas for securing voice transmission. This paper refers to a relatively new concept of sending encrypted data or speech as pseudo-speech in the audio domain over existing voice communication infrastructures, like 3G cellular network and Voice over IP (VoIP). The distinctive characteristic of such a communica…
▽ More
The current increasing need for privacy-preserving voice communications is leading to new ideas for securing voice transmission. This paper refers to a relatively new concept of sending encrypted data or speech as pseudo-speech in the audio domain over existing voice communication infrastructures, like 3G cellular network and Voice over IP (VoIP). The distinctive characteristic of such a communication system is that it relies on the robust transmission of binary information in the form of audio signal. This work presents a novel Data over Voice (DoV) technique based on codebooks of short harmonic waveforms. The technique provides a sufficiently fast and reliable data rate over cellular networks and many VoIP applications. The new method relies on general principles of Linear Predictive Coding for voice compression (LPC voice coding) and is more versatile compared to solutions trained on exact channel models. The technique gives by design a high control over the desired rate of transmission and provides robustness to channel distortion. In addition, an efficient codebook design approach inspired by quaternary error correcting codes is proposed. The usability of the proposed DoV technique for secure voice communication over cellular networks and VoIP has been successfully validated by empirical experiments. The paper details the system parameters, putting a special emphasis on system's security and technical challenges.
△ Less
Submitted 1 February, 2022; v1 submitted 22 February, 2021;
originally announced February 2021.
-
Introducing an experimental distortion-tolerant speech encryption scheme for secure voice communication
Authors:
Piotr Krasnowski,
Jerome Lebrun,
Bruno Martin
Abstract:
The current increasing need for privacy-preserving voice communications is leading to new ideas for securing voice transmission. This paper refers to a relatively new concept of sending encrypted speech as pseudo-speech in the audio domain over digital voice communication infrastructures, like 3G cellular network and VoIP.
This work presents a novel distortion-tolerant speech encryption scheme f…
▽ More
The current increasing need for privacy-preserving voice communications is leading to new ideas for securing voice transmission. This paper refers to a relatively new concept of sending encrypted speech as pseudo-speech in the audio domain over digital voice communication infrastructures, like 3G cellular network and VoIP.
This work presents a novel distortion-tolerant speech encryption scheme for secure voice communications over voice channels that combines the robustness of analog speech scrambling and elevated security offered by digital ciphers like AES-CTR. The system scrambles vocal parameters of a speech signal (loudness, pitch, timbre) using distance-preserving pseudo-random translations and rotations on a hypersphere of parameters. Next, scrambled parameters are encoded to a pseudo-speech signal adapted to transmission over digital voice channels equipped with voice activity detection. Upon reception of this pseudo-speech signal, the legitimate receiver restores distorted copies of the initial vocal parameters. Despite some deciphering errors, an integrated neural-based vocoder based on the LPCNet architecture reconstructs an intelligible speech.
The experimental implementation of this speech encryption scheme has been tested by simulations and sending an encrypted signal over FaceTime between two iPhones 6 connected to the same WiFi network. Moreover, speech excerpts restored from encrypted signals were evaluated by a speech quality assessment on a group of about 40 participants. The experiments demonstrated that the proposed scheme produces intelligible speech with a gracefully progressive quality degradation depending on the channel noise. Finally, the preliminary computational analysis suggested that the presented setting may operate on high-end portable devices in nearly real-time.
△ Less
Submitted 19 February, 2021;
originally announced February 2021.
-
Depth lower bounds in Stabbing Planes for combinatorial principles
Authors:
Stefan Dantchev,
Nicola Galesi,
Abdul Ghani,
Barnaby Martin
Abstract:
Stabbing Planes (also known as Branch and Cut) is a proof system introduced very recently which, informally speaking, extends the DPLL method by branching on integer linear inequalities instead of single variables. The techniques known so far to prove size and depth lower bounds for Stabbing Planes are generalizations of those used for the Cutting Planes proof system. For size lower bounds these a…
▽ More
Stabbing Planes (also known as Branch and Cut) is a proof system introduced very recently which, informally speaking, extends the DPLL method by branching on integer linear inequalities instead of single variables. The techniques known so far to prove size and depth lower bounds for Stabbing Planes are generalizations of those used for the Cutting Planes proof system. For size lower bounds these are established by monotone circuit arguments, while for depth these are found via communication complexity and protection. As such these bounds apply for lifted versions of combinatorial statements. Rank lower bounds for Cutting Planes are also obtained by geometric arguments called protection lemmas.
In this work we introduce two new geometric approaches to prove size/depth lower bounds in Stabbing Planes working for any formula: (1) the antichain method, relying on Sperner's Theorem and (2) the covering method which uses results on essential coverings of the boolean cube by linear polynomials, which in turn relies on Alon's combinatorial Nullenstellensatz.
We demonstrate their use on classes of combinatorial principles such as the Pigeonhole principle, the Tseitin contradictions and the Linear Ordering Principle. By the first method we prove almost linear size lower bounds and optimal logarithmic depth lower bounds for the Pigeonhole principle and analogous lower bounds for the Tseitin contradictions over the complete graph and for the Linear Ordering Principle. By the covering method we obtain a superlinear size lower bound and a logarithmic depth lower bound for Stabbing Planes proof of Tseitin contradictions over a grid graph.
△ Less
Submitted 10 January, 2024; v1 submitted 15 February, 2021;
originally announced February 2021.
-
Colouring Graphs of Bounded Diameter in the Absence of Small Cycles
Authors:
Barnaby Martin,
Daniel Paulusma,
Siani Smith
Abstract:
For $k\geq 1$, a $k$-colouring $c$ of $G$ is a mapping from $V(G)$ to $\{1,2,\ldots,k\}$ such that $c(u)\neq c(v)$ for any two non-adjacent vertices $u$ and $v$. The $k$-Colouring problem is to decide if a graph $G$ has a $k$-colouring. For a family of graphs ${\cal H}$, a graph $G$ is ${\cal H}$-free if $G$ does not contain any graph from ${\cal H}$ as an induced subgraph. Let $C_s$ be the $s$-ve…
▽ More
For $k\geq 1$, a $k$-colouring $c$ of $G$ is a mapping from $V(G)$ to $\{1,2,\ldots,k\}$ such that $c(u)\neq c(v)$ for any two non-adjacent vertices $u$ and $v$. The $k$-Colouring problem is to decide if a graph $G$ has a $k$-colouring. For a family of graphs ${\cal H}$, a graph $G$ is ${\cal H}$-free if $G$ does not contain any graph from ${\cal H}$ as an induced subgraph. Let $C_s$ be the $s$-vertex cycle. In previous work (MFCS 2019) we examined the effect of bounding the diameter on the complexity of $3$-Colouring for $(C_3,\ldots,C_s)$-free graphs and $H$-free graphs where $H$ is some polyad. Here, we prove for certain small values of $s$ that $3$-Colouring is polynomial-time solvable for $C_s$-free graphs of diameter $2$ and $(C_4,C_s)$-free graphs of diameter $2$. In fact, our results hold for the more general problem List $3$-Colouring. We complement these results with some hardness result for diameter $4$.
△ Less
Submitted 19 January, 2021;
originally announced January 2021.
-
Hard Problems That Quickly Become Very Easy
Authors:
Barnaby Martin,
Daniël Paulusma,
Siani Smith
Abstract:
A graph class is hereditary if it is closed under vertex deletion. We give examples of NP-hard, PSPACE-complete and NEXPTIME-complete problems that become constant-time solvable for every hereditary graph class that is not equal to the class of all graphs.
A graph class is hereditary if it is closed under vertex deletion. We give examples of NP-hard, PSPACE-complete and NEXPTIME-complete problems that become constant-time solvable for every hereditary graph class that is not equal to the class of all graphs.
△ Less
Submitted 5 August, 2021; v1 submitted 17 December, 2020;
originally announced December 2020.
-
The complexity of L(p,q)-Edge-Labelling
Authors:
Gaetan Berthe,
Barnaby Martin,
Daniel Paulusma,
Siani Smith
Abstract:
We consider the L(p,q)-Edge-Labelling problem, which is the edge variant of the well-known L(p,q)-Labelling problem. So far, the complexity of this problem was only partially classified. We complete this study for all nonnegative p and q, by showing that, whenever (p,q) is not (0,0), L(p,q)-Edge-Labelling problem is NP-complete. We do this by proving that for all nonnegative p and q, except p=q=0,…
▽ More
We consider the L(p,q)-Edge-Labelling problem, which is the edge variant of the well-known L(p,q)-Labelling problem. So far, the complexity of this problem was only partially classified. We complete this study for all nonnegative p and q, by showing that, whenever (p,q) is not (0,0), L(p,q)-Edge-Labelling problem is NP-complete. We do this by proving that for all nonnegative p and q, except p=q=0, there exists an integer k so that L(p,q)-Edge-k-Labelling is NP-complete.
△ Less
Submitted 23 May, 2022; v1 submitted 27 August, 2020;
originally announced August 2020.
-
Acyclic, Star and Injective Colouring: A Complexity Picture for H-Free Graphs
Authors:
Jan Bok,
Nikola Jedlickova,
Barnaby Martin,
Pascal Ochem,
Daniel Paulusma,
Siani Smith
Abstract:
A (proper) colouring is acyclic, star, or injective if any two colour classes induce a forest, star forest or disjoint union of vertices and edges, respectively. Hence, every injective colouring is a star colouring and every star colouring is an acyclic colouring. The corresponding decision problems are Acyclic Colouring, Star Colouring and Injective Colouring (the last problem is also known as…
▽ More
A (proper) colouring is acyclic, star, or injective if any two colour classes induce a forest, star forest or disjoint union of vertices and edges, respectively. Hence, every injective colouring is a star colouring and every star colouring is an acyclic colouring. The corresponding decision problems are Acyclic Colouring, Star Colouring and Injective Colouring (the last problem is also known as $L(1,1)$-Labelling). A classical complexity result on Colouring is a well-known dichotomy for $H$-free graphs (a graph is $H$-free if it does not contain $H$ as an induced subgraph). In contrast, there is no systematic study into the computational complexity of Acyclic Colouring, Star Colouring and Injective Colouring despite numerous algorithmic and structural results that have appeared over the years. We perform such a study and give almost complete complexity classifications for Acyclic Colouring, Star Colouring and Injective Colouring on $H$-free graphs (for each of the problems, we have one open case). Moreover, we give full complexity classifications if the number of colours $k$ is fixed, that is, not part of the input. From our study it follows that for fixed $k$ the three problems behave in the same way, but this is no longer true if $k$ is part of the input. To obtain several of our results we prove stronger complexity results that in particular involve the girth of a graph and the class of line graphs of multigraphs.
△ Less
Submitted 10 May, 2021; v1 submitted 21 August, 2020;
originally announced August 2020.
-
Proof complexity and the binary encoding of combinatorial principles
Authors:
Stefan Dantchev,
Nicola Galesi,
Abdul Ghani,
Barnaby Martin
Abstract:
We consider Proof Complexity in light of the unusual binary encoding of certain combinatorial principles. We contrast this Proof Complexity with the normal unary encoding in several refutation systems, based on Resolution and Integer Linear Programming. Please consult the article for the full abstract.
We consider Proof Complexity in light of the unusual binary encoding of certain combinatorial principles. We contrast this Proof Complexity with the normal unary encoding in several refutation systems, based on Resolution and Integer Linear Programming. Please consult the article for the full abstract.
△ Less
Submitted 5 April, 2022; v1 submitted 4 August, 2020;
originally announced August 2020.
-
A Methodology to Assess the Human Factors Associated with Lunar Teleoperated Assembly Tasks
Authors:
Arun Kumar,
Mason Bell,
Benjamin Mellinkoff,
Alex Sandoval,
Wendy Bailey Martin,
Jack Burns
Abstract:
Low-latency telerobotics can enable more intricate surface tasks on extraterrestrial planetary bodies than has ever been attempted. For humanity to create a sustainable lunar presence, well-developed collaboration between humans and robots is necessary to perform complex tasks. This paper presents a methodology to assess the human factors, situational awareness (SA) and cognitive load (CL), associ…
▽ More
Low-latency telerobotics can enable more intricate surface tasks on extraterrestrial planetary bodies than has ever been attempted. For humanity to create a sustainable lunar presence, well-developed collaboration between humans and robots is necessary to perform complex tasks. This paper presents a methodology to assess the human factors, situational awareness (SA) and cognitive load (CL), associated with teleoperated assembly tasks. Currently, telerobotic assembly on an extraterrestrial body has never been attempted, and a valid methodology to assess the associated human factors has not been developed. The Telerobotics Laboratory at the University of Colorado-Boulder created the Telerobotic Simulation System (TSS) which enables remote operation of a rover and a robotic arm. The TSS was used in a laboratory experiment designed as an analog to a lunar mission. The operator's task was to assemble a radio interferometer. Each participant completed this task under two conditions, remote teleoperation (limited SA) and local operation (optimal SA). The goal of the experiment was to establish a methodology to accurately measure the operator's SA and CL while performing teleoperated assembly tasks. A successful methodology would yield results showing greater SA and lower CL while operating locally. Performance metrics showed greater SA and lower CL in the local environment, supported by a 27% increase in the mean time to completion of the assembly task when operating remotely. Subjective measurements of SA and CL did not align with the performance metrics. Results from this experiment will guide future work attempting to accurately quantify the human factors associated with telerobotic assembly. Once an accurate methodology has been developed, we will be able to measure how new variables affect an operator's SA and CL to optimize the efficiency and effectiveness of telerobotic assembly tasks.
△ Less
Submitted 16 May, 2020;
originally announced May 2020.
-
Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims
Authors:
Miles Brundage,
Shahar Avin,
Jasmine Wang,
Haydn Belfield,
Gretchen Krueger,
Gillian Hadfield,
Heidy Khlaaf,
Jingying Yang,
Helen Toner,
Ruth Fong,
Tegan Maharaj,
Pang Wei Koh,
Sara Hooker,
Jade Leung,
Andrew Trask,
Emma Bluemke,
Jonathan Lebensold,
Cullen O'Keefe,
Mark Koren,
Théo Ryffel,
JB Rubinovitz,
Tamay Besiroglu,
Federica Carugati,
Jack Clark,
Peter Eckersley
, et al. (34 additional authors not shown)
Abstract:
With the recent wave of progress in artificial intelligence (AI) has come a growing awareness of the large-scale impacts of AI systems, and recognition that existing regulations and norms in industry and academia are insufficient to ensure responsible AI development. In order for AI developers to earn trust from system users, customers, civil society, governments, and other stakeholders that they…
▽ More
With the recent wave of progress in artificial intelligence (AI) has come a growing awareness of the large-scale impacts of AI systems, and recognition that existing regulations and norms in industry and academia are insufficient to ensure responsible AI development. In order for AI developers to earn trust from system users, customers, civil society, governments, and other stakeholders that they are building AI responsibly, they will need to make verifiable claims to which they can be held accountable. Those outside of a given organization also need effective means of scrutinizing such claims. This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems and their associated development processes, with a focus on providing evidence about the safety, security, fairness, and privacy protection of AI systems. We analyze ten mechanisms for this purpose--spanning institutions, software, and hardware--and make recommendations aimed at implementing, exploring, or improving those mechanisms.
△ Less
Submitted 20 April, 2020; v1 submitted 15 April, 2020;
originally announced April 2020.
-
Sherali-Adams and the binary encoding of combinatorial principles
Authors:
Stefan Dantchev,
Abdul Ghani,
Barnaby Martin
Abstract:
We consider the Sherali-Adams (SA) refutation system together with the unusual binary encoding of certain combinatorial principles. For the unary encoding of the Pigeonhole Principle and the Least Number Principle, it is known that linear rank is required for refutations in SA, although both admit refutations of polynomial size. We prove that the binary encoding of the Pigeonhole Principle require…
▽ More
We consider the Sherali-Adams (SA) refutation system together with the unusual binary encoding of certain combinatorial principles. For the unary encoding of the Pigeonhole Principle and the Least Number Principle, it is known that linear rank is required for refutations in SA, although both admit refutations of polynomial size. We prove that the binary encoding of the Pigeonhole Principle requires exponentially-sized SA refutations, whereas the binary encoding of the Least Number Principle admits logarithmic rank, polynomially-sized SA refutations. We continue by considering a refutation system between SA and Lasserre (Sum-of-Squares). In this system, the Least Number Principle requires linear rank while the Pigeonhole Principle becomes constant rank.
△ Less
Submitted 1 November, 2019;
originally announced November 2019.
-
Marine Mammal Species Classification using Convolutional Neural Networks and a Novel Acoustic Representation
Authors:
Mark Thomas,
Bruce Martin,
Katie Kowarski,
Briand Gaudet,
Stan Matwin
Abstract:
Research into automated systems for detecting and classifying marine mammals in acoustic recordings is expanding internationally due to the necessity to analyze large collections of data for conservation purposes. In this work, we present a Convolutional Neural Network that is capable of classifying the vocalizations of three species of whales, non-biological sources of noise, and a fifth class pe…
▽ More
Research into automated systems for detecting and classifying marine mammals in acoustic recordings is expanding internationally due to the necessity to analyze large collections of data for conservation purposes. In this work, we present a Convolutional Neural Network that is capable of classifying the vocalizations of three species of whales, non-biological sources of noise, and a fifth class pertaining to ambient noise. In this way, the classifier is capable of detecting the presence and absence of whale vocalizations in an acoustic recording. Through transfer learning, we show that the classifier is capable of learning high-level representations and can generalize to additional species. We also propose a novel representation of acoustic signals that builds upon the commonly used spectrogram representation by way of interpolating and stacking multiple spectrograms produced using different Short-time Fourier Transform (STFT) parameters. The proposed representation is particularly effective for the task of marine mammal species classification where the acoustic events we are attempting to classify are sensitive to the parameters of the STFT.
△ Less
Submitted 30 July, 2019;
originally announced July 2019.
-
QCSP monsters and the demise of the Chen Conjecture
Authors:
Dmitriy Zhuk,
Barnaby Martin
Abstract:
We give a surprising classification for the computational complexity of the Quantified Constraint Satisfaction Problem over a constraint language $Γ$, QCSP$(Γ)$, where $Γ$ is a finite language over $3$ elements which contains all constants. In particular, such problems are either in P, NP-complete, co-NP-complete or PSpace-complete. Our classification refutes the hitherto widely-believed Chen Conj…
▽ More
We give a surprising classification for the computational complexity of the Quantified Constraint Satisfaction Problem over a constraint language $Γ$, QCSP$(Γ)$, where $Γ$ is a finite language over $3$ elements which contains all constants. In particular, such problems are either in P, NP-complete, co-NP-complete or PSpace-complete. Our classification refutes the hitherto widely-believed Chen Conjecture.
Additionally, we show that already on a 4-element domain there exists a constraint language $Γ$ such that QCSP$(Γ)$ is DP-complete (from Boolean Hierarchy), and on a 10-element domain there exists a constraint language giving the complexity class $Θ_{2}^{P}$.
Meanwhile, we prove the Chen Conjecture for finite conservative languages $Γ$. If the polymorphism clone of $Γ$ has the polynomially generated powers (PGP) property then QCSP$(Γ)$ is in NP. Otherwise, the polymorphism clone of $Γ$ has the exponentially generated powers (EGP) property and QCSP$(Γ)$ is PSpace-complete.
△ Less
Submitted 27 July, 2022; v1 submitted 29 June, 2019;
originally announced July 2019.
-
Pointing task on smart glasses: Comparison of four interaction techniques
Authors:
Ilyasse Belkacem,
Isabelle Pecci,
Benoît Martin
Abstract:
Mobile devices such as smartphones, smartwatches or smart glasses have revolutionized how we interact. We are interested in smart glasses because they have the advantage of providing a simultaneous view of both physical and digital worlds. Despite this potential, pointing task on smart glasses is not really widespread. In this paper, we compared four interaction techniques for selecting targets :…
▽ More
Mobile devices such as smartphones, smartwatches or smart glasses have revolutionized how we interact. We are interested in smart glasses because they have the advantage of providing a simultaneous view of both physical and digital worlds. Despite this potential, pointing task on smart glasses is not really widespread. In this paper, we compared four interaction techniques for selecting targets : (a) the Absolute Head Movement and (b) the Relative Head Movement, where head movement controls the cursor on smart glasses in absolute or relative way, (c) the Absolute Free Hand interaction, where the forefinger control the cursor and (d) the Tactile Surface interaction, where the user controls the cursor via a small touchpad connected to smart glasses. We conducted an experiment with 18 participants. The Tactile Surface and Absolute Head Movement were the most efficient. The Relative Head Movement and Absolute Free Hand interactions were promising and require more exploration for other tasks.
△ Less
Submitted 14 May, 2019;
originally announced May 2019.
-
Resolution and the binary encoding of combinatorial principles
Authors:
Stefan Dantchev,
Nicola Galesi,
Barnaby Martin
Abstract:
We investigate the size complexity of proofs in $Res(s)$ -- an extension of Resolution working on $s$-DNFs instead of clauses -- for families of contradictions given in the {\em unusual binary} encoding. A motivation of our work is size lower bounds of refutations in Resolution for families of contradictions in the usual unary encoding. Our main interest is the $k$-Clique Principle, whose Resoluti…
▽ More
We investigate the size complexity of proofs in $Res(s)$ -- an extension of Resolution working on $s$-DNFs instead of clauses -- for families of contradictions given in the {\em unusual binary} encoding. A motivation of our work is size lower bounds of refutations in Resolution for families of contradictions in the usual unary encoding. Our main interest is the $k$-Clique Principle, whose Resolution complexity is still unknown.
Our main result is a $n^{Ω(k)}$ lower bound for the size of refutations of the binary $k$-Clique Principle in $Res(\lfloor \frac{1}{2}\log \log n\rfloor)$. This improves the result of Lauria, Pudlák et al. [24] who proved the lower bound for Resolution, that is $Res(1)$.
Our second lower bound proves that in $RES(s)$ for $s\leq \log^{\frac{1}{2-ε}}(n)$, the shortest proofs of the $BinPHP^m_n$, requires size $2^{n^{1-δ}}$, for any $δ>0$. Furthermore we prove that $BinPHP^m_n$ can be refuted in size $2^{Θ(n)}$ in treelike $Res(1)$, contrasting with the unary case, where $PHP^m_n$ requires treelike $RES(1)$ \ refutations of size $2^{Ω(n \log n)}$ [9,16].
Furthermore we study under what conditions the complexity of refutations in Resolution will not increase significantly (more than a polynomial factor) when shifting between the unary encoding and the binary encoding. We show that this is true, from unary to binary, for propositional encodings of principles expressible as a $Π_2$-formula and involving {\em total variable comparisons}. We then show that this is true, from binary to unary, when one considers the \emph{functional unary encoding}. Finally we prove that the binary encoding of the general Ordering principle $OP$ -- with no total ordering constraints -- is polynomially provable in Resolution.
△ Less
Submitted 19 September, 2018; v1 submitted 8 September, 2018;
originally announced September 2018.