-
AlphaZeroES: Direct score maximization outperforms planning loss minimization
Authors:
Carlos Martin,
Tuomas Sandholm
Abstract:
Planning at execution time has been shown to dramatically improve performance for agents in both single-agent and multi-agent settings. A well-known family of approaches to planning at execution time is AlphaZero and its variants, which use Monte Carlo Tree Search together with a neural network that guides the search by predicting state values and action probabilities. AlphaZero trains these networks by minimizing a planning loss that makes the value prediction match the episode return, and the policy prediction at the root of the search tree match the output of the full tree expansion. AlphaZero has been applied to both single-agent environments (such as Sokoban) and multi-agent environments (such as chess and Go) with great success. In this paper, we explore an intriguing question: In single-agent environments, can we outperform AlphaZero by directly maximizing the episode score instead of minimizing this planning loss, while leaving the MCTS algorithm and neural architecture unchanged? To directly maximize the episode score, we use evolution strategies, a family of algorithms for zeroth-order blackbox optimization. Our experiments indicate that, across multiple environments, directly maximizing the episode score outperforms minimizing the planning loss.
Submitted 12 June, 2024;
originally announced June 2024.
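As a rough illustration of the zeroth-order blackbox optimization mentioned in the abstract above, the following is a minimal sketch of one common evolution-strategies variant (antithetic Gaussian perturbations). It is not the authors' implementation: the function `toy_score`, the hyperparameters, and the two-dimensional parameter vector are all assumptions chosen for illustration, and the toy score stands in for an episode score that would normally come from running an agent.

```python
import random


def evolution_strategy(score, theta, iterations=200, population=50,
                       sigma=0.1, lr=0.05, seed=0):
    """Maximize a black-box score using antithetic Gaussian perturbations.

    Only score evaluations are used; no gradients of `score` are required.
    """
    rng = random.Random(seed)
    theta = list(theta)
    dim = len(theta)
    for _ in range(iterations):
        grad = [0.0] * dim
        for _ in range(population):
            noise = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            # Evaluate the score at two mirrored perturbations of theta.
            plus = score([t + sigma * n for t, n in zip(theta, noise)])
            minus = score([t - sigma * n for t, n in zip(theta, noise)])
            weight = (plus - minus) / (2.0 * sigma)
            for i in range(dim):
                grad[i] += weight * noise[i] / population
        # Ascent step along the estimated search gradient.
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta


def toy_score(params):
    # Hypothetical stand-in for an episode score, with its peak at (1, -2).
    return -((params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2)


best = evolution_strategy(toy_score, [0.0, 0.0])
print(best)
```

In a setting like the one the abstract describes, `score` would presumably wrap a full episode rollout, which is why a method that needs only scalar evaluations is attractive there.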
-
Simultaneous incremental support adjustment and metagame solving: An equilibrium-finding framework for continuous-action games
Authors:
Carlos Martin,
Tuomas Sandholm
Abstract:
We present a framework for computing approximate mixed-strategy Nash equilibria of continuous-action games. It is a modification of the traditional double oracle algorithm, extended to multiple players and continuous action spaces. Unlike prior methods, it maintains fixed-cardinality pure strategy sets for each player. Thus, unlike prior methods, only a constant amount of memory is necessary. Furthermore, it does not require exact metagame solving on each iteration, which can be computationally expensive for large metagames. Moreover, it does not require global best-response computation on each iteration, which can be computationally expensive or even intractable for high-dimensional action spaces and general games. Our method incrementally reduces the exploitability of the strategy profile in the finite metagame, pushing it toward Nash equilibrium. Simultaneously, it incrementally improves the pure strategies that best respond to this strategy profile in the full game. We evaluate our method on various continuous-action games, showing that it obtains approximate mixed-strategy Nash equilibria with low exploitability.
Submitted 12 June, 2024;
originally announced June 2024.
-
SoundLoCD: An Efficient Conditional Discrete Contrastive Latent Diffusion Model for Text-to-Sound Generation
Authors:
Xinlei Niu,
Jing Zhang,
Christian Walder,
Charles Patrick Martin
Abstract:
We present SoundLoCD, a novel text-to-sound generation framework, which incorporates a LoRA-based conditional discrete contrastive latent diffusion model. Unlike recent large-scale sound generation models, our model can be efficiently trained under limited computational resources. The integration of a contrastive learning strategy further enhances the connection between text conditions and the generated outputs, resulting in coherent and high-fidelity performance. Our experiments demonstrate that SoundLoCD outperforms the baseline with greatly reduced computational resources. A comprehensive ablation study further validates the contribution of each component within SoundLoCD. Demo page: https://XinleiNIU.github.io/demo-SoundLoCD/
Submitted 24 May, 2024;
originally announced May 2024.
-
Analysis of the BraTS 2023 Intracranial Meningioma Segmentation Challenge
Authors:
Dominic LaBella,
Ujjwal Baid,
Omaditya Khanna,
Shan McBurney-Lin,
Ryan McLean,
Pierre Nedelec,
Arif Rashid,
Nourel Hoda Tahon,
Talissa Altes,
Radhika Bhalerao,
Yaseen Dhemesh,
Devon Godfrey,
Fathi Hilal,
Scott Floyd,
Anastasia Janas,
Anahita Fathi Kazerooni,
John Kirkpatrick,
Collin Kent,
Florian Kofler,
Kevin Leu,
Nazanin Maleki,
Bjoern Menze,
Maxence Pajot,
Zachary J. Reitman,
Jeffrey D. Rudie
, et al. (96 additional authors not shown)
Abstract:
We describe the design and results from the BraTS 2023 Intracranial Meningioma Segmentation Challenge. The BraTS Meningioma Challenge differed from prior BraTS Glioma challenges in that it focused on meningiomas, which are typically benign extra-axial tumors with diverse radiologic and anatomical presentation and a propensity for multiplicity. Nine participating teams each developed deep-learning automated segmentation models using image data from the largest multi-institutional, systematically expert-annotated, multilabel, multi-sequence meningioma MRI dataset to date, which included 1000 training set cases, 141 validation set cases, and 283 hidden test set cases. Each case included T2, T2/FLAIR, T1, and T1Gd brain MRI sequences with associated tumor compartment labels delineating enhancing tumor, non-enhancing tumor, and surrounding non-enhancing T2/FLAIR hyperintensity. Participant automated segmentation models were evaluated and ranked based on a scoring system evaluating lesion-wise metrics including Dice similarity coefficient (DSC) and 95% Hausdorff distance. The top-ranked team had a lesion-wise median DSC of 0.976, 0.976, and 0.964 for enhancing tumor, tumor core, and whole tumor, respectively, and a corresponding average DSC of 0.899, 0.904, and 0.871, respectively. These results serve as state-of-the-art benchmarks for future pre-operative meningioma automated segmentation algorithms. Additionally, we found that 1286 of 1424 cases (90.3%) had at least 1 compartment voxel abutting the edge of the skull-stripped image, which requires further investigation into optimal pre-processing face anonymization steps.
Submitted 15 May, 2024;
originally announced May 2024.
-
On the weight dynamics of learning networks
Authors:
Nahal Sharafi,
Christoph Martin,
Sarah Hallerberg
Abstract:
Neural networks have become a widely adopted tool for tackling a variety of problems in machine learning and artificial intelligence. In this contribution we use the mathematical framework of local stability analysis to gain a deeper understanding of the learning dynamics of feed-forward neural networks. To this end, we derive equations for the tangent operator of the learning dynamics of three-layer networks learning regression tasks. The results are valid for arbitrary numbers of nodes and arbitrary choices of activation functions. Applying the results to a network learning a regression task, we investigate numerically how stability indicators relate to the final training loss. Although the specific results vary with different choices of initial conditions and activation functions, we demonstrate that it is possible to predict the final training loss by monitoring finite-time Lyapunov exponents or covariant Lyapunov vectors during the training process.
Submitted 30 April, 2024;
originally announced May 2024.
-
Understanding and Shaping Human-Technology Assemblages in the Age of Generative AI
Authors:
Josh Andres,
Chris Danta,
Andrea Bianchi,
Sungyeon Hong,
Zhuying Li,
Eduardo B. Sandoval,
Charles Martin,
Ned Cooper
Abstract:
Generative AI capabilities are rapidly transforming how we perceive, interact with, and relate to machines. This one-day workshop invites HCI researchers, designers, and practitioners to imaginatively inhabit and explore the possible futures that might emerge from humans combining generative AI capabilities into everyday technologies at massive scale. Workshop participants will craft stories, visualisations, and prototypes through scenario-based design to investigate these possible futures, resulting in the production of an open-annotated scenario library and a journal or interactions article to disseminate the findings. We aim to gather the DIS community's knowledge to explore, understand, and shape the relations this new interaction paradigm is forging between humans, their technologies, and the environment in safe, sustainable, enriching, and responsible ways.
Submitted 4 May, 2024; v1 submitted 28 April, 2024;
originally announced April 2024.
-
HybridVC: Efficient Voice Style Conversion with Text and Audio Prompts
Authors:
Xinlei Niu,
Jing Zhang,
Charles Patrick Martin
Abstract:
We introduce HybridVC, a voice conversion (VC) framework built upon a pre-trained conditional variational autoencoder (CVAE) that combines the strengths of a latent model with contrastive learning. HybridVC supports text and audio prompts, enabling more flexible voice style conversion. HybridVC models a latent distribution conditioned on speaker embeddings acquired by a pretrained speaker encoder and optimises style text embeddings to align with the speaker style information through contrastive learning in parallel. Therefore, HybridVC can be efficiently trained under limited computational resources. Our experiments demonstrate HybridVC's superior training efficiency and its capability for advanced multi-modal voice style conversion. This underscores its potential for widespread applications such as user-defined personalised voice in various social media platforms. A comprehensive ablation study further validates the effectiveness of our method.
Submitted 24 April, 2024;
originally announced April 2024.
-
Quantifying Software Correctness by Combining Architecture Modeling and Formal Program Analysis
Authors:
Florian Lanzinger,
Christian Martin,
Frederik Reiche,
Samuel Teuber,
Robert Heinrich,
Alexander Weigl
Abstract:
Most formal methods see the correctness of a software system as a binary decision. However, proving the correctness of complex systems completely is difficult because they are composed of multiple components, usage scenarios, and environments. We present QuAC, a modular approach for quantifying the correctness of service-oriented software systems by combining software architecture modeling with deductive verification. Our approach is based on a model of the service-oriented architecture and the probabilistic usage scenarios of the system. The correctness of a single service is approximated by a coverage region, which is a formula describing which inputs for that service are proven to not lead to an erroneous execution. The coverage regions can be determined by a combination of various analyses, e.g., formal verification, expert estimations, or testing. The coverage regions and the software model are then combined into a probabilistic program. From this, we can compute the probability that under a given usage profile no service is called outside its coverage region. If the coverage region is large enough, then instead of attempting to get 100% coverage, which may be prohibitively expensive, run-time verification or testing approaches may be used to deal with inputs outside the coverage region. We also present an implementation of QuAC for Java using the modeling tool Palladio and the deductive verification tool KeY. We demonstrate its usability by applying it to a software simulation of an energy system.
Submitted 25 January, 2024;
originally announced January 2024.
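To make the quantity described in the abstract above concrete (the probability that, under a given usage profile, no service is called outside its coverage region), here is a small hedged sketch. It does not use Palladio or KeY and is not QuAC's implementation: the usage profile, the two services, their coverage predicates, and the `call_chain` architecture model are all hypothetical, and the probabilistic program is reduced to exact enumeration over a discrete input distribution.

```python
from fractions import Fraction

# Hypothetical usage profile: probability of each request size reaching the system.
usage_profile = {1: Fraction(1, 2), 10: Fraction(3, 10), 100: Fraction(1, 5)}

# Hypothetical coverage regions: inputs for which each service is proven
# not to lead to an erroneous execution.
coverage = {
    "parse": lambda x: x >= 0,
    "allocate": lambda x: x <= 50,
}


def call_chain(x):
    """Hypothetical architecture model: parse receives x, allocate receives 2 * x."""
    return [("parse", x), ("allocate", 2 * x)]


def probability_within_coverage(profile, regions, model):
    """Sum the probability of every input whose calls all stay inside coverage."""
    total = Fraction(0)
    for value, prob in profile.items():
        if all(regions[service](arg) for service, arg in model(value)):
            total += prob
    return total


p_ok = probability_within_coverage(usage_profile, coverage, call_chain)
print(p_ok)  # 4/5: only the request of size 100 leaves allocate's coverage region
```

The remaining 1/5 of the usage profile is the portion that, in the abstract's terms, would be left to run-time verification or testing.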
-
AERIAL-CORE: AI-Powered Aerial Robots for Inspection and Maintenance of Electrical Power Infrastructures
Authors:
Anibal Ollero,
Alejandro Suarez,
Christos Papaioannidis,
Ioannis Pitas,
Juan M. Marredo,
Viet Duong,
Emad Ebeid,
Vit Kratky,
Martin Saska,
Chloe Hanoune,
Amr Afifi,
Antonio Franchi,
Charalampos Vourtsis,
Dario Floreano,
Goran Vasiljevic,
Stjepan Bogdan,
Alvaro Caballero,
Fabio Ruggiero,
Vincenzo Lippiello,
Carlos Matilla,
Giovanni Cioffi,
Davide Scaramuzza,
Jose R. Martinez-de-Dios,
Begona C. Arrue,
Carlos Martin
, et al. (5 additional authors not shown)
Abstract:
Large-scale infrastructures are prone to deterioration due to age, environmental influences, and heavy usage. Ensuring their safety through regular inspections and maintenance is crucial to prevent incidents that can significantly affect public safety and the environment. This is especially pertinent in the context of electrical power networks, which, while essential for energy provision, can also be sources of forest fires. Intelligent drones have the potential to revolutionize inspection and maintenance, eliminating the risks for human operators, increasing productivity, reducing inspection time, and improving data collection quality. However, most of the current methods and technologies in aerial robotics have been trialed primarily in indoor testbeds or outdoor settings under strictly controlled conditions, always within the line of sight of human operators. Additionally, these methods and technologies have typically been evaluated in isolation, lacking comprehensive integration. This paper introduces the first autonomous system that combines various innovative aerial robots. This system is designed for extended-range inspections beyond the visual line of sight, features aerial manipulators for maintenance tasks, and includes support mechanisms for human operators working at elevated heights. The paper further discusses the successful validation of this system on numerous electrical power lines, with aerial robots executing flights over 10 kilometers away from their ground control stations.
Submitted 4 January, 2024;
originally announced January 2024.
-
Non-Intrusive Load Monitoring for Feeder-Level EV Charging Detection: Sliding Window-based Approaches to Offline and Online Detection
Authors:
Cameron Martin,
Fucai Ke,
Hao Wang
Abstract:
Understanding electric vehicle (EV) charging on the distribution network is key to effective EV charging management and aiding decarbonization across the energy and transport sectors. Advanced metering infrastructure has allowed distribution system operators and utility companies to collect high-resolution load data from their networks. These advancements enable the non-intrusive load monitoring (NILM) technique to detect EV charging using load measurement data. While existing studies primarily focused on NILM for EV charging detection in individual households, there is a research gap on EV charging detection at the feeder level, presenting unique challenges due to the combined load measurement from multiple households. In this paper, we develop a novel and effective approach for EV detection at the feeder level, involving sliding-window feature extraction and classical machine learning techniques, specifically models like XGBoost and Random Forest. Our developed method offers a lightweight and efficient solution, capable of quick training. Moreover, our developed method is versatile, supporting both offline and online EV charging detection. Our experimental results demonstrate high-accuracy EV charging detection at the feeder level, achieving an F-Score of 98.88% in offline detection and 93.01% in online detection.
Submitted 4 December, 2023;
originally announced December 2023.
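The sliding-window feature extraction named in the abstract above can be sketched roughly as follows. The abstract does not list the features the authors compute, so the statistics chosen here (mean, standard deviation, min, max, and first-to-last change), the window length, and the sample feeder readings are assumptions made purely for illustration.

```python
from statistics import mean, pstdev


def sliding_window_features(load, window, step=1):
    """Return one feature row per window over a load time series."""
    rows = []
    for start in range(0, len(load) - window + 1, step):
        chunk = load[start:start + window]
        rows.append({
            "start": start,                  # index of the window's first sample
            "mean": mean(chunk),
            "std": pstdev(chunk),            # population standard deviation
            "min": min(chunk),
            "max": max(chunk),
            "delta": chunk[-1] - chunk[0],   # change across the window
        })
    return rows


# Made-up feeder-level readings in kW; the jump mimics a charger switching on.
feeder_load_kw = [50.0, 51.0, 50.5, 57.5, 58.0, 57.0, 50.0]
features = sliding_window_features(feeder_load_kw, window=3)
for row in features:
    print(row)
```

Each row could then be passed to a classical classifier such as the Random Forest or XGBoost models the abstract mentions. For online detection, only the most recent window would need to be recomputed as new readings arrive.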
-
Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
Authors:
Yefan Zhou,
Tianyu Pang,
Keqin Liu,
Charles H. Martin,
Michael W. Mahoney,
Yaoqing Yang
Abstract:
Regularization in modern machine learning is crucial, and it can take various forms in algorithmic design: training set, model family, error function, regularization terms, and optimizations. In particular, the learning rate, which can be interpreted as a temperature-like parameter within the statistical mechanics of learning, plays a crucial role in neural network training. Indeed, many widely adopted training strategies basically just define the decay of the learning rate over time. This process can be interpreted as decreasing a temperature, using either a global learning rate (for the entire model) or a learning rate that varies for each parameter. This paper proposes TempBalance, a straightforward yet effective layer-wise learning rate method. TempBalance is based on Heavy-Tailed Self-Regularization (HT-SR) Theory, an approach which characterizes the implicit self-regularization of different layers in trained models. We demonstrate the efficacy of using HT-SR-motivated metrics to guide the scheduling and balancing of temperature across all network layers during model training, resulting in improved performance during testing. We implement TempBalance on CIFAR10, CIFAR100, SVHN, and TinyImageNet datasets using ResNets, VGGs, and WideResNets with various depths and widths. Our results show that TempBalance significantly outperforms ordinary SGD and carefully-tuned spectral norm regularization. We also show that TempBalance outperforms a number of state-of-the-art optimizers and learning rate schedulers.
Submitted 1 December, 2023;
originally announced December 2023.
-
Towards Foundational Models for Molecular Learning on Large-Scale Multi-Task Datasets
Authors:
Dominique Beaini,
Shenyang Huang,
Joao Alex Cunha,
Zhiyi Li,
Gabriela Moisescu-Pareja,
Oleksandr Dymov,
Samuel Maddrell-Mander,
Callum McLean,
Frederik Wenkel,
Luis Müller,
Jama Hussein Mohamud,
Ali Parviz,
Michael Craig,
Michał Koziarski,
Jiarui Lu,
Zhaocheng Zhu,
Cristian Gabellini,
Kerstin Klaser,
Josef Dean,
Cas Wognum,
Maciej Sypetkowski,
Guillaume Rabusseau,
Reihaneh Rabbany,
Jian Tang,
Christopher Morris
, et al. (10 additional authors not shown)
Abstract:
Recently, pre-trained foundation models have enabled significant advancements in multiple fields. In molecular machine learning, however, where datasets are often hand-curated, and hence typically small, the lack of datasets with labeled features, and codebases to manage those datasets, has hindered the development of foundation models. In this work, we present seven novel datasets categorized by size into three distinct categories: ToyMix, LargeMix and UltraLarge. These datasets push the boundaries in both the scale and the diversity of supervised labels for molecular learning. They cover nearly 100 million molecules and over 3000 sparsely defined tasks, totaling more than 13 billion individual labels of both quantum and biological nature. In comparison, our datasets contain 300 times more data points than the widely used OGB-LSC PCQM4Mv2 dataset, and 13 times more than the quantum-only QM1B dataset. In addition, to support the development of foundational models based on our proposed datasets, we present the Graphium graph machine learning library, which simplifies the process of building and training molecular machine learning models for multi-task and multi-level molecular datasets. Finally, we present a range of baseline results as a starting point for multi-task and multi-level training on these datasets. Empirically, we observe that performance on low-resource biological datasets shows improvement by also training on large amounts of quantum data. This indicates that there may be potential in multi-task and multi-level training of a foundation model and fine-tuning it to resource-constrained downstream tasks.
Submitted 18 October, 2023; v1 submitted 6 October, 2023;
originally announced October 2023.
-
State of the Art Report: Verified Computation
Authors:
Jim Woodcock,
Mikkel Schmidt Andersen,
Diego F. Aranha,
Stefan Hallerstede,
Simon Thrane Hansen,
Nikolaj Kuhne Jakobsen,
Tomas Kulik,
Peter Gorm Larsen,
Hugo Daniel Macedo,
Carlos Ignacio Isasa Martin,
Victor Alexander Mtsimbe Norrild
Abstract:
This report describes the state of the art in verifiable computation. The problem being solved is the following:
The Verifiable Computation Problem (Verifiable Computing Problem) Suppose we have two computing agents. The first agent is the verifier, and the second agent is the prover. The verifier wants the prover to perform a computation. The verifier sends a description of the computation to the prover. Once the prover has completed the task, the prover returns the output to the verifier. The output will contain a proof. The verifier can use this proof to check whether the prover computed the output correctly. The check is not required to verify the algorithm used in the computation. Instead, it is a check that the prover computed the output using the computation specified by the verifier. The effort required for the check should be much less than that required to perform the computation.
This state-of-the-art report surveys 128 papers from the literature comprising more than 4,000 pages. Other papers and books were surveyed but were omitted. The papers surveyed were overwhelmingly mathematical. We have summarised the major concepts that form the foundations for verifiable computation. The report contains two main sections. The first, larger section covers the theoretical foundations for probabilistically checkable and zero-knowledge proofs. The second section contains a description of the current practice in verifiable computation. Two further reports will cover (i) military applications of verifiable computation and (ii) a collection of technical demonstrators. The first of these is intended to be read by those who want to know what applications are enabled by the current state of the art in verifiable computation. The second is for those who want to see practical tools and conduct experiments themselves.
Submitted 16 February, 2024; v1 submitted 29 August, 2023;
originally announced August 2023.
-
AI planning in the imagination: High-level planning on learned abstract search spaces
Authors:
Carlos Martin,
Tuomas Sandholm
Abstract:
Search and planning algorithms have been a cornerstone of artificial intelligence since the field's inception. Giving reinforcement learning agents the ability to plan during execution time has resulted in significant performance improvements in various domains. However, in real-world environments, the model with respect to which the agent plans has been constrained to be grounded in the real environment itself, as opposed to a more abstract model which allows for planning over compound actions and behaviors. We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training, which is completely decoupled from the real environment. Unlike prior approaches, this enables the agent to perform high-level planning at arbitrary timescales and reason in terms of compound or temporally-extended actions, which can be useful in environments where large numbers of base-level micro-actions are needed to perform relevant macro-actions. In addition, our method is more general than comparable prior methods because it seamlessly handles settings with continuous action spaces, combinatorial action spaces, and partial observability. We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman. Experimentally, it outperforms comparable prior methods without assuming access to an environment simulator at execution time.
Submitted 2 December, 2023; v1 submitted 16 August, 2023;
originally announced August 2023.
-
Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming
Authors:
Xinlei Niu,
Christian Walder,
Jing Zhang,
Charles Patrick Martin
Abstract:
We propose the stochastic optimal path, which solves the classical optimal path problem by a probability-softening solution. This unified approach transforms a wide range of DP problems into directed acyclic graphs in which all paths follow a Gibbs distribution. We show the equivalence of the Gibbs distribution to a message-passing algorithm by the properties of the Gumbel distribution and give all the ingredients required for variational Bayesian inference of a latent path, namely Bayesian dynamic programming (BDP). We demonstrate the usage of BDP in the latent space of variational autoencoders (VAEs) and propose the BDP-VAE, which captures structured sparse optimal paths as latent variables. This enables end-to-end training for generative tasks in which models rely on unobserved structural information. Finally, we validate the behavior of our approach and showcase its applicability in two real-world applications: text-to-speech and singing voice synthesis. Our implementation code is available at https://github.com/XinleiNIU/LatentOptimalPathsBayesianDP
Submitted 25 June, 2024; v1 submitted 4 June, 2023;
originally announced June 2023.
-
A Study of Generative Large Language Model for Medical Research and Healthcare
Authors:
Cheng Peng,
Xi Yang,
Aokun Chen,
Kaleb E Smith,
Nima PourNejatian,
Anthony B Costa,
Cheryl Martin,
Mona G Flores,
Ying Zhang,
Tanja Magoc,
Gloria Lipori,
Duane A Mitchell,
Naykky S Ospina,
Mustafa M Ahmed,
William R Hogan,
Elizabeth A Shenkman,
Yi Guo,
Jiang Bian,
Yonghui Wu
Abstract:
There is enormous enthusiasm for, and concern about, using large language models (LLMs) in healthcare, yet current assumptions are all based on general-purpose LLMs such as ChatGPT. This study develops a clinical generative LLM, GatorTronGPT, using 277 billion words of mixed clinical and English text with a GPT-3 architecture of 20 billion parameters. GatorTronGPT improves biomedical natural language processing for medical research. Synthetic NLP models trained using GatorTronGPT-generated text outperform NLP models trained using real-world clinical text. A physicians' Turing test using a 1 (worst) to 9 (best) scale shows that there is no significant difference in linguistic readability (p = 0.22; 6.57 of GatorTronGPT compared with 6.93 of human) and clinical relevance (p = 0.91; 7.0 of GatorTronGPT compared with 6.97 of human) and that physicians cannot differentiate them (p < 0.001). This study provides insights on the opportunities and challenges of LLMs for medical research and healthcare.
Submitted 22 May, 2023;
originally announced May 2023.
-
Comprehensive and user-analytics-friendly cancer patient database for physicians and researchers
Authors:
Ali Firooz,
Avery T. Funkhouser,
Julie C. Martin,
W. Jeffery Edenfield,
Homayoun Valafar,
Anna V. Blenda
Abstract:
Nuanced cancer patient care is needed, as the development and clinical course of cancer are multifactorial with influences from the general health status of the patient, germline and neoplastic mutations, co-morbidities, and environment. To effectively tailor an individualized treatment to each patient, such multifactorial data must be presented to providers in an easy-to-access and easy-to-analyze fashion. To address the need, a relational database has been developed integrating status of cancer-critical gene mutations, serum galectin profiles, serum and tumor glycomic profiles, with clinical, demographic, and lifestyle data points of individual cancer patients. The database, as a backend, provides physicians and researchers with a single, easily accessible repository of cancer profiling data to aid and enhance individualized treatment. Our interactive database allows care providers to amalgamate cohorts from these groups to find correlations between different data types with the possibility of finding "molecular signatures" based upon a combination of genetic mutations, galectin serum levels, glycan compositions, and patient clinical data and lifestyle choices. Our project provides a framework for an integrated, interactive, and growing database to analyze molecular and clinical patterns across cancer stages and subtypes and provides opportunities for increased diagnostic and prognostic power.
Submitted 1 February, 2023;
originally announced February 2023.
-
ApproxED: Approximate exploitability descent via learned best responses
Authors:
Carlos Martin,
Tuomas Sandholm
Abstract:
There has been substantial progress on finding game-theoretic equilibria. Most of that work has focused on games with finite, discrete action spaces. However, many games involving space, time, money, and other fine-grained quantities have continuous action spaces (or are best modeled as having such). We study the problem of finding an approximate Nash equilibrium of games with continuous action sets. The standard measure of closeness to Nash equilibrium is exploitability, which measures how much players can benefit from unilaterally changing their strategy. We propose two new methods that minimize an approximation of exploitability with respect to the strategy profile. The first method uses a learned best-response function, which takes the current strategy profile as input and outputs candidate best responses for each player. The strategy profile and best-response functions are trained simultaneously, with the former trying to minimize exploitability while the latter tries to maximize it. The second method maintains an ensemble of candidate best responses for each player. In each iteration, the best-performing elements of each ensemble are used to update the current strategy profile. The strategy profile and ensembles are simultaneously trained to minimize and maximize the approximate exploitability, respectively. We evaluate our methods on various continuous games and GAN training, showing that they outperform prior methods.
Submitted 12 June, 2024; v1 submitted 20 January, 2023;
originally announced January 2023.
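The ApproxED abstract above defines exploitability as how much players can benefit from unilaterally changing their strategy. The following is a minimal sketch of that quantity only, under a strong simplifying assumption: a two-player zero-sum matrix game with finitely many actions, whereas the paper itself targets continuous action sets and learned best responses. The function names and the matching-pennies example are illustrative and do not come from the paper.

```python
def expected_payoff(payoff, x, y):
    """Row player's expected payoff when row mixes with x and column with y."""
    return sum(x[i] * payoff[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def exploitability(payoff, x, y):
    """Total gain both players could obtain by deviating unilaterally.

    Assumption: the game is zero-sum, so the column player receives -payoff.
    In a finite game a best response can always be found among pure actions.
    """
    value = expected_payoff(payoff, x, y)
    # Best pure response of the row player against the column mix y.
    row_best = max(sum(payoff[i][j] * y[j] for j in range(len(y)))
                   for i in range(len(x)))
    # Best pure response of the column player against the row mix x
    # (the column player wants the row player's payoff to be small).
    col_best = min(sum(x[i] * payoff[i][j] for i in range(len(x)))
                   for j in range(len(y)))
    return (row_best - value) + (value - col_best)


# Matching pennies: the uniform profile is the unique Nash equilibrium.
MATCHING_PENNIES = [[1.0, -1.0], [-1.0, 1.0]]
UNIFORM = [0.5, 0.5]
ALWAYS_HEADS = [1.0, 0.0]
```

At a Nash equilibrium the exploitability is zero, and any profile in which some player has a profitable deviation has positive exploitability, which is why it serves as a measure of closeness to equilibrium.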
-
OpenTwins: An open-source framework for the design, development and integration of effective 3D-IoT-AI-powered digital twins
Authors:
Julia Robles,
Cristian Martín,
Manuel Díaz
Abstract:
Although digital twins have recently emerged as a clear alternative for reliable asset representations, most of the solutions and tools available for the development of digital twins are tailored to specific environments. Furthermore, achieving reliable digital twins often requires the orchestration of technologies and paradigms such as machine learning, the Internet of Things, and 3D visualization, which are rarely seamlessly aligned. In this paper, we present a generic framework for the development of effective digital twins combining some of the aforementioned areas. In this open framework, digital twins can be easily developed and orchestrated with 3D connected visualizations, IoT data streams, and real-time machine-learning predictions. To demonstrate the feasibility of the framework, a use case in the Petrochemical Industry 4.0 has been developed.
Submitted 12 January, 2023;
originally announced January 2023.
-
The RPM3D project: 3D Kinematics for Remote Patient Monitoring
Authors:
Alicia Fornés,
Asma Bensalah,
Cristina Carmona-Duarte,
Jialuo Chen,
Miguel A. Ferrer,
Andreas Fischer,
Josep Lladós,
Cristina Martín,
Eloy Opisso,
Réjean Plamondon,
Anna Scius-Bertrand,
Josep Maria Tormos
Abstract:
This project explores the feasibility of remote patient monitoring based on the analysis of 3D movements captured with smartwatches. We base our analysis on the Kinematic Theory of Rapid Human Movement. We have validated our research in a real case scenario for stroke rehabilitation at the Guttmann Institute (neurorehabilitation hospital), showing promising results. Our work could have a great impact in remote healthcare applications, improving the medical efficiency and reducing the healthcare costs. Future steps include more clinical validation, developing multi-modal analysis architectures (analysing data from sensors, images, audio, etc.), and exploring the application of our technology to monitor other neurodegenerative diseases.
Submitted 9 December, 2022;
originally announced December 2022.
-
Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks
Authors:
Carlos Martin,
Tuomas Sandholm
Abstract:
We study the problem of computing an approximate Nash equilibrium of a continuous-action game without access to gradients. Such game access is common in reinforcement learning settings, where the environment is typically treated as a black box. To tackle this problem, we apply zeroth-order optimization techniques that combine smoothed gradient estimators with equilibrium-finding dynamics. We model players' strategies using artificial neural networks. In particular, we use randomized policy networks to model mixed strategies. These take noise in addition to an observation as input and can flexibly represent arbitrary observation-dependent, continuous-action distributions. Being able to model such mixed strategies is crucial for tackling continuous-action games that lack pure-strategy equilibria. We evaluate the performance of our method using an approximation of the Nash convergence metric from game theory, which measures how much players can benefit from unilaterally changing their strategy. We apply our method to continuous Colonel Blotto games, single-item and multi-item auctions, and a visibility game. The experiments show that our method can quickly find high-quality approximate equilibria. Furthermore, they show that the dimensionality of the input noise is crucial for performance. To our knowledge, this paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
Submitted 29 November, 2022;
originally announced November 2022.
-
Embodying the Glitch: Perspectives on Generative AI in Dance Practice
Authors:
Benedikte Wallace,
Charles P. Martin
Abstract:
What role does the break from realism play in the potential for generative artificial intelligence as a creative tool? Through exploration of glitch, we examine the prospective value of these artefacts in creative practice. This paper describes findings from an exploration of AI-generated "mistakes" when using movement produced by a generative deep learning model as an inspiration source in dance composition.
Submitted 5 October, 2022;
originally announced October 2022.
-
A Morphology Focused Diffusion Probabilistic Model for Synthesis of Histopathology Images
Authors:
Puria Azadi Moghadam,
Sanne Van Dalen,
Karina C. Martin,
Jochen Lennerz,
Stephen Yip,
Hossein Farahani,
Ali Bashashati
Abstract:
Visual microscopic study of diseased tissue by pathologists has been the cornerstone for cancer diagnosis and prognostication for more than a century. Recently, deep learning methods have made significant advances in the analysis and classification of tissue images. However, there has been limited work on the utility of such models in generating histopathology images. These synthetic images have several applications in pathology including utilities in education, proficiency testing, privacy, and data sharing. Recently, diffusion probabilistic models were introduced to generate high quality images. Here, for the first time, we investigate the potential use of such models along with prioritized morphology weighting and color normalization to synthesize high quality histopathology images of brain cancer. Our detailed results show that diffusion probabilistic models are capable of synthesizing a wide range of histopathology images and have superior performance compared to generative adversarial networks.
Submitted 28 September, 2022; v1 submitted 27 September, 2022;
originally announced September 2022.
-
Human Activity Recognition on Time Series Accelerometer Sensor Data using LSTM Recurrent Neural Networks
Authors:
Chrisogonas O. Odhiambo,
Sanjoy Saha,
Corby K. Martin,
Homayoun Valafar
Abstract:
The use of sensors available through smart devices has pervaded everyday life in several applications including human activity monitoring, healthcare, and social networks. In this study, we focus on the use of smartwatch accelerometer sensors to recognize eating activity. More specifically, we collected sensor data from 10 participants while they consumed pizza. Using this information, together with comparable data available for similar events such as smoking and medication-taking and for the dissimilar activity of jogging, we developed an LSTM-ANN architecture that has demonstrated 90% success in distinguishing individual bites from puffs, medication-taking, or jogging.
Submitted 3 June, 2022;
originally announced June 2022.
-
GatorTron: A Large Clinical Language Model to Unlock Patient Information from Unstructured Electronic Health Records
Authors:
Xi Yang,
Aokun Chen,
Nima PourNejatian,
Hoo Chang Shin,
Kaleb E Smith,
Christopher Parisien,
Colin Compas,
Cheryl Martin,
Mona G Flores,
Ying Zhang,
Tanja Magoc,
Christopher A Harle,
Gloria Lipori,
Duane A Mitchell,
William R Hogan,
Elizabeth A Shenkman,
Jiang Bian,
Yonghui Wu
Abstract:
There is an increasing interest in developing artificial intelligence (AI) systems to process and interpret electronic health records (EHRs). Natural language processing (NLP) powered by pretrained language models is the key technology for medical AI systems utilizing clinical narratives. However, there are few clinical language models, the largest of which trained in the clinical domain is comparatively small at 110 million parameters (compared with billions of parameters in the general domain). It is not clear how large clinical language models with billions of parameters can help medical AI systems utilize unstructured EHRs. In this study, we develop from scratch a large clinical language model - GatorTron - using >90 billion words of text (including >82 billion words of de-identified clinical text) and systematically evaluate it on 5 clinical NLP tasks including clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference (NLI), and medical question answering (MQA). We examine how (1) scaling up the number of parameters and (2) scaling up the size of the training data could benefit these NLP tasks. GatorTron models scale up the clinical language model from 110 million to 8.9 billion parameters and improve 5 clinical NLP tasks (e.g., 9.6% and 9.5% improvement in accuracy for NLI and MQA), which can be applied to medical AI systems to improve healthcare delivery. The GatorTron models are publicly available at: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og.
Submitted 16 December, 2022; v1 submitted 2 February, 2022;
originally announced March 2022.
-
Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data
Authors:
Yaoqing Yang,
Ryan Theisen,
Liam Hodgkinson,
Joseph E. Gonzalez,
Kannan Ramchandran,
Charles H. Martin,
Michael W. Mahoney
Abstract:
Selecting suitable architecture parameters and training hyperparameters is essential for enhancing machine learning (ML) model performance. Several recent empirical studies conduct large-scale correlational analysis on neural networks (NNs) to search for effective generalization metrics that can guide this type of model selection. Effective metrics are typically expected to correlate strongly with test performance. In this paper, we expand on prior analyses by examining generalization-metric-based model selection with the following objectives: (i) focusing on natural language processing (NLP) tasks, as prior work primarily concentrates on computer vision (CV) tasks; (ii) considering metrics that directly predict test error instead of the generalization gap; (iii) exploring metrics that do not need access to data to compute. From these objectives, we are able to provide the first model selection results on large pretrained Transformers from Huggingface using generalization metrics. Our analyses consider (I) hundreds of Transformers trained in different settings, in which we systematically vary the amount of data, the model size and the optimization hyperparameters, (II) a total of 51 pretrained Transformers from eight families of Huggingface NLP models, including GPT2, BERT, etc., and (III) a total of 28 existing and novel generalization metrics. Despite their niche status, we find that metrics derived from the heavy-tail (HT) perspective are particularly useful in NLP tasks, exhibiting stronger correlations than other, more popular metrics. To further examine these metrics, we extend prior formulations relying on power law (PL) spectral distributions to exponential (EXP) and exponentially-truncated power law (E-TPL) families.
Submitted 4 June, 2023; v1 submitted 6 February, 2022;
originally announced February 2022.
-
Estimating covariant Lyapunov vectors from data
Authors:
Christoph Martin,
Nahal Sharafi,
Sarah Hallerberg
Abstract:
Covariant Lyapunov vectors characterize the directions along which perturbations in dynamical systems grow. They have also been studied as predictors of critical transitions and extreme events. For many applications, such as prediction, it is necessary to estimate the vectors from data since model equations are unknown for many interesting phenomena. We propose a novel method for estimating covariant Lyapunov vectors based on data records without knowing the underlying equations of the system. In contrast to previous approaches, our approach can be applied to high-dimensional data-sets. We demonstrate that this purely data-driven approach can accurately estimate covariant Lyapunov vectors from data records generated by low and high-dimensional dynamical systems. The highest dimension of a time-series from which covariant Lyapunov vectors were estimated in this contribution is 128. Being able to infer covariant Lyapunov vectors from data-records could encourage numerous future applications in data-analysis and data-based predictions.
Submitted 11 October, 2021; v1 submitted 16 July, 2021;
originally announced July 2021.
-
Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics
Authors:
Charles H. Martin,
Michael W. Mahoney
Abstract:
To better understand good generalization performance in state-of-the-art neural network (NN) models, and in particular the success of the ALPHAHAT metric based on Heavy-Tailed Self-Regularization (HT-SR) theory, we analyze a corpus of models that was made publicly available for a contest to predict the generalization accuracy of NNs. These models include a wide range of qualities and were trained with a range of architectures and regularization hyperparameters. We break ALPHAHAT into its two subcomponent metrics: a scale-based metric; and a shape-based metric. We identify what amounts to a Simpson's paradox: where "scale" metrics (from traditional statistical learning theory) perform well in aggregate, but can perform poorly on subpartitions of the data of a given depth, when regularization hyperparameters are varied; and where "shape" metrics (from HT-SR theory) perform well on each subpartition of the data, when hyperparameters are varied for models of a given depth, but can perform poorly overall when models with varying depths are aggregated. Our results highlight the subtlety of comparing models when both architectures and hyperparameters are varied; the complementary role of implicit scale versus implicit shape parameters in understanding NN model quality; and the need to go beyond one-size-fits-all metrics based on upper bounds from generalization theory to describe the performance of NN models. Our results also clarify further why the ALPHAHAT metric from HT-SR theory works so well at predicting generalization across a broad range of CV and NLP models.
Submitted 8 February, 2022; v1 submitted 1 June, 2021;
originally announced June 2021.
-
Empirical Analysis on Productivity Prediction and Locality for Use Case Points Method
Authors:
Mohammad Azzeh,
Ali Bou Nassif,
Cuauhtemoc Lopez Martin
Abstract:
The Use Case Points (UCP) method has been around for over two decades. Although there has been substantial criticism concerning the algebraic construction and factors assessment of UCP, it remains an efficient early size estimation method. Predicting software effort from UCP is still an ever-present challenge. The earlier version of the UCP method suggested using productivity as a cost driver, where fixed or a few pre-defined productivity ratios have been widely agreed upon. While this approach was successful when not enough historical data was available, it is no longer acceptable because software projects are different in terms of development aspects. Therefore, it is better to understand the relationship between productivity and other UCP variables. This paper examines the impact of data locality approaches on productivity and effort prediction from multiple UCP variables. The environmental factors are used as partitioning factors to produce local homogeneous data either based on their influential levels or using clustering algorithms. Different machine learning methods, including solo and ensemble methods, are used to construct productivity and effort prediction models based on the local data. The results demonstrate that the prediction models that are created based on local data surpass models that use entire data. Also, the results show that conforming to the hypothetical assumption between productivity and environmental factors is not necessarily a requirement for success of locality.
Submitted 11 February, 2021;
originally announced February 2021.
-
Detecting discriminatory risk through data annotation based on Bayesian inferences
Authors:
Elena Beretta,
Antonio Vetrò,
Bruno Lepri,
Juan Carlos De Martin
Abstract:
Thanks to the increasing growth of computational power and data availability, the research in machine learning has advanced with tremendous rapidity. Nowadays, the majority of automatic decision making systems are based on data. However, it is well known that machine learning systems can present problematic results if they are built on partial or incomplete data. In fact, in recent years several studies have found a convergence of issues related to the ethics and transparency of these systems in the process of data collection and how they are recorded. Although the process of rigorous data collection and analysis is fundamental in the model design, this step is still largely overlooked by the machine learning community. For this reason, we propose a method of data annotation based on Bayesian statistical inference that aims to warn about the risk of discriminatory results of a given data set. In particular, our method aims to deepen knowledge and promote awareness about the sampling practices employed to create the training set, highlighting that the probability of success or failure conditioned to a minority membership is given by the structure of the data available. We empirically test our system on three datasets commonly accessed by the machine learning community and we investigate the risk of racial discrimination.
Submitted 27 January, 2021;
originally announced January 2021.
-
Composing an Ensemble Standstill Work for Myo and Bela
Authors:
Charles Patrick Martin,
Alexander Refsum Jensenius,
Jim Torresen
Abstract:
This paper describes the process of developing a standstill performance work using the Myo gesture control armband and the Bela embedded computing platform. The combination of Myo and Bela allows a portable and extensible version of the standstill performance concept while introducing muscle tension as an additional control parameter. We describe the technical details of our setup and introduce Myo-to-Bela and Myo-to-OSC software bridges that assist with prototyping compositions using the Myo controller.
Submitted 4 December, 2020;
originally announced December 2020.
-
A Laptop Ensemble Performance System using Recurrent Neural Networks
Authors:
Rohan Proctor,
Charles Patrick Martin
Abstract:
The popularity of applying machine learning techniques in musical domains has created an inherent availability of freely accessible pre-trained neural network (NN) models ready for use in creative applications. This work outlines the implementation of one such application in the form of an assistance tool designed for live improvisational performances by laptop ensembles. The primary intention was to leverage off-the-shelf pre-trained NN models as a basis for assisting individual performers either as musical novices looking to engage with more experienced performers or as a tool to expand musical possibilities through new forms of creative expression. The system expands upon a variety of ideas found in different research areas including new interfaces for musical expression, generative music and group performance to produce a networked performance solution served via a web-browser interface. The final implementation of the system offers performers a mixture of high and low-level controls to influence the shape of sequences of notes output by locally run NN models in real time, also allowing performers to define their level of engagement with the assisting generative models. Two test performances were played, with the system shown to feasibly support four performers over a four-minute piece while producing musically cohesive and engaging music. Iterations on the design of the system exposed technical constraints on the use of a JavaScript environment for generative models in a live music context, largely derived from inescapable processing overheads.
Submitted 3 December, 2020;
originally announced December 2020.
-
Sonic Sculpture: Activating Engagement with Head-Mounted Augmented Reality
Authors:
Charles Patrick Martin,
Zeruo Liu,
Yichen Wang,
Wennan He,
Henry Gardner
Abstract:
This work examines how head-mounted AR can be used to build an interactive sonic landscape to engage with a public sculpture. We describe a sonic artwork, "Listening To Listening", that has been designed to accompany a real-world sculpture with two prototype interaction schemes. Our artwork is created for the HoloLens platform so that users can have an individual experience in a mixed reality context. Personal head-mounted AR systems have recently become available and practical for integration into public art projects; however, research into sonic sculpture works has yet to account for the affordances of current portable and mainstream AR systems. In this work, we take advantage of the HoloLens' spatial awareness to build sonic spaces that have a precise spatial relationship to a given sculpture and where the sculpture itself is modelled in the augmented scene as an "invisible hologram". We describe the artistic rationale for our artwork, the design of the two interaction schemes, and the technical and usability feedback that we have obtained from demonstrations during iterative development.
Submitted 3 December, 2020;
originally announced December 2020.
-
Tracking Ensemble Performance on Touch-Screens with Gesture Classification and Transition Matrices
Authors:
Charles Martin,
Henry Gardner,
Ben Swift
Abstract:
We present and evaluate a novel interface for tracking ensemble performances on touch-screens. The system uses a Random Forest classifier to extract touch-screen gestures and transition matrix statistics. It analyses the resulting gesture-state sequences across an ensemble of performers. A series of specially designed iPad apps respond to this real-time analysis of free-form gestural performances with calculated modifications to their musical interfaces. We describe our system and evaluate it through cross-validation and profiling as well as concert experience.
Submitted 1 December, 2020;
originally announced December 2020.
-
Performing with a Mobile Computer System for Vibraphone
Authors:
Charles Martin
Abstract:
This paper describes the development of an Apple iPhone based mobile computer system for vibraphone and its use in a series of the author's performance projects in 2011 and 2012. This artistic research was motivated by a desire to develop an alternative to laptop computers for the author's existing percussion and computer performance practice. The aims were to develop a light, compact and flexible system using mobile devices that would allow computer music to infiltrate solo and ensemble performance situations where it is difficult to use a laptop computer. The project began with a system that brought computer elements to Nordlig Vinter, a suite of percussion duos, using an iPhone, RjDj, Pure Data and a home-made pickup system. This process was documented with video recordings and analysed using ethnographic methods. The mobile computer music setup proved to be elegant and convenient in performance situations with very little time and space to set up, as well as in performance classes and workshops. The simple mobile system encouraged experimentation and the platforms used enabled sharing with a wider audience.
Submitted 30 November, 2020;
originally announced December 2020.
-
Strike on Stage: a percussion and media performance
Authors:
Charles Martin,
Chi-Hsia Lai
Abstract:
This paper describes Strike on Stage, an interface and corresponding audio-visual performance work developed and performed in 2010 by percussionists and media artists Chi-Hsia Lai and Charles Martin. The concept of Strike on Stage is to integrate computer visuals and sound into an improvised percussion performance. A large projection surface is positioned directly behind the performers, while a computer vision system tracks their movements. The setup allows computer visualisation and sonification to be directly responsive and unified with the performers' gestures.
Submitted 30 November, 2020;
originally announced December 2020.
-
Cross-artform performance using networked interfaces: Last Man to Die's Vital LMTD
Authors:
Charles Martin,
Benjamin Forster,
Hanna Cormick
Abstract:
In 2009 the cross-artform group Last Man to Die presented a series of performances using new interfaces and networked performance to integrate the three artforms of its members (actor Hanna Cormick, visual artist Benjamin Forster, and percussionist Charles Martin). This paper explains our artistic motivations and design for a computer vision surface and networked heartbeat sensor as well as the experience of mounting our first major work, Vital LMTD.
Submitted 30 November, 2020;
originally announced December 2020.
-
Towards Movement Generation with Audio Features
Authors:
Benedikte Wallace,
Charles P. Martin,
Jim Torresen,
Kristian Nymoen
Abstract:
Sound and movement are closely coupled, particularly in dance. Certain audio features have been found to affect the way we move to music. Is this relationship between sound and movement something which can be modelled using machine learning? This work presents initial experiments wherein high-level audio features calculated from a set of music pieces are included in a movement generation model trained on motion capture recordings of improvised dance. Our results indicate that the model learns to generate realistic dance movements which vary depending on the audio features.
Submitted 26 November, 2020;
originally announced November 2020.
-
Exosphere -- Bringing The Cloud Closer
Authors:
Julian L. Pistorius,
Chris Martin,
Sanjana Sudarshan,
David S. LeBauer
Abstract:
Exosphere provides researcher-friendly software for managing computing workloads on OpenStack cloud infrastructure. Exosphere is a user-friendly alternative to Horizon, the default OpenStack graphical interface. Exosphere can be used with most research cloud infrastructure, requiring near-zero custom integration work.
Submitted 13 October, 2020; v1 submitted 24 August, 2020;
originally announced August 2020.
-
Providing reliability and auditability to the IoT LwM2M protocol through Blockchain
Authors:
Cristian Martín,
Iván Alba,
Joaquín Trillo,
Enrique Soler,
Bartolomé Rubio,
Manuel Díaz
Abstract:
Blockchain has come to provide transparency and reliability, as well as to increase security, in computer systems, especially in distributed ones like the Internet of Things (IoT). A few integrations have been proposed in this context so far; however, most of these solutions do not pay special attention to the interoperability of the IoT, one of the biggest challenges in this field. In this paper, a Blockchain solution has been integrated into the OMA Lightweight M2M (LwM2M), a promising industry IoT protocol for global interoperability. This integration provides reliability and auditability to the LwM2M protocol, enabling IoT devices (LwM2M clients) to transparently interact with the protocol. Furthermore, a missing reliable API to allow users and applications to securely interact with the system and an interface to store critical information like anomalies for auditability have been defined.
Submitted 15 August, 2020;
originally announced August 2020.
-
Kafka-ML: connecting the data stream with ML/AI frameworks
Authors:
Cristian Martín,
Peter Langendoerfer,
Pouya Soltani Zarrin,
Manuel Díaz,
Bartolomé Rubio
Abstract:
Machine Learning (ML) and Artificial Intelligence (AI) have a dependency on data sources to train, improve and make predictions through their algorithms. With the digital revolution and current paradigms like the Internet of Things, this information is turning from static data into continuous data streams. However, most of the ML/AI frameworks used nowadays are not fully prepared for this revolution. In this paper, we proposed Kafka-ML, an open-source framework that enables the management of TensorFlow ML/AI pipelines through data streams (Apache Kafka). Kafka-ML provides an accessible and user-friendly Web User Interface where users can easily define ML models, to then train, evaluate and deploy them for inference. Kafka-ML itself and its deployed components are fully managed through containerization technologies, which ensure its portability and easy distribution and other features such as fault-tolerance and high availability. Finally, a novel approach has been introduced to manage and reuse data streams, which may lead to the (no) utilization of data storage and file systems.
Submitted 16 July, 2020; v1 submitted 7 June, 2020;
originally announced June 2020.
-
A Process for the Evaluation of Node Embedding Methods in the Context of Node Classification
Authors:
Christoph Martin,
Meike Riebeling
Abstract:
Node embedding methods find latent lower-dimensional representations which are used as features in machine learning models. In the last few years, these methods have become extremely popular as a replacement for manual feature engineering. Since authors use various approaches for the evaluation of node embedding methods, existing studies can rarely be efficiently and accurately compared. We address this issue by developing a process for a fair and objective evaluation of node embedding procedures w.r.t. node classification. This process helps researchers and practitioners compare new and existing methods in a reproducible way. We apply this process to four popular node embedding methods and make valuable observations. With an appropriate combination of hyperparameters, good performance can be achieved even with embeddings of lower dimensions, which is positive for the run times of the downstream machine learning task and the embedding algorithm. Multiple hyperparameter combinations yield similar performance. Thus, no extensive, time-consuming search is required to achieve reasonable performance in most cases.
Submitted 29 May, 2020;
originally announced May 2020.
-
Environmental Adaptation of Robot Morphology and Control through Real-world Evolution
Authors:
Tønnes F. Nygaard,
Charles P. Martin,
David Howard,
Jim Torresen,
Kyrre Glette
Abstract:
Robots operating in the real world will experience a range of different environments and tasks. It is essential for the robot to have the ability to adapt to its surroundings to work efficiently in changing conditions. Evolutionary robotics aims to solve this by optimizing both the control and body (morphology) of a robot, allowing adaptation to internal, as well as external factors. Most work in this field has been done in physics simulators, which are relatively simple and not able to replicate the richness of interactions found in the real world. Solutions that rely on the complex interplay between control, body, and environment are therefore rarely found. In this paper, we rely solely on real-world evaluations and apply evolutionary search to yield combinations of morphology and control for our mechanically self-reconfiguring quadruped robot. We evolve solutions on two distinct physical surfaces and analyze the results in terms of both control and morphology. We then transition to two previously unseen surfaces to demonstrate the generality of our method. We find that the evolutionary search finds high-performing and diverse morphology-controller configurations by adapting both control and body to the different properties of the physical environments. We additionally find that morphology and control vary with statistical significance between the environments. Moreover, we observe that our method allows for morphology and control parameters to transfer to previously-unseen terrains, demonstrating the generality of our approach.
Submitted 20 October, 2020; v1 submitted 30 March, 2020;
originally announced March 2020.
-
Addressing the Memory Bottleneck in AI Model Training
Authors:
David Ojika,
Bhavesh Patel,
G. Anthony Reina,
Trent Boyer,
Chad Martin,
Prashant Shah
Abstract:
Using medical imaging as a case study, we demonstrate how Intel-optimized TensorFlow on an x86-based server equipped with 2nd Generation Intel Xeon Scalable Processors with large system memory allows for the training of memory-intensive AI/deep-learning models in a scale-up server configuration. We believe our work represents the first training of a deep neural network with a large memory footprint (~1 TB) on a single-node server. We recommend this configuration to scientists and researchers who wish to develop large, state-of-the-art AI models but are currently limited by memory.
Submitted 11 March, 2020;
originally announced March 2020.
-
Efficient exploration of zero-sum stochastic games
Authors:
Carlos Martin,
Tuomas Sandholm
Abstract:
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay, such as in financial or military simulations and computer games. During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well. After that, the algorithm has to produce a strategy that has low exploitability. Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly. For the stochastic game setting, we propose using the distribution of state-action value functions induced by a belief distribution over possible environments. We compare the performance of various exploration strategies for this task, including generalizations of Thompson sampling and Bayes-UCB to this new setting. These two consistently outperform other strategies.
Submitted 24 February, 2020;
originally announced February 2020.
-
Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data
Authors:
Charles H. Martin,
Tongsu Peng,
Michael W. Mahoney
Abstract:
In many applications, one works with neural network models trained by someone else. For such pretrained models, one may not have access to training data or test data. Moreover, one may not know details about the model, e.g., the specifics of the training data, the loss function, the hyperparameter values, etc. Given one or many pretrained models, it is a challenge to say anything about the expected performance or quality of the models. Here, we address this challenge by providing a detailed meta-analysis of hundreds of publicly-available pretrained models. We examine norm based capacity control metrics as well as power law based metrics from the recently-developed Theory of Heavy-Tailed Self Regularization. We find that norm based metrics correlate well with reported test accuracies for well-trained models, but that they often cannot distinguish well-trained versus poorly-trained models. We also find that power law based metrics can do much better -- quantitatively better at discriminating among series of well-trained models with a given architecture; and qualitatively better at discriminating well-trained versus poorly-trained models. These methods can be used to identify when a pretrained neural network has problems that cannot be detected simply by examining training/test accuracies.
Submitted 2 June, 2021; v1 submitted 16 February, 2020;
originally announced February 2020.
-
Lessons Learned from Real-World Experiments with DyRET: the Dynamic Robot for Embodied Testing
Authors:
Tønnes F. Nygaard,
Jørgen Nordmoen,
Charles P. Martin,
Kyrre Glette
Abstract:
Robots are used in more and more complex environments, and are expected to be able to adapt to changes and unknown situations. The easiest and quickest way to adapt is to change the control system of the robot, but for increasingly complex environments one should also change the body of the robot -- its morphology -- to better fit the task at hand. The theory of Embodied Cognition states that control is not the only source of cognition, and the body, environment, interaction between these and the mind all contribute as cognitive resources. Taking advantage of these concepts could lead to improved adaptivity, robustness, and versatility; however, executing these concepts on real-world robots puts additional requirements on the hardware and has several challenges when compared to learning just control. In contrast to the majority of work in Evolutionary Robotics, Eiben argues for real-world experiments in his "Grand Challenges for Evolutionary Robotics". This requires robust hardware platforms that are capable of repeated experiments and that are at the same time flexible when unforeseen demands arise. In this paper, we introduce our unique robot platform with self-adaptive morphology. We discuss the challenges we have faced when designing it, and the lessons learned from real-world testing and learning.
Submitted 14 May, 2019;
originally announced May 2019.
-
An Interactive Musical Prediction System with Mixture Density Recurrent Neural Networks
Authors:
Charles P Martin,
Jim Torresen
Abstract:
This paper is about creating digital musical instruments where a predictive neural network model is integrated into the interactive system. Rather than predicting symbolic music (e.g., MIDI notes), we suggest that predicting future control data from the user and precise temporal information can lead to new and interesting interactive possibilities. We propose that a mixture density recurrent neural network (MDRNN) is an appropriate model for this task. The predictions can be used to fill-in control data when the user stops performing, or as a kind of filter on the user's own input. We present an interactive MDRNN prediction server that allows rapid prototyping of new NIMEs featuring predictive musical interaction by recording datasets, training MDRNN models, and experimenting with interaction modes. We illustrate our system with several example NIMEs applying this idea. Our evaluation shows that real-time predictive interaction is viable even on single-board computers and that small models are appropriate for small datasets.
Submitted 10 April, 2019;
originally announced April 2019.
-
The invisible power of fairness. How machine learning shapes democracy
Authors:
Elena Beretta,
Antonio Santangelo,
Bruno Lepri,
Antonio Vetrò,
Juan Carlos De Martin
Abstract:
Many machine learning systems make extensive use of large amounts of data regarding human behaviors. Several researchers have found various discriminatory practices related to the use of human-related machine learning systems, for example in the field of criminal justice, credit scoring and advertising. Fair machine learning is therefore emerging as a new field of study to mitigate biases that are inadvertently incorporated into algorithms. Data scientists and computer engineers are making various efforts to provide definitions of fairness. In this paper, we provide an overview of the most widespread definitions of fairness in the field of machine learning, arguing that the ideas highlighting each formalization are closely related to different ideas of justice and to different interpretations of democracy embedded in our culture. This work intends to analyze the definitions of fairness that have been proposed to date to interpret the underlying criteria and to relate them to different ideas of democracy.
Submitted 22 March, 2019;
originally announced March 2019.
-
Evolving Robots on Easy Mode: Towards a Variable Complexity Controller for Quadrupeds
Authors:
Tønnes Frostad Nygaard,
Charles Patrick Martin,
Jim Torresen,
Kyrre Glette
Abstract:
The complexity of a legged robot's environment or task can inform how specialised its gait must be to ensure success. Evolving specialised robotic gaits demands many evaluations - acceptable for computer simulations, but not for physical robots. For some tasks, a more general gait, with lower optimization costs, could be satisfactory. In this paper, we introduce a new type of gait controller where complexity can be set by a single parameter, using a dynamic genotype-phenotype mapping. Low controller complexity leads to conservative gaits, while higher complexity allows more sophistication and high performance for demanding tasks, at the cost of optimization effort. We investigate the new controller on a virtual robot in simulations and do preliminary testing on a real-world robot. We show that having variable complexity allows us to adapt to different optimization budgets. With a high evaluation budget in simulation, a complex controller performs best. Moreover, real-world evolution with a limited evaluation budget indicates that a lower gait complexity is preferable for a relatively simple environment.
Submitted 12 February, 2019;
originally announced February 2019.