-
Latent Linear Quadratic Regulator for Robotic Control Tasks
Authors:
Yuan Zhang,
Shaohui Yang,
Toshiyuki Ohtsuka,
Colin Jones,
Joschka Boedecker
Abstract:
Model predictive control (MPC) has played a more crucial role in various robotic control tasks, but its high computational requirements are concerning, especially for nonlinear dynamical models. This paper presents a $\textbf{la}$tent $\textbf{l}$inear $\textbf{q}$uadratic $\textbf{r}$egulator (LaLQR) that maps the state space into a latent space, on which the dynamical model is linear and the cos…
▽ More
Model predictive control (MPC) has played a more crucial role in various robotic control tasks, but its high computational requirements are concerning, especially for nonlinear dynamical models. This paper presents a $\textbf{la}$tent $\textbf{l}$inear $\textbf{q}$uadratic $\textbf{r}$egulator (LaLQR) that maps the state space into a latent space, on which the dynamical model is linear and the cost function is quadratic, allowing the efficient application of LQR. We jointly learn this alternative system by imitating the original MPC. Experiments show LaLQR's superior efficiency and generalization compared to other baselines.
△ Less
Submitted 15 July, 2024;
originally announced July 2024.
-
GPT-4 is judged more human than humans in displaced and inverted Turing tests
Authors:
Ishika Rathi,
Sydney Taylor,
Benjamin K. Bergen,
Cameron R. Jones
Abstract:
Everyday AI detection requires differentiating between people and AI in informal, online conversations. In many cases, people will not interact directly with AI systems but instead read conversations between AI systems and other people. We measured how well people and large language models can discriminate using two modified versions of the Turing test: inverted and displaced. GPT-3.5, GPT-4, and…
▽ More
Everyday AI detection requires differentiating between people and AI in informal, online conversations. In many cases, people will not interact directly with AI systems but instead read conversations between AI systems and other people. We measured how well people and large language models can discriminate using two modified versions of the Turing test: inverted and displaced. GPT-3.5, GPT-4, and displaced human adjudicators judged whether an agent was human or AI on the basis of a Turing test transcript. We found that both AI and displaced human judges were less accurate than interactive interrogators, with below chance accuracy overall. Moreover, all three judged the best-performing GPT-4 witness to be human more often than human witnesses. This suggests that both humans and current LLMs struggle to distinguish between the two when they are not actively interrogating the person, underscoring an urgent need for more accurate tools to detect AI in conversations.
△ Less
Submitted 11 July, 2024;
originally announced July 2024.
-
Multi-Agent Search-Type Problems on Polygons
Authors:
Konstantinos Georgiou,
Caleb Jones,
Jesse Lucier
Abstract:
We present several advancements in search-type problems for fleets of mobile agents operating in two dimensions under the wireless model. Potential hidden target locations are equidistant from a central point, forming either a disk (infinite possible locations) or regular polygons (finite possible locations). Building on the foundational disk evacuation problem, the disk priority evacuation proble…
▽ More
We present several advancements in search-type problems for fleets of mobile agents operating in two dimensions under the wireless model. Potential hidden target locations are equidistant from a central point, forming either a disk (infinite possible locations) or regular polygons (finite possible locations). Building on the foundational disk evacuation problem, the disk priority evacuation problem with $k$ Servants, and the disk $w$-weighted search problem, we make improvements on several fronts. First we establish new upper and lower bounds for the $n$-gon priority evacuation problem with $1$ Servant for $n \leq 13$, and for $n_k$-gons with $k=2, 3, 4$ Servants, where $n_2 \leq 11$, $n_3 \leq 9$, and $n_4 \leq 10$, offering tight or nearly tight bounds. The only previous results known were a tight upper bound for $k=1$ and $n=6$ and lower bounds for $k=1$ and $n \leq 9$. Second, our work improves the best lower bound known for the disk priority evacuation problem with $k=1$ Servant from $4.46798$ to $4.64666$ and for $k=2$ Servants from $3.6307$ to $3.65332$. Third, we improve the best lower bounds known for the disk $w$-weighted group search problem, significantly reducing the gap between the best upper and lower bounds for $w$ values where the gap was largest. These improvements are based on nearly tight upper and lower bounds for the $11$-gon and $12$-gon $w$-weighted evacuation problems, while previous analyses were limited only to lower bounds and only to $7$-gons.
△ Less
Submitted 27 June, 2024;
originally announced June 2024.
-
Dissecting the Ullman Variations with a SCALPEL: Why do LLMs fail at Trivial Alterations to the False Belief Task?
Authors:
Zhiqiang Pi,
Annapurna Vadaparty,
Benjamin K. Bergen,
Cameron R. Jones
Abstract:
Recent empirical results have sparked a debate about whether or not Large Language Models (LLMs) are capable of Theory of Mind (ToM). While some have found LLMs to be successful on ToM evaluations such as the False Belief task (Kosinski, 2023), others have argued that LLMs solve these tasks by exploiting spurious correlations -- not representing beliefs -- since they fail on trivial alterations to…
▽ More
Recent empirical results have sparked a debate about whether or not Large Language Models (LLMs) are capable of Theory of Mind (ToM). While some have found LLMs to be successful on ToM evaluations such as the False Belief task (Kosinski, 2023), others have argued that LLMs solve these tasks by exploiting spurious correlations -- not representing beliefs -- since they fail on trivial alterations to these tasks (Ullman, 2023). In this paper, we introduce SCALPEL: a technique to generate targeted modifications for False Belief tasks to test different specific hypotheses about why LLMs fail. We find that modifications which make explicit common inferences -- such as that looking at a transparent object implies recognizing its contents -- preserve LLMs' performance. This suggests that LLMs' failures on modified ToM tasks could result from a lack of more general commonsense reasoning, rather than a failure to represent mental states. We argue that SCALPEL could be helpful for explaining LLM successes and failures in other cases.
△ Less
Submitted 20 June, 2024;
originally announced June 2024.
-
People cannot distinguish GPT-4 from a human in a Turing test
Authors:
Cameron R. Jones,
Benjamin K. Bergen
Abstract:
We evaluated 3 systems (ELIZA, GPT-3.5 and GPT-4) in a randomized, controlled, and preregistered Turing test. Human participants had a 5 minute conversation with either a human or an AI, and judged whether or not they thought their interlocutor was human. GPT-4 was judged to be a human 54% of the time, outperforming ELIZA (22%) but lagging behind actual humans (67%). The results provide the first…
▽ More
We evaluated 3 systems (ELIZA, GPT-3.5 and GPT-4) in a randomized, controlled, and preregistered Turing test. Human participants had a 5 minute conversation with either a human or an AI, and judged whether or not they thought their interlocutor was human. GPT-4 was judged to be a human 54% of the time, outperforming ELIZA (22%) but lagging behind actual humans (67%). The results provide the first robust empirical demonstration that any artificial system passes an interactive 2-player Turing test. The results have implications for debates around machine intelligence and, more urgently, suggest that deception by current AI systems may go undetected. Analysis of participants' strategies and reasoning suggests that stylistic and socio-emotional factors play a larger role in passing the Turing test than traditional notions of intelligence.
△ Less
Submitted 9 May, 2024;
originally announced May 2024.
-
Data reification in a concurrent rely-guarantee algebra
Authors:
Larissa A. Meinicke,
Ian J. Hayes,
Cliff B. Jones
Abstract:
Specifications of significant systems can be made short and perspicuous by using abstract data types; data reification can provide a clear, stepwise, development history of programs that use more efficient concrete representations. Data reification (or "refinement") techniques for sequential programs are well established. This paper applies these ideas to concurrency, in particular, an algebraic t…
▽ More
Specifications of significant systems can be made short and perspicuous by using abstract data types; data reification can provide a clear, stepwise, development history of programs that use more efficient concrete representations. Data reification (or "refinement") techniques for sequential programs are well established. This paper applies these ideas to concurrency, in particular, an algebraic theory supporting rely-guarantee reasoning about concurrency. A concurrent version of the Galler-Fischer equivalence relation data structure is used as an example.
△ Less
Submitted 9 May, 2024;
originally announced May 2024.
-
Emergent Language Symbolic Autoencoder (ELSA) with Weak Supervision to Model Hierarchical Brain Networks
Authors:
Ammar Ahmed Pallikonda Latheef,
Alberto Santamaria-Pang,
Craig K Jones,
Haris I Sair
Abstract:
Brain networks display a hierarchical organization, a complexity that poses a challenge for existing deep learning models, often structured as flat classifiers, leading to difficulties in interpretability and the 'black box' issue. To bridge this gap, we propose a novel architecture: a symbolic autoencoder informed by weak supervision and an Emergent Language (EL) framework. This model moves beyon…
▽ More
Brain networks display a hierarchical organization, a complexity that poses a challenge for existing deep learning models, often structured as flat classifiers, leading to difficulties in interpretability and the 'black box' issue. To bridge this gap, we propose a novel architecture: a symbolic autoencoder informed by weak supervision and an Emergent Language (EL) framework. This model moves beyond traditional flat classifiers by producing hierarchical clusters and corresponding imagery, subsequently represented through symbolic sentences to improve the clinical interpretability of hierarchically organized data such as intrinsic brain networks, which can be characterized using resting-state fMRI images. Our innovation includes a generalized hierarchical loss function designed to ensure that both sentences and images accurately reflect the hierarchical structure of functional brain networks. This enables us to model functional brain networks from a broader perspective down to more granular details. Furthermore, we introduce a quantitative method to assess the hierarchical consistency of these symbolic representations. Our qualitative analyses show that our model successfully generates hierarchically organized, clinically interpretable images, a finding supported by our quantitative evaluations. We find that our best performing loss function leads to a hierarchical consistency of over 97% when identifying images corresponding to brain networks. This approach not only advances the interpretability of deep learning models in neuroimaging analysis but also represents a significant step towards modeling the intricate hierarchical nature of brain networks.
△ Less
Submitted 15 April, 2024;
originally announced April 2024.
-
Diagram Analysis of Iterative Algorithms
Authors:
Chris Jones,
Lucas Pesenti
Abstract:
We study a general class of first-order iterative algorithms which includes power iteration, belief propagation and Approximate Message Passing (AMP), and many forms of gradient descent. When the input is a random matrix with i.i.d. entries, we present a new way to analyze these algorithms using combinatorial diagrams. Each diagram is a small graph, and the operations of the algorithm correspond t…
▽ More
We study a general class of first-order iterative algorithms which includes power iteration, belief propagation and Approximate Message Passing (AMP), and many forms of gradient descent. When the input is a random matrix with i.i.d. entries, we present a new way to analyze these algorithms using combinatorial diagrams. Each diagram is a small graph, and the operations of the algorithm correspond to simple combinatorial operations on these graphs.
We prove a fundamental property of the diagrams: asymptotically, we can discard all of the diagrams except for the trees. The mechanics of first-order algorithms simplify dramatically as the algorithmic operations have particularly simple and interpretable effects on the trees. We further show that the tree-shaped diagrams are essentially a basis of asymptotically independent Gaussian vectors.
The tree approximation mirrors the assumption of the cavity method, a 40-year-old non-rigorous technique in statistical physics which has served as one of the most fundamental techniques in the field. We demonstrate the connection with the replica symmetric cavity method by "implementing" heuristic physics derivations into rigorous proofs. We rigorously establish that belief propagation is asymptotically equal to its associated AMP algorithm and we give a new simple proof of the state evolution formula for AMP.
These results apply when the iterative algorithm runs for constantly many iterations. We then push the diagram analysis to a number of iterations that scales with the dimension $n$ of the input matrix. We prove that for debiased power iteration, the tree diagram representation accurately describes the dynamic all the way up to $n^{Ω(1)}$ iterations. We conjecture that this can be extended up to $n^{1/2}$ iterations but no further. Our proofs use straightforward combinatorial arguments akin to the trace method from random matrix theory.
△ Less
Submitted 11 April, 2024;
originally announced April 2024.
-
How to Evaluate Entity Resolution Systems: An Entity-Centric Framework with Application to Inventor Name Disambiguation
Authors:
Olivier Binette,
Youngsoo Baek,
Siddharth Engineer,
Christina Jones,
Abel Dasylva,
Jerome P. Reiter
Abstract:
Entity resolution (record linkage, microclustering) systems are notoriously difficult to evaluate. Looking for a needle in a haystack, traditional evaluation methods use sophisticated, application-specific sampling schemes to find matching pairs of records among an immense number of non-matches. We propose an alternative that facilitates the creation of representative, reusable benchmark data sets…
▽ More
Entity resolution (record linkage, microclustering) systems are notoriously difficult to evaluate. Looking for a needle in a haystack, traditional evaluation methods use sophisticated, application-specific sampling schemes to find matching pairs of records among an immense number of non-matches. We propose an alternative that facilitates the creation of representative, reusable benchmark data sets without necessitating complex sampling schemes. These benchmark data sets can then be used for model training and a variety of evaluation tasks. Specifically, we propose an entity-centric data labeling methodology that integrates with a unified framework for monitoring summary statistics, estimating key performance metrics such as cluster and pairwise precision and recall, and analyzing root causes for errors. We validate the framework in an application to inventor name disambiguation and through simulation studies. Software: https://github.com/OlivierBinette/er-evaluation/
△ Less
Submitted 8 April, 2024;
originally announced April 2024.
-
Mitigating attribute amplification in counterfactual image generation
Authors:
Tian Xia,
Mélanie Roschewitz,
Fabio De Sousa Ribeiro,
Charles Jones,
Ben Glocker
Abstract:
Causal generative modelling is gaining interest in medical imaging due to its ability to answer interventional and counterfactual queries. Most work focuses on generating counterfactual images that look plausible, using auxiliary classifiers to enforce effectiveness of simulated interventions. We investigate pitfalls in this approach, discovering the issue of attribute amplification, where unrelat…
▽ More
Causal generative modelling is gaining interest in medical imaging due to its ability to answer interventional and counterfactual queries. Most work focuses on generating counterfactual images that look plausible, using auxiliary classifiers to enforce effectiveness of simulated interventions. We investigate pitfalls in this approach, discovering the issue of attribute amplification, where unrelated attributes are spuriously affected during interventions, leading to biases across protected characteristics and disease status. We show that attribute amplification is caused by the use of hard labels in the counterfactual training process and propose soft counterfactual fine-tuning to mitigate this issue. Our method substantially reduces the amplification effect while maintaining effectiveness of generated images, demonstrated on a large chest X-ray dataset. Our work makes an important advancement towards more faithful and unbiased causal modelling in medical imaging.
△ Less
Submitted 14 March, 2024;
originally announced March 2024.
-
Deep Neural Network Initialization with Sparsity Inducing Activations
Authors:
Ilan Price,
Nicholas Daultry Ball,
Samuel C. H. Lam,
Adam C. Jones,
Jared Tanner
Abstract:
Inducing and leveraging sparse activations during training and inference is a promising avenue for improving the computational efficiency of deep networks, which is increasingly important as network sizes continue to grow and their application becomes more widespread. Here we use the large width Gaussian process limit to analyze the behaviour, at random initialization, of nonlinear activations tha…
▽ More
Inducing and leveraging sparse activations during training and inference is a promising avenue for improving the computational efficiency of deep networks, which is increasingly important as network sizes continue to grow and their application becomes more widespread. Here we use the large width Gaussian process limit to analyze the behaviour, at random initialization, of nonlinear activations that induce sparsity in the hidden outputs. A previously unreported form of training instability is proven for arguably two of the most natural candidates for hidden layer sparsification; those being a shifted ReLU ($φ(x)=\max(0, x-τ)$ for $τ\ge 0$) and soft thresholding ($φ(x)=0$ for $|x|\leτ$ and $x-\text{sign}(x)τ$ for $|x|>τ$). We show that this instability is overcome by clipping the nonlinear activation magnitude, at a level prescribed by the shape of the associated Gaussian process variance map. Numerical experiments verify the theory and show that the proposed magnitude clipped sparsifying activations can be trained with training and test fractional sparsity as high as 85\% while retaining close to full accuracy.
△ Less
Submitted 25 February, 2024;
originally announced February 2024.
-
Development and validation of an artificial intelligence model to accurately predict spinopelvic parameters
Authors:
Edward S. Harake,
Joseph R. Linzey,
Cheng Jiang,
Rushikesh S. Joshi,
Mark M. Zaki,
Jaes C. Jones,
Siri S. Khalsa,
John H. Lee,
Zachary Wilseck,
Jacob R. Joseph,
Todd C. Hollon,
Paul Park
Abstract:
Objective. Achieving appropriate spinopelvic alignment has been shown to be associated with improved clinical symptoms. However, measurement of spinopelvic radiographic parameters is time-intensive and interobserver reliability is a concern. Automated measurement tools have the promise of rapid and consistent measurements, but existing tools are still limited by some degree of manual user-entry re…
▽ More
Objective. Achieving appropriate spinopelvic alignment has been shown to be associated with improved clinical symptoms. However, measurement of spinopelvic radiographic parameters is time-intensive and interobserver reliability is a concern. Automated measurement tools have the promise of rapid and consistent measurements, but existing tools are still limited by some degree of manual user-entry requirements. This study presents a novel artificial intelligence (AI) tool called SpinePose that automatically predicts spinopelvic parameters with high accuracy without the need for manual entry.
Methods. SpinePose was trained and validated on 761 sagittal whole-spine X-rays to predict sagittal vertical axis (SVA), pelvic tilt (PT), pelvic incidence (PI), sacral slope (SS), lumbar lordosis (LL), T1-pelvic angle (T1PA), and L1-pelvic angle (L1PA). A separate test set of 40 X-rays was labeled by 4 reviewers, including fellowship-trained spine surgeons and a fellowship-trained radiologist with neuroradiology subspecialty certification. Median errors relative to the most senior reviewer were calculated to determine model accuracy on test images. Intraclass correlation coefficients (ICC) were used to assess inter-rater reliability.
Results. SpinePose exhibited the following median (interquartile range) parameter errors: SVA: 2.2(2.3)mm, p=0.93; PT: 1.3(1.2)°, p=0.48; SS: 1.7(2.2)°, p=0.64; PI: 2.2(2.1)°, p=0.24; LL: 2.6(4.0)°, p=0.89; T1PA: 1.1(0.9)°, p=0.42; and L1PA: 1.4(1.6)°, p=0.49. Model predictions also exhibited excellent reliability at all parameters (ICC: 0.91-1.0).
Conclusions. SpinePose accurately predicted spinopelvic parameters with excellent reliability comparable to fellowship-trained spine surgeons and neuroradiologists. Utilization of predictive AI tools in spinal imaging can substantially aid in patient selection and surgical planning.
△ Less
Submitted 8 February, 2024;
originally announced February 2024.
-
Using YOLO v7 to Detect Kidney in Magnetic Resonance Imaging
Authors:
Pouria Yazdian Anari,
Fiona Obiezu,
Nathan Lay,
Fatemeh Dehghani Firouzabadi,
Aditi Chaurasia,
Mahshid Golagha,
Shiva Singh,
Fatemeh Homayounieh,
Aryan Zahergivar,
Stephanie Harmon,
Evrim Turkbey,
Rabindra Gautam,
Kevin Ma,
Maria Merino,
Elizabeth C. Jones,
Mark W. Ball,
W. Marston Linehan,
Baris Turkbey,
Ashkan A. Malayeri
Abstract:
Introduction This study explores the use of the latest You Only Look Once (YOLO V7) object detection method to enhance kidney detection in medical imaging by training and testing a modified YOLO V7 on medical image formats. Methods Study includes 878 patients with various subtypes of renal cell carcinoma (RCC) and 206 patients with normal kidneys. A total of 5657 MRI scans for 1084 patients were r…
▽ More
Introduction This study explores the use of the latest You Only Look Once (YOLO V7) object detection method to enhance kidney detection in medical imaging by training and testing a modified YOLO V7 on medical image formats. Methods Study includes 878 patients with various subtypes of renal cell carcinoma (RCC) and 206 patients with normal kidneys. A total of 5657 MRI scans for 1084 patients were retrieved. 326 patients with 1034 tumors recruited from a retrospective maintained database, and bounding boxes were drawn around their tumors. A primary model was trained on 80% of annotated cases, with 20% saved for testing (primary test set). The best primary model was then used to identify tumors in the remaining 861 patients and bounding box coordinates were generated on their scans using the model. Ten benchmark training sets were created with generated coordinates on not-segmented patients. The final model used to predict the kidney in the primary test set. We reported the positive predictive value (PPV), sensitivity, and mean average precision (mAP). Results The primary training set showed an average PPV of 0.94 +/- 0.01, sensitivity of 0.87 +/- 0.04, and mAP of 0.91 +/- 0.02. The best primary model yielded a PPV of 0.97, sensitivity of 0.92, and mAP of 0.95. The final model demonstrated an average PPV of 0.95 +/- 0.03, sensitivity of 0.98 +/- 0.004, and mAP of 0.95 +/- 0.01. Conclusion Using a semi-supervised approach with a medical image library, we developed a high-performing model for kidney detection. Further external validation is required to assess the model's generalizability.
△ Less
Submitted 12 February, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
Principled Preferential Bayesian Optimization
Authors:
Wenjie Xu,
Wenbin Wang,
Yuning Jiang,
Bratislav Svetozarevic,
Colin N. Jones
Abstract:
We study the problem of preferential Bayesian optimization (BO), where we aim to optimize a black-box function with only preference feedback over a pair of candidate solutions. Inspired by the likelihood ratio idea, we construct a confidence set of the black-box function using only the preference feedback. An optimistic algorithm with an efficient computational method is then developed to solve th…
▽ More
We study the problem of preferential Bayesian optimization (BO), where we aim to optimize a black-box function with only preference feedback over a pair of candidate solutions. Inspired by the likelihood ratio idea, we construct a confidence set of the black-box function using only the preference feedback. An optimistic algorithm with an efficient computational method is then developed to solve the problem, which enjoys an information-theoretic bound on the total cumulative regret, a first-of-its-kind for preferential BO. This bound further allows us to design a scheme to report an estimated best solution, with a guaranteed convergence rate. Experimental results on sampled instances from Gaussian processes, standard test functions, and a thermal comfort optimization problem all show that our method stably achieves better or competitive performance as compared to the existing state-of-the-art heuristics, which, however, do not have theoretical guarantees on regret bounds or convergence.
△ Less
Submitted 29 May, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
Ensuring Data Privacy in AC Optimal Power Flow with a Distributed Co-Simulation Framework
Authors:
Xinliang Dai,
Alexander Kocher,
Jovana Kovačević,
Burak Dindar,
Yuning Jiang,
Colin N. Jones,
Hüseyin Çakmak,
Veit Hagenmeyer
Abstract:
During the energy transition, the significance of collaborative management among institutions is rising, confronting challenges posed by data privacy concerns. Prevailing research on distributed approaches, as an alternative to centralized management, often lacks numerical convergence guarantees or is limited to single-machine numerical simulation. To address this, we present a distributed approac…
▽ More
During the energy transition, the significance of collaborative management among institutions is rising, confronting challenges posed by data privacy concerns. Prevailing research on distributed approaches, as an alternative to centralized management, often lacks numerical convergence guarantees or is limited to single-machine numerical simulation. To address this, we present a distributed approach for solving AC Optimal Power Flow (OPF) problems within a geographically distributed environment. This involves integrating the energy system Co-Simulation (eCoSim) module in the eASiMOV framework with the convergence-guaranteed distributed optimization algorithm, i.e., the Augmented Lagrangian based Alternating Direction Inexact Newton method (ALADIN). Comprehensive evaluations across multiple system scenarios reveal a marginal performance slowdown compared to the centralized approach and the distributed approach executed on single machines -- a justified trade-off for enhanced data privacy. This investigation serves as empirical validation of the successful execution of distributed AC OPF within a geographically distributed environment, highlighting potential directions for future research.
△ Less
Submitted 15 March, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
RTA-Former: Reverse Transformer Attention for Polyp Segmentation
Authors:
Zhikai Li,
Murong Yi,
Ali Uneri,
Sihan Niu,
Craig Jones
Abstract:
Polyp segmentation is a key aspect of colorectal cancer prevention, enabling early detection and guiding subsequent treatments. Intelligent diagnostic tools, including deep learning solutions, are widely explored to streamline and potentially automate this process. However, even with many powerful network architectures, there still comes the problem of producing accurate edge segmentation. In this…
▽ More
Polyp segmentation is a key aspect of colorectal cancer prevention, enabling early detection and guiding subsequent treatments. Intelligent diagnostic tools, including deep learning solutions, are widely explored to streamline and potentially automate this process. However, even with many powerful network architectures, there still comes the problem of producing accurate edge segmentation. In this paper, we introduce a novel network, namely RTA-Former, that employs a transformer model as the encoder backbone and innovatively adapts Reverse Attention (RA) with a transformer stage in the decoder for enhanced edge segmentation. The results of the experiments illustrate that RTA-Former achieves state-of-the-art (SOTA) performance in five polyp segmentation datasets. The strong capability of RTA-Former holds promise in improving the accuracy of Transformer-based polyp segmentation, potentially leading to better clinical decisions and patient outcomes. Our code is publicly available on GitHub.
△ Less
Submitted 28 April, 2024; v1 submitted 21 January, 2024;
originally announced January 2024.
-
Extending Rely-Guarantee thinking to handle Real-Time Scheduling
Authors:
Cliff B. Jones,
Alan Burns
Abstract:
The reference point for developing any artefact is its specification; to develop software formally, a formal specification is required. For sequential programs, pre and post conditions (together with abstract objects) suffice; rely and guarantee conditions extend the scope of formal development approaches to tackle concurrency. In addition, real-time systems need ways of both requiring progress an…
▽ More
The reference point for developing any artefact is its specification; to develop software formally, a formal specification is required. For sequential programs, pre and post conditions (together with abstract objects) suffice; rely and guarantee conditions extend the scope of formal development approaches to tackle concurrency. In addition, real-time systems need ways of both requiring progress and relating that progress to some notion of time. This paper extends rely-guarantee ideas to cope with specifications of -- and assumptions about -- real-time schedulers. Furthermore it shows how the approach helps identify and specify fault-tolerance aspects of such schedulers by systematically challenging the assumptions.
△ Less
Submitted 30 November, 2023;
originally announced December 2023.
-
Using Rely/Guarantee to Pinpoint Assumptions underlying Security Protocols
Authors:
Nisansala P. Yatapanage,
Cliff B. Jones
Abstract:
The verification of security protocols is essential, in order to ensure the absence of potential attacks. However, verification results are only valid with respect to the assumptions under which the verification was performed. These assumptions are often hidden and are difficult to identify, making it unclear whether a given protocol is safe to deploy into a particular environment. Rely/guarantee…
▽ More
The verification of security protocols is essential, in order to ensure the absence of potential attacks. However, verification results are only valid with respect to the assumptions under which the verification was performed. These assumptions are often hidden and are difficult to identify, making it unclear whether a given protocol is safe to deploy into a particular environment. Rely/guarantee provides a mechanism for abstractly reasoning about the interference from the environment. Using this approach, the assumptions are made clear and precise. This paper investigates this approach on the Needham-Schroeder Public Key protocol, showing that the technique can effectively uncover the assumptions under which the protocol can withstand attacks from intruders.
△ Less
Submitted 25 November, 2023;
originally announced November 2023.
-
Evidential Uncertainty Quantification: A Variance-Based Perspective
Authors:
Ruxiao Duan,
Brian Caffo,
Harrison X. Bai,
Haris I. Sair,
Craig Jones
Abstract:
Uncertainty quantification of deep neural networks has become an active field of research and plays a crucial role in various downstream tasks such as active learning. Recent advances in evidential deep learning shed light on the direct quantification of aleatoric and epistemic uncertainties with a single forward pass of the model. Most traditional approaches adopt an entropy-based method to deriv…
▽ More
Uncertainty quantification of deep neural networks has become an active field of research and plays a crucial role in various downstream tasks such as active learning. Recent advances in evidential deep learning shed light on the direct quantification of aleatoric and epistemic uncertainties with a single forward pass of the model. Most traditional approaches adopt an entropy-based method to derive evidential uncertainty in classification, quantifying uncertainty at the sample level. However, the variance-based method that has been widely applied in regression problems is seldom used in the classification setting. In this work, we adapt the variance-based approach from regression to classification, quantifying classification uncertainty at the class level. The variance decomposition technique in regression is extended to class covariance decomposition in classification based on the law of total covariance, and the class correlation is also derived from the covariance. Experiments on cross-domain datasets are conducted to illustrate that the variance-based approach not only results in similar accuracy as the entropy-based one in active domain adaptation but also brings information about class-wise uncertainties as well as between-class correlations. The code is available at https://github.com/KerryDRX/EvidentialADA. This alternative means of evidential uncertainty quantification will give researchers more options when class uncertainties and correlations are important in their applications.
△ Less
Submitted 19 November, 2023;
originally announced November 2023.
-
Stable Linear Subspace Identification: A Machine Learning Approach
Authors:
Loris Di Natale,
Muhammad Zakwan,
Bratislav Svetozarevic,
Philipp Heer,
Giancarlo Ferrari-Trecate,
Colin N. Jones
Abstract:
Machine Learning (ML) and linear System Identification (SI) have been historically developed independently. In this paper, we leverage well-established ML tools - especially the automatic differentiation framework - to introduce SIMBa, a family of discrete linear multi-step-ahead state-space SI methods using backpropagation. SIMBa relies on a novel Linear-Matrix-Inequality-based free parametrizati…
▽ More
Machine Learning (ML) and linear System Identification (SI) have been historically developed independently. In this paper, we leverage well-established ML tools - especially the automatic differentiation framework - to introduce SIMBa, a family of discrete linear multi-step-ahead state-space SI methods using backpropagation. SIMBa relies on a novel Linear-Matrix-Inequality-based free parametrization of Schur matrices to ensure the stability of the identified model.
We show how SIMBa generally outperforms traditional linear state-space SI methods, and sometimes significantly, although at the price of a higher computational burden. This performance gap is particularly remarkable compared to other SI methods with stability guarantees, where the gain is frequently above 25% in our investigations, hinting at SIMBa's ability to simultaneously achieve state-of-the-art fitting performance and enforce stability. Interestingly, these observations hold for a wide variety of input-output systems and on both simulated and real-world data, showcasing the flexibility of the proposed approach. We postulate that this new SI paradigm presents a great extension potential to identify structured nonlinear models from data, and we hence open-source SIMBa on https://github.com/Cemempamoi/simba.
△ Less
Submitted 26 March, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
-
Does GPT-4 pass the Turing test?
Authors:
Cameron R. Jones,
Benjamin K. Bergen
Abstract:
We evaluated GPT-4 in a public online Turing test. The best-performing GPT-4 prompt passed in 49.7% of games, outperforming ELIZA (22%) and GPT-3.5 (20%), but falling short of the baseline set by human participants (66%). Participants' decisions were based mainly on linguistic style (35%) and socioemotional traits (27%), supporting the idea that intelligence, narrowly conceived, is not sufficient…
▽ More
We evaluated GPT-4 in a public online Turing test. The best-performing GPT-4 prompt passed in 49.7% of games, outperforming ELIZA (22%) and GPT-3.5 (20%), but falling short of the baseline set by human participants (66%). Participants' decisions were based mainly on linguistic style (35%) and socioemotional traits (27%), supporting the idea that intelligence, narrowly conceived, is not sufficient to pass the Turing test. Participant knowledge about LLMs and number of games played positively correlated with accuracy in detecting AI, suggesting learning and practice as possible strategies to mitigate deception. Despite known limitations as a test of intelligence, we argue that the Turing test continues to be relevant as an assessment of naturalistic communication and deception. AI models with the ability to masquerade as humans could have widespread societal consequences, and we analyse the effectiveness of different strategies and criteria for judging humanlikeness.
△ Less
Submitted 20 April, 2024; v1 submitted 31 October, 2023;
originally announced October 2023.
-
Learning to Decode the Surface Code with a Recurrent, Transformer-Based Neural Network
Authors:
Johannes Bausch,
Andrew W Senior,
Francisco J H Heras,
Thomas Edlich,
Alex Davies,
Michael Newman,
Cody Jones,
Kevin Satzinger,
Murphy Yuezhen Niu,
Sam Blackwell,
George Holland,
Dvir Kafri,
Juan Atalaya,
Craig Gidney,
Demis Hassabis,
Sergio Boixo,
Hartmut Neven,
Pushmeet Kohli
Abstract:
Quantum error-correction is a prerequisite for reliable quantum computation. Towards this goal, we present a recurrent, transformer-based neural network which learns to decode the surface code, the leading quantum error-correction code. Our decoder outperforms state-of-the-art algorithmic decoders on real-world data from Google's Sycamore quantum processor for distance 3 and 5 surface codes. On di…
▽ More
Quantum error-correction is a prerequisite for reliable quantum computation. Towards this goal, we present a recurrent, transformer-based neural network which learns to decode the surface code, the leading quantum error-correction code. Our decoder outperforms state-of-the-art algorithmic decoders on real-world data from Google's Sycamore quantum processor for distance 3 and 5 surface codes. On distances up to 11, the decoder maintains its advantage on simulated data with realistic noise including cross-talk, leakage, and analog readout signals, and sustains its accuracy far beyond the 25 cycles it was trained on. Our work illustrates the ability of machine learning to go beyond human-designed algorithms by learning from data directly, highlighting machine learning as a strong contender for decoding in quantum computers.
△ Less
Submitted 9 October, 2023;
originally announced October 2023.
-
CMSSW Scaling Limits on Many-Core Machines
Authors:
Christopher Jones,
Patrick Gartung
Abstract:
Today the LHC offline computing relies heavily on CPU resources, despite the interest in compute accelerators, such as GPUs, for the longer term future. The number of cores per CPU socket has continued to increase steadily, reaching the levels of 64 cores (128 threads) with recent AMD EPYC processors, and 128 cores on Ampere Altra Max ARM processors. Over the course of the past decade, the CMS dat…
▽ More
Today the LHC offline computing relies heavily on CPU resources, despite the interest in compute accelerators, such as GPUs, for the longer term future. The number of cores per CPU socket has continued to increase steadily, reaching the levels of 64 cores (128 threads) with recent AMD EPYC processors, and 128 cores on Ampere Altra Max ARM processors. Over the course of the past decade, the CMS data processing framework, CMSSW, has been transformed from a single-threaded framework into a highly concurrent one. The first multithreaded version was brought into production by the start of the LHC Run 2 in 2015. Since then, the framework's threading efficiency has gradually been improved by adding more levels of concurrency and reducing the amount of serial code paths. The latest addition was support for concurrent Runs. In this work we review the concurrency model of the CMSSW, and measure its scalability with real CMS applications, such as simulation and reconstruction, on mode rn many-core machines. We show metrics such as event processing throughput and application memory usage with and without the contribution of I/O, as I/O has been the major scaling limitation for the CMS applications.
△ Less
Submitted 4 October, 2023;
originally announced October 2023.
-
Multi-Agent Bayesian Optimization with Coupled Black-Box and Affine Constraints
Authors:
Wenjie Xu,
Yuning Jiang,
Bratislav Svetozarevic,
Colin N. Jones
Abstract:
This paper studies the problem of distributed multi-agent Bayesian optimization with both coupled black-box constraints and known affine constraints. A primal-dual distributed algorithm is proposed that achieves similar regret/violation bounds as those in the single-agent case for the black-box objective and constraint functions. Additionally, the algorithm guarantees an $\mathcal{O}(N\sqrt{T})$ b…
▽ More
This paper studies the problem of distributed multi-agent Bayesian optimization with both coupled black-box constraints and known affine constraints. A primal-dual distributed algorithm is proposed that achieves similar regret/violation bounds as those in the single-agent case for the black-box objective and constraint functions. Additionally, the algorithm guarantees an $\mathcal{O}(N\sqrt{T})$ bound on the cumulative violation for the known affine constraints, where $N$ is the number of agents. Hence, it is ensured that the average of the samples satisfies the affine constraints up to the error $\mathcal{O}({N}/{\sqrt{T}})$. Furthermore, we characterize certain conditions under which our algorithm can bound a stronger metric of cumulative violation and provide best-iterate convergence without affine constraint. The method is then applied to both sampled instances from Gaussian processes and a real-world optimal power allocation problem for wireless communication; the results show that our method simultaneously provides close-to-optimal performance and maintains minor violations on average, corroborating our theoretical analysis.
△ Less
Submitted 2 October, 2023;
originally announced October 2023.
-
Data-driven adaptive building thermal controller tuning with constraints: A primal-dual contextual Bayesian optimization approach
Authors:
Wenjie Xu,
Bratislav Svetozarevic,
Loris Di Natale,
Philipp Heer,
Colin N Jones
Abstract:
We study the problem of tuning the parameters of a room temperature controller to minimize its energy consumption, subject to the constraint that the daily cumulative thermal discomfort of the occupants is below a given threshold. We formulate it as an online constrained black-box optimization problem where, on each day, we observe some relevant environmental context and adaptively select the cont…
▽ More
We study the problem of tuning the parameters of a room temperature controller to minimize its energy consumption, subject to the constraint that the daily cumulative thermal discomfort of the occupants is below a given threshold. We formulate it as an online constrained black-box optimization problem where, on each day, we observe some relevant environmental context and adaptively select the controller parameters. In this paper, we propose to use a data-driven Primal-Dual Contextual Bayesian Optimization (PDCBO) approach to solve this problem. In a simulation case study on a single room, we apply our algorithm to tune the parameters of a Proportional Integral (PI) heating controller and the pre-heating time. Our results show that PDCBO can save up to 4.7% energy consumption compared to other state-of-the-art Bayesian optimization-based methods while keeping the daily thermal discomfort below the given tolerable threshold on average. Additionally, PDCBO can automatically track time-varying tolerable thresholds while existing methods fail to do so. We then study an alternative constrained tuning problem where we aim to minimize the thermal discomfort with a given energy budget. With this formulation, PDCBO reduces the average discomfort by up to 63% compared to state-of-the-art safe optimization methods while keeping the average daily energy consumption below the required threshold.
△ Less
Submitted 1 October, 2023;
originally announced October 2023.
-
Synthia's Melody: A Benchmark Framework for Unsupervised Domain Adaptation in Audio
Authors:
Chia-Hsin Lin,
Charles Jones,
Björn W. Schuller,
Harry Coppock
Abstract:
Despite significant advancements in deep learning for vision and natural language, unsupervised domain adaptation in audio remains relatively unexplored. We, in part, attribute this to the lack of an appropriate benchmark dataset. To address this gap, we present Synthia's melody, a novel audio data generation framework capable of simulating an infinite variety of 4-second melodies with user-specif…
▽ More
Despite significant advancements in deep learning for vision and natural language, unsupervised domain adaptation in audio remains relatively unexplored. We, in part, attribute this to the lack of an appropriate benchmark dataset. To address this gap, we present Synthia's melody, a novel audio data generation framework capable of simulating an infinite variety of 4-second melodies with user-specified confounding structures characterised by musical keys, timbre, and loudness. Unlike existing datasets collected under observational settings, Synthia's melody is free of unobserved biases, ensuring the reproducibility and comparability of experiments. To showcase its utility, we generate two types of distribution shifts-domain shift and sample selection bias-and evaluate the performance of acoustic deep learning models under these shifts. Our evaluations reveal that Synthia's melody provides a robust testbed for examining the susceptibility of these models to varying levels of distribution shift.
△ Less
Submitted 26 September, 2023;
originally announced September 2023.
-
Applications of Sequential Learning for Medical Image Classification
Authors:
Sohaib Naim,
Brian Caffo,
Haris I Sair,
Craig K Jones
Abstract:
Purpose: The aim of this work is to develop a neural network training framework for continual training of small amounts of medical imaging data and create heuristics to assess training in the absence of a hold-out validation or test set.
Materials and Methods: We formulated a retrospective sequential learning approach that would train and consistently update a model on mini-batches of medical im…
▽ More
Purpose: The aim of this work is to develop a neural network training framework for continual training of small amounts of medical imaging data and create heuristics to assess training in the absence of a hold-out validation or test set.
Materials and Methods: We formulated a retrospective sequential learning approach that would train and consistently update a model on mini-batches of medical images over time. We address problems that impede sequential learning such as overfitting, catastrophic forgetting, and concept drift through PyTorch convolutional neural networks (CNN) and publicly available Medical MNIST and NIH Chest X-Ray imaging datasets. We begin by comparing two methods for a sequentially trained CNN with and without base pre-training. We then transition to two methods of unique training and validation data recruitment to estimate full information extraction without overfitting. Lastly, we consider an example of real-life data that shows how our approach would see mainstream research implementation.
Results: For the first experiment, both approaches successfully reach a ~95% accuracy threshold, although the short pre-training step enables sequential accuracy to plateau in fewer steps. The second experiment comparing two methods showed better performance with the second method which crosses the ~90% accuracy threshold much sooner. The final experiment showed a slight advantage with a pre-training step that allows the CNN to cross ~60% threshold much sooner than without pre-training.
Conclusion: We have displayed sequential learning as a serviceable multi-classification technique statistically comparable to traditional CNNs that can acquire data in small increments feasible for clinically realistic scenarios.
△ Less
Submitted 25 September, 2023;
originally announced September 2023.
-
An Optimization Case Study for solving a Transport Robot Scheduling Problem on Quantum-Hybrid and Quantum-Inspired Hardware
Authors:
Dominik Leib,
Tobias Seidel,
Sven Jäger,
Raoul Heese,
Caitlin Isobel Jones,
Abhishek Awasthi,
Astrid Niederle,
Michael Bortz
Abstract:
We present a comprehensive case study comparing the performance of D-Waves' quantum-classical hybrid framework, Fujitsu's quantum-inspired digital annealer, and Gurobi's state-of-the-art classical solver in solving a transport robot scheduling problem. This problem originates from an industrially relevant real-world scenario. We provide three different models for our problem following different de…
▽ More
We present a comprehensive case study comparing the performance of D-Waves' quantum-classical hybrid framework, Fujitsu's quantum-inspired digital annealer, and Gurobi's state-of-the-art classical solver in solving a transport robot scheduling problem. This problem originates from an industrially relevant real-world scenario. We provide three different models for our problem following different design philosophies. In our benchmark, we focus on the solution quality and end-to-end runtime of different model and solver combinations. We find promising results for the digital annealer and some opportunities for the hybrid quantum annealer in direct comparison with Gurobi. Our study provides insights into the workflow for solving an application-oriented optimization problem with different strategies, and can be useful for evaluating the strengths and weaknesses of different approaches.
△ Less
Submitted 24 October, 2023; v1 submitted 18 September, 2023;
originally announced September 2023.
-
No Fair Lunch: A Causal Perspective on Dataset Bias in Machine Learning for Medical Imaging
Authors:
Charles Jones,
Daniel C. Castro,
Fabio De Sousa Ribeiro,
Ozan Oktay,
Melissa McCradden,
Ben Glocker
Abstract:
As machine learning methods gain prominence within clinical decision-making, addressing fairness concerns becomes increasingly urgent. Despite considerable work dedicated to detecting and ameliorating algorithmic bias, today's methods are deficient with potentially harmful consequences. Our causal perspective sheds new light on algorithmic bias, highlighting how different sources of dataset bias m…
▽ More
As machine learning methods gain prominence within clinical decision-making, addressing fairness concerns becomes increasingly urgent. Despite considerable work dedicated to detecting and ameliorating algorithmic bias, today's methods are deficient with potentially harmful consequences. Our causal perspective sheds new light on algorithmic bias, highlighting how different sources of dataset bias may appear indistinguishable yet require substantially different mitigation strategies. We introduce three families of causal bias mechanisms stemming from disparities in prevalence, presentation, and annotation. Our causal analysis underscores how current mitigation methods tackle only a narrow and often unrealistic subset of scenarios. We provide a practical three-step framework for reasoning about fairness in medical imaging, supporting the development of safe and equitable AI prediction models.
△ Less
Submitted 31 July, 2023;
originally announced July 2023.
-
Automated Artifact Detection in Ultra-widefield Fundus Photography of Patients with Sickle Cell Disease
Authors:
Anqi Feng,
Dimitri Johnson,
Grace R. Reilly,
Loka Thangamathesvaran,
Ann Nampomba,
Mathias Unberath,
Adrienne W. Scott,
Craig Jones
Abstract:
Importance: Ultra-widefield fundus photography (UWF-FP) has shown utility in sickle cell retinopathy screening; however, image artifact may diminish quality and gradeability of images. Objective: To create an automated algorithm for UWF-FP artifact classification. Design: A neural network based automated artifact detection algorithm was designed to identify commonly encountered UWF-FP artifacts in…
▽ More
Importance: Ultra-widefield fundus photography (UWF-FP) has shown utility in sickle cell retinopathy screening; however, image artifact may diminish quality and gradeability of images. Objective: To create an automated algorithm for UWF-FP artifact classification. Design: A neural network based automated artifact detection algorithm was designed to identify commonly encountered UWF-FP artifacts in a cross section of patient UWF-FP. A pre-trained ResNet-50 neural network was trained on a subset of the images and the classification accuracy, sensitivity, and specificity were quantified on the hold out test set. Setting: The study is based on patients from a tertiary care hospital site. Participants: There were 243 UWF-FP acquired from patients with sickle cell disease (SCD), and artifact labelling in the following categories was performed: Eyelash Present, Lower Eyelid Obstructing, Upper Eyelid Obstructing, Image Too Dark, Dark Artifact, and Image Not Centered. Results: Overall, the accuracy for each class was Eyelash Present at 83.7%, Lower Eyelid Obstructing at 83.7%, Upper Eyelid Obstructing at 98.0%, Image Too Dark at 77.6%, Dark Artifact at 93.9%, and Image Not Centered at 91.8%. Conclusions and Relevance: This automated algorithm shows promise in identifying common imaging artifacts on a subset of Optos UWF-FP in SCD patients. Further refinement is ongoing with the goal of improving efficiency of tele-retinal screening in sickle cell retinopathy (SCR) by providing a photographer real-time feedback as to the types of artifacts present, and the need for image re-acquisition. This algorithm also may have potential future applicability in other retinal diseases by improving quality and efficiency of image acquisition of UWF-FP.
△ Less
Submitted 11 July, 2023;
originally announced July 2023.
-
The Role of Subgroup Separability in Group-Fair Medical Image Classification
Authors:
Charles Jones,
Mélanie Roschewitz,
Ben Glocker
Abstract:
We investigate performance disparities in deep classifiers. We find that the ability of classifiers to separate individuals into subgroups varies substantially across medical imaging modalities and protected characteristics; crucially, we show that this property is predictive of algorithmic bias. Through theoretical analysis and extensive empirical evaluation, we find a relationship between subgro…
▽ More
We investigate performance disparities in deep classifiers. We find that the ability of classifiers to separate individuals into subgroups varies substantially across medical imaging modalities and protected characteristics; crucially, we show that this property is predictive of algorithmic bias. Through theoretical analysis and extensive empirical evaluation, we find a relationship between subgroup separability, subgroup disparities, and performance degradation when models are trained on data with systematic bias such as underdiagnosis. Our findings shed new light on the question of how models become biased, providing important insights for the development of fair medical imaging AI.
△ Less
Submitted 6 July, 2023;
originally announced July 2023.
-
Physics-Informed Machine Learning for Modeling and Control of Dynamical Systems
Authors:
Truong X. Nghiem,
Ján Drgoňa,
Colin Jones,
Zoltan Nagy,
Roland Schwan,
Biswadip Dey,
Ankush Chakrabarty,
Stefano Di Cairano,
Joel A. Paulson,
Andrea Carron,
Melanie N. Zeilinger,
Wenceslao Shaw Cortez,
Draguna L. Vrabie
Abstract:
Physics-informed machine learning (PIML) is a set of methods and tools that systematically integrate machine learning (ML) algorithms with physical constraints and abstract mathematical models developed in scientific and engineering domains. As opposed to purely data-driven methods, PIML models can be trained from additional information obtained by enforcing physical laws such as energy and mass c…
▽ More
Physics-informed machine learning (PIML) is a set of methods and tools that systematically integrate machine learning (ML) algorithms with physical constraints and abstract mathematical models developed in scientific and engineering domains. As opposed to purely data-driven methods, PIML models can be trained from additional information obtained by enforcing physical laws such as energy and mass conservation. More broadly, PIML models can include abstract properties and conditions such as stability, convexity, or invariance. The basic premise of PIML is that the integration of ML and physics can yield more effective, physically consistent, and data-efficient models. This paper aims to provide a tutorial-like overview of the recent advances in PIML for dynamical system modeling and control. Specifically, the paper covers an overview of the theory, fundamental concepts and methods, tools, and applications on topics of: 1) physics-informed learning for system identification; 2) physics-informed learning for control; 3) analysis and verification of PIML models; and 4) physics-informed digital twins. The paper is concluded with a perspective on open challenges and future research opportunities.
△ Less
Submitted 24 June, 2023;
originally announced June 2023.
-
Bayesian Optimization of Expensive Nested Grey-Box Functions
Authors:
Wenjie Xu,
Yuning Jiang,
Bratislav Svetozarevic,
Colin N. Jones
Abstract:
We consider the problem of optimizing a grey-box objective function, i.e., nested function composed of both black-box and white-box functions. A general formulation for such grey-box problems is given, which covers the existing grey-box optimization formulations as special cases. We then design an optimism-driven algorithm to solve it. Under certain regularity assumptions, our algorithm achieves s…
▽ More
We consider the problem of optimizing a grey-box objective function, i.e., nested function composed of both black-box and white-box functions. A general formulation for such grey-box problems is given, which covers the existing grey-box optimization formulations as special cases. We then design an optimism-driven algorithm to solve it. Under certain regularity assumptions, our algorithm achieves similar regret bound as that for the standard black-box Bayesian optimization algorithm, up to a constant multiplicative term depending on the Lipschitz constants of the functions considered. We further extend our method to the constrained case and discuss special cases. For the commonly used kernel functions, the regret bounds allow us to derive a convergence rate to the optimal solution. Experimental results show that our grey-box optimization method empirically improves the speed of finding the global optimal solution significantly, as compared to the standard black-box optimization algorithm.
△ Less
Submitted 2 August, 2023; v1 submitted 8 June, 2023;
originally announced June 2023.
-
Navigating Fairness in Radiology AI: Concepts, Consequences,and Crucial Considerations
Authors:
Vasantha Kumar Venugopal,
Abhishek Gupta,
Rohit Takhar,
Charlene Liew Jin Yee,
Catherine Jones,
Gilberto Szarf
Abstract:
Artificial Intelligence (AI) has significantly revolutionized radiology, promising improved patient outcomes and streamlined processes. However, it's critical to ensure the fairness of AI models to prevent stealthy bias and disparities from leading to unequal outcomes. This review discusses the concept of fairness in AI, focusing on bias auditing using the Aequitas toolkit, and its real-world impl…
▽ More
Artificial Intelligence (AI) has significantly revolutionized radiology, promising improved patient outcomes and streamlined processes. However, it's critical to ensure the fairness of AI models to prevent stealthy bias and disparities from leading to unequal outcomes. This review discusses the concept of fairness in AI, focusing on bias auditing using the Aequitas toolkit, and its real-world implications in radiology, particularly in disease screening scenarios. Aequitas, an open-source bias audit toolkit, scrutinizes AI models' decisions, identifying hidden biases that may result in disparities across different demographic groups and imaging equipment brands. This toolkit operates on statistical theories, analyzing a large dataset to reveal a model's fairness. It excels in its versatility to handle various variables simultaneously, especially in a field as diverse as radiology. The review explicates essential fairness metrics: Equal and Proportional Parity, False Positive Rate Parity, False Discovery Rate Parity, False Negative Rate Parity, and False Omission Rate Parity. Each metric serves unique purposes and offers different insights. We present hypothetical scenarios to demonstrate their relevance in disease screening settings, and how disparities can lead to significant real-world impacts.
△ Less
Submitted 2 June, 2023;
originally announced June 2023.
-
Deep Labeling of fMRI Brain Networks
Authors:
Ammar Ahmed Pallikonda Latheef,
Sejal Ghate,
Zhipeng Hui,
Alberto Santamaria-Pang,
Ivan Tarapov,
Haris I Sair,
Craig K Jones
Abstract:
Resting State Networks (RSNs) of the brain extracted from Resting State functional Magnetic Resonance Imaging (RS-fMRI) are used in the pre-surgical planning to guide the neurosurgeon. This is difficult, though, as expert knowledge is required to label each of the RSNs. There is a lack of efficient and standardized methods to be used in clinical workflows. Additionally, these methods need to be ge…
▽ More
Resting State Networks (RSNs) of the brain extracted from Resting State functional Magnetic Resonance Imaging (RS-fMRI) are used in the pre-surgical planning to guide the neurosurgeon. This is difficult, though, as expert knowledge is required to label each of the RSNs. There is a lack of efficient and standardized methods to be used in clinical workflows. Additionally, these methods need to be generalizable since the method needs to work well regardless of the acquisition technique. We propose an accurate, fast, and lightweight deep learning approach to label RSNs. Group Independent Component Analysis (ICA) was used to extract large scale functional connectivity patterns in the cohort and dual regression was used to back project them on individual subject RSNs. We compare a Multi-Layer Perceptron (MLP) based method with 2D and 3D Convolutional Neural Networks (CNNs) and find that the MLP is faster and more accurate. The MLP method performs as good or better than other works despite its compact size. We prove the generalizability of our method by showing that the MLP performs at 100% accuracy in the holdout dataset and 98.3% accuracy in three other sites' fMRI acquisitions.
△ Less
Submitted 5 May, 2023;
originally announced May 2023.
-
Primal-Dual Contextual Bayesian Optimization for Control System Online Optimization with Time-Average Constraints
Authors:
Wenjie Xu,
Yuning Jiang,
Bratislav Svetozarevic,
Colin N. Jones
Abstract:
This paper studies the problem of online performance optimization of constrained closed-loop control systems, where both the objective and the constraints are unknown black-box functions affected by exogenous time-varying contextual disturbances. A primal-dual contextual Bayesian optimization algorithm is proposed that achieves sublinear cumulative regret with respect to the dynamic optimal soluti…
▽ More
This paper studies the problem of online performance optimization of constrained closed-loop control systems, where both the objective and the constraints are unknown black-box functions affected by exogenous time-varying contextual disturbances. A primal-dual contextual Bayesian optimization algorithm is proposed that achieves sublinear cumulative regret with respect to the dynamic optimal solution under certain regularity conditions. Furthermore, the algorithm achieves zero time-average constraint violation, ensuring that the average value of the constraint function satisfies the desired constraint. The method is applied to both sampled instances from Gaussian processes and a continuous stirred tank reactor parameter tuning problem; simulation results show that the method simultaneously provides close-to-optimal performance and maintains constraint feasibility on average. This contrasts current state-of-the-art methods, which either suffer from large cumulative regret or severe constraint violations for the case studies presented.
△ Less
Submitted 20 September, 2023; v1 submitted 12 April, 2023;
originally announced April 2023.
-
Sum-of-Squares Lower Bounds for Densest $k$-Subgraph
Authors:
Chris Jones,
Aaron Potechin,
Goutham Rajendran,
Jeff Xu
Abstract:
Given a graph and an integer $k$, Densest $k$-Subgraph is the algorithmic task of finding the subgraph on $k$ vertices with the maximum number of edges. This is a fundamental problem that has been subject to intense study for decades, with applications spanning a wide variety of fields. The state-of-the-art algorithm is an $O(n^{1/4 + ε})$-factor approximation (for any $ε> 0$) due to Bhaskara et a…
▽ More
Given a graph and an integer $k$, Densest $k$-Subgraph is the algorithmic task of finding the subgraph on $k$ vertices with the maximum number of edges. This is a fundamental problem that has been subject to intense study for decades, with applications spanning a wide variety of fields. The state-of-the-art algorithm is an $O(n^{1/4 + ε})$-factor approximation (for any $ε> 0$) due to Bhaskara et al. [STOC '10]. Moreover, the so-called log-density framework predicts that this is optimal, i.e. it is impossible for an efficient algorithm to achieve an $O(n^{1/4 - ε})$-factor approximation. In the average case, Densest $k$-Subgraph is a prototypical noisy inference task which is conjectured to exhibit a statistical-computational gap.
In this work, we provide the strongest evidence yet of hardness for Densest $k$-Subgraph by showing matching lower bounds against the powerful Sum-of-Squares (SoS) algorithm, a meta-algorithm based on convex programming that achieves state-of-art algorithmic guarantees for many optimization and inference problems. For $k \leq n^{\frac{1}{2}}$, we obtain a degree $n^δ$ SoS lower bound for the hard regime as predicted by the log-density framework.
To show this, we utilize the modern framework for proving SoS lower bounds on average-case problems pioneered by Barak et al. [FOCS '16]. A key issue is that small denser-than-average subgraphs in the input will greatly affect the value of the candidate pseudoexpectation operator around the subgraph. To handle this challenge, we devise a novel matrix factorization scheme based on the positive minimum vertex separator. We then prove an intersection tradeoff lemma to show that the error terms when using this separator are indeed small.
△ Less
Submitted 30 March, 2023;
originally announced March 2023.
-
Active Learning in Brain Tumor Segmentation with Uncertainty Sampling, Annotation Redundancy Restriction, and Data Initialization
Authors:
Daniel D Kim,
Rajat S Chandra,
Jian Peng,
Jing Wu,
Xue Feng,
Michael Atalay,
Chetan Bettegowda,
Craig Jones,
Haris Sair,
Wei-hua Liao,
Chengzhang Zhu,
Beiji Zou,
Li Yang,
Anahita Fathi Kazerooni,
Ali Nabavizadeh,
Harrison X Bai,
Zhicheng Jiao
Abstract:
Deep learning models have demonstrated great potential in medical 3D imaging, but their development is limited by the expensive, large volume of annotated data required. Active learning (AL) addresses this by training a model on a subset of the most informative data samples without compromising performance. We compared different AL strategies and propose a framework that minimizes the amount of da…
▽ More
Deep learning models have demonstrated great potential in medical 3D imaging, but their development is limited by the expensive, large volume of annotated data required. Active learning (AL) addresses this by training a model on a subset of the most informative data samples without compromising performance. We compared different AL strategies and propose a framework that minimizes the amount of data needed for state-of-the-art performance. 638 multi-institutional brain tumor MRI images were used to train a 3D U-net model and compare AL strategies. We investigated uncertainty sampling, annotation redundancy restriction, and initial dataset selection techniques. Uncertainty estimation techniques including Bayesian estimation with dropout, bootstrapping, and margins sampling were compared to random query. Strategies to avoid annotation redundancy by removing similar images within the to-be-annotated subset were considered as well. We determined the minimum amount of data necessary to achieve similar performance to the model trained on the full dataset (α = 0.1). A variance-based selection strategy using radiomics to identify the initial training dataset is also proposed. Bayesian approximation with dropout at training and testing showed similar results to that of the full data model with less than 20% of the training data (p=0.293) compared to random query achieving similar performance at 56.5% of the training data (p=0.814). Annotation redundancy restriction techniques achieved state-of-the-art performance at approximately 40%-50% of the training data. Radiomics dataset initialization had higher Dice with initial dataset sizes of 20 and 80 images, but improvements were not significant. In conclusion, we investigated various AL strategies with dropout uncertainty estimation achieving state-of-the-art performance with the least annotated data.
△ Less
Submitted 4 February, 2023;
originally announced February 2023.
-
Violation-Aware Contextual Bayesian Optimization for Controller Performance Optimization with Unmodeled Constraints
Authors:
Wenjie Xu,
Colin N Jones,
Bratislav Svetozarevic,
Christopher R. Laughman,
Ankush Chakrabarty
Abstract:
We study the problem of performance optimization of closed-loop control systems with unmodeled dynamics. Bayesian optimization (BO) has been demonstrated to be effective for improving closed-loop performance by automatically tuning controller gains or reference setpoints in a model-free manner. However, BO methods have rarely been tested on dynamical systems with unmodeled constraints and time-var…
▽ More
We study the problem of performance optimization of closed-loop control systems with unmodeled dynamics. Bayesian optimization (BO) has been demonstrated to be effective for improving closed-loop performance by automatically tuning controller gains or reference setpoints in a model-free manner. However, BO methods have rarely been tested on dynamical systems with unmodeled constraints and time-varying ambient conditions. In this paper, we propose a violation-aware contextual BO algorithm (VACBO) that optimizes closed-loop performance while simultaneously learning constraint-feasible solutions under time-varying ambient conditions. Unlike classical constrained BO methods which allow unlimited constraint violations, or 'safe' BO algorithms that are conservative and try to operate with near-zero violations, we allow budgeted constraint violations to improve constraint learning and accelerate optimization. We demonstrate the effectiveness of our proposed VACBO method for energy minimization of industrial vapor compression systems under time-varying ambient temperature and humidity.
△ Less
Submitted 28 January, 2023;
originally announced January 2023.
-
PatentsView-Evaluation: Evaluation Datasets and Tools to Advance Research on Inventor Name Disambiguation
Authors:
Olivier Binette,
Sarvo Madhavan,
Jack Butler,
Beth Anne Card,
Emily Melluso,
Christina Jones
Abstract:
We present PatentsView-Evaluation, a Python package that enables researchers to evaluate the performance of inventor name disambiguation systems such as PatentsView.org. The package includes benchmark datasets and evaluation tools, and aims to advance research on inventor name disambiguation by providing access to high-quality evaluation data and improving evaluation standards.
We present PatentsView-Evaluation, a Python package that enables researchers to evaluate the performance of inventor name disambiguation systems such as PatentsView.org. The package includes benchmark datasets and evaluation tools, and aims to advance research on inventor name disambiguation by providing access to high-quality evaluation data and improving evaluation standards.
△ Less
Submitted 9 January, 2023;
originally announced January 2023.
-
Towards Scalable Physically Consistent Neural Networks: an Application to Data-driven Multi-zone Thermal Building Models
Authors:
Loris Di Natale,
Bratislav Svetozarevic,
Philipp Heer,
Colin Neil Jones
Abstract:
With more and more data being collected, data-driven modeling methods have been gaining in popularity in recent years. While physically sound, classical gray-box models are often cumbersome to identify and scale, and their accuracy might be hindered by their limited expressiveness. On the other hand, classical black-box methods, typically relying on Neural Networks (NNs) nowadays, often achieve im…
▽ More
With more and more data being collected, data-driven modeling methods have been gaining in popularity in recent years. While physically sound, classical gray-box models are often cumbersome to identify and scale, and their accuracy might be hindered by their limited expressiveness. On the other hand, classical black-box methods, typically relying on Neural Networks (NNs) nowadays, often achieve impressive performance, even at scale, by deriving statistical patterns from data. However, they remain completely oblivious to the underlying physical laws, which may lead to potentially catastrophic failures if decisions for real-world physical systems are based on them. Physically Consistent Neural Networks (PCNNs) were recently developed to address these aforementioned issues, ensuring physical consistency while still leveraging NNs to attain state-of-the-art accuracy.
In this work, we scale PCNNs to model building temperature dynamics and propose a thorough comparison with classical gray-box and black-box methods. More precisely, we design three distinct PCNN extensions, thereby exemplifying the modularity and flexibility of the architecture, and formally prove their physical consistency. In the presented case study, PCNNs are shown to achieve state-of-the-art accuracy, even outperforming classical NN-based models despite their constrained structure. Our investigations furthermore provide a clear illustration of NNs achieving seemingly good performance while remaining completely physics-agnostic, which can be misleading in practice. While this performance comes at the cost of computational complexity, PCNNs on the other hand show accuracy improvements of 17-35% compared to all other physically consistent methods, paving the way for scalable physically consistent models with state-of-the-art performance.
△ Less
Submitted 4 April, 2023; v1 submitted 23 December, 2022;
originally announced December 2022.
-
Benchmarking Offline Reinforcement Learning Algorithms for E-Commerce Order Fraud Evaluation
Authors:
Soysal Degirmenci,
Chris Jones
Abstract:
Amazon and other e-commerce sites must employ mechanisms to protect their millions of customers from fraud, such as unauthorized use of credit cards. One such mechanism is order fraud evaluation, where systems evaluate orders for fraud risk, and either "pass" the order, or take an action to mitigate high risk. Order fraud evaluation systems typically use binary classification models that distingui…
▽ More
Amazon and other e-commerce sites must employ mechanisms to protect their millions of customers from fraud, such as unauthorized use of credit cards. One such mechanism is order fraud evaluation, where systems evaluate orders for fraud risk, and either "pass" the order, or take an action to mitigate high risk. Order fraud evaluation systems typically use binary classification models that distinguish fraudulent and legitimate orders, to assess risk and take action. We seek to devise a system that considers both financial losses of fraud and long-term customer satisfaction, which may be impaired when incorrect actions are applied to legitimate customers. We propose that taking actions to optimize long-term impact can be formulated as a Reinforcement Learning (RL) problem. Standard RL methods require online interaction with an environment to learn, but this is not desirable in high-stakes applications like order fraud evaluation. Offline RL algorithms learn from logged data collected from the environment, without the need for online interaction, making them suitable for our use case. We show that offline RL methods outperform traditional binary classification solutions in SimStore, a simplified e-commerce simulation that incorporates order fraud risk. We also propose a novel approach to training offline RL policies that adds a new loss term during training, to better align policy exploration with taking correct actions.
△ Less
Submitted 5 December, 2022;
originally announced December 2022.
-
Computationally Efficient Reinforcement Learning: Targeted Exploration leveraging Simple Rules
Authors:
Loris Di Natale,
Bratislav Svetozarevic,
Philipp Heer,
Colin N. Jones
Abstract:
Model-free Reinforcement Learning (RL) generally suffers from poor sample complexity, mostly due to the need to exhaustively explore the state-action space to find well-performing policies. On the other hand, we postulate that expert knowledge of the system often allows us to design simple rules we expect good policies to follow at all times. In this work, we hence propose a simple yet effective m…
▽ More
Model-free Reinforcement Learning (RL) generally suffers from poor sample complexity, mostly due to the need to exhaustively explore the state-action space to find well-performing policies. On the other hand, we postulate that expert knowledge of the system often allows us to design simple rules we expect good policies to follow at all times. In this work, we hence propose a simple yet effective modification of continuous actor-critic frameworks to incorporate such rules and avoid regions of the state-action space that are known to be suboptimal, thereby significantly accelerating the convergence of RL agents. Concretely, we saturate the actions chosen by the agent if they do not comply with our intuition and, critically, modify the gradient update step of the policy to ensure the learning process is not affected by the saturation step. On a room temperature control case study, it allows agents to converge to well-performing policies up to 6-7x faster than classical agents without computational overhead and while retaining good final performance.
△ Less
Submitted 12 September, 2023; v1 submitted 29 November, 2022;
originally announced November 2022.
-
CONFIG: Constrained Efficient Global Optimization for Closed-Loop Control System Optimization with Unmodeled Constraints
Authors:
Wenjie Xu,
Yuning Jiang,
Bratislav Svetozarevic,
Colin N. Jones
Abstract:
In this paper, the CONFIG algorithm, a simple and provably efficient constrained global optimization algorithm, is applied to optimize the closed-loop control performance of an unknown system with unmodeled constraints. Existing Gaussian process based closed-loop optimization methods, either can only guarantee local convergence (e.g., SafeOPT), or have no known optimality guarantee (e.g., constrai…
▽ More
In this paper, the CONFIG algorithm, a simple and provably efficient constrained global optimization algorithm, is applied to optimize the closed-loop control performance of an unknown system with unmodeled constraints. Existing Gaussian process based closed-loop optimization methods, either can only guarantee local convergence (e.g., SafeOPT), or have no known optimality guarantee (e.g., constrained expected improvement) at all, whereas the recently introduced CONFIG algorithm has been proven to enjoy a theoretical global optimality guarantee. In this study, we demonstrate the effectiveness of CONFIG algorithm in the applications. The algorithm is first applied to an artificial numerical benchmark problem to corroborate its effectiveness. It is then applied to a classical constrained steady-state optimization problem of a continuous stirred-tank reactor. Simulation results show that our CONFIG algorithm can achieve performance competitive with the popular CEI (Constrained Expected Improvement) algorithm, which has no known optimality guarantee. As such, the CONFIG algorithm offers a new tool, with both a provable global optimality guarantee and competitive empirical performance, to optimize the closed-loop control performance for a system with soft unmodeled constraints. Last, but not least, the open-source code is available as a python package to facilitate future applications.
△ Less
Submitted 18 December, 2022; v1 submitted 21 November, 2022;
originally announced November 2022.
-
Physically Consistent Neural ODEs for Learning Multi-Physics Systems
Authors:
Muhammad Zakwan,
Loris Di Natale,
Bratislav Svetozarevic,
Philipp Heer,
Colin N. Jones,
Giancarlo Ferrari Trecate
Abstract:
Despite the immense success of neural networks in modeling system dynamics from data, they often remain physics-agnostic black boxes. In the particular case of physical systems, they might consequently make physically inconsistent predictions, which makes them unreliable in practice. In this paper, we leverage the framework of Irreversible port-Hamiltonian Systems (IPHS), which can describe most m…
▽ More
Despite the immense success of neural networks in modeling system dynamics from data, they often remain physics-agnostic black boxes. In the particular case of physical systems, they might consequently make physically inconsistent predictions, which makes them unreliable in practice. In this paper, we leverage the framework of Irreversible port-Hamiltonian Systems (IPHS), which can describe most multi-physics systems, and rely on Neural Ordinary Differential Equations (NODEs) to learn their parameters from data. Since IPHS models are consistent with the first and second principles of thermodynamics by design, so are the proposed Physically Consistent NODEs (PC-NODEs). Furthermore, the NODE training procedure allows us to seamlessly incorporate prior knowledge of the system properties in the learned dynamics. We demonstrate the effectiveness of the proposed method by learning the thermodynamics of a building from the real-world measurements and the dynamics of a simulated gas-piston system. Thanks to the modularity and flexibility of the IPHS framework, PC-NODEs can be extended to learn physically consistent models of multi-physics distributed systems.
△ Less
Submitted 11 November, 2022;
originally announced November 2022.
-
Improving mean-field network percolation models with neighbourhood information
Authors:
Chris Jones,
Karoline Wiesner
Abstract:
Mean field theory models of percolation on networks provide analytic estimates of network robustness under node or edge removal. We introduce a new mean field theory model based on generating functions that includes information about the tree-likeness of each node's local neighbourhood. We show that our new model outperforms all other generating function models in prediction accuracy when testing…
▽ More
Mean field theory models of percolation on networks provide analytic estimates of network robustness under node or edge removal. We introduce a new mean field theory model based on generating functions that includes information about the tree-likeness of each node's local neighbourhood. We show that our new model outperforms all other generating function models in prediction accuracy when testing their estimates on a wide range of real-world network data. We compare the new model's performance against the recently introduced message passing models and provide evidence that the standard version is also outperformed, while the `loopy' version is only outperformed on a targeted attack strategy. As we show, however, the computational complexity of our model implementation is much lower than that of message passing algorithms. We provide evidence that all discussed models are poor in predicting networks with highly modular structure with dispersed modules, which are also characterised by high mixing times, identifying this as a general limitation of percolation prediction models.
△ Less
Submitted 31 July, 2023; v1 submitted 4 November, 2022;
originally announced November 2022.
-
Exact Completeness of LP Hierarchies for Linear Codes
Authors:
Leonardo Nagami Coregliano,
Fernando Granha Jeronimo,
Chris Jones
Abstract:
Determining the maximum size $A_2(n,d)$ of a binary code of blocklength $n$ and distance $d$ remains an elusive open question even when restricted to the important class of linear codes. Recently, two linear programming hierarchies extending Delsarte's LP were independently proposed to upper bound $A_2^{\text{Lin}}(n,d)$ (the analogue of $A_2(n,d)$ for linear codes). One of these hierarchies, by t…
▽ More
Determining the maximum size $A_2(n,d)$ of a binary code of blocklength $n$ and distance $d$ remains an elusive open question even when restricted to the important class of linear codes. Recently, two linear programming hierarchies extending Delsarte's LP were independently proposed to upper bound $A_2^{\text{Lin}}(n,d)$ (the analogue of $A_2(n,d)$ for linear codes). One of these hierarchies, by the authors, was shown to be approximately complete in the sense that the hierarchy converges to $A_2^{\text{Lin}}(n,d)$ as the level grows beyond $n^2$. Despite some structural similarities, not even approximate completeness was known for the other hierarchy by Loyfer and Linial.
In this work, we prove that both hierarchies recover the exact value of $A_2^{\text{Lin}}(n,d)$ at level $n$. We also prove that at this level the polytope of Loyfer and Linial is integral.Even though these hierarchies seem less powerful than general hierarchies such as Sum-of-Squares, we show that they have enough structure to yield exact completeness via pseudoprobabilities.
△ Less
Submitted 2 November, 2022;
originally announced November 2022.
-
Random Max-CSPs Inherit Algorithmic Hardness from Spin Glasses
Authors:
Chris Jones,
Kunal Marwaha,
Juspreet Singh Sandhu,
Jonathan Shi
Abstract:
We study random constraint satisfaction problems (CSPs) in the unsatisfiable regime. We relate the structure of near-optimal solutions for any Max-CSP to that for an associated spin glass on the hypercube, using the Guerra-Toninelli interpolation from statistical physics. The noise stability polynomial of the CSP's predicate is, up to a constant, the mixture polynomial of the associated spin glass…
▽ More
We study random constraint satisfaction problems (CSPs) in the unsatisfiable regime. We relate the structure of near-optimal solutions for any Max-CSP to that for an associated spin glass on the hypercube, using the Guerra-Toninelli interpolation from statistical physics. The noise stability polynomial of the CSP's predicate is, up to a constant, the mixture polynomial of the associated spin glass. We prove two main consequences:
1) We relate the maximum fraction of constraints that can be satisfied in a random Max-CSP to the ground state energy density of the corresponding spin glass. Since the latter value can be computed with the Parisi formula, we provide numerical values for some popular CSPs.
2) We prove that a Max-CSP possesses generalized versions of the overlap gap property if and only if the same holds for the corresponding spin glass. We transfer results from Huang et al. [arXiv:2110.07847, 2021] to obstruct algorithms with overlap concentration on a large class of Max-CSPs. This immediately includes local classical and local quantum algorithms.
△ Less
Submitted 10 January, 2023; v1 submitted 6 October, 2022;
originally announced October 2022.
-
Estimating the Performance of Entity Resolution Algorithms: Lessons Learned Through PatentsView.org
Authors:
Olivier Binette,
Sokhna A York,
Emma Hickerson,
Youngsoo Baek,
Sarvo Madhavan,
Christina Jones
Abstract:
This paper introduces a novel evaluation methodology for entity resolution algorithms. It is motivated by PatentsView.org, a U.S. Patents and Trademarks Office patent data exploration tool that disambiguates patent inventors using an entity resolution algorithm. We provide a data collection methodology and tailored performance estimators that account for sampling biases. Our approach is simple, pr…
▽ More
This paper introduces a novel evaluation methodology for entity resolution algorithms. It is motivated by PatentsView.org, a U.S. Patents and Trademarks Office patent data exploration tool that disambiguates patent inventors using an entity resolution algorithm. We provide a data collection methodology and tailored performance estimators that account for sampling biases. Our approach is simple, practical and principled -- key characteristics that allow us to paint the first representative picture of PatentsView's disambiguation performance. This approach is used to inform PatentsView's users of the reliability of the data and to allow the comparison of competing disambiguation algorithms.
△ Less
Submitted 17 April, 2023; v1 submitted 3 October, 2022;
originally announced October 2022.
-
Lower Bounds on the Worst-Case Complexity of Efficient Global Optimization
Authors:
Wenjie Xu,
Yuning Jiang,
Emilio T. Maddalena,
Colin N. Jones
Abstract:
Efficient global optimization is a widely used method for optimizing expensive black-box functions such as tuning hyperparameter, and designing new material, etc. Despite its popularity, less attention has been paid to analyzing the inherent hardness of the problem although, given its extensive use, it is important to understand the fundamental limits of efficient global optimization algorithms. I…
▽ More
Efficient global optimization is a widely used method for optimizing expensive black-box functions such as tuning hyperparameter, and designing new material, etc. Despite its popularity, less attention has been paid to analyzing the inherent hardness of the problem although, given its extensive use, it is important to understand the fundamental limits of efficient global optimization algorithms. In this paper, we study the worst-case complexity of the efficient global optimization problem and, in contrast to existing kernel-specific results, we derive a unified lower bound for the complexity of efficient global optimization in terms of the metric entropy of a ball in its corresponding reproducing kernel Hilbert space~(RKHS). Specifically, we show that if there exists a deterministic algorithm that achieves suboptimality gap smaller than $ε$ for any function $f\in S$ in $T$ function evaluations, it is necessary that $T$ is at least $Ω\left(\frac{\log\mathcal{N}(S(\mathcal{X}), 4ε,\|\cdot\|_\infty)}{\log(\frac{R}ε)}\right)$, where $\mathcal{N}(\cdot,\cdot,\cdot)$ is the covering number, $S$ is the ball centered at $0$ with radius $R$ in the RKHS and $S(\mathcal{X})$ is the restriction of $S$ over the feasible set $\mathcal{X}$. Moreover, we show that this lower bound nearly matches the upper bound attained by non-adaptive search algorithms for the commonly used squared exponential kernel and the Matérn kernel with a large smoothness parameter $ν$, up to a replacement of $d/2$ by $d$ and a logarithmic term $\log\frac{R}ε$. That is to say, our lower bound is nearly optimal for these kernels.
△ Less
Submitted 20 September, 2022;
originally announced September 2022.