Skip to main content

Showing 1–30 of 30 results for author: Rudner, T G J

  1. arXiv:2407.11942  [pdf, other

    q-bio.BM cs.LG stat.ML

    Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design

    Authors: Leo Klarner, Tim G. J. Rudner, Garrett M. Morris, Charlotte M. Deane, Yee Whye Teh

    Abstract: Generative models have the potential to accelerate key steps in the discovery of novel molecular therapeutics and materials. Diffusion models have recently emerged as a powerful approach, excelling at unconditional sample generation and, with data-driven guidance, conditional generation within their training domain. Reliably sampling from high-value regions beyond the training data, however, remai… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Published in the Proceedings of the 41st International Conference on Machine Learning (ICML 2024)

  2. arXiv:2405.05852  [pdf, other

    cs.CV cs.AI cs.CL cs.LG cs.RO stat.ML

    Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control

    Authors: Gunshi Gupta, Karmesh Yadav, Yarin Gal, Dhruv Batra, Zsolt Kira, Cong Lu, Tim G. J. Rudner

    Abstract: Embodied AI agents require a fine-grained understanding of the physical world mediated through visual and language inputs. Such capabilities are difficult to learn solely from task-specific data. This has led to the emergence of pre-trained vision-language models as a tool for transferring representations learned from internet-scale data to downstream tasks and new domains. However, commonly used… ▽ More

    Submitted 9 May, 2024; originally announced May 2024.

  3. arXiv:2404.19640  [pdf, other

    cs.LG cs.AI cs.CV stat.ME stat.ML

    Attacking Bayes: On the Adversarial Robustness of Bayesian Neural Networks

    Authors: Yunzhen Feng, Tim G. J. Rudner, Nikolaos Tsilivis, Julia Kempe

    Abstract: Adversarial examples have been shown to cause neural networks to fail on a wide range of vision and language tasks, but recent work has claimed that Bayesian neural networks (BNNs) are inherently robust to adversarial perturbations. In this work, we examine this claim. To study the adversarial robustness of BNNs, we investigate whether it is possible to successfully break state-of-the-art BNN infe… ▽ More

    Submitted 26 April, 2024; originally announced April 2024.

  4. arXiv:2403.09869  [pdf, other

    stat.ML cs.AI cs.LG stat.ME

    Mind the GAP: Improving Robustness to Subpopulation Shifts with Group-Aware Priors

    Authors: Tim G. J. Rudner, Ya Shi Zhang, Andrew Gordon Wilson, Julia Kempe

    Abstract: Machine learning models often perform poorly under subpopulation shifts in the data distribution. Developing methods that allow machine learning models to better generalize to such shifts is crucial for safe deployment in real-world settings. In this paper, we develop a family of group-aware prior (GAP) distributions over neural network parameters that explicitly favor models that generalize well… ▽ More

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Published in Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024)

  5. arXiv:2402.00809  [pdf, other

    cs.LG stat.ML

    Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI

    Authors: Theodore Papamarkou, Maria Skoularidou, Konstantina Palla, Laurence Aitchison, Julyan Arbel, David Dunson, Maurizio Filippone, Vincent Fortuin, Philipp Hennig, José Miguel Hernández-Lobato, Aliaksandr Hubin, Alexander Immer, Theofanis Karaletsos, Mohammad Emtiyaz Khan, Agustinus Kristiadi, Yingzhen Li, Stephan Mandt, Christopher Nemeth, Michael A. Osborne, Tim G. J. Rudner, David Rügamer, Yee Whye Teh, Max Welling, Andrew Gordon Wilson, Ruqi Zhang

    Abstract: In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learni… ▽ More

    Submitted 2 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  6. arXiv:2312.17210  [pdf, other

    stat.ML cs.AI cs.LG

    Continual Learning via Sequential Function-Space Variational Inference

    Authors: Tim G. J. Rudner, Freddie Bickford Smith, Qixuan Feng, Yee Whye Teh, Yarin Gal

    Abstract: Sequential Bayesian inference over predictive functions is a natural framework for continual learning from streams of data. However, applying it to neural networks has proved challenging in practice. Addressing the drawbacks of existing techniques, we propose an optimization objective derived by formulating continual learning as sequential function-space variational inference. In contrast to exist… ▽ More

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in Proceedings of the 39th International Conference on Machine Learning (ICML 2022)

  7. arXiv:2312.17199  [pdf, other

    stat.ML cs.AI cs.LG

    Tractable Function-Space Variational Inference in Bayesian Neural Networks

    Authors: Tim G. J. Rudner, Zonghao Chen, Yee Whye Teh, Yarin Gal

    Abstract: Reliable predictive uncertainty estimation plays an important role in enabling the deployment of neural networks to safety-critical settings. A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters, infer an approximate posterior distribution, and use it to make stochastic predictions. However, explicit inference… ▽ More

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in Advances in Neural Information Processing Systems 35 (NeurIPS 2022)

  8. arXiv:2312.17174  [pdf, other

    cs.CV cs.AI cs.LG

    Visual Explanations of Image-Text Representations via Multi-Modal Information Bottleneck Attribution

    Authors: Ying Wang, Tim G. J. Rudner, Andrew Gordon Wilson

    Abstract: Vision-language pretrained models have seen remarkable success, but their application to safety-critical settings is limited by their lack of interpretability. To improve the interpretability of vision-language models such as CLIP, we propose a multi-modal information bottleneck (M2IB) approach that learns latent representations that compress irrelevant information while preserving relevant visual… ▽ More

    Submitted 22 June, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in Advances in Neural Information Processing Systems 36 (NeurIPS 2023)

  9. arXiv:2312.17173  [pdf, other

    stat.ML cs.LG

    Non-Vacuous Generalization Bounds for Large Language Models

    Authors: Sanae Lotfi, Marc Finzi, Yilun Kuang, Tim G. J. Rudner, Micah Goldblum, Andrew Gordon Wilson

    Abstract: Modern language models can contain billions of parameters, raising the question of whether they can generalize beyond the training data or simply regurgitate their training corpora. We provide the first non-vacuous generalization bounds for pretrained large language models (LLMs), indicating that language models are capable of discovering regularities that generalize to unseen data. In particular,… ▽ More

    Submitted 12 February, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

  10. arXiv:2312.17168  [pdf, other

    cs.LG cs.AI

    Can Active Sampling Reduce Causal Confusion in Offline Reinforcement Learning?

    Authors: Gunshi Gupta, Tim G. J. Rudner, Rowan Thomas McAllister, Adrien Gaidon, Yarin Gal

    Abstract: Causal confusion is a phenomenon where an agent learns a policy that reflects imperfect spurious correlations in the data. Such a policy may falsely appear to be optimal during training if most of the training data contain such spurious correlations. This phenomenon is particularly pronounced in domains such as robotics, with potentially large gaps between the open- and closed-loop performance of… ▽ More

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in Proceedings of the 2nd Conference on Causal Learning and Reasoning (CLeaR 2021)

  11. arXiv:2312.17162  [pdf, other

    stat.ML cs.AI cs.LG

    Function-Space Regularization in Neural Networks: A Probabilistic Perspective

    Authors: Tim G. J. Rudner, Sanyam Kapoor, Shikai Qiu, Andrew Gordon Wilson

    Abstract: Parameter-space regularization in neural network optimization is a fundamental tool for improving generalization. However, standard parameter-space regularization methods make it challenging to encode explicit preferences about desired predictive functions into neural network training. In this work, we approach regularization in neural networks from a probabilistic perspective and show that by vie… ▽ More

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in Proceedings of the 40th International Conference on Machine Learning (ICML 2023)

  12. arXiv:2312.00794  [pdf, other

    cs.CV cs.AI cs.CY cs.LG stat.AP

    Informative Priors Improve the Reliability of Multimodal Clinical Data Classification

    Authors: L. Julian Lechuga Lopez, Tim G. J. Rudner, Farah E. Shamout

    Abstract: Machine learning-aided clinical decision support has the potential to significantly improve patient care. However, existing efforts in this domain for principled quantification of uncertainty have largely been limited to applications of ad-hoc solutions that do not consistently improve reliability. In this work, we consider stochastic neural networks and design a tailor-made multimodal data-driven… ▽ More

    Submitted 16 November, 2023; originally announced December 2023.

    Comments: Published in ML4H 2023 Findings Track Collection

  13. arXiv:2311.15990  [pdf, other

    cs.LG stat.ML

    Should We Learn Most Likely Functions or Parameters?

    Authors: Shikai Qiu, Tim G. J. Rudner, Sanyam Kapoor, Andrew Gordon Wilson

    Abstract: Standard regularized training procedures correspond to maximizing a posterior distribution over parameters, known as maximum a posteriori (MAP) estimation. However, model parameters are of interest only insomuch as they combine with the functional form of a model to provide a function that can make good predictions. Moreover, the most likely parameters under the parameter posterior do not generall… ▽ More

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: NeurIPS 2023. Code available at https://github.com/activatedgeek/function-space-map

  14. arXiv:2307.15073  [pdf, other

    q-bio.BM cs.LG stat.ML

    Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions

    Authors: Leo Klarner, Tim G. J. Rudner, Michael Reutlinger, Torsten Schindler, Garrett M. Morris, Charlotte Deane, Yee Whye Teh

    Abstract: Accelerating the discovery of novel and more effective therapeutics is an important pharmaceutical problem in which deep learning is playing an increasingly significant role. However, real-world drug discovery tasks are often characterized by a scarcity of labeled data and significant covariate shift$\unicode{x2013}\unicode{x2013}$a setting that poses a challenge to standard deep learning methods.… ▽ More

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: Published in the Proceedings of the 40th International Conference on Machine Learning (ICML 2023)

  15. arXiv:2305.20028  [pdf, other

    cs.LG stat.ML

    A Study of Bayesian Neural Network Surrogates for Bayesian Optimization

    Authors: Yucen Lily Li, Tim G. J. Rudner, Andrew Gordon Wilson

    Abstract: Bayesian optimization is a highly efficient approach to optimizing objective functions which are expensive to query. These objectives are typically represented by Gaussian process (GP) surrogate models which are easy to optimize and support exact inference. While standard GP surrogates have been well-established in Bayesian optimization, Bayesian neural networks (BNNs) have recently become practic… ▽ More

    Submitted 8 May, 2024; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: ICLR 2024. Code available at https://github.com/yucenli/bnn-bo

  16. arXiv:2305.20009  [pdf, other

    cs.LG q-bio.BM

    Protein Design with Guided Discrete Diffusion

    Authors: Nate Gruver, Samuel Stanton, Nathan C. Frey, Tim G. J. Rudner, Isidro Hotzel, Julien Lafrance-Vanasse, Arvind Rajpal, Kyunghyun Cho, Andrew Gordon Wilson

    Abstract: A popular approach to protein design is to combine a generative model with a discriminative model for conditional sampling. The generative model samples plausible sequences while the discriminative model guides a search for sequences with high fitness. Given its broad success in conditional sampling, classifier-guided diffusion modeling is a promising foundation for protein design, leading many to… ▽ More

    Submitted 12 December, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

    Journal ref: Advances in Neural Information Processing Systems 36, December 10-16, 2023

  17. arXiv:2303.00633  [pdf, other

    cs.IT cs.AI

    An Information-Theoretic Perspective on Variance-Invariance-Covariance Regularization

    Authors: Ravid Shwartz-Ziv, Randall Balestriero, Kenji Kawaguchi, Tim G. J. Rudner, Yann LeCun

    Abstract: Variance-Invariance-Covariance Regularization (VICReg) is a self-supervised learning (SSL) method that has shown promising results on a variety of tasks. However, the fundamental mechanisms underlying VICReg remain unexplored. In this paper, we present an information-theoretic perspective on the VICReg objective. We begin by deriving information-theoretic quantities for deterministic networks as a… ▽ More

    Submitted 1 May, 2024; v1 submitted 1 March, 2023; originally announced March 2023.

  18. On Sequential Bayesian Inference for Continual Learning

    Authors: Samuel Kessler, Adam Cobb, Tim G. J. Rudner, Stefan Zohren, Stephen J. Roberts

    Abstract: Sequential Bayesian inference can be used for continual learning to prevent catastrophic forgetting of past tasks and provide an informative prior when learning new tasks. We revisit sequential Bayesian inference and test whether having access to the true posterior is guaranteed to prevent catastrophic forgetting in Bayesian neural networks. To do this we perform sequential Bayesian inference usin… ▽ More

    Submitted 9 July, 2023; v1 submitted 4 January, 2023; originally announced January 2023.

    Comments: Published in Entropy, 24 pages, 14 figures

  19. arXiv:2212.13936  [pdf, other

    cs.LG cs.AI stat.ME stat.ML

    On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations

    Authors: Tim G. J. Rudner, Cong Lu, Michael A. Osborne, Yarin Gal, Yee Whye Teh

    Abstract: KL-regularized reinforcement learning from expert demonstrations has proved successful in improving the sample efficiency of deep reinforcement learning algorithms, allowing them to be applied to challenging physical real-world tasks. However, we show that KL-regularized reinforcement learning with behavioral reference policies derived from expert demonstrations can suffer from pathological traini… ▽ More

    Submitted 28 December, 2022; originally announced December 2022.

    Comments: Published in Advances in Neural Information Processing Systems 34 (NeurIPS 2021)

  20. arXiv:2211.12717  [pdf, other

    stat.ML cs.AI cs.CV cs.LG

    Benchmarking Bayesian Deep Learning on Diabetic Retinopathy Detection Tasks

    Authors: Neil Band, Tim G. J. Rudner, Qixuan Feng, Angelos Filos, Zachary Nado, Michael W. Dusenberry, Ghassen Jerfel, Dustin Tran, Yarin Gal

    Abstract: Bayesian deep learning seeks to equip deep neural networks with the ability to precisely quantify their predictive uncertainty, and has promised to make deep learning more reliable for safety-critical real-world applications. Yet, existing Bayesian deep learning methods fall short of this promise; new methods continue to be evaluated on unrealistic test beds that do not reflect the complexities of… ▽ More

    Submitted 23 November, 2022; originally announced November 2022.

    Comments: Published in Neural Information Processing Systems (NeurIPS) 2021 Datasets and Benchmarks Track Proceedings. First two authors contributed equally. Code available at https://rebrand.ly/retina-benchmark

  21. arXiv:2207.07411  [pdf, other

    cs.LG stat.ML

    Plex: Towards Reliability using Pretrained Large Model Extensions

    Authors: Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek , et al. (1 additional authors not shown)

    Abstract: A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive per… ▽ More

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: Code available at https://goo.gle/plex-code

  22. arXiv:2206.04779  [pdf, other

    cs.LG cs.AI cs.CV stat.ML

    Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations

    Authors: Cong Lu, Philip J. Ball, Tim G. J. Rudner, Jack Parker-Holder, Michael A. Osborne, Yee Whye Teh

    Abstract: Offline reinforcement learning has shown great promise in leveraging large pre-collected datasets for policy learning, allowing agents to forgo often-expensive online data collection. However, offline reinforcement learning from visual observations with continuous action spaces remains under-explored, with a limited understanding of the key challenges in this complex domain. In this paper, we esta… ▽ More

    Submitted 6 July, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: Published at TMLR, 2023

  23. arXiv:2106.04015  [pdf, other

    cs.LG

    Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning

    Authors: Zachary Nado, Neil Band, Mark Collier, Josip Djolonga, Michael W. Dusenberry, Sebastian Farquhar, Qixuan Feng, Angelos Filos, Marton Havasi, Rodolphe Jenatton, Ghassen Jerfel, Jeremiah Liu, Zelda Mariet, Jeremy Nixon, Shreyas Padhy, Jie Ren, Tim G. J. Rudner, Faris Sbahi, Yeming Wen, Florian Wenzel, Kevin Murphy, D. Sculley, Balaji Lakshminarayanan, Jasper Snoek, Yarin Gal , et al. (1 additional authors not shown)

    Abstract: High-quality estimates of uncertainty and robustness are crucial for numerous real-world applications, especially for deep learning which underlies many deployed ML systems. The ability to compare techniques for improving these estimates is therefore very important for research and practice alike. Yet, competitive comparisons of methods are often lacking due to a range of reasons, including: compu… ▽ More

    Submitted 5 January, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

  24. arXiv:2104.10190  [pdf, other

    cs.LG cs.AI stat.ME stat.ML

    Outcome-Driven Reinforcement Learning via Variational Inference

    Authors: Tim G. J. Rudner, Vitchyr H. Pong, Rowan McAllister, Yarin Gal, Sergey Levine

    Abstract: While reinforcement learning algorithms provide automated acquisition of optimal policies, practical application of such methods requires a number of design decisions, such as manually designing reward functions that not only define the task, but also provide sufficient shaping to accomplish it. In this paper, we view reinforcement learning as inferring policies that achieve desired outcomes, rath… ▽ More

    Submitted 28 December, 2022; v1 submitted 20 April, 2021; originally announced April 2021.

    Comments: Published in Advances in Neural Information Processing Systems 34 (NeurIPS 2021)

  25. arXiv:2011.00515  [pdf, other

    stat.ML cs.AI cs.LG stat.ME

    On Signal-to-Noise Ratio Issues in Variational Inference for Deep Gaussian Processes

    Authors: Tim G. J. Rudner, Oscar Key, Yarin Gal, Tom Rainforth

    Abstract: We show that the gradient estimates used in training Deep Gaussian Processes (DGPs) with importance-weighted variational inference are susceptible to signal-to-noise ratio (SNR) issues. Specifically, we show both theoretically and via an extensive empirical evaluation that the SNR of the gradient estimates for the latent variable's variational parameters decreases as the number of importance sampl… ▽ More

    Submitted 21 July, 2021; v1 submitted 1 November, 2020; originally announced November 2020.

    Comments: Published in Proceedings of the 38th International Conference on Machine Learning (ICML 2021)

  26. arXiv:2011.00415  [pdf, other

    stat.ML cs.AI cs.LG stat.ME

    Inter-domain Deep Gaussian Processes

    Authors: Tim G. J. Rudner, Dino Sejdinovic, Yarin Gal

    Abstract: Inter-domain Gaussian processes (GPs) allow for high flexibility and low computational cost when performing approximate inference in GP models. They are particularly suitable for modeling data exhibiting global structure but are limited to stationary covariance functions and thus fail to model non-stationary data effectively. We propose Inter-domain Deep Gaussian Processes, an extension of inter-d… ▽ More

    Submitted 1 November, 2020; originally announced November 2020.

    Comments: Published in Proceedings of the 37th International Conference on Machine Learning (ICML 2020)

  27. arXiv:1912.10481  [pdf, other

    stat.ML cs.LG eess.IV

    A Systematic Comparison of Bayesian Deep Learning Robustness in Diabetic Retinopathy Tasks

    Authors: Angelos Filos, Sebastian Farquhar, Aidan N. Gomez, Tim G. J. Rudner, Zachary Kenton, Lewis Smith, Milad Alizadeh, Arnoud de Kroon, Yarin Gal

    Abstract: Evaluation of Bayesian deep learning (BDL) methods is challenging. We often seek to evaluate the methods' robustness and scalability, assessing whether new tools give `better' uncertainty estimates than old ones. These evaluations are paramount for practitioners when choosing BDL tools on-top of which they build their applications. Current popular evaluations of BDL methods, such as the UCI experi… ▽ More

    Submitted 22 December, 2019; originally announced December 2019.

  28. arXiv:1902.04043  [pdf, other

    cs.LG cs.MA stat.ML

    The StarCraft Multi-Agent Challenge

    Authors: Mikayel Samvelyan, Tabish Rashid, Christian Schroeder de Witt, Gregory Farquhar, Nantas Nardelli, Tim G. J. Rudner, Chia-Man Hung, Philip H. S. Torr, Jakob Foerster, Shimon Whiteson

    Abstract: In the last few years, deep multi-agent reinforcement learning (RL) has become a highly active area of research. A particularly challenging class of problems in this area is partially observable, cooperative, multi-agent learning, in which teams of agents must learn to coordinate their behaviour while conditioning only on their private observations. This is an attractive research area since such p… ▽ More

    Submitted 9 December, 2019; v1 submitted 11 February, 2019; originally announced February 2019.

  29. arXiv:1812.01756  [pdf, other

    cs.CV cs.AI cs.LG

    Multi$^{\mathbf{3}}$Net: Segmenting Flooded Buildings via Fusion of Multiresolution, Multisensor, and Multitemporal Satellite Imagery

    Authors: Tim G. J. Rudner, Marc Rußwurm, Jakub Fil, Ramona Pelich, Benjamin Bischke, Veronika Kopackova, Piotr Bilinski

    Abstract: We propose a novel approach for rapid segmentation of flooded buildings by fusing multiresolution, multisensor, and multitemporal satellite imagery in a convolutional neural network. Our model significantly expedites the generation of satellite imagery-based flood maps, crucial for first responders and local authorities in the early stages of flood events. By incorporating multitemporal satellite… ▽ More

    Submitted 4 December, 2018; originally announced December 2018.

    Comments: To appear in Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)

    ACM Class: I.2.10; I.4.6

  30. arXiv:1811.01132  [pdf, other

    cs.LG stat.ML

    VIREL: A Variational Inference Framework for Reinforcement Learning

    Authors: Matthew Fellows, Anuj Mahajan, Tim G. J. Rudner, Shimon Whiteson

    Abstract: Applying probabilistic models to reinforcement learning (RL) enables the application of powerful optimisation tools such as variational inference to RL. However, existing inference frameworks and their algorithms pose significant challenges for learning optimal policies, e.g., the absence of mode capturing behaviour in pseudo-likelihood methods and difficulties learning deterministic policies in m… ▽ More

    Submitted 16 July, 2020; v1 submitted 2 November, 2018; originally announced November 2018.