Showing 1–8 of 8 results for author: van der Ouderaa, T F A

  1. arXiv:2312.17244

    cs.LG cs.CL

    The LLM Surgeon

    Authors: Tycho F. A. van der Ouderaa, Markus Nagel, Mart van Baalen, Yuki M. Asano, Tijmen Blankevoort

    Abstract: State-of-the-art language models are becoming increasingly large in an effort to achieve the highest performance on large corpora of available textual data. However, the sheer size of the Transformer architectures makes it difficult to deploy models within computational, environmental or device-specific constraints. We explore data-driven compression of existing pretrained models as an alternative…

    Submitted 20 March, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

  2. arXiv:2310.06131

    cs.LG cs.AI stat.ML

    Learning Layer-wise Equivariances Automatically using Gradients

    Authors: Tycho F. A. van der Ouderaa, Alexander Immer, Mark van der Wilk

    Abstract: Convolutions encode equivariance symmetries into neural networks, leading to better generalisation performance. However, symmetries provide fixed hard constraints on the functions a network can represent, need to be specified in advance, and cannot be adapted. Our goal is to allow flexible symmetry constraints that can automatically be learned from data using gradients. Learning symmetry and assoc…

    Submitted 9 October, 2023; originally announced October 2023.

  3. arXiv:2306.03968

    stat.ML cs.LG

    Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels

    Authors: Alexander Immer, Tycho F. A. van der Ouderaa, Mark van der Wilk, Gunnar Rätsch, Bernhard Schölkopf

    Abstract: Selecting hyperparameters in deep learning greatly impacts its effectiveness but requires manual effort and expertise. Recent works show that Bayesian model selection with Laplace approximations makes it possible to optimize such hyperparameters just like standard neural network parameters, using gradients and the training data. However, estimating a single hyperparameter gradient requires a pass throug…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: ICML 2023

  4. arXiv:2204.07178

    cs.LG

    Relaxing Equivariance Constraints with Non-stationary Continuous Filters

    Authors: Tycho F. A. van der Ouderaa, David W. Romero, Mark van der Wilk

    Abstract: Equivariances provide useful inductive biases in neural network modeling, with the translation equivariance of convolutional neural networks being a canonical example. Equivariances can be embedded in architectures through weight-sharing and place symmetry constraints on the functions a neural network can represent. The type of symmetry is typically fixed and has to be chosen in advance. Although…

    Submitted 13 November, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

  5. arXiv:2202.12439

    stat.ML cs.LG

    Learning Invariant Weights in Neural Networks

    Authors: Tycho F. A. van der Ouderaa, Mark van der Wilk

    Abstract: Assumptions about invariances or symmetries in data can significantly increase the predictive power of statistical models. Many commonly used models in machine learning are constrained to respect certain symmetries in the data, such as translation equivariance in convolutional neural networks, and incorporation of new symmetry types is actively being studied. Yet, efforts to learn such invariances…

    Submitted 2 August, 2022; v1 submitted 24 February, 2022; originally announced February 2022.

  6. arXiv:2202.10638

    stat.ML cs.LG

    Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations

    Authors: Alexander Immer, Tycho F. A. van der Ouderaa, Gunnar Rätsch, Vincent Fortuin, Mark van der Wilk

    Abstract: Data augmentation is commonly applied to improve performance of deep learning by enforcing the knowledge that certain transformations on the input preserve the output. Currently, the data augmentation parameters are chosen by human effort and costly cross-validation, which makes it cumbersome to apply to new datasets. We develop a convenient gradient-based method for selecting the data augmentatio…

    Submitted 13 October, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

    Comments: NeurIPS 2022

  7. arXiv:2010.00231

    eess.IV cs.CV

    Deep Group-wise Variational Diffeomorphic Image Registration

    Authors: Tycho F. A. van der Ouderaa, Ivana Išgum, Wouter B. Veldhuis, Bob D. de Vos

    Abstract: Deep neural networks are increasingly used for pair-wise image registration. We propose to extend current learning-based image registration to allow simultaneous registration of multiple images. To achieve this, we build upon the pair-wise variational and diffeomorphic VoxelMorph approach and present a general mathematical framework that enables both registration of multiple images to their geodes…

    Submitted 1 October, 2020; originally announced October 2020.

  8. arXiv:1902.02729

    cs.CV

    Reversible GANs for Memory-efficient Image-to-Image Translation

    Authors: Tycho F. A. van der Ouderaa, Daniel E. Worrall

    Abstract: The Pix2pix and CycleGAN losses have vastly improved the qualitative and quantitative visual quality of results in image-to-image translation tasks. We extend this framework by exploring approximately invertible architectures which are well suited to these losses. These architectures are approximately invertible by design and thus partially satisfy cycle-consistency before training even begins. Fu…

    Submitted 7 February, 2019; originally announced February 2019.