Showing 1–4 of 4 results for author: Alkin, B

  1. arXiv:2406.04303  [pdf, other]

    cs.CV cs.AI cs.LG

    Vision-LSTM: xLSTM as Generic Vision Backbone

    Authors: Benedikt Alkin, Maximilian Beck, Korbinian Pöppel, Sepp Hochreiter, Johannes Brandstetter

    Abstract: Transformers are widely used as generic backbones in computer vision, despite being initially introduced for natural language processing. Recently, the Long Short-Term Memory (LSTM) has been extended to a scalable and performant architecture - the xLSTM - which overcomes long-standing LSTM limitations via exponential gating and parallelizable matrix memory structure. In this report, we introduce Vision-…

    Submitted 2 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  2. arXiv:2402.12365  [pdf, other]

    cs.LG cs.AI physics.flu-dyn

    Universal Physics Transformers: A Framework For Efficiently Scaling Neural Operators

    Authors: Benedikt Alkin, Andreas Fürst, Simon Schmid, Lukas Gruber, Markus Holzleitner, Johannes Brandstetter

    Abstract: Neural operators, serving as physics surrogate models, have recently gained increased interest. With ever-increasing problem complexity, the natural question arises: what is an efficient way to scale neural operators to larger and more complex simulations - most importantly by taking into account different types of simulation datasets. This is of special interest since, akin to their numerical cou…

    Submitted 30 April, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  3. arXiv:2402.10093  [pdf, other]

    cs.CV cs.AI cs.LG

    MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations

    Authors: Benedikt Alkin, Lukas Miklautz, Sepp Hochreiter, Johannes Brandstetter

    Abstract: We introduce MIM (Masked Image Modeling)-Refiner, a contrastive learning boost for pre-trained MIM models. MIM-Refiner is motivated by the insight that strong representations within MIM models generally reside in intermediate layers. Accordingly, MIM-Refiner leverages multiple contrastive heads that are connected to different intermediate layers. In each head, a modified nearest neighbor objective…

    Submitted 3 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  4. arXiv:2304.10520  [pdf, other]

    cs.CV cs.AI cs.LG

    Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget

    Authors: Johannes Lehner, Benedikt Alkin, Andreas Fürst, Elisabeth Rumetshofer, Lukas Miklautz, Sepp Hochreiter

    Abstract: Masked Image Modeling (MIM) methods, like Masked Autoencoders (MAE), efficiently learn a rich representation of the input. However, for adapting to downstream tasks, they require a sufficient amount of labeled data since their rich features code not only objects but also less relevant image background. In contrast, Instance Discrimination (ID) methods focus on objects. In this work, we study how t…

    Submitted 14 September, 2023; v1 submitted 20 April, 2023; originally announced April 2023.