
Showing 1–19 of 19 results for author: Franke, J

  1. arXiv:2406.18701

    cs.LG cs.AI

    Fast Optimizer Benchmark

    Authors: Simon Blauth, Tobias Bürger, Zacharias Häringer, Jörg Franke, Frank Hutter

    Abstract: In this paper, we present the Fast Optimizer Benchmark (FOB), a tool designed for evaluating deep learning optimizers during their development. The benchmark supports tasks from multiple domains such as computer vision, natural language processing, and graph learning. The focus is on convenient usage, featuring human-readable YAML configurations, SLURM integration, and plotting utilities. FOB can…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 5 pages + 12 appendix pages, submitted to AutoML Conf 2024 Workshop Track

  2. arXiv:2405.10299

    cs.LG cs.AI

    HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models

    Authors: Rhea Sanjay Sukthanker, Arber Zela, Benedikt Staffler, Aaron Klein, Lennart Purucker, Joerg K. H. Franke, Frank Hutter

    Abstract: The increasing size of language models necessitates a thorough analysis across multiple dimensions to assess trade-offs among crucial hardware metrics such as latency, energy consumption, GPU memory usage, and performance. Identifying optimal model configurations under specific hardware constraints is becoming essential but remains challenging due to the computational load of exhaustive training a…

    Submitted 21 June, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: 48 pages, 69 figures, 10 tables

  3. arXiv:2401.05351

    q-bio.BM cs.LG

    Rethinking Performance Measures of RNA Secondary Structure Problems

    Authors: Frederic Runge, Jörg K. H. Franke, Daniel Fertmann, Frank Hutter

    Abstract: Accurate RNA secondary structure prediction is vital for understanding cellular regulation and disease mechanisms. Deep learning (DL) methods have surpassed traditional algorithms by predicting complex features like pseudoknots and multi-interacting base pairs. However, traditional distance measures can hardly deal with such tertiary interactions and the currently used evaluation measures (F1 scor…

    Submitted 4 December, 2023; originally announced January 2024.

    Comments: 12 pages, Accepted at the Machine Learning for Structural Biology Workshop, NeurIPS 2023

  4. arXiv:2311.12909

    stat.ML cs.LG

    Non-Sequential Ensemble Kalman Filtering using Distributed Arrays

    Authors: Cédric Travelletti, Jörg Franke, David Ginsbourger, Stefan Brönnimann

    Abstract: This work introduces a new, distributed implementation of the Ensemble Kalman Filter (EnKF) that allows for non-sequential assimilation of large datasets in high-dimensional problems. The traditional EnKF algorithm is computationally intensive and exhibits difficulties in applications requiring interaction with the background covariance matrix, prompting the use of methods like sequential assimila…

    Submitted 21 November, 2023; originally announced November 2023.

  5. arXiv:2311.09058

    cs.LG

    Constrained Parameter Regularization

    Authors: Jörg K. H. Franke, Michael Hefenbrock, Gregor Koehler, Frank Hutter

    Abstract: Regularization is a critical component in deep learning training, with weight decay being a commonly used approach. It applies a constant penalty coefficient uniformly across all parameters. This may be unnecessarily restrictive for some parameters, while insufficiently restricting others. To dynamically adjust penalty coefficients for different parameter groups, we present constrained parameter r…

    Submitted 6 December, 2023; v1 submitted 15 November, 2023; originally announced November 2023.

  6. arXiv:2310.03940

    cs.CV cs.AI

    Beyond Random Augmentations: Pretraining with Hard Views

    Authors: Fabio Ferreira, Ivo Rapant, Jörg K. H. Franke, Frank Hutter

    Abstract: Many Self-Supervised Learning (SSL) methods aim for model invariance to different image augmentations known as views. To achieve this invariance, conventional approaches make use of random sampling operations within the image augmentation pipeline. We hypothesize that the efficacy of pretraining pipelines based on conventional random view sampling can be enhanced by explicitly selecting views that…

    Submitted 27 May, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

  7. arXiv:2309.07513

    cs.CV

    RecycleNet: Latent Feature Recycling Leads to Iterative Decision Refinement

    Authors: Gregor Koehler, Tassilo Wald, Constantin Ulrich, David Zimmerer, Paul F. Jaeger, Jörg K. H. Franke, Simon Kohl, Fabian Isensee, Klaus H. Maier-Hein

    Abstract: Despite the remarkable success of deep learning systems over the last decade, a key difference still remains between neural network and human decision-making: as humans, we can not only form a decision on the spot, but also ponder, revisiting an initial guess from different angles, distilling relevant information, and arriving at a better decision. Here, we propose RecycleNet, a latent feature recyclin…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: Accepted at 2024 Winter Conference on Applications of Computer Vision (WACV)

  8. arXiv:2307.10073

    cs.LG q-bio.BM

    Scalable Deep Learning for RNA Secondary Structure Prediction

    Authors: Jörg K. H. Franke, Frederic Runge, Frank Hutter

    Abstract: The field of RNA secondary structure prediction has made significant progress with the adoption of deep learning techniques. In this work, we present the RNAformer, a lean deep learning model using axial attention and recycling in the latent space. We gain performance improvements by designing the architecture for modeling the adjacency matrix directly in the latent space and by scaling the size o…

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: Accepted at the 2023 ICML Workshop on Computational Biology. Honolulu, Hawaii, USA, 2023

  9. arXiv:2307.08801

    cs.LG q-bio.GN

    Towards Automated Design of Riboswitches

    Authors: Frederic Runge, Jörg K. H. Franke, Frank Hutter

    Abstract: Experimental screening and selection pipelines for the discovery of novel riboswitches are expensive, time-consuming, and inefficient. Using computational methods to reduce the number of candidates for the screen could drastically decrease these costs. However, existing computational approaches do not fully satisfy all requirements for the design of such initial screening libraries. In this work,…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: 9 pages, Accepted at the 2023 ICML Workshop on Computational Biology

  10. arXiv:2211.00860

    physics.ao-ph cs.CV

    Insight into cloud processes from unsupervised classification with a rotationally invariant autoencoder

    Authors: Takuya Kurihana, James Franke, Ian Foster, Ziwei Wang, Elisabeth Moyer

    Abstract: Clouds play a critical role in the Earth's energy budget and their potential changes are one of the largest uncertainties in future climate projections. However, the use of satellite observations to understand cloud feedbacks in a warming climate has been hampered by the simplicity of existing cloud classification schemes, which are based on single-pixel cloud properties rather than utilizing spat…

    Submitted 20 November, 2022; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: 5 pages, 3 figures, the 36th conference on Neural Information Processing Systems (NeurIPS) Machine Learning and the Physical Sciences workshop

  11. arXiv:2205.13927

    cs.LG q-bio.BM

    Probabilistic Transformer: Modelling Ambiguities and Distributions for RNA Folding and Molecule Design

    Authors: Jörg K. H. Franke, Frederic Runge, Frank Hutter

    Abstract: Our world is ambiguous and this is reflected in the data we use to train our algorithms. This is particularly true when we try to model natural processes where collected data is affected by noisy measurements and differences in measurement techniques. Sometimes, the process itself is ambiguous, such as in the case of RNA folding, where the same nucleotide sequence can fold into different structure…

    Submitted 14 November, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: 38 pages, Accepted at 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  12. arXiv:2203.01717

    cs.LG

    Practitioner Motives to Select Hyperparameter Optimization Methods

    Authors: Niklas Hasebrook, Felix Morsbach, Niclas Kannengießer, Marc Zöller, Jörg Franke, Marius Lindauer, Frank Hutter, Ali Sunyaev

    Abstract: Advanced programmatic hyperparameter optimization (HPO) methods, such as Bayesian optimization, have high sample efficiency in reproducibly finding optimal hyperparameter values of machine learning (ML) models. Yet, ML practitioners often apply less sample-efficient HPO methods, such as grid search, which often results in under-optimized ML models. As a reason for this behavior, we suspect practit…

    Submitted 26 June, 2023; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: submitted to JMLR; currently under review

  13. arXiv:2109.10731

    eess.IV cs.CV physics.med-ph

    Automatic Plane Adjustment of Orthopedic Intra-operative Flat Panel Detector CT-Volumes

    Authors: Celia Martin Vicario, Florian Kordon, Felix Denzinger, Jan Siad El Barbari, Maxim Privalov, Jochen Franke, Sarina Thomas, Lisa Kausch, Andreas Maier, Holger Kunze

    Abstract: Purpose: 3D acquisitions are often acquired to assess the result in orthopedic trauma surgery. With a mobile C-Arm system, these acquisitions can be performed intra-operatively. That reduces the number of required revision surgeries. However, due to the operation room setup, the acquisitions typically cannot be performed such that the acquired volumes are aligned to the anatomical regions. Thus,…

    Submitted 15 September, 2021; originally announced September 2021.

  14. arXiv:2103.08203

    cs.SD eess.AS q-bio.NC

    Computational timbre and tonal system similarity analysis of the music of Northern Myanmar-based Kachin compared to Xinjiang-based Uyghur ethnic groups

    Authors: Rolf Bader, Michael Blaß, Jonas Franke

    Abstract: The music of the Northern Myanmar Kachin ethnic group is compared to the music of western China, Xinjiang-based Uyghur music, using timbre and pitch feature extraction and machine learning. Although separated by Tibet, the muqam tradition of Xinjiang might be found in Kachin music due to myths of Kachin origin, as well as linguistic similarities, e.g., the Kachin term 'makan' for a musical piece. Extra…

    Submitted 15 March, 2021; originally announced March 2021.

    Comments: 12 pages, 9 figures

  15. arXiv:2010.13117

    cs.LG cs.AI

    Hyperparameter Transfer Across Developer Adjustments

    Authors: Danny Stoll, Jörg K. H. Franke, Diane Wagner, Simon Selg, Frank Hutter

    Abstract: After developer adjustments to a machine learning (ML) algorithm, how can the results of an old hyperparameter optimization (HPO) automatically be used to speed up a new HPO? This question poses a challenging problem, as developer adjustments can change which hyperparameter settings perform well, or even the hyperparameter search space itself. While many approaches exist that leverage knowledge obt…

    Submitted 25 October, 2020; originally announced October 2020.

  16. arXiv:2009.01555

    cs.LG stat.ML

    Sample-Efficient Automated Deep Reinforcement Learning

    Authors: Jörg K. H. Franke, Gregor Köhler, André Biedenkapp, Frank Hutter

    Abstract: Despite significant progress in challenging problems across various domains, applying state-of-the-art deep reinforcement learning (RL) algorithms remains challenging due to their sensitivity to the choice of hyperparameters. This sensitivity can partly be attributed to the non-stationarity of the RL problem, potentially requiring different hyperparameter settings at various stages of the learning…

    Submitted 17 March, 2021; v1 submitted 3 September, 2020; originally announced September 2020.

    Comments: In Proceedings of the International Conference on Learning Representations (ICLR 2021), 2021

  17. arXiv:1910.12824

    cs.LG cs.NE stat.ML

    Neural Architecture Evolution in Deep Reinforcement Learning for Continuous Control

    Authors: Jörg K. H. Franke, Gregor Köhler, Noor Awad, Frank Hutter

    Abstract: Current Deep Reinforcement Learning algorithms still heavily rely on handcrafted neural network architectures. We propose a novel approach to automatically find strong topologies for continuous control tasks while only adding a minor overhead in terms of interactions in the environment. To achieve this, we combine Neuroevolution techniques with off-policy training and propose a novel architecture…

    Submitted 27 February, 2020; v1 submitted 28 October, 2019; originally announced October 2019.

    Comments: NeurIPS 2019 MetaLearn Workshop

  18. arXiv:1907.10465

    eess.IV cs.CV

    Multi-task Localization and Segmentation for X-ray Guided Planning in Knee Surgery

    Authors: Florian Kordon, Peter Fischer, Maxim Privalov, Benedict Swartman, Marc Schnetzke, Jochen Franke, Ruxandra Lasowski, Andreas Maier, Holger Kunze

    Abstract: X-ray based measurement and guidance are commonly used tools in orthopaedic surgery to facilitate a minimally invasive workflow. Typically, surgical planning is first performed using knowledge of bone morphology and anatomical landmarks. Information about bone location then serves as a prior for registration during overlay of the planning on intra-operative X-ray images. Performing these steps m…

    Submitted 24 July, 2019; originally announced July 2019.

    Comments: Accepted for MICCAI 2019

  19. arXiv:1807.02658

    cs.CL cs.LG

    Robust and Scalable Differentiable Neural Computer for Question Answering

    Authors: Jörg Franke, Jan Niehues, Alex Waibel

    Abstract: Deep learning models are often not easily adaptable to new tasks and require task-specific adjustments. The differentiable neural computer (DNC), a memory-augmented neural network, is designed as a general problem solver which can be used in a wide range of tasks. But in reality, it is hard to apply this model to new tasks. We analyze the DNC and identify possible improvements within the applicati…

    Submitted 7 July, 2018; originally announced July 2018.

    Comments: Accepted at Workshop on Machine Reading for Question Answering (MRQA), ACL 2018. 14 pages, 5 figures