Roadmap on Data-Centric Materials Science

Authors: Stefan Bauer, Peter Benner, Tristan Bereau, Volker Blum, Mario Boley, Christian Carbogno, C. Richard A. Catlow, Gerhard Dehm, Sebastian Eibl, Ralph Ernstorfer, Ádám Fekete, Lucas Foppa, Peter Fratzl, Christoph Freysoldt, Baptiste Gault, Luca M. Ghiringhelli, Sajal K. Giri, Anton Gladyshev, Pawan Goyal, Jason Hattrick-Simpers, Lara Kabalan, Petr Karpov, Mohammad S. Khorrami, Christoph Koch, Sebastian Kokott, et al. (36 additional authors not shown)

Abstract: Science is and always has been based on data, but the terms "data-centric" and the "4th paradigm of" materials research indicate a radical change in how information is retrieved, handled and research is performed. It signifies a transformative shift towards managing vast data collections, digital repositories, and innovative data analytics methods. The integration of Artificial Intelligence (AI) and its subset Machine Learning (ML) has become pivotal in addressing all these challenges. This Roadmap on Data-Centric Materials Science explores fundamental concepts and methodologies, illustrating diverse applications in electronic-structure theory, soft matter theory, microstructure research, and experimental techniques like photoemission, atom probe tomography, and electron microscopy. While the roadmap delves into specific areas within the broad interdisciplinary field of materials science, the provided examples elucidate key concepts applicable to a wider range of topics. The discussed instances offer insights into addressing the multifaceted challenges encountered in contemporary materials research.

Submitted 1 May, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

Comments: Review, outlook, roadmap, perspective

arXiv:2303.05994 [pdf, other]

doi 10.1016/j.cpc.2023.108973

A performance portable implementation of the semi-Lagrangian algorithm in six dimensions

Authors: Nils Schild, Mario Raeth, Sebastian Eibl, Klaus Hallatschek, Katharina Kormann

Abstract: In this paper, we describe our approach to develop a simulation software application for the fully kinetic Vlasov equation which will be used to explore physics beyond the gyrokinetic model. Simulating the fully kinetic Vlasov equation requires efficient utilization of compute and storage capabilities due to the high dimensionality of the problem. In addition, the implementation needs to be extensible regarding the physical model and flexible regarding the hardware for production runs. We start on the algorithmic background to simulate the 6-D Vlasov equation using a semi-Lagrangian algorithm. The performance portable software stack, which enables production runs on pure CPU as well as AMD or Nvidia GPU accelerated nodes, is presented. The extensibility of our implementation is guaranteed through the described software architecture of the main kernel, which achieves a memory bandwidth of almost 500 GB/s on a V100 Nvidia GPU and around 100 GB/s on an Intel Xeon Gold CPU using a single code base. We provide performance data on multiple node level architectures discussing utilized and further available hardware capabilities. Finally, the network communication bottleneck of 6-D grid based algorithms is quantified. A verification of physics beyond gyrokinetic theory for the example of ion Bernstein waves concludes the work.

Submitted 10 March, 2023; originally announced March 2023.

arXiv:2109.10876 [pdf, other]

doi 10.1016/j.cpc.2023.108760

Code modernization strategies for short-range non-bonded molecular dynamics simulations

Authors: James Vance, Zhen-Hao Xu, Nikita Tretyakov, Torsten Stuehn, Markus Rampp, Sebastian Eibl, Christoph Junghans, André Brinkmann

Abstract: Modern HPC systems are increasingly relying on greater core counts and wider vector registers. Thus, applications need to be adapted to fully utilize these hardware capabilities. One class of applications that can benefit from this increase in parallelism are molecular dynamics simulations. In this paper, we describe our efforts at modernizing the ESPResSo++ molecular dynamics simulation package by restructuring its particle data layout for efficient memory accesses and applying vectorization techniques to benefit the calculation of short-range non-bonded forces, which results in an overall three times speedup and serves as a baseline for further optimizations. We also implement fine-grained parallelism for multi-core CPUs through HPX, a C++ runtime system which uses lightweight threads and an asynchronous many-task approach to maximize concurrency. Our goal is to evaluate the performance of an HPX-based approach compared to the bulk-synchronous MPI-based implementation. This requires the introduction of an additional layer to the domain decomposition scheme that defines the task granularity. On spatially inhomogeneous systems, which impose a corresponding load-imbalance in traditional MPI-based approaches, we demonstrate that by choosing an optimal task size, the efficient work-stealing mechanisms of HPX can overcome the overhead of communication resulting in an overall 1.4 times speedup compared to the baseline MPI version.

Submitted 15 June, 2023; v1 submitted 22 September, 2021; originally announced September 2021.

Comments: 42 pages, 9 figures, SI

Journal ref: Comp. Phys. Comm. 290, 108760 (2023)

arXiv:2103.04103 [pdf, other]

doi 10.1017/jfm.2021.870

Rheology of mobile sediment beds in laminar shear flow: effects of creep and polydispersity

Authors: Christoph Rettinger, Sebastian Eibl, Ulrich Rüde, Bernhard Vowinckel

Abstract: Classical scaling relationships for rheological quantities such as the $μ(J)$-rheology have become increasingly popular for closures of two-phase flow modeling. However, these frameworks have been derived for monodisperse particles. We aim to extend these considerations to sediment transport modeling by using a more realistic sediment composition. We investigate the rheological behavior of sheared sediment beds composed of polydisperse spherical particles in a laminar Couette-type shear flow. The sediment beds consist of particles with a diameter size ratio of up to ten, which corresponds to grains ranging from fine to coarse sand. The data was generated using fully coupled, grain resolved direct numerical simulations using a combined lattice Boltzmann - discrete element method. These highly-resolved data yield detailed depth-resolved profiles of the relevant physical quantities that determine the rheology, i.e., the local shear rate of the fluid, particle volume fraction, total shear, and granular pressure. A comparison against experimental data shows excellent agreement for the monodisperse case. We improve upon the parameterization of the $μ(J)$-rheology by expressing its empirically derived parameters as a function of the maximum particle volume fraction. Furthermore, we extend these considerations by exploring the creeping regime for viscous numbers much lower than used by previous studies to calibrate these correlations.
Considering the low viscous numbers of our data, we found that the friction coefficient governing the quasi-static state in the creeping regime tends to a finite value for vanishing shear, which decreases the critical friction coefficient by a factor of three for all cases investigated.

Submitted 30 June, 2021; v1 submitted 6 March, 2021; originally announced March 2021.

arXiv:2103.02388 [pdf, other]

A massively parallel Eulerian-Lagrangian method for advection-dominated transport in viscous fluids

Authors: Nils Kohl, Marcus Mohr, Sebastian Eibl, Ulrich Rüde

Abstract: Motivated by challenges in Earth mantle convection, we present a massively parallel implementation of an Eulerian-Lagrangian method for the advection-diffusion equation in the advection-dominated regime. The advection term is treated by a particle-based, characteristics method coupled to a block-structured finite-element framework. Its numerical and computational performance is evaluated in multiple, two- and three-dimensional benchmarks, including curved geometries, discontinuous solutions, pure advection, and it is applied to a coupled non-linear system modeling buoyancy-driven convection in Stokes flow. We demonstrate the parallel performance in a strong and weak scaling experiment, with scalability to up to $147,456$ parallel processes, solving for more than $5.2 \times 10^{10}$ (52 billion) degrees of freedom per time-step.

Submitted 3 March, 2021; originally announced March 2021.

Comments: 22 pages

MSC Class: 65M25; 65Y05; 65M60

arXiv:2010.13342 [pdf, other]

Resiliency in Numerical Algorithm Design for Extreme Scale Simulations

Authors: Emmanuel Agullo, Mirco Altenbernd, Hartwig Anzt, Leonardo Bautista-Gomez, Tommaso Benacchio, Luca Bonaventura, Hans-Joachim Bungartz, Sanjay Chatterjee, Florina M. Ciorba, Nathan DeBardeleben, Daniel Drzisga, Sebastian Eibl, Christian Engelmann, Wilfried N. Gansterer, Luc Giraud, Dominik Goeddeke, Marco Heisig, Fabienne Jezequel, Nils Kohl, Xiaoye Sherry Li, Romain Lion, Miriam Mehl, Paul Mycek, Michael Obersteiner, Enrique S. Quintana-Orti, et al. (11 additional authors not shown)

Abstract: This work is based on the seminar titled ``Resiliency in Numerical Algorithm Design for Extreme Scale Simulations'' held March 1-6, 2020 at Schloss Dagstuhl, that was attended by all the authors. Naive versions of conventional resilience techniques will not scale to the exascale regime: with a main memory footprint of tens of Petabytes, synchronously writing checkpoint data all the way to background storage at frequent intervals will create intolerable overheads in runtime and energy consumption. Forecasts show that the mean time between failures could be lower than the time to recover from such a checkpoint, so that large calculations at scale might not make any progress if robust alternatives are not investigated. More advanced resilience techniques must be devised. The key may lie in exploiting both advanced system features as well as specific application knowledge. Research will face two essential questions: (1) what are the reliability requirements for a particular computation and (2) how do we best design the algorithms and software to meet these requirements? One avenue would be to refine and improve on system- or application-level checkpointing and rollback strategies in the case an error is detected. Developers might use fault notification interfaces and flexible runtime systems to respond to node failures in an application-dependent fashion. Novel numerical algorithms or more stochastic computational approaches may be required to meet accuracy requirements in the face of undetectable soft errors.
The goal of this Dagstuhl Seminar was to bring together a diverse group of scientists with expertise in exascale computing to discuss novel ways to make applications resilient against detected and undetected faults. In particular, participants explored the role that algorithms and applications play in the holistic approach needed to tackle this challenge.

Submitted 26 October, 2020; originally announced October 2020.

Comments: 45 pages, 3 figures, submitted to The International Journal of High Performance Computing Applications

ACM Class: D.4.5; G.4; G.1; D.4.4

arXiv:2009.07400 [pdf, other]

tinyMD: A Portable and Scalable Implementation for Pairwise Interactions Simulations

Authors: Rafael Ravedutti L. Machado, Jonas Schmitt, Sebastian Eibl, Jan Eitzinger, Roland Leißa, Sebastian Hack, Arsène Pérard-Gayot, Richard Membarth, Harald Köstler

Abstract: This paper investigates the suitability of the AnyDSL partial evaluation framework to implement tinyMD: an efficient, scalable, and portable simulation of pairwise interactions among particles. We compare tinyMD with the miniMD proxy application that scales very well on parallel supercomputers. We discuss the differences between both implementations and contrast miniMD's performance for single-node CPU and GPU targets, as well as its scalability on SuperMUC-NG and Piz Daint supercomputers. Additionally, we demonstrate tinyMD's flexibility by coupling it with the waLBerla multi-physics framework. This allows us to execute tinyMD simulations using the load-balancing mechanism implemented in waLBerla.

Submitted 15 September, 2020; originally announced September 2020.

Comments: 35 pages, 8 figures, submitted to Journal of Computational Science

MSC Class: B.8.2; D.1.3; D.3.3; J.2

arXiv:2008.13046 [pdf]

doi 10.1063/5.0025505

Densification of Single-Walled Carbon Nanotube Films: Mesoscopic Distinct Element Method Simulations and Experimental Validation

Authors: Grigorii Drozdov, Igor Ostanin, Hao Xu, Yuezhou Wang, Traian Dumitrică, Artem Grebenko, Alexey P. Tsapenko, Yuriy Gladush, Georgy Ermolaev, Valentyn S. Volkov, Sebastian Eibl, Ulrich Rüde, Albert G. Nasibulin

Abstract: Nanometer thin single-walled carbon nanotube (CNT) films collected from the aerosol chemical deposition reactors have gathered attention for their promising applications. Densification of these pristine films provides an important way to manipulate the mechanical, electronic, and optical properties. To elucidate the underlying microstructural level restructuring, which is ultimately responsible for the change in properties, we perform large scale vector-based mesoscopic distinct element method simulations in conjunction with electron microscopy and spectroscopic ellipsometry characterization of pristine and densified films by drop-cast volatile liquid processing. Matching the microscopy observations, pristine CNT films with finite thickness are modeled as self-assembled CNT networks comprising entangled dendritic bundles with branches extending down to individual CNTs. Simulations of the film under uniaxial compression uncover an ultra-soft densification regime extending to a ~75% strain, which is likely accessible with the surface tensional forces arising from liquid surface tension during the evaporation. When removing the loads, the pre-compressed samples evolve into homogeneously densified films with thickness values depending on both the pre-compression level and the sample microstructure. The significant reduction in thickness, confirmed by our spectroscopic ellipsometry, is attributed to the underlying structural changes occurring at the 100 nm scale, including the zipping of the thinnest dendritic branches.

Submitted 29 August, 2020; originally announced August 2020.

Comments: 12 figures

arXiv:1909.13772 [pdf, other]

doi 10.1016/j.camwa.2020.01.007

waLBerla: A block-structured high-performance framework for multiphysics simulations

Authors: Martin Bauer, Sebastian Eibl, Christian Godenschwager, Nils Kohl, Michael Kuron, Christoph Rettinger, Florian Schornbaum, Christoph Schwarzmeier, Dominik Thönnes, Harald Köstler, Ulrich Rüde

Abstract: Programming current supercomputers efficiently is a challenging task. Multiple levels of parallelism on the core, on the compute node, and between nodes need to be exploited to make full use of the system. Heterogeneous hardware architectures with accelerators further complicate the development process. waLBerla addresses these challenges by providing the user with highly efficient building blocks for developing simulations on block-structured grids. The block-structured domain partitioning is flexible enough to handle complex geometries, while the structured grid within each block allows for highly efficient implementations of stencil-based algorithms. We present several example applications realized with waLBerla, ranging from lattice Boltzmann methods to rigid particle simulations. Most importantly, these methods can be coupled together, enabling multiphysics simulations. The framework uses meta-programming techniques to generate highly efficient code for CPUs and GPUs from a symbolic method formulation. To ensure software quality and performance portability, a continuous integration toolchain automatically runs an extensive test suite encompassing multiple compilers, hardware architectures, and software configurations.

Submitted 30 September, 2019; originally announced September 2019.

arXiv:1906.10963 [pdf, other]

A Modular and Extensible Software Architecture for Particle Dynamics

Authors: Sebastian Eibl, Ulrich Rüde

Abstract: Creating a highly parallel and flexible discrete element software requires an interdisciplinary approach, where expertise from different disciplines is combined. On the one hand domain specialists provide interaction models between particles. On the other hand high-performance computing specialists optimize the code to achieve good performance on different hardware architectures. In particular, the software must be carefully crafted to achieve good scaling on massively parallel supercomputers. Combining all this in a flexible and extensible, widely usable software is a challenging task. In this article we outline the design decisions and concepts of a newly developed particle dynamics code MESA-PD that is implemented as part of the waLBerla multi-physics framework. Extensibility, flexibility, but also performance and scalability are primary design goals for the new software framework. In particular, the new modular architecture is designed such that physical models can be modified and extended by domain scientists without understanding all details of the parallel computing functionality and the underlying distributed data structures that are needed to achieve good performance on current supercomputer architectures. This goal is achieved by combining the high performance simulation framework waLBerla with code generation techniques. All code and the code generator are released as open source under GPLv3 within the publicly available waLBerla framework (www.walberla.net).

Submitted 26 June, 2019; originally announced June 2019.

Comments: Proceedings Of The 8Th International Conference On Discrete Element Methods

arXiv:1905.05042 [pdf, other]

Computational Study of Ultrathin CNT Films with the Scalable Mesoscopic Distinct Element Method

Authors: Igor Ostanin, Traian Dumitrică, Sebastian Eibl, Ulrich Rüde

Abstract: In this work we present a computational study of the small strain mechanics of freestanding ultrathin CNT films under in-plane loading. The numerical modeling of the mechanics of representatively large specimens with realistic micro- and nanostructure is presented. Our simulations utilize the scalable implementation of the mesoscopic distinct element method of the waLBerla multi-physics framework. Within our modeling approach, CNTs are represented as chains of interacting rigid segments. Neighboring segments in the chain are connected with elastic bonds, resolving tension, bending, shear and torsional deformations. These bonds represent a covalent bonding within CNT surface and utilize Enhanced Vector Model (EVM) formalism. Segments of the neighboring CNTs interact with realistic coarse-grained anisotropic vdW potential, enabling relative slip of CNTs in contact. The advanced simulation technique allowed us to gain useful insights on the behavior of CNT materials. In particular, it was established that the energy dissipation during CNT sliding leads to extended load transfer that conditions material-like mechanical response of the weakly bonded assemblies of CNTs.

Submitted 19 October, 2019; v1 submitted 13 May, 2019; originally announced May 2019.

arXiv:1808.00829 [pdf, other]

doi 10.1016/j.cpc.2019.06.020

A Systematic Comparison of Dynamic Load Balancing Algorithms for Massively Parallel Rigid Particle Dynamics

Authors: Sebastian Eibl, Ulrich Rüde

Abstract: As compute power increases with time, more involved and larger simulations become possible. However, it gets increasingly difficult to efficiently use the provided computational resources. Especially in particle-based simulations with a spatial domain partitioning large load imbalances can occur due to the simulation being dynamic. Then a static domain partitioning may not be suitable. This can deteriorate the overall runtime of the simulation significantly. Sophisticated load balancing strategies must be designed to alleviate this problem. In this paper we conduct a systematic evaluation of the performance of six different load balancing algorithms. Our tests cover a wide range of simulation sizes, and employ one of the largest supercomputers available. In particular we study the runtime and memory complexity of all components of the simulation carefully. When progressing to extreme scale simulations it is essential to identify bottlenecks and to predict the scaling behaviour. Scaling experiments are shown for up to over one million processes. The performance of each algorithm is analyzed with respect to the quality of the load balancing and its runtime costs. For all tests, the waLBerla multiphysics framework is employed.

Submitted 2 August, 2019; v1 submitted 2 August, 2018; originally announced August 2018.

arXiv:1802.02765 [pdf, other]

A local parallel communication algorithm for polydisperse rigid body dynamics

Authors: Sebastian Eibl, Ulrich Rüde

Abstract: The simulation of large ensembles of particles is usually parallelized by partitioning the domain spatially and using message passing to communicate between the processes handling neighboring subdomains. The particles are represented as individual geometric objects and are associated to the subdomains. Handling collisions and migrating particles between subdomains, as required for proper parallel execution, requires a complex communication protocol. Typically, the parallelization is restricted to handling only particles that are smaller than a subdomain. In many applications, however, particle sizes may vary drastically with some of them being larger than a subdomain. In this article we propose a new communication and synchronization algorithm that can handle the parallelization without size restrictions on the particles. Despite the additional complexity and extended functionality, the new algorithm introduces only minimal overhead. We demonstrate the scalability of the previous and the new communication algorithms up to almost two million parallel processes and for handling ten billion (1e10) geometrically resolved particles on a state-of-the-art petascale supercomputer. Different scenarios are presented to analyze the performance of the new algorithm and to demonstrate its capability to simulate polydisperse scenarios, where large individual particles can extend across several subdomains.

Submitted 2 August, 2018; v1 submitted 8 February, 2018; originally announced February 2018.

arXiv:1706.00221 [pdf, other]

doi 10.1007/s00466-017-1486-0

The Maximum Dissipation Principle in Rigid-Body Dynamics with Purely Inelastic Impacts

Authors: Tobias Preclik, Sebastian Eibl, Ulrich Rüde

Abstract: Formulating a consistent theory for rigid-body dynamics with impacts is an intricate problem. Twenty years ago Stewart published the first consistent theory with purely inelastic impacts and an impulsive friction model analogous to Coulomb friction. In this paper we demonstrate that the consistent impact model can exhibit multiple solutions with a varying degree of dissipation even in the single-contact case. Replacing the impulsive friction model based on Coulomb friction by a model based on the maximum dissipation principle resolves the non-uniqueness in the single-contact impact problem. The paper constructs the alternative impact model and presents integral equations describing rigid-body dynamics with a non-impulsive and non-compliant contact model and an associated purely inelastic impact model maximizing dissipation. An analytic solution is derived for the single-contact impact problem. The models are then embedded into a time-stepping scheme. The macroscopic behaviour is compared to Coulomb friction in a large-scale granular flow problem.

Submitted 1 June, 2017; originally announced June 2017.

Journal ref: Springer, Computational Mechanics, 2017

Showing 1–14 of 14 results for author: Eibl, S