-
How big is Big Data?
Authors:
Daniel T. Speckhard,
Tim Bechtel,
Luca M. Ghiringhelli,
Martin Kuban,
Santiago Rigamonti,
Claudia Draxl
Abstract:
Big data has ushered in a new wave of predictive power using machine learning models. In this work, we assess what "big" means in the context of typical materials-science machine-learning problems. This concerns not only data volume, but also data quality and veracity as much as infrastructure issues. With selected examples, we ask (i) how models generalize to similar datasets, (ii) how high-quality datasets can be gathered from heterogeneous sources, (iii) how the feature set and complexity of a model can affect expressivity, and (iv) what infrastructure requirements are needed to create larger datasets and train models on them. In sum, we find that big data present unique challenges along very different aspects that should serve to motivate further work.
Submitted 18 May, 2024;
originally announced May 2024.
-
Combining genetic algorithm and compressed sensing for features and operators selection in symbolic regression
Authors:
Aliaksei Mazheika,
Sergey V. Levchenko,
Luca M. Ghiringhelli
Abstract:
Symbolic-inference methods have recently found broad application in materials science. In particular, the Sure-Independence Screening and Sparsifying Operator (SISSO) performs symbolic regression and classification by adopting compressed sensing for the selection of an optimized subset of features and mathematical operators out of a given set of candidates. However, SISSO becomes computationally impractical when the set of candidate features and operators exceeds a few tens. In the present work, we combine SISSO with a genetic algorithm (GA) for the global search of the optimal subset of features and operators. We demonstrate that GA-SISSO efficiently finds more accurate predictive models than the original SISSO, because it can access a larger input feature and operator space. GA-SISSO was applied to search for a model that predicts carbon-dioxide adsorption energies on semiconductor oxides. The model obtained with GA-SISSO has much higher accuracy than models previously discussed in the literature (based solely on the O 2p-band center). The analysis of feature importance shows that, besides the O 2p-band center, the electrostatic potential above adsorption sites and the surface formation energies are also important.
Submitted 23 March, 2024;
originally announced March 2024.
-
Roadmap on Data-Centric Materials Science
Authors:
Stefan Bauer,
Peter Benner,
Tristan Bereau,
Volker Blum,
Mario Boley,
Christian Carbogno,
C. Richard A. Catlow,
Gerhard Dehm,
Sebastian Eibl,
Ralph Ernstorfer,
Ádám Fekete,
Lucas Foppa,
Peter Fratzl,
Christoph Freysoldt,
Baptiste Gault,
Luca M. Ghiringhelli,
Sajal K. Giri,
Anton Gladyshev,
Pawan Goyal,
Jason Hattrick-Simpers,
Lara Kabalan,
Petr Karpov,
Mohammad S. Khorrami,
Christoph Koch,
Sebastian Kokott
, et al. (36 additional authors not shown)
Abstract:
Science is and always has been based on data, but the terms "data-centric" and the "4th paradigm of" materials research indicate a radical change in how information is retrieved and handled and how research is performed. It signifies a transformative shift towards managing vast data collections, digital repositories, and innovative data analytics methods. The integration of Artificial Intelligence (AI) and its subset Machine Learning (ML) has become pivotal in addressing all these challenges. This Roadmap on Data-Centric Materials Science explores fundamental concepts and methodologies, illustrating diverse applications in electronic-structure theory, soft matter theory, microstructure research, and experimental techniques like photoemission, atom probe tomography, and electron microscopy. While the roadmap delves into specific areas within the broad interdisciplinary field of materials science, the provided examples elucidate key concepts applicable to a wider range of topics. The discussed instances offer insights into addressing the multifaceted challenges encountered in contemporary materials research.
Submitted 1 May, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
On the Uncertainty Estimates of Equivariant-Neural-Network-Ensembles Interatomic Potentials
Authors:
Shuaihua Lu,
Luca M. Ghiringhelli,
Christian Carbogno,
Jinlan Wang,
Matthias Scheffler
Abstract:
Machine-learning (ML) interatomic potentials (IPs) trained on first-principles datasets are becoming increasingly popular since they promise to treat larger system sizes and longer time scales, compared to the ab initio techniques producing the training data. Estimating the accuracy of MLIPs and reliably detecting when predictions become inaccurate is key for enabling their unfailing usage. In this paper, we explore this aspect for a specific class of MLIPs, the equivariant-neural-network (ENN) IPs using the ensemble technique for quantifying their prediction uncertainties. We critically examine the robustness of uncertainties when the ENN ensemble IP (ENNE-IP) is applied to the realistic and physically relevant scenario of predicting local-minima structures in the configurational space. The ENNE-IP is trained on data for liquid silicon, created by density-functional theory (DFT) with the generalized gradient approximation (GGA) for the exchange-correlation functional. Then, the ensemble-derived uncertainties are compared with the actual errors (comparing the results of the ENNE-IP with those of the underlying DFT-GGA theory) for various test sets, including liquid silicon at different temperatures and out-of-training-domain data such as solid phases with and without point defects as well as surfaces. Our study reveals that the predicted uncertainties are generally overconfident and hold little quantitative predictive power for the actual errors.
Submitted 31 August, 2023;
originally announced September 2023.
-
Uncertainty Quantification in Deep Neural Networks through Statistical Inference on Latent Space
Authors:
Luigi Sbailò,
Luca M. Ghiringhelli
Abstract:
Uncertainty-quantification methods are applied to estimate the confidence of deep-neural-network classifiers in their predictions. However, most widely used methods are known to be overconfident. We address this problem by developing an algorithm that exploits the latent-space representation of data points fed into the network to assess the accuracy of their prediction. Using the latent-space representation generated by the fraction of the training set that the network classifies correctly, we build a statistical model that is able to capture the likelihood of a given prediction. We show on a synthetic dataset that commonly used methods are mostly overconfident. Overconfidence also occurs for predictions made on data points that are outside the distribution that generated the training data. In contrast, our method can detect such out-of-distribution data points as inaccurately predicted, thus aiding in the automatic detection of outliers.
Submitted 18 May, 2023;
originally announced May 2023.
-
Recent advances in the SISSO method and their implementation in the SISSO++ code
Authors:
Thomas A. R. Purcell,
Matthias Scheffler,
Luca M. Ghiringhelli
Abstract:
Accurate and explainable artificial-intelligence (AI) models are promising tools for the acceleration of the discovery of new materials, or new applications for existing materials. Recently, symbolic regression has become an increasingly popular tool for explainable AI because it yields models that are relatively simple analytical descriptions of target properties. Due to its deterministic nature, the sure-independence screening and sparsifying operator (SISSO) method is a particularly promising approach for this application. Here we describe the new advancements of the SISSO algorithm, as implemented in SISSO++, a C++ code with Python bindings. We introduce a new representation of the mathematical expressions found by SISSO. This is a first step towards introducing "grammar" rules into the feature creation step. Importantly, by introducing a controlled non-linear optimization to the feature creation step we expand the range of possible descriptors found by the methodology. Finally, we introduce refinements to the solver algorithms for both regression and classification that drastically increase the reliability and efficiency of SISSO. For all of these improvements to the basic SISSO algorithm, we not only illustrate their potential impact, but also fully detail how they operate both mathematically and computationally.
Submitted 2 May, 2023;
originally announced May 2023.
-
Automatic Identification of Crystal Structures and Interfaces via Artificial-Intelligence-based Electron Microscopy
Authors:
Andreas Leitherer,
Byung Chul Yeo,
Christian H. Liebscher,
Luca M. Ghiringhelli
Abstract:
Characterizing crystal structures and interfaces down to the atomic level is an important step for designing advanced materials. Modern electron microscopy routinely achieves atomic resolution and is capable of resolving complex arrangements of atoms with picometer precision. Here, we present AI-STEM, an automatic, artificial-intelligence-based method for accurately identifying key characteristics from atomic-resolution scanning transmission electron microscopy (STEM) images of polycrystalline materials. The method is based on a Bayesian convolutional neural network (BNN) that is trained only on simulated images. AI-STEM automatically and accurately identifies crystal structure, lattice orientation, and location of interface regions in synthetic and experimental images. The model is trained on cubic and hexagonal crystal structures, yielding classifications and uncertainty estimates, while no explicit information on structural patterns at the interfaces is included during training. This work combines principles from probabilistic modeling, deep learning, and information theory, enabling automatic analysis of experimental, atomic-resolution images.
Submitted 6 September, 2023; v1 submitted 22 March, 2023;
originally announced March 2023.
-
The NOMAD Artificial-Intelligence Toolkit: Turning materials-science data into knowledge and understanding
Authors:
Luigi Sbailò,
Ádám Fekete,
Luca M. Ghiringhelli,
Matthias Scheffler
Abstract:
We present the Novel-Materials-Discovery (NOMAD) Artificial-Intelligence (AI) Toolkit, a web-browser-based infrastructure for the interactive AI-based analysis of materials-science findable, accessible, interoperable, and reusable (FAIR) data. The AI Toolkit readily operates on the FAIR data stored in the central server of the NOMAD Archive, the largest database of materials-science data worldwide, as well as on locally stored, user-owned data. The NOMAD Oasis, a local, stand-alone server, can also be used to run the AI Toolkit. By using Jupyter notebooks that run in a web browser, the NOMAD data can be queried and accessed; data mining, machine learning, and other AI techniques can then be applied to analyse them. This infrastructure brings the concept of reproducibility in materials science to the next level, by allowing researchers to share not only the data contributing to their scientific publications, but also all the developed methods and analytics tools. Besides reproducing published results, users of the NOMAD AI Toolkit can modify the Jupyter notebooks towards their own research work.
Submitted 9 November, 2022; v1 submitted 31 May, 2022;
originally announced May 2022.
-
Shared Metadata for Data-Centric Materials Science
Authors:
Luca M. Ghiringhelli,
Carsten Baldauf,
Tristan Bereau,
Sandor Brockhauser,
Christian Carbogno,
Javad Chamanara,
Stefano Cozzini,
Stefano Curtarolo,
Claudia Draxl,
Shyam Dwaraknath,
Ádám Fekete,
James Kermode,
Christoph T. Koch,
Markus Kühbach,
Alvin Noe Ladines,
Patrick Lambrix,
Maja-Olivia Lenz-Himmer,
Sergey Levchenko,
Micael Oliveira,
Adam Michalchuk,
Ron Miller,
Berk Onat,
Pasquale Pavone,
Giovanni Pizzi,
Benjamin Regler
, et al. (10 additional authors not shown)
Abstract:
The expansive production of data in materials science, together with their widespread sharing and repurposing, requires educated support and stewardship. In order to ensure that this need helps rather than hinders scientific work, the implementation of the FAIR-data principles (Findable, Accessible, Interoperable, and Reusable) must not be too narrow. Besides, the wider materials-science community ought to agree on the strategies to tackle the challenges that are specific to its data, both from computations and experiments. In this paper, we present the result of the discussions held at the workshop on "Shared Metadata and Data Formats for Big-Data Driven Materials Science". We start from an operative definition of metadata, and what features a FAIR-compliant metadata schema should have. We will mainly focus on computational materials-science data and propose a constructive approach for the FAIRification of the (meta)data related to ground-state and excited-states calculations, potential-energy sampling, and generalized workflows. Finally, challenges with the FAIRification of experimental (meta)data and materials-science ontologies are presented together with an outlook on how to meet them.
Submitted 23 August, 2023; v1 submitted 29 May, 2022;
originally announced May 2022.
-
Accelerating Materials-Space Exploration for Thermal Insulators by Mapping Materials Properties via Artificial Intelligence
Authors:
Thomas A. R. Purcell,
Matthias Scheffler,
Luca M. Ghiringhelli,
Christian Carbogno
Abstract:
Reliable artificial-intelligence models have the potential to accelerate the discovery of materials with optimal properties for various applications, including superconductivity, catalysis, and thermoelectricity. Advancements in this field are often hindered by the scarcity and quality of available data and the significant effort required to acquire new data. For such applications, reliable surrogate models that help guide materials space exploration using easily accessible materials properties are urgently needed. Here, we present a general, data-driven framework that provides quantitative predictions as well as qualitative rules for steering data creation for all datasets via a combination of symbolic regression and sensitivity analysis. We demonstrate the power of the framework by generating an accurate analytic model for the lattice thermal conductivity using only 75 experimentally measured values. By extracting the most influential material properties from this model, we are then able to hierarchically screen 732 materials and find 80 ultra-insulating materials.
Submitted 6 June, 2023; v1 submitted 27 April, 2022;
originally announced April 2022.
-
Hierarchical symbolic regression for identifying key physical parameters correlated with bulk properties of perovskites
Authors:
Lucas Foppa,
Thomas A. R. Purcell,
Sergey V. Levchenko,
Matthias Scheffler,
Luca M. Ghiringhelli
Abstract:
Symbolic regression identifies key physical parameters describing materials properties by uncovering correlations as nonlinear analytical expressions. However, the pool of expressions grows rapidly with complexity, compromising its efficiency. We tackle this challenge by a hierarchical approach: identified expressions are used as input parameters for obtaining more complex expressions. Crucially, this framework can transfer knowledge among properties, highlighting physical relationships. We demonstrate this strategy by using the Sure-Independence-Screening-and-Sparsifying-Operator (SISSO) approach to identify expressions correlated with the lattice constant and cohesive energy, which are then used to model the bulk modulus of ABO3 perovskites.
Submitted 25 February, 2022;
originally announced February 2022.
-
Ab initio approach for thermodynamic surface phases with full consideration of anharmonic effects -- the example of hydrogen at Si(100)
Authors:
Yuanyuan Zhou,
Chunye Zhu,
Matthias Scheffler,
Luca M. Ghiringhelli
Abstract:
A reliable description of surface structures in a reactive environment is crucial to understand materials functions. We present a first-principles theory of replica-exchange grand-canonical-ensemble molecular dynamics (REGC-MD) and apply it to evaluate phase equilibria of surfaces in a reactive gas-phase environment. We identify the different surface phases and locate phase boundaries including triple and critical points. The approach is demonstrated by addressing open questions for the Si(100) surface in contact with a hydrogen atmosphere. In the range from 300 to 1000 K, we find 25 distinct thermodynamically stable surface phases, for which we also provide microscopic descriptions. Most of the identified phases, including a few order-disorder phase transitions, have not yet been observed experimentally. The REGC-MD-derived phase diagram shows significant, qualitative differences from the description by the state-of-the-art "ab initio atomistic thermodynamics" approach.
Submitted 2 February, 2022;
originally announced February 2022.
-
Trends in atomistic simulation software usage
Authors:
Leopold Talirz,
Luca M. Ghiringhelli,
Berend Smit
Abstract:
Driven by the unprecedented computational power available to scientific research, the use of computers in solid-state physics, chemistry and materials science has been on a continuous rise. This review focuses on the software used for the simulation of matter at the atomic scale. We provide a comprehensive overview of major codes in the field, and analyze how citations to these codes in the academic literature have evolved since 2010. An interactive version of the underlying data set is available at https://atomistic.software .
Submitted 27 August, 2021;
originally announced August 2021.
-
Identifying outstanding transition-metal-alloy heterogeneous catalysts for the oxygen reduction and evolution reactions via subgroup discovery
Authors:
Lucas Foppa,
Luca M. Ghiringhelli
Abstract:
In order to estimate the reactivity of a large number of potentially complex heterogeneous catalysts while searching for novel and more efficient materials, physical as well as data-centric models have been developed for a faster evaluation of adsorption energies compared to first-principles calculations. However, global models designed to describe as many materials as possible might overlook the very few compounds that have the appropriate adsorption properties to be suitable for a given catalytic process. Here, the subgroup-discovery (SGD) local artificial-intelligence approach is used to identify the key descriptive parameters and constraints on their values, the so-called SG rules, which particularly describe transition-metal surfaces with outstanding adsorption properties for the oxygen reduction and evolution reactions. We start from a data set of 95 oxygen adsorption energy values evaluated by density-functional-theory calculations for several monometallic surfaces along with 16 atomic, bulk and surface properties as candidate descriptive parameters. From this data set, SGD identifies constraints on the most relevant parameters describing materials and adsorption sites that (i) result in O adsorption energies within the Sabatier-optimal range required for the oxygen reduction reaction and (ii) present the largest deviations from the linear scaling relations between O and OH adsorption energies, which limit the performance in the oxygen evolution reaction. The SG rules not only reflect the local underlying physicochemical phenomena that result in the desired adsorption properties but also guide the challenging design of alloy catalysts.
Submitted 25 June, 2021;
originally announced June 2021.
-
Interpretability of machine-learning models in physical sciences
Authors:
Luca M. Ghiringhelli
Abstract:
In machine learning (ML), it is in general challenging to provide a detailed explanation of how a trained model arrives at its prediction. Thus, usually we are left with a black box, which from a scientific standpoint is not satisfactory. Even though numerous methods have been recently proposed to interpret ML models, somewhat surprisingly, interpretability in ML is far from being a consensual concept, with diverse and sometimes contrasting motivations for it. Reasonable candidate properties of interpretable models could be model transparency (i.e., how does the model work?) and post hoc explanations (i.e., what else can the model tell me?). Here, I review the current debate on ML interpretability and identify key challenges that are specific to ML applied to materials science.
Submitted 21 April, 2021;
originally announced April 2021.
-
Robust recognition and exploratory analysis of crystal structures via Bayesian deep learning
Authors:
Andreas Leitherer,
Angelo Ziletti,
Luca M. Ghiringhelli
Abstract:
Due to their ability to recognize complex patterns, neural networks can drive a paradigm shift in the analysis of materials science data. Here, we introduce ARISE, a crystal-structure identification method based on Bayesian deep learning. As a major step forward, ARISE is robust to structural noise and can treat more than 100 crystal structures, a number that can be extended on demand. While being trained on ideal structures only, ARISE correctly characterizes strongly perturbed single- and polycrystalline systems, from both synthetic and experimental resources. The probabilistic nature of the Bayesian-deep-learning model yields principled uncertainty estimates, which are found to be correlated with crystalline order of metallic nanoparticles in electron tomography experiments. Applying unsupervised learning to the internal neural-network representations reveals grain boundaries and (unapparent) structural regions sharing easily interpretable geometrical properties. This work enables the hitherto hindered analysis of noisy atomic structural data from computations or experiments.
Submitted 8 November, 2021; v1 submitted 17 March, 2021;
originally announced March 2021.
-
Materials genes of heterogeneous catalysis from clean experiments and artificial intelligence
Authors:
Lucas Foppa,
Luca M. Ghiringhelli,
Frank Girgsdies,
Maike Hashagen,
Pierre Kube,
Michael Hävecker,
Spencer J. Carey,
Andrey Tarasov,
Peter Kraus,
Frank Rosowski,
Robert Schlögl,
Annette Trunschke,
Matthias Scheffler
Abstract:
Heterogeneous catalysis is an example of a complex materials function, governed by an intricate interplay of several processes, e.g., the different surface chemical reactions, and the dynamic re-structuring of the catalyst material at reaction conditions. Modelling the full catalytic progression via first-principles statistical mechanics is impractical, if not impossible. Instead, we show here how a tailored artificial-intelligence approach can be applied, even to a small number of materials, to model catalysis and determine the key descriptive parameters ("materials genes") reflecting the processes that trigger, facilitate, or hinder catalyst performance. We start from a consistent experimental set of "clean data", containing nine vanadium-based oxidation catalysts. These materials were synthesized, fully characterized, and tested according to standardized protocols. By applying the symbolic-regression SISSO approach, we identify correlations between the few most relevant materials properties and their reactivity. This approach highlights the underlying physicochemical processes, and accelerates catalyst design.
Submitted 16 February, 2021;
originally announced February 2021.
-
Data-driven equation for drug-membrane permeability across drugs and membranes
Authors:
Arghya Dutta,
Jilles Vreeken,
Luca M. Ghiringhelli,
Tristan Bereau
Abstract:
Drug efficacy depends on its capacity to permeate across the cell membrane. We consider the prediction of passive drug-membrane permeability coefficients. Beyond the widely recognized correlation with hydrophobicity, we additionally consider the functional relationship between passive permeation and acidity. To discover easily interpretable equations that explain the data well, we use the recently proposed sure-independence screening and sparsifying operator (SISSO), an artificial-intelligence technique that combines symbolic regression with compressed sensing. Our study is based on a large in silico dataset of 0.4 million small molecules extracted from coarse-grained simulations. We rationalize the equation suggested by SISSO via an analysis of the inhomogeneous solubility-diffusion model in several asymptotic acidity regimes. We further extend our analysis to the dependence on lipid-membrane composition. Lipid-tail unsaturation plays a key role, but surprisingly contributes stepwise rather than proportionally. Our results are in line with previously observed changes in permeability, suggesting the distinction between liquid-disordered (Ld) and liquid-ordered (Lo) permeation. Together, compressed sensing with analytically derived asymptotes establish and validate an accurate, broadly applicable, and interpretable equation for passive permeability across both drug and lipid-tail chemistry.
Submitted 29 June, 2021; v1 submitted 3 December, 2020;
originally announced December 2020.
-
Numerical Quality Control for DFT-based Materials Databases
Authors:
Christian Carbogno,
Kristian Sommer Thygesen,
Björn Bieniek,
Claudia Draxl,
Luca M. Ghiringhelli,
Andris Gulans,
Oliver T. Hofmann,
Karsten W. Jacobsen,
Sven Lubeck,
Jens Jørgen Mortensen,
Mikkel Strange,
Elisabeth Wruss,
Matthias Scheffler
Abstract:
Electronic-structure theory is a strong pillar of materials science. Many different computer codes that employ different approaches are used by the community to solve various scientific problems. Still, the precision of different packages has only recently been scrutinized thoroughly, focusing on a specific task, namely selecting a popular density functional, and using unusually high, extremely precise numerical settings for investigating 71 monoatomic crystals. Little is known, however, about method- and code-specific uncertainties that arise under numerical settings that are commonly used in practice. We shed light on this issue by investigating the deviations in total and relative energies as a function of computational parameters. Using typical settings for basis sets and k-grids, we compare results for 71 elemental and 63 binary solids obtained by three different electronic-structure codes that employ fundamentally different strategies. On the basis of the observed trends, we propose a simple, analytical model for the estimation of the errors associated with the basis-set incompleteness. We cross-validate this model using ternary systems obtained from the NOMAD Repository and discuss how our approach enables the comparison of the heterogeneous data present in computational materials databases.
Submitted 31 January, 2022; v1 submitted 24 August, 2020;
originally announced August 2020.
-
Investigating the ranges of (meta)stable phase formation in (InxGa1-x)2O3: Impact of the cation coordination
Authors:
C. Wouters,
C. Sutton,
L. M. Ghiringhelli,
T. Markurt,
R. Schewski,
A. Hassa,
H. von Wenckstern,
M. Grundmann,
M. Scheffler,
M. Albrecht
Abstract:
We investigate the phase diagram of the heterostructural solid solution (InxGa1-x)2O3 both computationally, by combining cluster expansion and density functional theory, and experimentally, by means of TEM measurements of pulsed laser deposited (PLD) heteroepitaxial thin films. The shapes of the Gibbs free energy curves for the monoclinic, hexagonal and cubic bixbyite alloy as a function of composition can be explained in terms of the preferred cation coordination environments of indium and gallium. We show by atomically resolved STEM that the strong preference of indium for six-fold coordination results in ordered monoclinic and hexagonal lattices. This ordering impacts the configurational entropy in the solid solution and thereby the (InxGa1-x)2O3 phase diagram. The resulting phase diagram is characterized by very limited solubilities of gallium and indium in the monoclinic, hexagonal and cubic ground state phases respectively but exhibits wide metastable ranges at realistic growth temperatures. On the indium rich side of the phase diagram a wide miscibility gap is found, which results in phase separated layers. The experimentally observed indium solubilities in the PLD samples are in the range of x=0.45 and x=0.55 for monoclinic and hexagonal single-phase films, while for phase separated films we find x=0.5 for the monoclinic phase, x=0.65-0.7 for the hexagonal phase and x>0.9 for the cubic phase. These values are consistent with the computed metastable ranges for each phase.
Submitted 11 August, 2020;
originally announced August 2020.
-
TCMI: a non-parametric mutual-dependence estimator for multivariate continuous distributions
Authors:
Benjamin Regler,
Matthias Scheffler,
Luca M. Ghiringhelli
Abstract:
The identification of relevant features, i.e., the driving variables that determine a process or the properties of a system, is an essential part of the analysis of data sets with a large number of variables. A mathematically rigorous approach to quantifying the relevance of these features is mutual information. Mutual information determines the relevance of features in terms of their joint mutual dependence on the property of interest. However, mutual information requires as input probability distributions, which cannot be reliably estimated from continuous distributions such as physical quantities like lengths or energies. Here, we introduce total cumulative mutual information (TCMI), a measure of the relevance of mutual dependences that extends mutual information to random variables of continuous distribution based on cumulative probability distributions. TCMI is a non-parametric, robust, and deterministic measure that facilitates comparisons and rankings between feature sets with different cardinality. The ranking induced by TCMI allows for feature selection, i.e., the identification of variable sets that are nonlinear statistically related to a property of interest, taking into account the number of data samples as well as the cardinality of the set of variables. We evaluate the performance of our measure with simulated data, compare its performance with similar multivariate-dependence measures, and demonstrate the effectiveness of our feature-selection method on a set of standard data sets and a typical scenario in materials science.
Submitted 30 July, 2022; v1 submitted 30 January, 2020;
originally announced January 2020.
-
Artificial-intelligence-driven discovery of catalyst \textit{genes} with application to CO2 activation on semiconductor oxides
Authors:
Aliaksei Mazheika,
Yanggang Wang,
Rosendo Valero,
Luca M. Ghiringhelli,
Francesc Vines,
Francesc Illas,
Sergey V. Levchenko,
Matthias Scheffler
Abstract:
Catalytic-materials design requires predictive modeling of the interaction between catalyst and reactants. This is challenging due to the complexity and diversity of structure-property relationships across the chemical space. Here, we report a strategy for a rational design of catalytic materials using the artificial intelligence approach (AI) subgroup discovery. We identify catalyst \textit{genes} (features) that correlate with mechanisms that trigger, facilitate, or hinder the activation of carbon dioxide (CO$_2$) towards a chemical conversion. The AI model is trained on first-principles data for a broad family of oxides. We demonstrate that surfaces of experimentally identified good catalysts consistently exhibit combinations of \textit{genes} resulting in a strong elongation of a C-O bond. The same combinations of \textit{genes} also minimize the OCO-angle, the previously proposed indicator of activation, albeit under the constraint that the Sabatier principle is satisfied. Based on these findings, we propose a set of new promising catalyst materials for CO$_2$ conversion.
Submitted 3 May, 2022; v1 submitted 13 December, 2019;
originally announced December 2019.
-
Determining Surface Phase Diagrams Including Anharmonic Effects
Authors:
Yuanyuan Zhou,
Matthias Scheffler,
Luca M. Ghiringhelli
Abstract:
We introduce a massively parallel replica-exchange grand-canonical sampling algorithm to simulate materials at realistic conditions, in particular surfaces and clusters in reactive atmospheres. Its purpose is to determine in an automated fashion equilibrium phase diagrams for a given potential-energy surface (PES) and for any observable sampled in the grand-canonical ensemble. The approach enables an unbiased sampling of the phase space and is embarrassingly parallel. It is demonstrated for a Lennard-Jones model system describing a surface in contact with a gas phase. Furthermore, the algorithm is applied to Si$_M$ clusters ($M=2, 4$) in contact with an H$_{2}$ atmosphere, with all interactions described at the \textit{ab initio} level, i.e., via density-functional theory, with the PBE gradient-corrected exchange-correlation functional. We identify the most thermodynamically stable phases at finite $T, p$(H$_{2}$) conditions.
Submitted 29 August, 2019;
originally announced August 2019.
-
Simultaneous Learning of Several Materials Properties from Incomplete Databases with Multi-Task SISSO
Authors:
Runhai Ouyang,
Emre Ahmetcik,
Christian Carbogno,
Matthias Scheffler,
Luca M. Ghiringhelli
Abstract:
The identification of descriptors of materials properties and functions that capture the underlying physical mechanisms is a critical goal in data-driven materials science. Only such descriptors will enable a trustful and efficient scanning of materials spaces and possibly the discovery of new materials. Recently, the sure-independence screening and sparsifying operator (SISSO) has been introduced and was successfully applied to a number of materials-science problems. SISSO is a compressed-sensing based methodology yielding predictive models that are expressed in the form of analytical formulas, built from simple physical properties. These formulas are systematically selected from an immense number (billions or more) of candidates. In this work, we describe a powerful extension of the methodology to a 'multi-task learning' approach, which identifies a single descriptor capturing multiple target materials properties at the same time. This approach is specifically suited for a heterogeneous materials database with scarce or partial data, e.g., in which not all properties are reported for all materials in the training set. As showcase examples, we address the construction of materials-properties maps for the relative stability of octet-binary compounds, considering several crystal phases simultaneously, and the metal/insulator classification of binary materials distributed over many crystal-prototypes.
Submitted 3 January, 2019;
originally announced January 2019.
-
NOMAD 2018 Kaggle Competition: Solving Materials Science Challenges Through Crowd Sourcing
Authors:
Christopher Sutton,
Luca M. Ghiringhelli,
Takenori Yamamoto,
Yury Lysogorskiy,
Lars Blumenthal,
Thomas Hammerschmidt,
Jacek Golebiowski,
Xiangyue Liu,
Angelo Ziletti,
Matthias Scheffler
Abstract:
Machine learning (ML) is increasingly used in the field of materials science, where statistical estimates of computed properties are employed to rapidly examine the chemical space for new compounds. However, a systematic comparison of several ML models for this domain has been hindered by the scarcity of appropriate datasets of materials properties, as well as the lack of thorough benchmarking studies. To address this, a public data-analytics competition was organized by the Novel Materials Discovery (NOMAD) Centre of Excellence and hosted by the on-line platform Kaggle using a dataset of $3\,000$ (Al$_x$ Ga$_y$ In$_z$)$_2$ O$_3$ compounds (with $x+y+z = 1$). The aim of this challenge was to identify the best ML model for the prediction of two key physical properties that are relevant for optoelectronic applications: the electronic band gap energy and the crystalline formation energy. In this contribution, we present a summary of the top three ML approaches of the competition including the 1st place solution based on a crystal graph representation that is new for ML of the properties of materials. The 2nd place model combined many candidate descriptors from a set of compositional, atomic environment-based, and average structural properties with the light gradient-boosting machine regression model. The 3rd place model employed the smooth overlap of atomic positions representation with a neural network. To gain insight into whether the representation or the regression model determines the overall model performance, nine ML models obtained by combining the representations and regression models of the top three approaches were compared by looking at the correlations among prediction errors. At fixed representation, the largest correlation is observed in predictions made with kernel ridge regression and neural network, reflecting a similar performance on the same test set samples.
Submitted 30 November, 2018;
originally announced December 2018.
-
Two-to-three dimensional transition in neutral gold clusters: the crucial role of van der Waals interactions and temperature
Authors:
Bryan R. Goldsmith,
Jacob Florian,
Jin-Xun Liu,
Philipp Gruene,
Jonathan T. Lyon,
David M. Rayner,
André Fielicke,
Matthias Scheffler,
Luca M. Ghiringhelli
Abstract:
We predict the structures of neutral gas-phase gold clusters ($Au_n$, $n$ = 5$-$13) at finite temperatures based on free-energy calculations obtained by replica-exchange ab initio molecular dynamics. The structures of neutral $Au_5$$-$$Au_{13}$ clusters are assigned at 100 K based on a comparison of experimental far-infrared multiple photon dissociation spectra performed on Kr-tagged gold clusters with theoretical anharmonic IR spectra and free-energy calculations. The critical gold cluster size where the most stable isomer changes from planar to nonplanar is $Au_{11}$ (capped-trigonal prism, $D_{3h}$) at 100 K. However, at 300 K (i.e., room temperature), planar and nonplanar isomers may coexist even for $Au_8$, $Au_9$, and $Au_{10}$ clusters. Density-functional theory exchange-correlation functionals within the generalized gradient or hybrid approximation must be corrected for long-range van der Waals interactions to accurately predict relative gold cluster isomer stabilities. Our work gives insight into the stable structures of gas-phase gold clusters by highlighting the impact of temperature, and therefore the importance of free-energy over total energy studies, and long-range van der Waals interactions on gold cluster stability.
Submitted 19 November, 2018;
originally announced November 2018.
-
(Meta-)stability and Core-Shell Dynamics of Gold Nanoclusters at Finite Temperature
Authors:
Diego Guedes-Sobrinho,
Weiqi Wang,
Ian Hamilton,
Juarez L. F. Da Silva,
Luca M. Ghiringhelli
Abstract:
Gold nanoclusters have been the focus of numerous computational studies but an atomistic understanding of their structural and dynamical properties at finite temperature is far from satisfactory. To address this deficiency, we investigate gold nanoclusters via ab initio molecular dynamics, in a range of sizes where a core-shell morphology is observed. We analyze their structure and dynamics using state-of-the-art techniques, including unsupervised machine-learning nonlinear dimensionality reduction (sketch-map) for describing the similarities and differences among the range of sampled configurations. We find that, whereas the gold nanoclusters exhibit continuous structural rearrangement, they clearly show persistent motifs: a cationic core of one to five atoms is loosely bound to a shell which typically displays a substructure resulting from the competition between locally spherical vs planar fragments. Besides illuminating the properties of core-shell gold nanoclusters, the present study proposes a set of useful tools for understanding their nature in operando.
Submitted 11 November, 2018;
originally announced November 2018.
-
Artificial Intelligence for High-Throughput Discovery of Topological Insulators: the Example of Alloyed Tetradymites
Authors:
Guohua Cao,
Runhai Ouyang,
Luca M. Ghiringhelli,
Matthias Scheffler,
Huijun Liu,
Christian Carbogno,
Zhenyu Zhang
Abstract:
Significant advances have been made in predicting new topological materials using high-throughput empirical descriptors or symmetry-based indicators. To date, these approaches have been applied to materials in existing databases, and are severely limited to systems with well-defined symmetries, leaving a much larger materials space unexplored. Using tetradymites as a prototypical class of examples, we uncover a novel two-dimensional descriptor by applying an artificial intelligence (AI) based approach for fast and reliable identification of the topological characters of a drastically expanded range of materials, without prior determination of their specific symmetries and detailed band structures. By leveraging this descriptor that contains only the atomic number and electronegativity of the constituent species, we have readily scanned a huge number of alloys in the tetradymite family. Strikingly, nearly half of them are identified to be topological insulators, revealing a much larger territory of the topological materials world. The present work also attests the increasingly important role of such AI-based approaches in modern materials discovery.
Submitted 27 February, 2020; v1 submitted 14 August, 2018;
originally announced August 2018.
-
Analysis of Topological Transitions in Two-dimensional Materials by Compressed Sensing
Authors:
Carlos Mera Acosta,
Runhai Ouyang,
Adalberto Fazzio,
Matthias Scheffler,
Luca M. Ghiringhelli,
Christian Carbogno
Abstract:
Quantum spin-Hall insulators (QSHIs), i.e., two-dimensional topological insulators (TIs) with a symmetry-protected band inversion, have attracted considerable scientific interest in recent years. In this work, we have computed the topological Z2 invariant for 220 functionalized honeycomb lattices that are isoelectronic to functionalized graphene. Besides confirming the TI character of well-known materials such as functionalized stanene, our study identifies 45 yet unreported QSHIs. We applied a compressed-sensing approach to identify a physically meaningful descriptor for the Z2 invariant that only depends on the properties of the material's constituent atoms. This enables us to draw a map of materials, in which metals, trivial insulators, and QSHI form distinct regions. This analysis yields fundamental insights into the mechanisms driving topological transitions. The transferability of the identified model is explicitly demonstrated for an additional set of honeycomb lattices with different functionalizations that are not part of the original set of 220 graphene-type materials used to identify the descriptor. In this class, we predict 74 more novel QSHIs that have not been reported in the literature yet.
Submitted 28 May, 2018;
originally announced May 2018.
-
Structure and electronic properties of transition-metal/Mg bimetallic clusters at realistic temperatures and oxygen partial pressures
Authors:
Shikha Saini,
Debalaya Sarker,
Pooja Basera,
Sergey V. Levchenko,
Luca M. Ghiringhelli,
Saswata Bhattacharya
Abstract:
Composition, atomic structure, and electronic properties of TM$_x$Mg$_y$O$_z$ clusters (TM = Cr, Ni, Fe, Co, $x+y \leq 3$) at realistic temperature $T$ and partial oxygen pressure $p_{\textrm{O}_2}$ conditions are explored using the {\em ab initio} atomistic thermodynamics approach. The low-energy isomers of the different clusters are identified using a massively parallel cascade genetic algorithm at the hybrid density-functional level of theory. On analyzing a large set of data, we find that the fundamental gap E$_\textrm{g}$ of the thermodynamically stable clusters is strongly affected by the presence of Mg-coordinated O$_2$ moieties. In contrast, the nature of the transition metal does not play a significant role in determining E$_\textrm{g}$. Using E$_\textrm{g}$ of a cluster as a descriptor of its redox properties, our finding is against the conventional belief that the transition metal plays the key role in determining the electronic and therefore chemical properties of the clusters. High reactivity may be correlated more strongly with oxygen content in the cluster than with any specific TM type.
Submitted 4 April, 2018;
originally announced April 2018.
-
GAtor: A First Principles Genetic Algorithm for Molecular Crystal Structure Prediction
Authors:
Farren Curtis,
Xiayue Li,
Timothy Rose,
Álvaro Vázquez-Mayagoitia,
Saswata Bhattacharya,
Luca M. Ghiringhelli,
Noa Marom
Abstract:
We present the implementation of GAtor, a massively parallel, first principles genetic algorithm (GA) for molecular crystal structure prediction. GAtor is written in Python and currently interfaces with the FHI-aims code to perform local optimizations and energy evaluations using dispersion-inclusive density functional theory (DFT). GAtor offers a variety of fitness evaluation, selection, crossover, and mutation schemes. Breeding operators designed specifically for molecular crystals provide a balance between exploration and exploitation. Evolutionary niching is implemented in GAtor by using machine learning to cluster the dynamically updated population by structural similarity and then employing a cluster-based fitness function. Evolutionary niching promotes uniform sampling of the potential energy surface by evolving several sub-populations, which helps overcome initial pool biases and selection biases (genetic drift). The various settings offered by GAtor increase the likelihood of locating numerous low-energy minima, including those located in disconnected, hard to reach regions of the potential energy landscape. The best structures generated are re-relaxed and re-ranked using a hierarchy of increasingly accurate DFT functionals and dispersion methods. GAtor is applied to a chemically diverse set of four past blind test targets, characterized by different types of intermolecular interactions. The experimentally observed structures and other low-energy structures are found for all four targets. In particular, for Target II, 5-cyano-3-hydroxythiophene, the top ranked putative crystal structure is a $Z^\prime$=2 structure with P$\bar{1}$ symmetry and a scaffold packing motif, which has not been reported previously.
Submitted 23 February, 2018;
originally announced February 2018.
-
New Tolerance Factor to Predict the Stability of Perovskite Oxides and Halides
Authors:
Christopher J. Bartel,
Christopher Sutton,
Bryan R. Goldsmith,
Runhai Ouyang,
Charles B. Musgrave,
Luca M. Ghiringhelli,
Matthias Scheffler
Abstract:
Predicting the stability of the perovskite structure remains a longstanding challenge for the discovery of new functional materials for many applications including photovoltaics and electrocatalysts. We developed an accurate, physically interpretable, and one-dimensional tolerance factor, τ, that correctly predicts 92% of compounds as perovskite or nonperovskite for an experimental dataset of 576 $ABX_3$ materials ($\textit{X} =$ $O^{2-}$, $F^-$, $Cl^-$, $Br^-$, $I^-$) using a novel data analytics approach based on SISSO (sure independence screening and sparsifying operator). τ is shown to generalize outside the training set for 1,034 experimentally realized single and double perovskites (91% accuracy) and is applied to identify 23,314 new double perovskites ($A_2$$\textit{BB'}$$X_6$) ranked by their probability of being stable as perovskite. This work guides experimentalists and theorists towards which perovskites are most likely to be successfully synthesized and demonstrates an approach to descriptor identification that can be extended to arbitrary applications beyond perovskite stability predictions.
Submitted 6 January, 2019; v1 submitted 23 January, 2018;
originally announced January 2018.
-
SISSO: a compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates
Authors:
Runhai Ouyang,
Stefano Curtarolo,
Emre Ahmetcik,
Matthias Scheffler,
Luca M. Ghiringhelli
Abstract:
The lack of reliable methods for identifying descriptors - the sets of parameters capturing the underlying mechanisms of a materials property - is one of the key factors hindering efficient materials development. Here, we propose a systematic approach for discovering descriptors for materials properties, within the framework of compressed-sensing based dimensionality reduction. SISSO (sure independence screening and sparsifying operator) tackles immense and correlated feature spaces, and converges to the optimal solution from a combination of features relevant to the materials' property of interest. In addition, SISSO gives stable results also with small training sets. The methodology is benchmarked with the quantitative prediction of the ground-state enthalpies of octet binary materials (using ab initio data) and applied to the showcase example of predicting the metal/insulator classification of binaries (with experimental data). Accurate, predictive models are found in both cases. For the metal-insulator classification model, the predictive capability is tested beyond the training data: It rediscovers the available pressure-induced insulator->metal transitions and it allows for the prediction of yet unknown transition candidates, ripe for experimental validation. As a step forward with respect to previous model-identification methods, SISSO can become an effective tool for automatic materials development.
Submitted 27 June, 2018; v1 submitted 9 October, 2017;
originally announced October 2017.
-
Insightful classification of crystal structures using deep learning
Authors:
A. Ziletti,
D. Kumar,
M. Scheffler,
L. M. Ghiringhelli
Abstract:
Computational methods that automatically extract knowledge from data are critical for enabling data-driven materials science. A reliable identification of lattice symmetry is a crucial first step for materials characterization and analytics. Current methods require a user-specified threshold, and are unable to detect average symmetries for defective structures. Here, we propose a machine-learning-based approach to automatically classify structures by crystal symmetry. First, we represent crystals by calculating a diffraction image, then construct a deep-learning neural-network model for classification. Our approach is able to correctly classify a dataset comprising more than 100 000 simulated crystal structures, including heavily defective ones. The internal operations of the neural network are unraveled through attentive response maps, demonstrating that it uses the same landmarks a materials scientist would use, although never explicitly instructed to do so. Our study paves the way for crystal-structure recognition of - possibly noisy and incomplete - three-dimensional structural data in big-data materials science.
Submitted 30 May, 2018; v1 submitted 7 September, 2017;
originally announced September 2017.
-
Theoretical evidence for unexpected O-rich phases at corners of MgO surfaces
Authors:
Saswata Bhattacharya,
Daniel Berger,
Karsten Reuter,
Luca M. Ghiringhelli,
Sergey V. Levchenko
Abstract:
Realistic oxide materials are often semiconductors, in particular at elevated temperatures, and their surfaces contain undercoordinated atoms at structural defects such as steps and corners. Using hybrid density-functional theory and ab initio atomistic thermodynamics, we investigate the interplay of bond-making, bond-breaking, and charge-carrier trapping at the corner defects at the (100) surface of p-doped MgO in thermodynamic equilibrium with an O2 atmosphere. We show that by manipulating the coordination of surface atoms one can drastically change and even reverse the order of stability of reduced versus oxidized surface sites.
Submitted 18 June, 2017;
originally announced June 2017.
-
Identifying Consistent Statements about Numerical Data with Dispersion-Corrected Subgroup Discovery
Authors:
Mario Boley,
Bryan R. Goldsmith,
Luca M. Ghiringhelli,
Jilles Vreeken
Abstract:
Existing algorithms for subgroup discovery with numerical targets do not optimize the error or target variable dispersion of the groups they find. This often leads to unreliable or inconsistent statements about the data, rendering practical applications, especially in scientific domains, futile. Therefore, we here extend the optimistic estimator framework for optimal subgroup discovery to a new class of objective functions: we show how tight estimators can be computed efficiently for all functions that are determined by subgroup size (non-decreasing dependence), the subgroup median value, and a dispersion measure around the median (non-increasing dependence). In the important special case when dispersion is measured using the average absolute deviation from the median, this novel approach yields a linear time algorithm. Empirical evaluation on a wide range of datasets shows that, when used within branch-and-bound search, this approach is highly efficient and indeed discovers subgroups with much smaller errors.
Submitted 23 April, 2017; v1 submitted 26 January, 2017;
originally announced January 2017.
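A small sketch may help make the kind of objective described in this abstract concrete. The function below is an illustrative assumption, not the paper's exact objective: it multiplies the subgroup's relative size, its median shift above the population median, and a term that shrinks as the average absolute deviation from the median grows. The names `abs_median_dev` and `objective` and the example data are hypothetical.

```python
from statistics import median


def abs_median_dev(values):
    """Average absolute deviation of the values from their median."""
    m = median(values)
    return sum(abs(v - m) for v in values) / len(values)


def objective(subgroup, population):
    """Illustrative dispersion-aware score: coverage * median shift * dispersion term.

    Non-decreasing in subgroup size and median, non-increasing in dispersion.
    This specific product form is an assumption made for the sketch.
    """
    coverage = len(subgroup) / len(population)
    shift = max(median(subgroup) - median(population), 0.0)
    pop_disp = abs_median_dev(population)
    if pop_disp == 0:
        return 0.0
    disp_term = max(1.0 - abs_median_dev(subgroup) / pop_disp, 0.0)
    return coverage * shift * disp_term


# Hypothetical target values for a population and two candidate subgroups.
population = [1, 2, 3, 4, 10, 11, 12, 13]
tight = [10, 11, 12, 13]      # small spread around its median
loose = [4, 10, 11, 12, 13]   # larger, but more dispersed

score_tight = objective(tight, population)
score_loose = objective(loose, population)
```

With these numbers the tighter subgroup scores higher even though it is smaller, which is the behaviour a dispersion-corrected objective is meant to reward.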
-
Uncovering structure-property relationships of materials by subgroup discovery
Authors:
B. R. Goldsmith,
M. Boley,
J. Vreeken,
M. Scheffler,
L. M. Ghiringhelli
Abstract:
Subgroup discovery (SGD) is presented here as a data-mining approach to help find interpretable local patterns, correlations, and descriptors of a target property in materials-science data. Specifically, we will be concerned with data generated by density-functional theory calculations. First, we demonstrate that SGD can identify physically meaningful models that classify the crystal structures of 82 octet binary semiconductors as either rocksalt or zincblende. SGD identifies an interpretable two-dimensional model derived from only the atomic radii of valence s and p orbitals that properly classifies the crystal structures for 79 of the 82 octet binary semiconductors. The SGD framework is subsequently applied to 24 400 configurations of neutral gas-phase gold clusters with 5 to 14 atoms to discern general patterns between geometrical and physicochemical properties. For example, SGD helps find that van der Waals interactions within gold clusters are linearly correlated with their radius of gyration and are weaker for planar clusters than for nonplanar clusters. Also, a descriptor that predicts a local linear correlation between the chemical hardness and the cluster isomer stability is found for the even-sized gold clusters.
Submitted 13 December, 2016;
originally announced December 2016.
-
Learning physical descriptors for materials science by compressed sensing
Authors:
Luca M. Ghiringhelli,
Jan Vybiral,
Emre Ahmetcik,
Runhai Ouyang,
Sergey V. Levchenko,
Claudia Draxl,
Matthias Scheffler
Abstract:
The availability of big data in materials science offers new routes for analyzing materials properties and functions and achieving scientific understanding. Finding structure in these data that is not directly visible by standard tools and exploitation of the scientific information requires new and dedicated methodology based on approaches from statistical learning, compressed sensing, and other recent methods from applied mathematics, computer science, statistics, signal processing, and information science. In this paper, we explain and demonstrate a compressed-sensing based methodology for feature selection, specifically for discovering physical descriptors, i.e., physical parameters that describe the material and its properties of interest, and associated equations that explicitly and quantitatively describe those relevant properties. As a showcase application and proof of concept, we describe how to build a physical model for the quantitative prediction of the crystal structure of binary compound semiconductors.
Submitted 13 December, 2016;
originally announced December 2016.
-
Towards a Common Format for Computational Material Science Data
Authors:
Luca M. Ghiringhelli,
Christian Carbogno,
Sergey Levchenko,
Fawzi Mohamed,
Georg Huhs,
Martin Lueders,
Micael Oliveira,
Matthias Scheffler
Abstract:
Information and data exchange is an important aspect of scientific progress. In computational materials science, a prerequisite for smooth data exchange is standardization, which means using agreed conventions for, e.g., units, zero base lines, and file formats. There are two main strategies to achieve this goal. One accepts the heterogeneous nature of the community, which comprises scientists from physics, chemistry, bio-physics, and materials science, by complying with the diverse ecosystem of computer codes and thus develops "converters" for the input and output files of all important codes. These converters then translate the data of all important codes into a standardized, code-independent format. The other strategy is to provide standardized open libraries that code developers can adopt for shaping their inputs, outputs, and restart files directly into the same code-independent format. We would like to emphasize in this paper that these two strategies can and should be regarded as complementary, if not even synergetic. The main concepts and software developments of both strategies are largely identical, and, obviously, both approaches should give the same final result. In this paper, we present the appropriate format and conventions that were agreed upon by two teams, the Electronic Structure Library (ESL) of CECAM and the NOMAD (NOvel MAterials Discovery) Laboratory, a European Centre of Excellence (CoE). This discussion also includes the definition of hierarchical metadata describing state-of-the-art electronic-structure calculations.
Submitted 16 July, 2016;
originally announced July 2016.
-
Strengthening gold-gold bonds by complexing gold clusters with noble gases
Authors:
Luca M. Ghiringhelli,
Sergey V. Levchenko
Abstract:
We report an unexpectedly strong and complex chemical bonding of rare-gas atoms to neutral gold clusters. The bonding features are consistently reproduced at different levels of approximation within density-functional theory and beyond: from GGA, through hybrid and double-hybrid functionals, up to renormalized second-order perturbation theory. The main finding is that the adsorption of Ar, Kr, and Xe reduces electron-electron repulsion within the gold dimer, causing strengthening of the Au-Au bond. Unlike in the dimer, the rare-gas adsorption effects on the gold trimer's geometry and vibrational frequencies are mainly due to electron occupation of the trimer's lowest unoccupied molecular orbital. For the trimer, the theoretical results are also consistent with far-infrared multiple photon dissociation experiments.
Submitted 12 March, 2015;
originally announced March 2015.
-
Computational Design of Nanoclusters by Property-Based Genetic Algorithms: Tuning the Electronic Properties of (TiO$_2$)$_n$ Clusters
Authors:
Saswata Bhattacharya,
Benjamin H. Sonin,
Christopher J. Jumonville,
Luca M. Ghiringhelli,
Noa Marom
Abstract:
In order to design clusters with desired properties, we have implemented a suite of genetic algorithms tailored to optimize for low total energy, high vertical electron affinity (VEA), and low vertical ionization potential (VIP). Applied to (TiO$_2$)$_n$ clusters, the property-based optimization reveals the underlying structure-property relations and the structural features that may serve as active sites for catalysis. High VEA and low VIP are correlated with the presence of several dangling-O atoms and their proximity, respectively. We show that the electronic properties of (TiO$_2$)$_n$ up to n=20 correlate more strongly with the presence of these structural features than with size.
Submitted 23 January, 2015;
originally announced January 2015.
-
Big Data of Materials Science - Critical Role of the Descriptor
Authors:
Luca M. Ghiringhelli,
Jan Vybiral,
Sergey V. Levchenko,
Claudia Draxl,
Matthias Scheffler
Abstract:
Statistical learning of materials properties or functions so far starts with a largely silent, non-challenged step: the choice of the set of descriptive parameters (termed descriptor). However, when the scientific connection between the descriptor and the actuating mechanisms is unclear, causality of the learned descriptor-property relation is uncertain. Thus, trustworthy prediction of new promising materials, identification of anomalies, and scientific advancement are doubtful. We analyse this issue and define requirements for a suitable descriptor. For a classical example, the energy difference of zincblende/wurtzite and rocksalt semiconductors, we demonstrate how a meaningful descriptor can be found systematically.
Submitted 5 February, 2015; v1 submitted 26 November, 2014;
originally announced November 2014.
-
Efficient ab initio schemes for finding thermodynamically stable and metastable atomic structures: Benchmark of cascade genetic algorithms
Authors:
Saswata Bhattacharya,
Sergey V. Levchenko,
Luca M. Ghiringhelli,
Matthias Scheffler
Abstract:
A first-principles based methodology for efficiently and accurately finding thermodynamically stable and metastable atomic structures is introduced and benchmarked. The approach is demonstrated for gas-phase metal-oxide clusters in thermodynamic equilibrium with a reactive (oxygen) atmosphere at finite pressure and temperature. It consists of two steps. First, the potential-energy surface is scanned by means of a global-optimization technique, i.e., a massively parallel first-principles cascade genetic algorithm for which the choice of all parameters is validated against higher-level methods. In particular, we validate a) the criteria for selection and combination of structures used for the assemblage of new candidate structures, and b) the choice of the exchange-correlation functional. The selection criteria are validated against a fully unbiased method: replica-exchange molecular dynamics. Our choice of the exchange-correlation functional, the van-der-Waals-corrected PBE0 hybrid functional, is justified by comparisons up to the highest level currently achievable within density-functional theory, i.e., the renormalized second-order perturbation theory, rPT2. In the second step, the low-energy structures are analyzed by means of ab initio atomistic thermodynamics in order to determine compositions and structures that minimize the Gibbs free energy at given temperature and pressure of the reactive atmosphere.
Submitted 30 September, 2014;
originally announced September 2014.
-
A quantum reactive scattering perspective on electronic nonadiabaticity
Authors:
Yang Peng,
Luca M. Ghiringhelli,
Heiko Appel
Abstract:
Based on quantum reactive-scattering theory, we propose a method for studying the electronic nonadiabaticity in collision processes involving electron-ion rearrangements. We investigate the state-to-state transition probability for electron-ion rearrangements with two comparable approaches. In the first approach the information of the electron is only contained in the ground-state Born-Oppenheimer potential-energy surface, which is the starting point of common reactive-scattering calculations. In the second approach, the electron is explicitly taken into account and included in the calculations at the same level as the ions. Hence, the deviation in the results between the two approaches directly reflects the electronic nonadiabaticity during the collision process. To illustrate the method, we apply it to the well-known proton-transfer model of Shin and Metiu (one electron and three ions), generalized by us in order to allow for reactive scattering channels. It is shown that our explicit electron approach is able to capture electronic nonadiabaticity and the renormalization of the reaction barrier near the classical turning points of the potential in nuclear configuration space. In contrast, system properties near the equilibrium geometry of the asymptotic scattering channels are hardly affected by electronic nonadiabatic effects. We also present an analytical expression for the transition amplitude of the asymmetric proton-transfer model based on the direct evaluation of integrals over the involved Airy functions.
Submitted 13 March, 2014;
originally announced March 2014.
-
Stability and metastability of clusters in a reactive atmosphere: theoretical evidence for unexpected stoichiometries of MgMOx
Authors:
Saswata Bhattacharya,
Sergey V. Levchenko,
Luca M. Ghiringhelli,
Matthias Scheffler
Abstract:
By applying a genetic algorithm in a cascade approach of increasing accuracy, we calculate the composition and structure of MgMOx clusters at realistic temperatures and oxygen pressures. The stable and metastable systems are identified by ab initio atomistic thermodynamics. We find that small clusters (M <= 5) are in thermodynamic equilibrium when x > M. The non-stoichiometric clusters exhibit peculiar magnetic behavior, suggesting the possibility of tuning magnetic properties by changing environmental pressure and temperature conditions. Furthermore, we show that density-functional theory (DFT) with a hybrid exchange-correlation (xc) functional is needed for predicting accurate phase diagrams of metal-oxide clusters. Neither a (sophisticated) force field nor DFT with (semi)local xc functionals is sufficient for even a qualitative prediction.
Submitted 21 August, 2013; v1 submitted 17 April, 2013;
originally announced April 2013.
-
Autocatalytic and cooperatively-stabilized dissociation of water on a stepped platinum surface
Authors:
Davide Donadio,
Luca M. Ghiringhelli,
Luigi Delle Site
Abstract:
Water-metal interfaces are ubiquitous and play a key role in many chemical processes, from catalysis to corrosion. Whereas water adlayers on atomically flat transition metal surfaces have been investigated in depth, little is known about the chemistry of water on stepped surfaces, commonly occurring in realistic situations. Using first-principles simulations we study the adsorption of water on a stepped platinum surface. We find that water adsorbs preferentially at the step edge, forming linear clusters or chains, stabilized by the cooperative effect of chemical bonds with the substrate and hydrogen bonds. In contrast with flat Pt, at steps water molecules dissociate forming mixed hydroxyl/water structures, through an autocatalytic mechanism promoted by hydrogen bonding. Nuclear quantum effects contribute to stabilizing partially dissociated clusters and chains. Together with the recently demonstrated ability of water chains adsorbed on stepped Pt surfaces to transfer protons via thermally activated hopping, these findings make these systems candidates for viable proton wires.
Submitted 12 November, 2012;
originally announced November 2012.
-
Electronic Energy Functionals: Levy-Lieb principle within the Ground State Path Integral Quantum Monte Carlo
Authors:
Luigi Delle Site,
Luca M. Ghiringhelli,
David Ceperley
Abstract:
We propose a theoretical/computational protocol based on the use of the Ground State (GS) Path Integral (PI) Quantum Monte Carlo (QMC) for the calculation of the kinetic and Coulomb energy density for a system of $N$ interacting electrons in an external potential. The idea is based on the derivation of the energy densities via the $N-1$-conditional probability density within the framework of the Levy-Lieb constrained search principle. The consequences for the development of energy functionals within the context of Density Functional Theory (DFT) are discussed. We also propose the possibility of going beyond the energy densities and extend this idea to a computational procedure where the $N-1$-conditional probability is an implicit functional of the electron density, independently of the external potential. In principle, such a procedure paves the way for an on-the-fly determination of the energy functional for any system.
Submitted 29 May, 2012;
originally announced May 2012.
-
Local structure of liquid carbon controls diamond nucleation
Authors:
L. M. Ghiringhelli,
C. Valeriani,
E. J. Meijer,
D. Frenkel
Abstract:
Diamonds melt at temperatures above 4000 K. There are no measurements of the steady-state rate of the reverse process: diamond nucleation from the melt, because experiments are difficult at these extreme temperatures and pressures. Using numerical simulations, we estimate the diamond nucleation rate and find that it increases by many orders of magnitude when the pressure is increased at constant supersaturation. The reason is that an increase in pressure changes the local coordination of carbon atoms from three-fold to four-fold. It turns out to be much easier to nucleate diamond in a four-fold coordinated liquid than in a liquid with three-fold coordination, because in the latter case the free-energy cost to create a diamond-liquid interface is higher. We speculate that this mechanism for nucleation control is relevant for crystallization in many network-forming liquids. On the basis of our calculations, we conclude that homogeneous diamond nucleation is likely in carbon-rich stars and unlikely in gaseous planets.
Submitted 12 July, 2009;
originally announced July 2009.
-
Surface Induced Crystallization in Supercooled Tetrahedral Liquids
Authors:
T. Li,
D. Donadio,
L. M. Ghiringhelli,
G. Galli
Abstract:
Freezing is a fundamental physical phenomenon that has been studied over many decades; yet the role played by surfaces in determining nucleation has remained elusive. Here we report direct computational evidence of surface induced nucleation in supercooled systems with a negative slope of their melting line (dP/dT < 0). This unexpected result is related to the density decrease occurring upon crystallization, and to surface tension facilitating the initial nucleus formation. Our findings support the hypothesis of surface induced crystallization of ice in the atmosphere, and provide insight, at the atomistic level, into nucleation mechanisms of widely used semiconductors.
Submitted 3 December, 2008; v1 submitted 3 December, 2008;
originally announced December 2008.
-
State-of-the-art models for the phase diagram of carbon and diamond nucleation
Authors:
Luca M. Ghiringhelli,
C. Valeriani,
J. H. Los,
E. J. Meijer,
A. Fasolino,
D. Frenkel
Abstract:
We review recent developments in the modelling of the phase diagram and the kinetics of crystallization of carbon. In particular, we show that a particular class of bond-order potentials (the so-called LCBOP models) accounts well for many of the known structural and thermodynamic properties of carbon at high pressures and temperatures. We discuss the LCBOP models in some detail. In addition, we briefly review the "history" of experimental and theoretical studies of the phase behaviour of carbon. Using a well-tested version of the LCBOP model (viz. LCBOPI+) we address some of the more controversial hypotheses concerning the phase behaviour of carbon, in particular: the suggestion that liquid carbon can exist in two phases separated by a first-order phase transition and the conjecture that diamonds could have formed by homogeneous nucleation in Uranus and Neptune.
Submitted 15 July, 2009; v1 submitted 10 April, 2008;
originally announced April 2008.