Exploring Chemical Compound Space with Machine Learning Methods
Theoretical Chemistry: Electronic Structure, Dynamics, Simulation
Summary of Project Results
The general objective of this project was to enable rational exploration of chemical compound space (CCS) by developing efficient and accurate machine learning (ML) models. The first important step was to assess the capabilities and limitations of ML techniques for accurately predicting molecular energies in CCS. The ML models were trained on a large set of reference energies computed with hybrid density-functional theory (DFT) including van der Waals interactions, as well as with the quantum-chemical “gold standard” CCSD(T) method, which represents the best possible reference that is still computationally feasible. As originally planned, the project developed in four intertwined directions: (1) generation of a large set of reference molecular energies; (2) development of a ladder of physical models (descriptors) for organic molecules, from classical charge repulsion to approximate electronic models, for use as input to the ML model; (3) application and analysis of efficient ML models (kernel-based learning such as support vector machines and Gaussian processes, neural networks, etc.); and (4) physical analysis (exploration) of the chemical compound space using optimal ML models, both from the point of view of computational complexity (dimensionality, sparsity, etc.) and in terms of the underlying chemistry (for example, whether classes of molecules can be identified in CCS).
Our work has led to a number of developments: data-driven representations of physical systems, advances in incorporating prior knowledge of the application domain, a hierarchy of molecular and material descriptors, and a consolidated understanding of the demands that atomistic simulations place on statistical models. A novel and challenging aspect was that we allowed variations both in chemical composition and in configurational degrees of freedom (bonding and geometry). This required extending traditional ML models and developing novel, scalable, and physically meaningful representations, which in turn called for a joint effort between physics, chemistry, and computer science. In addition to modeling atomic interactions, we also aimed to foster the understanding of ML-based potentials by developing interpretable models. Our analysis revealed that the most effective statistical inference methods are able to recover and exercise chemical concepts in a fully data-driven way.
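To make the lowest rung of the descriptor ladder and the kernel-based learning step concrete, the following is a minimal sketch and not the project's actual code. It assumes that "classical charge repulsion" is realized as the Coulomb matrix known from the literature, uses its sorted eigenvalue spectrum as a permutation-invariant descriptor, and fits a Gaussian-kernel ridge regression model. All function names, hyperparameters (`sigma`, `lam`), and the toy Morse-like target are illustrative assumptions; the target is not real DFT or CCSD(T) data.

```python
import numpy as np

def coulomb_matrix(charges, coords):
    """Coulomb matrix: 0.5*Z_i**2.4 on the diagonal, Z_i*Z_j/|R_i-R_j| elsewhere."""
    z = np.asarray(charges, dtype=float)
    r = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)  # placeholder so the division below is safe
    m = np.outer(z, z) / dist
    np.fill_diagonal(m, 0.5 * z ** 2.4)
    return m

def descriptor(charges, coords):
    """Eigenvalues sorted in descending order; invariant to atom ordering."""
    return np.sort(np.linalg.eigvalsh(coulomb_matrix(charges, coords)))[::-1]

def gaussian_kernel(a, b, sigma):
    """Pairwise Gaussian kernel between the rows of a and the rows of b."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def krr_fit(x, y, sigma, lam):
    """Solve (K + lam*I) alpha = y for the regression weights alpha."""
    k = gaussian_kernel(x, x, sigma)
    return np.linalg.solve(k + lam * np.eye(len(x)), y)

def krr_predict(x_train, alpha, x_new, sigma):
    """Predict as a kernel-weighted sum over the training descriptors."""
    return gaussian_kernel(x_new, x_train, sigma) @ alpha

# Toy data set (assumption): a two-atom molecule at varying bond length with a
# made-up Morse-like energy curve standing in for reference energies.
bond_lengths = np.linspace(0.5, 3.0, 20)
x = np.array([descriptor([1, 1], [[0, 0, 0], [d, 0, 0]]) for d in bond_lengths])
y = (1.0 - np.exp(-(bond_lengths - 1.0))) ** 2

alpha = krr_fit(x, y, sigma=0.5, lam=1e-8)
pred = krr_predict(x, alpha, x, sigma=0.5)
```

In a realistic setting the descriptors would be padded to a common length across molecules of different size, and `sigma` and `lam` would be chosen by cross-validation rather than fixed by hand.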
Project-Related Publications (Selection)
- Schütt, K. T., Glawe, H., Brockherde, F., Sanna, A., Müller, K. R., & Gross, E. K. U. (2014). How to represent crystal structures for machine learning: Towards fast prediction of electronic properties. Physical Review B, 89(20), 205118.
- Hansen, K., Biegler, F., Ramakrishnan, R., Pronobis, W., Von Lilienfeld, O. A., Müller, K. R., & Tkatchenko, A. (2015). Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space. The Journal of Physical Chemistry Letters, 6(12), 2326–2331.
- Chmiela, S., Tkatchenko, A., Sauceda, H. E., Poltavsky, I., Schütt, K. T., & Müller, K.-R. (2017). Machine Learning of Accurate Energy-conserving Molecular Force Fields. Science Advances, 3(5), e1603015.
- Schütt, K. T., Arbabzadah, F., Chmiela, S., Müller, K. R., & Tkatchenko, A. (2017). Quantum-chemical insights from deep tensor neural networks. Nature Communications, 8, 13890.
- Schütt, K. T., Kindermans, P.-J., Sauceda, H. E., Chmiela, S., Tkatchenko, A., & Müller, K.-R. (2017). SchNet: A continuous-filter convolutional neural network for modeling quantum interactions. Advances in Neural Information Processing Systems, 31, 991–1001.
- Brockherde, F., Vogt, L., Li, L., Tuckerman, M. E., Burke, K., & Müller, K. R. (2017). Bypassing the Kohn-Sham equations with machine learning. Nature Communications, 8(1), 872.
- Chmiela, S., Sauceda, H. E., Müller, K.-R., & Tkatchenko, A. (2018). Towards Exact Molecular Dynamics Simulations with Machine-Learned Force Fields. Nature Communications, 9(1), 3887.
- Pronobis, W., Schütt, K. T., Tkatchenko, A., & Müller, K. R. (2018). Capturing intensive and extensive DFT/TDDFT molecular properties with machine learning. The European Physical Journal B, 91(8), 178.
- Sauceda, H. E., Chmiela, S., Poltavsky, I., Müller, K.-R., & Tkatchenko, A. (2019). Molecular Force Fields with Gradient-Domain Machine Learning: Construction and Application to Dynamics of Small Molecules with Coupled Cluster Forces. The Journal of Chemical Physics, 150, 114102.
- Chmiela, S., Sauceda, H. E., Poltavsky, I., Müller, K.-R., & Tkatchenko, A. (2019). sGDML: Constructing Accurate and Data Efficient Molecular Force Fields Using Machine Learning. Computer Physics Communications.