Project Details

Importance sampling of chemical compound space: Thermodynamic properties from high-throughput coarse-grained simulations

Subject Area Theoretical Chemistry: Electronic Structure, Dynamics, Simulation
Term from 2016 to 2021
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 285228850
 
Final Report Year 2022

Final Report Abstract

Coarse-grained (CG) modeling has been a workhorse of multiscale modeling for soft matter for two main reasons: it offers a reductionist (“physicist-type”) approach, and it lowers the computational cost of sampling conformational space. This project leverages a third benefit: compound screening. Transferable CG models can significantly reduce the size of chemical space by introducing degeneracy, because similar molecules map to the same CG representation. Our approach provides a systematic way to compute complex thermodynamic properties for many compounds, orders of magnitude more than previously achieved.

The insertion of small organic solutes into single-component phospholipid membranes is described by the potential of mean force (PMF), the free-energy profile across the interface. We generated PMFs for all one- and two-bead representations of the CG Martini force field. This analysis yielded (i) the general collection of PMF shapes; (ii) linear relations between important thermodynamic quantities, especially water/octanol partitioning; and (iii) a database of PMFs for 4 × 10^5 compounds. We extended our computational screening approach to passive permeability coefficients. Our large-scale analysis established, for the first time, a permeability surface that describes how permeability changes as a function of crucial physicochemical parameters. The resulting structure–property relationships link important functional groups to the target property and even enable inverse molecular design by suggesting relevant chemical groups for a desired permeability coefficient.

Scaling up the chemical space covered led to the design of an importance sampling strategy that combines Monte Carlo simulations and machine learning. On a fundamental level, we used information-theoretic tools to better understand what it means to build top-down CG models that target chemical space as a whole. Thermodynamic accuracy can be reached with a small number of bead types, which also enables a hierarchical screening strategy. Applications to membrane-specific and phase-altering compounds demonstrate that the method yields clear design rules and even suggests candidates for experimental testing.

Unlike high-throughput calculations targeting electronic properties (e.g., from density functional theory; DFT), our high-throughput coarse-graining (HTCG) scheme requires low computational investment. Whereas high-throughput DFT first runs all necessary calculations and only later seeks to coarse-grain the relevant information (e.g., by unsupervised learning), HTCG uses the underlying physics to coarse-grain before running any simulation. This leads to a lower computational load and, critically, a simplified structure–property relationship. In essence, the method combines physical understanding of the problem with a data-centric approach to screening. HTCG complements recent efforts in the field of explainable/interpretable machine learning.
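
The abstract does not state how permeability coefficients were derived from the PMFs. As an illustration only, one widely used route is the inhomogeneous solubility-diffusion model, in which the membrane resistance is 1/P = ∫ exp(G(z)/kT) / D(z) dz, with G(z) the PMF and D(z) a position-dependent diffusivity. The sketch below assumes that model; the function name, the synthetic Gaussian PMF, the constant diffusivity, and the units (nm, kJ/mol, nm^2/ns) are all hypothetical choices and are not taken from the project.

```python
import numpy as np

KT_300K = 2.494  # kJ/mol, thermal energy at about 300 K (assumed temperature)


def permeability(z, pmf, diffusivity, kT=KT_300K):
    """Permeability from the inhomogeneous solubility-diffusion model.

    Evaluates 1/P = integral of exp(G(z)/kT) / D(z) dz with the
    trapezoidal rule. z is the position along the membrane normal,
    pmf is G(z) in the same energy units as kT, and diffusivity is D(z).
    """
    z = np.asarray(z, dtype=float)
    integrand = np.exp(np.asarray(pmf, dtype=float) / kT) / np.asarray(
        diffusivity, dtype=float
    )
    # Trapezoidal rule written out explicitly to avoid NumPy version differences.
    resistance = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(z))
    return 1.0 / resistance


# Synthetic example (not project data): a 6 nm slab with constant diffusivity.
z = np.linspace(-3.0, 3.0, 601)           # nm
D = np.full_like(z, 0.5)                  # nm^2/ns, assumed constant
pmf_flat = np.zeros_like(z)               # no free-energy barrier
pmf_barrier = 10.0 * np.exp(-z**2 / (2 * 0.5**2))  # 10 kJ/mol barrier at the center

p_flat = permeability(z, pmf_flat, D)
p_barrier = permeability(z, pmf_barrier, D)
```

For a flat PMF and constant diffusivity the expression reduces to P = D/L, which gives a quick sanity check; a barrier at the membrane center lowers the permeability relative to that limit.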

Publications

  • In silico screening of drug-membrane thermodynamics reveals linear relations between bulk partitioning and the potential of mean force. J. Chem. Phys., 147(12):125101, 2017
    Roberto Menichetti, Kiran H Kanekal, Kurt Kremer, and Tristan Bereau
    (See online at https://doi.org/10.1063/1.4987012)
  • Non-covalent interactions across organic and biological subsets of chemical space: Physics-based potentials parametrized from machine learning. J. Chem. Phys., 148(24):241706, 2018
    Tristan Bereau, Robert A. DiStasio Jr., Alexandre Tkatchenko, and O. Anatole von Lilienfeld
    (See online at https://doi.org/10.1063/1.5009502)
  • Controlled exploration of chemical space by machine learning of coarse-grained representations. Phys. Rev. E, 100:033302, 2019
    Christian Hoffmann, Roberto Menichetti, Kiran H Kanekal, and Tristan Bereau
    (See online at https://doi.org/10.1103/physreve.100.033302)
  • Drug–membrane permeability across chemical space. ACS Cent. Sci., 5(2):290–298, 2019
    Roberto Menichetti, Kiran H Kanekal, and Tristan Bereau
    (See online at https://doi.org/10.1021/acscentsci.8b00718)
  • Resolution limit of data-driven coarse-grained models spanning chemical space. J. Chem. Phys., 151(16):164106, October 2019
    Kiran H. Kanekal and Tristan Bereau
    (See online at https://doi.org/10.1063/1.5119101)
  • Designing exceptional gas-separation polymer membranes using machine learning. Sci. Adv., 6(20):eaaz4301, May 2020
    J. Wesley Barnett, Connor R. Bilchak, Yiwen Wang, Brian C. Benicewicz, Laura A. Murdock, Tristan Bereau, and Sanat K. Kumar
    (See online at https://doi.org/10.1126/sciadv.aaz4301)
  • Hydration free energies from kernel-based machine learning: Compound-database bias. J. Chem. Phys., 153(1):014101, July 2020
    Clemens Rauer and Tristan Bereau
    (See online at https://doi.org/10.1063/5.0012230)
  • Inserting small molecules across membrane mixtures: Insight from the potential of mean force. Biophys. J., 118(6):1321–1332, March 2020
    Alessia Centi, Arghya Dutta, Sapun H. Parekh, and Tristan Bereau
    (See online at https://doi.org/10.1016/j.bpj.2020.01.039)
 
 
