Journal Article, DOI: 10.1021/acs.chemrev.4c00572
Data Generation for Machine Learning Interatomic Potentials and Beyond
Maksim Kulichenko, Benjamin Nebgen, Nicholas Lubbers, Justin S. Smith, Kipton Barros, Alice Allen, Adela Habib, Emily Shinkle, Nikita Fedik, Ying Wai Li, Richard A. Messerly, Sergei Tretiak
TL;DR: This review explores the evolution of data-driven chemistry, focusing on machine learning-based interatomic potentials and the importance of high-quality training data for accurate modeling of chemical and structural properties at the atomic level.
Abstract: The field of data-driven chemistry is undergoing an evolution, driven by innovations in machine learning (ML) models for predicting molecular properties and behavior. Recent strides in machine learning interatomic potentials (MLIPs) have paved the way for accurate modeling of diverse chemical and structural properties at the atomic level. The key determinant of MLIP reliability remains the quality of the training data. A paramount challenge lies in constructing training sets that capture specific domains in the vast chemical and structural space. This Review navigates the intricate landscape of essential components and integrity of training data that ensure the extensibility and transferability of the resulting models. We delve into the details of active learning, discussing its various facets and implementations. We outline different types of uncertainty quantification applied to atomistic data acquisition and the correlations between estimated uncertainty and true error. The role of atomistic data samplers in generating diverse and informative structures is highlighted. Furthermore, we discuss data acquisition via modified and surrogate potential energy surfaces as an innovative approach to diversify training data. The Review also provides a list of publicly available data sets that cover essential domains of chemical space.
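As a rough illustration of the uncertainty-driven data acquisition the abstract mentions, the sketch below uses ensemble disagreement (often called query by committee) as the uncertainty estimate. It is a minimal sketch under stated assumptions, not the method of the Review: the function names, the threshold, and the toy energies are all hypothetical, and real workflows would obtain predictions from trained MLIPs and reference values from quantum chemistry calculations.

```python
import numpy as np

def committee_uncertainty(predictions):
    """Per-structure standard deviation across an ensemble of models.

    predictions: array-like of shape (n_models, n_structures) holding
    predicted energies (hypothetical toy values in this sketch).
    """
    predictions = np.asarray(predictions, dtype=float)
    return predictions.std(axis=0)

def select_for_labeling(predictions, threshold):
    """Indices of structures whose ensemble disagreement exceeds threshold."""
    sigma = committee_uncertainty(predictions)
    return np.flatnonzero(sigma > threshold)

def uncertainty_error_correlation(predictions, reference):
    """Pearson correlation between estimated uncertainty and true error.

    The true error is taken as |ensemble mean - reference| per structure.
    """
    predictions = np.asarray(predictions, dtype=float)
    sigma = predictions.std(axis=0)
    abs_err = np.abs(predictions.mean(axis=0) - np.asarray(reference, dtype=float))
    return np.corrcoef(sigma, abs_err)[0, 1]

# Toy example: three models, three structures. The models agree on the
# first two structures and disagree on the third.
toy_predictions = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, 5.0],
    [1.0, 2.0, 4.0],
]
toy_reference = [1.0, 2.0, 6.0]

selected = select_for_labeling(toy_predictions, threshold=0.5)
correlation = uncertainty_error_correlation(toy_predictions, toy_reference)
print("structures to label:", selected.tolist())
print("uncertainty/error correlation:", round(float(correlation), 3))
```

In an active learning loop, the selected structures would be sent for reference calculations and added to the training set before the ensemble is retrained. The correlation function gives one simple way to check how well the estimated uncertainty tracks the true error.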
Citations
Dual‐Atom Nanozymes (DAzymes): from Synthesis to Applications
Pir Muhammad, Jun Zhang, Yan Wang, Sumaira Hanif, Ping Zhang, Guoping Chen, Kehui Yuan, Saud Asif Ahmed, Junfeng Zhang, Chenchen Li, Zhendong Lei, Kelong Fan, Yanli Wang
TL;DR: This review explores the synthesis, catalytic mechanisms, and applications of dual-atom nanozymes (DAzymes), highlighting their exceptional performance, stability, and adaptability in energy, environmental, biomedical, and biosensing fields, while discussing challenges and future directions.
A Robust Machine Learned Interatomic Potential for Nb: Collision Cascade Simulations with Accurate Non-equilibrium properties
Utkarsh Bhardwaj, Vinayak Mishra, Suman Mondal, M. Warrier
TL;DR: Researchers develop a machine learning interatomic potential for niobium, accurately capturing non-equilibrium properties crucial for radiation damage simulations, outperforming existing potentials in accuracy and efficiency for large-scale MD simulations and collision cascade simulations.
Fast and Fourier features for transfer learning of interatomic potentials
Pietro Novelli, Giacomo Meanti, Pedro J. Buigues, Lorenzo Rosasco, Michele Parrinello, Massimiliano Pontil, Leo H. Bonati
TL;DR: Researchers introduce franken, a transfer learning framework that extracts atomic descriptors from pre-trained graph neural networks and transfers them to new systems using random Fourier features, achieving fast and accurate adaptation with minimal hyperparameter tuning.
Materials Interface Engineering: Impact of Interfacial Molecular Orientation on Organic Electronic Devices
Attia Shaheen, Nadia Anwar, Fang Chen, Yue Chan, Haibing Xie, Shern-Long Lee
Abstract: Molecular‐based electronic devices exploit the unique properties of single molecules or assemblies to surpass conventional solid‐state systems in miniaturization, efficiency, and functional diversity. Their performance hinges on controlling molecular orientation and packing at interfaces, which dictate charge transport and energy‐level alignment. Molecular orientation describes the directional alignment (e.g., face‐on, edge‐on) of organic semiconductors relative to substrates, while packing involves spatial arrangement, crystallinity, and interactions (π–π stacking, hydrogen bonding). Advanced techniques such as scanning probe microscopy (SPM), grazing‐incidence wide‐angle X‐ray scattering (GIWAXS), and Kelvin probe force microscopy (KPFM) elucidate these structural features, establishing correlations with optoelectronic properties like light absorption, exciton dynamics, and charge carrier mobility. The interplay between orientation and packing governs energy‐level alignment and charge transport via interfacial work function modulation. Emerging computational tools like machine learning (ML) and multiscale simulations enable predictive design of molecular configurations for targeted device functionalities. However, challenges remain in achieving uniform molecular alignment across practical device architectures and establishing robust structure‐property relationships under operational conditions. Addressing these requires integrating experimental characterization, computational modeling, and synthetic innovation. This review highlights the need for multidisciplinary approaches to advance molecular electronics toward practical, high‐performance applications, balancing fundamental insights with scalable fabrication.
High-throughput electronic property prediction of cyclic molecules with 3D-enhanced machine learning
Peikun Zheng, Olexandr Isayev
Abstract: Ring Vault contains 201,546 cyclic molecules across 11 elements. AIMNet2 with 3D information outperformed 2D models in predicting the electronic properties of cyclic molecules.
References
Self-Consistent Equations Including Exchange and Correlation Effects
Walter Kohn, L. J. Sham
TL;DR: Self-consistent equations analogous to the Hartree and Hartree-Fock equations are derived for an inhomogeneous system of interacting electrons; in these equations, the exchange and correlation portions of the chemical potential of a uniform electron gas appear as additional effective potentials.
Highly accurate protein structure prediction with AlphaFold
John M. Jumper, Richard O. Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Olaf Ronneberger, Kathryn Tunyasuvunakool, Russell Bates, Augustin Žídek, Anna Potapenko, Alex Bridgland, Clemens Meyer, Simon A. A. Kohl, Andrew J. Ballard, Andrew Cowie, Bernardino Romera-Paredes, Stanislav Nikolov, R. D. Jain, Jonas Adler, Trevor Back, Stig Petersen, David Reiman, Ellen Clancy, Michal Zielinski, Martin Steinegger, Michalina Pacholska, Tamas Berghammer, Sebastian Bodenstein, David L. Silver, Oriol Vinyals, Andrew W. Senior, Koray Kavukcuoglu, Pushmeet Kohli, Demis Hassabis
TL;DR: AlphaFold uses a novel deep learning architecture to predict protein structures with an accuracy competitive with experimental structures in the majority of cases, even when no homologous structure is available.
A Survey on Transfer Learning
Sinno Jialin Pan, Qiang Yang
TL;DR: The relationship between transfer learning and related machine learning techniques, such as domain adaptation, multitask learning, sample selection bias, and covariate shift, is discussed.
A climbing image nudged elastic band method for finding saddle points and minimum energy paths
TL;DR: In this article, a modification of the nudged elastic band method for finding minimum energy paths is presented, where one of the images is made to climb up along the elastic band to converge rigorously on the highest saddle point.
Development and testing of a general amber force field.
TL;DR: A general Amber force field for organic molecules is described, designed to be compatible with existing Amber force fields for proteins and nucleic acids, and has parameters for most organic and pharmaceutical molecules that are composed of H, C, N, O, S, P, and halogens.