The Open Reaction Database.
Steven Kearnes,Michael R. Maser,Michael Wleklinski,Anton Kast,Abigail G. Doyle,Spencer D. Dreher,Joel M. Hawkins,Klavs F. Jensen,Connor W. Coley +8 more
221
TL;DR: The Open Reaction Database (ORD) is an open-access schema and infrastructure for structuring and sharing organic reaction data, including a centralized data repository. The schema supports conventional and emerging technologies, from benchtop reactions to automated high-throughput experiments.
Abstract: Chemical reaction data in journal articles, patents, and even electronic laboratory notebooks are currently stored in various formats, often unstructured, which presents a significant barrier to downstream applications, including the training of machine-learning models. We present the Open Reaction Database (ORD), an open-access schema and infrastructure for structuring and sharing organic reaction data, including a centralized data repository. The ORD schema supports conventional and emerging technologies, from benchtop reactions to automated high-throughput experiments and flow chemistry. The data, schema, supporting code, and web-based user interfaces are all publicly available on GitHub. Our vision is that a consistent data representation and infrastructure to support data sharing will enable downstream applications that will greatly improve the state of the art with respect to computer-aided synthesis planning, reaction prediction, and other predictive chemistry tasks.
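To make the idea of a structured reaction record concrete, here is a minimal Python sketch. It is an illustration only: the class names, field names, and example values are hypothetical and are not taken from the ORD schema, which is defined in the project's GitHub repository. The sketch shows how a reaction's inputs, conditions, and outcome could be held in one typed record, checked, and serialized for sharing.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class Compound:
    # SMILES string identifying the compound
    smiles: str
    # Role in the reaction, e.g. "reactant", "catalyst", "solvent"
    role: str
    # Amount in millimoles; None when it was not recorded
    amount_mmol: Optional[float] = None


@dataclass
class ReactionRecord:
    # Hypothetical layout for illustration; NOT the actual ORD schema
    reaction_id: str
    inputs: List[Compound] = field(default_factory=list)
    temperature_celsius: Optional[float] = None
    setup: str = "benchtop"  # e.g. "benchtop", "high-throughput", "flow"
    product_smiles: Optional[str] = None
    yield_percent: Optional[float] = None

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty if none are found."""
        problems = []
        if not self.inputs:
            problems.append("no inputs recorded")
        if self.yield_percent is not None and not 0.0 <= self.yield_percent <= 100.0:
            problems.append("yield_percent outside 0-100")
        return problems


# Placeholder values chosen for illustration, not measured data
record = ReactionRecord(
    reaction_id="example-001",
    inputs=[
        Compound(smiles="Brc1ccccc1", role="reactant", amount_mmol=1.0),
        Compound(smiles="Nc1ccccc1", role="reactant", amount_mmol=1.2),
    ],
    temperature_celsius=100.0,
    setup="high-throughput",
    product_smiles="c1ccc(Nc2ccccc2)cc1",
    yield_percent=85.0,
)

# Round-trip through JSON to show that the record is machine-readable
serialized = json.dumps(asdict(record), indent=2)
restored = json.loads(serialized)
```

Because every record has the same fields, a downstream tool can filter or aggregate reactions without parsing free text, which is the barrier the abstract describes for unstructured sources.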
Citations
Scientific discovery in the age of artificial intelligence
Hanchen Wang,Tianfan Fu,Yuanqi Du,Wenhao Gao,Kexin Huang,Ziming Liu,Payal Chandak,Shengchao Liu,Peter Van Katwyk,A Deac,Animashree Anandkumar,Karianne J. Bergen,Carla Gomes,Shirley Ho,Pushmeet Kohli,L. Lasenby,Jure Leskovec,Tie-Yan Liu,Arjun K. Manrai,Debora Marks,Bharath Ramsundar,Le Song,Jimeng Sun,Jian Tang,Petar Veličković,Max Welling,Linfeng Zhang,Connor W. Coley,Yoshua Bengio,Marinka Zitnik +29 more
TL;DR: This work examines breakthroughs over the past decade that include self-supervised learning, which allows models to be trained on vast amounts of unlabelled data, and geometric deep learning, which leverages knowledge about the structure of scientific data to enhance model accuracy and efficiency.
696
The rise of self-driving labs in chemical and materials sciences
TL;DR: A self-driving lab (SDL) is a machine-learning-assisted modular experimental platform that iteratively runs a series of experiments selected by the machine-learning algorithm to achieve a user-defined objective.
A Brief Introduction to Chemical Reaction Optimization
Connor J. Taylor,Alexander Pomberger,Kobi Felton,Rachel Grainger,Magda Helena Barecka,Thomas W. Chamberlain,Richard A. Bourne,Christopher N. Johnson,Alexei A. Lapkin +8 more
TL;DR: This paper shows that model-based, algorithm-based, and miniaturized high-throughput techniques outperform human chemical intuition and achieve reaction optimization in a much more time- and material-efficient manner.
183
SELFIES and the future of molecular string representations
Mario Krenn,Qianxiang Ai,Senja Barthel,Nessa Carson,Angelo Frei,Nathan C. Frey,Pascal Friederich,Théophile Gaudin,Albert A Gayle,Kevin Maik Jablonka,R. Lameiro,Dominik Lemm,Alston Lo,Seyed Mohamad Moosavi,José Manuel Nápoles-Duarte,AkshatKumar Nigam,Robert Pollice,Kohulan Rajan,Ulrich Schatzschneider,Philippe Schwaller,Marta Skreta,Berend Smit,Felix Strieth-Kalthoff,Chong Sun,Gary Tom,Guido Falk von Rudorff,Andrew Wang,Andrew D. White,A. R. Young,Rose Yu,Alán Aspuru-Guzik +30 more
TL;DR: The authors proposed 16 concrete future projects for robust molecular representations, which involve the extension toward new chemical domains, exciting questions at the interface of AI and robust languages, and interpretability for both humans and machines.
177
A review of molecular representation in the age of machine learning
TL;DR: In this paper, four classes of representations are introduced: string, connection table, feature-based, and computer-learned representations for chemical VAEs, including simplified molecular-input line-entry system (SMILES), International Chemical Identifier (InChI), and MDL molfile, of which SMILES was the first to be used successfully in conjunction with a VAE to yield a continuous representation of molecules.
176
References
The FAIR Guiding Principles for scientific data management and stewardship
Mark Wilkinson,Michel Dumontier,IJsbrand Jan Aalbersberg,Gabrielle Appleton,Myles Axton,Arie Baak,Niklas Blomberg,Jan-Willem Boiten,Luiz Olavo Bonino da Silva Santos,Philip E. Bourne,Jildau Bouwman,Anthony J. Brookes,Timothy Clark,Mercè Crosas,Ingrid Dillo,Olivier G. Dumon,Scott C. Edmunds,Chris T. Evelo,Richard Finkers,Alejandra Gonzalez-Beltran,Alasdair J. G. Gray,Paul Groth,Carole Goble,Jeffrey S. Grethe,Jaap Heringa,Peter A C 't Hoen,Rob Hooft,Tobias Kuhn,Ruben Kok,Joost N. Kok,Scott J. Lusher,Maryann E. Martone,Albert Mons,Abel L. Packer,Bengt Persson,Philippe Rocca-Serra,Marco Roos,Rene van Schaik,Susanna-Assunta Sansone,Erik Anthony Schultes,Thierry Sengstag,Ted Slater,George Strawn,Morris A. Swertz,Mark Thompson,Johan van der Lei,Erik M. van Mulligen,Jan Velterop,Andra Waagmeester,Peter Wittenburg,Katherine Wolstencroft,Jun Zhao,Barend Mons +53 more
TL;DR: The FAIR Data Principles are a set of data reuse principles that focus on enhancing the ability of machines to automatically find and use data, in addition to supporting its reuse by individuals.
Planning chemical syntheses with deep neural networks and symbolic AI
TL;DR: This work combines Monte Carlo tree search with an expansion policy network that guides the search and a filter network that pre-selects the most promising retrosynthetic steps. The approach solves almost twice as many molecules, thirty times faster than the traditional computer-aided search method.
1.7K
Merging photoredox with nickel catalysis: Coupling of α-carboxyl sp3-carbons with aryl halides
Zhiwei Zuo,Derek T. Ahneman,Lingling Chu,Jack A. Terrett,Abigail G. Doyle,David W. C. MacMillan +5 more
TL;DR: In this article, the synergistic combination of photoredox catalysis and nickel catalysis was used to achieve a direct decarboxylative sp3-sp2 cross-coupling of amino acids with aryl halides.
Predicting reaction performance in C–N cross-coupling using machine learning
TL;DR: It is demonstrated that machine learning can be used to predict the performance of a synthetic reaction in multidimensional chemical space using data obtained via high-throughput experimentation and provides significantly improved predictive performance over linear regression analysis.
886
Prediction of Organic Reaction Outcomes Using Machine Learning
TL;DR: A model framework for anticipating reaction outcomes that combines the traditional use of reaction templates with the flexibility in pattern recognition afforded by neural networks is reported.
726