ColabFold: making protein folding accessible to all

doi:10.1038/s41592-022-01488-1

Open Access • Journal Article

Milot Mirdita, +2 more

- 30 May 2022

- Nature Methods

- Vol. 19, Iss: 6, pp 679-682

5.9K

TL;DR: ColabFold combines the fast homology search of MMseqs2 with AlphaFold2 or RoseTTAFold for protein structure prediction, achieving a 40- to 60-fold faster search and optimized model utilization.

Citations

•Journal Article•10.1126/science.ade2574

Evolutionary-scale prediction of atomic level protein structure with a language model

Zeming Lin, +14 more

- 21 Dec 2022

- Science

TL;DR: The ESM Metagenomic Atlas is the first large-scale structural characterization of metagenomic proteins, with more than 617 million structures, including more than 225 million high-confidence predictions.

2.2K

Journal Article•10.1002/pro.4792

UCSF ChimeraX: Tools for Structure Building and Analysis.

Elaine C Meng, +6 more

- 29 Sep 2023

- Protein Science

TL;DR: New methods in the UCSF ChimeraX molecular modeling package are described that take advantage of machine‐learning structure predictions, provide likelihood‐based fitting in maps, and compute per‐residue scores to identify modeling errors.

1.1K

•Journal Article•10.1038/s41467-021-27838-9

Harnessing protein folding neural networks for peptide–protein docking

Tomer Tsaban, +5 more

- 10 Jan 2022

- Nature Communications

TL;DR: AlphaFold2 generates peptide-protein complex models without requiring multiple sequence alignment information for the peptide partner, and can handle binding-induced conformational changes of the receptor.

858

10.1038/s41586-023-06221-2

Scientific discovery in the age of artificial intelligence

Hanchen Wang, +29 more

- 01 Aug 2023

- Nature

TL;DR: This work examines breakthroughs over the past decade that include self-supervised learning, which allows models to be trained on vast amounts of unlabelled data, and geometric deep learning, which leverages knowledge about the structure of scientific data to enhance model accuracy and efficiency.

696

Journal Article•10.1038/s41587-022-01618-2

Large language models generate functional protein sequences across diverse families

Ali Madani, +11 more

- 26 Jan 2023

- Nature Biotechnology

TL;DR: ProGen is described, a language model that can generate protein sequences with a predictable function across large protein families, akin to generating grammatically and semantically correct natural language sentences on diverse topics.

600

References

Journal Article•10.1109/MCSE.2007.55

Matplotlib: A 2D Graphics Environment

J.D. Hunter

- 01 May 2007

- Computing in Science and Engineering

TL;DR: Matplotlib is a 2D graphics package for Python, used for application development, interactive scripting, and publication-quality image generation across user interfaces and operating systems.

34.7K

•Journal Article•10.1038/S41586-021-03819-2

Highly accurate protein structure prediction with AlphaFold

John M. Jumper, +33 more

- 15 Jul 2021

- Nature

TL;DR: AlphaFold uses a novel deep learning architecture to predict protein structures with an accuracy competitive with experimental structures in the majority of cases, including cases where no homologous structure is known.

28.2K

•Journal Article•10.1093/NAR/GKY1049

UniProt: A worldwide hub of protein knowledge

Alex Bateman

- 01 Jan 2019

- Nucleic Acids Research

7.1K

•Book

Accelerated Profile HMM Searches

Sean R. Eddy

- 01 May 2015

TL;DR: This work describes an acceleration heuristic for profile HMMs, the “multiple segment Viterbi” (MSV) algorithm, which computes an optimal sum of multiple ungapped local alignment segments using a striped vector-parallel approach previously described for fast Smith/Waterman alignment.

6.3K

•Journal Article•10.1093/NAR/GKAA913

Pfam: The protein families database in 2021.

Jaina Mistry, +11 more

- 08 Jan 2021

- Nucleic Acids Research

TL;DR: The Pfam database is a widely used resource for classifying protein sequences into families and domains. This release reintroduces Pfam-B, which provides an automatically generated supplement to Pfam and contains 136 730 novel clusters of sequences that are not yet matched by a Pfam family.

5.6K