CDD/SPARCLE: the conserved domain database in 2020
Shennan Lu, Jiyao Wang, Farideh Chitsaz, Myra K. Derbyshire, Renata C. Geer, Noreen R. Gonzales, Marc Gwadz, David I. Hurwitz, Gabriele H. Marchler, James S. Song, Narmada Thanki, Roxanne A. Yamashita, Mingzhang Yang, Dachuan Zhang, Chanjuan Zheng, Christopher J. Lanczycki, Aron Marchler-Bauer
Abstract: As NLM's Conserved Domain Database (CDD) enters its 20th year of operations as a publicly available resource, CDD curation staff continues to develop hierarchical classifications of widely distributed protein domain families, and to record conserved sites associated with molecular function, so that they can be mapped onto user queries in support of hypothesis-driven biomolecular research. CDD offers an archive of pre-computed domain annotations as well as live search services, both for single protein or nucleotide queries and for larger sets of protein query sequences. CDD staff has continued to characterize protein families via conserved domain architectures and has built up a significant corpus of curated domain architectures in support of naming bacterial proteins in RefSeq. These architecture definitions are available via SPARCLE, the Subfamily Protein Architecture Labeling Engine. CDD can be accessed at https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.
Citations
Systematic analysis of CNGCs in cotton and the positive role of GhCNGC32 and GhCNGC35 in salt tolerance
Zheng-Li Lu, Guo-ying Yin, Mao Chai, Lu Sun, Hengling Wei, Jian Chen, Yufeng Yang, Xiaokang Fu, Shiyun Li
TL;DR: In this paper, a total of 114 cyclic nucleotide-gated ion channel (CNGC) genes were identified from the genomes of four cotton species and clustered into five main groups: I, II, III, IVa, and IVb.
A Rare, Virulent Clostridium perfringens Bacteriophage Susfortuna Is the First Isolated Bacteriophage in a New Viral Genus
Julie Stenberg Pedersen, Witold Kot, Maja Plöger, René Lametsh, Horst Neve, Charles M. A. P. Franz, Lars Hestbjerg Hansen
- 16 Dec 2020
TL;DR: Phage therapy using C. perfringens type A for the treatment of neonatal and weaned piglets with HBV infection results in down-regulation of Bovine spleen prolapse and HBV coronavirus.
Closing the Gap: Horizontal Transfer of Mariner Transposons between Rhus Gall Aphids and Other Insects
TL;DR: HTT events between Rhus gall aphids and other insects are reported for the first time, which helps close the knowledge gap surrounding HTT and clarifies the evolution and spread of transposable elements in the genomes of Rhus gall aphids.
Characterization of YABBY genes and the correlation between their transcript levels and histone modifications in strawberry
TL;DR: In this article, the authors identified 6 and 25 YAB proteins in the woodland strawberry (Fragaria vesca, 2n=2x=14) and the cultivated strawberry (Fragaria × ananassa), respectively, and performed phylogenetic and structural analyses to characterize the YAB proteins.
Recombination and selection trajectory of the monkeypox virus during its adaptation in the human population
Jialu Zheng, Jinfeng Zeng, Haoyu Long, Jian Chen, Kaijie Liu, Yixiong Chen, Xiangjun Du
TL;DR: To gain insights into the evolutionary dynamics of MPXV, comprehensive in silico recombination and selection analyses were conducted on MPXV whole genome sequence data; they detected 25 genes under positive selection, mainly associated with immune response and viral regulation.
References
The Pfam protein families database in 2019.
Sara El-Gebali, Jaina Mistry, Alex Bateman, Sean R. Eddy, Aurelien Luciani, Simon C. Potter, Matloob Qureshi, Lorna Richardson, Gustavo A. Salazar, Alfredo Smart, Erik L. L. Sonnhammer, Layla Hirsh, Lisanna Paladin, Damiano Piovesan, Silvio C. E. Tosatto, Robert D. Finn
TL;DR: A comparison with the ECOD structural classification database led to the creation of 825 new families based on ECOD's set of uncharacterized families (EUFs), and Pfam entries were connected to the Sequence Ontology (SO) through mapping of the Pfam type definitions to SO terms.
CDD/SPARCLE: functional classification of proteins via subfamily domain architectures.
Aron Marchler-Bauer, Yu Bo, Lianyi Han, Jane He, Christopher J. Lanczycki, Shennan Lu, Farideh Chitsaz, Myra K. Derbyshire, Renata C. Geer, Noreen R. Gonzales, Marc Gwadz, David I. Hurwitz, Fu Lu, Gabriele H. Marchler, James S. Song, Narmada Thanki, Zhouxi Wang, Roxanne A. Yamashita, Dachuan Zhang, Chanjuan Zheng, Lewis Y. Geer, Stephen H. Bryant
TL;DR: NCBI's Conserved Domain Database (CDD) aims at annotating biomolecular sequences with the location of evolutionarily conserved protein domain footprints, and functional sites inferred from such footprints.
CD-Search: protein domain annotations on the fly
TL;DR: The Conserved Domain Search service (CD-Search), a web-based tool for the detection of structural and functional domains in protein sequences, uses BLAST® heuristics to provide a fast, interactive service, and searches a comprehensive collection of domain models.
The COG database: new developments in phylogenetic classification of proteins from complete genomes
Roman L. Tatusov, Darren A. Natale, Igor Garkavtsev, Tatiana Tatusova, Uma Shankavaram, Bachoti S. Rao, Boris Kiryutin, Michael Y. Galperin, Natalie D. Fedorova, Eugene V. Koonin
TL;DR: The new features added to the COG database include information pages with structural and functional details on each COG and literature references, improvements of the COGNITOR program that is used to fit new proteins into the COGs, and classification of genomes and COGs constructed by using principal component analysis.
20 years of the SMART protein domain annotation resource.
Ivica Letunic, Peer Bork
TL;DR: In its 20th year, the SMART analysis results pages have been streamlined again and its information sources have been updated, and the internal full text search engine has been redesigned and updated, resulting in greatly increased search speed.