Gencode 2021
Adam Frankish,Mark Diekhans,Irwin Jungreis,Julien Lagarde,Jane E. Loveland,Jonathan M. Mudge,Cristina Sisu,James C. Wright,Joel Armstrong,If Barnes,Andrew Berry,Alexandra Bignell,Carles Boix,S. Carbonell Sala,Fiona Cunningham,T. Di Domenico,Sarah Donaldson,Ian T. Fiddes,C. Garcia Giron,José M. González,Tiago Grego,Matthew Hardy,Thibaut Hourlier,Kerstin Howe,Toby Hunt,Osagie G. Izuogu,Rory Johnson,Fergal J. Martin,Laura Martinez,S. Mohanan,Paul R. Muir,Fabio C. P. Navarro,Anne Parker,Baikang Pei,Fernando Pozo,F. C. Riera,Magali Ruffier,Bianca M. Schmitt,E. Stapleton,Marie Marthe Suner,I. Sycheva,Barbara Uszczynska-Ratajczak,Maxim Y Wolf,Jinrui Xu,Y. T. Yang,Andrew D. Yates,Daniel R. Zerbino,Yan Zhang,Jyoti S. Choudhary,Mark Gerstein,Roderic Guigó,Tim Hubbard,Manolis Kellis,Benedict Paten,Michael L. Tress,Paul Flicek +55 more
TL;DR: The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. Its annotation processes use primary data and bioinformatic tools and analysis, generated both within the consortium and externally, to support the creation of transcript structures and the determination of their function.
Abstract: The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.
Citations
Long non-coding RNAs: definitions, functions, challenges and recommendations
John S. Mattick,Paulo P. Amaral,P. Carninci,Susan Carpenter,Howard Y. Chang,Ling-Ling Chen,Runsheng Chen,Caroline Dean,Marcel E. Dinger,Katherine A. Fitzgerald,Thomas R. Gingeras,Mitchell Guttman,Tetsuro Hirose,Maite Huarte,Rory Johnson,Chandrasekhar Kanduri,Philipp Kapranov,Jeanne B. Lawrence,Jeannie T. Lee,Joshua T. Mendell,Tim R. Mercer,Kathryn J. Moore,Shinichi Nakagawa,John L. Rinn,David L. Spector,Igor Ulitsky,Yue Wang,Jeremy E. Wilusz,Mian Hua Wu +28 more
TL;DR: This paper discusses the definition and nomenclature of long non-coding RNAs and their conservation, expression, phenotypic visibility, structure, and functions, along with research challenges and recommendations to advance the understanding of the roles of lncRNAs in development, cell biology and disease.
Ensembl 2023
Fergal J. Martin,M. Ridwan Amode,Alisha Aneja,Olanrewaju Austine-Orimoloye,Andrey G. Azov,If H. A. Barnes,Arne Becker,Ruth Bennett,Andrew Berry,Jyothish N Nair Thulasee Bhai,Simarpreet Kaur Bhurji,Alexandra Bignell,Sanjay Boddu,Paulo R Branco Lins,Lucy Brooks,Shashank Budhanuru Ramaraju,Mehrnaz Charkhchi,Alexander Cockburn,Luca Da Rin Fiorretto,R. Davidson,K. Dodiya,Sarah Donaldson,Bilal El Houdaigui,Tamara El Naboulsi,Reham Fatima,Carlos García Girón,Thiago A. L. Genez,Gurpreet S. Ghattaoraya,J. Gonzalez Martinez,Cristina Guijarro,Matthew P. Hardy,Zoe Hollis,Thibaut Hourlier,Toby Hunt,M. Kay,Vinay Kaykala,Tuan Le,Diana Lemos,Diego Marques-Coelho,José Carlos Marugán,Gabriel Merino,Louisse Paola Mirabueno,Aleena Mushtaq,Syed Nakib Hossain,Denye Nathaniel Ogeh,Manoj Pandian Sakthivel,Anne Parker,Malcolm Stockbridge Perry,Ivana Piližota,Irina Prosovetskaia,J. Perez-Silva,Ahamed Imran Abdul Salam,Nuno Saraiva-Agostinho,Helen Schuilenburg,Daniel Sheppard,Swati Sinha,Botond Sipos,W. Stark,Emily Steed,Ranjit Sukumaran,Dulika S. Sumathipala,Marie-Marthe Suner,Likhitha Surapaneni,Kyösti Sutinen,Michal Szpak,Francesca Floriana Tricomi,David Urbina-Gómez,Andres Veidenberg,Thomas A. Walsh,Brandon Walts,Elizabeth Wass,Natalie Willhoft,Jamie Allen,Jorge Alvarez-Jarreta,Marc Chakiachvili,Beth Flint,Stefano Giorgetti,Leanne Haggerty,Garth R Ilsley,Jane E. Loveland,Benjamin L. Moore,Jonathan M. Mudge,John Tate,David Thybert,Stephen J. Trevanion,Andrea Winterbottom,Adam Frankish,Sarah E. Hunt,Magali Ruffier,Fiona Cunningham,Sarah Dyer,Robert D. Finn,Kevin L. Howe,Peter W. Harrison,Andrew D. Yates,Paul Flicek +95 more
TL;DR: Ensembl (https://www.ensembl.org) has produced high-quality genomic resources for vertebrates and model organisms for more than twenty years. During that time, its resources, services and tools have continually evolved in line with both the publicly available genome data and the downstream research and applications that utilize the Ensembl platform.
A draft human pangenome reference
Wen-Wei Liao,Mobin Asri,Jana Ebler,Daniel Doerr,Marina Haukness,Glenn Hickey,Shuangjia Lu,Julian K. Lucas,Jean Marcel Maurice Monlong,Haley J. Abel,Silvia Buonaiuto,Xian Chang,Haoyu Cheng,Justin Jang Hann Chu,Vincenza Colonna,Jordan M. Eizenga,Xiaowen Feng,Christian Fischer,Robert S. Fulton,Shilpa Garg,Cristian Groza,Andrea Guarracino,William T. Harvey,Simon Heumos,Kerstin Howe,Miten Jain,Tsung-Yu Lu,Charles Markello,Fergal J. Martin,Matthew Mitchell,Katherine M. Munson,Moses N. Mwaniki,Adam M. Novak,Hugh E. Olsen,Trevor Pesout,David Porubsky,Pjotr Prins,Jonas Andreas Sibbesen,Chad Tomlinson,Flavia Villani,Mitchell R. Vollger,Guillaume Bourque,Mark Chaisson,Paul Flicek,Adam M. Phillippy,Justin M. Zook,Evan E. Eichler,David Haussler,Erich D. Jarvis,Karen H. Miga,Ting Wang,Erik Garrison,Tobias Marschall,Ira M. Hall,Heng Li,Benedict Paten +55 more
TL;DR: The pangenome reference contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals; the assemblies are more than 99% accurate at the structural and base pair levels.
CellMarker 2.0: an updated database of manually curated cell markers in human/mouse and web tools based on scRNA-seq data
Congxue Hu,Tengyue Li,Yingqi Xu,Xinxin Zhang,Feng Li,Jing Bai,Jing Chen,Wenqi Jiang,Kaiyue Yang,Qi Ou,Xia Li,Pengli Wang,Yunpeng Zhang +12 more
TL;DR: Six flexible web tools, covering cell annotation, cell clustering, cell malignancy, cell differentiation, cell feature and cell communication, were developed for the analysis and visualization of single-cell sequencing data.
The UCSC Genome Browser database: 2023 update
Luis R Nassar,Galt P. Barber,Anna Benet-Pagès,Jonathan Casper,Hiram Clawson,Mark Diekhans,Clayton M. Fischer,Jairo Navarro Gonzalez,Angie S. Hinrichs,Brian T. Lee,Christopher Lee,Pranav Muthuraman,Beagan Nguy,Tiana Pereira,Parisa Nejad,Gerardo Perez,Brian J. Raney,Daniel Schmelter,Matthew L. Speir,Brittney Wick,Ann S. Zweig,David Haussler,Robert M. Kuhn,Maximilian Haeussler,W. James Kent +24 more
TL;DR: The UCSC Genome Browser (http://genome.ucsc.edu) is an omics data consolidator, graphical viewer, and general bioinformatics resource that continues to serve the community as it enters its 23rd year.
References
The Human Genome Browser at UCSC
W. James Kent,Charles W. Sugnet,Terrence S. Furey,Krishna M. Roskin,Tom H. Pringle,Alan M. Zahler,David Haussler +6 more
TL;DR: A mature web tool for rapid and reliable display of any requested portion of the genome at any scale, together with several dozen aligned annotation tracks, is provided at http://genome.ucsc.edu.
The PRIDE database and related tools and resources in 2019: improving support for quantification data.
Yasset Perez-Riverol,Attila Csordas,Jingwen Bai,Manuel Bernal-Llinares,Suresh Hewapathirana,Deepti J. Kundu,Avinash Inuganti,Johannes Griss,Gerhard Mayer,Martin Eisenacher,Enrique Perez,Julian Uszkoreit,Julianus Pfeuffer,Timo Sachsenberg,Şule Yılmaz,Shivani Tiwary,Juergen Cox,Enrique Audain,Mathias Walzer,Andrew F. Jarnuczak,Tobias Ternent,Alvis Brazma,Juan Antonio Vizcaíno +23 more
TL;DR: Key statistics on the current data contents and volume of downloads are outlined, along with how PRIDE data are starting to be disseminated to added-value resources including Ensembl, UniProt and Expression Atlas.
Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation
Nuala A. O'Leary,Mathew W. Wright,J. Rodney Brister,Stacy Ciufo,Diana Haddad,Richard McVeigh,Bhanu Rajput,Barbara Robbertse,Brian Smith-White,Danso Ako-adjei,Alexander Astashyn,Azat Badretdin,Yiming Bao,Olga Blinkova,Vyacheslav Brover,Vyacheslav Chetvernin,Jinna Choi,Eric Cox,Olga Ermolaeva,Catherine M. Farrell,Tamara Goldfarb,Tripti Gupta,Daniel H. Haft,Eneida L. Hatcher,Wratko Hlavina,Vinita Joardar,Vamsi K. Kodali,Wenjun Li,Donna Maglott,Patrick Masterson,Kelly M. McGarvey,Michael R. Murphy,Kathleen O'Neill,Shashikant Pujar,Sanjida H. Rangwala,Daniel Rausch,Lillian D. Riddick,Conrad L. Schoch,Andrei Shkeda,Susan S. Storz,Hanzhen Sun,Françoise Thibaud-Nissen,Igor Tolstoy,Raymond E. Tully,Anjana R. Vatsan,Craig Wallin,David Webb,Wendy Wu,Melissa J. Landrum,Avi Kimchi,Tatiana Tatusova,Michael DiCuccio,Paul Kitts,Terence Murphy,Kim D. Pruitt +54 more
TL;DR: The approach to utilizing available RNA-Seq and other data types in the authors' manual curation process for vertebrate, plant, and other species is summarized, and a new direction for prokaryotic genomes and protein name management is described.
GENCODE: The reference human genome annotation for The ENCODE Project
Jennifer Harrow,Adam Frankish,José M. González,Electra Tapanari,Mark Diekhans,Felix Kokocinski,Bronwen Aken,Daniel Barrell,Amonida Zadissa,Stephen M. J. Searle,If H. A. Barnes,Alexandra Bignell,Veronika Boychenko,Toby Hunt,M. Kay,Gaurab Mukherjee,Jeena Rajan,Gloria Despacio-Reyes,Gary Saunders,Charles A. Steward,Rachel A. Harte,Michael F. Lin,Cédric Howald,Andrea Tanzer,Thomas Derrien,Jacqueline Chrast,Nathalie Walters,Suganthi Balasubramanian,Baikang Pei,Michael L. Tress,Jose Manuel Rodriguez,Iakes Ezkurdia,Jeltje Van Baren,Michael R. Brent,David Haussler,Manolis Kellis,Alfonso Valencia,Alexandre Reymond,Mark Gerstein,Roderic Guigó,Tim Hubbard +40 more
TL;DR: This work examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites; over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas.