MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes
Brandi L. Cantarel,Ian F Korf,Sofia M. C. Robb,Genís Parra,Eric D. Ross,Barry Moore,Carson Holt,Alejandro Sánchez Alvarado,Mark Yandell +8 more
TL;DR: The results demonstrate that MAKER provides a simple and effective means to convert a genome sequence into a community-accessible genome database, and should prove especially useful for emerging model organism genome projects for which extensive bioinformatics resources may not be readily available.
Abstract: We have developed a portable and easily configurable genome annotation pipeline called MAKER. Its purpose is to allow investigators to independently annotate eukaryotic genomes and create genome databases. MAKER identifies repeats, aligns ESTs and proteins to a genome, produces ab initio gene predictions, and automatically synthesizes these data into gene annotations having evidence-based quality indices. MAKER is also easily trainable: Outputs of preliminary runs are used to automatically retrain its gene-prediction algorithm, producing higher-quality gene-models on subsequent runs. MAKER’s inputs are minimal, and its outputs can be used to create a GMOD database. Its outputs can also be viewed in the Apollo Genome browser; this feature of MAKER provides an easy means to annotate, view, and edit individual contigs and BACs without the overhead of a database. As proof of principle, we have used MAKER to annotate the genome of the planarian Schmidtea mediterranea and to create a new genome database, SmedGD. We have also compared MAKER’s performance to other published annotation pipelines. Our results demonstrate that MAKER provides a simple and effective means to convert a genome sequence into a community-accessible genome database. MAKER should prove especially useful for emerging model organism genome projects for which extensive bioinformatics resources may not be readily available.
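The abstract says that MAKER synthesizes repeat, alignment, and ab initio evidence into gene annotations carrying evidence-based quality indices, but it does not define those indices. As a rough illustration of the general idea only, the Python sketch below scores a gene model by the fraction of its exons that overlap at least one EST or protein alignment. The function names, the interval representation, and the scoring rule are all assumptions made for this example; they are not taken from MAKER.

```python
from typing import List, Tuple

# Assumed representation: 1-based, inclusive (start, end) coordinates on one contig.
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True when two inclusive intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def exon_support_fraction(exons: List[Interval], evidence: List[Interval]) -> float:
    """Fraction of exons overlapped by at least one evidence alignment.

    Hypothetical scoring rule for illustration; not MAKER's actual quality index.
    """
    if not exons:
        return 0.0
    supported = sum(
        1 for exon in exons if any(overlaps(exon, ev) for ev in evidence)
    )
    return supported / len(exons)


# Toy gene model with three exons and two EST/protein alignment blocks.
exons = [(100, 200), (300, 400), (500, 600)]
evidence = [(150, 180), (390, 450)]

# Two of the three exons are touched by evidence; the third has no support.
print(exon_support_fraction(exons, evidence))
```

A score of this kind lets a curator sort gene models by how well external evidence supports them, which is the role the abstract assigns to quality indices. A real pipeline would probably also weigh splice-site agreement and alignment quality.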
Citations
Whole-genome sequencing and comparative genomic analysis of potential biotechnological strains of Trichoderma harzianum, Trichoderma atroviride, and Trichoderma reesei
Rafaela Rossi Rosolen,Maria Augusta Crivelente Horta,P. H. C. de Azevedo,Carla Cristina da Silva,Danilo Augusto Sforça,Gustavo H. Goldman,Anete Pereira de Souza +6 more
TL;DR: In this article, the authors performed whole-genome sequencing and assembly of the T. harzianum IOC-3844 and T. reesei CBMAI-0711 (Tr0711) strains.
Signatures of selection in recently domesticated macadamia
Jishan Lin,Wenping Zhang,Xingtan Zhang,Xiaokai Ma,Shengcheng Zhang,Shuai Chen,Yibin Wang,Haifeng Jia,Zhenyang Liao,Jing Lin,Mengting Zhu,Xiuming Xu,Mingxing Cai,Hui Zeng,Ji Wan,Wei-Hai Yang,Tracie K. Matsumoto,Craig Hardner,Catherine J Nock,Ray Ming +19 more
TL;DR: In this paper, the genome of macadamia was sequenced and assembled into 794 Mb in 14 pseudo-chromosomes with 37,728 genes, including genes involved in fatty acid biosynthesis, seed coat development, and heat stress response.
Draft Genome Sequence of the Fungus Associated with Oak Wilt Mortality in South Korea, Raffaelea quercus-mongolicae KACC44405.
Jongbum Jeon,Ki-Tae Kim,Hyeunjeong Song,Gir-Won Lee,Kyeongchae Cheong,Hyunbin Kim,Gobong Choi,Yong-Hwan Lee,Jane Stewart,Ned B. Klopfenstein,Mee-Sook Kim +10 more
TL;DR: The 27.0-Mb draft genome sequence of R. quercus-mongolicae strain KACC44405 is presented and it is shown that the fungus is vectored and dispersed by the ambrosia beetle, Platypus koryoensis.
Low mutation load in a supergene underpinning alternative male mating strategies in ruff
Jason Joseph Hill,Erik D Enbody,Huijuan Bi,Sangeet Lamichhaney,Doreen Schwochow,Shady Younis,Fredrik Widemo,Leif Andersson +7 more
TL;DR: The results suggest that the inversion in ruffs may be much younger than previously thought, and that the lack of mutation load despite recessive lethality can be explained by the introgression of the inversion from a now extinct lineage.
Patent
Compositions and methods for improving lettuce production
Rachel Didonato Floro,Justin Lee,Gregg Bogosian,Doug Bryant +3 more
04 Dec 2014
TL;DR: In this article, the authors provide both compositions comprising Methylobacterium and compositions comprising methylobacteria that are depleted of substances that promote growth of resident microorganisms on a lettuce plant or seed.
References
Basic Local Alignment Search Tool
TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.
98.8K
MUSCLE: multiple sequence alignment with high accuracy and high throughput
TL;DR: MUSCLE is a new computer program for creating multiple alignments of protein sequences that includes fast distance estimation using kmer counting, progressive alignment using a new profile function the authors call the log-expectation score, and refinement using tree-dependent restricted partitioning.
45.1K
The Pfam protein families database
Marco Punta,Penny Coggill,Ruth Y. Eberhardt,Jaina Mistry,John Tate,Chris Boursnell,Ningze Pang,Kristoffer Forslund,Goran Ceric,Jody Clements,Andreas Heger,Liisa Holm,Erik L. L. Sonnhammer,Sean R. Eddy,Alex Bateman,Robert D. Finn +15 more
TL;DR: The definition and use of family-specific, manually curated gathering thresholds are explained and some of the features of domains of unknown function (also known as DUFs) are discussed, which constitute a rapidly growing class of families within Pfam.
Pfam: the protein families database.
Robert D. Finn,Alex Bateman,Jody Clements,Penelope Coggill,Ruth Y. Eberhardt,Sean R. Eddy,Andreas Heger,Kirstie Hetherington,Liisa Holm,Jaina Mistry,Erik L. L. Sonnhammer,John Tate,Marco Punta +12 more
TL;DR: Pfam is a widely used database of protein families, containing 14,831 manually curated entries in the current version, 27.0, and has been updated several times since 2012.
The sequence of the human genome.
J. Craig Venter,Mark Raymond Adams,Eugene W. Myers,Peter W. Li,Richard J. Mural,Granger G. Sutton,Hamilton O. Smith,Mark Yandell,Cheryl A. Evans,Robert A. Holt,Jeannine D. Gocayne,Peter Amanatides,Richard M. Ballew,Daniel H. Huson,Jennifer R. Wortman,Qing Zhang,Chinnappa D. Kodira,Xiangqun H. Zheng,Lin Chen,Marian P. Skupski,Gangadharan Subramanian,Paul Thomas,Jinghui Zhang,George L. Gabor Miklos,Catherine R. Nelson,Samuel Broder,Andrew G. Clark,J. H. Nadeau,Victor A. McKusick,Norton D. Zinder,Arnold J. Levine,Richard J. Roberts,M. I. Simon,Carolyn W. Slayman,Michael W. Hunkapiller,Randall Bolanos,Arthur L. Delcher,Ian M. Dew,Daniel Fasulo,Michael Flanigan,Liliana Florea,Aaron L. Halpern,Sridhar Hannenhalli,Saul A. Kravitz,Samuel Levy,Clark M. Mobarry,Knut Reinert,Karin A. Remington,Jane Abu-Threideh,Ellen M. Beasley,Kendra Biddick,Vivien Bonazzi,Rhonda Brandon,Michele Cargill,Ishwar Chandramouliswaran,Rosane Charlab,Kabir Chaturvedi,Zuoming Deng,Valentina Di Francesco,Patrick Dunn,Karen Eilbeck,Carlos Evangelista,Andrei Gabrielian,Weiniu Gan,Wangmao Ge,Fangcheng Gong,Zhiping Gu,Ping Guan,Thomas J. Heiman,Maureen E. Higgins,Rui-Ru Ji,Zhaoxi Ke,Karen A. Ketchum,Zhongwu Lai,Yiding Lei,Zhenya Li,Jiayin Li,Yong Liang,Xiaoying Lin,Fu Lu,Gennady V. Merkulov,Natalia Milshina,Helen M. Moore,Ashwinikumar K Naik,Vaibhav A. Narayan,Beena Neelam,Deborah Nusskern,Douglas B. Rusch,Steven L. Salzberg,Wei Shao,Bixiong Chris Shue,Jingtao Sun,Zhen Yuan Wang,Aihui Wang,Xin Wang,Jian Wang,Ming-Hui Wei,Ron Wides,Chunlin Xiao,Chunhua Yan,Alison Yao,Jane Ye,Ming Zhan,Weiqing Zhang,Hongyu Zhang,Qi Zhao,Liansheng Zheng,Fei Zhong,Wenyan Zhong,Shiaoping C. Zhu,Shaying Zhao,Dennis A. Gilbert,Suzanna Baumhueter,Gene Spier,Christine Carter,Anibal Cravchik,Trevor Woodage,Feroze Ali,Huijin An,Aderonke Awe,Danita Baldwin,Holly Baden,Mary Barnstead,Ian Barrow,Karen Beeson,Dana A. 
Busam,Amy Carver,Ming Lai Cheng,Liz Curry,Steve Danaher,Lionel Davenport,Raymond Desilets,Susanne Dietz,Kristina Dodson,Lisa Doup,Steven Ferriera,Neha Garg,Andres Gluecksmann,Brit J. Hart,Jason Haynes,Charles Haynes,Cheryl Heiner,Suzanne Hladun,Damon Hostin,Jarrett Houck,Timothy Howland,Chinyere Ibegwam,Jeffery Johnson,Francis Kalush,Lesley Kline,Shashi Koduru,Amy Love,Felecia Mann,David May,Steven McCawley,Tina C. McIntosh,Ivy McMullen,Mee Moy,Linda Moy,Brian Murphy,Keith Nelson,Cynthia Pfannkoch,Eric Pratts,Vinita Puri,Hina Qureshi,Matthew Reardon,Robert Rodriguez,Yu-Hui Rogers,Deanna Romblad,Bob Ruhfel,Richard T. Scott,Cynthia Sitter,Michelle Smallwood,Erin Stewart,Renee Strong,Ellen Suh,Reginald Thomas,Ni Ni Tint,Sukyee Tse,Claire Vech,Gary Wang,Jeremy Wetter,Sherita Williams,Monica Williams,Sandra Windsor,Emily Winn-Deen,Keriellen Wolfe,Jayshree Zaveri,Karena Zaveri,Josep F. Abril,Roderic Guigó,Michael J. Campbell,Kimmen Sjölander,Brian Karlak,Anish Kejariwal,Huaiyu Mi,Betty Lazareva,Thomas Hatton,Apurva Narechania,Karen Diemer,Anushya Muruganujan,Nan Guo,Shinji Sato,Vineet Bafna,Sorin Istrail,Ross Lippert,Russell Schwartz,Brian P. Walenz,Shibu Yooseph,David Allen,Anand Basu,James Baxendale,Louis Blick,Marcelo Caminha,John Carnes-Stine,Parris Caulk,Yen-Hui Chiang,My Coyne,Carl Dahlke,Anne Deslattes Mays,Maria Dombroski,Michael Donnelly,Dale Ely,Shiva Esparham,Carl Fosler,Harold Gire,Stephen Glanowski,Kenneth Glasser,Anna Glodek,Mark Gorokhov,Ken Graham,Barry Gropman,Michael Harris,Jeremy Heil,Scott Henderson,Jeffrey Hoover,Donald Jennings,Catherine Jordan,James Jordan,John Kasha,Leonid Kagan,Cheryl L. Kraft,Alexander Levitsky,Mark Lewis,Xiangjun Liu,John Lopez,Daniel Ma,William H. Majoros,Joe McDaniel,Sean C. Murphy,Matthew Newman,Trung Hieu Nguyen,Ngoc Nguyen,Marc Nodell,Sue Pan,Jim Peck,Marshall Peterson,William Rowe,Robert Sanders,John Scott,Michael Simpson,Thomas J. Smith,Arlan Sprague,Timothy B. 
Stockwell,Russell Turner,Eli Venter,Mei Wang,Meiyuan Wen,David Wu,Mitchell Wu,Ashley Xia,Ali Zandieh,Xiaohong Zhu +272 more
TL;DR: Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems.
13.6K
Related Papers (5)
Manfred Grabherr,Brian J. Haas,Moran Yassour,Joshua Z. Levin,Dawn Thompson,Ido Amit,Xian Adiconis,Lin Fan,Raktima Raychowdhury,Qiandong Zeng,Zehua Chen,Evan Mauceli,Nir Hacohen,Andreas Gnirke,Nicholas Rhind,Federica Di Palma,Bruce W. Birren,Chad Nusbaum,Kerstin Lindblad-Toh,Nir Friedman,Aviv Regev +22 more