Logomaker: Beautiful sequence logos in Python
Ammar Tareen, Justin B. Kinney
TL;DR: Logomaker, a Python API for creating publication-quality sequence logos that can produce both standard and highly customized logos from any matrix-like array of numbers, is introduced.
Abstract: Sequence logos are visually compelling ways of illustrating the biological properties of DNA, RNA, and protein sequences, yet it is currently difficult to generate such logos within the Python programming environment. Here we introduce Logomaker, a Python API for creating publication-quality sequence logos. Logomaker can produce both standard and highly customized logos from any matrix-like array of numbers. Logos are rendered as vector graphics that are easy to stylize using standard matplotlib functions. Methods for creating logos from multiple-sequence alignments are also included.
Availability and Implementation: Logomaker can be installed using the pip package manager and is compatible with both Python 2.7 and Python 3.6. Source code is available at http://github.com/jbkinney/logomaker.
Supplemental Information: Documentation is provided at http://logomaker.readthedocs.io.
Contact: jkinney@cshl.edu.
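As a rough illustration of the "matrix-like array of numbers" a logo is drawn from, the sketch below turns a small made-up alignment into per-position letter heights in bits. It uses the information measure described by Schneider and Stephens, without their small-sample correction. The helper names `alignment_to_information` and `draw_logo` and the example sequences are hypothetical and are not part of Logomaker. The optional rendering step assumes that `logomaker.Logo` accepts a pandas DataFrame with one row per position and one column per character, as the Logomaker documentation describes.

```python
import math
from collections import Counter


def alignment_to_information(seqs, alphabet="ACGT"):
    """Per-position letter heights in bits for equal-length, gap-free sequences.

    Assumes every column contains at least one character from `alphabet`.
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all sequences must have the same length")
    max_bits = math.log2(len(alphabet))  # 2 bits for a 4-letter alphabet
    rows = []
    for column in zip(*seqs):
        counts = Counter(column)
        total = sum(counts[c] for c in alphabet)
        freqs = {c: counts[c] / total for c in alphabet}
        # Shannon entropy of the column; zero-frequency letters contribute 0.
        entropy = -sum(f * math.log2(f) for f in freqs.values() if f > 0)
        info = max_bits - entropy
        # Each letter's height is its frequency times the column's information.
        rows.append({c: f * info for c, f in freqs.items()})
    return rows


def draw_logo(rows, alphabet="ACGT"):
    """Optional rendering step; needs `pip install logomaker` (not stdlib)."""
    import pandas as pd
    import logomaker

    df = pd.DataFrame(rows, columns=list(alphabet))
    logo = logomaker.Logo(df)
    # The logo is drawn on a matplotlib Axes, so ordinary Axes methods apply.
    logo.ax.set_ylabel("information (bits)")
    return logo


# Hypothetical toy alignment, for illustration only.
example_seqs = ["ACGT", "ACGA", "ACTT", "AGGT"]
rows = alignment_to_information(example_seqs)
```

Calling `draw_logo(rows)` in an environment with Logomaker installed should draw the logo on a matplotlib axes, which can then be styled with standard matplotlib calls. Keeping the matrix computation separate from the drawing step reflects the abstract's point that a logo can be built from any numeric matrix, not only from an alignment.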
Citations
Deep mutational scanning of SARS-CoV-2 receptor binding domain reveals constraints on folding and ACE2 binding.
Tyler N. Starr, Allison J. Greaney, Sarah K Hilton, Katharine H.D. Crawford, Mary Jane Navarro, John E. Bowen, M. Alejandra Tortorici, Alexandra C. Walls, David Veesler, Jesse D. Bloom
TL;DR: An interactive visualization and open analysis pipeline is presented to facilitate use of the dataset for vaccine design and functional annotation of mutations observed during viral surveillance.
473
Versatile and multivalent nanobodies efficiently neutralize SARS-CoV-2
Yufei Xiang, Sham Nambulli, Zhengyun Xiao, Heng Liu, Zhe Sang, W. Paul Duprex, Dina Schneidman-Duhovny, Cheng Zhang, Yi Shi
TL;DR: A large repertoire of highly potent neutralizing nanobodies (Nbs) to the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike protein receptor binding domain (RBD) is identified and a structure of one of the most potent Nbs in complex with the RBD is determined.
391
Drag-and-drop genome insertion of large sequences without double-strand DNA cleavage using CRISPR-directed integrases
Matthew T. N. Yarnall, Eleonora I. Ioannidi, Cian Schmitt-Ulms, Rohan N. Krajeski, Justin K. Lim, Lukas Villiger, Wenyuan Zhou, Kaiyi Jiang, Sofya K. Garushyants, Nathan Roberts, Liyang Zhang, Christopher A. Vakulskas, John A. Walker, Anastasia P. Kadina, Adrianna E Zepeda, Kevin Holden, Hong-Yu Ma, Jun Xie, Guangping Gao, Lander Foquet, Greg Bial, Sara K. Donnelly, Yoshinari Miyata, Daniel R. Radiloff, Jordana M. Henderson, Andrew Ujita, Omar O. Abudayyeh, Jonathan S. Gootenberg
TL;DR: Large sequences are integrated site specifically into the human genome without double-strand DNA cleavage, expanding the capabilities of genome editing by allowing large, multiplexed gene insertion without reliance on DNA repair pathways.
261
Stop codon context influences genome-wide stimulation of termination codon readthrough by aminoglycosides
Jamie R Wangen, Rachel Green
TL;DR: It is found that G418-induced miscoding alters gene expression with substantial effects on translation of histone genes, selenoprotein genes, and S-adenosylmethionine decarboxylase (AMD1).
163
Massively Parallel Assays and Quantitative Sequence–Function Relationships
TL;DR: This review describes a unified conceptual framework and a core set of mathematical modeling strategies that studies in diverse areas can use, including the identification of clinically relevant genomic variants, the modeling of transcription factor binding to DNA, the functional and evolutionary landscapes of proteins, and cis-regulatory mechanisms in both transcription and mRNA splicing.
142
References
Pfam: the protein families database.
Robert D. Finn, Alex Bateman, Jody Clements, Penelope Coggill, Ruth Y. Eberhardt, Sean R. Eddy, Andreas Heger, Kirstie Hetherington, Liisa Holm, Jaina Mistry, Erik L. L. Sonnhammer, John Tate, Marco Punta
TL;DR: Pfam is a widely used database of protein families. The current version, 27.0, contains 14 831 manually curated entries, and the database has been updated several times since 2012.
WebLogo: A Sequence Logo Generator
TL;DR: WebLogo generates sequence logos, graphical representations of the patterns within a multiple sequence alignment that provide a richer and more precise description of sequence similarity than consensus sequences and can rapidly reveal significant features of the alignment otherwise difficult to perceive.
Sequence logos: a new way to display consensus sequences
Thomas D. Schneider, R M Stephens
TL;DR: From these 'sequence logos', one can determine not only the consensus sequence but also the relative frequency of bases and the information content at every position in a site or sequence.
3.6K
GENCODE reference annotation for the human and mouse genomes.
Adam Frankish, Mark Diekhans, Anne-Maud Ferreira, Rory Johnson, Irwin Jungreis, Jane E. Loveland, Jonathan M. Mudge, Cristina Sisu, James C. Wright, Joel Armstrong, If Barnes, Andrew Berry, Alexandra Bignell, Silvia Carbonell Sala, Jacqueline Chrast, Fiona Cunningham, Tomás Di Domenico, Sarah Donaldson, Ian T. Fiddes, Carlos García Girón, Jose Manuel Gonzalez, Tiago Grego, Matthew P. Hardy, Thibaut Hourlier, Toby Hunt, Osagie G. Izuogu, Julien Lagarde, Fergal J. Martin, Laura Martinez, Shamika Mohanan, Paul R. Muir, Fabio C. P. Navarro, Anne Parker, Baikang Pei, Fernando Pozo, Magali Ruffier, Bianca M. Schmitt, Eloise Stapleton, Marie-Marthe Suner, Irina Sycheva, Barbara Uszczynska-Ratajczak, Jinuri Xu, Andrew D. Yates, Daniel R. Zerbino, Yan Zhang, Bronwen Aken, Jyoti S. Choudhary, Mark Gerstein, Roderic Guigó, Tim Hubbard, Manolis Kellis, Benedict Paten, Alexandre Reymond, Michael L. Tress, Paul Flicek
TL;DR: This work generates primary data, creates bioinformatics tools and provides analysis to support the work of expert manual gene annotators and automated gene annotation pipelines to identify and characterise gene loci to the highest standard.
3K
Learning important features through propagating activation differences
Avanti Shrikumar, Peyton Greenside, Anshul Kundaje
Proceedings Article, 06 Aug 2017
TL;DR: DeepLIFT (Deep Learning Important FeaTures), a method for decomposing the output prediction of a neural network on a specific input by backpropagating the contributions of all neurons in the network to every feature of the input, is presented.