Logomaker: beautiful sequence logos in Python.
Ammar Tareen, Justin B. Kinney
380
TL;DR: Logomaker is a Python API for creating publication-quality sequence logos that can produce both standard and highly customized logos from either a matrix-like array of numbers or a multiple-sequence alignment.
Abstract: Summary: Sequence logos are visually compelling ways of illustrating the biological properties of DNA, RNA and protein sequences, yet it is currently difficult to generate and customize such logos within the Python programming environment. Here we introduce Logomaker, a Python API for creating publication-quality sequence logos. Logomaker can produce both standard and highly customized logos from either a matrix-like array of numbers or a multiple-sequence alignment. Logos are rendered as native matplotlib objects that are easy to stylize and incorporate into multi-panel figures.
Availability and implementation: Logomaker can be installed using the pip package manager and is compatible with both Python 2.7 and Python 3.6. Documentation is provided at http://logomaker.readthedocs.io; source code is available at http://github.com/jbkinney/logomaker.
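The abstract mentions two kinds of input: a multiple-sequence alignment and a matrix-like array of numbers. The sketch below illustrates how the first can be turned into the second, using the information-content convention from the original sequence-logo literature. It is plain Python and is not Logomaker's own code; the function names `alignment_to_counts` and `counts_to_information`, the toy alignment, and the DNA alphabet are all assumptions made for illustration.

```python
import math
from collections import Counter


def alignment_to_counts(sequences, alphabet="ACGT"):
    """Hypothetical helper: one {character: count} dict per alignment column."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    rows = []
    for pos in range(length):
        column = Counter(seq[pos] for seq in sequences)
        # Characters outside the alphabet (e.g. gaps) are ignored.
        rows.append({ch: column.get(ch, 0) for ch in alphabet})
    return rows


def counts_to_information(counts):
    """Hypothetical helper: letter heights in bits for each column.

    Height of a character = its frequency times the column's information
    content, log2(alphabet size) minus the Shannon entropy of the column.
    """
    info_rows = []
    for row in counts:
        total = sum(row.values())
        if total == 0:
            # Column with no countable characters: all heights are zero.
            info_rows.append({ch: 0.0 for ch in row})
            continue
        probs = {ch: n / total for ch, n in row.items()}
        entropy = -sum(p * math.log2(p) for p in probs.values() if p > 0)
        column_info = math.log2(len(row)) - entropy
        info_rows.append({ch: p * column_info for ch, p in probs.items()})
    return info_rows


# Toy alignment (made up for this example).
alignment = ["ACGT", "ACGA", "ACCT", "AGGT"]
counts = alignment_to_counts(alignment)
info = counts_to_information(counts)

for pos, row in enumerate(info):
    print(pos, {ch: round(height, 3) for ch, height in row.items()})
```

Each row of `info` gives the stacked letter heights for one logo position. A table like this, with one row per position and one column per character, is the kind of matrix-like input the abstract describes; the Logomaker documentation should be consulted for the exact input type and drawing calls.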
Citations
Deep Mutational Scanning of SARS-CoV-2 Receptor Binding Domain Reveals Constraints on Folding and ACE2 Binding.
Tyler N. Starr, Allison J. Greaney, Sarah K Hilton, Daniel Ellis, Katharine H.D. Crawford, Adam S. Dingens, Mary Jane Navarro, John E. Bowen, M. Alejandra Tortorici, Alexandra C. Walls, Neil P. King, David Veesler, Jesse D. Bloom
TL;DR: It is found that a substantial number of mutations to the RBD are well tolerated or even enhance ACE2 binding, including at ACE2 interface residues that vary across SARS-related coronaviruses.
2K
Deep mutational scanning of SARS-CoV-2 receptor binding domain reveals constraints on folding and ACE2 binding.
Tyler N. Starr, Allison J. Greaney, Sarah K Hilton, Katharine H.D. Crawford, Mary Jane Navarro, John E. Bowen, M. Alejandra Tortorici, Alexandra C. Walls, David Veesler, Jesse D. Bloom
TL;DR: An interactive visualization and open analysis pipeline is presented to facilitate use of the dataset for vaccine design and functional annotation of mutations observed during viral surveillance.
473
Versatile and multivalent nanobodies efficiently neutralize SARS-CoV-2
Yufei Xiang, Sham Nambulli, Zhengyun Xiao, Heng Liu, Zhe Sang, W. Paul Duprex, Dina Schneidman-Duhovny, Cheng Zhang, Yi Shi
TL;DR: A large repertoire of highly potent neutralizing nanobodies (Nbs) to the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike protein receptor binding domain (RBD) is identified and a structure of one of the most potent Nbs in complex with the RBD is determined.
391
Drag-and-drop genome insertion of large sequences without double-strand DNA cleavage using CRISPR-directed integrases
Matthew T. N. Yarnall, Eleonora I. Ioannidi, Cian Schmitt-Ulms, Rohan N. Krajeski, Justin K. Lim, Lukas Villiger, Wenyuan Zhou, Kaiyi Jiang, Sofya K. Garushyants, Nathan Roberts, Liyang Zhang, Christopher A. Vakulskas, John A. Walker, Anastasia P. Kadina, Adrianna E Zepeda, Kevin Holden, Hong-Yu Ma, Jun Xie, Guangping Gao, Lander Foquet, Greg Bial, Sara K. Donnelly, Yoshinari Miyata, Daniel R. Radiloff, Jordana M. Henderson, Andrew Ujita, Omar O. Abudayyeh, Jonathan S. Gootenberg
TL;DR: Large sequences are integrated site specifically into the human genome without double-strand DNA cleavage, expanding the capabilities of genome editing by allowing large, multiplexed gene insertion without reliance on DNA repair pathways.
261
Optimization of therapeutic antibodies by predicting antigen specificity from antibody sequence via deep learning.
Derek M Mason, Simon Friedensohn, Cédric R. Weber, Christian Jordi, Bastian Wagner, Simon M Meng, Roy A. Ehling, Lucia Bonati, Jan Dahinden, Pablo Gainza, Bruno E. Correia, Sai T. Reddy
TL;DR: In this article, the authors used deep learning models trained on antibody-mutagenesis libraries to generate antibody variants and predict their antigen specificity; the predicted variants can then be filtered for viscosity, clearance, solubility and immunogenicity.
213