A fast, lock-free approach for efficient parallel counting of occurrences of k-mers

doi:10.1093/BIOINFORMATICS/BTR011

Open Access•Journal Article•10.1093/BIOINFORMATICS/BTR011

Guillaume Marçais, +1 more

- 01 Mar 2011

- Bioinformatics

- Vol. 27, Iss: 6, pp 764-770

4.1K

TL;DR: This work proposes a new k-mer counting algorithm and associated implementation, called Jellyfish, which is fast and memory efficient, based on a multithreaded, lock-free hash table optimized for counting k-mers up to 31 bases in length.
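
The 31-base limit is consistent with packing each base into 2 bits, so that a k-mer fits in a single 64-bit word (62 bits for k = 31). The sketch below illustrates only that packing and a simple count. It is a single-threaded Python illustration backed by an ordinary `Counter`, not Jellyfish's lock-free hash table; the function names, the base-to-code mapping, and the handling of non-ACGT characters are assumptions made for the example.

```python
from collections import Counter

# Hypothetical 2-bit code per base, chosen for illustration only.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def count_kmers(seq, k):
    """Count k-mers of seq, keyed by a rolling 2-bit packed integer."""
    if not 1 <= k <= 31:
        raise ValueError("k must be between 1 and 31 so the key fits in 64 bits")
    mask = (1 << (2 * k)) - 1  # keeps only the lowest 2*k bits
    counts = Counter()
    key = 0
    valid = 0  # consecutive valid bases seen since the last reset
    for base in seq.upper():
        code = CODE.get(base)
        if code is None:
            # Assumption: any non-ACGT character (e.g. N) breaks the window.
            key, valid = 0, 0
            continue
        key = ((key << 2) | code) & mask  # shift in the new base, drop the oldest
        valid += 1
        if valid >= k:
            counts[key] += 1
    return counts


def decode(key, k):
    """Turn a packed key back into its k-mer string."""
    return "".join("ACGT"[(key >> (2 * (k - 1 - i))) & 3] for i in range(k))
```

For example, `count_kmers("ACGTACGT", 3)` yields four distinct keys, and `decode(key, 3)` maps each one back to its 3-mer. In a multithreaded counter of the kind the TL;DR describes, the `counts[key] += 1` step is where an atomic update would be needed.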

Citations

Journal Article•10.1111/1755-0998.13190

Chromosome-level genome assembly of a cyprinid fish Onychostoma macrolepis by integration of nanopore sequencing, Bionano and Hi-C technology.

Lina Sun, +10 more

- 01 Sep 2020

- Molecular Ecology Resources

TL;DR: The chromosome‐level assembly of the O. macrolepis genome obtained from the integration of nanopore long‐read sequencing with physical maps produced using Bionano and Hi‐C technology provides a valuable genomic resource for further biological and evolutionary studies of O. macrolepis.

33

•Journal Article•10.1111/NPH.17738

De novo genome assembly and in natura epigenomics reveal salinity-induced DNA methylation in the mangrove tree Bruguiera gymnorhiza.

Matin Miryeganeh, +3 more

- 16 Oct 2021

- New Phytologist

TL;DR: In this paper, the authors performed a de novo genome assembly and in natura epigenome analyses of the mangrove Bruguiera gymnorhiza, one of the dominant species in the world.

33

•Journal Article•10.1038/S41598-021-87419-0

The absence of the caffeine synthase gene is involved in the naturally decaffeinated status of Coffea humblotiana, a wild species from Comoro archipelago.

Nathalie Raharimalala, +20 more

- 14 Apr 2021

- Scientific Reports

TL;DR: In this article, the authors reported on the 422-Mb chromosome-level assembly of the Coffea humblotiana genome, a wild and endangered, naturally caffeine-free species from the Comoro archipelago.

33

•Journal Article•10.1038/s42003-022-04154-6

The bHLH-zip transcription factor SREBP regulates triterpenoid and lipid metabolisms in the medicinal fungus Ganoderma lingzhi

Yong-Nan Liu, +8 more

- 03 Jan 2023

- Communications Biology

TL;DR: In this paper, the authors identify putative targets of the transcription factor sterol regulatory element-binding protein (SREBP), including the genes of triterpenoid synthesis and lipid metabolism.

33

•Journal Article•10.7717/PEERJ.340

The evolutionary history and diagnostic utility of the CRISPR-Cas system within Salmonella enterica ssp. enterica

James B. Pettengill, +7 more

- 17 Apr 2014

- PeerJ

TL;DR: A novel clustering approach based on CRISPR spacer content is developed, but it is found that typing based on CRISPRs was less accurate than the MLST-based alternative; typing based on WGS data was the most accurate.

33

References

•Journal Article•10.1093/NAR/GKH340

MUSCLE: multiple sequence alignment with high accuracy and high throughput

Robert C. Edgar

- 01 Mar 2004

- Nucleic Acids Research

TL;DR: MUSCLE is a new computer program for creating multiple alignments of protein sequences that includes fast distance estimation using kmer counting, progressive alignment using a new profile function the authors call the log-expectation score, and refinement using tree-dependent restricted partitioning.

45.1K

•Book

Introduction to Algorithms

Thomas H. Cormen, +2 more

- 01 Jan 1990

TL;DR: The updated new edition of the classic Introduction to Algorithms is intended primarily for use in undergraduate or graduate courses in algorithms or data structures and presents a rich variety of algorithms and covers them in considerable depth while making their design and analysis accessible to all levels of readers.

24.8K

Journal Article•10.21276/IJRE.2018.5.5.4

MapReduce: simplified data processing on large clusters

Jeffrey Dean, +1 more

- 06 Dec 2004

TL;DR: This paper presents the implementation of MapReduce, a programming model and an associated implementation for processing and generating large data sets that runs on a large cluster of commodity machines and is highly scalable.

22.7K

Journal Article•10.1145/1327452.1327492

MapReduce: simplified data processing on large clusters

Jeffrey Dean, +1 more

- 01 Jan 2008

- Communications of the ACM

TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.

18.6K

Journal Article•10.1126/SCIENCE.287.5461.2196

A Whole-Genome Assembly of Drosophila

Eugene W. Myers, +28 more

- 24 Mar 2000

- Science

TL;DR: The quality of a whole-genome assembly of Drosophila melanogaster and the nature of the computer algorithms that accomplished it are reported on and should be of substantial value to the scientific community.

1.6K