TL;DR: The UCSC Genome Browser Database (GBD) is a publicly available collection of genome assembly sequence data and integrated annotations for a large number of organisms, including extensive comparative-genomic resources.
Abstract: The UCSC Genome Browser Database (GBD, http://genome.ucsc.edu) is a publicly available collection of genome assembly sequence data and integrated annotations for a large number of organisms, including extensive comparative-genomic resources. In the past year, 13 new genome assemblies have been added, including two important primate species, orangutan and marmoset, bringing the total to 46 assemblies for 24 different vertebrates and 39 assemblies for 22 different invertebrate animals. The GBD datasets may be viewed graphically with the UCSC Genome Browser, which uses a coordinate-based display system allowing users to juxtapose a wide variety of data. These data include all mRNAs from GenBank mapped to all organisms, RefSeq alignments, gene predictions, regulatory elements, gene expression data, repeats, SNPs and other variation data, as well as pairwise and multiple-genome alignments. A variety of other bioinformatics tools are also provided, including BLAT, the Table Browser, the Gene Sorter, the Proteome Browser, VisiGene and Genome Graphs.
TL;DR: In this paper, the authors developed a methodology that may both guide the comparison of different assemblers' output and improve the overall quality of genome assembly sequences by merging the sequences produced by different assembly programs.
Abstract: Many software tools are currently available to solve the hard problem of assembling the millions of fragments produced in sequencing projects. These include packages for long and short reads, generated by classical and next-generation sequencing technologies. Often the results produced by different tools diverge, sometimes significantly, for many reasons: the underlying algorithm, the data structures employed, the heuristics implemented, default parameters, etc. On the basis of these considerations, we were motivated to develop a methodology that may both guide the comparison of different assemblers' output and improve the overall quality of genome assembly sequences by merging the sequences produced by different assembly programs.
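The abstract does not say which measures guide the comparison of assembler outputs. As a minimal sketch, assuming a contiguity statistic such as N50 is one of them, the following compares two hypothetical assemblies by contig count, total length and N50. The assembler names and contig lengths are illustrative only and are not taken from the paper.

```python
def n50(contig_lengths):
    """Return the N50: the length of the contig at which the running total,
    taken from longest to shortest, first reaches half the assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly


# Hypothetical contig lengths from two assemblers run on the same reads.
assemblies = {
    "assembler_a": [100, 80, 50, 30, 20],
    "assembler_b": [150, 40, 40, 30, 20],
}

summary = {
    name: {"contigs": len(lengths), "total": sum(lengths), "n50": n50(lengths)}
    for name, lengths in assemblies.items()
}

for name, stats in summary.items():
    print(name, stats)
```

Both toy assemblies have the same total length, so the difference in N50 reflects contiguity alone. A real comparison would also need correctness measures, which N50 does not capture.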
TL;DR: In this paper, a standard sequence verification method and device are provided, and an evaluation method and device for a genome assembly sequence are also presented. The evaluation efficiency of the standard sequence on the genome assembly sequence is further improved.
Abstract: The invention discloses a standard sequence verification method and device, and also discloses an evaluation method and device for a genome assembly sequence. According to the standard sequence verification method provided by the invention, a sample DNA segment is contained in the standard sequence. The verification method comprises the steps of obtaining a reads database, wherein the reads database is obtained by building a sequencing library after the sample DNA segmentation processing and performing sequencing; performing comparison of the standard sequence with the reads database; and calculating the accuracy of the standard sequence. The method and device provided by the invention have the beneficial effects that the reads database obtained through sample DNA sequencing is compared with the standard sequence, i.e., the standard sequence is subjected to auxiliary verification; according to the comparison result, wrong standard sequences are filtered away; the accuracy of the standard sequence used for verifying the genome assembly sequence is ensured; and the evaluation efficiency of the standard sequence on the genome assembly sequence is further improved.
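The abstract does not define how the accuracy of a standard sequence is calculated. The sketch below makes a simplifying assumption: accuracy is the fraction of standard-sequence bases covered by at least one read that matches exactly. The function names, the 0.9 threshold and the toy sequences are hypothetical, and a real pipeline would use a read aligner rather than exact string search.

```python
def covered_fraction(standard, reads):
    """Fraction of positions in `standard` covered by an exactly matching read."""
    if not standard:
        return 0.0
    covered = [False] * len(standard)
    for read in reads:
        if not read:
            continue
        start = standard.find(read)
        while start != -1:
            # Mark every base spanned by this occurrence of the read.
            for i in range(start, start + len(read)):
                covered[i] = True
            start = standard.find(read, start + 1)
    return sum(covered) / len(standard)


def filter_standards(standards, reads, min_accuracy=0.9):
    """Keep only standard sequences whose read support meets the threshold."""
    return {
        name: seq
        for name, seq in standards.items()
        if covered_fraction(seq, reads) >= min_accuracy
    }


# Toy data: the second standard sequence is not supported by the reads.
reads = ["ACGTA", "CGTAA"]
standards = {"std_1": "ACGTACGTAA", "std_2": "ACGTTTTTAA"}
verified = filter_standards(standards, reads)
print(sorted(verified))
```

Only the standard sequences that survive this filter would then be used to evaluate a genome assembly sequence, which mirrors the "wrong standard sequences are filtered away" step in the abstract.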
TL;DR: In this paper, a method and system for the joint assembly of second-generation sequences and third-generation single molecule real-time sequencing sequences is presented, which can improve the index and accuracy of genome assembly.
Abstract: The invention discloses a method and system for the joint assembly of second-generation sequences and third-generation single molecule real-time sequencing sequences. The method comprises the steps of performing a second-generation sequence assembly to obtain a first-stage second-generation genome skeleton sequence; using second-generation sequences for hole filling of the first-stage second-generation genome skeleton sequence to obtain a second-stage second-generation genome skeleton sequence; using third-generation single molecule real-time sequencing sequences for hole filling of the second-generation genome skeleton sequences to obtain first-stage second-generation and third-generation skeleton sequences; splicing self-corrected third-generation single-molecule real-time sequencing sequences with the first-stage second-generation and third-generation skeleton sequences to obtain second-stage second-generation and third-generation skeleton sequences by using the mutual overlapping relation between the self-corrected third-generation single-molecule real-time sequencing sequences and the first-stage second-generation and third-generation skeleton sequences; comparing the second-generation sequences with the second-stage second-generation and third-generation skeleton sequences to obtain invalid comparison regions and replacing the regions with invalid sequences to obtain third-stage second-generation and third-generation skeleton sequences; and using the second-generation sequences for hole filling of the third-stage second-generation and third-generation skeleton sequences to obtain the final genome assembly sequence. The method can improve the index and accuracy of genome assembly.
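The hole-filling steps above can be illustrated with a toy sketch. It assumes that holes in a skeleton sequence are written as runs of `N` and that a read fills a hole when it contains the exact sequence on both sides of it. The function name, the flank length and the sequences are hypothetical; the patent's actual procedure is not described at this level of detail.

```python
import re


def fill_gaps(scaffold, long_reads, flank=4):
    """Replace each run of N with the bridging bases from a read that
    contains both flanking sequences in the right order."""
    result = scaffold
    # Work right to left so earlier gap coordinates stay valid after edits.
    for gap in reversed(list(re.finditer(r"N+", scaffold))):
        left = scaffold[max(0, gap.start() - flank):gap.start()]
        right = scaffold[gap.end():gap.end() + flank]
        if len(left) < flank or len(right) < flank or "N" in left + right:
            continue  # not enough known sequence to anchor this gap
        for read in long_reads:
            i = read.find(left)
            if i == -1:
                continue
            j = read.find(right, i + flank)
            if j == -1:
                continue
            bridge = read[i + flank:j]
            result = result[:gap.start()] + bridge + result[gap.end():]
            break
    return result


scaffold = "ACGTTGCANNNNGGATCCTA"
long_reads = ["TTGCATTAGGGATCC"]
print(fill_gaps(scaffold, long_reads))
```

Exact flank matching is used here only to keep the example short. Third-generation reads have a high error rate, so a practical tool would align the flanks with tolerance for mismatches and indels.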