Aggressive Assembly of Pyrosequencing Reads With Mates
Bioinformatics. 2008 Dec 15;24(24):2818-24.
doi: 10.1093/bioinformatics/btn548. Epub 2008 Oct 24.
- PMID: 18952627
- PMCID: PMC2639302
- DOI: 10.1093/bioinformatics/btn548
Free PMC article
Abstract
Motivation: DNA sequence reads from Sanger and pyrosequencing platforms differ in cost, accuracy, typical coverage, average read length and the variety of available paired-end protocols. Both read types can complement one another in a 'hybrid' approach to whole-genome shotgun sequencing projects, but assembly software must be modified to accommodate their different characteristics. This is true even of pyrosequencing mated and unmated read combinations. Without special modifications, assemblers tuned for homogeneous sequence data may perform poorly on hybrid data.
Results: Celera Assembler was modified for combinations of ABI 3730 and 454 FLX reads. The revised pipeline, called CABOG (Celera Assembler with the Best Overlap Graph), is robust to homopolymer run length uncertainty, high read coverage and heterogeneous read lengths. In tests on four genomes, it generated the longest contigs among all assemblers tested. It exploited the mate constraints provided by paired-end reads from either platform to build larger contigs and scaffolds, which were validated by comparison to a finished reference sequence. A low rate of contig mis-assembly was detected in some CABOG assemblies, but this was reduced in the presence of sufficient mate pair data.
Availability: The software is freely available as open-source from http://wgs-assembler.sf.net under the GNU Public License.
Figures
Two representations of a best overlap graph. In (a), the layout resembles a multiple sequence alignment. In (b), each read is represented by two nodes joined by an undirected edge. Arrows represent best overlaps, where best means covering the most sequence. There are mutual best overlaps between successive pairs of reads A through D. Due to erroneous bases at one end (wavy line), read E has a non-mutual best overlap to B. Paths span undirected and directed edges alternately. Path EBA converges on path ABCD. CABOG scores read E lower than the others since only three reads are on paths from it. Starting with any one of the higher-scoring reads, CABOG would build initial unitig ABCD, and then E. Using saved information about each path intersection, CABOG would discount the intersection at B because the path from E spanned only one read before B. It would break ABCD only if there were also a change in read arrival rate at B, which is not the case here. Although linear-time directed-path following finds the longest possible unitig in this constructed case, it is not guaranteed to do so when paths span multiple intersections.
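The graph in the figure can be illustrated with a small Python sketch. This is a simplified illustration under stated assumptions, not CABOG's implementation: it models each read as two ends, keeps the longest overlap per end as its best overlap, and chains reads only while best overlaps are mutual. It omits the read scoring, the saved intersection information and the read arrival rate test described in the caption. All function names, the overlap lengths, and the choice of which end of E overlaps B are hypothetical.

```python
# Simplified sketch of unitig building on a best overlap graph.
# All names here are hypothetical; this is not CABOG's implementation.

def best_overlaps(overlaps):
    """Return {read_end: read_end} giving each end's longest overlap partner."""
    best = {}  # read_end -> (partner_end, overlap_length)
    for end_a, end_b, length in overlaps:
        for src, dst in ((end_a, end_b), (end_b, end_a)):
            if src not in best or length > best[src][1]:
                best[src] = (dst, length)
    return {src: dst for src, (dst, _) in best.items()}


def other_end(read_end):
    """Cross the undirected edge joining the two ends of one read."""
    read, side = read_end
    return (read, "3'" if side == "5'" else "5'")


def extend(read_end, best, seen):
    """Walk outward from read_end while best overlaps are mutual."""
    path = []
    while read_end in best:
        partner = best[read_end]
        if best.get(partner) != read_end:  # non-mutual: stop here
            break
        if partner[0] in seen:  # guard against circular paths
            break
        seen.add(partner[0])
        path.append(partner[0])
        read_end = other_end(partner)
    return path


def unitig(read, best):
    """Build the chain of reads linked to `read` by mutual best overlaps."""
    seen = {read}
    left = extend((read, "5'"), best, seen)
    right = extend((read, "3'"), best, seen)
    return left[::-1] + [read] + right


# Overlaps modeled on the figure; lengths are made-up illustration values.
overlaps = [
    (("A", "3'"), ("B", "5'"), 80),
    (("B", "3'"), ("C", "5'"), 80),
    (("C", "3'"), ("D", "5'"), 80),
    (("E", "5'"), ("B", "3'"), 50),  # shortened by errors at one end of E
]
best = best_overlaps(overlaps)
print(unitig("B", best))  # ['A', 'B', 'C', 'D']
print(unitig("E", best))  # ['E']
```

In this toy input, E's best overlap points at B, but B's best overlap at that end points at C, so E is left as its own single-read unitig while A through D chain together, matching the outcome the caption describes for this example.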
Similar articles
- An algorithm for automated closure during assembly.
  BMC Bioinformatics. 2010 Sep 10;11:457. doi: 10.1186/1471-2105-11-457. PMID: 20831800. Free PMC article.
- QuorUM: An Error Corrector for Illumina Reads.
  PLoS One. 2015 Jun 17;10(6):e0130821. doi: 10.1371/journal.pone.0130821. eCollection 2015. PMID: 26083032. Free PMC article.
- Assessing the feasibility of GS FLX Pyrosequencing for sequencing the Atlantic salmon genome.
  BMC Genomics. 2008 Aug 28;9:404. doi: 10.1186/1471-2164-9-404. PMID: 18755037. Free PMC article.
- Benchmarking of de novo assembly algorithms for Nanopore data reveals optimal performance of OLC approaches.
  BMC Genomics. 2016 Aug 22;17 Suppl 7(Suppl 7):507. doi: 10.1186/s12864-016-2895-8. PMID: 27556636. Free PMC article.
- Assembly algorithms for next-generation sequencing data.
  Genomics. 2010 Jun;95(6):315-27. doi: 10.1016/j.ygeno.2010.03.001. Epub 2010 Mar 6. PMID: 20211242. Free PMC article. Review.
Cited by 257 articles
- EvalDNA: a machine learning-based tool for the comprehensive evaluation of mammalian genome assembly quality.
  BMC Bioinformatics. 2021 Nov 27;22(1):570. doi: 10.1186/s12859-021-04480-2. PMID: 34837948. Free PMC article.
- Telomere-to-telomere gapless chromosomes of banana using nanopore sequencing.
  Commun Biol. 2021 Sep 7;4(1):1047. doi: 10.1038/s42003-021-02559-3. PMID: 34493830. Free PMC article.
- SWALO: scaffolding with assembly likelihood optimization.
  Nucleic Acids Res. 2021 Nov 18;49(20):e117. doi: 10.1093/nar/gkab717. PMID: 34417615. Free PMC article.
- Haplotype-resolved de novo assembly of the Vero cell line genome.
  NPJ Vaccines. 2021 Aug 20;6(1):106. doi: 10.1038/s41541-021-00358-9. PMID: 34417462. Free PMC article.
- Empirical evaluation of methods for de novo genome assembly.
  PeerJ Comput Sci. 2021 Jul 9;7:e636. doi: 10.7717/peerj-cs.636. eCollection 2021. PMID: 34307867. Free PMC article.
Source: https://pubmed.ncbi.nlm.nih.gov/18952627/