Aggressive Assembly of Pyrosequencing Reads With Mates

Bioinformatics. 2008 Dec 15;24(24):2818-24.

doi: 10.1093/bioinformatics/btn548. Epub 2008 October 24.

  • PMID: 18952627
  • PMCID: PMC2639302
  • DOI: 10.1093/bioinformatics/btn548

Free PMC article

Jason R Miller et al. Bioinformatics. 2008.


Abstract

Motivation: DNA sequence reads from Sanger and pyrosequencing platforms differ in cost, accuracy, typical coverage, average read length and the variety of available paired-end protocols. Both read types can complement one another in a 'hybrid' approach to whole-genome shotgun sequencing projects, but assembly software must be modified to accommodate their different characteristics. This is true even of pyrosequencing mated and unmated read combinations. Without special modifications, assemblers tuned for homogeneous sequence data may perform poorly on hybrid data.

Results: Celera Assembler was modified for combinations of ABI 3730 and 454 FLX reads. The revised pipeline, called CABOG (Celera Assembler with the Best Overlap Graph), is robust to homopolymer run length uncertainty, high read coverage and heterogeneous read lengths. In tests on four genomes, it generated the longest contigs among all assemblers tested. It exploited the mate constraints provided by paired-end reads from either platform to build larger contigs and scaffolds, which were validated by comparison to a finished reference sequence. A low rate of contig mis-assembly was detected in some CABOG assemblies, but this was reduced in the presence of sufficient mate pair data.

Availability: The software is freely available as open-source from http://wgs-assembler.sf.net under the GNU Public License.

Figures

Fig. 1.

Two representations of a best overlap graph. In (a), the layout resembles a multiple sequence alignment. In (b), each read is represented by two nodes joined by an undirected edge. Arrows represent best overlaps, where best means covering the most sequence. There are mutual best overlaps between successive pairs of reads A through D. Due to erroneous bases at one end (wavy line), read E has a non-mutual best overlap to B. Paths span undirected and directed edges alternately. Path EBA converges on path ABCD. CABOG scores read E lower than the others since only three reads are on paths from it. Starting with any one of the high-scoring reads, CABOG would build initial unitig ABCD, and then E. Using saved information about each path intersection, CABOG would discount the intersection at B because the path from E spanned only one read before B. It would break ABCD only if there were also a change in read arrival rate at B, which is not the case here. Although linear-time directed-path following finds the longest possible unitig in this constructed case, it is not guaranteed to do so when paths span multiple intersections.
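The path-following idea in the Fig. 1 caption can be illustrated with a small sketch. This is not CABOG's code: the overlap lengths, the `OVERLAPS` table, and the helper names (`best_overlaps`, `follow`, `score`, `build_unitigs`) are all hypothetical, and the sketch omits the intersection bookkeeping and arrival-rate test that the caption mentions. It only shows how best overlaps per read end, a per-read path score, and greedy seeding could reproduce the unitigs ABCD and E for the five reads in the figure.

```python
# Toy best overlap graph for reads A-E (all overlap lengths are invented).
# Each read has two ends, "L" and "R"; keys and values are (read, end) pairs.
OVERLAPS = {
    ("A", "R"): {("B", "L"): 80},
    ("B", "L"): {("A", "R"): 80},
    ("B", "R"): {("C", "L"): 75, ("E", "L"): 40},
    ("C", "L"): {("B", "R"): 75},
    ("C", "R"): {("D", "L"): 70},
    ("D", "L"): {("C", "R"): 70},
    ("E", "L"): {("B", "R"): 40},  # E's only overlap; B prefers C, so it is non-mutual
}


def best_overlaps(overlaps):
    """For every read end, keep the partner end with the longest overlap."""
    return {end: max(cands, key=cands.get) for end, cands in overlaps.items()}


def other_end(end):
    """Cross a read along its undirected edge to the opposite end."""
    read, side = end
    return (read, "L" if side == "R" else "R")


def is_mutual(best, end):
    """True if the best overlap of `end` points back at `end`."""
    return best.get(best.get(end)) == end


def follow(best, end, blocked):
    """Follow best-overlap arrows out of `end`, skipping reads in `blocked`."""
    blocked = set(blocked)
    reached = []
    while end in best:
        partner = best[end]
        if partner[0] in blocked:  # already used, or a cycle
            break
        reached.append(partner[0])
        blocked.add(partner[0])
        end = other_end(partner)
    return reached


def score(best, read):
    """Number of distinct reads on best-overlap paths leaving either end of `read`."""
    seen = {read}
    seen.update(follow(best, (read, "L"), {read}))
    seen.update(follow(best, (read, "R"), {read}))
    return len(seen)


def build_unitigs(best, reads):
    """Seed from the highest-scoring unplaced read and extend in both directions."""
    placed, unitigs = set(), []
    for seed in sorted(reads, key=lambda r: (-score(best, r), r)):
        if seed in placed:
            continue
        right = follow(best, (seed, "R"), placed | {seed})
        left = follow(best, (seed, "L"), placed | {seed} | set(right))
        unitig = left[::-1] + [seed] + right
        placed.update(unitig)
        unitigs.append(unitig)
    return unitigs


BEST = best_overlaps(OVERLAPS)
UNITIGS = build_unitigs(BEST, "ABCDE")
```

In this toy input, reads A through D each score 4 and read E scores 3, so seeding starts from a high-scoring read and E ends up alone in its own unitig, as in the caption. A real assembler would also need to handle reverse-complement orientation, containments and overlap error rates, none of which are modelled here.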

Similar articles

  • An algorithm for automated closure during assembly.

    Koren S, Miller JR, Walenz BP, Sutton G. Koren S, et al. BMC Bioinformatics. 2010 Sep 10;11:457. doi: 10.1186/1471-2105-11-457. BMC Bioinformatics. 2010. PMID: 20831800 Free PMC article.

  • QuorUM: An Error Corrector for Illumina Reads.

    Marçais G, Yorke JA, Zimin A. Marçais G, et al. PLoS One. 2015 Jun 17;10(6):e0130821. doi: 10.1371/journal.pone.0130821. eCollection 2015. PLoS One. 2015. PMID: 26083032 Free PMC article.

  • Assessing the feasibility of GS FLX Pyrosequencing for sequencing the Atlantic salmon genome.

    Quinn NL, Levenkova N, Chow W, Bouffard P, Boroevich KA, Knight JR, Jarvie TP, Lubieniecki KP, Desany BA, Koop BF, Harkins TT, Davidson WS. Quinn NL, et al. BMC Genomics. 2008 Aug 28;9:404. doi: 10.1186/1471-2164-9-404. BMC Genomics. 2008. PMID: 18755037 Free PMC article.

  • Benchmarking of de novo assembly algorithms for Nanopore data reveals optimal performance of OLC approaches.

    Cherukuri Y, Janga SC. Cherukuri Y, et al. BMC Genomics. 2016 Aug 22;17 Suppl 7(Suppl 7):507. doi: 10.1186/s12864-016-2895-8. BMC Genomics. 2016. PMID: 27556636 Free PMC article.

  • Assembly algorithms for next-generation sequencing data.

    Miller JR, Koren S, Sutton G. Miller JR, et al. Genomics. 2010 Jun;95(6):315-27. doi: 10.1016/j.ygeno.2010.03.001. Epub 2010 Mar 6. Genomics. 2010. PMID: 20211242 Free PMC article. Review.

Cited by 257 articles

  • EvalDNA: a machine learning-based tool for the comprehensive evaluation of mammalian genome assembly quality.

    MacDonald ML, Lee KH. MacDonald ML, et al. BMC Bioinformatics. 2021 Nov 27;22(1):570. doi: 10.1186/s12859-021-04480-2. BMC Bioinformatics. 2021. PMID: 34837948 Free PMC article.

  • Telomere-to-telomere gapless chromosomes of banana using nanopore sequencing.

    Belser C, Baurens FC, Noel B, Martin G, Cruaud C, Istace B, Yahiaoui N, Labadie G, Hřibová E, Doležel J, Lemainque A, Wincker P, D'Hont A, Aury JM. Belser C, et al. Commun Biol. 2021 Sep 7;4(1):1047. doi: 10.1038/s42003-021-02559-3. Commun Biol. 2021. PMID: 34493830 Free PMC article.

  • SWALO: scaffolding with assembly likelihood optimization.

    Rahman A, Pachter L. Rahman A, et al. Nucleic Acids Res. 2021 Nov 18;49(20):e117. doi: 10.1093/nar/gkab717. Nucleic Acids Res. 2021. PMID: 34417615 Free PMC article.

  • Haplotype-resolved de novo assembly of the Vero cell line genome.

    Sène MA, Kiesslich S, Djambazian H, Ragoussis J, Xia Y, Kamen AA. Sène MA, et al. NPJ Vaccines. 2021 Aug 20;6(1):106. doi: 10.1038/s41541-021-00358-9. NPJ Vaccines. 2021. PMID: 34417462 Free PMC article.

  • Empirical evaluation of methods for de novo genome assembly.

    Dida F, Yi G. Dida F, et al. PeerJ Comput Sci. 2021 Jul 9;7:e636. doi: 10.7717/peerj-cs.636. eCollection 2021. PeerJ Comput Sci. 2021. PMID: 34307867 Free PMC article.



Source: https://pubmed.ncbi.nlm.nih.gov/18952627/
