S. Swat, A. Laskowski, J. Badura, A. Świercz, W. Frohmberg, P. Wojciechowski, M. Kasprzak, J. Blazewicz

De novo reconstruction of a genome sequence is a great challenge, largely because of the computational difficulty of processing millions of reads at once. ALGA is a new method for this task, based on the overlap-layout-consensus approach. The approach consists of three phases: construction of the overlap graph, preparation of the graph for traversal, and determination of the final consensus sequences. It is generally viewed as more accurate than the de Bruijn graph approach, but much more demanding in terms of time and memory. Several new ideas were implemented to increase efficiency in each phase, including a number of heuristics designed to simplify the structure of the overlap graph both during its construction and during the second phase. ALGA was tested on several real data sets, including a whole human genome, and the results were evaluated with the standard tool QUAST. Compared with other assemblers, ALGA provides very good results on metrics such as genome coverage fraction, length of the resulting sequences, and number of misassemblies.

Keywords: Genome assembly de novo, heuristics
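
To make the three phases named in the abstract concrete, here is a minimal toy sketch of overlap-layout-consensus in Python. It is not ALGA's algorithm: it assumes error-free reads from a single strand, compares all pairs of reads in quadratic time, and replaces ALGA's graph-simplification heuristics with a plain greedy edge selection. All function names (`overlap_length`, `build_overlap_graph`, `greedy_layout`, `merge_path`, `assemble`) are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def overlap_length(a, b, min_overlap):
    """Longest proper suffix of `a` equal to a prefix of `b` (0 if too short)."""
    for length in range(min(len(a), len(b)) - 1, min_overlap - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def build_overlap_graph(reads, min_overlap):
    """Phase 1 (toy): map each edge (i, j) to its overlap length, all pairs."""
    graph = {}
    for i, j in permutations(range(len(reads)), 2):
        length = overlap_length(reads[i], reads[j], min_overlap)
        if length:
            graph[(i, j)] = length
    return graph


def greedy_layout(graph, n_reads):
    """Phase 2 (toy stand-in for real simplification heuristics):
    keep at most one outgoing and one incoming edge per read,
    preferring longer overlaps, and never close a cycle."""
    succ, pred = {}, {}
    for (i, j), _ in sorted(graph.items(), key=lambda edge: -edge[1]):
        if i in succ or j in pred:
            continue
        end = j
        while end in succ:  # walk to the end of the chain starting at j
            end = succ[end]
        if end == i:  # adding i -> j would create a cycle
            continue
        succ[i], pred[j] = j, i
    paths = []
    for start in range(n_reads):
        if start in pred:
            continue
        path = [start]
        while path[-1] in succ:
            path.append(succ[path[-1]])
        paths.append(path)
    return paths


def merge_path(path, reads, graph):
    """Phase 3 (toy): with error-free reads, the consensus is a simple merge."""
    contig = reads[path[0]]
    for a, b in zip(path, path[1:]):
        contig += reads[b][graph[(a, b)]:]
    return contig


def assemble(reads, min_overlap):
    graph = build_overlap_graph(reads, min_overlap)
    return [merge_path(p, reads, graph) for p in greedy_layout(graph, len(reads))]


if __name__ == "__main__":
    print(assemble(["ATGCGTAC", "CGTACGTT", "ACGTTAGC"], min_overlap=3))
```

A practical assembler would additionally have to handle sequencing errors, reverse complements, and repeats, and would avoid the all-pairs comparison; the sketch only shows how the phases fit together.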

Scheduled

TB2 Bioinformatics
June 10, 2021  11:15 AM
2 - LV Kantorovich

