Quality assessment of de novo sequence assembly tools
High-throughput next-generation sequencing technologies have progressed very rapidly and have revolutionized genomics, providing a robust platform for new studies and promising to facilitate achievements that were extremely challenging before. Although the massive output of these instruments is becoming more accurate, it still delivers the real sequence only as very short fragments, which must then be merged and ordered to reconstruct larger sequences. This process is performed by sequence assemblers; in the absence of a reference genome, it is called de novo sequence assembly. Because assembling millions of fragments poses many obvious challenges from a biological standpoint, many studies have focused specifically on developing tools that adapt to newly announced sequencing technologies and take full advantage of advances in computer science and computer hardware. These sequence assemblers, however, also need to justify the gains they claim. We took five commonly used assemblers and assembled two genomic datasets, then mined both statistics that have not been reported before and commonly used statistics that are thought to be representative of assembly quality. In addition, we used experimentally validated data known to be part of the organisms' genomes and traced them in the assemblies.
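The two kinds of checks described above can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the text: N50 is taken as one example of a commonly used assembly statistic, and validated sequences are looked up by exact substring matching on either strand, whereas a real evaluation would more likely use an aligner. The function names are hypothetical.

```python
# Illustrative sketch only: N50 is an assumed example statistic, and exact
# substring matching is a simplification of tracing sequences in an assembly.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def n50(contig_lengths):
    """Return the length L such that contigs of length >= L cover at least
    half of the total assembled bases (0 for an empty assembly)."""
    total = sum(contig_lengths)
    if total == 0:
        return 0
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def reverse_complement(seq):
    """Reverse-complement an upper-case DNA string of A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def trace_known_sequences(contigs, known):
    """Map each known sequence name to True if the sequence, or its reverse
    complement, occurs exactly in at least one contig."""
    found = {}
    for name, seq in known.items():
        rc = reverse_complement(seq)
        found[name] = any(seq in contig or rc in contig for contig in contigs)
    return found
```

Calling `n50` on the contig lengths of each assembly and `trace_known_sequences` on its contigs would give one summary number and one recovery table per assembler, which can then be compared side by side.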