Next-generation sequencing (NGS) makes it possible to investigate genome-wide variation at nucleotide resolution. Because of the huge amount of data produced, NGS applications require fast and accurate alignment algorithms. Most existing read-mapping algorithms adopt a seed-and-extend strategy, which is sequential in nature. We develop a sequence partitioning algorithm, called Kart, which can process long reads as fast as short reads by dividing a read into small fragments that can be aligned independently. Our experimental results indicate that Kart spends much less time on longer reads than other aligners and still produces reliable alignments even when the error rate is as high as 15%.
We also develop a new RNA-seq de novo mapper, called Kart-RNA, which adopts a similar concept. Experimental results on synthetic and real NGS datasets show that Kart-RNA is a highly efficient aligner whose sensitivity and accuracy are the highest among, or comparable to, those of most state-of-the-art aligners. More importantly, it takes the least time among the selected aligners.
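
The partitioning idea described above, in which a read is divided into fragments that can be aligned independently, can be illustrated with a minimal sketch. This is not Kart's implementation: the function names (`build_index`, `find_anchors`, `fragments`, `edit_distance`), the k-mer hash index, the greedy choice of the first colinear hit, and the use of plain edit distance are all assumptions made for illustration. Exact matches between the read and the reference serve as anchors, and only the short regions between consecutive anchors need a dynamic-programming alignment, each of which could be handled separately.

```python
from collections import defaultdict


def build_index(ref, k):
    """Map every k-mer of the reference to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(ref) - k + 1):
        index[ref[i:i + k]].append(i)
    return index


def find_anchors(read, ref, k):
    """Greedy, colinear exact matches as (read_pos, ref_pos, length) tuples.

    Assumption for illustration: the first hit at or after the previous
    anchor's end is taken; a real mapper would score competing candidates.
    """
    index = build_index(ref, k)
    anchors = []
    i, min_ref = 0, 0
    while i + k <= len(read):
        hits = [p for p in index.get(read[i:i + k], []) if p >= min_ref]
        if not hits:
            i += 1
            continue
        p, n = hits[0], k
        # Extend the exact match to the right as far as possible.
        while i + n < len(read) and p + n < len(ref) and read[i + n] == ref[p + n]:
            n += 1
        anchors.append((i, p, n))
        i += n
        min_ref = p + n
    return anchors


def fragments(read, ref, anchors):
    """Read/reference substring pairs lying between consecutive anchors."""
    return [
        (read[r1 + n1:r2], ref[g1 + n1:g2])
        for (r1, g1, n1), (r2, g2, _) in zip(anchors, anchors[1:])
    ]


def edit_distance(a, b):
    """Row-by-row Levenshtein distance between two short strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


# Toy data (hypothetical): the read is ref[3:21] with one substitution.
ref = "TTGACCAGTACGGATCCATGCAAGT"
read = "ACCAGTACTGATCCATGC"
anchors = find_anchors(read, ref, k=4)   # [(0, 3, 8), (9, 12, 9)]
gaps = fragments(read, ref, anchors)     # [("T", "G")]
# Each gap is aligned on its own; the pieces do not depend on one another.
total_edits = sum(edit_distance(a, b) for a, b in gaps)  # 1
```

In this toy case the two anchors cover 17 of the 18 read bases, so the dynamic-programming step runs on a single 1-base fragment instead of the whole read. Because the cost of that step grows with fragment length rather than read length, and the fragments are independent, the work could in principle be distributed or parallelized.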