Towards computational improvement of DNA database indexing and short DNA query searching

Stojanov, Done and Koceski, Saso and Mileva, Aleksandra and Koceska, Natasa and Martinovska Bande, Cveta (2014) Towards computational improvement of DNA database indexing and short DNA query searching. Biotechnology & Biotechnological Equipment, 28 (5). pp. 958-967. ISSN 1310-2818

In order to facilitate and speed up the search of massive DNA databases, the database is indexed at the beginning,
employing a mapping function. By searching through the indexed data structure, exact query hits can be identified. If the
database is searched against an annotated DNA query, such as a known promoter consensus sequence, then the starting
locations and the number of potential genes can be determined. This is particularly relevant if unannotated DNA sequences
have to be functionally annotated. However, indexing a massive DNA database and searching an indexed data structure
with millions of entries is a time-demanding process. In this paper, we propose a fast DNA database indexing and
searching approach that identifies all query hits in the database without having to examine all entries in the indexed data
structure, subject to a limit on the maximum length of a query that can be searched against the database. By applying the proposed
indexing equation, the whole human genome could be indexed in 10 hours on a personal computer, under the assumption
that there is enough RAM to store the indexed data structure.
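
The general idea of indexing with a mapping function and then looking up exact query hits can be illustrated with a minimal Python sketch. It is not the indexing equation proposed in the paper, which the abstract does not state. It assumes a simple base-4 encoding of fixed-length windows (k-mers); the names `BASE_CODE`, `kmer_key`, `build_index` and `find_hits`, the toy sequence, and the choice of the bacterial -10 promoter consensus TATAAT as the query are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical 2-bit encoding of nucleotides; not taken from the paper.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_key(kmer):
    """Map a DNA string to an integer by reading it as a base-4 number."""
    key = 0
    for base in kmer:
        key = key * 4 + BASE_CODE[base]
    return key


def build_index(sequence, k):
    """Index every length-k window of `sequence` by its integer key."""
    index = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        # Skip windows containing symbols outside A/C/G/T (for example N).
        if all(base in BASE_CODE for base in window):
            index[kmer_key(window)].append(start)
    return index


def find_hits(index, query, k):
    """Return the start positions of exact matches of a length-k query."""
    if len(query) != k:
        raise ValueError("this sketch only supports queries of length k")
    # A single dictionary lookup replaces a scan over all indexed entries.
    return index.get(kmer_key(query), [])


# Toy database with two occurrences of the query.
database = "GGTATAATGCCTATAATAC"
index = build_index(database, k=6)
hits = find_hits(index, "TATAAT", k=6)
print(hits)  # [2, 11]
```

In this sketch the query length is fixed to the window length `k`, which loosely mirrors the trade-off described in the abstract: lookups avoid examining every indexed entry, but only queries up to a bounded length are supported.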

Item Type: Article
Subjects: Natural sciences > Computer and information sciences
Divisions: Faculty of Computer Science
Depositing User: Done Stojanov
Date Deposited: 27 Nov 2014 13:03
Last Modified: 27 Nov 2014 13:03
