A practical, bioinformatic workflow system for large data sets generated by next-generation sequencing

Cantacessi, Cinzia and Jex, Aaron R. and Hall, Ross S. and Young, Neil D. and Campbell, Bronwyn E. and Joachim, Anja and Nolan, Matthew J. and Abubucker, Sahar and Sternberg, Paul W. and Ranganathan, Shoba and Mitreva, Makedonka and Gasser, Robin B. (2010) A practical, bioinformatic workflow system for large data sets generated by next-generation sequencing. Nucleic Acids Research, 38 (17). pp. 1-12.

Abstract

Transcriptomics (at the level of single cells, tissues and/or whole organisms) underpins many fields of biomedical science, from understanding the basic
cellular function in model organisms, to the elucidation of the biological events that govern the development and progression of human diseases, and
the exploration of the mechanisms of survival, drug-resistance and virulence of pathogens. Next-generation sequencing (NGS) technologies
are contributing to a massive expansion of transcriptomics in all fields and are reducing the cost, time and performance barriers presented by conventional
approaches. However, bioinformatic tools for the analysis of the sequence data sets produced by these technologies can be daunting
to researchers with limited or no expertise in bioinformatics. Here, we constructed a semi-automated, bioinformatic workflow system, and
critically evaluated it for the analysis and annotation of large-scale sequence data sets generated by NGS. We demonstrated its utility for the exploration
of differences in the transcriptomes among various stages and both sexes of an economically important parasitic worm (Oesophagostomum dentatum) as
well as the prediction and prioritization of essential molecules (including GTPases, protein kinases and phosphatases) as novel drug target candidates. This workflow system provides a practical tool for the assembly, annotation and analysis of NGS data sets, including for researchers with limited bioinformatic expertise. The custom-written Perl, Python and Unix
shell computer scripts used can be readily modified or adapted to suit many different applications. This system is now utilized routinely for the analysis of
data sets from pathogens of major socio-economic importance and can, in principle, be applied to transcriptomics data sets from any organism.

Item Type:	Article
Subjects:	Medical and Health Sciences > Other medical sciences
Divisions:	Faculty of Medical Science
Depositing User:	Marija Kalejska
Date Deposited:	30 Nov 2012 10:48
Last Modified:	27 Oct 2014 13:29
URI:	https://eprints.ugd.edu.mk/id/eprint/2572
