

Second Generation DNA Alignment Tools

Christie Robertson¹, Eric Flynn¹, Gary Montry², Sandra Porter¹, and Todd M. Smith¹
1. Geospiza, Inc., 2442 NW Market St., Seattle, WA 98107 USA
2. Southwest Parallel Software, Albuquerque, NM

The Human Genome Project spurred the development of high-throughput technologies, especially in DNA sequencing. This effort not only uncovered the sequence of the human genome but also catalyzed the development of an entire industry based on DNA sequencing and genomics. Because these technologies produce enormous amounts of data, they depend on bioinformatics programs for data management. Phrap, Cross_Match, RepeatMasker, and Consed have played an integral role in genome projects and have come to be accepted as standard tools for genomic alignment and assembly. As sequencing technology and software have evolved, however, so too have the scientific applications that rely on these programs. Specific needs associated with whole-genome shotgun sequencing, EST cluster analysis, and genotyping applications highlight the importance of updating standard bioinformatics programs to meet the requirements of a broader community.

We are re-engineering Phrap, Cross_Match, and RepeatMasker to improve their performance and utility by optimizing the core algorithms and by developing a framework to store, manipulate, and view assembled sequence data. We are developing a structure through which XML-formatted hints and constraints pass instructions to the core alignment program, telling it how to handle parts of the data, or the data set as a whole, in individualized ways. Hints regarding read pairs, associations or non-associations between reads or contigs, sequencing reaction conditions, highly repetitive regions, reference sequences, and other information can then be applied to direct sequence alignment without altering the underlying data itself. In addition, a new viewing program is being developed to review, edit, and manipulate sequences, giving users unprecedented control over their data.
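To make the idea of XML-formatted hints concrete, here is a minimal Python sketch of how such hints might be read and consulted. The abstract does not specify a schema, so every element name, attribute name, and function here (`read_pair`, `exclude_join`, `repeat_region`, `parse_hints`, `pair_is_consistent`) is a hypothetical illustration, not the actual format or API of the re-engineered tools. The point it shows is that hints live beside the reads and guide alignment decisions without modifying the sequence data.

```python
import xml.etree.ElementTree as ET

# Hypothetical hint document. The element and attribute names are
# assumptions made for illustration; the abstract defines no schema.
HINTS_XML = """
<hints>
  <read_pair forward="read001.f" reverse="read001.r"
             min_insert="1500" max_insert="2500"/>
  <exclude_join a="contig3" b="contig7"/>
  <repeat_region read="read042.f" start="120" end="480"/>
</hints>
"""


def parse_hints(xml_text):
    """Parse the hypothetical hint document into plain Python structures."""
    root = ET.fromstring(xml_text.strip())
    hints = {"read_pairs": [], "exclude_joins": [], "repeat_regions": []}
    for el in root.findall("read_pair"):
        hints["read_pairs"].append(
            (el.get("forward"), el.get("reverse"),
             int(el.get("min_insert")), int(el.get("max_insert")))
        )
    for el in root.findall("exclude_join"):
        hints["exclude_joins"].append((el.get("a"), el.get("b")))
    for el in root.findall("repeat_region"):
        hints["repeat_regions"].append(
            (el.get("read"), int(el.get("start")), int(el.get("end")))
        )
    return hints


def pair_is_consistent(hints, forward, reverse, observed_insert):
    """Check an observed insert size against a read-pair hint.

    Returns True when no hint constrains this pair, so the absence of a
    hint never rejects a placement.
    """
    for fwd, rev, lo, hi in hints["read_pairs"]:
        if (fwd, rev) == (forward, reverse):
            return lo <= observed_insert <= hi
    return True


hints = parse_hints(HINTS_XML)
print(pair_is_consistent(hints, "read001.f", "read001.r", 2000))
print(pair_is_consistent(hints, "read001.f", "read001.r", 5000))
```

In this sketch the reads themselves are never touched: the assembler would consult the parsed hints when deciding whether to accept a candidate placement or join, which mirrors the stated goal of directing alignment without altering the underlying data.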

Recomb 2003 DNA Sequencing Technologies and Computation
