Geospiza and The HDF Group are collaborating to develop portable, scalable bioinformatics data storage technologies in HDF5. The first products, based on BioHDF, will provide data models, APIs, software tools (I/O, algorithms), and a viewer based on HDFView to support DNA polymorphism discovery and genotyping. Using BioHDF, researchers will be able to perform resequencing-based SNP discovery, analyze genotyping data, and export datasets in formats ready for submission to key databases. As a programming environment, BioHDF will be easy to extend to accept data from new genotyping platforms and to format data for interchange with many databases. Additionally, BioHDF will support whole genome association studies and linkage disequilibrium (LD) calculations in very large data sets such as HapMap. BioHDF will be delivered to the research community as an open source technology.
Project Goals:
- Extend the BioHDF data model to support polymorphism discovery and genotyping, and integrate diverse types of data from multiple technologies.
- Build a complete application, accompanied by software tools and APIs, to support BioHDF use in the research community.
- Build a prototype application for whole genome association studies based on linkage disequilibrium.
- Research methods for incorporating BioHDF into enterprise applications for clinical research and diagnostics.
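To illustrate the kind of linkage disequilibrium calculation mentioned in the goals above, here is a minimal sketch in Python of the standard r² statistic between two biallelic SNPs. It is not part of BioHDF or its APIs: the function name `ld_r_squared` and the input layout (phased haplotypes coded as 0/1 lists) are assumptions made for illustration only.

```python
import math


def ld_r_squared(hap_a, hap_b):
    """Return r^2 between two biallelic SNPs.

    hap_a and hap_b are equal-length sequences of phased haplotype
    alleles coded 0 or 1 (hypothetical input layout, not a BioHDF format).
    """
    if len(hap_a) != len(hap_b) or len(hap_a) == 0:
        raise ValueError("haplotype lists must be non-empty and equal in length")

    n = len(hap_a)
    p_a = sum(hap_a) / n  # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n  # frequency of allele 1 at SNP B
    # frequency of haplotypes carrying allele 1 at both SNPs
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b  # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return math.nan  # r^2 is undefined when either SNP is monomorphic
    return d * d / denom


if __name__ == "__main__":
    # Two SNPs whose alleles always travel together
    print(ld_r_squared([0, 1, 0, 1], [0, 1, 0, 1]))
    # Two SNPs whose alleles are statistically independent
    print(ld_r_squared([0, 0, 1, 1], [0, 1, 0, 1]))
```

For data sets on the scale of HapMap, a calculation like this would be run over many SNP pairs, which is where a chunked, array-oriented store such as HDF5 would be expected to help.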
For more information about BioHDF, please visit The HDF Group's BioHDF site. To learn more about Geospiza's research efforts or the potential for future collaborations, please request more information or call (206) 633-4403.
