

LINEA, A New Assembly System

Mon-Chaio Lo¹, Gary Montry², Christie Robertson¹, Todd Smith¹

1. Geospiza, Inc., 2442 NW Market St. #344, Seattle, WA, 98107, USA
2. Southwest Parallel Software, Albuquerque, NM, USA

There are two general paradigms for storing data in computer files: spreading data out over multiple files or consolidating all data into a single file. In many assembly models, such as Phred/Phrap/Consed, data must be read from and written to multiple files in a variety of formats throughout the assembly process. This places a heavy burden on the researcher, who must conform to sometimes obscure naming and hierarchical schemes in order to run each program. Furthermore, flat text files, while human-readable, create a large performance barrier because they must be parsed sequentially each time information is needed. These obstacles can be overcome by introducing a binary file format that stores the relevant assembly data, together with an SDK that gives any assembly program efficient access to the stored data. Linea is an assembly suite designed to address these problems. At its core is a widely supported, open format, the Hierarchical Data Format (HDF). Linea also includes a re-engineered Phrap-like assembler, built to take full advantage of the benefits of the Linea system.
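The performance argument above can be illustrated with a minimal sketch. It is not Linea's file layout, the HDF format, or the Linea SDK. The record fields (read id, start offset, length), the file names, and the function names are all hypothetical, chosen only to contrast sequential parsing of a text file with direct access into a fixed-width binary file.

```python
import os
import struct
import tempfile

# Hypothetical fixed-width record: read id, start offset, length (three int32s).
# This layout is an assumption for illustration, not the Linea/HDF layout.
RECORD = struct.Struct("<iii")


def write_text(path, records):
    """Write one whitespace-separated record per line."""
    with open(path, "w") as f:
        for rec in records:
            f.write("%d %d %d\n" % rec)


def read_text(path, index):
    """Flat text: lines vary in length, so every earlier line is scanned first."""
    with open(path) as f:
        for i, line in enumerate(f):
            if i == index:
                return tuple(int(x) for x in line.split())
    raise IndexError(index)


def write_binary(path, records):
    """Write records back to back as fixed-width binary."""
    with open(path, "wb") as f:
        for rec in records:
            f.write(RECORD.pack(*rec))


def read_binary(path, index):
    """Binary: the record's byte position is computable, so seek straight to it."""
    with open(path, "rb") as f:
        f.seek(index * RECORD.size)
        data = f.read(RECORD.size)
    if len(data) != RECORD.size:
        raise IndexError(index)
    return RECORD.unpack(data)


def demo():
    """Store the same records both ways and fetch record 750 from each file."""
    records = [(i, i * 100, 500 + i) for i in range(1000)]
    with tempfile.TemporaryDirectory() as d:
        text_path = os.path.join(d, "reads.txt")
        bin_path = os.path.join(d, "reads.bin")
        write_text(text_path, records)
        write_binary(bin_path, records)
        return read_text(text_path, 750), read_binary(bin_path, 750)
```

Both readers return the same record, but the text reader touches every preceding line while the binary reader performs a single seek. A self-describing format such as HDF adds named, hierarchical datasets on top of this kind of direct access.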

2004 Plant and Animal Genomics Conference
