# parse_fasta #

So you want to parse a fasta file...

## Installation ##

Add this line to your application's Gemfile:

    gem 'parse_fasta'

And then execute:

    $ bundle

Or install it yourself as:

    $ gem install parse_fasta

## Overview ##

Provides the method `#each_record` for the `File` class.

	each_record { |header, sequence| block }

The whole file is not loaded into memory, so have no fear of giant
fasta files!
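
To give a feel for how record-at-a-time parsing keeps memory use low, here is a minimal sketch. It is not the gem's actual implementation: `each_fasta_record` is a hypothetical helper written for illustration, and it assumes every header line starts with `>` and that a sequence may span several lines.

```ruby
require 'stringio'

# Hypothetical helper, not part of parse_fasta: yields header and sequence
# for each record of an IO-like object, reading one line at a time.
def each_fasta_record(io)
  header = nil
  sequence = +''
  io.each_line do |line|
    line = line.chomp
    next if line.empty?
    if line.start_with?('>')
      # A new header closes the previous record, if there was one.
      yield header, sequence unless header.nil?
      header = line[1..-1]
      sequence = +''
    elsif header
      # Sequence lines are joined until the next header shows up.
      sequence << line
    end
  end
  # Emit the last record once the input is exhausted.
  yield header, sequence unless header.nil?
end

fasta = StringIO.new(">seq1 first\nACTG\nGG\n>seq2\nTTT\n")
each_fasta_record(fasta) do |header, sequence|
  puts [header, sequence.length].join("\t")
end
```

Only the record currently being built is held in memory, which is why the file size doesn't matter much.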

## Usage ##

An example that lists the length for each sequence.

    require 'parse_fasta'

    File.open(ARGV.first, 'r').each_record do |header, sequence|
      puts [header, sequence.length].join("\t")
    end

## Benchmark ##

Just for fun, I wanted to compare the execution time to that of
BioRuby. I calculated sequence length for each fasta record with both
the `each_record` method from this gem and using the `FastaFormat`
class from BioRuby. You can see the test script in `benchmark.rb`.

The test file contained 2,009,897 Illumina reads and was 1.1 gigabytes
in size. Here are the results from Ruby's `Benchmark` class:

                      user     system      total        real
    parse_fasta  64.530000   1.740000  66.270000 ( 67.081502)
    bioruby     116.250000   2.260000 118.510000 (120.223710)

I just wanted a nice, clean way to parse fasta files, but being nearly
twice as fasta as BioRuby doesn't hurt either!
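
If you'd like to run a timing of your own, the sketch below shows the general shape of a `Benchmark.bm` run, which prints the same user/system/total/real columns as the table above. It is not the contents of `benchmark.rb`: the `total_sequence_length` function is a hypothetical stand-in for a real parser, and the input is a small generated file rather than real reads.

```ruby
require 'benchmark'
require 'tempfile'

# Hypothetical stand-in for a real parser: sums the length of every
# non-header line, reading the file one line at a time.
def total_sequence_length(path)
  total = 0
  File.foreach(path) do |line|
    total += line.chomp.length unless line.start_with?('>')
  end
  total
end

Tempfile.create('reads.fa') do |file|
  # Generate 1,000 fake reads of 100 bases each.
  1_000.times { |i| file.puts(">read_#{i}", 'ACGT' * 25) }
  file.flush

  # The argument to bm sets the width of the label column.
  Benchmark.bm(12) do |bm|
    bm.report('line_scan') { total_sequence_length(file.path) }
  end
end
```

To compare two parsers, add a second `bm.report` line that runs the other one on the same file.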

## Notes ##

Currently it doesn't check whether your file is actually a fasta file
or anything, so watch out.
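
Until such a check exists, you could do a rough one yourself before parsing. The sketch below is only an assumption about what a reasonable check might look like: `looks_like_fasta?` is a hypothetical helper, not something the gem provides, and it only tests whether the first non-blank line starts with `>`.

```ruby
require 'stringio'

# Hypothetical helper, not part of parse_fasta: true when the first
# non-blank line of the input starts with '>'.
def looks_like_fasta?(io)
  io.each_line do |line|
    next if line.strip.empty?
    return line.start_with?('>')
  end
  false # empty or all-blank input
end

puts looks_like_fasta?(StringIO.new(">seq1\nACTG\n"))
puts looks_like_fasta?(StringIO.new("@read1\nACTG\n+\nIIII\n"))
```

This catches obvious mix-ups, like passing a fastq file, but it won't notice problems further down in the file.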
