= bio-gff3

GFF3 plugin for BioRuby, aimed at parsing large GFF3 files

Features:

* Take GFF (genome browser) information and assemble mRNA and CDS sequences
* Options for low memory use and caching of records
* Support for external FASTA files

You can use this plugin in two ways: as a standalone program, or as a plugin
library for BioRuby.

For example, fetch mRNA and CDS information from GFF3 files and output to FASTA:

  ./bin/gff3-fetch mrna test/data/gff/test.gff3
  ./bin/gff3-fetch cds test/data/gff/test.gff3
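To make concrete what "assembling" a CDS means, here is a minimal pure-Ruby sketch. It is not the plugin's implementation and does not use its API. The helper names (parse_gff3_line, assemble_cds, to_fasta) are hypothetical, and the sketch assumes that CDS segments are grouped by their Parent attribute and that coordinates are 1-based and inclusive, as in GFF3.

```ruby
# Illustrative only: not the plugin's implementation or API.
# Complement table used for features on the reverse strand.
COMPLEMENT = { 'A' => 'T', 'C' => 'G', 'G' => 'C', 'T' => 'A', 'N' => 'N' }.freeze

# Parse one tab-separated GFF3 feature line into a small hash.
def parse_gff3_line(line)
  seqid, _source, type, start, stop, _score, strand, _phase, attrs =
    line.chomp.split("\t")
  parent = attrs.to_s[/Parent=([^;]+)/, 1]
  { seqid: seqid, type: type, start: start.to_i, stop: stop.to_i,
    strand: strand, parent: parent }
end

# Join the CDS segments that share a Parent into one sequence per parent.
# +sequences+ maps a seqid to its nucleotide string.
def assemble_cds(gff_lines, sequences)
  features = gff_lines.reject { |l| l.start_with?('#') || l.strip.empty? }
                      .map { |l| parse_gff3_line(l) }
                      .select { |f| f[:type] == 'CDS' && f[:parent] }
  features.group_by { |f| f[:parent] }.map do |parent, parts|
    # GFF3 coordinates are 1-based and inclusive; Ruby strings are 0-based.
    joined = parts.sort_by { |f| f[:start] }
                  .map { |f| sequences.fetch(f[:seqid])[(f[:start] - 1)..(f[:stop] - 1)] }
                  .join
    if parts.first[:strand] == '-'
      # Reverse-complement the joined sequence for reverse-strand features.
      joined = joined.reverse.each_char.map { |c| COMPLEMENT.fetch(c.upcase, 'N') }.join
    end
    [parent, joined]
  end.to_h
end

# Format a hash of id => sequence as FASTA text.
def to_fasta(records)
  records.map { |id, seq| ">#{id}\n#{seq}\n" }.join
end
```

Calling to_fasta(assemble_cds(lines, sequences)) on a handful of feature lines gives one FASTA record per parent. A real parser also has to handle phase, URL-escaped attributes, and multiple parents, all of which this sketch ignores.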

To use it as a library, clone this repository, add the 'lib' directory to the
Ruby search path, and require the GFF database module:

  require 'bio/db/gff/gffdb'

To run the RSpec tests, use something like:

  rspec -I ../bioruby/lib/ spec/*.rb 

This implementation depends on BioRuby's basic GFF3 parser. Its intended
advantage is that the plugin is faster and does not need to hold all records
in memory. The GFF3 specs are based on the output of the WormBase genome
browser.
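One common way to avoid holding every record in memory is to store only the byte offset of each record and re-read it from disk when needed. The sketch below illustrates that general technique. It is an assumption for illustration, not a description of how this plugin works internally, and the OffsetIndex class with its methods is hypothetical.

```ruby
# Illustrative only: class and method names are hypothetical, not the plugin's API.
class OffsetIndex
  def initialize(path)
    @path = path
    @offsets = {} # record ID => byte offset of its line in the file
    File.open(path, 'r') do |f|
      until f.eof?
        pos = f.pos
        line = f.readline
        next if line.start_with?('#')
        # Take the ID attribute only when it starts an attribute field.
        id = line[/(?:\A|[\t;])ID=([^;\s]+)/, 1]
        @offsets[id] = pos if id
      end
    end
  end

  # IDs in the order they were first seen in the file.
  def ids
    @offsets.keys
  end

  # Re-read a single record from disk on demand instead of caching it.
  def fetch(id)
    pos = @offsets.fetch(id)
    File.open(@path, 'r') do |f|
      f.seek(pos)
      f.readline.chomp
    end
  end
end
```

The trade-off is the usual one: the index stays small, but every lookup costs a seek and a read, so this is slower than keeping parsed records in RAM.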

For a write-up see http://thebird.nl/bioruby/BioRuby_GFF3.html 

-------------------------------------------------------------------------------


  Fetch and assemble mRNA or CDS sequences and print them in FASTA format.

    gff3-fetch [--no-cache] mRNA|CDS [filename.fa] filename.gff

  Where:

    --no-cache      : do not load everything in memory (slower)
    mRNA            : assemble mRNA
    CDS             : assemble CDS 

  Multiple GFF3 files can be used. For external FASTA files, the last one
  listed before a GFF file is always the one used for that file.

  Examples:

    Find mRNA and CDS information from test.gff3 (which includes sequence information)

      gff3-fetch mRNA test/data/gff/test.gff3
      gff3-fetch CDS test/data/gff/test.gff3

    Find CDS from external FASTA file

      gff3-fetch CDS test/data/gff/MhA1_Contig1133.fa test/data/gff/MhA1_Contig1133.gff3

    Find mRNA from external FASTA file, without loading everything in RAM

      gff3-fetch --no-cache mRNA test/data/gff/test-ext-fasta.fa test/data/gff/test-ext-fasta.gff3   

  If you use this software, please cite http://dx.doi.org/10.1093/bioinformatics/btq475
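The rule in the usage text, that the last FASTA file before a GFF file is the one used, can be sketched as a small argument-pairing routine. This is only one reading of that rule, not the tool's actual argument handling. The pair_inputs helper is hypothetical, and detecting FASTA files by a .fa or .fasta extension is an assumption made for the example.

```ruby
# Illustrative only: the extension check is an assumption, not how
# gff3-fetch decides which files are FASTA.
FASTA_EXT = /\.(fa|fasta)\z/i

# Return [gff_file, fasta_file_or_nil] pairs, using the most recent FASTA
# file seen before each GFF file. A nil FASTA means the GFF file is
# expected to carry its own sequence information.
def pair_inputs(filenames)
  last_fasta = nil
  filenames.each_with_object([]) do |name, pairs|
    if name =~ FASTA_EXT
      last_fasta = name
    else
      pairs << [name, last_fasta]
    end
  end
end
```

In this sketch a FASTA file carries over to every later GFF file until another FASTA file appears. Whether the tool carries it over in the same way is not stated here.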

== Copyright

Copyright (C) 2010,2011 Pjotr Prins <pjotr.prins@thebird.nl>