README.rdoc in bio-gff3-0.8.0 vs README.rdoc in bio-gff3-0.8.2

- old
+ new

@@ -1,69 +1,60 @@
 = bio-gff3

-GFF3 plugin for BioRuby, aimed at parsing big data
+GFF3 parser, aimed at parsing big data GFF3 to return sequences of any type,
+including assembled mRNA, protein and CDS sequences.

 Features:

-# Take GFF (genome browser) information and digest mRNA and CDS sequences
+# Take GFF3 (genome browser) information of any type, and assemble sequences, e.g. mRNA and CDS
 # Options for low memory use and caching of records
-# Support for external FASTA files
+# Support for external FASTA input files
+# Use of multi-cores (NYI)

+Currently the output is a FASTA file.
+
 You can use this plugin in two ways. First as a standalone program, next as a
 plugin library to BioRuby.

-For example, fetch mRNA and CDS information from GFF3 files and output to FASTA:
+== Install and run gff3-fetch

-  ./bin/gff3-fetch mrna test/data/gff/test.gff3
-  ./bin/gff3-fetch cds test/data/gff/test.gff3
+After installing ruby 1.9, or later, you can use rubygems

-Or clone this repository and add the 'lib' dir to the Ruby search path and
+  gem install bio-gff3

-  require 'bio/db/gff/gffdb'
+Then, fetch mRNA and CDS information from GFF3 files and output to FASTA:

-You can also run RSpec with something like
+  gff3-fetch mrna test/data/gff/test.gff3
+  gff3-fetch cds test/data/gff/test.gff3

-  rspec -I ../bioruby/lib/ spec/*.rb
+== Development

-This implementation depends on BioRuby's basic GFF3 parser, with the possible
-advantage that the plugin is faster and does not consume all memory. The Gff3
-specs are based on the output of the Wormbase genome browser.
+To use the library

-For a write-up see http://thebird.nl/bioruby/BioRuby_GFF3.html
+  require 'bio-gff3'

--------------------------------------------------------------------------------
+For coding examples see ./bin/gff3-fetch and the ./spec/*rb
+You can run RSpecs with something like

-  Fetch and assemble mRNAs, or CDS and print in FASTA format.
+  rspec -I ../bioruby/lib/ spec/*.rb

-  gff3-fetch [--no-cache] mRNA|CDS [filename.fa] filename.gff
+(supposing you are referring a bioruby source repository)

-  Where:
+This implementation depends on BioRuby's basic GFF3 parser, with the possible
+advantage that the plugin can assemble sequences, is faster and does not
+consume all memory. The Gff3 specs are based on the output of the Wormbase
+genome browser.

-  --no-cache : do not load everything in memory (slower)
-  mRNA : assemble mRNA
-  CDS : assemble CDS
+== See also

-  Multiple GFF3 files can be used. For external FASTA files, always the last
-  one before the GFF file is used.
+  gff3-fetch --help

-  Examples:
+For a write-up see http://thebird.nl/bioruby/BioRuby_GFF3.html

-  Find mRNA and CDS information from test.gff3 (which includes sequence information)
+-------------------------------------------------------------------------------

-  gff3-fetch mRNA test/data/gff/test.gff3
-  gff3-fetch CDS test/data/gff/test.gff3
-
-  Find CDS from external FASTA file
-
-  gff3-fetch CDS test/data/gff/MhA1_Contig1133.fa test/data/gff/MhA1_Contig1133.gff3
-
-  Find mRNA from external FASTA file, without loading everything in RAM
-
-  gff3-fetch --no-cache mRNA test/data/gff/test-ext-fasta.fa test/data/gff/test-ext-fasta.gff3
-
-  If you use this software, please cite http://dx.doi.org/10.1093/bioinformatics/btq475
-
 == Copyright

 Copyright (C) 2010,2011 Pjotr Prins <pjotr.prins@thebird.nl>
+
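
The 0.8.0 text removed in this diff documents the command-line form as gff3-fetch [--no-cache] mRNA|CDS [filename.fa] filename.gff, with an external FASTA file placed directly before the GFF file it belongs to. As a rough illustration of that argument order only, the Ruby sketch below assembles such a command line. The helper name gff3_fetch_command and its keyword arguments are hypothetical and are not part of bio-gff3; it assumes the 0.8.0 syntax still holds, which gff3-fetch --help should confirm for the installed version.

```ruby
# Hypothetical helper, NOT part of bio-gff3: builds the argument list for the
# documented usage
#   gff3-fetch [--no-cache] mRNA|CDS [filename.fa] filename.gff
def gff3_fetch_command(type, gff, fasta: nil, cache: true)
  # The README shows both 'mRNA'/'CDS' and 'mrna'/'cds', so compare
  # case-insensitively and pass the value through unchanged.
  unless %w[mrna cds].include?(type.downcase)
    raise ArgumentError, "type must be mRNA or CDS, got #{type.inspect}"
  end

  cmd = ['gff3-fetch']
  cmd << '--no-cache' unless cache # lower memory use, but slower
  cmd << type
  cmd << fasta if fasta            # external FASTA goes right before its GFF file
  cmd << gff
  cmd
end

# Print the two forms that appear in the README examples.
puts gff3_fetch_command('CDS', 'test/data/gff/test.gff3').join(' ')
puts gff3_fetch_command('mRNA', 'test/data/gff/test-ext-fasta.gff3',
                        fasta: 'test/data/gff/test-ext-fasta.fa',
                        cache: false).join(' ')
```

The returned array can be handed to Kernel#system with a splat, for example system(*gff3_fetch_command('CDS', 'test/data/gff/test.gff3')), which avoids shell quoting issues with file names.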