README.rdoc in bio-gff3-0.8.0 vs README.rdoc in bio-gff3-0.8.2
- old
+ new
@@ -1,69 +1,60 @@
= bio-gff3
-GFF3 plugin for BioRuby, aimed at parsing big data
+GFF3 parser, aimed at parsing large GFF3 files to return sequences of any type,
+including assembled mRNA, protein and CDS sequences.
Features:
-# Take GFF (genome browser) information and digest mRNA and CDS sequences
+# Take GFF3 (genome browser) information of any type, and assemble sequences, e.g. mRNA and CDS
# Options for low memory use and caching of records
-# Support for external FASTA files
+# Support for external FASTA input files
+# Use of multiple cores (not yet implemented)
+Currently the output is a FASTA file.
+
You can use this plugin in two ways: as a standalone program, or as a
plugin library for BioRuby.
-For example, fetch mRNA and CDS information from GFF3 files and output to FASTA:
+== Install and run gff3-fetch
- ./bin/gff3-fetch mrna test/data/gff/test.gff3
- ./bin/gff3-fetch cds test/data/gff/test.gff3
+After installing Ruby 1.9 or later, you can install the gem with RubyGems:
-Or clone this repository and add the 'lib' dir to the Ruby search path and
+ gem install bio-gff3
- require 'bio/db/gff/gffdb'
+Then, fetch mRNA and CDS information from GFF3 files and output to FASTA:
-You can also run RSpec with something like
+ gff3-fetch mrna test/data/gff/test.gff3
+ gff3-fetch cds test/data/gff/test.gff3
- rspec -I ../bioruby/lib/ spec/*.rb
+== Development
-This implementation depends on BioRuby's basic GFF3 parser, with the possible
-advantage that the plugin is faster and does not consume all memory. The Gff3
-specs are based on the output of the Wormbase genome browser.
+To use the library
-For a write-up see http://thebird.nl/bioruby/BioRuby_GFF3.html
+ require 'bio-gff3'
--------------------------------------------------------------------------------
+For coding examples see ./bin/gff3-fetch and the specs in ./spec/*.rb
+You can run RSpecs with something like
- Fetch and assemble mRNAs, or CDS and print in FASTA format.
+ rspec -I ../bioruby/lib/ spec/*.rb
- gff3-fetch [--no-cache] mRNA|CDS [filename.fa] filename.gff
+(supposing you are referring to a BioRuby source repository)
- Where:
+This implementation depends on BioRuby's basic GFF3 parser, with the possible
+advantage that the plugin can assemble sequences, is faster and does not
+consume all memory. The Gff3 specs are based on the output of the Wormbase
+genome browser.
- --no-cache : do not load everything in memory (slower)
- mRNA : assemble mRNA
- CDS : assemble CDS
+== See also
- Multiple GFF3 files can be used. For external FASTA files, always the last
- one before the GFF file is used.
+ gff3-fetch --help
- Examples:
+For a write-up see http://thebird.nl/bioruby/BioRuby_GFF3.html
- Find mRNA and CDS information from test.gff3 (which includes sequence information)
+-------------------------------------------------------------------------------
- gff3-fetch mRNA test/data/gff/test.gff3
- gff3-fetch CDS test/data/gff/test.gff3
-
- Find CDS from external FASTA file
-
- gff3-fetch CDS test/data/gff/MhA1_Contig1133.fa test/data/gff/MhA1_Contig1133.gff3
-
- Find mRNA from external FASTA file, without loading everything in RAM
-
- gff3-fetch --no-cache mRNA test/data/gff/test-ext-fasta.fa test/data/gff/test-ext-fasta.gff3
-
- If you use this software, please cite http://dx.doi.org/10.1093/bioinformatics/btq475
-
== Copyright
Copyright (C) 2010,2011 Pjotr Prins <pjotr.prins@thebird.nl>
+