BioInterchange's command-line tool biointerchange can be installed as follows:
gem install biointerchange
Examples:
biointerchange --input biointerchange.gvf --rdf rdf.biointerchange.gvf --batchsize 100 --file examples/estd176_Banerjee_et_al_2011.2012-11-29.NCBI36.gvf
biointerchange --input dbcls.catanns.json --rdf rdf.bh12.sio --file examples/pubannotation.10096561.json --annotate_name 'Peter Smith' --annotate_name_id 'peter.smith@example.com'
biointerchange --input uk.ac.man.pdfx --rdf rdf.bh12.sio --file examples/gb-2007-8-3-R40.xml --annotate_name 'Peter Smith' --annotate_name_id 'peter.smith@example.com'
biointerchange --input phylotastic.newick --rdf rdf.phylotastic.newick --file examples/tree2.new --annotate_date '1 June 2006'
Input formats:
biointerchange.gff3
: Generic Feature Format Version 3
biointerchange.gvf
: Genome Variation Format
dbcls.catanns.json
: PubAnnotation categorical annotations
phylotastic.newick
: Newick tree file format
uk.ac.man.pdfx
: PDFx
Output formats:
rdf.biointerchange.gff3
: RDFization of biointerchange.gff3
rdf.biointerchange.gvf
: RDFization of biointerchange.gvf
rdf.bh12.sio
: RDFization of dbcls.catanns.json or uk.ac.man.pdfx
rdf.phylotastic.newick
: RDFization of phylotastic.newick
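Each input format above is paired with one RDF format, so the pairing can be kept in a small lookup table. The following Python sketch assembles a biointerchange command line from such a table. The names `RDF_FORMAT_FOR` and `build_command` are illustrative and not part of BioInterchange; only the `--input`, `--rdf` and `--file` options from the examples are used, and the `--batchsize` and `--annotate_*` options are left out.

```python
import shlex

# Input format -> RDF format, as given in the format lists above.
RDF_FORMAT_FOR = {
    "biointerchange.gff3": "rdf.biointerchange.gff3",
    "biointerchange.gvf": "rdf.biointerchange.gvf",
    "dbcls.catanns.json": "rdf.bh12.sio",
    "uk.ac.man.pdfx": "rdf.bh12.sio",
    "phylotastic.newick": "rdf.phylotastic.newick",
}


def build_command(input_format, path):
    """Return the argument list for one biointerchange call (hypothetical helper)."""
    if input_format not in RDF_FORMAT_FOR:
        raise ValueError("unknown input format: " + input_format)
    return [
        "biointerchange",
        "--input", input_format,
        "--rdf", RDF_FORMAT_FOR[input_format],
        "--file", path,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # without the gem being installed.
    print(shlex.join(build_command("phylotastic.newick", "examples/tree2.new")))
```

The argument list can be handed to `subprocess.run` once the gem is installed.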
RDF data produced by BioInterchange can be loaded directly into a triple store. The following example loads and queries RDF data using Sesame; the commands are entered via Sesame's bin/console.sh:
> create memory.
Please specify values for the following variables:
Repository ID [memory]: testrepo
Repository title [Memory store]: Test Repository
Persist (true|false) [true]: false
Sync delay [0]:
Repository created
> open testrepo.
testrepo> load <yourdata.n3> .
testrepo> sparql select * where { ?s ?p ?o } .
To list all seqid entries from a GFF3/GVF-file conversion in the store, the following SPARQL query can be used:
testrepo> sparql select * where { ?s <http://www.biointerchange.org/gvf1o#GVF1_0004> ?o } .
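For a quick check without a triple store, the same selection can be approximated offline. The sketch below assumes the RDF output is serialized as N-Triples with one triple per line, which is an assumption and not stated here. `triples_with_predicate` is a hypothetical helper, the sample triples are made up for illustration, and the regular expression is a simple line filter rather than a full N-Triples parser.

```python
import re

# Predicate IRI taken from the SPARQL query above.
SEQID = "http://www.biointerchange.org/gvf1o#GVF1_0004"

# subject (IRI or blank node), predicate IRI, object, terminating dot
TRIPLE = re.compile(r"^\s*(<[^>]*>|_:\S+)\s+<([^>]*)>\s+(.+?)\s*\.\s*$")


def triples_with_predicate(lines, predicate):
    """Yield (subject, object) for every line whose predicate matches."""
    for line in lines:
        match = TRIPLE.match(line)
        if match and match.group(2) == predicate:
            yield match.group(1), match.group(3)


if __name__ == "__main__":
    # Made-up sample data; real output will use different IRIs and values.
    sample = [
        '<http://example.org/feature/1> <' + SEQID + '> "chr1" .',
        '<http://example.org/feature/1> <http://example.org/other> "x" .',
    ]
    for subject, obj in triples_with_predicate(sample, SEQID):
        print(subject, obj)
```

For anything beyond spot checks, a real RDF library or the triple store itself is the safer choice, since literals with escapes or unusual whitespace can defeat a line-based filter.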
Data consistency is verifiable for the output formats rdf.biointerchange.gff3 and rdf.biointerchange.gvf using the BioInterchange ontologies GFF3O and GVF1O. The following example shows how Jena's command-line tools and the HermiT reasoner can be used for consistency verification:
rdfcat <path-to-gff3o/gvf1o> <yourdata.n3> > merged.xml
java -d64 -Xmx4G -jar HermiT.jar -k -v merged.xml
Another approach is to load the data and its related GFF3O/GVF1O ontology into Protege, merge them, and then use the "Explain inconsistent ontology" menu item to inspect possible data inconsistencies.
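The Jena and HermiT steps shown before the Protege alternative could also be chained from a script. This is a minimal Python sketch, assuming `rdfcat` and `java` are on the PATH and `HermiT.jar` is in the working directory. The function names are hypothetical, and the flags are copied unchanged from the two commands.

```python
import subprocess


def consistency_commands(ontology, data, merged="merged.xml"):
    """Return the rdfcat and HermiT argument lists for one check."""
    merge = ["rdfcat", ontology, data]
    check = ["java", "-d64", "-Xmx4G", "-jar", "HermiT.jar", "-k", "-v", merged]
    return merge, check


def run_consistency_check(ontology, data, merged="merged.xml"):
    """Merge ontology and data with rdfcat, then run HermiT on the result."""
    merge, check = consistency_commands(ontology, data, merged)
    with open(merged, "wb") as out:
        # Equivalent of the shell redirection "> merged.xml".
        subprocess.run(merge, stdout=out, check=True)
    # Return HermiT's exit status; how HermiT reports an inconsistency
    # is not assumed here, so inspect its output as well.
    return subprocess.run(check).returncode
```

Keeping the command construction separate from execution makes it possible to log or inspect the exact commands before anything is run.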
The following list provides information on the origin of the example-data files in the examples directory:
bininda_emonds_mammals.new
: Newick-formatted Bininda-Emonds mammals tree (see "The delayed rise of present-day mammals"). Downloaded from https://github.com/bendmorris/rdf-treestore/blob/master/trees/bininda_emonds_mammals.new
BovineGenomeChrX.gff3.gz
: Gzipped GFF3 file describing a Bos taurus chromosome X. Downloaded from http://bovinegenome.org/?q=download_chromosome_gff3
chromosome_BF.gff
: GFF3 file of floating contigs from the Baylor Sequencing Centre. Downloaded from http://dictybase.org/Downloads
estd176_Banerjee_et_al_2011.2012-11-29.NCBI36.gvf
: GVF file of EBI's DGVa. Downloaded from ftp://ftp.ebi.ac.uk/pub/databases/dgva/estd176_Banerjee_et_al_2011/gvf/estd176_Banerjee_et_al_2011.2012-11-29.NCBI36.gvf
gb-2007-8-3-R40.xml
: Generated by PDFx from the open-access source PDF "Sense-antisense pairs in mammals: functional and evolutionary considerations"
Saccharomyces_cerevisiae_incl_consequences.gvf.gz
: Downloaded from ftp://ftp.ensembl.org/pub/release-71/variation/gvf/saccharomyces_cerevisiae/Saccharomyces_cerevisiae_incl_consequences.gvf.gz