Identify problems with predicted genes

This is a GSoC 2013 project.
Details about the project's progress during the Coding period can be found here.
We also have a blog.

Please note that some features
of this tool are still under development.
So, stay tuned!



Authors

Abstract

The goal of GeneValidator is to identify problems with gene predictions and to provide useful information based on their similarity to genes in public databases. The validation results provide evidence about how the sequences could be curated, and they can help improve existing gene prediction tools or test new approaches. The main target users of this tool are biologists who want to validate the data obtained in their own laboratories.

Current Validations

Requirements

Installation

  1. Get the source code
    $ git clone git@github.com:monicadragan/gene_prediction.git

  2. Build the gem (requires sudo)
    $ sudo rake

  3. Run GeneValidator
    $ genevalidator [validations] [skip_blast] [start] [tabular] [mafft] [raw_seq] FILE

Example that exercises all the validations:
    $ genevalidator -x data/all_validations_prot/all_validations_prot.xml data/all_validations_prot/all_validations_prot.fasta

Learn more:
    $ genevalidator -h

Outputs

Running GeneValidator on your dataset produces numerical results and plots. Some relevant files are generated in the same directory as the input file. The results are available in 3 formats:

  * console table output
  * validation results in YAML format (the YAML file has the same name as the input file, plus the YAML extension)
  * HTML output with plot visualization (the relevant files are generated in the 'html' directory, in the same directory as the input file)

Note: for the moment, check the HTML output with the Firefox browser only!
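
As a minimal sketch of where to look for these files, the Ruby snippet below derives the expected output locations from the input path. The helper name `expected_outputs` is hypothetical and not part of GeneValidator. The `.yaml` suffix is an assumption based on the description above, so adjust it if the generated file is named differently.

```ruby
# Hypothetical helper (not part of GeneValidator): derive where the
# output files are expected to be, given the input file path.
def expected_outputs(input_file)
  {
    # Assumption: the YAML extension is appended to the full input file name.
    yaml: "#{input_file}.yaml",
    # The 'html' directory sits next to the input file.
    html_dir: File.join(File.dirname(input_file), 'html')
  }
end

# Report which of the expected outputs are present on disk.
input = 'data/all_validations_prot/all_validations_prot.fasta'
expected_outputs(input).each do |kind, path|
  status = File.exist?(path) ? 'found' : 'missing'
  puts "#{kind}: #{path} (#{status})"
end
```

Run from the repository root after a GeneValidator run on the example data, this should list both locations and report whether each one exists.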

Have a look at our results!

Other things

  1. Run unit tests
    $ rake test

  2. Generate documentation
    $ rake doc