Class Bio::DB::Sam
In: lib/bio/db/sam.rb
Parent: Object

Methods

Attributes

sam_file  [R] 

Public Class methods

Destructor method that closes the file before letting the object be garbage collected.

Merges n BAM files. This does not require creating a SAM object.

files: An array with the paths to the files.
merged_file: The path to the merged file.
headers: The BAM file containing the header.
add_RG: If true, the RG tag is added (inferred from the filenames).
by_qname: If true, the BAM files should be ordered by query name; if false, by coordinates.

Creates a new sam object. Initialize expects a hash optsa with the following elements (defaults in parentheses):

fasta: The fasta file with the reference. (nil)
bam: Path to a binary SAM file. (nil)
tam: Path to a text SAM file. (nil)
compressed: If the binary file is compressed. (true)
write: If the file is to be written. (false) Not supported yet.

*NOTE:* You can't use binary and text formats simultaneously. To make queries, the file has to be a sorted binary file. This function doesn't actually open the file; it just prepares the object to be opened at a later stage.
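The option handling described above can be sketched as follows. This is only an illustration: `normalize_sam_opts` and `DEFAULT_SAM_OPTS` are hypothetical names, not part of Bio::DB::Sam, and the real constructor may validate its arguments differently.

```ruby
# Hypothetical helper (not part of Bio::DB::Sam): merges the caller's options
# over the documented defaults and rejects a simultaneous :bam and :tam.
DEFAULT_SAM_OPTS = {
  :fasta => nil, :bam => nil, :tam => nil, :compressed => true, :write => false
}.freeze

def normalize_sam_opts(optsa = {})
  unknown = optsa.keys - DEFAULT_SAM_OPTS.keys
  raise ArgumentError, "unknown options: #{unknown.inspect}" unless unknown.empty?

  opts = DEFAULT_SAM_OPTS.merge(optsa)
  # Binary and text formats cannot be used at the same time.
  if opts[:bam] && opts[:tam]
    raise ArgumentError, "pass either :bam or :tam, not both"
  end
  opts
end

# Example file names are placeholders.
opts = normalize_sam_opts(:bam => "reads.sorted.bam", :fasta => "ref.fasta")
```

Checking the options up front keeps the error close to the call site, since the file itself is only opened at a later stage.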

Public Instance methods

Returns the average coverage of a region in a bam file.

Returns an array with the coverage at each position in the queried region. This is a simple average coverage, calculated only from the first and last position of the alignment, ignoring the gaps.
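The idea of counting coverage from only the first and last position of each alignment can be illustrated with a small stand-alone sketch. It assumes alignments are given as `[start, end]` pairs with 1-based inclusive coordinates; `region_coverage` and `mean_coverage` are hypothetical names, and the library's own calculation may differ in detail.

```ruby
# Hypothetical sketch: every position between an alignment's first and last
# base counts as covered, so gaps inside the alignment are ignored.
def region_coverage(alignments, region_start, region_end)
  coverage = Array.new(region_end - region_start + 1, 0)
  alignments.each do |aln_start, aln_end|
    # Clip the alignment to the queried region.
    first = [aln_start, region_start].max
    last  = [aln_end, region_end].min
    (first..last).each { |pos| coverage[pos - region_start] += 1 }
  end
  coverage
end

# Mean depth over the per-position array.
def mean_coverage(coverage)
  return 0.0 if coverage.empty?
  coverage.inject(0) { |sum, depth| sum + depth } / coverage.size.to_f
end

cov = region_coverage([[1, 4], [3, 6]], 1, 6)  # => [1, 1, 2, 2, 1, 1]
avg = mean_coverage(cov)
```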

Closes the sam file and destroys the C pointers using the functions provided by libbam.

Utility method that does not use the samtools API. It calls samtools directly, as if on the command line, and captures the output. To use this method you must have a version of samtools that supports the pileup command (< 0.1.17); otherwise the command will fail. mpileup is the preferred method for getting pileups. With this method the sam object should be created as usual, and you pass the method a string of options for samtools. You don't need to provide the call to samtools pileup itself, -f <fasta file>, or the bam file; these are taken from the sam object.

Returns an array of Alignments on a given region.

Returns the sequence for a given region.

Executes a function on each Alignment inside the queried region of the chromosome. The chromosome can be either the textual name or a Fixnum with the internal index. However, you need to get the internal index with the provided API; otherwise the pointer is outside the scope of the C library. Returns the count of alignments in the region. WARNING: Accepts only an index already parsed by the library. It fails when you use your own Fixnum (FFI bug?).

Loads the bam index to be used for fetching. If the index doesn't exist, it is built, provided that the user has write access to the folder where the BAM file is located. If the creation of the file fails, a SAMException is thrown. If the index doesn't exist, loading it will take more time. It is suggested to generate the index separately if the bam file sits on a server where the executing user may not have write permissions.

Loads the reference file to be able to query regions of it. This requires the fai index to exist in the same folder as the reference. If it doesn't exist, this function attempts to generate it. If the user doesn't have write permissions on the folder, or the creation of the fai fails for any reason, a SAMException is thrown.

Calls the mpileup function. opts is a hash of options identical to the command line options for mpileup. This is an iterator that yields a Pileup object for each position. The command line options that generate or affect BCF/VCF are ignored, i.e. (g,u,e,h,I,L,o,p). Pass each option as a symbol of the flag, e.g. -r for region is passed as :r => "some SAM compatible region". For example, bam.mpileup(:r => "chr1:1000-2000", :q => 50) gets the bases with quality > 50 on chr1 between 1000-2000.
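To make the mapping from the options hash to command line flags concrete, here is a minimal sketch. `mpileup_flags` and `IGNORED_MPILEUP_FLAGS` are hypothetical names, and treating a `true` value as a flag without an argument is an assumption made for this illustration, not confirmed behaviour of the library.

```ruby
# Flags listed in the documentation as ignored because they affect BCF/VCF.
IGNORED_MPILEUP_FLAGS = [:g, :u, :e, :h, :I, :L, :o, :p].freeze

# Hypothetical helper: turns {:r => "chr1:1000-2000", :q => 50} into a flag
# string, dropping the ignored flags. A value of true is assumed to mean a
# flag that takes no argument.
def mpileup_flags(opts)
  kept = opts.reject { |flag, _value| IGNORED_MPILEUP_FLAGS.include?(flag) }
  kept.map { |flag, value| value == true ? "-#{flag}" : "-#{flag} #{value}" }.join(" ")
end

flags = mpileup_flags(:r => "chr1:1000-2000", :q => 50, :u => true)
# flags is "-r chr1:1000-2000 -q 50"; :u is dropped
```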

Function that actually opens the sam file. Throws a SAMException if the file can't be opened.

Generates a query string to be used by the region parser in samtools. In principle, you shouldn't need to use this function.
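A region string of the kind the samtools region parser accepts has the form `chromosome:start-end`, as in the "chr1:1000-2000" example used for mpileup. A minimal sketch, using the hypothetical helper name `region_query` (the library's own method and argument order may differ):

```ruby
# Hypothetical helper: builds a "chromosome:start-end" region string.
def region_query(chromosome, start_pos, end_pos)
  raise ArgumentError, "start must not exceed end" if start_pos > end_pos
  "#{chromosome}:#{start_pos}-#{end_pos}"
end

query = region_query("chr1", 1000, 2000)  # => "chr1:1000-2000"
```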

Prints a description of the sam file in text format, stating whether it is binary or text, the path, and the fasta file of the reference.
