* TODO
** Square the U/L term?? Max of 100?
** Rarefaction curves
*** bash + samtools + resampling (100 times?)
*** How many divisions??
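
A minimal sketch of how the resampling runs could be enumerated, assuming subsampling is done with `samtools view -s`, where the integer part of the argument is the random seed and the fractional part is the fraction of reads to keep. The function name, the output file naming, and the defaults (10 divisions, 100 replicates) are hypothetical placeholders for the open questions above, not settled choices. The sketch only builds the argument lists and does not run samtools.

```python
def subsample_commands(bam, divisions=10, replicates=100):
    """Build one samtools argument list per (fraction, replicate) pair.

    Fractions are step/divisions for step = 1 .. divisions-1; the full
    data set (fraction 1.0) is left out because it needs no subsampling.
    """
    commands = []
    for step in range(1, divisions):
        fraction = step / divisions
        # "0.3000" -> ".3000", so the seed can be prepended as the integer part
        frac_text = f"{fraction:.4f}"[1:]
        for seed in range(replicates):
            out = f"sub_{step}of{divisions}_rep{seed}.bam"  # hypothetical naming
            commands.append(
                ["samtools", "view", "-b", "-s", f"{seed}{frac_text}", "-o", out, bam]
            )
    return commands


if __name__ == "__main__":
    for cmd in subsample_commands("reads.bam", divisions=4, replicates=2):
        print(" ".join(cmd))
```

Each argument list can be passed to `subprocess.run`, or joined with spaces and written into a bash script.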
** Triangular numbers
*** Triangular number defined as
*** T(n) = n(n+1)/2
*** Maximum overlap determined by function of read length L
*** The read with most 'even' overlaps is directly in the middle
*** EXAMPLES:
**** Left most read (a)
***** T(L-1)
**** Next left-most read (b)
***** (L-1) + T(L-1) - 1
**** (c)
***** (L-2) + (L-1) + T(L-1) - 2 - 1
*** For read J of the L reads (aligned in the best-case scenario)
**** Sum overlaps with all other reads:
**** f(L) = 2*T(L-1) - T(J-1) - T(L - J)
**** f(L) = (L-1)(L-1+1) - (J-1)(J-1+1)/2 - (L-J)(L-J+1)/2
**** f(L) = L(L-1) - J*(J-1)/2 - (L-J)(L-J+1)/2
**** f(L) = L^2 - L + (-J*J + J)/2 - (L-J)(L-J+1)/2
**** 2f(L) = 2*(L^2) - 2L - J^2 + J - (L-J)(L-J+1)
**** 2f(L) = 2*(L^2) - 2L - J^2 + J - (L^2 - 2LJ + L + J^2 - J)
**** 2f(L) = 2*(L^2) - 2L - J^2 + J - L^2 + 2LJ - L - J^2 + J
**** 2f(L) = 2*(L^2) - L^2 - J^2 - J^2 + 2LJ - 2L - L + J + J
**** 2f(L) = L^2 - 2*(J^2) + 2LJ - 3L + 2J
**** f(L) = -J^2 + (L^2)/2 + LJ - 3L/2 + J
**** Check: L=76, J=1 gives the triangular number T(L-1) = T(75) = 2850
**** f(76) = 2850 at J=1
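
The derivation above can be checked numerically. The sketch below assumes the best-case formation is L reads of length L starting at consecutive positions 1..L, so that two reads whose starts differ by k share L - k bases. That reading of the notes is an assumption, and the function names are illustrative only. It compares a brute-force sum with the triangular-number form and with the expanded polynomial.

```python
def triangular(n):
    """T(n) = n(n+1)/2, with T(0) = 0."""
    return n * (n + 1) // 2


def summed_overlap_direct(L, J):
    """Brute force: add the overlap of read J with each other read K."""
    return sum(L - abs(J - K) for K in range(1, L + 1) if K != J)


def summed_overlap_triangular(L, J):
    """f = 2*T(L-1) - T(J-1) - T(L-J)."""
    return 2 * triangular(L - 1) - triangular(J - 1) - triangular(L - J)


def summed_overlap_expanded(L, J):
    """2f = L^2 - 2J^2 + 2LJ - 3L + 2J; the right-hand side is always even."""
    return (L * L - 2 * J * J + 2 * L * J - 3 * L + 2 * J) // 2


if __name__ == "__main__":
    L = 76
    for J in range(1, L + 1):
        direct = summed_overlap_direct(L, J)
        assert direct == summed_overlap_triangular(L, J)
        assert direct == summed_overlap_expanded(L, J)
    print(summed_overlap_expanded(L, 1))
```

Under this assumption the three forms agree for every J, and the sum is largest for the reads in the middle of the formation.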
* Notes
** U = u/L
** U/L is the number of unique reads at that base, length normalized
** U*O/L vs. 200*U*O/(L^2)
** the average summed overlap is O/(L)
** average because different reads in the formation have different summed overlaps
** the average average overlap is 2L/3 or O/L/(L-1) or O/(L^2-L)
** 
** D = d /
** the average summed dissimilarity is D/(L)
** average because different reads in the formation have different summed dissimilarities
** the average average dissimilarity is L/3 or D/L/(L-1) or D/(L^2-L)
** This matches the average similarity nicely...
** 
** u*d / (L*L*(L-1))
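
The averages in these notes can be sanity-checked with a short script. It is a sketch under two assumptions not stated in the notes: O and D are taken as totals over all ordered pairs of distinct reads in the L-read formation, and the dissimilarity of two reads is taken to be the offset between their start positions (the bases they do not share). The names `pair_sums` and `averages` are hypothetical.

```python
from fractions import Fraction


def pair_sums(L):
    """Total overlap O and total dissimilarity D over ordered pairs J != K."""
    total_overlap = 0
    total_dissimilarity = 0
    for J in range(1, L + 1):
        for K in range(1, L + 1):
            if J != K:
                offset = abs(J - K)
                total_overlap += L - offset      # shared bases
                total_dissimilarity += offset    # unshared bases (assumption)
    return total_overlap, total_dissimilarity


def averages(L):
    """Return (O/(L^2-L), D/(L^2-L)) as exact fractions."""
    O, D = pair_sums(L)
    pairs = L * (L - 1)
    return Fraction(O, pairs), Fraction(D, pairs)


if __name__ == "__main__":
    avg_overlap, avg_dissimilarity = averages(76)
    print(float(avg_overlap), float(avg_dissimilarity))
```

Under these assumptions the exact values are (2L-1)/3 and (L+1)/3, which are close to the 2L/3 and L/3 quoted above and add up to exactly L.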
* Bugs

Version Path
ngs-ci-0.0.2.b TODO.org