README.md in bio-gemma-wrapper-0.99.2 vs README.md in bio-gemma-wrapper-0.99.3
- old
+ new
@@ -6,15 +6,16 @@
Nat. Genet., 2016)](cfw.gif)
## Introduction
Gemma-wrapper allows running GEMMA with LOCO, GEMMA with caching,
-GEMMA in parallel (now the default), and GEMMA on PBS. Gemma-wrapper
-is used to run GEMMA as part of the https://genenetwork.org/
-environment.
+GEMMA in parallel (now the default with LOCO), and GEMMA on
+PBS. Gemma-wrapper is used to run GEMMA as part of the
+https://genenetwork.org/ environment.
-Note that gemma-wrapper is projected to be integrated into gemma2/lib.
+Note that a version of gemma-wrapper is projected to be integrated
+into gemma itself.
GEMMA is a software toolkit for fast application of linear mixed
models (LMMs) and related models to genome-wide association studies
(GWAS) and other large-scale data sets.
@@ -27,10 +28,18 @@
gemma-wrapper requires a recent version of GEMMA and essentially
does a pass-through of all standard GEMMA invocation switches. On
return gemma-wrapper can return a JSON object (--json) which is
useful for web-services.
+## Performance
+
+LOCO runs in parallel by default, which is at least a 5x performance
+improvement on a machine with enough cores. GEMMA without LOCO,
+however, does not run in parallel by default. Performance
+improvements with the parallel implementation for LOCO and non-LOCO
+can be viewed [here](./test/performance/releases.gmi).
+
## Installation
Prerequisites are
* A recent version of [GEMMA](https://github.com/genetics-statistics/GEMMA)
@@ -51,19 +60,23 @@
```
Usage: gemma-wrapper [options] -- [gemma-options]
--permutate n Permutate # times by shuffling phenotypes
--permute-phenotypes filen Phenotypes to be shuffled in permutations
- --loco [x,y,1,2,3...] Run full LOCO
+ --loco Run full leave-one-chromosome-out (LOCO)
+ --chromosomes [1,2,3] Run specific chromosomes
--input filen JSON input variables (used for LOCO)
--cache-dir path Use a cache directory
--json Create output file in JSON format
- --force Force computation
- --slurm [options] Submit to slurm PBS
+ --force Force computation (override cache)
+ --parallel Run jobs in parallel
+ --no-parallel Do not run jobs in parallel
+ --slurm[=opts] Use slurm PBS for submitting jobs
--q, --quiet Run quietly
-v, --verbose Run verbosely
- --debug Show debug messages and keep intermediate output
+ -d, --debug Show debug messages and keep intermediate output
+ --dry-run Show commands, but don't execute
-- Anything after gets passed to GEMMA
-h, --help display this help and exit
```
@@ -97,10 +110,11 @@
the data files are found):
gemma-wrapper -- \
-g test/data/input/BXD_geno.txt.gz \
-p test/data/input/BXD_pheno.txt \
+ -a test/data/input/BXD_snps.txt \
-gk \
-debug
Run it twice to see
@@ -114,10 +128,11 @@
You can also get JSON output on STDOUT by providing the --json switch
gemma-wrapper --json -- \
-g test/data/input/BXD_geno.txt.gz \
-p test/data/input/BXD_pheno.txt \
+ -a test/data/input/BXD_snps.txt \
-gk \
-debug > K.json
K.json is something that can be parsed with a calling program, and is
also used below as input for the GWA step. Example:
@@ -131,21 +146,22 @@
default. If you want something else provide a --cache-dir, e.g.
gemma-wrapper --cache-dir ~/.gemma-cache -- \
-g test/data/input/BXD_geno.txt.gz \
-p test/data/input/BXD_pheno.txt \
+ -a test/data/input/BXD_snps.txt \
-gk \
-debug
will store K in ~/.gemma-cache.
### GWA
Run the LMM using the K's captured earlier in K.json using the --input
switch
- gemma-wrapper --json --loco --input K.json -- \
+ gemma-wrapper --json --input K.json -- \
-g test/data/input/BXD_geno.txt.gz \
-p test/data/input/BXD_pheno.txt \
-c test/data/input/BXD_covariates2.txt \
-a test/data/input/BXD_snps.txt \
-lmm 2 -maf 0.1 \
@@ -161,10 +177,10 @@
using the -loco switch (for supported formats check
https://github.com/genetics-statistics/GEMMA/issues/46). To loop over all
chromosomes, first create all K's with
gemma-wrapper --json \
- --loco 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,X -- \
+ --loco -- \
-g test/data/input/BXD_geno.txt.gz \
-p test/data/input/BXD_pheno.txt \
-a test/data/input/BXD_snps.txt \
-gk \
-debug > K.json
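
Note on the last hunk: in 0.99.3 the chromosome list is no longer passed to `--loco`, which is now a bare switch. The following is a minimal sketch of how the new kinship invocation could be previewed before running it. It only assembles and prints the command line. The helper name `kinship_cmd` is hypothetical, and combining `--dry-run` with `--loco` is an assumption based on the help text above, not something the README demonstrates.

```shell
#!/bin/sh
# Sketch only: assemble the 0.99.3-style LOCO kinship command as a string.
# The input paths are the test data files used in the README examples.
GENO=test/data/input/BXD_geno.txt.gz
PHENO=test/data/input/BXD_pheno.txt
SNPS=test/data/input/BXD_snps.txt

# kinship_cmd is a hypothetical helper, not part of gemma-wrapper.
# $1 is an optional extra wrapper switch, e.g. --dry-run (assumed to
# combine with --loco as the help text suggests).
kinship_cmd() {
    echo "gemma-wrapper --json --loco $1 -- -g $GENO -p $PHENO -a $SNPS -gk -debug"
}

# Print the command instead of executing it, so it can be inspected first.
kinship_cmd --dry-run
```

To actually run the printed command, pipe it to a shell from the source tree and redirect the output as in the README, for example `kinship_cmd | sh > K.json`.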