1
0
Fork 0
mirror of https://github.com/MillironX/nf-configs.git synced 2024-11-25 01:19:54 +00:00

Add MPI-EVA profile

This commit is contained in:
James Fellows Yates 2021-02-14 17:50:20 +01:00
parent f0ba4853df
commit 537f52a640
6 changed files with 344 additions and 14 deletions

View file

@ -6,20 +6,20 @@ A repository for hosting Nextflow configuration files containing custom paramete
## Table of contents <!-- omit in toc -->
* [Using an existing config](#using-an-existing-config)
* [Configuration and parameters](#configuration-and-parameters)
* [Offline usage](#offline-usage)
* [Adding a new config](#adding-a-new-config)
* [Checking user hostnames](#checking-user-hostnames)
* [Testing](#testing)
* [Documentation](#documentation)
* [Uploading to `nf-core/configs`](#uploading-to-nf-coreconfigs)
* [Adding a new pipeline-specific config](#adding-a-new-pipeline-specific-config)
* [Pipeline-specific institutional documentation](#pipeline-specific-institutional-documentation)
* [Pipeline-specific documentation](#pipeline-specific-documentation)
* [Enabling pipeline-specific configs within a pipeline](#enabling-pipeline-specific-configs-within-a-pipeline)
* [Create the pipeline-specific `nf-core/configs` files](#create-the-pipeline-specific-nf-coreconfigs-files)
* [Help](#help)
- [Using an existing config](#using-an-existing-config)
- [Configuration and parameters](#configuration-and-parameters)
- [Offline usage](#offline-usage)
- [Adding a new config](#adding-a-new-config)
- [Checking user hostnames](#checking-user-hostnames)
- [Testing](#testing)
- [Documentation](#documentation)
- [Uploading to `nf-core/configs`](#uploading-to-nf-coreconfigs)
- [Adding a new pipeline-specific config](#adding-a-new-pipeline-specific-config)
- [Pipeline-specific institutional documentation](#pipeline-specific-institutional-documentation)
- [Pipeline-specific documentation](#pipeline-specific-documentation)
- [Enabling pipeline-specific configs within a pipeline](#enabling-pipeline-specific-configs-within-a-pipeline)
- [Create the pipeline-specific `nf-core/configs` files](#create-the-pipeline-specific-nf-coreconfigs-files)
- [Help](#help)
## Using an existing config
@ -107,6 +107,7 @@ Currently documentation is available for the following systems:
* [CZBIOHUB_AWS](docs/czbiohub.md)
* [DENBI_QBIC](docs/denbi_qbic.md)
* [EBC](docs/ebc.md)
* [EVA](docs/eva.md)
* [GENOTOUL](docs/genotoul.md)
* [GENOUEST](docs/genouest.md)
* [GIS](docs/gis.md)
@ -174,6 +175,7 @@ Currently documentation is available for the following pipelines within specific
* [UPPMAX](docs/pipeline/ampliseq/uppmax.md)
* eager
* [SHH](docs/pipeline/eager/shh.md)
* [EVA](docs/pipeline/eager/eva.md)
* rnafusion
* [MUNIN](docs/pipeline/rnafusion/munin.md)
* sarek

52
conf/eva.config Normal file
View file

@ -0,0 +1,52 @@
//Profile config names for nf-core/configs
params {
config_profile_description = 'Generic MPI-EVA cluster(s) profile provided by nf-core/configs.'
config_profile_contact = 'James Fellows Yates (@jfy133)'
config_profile_url = 'https://eva.mpg.de'
}
// Preform work directory cleanup after a successful run
cleanup = true
singularity {
enabled = true
autoMounts = true
}
process {
executor = 'sge'
penv = 'smp'
queue = 'all.q'
}
executor {
queueSize = 8
}
profiles {
archgen {
params {
igenomes_base = "/projects1/public_data/igenomes/"
config_profile_description = 'MPI-EVA archgen profile, provided by nf-core/configs.'
max_memory = 256.GB
max_cpus = 32
max_time = 720.h
//Illumina iGenomes reference file path
igenomes_base = "/projects1/public_data/igenomes/"
}
process {
queue = 'archgen.q'
}
singularity {
cacheDir = "/mnt/archgen/users/singularity_scratch"
}
}
// Profile to deactivate automatic cleanup of work directory after a successful run. Overwrites cleanup option.
debug {
cleanup = false
}
}

View file

@ -0,0 +1,212 @@
// Profile config names for nf-core/configs
params {
// Specific nf-core/configs params
config_profile_contact = 'James Fellows Yates (@jfy133)'
config_profile_description = 'nf-core/eager EVA profile provided by nf-core/configs'
}
// Specific nf-core/eager process configuration
process {
maxRetries = 2
// Solution for clusterOptions comes from here: https://github.com/nextflow-io/nextflow/issues/332 + personal toMega conversion
clusterOptions = { "-S /bin/bash -j y -o output.log -l h_vmem=${task.memory.toMega().toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().toString().replaceAll(/[\sB]/,'')}M" }
withLabel:'sc_tiny'{
cpus = { check_max( 1, 'cpus' ) }
memory = { check_max( 1.GB * task.attempt, 'memory' ) }
time = { task.attempt == 3 ? 1440.h : task.attempt == 2 ? 48.h : 2.h }
}
withLabel:'sc_small'{
cpus = { check_max( 1, 'cpus' ) }
memory = { check_max( 4.GB * task.attempt, 'memory' ) }
time = { task.attempt == 3 ? 1440.h : task.attempt == 2 ? 48.h : 2.h }
}
withLabel:'sc_medium'{
cpus = { check_max( 1, 'cpus' ) }
memory = { check_max( 8.GB * task.attempt, 'memory' ) }
time = { task.attempt == 3 ? 1440.h : task.attempt == 2 ? 48.h : 2.h }
}
withLabel:'mc_small'{
cpus = { check_max( 2, 'cpus' ) }
memory = { check_max( 4.GB * task.attempt, 'memory' ) }
time = { task.attempt == 3 ? 1440.h : task.attempt == 2 ? 48.h : 2.h }
}
withLabel:'mc_medium' {
cpus = { check_max( 4, 'cpus' ) }
memory = { check_max( 8.GB * task.attempt, 'memory' ) }
time = { task.attempt == 3 ? 1440.h : task.attempt == 2 ? 48.h : 2.h }
}
withLabel:'mc_large'{
cpus = { check_max( 8, 'cpus' ) }
memory = { check_max( 16.GB * task.attempt, 'memory' ) }
time = { task.attempt == 3 ? 1440.h : task.attempt == 2 ? 48.h : 2.h }
}
withLabel:'mc_huge'{
cpus = { check_max( 32, 'cpus' ) }
memory = { check_max( 256.GB * task.attempt, 'memory' ) }
time = { task.attempt == 3 ? 1440.h : task.attempt == 2 ? 48.h : 2.h }
}
// Fixes for SGE and Java incompatibility due to Java using more memory than you tell it to use
withName: makeSeqDict {
clusterOptions = { "-S /bin/bash -l h_vmem=${task.memory.toMega().plus(4000).toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().plus(4000).toString().replaceAll(/[\sB]/,'')}M" }
}
withName: fastqc {
clusterOptions = { "-S /bin/bash -l h_vmem=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M" }
}
withName: adapter_removal {
clusterOptions = { "-S /bin/bash -l h_vmem=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M" }
}
withName: dedup {
clusterOptions = { "-S /bin/bash -l h_vmem=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M" }
}
withName: markduplicates {
clusterOptions = { "-S /bin/bash -l h_vmem=${task.memory.toMega().plus(5000).toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M" }
}
withName: malt {
clusterOptions = { "-S /bin/bash -l h_vmem=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M" }
}
withName: maltextract {
clusterOptions = { "-S /bin/bash -l h_vmem=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M" }
}
withName: multivcfanalyzer {
clusterOptions = { "-S /bin/bash -l h_vmem=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M" }
}
withName: mtnucratio {
clusterOptions = { "-S /bin/bash -l h_vmem=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M" }
}
withName: vcf2genome {
clusterOptions = { "-S /bin/bash -l h_vmem=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M" }
}
withName: qualimap {
clusterOptions = { "-S /bin/bash -l h_vmem=${task.memory.toMega().plus(5000).toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M" }
}
withName: damageprofiler {
clusterOptions = { "-S /bin/bash -l h_vmem=${task.memory.toMega().plus(5000).toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M" }
}
withName: circularmapper {
clusterOptions = { "-S /bin/bash -l h_vmem=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M" }
}
withName: circulargenerator {
clusterOptions = { "-S /bin/bash -l h_vmem=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M" }
}
withName: preseq {
clusterOptions = { "-S /bin/bash -l h_vmem=${task.memory.toMega().plus(4000).toString().replaceAll(/[\sB]/,'')}M,virtual_free=${task.memory.toMega().plus(1000).toString().replaceAll(/[\sB]/,'')}M" }
}
}
profiles {
big_data {
params {
// Specific nf-core/configs params
config_profile_contact = 'James Fellows Yates (@jfy133)'
config_profile_description = 'nf-core/eager big-data EVA profile provided by nf-core/configs'
}
executor {
queueSize = 6
}
process {
maxRetries = 2
withName:hostremoval_input_fastq {
cpus = { check_max( 1, 'cpus' ) }
memory = { check_max( 32.GB * task.attempt, 'memory' ) }
time = 1440.h
}
withLabel:'sc_tiny'{
cpus = { check_max( 1, 'cpus' ) }
memory = { check_max( 2.GB * task.attempt, 'memory' ) }
time = { task.attempt == 3 ? 1440.h : task.attempt == 2 ? 48.h : 2.h }
}
withLabel:'sc_small'{
cpus = { check_max( 1, 'cpus' ) }
memory = { check_max( 8.GB * task.attempt, 'memory' ) }
time = { task.attempt == 3 ? 1440.h : task.attempt == 2 ? 48.h : 2.h }
}
withLabel:'sc_medium'{
cpus = { check_max( 1, 'cpus' ) }
memory = { check_max( 16.GB * task.attempt, 'memory' ) }
time = { task.attempt == 3 ? 1440.h : task.attempt == 2 ? 48.h : 2.h }
}
withLabel:'mc_small'{
cpus = { check_max( 2, 'cpus' ) }
memory = { check_max( 8.GB * task.attempt, 'memory' ) }
time = { task.attempt == 3 ? 1440.h : task.attempt == 2 ? 48.h : 2.h }
}
withLabel:'mc_medium' {
cpus = { check_max( 4, 'cpus' ) }
memory = { check_max( 16.GB * task.attempt, 'memory' ) }
time = { task.attempt == 3 ? 1440.h : task.attempt == 2 ? 48.h : 2.h }
}
withLabel:'mc_large'{
cpus = { check_max( 8, 'cpus' ) }
memory = { check_max( 32.GB * task.attempt, 'memory' ) }
time = { task.attempt == 3 ? 1440.h : task.attempt == 2 ? 48.h : 2.h }
}
withLabel:'mc_huge'{
cpus = { check_max( 32, 'cpus' ) }
memory = { check_max( 512.GB * task.attempt, 'memory' ) }
time = { task.attempt == 3 ? 1440.h : task.attempt == 2 ? 48.h : 2.h }
}
}
}
pathogen_loose {
params {
config_profile_description = 'Pathogen (loose) MPI-EVA profile, provided by nf-core/configs.'
bwaalnn = 0.01
bwaalnl = 16
}
}
pathogen_strict {
params {
config_profile_description = 'Pathogen (strict) MPI-EVA SDAG profile, provided by nf-core/configs.'
bwaalnn = 0.1
bwaalnl = 32
}
}
human {
params {
config_profile_description = 'Human MPI-EVA SDAG profile, provided by nf-core/configs.'
bwaalnn = 0.01
bwaalnl = 16500
}
}
}

29
docs/eva.md Normal file
View file

@ -0,0 +1,29 @@
# nf-core/configs: EVA Configuration
All nf-core pipelines have been successfully configured for use on the Department of Genetics and Archaeogenetic's clusters at the [Max Planck Institute for Evolutionary Anthropology (MPI-EVA)](http://eva.mpg.de).
To use, run the pipeline with `-profile eva`. You can further with optimise submissions by specifying which cluster queue you are using e,g, `-profile eva,archgen`. This will download and launch the [`eva.config`](../conf/eva.config) which has been pre-configured with a setup suitable for the `all.q` queue. The number of parallel jobs that run is currently limited to 8.
Using this profile, a docker image containing all of the required software will be downloaded, and converted to a `singularity` image before execution of the pipeline. The image will currently be centrally stored here:
## Additional Profiles
We currently also offer profiles for the different department's specific nodes.
### archgen
If you specify `-profile eva,archgen` you will be able to use the nodes available on the `archgen.q` queue.
Note the following characteristics of this profile:
- By default, job resources are assigned a maximum number of CPUs of 32, 256 GB maximum memory and 720.h maximum wall time.
- Using this profile will currently store singularity images in a cache under `/mnt/archgen/users/singularity_scratch/cache/`. All archgen users currently have read/write access to this directory, however this will likely change to a read-only directory in the future that will be managed by the IT team.
- Intermediate files will be _automatically_ cleaned up (see `debug` below if you don't want this to happen) on successful run completion.
>NB: You will need an account and VPN access to use the cluster at MPI-EVA in order to run the pipeline. If in doubt contact the IT team.
>NB: Nextflow will need to submit the jobs via SGE to the clusters and as such the commands above will have to be executed on one of the head nodes. If in doubt contact IT.
### debug
This simple profile just turns off automatic clean up of intermediate files. This can be useful for debugging. Specify e.g. with `-profile eva,archgen`

View file

@ -0,0 +1,34 @@
# nf-core/configs: eva eager specific configuration
Extra specific configuration for eager pipeline
## Usage
To use, run the pipeline with `-profile eva`.
This will download and launch the eager specific [`eva.config`](../../../conf/pipeline/eager/eva.config) which has been pre-configured with a setup suitable for the MPI-EVA cluster.
Example: `nextflow run nf-core/eager -profile eva`
## eager specific configurations for eva
Specific configurations for eva has been made for eager.
### General profiles
- The general MPI-EVA profile runs with default nf-core/eager parameters, but with modifications to account for issues SGE have with Java tools.
#### big_data
- This defines larger base computing resources for when working with very deep sequenced or high-endogenous samples.
### Contextual profiles
#### Human Pop-Gen
* `human`: optimised for mapping of human aDNA reads (i.e. bwa aln defaults as `-l 16500, -n 0.01`)
#### Pathogen
* `pathogen_loose`: optimised for mapping of human aDNA reads (i.e. bwa aln defaults as `-l 16 -n 0.01`)
* `pathogen_strict`: optimised for mapping of human aDNA reads (i.e. bwa aln defaults as `-l 32, -n 0.1`)

View file

@ -23,6 +23,7 @@ profiles {
crick { includeConfig "${params.custom_config_base}/conf/crick.config" }
czbiohub_aws { includeConfig "${params.custom_config_base}/conf/czbiohub_aws.config" }
ebc { includeConfig "${params.custom_config_base}/conf/ebc.config" }
eva { includeConfig "${params.custom_config_base}/conf/eva.config" }
icr_davros { includeConfig "${params.custom_config_base}/conf/icr_davros.config" }
imperial { includeConfig "${params.custom_config_base}/conf/imperial.config" }
imperial_mb { includeConfig "${params.custom_config_base}/conf/imperial_mb.config" }