nf-core/configs: Sage Bionetworks Global Configuration

To use this custom configuration, run the pipeline with -profile sage. This will download and load sage.config, which contains a number of optimizations relevant to Sage employees running workflows on AWS (e.g. using Nextflow Tower). This profile will also load any applicable pipeline-specific configuration.

This global configuration includes the following tweaks:

- Update the default value for igenomes_base to s3://sage-igenomes
- Enable retries by default when exit codes relate to insufficient memory
- Allow pending jobs to finish if the number of retries is exhausted
- Increase the amount of time allowed for file transfers
- Increase the default chunk size for multipart uploads to S3
- Slow down the job submission rate to avoid overwhelming any APIs
- Define the check_max() function, which is missing in Sarek v2
- Slow the increase in the number of allocated CPU cores on retries
- Increase the default time limits because we run pipelines on AWS
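
As a rough illustration, several of the tweaks above might be expressed in Nextflow configuration syntax along the following lines. This is a sketch under stated assumptions, not a copy of sage.config: the exit codes, retry count, chunk size, submission rate, and wait time are placeholder values chosen for the example, and the check_max(), CPU, and time-limit tweaks are left out.

```groovy
// Illustrative sketch only: all values below are assumptions,
// not the settings actually shipped in sage.config.

params {
    // Point iGenomes lookups at the us-east-1 mirror
    igenomes_base = 's3://sage-igenomes'
}

process {
    // Retry while attempts remain and the exit code suggests the task was
    // killed (137 and 143 are an assumed list); in every other case use
    // 'finish' so that already-submitted jobs can complete before the run stops
    errorStrategy = { (task.exitStatus in [137, 143] && task.attempt <= 3) ? 'retry' : 'finish' }
    maxRetries    = 3   // assumed value, kept in step with the attempt check above
}

aws {
    client {
        // Larger chunk size for multipart uploads to S3, in bytes (assumed: 200 MiB)
        uploadChunkSize = 209715200
    }
}

executor {
    // Throttle how quickly jobs are submitted (assumed rate)
    submitRateLimit = '5 / 1 sec'
}

// Give pending file transfers more time to complete (assumed duration)
threadPool.FileTransfer.maxAwait = '24 hour'
```

Settings like these are applied automatically when the pipeline is launched with -profile sage, so they do not need to be copied into a local configuration file.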

Additional information about iGenomes

The following iGenomes prefixes have been copied from s3://ngi-igenomes/ (eu-west-1) to s3://sage-igenomes (us-east-1). See this script for more information. The sage-igenomes S3 bucket has been configured to be openly available, but to avoid egress charges, files cannot be downloaded outside of us-east-1. You can check the conf/igenomes.config file in each nf-core pipeline to figure out the mapping between genome IDs (i.e. for --genome) and iGenomes prefixes (example).

Human Genome Builds
- Homo_sapiens/Ensembl/GRCh37
- Homo_sapiens/GATK/GRCh37
- Homo_sapiens/UCSC/hg19
- Homo_sapiens/GATK/GRCh38
- Homo_sapiens/NCBI/GRCh38
- Homo_sapiens/UCSC/hg38
Mouse Genome Builds
- Mus_musculus/Ensembl/GRCm38
- Mus_musculus/UCSC/mm10
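
To make the mapping between genome IDs and iGenomes prefixes more concrete, here is an abridged sketch in the style of an nf-core conf/igenomes.config. The exact keys and paths vary by pipeline and version, so treat the entries below as an illustration and check the file shipped with your pipeline for the authoritative values.

```groovy
// Abridged sketch in the style of an nf-core conf/igenomes.config.
// Only the fasta attribute is shown; real entries define many more.
params {
    genomes {
        'GRCh38' {
            // Resolves under the Homo_sapiens/NCBI/GRCh38 prefix
            fasta = "${params.igenomes_base}/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa"
        }
        'mm10' {
            // Resolves under the Mus_musculus/UCSC/mm10 prefix
            fasta = "${params.igenomes_base}/Mus_musculus/UCSC/mm10/Sequence/WholeGenomeFasta/genome.fa"
        }
    }
}
```

With entries like these, passing --genome GRCh38 would resolve reference files under the Homo_sapiens/NCBI/GRCh38 prefix of whichever igenomes_base is in effect. A genome ID only works with this profile if its prefix is among those listed above.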
