mirror of https://github.com/MillironX/nf-configs.git synced 2024-11-10 20:13:09 +00:00

Merge branch 'nf-core:master' into master

This commit is contained in:
James A. Fellows Yates 2022-09-14 15:01:34 +02:00 committed by GitHub
commit 8ea01e2041
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
42 changed files with 1184 additions and 54 deletions

View file

@ -8,5 +8,5 @@ trim_trailing_whitespace = true
indent_size = 4
indent_style = space
[*.{md,yml,yaml}]
[*.{md,yml,yaml,cff}]
indent_size = 2

View file

@ -30,6 +30,7 @@ jobs:
matrix:
profile:
- "abims"
- "adcra"
- "alice"
- "aws_tower"
- "awsbatch"
@ -49,6 +50,7 @@ jobs:
- "cheaha"
- "computerome"
- "crick"
- "crukmi"
- "denbi_qbic"
- "ebc"
- "eddie"
@ -66,6 +68,7 @@ jobs:
- "jax"
- "lugh"
- "marvin"
- "medair"
- "mjolnir_globe"
- "maestro"
- "mpcdf"
@ -77,8 +80,10 @@ jobs:
- "phoenix"
- "prince"
- "rosalind"
- "sage"
- "sahmri"
- "sanger"
- "sbc_sharc"
- "seg_globe"
- "uct_hpc"
- "unibe_ibu"

56
CITATION.cff Normal file
View file

@ -0,0 +1,56 @@
cff-version: 1.2.0
message: "If you use `nf-core tools` in your work, please cite the `nf-core` publication"
authors:
- family-names: Ewels
given-names: Philip
- family-names: Peltzer
given-names: Alexander
- family-names: Fillinger
given-names: Sven
- family-names: Patel
given-names: Harshil
- family-names: Alneberg
given-names: Johannes
- family-names: Wilm
given-names: Andreas
- family-names: Ulysse Garcia
given-names: Maxime
- family-names: Di Tommaso
given-names: Paolo
- family-names: Nahnsen
given-names: Sven
title: "The nf-core framework for community-curated bioinformatics pipelines."
version: 2.4.1
doi: 10.1038/s41587-020-0439-x
date-released: 2022-05-16
url: https://github.com/nf-core/tools
preferred-citation:
type: article
authors:
- family-names: Ewels
given-names: Philip
- family-names: Peltzer
given-names: Alexander
- family-names: Fillinger
given-names: Sven
- family-names: Patel
given-names: Harshil
- family-names: Alneberg
given-names: Johannes
- family-names: Wilm
given-names: Andreas
- family-names: Ulysse Garcia
given-names: Maxime
- family-names: Di Tommaso
given-names: Paolo
- family-names: Nahnsen
given-names: Sven
doi: 10.1038/s41587-020-0439-x
journal: Nature Biotechnology
start: 276
end: 278
title: "The nf-core framework for community-curated bioinformatics pipelines."
issue: 3
volume: 38
year: 2020
url: https://dx.doi.org/10.1038/s41587-020-0439-x

View file

@ -10,7 +10,6 @@ A repository for hosting Nextflow configuration files containing custom paramete
- [Configuration and parameters](#configuration-and-parameters)
- [Offline usage](#offline-usage)
- [Adding a new config](#adding-a-new-config)
- [Checking user hostnames](#checking-user-hostnames)
- [Testing](#testing)
- [Documentation](#documentation)
- [Uploading to `nf-core/configs`](#uploading-to-nf-coreconfigs)
@ -68,6 +67,8 @@ Before adding your config file to nf-core/configs, we highly recommend writing a
N.B. In your config file, please also make sure to add an extra `params` section with `params.config_profile_description`, `params.config_profile_contact` and `params.config_profile_url` set to reasonable values.
Users will then see who wrote the configuration profile when executing an nf-core pipeline and can report back if, for example, something is missing.
N.B. If you reference a shell environment variable in your profile, testing may fail with an error such as `Unknown config attribute env.USER_SCRATCH -- check config file: /home/runner/work/configs/configs/nextflow.config` (where the bash environment variable is `$USER_SCRATCH`). This happens because the GitHub runner does not have your institutional environment variables set. To fix this, define an internal variable that reads the environment variable, and set a fallback value for it. A good example is in the [VSC_UGENT profile](https://github.com/nf-core/configs/blob/69468e7ca769643b151a6cfd1ab24185fc341c06/conf/vsc_ugent.config#L2).
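A minimal sketch of this fallback pattern, assuming a profile that depends on `USER_SCRATCH`. The `/tmp` default and the settings that consume the variable are illustrative choices, not requirements of nf-core/configs:

```groovy
// Read the variable through the JVM instead of as a config attribute, so the
// config still parses when the variable is unset (e.g. on the GitHub runner).
// "/tmp" is an assumed fallback; pick whatever suits your cluster.
def scratch_dir = System.getenv("USER_SCRATCH") ?: "/tmp"

// Use the internal variable wherever the environment variable was needed.
workDir = "$scratch_dir/work"

env {
    TMPDIR = "$scratch_dir"
}
```

On the cluster the value of `USER_SCRATCH` is used. Anywhere the variable is missing, the fallback applies and the config still loads.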
### Testing
If you want to add a new custom config file to `nf-core/configs` please test that your pipeline of choice runs as expected by using the [`-c`](https://www.nextflow.io/docs/latest/config.html) parameter.
@ -87,6 +88,7 @@ See [`nf-core/configs/docs`](https://github.com/nf-core/configs/tree/master/docs
Currently documentation is available for the following systems:
- [ABIMS](docs/abims.md)
- [ADCRA](docs/adcra.md)
- [ALICE](docs/alice.md)
- [AWSBATCH](docs/awsbatch.md)
- [AWS_TOWER](docs/aws_tower.md)
@ -104,6 +106,7 @@ Currently documentation is available for the following systems:
- [CHEAHA](docs/cheaha.md)
- [Computerome](docs/computerome.md)
- [CRICK](docs/crick.md)
- [Cancer Research UK Manchester Institute](docs/crukmi.md)
- [CZBIOHUB_AWS](docs/czbiohub.md)
- [DENBI_QBIC](docs/denbi_qbic.md)
- [EBC](docs/ebc.md)
@ -121,6 +124,7 @@ Currently documentation is available for the following systems:
- [LUGH](docs/lugh.md)
- [MAESTRO](docs/maestro.md)
- [MARVIN](docs/marvin.md)
- [MEDAIR](docs/medair.md)
- [MJOLNIR_GLOBE](docs/mjolnir_globe.md)
- [MPCDF](docs/mpcdf.md)
- [MUNIN](docs/munin.md)
@ -131,7 +135,9 @@ Currently documentation is available for the following systems:
- [PHOENIX](docs/phoenix.md)
- [PRINCE](docs/prince.md)
- [ROSALIND](docs/rosalind.md)
- [SAGE BIONETWORKS](docs/sage.md)
- [SANGER](docs/sanger.md)
- [SBC_SHARC](docs/sbc_sharc.md)
- [SEG_GLOBE](docs/seg_globe.md)
- [UCT_HPC](docs/uct_hpc.md)
- [UNIBE_IBU](docs/unibe_ibu.md)
@ -192,16 +198,25 @@ Currently documentation is available for the following pipelines within specific
- ampliseq
- [BINAC](docs/pipeline/ampliseq/binac.md)
- [UPPMAX](docs/pipeline/ampliseq/uppmax.md)
- atacseq
- [SBC_SHARC](docs/pipeline/atacseq/sbc_sharc.md)
- chipseq
- [SBC_SHARC](docs/pipeline/chipseq/sbc_sharc.md)
- eager
- [EVA](docs/pipeline/eager/eva.md)
- mag
- [EVA](docs/pipeline/mag/eva.md)
- rnafusion
- [HASTA](docs/pipeline/rnafusion/hasta.md)
- [MUNIN](docs/pipeline/rnafusion/munin.md)
- rnaseq
- [SBC_SHARC](docs/pipeline/rnaseq/sbc_sharc.md)
- rnavar
- [MUNIN](docs/pipeline/rnavar/munin.md)
- sarek
- [Cancer Research UK Manchester Institute](docs/pipeline/sarek/crukmi.md)
- [MUNIN](docs/pipeline/sarek/munin.md)
- [SBC_SHARC](docs/pipeline/sarek/sbc_sharc.md)
- [UPPMAX](docs/pipeline/sarek/uppmax.md)
- taxprofiler
- [EVA](docs/pipeline/taxprofiler/eva.md)

40
conf/adcra.config Normal file
View file

@ -0,0 +1,40 @@
/*
* --------------------------------------------------------------
* nf-core pipelines config file for AD project using CRA HPC
* --------------------------------------------------------------
*/
params {
config_profile_name = 'adcra'
config_profile_description = 'CRA HPC profile provided by nf-core/configs'
config_profile_contact = 'Kalayanee Chairat (@kalayaneech)'
config_profile_url = 'https://bioinformatics.kmutt.ac.th/'
}
params {
max_cpus = 16
max_memory = 128.GB
max_time = 120.h
}
// Specify the job scheduler
executor {
name = 'slurm'
queueSize = 20
submitRateLimit = '6/1min'
}
singularity {
enabled = true
autoMounts = true
}
process {
scratch = true
queue = 'unlimit'
queueStatInterval = '10 min'
maxRetries = 3
errorStrategy = { task.attempt <=3 ? 'retry' : 'finish' }
cache = 'lenient'
exitStatusReadTimeoutMillis = '2700000'
}

View file

@ -1,3 +1,6 @@
// Define the Scratch directory
def scratch_dir = System.getenv("USER_SCRATCH") ?: "/tmp"
params {
config_profile_name = 'cheaha'
config_profile_description = 'University of Alabama at Birmingham Cheaha HPC'
@ -5,9 +8,15 @@ params {
config_profile_url = 'https://www.uab.edu/cores/ircp/bds'
}
env {
TMPDIR="$scratch_dir"
SINGULARITY_TMPDIR="$scratch_dir"
}
singularity {
enabled = true
autoMounts = true
runOptions = "--contain --workdir $scratch_dir"
}
process {

52
conf/crukmi.config Normal file
View file

@ -0,0 +1,52 @@
//Profile config names for nf-core/configs
params {
config_profile_description = 'Cancer Research UK Manchester Institute HPC cluster profile provided by nf-core/configs'
config_profile_contact = 'Stephen Kitcatt, Simon Pearce (@skitcattCRUKMI, @sppearce)'
config_profile_url = 'http://scicom.picr.man.ac.uk/projects/user-support/wiki'
}
env {
SINGULARITY_CACHEDIR = '/lmod/nextflow_software'
}
singularity {
enabled = true
autoMounts = true
}
process {
beforeScript = 'module load apps/singularity/3.8.0'
executor = 'pbs'
errorStrategy = {task.exitStatus in [143,137,104,134,139,140] ? 'retry' : 'finish'}
maxErrors = '-1'
maxRetries = 3
withLabel:process_low {
cpus = { check_max( 1 * task.attempt, 'cpus' ) }
memory = { check_max( 5.GB * task.attempt, 'memory' ) }
}
withLabel:process_medium {
cpus = { check_max( 4 * task.attempt, 'cpus' ) }
memory = { check_max( 20.GB * task.attempt, 'memory' ) }
}
withLabel:process_high {
cpus = { check_max( 16 * task.attempt, 'cpus' ) }
memory = { check_max( 80.GB * task.attempt, 'memory' ) }
}
}
executor {
name = 'pbs'
queueSize = 1000
pollInterval = '10 sec'
}
params {
max_memory = 2000.GB
max_cpus = 32
max_time = 72.h
}

View file

@ -19,4 +19,6 @@ google.lifeSciences.preemptible = params.google_preemptible
if (google.lifeSciences.preemptible) {
process.errorStrategy = { task.exitStatus in [8,10,14] ? 'retry' : 'terminate' }
process.maxRetries = 5
}
}
process.machineType = { task.memory > task.cpus * 6.GB ? ['custom', task.cpus, task.cpus * 6656].join('-') : null }

View file

@ -16,19 +16,29 @@ singularity {
params {
max_memory = 180.GB
max_cpus = 36
max_time = 336.h
max_time = 336.h
}
process {
executor = 'slurm'
clusterOptions = { "-A $params.priority ${params.clusterOptions ?: ''}" }
clusterOptions = { "-A $params.priority ${params.clusterOptions ?: ''}" }
}
profiles {
dev_prio {
stub_prio {
params {
priority = 'development'
clusterOptions = "--qos=low"
max_memory = 6.GB
max_cpus = 1
max_time = 1.h
}
}
dev_prio {
params {
priority = 'development'
clusterOptions = "--qos=low"
}
}

46
conf/medair.config Normal file
View file

@ -0,0 +1,46 @@
//Profile config names for nf-core/configs
params {
config_profile_description = 'Cluster profile for medair (local cluster of Clinical Genomics Gothenburg)'
config_profile_contact = 'Clinical Genomics, Gothenburg (cgg-rd@gu.se, cgg-it@gu.se)'
config_profile_url = 'https://www.scilifelab.se/units/clinical-genomics-goteborg/'
}
//Nextflow parameters
singularity {
enabled = true
cacheDir = "/apps/bio/dependencies/nf-core/singularities"
}
profiles {
wgs {
process {
queue = 'wgs.q'
executor = 'sge'
penv = 'mpi'
process.clusterOptions = '-l excl=1'
params.max_cpus = 40
params.max_time = 48.h
params.max_memory = 128.GB
}
}
production {
process {
queue = 'production.q'
executor = 'sge'
penv = 'mpi'
process.clusterOptions = '-l excl=1'
params.max_cpus = 40
params.max_time = 480.h
params.max_memory = 128.GB
}
}
}
//Specific parameter for pipelines that can use Sentieon (e.g. nf-core/sarek, nf-core/raredisease)
process {
withLabel:'sentieon' {
container = "/apps/bio/singularities/sentieon-211204-peta.simg"
}
}

View file

@ -61,7 +61,7 @@ profiles {
params {
config_profile_description = 'MPCDF raven profile (unofficially) provided by nf-core/configs.'
memory = 2000000.MB
max_memory = 2000000.MB
max_cpus = 72
max_time = 24.h
}

View file

@ -0,0 +1,74 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sheffield Bioinformatics Core Configuration Profile - ShARC
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Custom Pipeline Resource Config for nf-core/atacseq
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
// process-specific resource requirements - reduced specification from those in atacseq/conf/base.config
process {
withLabel:process_low {
cpus = { check_max( 2 * task.attempt, 'cpus' ) }
memory = { check_max( 4.GB * task.attempt, 'memory' ) }
time = { check_max( 4.h * task.attempt, 'time' ) }
}
withLabel:process_medium {
cpus = { check_max( 4 * task.attempt, 'cpus' ) }
memory = { check_max( 8.GB * task.attempt, 'memory' ) }
time = { check_max( 6.h * task.attempt, 'time' ) }
}
withLabel:process_high {
cpus = { check_max( 8 * task.attempt, 'cpus' ) }
memory = { check_max( 16.GB * task.attempt, 'memory' ) }
time = { check_max( 8.h * task.attempt, 'time' ) }
}
withLabel:process_long {
time = { check_max( 12.h * task.attempt, 'time' ) }
}
}
// function 'check_max()' to ensure that resource requirements don't go beyond maximum limit
def check_max(obj, type) {
if (type == 'memory') {
try {
if (obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1)
return params.max_memory as nextflow.util.MemoryUnit
else
return obj
} catch (all) {
println " ### ERROR ### Max memory '${params.max_memory}' is not valid! Using default value: $obj"
return obj
}
} else if (type == 'time') {
try {
if (obj.compareTo(params.max_time as nextflow.util.Duration) == 1)
return params.max_time as nextflow.util.Duration
else
return obj
} catch (all) {
println " ### ERROR ### Max time '${params.max_time}' is not valid! Using default value: $obj"
return obj
}
} else if (type == 'cpus') {
try {
return Math.min(obj, params.max_cpus as int)
} catch (all) {
println " ### ERROR ### Max cpus '${params.max_cpus}' is not valid! Using default value: $obj"
return obj
}
}
}

View file

@ -0,0 +1,74 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sheffield Bioinformatics Core Configuration Profile - ShARC
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Custom Pipeline Resource Config for nf-core/chipseq
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
// process-specific resource requirements - reduced specification from those in chipseq/conf/base.config
process {
withLabel:process_low {
cpus = { check_max( 2 * task.attempt, 'cpus' ) }
memory = { check_max( 4.GB * task.attempt, 'memory' ) }
time = { check_max( 4.h * task.attempt, 'time' ) }
}
withLabel:process_medium {
cpus = { check_max( 4 * task.attempt, 'cpus' ) }
memory = { check_max( 8.GB * task.attempt, 'memory' ) }
time = { check_max( 6.h * task.attempt, 'time' ) }
}
withLabel:process_high {
cpus = { check_max( 8 * task.attempt, 'cpus' ) }
memory = { check_max( 16.GB * task.attempt, 'memory' ) }
time = { check_max( 8.h * task.attempt, 'time' ) }
}
withLabel:process_long {
time = { check_max( 12.h * task.attempt, 'time' ) }
}
}
// function 'check_max()' to ensure that resource requirements don't go beyond maximum limit
def check_max(obj, type) {
if (type == 'memory') {
try {
if (obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1)
return params.max_memory as nextflow.util.MemoryUnit
else
return obj
} catch (all) {
println " ### ERROR ### Max memory '${params.max_memory}' is not valid! Using default value: $obj"
return obj
}
} else if (type == 'time') {
try {
if (obj.compareTo(params.max_time as nextflow.util.Duration) == 1)
return params.max_time as nextflow.util.Duration
else
return obj
} catch (all) {
println " ### ERROR ### Max time '${params.max_time}' is not valid! Using default value: $obj"
return obj
}
} else if (type == 'cpus') {
try {
return Math.min(obj, params.max_cpus as int)
} catch (all) {
println " ### ERROR ### Max cpus '${params.max_cpus}' is not valid! Using default value: $obj"
return obj
}
}
}

View file

@ -1,5 +1,5 @@
process {
withName:'PICARD_MARKDUPLICATES' {
memory = { check_max( 90.GB * task.attempt, 'memory' ) }
}
@ -7,7 +7,26 @@ process {
cpus = { check_max( 16 * task.attempt, 'cpus' ) }
memory = { check_max( 80.GB * task.attempt, 'memory' ) }
}
withName:'QUALIMAP_BAMQC' {
ext.args = { "--java-mem-size=${task.memory.giga / 1.15 as long}G" }
withLabel:'sentieon' {
beforeScript = { "export PATH=\$PATH:\$SENTIEON_INSTALL_DIR/sentieon-genomics-202112.02/bin" }
}
}
withName: 'BCFTOOLS_VIEW' {
if (params.genome == 'GRCh37') {
ext.args = '--output-type z --apply-filters PASS --exclude "INFO/clinical_genomics_mipAF > 0.40 | INFO/swegenAF > 0.40 | INFO/clingen_ngiAF > 0.40 | INFO/gnomad_svAF > 0.40 "'
} else if (params.genome == 'GRCh38') {
ext.args = '--output-type z --apply-filters PASS --exclude "INFO/swegen_FRQ > 0.40"'
}
publishDir = [
enabled: false,
]
}
// Java memory fixes
withName:'QUALIMAP_BAMQC' {
clusterOptions = { "-A $params.priority ${params.clusterOptions ?: ''} ${task.memory ? "--mem ${task.memory.mega * 1.15 as long}M" : ''}" }
}
withName:'PICARD_MARKDUPLICATES' {
clusterOptions = { "-A $params.priority ${params.clusterOptions ?: ''} ${task.memory ? "--mem ${task.memory.mega * 1.15 as long}M" : ''}" }
}
}

View file

@ -0,0 +1,7 @@
// rnafusion/hasta specific profile config for Clinical Genomics Stockholm usage
params {
all = true
trim = true
fusioninspector_filter = true
}

View file

@ -0,0 +1,79 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sheffield Bioinformatics Core Configuration Profile - ShARC
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Custom Pipeline Resource Config for nf-core/rnaseq
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
// process-specific resource requirements - reduced specification from those in rnaseq/conf/base.config
process {
withLabel:process_low {
cpus = { check_max( 2 * task.attempt, 'cpus' ) }
memory = { check_max( 4.GB * task.attempt, 'memory' ) }
time = { check_max( 4.h * task.attempt, 'time' ) }
}
withLabel:process_medium {
cpus = { check_max( 4 * task.attempt, 'cpus' ) }
memory = { check_max( 8.GB * task.attempt, 'memory' ) }
time = { check_max( 6.h * task.attempt, 'time' ) }
}
withLabel:process_high {
cpus = { check_max( 8 * task.attempt, 'cpus' ) }
memory = { check_max( 16.GB * task.attempt, 'memory' ) }
time = { check_max( 8.h * task.attempt, 'time' ) }
}
withLabel:process_long {
time = { check_max( 12.h * task.attempt, 'time' ) }
}
withLabel:process_high_memory {
memory = { check_max( 60.GB * task.attempt, 'memory' ) }
}
}
// function 'check_max()' to ensure that resource requirements don't go beyond maximum limit
def check_max(obj, type) {
if (type == 'memory') {
try {
if (obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1)
return params.max_memory as nextflow.util.MemoryUnit
else
return obj
} catch (all) {
println " ### ERROR ### Max memory '${params.max_memory}' is not valid! Using default value: $obj"
return obj
}
} else if (type == 'time') {
try {
if (obj.compareTo(params.max_time as nextflow.util.Duration) == 1)
return params.max_time as nextflow.util.Duration
else
return obj
} catch (all) {
println " ### ERROR ### Max time '${params.max_time}' is not valid! Using default value: $obj"
return obj
}
} else if (type == 'cpus') {
try {
return Math.min(obj, params.max_cpus as int)
} catch (all) {
println " ### ERROR ### Max cpus '${params.max_cpus}' is not valid! Using default value: $obj"
return obj
}
}
}

View file

@ -0,0 +1,18 @@
// Profile config names for nf-core/configs
params {
// Specific nf-core/configs params
config_profile_description = 'Cancer Research UK Manchester Institute HPC cluster profile provided by nf-core/configs'
config_profile_contact = 'Stephen Kitcatt, Simon Pearce (@skitcattCRUKMI, @sppearce)'
config_profile_url = 'http://scicom.picr.man.ac.uk/projects/user-support/wiki'
}
// Specific nf-core/sarek process configuration
process {
withName: 'SAMTOOLS_MPILEUP' {
cpus = 1
memory = { 5.GB * task.attempt }
}
}

View file

@ -0,0 +1,114 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sheffield Bioinformatics Core Configuration Profile - ShARC
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Custom Pipeline Resource Config for nf-core/sarek
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
// process-specific resource requirements - reduced specification from those in sarek/conf/base.config
process {
// process labels
withLabel:process_low {
cpus = { check_max( 2 * task.attempt, 'cpus' ) }
memory = { check_max( 4.GB * task.attempt, 'memory' ) }
time = { check_max( 4.h * task.attempt, 'time' ) }
}
withLabel:process_medium {
cpus = { check_max( 4 * task.attempt, 'cpus' ) }
memory = { check_max( 8.GB * task.attempt, 'memory' ) }
time = { check_max( 6.h * task.attempt, 'time' ) }
}
withLabel:process_high {
cpus = { check_max( 8 * task.attempt, 'cpus' ) }
memory = { check_max( 16.GB * task.attempt, 'memory' ) }
time = { check_max( 8.h * task.attempt, 'time' ) }
}
withLabel:process_long {
time = { check_max( 12.h * task.attempt, 'time' ) }
}
withLabel:process_high_memory {
memory = { check_max( 60.GB * task.attempt, 'memory' ) }
}
// process name
withName:'BWAMEM1_MEM|BWAMEM2_MEM' {
cpus = { check_max( 12 * task.attempt, 'cpus' ) }
memory = { check_max( 16.GB * task.attempt, 'memory' ) }
time = { check_max( 8.h * task.attempt, 'time' ) }
}
withName:'FASTP' {
cpus = { check_max( 12 * task.attempt, 'cpus' ) }
}
withName:'FASTQC|FASTP|MOSDEPTH|SAMTOOLS_CONVERT' {
memory = { check_max( 4.GB * task.attempt, 'memory' ) }
}
withName:'GATK4_APPLYBQSR|GATK4_APPLYBQSR_SPARK|GATK4_BASERECALIBRATOR|SAMTOOLS_STATS' {
cpus = { check_max( 4 * task.attempt, 'cpus' ) }
}
withName:'GATK4_APPLYBQSR|GATK4_APPLYBQSR_SPARK|GATK4_BASERECALIBRATOR|GATK4_GATHERBQSRREPORTS' {
memory = { check_max( 16.GB * task.attempt, 'memory' ) }
}
withName:'GATK4_MARKDUPLICATES' {
memory = { check_max( 16.GB * task.attempt, 'memory' ) }
}
withName:'FREEBAYES|SAMTOOLS_STATS|SAMTOOLS_INDEX|UNZIP' {
cpus = { check_max( 1 * task.attempt, 'cpus' ) }
}
}
// function 'check_max()' to ensure that resource requirements don't go beyond maximum limit
def check_max(obj, type) {
if (type == 'memory') {
try {
if (obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1)
return params.max_memory as nextflow.util.MemoryUnit
else
return obj
} catch (all) {
println " ### ERROR ### Max memory '${params.max_memory}' is not valid! Using default value: $obj"
return obj
}
} else if (type == 'time') {
try {
if (obj.compareTo(params.max_time as nextflow.util.Duration) == 1)
return params.max_time as nextflow.util.Duration
else
return obj
} catch (all) {
println " ### ERROR ### Max time '${params.max_time}' is not valid! Using default value: $obj"
return obj
}
} else if (type == 'cpus') {
try {
return Math.min(obj, params.max_cpus as int)
} catch (all) {
println " ### ERROR ### Max cpus '${params.max_cpus}' is not valid! Using default value: $obj"
return obj
}
}
}

View file

@ -8,23 +8,27 @@
params {
// Genome reference file paths
genomes {
// SARS-CoV-2
'NC_045512.2' {
// This version of the reference has been kept here for backwards compatibility.
// Please use 'MN908947.3' if possible because all primer sets are available / have been pre-prepared relative to that assembly
fasta = 'https://github.com/nf-core/test-datasets/raw/viralrecon/genome/NC_045512.2/GCF_009858895.2_ASM985889v3_genomic.200409.fna.gz'
gff = 'https://github.com/nf-core/test-datasets/raw/viralrecon/genome/NC_045512.2/GCF_009858895.2_ASM985889v3_genomic.200409.gff.gz'
nextclade_dataset = 'https://github.com/nf-core/test-datasets/raw/viralrecon/genome/MN908947.3/nextclade_sars-cov-2_MN908947_2022-01-18T12_00_00Z.tar.gz'
nextclade_dataset = 'https://github.com/nf-core/test-datasets/raw/viralrecon/genome/MN908947.3/nextclade_sars-cov-2_MN908947_2022-06-14T12_00_00Z.tar.gz'
nextclade_dataset_name = 'sars-cov-2'
nextclade_dataset_reference = 'MN908947'
nextclade_dataset_tag = '2022-01-18T12:00:00Z'
nextclade_dataset_tag = '2022-06-14T12:00:00Z'
}
// SARS-CoV-2
'MN908947.3' {
fasta = 'https://github.com/nf-core/test-datasets/raw/viralrecon/genome/MN908947.3/GCA_009858895.3_ASM985889v3_genomic.200409.fna.gz'
gff = 'https://github.com/nf-core/test-datasets/raw/viralrecon/genome/MN908947.3/GCA_009858895.3_ASM985889v3_genomic.200409.gff.gz'
nextclade_dataset = 'https://github.com/nf-core/test-datasets/raw/viralrecon/genome/MN908947.3/nextclade_sars-cov-2_MN908947_2022-01-18T12_00_00Z.tar.gz'
nextclade_dataset = 'https://github.com/nf-core/test-datasets/raw/viralrecon/genome/MN908947.3/nextclade_sars-cov-2_MN908947_2022-06-14T12_00_00Z.tar.gz'
nextclade_dataset_name = 'sars-cov-2'
nextclade_dataset_reference = 'MN908947'
nextclade_dataset_tag = '2022-01-18T12:00:00Z'
nextclade_dataset_tag = '2022-06-14T12:00:00Z'
primer_sets {
artic {
'1' {
@ -66,5 +70,28 @@ params {
}
}
}
// Monkeypox
'NC_063383.1' {
fasta = 'https://github.com/nf-core/test-datasets/raw/viralrecon/genome/NC_063383.1/GCF_014621545.1_ASM1462154v1_genomic.220824.fna.gz'
gff = 'https://github.com/nf-core/test-datasets/raw/viralrecon/genome/NC_063383.1/GCF_014621545.1_ASM1462154v1_genomic.220824.gff.gz'
nextclade_dataset = 'https://github.com/nf-core/test-datasets/raw/viralrecon/genome/NC_063383.1/nextclade_hMPXV_NC_063383.1_2022-08-19T12_00_00Z.tar.gz'
nextclade_dataset_name = 'hMPXV'
nextclade_dataset_reference = 'NC_063383.1'
nextclade_dataset_tag = '2022-08-19T12:00:00Z'
}
// Monkeypox
'ON563414.3' {
fasta = 'https://github.com/nf-core/test-datasets/raw/viralrecon/genome/ON563414.3/GCA_023516015.3_ASM2351601v1_genomic.220824.fna.gz'
gff = 'https://github.com/nf-core/test-datasets/raw/viralrecon/genome/ON563414.3/GCA_023516015.3_ASM2351601v1_genomic.220824.gff.gz'
}
// Monkeypox
'MT903344.1' {
fasta = 'https://github.com/nf-core/test-datasets/raw/viralrecon/genome/MT903344.1/GCA_014621585.1_ASM1462158v1_genomic.220824.fna.gz'
gff = 'https://github.com/nf-core/test-datasets/raw/viralrecon/genome/MT903344.1/GCA_014621585.1_ASM1462158v1_genomic.220824.gff.gz'
}
}
}

117
conf/sage.config Normal file
View file

@ -0,0 +1,117 @@
// Config profile metadata
params {
config_profile_description = 'The Sage Bionetworks profile'
config_profile_contact = 'Bruno Grande (@BrunoGrandePhD)'
config_profile_url = 'https://github.com/Sage-Bionetworks-Workflows'
}
// Leverage us-east-1 mirror of select human and mouse genomes
params {
igenomes_base = 's3://sage-igenomes/igenomes'
max_memory = '128.GB'
max_cpus = 16
max_time = '240.h'
}
// Enable retries globally for certain exit codes
process {
errorStrategy = { task.exitStatus in [143,137,104,134,139,247] ? 'retry' : 'finish' }
maxRetries = 5
maxErrors = '-1'
}
// Increase time limit to allow file transfers to finish
// The default is 12 hours, which results in timeouts
threadPool.FileTransfer.maxAwait = '24 hour'
// Configure Nextflow to be more reliable on AWS
aws {
region = "us-east-1"
client {
uploadChunkSize = 209715200
}
batch {
maxParallelTransfers = 1
}
}
executor {
name = 'awsbatch'
// Allow a large queue (up to 500 jobs) on AWS Batch
queueSize = 500
// Slow down the rate at which AWS Batch jobs accumulate in
// the queue (an attempt to prevent orphaned EBS volumes)
submitRateLimit = '5 / 1 sec'
}
// Adjust default resource allocations (see `../docs/sage.md`)
process {
cpus = { check_max( 1 * slow(task.attempt), 'cpus' ) }
memory = { check_max( 6.GB * task.attempt, 'memory' ) }
time = { check_max( 24.h * task.attempt, 'time' ) }
// Process-specific resource requirements
withLabel:process_low {
cpus = { check_max( 4 * slow(task.attempt), 'cpus' ) }
memory = { check_max( 12.GB * task.attempt, 'memory' ) }
time = { check_max( 24.h * task.attempt, 'time' ) }
}
withLabel:process_medium {
cpus = { check_max( 12 * slow(task.attempt), 'cpus' ) }
memory = { check_max( 36.GB * task.attempt, 'memory' ) }
time = { check_max( 48.h * task.attempt, 'time' ) }
}
withLabel:process_high {
cpus = { check_max( 24 * slow(task.attempt), 'cpus' ) }
memory = { check_max( 72.GB * task.attempt, 'memory' ) }
time = { check_max( 96.h * task.attempt, 'time' ) }
}
withLabel:process_long {
time = { check_max( 192.h * task.attempt, 'time' ) }
}
withLabel:process_high_memory {
memory = { check_max( 128.GB * task.attempt, 'memory' ) }
}
}
// Function to slow the increase of the resource multiplier
// as attempts are made. The rationale is that the number
// of CPU cores isn't a limiting factor as often as memory.
def slow(attempt, factor = 2) {
return Math.ceil( attempt / factor) as int
}
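// Worked example (illustrative, derived from the definitions in this file):
// with the default factor of 2, slow() returns 1, 1, 2, 2 for attempts 1-4,
// so CPU requests grow on every second retry while memory grows on every retry.
// For 'process_medium', assuming this profile's max_cpus = 16 and
// max_memory = 128.GB are not overridden:
//   attempt 1 -> 12 cpus, 36 GB
//   attempt 2 -> 12 cpus, 72 GB
//   attempt 3 -> 24 cpus requested, capped to 16 by check_max(); 108 GB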
// Function to ensure that resource requirements don't go
// beyond a maximum limit (copied here for Sarek v2)
def check_max(obj, type) {
if (type == 'memory') {
try {
if (obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1)
return params.max_memory as nextflow.util.MemoryUnit
else
return obj
} catch (all) {
println " ### ERROR ### Max memory '${params.max_memory}' is not valid! Using default value: $obj"
return obj
}
} else if (type == 'time') {
try {
if (obj.compareTo(params.max_time as nextflow.util.Duration) == 1)
return params.max_time as nextflow.util.Duration
else
return obj
} catch (all) {
println " ### ERROR ### Max time '${params.max_time}' is not valid! Using default value: $obj"
return obj
}
} else if (type == 'cpus') {
try {
return Math.min( obj, params.max_cpus as int )
} catch (all) {
println " ### ERROR ### Max cpus '${params.max_cpus}' is not valid! Using default value: $obj"
return obj
}
}
}

View file

@ -1,35 +1,33 @@
// Profile details
params {
config_profile_description = 'The Wellcome Sanger Institute HPC cluster profile'
config_profile_contact = 'Anthony Underwood (@aunderwo)'
config_profile_url = 'https://www.sanger.ac.uk/group/informatics-support-group/'
}
singularity {
enabled = true
cacheDir = "${baseDir}/singularity"
runOptions = '--bind /lustre --bind /nfs/pathnfs01 --bind /nfs/pathnfs02 --bind /nfs/pathnfs03 --bind /nfs/pathnfs04 --bind /nfs/pathnfs05 --bind /nfs/pathnfs06 --no-home'
config_profile_description = 'The Wellcome Sanger Institute HPC cluster (farm5) profile'
config_profile_contact = 'Priyanka Surana (@priyanka-surana)'
config_profile_url = 'https://www.sanger.ac.uk'
}
// Queue and retry strategy
process{
executor = 'lsf'
queue = 'normal'
errorStrategy = { task.attempt <= 5 ? "retry" : "finish" }
process.maxRetries = 5
withLabel:process_long {
queue = 'long'
}
executor = 'lsf'
queue = { task.time < 12.h ? 'normal' : task.time < 48.h ? 'long' : 'basement' }
errorStrategy = 'retry'
maxRetries = 5
}
// Executor details
executor{
name = 'lsf'
perJobMemLimit = true
poolSize = 4
submitRateLimit = '5 sec'
killBatchSize = 50
name = 'lsf'
perJobMemLimit = true
poolSize = 4
submitRateLimit = '5 sec'
killBatchSize = 50
}
// Max resources
params {
max_memory = 128.GB
max_cpus = 64
max_time = 48.h
max_memory = 683.GB
max_cpus = 256
max_time = 720.h
}
// For singularity
singularity.runOptions = '--bind /lustre --bind /nfs'

57
conf/sbc_sharc.config Normal file
View file

@ -0,0 +1,57 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sheffield Bioinformatics Core Configuration Profile - ShARC
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Base Institutional Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
params {
// nf-core specific parameters displayed in header summary of each run
config_profile_description = 'Sheffield Bioinformatics Core - ShARC'
config_profile_contact = 'Lewis Quayle (l.quayle@sheffield.ac.uk)'
config_profile_url = 'https://docs.hpc.shef.ac.uk/en/latest/sharc/index.html'
// hpc resource limits
max_cpus = 16
max_memory = 64.GB
max_time = 96.h
}
// container engine
singularity {
enabled = true
autoMounts = true
}
// hpc configuration specific to ShARC
process {
// scheduler
executor = 'sge'
penv = 'smp'
queue = { task.time <= 6.h ? 'shortint.q' : 'all.q' }
clusterOptions = { "-l rmem=${task.memory.toGiga()}G" }
// error and retry handling
errorStrategy = { task.exitStatus in [143,137,104,134,139,140] ? 'retry' : 'finish' }
maxRetries = 2
}

View file

@ -7,9 +7,9 @@ workDir = "$scratch_dir/work"
// Perform work directory cleanup when the run has successfully completed
// cleanup = true
// Reduce the job submit rate to about 10 per second, this way the server won't be bombarded with jobs
// Reduce the job submit rate to about 5 per second, this way the server won't be bombarded with jobs
executor {
submitRateLimit = '10 sec'
submitRateLimit = '3 sec'
}
// Specify that singularity should be used and where the cache dir will be for the images

39
docs/adcra.md Normal file

@ -0,0 +1,39 @@
# nf-core/configs: CRA HPC Configuration
The nf-core pipelines sarek and rnaseq have been tested on the CRA HPC.
## Before running the pipeline
- You will need an account to use the CRA HPC cluster in order to run the pipeline.
- Make sure that Singularity and Nextflow are installed.
- Download the pipeline Singularity images to the HPC system using [nf-core tools](https://nf-co.re/tools/#downloading-pipelines-for-offline-use)
```
$ conda install nf-core
$ nf-core download
```
- You will need to specify a Singularity cache directory in your `~/.bashrc`. Container images are then stored in this cache directory instead of being downloaded again every time you run a pipeline. Since space in the home directory is limited, using the Lustre file system is recommended.
```
export NXF_SINGULARITY_CACHEDIR="/lustre/fs0/storage/yourCRAAccount/cache_dir"
```
- Download the iGenomes reference to use as a local copy.
```
$ aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/GATK/GRCh38/ /lustre/fs0/storage/yourCRAAccount/references/Homo_sapiens/GATK/GRCh38/
```
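The cache-directory step above can be made repeatable with a small helper. This is only a sketch and not part of the CRA documentation: `add_export_once` is a hypothetical function name, and the demo writes to a temporary file instead of your real `~/.bashrc`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the CRA docs): append an export line to a
# shell start-up file only if that exact line is not already present.
add_export_once() {
    local rc_file="$1" name="$2" value="$3"
    local line="export ${name}=\"${value}\""
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    grep -qxF "$line" "$rc_file" 2>/dev/null || printf '%s\n' "$line" >> "$rc_file"
}

# Demo against a temporary file; point rc_file at ~/.bashrc for real use.
rc_file="$(mktemp)"
cache_dir="/lustre/fs0/storage/yourCRAAccount/cache_dir"
add_export_once "$rc_file" NXF_SINGULARITY_CACHEDIR "$cache_dir"
add_export_once "$rc_file" NXF_SINGULARITY_CACHEDIR "$cache_dir"  # no-op the second time

# Load the setting into the current shell and show it.
. "$rc_file"
echo "$NXF_SINGULARITY_CACHEDIR"
```

Note that shell assignments take no spaces around `=`; with spaces, `export` treats each token as a separate argument and the variable is not set.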
## Running the pipeline using the adcra config profile
- Run the pipeline within a [screen](https://linuxize.com/post/how-to-use-linux-screen/) or [tmux](https://linuxize.com/post/getting-started-with-tmux/) session.
- Specify the config profile with `-profile adcra`.
- Using lustre file systems to store results (`--outdir`) and intermediate files (`-work-dir`) is recommended.
```
nextflow run /path/to/nf-core/<pipeline-name> -profile adcra \
--genome GRCh38 \
--igenomes_base /path/to/genome_references/ \
... # the rest of pipeline flags
```


@ -13,6 +13,8 @@ module load Singularity
module load Nextflow
```
Various tasks will be run inside of Singularity containers and all temp files typically written to `/tmp` and `/var/tmp` are instead written to the path pointed to by the `USER_SCRATCH` environment variable. This means that these temp files are stored in a user specific location, making them inaccessible to other users for pipeline reruns. Some of these temp files can be large and cleanup is also the responsibility of the user.
All of the intermediate files required to run the pipeline will be stored in the `work/` directory. It is recommended to delete this directory after the pipeline has finished successfully because it can get quite large, and all of the main output files will be saved in the `results/` directory anyway.
> NB: You will need an account to use the HPC cluster on Cheaha in order to run the pipeline. If in doubt contact UAB IT Research Computing.</br></br>

15
docs/crukmi.md Normal file

@ -0,0 +1,15 @@
# nf-core/configs: Cancer Research UK Manchester Institute Configuration
All nf-core pipelines have been successfully configured for use on the HPC (phoenix) at Cancer Research UK Manchester Institute.
To use, run the pipeline with `-profile crukmi`. This will download and launch the [`crukmi.config`](../conf/crukmi.config) which has been pre-configured with a setup suitable for the phoenix HPC. Using this profile, singularity images will be downloaded to run on the cluster.
Before running the pipeline you will need to load Nextflow using the environment module system, for example via:
```bash
## Load Nextflow and Singularity environment modules
module purge
module load apps/nextflow/22.04.5
```
The pipeline should always be executed inside a workspace on the `/scratch/` system. All of the intermediate files required to run the pipeline will be stored in the `work/` directory. It is recommended to delete this directory after the pipeline has finished successfully because it can get quite large, and all of the main output files will be saved in the `results/` directory.
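As a rough illustration of the clean-up advice above, the sketch below removes `work/` only when a non-empty `results/` directory sits next to it. The function name `cleanup_work_dir` and the demo directory layout are assumptions made for this example; they are not part of the crukmi profile.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete work/ only when results/ exists and is non-empty.
cleanup_work_dir() {
    local run_dir="$1"
    if [ -d "$run_dir/results" ] && [ -n "$(ls -A "$run_dir/results")" ]; then
        rm -rf -- "$run_dir/work"
        echo "removed $run_dir/work"
    else
        echo "kept $run_dir/work (no results found)" >&2
        return 1
    fi
}

# Demo in a throwaway directory that mimics a finished run.
run_dir="$(mktemp -d)"
mkdir -p "$run_dir/work/ab/123456" "$run_dir/results/multiqc"
cleanup_work_dir "$run_dir"
```

Checking for results first is a deliberately cautious choice: deleting `work/` also discards the cached tasks that `-resume` relies on.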

70
docs/medair.md Normal file

@ -0,0 +1,70 @@
# nf-core/configs: Medair Configuration
All nf-core pipelines have been successfully configured for use on the Medair cluster at Clinical Genomics Gothenburg.
To use, run the pipeline with `-profile medair`. This will download and launch the [`medair.config`](../conf/medair.config) which has been pre-configured with a setup suitable for the Medair cluster.
It will enable Nextflow to manage the pipeline jobs via the `SGE` job scheduler.
Using this profile, a docker image containing all of the required software will be downloaded, and converted to a Singularity image before execution of the pipeline.
You will need an account to use the Medair cluster in order to download or run pipelines. If in doubt, contact cgg-it.
## Download nf-core pipelines
### Set-up: load Nextflow and nf-core tools
First, load the relevant software: Nextflow and nf-core tools. You can do so as follows:
```bash
## Load Nextflow
module load nextflow
## Load nf-core tools
module load miniconda
source activate nf-core
```
### Storage of Singularity images
When downloading an nf-core pipeline for the first time (or a specific version of a pipeline), you can choose to store the Singularity image for future use. We chose to have a central location for these images on Medair: `/apps/bio/dependencies/nf-core/singularities`.
For Nextflow to know where to store new images, run or add the following to your `.bashrc`:
```bash
export NXF_SINGULARITY_CACHEDIR="/apps/bio/dependencies/nf-core/singularities"
```
> Comment: This was also added to cronuser.
### Download a pipeline
We have started to download pipelines in the following location: `/apps/bio/repos/nf-core/`
Use the `nf-core download --singularity-cache-only` command to start a download. It will open an interactive menu. Choose `singularity` for the software container image, and `none` for the compression type.
## Run nf-core pipelines
Nextflow will need to submit the jobs via the job scheduler to the HPC cluster and as such the commands below will have to be executed on one of the login nodes. If in doubt contact cgg-it (cgg-it[at]gu.se).
### Set-up: load Nextflow and Singularity
Before running a pipeline you will need to load Nextflow and Singularity using the environment module system on Medair. You can do this by issuing the commands below:
```bash
## Load Nextflow and Singularity environment modules
module purge
module load nextflow
module load singularity
```
### Choose a profile
Depending on what you are running, you can choose between the `wgs` and `production` profiles. Jobs running with the `wgs` profile run on a queue with higher priority. Jobs running with the `production` profile can last longer (max time: 20 days, versus 2 days for the `wgs` profile).
For example, the following job would run with the `wgs` profile:
```bash
nextflow run nf-core/raredisease -profile medair,wgs
```
### Sentieon
In some pipelines (sarek, raredisease) it is possible to use Sentieon for alignment and variant calling. If one uses the label `sentieon` for running a process, the config file contains the path to the Sentieon Singularity image on Medair.


@ -0,0 +1,11 @@
# nf-core/configs: ATAC-Seq Specific Configuration - Sheffield Bioinformatics Core Facility ShARC
Specific configuration for [nf-co.re/atacseq](https://nf-co.re/atacseq) pipeline
## Usage
To use, run nextflow with the pipeline using `-profile sbc_sharc` (note the single hyphen).
This will download and launch the atacseq specific [`sbc_sharc.config`](../../../conf/pipeline/atacseq/sbc_sharc.config) which has been pre-configured with a setup suitable for the [University of Sheffield ShARC cluster](https://docs.hpc.shef.ac.uk/en/latest/index.html) and will automatically load the appropriate pipeline-specific configuration file.
Example: `nextflow run nf-core/atacseq -profile sbc_sharc`


@ -0,0 +1,11 @@
# nf-core/configs: ChIP-Seq Specific Configuration - Sheffield Bioinformatics Core Facility ShARC
Specific configuration for [nf-co.re/chipseq](https://nf-co.re/chipseq) pipeline
## Usage
To use, run nextflow with the pipeline using `-profile sbc_sharc` (note the single hyphen).
This will download and launch the chipseq specific [`sbc_sharc.config`](../../../conf/pipeline/chipseq/sbc_sharc.config) which has been pre-configured with a setup suitable for the [University of Sheffield ShARC cluster](https://docs.hpc.shef.ac.uk/en/latest/index.html) and will automatically load the appropriate pipeline-specific configuration file.
Example: `nextflow run nf-core/chipseq -profile sbc_sharc`


@ -0,0 +1,19 @@
# nf-core/configs: HASTA rnafusion specific configuration
Extra specific configuration for rnafusion pipeline
## Usage
To use, run the pipeline with `-profile hasta`.
This will download and launch the rnafusion specific [`hasta.config`](../../../conf/pipeline/rnafusion/hasta.config) which has been pre-configured with a setup suitable for the `HASTA` cluster.
Example: `nextflow run nf-core/rnafusion -profile hasta`
## rnafusion specific configurations for HASTA
Specific configurations for `HASTA` have been made for rnafusion.
- Always run all the analysis steps (all = true)
- Use trimming (trim = true)
- Take the fusions identified by at least 2 fusion detection tools to the fusioninspector analysis (fusioninspector_filter = true)


@ -0,0 +1,11 @@
# nf-core/configs: RNA-Seq Specific Configuration - Sheffield Bioinformatics Core Facility ShARC
Specific configuration for [nf-co.re/rnaseq](https://nf-co.re/rnaseq) pipeline
## Usage
To use, run nextflow with the pipeline using `-profile sbc_sharc` (note the single hyphen).
This will download and launch the rnaseq specific [`sbc_sharc.config`](../../../conf/pipeline/rnaseq/sbc_sharc.config) which has been pre-configured with a setup suitable for the [University of Sheffield ShARC cluster](https://docs.hpc.shef.ac.uk/en/latest/index.html) and will automatically load the appropriate pipeline-specific configuration file.
Example: `nextflow run nf-core/rnaseq -profile sbc_sharc`


@ -0,0 +1,17 @@
# nf-core/configs: CRUK-MI sarek specific configuration
Extra specific configuration for sarek pipeline
## Usage
To use, run the pipeline with `-profile crukmi`.
This will download and launch the sarek specific [`crukmi.config`](../../../conf/pipeline/sarek/crukmi.config) which has been pre-configured with a setup suitable for the Cancer Research UK Manchester Institute cluster (phoenix).
Example: `nextflow run nf-core/sarek -profile crukmi`
## Sarek specific configurations for CRUK-MI
Specific configurations for `CRUK-MI` have been made for sarek.
- Initial requested resources for SAMTOOLS_MPILEUP are only 5GB and 1 core.


@ -0,0 +1,11 @@
# nf-core/configs: Sarek Specific Configuration - Sheffield Bioinformatics Core Facility ShARC
Specific configuration for [nf-co.re/sarek](https://nf-co.re/sarek) pipeline
## Usage
To use, run nextflow with the pipeline using `-profile sbc_sharc` (note the single hyphen).
This will download and launch the sarek specific [`sbc_sharc.config`](../../../conf/pipeline/sarek/sbc_sharc.config) which has been pre-configured with a setup suitable for the [University of Sheffield ShARC cluster](https://docs.hpc.shef.ac.uk/en/latest/index.html) and will automatically load the appropriate pipeline-specific configuration file.
Example: `nextflow run nf-core/sarek -profile sbc_sharc`

30
docs/sage.md Normal file

@ -0,0 +1,30 @@
# nf-core/configs: Sage Bionetworks Global Configuration
To use this custom configuration, run the pipeline with `-profile sage`. This will download and load the [`sage.config`](../conf/sage.config), which contains a number of optimizations relevant to Sage employees running workflows on AWS (_e.g._ using Nextflow Tower). This profile will also load any applicable pipeline-specific configuration.
This global configuration includes the following tweaks:
- Update the default value for `igenomes_base` to `s3://sage-igenomes`
- Enable retries by default when exit codes relate to insufficient memory
- Allow pending jobs to finish if the number of retries is exhausted
- Increase the amount of time allowed for file transfers
- Increase the default chunk size for multipart uploads to S3
- Slow down job submission rate to avoid overwhelming any APIs
- Define the `check_max()` function, which is missing in Sarek v2
- Slow the increase in the number of allocated CPU cores on retries
- Increase the default time limits because we run pipelines on AWS
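The retry-related tweaks in the list above amount to a small decision rule: retry when the exit status suggests a resource kill, otherwise let pending jobs finish. The sketch below shows that rule as a shell function. The exit codes used by `sage.config` are not shown in this commit, so the list here is borrowed from the `sbc_sharc` profile added in the same commit, and `choose_error_strategy` is a hypothetical name used only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical illustration of an exit-status based error strategy.
# Assumption: the exit codes are copied from the sbc_sharc profile
# (143, 137, 104, 134, 139, 140); sage.config may use a different set.
choose_error_strategy() {
    local exit_status="$1"
    case "$exit_status" in
        104|134|137|139|140|143) echo "retry" ;;   # likely killed for resources
        *)                       echo "finish" ;;  # let pending jobs complete, then stop
    esac
}

for status in 137 140 1; do
    echo "exit ${status} -> $(choose_error_strategy "$status")"
done
```

In a Nextflow config the same rule is written as a closure assigned to `errorStrategy`, paired with a `maxRetries` limit.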
## Additional information about iGenomes
The following iGenomes prefixes have been copied from `s3://ngi-igenomes/` (`eu-west-1`) to `s3://sage-igenomes` (`us-east-1`). See [this script](https://github.com/Sage-Bionetworks-Workflows/nextflow-infra/blob/main/bin/mirror-igenomes.sh) for more information. The `sage-igenomes` S3 bucket has been configured to be openly available, but files cannot be downloaded out of `us-east-1` to avoid egress charges. You can check the `conf/igenomes.config` file in each nf-core pipeline to figure out the mapping between genome IDs (_i.e._ for `--genome`) and iGenomes prefixes ([example](https://github.com/nf-core/rnaseq/blob/89bf536ce4faa98b4d50a8ec0a0343780bc62e0a/conf/igenomes.config#L14-L26)).
- **Human Genome Builds**
- `Homo_sapiens/Ensembl/GRCh37`
- `Homo_sapiens/GATK/GRCh37`
- `Homo_sapiens/UCSC/hg19`
- `Homo_sapiens/GATK/GRCh38`
- `Homo_sapiens/NCBI/GRCh38`
- `Homo_sapiens/UCSC/hg38`
- **Mouse Genome Builds**
- `Mus_musculus/Ensembl/GRCm38`
- `Mus_musculus/UCSC/mm10`


@ -2,8 +2,6 @@
To use, run the pipeline with `-profile sanger`. This will download and launch the [`sanger.config`](../conf/sanger.config) which has been
pre-configured with a setup suitable for the Wellcome Sanger Institute LSF cluster.
Using this profile, either a docker image containing all of the required software will be downloaded, and converted to a Singularity image or
a Singularity image downloaded directly before execution of the pipeline.
## Running the workflow on the Wellcome Sanger Institute cluster
@ -14,10 +12,12 @@ The latest version of Nextflow is not installed by default on the cluster. You w
A recommended place to move the `nextflow` executable to is `~/bin` so that it's in the `PATH`.
Nextflow manages each process as a separate job that is submitted to the cluster by using the `bsub` command.
Since the Nextflow pipeline will submit individual jobs for each process to the cluster and dependencies will be provided bu Singularity images you shoudl make sure that your account has access to the Singularity binary by adding these lines to your `.bashrc` file
If asking Nextflow to use Singularity to run the individual jobs,
you should make sure that your account has access to the Singularity binary by adding these lines to your `.bashrc` file
```bash
[[ -f /software/pathogen/farm5 ]] && module load ISG/singularity
[[ -f /software/modules/ISG/singularity ]] && module load ISG/singularity
```
Nextflow shouldn't run directly on the submission node but on a compute node.
@ -26,16 +26,16 @@ To do so make a shell script with a similar structure to the following code and
```bash
#!/bin/bash
#BSUB -o /path/to/a/log/dir/%J.o
#BSUB -e /path/to/a/log/dir//%J.e
#BSUB -e /path/to/a/log/dir/%J.e
#BSUB -M 8000
#BSUB -q long
#BSUB -n 4
#BSUB -q oversubscribed
#BSUB -n 2
export HTTP_PROXY='http://wwwcache.sanger.ac.uk:3128'
export HTTPS_PROXY='http://wwwcache.sanger.ac.uk:3128'
export NXF_ANSI_LOG=false
export NXF_OPTS="-Xms8G -Xmx8G -Dnxf.pool.maxThreads=2000"
export NXF_VER=21.04.0-edge
export NXF_VER=22.04.0-5697
nextflow run \

40
docs/sbc_sharc.md Normal file

@ -0,0 +1,40 @@
# nf-core/configs: Sheffield Bioinformatics Core Facility ShARC Configuration
## Using the SBC_ShARC Institutional Configuration Profile
To use [`sbc_sharc.config`](../conf/sbc_sharc.config), run nextflow with an nf-core pipeline using `-profile sbc_sharc` (note the single hyphen).
This will download and launch [`sbc_sharc.config`](../conf/sbc_sharc.config) which has been pre-configured with a setup suitable for the ShARC cluster and will automatically load the appropriate pipeline-specific configuration file.
The following nf-core pipelines have been successfully configured for use on the [University of Sheffield ShARC cluster](https://docs.hpc.shef.ac.uk/en/latest/index.html):
- [nf-co.re/atacseq](https://nf-co.re/atacseq)
- [nf-co.re/chipseq](https://nf-co.re/chipseq)
- [nf-co.re/rnaseq](https://nf-co.re/rnaseq)
- [nf-co.re/sarek](https://nf-co.re/sarek)
When using [`sbc_sharc.config`](../conf/sbc_sharc.config) with the pipelines listed above, the appropriate configuration file from the list below will be loaded automatically:
- [atacseq sbc_sharc.config](../conf/pipeline/atacseq/sbc_sharc.config)
- [chipseq sbc_sharc.config](../conf/pipeline/chipseq/sbc_sharc.config)
- [rnaseq sbc_sharc.config](../conf/pipeline/rnaseq/sbc_sharc.config)
- [sarek sbc_sharc.config](../conf/pipeline/sarek/sbc_sharc.config)
The [`sbc_sharc.config`](../conf/sbc_sharc.config) configuration file might work with other nf-core pipelines as it stands but we cannot guarantee they will run without issue. We will be continuing to create, test and optimise configurations for new pipelines in the future.
## A Note on Singularity Containers
The [`sbc_sharc.config`](../conf/sbc_sharc.config) configuration file supports running nf-core pipelines with Singularity containers; Singularity images will be downloaded automatically before execution of the pipeline.
When you run nextflow for the first time, Singularity will create a hidden directory `.singularity` in your `$HOME` directory `/home/$USER`, which has very limited (10GB) space available. It is therefore a good idea to create a directory somewhere else (e.g., `/data/$USER`) with more room and link the locations. To do this, run the following series of commands:
```shell
# change directory to $HOME
cd $HOME
# make the directory that will be linked to
mkdir -p /data/$USER/.singularity
# link the new directory with the existing one
ln -s /data/$USER/.singularity .singularity
```
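If `~/.singularity` already exists as a real directory, `ln -s` creates the link inside it rather than replacing it. The sketch below is one possible way to handle that case; `relocate_with_symlink` is a hypothetical helper, and the demo uses temporary directories in place of `/home/$USER` and `/data/$USER`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move a directory's contents to a roomier location
# and leave a symlink behind at the original path.
relocate_with_symlink() {
    local link_path="$1" target_dir="$2"
    mkdir -p "$target_dir"
    if [ -L "$link_path" ]; then
        return 0   # already a symlink; its target is not checked here
    fi
    if [ -d "$link_path" ]; then
        # Carry over existing contents (including dotfiles), then remove the old directory.
        cp -a "$link_path/." "$target_dir/"
        rm -rf -- "$link_path"
    fi
    ln -s "$target_dir" "$link_path"
}

# Demo: temporary stand-ins for $HOME and /data/$USER.
demo_home="$(mktemp -d)"
demo_data="$(mktemp -d)"
mkdir -p "$demo_home/.singularity/cache"
touch "$demo_home/.singularity/cache/image.sif"
relocate_with_symlink "$demo_home/.singularity" "$demo_data/.singularity"
ls -A "$demo_home/.singularity/cache"
```

For real use the call would be `relocate_with_symlink "$HOME/.singularity" "/data/$USER/.singularity"`, assuming `/data/$USER` exists on your system as described above.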


@ -11,6 +11,7 @@
//Please use a new line per include Config section to allow easier linting/parsing. Thank you.
profiles {
abims { includeConfig "${params.custom_config_base}/conf/abims.config" }
adcra { includeConfig "${params.custom_config_base}/conf/adcra.config" }
alice { includeConfig "${params.custom_config_base}/conf/alice.config" }
aws_tower { includeConfig "${params.custom_config_base}/conf/aws_tower.config" }
awsbatch { includeConfig "${params.custom_config_base}/conf/awsbatch.config" }
@ -30,6 +31,7 @@ profiles {
cheaha { includeConfig "${params.custom_config_base}/conf/cheaha.config" }
computerome { includeConfig "${params.custom_config_base}/conf/computerome.config" }
crick { includeConfig "${params.custom_config_base}/conf/crick.config" }
crukmi { includeConfig "${params.custom_config_base}/conf/crukmi.config" }
czbiohub_aws { includeConfig "${params.custom_config_base}/conf/czbiohub_aws.config" }
denbi_qbic { includeConfig "${params.custom_config_base}/conf/denbi_qbic.config" }
ebc { includeConfig "${params.custom_config_base}/conf/ebc.config" }
@ -49,6 +51,7 @@ profiles {
lugh { includeConfig "${params.custom_config_base}/conf/lugh.config" }
maestro { includeConfig "${params.custom_config_base}/conf/maestro.config" }
marvin { includeConfig "${params.custom_config_base}/conf/marvin.config" }
medair { includeConfig "${params.custom_config_base}/conf/medair.config" }
mjolnir_globe { includeConfig "${params.custom_config_base}/conf/mjolnir_globe.config" }
mpcdf { includeConfig "${params.custom_config_base}/conf/mpcdf.config" }
munin { includeConfig "${params.custom_config_base}/conf/munin.config" }
@ -59,8 +62,10 @@ profiles {
phoenix { includeConfig "${params.custom_config_base}/conf/phoenix.config" }
prince { includeConfig "${params.custom_config_base}/conf/prince.config" }
rosalind { includeConfig "${params.custom_config_base}/conf/rosalind.config" }
sage { includeConfig "${params.custom_config_base}/conf/sage.config" }
sahmri { includeConfig "${params.custom_config_base}/conf/sahmri.config" }
sanger { includeConfig "${params.custom_config_base}/conf/sanger.config"}
sbc_sharc { includeConfig "${params.custom_config_base}/conf/sbc_sharc.config"}
seg_globe { includeConfig "${params.custom_config_base}/conf/seg_globe.config"}
uct_hpc { includeConfig "${params.custom_config_base}/conf/uct_hpc.config" }
unibe_ibu { includeConfig "${params.custom_config_base}/conf/unibe_ibu.config" }

13
pipeline/atacseq.config Normal file

@ -0,0 +1,13 @@
/*
* -------------------------------------------------
* nfcore/atacseq custom profile Nextflow config file
* -------------------------------------------------
* Config options for custom environments.
* Cluster-specific config options should be saved
* in the conf/pipeline/atacseq folder and imported
* under a profile name here.
*/
profiles {
sbc_sharc { includeConfig "${params.custom_config_base}/conf/pipeline/atacseq/sbc_sharc.config" }
}

13
pipeline/chipseq.config Normal file

@ -0,0 +1,13 @@
/*
* -------------------------------------------------
* nfcore/chipseq custom profile Nextflow config file
* -------------------------------------------------
* Config options for custom environments.
* Cluster-specific config options should be saved
* in the conf/pipeline/chipseq folder and imported
* under a profile name here.
*/
profiles {
sbc_sharc { includeConfig "${params.custom_config_base}/conf/pipeline/chipseq/sbc_sharc.config" }
}


@ -9,5 +9,6 @@
*/
profiles {
hasta { includeConfig "${params.custom_config_base}/conf/pipeline/rnafusion/hasta.config" }
munin { includeConfig "${params.custom_config_base}/conf/pipeline/rnafusion/munin.config" }
}
}


@ -11,5 +11,6 @@
profiles {
eddie { includeConfig "${params.custom_config_base}/conf/pipeline/rnaseq/eddie.config" }
mpcdf { includeConfig "${params.custom_config_base}/conf/pipeline/rnaseq/mpcdf.config" }
sbc_sharc { includeConfig "${params.custom_config_base}/conf/pipeline/rnaseq/sbc_sharc.config" }
utd_sysbio { includeConfig "${params.custom_config_base}/conf/pipeline/rnaseq/utd_sysbio.config" }
}


@ -9,10 +9,12 @@
*/
profiles {
munin { includeConfig "${params.custom_config_base}/conf/pipeline/sarek/munin.config" }
uppmax { includeConfig "${params.custom_config_base}/conf/pipeline/sarek/uppmax.config" }
icr_davros { includeConfig "${params.custom_config_base}/conf/pipeline/sarek/icr_davros.config" }
cfc { includeConfig "${params.custom_config_base}/conf/pipeline/sarek/cfc.config" }
cfc_dev { includeConfig "${params.custom_config_base}/conf/pipeline/sarek/cfc.config" }
crukmi { includeConfig "${params.custom_config_base}/conf/pipeline/sarek/crukmi.config" }
eddie { includeConfig "${params.custom_config_base}/conf/pipeline/sarek/eddie.config" }
}
icr_davros { includeConfig "${params.custom_config_base}/conf/pipeline/sarek/icr_davros.config" }
munin { includeConfig "${params.custom_config_base}/conf/pipeline/sarek/munin.config" }
sbc_sharc { includeConfig "${params.custom_config_base}/conf/pipeline/sarek/sbc_sharc.config" }
uppmax { includeConfig "${params.custom_config_base}/conf/pipeline/sarek/uppmax.config" }
}