Add guidelines section

This commit is contained in:
drpatelh 2020-08-06 13:43:19 +01:00
parent c7c65e3ad0
commit 1b294f4249

View file

@ -99,8 +99,7 @@ We have written a helper command in the `nf-core/tools` package that uses the Gi
## Adding a new module file ## Adding a new module file
> **NB:** The definition and standards for module files are still under discussion > **NB:** The definition and standards for module files are still under discussion
amongst the nf-core community but your contributions but your contributions are always amongst the nf-core community but your contributions are always more than welcome! :)
more than welcome! :)
If you decide to upload a module to `nf-core/modules` then this will If you decide to upload a module to `nf-core/modules` then this will
ensure that it will become available to all nf-core pipelines, ensure that it will become available to all nf-core pipelines,
@ -108,9 +107,12 @@ and to everyone within the Nextflow community! See
[`nf-core/modules/software`](https://github.com/nf-core/modules/tree/master/software) [`nf-core/modules/software`](https://github.com/nf-core/modules/tree/master/software)
for examples. for examples.
### Current guidelines
The key words "MUST", "MUST NOT", "SHOULD", etc. are to be interpreted as described in [RFC 2119](https://tools.ietf.org/html/rfc2119). The key words "MUST", "MUST NOT", "SHOULD", etc. are to be interpreted as described in [RFC 2119](https://tools.ietf.org/html/rfc2119).
### Defining inputs, outputs and parameters #### Defining inputs, outputs and parameters
- A module file SHOULD only define inputs and outputs as parameters. Additionally, - A module file SHOULD only define inputs and outputs as parameters. Additionally,
- it MUST define threads or resources where required for a particular process using `task.cpus` - it MUST define threads or resources where required for a particular process using `task.cpus`
- ~~it MUST be possible to pass additional parameters to the tool as a command line string via the `params.<MODULE>_args` parameter.~~ - ~~it MUST be possible to pass additional parameters to the tool as a command line string via the `params.<MODULE>_args` parameter.~~
@ -122,13 +124,16 @@ The key words "MUST", "MUST NOT", "SHOULD", etc. are to be interpreted as descri
- A Process MUST NOT contain a `when` statement. - A Process MUST NOT contain a `when` statement.
- Optional inputs need development on the nextflow side. In the meanwhile, "fake files" MAY be used to work around this issue. - Optional inputs need development on the nextflow side. In the meanwhile, "fake files" MAY be used to work around this issue.
### Atomicity #### Atomicity
- Software that can be piped together SHOULD be added to separate module files unless there is an run-time, storage advantage in implementing in this way e.g. `bwa mem | samtools view -C -T ref.fasta` to output CRAM instead of SAM. - Software that can be piped together SHOULD be added to separate module files unless there is an run-time, storage advantage in implementing in this way e.g. `bwa mem | samtools view -C -T ref.fasta` to output CRAM instead of SAM.
### Resource requirements #### Resource requirements
- Each module MUST define a label `process_low`, `process_medium` or `process_high` to declare resource requirements. (*These flags will be ignored outside of nf-core and the pipeline developer is free to define adequate resource requirements*) - Each module MUST define a label `process_low`, `process_medium` or `process_high` to declare resource requirements. (*These flags will be ignored outside of nf-core and the pipeline developer is free to define adequate resource requirements*)
### Publishing results #### Publishing results
- The module MUST accept the parameters `params.out_dir` and `params.publish_dir` and MUST publish results into `${params.out_dir}/${params.publish_dir}`. - The module MUST accept the parameters `params.out_dir` and `params.publish_dir` and MUST publish results into `${params.out_dir}/${params.publish_dir}`.
- The `publishDirMode` MUST be configurable via `params.publish_dir_mode` - The `publishDirMode` MUST be configurable via `params.publish_dir_mode`
- The module MUST accept a parameter `params.publish_results` accepting at least - The module MUST accept a parameter `params.publish_results` accepting at least
@ -139,28 +144,32 @@ The key words "MUST", "MUST NOT", "SHOULD", etc. are to be interpreted as descri
- To ensure consistent naming, files SHOULD be renamed according to the `$name` variable before returning them. - To ensure consistent naming, files SHOULD be renamed according to the `$name` variable before returning them.
### Testing #### Testing
- Every module MUST be tested by adding a test workflow with a toy dataset. - Every module MUST be tested by adding a test workflow with a toy dataset.
- Test data MUST be stored within this repo. It is RECOMMENDED to re-use generic files from `tests/data` by symlinking them into the test directory of the module. Specific files MUST be added to the test-directory directly. Test files MUST be kept as tiny as possible. - Test data MUST be stored within this repo. It is RECOMMENDED to re-use generic files from `tests/data` by symlinking them into the test directory of the module. Specific files MUST be added to the test-directory directly. Test files MUST be kept as tiny as possible.
### Software requirements #### Software requirements
- Software requirements SHOULD be declared in a conda `environment.yml` file, including exact version numbers. Additionally, there MUST be a `Dockerfile` that containerizes the environment, or packages the software if conda is not available. - Software requirements SHOULD be declared in a conda `environment.yml` file, including exact version numbers. Additionally, there MUST be a `Dockerfile` that containerizes the environment, or packages the software if conda is not available.
- Docker containers MUST BE identified by their `sha256(Dockerfile + environment.yml)`. - Docker containers MUST BE identified by their `sha256(Dockerfile + environment.yml)`.
- Each module must have it's own `Dockerfile` and `environment.yml` file - Each module must have it's own `Dockerfile` and `environment.yml` file
- Care should be taken to maintain identical files for subcommands that use the same software. Then the hash tag will be the same and they will be implicitly re-used across subcommands. - Care should be taken to maintain identical files for subcommands that use the same software. Then the hash tag will be the same and they will be implicitly re-used across subcommands.
### File formats #### File formats
- Wherever possible, [CRAM](https://en.wikipedia.org/wiki/CRAM_(file_format)) files SHOULD be used over BAM files. - Wherever possible, [CRAM](https://en.wikipedia.org/wiki/CRAM_(file_format)) files SHOULD be used over BAM files.
- Wherever possible, FASTQ files SHOULD be compressed using gzip. - Wherever possible, FASTQ files SHOULD be compressed using gzip.
### Documentation #### Documentation
- A module MUST be documented in the `meta.yml` file. It MUST document `params`, `input` and `output`. `input` and `output` MUST be a nested list. [Exact detail need to be elaborated. ] - A module MUST be documented in the `meta.yml` file. It MUST document `params`, `input` and `output`. `input` and `output` MUST be a nested list. [Exact detail need to be elaborated. ]
### Uploading to `nf-core/modules` ### Uploading to `nf-core/modules`
[Fork](https://help.github.com/articles/fork-a-repo/) the `nf-core/modules` repository to your own GitHub account. Within the local clone of your fork add the module file to the [`nf-core/modules/software`](https://github.com/nf-core/modules/tree/master/software) directory. Please keep the naming consistent between the module and documentation files e.g. `bwa.nf` and `bwa.md`, respectively. [Fork](https://help.github.com/articles/fork-a-repo/) the `nf-core/modules` repository to your own GitHub account. Within the local clone of your fork add the module file to the [`nf-core/modules/software`](https://github.com/nf-core/modules/tree/master/software) directory.
Commit and push these changes to your local clone on GitHub, and then [create a pull request](https://help.github.com/articles/creating-a-pull-request-from-a-fork/) on `nf-core/modules` GitHub repo with the appropriate information. Commit and push these changes to your local clone on GitHub, and then [create a pull request](https://help.github.com/articles/creating-a-pull-request-from-a-fork/) on the `nf-core/modules` GitHub repo with the appropriate information.
We will be notified automatically when you have created your pull request, and providing that everything adheres to nf-core guidelines we will endeavour to approve your pull request as soon as possible. We will be notified automatically when you have created your pull request, and providing that everything adheres to nf-core guidelines we will endeavour to approve your pull request as soon as possible.