Update README

Gregor Sturm 2020-07-14 10:51:19 +02:00
parent 8155270653
commit 4ee6ddc7ab
4 changed files with 26 additions and 14 deletions

3
.gitignore vendored

@ -1,5 +1,6 @@
.nextflow*
work/
data/
results/
./data
.DS_Store
*.code-workspace


@ -6,22 +6,22 @@ A repository for hosting nextflow [`DSL2`](https://www.nextflow.io/docs/edge/dsl
## Table of contents
* [Using existing modules](#using-existing-modules)
* [Configuration and parameters](#configuration-and-parameters)
* [Offline usage](#offline-usage)
* [Adding a new module file](#adding-a-new-module-file)
* [Testing](#testing)
* [Documentation](#documentation)
* [Uploading to `nf-core/modules`](#uploading-to-nf-coremodules)
* [Help](#help)
- [Using existing modules](#using-existing-modules)
- [Configuration and parameters](#configuration-and-parameters)
- [Offline usage](#offline-usage)
- [Adding a new module file](#adding-a-new-module-file)
- [Testing](#testing)
- [Documentation](#documentation)
- [Uploading to `nf-core/modules`](#uploading-to-nf-coremodules)
- [Help](#help)
## Terminology
The features offered by Nextflow DSL 2 can be used in various ways depending on the granularity with which you would like to write pipelines. Please see the listing below for the hierarchy and associated terminology we have decided to use when referring to DSL 2 components:
* *Module*: A `process` that can be used within different pipelines and is as atomic as possible, i.e. cannot be split into another module. An example of this would be a module file containing the process definition for a single tool such as `FastQC`. This repository has been created to only host atomic module files that should be added to the `tools` sub-directory along with the required documentation, software and tests.
* *Sub-workflow*: A chain of multiple modules that offers a higher level of functionality within the context of a pipeline. For example, a sub-workflow to run multiple QC tools with FastQ files as input. Sub-workflows should be shipped with the pipeline implementation and if required they should be shared amongst different pipelines directly from there. As it stands, this repository will not host sub-workflows.
* *Workflow*: What DSL 1 users would consider an end-to-end pipeline. For example, from one or more inputs to a series of outputs. This can either be implemented using a large monolithic script as with DSL 1, or by using a combination of DSL 2 individual modules and sub-workflows.
- _Module_: A `process` that can be used within different pipelines and is as atomic as possible, i.e. cannot be split into another module. An example of this would be a module file containing the process definition for a single tool such as `FastQC`. This repository has been created to only host atomic module files that should be added to the `tools` sub-directory along with the required documentation, software and tests.
- _Sub-workflow_: A chain of multiple modules that offers a higher level of functionality within the context of a pipeline. For example, a sub-workflow to run multiple QC tools with FastQ files as input. Sub-workflows should be shipped with the pipeline implementation and if required they should be shared amongst different pipelines directly from there. As it stands, this repository will not host sub-workflows.
- _Workflow_: What DSL 1 users would consider an end-to-end pipeline. For example, from one or more inputs to a series of outputs. This can either be implemented using a large monolithic script as with DSL 1, or by using a combination of DSL 2 individual modules and sub-workflows.
## Using existing modules
@ -56,7 +56,19 @@ nextflow run /path/to/pipeline/ -c /path/to/custom_module.conf
If you decide to upload your module file to `nf-core/modules` then this will ensure that it will be automatically downloaded, and available at run-time to all nf-core pipelines, and to everyone within the Nextflow community! See [`nf-core/modules/nf`](https://github.com/nf-core/modules/tree/master/nf) for examples.
> The definition and standards for module files are still under discussion amongst the community but hopefully, a description should be added here soon!
The definition and standards for module files are still under discussion amongst the community.
Currently the following points have been agreed on:
- Module files should define only inputs and outputs as parameters, and should be able to use `params.MODULENAME_options` as an additional parameter so that pipelines can pass in any extra settings.
- Single-end boolean values should be specified within the input channel and not inferred from the data, e.g. [here](https://github.com/nf-core/tools/blob/028a9b3f9d1ad044e879a1de13d3c3a25a06b9a7/nf_core/pipeline-template/%7B%7Bcookiecutter.name_noslash%7D%7D/modules/nf-core/fastqc.nf#L13)
- Define threads or resources where required for a particular process using `task.cpus`
- Software that can be piped together should be added to separate module files unless there is a run-time or storage advantage in combining them, e.g. `bwa mem | samtools view` to output BAM instead of SAM
- Process names should be all uppercase
- The `publishDirMode` should be configurable
- Test data is stored within this repo. Re-use generic files from `tests/data` by
symlinking them into the test directory of the module. Add specific files
to the test directory directly. Keep test files as tiny as possible.
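
The test-data convention in the last point could look roughly like the sketch below. It is only an illustration: the `tools/fastqc/test/input` layout and the file name `example_R1.fastq` are assumptions, and the commands run in a scratch directory rather than in a real checkout. Only `tests/data` and the `tools` sub-directory come from this README.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch directory standing in for the repository root (assumption, so the sketch runs anywhere).
repo_root="$(mktemp -d)"
cd "$repo_root"

# Hypothetical layout: one shared data directory and one module test directory.
mkdir -p tests/data/fastq tools/fastqc/test/input

# A tiny generic test file, kept as small as possible.
printf '@read1\nACGT\n+\nIIII\n' > tests/data/fastq/example_R1.fastq

# Link the shared file into the module's test directory.
# The target is relative to the directory holding the link:
# four levels up from tools/fastqc/test/input is the repository root.
ln -s ../../../../tests/data/fastq/example_R1.fastq \
      tools/fastqc/test/input/example_R1.fastq

# Check that the link resolves to the shared file.
cat tools/fastqc/test/input/example_R1.fastq
```

A relative target is used so that the link still resolves when the repository is cloned to a different path; an absolute target would point at the original machine's directory.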
### Testing

@ -1 +0,0 @@
Subproject commit ddbd0c4cf7f1721c78673c4dcc91fcd7940e67f8

0
tests/data/.gitignore vendored Normal file