# ![nf-core/modules](docs/images/nfcore-modules_logo.png)

> DSL2 IS AN EXPERIMENTAL FEATURE UNDER DEVELOPMENT. SYNTAX, ORGANISATION AND LAYOUT OF THIS REPOSITORY MAY CHANGE IN THE NEAR FUTURE!

A repository for hosting Nextflow [`DSL2`](https://www.nextflow.io/docs/edge/dsl2.html) module files containing tool-specific process definitions and their associated documentation.

## Table of contents

- [Using existing modules](#using-existing-modules)
    - [Configuration and parameters](#configuration-and-parameters)
    - [Offline usage](#offline-usage)
- [Adding a new module file](#adding-a-new-module-file)
    - [Testing](#testing)
    - [Documentation](#documentation)
    - [Uploading to `nf-core/modules`](#uploading-to-nf-coremodules)
- [Help](#help)

## Terminology

The features offered by Nextflow DSL 2 can be used in various ways depending on the granularity with which you would like to write pipelines. Please see the listing below for the hierarchy and associated terminology we have decided to use when referring to DSL 2 components:

- *Module*: A `process` that can be used within different pipelines and is as atomic as possible, i.e. it cannot be split into another module. An example of this would be a module file containing the process definition for a single tool such as `FastQC`. This repository has been created to only host atomic module files that should be added to the `tools` sub-directory along with the required documentation, software and tests.
- *Sub-workflow*: A chain of multiple modules that offers a higher level of functionality within the context of a pipeline. For example, a sub-workflow to run multiple QC tools with FastQ files as input. Sub-workflows should be shipped with the pipeline implementation and, if required, they should be shared amongst different pipelines directly from there. As it stands, this repository will not host sub-workflows.
- *Workflow*: What DSL 1 users would consider an end-to-end pipeline. For example, from one or more inputs to a series of outputs. This can either be implemented using a large monolithic script as with DSL 1, or by using a combination of DSL 2 individual modules and sub-workflows.

## Using existing modules

The Nextflow [`include`](https://www.nextflow.io/docs/edge/dsl2.html#modules-include) statement can be used within your pipelines in order to load module files that you have available locally.

You can get a good idea of how other people are using module files by looking at pipelines available in nf-core, e.g. [`nf-core/chipseq`](https://github.com/nf-core/chipseq/tree/dev) (work in progress).

### Configuration and parameters

The module files hosted in this repository define a set of processes for software tools such as `fastqc`, `trimgalore`, `bwa` etc. This allows you to share and add common functionality across multiple pipelines in a modular fashion.

> The definition and standards for module files are still under discussion amongst the community but hopefully, a description should be added here soon!

### Offline usage

If you want to use an existing module file available in `nf-core/modules`, and you're running on a system that has no internet connection, you'll need to download the repository (e.g. `git clone https://github.com/nf-core/modules.git`) and place it in a location that is visible to the file system on which you are running the pipeline. Then create a custom config file, e.g. `custom_module.conf`, containing the following information:

```bash
include /path/to/downloaded/modules/directory/
```

Then you can run the pipeline by directly passing the additional config file with the `-c` parameter:

```bash
nextflow run /path/to/pipeline/ -c /path/to/custom_module.conf
```

> Note that the nf-core/tools helper package has a `download` command to download all required pipeline
> files + singularity containers + institutional configs + modules in one go for you, to make this process easier.

## Adding a new module file

If you decide to upload your module file to `nf-core/modules` then this will ensure that it will be automatically downloaded, and available at run-time to all nf-core pipelines, and to everyone within the Nextflow community! See [`nf-core/modules/software`](https://github.com/nf-core/modules/tree/master/software) for examples.

The definition and standards for module files are still under discussion amongst the community.

Currently the following points have been agreed on:

- Module files should only define inputs/outputs as parameters and have the ability to use `params.MODULENAME_options` as an additional parameter to add any additional settings via pipelines.
- Single-end boolean values should be specified within the input channel and not inferred from the data, e.g. [here](https://github.com/nf-core/tools/blob/028a9b3f9d1ad044e879a1de13d3c3a25a06b9a7/nf_core/pipeline-template/%7B%7Bcookiecutter.name_noslash%7D%7D/modules/nf-core/fastqc.nf#L13)
- Define threads or resources where required for a particular process using `task.cpus`
- Software that can be piped together should be added to separate module files unless there is a run-time or storage advantage in combining them, e.g. `bwa mem | samtools view` to output BAM instead of SAM
- Process names should be all uppercase
- The `publishDirMode` should be configurable
- Test data is stored within this repo. Re-use generic files from `tests/data` by symlinking them into the test directory of the module. Add specific files to the test-directory directly. Keep test files as tiny as possible.
- Software requirements should be declared in a conda `environment.yml` file, including exact version numbers. Additionally, there should be a `Dockerfile` that containerizes the environment.
- Each process should emit a file `TOOL.version.txt` containing a single line with the software's version in the format `vX.X.X`.
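
As a rough illustration of the last point, the version file could be produced along these lines. This is only a sketch: `write_version_file` is a hypothetical helper, not part of `nf-core/modules`, and it assumes the tool's version banner contains a plain `X.X.X` number.

```bash
# Hypothetical helper (not part of nf-core/modules): pull the first X.X.X
# version number out of a tool's banner and write it as a single vX.X.X line.
write_version_file() {
    local tool="$1"
    local banner="$2"
    local version
    version=$(printf '%s\n' "$banner" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
    printf 'v%s\n' "$version" > "${tool}.version.txt"
}

# In a real module the banner would come from the tool itself;
# a fixed example string is used here so the sketch runs anywhere.
write_version_file fastqc "FastQC v0.11.9"
cat fastqc.version.txt   # prints: v0.11.9
```

Normalising the banner in one place keeps the `vX.X.X` format consistent even when tools print their version in different ways.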

### Testing

If you want to add a new module file to `nf-core/modules`, please test that your pipeline of choice runs as expected by using the [`include`](https://www.nextflow.io/docs/edge/dsl2.html#modules-include) statement with a local version of the module file.

### Documentation

Please add some documentation to the top of the module file in the form of native Nextflow comments. This has to be specified in a particular format as you will be able to see from other examples in the [`nf-core/modules/nf`](https://github.com/nf-core/modules/tree/master/nf) directory.

### Uploading to `nf-core/modules`

[Fork](https://help.github.com/articles/fork-a-repo/) the `nf-core/modules` repository to your own GitHub account. Within the local clone of your fork add the module file to the [`nf-core/modules/software`](https://github.com/nf-core/modules/tree/master/software) directory. Please keep the naming consistent between the module and documentation files e.g. `bwa.nf` and `bwa.md`, respectively.

Commit these changes and push them to your fork on GitHub, then [create a pull request](https://help.github.com/articles/creating-a-pull-request-from-a-fork/) against the `nf-core/modules` GitHub repo with the appropriate information.
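
The steps above might look roughly like the following. This is a sketch under assumptions: `your-username`, the `bwa` module and the branch name are placeholders for illustration, and the commands are only printed so that you can review and adapt them before running anything.

```bash
# Placeholders (assumptions for illustration): substitute your own values.
GITHUB_USER="your-username"
MODULE="bwa"
BRANCH="add-${MODULE}-module"

# Build the command list as text so it can be reviewed before anything is run.
plan=$(cat <<EOF
git clone https://github.com/${GITHUB_USER}/modules.git
cd modules
git checkout -b ${BRANCH}
git add software/
git commit -m "Add ${MODULE} module"
git push origin ${BRANCH}
EOF
)
printf '%s\n' "$plan"
```

Working on a dedicated branch rather than `master` keeps the pull request limited to the new module.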

We will be notified automatically when you have created your pull request, and providing that everything adheres to nf-core guidelines we will endeavour to approve your pull request as soon as possible.

## Help

If you have any questions or issues please send us a message on [Slack](https://nf-co.re/join/slack).