From 4ee6ddc7ab454ef8ada355a8989b06f5ce0c3f49 Mon Sep 17 00:00:00 2001
From: Gregor Sturm
Date: Tue, 14 Jul 2020 10:51:19 +0200
Subject: [PATCH 1/4] Update README

---
 .gitignore            |  3 ++-
 README.md             | 36 ++++++++++++++++++++++++------------
 test-datasets         |  1 -
 tests/data/.gitignore |  0
 4 files changed, 26 insertions(+), 14 deletions(-)
 delete mode 160000 test-datasets
 create mode 100644 tests/data/.gitignore

diff --git a/.gitignore b/.gitignore
index 07c0144a..9fdbb0cc 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,5 +1,6 @@
 .nextflow*
 work/
-data/
 results/
+./data
 .DS_Store
+*.code-workspace
diff --git a/README.md b/README.md
index 74f5115a..98cd33d8 100644
--- a/README.md
+++ b/README.md
@@ -6,22 +6,22 @@ A repository for hosting nextflow [`DSL2`](https://www.nextflow.io/docs/edge/dsl
 
 ## Table of contents
 
-* [Using existing modules](#using-existing-modules)
-  * [Configuration and parameters](#configuration-and-parameters)
-  * [Offline usage](#offline-usage)
-* [Adding a new module file](#adding-a-new-module-file)
-  * [Testing](#testing)
-  * [Documentation](#documentation)
-  * [Uploading to `nf-core/modules`](#uploading-to-nf-coremodules)
-* [Help](#help)
+- [Using existing modules](#using-existing-modules)
+  - [Configuration and parameters](#configuration-and-parameters)
+  - [Offline usage](#offline-usage)
+- [Adding a new module file](#adding-a-new-module-file)
+  - [Testing](#testing)
+  - [Documentation](#documentation)
+  - [Uploading to `nf-core/modules`](#uploading-to-nf-coremodules)
+- [Help](#help)
 
 ## Terminology
 
 The features offered by Nextflow DSL 2 can be used in various ways depending on the granularity with which you would like to write pipelines. Please see the listing below for the hierarchy and associated terminology we have decided to use when referring to DSL 2 components:
 
-* *Module*: A `process`that can be used within different pipelines and is as atomic as possible i.e. cannot be split into another module. An example of this would be a module file containing the process definition for a single tool such as `FastQC`. This repository has been created to only host atomic module files that should be added to the `tools` sub-directory along with the required documentation, software and tests.
-* *Sub-workflow*: A chain of multiple modules that offer a higher-level of functionality within the context of a pipeline. For example, a sub-workflow to run multiple QC tools with FastQ files as input. Sub-workflows should be shipped with the pipeline implementation and if required they should be shared amongst different pipelines directly from there. As it stands, this repository will not host sub-workflows.
-* *Workflow*: What DSL 1 users would consider an end-to-end pipeline. For example, from one or more inputs to a series of outputs. This can either be implemented using a large monolithic script as with DSL 1, or by using a combination of DSL 2 individual modules and sub-workflows.
+- _Module_: A `process` that can be used within different pipelines and is as atomic as possible, i.e. it cannot be split into another module. An example of this would be a module file containing the process definition for a single tool such as `FastQC`. This repository has been created to only host atomic module files that should be added to the `tools` sub-directory along with the required documentation, software and tests.
+- _Sub-workflow_: A chain of multiple modules that offer a higher level of functionality within the context of a pipeline. For example, a sub-workflow to run multiple QC tools with FastQ files as input. Sub-workflows should be shipped with the pipeline implementation and, if required, shared amongst different pipelines directly from there. As it stands, this repository will not host sub-workflows.
+- _Workflow_: What DSL 1 users would consider an end-to-end pipeline. For example, from one or more inputs to a series of outputs. This can either be implemented using a large monolithic script as with DSL 1, or by using a combination of DSL 2 individual modules and sub-workflows.
 
 ## Using existing modules
 
@@ -56,7 +56,19 @@ nextflow run /path/to/pipeline/ -c /path/to/custom_module.conf
 
 If you decide to upload your module file to `nf-core/modules` then this will ensure that it will be automatically downloaded, and available at run-time to all nf-core pipelines, and to everyone within the Nextflow community! See [`nf-core/modules/nf`](https://github.com/nf-core/modules/tree/master/nf) for examples.
 
-> The definition and standards for module files are still under discussion amongst the community but hopefully, a description should be added here soon!
+The definition and standards for module files are still under discussion amongst the community.
+
+Currently the following points have been agreed on:
+
+- Module files should only define inputs/outputs as parameters and have the ability to use `params.MODULENAME_options` as an additional parameter to add any additional settings via pipelines.
+- Single-end boolean values should be specified within the input channel and not inferred from the data e.g. [here](https://github.com/nf-core/tools/blob/028a9b3f9d1ad044e879a1de13d3c3a25a06b9a7/nf_core/pipeline-template/%7B%7Bcookiecutter.name_noslash%7D%7D/modules/nf-core/fastqc.nf#L13)
+- Define threads or resources where required for a particular process using `task.cpus`
+- Software that can be piped together should be added to separate module files unless there is a run-time or storage advantage in implementing it this way e.g. `bwa mem | samtools view` to output BAM instead of SAM
+- Process names should be all uppercase
+- The `publishDirMode` should be configurable
+- Test data is stored within this repo. Re-use generic files from `tests/data` by
+  symlinking them into the test directory of the module. Add specific files
+  to the test directory directly. Keep test files as tiny as possible.
 
 ### Testing
 
diff --git a/test-datasets b/test-datasets
deleted file mode 160000
index ddbd0c4c..00000000
--- a/test-datasets
+++ /dev/null
@@ -1 +0,0 @@
-Subproject commit ddbd0c4cf7f1721c78673c4dcc91fcd7940e67f8
diff --git a/tests/data/.gitignore b/tests/data/.gitignore
new file mode 100644
index 00000000..e69de29b
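To make the points agreed in the patch above concrete, here is a minimal sketch of what a module file written against them could look like. It is an illustration only, not a template from the patch: the `FASTQC` process, the `params.fastqc_options` hook and the `single_end` flag are assumed names.

```nextflow
// Hypothetical tools/fastqc/main.nf -- a sketch of the agreed conventions
process FASTQC {
    input:
    // the single-end flag travels with the reads, so it is never inferred from the data
    tuple val(name), val(single_end), path(reads)

    output:
    path "*.html"
    path "*.zip"

    script:
    // the params.MODULENAME_options hook: pipelines can inject extra tool settings
    def options = params.fastqc_options ?: ''
    """
    fastqc $options --threads $task.cpus $reads
    """
}
```

A pipeline would `include` this file and, if needed, set `params.fastqc_options` in its own configuration, as described under "Configuration and parameters".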
From fe882f1579d0f3c0076bef19c9a655aff1f0131f Mon Sep 17 00:00:00 2001
From: Gregor Sturm
Date: Tue, 14 Jul 2020 12:07:41 +0200
Subject: [PATCH 2/4] Add software dependencies and version string to requirements list

---
 README.md | 45 +++++++++++++++++++++++++++++----------------
 1 file changed, 29 insertions(+), 16 deletions(-)

diff --git a/README.md b/README.md
index 98cd33d8..33608145 100644
--- a/README.md
+++ b/README.md
@@ -7,12 +7,12 @@ A repository for hosting nextflow [`DSL2`](https://www.nextflow.io/docs/edge/dsl
 ## Table of contents
 
 - [Using existing modules](#using-existing-modules)
-  - [Configuration and parameters](#configuration-and-parameters)
-  - [Offline usage](#offline-usage)
+    - [Configuration and parameters](#configuration-and-parameters)
+    - [Offline usage](#offline-usage)
 - [Adding a new module file](#adding-a-new-module-file)
-  - [Testing](#testing)
-  - [Documentation](#documentation)
-  - [Uploading to `nf-core/modules`](#uploading-to-nf-coremodules)
+    - [Testing](#testing)
+    - [Documentation](#documentation)
+    - [Uploading to `nf-core/modules`](#uploading-to-nf-coremodules)
 - [Help](#help)
 
 ## Terminology
@@ -27,7 +27,7 @@
 
 The Nextflow [`include`](https://www.nextflow.io/docs/edge/dsl2.html#modules-include) statement can be used within your pipelines in order to load module files that you have available locally.
 
-You should be able to get a good idea as to how other people are using module files by looking at pipelines available in nf-core e.g. [`nf-core/rnaseq`](https://github.com/nf-core/rnaseq/pull/162)
+You should be able to get a good idea of how other people are using module files by looking at pipelines available in nf-core e.g. [`nf-core/chipseq`](https://github.com/nf-core/chipseq/tree/dev) (work in progress)
 
 ### Configuration and parameters
 
@@ -54,21 +54,34 @@ nextflow run /path/to/pipeline/ -c /path/to/custom_module.conf
 
 ## Adding a new module file
 
-If you decide to upload your module file to `nf-core/modules` then this will ensure that it will be automatically downloaded, and available at run-time to all nf-core pipelines, and to everyone within the Nextflow community! See [`nf-core/modules/nf`](https://github.com/nf-core/modules/tree/master/nf) for examples.
+If you decide to upload your module file to `nf-core/modules` then this will ensure that it will be automatically downloaded, and available at run-time to all nf-core pipelines, and to everyone within the Nextflow community! See [`nf-core/modules/software`](https://github.com/nf-core/modules/tree/master/software) for examples.
 
 The definition and standards for module files are still under discussion amongst the community.
 
 Currently the following points have been agreed on:
 
-- Module files should only define inputs/outputs as parameters and have the ability to use `params.MODULENAME_options` as an additional parameter to add any additional settings via pipelines.
-- Single-end boolean values should be specified within the input channel and not inferred from the data e.g. [here](https://github.com/nf-core/tools/blob/028a9b3f9d1ad044e879a1de13d3c3a25a06b9a7/nf_core/pipeline-template/%7B%7Bcookiecutter.name_noslash%7D%7D/modules/nf-core/fastqc.nf#L13)
-- Define threads or resources where required for a particular process using `task.cpus`
-- Software that can be piped together should be added to separate module files unless there is a run-time or storage advantage in implementing it this way e.g. `bwa mem | samtools view` to output BAM instead of SAM
-- Process names should be all uppercase
+- Module files should only define inputs/outputs as parameters and have the
+  ability to use `params.MODULENAME_options` as an additional parameter to add
+  any additional settings via pipelines.
+- Single-end boolean values should be specified within the input channel and
+  not inferred from the data e.g.
+  [here](https://github.com/nf-core/tools/blob/028a9b3f9d1ad044e879a1de13d3c3a25a06b9a7/nf_core/pipeline-template/%7B%7Bcookiecutter.name_noslash%7D%7D/modules/nf-core/fastqc.nf#L13)
+- Define threads or resources where required for a particular process using
+  `task.cpus`
+- Software that can be piped together should be added to separate module files
+  unless there is a run-time or storage advantage in implementing it this way
+  e.g. `bwa mem | samtools view` to output BAM instead of SAM
+- Process names should be all uppercase
 - The `publishDirMode` should be configurable
-- Test data is stored within this repo. Re-use generic files from `tests/data` by
-  symlinking them into the test directory of the module. Add specific files
-  to the test directory directly. Keep test files as tiny as possible.
+- Test data is stored within this repo. Re-use generic files
+  from `tests/data` by symlinking them into the test directory of the module.
+  Add specific files to the test directory directly. Keep test files as tiny as
+  possible.
+- Software requirements should be declared in a conda `environment.yml` file,
+  including exact version numbers. Additionally, there should be a `Dockerfile`
+  that containerizes the environment.
+- Each process should emit a file `TOOL.version.txt` containing a single line
+  with the software's version in the format `vX.X.X`.
 
 ### Testing
 
@@ -80,7 +93,7 @@ Please add some documentation to the top of the module file in the form of nativ
 
 ### Uploading to `nf-core/modules`
 
-[Fork](https://help.github.com/articles/fork-a-repo/) the `nf-core/modules` repository to your own GitHub account. Within the local clone of your fork add the module file to the [`nf-core/modules/nf`](https://github.com/nf-core/modules/tree/master/nf) directory. Please keep the naming consistent between the module and documentation files e.g. `bwa.nf` and `bwa.md`, respectively.
+[Fork](https://help.github.com/articles/fork-a-repo/) the `nf-core/modules` repository to your own GitHub account. Within the local clone of your fork add the module file to the [`nf-core/modules/software`](https://github.com/nf-core/modules/tree/master/software) directory. Please keep the naming consistent between the module and documentation files e.g. `bwa.nf` and `bwa.md`, respectively.
 
 Commit and push these changes to your local clone on GitHub, and then [create a pull request](https://help.github.com/articles/creating-a-pull-request-from-a-fork/) on the `nf-core/modules` GitHub repo with the appropriate information.
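The software-pinning and version-reporting requirements introduced in this patch can be pictured with a short sketch. The `envs/environment.yml` location and the `sed` clean-up are assumptions for illustration; the patch itself only prescribes a pinned conda `environment.yml`, a `Dockerfile`, and a `TOOL.version.txt` file in `vX.X.X` format.

```nextflow
// Hypothetical sketch: a process that pins its software and reports its version
process SAMTOOLS_SORT {
    // assumed path to the module's pinned conda environment
    conda "envs/environment.yml"

    input:
    tuple val(name), path(bam)

    output:
    path "*.sorted.bam"
    path "samtools.version.txt"

    script:
    """
    samtools sort -@ $task.cpus -o ${name}.sorted.bam $bam
    # the first line of 'samtools --version' is e.g. 'samtools 1.10' -> 'v1.10'
    echo "v\$(samtools --version | head -n1 | sed 's/samtools //')" > samtools.version.txt
    """
}
```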
From 8f3718795b9999b80c38073484d751abb05bef16 Mon Sep 17 00:00:00 2001
From: Gregor Sturm
Date: Tue, 14 Jul 2020 12:14:16 +0200
Subject: [PATCH 3/4] Fix markdownlint

Move markdownlint.yml to the root directory. That way it is automatically
discovered and applied by most IDEs.

---
 .github/workflows/lint-code.yml               |  8 ++--
 .github/markdownlint.yml => .markdownlint.yml |  0
 README.md                                     | 43 +++++++------------
 3 files changed, 19 insertions(+), 32 deletions(-)
 rename .github/markdownlint.yml => .markdownlint.yml (100%)

diff --git a/.github/workflows/lint-code.yml b/.github/workflows/lint-code.yml
index fb9385c7..7e366a4b 100644
--- a/.github/workflows/lint-code.yml
+++ b/.github/workflows/lint-code.yml
@@ -9,13 +9,13 @@ jobs:
 
       - uses: actions/setup-node@v1
         with:
-          node-version: '10'
+          node-version: "10"
 
       - name: Install markdownlint
         run: npm install -g markdownlint-cli
 
       - name: Run Markdownlint
-        run: markdownlint ${GITHUB_WORKSPACE} -c ${GITHUB_WORKSPACE}/.github/markdownlint.yml
+        run: markdownlint ${GITHUB_WORKSPACE} -c ${GITHUB_WORKSPACE}/.markdownlint.yml
 
   EditorConfig:
     runs-on: ubuntu-latest
@@ -24,7 +24,7 @@ jobs:
 
       - uses: actions/setup-node@v1
         with:
-          node-version: '10'
+          node-version: "10"
 
       - name: Install ECLint
         run: npm install -g eclint
@@ -41,7 +41,7 @@ jobs:
       - name: Install NodeJS
         uses: actions/setup-node@v1
         with:
-          node-version: '10'
+          node-version: "10"
 
       - name: Install yaml-lint
         run: npm install -g yaml-lint
diff --git a/.github/markdownlint.yml b/.markdownlint.yml
similarity index 100%
rename from .github/markdownlint.yml
rename to .markdownlint.yml
diff --git a/README.md b/README.md
index 33608145..d8fc9553 100644
--- a/README.md
+++ b/README.md
@@ -7,21 +7,21 @@ A repository for hosting nextflow [`DSL2`](https://www.nextflow.io/docs/edge/dsl
 ## Table of contents
 
 - [Using existing modules](#using-existing-modules)
-    - [Configuration and parameters](#configuration-and-parameters)
-    - [Offline usage](#offline-usage)
+  - [Configuration and parameters](#configuration-and-parameters)
+  - [Offline usage](#offline-usage)
 - [Adding a new module file](#adding-a-new-module-file)
-    - [Testing](#testing)
-    - [Documentation](#documentation)
-    - [Uploading to `nf-core/modules`](#uploading-to-nf-coremodules)
+  - [Testing](#testing)
+  - [Documentation](#documentation)
+  - [Uploading to `nf-core/modules`](#uploading-to-nf-coremodules)
 - [Help](#help)
 
 ## Terminology
 
 The features offered by Nextflow DSL 2 can be used in various ways depending on the granularity with which you would like to write pipelines. Please see the listing below for the hierarchy and associated terminology we have decided to use when referring to DSL 2 components:
 
-- _Module_: A `process` that can be used within different pipelines and is as atomic as possible, i.e. it cannot be split into another module. An example of this would be a module file containing the process definition for a single tool such as `FastQC`. This repository has been created to only host atomic module files that should be added to the `tools` sub-directory along with the required documentation, software and tests.
-- _Sub-workflow_: A chain of multiple modules that offer a higher level of functionality within the context of a pipeline. For example, a sub-workflow to run multiple QC tools with FastQ files as input. Sub-workflows should be shipped with the pipeline implementation and, if required, shared amongst different pipelines directly from there. As it stands, this repository will not host sub-workflows.
-- _Workflow_: What DSL 1 users would consider an end-to-end pipeline. For example, from one or more inputs to a series of outputs. This can either be implemented using a large monolithic script as with DSL 1, or by using a combination of DSL 2 individual modules and sub-workflows.
+- *Module*: A `process` that can be used within different pipelines and is as atomic as possible, i.e. it cannot be split into another module. An example of this would be a module file containing the process definition for a single tool such as `FastQC`. This repository has been created to only host atomic module files that should be added to the `tools` sub-directory along with the required documentation, software and tests.
+- *Sub-workflow*: A chain of multiple modules that offer a higher level of functionality within the context of a pipeline. For example, a sub-workflow to run multiple QC tools with FastQ files as input. Sub-workflows should be shipped with the pipeline implementation and, if required, shared amongst different pipelines directly from there. As it stands, this repository will not host sub-workflows.
+- *Workflow*: What DSL 1 users would consider an end-to-end pipeline. For example, from one or more inputs to a series of outputs. This can either be implemented using a large monolithic script as with DSL 1, or by using a combination of DSL 2 individual modules and sub-workflows.
 
 ## Using existing modules
 
@@ -60,28 +60,15 @@ The definition and standards for module files are still under discussion amongst
 
 Currently the following points have been agreed on:
 
-- Module files should only define inputs/outputs as parameters and have the
-  ability to use `params.MODULENAME_options` as an additional parameter to add
-  any additional settings via pipelines.
-- Single-end boolean values should be specified within the input channel and
-  not inferred from the data e.g.
-  [here](https://github.com/nf-core/tools/blob/028a9b3f9d1ad044e879a1de13d3c3a25a06b9a7/nf_core/pipeline-template/%7B%7Bcookiecutter.name_noslash%7D%7D/modules/nf-core/fastqc.nf#L13)
-- Define threads or resources where required for a particular process using
-  `task.cpus`
-- Software that can be piped together should be added to separate module files
-  unless there is a run-time or storage advantage in implementing it this way
-  e.g. `bwa mem | samtools view` to output BAM instead of SAM
-- Process names should be all uppercase
+- Module files should only define inputs/outputs as parameters and have the ability to use `params.MODULENAME_options` as an additional parameter to add any additional settings via pipelines.
+- Single-end boolean values should be specified within the input channel and not inferred from the data e.g. [here](https://github.com/nf-core/tools/blob/028a9b3f9d1ad044e879a1de13d3c3a25a06b9a7/nf_core/pipeline-template/%7B%7Bcookiecutter.name_noslash%7D%7D/modules/nf-core/fastqc.nf#L13)
+- Define threads or resources where required for a particular process using `task.cpus`
+- Software that can be piped together should be added to separate module files unless there is a run-time or storage advantage in implementing it this way e.g. `bwa mem | samtools view` to output BAM instead of SAM
+- Process names should be all uppercase
 - The `publishDirMode` should be configurable
-- Test data is stored within this repo. Re-use generic files
-  from `tests/data` by symlinking them into the test directory of the module.
-  Add specific files to the test directory directly. Keep test files as tiny as
-  possible.
-- Software requirements should be declared in a conda `environment.yml` file,
-  including exact version numbers. Additionally, there should be a `Dockerfile`
-  that containerizes the environment.
-- Each process should emit a file `TOOL.version.txt` containing a single line
-  with the software's version in the format `vX.X.X`.
+- Test data is stored within this repo. Re-use generic files from `tests/data` by symlinking them into the test directory of the module. Add specific files to the test directory directly. Keep test files as tiny as possible.
+- Software requirements should be declared in a conda `environment.yml` file, including exact version numbers. Additionally, there should be a `Dockerfile` that containerizes the environment.
+- Each process should emit a file `TOOL.version.txt` containing a single line with the software's version in the format `vX.X.X`.
 
 ### Testing
 
From 7121879714724ada76e8760e2ec2f6343a4fdae7 Mon Sep 17 00:00:00 2001
From: Gregor Sturm
Date: Tue, 14 Jul 2020 16:13:05 +0200
Subject: [PATCH 4/4] Add: All outputs should be named

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index d8fc9553..20932b31 100644
--- a/README.md
+++ b/README.md
@@ -65,10 +65,11 @@ Currently the following points have been agreed on:
 - Define threads or resources where required for a particular process using `task.cpus`
 - Software that can be piped together should be added to separate module files unless there is a run-time or storage advantage in implementing it this way e.g. `bwa mem | samtools view` to output BAM instead of SAM
 - Process names should be all uppercase
-- The `publishDirMode` should be configurable
+- The `publishDirMode` should be configurable via `params.publish_dir_mode`
 - Test data is stored within this repo. Re-use generic files from `tests/data` by symlinking them into the test directory of the module. Add specific files to the test directory directly. Keep test files as tiny as possible.
 - Software requirements should be declared in a conda `environment.yml` file, including exact version numbers. Additionally, there should be a `Dockerfile` that containerizes the environment.
 - Each process should emit a file `TOOL.version.txt` containing a single line with the software's version in the format `vX.X.X`.
+- All outputs should be named
 
 ### Testing
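Taken together, the last two patches make the publish mode configurable through `params.publish_dir_mode` and require every output to be named. A sketch of how both points might look in one module follows; the `params.outdir` convention, the `emit` labels and the index layout are assumptions, not part of the agreed list.

```nextflow
// Hypothetical sketch combining a configurable publish mode with named outputs
process BWA_MEM {
    // the publish mode is supplied by the pipeline, e.g. 'copy' or 'symlink'
    publishDir "${params.outdir}/bwa", mode: params.publish_dir_mode

    input:
    tuple val(name), val(single_end), path(reads)
    path index // assumed: directory containing 'genome.fa' and its BWA index files

    output:
    // named outputs let callers select channels explicitly, e.g. BWA_MEM.out.bam
    path "*.bam", emit: bam
    path "bwa.version.txt", emit: version

    script:
    """
    # bwa mem piped into samtools view: the agreed exception for piped tools
    bwa mem -t $task.cpus $index/genome.fa $reads | samtools view -b - > ${name}.bam
    echo "v\$(bwa 2>&1 | grep '^Version' | sed 's/Version: //')" > bwa.version.txt
    """
}
```

A pipeline can then wire `BWA_MEM.out.bam` into downstream processes and choose the publish mode at run time, e.g. `--publish_dir_mode copy`.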