Mirror of https://github.com/MillironX/nf-core_modules.git (synced 2024-11-13 13:23:09 +00:00)

commit 06a8ac9b72
Merge branch 'nf-core:master' into master

1069 changed files with 2378 additions and 1790 deletions
.github/workflows/nf-core-linting.yml (vendored) | 2
@@ -16,7 +16,7 @@ jobs:
       - uses: dorny/paths-filter@v2
         id: filter
         with:
-          filters: "tests/config/pytest_software.yml"
+          filters: "tests/config/pytest_modules.yml"
 
   lint:
     runs-on: ubuntu-20.04
.github/workflows/pytest-workflow.yml (vendored) | 8
@@ -14,7 +14,7 @@ jobs:
       - uses: dorny/paths-filter@v2
         id: filter
         with:
-          filters: 'tests/config/pytest_software.yml'
+          filters: "tests/config/pytest_modules.yml"
 
   test:
     runs-on: ubuntu-20.04
@@ -25,9 +25,9 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        nxf_version: ['21.04.0']
-        tags: ['${{ fromJson(needs.changes.outputs.modules) }}']
-        profile: ['docker', 'singularity', 'conda']
+        nxf_version: ["21.04.0"]
+        tags: ["${{ fromJson(needs.changes.outputs.modules) }}"]
+        profile: ["docker", "singularity", "conda"]
     env:
       NXF_ANSI_LOG: false
     steps:
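The `strategy.matrix` in the hunk above fans the test job out over every combination of Nextflow version, module tag and container profile. As a rough illustration only, the expansion can be sketched in Python as a Cartesian product. The `tags` values below are hypothetical stand-ins; in the real workflow they come from `needs.changes.outputs.modules` at run time.

```python
from itertools import product

# Axes copied from the workflow hunk above. `tags` is a hypothetical example;
# the real list is produced by the `changes` job at run time.
matrix = {
    "nxf_version": ["21.04.0"],
    "tags": ["fastqc", "bwa/mem"],
    "profile": ["docker", "singularity", "conda"],
}


def expand(matrix):
    """Return one dict per job: the Cartesian product of all matrix axes."""
    keys = list(matrix)
    return [dict(zip(keys, combo)) for combo in product(*matrix.values())]


jobs = expand(matrix)
for job in jobs:
    print(job)
```

With `fail-fast: false`, a failure in one of these combinations does not cancel the others, which is why each module/profile pair reports independently.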
README.md | 86
@@ -34,9 +34,9 @@ A repository for hosting [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl
 
 The module files hosted in this repository define a set of processes for software tools such as `fastqc`, `bwa`, `samtools` etc. This allows you to share and add common functionality across multiple pipelines in a modular fashion.
 
-We have written a helper command in the `nf-core/tools` package that uses the GitHub API to obtain the relevant information for the module files present in the [`software/`](software/) directory of this repository. This includes using `git` commit hashes to track changes for reproducibility purposes, and to download and install all of the relevant module files.
+We have written a helper command in the `nf-core/tools` package that uses the GitHub API to obtain the relevant information for the module files present in the [`modules/`](modules/) directory of this repository. This includes using `git` commit hashes to track changes for reproducibility purposes, and to download and install all of the relevant module files.
 
-1. Install the latest version of [`nf-core/tools`](https://github.com/nf-core/tools#installation) (`>=1.13`)
+1. Install the latest version of [`nf-core/tools`](https://github.com/nf-core/tools#installation) (`>=2.0`)
 2. List the available modules:
 
 ```console
@@ -48,7 +48,7 @@ We have written a helper command in the `nf-core/tools` package that uses the Gi
 | \| | \__, \__/ | \ |___ \`-._,-`-,
 `._,._,'
 
-nf-core/tools version 1.13
+nf-core/tools version 2.0
 
 INFO Modules available from nf-core/modules (master): pipeline_modules.py:164
 
@@ -73,10 +73,10 @@ We have written a helper command in the `nf-core/tools` package that uses the Gi
 | \| | \__, \__/ | \ |___ \`-._,-`-,
 `._,._,'
 
-nf-core/tools version 1.13
+nf-core/tools version 2.0
 
 INFO Installing fastqc pipeline_modules.py:213
-INFO Downloaded 3 files to ./modules/nf-core/software/fastqc pipeline_modules.py:236
+INFO Downloaded 3 files to ./modules/nf-core/modules/fastqc pipeline_modules.py:236
 ```
 
 4. Import the module in your Nextflow script:
@@ -86,7 +86,7 @@ We have written a helper command in the `nf-core/tools` package that uses the Gi
 
 nextflow.enable.dsl = 2
 
-include { FASTQC } from './modules/nf-core/software/fastqc/main' addParams( options: [:] )
+include { FASTQC } from './modules/nf-core/modules/fastqc/main' addParams( options: [:] )
 ```
 
 5. Remove the module from the pipeline repository if required:
@@ -100,7 +100,7 @@ We have written a helper command in the `nf-core/tools` package that uses the Gi
 | \| | \__, \__/ | \ |___ \`-._,-`-,
 `._,._,'
 
-nf-core/tools version 1.13
+nf-core/tools version 2.0
 
 INFO Removing fastqc pipeline_modules.py:271
 INFO Successfully removed fastqc pipeline_modules.py:285
@@ -117,7 +117,7 @@ We have written a helper command in the `nf-core/tools` package that uses the Gi
 | \| | \__, \__/ | \ |___ \`-._,-`-,
 `._,._,'
 
-nf-core/tools version 1.13
+nf-core/tools version 2.0
 
 INFO Linting pipeline: . lint.py:104
 INFO Linting module: fastqc lint.py:106
@@ -128,7 +128,7 @@ We have written a helper command in the `nf-core/tools` package that uses the Gi
 ╭──────────────┬───────────────────────────────┬──────────────────────────────────╮
 │ Module name │ Test message │ File path │
 ├──────────────┼───────────────────────────────┼──────────────────────────────────┤
-│ fastqc │ Local copy of module outdated │ modules/nf-core/software/fastqc/ │
+│ fastqc │ Local copy of module outdated │ modules/nf-core/modules/fastqc/ │
 ╰──────────────┴────────────────────────────── ┴──────────────────────────────────╯
 ╭──────────────────────╮
 │ LINT RESULTS SUMMARY │
@@ -146,12 +146,12 @@ We have plans to add other utility commands to help developers install and maint
 If you decide to upload a module to `nf-core/modules` then this will
 ensure that it will become available to all nf-core pipelines,
 and to everyone within the Nextflow community! See
-[`software/`](software)
+[`modules/`](modules)
 for examples.
 
 ### Checklist
 
-Please check that the module you wish to add isn't already on [`nf-core/modules`](https://github.com/nf-core/modules/tree/master/software):
+Please check that the module you wish to add isn't already on [`nf-core/modules`](https://github.com/nf-core/modules/tree/master/modules):
 - Use the [`nf-core modules list`](https://github.com/nf-core/tools#list-modules) command
 - Check [open pull requests](https://github.com/nf-core/modules/pulls)
 - Search [open issues](https://github.com/nf-core/modules/issues)
@@ -165,7 +165,7 @@ If the module doesn't exist on `nf-core/modules`:
 
 We have implemented a number of commands in the `nf-core/tools` package to make it incredibly easy for you to create and contribute your own modules to nf-core/modules.
 
-1. Install the latest version of [`nf-core/tools`](https://github.com/nf-core/tools#installation) (`>=1.13`)
+1. Install the latest version of [`nf-core/tools`](https://github.com/nf-core/tools#installation) (`>=2.0`)
 2. Install [`Nextflow`](https://www.nextflow.io/docs/latest/getstarted.html#installation) (`>=21.04.0`)
 3. Install any of [`Docker`](https://docs.docker.com/engine/installation/), [`Singularity`](https://www.sylabs.io/guides/3.0/user-guide/) or [`Conda`](https://conda.io/miniconda.html)
 4. [Fork and clone this repo locally](#uploading-to-nf-coremodules)
@@ -181,7 +181,7 @@ We have implemented a number of commands in the `nf-core/tools` package to make
 git checkout -b fastqc
 ```
 
-6. Create a module using the [nf-core DSL2 module template](https://github.com/nf-core/tools/blob/master/nf_core/module-template/software/main.nf):
+6. Create a module using the [nf-core DSL2 module template](https://github.com/nf-core/tools/blob/master/nf_core/module-template/modules/main.nf):
 
 ```console
 $ nf-core modules create . --tool fastqc --author @joebloggs --label process_low --meta
@@ -192,36 +192,36 @@ We have implemented a number of commands in the `nf-core/tools` package to make
 | \| | \__, \__/ | \ |___ \`-._,-`-,
 `._,._,'
 
-nf-core/tools version 1.13
+nf-core/tools version 2.0
 
 INFO Using Bioconda package: 'bioconda::fastqc=0.11.9' create.py:130
 INFO Using Docker / Singularity container with tag: 'fastqc:0.11.9--0' create.py:140
 INFO Created / edited following files: create.py:218
-./software/fastqc/functions.nf
-./software/fastqc/main.nf
-./software/fastqc/meta.yml
-./tests/software/fastqc/main.nf
-./tests/software/fastqc/test.yml
-./tests/config/pytest_software.yml
+./modules/fastqc/functions.nf
+./modules/fastqc/main.nf
+./modules/fastqc/meta.yml
+./tests/modules/fastqc/main.nf
+./tests/modules/fastqc/test.yml
+./tests/config/pytest_modules.yml
 ```
 
 All of the files required to add the module to `nf-core/modules` will be created/edited in the appropriate places. The 4 files you will need to change are:
 
-1. [`./software/fastqc/main.nf`](https://github.com/nf-core/modules/blob/master/software/fastqc/main.nf)
+1. [`./modules/fastqc/main.nf`](https://github.com/nf-core/modules/blob/master/modules/fastqc/main.nf)
 
 This is the main script containing the `process` definition for the module. You will see an extensive number of `TODO` statements to help guide you to fill in the appropriate sections and to ensure that you adhere to the guidelines we have set for module submissions.
 
-2. [`./software/fastqc/meta.yml`](https://github.com/nf-core/modules/blob/master/software/fastqc/meta.yml)
+2. [`./modules/fastqc/meta.yml`](https://github.com/nf-core/modules/blob/master/modules/fastqc/meta.yml)
 
 This file will be used to store general information about the module and author details - the majority of which will already be auto-filled. However, you will need to add a brief description of the files defined in the `input` and `output` section of the main script since these will be unique to each module.
 
-3. [`./tests/software/fastqc/main.nf`](https://github.com/nf-core/modules/blob/master/tests/software/fastqc/main.nf)
+3. [`./tests/modules/fastqc/main.nf`](https://github.com/nf-core/modules/blob/master/tests/modules/fastqc/main.nf)
 
 Every module MUST have a test workflow. This file will define one or more Nextflow `workflow` definitions that will be used to unit test the output files created by the module. By default, one `workflow` definition will be added but please feel free to add as many as possible so we can ensure that the module works on different data types / parameters e.g. separate `workflow` for single-end and paired-end data.
 
 Minimal test data required for your module may already exist within this repository, in which case you may just have to change a couple of paths in this file - see the [Test data](#test-data) section for more info and guidelines for adding new standardised data if required.
 
-4. [`./tests/software/fastqc/test.yml`](https://github.com/nf-core/modules/blob/master/tests/software/fastqc/test.yml)
+4. [`./tests/modules/fastqc/test.yml`](https://github.com/nf-core/modules/blob/master/tests/modules/fastqc/test.yml)
 
 This file will contain all of the details required to unit test the main script in the point above using [pytest-workflow](https://pytest-workflow.readthedocs.io/). If possible, any outputs produced by the test workflow(s) MUST be included and listed in this file along with an appropriate check e.g. md5sum. The different test options are listed in the [pytest-workflow docs](https://pytest-workflow.readthedocs.io/en/stable/#test-options).
 
@@ -240,24 +240,24 @@ We have implemented a number of commands in the `nf-core/tools` package to make
 | \| | \__, \__/ | \ |___ \`-._,-`-,
 `._,._,'
 
-nf-core/tools version 1.13
+nf-core/tools version 2.0
 
 
 INFO Press enter to use default values (shown in brackets) or type your own responses test_yml_builder.py:51
 ? Tool name: fastqc
-Test YAML output path (- for stdout) (tests/software/fastqc/test.yml):
-INFO Looking for test workflow entry points: 'tests/software/fastqc/main.nf' test_yml_builder.py:116
+Test YAML output path (- for stdout) (tests/modules/fastqc/test.yml):
+INFO Looking for test workflow entry points: 'tests/modules/fastqc/main.nf' test_yml_builder.py:116
 INFO Building test meta for entry point 'test_fastqc_single_end' test_yml_builder.py:150
 Test name (fastqc test_fastqc_single_end):
-Test command (nextflow run tests/software/fastqc -entry test_fastqc_single_end -c tests/config/nextflow.config):
+Test command (nextflow run tests/modules/fastqc -entry test_fastqc_single_end -c tests/config/nextflow.config):
 Test tags (comma separated) (fastqc,fastqc_single_end):
 Test output folder with results (leave blank to run test):
 ? Choose software profile Singularity
 INFO Setting env var '$PROFILE' to 'singularity' test_yml_builder.py:258
 INFO Running 'fastqc' test with command: test_yml_builder.py:263
-nextflow run tests/software/fastqc -entry test_fastqc_single_end -c tests/config/nextflow.config --outdir /tmp/tmpgbneftf5
+nextflow run tests/modules/fastqc -entry test_fastqc_single_end -c tests/config/nextflow.config --outdir /tmp/tmpgbneftf5
 INFO Test workflow finished! test_yml_builder.py:276
-INFO Writing to 'tests/software/fastqc/test.yml' test_yml_builder.py:293
+INFO Writing to 'tests/modules/fastqc/test.yml' test_yml_builder.py:293
 ```
 
 > NB: See docs for [running tests manually](#running-tests-manually) if you would like to run the tests manually.
@@ -273,7 +273,7 @@ We have implemented a number of commands in the `nf-core/tools` package to make
 | \| | \__, \__/ | \ |___ \`-._,-`-,
 `._,._,'
 
-nf-core/tools version 1.13
+nf-core/tools version 2.0
 
 INFO Linting modules repo: . lint.py:102
 INFO Linting module: fastqc lint.py:106
@@ -284,9 +284,9 @@ We have implemented a number of commands in the `nf-core/tools` package to make
 ╭──────────────┬──────────────────────────────────────────────────────────────┬──────────────────────────────────╮
 │ Module name │ Test message │ File path │
 ├──────────────┼──────────────────────────────────────────────────────────────┼──────────────────────────────────┤
-│ fastqc │ TODO string in meta.yml: #Add a description of the module... │ modules/nf-core/software/fastqc/ │
-│ fastqc │ TODO string in meta.yml: #Add a description and other det... │ modules/nf-core/software/fastqc/ │
-│ fastqc │ TODO string in meta.yml: #Add a description of all of the... │ modules/nf-core/software/fastqc/ │
+│ fastqc │ TODO string in meta.yml: #Add a description of the module... │ modules/nf-core/modules/fastqc/ │
+│ fastqc │ TODO string in meta.yml: #Add a description and other det... │ modules/nf-core/modules/fastqc/ │
+│ fastqc │ TODO string in meta.yml: #Add a description of all of the... │ modules/nf-core/modules/fastqc/ │
 ╰──────────────┴──────────────────────────────────────────────────────────────┴──────────────────────────────────╯
 ╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
 │ [!] 1 Test Failed │
@@ -294,7 +294,7 @@ We have implemented a number of commands in the `nf-core/tools` package to make
 ╭──────────────┬──────────────────────────────────────────────────────────────┬──────────────────────────────────╮
 │ Module name │ Test message │ File path │
 ├──────────────┼──────────────────────────────────────────────────────────────┼──────────────────────────────────┤
-│ fastqc │ 'meta' map not emitted in output channel(s) │ modules/nf-core/software/fastqc/ │
+│ fastqc │ 'meta' map not emitted in output channel(s) │ modules/nf-core/modules/fastqc/ │
 ╰──────────────┴──────────────────────────────────────────────────────────────┴──────────────────────────────────╯
 ╭──────────────────────╮
 │ LINT RESULTS SUMMARY │
@@ -356,7 +356,7 @@ Please follow the steps below to run the tests locally:
 
 3. Install [`pytest-workflow`](https://pytest-workflow.readthedocs.io/en/stable/#installation)
 
-4. Start running your own tests using the appropriate [`tag`](https://github.com/nf-core/modules/blob/3d720a24fd3c766ba56edf3d4e108a1c45d353b2/tests/software/fastqc/test.yml#L3-L5) defined in the `test.yml`:
+4. Start running your own tests using the appropriate [`tag`](https://github.com/nf-core/modules/blob/3d720a24fd3c766ba56edf3d4e108a1c45d353b2/tests/modules/fastqc/test.yml#L3-L5) defined in the `test.yml`:
 
 - Typical command with Docker:
 
@@ -383,7 +383,7 @@ Please follow the steps below to run the tests locally:
 
 ### Uploading to `nf-core/modules`
 
-[Fork](https://help.github.com/articles/fork-a-repo/) the `nf-core/modules` repository to your own GitHub account. Within the local clone of your fork add the module file to the [`software/`](software) directory. Please try and keep PRs as atomic as possible to aid the reviewing process - ideally, one module addition/update per PR.
+[Fork](https://help.github.com/articles/fork-a-repo/) the `nf-core/modules` repository to your own GitHub account. Within the local clone of your fork add the module file to the [`modules/`](modules) directory. Please try and keep PRs as atomic as possible to aid the reviewing process - ideally, one module addition/update per PR.
 
 Commit and push these changes to your local clone on GitHub, and then [create a pull request](https://help.github.com/articles/creating-a-pull-request-from-a-fork/) on the `nf-core/modules` GitHub repo with the appropriate information.
 
@@ -395,6 +395,8 @@ The key words "MUST", "MUST NOT", "SHOULD", etc. are to be interpreted as descri
 
 #### General
 
+- All non-mandatory command-line tool options MUST be provided as a string i.e. `options.args` where `options` is a Groovy Map that MUST be provided via the Nextflow `addParams` option when including the module via `include` in the parent workflow.
+
 - Software that can be piped together SHOULD be added to separate module files
 unless there is a run-time, storage advantage in implementing in this way. For example,
 using a combination of `bwa` and `samtools` to output a BAM file instead of a SAM file:
@@ -413,13 +415,13 @@ using a combination of `bwa` and `samtools` to output a BAM file instead of a SA
 echo \$(bwa 2>&1) | sed 's/^.*Version: //; s/Contact:.*\$//' > ${software}.version.txt
 ```
 
-If the software is unable to output a version number on the command-line then a variable called `VERSION` can be manually specified to create this file e.g. [homer/annotatepeaks module](https://github.com/nf-core/modules/blob/master/software/homer/annotatepeaks/main.nf).
+If the software is unable to output a version number on the command-line then a variable called `VERSION` can be manually specified to create this file e.g. [homer/annotatepeaks module](https://github.com/nf-core/modules/blob/master/modules/homer/annotatepeaks/main.nf).
 
 - The process definition MUST NOT contain a `when` statement.
 
 #### Naming conventions
 
-- The directory structure for the module name must be all lowercase e.g. [`software/bwa/mem/`](software/bwa/mem/). The name of the software (i.e. `bwa`) and tool (i.e. `mem`) MUST be all one word.
+- The directory structure for the module name must be all lowercase e.g. [`modules/bwa/mem/`](modules/bwa/mem/). The name of the software (i.e. `bwa`) and tool (i.e. `mem`) MUST be all one word.
 
 - The process name in the module file MUST be all uppercase e.g. `process BWA_MEM {`. The name of the software (i.e. `BWA`) and tool (i.e. `MEM`) MUST be all one word separated by an underscore.
 
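The naming conventions in the hunk above map a lowercase module directory such as `bwa/mem` to an uppercase process name such as `BWA_MEM`. A minimal Python sketch of that mapping follows. The function name `process_name` and the exact "one word" pattern (lowercase letters and digits only) are assumptions made here for illustration; they are not the checks that `nf-core modules lint` performs.

```python
import re

# Assumed reading of "all lowercase" and "all one word": letters and digits only.
_COMPONENT = re.compile(r"^[a-z0-9]+$")


def process_name(module_dir: str) -> str:
    """Derive the process name from a module directory like 'bwa/mem'.

    Hypothetical helper: accepts '<software>' or '<software>/<tool>' and
    returns the upper-case, underscore-separated name.
    """
    parts = module_dir.strip("/").split("/")
    if not 1 <= len(parts) <= 2:
        raise ValueError(f"expected '<software>' or '<software>/<tool>': {module_dir!r}")
    for part in parts:
        if not _COMPONENT.match(part):
            raise ValueError(f"component must be one lowercase word: {part!r}")
    return "_".join(part.upper() for part in parts)


print(process_name("bwa/mem"))
print(process_name("fastqc"))
```

Rejecting anything that is not a single lowercase word keeps the directory name and the process name derivable from each other in both directions.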
@@ -431,7 +433,7 @@ using a combination of `bwa` and `samtools` to output a BAM file instead of a SA
 
 - A module file SHOULD only define input and output files as command-line parameters to be executed within the process.
 
-- All other parameters MUST be provided as a string i.e. `options.args` where `options` is a Groovy Map that MUST be provided via the Nextflow `addParams` option when including the module via `include` in the parent workflow.
+- All `params` within the module MUST be initialised and used in the local context of the module. In other words, named `params` defined in the parent workflow MUST NOT be assumed to be passed to the module to allow developers to call their parameters whatever they want. In general, it may be more suitable to use additional `input` value channels to cater for such scenarios.
 
 - If the tool supports multi-threading then you MUST provide the appropriate parameter using the Nextflow `task` variable e.g. `--threads $task.cpus`.
 
@@ -514,7 +516,7 @@ publishDir "${params.outdir}",
 saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), publish_id:meta.id) }
 ```
 
-The `saveFiles` function can be found in the [`functions.nf`](software/fastqc/functions.nf) file of utility functions that will be copied into all module directories. It uses the various publishing `options` specified as input to the module to construct and append the relevant output path to `params.outdir`.
+The `saveFiles` function can be found in the [`functions.nf`](modules/fastqc/functions.nf) file of utility functions that will be copied into all module directories. It uses the various publishing `options` specified as input to the module to construct and append the relevant output path to `params.outdir`.
 
 We also use a standardised parameter called `params.publish_dir_mode` that can be used to alter the file publishing method (default: `copy`).
 
@@ -522,7 +524,7 @@ We also use a standardised parameter called `params.publish_dir_mode` that can b
 
 The features offered by Nextflow DSL2 can be used in various ways depending on the granularity with which you would like to write pipelines. Please see the listing below for the hierarchy and associated terminology we have decided to use when referring to DSL2 components:
 
-- *Module*: A `process` that can be used within different pipelines and is as atomic as possible i.e. cannot be split into another module. An example of this would be a module file containing the process definition for a single tool such as `FastQC`. At present, this repository has been created to only host atomic module files that should be added to the [`software/`](software/) directory along with the required documentation and tests.
+- *Module*: A `process` that can be used within different pipelines and is as atomic as possible i.e. cannot be split into another module. An example of this would be a module file containing the process definition for a single tool such as `FastQC`. At present, this repository has been created to only host atomic module files that should be added to the [`modules/`](modules/) directory along with the required documentation and tests.
 
 - *Sub-workflow*: A chain of multiple modules that offer a higher-level of functionality within the context of a pipeline. For example, a sub-workflow to run multiple QC tools with FastQ files as input. Sub-workflows should be shipped with the pipeline implementation and if required they should be shared amongst different pipelines directly from there. As it stands, this repository will not host sub-workflows although this may change in the future since well-written sub-workflows will be the most powerful aspect of DSL2.
 
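Most of the README hunks above are the same mechanical rename of the `software/` directory to `modules/`, plus the `pytest_software.yml` to `pytest_modules.yml` config rename. A pipeline that still includes modules from the old layout would need a similar rewrite. The Python sketch below is one hedged way to do it; `migrate_paths` is a hypothetical helper rather than an `nf-core/tools` command, and it only handles paths written with a trailing slash after `software`.

```python
import re

# "software/" as a whole path component: not preceded by a word character
# or a hyphen, so a name such as "my_software/" is left alone.
_SOFTWARE_DIR = re.compile(r"(?<![\w-])software/")


def migrate_paths(text: str) -> str:
    """Rewrite old `software/` paths and the pytest config name to the new layout."""
    text = _SOFTWARE_DIR.sub("modules/", text)
    return text.replace("pytest_software.yml", "pytest_modules.yml")


old_include = (
    "include { FASTQC } from './modules/nf-core/software/fastqc/main' "
    "addParams( options: [:] )"
)
print(migrate_paths(old_include))
```

Matching on the whole path component rather than the bare word avoids touching prose such as "software tools" or identifiers such as `getSoftwareName`.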
@@ -1,38 +0,0 @@
-nextflow.preview.dsl=2
-
-process FASTQ_SCREEN {
-
-    publishDir "$outputdir",
-        mode: "link", overwrite: true
-
-    // depending on the number of genomes and the type of genome (e.g. plants!), memory needs to be ample!
-    // label 'bigMem'
-    // label 'multiCore'
-
-    input:
-        tuple val(name), path(reads)
-        val outputdir
-        // fastq_screen_args are best passed in to the workflow in the following manner:
-        // --fastq_screen_args="--subset 200000 --force"
-        val fastq_screen_args
-        val verbose
-
-    output:
-        path "*png", emit: png
-        path "*html", emit: html
-        path "*txt", emit: report
-
-    script:
-        println(name)
-        println(reads)
-        println(outputdir)
-        if (verbose){
-            println ("[MODULE] FASTQ SCREEN ARGS: "+ fastq_screen_args)
-        }
-
-        """
-        module load fastq_screen
-        fastq_screen $fastq_screen_args $reads
-        """
-
-}
@@ -1,31 +0,0 @@
-name: FastQ Screen
-description: Run FastQ Screen on sequenced reads for Species Identification
-keywords:
-  - Quality Control
-  - Species Screen
-  - Contamination
-tools:
-  - fastqc:
-      description: |
-        FastQ Screen allows you to screen a library of sequences in
-        FastQ format against a set of sequence databases so you can
-        see if the composition of the library matches with what you expect.
-      homepage: https://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/
-      documentation: https://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/_build/html/index.html
-input:
-  -
-    - sample_id:
-        type: string
-        description: Sample identifier
-    - reads:
-        type: file
-        description: Input FastQ file
-output:
-  -
-    - report:
-        type: file
-        description: FastQ Screen report
-        pattern: "*_screen.{txt,html,png}"
-        optional_pattern: "*_screen.bisulfite_orientation.png"
-authors:
-  - "@FelixKrueger"
@@ -1 +0,0 @@
-../../../../tests/data/fastq/rna/test_R1.fastq.gz
@@ -1 +0,0 @@
-../../../../tests/data/fastq/rna/test_R1_val_1.fq.gz
@@ -1 +0,0 @@
-../../../../tests/data/fastq/rna/test_R2.fastq.gz
@@ -1 +0,0 @@
-../../../../tests/data/fastq/rna/test_R2_val_2.fq.gz
@@ -1 +0,0 @@
-../../../../tests/data/fastq/rna/test_single_end.fastq.gz
@@ -1,30 +0,0 @@
-#!/usr/bin/env nextflow
-nextflow.preview.dsl = 2
-
-params.outdir = "."
-params.fastq_screen_args = ''
-// fastq_screen_args are best passed in to the workflow in the following manner:
-// --fastq_screen_args="--subset 200000 --force"
-
-params.verbose = false
-
-if (params.verbose){
-    println ("[WORKFLOW] FASTQ SCREEN ARGS ARE: " + params.fastq_screen_args)
-}
-
-// TODO: include '../../../tests/functions/check_process_outputs.nf'
-include '../main.nf'
-
-// Define input channels
-
-ch_read_files = Channel
-    .fromFilePairs('../../../test-datasets/Ecoli*{1,2}.fastq.gz',size:-1)
-    // .view() // to check whether the input channel works
-
-// Run the workflow
-workflow {
-    main:
-        FASTQ_SCREEN(ch_read_files, params.outdir, params.fastq_screen_args, params.verbose)
-
-        // TODO .check_output()
-}
@@ -1,2 +0,0 @@
-// docker.enabled = true
-params.outdir = './results'
@@ -1,31 +0,0 @@
-#Fastq_screen version: 0.14.0 #Aligner: bowtie2 #Reads in subset: 100000
-Genome #Reads_processed #Unmapped %Unmapped #One_hit_one_genome %One_hit_one_genome #Multiple_hits_one_genome %Multiple_hits_one_genome #One_hit_multiple_genomes %One_hit_multiple_genomes Multiple_hits_multiple_genomes %Multiple_hits_multiple_genomes
-Cat 10000 9171 91.71 0 0.00 0 0.00 421 4.21 408 4.08
-Chicken 10000 8932 89.32 0 0.00 0 0.00 64 0.64 1004 10.04
-Cow 10000 8484 84.84 0 0.00 0 0.00 294 2.94 1222 12.22
-Drosophila 10000 9469 94.69 0 0.00 0 0.00 19 0.19 512 5.12
-Human 10000 8367 83.67 2 0.02 3 0.03 354 3.54 1274 12.74
-Mouse 10000 122 1.22 3265 32.65 869 8.69 2066 20.66 3678 36.78
-Pig 10000 8459 84.59 0 0.00 0 0.00 334 3.34 1207 12.07
-Rat 10000 6432 64.32 1 0.01 3 0.03 1334 13.34 2230 22.30
-Zebrafish 10000 9125 91.25 0 0.00 0 0.00 41 0.41 834 8.34
-Arabidopsis 10000 9497 94.97 0 0.00 0 0.00 5 0.05 498 4.98
-Grape 10000 9600 96.00 0 0.00 1 0.01 82 0.82 317 3.17
-Potato 10000 9460 94.60 0 0.00 0 0.00 12 0.12 528 5.28
-Tomato 10000 9521 95.21 0 0.00 0 0.00 45 0.45 434 4.34
-Adapters 10000 10000 100.00 0 0.00 0 0.00 0 0.00 0 0.00
-Brachybacterium 10000 10000 100.00 0 0.00 0 0.00 0 0.00 0 0.00
-Pseudomonas 10000 10000 100.00 0 0.00 0 0.00 0 0.00 0 0.00
-Massilia_oculi 10000 9999 99.99 0 0.00 1 0.01 0 0.00 0 0.00
-Ecoli 10000 9998 99.98 1 0.01 1 0.01 0 0.00 0 0.00
-Lambda 10000 10000 100.00 0 0.00 0 0.00 0 0.00 0 0.00
-MT 10000 7856 78.56 0 0.00 0 0.00 2034 20.34 110 1.10
-PhiX 10000 10000 100.00 0 0.00 0 0.00 0 0.00 0 0.00
-rRNA 10000 9157 91.57 0 0.00 0 0.00 111 1.11 732 7.32
-Wasp 10000 9473 94.73 0 0.00 0 0.00 211 2.11 316 3.16
-Vectors 10000 9713 97.13 0 0.00 0 0.00 52 0.52 235 2.35
-Worm 10000 9645 96.45 0 0.00 0 0.00 13 0.13 342 3.42
-Yeast 10000 9507 95.07 0 0.00 0 0.00 4 0.04 489 4.89
-Mycoplasma 10000 9998 99.98 0 0.00 0 0.00 0 0.00 2 0.02
-
-%Hit_no_genomes: 0.88
@@ -1,34 +0,0 @@
-import java.security.MessageDigest
-private static String getMD5(File file) throws IOException
-{
-    // https://howtodoinjava.com/java/io/how-to-generate-sha-or-md5-file-checksum-hash-in-java/
-    //Get file input stream for reading the file content
-    FileInputStream fis = new FileInputStream(file);
-
-    //Create byte array to read data in chunks
-    byte[] byteArray = new byte[1024];
-    int bytesCount = 0;
-
-    //Read file data and update in message digest
-    def digest = MessageDigest.getInstance("MD5")
-    while ((bytesCount = fis.read(byteArray)) != -1) {
-        digest.update(byteArray, 0, bytesCount);
-    };
-
-    //close the stream; We don't need it now.
-    fis.close();
-
-    //Get the hash's bytes
-    byte[] bytes = digest.digest();
-
-    //This bytes[] has bytes in decimal format;
-    //Convert it to hexadecimal format
-    StringBuilder sb = new StringBuilder();
-    for(int i=0; i< bytes.length ;i++)
-    {
-        sb.append(Integer.toString((bytes[i] & 0xff) + 0x100, 16).substring(1));
-    }
-
-    //return complete hash
-    return sb.toString();
-}
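The helper deleted above reads a file in 1024-byte chunks, feeds each chunk to an MD5 digest and hex-encodes the result, which is the same kind of checksum that the `md5sum` checks in a `test.yml` compare against. For comparison only, a rough Python equivalent is sketched below. The name `file_md5` is invented here and the code is not part of this repository.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Small demonstration on a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_path = tmp.name
print(file_md5(demo_path))
os.remove(demo_path)
```

Reading in chunks keeps memory use constant regardless of file size, which matters for large FastQ or BAM outputs.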
@@ -1,16 +0,0 @@
-process tcoffee {
-    tag "$fasta"
-    publishDir "${params.outdir}/tcoffee"
-    container 'quay.io/biocontainers/t_coffee:11.0.8--py27pl5.22.0_5'
-
-    input:
-    path "$fasta"
-
-    output:
-    path "${fasta}.aln"
-
-    script:
-    """
-    t_coffee -seq $fasta -outfile ${fasta}.aln
-    """
-}
@@ -1,28 +0,0 @@
-name: t-coffee
-description: Run tcofee multiple sequence alignment
-keywords:
-  - MSA
-  - sequence aligment
-tools:
-  - t-coffee:
-      description: |
-        T-Coffee is a multiple sequence alignment package.
-        It uses a progressive approach and a consistency objective
-        function for alignment evaluation.
-      homepage: http://www.tcoffee.org/
-      documentation: http://www.tcoffee.org/Projects/tcoffee/index.html#DOCUMENTATION
-input:
-  -
-    - fasta:
-        type: path
-        description: Input fasta file
-        pattern: "*.{fasta,fa,tfa}"
-output:
-  -
-    - alignment:
-        type: file
-        description: tcoffee alignment file
-        pattern: "*.aln"
-
-authors:
-  - "@JoseEspinosa"
@ -1,15 +0,0 @@
-#!/usr/bin/env nextflow
-
-nextflow.preview.dsl = 2
-
-include check_output from '../../../tests/functions/check_process_outputs.nf'
-include tcoffee from '../main.nf'
-
-// Define input channels
-fasta = Channel.fromPath('../../../test-datasets/tools/tcoffee/input/BBA0001.tfa')
-
-// Run the workflow
-workflow {
-    tcoffee(fasta)
-    // .check_output()
-}
@ -1,2 +0,0 @@
-docker.enabled = true
-params.outdir = './results'
55
modules/bedtools/genomecov/main.nf
Normal file
@ -0,0 +1,55 @@
+// Import generic module functions
+include { initOptions; saveFiles; getSoftwareName } from './functions'
+
+params.options = [:]
+options = initOptions(params.options)
+
+process BEDTOOLS_GENOMECOV {
+    tag "$meta.id"
+    label 'process_medium'
+    publishDir "${params.outdir}",
+        mode: params.publish_dir_mode,
+        saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) }
+
+    conda (params.enable_conda ? "bioconda::bedtools=2.30.0" : null)
+    if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) {
+        container "https://depot.galaxyproject.org/singularity/bedtools:2.30.0--hc088bd4_0"
+    } else {
+        container "quay.io/biocontainers/bedtools:2.30.0--hc088bd4_0"
+    }
+
+    input:
+    tuple val(meta), path(intervals)
+    path sizes
+    val extension
+
+    output:
+    tuple val(meta), path("*.${extension}"), emit: genomecov
+    path "*.version.txt" , emit: version
+
+    script:
+    def software = getSoftwareName(task.process)
+    def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}"
+    if (intervals.name =~ /\.bam/) {
+        """
+        bedtools \\
+            genomecov \\
+            -ibam $intervals \\
+            $options.args \\
+            > ${prefix}.${extension}
+
+        bedtools --version | sed -e "s/bedtools v//g" > ${software}.version.txt
+        """
+    } else {
+        """
+        bedtools \\
+            genomecov \\
+            -i $intervals \\
+            -g $sizes \\
+            $options.args \\
+            > ${prefix}.${extension}
+
+        bedtools --version | sed -e "s/bedtools v//g" > ${software}.version.txt
+        """
+    }
+}
@ -15,20 +15,26 @@ input:
       description: |
         Groovy Map containing sample information
         e.g. [ id:'test', single_end:false ]
-  - bam:
+  - intervals:
       type: file
-      description: Input BAM file
-      pattern: "*.{bam}"
+      description: BAM/BED/GFF/VCF
+      pattern: "*.{bam|bed|gff|vcf}"
+  - sizes:
+      type: file
+      description: Tab-delimited table of chromosome names in the first column and chromosome sizes in the second column
+  - extension:
+      type: string
+      description: Extension of the output file (e. g., ".bg", ".bedgraph", ".txt", ".tab", etc.) It is set arbitrarily by the user and corresponds to the file format which depends on arguments.
 output:
   - meta:
       type: map
       description: |
         Groovy Map containing sample information
         e.g. [ id:'test', single_end:false ]
-  - bed:
+  - genomecov:
       type: file
-      description: Computed genomecov bed file
-      pattern: "*.{bed}"
+      description: Computed genome coverage file
+      pattern: "*.${extension}"
   - version:
       type: file
       description: File containing software version
@ -37,3 +43,4 @@ authors:
   - "@Emiller88"
   - "@sruthipsuresh"
   - "@drpatelh"
+  - "@sidorov-si"
@ -19,11 +19,12 @@ process BEDTOOLS_INTERSECT {
     }
 
     input:
-    tuple val(meta), path(bed1), path(bed2)
+    tuple val(meta), path(intervals1), path(intervals2)
+    val extension
 
     output:
-    tuple val(meta), path('*.bed'), emit: bed
-    path '*.version.txt' , emit: version
+    tuple val(meta), path("*.${extension}"), emit: intersect
+    path '*.version.txt' , emit: version
 
     script:
     def software = getSoftwareName(task.process)
@ -31,10 +32,10 @@ process BEDTOOLS_INTERSECT {
     """
     bedtools \\
         intersect \\
-        -a $bed1 \\
-        -b $bed2 \\
+        -a $intervals1 \\
+        -b $intervals2 \\
         $options.args \\
-        > ${prefix}.bed
+        > ${prefix}.${extension}
 
     bedtools --version | sed -e "s/bedtools v//g" > ${software}.version.txt
     """
@ -1,5 +1,5 @@
 name: bedtools_intersect
-description: allows one to screen for overlaps between two sets of genomic features.
+description: Allows one to screen for overlaps between two sets of genomic features.
 keywords:
   - bed
   - intersect
@ -14,24 +14,27 @@ input:
       description: |
         Groovy Map containing sample information
         e.g. [ id:'test', single_end:false ]
-  - bed1:
+  - intervals1:
       type: file
-      description: BED file, each feature in 1 is compared to 2 in search of overlaps
-      pattern: "*.{bed}"
-  - bed2:
+      description: BAM/BED/GFF/VCF
+      pattern: "*.{bam|bed|gff|vcf}"
+  - intervals2:
       type: file
-      description: Second bed file, used to compare to first BED file
-      pattern: "*.{bed}"
+      description: BAM/BED/GFF/VCF
+      pattern: "*.{bam|bed|gff|vcf}"
+  - extension:
+      type: value
+      description: Extension of the output file. It is set by the user and corresponds to the file format which depends on arguments (e. g., ".bed", ".bam", ".txt", etc.).
 output:
   - meta:
       type: map
       description: |
         Groovy Map containing sample information
         e.g. [ id:'test', single_end:false ]
-  - bed:
+  - intersect:
       type: file
-      description: BED file with intersected intervals
-      pattern: "*.{bed}"
+      description: File containing the description of overlaps found between the two features
+      pattern: "*.${extension}"
   - version:
       type: file
       description: File containing software version
@ -40,3 +43,4 @@ authors:
   - "@Emiller88"
   - "@sruthipsuresh"
   - "@drpatelh"
+  - "@sidorov-si"
@ -4,7 +4,7 @@ include { initOptions; saveFiles; getSoftwareName } from './functions'
 params.options = [:]
 options = initOptions(params.options)
 
-process BEDTOOLS_GENOMECOV {
+process BEDTOOLS_SUBTRACT {
     tag "$meta.id"
     label 'process_medium'
     publishDir "${params.outdir}",
@ -19,19 +19,20 @@ process BEDTOOLS_GENOMECOV {
     }
 
     input:
-    tuple val(meta), path(bam)
+    tuple val(meta), path(intervals1), path(intervals2)
 
     output:
     tuple val(meta), path("*.bed"), emit: bed
-    path "*.version.txt" , emit: version
+    path "*.version.txt" , emit: version
 
     script:
     def software = getSoftwareName(task.process)
     def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}"
     """
     bedtools \\
-        genomecov \\
-        -ibam $bam \\
+        subtract \\
+        -a $intervals1 \\
+        -b $intervals2 \\
         $options.args \\
         > ${prefix}.bed
 
45
modules/bedtools/subtract/meta.yml
Normal file
@ -0,0 +1,45 @@
+name: bedtools_subtract
+description: Finds overlaps between two sets of regions (A and B), removes the overlaps from A and reports the remaining portion of A.
+keywords:
+  - bed
+  - gff
+  - vcf
+  - subtract
+tools:
+  - bedtools:
+      description: |
+        A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.
+      documentation: https://bedtools.readthedocs.io/en/latest/content/tools/subtract.html
+
+input:
+  - meta:
+      type: map
+      description: |
+        Groovy Map containing sample information
+        e.g. [ id:'test', single_end:false ]
+  - intervals1:
+      type: file
+      description: BED/GFF/VCF
+      pattern: "*.{bed|gff|vcf}"
+  - intervals2:
+      type: file
+      description: BED/GFF/VCF
+      pattern: "*.{bed|gff|vcf}"
+
+output:
+  - meta:
+      type: map
+      description: |
+        Groovy Map containing sample information
+        e.g. [ id:'test', single_end:false ]
+  - bed:
+      type: file
+      description: File containing the difference between the two sets of features
+      patters: "*.bed"
+  - version:
+      type: file
+      description: File containing software version
+      pattern: "*.{version.txt}"
+
+authors:
+  - "@sidorov-si"
Some files were not shown because too many files have changed in this diff.