nf-core_modules/modules/deepbgc/pipeline/meta.yml
louperelo eae945721d
new module: deepbgc/pipeline (#2014)
* not working yet (db not found)

* modify deeparg/download module to return db-path

* 🪄

* Prettier

* add test.yml

* much prettier

* test.yml delete md5 for pot. empty files

* adapt test.yml

* test.yml again

* Apply suggestions from code review

Co-authored-by: James A. Fellows Yates <jfy133@gmail.com>

Co-authored-by: James A. Fellows Yates <jfy133@gmail.com>
2022-09-07 14:59:01 +02:00

88 lines
2.5 KiB
YAML

name: "deepbgc_pipeline"
description: detect BGCs in bacterial and fungal genomes using deep learning
keywords:
- Biosynthetic Gene Cluster
- deep learning
- neural network
- random forest
- genomes
- bacteria
- fungi
tools:
- "deepbgc":
description: "DeepBGC - Biosynthetic Gene Cluster detection and classification"
homepage: "https://github.com/Merck/deepbgc"
documentation: "https://github.com/Merck/deepbgc"
tool_dev_url: "https://github.com/Merck/deepbgc"
doi: "10.1093/nar/gkz654"
licence: "['MIT']"
input:
- meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test' ]
- genome:
type: file
description: FASTA/GenBank/Pfam CSV file
pattern: "*.{fasta,fa,fna,gbk,csv}"
output:
- meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test']
- versions:
type: file
description: File containing software versions
pattern: "versions.yml"
- readme:
type: file
description: txt file containing description of output files
pattern: "*.{txt}"
- log:
type: file
description: Log output of DeepBGC
pattern: "*.{txt}"
- json:
type: file
description: AntiSMASH JSON file for sideloading.
pattern: "*.{json}"
- bgc_gbk:
type: file
description: Sequences and features of all detected BGCs in GenBank format.
pattern: "*.{bgc.gbk}"
- bgc_tsv:
type: file
description: Table of detected BGCs and their properties.
pattern: "*.{bgc.tsv}"
- full_gbk:
type: file
description: Fully annotated input sequence with proteins, Pfam domains (PFAM_domain features) and BGCs (cluster features)
pattern: "*.{full.gbk}"
- pfam_tsv:
type: file
description: Table of Pfam domains (pfam_id) from given sequence (sequence_id) in genomic order, with BGC detection scores.
pattern: "*.{pfam.tsv}"
- bgc_png:
type: file
description: Detected BGCs plotted by their nucleotide coordinates.
pattern: "*.{bgc.png}"
- pr_png:
type: file
description: Precision-Recall curve based on predicted per-Pfam BGC scores.
pattern: "*.{pr.png}"
- roc_png:
type: file
description: ROC curve based on predicted per-Pfam BGC scores.
pattern: "*.{roc.png}"
- score_png:
type: file
description: BGC detection scores of each Pfam domain in genomic order.
pattern: "*.{score.png}"
authors:
- "@louperelo"
- "@jfy133"