Chromap Module (#659)

* Initialise chromap module

* Revert "Initialise chromap module"

This reverts commit 47c67ae231a6f221ef5b9b7b444b583b5406852b.

* Remake chromap base files with new layout

* Copy chromap

* Copy index

* Add compression

* Update padding

* Update container

* Update chromap input test data

* Add chromap chromap tests

* Add padding

* Update comment

* update yaml file

* Remove TODOs

* Add fasta input to yaml

* Update YAML

* Remove comment, update container

* Remove comments

* Import Chromap index

* Update test.yml

* Fix read input

* Update test.yml

* Add bcftools/concat module. (#641)

* draft for bcftools modules [ci skip]

* initial test for bcftools concat

* Update the params for testing

* fix tests

* Accommodate code review [ci skip]

Co-authored-by: James A. Fellows Yates <jfy133@gmail.com>

* Update the meta file and open PR for review

* Update the keyword

* Update the tags for module [ci skip]

* add threads

Co-authored-by: James A. Fellows Yates <jfy133@gmail.com>

* add module for dragonflye (#633)

* add module for dragonflye

* fix tests for dragonflye

* Update test.yml

* Update meta.yml

* Update main.nf

* Update main.nf

* Update modules/dragonflye/meta.yml

Co-authored-by: Gregor Sturm <mail@gregor-sturm.de>

* update typos. change quote from ' to ". (#652)

* Add bcftools/norm module (#655)

* Initial draft [ci skip]

* trigger first test

* update output file path

* Tests passing

* finishing touches for meta.yml and update checksum

* tweak checksum

* add threads to the module

* skip version info for matching test md5sum [ci skip]

* Add ref fasta and finalize the module

Co-authored-by: Gregor Sturm <mail@gregor-sturm.de>

* Expansionhunter (#666)


* adds expansionhunter module

Co-authored-by: Maxime U. Garcia <maxime.garcia@scilifelab.se>

* Update test.yml (#668)

* Specify in guidelines one should split CPUs when module has n > 1 tool (#660)

* Specify more guidelines on input channels

* Linting

* Updates based on code review

* Update README.md

* Fix broken sentence

* Describe CPU splitting

* Update README.md

Co-authored-by: Gregor Sturm <mail@gregor-sturm.de>

* More CPU examples

Co-authored-by: Gregor Sturm <mail@gregor-sturm.de>

* Add dsh-bio export-segments module (#631)

Co-authored-by: Gregor Sturm <mail@gregor-sturm.de>

* update: `BWA/ALN` (#653)

* Specify more guidelines on input channels

* Linting

* Updates based on code review

* Update README.md

* Fix broken sentence

* Remove reads from output channel following module guidelines. Should do a .join() based on $meta, to reassociate.

Co-authored-by: Gregor Sturm <mail@gregor-sturm.de>

* Update seqwish reported version to match bioconda version. (#678)

* Bbmap index (#683)

BBMap index module

* Initialise chromap module

* Revert "Initialise chromap module"

This reverts commit 47c67ae231a6f221ef5b9b7b444b583b5406852b.

* Remove unnecessary files

* Remove unnecessary files

* Update modules/chromap/index/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update modules/chromap/index/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update modules/chromap/chromap/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update tests/modules/chromap/chromap/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update tests/modules/chromap/chromap/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update tests/modules/chromap/chromap/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update modules/chromap/index/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Remove pytest_software.yml

* Apply suggestions from code review

Co-authored-by: Abhinav Sharma <abhi18av@users.noreply.github.com>
Co-authored-by: James A. Fellows Yates <jfy133@gmail.com>
Co-authored-by: Robert A. Petit III <robbie.petit@gmail.com>
Co-authored-by: Gregor Sturm <mail@gregor-sturm.de>
Co-authored-by: JIANHONG OU <jianhong@users.noreply.github.com>
Co-authored-by: Anders Jemt <jemten@users.noreply.github.com>
Co-authored-by: Maxime U. Garcia <maxime.garcia@scilifelab.se>
Co-authored-by: Michael L Heuer <heuermh@acm.org>
Co-authored-by: Daniel Lundin <daniel.lundin@lnu.se>
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
Mahesh Binzer-Panchal 2021-09-15 18:20:55 +02:00 committed by GitHub
parent 561f16fe74
commit 58134cb929
11 changed files with 529 additions and 0 deletions

@ -0,0 +1,68 @@
//
// Utility functions used in nf-core DSL2 module files
//

//
// Extract name of software tool from process name using $task.process
//
def getSoftwareName(task_process) {
    return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()
}

//
// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules
//
def initOptions(Map args) {
    def Map options = [:]
    options.args = args.args ?: ''
    options.args2 = args.args2 ?: ''
    options.args3 = args.args3 ?: ''
    options.publish_by_meta = args.publish_by_meta ?: []
    options.publish_dir = args.publish_dir ?: ''
    options.publish_files = args.publish_files
    options.suffix = args.suffix ?: ''
    return options
}

//
// Tidy up and join elements of a list to return a path string
//
def getPathFromList(path_list) {
    def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries
    paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and leading/trailing slashes
    return paths.join('/')
}

//
// Function to save/publish module results
//
def saveFiles(Map args) {
    if (!args.filename.endsWith('.version.txt')) {
        def ioptions = initOptions(args.options)
        def path_list = [ ioptions.publish_dir ?: args.publish_dir ]
        if (ioptions.publish_by_meta) {
            def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta
            for (key in key_list) {
                if (args.meta && key instanceof String) {
                    def path = key
                    if (args.meta.containsKey(key)) {
                        path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key]
                    }
                    path = path instanceof String ? path : ''
                    path_list.add(path)
                }
            }
        }
        if (ioptions.publish_files instanceof Map) {
            for (ext in ioptions.publish_files) {
                if (args.filename.endsWith(ext.key)) {
                    def ext_list = path_list.collect()
                    ext_list.add(ext.value)
                    return "${getPathFromList(ext_list)}/$args.filename"
                }
            }
        } else if (ioptions.publish_files == null) {
            return "${getPathFromList(path_list)}/$args.filename"
        }
    }
}

@ -0,0 +1,93 @@
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName } from './functions'

params.options = [:]
options = initOptions(params.options)

def VERSION = 0.1 // No version information printed

process CHROMAP_CHROMAP {
    tag "$meta.id"
    label 'process_medium'
    publishDir "${params.outdir}",
        mode: params.publish_dir_mode,
        saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) }

    conda (params.enable_conda ? "bioconda::chromap=0.1 bioconda::samtools=1.13" : null)
    if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) {
        container "https://depot.galaxyproject.org/singularity/mulled-v2-1f09f39f20b1c4ee36581dc81cc323c70e661633:2cad7c5aa775241887eff8714259714a39baf016-0"
    } else {
        container "quay.io/biocontainers/mulled-v2-1f09f39f20b1c4ee36581dc81cc323c70e661633:2cad7c5aa775241887eff8714259714a39baf016-0"
    }

    input:
    tuple val(meta), path(reads)
    path fasta
    path index
    path barcodes
    path whitelist
    path chr_order
    path pairs_chr_order

    output:
    tuple val(meta), path("*.bed.gz") , optional:true, emit: bed
    tuple val(meta), path("*.bam") , optional:true, emit: bam
    tuple val(meta), path("*.tagAlign.gz"), optional:true, emit: tagAlign
    tuple val(meta), path("*.pairs.gz") , optional:true, emit: pairs
    path "*.version.txt" , emit: version

    script:
    def software = getSoftwareName(task.process)
    def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}"
    def args = options.args.tokenize()

    def file_extension = options.args.contains("--SAM")? 'sam' :
        options.args.contains("--TagAlign")? 'tagAlign' :
        options.args.contains("--pairs")? 'pairs' : 'bed'
    if (barcodes) {
        args << "-b ${barcodes.join(',')}"
        if (whitelist) {
            args << "--barcode-whitelist $whitelist"
        }
    }
    if (chr_order) {
        args << "--chr-order $chr_order"
    }
    if (pairs_chr_order){
        args << "--pairs-natural-chr-order $pairs_chr_order"
    }
    def compression_cmds = """
    gzip ${prefix}.${file_extension}
    """
    if (options.args.contains("--SAM")) {
        compression_cmds = """
        samtools view $options.args2 -@ ${task.cpus} -bh \\
            -o ${prefix}.bam ${prefix}.${file_extension}
        rm ${prefix}.${file_extension}
        samtools --version 2>&1 | sed 's/^.*samtools //; s/Using.*\$//' > ${software}.version.txt
        """
    }
    if (meta.single_end) {
        """
        chromap ${args.join(' ')} \\
            -t $task.cpus \\
            -x $index \\
            -r $fasta \\
            -1 ${reads.join(',')} \\
            -o ${prefix}.${file_extension}
        echo "$VERSION" > ${software}.version.txt
        """ + compression_cmds
    } else {
        """
        chromap ${args.join(' ')} \\
            -t $task.cpus \\
            -x $index \\
            -r $fasta \\
            -1 ${reads[0]} \\
            -2 ${reads[1]} \\
            -o ${prefix}.${file_extension}
        echo "$VERSION" > ${software}.version.txt
        """ + compression_cmds
    }
}

@ -0,0 +1,88 @@
name: chromap_chromap
description: |
  Performs preprocessing and alignment of chromatin fastq files to
  fasta reference files using chromap.
keywords:
  - chromap
  - alignment
  - map
  - fastq
  - bam
  - sam
  - hi-c
  - atac-seq
  - chip-seq
  - trimming
  - duplicate removal
tools:
  - chromap:
      description: Fast alignment and preprocessing of chromatin profiles
      homepage: https://github.com/haowenz/chromap
      documentation: https://github.com/haowenz/chromap
      tool_dev_url: https://github.com/haowenz/chromap
      doi: ""
      licence: ['GPL v3']
input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - reads:
      type: file
      description: |
        List of input FastQ files of size 1 and 2 for single-end and paired-end data,
        respectively.
  - fasta:
      type: file
      description: |
        The fasta reference file.
  - index:
      type: file
      description: |
        Chromap genome index files (*.index)
  - barcodes:
      type: file
      description: |
        Cell barcode files
  - whitelist:
      type: file
      description: |
        Cell barcode whitelist file
  - chr_order:
      type: file
      description: |
        Custom chromosome order
  - pairs_chr_order:
      type: file
      description: |
        Natural chromosome order for pairs flipping
output:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - version:
      type: file
      description: File containing software version
      pattern: "*.{version.txt}"
  - bed:
      type: file
      description: BED file
      pattern: "*.bed.gz"
  - bam:
      type: file
      description: BAM file
      pattern: "*.bam"
  - tagAlign:
      type: file
      description: tagAlign file
      pattern: "*.tagAlign.gz"
  - pairs:
      type: file
      description: pairs file
      pattern: "*.pairs.gz"
authors:
  - "@mahesh-panchal"

@ -0,0 +1,68 @@
//
// Utility functions used in nf-core DSL2 module files
//

//
// Extract name of software tool from process name using $task.process
//
def getSoftwareName(task_process) {
    return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()
}

//
// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules
//
def initOptions(Map args) {
    def Map options = [:]
    options.args = args.args ?: ''
    options.args2 = args.args2 ?: ''
    options.args3 = args.args3 ?: ''
    options.publish_by_meta = args.publish_by_meta ?: []
    options.publish_dir = args.publish_dir ?: ''
    options.publish_files = args.publish_files
    options.suffix = args.suffix ?: ''
    return options
}

//
// Tidy up and join elements of a list to return a path string
//
def getPathFromList(path_list) {
    def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries
    paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and leading/trailing slashes
    return paths.join('/')
}

//
// Function to save/publish module results
//
def saveFiles(Map args) {
    if (!args.filename.endsWith('.version.txt')) {
        def ioptions = initOptions(args.options)
        def path_list = [ ioptions.publish_dir ?: args.publish_dir ]
        if (ioptions.publish_by_meta) {
            def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta
            for (key in key_list) {
                if (args.meta && key instanceof String) {
                    def path = key
                    if (args.meta.containsKey(key)) {
                        path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key]
                    }
                    path = path instanceof String ? path : ''
                    path_list.add(path)
                }
            }
        }
        if (ioptions.publish_files instanceof Map) {
            for (ext in ioptions.publish_files) {
                if (args.filename.endsWith(ext.key)) {
                    def ext_list = path_list.collect()
                    ext_list.add(ext.value)
                    return "${getPathFromList(ext_list)}/$args.filename"
                }
            }
        } else if (ioptions.publish_files == null) {
            return "${getPathFromList(path_list)}/$args.filename"
        }
    }
}

@ -0,0 +1,40 @@
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName } from './functions'

params.options = [:]
options = initOptions(params.options)

def VERSION = 0.1 // No version information printed

process CHROMAP_INDEX {
    tag "$fasta" // double quotes so that Groovy interpolates the file name
    label 'process_medium'
    publishDir "${params.outdir}",
        mode: params.publish_dir_mode,
        saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:[:], publish_by_meta:[]) }

    conda (params.enable_conda ? "bioconda::chromap=0.1 bioconda::samtools=1.13" : null)
    if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) {
        container "https://depot.galaxyproject.org/singularity/mulled-v2-1f09f39f20b1c4ee36581dc81cc323c70e661633:2cad7c5aa775241887eff8714259714a39baf016-0"
    } else {
        container "quay.io/biocontainers/mulled-v2-1f09f39f20b1c4ee36581dc81cc323c70e661633:2cad7c5aa775241887eff8714259714a39baf016-0"
    }

    input:
    path fasta

    output:
    path "*.index" , emit: index
    path "*.version.txt", emit: version

    script:
    def software = getSoftwareName(task.process)
    def prefix = fasta.baseName
    """
    chromap -i $options.args \\
        -t $task.cpus \\
        -r $fasta \\
        -o ${prefix}.index
    echo "$VERSION" > ${software}.version.txt
    """
}

@ -0,0 +1,33 @@
name: chromap_index
description: Indexes a fasta reference genome ready for chromatin profiling.
keywords:
  - index
  - fasta
  - genome
  - reference
tools:
  - chromap:
      description: Fast alignment and preprocessing of chromatin profiles
      homepage: https://github.com/haowenz/chromap
      documentation: https://github.com/haowenz/chromap
      tool_dev_url: https://github.com/haowenz/chromap
      doi: ""
      licence: ['GPL v3']
input:
  - fasta:
      type: file
      description: Fasta reference file.
output:
  - version:
      type: file
      description: File containing software version
      pattern: "*.{version.txt}"
  - index:
      type: file
      description: Index file of the reference genome
      pattern: "*.{index}"
authors:
  - "@mahesh-panchal"

@ -226,6 +226,14 @@ cat/fastq:
  - modules/cat/fastq/**
  - tests/modules/cat/fastq/**

chromap/chromap:
  - modules/chromap/chromap/**
  - tests/modules/chromap/chromap/**

chromap/index:
  - modules/chromap/index/**
  - tests/modules/chromap/index/**

cnvkit:
  - modules/cnvkit/**
  - tests/modules/cnvkit/**

@ -0,0 +1,79 @@
#!/usr/bin/env nextflow

nextflow.enable.dsl = 2

include { CHROMAP_INDEX } from '../../../../modules/chromap/index/main.nf' addParams( options: [:] )
include { CHROMAP_CHROMAP as CHROMAP_CHROMAP_BASE } from '../../../../modules/chromap/chromap/main.nf' addParams( options: [:] )
include { CHROMAP_CHROMAP as CHROMAP_CHROMAP_SAM } from '../../../../modules/chromap/chromap/main.nf' addParams( options: ['args': '--SAM'] )

workflow test_chromap_chromap_single_end {

    // Test single-end and gz compressed output
    fasta = file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true)
    input = [
        [ id:'test', single_end:true ], // meta map
        [ file(params.test_data['sarscov2']['illumina']['test_1_fastq_gz'], checkIfExists: true) ]
    ]

    CHROMAP_INDEX ( fasta )
    CHROMAP_CHROMAP_BASE (
        input, // meta + read data
        fasta, // reference genome
        CHROMAP_INDEX.out.index, // reference index
        [], // barcode file
        [], // barcode whitelist
        [], // chromosome order file
        [] // pairs chromosome order file
    )
}

workflow test_chromap_chromap_paired_end {

    // Test paired-end and gz compressed output
    fasta = file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true)
    input = [
        [ id:'test', single_end:false ], // meta map
        [
            file(params.test_data['sarscov2']['illumina']['test_1_fastq_gz'], checkIfExists: true),
            file(params.test_data['sarscov2']['illumina']['test_2_fastq_gz'], checkIfExists: true)
        ]
    ]

    CHROMAP_INDEX ( fasta )
    CHROMAP_CHROMAP_BASE (
        input, // meta + read data
        fasta, // reference genome
        CHROMAP_INDEX.out.index, // reference index
        [], // barcode file
        [], // barcode whitelist
        [], // chromosome order file
        [] // pairs chromosome order file
    )
}

workflow test_chromap_chromap_paired_bam {

    // Test paired-end and bam output
    fasta = file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true)
    input = [
        [ id:'test', single_end:false ], // meta map
        [
            file(params.test_data['sarscov2']['illumina']['test_1_fastq_gz'], checkIfExists: true),
            file(params.test_data['sarscov2']['illumina']['test_2_fastq_gz'], checkIfExists: true)
        ]
    ]

    CHROMAP_INDEX ( fasta )
    CHROMAP_CHROMAP_SAM (
        input, // meta + read data
        fasta, // reference genome
        CHROMAP_INDEX.out.index, // reference index
        [], // barcode file
        [], // barcode whitelist
        [], // chromosome order file
        [] // pairs chromosome order file
    )
}

@ -0,0 +1,32 @@
- name: chromap chromap test_chromap_chromap_single_end
  command: nextflow run tests/modules/chromap/chromap -entry test_chromap_chromap_single_end -c tests/config/nextflow.config
  tags:
    - chromap/chromap
    - chromap
  files:
    - path: output/chromap/genome.index
      md5sum: f889d5f61d80823766af33277d27d386
    - path: output/chromap/test.bed.gz
      md5sum: 7029066c27ac6f5ef18d660d5741979a

- name: chromap chromap test_chromap_chromap_paired_end
  command: nextflow run tests/modules/chromap/chromap -entry test_chromap_chromap_paired_end -c tests/config/nextflow.config
  tags:
    - chromap/chromap
    - chromap
  files:
    - path: output/chromap/genome.index
      md5sum: f889d5f61d80823766af33277d27d386
    - path: output/chromap/test.bed.gz
      md5sum: cafd8fb21977f5ae69e9008b220ab169

- name: chromap chromap test_chromap_chromap_paired_bam
  command: nextflow run tests/modules/chromap/chromap -entry test_chromap_chromap_paired_bam -c tests/config/nextflow.config
  tags:
    - chromap/chromap
    - chromap
  files:
    - path: output/chromap/genome.index
      md5sum: f889d5f61d80823766af33277d27d386
    - path: output/chromap/test.bam
      md5sum: bd1e3fe0f3abd1430ae191754f16a3ed

@ -0,0 +1,12 @@
#!/usr/bin/env nextflow

nextflow.enable.dsl = 2

include { CHROMAP_INDEX } from '../../../../modules/chromap/index/main.nf' addParams( options: [:] )

workflow test_chromap_index {

    input = file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true)

    CHROMAP_INDEX ( input )
}

@ -0,0 +1,8 @@
- name: chromap index test_chromap_index
  command: nextflow run tests/modules/chromap/index -entry test_chromap_index -c tests/config/nextflow.config
  tags:
    - chromap/index
    - chromap
  files:
    - path: output/chromap/genome.index
      md5sum: f889d5f61d80823766af33277d27d386