nf-core_modules/modules/ascat/main.nf

ASCAT (#1332) * First commit * putting correct links for singularity and docker containers (just had to search for bioconda+ascat to find them, and then put them in like the rest of the nf-core tools had it * adding first try of relevant commands (not working yet, just took their basic pipeline example * test commit * remove test * starting up work with module after 3.0.0 upgrade * add ascat.prepareHTS statemet * add location of docker for new mulled alleleCounter+ASCAT container * first full run with ASCAT on HG00154.mapped.ILLUMINA.bwa.GBR.low_coverage.20101123.bam * add notes on dropbox download * use a newer pytest_modules.yml * add outpit * trying to align with current Sarek output * adding in FH comments * busy clearing up arguments and testing. Still WIP * first working run, in nextflow, with sarek-like output. Still needs more work on input arguments * cleaning up before writing up findings * testing with putting in arguments in args * draft for solution 3 style for arguments * one more test added * adding FH map * finished testing maps for args * wrap-up cram/crai test successfully * updates to address ability to put in ref.fasta argument for cram running * adding remaining import-HTS commands in as args, and removing the chr21/chr22 only testing to test-nextflow.config * first test with auto-downloading the s3-data (when not given as an argument) * removing download-logic for supporting files, documenting in meta.yml, fixing ref_fasta bug * adding mulled singularity container * removing tests * fix left padding lint issue * lint failure in meta.yml * more linting errors * add when argument * adding stub functionality * add stub run * correct md5sum for versions.yml * more testing with -runstub * stub code in pure bash - not mixed with R * reformat version.yml * get rid of absolute paths in test.yml * correct wrong md5sum * adding allelecount conda link * rename normal_bam to input_bam etc * let the pipeline dev worry about matching the right loci and 
allele files * dont hardcode default genomebuild * adding download instruction comment * add doi * fix conda addition bug * add args documentation * test new indent * new test with meta.yml indentation * retry with new meta.yml * retry with new meta.yml - now with empty lines around * retry with new meta.yml - remove trailing whitepsace * trying to fix found quote character that cannot start any token error * try with one empty line above triple-quote and no empty line below * trying with pipe character * checking if its the ending triple quote * one more try with meta.yml * test update bioconda versioning for linting failure * test update bioconda versioning for linting failure 2 * testing allelecounter version error on conda Co-authored-by: @lassefolkersen Co-authored-by: @FriederikeHanssen
2022-03-15 06:18:43 -04:00
process ASCAT {
    tag "$meta.id"
    label 'process_medium'

    conda (params.enable_conda ? "bioconda::ascat=3.0.0 bioconda::cancerit-allelecount=4.3.0" : null)
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/mulled-v2-c278c7398beb73294d78639a864352abef2931ce:dfe5aaa885de434adb2b490b68972c5840c6d761-0':
        'quay.io/biocontainers/mulled-v2-c278c7398beb73294d78639a864352abef2931ce:dfe5aaa885de434adb2b490b68972c5840c6d761-0' }"

    input:
    tuple val(meta), path(input_normal), path(index_normal), path(input_tumor), path(index_tumor)
    path(allele_files)
    path(loci_files)

    output:
    tuple val(meta), path("*png"),               emit: png
    tuple val(meta), path("*cnvs.txt"),          emit: cnvs
    tuple val(meta), path("*purityploidy.txt"),  emit: purityploidy
    tuple val(meta), path("*segments.txt"),      emit: segments
    path "versions.yml",                         emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
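    // args is expected to be a map of ASCAT options; default to an empty map
    // so the property lookups below do not fail when ext.args is unset
    def args = task.ext.args ?: [:]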
    def prefix                          = task.ext.prefix ?: "${meta.id}"
    def gender                          = args.gender ? "$args.gender" : "NULL"
    def genomeVersion                   = args.genomeVersion ? "$args.genomeVersion" : "NULL"
    def purity                          = args.purity ? "$args.purity" : "NULL"
    def ploidy                          = args.ploidy ? "$args.ploidy" : "NULL"
    def gc_files                        = args.gc_files ? "$args.gc_files" : "NULL"

    // Optional arguments are emitted with a leading comma so they can be appended to the ascat.prepareHTS() call
    def minCounts_arg                   = args.minCounts ? ",minCounts = $args.minCounts" : ""
    def chrom_names_arg                 = args.chrom_names ? ",chrom_names = $args.chrom_names" : ""
    def min_base_qual_arg               = args.min_base_qual ? ",min_base_qual = $args.min_base_qual" : ""
    def min_map_qual_arg                = args.min_map_qual ? ",min_map_qual = $args.min_map_qual" : ""
    def ref_fasta_arg                   = args.ref_fasta ? ",ref.fasta = '$args.ref_fasta'" : ""
    def skip_allele_counting_tumour_arg = args.skip_allele_counting_tumour ? ",skip_allele_counting_tumour = $args.skip_allele_counting_tumour" : ""
    def skip_allele_counting_normal_arg = args.skip_allele_counting_normal ? ",skip_allele_counting_normal = $args.skip_allele_counting_normal" : ""

"""
    #!/usr/bin/env Rscript
    library(RColorBrewer)
    library(ASCAT)
    options(bitmapType='cairo')

    # Prepare LogR and BAF files from the BAM/CRAM files
    ascat.prepareHTS(
        tumourseqfile = "$input_tumor",
        normalseqfile = "$input_normal",
        tumourname = "Tumour",
        normalname = "Normal",
        allelecounter_exe = "alleleCounter",
        alleles.prefix = "$allele_files",
        loci.prefix = "$loci_files",
        gender = "$gender",
        genomeVersion = "$genomeVersion",
        nthreads = $task.cpus
        $minCounts_arg
        $chrom_names_arg
        $min_base_qual_arg
        $min_map_qual_arg
        $ref_fasta_arg
        $skip_allele_counting_tumour_arg
        $skip_allele_counting_normal_arg
    )

    # Load the data (file names follow the tumourname given to ascat.prepareHTS)
    ascat.bc = ascat.loadData(
        Tumor_LogR_file = "Tumour_tumourLogR.txt",
        Tumor_BAF_file = "Tumour_tumourBAF.txt",
        Germline_LogR_file = "Tumour_normalLogR.txt",
        Germline_BAF_file = "Tumour_normalBAF.txt",
        genomeVersion = "$genomeVersion",
        gender = "$gender"
    )

    # Optional GC wave correction
    if(!is.null($gc_files)){
        ascat.bc = ascat.GCcorrect(ascat.bc, $gc_files)
    }

    # Plot the raw data
    ascat.plotRawData(ascat.bc)

    # Segment the data
    ascat.bc = ascat.aspcf(ascat.bc)

    # Plot the segmented data
    ascat.plotSegmentedData(ascat.bc)

    # Run ASCAT to fit every tumor to a model, inferring ploidy, normal cell contamination, and discrete copy numbers
    # If rho (purity) and/or psi (ploidy) are set manually, pass them through
    if (!is.null($purity) && !is.null($ploidy)){
        ascat.output <- ascat.runAscat(ascat.bc, gamma=1, rho_manual=$purity, psi_manual=$ploidy)
    } else if(!is.null($purity) && is.null($ploidy)){
        ascat.output <- ascat.runAscat(ascat.bc, gamma=1, rho_manual=$purity)
    } else if(!is.null($ploidy) && is.null($purity)){
        ascat.output <- ascat.runAscat(ascat.bc, gamma=1, psi_manual=$ploidy)
    } else {
        ascat.output <- ascat.runAscat(ascat.bc, gamma=1)
    }

    # Write out segmented regions (including regions with one copy of each allele)
    write.table(ascat.output[["segments"]], file=paste0("$prefix", ".segments.txt"), sep="\t", quote=F, row.names=F)

    # Write out CNVs in bed format
    cnvs=ascat.output[["segments"]][2:6]
    write.table(cnvs, file=paste0("$prefix",".cnvs.txt"), sep="\t", quote=F, row.names=F, col.names=T)

    # Write out purity and ploidy info
    summary <- tryCatch({
            matrix(c(ascat.output[["aberrantcellfraction"]], ascat.output[["ploidy"]]), ncol=2, byrow=TRUE)}, error = function(err) {
                # error handler picks up where error was generated
                print(paste("Could not find optimal solution: ",err))
                return(matrix(c(0,0),nrow=1,ncol=2,byrow = TRUE))
            }
    )
    colnames(summary) <- c("AberrantCellFraction","Ploidy")
    write.table(summary, file=paste0("$prefix",".purityploidy.txt"), sep="\t", quote=F, row.names=F, col.names=T)

    # Version export. The process name and software name are hardcoded because
    # the usual shell-based version capture won't run inside an R block
    version_file_path="versions.yml"
    f <- file(version_file_path,"w")
    writeLines("ASCAT:", f)
    writeLines(" ascat: 3.0.0",f)
    close(f)
"""

    stub:
    def prefix = task.ext.prefix ?: "${meta.id}"
"""
    echo stub > ${prefix}.cnvs.txt
    echo stub > ${prefix}.purityploidy.txt
    echo stub > ${prefix}.segments.txt
    echo stub > Tumour.ASCATprofile.png
    echo stub > Tumour.ASPCF.png
    echo stub > Tumour.germline.png
    echo stub > Tumour.rawprofile.png
    echo stub > Tumour.sunrise.png
    echo stub > Tumour.tumour.png

    echo 'ASCAT:' > versions.yml
    echo ' ascat: 3.0.0' >> versions.yml
"""
}
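
// Usage sketch (an assumption, not part of the module itself): the script block
// above reads its options as properties of `task.ext.args`, so a pipeline's
// modules.config could plausibly set them as a map like the one below.
// All values are illustrative only; pick ones that match your loci/allele files.
//
// process {
//     withName: 'ASCAT' {
//         ext.args = [
//             gender        : 'XX',    // interpolated into R as a quoted string
//             genomeVersion : 'hg19',  // interpolated into R as a quoted string
//             purity        : 0.6,     // optional; forwarded as rho_manual
//             ploidy        : 2.1      // optional; forwarded as psi_manual
//         ]
//     }
// }
//
// Keys that are left out fall back to "NULL" (or to an empty argument string),
// as defined by the `def` lines at the top of the script block.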