nf-core_modules/tests/data

Modules Test Data

This directory contains all data used for the individual module tests. It is currently organised into genomics and generic. The former contains the typical data required for genomics modules, such as fasta, fastq and bam files. Every folder in genomics corresponds to a single organism. Any other data is stored in generic. This contains files that currently cannot be associated with a genomics category, as well as deprecated files that will be removed in the future and replaced by files in genomics.

When adding a new module, please check carefully whether the data necessary for the tests already exists in tests/data/genomics. If you can't find the data, please ask about it in the #modules channel on Slack.
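One way to check for existing data is to search the directory by file suffix. The following is a minimal sketch, not part of the repository: the helper name `find_test_data` and its arguments are hypothetical, and it assumes it is run from the repository root.

```shell
# Hypothetical helper: list existing test data files whose names end in a given suffix.
# Usage: find_test_data <suffix> [data_dir]
# data_dir defaults to tests/data/genomics (relative to the repository root).
find_test_data() {
    suffix="$1"
    data_dir="${2:-tests/data/genomics}"
    # -type f restricts the search to regular files; sort gives a stable listing.
    find "$data_dir" -type f -name "*${suffix}" | sort
}

# Example call: find_test_data .bam
```

Because the pattern is anchored at the end of the file name, searching for `.bam` lists BAM files but not their `.bam.bai` indices.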

Data Description

genomics

  • sarscov2
    • bam:
      • 'test_{,methylated_}paired_end.bam': sarscov2 sequencing reads aligned against test_genomic.fasta using minimap2
      • 'test_{,methylated_}paired_end.sorted.bam': sorted version of the above bam file
      • 'test_{,methylated_}paired_end.sorted.bam.bai': bam index for the sorted bam file
      • 'test_single_end.bam': alignment (unsorted) of the 'test_1.fastq.gz' reads against test_genomic.fasta using minimap2
      • 'test_unaligned.bam': unmapped BAM file created from 'test_1.fastq.gz' using GATK4 FastqToSam
    • bed
      • 'test.bed': exemplary bed file for the MT192765.1 genome (fasta/test_genomic.fasta)
      • 'test.2.bed': slightly modified copy of the above file
      • 'test.bed.gz': gzipped version
      • 'test.genome.sizes': genome size for the MT192765.1 genome
    • fasta
      • 'test_genomic.fasta': MT192765.1 genome (GCA_011545545.1_ASM1154554v1)
      • 'test_genomic.dict': GATK dict for 'test_genomic.fasta'
      • 'test_genomic.fasta.fai': fasta index for 'test_genomic.fasta'
      • 'test_cds_from_genomic.fasta': coding sequences from the MT192765.1 genome (transcripts)
    • fastq
      • 'test_{1,2}.fastq.gz': sarscov2 paired-end sequencing reads
      • 'test_{1,2}.2.fastq.gz': copies of the above reads
      • 'test_methylated_{1,2}.fastq.gz': sarscov2 paired-end bisulfite sequencing reads (generated with Sherman)
    • gtf
      • 'test_genomic.gtf': GTF for MT192765.1 genome
      • 'test_genomic.gff3': GFF for MT192765.1 genome
      • 'test_genomic.gff3.gz': bgzipped version of the above GFF file
    • paf
      • 'test_cds_from_genomic.paf': PAF file for MT192765.1 genome
    • vcf
      • 'test.vcf', 'test2.vcf': generated from 'test_paired_end.sorted.bam' using bcftools mpileup, call and filter
      • 'test3.vcf': generated from 'test_single_end.sorted.bam' using bcftools mpileup, call and filter
      • '*.vcf.gz': generated from VCF files using bgzip
      • '*.tbi': generated from '*.vcf.gz' files using tabix -p vcf -f <file>
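
The derivation of the VCF files listed above could look roughly like the sketch below. Only the tool names and `tabix -p vcf -f` come from the descriptions; the function name `make_test_vcf`, the remaining flags and the `QUAL<20` filter expression are assumptions for illustration, not the commands actually used to produce the files.

```shell
# Sketch only: flags and the filter threshold are assumptions,
# not the commands actually used to create the test data.
# Usage: make_test_vcf <reference.fasta> <sorted.bam> <output.vcf>
make_test_vcf() {
    ref="$1"
    bam="$2"
    vcf="$3"

    # Pile up reads, call variants, then filter.
    # -Ou streams uncompressed BCF between steps; -Ov writes plain VCF at the end.
    bcftools mpileup -Ou -f "$ref" "$bam" \
        | bcftools call -mv -Ou \
        | bcftools filter -e 'QUAL<20' -Ov -o "$vcf" \
        || return 1

    # Compress with bgzip (-c keeps the plain VCF) and index the result with tabix.
    bgzip -c "$vcf" > "$vcf.gz" || return 1
    tabix -p vcf -f "$vcf.gz"
}

# Example call:
# make_test_vcf fasta/test_genomic.fasta bam/test_paired_end.sorted.bam vcf/test.vcf
```

bgzip is used rather than plain gzip because tabix can only index block-compressed (BGZF) files.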

generic

  • 'a.gff3.gz': bgzipped gff3 file currently necessary for TABIX test
  • bedgraph: bedgraph files for seacr
  • fasta: additional fasta file currently necessary for STAR
  • fastq: additional fastq files currently necessary for STAR
  • gtf: additional gtf file for STAR
  • vcf: several VCF files for tools using those, will be removed in the future
  • 'test.txt.tar.gz': example tar file for the untar module