* Make targets.bed optional when running in wgs mode
* added test for cram
* Update test_data_config with new reference.cnn
* Update main.nf to allow tumor-only running
Still need a unit-test for this. Almost ready, but needs this file as input https://github.com/nf-core/test-datasets/blob/modules/data/generic/cnn/reference.cnn
* re-writing previous changes, but now it won't crash the entire CI-setup
* fixing overlooked merge conflict
* last overlooked merge-conflict
* move all files to batch subfolder
* adding an optional input for a reference file (needed when running germline and tumor-only)
* minor typo
* update meta.yml
* aligning code, renaming cnvkit to cnvkit_batch, renaming tumorbam to tumor, normalbam to normal
* Update pytest_modules.yml
Co-authored-by: EC2 Default User <ec2-user@ip-172-31-21-198.us-west-2.compute.internal>
Co-authored-by: Lasse Folkersen <lassefolkersen@gmail.com>
Co-authored-by: Robert A. Petit III <robbie.petit@gmail.com>
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* add software/cooler
* fix the wrong files uploaded.
* create a branch for cooler/zoomify
* Apply suggestions from code review
* update functions.nf to new version.
* update the test file to test-datasets.
* update the test method of zoomify
* update dump test file.
* update version.txt to version.yml
* Update modules/cooler/dump/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* fix the output bug of versions update to pytest_modules.yml
* update the test file path and fix the output versions.
* Update modules/cooler/dump/main.nf
* indent
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
Co-authored-by: Gregor Sturm <mail@gregor-sturm.de>
Co-authored-by: Sébastien Guizard <sguizard@ed.ac.uk>
Co-authored-by: FriederikeHanssen <Friederike.hanssen@qbic.uni-tuebingen.de>
* New modules added: issues #200 and #310
* Update main.nf
* Update meta.yml
* Update tests/modules/gatk4/genotypegvcfs/main.nf
* Apply suggestions from code review
* Update main.nf
* Updating tests for GenomicsDB input and adding the path for this test resource to test_data.config
* Some minor changes on one of the test files I forgot to include
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
Co-authored-by: GCJMackenzie <43276267+GCJMackenzie@users.noreply.github.com>
* add software/cooler
* fix the wrong files uploaded.
* create a branch for cooler/merge
* remove the bin_size from metadata.
* update the test_data to test-datasets
* update pytest_modules.yml
* update the test file from a single input file to two input files.
update the output file from hdf5 to bedpe.
* update the version.txt to version.yml and functions.nf
* change version.yml to versions
* update the test file path and fix the output versions.
* Update meta.yml
Correct "version" to "versions"
* Update main.nf
Fix typo
* Update main.nf
Remove some spaces
Co-authored-by: Gregor Sturm <mail@gregor-sturm.de>
Co-authored-by: Sébastien Guizard <sguizard@ed.ac.uk>
* added template
* integrated module
* added fasta index info
* test works, have placeholder data for baits until test-data PR is merged
* added new files to config
* updated test files
* fixing fails ✨
* okay final fix here on the md5sum :face_palm:
* md5sum variable
* update meta.yml to reflect consistency to main.nf
* reverted version so conda works
* Apply suggestions from code review
Co-authored-by: Sébastien Guizard <sguizard@ed.ac.uk>
* md5sum can't be generated consistently for output
Co-authored-by: Sébastien Guizard <sguizard@ed.ac.uk>
* Implement PLINK_EXTRACT module
* fix plink version number
* Update main.nf
* Update test_data.config
* Update modules/plink/extract/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* just use one channel
* fix test with new channel input
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* first commit with imputeme as a module. Extensive re-write of imputeme-code, resulting in release v1.0.7 that is runnable in the Nextflow framework.
Co-authored-by: EC2 Default User <ec2-user@ip-172-31-21-198.us-west-2.compute.internal>
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
Co-authored-by: Pontus Freyhult <pontus_github@soua.net>
* saving changes to checkout
* saving to sort out other branch
* removed yml tracking of files that can't be tracked due to directory name changing between runs
* test data added, ready for pr
* fix eol linting error
* Update modules/gatk4/genomicsdbimport/main.nf
Co-authored-by: Francesco L <53608000+lescai@users.noreply.github.com>
* merging with master
* update push to show progress
* tests now working; untar is able to pass data to genomicsdbimport
* commit to checkout
* tests updated, module reworked to simplify and emit updated gendb
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* update meta.yml
Priority of input options changed, updated to reflect this
* Update test.yml
name prefix changed in main script, test.yml updated to reflect this
* fix tests due to review changes
Co-authored-by: GCJMackenzie <gavin.mackenzie@nibsc.org>
Co-authored-by: Francesco L <53608000+lescai@users.noreply.github.com>
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Provide an existing bam file for optitype
* Update main.nf
Attempt at fixing this with new testing data
* Trying slightly different approach
* Mini fixes, not sure what's wrong here
* Add bam file with NM tags in all reads for optitype
Co-authored-by: Alexander Peltzer <apeltzer@users.noreply.github.com>
Co-authored-by: Alexander Peltzer <alexander.peltzer@boehringer-ingelheim.com>
* add liftOver module
* add liftover module tests
* fix getProcessName
* fix tests
* fix out of date function
* version numbers should be numeric
* drop versions.yml from test.yml
* Update modules/ucsc/liftover/main.nf
Remove software name variable
Co-authored-by: Jose Espinosa-Carrasco <kadomu@gmail.com>
* Update tests/modules/ucsc/liftover/main.nf
Use test chain file
Co-authored-by: Jose Espinosa-Carrasco <kadomu@gmail.com>
* add genome_chain_gz to test data config
* update md5sum for new chain test data
* Fix indentation in file declaration
Co-authored-by: Jose Espinosa-Carrasco <kadomu@gmail.com>
* hifiasm copied from fastqc
* hifiasm tests init from fastqc
* meta.yml init; test.yml and main.nf for printing version
* Add hifiasm version printing
* Removed spaces on an empty line
* Reverted hifiasm from main
* Updated seacr callpeak to include a control threshold
* Whitespace
Co-authored-by: Sviatoslav Sidorov <sviatoslav.sidorov@crick.ac.uk>
Co-authored-by: Svyatoslav Sidorov <svet.sidorov@gmail.com>
* Reduce number of required input files for damage profiler
* Remove debugging
* Add optional species list file.
* Working, pending test-dataset update
* Add genome header to config
* chore: use template to create fasterq module
* feat: add fasterq-dump process module
* docs: provide input and output descriptions
* docs: add comment on `--temp`
* fix: use correct variable
* tests: define test output
* refactor: address review comments
* refactor: remove vdb-config input
* chore: add new test data to config
* tests: define single-end and paired-end cases
* refactor: choose specific output
* tests: do not expect single FASTQ for paired-end
* feat: add compression
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* tests: revert the test data name
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* files for learnreadorientationmodel initialised for first commit
* finished scripts and yml files. test working locally but needs f1r2 test data on nf-core before it can be submitted
* updated test data location
* versions file updated, test data added
* updated versions file, edited test file
* small formatting update to main.nf
* Update main.nf
* Update test_data.config
* updated tests main.nf
* Update test_data.config
* Apply suggestions from code review
* Update modules/gatk4/learnreadorientationmodel/main.nf
* Update modules/gatk4/learnreadorientationmodel/meta.yml
* fixed tests failing
Co-authored-by: GCJMackenzie <gavin.mackenzie@nibsc.org>
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* initiated files for calculate contamination
* pushing local repo to remote
* created script, filled in meta yml, created tests and test yml. local checks passing, needs repo side test data
* added option and tests for outputting optional segmentation file
* saving for test push
* versions updated, test data added
* Update main.nf
* fixed versions info, should report correctly now
* small update to main.nf outputs formatting
* Apply suggestions from code review
* Update test_data.config
* Apply suggestions from code review
Co-authored-by: GCJMackenzie <gavin.mackenzie@nibsc.org>
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* 📦 NEW: Add module lima
* 👌 IMPROVE: Move .pbi output to reports channel
* 🐛 FIX: Fix report channel definition
* 👌 IMPROVE: Remove options from command line
update test script with removed options
* 👌 IMPROVE: Add some pacbio test files
* 🐛 FIX: Add Pacbio index to test_data.config
* 👌 IMPROVE: Re add 10000 data test
* 🐛 FIX: Add pbi input
* 👌 IMPROVE: Add parallelization to lima
* 👌 IMPROVE: Add some pbindex
* 🐛 FIX: Add pbi extension to files
* 👌 IMPROVE: Accept one channel (primers moved into the first channel)
* 👌 IMPROVE: Assign a value channel for primers
Improve code workflow readability
* 👌 IMPROVE: Update .gitignore
* 👌 IMPROVE: Update module to last template version
* 🐛 FIX: Correct Singularity and Docker URL
* 👌 IMPROVE: Update to the last version of modules template
* 👌 IMPROVE: Update test_data.config
* 👌 IMPROVE: Remove pbi from input files
* 👌 IMPROVE: Final version of test datasets config
* 👌 IMPROVE: Remove useless index + Fix Typos
* 🐛 FIX: Fill contains args
* 👌 IMPROVE: Add channel for each output
* 👌 IMPROVE: Remove comments
* 🐛 FIX: Clean test_data.config
* Update modules/lima/main.nf
Add meta to each output
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Update modules/lima/main.nf
Remove useless parenthesis
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* 🐛 FIX: Keep version number only
* 🐛 FIX: Reintegrate prefix variable and use it to define output file name
* 👌 IMPROVE: add suffix arg to check output file names
* 👌 IMPROVE: Use prefix for output filename
* 🐛 FIX: Set optional output
Allow usage of different input formats
* 👌 IMPROVE: Update meta file
* 👌 IMPROVE: Update test
One test for each input file type
* 👌 IMPROVE: add fasta, fasta.gz, fastq, fastq.gz test files
* 👌 IMPROVE: Update with last templates / Follow new version.yaml rule
* 🐛 FIX: Fix typos and include getProcessName function
* 👌 IMPROVE: Update .gitignore
* 👌 IMPROVE: Using suffix to manage output was not my best idea
Add a bash code to detect extension and update output file name
* 👌 IMPROVE: clean code
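A rough sketch of the extension detection mentioned above, written as a standalone shell function. The `output_ext` name and the exact list of extensions are assumptions for illustration, not the code used in the module:

```shell
# Hypothetical helper: map an input file name to the extension
# that the output file should carry.
output_ext() {
    case "$1" in
        *.fastq.gz) echo "fastq.gz" ;;
        *.fasta.gz) echo "fasta.gz" ;;
        *.fastq)    echo "fastq" ;;
        *.fasta)    echo "fasta" ;;
        *.bam)      echo "bam" ;;      # assumed to be a supported input
        *)
            # Unknown extension: report it and signal failure to the caller.
            echo "unsupported input: $1" >&2
            return 1
            ;;
    esac
}
```

It could be called as `ext=$(output_ext "$input")`, so that an unexpected input type stops the script instead of producing a misnamed output file.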
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
Co-authored-by: Gregor Sturm <mail@gregor-sturm.de>
Co-authored-by: Mahesh Binzer-Panchal <mahesh.binzer-panchal@nbis.se>
* Specify more guidelines on input channels
* Linting
* Updates based on code review
* Update README.md
* Fix broken sentence
* Start maltextract module
* start tests
* Get tests working now we have test data
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Changes after review
* Update tests/modules/maltextract/main.nf
Co-authored-by: Jose Espinosa-Carrasco <kadomu@gmail.com>
* Update tests/modules/maltextract/main.nf
Co-authored-by: Jose Espinosa-Carrasco <kadomu@gmail.com>
* Update tests/modules/maltextract/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
Co-authored-by: Jose Espinosa-Carrasco <kadomu@gmail.com>
* adds expansionhunter module
Co-authored-by: Maxime U. Garcia <maxime.garcia@scilifelab.se>
* Add unzip module
* Remove missing TODOs, update meta
* Apply changes after code-review from @grst
* Account for user trying to supply two input archives
* Remove debugging test
* Update modules/unzip/main.nf
Co-authored-by: Jose Espinosa-Carrasco <kadomu@gmail.com>
* Correct output path
Co-authored-by: Jose Espinosa-Carrasco <kadomu@gmail.com>
* New last/lastal to align query sequences on a target index
`lastal` is the main program of the [LAST](https://gitlab.com/mcfrith/last)
suite. It aligns query DNA sequences in FASTA or FASTQ format to a
target index of DNA or protein sequences. The index is produced by
the `lastdb` program (module `last/lastdb`). The score matrix for
evaluating the alignment can be chosen among preset ones or computed
iteratively by the `last-train` program (module `last/train`). For
this reason, the `last/lastal` module proposed here has one input
channel containing an optional file, which has to be a dummy file when not used.
The LAST aligner outputs MAF files that can be very large (up to
hundreds of gigabytes), therefore this module unconditionally compresses
its output with gzip.
This new module is part of the work described in Issue #464. During
this development, we fix the version of LAST to 1219 to ensure
consistency (hence ignore lint's version warning).
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Un-hardcode the path to the LAST index.
Among multiple alternatives I have chosen the following command to
detect the sample name of the index, because it fails in situations
where there are no index files in the index folder, and in situations
where there are two index files in the folder. Not failing would
result in feeding garbage information into the INDEX_NAME variable.
basename \$(ls $index/*.bck) .bck
In case of missing file, a clear error message is given by `ls`. In
case of more than one file, the error message of `basename` is more
cryptic, unfortunately. (`basename: extra operand ‘.bck’`)
Alternatives that do not fail if there is no .bck file:
basename $index/*bck .bck
find $index -name '*bck' | sed 's/.bck//'
Alternatives that do not fail if there is more than one .bck file:
basename -s .bck $index/*bck
ls $index/*.bck | xargs basename -s .bck
find $index -name '*bck' | sed 's/.bck//'
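A minimal sketch of the same idea with both failure cases made explicit, instead of relying on the messages printed by `ls` and `basename`. The `index_name` helper is hypothetical and not part of the module:

```shell
# Hypothetical helper: print the index name only when the folder
# holds exactly one .bck file; otherwise report the problem and fail.
index_name() {
    dir=$1
    count=0
    found=
    for f in "$dir"/*.bck; do
        [ -e "$f" ] || continue    # an unmatched glob stays literal; skip it
        count=$((count + 1))
        found=$f
    done
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one .bck file in $dir, found $count" >&2
        return 1
    fi
    basename "$found" .bck
}
```

Used as `INDEX_NAME=$(index_name "$index") || exit 1`, this is longer than the one-liner but gives the same error message for zero and for several index files.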
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* New module last/mafswap to reorder sequences in alignments
The `maf-swap` tool distributed with [LAST](https://gitlab.com/mcfrith/last)
reorders sequences in alignment files in Multiple Alignment Format.
When run without command-line arguments, it will swap the target and the
query sequences. This is useful when turning a many-to-many alignment
into a many-to-one and then a one-to-one alignment in conjunction with
the `last-split` command (split, swap, split and swap again).
The LAST aligner outputs MAF files, but other tools also use this
format. As MAF files can be very large (up to hundreds of gigabytes),
the module expects its input to be compressed with gzip and will
compress its output.
This new module is part of the work described in Issue #464. During
this development, we fix the version of LAST to 1219 to ensure
consistency (hence ignore lint's version warning).
* Update MD5 sum.
Actually, 7029066c27ac6f5ef18d660d5741979a is the MD5 sum of
an empty file compressed with `gzip --no-name`… This happened
because I forgot to update the config file after correcting the
module… sorry!
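For reference, a sketch of how such a checksum arises: compressing empty input with `gzip -n` (the short form of `--no-name`) still yields a small, valid archive, so a test that only compares MD5 sums can pass even when the module wrote no alignments. The `empty_gz_md5` name is hypothetical, and `md5sum` from GNU coreutils is assumed to be available:

```shell
# Hypothetical helper: hash the archive produced from empty input.
# -n omits the original file name and timestamp, so the bytes
# (and therefore the checksum) do not change between runs.
empty_gz_md5() {
    printf '' | gzip -n | md5sum | cut -d ' ' -f 1
}

empty_gz_md5
```

Comparing a test's expected checksum against this value is a quick way to spot an output file that is compressed but empty.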
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Change name as suggested in pull request.
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* New last/train module to train alignment parameters.
The last-train command creates a parameter file that
will be used by last/lastal module for sequence alignment.
It takes indexed sequences and query sequences as input
and we use the metadata of both to create an id of the
parameter output file.
Submission of the LAST modules is discussed in more
detail in issue #464. For consistency, we use LAST
version 1219 for this whole development and will upgrade later.
* Corrected files according to the nf-core v1.14 standards.
* Fixed function.nf file for the last-train module.
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Find index name.
* Correct after the input channels were changed.
* Use double underscore as a name separator.
Single underscores can happen in ids, therefore, we would like to keep two underscores.
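A small sketch of why the double underscore helps, using made-up ids (`hg38_chr1` and `sample_2` are placeholders, not test data). It assumes that ids never contain a double underscore and never start or end with an underscore:

```shell
# Hypothetical ids; single underscores inside them are preserved.
index_id="hg38_chr1"
query_id="sample_2"

# Join with a double underscore so the boundary stays unambiguous.
param_id="${index_id}__${query_id}"

# Split again at the double underscore.
left="${param_id%%__*}"    # everything before the separator
right="${param_id#*__}"    # everything after the separator

echo "$param_id"
echo "$left"
echo "$right"
```

With a single underscore as separator, the same split would cut inside `hg38_chr1` and the two ids could not be recovered.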
* Remove extra spaces.
* Fixed the passing of the "score matrix" line.
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Update software/last/train/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Add new human data and fix sarscov paths
* Fix filename typo
* Apply code review
* replace index with to match sarscov data
* lower case
* indent everything
* Adapt sarscov keys to new naming convention
* Update test_data.config
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Adding bigWig and deeptools computeMatrix files to config
* Adding meta.yml for deeptools modules
* Add test for deeptools modules
* Fixing and reordering tags
* Fixing conda test that worked in local...
* Apply suggestions from code review
* Changing bigwig file pattern to include bigwig extension
* Saving after last change is a good practice
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Resolve suggestions after PR review
* add newline to functions
* need variable interpolation using double quotes; remove unnecessary tag
* add a more resilient link to raw github files
* remove trailing slash
* Update software/iqtree/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Add abacas module
* Add test for abacas module
* Add Harshil to authorship
* Updating test with the data uploaded to nf-core/datasets
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>