* hifiasm copied from fastqc
* hifiasm tests init from fastqc
* meta.yml init; test.yml and main.nf for printing version
* Add hifiasm version printing
* Removed spaces on an empty line
* Reverted hifiasm from main
* Added seqtk/subseq and checking for seed in seqtk/sample
* Separate authors in software/seqtk/sample/meta.yml
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Align commas in output channels software/seqtk/subseq/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Define prefix in software/seqtk/subseq/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Use prefix in output file name software/seqtk/subseq/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Define suffix in options in tests/software/seqtk/subseq/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Change output file name in tests/software/seqtk/subseq/test.yml
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Remove a to-do point from tests/software/seqtk/subseq/test.yml
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Added --no-name into gzip commands
* Update samtools from 1.10 to 1.12 (#530)
* feat: remove social preview image to use GitHub OpenGraph
* feat: update samtools from 1.10 to 1.12
* fix: CI tests
* fix: add meta.yml file for samtools/merge
* Update software/samtools/merge/meta.yml
Co-authored-by: Jose Espinosa-Carrasco <kadomu@gmail.com>
* Update software/samtools/merge/meta.yml
Co-authored-by: Jose Espinosa-Carrasco <kadomu@gmail.com>
* hifiasm copied from fastqc
* hifiasm tests init from fastqc
* meta.yml init; test.yml and main.nf for printing version
* Add hifiasm version printing
* Removed spaces on an empty line
* Reverted hifiasm from main
* Added seqtk/subseq and checking for seed in seqtk/sample
* Separate authors in software/seqtk/sample/meta.yml
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Align commas in output channels software/seqtk/subseq/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Define prefix in software/seqtk/subseq/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Use prefix in output file name software/seqtk/subseq/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Define suffix in options in tests/software/seqtk/subseq/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Change output file name in tests/software/seqtk/subseq/test.yml
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Remove a to-do point from tests/software/seqtk/subseq/test.yml
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Added --no-name into gzip commands
* Replaced functions.nf in seqtk/subseq
* Refreshed tests for sample and subseq
* Corrected paired-end test and YAML description for sample
Co-authored-by: Sviatoslav Sidorov <sviatoslav.sidorov@crick.ac.uk>
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
Co-authored-by: Maxime U. Garcia <max.u.garcia@gmail.com>
Co-authored-by: Jose Espinosa-Carrasco <kadomu@gmail.com>
* add software/pairtools
* create a branch for pairtools/restrict
* fix the different output of conda and docker
* remove customized code.
* add newline to Frag.bed file.
* change the folder of frag.bed.
* change \n to \r\n
* Remove work.frag.bed
Co-authored-by: JoseEspinosa <kadomu@gmail.com>
* New last/mafconvert module to convert MAF alignments.
The `maf-convert` tool distributed with [LAST](https://gitlab.com/mcfrith/last)
reads alignments in [MAF](https://genome-asia.ucsc.edu/FAQ/FAQformat.html#format5)
format and converts them to another format (axt, blast, blasttab, chain,
gff, html, psl, sam, tab).
This new module is part of the work described in Issue #464. During this
development, we fix the version of LAST to 1219 to ensure consistency.
We will upgrade it later.
* Delete white space.
* Update the functions.nf file to the dev version.
The `last-postmask` tool distributed with [LAST](https://gitlab.com/mcfrith/last)
filters alignments in a MAF file to remove those with too many masked
(lower-case) positions compared with their score.
Like other filter modules such as `last/split`, it risks overwriting its
input file with its output file when multiple filters are chained in the
pipeline, because both names are constructed from the sample ID. I added a
check that gives a clearer error message in this case. Please let me know
what you think; I can add this check to the existing LAST modules
as well.
This new module is part of the work described in Issue #464. During this
development, we fix the version of LAST to 1219 to ensure consistency.
We will upgrade it later.
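The name-clash check described above could be sketched in shell as follows. This is only an illustration under assumptions: the function name `check_distinct_names`, the sample ID and the file names are hypothetical, and the module itself may implement the check differently.

```shell
# Hypothetical guard: refuse to run a filter whose output name equals its input name.
check_distinct_names() {
    input=$1
    output=$2
    if [ "$input" = "$output" ]; then
        echo "Input and output names are the same ($input); set a suffix to disambiguate." >&2
        return 1
    fi
}

# Chained filters derive both names from the sample ID, so a suffix keeps them apart.
sample_id="sample1"   # assumed sample ID, for illustration only
check_distinct_names "${sample_id}.maf.gz" "${sample_id}.postmask.maf.gz" \
    && status_ok=0 || status_ok=$?
check_distinct_names "${sample_id}.maf.gz" "${sample_id}.maf.gz" 2>/dev/null \
    && status_clash=0 || status_clash=$?
echo "distinct names: $status_ok, clashing names: $status_clash"
```

Failing early with an explicit message is easier to diagnose than a filter that silently reads from the file it is writing to.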
* New last/dotplot module for pairwise similarity plots
The `last-dotplot` tool takes a pairwise alignment in
[MAF](http://genome.ucsc.edu/FAQ/FAQformat.html#format5) format,
possibly compressed with gzip, or in a tabular format produced by the
`maf-convert` tool, and produces a similarity dot-plot of the two
sequences in one of the graphical formats supported by the Python
Imaging Library.
As the tool guesses the output format from the extension of the output file,
which is constructed by the module at run time, I have used the `args2`
option to convey this information to the module.
This new module is part of the work described in Issue #464. During
this development, we fix the version of LAST to 1219 to ensure
consistency (hence please ignore lint's version warning).
* Update the functions.nf file to the dev branch.
https://raw.githubusercontent.com/nf-core/tools/dev/nf_core/module-template/software/functions.nf
* feat: remove social preview image to use GitHub OpenGraph
* feat: update samtools from 1.10 to 1.12
* fix: CI tests
* fix: add meta.yml file for samtools/merge
* Update software/samtools/merge/meta.yml
Co-authored-by: Jose Espinosa-Carrasco <kadomu@gmail.com>
* Update software/samtools/merge/meta.yml
Co-authored-by: Jose Espinosa-Carrasco <kadomu@gmail.com>
* add software/pairtools
* create a branch for pairtools/sort
* fix the different output of conda and docker.
* remove customized code.
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* add software/pairtools
* create a branch for pairtools/parse
* fix the issue that bioconda output differs from docker.
* remove customized code from test.
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* add software/pairtools
* create a branch for pairtools/flip
* fix the issue of PG line in output
* remove custom code from test.
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* New last/lastal to align query sequences on a target index
`lastal` is the main program of the [LAST](https://gitlab.com/mcfrith/last)
suite. It aligns query DNA sequences in FASTA or FASTQ format to a
target index of DNA or protein sequences. The index is produced by
the `lastdb` program (module `last/lastdb`). The score matrix for
evaluating the alignment can be chosen among preset ones or computed
iteratively by the `last-train` program (module `last/train`). For
this reason, the `last/lastal` module proposed here has one input
channel for an optional file, which must be a dummy file when not used.
The LAST aligner outputs MAF files that can be very large (up to
hundreds of gigabytes), therefore this module unconditionally compresses
its output with gzip.
This new module is part of the work described in Issue #464. During
this development, we fix the version of LAST to 1219 to ensure
consistency (hence ignore lint's version warning).
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Un-hardcode the path to the LAST index.
Among multiple alternatives I have chosen the following command to
detect the sample name of the index, because it fails in situations
where there are no index files in the index folder, and in situations
where there are two index files in the folder. Not failing would
result in feeding garbage information into the INDEX_NAME variable.
basename \$(ls $index/*.bck) .bck
In case of missing file, a clear error message is given by `ls`. In
case of more than one file, the error message of `basename` is more
cryptic, unfortunately. (`basename: extra operand ‘.bck’`)
Alternatives that do not fail if there is no .bck file:
basename $index/*bck .bck
find $index -name '*bck' | sed 's/.bck//'
Alternatives that do not fail if there are more than one .bck file:
basename -s .bck $index/*bck
ls $index/*.bck | xargs basename -s .bck
find $index -name '*bck' | sed 's/.bck//'
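For comparison, a stricter variant with an explicit message for both failure cases could look like the sketch below. It is not the command used in the module: the function name `index_name` and the demo directory are hypothetical, and it assumes a POSIX shell where an unmatched glob is left as a literal string.

```shell
# Hypothetical helper: print the basename of the single .bck file in a directory.
index_name() {
    dir=$1
    set -- "$dir"/*.bck
    # An unmatched glob stays literal, so check that the first match really exists.
    if [ ! -e "$1" ]; then
        echo "No .bck file found in $dir" >&2
        return 1
    fi
    if [ "$#" -ne 1 ]; then
        echo "Expected exactly one .bck file in $dir, found $#" >&2
        return 1
    fi
    basename "$1" .bck
}

# Demo with a temporary directory holding one made-up index file.
demo=$(mktemp -d)
touch "$demo/sample1.bck"
name=$(index_name "$demo")
echo "INDEX_NAME=$name"
```

The trade-off is a few more lines in exchange for an error message that names the actual problem when several index files are present.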
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* New last/split module to find split alignments.
The `last-split` tool distributed with [LAST](https://gitlab.com/mcfrith/last)
finds split or spliced alignments in a MAF file that is produced with, for
example, LAST `lastal` command.
This new module is part of the work described in Issue #464. During this
development, we fix the version of LAST to 1219 to ensure consistency. We will
upgrade it later.
* Update software/last/split/main.nf
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* add additional ucsc tools
* Update software/ucsc/wigtobigwig/meta.yml
Co-authored-by: Kevin Menden <kevin.menden@live.com>
* Update the functions.nf and software name for ucsc/wigtobigwig and bigwigaverageoverbed.
Co-authored-by: Kevin Menden <kevin.menden@live.com>
* New module last/mafswap to reorder sequences in alignments
The `maf-swap` tool distributed with [LAST](https://gitlab.com/mcfrith/last)
reorders sequences in alignment files in Multiple Alignment Format.
When run without command-line arguments, it will swap the target and the
query sequences. This is useful when turning a many-to-many alignment
into a many-to-one and then a one-to-one alignment in conjunction with
the `last-split` command (split, swap, split and swap again).
The LAST aligner outputs MAF files, but other tools also use this
format. As MAF files can be very large (up to hundreds of gigabytes),
the module expects its input to be compressed with gzip and will
compress its output.
This new module is part of the work described in Issue #464. During
this development, we fix the version of LAST to 1219 to ensure
consistency (hence ignore lint's version warning).
* Update MD5 sum.
Actually, 7029066c27ac6f5ef18d660d5741979a is the MD5 sum of
an empty file compressed with `gzip --no-name`… This happened
because I forgot to update the config file after correcting the
module… sorry !
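The role of `--no-name` in such checksums can be illustrated with a small shell sketch. It assumes `gzip` and `md5sum` are available; the temporary file names, contents and timestamps are made up for the demonstration.

```shell
# Illustration only: with --no-name, gzip leaves the original file name and
# timestamp out of the header, so identical content gives identical checksums.
workdir=$(mktemp -d)
printf 'ACGT\n' > "$workdir/first.txt"
printf 'ACGT\n' > "$workdir/second.txt"
# Give the two files different modification times (arbitrary dates).
touch -t 200001010000 "$workdir/first.txt"
touch -t 202101010000 "$workdir/second.txt"

sum_first=$(gzip --no-name -c "$workdir/first.txt" | md5sum | cut -d ' ' -f 1)
sum_second=$(gzip --no-name -c "$workdir/second.txt" | md5sum | cut -d ' ' -f 1)

# An empty input still yields a valid, non-empty gzip stream with its own MD5 sum.
sum_empty=$(printf '' | gzip --no-name | md5sum | cut -d ' ' -f 1)
echo "first=$sum_first second=$sum_second empty=$sum_empty"
rm -r "$workdir"
```

If a checksum in a test file matches the value printed for the empty input, the module most likely produced no output, which is the situation described above.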
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Change name as suggested in pull request.
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* New last/train module to train alignment parameters.
The last-train command creates a parameter file that
will be used by last/lastal module for sequence alignment.
It takes indexed sequences and query sequences as input
and we use the metadata of both to create an id of the
parameter output file.
Submission of the LAST modules is discussed in more
detail in Issue #464. For consistency, we use LAST
version 1219 for this whole development and will upgrade later.
* Corrected files according to the nf-core v1.14 standards.
* Fixed functions.nf file for the last-train module.
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Find index name.
* Correct after the input channels were changed.
* Use double underscore as a name separator.
Single underscores can occur in IDs, so we keep two underscores as the separator.
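A minimal sketch of why the double underscore helps, using hypothetical IDs that each contain a single underscore (the variable names are illustrative and not taken from the module):

```shell
# Hypothetical IDs; both contain single underscores.
index_id="hg38_chr1"
query_id="sample_A"
combined="${index_id}__${query_id}"

# Split at the first double underscore to recover both parts unambiguously.
recovered_index=${combined%%__*}
recovered_query=${combined#*__}
echo "$combined -> $recovered_index + $recovered_query"
```

With a single underscore as the separator, the combined name could not be split back reliably, because the IDs themselves may contain that character.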
* Remove extra spaces.
* Fixed the passing of the "score matrix" line.
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Update software/last/train/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* added intervallisttools module
* add intervallisttools module
* arguments are now supplied using options.args
* removed java heapsize settings
* changes in main.nf and it is tested
* comment added
* Update software/gatk4/intervallisttools/meta.yml
Co-authored-by: Maxime U. Garcia <maxime.garcia@scilifelab.se>
* Update tests/software/gatk4/intervallisttools/test.yml
Co-authored-by: Maxime U. Garcia <maxime.garcia@scilifelab.se>
* review comment on tags in bedtointerval
* modified the test to get input from bedtointerval module
* Update software/gatk4/intervallisttools/meta.yml
* Apply suggestions from code review
Co-authored-by: Kevin Menden <kevin.menden@live.com>
* Apply suggestions from code review
* Update tests/config/pytest_software.yml
Co-authored-by: Kevin Menden <kevin.menden@live.com>
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: @praveenraj2018 <praveen.raj.somarajan@ki.se>
Co-authored-by: Maxime U. Garcia <maxime.garcia@scilifelab.se>
Co-authored-by: Kevin Menden <kevin.menden@live.com>
* output constant sites as a val as well as a file so it can be passed into iqtree
* Using an env variable because that's far safer!
* Update software/snpsites/main.nf
* remove hardcoded param that should be a user option
* Update software/snpsites/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Bump pangolin version
* Add nextclade to software list
* Add nextclade module
* Update md5sum for Pangolin due to version bump
* Adding some URL to meta.yml
* Adding new line at end of file
Co-authored-by: JoseEspinosa <kadomu@gmail.com>
* New last/lastdb module to index sequences before alignment.
The `lastdb` command creates a sequence index for the LAST aligner
(https://gitlab.com/mcfrith/last). Input can be in FASTA or FASTQ
format, and compression is handled automagically. DNA or protein
sequences can be indexed.
The sequence index is a collection of files sharing the same basename.
This module sets the basename to the sample identifier (`$meta.id`) and
creates the index in a directory always called `lastdb`. The module's
output channel then conveys a copy of the metadata and the path to the
`lastdb` directory.
Other modules will follow (see Issue #464). The LAST aligner can align
proteins to proteins and DNA to DNA, and can translate DNA to align it to
proteins.
* Remove trailing whitespace.
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Update as suggested in PR.
* Attempt to pass linting.
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Basic kb-python count functionality for scRNA-seq quantification working.
* Added tests and test data for workflow kite.
* Removed trailing whitespace
* Changed output channels to tuples with meta
Based on suggestions by @KevinMenden.
* Moved workflow and technology to input variables. Currently the create test-yaml file script fails with a cryptic message.
* Update software/kallistobustools/count/main.nf
@KevinMenden fixed wrong path definition
Co-authored-by: Kevin Menden <kevin.menden@live.com>
* Increased version of kb-python
* Updated tests with raw links.
* Fixed subtool referencing: kallistobustools/count
* Added newline
* Update software/kallistobustools/count/main.nf
Co-authored-by: Kevin Menden <kevin.menden@live.com>
* raxml-ng is compute intensive - upgrade process label to high
* ensure raxmlng uses --all when bootstrapping
* remove block that should be taken care of by the options passed in
* update tests
* Bump version and format references to use raw params.test_data
* Apply suggestions from code review
Co-authored-by: Kevin Menden <kevin.menden@live.com>
* Updated tags
Missed one in last commit.
Co-authored-by: Kevin Menden <kevin.menden@live.com>
* Add new human data and fix sarscov paths
* Fix filename typo
* Apply code review
* replace index with to match sarscov data
* lower case
* indent everything
* Adapt sarscov keys to new naming convention
* Update test_data.config
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Update HISAT2 build module
* Bump preseq version
* Fix tests
* Add meta.yml for preseq to fix linting
* Auto-detect --genomeSAindexNbases for smaller genomes
* Add placeholder to use human data for the tests
* Add CSI output option to samtools/index
* Fix samtools/index tests
* new raxml module
* pass in args for bootstrap and add test for support file
* remove unnecessary tag
* ensure tags meet guidelines
* Apply suggestions from code review
* Update to latest functions file
Co-authored-by: avantonder <avt@sanger.ac.uk>
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Adding bigWig and deeptools computeMatrix files to config
* Adding meta.yml for deeptools modules
* Add test for deeptools modules
* Fixing and reordering tags
* Fixing conda test that worked in local...
* Apply suggestions from code review
* Changing bigwig file pattern to include bigwig extension
* Saving after last change is a good practice
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Resolve suggestions after PR review
* add newline to functions
* need variable interpolation using double quotes; remove unnecessary tag
* add a more resilient link to raw github files
* remove trailing slash
* Update software/iqtree/main.nf
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Add abacas module
* Add test for abacas module
* Add Harshil to authorship
* Updating test with the data uploaded to nf-core/datasets
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* initial 'modules create' of minia
* fixed tests
* finished meta.yml
* fixed filters.yml
* resolved issues in pytest_software.yml
* add newline
* Update software/minia/main.nf
* fixing a bunch of module tests
* remove vscode
* fixed minia
* move test data directory to nf-core/test-datasets
* bump multiqc version
* remove the test data
* updated test data link
* update README
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* new submodule mash/sketch
* fixed submodule naming
* OK, tag is diff to keyword
* OK another round 🤣
* removed TODO comments
* updated as per review comments 🙆♂️
* updated functions.nf 😁
* Update software/mash/sketch/main.nf
* Update main.nf
Removed blank line at the 12th
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* new module: rasusa
* Removed blank line in software/rasusa/main
* updated code as per review comments
* removed blank line as failed for lint
* updated as per review comments
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Added kallistobustools/ref. Local tests all passing with Docker. Linting passed. Test data currently in /tests/data/delete_me
* Removed trailing whitespace line 29
* Moved workflow from meta to options.
* Update main.nf
* Forgot to remove previous testing input channel for workflow.
* Apply suggestions from code review
Applied changes suggested by @drpatelh
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Added gtf to meta.yml.
* Apply suggestions from code review
Adding @drpatelh suggested changes.
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Moved workflow to input value. Fixed tests.
* Update tests/software/kallistobustools/ref/test.yml
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* adding fasttree module
* correct trailing whitespace
* using sarscov2 as a test dir
* remove TODO
* update test data naming
* further test data naming updates
* remove options in favour of $options.args
* ensure non standard exit codes don't cause an issue
* update md5sum
* ci: Add modules lint step
Moved it ahead of the nextflow install so ideally it'll fail before we
bother doing any more setup
* ci: _ => /
* Update tests/config/pytest_software.yml
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Restructuring with new test data sets + fixing tests
* Remove checkings for warning files
* Remove md5 check for test.gene_clusters.fa
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* initial commit
* added meta.yaml info
* add initial logic for featurecounts test
* add args and change SE/PE to strandedness for featurecounts test
* added tests to pytest
* added test.yml
* removed GTF flag in options
* corrected test meta params
* meta yaml corrected tool info
* update test.yml
* fix lint errors meta.yml
Co-authored-by: Nicholas TODA <nicholas.toda@mnhn.fr>
* Added fgbio callmolecularconsensusreads and sortbam modules
* Fixed naming issue in meta.yml
* fix: test.yml and config lint
* Revert "fix: test.yml and config lint"
This reverts commit 0453bc3a8dc3dab6997442a4349ee2241adcc380, which caused the sortbam tests to fail.
* style: Fix test names
* style: Remove trailing whitespace
* fixed test.yml
* fix: test data in sortbam
* fix: data format
* fix: test data for callmolecularconsensusreads
* Corrected with updated test data
* Apply suggestions from code review
Applied changes from code review, mainly syntactical changes
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
Co-authored-by: Sruthi Suresh <sps180004@ganymede.utdallas.edu>
Co-authored-by: Edmund Miller <edmund.a.miller@protonmail.com>
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* initial commit hisat2/build
* changed names for hisat2
* fixed directory structure and args
* added splice site test data
* added splice site inputs
* replaced list with individual args
* fixed removed commas
* added test yml file
* updated hisat2 conda version
* added meta.yml
* added meta.yml description
* added meta.yml inputs
* added meta.yml outputs
* update conda version for hisat2
* removed trailing whitespace meta.yml
* fixed version number for containers
* added test data to test config
* updated for new test logic
* fix pytest issue?
* fix pytest issue
* fixed wrong tool in meta.yaml
* updated test.yaml name
* handle build bug for testing
* handle build bug for testing in yaml
* moved test folder to fix build bug
* use old hisat2 version to avoid conda giving inconsistent md5sum
* initial commit
* removed temp file
* added meta yaml
* add to pytest
* added tests
* added test yml
* add align meta yaml
* add hisat2 align to pytest
* remove need for splice data by calling process
* add hisat2 align se test
* add hisat2 align pe test
* update names hisat2 align
* update software pytest for using multiple modules
* remove splice site test data since using module instead
* remove splice site from config since using module instead
* fixed extra brace
* added hisat2 align test.yml
* removed md5sum for bam files
* updated build md5sums
* Apply suggestions from code review
Co-authored-by: Nicholas TODA <nicholas.toda@mnhn.fr>
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* fix the test path in main.nf for salmon/index and quant
* fix typo
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
Mannnn, hopefully I finally got it right :)
* replaced /salmon/salmon/ with /index/salmon/
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
* Update test paths in bwa mappers
* Fix indentation
* indices pass locally now
* no idea how they could ever pass before; tests pass locally now
* Update samtools and bwamem2 versions
* Correct mulled containers + md5
* Update test for trimgalore
* Fix picard markduplicate test
* Fix tests for picard mergesamfiles
* Fix checksum for markduplicates
* Fix multiplemetrics and wgsmetrics
* Fix checksum for mergesamfiles
* Adding tar.gz kraken2 db to test data
* Update test path files for untar module
* Update test path files for kraken2/run module
* Update test path files for cat/fastq module
* update test data paths
* Update test md5sums
* gatk test fixes & update variantfiltration main
* few extra fixes after review
* fix suspected format error
* Update software/gatk4/variantfiltration/main.nf
* Update software/gatk4/variantfiltration/main.nf
* Apply suggestions from code review
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>