1645 commits

Author SHA1 Message Date
Charles Plessy
207930139a
New last/lastal module to align query sequences on a target index (#510)
* New last/lastal to align query sequences on a target index

`lastal` is the main program of the [LAST](https://gitlab.com/mcfrith/last)
suite.  It aligns query DNA sequences in FASTA or FASTQ format to a
target index of DNA or protein sequences.  The index is produced by
the `lastdb` program (module `last/lastdb`).  The score matrix for
evaluating the alignment can be chosen among preset ones or computed
iteratively by the `last-train` program (module `last/train`).  For
this reason, the `last/lastal` module proposed here has one input
channel for an optional file, which must be a dummy file when not used.

The LAST aligner outputs MAF files that can be very large (up to
hundreds of gigabytes), therefore this module unconditionally compresses
its output with gzip.

This new module is part of the work described in Issue #464.  During
this development, we fix the version of LAST to 1219 to ensure
consistency (hence ignore lint's version warning).

* Apply suggestions from code review

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Un-hardcode the path to the LAST index.

Among multiple alternatives I have chosen the following command to
detect the sample name of the index, because it fails in situations
where there are no index files in the index folder, and in situations
where there are two index files in the folder.  Not failing would
result in feeding garbage into the INDEX_NAME variable.

    basename \$(ls $index/*.bck) .bck

In case of missing file, a clear error message is given by `ls`.  In
case of more than one file, the error message of `basename` is more
cryptic, unfortunately.  (`basename: extra operand ‘.bck’`)

Alternatives that do not fail if there is no .bck file:

    basename $index/*bck .bck
    find $index -name '*bck' | sed 's/.bck//'

Alternatives that do not fail if there is more than one .bck file:

    basename -s .bck $index/*bck
    ls $index/*.bck | xargs basename -s .bck
    find $index -name '*bck' | sed 's/.bck//'
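
The trade-off discussed above can also be made explicit rather than left to the error messages of `ls` and `basename`. The following is a minimal sketch, not the module's code: the function name `find_index_name` is hypothetical, and it swaps the one-liner for an explicit count of the `.bck` files so that both failure cases (no file, several files) report the same clear message.

```shell
# Hypothetical helper (not part of the module): print the basename of the
# single .bck file found in an index directory, or fail with a clear message
# when the directory holds zero or several .bck files.
find_index_name() {
    index_dir=$1
    count=0
    found=
    for f in "$index_dir"/*.bck; do
        # An unmatched glob stays literal, so keep only paths that exist.
        [ -e "$f" ] || continue
        count=$((count + 1))
        found=$f
    done
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one .bck file in $index_dir, found $count" >&2
        return 1
    fi
    basename "$found" .bck
}
```

Inside a Nextflow script block the shell `$` signs would need the same backslash escaping as in the chosen command, for example `INDEX_NAME=\$(find_index_name $index)`.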

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-05-25 22:10:48 +01:00
JIANHONG OU
34f555a26a
add software/pairix (#508)
* add software/pairix

* Update the functions.nf to new version.
Remove -p parameter and fix version output command.
fix the duplicated documentation.
2021-05-25 21:22:57 +01:00
aleksandrabliznina
4575e5455c
New last/split module to find split alignments. (#511)
* New last/split module to find split alignments.

The `last-split` tool distributed with [LAST](https://gitlab.com/mcfrith/last)
finds split or spliced alignments in a MAF file produced by, for
example, the LAST `lastal` command.

This new module is part of the work described in Issue #464. During this
development, we fix the version of LAST to 1219 to ensure consistency. We will
upgrade it later.

* Update software/last/split/main.nf

* Apply suggestions from code review

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-05-24 20:15:57 +01:00
JIANHONG OU
ce68395240
add additional ucsc tools (#506)
* add additional ucsc tools

* Update software/ucsc/wigtobigwig/meta.yml

Co-authored-by: Kevin Menden <kevin.menden@live.com>

* Update the functions.nf and software name for ucsc/wigtobigwig and bigwigaverageoverbed.

Co-authored-by: Kevin Menden <kevin.menden@live.com>
2021-05-20 15:39:33 -04:00
Charles Plessy
e75f88c68a
New module last/mafswap to reorder sequences in alignments (#500)
* New module last/mafswap to reorder sequences in alignments

The `maf-swap` tool distributed with [LAST](https://gitlab.com/mcfrith/last)
reorders sequences in alignment files in Multiple Alignment Format.
When run without command-line arguments, it will swap the target and the
query sequences.  This is useful when turning a many-to-many alignment
into a many-to-one and then a one-to-one alignment in conjunction with
the `last-split` command (split, swap, split and swap again).

The LAST aligner outputs MAF files, but other tools also use this
format.  As MAF files can be very large (up to hundreds of gigabytes),
the module expects its input to be compressed with gzip and will
compress its output.
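
The "split, swap, split and swap again" sequence, together with the gzip handling, can be sketched as a single pipeline. This is an illustration under assumptions, not the module's script: the function name `split_swap_split_swap` and the `LAST_SPLIT` / `MAF_SWAP` override variables are hypothetical, and the sketch assumes that both tools read MAF on standard input and need no extra options.

```shell
# Hypothetical wrapper: $1 is a gzip-compressed MAF file; the result is
# written to standard output, again gzip-compressed.
# LAST_SPLIT and MAF_SWAP are illustrative overrides, useful for exercising
# the plumbing on a machine where LAST is not installed.
split_swap_split_swap() {
    gzip -dc "$1" \
        | "${LAST_SPLIT:-last-split}" \
        | "${MAF_SWAP:-maf-swap}" \
        | "${LAST_SPLIT:-last-split}" \
        | "${MAF_SWAP:-maf-swap}" \
        | gzip -n
}
```

A call such as `split_swap_split_swap many-to-many.maf.gz > one-to-one.maf.gz` would then never write an uncompressed MAF file to disk.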

This new module is part of the work described in Issue #464.  During
this development, we fix the version of LAST to 1219 to ensure
consistency (hence ignore lint's version warning).

* Update MD5 sum.

Actually, 7029066c27ac6f5ef18d660d5741979a is the MD5 sum of
an empty file compressed with `gzip --no-name`…  This happened
because I forgot to update the config file after correcting the
module… sorry !
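
A checksum that happens to match an empty compressed file is easy to overlook in a test configuration. As a hedged sketch (the helper name `is_empty_gzip` is an assumption and is not part of the module or of the nf-core tooling), a check could reject such outputs directly instead of relying on someone recognising the hash:

```shell
# Hypothetical test helper: succeed when a gzip file decompresses to zero bytes.
is_empty_gzip() {
    # wc -c may pad its output with spaces on some systems, so strip them.
    bytes=$(gzip -dc "$1" | wc -c | tr -d ' ')
    [ "$bytes" -eq 0 ]
}
```

For example, `is_empty_gzip output.maf.gz && echo "output is empty" >&2` would flag the situation described here before the MD5 sum is recorded.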

* Apply suggestions from code review

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Change name as suggested in pull request.

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-05-19 08:59:23 +01:00
aleksandrabliznina
b592cea30b
New last/train module to train alignment parameters. (#492)
* New last/train module to train alignment parameters.

The last-train command creates a parameter file that
will be used by the last/lastal module for sequence alignment.
It takes indexed sequences and query sequences as input
and we use the metadata of both to create an id of the
parameter output file.

Submission of the LAST modules is discussed in more
detail in Issue #464. For consistency, we use LAST
version 1219 for this whole development and will upgrade later.

* Corrected files according to the nf-core v1.14 standards.

* Fixed function.nf file for the last-train module.

* Apply suggestions from code review

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Find index name.

* Correct after the input channels were changed.

* Use double underscore as a name separator.

Single underscores can happen in ids, therefore, we would like to keep two underscores.
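
To illustrate why the separator matters, here is a small sketch with hypothetical helper names; the order of the two ids (index first, query second) is an assumption, not something the module guarantees. With a double underscore, a combined id can be taken apart again even when each part contains single underscores.

```shell
# Hypothetical helpers; the id order (index id, then query id) is an assumption.
join_ids()  { printf '%s__%s\n' "$1" "$2"; }
# Everything before the first double underscore.
first_id()  { printf '%s\n' "${1%%__*}"; }
# Everything after the first double underscore.
second_id() { printf '%s\n' "${1#*__}"; }
```

With `sample_A` and `query_1`, `join_ids` gives `sample_A__query_1`, which splits back into the two original ids. An id that itself contains or ends with consecutive underscores would still be ambiguous.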

* Remove extra spaces.

* Fixed the passing of the "score matrix" line.

* Apply suggestions from code review

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update software/last/train/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-05-19 08:37:08 +01:00
Charles Plessy
e84eaa22f3
Update pytest tag name for fastqc (#501)
Running pytest with `--tag fastqc_single_end` does not work (it runs zero tests), as it appears that the tag name was changed to `fastqc` during the transition from underscores to slashes.
2021-05-19 08:30:22 +01:00
Jose Espinosa-Carrasco
0bbd7acfc4
Fix number of cpus for modules with piped tools (#499)
* Split CPUs for piped commands

* Fix tests, bams no md5 check
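
Tools connected by a pipe run at the same time, so giving each of them the full CPU allocation of the task oversubscribes it. The arithmetic can be sketched as follows, assuming an even split with a minimum of one thread per tool; the function name `cpus_per_tool` is hypothetical and the actual modules may divide the CPUs differently.

```shell
# Hypothetical helper: divide the CPUs allocated to a task ($1) evenly among
# the tools of a pipeline ($2), never going below one thread per tool.
cpus_per_tool() {
    total=$1
    tools=$2
    share=$((total / tools))
    if [ "$share" -lt 1 ]; then
        share=1
    fi
    echo "$share"
}
```

With 8 CPUs and two piped tools, each tool would get 4 threads; with a single CPU, each still gets 1.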
2021-05-18 09:52:00 +02:00
Jose Espinosa-Carrasco
95e02f913f
Update comments with new style (#497)
* Update comment style on functions.nf files

* Update test main.nf comments

* Add meta for ggread
2021-05-12 14:56:46 +01:00
praveenraj2018
598ca152ec
Intervallisttools (#491)
* added intervallisttools module

* add intervallisttools module

* arguments are now supplied using options.args

* removed java heapsize settings

* changes in main.nf and it is tested

* comment added

* Update software/gatk4/intervallisttools/meta.yml

Co-authored-by: Maxime U. Garcia <maxime.garcia@scilifelab.se>

* Update tests/software/gatk4/intervallisttools/test.yml

Co-authored-by: Maxime U. Garcia <maxime.garcia@scilifelab.se>

* review comment on tags in bedtointerval

* modified the test to get input from bedtointerval module

* Update software/gatk4/intervallisttools/meta.yml

* Apply suggestions from code review

Co-authored-by: Kevin Menden <kevin.menden@live.com>

* Apply suggestions from code review

* Update tests/config/pytest_software.yml

Co-authored-by: Kevin Menden <kevin.menden@live.com>

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: @praveenraj2018 <praveen.raj.somarajan@ki.se>
Co-authored-by: Maxime U. Garcia <maxime.garcia@scilifelab.se>
Co-authored-by: Kevin Menden <kevin.menden@live.com>
2021-05-12 11:44:36 +02:00
Jose Espinosa-Carrasco
cdff9a056d
Increase conda build time (#489)
* Add module description to yml

* Increase conda build time
2021-05-10 12:23:52 +01:00
MGordon09
1f465a63d0
Bbmap/bbduk (#487)
* bbmap/bbduk module created

* created bbmap/bbduk module

* updated main.nf

* changed test.yml tags

* removed whitespaces

* Adjusted main.nf spacing

* whitespace, tags

* fix optional files, tags, tidy code

* fix suffix

* Apply suggestions from code review

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-05-10 11:45:52 +01:00
Ravneet Bhuller
12ebce50f7
Request to review code for seqtk/sample module (#488)
* added files

* removed file

* added file

* changed files

* changed files

* edited file

* edited file

* edited files

* edited files

* edited files

* edited tags

* edited tags

* edited tags

* edited tags

Co-authored-by: kaurravneet4123 <kaurravneet4123@yahoo.com@users.noreply.github.com>
2021-05-09 23:55:35 +01:00
Jose Espinosa-Carrasco
f8ea9828cd
Add artic minion (#486)
* Add artic minion module

* Add fast5 to test data configuration

* Add test for artic minion
2021-05-07 16:37:35 +01:00
Anthony Underwood
4422454ba5
Snpsites (#480)
* output constant sites as a val as well as a file so it can be passed into iqtree

* Using an env variable because that's far safer!

* Update software/snpsites/main.nf

* remove hardcoded param that should be a user option

* Update software/snpsites/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-05-06 16:05:07 +01:00
Harshil Patel
b6e4ecabba
Add Nextclade module (#484)
* Bump pangolin version

* Add nextclade to software list

* Add nextclade module

* Update md5sum for Pangolin due to version bump

* Adding some URL to meta.yml

* Adding new line at end of file

Co-authored-by: JoseEspinosa <kadomu@gmail.com>
2021-05-06 15:48:15 +01:00
Yuk Kei Wan
faf77d6fee
add nanolyse module (from nanoseq modules) (#471)
* add nanolyse modules

* add clean.fastq.gz path and md5sum

* fix errors

* remove unreproducible md5sum

* solve linting problem

* address PR suggestions

* GET_NANOLYSE_FASTA as a local module

* Update software/nanolyse/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update software/nanolyse/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update software/nanolyse/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update software/nanolyse/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update software/nanolyse/meta.yml

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update software/nanolyse/meta.yml

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update tests/software/nanolyse/test.yml

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update tests/software/nanolyse/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update tests/software/nanolyse/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* update to the version from nf-core/tools-dev

* input and output files cannot have the same names

* Update test.yml

* Update software/nanolyse/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update tests/software/nanolyse/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update test.yml

* revert

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-05-05 11:20:09 +01:00
Michael L Heuer
e3cf4c822c
Update dsh-bio modules to version 2.0.4 (#482)
* Update dsh-bio modules to version 2.0.4

* update docker tag

* update md5 checksums

* Update software/dshbio/filtergff3/main.nf

* Update software/dshbio/splitgff3/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-05-04 14:28:44 +01:00
Harshil Patel
6ade84b5cd
Update README.md
2021-05-04 11:25:30 +01:00
Edmund Miller
bdee7804ca
build: Bump version to 21.04.0 (#481)
* build: Bump version to 21.04.0

Recent stable release https://github.com/nextflow-io/nextflow/releases/tag/v21.04.0

* Update README.md

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-05-04 00:38:29 +01:00
Kevin Menden
5e86245388
updated paths to test data (#478)
* updated paths to test data

* Update test_data.config

changed file names
2021-05-03 07:13:07 +01:00
Yuk Kei Wan
6ad50f8ec4
Add stringtie merge module (from nanoseq modules) (#475)
* add stringtie merge module

* add md5sum and path for stringtie.merged.gtf

* fix errors

* try fixing stringtie check error

* add tag

* remove unreproducible md5sum

* address PR suggestions

* hopefully fix linting error
2021-05-03 07:18:51 +02:00
Charles Plessy
16d20a7cc4
New last/lastdb module to index sequences before alignment. (#476)
* New last/lastdb module to index sequences before alignment.

The `lastdb` command creates a sequence index for the LAST aligner
(https://gitlab.com/mcfrith/last). Input can be in FASTA or FASTQ
format, and compression is handled automagically.  DNA or protein
sequences can be indexed.

The sequence index is a collection of files sharing the same basename.
This module sets the basename to the sample identifier (`$meta.id`) and
creates the index in a directory always called `lastdb`.  The module's
output channel then conveys a copy of the metadata and the path to the
`lastdb` directory.

Other modules will follow (see Issue #464).  The LAST aligner can align
proteins to proteins, DNA to DNA, and can translate DNA and align it to
proteins.

* Remove trailing whitespace.

* Apply suggestions from code review

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update as suggested in PR.

* Attempt to pass linting.

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-05-02 11:36:31 +01:00
Anthony Underwood
f9433639cf
Update README to include some git best practices (#415)
* Update README to include some git best practices

* correct linting errors

* extra info about returning to master and deleting the branch

* use rebase

* stress importance of rebase

* Changed position of git commands

* move later git commands down the action list
2021-04-30 17:27:06 +01:00
Harshil Patel
466ab67808
Fixes for nanoseq modules (#479)
* Fix minimap2 index module

* Fix minimap2 index tests

* Fix graphmap2 index module

* Fix graphmap2 module

* Fix ECLint

* Fix bedtools bamtobed module

* Fix tests for bedtools bamtobed module

* Add tag for graphmap2 align module

* Fix EClint

* Fix qcat module

* Add md5sum for graphmap2/align module

* Remove non-started test data file

* Remove md5sum for graphmap2 align
2021-04-30 15:57:43 +01:00
Yuk Kei Wan
05f479f03a
add qcat module (from nanoseq modules) (#469)
* add qcat module

* remove md5sum (non-reproducible)
2021-04-30 13:20:56 +01:00
Yuk Kei Wan
3f804ee667
add bedtools bamtobed module (from nanoseq modules) (#466)
* add bedtools bamtobed module

* fix errors

* fix linting problem
2021-04-30 13:20:31 +01:00
Yuk Kei Wan
a8720463ac
add graphmap index and align modules (from nanoseq modules) (#468)
* add graphmap index module

* add graphmap2/index

* add graphmap2 align module

* remove graphmap2 align md5sum
2021-04-30 13:18:58 +01:00
Yuk Kei Wan
05b067e907
add minimap2 index module (#467)
2021-04-30 13:18:11 +01:00
Kevin Menden
bbf8626b28
Bugfix (#477)
* initial 'modules create' of minia

* fixed tests

* finished meta.yml

* fixed filters.yml

* resolved issues in pytest_software.yml

* add newline

* Update software/minia/main.nf

* fixing a bunch of module tests

* remove vscode

* fixed minia

* remove the test data again

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-04-30 07:32:10 +02:00
Florian Wuennemann
88dda738ac
Added kallistobustools/count (#409)
* Basic kb-python count functionality for scRNA-seq quantification working.

* Added tests and test data for workflow kite.

* Removed trailing whitespace

* Changed output channels to tuples with meta

Based on suggestions by @KevinMenden.

* Moved workflow and technology to input variables. Currently the create test-yaml file script fails with a cryptic message.

* Update software/kallistobustools/count/main.nf

@KevinMenden fixed wrong path definition

Co-authored-by: Kevin Menden <kevin.menden@live.com>

* Increased version of kb-python

* Updated tests with raw links.

* Fixed subtool referencing: kallistobustools/count

* Added newline

* Update software/kallistobustools/count/main.nf

Co-authored-by: Kevin Menden <kevin.menden@live.com>
2021-04-30 07:27:17 +02:00
Anthony Underwood
6a31737cb8
Raxmlng (#474)
* raxml-ng is compute intensive - upgrade process label to high

* ensure raxmlng uses --all when bootstrapping

* remove block that should be taken care of by the options passed in

* update tests
2021-04-29 08:20:05 +01:00
FriederikeHanssen
9ce4427275
Add gvcf index files (#472)
2021-04-28 20:23:10 +01:00
Anthony Underwood
9b0aa3e239
raxml-ng is compute intensive - upgrade process label to high (#473)
2021-04-28 20:21:44 +01:00
Florian Wuennemann
a5d0cf3686
Update kallistobustools/ref module (#465)
* Bump version and format references to use raw params.test_data

* Apply suggestions from code review

Co-authored-by: Kevin Menden <kevin.menden@live.com>

* Updated tags

Missed one in last commit.

Co-authored-by: Kevin Menden <kevin.menden@live.com>
2021-04-28 14:54:40 +02:00
Daniel Lundin
d7a3286a9a
New module to use hmmalign from HMMER to align sequences (#470)
* Ignore vim tmp files

* Added hmmalign module, no tests yet

* Test output

* Replaced functions.nf for hmmalign with upstream

* Update software/hmmer/hmmalign/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update software/hmmer/hmmalign/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update software/hmmer/hmmalign/meta.yml

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update tests/software/hmmer/hmmalign/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update tests/software/hmmer/hmmalign/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update tests/software/hmmer/hmmalign/test.yml

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update tests/config/pytest_software.yml

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

* Update software/hmmer/hmmalign/main.nf

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-04-28 11:21:24 +01:00
Jose Espinosa-Carrasco
d63ff4ba1b
Add artic guppyplex (#455)
* Adding artic guppyplex module

* Adding guppyplex tests

* Fix tests

* Correcting typo

* Fix lint

* Fix test

* Missing description

* Missing descriptions

* Update functions to last version as suggested

* Bump newest version of nanoplot
2021-04-27 15:57:34 +01:00
Maxime Garcia
789a799e41
feat: remove social preview image to use GitHub OpenGraph (#461)
2021-04-26 11:59:12 +01:00
Harshil Patel
9ea7f50963
Bump plasmidid version (#460)
* Bump plasmidid version

* Fix tests
2021-04-26 12:40:38 +02:00
FriederikeHanssen
ae154b8c3f
Add human data paths (#458)
* Add new human data and fix sarscov paths

* Fix filename typo

* Apply code review

* replace index with to match sarscov data

* lower case

* indent everything

* Adapt sarscov keys to new naming convention

* Update test_data.config

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-04-19 15:30:43 +01:00
Kevin Menden
f7e3b8260c
change process.time for tests to 2h (#452)
2021-04-16 13:08:03 +01:00
Harshil Patel
d1c6082a66
Update modules required for rnaseq pipeline (#449)
* Update HISAT2 build module

* Bump preseq version

* Fix tests

* Add meta.yml for preseq to fix linting

* Auto-detect --genomeSAindexNbases for smaller genomes

* Add placeholder to use human data for the tests

* Add CSI output option to samtools/index

* Fix samtools/index tests
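
For context on the `--genomeSAindexNbases` item: the STAR manual recommends scaling this option down for small genomes to min(14, log2(GenomeLength)/2 - 1). A minimal sketch of that calculation follows, assuming the genome length is taken from a FASTA index (`.fai`) file and that the result is rounded down; the helper name `sa_index_nbases` is hypothetical and the module may compute the value differently.

```shell
# Hypothetical helper: sum the sequence lengths (column 2 of a .fai file) and
# print min(14, log2(total length)/2 - 1). Rounding down is an assumption.
sa_index_nbases() {
    awk '{ total += $2 }
         END {
             n = int(log(total) / log(2) / 2 - 1)
             if (n > 14) n = 14
             print n
         }' "$1"
}
```

The printed value could then be passed on as `--genomeSAindexNbases` when building the index for a small test genome.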
2021-04-16 08:56:47 +01:00
Harshil Patel
defaca4f1b
Add ucsc/bedclip module (#450)
* Add ucsc/bedclip module

* Fix tests

* Fix nf-core lint
2021-04-15 22:04:59 +02:00
Anthony Underwood
e2d64bd7ec
minor typo (#451)
2021-04-15 19:45:23 +01:00
Jose Espinosa-Carrasco
1e033bbf02
Fixing abacas meta.yml file (#447)
* Fixing abacas meta.yml file

* Fix lint test
2021-04-15 11:17:06 +01:00
Anthony Underwood
2ed9b6ae28
Raxmlng (#443)
* new raxml module

* new raxml module

* pass in args for bootstrap and add test for support file

* remove unnecessary tag

* ensure tags meet guidelines

* Apply suggestions from code review

* Update to latest functions file

Co-authored-by: avantonder <avt@sanger.ac.uk>
Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-04-14 08:38:59 +01:00
Jose Espinosa-Carrasco
750bd8c3e3
Finish deeptools modules (#442)
* Adding bigWig and deeptools computeMatrix files to config

* Adding meta.yml for deeptools modules

* Add test for deeptools modules

* Fixing and reordering tags

* Fixing conda test that worked in local...

* Apply suggestions from code review

* Changing bigwig file pattern to include bigwig extension

* Saving after last change is a good practice

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-04-13 21:48:43 +01:00
Kevin Menden
043a7d1e3c
remove deprecated test tags (#440)
* remove deprecated test tags

* fix bowtie, gatk4/applybqsr

* fix gatk4 baserecalibrator

* fixed shovill

* fixed yara/mapper

* fixed kallistobustools/ref paths
2021-04-13 18:03:09 +01:00
Jose Espinosa-Carrasco
a9fcbd93cc
Move assembly test files from genome to illumina (#441)
* Fix plasmidid tests for new contigs.fasta file

* Fixing two md5sum hashes

* Update test path in config for illumina assembly files

* Update modules using assembly files

* Correctly setting path of assembly files

* Update tests/config/test_data.config

Co-authored-by: Harshil Patel <drpatelh@users.noreply.github.com>
2021-04-13 12:52:11 +02:00
Jose Espinosa-Carrasco
26fdebc8de
Fix plasmidid tests for new contigs.fasta file (#438)
* Fix plasmidid tests for new contigs.fasta file

* Fixing two md5sum hashes
2021-04-13 08:49:44 +01:00