Mirror of https://github.com/MillironX/taxprofiler.git, synced 2024-11-22 11:29:54 +00:00
Apply review suggestion
Co-authored-by: James A. Fellows Yates <jfy133@gmail.com>
Commit 3c33ba66ca (parent af8fd18d97)
1 changed file with 13 additions and 5 deletions
@@ -188,17 +188,25 @@ Note that the FASTQ file(s) may _not_ always be the 'final' reads that go into t
### Kraken2

[Kraken2](https://ccb.jhu.edu/software/kraken2/) is a taxonomic sequence classifier that assigns taxonomic labels to DNA sequences. Kraken2 examines the k-mers within a query sequence and uses the information within those k-mers to query a database. That database maps k-mers to the lowest common ancestor (LCA) of all genomes known to contain a given k-mer.
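
To make the k-mer to LCA mapping described above more concrete, here is a toy sketch in Python. It is not how Kraken2 itself is implemented: the taxonomy, genome sequences, and all function names are hypothetical and chosen purely for illustration, and the sketch only counts k-mer hits rather than producing a final classification.

```python
from collections import Counter

# Hypothetical toy taxonomy: child taxon -> parent taxon ("root" is its own parent).
PARENT = {
    "root": "root",
    "Escherichia": "root",
    "E. coli": "Escherichia",
    "E. albertii": "Escherichia",
}


def lineage(taxon):
    """Return the path from a taxon up to the root, inclusive."""
    path = [taxon]
    while PARENT[taxon] != taxon:
        taxon = PARENT[taxon]
        path.append(taxon)
    return path


def lca(a, b):
    """Lowest common ancestor of two taxa in the toy taxonomy."""
    ancestors_of_a = set(lineage(a))
    for taxon in lineage(b):
        if taxon in ancestors_of_a:
            return taxon
    return "root"


def kmers(seq, k):
    """All overlapping k-mers of a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_database(genomes, k):
    """Map each k-mer to the LCA of every genome that contains it."""
    db = {}
    for taxon, seq in genomes.items():
        for kmer in set(kmers(seq, k)):
            db[kmer] = lca(db[kmer], taxon) if kmer in db else taxon
    return db


def kmer_hits(read, db, k):
    """Count the taxa hit by the k-mers of a query read."""
    return Counter(db[kmer] for kmer in kmers(read, k) if kmer in db)


# Made-up sequences: both genomes share the k-mer "ACGT".
genomes = {"E. coli": "ACGTAC", "E. albertii": "ACGTTG"}
db = build_database(genomes, k=4)
hits = kmer_hits("ACGTAC", db, k=4)
```

In this sketch the shared k-mer `ACGT` is stored against the genus, because that is the lowest taxon containing both genomes, while k-mers unique to one genome stay at species level.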
<details markdown="1">
<summary>Output files</summary>

- `kraken2`
  - `<sample_id>.classified.fastq.gz`
  - `<sample_id>.unclassified.fastq.gz`
  - `<sample_id>.report.txt`
  - `<sample_id>.classifiedreads.txt`
- `kraken2/`
  - `<db_name>_combined_reports.txt`: A combined profile of all samples aligned to a given database (as generated by `krakentools`)
  - `<db_name>/`
    - `<sample_id>_<db_name>.classified.fastq.gz`: FASTQ file containing all reads that had a hit against a reference in the database for a given sample
    - `<sample_id>_<db_name>.unclassified.fastq.gz`: FASTQ file containing all reads that did not have a hit in the database for a given sample
    - `<sample_id>_<db_name>.report.txt`: A Kraken2 report that summarises the fractional abundance, taxonomic ID, number of k-mers, and taxonomic path of all the hits in the Kraken2 run for a given sample
    - `<sample_id>_<db_name>.classifiedreads.txt`: A list of read IDs and the hits each read had against each database for a given sample
</details>

The main taxonomic profiling file from Kraken2 is the `_combined_reports.txt` or `*report.txt` file. The former provides the broadest overview of the taxonomic profiling results across all samples against a single database: each sample gets two columns (e.g. `2_all` and `2_lvl`), alongside summary columns that sum across all samples (`tot_all` and `tot_lvl`). The latter gives you the most information for a single sample. The report file is also used for the taxpasta step.

You will only receive the FASTQs and `*classifiedreads.txt` file if you supply the `--kraken2_save_reads` and/or `--kraken2_save_readclassification` parameters to the pipeline.
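
As a rough illustration of the per-sample column layout of the combined report described above, the following Python sketch pulls the `<n>_all` (or `<n>_lvl`) counts out of a tab-separated table. Only the `2_all`, `2_lvl`, `tot_all` and `tot_lvl` column names come from the text; the remaining columns (`perc`, `lvl_type`, `taxid`, `name`), the `#`-prefixed header line, the example values, and the helper name `per_sample_counts` are assumptions made for this example, so check them against a real file before relying on it.

```python
import csv

# Hypothetical excerpt; real files may carry extra comment lines and columns.
COMBINED = (
    "#perc\ttot_all\ttot_lvl\t1_all\t1_lvl\t2_all\t2_lvl\tlvl_type\ttaxid\tname\n"
    "100.0\t30\t0\t10\t0\t20\t0\tR\t1\troot\n"
    "50.0\t15\t15\t5\t5\t10\t10\tS\t562\tEscherichia coli\n"
)


def per_sample_counts(text, suffix="_all"):
    """Return {taxon name: {sample number: count}} for the chosen column suffix."""
    lines = text.splitlines()
    # Assumption: the column header is the last line starting with '#'.
    header_index = max(i for i, line in enumerate(lines) if line.startswith("#"))
    header = lines[header_index].lstrip("#").split("\t")
    rows = csv.reader(lines[header_index + 1:], delimiter="\t")
    # Keep the numbered per-sample columns, skipping the tot_* summary columns.
    sample_columns = [
        (i, col[: -len(suffix)])
        for i, col in enumerate(header)
        if col.endswith(suffix) and not col.startswith("tot")
    ]
    name_index = header.index("name")
    counts = {}
    for row in rows:
        name = row[name_index].strip()
        counts[name] = {sample: int(row[i]) for i, sample in sample_columns}
    return counts


counts = per_sample_counts(COMBINED)
```

Passing `suffix="_lvl"` selects the other column of each pair instead.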
### KrakenUniq
<details markdown="1">