Add documentation for output docs

2024-11-22 07:19:55 +00:00 · 2023-01-12 15:47:29 +01:00
commit 9f21181972
parent 3b6be07623
1 changed file with 22 additions and 17 deletions
--- a/docs/output.md
+++ b/docs/output.md
@@ -64,6 +64,8 @@ fastp can automatically detect adapter sequences for Illumina data.
 <summary>Output files</summary>

 - `fastp`
+  - `<sample_id>.fastp.fastq.gz`: File containing the trimmed, unmerged FASTQ reads.
+  - `<sample_id>.merged.fastq.gz`: File containing the reads that were successfully merged.

 </details>

@@ -95,7 +97,7 @@ Note that the FASTQ files may _not_ always be the 'final' reads that go into tax
 <summary>Output files</summary>

 - `porechop`
-  - `<sample_id>.fastq.gz`
+  - `<sample_id>.fastq.gz`: Adapter-trimmed FASTQ file

 </details>

@@ -143,8 +145,8 @@ Note that the FASTQ file(s) may _not_ always be the 'final' reads that go into t
 <summary>Output files</summary>

 - `filtlong`
-  - `<sample_id>_filtered.fastq.g`
-  - `<sample_id>_filtered.log`
+  - `<sample_id>_filtered.fastq.gz`: FASTQ file of reads filtered by quality or against short-read data
+  - `<sample_id>_filtered.log`: Log file containing summary statistics

 </details>

@@ -174,7 +176,7 @@ Note that the FASTQ file(s) may _not_ always be the 'final' reads that go into t
 <summary>Output files</summary>

 - `minimap2`
-  - `<sample_id>.bam`
+  - `<sample_id>.bam`: Alignment file in BAM format

 </details>

@@ -184,7 +186,7 @@ Note that the FASTQ file(s) may _not_ always be the 'final' reads that go into t
 <summary>Output files</summary>

 - `samtoolsstats`
-  - `<sample_id>.stats`
+  - `<sample_id>.stats`: File containing samtools stats output

 </details>

@@ -204,7 +206,7 @@ Note that the FASTQ file(s) may _not_ always be the 'final' reads that go into t

 </details>

-The main taxonomic profiling file from KrakenUniq is the `*.tsv` file. This provides the basic results from Kraken2 but with the corrected abundance information. 
+The main taxonomic profiling file from KrakenUniq is the `*.tsv` file. This provides the basic results from Kraken2 but with the corrected abundance information.

 ### Kraken2

@@ -256,10 +258,11 @@ You will only receive the FASTQs and `*classifiedreads.txt` file if you supply `
 <summary>Output files</summary>

 - `centrifuge`
-  - `<sample_id>.centrifuge.mapped.fastq.gz`
-  - `<sample_id>.centrifuge.report.txt`
-  - `<sample_id>.centrifuge.results.txt`
-  - `<sample_id>.centrifuge.unmapped.fastq.gz`
+  - `<sample_id>.centrifuge.mapped.fastq.gz`: FASTQ file(s) containing all mapped reads
+  - `<sample_id>.centrifuge.report.txt`: A classification report that summarises the taxonomic ID, the taxonomic rank, the length of the genome sequence, and the number of classified and uniquely classified reads
+  - `<sample_id>.centrifuge.results.txt`: A file that summarises the classification assignment for each read, i.e., read ID, sequence ID, score for the classification, score for the next best classification, and number of classifications for this read
+  - `<sample_id>.centrifuge.txt`: A Kraken2-style report that summarises the fraction abundance, taxonomic ID, number of k-mers, and taxonomic path of all the hits in the Centrifuge run for a given sample
+  - `<sample_id>.centrifuge.unmapped.fastq.gz`: FASTQ file(s) containing all unmapped reads

 </details>

@@ -269,7 +272,8 @@ You will only receive the FASTQs and `*classifiedreads.txt` file if you supply `
 <summary>Output files</summary>

 - `kaiju`
-  - `<sample_id>.tsv`
+  - `<sample_id>.tsv`: A file that summarises the fraction abundance, taxonomic ID, number of reads and taxonomic names
+  - `kaiju_<db_name>_combined_reports.txt`: A combined profile of all samples aligned to a given database (as generated by `kaiju2table`)

 </details>

@@ -279,8 +283,8 @@ You will only receive the FASTQs and `*classifiedreads.txt` file if you supply `
 <summary>Output files</summary>

 - `diamond`
-  - `<sample_id>.log`
-  - `<sample_id>.sam`
+  - `<sample_id>.log`: A log file containing stdout information
+  - `<sample_id>.sam`: A file in SAM format that contains the aligned reads

 </details>

@@ -321,22 +325,23 @@ You will only recieve the `.sam` and `.megan` files if you supply `--malt_save_r

 </details>

-The main taxonomic profiling file from MetaPhlAn3 is the `*_profile.txt` file. This provides the abundance estimates from MetaPhlAn3 however does not include raw counts by default. 
+The main taxonomic profiling file from MetaPhlAn3 is the `*_profile.txt` file. This provides the abundance estimates from MetaPhlAn3; however, it does not include raw counts by default.
 ### mOTUs

 <details markdown="1">
 <summary>Output files</summary>

 - `motus`
-  - `<sample_id>.log`
-  - `<sample_id>.out`
+  - `<sample_id>.log`: A log file that contains summary statistics
+  - `<sample_id>.out`: A classification file that summarises taxonomic identifiers, by default at the rank of mOTUs (i.e., species level), and their relative abundances in the profiled sample.
+  - `motus_<db_name>_combined_reports.txt`: A combined profile of all samples aligned to a given database (as generated by `motus_merge`)

 </details>
 ### Krona

 [Krona](https://github.com/marbl/Krona) allows the exploration of (metagenomic) hierarchical data with interactive zooming, multi-layered pie charts.

-Krona charts will be generated by the pipeline for supported tools (Kraken2, Centrifuge, Kaiju, and MALT) 
+Krona charts will be generated by the pipeline for supported tools (Kraken2, Centrifuge, Kaiju, and MALT)

 <details markdown="1">
 <summary>Output files</summary>