From 604df56fdbf80bf6e079b75c4e2a1b4901ebd9af Mon Sep 17 00:00:00 2001
From: sofstam
Date: Fri, 17 Feb 2023 13:49:13 +0100
Subject: [PATCH] Add a description about the files used by taxpasta

---
 docs/output.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/docs/output.md b/docs/output.md
index 284e55b..9a7518d 100644
--- a/docs/output.md
+++ b/docs/output.md
@@ -444,9 +444,13 @@ The resulting HTML files can be loaded into your web browser for exploration. Ea
 Output files
 - `taxpasta`
-  - `.*.{tsv,csv,arrow,parquet,biom}`: A list of taxonomic profiler output files. The standard format is the `tsv`. The first column describes the taxonomy ID and the rest of the columns describe the read counts for each sample.
+
+  - `_*.{tsv,csv,arrow,parquet,biom}`: Standardised taxon table containing multiple samples. The standard format is the `tsv`. The first column describes the taxonomy ID and the rest of the columns describe the read counts for each sample.
+
+These files will likely be the most useful files for the comparison of differences in classification between different tools or building consensuses, with the caveat they have slightly less information than the actual output from each tool (which may have non-standard information e.g. taxonomic rank, percentage of hits, abundance estimations).
+
 The following report files are used for the taxpasta step:
 - Bracken: `_.tsv` Taxpasta used the `new_est_reads` column for the standardised profile.