mirror of https://github.com/MillironX/taxprofiler.git synced 2024-11-22 10:29:54 +00:00

Clarify some documentation

James Fellows Yates 2023-03-06 10:41:32 +01:00
parent 9ed384e0f2
commit 4938549d65

@@ -35,19 +35,21 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
 - [MultiQC](#multiqc) - Aggregate report describing results and QC from the whole pipeline
 - [Pipeline information](#pipeline-information) - Report metrics generated during the workflow execution
-### FastQC or falco
+### FastQC or Falco
 <details markdown="1">
 <summary>Output files</summary>
 - `fastqc/`
-- `*_fastqc.html`: FastQC report containing quality metrics.
-- `*_fastqc.zip`: Zip archive containing the FastQC report, tab-delimited data file and plot images.
+- `*_fastqc.html`: FastQC or Falco report containing quality metrics.
+- `*_fastqc.zip`: Zip archive containing the FastQC report, tab-delimited data file and plot images (FastQC only).
 </details>
 [FastQC](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) gives general quality metrics about your sequenced reads. It provides information about the quality score distribution across your reads, per base sequence content (%A/T/G/C), adapter contamination and overrepresented sequences. For further reading and documentation see the [FastQC help pages](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/).
+If preprocessing is turned on, nf-core/taxprofiler runs FastQC/Falco twice -once before and once after adapter removal/read merging, to allow evaluation of the performance of these preprocessing steps. Note in the General Stats table, the columns of these two instances of FastQC/Falco are placed next to each other to make it easier to evaluate. However, the columns of the actual preprocessing steps (e.g., fastp or AdapterRemoval) will be displayed _after_ the two FastQC/Falco columns, even if they were run 'between' the two FastQC/Falco jobs in the pipeline itself.
 > Falco produces identical output to FastQC but in the `falco/` directory.
 ![MultiQC - FastQC sequence counts plot](images/mqc_fastqc_counts.png)
@@ -56,8 +58,6 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
 ![MultiQC - FastQC adapter content plot](images/mqc_fastqc_adapter.png)
-> **NB:** The FastQC plots displayed in the MultiQC report shows _untrimmed_ reads. They may contain adapter sequence and potentially regions with low quality.
 ### fastp
 [fastp](https://github.com/OpenGene/fastp) is a FASTQ pre-processing tool for quality control, trimmming of adapters, quality filtering and other features.