Add documentation to README

Signed-off-by: Thomas A. Christensen II <25492070+MillironX@users.noreply.github.com>
2024-12-22 03:08:17 +00:00 · 2022-01-11 10:13:53 -06:00 · 2022-01-11 10:13:53 -06:00 · 6377725abc
commit 6377725abc
parent a74e650635
1 changed files with 193 additions and 0 deletions
--- a/README.rst
+++ b/README.rst
@ -1,2 +1,195 @@
 nfdocs-parser
 =============
 A `Sphinx <https://www.sphinx-doc.org>`_ Extension for automatically generating
 documentation from `Nextflow <https://nextflow.io>`_ workflows, processes, and
 functions with YAML docstrings.
 Usage
 -----
 I'm not putting this on `PyPi <https://pypi.org>`_ just yet, so you'll have to
 use this as a Git submodule. In your documentation project, run
 .. code-block:: bash
    mkdir _ext
    git submodule https://github.com/MillironX/nfdocs-parser.git _ext/nfdocs-parser.git
 Then update your ``conf.py`` to include the following lines:
 .. code-block:: python
    import os
    import sys
    sys.path.append(os.path.abspath('./_ext'))
    extensions = [
        # Keep any other extensions, like InterSphinx and AutoSectionLabel here
        'nfdocs-parser.nfdocs-parser',
    ]
 Inside your documentation file, include the following directive:
 .. code-block:: rst
    .. nfdocs:: ../path/to/nextflow/project/directory
 The extension will look for properly-formatted docstrings in all of your
 Nextflow files (extension ``.nf``) within that directory and its subdirectories
 and output the documentation in that file.
 Example
 -------
 Take a simple Nextflow process, this is a pared-down example from
 `nf-core/modules <https://github.com/nf-core/modules>`_.
 .. code-block:: groovy
    process KRAKEN2 {
        input:
        tuple val(prefix), path(reads)
        path(db)
        output:
        tuple val(prefix), path("*classified*"), emit: classified
        tuple val(prefix), path("*unclassified*"), emit: unclassified
        tuple val(prefix), path("*report.txt"), emit: report
        script:
        """
        kraken2 \\
            --db ${db} \\
            --classified-out ${prefix}.classified.fastq \\
            --unclassified-out ${prefix}.unclassified.fastq \\
            --report ${prefix}.kraken2.report.txt \\
            ${reads}
        """
    }
 This process still has a lot of unique inputs and outputs, so we annotate it
 using triple slashes (``///``) and YAML notation
 .. code-block:: groovy
    /// summary: Classifies metagenomic sequence data
    /// input:
    ///   - tuple:
    ///       - name: prefix
    ///         type: val(String)
    ///         description: Sample identifier
    ///       - name: reads
    ///         type: path
    ///         description: List of input FastQ files
    ///   - name: db
    ///     type: path
    ///     description: Kraken2 database directory
    /// output:
    ///   - name: classified
    ///     tuple:
    ///       - type: val(String)
    ///         description: Sample identifier
    ///       - type: path
    ///         description: |
    ///           Reads classified to belong to any of the taxa in the Kraken2
    ///           database
    ///   - name: unclassified
    ///     tuple:
    ///       - type: val(String)
    ///         description: Sample identifier
    ///       - type: path
    ///         description: |
    ///           Reads not classified to belong to any of the taxa in the
    ///           Kraken2 database
    ///   - name: txt
    ///     tuple:
    ///       - type: val(String)
    ///         description: Sample identifier
    ///       - type: path
    ///         description: |
    ///           Kraken2 report containing stats about classified and not
    ///           classified reads
    process KRAKEN2 {
        ...
    }
 You will get output that looks something like this:
 Input
 '''''
 +-------------------------+-------------------------------------------------------------------+
 | **Tuple**               |                                                                   |
 |                         | +--------------------------+----------------------------------+   |
 |                         | | **prefix** (val(String)) | Sample identifier                |   |
 |                         | +--------------------------+----------------------------------+   |
 |                         | | **reads** (path)         | List of input FastQ files        |   |
 |                         | +--------------------------+----------------------------------+   |
 |                         |                                                                   |
 +-------------------------+-------------------------------------------------------------------+
 | **db** (path)           | Kraken2 database directory                                        |
 +-------------------------+-------------------------------------------------------------------+
 Output
 ''''''
 +--------------------------+-------------------------------------------------------------------+
 | **classified** (Tuple)   |                                                                   |
 |                          | +--------------------------+----------------------------------+   |
 |                          | | val(String)              | Sample identifier                |   |
 |                          | +--------------------------+----------------------------------+   |
 |                          | | path                     | Reads classified to belong to    |   |
 |                          | |                          | any of the taxa in the Kraken2   |   |
 |                          | |                          | database                         |   |
 |                          | +--------------------------+----------------------------------+   |
 |                          |                                                                   |
 +--------------------------+-------------------------------------------------------------------+
 | **unclassified** (Tuple) |                                                                   |
 |                          | +--------------------------+----------------------------------+   |
 |                          | | val(String)              | Sample identifier                |   |
 |                          | +--------------------------+----------------------------------+   |
 |                          | | path                     | Reads not classified to belong   |   |
 |                          | |                          | to any of the taxa in the        |   |
 |                          | |                          | Kraken2 database                 |   |
 |                          | +--------------------------+----------------------------------+   |
 |                          |                                                                   |
 +--------------------------+-------------------------------------------------------------------+
 | **txt** (Tuple)          |                                                                   |
 |                          | +--------------------------+----------------------------------+   |
 |                          | | val(String)              | Sample identifier                |   |
 |                          | +--------------------------+----------------------------------+   |
 |                          | | path                     | Kraken2 report containing stats  |   |
 |                          | |                          | about classified and not         |   |
 |                          | |                          | classified reads                 |   |
 |                          | +--------------------------+----------------------------------+   |
 |                          |                                                                   |
 +--------------------------+-------------------------------------------------------------------+
 Motivation
 ----------
 I liked using the XML documentation blocks in VB.NET because it worked so well
 with IntelliSense. I often find myself scrolling through Nextflow code to
 remember what the form of the input tuple for a particular process was or how
 many outputs I need to account for. YAML seemed like a far superior language for
 documentation, and as most of my Nextflow projects were already using Sphinx,
 parsing the docstrings as part of my Sphinx documentation seemed like the
 logical thing to do.
 Why don't you just use a sidecar ``meta.yml`` file like nf-core does?
 ---------------------------------------------------------------------
 Honestly, because I started using my own format before realizing what the
 ``meta.yml`` file had in it. After some consideration, however, I like my system
 better and am not planning to add compatibility for ``meta.yml`` files.
 Reasons my system is better:
 * No need for sidecar files: everything is in one place
 * Tuple channels are noted as being different that the components that make them
  up, e.g. knowing that a process requires a tuple of ``val(prefix), file(reads)`` and
  a file of ``reference_genome`` is more informative than knowing that a
  process needs a ``val(prefix)``, ``file(reads)`` and
  ``file(reference_genome)``