@bentsherman
Member

This PR re-organizes the documentation on workflows and dataflow logic to match the current Nextflow programming model.

  • Move "Outputs" to the top after "Entry workflow" and "Parameters" to emphasize that these three things go together

  • Move "Dataflow" page into the "Workflows" page as a section after "Entry workflow" and "Named workflows", since dataflow logic exists primarily in the context of a workflow

  • Move auxiliary sections (calling processes, special operators, recursion) under "Dataflow", since all of these concepts fall under the umbrella of dataflow logic

I'm open to different ordering / structuring. I just think all of this content goes together.

@bentsherman bentsherman requested a review from a team as a code owner December 11, 2025 00:16
@netlify

netlify bot commented Dec 11, 2025

Deploy Preview for nextflow-docs-staging ready!

Name Link
🔨 Latest commit fc7bcae
🔍 Latest deploy log https://app.netlify.com/projects/nextflow-docs-staging/deploys/695d897fd02b1900076bfd8c
😎 Deploy Preview https://deploy-preview-6648--nextflow-docs-staging.netlify.app

@christopher-hakkaart
Collaborator

I see the logic in having all of this stuff together.

My instinct is that the page would benefit from a stronger overview at the top: a list of the major parts (entry, parameters, outputs, named workflows, dataflow) with one-line descriptions, an explanation that all of them are related under the concept of a workflow, and a reasonably simple example showing these parts. This probably softens the need for the order to be perfect, as everything is framed up front rather than introduced progressively.

I'll vibe a PR to see how it looks in this scenario.

Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
@bentsherman
Member Author

@christopher-hakkaart I tried my hand at an overview for the Workflows page. Let me know what you think.

Collaborator

@christopher-hakkaart christopher-hakkaart left a comment

Overall, I like the direction and changes. I would like to take a pass at this, but I'm tied up with other projects for the next couple of weeks and won't have time to give it the attention it deserves. This is a significant improvement, and I would suggest mostly language changes rather than rearranging sections. I don't want to hold this up.

My main suggestions are to (1) add a list of the main sections with minimal, scannable descriptions rather than full sentences, and (2) add a bit more context at the start of each section to help frame it.

Take or leave what you like/dislike.

I started some consistency suggestions, but they probably aren't worth it until the whole page is standardized, which I can follow up with in a second PR.

## Outputs

:::{versionadded} 25.10.0
This feature is available as a preview in Nextflow {ref}`24.04 <workflow-outputs-first-preview>`, {ref}`24.10 <workflow-outputs-second-preview>`, and {ref}`25.04 <workflow-outputs-third-preview>`.
Collaborator

Suggested change
- This feature is available as a preview in Nextflow {ref}`24.04 <workflow-outputs-first-preview>`, {ref}`24.10 <workflow-outputs-second-preview>`, and {ref}`25.04 <workflow-outputs-third-preview>`.
+ Workflow outputs are available as a preview in Nextflow {ref}`24.04 <workflow-outputs-first-preview>`, {ref}`24.10 <workflow-outputs-second-preview>`, and {ref}`25.04 <workflow-outputs-third-preview>`.

docs/workflow.md Outdated
# Workflows

- In Nextflow, a **workflow** is a function that is specialized for composing processes and dataflow logic (i.e. channels and operators).
+ In Nextflow, a **workflow** is a function that is specialized for composing {ref}`processes <process-page>` and dataflow logic.
Collaborator

Suggested change
- In Nextflow, a **workflow** is a function that is specialized for composing {ref}`processes <process-page>` and dataflow logic.
+ In Nextflow, a **workflow** is a specialized function for composing {ref}`processes <process-page>` and dataflow logic.

Workflow outputs are intended to replace the {ref}`publishDir <process-publishdir>` directive. See {ref}`migrating-workflow-outputs` for guidance on migrating from `publishDir` to workflow outputs.
:::

A script can define an *output block* which declares the top-level outputs of the workflow. Each output should be assigned in the `publish` section of the entry workflow. Any channel in the workflow can be assigned to an output, including process and subworkflow outputs.
Collaborator

Suggested change
- A script can define an *output block* which declares the top-level outputs of the workflow. Each output should be assigned in the `publish` section of the entry workflow. Any channel in the workflow can be assigned to an output, including process and subworkflow outputs.
+ A script can define an *output block* to declare the top-level workflow outputs. Each output should be assigned in the `publish` section of the entry workflow. Any channel in the workflow can be assigned to an output, including process and subworkflow outputs.

Comment on lines +5 to +11
+ In Nextflow, a **workflow** is a function that is specialized for composing {ref}`processes <process-page>` and dataflow logic:

- See {ref}`syntax-workflow` for a full description of the workflow syntax.
+ - An [entry workflow](#entry-workflow) is the entrypoint of a pipeline. It can take [parameters](#parameters) as inputs using the `params` block, and it can publish [outputs](#outputs) using the `output` block.

- :::{note}
- Workflows were introduced in DSL2. If you are still using DSL1, see {ref}`dsl1-page` for more information about how to migrate your Nextflow pipelines to DSL2.
- :::
+ - A [named workflow](#named-workflows) is a workflow that can be called by other workflows. It can define its own inputs and outputs, which are called *takes* and *emits*.

+ - Both entry workflows and named workflows can contain [dataflow logic](#dataflow) such as calling processes, workflows, and channel operators.
Collaborator

Suggested change
- In Nextflow, a **workflow** is a function that is specialized for composing {ref}`processes <process-page>` and dataflow logic:
- See {ref}`syntax-workflow` for a full description of the workflow syntax.
- - An [entry workflow](#entry-workflow) is the entrypoint of a pipeline. It can take [parameters](#parameters) as inputs using the `params` block, and it can publish [outputs](#outputs) using the `output` block.
- :::{note}
- Workflows were introduced in DSL2. If you are still using DSL1, see {ref}`dsl1-page` for more information about how to migrate your Nextflow pipelines to DSL2.
- :::
- - A [named workflow](#named-workflows) is a workflow that can be called by other workflows. It can define its own inputs and outputs, which are called *takes* and *emits*.
- - Both entry workflows and named workflows can contain [dataflow logic](#dataflow) such as calling processes, workflows, and channel operators.
+ A **workflow** composes {ref}`processes <process-page>` and dataflow logic to define how data flows through your pipeline. A Nextflow script typically includes:
+ - **[Entry workflow](#entry-workflow)**: A main entrypoint that orchestrates the pipeline
+ - **[Parameters](#parameters)**: Configurable inputs
+ - **[Outputs](#outputs)**: Published results
+ - **[Named workflows](#named-workflows)**: Reusable workflow components that can be called by other workflows
+ - **[Dataflow](#dataflow)**: Channels and operators connecting processes
+ For detailed syntax and usage instructions, see {ref}`syntax-workflow`.

IMO, these are relatively novel concepts for a new user. A high-level, scannable list of what each part is helps frame the page.


## Parameters

Parameters can be declared in a Nextflow script with the `params` block or with *legacy* parameter declarations.
Collaborator

Suggested change
Parameters are configurable variables that control pipeline behavior. You can declare parameters with [typed parameters](#typed-parameters) in the `params` block or with [legacy parameters](#legacy-parameters) to customize pipeline behavior at runtime.

Comment on lines +161 to +163
The default output directory is `results` in the launch directory.

By default, all output files are published to the output directory. Each output in the output block can define where files are published using the `path` directive. For example:
Collaborator

Suggested change
- The default output directory is `results` in the launch directory.
- By default, all output files are published to the output directory. Each output in the output block can define where files are published using the `path` directive. For example:
+ The default output directory is `results` in the launch directory.
+ By default, Nextflow publishes all output files to the output directory. Each output in the output block can define where Nextflow publishes files using the `path` directive:

└── ...
```

All files received by an output are published into the specified directory. Lists, maps, and tuples are recursively scanned for nested files. For example:
Collaborator

Suggested change
- All files received by an output are published into the specified directory. Lists, maps, and tuples are recursively scanned for nested files. For example:
+ Nextflow publishes all files received by an output into the specified directory. Nextflow recursively scans lists, maps, and tuples for nested files:


The above example publishes each channel value to a different subdirectory. In this case, each pair of FASTQ files is published into a subdirectory based on the sample ID.

The closure can even define a different path for each individual file using the `>>` operator:
Collaborator

Suggested change
- The closure can even define a different path for each individual file using the `>>` operator:
+ You can define a different path for each individual file using the `>>` operator:

}
```

Each `>>` specifies a *source file* and *publish target*. The source file should be a file or collection of files, and the publish target should be a directory or file name. If the publish target ends with a slash, it is treated as the directory in which source files are published. Otherwise, it is treated as the target filename of a source file. Only files that are published with the `>>` operator are saved to the output directory.
Collaborator

Suggested change
- Each `>>` specifies a *source file* and *publish target*. The source file should be a file or collection of files, and the publish target should be a directory or file name. If the publish target ends with a slash, it is treated as the directory in which source files are published. Otherwise, it is treated as the target filename of a source file. Only files that are published with the `>>` operator are saved to the output directory.
+ Each `>>` specifies a *source file* and *publish target*. The source file should be a file or collection of files, and the publish target should be a directory or file name. If the publish target ends with a slash, Nextflow treats it as the directory in which to publish source files.
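
For reference, the trailing-slash rule described in the original paragraph (a target ending in a slash is a directory, anything else is the target filename) can be sketched as below. This is only a Python illustration of the rule as worded in the docs, not Nextflow code and not Nextflow's implementation. The function name `resolve_publish_path` and the example paths are hypothetical.

```python
from pathlib import PurePosixPath


def resolve_publish_path(source: str, target: str) -> str:
    """Return the path, relative to the output directory, for one `source >> target` pair.

    Hypothetical helper that mirrors the documented rule only.
    """
    if target.endswith("/"):
        # Trailing slash: the target is a directory, so the source keeps its file name.
        return str(PurePosixPath(target) / PurePosixPath(source).name)
    # No trailing slash: the target is the file name the source is published as.
    return str(PurePosixPath(target))


# Assumed example paths, for illustration only.
print(resolve_publish_path("work/ab/123/sample1_1.fastq", "fastq/sample1/"))
# prints: fastq/sample1/sample1_1.fastq
print(resolve_publish_path("work/ab/123/sample1_1.fastq", "fastq/sample1_R1.fastq"))
# prints: fastq/sample1_R1.fastq
```

The second call covers the "otherwise" case that the original paragraph spells out: the source file is renamed to the target filename.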


### Index files

Each output can create an index file of the values that were published. An index file preserves the structure of channel values, including metadata, which is simpler than encoding this information with directories and file names. The index file can be a CSV (`.csv`), JSON (`.json`), or YAML (`.yml`, `.yaml`) file. The channel values should be files, lists, maps, or tuples.
Collaborator

Suggested change
- Each output can create an index file of the values that were published. An index file preserves the structure of channel values, including metadata, which is simpler than encoding this information with directories and file names. The index file can be a CSV (`.csv`), JSON (`.json`), or YAML (`.yml`, `.yaml`) file. The channel values should be files, lists, maps, or tuples.
+ Index files are structured metadata files that catalog published outputs and their associated metadata. An index file preserves the structure of channel values, including metadata, which is simpler than encoding this information with directories and file names. The index file can be a CSV (`.csv`), JSON (`.json`), or YAML (`.yml`, `.yaml`) file. The channel values should be files, lists, maps, or tuples.

@christopher-hakkaart
Collaborator

I started working on a PR here before the holidays: https://github.com/christopher-hakkaart/nextflow/tree/chris-docs-workflow-page
