
Commit 1b651d5

Authored by Daman Mulye (damanm24) and Copilot
Docs: flowey: Add developer/contributor guide for working with flowey (#2278)
Adding more content to our flowey documentation to help ease ramp-up for contributors in this area.

Co-authored-by: Daman Mulye <daman.mulye@hotmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 parent b27f087 commit 1b651d5

File tree

12 files changed: +969 −5 lines


Guide/src/SUMMARY.md

Lines changed: 7 additions & 0 deletions
@@ -35,6 +35,13 @@
  - [Running Fuzzers](./dev_guide/tests/fuzzing/running.md)
  - [Writing Fuzzers](./dev_guide/tests/fuzzing/writing.md)
  - [Developer Tools / Utilities](./dev_guide/dev_tools.md)
+ - [`flowey`](./dev_guide/dev_tools/flowey.md)
+   - [`Flowey Fundamentals`](./dev_guide/dev_tools/flowey/flowey_fundamentals.md)
+   - [`Steps`](./dev_guide/dev_tools/flowey/steps.md)
+   - [`Variables`](./dev_guide/dev_tools/flowey/variables.md)
+   - [`Nodes`](./dev_guide/dev_tools/flowey/nodes.md)
+   - [`Artifacts`](./dev_guide/dev_tools/flowey/artifacts.md)
+   - [`Pipelines`](./dev_guide/dev_tools/flowey/pipelines.md)
  - [`cargo xtask`](./dev_guide/dev_tools/xtask.md)
  - [`cargo xflowey`](./dev_guide/dev_tools/xflowey.md)
  - [VmgsTool](./dev_guide/dev_tools/vmgstool.md)
Guide/src/dev_guide/dev_tools/flowey.md

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
# Flowey

Flowey is an in-house Rust library for writing maintainable, cross-platform automation. It enables developers to define CI/CD pipelines and local workflows as type-safe Rust code that can generate backend-specific YAML (Azure DevOps, GitHub Actions) or execute directly on a local machine. Rather than writing automation logic in YAML with implicit dependencies, flowey treats automation as first-class Rust code with explicit, typed dependencies tracked through a directed acyclic graph (DAG).

## Why Flowey?

Traditional CI/CD pipelines using YAML-based configuration (e.g., Azure DevOps Pipelines, GitHub Actions workflows) have several fundamental limitations that become increasingly problematic as projects grow in complexity:

### The Problems with Traditional YAML Pipelines

#### Non-Local Reasoning and Global State

- YAML pipelines rely heavily on global state and implicit dependencies (environment variables, file system state, installed tools)
- Understanding what a step does often requires mentally tracking state mutations across the entire pipeline
- Debugging requires reasoning about the entire pipeline context rather than isolated units of work
- Changes in one part of the pipeline can have unexpected effects in distant, seemingly unrelated parts

#### Maintainability Challenges

- YAML lacks type safety, making it easy to introduce subtle bugs (typos in variable names, incorrect data types, etc.)
- No compile-time validation means errors only surface at runtime
- Refactoring is risky and error-prone without automated tools to catch breaking changes
- Code duplication is common because YAML lacks good abstraction mechanisms
- Testing pipeline logic requires actually running the pipeline, making iteration slow and expensive

#### Platform Lock-In

- Pipelines are tightly coupled to their specific CI backend (ADO, GitHub Actions, etc.)
- Multi-platform support means maintaining multiple, divergent YAML files

#### Local Development Gaps

- Developers can't easily test pipeline changes before pushing to CI
- Reproducing CI failures locally is difficult or impossible
- The feedback loop is slow: push → wait for CI → debug → repeat

### Flowey's Solution

Flowey addresses these issues by treating automation as **first-class Rust code**:

- **Type Safety**: Rust's type system catches errors at compile-time rather than runtime
- **Local Reasoning**: Dependencies are explicit through typed variables, not implicit through global state
- **Portability**: Write once, generate YAML for any backend (ADO, GitHub Actions, or run locally)
- **Reusability**: Nodes are composable building blocks that can be shared across pipelines

## Flowey's Directory Structure

Flowey is architected as a standalone tool with a layered crate structure that separates project-agnostic core functionality from project-specific implementations:

- **`flowey_core`**: Provides the core types and traits shared between user-facing and internal flowey code, such as the essential abstractions for nodes and pipelines.
- **`flowey`**: Thin wrapper around `flowey_core` that exposes the public API for defining nodes and pipelines.
- **`flowey_cli`**: Command-line interface for running flowey; handles YAML generation, local execution, and pipeline orchestration.
- **`schema_ado_yaml`**: Rust types for Azure DevOps YAML schemas, used during pipeline generation.
- **`flowey_lib_common`**: Ecosystem-wide reusable nodes (installing Rust, running Cargo, downloading tools, etc.) that could be useful across projects outside of OpenVMM.
- **`flowey_lib_hvlite`**: OpenVMM-specific nodes and workflows that build on the common library primitives.
- **`flowey_hvlite`**: The OpenVMM pipeline definitions that compose nodes from the libraries above into complete CI/CD workflows.
Guide/src/dev_guide/dev_tools/flowey/artifacts.md

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
# Artifacts

Artifacts enable typed data transfer between jobs with automatic dependency management, abstracting away CI system complexities like name collisions and manual job ordering.

## Typed vs Untyped Artifacts

**Typed artifacts (recommended)** provide type-safe artifact handling by defining a custom type that implements the `Artifact` trait:

```rust
#[derive(Serialize, Deserialize)]
struct MyArtifact {
    // Each field maps to a file inside the artifact directory.
    #[serde(rename = "output.bin")]
    binary: PathBuf,
    #[serde(rename = "metadata.json")]
    metadata: PathBuf,
}

impl Artifact for MyArtifact {}

let (pub_artifact, use_artifact) = pipeline.new_typed_artifact("my-files");
```

**Untyped artifacts** provide plain directory-based handling for simpler cases:

```rust
let (pub_artifact, use_artifact) = pipeline.new_artifact("my-files");
```

For detailed examples of defining and using artifacts, see the [Artifact trait documentation](https://openvmm.dev/rustdoc/linux/flowey_core/pipeline/trait.Artifact.html).

Both `pipeline.new_typed_artifact("name")` and `pipeline.new_artifact("name")` return a tuple of handles: `(pub_artifact, use_artifact)`. When defining a job, you convert them with the job context:

```rust
// In a producing job:
let artifact_out = ctx.publish_artifact(pub_artifact);
// artifact_out: WriteVar<MyArtifact> (typed)
//               or WriteVar<PathBuf> for untyped

// In a consuming job:
let artifact_in = ctx.use_artifact(use_artifact);
// artifact_in: ReadVar<MyArtifact> (typed)
//              or ReadVar<PathBuf> for untyped
```

After conversion, you treat the returned `WriteVar` / `ReadVar` like any other flowey variable (claim them in steps, write/read values).
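For instance, a producing step inside a node might claim the lifted handle and write the artifact value once its files exist on disk. A rough sketch (not verbatim from the codebase), assuming the node-step API referenced elsewhere in this guide (`emit_rust_step`, `claim`, `rt.write`); the step name and paths are illustrative:

```rust
// Sketch: `artifact_out` is the WriteVar<MyArtifact> obtained above.
ctx.emit_rust_step("publish my-files", |ctx| {
    let artifact_out = artifact_out.claim(ctx);
    move |rt| {
        // A real step would produce these files itself; the paths below
        // are placeholders.
        rt.write(
            artifact_out,
            &MyArtifact {
                binary: "out/output.bin".into(),
                metadata: "out/metadata.json".into(),
            },
        );
        Ok(())
    }
});
```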
Key concepts:

- The `Artifact` trait works by serializing your type to JSON in a format that reflects a directory structure
- Use `#[serde(rename = "file.exe")]` to specify exact file names
- Typed artifacts ensure compile-time type safety when passing data between jobs
- Untyped artifacts are simpler but don't provide type guarantees
- Tuple handles must be lifted with `ctx.publish_artifact(...)` / `ctx.use_artifact(...)` to become flowey variables
## How Flowey Manages Artifacts Under the Hood

During the **pipeline resolution phase** (build-time), flowey:

1. **Identifies artifact producers and consumers** by analyzing which jobs write to vs. read from each artifact's `WriteVar`/`ReadVar`
2. **Constructs the job dependency graph**, ensuring producers run before consumers
3. **Generates backend-specific upload/download steps** in the appropriate places:
   - For ADO: uses `PublishPipelineArtifact` and `DownloadPipelineArtifact` tasks
   - For GitHub Actions: uses `actions/upload-artifact` and `actions/download-artifact`
   - For local execution: uses filesystem copying

At **runtime**, the artifact `ReadVar<PathBuf>` and `WriteVar<PathBuf>` work just like any other flowey variable:

- Producing jobs write artifact files to the path from `WriteVar<PathBuf>`
- Flowey automatically uploads those files as an artifact
- Consuming jobs read the path from `ReadVar<PathBuf>`, where flowey has downloaded the artifact
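The consuming side is symmetric. A minimal sketch for an untyped artifact, assuming `artifact_in` is the `ReadVar<PathBuf>` from `ctx.use_artifact` and the file name matches what the producer wrote:

```rust
ctx.emit_rust_step("consume my-files", |ctx| {
    let artifact_dir = artifact_in.claim(ctx);
    move |rt| {
        // Files flowey downloaded for this artifact live under `dir`.
        let dir = rt.read(artifact_dir);
        let binary = dir.join("output.bin");
        // ... use `binary` ...
        Ok(())
    }
});
```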
Guide/src/dev_guide/dev_tools/flowey/flowey_fundamentals.md

Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
# Flowey Fundamentals

Before diving into how flowey works, let's establish the key building blocks that form the foundation of flowey's automation model. These concepts are flowey's Rust-based abstractions for common CI/CD workflow primitives.

## The Automation Workflow Model

In traditional CI/CD systems, workflows are defined using YAML with implicit dependencies and global state. Flowey takes a fundamentally different approach: **automation workflows are modeled as a directed acyclic graph (DAG) of typed, composable Rust components**. Each component has explicit inputs and outputs, and dependencies are tracked through the type system.

### Core Building Blocks

Flowey's model consists of a hierarchy of components:

**[Pipelines](https://openvmm.dev/rustdoc/linux/flowey_core/pipeline/trait.IntoPipeline.html)** are the top-level construct that defines a complete automation workflow. A pipeline specifies what work needs to be done and how it should be organized. Pipelines can target different execution backends (local machine, Azure DevOps, GitHub Actions) and generate appropriate configuration for each.

**[Jobs](https://openvmm.dev/rustdoc/linux/flowey_core/pipeline/struct.PipelineJob.html)** represent units of work that run on a specific platform (Windows, Linux, macOS) and architecture (x86_64, Aarch64). Jobs can run in parallel when they don't depend on each other, or sequentially when one job's output is needed by another. Each job is isolated and runs in its own environment.

**[Nodes](https://openvmm.dev/rustdoc/linux/flowey_core/node/trait.FlowNode.html)** are reusable units of automation logic that perform specific tasks (e.g., "install Rust toolchain", "run cargo build", "publish test results"). Nodes are invoked by jobs and emit one or more steps to accomplish their purpose. Nodes can depend on other nodes, forming a composable ecosystem of automation building blocks.

**Steps** are the individual units of work that execute at runtime. A step might run a shell command, execute Rust code, or interact with the CI backend. Steps are emitted by nodes during the build-time phase and executed in dependency order during runtime.

### Connecting the Pieces

These building blocks are connected through three key mechanisms:

**[Variables (`ReadVar`/`WriteVar`)](https://openvmm.dev/rustdoc/linux/flowey/node/prelude/struct.ReadVar.html)** enable data flow between steps. A `WriteVar<T>` represents a promise to produce a value of type `T` at runtime, while a `ReadVar<T>` represents a dependency on that value. Variables enforce write-once semantics (each value has exactly one producer) and create explicit dependencies in the DAG. For example, a "build" step might write a binary path to a `WriteVar<PathBuf>`, and a "test" step would read from the corresponding `ReadVar<PathBuf>`. This echoes Rust's "shared XOR mutable" ownership rule: a value has either one writer or multiple readers, never both concurrently.
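To make that concrete, here is a minimal sketch of the build/test example inside a node, assuming flowey's node prelude is in scope and using the step API mentioned later on this page (`ctx.new_var`, `emit_rust_step`, `claim`, `rt.read`/`rt.write`); the step names and binary path are illustrative:

```rust
// One producer, any number of readers: claiming the variable in each step
// is what wires the dependency edge into the DAG.
let (binary_path, write_binary_path) = ctx.new_var::<PathBuf>();

ctx.emit_rust_step("build project", |ctx| {
    let write_binary_path = write_binary_path.claim(ctx);
    move |rt| {
        // ... run the build here ...
        rt.write(write_binary_path, &PathBuf::from("target/debug/app"));
        Ok(())
    }
});

ctx.emit_rust_step("run tests", |ctx| {
    let binary_path = binary_path.claim(ctx);
    move |rt| {
        let binary = rt.read(binary_path);
        // ... run tests against `binary` ...
        Ok(())
    }
});
```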
**[Artifacts](https://openvmm.dev/rustdoc/linux/flowey_core/pipeline/trait.Artifact.html)** enable data transfer between jobs. Since jobs may run on different machines or at different times, artifacts package up files (like compiled binaries, test results, or build outputs) for transfer. Flowey automatically handles uploading artifacts at the end of producing jobs and downloading them at the start of consuming jobs, abstracting away backend-specific artifact APIs.

**[Side Effects](https://openvmm.dev/rustdoc/linux/flowey/node/prelude/type.SideEffect.html)** represent dependencies without data. Sometimes step B needs to run after step A, but A doesn't produce any data that B consumes (e.g., "install dependencies" must happen before "run tests", even though the test step doesn't directly use the installation output). Side effects are represented as `ReadVar<SideEffect>` and establish ordering constraints in the DAG without transferring actual values.
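In code, a side-effect dependency looks like a variable dependency without a payload. A minimal sketch, assuming `rust_installed` is a `ReadVar<SideEffect>` produced by an earlier "install Rust" step:

```rust
ctx.emit_rust_step("run tests", |ctx| {
    // Claiming the side effect orders this step after the installer,
    // even though no value is read from it.
    let _rust_installed = rust_installed.claim(ctx);
    move |_rt| {
        // ... the toolchain can be assumed present here ...
        Ok(())
    }
});
```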
### Putting It Together

Here's an example of how these pieces relate:

```txt
Pipeline
├─ Job 1 (Linux x86_64)
│  ├─ Node A (install Rust)
│  │  └─ Step: Run rustup install
│  │     └─ Produces: WriteVar<SideEffect> (installation complete)
│  └─ Node B (build project)
│     └─ Step: Run cargo build
│        ├─ Consumes: ReadVar<SideEffect> (installation complete)
│        └─ Produces: WriteVar<PathBuf> (binary path) → Artifact
│
└─ Job 2 (Windows x86_64)
   └─ Node C (run tests)
      └─ Step: Run binary with test inputs
         ├─ Consumes: ReadVar<PathBuf> (binary path) ← Artifact
         └─ Produces: WriteVar<PathBuf> (test results)
```

In this example:

- The **Pipeline** defines two jobs that run on different platforms
- **Job 1** installs Rust and builds the project, with step dependencies expressed through variables
- **Job 2** runs tests using the binary from Job 1, with the binary transferred via an artifact
- **Variables** create dependencies within a job (build depends on install)
- **Artifacts** create dependencies between jobs (Job 2 depends on Job 1's output)
- **Side Effects** represent the "Rust is installed" state without carrying data
## Two-Phase Execution Model

Flowey operates in two distinct phases:

1. **Build-Time (Resolution Phase)**: When you run `cargo xflowey regen`, flowey:
   - Reads `.flowey.toml` to determine which pipelines to regenerate
   - Builds the flowey binary (e.g., `flowey-hvlite`) via `cargo build`
   - Runs the flowey binary with `pipeline <backend> --out <file> <cmd>` for each pipeline definition
   - During this invocation, flowey constructs a **directed acyclic graph (DAG)** by:
     - Instantiating all nodes (reusable units of automation logic) defined in the pipeline
     - Processing their requests
     - Resolving dependencies between nodes via variables and artifacts
     - Determining the execution order
     - Performing flowey-specific validations (dependency resolution, type checking, etc.)
   - Generates YAML files for CI systems (ADO, GitHub Actions) at the paths specified in `.flowey.toml`

2. **Runtime (Execution Phase)**: The generated YAML is executed by the CI system (or locally via `cargo xflowey <pipeline>`). Steps (units of work) run in the order determined at build-time:
   - Variables are read and written with actual values
   - Commands are executed
   - Artifacts (data packages passed between jobs) are published/consumed
   - Side effects (dependencies) are resolved

The `.flowey.toml` file at the repo root defines which pipelines to generate and where. For example:

```toml
[[pipeline.flowey_hvlite.github]]
file = ".github/workflows/openvmm-pr.yaml"
cmd = ["ci", "checkin-gates", "--config=pr"]
```

When you run `cargo xflowey regen`:

1. It reads `.flowey.toml`
2. Builds the `flowey-hvlite` binary
3. Runs `flowey-hvlite pipeline github --out .github/workflows/openvmm-pr.yaml ci checkin-gates --config=pr`
4. This generates/updates the YAML file with the resolved pipeline

**Key Distinction:**

- `cargo build -p flowey-hvlite`: Only compiles the flowey code to verify it builds successfully. **Does not** construct the DAG or generate YAML files.
- `cargo xflowey regen`: Compiles the code **and** runs the full build-time resolution to construct the DAG, validate the pipeline, and regenerate all YAML files defined in `.flowey.toml`.

Always run `cargo xflowey regen` after modifying pipeline definitions to ensure the generated YAML files reflect your changes.
### Backend Abstraction

Flowey supports multiple execution backends:

- **Local**: Runs directly on your development machine
- **ADO (Azure DevOps)**: Generates ADO Pipeline YAML
- **GitHub Actions**: Generates GitHub Actions workflow YAML

```admonish warning
Nodes should be written to work across ALL backends whenever possible. Relying on `ctx.backend()` to query the backend, or manually emitting backend-specific steps (via `emit_ado_step` or `emit_gh_step`), should be avoided unless absolutely necessary. Most automation logic should be backend-agnostic, using `emit_rust_step` for cross-platform Rust code that works everywhere. Writing cross-platform flowey code enables testing pipelines locally, which can be invaluable when iterating on CI changes.
```

If a node only supports certain backends, it should immediately fast-fail with a clear error ("`<Node>` not supported on `<backend>`") instead of silently proceeding. That failure signals that it's time either to add the missing backend support or to introduce a multi-platform abstraction/meta-node that delegates to platform-specific nodes.
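A sketch of that fast-fail pattern inside a node's `emit` function, assuming the `FlowBackend` variants implied above (the node name and match arms are illustrative):

```rust
// Bail out early on backends this node doesn't support yet.
match ctx.backend() {
    FlowBackend::Ado => {
        // ... emit ADO-specific steps ...
    }
    FlowBackend::Github | FlowBackend::Local => {
        anyhow::bail!("`my_node` not supported on this backend");
    }
}
```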
