1 change: 1 addition & 0 deletions README.md
@@ -167,6 +167,7 @@ docs/
- [docs/journey-state-machine.md](./docs/journey-state-machine.md) - Journey states and transitions
- [docs/genspec-format.md](./docs/genspec-format.md) - Genspec format reference
- [docs/testing-strategy.md](./docs/testing-strategy.md) - Testing approach
- [docs/adr/README.md](./docs/adr/README.md) - Architecture decision records
- [docs/architecture-roadmap.md](./docs/architecture-roadmap.md) - Architecture roadmap

## How It Works
1 change: 1 addition & 0 deletions docs/README.md
@@ -13,6 +13,7 @@ This folder contains the product, architecture, and testing references for Waypo
- [journey-state-machine.md](./journey-state-machine.md) - Journey states and transitions
- [architecture-roadmap.md](./architecture-roadmap.md) - Long-term architecture plan
- [unix-architecture-plan.md](./unix-architecture-plan.md) - UNIX-style architecture notes
- [adr/README.md](./adr/README.md) - Architecture decision records

## Protocols and Formats

28 changes: 28 additions & 0 deletions docs/adr/0001-execution-controller.md
@@ -0,0 +1,28 @@
# ADR 0001: Extract Execution Controller

Date: 2026-02-05
Status: Accepted

## Context

The FLY phase mixed UI, orchestration, execution, and state transitions inside
`src/waypoints/tui/screens/fly.py`. This coupling made the execution flow harder
to test, reason about, and evolve. A dedicated orchestration boundary was
needed to align with the “bicycle” philosophy and centralize execution logic.

## Decision

Introduce `ExecutionController` in `src/waypoints/orchestration/` to own:
- Execution state transitions
- Waypoint selection and sequencing
- Result handling and intervention flow

Move `ExecutionState` into `src/waypoints/fly/state.py` to make it a shared
execution concept rather than a UI-local enum.

## Consequences

- FLY screen becomes thinner and more focused on UI concerns.
- Execution logic is testable in isolation with unit tests.
- Additional orchestration features (rollback, richer reports) have a clear
home without bloating the UI layer.
28 changes: 28 additions & 0 deletions docs/adr/0002-flight-test-harness.md
@@ -0,0 +1,28 @@
# ADR 0002: Flight Test Harness

Date: 2026-02-05
Status: Accepted

## Context

The testing strategy defined flight tests (L0–L5) but lacked operational tooling.
To improve iteration discipline, we needed a repeatable harness that records
results and validates generated projects against minimal expectations.

## Decision

Add `scripts/run_flight_test.py` to execute a flight test against an existing
project directory. The runner:
- Creates timestamped results directories
- Validates minimum expected files
- Runs optional smoke tests
- Writes a `meta.json` summary

Seed L0–L2 fixtures under `flight-tests/` to make the harness immediately usable.
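
A minimal sketch of the runner behaviour listed in this decision, not the contents of `scripts/run_flight_test.py`. The function name `run_flight_test`, the `MIN_EXPECTED_FILES` tuple, and the `meta.json` field names are assumptions made for illustration; the optional smoke-test step is omitted.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical minimum file list; the real expectations are not stated in this ADR.
MIN_EXPECTED_FILES = ("README.md",)


def run_flight_test(project_dir, results_root, expected_files=MIN_EXPECTED_FILES):
    """Validate an existing project directory and record a timestamped summary."""
    project_dir = Path(project_dir)
    # One results directory per run, named by a UTC timestamp.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    results_dir = Path(results_root) / stamp
    results_dir.mkdir(parents=True, exist_ok=True)

    # Minimum-file validation: collect every expected file that is absent.
    missing = [name for name in expected_files if not (project_dir / name).exists()]

    meta = {
        "project": str(project_dir),
        "timestamp": stamp,
        "missing_files": missing,
        "passed": not missing,
    }
    (results_dir / "meta.json").write_text(json.dumps(meta, indent=2))
    return meta
```

Returning the same dictionary that is written to `meta.json` keeps the on-disk audit trail and the in-process result identical.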

## Consequences

- Provides a repeatable baseline for flight test validation.
- Creates an audit trail for regressions and improvements.
- Keeps generation concerns decoupled from validation so the harness is usable
before full automation is in place.
21 changes: 21 additions & 0 deletions docs/adr/0003-execution-report.md
@@ -0,0 +1,21 @@
# ADR 0003: Execution Report Model

Date: 2026-02-05
Status: Accepted

## Context

Execution outcomes were logged but lacked a structured report for summarizing
waypoint attempts. This made it hard to aggregate metrics or build future
observability features on top of execution artifacts.

## Decision

Introduce `ExecutionReport` as a structured summary of a waypoint execution
attempt, capturing result, timestamps, and completion data.
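
As a rough sketch of what such a model could look like, assuming a frozen dataclass: the field names (`waypoint_id`, `result`, `started_at`, `completed_at`, `iterations`) and the helper methods are hypothetical and may not match the actual `ExecutionReport`.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ExecutionReport:
    """Structured summary of one waypoint execution attempt (illustrative fields)."""

    waypoint_id: str
    result: str                       # e.g. "SUCCESS" or "FAILED"
    started_at: datetime
    completed_at: Optional[datetime] = None
    iterations: int = 0               # hypothetical completion datum

    @property
    def duration_seconds(self) -> Optional[float]:
        # None while the attempt has no completion timestamp.
        if self.completed_at is None:
            return None
        return (self.completed_at - self.started_at).total_seconds()

    def to_dict(self) -> dict:
        # JSON-friendly form, so reports can be aggregated without parsing logs.
        data = asdict(self)
        data["started_at"] = self.started_at.isoformat()
        data["completed_at"] = (
            self.completed_at.isoformat() if self.completed_at else None
        )
        return data
```

Nothing in the sketch imports from a UI package, which is the independence this ADR asks for.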

## Consequences

- Establishes a durable schema for execution summaries.
- Enables future aggregation and reporting without parsing logs.
- Keeps the report model independent of UI layers.
9 changes: 9 additions & 0 deletions docs/adr/README.md
@@ -0,0 +1,9 @@
# Architecture Decision Records

This directory captures the key architectural decisions for Waypoints.

## Index

- [ADR 0001: Extract Execution Controller](./0001-execution-controller.md)
- [ADR 0002: Flight Test Harness](./0002-flight-test-harness.md)
- [ADR 0003: Execution Report Model](./0003-execution-report.md)
95 changes: 95 additions & 0 deletions docs/analysis/fly-callgraph.md
@@ -0,0 +1,95 @@
# FLY Call Graph (2026-02-05)

This document maps the current FLY execution flow from UI actions down to orchestration and execution. The goal is to identify extraction boundaries for a dedicated execution controller.

## Entry Points (User Actions)

- `FlyScreen.on_mount()`
  - `coordinator.reset_stale_in_progress()`
  - `_refresh_waypoint_list()`
  - `_select_next_waypoint(include_in_progress=True)`
  - `_update_git_status()` + timer
  - `_update_project_metrics()`

- `action_start()`
  - Handles retry of selected failed waypoint
  - Handles resume from `PAUSED`
  - Handles start from `READY` or after `CHART_REVIEW` / `LAND_REVIEW`
  - Transitions via `coordinator.transition(...)`
  - Sets `execution_state = RUNNING`
  - `_execute_current_waypoint()`

- `action_pause()`
  - Sets `execution_state = PAUSE_PENDING`
  - Cancels executor if running (logs pause)

- `action_skip()`
  - Marks current waypoint skipped (via selection change)
  - `_select_next_waypoint()`

- `action_back()`
  - Transitions `FLY_* -> CHART_REVIEW`
  - Switches phase to `chart`

- `action_forward()`
  - Validates `LAND_REVIEW` availability
  - `coordinator.transition(LAND_REVIEW)` + `_switch_to_land_screen()`

- Intervention flow
  - `_handle_intervention(...)` → `InterventionModal` → `_on_intervention_result(...)`

## Execution Flow

- `_execute_current_waypoint()`
  - Marks waypoint `IN_PROGRESS` + saves flight plan
  - Builds `WaypointExecutor` with callbacks and limits
  - `run_worker(self._run_executor())`

- `_run_executor()`
  - `WaypointExecutor.execute()` → returns `ExecutionResult`

- `on_worker_state_changed()`
  - Handles `InterventionNeededError` or other failures
  - Calls `_handle_execution_result(result)`

- `_handle_execution_result(result)`
  - SUCCESS
    - Mark COMPLETE + save
    - Commit via git (receipt validation)
    - Parent epic check
    - Select next waypoint
    - If all complete: transition `LAND_REVIEW`
  - INTERVENTION_NEEDED / MAX_ITERATIONS / FAILED
    - Mark FAILED
    - Transition `FLY_INTERVENTION`
  - CANCELLED
    - Transition `FLY_PAUSED`
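
The branching in `_handle_execution_result(result)` can be sketched as a pure function that returns what should happen, leaving the side effects (saving, committing, selecting the next waypoint) to the caller. The `ExecutionResult` enum here is a stand-in and `plan_result_handling` is a hypothetical name; neither is taken from the codebase.

```python
from enum import Enum
from typing import Optional, Tuple


class ExecutionResult(Enum):
    """Stand-in for the result values named in the call graph."""

    SUCCESS = "success"
    INTERVENTION_NEEDED = "intervention_needed"
    MAX_ITERATIONS = "max_iterations"
    FAILED = "failed"
    CANCELLED = "cancelled"


def plan_result_handling(
    result: ExecutionResult, all_complete: bool
) -> Tuple[Optional[str], Optional[str]]:
    """Return (new waypoint status, journey transition); None means unchanged."""
    if result is ExecutionResult.SUCCESS:
        # Only move to LAND_REVIEW once every waypoint is complete.
        return "COMPLETE", ("LAND_REVIEW" if all_complete else None)
    if result in (
        ExecutionResult.INTERVENTION_NEEDED,
        ExecutionResult.MAX_ITERATIONS,
        ExecutionResult.FAILED,
    ):
        return "FAILED", "FLY_INTERVENTION"
    if result is ExecutionResult.CANCELLED:
        return None, "FLY_PAUSED"
    # Unknown results fail loudly instead of being ignored.
    raise ValueError(f"unhandled execution result: {result!r}")
```

Keeping the decision separate from its side effects makes each branch checkable with a one-line unit test.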

## Cross-Cutting Services

- `JourneyCoordinator`
  - Transition validation and persistence
  - Waypoint selection and completion checks

- `WaypointExecutor`
  - Iterative execution loop
  - Calls progress callback with `ExecutionContext`

- `ExecutionLogReader` / `ExecutionLogWriter`
  - Audit trail for each waypoint

- `GitService` + `ReceiptValidator`
  - Receipt validation
  - Commit/tag integration

---

## Extraction Boundary (Target)

Introduce `ExecutionController` to own the flow currently distributed across `FlyScreen`:
- `start / pause / resume / skip / retry`
- State transitions
- Selection logic + execution sequencing
- Handling of `ExecutionResult`

`FlyScreen` should become a thin UI layer: inputs, rendering, and modal handling.
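
One possible shape for that boundary, sketched under assumptions: the controller receives a `transition` callable (standing in for `coordinator.transition(...)`) and never touches UI objects. The `IDLE` member, the method names, and the use of plain strings for journey states are illustrative only.

```python
from enum import Enum, auto
from typing import Callable


class ExecutionState(Enum):
    IDLE = auto()           # hypothetical initial state
    RUNNING = auto()
    PAUSE_PENDING = auto()
    PAUSED = auto()


class ExecutionController:
    """Owns execution state so the screen only forwards user input."""

    def __init__(self, transition: Callable[[str], None]) -> None:
        # `transition` is injected, which keeps the controller UI-free.
        self._transition = transition
        self.state = ExecutionState.IDLE

    def start(self) -> None:
        # Covers both a fresh start and a resume from PAUSED.
        if self.state is ExecutionState.RUNNING:
            raise RuntimeError("execution is already running")
        self._transition("FLY_EXECUTING")
        self.state = ExecutionState.RUNNING

    def pause(self) -> None:
        # Pausing is cooperative: the executor is cancelled first, and the
        # PAUSED state is entered once it reports cancellation.
        if self.state is ExecutionState.RUNNING:
            self.state = ExecutionState.PAUSE_PENDING

    def on_cancelled(self) -> None:
        self._transition("FLY_PAUSED")
        self.state = ExecutionState.PAUSED
```

In a test, passing `calls.append` as the `transition` callable is enough to observe the requested transitions without mounting a screen.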
45 changes: 45 additions & 0 deletions docs/analysis/fly-invariants.md
@@ -0,0 +1,45 @@
# FLY Invariants (2026-02-05)

These invariants define expected behavior in FLY execution. They should be preserved during refactor and enforced through tests.

## State and Transition Invariants

- `JourneyCoordinator.transition(...)` is the single source of truth for journey state transitions.
- `ExecutionState` is a UI execution mode, but must be consistent with `JourneyState`:
  - `ExecutionState.RUNNING` implies `JourneyState.FLY_EXECUTING`.
  - `ExecutionState.PAUSED` implies `JourneyState.FLY_PAUSED`.
  - `ExecutionState.INTERVENTION` implies `JourneyState.FLY_INTERVENTION`.
  - `ExecutionState.DONE` implies all waypoints complete and `JourneyState.LAND_REVIEW` is reachable.
- Non-recoverable states should not be persisted as resume checkpoints.
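
The three direct implications above lend themselves to a table-driven check that a test can assert after every transition. This is a sketch: both enums are reduced stand-ins containing only members named in these documents, and `is_consistent` is a hypothetical helper. The `DONE` rule is left out because it depends on waypoint data rather than on a single journey state.

```python
from enum import Enum, auto


class ExecutionState(Enum):
    RUNNING = auto()
    PAUSE_PENDING = auto()
    PAUSED = auto()
    INTERVENTION = auto()
    DONE = auto()


class JourneyState(Enum):
    CHART_REVIEW = auto()
    FLY_EXECUTING = auto()
    FLY_PAUSED = auto()
    FLY_INTERVENTION = auto()
    LAND_REVIEW = auto()


# Execution states that pin the journey to exactly one state.
IMPLIED_JOURNEY_STATE = {
    ExecutionState.RUNNING: JourneyState.FLY_EXECUTING,
    ExecutionState.PAUSED: JourneyState.FLY_PAUSED,
    ExecutionState.INTERVENTION: JourneyState.FLY_INTERVENTION,
}


def is_consistent(execution: ExecutionState, journey: JourneyState) -> bool:
    """True when the pair does not violate any of the direct implications."""
    expected = IMPLIED_JOURNEY_STATE.get(execution)
    # States without an entry (e.g. PAUSE_PENDING) place no constraint here.
    return expected is None or expected is journey
```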

## Waypoint Status Invariants

- When execution starts, current waypoint becomes `IN_PROGRESS`.
- On success, waypoint must be marked `COMPLETE`, persisted, and logged.
- On intervention or failure, waypoint must be marked `FAILED` (or `SKIPPED` for explicit skips).
- Parent epic completion is checked after a child completes, but epics are not auto-completed.

## Selection Invariants

- Selection prefers resumable waypoints (`IN_PROGRESS`, `FAILED`) when resuming.
- Selection should not allow a waypoint whose dependencies are incomplete.
- Epics become eligible only when all children complete.
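
A small sketch of selection logic that respects these three rules. The `Waypoint` dataclass, its `depends_on` and `child_ids` fields, and `select_next` are hypothetical and kept deliberately small; the real selection lives in `JourneyCoordinator`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Waypoint:
    id: str
    status: str = "PENDING"
    depends_on: List[str] = field(default_factory=list)
    child_ids: List[str] = field(default_factory=list)  # non-empty for epics


def select_next(waypoints: List[Waypoint], resuming: bool = False) -> Optional[Waypoint]:
    """Pick the next runnable waypoint, or None if nothing is eligible."""
    by_id = {w.id: w for w in waypoints}

    def all_complete(ids: List[str]) -> bool:
        return all(by_id[i].status == "COMPLETE" for i in ids)

    # Eligible: every dependency is complete and, for epics, every child too.
    eligible = [
        w for w in waypoints
        if all_complete(w.depends_on) and all_complete(w.child_ids)
    ]
    if resuming:
        # Prefer picking up interrupted or failed work first.
        for w in eligible:
            if w.status in ("IN_PROGRESS", "FAILED"):
                return w
    for w in eligible:
        if w.status == "PENDING":
            return w
    return None
```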

## Execution Invariants

- Execution uses `WaypointExecutor` exclusively.
- UI must remain responsive (execution runs in background worker).
- Progress updates are handled on main thread via `call_later`.
- `ExecutionResult` drives state transitions; no silent fall-through.

## Logging and Metrics Invariants

- Each waypoint execution produces an execution log.
- Cost and token metrics are updated after each waypoint.
- Receipt validation must occur before auto-commit.

## Recovery Invariants

- Stale `IN_PROGRESS` waypoints are reset to `PENDING` on screen mount.
- Intervention must surface a modal with explicit user action choices.
- Rollback is best-effort and must not corrupt the flight plan state.
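
The stale-reset rule can be illustrated with a plain mapping of waypoint id to status. This is a stand-in for illustration, not the implementation of `coordinator.reset_stale_in_progress()`; the free function and the dictionary shape are assumptions.

```python
from typing import Dict, List


def reset_stale_in_progress(statuses: Dict[str, str]) -> List[str]:
    """Reset waypoints left IN_PROGRESS by an earlier session to PENDING.

    Mutates `statuses` in place and returns the ids that were reset, so the
    caller can log them or persist the flight plan afterwards.
    """
    stale = [wid for wid, status in statuses.items() if status == "IN_PROGRESS"]
    for wid in stale:
        statuses[wid] = "PENDING"
    return stale
```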