✨ feat: Add .sweeper/config.toml and Confluent telemetry backend #10

Open
bdougie wants to merge 11 commits into main from feat/config-toml-telemetry-backend

bdougie (Contributor) commented Mar 17, 2026

Summary

  • Add .sweeper/config.toml as primary project config with sectioned TOML layout ([run], [provider], [telemetry], [vm]) and precedence: CLI flags > SWEEPER_* env vars > project .sweeper/config.toml > home ~/.sweeper/config.toml > defaults
  • Extract Publisher interface from the hardwired JSONL telemetry writer, keeping JSONL as the always-on default and adding Confluent Cloud as an opt-in production backend via a fan-out MultiPublisher
  • Move provider settings (name, model, api_base, allowed_tools) into [provider] in config.toml, separating AI backend selection from telemetry transport
  • Fix Kafka writer to use RequireAll acks so messages are confirmed by the broker instead of silently dropped
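
The precedence chain in the first bullet can be sketched as a first-non-empty lookup over ordered layers. This is a minimal illustration, not the PR's loader: `Layer`, `Resolve`, and `ProviderLayers` are hypothetical names, and treating an empty string as "unset" is an assumption.

```go
package main

// Layer is one configuration source. Layers are listed from highest to
// lowest precedence. (Hypothetical type, for illustration only.)
type Layer struct {
	Name  string
	Value string // empty means "not set at this layer"
}

// Resolve returns the first non-empty value and the name of the layer
// that supplied it. It returns two empty strings if no layer is set.
func Resolve(layers []Layer) (value, source string) {
	for _, l := range layers {
		if l.Value != "" {
			return l.Value, l.Name
		}
	}
	return "", ""
}

// ProviderLayers orders the sources the way the summary describes:
// CLI flags > SWEEPER_* env vars > project TOML > home TOML > defaults.
func ProviderLayers(flag, env, project, home, def string) []Layer {
	return []Layer{
		{"CLI flag", flag},
		{"SWEEPER_* env", env},
		{"project .sweeper/config.toml", project},
		{"home ~/.sweeper/config.toml", home},
		{"default", def},
	}
}
```

With no flag set, an env value of `ollama` wins over a project TOML value of `claude`, which matches the env-override case listed in the test plan.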

Architecture

                        sweeper run --vm -c 5
                              │
                    ┌─────────┼─────────┐
                    ▼         ▼         ▼
              ┌──────────────────────────────┐
              │        Worker Pool           │
              │ (rate-limited, max N=5)      │
              └──┬───┬───┬───┬───┬──────────┘
                 │   │   │   │   │
                 ▼   ▼   ▼   ▼   ▼
               ┌───┐┌───┐┌───┐┌───┐┌───┐
               │VM ││VM ││VM ││VM ││VM │  ◄── stereOS isolation
               │ 1 ││ 2 ││ 3 ││ 4 ││ 5 │
               └─┬─┘└─┬─┘└─┬─┘└─┬─┘└─┬─┘
                 │     │     │     │     │
                 ▼     ▼     ▼     ▼     ▼
              claude  claude claude claude claude
              --print --print --print --print --print
                 │     │     │     │     │
                 └─────┴──┬──┴─────┴─────┘
                          │
                          ▼
                 ┌─────────────────┐
                 │  MultiPublisher │  ◄── fan-out to all backends
                 └───┬─────────┬───┘
                     │         │
              ┌──────┘         └──────┐
              ▼                       ▼
    ┌──────────────────┐   ┌────────────────────┐
    │  JSONLPublisher   │   │ ConfluentPublisher  │
    │  (always-on)      │   │ (opt-in via TOML)   │
    └────────┬─────────┘   └─────────┬──────────┘
             │                       │
             ▼                       ▼
    .sweeper/telemetry/       Confluent Cloud
      2026-03-17.jsonl        (SASL/TLS, acks=all)
                                     │
                              ┌──────┴──────┐
                              ▼             ▼
                        Flink SQL      Dashboards
                       (streaming     (real-time
                       aggregation)   fix rates)

New packages

  • pkg/dotdir: .sweeper/ directory resolution (project > home > empty)
  • pkg/config additions — TOMLConfig types, LoadTOML() loader, applyEnvOverrides(), FromTOML() bridge
  • pkg/telemetry/confluent — Kafka publisher with SASL/TLS for Confluent Cloud
  • pkg/telemetry additions — Publisher interface, JSONLPublisher, MultiPublisher
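
A rough sketch of how a Publisher interface and a fan-out MultiPublisher could fit together. The method set, the `Event` shape, and `MemoryPublisher` are assumptions made for illustration; the PR's actual signatures may differ. In this sketch a failing backend does not stop delivery to the others, which is one way to keep a local writer always-on.

```go
package main

import "errors"

// Event is a simplified telemetry event (assumed shape).
type Event struct {
	Type string
	Data map[string]any
}

// Publisher is the abstraction every telemetry backend implements.
type Publisher interface {
	Publish(e Event) error
	Close() error
}

// ErrUnavailable simulates a backend outage in examples.
var ErrUnavailable = errors.New("backend unavailable")

// MemoryPublisher stands in for a real backend. It is not safe for
// concurrent use; a real backend shared by workers would need a mutex.
type MemoryPublisher struct {
	Events []Event
	Err    error // if non-nil, Publish fails with this error
}

func (p *MemoryPublisher) Publish(e Event) error {
	if p.Err != nil {
		return p.Err
	}
	p.Events = append(p.Events, e)
	return nil
}

func (p *MemoryPublisher) Close() error { return nil }

// MultiPublisher fans each event out to every backend.
type MultiPublisher struct {
	Backends []Publisher
}

// Publish tries every backend and joins any errors, so one failing
// backend does not prevent the others from receiving the event.
func (m *MultiPublisher) Publish(e Event) error {
	var errs []error
	for _, b := range m.Backends {
		if err := b.Publish(e); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...) // nil when errs is empty
}

// Close closes every backend and joins any errors.
func (m *MultiPublisher) Close() error {
	var errs []error
	for _, b := range m.Backends {
		errs = append(errs, b.Close())
	}
	return errors.Join(errs...) // nil values are discarded
}
```

`errors.Join` requires Go 1.20 or later.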

Confluent Build Hackathon Application Details

AI Application

Sweeper is an agent-powered code maintenance tool that dispatches parallel sub-agents to automatically fix linter issues, repair broken tests, and run migrations across codebases. It targets engineering teams who want to reduce manual lint-fix toil. The Confluent integration streams telemetry events (init, fix_attempt, round_complete) from sweep runs into a Kafka topic so teams can build real-time dashboards, detect regressions, and track fix success rates across repos at scale.

Confluent Connector(s)

N/A — Sweeper uses the segmentio/kafka-go writer directly with SASL/TLS to produce to Confluent Cloud. No managed connectors are used.

Flink Stream Processing Query

SELECT
  `data`['linter'] AS linter,
  `data`['strategy'] AS strategy,
  CAST(`data`['success'] AS BOOLEAN) AS success,
  COUNT(*) AS attempt_count,
  TUMBLE_END(event_time, INTERVAL '1' HOUR) AS window_end
FROM sweeper_telemetry
WHERE `type` = 'fix_attempt'
GROUP BY
  `data`['linter'],
  `data`['strategy'],
  CAST(`data`['success'] AS BOOLEAN),
  TUMBLE(event_time, INTERVAL '1' HOUR);
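
To make the grouping concrete, the same aggregation can be sketched in Go over an in-memory slice: count attempts per (linter, strategy, success) within one-hour tumbling windows aligned to the epoch. This only illustrates what the query computes; `Attempt`, `Key`, and `CountByHour` are hypothetical names, and timestamps are plain Unix seconds for brevity.

```go
package main

import "time"

// Attempt is one fix_attempt event, reduced to the fields the query uses.
type Attempt struct {
	AtUnix   int64 // event time in Unix seconds
	Linter   string
	Strategy string
	Success  bool
}

// Key mirrors the query's GROUP BY columns plus the window end.
type Key struct {
	Linter        string
	Strategy      string
	Success       bool
	WindowEndUnix int64 // exclusive end of the one-hour window
}

// CountByHour counts attempts per key. An event at time t falls in the
// window [floor(t, 1h), floor(t, 1h) + 1h), so the window end is the
// truncated hour plus one hour.
func CountByHour(attempts []Attempt) map[Key]int {
	counts := make(map[Key]int)
	for _, a := range attempts {
		start := time.Unix(a.AtUnix, 0).UTC().Truncate(time.Hour)
		end := start.Add(time.Hour).Unix()
		counts[Key{a.Linter, a.Strategy, a.Success, end}]++
	}
	return counts
}
```

An event that lands exactly on an hour boundary belongs to the window that starts at that boundary, not the one that ends there.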

Schema

{
  "type": "record",
  "name": "SweeperTelemetryEvent",
  "namespace": "co.papercompute.sweeper",
  "fields": [
    { "name": "timestamp", "type": "string", "doc": "ISO-8601 timestamp" },
    { "name": "type", "type": { "type": "enum", "name": "EventType", "symbols": ["init", "fix_attempt", "round_complete"] }},
    { "name": "data", "type": { "type": "map", "values": ["null", "string", "int", "long", "boolean", "double"] }, "doc": "Event payload — keys vary by type" }
  ]
}

Event types and their data fields:

| type           | data fields |
|----------------|-------------|
| init           | name, linterCommand, targetDir, maxRounds, staleThreshold |
| fix_attempt    | file, success, duration, issues, error, linter, round, strategy, provider, model, prompt_tokens, output_tokens |
| round_complete | round, linter, tasks, fixed, failed |
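
As an illustration of the event shape, here is a sketch that builds a fix_attempt event and encodes it as one JSONL line. It assumes the JSONL records mirror the schema's three top-level fields (timestamp, type, data); the struct and function names are hypothetical, and only a few of the fix_attempt data fields are filled in.

```go
package main

import (
	"encoding/json"
	"time"
)

// TelemetryEvent mirrors the three top-level fields of the schema.
type TelemetryEvent struct {
	Timestamp string         `json:"timestamp"` // ISO-8601 (RFC 3339)
	Type      string         `json:"type"`      // init, fix_attempt, round_complete
	Data      map[string]any `json:"data"`      // keys vary by type
}

// NewFixAttempt builds a fix_attempt event with a subset of its data
// fields. The time is passed as Unix seconds to keep the example small.
func NewFixAttempt(unixSec int64, file, linter, strategy string, success bool) TelemetryEvent {
	return TelemetryEvent{
		Timestamp: time.Unix(unixSec, 0).UTC().Format(time.RFC3339),
		Type:      "fix_attempt",
		Data: map[string]any{
			"file":     file,
			"linter":   linter,
			"strategy": strategy,
			"success":  success,
		},
	}
}

// EncodeLine renders the event as a single newline-terminated JSON line.
func EncodeLine(e TelemetryEvent) ([]byte, error) {
	b, err := json.Marshal(e) // map keys are emitted in sorted order
	if err != nil {
		return nil, err
	}
	return append(b, '\n'), nil
}
```

Because `encoding/json` sorts map keys, the encoded line is deterministic for a given event, which makes the local file easy to diff and test.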

Test plan

  • go test ./... passes (all 15 packages)
  • Verify sweeper run --dry-run loads config from .sweeper/config.toml when present
  • Verify sweeper run --provider codex --dry-run overrides TOML config
  • Verify the env override works: SWEEPER_PROVIDER_NAME=ollama sweeper run --dry-run
  • Verify sweeper observe reads telemetry dir from config
  • Verify Confluent backend activates with [telemetry] backend = "confluent" and valid broker/topic config
  • Verify JSONL always writes locally regardless of backend setting
  • Verify events appear in Confluent topic (RequireAll acks)

bdougie changed the title from "Add .sweeper/config.toml and Confluent telemetry backend" to "✨ feat: Add .sweeper/config.toml and Confluent telemetry backend" on Mar 17, 2026