Add autosolve actions for automated issue resolution by fantapop · Pull Request #14 · cockroachdb/actions

fantapop · 2026-03-25T00:25:39Z

Summary

Go implementation of composite actions for Claude-powered automated issue resolution:

autosolve/assess — Runs Claude in read-only mode to evaluate whether a task is suitable for automated resolution
autosolve/implement — Runs Claude to implement a solution, validates changes, runs AI security review, pushes to a fork, and creates a draft PR

Key features

Precompiled Go binary (no Go toolchain needed at runtime)
Per-file batched AI security review with generated-file detection
Token usage tracking across phases with combined markdown summary
Retry logic with Claude session resumption
Skill file support for custom prompts

Testing

Tested end-to-end against cockroachlabs/ccloud-private-automation-testing.

Test plan

go test ./... passes
Precompiled binary check passes in CI

Copilot

Pull request overview

This PR introduces a new autosolve Go-based automation tool and two composite GitHub Actions (autosolve/assess and autosolve/implement) to assess issue suitability and implement fixes with Claude, including PR creation, security checks, and usage tracking.

Changes:

Add Go implementation for assessment/implementation orchestration, prompt assembly, git/gh wrappers, and security checks.
Add composite actions (autosolve/assess, autosolve/implement) plus CI updates to run Go tests and validate the precompiled binary.
Add prompt templates and unit tests for core functionality.

Reviewed changes

Copilot reviewed 28 out of 30 changed files in this pull request and generated 10 comments.

Show a summary per file

| File | Description |
| --- | --- |
| autosolve/internal/security/security_test.go | Adds unit tests for blocked-path and sensitive-file enforcement and `.gitignore` warnings. |
| autosolve/internal/security/security.go | Implements blocked path checks, sensitive filename/extension detection, and symlink-to-blocked-path detection. |
| autosolve/internal/prompt/templates/security-preamble.md | Adds system preamble intended to constrain the model’s behavior for safety. |
| autosolve/internal/prompt/templates/implementation-footer.md | Adds implementation-phase instruction footer and required success/fail marker. |
| autosolve/internal/prompt/templates/assessment-footer.md | Adds assessment-phase instruction footer and required proceed/skip marker. |
| autosolve/internal/prompt/prompt_test.go | Adds tests for prompt construction, skill file inclusion, and custom criteria. |
| autosolve/internal/prompt/prompt.go | Implements prompt assembly from templates + task inputs. |
| autosolve/internal/implement/implement_test.go | Adds tests for retry logic, output writing, and summary extraction. |
| autosolve/internal/implement/implement.go | Implements the implementation phase: retries, security checks, staging/commit/push, PR creation, and AI security review. |
| autosolve/internal/github/github.go | Adds a `gh`-CLI-backed GitHub client for comments/labels/PR creation. |
| autosolve/internal/git/git.go | Adds a git CLI abstraction and helper to list changed files. |
| autosolve/internal/config/config_test.go | Adds tests for config parsing/validation and blocked path parsing. |
| autosolve/internal/config/config.go | Adds config loading/validation from action inputs and auth validation. |
| autosolve/internal/claude/claude_test.go | Adds tests for extracting markers/session IDs and usage tracking. |
| autosolve/internal/claude/claude.go | Adds Claude CLI runner + result parsing + usage tracking persistence. |
| autosolve/internal/assess/assess_test.go | Adds tests for assessment flow and summary extraction. |
| autosolve/internal/assess/assess.go | Implements assessment phase invocation and outputs/summary writing. |
| autosolve/internal/action/action_test.go | Adds tests for GitHub Actions output and step summary helpers. |
| autosolve/internal/action/action.go | Adds helpers for outputs, summaries, and workflow annotations. |
| autosolve/implement/action.yml | Defines the composite action to run `autosolve implement` and expose outputs. |
| autosolve/go.mod | Introduces the autosolve Go module definition. |
| autosolve/cmd/autosolve/main.go | Adds CLI entrypoint for `assess` and `implement` commands. |
| autosolve/build.sh | Adds cross-compile script producing the committed Linux binary. |
| autosolve/assess/action.yml | Defines the composite action to run `autosolve assess` and expose outputs. |
| autosolve/Makefile | Adds build/test/clean targets for local development and CI. |
| autosolve/.gitignore | Ignores the local dev binary output. |
| CHANGELOG.md | Documents the addition of the autosolve actions. |
| .github/workflows/test.yml | Updates CI to run Go tests and ensure the precompiled binary is up to date. |

fantapop · 2026-03-27T22:30:09Z

One thing I'm running into here is that building the action and committing the binary each time is annoying, and the committed binary easily gets out of date. I'm going to look into alternatives.

linhcrl

Left a couple of comments. Most of them are smaller/questions.

Also, here's some feedback I didn't know where to put:

In the PR description I see "Precompiled Go binary (no Go toolchain needed at runtime)", and one of the checkboxes at the bottom also mentions a precompiled Go binary. I'm assuming this just hasn't been updated, right? I see that we actually recompile the binary every time this action runs.
We should add some documentation to the README.

linhcrl · 2026-03-30T04:16:11Z

autosolve/assess/action.yml

+        CLAUDE_CLI_VERSION: ${{ inputs.claude_cli_version }}
+
+    - name: Set up Go
+      uses: actions/setup-go@v5


v6 is the latest version (same with implement/action.yml)

Fixed — updated to v6 in both action.yml files. 🤖

linhcrl · 2026-03-30T15:38:03Z

autosolve/cmd/autosolve/main.go

+)
+
+// BuildSHA is set at build time via -ldflags.
+var BuildSHA = "dev"


Is the BuildSHA variable and version command intentionally left unimplemented? Currently the build step doesn't set it, so autosolve version always prints "dev". Can this be removed since the binary is built fresh on each run?

Good catch — removed the BuildSHA variable and version subcommand entirely since the binary is built fresh on each run. 🤖

linhcrl · 2026-03-31T20:00:02Z

autosolve/internal/assess/assess.go

+	action.LogInfo(fmt.Sprintf("Assessment usage: input=%d output=%d cost=$%.4f",
+		result.Usage.InputTokens, result.Usage.OutputTokens, result.Usage.CostUSD))
+	if result.ExitCode != 0 {
+		action.LogWarning(fmt.Sprintf("Claude CLI exited with code %d", result.ExitCode))


Curious, why are we only logging this as a warning and not returning an error? Wouldn't this mean that the output is not reliable?

The extraction failure is logged as a warning so the caller can still see the raw output in the uploaded artifact. The caller (assess.Run) checks both the exit code and whether the marker was found — if either fails, the assessment is marked as incomplete. So the warning here is diagnostic context, not the final verdict. 🤖

hmm, I'm going to explore this a bit more. The interface seems weird.

linhcrl · 2026-04-01T18:45:48Z

autosolve/internal/config/config.go

+)
+
+const (
+	defaultCommitSignature = "Co-Authored-By: Claude <noreply@anthropic.com>"


The default signature uses noreply@anthropic.com. Should this be a CockroachDB bot email instead, since these commits are being created in our repos by our automation?

Matches what cockroachdb/cockroach's autosolve uses (Co-Authored-By: Claude <noreply@anthropic.com>). Callers can override via the commit_signature input in action.yml if they want a different identity. 🤖

linhcrl · 2026-04-01T19:05:17Z

autosolve/internal/config/config.go

+		Skill:                  os.Getenv("INPUT_SKILL"),
+		AdditionalInstructions: os.Getenv("INPUT_ADDITIONAL_INSTRUCTIONS"),
+		AssessmentCriteria:     os.Getenv("INPUT_ASSESSMENT_CRITERIA"),
+		Model:                  envOrDefault("INPUT_MODEL", "sonnet"),


The actions themselves set the model default to claude-opus-4-6 but here the default is sonnet. Do they need to match?

Fixed — removed the Go-side default entirely. The model is now required from the action inputs (action.yml sets the default to claude-opus-4-6), so there's a single source of truth. 🤖
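A minimal sketch of the "single source of truth" idea from this thread, assuming the config loader simply refuses an empty `INPUT_MODEL`. The function name `loadModel` and the injected `getenv` parameter are hypothetical and are not taken from the PR's config.go.

```go
package main

import (
	"fmt"
	"os"
)

// loadModel returns the model from INPUT_MODEL and fails when it is unset,
// so action.yml remains the only place where a default is defined.
// getenv is injected to keep the function easy to test (an assumption).
func loadModel(getenv func(string) string) (string, error) {
	model := getenv("INPUT_MODEL")
	if model == "" {
		return "", fmt.Errorf("INPUT_MODEL is required; the default belongs in action.yml")
	}
	return model, nil
}

func main() {
	model, err := loadModel(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("model:", model)
}
```

With no Go-side fallback, a mismatch such as `sonnet` versus `claude-opus-4-6` cannot reappear unnoticed.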

linhcrl · 2026-04-02T06:22:39Z

autosolve/internal/implement/implement.go

+	prTitle := cfg.PullRequestTitle
+	if prTitle == "" {
+		out, err := gitClient.Log("-1", "--format=%s")
+		if err == nil {
+			prTitle = out
+		}
+	}


We already created a pullRequestTitle variable above. Could we not use that?

Fixed — now reuses the pullRequestTitle variable instead of re-deriving it from git log. 🤖

linhcrl · 2026-04-02T06:32:17Z

autosolve/internal/prompt/templates/implementation-footer.md

+   characters, imperative mood), a blank line, then a body explaining
+   what was changed and why. Since all changes go into a single commit,


nit: Do we typically enforce the 72-character rule for the body as well, or is that not necessary?

The 72-character rule for commit bodies is a convention but not as strictly enforced as subject lines. Since Claude is writing these, hard-wrapping the body would be nice but isn't critical. Left as-is for now — we can tighten it later if the output is messy. 🤖

linhcrl · 2026-04-02T06:55:37Z

autosolve/internal/implement/implement.go

+		if !positive {
+			action.LogWarning(fmt.Sprintf("AI security review found sensitive content in batch %d:", batchNum))
+			action.LogWarning(resultText)
+			_ = gitClient.ResetHead()


Curious, why do we reset staged changes only when secrets are found (line 640), but not when the review fails to run (lines 625, 634)? Is it because this is the only case where changes were actually deemed unsafe, vs. just couldn't be verified?

Fixed — ResetHead() is now called on all security review failure paths (exit code != 0, empty result, or sensitive content found). All calls log a warning on failure with a comment explaining why it's safe to continue (execution stops before any push can occur). 🤖
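The three failure paths named in this reply could be handled in one place, roughly as sketched below. The `reviewResult` type, the `checkReview` function, and the callback parameters are hypothetical illustrations, not the code in implement.go.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// reviewResult is a hypothetical summary of one AI security review batch.
type reviewResult struct {
	ExitCode int
	Text     string
	Passed   bool // true when the review reported no sensitive content
}

// checkReview returns an error for every failure path and unstages changes
// before returning, so unreviewed content is never left staged.
func checkReview(r reviewResult, resetHead func() error, warn func(string)) error {
	var failure error
	switch {
	case r.ExitCode != 0:
		failure = fmt.Errorf("security review exited with code %d", r.ExitCode)
	case strings.TrimSpace(r.Text) == "":
		failure = errors.New("security review produced no result")
	case !r.Passed:
		failure = errors.New("security review found sensitive content")
	default:
		return nil
	}
	if err := resetHead(); err != nil {
		// Safe to continue: an error is returned below, so the caller stops
		// before any push can occur.
		warn(fmt.Sprintf("failed to reset staged changes: %v", err))
	}
	return failure
}

func main() {
	warn := func(msg string) { fmt.Println("warning:", msg) }
	reset := func() error { fmt.Println("staged changes reset"); return nil }
	fmt.Println(checkReview(reviewResult{ExitCode: 1}, reset, warn))
	fmt.Println(checkReview(reviewResult{Text: "clean", Passed: true}, reset, warn))
}
```

Funnelling every failure through one exit point makes it harder to add a new failure case that forgets the reset.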

linhcrl · 2026-04-02T07:20:08Z

autosolve/internal/implement/implement_test.go

The test coverage is mostly focused on the retry loop. For security-critical code that creates PRs, should we add unit tests for the core helper functions (buildPRBody, isGeneratedDiff, readCommitMessage)? They're pure logic without external dependencies.

Added unit tests for all three: TestReadCommitMessage (4 subtests), TestBuildPRBody (3 subtests), TestIsGeneratedDiff (5 subtests). 🤖

linhcrl · 2026-04-02T07:38:59Z

autosolve/internal/implement/implement.go

+	_ = gitClient.Config("--local", "--unset", "credential.helper")
+	_, _ = gitClient.Remote("remove", "fork")


should we check the error here?

Now logs a warning if removing the fork remote fails. The remote may not exist if the run failed early, so it's best-effort — but we no longer silently swallow the error. 🤖

Update: the Cleanup function has been removed entirely. The runner VM is ephemeral so there's nothing to clean up — the fork remote disappears when the VM is destroyed. 🤖

- autosolve/assess: evaluate tasks for automated resolution suitability using Claude in read-only mode.
- autosolve/implement: autonomously implement solutions, validate security, push to fork, and create PRs using Claude. Includes AI security review, token usage tracking, and per-file batched diff analysis.
- Prefers roachdev wrapper for Claude CLI when available, falls back to native installer.
- Go binary is built from source at action runtime via setup-go.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

- Bump setup-go to v6 in assess and implement actions
- Remove model default from Go config; require it from action inputs
- Remove dead code: BuildSHA/version command, IssuePromptTemplate, BuildIssuePrompt and its tests
- Log error on UsageTracker.Save() failure instead of swallowing
- Fix misleading "blocked paths" log when security check fails
- Account for "autosolve: " prefix in commit subject length check
- Reuse pullRequestTitle for PR creation instead of recomputing
- Add comment explaining why Cleanup ignores errors

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Use a static GIT_ASKPASS script instead of writing tokens to git's credential helper. Credentials and GIT_ASKPASS are scoped to the git push subprocess only via PushEnv, so the token is never written to disk or visible in the broader process environment. Tokens are also unset from the environment after config loading so Claude CLI subprocesses cannot access them. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
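As a rough illustration of scoping credentials to the push subprocess, the sketch below builds a `git push` command whose environment alone carries the askpass helper and the token. `GIT_ASKPASS` and `GIT_TERMINAL_PROMPT` are real git environment variables; `AUTOSOLVE_PUSH_TOKEN`, `pushCommand`, and `hasEnv` are hypothetical names, and the PR's actual `PushEnv` may be shaped differently.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// pushCommand builds a `git push` whose environment carries the token and the
// askpass helper. os.Environ returns a copy, so the parent process environment
// is not modified and later subprocesses never see the token.
func pushCommand(askpassPath, token, remote, branch string) *exec.Cmd {
	cmd := exec.Command("git", "push", remote, branch)
	cmd.Env = append(os.Environ(),
		"GIT_ASKPASS="+askpassPath,
		"GIT_TERMINAL_PROMPT=0", // never fall back to an interactive prompt
		"AUTOSOLVE_PUSH_TOKEN="+token, // hypothetical variable read by the askpass script
	)
	return cmd
}

// hasEnv reports whether env contains the exact KEY=VALUE entry.
func hasEnv(env []string, kv string) bool {
	for _, e := range env {
		if e == kv {
			return true
		}
	}
	return false
}

func main() {
	cmd := pushCommand("/tmp/askpass.sh", "example-token", "fork", "autosolve-branch")
	fmt.Println("subprocess has token:", hasEnv(cmd.Env, "AUTOSOLVE_PUSH_TOKEN=example-token"))
	fmt.Println("parent has token:", os.Getenv("AUTOSOLVE_PUSH_TOKEN") != "")
}
```

The command is only constructed here, not run. In this sketch the static askpass script would print the value of the hypothetical variable when git invokes it.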

Claude may echo the prompt instructions (which contain the marker) in its output before producing the actual result. Using strings.Contains on the full text could match the echoed marker instead of the real one. Extract the last line containing the marker prefix to ensure the final verdict wins. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
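The last-match rule from this commit message can be sketched as follows. The marker text `AUTOSOLVE_RESULT:` and the function name `lastMarkerLine` are hypothetical; the real marker strings live in the PR's prompt templates and are not shown on this page.

```go
package main

import (
	"fmt"
	"strings"
)

// markerPrefix is a hypothetical marker chosen for illustration only.
const markerPrefix = "AUTOSOLVE_RESULT:"

// lastMarkerLine returns the last line of text that contains prefix.
// Scanning from the end means an echoed copy of the prompt earlier in the
// output cannot override the final verdict.
func lastMarkerLine(text, prefix string) (string, bool) {
	lines := strings.Split(text, "\n")
	for i := len(lines) - 1; i >= 0; i-- {
		line := strings.TrimSpace(lines[i])
		if strings.Contains(line, prefix) {
			return line, true
		}
	}
	return "", false
}

func main() {
	output := "Finish with AUTOSOLVE_RESULT: PROCEED or AUTOSOLVE_RESULT: SKIP\n" +
		"(analysis of the task)\n" +
		"AUTOSOLVE_RESULT: SKIP\n"
	line, ok := lastMarkerLine(output, markerPrefix)
	fmt.Println(line, ok)
}
```

A plain `strings.Contains(output, "AUTOSOLVE_RESULT: PROCEED")` on the same input would report a match from the echoed instructions, which is the failure mode the commit describes.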

Previously EvalSymlinks errors were silently skipped, which could let a symlink to a blocked path slip through if the error was not ErrNotExist. Now only missing files are skipped; other errors are treated as violations. Adds tests for symlink-to-blocked-path detection and deleted-file handling. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
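One way to express the "skip only missing files" rule is sketched below, using `filepath.EvalSymlinks` and `errors.Is(err, fs.ErrNotExist)` from the standard library. The function `resolvesToBlocked` and the `demo` helper are hypothetical and simplify the check to a single blocked directory.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// resolvesToBlocked reports whether path, after resolving symlinks, lies under
// blockedDir. A missing file (for example one deleted by the change) is
// skipped; any other resolution error is returned so the caller can treat it
// as a violation.
func resolvesToBlocked(path, blockedDir string) (bool, error) {
	resolved, err := filepath.EvalSymlinks(path)
	if err != nil {
		if errors.Is(err, fs.ErrNotExist) {
			return false, nil
		}
		return false, fmt.Errorf("resolving %s: %w", path, err)
	}
	rel, err := filepath.Rel(blockedDir, resolved)
	if err != nil {
		return false, fmt.Errorf("comparing %s to %s: %w", resolved, blockedDir, err)
	}
	outside := rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator))
	return !outside, nil
}

// demo builds a throwaway tree with a symlink into a blocked directory and
// returns the verdicts for the symlink and for a path that does not exist.
func demo() (linkBlocked, missingBlocked bool, err error) {
	root, err := os.MkdirTemp("", "symlink-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(root)
	// Resolve the temp dir itself so comparisons use canonical paths.
	if root, err = filepath.EvalSymlinks(root); err != nil {
		return false, false, err
	}
	blocked := filepath.Join(root, "secrets")
	if err = os.Mkdir(blocked, 0o755); err != nil {
		return false, false, err
	}
	target := filepath.Join(blocked, "key")
	if err = os.WriteFile(target, []byte("x"), 0o600); err != nil {
		return false, false, err
	}
	link := filepath.Join(root, "innocent.txt")
	if err = os.Symlink(target, link); err != nil {
		return false, false, err
	}
	if linkBlocked, err = resolvesToBlocked(link, blocked); err != nil {
		return false, false, err
	}
	missingBlocked, err = resolvesToBlocked(filepath.Join(root, "deleted.txt"), blocked)
	return linkBlocked, missingBlocked, err
}

func main() {
	fmt.Println(demo())
}
```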

Previously ResetHead was only called when the AI security review detected sensitive content. Now it also runs when the review fails to produce a result (e.g. crash or empty output). All ResetHead calls log a warning on failure instead of silently swallowing the error, with comments explaining why it is safe to continue. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Add tests for readCommitMessage, buildPRBody, and isGeneratedDiff. Treat a missing .autosolve-pr-body file as an incomplete attempt so the retry loop can try again rather than falling back to raw git log output as the PR body. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Boolean env vars (INPUT_CREATE_PR, INPUT_PR_DRAFT) now accept any case variation of true/false and error on invalid values like "yes" or "1", instead of silently treating them as false. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
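A strict parser along these lines might look like the sketch below. The function name `parseBoolInput` is hypothetical, and treating an empty string as invalid is an assumption; the PR may apply a default for unset inputs instead.

```go
package main

import (
	"fmt"
	"strings"
)

// parseBoolInput accepts only "true" or "false" in any letter case.
// Everything else, including "yes", "1", and (by assumption) the empty
// string, is an error rather than a silent false.
func parseBoolInput(name, raw string) (bool, error) {
	switch strings.ToLower(raw) {
	case "true":
		return true, nil
	case "false":
		return false, nil
	default:
		return false, fmt.Errorf("%s: invalid boolean %q (want true or false)", name, raw)
	}
}

func main() {
	for _, raw := range []string{"true", "False", "yes", "1"} {
		v, err := parseBoolInput("INPUT_CREATE_PR", raw)
		fmt.Println(raw, v, err)
	}
}
```

Failing loudly matters here because a silently false `INPUT_CREATE_PR` would make the action skip PR creation with no visible error.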

Run always returns non-nil Result so callers can unconditionally access usage/session ID. Non-zero exit code is informational, not an error — the marker is the authority. Error is returned only on empty result or parse failure. Adds logAttempt closure to keep bookkeeping adjacent to the Run call. Removes dead resultFile variable that was written but never read. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

- action helpers (SetOutput, SetOutputMultiline, WriteStepSummary, SaveLogArtifact) now return errors; appendToFile errors on empty path
- UsageTracker.Save() returns error; callers log warning and continue
- CreateLabel differentiates "already exists" from real errors
- readCommitMessage/copyPRBody fail hard on os.Remove (stale files could interfere with retries)
- Require .autosolve-commit-message in retry loop like .autosolve-pr-body
- gitClient.Add failure is now a hard error
- Remove Cleanup (ephemeral runner, nothing to clean up)
- Remove unused CreateComment, RemoveLabel, FindPRByLabel from interface
- Add action.LogResult to centralize post-Run bookkeeping
- Use result.SessionID directly, remove dead ExtractSessionID
- Replace randomDelimiter with static GHEOF
- Include error values in all warning/error log messages

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
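For context on the static `GHEOF` delimiter, GitHub Actions reads multiline outputs from the file named by `GITHUB_OUTPUT` in a `name<<DELIMITER` heredoc-style form. The sketch below shows that format with error-returning helpers. The function bodies are illustrative only, and the delimiter-collision check is an added assumption that this page does not attribute to the PR.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

const delimiter = "GHEOF"

// formatMultilineOutput renders a value in the heredoc-style syntax that
// GitHub Actions reads from the file named by $GITHUB_OUTPUT.
func formatMultilineOutput(name, value string) (string, error) {
	for _, line := range strings.Split(value, "\n") {
		if line == delimiter {
			// Assumption: with a static delimiter, reject values that would
			// end the block early.
			return "", fmt.Errorf("value for %q contains the delimiter line %q", name, delimiter)
		}
	}
	return fmt.Sprintf("%s<<%s\n%s\n%s\n", name, delimiter, value, delimiter), nil
}

// appendToFile appends data to path and rejects an empty path instead of
// silently doing nothing.
func appendToFile(path, data string) error {
	if path == "" {
		return fmt.Errorf("output file path is empty")
	}
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.WriteString(data)
	return err
}

func main() {
	entry, err := formatMultilineOutput("summary", "first line\nsecond line")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(entry)
	// Outside GitHub Actions GITHUB_OUTPUT is unset, so this reports an error.
	if err := appendToFile(os.Getenv("GITHUB_OUTPUT"), entry); err != nil {
		fmt.Println("not written:", err)
	}
}
```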

fantapop force-pushed the pr/autosolve-go branch from 324a6db to 0655ce2 on March 25, 2026 00:29

fantapop force-pushed the pr/shell-framework branch from 9134765 to 817680c on March 25, 2026 00:38

fantapop force-pushed the pr/autosolve-go branch from 0655ce2 to 6f1121d on March 25, 2026 00:38

fantapop force-pushed the pr/shell-framework branch from 817680c to 0a678c6 on March 25, 2026 01:00

fantapop force-pushed the pr/autosolve-go branch from 6f1121d to 5c7a16f on March 25, 2026 01:03

fantapop changed the base branch from pr/shell-framework to main on March 25, 2026 01:03

This was referenced Mar 25, 2026

Generic autosolve github workflow for automated issue resolution #5

Closed

Add Go rewrite of autosolve actions #8

Closed

fantapop requested a review from Copilot March 25, 2026 01:24

Copilot started reviewing on behalf of fantapop on March 25, 2026 01:25

Copilot AI reviewed Mar 25, 2026

fantapop force-pushed the pr/autosolve-go branch from 5c7a16f to a9a9010 on March 25, 2026 01:31

fantapop requested a review from linhcrl on March 25, 2026 01:35

fantapop force-pushed the pr/autosolve-go branch 2 times, most recently from 1abbbb0 to 6fd24ba on March 25, 2026 06:52

fantapop commented Mar 25, 2026

fantapop force-pushed the pr/autosolve-go branch 6 times, most recently from f818651 to 6bc6bc5 on March 27, 2026 22:21

fantapop force-pushed the pr/autosolve-go branch 2 times, most recently from d06e466 to f2ef7a1 on March 27, 2026 22:50

linhcrl reviewed Apr 2, 2026

fantapop and others added 4 commits April 3, 2026 12:05

fantapop force-pushed the pr/autosolve-go branch from d90cdd7 to 127df75 on April 3, 2026 19:09

fantapop and others added 2 commits April 3, 2026 12:54
