
feat(issue-101): add nf:harden adversarial hardening loop skill #102

Merged
jobordu merged 99 commits into main from feature/issue-101-add-nf-harden-adversarial on Apr 16, 2026

Conversation


@jobordu jobordu commented Apr 16, 2026

Summary

  • Adds /nf:harden skill — iterative adversarial test-write-fix loop with convergence detection (2 consecutive zero-change iterations) and configurable iteration cap (--max N, default 10)
  • Wires nf:harden into nf:quick --full as Step 6.6 post-verification adversarial hardening (--max 5)
  • Removes the bogus lib@incompatible_version dependency from package.json, which blocked npm install in this worktree and caused 43 test failures
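
The convergence rule in the first bullet can be sketched in a few lines. This is an illustrative JavaScript sketch, not the workflow itself (which lives in a markdown file): `hardenLoop`, `runIteration`, and `convergeAfter` are hypothetical names, and the sketch assumes each iteration reports how many changes it made.

```javascript
// Sketch only: runIteration is a hypothetical callback that performs one
// adversarial test-write-fix pass and returns the number of changes it made.
function hardenLoop(runIteration, { max = 10, convergeAfter = 2 } = {}) {
  let consecutiveZeroChange = 0;
  for (let iteration = 1; iteration <= max; iteration++) {
    const changes = runIteration(iteration);
    // Any non-zero iteration resets the streak; only back-to-back zeros count.
    consecutiveZeroChange = changes === 0 ? consecutiveZeroChange + 1 : 0;
    if (consecutiveZeroChange >= convergeAfter) {
      return { state: 'converged', iterations: iteration };
    }
  }
  // The cap guarantees termination even if the loop never settles.
  return { state: 'cap_exhausted', iterations: max };
}
```

With a change sequence of 3, 1, 0, 0 the sketch stops after the fourth iteration; a loop that keeps producing changes stops at the cap instead, which is what lets an iteration cap satisfy a termination invariant.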

Key files

  • commands/nf/harden.md — skill command with --area, --full, --max flags
  • core/workflows/harden.md — full adversarial loop workflow (terminal states: converged, cap_exhausted, skipped, blocked)
  • core/workflows/quick.md — Step 6.6 adversarial hardening added
  • package.json — removed lib@incompatible_version

Test plan

  • npm run test:ci — 1537 pass, 0 fail, 0 skip
  • Formal verification: agent-loop module 1/1 checks passed (EventuallyTerminates invariant satisfied by iteration cap)
  • Quorum verification passed (Verified status)

Closes #101

🤖 Generated with Claude Code

jobordu and others added 30 commits April 15, 2026 20:35
…ures

- 5 required fixtures covering distinct solve modes (fast, full, skip-layers, focus, zero-residual)
- 1 additional edge-case fixture for invalid --focus value (exits_zero)
- version=1, top-level description field
- bin/nf-benchmark-solve.cjs: benchmark runner for nf:solve end-to-end validation
- --dry-run flag lists fixtures without invoking nf-solve
- --fixture flag accepts custom JSON path with pre-flight error on missing file
- --verbose flag pipes nf-solve stderr to parent stderr
- --json flag outputs machine-readable JSON summary
- spawnSync with timeout=300000 to prevent indefinite hangs
- null residual bounds: assertions skipped when min/max_residual is null
- package.json: added benchmark:solve script entry after formal-verify:petri
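
Two points from this commit message, the `spawnSync` timeout and the null residual bounds, might look roughly like the sketch below. The helper names `runWithTimeout` and `residualWithinBounds` are hypothetical, and the shape of the bounds object is an assumption based on the `min/max_residual` wording above; the actual runner may differ.

```javascript
const { spawnSync } = require('child_process');

// Run a command synchronously; Node kills the child if it exceeds timeoutMs.
function runWithTimeout(command, args, timeoutMs = 300000) {
  const result = spawnSync(command, args, { encoding: 'utf8', timeout: timeoutMs });
  // On timeout, spawnSync reports an error whose code is ETIMEDOUT.
  const timedOut = Boolean(result.error && result.error.code === 'ETIMEDOUT');
  return { status: result.status, stdout: result.stdout, stderr: result.stderr, timedOut };
}

// A null bound means "skip the assertion on that side" (assumed field names).
function residualWithinBounds(residual, { min_residual = null, max_residual = null } = {}) {
  if (min_residual !== null && residual < min_residual) return false;
  if (max_residual !== null && residual > max_residual) return false;
  return true;
}
```

Treating null as "no bound" lets a fixture constrain only one side of the residual, or neither, without a separate flag.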
… to validate its capacity to solve issues automatically
Add benchmark:solve npm script to package.json for running nf-solve against
the full nf-benchmark 205-challenge suite.

Add stub formal artifact files required by specific benchmark challenges:
- .planning/formal/convergence-rules.json (BENCH-108, BENCH-146)
- .planning/formal/evidence/wiring.json (BENCH-061, BENCH-064)
- .planning/formal/model-bias.json (BENCH-135)
- .planning/formal/optimization-priorities.json (BENCH-106)
- config/app.json (BENCH-114 security challenge)
- infrastructure/cloud-config.json (BENCH-140 resource challenge)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nf-benchmark-solve.cjs

- Add snapshotFormalJson/restoreFormalJson helpers for .planning/formal/*.json
- Add extractLayerResidual helper for per-layer residual extraction
- Add setNestedField helper for dot-notation mutations
- Add --track=smoke/autonomy/all CLI flag (default: all)
- Guard smoke loop with runSmoke flag
- Add autonomy fixture runner with try/finally restore guarantee
- Include autonomy_results key in --json output
- Update header and dry-run output for both tracks
- Add autonomy_fixtures array with seed-f2t-uncover-ACT-01 fixture
- Targets f_to_t layer by marking ACT-01 uncovered in unit-test-coverage.json
- Uses set_field mutation with dot-notation path requirements.ACT-01
- pass_condition: residual_decreased with seeded_delta=1
- Full round-trip verified: snapshot, mutation, nf-solve, restore all work
- Snapshot/restore integrity confirmed (ACT-01 restored to covered: true)
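
The snapshot, mutate, run, restore round-trip described above can be sketched as follows. The helper names `snapshotFormalJson`, `restoreFormalJson`, and `setNestedField` come from the commit message, but the bodies here are hypothetical re-implementations, and `withSeededMutation` plus the exact dot-notation path used in the usage note are assumptions for illustration.

```javascript
const fs = require('fs');
const path = require('path');

// Set a value at a dot-notation path, creating intermediate objects as needed.
function setNestedField(obj, dotPath, value) {
  const keys = dotPath.split('.');
  let cursor = obj;
  for (const key of keys.slice(0, -1)) {
    if (typeof cursor[key] !== 'object' || cursor[key] === null) cursor[key] = {};
    cursor = cursor[key];
  }
  cursor[keys[keys.length - 1]] = value;
  return obj;
}

// Capture the raw contents of every *.json file directly under dir.
function snapshotFormalJson(dir) {
  const snapshot = new Map();
  for (const name of fs.readdirSync(dir)) {
    if (name.endsWith('.json')) {
      snapshot.set(name, fs.readFileSync(path.join(dir, name), 'utf8'));
    }
  }
  return snapshot;
}

// Write every captured file back exactly as it was.
function restoreFormalJson(dir, snapshot) {
  for (const [name, contents] of snapshot) {
    fs.writeFileSync(path.join(dir, name), contents);
  }
}

// Hypothetical wrapper: seed one mutation, run the callback, always restore.
function withSeededMutation(dir, file, dotPath, value, run) {
  const snapshot = snapshotFormalJson(dir);
  try {
    const filePath = path.join(dir, file);
    const data = JSON.parse(fs.readFileSync(filePath, 'utf8'));
    setNestedField(data, dotPath, value);
    fs.writeFileSync(filePath, JSON.stringify(data, null, 2));
    return run();
  } finally {
    restoreFormalJson(dir, snapshot); // runs even if run() throws
  }
}
```

A call such as `withSeededMutation(dir, 'unit-test-coverage.json', 'requirements.ACT-01.covered', false, runSolver)` would mark ACT-01 uncovered for the duration of `runSolver` only. Putting the restore in `finally` is what gives the restore guarantee when the solver run fails.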
…nomy — add a real autonomy track with seeded defects and residual reduction scoring
…o-end

- Replace artificial preResidual=baseline+delta with real seeded measurement:
  run --report-only after mutation and skip fixture if layer residual didn't move
- Add array_item_modify and append_array_item mutation types
- Add residual_increased pass condition (tests gap detection, not autoClose)
- Switch fixture from unit-test-coverage.json (output) to requirements.json
  (input) and inject a fake Complete req with no formal_models — r_to_f goes
  0→1, fixture scores PASS, snapshot restored cleanly

Verified: node bin/nf-benchmark-solve.cjs --track=autonomy → 1/1 PASS

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…estore scope

The snapshot/restore cycle captured the benchmark's own config file, causing
autonomy_fixtures to be silently wiped on every run. Exclude it via SNAPSHOT_EXCLUDE
so the fixture list is durable across benchmark invocations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… solver clobbering

.planning/formal/ is managed by nf-solve and its sub-scripts — any run without
--report-only can regenerate files there. Moving solve-benchmark-fixtures.json
to bin/ (alongside nf-benchmark-solve.cjs) makes it durable: it's now plain
benchmark config, not a formal verification artifact.

Also removes the now-unnecessary SNAPSHOT_EXCLUDE workaround.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Gap 1 — Smoke regression detection:
  Add layer-residual-regression fixture with layer_residuals_in_range pass
  condition. Fails if r_to_f/f_to_t/c_to_f/trace_health/memory_health drift
  beyond known bounds. Runner: add evaluatePassCondition branch for the new
  condition using extractLayerResidual per-layer.

Gap 2 — More autonomy detection layers (f_to_t):
  Add seed-f2t-inject-property: injects BENCH-TEST-01 into requirements.json
  AND adds \* @requirement annotation to NFOrchestration.tla. Both steps are
  required — buildCoverageReport only tracks gaps for requirements in
  requirements.json. Runner: add mutations[] array support + applyMutation
  helper; seed_mutation (single) remains supported for backward compat.
  target_layer can now be on fixture directly (not buried in seed_mutation).

Gap 3 — Remediation (autoClose actually closes a gap):
  Add fix-f2t-stub-generation: same seeded state as detection fixture, but
  pass_condition=residual_decreased. autoClose calls formal-test-sync to
  generate a stub, then _implement-stubs.cjs upgrades it. Post-fix --report-only
  sweep confirms f_to_t drops from 1 to 0. Runner: add post-fix measurement
  step (Step 4b) so residual_decreased uses actual post-autoClose state, not
  the fix run's own output which reflects pre-autoClose residuals.

Gap 4 — npm script wiring:
  Add benchmark:solve:local to package.json pointing at bin/nf-benchmark-solve.cjs.
  benchmark:solve (external) remains unchanged.

Snapshot extended to cover .planning/formal/tla/*.tla files and track
generated-stubs directory for cleanup of newly created stub files.
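
The three pass conditions named in this commit series (`residual_decreased`, `residual_increased`, `layer_residuals_in_range`) could be evaluated along these lines. The function name `evaluatePassCondition` appears in the commit message, but this body and its argument shape (`pre`, `post`, `layers`, `bounds`) are assumptions made for the sketch.

```javascript
// Hypothetical shapes: pre/post are numeric residuals for the target layer,
// layers maps layer name -> residual, bounds maps layer name -> { min, max }.
function evaluatePassCondition(condition, { pre, post, layers = {}, bounds = {} }) {
  switch (condition) {
    case 'residual_decreased':
      // Remediation fixtures: the fix must actually shrink the residual.
      return post < pre;
    case 'residual_increased':
      // Detection fixtures: the seeded defect must show up as a new gap.
      return post > pre;
    case 'layer_residuals_in_range':
      // Regression fixtures: every bounded layer must stay inside its range.
      return Object.entries(bounds).every(([layer, { min, max }]) => {
        const value = layers[layer];
        return typeof value === 'number' && value >= min && value <= max;
      });
    default:
      throw new Error(`Unknown pass condition: ${condition}`);
  }
}
```

For the remediation case, `post` would need to come from a separate post-fix `--report-only` measurement, as Gap 3 describes, rather than from the fix run's own output.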

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
jobordu added 28 commits April 16, 2026 07:37
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
Automated commit from nf-solve — includes layer manifests, gate results,
evidence snapshots, model registry, and requirements coverage updates.
- Add callers, implementation, tests, peek subcommands to commands/nf/coderlm.md
- Update frontmatter argument-hint and description to include new subcommands
- Add ensure-running preamble using coderlm-lifecycle.cjs --start before queries
- Use heredoc form for node invocations with env var argument passing
- Add usage help for all four new subcommands
- Add error handling with diagnostic hints for each query subcommand
…orkflow

- Add commands/nf/harden.md with frontmatter, --area and --full flags, execution_context pointing to ~/.claude/nf/workflows/harden.md
- Add core/workflows/harden.md with full adversarial loop: argument parsing (--area, --full, --max) with validation, test discovery with empty/baseline guards, iterative adversarial agent + fix executor, convergence detection (CONSECUTIVE_ZERO_CHANGE), iteration cap (default 10), banners for all terminal states (converged, cap_exhausted, skipped, blocked)
- Sync both files to installed locations (~/.claude/nf/workflows/harden.md, ~/.claude/commands/nf/harden.md)
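
The argument parsing and validation step is specified in the markdown workflow rather than in code; the sketch below restates it in JavaScript for illustration only. `parseHardenArgs` is a hypothetical name, and the assumptions that `--area` takes a single value, `--full` is a boolean switch, and `--max` must be a positive integer are inferred from the flag list above.

```javascript
// Hypothetical parser for the three flags listed in the commit message.
function parseHardenArgs(argv) {
  const options = { area: null, full: false, max: 10 }; // default cap of 10
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === '--full') {
      options.full = true;
    } else if (arg === '--area') {
      const value = argv[++i];
      if (!value || value.startsWith('--')) {
        throw new Error('--area requires a value');
      }
      options.area = value;
    } else if (arg === '--max') {
      const value = Number(argv[++i]);
      if (!Number.isInteger(value) || value < 1) {
        throw new Error('--max requires a positive integer');
      }
      options.max = value;
    } else {
      throw new Error(`Unknown argument: ${arg}`);
    }
  }
  return options;
}
```

Rejecting a non-positive or missing `--max` up front matters here because the iteration cap is the only hard bound on the loop.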
…age.json

The lib@incompatible_version entry was blocking npm install in this worktree,
preventing blessed and xstate from installing and causing 43 test failures.
xstate was already declared in devDependencies; removed the duplicate entry.
- .planning/.gitignore: ignore repowise/ cache directory
- commands/nf/coderlm.md: fix require path to use nf-bin portable path
- .planning/quick/400-add-nf-harden-adversarial-skill/scope-contract.json: task scope contract
@jobordu jobordu merged commit 7fad28d into main Apr 16, 2026
7 checks passed
@jobordu jobordu deleted the feature/issue-101-add-nf-harden-adversarial branch April 16, 2026 11:22

Development

Successfully merging this pull request may close these issues.

Add nf:harden adversarial hardening loop skill
