Add FLE-style backtracking with AlphaEvolve integration #13

Merged
bdougie merged 5 commits into main from feat/fle-patterns on Mar 10, 2026

bdougie commented Mar 10, 2026

Summary

  • Adds BacktrackManager class that saves/restores game state via PyBoy save_state/load_state to escape stuck navigation (e.g. Route 1 y=28 blocker)
  • Snapshots are taken on map changes and periodically; restores trigger when stuck_turns exceeds a configurable threshold
  • Four new evolvable params (bt_max_snapshots, bt_restore_threshold, bt_max_attempts, bt_snapshot_interval) integrated into evolve.py scoring, perturbation, and mutation prompts
  • Two new run_10_agents.py variants: aggressive_bt (low threshold, high retries) and no_bt (disabled baseline)
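The snapshot/restore loop described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `BacktrackManager`: the method names `maybe_snapshot` and `maybe_restore`, the `Snapshot` record, and all default values are assumptions. It only assumes the emulator object exposes `save_state(f)` and `load_state(f)` taking a file-like object, as PyBoy does.

```python
import io
from collections import deque
from dataclasses import dataclass


@dataclass
class Snapshot:
    turn: int
    map_id: int
    pos: tuple
    state: bytes  # raw emulator save state


class BacktrackManager:
    """Hypothetical sketch; names and defaults are illustrative assumptions."""

    def __init__(self, emu, bt_max_snapshots=5, bt_restore_threshold=20,
                 bt_max_attempts=3, bt_snapshot_interval=50):
        # emu: any object with save_state(f) / load_state(f), e.g. a PyBoy instance
        self.emu = emu
        # Bounded pool: the oldest snapshot is dropped once the pool is full
        self.snapshots = deque(maxlen=bt_max_snapshots)
        self.restore_threshold = bt_restore_threshold
        self.max_attempts = bt_max_attempts
        self.snapshot_interval = bt_snapshot_interval
        self.attempts = 0
        self.last_map = None

    def maybe_snapshot(self, turn, map_id, pos):
        """Snapshot on a map change or every bt_snapshot_interval turns."""
        map_changed = self.last_map is not None and map_id != self.last_map
        self.last_map = map_id
        periodic = self.snapshot_interval > 0 and turn % self.snapshot_interval == 0
        if not (map_changed or periodic):
            return False
        buf = io.BytesIO()
        self.emu.save_state(buf)
        self.snapshots.append(Snapshot(turn, map_id, pos, buf.getvalue()))
        return True

    def maybe_restore(self, stuck_turns):
        """Restore the newest snapshot once stuck_turns exceeds the threshold."""
        if stuck_turns <= self.restore_threshold:
            return None
        if not self.snapshots or self.attempts >= self.max_attempts:
            return None
        snap = self.snapshots.pop()
        self.emu.load_state(io.BytesIO(snap.state))
        self.attempts += 1
        return snap
```

In this sketch a restore consumes the snapshot it loads, so repeated restores walk further back in time until either the pool is empty or `bt_max_attempts` is reached. Whether the real implementation pops or reuses snapshots is not stated in this PR.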

Test plan

  • 415 tests pass (uv run pytest tests/)
  • 100% coverage on all tracked modules
  • uv run scripts/evolve.py <rom> --generations 3 --max-turns 2000 — verify backtracking params appear in genome
  • uv run scripts/run_10_agents.py <rom> — compare backtrack-enabled vs disabled variants

Closes #5

bdougie added 4 commits March 10, 2026 06:56
BacktrackManager saves/restores game state via PyBoy save_state/load_state
to escape stuck navigation on Route 1. Snapshots on map change and
periodically; restores when stuck_turns exceeds threshold.

Four new evolvable params (bt_max_snapshots, bt_restore_threshold,
bt_max_attempts, bt_snapshot_interval) flow through evolve.py and
run_10_agents.py with two new variants: aggressive_bt and no_bt.

- Remove unused `field` import from dataclasses in agent.py
- Import `score()` from evolve.py in run_10_agents.py instead of duplicating it
- Reset _oak_wait_done, _pallet_diag_done, _house_diag_done, _lab_phase,
  _lab_turns, _lab_exit_turns on backtrack restore so one-time game
  sequences (Oak encounter, lab phases) can re-trigger after restore
- Skip periodic snapshots when position matches the last snapshot to
  avoid poisoning the pool with stuck-adjacent positions

The backtrack guard checked `map_id == 40 AND party_count == 0`, but
party_count changes to 1 the moment the agent picks up Charmander.
This allowed backtracking to fire immediately after the pickup, wiping
out progress. Change guard to `map_id == 40` (entire lab is protected).

Also revert Oak trigger to PR #10's proven brute-force approach (4 rounds
of mash_a + wait) instead of script-state-aware gating that read 0xD5F1
while still on Pallet Town map where the address is meaningless.

ROM test confirms: agent picks Charmander, wins rival battle, exits lab.

bdougie commented Mar 10, 2026

ROM Test Results

Ran the agent with backtracking enabled against the real ROM (--max-turns 500). Key findings:

Bug Found & Fixed: Backtrack guard was too narrow

The original guard (map_id == 40 AND party_count == 0) failed at the critical moment:

BT_DEBUG | in_lab=True map=40 party=0 stuck=20   # guard holds
BT_DEBUG | in_lab=False map=40 party=1 stuck=21   # party flips to 1 -> guard fails!
BACKTRACK | Restored to turn 50 map=0 (10,1)       # progress wiped

The agent picks up Charmander at about stuck turn 21 (it only counts as "stuck" because it is standing still pressing A at the pokeball). At that moment party_count flips to 1, the original guard stops holding, and a restore fires immediately, wiping out the pickup. Fixed by guarding on map_id == 40 alone, so the entire lab sequence (starter pickup + rival battle) is protected.
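The difference between the two guards can be sketched as below. The function names and the threshold of 20 are hypothetical; only the two conditions and the `map=40 party=1 stuck=21` values come from the debug log above.

```python
OAKS_LAB_MAP_ID = 40  # map id shown in the BT_DEBUG log


def guard_old(map_id, party_count):
    """Original guard: protects the lab only while the party is empty."""
    return map_id == OAKS_LAB_MAP_ID and party_count == 0


def guard_new(map_id, party_count):
    """Fixed guard: protects the whole lab regardless of party size."""
    return map_id == OAKS_LAB_MAP_ID


def should_restore(stuck_turns, threshold, guarded):
    """A restore fires only when stuck past the threshold and not guarded."""
    return stuck_turns > threshold and not guarded


# State from the failing log line: map=40, party=1, stuck=21.
# threshold=20 is an assumed value for illustration.
old = should_restore(21, 20, guard_old(40, 1))  # True: restore fires, progress lost
new = should_restore(21, 20, guard_new(40, 1))  # False: lab stays protected
```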

After fix: Full sequence works

MAP CHANGE | 0 -> 40 | Pos: (5, 3)          # Enter lab
LAB | phase 0→1 south at (5,4)               # Navigate to pokeball
LAB | phase 1→2 at pokeball column (7,4)
STUCK | ... Streak: 20                        # Pressing A at pokeball
OVERWORLD | Party: 1 | Stuck: 46             # Charmander acquired!
BATTLE | Player HP: 20/20 | Enemy HP: 20/20  # Rival battle starts
BATTLE | Player HP: 8/20 | Enemy HP: 0/20    # Rival defeated
MAP CHANGE | 40 -> 0 | Pos: (5, 11)          # Exit lab to Pallet Town

Also fixed: Oak trigger reverted to brute-force

The script-state-aware Oak trigger (reading 0xD5F1) was checking the lab script address while still on Pallet Town (map 0), where it's meaningless. Reverted to PR #10's proven approach: 4 rounds of mash_a(30) + wait(300).
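As a rough illustration of the brute-force trigger, the loop below repeats the two calls four times. The `controller` object and the `trigger_oak` name are assumptions; only `mash_a(30)`, `wait(300)`, and the round count of 4 appear in this PR, and the arguments are passed through without assuming their units.

```python
def trigger_oak(controller, rounds=4, mash_count=30, wait_amount=300):
    """Brute-force Oak trigger: no script-state reads, just mash and wait.

    `controller` is a hypothetical object exposing mash_a(n) and wait(n).
    """
    for _ in range(rounds):
        controller.mash_a(mash_count)
        controller.wait(wait_amount)
```

The appeal of this approach is that it does not depend on any memory address being valid for the current map, which is exactly what broke the script-state-aware version.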

Remaining: post-lab navigation

After exiting the lab, the agent gets stuck at (12,12) / (7,12) on Pallet Town instead of heading north to Route 1. This is a separate navigation issue, not related to backtracking.

Test suite

  • 420 tests pass, 100% coverage

Documents the Factorio Learning Environment-inspired backtracking system:
snapshot/restore mechanics, evolvable parameters, and Oak's Lab guard.
Adds FLE paper to references list.
bdougie merged commit 838a680 into main on Mar 10, 2026
1 check passed