fix: codex sync merges AGENTS.md instead of replacing#715
fix: codex sync merges AGENTS.md instead of replacing#715affaan-m merged 1 commit intoaffaan-m:mainfrom
Conversation
The sync script previously overwrote ~/.codex/AGENTS.md on every run, destroying any user-authored content. This adds marker-based merging (<!-- BEGIN ECC --> / <!-- END ECC -->) so only the ECC-managed section is replaced on subsequent runs, preserving user content outside the markers. Merge logic: - No file → create with markers - Both markers present (ordered, CRLF-safe) → replace only the ECC section - BEGIN without END (corrupted) → full replace (backup saved) - No markers at all → append ECC block (preserves existing content) Also fixes: - Symlink preservation: uses cat > instead of mv to write through symlinks - CRLF handling: strips \r in marker detection to handle Windows-edited files - Marker ordering: validates BEGIN appears before END, not just that both exist The legacy heading-match heuristic was intentionally removed per council review: any unmarked file is either user-authored (append is safe) or legacy ECC-generated (duplicates once, deduplicates on next run via markers). A timestamped backup is always saved before any mutation. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering>
|
Caution Review failedPull request was closed or merged during review 📝 WalkthroughWalkthroughThe script now uses comment-based markers to selectively merge ECC content into Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
Updates the ECC→Codex sync script to stop overwriting ~/.codex/AGENTS.md and instead maintain a managed ECC section that can be updated on subsequent syncs while preserving user-authored content outside that section.
Changes:
- Introduces
<!-- BEGIN ECC -->/<!-- END ECC -->markers and replaces only the marked block when present and ordered. - Removes the legacy heading-match overwrite heuristic and adds CRLF-tolerant marker detection.
- Writes updates via redirection (
cat > file) to preserve symlinkedAGENTS.mdsetups.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| $0 == b { found_b = NR } $0 == e { found_e = NR } | ||
| END { exit !(found_b && found_e && found_b < found_e) } | ||
| ' "$AGENTS_FILE"; then | ||
| # Existing file with matched, correctly ordered ECC markers — replace only the ECC section | ||
| replace_ecc_section | ||
| elif grep -qF "$ECC_BEGIN_MARKER" "$AGENTS_FILE"; then | ||
| # BEGIN marker exists but END marker is missing (corrupted). Warn and | ||
| # replace the file entirely to restore a valid state. Backup was saved. | ||
| log "WARNING: found BEGIN marker but no END marker — replacing file (backup saved)" |
There was a problem hiding this comment.
The marker-ordering check stores only the last BEGIN/END line numbers (found_b/found_e are overwritten each match). If a file contains multiple BEGIN/END markers, this can incorrectly classify the file as “well-formed” and replace_ecc_section will inject the ECC block once for each BEGIN marker it encounters. Consider tightening validation to require exactly one BEGIN and one END marker (or pick the first BEGIN and the first END after it) before doing an in-place section replace.
| $0 == b { found_b = NR } $0 == e { found_e = NR } | |
| END { exit !(found_b && found_e && found_b < found_e) } | |
| ' "$AGENTS_FILE"; then | |
| # Existing file with matched, correctly ordered ECC markers — replace only the ECC section | |
| replace_ecc_section | |
| elif grep -qF "$ECC_BEGIN_MARKER" "$AGENTS_FILE"; then | |
| # BEGIN marker exists but END marker is missing (corrupted). Warn and | |
| # replace the file entirely to restore a valid state. Backup was saved. | |
| log "WARNING: found BEGIN marker but no END marker — replacing file (backup saved)" | |
| $0 == b { | |
| count_b++ | |
| if (!first_b) first_b = NR | |
| } | |
| $0 == e { | |
| count_e++ | |
| if (!first_e) first_e = NR | |
| } | |
| END { | |
| exit !(count_b == 1 && count_e == 1 && first_b < first_e) | |
| } | |
| ' "$AGENTS_FILE"; then | |
| # Existing file with exactly one matched, correctly ordered ECC marker pair — replace only the ECC section | |
| replace_ecc_section | |
| elif grep -qF "$ECC_BEGIN_MARKER" "$AGENTS_FILE"; then | |
| # BEGIN marker exists but END marker is missing or otherwise invalid (corrupted). Warn and | |
| # replace the file entirely to restore a valid state. Backup was saved. | |
| log "WARNING: found BEGIN marker but no valid END marker — replacing file (backup saved)" |
| # BEGIN marker exists but END marker is missing (corrupted). Warn and | ||
| # replace the file entirely to restore a valid state. Backup was saved. | ||
| log "WARNING: found BEGIN marker but no END marker — replacing file (backup saved)" |
There was a problem hiding this comment.
This warning/message path triggers whenever the ordered-marker awk check fails but a BEGIN marker exists. That includes cases where the END marker exists but is before BEGIN (out-of-order), so the message “found BEGIN marker but no END marker” can be inaccurate. Consider adding an explicit END-marker presence check and emitting a distinct warning for “markers out of order” vs “missing END”.
| # BEGIN marker exists but END marker is missing (corrupted). Warn and | |
| # replace the file entirely to restore a valid state. Backup was saved. | |
| log "WARNING: found BEGIN marker but no END marker — replacing file (backup saved)" | |
| if grep -qF "$ECC_END_MARKER" "$AGENTS_FILE"; then | |
| # Both markers exist but failed the ordered-marker check above (likely out of order). | |
| # Warn and replace the file entirely to restore a valid state. Backup was saved. | |
| log "WARNING: ECC markers out of order — replacing file (backup saved)" | |
| else | |
| # BEGIN marker exists but END marker is missing (corrupted). Warn and | |
| # replace the file entirely to restore a valid state. Backup was saved. | |
| log "WARNING: found BEGIN marker but no END marker — replacing file (backup saved)" | |
| fi | |
| compose_ecc_block > "$AGENTS_FILE" | |
| elif grep -qF "$ECC_END_MARKER" "$AGENTS_FILE"; then | |
| # END marker exists but BEGIN marker is missing (corrupted). Warn and | |
| # replace the file entirely to restore a valid state. Backup was saved. | |
| log "WARNING: found END marker but no BEGIN marker — replacing file (backup saved)" |
| else | ||
| # Existing file without markers — append ECC block, preserve user content. | ||
| # Note: legacy ECC-only files (from old '>' overwrite) will get a second copy | ||
| # on this first run. This is intentional — the alternative (heading-match | ||
| # heuristic) risks false-positive overwrites of user-authored files. The next | ||
| # run deduplicates via markers, and a timestamped backup was saved above. | ||
| log "No ECC markers found — appending managed block (backup saved)" | ||
| { | ||
| printf '\n\n' | ||
| compose_ecc_block | ||
| } >> "$AGENTS_FILE" | ||
| fi |
There was a problem hiding this comment.
If the file contains an END marker without a BEGIN marker, the script currently falls into the “no markers” branch and appends a new ECC block, leaving an unmatched END marker in the document. It would be safer to treat “END without BEGIN” as a corrupted-marker state (like “BEGIN without END”) and either fully replace (backup already saved) or remove the stray END marker before appending.
There was a problem hiding this comment.
2 issues found across 1 file
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="scripts/sync-ecc-to-codex.sh">
<violation number="1" location="scripts/sync-ecc-to-codex.sh:186">
P2: Marker validation allows multiple BEGIN/END pairs, so replacement can emit multiple managed ECC blocks instead of a single section.</violation>
<violation number="2" location="scripts/sync-ecc-to-codex.sh:206">
P2: Legacy marker-less ECC files are appended with a marked block, but later runs only replace the marked section, so the original unmarked ECC copy is never removed and duplicates persist.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| $0 == b { found_b = NR } $0 == e { found_e = NR } | ||
| END { exit !(found_b && found_e && found_b < found_e) } |
There was a problem hiding this comment.
P2: Marker validation allows multiple BEGIN/END pairs, so replacement can emit multiple managed ECC blocks instead of a single section.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At scripts/sync-ecc-to-codex.sh, line 186:
<comment>Marker validation allows multiple BEGIN/END pairs, so replacement can emit multiple managed ECC blocks instead of a single section.</comment>
<file context>
@@ -143,16 +143,68 @@ if [[ -f "$AGENTS_FILE" ]]; then
+ compose_ecc_block > "$AGENTS_FILE"
+ elif awk -v b="$ECC_BEGIN_MARKER" -v e="$ECC_END_MARKER" '
+ { gsub(/\r$/, "") }
+ $0 == b { found_b = NR } $0 == e { found_e = NR }
+ END { exit !(found_b && found_e && found_b < found_e) }
+ ' "$AGENTS_FILE"; then
</file context>
| $0 == b { found_b = NR } $0 == e { found_e = NR } | |
| END { exit !(found_b && found_e && found_b < found_e) } | |
| $0 == b { found_b = NR; b_count++ } | |
| $0 == e { found_e = NR; e_count++ } | |
| END { exit !(b_count == 1 && e_count == 1 && found_b < found_e) } |
| { | ||
| printf '\n\n' | ||
| compose_ecc_block | ||
| } >> "$AGENTS_FILE" |
There was a problem hiding this comment.
P2: Legacy marker-less ECC files are appended with a marked block, but later runs only replace the marked section, so the original unmarked ECC copy is never removed and duplicates persist.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At scripts/sync-ecc-to-codex.sh, line 206:
<comment>Legacy marker-less ECC files are appended with a marked block, but later runs only replace the marked section, so the original unmarked ECC copy is never removed and duplicates persist.</comment>
<file context>
@@ -143,16 +143,68 @@ if [[ -f "$AGENTS_FILE" ]]; then
+ {
+ printf '\n\n'
+ compose_ecc_block
+ } >> "$AGENTS_FILE"
+ fi
fi
</file context>
Greptile SummaryThis PR replaces the previous destructive overwrite of Key changes and concerns:
Confidence Score: 3/5
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Start sync-ecc-to-codex.sh] --> B{dry-run?}
B -- yes --> C[Print generic dry-run message]
B -- no --> D{AGENTS_FILE exists?}
D -- no --> E[compose_ecc_block > AGENTS_FILE\nCreate fresh file with markers]
D -- yes --> F{Both markers present\nand BEGIN before END?}
F -- yes --> G[replace_ecc_section\nReplace only ECC block via awk]
F -- no --> H{BEGIN marker\npresent alone?}
H -- yes --> I[WARNING: corrupted\ncompose_ecc_block > AGENTS_FILE\nFull replace]
H -- no --> J[Append ECC block\nto existing file]
G --> K[cat tmp > AGENTS_FILE\nSymlink-safe write]
E --> L[Continue: sync skills,\nprompts, MCP config]
I --> L
J --> L
K --> L
style I fill:#f96,color:#000
style G fill:#9f6,color:#000
style J fill:#ff9,color:#000
Last reviewed commit: "fix: codex sync merg..." |
| replace_ecc_section() { | ||
| # Replace the ECC block between markers in $AGENTS_FILE with fresh content. | ||
| # Uses awk to correctly handle all positions including line 1. | ||
| local tmp | ||
| tmp="$(mktemp)" | ||
| local ecc_tmp | ||
| ecc_tmp="$(mktemp)" | ||
| compose_ecc_block > "$ecc_tmp" | ||
| awk -v begin="$ECC_BEGIN_MARKER" -v end="$ECC_END_MARKER" -v ecc="$ecc_tmp" ' | ||
| { gsub(/\r$/, "") } | ||
| $0 == begin { skip = 1; while ((getline line < ecc) > 0) print line; close(ecc); next } | ||
| $0 == end { skip = 0; next } | ||
| !skip { print } | ||
| ' "$AGENTS_FILE" > "$tmp" | ||
| # Write through the path (preserves symlinks) instead of mv | ||
| cat "$tmp" > "$AGENTS_FILE" | ||
| rm -f "$tmp" "$ecc_tmp" |
There was a problem hiding this comment.
Missing cleanup trap for temp files
replace_ecc_section creates two temp files ($tmp and $ecc_tmp) but has no trap to clean them up on failure. With set -euo pipefail active, any failure mid-function (e.g., awk error, full disk) will cause the script to abort immediately, leaving these files on disk indefinitely.
Consider adding a trap at the top of the function:
| replace_ecc_section() { | |
| # Replace the ECC block between markers in $AGENTS_FILE with fresh content. | |
| # Uses awk to correctly handle all positions including line 1. | |
| local tmp | |
| tmp="$(mktemp)" | |
| local ecc_tmp | |
| ecc_tmp="$(mktemp)" | |
| compose_ecc_block > "$ecc_tmp" | |
| awk -v begin="$ECC_BEGIN_MARKER" -v end="$ECC_END_MARKER" -v ecc="$ecc_tmp" ' | |
| { gsub(/\r$/, "") } | |
| $0 == begin { skip = 1; while ((getline line < ecc) > 0) print line; close(ecc); next } | |
| $0 == end { skip = 0; next } | |
| !skip { print } | |
| ' "$AGENTS_FILE" > "$tmp" | |
| # Write through the path (preserves symlinks) instead of mv | |
| cat "$tmp" > "$AGENTS_FILE" | |
| rm -f "$tmp" "$ecc_tmp" | |
| replace_ecc_section() { | |
| # Replace the ECC block between markers in $AGENTS_FILE with fresh content. | |
| # Uses awk to correctly handle all positions including line 1. | |
| local tmp | |
| tmp="$(mktemp)" | |
| local ecc_tmp | |
| ecc_tmp="$(mktemp)" | |
| trap 'rm -f "$tmp" "$ecc_tmp"' RETURN | |
| compose_ecc_block > "$ecc_tmp" |
| compose_ecc_block > "$ecc_tmp" | ||
| awk -v begin="$ECC_BEGIN_MARKER" -v end="$ECC_END_MARKER" -v ecc="$ecc_tmp" ' | ||
| { gsub(/\r$/, "") } | ||
| $0 == begin { skip = 1; while ((getline line < ecc) > 0) print line; close(ecc); next } | ||
| $0 == end { skip = 0; next } | ||
| !skip { print } | ||
| ' "$AGENTS_FILE" > "$tmp" |
There was a problem hiding this comment.
ECC markers inside source files can corrupt subsequent syncs
The awk in replace_ecc_section reads lines from $ecc_tmp via getline and prints them directly — awk's own pattern rules are not applied to those lines. This means if AGENTS_ROOT_SRC or AGENTS_CODEX_SUPP_SRC ever contain the literal string <!-- END ECC -->, that text is embedded verbatim into $AGENTS_FILE after the first sync.
On the second sync, replace_ecc_section re-processes $AGENTS_FILE:
- It matches
<!-- BEGIN ECC -->, emits the new ECC block, setsskip=1, callsnext. - It then encounters the embedded
<!-- END ECC -->(carried over from the source file), matches$0 == end, setsskip=0. - All remaining lines of the old ECC content (from that embedded END up to the outer real END marker) are now printed as if they were user content — they survive as stale duplicate material and accumulate on every sync.
The detection awk (line 184) is similarly affected: it stores the line number of the last <!-- END ECC --> occurrence, so found_b < found_e will still be true and replace_ecc_section will be called even in this malformed state.
A simple mitigation is to sanitize the marker strings out of each source file before embedding:
compose_ecc_block() {
printf '%s\n' "$ECC_BEGIN_MARKER"
sed '/<!--[[:space:]]*BEGIN ECC[[:space:]]*-->/d; /<!--[[:space:]]*END ECC[[:space:]]*-->/d' "$AGENTS_ROOT_SRC"
printf '\n\n---\n\n'
printf '# Codex Supplement (From ECC .codex/AGENTS.md)\n\n'
sed '/<!--[[:space:]]*BEGIN ECC[[:space:]]*-->/d; /<!--[[:space:]]*END ECC[[:space:]]*-->/d' "$AGENTS_CODEX_SUPP_SRC"
printf '\n%s\n' "$ECC_END_MARKER"
}| if [[ "$MODE" == "dry-run" ]]; then | ||
| printf '[dry-run] compose %s from %s + %s\n' "$AGENTS_FILE" "$AGENTS_ROOT_SRC" "$AGENTS_CODEX_SUPP_SRC" | ||
| printf '[dry-run] merge ECC block into %s from %s + %s\n' "$AGENTS_FILE" "$AGENTS_ROOT_SRC" "$AGENTS_CODEX_SUPP_SRC" | ||
| else |
There was a problem hiding this comment.
Dry-run doesn't preview which merge branch would be taken
The previous dry-run output was a reasonable approximation of the single-path behaviour, but now there are four distinct code paths (create, replace-section, full-replace-corrupted, append). The current dry-run message is a generic string that gives the user no indication of which path would actually execute on a given file.
Consider running the detection logic in dry-run mode and printing the expected action:
if [[ "$MODE" == "dry-run" ]]; then
if [[ ! -f "$AGENTS_FILE" ]]; then
printf '[dry-run] action=CREATE %s (no existing file)\n' "$AGENTS_FILE"
elif awk -v b="$ECC_BEGIN_MARKER" -v e="$ECC_END_MARKER" \
'{ gsub(/\r$/, "") } $0==b{found_b=NR} $0==e{found_e=NR}
END{exit !(found_b && found_e && found_b < found_e)}' "$AGENTS_FILE"; then
printf '[dry-run] action=REPLACE-SECTION %s (markers found)\n' "$AGENTS_FILE"
elif grep -qF "$ECC_BEGIN_MARKER" "$AGENTS_FILE"; then
printf '[dry-run] action=FULL-REPLACE %s (corrupted: BEGIN without END)\n' "$AGENTS_FILE"
else
printf '[dry-run] action=APPEND %s (no markers)\n' "$AGENTS_FILE"
fi
Summary
<!-- BEGIN ECC -->/<!-- END ECC -->markers so only the managed section is replaced on subsequent syncs — user content outside the markers is preservedgrepfor# Codex Supplement (From ECC .codex/AGENTS.md)is removed to eliminate false-positive overwrite risk (council-reviewed decision)cat > fileinstead ofmv tmp fileso dotfiles symlink setups aren't broken\rin marker detection so Windows-edited files don't take the corrupted/overwrite pathBEGINappears beforeENDin the file, not just that both strings existMerge logic
Legacy ECC-only files (from old
>overwrite) get ECC appended on first run, then deduplicated via markers on the next run. A timestamped backup is always saved before mutation.Test plan
sync-ecc-to-codex.sh --dry-run— verify no errors~/.codex/with noAGENTS.md— verify file created with markersAGENTS.mdwith user content — verify ECC block appended, user content intactAGENTS.md— verify symlink preserved🤖 Generated with Claude Code
via Happy
Summary by cubic
Merges ECC content into ~/.codex/AGENTS.md using markers so only the managed section updates and user content stays intact. Also hardens the sync for CRLF files and symlinks.
Written for commit 40643d9. Summary will update on new commits.
Summary by CodeRabbit