
Fix memory leak: clean up stale/failed items from extract map #624

Open
mprachar wants to merge 1 commit into Unpackerr:main from mprachar:fix/memory-leak-stale-items


@mprachar mprachar commented Mar 31, 2026

Summary

  • Add defensive cleanup for stale/failed items in the extract state map
  • Add opt-in pprof debug endpoints for runtime profiling

Context

Note: These fixes were originally thought to be the root cause of observed memory growth (3.5GB after 4 days). After deploying with pprof enabled, heap profiling revealed that the actual cause is in golift.io/xtractr cue.go: the FLAC encoder loads entire files into memory during CUE splitting (writeTrackFLAC → flac.(*Encoder).WriteFrame, 121K allocations / 1.8GB).

These changes are still valuable as defensive improvements — they prevent unbounded map growth from stuck items — but they do not address the primary memory issue.

Changes

| File | Change |
| --- | --- |
| handlers.go | Add EXTRACTFAILED retries-exhausted → DELETED case |
| handlers.go | Add 24-hour stale item timeout for EXTRACTED/EXTRACTING/QUEUED |
| folder.go | Add folder EXTRACTFAILED retries-exhausted → cleanup from both maps |
| webserver.go | Add Pprof config field (default: false) and /debug/pprof/* endpoints |
| start.go | Add staleItemTimeout constant (24 hours) |

What these fix

Three code paths allow items to stay in u.Map indefinitely:

  1. EXTRACTFAILED with exhausted retries — no switch case matched in checkExtractDone()
  2. EXTRACTFAILED folder items — same gap in checkFolderStats()
  3. No TTL for intermediate states — items stuck at EXTRACTED/EXTRACTING/QUEUED had no maximum age

Each stuck item holds a *xtractr.Response with file lists. Over time this causes gradual map growth, though the primary memory consumer is the xtractr FLAC encoder (separate issue).

Test plan

  • go build ./... passes
  • go test ./... passes
  • go vet ./... passes
  • pprof endpoint confirmed working (used to diagnose the real leak)

🤖 Generated with Claude Code

Three pre-existing leak paths cause unbounded memory growth over days:

1. EXTRACTFAILED items with exhausted retries have no matching case in
   checkExtractDone() — they stay in u.Map forever holding xtractr.Response
2. EXTRACTFAILED folder items with exhausted retries and positive DeleteAfter
   match no case in checkFolderStats() — stuck in both maps forever
3. Items stuck at EXTRACTED/EXTRACTING/QUEUED (e.g. Starr app never imports)
   have no TTL — they hold xtractr.Response indefinitely

Fixes:
- Add retries-exhausted cleanup case in handlers.go and folder.go
- Add 24-hour stale item safety net for intermediate states
- Add opt-in pprof debug endpoints (webserver.pprof config) for profiling

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>