fix: fallback to replset primary when leader lease expired #14

Open
wallyxjh wants to merge 1 commit into labring:fix/v0.9.3 from wallyxjh:fix-mongo-leader-lease

wallyxjh (Collaborator) commented Feb 6, 2026

  • Symptom: Scaling MongoDB in from 3 replicas to 1 gets stuck; the Component controller keeps logging "cluster has no leader". The Cluster spec shows replicas: 1 while the InstanceSet remains at 3.
  • Root cause: MongoDB does not run the HA loop by default, so the leader ConfigMap (CM) lease is not renewed and expires. leaveMember relies on the leader CM to locate the primary, so it fails once the leader is missing.
  • Fix: Add a fallback in the MongoDB lorry manager: when the leader CM is invalid, discover the primary via replSetGetStatus; if that does not identify one, probe members directly to find the primary, then proceed with
    replSetReconfig.
  • Impact: Limited to MongoDB scale‑in/leaveMember behavior (lorry sidecar). No behavior change when leader CM is valid.
  • Relevant code: pkg/lorry/engines/mongodb/manager.go
  • Verification:
    1. Scale out to 3: kubectl patch cluster -n --type json -p '[{"op":"replace","path":"/spec/componentSpecs/0/replicas","value":3}]'
    2. Expire leader lease: kubectl patch cm -n -mongodb-leader --type merge -p '{"metadata":{"annotations":{"renew-time":"1","acquire-time":"1","ttl":"15"}}}'
    3. Scale in to 1: patch replicas back to 1
    4. Expected result: the old image gets stuck logging "cluster has no leader"; the new lorry image completes the scale‑in with clean logs.
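
For reviewers, the fallback order described under Fix can be sketched roughly as follows. This is a minimal illustration, not the code in pkg/lorry/engines/mongodb/manager.go: the names Member, ProbeFunc, and resolvePrimary are hypothetical, the leader CM value is assumed to be directly usable as a member address, and the probe is passed in as a function so the sketch runs without a MongoDB driver. The one detail taken from MongoDB itself is that replSetGetStatus reports state 1 (stateStr "PRIMARY") for the primary member.

```go
package main

import (
	"errors"
	"fmt"
)

// Member mirrors a few fields of one entry in the "members" array
// returned by replSetGetStatus. The type itself is hypothetical.
type Member struct {
	Name     string // "host:port"
	State    int    // 1 means PRIMARY in replSetGetStatus
	StateStr string // e.g. "PRIMARY", "SECONDARY"
}

const statePrimary = 1

// ErrNoPrimary is returned when every lookup strategy fails.
var ErrNoPrimary = errors.New("no primary found")

// ProbeFunc is a hypothetical hook that asks a single member whether it
// currently considers itself the primary (for example via a hello command).
type ProbeFunc func(addr string) (bool, error)

// primaryFromStatus scans replSetGetStatus members for state PRIMARY.
func primaryFromStatus(members []Member) (string, bool) {
	for _, m := range members {
		if m.State == statePrimary {
			return m.Name, true
		}
	}
	return "", false
}

// resolvePrimary applies the fallback order from the description:
// a valid leader CM value first, then replSetGetStatus, then direct probes.
// An empty leaderFromCM stands in for "leader CM is invalid or expired".
func resolvePrimary(leaderFromCM string, members []Member, probe ProbeFunc) (string, error) {
	if leaderFromCM != "" {
		return leaderFromCM, nil // no behavior change when the leader CM is valid
	}
	if name, ok := primaryFromStatus(members); ok {
		return name, nil
	}
	if probe != nil {
		for _, m := range members {
			isPrimary, err := probe(m.Name)
			if err != nil {
				continue // an unreachable member is skipped, not fatal
			}
			if isPrimary {
				return m.Name, nil
			}
		}
	}
	return "", ErrNoPrimary
}

func main() {
	members := []Member{
		{Name: "mongo-0:27017", State: 2, StateStr: "SECONDARY"},
		{Name: "mongo-1:27017", State: 1, StateStr: "PRIMARY"},
	}
	primary, err := resolvePrimary("", members, nil)
	fmt.Println(primary, err)
}
```

Keeping the leader CM check first is what preserves the "no behavior change when leader CM is valid" property noted under Impact; the status scan and the probes only run once that check fails.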
