
Valkey master-replica support#131

Open
olehoerb wants to merge 13 commits into master from valkey-cluster-config

Conversation

@olehoerb
Collaborator

Description

feat: Implements complete Valkey master-replica support with leader-election-based backup/restore coordination. Ensures safe backup and restore operations in multi-replica deployments: only the Valkey master (pod-0) performs backup/restore operations.

  • Add leader election package using https://pkg.go.dev/k8s.io/client-go/tools/leaderelection for coordination
  • Add Valkey master-replica database implementation
  • Add integration tests for master-replica restore workflow
  • Add Kubernetes deployment examples and backing resources
  • Integrate leader election checks into backup coordination

References: #124
To fully close #124, a Helm chart is still needed.

- cluster setup not working correctly
- fix backup leader/master mismatch
- add context management
- optimize backup coordination
- refactor the name from valkey-cluster to valkey-master-replica
@olehoerb requested a review from a team as a code owner October 29, 2025 14:26
@metal-robot bot added the area: control-plane label (Affects the metal-stack control-plane area.) Oct 29, 2025
Contributor

@majst01 left a comment


Nice work, first simple comments from my side

Comment on lines +347 to +352
parts := strings.Split(podName, "-")
if len(parts) == 0 {
	return -1
}

ordinalStr := parts[len(parts)-1]
Contributor


consider strings.Cut
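A sketch of this kind of simplification (the helper name `podOrdinal` is illustrative, not the PR's actual function; since pod names can contain several dashes, `strings.LastIndex` fits the "take the last segment" semantics more directly than `strings.Cut`, which splits on the first separator):

```go
package main

import (
	"strconv"
	"strings"
)

// podOrdinal extracts the trailing StatefulSet ordinal from a pod name
// such as "valkey-master-replica-2". It returns -1 if no ordinal is found.
func podOrdinal(podName string) int {
	// Take everything after the last dash, since StatefulSet pod names
	// always end in "-<ordinal>".
	i := strings.LastIndex(podName, "-")
	if i < 0 {
		return -1
	}
	ordinal, err := strconv.Atoi(podName[i+1:])
	if err != nil {
		return -1
	}
	return ordinal
}
```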

Collaborator Author


Thanks for pointing it out! But I think I found an even better solution here after reviewing the strings functions.

Comment on lines +239 to +245
init.sh: "#!/bin/sh\nset -e\n\n# Extract pod ordinal from hostname (valkey-0, valkey-1,
etc.)\nORDINAL=$(hostname | sed 's/.*-//')\n\n# Pod 0 is the master, others are
replicas\nif [ \"$ORDINAL\" = \"0\" ]; then\n echo \"I am the master (pod-0)\"\t\t\n
\ exec valkey-server --port 6379 --bind 0.0.0.0\nelse\n echo \"I am a replica
(pod-$ORDINAL), connecting to master at valkey-0.valkey.${POD_NAMESPACE}.svc.cluster.local\"\n
\ exec valkey-server --port 6379 --bind 0.0.0.0 --replicaof valkey-0.valkey.${POD_NAMESPACE}.svc.cluster.local
6379\nfi\n"
Contributor


Can be done with a multiline yaml content e.g. |
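The same script expressed with a YAML block scalar could look like this (a sketch reconstructed from the quoted flow-scalar content above):

```yaml
init.sh: |
  #!/bin/sh
  set -e

  # Extract pod ordinal from hostname (valkey-0, valkey-1, etc.)
  ORDINAL=$(hostname | sed 's/.*-//')

  # Pod 0 is the master, others are replicas
  if [ "$ORDINAL" = "0" ]; then
    echo "I am the master (pod-0)"
    exec valkey-server --port 6379 --bind 0.0.0.0
  else
    echo "I am a replica (pod-$ORDINAL), connecting to master at valkey-0.valkey.${POD_NAMESPACE}.svc.cluster.local"
    exec valkey-server --port 6379 --bind 0.0.0.0 --replicaof valkey-0.valkey.${POD_NAMESPACE}.svc.cluster.local 6379
  fi
```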

Comment on lines +247 to +253
init.sh: "#!/bin/sh\nset -e\n\n# Extract pod ordinal from hostname (valkey-0, valkey-1,
etc.)\nORDINAL=$(hostname | sed 's/.*-//')\n\n# Pod 0 is the master, others are
replicas\nif [ \"$ORDINAL\" = \"0\" ]; then\n echo \"I am the master (pod-0)\"\t\t\n
\ exec valkey-server --port 6379 --bind 0.0.0.0\nelse\n echo \"I am a replica
(pod-$ORDINAL), connecting to master at valkey-0.valkey.${POD_NAMESPACE}.svc.cluster.local\"\n
\ exec valkey-server --port 6379 --bind 0.0.0.0 --replicaof valkey-0.valkey.${POD_NAMESPACE}.svc.cluster.local
6379\nfi\n"
Contributor


multiline content

@github-project-automation bot moved this to In Progress in Development Oct 29, 2025
@olehoerb requested a review from majst01 October 31, 2025 07:29
@vknabel moved this from In Progress to Upcoming in Development Nov 3, 2025
Contributor

@Gerrit91 left a comment


Actually looks really good and promising!

Not fully done yet but here are some first review comments.

Comment on lines +63 to +68
if leaderElector, ok := b.db.(database.DatabaseLeaderElector); ok {
	if !leaderElector.ShouldPerformBackup(ctx) {
		b.log.Debug("skipping backup - not elected as leader")
		return
	}
}
Contributor


Better use an extended contract in the DatabaseProber interface instead of a type cast, something like:

if !b.db.IsLeader(ctx) {
	b.log.Debug("skipping backup - not elected as leader")
	return
}

Makefile Outdated
.PHONY: test-integration-valkey-master-replica
test-integration-valkey-master-replica: kind-cluster-create
	kind --name backup-restore-sidecar load docker-image ghcr.io/metal-stack/backup-restore-sidecar:latest
	kind --name backup-restore-sidecar load docker-image ghcr.io/valkey-io/valkey:8.1-alpine
Contributor


Can be removed as this comes from the internet.

Comment on lines +216 to +221
if podName == "" {
	return nil, fmt.Errorf("cluster mode requires POD_NAME environment variable to be set")
}
if podNamespace == "" {
	return nil, fmt.Errorf("cluster mode requires POD_NAMESPACE environment variable to be set")
}
Contributor


These checks can be moved into the leader election package if they are required for it.

client *redis.Client

clusterMode bool
clusterSize int
Contributor


This field is not used and can probably be removed.

backup-cron-schedule: "*/1 * * * *"
object-prefix: valkey-test
object-prefix: valkey-test-${POD_NAME}
redis-addr: localhost:6379
Contributor


This key is not used for the valkey backend, so it can be removed.

log.Info("Creating Valkey instance", "clusterMode", clusterMode, "clusterSize", clusterSize)

v.client = redis.NewClient(&redis.Options{
Addr: "localhost:6379",
Contributor


This should ideally come from a configuration parameter (e.g. "valkey-addr")

}

if !isMaster {
db.log.Info("elected as backup leader but not Valkey master, skipping backup")
Contributor


How do you ensure that the leader election converges on the database master, so that no backups are skipped in case of a constant mismatch?

v := &Valkey{
log: log,
datadir: datadir,
password: getPassword(password),
Contributor


You can replace the getPassword function in favor of a function from metal-lib:

Suggested change
password: getPassword(password),
password: pointer.SafeDerefOrDefault(password, ""),
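Roughly, such a helper can be sketched as a small generic function (this is an illustrative reconstruction; the actual metal-lib implementation may differ in detail):

```go
package main

// SafeDerefOrDefault sketches the metal-lib pointer helper the reviewer
// refers to: it dereferences p if non-nil, otherwise it returns fallback.
func SafeDerefOrDefault[T any](p *T, fallback T) T {
	if p == nil {
		return fallback
	}
	return *p
}
```

Using a shared generic helper like this removes the need for a type-specific `getPassword` function.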


// Leader election considers database role: only restore if this pod will be the Valkey master
// In master-replica mode, pod-0 is always the master (determined by init.sh)
podName := os.Getenv("POD_NAME")
Contributor


Maybe it's more robust to retrieve this from the leaderElection package, because the pod name might originate from the configuration and not only from the environment variable.

return fmt.Errorf("restore file not present: %s", dump)
}

if db.clusterMode {
Contributor


Probably it's better to reduce code indentation by negating this expression. The for loop in this function has explicit returns anyway, so the last line of this function does not need to be repeated.

Suggested change
if db.clusterMode {
if !db.clusterMode {

Contributor

@Gerrit91 left a comment


I was able to play around a bit with the setup. Is it correct that this is not a real Valkey Cluster as described here, but rather just adds replicas, which cannot accept writes and are not promoted to master instances under any circumstances?

I came to this conclusion because I ran:

❯ k exec -it valkey-master-replica-0 -c valkey -- valkey-cli cluster info
ERR This instance has cluster support disabled

When killing pod-0 constantly, I can still read values from the database, but it's not possible to write anymore:

❯ k exec -it statefulsets/valkey-master-replica -c valkey -- valkey-cli set foo foo
(error) READONLY You can't write against a read only replica.

I think this is also a good improvement in general, as it allows serving read requests during a node roll or similar. But I am not sure if leader election is really required in this setup, because backups and restores can only be done from pod-0 anyway? Also, we should probably not call it clusterMode?

Can I ask you to describe the approach somewhere in a markdown document in the docs folder? I think this would help a lot. :)

Spec: corev1.PodSpec{
HostNetwork: true,
HostNetwork: true,
ServiceAccountName: "valkey-backup-restore",
Contributor


It would probably be good to add a topology spread constraint here for the example, keyed on the node name, like:

      topologySpreadConstraints:
      - labelSelector:
          matchLabels:
            app: valkey
        maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway

- use isMaster for backup coordination instead of leader election
- fix STATEFUL_NAME mismatch on valkey container
- restore HostNetwork for standalone, disable for master-replica
- fix newValkeyClient retry to actually ping
- added docs describing the approach
- replaced pointer usage with Go 1.26 pointer
- replaced pointer usage with Go 1.26 pointer
@olehoerb requested a review from Gerrit91 February 17, 2026 08:00
Comment on lines +85 to +86
1. All pods compete for a `Lease` resource.
2. The winner checks its pod ordinal. Only pod-0 (the future master) actually restores from backup.
Contributor


This I do not understand. Why isn't checking the ordinal enough to decide whether the backup needs to be restored? What benefit does the leader election bring?

Contributor


Imagine the scenario where all pods are terminated and all volumes are lost. Then the pods start up fresh and a replica wins the election and restores the data. When the master comes up, it will not restore the data because it's not the leader, and it starts without data. Wouldn't the replicas then sync empty data from the master?

Collaborator Author


That's actually true. The ordinal check is enough, and using the ordinal check should fix the issue with syncing.
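The agreed-on simplification could be sketched roughly like this (function and pod names are illustrative, not the PR's actual code):

```go
package main

import (
	"strconv"
	"strings"
)

// shouldRestore sketches the simplification discussed above: instead of a
// leader election, only the pod that will become the Valkey master
// (StatefulSet ordinal 0) restores from backup, so a fresh master can never
// come up empty while a replica holds the restored data.
func shouldRestore(podName string) bool {
	// Pod names end in "-<ordinal>"; take the segment after the last dash.
	i := strings.LastIndex(podName, "-")
	if i < 0 {
		return false
	}
	ordinal, err := strconv.Atoi(podName[i+1:])
	return err == nil && ordinal == 0
}
```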


## Topology

This is **not** a Valkey Cluster (which requires `cluster-enabled yes` and has built-in sharding/failover). Instead it is a simple master-replica setup using a Kubernetes StatefulSet:
Contributor


Did you evaluate the cluster mode at some point? What were the issues that prevent it from being used? In general, it would be really nice if a node roll of a K8s cluster did not cause any interruption of the service in terms of write operations.


Successfully merging this pull request may close these issues.

Evaluate concept for valkey in clustered configuration

3 participants