Conversation
- cluster setup not working correctly
- fix cluster logic issues
- fix backup leader/master mismatch
- add context management
- optimize backup coordination
- refactored the name from valkey-cluster to valkey-master-replica
majst01
left a comment
Nice work, first simple comments from my side
```go
parts := strings.Split(podName, "-")
if len(parts) == 0 {
	return -1
}

ordinalStr := parts[len(parts)-1]
```
Thanks for pointing it out! But I think I found an even better solution here after reviewing the strings functions
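The final solution isn't shown in the thread, but a strings-based extraction along the lines being discussed could look like this sketch (the function name `podOrdinal` and the `-1` fallback are assumptions based on the diff above):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// podOrdinal extracts the trailing ordinal from a StatefulSet pod name
// such as "valkey-0" or "valkey-master-replica-2". It returns -1 when no
// numeric suffix is present. This is an illustrative sketch, not
// necessarily the implementation that landed in the PR.
func podOrdinal(podName string) int {
	idx := strings.LastIndex(podName, "-")
	if idx < 0 || idx == len(podName)-1 {
		return -1
	}
	n, err := strconv.Atoi(podName[idx+1:])
	if err != nil {
		return -1
	}
	return n
}

func main() {
	fmt.Println(podOrdinal("valkey-0"))                // 0
	fmt.Println(podOrdinal("valkey-master-replica-2")) // 2
	fmt.Println(podOrdinal("valkey"))                  // -1
}
```

Using `strings.LastIndex` avoids allocating a slice for all the parts and handles multi-dash names like `valkey-master-replica-2` naturally.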
deploy/valkey-local.yaml
Outdated
```yaml
init.sh: "#!/bin/sh\nset -e\n\n# Extract pod ordinal from hostname (valkey-0, valkey-1, etc.)\nORDINAL=$(hostname | sed 's/.*-//')\n\n# Pod 0 is the master, others are replicas\nif [ \"$ORDINAL\" = \"0\" ]; then\n  echo \"I am the master (pod-0)\"\n  exec valkey-server --port 6379 --bind 0.0.0.0\nelse\n  echo \"I am a replica (pod-$ORDINAL), connecting to master at valkey-0.valkey.${POD_NAMESPACE}.svc.cluster.local\"\n  exec valkey-server --port 6379 --bind 0.0.0.0 --replicaof valkey-0.valkey.${POD_NAMESPACE}.svc.cluster.local 6379\nfi\n"
```
Can be done with a multiline YAML block scalar, e.g.:

```yaml
init.sh: |
  #!/bin/sh
  set -e

  # Extract pod ordinal from hostname (valkey-0, valkey-1, etc.)
  ORDINAL=$(hostname | sed 's/.*-//')

  # Pod 0 is the master, others are replicas
  if [ "$ORDINAL" = "0" ]; then
    echo "I am the master (pod-0)"
    exec valkey-server --port 6379 --bind 0.0.0.0
  else
    echo "I am a replica (pod-$ORDINAL), connecting to master at valkey-0.valkey.${POD_NAMESPACE}.svc.cluster.local"
    exec valkey-server --port 6379 --bind 0.0.0.0 --replicaof valkey-0.valkey.${POD_NAMESPACE}.svc.cluster.local 6379
  fi
```
Gerrit91
left a comment
Actually looks really good and promising!
Not fully done yet but here are some first review comments.
cmd/internal/backup/backup.go
Outdated
```go
if leaderElector, ok := b.db.(database.DatabaseLeaderElector); ok {
	if !leaderElector.ShouldPerformBackup(ctx) {
		b.log.Debug("skipping backup - not elected as leader")
		return
	}
}
```
Better use an extended contract in the DatabaseProber interface instead of a type cast, something like:
```go
if !b.db.IsLeader(ctx) {
	b.log.Debug("skipping backup - not elected as leader")
	return
}
```
Makefile
Outdated
```makefile
.PHONY: test-integration-valkey-master-replica
test-integration-valkey-master-replica: kind-cluster-create
	kind --name backup-restore-sidecar load docker-image ghcr.io/metal-stack/backup-restore-sidecar:latest
	kind --name backup-restore-sidecar load docker-image ghcr.io/valkey-io/valkey:8.1-alpine
```
Can be removed as this comes from the internet.
```go
if podName == "" {
	return nil, fmt.Errorf("cluster mode requires POD_NAME environment variable to be set")
}
if podNamespace == "" {
	return nil, fmt.Errorf("cluster mode requires POD_NAMESPACE environment variable to be set")
}
```
These checks can be moved into the leader election package if they are required for it.
```go
client *redis.Client

clusterMode bool
clusterSize int
```
This field is not used and can probably be removed.
```diff
  backup-cron-schedule: "*/1 * * * *"
- object-prefix: valkey-test
+ object-prefix: valkey-test-${POD_NAME}
  redis-addr: localhost:6379
```
This key is not used for the valkey backend, so it can be removed.
```go
log.Info("Creating Valkey instance", "clusterMode", clusterMode, "clusterSize", clusterSize)

v.client = redis.NewClient(&redis.Options{
	Addr: "localhost:6379",
```
This should ideally come from a configuration parameter (e.g. "valkey-addr").
```go
}

if !isMaster {
	db.log.Info("elected as backup leader but not Valkey master, skipping backup")
```
How do you ensure that the leader election converges on the database master? Otherwise no backups would be taken at all in case of a constant mismatch.
```go
v := &Valkey{
	log:      log,
	datadir:  datadir,
	password: getPassword(password),
```
You can replace the getPassword function in favor of a function from metal-lib:
```diff
- password: getPassword(password),
+ password: pointer.SafeDerefOrDefault(password, ""),
```
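For illustration, a local generic helper with the same behavior as the metal-lib function mentioned above (this is a sketch of the semantics, not the actual metal-lib implementation):

```go
package main

import "fmt"

// safeDerefOrDefault mirrors pointer.SafeDerefOrDefault from metal-lib:
// dereference p if it is non-nil, otherwise return fallback. A nil-safe
// replacement for hand-written getPassword-style helpers.
func safeDerefOrDefault[T any](p *T, fallback T) T {
	if p == nil {
		return fallback
	}
	return *p
}

func main() {
	var unset *string
	pw := "secret"
	fmt.Printf("%q\n", safeDerefOrDefault(unset, "")) // ""
	fmt.Printf("%q\n", safeDerefOrDefault(&pw, ""))   // "secret"
}
```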
```go
// Leader election considers database role: only restore if this pod will be the Valkey master
// In master-replica mode, pod-0 is always the master (determined by init.sh)
podName := os.Getenv("POD_NAME")
```
Maybe it's more robust to retrieve this from the leaderElection package, because the pod name might originate from the configuration and not only from the environment variable.
```go
	return fmt.Errorf("restore file not present: %s", dump)
}

if db.clusterMode {
```
Probably it's better to reduce code indentation by negating this expression; the for loop in this function has explicit returns anyway, so the last line of this function does not need to be repeated.
```diff
- if db.clusterMode {
+ if !db.clusterMode {
```
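A runnable toy example of the early-return shape being suggested (types, fields, and values here are made up for illustration):

```go
package main

import "fmt"

type db struct{ clusterMode bool }

// process shows the refactored shape: guard with the negated condition
// and return immediately, so the main loop sits at the top indentation
// level. The loop returns explicitly, so nothing needs to follow it.
func (d db) process(items []string) string {
	if !d.clusterMode {
		return "skipped"
	}
	for _, it := range items {
		if it == "target" {
			return "found " + it
		}
	}
	return "not found"
}

func main() {
	fmt.Println(db{clusterMode: false}.process(nil))                    // skipped
	fmt.Println(db{clusterMode: true}.process([]string{"a", "target"})) // found target
}
```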
Gerrit91
left a comment
I was able to play around a bit with the setup. Is it correct that this is not a real Valkey Cluster as described here but rather just adding replicas, which cannot accept writes and are not promoted to master instances under any circumstances?
I came to this conclusion because I ran:

```shell
❯ k exec -it valkey-master-replica-0 -c valkey -- valkey-cli cluster info
ERR This instance has cluster support disabled
```
When killing pod-0 constantly, I can still read values from the database, but it's no longer possible to write:

```shell
❯ k exec -it statefulsets/valkey-master-replica -c valkey -- valkey-cli set foo foo
(error) READONLY You can't write against a read only replica.
```
I think this is also a good improvement in general, as it allows serving read requests during a node roll or similar. But I am not sure if leader election is really required in this setup, because backups and restores can only be done from pod-0 anyway? Also, should we really call it clusterMode?
Can I ask you to somewhere describe the approach in a markdown document in the docs folder? I think this would help a lot. :)
```go
Spec: corev1.PodSpec{
	HostNetwork:        true,
	ServiceAccountName: "valkey-backup-restore",
```
It would probably be good to add a topology spread constraint here, for example on the node name, like:
```yaml
topologySpreadConstraints:
  - labelSelector:
      matchLabels:
        app: valkey
    maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: ScheduleAnyway
```

```markdown
1. All pods compete for a `Lease` resource.
2. The winner checks its pod ordinal. Only pod-0 (the future master) actually restores from backup.
```
This I do not understand. Why isn't checking the ordinal enough if the backup needs to be restored? What benefit does the leader election bring?
Imagine the scenario where all pods are terminated and all volumes are lost. Then the pods start up fresh and a replica wins the election and restores the data. When the master comes up, it will not restore the data because it's not the leader, and it starts without data. Wouldn't the replicas then sync empty data from the master?
That's actually true. The ordinal check is enough, and using it should fix the issue with syncing.
```markdown
## Topology

This is **not** a Valkey Cluster (which requires `cluster-enabled yes` and has built-in sharding/failover). Instead it is a simple master-replica setup using a Kubernetes StatefulSet:
```
Did you evaluate cluster mode at some point? What were the issues that prevent using it? In general it would be really nice if a node roll of a K8s cluster did not cause any interruption of the service in terms of write operations.
Description
feat: Implements complete Valkey master-replica support with leader-election-based backup/restore coordination. Ensures safe backup and restore operations in multi-replica deployments.
Only the Valkey master (pod-0) performs backup/restore operations.
References: #124
To fully close #124, a Helm chart is still needed.