Contributor

@ndebuhr ndebuhr commented Oct 30, 2025

Description

This PR proposes a more flexible default health probe bind address that works in both IPv4 and IPv6 environments.

Currently, the sample deployment and the default Helm chart installation fail in IPv4-only environments, where the fdb-operator pod crash-loops with the following error:

kubectl logs fdb-operator-869cb4684d-8f5vz -n fdb-operator
Defaulted container "manager" out of: manager, foundationdb-kubernetes-init-7-1 (init), foundationdb-kubernetes-init-7-3 (init), foundationdb-kubernetes-init-7-4 (init)
{"level":"info","ts":"2025-10-30T01:03:58Z","logger":"setup","msg":"Operator starting in single namespace mode","namespace":"fdb-operator"}
{"level":"error","ts":"2025-10-30T01:03:58Z","logger":"setup","msg":"unable to start manager","error":"error listening on [::1]:9443: listen tcp [::1]:9443: bind: cannot assign requested address","stacktrace":"github.com/FoundationDB/fdb-kubernetes-operator/v2/setup.StartManager\n\t/workspace/setup/setup.go:429\nmain.main\n\t/workspace/main.go:58\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:283"}

This PR replaces the hard-coded IPv6 loopback address ([::1]:9443) with localhost:9443, which binds to whichever address family the system resolver prefers.
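To illustrate the difference between the two defaults, here is a minimal Go sketch that is not taken from the operator code. The helper name listenLoopback is hypothetical, and port 0 is used instead of 9443 so that the kernel picks a free port. Note that the Go net package documents that listening on a host name creates a listener for at most one of the host's IP addresses, so which family is chosen depends on the resolver and the runtime.

```go
package main

import (
	"fmt"
	"net"
)

// listenLoopback (hypothetical helper) binds a TCP listener to the given
// host:port and returns the listener plus the concrete address it is
// bound to. The caller is responsible for closing the listener.
func listenLoopback(address string) (net.Listener, *net.TCPAddr, error) {
	ln, err := net.Listen("tcp", address)
	if err != nil {
		return nil, nil, err
	}
	tcpAddr, ok := ln.Addr().(*net.TCPAddr)
	if !ok {
		ln.Close()
		return nil, nil, fmt.Errorf("unexpected address type %T", ln.Addr())
	}
	return ln, tcpAddr, nil
}

func main() {
	// Port 0 asks the kernel for any free port, so this sketch does not
	// collide with a real health probe listener on 9443.
	for _, address := range []string{"[::1]:0", "localhost:0"} {
		ln, addr, err := listenLoopback(address)
		if err != nil {
			// On a host without an IPv6 loopback, "[::1]:0" is expected
			// to fail here, similar to the error in the log above.
			fmt.Printf("%s -> error: %v\n", address, err)
			continue
		}
		fmt.Printf("%s -> bound to %s\n", address, addr)
		ln.Close()
	}
}
```

On an IPv4-only host, the first attempt should fail and the second should succeed on a loopback address; on a dual-stack host, both are expected to succeed.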

Type of change

Bug fix (non-breaking change which fixes an issue).

Discussion

We could alternatively update the Helm chart and/or deployment sample to override via the --health-probe-bind-address flag, but I think an IP-stack-agnostic default value probably makes the most sense.
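As a rough sketch of the override alternative, the following Go program shows how a flag default and a command-line override interact. It assumes the standard library flag package and a hypothetical parseProbeAddress helper; the operator's actual flag wiring may differ. Only the flag name and the two address values come from this PR.

```go
package main

import (
	"flag"
	"fmt"
)

// parseProbeAddress (hypothetical helper) parses args with a flag set that
// defines only the health probe flag, using the current default value.
func parseProbeAddress(args []string) (string, error) {
	fs := flag.NewFlagSet("manager", flag.ContinueOnError)
	var addr string
	fs.StringVar(
		&addr,
		"health-probe-bind-address",
		"[::1]:9443", // current default discussed in this PR
		"address the health probe server binds to",
	)
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return addr, nil
}

func main() {
	// No arguments: the default applies.
	def, err := parseProbeAddress(nil)
	if err != nil {
		panic(err)
	}
	fmt.Println("default: ", def)

	// A deployment or Helm chart could pass the flag explicitly instead of
	// changing the compiled-in default.
	over, err := parseProbeAddress([]string{"--health-probe-bind-address=127.0.0.1:9443"})
	if err != nil {
		panic(err)
	}
	fmt.Println("override:", over)
}
```

With this approach the compiled-in default stays unchanged for existing users, and only manifests that pass the flag get the new bind address.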

Testing

Tested manually on IPv4 (Google Kubernetes Engine standard cluster), and the change fixes the crash looping.

kubectl logs fdb-operator-558dbbc867-l4527 -n fdb-operator --follow
Defaulted container "manager" out of: manager, foundationdb-kubernetes-init-7-1 (init), foundationdb-kubernetes-init-7-3 (init), foundationdb-kubernetes-init-7-4 (init)
{"level":"info","ts":"2025-10-30T01:43:04Z","logger":"setup","msg":"Operator starting in single namespace mode","namespace":"fdb-operator"}
{"level":"info","ts":"2025-10-30T01:43:04Z","logger":"setup","msg":"Could not parse version from directory name","name":"7.1"}
{"level":"info","ts":"2025-10-30T01:43:04Z","logger":"setup","msg":"Could not parse version from directory name","name":"7.4"}
{"level":"info","ts":"2025-10-30T01:43:04Z","logger":"setup","msg":"Could not parse version from directory name","name":"7.3"}
{"level":"info","ts":"2025-10-30T01:43:04Z","logger":"setup","msg":"Updating pod update method","podUpdateMethod":"update"}
{"level":"info","ts":"2025-10-30T01:43:04Z","logger":"setup","msg":"setup manager"}
{"level":"info","ts":"2025-10-30T01:43:04Z","logger":"controller-runtime.metrics","msg":"Starting metrics server"}
{"level":"info","ts":"2025-10-30T01:43:04Z","logger":"controller-runtime.metrics","msg":"Serving metrics server","bindAddress":":8080","secure":false}
{"level":"info","ts":"2025-10-30T01:43:04Z","msg":"starting server","name":"health probe","addr":"127.0.0.1:9443"}
{"level":"info","ts":"2025-10-30T01:43:04Z","msg":"attempting to acquire leader lease fdb-operator/fdb-kubernetes-operator..."}
{"level":"info","ts":"2025-10-30T01:43:22Z","msg":"successfully acquired lease fdb-operator/fdb-kubernetes-operator"}
{"level":"info","ts":"2025-10-30T01:43:22Z","msg":"Starting EventSource","controller":"foundationdbcluster","controllerGroup":"apps.foundationdb.org","controllerKind":"FoundationDBCluster","source":"kind source: *v1.Service"}
{"level":"info","ts":"2025-10-30T01:43:22Z","msg":"Starting EventSource","controller":"foundationdbrestore","controllerGroup":"apps.foundationdb.org","controllerKind":"FoundationDBRestore","source":"kind source: *v1beta2.FoundationDBRestore"}
{"level":"info","ts":"2025-10-30T01:43:22Z","msg":"Starting EventSource","controller":"foundationdbcluster","controllerGroup":"apps.foundationdb.org","controllerKind":"FoundationDBCluster","source":"kind source: *v1beta2.FoundationDBCluster"}
{"level":"info","ts":"2025-10-30T01:43:22Z","msg":"Starting EventSource","controller":"foundationdbcluster","controllerGroup":"apps.foundationdb.org","controllerKind":"FoundationDBCluster","source":"kind source: *v1.Pod"}
{"level":"info","ts":"2025-10-30T01:43:22Z","msg":"Starting EventSource","controller":"foundationdbbackup","controllerGroup":"apps.foundationdb.org","controllerKind":"FoundationDBBackup","source":"kind source: *v1.Deployment"}
{"level":"info","ts":"2025-10-30T01:43:22Z","msg":"Starting EventSource","controller":"foundationdbcluster","controllerGroup":"apps.foundationdb.org","controllerKind":"FoundationDBCluster","source":"kind source: *v1.PersistentVolumeClaim"}
{"level":"info","ts":"2025-10-30T01:43:22Z","msg":"Starting EventSource","controller":"foundationdbcluster","controllerGroup":"apps.foundationdb.org","controllerKind":"FoundationDBCluster","source":"kind source: *v1.ConfigMap"}
{"level":"info","ts":"2025-10-30T01:43:22Z","msg":"Starting EventSource","controller":"foundationdbbackup","controllerGroup":"apps.foundationdb.org","controllerKind":"FoundationDBBackup","source":"kind source: *v1beta2.FoundationDBBackup"}
{"level":"info","ts":"2025-10-30T01:43:22Z","msg":"Starting Controller","controller":"foundationdbrestore","controllerGroup":"apps.foundationdb.org","controllerKind":"FoundationDBRestore"}
{"level":"info","ts":"2025-10-30T01:43:22Z","msg":"Starting workers","controller":"foundationdbrestore","controllerGroup":"apps.foundationdb.org","controllerKind":"FoundationDBRestore","worker count":1}
{"level":"info","ts":"2025-10-30T01:43:22Z","msg":"Starting Controller","controller":"foundationdbcluster","controllerGroup":"apps.foundationdb.org","controllerKind":"FoundationDBCluster"}
{"level":"info","ts":"2025-10-30T01:43:22Z","msg":"Starting workers","controller":"foundationdbcluster","controllerGroup":"apps.foundationdb.org","controllerKind":"FoundationDBCluster","worker count":1}
{"level":"info","ts":"2025-10-30T01:43:22Z","msg":"Starting Controller","controller":"foundationdbbackup","controllerGroup":"apps.foundationdb.org","controllerKind":"FoundationDBBackup"}
{"level":"info","ts":"2025-10-30T01:43:22Z","msg":"Starting workers","controller":"foundationdbbackup","controllerGroup":"apps.foundationdb.org","controllerKind":"FoundationDBBackup","worker count":1}

Documentation

I don't believe we need to add/update documentation for this one.

&o.HealthProbeBindAddress,
"health-probe-bind-address",
- "[::1]:9443",
+ "localhost:9443",
Member

We had a discussion around the default in the PR that added this flag: #2340 (comment). We decided to keep the current default value to reduce the risk of breaking things for customers. I would propose changing the default in the Helm chart and creating a GitHub issue to change the operator's default before cutting the 3.0 release of the operator (probably next year).
