Skip to content

EnvoyGateway config only validated on control plane startup #7333

@twelvelabs

Description

@twelvelabs

Description:

Seems that EnvoyGateway validation is only performed on control plane startup, and not when the config map is updated. This can cause systems to silently get into an invalid state and prevent the control plane from restarting.

Repro steps:

We recently configured the global rate limiting service in our clusters using the following helm chart config:

config:
  envoyGateway:
    rateLimit:
      backend:
        type: Redis
        redis:
          url: 0.0.0.0:6378 # actual IP redacted

This worked fine for several days, but this morning one of the EG control plane pod s restarted and went into a crash loop w/ the following error:

unknown ratelimit redis url format: parse "0.0.0.0:6378": first path segment in URL cannot contain colon

Which appears to be from this validation logic.

Seems we just need to add a redis:// prefix to the URL, but we would have been notified earlier about the validation error.

I suspect it's not easy to surface this via Helm (given the async nature of how the config map is watched/processed), but I would have expected the watcher to log a validation error and prevent the rate limit service from getting setup.

I think the config watching logic is here. Shouldn't that code also call r.cfg.Validate()?

Environment:

  • chart: oci://docker.io/envoyproxy/gateway-helm
  • version: v1.5.0-rc.2

Metadata

Metadata

Assignees

Labels

kind/bugSomething isn't working

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions