gRPC-gateway WebSocket /v3/* proxy allows unauthenticated memory exhaustion via oversized message amplification #21556

@manizada

What happened?

(Duplicating my original security report below. Per the discussion on our email chain, this will be handled as a non-security bug report.)

Summary

A remote unauthenticated attacker can force etcd to allocate large amounts of memory by sending a single large websocket message to the default /v3/ websocket proxy (gRPC-gateway). etcd's memory use grows by roughly 5x the size of the attacker's payload, which can lead to a memory-exhaustion DoS and a potential OOM kill or service outage.

Details

etcd wraps the gRPC-gateway handler under /v3/ with a websocket proxy (grpc-websocket-proxy). The proxy reads websocket messages with gorilla/websocket ReadMessage() without a configured read limit, so each message is buffered in memory in full. As a result, the size of an attacker-controlled websocket message directly drives etcd heap usage.

In the PoC below, a single unauthenticated 256 MiB websocket message to /v3/kv/range caused etcd RSS to increase by ~1.26 GiB (~5x the message size), and RSS remained elevated for at least 10 seconds after the request completed.
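The ~1.26 GiB and ~5x figures can be re-derived from the values in the PoC output included in this report. The snippet below is only a sanity check of that arithmetic; the variable names mirror the keys printed by the PoC.

```python
KIB = 1024

# Values copied from the PoC output in this report.
rss_before_kib = 29980
rss_after_kib = 1348604
websocket_payload_bytes = 268435456  # 256 MiB

rss_delta_kib = rss_after_kib - rss_before_kib
rss_delta_gib = rss_delta_kib / (KIB * KIB)
amplification = (rss_delta_kib * KIB) / websocket_payload_bytes

print(f"rss_delta_kib={rss_delta_kib}")          # rss_delta_kib=1318624
print(f"rss_delta_gib={rss_delta_gib:.2f}")      # rss_delta_gib=1.26
print(f"amplification={amplification:.2f}x")     # amplification=5.03x
```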

Impact

Unauthenticated remote memory-exhaustion DoS against any etcd client endpoint the attacker can reach. The impact is especially severe for containerized deployments with tight memory limits.

Suggested severity: High, CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H (7.5)

What did you expect to happen?

etcd should enforce request-size limits (e.g., --max-request-bytes) before fully reading/buffering the payload. That would prevent an attacker from a) driving this much memory use directly through the request and b) amplifying it 5x.
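To illustrate what "enforce the limit before buffering" means at the websocket layer, here is a minimal Python sketch, not etcd or grpc-websocket-proxy code. It parses an RFC 6455 frame header and rejects the frame based on its declared length, before any payload byte is read. The names read_frame_bounded, FrameTooLarge, and MAX_REQUEST_BYTES are hypothetical, and the 1.5 MiB limit is an assumption chosen to mirror etcd's default --max-request-bytes. In Go, gorilla/websocket exposes Conn.SetReadLimit for a similar purpose; whether and where the proxy should call it is for the maintainers to decide.

```python
import io
import struct

# Assumption: mirrors etcd's default --max-request-bytes (1.5 MiB).
MAX_REQUEST_BYTES = 1572864


class FrameTooLarge(Exception):
    """Raised when a frame declares a payload larger than the limit."""


def read_exact(stream, n):
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("truncated frame")
    return data


def read_frame_bounded(stream, limit=MAX_REQUEST_BYTES):
    """Read one RFC 6455 frame, checking its declared length first."""
    _b0, b1 = read_exact(stream, 2)
    masked = bool(b1 & 0x80)
    length = b1 & 0x7F
    if length == 126:    # 16-bit extended payload length
        (length,) = struct.unpack("!H", read_exact(stream, 2))
    elif length == 127:  # 64-bit extended payload length
        (length,) = struct.unpack("!Q", read_exact(stream, 8))
    # Reject before allocating or reading any payload bytes.
    if length > limit:
        raise FrameTooLarge(f"declared {length} bytes, limit {limit}")
    key = read_exact(stream, 4) if masked else b""
    payload = read_exact(stream, length)
    if masked:
        payload = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return payload


# Usage: a header declaring 256 MiB is rejected after reading only 10 bytes.
header = bytes([0x82, 0x80 | 127]) + struct.pack("!Q", 268435456)
stream = io.BytesIO(header)
try:
    read_frame_bounded(stream)
except FrameTooLarge as exc:
    print("rejected:", exc, "| bytes consumed:", stream.tell())
```

A websocket message can be split across several frames, so a real implementation would also need to cap the cumulative size of a fragmented message, not just each frame.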

How can we reproduce it (as minimally and precisely as possible)?

PoC

repro-ws-huge.py

1. Adjust the attached PoC so that ETCD_BIN points at your etcd binary (see the top-level constant definitions for other tunables).
2. Run python ./repro-ws-huge.py
3. Expected output:

*** LAUNCH A FRESH SINGLE-NODE ETCD FOR THE WEBSOCKET MEMORY TEST ***
$ /usr/local/bin/vh-etcd --name vh1 --data-dir /tmp/vh-repro-etcd-f002-data --listen-client-urls http://127.0.0.1:2379 --advertise-client-urls http://127.0.0.1:2379 --listen-peer-urls http://127.0.0.1:2380 --initial-advertise-peer-urls http://127.0.0.1:2380 --initial-cluster vh1=http://127.0.0.1:2380 --initial-cluster-state new --log-level info
*** CAPTURE BASELINE RSS BEFORE THE OVERSIZED WEBSOCKET MESSAGE ***
rss_before_kib=29980
*** SEND ONE OVERSIZED WEBSOCKET MESSAGE TO /V3/KV/RANGE ***
target=ws://127.0.0.1:2379/v3/kv/range
websocket_payload_bytes=268435456
*** CAPTURE RSS 10 SECONDS AFTER THE WEBSOCKET MESSAGE ***
rss_after_kib=1348604
rss_delta_kib=1318624
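The attached script is not reproduced in this report. For anyone who wants to take equivalent rss_before_kib / rss_after_kib readings by hand, the following is one possible way to do it on Linux, assuming /proc is available. The helper names parse_vmrss_kib and rss_kib are hypothetical, and the PoC may measure RSS differently.

```python
import re


def parse_vmrss_kib(status_text):
    """Extract the VmRSS value (in KiB) from /proc/<pid>/status contents."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS not found")
    return int(match.group(1))


def rss_kib(pid):
    """Return the current resident set size of a process in KiB (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vmrss_kib(f.read())


# Usage with a sample of the /proc status format (values are illustrative).
sample = "Name:\tetcd\nVmPeak:\t 1400000 kB\nVmRSS:\t   29980 kB\n"
print(f"rss_before_kib={parse_vmrss_kib(sample)}")
```

Calling rss_kib with the etcd PID before and 10 seconds after sending the message gives the two readings whose difference is reported as rss_delta_kib.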

Anything else we need to know?

No response

Etcd version (please run commands below)

Reproduced on etcd 3.6.7, 3.6.8, and main at commit c3aff56.

Etcd configuration (command line flags or environment variables)

No response

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

No response

Relevant log output
