feat(datadog_metrics sink): switch series v2 endpoint to zstd compression #24956

vladimir-dd wants to merge 3 commits into master
Conversation
Force-pushed 5924ee2 to c4c80b6
- Add a `DatadogMetricsCompression` enum (Zlib/Zstd) to `config.rs` with a `compression()` method on `DatadogMetricsEndpoint`; Series v2 maps to Zstd, v1 and Sketches map to Zlib
- Series v2 (`/api/v2/series`) now uses zstd; v1 and Sketches continue using zlib
- Remove the hardcoded "deflate" Content-Encoding; propagate `content_encoding` through `DatadogMetricsRequest` and `DatadogMetricsRequestBuilder`
- Make `DatadogMetricsEncoder::new` infallible; move `with_payload_limits` to a test-only impl block; remove `CreateError` and `validate_payload_size_limits`
- Fix `max_uncompressed_header_len` to take an endpoint parameter (only Series v1 has a JSON envelope; v2 and Sketches use protobuf with no envelope)
- Fix the `try_compress_buffer` worst-case estimate to use `max_compressed_size`, which dispatches between the deflate stored-block formula and the `ZSTD_compressBound` formula
- Add proptests for both V1 (zlib) and V2 (zstd) encoding paths, with ranges proportional to each endpoint's real API limits

Rationale: Series v2 uses protobuf + zstd compression, while v1 and Sketches use zlib. The previous code hardcoded "deflate" for all endpoints. The new `DatadogMetricsCompression` enum makes the compression scheme a first-class property derived from the endpoint, ensuring Content-Encoding always matches the compressor.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
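The endpoint-to-compression mapping this commit describes can be sketched roughly as follows. This is a hypothetical reconstruction: the names `DatadogMetricsCompression`, `DatadogMetricsEndpoint`, `compression()`, and `content_encoding()` come from the PR description, but the bodies here are illustrative, not Vector's actual code.

```rust
// Hypothetical sketch of the compression mapping described in this PR.
// Variant names and method bodies are illustrative.
enum DatadogMetricsCompression {
    Zlib,
    Zstd,
}

impl DatadogMetricsCompression {
    const fn content_encoding(self) -> &'static str {
        match self {
            Self::Zlib => "deflate",
            Self::Zstd => "zstd",
        }
    }
}

enum DatadogMetricsEndpoint {
    SeriesV1,
    SeriesV2,
    Sketches,
}

impl DatadogMetricsEndpoint {
    const fn compression(self) -> DatadogMetricsCompression {
        match self {
            // Series v2 is the only endpoint switched to zstd in this change.
            Self::SeriesV2 => DatadogMetricsCompression::Zstd,
            Self::SeriesV1 | Self::Sketches => DatadogMetricsCompression::Zlib,
        }
    }
}

fn main() {
    assert_eq!(
        DatadogMetricsEndpoint::SeriesV2.compression().content_encoding(),
        "zstd"
    );
    assert_eq!(
        DatadogMetricsEndpoint::Sketches.compression().content_encoding(),
        "deflate"
    );
    println!("mapping ok");
}
```

Deriving the compression from the endpoint (rather than storing it separately) is what guarantees the Content-Encoding header can never drift out of sync with the compressor.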
Force-pushed c4c80b6 to fa052b6
- Add a changelog fragment for zstd compression on the Series v2 endpoint
- Remove a misplaced doc comment that had landed above `generate_series_metrics`

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
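For reference, the two worst-case size formulas the PR dispatches between can be sketched as below. The zstd half mirrors the `ZSTD_compressBound` macro from zstd.h; the zlib half uses illustrative deflate stored-block constants (a 5-byte header per stored block, a 2-byte zlib header, and a 4-byte Adler-32 trailer), which may differ from Vector's exact formula.

```rust
/// Worst case for zlib/DEFLATE falling back to "stored" (uncompressed)
/// blocks: a 5-byte header per block plus the zlib wrapper. The 16 KB
/// block size used here is illustrative.
fn zlib_max_compressed_size(n: usize) -> usize {
    let blocks = n.div_ceil(16 * 1024).max(1);
    n + blocks * 5 + 2 + 4
}

/// Mirror of the ZSTD_compressBound macro from zstd.h:
/// srcSize + (srcSize >> 8) + a margin that shrinks to zero at 128 KB.
fn zstd_max_compressed_size(n: usize) -> usize {
    let low_limit = 128 << 10; // 128 KB
    let margin = if n < low_limit { (low_limit - n) >> 11 } else { 0 };
    n + (n >> 8) + margin
}

fn main() {
    // Both bounds must never be smaller than the input size.
    for n in [0usize, 16 * 1024, 128 << 10, 500 * 1024] {
        assert!(zlib_max_compressed_size(n) >= n);
        assert!(zstd_max_compressed_size(n) >= n);
    }
    // ZSTD_compressBound's margin at n = 0 is (128 KB) >> 11 = 64 bytes.
    assert_eq!(zstd_max_compressed_size(0), 64);
    println!("bounds hold");
}
```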
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 76fb1c59bd
```rust
impl DatadogMetricsCompression {
    pub(super) const fn content_encoding(self) -> &'static str {
        match self {
            Self::Zstd => "zstd",
```
Send Datadog v2 metrics with `zstd1` encoding token

For Series v2, `content_encoding()` now emits "zstd", but Datadog's generated Metrics v2 client docs for `submit_metrics` list the allowed `content_encoding` values as deflate, zstd1, and gzip (see the MetricsAPI#submit_metrics allowable values at https://datadoghq.dev/datadog-api-client-ruby/DatadogAPIClient/V2/MetricsAPI.html). If the intake strictly validates this enum, all v2 requests from this change will carry an invalid Content-Encoding header and be rejected with 4xx responses, causing metric loss.
Summary
Rationale: Switch Series v2 (`/api/v2/series`) to zstd compression.

- Add a `DatadogMetricsCompression` enum (Zlib/Zstd) in `config.rs` with `compressor()`, `content_encoding()`, and `max_compressed_size()` methods
- Add a `compression()` method on `DatadogMetricsEndpoint`: Series v2 → Zstd, Series v1 and Sketches → Zlib
- Implement `max_compressed_size(n)` for each scheme: Zlib uses the DEFLATE stored-block worst-case formula; Zstd mirrors the `ZSTD_compressBound` C macro
- Propagate `content_encoding` through `DatadogMetricsRequest` and the request builder instead of hardcoding `"deflate"`
- Make `DatadogMetricsEncoder::new()` infallible (production limits from `payload_limits()` are always valid); remove `CreateError` and `validate_payload_size_limits`
- Move `with_payload_limits()` to `#[cfg(test)]`; fix `reset_state()` to create the correct compressor for the endpoint on each reset
- Change `max_uncompressed_header_len` to accept an endpoint argument: only Series v1 has a JSON envelope; all other endpoints return 0

Tests added:
- `max_compressed_size_is_upper_bound`: empirically validates that both the Zlib and Zstd formulas are true upper bounds using incompressible (Xorshift64) data, and that they are not overly conservative (slack ≤ 1% + 64 bytes)
- `encode_series_v2_breaks_out_when_limit_reached_compressed`: verifies that the hot-path compressed-limit check works correctly for the zstd path
- `encoding_check_for_payload_limit_edge_cases_v2`: proptest that any Series v2 payload decompresses cleanly with zstd and stays within configured limits
- Rename `encoding_check_for_payload_limit_edge_cases` to `encoding_check_for_payload_limit_edge_cases_v1` to make the scope explicit

Correctness verification:
- Series v1 and Sketches are unchanged: `compression()` returns Zlib, wired to the `zlib_default()` compressor and the `"deflate"` header, same as before
- The old hot-path check `compressed_written + n + block_overhead(n) > limit` equals the new `compressed_len + max_compressed_size(n) > limit`, since `max_compressed_size(n) = n + block_overhead(n)`
- Removing `validate_payload_size_limits` is safe: `finish()` remains the final safeguard, and production limits are always valid

Vector configuration
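The limit-check equivalence argued in the correctness notes above can be spot-checked with a small sketch. The `block_overhead` below is illustrative (5 bytes per 16 KB stored block; Vector's real constants may differ), since only the identity `max_compressed_size(n) = n + block_overhead(n)` matters for the argument:

```rust
// Illustrative block overhead: 5-byte header per 16 KB stored block.
fn block_overhead(n: usize) -> usize {
    n.div_ceil(16 * 1024).max(1) * 5
}

// The identity the equivalence relies on.
fn max_compressed_size(n: usize) -> usize {
    n + block_overhead(n)
}

fn main() {
    let limit = 3_000_000;
    for written in [0usize, 1_000_000, 2_999_990] {
        for n in [1usize, 100, 16 * 1024, 65_536] {
            // Old hot-path check vs. new check: identical by construction.
            let old = written + n + block_overhead(n) > limit;
            let new = written + max_compressed_size(n) > limit;
            assert_eq!(old, new);
        }
    }
    println!("old and new checks agree");
}
```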
How did you test this PR?

- Unit tests pass (`cargo test --no-default-features --features sinks-datadog_metrics`)
- `max_compressed_size_is_upper_bound` empirically confirms both the Zlib and Zstd bound formulas at stored-block boundaries (16 KB for zlib, 128 KB for zstd) and at various sizes up to 500 KB
- `encoding_check_for_payload_limit_edge_cases_v2` fuzzes compressed and uncompressed limits for the V2/zstd path
- `make check-clippy` passes with no warnings

Change Type
Is this a breaking change?
Does this PR include user facing changes?
References
Notes
- `@vectordotdev/vector` to reach out to us regarding this PR.
- `pre-push` hook, please see this template.
- `make fmt`
- `make check-clippy` (if there are failures it's possible some of them can be fixed with `make clippy-fix`)
- `make test`