Skip to content

refactor(serde): compile-time schema layer and variant dispatch overhaul#56

Open
16bit-ykiko wants to merge 2 commits intomainfrom
refactor-serde
Open

refactor(serde): compile-time schema layer and variant dispatch overhaul#56
16bit-ykiko wants to merge 2 commits intomainfrom
refactor-serde

Conversation

@16bit-ykiko
Copy link
Copy Markdown
Member

@16bit-ykiko 16bit-ykiko commented Mar 14, 2026

Summary

  • Introduce a compile-time schema layer (kind.h, descriptor.h, match.h) that builds sorted field-entry tables via consteval functions, enabling O(1) key matching at runtime through fold-expression short-circuiting.
  • Refactor std::variant dispatch so the generic serde layer owns all four tagging strategies (untagged / external / internal / adjacent), while backends provide only minimal primitives (extract_object_keys, variant_support trait).
  • Add type-aware matching (kind_compatible) and a two-pass optimization for DOM backends (peek + schema match → direct deserialize, no probe/backtrack).

Changes

New files

  • include/eventide/serde/serde/schema/kind.htype_kind enum, consteval kind_of<T>(), hint_to_kind()
  • include/eventide/serde/serde/schema/descriptor.hfield_descriptor, type_descriptor, consteval describe<T>()
  • include/eventide/serde/serde/schema/match.hname_buf, constexpr rename policies, ct_field_entry, build_field_entries<T,Config>(), fold-based match_key, match_variant_by_keys, build_kind_to_index_map
  • tests/common/schema_tests.cpp — 52 test cases + extensive static_assert coverage

Modified files

  • serde.h — variant deserialization uses schema matching instead of probe-based approach
  • tagged.hvariant_support trait, refactored tagged variant deserialization
  • All backend deserializers — added extract_object_keys (simdjson, yyjson, toml)
  • All backend serializers/deserializers — added variant_support specializations

Test plan

  • All 542 unit tests pass (including 52 new schema tests)
  • simdjson backend: roundtrip, attrs, config, tagged variant tests pass
  • yyjson backend: standard + DOM tests pass
  • toml backend: roundtrip + standard tests pass
  • bincode backend: roundtrip + standard tests pass
  • flatbuffers backend: standard + object + schema tests pass

Made with Cursor

Summary by CodeRabbit

  • New Features

    • Added comprehensive schema system for type introspection and JSON Schema generation.
    • Introduced field-level attributes for serialization customization (skip conditions, type conversions, renaming, aliases).
    • Implemented improved variant serialization/deserialization with support for external, internal, and adjacent tagging strategies.
  • Refactor

    • Restructured variant handling across all serialization backends (bincode, JSON, TOML, FlatBuffers, content) using schema-driven dispatch.
    • Reorganized attribute and behavior systems for cleaner architecture.

Introduce a three-layer architecture for std::variant serialization:

- Schema layer (kind.h, descriptor.h, match.h): consteval type
  classification, field descriptor generation, and fold-expression
  based key matching with compile-time sorted entry tables.
- Generic layer (serde.h, tagged.h): owns variant strategy dispatch
  (untagged / externally / internally / adjacently tagged), calls
  schema for matching and backends for I/O.
- Backend layer: minimal primitives (extract_object_keys,
  variant_support trait) across all backends (simdjson, yyjson,
  toml, bincode, flatbuffers).

Key optimizations:
- All wire names and rename policies computed at compile time via
  consteval functions and name_buf (64-byte fixed buffer).
- Field entry tables sorted by string length; fold expression with
  || short-circuiting replaces runtime hash tables.
- Type-aware matching (kind_compatible) distinguishes alternatives
  with same field names but different value types.
- Two-pass strategy for DOM backends: peek + schema match, then
  direct deserialize without probe/backtrack.

Adds 52 schema unit tests (static_assert + runtime) covering kind
classification, descriptors, rename policies, match_key, variant
matching, and kind-to-index dispatch. All 542 tests pass.

Made-with: Cursor
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 14, 2026

📝 Walkthrough

Walkthrough

This PR introduces a comprehensive schema-driven architecture for Eventide's serde library. It adds compile-time type reflection and schema generation systems, refactors variant serialization/deserialization to use schema-based dispatch across all backends, reorganizes attributes into the schema namespace, and adds trait specializations for variant handling in bincode and flatbuffers.

Changes

Cohort / File(s) Summary
Schema System Foundation
include/eventide/serde/schema/kind.h, include/eventide/serde/schema/descriptor.h, include/eventide/serde/schema/node.h, include/eventide/serde/schema/schema.h
Introduces core type classification (type_kind, scalar_kind, type_flags), type-erased schema view structures (type_schema_view, field_schema_view), and a schema_node runtime DOM for variant matching. Provides compile-time reflection utilities and static schema construction via get_schema().
Schema Attributes & Behavior
include/eventide/serde/schema/attrs.h, include/eventide/serde/schema/behavior.h
Establishes compile-time attribute system for serde fields (skip, flatten, rename, alias, literal) and struct-level attributes (tagged, deny_unknown_fields). Introduces behavior system (enum_string, skip_if, with, as) with skip predicate evaluation and validation. Includes full field schema derivation and alias collision detection.
Schema Matching & Dispatch
include/eventide/serde/schema/match.h, include/eventide/serde/schema/json_schema.h
Implements compile-time field entry matching, kind compatibility checks, variant alternative selection logic, and runtime untagged_dispatch for variant deserialization. Adds JSON Schema generation capability for any type via to_json_schema().
Variant Refactor: JSON Backend
include/eventide/serde/json/deserializer.h, include/eventide/serde/json/serializer.h
Replaces multi-pass variant deserialization with source-consumption + schema-based dispatch via untagged_dispatch. Removes serialize_variant from Serializer and introduces consume_variant_source(), json_type_to_hint(), and to_schema_node() helpers. Updates deserialize_variant signature to return result_t.
Variant Refactor: TOML Backend
include/eventide/serde/toml/deserializer.h, include/eventide/serde/toml/serializer.h
Mirrors JSON changes: replaces dispatch logic with source-consumption, adds consume_variant_source(), node_to_hint(), to_schema_node(). Removes serialize_variant method. Uses schema::untagged_dispatch for deserialization.
Variant Refactor: Content Backend
include/eventide/serde/content/deserializer.h, include/eventide/serde/content/serializer.h
Replaces in-place variant dispatch with value-source-driven approach via consume_variant_source(). Adds value_ref_to_hint() and to_schema_node() helpers. Removes serialize_variant. Updates deserialize_variant to use untagged_dispatch with constructed schema node.
Variant Refactor: FlatBuffers Backend
include/eventide/serde/flatbuffers/serializer.h, include/eventide/serde/flatbuffers/schema.h
Changes serialize_variant from variadic to single-type template. Introduces scalar_name_from_kind() helper and refactors field/enum emission to use kind-based approach instead of type traits. Replaces is_schema_struct_v and can_inline_struct_v with constexpr lambdas and is_fb_struct() validation.
Trait Specializations
include/eventide/serde/bincode/serializer.h, include/eventide/serde/bincode/deserializer.h
Adds serialize_traits and deserialize_traits specializations for bincode::Serializer/Deserializer when T is std::variant. Specializations delegate to serializer.serialize_variant() and deserializer.deserialize_variant().
Tagged Variant Support
include/eventide/serde/serde/utils/tagged.h
Introduces serialize_tagged and deserialize_tagged paths supporting external/internal/adjacent tagging strategies. Adds deserialize_variant_at_index helpers and refactors match_and_deserialize_alt to use compile-time tag-name arrays instead of Names parameter.
Core Serde Logic Updates
include/eventide/serde/serde/serde.h, include/eventide/serde/serde/traits.h
Updates variant dispatch to use schema::resolve_tag_mode() to route to serialize_tagged/deserialize_tagged. Removes variant parameter requirements from serializer_like/deserializer_like concepts. Switches includes from attrs/* to schema/* headers.
Attribute System Migration
include/eventide/serde/serde/attrs.h, include/eventide/serde/serde/attrs/behavior.h, include/eventide/serde/serde/attrs/schema.h, include/eventide/serde/serde/attrs/hint.h
Migrates behavior and schema attribute declarations from attrs/ namespace to schema/ namespace via includes. Removes entire hint attribute system (hint, is_hint_attr_v, get_hint utilities). Consolidates attribute system into schema/ headers.
Dependency & Utility Updates
include/eventide/serde/serde/utils/common.h, include/eventide/serde/serde/utils/apply_behavior.h, include/eventide/serde/serde/utils/field_dispatch.h, include/eventide/serde/serde/utils/reflectable.h, include/eventide/serde/serde/annotation.h
Removes captured_dom_value machinery and associated type traits from common.h. Updates include paths for schema-based attributes. Changes field dispatch tagging condition to use schema::resolve_tag_mode().
Test Coverage
tests/schema/schema_tests.cpp, tests/CMakeLists.txt
Adds 1,200+ lines of comprehensive test suite for schema subsystem covering kind classification, descriptors, JSON Schema generation, rename policies, variant matching, wire names, and type mappings. Extends CMakeLists to include schema test sources.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • PR #25: Refactors variant serialization/deserialization mechanisms across backends, directly complementary to this PR's schema-driven dispatch redesign.
  • PR #42: Adds bincode::Serializer/Deserializer trait specializations for std::variant, overlapping with the bincode variant handling introduced here.
  • PR #26: Modifies the serde attribute and annotation system, related to this PR's attribute system reorganization from attrs/ to schema/ namespace.

Poem

🐰 Hop hop, schemas now take the stage,
No more variants caged in rage,
With kinds and hints in harmony dance,
Your serde gets a grand second chance! ✨
Tagged or not, they match so right,
Building futures, oh what a sight! 📐

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 13.97% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: a significant refactoring of the serde variant dispatch system with a new compile-time schema layer, directly matching the scope and primary objective of the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch refactor-serde
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
include/eventide/serde/serde/traits.h (1)

3-17: ⚠️ Potential issue | 🟠 Major

Add #include <variant> to this header.

eventide::serde::null_like on line 24 uses std::monostate, which is defined exclusively in <variant>. This header lacks a direct include for it, making the header non-self-contained and relying on transitive includes. This fragility can break if include order changes or dependencies are refactored.

Proposed fix
 `#include` <tuple>
 `#include` <type_traits>
 `#include` <utility>
+#include <variant>
 
 `#include` <vector>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/serde/traits.h` around lines 3 - 17, The header is
missing a direct include for <variant>, but it uses std::monostate in
eventide::serde::null_like; add `#include` <variant> to include the definition of
std::monostate so traits.h is self-contained and no longer depends on transitive
includes (locate the include block at the top of traits.h near the other
standard headers and insert the <variant> include).
🧹 Nitpick comments (1)
include/eventide/serde/json/deserializer.h (1)

362-415: JSON numbers always mapped to type_kind::integer - potential inconsistency with TOML backend.

All JSON numbers are mapped to type_kind::integer (line 399), while the TOML backend distinguishes between integer and floating types. Although kind_compatible allows integer↔floating compatibility, this could lead to subtle matching differences between backends.

Consider mapping to type_kind::floating when the number type is a floating-point value for consistency:

♻️ Suggested improvement
                 case simdjson::ondemand::json_type::number:
-                    kind = serde::schema::type_kind::integer;
+                    {
+                        simdjson::ondemand::number_type nt;
+                        if(val.get_number_type().get(nt) == simdjson::SUCCESS) {
+                            if(nt == simdjson::ondemand::number_type::floating_point_number) {
+                                kind = serde::schema::type_kind::floating;
+                            } else {
+                                kind = serde::schema::type_kind::integer;
+                            }
+                        } else {
+                            kind = serde::schema::type_kind::integer;
+                        }
+                    }
                     break;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/json/deserializer.h` around lines 362 - 415, In
extract_object_keys change the mapping for simdjson::ondemand::json_type::number
to detect whether the numeric value is integral or floating and set
serde::schema::type_kind to integer or floating accordingly; use the simdjson
ondemand value (val) to inspect if it has a fractional part (e.g., parse as
double and check if fmod == 0 or use val.is_double()/is_integral helpers if
available) and assign serde::schema::type_kind::floating when non-integer,
keeping existing integer mapping for whole numbers so JSON number handling
matches the TOML backend’s integer vs floating distinction.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@include/eventide/serde/content/deserializer.h`:
- Around line 145-154: extract_object_keys() currently classifies numeric values
by checking entry.value.get_int() and get_double() but omits
entry.value.get_uint(), causing unsigned integers to be misclassified and
breaking schema matching; fix by adding an else-if branch that calls
entry.value.get_uint() and sets kind = serde::schema::type_kind::integer (same
as for get_int()) in the loop that sets 'kind' (the block iterating for(auto
entry: *obj)), so unsigned numbers are treated as integers and align with
deserialize_uint() behavior.

In `@include/eventide/serde/serde/schema/match.h`:
- Around line 197-230: count_required_fields()/single_required_contribution()
currently count unique required fields but the matching code (e.g.,
match_variant_by_keys()) increments required_matched and matched for each
matched entry including aliases, allowing the same logical field to be counted
multiple times; fix by deduplicating aliases when matching: map each incoming
key to its canonical field identifier (use the same canonical name/index used by
resolve_field_required/refl::field_type) and maintain a per-variant set of seen
canonical fields (e.g., seen_field_indices or seen_field_names) so that
required_matched and matched are incremented only once per canonical field;
apply the same dedupe change to the other matching sites noted (the ranges
around lines 259-265, 303-306, and 357-376) so all alias hits count once toward
required and matched totals.
- Around line 34-49: The name_buf can overflow because from() clamps to 63 but
push_back() blindly writes into data[len++] and back() assumes non-empty; change
from() to clamp using sizeof(data)-1, and update push_back() to only append when
len < (sizeof(data)-1) (leave len unchanged when at capacity or optionally cap
it to max), and guard back() to not index when empty (e.g. return '\0' or
assert/handle empty). Apply the same capacity-safe logic to the other related
functions in this struct (lines roughly 54-90) so all mutations use
sizeof(data)-1 and never write past data[].
- Around line 145-150: The current kind_compatible function incorrectly treats
incoming floating-point values as compatible with integer schema fields (making
type_kind::floating matching type_kind::integer), causing ambiguous matches;
change the logic in kind_compatible so only integer incoming values are
considered compatible with floating schema (keep the existing integer->floating
allowance) and remove the symmetric floating->integer case—update the function
handling in kind_compatible (and any related checks using type_kind) so
type_kind::floating is not accepted for schema_kind == type_kind::integer.

In `@include/eventide/serde/serde/serde.h`:
- Around line 52-62: Add a missing specialization of variant_support for
toml::Serializer so the static_asserts in the serialize path (the checks using
variant_support<S>::externally_tagged / internally_tagged / adjacently_tagged)
do not fail for TOML; specifically declare variant_support<toml::Serializer>
with externally_tagged = true, internally_tagged = true, and adjacently_tagged =
true (matching toml::Deserializer’s full support) and place it alongside the
other serializer specializations so the
serialize_externally_tagged/serialize_internally_tagged/serialize_adjacently_tagged
calls for toml::Serializer compile.

---

Outside diff comments:
In `@include/eventide/serde/serde/traits.h`:
- Around line 3-17: The header is missing a direct include for <variant>, but it
uses std::monostate in eventide::serde::null_like; add `#include` <variant> to
include the definition of std::monostate so traits.h is self-contained and no
longer depends on transitive includes (locate the include block at the top of
traits.h near the other standard headers and insert the <variant> include).

---

Nitpick comments:
In `@include/eventide/serde/json/deserializer.h`:
- Around line 362-415: In extract_object_keys change the mapping for
simdjson::ondemand::json_type::number to detect whether the numeric value is
integral or floating and set serde::schema::type_kind to integer or floating
accordingly; use the simdjson ondemand value (val) to inspect if it has a
fractional part (e.g., parse as double and check if fmod == 0 or use
val.is_double()/is_integral helpers if available) and assign
serde::schema::type_kind::floating when non-integer, keeping existing integer
mapping for whole numbers so JSON number handling matches the TOML backend’s
integer vs floating distinction.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dbccfa7b-2ade-4374-b62f-cb598f31a1e5

📥 Commits

Reviewing files that changed from the base of the PR and between 35a77ea and a86847a.

📒 Files selected for processing (19)
  • include/eventide/serde/bincode/deserializer.h
  • include/eventide/serde/bincode/serializer.h
  • include/eventide/serde/content/deserializer.h
  • include/eventide/serde/content/serializer.h
  • include/eventide/serde/flatbuffers/deserializer.h
  • include/eventide/serde/flatbuffers/serializer.h
  • include/eventide/serde/json/deserializer.h
  • include/eventide/serde/json/serializer.h
  • include/eventide/serde/serde/schema/descriptor.h
  • include/eventide/serde/serde/schema/kind.h
  • include/eventide/serde/serde/schema/match.h
  • include/eventide/serde/serde/schema/schema.h
  • include/eventide/serde/serde/serde.h
  • include/eventide/serde/serde/traits.h
  • include/eventide/serde/serde/utils/common.h
  • include/eventide/serde/serde/utils/tagged.h
  • include/eventide/serde/toml/deserializer.h
  • include/eventide/serde/toml/serializer.h
  • tests/common/schema_tests.cpp
💤 Files with no reviewable changes (1)
  • include/eventide/serde/toml/serializer.h

Comment on lines +145 to +154
for(auto entry: *obj) {
auto kind = serde::schema::type_kind::any;
if(auto s = entry.value.get_string()) {
kind = serde::schema::type_kind::string;
} else if(auto b = entry.value.get_bool()) {
kind = serde::schema::type_kind::boolean;
} else if(auto i = entry.value.get_int()) {
kind = serde::schema::type_kind::integer;
} else if(auto d = entry.value.get_double()) {
kind = serde::schema::type_kind::floating;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Classify unsigned numbers as integer too.

extract_object_keys() skips content::ValueRef::get_uint(), even though deserialize_uint() relies on it on Lines 201-223. Large positive values can therefore be misclassified during schema matching and send untagged variant selection down the wrong alternative.

🔧 Proposed fix
             if(auto s = entry.value.get_string()) {
                 kind = serde::schema::type_kind::string;
             } else if(auto b = entry.value.get_bool()) {
                 kind = serde::schema::type_kind::boolean;
             } else if(auto i = entry.value.get_int()) {
                 kind = serde::schema::type_kind::integer;
+            } else if(auto u = entry.value.get_uint()) {
+                kind = serde::schema::type_kind::integer;
             } else if(auto d = entry.value.get_double()) {
                 kind = serde::schema::type_kind::floating;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/content/deserializer.h` around lines 145 - 154,
extract_object_keys() currently classifies numeric values by checking
entry.value.get_int() and get_double() but omits entry.value.get_uint(), causing
unsigned integers to be misclassified and breaking schema matching; fix by
adding an else-if branch that calls entry.value.get_uint() and sets kind =
serde::schema::type_kind::integer (same as for get_int()) in the loop that sets
'kind' (the block iterating for(auto entry: *obj)), so unsigned numbers are
treated as integers and align with deserialize_uint() behavior.

Comment on lines +34 to +49
struct name_buf {
char data[64]{};
std::uint8_t len = 0;

static constexpr name_buf from(std::string_view s) {
name_buf b;
auto n = s.size() < 63 ? s.size() : 63;
for(std::size_t i = 0; i < n; i++) b.data[i] = s[i];
b.len = static_cast<std::uint8_t>(n);
return b;
}

constexpr std::string_view view() const { return {data, len}; }
constexpr void push_back(char c) { data[len++] = c; }
constexpr char back() const { return data[len - 1]; }
constexpr bool empty() const { return len == 0; }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Guard overlong wire names in name_buf.

name_buf::from() truncates at 63 bytes, while push_back() never checks capacity. Long canonical names, aliases, or rename expansions can therefore either collide silently or overflow the buffer during consteval matching.

Also applies to: 54-90

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/serde/schema/match.h` around lines 34 - 49, The
name_buf can overflow because from() clamps to 63 but push_back() blindly writes
into data[len++] and back() assumes non-empty; change from() to clamp using
sizeof(data)-1, and update push_back() to only append when len <
(sizeof(data)-1) (leave len unchanged when at capacity or optionally cap it to
max), and guard back() to not index when empty (e.g. return '\0' or
assert/handle empty). Apply the same capacity-safe logic to the other related
functions in this struct (lines roughly 54-90) so all mutations use
sizeof(data)-1 and never write past data[].

Comment on lines +145 to +150
constexpr bool kind_compatible(type_kind incoming, type_kind schema_kind) {
if(incoming == type_kind::any || schema_kind == type_kind::any) return true;
if(incoming == schema_kind) return true;
if(incoming == type_kind::integer && schema_kind == type_kind::floating) return true;
if(incoming == type_kind::floating && schema_kind == type_kind::integer) return true;
return false;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't treat floating input as compatible with integer fields.

The symmetric numeric check here makes {x: 1.5} match both int x and double x, so match_variant_by_keys() falls back to declaration order instead of the actual schema.

🔧 Proposed fix
 constexpr bool kind_compatible(type_kind incoming, type_kind schema_kind) {
     if(incoming == type_kind::any || schema_kind == type_kind::any) return true;
     if(incoming == schema_kind) return true;
     if(incoming == type_kind::integer && schema_kind == type_kind::floating) return true;
-    if(incoming == type_kind::floating && schema_kind == type_kind::integer) return true;
     return false;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/serde/schema/match.h` around lines 145 - 150, The
current kind_compatible function incorrectly treats incoming floating-point
values as compatible with integer schema fields (making type_kind::floating
matching type_kind::integer), causing ambiguous matches; change the logic in
kind_compatible so only integer incoming values are considered compatible with
floating schema (keep the existing integer->floating allowance) and remove the
symmetric floating->integer case—update the function handling in kind_compatible
(and any related checks using type_kind) so type_kind::floating is not accepted
for schema_kind == type_kind::integer.

Comment on lines +197 to +230
// ─── Count required fields (unique fields only, no aliases) ─────────────────

template <typename T>
consteval std::size_t count_required_fields();

template <typename T, std::size_t I>
consteval std::size_t single_required_contribution() {
if constexpr(field_has_skip<T, I>()) {
return 0;
} else if constexpr(field_has_flatten<T, I>()) {
using field_t = refl::field_type<T, I>;
using inner_t = unwrapped_t<field_t>;
return count_required_fields<inner_t>();
} else {
return resolve_field_required<T, I>() ? 1 : 0;
}
}

template <typename T>
consteval std::size_t count_required_fields() {
if constexpr(!refl::reflectable_class<std::remove_cvref_t<T>>) {
return 0;
} else {
using U = std::remove_cvref_t<T>;
constexpr auto N = refl::field_count<U>();
if constexpr(N == 0) {
return 0;
} else {
return []<std::size_t... Is>(std::index_sequence<Is...>) consteval {
return (single_required_contribution<U, Is>() + ...);
}(std::make_index_sequence<N>{});
}
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Deduplicate aliases before counting required matches.

count_required_fields() counts unique fields, but match_variant_by_keys() increments required_matched and matched per matched entry. If a required field shows up under both its canonical name and an alias, those hits can satisfy multiple required-field slots and inflate the winner score for the wrong alternative.

Also applies to: 259-265, 303-306, 357-376

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/serde/schema/match.h` around lines 197 - 230,
count_required_fields()/single_required_contribution() currently count unique
required fields but the matching code (e.g., match_variant_by_keys()) increments
required_matched and matched for each matched entry including aliases, allowing
the same logical field to be counted multiple times; fix by deduplicating
aliases when matching: map each incoming key to its canonical field identifier
(use the same canonical name/index used by
resolve_field_required/refl::field_type) and maintain a per-variant set of seen
canonical fields (e.g., seen_field_indices or seen_field_names) so that
required_matched and matched are incremented only once per canonical field;
apply the same dedupe change to the other matching sites noted (the ranges
around lines 259-265, 303-306, and 357-376) so all alias hits count once toward
required and matched totals.

…d variants

- Move attrs/schema.h and attrs/behavior.h to serde/schema/ layer
- Add tag_mode enum and variant tagging info (tag_field, content_field,
  alt_names) to type_schema_view for compile-time introspection
- Consolidate 6 tagged S/D functions into 2 schema-driven ones
  (serialize_tagged / deserialize_tagged), eliminating TagAttr param
- Replace variant_support<Backend> trait with requires-based capability
  detection across all 9 backend files
- Remove attrs/hint.h (backend-specific custom attributes are anti-schema)
- Remove peek_type_hint() from json/toml/content deserializers
- Simplify serde.h and field_dispatch.h variant dispatch to single-line
  schema::resolve_tag_mode<V>() check
- Fix match.h fold expression warning

Made-with: Cursor
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

♻️ Duplicate comments (2)
include/eventide/serde/schema/match.h (2)

29-34: ⚠️ Potential issue | 🟠 Major

Don't accept floating input for integer fields.

The floating -> integer arm still makes an incoming floating field schema-match both integer and floating members, so object-variant selection can collapse to declaration order instead of the actual schema.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/schema/match.h` around lines 29 - 34, The match
function kind_compatible incorrectly allows floating incoming values to match
integer schema (the branch if(incoming == type_kind::floating && schema_kind ==
type_kind::integer)); remove that branch so only integer->floating is allowed
(keep the integer->floating arm that permits integer input for floating schema)
— update the function kind_compatible to not treat incoming type_kind::floating
as compatible with schema type_kind::integer and ensure references to type_kind
enum and the function name are used to locate the change.

245-255: ⚠️ Potential issue | 🟠 Major

Score aliases once per canonical field.

count_required_fields() is per logical field, but this loop increments matched and required_matched for every matching entry. If the payload contains both a canonical name and one of its aliases, the same field is counted twice and can incorrectly satisfy the required-field threshold or outrank the real match.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/schema/match.h` around lines 245 - 255, The loop over
node.fields in match.h currently increments matched and required_matched for
every matching entry (using detail::match_key<U, Config>(f.key) -> r), which
double-counts aliases; fix by de-duplicating per canonical field before
incrementing: maintain a local set of canonical identifiers (e.g.,
r->canonical_id or r->name or the pointer value of r) and only increment matched
and required_matched when that canonical id has not been seen yet; ensure the
logic that checks r->required uses this de-duped condition so it aligns with
count_required_fields() and prevents aliases from satisfying required-field
thresholds or affecting ranking.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@include/eventide/serde/flatbuffers/schema.h`:
- Around line 80-95: The function is_fb_struct(const schema::type_schema_view*
sv) fails to reject optional members; update the field loop in is_fb_struct to
check each field's requiredness (e.g., inspect field.required or the appropriate
flag on the field) and return false if a field is not required before proceeding
to nested type checks (so optional<T> will prevent classifying the type as a
FlatBuffers struct). Ensure the guard is placed before using field.nested and
the existing nested-kind checks so optional members are correctly classified as
tables.

In `@include/eventide/serde/schema/attrs.h`:
- Around line 358-380: The verify_field_aliases check only compares each alias
against other fields' canonical names (names[j]) so two fields can declare the
same alias and slip through; update the logic in the verify_field_aliases lambda
(used by validate_field_schema) to also fetch the alias attribute for each other
non-excluded field (i.e. refl::field_type<T, j>::attrs and
serde::detail::tuple_find_t<attrs_t, is_alias_attr> for index j) and compare
alias_name against that field's alias list as well as its canonical name,
returning false if any match is found; preserve the existing excluded and Is
handling and short-circuit behavior.

In `@include/eventide/serde/schema/descriptor.h`:
- Around line 198-214: name_buf::from currently truncates inputs to 63 bytes and
name_buf::push_back doesn't bounds-check the 64-byte buffer, allowing silent
truncation or overflow of transformed wire names; fix by computing the final
transformed name length before construction (or failing early) and adding a
compile-time/static check that the transformed name fits within the 64-byte
capacity, and update push_back to validate len < sizeof(data) before writing (or
assert/fail) so wire_name_static cannot produce silently truncated or
overflowing names.

In `@include/eventide/serde/schema/json_schema.h`:
- Around line 146-173: emit_structure currently always appends
,"additionalProperties":false making the JSON Schema stricter than runtime;
change emit_structure to consult the type schema's unknown-field policy (e.g.
inspect type_schema_view for the schema::deny_unknown_fields flag or equivalent
boolean) and only append ,"additionalProperties":false when that flag is set,
otherwise omit the additionalProperties entry so the generated schema matches
the runtime generic deserialize behavior; update emit_structure to reference sv
and the deny-unknown-fields symbol (schema::deny_unknown_fields or the
corresponding member on type_schema_view).
- Around line 176-183: emit_variant currently always emits an untagged oneOf;
update it to inspect type_schema_view->tagging, tag_field, content_field, and
alt_names and emit schemas that match serialize_tagged() for all tagging modes:
for untagged, keep oneOf of alternatives; for external, emit oneOf of
single-property objects where each property name is alt_names[i] and value is
emit(alternatives[i]); for internal, emit oneOf of objects that include a
literal tag_field property with value alt_names[i] plus the alternative schema
either merged in or placed under content_field if content_field is set; for
adjacent, emit oneOf of objects with a tag_field literal and a content_field
whose value is emit(alternatives[i]); use sv->alternatives, sv->alt_names,
sv->tag_field, sv->content_field and sv->tagging to build the appropriate JSON
Schema shapes so the emitted schema matches serialize_tagged() behavior.
- Around line 106-109: The branch handling type_kind::optional and
type_kind::pointer currently returns the element schema directly (in emit /
to_json_schema), which causes top-level optional/pointer types to not allow null
in the produced JSON Schema; change emit (the code path for type_kind::optional
and type_kind::pointer, and any callers such as to_json_schema) to wrap the
element schema as a nullable schema — e.g. produce a schema that accepts either
the element schema or null (using anyOf with the element schema and
{"type":"null"} or by adding "type":["null", ...] if element schema has a simple
type) — rather than returning sv->element unchanged so top-level
optional/pointer are represented as nullable.

In `@include/eventide/serde/schema/kind.h`:
- Around line 138-149: The implementation of scalar_kind_of<T>() wrongly
hard-codes long/unsigned long to specific scalar_kinds; remove the
std::same_as<U, long> and std::same_as<U, unsigned long> branches and instead
determine signed/unsigned integer kinds portably by using the type's signedness
and width (e.g., std::is_signed_v<U> combined with sizeof(U) or mapping to
std::int*_t/std::uint*_t equivalents) to return scalar_kind::int32/int64 or
scalar_kind::uint32/uint64 as appropriate; update the logic inside
scalar_kind_of to classify any integer type by its actual width and sign rather
than by the ABI-dependent 'long' alias.
- Around line 183-186: The is_ordered flag logic in the if constexpr block that
checks for typename U::key_type incorrectly excludes only std::unordered_map and
std::unordered_set, so unordered_multimap and unordered_multiset are being
marked ordered; update the condition in the same if constexpr (the branch that
currently calls f = f | type_flags::is_ordered) to also exclude
is_specialization_of<std::unordered_multimap, U> and
is_specialization_of<std::unordered_multiset, U> so those unordered container
specializations do not receive type_flags::is_ordered.

In `@include/eventide/serde/schema/match.h`:
- Around line 359-364: When hint_bits indicates an object and
match_variant<Config, Ts...>(node) returns an index, keep the current fast-path
attempt but do not treat a deserialize_variant_at<D, Ts...>(d2, *idx, v) failure
as a hard return: call make_deser() and attempt deserialize_variant_at, and if
it succeeds return the value, but if it fails, swallow / discard that error and
fall through into the existing generic candidate loop so remaining object
candidates are retried; ensure the code still uses the matched index as the fast
path and only continues to the generic loop on deserialization failure.

In `@include/eventide/serde/serde/attrs.h`:
- Around line 8-9: The umbrella header removed the re-export of hint; restore
source compatibility by re-adding the legacy forwarder: either include the
original header "eventide/serde/serde/attrs/hint.h" at the top of
include/eventide/serde/serde/attrs.h, or explicitly re-export the symbols (using
declarations) for eventide::serde::hint, eventide::serde::is_hint_attr_v, and
eventide::serde::get_hint_t in the attrs.h header so callers that include the
umbrella header still see those names.

In `@include/eventide/serde/serde/utils/field_dispatch.h`:
- Around line 89-92: The serialize branch currently unwraps annotated nested
Structs by forwarding value instead of the annotated field.value() only when
schema::resolve_tag_mode<field_t>() != schema::tag_mode::none; modify the logic
so that whenever the field type resolves to a tag mode other than none you
forward field.value(), and mirror that same conditional check in the
corresponding deserialize branch (and the similar block around the other
occurrence at the 197-201 region) so annotated nested structs (e.g.,
annotation<Struct, schema::rename_all<...>> or schema::deny_unknown_fields) are
preserved and the type-level handling in serde.h runs as intended (update the
calls to s_struct.serialize_field(effective_name, ...) and
s_struct.deserialize_field(...) accordingly).

In `@include/eventide/serde/serde/utils/tagged.h`:
- Around line 219-251: The internally-tagged branch currently reuses the raw
source (d.consume_variant_source()) and constructs content_reader(source), which
replays the entire wrapper including the synthetic tag_field and causes
alternatives to fail on unknown fields; modify the logic in the
schema::tag_mode::internal path (the block that reads tag_value and calls
match_and_deserialize_alt) to create a filtered/stripping reader that wraps the
consumed source and omits tag_field when iterating object fields, then pass that
filtered reader to serde::deserialize (instead of content_reader(source)) and
call finish() on the filtered reader so the tag field is not visible to the
alternative deserialization.

---

Duplicate comments:
In `@include/eventide/serde/schema/match.h`:
- Around line 29-34: The match function kind_compatible incorrectly allows
floating incoming values to match integer schema (the branch if(incoming ==
type_kind::floating && schema_kind == type_kind::integer)); remove that branch
so only integer->floating is allowed (keep the integer->floating arm that
permits integer input for floating schema) — update the function kind_compatible
to not treat incoming type_kind::floating as compatible with schema
type_kind::integer and ensure references to type_kind enum and the function name
are used to locate the change.
- Around line 245-255: The loop over node.fields in match.h currently increments
matched and required_matched for every matching entry (using
detail::match_key<U, Config>(f.key) -> r), which double-counts aliases; fix by
de-duplicating per canonical field before incrementing: maintain a local set of
canonical identifiers (e.g., r->canonical_id or r->name or the pointer value of
r) and only increment matched and required_matched when that canonical id has
not been seen yet; ensure the logic that checks r->required uses this de-duped
condition so it aligns with count_required_fields() and prevents aliases from
satisfying required-field thresholds or affecting ranking.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 77d57e64-3114-4981-bc82-9e42c8f5c38e

📥 Commits

Reviewing files that changed from the base of the PR and between a86847a and 1fa6e19.

📒 Files selected for processing (29)
  • include/eventide/serde/bincode/deserializer.h
  • include/eventide/serde/bincode/serializer.h
  • include/eventide/serde/content/deserializer.h
  • include/eventide/serde/content/serializer.h
  • include/eventide/serde/flatbuffers/schema.h
  • include/eventide/serde/flatbuffers/serializer.h
  • include/eventide/serde/json/deserializer.h
  • include/eventide/serde/json/serializer.h
  • include/eventide/serde/schema/attrs.h
  • include/eventide/serde/schema/behavior.h
  • include/eventide/serde/schema/descriptor.h
  • include/eventide/serde/schema/json_schema.h
  • include/eventide/serde/schema/kind.h
  • include/eventide/serde/schema/match.h
  • include/eventide/serde/schema/node.h
  • include/eventide/serde/schema/schema.h
  • include/eventide/serde/serde/annotation.h
  • include/eventide/serde/serde/attrs.h
  • include/eventide/serde/serde/attrs/behavior.h
  • include/eventide/serde/serde/attrs/hint.h
  • include/eventide/serde/serde/attrs/schema.h
  • include/eventide/serde/serde/serde.h
  • include/eventide/serde/serde/utils/apply_behavior.h
  • include/eventide/serde/serde/utils/field_dispatch.h
  • include/eventide/serde/serde/utils/reflectable.h
  • include/eventide/serde/serde/utils/tagged.h
  • include/eventide/serde/toml/deserializer.h
  • tests/CMakeLists.txt
  • tests/schema/schema_tests.cpp
💤 Files with no reviewable changes (3)
  • include/eventide/serde/json/serializer.h
  • include/eventide/serde/content/serializer.h
  • include/eventide/serde/serde/attrs/hint.h
✅ Files skipped from review due to trivial changes (2)
  • include/eventide/serde/serde/annotation.h
  • include/eventide/serde/serde/utils/reflectable.h
🚧 Files skipped from review as they are similar to previous changes (1)
  • include/eventide/serde/bincode/deserializer.h

Comment on lines +80 to +95
constexpr bool is_fb_struct(const schema::type_schema_view* sv) {
if(!sv || sv->kind != schema::type_kind::structure) return false;
if(!schema::has_flag(sv->flags, schema::type_flags::is_trivial)) return false;
if(!schema::has_flag(sv->flags, schema::type_flags::is_standard_layout)) return false;
for(const auto& field: sv->fields) {
if(!field.nested) return false;
auto fk = field.nested->kind;
if(fk == schema::type_kind::integer || fk == schema::type_kind::floating ||
fk == schema::type_kind::boolean || fk == schema::type_kind::character ||
fk == schema::type_kind::enumeration) {
continue;
}
if(fk == schema::type_kind::structure && is_fb_struct(field.nested)) continue;
return false;
}
}();
return true;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's find and examine the field_schema_view structure definition
rg "struct field_schema_view|class field_schema_view" --type cpp --type h -A 10

Repository: clice-io/eventide

Length of output: 808


🏁 Script executed:

# Search for the exact file and examine the is_fb_struct function in context
cat -n include/eventide/serde/flatbuffers/schema.h | sed -n '70,100p'

Repository: clice-io/eventide

Length of output: 1560


🏁 Script executed:

# Look for how field.required is used elsewhere in the codebase
rg "field\.required" --type cpp --type h -B 2 -A 2

Repository: clice-io/eventide

Length of output: 515


🏁 Script executed:

# Check the descriptor.h file referenced in the review for context on optional field handling
fd "descriptor.h" --type f

Repository: clice-io/eventide

Length of output: 138


🏁 Script executed:

# Find where is_fb_struct is used
rg "is_fb_struct" --type cpp --type h -B 3 -A 3

Repository: clice-io/eventide

Length of output: 2320


🏁 Script executed:

# Look for tests related to this function
fd "test|spec" --type f -i | xargs rg "is_fb_struct" 2>/dev/null || echo "No tests found with is_fb_struct"

Repository: clice-io/eventide

Length of output: 93


🏁 Script executed:

# Check the descriptor.h file to understand how optional fields are resolved
cat -n include/eventide/serde/schema/descriptor.h | head -50

Repository: clice-io/eventide

Length of output: 1623


Reject optional fields before classifying a type as an FB struct.

is_fb_struct() determines whether to emit a type as a FlatBuffers struct (fixed layout, cannot represent absence) or table (can be null). The current implementation accepts fields based on their nested type kind but completely ignores field.required, so optional members incorrectly pass validation. This causes optional<T> members to be emitted as non-nullable struct fields, silently losing the optional semantics in the generated schema.

Add a guard check for required fields:

Suggested fix
     for(const auto& field: sv->fields) {
+        if(!field.required) return false;
         if(!field.nested) return false;
         auto fk = field.nested->kind;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
constexpr bool is_fb_struct(const schema::type_schema_view* sv) {
if(!sv || sv->kind != schema::type_kind::structure) return false;
if(!schema::has_flag(sv->flags, schema::type_flags::is_trivial)) return false;
if(!schema::has_flag(sv->flags, schema::type_flags::is_standard_layout)) return false;
for(const auto& field: sv->fields) {
if(!field.nested) return false;
auto fk = field.nested->kind;
if(fk == schema::type_kind::integer || fk == schema::type_kind::floating ||
fk == schema::type_kind::boolean || fk == schema::type_kind::character ||
fk == schema::type_kind::enumeration) {
continue;
}
if(fk == schema::type_kind::structure && is_fb_struct(field.nested)) continue;
return false;
}
}();
return true;
constexpr bool is_fb_struct(const schema::type_schema_view* sv) {
if(!sv || sv->kind != schema::type_kind::structure) return false;
if(!schema::has_flag(sv->flags, schema::type_flags::is_trivial)) return false;
if(!schema::has_flag(sv->flags, schema::type_flags::is_standard_layout)) return false;
for(const auto& field: sv->fields) {
if(!field.required) return false;
if(!field.nested) return false;
auto fk = field.nested->kind;
if(fk == schema::type_kind::integer || fk == schema::type_kind::floating ||
fk == schema::type_kind::boolean || fk == schema::type_kind::character ||
fk == schema::type_kind::enumeration) {
continue;
}
if(fk == schema::type_kind::structure && is_fb_struct(field.nested)) continue;
return false;
}
return true;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/flatbuffers/schema.h` around lines 80 - 95, The
function is_fb_struct(const schema::type_schema_view* sv) fails to reject
optional members; update the field loop in is_fb_struct to check each field's
requiredness (e.g., inspect field.required or the appropriate flag on the field)
and return false if a field is not required before proceeding to nested type
checks (so optional<T> will prevent classifying the type as a FlatBuffers
struct). Ensure the guard is placed before using field.nested and the existing
nested-kind checks so optional members are correctly classified as tables.

Comment on lines +358 to +380
// Check: no alias collides with another field's canonical name
auto verify_field_aliases = [&]<std::size_t I>() consteval {
if(excluded[I])
return true;
if constexpr(!detail::field_has_alias<T, I>()) {
return true;
} else {
using field_t = refl::field_type<T, I>;
using attrs_t = typename field_t::attrs;
using alias_attr = serde::detail::tuple_find_t<attrs_t, is_alias_attr>;
for(auto alias_name: alias_attr::names) {
for(std::size_t j = 0; j < sizeof...(Is); ++j) {
if(j == I || excluded[j])
continue;
if(alias_name == names[j]) {
return false;
}
}
}
return true;
}
};
if(!(verify_field_aliases.template operator()<Is>() && ...)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reject alias-to-alias collisions during schema validation.

validate_field_schema() only compares each alias against other fields’ canonical names. Two different fields can still declare the same alias and both pass validation, which makes wire-key lookup ambiguous once aliases are materialized into the field table. Compare aliases against aliases from every other non-excluded field as well.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/schema/attrs.h` around lines 358 - 380, The
verify_field_aliases check only compares each alias against other fields'
canonical names (names[j]) so two fields can declare the same alias and slip
through; update the logic in the verify_field_aliases lambda (used by
validate_field_schema) to also fetch the alias attribute for each other
non-excluded field (i.e. refl::field_type<T, j>::attrs and
serde::detail::tuple_find_t<attrs_t, is_alias_attr> for index j) and compare
alias_name against that field's alias list as well as its canonical name,
returning false if any match is found; preserve the existing excluded and Is
handling and short-circuit behavior.

Comment on lines +198 to +214
struct name_buf {
char data[64]{};
std::uint8_t len = 0;

static constexpr name_buf from(std::string_view s) {
name_buf b;
auto n = s.size() < 63 ? s.size() : 63;
for(std::size_t i = 0; i < n; i++) b.data[i] = s[i];
b.len = static_cast<std::uint8_t>(n);
return b;
}

constexpr std::string_view view() const { return {data, len}; }
constexpr void push_back(char c) { data[len++] = c; }
constexpr char back() const { return data[len - 1]; }
constexpr bool empty() const { return len == 0; }
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't silently cap generated wire names at 63 bytes.

name_buf::from() truncates input at 63 characters, and push_back() never checks the fixed 64-byte buffer. Under a non-identity field_rename, long field names are either shortened or overflow this constexpr buffer before wire_name_static publishes them, which changes the wire format. Either compute the exact size first or fail with a clear static_assert when the transformed name exceeds the buffer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/schema/descriptor.h` around lines 198 - 214,
name_buf::from currently truncates inputs to 63 bytes and name_buf::push_back
doesn't bounds-check the 64-byte buffer, allowing silent truncation or overflow
of transformed wire names; fix by computing the final transformed name length
before construction (or failing early) and adding a compile-time/static check
that the transformed name fits within the 64-byte capacity, and update push_back
to validate len < sizeof(data) before writing (or assert/fail) so
wire_name_static cannot produce silently truncated or overflowing names.

Comment on lines +106 to +109
case type_kind::optional:
case type_kind::pointer:
if(sv->element) return emit(sv->element);
return "{}";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Model top-level nullable types as nullable in the schema.

This branch collapses optional<T> and pointer types to the element schema, so to_json_schema<std::optional<T>>() rejects null even though the serializer/deserializer accept it.

💡 Suggested fix
         case type_kind::optional:
         case type_kind::pointer:
-            if(sv->element) return emit(sv->element);
-            return "{}";
+            if(sv->element) {
+                return "{\"anyOf\":[{\"type\":\"null\"}," + emit(sv->element) + "]}";
+            }
+            return "{\"type\":\"null\"}";
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
case type_kind::optional:
case type_kind::pointer:
if(sv->element) return emit(sv->element);
return "{}";
case type_kind::optional:
case type_kind::pointer:
if(sv->element) {
return "{\"anyOf\":[{\"type\":\"null\"}," + emit(sv->element) + "]}";
}
return "{\"type\":\"null\"}";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/schema/json_schema.h` around lines 106 - 109, The
branch handling type_kind::optional and type_kind::pointer currently returns the
element schema directly (in emit / to_json_schema), which causes top-level
optional/pointer types to not allow null in the produced JSON Schema; change
emit (the code path for type_kind::optional and type_kind::pointer, and any
callers such as to_json_schema) to wrap the element schema as a nullable schema
— e.g. produce a schema that accepts either the element schema or null (using
anyOf with the element schema and {"type":"null"} or by adding "type":["null",
...] if element schema has a simple type) — rather than returning sv->element
unchanged so top-level optional/pointer are represented as nullable.

Comment on lines +146 to +173
std::string emit_structure(const type_schema_view* sv) {
std::string result = "{\"type\":\"object\",\"properties\":{";
bool first = true;
for(const auto& field: sv->fields) {
if(!first) result += ',';
first = false;
result += "\"" + json_escape(field.wire_name) + "\":";
if(field.nested) {
result += emit(field.nested);
} else {
result += emit_kind_only(field.kind);
}
}
result += "}";

std::string required;
for(const auto& field: sv->fields) {
if(field.required) {
if(!required.empty()) required += ',';
required += "\"" + json_escape(field.wire_name) + "\"";
}
}
if(!required.empty()) {
result += ",\"required\":[" + required + "]";
}

result += ",\"additionalProperties\":false}";
return result;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't forbid unknown properties unless the type does.

emit_structure() always writes "additionalProperties":false, but the generic deserialize path still allows unknown fields unless schema::deny_unknown_fields is present. The generated schema is therefore stricter than the runtime contract for ordinary structs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/schema/json_schema.h` around lines 146 - 173,
emit_structure currently always appends ,"additionalProperties":false making the
JSON Schema stricter than runtime; change emit_structure to consult the type
schema's unknown-field policy (e.g. inspect type_schema_view for the
schema::deny_unknown_fields flag or equivalent boolean) and only append
,"additionalProperties":false when that flag is set, otherwise omit the
additionalProperties entry so the generated schema matches the runtime generic
deserialize behavior; update emit_structure to reference sv and the
deny-unknown-fields symbol (schema::deny_unknown_fields or the corresponding
member on type_schema_view).

Comment on lines +183 to +186
if constexpr(requires { typename U::key_type; }) {
if constexpr(!is_specialization_of<std::unordered_map, U> &&
!is_specialization_of<std::unordered_set, U>)
f = f | type_flags::is_ordered;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -path "*/include/eventide/serde/schema/kind.h" -type f

Repository: clice-io/eventide

Length of output: 99


🏁 Script executed:

cat -n ./include/eventide/serde/schema/kind.h | sed -n '170,200p'

Repository: clice-io/eventide

Length of output: 1377


🏁 Script executed:

rg "is_ordered|type_flags_of" include/eventide/serde/schema/kind.h -B 2 -A 2

Repository: clice-io/eventide

Length of output: 661


🏁 Script executed:

rg "is_ordered" --type cpp -B 3 -A 3

Repository: clice-io/eventide

Length of output: 1124


🏁 Script executed:

cat -n ./include/eventide/serde/schema/kind.h | sed -n '1,100p' | grep -E "is_specialization_of|include"

Repository: clice-io/eventide

Length of output: 659


🏁 Script executed:

rg "is_specialization_of" --type cpp -A 3 | head -30

Repository: clice-io/eventide

Length of output: 2595


🏁 Script executed:

find . -path "*/eventide/common/meta.h" -type f

Repository: clice-io/eventide

Length of output: 93


🏁 Script executed:

cat -n ./include/eventide/common/meta.h | head -80

Repository: clice-io/eventide

Length of output: 2659


🏁 Script executed:

python3 << 'EOF'
# Verify that std::unordered_multimap and std::unordered_multiset exist in C++ standard library
# and understand their properties

containers = {
    'std::unordered_map': 'unordered associative container',
    'std::unordered_set': 'unordered associative container',
    'std::unordered_multimap': 'unordered associative container with duplicate keys',
    'std::unordered_multiset': 'unordered associative container with duplicate values'
}

print("Standard Library Unordered Containers:")
for container, description in containers.items():
    print(f"  {container}: {description}")
    print(f"    - Has key_type: Yes")
    print(f"    - Is ordered: No")
    print()
EOF

Repository: clice-io/eventide

Length of output: 551


Exclude unordered_multimap and unordered_multiset from the is_ordered flag.

The filter only excludes std::unordered_map and std::unordered_set, allowing std::unordered_multimap and std::unordered_multiset to be incorrectly tagged with is_ordered despite being unordered containers.

Fix
-            if constexpr(!is_specialization_of<std::unordered_map, U> &&
-                         !is_specialization_of<std::unordered_set, U>)
+            if constexpr(!is_specialization_of<std::unordered_map, U> &&
+                         !is_specialization_of<std::unordered_multimap, U> &&
+                         !is_specialization_of<std::unordered_set, U> &&
+                         !is_specialization_of<std::unordered_multiset, U>)
                 f = f | type_flags::is_ordered;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if constexpr(requires { typename U::key_type; }) {
if constexpr(!is_specialization_of<std::unordered_map, U> &&
!is_specialization_of<std::unordered_set, U>)
f = f | type_flags::is_ordered;
if constexpr(requires { typename U::key_type; }) {
if constexpr(!is_specialization_of<std::unordered_map, U> &&
!is_specialization_of<std::unordered_multimap, U> &&
!is_specialization_of<std::unordered_set, U> &&
!is_specialization_of<std::unordered_multiset, U>)
f = f | type_flags::is_ordered;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/schema/kind.h` around lines 183 - 186, The is_ordered
flag logic in the if constexpr block that checks for typename U::key_type
incorrectly excludes only std::unordered_map and std::unordered_set, so
unordered_multimap and unordered_multiset are being marked ordered; update the
condition in the same if constexpr (the branch that currently calls f = f |
type_flags::is_ordered) to also exclude
is_specialization_of<std::unordered_multimap, U> and
is_specialization_of<std::unordered_multiset, U> so those unordered container
specializations do not receive type_flags::is_ordered.

Comment on lines +359 to +364
if(hint_bits & static_cast<std::uint8_t>(serde::type_hint::object)) {
auto idx = match_variant<Config, Ts...>(node);
if(idx.has_value()) {
D d2 = make_deser();
return deserialize_variant_at<D, Ts...>(d2, *idx, v);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Retry the remaining object candidates when the schema-picked one rejects the payload.

match_variant() is still heuristic: char/string coercions, numeric compatibility, and alias scoring can leave more than one plausible object alternative. Returning the first deserialize_variant_at() error here turns those cases into hard failures instead of falling through to the other object candidates. Keep this as the fast path, but continue into the generic candidate loop when the selected alternative does not deserialize.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/schema/match.h` around lines 359 - 364, When hint_bits
indicates an object and match_variant<Config, Ts...>(node) returns an index,
keep the current fast-path attempt but do not treat a deserialize_variant_at<D,
Ts...>(d2, *idx, v) failure as a hard return: call make_deser() and attempt
deserialize_variant_at, and if it succeeds return the value, but if it fails,
swallow / discard that error and fall through into the existing generic
candidate loop so remaining object candidates are retried; ensure the code still
uses the matched index as the fast path and only continues to the generic loop
on deserialization failure.

Comment on lines +8 to +9
#include "eventide/serde/schema/attrs.h"
#include "eventide/serde/schema/behavior.h"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Keep the umbrella header exporting hint.

include/eventide/serde/serde/attrs.h used to re-export eventide::serde::hint, eventide::serde::is_hint_attr_v, and eventide::serde::get_hint_t via include/eventide/serde/serde/attrs/hint.h. Removing that include here is a source-compatibility break for callers that only pull in the umbrella header. If hint is still supported, keep the legacy forwarder here; if it is intentionally gone, this needs an explicit migration path instead of a silent compile break.

Compatibility patch
 `#include` "spelling.h"
 `#include` "eventide/common/fixed_string.h"
 `#include` "eventide/serde/serde/annotation.h"
 `#include` "eventide/serde/schema/attrs.h"
 `#include` "eventide/serde/schema/behavior.h"
+#include "eventide/serde/serde/attrs/hint.h"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/serde/attrs.h` around lines 8 - 9, The umbrella header
removed the re-export of hint; restore source compatibility by re-adding the
legacy forwarder: either include the original header
"eventide/serde/serde/attrs/hint.h" at the top of
include/eventide/serde/serde/attrs.h, or explicitly re-export the symbols (using
declarations) for eventide::serde::hint, eventide::serde::is_hint_attr_v, and
eventide::serde::get_hint_t in the attrs.h header so callers that include the
umbrella header still see those names.

Comment on lines +89 to 92
if constexpr(schema::resolve_tag_mode<field_t>() != schema::tag_mode::none) {
return s_struct.serialize_field(effective_name, field.value());
} else {
return s_struct.serialize_field(effective_name, value);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Preserve annotated nested structs here, not just tagged variants.

These branches only forward field.value() for tagged variants. Any nested annotation<Struct, schema::rename_all<...>> or annotation<Struct, schema::deny_unknown_fields> is still unwrapped to value, so the new type-level handling in include/eventide/serde/serde/serde.h never runs for fields of that type.

💡 Suggested direction
-                if constexpr(schema::resolve_tag_mode<field_t>() != schema::tag_mode::none) {
+                constexpr bool preserve_annotation =
+                    schema::resolve_tag_mode<field_t>() != schema::tag_mode::none ||
+                    (refl::reflectable_class<value_t> &&
+                     (detail::tuple_has_spec_v<attrs_t, schema::rename_all> ||
+                      detail::tuple_has_v<attrs_t, schema::deny_unknown_fields>));
+                if constexpr(preserve_annotation) {
                     return s_struct.serialize_field(effective_name, field.value());
                 } else {
                     return s_struct.serialize_field(effective_name, value);
                 }

Apply the same condition in the deserialize branch.

Also applies to: 197-201

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/serde/utils/field_dispatch.h` around lines 89 - 92,
The serialize branch currently unwraps annotated nested Structs by forwarding
value instead of the annotated field.value() only when
schema::resolve_tag_mode<field_t>() != schema::tag_mode::none; modify the logic
so that whenever the field type resolves to a tag mode other than none you
forward field.value(), and mirror that same conditional check in the
corresponding deserialize branch (and the similar block around the other
occurrence at the 197-201 region) so annotated nested structs (e.g.,
annotation<Struct, schema::rename_all<...>> or schema::deny_unknown_fields) are
preserved and the type-level handling in serde.h runs as intended (update the
calls to s_struct.serialize_field(effective_name, ...) and
s_struct.deserialize_field(...) accordingly).

Comment on lines +219 to +251
} else if constexpr(mode == schema::tag_mode::internal) {
static_assert(requires { d.consume_variant_source(); },
"internally tagged deserialization requires consume_variant_source()");
ET_EXPECTED_TRY_V(auto source, d.consume_variant_source());

std::string tag_value;
{
D tag_reader(source);
ET_EXPECTED_TRY_V(auto d_struct, tag_reader.deserialize_struct("", 0));
bool found = false;
while(true) {
ET_EXPECTED_TRY_V(auto key, d_struct.next_key());
if(!key.has_value()) break;
if(*key == tag_field) {
ET_EXPECTED_TRY(d_struct.deserialize_value(tag_value));
found = true;
break;
}
} else {
ET_EXPECTED_TRY(d_struct.skip_value());
}
if(!found) {
return std::unexpected(E::type_mismatch);
}
}

if(!has_tag || !has_content) {
return std::unexpected(E::type_mismatch);
}

if(buffered_content.has_value()) {
ET_EXPECTED_TRY(deserialize_content_for_tag([&](auto& alt) -> std::expected<void, E> {
content::Deserializer<typename D::config_type> buffered_deserializer(
*buffered_content);
ET_EXPECTED_TRY(serde::deserialize(buffered_deserializer, alt));
ET_EXPECTED_TRY(buffered_deserializer.finish());
return match_and_deserialize_alt<E>(
tag_value, names, value, [&](auto& alt) -> std::expected<void, E> {
using alt_t = std::remove_cvref_t<decltype(alt)>;
static_assert(refl::reflectable_class<alt_t>,
"internally_tagged requires struct alternatives");
D content_reader(source);
ET_EXPECTED_TRY(serde::deserialize(content_reader, alt));
ET_EXPECTED_TRY(content_reader.finish());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Strip the tag field before deserializing the internal payload.

The fresh content_reader(source) replays the whole wrapper object into alt, so the synthetic tag field is still present. Alternatives that reject unknown fields will now fail on valid internally tagged payloads because tag_field is not part of the alternative schema. Deserialize through a filtered view/reader that skips tag_field before handing the object to serde::deserialize.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/serde/serde/utils/tagged.h` around lines 219 - 251, The
internally-tagged branch currently reuses the raw source
(d.consume_variant_source()) and constructs content_reader(source), which
replays the entire wrapper including the synthetic tag_field and causes
alternatives to fail on unknown fields; modify the logic in the
schema::tag_mode::internal path (the block that reads tag_value and calls
match_and_deserialize_alt) to create a filtered/stripping reader that wraps the
consumed source and omits tag_field when iterating object fields, then pass that
filtered reader to serde::deserialize (instead of content_reader(source)) and
call finish() on the filtered reader so the tag field is not visible to the
alternative deserialization.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant