33 commits
076998e
feat: ExtensionRegistry, wip
wackywendell Aug 12, 2025
57e5dee
wip compiles
wackywendell Aug 13, 2025
68f0e7d
working through some implications
wackywendell Aug 14, 2025
dfef7ee
Switched types over, compiles
wackywendell Aug 15, 2025
facb09b
Stripped down to just types
wackywendell Aug 15, 2025
5a5ac23
Enable test only with correct feature flag
wackywendell Aug 18, 2025
7869676
Code cleanup
wackywendell Aug 18, 2025
bc2a86b
Merged with existing parse
wackywendell Aug 18, 2025
1b49a6c
Mostly working but builtin types are a bit messed up
wackywendell Aug 18, 2025
6dc4302
now looks better
wackywendell Aug 18, 2025
9c593a1
On the way, trying to get validation working
wackywendell Aug 19, 2025
7a8584e
Extensions should be included when the extensions feature is enabled
wackywendell Aug 20, 2025
ac447d8
Delete some dead code
wackywendell Aug 20, 2025
9b47219
Start merging argument and types
wackywendell Aug 20, 2025
8a44db4
Some updates to type handling; compiles and passes tests
wackywendell Sep 3, 2025
47fbf18
Merge remote-tracking branch 'upstream/main' into parse-extensions
wackywendell Sep 15, 2025
5b1e5df
Update to match URN change
wackywendell Sep 16, 2025
6444f5c
Some renames
wackywendell Sep 16, 2025
162e8ab
Update tests
wackywendell Sep 17, 2025
421d957
Removed string / enum type parameter, that's not a thing
wackywendell Sep 18, 2025
46b6468
Time should be primitive, precisiontime parameterized
wackywendell Sep 19, 2025
83a4249
Enforce ranges on fixed-range types
wackywendell Sep 19, 2025
f415dc8
Enforce float / int bounds
wackywendell Sep 19, 2025
0cbfc34
Validate missing types
wackywendell Sep 19, 2025
93b41ef
Add error for type variations
wackywendell Sep 19, 2025
470aa00
Fix casing on type display, add tests
wackywendell Sep 19, 2025
202747a
Stable field ordering for structures
wackywendell Sep 19, 2025
ac479d9
fix(parse): apply Substrait identifier rules for CustomType names
wackywendell Oct 10, 2025
c0a60b4
docs(parse): add SPDX header, clarify errors, and guard duplicate ext…
wackywendell Oct 10, 2025
ca1e133
fix: clippy warning about nested ifs
wackywendell Oct 10, 2025
95270ab
doc: update extensions types and links
wackywendell Oct 10, 2025
25d6392
Merge remote-tracking branch 'upstream/main' into parse-extensions
wackywendell Oct 10, 2025
3e58a42
fix: update usage of typify-generated types to match typify upgrade
wackywendell Oct 10, 2025
4 changes: 3 additions & 1 deletion Cargo.toml
@@ -27,7 +27,7 @@ include = [
[features]
default = []
extensions = ["dep:serde_yaml"]
parse = ["dep:hex", "dep:thiserror", "semver"]
parse = ["dep:hex", "dep:thiserror", "dep:serde_yaml", "semver"]
protoc = ["dep:protobuf-src"]
semver = ["dep:semver"]
serde = ["dep:pbjson", "dep:pbjson-build", "dep:pbjson-types"]
@@ -38,6 +38,8 @@ pbjson = { version = "0.8.0", optional = true }
pbjson-types = { version = "0.8.0", optional = true }
prost = "0.14.1"
prost-types = "0.14.1"
# Required by generated text schemas: the typify-generated code emits
# ::regress::Regex for `pattern` validations.
Member

Just to clarify, it looks like regress was already present before. Is this comment just clarifying why it was there in the first place?

Contributor Author

Yes, just clarifying its use!

regress = "0.10.4"
semver = { version = "1.0.27", optional = true }
serde = { version = "1.0.228", features = ["derive"] }
1 change: 1 addition & 0 deletions src/extensions.rs
@@ -6,6 +6,7 @@
//! included in the packaged crate, ignored by git, and automatically kept
//! in-sync.

#[cfg(feature = "extensions")]
include!(concat!(env!("OUT_DIR"), "/extensions.in"));

#[cfg(test)]
48 changes: 21 additions & 27 deletions src/parse/context.rs
@@ -4,9 +4,9 @@

use thiserror::Error;

use crate::parse::{
Anchor, Parse, proto::extensions::SimpleExtensionUrn, text::simple_extensions::SimpleExtensions,
};
use crate::parse::proto::extensions::SimpleExtensionUrn;
use crate::parse::text::simple_extensions::ExtensionFile;
use crate::parse::{Anchor, Parse};

/// A parse context.
///
@@ -22,22 +22,24 @@ pub trait Context {
{
item.parse(self)
}
}

pub trait ProtoContext: Context {
Member

As mentioned, I don't have a ton of Rust experience, so it may be that this is totally standard.

But I wonder if there's a simpler way to handle parsing that isn't so trait-heavy. AFAICT, there is only one implementor of the ProtoContext trait, which is the test fixture (lines 81-107).

Could we drop the ProtoContext trait and use a concrete type instead, then update tests to use explicit instantiations of that type? If not, what is the benefit of using this trait here?
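
To make the question above concrete, here is a minimal sketch of the two shapes being compared: parsing code that is generic over a context trait, and parsing code that takes one concrete context type. All names other than `ProtoContext` (`Registry`, `add`, `register`, `parse_generic`, `parse_concrete`) are hypothetical stand-ins, and the signatures are simplified; this is not the crate's actual API.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a parsed extension file; not the crate's type.
#[derive(Debug, PartialEq)]
struct ExtensionFile {
    urn: String,
}

// Concrete context: a plain struct with inherent methods.
#[derive(Default)]
struct Registry {
    files: HashMap<u32, ExtensionFile>,
}

impl Registry {
    // Registers an extension under `anchor`; duplicate anchors are rejected.
    fn add(&mut self, anchor: u32, urn: &str) -> Result<(), String> {
        if self.files.contains_key(&anchor) {
            return Err(format!("duplicate anchor {anchor}"));
        }
        self.files.insert(anchor, ExtensionFile { urn: urn.to_owned() });
        Ok(())
    }
}

// Trait-based shape: parsing code only knows about the trait.
// (Simplified: the real trait in this PR has different signatures.)
trait ProtoContext {
    fn register(&mut self, anchor: u32, urn: &str) -> Result<(), String>;
}

impl ProtoContext for Registry {
    fn register(&mut self, anchor: u32, urn: &str) -> Result<(), String> {
        self.add(anchor, urn)
    }
}

// Generic over any context implementation (for example a test fixture).
fn parse_generic<C: ProtoContext>(ctx: &mut C, anchor: u32, urn: &str) -> Result<u32, String> {
    ctx.register(anchor, urn)?;
    Ok(anchor)
}

// Tied to the one concrete context; no trait bound needed.
fn parse_concrete(ctx: &mut Registry, anchor: u32, urn: &str) -> Result<u32, String> {
    ctx.add(anchor, urn)?;
    Ok(anchor)
}
```

With a single implementor the two functions behave the same; the trait mainly pays off if a second context (such as a resolver that fetches extension files) is expected later.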

/// Add a [SimpleExtensionUrn] to this context. Must return an error for duplicate
/// anchors or when the urn is not supported.
/// anchors or when the URI is not supported.
Member

Can we just consistently use URN here? We have support for both in all of the other libraries and we will be dropping URI eventually.

///
/// This function must eagerly resolve and parse the simple extension, returning an
/// error if either fails.
fn add_simple_extension_urn(
&mut self,
simple_extension_urn: &SimpleExtensionUrn,
) -> Result<&SimpleExtensions, ContextError>;
) -> Result<&ExtensionFile, ContextError>;

/// Returns the simple extensions for the given simple extension anchor.
fn simple_extensions(
&self,
anchor: &Anchor<SimpleExtensionUrn>,
) -> Result<&SimpleExtensions, ContextError>;
) -> Result<&ExtensionFile, ContextError>;
}

/// Parse context errors.
@@ -57,57 +59,49 @@ pub enum ContextError {
}

#[cfg(test)]
pub(crate) mod tests {
pub(crate) mod fixtures {
use std::collections::{HashMap, hash_map::Entry};

use crate::parse::{
Anchor, context::ContextError, proto::extensions::SimpleExtensionUrn,
text::simple_extensions::SimpleExtensions,
text::simple_extensions::ExtensionFile,
};

/// A test context.
///
/// This currently mocks support for simple extensions (does not resolve or
/// parse).
#[derive(Default)]
pub struct Context {
empty_simple_extensions: SimpleExtensions,
simple_extensions: HashMap<Anchor<SimpleExtensionUrn>, SimpleExtensionUrn>,
simple_extensions: HashMap<Anchor<SimpleExtensionUrn>, ExtensionFile>,
}

impl Default for Context {
fn default() -> Self {
Self {
empty_simple_extensions: SimpleExtensions {},
simple_extensions: Default::default(),
}
}
}
impl super::Context for Context {}

impl super::Context for Context {
impl super::ProtoContext for Context {
fn add_simple_extension_urn(
&mut self,
simple_extension_urn: &crate::parse::proto::extensions::SimpleExtensionUrn,
) -> Result<&SimpleExtensions, ContextError> {
) -> Result<&ExtensionFile, ContextError> {
match self.simple_extensions.entry(simple_extension_urn.anchor()) {
Entry::Occupied(_) => Err(ContextError::DuplicateSimpleExtension(
simple_extension_urn.anchor(),
)),
Entry::Vacant(entry) => {
// TODO: fetch
entry.insert(simple_extension_urn.clone());
// For now just return an empty extension
Ok(&self.empty_simple_extensions)
let f = ExtensionFile::empty(simple_extension_urn.urn().clone());
let ext_ref = entry.insert(f);

Ok(ext_ref)
}
}
}

fn simple_extensions(
&self,
anchor: &Anchor<SimpleExtensionUrn>,
) -> Result<&SimpleExtensions, ContextError> {
) -> Result<&ExtensionFile, ContextError> {
self.simple_extensions
.contains_key(anchor)
.then_some(&self.empty_simple_extensions)
.get(anchor)
.ok_or(ContextError::UndefinedSimpleExtension(*anchor))
}
}
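
The fixture above relies on two standard-library patterns: `HashMap::entry` to reject a duplicate anchor while handing back a reference to the newly stored value, and `get(..).ok_or(..)` to turn a missing key into an error. A self-contained sketch of just that logic follows. `Store`, `File`, and `LookupError` are hypothetical stand-ins for the fixture's `Context`, `ExtensionFile`, and `ContextError`, and a plain `u32` replaces `Anchor<SimpleExtensionUrn>`.

```rust
use std::collections::{hash_map::Entry, HashMap};

// Hypothetical stand-in for `ExtensionFile`.
#[derive(Debug, PartialEq)]
struct File {
    urn: String,
}

// Hypothetical stand-in for `ContextError`.
#[derive(Debug, PartialEq)]
enum LookupError {
    Duplicate(u32),
    Undefined(u32),
}

#[derive(Default)]
struct Store {
    files: HashMap<u32, File>,
}

impl Store {
    // Inserts a new file, or fails if the anchor is already taken.
    fn add(&mut self, anchor: u32, urn: &str) -> Result<&File, LookupError> {
        match self.files.entry(anchor) {
            Entry::Occupied(_) => Err(LookupError::Duplicate(anchor)),
            Entry::Vacant(slot) => {
                // `VacantEntry::insert` returns `&mut V` borrowed from the map,
                // so the stored value can be returned without a second lookup.
                let stored = slot.insert(File { urn: urn.to_owned() });
                Ok(&*stored)
            }
        }
    }

    // Looks up a previously added file by anchor.
    fn get(&self, anchor: u32) -> Result<&File, LookupError> {
        self.files.get(&anchor).ok_or(LookupError::Undefined(anchor))
    }
}
```

Storing the value itself in the map is what lets the lookup return the actual entry rather than a shared placeholder.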
6 changes: 3 additions & 3 deletions src/parse/proto/extensions/simple_extension_urn.rs
@@ -7,7 +7,7 @@ use std::str::FromStr;
use thiserror::Error;

use crate::{
parse::{Anchor, Context, Parse, context::ContextError},
parse::{Anchor, Parse, context::ContextError, context::ProtoContext},
proto,
urn::{InvalidUrn, Urn},
};
@@ -50,7 +50,7 @@ pub enum SimpleExtensionUrnError {
Context(#[from] ContextError),
}

impl<C: Context> Parse<C> for proto::extensions::SimpleExtensionUrn {
impl<C: ProtoContext> Parse<C> for proto::extensions::SimpleExtensionUrn {
type Parsed = SimpleExtensionUrn;
type Error = SimpleExtensionUrnError;

@@ -90,7 +90,7 @@ impl From<SimpleExtensionUrn> for proto::extensions::SimpleExtensionUrn {
#[cfg(test)]
mod tests {
use super::*;
use crate::parse::{Context as _, context::tests::Context};
use crate::parse::{Context as _, context::fixtures::Context};

#[test]
fn parse() -> Result<(), SimpleExtensionUrnError> {
6 changes: 3 additions & 3 deletions src/parse/proto/plan_version.rs
@@ -3,7 +3,7 @@
//! Parsing of [proto::PlanVersion].

use crate::{
parse::{Parse, context::Context, proto::Version},
parse::{Parse, context::ProtoContext, proto::Version},
proto,
};
use thiserror::Error;
@@ -38,7 +38,7 @@ pub enum PlanVersionError {
Version(#[from] VersionError),
}

impl<C: Context> Parse<C> for proto::PlanVersion {
impl<C: ProtoContext> Parse<C> for proto::PlanVersion {
type Parsed = PlanVersion;
type Error = PlanVersionError;

@@ -71,7 +71,7 @@ impl From<PlanVersion> for proto::PlanVersion {
mod tests {
use super::*;
use crate::{
parse::{context::tests::Context, proto::VersionError},
parse::{context::fixtures::Context, proto::VersionError},
version,
};

6 changes: 3 additions & 3 deletions src/parse/proto/version.rs
@@ -3,7 +3,7 @@
//! Parsing of [proto::Version].

use crate::{
parse::{Parse, context::Context},
parse::{Parse, context::ProtoContext},
proto, version,
};
use hex::FromHex;
@@ -75,7 +75,7 @@ pub enum VersionError {
Substrait(semver::Version, semver::VersionReq),
}

impl<C: Context> Parse<C> for proto::Version {
impl<C: ProtoContext> Parse<C> for proto::Version {
type Parsed = Version;
type Error = VersionError;

@@ -142,7 +142,7 @@ impl From<Version> for proto::Version {
#[cfg(test)]
mod tests {
use super::*;
use crate::parse::context::tests::Context;
use crate::parse::context::fixtures::Context;

#[test]
fn version() -> Result<(), VersionError> {
8 changes: 7 additions & 1 deletion src/parse/text/mod.rs
@@ -1,5 +1,11 @@
// SPDX-License-Identifier: Apache-2.0

//! Parsing of [text](crate::text) types.
//! Utilities for working with Substrait *text* objects.
//!
//! The generated [`crate::text`] module exposes the raw YAML-derived structs
//! (e.g. [`crate::text::simple_extensions::SimpleExtensions`]). This module
//! provides parsing helpers that validate those raw values and offer
//! higher-level wrappers for validation, lookups, and combining into protobuf
//! objects.

pub mod simple_extensions;
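
The module docs above describe a split between raw YAML-derived structs and validated wrappers. A minimal sketch of that pattern, under stated assumptions: `RawType`, `TypeError`, and the `parse` method are hypothetical, `CustomType` here is a stand-in rather than the crate's definition, and the name check is a simplified illustrative rule, not necessarily the Substrait identifier rule the PR applies.

```rust
// Hypothetical raw value, shaped like something deserialized straight from YAML.
#[derive(Debug, Clone)]
struct RawType {
    name: String,
    description: Option<String>,
}

#[derive(Debug, PartialEq)]
enum TypeError {
    EmptyName,
    InvalidName(String),
}

// Validated wrapper: intended to be built only through `parse`,
// so code holding a `CustomType` can rely on the name check having passed.
#[derive(Debug, PartialEq)]
struct CustomType {
    name: String,
    description: Option<String>,
}

impl CustomType {
    // Validates the raw value and converts it into the wrapper.
    // Assumed rule: ASCII letter or '_' first, then ASCII letters, digits, or '_'.
    fn parse(raw: RawType) -> Result<Self, TypeError> {
        let mut chars = raw.name.chars();
        let first = chars.next().ok_or(TypeError::EmptyName)?;
        let valid = (first.is_ascii_alphabetic() || first == '_')
            && chars.all(|c| c.is_ascii_alphanumeric() || c == '_');
        if !valid {
            return Err(TypeError::InvalidName(raw.name));
        }
        Ok(Self {
            name: raw.name,
            description: raw.description,
        })
    }

    fn name(&self) -> &str {
        &self.name
    }
}
```

Keeping validation in a single constructor means later lookups do not need to re-check the raw fields.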