[BUG] In-process evaluation consistency: custom operator edge cases and error handling #1904

## Parent Issue

Sub-issue of #1770.

## Summary

The in-process flagd providers have several consistency bugs in custom operator implementations (`sem_ver`, `starts_with`, `ends_with`) and `$evaluators/$ref` resolution. The same flag configuration can produce different evaluation results depending on which SDK evaluates it. These issues fall into three categories:

1. **`sem_ver` parsing and comparison inconsistencies** (#1873) — `v`-prefix handling, partial version acceptance, and SemVer 2.0.0 Rule 10 violations (build metadata not ignored in .NET/Python/Rust).

2. **Inconsistent error return values across all custom operators** (#1874) — Some SDKs return `null` on error (triggering default variant fallback), others return `false` (taking the false branch), and `fractional` has three different no-match behaviors (empty string, exception, null).

3. **`$evaluators/$ref` resolution gaps** (#1875) — No parse-time validation of unresolved `$ref`s, non-deterministic replacement ordering in Go/.NET, and missing Rust support entirely.
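To make the third category concrete, here is a minimal sketch of parse-time `$ref` validation. It assumes the flag configuration has already been parsed into plain dicts and lists, and that a reference has the shape `{"$ref": "<name>"}`. The names `resolve_refs` and `UnresolvedRefError` are hypothetical and do not come from any existing provider. Because it walks the targeting tree instead of looping over the evaluator map, the result should not depend on the order in which evaluators are visited.

```python
from typing import Any


class UnresolvedRefError(ValueError):
    """Hypothetical error raised at parse time for a bad `$ref`."""


def resolve_refs(node: Any, evaluators: dict, _stack: tuple = ()) -> Any:
    """Return a copy of `node` with every {"$ref": name} replaced by its evaluator."""
    if isinstance(node, dict):
        if set(node) == {"$ref"}:
            name = node["$ref"]
            # Fail loudly at parse time instead of leaving the reference in place.
            if not isinstance(name, str) or name not in evaluators:
                raise UnresolvedRefError(f"unresolved $ref: {name!r}")
            # Guard against evaluators that reference themselves, directly or indirectly.
            if name in _stack:
                raise UnresolvedRefError(f"circular $ref: {name!r}")
            return resolve_refs(evaluators[name], evaluators, _stack + (name,))
        return {key: resolve_refs(value, evaluators, _stack) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(item, evaluators, _stack) for item in node]
    return node


# Illustrative data only; the evaluator name and rule are made up.
evaluators = {"is_staff": {"ends_with": [{"var": "email"}, "@example.com"]}}
targeting = {"if": [{"$ref": "is_staff"}, "on", "off"]}
resolved = resolve_refs(targeting, evaluators)
```

With this shape, a configuration that references a missing evaluator is rejected when it is loaded rather than at evaluation time.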

> **Note:** `fractional` bucketing formula differences (#1872) will be addressed by the high-resolution fractional bucketing work in #1903, which replaces the hashing/normalization formula entirely.

## Strategy

### 1. Standardize error returns: always return `null`

All custom operators should return `null` (or the language equivalent: `nil`, `None`, `Null`) for any invalid input, parse failure, or error condition. This is already the convention for some operators and ensures consistent behavior: errors trigger a fallback to the default variant rather than silently taking the `false` branch of a conditional.

This applies to:
- `fractional` — parse errors, no-bucket-match
- `sem_ver` — parse failures, unknown operators
- `starts_with` / `ends_with` — invalid input
- `$evaluators/$ref` — unresolved references should produce a clear diagnostic at parse time
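As a rough illustration of the convention, the sketch below shows `starts_with` and `ends_with` returning `None` (Python's `null`) for every invalid input, so that a caller can tell "evaluated and did not match" (`False`) apart from "could not evaluate" (`None`). The helper `_string_compare` and the exact argument shape (a two-element list of strings) are assumptions made for this example, not a copy of any provider's code.

```python
from typing import Any, Callable, Optional


def _string_compare(args: Any, predicate: Callable[[str, str], bool]) -> Optional[bool]:
    """Apply `predicate` to two string arguments, or return None on invalid input."""
    if not isinstance(args, (list, tuple)) or len(args) != 2:
        return None  # wrong argument count or shape
    value, affix = args
    if not isinstance(value, str) or not isinstance(affix, str):
        return None  # wrong argument types
    return predicate(value, affix)


def starts_with(args: Any) -> Optional[bool]:
    return _string_compare(args, str.startswith)


def ends_with(args: Any) -> Optional[bool]:
    return _string_compare(args, str.endswith)
```

For example, `starts_with(["abc", "x"])` is a valid comparison that yields `False`, while `starts_with(["abc"])` and `starts_with([1, "a"])` are errors and yield `None`.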

### 2. Enhance the Gherkin test suite in flagd-testbed

The [flagd-testbed](https://github.com/open-feature/flagd-testbed) already has some error-case coverage, but it needs to be extended to cover the edge cases identified in the sub-issues. Specifically:

- **`sem_ver`**: Test `v`-prefixed versions, partial versions, build metadata comparisons (`1.0.0+build1 == 1.0.0`), and the return value on parse failure
- **Error returns**: Test that all custom operators return `null` on invalid input (wrong argument count, wrong types, unparseable values)
- **`$evaluators/$ref`**: Test unresolved `$ref` behavior and whitespace variations
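The `sem_ver` cases above could be checked against a reference like the following sketch. It ignores build metadata when comparing (SemVer 2.0.0 Rule 10) and returns `None` for parse failures and unknown operators. Two things here are assumptions for illustration only: the operator subset (`=`, `!=`, `<`, `<=`, `>`, `>=`), and the strict parsing, which treats `v`-prefixed and partial versions as parse failures. The intended handling of those inputs is a matter for #1873. The regular expression is adapted from the one suggested on semver.org.

```python
import operator
import re
from typing import Any, Optional

# Core version, optional pre-release (captured), optional build metadata (not captured).
_SEMVER = re.compile(
    r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+[0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*)?",
    re.ASCII,
)

_OPERATORS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def _precedence_key(version: Any) -> Optional[tuple]:
    """Build a sortable key that follows SemVer precedence, or None if unparseable."""
    if not isinstance(version, str):
        return None
    match = _SEMVER.fullmatch(version)
    if match is None:
        return None
    core = tuple(int(part) for part in match.group(1, 2, 3))
    prerelease = match.group(4)
    if prerelease is None:
        # A release outranks any pre-release of the same core version.
        return (core, 1, ())
    # Numeric identifiers sort numerically and below alphanumeric ones.
    identifiers = tuple(
        (0, int(ident), "") if ident.isdigit() else (1, 0, ident)
        for ident in prerelease.split(".")
    )
    return (core, 0, identifiers)


def sem_ver(args: Any) -> Optional[bool]:
    """Compare [left, operator, right]; return None on any invalid input."""
    if not isinstance(args, (list, tuple)) or len(args) != 3:
        return None
    left, op, right = args
    compare = _OPERATORS.get(op) if isinstance(op, str) else None
    left_key, right_key = _precedence_key(left), _precedence_key(right)
    if compare is None or left_key is None or right_key is None:
        return None
    return compare(left_key, right_key)
```

Under these assumptions, `sem_ver(["1.0.0+build1", "=", "1.0.0"])` is `True` because build metadata never reaches the comparison key, and `sem_ver(["not-a-version", "=", "1.0.0"])` is `None` rather than `False`.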

### 3. Fix each provider against the enhanced test suite

Once the Gherkin suite is updated, each provider can validate its implementation against the new tests and fix any discrepancies.

## Requirements

- [ ] Enhance the flagd-testbed Gherkin suite to cover the edge cases in #1873, #1874, and #1875
- [ ] Fix each provider implementation to pass the enhanced test suite
- [ ] All custom operator errors return `null` (not `false`, not an empty string, not an exception)

## Sub-issues

- #1873 — `sem_ver` parsing and comparison inconsistencies  
- #1874 — Custom operator error return value inconsistencies
- #1875 — `$evaluators/$ref` resolution gaps
