Skip to content

Conversation

@leoromanovsky
Copy link
Contributor

@leoromanovsky leoromanovsky commented Oct 23, 2025

This PR

Providers are currently unable to respect timeouts during initialization, which can cause applications to hang indefinitely when providers fail to connect to external services.

This PR adds optional context-aware initialization methods that enable timeout support while maintaining 100% backward compatibility.

Key additions:

  • ContextAwareStateHandler interface extending StateHandler with InitWithContext(ctx, evalCtx) error
  • New API methods: SetProviderWithContext, SetProviderWithContextAndWait, SetNamedProviderWithContext, SetNamedProviderWithContextAndWait
  • Smart detection automatically uses context-aware initialization when available, falls back to regular Init() for existing providers

👉 This PR is structured to be additive to the v1 branch and be usable immediately.

Related Issues

Notes

  • Zero breaking changes - all existing providers continue to work unchanged
  • Thread-safe implementation using existing mutex patterns
  • Comprehensive unit test coverage included
  • Updated mocks to support new interface methods

Follow-up Tasks

  • Documentation updates for provider implementation guide
  • Example provider implementations showing timeout usage

How to test

// Test timeout functionality
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()

err := openfeature.SetProviderWithContextAndWait(ctx, myProvider)
if err == context.DeadlineExceeded {
    log.Println("Provider initialization timed out")
}

Testing passing locally:

make test
go test --short -tags testtools -cover -timeout 1m ./...
ok  	github.com/open-feature/go-sdk/e2e	0.280s	coverage: [no statements]
ok  	github.com/open-feature/go-sdk/openfeature	(cached)	coverage: 55.7% of statements
ok  	github.com/open-feature/go-sdk/openfeature/hooks	(cached)	coverage: 100.0% of statements
?   	github.com/open-feature/go-sdk/openfeature/internal	[no test files]
ok  	github.com/open-feature/go-sdk/openfeature/memprovider	(cached)	coverage: 85.2% of statements
ok  	github.com/open-feature/go-sdk/openfeature/multi	(cached)	coverage: 81.6% of statements
ok  	github.com/open-feature/go-sdk/openfeature/telemetry	(cached)	coverage: 100.0% of statements
ok  	github.com/open-feature/go-sdk/openfeature/testing	(cached)	coverage: 91.9% of statements
	github.com/open-feature/go-sdk/pkg/openfeature		coverage: 0.0% of statements
	github.com/open-feature/go-sdk/pkg/openfeature/memprovider		coverage: 0.0% of statements

@leoromanovsky leoromanovsky changed the title Adds ability to pass a context.Context to provider with new public me… feat: Adds ability to pass a context.Context to provider with new public methods. Oct 23, 2025
@leoromanovsky leoromanovsky force-pushed the lr/provider-context branch 2 times, most recently from 51d0f23 to d138050 Compare October 23, 2025 18:57
@erka
Copy link
Member

erka commented Oct 23, 2025

Hey @leoromanovsky , thank you for starting this.

I believe the stakeholders agreed that it’s time for v2 and to place context handling in the right places. Breaking changes are expected and accepted by the team - trying to avoid them would only make things more complex in the long run, and the Go SDK would end up with awkward method names or need breaking changes later anyway.

If you could introduce context into the existing API, that would serve as a solid starting point for v2.

@leoromanovsky
Copy link
Contributor Author

Hey @leoromanovsky , thank you for starting this.

I believe the stakeholders agreed that it’s time for v2 and to place context handling in the right places. Breaking changes are expected and accepted by the team - trying to avoid them would only make things more complex in the long run, and the Go SDK would end up with awkward method names or need breaking changes later anyway.

If you could introduce context into the existing API, that would serve as a solid starting point for v2.

@erka

Thanks for the context about targeting v2.

I completely understand wanting to work towards larger API shifts there: especially given the set of planned breaking changes like generics and renamed lifecycle methods.

That said, my team at Datadog is preparing to launch on the Go SDK soon, and we’d like to build on a stable release (v1.x) rather than a pre-v2 snapshot. The context-aware lifecycle is a foundational need for us: it lets us manage initialization and shutdown deadlines properly in production.

To balance stability with forward progress, here’s what I’d propose:

  • Land this functionality in v1.x as an opt-in bridge (e.g. via a ProviderWithInitContext interface and/or WithInitContext(ctx) option).
  • Non-breaking, SemVer-safe, and fully compatible with v2’s upcoming shape. Mark it as provisional in docs (“will be standardized in v2”).
  • That gives early adopters a path now, while keeping the mainline clean.

In exchange, I’d love to get a clear picture of what v2 looks like specifically: which of these are considered “in scope” for the breaking release? Is the openfeature specification fully developed? My team is available to help here.

@erka
Copy link
Member

erka commented Oct 24, 2025

@leoromanovsky

We appreciate your company’s interest in OpenFeature and and look forward to keep making it better together. Saying that, I’ll also quote a note from another project

If you want to live dangerously, open a pull request and hope for the best.

My understanding is that the stakeholders agreed not to make major changes to v1 and are focusing on v2 instead. I’d suggest getting their support for this work and aligning on the timing. You could also be an early adopter of v2 and help shape its design with your company’s expertise. Personally, I think your work might not get the attention it deserves in v1, but it could have a really meaningful impact in v2.

The scope of v2 is being tracked in the related issue.
The OpenFeature specification is available at https://openfeature.dev/specification/ .

@toddbaert
Copy link
Member

toddbaert commented Oct 24, 2025

@leoromanovsky You're 100% correct about this API shortcoming; it's definitely something we want to add, and we specifically noted it for v2. The proposed v2 changes are breaking, they are quite small and really just about providing better APIs to our consumers (the change you want is a good example). By the way, we already have a v2 of some SDKs available, .NET for instance, so this is not new territory for us.

@erka is also correct - we've been hoping to do a v2 of the Go SDK since spring, but a lot of maintainers are focused on other parts of the project at the moment, so it's been delayed a bit (I was hoping we could release it this summer).

I'm not sure what your timetable is at Datadog, but the v2 backlog is not huge, and this change is perhaps the most significant part. It might actually be a better experience for your users overall to just start on a v2 rather than a v1, with a presumed migration to v2 later this year. Myself and the rest of the technical committee will fully support you guys in helping move the v2 along if that's the route you'd like to go.

If you'd rather just stick to non-breaking enhancements to 1.x for now, I also fully support that, though I have to admit it's not our preference.

Please give it some thought, in the meantime I'll review this PR.

cc @kinyoklion @lukas-reining @thomaspoignant @sahidvelji

@codecov
Copy link

codecov bot commented Oct 24, 2025

Codecov Report

❌ Patch coverage is 84.90566% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.23%. Comparing base (c2034e5) to head (597bafa).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
openfeature/openfeature_api.go 82.97% 8 Missing and 8 partials ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #442   +/-   ##
=======================================
  Coverage   83.23%   83.23%           
=======================================
  Files          27       27           
  Lines        1962     2034   +72     
=======================================
+ Hits         1633     1693   +60     
- Misses        294      299    +5     
- Partials       35       42    +7     
Flag Coverage Δ
e2e 83.23% <84.90%> (+<0.01%) ⬆️
unit 83.23% <84.90%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sahidvelji
Copy link
Contributor

If you'd rather just stick to non-breaking enhancements to 1.x for now, I also fully support that, though I have to admit it's not our preference.

I agree. However, the reason I didn't immediately add context aware state handling (init and shutdown) when I pointed it out is because I personally had no immediate need for it and could wait for v2. But if @leoromanovsky has an immediate use case, I honestly don't see why not.

In v2, I would want have context-aware state handling by default instead of as a separate interface.

@toddbaert
Copy link
Member

I've left nits and refactor requests, but fundamentally I am open to this PR.

@sahidvelji
Copy link
Contributor

If you have Go 1.25 installed, you can run go doc -http to preview the documentation. You may find this helpful: https://tip.golang.org/doc/comment

@lukas-reining
Copy link
Member

If you'd rather just stick to non-breaking enhancements to 1.x for now, I also fully support that, though I have to admit it's not our preference.

I think so too! @toddbaert It would be great to build the enhancement you built @leoromanovsky on the V2 if possible.
If you need this now, I think it would be acceptable to just include it in a non breaking 1.x release and rework it in the V2.

Myself and the rest of the technical committee will fully support you guys in helping move the v2 along if that's the route you'd like to go.

Absolutely!

The async goroutine was capturing ctx and newProvider by reference
in the closure, creating a race condition where the goroutine could
see modified values (like cancelled contexts) instead of the values
that existed when the goroutine was created.

Fixed by passing ctx, newProvider, and clientName as explicit
goroutine parameters instead of capturing them by reference.

This ensures the goroutine operates on the values that existed
when the async operation was initiated, preventing timing-dependent
failures.

Signed-off-by: Leo Romanovsky <leo.romanovsky@datadoghq.com>
Remove initNewAndShutdownOldWithContext function which was just a
pass-through to initNewAndShutdownOldInternal. Update callers to
use initNewAndShutdownOldInternal directly.

This reduces code complexity and eliminates an unnecessary
layer of indirection.

Signed-off-by: Leo Romanovsky <leo.romanovsky@datadoghq.com>
Rename functions to reflect their actual roles:
- initNewAndShutdownOldInternal → initNewAndShutdownOld (main implementation)
- initNewAndShutdownOld → initNewAndShutdownOldLegacy (background context wrapper)

This makes it clear which function is the primary implementation
and which is the legacy compatibility wrapper.

Signed-off-by: Leo Romanovsky <leo.romanovsky@datadoghq.com>
Remove context_closure_bug_test.go since the context closure bug
has been fixed. The test was created to demonstrate the issue
and is no longer needed.

Signed-off-by: Leo Romanovsky <leo.romanovsky@datadoghq.com>
Replace hardcoded 10-second timeout with DefaultShutdownTimeout constant.
This makes the timeout value explicit and documents the reasoning
behind the choice of 10 seconds for provider shutdown operations.

The constant prevents indefinite hangs during application termination
while allowing sufficient time for network cleanup operations.

Signed-off-by: Leo Romanovsky <leo.romanovsky@datadoghq.com>
Co-authored-by: Sahid Velji <sahidvelji@gmail.com>
Signed-off-by: Leo Romanovsky <leo.romanovsky@datadoghq.com>
- Change DefaultShutdownTimeout to private defaultShutdownTimeout constant
- Add timeout documentation to the initNewAndShutdownOld function
- Fix example_context_test.go to properly use openfeature package prefix
- Add missing import for openfeature package in examples
- Update all function calls to use openfeature.FunctionName() syntax

This follows Go best practices by keeping implementation details private
and properly demonstrates the public API in examples.

Signed-off-by: Leo Romanovsky <leo.romanovsky@datadoghq.com>
Replace first-error-only pattern with errors.Join to collect and return
all shutdown errors. This provides users with complete information about
all providers that failed to shut down cleanly, rather than stopping
at the first failure.

This is especially useful during application shutdown when you want
visibility into all the problems that occurred across multiple providers.

Signed-off-by: Leo Romanovsky <leo.romanovsky@datadoghq.com>
@sahidvelji
Copy link
Contributor

@leoromanovsky Thank you for this great contribution! I'll leave this open for another day in case there is more feedback. Otherwise, I will merge it tomorrow.

@leoromanovsky
Copy link
Contributor Author

@leoromanovsky Thank you for this great contribution! I'll leave this open for another day in case there is more feedback. Otherwise, I will merge it tomorrow.

Thank you - good plan to leave it open for a day to get more folks looking at the changes! Really enjoyed working with you and the team on this. Appreciate you being open to accepting this in v1 and hope the work informs the upcoming v2 changes.

Signed-off-by: Leo Romanovsky <leo.romanovsky@datadoghq.com>
@sahidvelji
Copy link
Contributor

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a valuable feature by adding context.Context support for provider initialization and shutdown, which allows for timeouts and cancellations. The implementation is well-structured, additive, and maintains backward compatibility with existing providers. The new public API methods are clearly documented, and the inclusion of comprehensive tests and examples is excellent. I have a few suggestions to improve the code, mainly around error handling in background operations and improving the clarity of test code.

- Fix shutdown context propagation to use passed context with smart timeout extension
- Remove unnecessary pre-initialization context check as InitWithContext handles it
- Simplify error handling logic to avoid redundant context checks
- Add 330+ lines of comprehensive tests covering context propagation, error handling, and edge cases
- Validate all fixes with new test providers and scenarios

Addresses all technical concerns raised in review feedback.

Signed-off-by: Leo Romanovsky <leo.romanovsky@datadoghq.com>
- Remove obsolete test for timeout extension behavior that no longer exists
- Remove unused defaultShutdownTimeout constant and time import
- Update comment that referenced removed defaultShutdownTimeout
- Clean up 34 lines of dead code while maintaining functionality

Completes simplification of shutdown context logic per reviewer feedback.

Signed-off-by: Leo Romanovsky <leo.romanovsky@datadoghq.com>
Signed-off-by: Leo Romanovsky <leo.romanovsky@datadoghq.com>
@sahidvelji sahidvelji merged commit 1a0d39e into open-feature:main Oct 30, 2025
7 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants