Skip to content

Conversation

@rsoaresd
Copy link
Contributor

@rsoaresd rsoaresd commented Dec 16, 2025

Description

In CreateSpaceRequest(), the parent space is created with the appstudio tier, but GetDefaultNamespace() was called immediately after waiting only for UntilSpaceHasAnyProvisionedNamespaces().

This created a race condition where:

  1. Parent space is created and initially provisions with the default tier (base1ns)
  2. base1ns tier creates a -dev namespace
  3. Parent space then transitions to the requested appstudio tier
  4. appstudio tier doesn't have a -dev namespace, so it deletes it and creates a -tenant namespace instead
  5. The SpaceRequest is created in a namespace that might be in the middle of this transition
  6. Subsequent tier updates fail because they reference namespaces that were deleted during the initial tier transition

Assisted-by: Cursor

Issue ticket number and link

SANDBOX-1502

Summary by CodeRabbit

  • Tests
    • Improved test coverage and stability around space provisioning and readiness — tests now verify tier configuration, provisioning status, and availability of provisioned namespaces, with additional synchronization to reduce flakiness.

✏️ Tip: You can customize this high-level summary in your review settings.

@openshift-ci openshift-ci bot requested review from fbm3307 and xcoulon December 16, 2025 11:15
@coderabbitai
Copy link

coderabbitai bot commented Dec 16, 2025

Walkthrough

CreateSpaceRequest stopped waiting for a provisioned namespace via Host().WaitForSpace and now uses the Space returned by the signup flow as parentSpace. Separately, VerifySpaceRelatedResources now adds an explicit WaitForSpace refresh to ensure the Space has the expected tier, Provisioned condition, and at least one provisioned namespace.

Changes

Cohort / File(s) Change Summary
Create space request flow
testsupport/space/spacerequest.go
Removed the Host().WaitForSpace wait and now uses the Space object returned by the signup flow directly as parentSpace when creating the SpaceRequest; no explicit wait for a provisioned namespace is performed here.
Space verification sync
testsupport/user_assertions.go
After verifying the home Space, added a WaitForSpace call that refreshes the Space and requires the same name, the specified tier, the Provisioned condition, and at least one provisioned namespace before proceeding.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Inspect testsupport/space/spacerequest.go to ensure relying on the signup-provided Space is safe in all test paths.
  • Verify the new WaitForSpace criteria in testsupport/user_assertions.go do not introduce flakiness or deadlocks in tests.
  • Confirm error handling and logging remain meaningful after the change.

Suggested reviewers

  • rajivnathan
  • metlos

Poem

🐇 I hopped through code both near and far,
Swapped waits for the signup's guiding star.
Then paused a moment to double-check space,
Tier, provisioned, namespaces in place—
A happy hop, tests ready to race! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: fixing a race condition in CreateSpaceRequest, which is the core objective of the PR.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d99f836 and 0fc4d04.

📒 Files selected for processing (2)
  • testsupport/space/spacerequest.go (1 hunks)
  • testsupport/user_assertions.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • testsupport/space/spacerequest.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-01T09:51:30.271Z
Learnt from: fbm3307
Repo: codeready-toolchain/toolchain-e2e PR: 1175
File: testsupport/tiers/checks.go:1427-1608
Timestamp: 2025-08-01T09:51:30.271Z
Learning: In testsupport/tiers/checks.go, the go-prefixed cluster resource quota functions (like goClusterResourceQuotaDeployments, goClusterResourceQuotaReplicas, etc.) should be kept separate from their non-go counterparts, even though they appear similar. The maintainer prefers this separation for better code organization and to maintain clear boundaries between different tier implementations.

Applied to files:

  • testsupport/user_assertions.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Unit Tests
  • GitHub Check: Build & push operator bundles & dashboard image for e2e tests
🔇 Additional comments (1)
testsupport/user_assertions.go (1)

274-278: LGTM! Correctly addresses the race condition.

This additional wait ensures the Space is refreshed with the correct tier, provisioned condition, and—critically—provisioned namespaces after all intermediate verifications complete. This prevents returning a Space object that might be mid-transition during tier changes (e.g., base1ns → appstudio with namespace deletions/creations), which was causing the race condition described in the PR.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@xcoulon xcoulon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, Rafalea! 👏

Comment on lines 60 to 66
Execute(t)

// wait for the namespace to be provisioned since we will be creating the spacerequest into it.
parentSpace, err := awaitilities.Host().WaitForSpace(t, user.Space.Name, wait.UntilSpaceHasAnyProvisionedNamespaces())
parentSpace, err := awaitilities.Host().WaitForSpace(t, user.Space.Name,
wait.UntilSpaceHasTier("appstudio"),
wait.UntilSpaceHasConditions(wait.Provisioned()),
wait.UntilSpaceHasAnyProvisionedNamespaces())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking at how the Signup is created:

  • it defines the needed & expected space tier as appstudio
  • and it also calls EnsureMUR which should trigger the evaluation of the provisioned namespaces:
    if r.ensureMUR {
    expectedSpaceTier := tiers.GetDefaultSpaceTierName(t, hostAwait)
    if !r.noSpace {
    if r.spaceTier != "" {
    tiers.MoveSpaceToTier(t, hostAwait, userSignup.Status.CompliantUsername, r.spaceTier)
    expectedSpaceTier = r.spaceTier
    }
    space := VerifySpaceRelatedResources(t, r.awaitilities, userSignup, expectedSpaceTier)
    r.space = space
    spaceMember := GetSpaceTargetMember(t, r.awaitilities, space)
    VerifyUserRelatedResources(t, r.awaitilities, userSignup, "deactivate30", ExpectUserAccountIn(spaceMember))

so when the execution of the "SignupRequest" logic is finished, then the space should be already in the expected state, so I'm wondering how the race condition could happen. Or I'm maybe missing some small and important detail 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I did not know!

The problem is by the time we call GetDefaultNamespace(parentSpace.Status.ProvisionedNamespaces)) it still has the -dev one and not the -tenant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

@MatousJobanek MatousJobanek Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, the test execution probably simply happens too quickly. I would rather update the common logic in VerifySpaceRelatedResources:

func VerifySpaceRelatedResources(t *testing.T, awaitilities wait.Awaitilities, userSignup *toolchainv1alpha1.UserSignup, spaceTierName string) *toolchainv1alpha1.Space {
hostAwait := awaitilities.Host()
userSignup, err := hostAwait.WaitForUserSignup(t, userSignup.Name,
wait.UntilUserSignupHasStateLabel(toolchainv1alpha1.UserSignupStateLabelValueApproved),
wait.ContainsCondition(wait.Complete()))
require.NoError(t, err)
tier, err := hostAwait.WaitForNSTemplateTier(t, spaceTierName)
require.NoError(t, err)
hash, err := hash.ComputeHashForNSTemplateTier(tier) // we can assume the JSON marshalling will always work
require.NoError(t, err)
space, err := hostAwait.WaitForSpace(t, userSignup.Status.CompliantUsername,
wait.UntilSpaceHasTier(spaceTierName),
wait.UntilSpaceHasLabelWithValue(toolchainv1alpha1.SpaceCreatorLabelKey, userSignup.Name),
wait.UntilSpaceHasLabelWithValue(fmt.Sprintf("toolchain.dev.openshift.com/%s-tier-hash", spaceTierName), hash),
wait.UntilSpaceHasConditions(wait.Provisioned()),
wait.UntilSpaceHasStateLabel(toolchainv1alpha1.SpaceStateLabelValueClusterAssigned),
wait.UntilSpaceHasAnyTargetClusterSet())
require.NoError(t, err)
mur, err := hostAwait.WaitForMasterUserRecord(t, userSignup.Status.CompliantUsername,
wait.UntilMasterUserRecordHasConditions(wait.Provisioned(), wait.ProvisionedNotificationCRCreated()),
wait.UntilMasterUserRecordHasUserAccountStatusesInClusters(space.Spec.TargetCluster))
require.NoError(t, err)
testsupportsb.VerifySpaceBinding(t, hostAwait, mur.Name, space.Name, "admin")
bindings, err := hostAwait.ListSpaceBindings(space.Name)
require.NoError(t, err)
memberAwait := GetMurTargetMember(t, awaitilities, mur)
// Verify provisioned NSTemplateSet
nsTemplateSet, err := memberAwait.WaitForNSTmplSet(t, space.Name,
wait.UntilNSTemplateSetHasTier(tier.Name),
wait.UntilNSTemplateSetHasSpaceRolesFromBindings(tier, bindings),
)
require.NoError(t, err)
tierChecks, err := tiers.NewChecksForTier(tier)
require.NoError(t, err)
tiers.VerifyNSTemplateSet(t, hostAwait, memberAwait, nsTemplateSet, space, tierChecks)
require.Equal(t, space.Name, userSignup.Status.HomeSpace)
return space
}

so just move this code that you added in your PR to the end of the function mentioned above - something like this:

     ...
	space, err = awaitilities.Host().WaitForSpace(t, space.Name,
		wait.UntilSpaceHasTier(spaceTierName),
		wait.UntilSpaceHasConditions(wait.Provisioned()),
		wait.UntilSpaceHasAnyProvisionedNamespaces())
	require.NoError(t, err)
	return space
}

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! Addressed, thanks!

parentSpace := user.Space

// create the space request in the "default" namespace provisioned by the parentSpace
spaceRequest := NewSpaceRequest(t, append(opts, WithNamespace(GetDefaultNamespace(parentSpace.Status.ProvisionedNamespaces)))...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid that now parentSpace.Status.ProvisionedNamespaces might be empty, when this line is executed, thus the spacerequest will not be created. But maybe I'm wrong.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should not - look at the updated code in the other file/function - it has condition wait.UntilSpaceHasAnyProvisionedNamespaces() which ensures that there is gonna be some provisioned namespace set

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I see now, I've missed that this VerifySpaceRelatedResources was called indirectly by the test.

It looks like we were already waiting for the space https://github.com/mfrancisc/toolchain-e2e/blob/8f4432ee3e06e75b90989469cee0aea29c28ab34/testsupport/user_assertions.go#L243-L249, we were probably only missing that wait.UntilSpaceHasAnyProvisionedNamespaces() in there.

Copy link
Collaborator

@MatousJobanek MatousJobanek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, thanks 👍

@openshift-ci
Copy link

openshift-ci bot commented Dec 17, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MatousJobanek, rsoaresd, xcoulon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [MatousJobanek,rsoaresd,xcoulon]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rsoaresd
Copy link
Contributor Author

/retest

@sonarqubecloud
Copy link

@rsoaresd rsoaresd merged commit 8f4432e into codeready-toolchain:master Dec 17, 2025
9 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants