Stabilize integration tests by using a single shared mock server instead of multiple per-test servers #869

@KaveeshaPiumini

Currently, our integration tests spin up a separate mock server for each external integration.

Each of these is started individually within the relevant tests. My assumption is that the intermittent test failures happen when one or more of these mock servers is not shut down properly, or when there is a race condition between starting and stopping them. Either case can leave ports in an inconsistent state and cause random failures.

Instead of having multiple ad-hoc mock servers, the idea is to introduce a single shared test server that is started once for the integration test suite and stopped once when the suite finishes. Different scenarios (Google, GitHub, SMS, webhook, etc.) can then be handled via different HTTP paths/handlers on this single server.
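To make the proposal concrete, here is a minimal sketch of what a single shared server with per-integration handlers could look like. It is only an illustration under assumptions: the function names (`newSharedMockServer`, `fetch`), the paths (`/sms/send`, `/github/user`, `/google/userinfo`, `/webhook`), and the response bodies are hypothetical placeholders, not the project's existing mocks. It also uses `httptest.NewServer`, which picks a free port itself; if the product under test must be configured with a fixed port such as 8098, the listener setup would need to differ.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newSharedMockServer starts ONE HTTP server that hosts every mocked
// integration under its own path. All paths and response bodies below
// are hypothetical placeholders for illustration.
func newSharedMockServer() *httptest.Server {
	mux := http.NewServeMux()

	// SMS / notification provider mock.
	mux.HandleFunc("/sms/send", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "sms-ok")
	})
	// GitHub mock.
	mux.HandleFunc("/github/user", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"login":"test-user"}`)
	})
	// Google mock.
	mux.HandleFunc("/google/userinfo", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"sub":"test-sub"}`)
	})
	// Webhook receiver mock: acknowledge without a body.
	mux.HandleFunc("/webhook", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	})

	// httptest.NewServer listens on a free local port and only returns
	// once the listener is open, so callers never race the startup.
	return httptest.NewServer(mux)
}

// fetch is a small helper that returns the status code and body of a GET.
func fetch(url string) (int, string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(body), err
}

func main() {
	srv := newSharedMockServer() // started once for the whole suite
	defer srv.Close()            // stopped once when the suite finishes

	for _, path := range []string{"/sms/send", "/github/user", "/webhook"} {
		status, body, err := fetch(srv.URL + path)
		fmt.Println(path, status, body, err)
	}
}
```

In the actual suite, the start and stop would most likely live in a `TestMain(m *testing.M)` function, around the call to `m.Run()`, so that every test package shares one lifecycle instead of each test managing its own server.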

The intention of this change is specifically to stabilize the following tests, which are currently failing intermittently:

=== Failed
=== FAIL: flowregistration TestOURegistrationFlowTestSuite/TestSMSRegistrationFlowWithOUCreationDuplicateError/DuplicateOUName (5.01s)
    ouregistration_test.go:491: 
                Error Trace:    /Users/piumini/Work/THUNDER/thunder/tests/integration/flowregistration/ouregistration_test.go:491
                                                        /Users/piumini/go/pkg/mod/github.com/stretchr/testify@v1.10.0/suite/suite.go:115
                Error:          Expected value not to be nil.
                Test:           TestOURegistrationFlowTestSuite/TestSMSRegistrationFlowWithOUCreationDuplicateError/DuplicateOUName

=== FAIL: flowregistration TestOURegistrationFlowTestSuite/TestSMSRegistrationFlowWithOUCreationDuplicateError (10.04s)

=== FAIL: flowregistration TestOURegistrationFlowTestSuite (11.46s)
2025/11/28 14:52:19 Starting mock notification server on port 8098

These failures appear to be related to the mock notification server (port 8098) and are not consistently reproducible, which suggests a lifecycle / timing issue rather than a logic bug in the test itself.
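If the shared server has to keep a fixed port, one way to rule out the startup side of the suspected timing issue is to block until the port actually accepts connections before any test runs. The following is a hedged sketch only: `waitForPort`, `listenLocal`, and `readyTimeout` are hypothetical names introduced for illustration, and nothing here is confirmed to be the cause of the failure above.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// readyTimeout is an arbitrary illustrative limit, not a tuned value.
const readyTimeout = 500 * time.Millisecond

// waitForPort polls addr until a TCP connection succeeds or the
// timeout elapses. It returns nil as soon as the port is reachable.
func waitForPort(addr string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		conn, err := net.DialTimeout("tcp", addr, 100*time.Millisecond)
		if err == nil {
			conn.Close()
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("port %s not ready after %s: %w", addr, timeout, err)
		}
		time.Sleep(20 * time.Millisecond)
	}
}

// listenLocal is a demo helper: it opens a listener on a free local
// port and returns its address plus a function that closes it.
func listenLocal() (string, func(), error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	return ln.Addr().String(), func() { ln.Close() }, nil
}

func main() {
	addr, stop, err := listenLocal()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	fmt.Println("while listening:", waitForPort(addr, readyTimeout))
	stop()
	fmt.Println("after close:", waitForPort(addr, readyTimeout))
}
```

A check like this only covers startup. A server from an earlier test that still holds the port would need the shutdown path to be awaited as well, which is part of what a single suite-wide server avoids.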
