Skip to content

Conversation

@ductrung-nguyen
Copy link

@ductrung-nguyen ductrung-nguyen commented Dec 5, 2025

Fix Unit Tests Using Real Network I/O

Description

Unit tests were taking 30+ minutes to run because they were making real network calls - HTTP requests to Splunk REST APIs that don't exist in the test environment (5-second timeouts each) and Kubernetes pod exec commands trying to connect to real clusters (30-second timeouts). This PR fixes that by mocking out the network I/O functions.

The goal of this PR is only to shorten the execution time of the existing tests to speed up the development process.

Key Changes

TL;DR: we mock out the network I/O functions!

Production Code

  • pkg/splunk/enterprise/clustermaster.go
    • Converted PerformCmasterBundlePush and VerifyCMasterisMultisite from regular functions to var functions so they can be mocked in tests

Test Code

  • pkg/splunk/enterprise/clustermaster_test.go

    • Added mocks for VerifyCMasterisMultisite, GetPodExecClient, and PerformCmasterBundlePush
    • Added initGlobalResourceTracker() call for app framework tests
  • pkg/splunk/enterprise/clustermanager_test.go

    • Added mocks for GetCMMultisiteEnvVarsCall, PerformCmBundlePush, and GetPodExecClient
  • pkg/splunk/enterprise/standalone_test.go

    • Added mocks for GetAppsList and GetPodExecClient
    • Added app download directory setup with cleanup
  • pkg/splunk/enterprise/indexercluster_test.go

    • Added mocks for VerifyRFPeers, GetSpecificSecretTokenFromPod, newIndexerClusterPodManager, GetAppsList, and GetPodExecClient
  • pkg/splunk/enterprise/licensemanager_test.go

    • Added mocks for VerifyRFPeers, addTelApp, GetAppsList, and GetPodExecClient
  • pkg/splunk/enterprise/licensemaster_test.go

    • Added mocks for VerifyCMasterisMultisite, VerifyRFPeers, addTelApp, GetAppsList, and GetPodExecClient
  • pkg/splunk/enterprise/afwscheduler_test.go

    • Added timing variable overrides to speed up worker loops
    • Added GetPodExecClient mock
  • pkg/splunk/util/util_test.go

    • Added podExecGetConfig mock to return fast-failing config instead of real K8s config

Testing and Verification

Ran the full test suite before and after:

Test Before After
TestPodExecCommand 150s 0.01s
TestStandaloneWithReadyState 77s 2.6s
TestLicenseManagerWithReadyState 78s 4s
TestLicenseMasterWithReadyState 86s 2.14s
TestIndexerClusterWithReadyState 78s 3.5s
TestClusterMasterWitReadyState 92s 1.4s
TestApplyClusterMasterWithSmartstore 90s 0.00s
TestClusterManagerWitReadyState 75s 0.01s
TestInstallWorkerHandler 150s 6s
TestRunPodCopyWorker 150s 0.00s
TestPodCopyWorkerHandler 751s 0.05s

BEFORE:

coverage: 97.7% of statements
composite coverage: 86.0% of statements

Ginkgo ran 6 suites in 36m48.729625s
Test Suite Passed

AFTER:

coverage: 97.3% of statements
composite coverage: 86.1% of statements

Ginkgo ran 6 suites in 4m50.211326625s
Test Suite Passed

No new tests added.

this PR fixes existing tests by only mocking network I/O.

Coverage impact: None.

All mocked functions have their own dedicated unit tests (e.g., TestPerformCmasterBundlePush, TestVerifyRFPeers, TestGetAppsList*). The integration tests mock these functions because their purpose is to test the reconciliation flow, not the individual network functions.

Future works

At the momemt, we are using the "var function" pattern - converting regular functions to package-level variables that can be swapped out in tests.

While this gets the job done, IMO, it's not idea, and not a good practice:

  • Global state can cause test pollution if we forget to restore the original function
  • Every test needs boilerplate save/defer/restore code (because of the above reason)
  • Functions like VerifyCMasterisMultisite that create internal dependencies (e.g., splclient.NewSplunkClient) can't be fully tested because the hardcoded dependencies aren't mockable
  • Parallel test execution becomes risky

A better long-term approach would be Dependency Injection - passing dependencies as function parameters or struct fields. For example, instead of VerifyCMasterisMultisite hardcoding splclient.NewSplunkClient, it could accept a NewSplunkClientFunc parameter.

This makes the code more testable, removes global state, and makes dependencies explicit. But that's a bigger refactor for another day.

Related Issues

#1636

PR Checklist

  • Code changes adhere to the project's coding standards.
  • Relevant unit and integration tests are included.
  • Documentation has been updated accordingly.
  • All tests pass locally.
  • The PR description follows the project's guidelines.

Copilot AI review requested due to automatic review settings December 5, 2025 15:29
@patrykw-splunk
Copy link
Collaborator

Hey, @ductrung-nguyen thanks for creating this PR. We will review it and get back to you as soon as possible

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses performance issues in unit tests by mocking network I/O operations that were causing tests to timeout. The changes successfully reduce test execution time from 30+ minutes to under 5 minutes by replacing real network calls (HTTP requests to Splunk REST APIs and Kubernetes pod exec commands) with mocks.

Key Changes:

  • Mocking infrastructure: Converted several production functions to var functions to enable mocking (GetPodExecClient, PerformCmasterBundlePush, VerifyCMasterisMultisite, GetSpecificSecretTokenFromPod, PerformCmBundlePush)
  • Test improvements: Added comprehensive mocking in test files to avoid real network I/O during unit tests
  • Timing optimizations: Made phase manager timing variables configurable for faster test execution
  • Dependency updates: Updated testing libraries (ginkgo/gomega) and related dependencies

Reviewed changes

Copilot reviewed 30 out of 33 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
pkg/splunk/util/util.go Converted GetPodExecClient to a var function for mocking
pkg/splunk/util/secrets.go Converted GetSpecificSecretTokenFromPod to a var function for mocking
pkg/splunk/util/util_test.go Added mock for podExecGetConfig to prevent real K8s connection attempts
pkg/splunk/enterprise/clustermaster.go Converted PerformCmasterBundlePush and VerifyCMasterisMultisite to var functions
pkg/splunk/enterprise/clustermanager.go Added podExecClient parameter to ApplyClusterManager and PerformCmBundlePush for dependency injection
pkg/splunk/enterprise/types.go Added podExecClient field to PipelineWorker struct for test injection
pkg/splunk/enterprise/afwscheduler.go Made timing variables configurable and updated worker functions to use injected podExecClient
pkg/splunk/enterprise/*_test.go Added comprehensive mocking for network I/O functions across all test files
pkg/splunk/test/controller.go Improved Get() method to properly handle GVK inference and use key.Name for NotFoundError
internal/controller/clustermanager_controller.go Updated ApplyClusterManager signature to accept podExecClient parameter
test//.go Fixed import ordering (ginkgo imports)
go.mod, go.sum Updated testing dependencies (ginkgo v2.27.2, gomega v1.38.2, go-logr v1.4.3)
cmd/main.go Minor whitespace cleanup
Comments suppressed due to low confidence (1)

pkg/splunk/enterprise/clustermanager_test.go:1788

  • Missing defer statement: The mock for GetAppsList on line 1785 is being reassigned but there's no corresponding defer statement to restore the original value after the test completes. This could cause test pollution if this mock affects other tests that run after this one. Add a defer statement to restore the saved value.
	// adding getapplist to fix test case
	GetAppsList = func(ctx context.Context, remoteDataClientMgr RemoteDataClientManager) (splclient.RemoteDataListResponse, error) {
		remoteDataListResponse := splclient.RemoteDataListResponse{}
		return remoteDataListResponse, nil
	}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants