Decrease retained object allocations in the reporter by camallen · Pull Request #253 · buildkite/test-collector-ruby

camallen · 2026-02-10T07:17:12Z

Reduce retained object allocations in the reporter. The changes in this PR are based on the analysis below.

Buildkite TestCollector Memory Analysis

Gem Details

Version: 2.11.0

How the Collector Works (Lifecycle per Test)

before_setup — Creates a Tracer object, stores it in Thread.current[:_buildkite_tracer]
During test — Every SQL query, HTTP request, sleep() call, and WebMock stub creates Span objects appended to the tracer's tree
after_teardown — Finalizes the tracer, converts it to a Trace object, stores it in Uploader.traces[source_location]
Reporter#record — Queues the trace for upload. When queue hits batch_size (default 500), spawns a background thread to upload, then deletes traces from the hash (TE-5203 fork fix)
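The queue-then-flush step of this lifecycle can be sketched in plain Ruby. This is a simplified illustration, not the gem's code: `SketchReporter`, `peak_retained`, and `uploaded_batches` are hypothetical names, the upload is replaced by an in-memory array, and no background thread is used.

```ruby
# Hypothetical stand-in for the trace store plus reporter queue.
# None of these names are the gem's real classes or methods.
class SketchReporter
  attr_reader :traces, :uploaded_batches, :peak_retained

  def initialize(batch_size: 500)
    @batch_size = batch_size
    @traces = {}            # source_location => trace, similar in role to Uploader.traces
    @queue = []             # source locations waiting for upload
    @uploaded_batches = []  # stands in for the real HTTP upload
    @peak_retained = 0      # highest number of traces held at once
  end

  # Called once per finished test.
  def record(source_location, trace)
    @traces[source_location] = trace
    @queue << source_location
    @peak_retained = [@peak_retained, @traces.size].max
    flush if @queue.size >= @batch_size
  end

  # "Upload" everything queued, then delete the uploaded traces from the hash.
  def flush
    return if @queue.empty?

    batch = @queue.shift(@queue.size)
    @uploaded_batches << batch.map { |location| @traces[location] }
    batch.each { |location| @traces.delete(location) }
  end
end

reporter = SketchReporter.new(batch_size: 3)
7.times { |i| reporter.record("test_#{i}.rb:1", { name: "test_#{i}" }) }
reporter.flush # final partial batch at the end of the run
```

In this sketch the number of traces held at once never exceeds the batch size, which is why the batch size bounds peak retention in the analysis that follows.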

Memory Leak Vectors

1. CRITICAL: `Minitest::StatisticsReporter#record` retains failed/skipped test results forever

minitest.rb:889:

results << result if not result.passed? or result.skipped?

The Buildkite reporter extends Minitest::StatisticsReporter and calls super in record. Every failed or skipped test's full result object (including its entire Minitest::Test instance) gets accumulated in the reporter's results array for the entire test run. The Buildkite reporter inherits this unnecessarily.

2. CRITICAL: `Trace` holds a reference to the full `Minitest::Test` instance

minitest_plugin/trace.rb:20-25:

def initialize(example, history:, tags: nil, trace: nil, location_prefix: nil)
  @example = example  # <-- Full Minitest::Test instance!
  # ...
end

@example is the entire test object — including all instance variables set during the test (fixtures, stubs, mock objects, etc.). This trace lives in Uploader.traces until it's batch-uploaded and deleted. With a batch size of 500, up to 500 full test objects are retained at any time.

3. HIGH: SQL query strings accumulated as span data

test_collector.rb:109-111:

ActiveSupport::Notifications.subscribe("sql.active_record") do |name, start, finish, id, payload|
  Buildkite::TestCollector::Uploader.tracer&.backfill(:sql, finish - start, **{ query: payload[:sql] })
end

Every SQL query executed during a test gets its full query string stored as a Span child node. For a Rails app with fixtures and complex setup, a single test can easily execute 50-200+ SQL queries. Each span object stores: section, start_at, end_at, a detail hash (with the query string), and a children array.

With 550+ fixture files loaded, this is significant.

4. HIGH: HTTP request URLs stored in span detail hashes

network.rb:12:

detail = { method: request.method.upcase, url: uri.to_s, lib: "net-http" }

Every HTTP request (including WebMock-stubbed ones) creates a span with the full URL stored.

5. MEDIUM: Upload threads accumulate in `@upload_threads` array

session.rb:53:

@upload_threads << new_thread if new_thread

Completed Thread objects are never removed from this array. They're only killed at close(). With default batch_size of 500 and thousands of tests, this array grows with dead Thread references.
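One possible mitigation is to prune finished threads whenever a new one is added. The sketch below assumes a hypothetical `UploadThreadPool` class standing in for the session's bookkeeping; only `Thread#alive?`, `Thread#join`, and `Array#select!` are real Ruby APIs here.

```ruby
# Hypothetical stand-in for the session's upload-thread bookkeeping.
class UploadThreadPool
  attr_reader :upload_threads

  def initialize
    @upload_threads = []
  end

  # Start an upload in a new thread, dropping finished threads first
  # so the array only ever holds threads that are still running.
  def spawn(&block)
    prune_finished
    thread = Thread.new(&block)
    @upload_threads << thread
    thread
  end

  # Thread#alive? is false once a thread has finished or died.
  def prune_finished
    @upload_threads.select!(&:alive?)
  end

  # Wait for any in-flight uploads, then release all references.
  def close
    @upload_threads.each(&:join)
    @upload_threads.clear
  end
end

pool = UploadThreadPool.new
3.times { pool.spawn { :uploaded }.join } # each upload finishes before the next starts
pool.prune_finished
```

With pruning on every spawn, the array size tracks the number of in-flight uploads rather than the total number of batches sent during the run.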

6. MEDIUM: TE-5203 fix has a timing gap

The fork's fix deletes traces from Uploader.traces after calling upload_data. But between storage and batch upload, up to batch_size (500) traces live in memory. Each trace holds @example (the full test instance) and @history (the entire span tree with all SQL queries).

The Compounding Effect

Why enabling the collector causes OOM, while without it memory only grows gradually:

Without collector:

Tests run, minitest's own StatisticsReporter accumulates failed/skipped results (gradual increase)
Ruby GC can collect test objects after each test

With collector:

All the above, PLUS:
Each test creates a Tracer with a span tree capturing every SQL query, HTTP call, and sleep
The Trace object holds a strong reference to the entire Minitest::Test instance (preventing GC of the test and everything it references)
Up to 500 of these accumulate before batch upload
SQL spans can be 50-200+ per test, each storing the full query string
Upload threads accumulate as dead references
The history hash is a deep recursive structure that gets serialized to JSON (creating temporary copies)

Estimated Memory Per Test

For this app:

Minitest::Test instance with fixtures: ~50-200KB (depending on accessed fixtures)
SQL span objects (100 queries avg): ~30-50KB of query strings + span overhead
HTTP span objects: ~5-10KB
Trace metadata: ~2-5KB

Rough estimate: 100-250KB per test retained until batch flush.
At batch_size=500: 50-125MB peak before flush.

This is on top of the base memory. Tests that set up large objects, fixtures, or generate many SQL queries will be much larger.
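The arithmetic behind these figures can be reproduced as follows. The per-component ranges are the estimates listed above; the constant and method names are illustrative, and 1 MB is taken as 1000 KB.

```ruby
# Per-test estimates from the list above, in KB (low..high).
PER_TEST_KB = {
  test_instance: 50..200,
  sql_spans: 30..50,
  http_spans: 5..10,
  trace_metadata: 2..5
}.freeze

# Summing the components gives 87..265 KB, which the analysis rounds to 100-250 KB.
low_kb  = PER_TEST_KB.values.sum(&:min)
high_kb = PER_TEST_KB.values.sum(&:max)

# Peak retained memory in MB when a full batch is waiting to be flushed.
def peak_mb(per_test_kb, batch_size)
  per_test_kb * batch_size / 1000.0
end

puts "component sum: #{low_kb}-#{high_kb} KB per test"
puts "batch 500: #{peak_mb(100, 500)}-#{peak_mb(250, 500)} MB"
puts "batch 50:  #{peak_mb(100, 50)}-#{peak_mb(250, 50)} MB"
```

Because the peak scales linearly with the batch size, a tenfold smaller batch gives a tenfold smaller peak under the same per-test estimate.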

Recommendations

Quick Wins (env vars / config changes)

1. Reduce batch size

Set BUILDKITE_ANALYTICS_UPLOAD_BATCH_SIZE=50 to flush more frequently and reduce peak memory.

2. Filter short spans

Set BUILDKITE_ANALYTICS_TRACE_MIN_MS=5 to filter out spans under 5ms (eliminates most SQL spans from the trace tree).
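To illustrate what a minimum-duration filter does to a span tree, here is a small sketch. It is not the gem's implementation: `SketchSpan` and `prune_short_spans` are hypothetical, with fields modelled on the span attributes described earlier (section, start_at, end_at, detail, children).

```ruby
# Hypothetical span type; times are in seconds.
SketchSpan = Struct.new(:section, :start_at, :end_at, :detail, :children) do
  def duration_ms
    (end_at - start_at) * 1000.0
  end
end

# Returns a copy of the tree in which child spans shorter than min_ms
# (and everything beneath them) have been dropped.
def prune_short_spans(span, min_ms)
  kept = span.children
             .select { |child| child.duration_ms >= min_ms }
             .map { |child| prune_short_spans(child, min_ms) }
  SketchSpan.new(span.section, span.start_at, span.end_at, span.detail, kept)
end

root = SketchSpan.new(:top, 0.0, 0.100, {}, [
  SketchSpan.new(:sql, 0.000, 0.001, { query: "SELECT 1" }, []),            # about 1 ms
  SketchSpan.new(:sql, 0.010, 0.030, { query: "SELECT * FROM users" }, []), # about 20 ms
  SketchSpan.new(:http, 0.040, 0.042, { url: "https://example.test/" }, []) # about 2 ms
])

pruned = prune_short_spans(root, 5)
puts pruned.children.map(&:section).inspect
```

Only the 20 ms query survives a 5 ms threshold in this example, so the short query strings and URLs are no longer retained by the tree.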

3. Disable tracing entirely

If you only need test timing data (not per-query traces), configure with tracing_enabled: false:

Buildkite::TestCollector.configure(hook: :minitest, tracing_enabled: false)

This skips all monkey-patching (Net::HTTP, Object#sleep, ActiveRecord subscriber) and no spans are created.

Gem Patches (require forking further)

4. Break the Trace -> Test reference

The biggest win would be patching the Trace to not hold @example. It only needs it for result_code, source_location, class.name, name, failures, and failure.message. These could be extracted eagerly in initialize instead of holding the full test object. This would allow GC to collect test instances immediately.
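A minimal sketch of the eager-extraction idea, assuming the fields named above are all the trace needs. `TraceSnapshot`, `FakeTest`, and `FakeFailure` are hypothetical stand-ins; the real Trace constructor takes more arguments and may need other values.

```ruby
# Hypothetical stand-ins for a Minitest::Test instance and its failures.
FakeFailure = Struct.new(:message)
FakeTest = Struct.new(:name, :result_code, :source_location, :failures, :fixtures)

class TraceSnapshot
  attr_reader :class_name, :name, :result_code, :source_location,
              :failure_messages, :history

  def initialize(example, history:)
    # Copy only the small values needed later for the upload payload.
    @class_name = example.class.name
    @name = example.name
    @result_code = example.result_code
    @source_location = example.source_location
    @failure_messages = example.failures.map(&:message)
    @history = history
    # Deliberately no @example: the test object itself is not retained.
  end
end

heavy_test = FakeTest.new(
  "test_checkout", "F", ["test/checkout_test.rb", 12],
  [FakeFailure.new("Expected 1, got 2")],
  Array.new(1_000) { |i| "fixture row #{i}" } # stands in for large per-test state
)

snapshot = TraceSnapshot.new(heavy_test, history: { section: :top, children: [] })
puts snapshot.failure_messages.inspect
```

The snapshot keeps a handful of strings and the history hash, so the test object and anything reachable only through it can be collected once the test finishes.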

5. Clear StatisticsReporter results

Override record in the Buildkite reporter to not call super, or clear results periodically. This prevents minitest's own accumulation of failed/skipped test result objects.
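The difference between accumulating results and only counting them can be sketched without Minitest itself. `AccumulatingReporter` mimics the retention line quoted earlier from minitest.rb; `CountingReporter` and `FakeResult` are hypothetical and are not the code in this PR.

```ruby
# Hypothetical result object exposing the two predicates the reporters use.
FakeResult = Struct.new(:name, :failed, :skipped) do
  def passed?
    !failed && !skipped
  end

  def skipped?
    skipped
  end
end

# Mimics the accumulation behaviour: every non-passing result is kept.
class AccumulatingReporter
  attr_reader :results

  def initialize
    @results = []
  end

  def record(result)
    results << result if not result.passed? or result.skipped?
  end
end

# Keeps integer counters only, so no result object outlives record.
class CountingReporter
  attr_reader :failures, :skips

  def initialize
    @failures = 0
    @skips = 0
  end

  def record(result)
    if result.skipped?
      @skips += 1
    elsif !result.passed?
      @failures += 1
    end
  end
end

run = [
  FakeResult.new("test_ok", false, false),
  FakeResult.new("test_broken", true, false),
  FakeResult.new("test_pending", false, true)
]

accumulating = AccumulatingReporter.new
counting = CountingReporter.new
run.each do |result|
  accumulating.record(result)
  counting.record(result)
end
puts "retained: #{accumulating.results.size}, failures: #{counting.failures}, skips: #{counting.skips}"
```

The trade-off is that a counting reporter cannot print failure details afterwards; that is acceptable only if another reporter in the run is responsible for the summary output.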

switch away from the Minitest::StatisticsReporter interface to save o…

fccd918

…n object allocations

camallen wants to merge 1 commit into buildkite:main from camallen:improve-memory-usage
