perf: denormalize post_status into event_dates to eliminate posts table JOIN #161

Merged
chubes4 merged 1 commit into main from perf/event-dates-post-status
Mar 23, 2026


chubes4 commented Mar 23, 2026

Summary

  • Add post_status column to datamachine_event_dates table with composite (post_status, start_datetime) index
  • Skip the 130K-row posts table JOIN in date-filtered queries by using ed.post_status = 'publish' directly
  • Update post_status on the transition_post_status hook to keep the denormalized column in sync

Problem

The events site has 37K events in event_dates and 130K rows in the posts table. Every event count/aggregation query JOINed posts solely to filter on post_type = 'data_machine_events' AND post_status = 'publish'. The 128MB InnoDB buffer pool couldn't cache all of those pages, so each query took 2-4s reading from disk.
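To make the query shape concrete, here is a minimal sketch of a count query before and after the change. The `wp_` table prefix, the `post_id` column name, the SELECT list, and the function names are assumptions for illustration; the real queries in this PR are not shown here.

```php
<?php
// Sketch only: the prefix, the post_id column, and the SELECT list are assumptions.

// Old shape: JOIN posts purely to filter on post_type and post_status.
function sketch_count_sql_before(string $prefix = 'wp_'): string {
    return "SELECT COUNT(DISTINCT ed.post_id)
        FROM {$prefix}datamachine_event_dates ed
        INNER JOIN {$prefix}posts p ON p.ID = ed.post_id
        WHERE p.post_type = 'data_machine_events'
          AND p.post_status = 'publish'
          AND ed.start_datetime >= %s";
}

// New shape: the denormalized column lets the query stay on event_dates,
// where the (post_status, start_datetime) index can serve both predicates.
function sketch_count_sql_after(string $prefix = 'wp_'): string {
    return "SELECT COUNT(DISTINCT ed.post_id)
        FROM {$prefix}datamachine_event_dates ed
        WHERE ed.post_status = 'publish'
          AND ed.start_datetime >= %s";
}
```

The `%s` placeholder is meant to be bound with `$wpdb->prepare()` before execution.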

Changes

| File | Change |
| --- | --- |
| EventDatesTable.php | Add post_status column to schema, upsert(), update_status(), backfill() |
| meta-storage.php | Add transition_post_status hook to keep status in sync |
| DateFilter.php | Add $include_status and $join_column params to upcoming_sql(), past_sql(), date_range_sql() |
| Taxonomy_Helper.php | Skip posts JOIN when date filter is active; use tr.object_id for cross-filter JOINs |
| UpcomingCountAbilities.php | Skip posts JOIN, use ed.post_status directly |
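The status sync in meta-storage.php could look roughly like the sketch below. The `transition_post_status` action and `$wpdb->update()` are standard WordPress APIs; the helper name, the `post_id` column, and the direct table write (rather than a call to `EventDatesTable::update_status()`, whose signature is not shown in this PR) are assumptions.

```php
<?php
// Hypothetical sketch; not the PR's actual implementation.

// Only event posts whose status actually changed need to be mirrored.
function dm_sketch_should_sync(string $new_status, string $old_status, string $post_type): bool {
    return $post_type === 'data_machine_events' && $new_status !== $old_status;
}

// Guarded so the file can also be loaded outside WordPress.
if (function_exists('add_action')) {
    add_action('transition_post_status', function ($new_status, $old_status, $post) {
        if (!dm_sketch_should_sync($new_status, $old_status, $post->post_type)) {
            return;
        }
        global $wpdb;
        // Assumed column name: post_id links event_dates rows to the post.
        $wpdb->update(
            $wpdb->prefix . 'datamachine_event_dates',
            ['post_status' => $new_status],
            ['post_id' => $post->ID]
        );
    }, 10, 3); // 3 = number of arguments the action passes to the callback
}
```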

Benchmarks (37K events, 257 location terms)

| Query | Before | After | Improvement |
| --- | --- | --- | --- |
| Location term counts | 2.9s | 107ms | 27x |
| Cross-filter counts | 3.7s | 174ms | 21x |

Migration

Schema change applied live. Backfill (UPDATE ed SET post_status = p.post_status) affected 805 rows (non-published events that were defaulting to 'publish').
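For reference, the schema change and backfill might expand to statements like the following. This is a sketch under assumptions: the `wp_` prefix, the `post_id` column, the index name `status_start`, and the `VARCHAR(20)` type (chosen to match `wp_posts.post_status`) are not taken from the PR.

```php
<?php
// Hypothetical helper returning the migration statements in order.
function sketch_migration_sql(string $prefix = 'wp_'): array {
    $table = $prefix . 'datamachine_event_dates';
    return [
        // New column; defaulting to 'publish' is why a backfill is needed.
        "ALTER TABLE {$table} ADD COLUMN post_status VARCHAR(20) NOT NULL DEFAULT 'publish'",
        // Composite index covering the status filter plus the date range scan.
        "ALTER TABLE {$table} ADD INDEX status_start (post_status, start_datetime)",
        // Backfill: copy the real status for rows that differ from the default.
        "UPDATE {$table} ed
         INNER JOIN {$prefix}posts p ON p.ID = ed.post_id
         SET ed.post_status = p.post_status
         WHERE ed.post_status <> p.post_status",
    ];
}
```

Restricting the UPDATE to rows whose status differs keeps the write set small, which matches the 805 affected rows reported above.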

Backward Compatibility

  • DateFilter::upcoming_sql(), past_sql(), and date_range_sql() default to include_status=true, join_column='p.ID', so existing callers that still JOIN posts are unaffected
  • EventDatesTable::upsert() auto-detects post_status from the post if not provided, so existing callers don't need changes
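One plausible reading of the new DateFilter parameters is sketched below. The function name, the returned array shape, the `post_id` column, and the interpretation of `$include_status` as "append the ed.post_status filter" are all assumptions; the real method bodies are not shown in this PR.

```php
<?php
// Hypothetical stand-in for DateFilter::upcoming_sql().
function sketch_upcoming_sql(
    bool $include_status = true,
    string $join_column = 'p.ID',
    string $prefix = 'wp_'
): array {
    // Whitelist the column because it is interpolated into SQL, not bound.
    if (!in_array($join_column, ['p.ID', 'tr.object_id'], true)) {
        throw new InvalidArgumentException("Unexpected join column: {$join_column}");
    }
    $join  = "INNER JOIN {$prefix}datamachine_event_dates ed ON ed.post_id = {$join_column}";
    $where = "ed.start_datetime >= %s"; // %s is bound later via $wpdb->prepare()
    if ($include_status) {
        $where .= " AND ed.post_status = 'publish'";
    }
    return ['join' => $join, 'where' => $where];
}
```

Under this reading, a legacy caller uses `sketch_upcoming_sql()` and keeps joining through `p.ID`, while a caller that skips the posts table passes `'tr.object_id'` to join event_dates straight to the term relationships.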

…le JOIN

Add post_status column to datamachine_event_dates table with composite
index (post_status, start_datetime). This allows date-filtered queries
to skip the 130K-row posts table entirely.

- EventDatesTable: add post_status column to schema, upsert, and backfill
- meta-storage: sync post_status on transition_post_status hook
- DateFilter: add post_status filter and join_column params to SQL helpers
- Taxonomy_Helper: skip posts JOIN when date filter is active
- UpcomingCountAbilities: skip posts JOIN, use ed.post_status

Benchmarks on 37K events:
- Location term counts: 2.9s → 107ms (27x faster)
- Cross-filter counts: 3.7s → 174ms (21x faster)
chubes4 merged commit 37e469e into main Mar 23, 2026
1 check failed
chubes4 deleted the perf/event-dates-post-status branch March 23, 2026 20:23