docs: proposal for resolving metrics query failures #2810

mikewilli · 2025-10-07T14:29:57Z

This doc describes an approach to resolving BigQuery resource and timeout errors stemming from the metrics query. The proposed approach has two phases. First, we will resolve issues with discrete metrics and monitor the results to determine whether further action is necessary. If it is, a second phase will aggregate weekly and overall metrics from daily metrics, which would reduce the amount of data being queried and eliminate the need to re-query telemetry tables for weekly and overall periods.

scholtzan · 2025-10-23T20:21:00Z

docs/proposal-0008-daily_metric_aggregations.md

+      ORDER BY client_level_daily_active_users_v2 DESC
+      ```
+    - If the above does not cover what we need, then some alternative options:
+      - add a new metric-hub parameter for metrics to specify how Jetstream should aggregate across days


What about metrics that can't really be aggregated across days, like `COUNT DISTINCT`?

mikewilli · 2025-10-31

Yeah, there's definitely some loss of fidelity for certain metrics with the proposed approach. As Daniel suggested, I'll flesh out the various options here so we can decide whether this is the best approach and what the alternatives are.
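To make the `COUNT DISTINCT` concern concrete, here is a minimal Python sketch. The dates and client IDs are made up for illustration and are not taken from Jetstream or metric-hub. It shows that summing per-day distinct counts overstates the distinct count for the whole period, because a value seen on several days is counted once per day.

```python
# Hypothetical per-day sets of distinct values (illustrative data only).
daily_clients = {
    "2025-10-01": {"a", "b", "c"},
    "2025-10-02": {"b", "c", "d"},
    "2025-10-03": {"a", "d"},
}


def sum_of_daily_distinct(days):
    """Roll up daily COUNT DISTINCT results with SUM (lossy)."""
    return sum(len(values) for values in days.values())


def true_distinct(days):
    """COUNT DISTINCT over the whole period, computed from the raw values."""
    return len(set().union(*days.values()))


naive = sum_of_daily_distinct(daily_clients)
exact = true_distinct(daily_clients)
print(naive, exact)  # prints "8 4"
```

The daily counts alone (3, 3, 2) are not enough to recover the period-level answer; that requires the underlying values or some mergeable representation of them.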

danielkberry · 2025-10-30T23:11:06Z

docs/proposal-0008-daily_metric_aggregations.md

+      - `MIN` enrollment and exposure dates is not necessary since we already do this when building enrollments
+      - `SUM` enrollment and exposure events
+      - `LOGICAL_OR` for boolean metrics
+      - `SUM` for int metrics


I think this section needs to be more fleshed out. IMO, this is really the meat of the proposal so we should align on what options we'd consider, what their pros/cons are, etc. before we can make a decision.
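As a starting point for that discussion, the type-based rules in the quoted diff (`LOGICAL_OR` for boolean metrics, `SUM` for int metrics and for exposure events) could look roughly like the sketch below. This is only an illustration in Python: the row layout, the column names, and the `rollup` helper are assumptions, not Jetstream code.

```python
from collections import defaultdict

# Hypothetical daily rows; column names are assumptions for illustration.
daily_rows = [
    {"client_id": "a", "day": "2025-10-01", "exposure_events": 1, "used_feature": False, "searches": 3},
    {"client_id": "a", "day": "2025-10-02", "exposure_events": 0, "used_feature": True, "searches": 2},
    {"client_id": "b", "day": "2025-10-01", "exposure_events": 2, "used_feature": False, "searches": 0},
]


def aggregate_values(values):
    """Apply the type-based rule: OR for booleans, SUM for ints."""
    # bool is a subclass of int in Python, so check for bool first.
    if all(isinstance(v, bool) for v in values):
        return any(values)
    return sum(values)


def rollup(rows, key="client_id", skip=("client_id", "day")):
    """Collapse daily rows into one row per client for the whole period."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for column, value in row.items():
            if column not in skip:
                grouped[row[key]][column].append(value)
    return {
        client: {col: aggregate_values(vals) for col, vals in cols.items()}
        for client, cols in grouped.items()
    }


period = rollup(daily_rows)
print(period["a"])
```

A rule keyed only on column type is simple, but it cannot express metrics whose daily values are not additive, which is the trade-off the options list would need to spell out.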

danielkberry · 2025-10-30T23:11:50Z

docs/proposal-0008-daily_metric_aggregations.md

+      - retain the original column names in the daily metrics query output, and then just run the original metric `select_expression` against the daily results tables
+        - possibly store all daily values in a histogram field so that we can do arbitrary aggregations/statistics without loss of fidelity
+
+2. How to handle pre-enrollment analysis periods?


Off the cuff, I would expect pre-experiment periods to work the same as all other periods (aggregate to day level, then aggregate to pre-enrollment period level).

mikewilli · 2025-10-31

I think the biggest differences here are:

- these aren't critical for results availability
- we'd need to run all of these daily queries on the first day of analysis, whereas for post-enrollment periods we typically already have those computed

That said, I agree that it would be a little odd for these to have a special workflow. I'll update the proposal to clarify the proposed route for this (likely by making them work the same as other periods).

docs: proposal for daily metric aggregations

30805d4

mikewilli temporarily deployed to GH Actions October 7, 2025 14:30 — with GitHub Actions Inactive

Merge branch 'main' into daily-metric-aggregations-proposal

c64562d

mikewilli temporarily deployed to GH Actions October 22, 2025 19:02 — with GitHub Actions Inactive

alternatives; edits

47d57fe

mikewilli force-pushed the daily-metric-aggregations-proposal branch from 3caefa0 to 47d57fe Compare October 22, 2025 19:59

mikewilli temporarily deployed to GH Actions October 22, 2025 19:59 — with GitHub Actions Inactive

Merge branch 'main' into daily-metric-aggregations-proposal

2645f14

mikewilli marked this pull request as ready for review October 22, 2025 19:59

mikewilli temporarily deployed to GH Actions October 22, 2025 19:59 — with GitHub Actions Inactive

mikewilli requested review from danielkberry and scholtzan October 22, 2025 19:59

scholtzan reviewed Oct 23, 2025

danielkberry reviewed Oct 30, 2025

wip: add detailed options for aggregations

3a76256

mikewilli temporarily deployed to GH Actions November 3, 2025 19:38 — with GitHub Actions Inactive

update proposal based on POC testing

67d8075

mikewilli temporarily deployed to GH Actions November 20, 2025 22:05 — with GitHub Actions Inactive

mikewilli changed the title ~~docs: proposal for daily metric aggregations~~ docs: proposal for resolving metrics query failures Nov 20, 2025

clarify reasoning for decision point 1

61bdb6e

mikewilli temporarily deployed to GH Actions November 20, 2025 22:17 — with GitHub Actions Inactive
