docs: proposal for resolving metrics query failures #2810
Conversation
> ```
> ORDER BY client_level_daily_active_users_v2 DESC
> ```
> - If the above does not cover what we need, then some alternative options:
>   - add a new metric-hub parameter for metrics to specify how Jetstream should aggregate across days
What about metrics that just can't really be aggregated across days, like `COUNT DISTINCT`?
Yeah, there's definitely some loss of fidelity for certain metrics with the proposed approach. As Daniel suggested, I'll flesh out the various options here so we can decide whether this is the best approach and what the alternatives are.
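For illustration, here's a minimal sketch of the issue and one possible mitigation using BigQuery's HLL sketch functions. The table and column names (`raw_events`, `search_engine`) are hypothetical stand-ins, not the actual schema:

```sql
-- A daily COUNT DISTINCT cannot simply be SUMmed across days:
-- an item counted on two different days would be counted twice.
-- One possible mitigation is to store a per-client HLL sketch per day
-- and merge the sketches across days, at the cost of exactness.

-- Daily step (hypothetical source table and columns):
CREATE TEMP TABLE daily_sketches AS
SELECT
  submission_date,
  client_id,
  HLL_COUNT.INIT(search_engine) AS distinct_engines_sketch
FROM `project.dataset.raw_events`
GROUP BY submission_date, client_id;

-- Weekly/overall step: merge the daily sketches instead of re-querying raw events.
SELECT
  client_id,
  HLL_COUNT.MERGE(distinct_engines_sketch) AS approx_distinct_engines
FROM daily_sketches
GROUP BY client_id;
```

The trade-off is that the merged count is approximate, so exact distinct counts would still require querying the full period directly.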
> - `MIN` enrollment and exposure dates is not necessary since we already do this when building enrollments
> - `SUM` enrollment and exposure events
> - `LOGICAL_OR` for boolean metrics
> - `SUM` for int metrics
I think this section needs to be more fleshed out. IMO, this is really the meat of the proposal so we should align on what options we'd consider, what their pros/cons are, etc. before we can make a decision.
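To make the options above easier to compare, here is a minimal sketch of the daily-to-period rollup under the rules quoted above. The table and metric column names (`experiment_daily_metrics`, `uri_count`, `used_feature_x`, etc.) are hypothetical, not the actual Jetstream schema:

```sql
-- Roll the daily results up to one row per client for an analysis period,
-- applying the per-type aggregations listed above.
SELECT
  client_id,
  ANY_VALUE(enrollment_date)   AS enrollment_date,       -- already MIN'ed when building enrollments
  SUM(num_enrollment_events)   AS num_enrollment_events, -- SUM event counts
  LOGICAL_OR(used_feature_x)   AS used_feature_x,        -- LOGICAL_OR for boolean metrics
  SUM(uri_count)               AS uri_count              -- SUM for int metrics
FROM `project.dataset.experiment_daily_metrics`          -- hypothetical daily results table
WHERE submission_date BETWEEN @period_start AND @period_end  -- period bounds as query parameters
GROUP BY client_id;
```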
> - retain the original column names in the daily metrics query output, and then just run the original metric `select_expression` against the daily results tables
> - possibly store all daily values in a histogram field so that we can do arbitrary aggregations/statistics without loss of fidelity
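For the last option above, here's a minimal sketch of keeping every daily value per client so period-level statistics can be recomputed later without re-querying telemetry (again with hypothetical table and column names):

```sql
-- Keep each client's daily values in an array so that arbitrary
-- period-level aggregations/statistics can be computed later
-- without touching the underlying telemetry tables.
WITH per_client AS (
  SELECT
    client_id,
    ARRAY_AGG(
      STRUCT(submission_date, active_hours)
      ORDER BY submission_date
    ) AS daily_active_hours
  FROM `project.dataset.experiment_daily_metrics`  -- hypothetical daily results table
  GROUP BY client_id
)
-- Arbitrary aggregations can then run against the stored arrays, e.g. a period total:
SELECT
  client_id,
  (SELECT SUM(d.active_hours) FROM UNNEST(daily_active_hours) AS d) AS total_active_hours
FROM per_client;
```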
> 2. How to handle pre-enrollment analysis periods?
Off the cuff, I would expect pre-experiment periods to work the same as all other periods (aggregate to day level, then aggregate to pre-enrollment period level).
I think the biggest differences here are:
- these aren't critical for results availability
- we'd need to run all of these daily queries on the first day of analysis, whereas for post-enrollment periods we typically already have those computed
That said, I agree that it would be a little odd for these to have a special workflow. I'll update the proposal to clarify the proposed route for this though (likely by making them work the same as other periods).
This doc describes an approach to resolving BigQuery resource/timeout errors stemming from the metrics query. The proposed approach is two-phased: first we will work to resolve issues with discrete metrics and monitor the results to determine whether further action is necessary. If it is, we will follow up with a second phase that aggregates weekly and overall metrics from daily metrics, which would reduce the amount of data being queried and eliminate the need to re-query telemetry tables for weekly and overall periods.
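As a rough sketch of that second phase, and assuming a hypothetical daily results table keyed by client and submission date, weekly results could then be derived from the daily rows rather than by re-scanning the telemetry tables:

```sql
-- Phase 2 sketch: build weekly results from the daily results table,
-- instead of re-scanning the raw telemetry tables for every weekly period.
SELECT
  client_id,
  DIV(DATE_DIFF(submission_date, enrollment_date, DAY), 7) AS analysis_window_week,
  SUM(uri_count)             AS uri_count,
  LOGICAL_OR(used_feature_x) AS used_feature_x
FROM `project.dataset.experiment_daily_metrics`  -- hypothetical daily results table
GROUP BY client_id, analysis_window_week;
```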