-
Notifications
You must be signed in to change notification settings - Fork 84
Allow Action router to mutate Actions and implement a remote action router #205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This allows an external service to act as a plugin to the scheduler's action routing mechanism.
|
moroten
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice idea! As the remote action routing is a blocking call, actions can potentially be stuck in the state of getting routed. Do we need another Prometheus histogram for how long actions are in the routing state?
mkosiba
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I think I addressed all of the review feedback with the exception of adding the scope field.
I implemented a simple service that uses this protocol and good news is that even if the remote action router takes minutes, Bazel doesn't time out and fail the build, so I think this is OK as a starting point.
| }, | ||
| []string{"instance_name_prefix", "platform", "size_class", "outcome"}) | ||
|
|
||
| inMemoryBuildQueueRequestRoutingDuration = prometheus.NewHistogramVec( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really think that adding this metric is necessary? Doesn't grpc_client_handled_... already provide this information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I plan on having a separate set of metrics on the remote action, so I'm ok to not to have a separate one.
@moroten, you ok with that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other action router implementations tend to be so primitive that they don't contribute to the execution times in a meaningful way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Grpc client metrics should be fine.
EdSchouten
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! Just some small remaining remarks, and this is good to merge.
| }, | ||
| []string{"instance_name_prefix", "platform", "size_class", "outcome"}) | ||
|
|
||
| inMemoryBuildQueueRequestRoutingDuration = prometheus.NewHistogramVec( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other action router implementations tend to be so primitive that they don't contribute to the execution times in a meaningful way.

No description provided.