feat: Add pipeline run parallelism config #12442
base: master
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment.

Hi @sduvvuri1603. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Force-pushed 99f2fc8 to d34a1b2.

/retest

Force-pushed 82756e1 to 60a35d8.
@sduvvuri1603: Cannot trigger testing until a trusted user reviews the PR and leaves an `/ok-to-test` message. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

/ok-to-test

/retest
Signed-off-by: sduvvuri1603 <sduvvuri@redhat.com>

Force-pushed 60a35d8 to 39fb3dd.

Force-pushed 39fb3dd to c587b03.
> Consult the [Python SDK reference docs](https://kubeflow-pipelines.readthedocs.io/en/stable/) when writing pipelines using the Python SDK.
> New in master: `dsl.PipelineConfig` now accepts an optional `pipeline_run_parallelism` integer to cap concurrent task execution for a run. The backend stores the requested limit in a shared ConfigMap and surfaces it to Argo Workflows via `spec.parallelism`.
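For intuition, a run-level parallelism cap behaves like a counting semaphore around task execution. The sketch below is plain Python (not KFP or Argo code) simulating a cap of 2 across five tasks with threads, then recording the peak number of tasks that ran at once:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

CAP = 2                       # analogous to pipeline_run_parallelism
gate = threading.Semaphore(CAP)
lock = threading.Lock()
running = 0
peak = 0

def task(i: int) -> None:
    global running, peak
    with gate:                # at most CAP tasks hold the semaphore at once
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.05)      # simulated work
        with lock:
            running -= 1

# Five tasks are eligible to run, but only CAP execute concurrently.
with ThreadPoolExecutor(max_workers=5) as pool:
    list(pool.map(task, range(5)))

print(peak)  # observed peak concurrency; never exceeds CAP
```

This is only an analogy for what `spec.parallelism` does on the Argo side; the real enforcement happens in the workflow controller, not in user code.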
It may be more appropriate to add this entry to the CHANGELOG.
Sure, but will this be part of a new section called "Unreleased Features"? Because I only see version release details in the file.
I believe the PR will be included here as part of the release process. @mprahl , could you confirm if that’s correct?
Signed-off-by: Sruthi Duvvuri <sduvvuri@redhat.com>
Signed-off-by: sduvvuri1603 <sduvvuri@redhat.com>
```python
@pipeline_run_parallelism.setter
def pipeline_run_parallelism(self, value: Optional[int]) -> None:  # pylint: disable=attribute-defined-outside-init
    if value is None:
```
Is this required, given that https://github.com/kubeflow/pipelines/pull/12442/files#diff-631f096829954a5226e87a37347ab04d3e551465fdf0371b37fd9f9363c0c156R120 will set it to None if value is None?
Yes, you're right about the serialization part. But we need this guard to prevent a crash during initialization: since `__init__` passes None by default, removing this check would cause it to hit the isinstance line and fail immediately. This just ensures we can safely create the object with no value set.
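As a standalone illustration of the point above, here is a minimal sketch (a toy stand-in, not the actual PR code) showing why the None guard must come before the isinstance check:

```python
from typing import Optional


class PipelineConfigSketch:
    """Toy stand-in for dsl.PipelineConfig to show the setter guard."""

    def __init__(self, pipeline_run_parallelism: Optional[int] = None):
        # __init__ passes None by default, so the setter must accept None.
        self.pipeline_run_parallelism = pipeline_run_parallelism

    @property
    def pipeline_run_parallelism(self) -> Optional[int]:
        return self._pipeline_run_parallelism

    @pipeline_run_parallelism.setter
    def pipeline_run_parallelism(self, value: Optional[int]) -> None:
        if value is None:
            # Early return: construction with no value must not reach
            # the isinstance check below.
            self._pipeline_run_parallelism = None
            return
        if not isinstance(value, int):
            raise ValueError(
                'pipeline_run_parallelism must be an integer if specified.')
        if value <= 0:
            raise ValueError(
                'pipeline_run_parallelism must be a positive integer.')
        self._pipeline_run_parallelism = value


cfg = PipelineConfigSketch()       # OK: no value set, no crash
cfg.pipeline_run_parallelism = 3   # OK: positive integer accepted
```

Without the early return, `PipelineConfigSketch()` would raise on the isinstance line, which is the crash described above.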
I guess then you can just add one if statement:

```python
if value:
    if not isinstance(value, int):
        raise ValueError(
            'pipeline_run_parallelism must be an integer if specified.')
    if value <= 0:
        raise ValueError(
            'pipeline_run_parallelism must be a positive integer.')
self._pipeline_run_parallelism = value
```
Updated code with suggested change
```python
name="pipeline-with-workspace",
description="A pipeline that demonstrates workspace functionality",
pipeline_config=dsl.PipelineConfig(
    pipeline_run_parallelism=3,
```
Can we actually set this to None here if we have an explicit test for positive values?
Yep, that makes sense. Will set this to None.
@sduvvuri1603 can you please add what this config is supposed to do to the PR description? And a section about how you've validated the functionality.
```python
@dsl.pipeline(
    name='pipeline-with-run-parallelism',
    pipeline_config=dsl.PipelineConfig(pipeline_run_parallelism=7),
```
Isn't 7 too high when the number of tasks in this pipeline is just 1? Maybe you should add more components to it, or add a ParallelFor loop that iterates over more constants than pipeline_run_parallelism, so that we can validate that the config actually works.
Also, what validation logic did you add to confirm the number of tasks created for a pipeline with this config?
This specific test case is part of the SDK compilation suite to verify that the pipeline_run_parallelism field is correctly serialized from the Python SDK into the compiled YAML's PlatformSpec. It relies on the golden-file comparison for validation here (ensuring the YAML contains pipelineRunParallelism: 7, correctly populated).
(It is not related to the actual runtime limit, which is covered by the backend integration tests where we submit these workflows to Argo, as I understand it.) Please let me know if this is correct!
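For reference, the golden-file check described above looks for the serialized field in the compiled YAML's platform spec. A hypothetical fragment is sketched below; only the `pipelineRunParallelism: 7` key/value is confirmed by this thread, and the surrounding nesting is an assumption that may differ from the real compiler output:

```yaml
# Hypothetical sketch: the real compiler output's nesting may differ.
platformSpec:
  platforms:
    kubernetes:
      pipelineConfig:
        pipelineRunParallelism: 7
```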
Any pipeline YAML file in this directory will be part of the end-to-end tests, so yes, the workflow will get submitted to Argo.
Updated the code with a ParallelFor loop and reduced pipeline_run_parallelism to 2.
Signed-off-by: sduvvuri1603 <sduvvuri@redhat.com>

Force-pushed 5edaa23 to dcbc311.
nsingla left a comment:
/lgtm
Summary

- Adds a `pipeline_run_parallelism` option on `dsl.PipelineConfig`. This lets the API server own the Argo semaphore lifecycles instead of expecting users to edit shared ConfigMaps, eliminating a Kubernetes-heavy workflow and ensuring keys align to `<pipeline>/<version>`.
- Surfaces the requested limit to Argo Workflows via `spec.parallelism`.
- Adds a `pipeline_with_run_parallelism` sample (three-item `ParallelFor`) to exercise the setting while leaving the workspace fixture focused on workspace behaviour.

Validation

- Added a check in `e2e_utils.go` (plus an invocation in `pipeline_e2e_test.go`), rebuilt the test cluster with the fresh backend images, exercised the focused `pipeline_run_parallelism` scenario, and then ran the end-to-end suite to confirm the new check passes with the concurrency cap enforced.

Follow-up to PR: remove unused `semaphore_key` and `mutex_name` fields.