[FEAT] add driver/executor pod in Spark #3016
Force-pushed from 2ff8b9a to af03383
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Force-pushed from af03383 to 7793398
Code Review Agent Run #3c7587
Actionable Suggestions - 2
Additional Suggestions - 1
        return MessageToDict(job.to_flyte_idl())

    def to_k8s_pod(self, pod_template: PodTemplate | None, settings: SerializationSettings) -> K8sPod | None:
Consider adding type hints for the return value of _get_container() in the to_k8s_pod() method. The method appears to use this internal method but its return type is not clearly specified in the type hints.
Code suggestion
Check the AI-generated fix before applying
    def to_k8s_pod(self, pod_template: PodTemplate | None, settings: SerializationSettings) -> K8sPod | None:
        from flytekit.models import task as _task_model
        _get_container: Callable[..., _task_model.Container] = self._get_container
Code Review Run #3c7587
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
Take the container with the name set in the driver/executor podTemplate primary_container_name
Signed-off-by: machichima <nary12321@gmail.com>
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #3016       +/-   ##
===========================================
- Coverage   80.01%   46.67%   -33.34%
===========================================
  Files         318      319       +1
  Lines       27075    26695     -380
  Branches     2779     2806      +27
===========================================
- Hits        21663    12461    -9202
- Misses       4647    14123    +9476
+ Partials      765      111     -654
☔ View full report in Codecov by Sentry.
Code Review Agent Run #f512d4
Actionable Suggestions - 0
Exclude those in the podTemplate of spark driver/executor pod
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Code Review Agent Run #27c6ae
Actionable Suggestions - 2
flytekit/core/utils.py
Outdated
if task_type != "spark":
    # for spark driver/executor, do not use the command and args from task podTemplate
    container.command = primary_container.command
    container.args = primary_container.args
Consider extracting the Spark-specific container command/args logic into a separate helper function to improve code organization and readability. The current nested if condition makes the code harder to follow.
Code suggestion
Check the AI-generated fix before applying
- if task_type != "spark":
-     # for spark driver/executor, do not use the command and args from task podTemplate
-     container.command = primary_container.command
-     container.args = primary_container.args
+ if _should_copy_container_command_args(task_type):
+     container.command = primary_container.command
+     container.args = primary_container.args
+
+ def _should_copy_container_command_args(task_type: str) -> bool:
+     # for spark driver/executor, do not use the command and args from task podTemplate
+     return task_type != "spark"
Code Review Run #27c6ae
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
    pod_spec=driver_pod_spec_dict_remove_None,  # type: ignore
)

target_executor_k8sPod = K8sPod(
    metadata=K8sObjectMetadata(
        labels={"lKeyA_e": "lValA", "lKeyB_e": "lValB"},
        annotations={"aKeyA_e": "aValA", "aKeyB_e": "aValB"},
    ),
    pod_spec=executor_pod_spec_dict_remove_None,  # type: ignore
Consider removing the # type: ignore comments and properly typing the pod_spec parameter to match the expected type.
Code suggestion
Check the AI-generated fix before applying
- pod_spec=driver_pod_spec_dict_remove_None, # type: ignore
+ pod_spec=V1PodSpec(**driver_pod_spec_dict_remove_None),
@@ -378,1 +378,1 @@
- pod_spec=executor_pod_spec_dict_remove_None, # type: ignore
+ pod_spec=V1PodSpec(**executor_pod_spec_dict_remove_None),
Code Review Run #27c6ae
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
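The variable names driver_pod_spec_dict_remove_None and executor_pod_spec_dict_remove_None in the test suggest that the expected pod spec is a plain dict with unset (None) entries stripped, which may be why it is passed with a type: ignore rather than as a typed pod spec object. A minimal sketch of such a cleanup step follows. The helper name remove_none and the sample dict are hypothetical and not taken from the PR; the actual tests may build these dicts differently.

```python
from typing import Any


def remove_none(value: Any) -> Any:
    """Recursively drop None entries from nested dicts and lists.

    Hypothetical helper for illustration only; not part of flytekit.
    """
    if isinstance(value, dict):
        return {k: remove_none(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [remove_none(v) for v in value if v is not None]
    return value


# Illustrative pod spec dict with some unset fields.
pod_spec = {
    "containers": [
        {"name": "primary", "image": None, "env": [{"name": "A", "value": "1"}]}
    ],
    "tolerations": None,
}
cleaned = remove_none(pod_spec)
```

Comparing against a cleaned dict keeps the expected value small, since only the fields that were actually set need to be spelled out.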
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Code Review Agent Run #41dd0b
Actionable Suggestions - 1
Signed-off-by: machichima <nary12321@gmail.com>
Force-pushed from c6a8f94 to d6b752b
Signed-off-by: machichima <nary12321@gmail.com>
flytekit/core/utils.py
Outdated
container.command = primary_container.command
container.args = primary_container.args
if task_type != "spark":
We can use this function to create a k8sPod from podTemplate.
flytekit/flytekit/models/task.py
Lines 1079 to 1083 in 2ef875c
Thanks for the information! I switched to this function and removed task_type from _serialize_pod_spec.
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
@machichima any chance we could expand the Spark plugin docs to include your example?
Sure! Is it ok to add the pod_template settings into the existing
…-driver-executor-podtemplate
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Force-pushed from 01bc98a to 7f4e00b
The docs are updated here: flyteorg/flytesnacks#1782
        return MessageToDict(job.to_flyte_idl())

    def to_k8s_pod(self, pod_template: Optional[PodTemplate] = None) -> Optional[K8sPod]:
I use this function because we also need to check whether the primary container name set in the driver/executor is the same as in the task; if not, a warning should be raised.
In the end, this function uses from_pod_template to create the K8sPod.
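The check described in this comment can be sketched roughly as follows. This is a standalone illustration using only the standard library: Container, PodTemplateSketch, and pick_primary_container are hypothetical stand-ins, not flytekit classes or the PR's actual to_k8s_pod implementation, and the warning text is made up.

```python
import warnings
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Container:
    # Hypothetical stand-in for a Kubernetes container spec.
    name: str


@dataclass
class PodTemplateSketch:
    # Hypothetical stand-in for a pod template; only the fields used here.
    primary_container_name: str = "primary"
    containers: List[Container] = field(default_factory=list)


def pick_primary_container(
    pod_template: Optional[PodTemplateSketch], task_primary_name: str
) -> Optional[Container]:
    """Return the container named by the template's primary_container_name.

    Warns when the driver/executor template names a different primary
    container than the task does.
    """
    if pod_template is None:
        return None
    if pod_template.primary_container_name != task_primary_name:
        warnings.warn(
            f"Primary container name '{pod_template.primary_container_name}' "
            f"differs from the task's '{task_primary_name}'."
        )
    for container in pod_template.containers:
        if container.name == pod_template.primary_container_name:
            return container
    return None


template = PodTemplateSketch(
    primary_container_name="driver",
    containers=[Container("sidecar"), Container("driver")],
)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    chosen = pick_primary_container(template, task_primary_name="primary")
```

Warning instead of raising lets a driver or executor template use its own container name while still surfacing the mismatch to the user.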
# @pytest.fixture(scope="function")
# def reset_spark_session() -> None:
#     pyspark.sql.SparkSession.builder.getOrCreate().stop()
#     yield
#     pyspark.sql.SparkSession.builder.getOrCreate().stop()
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: Atharva <atharvakulkarni172003@gmail.com>
Tracking issue
Related to flyteorg/flyte#4105
Why are the changes needed?
This PR updates the flytekit-spark package so that the driver pod and executor pod can be configured separately using PodTemplate. It also allows the driver/executor pods to set their own primary_container_name, independent of the task podTemplate.
What changes were proposed in this pull request?
Add driver_pod and executor_pod fields of type PodTemplate to SparkJob.
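The shape of this change can be illustrated with a small standalone model. This is a sketch under stated assumptions, not the flytekit implementation: PodTemplateSketch and SparkJobSketch are hypothetical stand-ins built from plain dataclasses, and the serialized key names (sparkConf, driverPod, executorPod) are chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class PodTemplateSketch:
    # Stand-in for a pod template, reduced to what this sketch needs.
    primary_container_name: str = "primary"
    labels: Dict[str, str] = field(default_factory=dict)
    tolerations: List[Dict[str, str]] = field(default_factory=list)


@dataclass
class SparkJobSketch:
    # Stand-in for a Spark job model: shared settings plus optional per-role pods.
    spark_conf: Dict[str, str] = field(default_factory=dict)
    driver_pod: Optional[PodTemplateSketch] = None
    executor_pod: Optional[PodTemplateSketch] = None

    def to_dict(self) -> Dict[str, Any]:
        """Serialize, including only the pod templates that were set."""
        out: Dict[str, Any] = {"sparkConf": dict(self.spark_conf)}
        if self.driver_pod is not None:
            out["driverPod"] = vars(self.driver_pod).copy()
        if self.executor_pod is not None:
            out["executorPod"] = vars(self.executor_pod).copy()
        return out


job = SparkJobSketch(
    spark_conf={"spark.executor.instances": "2"},
    driver_pod=PodTemplateSketch(
        primary_container_name="driver", labels={"role": "driver"}
    ),
    executor_pod=PodTemplateSketch(
        primary_container_name="executor",
        tolerations=[{"key": "spot", "operator": "Exists", "effect": "NoSchedule"}],
    ),
)
serialized = job.to_dict()
```

Keeping both fields optional means existing Spark tasks that set neither template serialize exactly as before.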
How was this patch tested?
- test_spark_driver_executor_podSpec
- The @task for the hello_spark function in the my_spark example here was configured as follows to set the driver_pod and executor_pod.
- Verify the pods have Tolerations and EnvVar set.
Setup process
Screenshots
Check all the applicable boxes
Related PRs
flyteorg/flyte#6085
Docs link
Summary by Bito
Enhanced flytekit-spark package by implementing configurable driver and executor pod support through PodTemplate. Added driver_pod and executor_pod fields to SparkJob model with primary_only flag for pod spec serialization. The implementation includes type hint updates from K8sPod to PodTemplate, parameter order modifications, and improved SparkSession cleanup in tests. This enables granular control and customization of labels, annotations, containers, and tolerations for both driver and executor pods.
Unit tests added: True
Estimated effort to review (1-5, lower is better): 2