chore: Deprecate make gke.
#2253
base: main
Conversation
sean-rose
left a comment
The README.md file will also need to be updated appropriately (though if you agree with my last comment, I could update the readme as part of my changes).
This script seems to be unrelated to the usage of the moz-fx-data-gke-sandbox project. Is there a particular reason why you're deleting it as well?
Tech debt removal: I've asked many DENG over the years, and none ever used this script.
The readme mentions using this script for testing Dataproc jobs, and I seem to recall at least trying to test some Dataproc tasks locally when doing QA for Airflow upgrades (though I don't think I managed to get it fully working at that time).
In any case, since this is unrelated to GKE and we do still have Dataproc tasks in active DAGs, I don't think this script should be removed in this PR.
At the end of yesterday's Data Infra WG meeting @akkomar suggested that these GKE scripts could be repurposed to facilitate running Airflow local dev workloads in our own personal dev projects, which sounds like a reasonable approach to me to preserve the option to have a quicker Airflow dev process (at the cost of having to configure our own personal dev projects to allow this to work).
If you agree that would be reasonable, I can contribute the necessary changes to this PR (e.g. having the scripts take a project ID argument; though since it looks tricky to pass arbitrary arguments through make we'd probably still want to remove those targets and have people run these scripts directly).
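As a rough illustration of the project-ID idea (a sketch only, not the actual scripts in this repository), a script run directly rather than through make could take the project ID as a positional argument. The function names, the default cluster name, and the default zone below are hypothetical; the `gcloud container clusters create` command and its `--project` and `--zone` flags are standard gcloud options.

```python
import argparse
import shlex


def build_create_cluster_command(project_id: str, cluster_name: str, zone: str) -> list[str]:
    """Build the gcloud argv that would create a GKE cluster in the given project.

    The command is only constructed here, not executed.
    """
    return [
        "gcloud", "container", "clusters", "create", cluster_name,
        "--project", project_id,
        "--zone", zone,
    ]


def parse_args(argv=None) -> argparse.Namespace:
    """Parse CLI arguments; the project ID is required and positional."""
    parser = argparse.ArgumentParser(
        description="Create a GKE cluster in a personal dev project (illustrative)."
    )
    parser.add_argument("project_id", help="GCP project ID to create the cluster in")
    # Hypothetical defaults, chosen only for this example.
    parser.add_argument("--cluster-name", default="airflow-dev")
    parser.add_argument("--zone", default="us-west1-a")
    return parser.parse_args(argv)


def render_command(argv=None) -> str:
    """Return the shell-quoted command line for the parsed arguments."""
    args = parse_args(argv)
    return shlex.join(
        build_create_cluster_command(args.project_id, args.cluster_name, args.zone)
    )
```

Calling `render_command(["my-dev-project"])` returns the command string for review before anything is run, which sidesteps the problem of passing arbitrary arguments through make.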
> running Airflow local dev workloads in our own personal dev projects
This would require each sandbox project to have a GKE cluster with Workload Identity configured with various GCP permissions (e.g. BQ, GCS, GAR, SQL). It would also make each developer responsible for cleaning up unused resources. Mozcloud lacks budget monitoring for sandbox projects, so the cost of unused resources in those projects is hard to track. This is why we had make gke create resources in a centralized project with a k8s cron job dedicated to cleaning up unused clusters.
For those reasons, I recommend against the solution proposed by @akkomar.
If you want to re-enable the feature being removed by this PR, I'd recommend building something similar but in the supported mozcloud platform (i.e. GCPv2).
I liked the suggestion you made in Slack about potentially setting up a shared GKE cluster in a new moz-fx-data-airflow-gke-dev project where developers could run local Airflow instance GKE tasks (potentially in user-specific namespaces), and I've filed DENG-9749 "Come up with new solution for telemetry-airflow devs to run GKE tasks from local Airflow instances", mentioning that idea plus the original create-GCPv2-GKE-sandbox-project idea.
In the meantime I'm OK with you proceeding with this PR since the GKE scripts no longer work as is.
However, I have squirreled away revised versions of the GKE scripts in the GKE-sandbox-config branch just in case someone like me or @gleonard-m ends up needing to resort to using a custom GKE sandbox setup.
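For the user-specific namespaces idea, one possible approach (purely a sketch; nothing here is taken from DENG-9749 or the GKE-sandbox-config branch) is to derive each namespace from the developer's username. Kubernetes namespace names must be valid DNS labels: at most 63 characters, lowercase alphanumerics or `-`, starting and ending with an alphanumeric character. The function name and the `airflow-dev` prefix are hypothetical.

```python
import re

# Kubernetes namespace names are DNS labels, limited to 63 characters.
MAX_NAMESPACE_LENGTH = 63


def namespace_for_user(username: str, prefix: str = "airflow-dev") -> str:
    """Derive a DNS-label-safe namespace name from a username.

    Runs of characters outside [a-z0-9] are collapsed to a single '-',
    and the result is truncated to the Kubernetes length limit.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", username.lower()).strip("-")
    if not slug:
        raise ValueError(f"username {username!r} has no usable characters")
    # Truncate, then strip any trailing '-' left behind by the cut.
    return f"{prefix}-{slug}"[:MAX_NAMESPACE_LENGTH].rstrip("-")
```

For example, `namespace_for_user("J.Doe@example.com")` yields `airflow-dev-j-doe-example-com`. Two distinct usernames can still collapse to the same name after sanitizing, so a real setup would need to handle collisions.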
sean-rose
left a comment
The readme still needs to be updated.
@mikaeld this still needs to be completed, as the readme currently references commands like |
Description
We are deprecating infrastructure related to the temporary provisioning of GKE clusters for Airflow DAG development.
Related Tickets & Documents