[WIP] Proof of Concept: Automated DE VM Setup with terraform + ansible #58
…ngineering-setup into lorcanrae/automated-setup
…re redundancy to windows instructions
krokrob
left a comment
Impressive PoC @lorcanrae ! I won't review the code or the guidelines here. Here are my general comments:
- As you mentioned in the Drawbacks section, this new approach makes the setup very abstract. As aspiring Data Engineers, students would want to understand their setup environment, don't you think? For our target students, the benefits of a lighter, more straightforward setup may not outweigh the satisfaction of building a VM setup from scratch.
- That being said, we could use this precious content for 2 use-cases:
  - An additional setup version for students who are really struggling with the setup (maybe a 🔴), OR to recreate the setup faster in case of an incident (computer crash, misconfigured VM, ...), AND for alumni who want a fresh start
  - Let's make it a challenge in the future Terraform unit! It is a great opportunity to apply the pedagogical approach we love at Le Wagon: build it the hard way to understand what is under the hood, then build it faster with a dedicated framework
Happy to discuss
Hi @krokrob! Thank you for your patience, I've been held up on some other projects 🙏
This is a tricky one, and I think it depends on the technical level of the student. Some students absolutely appreciate understanding their dev environment, but I would put them in the minority. For other students, I think there is very little understanding; they are just copying and pasting code blocks. There is the excitement of the day, and we don't explicitly go through the creation of a VM in the lecture. With that much copying and pasting, there is a higher chance of introducing variance into the setup. My primary concern is students who do not finish the setup on Setup Day, so that it drags into the following sessions, putting a strain on teaching resources and impacting students' learning experience.
I think this could be the way to go. If we're moving the Terraform module to earlier in the program, adding this as a challenge or optional challenge seems like a good fit. I recommend that we push these to a different branch (like WDYT?
Hi @lorcanrae, thanks for your reply. Let's keep it in a dedicated branch 👌. How would you plan to propose this alternative automatic setup to students? During the first lecture?
Great question. Not entirely sure. I think it would be more indirect: batch managers communicating it in the first session, plus an additional info challenge on Kitt that goes through the components that were installed and links to the more manual setup. Maybe also through updated marketing info (if this was implemented in the future)? I don't think it's worth changing the Setup Day Lecture.
Proof of Concept: Automated DE VM Setup with Terraform and Ansible
The idea is to automate as much of the DE Setup as possible, ensuring that students are fully set up on Setup Day and that the setup does not linger into the second unit (or later).
Students do the program on a GCP Virtual Machine, providing a controllable environment that lends itself to automation.
Terraform is used to provision the VM, and Ansible is used to configure most of the VM.
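As a rough illustration of the provision-then-configure flow, here is a minimal Python sketch that only assembles the CLI calls in order and prints them by default. The directory and file names (`infra`, `playbook.yml`, `inventory.ini`) and both function names are hypothetical and not taken from this PR; the actual repository layout and invocation may differ.

```python
import shlex
import subprocess
from pathlib import Path


def build_setup_commands(terraform_dir: Path, playbook: Path, inventory: Path) -> list[list[str]]:
    """Return the CLI calls in the order they would run: provision, then configure."""
    return [
        # Initialise the Terraform working directory (downloads providers).
        ["terraform", f"-chdir={terraform_dir}", "init"],
        # Create the VM without the interactive confirmation prompt.
        ["terraform", f"-chdir={terraform_dir}", "apply", "-auto-approve"],
        # Configure the provisioned VM with an Ansible playbook.
        ["ansible-playbook", "-i", str(inventory), str(playbook)],
    ]


def run_setup(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print("$", shlex.join(argv))
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    run_setup(build_setup_commands(Path("infra"), Path("playbook.yml"), Path("inventory.ini")))
```

Keeping the dry run as the default means the sequence can be inspected before anything is created in GCP.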
Open to comment from anyone.
Benefits
- `gcloud compute config-ssh` to generate the keys and VS Code Remote - SSH config.
- `gcloud auth application-default login` for ADC credentials. Amend challenges for a Service Account and Key creation applying the principle of least privilege.
Drawbacks
- `0101-Setup-Day` or the Project module.
- `terraform.exe` to `PATH`. Not that bad, but annoying.
Setup Process
The setup process has three main sections:
- `gcloud` install
- `terraform` install
- `terraform` to provision VM
- `gcloud` and ADC
- `.zshrc`, `.direnvrc`), fork `data-engineering-challenges`, add upstream remote
Testing
Testing and feedback would be appreciated! Some of the local Windows parts may need fine-tuning.
✔️ Tested Linux setup
✔️ Tested Windows setup
Discussion Points
Currently configured the VM with a static external IP. There are students that change computer or user for whatever reason during the bootcamp. I've tested modifying the config generated by `gcloud compute config-ssh` with a different user and it works fine. With an ephemeral external IP, this has its drawbacks:
- `gcloud` won't remove the old ssh-config block if it has been modified, and the list of ssh-hosts would keep growing with an ephemeral external IP.
- `gcloud compute config-ssh` and manually editing the user every time would be a pain
Dropping the disk size to 100 GB. Should be enough, 150 GB is excessive and it's easy to increase disk size.
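To make the user-editing step from the static IP discussion concrete, here is a minimal Python sketch that sets the `User` entry of one `Host` block in an SSH config. The function name `set_ssh_user`, the host alias, and the sample block are hypothetical; the sample only approximates what `gcloud compute config-ssh` writes and has not been checked against its actual output.

```python
import re

# Illustrative config only; the alias and IP address are made up.
SAMPLE = (
    "Host de-vm.europe-west1-b.my-project\n"
    "    HostName 203.0.113.10\n"
    "    IdentityFile ~/.ssh/google_compute_engine\n"
    "    UserKnownHostsFile=~/.ssh/google_compute_known_hosts\n"
    "\n"
    "Host other\n"
    "    User keepme\n"
)


def set_ssh_user(config_text: str, host: str, user: str) -> str:
    """Return config_text with `User <user>` set inside the `Host <host>` block."""
    out = []
    in_block = False  # True while we are inside the target Host block
    for line in config_text.splitlines():
        stripped = line.strip()
        host_line = re.match(r"(?i)host\s+(.+)", stripped)
        if host_line:
            # Every Host line starts a new block; check whether it is ours.
            in_block = host in host_line.group(1).split()
            out.append(line)
            if in_block:
                # Write the new User directly under the Host line.
                out.append(f"    User {user}")
            continue
        if in_block and re.match(r"(?i)user[\s=]", stripped):
            # Drop the old User line; `UserKnownHostsFile` does not match.
            continue
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(set_ssh_user(SAMPLE, "de-vm.europe-west1-b.my-project", "new_user"))
```

Because the old `User` line is dropped before the new one is written, running the function twice with the same arguments leaves the file unchanged.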
TODO