-
-
Notifications
You must be signed in to change notification settings - Fork 232
chore(vm): refactor network devices update logic #2260
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
+248
−81
Merged
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
c86c789
fix(vm): detach and delete network device when its config block is re…
hsnprsd 9b91ea8
chore: add acceptance tests
bpg 56e2053
linter is funny
bpg be2fd0c
add test to ensure that network devices are inherited from the templa…
hsnprsd a1e251e
restore `Computed` for the network and disable new tests
bpg File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a failing test for cloned VM scenario @TobiPeterG
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please elaborate why the behaviour is wrong. If I understand it correctly, you create a VM with one network interface and then clone it to a VM without one. I would expect opentofu to remove the network interface from the VM when it's not in the config for the new VM. If you wanted the new VM to have the interface, you would add it in the config. What happens in this case?
More interesting is the question what happens when you clone a VM with multiple network interfaces and configure the cloned VM to also have these interfaces. Are they correctly mapped?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure whether we want to keep the inherited network devices in the cloned VM or we expect the user to specify them. That is not my decision to make as I don't know what the intent was. If network devices should be specified whether or not they are cloned from a source VM, this PR solves the issue of removing extra network devices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue that we want to define the state of the infrastructure through opentofu, everything that is in the opentofu config should be there in reality and everything that is there in reality should be configured through opentofu. As a user, I would be confused when my opentofu config doesn't mention a network interface, but it actually exists on my infrastructure. But I guess it's bpg's call to make.
I would also wonder: How would you remove a network interface from a VM you clone?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do this too. As a result, I'm not really qualified to jump in on this discussion since I don't use cloning, nor have I looked into how the Proxmox API exposes cloning, but I've run into similar minor issues when importing resources from public cloud providers....
Occasionally when importing a public cloud resource, I find that running a plan immediately after results in a proposed "change". Most of the time I don't care and just run apply to commit that setting into state, but other times I find that I don't actually want that setting and I need to edit my Terraform config to explicitly add that setting so I can change it to something else. Since the import succeeded with the config I had written, I've been assuming these minor anomalies are just due to added return values from the API call, or perhaps a mismatch in default values stored in the provider vs. the API vs. the cloud provider's web UI.
To me, Proxmox cloning feels like a server-side import of a VM resource. I wonder if it's worth considering cloning in this way? Ideally, the
cloneblock would just signal to the provider to call the appropriate clone API to initially create the cloned vm, but the end result is a vm resource with all the required config a vm resource requires. Should some config be missing, modified, or added, those differences can either be applied at time of cloning, or on the next apply.I don't use cloning so I'm not really sure how the functionality works, but I suspect this matches how Proxmox works. But from a Terraform provider prospective, I think this inheritance is problematic. It seems tricky for both the provider and humans to reason about inherited config vs. explicit config and how to make changes to inherited config in the future. I've seen this issue come up occasionally in bug reports such as #1988. I think it's worth considering failing a cloned vm resource attempt if the vm resource block doesn't exactly match the source template (or with whatever minor changes that can be easily applied like a change in memory, or dropping a network interface).
Obviously this has massive backwards compatibility problems (and there are certainly nuances I'm unaware of), but I thought I'd throw out my idea that cloning could be considered a server-side resource "import". It certainly adds some friction when creating a cloned resource, but the end result would be a proper explicitly-defined vm resource that behaves like any other non-cloned vm resource.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please elaborate why this shouldn't change? Why wouldn't you just create a opentofu config matching the setup of the cloned VM to have an exact copy?
Aren't explicit removings defined by not specifying an attribute? That's how it work with other providers I use as well. When it's not defined to be there, it's removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That somewhat defeats the purpose of VM cloning and doesn’t match PVE’s behavior. In PVE, if you want to clone a VM, you simply provide a new ID and a target storage, and that’s pretty much it —- a full clone is created with all devices included.
I want to keep this behaviour because a) that's how PVE works, b) that's how the provider has been doin it from the very beginning. If this changes, then it will be a huge backward compatibility problem for a good portion of users out there.
Generally it does, but again, those "special" clone creation and update cases. I want to keep the current behaviour for the reasons mentioned above. My memory is a bit hazy, but when I was first experimenting with this in the VM v2 resource and the Plugin FWK, it seemed that explicitly setting an attribute to an empty or null value was the only reliable way to handle that.
Anyway, looks like this PR opened a can of worms 😅, and this topic probably deserves a separate discussion. I’ll try to run more tests on this PR to see if it breaks any of the existing behaviour, and if it does, to what extent. Then see what we can do to merge it with minimal side effects.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, this makes sense. The question is if the behavior of PVE is compatible with the concepts of terraform. In this case, it seems questionable.
Yes, this is true. But should breaking changes stop "improvements" (general question)? The current versions would continue to work the same way, breaking changes could/should be announced in the release notes. If anyone updates the provider to a new version without checking what changed is at fault for breaking things I think.
Yes, a separate discussion would be great. :)
I would be curious why the clone operation exists in PVE and how/why it's used with Terraform. In my case, for example, I only need the clone operation to get the disk copied from my template to my new VM. If there's a better way to handle this, I'm absolutely open for it. What other use cases would one have with terraform to clone a VM instead of copying the config of the template VM?
If there isn't a better way for disks, maybe it's worth making the clone operation a per-disk operation in terraform (only close a specific disk from another VM) and make the clone operation a completely new resource (e.g. proxmox_virtual_environment_clone) where a user can only specify what to clone and the new name without being able to modify the clone in any other way.
This way, the current clone operation could be removed and there wouldn't be "hidden", breaking changes. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's continue the discussion here: #2324
I tried to summarize the different options, I hope I didn't miss anything. If I did, feel free to add your points and thoughts :)