-
Notifications
You must be signed in to change notification settings - Fork 529
CNTRLPLANE-2120: Add KMS foundations in encryption controllers in library-go #1900
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
@ardaguclu: This pull request references CNTRLPLANE-2120 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
c091cbc to
f734b05
Compare
flavianmissi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have to take a short break from reviewing, but leaving the comments I got so far.
|
/cc @ibihim @flavianmissi |
1794054 to
1ddc3d8
Compare
|
@flavianmissi I was uncomfortable about the disconnects between the sections and the verbosity. So I overhauled the EP to have better clarity. Please let me know your thoughts. |
e920a9c to
5804b76
Compare
f39a0d7 to
8f79ed6
Compare
|
/lgtm |
|
/cc @benluddy |
|
As we agreed with @flavianmissi, in next iterations there will be another condition to notify users to delete unused kms plugins from cluster, when prune_controller prunes them. |
ibihim
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work. I have some questions though as a beginner to downstream e2ee with kms
|
|
||
| **keyController** manages encryption key lifecycle. Creates encryption key secrets in `openshift-config-managed` namespace. For KMS mode, creates empty secrets with KMS configuration hashes. | ||
|
|
||
| **stateController** generates EncryptionConfiguration for API server consumption. Implements distributed state machine ensuring all API servers converge to same revision. For KMS mode, generates configuration with deterministic Unix socket paths. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is a deterministic Unix Socket path? Are there probabilistic paths?
Why do we want this? To identify if the KMS is out of date and the address changed? To run several KMS plugins somehow in parallel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To run several KMS plugins somehow in parallel?
This is one of the reasons. Deterministic implies that plugin lifecycle runs kms plugin with a unix socket that can be generated by the encryption controllers to communicate with (plugin lifecycle and encryption controllers are not organically related). So that this works as a contract that provides a deterministic communication.
However, since we likely decide to only support external kms plugins. We won't need this functionality. I proposed this openshift/api#2622 API definition to directly use whatever user sets. Thus, this unix socket generation logic will be removed from the EP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't the socket path observed by an apiserver be chosen by the operator that mounts it as a host volume? In that case, the endpoints in the apiserver-facing config file wouldn't be 1:1 with user-provided absolute paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the EP to directly use unix domain socket path from apiserver.config.openshift.io. This field will be set by cluster admin and we'll directly use it. There isn't any unix domain socket generation mechanism any more.
|
I'll update this EP base on the changes in openshift/api#2622 |
|
|
||
| **keyController** manages encryption key lifecycle. Creates encryption key secrets in `openshift-config-managed` namespace. For KMS mode, creates empty secrets with KMS configuration hashes. | ||
|
|
||
| **stateController** generates EncryptionConfiguration for API server consumption. Implements distributed state machine ensuring all API servers converge to same revision. For KMS mode, generates configuration with deterministic Unix socket paths. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't the socket path observed by an apiserver be chosen by the operator that mounts it as a host volume? In that case, the endpoints in the apiserver-facing config file wouldn't be 1:1 with user-provided absolute paths.
|
|
||
| **KMS** is the external Key Management Service (AWS KMS, HashiCorp Vault, etc.) that stores and manages the Key Encryption Key (KEK). | ||
|
|
||
| **KMS plugin** is a gRPC service implementing Kubernetes KMS v2 API, running as a sidecar to API server pods. It communicates with the external KMS to encrypt/decrypt data encryption keys (DEKs). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't imagine plugins will run as true sidecar containers anymore, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct. I've updated the EP based on the last discussions. Please let me know your thoughts about this.
|
|
||
| #### Variation: Configuration Changes (Key Rotation) | ||
|
|
||
| When cluster admin updates KMS configuration (e.g., new key ARN, different region): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't key_id -- or a hash of key_id -- sufficient by itself to consider previously-encrypted resources "stale"? I would have expected any significant change to a KMS provider config to result in a new key_id (and insignificant changes would not).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't key_id -- or a hash of key_id -- sufficient by itself to consider previously-encrypted resources "stale"?
There are 2 dimensions. First one is to detect key rotation in the same KMS Plugin (we'll track key_id to detect this). Another one is to migrate to new KMS Plugin from old one (we track unix domain socket path to detect this).
| 2. Compares new hash with hash in most recent encryption key secret annotation. | ||
| 3. If hashes differ: | ||
| - Creates new encryption key secret with new hash | ||
| - migrationController automatically triggers re-encryption |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any mechanism to provide backpressure in a situation where a KMS provider is rotating key_id very quickly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good point. I don't think we have any, since this is entirely managed externally. But in the future maybe we can provide a new degraded condition by detecting the frequency (I'm not sure how). cc: @flavianmissi
|
New changes are detected. LGTM label has been removed. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@ardaguclu: This pull request references CNTRLPLANE-2120 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/hold cancel /cc @flavianmissi @ibihim @benluddy could you PTAL?. Thank you |
|
@ardaguclu: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
Plugin lifecycle management decision may have an impact on this EP. So I'm adding hold until the decision is made |
This PR is based on #1872 (changes in
enhancements/kube-apiserver/kms-encryption-foundations.md).There are many aspects that need to be implemented to support KMS in OpenShift. We have decided to open more granular EPs to better track the work.
This EPs main aim is to focus on the encryption controller changes in library-go. This EP defers some concepts to future in order to start with simpler, manageable iterations.