Skip to content

Conversation

@dobsonj
Copy link
Member

@dobsonj dobsonj commented Oct 22, 2025

What type of PR is this?

/kind documentation

(driver install documentation + deploy scripts)

What this PR does / why we need it:

  • scope roles/iam.serviceAccountUser to node service accounts

gce-pd-csi-driver requires roles/iam.serviceAccountUser to impersonate node service accounts to attach and detach disks. However, the current driver install instructions and deployment scripts recommend granting this role at the project level, which allows it to impersonate any service account in the project.

Google Security Health Analytics recommends NOT to scope this role at the project level, and some users are sensitive to this and must follow the principle of least privilege when deploying the CSI driver.

This PR introduces an explanation in the driver install instructions, along with an optional NODE_SERVICE_ACCOUNTS environment variable to specify node service accounts that the CSI driver may impersonate. If NODE_SERVICE_ACCOUNTS is set, then it scopes roles/iam.serviceAccountUser just to those provided service accounts instead of granting it at the project level.

There are two additional commits to fix issues I encountered while testing:

  • add-iam-policy-binding should set condition=None to avoid prompt
  • remove invalid --enable-multitenancy option from deployments

Testing: manually tested driver installation using ./deploy/setup-project.sh and ./deploy/kubernetes/deploy-driver.sh, with and without NODE_SERVICE_ACCOUNTS defined, and confirmed the driver can provision and attach volumes when roles/iam.serviceAccountUser is restricted just to the provided node service accounts.

Which issue(s) this PR fixes:

Fixes #1700

Special notes for your reviewer:

/cc @msau42 @mattcary

Does this PR introduce a user-facing change?:

Reduce scope of roles/iam.serviceAccountUser to specified node service accounts during driver installation

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/documentation Categorizes issue or PR as related to documentation. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 22, 2025
@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Oct 22, 2025

@dobsonj: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-gcp-compute-persistent-disk-csi-driver-e2e-arm 7ef4a52 link false /test pull-gcp-compute-persistent-disk-csi-driver-e2e-arm
pull-gcp-compute-persistent-disk-csi-driver-e2e-windows-2022 7ef4a52 link false /test pull-gcp-compute-persistent-disk-csi-driver-e2e-windows-2022

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@sunnylovestiramisu
Copy link
Contributor

What type of clusters have you tested? Is it a GCE cluster or a GKE cluster that disabled the managed PDCSI driver?

@sunnylovestiramisu
Copy link
Contributor

/retest-required

@dobsonj
Copy link
Member Author

dobsonj commented Oct 22, 2025

What type of clusters have you tested? Is it a GCE cluster or a GKE cluster that disabled the managed PDCSI driver?

I tested on GCE - specifically OpenShift 4.20 on GCE with WIF enabled and the included PD CSI driver disabled so that these deployment scripts can deploy successfully to the gce-pd-csi-driver namespace. If you'd like to see results on a different configuration I'm happy to give it a try.

@mattcary
Copy link
Contributor

I think k8s on GCE is the right platform to test this. The SA and controller structure on GKE is entirely different.

/review @sunnylovestiramisu

node_sa=$(echo "${node_sa}" | xargs) # trim whitespace
echo "Granting ${SA_USER_ROLE} for ${node_sa} to serviceAccount:${IAM_NAME}"
gcloud iam service-accounts add-iam-policy-binding "${node_sa}" \
--member="serviceAccount:${IAM_NAME}" --condition=None \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need condition=None? What is the default of condition?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added --condition=None because without it I was getting prompted several times when running the script, for example:

$ gcloud projects add-iam-policy-binding "${PROJECT}" --member serviceAccount:"${IAM_NAME}" --role "${SA_USER_ROLE}"
...
 [5] TITLE=test-platform__ filter
 [6] None
 [7] Specify a new condition
The policy contains bindings with conditions, so specifying a condition is required when
adding a binding. Please specify a condition.:

The list of conditions might be specific to our project(?) but IMO we always want None for these bindings.

I did not see that prompt when running gcloud iam service-accounts add-iam-policy-binding to create a binding for a specific service account. The condition defaulted to None in that case, but I included it anyway to leave no ambiguity.

# Check each node service account for serviceAccountUser role
if use_scoped_sa_role;
then
IFS=',' read -ra NODE_SA_ARRAY <<< "${NODE_SERVICE_ACCOUNTS}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do some input validation for NODE_SERVICE_ACCOUNTS? Like not empty etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if use_scoped_sa_role check already returns false at line 73 if NODE_SERVICE_ACCOUNTS is an empty string:

function use_scoped_sa_role()
{
	[ -n "${NODE_SERVICE_ACCOUNTS:-}" ]
}

for example:

$ unset NODE_SERVICE_ACCOUNTS
$ [ -n "$NODE_SERVICE_ACCOUNTS" ]
$ echo $?
1
$ export NODE_SERVICE_ACCOUNTS=""
$ [ -n "$NODE_SERVICE_ACCOUNTS" ]
$ echo $?
1

I considered checking for a comma in NODE_SERVICE_ACCOUNTS, but it is valid to pass in a single service account with no comma delimiter, and this line still handles that case:

$ export NODE_SERVICE_ACCOUNTS="sa1"
$ IFS=',' read -ra NODE_SA_ARRAY <<< "${NODE_SERVICE_ACCOUNTS}" && echo ${NODE_SA_ARRAY[@]}
sa1
$ export NODE_SERVICE_ACCOUNTS="sa1,sa2"
$ IFS=',' read -ra NODE_SA_ARRAY <<< "${NODE_SERVICE_ACCOUNTS}" && echo ${NODE_SA_ARRAY[@]}
sa1 sa2

The only other thing I can think to validate is whether it's a valid service account, which we won't know until calling the gcloud command, which gives a useful error message:

$ gcloud iam service-accounts get-iam-policy "invalid-sa1" --project="${PROJECT}" --flatten="bindings[].members" --format='table(bindings.role)' --filter="bindings.members:${IAM_NAME}"
ERROR: (gcloud.iam.service-accounts.get-iam-policy) argument SERVICE_ACCOUNT: Bad value [invalid-sa1]: Not a valid service account identifier. It should be either a numeric string representing the unique_id or an email of the form: my-iam-account@somedomain.com or my-iam-account@PROJECT_ID.iam.gserviceaccount.com

$ gcloud iam service-accounts get-iam-policy "invalid-sa1@somedomain.com" --project="${PROJECT}" --flatten="bindings[].members" --format='table(bindings.role)' --filter="bindings.members:${IAM_NAME}"
ERROR: (gcloud.iam.service-accounts.get-iam-policy) NOT_FOUND: Unknown service account. This command is authenticated as jdobson@redhat.com which is the active account specified by the [core/account] property

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is good enough with use_scoped_sa_role :)

@sunnylovestiramisu
Copy link
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 28, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dobsonj, sunnylovestiramisu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 28, 2025
@k8s-ci-robot k8s-ci-robot merged commit 7f8afd8 into kubernetes-sigs:master Oct 29, 2025
8 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/documentation Categorizes issue or PR as related to documentation. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reduce iam.serviceAccountUser role requirement

4 participants