BPF token support and testing infrastructure #1953
Conversation
e80d8a8 to
fa9614c
Compare
|
Thanks for doing this. If you end up landing this, I will withdraw #1948. I tested this PR in our environment, and it works as documented. In case it can be useful, I'm attaching the test program we used: ambient_token_test.go.zip This runs in our namespaced container with When running it without |
8f7f344 to
c2149ea
Compare
50c6abc to
0209971
Compare
d29e5d5 to
a416a64
Compare
|
Rebased on main and updated to match changes made in #1950. Marking ready for review. |
javiercardona-work
left a comment
There was a problem hiding this comment.
The updated version of this branch still passes our tests in our containerized environment.
[09:47:25 root@tsp_prn/mks/mks-system-pool-jcardona-sparklemuffin-kubelet/0 ~]$ /tmp/ambient-token-test
ambient-token-test v9
=== Diagnostics ===
user namespace: user:[4026535252]
unprivileged_bpf_disabled: 1
Capabilities before:
CapInh: 0000000000000000
CapPrm: 000001c7a9ac7dff
CapEff: 000001c7a9ac7dff
CapBnd: 000001c7a9ac7dff
CapAmb: 0000000000000000
=== Token Create ===
token.Create(): OK (fd=6, mount=/run/tw/bpf)
=== Feature Probes (ambient token) ===
prog SocketFilter supported
prog Kprobe supported
prog SchedCLS supported
prog XDP supported
prog TracePoint supported
prog CGroupSKB supported
map Hash supported
map Array supported
map PerfEventArray supported
map LRUHash supported
map RingBuf supported
=== Map Create (ambient token) ===
map create: OK (fd=3)
=== Program Load (ambient token) ===
program load: OK (fd=3)
Done.
|
@rgo3 I there something I can do to help merge this PR? Happy to run additional tests on my environment. |
We are waiting for @ti-mo to review, since I made the PR, and he is the only other project maintainer. But Timo is a busy man, with a very long todo list, so this just takes some time. All reviews are welcome, but Timo's approval determines merge-ability of the PR. |
|
Hi @ti-mo, Could you provide a rough estimate on when this might be ready for merge? I’m curious if there are any outstanding design concerns we should address, or if it’s primarily a matter of finding an opening in your schedule. Our team is excited to move away from our internal fork and adopt @dylandreimerink better implementation for this feature, so we’re just looking to coordinate our timeline. Thanks! |
|
Hi Javier, This PR is the result of internal discussions, so don't expect any departures from the overall architecture, if that makes things a bit more predictable for you. For the initial implementation, we want there to not be any API changes, which should make this branch easy to carry in your Cilium fork since there should be no token-related Cilium changes required. I'll start working on this on Monday and it will be merged when it's done. |
ti-mo
left a comment
There was a problem hiding this comment.
Thanks! This is all I got for now, will do a 2nd round when these are discussed/addressed.
1bff0ee to
cdf4dd8
Compare
|
I figured out that you can request object info on tokens as well, which returns the delegated permissions as numbers, which means we don't have to have all of the from string and parsing logic. I still need to address a few of the outstanding comments with regards to changing the API of the tokens testutils. Just pushing what I have so far. |
cdf4dd8 to
2da766f
Compare
ti-mo
left a comment
There was a problem hiding this comment.
Did another round focusing on the non-testing changes. New RunWithToken looks nice at a glance. 👍
4c18071 to
c9b943b
Compare
c9b943b to
e551b91
Compare
3259bde to
ebaab56
Compare
|
@dylandreimerink Played with this a bit after getting back from break. tl;dr: In response to #1953 (comment): I ended up moving I've left the testing harness untouched, but I think we should make some improvements to the subtest output streaming because, at least locally, Let's run through the changes offline tomorrow. |
ebaab56 to
b86d54e
Compare
cilium/ci-kernels#90 bumped kernels and added CONFIG_MEMCG. Signed-off-by: Timo Beckers <timo@isovalent.com>
Setting up the environment to be able to use BPF tokens is a bit involved. This commit adds the `RunWithToken` utility function to do this setup. Using tokens requires a few things: * The process must run in a user namespace that is not the initial user namespace of the system. * The process must have the CAP_BPF capability in that user namespace. * A BPFFS must be available that has been mounted with delegated permissions. * The BPFFS must be owned by the user namespace of the process. In addition we want to run some tests with tokens and other without. Some actions involved in setting up the environment for tokens are irreversible, so we have to spawn a subprocess to run the tests with tokens. The `RunWithToken` helper does all of the required setup. It creates a UNIX socket pair to communicate with the subprocess, then executes a new instance of the current test binary. The new process is executed in its own user NS and mount NS. Only the current test is executed in the subprocess. An environment variable is used to detect if we are executing in a subprocess or not. The parent will wait for a BPFFS context to be send over the socket pair and will delegate the permissions specified as arguments to `RunWithToken`. When the subprocess first starts, it check the env var, and does the child side of the setup. It creates a BPFFS context, and sends it to the parent via the UNIX socket. After getting confirmation that the delegation is done, the child mounts the BPFFS, drops capabilities, and does a execve to re-execute itself. This ensures that in the final stage tokens are setup, even during init and global variable initialization. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
…og,attach This commit generates the TokenCreate syscall wrapper, TokenCreateAttr, as well as TokenInfo. Also add stringer generation for the Cmd, MapType, ProgType and AttachType enums for use in error messages. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com> Co-authored-by: Timo Beckers <timo@isovalent.com>
This commit adds support for BPF tokens. When first attempting to create a map, program or BTF blob, automatically discover all bpffs mounts available to the process and attempt to create tokens until successful. This has the benefit of not hardcoding any paths or passing them via environment variables. Since token info was only added as of 6.17, allow obj_info to fail and always treat EPERM as ErrTokenCapabilities if tokens are in use and info is missing. This avoids having to add corner cases and token plumbing to all existing feature probes and tests. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com> Co-authored-by: Timo Beckers <timo@isovalent.com>
b86d54e to
9ae2fec
Compare
This PR introduces test utilities that allows us to write tests for BPF tokens. This
RunWithTokenutility function is largely based on the test code from the kernel selftests but modified to work with Go and its runtime.The individual commit messages contain more details. The high level is that we can write tests (and subtests) where we start by calling
RunWithTokenwith a set of delegated permissions. The rest of the test body will be executed in an environment with a BPFFS mounted with those permissions.This PR also includes BPF token support. This takes the "ambient token" approach, where we do not add any new API surface, but we automatically use tokens when they are available. This allows us to experiment with tokens without committing to an API. We can always add more controls in the future.
Instead of relying on hardcoding search paths or environment variables, this implementation scans the
/proc/self/mountinfoprocfile to find all BPFFS instances mounted in the processes mount namespace. This means we are able to use any BPFFS path, and even multiple BPFFS'es, for example if one FS has no permissions but holds pinned objects, and a secondary BPFFS is used to the delegated tokens. However, there is no method to select a preferred one, if multiple BPFFS'es with delegated privileges exist (which should be uncommon).A nice side effect of reading the mount info is that we also know which permissions are delegated which is used in this implementation to enhance error messages when using tokens but the token lacks certain permissions.
The main purpose of this PR was to implement the testing utilities, and the PR is structured such that we can drop the actual token implementation if any of the other proposals is more appealing then this one.
Note: This PR currently has cherry-picked a commit from #1950 for its capability syscalls. This PR should be rebased once #1950 is merged.