Close CONTINUE escape in kbox seccomp supervisor #43

Merged
jserv merged 2 commits into main from ebadf
Apr 1, 2026
@jserv jserv commented Apr 1, 2026

Close the CONTINUE escape, in which a guest could exploit the supervisor's blind passthrough of untracked FDs and pointer-dependent syscalls.

Close #40


Summary by cubic

Hardened the seccomp supervisor by denying I/O on untracked FDs, replacing all path/pointer-dependent CONTINUEs with supervisor-side host syscalls, and fully tracking host-passthrough FDs (stdio, pipes, eventfd/timerfd/epoll, inherited). Fixes bind/connect EINVAL from stale socket entries and closes the CONTINUE escape.

  • Bug Fixes

    • Return EBADF for any FD not in the virtual table; gate all I/O with fd_should_deny_io (incl. mmap).
    • Replace CONTINUE for all path syscalls with host calls using local path copies and duped dirfds; cover readlinkat, statx, faccessat*, etc.
    • Handle epoll_ctl/wait, timerfd_*, and fstatfs without CONTINUE; validate args.
    • Write validated clone3 flags back before continue; allow execve on /dev/fd/N; reject virtual exec paths with EACCES.
    • Clear stale SHADOW_ONLY entries on ADDFD socket injection (fixes bind/connect); preserve SHADOW_ONLY on close.
  • Refactors

    • FD table: add mid-range tracking to eliminate the 1024–32768 gap; add KBOX_LKL_FD_SHADOW_ONLY; store table statically; unify lookups via fd_table_entry().
    • Track host-passthrough FDs: pre-register stdio; one-time /proc/<pid>/fd scan for inherited FDs; track pipes via ADDFD; create/inject eventfd, timerfd, epoll; emulate opens under /proc, /sys, /dev, and TTY; emulate dup/dup2/dup3 and fcntl(F_DUPFD*) via pidfd_getfd.
    • Tests updated for pipe2, F_DUPFD*, and mid-range cases.

Written for commit 0000182. Summary will update on new commits.

The supervisor blindly CONTINUEd any syscall on an FD not in the virtual
table, letting a guest read/write host FDs obtained via openat TOCTOU.
Return EBADF for every untracked FD while tracking all legitimate
host-passthrough FDs.

FD table: eliminate the [1024, 32768) gap with mid_fds[31744]. Move
fd_table to static storage. Convert all inline low_fds[]/entries[]
lookups to fd_table_entry().
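
The three-range layout described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the array names `low_fds`, `mid_fds`, `entries` and the function name `fd_table_entry()` come from the commit message, but the entry fields, the size of the high range, the global `g_table`, and the function signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define LOW_FD_MAX    1024   /* FDs in [0, 1024) */
#define MID_FD_COUNT  31744  /* 32768 - 1024, matches mid_fds[31744] */
#define HIGH_FD_BASE  32768  /* first FD of the high range */
#define HIGH_FD_COUNT 4096   /* assumed size, for illustration only */

/* Hypothetical entry layout; the real struct is not shown in the PR. */
struct fd_entry {
    long lkl_fd;
    int  used;
};

struct fd_table {
    struct fd_entry low_fds[LOW_FD_MAX];
    struct fd_entry mid_fds[MID_FD_COUNT];
    struct fd_entry entries[HIGH_FD_COUNT];
};

/* Static storage, as in "Move fd_table to static storage". */
struct fd_table g_table;

/* Single lookup path: every FD maps to exactly one slot or to NULL. */
struct fd_entry *fd_table_entry(struct fd_table *t, long fd)
{
    assert(t != NULL);
    if (fd < 0)
        return NULL;
    if (fd < LOW_FD_MAX)
        return &t->low_fds[fd];
    if (fd < HIGH_FD_BASE)
        return &t->mid_fds[fd - LOW_FD_MAX];
    if (fd < HIGH_FD_BASE + HIGH_FD_COUNT)
        return &t->entries[fd - HIGH_FD_BASE];
    return NULL;
}
```

With one accessor covering all three arrays, no FD number between 0 and the end of the high range can fall into an untracked gap.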

Tracking: pre-register stdio; track pipe FDs via ADDFD; intercept
eventfd/timerfd_create/epoll_create1 (create in supervisor, inject via
ADDFD); scan /proc/{pid}/fd on first notification to capture inherited
FDs; emulate /proc, /sys, /dev, and TTY opens (supervisor opens the file,
translating /proc/self, and injects it via ADDFD); emulate
dup/dup2/dup3/fcntl(F_DUPFD) via pidfd_getfd; allow execve on /dev/fd/N
paths.
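
For readers unfamiliar with ADDFD, the "create in supervisor, inject via ADDFD" step might look like the sketch below. It uses the Linux `SECCOMP_IOCTL_NOTIF_ADDFD` ioctl (available since Linux 5.9); the helper name `inject_fd` and its signature are hypothetical and not taken from the kbox source.

```c
#include <assert.h>
#include <errno.h>
#include <linux/seccomp.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: install the supervisor-owned srcfd into the task
 * that triggered notification req_id. Returns the FD number allocated in
 * the guest on success, or a negative errno value on failure.
 */
int inject_fd(int notif_fd, uint64_t req_id, int srcfd, uint32_t newfd_flags)
{
    struct seccomp_notif_addfd addfd;

    assert(srcfd >= 0);
    memset(&addfd, 0, sizeof(addfd));
    addfd.id = req_id;               /* notification being answered */
    addfd.srcfd = (uint32_t) srcfd;  /* FD in the supervisor */
    addfd.flags = 0;                 /* kernel picks the guest FD number */
    addfd.newfd_flags = newfd_flags; /* e.g. O_CLOEXEC */

    int ret = ioctl(notif_fd, SECCOMP_IOCTL_NOTIF_ADDFD, &addfd);
    return ret < 0 ? -errno : ret;
}
```

The returned guest FD number is what a supervisor would record in its FD table, so that later I/O on that FD is recognized as tracked.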

Enforcement: fd_should_deny_io() in every I/O handler. Move
epoll_ctl/wait, timerfd_settime/gettime, fstatfs from CONTINUE table to
gated handlers. Preserve SHADOW_ONLY entries on close (shared table
after fork); skip them in close_cloexec_entry.
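
A minimal sketch of the deny gate follows, assuming a simplified boolean tracking table. Only the name `fd_should_deny_io()` comes from the commit message; `fd_tracked`, `track_fd`, `gate_io`, and the 64-entry limit are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SKETCH_FD_LIMIT 64 /* illustrative bound, not the real table size */

/* true once the supervisor has registered the FD (stdio, pipe, ...) */
bool fd_tracked[SKETCH_FD_LIMIT];

void track_fd(long fd)
{
    assert(fd >= 0 && fd < SKETCH_FD_LIMIT);
    fd_tracked[fd] = true;
}

/* Deny by default: anything out of range or unregistered is refused. */
bool fd_should_deny_io(long fd)
{
    if (fd < 0 || fd >= SKETCH_FD_LIMIT)
        return true;
    return !fd_tracked[fd];
}

/* Called at the top of an I/O handler: -EBADF, or 0 to proceed. */
long gate_io(long fd)
{
    return fd_should_deny_io(fd) ? -EBADF : 0;
}
```

Because the gate denies by default, a handler that forgets to register a new FD source fails closed with EBADF instead of passing the syscall through.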

Change-Id: I34eaba6cfcb181250ecd95998d2e530af949915e

CONTINUE makes the host kernel re-read pointer targets from guest memory,
so a sibling thread can swap the path between the supervisor's
process_vm_readv and the host kernel's re-read.
Replace every path-dependent CONTINUE with supervisor-side emulation
using local path copies and pidfd_getfd dirfd copies.

Emulated operations (supervisor calls host syscall): fstatat, statx,
stat, faccessat, readlinkat, symlinkat, linkat, utimensat, mkdirat,
unlinkat, fchmodat, fchownat, renameat2, and legacy wrappers (access,
mkdir, unlink, rmdir, chmod, chown).
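
As an illustration of the emulation pattern, here is a sketch for one operation, faccessat. It assumes the caller has already copied the path out of guest memory into a supervisor-local buffer and obtained its own copy of the dirfd (for example via pidfd_getfd); the function name `emulate_faccessat` and its signature are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical emulation of faccessat in the supervisor.
 * local_path points to a supervisor-owned buffer of cap bytes, so the
 * guest cannot modify it after validation. Returns 0 or a negative errno
 * value, which would be sent back as the syscall result instead of CONTINUE.
 */
long emulate_faccessat(int dirfd_copy, const char *local_path, size_t cap,
                       int mode)
{
    assert(local_path != NULL);

    /* Reject a copy that is not NUL-terminated within its buffer. */
    if (memchr(local_path, '\0', cap) == NULL)
        return -ENAMETOOLONG;

    /* The host kernel reads only supervisor memory here. */
    if (faccessat(dirfd_copy, local_path, mode, 0) < 0)
        return -errno;
    return 0;
}
```

Since the host syscall sees only the local copy, a sibling guest thread that rewrites the original buffer afterwards has no effect on the result.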

Extract translate_proc_self() helper shared across all paths.
forward_execve: EACCES for virtual paths (not executable).
forward_clone3: write validated flags back before CONTINUE.
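
The /proc/self translation is needed because a path opened by the supervisor would otherwise resolve /proc/self to the supervisor rather than the guest. A rough sketch of such a helper, assuming it rewrites only the leading `/proc/self` component to `/proc/<guest pid>`; the signature and return convention are hypothetical, and only the name `translate_proc_self` comes from the commit message:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical signature. Writes the translated path into out and
 * returns 0, or returns -1 if the result does not fit in out_len bytes.
 */
int translate_proc_self(const char *path, pid_t guest_pid, char *out,
                        size_t out_len)
{
    static const char prefix[] = "/proc/self";
    const size_t plen = sizeof(prefix) - 1;
    int n;

    assert(path != NULL && out != NULL);

    /* Match "/proc/self" only as a whole component, not "/proc/selfish". */
    if (strncmp(path, prefix, plen) == 0 &&
        (path[plen] == '/' || path[plen] == '\0'))
        n = snprintf(out, out_len, "/proc/%ld%s", (long) guest_pid,
                     path + plen);
    else
        n = snprintf(out, out_len, "%s", path);

    return (n < 0 || (size_t) n >= out_len) ? -1 : 0;
}
```

For example, with a guest PID of 1234, `/proc/self/fd/3` becomes `/proc/1234/fd/3`, while paths outside `/proc/self` are copied unchanged.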

After this commit no should_continue_virtual_path or
should_continue_for_dirfd call results in CONTINUE.

Close #40

Change-Id: I1653e63ebd63f3e0a4428b24de051dfb05c769b1
@jserv jserv merged commit 390495c into main Apr 1, 2026
5 checks passed
@jserv jserv deleted the ebadf branch April 1, 2026 20:33
Development

Successfully merging this pull request may close these issues.

Audit CONTINUE dispatches for TOCTOU races on guest memory
