@cyphar
Member

@cyphar cyphar commented Nov 26, 2025

Backport of #4951 and #4928.


If an early stage of runc init (nsenter) fails for some reason, it
logs the error(s) at the FATAL log level, via bail().

The runc init log is read by a parent (runc create/run/exec) and is
logged via normal logrus mechanism, which is all fine and dandy, except
when runc init fails, we return the error from the parent (which is
usually not too helpful, for example):

runc run failed: unable to start container process: can't get final child's PID from pipe: EOF

Now, the actual underlying error is from runc init and it was logged
earlier; here's what the full runc output looks like:

FATA[0000] nsexec-1[3247792]: failed to unshare remaining namespaces: No space left on device
FATA[0000] nsexec-0[3247790]: failed to sync with stage-1: next state
ERRO[0000] runc run failed: unable to start container process: can't get final child's PID from pipe: EOF

The problem is, upper level runtimes tend to ignore everything except
the last line from runc, and thus the error reported by e.g. docker is
not very helpful.

This patch tries to improve the situation by collecting FATAL errors
from runc init and appending those to the error returned (instead of
logging). With it, the above error will look like this:

ERRO[0000] runc run failed: unable to start container process: can't get final child's PID from pipe: EOF; runc init error(s): nsexec-1[141549]: failed to unshare remaining namespaces: No space left on device; nsexec-0[141547]: failed to sync with stage-1: next state

Yes, it is long and ugly, but at least the upper level runtime will report it.

Fixes: #4905

@cyphar cyphar added this to the 1.4.1 milestone Nov 26, 2025
@cyphar cyphar added the backport/1.4-pr A backport PR to release-1.4 label Nov 26, 2025
@lifubang
Member

Could you please also include this one: #4951?

@cyphar
Member Author

cyphar commented Nov 26, 2025

Will do, I thought they were already backported.

@cyphar cyphar force-pushed the 1.4-better-init-errors-4928 branch from 100f783 to 32a7907 Compare November 26, 2025 10:07
Since sane_kill is called after a failed read or write, but before
reporting the error from that read or write, it may change the errno
value in case kill(2) fails.

Save and restore the errno around the call to kill.

While at it,
 - change the code to return early;
 - don't return kill return value as no one is using it, and the errno
   value no longer correlates.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 9c8f476)
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
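
The errno handling described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the actual runc source: the function body and the `saved_errno` variable are assumptions; only the name `sane_kill` and the behavior (return early, ignore the result of kill, preserve errno) come from the commit message.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Sketch only: signal a child without clobbering the caller's errno. */
static void sane_kill(pid_t pid, int signum)
{
	/* Return early: a non-positive pid does not name a single child. */
	if (pid <= 0)
		return;

	/* kill(2) may fail and overwrite errno, so save it first. */
	int saved_errno = errno;

	/* The return value is deliberately ignored; no caller uses it. */
	(void)kill(pid, signum);

	/* Restore errno so a later error report still describes the
	 * original read/write failure, not the kill failure. */
	errno = saved_errno;
}
```

With this shape, a caller can invoke `sane_kill` between a failed read or write and the error report, and the `%m` appended to the report still refers to the read or write.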
We use bail to report fatal errors, and bail always appends %m
(aka strerror(errno)). In case an error condition did not set
errno, the log message will end up with ": Success" or an error
from a stale errno value. Either case is confusing for users.

Introduce bailx which is the same as bail except it does not
append %m, and use it where appropriate.

The naming follows libc's err(3) and errx(3).

PS we still use bail in a few cases after read or write, even
if that read/write did not return an error, because the code
does not distinguish between short read/write and error (-1).
This will be addressed by the next commit.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 067b833)
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
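
The difference between the two reporting styles can be illustrated with a small sketch. Everything here is hypothetical except the idea itself: `format_fatal`, `bail_msg` and `bailx_msg` are invented names, and unlike the real bail and bailx they only format into a buffer instead of logging and exiting.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format a message into buf and, if errnum is
 * non-negative, append ": <strerror(errnum)>" (what %m would produce).
 */
static void format_fatal(char *buf, size_t len, int errnum, const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	int n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);

	/* Append the errno text only when asked to and when there is room. */
	if (errnum >= 0 && n >= 0 && (size_t)n < len)
		snprintf(buf + n, len - (size_t)n, ": %s", strerror(errnum));
}

/* bail-like: always appends the current errno text. */
#define bail_msg(buf, len, ...)  format_fatal((buf), (len), errno, __VA_ARGS__)
/* bailx-like: never appends errno, for failures that did not set it. */
#define bailx_msg(buf, len, ...) format_fatal((buf), (len), -1, __VA_ARGS__)
```

If errno happens to be 0 when the bail-like variant is used, the message ends with the text for "no error", which is the confusing suffix the commit message describes; the bailx-like variant leaves the message as written.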
Add a few missing sane_kill calls where they make sense.

Remove one useless sane_kill of stage2_pid, as during SYNC_USERMAP stage2
is not yet started. It is harmless yet it makes the code slightly harder
to read.

Set the child pid to -1 upon receiving SYNC_CHILD_FINISH
to minimize the chances of killing an unrelated process.
When a child sends SYNC_CHILD_FINISH it is about to exit
(although theoretically it could be stuck during debug logging).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit aea52d0)
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Introduce and use iobail, xread, and xwrite wrappers so that we can
properly check read/write return value and call either bail or bailx on
error, with proper diagnostics (distinguishing failed read/write from a
short read/write).

This prevents the "Success" suffix in errors like:

	failed to sync with stage-1: next state: Success

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 6c18b25)
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
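
The distinction between a failed read and a short read can be sketched as below. This is an illustration under stated assumptions, not the actual iobail, xread or xwrite code: `check_read` and `enum io_result` are hypothetical names, the diagnostic is written to a buffer rather than passed to bail or bailx, and EINTR is not retried.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical result codes for a checked read. */
enum io_result { IO_OK, IO_ERROR, IO_SHORT };

/*
 * Sketch only: read exactly count bytes and describe what went wrong.
 * A real wrapper would call bail (errno is meaningful) for IO_ERROR
 * and bailx (errno is not meaningful) for IO_SHORT.
 */
static enum io_result check_read(int fd, void *buf, size_t count,
				 char *msg, size_t msglen)
{
	ssize_t n = read(fd, buf, count);

	if (n < 0) {
		/* read(2) failed and set errno, so its text is relevant. */
		snprintf(msg, msglen, "read failed: %s", strerror(errno));
		return IO_ERROR;
	}
	if ((size_t)n != count) {
		/* Short read or EOF: errno was not set, so do not print it. */
		snprintf(msg, msglen, "short read: got %zd of %zu bytes", n, count);
		return IO_SHORT;
	}
	if (msglen > 0)
		msg[0] = '\0';
	return IO_OK;
}
```

Keeping the two cases apart is what lets the caller pick the right reporting function, so a short read no longer produces a message ending in ": Success".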
If an early stage of runc init (nsenter) fails for some reason, it
logs the error(s) at the FATAL log level, via bail().

The runc init log is read by a parent (runc create/run/exec) and is
logged via normal logrus mechanism, which is all fine and dandy, except
when `runc init` fails, we return the error from the parent (which is
usually not too helpful, for example):

	runc run failed: unable to start container process: can't get final child's PID from pipe: EOF

Now, the actual underlying error is from runc init and it was logged
earlier; here's what the full runc output looks like:

	FATA[0000] nsexec-1[3247792]: failed to unshare remaining namespaces: No space left on device
	FATA[0000] nsexec-0[3247790]: failed to sync with stage-1: next state
	ERRO[0000] runc run failed: unable to start container process: can't get final child's PID from pipe: EOF

The problem is, upper level runtimes tend to ignore everything except
the last line from runc, and thus the error reported by e.g. docker is
not very helpful.

This patch tries to improve the situation by collecting FATAL errors
from runc init and appending those to the error returned (instead of
logging). With it, the above error will look like this:

	ERRO[0000] runc run failed: unable to start container process: can't get final child's PID from pipe: EOF; runc init error(s): nsexec-1[141549]: failed to unshare remaining namespaces: No space left on device; nsexec-0[141547]: failed to sync with stage-1: next state

Yes, it is long and ugly, but at least the upper level runtime will
report it.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit f944cce)
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
@kolyshkin kolyshkin force-pushed the 1.4-better-init-errors-4928 branch from 32a7907 to f1d0dd8 Compare November 27, 2025 01:59
@lifubang lifubang merged commit c362d6b into opencontainers:release-1.4 Nov 27, 2025
37 checks passed
