Adding QNX Unit Testing Infrastructure and Python Dependency Integration#51
Conversation
There was a problem hiding this comment.
Important security, logic, and reliability fixes are needed before the QNX testing stack ships.
Status: Changes Suggested | Risk: Medium | Readiness: 5/10 (Needs Polish)
Issues Identified & Suggestions
- Harden security and result parsing:
cc_test_qnx.bzl,process_test_results.py - Fix rule/template and script robustness (runfiles, mounts, busy-wait):
run_as_exec.bzl,init_x86_64_cc_test.build.template,quality/qnx_unit_testing/*.sh,tools.build
Review Details
📁 14 files reviewed | 💬 14 comments
👍 / 👎 individual comments to help improve reviews for you
| cmd_bash = """ | ||
| echo {} > $(@) | ||
| """.format(excluded_tests_filter_str), |
There was a problem hiding this comment.
[Security] The excluded_tests_filter_str is injected unsanitized into a shell echo command via cmd_bash. If the name parameter or filter strings contain shell metacharacters (spaces, quotes, $, backticks), command injection or unexpected behavior can occur.
Quote the value in the echo command:
cmd_bash = """
echo '{}' > $(@)
""".format(excluded_tests_filter_str),Or better, use ctx.actions.write instead of a genrule to avoid shell interpretation entirely.
Context for Agents
The `excluded_tests_filter_str` is injected unsanitized into a shell `echo` command via `cmd_bash`. If the `name` parameter or filter strings contain shell metacharacters (spaces, quotes, `$`, backticks), command injection or unexpected behavior can occur.
Quote the value in the echo command:
```python
cmd_bash = """
echo '{}' > $(@)
""".format(excluded_tests_filter_str),
```
Or better, use `ctx.actions.write` instead of a genrule to avoid shell interpretation entirely.
File: quality/qnx_unit_testing/cc_test_qnx.bzl
Line: 86Chain of Thought
- Validation Decision: accepted
- Confidence: 0.90
- Likelihood Score: 4/5
- Likelihood Reason: GTest filter strings commonly contain
*(as shown in 2 of 3 documented examples on lines 68-70), which is a shell glob metacharacter. Without quoting, shell glob expansion will occur, leading to incorrect filter content in the output file. This is highly likely to surface in normal usage. - Suggested Action: keep
- Tool Calls Used: 2
Reasoning
The comment correctly identifies a real shell injection/unexpected behavior issue in newly introduced code.
Code analysis:
excluded_tests_filter_stris built by concatenating GTest filter strings (line 78-80).- It's injected unquoted into a shell
echocommand viacmd_bash(lines 84-86):echo {} > $(@). - The documented examples of filter strings include
*(glob metacharacter):["FooTest.*"],["*FooTest.*"].
Why this is a real issue:
- The
*in GTest filters likeFooTest.*will be subject to shell glob expansion. If any files in the working directory match the pattern, the echo output will contain the filenames instead of the intended filter string. - Additionally, the
\\at the end of the string (from line 80:":\\") creates a trailing backslash that will escape the subsequent whitespace/newline in the shell command, potentially causing malformed output. - No quoting means spaces,
$, backticks, and other shell metacharacters could cause command injection or unexpected behavior.
The fix is actionable and correct:
- Using single quotes
echo '{}' > $(@)would prevent shell interpretation of the value. - The
ctx.actions.writesuggestion is more robust but would require refactoring from a macro'snative.genruleto a custom rule implementation. Still, it's a valid better alternative.
This is new code introduced by this PR (the file didn't exist at BASE), so it's clearly within PR scope.
Evidence Summary
1. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/cc_test_qnx.bzl - Lines:
75-90
75: excluded_tests_filter = excluded_tests_filter if excluded_tests_filter else []
76:
77: # Convert filter list to GTest format: "-FooTest.Test1:BarTest.*"
78: excluded_tests_filter_str = "-"
79: for test_filter in excluded_tests_filter:
80: excluded_tests_filter_str = excluded_tests_filter_str + (test_filter + ":\\")
81:
82: native.genrule(
83: name = "{}_excluded_tests_filter".format(name),
84: cmd_bash = """
...
| shared_img.download('coverage.tar.gz', file) | ||
|
|
||
| with tarfile.open(coverage_archive) as tf: | ||
| tf.extractall(output_dir) |
There was a problem hiding this comment.
[Security] tarfile.extractall(output_dir) is vulnerable to path traversal (e.g., ../) from a crafted coverage.tar.gz produced by the guest, which can overwrite host files. Validate members before extracting (or use the filter argument in newer Python).
| tf.extractall(output_dir) | |
| for m in tf.getmembers(): | |
| dest = os.path.realpath(os.path.join(output_dir, m.name)) | |
| if not dest.startswith(os.path.realpath(output_dir) + os.sep): | |
| raise RuntimeError("Unsafe path in coverage archive") | |
| tf.extract(m, output_dir) |
Context for Agents
`tarfile.extractall(output_dir)` is vulnerable to path traversal (e.g., `../`) from a crafted `coverage.tar.gz` produced by the guest, which can overwrite host files. Validate members before extracting (or use the `filter` argument in newer Python).
```suggestion
for m in tf.getmembers():
dest = os.path.realpath(os.path.join(output_dir, m.name))
if not dest.startswith(os.path.realpath(output_dir) + os.sep):
raise RuntimeError("Unsafe path in coverage archive")
tf.extract(m, output_dir)
```
File: quality/qnx_unit_testing/process_test_results.py
Line: 25Chain of Thought
- Validation Decision: accepted
- Confidence: 0.92
- Likelihood Score: 3/5
- Likelihood Reason: The archive originates from a QEMU guest VM, which is typically a controlled environment in CI. However, a compromised or misconfigured guest could produce a malicious archive. The path traversal vulnerability in tarfile.extractall() is well-documented (CVE-2007-4559) and defense-in-depth makes this a reasonable concern.
- Suggested Action: keep
- Tool Calls Used: 3
Reasoning
-
Issue is real and introduced by this PR: The file
process_test_results.pyis entirely new (not found in base commit). Line 25 callstf.extractall(output_dir)without any member validation, which is a well-known path traversal vulnerability (CVE-2007-4559). -
The vulnerability is correctly described:
tarfile.extractall()does not sanitize member names by default. A crafted tar archive with entries like../../etc/maliciouscould write files outsideoutput_dir. Python's own documentation warns about this. -
The source of the archive increases risk: The
coverage.tar.gzis downloaded from a FAT filesystem image (shared_img) that represents a QEMU guest's shared storage. While the guest is likely a controlled test environment, the archive contents are not validated before extraction. -
The suggestion is actionable and correct: The proposed fix validates each member's resolved path against the output directory using
os.path.realpath(), preventing path traversal. The alternative of using Python 3.12+'sfilterparameter is also mentioned. -
No existing mitigations: There is no upstream validation of the tar archive contents anywhere in the code.
Evidence Summary
1. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/process_test_results.py
1: import os
2: import sys
3: import tarfile
4: from fs import open_fs
5:
6: def check_tests_result() -> int:
7: shared_img_path = sys.argv[1]
8: shared_img = open_fs("fat://" + shared_img_path + "?preserve_case=true")
9:
10: output_xml = os.environ.get("XML_OUTPUT_FILE")
...
| domain_status=$(echo $query | awk -F' ' '{print $2}') | ||
| echo "CONFIGURE_PERSISTENCY: domain_number: $domain_number" | ||
| echo "CONFIGURE_PERSISTENCY: domain_status: $domain_status" | ||
| if [[ -n "$domain_number" && -n "$domain_status" && "$domain_number" == "$ENC_DOMAIN" && ("$domain_status" == "UNLOCKED" || "$domain_status" == "LOCKED") ]]; then |
There was a problem hiding this comment.
[Logic] The configure_persistency.sh shebang is #!/bin/sh but uses [[ ... ]] (bashism) at line 71. On a strict POSIX shell, this will fail with a syntax error. Either change the shebang to #!/bin/bash or replace [[ ... ]] with POSIX-compatible [ ... ] tests (as done in unlock_persistent at line 119).
| if [[ -n "$domain_number" && -n "$domain_status" && "$domain_number" == "$ENC_DOMAIN" && ("$domain_status" == "UNLOCKED" || "$domain_status" == "LOCKED") ]]; then | |
| if [ -n "$domain_number" ] && [ -n "$domain_status" ] && [ "$domain_number" = "$ENC_DOMAIN" ] && { [ "$domain_status" = "UNLOCKED" ] || [ "$domain_status" = "LOCKED" ]; }; then |
Context for Agents
The `configure_persistency.sh` shebang is `#!/bin/sh` but uses `[[ ... ]]` (bashism) at line 71. On a strict POSIX shell, this will fail with a syntax error. Either change the shebang to `#!/bin/bash` or replace `[[ ... ]]` with POSIX-compatible `[ ... ]` tests (as done in `unlock_persistent` at line 119).
```suggestion
if [ -n "$domain_number" ] && [ -n "$domain_status" ] && [ "$domain_number" = "$ENC_DOMAIN" ] && { [ "$domain_status" = "UNLOCKED" ] || [ "$domain_status" = "LOCKED" ]; }; then
```
File: quality/qnx_unit_testing/configure_persistency.sh
Line: 71Chain of Thought
- Validation Decision: accepted
- Confidence: 0.97
- Likelihood Score: 5/5
- Likelihood Reason: The script has
#!/bin/shshebang, and line 71 uses[[ ... ]]which is a bash-only construct. On any strict POSIX sh (e.g., dash on Debian/Ubuntu, or the QNX sh this script targets), this will fail with a syntax error. This is inevitable if the script is ever run by a non-bash/bin/sh. - Suggested Action: keep
- Tool Calls Used: 1
Reasoning
The comment is accurate on all counts:
-
Shebang is
#!/bin/sh(line 1) — confirmed in the diff. This declares the script should be interpreted by a POSIX-compatible shell. -
Line 71 uses
[[ ... ]]— confirmed:if [[ -n "$domain_number" && -n "$domain_status" && "$domain_number" == "$ENC_DOMAIN" && ("$domain_status" == "UNLOCKED" || "$domain_status" == "LOCKED") ]]; then. The[[ ... ]]construct is a bash/ksh extension not defined in POSIX sh. -
Line 119 uses POSIX
[ ... ]— confirmed:if [ -n "$domain_number" ] && [ -n "$domain_status" ] && .... This shows inconsistency within the same file. -
The file is entirely new (all 165 lines added) — this issue is introduced by this PR, not legacy debt.
-
The suggested fix is correct — it properly converts
[[ ... ]]to chained[ ... ]tests with&&and uses{ ...; }grouping for the||condition, and correctly uses=instead of==for POSIX string comparison.
This is a real bug that would cause a syntax error on strict POSIX shells. The comment is specific, actionable, and provides a concrete code suggestion.
Evidence Summary
1. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/configure_persistency.sh - Lines:
119-119
119: if [ -n "$domain_number" ] && [ -n "$domain_status" ] && [ "$domain_number" == "$ENC_DOMAIN" ] && [ "$domain_status" == "LOCKED" ] && [ "$domain_status" != "UNLOCKED" ]; then
| cam-disk.so | ||
| fs-qnx6.so | ||
| io-blk.so | ||
| libc++.so.2 | ||
| libc.so | ||
| libc.so.6 | ||
| libcam.so.2 | ||
| libgcc_s.so.1 | ||
| libm.so | ||
| libsecpol.so.1 |
There was a problem hiding this comment.
[Logic] The libsecpol.so.1 library listed in the template's shared libraries section is absent from tools.build. If tools.build is the canonical shared library list used by the actual test images (via cc_test_qnx.bzl), tests depending on libsecpol.so.1 will fail with missing library errors at runtime. Conversely, tools.build includes several libraries (libqcrypto.so.1.0, libqh.so, libregex.so.1, libslog2.so.1, libslog2parse.so.1) that the template does not. Verify which file is authoritative and reconcile the lists.
Context for Agents
The `libsecpol.so.1` library listed in the template's shared libraries section is absent from `tools.build`. If `tools.build` is the canonical shared library list used by the actual test images (via `cc_test_qnx.bzl`), tests depending on `libsecpol.so.1` will fail with missing library errors at runtime. Conversely, `tools.build` includes several libraries (`libqcrypto.so.1.0`, `libqh.so`, `libregex.so.1`, `libslog2.so.1`, `libslog2parse.so.1`) that the template does not. Verify which file is authoritative and reconcile the lists.
File: quality/qnx_unit_testing/init_x86_64_cc_test.build.template
Line: 49Chain of Thought
- Validation Decision: accepted
- Confidence: 0.65
- Likelihood Score: 3/5
- Likelihood Reason: The discrepancy between the two library lists is real and could cause confusion or runtime failures if the wrong file is used as reference. However, the comment's framing about tools.build being used by cc_test_qnx.bzl is incorrect - the template is the authoritative file for test images.
- Suggested Action: modify
- Tool Calls Used: 6
Reasoning
The comment identifies a real discrepancy between init_x86_64_cc_test.build.template and tools.build regarding shared library lists. The factual claims about which libraries are present/absent in each file are correct:
libsecpol.so.1is in the template but not intools.buildlibqcrypto.so.1.0,libqh.so,libregex.so.1,libslog2.so.1,libslog2parse.so.1are intools.buildbut not the template
However, the comment's framing is partially incorrect. It states "If tools.build is the canonical shared library list used by the actual test images (via cc_test_qnx.bzl)" — but examining cc_test_qnx.bzl (line 132-136), the qnx_ifs rule for test images uses build_file = "//quality/qnx_unit_testing:init_build_cc_test" which is the expanded template. It does NOT reference tools.build at all.
tools.build is only used by the init_shell debugging image (BUILD lines 82-101), where it's included as a source with mapping TOOLS_BUILDFILE.
So the two files serve different purposes:
- Template → test images (via cc_test_qnx) and shell images
tools.build→ only shell/debug images (as an additional source)
The discrepancy is worth flagging since both files were introduced in this PR, serve overlapping purposes for QNX image building, and have different library sets which could lead to maintenance confusion. But the specific claim about tests failing because libsecpol.so.1 is missing from tools.build is misleading since tools.build isn't used for test images.
The recommendation to reconcile the lists is still actionable and valid given both are new files in this PR.
Evidence Summary
1. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/cc_test_qnx.bzl
1: # *******************************************************************************
2: # Copyright (c) 2025 Contributors to the Eclipse Foundation
3: #
4: # See the NOTICE file(s) distributed with this work for additional
5: # information regarding copyright ownership.
6: #
7: # This program and the accompanying materials are made available under the
8: # terms of the Apache License Version 2.0 which is available at
9: # https://www.apache.org/licenses/LICENSE-2.0
10: #
...
2. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/tools.build
1: # dynamic loader
2: /usr/lib/ldqnx-64.so.2=ldqnx-64.so.2
3:
4: # shared libraries (QNX 8.0 x86_64)
5: cam-disk.so
6: fs-qnx6.so
7: io-blk.so
8: libc++.so.2
9: libc.so
10: libc.so.6
...
3. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/BUILD
1: # *******************************************************************************
2: # Copyright (c) 2025 Contributors to the Eclipse Foundation
3: #
4: # See the NOTICE file(s) distributed with this work for additional
5: # information regarding copyright ownership.
6: #
7: # This program and the accompanying materials are made available under the
8: # terms of the Apache License Version 2.0 which is available at
9: # https://www.apache.org/licenses/LICENSE-2.0
10: #
...
| configure_persistency.sh=${CONFIGURE_PERSISTENCY_SCRIPT} | ||
|
|
||
| # Include test package filesystem | ||
| /=${TESTS} |
There was a problem hiding this comment.
[Logic] /=${TESTS} is mandatory in this template, but init_shell doesn’t pass a tars entry for TESTS, so mkifs will fail with an undefined variable. Make it optional ([-optional]) or pass a TESTS tar for init_shell.
Context for Agents
`/=${TESTS}` is mandatory in this template, but `init_shell` doesn’t pass a `tars` entry for `TESTS`, so mkifs will fail with an undefined variable. Make it optional (`[-optional]`) or pass a TESTS tar for `init_shell`.
File: quality/qnx_unit_testing/init_x86_64_cc_test.build.template
Line: 71Chain of Thought
- Validation Decision: accepted
- Confidence: 0.00
- Likelihood Score: 0/5
- Suggested Action: keep
- Tool Calls Used: 10
Reasoning
Failed to parse validation response
Evidence Summary
1. File Lookup
- Filename:
run_qemu_shell.sh
No files found matching 'run_qemu_shell.sh'
2. File Lookup
- Filename:
cc_test_qnx.bzl
No files found matching 'cc_test_qnx.bzl'
3. File Lookup
- Filename:
run_qemu.sh
No files found matching 'run_qemu.sh'
4. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/BUILD
1: # *******************************************************************************
2: # Copyright (c) 2025 Contributors to the Eclipse Foundation
3: #
4: # See the NOTICE file(s) distributed with this work for additional
5: # information regarding copyright ownership.
6: #
7: # This program and the accompanying materials are made available under the
8: # terms of the Apache License Version 2.0 which is available at
9: # https://www.apache.org/licenses/LICENSE-2.0
10: #
...
5. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/cc_test_qnx.bzl
1: # *******************************************************************************
2: # Copyright (c) 2025 Contributors to the Eclipse Foundation
3: #
4: # See the NOTICE file(s) distributed with this work for additional
5: # information regarding copyright ownership.
6: #
7: # This program and the accompanying materials are made available under the
8: # terms of the Apache License Version 2.0 which is available at
9: # https://www.apache.org/licenses/LICENSE-2.0
10: #
...
6. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/run_qemu.sh
1: #!/bin/bash
2:
3: set -euo pipefail
4:
5: IFS_IMAGE=$1
6: PERSISTENT_IMAGE=$2
7: PROCESS_TESTS_RESULTS=$3
8: TEST_RESULT_IMAGE=$4
9: TEST_IMAGE=$5
10:
...
7. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/run_qemu_shell.sh
1: #!/bin/bash
2:
3: set -euo pipefail
4:
5: IFS_IMAGE=$1
6: PERSISTENT_IMAGE=$2
7: TEST_RESULT_IMAGE=$3
8: TEST_IMAGE=$4
9:
10: writable_img=$(mktemp)
...
| echo "CONFIGURE_PERSISTENCY: $PERSISTENT_PARTITION_MOUNT_POINT already mounted, returning" | ||
| else | ||
| echo "CONFIGURE_PERSISTENCY: $PERSISTENT_PARTITION_MOUNT_POINT not mounted, hence mounting" | ||
| mount -t qnx6 -ocrypto=openssl "$PERSISTENT_PARTITION" "$PERSISTENT_PARTITION_MOUNT_POINT" |
There was a problem hiding this comment.
[Reliability] mount_qnx6fs_with_crypto assumes $PERSISTENT_PARTITION_MOUNT_POINT exists; if the directory isn’t created in the IFS, the mount fails and the caller retries forever. Create the mount point before mounting.
| mount -t qnx6 -ocrypto=openssl "$PERSISTENT_PARTITION" "$PERSISTENT_PARTITION_MOUNT_POINT" | |
| mkdir -p "$PERSISTENT_PARTITION_MOUNT_POINT" | |
| mount -t qnx6 -ocrypto=openssl "$PERSISTENT_PARTITION" "$PERSISTENT_PARTITION_MOUNT_POINT" |
Context for Agents
`mount_qnx6fs_with_crypto` assumes `$PERSISTENT_PARTITION_MOUNT_POINT` exists; if the directory isn’t created in the IFS, the mount fails and the caller retries forever. Create the mount point before mounting.
```suggestion
mkdir -p "$PERSISTENT_PARTITION_MOUNT_POINT"
mount -t qnx6 -ocrypto=openssl "$PERSISTENT_PARTITION" "$PERSISTENT_PARTITION_MOUNT_POINT"
```
File: quality/qnx_unit_testing/configure_persistency.sh
Line: 21Chain of Thought
- Validation Decision: accepted
- Confidence: 0.82
- Likelihood Score: 3/5
- Likelihood Reason: The mount point directory
/qnx6fs_persistentis not created anywhere in the script or visibly in the IFS build template. If the QNX IFS doesn't implicitly create it (which isn't guaranteed), the mount will fail and the caller retries forever. It's plausible this directory exists in the IFS via some mechanism not visible in the template, but the defensivemkdir -pis still the right approach. - Suggested Action: keep
- Tool Calls Used: 5
Reasoning
The comment correctly identifies a real issue in newly introduced code:
-
New file:
configure_persistency.shis entirely new in this PR (is_new is effectively true — 0 lines before, 165 after). -
Missing directory creation: The
mount_qnx6fs_with_crypto()function at line 21 runsmount -t qnx6 -ocrypto=openssl "$PERSISTENT_PARTITION" "$PERSISTENT_PARTITION_MOUNT_POINT"where$PERSISTENT_PARTITION_MOUNT_POINTis/qnx6fs_persistent. Themountcommand on Unix-like systems (including QNX) requires the mount point directory to already exist. There is nomkdir -pbefore this mount call. -
IFS template check: The
init_x86_64_cc_test.build.templatedoes not show/qnx6fs_persistentbeing created in the IFS. While QNX IFS can include implicit directory creation, there's no evidence of it here. -
Infinite retry confirmed: In
startup.shlines 26-28,configure_persistency.shis called insidewhile ! configure_persistency.sh; do ... done. If the mount point directory doesn't exist, every invocation fails, and the loop retries forever since the condition never changes. -
Actionable fix: The suggestion to add
mkdir -p "$PERSISTENT_PARTITION_MOUNT_POINT"before the mount command is concrete, correct, and minimal.
The only uncertainty is whether the /qnx6fs_persistent directory might be created by some other mechanism in the QNX boot process not visible in these files. But the defensive mkdir -p is still good practice and has no downside.
Evidence Summary
1. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/run_qemu.sh
1: #!/bin/bash
2:
3: set -euo pipefail
4:
5: IFS_IMAGE=$1
6: PERSISTENT_IMAGE=$2
7: PROCESS_TESTS_RESULTS=$3
8: TEST_RESULT_IMAGE=$4
9: TEST_IMAGE=$5
10:
...
2. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/startup.sh
1: #!/bin/sh
2:
3: slogger2
4: waitfor /dev/slog
5:
6: pci-server --config=/proc/boot/pci_server.cfg
7: waitfor /dev/pci
8:
9: pipe
10: waitfor /dev/pipe
...
3. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/configure_persistency.sh
1: #!/bin/sh
2:
3: set -eu
4:
5: PERSISTENT_PARTITION="/dev/hd1"
6: PERSISTENT_PARTITION_MOUNT_POINT="/qnx6fs_persistent"
7: PERSISTENT_ENCRYPTION_MOUNT_POINT=/persistent
8: QEMU=1
9:
10: ENC_DOMAIN=3
...
4. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/init_x86_64_cc_test.build.template
1: ###############################################################################
2: #
3: # QNX 8.0 Unit Test Image for x86_64
4: #
5: ###############################################################################
6:
7: [-optional +autolink]
8:
9: [image=0x3600000]
10: [virtual=x86_64,multiboot] boot = {
...
| -drive file="${TEST_IMAGE}",if=virtio,format=raw,index=2,media=disk,readonly=on \ | ||
| 2>&1 | sed 's/[^[:print:]]//g' | sed 's/\r//' | ||
|
|
||
| ${PROCESS_TESTS_RESULTS} "${writable_img}" |
There was a problem hiding this comment.
[Reliability] With set -euo pipefail (line 3), if ${PROCESS_TESTS_RESULTS} on this line returns a non-zero exit code (which happens whenever a test fails, since process_test_results.py returns the test's return code), the script immediately exits due to set -e. This means return_code=$? on the next line never executes, and more critically, the rm -rf cleanup on lines 49-50 never runs, leaking the temporary files created by mktemp on lines 11 and 14.
The fix is to capture the exit code in a way that doesn't trigger set -e, e.g.:
${PROCESS_TESTS_RESULTS} "${writable_img}" && return_code=0 || return_code=$?
Or use a trap for cleanup.
Context for Agents
[**Bug**] With `set -euo pipefail` (line 3), if `${PROCESS_TESTS_RESULTS}` on this line returns a non-zero exit code (which happens whenever a test fails, since `process_test_results.py` returns the test's return code), the script immediately exits due to `set -e`. This means `return_code=$?` on the next line never executes, and more critically, the `rm -rf` cleanup on lines 49-50 never runs, leaking the temporary files created by `mktemp` on lines 11 and 14.
The fix is to capture the exit code in a way that doesn't trigger `set -e`, e.g.:
```
${PROCESS_TESTS_RESULTS} "${writable_img}" && return_code=0 || return_code=$?
```
Or use a `trap` for cleanup.
File: quality/qnx_unit_testing/run_qemu.sh
Line: 46Chain of Thought
- Validation Decision: accepted
- Confidence: 0.97
- Likelihood Score: 5/5
- Likelihood Reason: This will happen every time a test fails, which is the primary failure path this script is designed to handle. The process_test_results.py script explicitly exits with the test's return code via sys.exit(), so any test failure triggers the bug.
- Suggested Action: keep
- Tool Calls Used: 2
Reasoning
The comment correctly identifies a real bug introduced by this PR:
-
set -euo pipefailon line 3: Confirmed. The-eflag causes immediate script exit on any command returning non-zero. -
Line 46 can return non-zero:
process_test_results.pydoessys.exit(check_tests_result())wherecheck_tests_result()returnsint(shared_img.readtext('returncode.log')). When a test fails, this will be a non-zero value, causing the Python process to exit non-zero. -
set -etriggers before line 47: Withset -e, when line 46 returns non-zero, bash exits the script immediately.return_code=$?on line 47 never executes, and critically, the cleanup on lines 49-50 (rm -rfof temp files) never runs. -
Temp file leak:
mktempon lines 11 and 14 creates temporary files that are never cleaned up when tests fail. -
New file: This file is entirely new in this PR (before_change_line_count: 0), so the issue is directly introduced by this PR.
The suggested fix (${PROCESS_TESTS_RESULTS} "${writable_img}" && return_code=0 || return_code=$?) is the standard bash idiom for capturing exit codes without triggering set -e. The alternative suggestion of using a trap for cleanup is also valid. Both are concrete, actionable fixes.
Evidence Summary
1. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/run_qemu.sh
1: #!/bin/bash
2:
3: set -euo pipefail
4:
5: IFS_IMAGE=$1
6: PERSISTENT_IMAGE=$2
7: PROCESS_TESTS_RESULTS=$3
8: TEST_RESULT_IMAGE=$4
9: TEST_IMAGE=$5
10:
...
2. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/process_test_results.py
1: import os
2: import sys
3: import tarfile
4: from fs import open_fs
5:
6: def check_tests_result() -> int:
7: shared_img_path = sys.argv[1]
8: shared_img = open_fs("fat://" + shared_img_path + "?preserve_case=true")
9:
10: output_xml = os.environ.get("XML_OUTPUT_FILE")
...
|
|
||
| def check_tests_result() -> int: | ||
| shared_img_path = sys.argv[1] | ||
| shared_img = open_fs("fat://" + shared_img_path + "?preserve_case=true") |
There was a problem hiding this comment.
[Reliability] The open_fs resource is never closed. If check_tests_result raises an exception partway through, the FAT filesystem handle leaks. Use a context manager:
| shared_img = open_fs("fat://" + shared_img_path + "?preserve_case=true") | |
| with open_fs("fat://" + shared_img_path + "?preserve_case=true") as shared_img: |
Then indent the rest of the function body under this with block.
Context for Agents
The `open_fs` resource is never closed. If `check_tests_result` raises an exception partway through, the FAT filesystem handle leaks. Use a context manager:
```suggestion
with open_fs("fat://" + shared_img_path + "?preserve_case=true") as shared_img:
```
Then indent the rest of the function body under this `with` block.
File: quality/qnx_unit_testing/process_test_results.py
Line: 8Chain of Thought
- Validation Decision: accepted
- Confidence: 0.97
- Likelihood Score: 4/5
- Likelihood Reason: The filesystem handle is never closed (no .close() call, no context manager). Any exception in the function body (e.g., missing 'returncode.log', failed download, missing env vars causing NoneType errors) will leak the FAT filesystem handle. Even on the happy path, the handle is never explicitly closed.
- Suggested Action: keep
- Tool Calls Used: 3
Reasoning
-
Accuracy: The comment correctly identifies that
open_fs()at line 8 returns a filesystem object that is never closed. There is noshared_img.close()call anywhere in the function, and nowithstatement wrapping it. -
Library documentation confirms: The pyfilesystem2 docs explicitly state: "Always close filesystems — either via a
withstatement (preferred) or by calling.close()explicitly." The docs recommend the context manager pattern (with open_fs(...) as fs:) as the preferred approach, which is exactly what the comment suggests. -
New code introduced by this PR: The file
process_test_results.pydoes not exist in the base commit — it is entirely new in this PR. So this is not pre-existing debt. -
Real risk of leaking: Multiple operations in the function body can raise exceptions (e.g.,
shared_img.exists(),shared_img.download(),shared_img.readtext(),open()calls,int()conversion). If any of these fail, the FAT filesystem handle leaks. Even on the happy path, the handle is never closed before the function returns. -
Actionable: The comment provides a concrete code suggestion — wrap the
open_fscall in awithstatement and indent the body accordingly. This is a clear, specific fix.
Evidence Summary
1. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/process_test_results.py
1: import os
2: import sys
3: import tarfile
4: from fs import open_fs
5:
6: def check_tests_result() -> int:
7: shared_img_path = sys.argv[1]
8: shared_img = open_fs("fat://" + shared_img_path + "?preserve_case=true")
9:
10: output_xml = os.environ.get("XML_OUTPUT_FILE")
...
2. Library Docs
- Library:
pyfilesystem2 - Query:
open_fs context manager close
Here is the relevant documentation for **pyfilesystem2** regarding `open_fs`, context managers, and closing filesystems:
---
## `open_fs` — Opening Filesystems
The `open_fs` function opens a filesystem from an FS URL string. It is the primary, general-purpose way to create filesystem objects.
```python
from fs import open_fs
...
| while ! mount -t dos -o case /dev/hd0 /shared; do | ||
| echo "Failed to mount /shared. Retrying..." | ||
| done |
There was a problem hiding this comment.
[Reliability] The startup script has no timeout or max-retry logic on the while mount-retry loops (lines 21-23, 26-28, 31-33). If a device never appears or a mount permanently fails, the QNX guest will spin forever. Since the QEMU test has timeout = "short" in Bazel, this will eventually be killed, but without any diagnostic output about why it timed out.
Consider adding a retry counter or timeout to avoid silent hangs:
retries=0
while ! mount -t dos -o case /dev/hd0 /shared; do
retries=$((retries + 1))
if [ $retries -ge 30 ]; then
echo "FATAL: Failed to mount /shared after 30 retries"
shutdown
fi
echo "Failed to mount /shared. Retrying..."
doneContext for Agents
The startup script has no timeout or max-retry logic on the `while` mount-retry loops (lines 21-23, 26-28, 31-33). If a device never appears or a mount permanently fails, the QNX guest will spin forever. Since the QEMU test has `timeout = "short"` in Bazel, this will eventually be killed, but without any diagnostic output about *why* it timed out.
Consider adding a retry counter or timeout to avoid silent hangs:
```bash
retries=0
while ! mount -t dos -o case /dev/hd0 /shared; do
retries=$((retries + 1))
if [ $retries -ge 30 ]; then
echo "FATAL: Failed to mount /shared after 30 retries"
shutdown
fi
echo "Failed to mount /shared. Retrying..."
done
```
File: quality/qnx_unit_testing/startup.sh
Line: 23Chain of Thought
- Validation Decision: accepted
- Confidence: 0.92
- Likelihood Score: 3/5
- Likelihood Reason: In a CI/QEMU testing environment, transient or permanent mount failures are plausible (e.g., device driver timing, image corruption). The Bazel timeout will eventually kill it, but with no diagnostic output explaining why.
- Suggested Action: keep
- Tool Calls Used: 4
Reasoning
The comment is accurate and actionable:
-
File is new in this PR:
startup.shdoes not exist in BASE, so this is entirely new code introduced by this PR. -
Infinite loops confirmed: Lines 21-23, 26-28, and 31-33 each contain
while ! <command>; do ... doneloops with no retry counter or timeout mechanism. If the command permanently fails, these loops spin forever. -
Bazel timeout confirmed: In
cc_test_qnx.bzlline 165, the test is configured withtimeout = "short", which means Bazel will eventually kill the test. However, as the comment correctly notes, this produces no diagnostic output about which mount failed or why. -
Actionable suggestion: The comment provides a concrete code example with a retry counter and diagnostic output that would help developers debug failures instead of seeing a generic timeout.
-
The
waitforcommands (lines 20, 25, 30) before each loop are also potentially infinite waits (QNXwaitfordefaults to waiting indefinitely), but the comment's focus on the while loops is still valid and independently useful.
The scenario of a mount permanently failing in a QEMU virtual environment is plausible, especially in CI where hardware emulation can be flaky.
Evidence Summary
1. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/startup.sh
1: #!/bin/sh
2:
3: slogger2
4: waitfor /dev/slog
5:
6: pci-server --config=/proc/boot/pci_server.cfg
7: waitfor /dev/pci
8:
9: pipe
10: waitfor /dev/pipe
...
2. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/run_qemu.sh
1: #!/bin/bash
2:
3: set -euo pipefail
4:
5: IFS_IMAGE=$1
6: PERSISTENT_IMAGE=$2
7: PROCESS_TESTS_RESULTS=$3
8: TEST_RESULT_IMAGE=$4
9: TEST_IMAGE=$5
10:
...
3. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/cc_test_qnx.bzl
1: # *******************************************************************************
2: # Copyright (c) 2025 Contributors to the Eclipse Foundation
3: #
4: # See the NOTICE file(s) distributed with this work for additional
5: # information regarding copyright ownership.
6: #
7: # This program and the accompanying materials are made available under the
8: # terms of the Apache License Version 2.0 which is available at
9: # https://www.apache.org/licenses/LICENSE-2.0
10: #
...
|
|
||
| # Wait for all test processes to finish | ||
| echo "Waiting for all test processes to finish..." | ||
| while pidin -F '%a %b %n' | grep cc_test_qnx > /dev/null 2>&1; do true; done |
There was a problem hiding this comment.
[Performance] This loop is a busy-wait with no sleep and will peg a CPU until tests finish. Add a short sleep inside the loop.
| while pidin -F '%a %b %n' | grep cc_test_qnx > /dev/null 2>&1; do true; done | |
| while pidin -F '%a %b %n' | grep cc_test_qnx > /dev/null 2>&1; do | |
| sleep 0.1 | |
| done |
Context for Agents
This loop is a busy-wait with no sleep and will peg a CPU until tests finish. Add a short sleep inside the loop.
```suggestion
while pidin -F '%a %b %n' | grep cc_test_qnx > /dev/null 2>&1; do
sleep 0.1
done
```
File: quality/qnx_unit_testing/run_test.sh
Line: 37Chain of Thought
- Validation Decision: accepted
- Confidence: 0.97
- Likelihood Score: 5/5
- Likelihood Reason: The busy-wait loop will inevitably peg the CPU every time this script runs, continuously spawning pidin and grep processes with no delay between iterations.
- Suggested Action: keep
- Tool Calls Used: 2
Reasoning
The comment correctly identifies a busy-wait loop at line 37 of a newly added file. The loop while pidin -F '%a %b %n' | grep cc_test_qnx > /dev/null 2>&1; do true; done has no sleep, meaning it will continuously spawn pidin and grep processes as fast as the shell can loop. This wastes CPU and is a well-known antipattern. The suggested fix of adding sleep 0.1 is standard, concrete, and appropriate. Since this is a new file (confirmed by checking BASE), this issue is introduced by this PR. The comment is accurate, actionable, and the issue is inevitable whenever the script runs.
Evidence Summary
1. File Snapshot (HEAD)
- File:
quality/qnx_unit_testing/run_test.sh - Lines:
35-38
35: # Wait for all test processes to finish
36: echo "Waiting for all test processes to finish..."
37: while pidin -F '%a %b %n' | grep cc_test_qnx > /dev/null 2>&1; do true; done
38: echo "Test processes finished"
This PR introduces QNX unit testing support and updates Python dependencies to enable test result processing and Bazel integration.