Skip to content

Adding QNX Unit Testing Infrastructure and Python Dependency Integration#51

Open
saksham-chawla wants to merge 1 commit intocodex/replay-20260319023913-pr-200-basefrom
codex/replay-20260319023913-pr-200-head
Open

Adding QNX Unit Testing Infrastructure and Python Dependency Integration#51
saksham-chawla wants to merge 1 commit intocodex/replay-20260319023913-pr-200-basefrom
codex/replay-20260319023913-pr-200-head

Conversation

@saksham-chawla
Copy link
Collaborator

@saksham-chawla saksham-chawla commented Mar 18, 2026

This PR introduces QNX unit testing support and updates Python dependencies to enable test result processing and Bazel integration.

Copy link

@propel-code-bot propel-code-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important security, logic, and reliability fixes are needed before the QNX testing stack ships.

Status: Changes Suggested | Risk: Medium | Readiness: 5/10 (Needs Polish)

Issues Identified & Suggestions
  • Harden security and result parsing: cc_test_qnx.bzl, process_test_results.py
  • Fix rule/template and script robustness (runfiles, mounts, busy-wait): run_as_exec.bzl, init_x86_64_cc_test.build.template, quality/qnx_unit_testing/*.sh, tools.build
Review Details

📁 14 files reviewed | 💬 14 comments

👍 / 👎 individual comments to help improve reviews for you

Comment on lines +84 to +86
cmd_bash = """
echo {} > $(@)
""".format(excluded_tests_filter_str),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

[Security] The excluded_tests_filter_str is injected unsanitized into a shell echo command via cmd_bash. If the name parameter or filter strings contain shell metacharacters (spaces, quotes, $, backticks), command injection or unexpected behavior can occur.

Quote the value in the echo command:

        cmd_bash = """
        echo '{}' > $(@)
        """.format(excluded_tests_filter_str),

Or better, use ctx.actions.write instead of a genrule to avoid shell interpretation entirely.

Context for Agents
The `excluded_tests_filter_str` is injected unsanitized into a shell `echo` command via `cmd_bash`. If the `name` parameter or filter strings contain shell metacharacters (spaces, quotes, `$`, backticks), command injection or unexpected behavior can occur.

Quote the value in the echo command:
```python
        cmd_bash = """
        echo '{}' > $(@)
        """.format(excluded_tests_filter_str),
```
Or better, use `ctx.actions.write` instead of a genrule to avoid shell interpretation entirely.

File: quality/qnx_unit_testing/cc_test_qnx.bzl
Line: 86
Chain of Thought
  • Validation Decision: accepted
  • Confidence: 0.90
  • Likelihood Score: 4/5
  • Likelihood Reason: GTest filter strings commonly contain * (as shown in 2 of 3 documented examples on lines 68-70), which is a shell glob metacharacter. Without quoting, shell glob expansion will occur, leading to incorrect filter content in the output file. This is highly likely to surface in normal usage.
  • Suggested Action: keep
  • Tool Calls Used: 2

Reasoning

The comment correctly identifies a real shell injection/unexpected behavior issue in newly introduced code.

Code analysis:

  1. excluded_tests_filter_str is built by concatenating GTest filter strings (line 78-80).
  2. It's injected unquoted into a shell echo command via cmd_bash (lines 84-86): echo {} > $(@).
  3. The documented examples of filter strings include * (glob metacharacter): ["FooTest.*"], ["*FooTest.*"].

Why this is a real issue:

  • The * in GTest filters like FooTest.* will be subject to shell glob expansion. If any files in the working directory match the pattern, the echo output will contain the filenames instead of the intended filter string.
  • Additionally, the \\ at the end of the string (from line 80: ":\\") creates a trailing backslash that will escape the subsequent whitespace/newline in the shell command, potentially causing malformed output.
  • No quoting means spaces, $, backticks, and other shell metacharacters could cause command injection or unexpected behavior.

The fix is actionable and correct:

  • Using single quotes echo '{}' > $(@) would prevent shell interpretation of the value.
  • The ctx.actions.write suggestion is more robust but would require refactoring from a macro's native.genrule to a custom rule implementation. Still, it's a valid better alternative.

This is new code introduced by this PR (the file didn't exist at BASE), so it's clearly within PR scope.

Evidence Summary

1. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/cc_test_qnx.bzl
  • Lines: 75-90
75:     excluded_tests_filter = excluded_tests_filter if excluded_tests_filter else []
76: 
77:     # Convert filter list to GTest format: "-FooTest.Test1:BarTest.*"
78:     excluded_tests_filter_str = "-"
79:     for test_filter in excluded_tests_filter:
80:         excluded_tests_filter_str = excluded_tests_filter_str + (test_filter + ":\\")
81: 
82:     native.genrule(
83:         name = "{}_excluded_tests_filter".format(name),
84:         cmd_bash = """
...

Fix in Cursor

shared_img.download('coverage.tar.gz', file)

with tarfile.open(coverage_archive) as tf:
tf.extractall(output_dir)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

[Security] tarfile.extractall(output_dir) is vulnerable to path traversal (e.g., ../) from a crafted coverage.tar.gz produced by the guest, which can overwrite host files. Validate members before extracting (or use the filter argument in newer Python).

Suggested change
tf.extractall(output_dir)
for m in tf.getmembers():
dest = os.path.realpath(os.path.join(output_dir, m.name))
if not dest.startswith(os.path.realpath(output_dir) + os.sep):
raise RuntimeError("Unsafe path in coverage archive")
tf.extract(m, output_dir)
Context for Agents
`tarfile.extractall(output_dir)` is vulnerable to path traversal (e.g., `../`) from a crafted `coverage.tar.gz` produced by the guest, which can overwrite host files. Validate members before extracting (or use the `filter` argument in newer Python).

```suggestion
for m in tf.getmembers():
    dest = os.path.realpath(os.path.join(output_dir, m.name))
    if not dest.startswith(os.path.realpath(output_dir) + os.sep):
        raise RuntimeError("Unsafe path in coverage archive")
    tf.extract(m, output_dir)
```

File: quality/qnx_unit_testing/process_test_results.py
Line: 25
Chain of Thought
  • Validation Decision: accepted
  • Confidence: 0.92
  • Likelihood Score: 3/5
  • Likelihood Reason: The archive originates from a QEMU guest VM, which is typically a controlled environment in CI. However, a compromised or misconfigured guest could produce a malicious archive. The path traversal vulnerability in tarfile.extractall() is well-documented (CVE-2007-4559) and defense-in-depth makes this a reasonable concern.
  • Suggested Action: keep
  • Tool Calls Used: 3

Reasoning

  1. Issue is real and introduced by this PR: The file process_test_results.py is entirely new (not found in base commit). Line 25 calls tf.extractall(output_dir) without any member validation, which is a well-known path traversal vulnerability (CVE-2007-4559).

  2. The vulnerability is correctly described: tarfile.extractall() does not sanitize member names by default. A crafted tar archive with entries like ../../etc/malicious could write files outside output_dir. Python's own documentation warns about this.

  3. The source of the archive increases risk: The coverage.tar.gz is downloaded from a FAT filesystem image (shared_img) that represents a QEMU guest's shared storage. While the guest is likely a controlled test environment, the archive contents are not validated before extraction.

  4. The suggestion is actionable and correct: The proposed fix validates each member's resolved path against the output directory using os.path.realpath(), preventing path traversal. The alternative of using Python 3.12+'s filter parameter is also mentioned.

  5. No existing mitigations: There is no upstream validation of the tar archive contents anywhere in the code.

Evidence Summary

1. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/process_test_results.py
1: import os
2: import sys
3: import tarfile
4: from fs import open_fs
5: 
6: def check_tests_result() -> int:
7:     shared_img_path = sys.argv[1]
8:     shared_img = open_fs("fat://" + shared_img_path + "?preserve_case=true")
9: 
10:     output_xml = os.environ.get("XML_OUTPUT_FILE")
...

Fix in Cursor

domain_status=$(echo $query | awk -F' ' '{print $2}')
echo "CONFIGURE_PERSISTENCY: domain_number: $domain_number"
echo "CONFIGURE_PERSISTENCY: domain_status: $domain_status"
if [[ -n "$domain_number" && -n "$domain_status" && "$domain_number" == "$ENC_DOMAIN" && ("$domain_status" == "UNLOCKED" || "$domain_status" == "LOCKED") ]]; then

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

[Logic] The configure_persistency.sh shebang is #!/bin/sh but uses [[ ... ]] (bashism) at line 71. On a strict POSIX shell, this will fail with a syntax error. Either change the shebang to #!/bin/bash or replace [[ ... ]] with POSIX-compatible [ ... ] tests (as done in unlock_persistent at line 119).

Suggested change
if [[ -n "$domain_number" && -n "$domain_status" && "$domain_number" == "$ENC_DOMAIN" && ("$domain_status" == "UNLOCKED" || "$domain_status" == "LOCKED") ]]; then
if [ -n "$domain_number" ] && [ -n "$domain_status" ] && [ "$domain_number" = "$ENC_DOMAIN" ] && { [ "$domain_status" = "UNLOCKED" ] || [ "$domain_status" = "LOCKED" ]; }; then
Context for Agents
The `configure_persistency.sh` shebang is `#!/bin/sh` but uses `[[ ... ]]` (bashism) at line 71. On a strict POSIX shell, this will fail with a syntax error. Either change the shebang to `#!/bin/bash` or replace `[[ ... ]]` with POSIX-compatible `[ ... ]` tests (as done in `unlock_persistent` at line 119).

```suggestion
    if [ -n "$domain_number" ] && [ -n "$domain_status" ] && [ "$domain_number" = "$ENC_DOMAIN" ] && { [ "$domain_status" = "UNLOCKED" ] || [ "$domain_status" = "LOCKED" ]; }; then
```

File: quality/qnx_unit_testing/configure_persistency.sh
Line: 71
Chain of Thought
  • Validation Decision: accepted
  • Confidence: 0.97
  • Likelihood Score: 5/5
  • Likelihood Reason: The script has #!/bin/sh shebang, and line 71 uses [[ ... ]] which is a bash-only construct. On any strict POSIX sh (e.g., dash on Debian/Ubuntu, or the QNX sh this script targets), this will fail with a syntax error. This is inevitable if the script is ever run by a non-bash /bin/sh.
  • Suggested Action: keep
  • Tool Calls Used: 1

Reasoning

The comment is accurate on all counts:

  1. Shebang is #!/bin/sh (line 1) — confirmed in the diff. This declares the script should be interpreted by a POSIX-compatible shell.

  2. Line 71 uses [[ ... ]] — confirmed: if [[ -n "$domain_number" && -n "$domain_status" && "$domain_number" == "$ENC_DOMAIN" && ("$domain_status" == "UNLOCKED" || "$domain_status" == "LOCKED") ]]; then. The [[ ... ]] construct is a bash/ksh extension not defined in POSIX sh.

  3. Line 119 uses POSIX [ ... ] — confirmed: if [ -n "$domain_number" ] && [ -n "$domain_status" ] && .... This shows inconsistency within the same file.

  4. The file is entirely new (all 165 lines added) — this issue is introduced by this PR, not legacy debt.

  5. The suggested fix is correct — it properly converts [[ ... ]] to chained [ ... ] tests with && and uses { ...; } grouping for the || condition, and correctly uses = instead of == for POSIX string comparison.

This is a real bug that would cause a syntax error on strict POSIX shells. The comment is specific, actionable, and provides a concrete code suggestion.

Evidence Summary

1. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/configure_persistency.sh
  • Lines: 119-119
119:     if [ -n "$domain_number" ] && [ -n "$domain_status" ] && [ "$domain_number" == "$ENC_DOMAIN" ] && [ "$domain_status" == "LOCKED" ] && [ "$domain_status" != "UNLOCKED" ]; then

Fix in Cursor

Comment on lines +40 to +49
cam-disk.so
fs-qnx6.so
io-blk.so
libc++.so.2
libc.so
libc.so.6
libcam.so.2
libgcc_s.so.1
libm.so
libsecpol.so.1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

[Logic] The libsecpol.so.1 library listed in the template's shared libraries section is absent from tools.build. If tools.build is the canonical shared library list used by the actual test images (via cc_test_qnx.bzl), tests depending on libsecpol.so.1 will fail with missing library errors at runtime. Conversely, tools.build includes several libraries (libqcrypto.so.1.0, libqh.so, libregex.so.1, libslog2.so.1, libslog2parse.so.1) that the template does not. Verify which file is authoritative and reconcile the lists.

Context for Agents
The `libsecpol.so.1` library listed in the template's shared libraries section is absent from `tools.build`. If `tools.build` is the canonical shared library list used by the actual test images (via `cc_test_qnx.bzl`), tests depending on `libsecpol.so.1` will fail with missing library errors at runtime. Conversely, `tools.build` includes several libraries (`libqcrypto.so.1.0`, `libqh.so`, `libregex.so.1`, `libslog2.so.1`, `libslog2parse.so.1`) that the template does not. Verify which file is authoritative and reconcile the lists.

File: quality/qnx_unit_testing/init_x86_64_cc_test.build.template
Line: 49
Chain of Thought
  • Validation Decision: accepted
  • Confidence: 0.65
  • Likelihood Score: 3/5
  • Likelihood Reason: The discrepancy between the two library lists is real and could cause confusion or runtime failures if the wrong file is used as reference. However, the comment's framing about tools.build being used by cc_test_qnx.bzl is incorrect - the template is the authoritative file for test images.
  • Suggested Action: modify
  • Tool Calls Used: 6

Reasoning

The comment identifies a real discrepancy between init_x86_64_cc_test.build.template and tools.build regarding shared library lists. The factual claims about which libraries are present/absent in each file are correct:

  • libsecpol.so.1 is in the template but not in tools.build
  • libqcrypto.so.1.0, libqh.so, libregex.so.1, libslog2.so.1, libslog2parse.so.1 are in tools.build but not the template

However, the comment's framing is partially incorrect. It states "If tools.build is the canonical shared library list used by the actual test images (via cc_test_qnx.bzl)" — but examining cc_test_qnx.bzl (line 132-136), the qnx_ifs rule for test images uses build_file = "//quality/qnx_unit_testing:init_build_cc_test" which is the expanded template. It does NOT reference tools.build at all.

tools.build is only used by the init_shell debugging image (BUILD lines 82-101), where it's included as a source with mapping TOOLS_BUILDFILE.

So the two files serve different purposes:

  • Template → test images (via cc_test_qnx) and shell images
  • tools.build → only shell/debug images (as an additional source)

The discrepancy is worth flagging since both files were introduced in this PR, serve overlapping purposes for QNX image building, and have different library sets which could lead to maintenance confusion. But the specific claim about tests failing because libsecpol.so.1 is missing from tools.build is misleading since tools.build isn't used for test images.

The recommendation to reconcile the lists is still actionable and valid given both are new files in this PR.

Evidence Summary

1. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/cc_test_qnx.bzl
1: # *******************************************************************************
2: # Copyright (c) 2025 Contributors to the Eclipse Foundation
3: #
4: # See the NOTICE file(s) distributed with this work for additional
5: # information regarding copyright ownership.
6: #
7: # This program and the accompanying materials are made available under the
8: # terms of the Apache License Version 2.0 which is available at
9: # https://www.apache.org/licenses/LICENSE-2.0
10: #
...

2. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/tools.build
1: # dynamic loader
2: /usr/lib/ldqnx-64.so.2=ldqnx-64.so.2
3: 
4: # shared libraries (QNX 8.0 x86_64)
5: cam-disk.so
6: fs-qnx6.so
7: io-blk.so
8: libc++.so.2
9: libc.so
10: libc.so.6
...

3. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/BUILD
1: # *******************************************************************************
2: # Copyright (c) 2025 Contributors to the Eclipse Foundation
3: #
4: # See the NOTICE file(s) distributed with this work for additional
5: # information regarding copyright ownership.
6: #
7: # This program and the accompanying materials are made available under the
8: # terms of the Apache License Version 2.0 which is available at
9: # https://www.apache.org/licenses/LICENSE-2.0
10: #
...

Fix in Cursor

configure_persistency.sh=${CONFIGURE_PERSISTENCY_SCRIPT}

# Include test package filesystem
/=${TESTS}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

[Logic] /=${TESTS} is mandatory in this template, but init_shell doesn’t pass a tars entry for TESTS, so mkifs will fail with an undefined variable. Make it optional ([-optional]) or pass a TESTS tar for init_shell.

Context for Agents
`/=${TESTS}` is mandatory in this template, but `init_shell` doesn’t pass a `tars` entry for `TESTS`, so mkifs will fail with an undefined variable. Make it optional (`[-optional]`) or pass a TESTS tar for `init_shell`.

File: quality/qnx_unit_testing/init_x86_64_cc_test.build.template
Line: 71
Chain of Thought
  • Validation Decision: accepted
  • Confidence: 0.00
  • Likelihood Score: 0/5
  • Suggested Action: keep
  • Tool Calls Used: 10

Reasoning

Failed to parse validation response

Evidence Summary

1. File Lookup

  • Filename: run_qemu_shell.sh
No files found matching 'run_qemu_shell.sh'

2. File Lookup

  • Filename: cc_test_qnx.bzl
No files found matching 'cc_test_qnx.bzl'

3. File Lookup

  • Filename: run_qemu.sh
No files found matching 'run_qemu.sh'

4. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/BUILD
1: # *******************************************************************************
2: # Copyright (c) 2025 Contributors to the Eclipse Foundation
3: #
4: # See the NOTICE file(s) distributed with this work for additional
5: # information regarding copyright ownership.
6: #
7: # This program and the accompanying materials are made available under the
8: # terms of the Apache License Version 2.0 which is available at
9: # https://www.apache.org/licenses/LICENSE-2.0
10: #
...

5. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/cc_test_qnx.bzl
1: # *******************************************************************************
2: # Copyright (c) 2025 Contributors to the Eclipse Foundation
3: #
4: # See the NOTICE file(s) distributed with this work for additional
5: # information regarding copyright ownership.
6: #
7: # This program and the accompanying materials are made available under the
8: # terms of the Apache License Version 2.0 which is available at
9: # https://www.apache.org/licenses/LICENSE-2.0
10: #
...

6. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/run_qemu.sh
1: #!/bin/bash
2: 
3: set -euo pipefail
4: 
5: IFS_IMAGE=$1
6: PERSISTENT_IMAGE=$2
7: PROCESS_TESTS_RESULTS=$3
8: TEST_RESULT_IMAGE=$4
9: TEST_IMAGE=$5
10: 
...

7. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/run_qemu_shell.sh
1: #!/bin/bash
2: 
3: set -euo pipefail
4: 
5: IFS_IMAGE=$1
6: PERSISTENT_IMAGE=$2
7: TEST_RESULT_IMAGE=$3
8: TEST_IMAGE=$4
9: 
10: writable_img=$(mktemp)
...

Fix in Cursor

echo "CONFIGURE_PERSISTENCY: $PERSISTENT_PARTITION_MOUNT_POINT already mounted, returning"
else
echo "CONFIGURE_PERSISTENCY: $PERSISTENT_PARTITION_MOUNT_POINT not mounted, hence mounting"
mount -t qnx6 -ocrypto=openssl "$PERSISTENT_PARTITION" "$PERSISTENT_PARTITION_MOUNT_POINT"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

[Reliability] mount_qnx6fs_with_crypto assumes $PERSISTENT_PARTITION_MOUNT_POINT exists; if the directory isn’t created in the IFS, the mount fails and the caller retries forever. Create the mount point before mounting.

Suggested change
mount -t qnx6 -ocrypto=openssl "$PERSISTENT_PARTITION" "$PERSISTENT_PARTITION_MOUNT_POINT"
mkdir -p "$PERSISTENT_PARTITION_MOUNT_POINT"
mount -t qnx6 -ocrypto=openssl "$PERSISTENT_PARTITION" "$PERSISTENT_PARTITION_MOUNT_POINT"
Context for Agents
`mount_qnx6fs_with_crypto` assumes `$PERSISTENT_PARTITION_MOUNT_POINT` exists; if the directory isn’t created in the IFS, the mount fails and the caller retries forever. Create the mount point before mounting.

```suggestion
mkdir -p "$PERSISTENT_PARTITION_MOUNT_POINT"
mount -t qnx6 -ocrypto=openssl "$PERSISTENT_PARTITION" "$PERSISTENT_PARTITION_MOUNT_POINT"
```

File: quality/qnx_unit_testing/configure_persistency.sh
Line: 21
Chain of Thought
  • Validation Decision: accepted
  • Confidence: 0.82
  • Likelihood Score: 3/5
  • Likelihood Reason: The mount point directory /qnx6fs_persistent is not created anywhere in the script or visibly in the IFS build template. If the QNX IFS doesn't implicitly create it (which isn't guaranteed), the mount will fail and the caller retries forever. It's plausible this directory exists in the IFS via some mechanism not visible in the template, but the defensive mkdir -p is still the right approach.
  • Suggested Action: keep
  • Tool Calls Used: 5

Reasoning

The comment correctly identifies a real issue in newly introduced code:

  1. New file: configure_persistency.sh is entirely new in this PR (is_new is effectively true — 0 lines before, 165 after).

  2. Missing directory creation: The mount_qnx6fs_with_crypto() function at line 21 runs mount -t qnx6 -ocrypto=openssl "$PERSISTENT_PARTITION" "$PERSISTENT_PARTITION_MOUNT_POINT" where $PERSISTENT_PARTITION_MOUNT_POINT is /qnx6fs_persistent. The mount command on Unix-like systems (including QNX) requires the mount point directory to already exist. There is no mkdir -p before this mount call.

  3. IFS template check: The init_x86_64_cc_test.build.template does not show /qnx6fs_persistent being created in the IFS. While QNX IFS can include implicit directory creation, there's no evidence of it here.

  4. Infinite retry confirmed: In startup.sh lines 26-28, configure_persistency.sh is called inside while ! configure_persistency.sh; do ... done. If the mount point directory doesn't exist, every invocation fails, and the loop retries forever since the condition never changes.

  5. Actionable fix: The suggestion to add mkdir -p "$PERSISTENT_PARTITION_MOUNT_POINT" before the mount command is concrete, correct, and minimal.

The only uncertainty is whether the /qnx6fs_persistent directory might be created by some other mechanism in the QNX boot process not visible in these files. But the defensive mkdir -p is still good practice and has no downside.

Evidence Summary

1. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/run_qemu.sh
1: #!/bin/bash
2: 
3: set -euo pipefail
4: 
5: IFS_IMAGE=$1
6: PERSISTENT_IMAGE=$2
7: PROCESS_TESTS_RESULTS=$3
8: TEST_RESULT_IMAGE=$4
9: TEST_IMAGE=$5
10: 
...

2. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/startup.sh
1: #!/bin/sh
2: 
3: slogger2
4: waitfor /dev/slog
5: 
6: pci-server --config=/proc/boot/pci_server.cfg
7: waitfor /dev/pci
8: 
9: pipe
10: waitfor /dev/pipe
...

3. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/configure_persistency.sh
1: #!/bin/sh
2: 
3: set -eu
4: 
5: PERSISTENT_PARTITION="/dev/hd1"
6: PERSISTENT_PARTITION_MOUNT_POINT="/qnx6fs_persistent"
7: PERSISTENT_ENCRYPTION_MOUNT_POINT=/persistent
8: QEMU=1
9: 
10: ENC_DOMAIN=3
...

4. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/init_x86_64_cc_test.build.template
1: ###############################################################################
2: #
3: # QNX 8.0 Unit Test Image for x86_64
4: #
5: ###############################################################################
6: 
7: [-optional +autolink]
8: 
9: [image=0x3600000]
10: [virtual=x86_64,multiboot] boot = {
...

Fix in Cursor

-drive file="${TEST_IMAGE}",if=virtio,format=raw,index=2,media=disk,readonly=on \
2>&1 | sed 's/[^[:print:]]//g' | sed 's/\r//'

${PROCESS_TESTS_RESULTS} "${writable_img}"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

[Reliability] With set -euo pipefail (line 3), if ${PROCESS_TESTS_RESULTS} on this line returns a non-zero exit code (which happens whenever a test fails, since process_test_results.py returns the test's return code), the script immediately exits due to set -e. This means return_code=$? on the next line never executes, and more critically, the rm -rf cleanup on lines 49-50 never runs, leaking the temporary files created by mktemp on lines 11 and 14.

The fix is to capture the exit code in a way that doesn't trigger set -e, e.g.:

${PROCESS_TESTS_RESULTS} "${writable_img}" && return_code=0 || return_code=$?

Or use a trap for cleanup.

Context for Agents
[**Bug**] With `set -euo pipefail` (line 3), if `${PROCESS_TESTS_RESULTS}` on this line returns a non-zero exit code (which happens whenever a test fails, since `process_test_results.py` returns the test's return code), the script immediately exits due to `set -e`. This means `return_code=$?` on the next line never executes, and more critically, the `rm -rf` cleanup on lines 49-50 never runs, leaking the temporary files created by `mktemp` on lines 11 and 14.

The fix is to capture the exit code in a way that doesn't trigger `set -e`, e.g.:
```
${PROCESS_TESTS_RESULTS} "${writable_img}" && return_code=0 || return_code=$?
```
Or use a `trap` for cleanup.

File: quality/qnx_unit_testing/run_qemu.sh
Line: 46
Chain of Thought
  • Validation Decision: accepted
  • Confidence: 0.97
  • Likelihood Score: 5/5
  • Likelihood Reason: This will happen every time a test fails, which is the primary failure path this script is designed to handle. The process_test_results.py script explicitly exits with the test's return code via sys.exit(), so any test failure triggers the bug.
  • Suggested Action: keep
  • Tool Calls Used: 2

Reasoning

The comment correctly identifies a real bug introduced by this PR:

  1. set -euo pipefail on line 3: Confirmed. The -e flag causes immediate script exit on any command returning non-zero.

  2. Line 46 can return non-zero: process_test_results.py does sys.exit(check_tests_result()) where check_tests_result() returns int(shared_img.readtext('returncode.log')). When a test fails, this will be a non-zero value, causing the Python process to exit non-zero.

  3. set -e triggers before line 47: With set -e, when line 46 returns non-zero, bash exits the script immediately. return_code=$? on line 47 never executes, and critically, the cleanup on lines 49-50 (rm -rf of temp files) never runs.

  4. Temp file leak: mktemp on lines 11 and 14 creates temporary files that are never cleaned up when tests fail.

  5. New file: This file is entirely new in this PR (before_change_line_count: 0), so the issue is directly introduced by this PR.

The suggested fix (${PROCESS_TESTS_RESULTS} "${writable_img}" && return_code=0 || return_code=$?) is the standard bash idiom for capturing exit codes without triggering set -e. The alternative suggestion of using a trap for cleanup is also valid. Both are concrete, actionable fixes.

Evidence Summary

1. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/run_qemu.sh
1: #!/bin/bash
2: 
3: set -euo pipefail
4: 
5: IFS_IMAGE=$1
6: PERSISTENT_IMAGE=$2
7: PROCESS_TESTS_RESULTS=$3
8: TEST_RESULT_IMAGE=$4
9: TEST_IMAGE=$5
10: 
...

2. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/process_test_results.py
1: import os
2: import sys
3: import tarfile
4: from fs import open_fs
5: 
6: def check_tests_result() -> int:
7:     shared_img_path = sys.argv[1]
8:     shared_img = open_fs("fat://" + shared_img_path + "?preserve_case=true")
9: 
10:     output_xml = os.environ.get("XML_OUTPUT_FILE")
...

Fix in Cursor


def check_tests_result() -> int:
shared_img_path = sys.argv[1]
shared_img = open_fs("fat://" + shared_img_path + "?preserve_case=true")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommended

[Reliability] The open_fs resource is never closed. If check_tests_result raises an exception partway through, the FAT filesystem handle leaks. Use a context manager:

Suggested change
shared_img = open_fs("fat://" + shared_img_path + "?preserve_case=true")
with open_fs("fat://" + shared_img_path + "?preserve_case=true") as shared_img:

Then indent the rest of the function body under this with block.

Context for Agents
The `open_fs` resource is never closed. If `check_tests_result` raises an exception partway through, the FAT filesystem handle leaks. Use a context manager:

```suggestion
    with open_fs("fat://" + shared_img_path + "?preserve_case=true") as shared_img:
```
Then indent the rest of the function body under this `with` block.

File: quality/qnx_unit_testing/process_test_results.py
Line: 8
Chain of Thought
  • Validation Decision: accepted
  • Confidence: 0.97
  • Likelihood Score: 4/5
  • Likelihood Reason: The filesystem handle is never closed (no .close() call, no context manager). Any exception in the function body (e.g., missing 'returncode.log', failed download, missing env vars causing NoneType errors) will leak the FAT filesystem handle. Even on the happy path, the handle is never explicitly closed.
  • Suggested Action: keep
  • Tool Calls Used: 3

Reasoning

  1. Accuracy: The comment correctly identifies that open_fs() at line 8 returns a filesystem object that is never closed. There is no shared_img.close() call anywhere in the function, and no with statement wrapping it.

  2. Library documentation confirms: The pyfilesystem2 docs explicitly state: "Always close filesystems — either via a with statement (preferred) or by calling .close() explicitly." The docs recommend the context manager pattern (with open_fs(...) as fs:) as the preferred approach, which is exactly what the comment suggests.

  3. New code introduced by this PR: The file process_test_results.py does not exist in the base commit — it is entirely new in this PR. So this is not pre-existing debt.

  4. Real risk of leaking: Multiple operations in the function body can raise exceptions (e.g., shared_img.exists(), shared_img.download(), shared_img.readtext(), open() calls, int() conversion). If any of these fail, the FAT filesystem handle leaks. Even on the happy path, the handle is never closed before the function returns.

  5. Actionable: The comment provides a concrete code suggestion — wrap the open_fs call in a with statement and indent the body accordingly. This is a clear, specific fix.

Evidence Summary

1. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/process_test_results.py
1: import os
2: import sys
3: import tarfile
4: from fs import open_fs
5: 
6: def check_tests_result() -> int:
7:     shared_img_path = sys.argv[1]
8:     shared_img = open_fs("fat://" + shared_img_path + "?preserve_case=true")
9: 
10:     output_xml = os.environ.get("XML_OUTPUT_FILE")
...

2. Library Docs

  • Library: pyfilesystem2
  • Query: open_fs context manager close
Here is the relevant documentation for **pyfilesystem2** regarding `open_fs`, context managers, and closing filesystems:

---

## `open_fs` — Opening Filesystems

The `open_fs` function opens a filesystem from an FS URL string. It is the primary, general-purpose way to create filesystem objects.

```python
from fs import open_fs
...

Fix in Cursor

Comment on lines +21 to +23
while ! mount -t dos -o case /dev/hd0 /shared; do
echo "Failed to mount /shared. Retrying..."
done

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommended

[Reliability] The startup script has no timeout or max-retry logic on the while mount-retry loops (lines 21-23, 26-28, 31-33). If a device never appears or a mount permanently fails, the QNX guest will spin forever. Since the QEMU test has timeout = "short" in Bazel, this will eventually be killed, but without any diagnostic output about why it timed out.

Consider adding a retry counter or timeout to avoid silent hangs:

retries=0
while ! mount -t dos -o case /dev/hd0 /shared; do
    retries=$((retries + 1))
    if [ $retries -ge 30 ]; then
        echo "FATAL: Failed to mount /shared after 30 retries"
        shutdown
    fi
    echo "Failed to mount /shared. Retrying..."
done
Context for Agents
The startup script has no timeout or max-retry logic on the `while` mount-retry loops (lines 21-23, 26-28, 31-33). If a device never appears or a mount permanently fails, the QNX guest will spin forever. Since the QEMU test has `timeout = "short"` in Bazel, this will eventually be killed, but without any diagnostic output about *why* it timed out.

Consider adding a retry counter or timeout to avoid silent hangs:
```bash
retries=0
while ! mount -t dos -o case /dev/hd0 /shared; do
    retries=$((retries + 1))
    if [ $retries -ge 30 ]; then
        echo "FATAL: Failed to mount /shared after 30 retries"
        shutdown
    fi
    echo "Failed to mount /shared. Retrying..."
done
```

File: quality/qnx_unit_testing/startup.sh
Line: 23
Chain of Thought
  • Validation Decision: accepted
  • Confidence: 0.92
  • Likelihood Score: 3/5
  • Likelihood Reason: In a CI/QEMU testing environment, transient or permanent mount failures are plausible (e.g., device driver timing, image corruption). The Bazel timeout will eventually kill it, but with no diagnostic output explaining why.
  • Suggested Action: keep
  • Tool Calls Used: 4

Reasoning

The comment is accurate and actionable:

  1. File is new in this PR: startup.sh does not exist in BASE, so this is entirely new code introduced by this PR.

  2. Infinite loops confirmed: Lines 21-23, 26-28, and 31-33 each contain while ! <command>; do ... done loops with no retry counter or timeout mechanism. If the command permanently fails, these loops spin forever.

  3. Bazel timeout confirmed: In cc_test_qnx.bzl line 165, the test is configured with timeout = "short", which means Bazel will eventually kill the test. However, as the comment correctly notes, this produces no diagnostic output about which mount failed or why.

  4. Actionable suggestion: The comment provides a concrete code example with a retry counter and diagnostic output that would help developers debug failures instead of seeing a generic timeout.

  5. The waitfor commands (lines 20, 25, 30) before each loop are also potentially infinite waits (QNX waitfor defaults to waiting indefinitely), but the comment's focus on the while loops is still valid and independently useful.

The scenario of a mount permanently failing in a QEMU virtual environment is plausible, especially in CI where hardware emulation can be flaky.

Evidence Summary

1. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/startup.sh
1: #!/bin/sh
2: 
3: slogger2
4: waitfor /dev/slog
5: 
6: pci-server --config=/proc/boot/pci_server.cfg
7: waitfor /dev/pci
8: 
9: pipe
10: waitfor /dev/pipe
...

2. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/run_qemu.sh
1: #!/bin/bash
2: 
3: set -euo pipefail
4: 
5: IFS_IMAGE=$1
6: PERSISTENT_IMAGE=$2
7: PROCESS_TESTS_RESULTS=$3
8: TEST_RESULT_IMAGE=$4
9: TEST_IMAGE=$5
10: 
...

3. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/cc_test_qnx.bzl
1: # *******************************************************************************
2: # Copyright (c) 2025 Contributors to the Eclipse Foundation
3: #
4: # See the NOTICE file(s) distributed with this work for additional
5: # information regarding copyright ownership.
6: #
7: # This program and the accompanying materials are made available under the
8: # terms of the Apache License Version 2.0 which is available at
9: # https://www.apache.org/licenses/LICENSE-2.0
10: #
...

Fix in Cursor


# Wait for all test processes to finish
echo "Waiting for all test processes to finish..."
while pidin -F '%a %b %n' | grep cc_test_qnx > /dev/null 2>&1; do true; done

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommended

[Performance] This loop is a busy-wait with no sleep and will peg a CPU until tests finish. Add a short sleep inside the loop.

Suggested change
while pidin -F '%a %b %n' | grep cc_test_qnx > /dev/null 2>&1; do true; done
while pidin -F '%a %b %n' | grep cc_test_qnx > /dev/null 2>&1; do
sleep 0.1
done
Context for Agents
This loop is a busy-wait with no sleep and will peg a CPU until tests finish. Add a short sleep inside the loop.

```suggestion
while pidin -F '%a %b %n' | grep cc_test_qnx > /dev/null 2>&1; do
    sleep 0.1
done
```

File: quality/qnx_unit_testing/run_test.sh
Line: 37
Chain of Thought
  • Validation Decision: accepted
  • Confidence: 0.97
  • Likelihood Score: 5/5
  • Likelihood Reason: The busy-wait loop will inevitably peg the CPU every time this script runs, continuously spawning pidin and grep processes with no delay between iterations.
  • Suggested Action: keep
  • Tool Calls Used: 2

Reasoning

The comment correctly identifies a busy-wait loop at line 37 of a newly added file. The loop while pidin -F '%a %b %n' | grep cc_test_qnx > /dev/null 2>&1; do true; done has no sleep, meaning it will continuously spawn pidin and grep processes as fast as the shell can loop. This wastes CPU and is a well-known antipattern. The suggested fix of adding sleep 0.1 is standard, concrete, and appropriate. Since this is a new file (confirmed by checking BASE), this issue is introduced by this PR. The comment is accurate, actionable, and the issue is inevitable whenever the script runs.

Evidence Summary

1. File Snapshot (HEAD)

  • File: quality/qnx_unit_testing/run_test.sh
  • Lines: 35-38
35: # Wait for all test processes to finish
36: echo "Waiting for all test processes to finish..."
37: while pidin -F '%a %b %n' | grep cc_test_qnx > /dev/null 2>&1; do true; done
38: echo "Test processes finished"

Fix in Cursor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants