ci: move fio steps into separate job#185

Draft
riley-dixon wants to merge 2 commits into develop from rildixon/ci-fio-integration

@riley-dixon
Collaborator

Motivation

TBD - Use the FIO packages produced by ROCm/fio instead of building it separately here.

Technical Details

Test Plan

Test Result

Submission Checklist

AIHIPFILE-121

@riley-dixon riley-dixon self-assigned this Feb 12, 2026
Copilot AI review requested due to automatic review settings February 12, 2026 21:24
Contributor

Copilot AI left a comment

Pull request overview

This PR refactors the AIS system testing workflow by extracting FIO-related test steps from the main run_system_tests job into a separate run_FIO_tests job. The motivation (per the PR description) is to eventually use FIO packages produced by ROCm/fio instead of building FIO separately, though this PR currently still builds FIO from source as an intermediate step.

Changes:

  • Removed FIO repository checkout and build directory creation from the run_system_tests job
  • Added cleanup steps to the run_system_tests job to ensure proper resource cleanup
  • Created a new run_FIO_tests job that checks out the FIO repository and runs FIO-based tests

        with:
          repository: ROCm/fio
          ref: hipFile
          path: fio
Copilot AI Feb 12, 2026

The new run_FIO_tests job is missing the checkout of the hipFile repository. The FIO configure step (line 280-281) references HIPFILE=/ais/hipFile and HIPFILELIB paths, and the FIO test steps (lines 305, 316) reference /ais/hipFile/util/fio/write-read-verify.fio. Without checking out the hipFile repository, these paths won't exist and the job will fail. Add a checkout step for the hipFile repository similar to line 48-51 in the run_system_tests job.

Suggested change
-          path: fio
+          path: fio
+      - name: Fetching hipFile repository...
+        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd #v6.0.2
+        with:
+          repository: ROCm/hipFile
+          path: hipFile

Copilot uses AI. Check for mistakes.
Comment on lines +203 to +210
      - name: Download hipFile runtime package
        uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 #v7.0.0
        with:
          name: ${{ inputs.ais_hipfile_pkg_filename }}
      - name: Download hipFile development package
        uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 #v7.0.0
        with:
          name: ${{ inputs.ais_hipfile_pkg_dev_filename }}
Copilot AI Feb 12, 2026

The new run_FIO_tests job is missing the download of the hipFile build directory artifact. The FIO configure step at line 281 references HIPFILELIB=${HIPFILE}/build/src/amd_detail/ which expects the hipFile build artifacts to be present. Without downloading the hipfile-build-dir artifact (as done in line 60-64 of run_system_tests), the FIO configuration and build will fail. Add a step to download the hipfile-build-dir artifact and copy it into the container similar to lines 60-64 and 133-137 in the run_system_tests job.
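
A minimal sketch of the two steps this comment asks for, assuming the artifact is named hipfile-build-dir (as stated above), that it holds the contents of the hipFile build directory, and that /ais/hipFile already exists in the container when the copy runs. The step names and the hipfile-build download path are illustrative and are not taken from run_system_tests:

```yaml
      # Hypothetical steps for run_FIO_tests; step names and paths are assumptions.
      - name: Download hipFile build directory
        uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 #v7.0.0
        with:
          name: hipfile-build-dir
          # Assumed scratch location inside the workspace.
          path: hipfile-build
      - name: Copy the hipFile build directory into the container
        run: |
          # The trailing "/." copies the directory contents rather than the
          # directory itself. The parent /ais/hipFile must already exist.
          docker cp \
            "${GITHUB_WORKSPACE}/hipfile-build/." \
            "${AIS_CONTAINER_NAME}:/ais/hipFile/build"
```

The copy step would have to run after the container has been started and after the hipFile checkout has been copied to /ais, otherwise the destination's parent directory is missing.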

Copilot AI Feb 12, 2026

The directory creation command attempts to create /ais/hipFile/build but the hipFile repository has not been checked out or copied into the container in this job. This will fail because there is no /ais/hipFile directory. After adding the hipFile repository checkout step, ensure it's copied into the container before this step, or remove this line since the hipFile build directory artifact should be downloaded and copied separately (as done in the run_system_tests job at lines 60-64 and 133-137).

Suggested change
-              mkdir /ais/hipFile/build

        if: ${{ always() }}
        run: rm -rf ${GITHUB_WORKSPACE}/* ${GITHUB_WORKSPACE}/.*
  run_FIO_tests:
    runs-on: [linux, AIS]
Copilot AI Feb 12, 2026

The new run_FIO_tests job is missing a dependency on the run_system_tests job. Without specifying 'needs: run_system_tests', both jobs could run concurrently on the same AIS self-hosted runner, potentially causing resource contention (competing for GPU devices, disk I/O, etc.). Consider adding 'needs: run_system_tests' to ensure sequential execution and avoid resource conflicts, unless parallel execution is intentionally desired.

Suggested change
-    runs-on: [linux, AIS]
+    runs-on: [linux, AIS]
+    needs: run_system_tests

This is in preparation for calling out to the ROCm/fio workflow
to build and package FIO with hipFile support separately. This workflow
would then install the FIO packages produced instead of compiling it
separately.

This will allow us to at least sanity check the FIO packages we
produce and publish.
@riley-dixon riley-dixon force-pushed the rildixon/ci-fio-integration branch from 03afdaa to a2b2af8 Compare February 12, 2026 22:21
Copilot AI review requested due to automatic review settings February 17, 2026 23:36
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

.github/workflows/test-ais-system.yml:306

  • The PR description states "TBD - Use the FIO packages produced by ROCm/fio instead of building it separately here", but the implementation still builds FIO from source in the run_FIO_tests job (lines 283-306). The current changes create a build_FIO job that isn't being used, making this change incomplete. The implementation should either: (1) download and use pre-built FIO packages from the build_FIO job, or (2) wait to introduce the build_FIO job until the integration is complete.
      - name: Configure fio
        run: |
          docker exec \
            -t \
            -w /ais/fio/build \
            "${AIS_CONTAINER_NAME}" \
            /bin/bash -c '
              ROCM=/opt/rocm-${ROCM_VERSION} \
              HIPFILE=/ais/hipFile \
              HIPFILELIB=${HIPFILE}/build/src/amd_detail/ \
              HIP_PLATFORM=amd \
              CFLAGS="-I${ROCM}/include" \
              LDFLAGS="-L${ROCM}/lib -Wl,-rpath,${ROCM}/lib" \
              ../configure --enable-libhipfile
            '
      - name: Build fio
        run: |
          docker exec \
            -t \
            -w /ais/fio/build \
            "${AIS_CONTAINER_NAME}" \
            /bin/bash -c '
              make -j
            '

Comment on lines +190 to +192
    # Adds a level of indirection, but we no longer need to worry about
    # adding logic to allow certain jobs to have been skipped depending
    # on the platform of this current workflow, and all others downstream.
Copilot AI Feb 17, 2026

The comment states "Adds a level of indirection, but we no longer need to worry about adding logic to allow certain jobs to have been skipped depending on the platform of this current workflow, and all others downstream." However, this rationale is unclear. The comment should explain more specifically what problem this indirection solves and what "logic to allow certain jobs to have been skipped" means. Consider expanding this comment to help future maintainers understand the design decision.

Suggested change
-    # Adds a level of indirection, but we no longer need to worry about
-    # adding logic to allow certain jobs to have been skipped depending
-    # on the platform of this current workflow, and all others downstream.
+    # Delegate the FIO build to a reusable workflow so that all
+    # platform-specific build logic (including when to skip or run
+    # particular build steps for a given OS / platform) is defined
+    # in a single place. This indirection means this workflow, and
+    # any downstream workflows that depend on its artifacts, no longer
+    # need to duplicate complex `if:` conditions or handle cases where
+    # earlier platform-specific jobs were skipped.

Comment on lines 189 to 197
  build_FIO:
    # Adds a level of indirection, but we no longer need to worry about
    # adding logic to allow certain jobs to have been skipped depending
    # on the platform of this current workflow, and all others downstream.
    uses: ROCm/fio/.github/workflows/build-debian.yml@rildixon/ci-hook-for-hipfile
    with:
      ais_hipfile_pkg_filename: ${{ inputs.ais_hipfile_pkg_filename }}
      ais_hipfile_pkg_dev_filename: ${{ inputs.ais_hipfile_pkg_dev_filename }}
      platform: ${{ inputs.platform }}
Copilot AI Feb 17, 2026

The build_FIO job has no dependencies on upstream jobs. Since the reusable workflow at ROCm/fio/.github/workflows/build-debian.yml expects ais_hipfile_pkg_filename and ais_hipfile_pkg_dev_filename as inputs, this job should depend on whichever job produces those artifacts (likely needs to be configured at the caller level, similar to how AIS_system_tests depends on build_and_test in ais-ci.yml). Without proper dependencies, the artifacts may not be available when this job runs.
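
A rough sketch of what "configured at the caller level" could look like, assuming ais-ci.yml calls this workflow as a reusable workflow. Jobs inside a called workflow start only after the calling job's `needs` are satisfied, and artifacts are shared within a single workflow run, so ordering set on the caller also covers build_FIO. The build workflow filename and the output names below are hypothetical:

```yaml
# Hypothetical excerpt of ais-ci.yml; the build workflow filename and the
# output names are assumptions made for illustration.
jobs:
  build_and_test:
    uses: ./.github/workflows/build-ais.yml  # assumed filename
  AIS_system_tests:
    # Every job in test-ais-system.yml, including build_FIO, waits for this.
    needs: build_and_test
    uses: ./.github/workflows/test-ais-system.yml
    with:
      # Assumed outputs carrying the uploaded artifact names.
      ais_hipfile_pkg_filename: ${{ needs.build_and_test.outputs.ais_hipfile_pkg_filename }}
      ais_hipfile_pkg_dev_filename: ${{ needs.build_and_test.outputs.ais_hipfile_pkg_dev_filename }}
```

If the caller is already wired this way, no additional `needs` would be required on build_FIO itself.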

Copilot AI Feb 17, 2026

The reusable workflow reference uses a non-standard branch name rildixon/ci-hook-for-hipfile. For production use, this should reference a stable branch (like main, develop, or a tagged version) rather than what appears to be a personal development branch. Using personal branches in production workflows can lead to instability if the branch is deleted or force-pushed.

Suggested change
-    uses: ROCm/fio/.github/workflows/build-debian.yml@rildixon/ci-hook-for-hipfile
+    uses: ROCm/fio/.github/workflows/build-debian.yml@main

Comment on lines +212 to +282
      - name: Fetching fio repository...
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd #v6.0.2
        with:
          repository: ROCm/fio
          ref: hipFile
          path: fio
      - name: Download hipFile runtime package
        uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 #v7.0.0
        with:
          name: ${{ inputs.ais_hipfile_pkg_filename }}
      - name: Download hipFile development package
        uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 #v7.0.0
        with:
          name: ${{ inputs.ais_hipfile_pkg_dev_filename }}
      - name: Authenticating to GitHub Container Registry
        uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 #v3.7.0
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Starting Docker Container
        run: |
          docker run \
            -dt \
            --rm \
            --device=/dev/kfd \
            --device=/dev/dri \
            --security-opt seccomp=unconfined \
            --pull always \
            -v ${GITHUB_WORKSPACE}:/mnt/ais:ro \
            -v "${AIS_MOUNT_PATH}:/mnt/ais-fs" \
            --name "${AIS_CONTAINER_NAME}" \
            "${AIS_INPUT_CI_IMAGE}"
      - name: Create hipfile IO test directory
        run: |
          docker exec -t "${AIS_CONTAINER_NAME}" /bin/bash -c "mkdir -p /mnt/ais-fs/${AIS_CONTAINER_NAME}"
      - name: Make copy of the code repository and create build directories
        run: |
          docker exec \
            -t \
            "${AIS_CONTAINER_NAME}" \
            /bin/bash -c '
              cp -R /mnt/ais /ais
              mkdir /ais/fio/build
            '
      - name: Copy the hipFile packages into the container
        run: |
          docker cp \
            "${GITHUB_WORKSPACE}/${AIS_INPUT_HIPFILE_PKG_FILENAME}" \
            "${AIS_CONTAINER_NAME}:/root"
          docker cp \
            "${GITHUB_WORKSPACE}/${AIS_INPUT_HIPFILE_PKG_DEV_FILENAME}" \
            "${AIS_CONTAINER_NAME}:/root"
      - name: Install the hipFile packages
        run: |
          docker exec \
            -t \
            -w /root \
            "${AIS_CONTAINER_NAME}" \
            /bin/bash -c '
              ${{
                format(
                  env.AIS_PKG_MGR == 'apt' && 'apt install -y "./{0}" "./{1}"' ||
                  env.AIS_PKG_MGR == 'dnf' && 'dnf install -y --cacheonly "./{0}" "./{1}"' ||
                  env.AIS_PKG_MGR == 'zypper' && 'zypper --no-refresh install -y --allow-unsigned-rpm "./{0}" "./{1}"' ||
                  'echo "Unknown platform."; exit 1',
                  inputs.ais_hipfile_pkg_filename,
                  inputs.ais_hipfile_pkg_dev_filename
                )
              }}
            '
Copilot AI Feb 17, 2026

The run_FIO_tests job depends on build_FIO but never downloads the artifacts produced by that job. Instead, it checks out the FIO repository again (line 212-217) and builds FIO from source (lines 283-306). This defeats the purpose of having a separate build_FIO job. Either the job should download and use the FIO artifacts from build_FIO, or the build_FIO job and its dependency are unnecessary.
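
The first option could look roughly like the steps below, placed after the hipFile packages are installed. This is only a sketch: it assumes the reusable build-debian.yml workflow uploads a single Debian package as an artifact and exposes its name through a workflow output, here called fio_pkg_filename. That output name is hypothetical and does not come from ROCm/fio:

```yaml
      # Hypothetical steps for run_FIO_tests (which already needs build_FIO).
      # Assumption: build_FIO exposes an output named fio_pkg_filename.
      - name: Download FIO package
        uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 #v7.0.0
        with:
          name: ${{ needs.build_FIO.outputs.fio_pkg_filename }}
      - name: Copy the FIO package into the container
        env:
          FIO_PKG_FILENAME: ${{ needs.build_FIO.outputs.fio_pkg_filename }}
        run: |
          docker cp \
            "${GITHUB_WORKSPACE}/${FIO_PKG_FILENAME}" \
            "${AIS_CONTAINER_NAME}:/root"
      - name: Install the FIO package
        env:
          FIO_PKG_FILENAME: ${{ needs.build_FIO.outputs.fio_pkg_filename }}
        run: |
          # apt is assumed because the package comes from build-debian.yml.
          docker exec \
            -t \
            -w /root \
            -e "FIO_PKG_FILENAME=${FIO_PKG_FILENAME}" \
            "${AIS_CONTAINER_NAME}" \
            /bin/bash -c 'apt install -y "./${FIO_PKG_FILENAME}"'
```

With steps like these in place, the fio checkout and the Configure fio and Build fio steps in this job would no longer be needed.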

@riley-dixon riley-dixon force-pushed the rildixon/ci-fio-integration branch from 454445d to 89335b4 Compare February 17, 2026 23:45