Symbol missing issue with 1.3 version onwards in SLES and Intel Datacenter Max GPU on Aurora #113

@servesh

Description

We are seeing an undefined-symbol error when running xpu-smi versions from 1.3 onwards (tested with 1.3.2):

$ ./xpu-smi
./xpu-smi: symbol lookup error: /home/servesh/Tests/xpu-smi/usr/lib64/libxpum.so.1: undefined symbol: _ZN6spdlog7details15periodic_workerD1Ev
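
To see which library the missing symbol belongs to, the mangled name can be demangled. This is a minimal sketch that assumes `c++filt` from binutils is installed; the variable names are only for illustration.

```shell
# Mangled symbol copied from the loader error above.
sym='_ZN6spdlog7details15periodic_workerD1Ev'

# c++filt turns the Itanium C++ ABI mangled name back into a readable one.
demangled=$(printf '%s\n' "$sym" | c++filt)
echo "$demangled"
# prints: spdlog::details::periodic_worker::~periodic_worker()
```

The result is the destructor of `spdlog::details::periodic_worker`, so the unresolved symbol comes from spdlog.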

There seems to be a dependency on spdlog that is not listed in the RPM requirements:

$ rpm -qpR ./xpu-smi-1.3.2-20250825.080849.0d946904.x86_64.rpm
warning: ./xpu-smi-1.3.2-20250825.080849.0d946904.x86_64.rpm: Header V4 RSA/SHA256 Signature, key ID 1b79da4d: NOKEY
/bin/sh
/bin/sh
/bin/sh
/bin/sh
/bin/sh
/bin/sh
intel-gsc >= 0.9.5
intel-level-zero-gpu >= 1.3.23726
level-zero >= 1.7.9.1
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
rpmlib(PayloadIsXz) <= 5.2-1

Nor does it show up in the shared-library dependencies reported by ldd, for either the binary or libxpum.so.1:

$ ldd ./xpu-smi
	linux-vdso.so.1 (0x00007ffed0d29000)
	libigsc.so.0 => /usr/lib64/libigsc.so.0 (0x00007f2d8a95e000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f2d8a959000)
	libxpum.so.1 => /home/servesh/Tests/xpu-smi/usr/lib64/libxpum.so.1 (0x00007f2d8a374000)
	libstdc++.so.6 => /opt/aurora/25.190.0/spack/unified/0.10.0/install/linux-sles15-x86_64/gcc-13.3.0/gcc-13.3.0-4enwbrb/lib64/libstdc++.so.6 (0x00007f2d8a115000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f2d89fc8000)
	libgcc_s.so.1 => /opt/aurora/25.190.0/spack/unified/0.10.0/install/linux-sles15-x86_64/gcc-13.3.0/gcc-13.3.0-4enwbrb/lib64/libgcc_s.so.1 (0x00007f2d89fa3000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f2d89f7f000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f2d89d8a000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f2d8a9a9000)
	libmetee.so.5.0.0 => /usr/lib64/libmetee.so.5.0.0 (0x00007f2d89d80000)
	libudev.so.1 => /usr/lib64/libudev.so.1 (0x00007f2d89d49000)
	libze_loader.so.1 => /usr/lib64/libze_loader.so.1 (0x00007f2d89c1f000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f2d89c15000)

$ ldd /home/servesh/Tests/xpu-smi/usr/lib64/libxpum.so.1
	linux-vdso.so.1 (0x00007fff489f2000)
	libze_loader.so.1 => /usr/lib64/libze_loader.so.1 (0x00007f5e5032e000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f5e50329000)
	libigsc.so.0 => /usr/lib64/libigsc.so.0 (0x00007f5e502f9000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f5e502d5000)
	libstdc++.so.6 => /opt/aurora/25.190.0/spack/unified/0.10.0/install/linux-sles15-x86_64/gcc-13.3.0/gcc-13.3.0-4enwbrb/lib64/libstdc++.so.6 (0x00007f5e50076000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f5e4ff29000)
	libgcc_s.so.1 => /opt/aurora/25.190.0/spack/unified/0.10.0/install/linux-sles15-x86_64/gcc-13.3.0/gcc-13.3.0-4enwbrb/lib64/libgcc_s.so.1 (0x00007f5e4ff04000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f5e4fd0f000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f5e50a56000)
	libmetee.so.5.0.0 => /usr/lib64/libmetee.so.5.0.0 (0x00007f5e4fd05000)
	libudev.so.1 => /usr/lib64/libudev.so.1 (0x00007f5e4fcce000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f5e4fcc2000)
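
One way to cross-check the two listings is to ask the library which spdlog symbols it leaves undefined and which libraries it records as needed. This is a rough sketch that assumes GNU binutils (`nm`, `readelf`) is available; the helper function names are made up for illustration, and the path is the one from this report.

```shell
# Path taken from the report above; adjust for your install.
lib=/home/servesh/Tests/xpu-smi/usr/lib64/libxpum.so.1

# Demangled spdlog symbols that the library expects another object to provide.
undefined_spdlog_symbols() {
    nm -D --demangle --undefined-only "$1" | grep 'spdlog::'
}

# Shared libraries recorded as NEEDED entries in the dynamic section.
needed_libraries() {
    readelf -d "$1" | grep 'NEEDED'
}

# Only run the checks when the library is actually present.
if [ -e "$lib" ]; then
    undefined_spdlog_symbols "$lib"
    needed_libraries "$lib"
fi
```

If the first helper prints spdlog symbols while the second shows no libspdlog entry, the library references spdlog without declaring it as a dependency, which would match the ldd output above.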

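Adding the spdlog library directory to LD_LIBRARY_PATH does not help, presumably because libxpum.so.1 does not list libspdlog as a needed library, so the loader never searches for it. The only way I have found to get things working is to LD_PRELOAD the spdlog library: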

$ LD_PRELOAD=/opt/aurora/25.190.0/spack/unified/0.10.0/install/linux-sles15-x86_64/gcc-13.3.0/spdlog-1.10.0-kjmwhnz/lib64/libspdlog.so ./xpu-smi
Intel XPU System Management Interface -- v1.3
Intel XPU System Management Interface provides the Intel datacenter GPU model. It can also be used to update the firmware.
Intel XPU System Management Interface is based on Intel oneAPI Level Zero. Before using Intel XPU System Management Interface, the GPU driver and Intel oneAPI Level Zero should be installed rightly.

It would also be good to understand why xpu-smi is deprecated for SLES and Intel Datacenter Max GPU. Aurora is the largest installation of Intel GPUs, and both the drivers and the oneAPI SDK will continue to support the system for at least a few more years. Many users rely on xpu-smi to access counter data, so continued support is crucial. If we run into a bug or other issue at a later date, we will be stuck with an older version.
