Open
Description
Environment
- DCGM Version: 3.3.6
- GPU: NVIDIA Datacenter GPUs (8-GPU system)
Steps to Reproduce
- Run EUD diagnostics:
$ sudo dcgmi diag -r eud -p "eud.suite_level=4,eud.passthrough_args='run_tests=compute,memory,hsio'"
- Parse the detail log files:
$ sudo /usr/share/nvidia/diagnostic/mla /var/log/nvidia-dcgm/dcgm_eud.mle
Error populating db! Check /var/log/nvidia-dcgm/dcgm_eud.debug for more information.
/var/log/nvidia-dcgm/dcgm_eud.mle did not parse correctly.
Attempting to analyze partial data!
No content available to write to reportType bgl for file /var/log/nvidia-dcgm/dcgm_eud.mle
RC: Failed to find Mle header
$ cat /var/log/nvidia-dcgm/dcgm_eud.debug
[<func> ]: MODS LOG ANALYZER DEBUG LOG: Thu Jan 22 19:14:26 2026
[<func> ]: MLA Version: 20.152
[<func> ]: Processing file '/var/log/nvidia-dcgm/dcgm_eud.mle' and writing to '/var/log/nvidia-dcgm/dcgm_eud.mle'
[<func> ]: Error: could not find MLE header information!
[<func> ]: Log did not parse correctly. MLA will attempt to analyze partial data!
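Since MLA reports a missing header, a quick sanity check on the file itself may help narrow this down. The following is a minimal sketch of a generic shell helper; the name `inspect_log` is hypothetical and not part of DCGM or MLA. It only reports whether the file exists, how large it is, and what its first bytes look like, and it makes no assumption about what a valid MLE header contains.

```shell
#!/bin/sh
# Hypothetical helper (not part of DCGM or MLA): report basic facts about a
# log file so that an empty or truncated file can be ruled out.
inspect_log() {
    log="$1"

    # Return 1 if the file does not exist at all.
    if [ ! -e "$log" ]; then
        echo "missing: $log"
        return 1
    fi

    # Byte count; tr strips the padding some wc implementations add.
    size=$(wc -c < "$log" | tr -d ' ')
    echo "size_bytes: $size"

    # Return 2 if the file exists but contains no data.
    if [ "$size" -eq 0 ]; then
        echo "empty: $log"
        return 2
    fi

    # Dump the first 64 bytes as characters and escapes so the start of the
    # file can be inspected by eye.
    head -c 64 "$log" | od -c
}

# Example call (run as a user that can read the file):
# inspect_log /var/log/nvidia-dcgm/dcgm_eud.mle
```

A non-zero size with readable leading bytes would suggest the file was written but is not in the form this MLA version expects; a zero size would point at the EUD run not producing the log in the first place.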
Could you help me identify the cause of this issue or provide the correct method to use and parse the raw EUD logs?
Thanks!