
APIs and libraries for disabling DPU #479

@hase1128

Could you please provide information about the APIs and libraries defined within OPI?

I'd like to know if there are standardized APIs or methods/libraries for infrastructure administrators to prevent users from utilizing a specific DPU, for maintenance or other reasons.

For example, with NVIDIA GPUs, a vendor-specific command like nvidia-smi drain can be executed to prevent users from accessing the GPU.

In the case of DPUs, would simply shutting down the OS on the DPU be sufficient? While it might be possible to log in to the OS and execute a Linux shutdown command, are there standard APIs or libraries that allow us to remotely disable DPU usage?

Furthermore, with GPUs, any processes that have the GPU's device file open must be stopped before running commands like nvidia-smi drain. With DPUs, it may be harder to determine who is using the device and how. Is a forced shutdown the only option in such situations?
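
To illustrate the GPU-side check I mean, here is a rough sketch of finding the processes that hold a GPU device file open. It assumes a Linux host with /proc and that the GPU device files live under /dev/nvidia*; the function name pids_holding_device is made up for this example and is not part of OPI or of any NVIDIA tool. I am not aware of an equivalent check for DPUs, which is part of the question.

```python
import os
from pathlib import Path


def pids_holding_device(prefix="/dev/nvidia", proc_root="/proc"):
    """Return {pid: sorted device paths} for processes that have a file
    whose path starts with `prefix` open (hypothetical helper)."""
    try:
        entries = list(Path(proc_root).iterdir())
    except OSError:
        # No /proc available (non-Linux host) or it is unreadable.
        return {}

    holders = {}
    for entry in entries:
        # Process directories under /proc are named by numeric PID.
        if not entry.name.isdigit():
            continue
        try:
            fds = list((entry / "fd").iterdir())
        except OSError:
            # Process exited, or we lack permission to inspect it.
            continue
        for fd in fds:
            try:
                target = os.readlink(fd)
            except OSError:
                continue
            if target.startswith(prefix):
                holders.setdefault(int(entry.name), set()).add(target)
    return {pid: sorted(paths) for pid, paths in holders.items()}


if __name__ == "__main__":
    for pid, paths in sorted(pids_holding_device().items()):
        print(pid, ", ".join(paths))
```

Run without root, this only sees the current user's processes, so an administrator would need elevated privileges to get the full picture before draining the device.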
