A FastAPI-based REST API for monitoring NVIDIA GPUs using the NVIDIA Management Library (NVML) via the nvidia-ml-py package.
- Get information about all available NVIDIA GPUs
- Monitor GPU memory usage
- Check GPU utilization
- Query device capabilities
- Mock mode for development without NVIDIA GPUs
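For orientation, the sketch below shows one way such an endpoint can be built on top of pynvml (the module provided by nvidia-ml-py) and FastAPI. It is an illustrative sketch, not the code shipped in nvml_rest_api; the `/gpus` route path is a placeholder, and the real API nests its routes under `/api/v1` as listed in the endpoints section below.

```python
# Illustrative sketch only -- not the actual nvml_rest_api implementation.
import pynvml
from fastapi import FastAPI

app = FastAPI()


@app.get("/gpus")  # placeholder path; the real API serves /api/v1/gpus
def list_gpus():
    # A real service would likely call nvmlInit() once at startup instead
    # of per request; this keeps the sketch self-contained.
    pynvml.nvmlInit()
    try:
        gpus = []
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            name = pynvml.nvmlDeviceGetName(handle)
            if isinstance(name, bytes):  # older nvidia-ml-py versions return bytes
                name = name.decode()
            memory = pynvml.nvmlDeviceGetMemoryInfo(handle)
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            gpus.append(
                {
                    "id": index,
                    "name": name,
                    "memory": {"total": memory.total, "free": memory.free, "used": memory.used},
                    "utilization": {"gpu": util.gpu, "memory": util.memory},
                }
            )
        return {"count": len(gpus), "gpus": gpus}
    finally:
        pynvml.nvmlShutdown()
```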
Install the dependencies:

```bash
pip install -r requirements.txt
```

You can also run the API using Docker:

```bash
docker build -t nvml-rest-api .
docker run -d --gpus all -p 8000:8000 --name nvml-api nvml-rest-api
```

The `--gpus all` flag passes through all GPUs to the container. You can also specify individual GPUs if needed.

View logs or stop the container:

```bash
docker logs -f nvml-api
docker stop nvml-api
```

Start the server:

```bash
uvicorn nvml_rest_api.main:app --reload
```

Access the API documentation at http://localhost:8000/docs
The API automatically falls back to mock mode when:
- NVIDIA GPUs are not available
- NVML library is not found
- NVIDIA drivers are not installed
In mock mode, the API provides simulated GPU data for testing and development.
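A minimal sketch of how such a fallback can work (illustrative only; not necessarily how nvml_rest_api detects the condition) is to attempt NVML initialization once at startup and switch to simulated data if the call raises:

```python
# Illustrative sketch of a mock-mode fallback; not the project's actual code.
import pynvml

MOCK_MODE = False

try:
    pynvml.nvmlInit()  # fails when drivers/NVML are missing or no GPU is present
except pynvml.NVMLError:
    MOCK_MODE = True


def get_gpu_count() -> int:
    if MOCK_MODE:
        return 1  # pretend a single simulated GPU exists
    return pynvml.nvmlDeviceGetCount()
```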
- `GET /api/v1/gpus`: List all available GPUs
- `GET /api/v1/gpus/{device_id}`: Get detailed information about a specific GPU
- `GET /api/v1/gpus/{device_id}/memory`: Get memory information for a specific GPU
- `GET /api/v1/gpus/{device_id}/utilization`: Get utilization metrics for a specific GPU
- `GET /api/v1/status`: Get system status information, including whether mock mode is active
Example output from `GET http://127.0.0.1:8000/api/v1/gpus`:

```json
{
  "count": 1,
  "gpus": [
    {
      "id": 0,
      "name": "NVIDIA GeForce RTX 3090",
      "uuid": "GPU-1f1ad567-3b5c-9e6d-ee7c-8f4d4b1c790e",
      "memory": {
        "total": 25769803776,
        "free": 18024562688,
        "used": 7745241088
      },
      "utilization": { "gpu": 18, "memory": 23 },
      "power_usage": 31.99,
      "power_limit": 370.0,
      "temperature": 48,
      "fan_speed": 0,
      "performance_state": "P8",
      "compute_mode": "Default",
      "persistence_mode": false
    }
  ]
}
```
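As a consumer-side sketch (assuming the server is running locally on port 8000 and the third-party requests package is installed), a small polling loop over the GPU list endpoint could look like this; the field names follow the example response above.

```python
# Hypothetical client snippet: poll the GPU list and print memory usage.
import time

import requests

API = "http://127.0.0.1:8000/api/v1"  # assumes the default local server

while True:
    for gpu in requests.get(f"{API}/gpus", timeout=5).json()["gpus"]:
        used_gib = gpu["memory"]["used"] / 2**30
        total_gib = gpu["memory"]["total"] / 2**30
        print(f'GPU {gpu["id"]} ({gpu["name"]}): '
              f'{used_gib:.1f}/{total_gib:.1f} GiB used, '
              f'{gpu["utilization"]["gpu"]}% busy')
    time.sleep(5)
```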