mirror of
https://github.com/ChristianLempa/boilerplates.git
synced 2024-12-15 03:31:55 +01:00
Prerequisite
NVIDIA container toolkit
sudo apt -y install build-essential nvidia-cuda-toolkit nvidia-headless-495 nvidia-utils-495 libnvidia-encode-495 \
&& distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list \
&& sudo apt update \
&& sudo apt -y install nvidia-container-toolkit nvidia-container-runtime nvidia-docker2
Deployment
- Modify the Prometheus configuration template (location: /etc/prometheus/prometheus.yml).
Job for the NVIDIA SMI exporter in the Prometheus config file
  - job_name: 'nvidia_smi_exporter'
    static_configs:
      - targets: ['nvidia_smi_exporter:9835'] # if the nvidia_smi_exporter container is not on the same Docker network, change this line to "- targets: ['<your host IP>:9835']"
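Before restarting Prometheus, it can help to confirm that the job actually made it into the config file. The snippet below is a minimal sketch, not part of the original boilerplate: has_scrape_job is a hypothetical helper name, and it only does a line-based text match rather than parsing the YAML.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the boilerplate): returns 0 if the given
# Prometheus config file contains a "- job_name: <name>" line, 1 otherwise.
# This is a plain text match, so it does not validate the YAML structure.
has_scrape_job() {
  local config_file="$1"
  local job_name="$2"
  grep -Eq "^[[:space:]]*-[[:space:]]*job_name:[[:space:]]*['\"]?${job_name}['\"]?[[:space:]]*(#.*)?$" "$config_file"
}

# Example call (path taken from the Deployment step above):
#   has_scrape_job /etc/prometheus/prometheus.yml nvidia_smi_exporter \
#     && echo "job found" || echo "job missing"
```

If promtool is available on the host (it normally ships with Prometheus), running "promtool check config /etc/prometheus/prometheus.yml" gives a stricter syntax check than this text match.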
Additional References
- Nvidia container toolkit
- Nvidia GPU exporter Documentation
- Official Prometheus Documentation
- Some Grafana dashboard, not perfect, old, but configurable