The nvidia-smi plugin for Telegraf gives you an overview of your GPU usage. This “guide” was written against the most current iteration, v1.10.4, and assumes you are using Windows as your host OS. Linux should be fairly easy to get going as long as you know where your nvidia-smi executable is located.
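If you are not sure where nvidia-smi is located, a short script can check your PATH first and then a few likely locations. This is only a minimal sketch in Python: the candidate paths are assumptions (the Windows one is the path used in the config in this guide, the Linux one is a common location that may differ on your system), and `find_nvidia_smi` is a hypothetical helper name.

```python
import os
import shutil

# Assumed locations; adjust these for your own system.
DEFAULT_CANDIDATES = (
    r"C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe",
    "/usr/bin/nvidia-smi",
)


def find_nvidia_smi(candidates=DEFAULT_CANDIDATES, which=shutil.which):
    """Return the path to nvidia-smi, or None if it cannot be found."""
    # Check the PATH first.
    found = which("nvidia-smi")
    if found:
        return found
    # Fall back to the assumed install locations.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


print(find_nvidia_smi())
```

If this prints `None`, the executable is neither on your PATH nor in one of the assumed locations, and you will need to find it manually before setting bin_path.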
If you do not have Telegraf installed, check out my guides here.
Create a new .conf file in the telegraf.d folder.
Paste the following into the new file and save/close it.
# Pulls statistics from nvidia GPUs attached to the host
[[inputs.nvidia_smi]]
  ## Optional: path to nvidia-smi binary, defaults to $PATH via exec.LookPath
  bin_path = "C:\\Program Files\\NVIDIA Corporation\\NVSMI\\nvidia-smi.exe"
  ## Optional: timeout for GPU polling
  timeout = "5s"
Restart the Telegraf service so it picks up the new config:
net stop telegraf
net start telegraf
On Windows you have to escape every backslash in bin_path with a second backslash (\\), otherwise you’ll get errors when Telegraf queries nvidia-smi.exe.
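To avoid typing the doubled backslashes by hand, the escaping can be done programmatically. The following is a small Python sketch, not part of Telegraf; `to_toml_basic_string` is a hypothetical helper that turns a raw Windows path into a double-quoted string suitable for the config file.

```python
def to_toml_basic_string(path):
    """Quote a path for a double-quoted config string."""
    # Double every backslash, then escape any embedded double quotes.
    escaped = path.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


raw = r"C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe"
print("bin_path = " + to_toml_basic_string(raw))
# prints: bin_path = "C:\\Program Files\\NVIDIA Corporation\\NVSMI\\nvidia-smi.exe"
```

The printed line can be pasted straight into the conf file.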
Once you have verified Telegraf is reporting Nvidia stats, you can start creating your panels in Grafana. Use the nvidia_smi measurement from your Telegraf data source to build the panels.
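One way to verify that Telegraf is reporting Nvidia stats is its test mode, which gathers metrics once and prints them to stdout instead of sending them to your outputs. Below is a minimal Python sketch that wraps Telegraf's `--config`, `--input-filter` and `--test` flags. The helper names are hypothetical and the install paths in the usage comment are assumptions, so substitute your own.

```python
import subprocess


def build_test_command(telegraf_exe, config_path):
    """Build a one-shot Telegraf run limited to the nvidia_smi input."""
    return [
        telegraf_exe,
        "--config", config_path,
        "--input-filter", "nvidia_smi",
        "--test",
    ]


def nvidia_metric_lines(output):
    """Return the lines of test-mode output that are nvidia_smi metrics."""
    lines = []
    for line in output.splitlines():
        # Test mode prefixes each metric with "> "; strip it if present.
        line = line.lstrip("> ").strip()
        if line.startswith("nvidia_smi,") or line.startswith("nvidia_smi "):
            lines.append(line)
    return lines


def check_nvidia_smi(telegraf_exe, config_path, timeout=30):
    """Run Telegraf once and return any nvidia_smi metric lines it printed."""
    result = subprocess.run(
        build_test_command(telegraf_exe, config_path),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return nvidia_metric_lines(result.stdout)


# Example usage (hypothetical install paths, adjust to your setup):
# lines = check_nvidia_smi(
#     r"C:\Program Files\Telegraf\telegraf.exe",
#     r"C:\Program Files\Telegraf\telegraf.d\nvidia_smi.conf",
# )
# print(lines if lines else "No nvidia_smi metrics found")
```

An empty result usually means the plugin could not run nvidia-smi, in which case the bin_path setting is the first thing to check.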
Update February 2020
Telegraf recently updated its SMI plugin to return more data, which can be used to create more monitoring panels. Here is a list of the fields that are now returned: