---
title: CUDA
breadcrumbs:
- title: Configuration
- title: High-Performance Computing (HPC)
---
{% include header.md %}
NVIDIA CUDA (Compute Unified Device Architecture) Toolkit, for programming CUDA-capable GPUs.
## Related Pages
{:.no_toc}
## Resources
## Setup
### Linux
The toolkit on Linux can be installed in different ways:
- Through an existing package in your distro's repos (simplest and most compatible with other packages, but may be outdated; see the sketch below).
- Through a package manager package downloaded from NVIDIA (up to date, but may be incompatible with your installed NVIDIA driver).
- Through a runfile (same as the previous option, but more cross-distro and harder to manage).
If an NVIDIA driver is already installed, it must match the CUDA version.
Downloads: [CUDA Toolkit Download (NVIDIA)](https://developer.nvidia.com/cuda-downloads)
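
A minimal sketch of the distro-repo route, assuming Ubuntu/Debian (the package name `nvidia-cuda-toolkit` is Ubuntu's and may lag behind NVIDIA's current release):

```sh
# Install the CUDA toolkit from the distro's own repos (may be outdated).
apt update
apt install nvidia-cuda-toolkit

# Verify the compiler is available.
nvcc --version
```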
#### Ubuntu (NVIDIA CUDA Repo)
- Follow the steps to add the NVIDIA CUDA repo: [CUDA Toolkit Download (NVIDIA)](https://developer.nvidia.com/cuda-downloads), but don't install `cuda` yet.
- Remove anything NVIDIA or CUDA from the system to avoid conflicts: `apt purge --autoremove cuda nvidia-* libnvidia-*`
    - Warning: This may break your system. There may be better ways to do this.
- Install CUDA from the new repo (includes the NVIDIA driver): `apt install cuda`
- Set up the path: In `/etc/environment`, append `:/usr/local/cuda/bin` to the end of the `PATH` list. (A combined sketch of these steps is shown below.)
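
A minimal sketch of the sequence above, assuming an Ubuntu 22.04 x86_64 system and NVIDIA's `cuda-keyring` repo package (the exact URL and keyring version may differ; follow NVIDIA's download page for your distro):

```sh
# Add the NVIDIA CUDA repo via the cuda-keyring package (URL/version are examples).
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb
apt update

# Remove existing NVIDIA/CUDA packages first (warning: may break the system).
apt purge --autoremove cuda nvidia-* libnvidia-*

# Install CUDA (pulls in the NVIDIA driver) from the new repo.
apt install cuda

# Append the CUDA bin dir to the PATH entry in /etc/environment, e.g.:
# PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/cuda/bin"

# Verify after a reboot.
nvidia-smi
nvcc --version
```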
### Docker Containers
- Docker containers may run NVIDIA applications using the NVIDIA container runtime for Docker (NVIDIA Container Toolkit); see the sketch below.
- TODO
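
A minimal sketch, assuming the NVIDIA Container Toolkit is installed and a Docker version with `--gpus` support (the image tag is an example; pick one matching your driver):

```sh
# Run nvidia-smi inside a CUDA base image with all GPUs exposed to the container.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

# Expose only specific GPUs to the container.
docker run --rm --gpus '"device=0,1"' nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```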
### DCGM
- NVIDIA Data Center GPU Manager (DCGM), for monitoring GPU hardware and performance.
- See the DCGM exporter for monitoring NVIDIA GPUs from Prometheus (a sketch follows below).
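
A minimal sketch of running the standalone DCGM exporter as a container, assuming the NVIDIA container runtime is set up (the image tag is an example; check the dcgm-exporter project for a current one; it serves Prometheus metrics on port 9400 by default):

```sh
# Run the DCGM exporter and expose its metrics endpoint.
docker run -d --rm --gpus all -p 9400:9400 nvcr.io/nvidia/k8s/dcgm-exporter:3.1.8-3.1.5-ubuntu20.04

# Check that metrics are being served.
curl localhost:9400/metrics
```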
## Programming
See CUDA (software engineering).
## Usage and Tools
- Gathering system/GPU information with `nvidia-smi`:
    - Show overview: `nvidia-smi`
    - Show topology matrix: `nvidia-smi topo --matrix`
    - Show topology info: `nvidia-smi topo <option>`
    - Show NVLink info: `nvidia-smi nvlink --status -i 0` (for GPU #0)
    - Monitor device stats: `nvidia-smi dmon`
- To specify which devices are available to the CUDA application and in which order, set the `CUDA_VISIBLE_DEVICES` env var to a comma-separated list of device IDs (see the example below).
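
A minimal usage sketch (`./my-cuda-app` is a placeholder for any CUDA application):

```sh
# Expose only GPUs 1 and 0 to the application, in that order
# (physical device 1 becomes CUDA device 0 inside the application).
CUDA_VISIBLE_DEVICES=1,0 ./my-cuda-app

# Hide all GPUs from the application (empty list).
CUDA_VISIBLE_DEVICES= ./my-cuda-app
```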
{% include footer.md %}