Category:
Resolution: Resolved
UPDATE (Mar 15, 2023)
After the downtime on Mar. 14, 2023, OSC enabled a new Slurm option, --gres=nsight. When a job is submitted with this option, DCGM is disabled on the nodes allocated to that job, and Nsight functions normally.
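As an illustration, a batch script requesting this option might look like the sketch below. Only --gres=nsight comes from this notice; the job name, GPU request, time limit, application binary, and report name are hypothetical placeholders, and any account or module setup your project needs is omitted.

```shell
#!/bin/bash
#SBATCH --job-name=nsight-profile   # hypothetical job name
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1           # assumption: replace with your usual GPU request
#SBATCH --time=00:30:00             # placeholder time limit
#SBATCH --gres=nsight               # option from this notice: DCGM is disabled for this job

# Hypothetical application binary and report name; replace with your own.
APP="./my_app"
REPORT="nsight_report"

# Run Nsight Compute only if it is on PATH. How it gets there
# (for example, loading a CUDA module) is site-specific and not shown.
if command -v ncu >/dev/null 2>&1; then
    ncu -o "$REPORT" "$APP"
else
    echo "ncu not found on PATH; set up the CUDA/Nsight environment first" >&2
fi
```

Since the option disables DCGM for the job, it is probably best used only for jobs that actually run Nsight.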
==================================
We are experiencing an issue with the Nsight GPU profiler: it conflicts with the GPU monitoring service (DCGM) that we run on the GPU nodes.
This causes Nsight to malfunction and produce the following error message:
==ERROR== Profiling failed because a driver resource was unavailable. Ensure that no other tool (like DCGM) is concurrently collecting profiling data. See https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html#faq for more details.
We are looking for a workaround to resolve this issue.
Please contact oschelp@osc.edu if you have any questions.