I was running a CUDA application inside an nvidia-docker container. When I ran it under nvprof to profile it, I got a permissions warning and no profile information was generated:
==616== Warning: The user does not have permission to profile on the target device. See the following link for instructions to enable permissions and get more information: https://developer.nvidia.com/NVSOLN1000
==616== Warning: Some profiling data are not recorded. Make sure cudaProfilerStop() or cuProfilerStop() is called before application exit to flush profile data.
For another application, the error looked like this:
==643== NVPROF is profiling process 643, command: foobar
==643== Warning: The user does not have permission to profile on the target device. See the following link for instructions to enable permissions and get more information: https://developer.nvidia.com/NVSOLN1000
==643== Profiling application: foobar
==643== Profiling result:
No kernels were profiled.
No API activities were profiled.
The warning message includes a link, but the documentation there does not cover this Docker case. The solution turned out to be adding the --privileged option to my nvidia-docker command invocation.
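As a rough sketch, the invocation might look like the following. The image name and the application path here are placeholders I am assuming for illustration, and the image is assumed to ship nvprof, so substitute whatever you actually run:

```shell
# Hypothetical names for illustration; replace with your own image and binary.
IMAGE="nvidia/cuda"
APP="./foobar"

# --privileged gives the container the extra device access that nvprof
# appears to need; the rest is an ordinary nvidia-docker run.
CMD="nvidia-docker run --privileged --rm $IMAGE nvprof $APP"

# Print the command for inspection; run it by hand once it looks right.
echo "$CMD"
```

Note that --privileged grants the container broad access to the host, so it is probably best kept to profiling sessions rather than left on for normal runs.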
NVIDIA Docker makes it easy to use Docker containers across machines with differing NVIDIA graphics drivers. After installing it, I ran a sample NVIDIA Docker command and got this error:
$ nvidia-docker run --rm nvidia/cuda nvidia-smi
docker: Error response from daemon: Post http://%2Frun%2Fdocker%2Fplugins%2Fnvidia-docker.sock/VolumeDriver.Mount: dial unix /run/docker/plugins/nvidia-docker.sock: connect: no such file or direct
Checking the plugin log file showed this:
$ cat /tmp/nvidia-docker.log
nvidia-docker-plugin | 2017/10/11 10:10:07 Loading NVIDIA unified memory
nvidia-docker-plugin | 2017/10/11 10:10:07 Loading NVIDIA management library
nvidia-docker-plugin | 2017/10/11 10:10:07 Discovering GPU devices
nvidia-docker-plugin | 2017/10/11 10:10:13 Provisioning volumes at /var/lib/nvidia-docker/volumes
nvidia-docker-plugin | 2017/10/11 10:10:13 Serving plugin API at /run/docker/plugins
nvidia-docker-plugin | 2017/10/11 10:10:13 Serving remote API at localhost:3476
nvidia-docker-plugin | 2017/10/11 10:10:13 Error: listen tcp 127.0.0.1:3476: bind: address already in use
Yet when I checked, no process was listening on port 3476. So what was the problem?
I gave up and restarted Docker, and everything worked fine after that (haha!):
$ sudo service docker restart
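For anyone who wants to repeat the port check before resorting to a restart, here is a minimal sketch of one way to do it. It assumes the ss tool from iproute2 is installed, and port_owner is just a helper name I made up. Without root, ss may not show the owning process:

```shell
PORT=3476   # the remote API port from the log above

# Print any listening TCP socket on the given port, with its owning process
# when visible. Column 4 of `ss -ltnp` is the local address:port.
port_owner() {
    ss -ltnp 2>/dev/null | awk -v p=":$1" '$4 ~ p"$"'
}

if [ -z "$(port_owner "$PORT")" ]; then
    echo "no listener found on port $PORT"
else
    port_owner "$PORT"
fi
```

If this reports no listener and the plugin still fails to bind, restarting Docker as above is what worked for me.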
Tried with: NVIDIA Docker 1.x and Docker 1.11