nvprof in nvidia-docker permissions warning


I was running a CUDA application inside an nvidia-docker container. When I tried to profile it with nvprof, I got a permissions warning and no profile information was generated:

==616== Warning: The user does not have permission to profile on the target device. See the following link for instructions to enable permissions and get more information: https://developer.nvidia.com/NVSOLN1000
==616== Warning: Some profiling data are not recorded. Make sure cudaProfilerStop() or cuProfilerStop() is called before application exit to flush profile data.

For another application, the error looked like this:

==643== NVPROF is profiling process 643, command: foobar
==643== Warning: The user does not have permission to profile on the target device. See the following link for instructions to enable permissions and get more information: https://developer.nvidia.com/NVSOLN1000
==643== Profiling application: foobar
==643== Profiling result:                                                                                                     
No kernels were profiled.                                                         
No API activities were profiled.


The warning message has a link, but the documentation it points to is not relevant to this Docker problem. The solution turned out to be adding the --privileged option to my nvidia-docker command invocation.

How to install Raspbian 9

Raspbian 9 (Stretch) is the latest version of Debian for the Raspberry Pi.

Here is how I installed it:

  • Download the Raspbian Stretch Lite installation file from here.

  • We need a tool to write the OS image to an SD card. I used Etcher which can be installed from here.

  • Insert an SD card of at least 4GB capacity into your computer. Use Etcher to write the downloaded zip file to the SD card.

  • Eject the SD card. Remove it and plug it back into your computer. Create an empty file named ssh in the root directory of the SD card. This will enable you to SSH to your Raspberry Pi.

  • Insert this SD card into the Raspberry Pi board. Connect your Pi to your home wireless router with an Ethernet cable. You can also connect your Pi to your TV or computer display with an HDMI cable. Power on the Pi.

  • You can see Raspbian booting up on your TV or display. At the end it displays what IP address was assigned to it by DHCP. You can also figure out the IP address from the admin console of your wireless router. Let us say the IP address is

  • SSH to the IP address of your Pi. The login is pi and the password is raspberry.

$ ssh pi@
  • You are logged into the Pi now! Change the password using the passwd command.

  • Update the packages using these commands:

$ sudo apt update
$ sudo apt upgrade

Your Raspbian 9 is all set now for your use.

How to view Python environment variables

To view the environment variables in Python and query for a particular one:

import os
print(os.environ)  # Print a dict of env vars and their values
os.environ["PYTHONPATH"]  # Query a specific env var

Python picks up these environment variables from your shell when it is first invoked. So this is useful to check if your shell environment variables and their values are being correctly imported into Python.
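One thing to watch for: indexing os.environ with a variable that is not set raises a KeyError. Here is a minimal sketch of a lookup that tolerates unset variables by using os.environ.get. The helper name describe_env_var and the variable MY_UNSET_VAR_FOR_DEMO are made up for illustration:

```python
import os


def describe_env_var(name):
    """Return a one-line description of an environment variable."""
    # os.environ.get returns None instead of raising KeyError when unset
    value = os.environ.get(name)
    if value is None:
        return f"{name} is not set"
    return f"{name}={value}"


# PATH is set in most shells; the second name is a hypothetical unset variable
print(describe_env_var("PATH"))
print(describe_env_var("MY_UNSET_VAR_FOR_DEMO"))
```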

How to view path of Python module

To view the paths of an imported Python module, say tensorflow:

import tensorflow
print(tensorflow.__path__)  # List of directories that make up the package

# Output is:
# ['/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/api/_v1',
#  '/usr/local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/api/_v1',
#  '/usr/local/lib/python3.6/site-packages/tensorflow',
#  '/usr/local/lib/python3.6/site-packages/tensorflow/_api/v1']

This can be useful to check if the right module from the right location is being imported. It is also useful to know where to go check out the Python source files of a module for inspection or debugging.
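The same check works for any module. Here is a small sketch that uses the standard json package as a stand-in, so it runs without TensorFlow installed. The helper name module_location is my own, and it is meant for top-level module names:

```python
import importlib.util
import json  # stand-in for any imported package


def module_location(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # origin is a file path for regular modules ("built-in" or "frozen" for some others)
    return spec.origin


print(json.__file__)            # file backing the imported package
print(list(json.__path__))      # directories searched for its submodules
print(module_location("json"))  # lookup by name, without an import statement
```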

nvidia-smi cheatsheet

nvidia-smi (NVIDIA System Management Interface) is a tool to query, monitor and configure NVIDIA GPUs. It ships with and is installed along with the NVIDIA driver and it is tied to that specific driver version. It is a tool written using the NVIDIA Management Library (NVML).

  • To query the usage of all your GPUs:
$ nvidia-smi

I use this default invocation to check:

  • Version of driver.
  • Names of the GPUs.
  • Index of the GPUs, based on PCI Bus Order. This can differ from the CUDA order, which defaults to fastest-first unless CUDA_DEVICE_ORDER is set to PCI_BUS_ID.
  • Amount of memory each of the GPUs has.
  • Whether persistence mode is enabled on each of the GPUs.
  • Utilization of each of the GPUs (if I’m running something on them).
  • List of processes executing on the GPUs.
  • To query the configuration parameters of all the GPUs:
$ nvidia-smi -q

I use this to check:

  • Default clocks (listed under Default Application Clocks).
  • Current clocks (listed under Application Clocks).
  • To query the configuration parameters of a particular GPU, use its index:
$ nvidia-smi -q -i 0
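
If you want these values inside a script, nvidia-smi can also print selected fields as CSV through its --query-gpu and --format options. Below is a rough sketch of calling it from Python. The field list, the helper names and the sample line are my own choices for illustration, and the sample values are made up, not real output:

```python
import csv
import io
import subprocess

FIELDS = ["index", "name", "memory.total", "utilization.gpu"]


def parse_gpu_csv(text):
    """Parse 'nvidia-smi --format=csv,noheader' output into a list of dicts."""
    rows = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(FIELDS, row)) for row in rows if row]


def query_gpus():
    """Run nvidia-smi and return one dict per GPU. Needs the NVIDIA driver installed."""
    cmd = ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS), "--format=csv,noheader"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)


# Made-up sample line, so the parser can be tried on a machine without a GPU
SAMPLE = "0, Example GPU, 16160 MiB, 0 %\n"
print(parse_gpu_csv(SAMPLE))
```

On a machine with the driver installed, call query_gpus() instead of parsing the sample.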

How to view PIP dependencies using pipdeptree

When you install a Python package using the pip tool, you will notice that it figures out whether the package depends on other packages and installs them too. It is sometimes useful to see the tree of packages that a package depends on. pipdeptree is a tool that makes it easy to view this information.

  • Installing pipdeptree is easy:
$ sudo pip3 install pipdeptree
  • View the dependency tree of every installed package:
$ pipdeptree
  • View the dependency tree of a particular package foobar:
$ pipdeptree -p foobar
  • View the reverse dependency tree, that is, the packages that depend on each installed package:
$ pipdeptree -r
  • Render the dependency tree graph to a PDF file using GraphViz:
$ pipdeptree --graph-output pdf > out.pdf
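
For a quick look at just the first level of the tree without installing anything, the standard library can read the requirements that an installed package declares. This is only a sketch using importlib.metadata (Python 3.8 or later), not how pipdeptree itself is implemented. The helper name direct_requirements is my own and pip is just an example package:

```python
from importlib import metadata


def direct_requirements(package):
    """Return the requirement strings a package declares, or [] if none or not installed."""
    try:
        # requires() returns None when the package declares no requirements
        return metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return []


# "pip" is just an example; substitute any installed distribution name
for requirement in direct_requirements("pip"):
    print(requirement)
```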

Tried with: Ubuntu 18.04

How to build and install TensorFlow

TensorFlow (TF) can be built from source easily and installed as a Python wheel package. I used the following steps to build it using Python3 and with support for CUDA and TensorRT:

  • Install Python3 pre-requisites:
$ sudo apt install python3-dev python3-pip
  • Install necessary Python3 packages locally:
$ pip3 install -U --user six numpy wheel setuptools mock
$ pip3 install -U --user keras_applications==1.0.6 --no-deps
$ pip3 install -U --user keras_preprocessing==1.0.5 --no-deps

These packages are installed to your ~/.local/lib/python3.x/site-packages directory. TF documentation also installs the latest pip3 from PyPI. However, doing that causes the infamous “Cannot import name main” error, so I do not do that.

  • TF uses Bazel as its build tool. Install it as described here. I ended up placing its binary in my ~/bin. Since my ~/bin is in my PATH, the Bazel binary can be executed from any place.

  • I recommend creating a tensorflow_root directory. This is because the TF packaging step tends to write to a location outside the TF source directory. Also, TF needs to access other libraries. So this root directory makes it easy to keep all TF-related directories under one umbrella.

  • Clone the TF Git repository inside the root directory:

$ cd tensorflow_root
$ git clone git@github.com:tensorflow/tensorflow.git
$ cd tensorflow
  • Configure the build process using:
$ ./configure

Some of the questions it asks and my replies:

  • Please specify the location of python. /usr/bin/python3
  • Please input the desired Python library path to use. /usr/local/lib/python3.6/dist-packages
  • Enable: XLA JIT, CUDA and TensorRT. Be careful: TF might not work with the latest versions of CUDA, cuDNN and TensorRT. I used CUDA 10.0, cuDNN 7.3 and TensorRT 5.0.
  • Did not enable: OpenCL SYCL and ROCm.
  • It is time to build TF. The command to build is:
$ bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package

The first build can take 2-4 hours to complete.

  • Now we are ready to package the build artifacts into a Python package. Specify where you want that package to be placed in the packaging command:
$ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /path_to_tensorflow_root/tensorflow_pkg

I found that it had generated a Python wheel file named: tensorflow-1.13.1-cp36-cp36m-linux_x86_64.whl

  • We are ready to install TF. Make sure you remove older versions of TF:
$ pip3 uninstall tensorflow tensorflow-estimator
  • Install TF from the Python wheel package:
$ pip3 install -U --user tensorflow-1.13.1-cp36-cp36m-linux_x86_64.whl
  • Check if TF is installed correctly:
$ python3 -c "import tensorflow"
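
A slightly more informative check is to print the version, the install location and whether the build has CUDA support. This is a sketch, and the helper name tensorflow_report is my own. It returns a message instead of failing when TF is not installed:

```python
import importlib.util


def tensorflow_report():
    """Return a short status string about the installed TensorFlow, if any."""
    if importlib.util.find_spec("tensorflow") is None:
        return "tensorflow is not installed"
    import tensorflow as tf
    return "tensorflow {} from {} (built with CUDA: {})".format(
        tf.__version__, tf.__file__, tf.test.is_built_with_cuda())


print(tensorflow_report())
```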

Tried with: TensorFlow 1.13.1 and Ubuntu 18.04

How to set environment variable in gdb

GDB inherits the environment variables from your shell. Some environment variables, like LD_LIBRARY_PATH, might not be inherited for safety reasons. This can cause errors like this when you try to debug:

$ gdb --args ./a.out
(gdb) r
/path/to/a.out: error while loading shared libraries: libfoobar.so.5: cannot open shared object file: No such file or directory

You can set the LD_LIBRARY_PATH environment variable at the gdb shell using the set env command like this:

(gdb) set env LD_LIBRARY_PATH /path/to/1:/path/to/2

However, if you use a shell like Fish, you will notice that this silently does not work and you still get the cannot open shared object file error.

This is because, under the covers, gdb reads the SHELL environment variable to determine which shell to use when it starts your program. It might not know how to work with Fish, which is a relatively new shell.

The solution that works for me is to set the SHELL variable at the Fish shell to the Bash path:

$ set SHELL (which bash)

And then launch gdb and set the environment variable as shown above. It works after that.
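
If you launch gdb from a script, the same two steps can be combined. The sketch below builds a gdb command line that runs the set env command through gdb's -ex option, along with an environment in which SHELL points to Bash. It assumes that overriding SHELL only for the gdb process has the same effect as setting it in Fish. The helper name gdb_invocation is my own and the paths are the placeholders from above:

```python
import os
import shutil


def gdb_invocation(program, library_dirs):
    """Build (argv, env) for running gdb with SHELL pointed at Bash."""
    env = dict(os.environ)
    # Assumption: gdb starts the program through whatever shell SHELL names
    env["SHELL"] = shutil.which("bash") or "/bin/bash"
    ld_path = os.pathsep.join(library_dirs)
    argv = ["gdb", "-ex", "set env LD_LIBRARY_PATH " + ld_path, "--args", program]
    return argv, env


argv, env = gdb_invocation("./a.out", ["/path/to/1", "/path/to/2"])
print(argv)
# To actually launch gdb: subprocess.run(argv, env=env)
```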

Reference: Your program’s environment (GDB Manual)

Unicode BOM problem


I processed a JSON file using some tool and the resulting JSON text file would not be accepted by other tools. They would complain that this was a UTF-8 Unicode (with BOM) text file. I had to remove whatever this BOM was from my UTF-8 file.


BOM is a byte order mark that some tools add to the start of UTF-8 files. In UTF-8, the BOM is this 3-byte sequence: 0xEF,0xBB,0xBF.

You could use any tool or process to remove this 3-byte sequence from the start of the file. If you are on Linux, the awesome sed tool can do the job:

$ sed -i '1s/^\xEF\xBB\xBF//' in.txt
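
If sed is not available, here is a minimal Python sketch that does the same thing. The helper name strip_utf8_bom is my own and the demo runs on a throwaway temporary file:

```python
import codecs
import os
import tempfile


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from the file at path. Return True if one was removed."""
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):  # b"\xef\xbb\xbf"
        return False
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True


# Demo on a throwaway file that starts with a BOM
with tempfile.NamedTemporaryFile(delete=False, suffix=".json") as tmp:
    tmp.write(codecs.BOM_UTF8 + b'{"key": "value"}')
print(strip_utf8_bom(tmp.name))  # True: a BOM was found and removed
print(strip_utf8_bom(tmp.name))  # False: nothing left to remove
os.remove(tmp.name)
```

If you only need to read such a file in Python, opening it with encoding="utf-8-sig" skips the BOM without modifying the file.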

Reference: How can I remove the BOM from a UTF-8 file?