How to set environment variable in gdb

GDB inherits the environment variables from your shell. Some environment variables, like LD_LIBRARY_PATH, might not be inherited for safety reasons. This can cause errors like this when you try to debug:

$ gdb --args ./a.out
(gdb) r
/path/to/a.out: error while loading shared libraries: libfoobar.so.5: cannot open shared object file: No such file or directory

You can set the LD_LIBRARY_PATH environment variable at the gdb shell using the set env command like this:

(gdb) set env LD_LIBRARY_PATH /path/to/1:/path/to/2

However, if you use a shell like Fish, you will notice that this silently fails: you still get the cannot open shared object file error.

This is because, under the covers, gdb reads the SHELL environment variable to find out which shell you use and starts your program through that shell. It might not know how to work with Fish, which is a relatively new shell.

The solution that works for me is to set the SHELL variable in the Fish shell to the path of Bash:

$ set SHELL (which bash)

And then launch gdb and set the environment variable as shown above. It works after that.
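The two steps above can also be wrapped in a small launcher. Here is a minimal sketch in Python. The function name gdb_launch_spec and the /bin/bash fallback are my own assumptions for illustration; -ex and --args are standard gdb options.

```python
import os
import shutil


def gdb_launch_spec(program_argv, lib_dirs, bash_path=None):
    """Return (argv, env) for starting gdb with SHELL pointing at bash.

    program_argv: the program to debug followed by its arguments.
    lib_dirs: directories to put in LD_LIBRARY_PATH for the debugged program.
    """
    # Assumption: bash is installed; fall back to /bin/bash if not on PATH.
    bash = bash_path or shutil.which("bash") or "/bin/bash"

    # Copy the current environment and override SHELL for the gdb process only.
    env = dict(os.environ)
    env["SHELL"] = bash

    # -ex runs a gdb command at startup; this mirrors typing
    # "set env LD_LIBRARY_PATH ..." at the (gdb) prompt.
    set_env = "set env LD_LIBRARY_PATH " + ":".join(lib_dirs)
    argv = ["gdb", "-ex", set_env, "--args"] + list(program_argv)
    return argv, env
```

To start the session, pass the result to subprocess.run(argv, env=env). Your Fish session itself keeps its own SHELL value, since only the gdb process sees the override.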

Reference: Your program’s environment (GDB Manual)

GDB Colour Filter

A problem I face regularly with GDB is the backtrace. The stack trace lists the function frames currently on the stack, and it is a wall of text. It is hard to discern the function address, function name, function parameters, source file path and line number.

Normal GDB backtrace

This is precisely the problem that GDB Colour Filter solves. Clone its repo and source its Python file inside your ~/.gdbinit and you are set. Backtraces are now printed with distinctly different colors and formatting for all the components of a function frame. I especially find it useful to pick out the function name and the source file and line number.

GDB backtrace with GDB Colour Filter

There is only one slight problem: to display the components of a function frame at consistent columns, it breaks each frame into two lines. So the number of backtrace lines doubles and might fill up the display when you try this.

How to build and use GDB

Ubuntu uses quite an old version of GDB. When I need to use an updated version of GDB, here is what I do:

  • Obtain the newer version of GDB by downloading its .tar.gz from here. Unzip it.
  • Configure and build it:
$ ./configure
$ make
  • The newly built GDB executable can be found at gdb/gdb in your current directory. You can copy it or invoke it directly to use it.

Tried with: GDB 8.1 and Ubuntu 16.04

How to skip stepping into files in GDB

Visual Studio C++ debugger has a feature called Just My Code which helps you to step over external code, like that in STL, and only step through the code of your own project. GDB does not have this feature at the time of this writing.

However, GDB has a skip -gfile feature that can be used in a similar way. You pass this command a glob pattern of files to ignore during stepping.

For example, to skip stepping into the source files of STL implementations on my system I use:

skip -gfile /usr/include/c++/5/bits/*

This works because the STL implementation files on my system are located at the above path.

Note that this feature requires GDB 7.12 or later.
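If you have several compiler versions installed, writing these skip lines by hand gets tedious. Here is a small sketch in Python that prints one skip -gfile line per directory, ready to paste into ~/.gdbinit. The function names and the /usr/include/c++/*/bits layout are assumptions based on the path on my system; adjust the pattern for yours.

```python
import glob


def skip_gfile_lines(directories):
    """Build one 'skip -gfile' command per directory, matching every file in it."""
    return ["skip -gfile {}/*".format(d.rstrip("/")) for d in directories]


def find_libstdcxx_bits(pattern="/usr/include/c++/*/bits"):
    """List installed 'bits' directories (the path layout is an assumption)."""
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Print the lines so they can be appended to ~/.gdbinit.
    for line in skip_gfile_lines(find_libstdcxx_bits()):
        print(line)
```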

Reference: GDB Skipping over files and functions

Tried with: GDB 8.1 and Ubuntu 16.04

How to debug Caffe

Caffe is written in C++ while its GPU accelerated portions are written in CUDA. It also ships with PyCaffe, a Python wrapper to the C++ code.

To debug Python code of PyCaffe

You might have written a Python script to train or load and use models for inference. The Python code of Caffe can be debugged as usual using the PDB Python debugger as described here.

To debug C++ code from Caffe binary

We can use GDB to debug the C++ code in Caffe. First, remember to build Caffe with debugging information. This can be done as described here, but remember to indicate Debug mode to CMake, like this:

$ cmake -DCMAKE_BUILD_TYPE=Debug ..

If you are using binary Caffe, then debugging the C++ code is straightforward using GDB:

$ gdb --args ./caffe --your-usual-caffe-arguments-go-here

You can set breakpoints in any Caffe C++ code and debug using GDB as usual. For more info see my GDB cheatsheet.

To debug C++ code from PyCaffe

If you are using PyCaffe and need to debug the C++ parts of Caffe code, that is still possible using GDB! Remember to first build Caffe in Debug mode as shown above. Note that after make install, you might have to rename the _caffe-d.so file to _caffe.so to be able to import caffe in your Python script.

To be able to debug the C++ code, we invoke GDB and pass it the python interpreter and its arguments, which is our Python script that calls PyCaffe:

$ gdb --args python my_script_calls_pycaffe.py --some_input_arguments

GDB first begins in the Python interpreter binary. Python will later load all the required shared libraries including _caffe.so and more importantly libcaffe-nv.so which is the compiled version of the Caffe C++ code. The problem is that these libraries are not yet loaded at the beginning. What we can do is set one or more breakpoints at the places we want right away. GDB will complain that these breakpoint locations are as yet not known to it. That is okay, it will turn them into pending breakpoints that will be enabled when the shared library having that location is loaded.

For example, to set up a breakpoint right away in convolution layer code:

(gdb) b base_conv_layer.cpp:15

After you have set your breakpoints, type r to run the program (c only works once the program is already running). GDB will load the shared libraries as Python imports them and stop when your breakpoint locations are hit. After this point it is debugging as usual in GDB.
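If you set the same pending breakpoints in every session, they can live in a GDB command file. Here is a minimal sketch in Python that generates one. The function name is hypothetical; set breakpoint pending on is the GDB setting that creates pending breakpoints without asking for confirmation.

```python
def pending_breakpoint_script(locations):
    """Return the text of a GDB command file that sets pending breakpoints.

    locations: breakpoint locations such as "base_conv_layer.cpp:15".
    """
    # Do not ask for confirmation when a location is not yet known.
    lines = ["set breakpoint pending on"]
    lines += ["break {}".format(loc) for loc in locations]
    # Start the program; the breakpoints resolve as libraries get loaded.
    lines.append("run")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(pending_breakpoint_script(["base_conv_layer.cpp:15"]), end="")
```

Save the output to a file, say caffe.gdb (a made-up name), and pass it to GDB with -x:

$ gdb -x caffe.gdb --args python my_script_calls_pycaffe.py --some_input_arguments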

If you are adventurous, you can even connect GDB to the Python process that is running PyCaffe/Caffe by using its PID. For example, after I find out that my Python script is running as PID 589:

$ gdb -p 589

You can set breakpoints in Caffe C++ code and GDB will stop at those locations.

I hope you have fun stepping through and exploring Caffe code! 🙂

Tried with: GDB 7.7.1 and Ubuntu 14.04

How to set regex breakpoint for shared library in GDB

Problem

Assume you are running a program under GDB and it is linked to shared library files. Not all the shared libraries are loaded at the beginning when you start the program with GDB. They are loaded when needed. So, how to set a regex breakpoint in a source file that belongs to one of the shared library files?

Solution

  • Setting a normal breakpoint will work if the shared library having that file has already been loaded. You can check the currently loaded shared libraries using the command: info shared

  • If you set a breakpoint for a file which belongs to a shared library that is not yet loaded, GDB will warn you that the breakpoint will only be set once the library is loaded. This is kinda okay.

  • However, if you try to set regex breakpoints (rbreak), that will fail silently if the shared library is not yet loaded. So how do you know the earliest point at which you can set such breakpoints?

  • I find it useful to configure GDB to stop whenever a shared library is loaded. This can be done by setting this option: set stop-on-solib-events 1

  • Now GDB stops every time at the point where one or more shared libraries need to be loaded. If I realize that the shared library I am interested in has now loaded, I run the regex breakpoint command at that point to set the breakpoints. Voila!
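A different technique from the manual stop-and-check above is to let GDB's Python scripting do the waiting. The sketch below is untested in a live session, so treat it as a starting point. It hooks the gdb.events.new_objfile event and runs rbreak the first time a matching library is loaded. The names libfoobar.so and foo.cpp are placeholders, and make_rbreak_handler is my own helper, not part of GDB.

```python
def make_rbreak_handler(lib_name, regex, execute):
    """Return a handler that runs 'rbreak <regex>' once lib_name has loaded.

    execute is the function used to run a GDB command; inside GDB this is
    gdb.execute. It is a parameter so the logic can be tried outside GDB.
    """
    state = {"done": False}

    def handler(event):
        # new_objfile events carry the loaded object file; its filename
        # may be None for some object files, so fall back to "".
        filename = getattr(event.new_objfile, "filename", None) or ""
        if not state["done"] and lib_name in filename:
            state["done"] = True
            execute("rbreak " + regex)

    return handler


try:
    import gdb  # only importable inside GDB's embedded Python
except ImportError:
    gdb = None

if gdb is not None:
    # Placeholder names: replace with your library and source file.
    gdb.events.new_objfile.connect(
        make_rbreak_handler("libfoobar.so", "foo.cpp:.", gdb.execute)
    )
```

Save this to a .py file and load it with the source command at the (gdb) prompt before you run the program. This requires a GDB built with Python support.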

Tried with: GDB 7.11.1 and Ubuntu 14.04

GDB Cheatsheet

GDB is the GNU debugger which can be used to debug C and C++ programs at the commandline.

Compilation

  • Compile your code with debugging information to be able to use it with GDB. For GCC, this is the -g option.

Invocation

  • To open a program with GDB: gdb a.out

  • To open a program which has commandline arguments with GDB: gdb --args a.out --infile /home/joe/zoo.json

  • To load a core dump file, see here.

  • To debug a running process, use its PID. For example, to connect and debug a process with PID 945: gdb -p 945

Debugging

  • To start the program: r which is an alias for run.

  • To set a breakpoint: b which is an alias for break. A breakpoint can be set by specifying a line number (in the current file where the debugger has stopped), a function name, or a file path (partial or full) combined with a line number or function name. For more info on locations that can be used with the break command see here.

  • To continue from the current stopped location: c which is an alias for continue.

  • To execute the next line, stepping over any function calls: n (next). To step into a function called on the next line: s (step). To run until the current function returns: finish.

  • To execute until a specified line number: until 420.

  • To set a breakpoint at every function in a given file: rbreak src/core/foo.cpp:. I find this incredibly useful when I start debugging a strange problem and I just want to stop at every function in a file I suspect.

Information

  • To list source code: l

  • To print the current method, line number and filename: frame

  • To see list of breakpoints: info breakpoint or i b

  • To see list of shared library files that are loaded right now: info shared. Note that more shared libraries might be loaded during execution later.

  • To print value of a variable in the code right now: p foo_var

  • To print the data type of a variable or type in the code: ptype foo_type

Tried with: GDB 7.11.1 and Ubuntu 14.04

How to debug running Python program using PyCharm debugger

PDB is a fantastic debugger for Python, but it cannot be easily attached to an already running Python program. The recommended method to attach to a running Python program for debugging is GDB as described here. But, examining stack trace of a Python program and Python objects in a C++ debugger like GDB is not straightforward.

I recently discovered that the GUI debugger in PyCharm IDE can be used to attach to a running Python program and debug it. It is easy to do this:

  • An already running program: Let us assume that I already have a running Python program whose source files are all inside a /home/joe/foobar directory. It has been running an important task for hours now and I have discovered a tiny bug that can be fixed in the running program by changing the value of a global variable.
  • Enable ptrace of any process: For this type of live debugging, we need any process to be able to ptrace any other process. However, the kernel in your distribution may be set up to only allow a parent process to ptrace its child processes. Check that the value of /proc/sys/kernel/yama/ptrace_scope is 0. If not, set it temporarily to 0:
$ echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
  • Install PyCharm: Download PyCharm and unzip the downloaded file. I use the Community Edition which is free.
  • Run PyCharm: Run bin/pycharm.sh and open the directory containing the source files of the running program.
  • If necessary, set the Python interpreter for this project to be the same as that of the running program. That is, we make sure they both use the same version of Python.
  • In the source files, set one or more breakpoints where you would like to stop, inspect or change the running program.
  • Attach: Now we are ready to attach to our running program! Choose Run → Attach to local process and choose the PID of our already running program from the list.
  • Debug: Once attached, the program should stop at our breakpoints. We can now step through the program and change the value of variables to effect some live bug fixes! Once done, we can disable the breakpoints and allow the program to continue by itself.
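The ptrace check in the steps above can be scripted before you try to attach. Here is a minimal sketch in Python; the function names are my own, and it only reads the setting without changing it.

```python
PTRACE_SCOPE_PATH = "/proc/sys/kernel/yama/ptrace_scope"


def read_ptrace_scope(path=PTRACE_SCOPE_PATH):
    """Return the ptrace_scope value as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def describe_ptrace_scope(scope):
    """Explain what a ptrace_scope value means for attaching a debugger."""
    if scope is None:
        return "ptrace_scope could not be read; this restriction does not apply"
    if scope == 0:
        return "any process may ptrace another process of the same user"
    return "ptrace is restricted (value {}); set it to 0 to attach".format(scope)


if __name__ == "__main__":
    print(describe_ptrace_scope(read_ptrace_scope()))
```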

Tried with: PyCharm 2016.2, Python 2.7.11 and Ubuntu 16.04

How to load core dump in GDB

Running an erroneous program may result in it exiting and dumping a core dump file named core. For example:

$ ./a.out
./a.out terminated by signal SIGSEGV (Address boundary error)

We need to load the core dump file in GDB to begin investigating the cause of the error. This is typically done using the command:

$ gdb ./a.out core

GDB loads the program and, using the information from the core dump, restores the program stack and other state to the point where the program encountered the error.

A common first step in the investigation is to print the stack frames at this point:

(gdb) backtrace
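If no core file shows up after a crash, one common cause is a core file size limit of 0 for the process. Here is a small sketch in Python that checks this limit using the standard resource module. The function names are my own. Where the file gets written also depends on your system configuration, which this sketch does not check.

```python
import resource


def core_dumps_enabled(soft_limit):
    """A soft limit of 0 means no core file will be written."""
    return soft_limit != 0


def current_core_soft_limit():
    """Return the soft RLIMIT_CORE value of this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return soft


if __name__ == "__main__":
    if core_dumps_enabled(current_core_soft_limit()):
        print("core dumps are enabled for this process")
    else:
        print("core file size limit is 0; no core file will be written")
```

The limit is inherited from the shell that starts the program, so it is usually raised there with ulimit -c unlimited before running the program again.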

Tried with: GDB 7.7.1, GCC 5.1.0 and Ubuntu 14.04

How to handle SIGTSTP with GDB

Problem

I use Ctrl+Z regularly at the shell to temporarily stop a program and then continue its running using the fg command. However, doing the same in GDB does not work as expected:

  • I press Ctrl+Z. GDB catches it and prints out a message saying SIGTSTP has been received.

  • Typing continue does not continue the execution of the program. The program keeps getting stopped again and prints this message:

(gdb) c
Continuing.

Program received signal SIGTSTP, Stopped (user).
[Switching to Thread 0x7fffe7e28700 (LWP 1492)]
0x00007ffff5b5112d in poll () at ../sysdeps/unix/syscall-template.S:81
81  T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)

Solution

  • To view the signals handled by GDB and how they are handled:
(gdb) info signals
  • To view how a specific signal, say SIGTSTP, is handled:
(gdb) info signal SIGTSTP
Signal        Stop      Print   Pass to program Description
SIGTSTP       Yes       Yes     Yes             Stopped (user)
  • We can see that, by default, this signal is passed to the program. Not passing it to the program, just like what a shell does, solves our problem. To do this, use the handle command:
(gdb) handle SIGTSTP nopass
  • Type continue now and the program continues from where it was stopped.

Tried with: GDB 7.7.1 and Ubuntu 14.04