Code Yarns ‍👨‍💻

Undefined symbol FreeCudaMemoryCallbacksRegistry

📅 2020-Mar-04 ⬩ ✍️ Ashwin Nanjappa

Problem

I installed PyTorch on my system and the very first import failed with this error:

>>> import torch
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    import torch
  File "/usr/lib/python3/dist-packages/bpython/curtsiesfrontend/repl.py", line 251, in load_module
    module = self.loader.load_module(name)
  File "/home/joe/.local/lib/python3.6/site-packages/torch/__init__.py", line 81, in <module>
    from torch._C import *
  File "/usr/lib/python3/dist-packages/bpython/curtsiesfrontend/repl.py", line 251, in load_module
    module = self.loader.load_module(name)
ImportError: /home/joe/.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so: undefined symbol: _ZN3c1031FreeCudaMemoryCallbacksRegistryEv

Solution

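The missing symbol, _ZN3c1031FreeCudaMemoryCallbacksRegistryEv, is the mangled name of c10::FreeCudaMemoryCallbacksRegistry(), so the problem had to be in CUDA or in PyTorch's interface to CUDA.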
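To see which namespace and function the loader is complaining about, the mangled name from the error can be decoded. In practice a tool such as c++filt does this; the sketch below is only a minimal illustration that handles the one simple form seen here (a function in nested namespaces that takes no arguments). The function name demangle_nested_name is made up for this example.

```python
import re


def demangle_nested_name(mangled):
    """Decode a simple Itanium-ABI name of the form _ZN<len><name>...Ev.

    This is a deliberately tiny sketch: it only understands nested names
    made of length-prefixed identifiers, followed by 'E' (end of nested
    name) and 'v' (no parameters). Anything else raises ValueError.
    """
    if not mangled.startswith("_ZN"):
        raise ValueError("not a nested mangled name")
    pos = 3
    parts = []
    while pos < len(mangled) and mangled[pos].isdigit():
        digits = re.match(r"\d+", mangled[pos:]).group()
        length = int(digits)
        pos += len(digits)
        name = mangled[pos:pos + length]
        if len(name) != length:
            raise ValueError("truncated identifier")
        parts.append(name)
        pos += length
    if not parts or mangled[pos:] != "Ev":
        raise ValueError("unsupported encoding")
    return "::".join(parts) + "()"


if __name__ == "__main__":
    # Symbol copied from the ImportError above.
    print(demangle_nested_name("_ZN3c1031FreeCudaMemoryCallbacksRegistryEv"))
    # prints: c10::FreeCudaMemoryCallbacksRegistry()
```

The c10 namespace in the result points at PyTorch's own core libraries rather than at the CUDA toolkit.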

I checked that the /usr/local/cuda symlink was pointing to my CUDA 10.1 installation. I also made sure that /usr/local/cuda/lib64 was in my LD_LIBRARY_PATH. Both were set correctly, so neither was the cause of the problem.
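These two checks can also be scripted. The following is a rough sketch, assuming a Linux system where LD_LIBRARY_PATH is colon-separated; the helper name dir_in_search_path is hypothetical, and empty entries in the variable are simply ignored here.

```python
import os


def dir_in_search_path(directory, search_path):
    """Return True if `directory` appears in a colon-separated search path.

    Both sides are passed through os.path.realpath so that symlinks and
    trailing slashes do not cause false negatives.
    """
    wanted = os.path.realpath(directory)
    entries = [entry for entry in search_path.split(":") if entry]
    return any(os.path.realpath(entry) == wanted for entry in entries)


if __name__ == "__main__":
    # Where does the CUDA symlink actually point?
    print("/usr/local/cuda ->", os.path.realpath("/usr/local/cuda"))
    # Is the CUDA library directory on the loader's search path?
    current = os.environ.get("LD_LIBRARY_PATH", "")
    print("lib64 in LD_LIBRARY_PATH:",
          dir_in_search_path("/usr/local/cuda/lib64", current))
```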

I then used the ldd command to check which shared libraries /home/joe/.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so was resolving at load time. This revealed that all the dependent libraries were in /home/joe/.local/lib/python3.6/site-packages, except for one: /usr/local/lib/libc10_cuda.so. It turns out that libc10_cuda.so is a library shipped with PyTorch itself, and I had no idea why an old copy of it was sitting in /usr/local/lib. It was probably left behind by an old installation of PyTorch on Ubuntu that did not uninstall cleanly. The stale copy presumably predates the symbol that the newer libtorch_python.so expects, which would explain the undefined symbol error.
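The same inspection can be automated by parsing the output of ldd and flagging any dependency that resolves outside the directories you expect. This is a sketch only: the function names are invented for illustration, and the sample text in the demo is an illustrative stand-in for ldd output (with made-up load addresses), not a capture from my machine.

```python
import re
import subprocess


def run_ldd(library_path):
    """Run ldd on a shared library and return its textual output."""
    result = subprocess.run(["ldd", library_path], stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout


def parse_ldd_output(text):
    """Map each dependency name to the path it resolved to.

    Lines without '=>' (such as the dynamic loader itself) are skipped.
    Unresolved libraries keep the literal value 'not found'.
    """
    deps = {}
    for line in text.splitlines():
        if "=>" not in line:
            continue
        name, _, rest = line.partition("=>")
        # Drop the trailing load address, e.g. "(0x00007f02)".
        target = re.sub(r"\s*\(0x[0-9a-fA-F]+\)\s*$", "", rest).strip()
        deps[name.strip()] = target
    return deps


def unexpected_locations(deps, allowed_prefixes):
    """Return the dependencies that resolved outside the allowed directories."""
    allowed = tuple(prefix.rstrip("/") + "/" for prefix in allowed_prefixes)
    return {name: path for name, path in deps.items()
            if path.startswith("/") and not path.startswith(allowed)}


if __name__ == "__main__":
    # Illustrative sample; for a real check use run_ldd(<path to the .so>).
    sample = (
        "\tlibc10.so => /home/joe/.local/lib/python3.6/site-packages/torch/lib/libc10.so (0x00007f01)\n"
        "\tlibc10_cuda.so => /usr/local/lib/libc10_cuda.so (0x00007f02)\n"
        "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f03)\n"
    )
    expected = ["/home/joe/.local/lib/python3.6/site-packages", "/lib", "/usr/lib"]
    for name, path in unexpected_locations(parse_ldd_output(sample), expected).items():
        print(name, "resolved to", path)
```

System directories are listed as allowed prefixes because libraries such as libc legitimately live there; anything else that resolves outside the PyTorch installation is worth a closer look.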

I removed this stale file, after which PyTorch picked up its own libc10_cuda.so from site-packages and the import worked fine!