Python JSON dump misses last newline

Problem

The dump method from the Python json module can be used to write a suitable Python object, usually a dictionary or list, to a JSON file. However, I discovered that Unix shell programs have problems working with such a JSON file. This turned out to be because the dump method does not end the last line with a newline character. According to the POSIX definition of a line in a text file, every line needs to end with a newline character. (See here).

Solution

I replaced this:

json.dump(json_data, open("foobar.json", "w"), indent=4)

with this:

with open("foobar.json", "w") as json_file:
    json_text = json.dumps(json_data, indent=4)
    json_file.write("{}\n".format(json_text))  # json.dumps() does not add a trailing newline
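
The same fix can be wrapped in a small helper. This is only a sketch: the function name dump_json_with_newline and the sample data are made up for illustration, and it writes to a temporary directory so that it can be run safely.

```python
import json
import os
import tempfile


def dump_json_with_newline(data, path, indent=4):
    """Write data as JSON to path, ending the file with a newline."""
    with open(path, "w") as json_file:
        json.dump(data, json_file, indent=indent)
        json_file.write("\n")  # json.dump() does not add a trailing newline


# Usage: write to a temporary file and check the last character.
tmp_dir = tempfile.mkdtemp()
json_path = os.path.join(tmp_dir, "foobar.json")
dump_json_with_newline({"foo": 1, "bar": [2, 3]}, json_path)

with open(json_path) as json_file:
    text = json_file.read()

print(text.endswith("\n"))
```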

Invalid version number error with Python

Problem

I tried to import a Python package that I had installed from source. The import failed with this error:

File "/usr/lib/python2.7/distutils/version.py", line 40, in __init__
  self.parse(vstring)
File "/usr/lib/python2.7/distutils/version.py", line 107, in parse
  raise ValueError, "invalid version number '%s'" % vstring
ValueError: invalid version number '2.7.0rc3'

Solution

It turns out that the package version number has to be in a strict format that distutils can parse: x.y or x.y.z, optionally followed by a pre-release tag such as a1 or b2. A suffix like rc3 is not accepted, so Python throws this error.

Since I had the source code of this package, I found all instances of 2.7.0rc3 and changed them to 2.7.0. Typically, these will be in the setup.py and version.py files. I removed the previously installed package and reinstalled from the changed source code. After this, the import succeeded.
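
To check a version string before installing, something like the following sketch may help. The regular expression is my approximation of the strict pattern that distutils applies, not a copy of it, and the helper name is_strict_version is hypothetical.

```python
import re

# Approximation of the strict version format checked by distutils:
# x.y or x.y.z, optionally followed by aN or bN (alpha/beta pre-release).
STRICT_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(\.(\d+))?([ab](\d+))?$")


def is_strict_version(vstring):
    """Return True if vstring matches the strict x.y[.z][aN|bN] format."""
    return STRICT_VERSION_RE.match(vstring) is not None


for candidate in ["2.7.0", "2.7", "2.7.0b1", "2.7.0rc3"]:
    print(candidate, is_strict_version(candidate))
```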

Tried with: Ubuntu 14.04

CMake error building with Python libraries

Problem

I got this error from CMake when building a project that needs to link with Python 3.4 libraries:

-- Found PythonInterp: /usr/bin/python3 (found suitable version "3.4.3", minimum required is "3.0")
-- Could NOT find PythonLibs (missing:  PYTHON_LIBRARIES PYTHON_INCLUDE_DIRS) (Required is at least version "3.0")

Solution

It turned out that the CMake available on my system only supported finding Python 3 packages up to version 3.3. I could make it support Python 3.4 by editing two files:

  • In file /usr/share/cmake-3.4/Modules/FindPythonInterp.cmake find the line containing _PYTHON3_VERSIONS and prepend 3.4 to the versions already listed there.

  • In file /usr/share/cmake-3.4/Modules/FindPythonLibs.cmake find the line containing _PYTHON3_VERSIONS and prepend 3.4 to the versions already listed there.

I was able to build with Python 3.x libraries after that.
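
When CMake cannot find PythonLibs, it can also help to confirm where the interpreter itself keeps its headers and libraries. The sketch below uses only the standard sysconfig module and should be run with the same interpreter that CMake found. It only prints paths for comparison and is not part of the fix above.

```python
import sysconfig

# Directory holding Python.h for the running interpreter.
include_dir = sysconfig.get_paths()["include"]
# Directory holding the Python library; may be None on some platforms.
lib_dir = sysconfig.get_config_var("LIBDIR")
# Major.minor version string of the running interpreter.
version = sysconfig.get_python_version()

print("Python version:", version)
print("Include directory:", include_dir)
print("Library directory:", lib_dir)
```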

Error building Caffe with Python 3 support

Caffe can be built with support for Python 2.x. This allows you to invoke Caffe easily from Python code. However, I wanted to call Caffe from Python 3.x code.

  • I built Boost with Python 3.x support. I could see that libboost_python3 library files were generated.

  • I added this to the normal CMake command that I use to build Caffe: -Dpython_version=3

Sadly, this popped up errors of this type:

libboost_python.so: undefined reference to `PyClass_Type'

This type of error indicates that the Python 2.x Boost library (libboost_python.so) was being linked against the Python 3.x libraries, where the Python 2-only symbol PyClass_Type does not exist.

attrs package in Python

It is very rare that you learn something that completely changes how you program. Reading this post about the attrs package in Python was a revelation to me.

Coming from C++, I am not too big a fan of returning everything as lists and tuples. In many cases, you want to have structure and attributes, and the class in Python is a good fit for this. However, creating a proper class with attributes that has all the necessary basic methods is a pain.

This is where attrs comes in. Add its decorator to the class and designate the attributes of the class using its methods and it will generate all the necessary dunder methods for you. You can also get some nice type checking and default values for the attributes too.
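
To get a feel for the boilerplate this saves, here is a rough hand-written sketch of a small class with two attributes. It is not the code that attrs generates, only my approximation of a few of the methods you would otherwise write yourself. The class name Creature and its eyes and legs attributes are just for illustration.

```python
class Creature:
    """Hand-written class with the basic methods spelled out."""

    def __init__(self, eyes, legs):
        self.eyes = eyes
        self.legs = legs

    def __repr__(self):
        # Readable representation showing every attribute by name.
        return "Creature(eyes={!r}, legs={!r})".format(self.eyes, self.legs)

    def __eq__(self, other):
        # Only compare with objects of exactly the same class.
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.eyes, self.legs) == (other.eyes, other.legs)


c = Creature(3, 6)
print(c)  # prints: Creature(eyes=3, legs=6)
```

Every new attribute means touching each of these methods again, which is the chore the decorator removes.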

  • First, let us get the biggest confusion about this package out of the way! It is called attrs when you install it because there is already another package called attr (the singular). But when you import and use it, it is called attr. I know it is irritating, but this is the way it is.

  • To install it:

$ sudo pip3 install attrs
  • To decorate the class, use attr.s. I read it as the plural attrs. To declare the class attributes, use the attr.ib function. I read it as attribute.
@attr.s
class Creature:
    eyes = attr.ib()
    legs = attr.ib()
  • Once declared like this, the attributes can be provided while constructing an object of the class:
c = Creature(2, 4)
  • Object of this class can be constructed using keywords too:
c = Creature(legs=6, eyes=1000)
  • Notice that we have not specified any default value for the attributes. So, it will rightfully complain when constructing without values:
c = Creature()

TypeError: __init__() missing 2 required positional arguments: 'eyes' and 'legs'
  • Default values can be specified for attributes:
@attr.s
class Creature:
    eyes = attr.ib(default=2)
    legs = attr.ib(default=6)

c = Creature()

Note that there are some rules you run up against if you provide default values for some attributes and not for others. For example, an attribute without a default cannot be declared after one that has a default.

  • A beautiful __repr__ dunder method is automatically generated for your class. So, you can print any object:
c = Creature(3, 6)
print(c)

Creature(eyes=3, legs=6)

For me, this is the killer feature! It is far more informative than just looking at a bunch of list or dict values.

  • Attributes can be read or set just like normal class attributes:
c = Creature(2, 4)
c.eyes = 10
print(c.legs)
  • Comparison methods are already generated for you, so you can go ahead and compare objects:
c1 = Creature(2, 4)
c2 = Creature(3, 9)
c1 == c2
  • You can add some semblance of type checking to attributes by using the instance_of validators provided by the package:
@attr.s
class Creature:
    eyes = attr.ib(validator=attr.validators.instance_of(int))
    legs = attr.ib()

c = Creature(3.14, 6)

TypeError: ("'eyes' must be <class 'int'> (got 3.14 that is a <class 'float'>)."
  • By default, class attributes are stored in a dictionary. You can switch this to use slots by changing the decorator:
@attr.s(slots=True)
class Creature:
    eyes = attr.ib()
    legs = attr.ib()
  • Are you curious to see the definition of the dunder methods it generates? You can do that using the inspect package:
import inspect
print(inspect.getsource(Creature.__init__))
print(inspect.getsource(Creature.__eq__))
print(inspect.getsource(Creature.__gt__))
  • Want to see all the fields the package creates for a class?
print(attr.fields(Creature))

(Attribute(name='eyes', default=NOTHING, validator=<instance_of validator for type <class 'int'>>, repr=True, cmp=True, hash=True, init=True, convert=None), Attribute(name='legs', default=NOTHING, validator=None, repr=True, cmp=True, hash=True, init=True, convert=None))

There is a lot more stuff in this awesome must-use package that can be read here

Tried with: attrs 16.1.0, Python 3.5.2 and Ubuntu 16.04

How to debug running Python program using PyCharm debugger

PDB is a fantastic debugger for Python, but it cannot be easily attached to an already running Python program. The recommended method to attach to a running Python program for debugging is GDB, as described here. However, examining the stack trace of a Python program and its Python objects in a C++ debugger like GDB is not straightforward.

I recently discovered that the GUI debugger in PyCharm IDE can be used to attach to a running Python program and debug it. It is easy to do this:

  • An already running program: Let us assume that I already have a running Python program whose source files are all inside a /home/joe/foobar directory. It has been running an important task for hours now and I have discovered a tiny bug that can be fixed in the running program by changing the value of a global variable.
  • Enable ptrace of any process: For this type of live debugging, we need any process to be able to ptrace any other process. However, the kernel in your distribution may be set up to only allow ptrace of a child process by a parent process. Check that the value of /proc/sys/kernel/yama/ptrace_scope is 0. If not, set it temporarily to 0:
$ echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
  • Install PyCharm: Download PyCharm and unzip the downloaded file. I use the Community Edition which is free.
  • Run PyCharm: Run bin/pycharm.sh and open the directory containing the source files of the running program.
  • If necessary, set the Python interpreter for this project to be the same as that of the running program. That is, we make sure they both use the same version of Python.
  • In the source files, set one or more breakpoints where you would like to stop, inspect or change the running program.
  • Attach: Now we are ready to attach to our running program! Choose Run → Attach to local process and choose the PID of our already running program from the list.
  • Debug: Once attached, the program should stop at our breakpoints. We can now step through the program and change the value of variables to effect some live bug fixes! Once done, we can disable the breakpoints and allow the program to continue by itself.
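
The ptrace check from the steps above can also be done from Python before trying to attach. This is a minimal sketch: the function name read_ptrace_scope is hypothetical, and the path is passed as a parameter only so that the function can be tried on other files.

```python
PTRACE_SCOPE_PATH = "/proc/sys/kernel/yama/ptrace_scope"


def read_ptrace_scope(path=PTRACE_SCOPE_PATH):
    """Return the ptrace_scope value as an int, or None if it cannot be read."""
    try:
        with open(path) as scope_file:
            return int(scope_file.read().strip())
    except (IOError, OSError, ValueError):
        # File missing (for example, a kernel without Yama) or not a number.
        return None


scope = read_ptrace_scope()
if scope is None:
    print("Could not read", PTRACE_SCOPE_PATH)
elif scope == 0:
    print("ptrace_scope is 0: attaching to a running process is allowed")
else:
    print("ptrace_scope is {}: set it to 0 before attaching".format(scope))
```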

Tried with: PyCharm 2016.2, Python 2.7.11 and Ubuntu 16.04

Visual Studio Code extensions that I use

  • CPP Tools: The official extension for working with C++ code. Automatically indexes all code in the currently open directory, offers auto-completion and syntax highlighting.

  • Python by Don Jayamanne: There are many Python extensions, but this seems to be the most popular one. Syntax highlighting, indexing and code completion.

  • Vim: There are many Vim extensions, but this seems to be the most popular one. It has entire universes to traverse before it can be as good as Vrapper, the Vim extension for Eclipse. This VSCode extension offers very basic navigation and editing commands.

  • Git Blame: This extension does one little thing that I need every day to work with code from other people: know who modified a line of code. This extension shows that for the current line in the status bar.

  • Matlab: I need to regularly browse through some MATLAB files. This extension offers syntax highlighting of Matlab files.

Tried with: Visual Studio Code 1.4 and Ubuntu 16.04

OrderedDict in Python

Lists and dictionaries are the fundamental data structures in Python. One of the problems I regularly face with a dictionary is that I cannot iterate over the keys in a certain order. In many problems, I have the keys in a certain order and I am able to insert them in that order, but I need to be able to iterate over them later in that same order.

The solution to exactly this problem is the OrderedDict. It is just like the dictionary, but it maintains the order of insertion of the keys. Later, when you iterate over its keys, the order is the same as the order you inserted them in. I am guessing it is implemented by maintaining the keys in a list alongside a dictionary.

Usage of the OrderedDict is the same as that of a dictionary in all ways. The only difference is in creating it:

import collections

d = collections.OrderedDict()
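
A short sketch of the insertion-order behaviour, with made-up keys chosen so that they are not in alphabetical order:

```python
import collections

d = collections.OrderedDict()
d["zebra"] = 1
d["apple"] = 2
d["mango"] = 3

# Iteration follows the order in which the keys were inserted.
for key, value in d.items():
    print(key, value)
```

From Python 3.7 onwards the built-in dict also preserves insertion order, but OrderedDict still offers extras such as move_to_end().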

Tried with: Python 3.4

How to speedtest from the shell

The SpeedTest website uses a Flash program that may not work on many Linux browsers. If you prefer to check the download and upload bandwidth from the shell, that is easy.

Install the speedtest-cli Python module from PyPI:

$ sudo pip install speedtest-cli

Run the test:

$ speedtest-cli

Tried with: speedtest-cli 0.3.4 and Ubuntu 15.10

AttributeError with Python Enum

Problem

I had code that had worked correctly with Python 2.7 and that used the old enum module. Recently it started throwing this error:

$ ./foo.py 
Traceback (most recent call last):
  File "./foo.py", line 146, in <module>
    main()
  File "./foo.py", line 100, in draw_plot
    if PlotType.Line == plot_type:
  File "/usr/local/lib/python2.7/dist-packages/enum/__init__.py", line 373, in __getattr__
    raise AttributeError(name)
AttributeError: Line

Solution

This error is caused when the enum34 module has been installed alongside the old enum module. enum34 is the backport for Python 2.x of the standard enum in Python 3.4. Many packages have started to use it and so it will be installed implicitly while installing another package. enum34 overrides the old enum files and causes this error.

You could remove enum34 and get rid of this error. But since Python 3.x has already adopted a new enum type, it might be wiser to uninstall the old enum and rewrite your code to use enum34. Its syntax is shown in this example.
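
As a rough idea of the newer syntax, here is a minimal sketch using the standard enum module of Python 3.4, which enum34 backports. PlotType.Line comes from the traceback above. The Bar member and the describe function are hypothetical, added only for illustration.

```python
from enum import Enum


class PlotType(Enum):
    Line = 1
    Bar = 2  # hypothetical second member for illustration


def describe(plot_type):
    """Return a short description for the given PlotType member."""
    if PlotType.Line == plot_type:
        return "line plot"
    return "other plot"


print(describe(PlotType.Line))
print(describe(PlotType.Bar))
```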

Tried with: Python 2.7.6 and Ubuntu 14.04