attrs package in Python

It is very rare that you learn something that completely changes how you program. Reading this post about the attrs package in Python was a revelation to me.

Coming from C++, I am not too big a fan of returning everything as lists and tuples. In many cases, you want structure and named attributes, and a class in Python is a good fit for this. However, writing a proper class with attributes and all the necessary basic methods is a pain.

This is where attrs comes in. Add its decorator to the class, declare the attributes of the class using its functions, and it will generate all the necessary dunder methods for you. You can also get some nice type checking and default values for the attributes.

  • First, let us get the biggest confusion about this package out of the way! It is called attrs when you install it because there is already another package called attr (the singular). But when you import and use it, it is called attr. I know it is irritating, but this is the way it is.

  • To install it:

$ sudo pip3 install attrs
  • To decorate the class, use attr.s. I read it as the plural attrs. To declare the class attributes, use the attr.ib function. I read it as attribute.
@attr.s
class Creature:
    eyes = attr.ib()
    legs = attr.ib()
  • Once declared like this, the attributes can be provided while constructing an object of the class:
c = Creature(2, 4)
  • Objects of this class can be constructed using keywords too:
c = Creature(legs=6, eyes=1000)
  • Notice that we have not specified any default values for the attributes. So, it will rightly complain when constructing without values:
c = Creature()

TypeError: __init__() missing 2 required positional arguments: 'eyes' and 'legs'
  • Default values can be specified for attributes:
@attr.s
class Creature:
    eyes = attr.ib(default=2)
    legs = attr.ib(default=6)

c = Creature()

Note that there is a rule you run up against if you provide default values for some attributes and not for others: an attribute without a default cannot be declared after an attribute that has one.

  • A beautiful __repr__ dunder method is automatically generated for your class. So, you can print any object:
c = Creature(3, 6)
print(c)

Creature(eyes=3, legs=6)

For me, this is the killer feature! It is far more informative than just looking at a bunch of list or dict values.

  • Attributes can be read or set just like normal class attributes:
c = Creature(2, 4)
c.eyes = 10
print(c.legs)
  • Comparison methods are already generated for you, so you can go ahead and compare objects:
c1 = Creature(2, 4)
c2 = Creature(3, 9)
c1 == c2
  • You can add some semblance of type checking to attributes by using the instance_of validators provided by the package:
@attr.s
class Creature:
    eyes = attr.ib(validator=attr.validators.instance_of(int))
    legs = attr.ib()

c = Creature(3.14, 6)

TypeError: ("'eyes' must be <class 'int'> (got 3.14 that is a <class 'float'>)."
  • By default, the attributes of each object are stored in a per-instance dictionary. You can switch this to use slots by changing the decorator:
@attr.s(slots=True)
class Creature:
    eyes = attr.ib()
    legs = attr.ib()
  • Are you curious to see the definition of the dunder methods it generates? You can do that using the inspect package:
import inspect
print(inspect.getsource(Creature.__init__))
print(inspect.getsource(Creature.__eq__))
print(inspect.getsource(Creature.__gt__))
  • Want to see all the fields the package creates for a class, along with their settings?
print(attr.fields(Creature))

(Attribute(name='eyes', default=NOTHING, validator=<instance_of validator for type <class 'int'>>, repr=True, cmp=True, hash=True, init=True, convert=None), Attribute(name='legs', default=NOTHING, validator=None, repr=True, cmp=True, hash=True, init=True, convert=None))
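The points above can be pulled together in one small sketch. It assumes attrs is installed and uses only attr.s, attr.ib and the instance_of validator described above. The Broken and SlottedCreature classes and the wings attribute are made up for illustration.

```python
import attr


@attr.s
class Creature:
    # eyes is mandatory and must be an int; legs falls back to a default.
    eyes = attr.ib(validator=attr.validators.instance_of(int))
    legs = attr.ib(default=4)


c = Creature(2)
print(c)                     # uses the generated __repr__
print(c == Creature(2, 4))   # uses the generated __eq__

# The validator runs inside the generated __init__.
try:
    Creature(3.14)
    validator_rejected = False
except TypeError:
    validator_rejected = True
print("float rejected:", validator_rejected)

# A mandatory attribute may not follow one that has a default.
try:
    @attr.s
    class Broken:
        eyes = attr.ib(default=2)
        legs = attr.ib()
    ordering_rejected = False
except ValueError:
    ordering_rejected = True
print("bad ordering rejected:", ordering_rejected)


@attr.s(slots=True)
class SlottedCreature:
    eyes = attr.ib()
    legs = attr.ib()


# Slotted objects have no per-instance dict, so unknown attributes fail.
s = SlottedCreature(2, 4)
try:
    s.wings = 2
    slots_blocked = False
except AttributeError:
    slots_blocked = True
print("undeclared attribute rejected:", slots_blocked)
```

Putting the mandatory attribute first and the defaulted one second is what keeps the generated __init__ signature valid, in the same way that a normal Python function cannot have a required parameter after an optional one.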

There is a lot more stuff in this awesome must-use package that can be read here

Tried with: attrs 16.1.0, Python 3.5.2 and Ubuntu 16.04

How to debug running Python program using PyCharm debugger

PDB is a fantastic debugger for Python, but it cannot be easily attached to an already running Python program. The recommended method to attach to a running Python program for debugging is GDB as described here. But examining the stack trace of a Python program and its Python objects in a C++ debugger like GDB is not straightforward.

I recently discovered that the GUI debugger in PyCharm IDE can be used to attach to a running Python program and debug it. It is easy to do this:

  • An already running program: Let us assume that I already have a running Python program whose source files are all inside a /home/joe/foobar directory. It has been running an important task for hours now and I have discovered a tiny bug that can be fixed in the running program by changing the value of a global variable.
  • Enable ptrace of any process: For this type of live debugging, we need any process to be able to ptrace any other process. However, the kernel in your distribution may be set up to only allow ptrace of a child process by its parent process. Check that the value of /proc/sys/kernel/yama/ptrace_scope is 0. If not, set it temporarily to 0:
$ echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
  • Install PyCharm: Download PyCharm and unzip the downloaded file. I use the Community Edition which is free.
  • Run PyCharm: Run bin/pycharm.sh and open the directory containing the source files of the running program.
  • If necessary, set the Python interpreter for this project to be the same as that of the running program. That is, we make sure they both use the same version of Python.
  • In the source files, set one or more breakpoints where you would like to stop, inspect or change the running program.
  • Attach: Now we are ready to attach to our running program! Choose Run → Attach to local process and choose the PID of our already running program from the list.
  • Debug: Once attached, the program should stop at our breakpoints. We can now step through the program and change the value of variables to effect some live bug fixes! Once done, we can disable the breakpoints and allow the program to continue by itself.
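
The ptrace check from the steps above can also be done from Python before starting PyCharm. This is only a small sketch: the read_ptrace_scope helper is a hypothetical name, and it assumes the Yama file is either readable or absent (in which case it returns None).

```python
PTRACE_SCOPE_PATH = "/proc/sys/kernel/yama/ptrace_scope"


def read_ptrace_scope(path=PTRACE_SCOPE_PATH):
    """Return the ptrace_scope value as an int, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        # The file is missing, unreadable, or does not contain a number.
        return None


if __name__ == "__main__":
    scope = read_ptrace_scope()
    if scope is None:
        print("Could not read", PTRACE_SCOPE_PATH)
    elif scope == 0:
        print("ptrace_scope is 0: attaching to any process is allowed")
    else:
        print("ptrace_scope is", scope, "- set it to 0 before attaching")
```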

Tried with: PyCharm 2016.2, Python 2.7.11 and Ubuntu 16.04

Visual Studio Code extensions that I use

  • CPP Tools: The official extension for working with C++ code. Automatically indexes all code in the currently open directory, offers auto-completion and syntax highlighting.

  • Python by Don Jayamanne: There are many Python extensions, but this seems to be the most popular one. Syntax highlighting, indexing and code completion.

  • Vim: There are many Vim extensions, but this seems to be the most popular one. It has entire universes to traverse before it can be as good as Vrapper, the Vim extension for Eclipse. This VSCode extension offers very basic navigation and editing commands.

  • Git Blame: This extension does one little thing that I need every day to work with code from other people: know who modified a line of code. This extension shows that for the current line in the status bar.

  • Matlab: I need to regularly browse through some MATLAB files. This extension offers syntax highlighting of Matlab files.

Tried with: Visual Studio Code 1.4 and Ubuntu 16.04

OrderedDict in Python

Lists and dictionaries are the fundamental data structures in Python. One of the problems I regularly face with a dictionary is that I cannot iterate over the keys in a certain order. In many problems, I have the keys in a certain order and I am able to insert them in that order, but I need to be able to iterate over them later in that same order.

The solution to exactly this problem is the OrderedDict. It is just like the dictionary, but maintains the order of insertion of the keys. Later, when you iterate over its keys, the order is the same as the one you inserted them in. The pure-Python implementation in CPython does this by maintaining a doubly linked list of the keys alongside a regular dictionary.

Usage of the OrderedDict is the same as that of a dictionary in almost all ways. The main difference is in creating it:

import collections

d = collections.OrderedDict()
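
A short sketch of what the ordering guarantee looks like in practice. The keys and values here are made up for illustration; move_to_end and popitem are standard OrderedDict methods.

```python
import collections

d = collections.OrderedDict()
d["first"] = 1
d["second"] = 2
d["third"] = 3

# Iteration follows insertion order.
print(list(d.keys()))

# Updating an existing key keeps its original position.
d["first"] = 100
print(list(d.items()))

# Keys can be reordered explicitly, and removed from either end.
d.move_to_end("first")
oldest = d.popitem(last=False)
print(oldest, list(d.keys()))

# Unlike plain dictionaries, two OrderedDicts compare equal only if
# their items are also in the same order.
a = collections.OrderedDict([("x", 1), ("y", 2)])
b = collections.OrderedDict([("y", 2), ("x", 1)])
print(a == b)
```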

Tried with: Python 3.4

How to speedtest from the shell

The SpeedTest website uses a Flash program that may not work on many Linux browsers. If you prefer to check the download and upload bandwidth from the shell, that is easy.

Install the speedtest-cli Python module from PyPI:

$ sudo pip install speedtest-cli

Run the test:

$ speedtest-cli

Tried with: speedtest-cli 0.3.4 and Ubuntu 15.10

AttributeError with Python Enum

Problem

I had code that had worked correctly with Python 2.7 and that used the old enum module. Recently it started throwing this error:

$ ./foo.py 
Traceback (most recent call last):
  File "./foo.py", line 146, in <module>
    main()
  File "./foo.py", line 100, in draw_plot
    if PlotType.Line == plot_type:
  File "/usr/local/lib/python2.7/dist-packages/enum/__init__.py", line 373, in __getattr__
    raise AttributeError(name)
AttributeError: Line

Solution

This error is caused when the enum34 module has been installed alongside the old enum module. enum34 is the backport for Python 2.x of the standard enum in Python 3.4. Many packages have started to use it and so it will be installed implicitly while installing another package. enum34 overrides the old enum files and causes this error.

You could remove enum34 and get rid of this error. But since Python 3.x has already adopted a new enum type, it might be wiser to uninstall the old enum and rewrite your code to use enum34. Its syntax is shown in this example.
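
For reference, here is a minimal sketch of the class-based syntax shared by enum34 and the standard enum module in Python 3.4 and later. PlotType.Line comes from the traceback above; the Bar member and the numeric values are assumptions made for illustration.

```python
from enum import Enum


class PlotType(Enum):
    Line = 1
    Bar = 2  # hypothetical second member


plot_type = PlotType.Line

# Members are compared by identity, so this check works as in the old code.
if PlotType.Line == plot_type:
    print("drawing a line plot")

# Members can be looked up by name or by value.
print(PlotType["Line"] is plot_type)
print(PlotType(2))
print(plot_type.name, plot_type.value)
```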

Tried with: Python 2.7.6 and Ubuntu 14.04

How to visualize Python profile data with SnakeViz

Profile data visualized with SnakeViz

SnakeViz is a beautiful visualization tool for the profile statistics generated by the Python cProfile module. The data is presented in the browser as a colorful sunburst and you explore the data from the inner core outwards. You can also choose how deep you want the sunburst. Below the visualization, SnakeViz also presents the typical function call table with various columns and this table can be sorted based on any of the columns.

Installing SnakeViz is easy:

$ sudo pip install snakeviz

To run it on a stats file:

$ snakeviz foo.pstats

The visualization is opened in your default browser with the URL http://127.0.0.1:8080
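
If you do not have a stats file yet, one way to produce it is with the cProfile module itself. The sketch below profiles a made-up slow_sum function and writes foo.pstats to the current directory; running python -m cProfile -o foo.pstats on a whole script is another option.

```python
import cProfile
import pstats


def slow_sum(n):
    """Toy workload, used here only to have something to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
slow_sum(100000)
profiler.disable()

# Write the statistics in the binary format that SnakeViz reads.
profiler.dump_stats("foo.pstats")

# Optional sanity check: load the file back and print the top entries.
stats = pstats.Stats("foo.pstats")
stats.sort_stats("cumulative").print_stats(5)
```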

Tried with: SnakeViz 0.2.1, Python 2.7.6 and Ubuntu 14.04

How to natural sort in Python

A lot of sources generate filenames or strings numbered naturally: bat_1, ... bat_9, bat_10. The sort in Python is lexicographic and does not result in a natural ordering of such strings.
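
To make the problem concrete, here is a small standard-library sketch. The natural_key helper is a hypothetical name and only handles runs of plain digits, so treat it as an illustration of the idea rather than a replacement for a proper package.

```python
import re

names = ["bat_10", "bat_9", "bat_1", "bat_2"]

# Lexicographic sort compares character by character, so "10" lands before "2".
print(sorted(names))  # ['bat_1', 'bat_10', 'bat_2', 'bat_9']


def natural_key(text):
    """Split text into digit and non-digit runs, turning digit runs into ints."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", text)]


# With the key, numbers inside the strings are compared as numbers.
print(sorted(names, key=natural_key))  # ['bat_1', 'bat_2', 'bat_9', 'bat_10']
```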

The natsort Python package can be used to sort such strings naturally.

Installation for Python 3.x:

$ sudo pip3 install natsort

Usage is simple:

import natsort
out_list = natsort.natsorted(fname_list)

Other uses of this package can be found here.

Tried with: natsort 3.5.2, Python 3.4 and Ubuntu 14.04