How to view files over SSH using SimpleHTTPServer

It is convenient to connect to a remote computer using SSH and work at the shell. But viewing image files and other such common files can be a problem. Using an X server might not always be possible. A simple solution that works for me is to use the SimpleHTTPServer module that ships with Python 2.

  • Change to the directory which holds the files you want to view from a remote computer.
  • Run the SimpleHTTPServer module there and provide a port number for the server (on Python 3 the module is named http.server, so use python3 -m http.server 8901 instead):
$ python -m SimpleHTTPServer 8901
  • On the local computer, open a browser and connect to the server using the address: http://put-remote-computer-ip-here:8901
  • You can now view image files and other common file types right in the browser.
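The same idea can also be driven from a Python 3 script instead of the command line. The following is only a minimal sketch, assuming Python 3.7 or later: the helper name start_file_server, the temporary directory and the hello.txt file are made up for illustration, and the server is bound to 127.0.0.1 so the example can fetch its own file. On a real remote machine you would bind to an address that your local computer can reach.

```python
import functools
import http.client
import http.server
import tempfile
import threading
from pathlib import Path


def start_file_server(directory, port=0, host="127.0.0.1"):
    """Serve `directory` over HTTP in a background thread.

    Hypothetical helper for illustration; port 0 asks the OS for a free port.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    server = http.server.ThreadingHTTPServer((host, port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


with tempfile.TemporaryDirectory() as tmp:
    # Create a sample file to serve (stand-in for your images).
    Path(tmp, "hello.txt").write_text("hello over http")
    server = start_file_server(tmp)

    # Fetch the file the same way a browser would.
    conn = http.client.HTTPConnection(
        "127.0.0.1", server.server_address[1], timeout=5
    )
    conn.request("GET", "/hello.txt")
    fetched = conn.getresponse().read().decode()
    conn.close()

    # Stop the server and release the port.
    server.shutdown()
    server.server_close()

print(fetched)
```

Note that this kind of server has no authentication, so anything in the served directory is visible to whoever can reach the port.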

Tried with: Ubuntu 18.04

How to configure local computer for FastAI course

I wanted to check out the Practical Deep Learning for Coders course by FastAI. However, I noticed that the course provided configuration instructions mainly for cloud GPU instance providers like Paperspace. I have a notebook and a desktop computer with powerful NVIDIA GPUs and wanted to try the course on my local machines. The course material is also provided in the form of Jupyter notebooks, while I intended to turn those into Python programs to run locally.

Here are the steps I followed to set up my local computer for the FastAI course:

  • The local computer was running Ubuntu 16.04 and NVIDIA drivers were already installed on it and working.
  • CUDA 9.0 was installed using the online instructions from NVIDIA.
  • The latest release of CuDNN was installed as described here.
  • Conda was installed and configured as described here. You should be able to run conda info from the shell. I wanted to try the course without resorting to Anaconda, but that seems unnecessarily complicated.
  • Clone the fastai GitHub repo:
$ git clone git@github.com:fastai/fastai.git
  • In the fastai directory, use this command to create a Conda environment named fastai and install all the required Python packages (including PyTorch) under that:
$ conda env update
  • Activate the fastai environment:
$ conda activate fastai
  • Add the path to fastai to your PYTHONPATH environment variable, so that you can import it from Python.

  • Open a Python interpreter and check if you can import PyTorch and it has CUDA and CuDNN support:

$ python
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.backends.cudnn.enabled
True
  • In the same interpreter, check if you can import the fastai package:
>>> from fastai.imports import *

You are now ready to execute any of the code shown in the course at a Python interpreter or inside Python scripts. Note that you will still need to download any additional datasets needed by the course. You will find these instructions in the Jupyter notebooks or course material.

How to discover type hierarchy in Python

Given any type in Python, you can easily discover its ancestor and descendant types. This ease of discovery of the internals of the language is one of my favorite features of Python.

  • Remember that all types are descended from the object type.

  • Even type is a type and it is a child of the object type.

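  • The __base__ attribute of any type holds its parent type (the type object itself, not a string); use __base__.__name__ to get the parent's name. For types with multiple parents, __bases__ holds the full tuple.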

  • The __subclasses__ method of any type lists the child types.

  • To determine which are the standard types (or builtin types or builtins as they are called in Python), check the __module__ attribute of the type. If it is builtins in Python 3 or __builtin__ in Python 2, then that is a standard type.

  • If you start from object, you can actually list the entire type hierarchy tree. A script that does just that can be found here.

  • In Python 3.5.2, I found that there are 143 builtin types (most of them are just types of Exception) in the tree:

object
+-- type
+-- dict_values
    +-- odict_values
+-- tuple_iterator
+-- set
+-- fieldnameiterator
+-- frame
+-- dict_keyiterator
+-- PyCapsule
+-- coroutine
+-- bytearray
+-- NoneType
+-- list
+-- dict
+-- getset_descriptor
+-- method-wrapper
+-- method
+-- str_iterator
+-- formatteriterator
+-- str
+-- set_iterator
+-- range_iterator
+-- memoryview
+-- cell
+-- generator
+-- map
+-- list_iterator
+-- stderrprinter
+-- reversed
+-- method_descriptor
+-- code
+-- weakproxy
+-- int
    +-- bool
+-- ellipsis
+-- module
+-- dict_items
    +-- odict_items
+-- bytearray_iterator
+-- Struct
+-- moduledef
+-- filter
+-- staticmethod
+-- tuple
+-- frozenset
+-- managedbuffer
+-- coroutine_wrapper
+-- function
+-- builtin_function_or_method
+-- odict_iterator
+-- float
+-- range
+-- super
+-- dict_keys
    +-- odict_keys
+-- list_reverseiterator
+-- bytes_iterator
+-- member_descriptor
+-- wrapper_descriptor
+-- property
+-- instancemethod
+-- zip
+-- weakref
+-- slice
+-- longrange_iterator
+-- dict_valueiterator
+-- EncodingMap
+-- callable_iterator
+-- mappingproxy
+-- BaseException
    +-- Exception
        +-- TypeError
        +-- StopAsyncIteration
        +-- SyntaxError
            +-- IndentationError
                +-- TabError
        +-- AttributeError
        +-- AssertionError
        +-- StopIteration
        +-- MemoryError
        +-- BufferError
        +-- NameError
            +-- UnboundLocalError
        +-- LookupError
            +-- IndexError
            +-- KeyError
        +-- EOFError
        +-- ImportError
        +-- ValueError
            +-- UnicodeError
                +-- UnicodeEncodeError
                +-- UnicodeDecodeError
                +-- UnicodeTranslateError
        +-- RuntimeError
            +-- RecursionError
            +-- NotImplementedError
        +-- SystemError
        +-- Warning
            +-- UserWarning
            +-- DeprecationWarning
            +-- BytesWarning
            +-- SyntaxWarning
            +-- PendingDeprecationWarning
            +-- FutureWarning
            +-- ResourceWarning
            +-- ImportWarning
            +-- RuntimeWarning
            +-- UnicodeWarning
        +-- ReferenceError
        +-- OSError
            +-- ConnectionError
                +-- BrokenPipeError
                +-- ConnectionAbortedError
                +-- ConnectionRefusedError
                +-- ConnectionResetError
            +-- BlockingIOError
            +-- NotADirectoryError
            +-- PermissionError
            +-- FileExistsError
            +-- TimeoutError
            +-- IsADirectoryError
            +-- InterruptedError
            +-- ProcessLookupError
            +-- FileNotFoundError
            +-- ChildProcessError
        +-- ArithmeticError
            +-- FloatingPointError
            +-- OverflowError
            +-- ZeroDivisionError
    +-- GeneratorExit
    +-- KeyboardInterrupt
    +-- SystemExit
+-- dict_itemiterator
+-- classmethod
+-- NotImplementedType
+-- iterator
+-- bytes
+-- enumerate
+-- classmethod_descriptor
+-- complex
+-- traceback
+-- weakcallableproxy
  • Note how bool is a child type of the int type.

  • In Python 2.7.12, I found that there are 60 builtin types in the tree:

object
+-- type
+-- weakref
+-- weakcallableproxy
+-- weakproxy
+-- int
    +-- bool
+-- basestring
    +-- str
    +-- unicode
+-- bytearray
+-- list
+-- NoneType
+-- NotImplementedType
+-- traceback
+-- super
+-- xrange
+-- dict
+-- set
+-- slice
+-- staticmethod
+-- complex
+-- float
+-- buffer
+-- long
+-- frozenset
+-- property
+-- memoryview
+-- tuple
+-- enumerate
+-- reversed
+-- code
+-- frame
+-- builtin_function_or_method
+-- instancemethod
+-- function
+-- classobj
+-- dictproxy
+-- generator
+-- getset_descriptor
+-- wrapper_descriptor
+-- instance
+-- ellipsis
+-- member_descriptor
+-- file
+-- PyCapsule
+-- cell
+-- callable-iterator
+-- iterator
+-- EncodingMap
+-- fieldnameiterator
+-- formatteriterator
+-- module
+-- classmethod
+-- dict_keys
+-- dict_items
+-- dict_values
+-- deque_iterator
+-- deque_reverse_iterator
+-- Struct
  • Note how str and unicode are child types of the basestring type. Also observe how this differs from Python 3 builtin types.

  • Also notice how in Python 2 the exception types are not builtin types.
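The attributes described above are enough to walk the tree yourself. The following is a minimal sketch for Python 3, not the script linked earlier; the function name builtin_type_tree is made up for illustration. It prints a tree in the same layout as the listings above, and the exact set of types will vary with the Python version and with which modules are already imported.

```python
def builtin_type_tree(cls=object, depth=0):
    """Return indented lines for cls and its builtin descendant types."""
    prefix = "    " * (depth - 1) + "+-- " if depth else ""
    lines = [prefix + cls.__name__]
    # type.__subclasses__(cls) is used instead of cls.__subclasses__()
    # because the latter fails when cls is type itself.
    for child in sorted(type.__subclasses__(cls), key=lambda c: c.__name__):
        # Keep only standard types, as identified by their __module__.
        if child.__module__ == "builtins":
            lines.extend(builtin_type_tree(child, depth + 1))
    return lines


# __base__ is the parent type object, not a string.
print(bool.__base__ is int)

tree = builtin_type_tree()
print("\n".join(tree))
```

Children are sorted by name here only to make the output stable between runs; __subclasses__ itself returns them in no particular alphabetical order.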

How to read YAML file in Python with ordered keys

It is very easy to read a YAML file in Python as a combination of dicts and lists using PyYAML. However, the YAML specification treats mapping keys as unordered, so PyYAML is not required to return the keys of a mapping in the order they appear in the file. In addition, a Python dict traditionally did not guarantee any key order either (insertion order is guaranteed only from Python 3.7 onwards). In certain situations it might be necessary to read the keys in the order they appear in the YAML file. This can be done by using the yamlordereddictloader package.

  • Installing this Python package is easy using pip:
$ sudo pip install yamlordereddictloader
  • Read YAML files by providing the loader from this package to PyYAML:
import yaml
import yamlordereddictloader

with open("foobar.yaml") as f:
    yaml_data = yaml.load(f, Loader=yamlordereddictloader.Loader)

This returns the data in the YAML file as a combination of lists and OrderedDict (instead of dict). So, almost all of the rest of your code should work the same as before after this change.

Tried with: yamlordereddictloader 0.4 and Ubuntu 16.04

How to convert datetime to and from ISO 8601 string

ISO 8601 is a popular standardized format for representing date and time. Python can convert to and from this format, but confusingly, the two directions live in different packages: writing is built into the standard datetime module, while parsing here uses the third-party dateutil package (python-dateutil on PyPI).

  • Convert a datetime object to string in ISO 8601 format:
import datetime
datetime_str = some_datetime_obj.isoformat()
  • Convert an ISO 8601 format string to datetime object:
import dateutil.parser
some_datetime_obj = dateutil.parser.parse(datetime_str)
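If you are on Python 3.7 or later, the standard library can also handle the return trip for strings that were produced by isoformat(). The following is a small sketch of that round trip, using datetime.fromisoformat instead of dateutil; the sample date is made up for illustration. For ISO 8601 strings from other sources, dateutil is still the more forgiving parser.

```python
from datetime import datetime, timezone

# A sample timezone-aware datetime (arbitrary value for illustration).
original = datetime(2018, 5, 17, 14, 30, 15, tzinfo=timezone.utc)

# datetime -> ISO 8601 string
iso_str = original.isoformat()
print(iso_str)

# ISO 8601 string -> datetime, using only the standard library
restored = datetime.fromisoformat(iso_str)
print(restored == original)
```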

How to set encoder format for Python JSON

Python’s JSON module makes it very easy to dump data into a JSON file. However, I have found that the float values are encoded with a lot of decimal places or in scientific notation. There is no elegant method to set the formatting in the float encoder of the JSON module. The best solution seems to be to monkey patch its formatting option:

# Make json.dump() encode floats with 5 places of precision
import json
json.encoder.FLOAT_REPR = lambda x: format(x, '.5f')

Reference: https://stackoverflow.com/questions/1447287
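As far as I know, the FLOAT_REPR patch has no effect on Python 3, where the encoder formats floats with float.__repr__ directly. A different approach that does not depend on the internals of the json module is to round the floats in the data before dumping it. The following is only a sketch of that alternative; the helper name round_floats and the sample data are made up for illustration.

```python
import json


def round_floats(obj, places=5):
    """Return a copy of obj with every float rounded to `places` decimals.

    Hypothetical helper: walks dicts, lists and tuples recursively and
    leaves all other values untouched.
    """
    if isinstance(obj, float):
        return round(obj, places)
    if isinstance(obj, dict):
        return {key: round_floats(value, places) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [round_floats(value, places) for value in obj]
    return obj


data = {"pi": 3.14159265358979, "values": [0.1234567, 2]}
encoded = json.dumps(round_floats(data))
print(encoded)
```

Unlike the '.5f' format above, rounding does not pad with trailing zeros, so 2.5 is written as 2.5 and not 2.50000.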

Python dict get method

The Python dictionary provides an associative array interface to get the value associated with a key:

>>> d = { 1:"cat", 2:"rat" }
>>> d[1]
'cat'

However, this interface is not very friendly if you look up a key that does not exist. In such a case, it raises a KeyError exception:

>>> d[3]
KeyError: 3

The Python dictionary provides a get method that is safer: it returns None if the key is not present:

>>> print(d.get(3))
None

This method is actually cooler than it looks because you can make it return any default value you want when the key is not present in the dictionary. You do this by passing the default value as the second argument:

>>> print(d.get(1, "elephant"))
cat
>>> print(d.get(3, "elephant"))
elephant

Bonus trick

In many cases, the key is present in the dictionary, but its value is an empty value like None, an empty string, an empty list or an empty dict. Suppose that when reading such keys we want to get a different default value instead. The trick is that all such empty values evaluate to False in Python, and we can use that to our advantage.

For example, say the dictionary is already created and not under our control. But, whenever I read values from it, I want elephant if the key does not exist or if the value evaluates to False. This gives rise to an elegant Python idiom using the get method and the or operator:

>>> d = { 1:"cat", 2:"rat", 3:None, 4:"" }
>>> v = d.get(1) or "elephant" ; print(v)
cat
>>> v = d.get(3) or "elephant" ; print(v)
elephant
>>> v = d.get(4) or "elephant" ; print(v)
elephant
>>> v = d.get(99) or "elephant" ; print(v)
elephant
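One caveat with this idiom is that it cannot tell an empty value from a legitimate one: 0 and False also evaluate to False, so they get replaced too. The sketch below contrasts the two forms; the helper name get_truthy and the extra key 5 are made up for illustration.

```python
def get_truthy(mapping, key, default):
    """Return mapping[key] unless it is missing or evaluates to False."""
    return mapping.get(key) or default


d = {1: "cat", 3: None, 4: "", 5: 0}

# Plain get: the default is used only when the key is missing,
# so this prints None because key 3 exists.
print(d.get(3, "elephant"))

# get + or: the default also replaces None, "", 0, [], {} and False.
print(get_truthy(d, 3, "elephant"))
print(get_truthy(d, 5, "elephant"))
```

If 0 or False are meaningful values in your data, compare against None explicitly instead of relying on the or operator.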

dlopen: cannot load any more object with static TLS

Problem

I had a Python script that used Caffe2. It worked fine on one computer. On another computer with the same setup, it would fail at the import caffe2.python line with this error:

WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.
WARNING:root:Debug message: dlopen: cannot load any more object with static TLS
CRITICAL:root:Cannot load caffe2.python. Error: dlopen: cannot load any more object with static TLS

The GPU support warning is a red herring because this Caffe2 Python package was built with GPU support. The real error is the dlopen failure.

Solution

The only solution from Googling that gave a clue was this. As suggested there, I placed the import caffe2.python line at the top above all other imports. The error disappeared.

Tried with: Ubuntu 14.04

How to deal with YAML in Python

YAML (originally Yet Another Markup Language, now officially YAML Ain't Markup Language) is a language similar to JSON for reading and writing configuration information to human-readable files. YAML 1.2 is a superset of JSON. Its block style uses indentation instead of the braces used by JSON.

  • To be able to deal with YAML in Python, install the PyYAML package:
$ sudo pip install PyYAML
$ sudo pip3 install PyYAML
  • Similar to JSON, a YAML file can be directly loaded into a Python list or dict, depending on whether the root structure of the file is a list or a dict. Using safe_load avoids constructing arbitrary Python objects from untrusted files:
import yaml
with open("foobar.yaml") as f:
    y = yaml.safe_load(f)
  • Writing a Python structure back to a YAML file is similarly straightforward:
with open("foobar.yaml", "w") as f:
    yaml.dump(y, f)
  • Note that in the PyYAML version I tried, the YAML file is written partly in flow style by default (newer releases default to block style). This makes it look a bit like JSON. For human readability, it might be better to dump in block style, like this:
with open("foobar.yaml", "w") as f:
    yaml.dump(y, f, default_flow_style=False)

Tried with: PyYAML 3.11, Python 3.5.2 and Ubuntu 16.04