The strange case of varying floats in Protobuf

Problem

I was using Google Protobuf in a Python program to read some text format Protobuf messages, merge them and write them out. Surprisingly, for the same set of input text format message files, I was getting different outputs on two computers! The values that were different were float values. The float values were generally correct, but varied slightly in precision between the two computers.

Solution

Tracking down this strange observation took quite a long investigation. I initially assumed that the Protobuf library (libprotobuf.so) or the Python Protobuf package might be of different versions on the two computers. Surprisingly, they were exactly the same.

The culprit finally turned out to be the Protobuf implementation type. There are currently two possible types: cpp and python. By default, the cpp implementation is used. However, on one of the computers, the python implementation had been chosen by an engineer during the PIP package installation. The implementation is picked by setting an environment variable named PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION to either cpp or python. The engineer had set this environment variable in his shell when playing around with Protobuf and had later installed the PIP package from that same shell.

Once I explicitly set the PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION environment variable in my Python code, before importing protobuf, the float values were the same on both computers!
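A minimal sketch of that fix is below. The value python is used only because the pure-Python implementation ships with every install; substitute whichever value you standardize on. The helper name active_protobuf_implementation is made up for this illustration, and api_implementation lives in an internal protobuf module, so treat the check as a debugging aid rather than a stable API.

```python
import os

# The variable must be set before the first protobuf import in the process.
# "python" is an assumption for this sketch; use the value you standardize on.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"


def active_protobuf_implementation():
    """Return the implementation protobuf reports, or None if it is not installed."""
    try:
        # Internal module: useful for checking, but not a public API.
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    return api_implementation.Type()


print(active_protobuf_implementation())
```

Printing the reported implementation on every machine is a quick way to confirm that they all agree before comparing outputs.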

Now why should the engine affect the float value? Because Python’s float is actually a 64-bit double, while a Protobuf float field is only 32 bits wide. When a 32-bit float moved between Python code and the C++ engine and back to Python code, it was sometimes changing precision. By using the same engine on all computers, we ensured that at least the float values did not vary between machines.
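The size of this effect can be seen without Protobuf at all. The sketch below uses the standard struct module to squeeze a Python float through a 32-bit float and back; the helper name to_float32 is made up for this illustration, and it only mimics the narrowing, not the actual code path inside either Protobuf engine.

```python
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float and widen it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


original = 0.1
roundtrip = to_float32(original)

print(repr(original))         # 0.1
print(repr(roundtrip))        # 0.10000000149011612
print(original == roundtrip)  # False
```

Values that are exactly representable in 32 bits, such as 0.5, survive the round trip unchanged, which matches the observation that only some floats differed.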

Tried with: Python Protobuf 3.3.0 and Ubuntu 14.04

How to set encoder format for Python JSON

Python’s JSON module makes it very easy to dump data into a JSON file. However, I have found that the float values are encoded with a lot of decimal places or in scientific notation. There is no elegant method to set the formatting in the float encoder of the JSON module. The best solution seems to be to monkey patch its formatting option:

# Make json.dump() encode floats with 5 places of precision
import json
json.encoder.FLOAT_REPR = lambda x: format(x, '.5f')

Reference: https://stackoverflow.com/questions/1447287
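One caveat: as far as I can tell, the encoder in Python 3 no longer consults json.encoder.FLOAT_REPR, so the monkey patch above may silently do nothing there. A sketch of an alternative that does not depend on the encoder internals is to round the floats before dumping. The helper name round_floats is made up for this illustration.

```python
import json


def round_floats(obj, places=5):
    """Return a copy of obj with every float rounded to `places` decimals."""
    if isinstance(obj, float):
        return round(obj, places)
    if isinstance(obj, dict):
        return {key: round_floats(value, places) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [round_floats(item, places) for item in obj]
    return obj


data = {"pi": 3.141592653589793, "samples": [0.1234567, 2.0, 7]}
text = json.dumps(round_floats(data))
print(text)  # {"pi": 3.14159, "samples": [0.12346, 2.0, 7]}
```

Unlike the '.5f' format, rounding does not pad with trailing zeros, and very large values can still be written in scientific notation.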

How to convert string to number in C++

In C++11 and later versions, a string containing an integer or floating point value can be converted easily to its numeric value. To do this, use std::stoi, std::stof and similar functions defined in the <string> header. These functions throw an exception when the string cannot be converted and so are recommended over the old C atoi and atof functions, which give no reliable way to detect a failure.

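For example, std::stoi("42") returns the int 42 and std::stof("3.14") returns the corresponding float, while std::stoi("abc") throws std::invalid_argument and a value too large for an int throws std::out_of_range.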

Tried with: GCC 4.9.1 and Ubuntu 14.04

How to enable full speed FP64 in NVIDIA GPU

In many recent NVIDIA GPUs shipping in consumer graphics cards, the FP64 cores run at reduced speed. For example, the GTX Titan is capable of double-precision performance that is 1/3 of its single-precision performance. However, by default the card does FP64 at a reduced speed of 1/24 of FP32. This is done because the primary audience of these consumer cards is gamers, and games use mostly FP32 computations. Enabling full speed FP64 reduces the FP32 performance a bit, since the maximum clock speed needs to be reduced, and also increases power consumption, since all the power-hungry FP64 cores are running.

To enable full speed FP64 on Linux, make sure you have the latest NVIDIA drivers installed. Open the NVIDIA X Server Settings application. Go to the section with the name of your graphics card > PowerMizer and enable the CUDA - Double precision option. That is it: your CUDA application should now run with full speed FP64 on the GPU.

Tried with: NVIDIA GTX Titan, NVIDIA driver 319.37, CUDA 5.5 and Ubuntu 12.04 LTS