The Visual Display of Quantitative Information is an iconic book on statistical graphics by Edward R. Tufte. The book had been on my reading list for many years and I finally read the Second Edition this weekend. It presents the theory and practice of statistical graphics: the graphs, plots, charts and maps used for depicting information. Pretty much what we call visualizations today. We live in a time of information overload, and this book, first published in the 1980s, is more relevant than ever for understanding our data.
The book is quite easy to read. There is very little prose, and every page is filled with authentic reproductions of graphs from various sources. A good third of the book introduces the reader to the history of graphs, with examples of good and bad ones. The visual depiction of data is merely 200 years old, surprisingly new considering how advanced both mathematics and art were by that time. The pioneers of the field were Lambert and especially Playfair. The latter invented bar charts and other kinds of charts, which he used to beautifully illustrate the economic rise and fall of the British Empire. One of the must-see graphs in the book was created by Minard way back in 1869 and depicts the devastating losses of Napoleon's army in the Russian campaign of 1812. (It can be seen here.) With common examples, Tufte shows how many of the graphs we see in mass media today, intentionally or not, deceive us by distorting the underlying statistics. This chapter was an eye-opener, since the reader gets a lot of guidance on how to detect such distortions.
The rest of the book is dedicated to the creation of graphs. Thanks to computer software, most graphs today are choked with unnecessary colors, patterns, graphics and text, all of which make the actual data hard to find and understand. Tufte seems to be a minimalist at heart. He radically redesigns some of our common plots into extremely minimal forms. He formalizes this practice in the following chapters by creating terms to quantify the various aspects of a graph. Data-ink is the ink in a graph that actually represents data, and the data-ink ratio is the fraction of the total ink that is data-ink. Maximizing the data-ink ratio leads to graphs where the frivolous elements are discarded and the data shines through. In graphs with lots of data points, the data density becomes crucial. Again, high data density should be the goal, though this can be non-trivial to achieve. Tufte also thumbs his nose at artists who decorate instead of inform.
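To make these two measures concrete, here is a rough Python sketch. The definitions follow the book as I understood them: the data-ink ratio is data-ink divided by the total ink in the graphic, and data density is the number of entries in the data matrix divided by the area of the graphic. The function names, the idea of measuring ink in pixels, and the example numbers are my own assumptions for illustration, not anything Tufte prescribes.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Fraction of a graphic's ink that represents data.

    Both arguments use the same (hypothetical) unit, e.g. a count of
    non-background pixels in a rendered chart.
    """
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


def data_density(n_entries: int, area: float) -> float:
    """Number of entries in the data matrix per unit area of the graphic."""
    if area <= 0:
        raise ValueError("area must be positive")
    return n_entries / area


# Made-up numbers: the same 1200 "data pixels" before and after
# stripping grid lines, borders and fill patterns from a chart.
cluttered = data_ink_ratio(data_ink=1200, total_ink=8000)
minimal = data_ink_ratio(data_ink=1200, total_ink=2000)
print(f"cluttered: {cluttered:.2f}, minimal: {minimal:.2f}")

# Made-up numbers: 500 plotted values in a 10 x 5 square-inch graphic.
print(f"density: {data_density(500, 10 * 5):.1f} entries per square inch")
```

The data stays the same in both versions of the chart. Only the non-data ink shrinks, which is why the ratio goes up after the redesign.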
Tufte is a minimalist who stands firmly behind the data rather than the aesthetics of statistical graphics. In any paper, article or book, both the text and the graphics try to present information. Rigorous standards of typography, layout, prose, terseness, and integrity are applied to text. Tufte argues for equally high standards for graphics. A graph provides the writer with a multi-dimensional playground for his data. If he strives to create a good graph, it gives the reader multiple levels of understanding of the underlying data. I found this book to be an illuminating read, and I am pretty sure that you will never see a graph the same way after reading it. This book is highly recommended for both creators and consumers of information, i.e., everyone.