📅 2022-Apr-15 ⬩ ✍️ Ashwin Nanjappa ⬩ 🏷️ book, computer architecture, performance ⬩ 📚 Archive
Understanding Software Dynamics is a difficult book to define. Written by Richard L. Sites, it is a distillation of his life’s experience in performance optimization at the hardware-software boundary. It focuses on the 5 components that affect performance: compute (CPU), memory, storage, network and critical sections.
Part 1 of the book, on Measurement, looks at each of the 5 components, summarizing the history behind each and showing how to write C/C++ programs to measure their characteristics and performance. I found the author’s concise history of CPUs, memory, hard disks, SSDs, and Ethernet fascinating, and discovered many gems that I wasn’t previously aware of. Writing programs to measure the effect of these components on performance makes it clear that each of them has multiple levels of caching, and that compiler optimizations make it difficult to isolate a specific component.
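To give a flavor of why the compiler gets in the way, here is a minimal sketch of a timing loop. It is my own illustration and not code from the book: the function name `NanosPerAdd` and its parameters are hypothetical. Without the `volatile` qualifier, an optimizing compiler is free to replace the whole loop with a single multiplication, or to drop it entirely if the sum is never used, and the measured time collapses to nearly zero.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper (not from the book): runs `iterations` additions and
// returns the average wall-clock nanoseconds per loop iteration.
// The final sum is written to `result_out` so the caller can check it.
double NanosPerAdd(std::uint64_t iterations, std::uint64_t* result_out) {
  // `volatile` forces a fresh read of `increment` on every iteration, so the
  // compiler cannot fold the loop into `sum = iterations * increment`.
  volatile std::uint64_t increment = 1;
  std::uint64_t sum = 0;

  const auto start = std::chrono::steady_clock::now();
  for (std::uint64_t i = 0; i < iterations; ++i) {
    sum += increment;
  }
  const auto stop = std::chrono::steady_clock::now();

  // Storing the result keeps the loop from being treated as dead code.
  *result_out = sum;

  if (iterations == 0) {
    return 0.0;
  }
  const std::chrono::duration<double, std::nano> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(iterations);
}
```

Calling `NanosPerAdd(100000000, &sum)` gives a rough per-iteration cost. Note that each iteration includes the load of `increment` and the loop overhead as well as the add itself, which is exactly the kind of isolation problem that makes these measurements tricky.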
Part 2 on Observation shows the importance of logging, profiling, dashboards and tracing to monitor and debug performance. Though primarily aimed at datacenter-scale applications, I learnt a lot about the programmatic solutions and tools available to aid this process.
Part 3 is primarily about KUTrace, a kernel-level tracing tool developed by the author and folks at Google. I skipped this and moved on to Part 4, on Reasoning, which shows how to reason about performance that does not match expectations because the system we built is waiting on CPU, memory, disk, network, locks, time or queues.
This book came recommended by Dan Luu, and I must admit it fills a real gap in the practical understanding and debugging of performance issues in today’s accelerated and datacenter-scale applications and deployments. The author brings his enormous experience, from the days of VAX and DEC Alpha through to Google, to bear in his writing, and it was great to imbibe that kind of knowledge. If you are short on time, I highly recommend at least studying Part 1 of this book, which is where I got most of my value. This is also the part that overlaps a lot with my other favorite book: Computer Systems: A Programmer’s Perspective. Since the book’s illustrations focus on cloud applications, I learnt a lot about how large-scale services like Gmail and search are distributed among hundreds of servers and how they communicate using RPCs. The book does miss out on GPUs, but I think it is easy to carry these learnings over to such accelerators, which have similar performance constraints.