What I learned from 'Designing Distributed Systems' PART I
by Fahad Assoumani
Tagged as DistSys, Book, Containers
The book
Designing Distributed Systems: Patterns and Paradigms for Scalable, Reliable Services is a book written by Brendan Burns, one of the co-creators of Kubernetes, who has also worked on cloud platforms such as GCP and Azure.
The book aims to present (as the subtitle says) patterns and paradigms from the world of distributed software. It is an introduction to many of the conventions and techniques you will probably encounter or use in real-world projects. In around 150 pages, it lays out contexts, problems and practical examples to help you work through the material.
[!IMPORTANT] You need basic knowledge of containerization and orchestration to fully understand the material in the book, so if you already know Docker and Kubernetes, this book is for you.
Why I reddit
As a mid-level Software Engineer who does web and system development, I have been in touch with distributed software, scaling and reliability for a long time, but these notions were always abstracted away; it all just felt automatic and magical, even in my previous jobs and internships. Of course I had the basic picture, a high-level understanding of how these things work, but no real understanding of the underlying architecture. I knew some system design principles, but only from the perspective of the end user and not the designer.
I think this quote is the best description of how I felt at that time:
"Despite their prevalence, the design and development of these systems is often a black art practiced by a select group of wizards."
As my interest in the subject grew, I decided to learn more about it, and the book was cited among others in some old Reddit posts.
4 Parts
The book is organized into 4 sections:
– The first one is the Introduction. It explains the whats and whys of the components presented, and gives historical context and references. I won't cover this section of the book here.
– The second part presents the Single Node Patterns.
– The third part presents the Serving Patterns, or Multi Node Distributed Patterns.
– The last one presents the Batch Computational Patterns.
What I learned
Single Node Patterns
Here, it emphasizes breaking a service into containers on a single machine. There are 3 patterns in this part: the Sidecar, the Ambassador and the Adaptor pattern.
In my young career, the one I’m most familiar with is the Sidecar; I often find myself using it for various personal projects.
In this pattern, you have 2 containers sharing some resources (network, filesystem, etc.), one containing the main logic of your application and another which acts as an auxiliary of the first one.
It can be used to enhance or extend an application. The given examples are Adding HTTPS to a Legacy Service, Dynamic Configurations, and more.
What I liked about this chapter, besides the examples and use cases, is the importance given to design. It helps us formalize why it is important to have maintainable sidecars (parameterization, documentation, API) for the sake of modularity and reusability, which are also recurring concepts in the other patterns.
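To make the Dynamic Configuration example more concrete, here is a minimal sketch of what the sidecar's loop could look like. It is not the book's implementation: the names `ConfigWatcher` and `run` are hypothetical, and it assumes both containers mount the same volume so the sidecar can see the application's config file.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable, Optional


class ConfigWatcher:
    """Detects content changes in a config file on a shared volume."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)
        # Take a baseline so the first poll only reports real changes.
        self._last_digest = self._digest()

    def _digest(self) -> Optional[str]:
        try:
            return hashlib.sha256(self.path.read_bytes()).hexdigest()
        except FileNotFoundError:
            return None  # a missing file is treated as its own state

    def changed(self) -> bool:
        """Return True when the file differs from the previous poll."""
        digest = self._digest()
        has_changed = digest != self._last_digest
        self._last_digest = digest
        return has_changed


def run(
    watcher: ConfigWatcher,
    on_change: Callable[[], None],
    interval_s: float = 5.0,
    max_polls: Optional[int] = None,
) -> None:
    """Poll the file and call on_change each time it changes.

    max_polls exists only to keep the sketch testable; a real sidecar
    would loop forever.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        if watcher.changed():
            on_change()
        polls += 1
        time.sleep(interval_s)
```

In a real sidecar, `on_change` would do whatever the main application needs in order to pick up the new file, for example sending it a signal or calling a reload endpoint (both are assumptions here). Passing the path, the interval and the callback as parameters is the kind of parameterization that makes the same sidecar reusable across applications.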
The Ambassador acts as a broker between the application container and the external world. Examples given are Sharding a Service, Service Brokering, and Experimentation or Request Splitting.
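For the sharding case, the core decision an ambassador makes can be sketched in a few lines. This is only an illustration under my own assumptions: `pick_shard` and the shard addresses are hypothetical, and simple modulo hashing stands in for whatever routing scheme a real ambassador would use.

```python
import hashlib
from typing import Sequence


def pick_shard(key: str, shards: Sequence[str]) -> str:
    """Map a request key to one shard address, stably across restarts."""
    if not shards:
        raise ValueError("at least one shard is required")
    # hashlib is used instead of the built-in hash(), which is randomized
    # per process for strings and would route inconsistently after a restart.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# Hypothetical shard addresses, for illustration only.
SHARDS = ["cache-0:6379", "cache-1:6379", "cache-2:6379"]
target = pick_shard("user:42", SHARDS)
```

The application keeps talking to a single local address, and the ambassador forwards each request to `target`. One trade-off of the modulo approach is that changing the number of shards remaps most keys; consistent hashing is a common alternative when that matters.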
The first actual ambassador that came to mind was the nginx-proxy project, originally by Jason Wilder, which I used to dynamically link containers to an nginx reverse proxy on my VPS.
At the time, I wondered whether I could build one myself with Caddy, both as a POC and for practical use (to replace nginx-proxy). So I built caddy-proxy, inspired by that project and its author's blog post.
At first glance, I could not really tell this pattern apart from the previous one, but after some time and research, the ambassador looks more like a specialized version of the sidecar, and the same goes for the next pattern.
The Adaptor standardizes (or modifies) the main application's interface so that it conforms to another interface expected by other applications.
This is particularly handy when dealing with telemetry, as the examples show: Monitoring, Logging and Health Monitoring.
I couldn't think of a time I had used it before, but the examples were pretty straightforward: your main container exposes data, and a helper container adapts that data to a particular format.
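As a rough sketch of that idea for the monitoring case, suppose the main container exposes its metrics as a flat mapping of names to numbers (an assumption on my part), and the monitoring system expects the Prometheus text exposition format. The function name `to_prometheus_text` and the `app` prefix are hypothetical, and every metric is exported as a gauge to keep things short.

```python
import re
from typing import Mapping, Union

Number = Union[int, float]

# Characters that are not allowed in a Prometheus metric name.
_INVALID_CHARS = re.compile(r"[^a-zA-Z0-9_:]")


def to_prometheus_text(metrics: Mapping[str, Number], prefix: str = "app") -> str:
    """Render application metrics in the Prometheus text format."""
    lines = []
    for name, value in sorted(metrics.items()):
        # Prefix and sanitize the name so it is a valid metric name.
        metric = _INVALID_CHARS.sub("_", f"{prefix}_{name}")
        lines.append(f"# TYPE {metric} gauge")
        lines.append(f"{metric} {value}")
    return "".join(line + "\n" for line in lines)


# Example input in the application's own (assumed) format.
print(to_prometheus_text({"queue-depth": 3, "latency.ms": 12.5}), end="")
```

The adapter container would serve this text over HTTP for the monitoring system to scrape, so the main application never needs to know which monitoring tool is in use.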
Conclusion
I really liked the way these patterns were presented; the focus is very much on the modularity of containerized solutions.
I can now put a name on some of the techniques used in some of the projects I worked on, and can recognize them wherever I see them in other projects.
This part was a great appetizer, simple and clear. It shows well how important tightly coupled groups of containers scheduled on one machine can be.
If I could summarize this section into one sentence it would be:
"Build once, reuse many"