By Wojciech Gryc on January 6, 2021
TLDR: This book is a fantastic introduction to the terminology and architecture options for building ML-driven products and services. If you are new to the field or are a technology leader who needs to brush up on their ML terminology and options, then this book is for you. Experts in the field looking to innovate on model performance might want to look elsewhere.
An oft-overlooked area of data science is the actual architecture of machine learning systems. It’s easy to get excited about the latest algorithms and ML packages, but the broader operationalization of the system is often taken for granted. If you are a data scientist aspiring to work on product at companies shipping productized ML models, then Machine Learning Design Patterns is worth a perusal.
The book addresses an issue that we at Phase AI see constantly: terminology in this space is still new, and the way we describe design patterns – let alone use them – is inconsistent. Worse still, a poorly architected ML product could introduce so many problems down the line that it all but guarantees failure for a startup or product launch. It’s wonderful to see the authors try to address this.
The majority of the book (chapters 2 through 7) provides an overview of common approaches to discussing and addressing machine learning problems. This includes data ingestion, cleaning, modeling, and even the ethics of AI. Each chapter covers a set of important terms and approaches (i.e., the design patterns). For each design pattern, it provides a definition, examples of how one can implement it, and ways to address common issues with it.
Let’s look at data preparation (Chapter 2, Data Representation Design Patterns) as an example. The authors provide an overview of how different types of variables can be represented for machine learning problems. This includes concepts like one-hot encoding, feature embedding, multimodal inputs, and more. One could write thousands of pages on the topics in each chapter, so this is really more about terminology and a general introduction.
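To make two of these terms concrete, here is a minimal sketch in plain Python. It is my own illustration and not code from the book: the `one_hot_encode` and `build_embedding_table` helpers, the `colors` feature, and the embedding size are all hypothetical. In a real system the embedding values would be learned during training rather than randomly initialized as they are here.

```python
import random


def one_hot_encode(values):
    """Map each categorical value to a binary indicator vector."""
    # Sort the distinct categories so the column order is deterministic.
    vocabulary = sorted(set(values))
    index = {category: i for i, category in enumerate(vocabulary)}
    encoded = []
    for value in values:
        row = [0] * len(vocabulary)
        row[index[value]] = 1
        encoded.append(row)
    return vocabulary, encoded


def build_embedding_table(vocabulary, dim, seed=0):
    """Assign each category a small dense vector.

    Assumption: values are random placeholders; a trained model
    would learn them from data.
    """
    rng = random.Random(seed)
    return {
        category: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for category in vocabulary
    }


# Hypothetical categorical feature, invented for illustration.
colors = ["red", "green", "blue", "green"]

vocabulary, encoded = one_hot_encode(colors)
embedding_table = build_embedding_table(vocabulary, dim=2)

# Each input value is replaced by its dense vector via a table lookup.
embedded = [embedding_table[color] for color in colors]
```

The one-hot vectors grow with the number of categories, while the embedding keeps a fixed, smaller width per value. That trade-off is the kind of design choice the chapter names and discusses.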
In addition to standardization of terminology, the book presents helpful diagrams on how to structure ML-driven products and pipelines. Figure 8-5, shown below, is a good example of this. These diagrams are critical to understanding the architecture and design patterns powering ML projects, and I hope more VPs of Engineering and Architecture document, review, and promote such diagrams within their ML-driven product and service organizations. Such documentation is the only way architecture principles will maintain their integrity as teams scale and team members come and go.
My biggest qualm with the book is that much of it is spent laying the groundwork for the field, and as such, only one chapter focuses on how the design patterns in this book could be used for building solutions to common problems. I would love to see a few chapters dedicated to in-depth analysis of architectures (or options) for performant product recommendation systems, or a fraud model at a bank, or something else. This could be a good opportunity for a sequel.
The question remains, however: who is this book for? If you are just learning ML or data science, you might benefit from a skim, but I would suggest you focus on learning the core skills of data science. If you are now entering the world of product management, architecting solutions, or leading an ML team, then these concepts are vital; if you are not familiar with them, then this is a good place to start. Similarly, if you are a technology manager or leader who doesn’t have much experience with ML, then this book will provide a nice foundation for your work.