Toby Segaran’s book is given the subtitle “Building Smart Web 2.0 Applications”. The subtitle has clearly been assigned by a marketing drone who felt that they needed to get the phrase “Web 2.0” in there somewhere. Unfortunately this left me feeling a bit disappointed when I read the book, because that’s not really what the book is about. A far better subtitle would have been “An Introduction to Machine Learning”, but I don’t work in marketing.
My expectation based on the title was that the book would focus mainly on how to generate recommendations based on a user’s interactions with a set of objects. I was expecting it to cover not only the mathematical basis for such algorithms but also the practical problems that sites like Last.fm and Amazon have to deal with when processing such enormous data sets. While the book does cover recommendation systems, they’re given only a small slice of what is already quite a short book. There is no mention of how to process huge data sets, either through map/reduce or some other scalable architecture.
The rest of the book is taken up with classifying objects in a set, predicting the price of a new object given similar objects, and filtering items such as email into spam and ham. These are staples of machine learning textbooks, but here they are introduced in a very accessible and easy-to-read way. These chapters are interesting and informative, but perhaps not that useful if you’re building a Web 2.0 site.
This is a great addition to the O’Reilly stable of programming books; it’s just a shame they tried to shoehorn Web 2.0 in when that really gives the wrong impression of the book.