Book Review: Programming Collective Intelligence

Posted in Evolutionary Computation, Python, Software Development by Dan on December 13th, 2007

It’s called “Programming Collective Intelligence” and is presented as a book for building “Smart Web 2.0 Applications” but it is essentially an extremely accessible explanation of a wide array of machine learning and data-mining algorithms. How do sites like Amazon and Last.FM make recommendations? How do search engines work? How does Google News manage to categorise and present the most important news articles without human intervention? How do you build a useful spam filter?

All of these questions are answered and compelling example applications are built step-by-step to demonstrate the power of the ideas presented here. Decision trees, genetic algorithms, neural networks, support vector machines, genetic programming, Bayesian classifiers and non-negative matrix factorisation are some of the techniques covered and all without the dry, maths-heavy text that normally fills books on these topics.

The examples throughout are exclusively in Python, which may have put me off had I realised this when I ordered it. I have nothing against Python except for my complete lack of experience with it. However, the examples are easy enough to understand for anybody familiar with other high-level languages. As result of reading the book, I may actually try my hand at a bit of Python hacking now.

How well do these techniques work? Well I’d never have found out about this book but for Amazon’s automated recommendations system. I’d thoroughly recommend this book to anyone looking to learn about interesting AI techniques without wading through opaque academic papers.

(If you find the genetic algorithms and genetic programming topics interesting, check out the Watchmaker Framework for Evolutionary Computation and some of the books recommended there.)