This book is a fascinating read for anybody who – like me – likes both words and numbers. The author applies the tools of statistical analysis to a database built from thousands of books to answer questions such as: did writers follow their own advice (Hemingway on the use of adverbs), who uses the most clichés in their work, and even how often the weather appears in the opening sentence of an author’s novels.
A particularly fascinating chapter shows that male and female writers use some words at different frequencies, and that it’s possible to tell the gender of an author from such commonplace words as ‘is’, ‘the’, ‘but’ or ‘and’. There’s also a table comparing the use of ‘he’ and ‘she’ in individual books, the most extreme being ‘The Hobbit’ – which has only a single ‘she’ for nearly two thousand uses of ‘he’!
Going even further, each author has an individual ‘fingerprint’ in how they use certain words, so it’s possible not only to guess the gender but, with a large enough sample, to tell whether they wrote under a pen name (as J.K. Rowling and Stephen King both did) or how much of a co-written novel was the work of each author.
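To give a feel for how such a ‘fingerprint’ might work, here is a minimal sketch in Python. It is my own illustration and not the author’s method: the word list, the distance measure and the function names (`fingerprint`, `distance`, `closest_author`) are all assumptions, and a real analysis would need far more words and far longer samples.

```python
import re
from collections import Counter

# Illustrative list of commonplace words; the book's actual list is not reproduced here.
FUNCTION_WORDS = ("the", "and", "but", "is", "of", "to", "a", "in")


def fingerprint(text):
    """Return the share of all words in `text` taken up by each function word."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def distance(fp_a, fp_b):
    """Sum of absolute differences between two fingerprints (smaller = more alike)."""
    return sum(abs(a - b) for a, b in zip(fp_a, fp_b))


def closest_author(unknown_text, samples):
    """Pick the name in `samples` (name -> known text) whose fingerprint is nearest."""
    target = fingerprint(unknown_text)
    return min(samples, key=lambda name: distance(target, fingerprint(samples[name])))
```

You would call `closest_author(mystery_text, {"Author A": text_a, "Author B": text_b})` with texts you supply yourself. With only eight words and a crude distance this is a toy, but it shows the basic idea of comparing how often very ordinary words turn up.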
The chapter dealing with authors’ favourite words (which gave the book its name, because Nabokov uses ‘mauve’ 44 times more often than the average English writer) also makes for fascinating reading. While it’s not surprising that J.K. Rowling should use ‘wizard’ a lot, or Tolkien ‘elves’, Hemingway seems to have had an unusual liking for the word ‘concierge’.
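A figure like ‘44 times more often’ is a ratio of two usage rates: how often a word turns up in one author’s writing, divided by how often it turns up in a baseline of general English. The sketch below is my own assumption of how such a ratio could be computed, not the book’s actual procedure; the names `rate_per_100k` and `usage_ratio` are hypothetical.

```python
import re


def rate_per_100k(word, text):
    """Occurrences of `word` per 100,000 words of `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return words.count(word.lower()) / len(words) * 100_000


def usage_ratio(word, author_text, baseline_text):
    """How many times more often `word` appears in author_text than in baseline_text.

    Returns None when the word never appears in the baseline, since the
    ratio is undefined in that case.
    """
    baseline_rate = rate_per_100k(word, baseline_text)
    if baseline_rate == 0:
        return None
    return rate_per_100k(word, author_text) / baseline_rate
```

Comparing rates rather than raw counts matters because authors differ hugely in how much they wrote; a ratio of rates puts a prolific novelist and a one-book writer on the same footing.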
All in all, the book has a really interesting approach towards analysing literature using statistical tools and it’s also an entertaining read. I only wish it were longer – or that some of the tools used were available online!