Sunday, February 22, 2009

Outliers + The Long Tail + Fooled by Randomness

So far this year I've been reading less fiction and more books that
come under the general category of "popular science".  Here are
reviews of some books that can be loosely connected by concepts from
statistics and data analysis.  If I had to recommend just one of these
books, it would be "Fooled by Randomness".

1. "Outliers: The Story of Success" by Malcolm Gladwell

This book is by the same author who wrote "The Tipping Point" (which I
thought was pretty good) and "Blink" (which I didn't think was as
convincing).  This time Gladwell is looking at "outliers": a term used
to describe "things or phenomena that lie outside normal experience".
In particular, the book looks at why certain people and not others have
become successful.

A Wikipedia article provides a good overview and analysis of the book:

Here's a summary of the first part of the book (from page 175): "success
arises out of the steady application of advantages: when and where you
are born, what your parents did for a living, and what the circumstances
of your upbringing were, all make a significant difference in how well
you do in the world."  Luck can be a big factor.  The author then goes
on to introduce part two: "Traditions and attitudes we inherit from our
forbears can play the same role - i.e. cultural legacy counts too!"

By looking at the varying fortunes of highly intelligent people, the
author concludes that IQ alone is not necessarily the deciding factor in
success.  More important, it seems, is a degree of persistence.  A
"10,000-hour rule" is proposed, which indicates roughly how much
practice is required before mastery can be achieved and exploited.
Examples given include musicians, sports stars and entrepreneurs.

Overall, I found it an interesting read.  However, critics argue that
the author may be oversimplifying things, and that not a lot of
statistical data is provided to support the conclusions.  After all,
this is a book, and not a rigorous scientific paper subject to peer
review.  Also, be aware that some of the arguments and conclusions are
not always politically correct.

2. "The Long Tail" by Chris Anderson

Traditionally, retailers have relied on selling large numbers of a few
items - the big hits - to maximise profits.  Slower-moving items take up
shelf space and so are not as cost-effective, because of distribution
limitations, storage costs, and limited promotion (or a lack of
information about availability).
In this book, the author argues that niches can pay well.  The internet
and e-commerce have overcome the "80/20 rule" in retail: 20% of products
account for 80% of sales.  Amazon (and others) have shown the tail is
much longer, and that there is money to be made in selling even just low
quantities of many more items.
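
To get a rough feel for why shelf space matters, here is a minimal
Python sketch.  It assumes, purely for illustration, that sales of the
product ranked r are proportional to 1/r (a Zipf-like curve).  The
function name, the catalogue size and the shelf size are all made up by
me, not figures from the book.  Under those assumptions it estimates how
much demand sits beyond what a physical store could stock.

```python
def tail_share(catalogue_size, shelf_size, exponent=1.0):
    """Fraction of total sales that falls outside the top `shelf_size` products.

    Assumption (illustrative only): sales of the product at rank r are
    proportional to 1 / r**exponent.
    """
    sales = [1.0 / (rank ** exponent) for rank in range(1, catalogue_size + 1)]
    return sum(sales[shelf_size:]) / sum(sales)


# Hypothetical numbers: a store with room for 1,000 titles versus an
# online catalogue of 100,000 titles.
share = tail_share(catalogue_size=100_000, shelf_size=1_000)
print(f"Share of demand beyond the shelf: {share:.0%}")
```

Raising `exponent` concentrates demand in the hits and shrinks the
tail, so a real answer would need real sales data rather than an assumed
curve.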

For a more detailed overview, read through the following Wikipedia
article:

Here are a couple of quotes:
* "The era of the big hit is over, thanks to: digital distribution or
no/low storage costs; easy to find products via search and
recommendations." (pages 134/135).
* "The secret to creating a thriving Long Tail business can be
summarised by two imperatives: 1. Make everything available; 2. Help me
find it" (page 217).  A list of nine rules for achieving these
imperatives is provided.

I've seen the book described as "visionary", but I wouldn't describe it
that way.  Most of the book describes recent internet-era success
stories (e.g. Amazon, iTunes, Netflix, eBay, Google).  So it's not so
much predicting what _will_ happen, as explaining what _has_ happened
and why.  Ken McCarthy, as mentioned in the Wikipedia article, arguably
did predict the "Long Tail" phenomenon for Internet commerce in 1994.

One nitpick I have with the book is with the concept of the "Economics
of Abundance".  The suggestion is that "normal" economics, based on
scarcity of resources and goods, is soon to be displaced.  There are
some products that appear "abundant", thanks to digitisation: once a
song or book is in digital form, it is theoretically possible to
distribute it to an infinite number of consumers.  But not all goods are
"information" goods - you can't eat "digital" food.  And don't expect a
Star Trek-style "replicator" any time soon.  So the Economics of
Scarcity will still be relevant for a while yet.  Even computers (and
replicators) need power, and that isn't so "abundant".  There's an
interesting piece on this debate, entitled "What Happens When the
Economics of Scarcity Meets the Economics of Abundance?" at:

Also, the term "Economics of Abundance" is technically an oxymoron,
since "economics" is defined as "the science which studies human
behaviour as a relationship between ends and scarce means which have
alternative uses."

I don't want to come across as too negative about the book.  It's an
interesting read, with valuable suggestions for people wanting to set up
businesses for the new digital economy.  It basically restates how, by
removing friction, markets can trade (some) goods more easily and
therefore at lower cost.  But it's not as helpful for service-based
industries, for example, where the scarcity of time limits the number of
clients one can have.

3. "Fooled by Randomness" by Nassim Nicholas Taleb

This is perhaps my pick of the books, and arguably the most timely.
Taleb is a former derivatives trader turned philosopher.  According to
his website, his "major hobby is teasing people who take themselves and
the quality of their knowledge too seriously and those who don’t have
the courage to sometimes say: I don't know...."

A very brief summary of the book is provided by Wikipedia:

Basically, the author argues that we overestimate causality and tend to
view the world as more explainable than it really is.  We mistake noise
for signal, and this can lead to disastrous decisions.  Randomness plays
a bigger part in our lives than we like to admit.

Most of the anecdotes and examples in the book come from the author's
past career in financial markets.  Quantitative analysts (quants) apply
sophisticated maths to investments, believing the models "tame"
randomness.  Even when risk is taken into account, sometimes the full
extent of the consequences isn't: it's not just the simple probability
that counts, but rather the probability weighted by the extent of the
consequence.  Some quants don't even factor in certain outcomes, because
"they've never happened before".  The author argues that this is a
manifestation of the "Black Swan" theory or "rare event" problem in
induction.  In the 18th Century, David Hume wrote: "No amount of
observations of white swans can allow the inference that all swans are
white, but the observation of a single black swan is sufficient to
refute that conclusion."  As it happens, Dutch explorers had already
encountered black swans in Australia in 1697, before Hume's time.
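
The point about weighting probability by consequence can be made
concrete with a small sketch.  The bet below is hypothetical (I have
invented the probabilities and payoffs for illustration): it wins almost
every time, yet loses money on average.

```python
# Hypothetical bet: (probability, payoff) pairs, invented for illustration.
outcomes = [
    (0.999, 1.0),        # the usual case: a small gain
    (0.001, -10_000.0),  # the rare event: a large loss
]

# Chance of coming out ahead on any single bet.
win_probability = sum(p for p, payoff in outcomes if payoff > 0)

# Expected value: each payoff weighted by its probability.
expected_value = sum(p * payoff for p, payoff in outcomes)

print(f"Chance of winning a single bet: {win_probability:.1%}")
print(f"Average result per bet: {expected_value:.3f}")
```

Looking only at the chance of winning makes the bet seem safe.  The
average only turns negative once the size of the rare loss is counted.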

Recent financial catastrophes (things that "have never happened before")
seem to vindicate the author's central argument.  Investment banks and
traders can ride their luck for a while, but unless they've taken
adequate precautions, they can eventually become undone or "blow up" -
they will see their own "black swans".
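
The arithmetic behind "riding their luck" is simple enough to sketch.
Assume, hypothetically, a population of traders with no skill at all,
each with an independent 50% chance of a winning year.  The function
name and the numbers are illustrative assumptions of mine.

```python
def expected_lucky_survivors(num_traders, years, win_prob=0.5):
    """Expected number of traders with an unbroken winning streak.

    Assumption (illustrative only): every trader has the same independent
    chance `win_prob` of a winning year, so a streak is pure luck.
    """
    return num_traders * win_prob ** years


for years in (1, 3, 5, 10):
    survivors = expected_lucky_survivors(10_000, years)
    print(f"After {years:>2} years: about {survivors:,.0f} unbroken track records")
```

Even with no skill anywhere in the population, a few hundred spotless
five-year records would be expected by chance alone, which is why a good
track record on its own proves little.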

The author does ramble a fair bit, going off on tangents and getting
rather philosophical at times.  But this actually makes the book less
dry and more enjoyable.  It's also a bit scary reading about how the
people that are entrusted with looking after our money seem to be either
unaware of, or fail to properly manage, all the risks involved.  Or
maybe the greed is just too tempting?

Obviously, the author is riding a wave of popularity at the moment.  Here's
a recent interview for the Sunday Times (UK): "Nassim Nicholas Taleb:
the prophet of boom and doom":

In 2006 Taleb wrote a follow-up book that looks more deeply into the
black swan theory, appropriately enough entitled "The Black Swan".
