This is a confidence game Mark refers to in podcast 24, The Toad Not Taken. It highlights some common fallacies in how most people use historical data to make decisions. Don’t try this in real life — we hate to see our listeners get arrested!
Imagine we have available to us a very large supply of horse races where people can bet on the winner. For simplicity’s sake, let’s also assume all the races involve five horses. We’ll deal with 3,125 bettors in our example.
Let’s also assume we have a way of identifying 3,125 people who like to play the ponies but — and this is crucial — don’t know each other. Nowadays that probably wouldn’t be too difficult to organize, using web-based resources.
The con is very simple: we send each bettor a letter:
Dear X:
I’ve developed a sure-fire way of determining which horse will win a race. As someone interested in horse racing, I know you’d find this of interest, if it were true. But I also know that, as a reasonable person, you wouldn’t just take the word of a random stranger that his system works.
So to convince you my system works I offer you the following prediction, for free. Check out the results yourself.
Insert random prediction here
I’ll be in touch after the race.
All the Best, etc., etc.
When the races are run, we know we’ll have predicted the correct winner for exactly 20% (1 out of 5) of the bettors: there has to be a winner, and each round we split our predictions evenly across the five horses in each race we use. (We assume none of the races are canceled; the con works without that constraint, but the math gets a bit more complicated.)
We delete the addresses of all the people to whom we gave the wrong prediction (4/5ths, or 80%, of the total; 2,500, to be exact).
We then send a follow-up letter to each of the remaining 625 bettors, offering predictions on a new round of races. We follow the same process — make predictions, wait for outcomes, delete the bettors to whom we gave bad predictions — three more times (5 in total, counting the original round of letters).
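The winnowing can be sketched as a short simulation. This is a minimal illustration, not anything from the podcast: it assumes each round’s predictions are split evenly across the five horses, so exactly a fifth of the remaining bettors survive each round, and all names are made up.

```python
import random

# Illustrative sketch of the winnowing (assumption: picks are
# spread evenly over the 5 horses, so whichever horse wins,
# exactly 1/5 of the remaining bettors got a "correct" prediction).
bettors = list(range(5 ** 5))            # 3,125 marks
survivors_per_round = []
for _ in range(5):
    winner = random.randrange(5)         # the horse that actually wins
    # keep only the bettors whose assigned pick matched the winner
    bettors = [b for i, b in enumerate(bettors) if i % 5 == winner]
    survivors_per_round.append(len(bettors))

print(survivors_per_round)               # [625, 125, 25, 5, 1]
```

Note that the winnowing is guaranteed from the con artist’s side: no matter which horses win, the survivor counts come out the same.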
At this point we’ve eliminated all but one of the bettors…and that person has watched us make five correct predictions in a row. From their point of view, there’s only a 1 in 3,125 chance of that happening! We must, indeed, have a system that can predict the outcome of horse races!
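The 1-in-3,125 figure that so impresses the final bettor is just 1-in-5 odds compounded five times. A quick sanity check (exact rational arithmetic, to avoid floating-point fuzz):

```python
from fractions import Fraction

# Five correct calls in a row, each at 1-in-5 odds: (1/5)^5.
p = Fraction(1, 5) ** 5
print(p)  # 1/3125
```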
We hit them up for a lot of money, send them some bogus predictive software, and retire to enjoy the fruits of our con.
The con works because it exploits the mistaken belief that historical outcomes can anticipate future ones. There is a pattern, but it’s one that arises purely by chance, and so it’s meaningless as a predictive tool.¹

¹ There’s an analogous argument that Warren Buffett isn’t, in reality, a really, really good judge of investment opportunities. It’s just that, in a world with a large enough number of active investors, the odds of there being someone who can “routinely outperform the market” are very high. The truly unusual world, in a sense, would be one where there was no one like Warren Buffett, even if all the players were just guessing at random.