Sep 21, 2010

Prediction and the Meaning of Probability

Early in Bruno de Finetti's book "Theory of Probability" (1974), he makes the intriguing statement that "probability does not exist". This is a philosophical distinction, but it gets at the heart of how he viewed the concept (or existence) of randomness. In the words of Robert Nau, paraphrasing de Finetti, "Probability exists only subjectively within the minds of individuals."

Indeed, one can define the notion of the "probability of an event" without appealing to some underlying assumption of randomness. A probability can instead be considered in terms of gambling, where $P(X)$ is the price at which one would be willing to buy OR sell some hypothetical contract that pays out exactly 1 dollar if the outcome $X$ occurs. From Wikipedia:
You must set the price of a promise to pay $1 if there was life on Mars 1 billion years ago, and $0 if there was not, and tomorrow the answer will be revealed. You know that your opponent will be able to choose either to buy such a promise from you at the price you have set, or require you to buy such a promise from your opponent, still at the same price. In other words: you set the odds, but your opponent decides which side of the bet will be yours. The price you set is the "operational subjective probability" that you assign to the proposition on which you are betting. This price has to obey the probability axioms if you are not to face certain loss, as you would if you set a price above $1 (or a negative price). By considering bets on more than one event de Finetti could justify additivity. Prices, or equivalently odds, that do not expose you to certain loss through a Dutch book are called coherent.
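To make the additivity remark in the quote concrete, here is a minimal sketch of a Dutch book against prices that do not add up. The function name `opponent_profit` and the example prices are my own illustration, not taken from de Finetti or from the quoted article. You quote one price for a contract on an event A and another for a contract on its complement; exactly one of the two pays 1, so any pair of prices that does not sum to 1 hands the opponent a sure profit.

```python
def opponent_profit(price_a, price_not_a, a_occurs):
    """Opponent's profit from trading both contracts at your quoted prices.

    price_a:     your price for a contract paying 1 if A occurs
    price_not_a: your price for a contract paying 1 if A does not occur
    a_occurs:    the revealed outcome (True or False)
    """
    payout_a = 1.0 if a_occurs else 0.0
    payout_not_a = 1.0 - payout_a
    total_price = price_a + price_not_a
    total_payout = payout_a + payout_not_a  # always exactly 1

    if total_price > 1:
        # Prices too high: the opponent makes you buy both contracts,
        # collecting total_price and paying out total_payout.
        return total_price - total_payout
    # Prices too low (or exactly additive): the opponent buys both
    # contracts from you, paying total_price and collecting total_payout.
    return total_payout - total_price


# Hypothetical incoherent prices: 0.75 + 0.5 = 1.25 > 1.
for outcome in (True, False):
    print(outcome, opponent_profit(0.75, 0.5, outcome))
```

Whatever the outcome, the opponent's profit is the same, which is what "certain loss" means here. With additive prices such as 0.25 and 0.75 the profit is zero either way.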
This brings me back to my previous post on forecasting. Without an underlying assumption on the nature of the world, for example the i.i.d. assumption, it becomes difficult to judge "how good is a forecaster?" In the calibration setting, on each round $t$ a forecaster guesses a probability value $p_t$ and nature reveals an outcome $z_t \in \{0,1\}$. Of course, for a single pair $(p_t,z_t)$ we have no way to answer the question "was the forecaster right?" The question becomes even more remote if we imagine that $z_t$ is also chosen by a potential adversary.

So let us now return to the notion of "calibration" from the previous post, which is a measure of the performance of a forecaster. The concept of calibration can be stated roughly as "the probability predictions roughly match the data frequencies". But while this might seem nice, it doesn't give us a way to judge how "good" the forecaster is, so it's somewhat hard to interpret. On the other hand, if we view this through the lens of de Finetti, using the idea of betting rates, we arrive at what I view as a much more natural interpretation. I'll now give a rough sketch of this idea.
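As a rough illustration of "predictions match the data frequencies", the sketch below groups rounds by the forecast value and reports how often the outcome was 1 in each group. The helper `calibration_table` and the toy data are assumptions made for this example; the formal definition of calibration concerns the long-run limit, which a finite table can only suggest.

```python
from collections import defaultdict


def calibration_table(predictions, outcomes):
    """Map each distinct forecast p to (count, empirical frequency of z = 1)."""
    buckets = defaultdict(list)
    for p, z in zip(predictions, outcomes):
        buckets[p].append(z)
    return {p: (len(zs), sum(zs) / len(zs)) for p, zs in buckets.items()}


# Toy data (made up): the forecaster says 0.5 on four rounds and 0.25 on four.
predictions = [0.5, 0.5, 0.5, 0.5, 0.25, 0.25, 0.25, 0.25]
outcomes = [1, 0, 1, 0, 1, 0, 0, 0]

for p, (count, freq) in sorted(calibration_table(predictions, outcomes).items()):
    print(f"forecast {p}: {count} rounds, outcome frequency {freq}")
```

In this toy data each forecast value matches the observed frequency exactly, so the forecaster looks calibrated on these eight rounds.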

Let's say you're a forecaster and a gambler, and on each round $t$ you predict a probability $p_t$ and also promise to buy or sell a contract, at the price of $p_t$, that pays off 1 USD if the outcome $z_t = 1$. But you're worried that someone might come along and realize that your predictions have some inherent bias. (As mentioned in the last post, it has apparently been observed that when weather forecasters predicted a 50% chance of rain, it only rained 27% of the time!) Let's call an opponent a threshold bettor if he plans to buy (or sell) your contracts whenever the price is below (or above) a fixed value $\alpha$.

So if we pose the prediction problem in this way, we can say that a forecaster is calibrated if and only if she loses no money, on average and in the long run, to a threshold bettor. So the forecaster's predictions may not be good, but they are at least robust to gamblers seeking to exploit fine-grained biases.
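Here is a small sketch of the threshold bettor's average profit, under stated assumptions: the bettor buys when the price is below $\alpha$, sells when it is above, and does not trade when the price equals $\alpha$. The function name `bettor_average_profit` and the data are hypothetical; the biased example simply reuses the 50% versus 27% rain figures mentioned above.

```python
def bettor_average_profit(predictions, outcomes, alpha):
    """Average per-round profit of a threshold bettor with threshold alpha.

    The forecaster's average loss is exactly this quantity.
    """
    total = 0.0
    for p, z in zip(predictions, outcomes):
        if p < alpha:
            total += z - p  # bettor buys at price p and collects z
        elif p > alpha:
            total += p - z  # bettor sells at price p and pays out z
        # p == alpha: no trade (an assumption of this sketch)
    return total / len(predictions)


# Biased forecaster: says 0.5 on 100 days, but it rains on only 27 of them.
biased_forecasts = [0.5] * 100
rain_27 = [1] * 27 + [0] * 73
print(bettor_average_profit(biased_forecasts, rain_27, alpha=0.3))

# Forecaster whose 0.5 forecasts match the frequency: rain on 50 of 100 days.
rain_50 = [1] * 50 + [0] * 50
print(bettor_average_profit(biased_forecasts, rain_50, alpha=0.3))
```

Against the biased forecaster, a bettor with $\alpha = 0.3$ sells every day and earns 0.23 per round on average; against the forecaster whose 50% forecasts come true half the time, the same bettor earns nothing. A finite simulation like this only illustrates the long-run statement, it does not prove it.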

Older confusing explanation: Imagine, on each round $t$, a forecaster providing a prediction $p_t$ followed by Nature providing an outcome $z_t$. Assume also that on each round the forecaster will put her money where her mouth is, and offer to buy OR sell a betting contract to Nature at the price $p_t$, one that pays off 1 USD in the event that $z_t = 1$. This seems pretty bad, of course, since Nature is the one choosing $z_t$! Not so: we will restrict Nature and require that she commit to a fixed buy/sell function of the following form: buy the contract whenever the price is below some known threshold $\alpha$, and otherwise sell. (Indeed, $\alpha$ can even be chosen in hindsight.)

With this setup in mind, the statement "the forecaster is calibrated" is equivalent to "the forecaster will, in the long run, lose no money on average to Nature". More generally, we can see this as a notion of "goodness" of the forecaster in terms of a gambling strategy. If we wanted to use this forecaster to make bets on $z_t$, we could imagine an opponent gambler thinking "I believe the true probability is 20%, so I'm always going to buy a contract from you whenever you offer a price less than $0.20, and otherwise I'll sell you a contract." If our forecaster is actually calibrated, then we are certain to not lose any money (on average) to this opponent.

(This idea came out of a discussion I had today with my advisor Peter Bartlett, who I'd like to thank.)


  1. I'm not sure I completely understand your notion of calibrated forecaster but your motivation and examples remind me of proper scoring rules in economics (or what Bob and I called "proper losses" in our ICML paper last year). If you look at Savage's "Elicitation of Personal Probabilities and Expectations" the derivation of proper scoring rules uses a similar trading/gambling framework.

    For a more modern take, you may want to look at Lambert, Pennock, and Shoham's "Eliciting Properties of Probability Distributions" from EC'08 (and the follow-up paper for classification in EC'09) if you haven't already seen them.

  2. Correct me if I am wrong, but doesn't this notion of calibrated imply that a weather forecaster who simply makes predictions using only the correct prior (while ignoring the current state of the weather) can claim to be calibrated? If that is the case, then an uncalibrated forecast is clearly bad, but a calibrated forecast isn't necessarily all that good either.

  3. Exactly. In that sense, "calibrated" is a bit like "unbiased." An estimator that always guesses the mean is unbiased, but pretty useless.

  4. Regarding the last couple of comments: it's not just being unbiased. It's being unbiased on *all* predictions. For a stationary distribution of outcomes, that's easy, as mentioned: simply predict the empirical mean! But what if the outcomes are adversarially chosen? In this case calibration can still be achieved, which I think is still pretty surprising.

  5. This interpretation gives the Game-Theoretic Probability of Shafer and Vovk. For the online learning setting this gives Defensive Forecasting.

  6. Thanks to anonymous commenter for those last two papers, they seem quite interesting!