The following shows an example of a Scaled or Continuous question:
Instead of estimating the chance of a particular outcome, you are asked to forecast the outcome in natural units like $. Forecasts moving the estimate towards the actual outcome will be rewarded. Those moving it away will be penalized. As with probability questions, moving toward the extremes is progressively more expensive: we have merely rescaled the usual 0%-100% range and customized the interface.
Here you are about to move the estimate from $150M to $212M, at a cost of 99 points:
All Scaled questions have defined values for the left and right edges: here $25M and $275M. Moving the estimate to the right is like buying shares in $275M and selling shares in $25M, in exactly the right proportion to maximize your payout at the new forecast. Moving the estimate left does the reverse.
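To make the rescaling concrete, here is a minimal sketch that maps the $25M to $275M range onto 0..1 and applies the core formula quoted later in this post, "gain = 100 * log2 (new_chance / old_chance)", to each edge. The function names are made up for illustration, and reading the displayed cost as the worst-case loss (the points lost if the question resolves at the edge you moved away from) is an assumption, though it does reproduce the 99 points in the example.

```python
import math

def to_chance(value, low, high):
    """Rescale a value in natural units to the underlying 0..1 range."""
    return (value - low) / (high - low)

def edge_payoffs(old_value, new_value, low, high):
    """Points gained if the question resolves at the right or the left edge."""
    old_p = to_chance(old_value, low, high)
    new_p = to_chance(new_value, low, high)
    gain_right = 100 * math.log2(new_p / old_p)
    gain_left = 100 * math.log2((1 - new_p) / (1 - old_p))
    return gain_right, gain_left

# The move from the example: $150M to $212M on a $25M..$275M range.
gain_right, gain_left = edge_payoffs(150, 212, 25, 275)

# Assumption: the quoted "cost" is the worst-case loss of the move.
cost = -min(gain_right, gain_left)
print(round(cost))  # rounds to 99 points
```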
Expected Values
It’s important that you always think of the Scaled question as continuous. In the AUD$ example above that’s easy, because the range is large enough that it looks basically continuous. But sometimes we use Scaled to estimate small counts. For example, in the following MERS question, the unit is countries, the range is from 0 to 58, and the slider moves in whole steps:
Nevertheless, you are always estimating the expected value, and expected value is in general a continuous quantity. There are two cases where this is important.
- Mixtures. If the question is how many heads will be observed on 20 tosses of a fair coin, you should move the estimate towards 10. But you should also do that if you’re in a psychology experiment where you know the tosses are going to be with either a double-headed or double-tailed coin, chosen by the flip of a separate fair coin. You’re sure the actual outcome will be either 0 or 20, but are completely uncertain which. Your expected value is 10.
- Endpoints. Question writers try to set the range to cover at least 90% of envisioned resolutions. That means up to 10% of the time, the question will resolve outside the range. But it costs an infinite number of points to move the estimate all the way to an endpoint, because moving the expected value all the way to one side puts absolutely 0% chance on any other value in the range. Once the value is known to be outside the range, the question should resolve.
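Both cases can be checked with a few lines of arithmetic. The sketch below computes the expected value for the double-headed or double-tailed coin, then shows how the worst-case cost of pushing the estimate toward the right edge keeps growing. It assumes the core formula "gain = 100 * log2 (new_chance / old_chance)" on the rescaled 0..1 range; the helper name is hypothetical.

```python
import math

# Mixture: the coin is double-headed or double-tailed with equal chance,
# so 20 tosses give either 0 or 20 heads.
outcomes = {0: 0.5, 20: 0.5}
expected_heads = sum(value * prob for value, prob in outcomes.items())

def cost_toward_right_edge(old_p, new_p):
    """Points lost if the question then resolves at the left edge."""
    return -100 * math.log2((1 - new_p) / (1 - old_p))

# Endpoints: start at the midpoint (0.5) and push ever closer to the right edge.
targets = (0.9, 0.99, 0.999, 0.9999)
costs = [cost_toward_right_edge(0.5, p) for p in targets]

print(expected_heads)
for p, c in zip(targets, costs):
    print(p, round(c))
```

Each additional factor of ten closer to the edge adds roughly the same number of points again, so the cost has no upper bound as the estimate approaches the endpoint.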
Payoff, or How Should I Forecast?
The question basically pays off at the actual outcome. Technically, it’s done by resolving as a weighted mixture of the two endpoints.
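As a rough illustration of resolving as a weighted mixture, the sketch below weights the two edge payoffs by where the actual outcome falls in the range. The function name is hypothetical, the gain formula is the core formula "gain = 100 * log2 (new_chance / old_chance)" applied to each edge, and clamping outcomes outside the range to the nearest edge is an assumption rather than a documented rule.

```python
import math

def resolved_payoff(old_value, new_value, actual, low, high):
    """Points for one move once the question resolves at `actual`."""
    def scale(v):
        return (v - low) / (high - low)

    old_p, new_p = scale(old_value), scale(new_value)
    # Assumption: outcomes outside the range resolve at the nearest edge.
    weight = min(max(scale(actual), 0.0), 1.0)
    gain_right = 100 * math.log2(new_p / old_p)
    gain_left = 100 * math.log2((1 - new_p) / (1 - old_p))
    # Weighted mixture of the payoffs at the two endpoints.
    return weight * gain_right + (1 - weight) * gain_left

# Moving from $150M to $212M on a $25M..$275M range:
print(resolved_payoff(150, 212, 212, 25, 275))  # positive: moved toward the outcome
print(resolved_payoff(150, 212, 150, 25, 275))  # negative: moved away from the outcome
```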
You maximize your expected gain by moving the forecast towards your expected value. Of course you actually gain more if your own estimate is close to the actual value, but the actual value is unknown, and your expected payoff is based on your own belief. The core LMSR formula is proper, so it maximally rewards reporting your true belief. (Those so inclined can write down the expected payoff across outcomes for the core formula “gain = 100 * log2 (new_chance / old_chance)”, and set the first derivative to 0. Just remember to transform the natural units back to 0..1.)
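For those who would rather check this numerically than take the derivative, here is a small sketch. It assumes a forecaster whose expected value is $212M on the $25M to $275M range, with the current estimate at $150M, and searches a grid of possible new forecasts for the one with the highest expected gain under the core formula. The names and the grid resolution are illustrative choices.

```python
import math

def expected_gain(new_p, old_p, belief):
    """Expected points for moving old_p -> new_p, given your belief on the 0..1 scale."""
    return 100 * (belief * math.log2(new_p / old_p)
                  + (1 - belief) * math.log2((1 - new_p) / (1 - old_p)))

low, high = 25, 275
old_p = (150 - low) / (high - low)    # current estimate, rescaled to 0..1
belief = (212 - low) / (high - low)   # your own expected value, rescaled to 0..1

# Try every forecast from 0.001 to 0.999 and keep the best one.
grid = [i / 1000 for i in range(1, 1000)]
best_p = max(grid, key=lambda p: expected_gain(p, old_p, belief))

# Transform back to natural units.
best_value = low + best_p * (high - low)
print(best_value)
```

The best forecast lands on the believed expected value, which is what it means for the rule to be proper.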
Note: While question writers try to cover at least 90% of scenarios, it is known that people’s 90% ranges usually only contain the true value about 50% of the time, so it will be interesting to see if our guidelines help writers be better calibrated. It’s always worth considering that the writer did not make the ranges wide enough, but know that it gets very costly to move the estimate close to the edge.
Charles, why don’t you just solicit the question writers’ 99% confidence intervals? Hopefully those will contain the true value 75% of the time, and that’s not so far from 90.
If we discover we’re resolving outside the range half the time, we might see if that helps. Or use the formal range elicitation procedure developed by Burgman et al.
Can you elaborate on how your implementation of scaled continuous questions compares to the “continuous LMSRs” proposed in the paper linked below? http://dash.harvard.edu/bitstream/handle/1/9943223/Gao_BettingReal.pdf?sequence=1
Not much relation. Gao et al. consider probability distributions on a continuous range. For example, they allow you to select an interval and change its probability. Scaled only allows point estimates. Scaled is really just a binary question in disguise, where the resolution is allowed to be a mixture, and forecasts are interpreted as expected values. The usual 0..1 range is rescaled to represent the interval in natural units.
We have worked on more expressive variants that would allow distributions over date ranges or hierarchies, but going from algorithm to production software is a big expensive step, and we have decided first to improve the feedback and user dashboard to help more users stay engaged and to help active users better understand and manage their positions.