Category Archives: Analysis

Google Analytics statistics for SciCast, as of May 22, 2015.

SciCast Final Report Released

The final SciCast annual report has been released!  See the “About” or “Project Data” menus above, or go directly to the SciCast Final Report download page.

Executive Summary (excerpts)

Registration and Activity

SciCast has seen over 11,000 registrations, and over 129,000 forecasts. Google Analytics reports over 76K unique IP addresses (suggesting 8 per registered user), and 1.3M pageviews. The average session duration was 5 minutes.

[Figure: %SafeMode Trades vs. #Trades]

Analysis of SciCast Safe Mode Usage

by Kellen Leister

Note: The following post has been revised slightly to use a power law (x^k) instead of a logarithmic function (k ln x). This is because the logarithmic function could go negative, which we know the percentage of trades cannot do. Revisions are in red.
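The reason for the switch can be checked numerically. The sketch below is illustrative only: it compares the two bare functional forms named in the note, and the coefficients `k_log` and `k_pow` and the trade counts are hypothetical values, not the fitted ones from the post.

```python
import math


def log_model(x, k):
    """Logarithmic form k * ln(x); its sign depends on k and on whether x > 1."""
    return k * math.log(x)


def power_model(x, k):
    """Power-law form x ** k; strictly positive for any x > 0 and real k."""
    return x ** k


# Hypothetical coefficients and trade counts, chosen only for illustration.
k_log, k_pow = -5.0, -0.3
trade_counts = [1, 10, 100, 1000]

log_values = [log_model(x, k_log) for x in trade_counts]
power_values = [power_model(x, k_pow) for x in trade_counts]

for x, lv, pv in zip(trade_counts, log_values, power_values):
    print(f"x={x:5d}  k*ln(x)={lv:8.2f}  x^k={pv:6.3f}")
```

With a negative coefficient the logarithmic form drops below zero as soon as x exceeds 1, while the power law stays positive for every positive x, which is the property a percentage needs.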

Given the ease of switching back and forth between Safe Mode and Power Mode, we were interested to observe the relative use of each mode. (For a description of the different modes, see the Appendix.)


US Flu Forecast: Exploring links between national and regional level seasonal characteristics

For the flu forecasting challenge, participants are required to predict several flu season characteristics at the national and regional levels (10 HHS regions). For some of the required quantities, such as peak percentage influenza-like illness (ILI) and total seasonal ILI count, one may argue that national-level values have some relationship with the regional-level ones. In other words, participants may be led to believe that national-level statistics can be obtained from regional-level ones.


Will Philae land successfully on the surface of a comet?



SciCasters are following the ESA’s International Rosetta Mission and counting down the days to the much-anticipated landing of the Philae lander on the periodic comet 67P/Churyumov-Gerasimenko. The latest news from ESA states that it will deploy Philae to the comet on November 12.


Market Accuracy and Calibration

Prediction market performance can be assessed using a variety of methods. Recently, SciCast researchers have been taking a closer look at market accuracy. A commonly used scoring rule is the Brier score, which functions much like the squared error between the forecasts and the outcomes on questions.
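As a rough illustration of that scoring rule, here is a minimal sketch of the binary form of the Brier score. The forecasts and outcomes are hypothetical, and the exact variant used in the SciCast analysis (for example, a multi-outcome version) is not stated in this excerpt.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    Lower is better: 0.0 is a perfect score, and always forecasting 0.5
    on binary questions yields 0.25.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal in length")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical forecasts (probability of "yes") and resolved outcomes (1 = yes).
example_forecasts = [0.9, 0.2, 0.5]
example_outcomes = [1, 0, 1]
example_score = brier_score(example_forecasts, example_outcomes)
print(round(example_score, 3))
```

Confident forecasts that turn out right contribute almost nothing to the score, while confident forecasts that turn out wrong are penalized heavily.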



SciCast, Bluefin-21, and GeNIe

Reposted with permission from SciCast forecaster Jay Kominek. You can find his blog here.

I’m going to assume you’re familiar with SciCast; if you aren’t, that link is the place to start. Or maybe Wikipedia.

There has been an open question on SciCast, “Will Bluefin-21 locate the black box from Malaysian Airlines flight MH-370?”, since mid-April. (If you don’t know what MH370 is, I envy you.) It dropped fairly quickly to predicting that there was a 10% chance of Bluefin-21 locating MH370. Early on, that was reasonable enough. There was evidence that pings from the black box had been detected in the region, so the entire Indian Ocean had been narrowed down to a relatively small area.

Unfortunately, weeks passed, and on May 29th Bluefin-21’s mission was completed, unsuccessfully. Bluefin-21 then stopped looking. At this point, I (and others) expected the forecast to plummet. But folks kept pushing it back up. In fact, I count about 5 or 6 distinct individuals who moved the probability up after completion of the mission. There are perfectly good reasons related to the nature of the prediction market for some of those adjustments.

I’m interested in the bad reasons.



Cluster Analysis Data for HPV Questions

One of our active forecasters requested more information about the cluster analysis for the HPV-related questions on SciCast.


The U.S. CDC reports that human papillomavirus (HPV) is the most common sexually transmitted infection in the U.S.  Because some types of HPV are initially asymptomatic but increase the risk of cancer, particularly cervical cancer in women, great effort has been put into vaccinating the population against it.  Two HPV vaccines have been introduced since 2006, and the CDC encourages their use for girls age 11 and older.

Studies of HPV initially focused on the 13- to 17-year-old population, and the CDC estimates that 53.8% of U.S. females aged 13-17 had been vaccinated with at least one dose of an HPV vaccine in 2012, a gain of 0.8% since 2011.  The vaccination coverage varies widely across U.S. States, but some States are similar.  Therefore, instead of linking each State to the US average, we put them in clusters.

Cluster Analysis

Clusters of States were created by first analyzing variables that correlate with HPV vaccination coverage in 2011 and 2012. A simple model using those variables for predicting HPV vaccination coverage explained over half the variation among states. To view the state-level variables, including HPV vaccination coverage estimates, open this Google Sheet.

On the most useful variables, States in a given cluster are more similar to each other than to States in other clusters. To create the clusters, I used the mclust and cluster libraries in R statistical software to try several forms of cluster analysis. Results varied somewhat, and we chose five clusters that were relatively easy to interpret and that each contained a reasonable number of States.

Cluster Model

The link structure on SciCast is a simple hierarchy:

  • US Rate -> {5 Clusters}
  • Cluster -> {States In the Cluster}
  • Also, US Rate in 2013 -> US Rate in 2014

That means you can forecast cluster rates given US rates (or vice versa), and a State’s expected rate given its cluster’s rate.  (Because States are modeled as scaled continuous questions, you cannot forecast cluster given State.)
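The link structure and its conditioning rules can be sketched as a small data structure. This is an illustration, not SciCast's implementation: the `Link` type, the `can_forecast` helper, and the placeholder names "Cluster 1", "State A", and "State B" are hypothetical, and the reversibility of the 2013 -> 2014 link is an assumption the post does not address.

```python
from typing import NamedTuple


class Link(NamedTuple):
    parent: str
    child: str
    child_is_scaled: bool  # a scaled continuous child cannot be conditioned on


# Hypothetical encoding of the hierarchy; cluster and State names are placeholders.
LINKS = [Link("US Rate", f"Cluster {i}", False) for i in range(1, 6)] + [
    Link("Cluster 1", "State A", True),
    Link("Cluster 1", "State B", True),
    # Assumption: treated as reversible here; the post does not say either way.
    Link("US Rate 2013", "US Rate 2014", False),
]


def can_forecast(target: str, given: str) -> bool:
    """Return True if `target` can be forecast conditional on `given`."""
    for link in LINKS:
        # Forward direction: child given parent is always allowed.
        if (given, target) == (link.parent, link.child):
            return True
        # Reverse direction: parent given child, unless the child is scaled.
        if (given, target) == (link.child, link.parent) and not link.child_is_scaled:
            return True
    return False


print(can_forecast("State A", given="Cluster 1"))  # State given its cluster
print(can_forecast("Cluster 1", given="State A"))  # cluster given State
```

Storing the "scaled" flag on the link keeps the one asymmetric rule (no conditioning on a scaled continuous question) in a single place.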

Based on our model, we have set initial marginal distributions for all 5 clusters, and an initial conditional distribution of Cluster given that the US is in the most likely state. Both are approximately Normal, with the conditional distribution having smaller variance.
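One way to see why the conditional distribution is narrower: if the cluster rate and the US rate were jointly Normal (an assumption made here for illustration; the post only says each distribution is approximately Normal), conditioning on the US rate shrinks the standard deviation by a factor of sqrt(1 - rho^2). All numbers below are hypothetical and are not taken from the SciCast model.

```python
import math


def conditional_normal(mean_c, sd_c, mean_us, sd_us, rho, us_value):
    """Mean and standard deviation of the cluster rate given the US rate,
    under an assumed bivariate Normal with correlation rho."""
    cond_mean = mean_c + rho * (sd_c / sd_us) * (us_value - mean_us)
    cond_sd = sd_c * math.sqrt(1.0 - rho ** 2)
    return cond_mean, cond_sd


# Hypothetical marginal parameters, in percentage points of coverage.
marginal_mean, marginal_sd = 50.0, 5.0
cond_mean, cond_sd = conditional_normal(
    mean_c=marginal_mean, sd_c=marginal_sd,
    mean_us=54.0, sd_us=2.0, rho=0.6, us_value=54.0,
)
print(f"marginal sd = {marginal_sd:.2f}, conditional sd = {cond_sd:.2f}")
```

Whenever the correlation is nonzero, the conditional standard deviation is strictly smaller than the marginal one, which matches the description above.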

Not Known: States Conditional on Clusters

Although a State’s HPV vaccination rate likely depends on the cluster to which the State belongs, we do not have a very clear model of the relation between a specific State’s vaccination rate and its cluster’s vaccination rate. Forecasts of a State rate given its cluster need to be filled in by users with the “Related Forecasts” section after a forecast on a cluster question. (Even a forecast that completely agrees with the current market forecast will open up conditional forecasting options.) Hopefully the statistics in the file will help forecasters devise their own ideas!


by Ken Olson