How reliable are these predictions?

Each day a check is made on the predictions of the previous day. The following figures and statistics describe the reliability of the predictions published to date on the web, based on these checks. (Note that a summary of the predictions is provided here.)

Total days: 529
Event days: 51
Mean (f): 0.194
Mean (x): 0.096
Median (f): 0.097
Std dev (f): 0.187
Std dev (x): 0.295
Mean (f|x=1): 0.349
Mean (f|x=0): 0.177
Median (f|x=1): 0.301
Median (f|x=0): 0.089
Std dev (f|x=1): 0.226
Std dev (f|x=0): 0.174
Discrimination: 0.172
ME: 0.098
MAE: 0.223
MSE: 0.101
Linear assoc: 0.272
Skill: *****

Figure 1: Reliability plot for M-X event prediction, and associated statistics.

Total days: 529
Event days: 12
Mean (f): 0.042
Mean (x): 0.023
Median (f): 0.013
Std dev (f): 0.067
Std dev (x): 0.149
Mean (f|x=1): 0.221
Mean (f|x=0): 0.038
Median (f|x=1): 0.259
Median (f|x=0): 0.013
Std dev (f|x=1): 0.127
Std dev (f|x=0): 0.059
Discrimination: 0.182
ME: 0.020
MAE: 0.055
MSE: 0.019
Linear assoc: 0.407
Skill: 0.148

Figure 2: Reliability plot for X event prediction, and associated statistics.

To understand these plots, first note that predictions are on the horizontal axis and observations are on the vertical axis. The 45-degree solid line represents perfect prediction. The top figure (Figure 1) is for prediction of flares with classes M to X, and the bottom figure (Figure 2) is for X class flares.

In more detail, each plot may be understood as follows. The forecasts for all days are binned in increments of 0.05. For each bin, all the days on which predictions were made in the range of that bin are examined. The fraction of those days on which at least one event occurred is used to estimate the underlying probability of an event on those days. That fraction is the vertical value of the plot for that bin.
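The binning procedure just described can be sketched in a few lines of Python. This is only an illustration, not the code used to produce the plots: the function name reliability_curve, the choice to plot each bin at its centre, and the treatment of a forecast of exactly 1.0 are all assumptions.

```python
from collections import defaultdict


def reliability_curve(forecasts, outcomes, bin_width=0.05):
    """Return (bin_centre, observed_fraction, n_days) for each non-empty bin.

    forecasts: daily probabilities f in [0, 1]
    outcomes:  daily observations x, 1 if at least one event occurred, else 0
    """
    n_bins = round(1.0 / bin_width)
    days = defaultdict(int)    # number of forecast days falling in each bin
    events = defaultdict(int)  # number of those days with at least one event
    for f, x in zip(forecasts, outcomes):
        # The small offset guards against floating-point round-off at bin
        # edges; the clamp puts a forecast of exactly 1.0 in the last bin
        # (an assumed convention).
        i = min(int(f / bin_width + 1e-9), n_bins - 1)
        days[i] += 1
        events[i] += x
    return [((i + 0.5) * bin_width, events[i] / days[i], days[i])
            for i in sorted(days)]


if __name__ == "__main__":
    # Made-up values, for illustration only.
    f = [0.02, 0.03, 0.12, 0.13, 0.14, 0.97]
    x = [0, 1, 0, 0, 1, 1]
    for centre, fraction, n in reliability_curve(f, x):
        print(f"{centre:.3f}  {fraction:.3f}  ({n} days)")
```

Bins containing few days give noisy estimates of the observed fraction, which is why the number of days in each bin is returned alongside it.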

Similar "reliability" plots for predictions issued by the US National Oceanic and Atmospheric Administration are given here and here.

Next to the two plots above are additional statistics on the reliability of the predictions. In these tables, f refers to the forecast, and x to the observations. For each day, x is one if at least one flare occurred and zero otherwise. For each day, f is the value of epsilon for the forecast, i.e. the assigned probability for the occurrence of at least one flare. The statistics of f and x are shown.
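This page does not define the derived entries in the tables (Discrimination, ME, MAE, MSE, Linear assoc). The sketch below assumes the conventional definitions: discrimination is Mean (f|x=1) minus Mean (f|x=0), ME, MAE and MSE are the mean, mean absolute and mean squared values of f - x, and linear association is the Pearson correlation of f and x. These appear consistent with the tabulated numbers (for Figure 1, 0.349 - 0.177 = 0.172 and 0.194 - 0.096 = 0.098). The function name summary_statistics and the use of population rather than sample standard deviations are assumptions.

```python
import math
import statistics


def summary_statistics(f, x):
    """Summary statistics for forecasts f and binary observations x.

    Assumes both outcomes (x = 0 and x = 1) occur at least once and that
    the forecasts are not all identical.
    """
    n = len(f)
    f1 = [fi for fi, xi in zip(f, x) if xi == 1]  # forecasts on event days
    f0 = [fi for fi, xi in zip(f, x) if xi == 0]  # forecasts on non-event days
    err = [fi - xi for fi, xi in zip(f, x)]       # forecast minus observation
    mean_f, mean_x = statistics.fmean(f), statistics.fmean(x)
    sd_f, sd_x = statistics.pstdev(f), statistics.pstdev(x)
    cov = sum((fi - mean_f) * (xi - mean_x) for fi, xi in zip(f, x)) / n
    return {
        "total_days": n,
        "event_days": sum(x),
        "mean_f": mean_f,
        "mean_x": mean_x,
        "median_f": statistics.median(f),
        "sd_f": sd_f,
        "sd_x": sd_x,
        "mean_f_given_1": statistics.fmean(f1),
        "mean_f_given_0": statistics.fmean(f0),
        "discrimination": statistics.fmean(f1) - statistics.fmean(f0),
        "ME": statistics.fmean(err),
        "MAE": statistics.fmean(abs(e) for e in err),
        "MSE": statistics.fmean(e * e for e in err),
        "linear_assoc": cov / (sd_f * sd_x),
    }


if __name__ == "__main__":
    # Made-up values, for illustration only.
    stats = summary_statistics([0.1, 0.4, 0.2, 0.7], [0, 1, 0, 1])
    for name, value in stats.items():
        print(f"{name}: {value:.3f}")
```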

Figure 3 shows plots of running "skill scores" for the predictions of M-X events (upper panel) and X events (lower panel). The skill score is a measure of how good the predictions are by comparison with a simple forecast consisting of the average of the observed values. A skill score of one indicates perfect prediction, and a positive/negative skill score represents better/worse prediction than the simple forecast. The final values on these plots are the skill scores for all days of predictions and observations to date, which are also listed in the tables to the right of Figures 1 and 2.
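The formula for the skill score is not given on this page. A common choice that matches the description is the mean-square-error skill score, 1 - MSE/MSE_ref, where MSE_ref is the mean square error of forecasting the average of x on every day (that is, the variance of x). As a rough check, the Figure 2 table gives 1 - 0.019/0.149^2, or about 0.14, close to the listed 0.148 given the rounding of the inputs. The sketch below assumes that definition; the function names skill_score and running_skill are hypothetical.

```python
def skill_score(f, x):
    """Assumed MSE-based skill score relative to forecasting mean(x) every day.

    Returns None when every observation is the same, since the reference
    error is then zero and the score is undefined.
    """
    n = len(f)
    mean_x = sum(x) / n
    mse = sum((fi - xi) ** 2 for fi, xi in zip(f, x)) / n
    mse_ref = sum((xi - mean_x) ** 2 for xi in x) / n
    if mse_ref == 0:
        return None
    return 1.0 - mse / mse_ref


def running_skill(f, x):
    """Skill score over the first k days, for k = 1 .. len(f)."""
    return [skill_score(f[:k], x[:k]) for k in range(1, len(f) + 1)]


if __name__ == "__main__":
    # Made-up values, for illustration only.
    f = [0.1, 0.4, 0.2, 0.7]
    x = [0, 1, 0, 1]
    print(running_skill(f, x))
```

Under this definition the running score is undefined until both an event day and a non-event day have been observed, so the first entries of a running plot may be missing.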

Figure 3: Running plots of skill scores for the predictions to date.

Additional information:

Main flare prediction page
More detail on how today's predictions were made
Summary of all predictions made to date
The prediction for 4 November 2003
A test of the method on historical data
Links to other pages related to flare prediction
Mike Wheatland's home page

Acknowledgement: The predictions given here are based on information from the Space Environment Center, Boulder, CO, National Oceanic and Atmospheric Administration (NOAA), US Dept. of Commerce.

Page last updated Wednesday, 13-Nov-2013 11:58:17 AEDT