Hamill, T. M., R. Hagedorn, and J. S. Whitaker (July 2008): Probabilistic Forecast Calibration Using ECMWF and GFS Ensemble Reforecasts. Part II: Precipitation. Mon. Weather Rev., 136 (7), 2620-2632. doi:10.1175/2007MWR2411.1

As a companion to Part I, which discussed the calibration of probabilistic 2-m temperature forecasts using large training datasets, Part II discusses the calibration of probabilistic forecasts of 12-hourly precipitation amounts. Again, large ensemble reforecast datasets from the European Centre for Medium-Range Weather Forecasts (ECMWF) and the Global Forecast System (GFS) were used for testing and calibration. North American Regional Reanalysis (NARR) 12-hourly precipitation analysis data were used for verification and training. Logistic regression was used to perform the calibration, with power-transformed ensemble means and spreads as predictors. Forecasts were produced and validated for every NARR grid point in the conterminous United States (CONUS). Training sample sizes were increased by including data from 10 nearby grid points with similar analyzed climatologies. “Raw” probabilistic forecasts from each system were considered, in which probabilities were set according to ensemble relative frequency. Calibrated forecasts were also considered based on three amounts of training data: the last 30 days of forecasts (available for 2005 only), weekly reforecasts during 1982–2001, and daily reforecasts during 1979–2003 (GFS only). Several main results were found. (i) Raw probabilistic forecasts from the ensemble prediction systems’ relative frequency possessed little or negative skill when skill was computed with a version of the Brier skill score (BSS) that does not award skill solely on the basis of differences in climatological probabilities among samples. ECMWF raw forecasts had larger skills than GFS raw forecasts. (ii) After calibration with weekly reforecasts, ECMWF forecasts were much improved in reliability and were moderately skillful. Similarly, GFS-calibrated forecasts were much more reliable, albeit somewhat less skillful. Nonetheless, GFS-calibrated forecasts were much more skillful than ECMWF raw forecasts. (iii) The last 30 days of training data produced calibrated forecasts of light-precipitation events that were nearly as skillful as those with weekly reforecast data. However, for higher precipitation thresholds, calibrated forecasts using the weekly reforecast datasets were much more skillful, indicating the importance of large sample size for the calibration of unusual and rare events. (iv) Training with daily GFS reforecast data provided calibrated forecasts with a skill only slightly improved relative to that from the weekly data.
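The calibration step described in the abstract (logistic regression on power-transformed ensemble mean and spread) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The exponent `POWER = 0.25`, the gradient-descent fitting routine, the function names, and the synthetic training data are all assumptions made for illustration; the abstract does not state the exponent or the fitting method.

```python
import math
import random

# Assumed exponent for the power transform; the abstract does not give the value.
POWER = 0.25


def power_transform(value, power=POWER):
    """Compress the long right tail of a non-negative precipitation quantity."""
    return value ** power


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def event_probability(beta, ens_mean, ens_spread):
    """P(precipitation exceeds the threshold) from an intercept and two slopes."""
    x1 = power_transform(ens_mean)
    x2 = power_transform(ens_spread)
    return sigmoid(beta[0] + beta[1] * x1 + beta[2] * x2)


def fit_logistic(samples, lr=0.5, n_iter=1000):
    """Fit beta by batch gradient descent on the mean log loss.

    samples: list of (ens_mean, ens_spread, event) with event in {0, 1}.
    """
    rows = [(1.0, power_transform(m), power_transform(s), y) for m, s, y in samples]
    beta = [0.0, 0.0, 0.0]
    n = len(rows)
    for _ in range(n_iter):
        grad = [0.0, 0.0, 0.0]
        for x0, x1, x2, y in rows:
            err = sigmoid(beta[0] * x0 + beta[1] * x1 + beta[2] * x2) - y
            grad[0] += err * x0
            grad[1] += err * x1
            grad[2] += err * x2
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta


# Synthetic training data (illustrative only, not NARR or reforecast data).
random.seed(0)
training = []
for _ in range(400):
    ens_mean = random.expovariate(0.5)       # mm per 12 h, mean of 2 mm
    ens_spread = random.uniform(0.1, 2.0)    # mm per 12 h
    p_true = sigmoid(-3.0 + 3.0 * power_transform(ens_mean))
    training.append((ens_mean, ens_spread, 1 if random.random() < p_true else 0))

beta = fit_logistic(training)
print("dry-ish forecast:", round(event_probability(beta, 0.1, 1.0), 3))
print("wet forecast:    ", round(event_probability(beta, 10.0, 1.0), 3))
```

In a setup like the one the abstract describes, one such model would presumably be fit per precipitation threshold, with the `training` list pooling samples from the target grid point and its nearby grid points with similar climatologies.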

