Comparison of cross-validation and bootstrap aggregating for building a seasonal streamflow forecast model
Abstract. Based on a hindcast experiment for the period 1982–2013 in 66 sub-catchments of the Swiss Rhine, the present study compares two approaches to building a regression model for seasonal streamflow forecasting. The first approach selects a single "best guess" model, which is tested by leave-one-out cross-validation. The second approach implements bootstrap aggregating: bootstrap replicates are used to select several models, and out-of-bag predictions serve for model testing. The target variable is mean streamflow over durations of 30, 60 and 90 days, starting on the 1st and 16th day of every month. Compared to the best guess model, bootstrap aggregating reduces the mean squared error of the streamflow forecast by seven percent on average. Thus, if resampling is already part of the model building procedure, bootstrap aggregating appears to be a useful strategy in statistical seasonal streamflow forecasting. Since the improved accuracy comes at the cost of a less interpretable model, the approach may be best suited to pure prediction tasks, such as operational applications.
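The two testing schemes contrasted in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the model selection rule (picking the single predictor with the lowest training error), the synthetic data, the number of bootstrap replicates and all function names are assumptions made for illustration only.

```python
import numpy as np


def select_and_fit(X, y):
    """Hypothetical selection rule: keep the single predictor column whose
    simple linear regression has the lowest training sum of squared errors."""
    best = None
    for j in range(X.shape[1]):
        A = np.column_stack([np.ones(len(y)), X[:, j]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        sse = float(np.sum((y - A @ coef) ** 2))
        if best is None or sse < best[0]:
            best = (sse, j, coef)
    return best[1], best[2]


def predict(model, X):
    """Apply a (column index, coefficients) model to new predictor rows."""
    j, coef = model
    return coef[0] + coef[1] * X[:, j]


def loocv_predictions(X, y):
    """Leave-one-out cross-validation: each year is predicted by a model
    selected and fitted on all other years."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i
        model = select_and_fit(X[train], y[train])
        preds[i] = predict(model, X[i:i + 1])[0]
    return preds


def oob_bagging_predictions(X, y, n_boot=200, seed=0):
    """Bootstrap aggregating: one model per bootstrap replicate; each year is
    predicted by averaging only the models that did not see it (out-of-bag)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    pred_sum = np.zeros(n)
    pred_cnt = np.zeros(n, dtype=int)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)           # sample with replacement
        oob = np.setdiff1d(np.arange(n), idx)      # years left out of the replicate
        if oob.size == 0:
            continue
        model = select_and_fit(X[idx], y[idx])
        pred_sum[oob] += predict(model, X[oob])
        pred_cnt[oob] += 1
    preds = np.full(n, np.nan)                     # NaN if a year was never out-of-bag
    seen = pred_cnt > 0
    preds[seen] = pred_sum[seen] / pred_cnt[seen]
    return preds


# Synthetic stand-in for 32 hindcast years and 3 candidate predictors (assumed).
rng = np.random.default_rng(1)
X = rng.normal(size=(32, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=32)

loocv = loocv_predictions(X, y)
bagged = oob_bagging_predictions(X, y)
mse_loocv = float(np.mean((y - loocv) ** 2))
mse_bagged = float(np.mean((y - bagged) ** 2))
print(f"LOOCV MSE: {mse_loocv:.3f}, out-of-bag bagging MSE: {mse_bagged:.3f}")
```

In both schemes the selection step is repeated inside the resampling loop, so every prediction comes from models that never saw the predicted year. The relative size of the two errors on this toy data says nothing about the seven percent reduction reported for the Swiss Rhine hindcast.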