Recent research on bivariate flood peak/volume frequency analysis has mainly focused on the statistical aspects of the use of various copula models. The interplay of climatic and catchment processes in discriminating among these models has attracted less interest. In this paper we analyse the influence of climatic and hydrological controls on flood peak and volume relationships and their models, based on the concept of comparative hydrology, in the catchments of a selected region in Austria. Independent flood events have been isolated and assigned to one of three types of flood processes: synoptic floods, flash floods and snowmelt floods. First, empirical copulas are regionally compared in order to verify whether any flood processes are discernible in terms of the corresponding bivariate flood peak-volume relationships. Next, the types of copulas that are frequently used in hydrology are fitted, and their goodness-of-fit is examined in a regional scope. The spatial similarity of copulas and their rejection rate, depending on the flood type, region and sample size, are also examined. The most remarkable difference is observed between flash floods and the other two types of flood. It is concluded that treating flood processes separately in such an analysis is beneficial, both hydrologically and statistically, since flood processes and the relationships associated with them are discernible both locally and regionally in the pilot region.
However, uncertainties inherent in copula-based bivariate frequency analysis itself (caused, among other things, by the relatively small sample sizes available for consistent copula model selection, upper tail dependence characterization and reliable predictions) may not be overcome within the scope of such a regional comparative analysis.
Bivariate distributions of flood peaks and flood event volumes may be needed for solving a range of practical problems, e.g., the design of retention basins and the identification of the extent and duration of flooding in flood hazard zones. For the statistical analysis of flood peaks and volumes, identical marginal distributions for both random variables were used in the past (e.g., Goel et al., 1998; Yue et al., 2002). Recently, the use of copula-based multivariate models has become widespread. These allow for separate studies of the marginal distributions of the component variables and of the dependence structure between them. Numerous studies have been published on this topic (e.g., Shiau, 2003; De Michele et al., 2005; Chowdhary et al., 2011; Requena et al., 2013), including recommendations on how to select appropriate copula models (e.g., Favre et al., 2004; Genest and Favre, 2007). Despite these studies, as mentioned by Chowdhary et al. (2011), the use of copula-based multivariate distributions for hydrological design still cannot be regarded as satisfactorily resolved. The selection of the types of bivariate distributions and the estimation of their parameters from observed peak-volume pairs are associated with far greater uncertainties than in the case of univariate distributions, since observed flood records of the required length are rarely available. This poses a problem for the reliable estimation of flood risks in bivariate design cases. It is being increasingly recognized that the problem cannot be approached from a purely statistical perspective alone. The crucial step for predictions in the multivariate modelling of flood characteristics by copulas is the choice of the copula which best fits the data (Favre et al., 2004). In this respect, Serinaldi and Kilsby (2013) highlighted the importance of studying the relationships between the processes that generate the design variables as well as the statistical techniques used to model them.
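The separation of marginal distributions and dependence structure that copulas allow can be stated compactly through Sklar's theorem. In the sketch below, the notation is assumed here for illustration: Q and V denote the flood peak and the flood volume, F_Q and F_V their marginal distribution functions, H their joint distribution function and C the copula.

```latex
H(q, v) = P(Q \le q,\; V \le v) = C\bigl(F_Q(q),\, F_V(v)\bigr),
\qquad (q, v) \in \mathbb{R}^2 .
```

For continuous margins the copula C is unique, so the marginal distributions and the copula can be selected in separate steps.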
In previous studies we have attempted to better understand the hydrological factors controlling flood hydrograph shapes, which also have implications for the dependence between flood peaks and volumes. In Gaál et al. (2012), we analyzed the ratio of both quantities (flood time-scales) based on the concept of comparative hydrology in a regional context in Austria. We compared catchments with contrasting characteristics in order to understand the roles of climate (through the type of precipitation generated), together with the attributes of the environment and flow generation processes (e.g., through antecedent soil moisture and soil characteristics), in a holistic way. In Gaál et al. (2015), our aim was to understand the causal hydrological factors controlling the strength of the relationship (quantified by Spearman's rank correlation coefficient) between flood peaks and volumes for the same dataset. These coefficients ranged from about 0.2 in high alpine catchments to about 0.8 in lowlands. The weak dependence in high alpine catchments was attributed to the characteristic mix of flood types in the region. The results also suggested that the factors controlling the strength of the dependence may be more related to the climate than to catchment characteristics. In Szolgay et al. (2015), we analyzed the formal suitability of various copula-based bivariate relationships between flood peaks and flood volumes, with a particular focus on two basic flood generating seasons (summer and winter floods), with the goal of going beyond statistics alone in the choice of the copula functions for engineering applications. It was concluded that, for rainfall-fed floods, three extreme value copulas performed best in the pilot region (the Galambos, Gumbel-Hougaard and Hüsler-Reiss copulas), followed by the normal copula. The other copulas were not regarded as regionally acceptable.
For winter floods the best performer was the Frank copula, followed by the normal and Plackett copulas and the three extreme value models. The Clayton and Joe copulas showed an unacceptable performance for both seasons. In Szolgay et al. (2015), we also illustrated the importance of considering the influence of the length of the data series through two simple simulation experiments. The preferences for the choice of copulas were still visible, but less evident. The results indicated that the acceptance of copula models can be conditioned on the flood types, but that the length of the series and possibly also the homogeneity of the flood types within a region may play an important role.
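The strength of the peak-volume dependence referred to above is quantified by Spearman's rank correlation coefficient. The following is a minimal sketch of that coefficient using only the Python standard library; the function names and the peak/volume values are hypothetical and serve only as an illustration, not as data from the study.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical flood peaks and volumes (illustration only).
peaks = [12.0, 35.5, 8.2, 20.1, 50.3, 15.7]
volumes = [3.1, 9.8, 4.0, 5.5, 12.4, 2.9]
rho = spearman_rho(peaks, volumes)
```

Because the coefficient depends only on ranks, it is unaffected by the choice of marginal distributions, which is why it is a natural companion to copula-based analyses.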
Here, these approaches are followed in a more differentiated regional look
at the selection of a hydrological process-oriented copula model for flood
peak/volume relationships. Specifically, we are interested in the
following questions:
(i) how similar are the flood peak/volume relationships of different flood types in
a climatically rather homogeneous region with geologically different subregions;
(ii) which factors may play a role in forming the similarity of copula models in
the subregions; and
(iii) can recommendations be formulated for engineering studies with respect
to the suitability of some copula types for the given flood processes?
Table 1. The basic catchment characteristics used in this study.
Fig. 1. Location and geology of the pilot region in Austria and the subregions considered.
There is a wide variety of flood-generation mechanisms across Austria (e.g., Merz and Blöschl, 2003), which result in complex flood peak-volume relationships (Gaál et al., 2012, 2015; Szolgay et al., 2015). In order to decrease the complexity of the runoff generation schemes in this analysis, we decided to reduce the climate variability by selecting a geographically limited area, i.e., the Northern Lowlands region of Austria (Fig. 1), as a pilot region. The area is dominated by lowlands and hilly sites, with elevations ranging from about 400 to 1500 m a.s.l. The region is generally under the influence of air masses from the Atlantic. The mean annual precipitation in the target region decreases from west to east. Orographic enhancement is not significant; the annual rainfall amounts (from about 500 to 1500 mm) are significantly lower than in the Austrian Alps. The basic climatic and physiographic characteristics of the catchments are listed in Table 1.
Floods in the Northern Lowlands region may occur during the whole year. The winter floods are usually induced by snowmelt and rain-on-snow processes, when antecedent snowmelt saturates the soils, and temperature increases and/or relatively low rainfall intensities may then cause significant floods. Flash floods are mostly caused by convective events or cold fronts; synoptic floods depend on the particular circulation pattern, but westerly circulation prevails.
The data set used in this paper builds on the Austrian flood data described
in detail in Szolgay et al. (2015) and the papers referenced therein. The 72
small and mid-sized catchments analyzed have areas in the range of 10.6 to
444.3 km²
Instead of the traditional engineering approach, which often deals with flood volumes associated with the annual maxima of flood events, the current analysis intends to include all the flood events in the region which can be hydrologically regarded as independent. In our understanding, two subsequent flood events are independent when they do not originate from the same synoptic situation. We assumed that, on average, a completely different atmospheric circulation situation occurs in Central Europe after a seven-day period (Gaál et al., 2015; Szolgay et al., 2015). The flood type classification according to the genesis of events in the region was introduced by Merz and Blöschl (2003) and modified in Gaál et al. (2015). As a further modification, the 25 697 flood events from the 1976–2007 period were classified as synoptic floods (originally long or short rain-induced floods in Merz and Blöschl, 2003), flash floods, and snowmelt floods (originally rain-on-snow floods or snowmelt floods).
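The seven-day independence criterion described above can be sketched as a simple filter on a series of flood events. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function name, the tuple layout, the rule that the larger peak is kept when two events fall within the same window, and the sample events are all hypothetical.

```python
from datetime import date


def select_independent_events(events, min_gap_days=7):
    """Keep events whose peak dates are at least `min_gap_days` apart.

    `events` is a list of (peak_date, peak_discharge) tuples. When two peaks
    are closer together than the gap, only the larger one is kept
    (an assumption made for this sketch).
    """
    selected = []
    for day, peak in sorted(events):
        if selected and (day - selected[-1][0]).days < min_gap_days:
            # Treated as the same synoptic situation: keep the larger peak.
            if peak > selected[-1][1]:
                selected[-1] = (day, peak)
        else:
            selected.append((day, peak))
    return selected


# Hypothetical events (illustration only).
events = [
    (date(2002, 8, 7), 40.0),
    (date(2002, 8, 12), 95.0),
    (date(2002, 8, 25), 30.0),
    (date(2002, 3, 20), 22.0),
]
independent = select_independent_events(events)
```

In this example the events of 7 and 12 August are only five days apart, so they are merged into a single event represented by the larger peak.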
In this paper we are interested in the similarity of empirical copulas of
flood peak-volume pairs, which we tested according to the approach of
Rémillard and Scaillet (2009). It is based on a Cramér-von Mises type
of distance measure:
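In a form consistent with Rémillard and Scaillet (2009), and with notation assumed here for illustration (C_n and D_m are the empirical copulas of the two samples being compared, of sizes n and m; R_i and S_i are the ranks of the peaks and volumes within a sample), the statistic can be written as follows. The rank scaling by n + 1 is one common convention and is an assumption of this sketch.

```latex
S_{n,m} = \left( \frac{1}{n} + \frac{1}{m} \right)^{-1}
          \int_{[0,1]^2} \bigl\{ C_n(u, v) - D_m(u, v) \bigr\}^2 \, \mathrm{d}u \, \mathrm{d}v ,
\qquad
C_n(u, v) = \frac{1}{n} \sum_{i=1}^{n}
            \mathbf{1}\!\left( \frac{R_i}{n+1} \le u,\; \frac{S_i}{n+1} \le v \right) .
```

Large values of the statistic speak against the hypothesis that the two samples share the same copula.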
As in Szolgay et al. (2015), nine frequently-used copulas were chosen to be
fitted locally, specifically from the Archimedean class (Clayton, Frank,
Gumbel-Hougaard and Joe copulas), the extreme-value class (Gumbel-Hougaard,
Galambos, Hüsler-Reiss), the elliptical class (normal, Student
As in Szolgay et al. (2015), the parameter
Table 2. Number of identified independent flood events, stratified
by regions and flood types.
The regional distribution of the flood types for each subregion is shown in Table 2. Locally, the share of synoptic floods ranges between 51 and 74 %, that of snowmelt floods between 8 and 46 %, and that of flash floods between 2 and 19 %. None of the selected catchments lacks a flood type. The relatively small number of snowmelt floods is due to the modest elevations in the region. The 6.7 % share of flash floods among all the events is not negligible, given that flash floods represented only 1.8 % of the annual maximum flood events in Gaál et al. (2015).
First, the empirical copulas for the different flood types were compared locally. In this analysis we were interested in whether different flood types in the same catchment were distinguishable in terms of their empirical flood peak-volume copulas. The analysis was carried out for each catchment separately, and the flood samples of the process types were compared pairwise, i.e., synoptic floods vs. flash floods, synoptic floods vs. snowmelt floods, and flash floods vs. snowmelt floods. The results, which are shown in Fig. 2, suggest that synoptic and snowmelt floods could belong to the same unknown copula more often than is the case for the other combinations. This suggests that the synoptic and snowmelt floods are more similar to each other (in terms of their empirical copulas) than the other process pairs; in other words, flash floods tend to be more dissimilar from both the synoptic and snowmelt floods. This could be partly related to the much stronger upper tail dependence of flash floods and their specific (similar) hydrograph shapes (Gaál et al., 2015). However, the relatively small number of events available for this type of analysis in general, and the differing number of events in the respective flood types in particular, may also explain why the analysis has not produced really conclusive results. These are objective factors which cannot be overcome in the framework of comparative hydrology when using only the data available in practice.
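The pairwise comparison of empirical copulas can be illustrated by computing a Cramér-von Mises type distance between two samples. The sketch below is not the test used in the study: it only approximates the distance on a regular grid and does not compute a p-value. All function names, the grid resolution and the sample values are hypothetical.

```python
def pseudo_observations(x, y):
    """Return rank-based pairs (rank / (n + 1)); assumes no ties."""
    n = len(x)

    def scaled_ranks(values):
        order = sorted(range(n), key=lambda i: values[i])
        ranks = [0.0] * n
        for r, i in enumerate(order, start=1):
            ranks[i] = r / (n + 1)
        return ranks

    return list(zip(scaled_ranks(x), scaled_ranks(y)))


def empirical_copula(pobs, u, v):
    """Share of pseudo-observations lying in [0, u] x [0, v]."""
    return sum(1 for a, b in pobs if a <= u and b <= v) / len(pobs)


def cvm_distance(sample_a, sample_b, grid=20):
    """Grid approximation of the scaled squared distance between two
    empirical copulas; each sample is a (peaks, volumes) pair of lists."""
    pa = pseudo_observations(*sample_a)
    pb = pseudo_observations(*sample_b)
    n, m = len(pa), len(pb)
    points = [(k + 0.5) / grid for k in range(grid)]  # cell midpoints
    mean_sq = sum(
        (empirical_copula(pa, u, v) - empirical_copula(pb, u, v)) ** 2
        for u in points for v in points
    ) / grid ** 2
    return mean_sq / (1.0 / n + 1.0 / m)


# Hypothetical (peaks, volumes) samples (illustration only):
# sample_a is perfectly increasing, sample_b perfectly decreasing in rank.
sample_a = ([12.0, 35.5, 8.2, 20.1, 50.3], [3.1, 9.8, 2.0, 5.5, 12.4])
sample_b = ([15.0, 40.2, 9.9, 27.3, 61.0], [6.0, 1.2, 7.5, 3.3, 0.8])
d_same = cvm_distance(sample_a, sample_a)
d_diff = cvm_distance(sample_a, sample_b)
```

A sample compared with itself gives a distance of zero, while samples with opposite rank orderings give a clearly positive value. Deciding whether a given distance is significant requires a resampling scheme, which is omitted here.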
Fig. 2. Results of the comparison of the empirical copulas
locally: percentage of the catchments where the equivalence of the pairs
of empirical copulas was rejected or could not be rejected at the given
level of significance (here
Next, the empirical copulas for each flood type were compared regionally; here we were interested in whether different catchments with the same flood type are distinguishable in terms of their empirical peak-volume copulas (Fig. 3). It can be seen that the empirical copulas of synoptic floods are the least similar between the catchments. This seems surprising; a closer look into the subregions showed that this phenomenon is more pronounced in the southwest than in the north. The high ratio of rejections for the synoptic events across different pairs of sites is therefore likely to be related to the more complex temporal rainfall structure, the mix of long and short rain processes in the dataset, and the more complex geology. These result in a lower degree of similarity between the different events and sites, which may become more evident when the sample size increases (as is the case for synoptic floods when compared to the other two types). A more detailed analysis of the phenomenon is beyond the goals of the present study. In the case of flash floods and snowmelt floods, the differences between the catchments are smaller; the analysis suggests that most catchment pairs, for a given flood type, are not distinguishable in terms of their empirical peak-volume copulas.
Next, we attempted a more detailed flood typology and regional
differentiation in fitting the copulas to the data than that in Szolgay et
al. (2015). The results of the goodness-of-fit test of the nine copula types at the 72
catchments, for all the floods merged into one set and for the three flood types
separately, are shown in Fig. 4, stratified by subregions. The copula types
(each column represents a copula type) are organized alphabetically, while
the subregions are visualized by different color bars, which indicate
Fig. 3. Results of the comparison of the empirical copulas
regionally: percentage of the cases where the equivalence of the
empirical copulas was rejected or could not be rejected at the given level
of significance (here
Fig. 4. Results of the goodness-of-fit test of the selected nine copula types within the three subregions (indicated by colors), for all the floods merged in a single data set (top left) and for the flood types treated separately. The bars indicate the percentage of the catchments where the given copula type was rejected.
When all the floods are analyzed together, the three
extreme value copulas (the Galambos, Gumbel-Hougaard and Hüsler-Reiss
copulas) and the normal copula performed best in all the subregions (except
in the southwest, where the extreme value models clearly outperform the
normal copula). In 15 % of the catchments all nine models were rejected; in general, the
acceptance rate oscillates around 50 %. This relatively
low rate (when compared to the process-wise analysis below) could be
attributed to the mix of flood types in the merged data sample, but the
effect of the relatively large sample size when compared to the other types
cannot be excluded either (a larger sample decreases the uncertainty, in some
cases to an extent where the models usually considered in practical applications are
not suitable at all). The acceptance rate of these four
copulas improved for the synoptic floods (although in 7 % of the catchments no
model was found suitable), but it was highly variable across the
subregions, which was not expected. This could be the result of the mix of
short and long rain processes (Merz and Blöschl, 2003) and a smaller
number of events in the subsamples; however, a detailed analysis is beyond
the goals of this study. Interestingly, the extreme value copulas exhibited a
larger rejection rate in the Traunviertel and Flysch subregion, and the
Student
These results indicate that the acceptance of a particular copula model can be conditioned on the processes, but that the size of the data samples and possibly also the homogeneity of the region with respect to the flood formation factors and flood types within the data set play a role. However, uncertainties inherent in the bivariate frequency analysis methodology itself in real-world applications (e.g., small samples for reliable model identification and upper tail dependence description) may not be overcome within the scope of a regional comparative analysis using real data, as conducted here.
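The acceptance or rejection of a copula model discussed above rests on a goodness-of-fit statistic. As a minimal sketch, the code below fits the Gumbel-Hougaard copula by inverting Kendall's tau and computes a Cramér-von Mises type distance between the empirical and the fitted copula. Both the estimation method and the statistic are assumptions made for this illustration and are not claimed to be those used in the study; the function names and data are hypothetical, and the resampling step needed to obtain a p-value is omitted.

```python
import math


def kendall_tau(x, y):
    """Kendall's tau by direct pair counting (assumes no ties)."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))


def gumbel_cdf(u, v, theta):
    """Gumbel-Hougaard copula C(u, v) for theta >= 1 and u, v in (0, 1)."""
    a = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(a ** (1.0 / theta)))


def scaled_ranks(values):
    """Pseudo-observations: rank / (n + 1); assumes no ties."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for r, i in enumerate(order, start=1):
        out[i] = r / (n + 1)
    return out


def gumbel_gof_statistic(x, y):
    """Return (theta, S), where S sums the squared differences between the
    empirical copula and the fitted Gumbel-Hougaard copula."""
    tau = kendall_tau(x, y)
    if not 0.0 <= tau < 1.0:
        raise ValueError("Gumbel-Hougaard requires 0 <= tau < 1")
    theta = 1.0 / (1.0 - tau)  # inversion of tau = 1 - 1/theta
    u, v = scaled_ranks(x), scaled_ranks(y)
    n = len(u)
    stat = 0.0
    for i in range(n):
        emp = sum(1 for k in range(n) if u[k] <= u[i] and v[k] <= v[i]) / n
        stat += (emp - gumbel_cdf(u[i], v[i], theta)) ** 2
    return theta, stat


# Hypothetical flood peaks and volumes (illustration only).
peaks = [12.0, 35.5, 8.2, 20.1, 50.3, 15.7]
volumes = [3.1, 9.8, 4.0, 5.5, 12.4, 2.9]
theta, stat = gumbel_gof_statistic(peaks, volumes)
```

With only a handful of events, as in this example, the statistic has very little power to discriminate between copula families, which mirrors the sample-size limitation noted in the text.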
Not much attention has been paid so far to directing the multivariate analysis of floods toward the selection of models for specific runoff generation processes. Here, this issue was addressed in a regional context by differentiating the flood types into three categories. Based on the results, it can be concluded that modeling the dependence structure by treating flood processes separately in a regional context may prove beneficial with respect to narrowing the choice of acceptable models, since the suitability patterns of acceptable copula types are distinguishably different for the subregions and flood types considered. This could help analysts to overcome some of the difficulties in the choice of the model caused by the inadequate length of a data series. On the other hand, it was also shown that a more detailed differentiation of the flood types and subregions introduces a greater degree of uncertainty into the selection of the model than was expected in Szolgay et al. (2015), which does not make the task easier for an analyst in practice. Despite that shortcoming, given that usually more than one statistically suitable dependence model exists, a regional analysis and an analysis of the uncertainty of the design values resulting from the choice of a model can be recommended in engineering studies, especially for important water resources projects.
Our results support Favre et al. (2004) and Serinaldi and Kilsby (2013), who emphasized that further work is needed to choose the best copulas capable of reproducing the dependence structure of multivariate hydrological variables. However, as shown in the comparative hydrology framework above, the choice of the copula model that best fits the observed data and is regionally acceptable in terms of flood typology is not a trivial issue, even if more than statistical aspects are taken into consideration, since the lack of sufficient data makes the analysis difficult (if not impossible). As a general recommendation resulting from this study, it is advisable to select models from the extreme value class of copulas (in the given region).
Note that even the adoption of generally accepted and widely used copula models may not lead to a successful bivariate fitting. Uncertainties inherent in copula-based bivariate frequency analysis itself (caused, among other things, by the relatively small sample sizes available for consistent copula model selection, upper tail dependence characterization and reliable predictions) may not be overcome within the scope of such a regional comparative analysis.
Based on this comparative study and the results of other more advanced studies (e.g., Serinaldi, 2013, 2015), it can be concluded that, if reliable predictions are required for an important engineering application, the benefits of regional bivariate frequency analysis methods could be further explored (e.g., Ben Aissia et al., 2015), or the potential of combining rainfall generators, rainfall-runoff models, the analysis of historical floods and advanced statistics that consider uncertainty might be utilized.
We would like to thank the Austrian Academy of Sciences (International Strategy for Disaster Reduction Programme, IWHRE2008, 2008–2013), for its financial support. This research was also supported by the Slovak Research and Development Agency under Contract no. APVV 0496-10, by the Slovak Grant Agency VEGA under Project no. 1/0776/13 and by the IMPALA project (FP7-PEOPLE-2011-IEF-301953) of the Marie Curie Intra European Fellowship. This publication was further supported by Competence Center for SMART Technologies for Electronics and Informatics Systems and Services, ITMS 26240220072, funded by the Research & Development Operational Programme from the ERDF.
The authors would also like to thank Francesco Serinaldi for a very inspiring open discussion on the topic during a review of their other paper.