Bayesian inference of synthetic daily rating curves by coupling Chebyshev Polynomials and the GR4J model

In fluvial dynamics studies, there are instances where it becomes necessary to estimate the daily discharge of a river in locations where only one instantaneous level record is available per day. In such cases, there may be no rating curve


Introduction
Runoff time series give valuable information for water resources. In practice, hydrometric stations measure only river stage. Runoff time series are estimated through the rating curve (RC), which is the relationship between paired instantaneous stage and discharge measurements (gaugings). RCs are usually approximated with a power law (WMO, 2010), but alternative approaches are polynomial regression, splines and fuzzy regression (Fenton, 2018; Jalbert et al., 2011; McMahon and Peel, 2019). Despite the plethora of techniques, building the RC is still complex since it depends on the availability of gaugings, variable hydraulic conditions, and understanding the physical process governing the stage-discharge relationship (Le Coz, 2012).
gaugings are absent and extrapolation techniques are needed (Di Baldassarre and Claps, 2011; Lang et al., 2010; Reistad et al., 2007). Some example extrapolation techniques are conveyance slope, areal comparison of peak discharges, flood routing, step backwater and hydraulic modelling (WMO, 2010). All these methods are based on the availability of instantaneous discharges and most of them require additional field information.
The quality of the rating curve data and approximation defines the uncertainty of the observed discharge and affects the performance of the hydrological model as well as the optimization procedure (McMillan et al., 2010; Sellami et al., 2013). But what happens if we take advantage of this link to estimate the rating curve? Perhaps due to the complexity of the problem, only a few authors have investigated this. For example, Sikorska and Renard (2017) re-calibrate the rating curve using a Bayesian framework which couples the rating curve with the hydrological model outputs, and therefore consider structural and parametric uncertainties in the discharge prediction. Jian et al. (2017) use only water levels and a hydrological model to make discharge predictions. They calibrate the model using Spearman rank correlation and the inverse rating curve. Equifinality is crucial in these kinds of problems since the parameters of the hydrological model could compensate for the errors associated with the parametric uncertainty of the rating curve (Lima et al., 2019).
The aim of this work is to take advantage of the link between stream stage observations and hydrological model outputs to develop a framework to estimate daily discharge at sites with incomplete rating curves. The daily rating curve represents the relationship between an instantaneous water level within a day and the mean daily discharge. The concept is introduced to address the problem of scaling between instantaneous water level observations and discharge simulations. The size of the basin affects the sub-daily variability of discharge (Blöschl and Sivapalan, 1995) and hence the estimated mean daily discharge. The magnitude of estimated discharges is further influenced by the nonlinearity of the rating curve (Kiang et al., 2018). These factors add uncertainties to discharge computation. The working hypothesis is that parametric uncertainties from a daily rating curve model and a hydrological model can be quantified using Bayesian inference, resulting in an estimate of the daily rating curve based on sparse or limited data. For example, this approach could be used to estimate rating curves in a cross-catchment setting where only one instantaneous level record per day is available at some locations, such as records based on satellite altimetry of rivers (Kittel et al., 2021).

Study area and dataset
The study area is located in the headwater of the Lachlan River, which flows from the Great Dividing Range into Wyangala Dam (a tributary of the Murray-Darling river system, New South Wales, Australia). Average annual rainfall is around 1100 mm, with monthly precipitation generally uniform throughout the year. Annual evaporation varies from 900 to 1200 mm with strong seasonal behaviour between winter and summer. Groundwater is governed by fractured rocks and topography.
Figure 1 shows the location of the gauging sites. The stage-discharge relationships at the gauging sites are well known and have relatively dense data sets (Water Data Online, http://www.bom.gov.au/waterdata/, last access: 14 September 2023). The simulation period chosen is from 2008 to 2013 to maximize the data availability with no significant changes in the stage-discharge relationship. Generally, the scatter of percentage differences between instantaneous discharge and daily discharge increases as daily discharge increases (Fig. 2). These differences are normally distributed with a zero mean and become more homogeneous as basin size increases (Fig. 2). Climatological forcing using mean areal daily values of potential evapotranspiration and precipitation for each basin was generated using the gridded SILO - Australian climate database (Scientific Information for Land Owners, https://www.longpaddock.qld.gov.au/silo/, last access: 11 September 2023), which provides values at an approximate 5 km grid scale. Potential evapotranspiration is calculated by the FAO56 Penman-Monteith formula. Further information about the interpolation techniques used in SILO can be found in Jeffrey et al. (2001).
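The percentage differences summarized in Fig. 2 can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the difference is taken relative to the mean daily discharge (the exact definition used for Fig. 2 is not given in the text), the function name is hypothetical, and the 15 min series is synthetic rather than taken from the gauging records.

```python
import numpy as np

def percentage_difference(q_inst, q_daily_mean):
    """Percentage difference of instantaneous discharge relative to the daily mean.

    Assumed definition: 100 * (Q_inst - Q_daily) / Q_daily.
    """
    q_inst = np.asarray(q_inst, dtype=float)
    return 100.0 * (q_inst - q_daily_mean) / q_daily_mean

# Synthetic day of 15 min discharges (96 values): a small flood pulse on a base flow.
t = np.arange(96)
q_15min = 2.0 + 1.5 * np.exp(-0.5 * ((t - 40) / 8.0) ** 2)

q_mean = q_15min.mean()                      # mean daily discharge
diff = percentage_difference(q_15min, q_mean)

# A single daily stage reading corresponds to one of these 96 instants, so the
# spread of `diff` indicates the scaling error between instantaneous and daily values.
print(f"min {diff.min():.1f} %, max {diff.max():.1f} %")
```

By construction the differences average to zero over the day, so what matters for the daily rating curve is their spread, which in this toy case grows with the size of the sub-daily pulse.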

Method
The daily rating curve is the relationship between mean daily discharge (Q) and an arbitrary instantaneous water level within the day (H). It differs from the classical definition of the RC since the latter works with paired instantaneous values of stages and discharges. We propose a daily rating curve model coupled with a rainfall-runoff model to obtain flow estimates at sites where only one instantaneous water level value is available per day. This work uses a cross-catchment verification where a single rainfall-runoff model is fitted across the four sites (Fig. 1), and where one of the sites, called the test site, is assumed to have only instantaneous water level records. Next, the daily discharge at the test site is derived from the hydrological model to estimate the daily rating curve model. At the same time, the coupled models are optimized by Bayesian inference.

Rainfall-Runoff Model
The rainfall-runoff model simulates discharges at the 4 sites. It is based on the GR4J model (Perrin et al., 2003). GR4J is used since it is a simple lumped rainfall-runoff model with 4 parameters (X1: the production storage capacity; X2: the groundwater exchange coefficient; X3: one day ahead maximum capacity of the routing storage; and X4: the time base of the unit hydrograph). Furthermore, the GR4J model is available in several programming languages and packages like Fortran or R with a very low computational cost (Andrews et al., 2011; Coron et al., 2017). The model requires the daily time series of precipitation and potential evapotranspiration as input variables. The implementation of the GR4J uses the same set of parameters for all sites, and the model predicting different gauging sites only differs in the climatological forcing. This approach considers that the characteristics of basins are similar, and the transfer of parameters across the different basins can be performed with a minor loss of prediction skill, which is, of course, a strong assumption that we aim to handle with the Bayesian inference.
A general formulation of the rainfall-runoff model follows Eq. (1):
\[ Q_u = \mathrm{GR4J}(\phi_u, X_1, X_2, X_3, X_4) \qquad (1) \]
where Q is the discharge (mm d⁻¹), the subscript u is the gauging site, GR4J() is the rainfall-runoff model as a function of the climatological forcing denoted by φ (precipitation and potential evapotranspiration in mm d⁻¹), and X1, X2, X3, X4 are the GR4J parameters.

Daily rating curve model
The daily rating curve model is built from instantaneous observed water levels and daily simulated discharges using GR4J. This preliminary approach assumes that the stage-discharge relationship could have up to one possible changing point caused by a change in the hydraulic controls or flows above the bankfull stage. This assumption is represented using 3rd order Chebyshev polynomials (Eq. 2). Compared with other models, Chebyshev polynomials overcome some problems of the automatic generation of rating curves and are a computationally efficient alternative (Fenton, 2018) with a reduced number of parameters.
https://doi.org/10.5194/piahs-385-399-2024 Proc. IAHS, 385, 399-406, 2024
In Eq. (2), u = test denotes the instantaneous water level test site, X5, X6, X7 are the 3rd order Chebyshev coefficients and h* is a transformation which rescales the stage to the range from −1 to 1 (McMahon and Peel, 2019).
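The rescaling and polynomial evaluation described above can be sketched in Python as follows. This is not the authors' implementation (the paper works in R) and several details are assumptions made for illustration: the linear rescaling between an assumed minimum and maximum stage, the mapping of the three coefficients X5, X6, X7 onto the Chebyshev terms T0, T1, T2, the function names, and the coefficient values.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def rescale_stage(h, h_min, h_max):
    """Map stage h linearly from [h_min, h_max] onto [-1, 1] (assumed transformation)."""
    h = np.asarray(h, dtype=float)
    return 2.0 * (h - h_min) / (h_max - h_min) - 1.0

def daily_rating_curve(h, coeffs, h_min, h_max):
    """Evaluate coeffs[0]*T0(h*) + coeffs[1]*T1(h*) + coeffs[2]*T2(h*).

    coeffs plays the role of (X5, X6, X7); the term indexing is an assumption.
    """
    h_star = rescale_stage(h, h_min, h_max)
    return C.chebval(h_star, coeffs)

# Hypothetical coefficients and stage range (m), chosen only so that the
# curve increases monotonically with stage over the rescaled interval.
x5, x6, x7 = 3.0, 4.0, 1.0
stages = np.array([0.0, 1.0, 2.0])
q_daily = daily_rating_curve(stages, [x5, x6, x7], h_min=0.0, h_max=2.0)
print(q_daily)
```

With these hypothetical coefficients the curve returns 0, 2 and 8 at the minimum, middle and maximum stage. Working on the rescaled stage keeps the Chebyshev terms bounded between −1 and 1, which is one reason the basis is numerically convenient for automatic fitting.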

Bayesian Inference
Bayesian inference is often used for parameter optimization and uncertainty estimation. The method estimates the probability density function of the parameters of the model (known as a posterior distribution) by using a likelihood function, prior distributions of parameters and Bayes' theorem.
In this preliminary implementation, the likelihood function assumes that residual errors are Gaussian, homoscedastic and independent:
\[ L = \prod_{t=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{\varepsilon_t^2}{2\sigma^2}\right) \]
where L is the likelihood function, σ² the variance of residuals, n the number of daily residuals, and ε = Q_obs − Q_sim the model residuals. On the fully gauged sites, Q_obs is given by the daily observed discharges downloaded from the Bureau of Meteorology of Australia. At the water level test site, Q_obs is imposed by the daily rating curve, Eq. (2). The Bayesian inference of the coupled models uses the Delayed Rejection Adaptive Metropolis algorithm (DRAM). This technique finds an ensemble of parameter values that represent parameter distributions and uncertainties. The implementation uses the package FME in the R environment (Soetaert and Petzoldt, 2010). Prior distributions have been defined by Gaussian probability distributions for parameters X1 and X2, and non-informative uniform distributions for parameters X3 to X7 as well as σ.
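To make the inference step more concrete, the following Python sketch combines a Gaussian, homoscedastic, independent log-likelihood with a sampler. It is a deliberately simplified stand-in: the sampler is a plain random-walk Metropolis algorithm, not DRAM (it has neither delayed rejection nor covariance adaptation), and it does not use the FME package. The toy model (simulated discharge proportional to a forcing), the data, the prior bounds, the step sizes and all function names are hypothetical, and sigma is treated here as the standard deviation of the residuals.

```python
import numpy as np

def gaussian_log_likelihood(q_obs, q_sim, sigma):
    """Log-likelihood for Gaussian, homoscedastic, independent residuals."""
    eps = np.asarray(q_obs, dtype=float) - np.asarray(q_sim, dtype=float)
    n = eps.size
    return -0.5 * n * np.log(2.0 * np.pi * sigma**2) - 0.5 * np.sum(eps**2) / sigma**2

def metropolis(log_post, theta0, step, n_iter, rng):
    """Plain random-walk Metropolis sampler (no delayed rejection, no adaptation)."""
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    chain = np.empty((n_iter, theta.size))
    for i in range(n_iter):
        proposal = theta + step * rng.standard_normal(theta.size)
        lp_new = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < lp_new - lp:
            theta, lp = proposal, lp_new
        chain[i] = theta
    return chain

# Toy problem: q_sim = a * forcing, with unknown a and sigma.
rng = np.random.default_rng(0)
forcing = np.linspace(1.0, 10.0, 200)
q_obs = 0.6 * forcing + rng.normal(0.0, 0.5, forcing.size)

def log_post(theta):
    a, sigma = theta
    # Uniform (non-informative) priors on hypothetical bounds.
    if not (0.0 < a < 5.0 and 0.0 < sigma < 10.0):
        return -np.inf
    return gaussian_log_likelihood(q_obs, a * forcing, sigma)

chain = metropolis(log_post, theta0=[1.0, 1.0],
                   step=np.array([0.01, 0.03]), n_iter=20000, rng=rng)
posterior = chain[5000:]  # discard burn-in
print(posterior.mean(axis=0))
```

In the coupled setting described in the text, the log-posterior would sum such likelihood terms over the gauged sites and the test site, with the daily rating curve supplying the discharge that is compared against the simulation at the test site.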

Application of the Bayesian Inference
The prior and posterior distributions of parameters resulting from applying the DRAM algorithm across the instantaneous water level test sites are shown in Fig. 3. Posterior distributions of the parameters differ between the test cases, where each test case represents a different gauging station with only daily instantaneous values. Differences might be caused by the assumption that all basins are parameterized with the same set of parameters, ignoring catchment differences. Depending on which catchment is the test case, this would change the parameter distributions. A clear demonstration of this effect is shown by the time base of the unit hydrograph (X4, Fig. 3d). The parameter X4 is related to the size, shape and slope of the basin. Posterior distributions of X4 are similar for test cases which have similar basin characteristics, such as Abercrombie and Reids Flat. However, not too much physical interpretation should be given to the parameters of GR4J since they cannot always be related to the physical characteristics of the basins (Narbondo et al., 2020). In contrast, the daily rating curve model shows no similarities between parameters across gauging sites.
Despite GR4J being a parsimonious hydrological model, its X1, X2, X3 parameters were correlated (Fig. 4). This has been highlighted before for GR4J (Yang et al., 2018; Arsenault and Brissette, 2014; Qi et al., 2020). The rating curve model also showed a high correlation between the parameters (Fig. 4), indicating a possible overparameterization that could be due to the choice of the degree of the Chebyshev polynomials. Here 3rd degree polynomials were included to allow for changes in the stage-discharge relationship. However, there is no interaction between the parameters of the rating curve and the parameters of the hydrological model (Fig. 4). This could be due to the effect introduced by the coupled scheme, which incorporates discharge observations from neighbouring basins, reducing the problem of equifinality in the coupling of the rating curve and hydrologic model, compared to similar work (Lima et al., 2019).
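A correlation check of the kind summarized in Fig. 4 can be sketched in Python from any matrix of posterior samples. The samples below are synthetic and only mimic the qualitative pattern described above (two correlated parameters and one independent parameter); the parameter labels, the threshold and the helper function are hypothetical.

```python
import numpy as np

def correlated_pairs(samples, names, threshold=0.7):
    """Return (name_i, name_j, r) for parameter pairs with |r| above the threshold."""
    corr = np.corrcoef(samples, rowvar=False)  # columns are parameters
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return corr, pairs

# Synthetic posterior sample: the first two columns are correlated (r about 0.9),
# the third is independent of both.
rng = np.random.default_rng(1)
z = rng.standard_normal((4000, 3))
samples = np.column_stack([
    z[:, 0],
    0.9 * z[:, 0] + np.sqrt(1.0 - 0.9**2) * z[:, 1],
    z[:, 2],
])

corr, pairs = correlated_pairs(samples, ["A", "B", "C"])
print(pairs)
```

Strong correlation inside one sub-model together with near-zero correlation across sub-models is the pattern the text interprets as limited interaction between the rating curve and the hydrological model parameters.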

Daily rating curve estimation, potential applications and limitations
Two simulations of the daily discharge have been generated for the sites where only daily water level data are available. The first simulation is generated by the rainfall-runoff model, and the second by the daily rating curve model. Figure 5 shows the residuals of both simulations as a function of observed discharges. Both models overestimate the low flows, with the GR4J estimates being less biased. Additionally, for medium to high flows, the overestimation decreases and the scatter of residuals of the daily rating curve is lower than for GR4J. Daily rating curves tend to overestimate the flow at low stages and underestimate it at high stages (Fig. 6). Overall, change points in the estimated rating curves are relatively smooth, which would point to the potential overparameterization discussed in Sect. 4.1. An interesting finding of this work is that the upper part of the daily rating curve has similarities with the instantaneous rating curve. In contrast, the lower part of the daily rating curve has non-negligible errors that could come from the simplistic structure of the rainfall-runoff model (Flores et al., 2021). They could also be related to the structure of the differences between instantaneous discharge and daily discharge (Fig. 2). It is noted that the best fit is obtained for Reids Flat, which is the larger basin with the lowest sub-daily discharge variability.
The daily rating curve is a new concept that aims to establish a relationship between the stage and discharges at a different time scale. This addresses the problem of scaling between instantaneous water level observations and discharge simulations of a hydrological model, which delivers mean discharges over a time step (a time aggregation that is often forgotten). Although Fig. 6 compares the daily rating curve with actual instantaneous gaugings, it is important to remember that the daily approach has a different purpose and potential use that should not be confused with the classic instantaneous rating curve. We believe that the main potential of this approach could be data assimilation at sites with only water level records, such as from satellite altimetry. The presented approach also has some limitations which require further research. Temporal changes in the river geometry could be one of the most important constraints (Bhandari et al., 2023; Morlot et al., 2014); for this reason, the study period was limited to a period of time without significant changes in the stage-discharge relationship. Other problems, such as hysteresis or backwater, could occur in other study areas. In these cases, autoregressive models or more elegant formulations of the rating curve model could be included in the framework (Petersen-Øverleir, 2006).

Conclusions
This work introduces a framework to estimate daily rating curves at partly gauged sites. The concept of daily rating curves differs from the classical approach since it uses daily values of discharges rather than instantaneous values.
Bayesian optimization of the model parameters results in a significant overestimation of the flow at the low stages and moderate underestimation at the high stages. For medium stages, the daily rating curves do not differ greatly from the instantaneous rating curves. These results suggest that the daily rating curves have the potential to be used to estimate flows from stage levels under average conditions, which should be evaluated by extending the analysis to a broader set of river basins.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Special issue statement. This article is part of the special issue "IAHS2022 - Hydrological sciences in the Anthropocene: Variability and change across space, time, extremes, and interfaces". It is a result of the XIth Scientific Assembly of the International Association of Hydrological Sciences (IAHS 2022), Montpellier, France, 29 May-3 June 2022.

Figure 2. Percentage difference of reported 15 min discharge and mean daily discharge as a function of daily discharge (lines: Generalised Additive Models for Location Scale and Shape, continuous line: 50th percentile, dashed line: 25-75th percentiles, dotted line: 5-95th percentiles).

Figure 3. Prior and posterior density functions of parameters of the GR4J (a, b, c, d) and daily rating curve models (e, f, g) for the four hypothetical instantaneous water level sites.

Figure 4. Correlation matrix, scatter plots and posterior density function of parameters.

Figure 5. GR4J and Daily Rating Curve (DRC) residuals as a function of daily discharges at the water level site and probability density function of log discharges (top) and log residuals (right).