Fitting sediment rating curves using regression analysis: a case study of Russian Arctic rivers

Published suspended sediment data for Arctic rivers are scarce. Suspended sediment rating curves for three medium to large rivers of the Russian Arctic were obtained using various curve-fitting techniques. Owing to the biased sampling strategy, the raw datasets do not exhibit a log-normal distribution, which restricts the applicability of a log-transformed linear fit. Non-linear (power) model coefficients were estimated using the Levenberg-Marquardt, Nelder-Mead and Hooke-Jeeves algorithms, all of which generally showed close agreement. A non-linear power model employing the Levenberg-Marquardt parameter evaluation algorithm was identified as an optimal statistical solution to the problem. Long-term annual suspended sediment loads estimated using the non-linear power model are, in general, consistent with previously published results.


INTRODUCTION
The sediment rating curve is widely employed as an empirical technique for relating suspended sediment concentration C (g m⁻³) to water discharge Q (m³ s⁻¹) (Colby, 1956). As such, it introduces a causative linkage between two variables, one of which (Q) is treated as an independent predictor (Glysson, 1987). Many previous publications on rating curves express this relationship using a power function of the form $C = aQ^{b}$, where a and b are the rating coefficient and exponent, respectively (Syvitski et al., 2000).
Alternative formulations of the sediment rating curve equation include the use of a power function with a constant (Asselman, 2000) and a simple linear fit (Mount & Abrahart, 2011). Each of these alternatives has its benefits and limitations. One potential use of the sediment rating curve, as argued by Fenn et al. (1985), is for exploring the internal features of coincident sediment and discharge datasets rather than for obtaining a plausible predictive model. Sediment rating curve equations, nonetheless, are used widely in producing suspended load estimates for periods when only water discharge data are available. Fitting a rating curve in this context is therefore a regression problem, and the plausibility of the solution relies on the performance of the statistical methods involved (Cox et al., 2008). Implicit heterogeneity in the observational datasets poses a challenge for effective rating curve fitting. Depending on the time frame of monitoring, various intra-annual variations in sediment delivery and transport processes can be present in the datasets, including hysteresis and seasonality (Asselman, 2000). Fitting procedures based on discharge classes (Jansson, 1996), seasonal rating curve equations (Khanchoul & Jansson, 2008) and dataset separation by rising and falling limb stages of the hydrograph (Aquino et al., 2009) enhance the performance of rating curves in these circumstances, if such distinctions can be clearly drawn on the basis of the data collected.
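The use of a rating curve to estimate loads for periods with discharge records only can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the coefficients a and b, the discharge values and the function names are hypothetical, and the discharges are assumed to be daily means.

```python
def rating_curve(q, a, b):
    """Concentration C (g m^-3) predicted from discharge Q (m^3 s^-1) as C = a * Q**b."""
    return a * q ** b


def daily_load_tonnes(q, a, b):
    """Suspended sediment load (tonnes per day) for a daily mean discharge q."""
    c = rating_curve(q, a, b)       # g m^-3
    flux = c * q                    # g s^-1
    return flux * 86400.0 / 1.0e6   # 86400 s per day, 1e6 g per tonne


# Hypothetical coefficients and daily mean discharges (illustration only).
a_example, b_example = 0.01, 1.0
discharges = [1000.0, 2000.0, 4000.0]

daily_loads = [daily_load_tonnes(q, a_example, b_example) for q in discharges]
total_load = sum(daily_loads)
print(daily_loads, total_load)
```

Summing daily loads in this way assumes that a single pair of coefficients holds for the whole period; seasonal or stage-dependent curves would need separate coefficients for each subset.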
The sediment rating curve approach, sensu stricto, is only applicable to the analysis of measured values of discharge and concentration, which are frequently treated as daily averages. When continuous sampling at short intervals is performed, time series analysis or artificial neural networks with consideration of autocorrelation may produce the best result (Mount & Abrahart, 2011).
The quality of the rating curve model depends as much on the fitting method as it does on the datasets collected and on the nature of the key geomorphic processes responsible for sediment delivery to the study streams. The principal assumptions of the rating curve concept hold true when runoff is the major driver of sediment source activation, e.g. during snowmelt and rainstorm events. Arctic watersheds underlain by permafrost provide an example of an environment where geomorphic activity is frequently driven by heat (cryogenic processes). The purpose of the study was therefore to test the applicability of various sediment rating curve fitting techniques to the observational datasets for three medium to large rivers in the Russian Arctic. We assessed the efficiency of the resulting models and their general applicability to datasets where the observations are scarce and reflect the joint action of fluvial and cryogenic processes in generating suspended sediment fluxes.

Gauging sites
This study employed datasets for discharge and suspended sediment concentration from three gauging stations on medium to large rivers of the Russian Arctic: the Anabar River at Saskylakh, the Lena River at Tabaga and the Indigirka River at Vorontsovo (Table 1, Fig. 1).

Datasets
Table 2 summarizes the datasets used for the study sites.

METHODS
Notwithstanding the opinion that the fitting and use of sediment rating curves is well-documented and standardized (Mount & Abrahart, 2011), there are still ongoing debates on the appropriateness and accuracy of various curve fitting procedures.
This study was not designed to test all existing fitting techniques, but rather to examine the most common procedures: (a) linear regression on untransformed values (linear fit); (b) linear regression on log-transformed values (power fit); (c) non-linear regression (power fit). The log-transformed power model requires bias correction to account for the inequality of the means of the initial and log-transformed data. A bias correction factor CF was applied to the log-linear power models, as described by Ferguson (1986); it is calculated from the observed and predicted values, Ci and Ĉi, respectively, and the number of observations n.
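Procedure (b) with bias correction can be sketched as follows. The sketch assumes the commonly quoted form of the Ferguson (1986) correction for base-10 logarithms, CF = exp(2.651 s²), with s² taken as the residual variance of the log-log regression on n − 2 degrees of freedom; this form, the function name and the sample values are assumptions for illustration and are not taken from the study.

```python
import math


def fit_loglinear(q, c):
    """Ordinary least squares of log10(C) on log10(Q).

    Returns (a, b, cf) for the model C = cf * a * Q**b.
    """
    x = [math.log10(v) for v in q]
    y = [math.log10(v) for v in c]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    log_a = mean_y - b * mean_x
    # Residual variance in log10 units (two fitted parameters).
    s2 = sum((yi - (log_a + b * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    # Assumed form of the correction factor for base-10 logarithms.
    cf = math.exp(2.651 * s2)
    return 10.0 ** log_a, b, cf


# Hypothetical observations (illustration only).
q_obs = [120.0, 250.0, 480.0, 900.0, 1700.0]
c_obs = [8.0, 21.0, 40.0, 95.0, 160.0]

a_fit, b_fit, cf_fit = fit_loglinear(q_obs, c_obs)
c_corrected = [cf_fit * a_fit * v ** b_fit for v in q_obs]
print(a_fit, b_fit, cf_fit)
```

Because s² cannot be negative, the factor is never below 1: the back-transformed curve is only ever scaled upwards, which compensates for the underestimation that results from averaging in log space.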
Non-linear regression fitting is regarded as an optimization problem, so the potential solutions can be numerous, depending on the algorithm chosen to optimize the loss function. In this study, the performance of the Levenberg-Marquardt, the Simplex (Nelder-Mead) and the Hooke-Jeeves algorithms was tested and compared. The first algorithm was developed for non-linear least squares problems, while the latter two were designed for wider applications in non-linear optimization.
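To make the optimization view concrete, the following is a minimal two-parameter Levenberg-Marquardt iteration for the power model C = aQ^b, minimizing the sum of squared residuals on untransformed values. It is a didactic sketch rather than the software used in the study; the function names, starting values, damping schedule and sample data are assumptions.

```python
import math


def sse(q, c, a, b):
    """Sum of squared residuals of the power model C = a * Q**b."""
    return sum((ci - a * qi ** b) ** 2 for qi, ci in zip(q, c))


def fit_power_lm(q, c, a0, b0, max_iter=200, tol=1e-12):
    """Illustrative Levenberg-Marquardt fit of C = a * Q**b; returns (a, b)."""
    a, b, lam = a0, b0, 1e-3
    best = sse(q, c, a, b)
    for _ in range(max_iter):
        # Accumulate J^T J and J^T r for the parameters (a, b).
        saa = sab = sbb = ga = gb = 0.0
        for qi, ci in zip(q, c):
            qb = qi ** b
            ja = qb                      # dC/da
            jb = a * qb * math.log(qi)   # dC/db
            r = ci - a * qb
            saa += ja * ja
            sab += ja * jb
            sbb += jb * jb
            ga += ja * r
            gb += jb * r
        # Damped normal equations with Marquardt diagonal scaling.
        m11 = saa * (1.0 + lam)
        m22 = sbb * (1.0 + lam)
        det = m11 * m22 - sab * sab
        if det == 0.0:
            break
        da = (ga * m22 - gb * sab) / det
        db = (gb * m11 - ga * sab) / det
        try:
            trial = sse(q, c, a + da, b + db)
        except OverflowError:
            trial = math.inf
        if trial < best:
            # Accept the step and relax the damping.
            a, b, best = a + da, b + db, trial
            lam = max(lam / 10.0, 1e-12)
            if abs(da) + abs(db) < tol:
                break
        else:
            # Reject the step and increase the damping.
            lam *= 10.0
            if lam > 1e12:
                break
    return a, b


# Hypothetical observations and starting values (illustration only).
q_obs = [120.0, 250.0, 480.0, 900.0, 1700.0]
c_obs = [8.0, 21.0, 40.0, 95.0, 160.0]
a_lm, b_lm = fit_power_lm(q_obs, c_obs, 0.1, 1.0)
print(a_lm, b_lm, sse(q_obs, c_obs, a_lm, b_lm))
```

A step is accepted only when it lowers the sum of squares, so the result is never worse than the starting point. In practice the coefficients of a log-linear fit are a convenient starting guess.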

Initial data inspection
Selection of the most accurate fitting technique, as well as assessment of the applicability of any particular regression model, starts with data inspection, which is best performed graphically (Fig. 2). Only the sediment concentration data for the Lena River pass the Kolmogorov-Smirnov test for log-normality. In general, the suspended sediment distributions are concentrated towards the low (left) end of the range; the lower left parts of the scatter plots are overpopulated, though the degree of scatter there remains low (Fig. 3).
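A check for log-normality of this kind can be sketched as a one-sample Kolmogorov-Smirnov test applied to the logarithms of the data. The sketch below is an assumption-laden illustration, not the test configuration used in the study: the normal parameters are estimated from the sample itself, and the asymptotic 5% critical value 1.36/√n is used, which is only approximate in that case. The function name and the two synthetic samples are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def ks_lognormal(values, coeff=1.36):
    """Return (D, passed) for a Kolmogorov-Smirnov check of log-normality.

    D is the largest distance between the empirical distribution of
    ln(values) and a normal distribution fitted to them.
    """
    logs = sorted(math.log(v) for v in values)
    n = len(logs)
    fitted = NormalDist(mean(logs), stdev(logs))
    d = 0.0
    for i, x in enumerate(logs):
        f = fitted.cdf(x)
        # Compare with the empirical step function just before and at x.
        d = max(d, f - i / n, (i + 1) / n - f)
    return d, d <= coeff / math.sqrt(n)


# Synthetic samples (illustration only): one close to log-normal,
# one strongly bimodal.
near_lognormal = [math.exp(NormalDist().inv_cdf((i + 0.5) / 50)) for i in range(50)]
bimodal = [1.0] * 25 + [1000.0] * 25

print(ks_lognormal(near_lognormal))
print(ks_lognormal(bimodal))
```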
Both histograms (Fig. 2) and scatter plots (Fig. 3) suggested that the overall quality of the rating curves could be relatively poor. Threshold behaviour is characteristic of both the Lena and Indigirka River datasets, as the scatter increases significantly when discharges of 15 000 m³ s⁻¹ and 25 000 m³ s⁻¹, respectively, are exceeded. For the Lena River this threshold value corresponds well with the effective discharge estimate of 16 000 m³ s⁻¹, responsible for intense bank erosion (Tananaev, 2013). Above this threshold, the variability in suspended sediment concentration is ascribed to the introduction of significant amounts of wash load, originating from both the surrounding river basin and the eroded channel bank material.

Table 3
Sediment rating curve equations for the Russian Arctic study rivers. Columns: Anabar at Saskylakh, Lena at Tabaga, Indigirka at Vorontsovo; first row: linear fit.

In general, the suspended sediment load estimates derived herein (Table 5) are broadly consistent with previously published results. For the Anabar River at Saskylakh, our estimate is close to the 0.4 Mt estimate of Gordeev et al. (1996) and Holmes et al. (2002). For the Lena River at Tabaga, our estimate exceeds the value of 7.7 Mt reported by Hasholt et al. (2005), and for the Indigirka River at Vorontsovo, our estimate is lower than the estimates of 12.9 Mt (Gordeev et al., 1996), 12.0 Mt (Hasholt et al., 2005) and 11.1 Mt (Holmes et al., 2002) reported previously.

CONCLUSION
Based on the visual inspection of the suspended sediment rating curves and the Nash-Sutcliffe criterion, a non-linear power model employing the Levenberg-Marquardt parameter evaluation algorithm was identified as an optimal statistical solution to the problem. Long-term annual suspended sediment loads for the study rivers estimated using the non-linear power model are, in general, consistent with those reported previously.
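For reference, the Nash-Sutcliffe criterion mentioned above compares the squared model errors with the variance of the observations about their mean. A minimal sketch of its standard definition follows; the function name and the sample values are hypothetical and are not results from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((O - S)^2) / sum((O - mean(O))^2)."""
    mean_obs = sum(observed) / len(observed)
    error_ss = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    total_ss = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error_ss / total_ss


# Hypothetical observed and modelled concentrations (illustration only).
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [12.0, 18.0, 33.0, 39.0]
efficiency = nash_sutcliffe(observed, simulated)
print(efficiency)
```

A value of 1 indicates a perfect match, while a value of 0 means the model performs no better than predicting the mean of the observations.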

Fig. 2
Frequency distributions of water discharge and suspended sediment concentration for the Anabar (a, d), Lena (b, e) and Indigirka (c, f) rivers, respectively.

Fig. 3
Scatter plots of suspended sediment concentration versus water discharge for the Anabar River at Saskylakh (a), the Lena River at Tabaga (b), and the Indigirka River at Vorontsovo (c).

Table 1
Summary information for the study watersheds and gauging stations.

Table 2
Summary statistics for the study datasets. Columns give the length of the dataset; n, number of observations; QT, mean annual discharge for the years included in the dataset; Qd and SSCd, mean discharge and suspended sediment concentration for the dataset, respectively.

Table 5
Long-term seasonal and annual suspended sediment loads of the Russian Arctic study rivers. Columns give the mean seasonal discharge; t, season duration; WR, seasonal suspended sediment load.