Improving the accuracy of a remotely-sensed flood warning system using a multi-objective pre-processing method for signal defects detection and elimination

Hossein Bonakdari; Amir Hossein Zaji; Keyvan Soltani; Bahram Gharabaghi

doi:10.5802/crgeos.4

Hydrology, environment - Original article

Hossein Bonakdari ¹ ; Amir Hossein Zaji ² ; Keyvan Soltani ² ; Bahram Gharabaghi ³

¹ Department of Soils and Agri-Food Engineering, Université Laval, Québec, G1V 0A6, Canada
² Department of Civil Engineering, Razi University, Kermanshah, Iran
³ School of Engineering, University of Guelph, Guelph, Ontario, N1G 2W1, Canada

Comptes Rendus. Géoscience, Volume 352 (2020) no. 1, pp. 73-86.

Abstract

One of the primary goals of watershed management is to proactively monitor and forecast flood water levels to provide early warning for timely evacuation plans and save lives. One of the most economical ways to accomplish this objective is to use remotely-sensed satellite signals. Previous studies have indicated that an Advanced Microwave Scanning Radiometer (AMSR) sensor can be used for river water level monitoring, combined with a few in situ hydrometric gauges for ground-truth data collection. However, space-based signals are influenced by many error-inducing natural factors, such as dust and cloud cover. Hence, a hybrid method is proposed, which comprises a multi-objective particle swarm optimization model, a decision tree classification algorithm, Hotelling’s $T^{2}$ outlier detection, and a regression model to identify and replace inaccurate space-based signals. This hybrid method is referred to in this study by the acronym OCOR. In the first phase of this hybrid method, the outlier signals are detected and eliminated from the dataset. In the second phase, the eliminated signals, along with signals lost due to satellite technical problems, are estimated by ground-truth data calibration using in situ hydrometric stations. The two case studies of the White and Willamette Rivers demonstrate the performance of OCOR in practical situations.

Metadata

Received: 2018-07-20
Revised: 2019-04-01
Accepted: 2020-03-02
Published: 2020-06-04

DOI: 10.5802/crgeos.4

Keywords: River discharge forecasting, Particle swarm optimization, Decision tree classification, Hotelling’s $T^{2}$ outlier detection, Satellite, Passive microwave signals

License:

CC-BY 4.0

Copyright: The authors retain their rights

Hossein Bonakdari; Amir Hossein Zaji; Keyvan Soltani; Bahram Gharabaghi. Improving the accuracy of a remotely-sensed flood warning system using a multi-objective pre-processing method for signal defects detection and elimination. Comptes Rendus. Géoscience, Volume 352 (2020) no. 1, pp. 73-86. doi : 10.5802/crgeos.4. https://comptes-rendus.academie-sciences.fr/geoscience/articles/10.5802/crgeos.4/

The full text below may contain some conversion errors compared with the official version of the published article.

1. Introduction

Although climate change has increased the severity and frequency of floods, and floods have killed more people than all other natural disasters combined [De Groeve and Riva 2009a], more studies are required to provide cost-effective early warning systems for timely evacuation plans in an operational context [Zaji et al. 2018]. Recent studies have successfully employed artificial intelligence models for solving nonlinear water resources problems [Akhbari et al. 2017, Bonakdari et al. 2020, Ebtehaj et al. 2018, 2019, Espinoza-Villar et al. 2018, Gholami et al. 2019, Sattar et al. 2019, Zaji and Bonakdari 2019].

He et al. [2014] compared the performance of three of the most popular data-driven techniques, i.e., Artificial Neural Networks (ANN), Adaptive Neuro-Fuzzy Inference Systems (ANFIS), and Support Vector Machines (SVM) in predicting river flow. The authors concluded that the SVM method performed better in simulating rivers in a semi-arid mountain region in China. Chen et al. [2015] applied a Hybrid Neural Network (HNN) based on fuzzy pattern recognition to simulate the Altamaha River. The authors compared three different optimization algorithms to determine the most appropriate HNN parameters and concluded that Differential Evolution (DE) and Ant Colony Optimization (ACO) outperformed the Artificial Bee Colony (ABC) algorithm.

Darras et al. [2017] employed feedforward and recurrent neural networks to forecast flash floods. The authors determined that the recurrent neural network performed better at longer lead times, whereas the feedforward neural network performed better in short-range flash flood forecasting. Yaseen et al. [2017] combined the firefly algorithm with an adaptive neuro-fuzzy computing technique to propose a novel model for estimating stream flow. Young et al. [2017] introduced a hybrid model formed of a physically-based hydrologic modeling system and SVM and stated that this model showed good performance in simulating rainfall-runoff during typhoon events.

Alizadeh et al. [2017] combined the wavelet transform with ANN in order to predict rainfall-runoff in the Tolt River. The results showed that this model made appropriate forecasts for one and two months ahead. Zeynoddin et al. [2018] proposed a novel hybrid model to forecast rainfall of a basin with a tropical climate in Malaysia. One of the primary goals of using satellite information is to monitor and forecast the floods in ungauged basins.

The problem with river flood forecasting is that all of these data-driven models use only in situ information. Because flood data are tied directly to in situ gauges, regions with no measuring stations cannot provide any information about the river. Moreover, the majority of rivers suffer from a lack of in situ stations.

In addition, the number of existing in situ stations in some countries is limited [Calmant and Seyler 2006, Khan et al. 2012, Shiklomanov et al. 2002, Sivapalan et al. 2003, Stokstad et al. 1999]. Hence, studying and forecasting floods is not possible in considerable regions of the Earth. Therefore, Predicting in Ungauged Basins (PUB) has recently emerged as a topic of very high interest in flood forecasting [Salvia et al. 2011]. Researchers in this field mainly use satellite information such as space-based precipitation measurements to simulate hydrological circumstances [Brakenridge et al. 2007, Jiang et al. 2014, Khan et al. 2011, Su et al. 2008, Temimi et al. 2011, 2007].

An accurate means of simulating river flow using satellite information is to utilize passive microwave sensors from the AMSR for Earth Observing System (AMSR-E) [Brakenridge et al. 2007]. Studies in this field began at the Dartmouth Flood Observatory (DFO), followed by the Joint Research Centre (JRC). AMSR-E sensors observe all of the Earth’s surface and detect wet river areas by calculating the differences in land and water brightness temperatures. The raw data are processed by the Global Flood Detection System (GFDS), which is accessible at http://www.gdacs.org/flooddetection.

To predict river water level time series using satellite data, Zaji et al. [2019a] introduced a new approach that minimizes both horizontal and vertical errors. Subsequently, Zaji et al. [2019b] developed a new Evolutionary Polynomial Regression-Time series predictor (EPR-T) and used it to predict river discharge using satellite information. Bonakdari et al. [2019] developed two extensions of the Markov Chain (MC) method, namely Online-Markov Chain (O-MC) and Extreme Online-Markov Chain (EO-MC), to enhance the forecasting performance of river discharge using satellite information.

The signals received by satellites are affected by a range of weather conditions [Kugler and De Groeve 2007]. For instance, severe weather alters the emission signals of both land and water. The impact of severe weather is more critical when the water-surrounded area is smaller, which is why space-based signal accuracy is poorer when river discharge is low [Khan et al. 2012]. So, in order to improve the performance and reliability of satellite data, Zaji et al. [2019a] proposed an approach based on trial and error, classification, and outlier detection methods. The authors concluded that the developed method could significantly increase the accuracy of satellite data.

The primary goal of the present study is to introduce a hybrid method that comprises a multi-objective particle swarm Optimization model, a decision tree Classification algorithm, Hotelling’s T² Outlier detection, and a Regression model (OCOR) for evaluating raw satellite signals to obtain accurate continuous signals. With this method, outlier signals received during severe weather or other unknown environmental situations are initially detected and eliminated. Then the removed signals and those missed due to technical satellite problems are predicted using a regression method.

Previous studies have mainly focused on simulating river discharge using space-based signals [Frasson et al. 2017]. However, when the goal is to use historical information of satellite sensors to forecast the future river discharge, a gap-free continuous space-based dataset is needed [Weigend 2018]. Thus, the second step entails evaluating a continuous dataset of space-based signals with no missed samples. In the training phase of the OCOR method, the in-situ measurements are needed in order to determine one of the objective functions of the multi-objective optimization algorithm and to develop the regression method to replace the detected noisy samples.

2. Space-based signals

AMSR-E and AMSR2 passive microwave sensors are employed in the present study to gather space-based signals [Brakenridge et al. 2007]. After that, the obtained signals are used to measure the river discharge indirectly. In order to obtain more stable observations, the descending orbit method was selected in this study [Kugler and De Groeve 2007, Tekeli and Fouli 2017], and the Earth’s surface was measured at least once a day.

Microwave measurements, which are very sensitive to water, are widely used in the soil moisture estimation field of science [Njoku et al. 2003, Schmugge et al. 1980, Srivastava et al. 2017, Theis et al. 1982, Ulaby et al. 1978, Wang et al. 1982]. By using the microwave signals, river discharge is estimated according to the differences between the thermal emission of dry and wet surfaces. The brightness temperature of the considered wet area is called Measurement (M), which is a proxy of river water and the received information is calibrated applying the surrounding dry area’s brightness temperature (C) to obtain the final AMSR space-based Signal (S) as follows:

S = \frac{C}{M} .

(1)

Land brightness temperature is higher than wet area temperature [Kugler and De Groeve 2007], such that S is low when river discharge is low and vice versa [Birkinshaw et al. 2010].
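As a minimal sketch of Equation (1), the ratio can be computed per observation as below. The function name and the brightness temperatures are hypothetical values chosen only for illustration; they are not taken from the GFDS dataset.

```python
def amsr_signal(c_brightness: float, m_brightness: float) -> float:
    """Return S = C / M for one observation (brightness temperatures in kelvin)."""
    if m_brightness <= 0:
        raise ValueError("M must be a positive brightness temperature")
    return c_brightness / m_brightness


# Hypothetical brightness temperatures (K), for illustration only.
dry_reference = 280.0    # C: surrounding dry land
low_flow_pixel = 276.0   # M: little water inside the measurement footprint
high_flow_pixel = 262.0  # M: a larger share of the footprint is wet

s_low = amsr_signal(dry_reference, low_flow_pixel)
s_high = amsr_signal(dry_reference, high_flow_pixel)
print(f"S at low flow: {s_low:.4f}, S at high flow: {s_high:.4f}")
```

Because water lowers the brightness temperature of the measurement area, a wetter footprint gives a smaller M and therefore a larger S.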

According to Bjerklie et al. [2004] and Brakenridge et al. [2012], in most rivers, the discharge has a better correlation with flow width than with flow velocity. Therefore, flow width, which is calculated using the satellite information, can be considered a robust proxy of river discharge.

3. Study area

In this work, two rivers are examined as case studies, namely the White River and the Willamette River. The properties of these two rivers are presented in Table 1. Both are located in the United States. The White and Willamette River sources are the Boston Mountains and the junction of the Middle Fork Willamette River with the Coast Fork Willamette River, respectively. Figure 1 presents a schematic overview of the in-situ stations and space-based measurement locations for the White and Willamette Rivers. In this figure, area M, area C, and the in-situ measurement station are denoted by blue, gray, and black, respectively.

Figure 1.
Schematic overview of the (a) White River and (b) Willamette River locations (Landsat satellite images in false color Composite (7,5,2)).

Table 1.
Properties of the White and Willamette Rivers.

River name	Site ID	Station ID	Space-based site coordinates, Lat./Long. (dd)	In situ station coordinates, Lat./Long. (dd)	Mean discharge (m³/s)
White, Newberry	524	03360500	38.91/−87.07	38.92/−87.011	12137
Willamette	925	14191000	45.18/−123.01	44.94/−123.042	18928

The in-situ station information was collected from the United States Geological Survey (USGS) (http://www.usgs.gov), and the space-based information was obtained from GFDS (http://www.gdacs.org). The time periods examined for the White and Willamette Rivers are from 2009 to 2011 and from 2002 to 2016, respectively.

4. Numerical models

The primary goal of the method proposed in the current study is to re-evaluate the in situ and space-based datasets by detecting and replacing their errors. As mentioned in the literature, many studies have addressed calibrating space-based information with in situ measurements. However, the poor performance of satellite sensors can, in some cases, lead to inaccurate signals. In addition, satellite or in situ gauges are not always accessible. Forecasting river discharge using space-based data nevertheless necessitates a dataset with no gaps, so the technique should replace eliminated and missed data with appropriate values.

The OCOR method is introduced in this study to achieve these goals more efficiently than existing methods. The OCOR method involves an outlier detection approach combined with a classification algorithm to detect outliers and inaccurate signals. OCOR subsequently employs a regression model to evaluate the eliminated and missed information. In addition, the model is tuned using a multi-objective optimization algorithm.

Outlier detection is the science of detecting abnormal samples by considering the values of the other samples in the dataset [Mitra et al. 2009]. An outlier detection model tries to identify the Global and Contextual outliers of space-based signals. Global outliers occur when a sample deviates significantly from the whole dataset, and Contextual outliers occur when a sample deviates significantly from nearby samples in that season [Han et al. 2011].

Hotelling’s T² outlier detection method [Hotelling 1992] is used in the present study. Hotelling’s T² has been applied as a quality checking algorithm in many studies [Mason and Young 2002, Shabbak et al. 2011, Sullivan and Woodall 1996]. In Hotelling’s T², the entire dataset is transferred to a hypersphere with radius 𝛼 by using rotation and normalization. Following the mentioned transformations, the samples that satisfy the condition T²⩽𝛼² are deemed normal, and the remaining samples are considered abnormal. Hence, 𝛼 is a strict criterion of this method.
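The thresholding rule T² ⩽ 𝛼² can be illustrated with a small sketch. It assumes the common form of the statistic, T² = (x − mean)ᵀ Σ⁻¹ (x − mean), computed per sample from the sample covariance matrix Σ, and it is restricted to two variables so that the matrix inverse can be written out by hand. The data points and the value of 𝛼 are hypothetical, and this is not claimed to be the authors' implementation.

```python
from statistics import mean


def hotelling_t2(points):
    """Return T^2 = (x - mean)^T Cov^{-1} (x - mean) for each 2-D point."""
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    # Entries of the sample covariance matrix (divisor n - 1).
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    t2 = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Closed-form inverse of the 2x2 covariance matrix applied to (dx, dy).
        t2.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return t2


def flag_outliers(points, alpha):
    """Mark a sample as abnormal when its T^2 exceeds alpha squared."""
    return [t > alpha ** 2 for t in hotelling_t2(points)]


# Hypothetical 2-D samples: seven roughly collinear points and one stray point.
points = [(1.0, 1.0), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2),
          (5.0, 4.8), (6.0, 6.1), (7.0, 7.0), (4.0, 9.0)]
flags = flag_outliers(points, alpha=2.0)
print(flags)
```

A smaller 𝛼 shrinks the accepted region and flags more samples, which is why 𝛼 is one of the parameters left to the optimizer.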

In the present study, supervised outlier detection is utilized, which, together with Hotelling’s T² outlier detection, necessitates a classification algorithm [Han et al. 2011]. In supervised outlier detection, the classification algorithm labels the samples as an expert would, after which the samples that enlarge the classes are detected as outliers and eliminated from the dataset.

The OCOR method employs the decision tree algorithm, as it is one of the simplest and fastest classification algorithms [Quinlan et al. 1986, 1987, Quinlan and Rivest 1989]. This algorithm has a tree-like shape and comprises three parts. The first part entails internal nodes that make decisions on the attributes; the second part contains branches that represent the decisions made in the internal nodes; and the third part, or the final level of tree nodes, includes leaf nodes that present the class labels. The decision tree employed was introduced by Breiman et al. [1984] as the CART algorithm for measuring training dataset impurity as follows:

GI(\text{dataset}) = 1 - \sum_{i = 1}^{m} p_{i}^{2}, \quad \text{where} \quad p_{i} = \frac{| C_{i, \text{dataset}} |}{| \text{dataset} |}

(2)

where C_i represents the class label, p_i is the probability that the sample considered is in class C_i, and m is the number of classes considered. GI leads to a binary decision in each internal node. The precision of the decision tree classification algorithm is tuned using the Minimum Parent Size (MPS) parameter. MPS determines the minimum number of samples allowed in each node and serves as a termination criterion. A lower MPS leads to a larger, more accurate decision tree and vice versa.
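The impurity in Equation (2) can be computed directly from the class labels that reach a node. The sketch below is illustrative only; the label values are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Return GI = 1 - sum(p_i^2), with p_i the share of samples in class i."""
    n = len(labels)
    if n == 0:
        raise ValueError("cannot compute impurity of an empty node")
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


# Hypothetical class labels reaching two different nodes.
pure_node = ["low", "low", "low", "low"]
mixed_node = ["low", "low", "high", "high"]
print(gini_impurity(pure_node), gini_impurity(mixed_node))
```

A node holding a single class has zero impurity, while an even two-class mix has impurity 0.5, so a split is preferred when it moves the child nodes toward zero.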

Parameters 𝛼 (strict Hotelling’s T² criterion), m (number of classes), and MPS (decision tree termination criterion) should be determined to attain an optimum model. The optimum position is a model with the lowest number of outlier samples and the highest coefficient of determination (R²) between the satellite signals and in situ discharge measurements. Hence, two goals must be attained, and a multi-objective optimization algorithm is needed.
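The two goals can be written as a pair of numbers evaluated for any candidate set of eliminated samples. The sketch below assumes that R² is the squared Pearson correlation between the retained signals and the matching discharges, which the text does not state explicitly; the function names and data are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def objectives(signals, discharges, is_outlier):
    """Return (number of eliminated samples, R^2 of the retained samples)."""
    kept = [(s, q) for s, q, out in zip(signals, discharges, is_outlier) if not out]
    n_eliminated = sum(1 for out in is_outlier if out)
    r2 = r_squared([s for s, _ in kept], [q for _, q in kept])
    return n_eliminated, r2


# Hypothetical data: four consistent samples and one corrupted signal.
signals = [1.00, 1.01, 1.02, 1.03, 1.20]
discharges = [100.0, 200.0, 300.0, 400.0, 500.0]

baseline = objectives(signals, discharges, [False] * 5)
cleaned = objectives(signals, discharges, [False, False, False, False, True])
print(baseline, cleaned)
```

Removing more samples can always raise R², so the first goal acts as a brake on the second; this tension is what calls for a multi-objective search.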

In the present study, Multi-Objective Particle Swarm Optimization (MOPSO) [Coello and Lechuga 2002] is used. MOPSO is a combination of Particle Swarm Optimization (PSO) [Kennedy et al. 2001, Shaghaghi et al. 2017, Zaji et al. 2015] and the Non-Dominated Sorting Genetic Algorithm II (NSGA-II) [Deb et al. 2000]. The goal of MOPSO is to determine the most suitable 𝛼, m, and MPS values to attain a model that detects the lowest number of outliers with the highest R².

The final step of the OCOR method is the regression procedure. Here, the satellite-based samples that were eliminated in previous steps are replaced with accurate values. This procedure involves in situ discharge measurements as input samples to the regression model and satellite-based signals as the output samples. Following the elimination, the regression model is trained using the remaining samples, after which it models the eliminated samples.

The OCOR method tries to detect and eliminate as few inaccurate and noisy samples as possible by using a combination of outlier detection, classification, and multi-objective optimization algorithms and after that, replace the eliminated samples using a regression method.

Applying this model is a necessary step before satellite river information is used in any further application. However, this version of the proposed method has some limitations. Firstly, the in situ discharge measurements are used both to find the highest R², which is one of the objectives of the optimization algorithm, and to replace the eliminated samples, which is the last phase of the OCOR method.

Therefore, this method is not applicable to ungauged basins. Secondly, this model has only been tested on information from the AMSR satellite sensors. The footprint of AMSR is large (about 8 × 12 km²), so the information obtained by this sensor is more suitable for large rivers. Thirdly, in order to use the proposed preprocessing method on other satellite sensors, some adjustments should be made to this model. For instance, despite the high resolution of the MODerate resolution Imaging Spectroradiometer (MODIS), the information it gathers is more sensitive to cloudy weather conditions, and the affected images should be eliminated from the dataset separately.

5. Results and discussion

As mentioned earlier, one of the optimal alternatives to river discharge measurement is satellite information. For the satellite signals to be usable, they must undergo a preprocessing phase. The OCOR method entails two phases: preprocessing and calibration. Satellite signals have two major drawbacks. The first is inaccuracy due to poor environmental conditions, and the second is missed signals on account of inaccessibility to the satellite.

Figure 2 presents the in-situ discharge measurements and satellite signals for the two rivers considered. This figure indicates a good correlation between the satellite signals and in situ discharge measurements. However, it is evident that many time periods have no information and there are missed samples. For the White River, besides missed satellite signals, there is a period of time for which in situ measurements were lost.

According to the study area section, in the present study, three years of daily information was used for the White River, and 15 years of daily information was used for the Willamette River. It should be noted that the most appropriate time period for each case study should be selected according to the complexity of the problem considered. For instance, according to Figure 2, for the Willamette River, the satellite signals were missed for a long period of time (the black areas). In this case, longer periods of in situ measurements and satellite signals are needed to evaluate a reliable model.

On the other hand, selecting a longer period of time also increases the computational time. In this case, on a MacBook Pro with a 3.1 GHz Intel Core i7 CPU and 16 GB of memory, the Willamette River model takes about 10 h to run, whereas the White River model takes about 2 h on the same computer.

One of the most critical objectives of studies on river discharge is forecasting. Forecasting river discharge via any method requires a continuous dataset with no missed samples. The goal of the preprocessing phase in the OCOR method is to identify inaccurate satellite signals and eliminate them from the dataset. The purpose of the calibration phase in the OCOR method is to re-evaluate the excluded outliers and missed signals in order to obtain a continuous and accurate dataset for use as time series in river discharge forecasting using satellite information.

Figure 2.
Satellite signals and in situ discharge measurements for the (a) White River and (b) Willamette River.

An optimum outlier detection model will attain maximum R² with the least number of eliminated outliers. Hence, there are two aims, and this problem cannot be solved using simple optimization algorithms. In the present study, MOPSO was utilized to adjust three parameters, i.e., MPS, 𝛼, and m, to attain the best model. In the present model, MPS varied from 1 to 500, while 𝛼 and m varied from 1 to 10. Each case study underwent 100 iterations, and the population size considered was 300. With multi-objective optimization methods, the best model is selected by using the concept of domination. According to the definition, when every goal is to be minimized, j = [Goal_1j,…,Goal_nj] dominates i = [Goal_1i,…,Goal_ni] if every goal value of j is less than or equal to the corresponding goal value of i and at least one goal value of j is strictly less than that of i.
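The domination test and the resulting non-dominated set can be sketched as follows, using the usual Pareto definition for minimized goals: no worse on every goal and strictly better on at least one. To fit that convention, each candidate here is the pair (number of eliminated samples, 1 − R²); this encoding and the numbers are assumptions made for the illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every goal and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical candidates: (eliminated samples, 1 - R^2), both to be minimized.
candidates = [(100, 0.30), (300, 0.14), (500, 0.12), (400, 0.20), (300, 0.25)]
front = non_dominated(candidates)
print(front)
```

The surviving candidates trade eliminated samples against fit quality, so choosing a single model from among them still requires a judgment call.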

In the MOPSO procedure, the non-dominated samples are saved in a repository. The present model repository can save 100 non-dominated samples. Upon reaching this number of samples, the best model is selected manually. Figure 3 illustrates the repository samples for the White and Willamette Rivers as red circles. In this figure, the best models are denoted by blue diamonds. These models evidently represent a good balance between high R² and a low number of eliminated samples.

Figure 3.
Repository samples for the (a) White River and (b) Willamette River.

The most suitable model of the White River had MPS of 3, 𝛼 of 1.49, and m of 9, while the best model of the Willamette River had MPS of 4, 𝛼 of 2.18, and m of 10. All samples for these models and the eliminated samples are shown in Figure 4, where the classes are shown in different colors. Moreover, Figure 4 indicates that these datasets were not classified accurately. However, the goal of the present study was not accurate classification, but rather to determine the best outlier samples that increase the R² of the satellite signals and in situ discharge measurements.

Figure 4.
Classification and outlier samples of the (a) White River and (b) Willamette River.

Thus far, a combination of multi-objective optimization, classification, and outlier selection methods was applied to detect and eliminate inaccurate signals from the dataset. The dataset additionally had gaps due to technical satellite problems. In order to fill these gaps, calibration was carried out as the second OCOR phase. In this phase, the missed and outlier signals were evaluated using in situ discharge measurements, which served as nonlinear regression model inputs while the satellite signals were the targets. Following regression model training, the missed and eliminated signals were modeled. The calibration regression equations for the White and Willamette Rivers are given by (3) and (4), respectively, where S represents satellite signal values, and Q denotes in situ discharge measurements.

S = 4.63 \times 10^{-16} \, Q^{3} - 6.16 \times 10^{-11} \, Q^{2} + 3.79 \times 10^{-6} \, Q + 1.014

(3)

S = 6.52 \times 10^{-18} \, Q^{3} - 6.86 \times 10^{-12} \, Q^{2} + 1.15 \times 10^{-6} \, Q + 1.017

(4)
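A minimal sketch of how (3) and (4) could be used to fill gaps is given below. It assumes that the coefficients are in scientific notation, so that the exponent −16 on the first coefficient of (3) means a factor of 10⁻¹⁶, and that Q is expressed in the same units as the in situ record used for calibration. The helper names and the sample series are hypothetical.

```python
# Cubic coefficients (a3, a2, a1, a0), read as scientific notation (assumption).
WHITE_COEFFS = (4.63e-16, -6.16e-11, 3.79e-6, 1.014)
WILLAMETTE_COEFFS = (6.52e-18, -6.86e-12, 1.15e-6, 1.017)


def calibrated_signal(q, coeffs):
    """Evaluate S = a3*Q^3 + a2*Q^2 + a1*Q + a0 with Horner's scheme."""
    a3, a2, a1, a0 = coeffs
    return ((a3 * q + a2) * q + a1) * q + a0


def fill_gaps(signals, discharges, coeffs):
    """Replace missing signals (None) with the regression estimate."""
    return [calibrated_signal(q, coeffs) if s is None else s
            for s, q in zip(signals, discharges)]


# Hypothetical series with one missing satellite signal.
signals = [1.02, None, 1.05]
discharges = [5000.0, 12137.0, 15000.0]
filled = fill_gaps(signals, discharges, WHITE_COEFFS)
print(filled)
```

Observed signals pass through unchanged; only the positions marked as missing or eliminated receive the regression estimate.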

Values missed because of technical satellite problems are not specified. In situ gauges are also sometimes inaccessible, and the in-situ discharge measurement dataset has lost samples too. In such cases, a regression model can be employed to model in situ discharge measurements using satellite signals.

The signals evaluated for the two case studies are shown in Figure 5. Here, the red points are outlier signals eliminated from the dataset in the OCOR preprocessing phase and replaced in the calibration phase using (3) and (4). The blue points represent satellite signals missed because of technical problems. From this figure, it is evident that the Willamette River has lost signals for a long period of time.

These lost samples were re-evaluated by using the mentioned regression equations. The pink points in the figure for the White River dataset represent sample inputs replacing in situ measurement values missed due to technical gauge problems. For the White and Willamette Rivers, the R² values between the initial satellite signals and in situ discharges are 0.57 and 0.39, respectively, and the R² values between the developed satellite signals and in situ discharges are 0.86 and 0.7, respectively. Thus, using the OCOR method increases the performance of the satellite signals significantly. In addition, the gaps in the satellite signals and in situ measurements of the initial dataset are filled in the final datasets.

Figure 5.
The final dataset of satellite-based signals for the (a) White River and (b) Willamette River.

After calibration, the OCOR method has attained the objectives. First, inaccurate satellite signals were detected and eliminated. Second, the missed and excluded signals were re-evaluated using regression models. According to Figure 6, subsequent to these two phases, the satellite signals exhibited an excellent correlation with the in-situ discharge measurements; hence, this dataset can be used as a time series to forecast future river discharge.

Figure 6.
Simulations of in situ discharge from the (a) White River and (b) Willamette River using satellite signals.

Finally, the OCOR method is used to distinguish between the normal water level of rivers and the flood water levels. In this approach, which was used in the field of satellite-based flood detection by De Groeve and Riva [2009b], the Flood Magnitude (FM) is defined using the following equation:

FM = \frac{S - \bar{S}}{𝜎 (S)}

(5)

where S is the satellite signal as presented in (1), and 𝜎(S) is the standard deviation of the S dataset, which is defined as follows:

𝜎 = \sqrt{\frac{1}{N - 1} \sum_{i = 1}^{N} {(S_{i} - \bar{S})}^{2}}

(6)

where N is the number of samples in the S dataset and \bar{S} represents the average value of S. When the average value of the dataset is zero and its standard deviation is one, the normal distribution is defined as follows:

𝜑 (x) = \frac{1}{\sqrt{2 𝜋}} e^{- \frac{1}{2} x^{2}}

(7)

where x represents a sample of the S dataset. The histograms of the FM datasets for the Willamette and White Rivers are presented in Figure 7. In this figure, the normal distribution obtained using (7) is shown by a red line, and the real distributions of the FM datasets are shown by blue bins. De Groeve and Riva [2009b] and the GFDS organization proposed considering FM samples with a standard deviation between two and four (2 < 𝜎 < 4) as small and regular floods and FM samples with a standard deviation of four or higher (𝜎⩾4) as extreme floods.

Figure 7.
Histograms of the Willamette and White Rivers’ FM datasets.
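Equations (5) and (6) and the thresholds quoted above can be combined in a short sketch. The function names, the class labels, and the signal values are hypothetical; `statistics.stdev` uses the N − 1 divisor, which matches (6).

```python
from statistics import mean, stdev


def flood_magnitude(signals):
    """Return FM = (S - mean(S)) / sigma(S) for every sample, as in (5)."""
    s_bar = mean(signals)
    sigma = stdev(signals)  # sample standard deviation, divisor N - 1 as in (6)
    return [(s - s_bar) / sigma for s in signals]


def classify(fm):
    """Apply the thresholds quoted in the text to one FM value."""
    if fm >= 4:
        return "extreme flood"
    if fm > 2:
        return "small or regular flood"
    return "normal"


# Hypothetical satellite signals, for illustration only.
signals = [1.01, 1.02, 1.03, 1.02, 1.10]
fm = flood_magnitude(signals)
labels = [classify(value) for value in fm]
print(list(zip(fm, labels)))
```

By construction, the FM series has zero mean and unit standard deviation, which is what allows it to be compared with the standard normal curve in (7).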

6. Conclusion

One of the most worthwhile aims of flood disaster management is to provide real-time flood water level forecasting for early warning evacuation plans to save lives. Various studies have been recently conducted in this field of science. Satellite-based signals are becoming a more readily available and cost-effective means of monitoring a vast network of rivers during major storm events. In this study, a novel method is proposed to improve the accuracy of satellite-based water level forecasts, comprising a multi-objective particle swarm Optimization model, a decision tree Classification algorithm, Hotelling’s T² Outlier detection, and a Regression model (OCOR). The new OCOR method can effectively identify inaccurate satellite signals and replace them with reasonable values.

Two case studies were selected to showcase the novel application of the new OCOR method, namely the White River and the Willamette River, located in the United States. In total, 307 White River samples and 1008 Willamette River samples were identified as outliers or missing data due to satellite technical problems. Upon eliminating these samples, the satellite-based signals were successfully calibrated with the in situ hydrometric gauge data, and the deleted signals were gap-filled using the calibrated regression model. Hence, all dataset gaps were filled with satisfactory values. The final results demonstrate a superior correlation between in situ measurements and satellite-based signals following the OCOR model application. The method can therefore be used as a cost-effective means of tracking water levels in vast river networks during major flood events for early warning evacuation plans to save lives.

References

[Akhbari et al., 2017] A. Akhbari; H. Bonakdari; I. Ebtehaj Evolutionary prediction of electrocoagulation efficiency and energy consumption probing, Desalin. Water Treat., Volume 64 (2017), pp. 54-63 | DOI

[Alizadeh et al., 2017] M. J. Alizadeh; M. R. Kavianpour; O. Kisi; V. Nourani A new approach for simulating and forecasting the rainfall-runoff process within the next two months, J. Hydrol., Volume 548 (2017), pp. 588-597 | DOI

[Birkinshaw et al., 2010] S. J. Birkinshaw; G. O’Donnell; P. Moore; C. Kilsby; H. Fowler; P. Berry Using satellite altimetry data to augment flow estimation techniques on the Mekong River, Hydrol. Process., Volume 24 (2010) no. 26, pp. 3811-3825 | DOI

[Bjerklie et al., 2004] D. M. Bjerklie; D. Moller; L. C. Smith; S. L. Dingman Estimating discharge in rivers using remotely sensed hydraulic information, J. Hydrol., Volume 309 (2004) no. 1, pp. 191-209

[Bonakdari et al., 2019] H. Bonakdari; A. H. Zaji; A. D. Binns; B. Gharabaghi Integrated Markov chains and uncertainty analysis techniques to more accurately forecast floods using satellite signals, J. Hydrol., Volume 572 (2019), pp. 75-95 | DOI

[Bonakdari et al., 2020] H. Bonakdari; F. Moradi; I. Ebtehaj; B. Gharabaghi; A. A. Sattar; A. H. Azimi; A. Radecki-Pawlik A non-tuned machine learning technique for abutment scour depth in clear water condition, Water, Volume 12 (2020), article 301 | DOI

[Brakenridge et al., 2007] G. R. Brakenridge; S. V. Nghiem; E. Anderson; R. Mic Orbital microwave measurement of river discharge and ice status, Water Resour. Res., Volume 43 (2007) no. 4, article W04405 | DOI

[Brakenridge et al., 2012] G. R. Brakenridge; S. Cohen; A. J. Kettner; T. De Groeve; S. V. Nghiem; J. P. Syvitski; B. M. Fekete Calibration of satellite measurements of river discharge using a global hydrology model, J. Hydrol., Volume 475 (2012), pp. 123-136 | DOI

[Breiman et al., 1984] L. Breiman; J. Friedman; C. J. Stone; R. A. Olshen Classification and Regression Trees, CRC Press, Boca Raton, Florida, USA, 1984 | DOI | Zbl

[Calmant and Seyler, 2006] S. Calmant; F. Seyler Continental surface waters from satellite altimetry, C. R. Geosci., Volume 338 (2006) no. 14, pp. 1113-1122 | DOI

[Chen et al., 2015] X. Y. Chen; K. W. Chau; A. O. Busari A comparative study of population-based optimization algorithms for downstream river flow forecasting by a hybrid neural network model, Eng. Appl. Artif. Intell., Volume 46 (2015), pp. 258-268 | DOI

[Coello and Lechuga, 2002] C. A. C. Coello; M. S. Lechuga MOPSO: a proposal for multiple objective particle swarm optimization, Evolutionary Computation, 2002. CEC ’02. Proceedings of the 2002 Congress on, Honolulu, USA, IEEE, New York, USA, 2002, pp. 1051-1056

[Darras et al., 2017] T. Darras; L. Kong-A-Siou; B. Vayssade; A. Johannet; S. Pistre Karst flash flood forecasting using recurrent and non-recurrent artificial neural network models: the case of the Lez Basin (Southern France), EuroKarst 2016, Neuchâtel: Advances in the Hydrogeology of Karst and Carbonate Reservoirs (P. Renard; C. Bertrand, eds.), Springer International Publishing, Cham, 2017, pp. 169-177 | DOI

[Deb et al., 2000] K. Deb; S. Agrawal; A. Pratap; T. Meyarivan A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II, International Conference on Parallel Problem Solving From Nature, Springer, 2000, pp. 849-858

[De Groeve and Riva, 2009a] T. De Groeve; P. Riva Early flood detection and mapping for humanitarian response, Proceedings of the 6th International ISCRAM Conference, Gothenburg, Sweden, Information Systems for Crisis Response and Management (ISCRAM), Gothenburg, Sweden, 2009a

[De Groeve and Riva, 2009b] T. De Groeve; P. Riva Global real-time detection of major floods using passive microwave remote sensing, Proceedings of the 33rd International Symposium on Remote Sensing of Environment, Stresa, Italy, International Center for Remote Sensing of Environment (ICRSE), Stresa, Italy, 2009b, pp. 4-8 | DOI

[Ebtehaj et al., 2018] I. Ebtehaj; H. Bonakdari; F. Moradi; B. Gharabaghi; Z. S. Khozani An integrated framework of Extreme Learning Machines for predicting scour at pile groups in clear water condition, Coast. Eng., Volume 135 (2018), pp. 1-15 | DOI

[Ebtehaj et al., 2019] I. Ebtehaj; H. Bonakdari; B. Gharabaghi Closure to “An integrated framework of Extreme Learning Machines for predicting scour at pile groups in clear water condition by Ebtehaj, I., Bonakdari, H., Moradi, F., Gharabaghi, B., Khozani, Z. S.”, Coast. Eng., Volume 147 (2019), pp. 135-137 | DOI

[Espinoza-Villar et al., 2018] R. Espinoza-Villar; J.-M. Martinez; E. Armijos; J.-C. Espinoza; N. Filizola; A. Dos Santos; B. Willems; P. Fraizy; W. Santini; Ph. Vauchel Spatio-temporal monitoring of suspended sediments in the Solimões River (2000–2014), C. R. Geosci., Volume 350 (2018) no. 1–2, pp. 4-12 | DOI

[Frasson et al., 2017] R. P. d. M. Frasson; R. Wei; M. Durand; J. T. Minear; A. Domeneghetti; G. Schumann; B. A. Williams; E. Rodriguez; C. Picamilh; C. Lion; T. Pavelsky; P. A. Garambois Automated river reach definition strategies: applications for the surface water and ocean topography mission, Water Resour. Res., Volume 53 (2017) no. 10, pp. 8164-8186 | DOI

[Gholami et al., 2019] A. Gholami; H. Bonakdari; P. Samui; M. Mohammadian; B. Gharabaghi Predicting stable alluvial channel profiles using emotional artificial neural networks, Appl. Soft Comput. J., Volume 78 (2019), pp. 420-437 | DOI

[Han et al., 2011] J. Han; J. Pei; M. Kamber Data Mining: Concepts and Techniques, Elsevier Inc., Amsterdam, Netherlands, 2011 | Zbl

[He et al., 2014] Z. He; X. Wen; H. Liu; J. Du A comparative study of artificial neural network, adaptive neuro fuzzy inference system and support vector machine for forecasting river flow in the semiarid mountain region, J. Hydrol., Volume 509 (2014), pp. 379-386 | DOI

[Hotelling, 1992] H. Hotelling The generalization of Student’s ratio, Breakthroughs in Statistics, Springer, New York, USA, 1992 | DOI

[Jiang et al., 2014] S. Jiang; L. Ren; Y. Hong; X. Yang; M. Ma; Y. Zhang; F. Yuan Improvement of multi-satellite real-time precipitation products for ensemble streamflow simulation in a middle latitude basin in South China, Water Resour. Manage., Volume 28 (2014) no. 8, pp. 2259-2278 | DOI

[Kennedy et al., 2001] J. F. Kennedy; J. Kennedy; R. C. Eberhart; Y. Shi Swarm Intelligence, Morgan Kaufmann, San Francisco, California, USA, 2001

[Khan et al., 2011] S. I. Khan; Y. Hong; J. Wang; K. K. Yilmaz; J. J. Gourley; R. F. Adler; G. R. Brakenridge; F. Policelli; S. Habib; D. Irwin Satellite remote sensing and hydrologic modeling for flood inundation mapping in Lake Victoria basin: implications for hydrologic prediction in ungauged basins, IEEE Trans. Geosci. Remote Sens., Volume 49 (2011) no. 1, pp. 85-95 | DOI

[Khan et al., 2012] S. I. Khan; Y. Hong; H. J. Vergara; J. J. Gourley; G. R. Brakenridge; T. De Groeve; Z. L. Flamig; F. Policelli; B. Yong Microwave satellite data for hydrologic modeling in ungauged basins, IEEE Geosci. Remote Sens. Lett., Volume 9 (2012) no. 4, pp. 663-667 | DOI

[Kugler and De Groeve, 2007] Z. Kugler; T. De Groeve, 2007 (The global flood detection system. Office for Official Publications of the European Communities, Luxembourg, 45)

[Mason and Young, 2002] R. L. Mason; J. C. Young Multivariate Statistical Process Control with Industrial Applications, Society for Industrial Mathematics, Philadelphia, USA, 2002 | DOI | Zbl

[Mitra, 2009] A. Mitra Data transformation for normalization, Encyclopedia of Data Warehousing and Mining, IGI Global, Hershey, USA, 2009, pp. 566-571 | DOI

[Njoku et al., 2003] E. G. Njoku; T. J. Jackson; V. Lakshmi; T. K. Chan; S. V. Nghiem Soil moisture retrieval from AMSR-E, IEEE Trans. Geosci. Remote Sens., Volume 41 (2003) no. 2, pp. 215-229 | DOI

[Quinlan and Rivest, 1989] J. R. Quinlan; R. L. Rivest Inferring decision trees using the minimum description length principle, Info. Comput., Volume 80 (1989) no. 3, pp. 227-248 | DOI | Zbl

[Quinlan, 1986] J. R. Quinlan Induction of decision trees, Mach. Learn., Volume 1 (1986) no. 1, pp. 81-106 | DOI

[Quinlan, 1987] J. R. Quinlan Simplifying decision trees, Int. J. Man-Mach. Stud., Volume 27 (1987) no. 3, pp. 221-234 | DOI

[Salvia et al., 2011] M. Salvia; F. Grings; P. Ferrazzoli; V. Barraza; V. Douna; P. Perna; H. Karszenbaum Estimating flooded area and mean water level using active and passive microwaves: the example of Paraná River delta floodplain, Hydrol. Earth Syst. Sci. Discuss., Volume 8 (2011) no. 2, pp. 2895-2928 | DOI

[Sattar et al., 2019] M. A. Sattar; H. Bonakdari; B. Gharabaghi; A. Radecki-Pawlik Hydraulic modeling and evaluation equations for the incipient motion of sandbags for levee breach closure operations, Water, Volume 11 (2019) no. 279, pp. 1-22 | DOI

[Schmugge, 1980] T. J. Schmugge Microwave approaches in hydrology, Photogramm. Eng. Remote Sens., Volume 46 (1980), pp. 495-507 | DOI

[Shabbak et al., 2011] A. Shabbak; H. Midi; M. N. Hassan The performance of robust multivariate statistical control charts based on different cutoff-points with sustained shifts in mean, J. Appl. Sci., Volume 11 (2011) no. 1, pp. 56-65 | DOI

[Shaghaghi et al., 2017] S. Shaghaghi; H. Bonakdari; A. Gholami; I. Ebtehaj; M. Zeinolabedini Comparative analysis of GMDH neural network based on genetic algorithm and particle swarm optimization in stable channel design, Appl. Math. Comput., Volume 313 (2017), pp. 271-286 | DOI

[Shiklomanov et al., 2002] A. Shiklomanov; R. Lammers; C. J. Vörösmarty Widespread decline in hydrological monitoring threatens pan-Arctic research, EOS Trans. AGU., Volume 83 (2002) no. 2, pp. 13-17 | DOI

[Sivapalan, 2003] M. Sivapalan Prediction in ungauged basins: a grand challenge for theoretical hydrology, Hydrol. Process., Volume 17 (2003) no. 15, pp. 3163-3170 | DOI

[Srivastava, 2017] P. K. Srivastava Satellite soil moisture: review of theory and applications in water resources, Water Resour. Manage., Volume 31 (2017) no. 10, pp. 3161-3176 | DOI

[Stokstad, 1999] E. Stokstad Scarcity of rain, stream gages threatens forecasts, Science, Volume 285 (1999) no. 5431, pp. 1199-1200 | DOI

[Su et al., 2008] F. Su; Y. Hong; D. P. Lettenmaier Evaluation of TRMM Multisatellite Precipitation Analysis (TMPA) and its utility in hydrologic prediction in the La Plata Basin, J. Hydrometeorol., Volume 9 (2008) no. 4, pp. 622-640 | DOI

[Sullivan and Woodall, 1996] J. H. Sullivan; W. H. Woodall A comparison of multivariate control charts for individual observations, J. Qual. Tech., Volume 28 (1996) no. 4, pp. 398-408 | DOI

[Tekeli and Fouli, 2017] A. E. Tekeli; H. Fouli Reducing false flood warnings of TRMM rain rates thresholds over Riyadh City, Saudi Arabia by utilizing AMSR-E Soil moisture information, Water Resour. Manage., Volume 31 (2017) no. 4, pp. 1243-1256 | DOI

[Temimi et al., 2007] M. Temimi; R. Leconte; F. Brissette; N. Chaouch Flood and soil wetness monitoring over the Mackenzie River Basin using AMSR-E 37 GHz brightness temperature, J. Hydrol., Volume 333 (2007) no. 2, pp. 317-328 | DOI

[Temimi et al., 2011] M. Temimi; T. Lacava; T. Lakhankar; V. Tramutoli; H. Ghedira; R. Ata; R. Khanbilvardi A multi-temporal analysis of AMSR-E data for flood and discharge monitoring during the 2008 flood in Iowa, Hydrol. Process., Volume 25 (2011) no. 16, pp. 2623-2634 | DOI

[Theis et al., 1982] S. W. Theis; M. J. McFarland; W. D. Rosenthal; C. L. Jones Microwave Remote Sensing of Soil Moistures, Remote Sensing Center, Texas A&M University, College Station, TX, 1982 (RSC-3458-129) | DOI

[Ulaby et al., 1978] F. T. Ulaby; P. P. Batlivala; M. C. Dobson Microwave backscatter dependence on surface roughness, soil moisture, and soil texture: Part I-bare soil, IEEE Trans. Geosci. Elec., Volume GE-16 (1978) no. 4, pp. 286-295 | DOI

[Wang et al., 1982] J. Wang; P. O’Neill; E. Engman, 1982 (Remote Measurements of Soil Moisture by Microwave Radiometers at BARC Test Site II. 83954, NASA) | DOI

[Weigend, 2018] A. S. Weigend Time Series Prediction: Forecasting the Future and Understanding the Past, Routledge, Taylor and Francis, New York, USA, 2018 | DOI

[Yaseen et al., 2017] Z. M. Yaseen; I. Ebtehaj; H. Bonakdari; R. C. Deo; A. Danandeh Mehr; W. H. Melini Wan Mohtar; L. Diop; A. El-Shafie; V. P. Singh Novel approach for streamflow forecasting using a hybrid ANFIS-FFa model, J. Hydrol., Volume 554 (2017), pp. 263-276 | DOI

[Young et al., 2017] C. C. Young; W. C. Liu; M. C. Wu A physically based and machine learning hybrid approach for accurate rainfall-runoff modeling during extreme typhoon events, Appl. Soft Comput., Volume 53 (2017), pp. 205-216 | DOI

[Zaji and Bonakdari, 2019] A. H. Zaji; H. Bonakdari Robustness lake water level prediction using the search heuristic based artificial intelligence methods, ISH J. Hydraul. Eng., Volume 25 (2019) no. 3, pp. 316-324 | DOI

[Zaji et al., 2015] A. H. Zaji; H. Bonakdari; S. Shamshirband; S. N. Qasem Potential of particle swarm optimization based radial basis function network to predict the discharge coefficient of a modified triangular side weir, Flow Measur. Instrum., Volume 45 (2015), pp. 404-407 | DOI

[Zaji et al., 2018] A. H. Zaji; H. Bonakdari; B. Gharabaghi Remote sensing satellite data preparation for simulating and forecasting river discharge, IEEE Trans. Geosci. Remote Sens., Volume 56 (2018) no. 6, pp. 3432-3441 | DOI

[Zaji et al., 2019a] A. H. Zaji; H. Bonakdari; B. Gharabaghi Applying upstream satellite signals and a 2-D Error minimization algorithm to advance early warning and management of flood water levels and river discharge, IEEE Trans. Geosci. Remote Sens., Volume 99 (2019a), pp. 1-9

[Zaji et al., 2019b] A. H. Zaji; H. Bonakdari; B. Gharabaghi Developing an AI-based method for river discharge forecasting using satellite signals, Theor. Appl. Climatol., Volume 138 (2019b), pp. 347-362 | DOI

[Zeynoddin et al., 2018] M. Zeynoddin; H. Bonakdari; A. Azari; I. Ebtehaj; B. Gharabaghi; H. M. Riahi Novel hybrid linear stochastic with non-linear extreme learning machine methods for forecasting monthly rainfall a tropical climate, J. Environ. Manage., Volume 222 (2018), pp. 190-206 | DOI
