Univariate modelling of summer-monsoon rainfall time series: Comparison between ARIMA and ARNN

Surajit Chattopadhyay; Goutami Chattopadhyay

doi:10.1016/j.crte.2009.10.016

External geophysics, climate and environment (Climate)

Univariate modelling of summer-monsoon rainfall time series: Comparison between ARIMA and ARNN
[Modélisation univariée de séries temporelles de pluie de mousson d’été : comparaison entre ARIMA et ARNN]

Présenté par : Anny Cazenave

Surajit Chattopadhyay ¹ ; Goutami Chattopadhyay ²

¹ Department of Computer Application, Pailan College of Management and Technology, 700 104 Kolkata, India
² Formerly, Department of Atmospheric Sciences, University of Calcutta, 700 019 Kolkata, India

Comptes Rendus. Géoscience, Volume 342 (2010) no. 2, pp. 100-107.

Résumés

Anglais
Français

The present article reports studies to develop a univariate model to forecast the summer monsoon (June–August) rainfall over India. Based on the data pertaining to the period 1871–1999, the trend and stationarity within the time series have been investigated. After revealing the randomness and non-stationarity within the time series, the autoregressive integrated moving average (ARIMA) models have been attempted and the ARIMA(0,1,1) has been identified as a suitable representative model. Consequently, an autoregressive neural network (ARNN) model has been attempted and the neural network has been trained as a multilayer perceptron with the extensive variable selection procedure. Sigmoid non-linearity has been used while training the network. Finally, a three-three-one architecture of the ARNN model has been obtained and after thorough statistical analysis the supremacy of ARNN has been established over ARIMA(0,1,1). The usefulness of ARIMA(0,1,1) has also been described.

Le présent article rapporte les études pour développer un modèle univarié susceptible de prédire la pluviosité de mousson (juin–août) sur l’Inde. Basées sur les données se rapportant à la période 1871–1999, tendance et stationnarité ont été recherchées au sein de séries temporelles. Après avoir révélé le caractère aléatoire et non stationnaire au sein de celles-ci, les modèles autorégressifs (ARIMA) à moyenne mouvante intégrée ont été explorés et le modèle ARIMA (0,1,1) a été identifié comme un modèle représentatif approprié. En conséquence, un modèle autorégressif à réseau neuronal (ARNN) a été exploré et le réseau neuronal a été essayé en tant que perceptron multicouche avec processus de sélection extensive variable. Finalement, une architecture 3-3-1 du modèle ARNN a été obtenue et après analyse statistique approfondie, la suprématie du modèle ARNN sur le modèle ARIMA (0,1,1) a été établie. L’utilité d’ARIMA (0,1,1) a aussi été décrite.

Métadonnées

Reçu le : 2009-08-21
Accepté le : 2009-11-24
Publié le : 2010-02-01

DOI : 10.1016/j.crte.2009.10.016

Keywords: Univariate model, Stationarity, ARIMA, ARNN, Multilayer perceptron, Rainfall
Mot clés : Modèle univarié, Stationnarité, ARIMA, ARNN, Perceptron multicouche, Pluie

Affiliations des auteurs :

Surajit Chattopadhyay ¹ ; Goutami Chattopadhyay ²

¹ Department of Computer Application, Pailan College of Management and Technology, 700 104 Kolkata, India
² Formerly, Department of Atmospheric Sciences, University of Calcutta, 700 019 Kolkata, India

@article{CRGEOS_2010__342_2_100_0,
     author = {Surajit Chattopadhyay and Goutami Chattopadhyay},
     title = {Univariate modelling of summer-monsoon rainfall time series: {Comparison} between {ARIMA} and {ARNN}},
     journal = {Comptes Rendus. G\'eoscience},
     pages = {100--107},
     publisher = {Elsevier},
     volume = {342},
     number = {2},
     year = {2010},
     doi = {10.1016/j.crte.2009.10.016},
     language = {en},
}

TY  - JOUR
AU  - Surajit Chattopadhyay
AU  - Goutami Chattopadhyay
TI  - Univariate modelling of summer-monsoon rainfall time series: Comparison between ARIMA and ARNN
JO  - Comptes Rendus. Géoscience
PY  - 2010
SP  - 100
EP  - 107
VL  - 342
IS  - 2
PB  - Elsevier
DO  - 10.1016/j.crte.2009.10.016
LA  - en
ID  - CRGEOS_2010__342_2_100_0
ER  -

%0 Journal Article
%A Surajit Chattopadhyay
%A Goutami Chattopadhyay
%T Univariate modelling of summer-monsoon rainfall time series: Comparison between ARIMA and ARNN
%J Comptes Rendus. Géoscience
%D 2010
%P 100-107
%V 342
%N 2
%I Elsevier
%R 10.1016/j.crte.2009.10.016
%G en
%F CRGEOS_2010__342_2_100_0

Surajit Chattopadhyay; Goutami Chattopadhyay. Univariate modelling of summer-monsoon rainfall time series: Comparison between ARIMA and ARNN. Comptes Rendus. Géoscience, Volume 342 (2010) no. 2, pp. 100-107. doi : 10.1016/j.crte.2009.10.016. https://comptes-rendus.academie-sciences.fr/geoscience/articles/10.1016/j.crte.2009.10.016/

Version originale du texte intégral

1 Introduction

In an agricultural country like India, the success or failure of crops and water scarcity in any year is a matter of greatest concern and these problems are highly associated with the behaviour of the summer monsoon rainfall. Mean monsoon rainfall over India, as a whole during June–September, is 88 cm with a coefficient of variation of 10% (Rajeevan, 2001). Accurate, long lead prediction of monsoon rainfall can improve planning to mitigate the adverse impacts of monsoon variability and to take advantage of favorable conditions (Reddy and Salvekar, 2003). The summer monsoon precipitation over India is dominated by the semi-permanent monsoon trough, which extends from West Pakistan to the North Bay of Bengal across Northwest India and the Westward moving synoptic disturbances developing over the North Bay of Bengal (Mohanty and Mohapatra, 2007). In the paper by Rajeevan (2001), the status and future prospects of long-range forecasts of Indian summer monsoon have been reviewed. In another work, Krishnamurthy and Shukla (2000) studied the interseasonal and interannual variability of summer monsoon rainfall over India. In the following paragraphs, we present a survey of the studies on rainfall time series over different parts of the world. Subsequently, we have mentioned the newness in our study with respect to the studies available on monsoon rainfall forecasting over India.

The study of rainfall time series is a topic of great interest in the field of climatology and hydrology. Some significant examples in such areas include Delleur and Kavvas (1978), Shin et al. (1990), Singh (1998). Both univariate (e.g. Soltani et al., 2007) and multivariate (Grimaldi et al., 2005) approaches have been attempted to model the rainfall time series. Impact of other atmospheric variables on rainfall has been discussed in various literatures (e.g. Cracknell and Varotsos, 2007; Varotsos, 2005; Chattopadhyay, 2007b). The association between rainfall and agrometeorological processes is well discussed (e.g. Jhajharia et al., 2009; Chattopadhyay et al., 2009). Several stochastic models were attempted to forecast the occurrence of rainfall, to investigate its seasonal variability and to forecast monthly/yearly rainfall over some given geographical area. Study of the rainfall is interesting because of the associated problems, such as forecasting, corrosion effects and climate variability and various literatures have discussed these issues (e.g. Kondratyev et al., 1995; Kondratyev and Varotsos, 2002; Ferm et al., 2005, 2006; Tzanis and Varotsos, 2008). In a study by Chin (1977), where daily precipitation records for 25 years at more than 100 stations in the conterminous United States were analyzed, it was proved that the proper Markov order describing the daily precipitation process has to be determined and cannot be assumed a priori. Gregory et al. (1993) applied a Markov chain model to investigate interannual variability of area averaged total precipitation. Wilks (1998) applied mixed exponential distribution to simulate precipitation amount at multiple sites exhibiting realistic spatial correlation. Chaotic features associated with the atmospheric phenomena have attracted the attention of modern scientists (e.g., Varotsos et al., 2007, Varotsos and Krik-Davidoff, 2006, Khan et al., 2005; Bandyopadhyay and Chattopadhyay, 2007). In recent times, the study of the possible presence of chaotic behavior in rainfall time series has been of much interest (Sivakumar, 2001). Mathematical tools based on the theoretical concepts underlying the methodologies for detection and modelling of dynamical and chaotic components within a hydrological time series have been studied extensively by various scientists like Islam and Sivakumar (2002); Khan et al. (2005) and Jayawardena and Lai (1994). The existence of deterministic chaos within rainfall time series is well documented in the literature (e.g. Rodriguez-Iturbe et al., 1989; Sharifi et al., 1990). Phase space reconstruction and artificial neural networks (ANN) are non-linear predictive tools that have been proposed in the modern literature as effective mathematical methodologies to be useful to hydrological time series characterized by chaotic features (Chattopadhyay and Chattopadhyay, 2008a; Elsner and Tsonis, 1992; Khan et al., 2005). The suitability of ANN over conventional statistical approaches has been demonstrated in many research papers dealing with hydrological processes (e.g. Chattopadhyay, 2007a ; Chattopadhyay and Chattopadhyay, 2008a). Applicability of ANN to rainfall time series is well documented in the literature. In recent times, the competence of ANN (Rojas, 1996) in forecasting chaotic time series has been established by several authors (e.g. Principe et al., 1992; Oliveira et al., 2000; Silverman and Dracup, 2000). Prediction of atmospheric events, especially rainfall, has benefited significantly by voluminous developments in the application field of ANN and rainfall events and quantities have been predicted statistically (e.g. DelSole and Shukla, 2002; Mohanty and Mohapatra, 2007). The advantages of ANN over traditional statistical and numerical weather prediction approaches have been discussed by McCann (1992), Kuligowski and Barros (1998) and Silverman and Dracup (2000). Several research papers are available where the suitability of the ANN approach has been established quantitatively over conventional statistical rainfall prediction procedures (e.g. Chattopadhyay, 2007a; Hastenrath, 1995; Toth et al., 2000; Ramirez et al., 2005; Chattopadhyay, 2007b; Chattopadhyay and Chattopadhyay, 2008b). Guhathakurata (2008) generated an ANN-based model that captured the input-output non-linear relationship and predicted the monsoon rainfall in India quite accurately. The purpose of the present article is to investigate the stationarity within the average monsoon rainfall time series in India and subsequently to model this time series through autoregressive approach.

The Asian monsoon circulation influences most of the tropics and subtropics of the Eastern Hemisphere and a major portion of the Earth's population (Chattopadhyay, 2007b). The southwest (summer) and the northeast (winter) monsoons influence weather and climate between 30N and 30S over the African, Indian and Asian land masses (Reddy and Salvekar, 2003, Chattopadhyay, 2007b). The variability in the monsoon rainfall depends heavily upon the sea surface temperature anomaly over the Indian Ocean (Clark et al., 2000). As the extra tropical circulation anomalies display energy dispersion away from the region of anomalous tropical convection, they have been taken to mean a Rossby wave response to the latent heat release associated with the tropical convection (Ferranti et al., 1990). In regions of anomalous tropical heating, there is a dynamical response with anomalous large-scale ascent and upper tropospheric divergence, which acts as a Rossby wave source (Sardeshmukh and Hoskins, 1988) for extratropical waves. The summer monsoon (June–August) is the most productive period in India with respect to its agricultural practices. Moreover, the conserved rainwater for this period is used for future irrigation purposes in this country. Therefore, forecasting of average summer monsoon rainfall is necessary for future agricultural and irrigation modelling over this country. A plethora of literature is available where the summer-monsoon rainfall over this country has been predicted through multivariate approach (e.g. Chattopadhyay, 2007a, 2007b; Chattopadhyay and Chattopadhyay, 2008a, 2008b and references therein). However, the multivariate approach requires various other parameters which themselves are characterized by chaotic properties. The autoregressive approach, which is a univariate approach, depends solely upon the concerned variable and therefore is free from the effect of other variables. Dahale and Singh (1993) adopted autoregressive approach to monsoon rainfall time series over India and identified third order autoregressive model as the best fit. In the present paper, we have viewed the autocorrelation structure of the monsoon rainfall time series and consequently adopted autoregressive integrated moving average (ARIMA) approach instead of tradition autoregressive (AR) approach. Finally, we have implemented ANN in autoregressive manner and compared its performance statistically with the ARIMA-based model. The methodology and implementation procedure are described in the subsequent sections.

2 Methodology

2.1 Generation of autoregressive process

The data used in the present article consists of the average summer monsoon rainfall over India during the period from 1871 to 1999. The data are collected from http://www.tropmet.res.in. Firstly, the autocorrelation structure has been investigated in Fig. 1. It is apparent from the figure that the time series has no significant serial dependence. Moreover, there are very small positive as well as negative spikes in the autocorrelation function. It should be noted that computing up to lag 60, the autocorrelation function is not gradually decaying to zero. However, at some points the autocorrelation almost vanishes. In some other points, the autocorrelation is becoming positive or negative. This indicates that the time series is not stationary. Moreover, randomness is discernible in the autocorrelation structure. Similar pattern of autocorrelation is available in the time series of levels of seasonal Nile flood obtained by El-Fandy et al. (1994). In the said work, the authors adopted ARIMA to model the said time series. Following their approach, ARIMA would be adopted in the present time series-modeling problem.

Fig. 1
Autocorrelation function of the time series of the summer-monsoon rainfall amount over India.
Fonction d’autocorrélation des séries temporelles de quantité de pluie de mousson d’été sur l’Inde.

In the next step, it is examined whether there is any linear trend within the time series. A basic approach in testing for trends is the regression approach (Woodward and Gray, 1993, 1995). In the present problem, the months (t) constitute the independent variable and maximum temperature (Y_t) the dependent variable. Thus, the model is Y_t = a + bt + E_t, where E_t are the residuals, a is the regression constant, b is the regression parameter and t varies from 1 to 129. The estimates for b and a are given by

\hat{b} = \frac{\sum_{t = 1}^{n} (t - \bar{t}) Y_{t}}{\sum_{t = 1}^{n} {(t - \bar{t})}^{2}}

(1)

\hat{a} = \bar{Y} - \hat{b} \bar{t}

(2)

In the present paper, the values of the estimates for b and a are 0.0036 and 203.26. Under the assumption that the residuals are independent and normally distributed with zero mean, the estimated standard error for the estimate of b is (Woodward and Gray, 1993, 1995)

S E^{(1)} (\hat{b}) = {[\frac{12 \sum_{t = 1}^{n} {(Y_{t} - \hat{a} - \hat{b} t)}^{2}}{(n - 2) n (n^{2} - 1)}]}^{\frac{1}{2}}

(3)

The null hypothesis of no linear trend is based on the assumption that $\hat{b} / S E^{(1)} (\hat{b})$ is distributed as student's t with (n-2) degrees of freedom. In the present article, its value is 0.6922. Thus, the null hypothesis is accepted and it is concluded that there is no linear trend within the time series.

A time series can be modeled by an autoregressive [AR(p)], a moving average [MA(q)], an autoregressive moving average [ARMA(p,q)], or an ARIMA(p,d,q) model. The ARIMA, popularized by Box and Jenkins (1976), is used for describing a wide variety of time series behaviour. This model is either stationary or non-stationary in the boundary of the stationary region (Woodward and Gray, 1993). A detailed description of the ARIMA process and its theoretical background is available in (Box et al., 2007). An ARIMA(p,d,q) process is given by the equation (Box et al., 2007):

ϕ (B) \nabla^{d} z_{t} = θ_{0} + θ (B) a_{t}

(4)

Here, B is the backward shift operator and

ϕ (B) = 1 - \sum_{k = 1}^{p} ϕ_{k} B^{k}

(5)

θ (B) = 1 - \sum_{k = 1}^{q} θ_{k} B^{k}

(6)

Some examples of application of ARIMA in the non-stationary meteorological time series include El-Fandy et al. (1994), Woodward and Gray (1993,1995) and Visser and Molenaar (1995). Three ARIMA(p,d,q) models namely ARIMA(0,1,1), ARIMA(0,2,2) and ARIMA(1,1,1) are now developed. Outcomes of the models are judged statistically. The three ARIMA models can be presented as:

ARIMA(0,1,1) process

\nabla z_{t} = (1 - θ_{1} B) a_{t}

(7)

ARIMA(0,2,2) process

\nabla^{2} z_{t} = (1 - θ_{1} B - θ_{2} B^{2}) a_{t}

(8)

ARIMA(1,1,1) process

\nabla z_{t} - ϕ_{1} \nabla z_{t - 1} = (1 - θ_{1} B) a_{t}

(9)

Since the time series under consideration does not have any seasonal pattern, it is not required to take p,d and q greater than 2. The results will be discussed in subsequent sections.

2.2 Generation of autoregressive neural network

In the last section of the present paper, the non-linearity and non-stationarity of the time series have been established and non-stationary time series models have been generated. In the present section, ANN would be adopted to model the same time series. The ANN has been generated in autoregressive manner and consequently the model has been named as autoregressive neural network (ARNN) model. An ordinary autoregressive model of order p is written as

z (t) = \sum_{i = 1}^{p} ϕ_{i} z (t - i) + \in (t)

(10)

In Eq. (10), a linear function F^L can be introduced, in which case the equivalent form of (10) would be (Dorffner, 1996)

z (t) = F^{L} (z (t - 1), z (t - 2), \dots, z (t - p)) + \in (t)

(11)

Replacing L by an ANN model leads to the ARNN model. A few examples of ARNN studies include Temizel and Caset (2005), Sallehuddin et al. (2007) and Trapletti et al. (2000). An ARNN application to the modeling of geophysical time series is available in Chattopadhyay and Chattopadhyay (2009). The ARNN model can be written as (Dorffner, 1996)

z (t) = F^{A R N N} (z (t - 1), z (t - 2), \dots, z (t - p)) + \in (t)

(12)

At this juncture, it is necessary to decide the number of predictors in the ARNN model. This is done by fitting autoregressive (AR(p)) models to the time series. Testing AR models for p equals 1, 2, 3 and 4 and computing the Akaike's information criterion (AIC) statistics for all the time models the AR(4) model is identified as the best and consequently four predictors are used to generate the ARNN model. It should be noted that AIC is a screening statistic whose minimization implies best predictive model. The AIC is given by (Storch and Zwiers, 1999)

A I C = n \log (2 π {\hat{σ}}_{E}^{2}) + \frac{S S E_{\{l_{1}, \dots, l_{p}\}}}{{\hat{σ}}_{E}^{2}} + 2 p

(13)

Details of the symbols used in the above expression are given in page 167, of Storch and Zwiers (1999). The neural network model adopted here belongs to the category of multilayer perceptron (MLP) (Gardner and Dorling, 1998 ; Gardner and Dorling, 1999). The theoretical details of MLP are available in various texts on ANN (e.g. Rojas, 1996 and Pal and Mitra, 1999); its suitability in hydrological modelling is well established in literature (Chattopadhyay and Chattopadhyay (2008a, 2008b) and references therein). Maier and Dandy (2000) presented an extensive review of the application of ANN in predicting water resource variables. In MLP, each network consists of several simple processors called neurons, or cells, which are highly interconnected and arranged in several layers. There are three basic types of layers: input layer, hidden layer(s) and output layer. Input and output layers are connected through hidden layer(s). There may be one to several hidden layers in between input and output layer. In mathematical form, the adaptive procedure of a feed forward MLP can be presented as

w_{k + 1} = w_{k} + η d_{k}

(14)

d_{k} = - \nabla E (w_{k})

(15)

The above equation represents an iteration process that finds the optimal weight vector by adapting the initial weight vector w₀. This adaptation is performed by sequential introduction of a set of input and target vectors to the network. The positive constant η is called the learning rate. The direction vector d_k is the negative gradient of the output error function E.

In the present problem, the dataset (i.e. the time series) consists of 129 data points. Since there are four predictors in autoregressive manner, the proposed model can be written as:

z {(t)}_{p r e d i c t e d} = F^{A R N N} (z (t - 1), z (t - 2), z (t - 3), z (t - 4)) + \in (t)

(16)

In this model, the neural network is trained through Kalman Filtering with sigmoid non-linearity in both hidden and output layer. From the entire dataset, 70% of the data are chosen randomly to train the ARNN and the remaining 30% are chosen as the test data. The entire dataset is used as the validation set. The minimization of the root mean squared error is chosen as the stopping criterion. Genetic Algorithm trains the network through extensive variable selection. The maximum number of hidden nodes is chosen as 24 using the method proposed by Perez et al. (2000). After training through extensive variable selection, the final architecture (i.e. input nodes–hidden nodes–output node) combination comes out to be 3-3-1. Performance of the model would be discussed statically in the subsequent section.

3 Discussion

The first part of the discussion is to present a scatter plots for the (actual-predicted) pairs pertaining to all of the five models. In Fig. 2a, the scatter plot for ARIMA(0,1,1) shows that there is a negative association between actual and predicted rainfall. A linear regression line has been fit to the plot and it is seen that there is a tendency towards linearity but there are significant deviations from the linear regression line. The Fig. 2b, which corresponds to ARIMA(0,2,2) model, shows a good positive linear association between actual and predicted values and the regression line has a significant closeness with the data cloud. This indicates the presence of a good positive correlation between actual and predicted values. Fig. 2c, which corresponds to ARIMA(1,1,1), has a widely spread data points having little closeness to the linear regression fit. Thus, it is apparently felt that ARIMA(1,1,1) has less ability than ARIMA(0,1,1) and ARIMA(0,2,2) to represent the time series under consideration. The Pearson correlation coefficients are calculated for each of the three cases and Table 1 shows that ARIMA(0,2,2) has the largest positive correlation; ARIMA(0,1,1) has negative correlation which is less in magnitude than the ARIMA(0,2,2); ARIMA(1,1,1) has very small positive correlation. Thus, from the figures and the correlation values, the third ARIMA model seems to have the least prediction capability. Willmott (1982) raised some vital questions regarding the usefulness of Pearson correlation coefficients and the scatter plots as measures of goodness of fit of a model. According to Willmott (1982), the main problem associated with the Pearson correlation coefficients is that the magnitudes of such correlations are not consistently related to the accuracy of prediction, where accuracy is defined as the degree to which model-predicted observations approach the magnitudes of their observed counterparts. Willmott (1982) proposed an ‘index of agreement’ d as a measure of goodness of fit of a model to the observed data. This index is given by

d = 1 - [\sum_{i} {|P_{i} - O_{i}|}^{2}] {[\sum_{i} {(|P_{i} - \bar{O}| + |O_{i} - \bar{O}|)}^{2}]}^{- 1}

(17)

Fig. 2
Scatter plots showing the association between actual summer monsoon rainfall and predictions from ARIMA(0,1,1) (a), ARIMA(0,2,2) (b) and ARIMA(1,1,1) (c).
Diagramme de dispersion des points montrant l’association des points entre pluie de mousson d’été actuelle et prédictions à partir d’ARIMA (0,1,1) (a), ARIMA(0,2,2) (b) et ARIMA (1,1,1) (c).

Statistics	ARIMA(0,1,1)	ARIMA(0,2,2)	ARIMA(1,1,1)	ARNN
Willmott's index	0.972	0.815	0.837	0.990
Pearson correlation	−0.427	0.766	0.196	0.987

Here, P_i and O_i denote the predicted and actual values of the ith observation of the time series. The values of the said statistic for the three models are presented in Table 1. It is observed from Table 1 that ARIMA(0,1,1) produces the maximum values of d. This means that the ARIMA(0,1,1) fits the best to the monsoon rainfall time series under consideration. As a further support to this observation, the Akaike Information Criteria (AIC) are calculated for the three models. It is found that the ARIMA(0,1,1) produces minimum AIC (1228.99), whereas, ARIMA(0,2,2) and ARIMA(1,1,1) produce AIC of 1256.96 and 1243.78 respectively. Since the minimization of AIC indicates better model, goodness of ARIMA(0,1,1) over the other models is further supported. The Pearson correlation and Willmott's index are calculated also for the ARNN model. It is found that the values are 0.987 and 0.99 respectively (Table 1). Thus, it is understood that ARNN outperforms ARIMA(0,1,1), which produced the maximum Willmott's index of agreement (0.972) among the competitive ARIMA models. However, ARIMA(0,1,1) can be considered as an alternative to ARNN because of its high Willmott's index, which is a little distant from that produced by ARNN model. A scatter plot is presented in Fig. 3, which corresponds to the actual versus predicted monsoon rainfall for the ARNN model. A very prominent positive and linear association is discernible from this figure. The data points are so close to the regression line that in some cases it seem that they are almost coincident with the regression line. To make the presentation more understandable, we have prepared a line diagram in Fig. 4, which shows that the actual and predicted summer monsoon rainfall amounts are very close to each other in the case of ARNN.

Fig. 3
Scatter plots showing the positive linear association between actual summer monsoon rainfall and predictions from autoregressive neural network (ARNN) model.
Diagramme de dispersion des points montrant l’association positive linéaire entre pluie de mousson d’été actuelle et prédiction à partir du modèle autorégressif à réseau neuronal (ARNN).

Fig. 4
Line diagram showing the association between actual summer monsoon rainfall and predictions from autoregressive neural network (ARNN) model.
Diagramme linéaire montrant l’association entre pluie de mousson d’été actuelle et prédictions à partir du modèle autorégressif à réseau neuronal (ARNN).

4 Concluding remarks

The present article has aimed at developing a univariate model for the monsoon rainfall time series over India. Rainfall is an essential component of the hydrological cycle and the monsoon season (June–August) of India is the most productive period for agricultural practices in this country. In some parts of this country, the monsoon rainfall is stored for future cultivation purposes. Several multivariate models have been attempted for the said time series. However, univariate modelling is not very frequent in this field. Authors of the present paper felt that a univariate model might be of more importance for predicting the said time series because it would not require the data for any other meteorological parameters. It was also felt that a univariate approach would be more acceptable in areas where some decisions need to be taken in light of the limited data availability. Prior to generating the representative models for the said time series, the serial correlation pattern of the time series was studied and the absence of any persistence within the time series was revealed. Simultaneously, absence of any cyclic pattern was also perceptible. From this study, it could be understood that the time series was not characterized by any stationarity. Instead, the time series was characterized by randomness. The presence of any linear trend within the time series was also investigated and it was detected that there was no linear trend within the time series. Subsequently, a non-linear time series approach in the form of ARIMA has been generated for the time series. After judging the three ARIMA models statistically, the ARIMA(0,1,1) was identified as the best model. Finally, an ARNN model was generated, where the number of predictors was decided on the basis of an autoregressive approach. Four predictors were required for the ARNN modelling, where the neural network was trained as multilayer perceptron with sigmoid non-linearity. After training through extensive variable selection by genetic algorithm, it was revealed that three nodes in the hidden layer with three nodes in input layer of the final model would be the best network architecture. Performance of the ARNN was judged statistically by means of scatter plot presentation and Willmott's index computation and it was revealed that the ARNN model performed better than ARIMA(0,1,1). However, from the high value of Willmott's index, it could be said that ARIMA(0,1,1) was not a negligible model and it might be an alternative model for forecasting the monsoon rainfall time series over India if there were no scope for neural network modelling.

Bibliographie

[Bandyopadhyay and Chattopadhyay, 2007] G. Bandyopadhyay; S. Chattopadhyay Single hidden layer artificial neural network models versus multiple linear regression model in forecasting the time series of total ozone, International Journal of Environmental Science and Technology, Volume 4 (2007), pp. 141-149

[Box and Jenkins, 1976] Box, G.E.P., Jenkins, G.M., 1976. time series analysis: forecasting and control. holden-Day.

[Box et al., 2007] G.E.P. Box; G.M. Jenkins; G.C. Reinsel Time series analysis: forecasting and control, Dorling Kindersley (India) Pvt Ltd, New Delhi, India, 2007 (licensees of Pearson Education in South Asia)

[Chattopadhyay, 2007a] S. Chattopadhyay Prediction of mean monthly total ozone time series- application of radial basis function network, International Journal of Remote Sensing, Volume 28 (2007), pp. 4037-4046

[Chattopadhyay, 2007b] S. Chattopadhyay Feed forward Artificial Neural Network model to predict the average summer-monsoon rainfall in India, Acta Geophysica, Volume 55 (2007), pp. 369-382

[Chattopadhyay and Chattopadhyay, 2008a] G. Chattopadhyay; S. Chattopadhyay A probe into the chaotic nature of total ozone time series by Correlation dimension method, Soft Computing, Volume 12 (2008), pp. 1007-1012

[Chattopadhyay and Chattopadhyay, 2008b] S. Chattopadhyay; G. Chattopadhyay Comparative study among different neural net learning algorithms applied to rainfall time series, Meteorological Applications, Volume 15 (2008), pp. 273-280

[Chattopadhyay and Chattopadhyay, 2009] G. Chattopadhyay; S. Chattopadhyay Autoregressive forecast of monthly total ozone concentration: a neurocomputing approach, Computers and Geosciences, Volume 35 (2009), pp. 1925-1932

[Chattopadhyay et al., 2009] S. Chattopadhyay; R. Jain; G. Chattopadhyay Estimating potential evapotranspiration from limited weather data over Gangetic West Bengal, India: a neurocomputing approach, Meteorological Applications, Volume 16 (2009), pp. 403-411

[Chin, 1977] E.H. Chin Modeling daily precipitation occurrence process with Markov chain, Water Resource Research, Volume 13 (1977), pp. 949-956

[Clark et al., 2000] C.O. Clark; J.E. Cole; P.J. Webster Indian Ocean SST and Indian Summer Rainfall: Predictive Relationships and their Decadal Variability, J. Climate, Volume 13 (2000), pp. 2503-2519

[Cracknell and Varotsos, 2007] A.P. Cracknell; C.A. Varotsos The IPCC Fourth Assessment Report and the fiftieth anniversary of Sputnik, Environmental Science And Pollution Research, Volume 14 (2007), pp. 384-387

[Dahale and Singh, 1993] S.D. Dahale; S.V. Singh Modelling of Indian monsoon rainfall time series by univariate Box-Jenkins type of models, Advances in Atmospheric Sciences, Volume 10 (1993), pp. 211-220

[Delleur and Kavvas, 1978] J.W. Delleur; M.L. Kavvas Stochastic models for monthly rainfall forecasting and synthetic generation, Journal of Applied Meteorology, Volume 17 (1978), pp. 1528-1536

[DelSole and Shukla, 2002] T. DelSole; J. Shukla Linear prediction of Indian Monsoon rainfall, Journal of Climate, Volume 15 (2002), pp. 3645-3658

[Dorffner, 1996] G. Dorffner Neural network for time series processing, Neural Network World, Volume 6 (1996), pp. 447-468

[El-Fandy et al., 1994] M.G. El-Fandy; Z.H. Ashour; S.M.M. Taiel Time series models adoptable for forecasting Nile floods and Ethiopian rainfalls, Bulletin of the American Meteorological Society, Volume 75 (1994), pp. 83-94

[Elsner and Tsonis, 1992] J. Elsner; A. Tsonis Nonlinear Prediction, Chaos, and Noise, Bulletin of American Meteorological Society, Volume 73 (1992), pp. 49-60

[Ferm et al., 2005] M. Ferm; F. De Santis; C. Varotsos Nitric acid measurements in connection with corrosion studies, Atmospheric Environment, Volume 39 (2005) no. 35, pp. 6664-6672

[Ferm et al., 2006] M. Ferm; J. Watt; S. O’hanlon; F. De Santis; C. Varotsos Deposition measurement of particulate matter in connection with corrosion studies, Analytical and Bioanalytical Chemistry, Volume 384 (2006) no. 6, pp. 1320-1330

[Ferranti et al., 1990] L. Ferranti; T.N. Palmer; F. Molteni; E. Klinker Tropical-extratropical interaction associated with the 30-60 day oscillation and its impact on medium and extended range prediction, Journal of Atmospheric Sciences, Volume 47 (1990), pp. 2177-2199

[Gardner and Dorling, 1998] M.W. Gardner; S.R. Dorling Artificial neural networks (the multilayer perceptron)--a review of applications in the atmospheric sciences, Atmospheric Environment, Volume 32 (1998), pp. 2627-2636

[Gardner and Dorling, 1999] M.W. Gardner; S.R. Dorling Neural network modelling and prediction of hourly NO_x and NO₂ concentrations in urban air in London, Atmospheric Environment, Volume 33 (1999), pp. 709-719

[Gregory et al., 1993] J.M. Gregory; T.M.L. Wigley; P.D. Jones Application of Markov models to area average daily precipitation series and inter annual variability in seasonal totals, Climate Dynamics, Volume 8 (1993), pp. 299-310

[Grimaldi et al., 2005] S. Grimaldi; F. Serinaldi; C. Tallerini Multivariate linear parametric models applied to daily rainfall time series, Advances in Geosciences, Volume 2 (2005), pp. 87-92

[Guhathakurata, 2008] P. Guhathakurata Long lead monsoon rainfall prediction for meteorological sub-divisions of India using deterministic artificial neural network, Meteorology and Atmospheric Physics, Volume 101 (2008), pp. 93-108

[Hastenrath, 1995] S. Hastenrath Recent Advances in Tropical Climate Prediction, J. Climate, Volume 8 (1995), pp. 1519-1532

[Islam and Sivakumar, 2002] M.N. Islam; B. Sivakumar Characterization and prediction of runoff dynamics: a nonlinear dynamical view, Advances in Water Resources, Volume 25 (2002), pp. 179-190

[Jayawardena and Lai, 1994] A.W. Jayawardena; F. Lai Analysis and prediction of chaos in rainfall and stream flow time series, Journal of Hydrology, Volume 153 (1994), pp. 23-52

[Jhajharia et al., 2009] D. Jhajharia; S.K. Shrivastava; D. Sarkar; S. Sarkar Temporal characteristics of pan evaporation trends under the humid conditions of northeast India, Agricultural and Forest Meteorology, Volume 149 (2009), pp. 763-770

[Khan et al., 2005] S. Khan; A.R. Ganguly; S. Saigal Detection and predictive modeling of chaos in finite hydrological time series, Nonlinear Processes in Geophysics, Volume 12 (2005), pp. 41-53

[Kondratyev and Varotsos, 2002] K.Y. Kondratyev; C. Varotsos Remote sensing and global tropospheric ozone observed dynamics, International Journal of Remote Sensing, Volume 23 (2002), pp. 159-178

[Kondratyev et al., 1995] K.Y. Kondratyev; O.M. Pokrovsky; C.A. Varotsos Atmospheric ozone trends and other factors of surface ultraviolet radiation variability, Environmental Conservation, Volume 22 (1995), pp. 259-261

[Krishnamurthy and Shukla, 2000] V. Krishnamurthy; J. Shukla Intraseasonal and Interannual Variability of Rainfall over India, Journal of Climate, Volume 13 (2000), pp. 4366-4377

[Kuligowski and Barros, 1998] R.J. Kuligowski; A.P. Barros Experiments in short-term precipitation forecasting using artificial neural networks, Monthly Weather Review, Volume 126 (1998), pp. 470-482

[McCann, 1992] D.W. McCann A neural network short-term forecast of significant thunderstorms, Weather and Forecasting, Volume 7 (1992), pp. 525-534

[Maier and Dandy, 2000] H.R. Maier; G.C. Dandy Neural networks for the prediction and forecasting of water resources variables: a review of modelling issues and applications, Environmental modeling and Software, Volume 15 (2000), pp. 101-124

[Mohanty and Mohapatra, 2007] U.C. Mohanty; M. Mohapatra Prediction of occurrence and quantity of daily summer monsoon precipitation over Orissa (India), Meteorological Applications, Volume 14 (2007), pp. 95-103

[Oliveira et al., 2000] K.A.D. Oliveira; A. Vannucci; E.C. da Silva Using artificial neural networks to forecast chaotic time series, Physica A: Statistical Mechanics and its Applications, Volume 284 (2000), pp. 393-404

[Pal and Mitra, 1999] Pal, S.K. and Mitra, S., 1999. Neuro-Fuzzy Pattern Recognition: Methods in Soft Computing, John Wiley & Sons, Inc. New York, NY, USA.

[Perez et al., 2000] P. Perez; A. Trier; J. Reyes Prediction of PM2.5 concentrations several hours in advance using neural networks in Santiago, Chile, Atmospheric Environment, Volume 34 (2000), pp. 1189-1196

[Principe et al., 1992] J.C. Principe; A. Rathie; J.M. Kuo Prediction of chaotic time series with neural networks and the issue of dynamic modeling, International Journal of Bifurcation and Chaos, Volume 2 (1992), pp. 989-996

[Rajeevan, 2001] M. Rajeevan Prediction of Indian summer monsoon: Status, problems and prospects, Current Science, Volume 81 (2001), pp. 1451-1457

[Ramirez et al., 2005] M.C.V. Ramirez; H.F. de Campos Velho; N.J. Ferreira Artificial neural network technique for rainfall forecasting applied to the Sao Paulo region, Journal of Hydrology, Volume 301 (2005), pp. 146-162

[Reddy and Salvekar, 2003] P.R.C. Reddy; P.S. Salvekar Equatorial East Indian Ocean sea surface temperature: A new predictor for seasonal and annual rainfall, Current Science, Volume 85 (2003), pp. 1600-1604

[Rodriguez-Iturbe et al., 1989] I. Rodriguez-Iturbe; B. Febres de Power; M.B. Sharifi; K.P. Georgakakos Chaos in Rainfall, Water Resources Research, Volume 25 (1989), pp. 1667-1675

[Rojas, 1996] R. Rojas Neural networks: a systematic introduction, Springer-Verlag New York, Inc, New York, USA, 1996

[Sallehuddin et al., 2007] R. Sallehuddin; S.M.H. Shamsuddin; S.Z.M. Hashim; A. Abraham Forecasting time series data using hybrid grey relational artificial neural network and autoregressive integrated moving average model, Neural Network World, Volume 6 (2007), pp. 573-605

[Sardeshmukh and Hoskins, 1988] P.D. Sardeshmukh; B.J. Hoskins The Generation of Global Rotational Flow by Steady Idealized Tropical Divergengence, J. Atmos. Sci., Volume 45 (1988), pp. 1228-1251

[Sharifi et al., 1990] M.B. Sharifi; K.P. Georgakakos; I. Rodriguez-Iturbe Evidence of Deterministic Chaos in the Pulse of Storm Rainfall, Journal of Atmospheric Sciences, Volume (1990), pp. 888-893

[Shin et al., 1990] K.S. Shin; G.R. North; Y.S. Ahn; P.A. Arkin Time scales and variability of area-averaged tropical oceanic rainfall, Monthly Weather Review, Volume 11 (1990), pp. 1507-1516

[Silverman and Dracup, 2000] D. Silverman; J.A. Dracup Artificial neural networks and long-range precipitation prediction in California, Journal of Applied Meteorology, Volume 39 (2000), pp. 57-66

[Singh, 1998] C.V. Singh Long term estimation of monsoon rainfall using stochastic models, Int. J. Climatol., Volume 18 (1998) no. 14, pp. 1611-1624

[Sivakumar, 2001] B. Sivakumar Is a chaotic multi-fractal approach for rainfall possible?, Hydrological Processes, Volume 15 (2001), pp. 943-955

[Soltani et al., 2007] S. Soltani; R. Modarres; S.S. Eslamian The use of time series modelling for the determination of rainfall climates of Iran, Int. J. Climatol., Volume 27 (2007) no. 6, pp. 819-829

[Storch and Zwiers, 1999] H.V. Storch; F.W. Zwiers Statistical analysis in climate research, Cambridge University Press, Cambridge, UK, 1999

[Temizel and Caset, 2005] T.T. Temizel; M.C. Caset A comparative study of autoregressive neural network hybrids, Neural Networks, Volume 5-6 (2005), pp. 781-789

[Toth et al., 2000] E. Toth; A. Brath; V. Montanari Comparison of short-term rainfall prediction models for real-time flood forecasting, J. Hydrol., Volume 239 (2000), pp. 132-147

[Trapletti et al., 2000] A. Trapletti; F. Leisch; K. Hornick Stationary and integrated autoregressive neural network processes, Neural Computation, Volume 10 (2000), pp. 2427-2450

[Tzanis and Varotsos, 2008] C. Tzanis; C.A. Varotsos Tropospheric aerosol forcing of climate: a case study for the greater area of Greece, International Journal of Remote Sensing, Volume 29 (2008), pp. 2507-2517

[Varotsos, 2005] C. Varotsos Modern computational techniques for environmental data; Application to the global ozone layer, Computational Science, Volume 3516 (2005), pp. 504-510 (ICCS 2005, PT 3 Book Series: Lecture notes in computer science)

[Varotsos and Krik-Davidoff, 2006] C. Varotsos; D. Krik-Davidoff Long-memory processes in ozone and temperature variations at the region 60 degrees S-60 degrees N, Atmospheric Chemistry and Physics, Volume 6 (2006), pp. 4093-4100

[Varotsos et al., 2007] C. Varotsos; M.N. Assimakopoulos; M. Efstathiou Technical Note: Long-term memory effect in the atmospheric CO2 concentration at Mauna Loa, Atmospheric Chemistry and Physics, Volume 7 (2007), pp. 629-634

[Visser and Molenaar, 1995] H. Visser; J. Molenaar Trend estimation and regression analysis in climatological time series: An application of structural time series models and the Kalman filter, J. Climate, Volume 8 (1995), pp. 969-979

[Wilks, 1998] D.S. Wilks Multisite generalization of a daily stochastic precipitation generation model, J. Hydrol., Volume 210 (1998), pp. 178-191

[Willmott, 1982] C.J. Willmott Some Comments on the Evaluation of Model Performance, Bulletin of American Meteorological Society, Volume 63 (1982), pp. 1309-1313

[Woodward and Gray, 1995] W.A. Woodward; H.L. Gray Selecting a model for detecting the presence of a trend, J. Climate, Volume 8 (1995), pp. 1929-1937

[Woodward and Gray, 1993] W.A. Woodward; H.L. Gray Global warming and the problem of testing for trend in time series data, J. Climate, Volume 6 (1993), pp. 953-962

Commentaires - Politique