Comptes Rendus

External geophysics, climate
Trend estimation and univariate forecast of the sunspot numbers: Development and comparison of ARMA, ARIMA and Autoregressive Neural Network models
Comptes Rendus. Géoscience, Volume 343 (2011) no. 7, pp. 433-442.

Abstracts

In the present study, a prominent 11-year cycle is observed in the mean annual sunspot number time series for the period 1749 to 2007; it is supported by the pattern of the autocorrelation function and by measures of Euclidean distance. The trend in the yearly sunspot series, which is found to be non-normally distributed, is examined through the Mann-Kendall non-parametric test. A statistically significant increasing trend is observed in the annual sunspot series. The results indicate that the autoregressive neural network-based model performs much better than the autoregressive moving average and autoregressive integrated moving average-based models for the univariate forecast of the yearly mean sunspot numbers.

Metadata
DOI: 10.1016/j.crte.2011.07.008
Keywords: Sunspot, 11-year cycle, Trend, Mann-Kendall non-parametric test, ARMA, ARIMA, Artificial neural network

Surajit Chattopadhyay 1; Deepak Jhajharia 2; Goutami Chattopadhyay 3

1 Department of Computer Application, Pailan College of Management and Technology, West Bengal University of Technology, Kolkata 700 104, West Bengal, India
2 Department of Agricultural Engineering, North Eastern Regional Institute of Science and Technology (Deemed University), Nirjuli, Itanagar 791 109, Arunachal Pradesh, India
3 Institute of Radio Physics and Electronics, University of Calcutta, Kolkata 700 009, West Bengal, India
@article{CRGEOS_2011__343_7_433_0,
     author = {Surajit Chattopadhyay and Deepak Jhajharia and Goutami Chattopadhyay},
     title = {Trend estimation and univariate forecast of the sunspot numbers: {Development} and comparison of {ARMA,} {ARIMA} and {Autoregressive} {Neural} {Network} models},
     journal = {Comptes Rendus. G\'eoscience},
     pages = {433--442},
     publisher = {Elsevier},
     volume = {343},
     number = {7},
     year = {2011},
     doi = {10.1016/j.crte.2011.07.008},
     language = {en},
}
Surajit Chattopadhyay; Deepak Jhajharia; Goutami Chattopadhyay. Trend estimation and univariate forecast of the sunspot numbers: Development and comparison of ARMA, ARIMA and Autoregressive Neural Network models. Comptes Rendus. Géoscience, Volume 343 (2011) no. 7, pp. 433-442. doi : 10.1016/j.crte.2011.07.008. https://comptes-rendus.academie-sciences.fr/geoscience/articles/10.1016/j.crte.2011.07.008/

1 Introduction

Solar activity forecasting is an important tool in various scientific and technological fields, such as the operation of low-Earth-orbiting satellites, electric power transmission lines, high-frequency radio communications and geophysical applications (e.g. Alexandris et al., 1999; Katsambas et al., 1997). The particles and electromagnetic radiation released by outbursts of solar activity are also important for studying long-term climatic variations (Sello, 2001). The most easily observed and longest-recorded features of solar activity are the dark regions on the solar disk, i.e., the sunspots. Their number per time interval represents an index of the general solar magnetic activity (Echer et al., 2004a). The sunspot number (SN) series is the longest running direct record of solar activity, with reliable observations dating back to the year 1610, soon after the invention of the telescope (Usoskin et al., 2003). SN is connected with solar flares, which strongly affect the Earth's magnetosphere and the radiation environment. The mean solar magnetic field value is related to the interplanetary magnetic field, which controls the average conditions of the Earth's magnetosphere (Dmitriev et al., 1999). The number of sunspots is characterized by a long-term temporal variation, reaching its maximum or minimum approximately every 11 years (the solar cycle). This variation, in turn, affects the global climate (Tzanis and Varotsos, 2008; Varotsos and Cracknell, 2004). Many atmospheric phenomena exhibit decadal variability on both regional and global scales. Such phenomena have often been related empirically to solar cycle variations, on the order of 11 or sometimes 22 years (Rind, 2002). Shindell et al. (1999) showed how upper stratospheric ozone changes may amplify observed 11-year solar cycle irradiance changes to affect climate.
The latter is evidence of the interplay between the photochemistry and dynamics of the ozone layer, especially over the middle and high latitudes in both hemispheres (Varotsos, 2002). A connection between the 11-year solar cycle, the QBO and total ozone was discussed by Varotsos (1989). Varotsos and Cracknell (2004) identified the existence of the Gutenberg-Richter seismic law within the solar cycle, which was further validated by Chattopadhyay and Chattopadhyay (2008) using the Artificial Neural Network (ANN) and factor analysis. Sunspots grow over a few days and last from several days to a few months, indicating enhanced solar activity that modulates the weather and the climate (Chattopadhyay and Chattopadhyay, 2008). The association between climatic variation and the variation in the solar cycle has been studied by several authors, such as Willett (1951), Friis-Christensen and Lassen (1991), Fröhlich and Lean (1998) and many others. Haigh (2006) used a general-circulation model (GCM) to investigate the impact of the 11-year solar-activity cycle on the climate of the lower atmosphere.

The prediction of solar cycles, a major research area for a long time, requires knowledge of the solar dynamo, a term that includes the processes involved in the production, transport, and destruction of the solar magnetic field (Pesnell, 2008). There are two methods of predicting future solar activity. The first is the purely numeric method, which explores the periodicity within the SN time series and does not depend upon the intrinsic physics. Examples in this direction are Rigozo and Nordemann (1998), Rigozo et al. (2001) and Echer et al. (2004b). The other is the precursor method, which is based on correlating a solar or geophysical parameter from earlier in the solar cycle with SNs at solar maximum or at some other point later in the cycle (Echer et al., 2004a). Examples in this direction are Kane (2001) and Ahluwalia and Ygbuhay (2009). The work presented in this article falls into the first category.

The rate of solar flares and the amount of energy released by them are well correlated with the SN, as is the rate of coronal mass ejections (Pesnell, 2008). The SN time series provides the longest existing record of solar activity, and is thus the best available data set for studying the long-term evolution of solar activity and, in particular, of the 11-year activity cycle (Baranovski et al., 2008). Very recently, Gil and Luis (2009) studied the monthly SN time series and showed that it displays long memory, with a large degree of dependence between the observations that disappears very slowly in time. Standard autoregressive models have been used to predict solar cycles (Hathaway et al., 1999). However, in non-linear and non-stationary situations, the conventional autoregressive processes are not very useful and are therefore less widely accepted. ANN, the mathematical analogue of the nervous system, has been recognized by geoscientists as a powerful alternative to conventional statistical approaches for developing predictive models of complex geophysical systems (e.g. Gardner and Dorling, 1998; Hsieh and Tang, 1998). Its ability to identify a relationship from given patterns makes it possible for ANNs to solve large-scale complex problems such as pattern recognition, non-linear modelling, classification, association, and control (Tayfur, 2002). Forecasting time series by ANN has been discussed in papers such as Lin et al. (1994) and Dorffner (1996). The complexity of the solar cycle is already documented in Kurths and Ruzmaikin (1990), Price et al. (1992) and Sello (2001). Consequently, the study of the solar cycle has also been an area where ANN has gained significant importance. In a multivariate environment, Lundstedt (1992) implemented ANN for predictions of solar-terrestrial effects.
Sello (2001) showed that the monthly mean SNs contain true non-linear dependencies with the use of the method of surrogate data combined with the computation of linear and non-linear redundancies. Calvo et al. (1995) showed how the ANN can capture the intrinsic dynamics of solar activity, producing good long-term forecasting for periods up to a complete cycle. Conway et al. (1998) examined the use of feed forward neural networks in the long-term prediction of SNs. Park and Woo (2009) compared the performance of two types of ANN in forecasting the monthly SNs.

Trend analysis of geophysical time series has long been used to examine various climatic aspects. The trend in the Indian summer monsoon rainfall has been analysed by Ghosh et al. (2009), Guhathakurta and Rajeevan (2008), Pal and Al-Tabbaa (2011) and Rana et al. (2011). Both parametric and non-parametric methods have been employed for identifying trends in data. However, recent studies have shown that non-parametric tests are more suitable for non-normally distributed and censored data, including missing values, which are frequently encountered in hydrological time series. These methods are also less influenced by the presence of outliers in the data. Among them, the Mann-Kendall (MK) test (Mann, 1945; Kendall, 1975) is one of the most popular methods for trend analysis in meteorological time series (Chiew and McMahon, 1993; Dinpashoh et al., 2011; Jhajharia et al., 2011; Yue and Pilon, 2004; Zetterqvist, 1991). The influence of serial correlation on the MK test in trend-detection studies of hydrological time series has been studied by Yue and Wang (2002). In a recent study, Jhajharia and Singh (2011) applied the MK test to study the trends in temperature, diurnal temperature range and sunshine duration in Northeast India. In another recent study, Jhajharia et al. (2011) investigated the trends in the reference evapotranspiration estimated through the Penman-Monteith method over the humid region of Northeast India by using the MK test after removing the effect of significant lag-1 serial correlation from the time series of reference evapotranspiration by pre-whitening. Using long-term ionosonde measurements in mid-latitudes, Bremer (1992) carried out a trend analysis and identified a decrease in the peak height of the ionospheric F2-layer. The geomagnetic activity, which is controlled by the Sun and its solar wind, has been tested for trend with the MK test by Love (2011).
In the present article, we first study the 11-year solar cycle in terms of the autocorrelation function (ACF) and the Euclidean distance (ED), which are being used for the first time to study the SN series and solar activity. Afterwards, we analyze the trend within the SN time series using the MK test. Details of the test procedure are presented in subsequent sections. In the next phase, instead of dealing with the raw time series, we transform the annual SN data by the Box-Cox transformation (Box and Cox, 1964), and then study the probability distribution of the transformed time series. After removing the irregularities and trends from the time series by the Box-Cox transformation, stationary and non-stationary methodologies are applied to the time series. Lastly, we implement the autoregressive neural network (ARNN) model and compare it with the ARMA and ARIMA models.

2 Data analysis

For the present study, the mean annual SNs for the period 1749 to 2007 have been collected from the website of NOAA, Boulder, Colorado (http://www.ngdc.noaa.gov/stp/solar/ssndata.html, last accessed on 24th December, 2010). First, the nature of the ACF (Wilks, 2006) of the SN time series has been investigated up to lag 200; it is presented in Fig. 1a. It is observed that the ACF decays gradually in a sinusoidal manner. This indicates that the time series is stationary. The most interesting feature of the time series is that the maximum autocorrelation coefficients (ACC) occur at lags separated by 11. This means that the entries of the time series attain their highest positive association with the first entry at intervals of 11 years. In the subsequent sections, we view the 11-year cycle using the concept of ED and generate univariate models for the SN time series using various methods. In Fig. 1b, we present a periodogram, whose most prominent peak is found at a period of about 11 years. This further supports the presence of an 11-year cycle in the SN time series.
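As a rough illustration of the two diagnostics described above, the following sketch computes a sample autocorrelation function and a raw periodogram in plain Python. It is not the authors' code: the function names are hypothetical, and the input is a synthetic, noise-free series with an exact 11-sample cycle that stands in for the annual SN data.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation coefficient of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom

def dominant_period(series):
    """Period (in samples) of the largest ordinate of the raw periodogram."""
    n = len(series)
    mean = sum(series) / n
    best_power, best_k = -1.0, 1
    for k in range(1, n // 2 + 1):
        c = sum((x - mean) * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(series))
        s = sum((x - mean) * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(series))
        power = (c * c + s * s) / n
        if power > best_power:
            best_power, best_k = power, k
    return n / best_k

# Assumption: synthetic stand-in for the annual series (24 full cycles of length 11).
toy = [50 + 40 * math.sin(2 * math.pi * t / 11) for t in range(264)]

# Lag (between 1 and 39) at which the autocorrelation is largest.
acf = [autocorrelation(toy, k) for k in range(1, 40)]
peak_lag = 1 + max(range(len(acf)), key=acf.__getitem__)
```

For the real SN record the peak would not fall exactly on an integer period; the periodogram in Fig. 1b reports 10.792.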

Fig. 1

a: autocorrelation function of the sunspot number time series under study computed up to several lags; b: Periodogram showing the maximum peak at the period of 10.792. This indicates an 11-year cycle.

3 Trend estimation in Sunspot number time series

Chattopadhyay et al. (2011) discussed that parametric methods require the data to be independent and normally distributed, whereas, even for small departures from normality, non-parametric methods are sometimes better than parametric methods. Non-parametric methods use the ranks of observations rather than their actual values, which relaxes the requirements concerning the distribution of the data. The MK test, which is a non-parametric test, provides a test for positive or negative trends that are not necessarily linear. It provides a robust test for trend, free from assumptions about the mathematical form of the trend or the probability distribution of errors (Hasanean, 2001). The mathematical background of the MK test is discussed at length in papers such as Chattopadhyay et al. (2011), Jhajharia et al. (2009), Onoz and Bayazit (2003) and Yue et al. (2002). The MK trend test first computes the S statistic given by (Dinpashoh et al., 2011; Jhajharia et al., 2011)

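$$S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{sgn}(x_j - x_i)$$
where n is the number of observations, $x_j$ is the jth observation, and sgn(.) is the sign function, which can be defined as
$$\operatorname{sgn}(\theta) = \begin{cases} 1 & \text{if } \theta > 0 \\ 0 & \text{if } \theta = 0 \\ -1 & \text{if } \theta < 0 \end{cases}$$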

Under the assumption that data are independent and identically distributed, the mean and variance of the S statistic are given by:

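$$E(S) = 0$$
$$\operatorname{Var}(S) = \frac{n(n-1)(2n+5) - \sum_{i=1}^{m} t_i (t_i - 1)(2 t_i + 5)}{18}$$
where m is the number of groups of tied ranks, each with $t_i$ tied observations. The original MK statistic, designated by Z, can be computed as
$$Z = \begin{cases} \dfrac{S - 1}{\sqrt{\operatorname{Var}(S)}} & S > 0 \\ 0 & S = 0 \\ \dfrac{S + 1}{\sqrt{\operatorname{Var}(S)}} & S < 0 \end{cases}$$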

If $-Z_{1-\alpha/2} \le Z \le Z_{1-\alpha/2}$, then the null hypothesis of no trend is accepted at the significance level α. Otherwise, the null hypothesis is rejected and the alternative hypothesis is accepted at the significance level α. In the present article, we are dealing with the time series of annual SNs. However, to have a clear view of the data under consideration, we have also examined the trends in the SN time series on a monthly scale. The MK non-parametric test was used to examine the trends because the SN time series are non-normally distributed. The results are displayed in Table 1.
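The test procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the input is a made-up increasing sequence, and 1.96 is the usual two-sided normal critical value for α = 0.05.

```python
import math
from collections import Counter

def sign(value):
    """Sign function: 1, 0 or -1."""
    return (value > 0) - (value < 0)

def mann_kendall(series):
    """Return (S, Var(S), Z) for the Mann-Kendall trend test."""
    n = len(series)
    s = sum(sign(series[j] - series[i])
            for i in range(n - 1) for j in range(i + 1, n))
    # Tie correction: one term per group of t equal observations.
    tie_term = sum(t * (t - 1) * (2 * t + 5)
                   for t in Counter(series).values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, var_s, z

# Assumption: toy data, a strictly increasing sequence of ten values.
s, var_s, z = mann_kendall([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
trend_at_5_percent = abs(z) > 1.96  # two-sided test at alpha = 0.05
```

The double sum makes the cost quadratic in the series length, which is negligible for a record of 259 annual values.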

Table 1

Results of the MK test.

Period | MK-test statistic | Observation
January | 1.98231 | Rising trend at 5% significance level
February | 1.95139 | Rising trend at 10% significance level
March | 2.26572 | Rising trend at 5% significance level
April | 1.97941 | Rising trend at 5% significance level
May | 1.90333 | Rising trend at 10% significance level
June | 2.41074 | Rising trend at 1% significance level
July | 2.47742 | Rising trend at 1% significance level
August | 2.38199 | Rising trend at 1% significance level
September | 2.0655 | Rising trend at 5% significance level
October | 2.17385 | Rising trend at 5% significance level
November | 1.39374 | No trend
December | 1.60689 | No trend
Annual | 2.13507 | Rising trend at 5% significance level

From Table 1, it is observed that there is a rising trend in the annual SN time series, which resembles the nature of the time series in most of the months.

4 Euclidean distance approach

The ED approach is not new in the field of meteorology. Elmore and Richman (2001) used ED as a similarity metric for Principal Component Analysis. Geometrically, the ED between two points is the shortest possible distance between them. The majority of the cluster analyses in the climatological literature have been based upon EDs because of the many useful properties of the Euclidean metric. Most of the recent developments in clustering algorithms have also predominantly involved the use of EDs (Mimmack et al., 2001). To view the time series from the point of view of ED, we write the time series of SNs as X = {x1, x2, x3,…, x259}, where x1 indicates the mean SN of the year 1749 and x2, x3,…, x259 indicate the mean SNs of the subsequent years. From the series X, we have generated new series X1, X2, etc., where Xk indicates the time series starting from xk. The ED between X and Xk is defined as:

$$d_k = \sqrt{(x_1 - x_k)^2 + (x_2 - x_{k+1})^2 + \dots + (x_{259-k+1} - x_{259})^2} \tag{1}$$

After computing the EDs for several lags k, we have plotted them in Fig. 2. The figure shows that, like the ACF, dk follows a wave pattern with peaks at 11-year intervals. This further indicates the existence of the 11-year solar cycle, which is well established in the literature.
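The distance in Eq. (1) is straightforward to compute. The sketch below is only an illustration under stated assumptions: the function name is hypothetical and the input is a synthetic, noise-free 11-sample cycle rather than the SN record, so it shows the mechanics of the calculation and not the shape of Fig. 2.

```python
import math

def euclidean_distance(series, k):
    """d_k between X and X_k, where X_k starts at the k-th element (1-based)."""
    offset = k - 1
    return math.sqrt(sum((series[i] - series[i + offset]) ** 2
                         for i in range(len(series) - offset)))

# Assumption: synthetic stand-in with 259 values, matching the length of the record.
toy = [50 + 40 * math.sin(2 * math.pi * t / 11) for t in range(259)]
distances = {k: euclidean_distance(toy, k) for k in range(1, 35)}
```

In this idealised series the distance oscillates with the same 11-sample spacing as the cycle itself, returning to nearly zero whenever the offset k - 1 is a whole number of cycles.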

Fig. 2

This figure shows the variation of Euclidean distance against lag-k. The Euclidean distances have been computed before Box-Cox transformation.

So far, we have considered the yearly mean SNs in their raw form. Now we examine whether the time series of annual SNs is normally distributed. The Anderson-Darling test (Anderson and Stern, 1996; Elmore, 2005) was used to test the normality of the SN time series, with the null hypothesis that the data are normally distributed. We find that the value of the test statistic under the null hypothesis is 6.078, and the null hypothesis is therefore rejected. Hence, the data are not normally distributed. For further study, we transformed the time series by means of the Box-Cox transformation. This transformation removes the trend components and other irregularities from the time series. The Box-Cox transformation is given by (Seleshi et al., 1994):

$$Z_t = \frac{X_t^{\lambda} - 1}{\lambda\, G^{\lambda - 1}} \tag{2}$$
where Xt is the original time series, G is the sample geometric mean, λ is the transformation parameter, and Zt is the transformed series. In this paper, we have taken λ = 0.359. The autocorrelation structure after the Box-Cox transformation is presented in Fig. 3; it resembles the pattern of the original time series. After transforming the time series, we test for normality and find that the transformed time series follows a normal distribution with mean 0 and standard deviation 3.5.
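A minimal sketch of the geometric-mean-normalised transformation in Eq. (2) is given below. The function name and the sample values are hypothetical; only λ = 0.359 is taken from the text. The sketch assumes strictly positive data and a non-zero λ, since the logarithm used for the geometric mean is undefined otherwise.

```python
import math

def box_cox(series, lam):
    """Geometric-mean-normalised Box-Cox transform of Eq. (2).

    Assumes strictly positive data and lam != 0 (the log limit is not handled).
    """
    if any(x <= 0 for x in series):
        raise ValueError("Box-Cox needs strictly positive values")
    geometric_mean = math.exp(sum(math.log(x) for x in series) / len(series))
    scale = lam * geometric_mean ** (lam - 1)
    return [(x ** lam - 1) / scale for x in series]

# Assumption: made-up positive values, not the actual sunspot numbers.
values = [5.0, 11.0, 16.0, 23.0, 36.0, 58.0, 29.0, 20.0, 10.0, 8.0]
transformed = box_cox(values, 0.359)
```

Because the transform is monotonic for positive data, the ordering of the observations is preserved; only their spacing changes.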

Fig. 3

Autocorrelation function of the sunspot number time series under study computed up to several lags after Box-Cox transformation.

The plot presented in Fig. 4 represents a normal distribution followed by the Box-Cox transformed time series. However, before Box-Cox transformation, the mean and the standard deviation were 52.423 and 41.491, and when we try to fit a normal distribution to it, we find a plot in Fig. 5, which is not symmetric about the mean. This indicates that before the transformation, the data series was not normal.

Fig. 4

Distribution after Box-Cox transformation. The vertical axis represents the probability density.

Fig. 5

Distribution before Box-Cox transformation. The vertical axis represents the probability density.

5 Development of ARMA and ARIMA models

After the Box-Cox transformation, the data series is free from trend components and other irregularities. As the ACF shows a wave pattern with significant positive and negative spikes gradually tending to 0, we apply a stationary approach (Moran, 1954), namely the autoregressive moving average (ARMA) model, to this time series. An ARMA (p, q) process is expressed as (Box et al., 2007):

$$\phi(B)\,\bar{x}_t = \theta(B)\,a_t \tag{3}$$
where φ(B) and θ(B) are polynomials of degree p and q, respectively, B is the backward shift operator, and $\bar{x}_t = x_t - \mu$, where μ is the sample mean. A detailed description of the ARIMA process and its theoretical background is available in Box et al. (2007).

As we have already seen that there is an 11-year cycle within the time series, we go up to autoregressive order 11, with moving average orders of 1 and 2. Therefore, 22 ARMA (p, q) models are tested in total. The goodness of fit of the models is tested through the Akaike Information Criterion (AIC) (Wilks, 2006). The model with the minimum AIC is taken as the best fit. We find that the AIC attains its minimum (= 913.11) for the model ARMA (11, 1). Therefore, the yearly mean SNs are best represented by ARMA (11, 1), where the order of autoregression is 11 and the order of the moving average is 1.
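The idea of selecting the order by minimum AIC can be illustrated with a deliberately simplified sketch. It is not the procedure used in this study: it fits pure autoregressive models (no moving average term) by ordinary least squares, uses the Gaussian form AIC = n ln(RSS/n) + 2k, and runs on a synthetic noisy cycle. All function names and parameter values are hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ar_aic(series, order, max_order):
    """AIC of a least-squares AR(order) fit; every order uses the same target rows."""
    rows = [[1.0] + [series[t - i] for i in range(1, order + 1)]
            for t in range(max_order, len(series))]
    targets = series[max_order:]
    k = order + 1  # intercept plus AR coefficients
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    coef = solve(xtx, xty)
    rss = sum((y - sum(c * v for c, v in zip(coef, r))) ** 2
              for r, y in zip(rows, targets))
    n = len(rows)
    return n * math.log(rss / n) + 2 * (k + 1)  # +1 for the noise variance

# Assumption: synthetic 11-sample cycle with Gaussian noise, fixed seed.
rng = random.Random(1)
toy = [50 + 40 * math.sin(2 * math.pi * t / 11) + rng.gauss(0, 5) for t in range(259)]
aic_by_order = {p: ar_aic(toy, p, max_order=11) for p in range(1, 12)}
best_order = min(aic_by_order, key=aic_by_order.get)
```

Conditioning every fit on the first 11 values keeps the number of fitted points identical across orders, so the AIC values are comparable. A full ARMA or ARIMA comparison would normally rely on a maximum-likelihood routine from a statistics package.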

We mentioned previously that, because of the sinusoidal and gradually decaying autocorrelation pattern of the time series under consideration, we use a stationary approach. However, we also take into account the further increase in autocorrelation after lag 56. For this reason, we also try a non-stationary approach in this section, namely autoregressive integrated moving average (ARIMA) modelling (Box et al., 2007). In general, this process is denoted as ARIMA(p, d, q), where d denotes the number of times the stationary process is summed. An ARIMA(p, d, q) process is expressed as

$$\phi(B)\,\nabla^{d} x_t = \theta(B)\,a_t \tag{4}$$

The difference operator on the left-hand side of (4) is equal to (1 - B). Here also, we have examined autoregressive orders up to 11 and moving average orders up to 2, keeping d fixed at 1. Consequently, 22 models have been generated. The AIC attains its minimum (= 921.13) at ARIMA (8, 1, 1). Thus, we find that ARIMA (p, d, q) requires only 8 previous values of the time series as predictors, whereas ARMA (p, q) requires 11. Some examples of the application of ARIMA to non-stationary meteorological time series include El-Fandy et al. (1994), Visser and Molenaar (1995) and Woodward and Gray (1993, 1995).

The question that naturally arises is which of the two is better: ARMA (11, 1) or ARIMA (8, 1, 1). To answer this question, we need a statistic that measures the potential of a model to make forecasts. For this purpose we compute Willmott's index, which was advocated by Willmott (1982) to measure the degree of agreement between actual and predicted values. It is given as (Willmott, 1982)

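$$d = 1 - \left[\sum_{i} \left(P_i - O_i\right)^2\right] \left[\sum_{i} \left(\left|P_i - \bar{O}\right| + \left|O_i - \bar{O}\right|\right)^2\right]^{-1} \tag{5}$$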

where Pi, Oi, and O with upper bar denote the ith predicted value, the ith observed value, and the mean of the observed values, respectively. Willmott (1982) recommended this statistic to assess the degree to which a model output fits an observed dataset. This statistic has been used by Chattopadhyay and Chattopadhyay (2009) and Comrie (1997). For both ARMA (11, 1) and ARIMA (8, 1, 1), the values of d are approximately equal to 0.96. This indicates that the two models are equally capable of forecasting the time series of yearly mean SNs. However, as fewer predictors are needed in ARIMA (8, 1, 1), it appears to be more useful than ARMA (11, 1) for predicting the yearly mean SNs.
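Willmott's index in Eq. (5) can be computed directly from paired predictions and observations. The sketch below is illustrative only; the function name and the toy numbers are assumptions, not values from this study.

```python
def willmott_index(predicted, observed):
    """Willmott's index of agreement d (1 means perfect agreement)."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    potential_error = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                          for p, o in zip(predicted, observed))
    return 1.0 - squared_error / potential_error

# Assumption: toy observations and forecasts.
observed = [1.0, 2.0, 3.0]
perfect = willmott_index(observed, observed)
mean_only = willmott_index([2.0, 2.0, 2.0], observed)
```

A forecast that always returns the observed mean scores 0 in this toy case, whereas a perfect forecast scores 1, which is why values near 0.96 indicate close agreement.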

6 Development of Autoregressive Neural Network (ARNN) model

ANN is now applied to the time series under consideration in an autoregressive manner. The form of ANN used in this article is the multilayer perceptron (MLP), the most popular form of ANN used for prediction. The application of ANN to forecasting the SNs is not new. The computations carried out inside the ANN amount to: (i) performing the weighted average of its impinging inputs; (ii) sending this average through a (biased) sigmoid function; and (iii) forwarding the sigmoid-function output to the next layer of neurons (Calvo et al., 1995). The novelty of our approach is that we examine the possibility of forecasting the yearly mean SNs by ANN in an autoregressive manner. In the autoregressive approach, the values of a time series are predicted on the basis of the previous values of the same time series. An autoregressive model of order p implies that the current value of the time series is predicted on the basis of the past p values of the same random variable. Thus, a linear autoregressive model of order p can be expressed using p previous values of the time series and a noise term as follows (Chattopadhyay and Chattopadhyay, 2009):

$$x_t = \sum_{i=1}^{p} \alpha_i x_{t-i} + \varepsilon_t \tag{6}$$

In the above equation, a linear function FL can be introduced, in which case the equivalent form of the above equation would be (Chattopadhyay and Chattopadhyay, 2009):

$$x_t = F^{L}\left(x_{t-1}, x_{t-2}, \dots, x_{t-p}\right) + \varepsilon_t \tag{7}$$

Replacing the linear function FL by an MLP leads to the ARNN model. Since we have already identified the 11-year cycle in the ACF, we vary the number of predictors from 1 to 11. Here, we denote the proposed neural network model as ARNN (n), where n is the number of predictors derived from the time series. Thus, 11 ARNN (n) models are generated by varying n from 1 to 11. The ARNN (n) models are trained as MLPs (Gardner and Dorling, 1998) with sigmoid non-linearity. In mathematical form, the adaptive procedure of a feed forward MLP can be presented as (Kamarthi and Pittner, 1999):

$$w_{k+1} = w_k + \eta\, d_k \tag{8}$$

The above equation represents an iterative process that finds the optimal weight vector by adapting the initial weight vector w0. This adaptation is performed by sequentially presenting to the network a set of pairs of input and target vectors. The positive constant η is called the learning rate. The direction vector dk is the negative gradient of the output error function E. Mathematically, it is denoted as:

$$d_k = -\nabla E(x_k) \tag{9}$$

We use the adaptive gradient learning method to train the ARNN (n) models. The stopping criterion is the minimization of the mean squared error. The entire dataset, in each case, is divided into training and test cases in the ratio 7:3. In each case, an extensive variable selection procedure is followed by stepwise multiple regression analysis. The type of non-linearity in the ANN depends upon the type of activation function chosen to train the ANN. Gardner and Dorling (1998) have shown that the sigmoid activation function $f(x) = (1 + e^{-x})^{-1}$ performs better than other types of activation function. After generating the ARNN (n) models, the Willmott's indices are plotted in Fig. 6. The figure shows that the Willmott's indices attain their maximum for ARNN (1) and ARNN (11). Thus, from Fig. 6 it can be concluded that these two models produce the best forecasts. However, the most important fact that emerges from this observation is that the neural network can perform very well even when it is provided with only one predictor. At this juncture, it can therefore be concluded that ARNN in general is more effective than ARMA or ARIMA (which require more predictors) in producing a univariate forecast for the yearly mean SN time series. To view the quality of the forecasts pictorially, we have created scatterplots in Figs. 7 and 8, which show that ARNN (1) produces forecasts with a coefficient of determination R2 of 0.98.
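To make the autoregressive set-up concrete, the following sketch builds lagged input and target pairs, splits them 7:3 in chronological order, and trains a very small MLP with the update rule of Eq. (8). It is a sketch under several assumptions and not the configuration used in this study: plain stochastic gradient descent replaces the adaptive gradient method, the network has three sigmoid hidden units and a linear output, the data are a synthetic cycle scaled to lie between 0 and 1, and all names and parameter values are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_lagged(series, n_lags):
    """Pairs ([x_{t-n}, ..., x_{t-1}], x_t) for an autoregressive model."""
    return [(series[t - n_lags:t], series[t]) for t in range(n_lags, len(series))]

def predict(weights, inputs):
    """Forward pass: sigmoid hidden layer, linear output. Returns (output, hidden)."""
    hidden_w, output_w = weights
    hidden = [sigmoid(sum(w * v for w, v in zip(row, inputs + [1.0])))
              for row in hidden_w]
    return sum(w * h for w, h in zip(output_w, hidden + [1.0])), hidden

def train(samples, n_hidden=3, eta=0.1, epochs=300, seed=0):
    """Plain stochastic gradient descent on the squared error (assumed settings)."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    hidden_w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    output_w = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for inputs, target in samples:
            out, hidden = predict((hidden_w, output_w), inputs)
            err = out - target  # dE/d(out) for E = 0.5 * err ** 2
            # Hidden-layer step uses the output weights before they are updated.
            for j, h in enumerate(hidden):
                grad_hidden = err * output_w[j] * h * (1.0 - h)
                for i, v in enumerate(inputs + [1.0]):
                    hidden_w[j][i] -= eta * grad_hidden * v
            for j, h in enumerate(hidden + [1.0]):
                output_w[j] -= eta * err * h
    return hidden_w, output_w

def mse(weights, samples):
    return sum((predict(weights, x)[0] - y) ** 2 for x, y in samples) / len(samples)

# Assumption: synthetic 11-sample cycle scaled to (0, 1), two lagged predictors.
toy = [0.5 + 0.4 * math.sin(2 * math.pi * t / 11) for t in range(120)]
samples = make_lagged(toy, n_lags=2)
split = int(0.7 * len(samples))  # 7:3 split, kept in time order
train_set, test_set = samples[:split], samples[split:]
weights = train(train_set)
test_error = mse(weights, test_set)
```

Keeping the split chronological means the test cases always lie after the training cases, which mimics a genuine forecasting situation.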

Figure 6

The Willmott's indices for various orders of Autoregressive Neural Network (ARNN).

Figure 7

The scatterplot for Autoregressive Neural Network (ARNN) (1).

Figure 8

The scatterplot for Autoregressive Neural Network (ARNN) (11).

7 Concluding remarks

In conclusion, we observe a prominent 11-year cycle in the mean yearly SN time series, supported by the pattern of the ACF and by measures of ED. Trend analysis by means of the non-parametric MK test reveals that the yearly SN time series exhibits an increasing trend. Although the series has a systematic autocorrelation pattern, it is not normally distributed. Application of the Box-Cox transformation converts it to a normally distributed time series. Finally, for the univariate forecast of the yearly mean SNs, the ARNN model performs better than the autoregressive moving average and the autoregressive integrated moving average models.

Acknowledgements

The first author wishes to acknowledge the warm hospitality provided by Inter University Centre for Astronomy and Astrophysics, Pune, India, where a part of this study was carried out during the visit from 30 December, 2010 to 10 January, 2011. The authors wish to sincerely acknowledge the constructive suggestions from the reviewers to enhance the quality of the work.


References

[Ahluwalia and Ygbuhay, 2009] H.S. Ahluwalia; R.C. Ygbuhay Preliminary forecast for the peak of solar activity cycle 24, Adv. Space Res., Volume 44 (2009), pp. 611-614

[Alexandris et al., 1999] D. Alexandris; C. Varotsos; K.Y. Kondratyev; G. Chronopoulos On the altitude dependence of solar effective UV, Physics and Chemistry of the Earth, Part C: Solar-Terrestrial and Planetary Science, Volume 24 (1999), pp. 515-517

[Anderson and Stern, 1996] J.S. Anderson; W.F. Stern Evaluating the potential predictive utility of ensemble forecasts, J. Clim., Volume 9 (1996), pp. 260-269

[Baranovski et al., 2008] A.L. Baranovski; F. Clette; V. Nollau Nonlinear solar cycle forecasting: theory and perspectives, Ann. Geophys., Volume 26 (2008), pp. 231-241

[Box and Cox, 1964] G.E.P. Box; D.R. Cox An analysis of transformations, J. Roy. Statist. Soc. Ser. B, Volume 26 (1964), pp. 211-252

[Box et al., 2007] G.E.P. Box; G.M. Jenkins; G.C. Reinsel Time Series Analysis: Forecasting and Control, Pearson Education Inc. and Dorling Kindersley Publishing Inc., 2007

[Bremer, 1992] J. Bremer Ionospheric trends in mid-latitudes as a possible indicator of the atmospheric greenhouse effect, J. Atmos. Terrest. Phys., Volume 54 (1992), pp. 1505-1511

[Calvo et al., 1995] R.A. Calvo; H.A. Ceccato; R.D. Piacentini Neural network prediction of solar activity, Astrophys. J., Volume 444 (1995), pp. 916-921

[Chattopadhyay and Chattopadhyay, 2008] S. Chattopadhyay; G. Chattopadhyay A factor analysis and neural network-based validation of the Varotsos-Cracknell theory on the 11-year solar cycle, Inter. J. Remote Sensing, Volume 29 (2008), pp. 2775-2786

[Chattopadhyay and Chattopadhyay, 2009] G. Chattopadhyay; S. Chattopadhyay Autoregressive forecast of monthly total ozone concentration: a neurocomputing approach, Comput. Geosci., Volume 35 (2009), pp. 1925-1932

[Chattopadhyay et al., 2011] G. Chattopadhyay; D. Jhajharia; S. Chattopadhyay Univariate modeling of monthly maximum temperature time series over Northeast India: neural network versus Yule-Walker equation based approach, Meteorol. Appl., Volume 18 (2011), pp. 70-82

[Chiew and McMahon, 1993] F.H.S. Chiew; T.A. McMahon Detection of trend or change in annual flow of Australian rivers, Inter. J. Climatol., Volume 13 (1993), pp. 643-653

[Comrie, 1997] A.C. Comrie Comparing neural networks and regression models for ozone forecasting, J. Air Waste Manage. Assoc., Volume 47 (1997), pp. 653-663

[Conway et al., 1998] A.J. Conway; K.P. Macpherson; G. Blacklaw; J.C. Brown A neural network prediction of solar cycle 23, J. Geophys. Res., Volume 103 (1998), pp. 29733-29742

[Dinpashoh et al., 2011] Y. Dinpashoh; D. Jhajharia; A. Fakheri-Fard; V.P. Singh; E. Kahya Trends in reference evapotranspiration over Iran, J. Hydrol., Volume 399 (2011), pp. 422-433

[Dmitriev et al., 1999] A. Dmitriev et al. Solar activity forecasting on 1999–2000 by means of artificial neural networks, EGS XXIV General Assembly, 22 April 1999, The Hague, The Netherlands

[Dorffner, 1996] G. Dorffner Neural Networks for Time Series Processing, Neural Network World, Volume 6 (1996), pp. 447-468

[Echer et al., 2004a] E. Echer; N.R. Rigozo; D.J.R. Nordemann; L.E.A. Vieira Prediction of solar activity on the basis of spectral characteristics of sunspot number, Annal. Geophys., Volume 22 (2004), pp. 2239-2243

[Echer et al., 2004b] E. Echer; N.R. Rigozo; M.P. Souza; L.E.A. Vieira; D.J.R. Nordemann Reconstruction of the aa index on the basis of spectral characteristics, Geofísica Internacional, Volume 43 (2004), pp. 103-111

[El-Fandy et al., 1994] M.G. El-Fandy; Z.H. Ashour; S.M.M. Taiel Time series models adoptable for forecasting Nile floods and Ethiopian rainfalls, Bull. Am. Meteorol. Soc., Volume 75 (1994), pp. 83-94

[Elmore, 2005] K.L. Elmore Alternatives to the chi-square test for evaluating rank histograms from ensemble forecasts, Weather Forecast., Volume 20 (2005), pp. 789-795

[Elmore and Richman, 2001] K.L. Elmore; M.B. Richman Euclidean distance as a similarity metric for principal component analysis, Mon. Weather Rev., Volume 129 (2001), pp. 540-549

[Friis-Christensen and Lassen, 1991] E. Friis-Christensen; K. Lassen Length of the solar cycle: an indicator of solar activity closely associated with climate, Science, Volume 254 (1991), pp. 698-700

[Fröhlich and Lean, 1998] C. Fröhlich; J. Lean The sun's total irradiance: cycles, trends and related climate change uncertainties since 1976, Geophys. Res. Lett., Volume 25 (1998), pp. 4377-4380

[Gardner and Dorling, 1998] M.W. Gardner; S.R. Dorling Artificial neural networks: the multilayer perceptron: a review of applications in atmospheric sciences, Atmosph. Environ., Volume 32 (1998), pp. 2627-2636

[Ghosh et al., 2009] S. Ghosh; V. Luniya; A. Gupta Trend analysis of Indian summer monsoon rainfall at different spatial scales, Atmosph. Sci. Lett., Volume 10 (2009), pp. 185-290

[Gil and Luis, 2009] A. Gil; A. Luis Time series modeling of sunspot numbers using long-range cyclical dependence, Solar Phys., Volume 257 (2009), pp. 371-381

[Guhathakurta and Rajeevan, 2008] P. Guhathakurta; M. Rajeevan Trends in the rainfall pattern over India, International J. Climatol., Volume 28 (2008), pp. 1453-1469

[Haigh, 2006] J.D. Haigh A GCM study of climate change in response to the 11-year solar cycle, Quart. J. Roy. Meteorol. Soc., Volume 125 (2006), pp. 871-892

[Hasanean, 2001] H.M. Hasanean Fluctuations of surface air temperature in the eastern Mediterranean, Theor. Appl. Climatol., Volume 68 (2001), pp. 75-87

[Hathaway et al., 1999] D.H. Hathaway; R.M. Wilson; E.J. Reichmann A synthesis of solar cycle prediction techniques, J. Geophys. Res., Volume 104 (1999), pp. 22375-22388

[Hsieh and Tang, 1998] W.W. Hsieh; B. Tang Applying neural network models to prediction and data analysis in meteorology and oceanography, Bull. Am. Meteorol. Soc., Volume 79 (1998), pp. 1855-1870

[Jhajharia and Singh, 2011] D. Jhajharia; V.P. Singh Trends in temperature, diurnal temperature range and sunshine duration in Northeast India, Int. J. Climatology, Volume 31 (2011), pp. 1353-1367

[Jhajharia et al., 2009] D. Jhajharia; S.K. Shrivastava; D. Sarkar; S. Sarkar Temporal characteristics of pan evaporation trends under the humid conditions of Northeast India, Agric. Forest Meteorol., Volume 149 (2009), pp. 763-770

[Jhajharia et al., 2011] D. Jhajharia; Y. Dinpashoh; E. Kahya; V.P. Singh; A. Fakheri-Fard Trends in reference evapotranspiration in the humid region of northeast India, Hydrol. Proc. (2011) (online first)

[Kamarthi and Pittner, 1999] S.V. Kamarthi; S. Pittner Accelerating neural network training using weight extrapolations, Neural Networks, Volume 12 (1999), pp. 1285-1299

[Kane, 2001] R.P. Kane Did predictions of the maximum sunspot number for solar cycle 23 come true?, Solar Physics, Volume 202 (2001), pp. 395-406

[Katsambas et al., 1997] A. Katsambas; C.A. Varotsos; G. Veziryianni; C. Antoniou Surface solar ultraviolet radiation: a theoretical approach of the SUVR reaching the ground in Athens, Greece, Environ. Sci. Pollut. Res., Volume 4 (1997), pp. 69-73

[Kendall, 1975] M.G. Kendall Rank Correlation Methods, Griffin, London, UK, 1975

[Kurths and Ruzmaikin, 1990] J. Kurths; A.A. Ruzmaikin On forecasting the sunspot numbers, Solar Physics, Volume 126 (1990), pp. 407-410

[Lin et al., 1994] F. Lin; X.H. Yu; S. Gregor; R. Irons Time series forecasting with neural networks (R.J. Stonier; X.H. Yu, eds.), Complex systems: mechanism of adaptation, IOS Press, Netherlands, 1994, pp. 245-252

[Love, 2011] J.J. Love Secular trends in storm-level geomagnetic activity, Ann. Geophys., Volume 29 (2011), pp. 251-262

[Lundstedt, 1992] H. Lundstedt Neural networks and predictions of solar-terrestrial effects, Planet. Space Sci., Volume 40 (1992), pp. 457-464

[Mann, 1945] H.B. Mann Nonparametric tests against trend, Econometrica, Volume 13 (1945), pp. 245-259

[Mimmack et al., 2001] G.M. Mimmack; J.M. Simon; J.S. Galpin Choice of distance matrices in cluster analysis: defining regions, J. Climate, Volume 14 (2001), pp. 2790-2797

[Moran, 1954] P.A.P. Moran Some experiments on the prediction of sunspot numbers, J. Roy. Statist. Soc., Ser. B, Volume 16 (1954), pp. 112-117

[Onoz and Bayazit, 2003] B. Onoz; M. Bayazit The power of statistical tests for trend detection, Turk. J. Eng. Environ. Sci., Volume 27 (2003), pp. 247-251

[Pal and Al-Tabbaa, 2011] I. Pal; A. Al-Tabbaa Assessing seasonal precipitation trends in India using parametric and non-parametric statistical techniques, Theor. Appl. Climatol., Volume 103 (2011), pp. 1-11

[Park and Woo, 2009] D.C. Park; D.M. Woo Prediction of sunspot series using bilinear recurrent neural network, Proc. ICIME (2009), pp. 94-98

[Pesnell, 2008] W.D. Pesnell Predictions of Solar Cycle 24, Solar Physics, Volume 252 (2008), pp. 209-220

[Price et al., 1992] C.P. Price; D. Prichard; E.A. Hogeson Do the sunspot numbers form a chaotic set?, J. Geophys. Res., Volume 97 (1992), pp. 19113-19120

[Rana et al., 2011] A. Rana; C.B. Uvo; L. Bengtsson; P. Parth Sarthi Trend analysis for rainfall in Delhi and Mumbai, India, Climate Dynamics (2011) (online first)

[Rigozo and Nordemann, 1998] N.R. Rigozo; D.J.R. Nordemann Iterative regression analysis of periodicities in geophysical record time series, Revista Brasileira de Geofísica, Volume 16 (1998)

[Rigozo et al., 2001] N.R. Rigozo; E. Echer; L.E.A. Vieira; D.J.R. Nordemann Reconstruction of Wolf sunspot number on the basis of spectral characteristics and estimates of associated radio flux and solar wind parameters for the last millennium, Solar Phys., Volume 203 (2001), pp. 179-191

[Rind, 2002] D. Rind The sun's role in climate variations, Science, Volume 296 (2002), pp. 673-677

[Seleshi et al., 1994] Y. Seleshi; G.R. Demarée; J.W. Delleur Sunspot numbers as a possible indicator of annual rainfall at Addis Ababa, Ethiopia, Inter. J. Climatol., Volume 14 (1994), pp. 911-923

[Sello, 2001] S. Sello Solar cycle forecasting: a nonlinear dynamics approach, Astron. Astrophys., Volume 377 (2001), pp. 312-320

[Shindell et al., 1999] D. Shindell; D. Rind; N. Balachandran; J. Lean; P. Lonergan Solar cycle variability, ozone, and climate, Science, Volume 284 (1999), pp. 305-308

[Tayfur, 2002] G. Tayfur Artificial neural networks for sheet sediment transport, Hydrol. Sci. J., Volume 47 (2002), pp. 879-892

[Tzanis and Varotsos, 2008] C. Tzanis; C.A. Varotsos Tropospheric aerosol forcing of climate: a case study for the greater area of Greece, Inter. J. Remote Sensing, Volume 29 (2008), pp. 2507-2517

[Usoskin et al., 2003] I.G. Usoskin; S.K. Solanki; M. Schüssler; K. Mursula; K. Alanko Millennium-scale sunspot number reconstruction: evidence for an unusually active sun since the 1940s, Phys. Rev. Lett., Volume 91 (2003) (Article ID 211101)

[Varotsos, 1989] C. Varotsos Connections between the 11-year solar cycle, the Q.B.O. and total ozone - comments, J. Atmos. Terrest. Phys., Volume 51 (1989), pp. 367-370

[Varotsos, 2002] C. Varotsos The southern hemisphere ozone hole split in 2002, Environ. Sci. Pollut. Res., Volume 9 (2002) no. 6, pp. 375-376

[Varotsos and Cracknell, 2004] C.A. Varotsos; A.P. Cracknell New features observed in the 11-year solar cycle, Inter. J. Remote Sensing, Volume 25 (2004), pp. 2141-2157

[Visser and Molenaar, 1995] H. Visser; J. Molenaar Trend estimation and regression analysis in climatological time series: an application of structural time series models and the Kalman filter, J. Climate, Volume 8 (1995), pp. 969-979

[Wilks, 2006] D.S. Wilks Statistical methods in atmospheric sciences, Elsevier, 2006

[Willett, 1951] H.C. Willett Extrapolation of sunspot-climate relationships, J. Atmos. Sci., Volume 8 (1951), pp. 1-6

[Willmott, 1982] C.J. Willmott Some comments on the evaluation of model performance, Bull. Am. Meteorol. Soc., Volume 63 (1982), pp. 1309-1313

[Woodward and Gray, 1993] W.A. Woodward; H.L. Gray Global warming and the problem of testing for trend in time series data, J. Climate, Volume 6 (1993), pp. 953-962

[Woodward and Gray, 1995] W.A. Woodward; H.L. Gray Selecting a model for detecting the presence of a trend, J. Climate, Volume 8 (1995), pp. 1929-1937

[Yue and Pilon, 2004] S. Yue; P. Pilon A comparison of the power of the t test, Mann-Kendall and bootstrap tests for trend detection, Hydrol. Sci. J. Sci. Hydrol., Volume 49 (2004), pp. 21-37

[Yue and Wang, 2002] S. Yue; C.Y. Wang Applicability of prewhitening to eliminate the influence of serial correlation on the Mann-Kendall test, Water Resour. Res., Volume 38 (2002) (CiteID 1068)

[Yue et al., 2002] S. Yue; P. Pilon; G. Cavadias Power of the Mann-Kendall and Spearman's rho tests for detecting monotonic trends in hydrological series, J. Hydrol., Volume 259 (2002), pp. 254-271

[Zetterqvist, 1991] L. Zetterqvist Statistical estimation and interpretation of trends in water quality time series, Water Resour. Res., Volume 27 (1991), pp. 1637-1648

