TL;DR: In this paper, a stationary stochastic ARMA/ARIMA [Autoregressive (Integrated) Moving Average] modeling approach has been adapted to forecast daily mean concentrations of ambient air pollutants (O3, CO, NO and NO2) at an urban traffic site (ITO) of Delhi, India.
Abstract: In the present study, a stationary stochastic ARMA/ARIMA [Autoregressive (Integrated) Moving Average] modelling approach has been adapted to forecast daily mean concentrations of ambient air pollutants (O3, CO, NO and NO2) at an urban traffic site (ITO) of Delhi, India. A suitable variance-stabilizing transformation has been applied to each time series in order to make them covariance stationary in a consistent way. A combination of different information criteria, namely AIC (Akaike Information Criterion), HIC (Hannan–Quinn Information Criterion), BIC (Bayesian Information Criterion), and FPE (Final Prediction Error), in addition to ACF (autocorrelation function) and PACF (partial autocorrelation function) inspection, has been tried out to obtain suitable orders of the autoregressive (p) and moving average (q) parameters for the ARMA(p,q)/ARIMA(p,d,q) models. Forecasting performance of the selected ARMA(p,q)/ARIMA(p,d,q) models has been evaluated on the basis of MAPE (mean absolute percentage error), MAE (mean absolute error) and RMSE (root mean square error) indicators. For 20 out-of-sample forecasts, the one-step (i.e., one-day) ahead MAPE for CO, NO2, NO and O3 has been found to be 13.6, 12.1, 21.8 and 24.1%, respectively. Given the stochastic nature of air pollutant data and in the light of earlier reported studies regarding air pollutant forecasts, the forecasting performance of the present approach is satisfactory and the suggested forecasting procedure can be effectively utilized for short-term air quality forewarning purposes.
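The three accuracy indicators named in the abstract above can be illustrated with a minimal sketch. The function name `forecast_errors` and the sample numbers are hypothetical and are not taken from the study; the formulas are the standard definitions of MAPE, MAE and RMSE.

```python
import math

def forecast_errors(actual, predicted):
    """Return (MAPE in percent, MAE, RMSE) for paired observations."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    # MAPE divides by the observed value, so it is undefined if any observation is zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return mape, mae, rmse

# Hypothetical observed and forecast concentrations (illustrative only).
observed = [40.0, 50.0, 80.0, 100.0]
forecast = [44.0, 45.0, 88.0, 90.0]
mape, mae, rmse = forecast_errors(observed, forecast)
```

MAPE is scale-free, which is why it is convenient for comparing pollutants measured at different concentration levels, while MAE and RMSE stay in the units of the series.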
TL;DR: In this paper, linear stochastic models known as autoregressive integrated moving average (ARIMA) and multiplicative seasonal autoregressive integrated moving average (SARIMA) were used to predict drought in the Buyuk Menderes river basin using the Standardized Precipitation Index (SPI) as a drought index.
Abstract: In the present study, a seasonal and non-seasonal prediction of the Standardized Precipitation Index (SPI) time series is addressed by means of linear stochastic models. The methodology presented here is to develop adequate linear stochastic models known as autoregressive integrated moving average (ARIMA) and multiplicative seasonal autoregressive integrated moving average (SARIMA) to predict drought in the Buyuk Menderes river basin using SPI as the drought index. Temporal characteristics of droughts based on SPI as an indicator of drought severity indicate that the basin was affected by severe and more or less prolonged periods of drought from 1975 to 2006. Therefore, drought prediction plays an important role in water resources management. The ARIMA modeling approach involves the following three steps: model identification, parameter estimation, and diagnostic checking. In the model identification step, different ARIMA models are identified by considering the autocorrelation function (ACF) and partial autocorrelation function (PACF) results of the SPI series. The model that gives the minimum Akaike Information Criterion (AIC) and Schwarz Bayesian Criterion (SBC) is selected as the best-fit model. The parameter estimation step indicates that the estimated model parameters are significantly different from zero. The diagnostic check step is applied to the residuals of the selected ARIMA models, and the results indicate that the residuals are independent, normally distributed and homoscedastic. For model validation purposes, the predicted results using the best ARIMA models are compared to the observed data. The predicted data show reasonably good agreement with the actual data. The ARIMA models developed to predict drought were found to give acceptable results up to 2 months ahead. The stochastic models developed for the Buyuk Menderes river basin can be employed to predict droughts up to 2 months of lead time with reasonable accuracy.
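The identification step described above relies on the ACF and PACF. As a minimal sketch (not the authors' code), the sample ACF can be computed directly and the PACF derived from it with the Durbin-Levinson recursion. The function names are hypothetical, and the worked example uses the theoretical ACF of an AR(1) process with coefficient 0.5 rather than SPI data.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations r_0..r_max_lag using the usual biased estimator."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + h] for t in range(n - h)) / c0
            for h in range(max_lag + 1)]

def pacf_from_acf(rho):
    """Durbin-Levinson recursion: rho[0..K] -> partial autocorrelations at lags 1..K."""
    pacf = []
    phi_prev = []  # AR coefficients of the order k-1 fit
    for k in range(1, len(rho)):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new last coefficient.
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_new.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi_new
    return pacf

# Theoretical ACF of an AR(1) process with coefficient 0.5: rho_k = 0.5 ** k.
ar1_acf = [0.5 ** k for k in range(4)]
ar1_pacf = pacf_from_acf(ar1_acf)  # spike at lag 1, zero afterwards
```

A PACF that cuts off after lag p while the ACF decays gradually is the classic signature used to propose an AR(p) candidate, which is then compared with other candidates through AIC or SBC.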
TL;DR: In this article, the authors employed the run test and the correlogram/partial autocorrelation function as alternate forms of the research instrument to carry out an investigation with the Nigerian stock market data, and the results of the three alternate tests revealed that the Nigerian stock market is efficient in the weak form and therefore follows a random walk process.
Abstract: The weak form hypothesis has been pointed out as dealing with whether or not security prices fully reflect historical price or return information. To carry out this investigation with the Nigerian stock market data, we employed the run test and the correlogram/partial autocorrelation function as alternate forms of the research instrument. The results of the three alternate tests revealed that the Nigerian stock market is efficient in the weak form and therefore follows a random walk process. Thus, the opportunity of making excess returns in the market is ruled out.
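The run test mentioned above can be sketched as follows. This is the standard Wald-Wolfowitz runs test with a normal approximation, applied here to the signs of a hypothetical return series; it is not the authors' implementation and the data are invented for illustration.

```python
import math

def runs_test(signs):
    """Wald-Wolfowitz runs test on a sequence of booleans.

    Returns (observed runs, expected runs under randomness, z statistic).
    """
    n1 = sum(1 for s in signs if s)
    n2 = len(signs) - n1
    if n1 == 0 or n2 == 0 or len(signs) < 3:
        raise ValueError("need at least 3 observations and both categories present")
    # A new run starts whenever two neighbouring observations differ.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n = n1 + n2
    expected = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1.0))
    z = (runs - expected) / math.sqrt(variance)
    return runs, expected, z

# Hypothetical returns that alternate in sign (far too many runs to be random).
returns = [0.4, -0.1, 0.3, -0.2, 0.5, -0.3, 0.2, -0.4, 0.1, -0.5]
runs, expected, z = runs_test([r > 0 for r in returns])
```

Under weak-form efficiency the number of runs should be close to its expected value, so |z| below roughly 1.96 is consistent with randomness at the 5% level; the alternating example deliberately violates this.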
TL;DR: In this paper, a study of pressure head h and water content θ which characterize soil water status, in the space-time domain, was conducted in a field during a controlled drainage process, evaporation being prevented, along a 50 m transect in a volcanic Vesuvian soil.
Abstract: Unsaturated hydraulic properties and their spatial variability are now analyzed in order to properly use mathematical models developed to simulate water flow and solute movement in field-scale soils. Many studies have shown that observations of soil hydraulic properties should not be considered purely random, given that they possess a structure which may be described by means of stochastic processes. The techniques used for analyzing such a structure have essentially been based either on the theory of regionalized variables or, to a lesser extent, on the analysis of time series. This work attempts to use the time-series approach mentioned above by means of a study of pressure head h and water content θ, which characterize soil water status, in the space-time domain. The data for the analyses were recorded in the open field during a controlled drainage process, evaporation being prevented, along a 50 m transect in a volcanic Vesuvian soil. The isotropic hypothesis is empirically proved, and then the autocorrelation (ACF) and partial autocorrelation (PACF) functions were used to identify and estimate the ARMA(1,1) statistical model for the analyzed series and the AR(1) model for the extracted signal. Relations with a state-space model are investigated, and a bivariate AR(1) model is fitted. The simultaneous relations between θ and h are considered and estimated. The results are of value for sampling strategies and should encourage wider use of time and space series analysis.
TL;DR: In this paper, the authors analyzed monthly Malaysia crude oil production data for the period of January 2005 to May 2010 using a time-series method called the ARIMA model. Autocorrelation and partial autocorrelation functions were calculated to examine the stationarity of the data. Then, an appropriate Box-Jenkins ARIMA model was fitted. Validity of the model was tested using the Box-Pierce and Ljung-Box statistics.
Abstract: Monthly Malaysia crude oil production data for the period of January 2005 to May 2010 were analyzed using a time-series method called the Autoregressive Integrated Moving Average (ARIMA) model. Autocorrelation and partial autocorrelation functions were calculated to examine the stationarity of the data. Then, an appropriate Box-Jenkins ARIMA model was fitted. Validity of the model was tested using the Box-Pierce and Ljung-Box statistics. The predictability of future crude oil production as a forecast is measured for three leading months.
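The two portmanteau statistics named above can be sketched in a few lines. This is a minimal illustration using the standard definitions, Q_BP = n * sum(r_k^2) and Q_LB = n(n+2) * sum(r_k^2 / (n-k)); the function names and the tiny residual series are hypothetical and are not the crude oil data.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations r_0..r_max_lag using the usual biased estimator."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + h] for t in range(n - h)) / c0
            for h in range(max_lag + 1)]

def portmanteau(residuals, max_lag):
    """Return (Box-Pierce Q, Ljung-Box Q) for the residual autocorrelations."""
    n = len(residuals)
    r = sample_acf(residuals, max_lag)
    box_pierce = n * sum(r[k] ** 2 for k in range(1, max_lag + 1))
    ljung_box = n * (n + 2) * sum(r[k] ** 2 / (n - k) for k in range(1, max_lag + 1))
    return box_pierce, ljung_box

# Hypothetical residuals with strong negative lag-1 correlation.
bp, lb = portmanteau([1.0, -1.0, 1.0, -1.0], max_lag=1)
```

For residuals of a fitted ARMA(p,q) model, both statistics are usually compared with a chi-square distribution with max_lag - p - q degrees of freedom; the Ljung-Box version is generally preferred in small samples.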
TL;DR: It is found that the distribution of a set of the sample autocorrelation estimates is not independent and identically distributed, which implies that the result of diagnostic check and model building using the traditional assumption of iid can be quite misleading.
Abstract: It is shown that the sum of the sample autocorrelation function at lag h≥1 is always -1/2 for any stationary time series with arbitrary length T≥2 (Hassani, 2009 [1]). In this paper, the distribution of a set of the sample autocorrelation function using the properties of this quantity is considered. It is found that the distribution of a set of the sample autocorrelation estimates is not independent and identically distributed. This finding implies that the result of diagnostic check and model building using the traditional assumption of iid can be quite misleading.
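The identity attributed to Hassani (2009) gives the sum of the sample autocorrelations over lags 1 to T-1 as -1/2 when the standard estimator (deviations from the sample mean, normalised by the lag-0 sum of squares) is used. It follows from the fact that the deviations sum to zero. A minimal numerical check, with a hypothetical function name and arbitrary data:

```python
def sample_acf_sum(x):
    """Sum of the sample autocorrelations r_1 + ... + r_{T-1}."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for h in range(1, n):
        total += sum(dev[t] * dev[t + h] for t in range(n - h)) / c0
    return total

# Arbitrary illustrative series; any non-constant series gives the same sum.
series_a = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
series_b = [1.0, 2.0, 4.0, 8.0, 16.0, 3.0]
sum_a = sample_acf_sum(series_a)
sum_b = sample_acf_sum(series_b)
```

Because the sample autocorrelations are tied together by this linear constraint, they cannot be mutually independent, which is the point the abstract builds on.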
TL;DR: The comparison of the mean and variance of 3-year observed data vs predicted data from the selected best models shows that the boron model from ARIMA modeling approaches could be used in a safe manner since the predicted values from these models preserve the basic statistics of observed data in terms of mean.
Abstract: In the present study, a seasonal and non-seasonal prediction of boron concentration time series data for the period of 1996-2004 from the Buyuk Menderes river in western Turkey is addressed by means of linear stochastic models. The methodology presented here is to develop adequate linear stochastic models known as autoregressive integrated moving average (ARIMA) and multiplicative seasonal autoregressive integrated moving average (SARIMA) to predict boron content in the Buyuk Menderes catchment. Initially, Box-Whisker plots and Kendall's tau test are used to identify the trends during the study period. The measurement locations do not show a significant overall trend in boron concentrations, though marginal increasing and decreasing trends are observed for certain periods at some locations. The ARIMA modeling approach involves the following three steps: model identification, parameter estimation, and diagnostic checking. In the model identification step, different ARIMA models are identified by considering the autocorrelation function (ACF) and partial autocorrelation function (PACF) results of the boron data series. The model that gives the minimum Akaike information criterion (AIC) is selected as the best-fit model. The parameter estimation step indicates that the estimated model parameters are significantly different from zero. The diagnostic check step is applied to the residuals of the selected ARIMA models and the results indicate that the residuals are independent, normally distributed, and homoscedastic. For model validation purposes, the predicted results using the best ARIMA models are compared to the observed data. The predicted data show reasonably good agreement with the actual data.
The comparison of the mean and variance of 3-year (2002-2004) observed data vs predicted data from the selected best models shows that the boron model from ARIMA modeling approaches could be used in a safe manner since the predicted values from these models preserve the basic statistics of observed data in terms of mean. The ARIMA modeling approach is recommended for predicting boron concentration series of a river.
TL;DR: It is shown that this approach improves the linear autoregressive fit and is used to generate time series that preserve the original autocorrelation and marginal distribution and develop a combined test that discriminates whether a linear stochastic time series is a monotonic or non-monotonic transform of a Gaussian time series.
Abstract: A framework is proposed for the analysis of non-Gaussian time series under the Gaussian assumption. The analysis is based on the Gaussian autocorrelation computed from the transform of the sample autocorrelation. It is shown that this approach improves the linear autoregressive fit. We also use it to generate time series that preserve the original autocorrelation and marginal distribution and develop a combined test that discriminates whether a linear stochastic time series is a monotonic or non-monotonic transform of a Gaussian time series. The usefulness of the proposed analysis is demonstrated on stock exchange volumes of several world markets.
TL;DR: In this article, three different single-hour models are used to forecast electricity price at off peak, plateau, and peak load, and a 24-hour model is also used for forecasting electricity price of all hours simultaneously.
Abstract: This paper discusses electrical energy price forecasting in the Iranian power market. Due to the daytime variations in load, and thereby in electrical energy price, it is wise to use different models for forecasting energy price at different hours. In this paper, three different single-hour models are used to forecast electricity price at off peak, plateau, and peak load. A 24-hour model is also used to forecast the electricity price of all hours simultaneously. Analysis of autocorrelation and partial autocorrelation functions suggests different models for each single-hour model as well as the 24-hour model. The best models for off peak, plateau, and peak load are found to be ARIMA(1,1,1), ARIMA(2,1,1) and ARIMA(0,1,1), respectively. In addition, the time-series analyses result in an AR(2) model with a 24-hour period as the most suitable 24-hour model. The models are compared in terms of accuracy and computation time. The comparison shows that the user should compromise between accuracy and speed when selecting single-hour or 24-hour models.
TL;DR: The results show that the autocorrelation of radio signals can be diminished by differencing, but as the order of differencing increases, more correlation coefficients fall outside the confidence limits, which implies that the autocorrelation cannot be eliminated completely by repeated differencing.
Abstract: In order to study the autocorrelation properties of a 900 MHz radio signal, time series theory is introduced into the statistical analysis. Using statistical data from a signal acquisition system, the Box-Jenkins method is applied to radio signal analysis. The autocorrelation function calculations show that the radio signals are regular, nonstationary and nonseasonal over this period. On the other hand, the results show that the autocorrelation of radio signals can be diminished by differencing, but as the order of differencing increases, more correlation coefficients fall outside the confidence limits, which implies that the autocorrelation cannot be eliminated completely by repeated differencing.
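One mechanism that is consistent with this observation is over-differencing. This is offered here as an assumption for illustration, not as the explanation given in the abstract. Applying (1 - B)^d to pure white noise induces a theoretical lag-1 autocorrelation of -d/(d+1), so each extra difference makes the series more strongly autocorrelated. The function name below is hypothetical; the calculation uses the binomial coefficients of (1 - B)^d.

```python
from math import comb

def differenced_white_noise_acf(d, lag):
    """Theoretical autocorrelation at `lag` of (1 - B)^d applied to white noise."""
    # Coefficients of (1 - B)^d: (-1)^j * C(d, j) for j = 0..d.
    coeffs = [(-1) ** j * comb(d, j) for j in range(d + 1)]

    def autocovariance(k):
        if k > d:
            return 0
        return sum(coeffs[j] * coeffs[j + k] for j in range(d + 1 - k))

    return autocovariance(lag) / autocovariance(0)

# Lag-1 autocorrelation after 1, 2 and 3 rounds of differencing.
lag1_by_order = [differenced_white_noise_acf(d, 1) for d in (1, 2, 3)]
```

In practice this is why the differencing order is usually kept at the smallest value that removes the nonstationarity.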
TL;DR: In this article, the authors investigated the effect of linear and non-linear techniques on long-term rainfall forecasting in the west mountainous region of Iran, where three different ANN models with three different input sets were trained.
Abstract: One of the major problems of water resources management is rainfall forecasting. Different linear and non-linear methods have been used in order to have an accurate forecast. Whilst there are some debates on whether the use of linear or non-linear techniques is better, it was found that rainfall modelling for the short term period is receiving more attention than those for long-term periods. This study gives attention to long-term rainfall modelling since long-term forecasting could provide better data for optimal management of a resource that is to be used over a substantial period of time. Hence, this study is to investigate the effect of linear and non-linear techniques on long-term rainfall forecasting. One of the non-linear techniques being widely used is the Artificial Neural Networks (ANN) approach which has the ability of mapping between input and output patterns without a priori knowledge of the system being modelled. The more popular linear techniques include the Box-Jenkins family of models. A feedforward Artificial Neural Network (ANN) rainfall model and a Seasonal Autoregressive Integrated Moving Average (SARIMA) rainfall model were developed to investigate their potentials in forecasting rainfall. The study area is the west mountainous region of Iran. Three meteorological stations among the several stations over the region were chosen as case study. The stations are the Hamedan Foroudgah, Nujeh, and Arak. Three different ANN models with three different input sets were trained. The first model investigated the effect of number of lags on the performance of the ANN. The number of lags varied from 1-12 previous months. The second model investigated the effect of adding monthly average to the inputs, and the third model considered seasonal average as an extra input in addition to the ones in the second model. The effect of the number of hidden nodes on ANN modeling was also examined. 
The preliminary inputs for SARIMA were found by examining the autocorrelation and partial autocorrelation of the series. The 26 years of monthly rainfall data for 1977-2002 were used for training the models. The ANN models were trained and simulated using a program written in the MATLAB environment (M-file). The SARIMA models were developed using SPSS syntax. The models were tested with one year of monthly rainfall data for 2003. It was found that the larger lags outperform the lower ones in ANN modeling. Also, adding the extra monthly and seasonal averages to the input set leads to better model performance. The number of hidden nodes was varied from 1-30. It was demonstrated that input nodes have more effect on the performance criteria than the hidden nodes. The models were trained based on the Levenberg-Marquardt algorithm with the tansigmoid activation function for the hidden layer and the purelin activation function for the output layer. Simulation results for the independent testing data series showed that the model can perform well in simulating one year of monthly rainfall in advance. The SARIMA models were built using the same set of data as for the ANN. Model selection was done among multiplicative and additive models, and the results revealed that additive SARIMA models have the best performance. The simulation results from the ANN and SARIMA models showed that the SARIMA model has better performance both in training and testing. Thus, it is recommended for modeling rainfall in the region.
TL;DR: Nine estimators are presented and compared against the exact theoretical autocorrelation; the best is the AR modified Burg estimate.
Abstract: The autocorrelation function has a very important role in several application areas involving stochastic processes. In fact, it forms the theoretical basis for spectral analysis, ARMA (and generalizations) modeling, detection, etc. However, as is well known, the results obtained with the more common estimates of the autocorrelation function (biased or not) are frequently poor, even when we have access to a large number of points. On the other hand, in some applications, we need to perform fast correlations. The usual estimators do not allow a fast computation, even with the FFT. These facts motivated the search for alternative ways of computing the autocorrelation function. Nine estimators are presented and compared against the exact theoretical autocorrelation. As we will see, the best is the AR modified Burg estimate.
TL;DR: In this article, simulation and modeling of monthly precipitation and mean monthly temperature were performed with stochastic methods using data from the Shiraz synoptic station, based on the ARIMA model, the autocorrelation and partial autocorrelation methods, and an examination of model parameters and types.
Abstract: Stochastic models have been proposed as one technique for generating scenarios of future climate change. In climate studies, temperature and precipitation are among the main indicators. The purpose of this study is the simulation and modeling of monthly precipitation and mean monthly temperature using stochastic methods. In this study, 21 years of data on precipitation and mean monthly temperature at the Shiraz synoptic station are used. Based on the ARIMA model, the autocorrelation and partial autocorrelation methods, and an examination of model parameters and types, suitable models were obtained: ARIMA(0,0,0)(2,1,0)12 for forecasting monthly precipitation and ARIMA(2,1,0)(2,1,0)12 for forecasting mean monthly temperature.
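In the seasonal model notation above, the middle entries of each bracket are the differencing orders: the regular order d and the seasonal order D at period 12. A minimal sketch of what these operators do, using a hypothetical synthetic series (a linear trend plus a fixed 12-month pattern) rather than the Shiraz data:

```python
def difference(x, lag=1):
    """Apply (1 - B^lag) to a sequence: x_t - x_{t-lag}."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]

# Hypothetical monthly series: linear trend plus a repeating 12-month pattern.
seasonal_pattern = [5, 3, 0, -2, -4, -6, -5, -3, 0, 2, 4, 6]
series = [0.5 * t + seasonal_pattern[t % 12] for t in range(48)]

# D = 1: seasonal differencing removes the repeating pattern, leaving the yearly trend step.
seasonal_diff = difference(series, lag=12)
# d = 1 and D = 1: regular differencing on top also removes the remaining constant.
both_diff = difference(seasonal_diff, lag=1)
```

The remaining autoregressive terms, such as the two seasonal AR terms in (2,1,0)12, are then fitted to the differenced series.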
TL;DR: To reduce random drift of the fiber optic gyro, a model was set up using the time series analysis method, aimed at the weak non-stationary characteristics of random drift, and the results show that identifying the drift model directly works better than identifying the random model and the non-stationary drift model separately, as is commonly done in engineering.
Abstract: To reduce random drift of the fiber optic gyro, a model was set up using the time series analysis method, aimed at the weak non-stationary characteristics of random drift. A global optimum algorithm for selecting the model order was put forward. It changed a two-dimensional search program into a one-dimensional one. Consistent estimation of the model order was achieved. An improved parameter estimation algorithm was put forward. It transformed the non-linear parameter estimation process into a linear one. Both the measured sequence and the reversed one were used to estimate parameters, so data information was fully utilized and the accuracy of parameter estimation was improved. The goal of parameter estimation is to get the minimum sum of forward and backward filtering error squares, which was found in (p + 1)-dimensional space. The trend terms, periodic terms, autocorrelation and partial autocorrelation of the measured drift were analyzed. The statistical results of non-stationary models established in three ways verified the validity of the above algorithms. The results also show that identifying the drift model directly works better than identifying the random model and the non-stationary drift model separately, as is commonly done in engineering.
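TL;DR: In this paper, an alternative bootstrap approach, based on a result of Ramsey (1974) on the partial autocorrelation function and on the Durbin–Levinson algorithm, is investigated to obtain a surrogate series from linear Gaussian processes with long range dependence.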
Abstract: In this work, we investigate an alternative bootstrap approach based on a result of Ramsey [F.L. Ramsey, Characterization of the partial autocorrelation function, Ann. Statist. 2 (1974), pp. 1296–1301] and on the Durbin–Levinson algorithm to obtain a surrogate series from linear Gaussian processes with long range dependence. We compare this bootstrap method with other existing procedures in a wide Monte Carlo experiment by estimating, parametrically and semi-parametrically, the memory parameter d. We consider Gaussian and non-Gaussian processes to prove the robustness of the method to deviations from normality. The approach is also useful to estimate confidence intervals for the memory parameter d by improving the coverage level of the interval.
TL;DR: In this paper, the authors consider a linear regression model with a design matrix that fits the periodic structure of a time series and analyze the effects of selecting different formulations to accommodate the autocorrelation in the residuals.
Abstract: In this paper we consider a Linear Regression Model with a design matrix that fits the periodic structure of a time series. As a consequence, the residuals are very often autocorrelated. The main problem is that residual autocorrelation does not necessarily entail error autocorrelation. To analyse the effects of selecting different formulations to accommodate the autocorrelation in the residuals, we consider two seemingly different ways to deal with this problem: the Linear Regression Model with the error terms following an Autoregressive Stationary Process and the Partial Adjustment Model. We study the equivalence between the two formulations. We go over the problem of estimating the parameters and, especially, of making inferences in this framework. After parameter estimation, we analyse the adequacy of the models. We demonstrate that the issue of selecting the most appropriate model to capture the autocorrelation in the residuals is, in this context, a kind of an artefact since the main results concerning the fitted values and forecasting features are the same. These modelling procedures are applied to the Portuguese coastal upwelling data and we compare the estimated models.
TL;DR: In this paper, the authors show how the standard descriptive statistical analysis of the data is unable to reveal a fine structure in a simulated sample of AR(2) stochastic process and emphasize that the violation of Bell inequalities gives no information on the completeness or the non locality of QT.
Abstract: Most physical experiments are usually described as repeated measurements of some random variables. The experimental data registered by on-line computers form time series of outcomes. The frequencies of different outcomes are compared with the probabilities provided by the algorithms of quantum theory (QT). In spite of the statistical predictions of QT, a claim was made that the theory provided the most complete description of the data and of the underlying physical phenomena. This claim could be easily rejected if some fine structures, averaged out in standard statistical descriptive analysis, were found in the time series of experimental data. To search for these structures one has to use more subtle statistical tools which were developed to study time series produced by various stochastic processes. In this talk we review some of these tools. As an example we show how the standard descriptive statistical analysis of the data is unable to reveal a fine structure in a simulated sample of an AR(2) stochastic process. We emphasize once again that the violation of Bell inequalities gives no information on the completeness or the non-locality of QT. The appropriate way to test the completeness of quantum theory is to search for fine structures in time series of experimental data by means of purity tests or by studying the autocorrelation and partial autocorrelation functions.
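The AR(2) example can be illustrated with a minimal sketch. The coefficients, sample size and seed below are hypothetical choices, not those used in the talk. A simulated AR(2) sample and a shuffled copy of it have exactly the same set of values, so any order-independent descriptive statistic (mean, variance, histogram) is identical, yet only the original ordering carries autocorrelation.

```python
import random

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return sum(dev[t] * dev[t + 1] for t in range(n - 1)) / c0

def simulate_ar2(n, phi1, phi2, rng, burn_in=200):
    """Simulate x_t = phi1 * x_{t-1} + phi2 * x_{t-2} + e_t with standard normal e_t."""
    x = [0.0, 0.0]
    for _ in range(n + burn_in):
        x.append(phi1 * x[-1] + phi2 * x[-2] + rng.gauss(0.0, 1.0))
    return x[-n:]  # drop the start-up values

rng = random.Random(0)                      # fixed seed for reproducibility
ar2 = simulate_ar2(5000, 0.5, -0.3, rng)    # hypothetical stationary coefficients
shuffled = ar2[:]
rng.shuffle(shuffled)                       # same values, temporal order destroyed

r1_original = lag1_autocorr(ar2)
r1_shuffled = lag1_autocorr(shuffled)
```

For these coefficients the theoretical lag-1 autocorrelation is phi1 / (1 - phi2), about 0.38, whereas the shuffled copy should show a value near zero.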
TL;DR: In this paper, the authors characterized the circumstances under which the limiting power vanishes, as the autocorrelation increases, thus extending the work of Kramer (2005, Journal of Statistical Planning and Inference 128, 489-496).
Abstract: Many popular tests for residual spatial autocorrelation in the context of the linear regression model belong to the class of invariant tests. This paper derives some exact properties of the power function of such tests. In particular, we characterize the circumstances under which the limiting power, as the autocorrelation increases, vanishes, thus extending the work of Kramer (2005, Journal of Statistical Planning and Inference 128, 489-496). More generally, the analysis in the paper sheds new light on how the power of invariant tests for spatial autocorrelation is affected by the matrix of regressors and by the spatial structure. A numerical study aimed at assessing the practical relevance of the theoretical results is included.
TL;DR: In this paper, the seasonal ARIMA (SARIMA) and Dynamic Regression (DR) or Transfer Function Modeling (TFM) were used to forecast the peak daily load in Malaysia.
Abstract: Malaysia's steady yearly growth in electricity consumption, a result of fast development in various sectors of the Malaysian economy, has increased the need for more robust, reliable and accurate load forecasting for the short, medium, or long term. A reliable method for short-term load forecasting is crucial to any decision maker in a power utility company. Many studies have been made to improve forecasting accuracy using various methods. The forecasting errors for the holiday seasons are known to be higher than those for weekends. This paper aims to determine which model would better estimate the holiday effects and therefore give better forecasting accuracy for the peak daily load in Malaysia. Some of the holiday effects in Malaysia are from Eid ul-Fitr, Christmas, Independence Day and Chinese New Year. The seasonal ARIMA (SARIMA) and Dynamic Regression (DR), or transfer function, models are considered. Furthermore, the final selection of the models depends on the Mean Absolute Percentage Error (MAPE) and on other measures such as the sample autocorrelation function (ACF), the sample partial autocorrelation function (PACF) and a bias-corrected version of Akaike's information criterion (AICC). The Dynamic Regression (DR) model recorded the lowest MAPE values: 2.22% for the 2004 New Year's Eve and 2.39% for the seven-days-ahead forecast. Therefore, the DR model is the most appropriate model to be considered for forecasting any public holidays in Malaysia.