Feedforward Artificial Neural Networks
The neural models most widely used for time series prediction are based on feedforward neural networks trained with the backpropagation learning algorithm. These models can be used for one-step as well as multi-step prediction. They consist of approximating the function F by a multilayer feedforward neural network. Introducing the vector (x(k),...,x(k-m)) as the k-th network input pattern, the one-step predicted value given by the neural model, denoted \tilde{x}(k+1), can be written as follows:
\[ \tilde{x}(k+1) = \tilde{F}(x(k), \ldots, x(k-m), W_1) \]
where W1 is the parameter set of the neural model, which is obtained using the backpropagation algorithm [Rum86]. The update of the parameter set is based on the local error between the measured value x(k+1) and the predicted value \tilde{x}(k+1), i.e.:
\[ e(k+1) = x(k+1) - \tilde{x}(k+1) \]
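The one-step scheme can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and not part of the original model description: the function names, the single tanh hidden layer, the full-batch gradient descent variant of backpropagation, the learning rate, and the sine series used as data.

```python
import numpy as np

def one_step_patterns(x, m):
    """Pair each input (x(k), ..., x(k-m)) with its target x(k+1)."""
    x = np.asarray(x, dtype=float)
    ks = np.arange(m, len(x) - 1)
    # Column j holds x(k-j), so each row reads (x(k), x(k-1), ..., x(k-m)).
    inputs = np.stack([x[ks - j] for j in range(m + 1)], axis=1)
    return inputs, x[ks + 1]

def train_one_step(inputs, targets, hidden=8, lr=0.05, epochs=500, seed=0):
    """Full-batch gradient descent on half the mean squared local error."""
    rng = np.random.default_rng(seed)
    W_hid = rng.normal(0.0, 0.5, (inputs.shape[1], hidden))
    b_hid = np.zeros(hidden)
    w_out = rng.normal(0.0, 0.5, hidden)
    b_out = 0.0
    n = len(targets)
    losses = []
    for _ in range(epochs):
        act = np.tanh(inputs @ W_hid + b_hid)   # hidden layer activations
        pred = act @ w_out + b_out              # predicted x(k+1)
        err = targets - pred                    # local error e(k+1)
        losses.append(0.5 * np.mean(err ** 2))
        # Backpropagation: gradient of the loss w.r.t. every parameter.
        g_pred = -err / n
        g_w_out = act.T @ g_pred
        g_b_out = g_pred.sum()
        g_pre = np.outer(g_pred, w_out) * (1.0 - act ** 2)
        g_W_hid = inputs.T @ g_pre
        g_b_hid = g_pre.sum(axis=0)
        # Step along the negative gradient direction.
        W_hid -= lr * g_W_hid
        b_hid -= lr * g_b_hid
        w_out -= lr * g_w_out
        b_out -= lr * g_b_out
    return (W_hid, b_hid, w_out, b_out), losses

# Illustrative data only: a sampled sine wave.
series = np.sin(0.3 * np.arange(200))
inputs, targets = one_step_patterns(series, m=3)
params, losses = train_one_step(inputs, targets)
```

The list `losses` records the training error per epoch, so comparing its first and last entries gives a quick check that the update rule is reducing the local error.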
When the feedforward neural model is used to forecast the values of the time series at instants k+1, k+2, ..., k+h+1, that is, along the interval [k+1, k+h+1], two different approaches can be formulated [Zal00].
Predicting the time interval [k+1,k+h+1]. First approach.
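The first nonlinear neural model uses the one-step neural model to predict the future behaviour of the time series. For this purpose, the model has to be used in a recurrent form, because each network output must be fed back as an input for the next prediction. If the aim is to predict h sampling times into the future, the input layer contains a group of h neurones that memorise previous network outputs, while the remaining input neurones receive the original (measured) time series data. Thus, writing \tilde{F} for the network approximation of F with parameter set W1, the predicted model outputs \tilde{x}(k+1), ..., \tilde{x}(k+h+1) along the interval [k+1, k+h+1] are given by the following equations:
\[
\begin{aligned}
\tilde{x}(k+1) &= \tilde{F}(x(k), \ldots, x(k-m), W_1) \\
\tilde{x}(k+2) &= \tilde{F}(\tilde{x}(k+1), x(k), \ldots, x(k-m+1), W_1) \\
&\;\;\vdots \\
\tilde{x}(k+h+1) &= \tilde{F}(\tilde{x}(k+h), \ldots, \tilde{x}(k+1), x(k), \ldots, x(k-m+h), W_1)
\end{aligned}
\]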
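The feedback mechanism of the first approach can be sketched as a short loop. The function `recursive_forecast` and its `one_step` argument are hypothetical names. The stand-in one-step model below is the exact two-term recurrence of a sampled sine wave, not a trained network, and is chosen only so that the result can be checked against the true series.

```python
import numpy as np

def recursive_forecast(one_step, history, m, h):
    """Predict x(k+1), ..., x(k+h+1), feeding each output back as an input."""
    # Most recent value first: (x(k), x(k-1), ..., x(k-m)).
    window = list(np.asarray(history, dtype=float)[-(m + 1):][::-1])
    predictions = []
    for _ in range(h + 1):
        x_next = float(one_step(np.array(window)))
        predictions.append(x_next)
        # The new prediction enters the window and the oldest value drops out.
        window = [x_next] + window[:-1]
    return np.array(predictions)

# Stand-in one-step model (assumption): sin(w(k+1)) = 2cos(w) sin(wk) - sin(w(k-1)).
w = 0.3
one_step = lambda window: 2.0 * np.cos(w) * window[0] - window[1]

history = np.sin(w * np.arange(50))          # measured values up to k = 49
forecast = recursive_forecast(one_step, history, m=1, h=4)
```

After the first step the window already contains one predicted value, and after the second step (with m=1) it contains no measured values at all, which is the situation the one-step training never sees.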
Predicting the value at the prediction horizon, x(k+h+1). Second approach.
The second neural model structure uses a multilayer feedforward network to predict directly the time series value at instant k+h+1 from the information available at the current instant k, that is, x(k),...,x(k-m). In this case, the nonlinear model is written as follows:
\[ \tilde{x}(k+h+1) = \tilde{F}(x(k), \ldots, x(k-m), W_2) \]
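where h is the prediction horizon. The parameter set W2 is updated using the backpropagation algorithm, following the negative gradient direction of the error measured at instant k+h+1, i.e. the difference between the measured value x(k+h+1) and the network output \tilde{x}(k+h+1):
\[ e(k+h+1) = x(k+h+1) - \tilde{x}(k+h+1) \]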
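For the second approach, the only change to the training data is the target, which moves from x(k+1) to x(k+h+1). The sketch below shows this pairing and the corresponding error. The names `direct_patterns` and `horizon_errors` are hypothetical, and the `persistence` predictor is a placeholder for a trained network, used only to make the alignment visible.

```python
import numpy as np

def direct_patterns(x, m, h):
    """Pair each input (x(k), ..., x(k-m)) with the target x(k+h+1)."""
    x = np.asarray(x, dtype=float)
    ks = np.arange(m, len(x) - h - 1)
    inputs = np.stack([x[ks - j] for j in range(m + 1)], axis=1)
    targets = x[ks + h + 1]
    return inputs, targets

def horizon_errors(model, inputs, targets):
    """Error at instant k+h+1: measured value minus model output, per pattern."""
    outputs = np.array([model(row) for row in inputs])
    return targets - outputs

# Illustrative data: x(t) = t, so the alignment can be read off directly.
series = np.arange(20.0)
inputs, targets = direct_patterns(series, m=2, h=3)

# Placeholder model (assumption): simply repeats the last measured value x(k).
persistence = lambda window: window[0]
errors = horizon_errors(persistence, inputs, targets)
```

With this ramp series every target lies h+1 = 4 steps ahead of x(k), so the placeholder's error is 4 for every pattern; a trained network would replace `persistence` and be updated to reduce these errors.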
Advantages and disadvantages of the approaches
The main disadvantage of the first approach, when it is used for multi-step prediction, is that the parameter set has been obtained for the purpose of one-step prediction, i.e. to minimise the local errors. During the training phase, the model captures the relation between the actual observations of the original time series, x(k),...,x(k-m), and the value at the next sampling time, x(k+1). However, when the model acts as a multi-step prediction scheme, a group of the input neurones receives previously approximated values. Hence, errors in the network output at some instant may be propagated to future sampling times and may degrade the quality of the approximations at subsequent instants.
The number of predicted network outputs fed back as inputs to the network is given by the prediction horizon. Therefore, the capability of the first neural model to predict the future may decrease as the prediction horizon increases.
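A toy calculation can make the propagation of errors concrete. It assumes a deliberately simple setting that is not taken from the original text: the true series obeys x(k+1) = 0.9 x(k), and a hypothetical one-step model has learned the slightly wrong coefficient 0.95. The model is then iterated on its own outputs.

```python
import numpy as np

a_true, a_model = 0.9, 0.95   # hypothetical: the model coefficient is slightly off
h = 4                         # predict the interval [k+1, k+h+1]
x_k = 1.0                     # last measured value x(k)

# True values x(k+1), ..., x(k+h+1).
truth = x_k * a_true ** np.arange(1, h + 2)

# Recursive prediction: each output becomes the next input.
predictions = []
fed_back = x_k
for _ in range(h + 1):
    fed_back = a_model * fed_back
    predictions.append(fed_back)

abs_error = np.abs(np.array(predictions) - truth)
```

In this example the absolute error is 0.05 at instant k+1 and grows at every step, reaching about 0.18 at instant k+5, although the one-step model itself never changes. This only illustrates the mechanism; how quickly errors accumulate for a real network depends on the series and on the model.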
The second neural approach directly provides the prediction of the time series at instant k+h+1 from the information available at instant k. Hence, when the model is used to predict the future, the inputs to the network are measured time series values and no network outputs need to be fed back into the network input. Thus, the problem of error propagation disappears when the second model is used as a nonlinear multi-step prediction scheme.
A disadvantage of the second model concerns its structure. As previously mentioned, in this case the model directly predicts the time series value at instant k+h+1. The inputs to the model may not contain sufficient information about the time series to predict that instant. That is, the input vector, x(k),...,x(k-m), may be very distant in time from the prediction horizon, k+h+1, and may bear no relation to that instant. In this case, the second neural model cannot be used to predict the future. This structure only makes sense when a relation exists between the information available at the current instant and the prediction horizon.
On the other hand, it should be pointed out that the second model has only been trained to predict the time series value at instant k+h+1, whereas the first neural model can be used to predict every sampling time until the prediction horizon is reached. Thus, if the purpose is to predict the whole prediction interval [k+1,k+h+1], h different neural models of the second approach must be trained.
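Covering the whole interval with the second approach therefore means one model per instant. The sketch below trains h direct models, for instants k+2 to k+h+1, on the reading (an assumption) that instant k+1 is already covered by the one-step model. A least-squares linear map is swapped in for the neural network purely to keep the example short and checkable; the names `direct_patterns` and `fit_stand_in` are hypothetical.

```python
import numpy as np

def direct_patterns(x, m, j):
    """Pair each input (x(k), ..., x(k-m)) with the target x(k+j+1)."""
    x = np.asarray(x, dtype=float)
    ks = np.arange(m, len(x) - j - 1)
    inputs = np.stack([x[ks - i] for i in range(m + 1)], axis=1)
    return inputs, x[ks + j + 1]

def fit_stand_in(inputs, targets):
    """Least-squares linear map with bias: a stand-in for a trained network."""
    design = np.column_stack([inputs, np.ones(len(inputs))])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return lambda window: float(np.append(window, 1.0) @ coef)

# Illustrative data: a sampled sine wave; the last 10 samples are held out.
full = np.sin(0.3 * np.arange(210))
train = full[:200]
m, h = 1, 4

# One separately trained model for each instant k+j+1, j = 1, ..., h.
models = {j: fit_stand_in(*direct_patterns(train, m, j)) for j in range(1, h + 1)}

k = len(train) - 1
window = train[k - np.arange(m + 1)]               # (x(k), ..., x(k-m))
forecasts = np.array([models[j](window) for j in range(1, h + 1)])
truth = full[k + 1 + np.arange(1, h + 1)]          # x(k+2), ..., x(k+h+1)
```

Every model receives the same measured window and none of them sees another model's output, so no prediction error is fed back; the price is h separate training runs.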