Evolutionary and Neural Computation for Time Series Prediction Minisite

Data Sets for Time-Series Analysis

• Mackey-Glass time series
• Description: The Mackey-Glass series, based on the Mackey-Glass differential equation [Mac77], is widely regarded as a benchmark for comparing the generalization ability of different methods. It is a chaotic time series generated from the following time-delay ordinary differential equation:

dx(t)/dt = a*x(t-T) / (1 + x(t-T)^10) - b*x(t)

As in the majority of studies, the series has been generated with the following parameter values: a=0.2, b=0.1, and T=17. As in those studies, the task is to predict the value of the time series at point x[t+P] from the earlier points (x[t], x[t-6], x[t-12], x[t-18]). If the number of sample steps P is 50, the function to be learned, whose input dimension is 4, is:

x(t) = f(x(t-50),x(t-50-6), x(t-50-12),x(t-50-18))

The initial 3500 samples must be discarded to avoid the initialization transients. In many studies, such as [Ying97], the training set consists of 1000 data points, corresponding to sample times 3500 to 4499, and the test set consists of the points in the time interval [4500,5000].
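As an illustration of the setup described above, the following Python sketch generates a Mackey-Glass series and builds the training and test patterns. Several details are assumptions that the text does not specify: the equation is integrated with a simple Euler scheme (step dt = 0.1), the history is held constant at x = 1.2 for t <= 0, the nonlinearity uses the usual exponent 10, and the names mackey_glass and make_patterns are hypothetical. The intervals 3500-4499 and 4500-5000 are read here as the times of the predicted value, so the inputs reach back up to 68 samples before each target.

```python
def mackey_glass(n_samples, a=0.2, b=0.1, tau=17, dt=0.1, x0=1.2):
    """Return x(0), x(1), ..., x(n_samples - 1), sampled at integer times.

    Integrates dx/dt = a*x(t-tau) / (1 + x(t-tau)**10) - b*x(t) with a
    forward Euler scheme (an assumption; the source names no integrator).
    """
    steps_per_unit = round(1 / dt)
    delay_steps = tau * steps_per_unit
    # Constant history for t <= 0 (assumed initial condition).
    history = [x0] * (delay_steps + 1)
    samples = [x0]
    total_steps = (n_samples - 1) * steps_per_unit
    for step in range(1, total_steps + 1):
        x_now = history[-1]
        x_delayed = history[-1 - delay_steps]
        dx = a * x_delayed / (1.0 + x_delayed ** 10) - b * x_now
        history.append(x_now + dt * dx)
        if step % steps_per_unit == 0:
            samples.append(history[-1])
    return samples


def make_patterns(series, t_start, t_end, horizon=50, lags=(0, 6, 12, 18)):
    """Build (inputs, targets) with x(t) as target for t_start <= t <= t_end.

    Each input vector is (x(t-50), x(t-50-6), x(t-50-12), x(t-50-18)).
    """
    inputs, targets = [], []
    for t in range(t_start, t_end + 1):
        inputs.append([series[t - horizon - lag] for lag in lags])
        targets.append(series[t])
    return inputs, targets


series = mackey_glass(5001)
train_x, train_y = make_patterns(series, 3500, 4499)  # 1000 patterns
test_x, test_y = make_patterns(series, 4500, 5000)    # 501 patterns
```

A smaller step or a higher-order integrator changes the numerical values of the series, so results are only comparable between studies that generate the data in the same way.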

• Venice Lagoon time series

• Description: Unusually high tides, or sea surges, result from a combination of chaotic climatic elements and the more regular, periodic tidal systems associated with a particular area. The prediction of such events has always been a subject of intense interest, not only from a human point of view but also from an economic one. The most famous example of flooding in the Venice Lagoon occurred in November 1966 when, driven by strong winds, the lagoon rose nearly 2 m above the normal water level. The damage to the city's homes, churches and museums ran into hundreds of millions of euros.

Tidal behaviour is difficult to predict because it depends on many factors, such as astronomical and atmospheric agents. The problem has been approached with numerical models and with statistical methods. Numerical models require the computation of the meteorological forcing functions at each point of the finite-difference grid and are therefore computationally expensive. Linear stochastic models are suitable for online forecasting since they are simple and their computational burden is low.

Click on the links to get the files containing the water level in the Venice Lagoon, measured in centimetres every hour over the years 1980-1989 and 1990-1995 (provided by A. Tomasin, CNR-ISDMG Universita Ca'Foscari, Venice).
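To give an idea of why linear stochastic models are cheap to use, here is a minimal Python sketch of one such model, an autoregressive (AR) predictor fitted by least squares. It is not the method of any particular study, and it does not read the lagoon files: the hourly series is a synthetic sinusoid with a 12.42-hour period and 50 cm amplitude, chosen only for illustration. The function names solve_linear, fit_ar and predict_next are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ar(series, order):
    """Least-squares coefficients c with x[t] ~ sum_k c[k] * x[t-1-k]."""
    rows = [[series[t - 1 - k] for k in range(order)]
            for t in range(order, len(series))]
    targets = series[order:]
    # Normal equations: (R^T R) c = R^T y
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(order)]
            for i in range(order)]
    rhs = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(order)]
    return solve_linear(gram, rhs)


def predict_next(series, coeffs):
    """One-step-ahead forecast from the most recent len(coeffs) values."""
    return sum(c * series[-1 - k] for k, c in enumerate(coeffs))


# Synthetic hourly "water level" in cm (illustrative, not measured data).
OMEGA = 2 * math.pi / 12.42
levels = [50.0 * math.sin(OMEGA * t) for t in range(240)]

coeffs = fit_ar(levels[:-1], order=2)        # fit on all but the last hour
forecast = predict_next(levels[:-1], coeffs)  # forecast the held-out hour
```

Once the coefficients are known, each new forecast costs only a handful of multiplications, which is what makes this family of models attractive for online use. Real tide records contain several periodic components plus meteorological effects, so a higher order and a noise term would be needed in practice.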

• Sunspot time series

• Description: This time series contains the average number of sunspots per month, measured from January 1749 to March 1977. These data are available at http://sidc.oma.be ('RWC Belgium World Data Center for the Sunspot'). This chaotic time series exhibits local behaviours, noise, and even zones that cannot be predicted from the archived knowledge.

• North Atlantic Oscillation (NAO) time series
• Description: The North Atlantic Oscillation (NAO) is a large-scale oscillation in atmospheric mass between the subtropical high pressure and the polar low pressure. The corresponding index varies from year to year, but also exhibits a tendency to remain in one phase for intervals lasting several years. This index is traditionally defined as the normalized pressure difference between a station on the Azores and one on Iceland.
The positive phase of the NAO reflects below-normal pressure across the high latitudes of the North Atlantic and above-normal pressure over the central North Atlantic, the eastern United States and western Europe. The negative phase reflects an opposite pattern of pressure anomalies over these regions. Both phases of the NAO are associated with basin-wide changes in the intensity and location of the North Atlantic jet stream and storm track, and with large-scale modulations of the normal patterns of zonal and meridional heat and moisture transport [Hur95], which in turn result in changes in temperature and precipitation patterns often extending from eastern North America to western and central Europe.

• Download: A time series with the normalized NAO index for each month from 1950 to 2005 is available here: NAO index (1950-2005)

• Winter NAO Index

• Description: The NAO index over the winter months has a bigger influence on the North Atlantic climate, which is why this winter index time series is widely used. The winter (December through March) index of the NAO, based on the difference of normalized sea level pressure (SLP) between Lisbon, Portugal and Stykkisholmur/Reykjavik, Iceland since 1864, is available here. The SLP anomalies at each station were normalized by dividing each seasonal mean pressure by the long-term (1864-1983) standard deviation. Normalization is used to prevent the series from being dominated by the greater variability of the northern station. Positive values of the index indicate stronger-than-average westerlies over the middle latitudes.
The source of this information and additional NAO indexes can be found at http://www.cgd.ucar.edu/cas/jhurrell/indices.html

• Download: The data file is available here: Winter NAO index (1864-2004)
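The normalization described above can be sketched in a few lines of Python. This is only one reading of the description, under stated assumptions: each station's anomaly is taken relative to its own mean over the base period, the population standard deviation is used, and the index is the southern value minus the northern one. The function names and the pressure values are hypothetical; the numbers are toy values, not measurements.

```python
from statistics import mean, pstdev


def normalized_anomalies(slp_by_year, base_period):
    """Map year -> (seasonal mean SLP - base mean) / base standard deviation."""
    first, last = base_period
    base = [v for y, v in slp_by_year.items() if first <= y <= last]
    mu, sigma = mean(base), pstdev(base)
    return {y: (v - mu) / sigma for y, v in slp_by_year.items()}


def nao_index(south, north, base_period=(1864, 1983)):
    """Difference of normalized anomalies (southern minus northern station)."""
    s = normalized_anomalies(south, base_period)
    n = normalized_anomalies(north, base_period)
    return {y: s[y] - n[y] for y in sorted(s.keys() & n.keys())}


# Toy winter-mean pressures in hPa (made-up values for illustration only).
south_station = {1864: 1020.0, 1865: 1024.0, 1866: 1016.0}
north_station = {1864: 1000.0, 1865: 990.0, 1866: 1010.0}

index = nao_index(south_station, north_station)
```

In the toy data the northern station varies by 10 hPa and the southern one by only 4 hPa, yet after normalization both contribute equally to the index, which is the purpose of dividing by the standard deviation.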

• Exchange Rates: US Dollars / Euro (ECU)

• Description: This is a time series provided by the Spanish Central Bank (Banco de España) that shows the monthly mean value of the US Dollar per Euro (or ECU) rate from January 1979 to July 2005. This and many other financial time series are available at http://www.bde.es/infoest/htmls/downld.htm

• Download: The data file is available here: Dollar / Euro (1979-2005)

• Exchange Rates: US Dollars / United Kingdom Pound

• Description: This is a time series provided by the Spanish Central Bank (Banco de España) that shows the monthly mean value of the US Dollar per British Pound rate from January 1981 to July 2005. This and many other financial time series are available at http://www.bde.es/infoest/htmls/downld.htm

• Download: The data file is available here: Dollar / UK Pound (1981-2005)

• Interest Rates: Mean Interest Rates for non-transferable 3-month deposits

• Description: This is a time series provided by the Spanish Central Bank (Banco de España) that shows the monthly mean interest rates for non-transferable 3-month deposits from January 1977 to July 2005. This and many other financial time series are available at http://www.bde.es/infoest/htmls/downld.htm

• Consumer Price Index (CPI) of Spain

• Description: This is a time series provided by the Spanish Central Bank (Banco de España) that shows the monthly Consumer Price Index (CPI) in Spain from January 1960 to June 2005. The base index is 100 (CPI of OECD in June 2000). This and many other financial time series are available at http://www.bde.es/infoest/htmls/downld.htm

• Consumer Price Index (CPI) of Organisation for Economic Co-operation and Development (OECD)

• Description: This is a time series provided by the Spanish Central Bank (Banco de España) that shows the monthly Consumer Price Index (CPI) in the OECD from January 1960 to June 2005. The base index is 100 (CPI of OECD in June 2000). This and many other financial time series are available at http://www.bde.es/infoest/htmls/downld.htm

• IBEX 35 Stock Exchange Index

• Description: The main Spanish Stock Exchange Index (IBEX 35). The data have been taken daily from January 1992 to April 2005. This and many other Stock Exchange indexes are available at http://www.bolsasymercados.es/

• Time Series Prediction Competition: The CATS Benchmark

• Description: The goal of this competition is to provide a new benchmark for the problem of time series prediction and to compare the different methods or models that can be used for the prediction. The proposed time series is the CATS benchmark (for Competition on Artificial Time Series).
An artificial time series of 5,000 data points is given, of which 100 values are missing. These missing values are divided into 5 blocks:

• Elements 981 to 1,000;
• Elements 1,981 to 2,000;
• Elements 2,981 to 3,000;
• Elements 3,981 to 4,000;
• Elements 4,981 to 5,000.
These 100 missing values have to be predicted.
More detailed information is available at http://www.cis.hut.fi/~lendasse/competition/competition.html

• Download: The data file is available here: Time Series Prediction Competition
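The block structure listed above can be written down directly, for instance to separate the positions to be predicted from the known ones. The following Python sketch uses 1-based positions, as in the list; the constant and function names are hypothetical, and the layout of the downloadable file is not assumed.

```python
N_POINTS = 5000
BLOCK_ENDS = (1000, 2000, 3000, 4000, 5000)  # last element of each block
BLOCK_LENGTH = 20                            # 981..1000, 1981..2000, ...


def missing_positions():
    """1-based positions of the 100 values that have to be predicted."""
    return [p
            for end in BLOCK_ENDS
            for p in range(end - BLOCK_LENGTH + 1, end + 1)]


def split_known_and_missing(n_points=N_POINTS):
    """Return (known, missing) as sorted lists of 1-based positions."""
    missing = set(missing_positions())
    known = [p for p in range(1, n_points + 1) if p not in missing]
    return known, sorted(missing)


known, missing = split_known_and_missing()
```

Note that the first four blocks have known data on both sides, whereas the last block (4,981 to 5,000) lies at the end of the series and can only be extrapolated from earlier values.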

• Australia monthly production of electricity

• Spanish Electric Energy Demand (monthly organized)

• Description: A collection of time series showing the electric energy demand (GWh) in Spain for each month from 1959 to 2000. These data and many others related to the Spanish Electric System are available from Red Eléctrica Española: www.ree.es

The data are organized in columns, one per month.
• Download: The electricity demand data file is available here: Spanish Electric Energy Demand (1959-2000)

• Spanish Electric Energy Demand time series

• Description: The data above have been rearranged into a single time series of the Spanish electric energy demand, in GWh, for each month from January 1959 to September 2000. These data have been obtained from Red Eléctrica Española: www.ree.es

• Download: The data file is available here: Spanish Electric Energy Demand time series
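The rearrangement from the month-organized table into a single series amounts to reading the table row by row. The Python sketch below shows the idea under assumptions about the layout that the files themselves should be checked against: one row per year, twelve monthly columns, and missing months (here October to December 2000) marked as None. The function name and the values are hypothetical placeholders, not real demand figures.

```python
def flatten_monthly_table(rows, first_year):
    """Turn rows of 12 monthly values (one row per year) into one series.

    Returns a list of (year, month, value) tuples in chronological order,
    skipping entries marked as None.
    """
    series = []
    for offset, row in enumerate(rows):
        for month, value in enumerate(row, start=1):
            if value is not None:
                series.append((first_year + offset, month, value))
    return series


# Placeholder table for 1959-2000: 42 rows of 12 values each.
toy_rows = [[float(12 * y + m) for m in range(12)] for y in range(42)]
toy_rows[-1][9:] = [None, None, None]  # no data after September 2000

demand_series = flatten_monthly_table(toy_rows, first_year=1959)
```

With this layout the flattened series has 41 full years plus 9 months, that is 501 monthly values from January 1959 to September 2000.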

© 2005, University CARLOS III of Madrid