Selection of representative polling stations


In every electoral process, surveys are important for making accurate
predictions of the overall electoral results. Usually, these polls are
carried out on representative samples of the electorate, and these samples are a
large subset of the population. If it were possible to know which
polling stations are the most representative, that is, the ones whose
electorate most accurately
reflects the position of the overall electorate, then the surveys could
be carried out only in these representative districts. In this
way, the predictions of the survey would remain accurate enough
while its cost would be much lower.



Instances: An instance of this problem
is presented here, with available data corresponding to the last
regional (autonomous community) elections in Madrid, held in October 2003. Each example
corresponds to a specific polling table and has 22 attributes,
which indicate the number of votes obtained by each of the 22 political
parties that took part in the elections. The exact description of the
attributes, as well as the 5865 examples corresponding to the totality
of the polling tables, are:
The original data have been
processed in order to organize them by polling section instead of by
polling table, resulting in 3928 examples. The data organized by
polling section have then been processed again in order to transform the
number of votes of each political party into a value relative to
the electoral roll of the section. This relative value is a
real number between 0 and 1, calculated by dividing
the number of votes by the number of people registered in that section.
Both transformed data sets are available through the following links:
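
The two processing steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout (section identifier, registered voters, votes per party), the function names and the toy figures are hypothetical and are not the format or content of the original data files, and three parties are used instead of 22 for brevity. It also assumes that the electoral roll of a section is the sum of the registered voters of its tables.

```python
from collections import defaultdict

def tables_to_sections(tables):
    """Sum the electoral roll and the per-party votes of the tables in each section.

    `tables` is a list of (section_id, roll, votes) tuples. This layout is
    hypothetical, chosen only for the example.
    """
    rolls = defaultdict(int)
    votes = {}
    for section_id, roll, table_votes in tables:
        rolls[section_id] += roll
        if section_id not in votes:
            votes[section_id] = [0] * len(table_votes)
        for i, v in enumerate(table_votes):
            votes[section_id][i] += v
    return {s: (rolls[s], votes[s]) for s in votes}

def relative_votes(sections):
    """Divide each party's votes by the number of people registered in the section."""
    return {
        s: [v / roll for v in votes]
        for s, (roll, votes) in sections.items()
        if roll > 0  # skip sections with an empty roll to avoid dividing by zero
    }

# Toy data: two tables in section "A", one table in section "B".
tables = [
    ("A", 500, [200, 100, 50]),
    ("A", 300, [100, 80, 20]),
    ("B", 400, [120, 160, 40]),
]
sections = tables_to_sections(tables)
relative = relative_votes(sections)
```

Because the number of votes cast can never exceed the number of registered voters, every value produced this way lies between 0 and 1, and the values of one section sum to at most 1.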


Solutions to the problem: This is not a well-known problem and, thus, there are no well-known solutions for it. We propose two approaches to solve it and present the results obtained for the instance of the problem given above.

Approach 1

We have used the Self-Organizing Map (SOM) algorithm created by Professor Teuvo Kohonen [Kohonen, 1982, Kohonen, 1990, Kohonen, 1995c, Kohonen et al., 1996b] to select the most representative sections. We have tried eight different two-dimensional SOM topologies: 5*5, 6*6, ... and 12*12 neurons. Each polling section is characterized by 22 attributes, which represent the relative values of the electoral results for each political party. Thus, 3928 22-dimensional vectors are mapped onto each two-dimensional map, with similar vectors (electoral sections) grouped together. The vector corresponding to the global polling results is also mapped. The region of the map where this global vector falls is the selected region, and all the vectors mapped into this region are the selected electoral sections.

The following links show the results obtained for each chosen topology. In addition, a summary with a global measure of the error for each topology is given. The SOM topology is denoted by N * N, where N is the number of rows and columns of the two-dimensional map.

Results:
SOM topologies: 5*5, 6*6, 7*7, 8*8, 9*9, 10*10, 11*11, 12*12
Summary for all topologies

Approach 2

Genetic Algorithms ......
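
The selection step of Approach 1 can be illustrated with a small self-contained sketch. It is not the implementation used to produce the results above: the training schedule (Gaussian neighbourhood, linearly decaying learning rate and radius, initialization from random samples), the function names and the synthetic data are all assumptions made for the example. A "region" is taken here to be a single map unit, and the stand-in global vector is an unweighted mean, whereas a real global result would be weighted by electoral roll.

```python
import math
import random

def bmu(weights, x):
    """Return the (row, col) of the unit whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best

def train_som(data, n, epochs=20, lr0=0.5, seed=0):
    """Train an n*n map (hyperparameters are assumptions, not the original ones)."""
    rng = random.Random(seed)
    # Initialize every unit with a copy of a randomly chosen sample.
    weights = [[list(rng.choice(data)) for _ in range(n)] for _ in range(n)]
    sigma0 = n / 2.0
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = data[:]
        rng.shuffle(order)
        for x in order:
            frac = 1.0 - step / total          # decays linearly towards 0
            lr = lr0 * frac
            sigma = max(sigma0 * frac, 0.5)    # keep a minimal neighbourhood
            br, bc = bmu(weights, x)
            for r in range(n):
                for c in range(n):
                    dist2 = (r - br) ** 2 + (c - bc) ** 2
                    h = lr * math.exp(-dist2 / (2 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(len(w)):
                        w[i] += h * (x[i] - w[i])
            step += 1
    return weights

def select_representative(section_vectors, global_vector, n):
    """Return the sections mapped to the same unit as the global vector."""
    weights = train_som(list(section_vectors.values()), n)
    target = bmu(weights, global_vector)
    return [s for s, v in section_vectors.items() if bmu(weights, v) == target]

# Synthetic stand-in data: 60 sections, 3 parties (the real data has 3928 and 22).
rng = random.Random(1)
sections = {f"s{i}": [rng.random() * 0.5 for _ in range(3)] for i in range(60)}
# Unweighted mean used only as a stand-in for the global polling result.
global_vector = [sum(v[i] for v in sections.values()) / len(sections) for i in range(3)]
selected = select_representative(sections, global_vector, n=5)
```

With a larger map each unit covers a smaller part of the input space, so fewer sections tend to share the unit of the global vector; this is one reason for comparing several topologies.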









Last Updated: 6/07/05
