IndElec: A Software for Analyzing Party Systems and Electoral Systems

IndElec is a software package that computes a wide range of indices from electoral data, intended for the analysis of both party systems and electoral systems in political studies. IndElec can calculate such indices from electoral data at several levels of aggregation, even when the acronyms of some political parties change across districts. As the amount of information provided by IndElec may be considerable, the software also aids the user in the analysis of electoral data through three capabilities. First, IndElec automatically produces preliminary descriptive statistical reports of the computed indices. Second, IndElec saves the computed information into text files in data matrix format, which can be loaded directly by any statistical software to facilitate more sophisticated statistical studies. Third, IndElec provides results in several file formats (text, CSV, HTML, R) to facilitate their visualization and management with a wide range of applications (word processors, spreadsheets, web browsers, etc.). Finally, a graphical user interface is provided to manage calculation processes, but no visualization facility is available in this environment: both the inputs and outputs of IndElec are arranged in files with the aforementioned formats.


Introduction
IndElec is a software intended to compute a wide range of indices measuring characteristics of party systems and electoral systems in political studies. Among such characteristics, we can briefly mention the disproportionality of an electoral system and some of the main dimensions of a party system, such as fragmentation, effective number of parties, concentration, competitiveness, polarization, regionalism, party linkage and volatility. More detailed information about the indices computed by IndElec, including references, is found in Appendix A.
IndElec was initially developed to carry out the analysis of all the elections held over 1977-1999 in Spain, which is available in Oñate and Ocaña (1999). The studied elections were those for the Spanish parliament, the autonomous region parliaments and the European parliament, namely 65 elections in total. The high number of considered elections and the different aggregation levels available in the electoral databases, which were provided by the Spanish Ministry of the Interior, motivated the initial development of IndElec. However, the software is now designed to analyze any political system, not only the Spanish one. From a computational point of view, some of the indices provided by IndElec (disproportionality, effective number of parties, fragmentation, etc.) are computed from a data set drawn from a single election, which is given by the votes and seats obtained by the competing parties. IndElec also computes volatility indices, which depend on data drawn from two (consecutive) elections (Pedersen 1979; Bartolini and Mair 2007). Apart from its use as a spreadsheet with many indices implemented, when the electoral data present several levels of aggregation (state, region, district, etc.), IndElec calculates such indices for each of the districts considered at every level of aggregation. In this data framework, some additional indices are implemented in IndElec to compare the effects of data aggregation on some characteristics of the studied political system (Cox 1999; Oñate and Ocaña 1999), i.e., regionalism and party linkage. In summary, more than sixty indices can be calculated by IndElec for each electoral distribution. Moreover, with the user's help, IndElec can learn to match the acronyms of political parties when some parties present several acronyms across districts. For instance, this practice is common in Spanish elections as a strategy when a party wants to appeal to voters' regionalist feelings (Lago-Penas 2004; Oñate and Ocaña 1999; Diamandouros and Gunther 2001).
From a technical point of view, IndElec consists of several software libraries and a graphical user interface (GUI). Much of IndElec is coded in Pascal and its GUI is developed in Object Pascal (an object-oriented extension of Pascal). Though the current Windows binary release of IndElec is compiled with Delphi, IndElec, except for its GUI, could be compiled with the classic Borland Pascal compiler or any free alternative (Free Pascal Compiler, Lazarus, etc.) with minor changes. On the whole, the program logic of IndElec distinguishes two modules: Dimensi and Volatili. Dimensi includes the indices that depend on a single election, and Volatili focuses on those indices associated with two elections.
The exchange of information between IndElec and the user is conducted mainly through text files, somewhat like the LaTeX way of working. Figure 1 illustrates this idea by showing a scheme of the use of IndElec. Firstly, the input information and some of the settings for IndElec (data and other specifications) are saved into text files by the user. Secondly, the output information obtained by IndElec, which is made up of computed indices, statistical analyses and matrices, is also saved in several text-based files by IndElec. This makes IndElec easy to use, because any text editor can manage the files associated with IndElec. Moreover, to improve the readability of the IndElec output and its integration with other software, some additional file formats, such as CSV, HTML and R, are considered.
This manuscript is organized as follows. The first sections focus on the module Dimensi of IndElec. Sections 2 and 4 explain the management of Dimensi for the two considered data frameworks, respectively. The implementation of any structure of data aggregation by means of levels is treated in Section 3. The module Volatili of IndElec is described in Section 5. To illustrate some of the details provided in this manuscript, two real data examples will be considered throughout: the Spanish parliamentary elections held in 2004 and 2000. Finally, the integration of IndElec with a statistical software, namely R (R Development Core Team 2011), is exemplified in Section 6.

Module Dimensi with aggregated data
The aggregated data framework arises when the available electoral data consist of the overall numbers or shares of votes and seats for each of the competing parties in a given election. This is the simplest data framework under which IndElec can be used. In this case, IndElec can be viewed as a spreadsheet with many political indices implemented in its code. To illustrate the usage of the module Dimensi of IndElec, the 2004 Spanish parliamentary election will be considered in what follows.
The aggregated data for a given election must appear in a text file with extension *.dat. The information in such a file must be arranged according to the following syntax: the first line contains a short description of the electoral data; the second line is not taken into account by IndElec; each of the following lines contains the acronym, the votes and the seats, for each competing party. Any of such quantities for any party can be provided as number, proportion or percentage (the implementation of IndElec takes care of such numeric settings).
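The *.dat syntax described above can be sketched with a short reader. This is only an illustrative sketch, not part of IndElec: the function names, the assumption that fields are separated by whitespace, and the sample figures (parties AAA, BBB and CCC) are hypothetical. Normalizing by the column total is one simple way to treat numbers, proportions and percentages uniformly; the source does not state how IndElec itself does this.

```python
from typing import List, NamedTuple, Tuple


class PartyResult(NamedTuple):
    acronym: str
    votes: float
    seats: float


def parse_dat(text: str) -> Tuple[str, List[PartyResult]]:
    """Read the contents of a *.dat file (aggregated data framework).

    Line 1 is a short description, line 2 is ignored, and every
    following non-empty line holds: acronym, votes, seats.
    Whitespace-separated fields are an assumption of this sketch.
    """
    lines = text.splitlines()
    description = lines[0].strip()
    results = []
    for line in lines[2:]:  # skip the description and the ignored line
        fields = line.split()
        if not fields:
            continue
        results.append(PartyResult(fields[0], float(fields[1]), float(fields[2])))
    return description, results


def to_shares(values: List[float]) -> List[float]:
    """Normalize counts, proportions or percentages to shares summing to 1."""
    total = sum(values)
    return [v / total for v in values]


# Hypothetical data, for illustration only (not real election results).
SAMPLE_DAT = (
    "Hypothetical election (illustrative figures only)\n"
    "this second line is ignored\n"
    "AAA 600 6\n"
    "BBB 300 3\n"
    "CCC 100 1\n"
)

description, results = parse_dat(SAMPLE_DAT)
vote_shares = to_shares([r.votes for r in results])
seat_shares = to_shares([r.seats for r in results])
```

Here `vote_shares` and `seat_shares` are the two distributions from which single-election indices would be computed.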
For example, the aggregated data from the 2004 Spanish parliamentary election, which are contained in the input file da04.dat, are arranged as follows:

Under the aggregated data framework, IndElec performs the analysis of electoral data and saves the output information in three files with different formats. On the one hand, it generates a text file with extension *.out and an HTML file (da04.out and da04.htm, in our example). Apart from the indices of disproportionality and those of the party system dimensions, except volatility, the module Dimensi saves the electoral data ordered by votes, together with the corresponding cumulative distributions of votes and seats. Further, to visualize the disproportionality by party, it also displays the deviations between votes and seats for each party. For example, in the output of IndElec for the data file da04.dat, we can distinguish the following information:

Defining an aggregation structure in IndElec
Any data aggregation structure given through several levels (discrete aggregation) can be implemented in IndElec by the user. Levels of aggregation can be considered in electoral data, when there exists an aggregation structure of geographic units or items (countries, regions, provinces, districts, etc.) in the area where the studied election is held. According to such an aggregation structure, an electoral data distribution is thus gathered for each of those geographic units. Indeed such distributions will make up the data set to be provided to IndElec.
From a mathematical point of view, let $R^1$ be the area or overall region where a given election took place and $L$ be the number of aggregation levels to be considered in this region. Each level of aggregation, denoted by $\ell \in \{1, \ldots, L\}$, is defined by a family $F^\ell = \{R^\ell_i : i = 1, \ldots, M_\ell\}$ of disjoint geographic units such that $\bigcup_{i=1}^{M_\ell} R^\ell_i = R^1$, where $F^1 = \{R^1\}$ to ensure consistent notation. These families are assumed to be nested in such a way that, $\forall \ell \in \{2, \ldots, L\}$ and $\forall j \in \{1, \ldots, M_\ell\}$, there must exist a unique $i \in \{1, \ldots, M_{\ell-1}\}$ such that $R^{\ell-1}_i \supseteq R^\ell_j$. Therefore, such an aggregation structure can be viewed as a set of nested layers, $\{F^\ell : \ell = 1, \ldots, L\}$, which establish successive partitions of the overall region, $R^1$. Notice that the level of aggregation decreases as $\ell$ increases. Indeed $\ell$ stands for splitting instead of aggregation.
The aforementioned aggregation structure can be understood by IndElec. To this end, the user must implement such an aggregation structure by composing some configuration text files, which must be included in the IndElec setup folder. In fact, IndElec will not understand an aggregation structure in the provided electoral data, unless such a structure is defined in IndElec. So the configuration files for defining an aggregation structure will be detailed in the following paragraphs.
First of all, the main configuration file, which must be named indelec.cfg, stores a scheme of the aggregation structure to be defined, as follows:
where nameAgLev$_\ell$ is a character string which names the aggregation level $\ell$, $\forall \ell \in \{1, \ldots, L\}$.
Second, for each $\ell \in \{2, \ldots, L\}$, a configuration file named nameAgLev$_\ell$.txt will contain the descriptions of the geographic units of the aggregation level $\ell$, i.e., the codification of $F^\ell = \{R^\ell_i : i = 1, \ldots, M_\ell\}$. To compose a nameAgLev$_\ell$.txt file, with $\ell \in \{2, \ldots, L\}$, the syntax is given by the following guidelines.
The first line of the file nameAgLev$_\ell$.txt contains the number of geographic units for the level $\ell$, i.e., $M_\ell$. So the description of geographic units starts in the second line of this file.
Each geographic unit $R^\ell_i$ is identified by the code $i \in \{1, \ldots, M_\ell\}$ and its name (a character string).
The description of every geographic unit, $R^\ell_i$, occupies three lines of the nameAgLev$_\ell$.txt file. The first line contains the code $i$ of $R^\ell_i$ and also the codes of those geographic units, for higher levels of aggregation, containing $R^\ell_i$. The name of $R^\ell_i$ appears in the second line. The third line is always blank to end the description of $R^\ell_i$. For example, assume that $R^\ell_i$ is contained in the geographic units coded by $i_1, \ldots, i_{\ell-2}$ at the higher levels of aggregation. The description of $R^\ell_i$ is then given by the following three lines:

    $i$ $i_1$ . . . $i_{\ell-2}$
    the name of $R^\ell_i$
    (a blank line)

    17            (the number of regions in Spain)
    1             (the code of Andalucia)
    ANDALUCIA
                  (a blank line)
    . . .
    14            (the code of Pais Vasco)
    PAIS_VASCO
    . . .

Table 1: A view of the file CCAA.txt, which defines the aggregation level given by the 17 autonomous regions in Spain.

Notice that no code is considered for the highest level of aggregation given by $F^1 = \{R^1\}$ ($\ell = 1$). Further, apart from the descriptions of geographic units, the nameAgLev$_\ell$.txt configuration files specify the nested relationships among the families $\{F^\ell : \ell = 1, \ldots, L\}$.
Finally, as reality sometimes departs from theory, IndElec is designed to allow $\bigcup_{i=1}^{M_\ell} R^\ell_i \subseteq R^1$ for some aggregation levels $\ell$. However, this introduces only a slight variation into the logic underlying the theoretical framework considered here.
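The three-line record syntax of a level configuration file can be sketched with the following reader. It is a minimal sketch rather than IndElec code: the function name is hypothetical, and the sample is a trimmed fragment built from the two Prov.txt records quoted in this paper (the real file describes 52 provinces, so the count in the first line is reduced to 2 here).

```python
from typing import Dict, List, Tuple


def parse_level_file(text: str) -> Dict[int, Tuple[str, List[int]]]:
    """Read the contents of a level configuration file (e.g., Prov.txt).

    Line 1 holds the number of geographic units. Each unit then takes
    three lines: its code followed by the codes of the containing units,
    its name, and a blank line. Returns {code: (name, parent_codes)}.
    """
    lines = text.splitlines()
    n_units = int(lines[0].split()[0])
    units = {}
    for k in range(n_units):
        start = 1 + 3 * k  # first line of the k-th three-line record
        codes = [int(c) for c in lines[start].split()]
        name = lines[start + 1].strip()
        units[codes[0]] = (name, codes[1:])
    return units


# Trimmed sample: only two of the provinces, so the count is 2, not 52.
SAMPLE_PROV = "\n".join([
    "2",
    "1 14",     # Alava has code 1 and lies in Pais Vasco (code 14)
    "Alava",
    "",
    "4 1",      # Almeria has code 4 and lies in Andalucia (code 1)
    "Almeria",
    "",
])

provinces = parse_level_file(SAMPLE_PROV)
```

The parent codes returned for each unit are what encode the nesting between consecutive levels.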

An example: Spanish parliamentary elections
In the study of Spanish parliamentary elections, it can be worth considering both regions and provinces. On the one hand, the provinces are the districts where the electoral rule is applied. On the other hand, the regions are political and cultural unions of provinces (they are called autonomous regions). More information on the political map of Spain is available at http://www.maps.data-spain.com/. To implement the aggregation structure induced by the Spanish political map, the configuration (text) file indelec.cfg will contain the following elements:

Total CCAA Prov
This specifies that three aggregation levels ($L = 3$) can be considered, which stand for the aggregation levels given by Spain ($F^1 \equiv$ Total), the autonomous regions ($F^2 \equiv$ CCAA) and the provinces ($F^3 \equiv$ Prov). The geographic units for the aggregation levels given by CCAA and Prov are thus defined in the text files CCAA.txt and Prov.txt, respectively, whose contents are sketched in Tables 1 and 2. For instance, notice how the province Alava, which is coded by integer 1 in Prov.txt, is defined as included in the autonomous region Pais Vasco, which is coded by 14 in CCAA.txt.
The Spanish parliamentary elections not only provide an example to illustrate the definition of an aggregation structure in IndElec, but also show how IndElec can be adapted to real situations that only partially match the framework for aggregation structures previously formulated. In fact, the family $F^2$, made up of the Spanish autonomous regions, satisfies $\bigcup_{i=1}^{17} R^2_i \subset R^1$ ($R^1$ is Spain), because two provinces, namely Ceuta and Melilla, are not included in $\bigcup_{i=1}^{17} R^2_i$.

    52            (the number of provinces or subregions in Spain)
    1 14          (the code of Alava is 1; it is included in Pais Vasco)
    Alava
                  (a blank line)
    . . .
    4 1           (the code of Almeria is 4; it is included in Andalucia)
    Almeria
    . . .

Table 2: A view of the file Prov.txt, which defines the aggregation level given by the 52 provinces in Spain.
Indeed Ceuta and Melilla have a special legal status (autonomous cities), so they are not usually considered in the Spanish political map of autonomous regions. However, they are usually included as provinces.

Module Dimensi with levels of data aggregation
In this section, the use of the module Dimensi of IndElec with data at several aggregation levels is presented. Roughly speaking, the management of Dimensi in this case can be viewed as an interactive process, where the user and IndElec exchange information until the final results (the output files) are obtained. Due to the considerable number of input and output files involved in this data framework, it is highly recommended to use a specific folder for each election data set. The exposition in this section follows the stages of an IndElec run under the considered data framework. The resulting step-by-step process is sketched in Figure 2.

The database
The electoral data with some aggregation levels must be provided to IndElec in an (input) text file with extension *.dab. The aggregation structure considered in the data file should be previously defined as described in Section 3.
In the electoral data to be provided to IndElec, let $H$ be the number of aggregation levels, $\ell_1$ be the highest level of aggregation and $R^{\ell_1}_{i_1}$ be the overall geographic unit, where $H > 1$, $1 \le \ell_1 < \ell_1 + H - 1 \le L$ and $i_1 \in \{1, \ldots, M_{\ell_1}\}$. As we can see, the notation introduced in Section 3 is used in what follows.
The electoral data must be stored in the *.dab text file by following these guidelines.
The first line of the *.dab file contains a short description of the electoral data.
The integers $H$ and $\ell_1$ appear in the following two lines, respectively.
The integer in the fourth line specifies the overall geographic unit. We have two options: it may be the code $i_1$ or the value zero. The value zero means that the code $i_1$ will appear in each of the following data records; otherwise, $i_1$ will not appear in those records. Nevertheless, if $\ell_1 = 1$, then any nonzero integer could be considered to name $R^1$.
The fifth line is blank. This establishes the end of the definition of the aggregation structure available in our data. Thus the data records of any of the considered electoral distributions appear sequentially from the sixth line.
Each party data record occupies H + 2 or H + 3 lines in the *.dab file: H − 1 lines, for the H − 1 codes describing the considered geographic unit (if the fourth line contains zero, then an additional line is needed to include i 1 ), and three lines for the acronym, the votes and the seats, respectively, for such a party in such a geographic unit. Finally, we must add a blank line in the data file to establish the end of a party data record.
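The header guidelines above can be sketched as a small reader that also derives the expected record length. This is a hedged illustration only: the function name, the returned dictionary keys and the sample header are hypothetical, and the body of the party records is deliberately not parsed here.

```python
from typing import Dict, Union


def parse_dab_header(text: str) -> Dict[str, Union[str, int, bool]]:
    """Read the five header lines of a *.dab file.

    Line 1: description; line 2: H (number of levels); line 3: the
    highest aggregation level; line 4: code of the overall unit, or 0
    if that code is repeated inside every record; line 5: blank.
    """
    lines = text.splitlines()
    n_levels = int(lines[1])
    top_level = int(lines[2])
    overall_code = int(lines[3])
    if lines[4].strip():
        raise ValueError("the fifth line of a *.dab file must be blank")
    code_in_records = overall_code == 0
    # H - 1 code lines (plus one if the overall code is repeated),
    # then acronym, votes and seats; the closing blank line is extra.
    record_lines = n_levels + 3 if code_in_records else n_levels + 2
    return {
        "description": lines[0].strip(),
        "n_levels": n_levels,
        "top_level": top_level,
        "overall_code": overall_code,
        "code_in_records": code_in_records,
        "record_lines": record_lines,
    }


# Hypothetical header: three levels starting at level 1, overall unit 1.
SAMPLE_HEADER = "Hypothetical election\n3\n1\n1\n\n"
header = parse_dab_header(SAMPLE_HEADER)
```

With a zero in the fourth line, the same header would give six lines per record instead of five.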
For instance, let us consider a party with acronym PARTY which obtains $V$ votes and $S$ seats in the geographic unit $R^{\ell_1+\tau}_{j_\tau}$, for any $\tau < H$ and any $j_\tau \in \{1, \ldots, M_{\ell_1+\tau}\}$. The figures for $V$ and $S$ can be numbers or shares in the file. Further, assume that this geographic unit is contained in the units $R^{\ell_1+s}_{j_s}$, with $j_s \in \{1, \ldots, M_{\ell_1+s}\}$, $\forall s = 1, \ldots, \tau$. Under these settings, its party record in the *.dab file is stored as follows:

For each level $\ell$, the code $\sigma(\ell)$ is an integer such that $\sigma(\ell) > M_\ell$, which stands for the collapse of the aggregation level $\ell$. IndElec automatically recognizes such codes $\sigma(\ell)$, $\forall \ell$, from the *.dab file.
To illustrate the structure of a *.dab data file, we consider the 2004 Spanish parliamentary election with the aggregation structure defined in Section 3.1. The corresponding electoral data are available in the file spain4ag.dab, whose data records are arranged as described in Table 3.
In these electoral data, we can consider some records for the Spanish Socialist Workers' Party (PSOE), which are roughly illustrated in Table 4. This table shows a common practice for some parties in elections in Spain: the acronym of a party changes across regions or districts in order to appeal to the regionalist feelings of potential voters. This means that PSOE-A, PSOE and PSE-EE, among others, are official acronyms of the same political party. This practice poses a serious problem in data analysis, because the parties are usually labeled in official databases by several official acronyms. IndElec provides a way to sort out this problem, which is described in Section 4.2.

Management of acronyms
When the *.dab data file is provided to IndElec (or Dimensi), an information exchange process is performed between the user and IndElec. In this step-by-step process, on the one hand, the user teaches IndElec by providing information about parties and, on the other hand, IndElec eases the user's work by generating preliminary templates of some input files to serve in subsequent steps.
First, IndElec extracts all the acronyms from the *.dab file into a text file named siglas.txt. This file thus contains the acronyms recognized by IndElec from the provided data. However, the user must supply IndElec with additional information about the parties referred to by the acronyms in siglas.txt. In fact, the version of siglas.txt generated by IndElec is just a template, where the user must specify whether each acronym belongs to a state-wide party, labeled by NO-PANE, or to a regional party, labeled by PANE. To this end, the user edits siglas.txt and writes NO-PANE or PANE below each party acronym. The user version of siglas.txt is then read by IndElec to incorporate the regional-national information. For the 2004 Spanish parliamentary election, the siglas.txt file to be provided to IndElec is described in Table 5.
Second, the problem of acronyms changing across districts is solved through IndElec. Mathematically speaking, the solution consists of establishing the quotient set of the set of party acronyms appearing in siglas.txt, where two acronyms are equivalent when they are associated with the same political party. This quotient set of acronyms is defined by its equivalence classes, which are the subsets of acronyms belonging to the same party. The solution is implemented by the user in the input text file siglaso.txt. This file contains the aforementioned equivalence classes according to the following guideline: for any equivalence class of acronyms, each acronym appears in a line of siglaso.txt and the end of the class is marked by a blank line.
For example, in the 2004 Spanish parliamentary election, the final version of siglaso.txt to be provided to IndElec is described in Table 6.
As the construction of siglaso.txt from scratch can be laborious for the user, IndElec provides a preliminary version of siglaso.txt to be only modified by using any text editor, where the acronyms considered at the highest level of aggregation are distinguished.
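The equivalence classes stored in siglaso.txt can be read with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, the acronym XYZ is made up, and choosing the first acronym of each class as its representative is an assumption of this example rather than documented IndElec behaviour.

```python
from typing import Dict, List


def parse_equivalence_classes(text: str) -> List[List[str]]:
    """Read siglaso.txt-style contents: one acronym per line,
    with a blank line closing each equivalence class."""
    classes, current = [], []
    for line in text.splitlines():
        acronym = line.strip()
        if acronym:
            current.append(acronym)
        elif current:           # a blank line ends the current class
            classes.append(current)
            current = []
    if current:                 # tolerate a missing final blank line
        classes.append(current)
    return classes


def representative_map(classes: List[List[str]]) -> Dict[str, str]:
    """Map every acronym to the first acronym of its class
    (the choice of representative is an assumption of this sketch)."""
    return {acronym: cls[0] for cls in classes for acronym in cls}


# PSOE, PSOE-A and PSE-EE name the same party; XYZ is hypothetical.
SAMPLE_SIGLASO = "PSOE\nPSOE-A\nPSE-EE\n\nXYZ\n\n"

classes = parse_equivalence_classes(SAMPLE_SIGLASO)
same_party = representative_map(classes)
```

With such a map, votes recorded under PSOE-A or PSE-EE in different districts can be added to those of PSOE when data are aggregated.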
Finally, as some polarization indices can be obtained by IndElec, the (left-right) ideological scores in the interval [0, 10], for every party, must be supplied in the input text file siglapo.txt. The syntax of this file is based on that of siglaso.txt. In fact, to easily obtain siglapo.txt, we can modify siglaso.txt by adding such party scores in the first line of each record, where now each party is viewed as an equivalence class of acronyms in siglaso.txt. However, the equivalence classes in siglapo.txt are not necessarily equal to those in siglaso.txt.
In the 2004 Spanish parliamentary election, the input file siglapo.txt is illustrated in Table 7.

Output files
From data with several levels of aggregation, IndElec computes many political indices for each electoral distribution (set of pairs, votes and seats, for every party) associated with each of the geographic units in every aggregation level. Further, IndElec computes other political indices quantifying properties of party systems that change across the geographic aggregation (regionalism and party linkage, mainly). These results are saved in the report file result.out, where a preliminary statistical summary, including exploratory statistics (median, quartiles) and covariance and correlation matrices, is automatically produced by IndElec to provide a first view of the results. Moreover, the contents of result.out are available in both CSV and HTML formats. For the HTML format, IndElec additionally generates a version of result.out with frames, which is available in resultf.htm (the version without frames is given by result.htm).
In order to facilitate the statistical analysis of the results derived by IndElec, they are organized in two data matrices (data frames, in the R terminology), which are stored in two kinds of files. IndElec automatically generates both the text and CSV formats for these files. The output files matriREG.* contain the computed indices of regionalism and party linkage, and the files matrizDD.* contain the rest of the indices derived by Dimensi. Therefore, these output files can be loaded as data files into any statistical software (R, S, SPSS, etc.) in order to perform more sophisticated statistical analyses of the results derived by IndElec.

The module Volatili
Volatili is the module of IndElec that calculates the volatility indices (Pedersen 1979; Katz, Rattinger, and Pedersen 1997). Because volatility indices are associated with two elections held on two dates (years, for instance), rather than with one election as is the case with Dimensi, they were implemented in a separate software module, which uses the internal results (binary files, etc.) previously obtained by Dimensi for each election. The indices implemented in Volatili are the total volatility indices proposed by Pedersen (1979) and a generalization of the bloc volatility indices suggested by Bartolini and Mair (2007). Moreover, the two data frameworks previously considered (aggregated data and data with aggregation levels) can also be managed by Volatili.
In political studies, volatility is a dimension quantifying the changing patterns in a party system, i.e., the total transfer of votes among political parties or blocs of parties between two consecutive elections. Pedersen (1979) suggested an index that quantifies such transfers among parties: the index of total volatility. The Pedersen volatility measure (PVM) became more sophisticated when Bartolini and Mair (2007) tried to explain electoral change by taking into account the alignment of parties in two ideological blocs, namely the left-wing parties and the right-wing parties. These authors thus defined the indices of (inter) bloc volatility and intra-bloc volatility.
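As a worked illustration of these definitions, the sketch below computes the total volatility as half the sum of absolute changes in party shares, the bloc volatility as half the sum of absolute net changes per bloc, and the intra-bloc volatility as their difference. These are the standard textbook forms of the indices; the generalized versions implemented in IndElec may differ in detail. All names and figures here are hypothetical.

```python
from typing import Dict


def total_volatility(first: Dict[str, float], second: Dict[str, float]) -> float:
    """Half the sum of absolute share changes over all parties.
    A party absent from one election is given a share of zero."""
    parties = set(first) | set(second)
    return 0.5 * sum(abs(second.get(p, 0.0) - first.get(p, 0.0)) for p in parties)


def bloc_volatility(first: Dict[str, float], second: Dict[str, float],
                    bloc_of: Dict[str, str]) -> float:
    """Half the sum, over blocs, of the absolute net change of each bloc."""
    net_change: Dict[str, float] = {}
    for p in set(first) | set(second):
        change = second.get(p, 0.0) - first.get(p, 0.0)
        net_change[bloc_of[p]] = net_change.get(bloc_of[p], 0.0) + change
    return 0.5 * sum(abs(v) for v in net_change.values())


# Hypothetical vote shares (in percent) for two consecutive elections.
first_election = {"AAA": 40.0, "BBB": 35.0, "CCC": 25.0}
second_election = {"AAA": 45.0, "BBB": 35.0, "CCC": 20.0}
blocs = {"AAA": "L", "BBB": "R", "CCC": "L"}  # L = left, R = right

tv = total_volatility(first_election, second_election)
bv = bloc_volatility(first_election, second_election, blocs)
intra = tv - bv  # intra-bloc volatility
```

In this example all the transfer takes place inside the left bloc (from CCC to AAA), so the total volatility is 5 points, the bloc volatility is 0 and the intra-bloc volatility is 5.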
A review of the state of the art of the PVM can be found in Katz et al. (1997), where the broad range of its current applications and some suggestions about this dimension are pointed out. Such suggestions have motivated the generalization of the bloc volatility indices in IndElec by allowing any number of blocs. To this end, the user specifies both the number of blocs and the character standing for each of them in the configuration text file simbolos.afi, which must appear in the IndElec setup path. The syntax for simbolos.afi is as follows: the first line contains the number of blocs; each considered bloc is specified in a line by a character.
For example, if the user wants to consider those blocs in Bartolini and Mair (2007) (the left-wing parties and the right-wing parties), the file simbolos.afi will be as follows:

Party experienced increments for volatility indices
Though the PVM formula is very simple, some computational problems can arise when it is calculated from real data. The main problems appear when the sets of competing parties in the two considered elections are not identical, as is theoretically assumed in the PVM formula (Pedersen 1979). This happens when, for example, there are changes of party acronyms, mergers of parties into coalitions or splits of former parties into new parties between the two consecutive elections. The increments in votes or seats experienced by some parties between the two elections are not so evident in such situations. Therefore, the PVM formula, which depends on such party experienced increments, cannot be computed properly from some real data.
These computational problems are addressed in Bartolini and Mair (2007, Appendix 1, pp. 311-312) and Ocaña (2007). Bartolini and Mair propose a set of guidelines describing how to proceed in a wide range of such problematic situations, where the sets of competing parties are not identical. Once these guidelines are applied to the data, the equality of the sets of competing parties in both elections can be assumed in the transformed electoral data. IndElec provides a way of implementing Bartolini and Mair's rules by means of an input text file with extension *.ivo. Moreover, the alternative approximate volatility formulae developed by Ocaña (2007) are also implemented in IndElec.
Though the *.ivo input file depends on the considered electoral data framework, it always includes the implementation of the party experienced increments with a common syntax for both data frameworks. This syntax establishes that any increment for a party (party, coalition, etc.) is included in a *.ivo file according to these guidelines:
The first line, for such an increment, contains the character of the bloc where the increment must be included for the bloc volatility indices.
From the second line, each of the acronyms of parties in the second election, for such an increment, appears in a line of the input file.
The end of the above list of acronyms for the second election is established by a blank line (the first blank line).
After the first blank line, each of the acronyms of parties in the first election, for the considered increment, appears in a line of the input file.
The end of the above list of acronyms for the first election is given by a blank line (the second blank line).
For example, assume that the $i$-th increment experienced by parties between two elections is given by $p_2$ parties (acronyms) from the second election and $p_1$ parties (acronyms) from the first election. Further, suppose that such an increment is in the $b$-th bloc for the bloc volatility indices. The IndElec user can implement such an increment by composing the following contents in the corresponding *.ivo input file:

    Character of the bloc $b$
    Party$^{(2)}_{i_1}$        (the acronym of the $i_1$-th party in the 2nd election)
    . . .                      (more parties of this increment in the 2nd election)
    Party$^{(2)}_{i_{p_2}}$    (the acronym of the $i_{p_2}$-th party in the 2nd election)
                               (the first blank line)
    Party$^{(1)}_{j_1}$        (the acronym of the $j_1$-th party in the 1st election)
    . . .                      (more parties of this increment in the 1st election)
    Party$^{(1)}_{j_{p_1}}$    (the acronym of the $j_{p_1}$-th party in the 1st election)
                               (the second blank line)

As a result, IndElec incorporates the increment given by
$$\sum_{k=1}^{p_2} F\big(\mathrm{Party}^{(2)}_{i_k}\big) - \sum_{k=1}^{p_1} F\big(\mathrm{Party}^{(1)}_{j_k}\big)$$
into the volatility formulae, where $F(\mathrm{Party}^{(t)}_\ell)$ stands for the vote or seat share of the party named by the acronym $\mathrm{Party}^{(t)}_\ell$, which is the $\ell$-th party in the $t$-th election ($t = 1, 2$). Notice that either $p_1$ or $p_2$ may be zero and that two blank lines must always appear for each increment. Moreover, when the electoral data present several levels of aggregation, it is not necessary to specify the acronyms of a party across the districts. IndElec learns such information from the corresponding siglaso.txt files for the two considered elections, to which Dimensi must have been previously applied.

Usage of Volatili
Roughly speaking, the usage of the module Volatili differs only slightly between the two electoral data frameworks managed by IndElec, aggregated data and data with aggregation levels. Larger differences arise only in the internal programming of Volatili for the two data frameworks. As a user guide to Volatili, this section focuses on its usage and therefore gives a unified description of Volatili, as compared to Dimensi, for both cases.
The aforementioned similarity in the usage of Volatili is due to its computational design. Volatili requires the previous execution of Dimensi for each of the two studied elections. The binary files generated by Dimensi for each election, which depend on the electoral data framework, provide the information needed to start the Volatili calculations. In fact, in order to calculate volatility indices, the only specific information for Volatili is given by the party experienced increments between the two studied elections, which are needed to apply the volatility formulae (Bartolini and Mair 2007).

The input file
As established in the previous section, the party increments between both elections are implemented in a *.ivo text file. The considered electoral data framework introduces only a small difference in the information saved in such a file, which is located in its first four lines. The syntax of this header of the *.ivo file is given by the following guidelines:
The first line contains the path of the folder where the data from the second election are stored. In a similar way, the third line contains the path for the first election.
The fifth line is always blank.
The party increments are thus arranged from the sixth line.
The differences due to the data framework are found in the second and fourth lines of the header of the *.ivo file. If the electoral data are aggregated, then the name of the data file (without its extension *.dat) appears below the corresponding election working path. If the electoral data present several aggregation levels, then a number labeling each election (the year, for instance) appears below each election path.
This way the content of any *.ivo text file is sketched out as follows:
the path for the 2nd election
the data file name or a label, for the 2nd election
the path for the 1st election
the data file name or a label, for the 1st election
(a blank line)
Now, the descriptions of party increments. . .
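As an illustration of this layout, the following R sketch writes only the five-line header of an *.ivo file. The helper write_ivo_header, the folder paths and the year labels are hypothetical and are not part of the IndElec distribution; the party increments, whose syntax is not covered by this sketch, would still have to be appended from the sixth line.

```r
# Hypothetical helper (not part of IndElec): writes the five header lines of
# an *.ivo file. Party increments must be appended afterwards.
write_ivo_header <- function(file, path2, id2, path1, id1) {
  header <- c(path2,              # line 1: folder of the 2nd election
              as.character(id2),  # line 2: data file name or label (2nd election)
              path1,              # line 3: folder of the 1st election
              as.character(id1),  # line 4: data file name or label (1st election)
              "")                 # line 5: always blank
  writeLines(header, con = file)
  invisible(header)
}

# Made-up paths and year labels, as used for data with aggregation levels
ivo <- file.path(tempdir(), "example.ivo")
write_ivo_header(ivo,
                 path2 = "C:/elections/2004", id2 = 2004,
                 path1 = "C:/elections/2000", id1 = 2000)
readLines(ivo)
```

For aggregated data, id2 and id1 would instead hold the data file names without the *.dat extension.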

Output files
The output of Volatili follows the same scheme as the output files of Dimensi. First, the scores of the volatility indices are saved in report style in text and HTML formats; the CSV format is also available for disaggregated data. Such report files take the name of the *.ivo file with the extensions *.res, *.htm and *.csv, respectively. Second, for aggregated data, IndElec also automatically generates an R source file which defines some R objects containing the volatility scores computed by IndElec (R Development Core Team 2011). Third, the computed volatility indices are saved in data matrix style in text and CSV formats with a common name, matVolat.

Using R and IndElec
This section illustrates the integration of the statistical software R (R Development Core Team 2011) and IndElec through some data examples, according to both electoral data frameworks considered in this paper. As explained throughout this paper, the output of IndElec can be loaded by any statistical software. However, IndElec provides some additional facilities to R users, which are illustrated in this section.
Roughly speaking, this section demonstrates how a data frame can be (1) exported from R, (2) analyzed in IndElec and then (3) how the results so obtained can be imported into R. The emphasis is on steps (1) and (3), because step (2) has already been treated in previous sections. Further, step (3) is accomplished in the same way for the two modules of IndElec, Dimensi and Volatili. However, as the electoral data files considered by Volatili must have been previously processed by Dimensi, step (1) can only be explained for Dimensi. Notice that the only relevant information needed by Volatili is provided by the input file of the party increments (see Section 5.1). Therefore, the examples in this section only illustrate the interaction between the module Dimensi of IndElec and R.

Aggregated electoral data
Consider the aggregated electoral data from the 2004 Spanish parliamentary election, which are presented in Section 2. Assume that these data are stored in an R data frame named rdaf.
The R data frame rdaf can be easily obtained from the data file da04.dat described in Section 2. To this end, the R sentence
R> rdaf <- read.table(file = "da04.dat", header = TRUE, skip = 1)
reads da04.dat and skips its first line (a short data description). The three inherited variables (columns) of rdaf are named Party, Vote and Seat, respectively.
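To make the role of skip = 1 concrete, the following sketch builds a small stand-in for da04.dat in a temporary folder and reads it with the same call. The file name example.dat, the parties A, B and C and all figures are made up, and the whitespace-separated column layout is an assumption about the data file format.

```r
# Stand-in for da04.dat with made-up parties and figures (not the 2004 results).
# Assumed layout: description line, header line, whitespace-separated records.
datFile <- file.path(tempdir(), "example.dat")
writeLines(c("Made-up election, for illustration only",
             "Party Vote Seat",
             "A 5000 6",
             "B 3000 3",
             "C 1000 1"), datFile)

# Same call as in the text: the description line is skipped and the
# second line provides the variable names
rdaf <- read.table(file = datFile, header = TRUE, skip = 1)
str(rdaf)
```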
In this framework, the input file for IndElec is derived from a given R data frame by the R function Adata2IndElec, which is provided in the IndElec distribution. This function creates the data file in the form that the module Dimensi needs from a standard R data frame containing aggregated electoral data. Its header is given by
Adata2IndElec(dataName = "", acronyms, votes, seats, sTitle = "")
where dataName is a character string containing the input data filename for IndElec to be created, acronyms is a string vector of the acronyms of the parties competing in the considered election, votes and seats are numeric vectors containing the votes and seats, respectively, of the considered parties, and sTitle is a character string containing a short description of the electoral data. For instance, taking into account the proposed example, the R sentence
R> Adata2IndElec("da04n", rdaf$Party, rdaf$Vote, rdaf$Seat,
+    "2004 Spanish parliamentary election")
will generate the input data file da04n.dat for IndElec in the R working directory from the data frame rdaf.
The contents of da04n.dat and da04.dat are identical. Therefore, the same conclusion holds for their corresponding output files.
The outputs of IndElec from da04n.dat are arranged in several files with different formats, as explained in Section 2. Indeed, IndElec derives the following output files: da04n.out, da04n.htm and da04n.R, which all contain the same results in different formats. In particular, da04n.R is an R source file which defines the results obtained by IndElec from rdaf as an R list.

Disaggregated electoral data
This section illustrates how IndElec can be applied to data with aggregation levels stored as a data frame in R. In this data framework, two different situations can be considered.
1. The available electoral data, which are saved in the given R data frame, are only those of the lowest level of aggregation. This means that an aggregation process is needed to obtain the electoral data for the rest of the levels of aggregation.
2. The electoral data for all the levels of aggregation are available in the given R data frame.
To obtain the input data file for IndElec, two R functions, namely DA2IndElec and DO2IndElec, are provided in the IndElec distribution for these two situations, respectively. The situation of the available data determines only which of the provided R functions must be used. In fact, the rest of the steps needed to accomplish the task proposed in this section are the same for both situations. Because of this, we will only present an example of the first situation, which is artificial in order to make extensive use of R.
Consider a state which consists of two regions, named Region1 and Region2. Region2 is divided into Subregion1 and Subregion2 (yellow color in Figure 3). Further, these subregions are split into five districts, which are labeled by an index: three districts (1, 2 and 5) in Subregion1 and two districts (3 and 4) in Subregion2. To visualize the aggregation structure so defined in this artificial state, its map is depicted in Figure 3.
Theoretically, four levels of aggregation are assumed in such a state, namely $F_1 \equiv$ State, $F_2 \equiv$ Region, $F_3 \equiv$ Subregion and $F_4 \equiv$ District, where the notation in Section 3 is considered. However, to illustrate the sophistication of IndElec, we will consider that the election under study was held only in Region2, and then that its corresponding electoral data are drawn from each of its districts. Taking into account the general framework in Section 4, in our problem the highest level of aggregation is thus Region, with $\ell_1 = 2$, for the regional unit Region2, labeled by $i_1 = 2$, and three aggregation levels are considered in the electoral data, $H = 3$, namely Region, Subregion and District. Nevertheless, we will assume that the available electoral data are only those of the District level (the first situation above).
Step 0. For the election held in Region2, its electoral data drawn from each of its districts are going to be generated in R. Consider 3 parties competing across the five districts of Region2, where the parties are labeled by Pa, for any a = 1, 2, 3, for instance. Assume that the numbers of votes and seats follow Poisson distributions with parameters 20 and 3, respectively, for instance. As the information (description) on the considered districts must be joined to each data record, the variables of the electoral data frame can be generated in R as follows:
R> noDat <- 3 * 5
R> Parties <- gl(3, 1, labels = c("P1", "P2", "P3"), length = noDat)
R> LDistri <- gl(5, 3, length = noDat)
R> LSubreg <- gl(2, 2 * 3, length = noDat)
R> LRegion <- rep(2, noDat)
R> v <- rpois(n = noDat, lambda = 20)
R> s <- rpois(n = noDat, lambda = 3)
R> rdatD <- data.frame(LRegion, LSubreg, LDistri, Parties, v, s)
This way the R data frame rdatD contains the electoral data obtained by each of the parties in each of the five districts. This data generation process is generalized in the R source file exampAg.R, which is available in the IndElec distribution.
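Before exporting the data, it may help to check in R that the generated factors encode the intended structure (districts 1, 2 and 5 in Subregion1; districts 3 and 4 in Subregion2) and to see what aggregating the district data to the subregion level amounts to. The sketch below uses only base R; the seed and the object names nesting and rdatS are ours, and the aggregation shown is plain summation, which we assume, but cannot confirm, mirrors the aggregation process carried out for DA2IndElec.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
noDat   <- 3 * 5
Parties <- gl(3, 1, labels = c("P1", "P2", "P3"), length = noDat)
LDistri <- gl(5, 3, length = noDat)
LSubreg <- gl(2, 2 * 3, length = noDat)
LRegion <- rep(2, noDat)
v <- rpois(n = noDat, lambda = 20)
s <- rpois(n = noDat, lambda = 3)
rdatD <- data.frame(LRegion, LSubreg, LDistri, Parties, v, s)

# Which districts fall in which subregion (rows: subregion, columns: district)
nesting <- table(rdatD$LSubreg, rdatD$LDistri) > 0
nesting

# Votes and seats summed from the District level to the Subregion level
rdatS <- aggregate(cbind(v, s) ~ LRegion + LSubreg + Parties,
                   data = rdatD, FUN = sum)
rdatS
```

The recycling of gl(2, 2 * 3, length = noDat) is what places district 5 back in Subregion1.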
Step 1. Once the disaggregated electoral data are available in a data frame, namely rdatD, the input data file for IndElec can be created by using one of the ad hoc R functions, DA2IndElec or DO2IndElec. These functions are used in the same way. In fact, the only difference between both R functions lies in the electoral data contained in the R data frame. On the one hand, when the available electoral data are only those of the lowest level of aggregation, and thus a data aggregation process must be carried out to obtain the electoral data for the rest of the levels, DA2IndElec must be executed. On the other hand, when all electoral data are available, DO2IndElec must be executed instead. Therefore, in our example we must consider
DA2IndElec(dataName = "", l1, agLevels, parties, votes, seats, sTitle = "")
where dataName is a character string containing the name of the input file to be created, l1 is the index of the highest level of aggregation, agLevels is a list containing the aggregation levels sorted in decreasing order of aggregation (each aggregation level is coded by an R factor), parties is a vector of strings (R factor) of party acronyms, votes and seats are numeric vectors containing the votes and seats of the competing parties, respectively, and sTitle is a short description of the electoral data, which will be included in the first line of the input file to be generated. For instance, taking into account the considered example, the R sentence
R> DA2IndElec("Regi2D", 2, list(rdatD$LRegion, rdatD$LSubreg, rdatD$LDistri),
+    rdatD$Parties, rdatD$v, rdatD$s, sTitle = "Region2 election")
will generate the data file Regi2D.dab for IndElec in the R working directory from the disaggregated electoral data contained in the data frame rdatD.
Step 3. IndElec must be prepared to understand the aggregation structure of the electoral data to be analyzed, as established in Section 3. To this end, we must modify the file indelec.cfg to define the potential levels of aggregation to be considered in the artificial state (Figure 3). This implies that the four levels of aggregation are defined through the configuration files named Region.txt, Subreg.txt and District.txt. In fact, the region level is defined in Region.txt. The subregion level is defined in Subreg.txt as follows
3
1 2 Subregion1
2 2 Subregion2
3 1 (it is not necessary) Region1
Finally, the district level is defined in District.txt as follows
. . .
5 1 2 District5
6 3 1 (it is not necessary) Region1
These configuration files make possible the analysis of the data file Regi2D.dab by using IndElec (see Section 4).
Step 4. After the Dimensi step-by-step run (Figure 2), many output files are generated by IndElec (see Section 4.3) for different purposes. Among such output files, the matriREG.* and matrizDD.* files allow the IndElec results to be imported into R. For instance, taking into account the CSV matrix files, the results are stored in two R data frames as follows:
R> rOutDD <- read.csv("matrizDD.csv")
R> rOutRL <- read.csv("matriREG.csv")
where rOutRL contains the regionalism and party linkage indices and rOutDD contains the rest of the indices computed by IndElec.

Conclusions
This paper presents a software devoted to helping the political researcher in the analysis of party systems and electoral systems. IndElec can calculate more than fifty political indices measuring characteristics of electoral systems and party systems from electoral data. Moreover, IndElec is flexible, because it can be adapted with the user's aid to several situations arising when real electoral data are considered in a study (the presence of aggregation levels in the data and parties with several acronyms across districts, among others). Nevertheless, its development is always in progress (Ocaña 2000, 2005; Ocaña and Oñate 2006; Ocaña 2007).
Finally, an important point is the integrability of the IndElec output with other software (word processors, spreadsheets, statistical software, etc.), which is achieved through the considered output file styles. On the one hand, the readability of the IndElec output is provided through the report-style files. Apart from providing an inspection tool to the user, they also allow texts to be composed in the input files. On the other hand, the vast amount of scores obtained from disaggregated electoral data can be analyzed by any statistical software through the matrix-style output files. Moreover, R users can easily manage the IndElec output as described in Section 6.

A. Indices computed by IndElec: Formulae and references
This appendix gathers information on the main political indices computed by IndElec, namely their formulae and some of their references.
First of all, consider a given election under study. Let $I = \{1, \ldots, N\}$ be the set of parties competing in such an election and $\{(V_i, S_i) : i \in I\}$ be the joint distribution of votes and seats, expressed in percentages, which summarizes the electoral results obtained by these parties. Further, to ease notation, $p_i$ will denote the proportion of votes or seats, indistinctly, for the $i$-th political party, for any $i \in I$.
Though only necessary for some indices, we will consider that the competing parties are ordered according to their obtained votes as follows: $V_i \geq V_{i+1}$, $\forall\, 1 \leq i < N$. However, because of the distortion yielded by the electoral system, this order is not necessarily maintained for their seats.
In some elections, the number of parties, $N$, can be extremely high and, thus, many parties have no seat. Under such circumstances, some indices may behave inappropriately (Lijphart 1994), since a high proportion of parties with no seat is involved in the calculation of such indices. Because of this, some alternatives to $N$ in the formulae of some political indices, which are proposed in the literature, have been implemented in IndElec. Lijphart (1994) suggests $N_L = \max\{J \in I : V_J > 0.5\}$. Oñate and Ocaña (1999) consider two alternatives: $N_S = \min\{J \in I : \sum_{i=1}^{J} S_i = 100\}$ and $N_+ = \max\{J \in I : S_i > 0, \ \forall\, i = 1, \ldots, J\}$.
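The three alternatives to the number of parties can be sketched in R as follows. The vote and seat percentages are made up, the object names N_L, N_S and N_plus are ours, and the code is only an illustration of the definitions above, not the implementation used by IndElec. The parties are assumed to be already sorted by decreasing vote percentage.

```r
# Made-up vote and seat percentages, sorted by decreasing votes
V <- c(40, 30, 15, 10, 4.6, 0.4)
S <- c(45, 35, 0, 20, 0, 0)
stopifnot(!is.unsorted(rev(V)))  # checks that V is non-increasing

# Lijphart (1994): last party with more than 0.5% of the votes
N_L <- max(which(V > 0.5))

# First J for which the cumulated seat percentage reaches 100
# (a small tolerance guards against rounding in real data)
N_S <- min(which(abs(cumsum(S) - 100) < 1e-8))

# Largest J such that every party 1, ..., J holds some seat
N_plus <- sum(cumprod(S > 0))

c(N = length(V), N_L = N_L, N_S = N_S, N_plus = N_plus)
```

In this example the third party has no seat although the fourth one does, so the three alternatives take different values.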
Bias in the electoral system
Cox and Shugart (1996) describe the filtering associated with an electoral system through a simple linear regression model given by $S_i = a + bV_i + \varepsilon_i$, $\forall\, i \in I$. Indeed, these authors proposed to quantify the bias of an electoral system by the least squares (LS) estimate of the slope $b$. However, taking into account that the considered linear model is an excessive simplification in practice (the classic hypotheses on the model residuals may not be satisfied), some alternatives for the estimation of $b$ were proposed in Oñate and Ocaña (1999). Some of these alternatives, which are based on EDA (Tukey 1977), propose to quantify the bias of the electoral system through the Tukey estimate (T) of $b$.
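As a rough illustration, both kinds of slope estimate can be obtained in base R. The data below are made up, and we assume that the Tukey estimate corresponds to Tukey's resistant line as implemented by the function line in the stats package; the source does not confirm that IndElec uses this exact procedure.

```r
# Made-up vote and seat percentages for five parties (illustration only)
V <- c(42, 31, 14, 8, 5)
S <- c(50, 34, 10, 4, 2)

# Least squares (LS) estimate of the slope b in S_i = a + b V_i + e_i
b_LS <- unname(coef(lm(S ~ V))["V"])

# Resistant alternative (assumption): slope of Tukey's resistant line
b_T <- unname(coef(line(V, S))[2])

c(LS = b_LS, Tukey = b_T)
```

A slope above one indicates, in this made-up example, that larger parties obtain a seat share that grows faster than their vote share.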
Volatility (Arian and Weiss 1969; Pedersen 1979; Bartolini and Mair 2007; Ocaña 2007)
The volatility indices depend upon two elections held on two dates, which are denoted by the superscripts $[t]$ and $[t+1]$.
Total volatility: $TV = \frac{1}{2} \sum_{i \in I} |V_i^{[t+1]} - V_i^{[t]}|$.
Bloc volatility: $BV = TV$ (for the blocs of parties).
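A minimal R sketch of both indices follows, assuming that total volatility is the index of Pedersen (1979), i.e., half the sum of the absolute differences between the vote percentages of the parties in the two elections, and that bloc volatility applies the same formula after adding up the votes within each bloc. The function names, the vote percentages and the bloc labels are made up.

```r
# Total volatility (assumed to be the Pedersen index): half the sum of the
# absolute increments of the vote percentages between elections t and t+1
total_volatility <- function(v_t, v_t1) sum(abs(v_t1 - v_t)) / 2

# Bloc volatility: the same formula applied to the votes added up by bloc
bloc_volatility <- function(v_t, v_t1, bloc) {
  total_volatility(tapply(v_t, bloc, sum), tapply(v_t1, bloc, sum))
}

# Made-up vote percentages of four parties in two consecutive elections
v_t  <- c(40, 35, 15, 10)
v_t1 <- c(35, 38, 20, 7)
bloc <- c("L", "R", "L", "R")  # hypothetical bloc of each party

TV <- total_volatility(v_t, v_t1)
BV <- bloc_volatility(v_t, v_t1, bloc)
c(TV = TV, BV = BV)
```

Here all the vote transfers take place within the blocs, so the bloc volatility vanishes although the total volatility does not.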

Data with levels of aggregation
IndElec can also work when the electoral data present several levels of aggregation, as described in Section 3. In this case, not only can all the aforementioned indices be recalculated, but new indices can also be considered.
On the one hand, let $\ell$ be an aggregation level and $R_J$ be one of its geographic units, i.e., $R_J \in F_\ell$. The electoral data obtained by the parties $I$ in the geographic unit $R_J$ can be summarized by its corresponding joint distribution of votes and seats, expressed in percentages, denoted by $\{(V_i(R_J), S_i(R_J)) : i \in I\}$. It follows that the aforementioned indices can be computed for each unit $R_J$.
On the other hand, let $U < L$ be any two aggregation levels and $R^U_{j_U}$ be a geographic unit of the aggregation level $U$, i.e., $R^U_{j_U} \in F_U$. In this framework, we can consider those geographic units $R^L_j$ of the aggregation level $L$ which are contained in $R^U_{j_U}$, i.e., $R^L_j \subset R^U_{j_U}$. Further, let $I_R$ be the set of regionalist parties (they are labeled as PANE in IndElec), where $I_R \subset I$. The following indices are only computed by IndElec when the electoral data present several aggregation levels.
Regionalism (Oñate and Ocaña 1999)
Regionalist vote at the upper level item: $VR(R^U_{j_U}) = \sum_{i \in I_R} V_i(R^U_{j_U})$.
Regionalist vote at a lower level item: $VR(R^L_j)$.
Differentiated regionalist vote: $VR(R^L_j) - VR(R^U_{j_U})$.
Differentiated regional vote: $\frac{1}{2} \sum_{i \in I} |V_i(R^U_{j_U}) - V_i(R^L_j)|$.
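The four regionalism indices can be sketched in R for one upper level unit and one of the lower level units contained in it. All vote percentages and the choice of regionalist parties are made up, the object names are ours, and we assume that the regionalist vote at the lower level is defined in the same way as at the upper level.

```r
# Made-up vote percentages of four parties in an upper level unit (V_U)
# and in one lower level unit contained in it (V_L)
V_U <- c(P1 = 45, P2 = 40, P3 = 10, P4 = 5)
V_L <- c(P1 = 30, P2 = 35, P3 = 30, P4 = 5)
regionalist <- c(FALSE, FALSE, TRUE, TRUE)  # hypothetical set I_R

VR_U <- sum(V_U[regionalist])  # regionalist vote at the upper level
VR_L <- sum(V_L[regionalist])  # regionalist vote at the lower level (assumed analogous)

diff_regionalist_vote <- VR_L - VR_U
diff_regional_vote    <- sum(abs(V_U - V_L)) / 2

c(VR_U = VR_U, VR_L = VR_L,
  DRistV = diff_regionalist_vote, DRV = diff_regional_vote)
```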