# Forecasting voting behaviour using machine learning – Poland in transition


Annals of Operations Research 97 (2000) 31–41


G. Szkatuła, J. Hołubiec and D. Wagner

Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warsaw, Poland

E-mail: {szkatulg; holubiec; wagner}@ibspan.waw.pl

The aim of the paper is to apply an inductive method of learning from examples (which gives explicit decision rules of the if-then type) to forecast the voting behaviour of individual members of the Polish Parliament. The results obtained are both interesting and promising.

Keywords: learning from examples, voting behaviour, forecasting

1. Introduction

Under the new political conditions in Poland (parliamentary democracy), the results of the last election to the Polish Parliament¹ cannot be considered an adequate representation of the current political situation. One of the main features of this situation is political instability, due to economic, social and personal reasons. As a result, orientations with a clearly defined left-wing or right-wing character have not been established. The lack of political stability is demonstrated by the existence of more than 250 parties, see [1]. Under these conditions, one is not in a position to define clearly the partition of the Polish Parliament with respect to, for instance, the left–right criterion or the pro-/anti-government attitude. In such a situation it is very difficult to forecast the voting behaviour of particular members of the parliament and of the parliament as a whole, especially in cases of complex and subtle matters.

In order to overcome these difficulties, the authors suggest applying the machine learning from examples approach for forecasting purposes.

Machine learning from examples is a process of inferring a classification rule for a given class from descriptions of some individual elements of this class, called positive examples, together with some elements not belonging to this class, called negative examples, which are used to narrow the solution space. The aim of this approach is to classify any new (and different) examples, described by the same number of attributes with some prescribed attribute-values.

In the literature several methods of learning from examples are described. They can give explicit decision rules of the if-then type, a decision tree, the structure and weights of neural networks, etc. They can be used in computer expert systems which can provide information and support in decision making.

¹ September 1993–September 1997.

J.C. Baltzer AG, Science Publishers

In [2] a new method of this kind was proposed which gives explicit decision rules of the if-then type. The results of test computations indicate that it is quite effective.

In the authors' opinion this method is well suited to forecasting the voting behaviour of individual members of the Polish Parliament, i.e., the types of their votes ("for", "against", "abstain"). The results of applying the method to this problem are discussed in the paper.

2. Brief description of the methods

The problem can be formulated as follows. There is a set of examples (e.g., the voting profiles of the members of the Polish Parliament). Each example is described by the same number of attributes $a_j$, $j = 1, \ldots, K$, and the attributes have some discrete attribute-values.

For instance: the number of parliament member, the parliamentary group membership (e.g., BBWR, KPN, MN, PSL, SLD), the number of votes, the types of votes (in the Polish Parliament three types of votes can be used: "for", "against", "abstain"), the matter of the voting (e.g., "pro-government", "anti-government", "left", "right"), etc.

Each example belongs to some prescribed class. Classes can be as follows:

class 1: the type of vote is "for",

class 2: the type of vote is "against",

class 3: the type of vote is "abstain".

One has to find classification rules for the classes, using all examples as patterns and subject to the following very general requirements:

- partial completeness, i.e., the classification rule must correctly describe all examples of the class,
- partial consistency, i.e., the classification rule must not describe all examples not belonging to the class,
- convergence, i.e., the optimal rule must be derived in a finite number of steps.

A classification rule for the given class can be used to forecast, e.g., the voting behaviour of individual members of parliament on the basis of new and different votes, described by the same number of attributes with some prescribed attribute-values.

An example $e$ is represented by $K$ attributes, $a_1, \ldots, a_K$, and is described by the conjunction of $K$ conditions $s_j = [a_j = v_j]$, $j = 1, \ldots, K$. The notation $[a_j = v_j]$ is used to denote that the $j$th attribute assumes the value $v_j \in d_j$, where $d_j$ is the finite set of values of $a_j$, i.e.,

$$e = [a_1 = v_1] \wedge \cdots \wedge [a_K = v_K] = \bigwedge_{j \in I = \{1, \ldots, K\}} s_j. \tag{2.1}$$

For instance, the $r$th example can be described by the following conjunction of conditions ($\wedge$ corresponds to the connective "and"):

$$e^r = [\text{the number of parliament member} = 38] \wedge [\text{the parliamentary group membership} = \text{PSL}] \wedge [\text{the number of votes} = 125] \wedge [\text{the matter of voting} = \text{left}].$$

A conjunction of $l$ conditions, $l \leq K$, i.e.,

$$\bigwedge_{j \in I \subseteq \{1, \ldots, K\}} s_j = C_I, \tag{2.2}$$

is called a complex. For instance, the complex $C_{I_1}$, $I_1 = \{1, 4\}$,

$$C_{I_1} = [\text{the number of parliament member} = 38] \wedge [\text{the matter of voting} = \text{left}],$$

is the conjunction of $l = 2$ conditions.

A complex $C_I$, $I \subseteq \{1, \ldots, K\}$, covers the example (2.1) if all the conditions of the complex $C_I$ are satisfied by the values of the attributes of this example.

For instance, the complex $C_{I_2}$, $I_2 = \{2, 3\}$,

$$C_{I_2} = [\text{the parliamentary group membership} = \text{PSL}] \wedge [\text{the number of votes} = 125],$$

covers the example $e^r$ but does not cover the following example:

$$e^{r_1} = [\text{the number of parliament member} = 15] \wedge [\text{the parliamentary group membership} = \text{SLD}] \wedge [\text{the number of votes} = 125].$$
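The covering relation can be illustrated with a short Python sketch. The dictionary representation and the attribute keys (`member`, `group`, `votes`, `matter`) are assumptions made here for illustration; they are not taken from the authors' program.

```python
# An example maps every attribute to a value; a complex fixes only some
# attributes (a conjunction of conditions [a_j = v_j]).
# The attribute keys are illustrative labels, not identifiers from the paper.

def covers(complex_, example):
    """Return True if every condition of the complex is satisfied by the example."""
    return all(example.get(attr) == value for attr, value in complex_.items())

# The two examples and the complex C_I2 from the text.
e_r = {"member": 38, "group": "PSL", "votes": 125, "matter": "left"}
e_r1 = {"member": 15, "group": "SLD", "votes": 125}
c_i2 = {"group": "PSL", "votes": 125}

print(covers(c_i2, e_r))   # True
print(covers(c_i2, e_r1))  # False
```

An empty complex imposes no conditions and therefore covers every example.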

Suppose that we have a set of examples belonging to the class $k$, $e^m \in S_P$, $m = 1, \ldots, P$, and a set of examples not belonging to the class $k$, $e^n \in S_N$, $n = 1, \ldots, N$, such that $S_P, S_N \neq \emptyset$ and $S_P \cap S_N = \emptyset$, by assumption.

First, we introduce the function, for each value of attribute $a_j$, $j = 1, \ldots, K$,

$$g_j(v) = \frac{1}{P} \sum_{m=1}^{P} \delta(e^m, v) - \frac{1}{N} \sum_{n=1}^{N} \delta(e^n, v), \quad \text{for each } v \in d_j, \tag{2.3}$$

where

$$\delta(e^m, v) = \begin{cases} 1 & \text{for } v_j^m = v, \\ 0 & \text{otherwise,} \end{cases} \qquad e^m \in S_P,$$

and analogously for $\delta(e^n, v)$.

A complex $C_I$, $I \subseteq \{1, \ldots, K\}$, is equivalent to a vector $x = [x_1, \ldots, x_K]^T$ such that $x_j = 1$ if the condition $[a_j = v_j]$ occurs in this complex, and $x_j = 0$ otherwise. A weighted length of the complex $C_I$ is written as

$$d(C_I) = \sum_{j=1}^{K} \bigl(1 - g_j(v_j)\bigr)\, x_j. \tag{2.4}$$

The paper is concerned with the problem of determining classification rules that are disjunctions of elementary rules consisting of complexes of the type (2.2), i.e.,

$$\text{IF } \{R_k\} \text{ THEN } \{\text{an example belongs to the class } k\}, \tag{2.5}$$

where

$$R_k = C_{I_1} \vee \cdots \vee C_{I_L}, \qquad I_1, \ldots, I_L \subseteq \{1, \ldots, K\},$$

the symbol $\vee$ corresponds to "or", $k$ denotes the number of the class and $L$ denotes the number of complexes.
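As a rough illustration of (2.3) and (2.4), the sketch below computes $g_j(v)$ as the share of positive examples with $a_j = v$ minus the corresponding share of negative examples, and the weighted length of a complex as the sum of $1 - g_j(v_j)$ over its conditions. The tuple and dictionary representation and the toy data are assumptions introduced here, not material from the paper.

```python
def g(j, v, positives, negatives):
    """g_j(v): share of positive examples with a_j = v minus the share of negatives."""
    pos_share = sum(1 for e in positives if e[j] == v) / len(positives)
    neg_share = sum(1 for e in negatives if e[j] == v) / len(negatives)
    return pos_share - neg_share

def weighted_length(complex_, positives, negatives):
    """d(C_I): sum of (1 - g_j(v_j)) over the conditions present in the complex.

    The complex is a dict {j: v_j}; attributes absent from it have x_j = 0.
    """
    return sum(1 - g(j, v, positives, negatives) for j, v in complex_.items())

# Toy data, invented for illustration: attribute 0 = group, attribute 1 = matter.
positives = [("SLD", "left"), ("SLD", "left"), ("PSL", "left"), ("SLD", "right")]
negatives = [("UW", "left"), ("UW", "right")]

print(g(0, "SLD", positives, negatives))                              # 0.75
print(weighted_length({0: "SLD"}, positives, negatives))              # 0.25
print(weighted_length({0: "SLD", 1: "left"}, positives, negatives))   # 1.0
```

A condition whose value is frequent among positive examples and rare among negative ones has $g_j$ close to 1 and so adds little to the weighted length.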

For instance, the rule for the class 1 can be described by the following disjunction of complexes:

IF {([the number of parliament member = 3] $\wedge$ [the matter of the voting = pro-government]) $\vee$ ([the matter of the voting = left] $\wedge$ [the parliamentary group membership = SLD])}

THEN {an example belongs to the class 1, i.e., the type of vote is "for"}.

The weighted length of the classification rule composed of $L$ complexes can be introduced, see [2], i.e.,

$$d_{R_k}(C_{I_1} \vee \cdots \vee C_{I_L}) = \max_{i=1,\ldots,L} d(C_{I_i}).$$

Using it, the problem of inductive learning from examples can be formulated as follows: one has to find the optimal classification rule $R_k$ for the class $k$, i.e., the one that attains

$$\min_{I_1,\ldots,I_L} d_{R_k}(C_{I_1} \vee \cdots \vee C_{I_L}). \tag{2.6}$$

In other words, the optimal classification rule has the minimal weighted length. Therefore, shorter rules containing relevant attributes are preferred.
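A minimal sketch of how a rule of the form (2.5) is evaluated, and of its weighted length as the maximum over its complexes, might look as follows. The dictionary representation, the attribute keys and the numeric lengths are illustrative assumptions.

```python
def covers(complex_, example):
    """True if all conditions of the complex hold in the example."""
    return all(example.get(attr) == value for attr, value in complex_.items())

def rule_applies(rule, example):
    """IF-part of (2.5): the rule holds when at least one of its complexes covers the example."""
    return any(covers(c, example) for c in rule)

def rule_weighted_length(complex_lengths):
    """Weighted length of a rule: the largest weighted length among its complexes."""
    return max(complex_lengths)

# The rule for class 1 from the text, written as a list of complexes.
rule_class_1 = [
    {"member": 3, "matter": "pro-government"},
    {"matter": "left", "group": "SLD"},
]

vote = {"member": 7, "group": "SLD", "matter": "left"}
print(rule_applies(rule_class_1, vote))     # True (second complex covers it)
print(rule_weighted_length([0.4, 1.1]))     # 1.1 (complex lengths invented)
```

Because the rule length is the maximum rather than the sum, adding a short complex never makes a rule longer than its longest complex.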


The problem (2.6) is very difficult to solve; therefore an auxiliary problem can be formulated, see [2], i.e., $R_k$ is searched for such that

$$\min_{I_1} d(C_{I_1}), \; \ldots, \; \min_{I_L} d(C_{I_L}), \tag{2.7}$$

where the minimisation is consecutively performed over the sets of indices $I_1, \ldots, I_L$ of (2.5).

The problem (2.7) can be formulated as the set covering problem and can be solved by a modification of the well-known method called a greedy algorithm, see [2]. A heuristic algorithm to be used to determine a classification rule for the $k$th class is as follows:

Step 1. Let $S_P$ be the set of examples belonging to the class $k$ and $S_N$ the set of examples not belonging to the class $k$, with $S_P \cap S_N = \emptyset$ by assumption. Set the initial set of examples $S := S_P$ and $R_k := \emptyset$, i.e., the initial set of complexes is assumed empty.

Step 2. Determine the starting point, e.g., the most typical example in the set of examples $S$, see [2]. The concept of a typical example is crucial for the efficiency of the algorithm.

Step 3. Solve the set covering problem and find a complex $C$ with the minimal weighted length (2.4), starting from the example determined in step 2.

Step 4. Include the complex $C$ found in step 3 into the classification rule, $R_k := R_k \cup C$.

Step 5. Discard from the set of examples $S$ all the examples covered by the complex $C$.

Step 6. If the remaining set $S$ is empty, then stop; the rule $R_k$ is the one searched for. Otherwise, proceed to step 2.

In the case of constructing classification rules for three classes, the computations were repeated 3 times (we have to construct the optimal rule for class 1, class 2 and class 3). The algorithm described above is relatively simple and efficient.
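The six steps can be sketched in Python as follows. This is only an illustration under stated assumptions, not the algorithm of [2]: the "most typical example" is taken here to be the one with the largest sum of $g_j(v_j)$, and the set covering step is replaced by a simple greedy pass that adds the seed's conditions in order of increasing weight $1 - g_j(v_j)$ until no negative example is covered. Both choices, and the toy data, are assumptions made here.

```python
def g(j, v, positives, negatives):
    """g_j(v) of (2.3): share of positives with a_j = v minus share of negatives."""
    pos_share = sum(1 for e in positives if e[j] == v) / len(positives)
    neg_share = sum(1 for e in negatives if e[j] == v) / len(negatives)
    return pos_share - neg_share

def covers(complex_, example):
    return all(example[j] == v for j, v in complex_.items())

def find_complex(seed, positives, negatives):
    """Step 3 (simplified, an assumption): add conditions of the seed, cheapest
    first, keeping only those that exclude at least one remaining negative."""
    weights = {j: 1 - g(j, v, positives, negatives) for j, v in enumerate(seed)}
    complex_ = {}
    remaining = list(negatives)
    for j in sorted(weights, key=weights.get):
        if not remaining:
            break
        still_covered = [e for e in remaining if e[j] == seed[j]]
        if len(still_covered) < len(remaining):
            complex_[j] = seed[j]
            remaining = still_covered
    return complex_

def learn_rule(positives, negatives):
    rule = []                    # step 1: R_k is empty
    s = list(positives)          # step 1: S := S_P
    while s:                     # step 6: repeat until S is empty
        # step 2: seed = "most typical" example (assumed: largest sum of g_j)
        seed = max(s, key=lambda e: sum(g(j, v, positives, negatives)
                                        for j, v in enumerate(e)))
        c = find_complex(seed, positives, negatives)   # step 3
        rule.append(c)                                 # step 4
        s = [e for e in s if not covers(c, e)]         # step 5
    return rule

# Toy data, invented for illustration: attribute 0 = group, attribute 1 = matter.
positives = [("SLD", "left"), ("SLD", "left"), ("PSL", "left"),
             ("SLD", "pro-government")]
negatives = [("UW", "left"), ("PSL", "pro-government"), ("UW", "pro-government")]

rule = learn_rule(positives, negatives)
print(rule)   # [{0: 'SLD'}, {1: 'left', 0: 'PSL'}]
```

Since every complex is built from the conditions of its seed, it always covers at least that seed, so the set $S$ shrinks in each pass and the loop terminates. Disjointness of the positive and negative sets guarantees that the greedy pass can always exclude all negative examples.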

3. Forecasting of the voting behaviour

In order to prepare the basis for such forecasts, special issues of parliamentary debates were chosen. Forecasts concerning the ideological left–right voting criterion are based on 60 votes on the following issues:

- coalition members holding political functions,
- centralization and decentralization of the state,
- referendum on the constitution,
- referendum as a form of direct democracy,
- judicial control over decisions of state organs,
- privatization and re-privatization,
- the state's support of the poorest,
- expansion of the social insurance system,
- level and type of taxation,
- centralization of banks,
- openness of public life,
- history, including the Polish People's Republic,
- abortion, the concordat, divorce.

To construct forecasts related to the attitude to the government, 36 votes on the following issues were taken into account:

- government-proposed resolutions,
- reports on the implementation of resolutions,
- government's reports,
- ministers' reports,
- motions for a vote of confidence for the cabinet,
- motions for a vote of confidence for ministers,
- other issues tabled by the government.

Analysing positions taken by parliamentary groups as well as individual members, one can draw the following conclusions:

- the programmes of parties are collections of watchwords, not necessarily mutually consistent and without a clear addressee;
- the parties do not pay much attention to their programmes;
- the parties sometimes assume positions that are very far removed from their declared programmes;
- members representing various parliamentary groups very often vote out of line with their colleagues; in most cases such a situation does not result in any sanctions;
- cases of exclusion from parliamentary groups have usually been caused by financial and personal conflicts or charges of an ethical nature rather than political disagreements;
- members who do not want to vote on a given issue prefer to be absent;
- a characteristic feature of the present Parliament is a relatively low attendance rate.

The conclusion that agreement on a programme is not an important criterion for parliamentary groups is confirmed by the large number of affiliation changes (one member has changed his affiliation as many as four times). The conclusions presented indicate that the problem of forecasting the voting behaviour of members of the parliament (in the authors' opinion of great importance today) is not simple and cannot be solved with conventional methods, because the basic assumptions that make it possible to use them are not satisfied (e.g., the stability of the mechanism generating the phenomenon under consideration).

Moreover, one has to determine the criteria used to assess the position taken by a given parliamentary group or an individual member. It follows from the analysis presented in [1] that only two criteria can be distinguished without contradiction, namely the ideological one (the left–right criterion) and the one reflecting the attitude towards government actions and initiatives.

In general, in order to assess the position taken by a given parliamentary group or individual member, one has to take into account the votes in which they participated as well as cases of their absence. However, due to the lack of information on the reasons for their absence (illness, work on a committee, deliberate avoidance of voting on a specific issue, foreign trip, etc.), the authors were not in a position to take this aspect into consideration.

The votes considered in the analysis of the left–right criterion are divided into two groups. The first one comprises votes where a vote "for" identifies a member of the Polish Parliament with the left and a vote "against" with the right. The second one comprises votes where a vote "against" identifies a member with the left and a vote "for" with the right.

In order to estimate the voting behaviour of a member of the parliament in a quantitative way, specific values are assigned to the votes "for", "against" and "abstain".

It is assumed that in the case of the first group the value $-1$ is assigned to the vote "for" and $+1$ to the vote "against". The votes of the second group have the values reversed. The value of "abstain" for votes of the first group is assumed to be equal to $+0.5$ and for the second group $-0.5$. A detailed discussion concerning the choice of vote values is given in [1].

The total score of votes to be assessed with respect to the ideological criterion given by the $k$th member is computed as follows:

$$w_k^I = \frac{-L_k + P_k + q\,(N_{1k} - N_{2k})}{G_k}, \tag{3.1}$$

where

- $w_k^I$ is the vote score corresponding to the left–right criterion for the $k$th member,
- $L_k$ is the total number of the first group votes in which the $k$th member voted "for" and the number of the second group votes in which the member voted "against",
- $P_k$ is the total number of the first group votes in which the $k$th member voted "against" and the number of the second group votes in which the member voted "for",
- $N_{1k}$ is the total number of the first group votes in which the $k$th member abstained,
- $N_{2k}$ is the total number of the second group votes in which the $k$th member abstained,
- $q$ is the absolute value of an abstaining vote; $q = 0.5$,
- $G_k$ is the total number of the first and the second group votes given by the $k$th member.

Figure 1. Distribution of votes of members of the PSL parliamentary group determined with respect to the left–right criterion.

It should be noted that

$$L_k + P_k + N_{1k} + N_{2k} = G_k. \tag{3.2}$$

Hence, the value of $w_k^I$ for a member who is the most left-wing is equal to $-1$, and the most right-wing member has a score equal to $+1$. Therefore, the higher the total score of a member's votes, the more right-wing the member is; the lower the score, the more left-wing he/she is.
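The score can be computed directly from a member's list of votes. The sketch below assumes the sign convention described in the text (a vote identified with the left counts $-1$, a vote identified with the right counts $+1$, an abstention counts $+q$ in the first group and $-q$ in the second); the function name and the input format are hypothetical.

```python
Q = 0.5  # absolute value of an abstaining vote (q in the text)

def vote_score(votes):
    """Left-right score of one member.

    `votes` is a list of (group, vote) pairs with group in {1, 2} and
    vote in {"for", "against", "abstain"} (format assumed for illustration).
    """
    left = sum(1 for gv in votes if gv in {(1, "for"), (2, "against")})     # L_k
    right = sum(1 for gv in votes if gv in {(1, "against"), (2, "for")})    # P_k
    n1 = sum(1 for gv in votes if gv == (1, "abstain"))                     # N_1k
    n2 = sum(1 for gv in votes if gv == (2, "abstain"))                     # N_2k
    total = left + right + n1 + n2                                          # G_k, see (3.2)
    return (-left + right + Q * (n1 - n2)) / total

print(vote_score([(1, "for"), (2, "against")]))                        # -1.0
print(vote_score([(1, "against"), (2, "for")]))                        # 1.0
print(vote_score([(1, "for"), (1, "for"), (2, "for"), (1, "abstain")]))  # -0.125
```

Absences are not part of the input, which matches the authors' decision to leave them out of the analysis.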

The distributions of votes given by the members of the PSL, SLD and UW parliamentary groups are shown in figures 1, 2 and 3, i.e., the total number of members for whom $w_k^I \in [-1, -0.9], \ldots, [0.9, 1]$. For example, the vote score $w_k^I$ for 25 members of PSL belongs to $[-0.4, -0.3]$.

It follows from these figures that only the SLD parliamentary group assumes a uniform position. Among the PSL parliamentary group members there are not many with the highest value of $w_k^I$. For most members of this group the value of $w_k^I$ is negative. Hence, one can conclude that a large group of the PSL members identifies with the left or abstains. The position taken by the UW parliamentary group members is not uniform either, but to a much smaller extent.

Moreover, one can assume that there are votes, especially those on specific issues, in which members of the parliament do not express their true preferences (misunderstanding, the will to stop an initiative of other groups or members, etc.).

Figure 2. Distribution of votes of members of the SLD parliamentary group determined with respect to the left–right criterion.

Figure 3. Distribution of votes of members of the UW parliamentary group determined with respect to the left–right criterion.

Hence, treating voting results as examples, the authors attempted to apply the method described in section 2 to forecast the voting behaviour of individual members of the parliament.

4. Experiments and analysis of results obtained

The approach presented in section 2 has been used to construct a computer program which creates descriptions of the three classes (i.e., classification rules in the form of disjunctions of complexes) and is able to forecast the voting behaviour of individual members of the Polish Parliament in new and different votes.

The problem under consideration is to find classification rules for class 1 (the type of vote is "for"), class 2 (the type of vote is "against") and class 3 (the type of vote is "abstain") (section 2), using all learning examples (voting results) as patterns. One can evaluate the accuracy of the classification rules by the percentage of testing examples correctly classified.

The choice of voting results to be used to assess, with respect to the ideological criterion, the positions taken by members of the parliament was made by a political sciences expert, see [1]. To test the method considered, six voting results on the lustration of officials were selected from the whole set of examples. Three parliamentary groups were taken into account: PSL, SLD, UW. From each of these groups 16 members were chosen, only from among those who participated in all six votes under consideration. Hence, the total number of examples investigated was $16 \times 6 \times 3 = 288$.

Figure 4. Accuracy of classification rules for the PSL parliamentary group.

To construct a classification rule for each of the three classes, a learning set was generated at random. The learning set included 70% of the examples from the whole set.

The classification rules for class 1, class 2 and class 3 can be used to forecast the voting behaviour of individual members of parliament on the basis of the remaining 30% of the examples (the testing set), described by the same number of attributes with some prescribed attribute-values.

The ratio of correct classification decisions to the total number of decisions made was taken as the measure of classification accuracy, in percent. Hence, the accuracy of classification rules for the class $k$, $k = 1, 2, 3$, was taken as the percentage of testing examples correctly classified to the class $k$.

Computations were repeated 200 times, i.e., the procedure applied was as follows: step 1, random choice of the learning set; step 2, construction of the optimal classification rule for class 1, class 2 and class 3; step 3, testing of the classification accuracy for the class $k$. The results were then averaged.
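The evaluation protocol (random 70/30 split, repeated 200 times, accuracies averaged) can be sketched as below. The learner used here is a trivial stand-in that predicts the most frequent label of each group; it is not the rule-learning method of section 2. The data are synthetic, with neutral group names and invented labels, and only the dimensions (3 groups, 16 members, 6 votes) follow the text. The sketch reports overall accuracy rather than accuracy per class.

```python
import random

def repeated_holdout(examples, train, predict, runs=200, learning_share=0.7, seed=0):
    """Average test accuracy (in percent) over `runs` random learning/testing splits."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(runs):
        shuffled = list(examples)
        rng.shuffle(shuffled)                          # step 1: random learning set
        cut = round(learning_share * len(shuffled))
        learning, testing = shuffled[:cut], shuffled[cut:]
        model = train(learning)                        # step 2: build the classifier
        correct = sum(1 for x, label in testing if predict(model, x) == label)
        accuracies.append(100.0 * correct / len(testing))   # step 3: test accuracy
    return sum(accuracies) / len(accuracies)

# Stand-in learner (NOT the method of section 2): most frequent label per group.
def train_majority(learning):
    counts = {}
    for (group, _member), label in learning:
        per_group = counts.setdefault(group, {})
        per_group[label] = per_group.get(label, 0) + 1
    return {group: max(c, key=c.get) for group, c in counts.items()}

def predict_majority(model, x):
    group, _member = x
    return model.get(group)

# Synthetic data: 3 groups x 16 members x 6 votes = 288 examples, invented labels.
labels = {"A": "abstain", "B": "against", "C": "for"}
examples = [((group, member), labels[group])
            for group in labels for member in range(16) for _vote in range(6)]

print(len(examples))                                                  # 288
print(repeated_holdout(examples, train_majority, predict_majority))   # 100.0
```

Fixing the seed makes the 200 splits reproducible; on this noise-free synthetic data every split yields 100% accuracy, which real voting data would not.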

The results are summarised in figures 4, 5 and 6.

The results obtained give evidence that the attitudes of the SLD parliamentary group members are very consistent. For members of the SLD parliamentary group the accuracy of the classification rules for voting on lustration was greater than 95%.

In the case of the PSL and UW parliamentary groups, those who abstained were more consistent than members voting "for" or "against".

Figure 5. Accuracy of classification rules for the SLD parliamentary group.

Figure 6. Accuracy of classification rules for the UW parliamentary group.

5. Concluding remarks

The results obtained indicate that the method under consideration can be used to forecast the voting behaviour of individual members of parliament. With a slight modification, the method can be applied to forecast the position taken by a parliamentary group considered as a whole.

The method can also be used to determine the most representative member of a given parliamentary group. Further investigations aimed at broadening the scope of application of the method are in progress.

References

[1] J. Holubiec, A. Malkiewicz, M. Mazurkiewicz, J. Mercik and D. Wagner, Identification of ideological dimensions under fuzziness. The case of Poland, in: Consensus under Fuzziness (Kluwer Academic, Boston, 1997).

[2] G. Szkatula, Machine learning from examples under errors in data, Ph.D. thesis, SRI PAS, Warsaw, Poland (1996).
