Skip to Main Content

Designing Stable Elections

Steven Heilman

Communicated by Notices Associate Editor Emilie Purvine

Article cover

1. Introduction

Suppose votes have been cast in an election between two candidates, and then an adversary can select a fixed number of votes to change. Which voting method best preserves the outcome of the election? A majority vote does, among all voting methods where both candidates have an equal chance of winning the election.

Now, suppose votes have been cast in an election between two candidates, and then each vote is randomly changed with a small probability, independently of the other votes. It is desirable to keep the outcome of the election the same, regardless of the changes to the votes. It is well known that the US Electoral College system is more than four times more likely to have a changed outcome due to vote corruption, when compared to a majority vote. In fact, Mossel, O’Donnell, and Oleszkiewicz proved in 2005 that the majority voting method is most stable to this random vote corruption, among voting methods where each person has a small influence on the election. Below, we survey the design of elections that are resilient to attempted interference by third parties. We discuss some recent progress on the analogous result for elections between more than two candidates. In this case, plurality should be most stable to corruption in votes. We briefly discuss ranked choice voting methods (where a vote is a ranked list of candidates).

1.1. Condorcet’s paradox

Applications of mathematics to the analysis of elections perhaps began with Marquis de Condorcet in the 1700s. Condorcet’s famous paradox demonstrates that an election method that uses ranked preferences of voters might not have a sensible winner. Consider the ranking of three candidates , and between three voters , and as shown in Table 1.

Table 1.

Three voters (one for each row of the table) provide rankings of three candidates , and . For example, voter most prefers candidate .

Voter Rank 1 Rank 2 Rank 3
1
2
3

If we ignore candidate , then voters and prefer over , while voter prefers over (see Table 2). So, using a majority rule for these preferences, the voters prefer over .

Table 2.

If candidate is ignored in Table 1, the remaining rankings of candidates and indicate that is preferred by the majority of voters.

Voter Rank 1 Rank 2
1
2
3

If we ignore candidate in Table 1, then voters and prefer over , while voter prefers over . So, using a majority rule again, the voters prefer over .

Finally, if we ignore candidate in Table 1, then voters and prefer over , while voter prefers over . So, using a majority rule, the voters prefer over .

In conclusion, the voters prefer over , they prefer over , and they prefer over . So, no one has won the election! This observation is known as Condorcet’s paradox. The simplest way to use rankings of candidates might lead to no one winning the election.

In fact, if we compare pairs of candidates using something other than a majority rule, then some analogue of Condorcet’s paradox must still occur, unless we ignore all voters except for one (a dictatorship). This statement can be formalized as Arrow’s Impossibility Theorem.

1.2. Voting power

Game theorists such as Shapley, Shubik, and Banzhaf in the 1950s and 1960s further developed the mathematical and economical analysis of voting methods. As an illustrative example, we consider the 1965 restructuring of the UN Security Council.

Voting method 1 (Pre-1965 UN Security Council).

In pre-1965 rules, the UN Security Council had five permanent members and six nonpermanent members. A resolution passes in the Security Council only if:

all five permanent members want xit to pass, and

at least two nonpermanent members want it to pass.

In particular, a single permanent member can effectively veto a resolution by voting “no” on that resolution. This voting method was called unfair for the nonpermanent members, so it was restructured in 1965. After the restructuring, the council had the following form (still in use today).

Voting method 2 (Post-1965 UN Security Council).

The UN Security Council has five permanent members and now ten nonpermanent members. A resolution passes in the council only if:

all five permanent members want it to pass, and

at least four nonpermanent members want it to pass.

A rather vague question is then the following.

Question 1.1.

Are the post-1965 rules more equitable for nonpermanent members of the UN Security Council than pre-1965 rules?

There are various ways to answer this question. One answer, provided by Banzhaf, is to consider the power of a voter in each voting method, i.e., the relative ability of a voter to cause a resolution to pass by changing their vote. Suppose we label the post-1965 UN Security Council members by the integers through , where the numbers represent the five permanent members of the council, and the numbers represent nonpermanent members. Then, for any integer between and , let be the number of combinations of votes of members of the council (other than voter ), such that when voter changes their vote from “no” to “yes,” the resolution changes from not passing to passing. The Banzhaf power index of a voter is defined to be the following ratio:

For example, in the post-1965 rules, what would it take for a nonpermanent member to cause the resolution to pass? First, all permanent members would have to vote “yes.” Then, exactly three other nonpermanent members out of nine would vote yes. So, the number of combinations of votes other members would make is: the number of ways to select three members from a set of nine, i.e., . So, .

In the post-1965 rules, what would it take for a permanent member to cause the resolution to pass? First, all other permanent members would have to vote “yes.” Then, at least four nonpermanent members out of 10 would vote yes. So, the number of combinations of votes other members would make is: the number of ways to select at least four members from a set of . This number is . So, .

Similar considerations apply for pre-1965 rules. We summarize the Banzhaf power indices in Table 3.

Table 3.

Banzhaf power indices for UN Security Council voting methods.

Voting Method Banzhaf Power Index for Non-Permanent Member Banzhaf Power Index for Permanent Member
Pre-1965 Rules
Post-1965 Rules

In summary, the post-1965 rules give more power to nonpermanent members, and less power to permanent members of the UN Security Council. So, according to Banzhaf’s definition of voting power, the answer to Question 1.1 is: yes.

1.3. Voting methods as functions

Suppose we run an election between two candidates with voters, where is a large integer. For convenience, we denote the two candidates as and rather than and . If person votes for candidate , we define , and if person votes for candidate , we define . We can then make a list of votes as

A voting method is a function whose input is the votes and whose output is the winner of election. That is, denotes candidate winning the election when the votes are , and denotes candidate winning the election when the votes are .

Some examples of voting methods appear below.

Example 1.2.

The majority function is the function

If there are more votes than votes, then . And if there are more votes than votes, then . That is, agrees with our usual notion of majority: the candidate receiving the most votes wins the election. (To guarantee that someone wins the election, we could just assume that is odd, so that never takes the value .)

Figure 1.

An iterated majority function with “states.”

Example 1.3.

A dictator function is a function of the form

That is, the vote of the first person is the winner of the election. In this way, agrees with our usual notion of dictator: all votes are ignored, except for one. More generally, if , a dictator is a function of the form

Example 1.4.

If are fixed real numbers, a weighted majority function on voters is a function of the form

If is large for some , this corresponds to assigning more “weight” (i.e., more voting power, or more “say”) to the th voter. And if is small, this corresponds to assigning less “weight” (i.e., less voting power, or less “say”) to the th voter.

Example 1.5.

A two-layer iterated majority function is a function of the form

where are each weighted majority functions on voters, and is a weighted majority function on voters.

A two-layer iterated majority function is similar to an Electoral College system with states. The US Electoral College system then corresponds to .

Remark 1.6.

In learning theory, the iterated majority function is sometimes called a two-layer neural network with boolean activation function. The lines and nodes in Figure 1 are then interpreted as axons and neurons, respectively.

In the ensuing discussion, it is more convenient to replace the Banzhaf power index of a voter with the (almost identical) notion of influence of a voter.

Definition 1.7 (Influences).

Let be a voting method. Let be an integer. Define the influence of the th voter on , denoted , as

That is, is the probability that the th voter can change the outcome of the election, when other voters are equally likely to vote for either candidate.

Example 1.8.

The numbers used to define the Banzhaf power indices are just the influences, multiplied by . For example, in the post-1965 UN Security Council voting method with voters,

Put another way, the Banzhaf power indices are the influences, multiplied by a number causing them to sum to .

Table 4.

Influences for UN Security Council voting methods.

Voting Method Influence for Non-Permanent Member Influence for Permanent Member
Pre-1965 Rules
Post-1965 Rules

As above, we observe that a nonpermanent member has a higher probability of affecting the outcome of a resolution in post-1965 rules.

Example 1.9.

When is a dictator function of the form , then the first voter can always change the outcome of the election, and the other voters cannot, so

When is a majority function , then an application of Stirling’s formula implies that for all , , i.e.,

To see this, note that if is even, recall that Stirling’s formula implies that

Therefore, for all .

Perhaps it is a compelling reason to vote in a majority election with one hundred million voters when your probability of changing the election’s outcome is around in ten thousand.

2. Adversarial Corruption in Voting

2.1. Two candidates

Suppose people cast their votes in an election between two candidates. Then, suppose an adversary found a way to change several of the votes. By changing some votes, the adversary attempts to change the outcome of the election. Suppose also that the voting method is balanced in the following sense.

Definition 2.1 (Balanced voting method).

Let be a voting method. We say is balanced if each of the two candidates has an equal chance of winning the election. That is, the number of combinations of votes where candidate wins is equal to the number of combinations of votes where candidate wins.

For example, dictator functions and the majority function are balanced.

Question 2.2.

What balanced voting method is most resilient to adversarial changes to votes?

That is, if votes can be changed by the adversary, what is the least number of combinations of votes (of all voters) such that the adversary can change the election’s outcome?

In a dictatorship, e.g., , changing the first vote changes the outcome of the election, so this voting method is not at all resilient to adversarial changes. Similarly, a voting method that is only a function of a small set of voters (sometimes called a junta) will probably not be resilient to adversarial changes to votes. It turns out that the majority function is the balanced voting method most resilient to adversarial changes; we thank Daniel Kane for telling us the following argument.

Proposition 2.3 (Adversarial optimality of majority).

Let be an odd positive integer, and let be an integer satisfying . After the votes have been cast, suppose an adversary can change votes in an election between two candidates with voters. Then among all balanced voting methods, the majority function has the least number of combinations of votes where the election’s outcome can be altered by the adversary.

Before beginning the proof, we introduce some notation. For any , denote the “norm” of by . (This quantity is not a norm since for any .) Let . For any integer , we denote the distance neighborhood of by

Then is the set of possible votes that can be obtained by changing at most votes from a given . For any , let be a distance neighborhood of one “half” of the hypercube:

The key geometric fact used to prove Proposition 2.3 is the following theorem.

Theorem 2.4 (Harper’s inequality/hypercube vertex isoperimetric inequality).

Let . Let . Assume that

Then

Proof of Proposition 2.3.

We induct on . Let be the majority function, and let be another balanced voting method. Let be the set of votes where candidate wins the election, when is the voting method used to run the election. Note that . Since and are balanced, . So, Harper’s inequality, Theorem 2.4, implies that

The same inequality holds also when . Taken together, we conclude that the number of combinations of votes for which the outcome of the election can be altered with one adversarial vote change is smallest for the majority vote (since corresponds to the right side of 4). The case therefore follows by 4.

We now proceed with the inductive step. By the inductive hypothesis, if or if , we have

That is, . We need to prove the case . This again follows by Harper’s inequality, Theorem 2.4, since

Therefore, when or ,

That is, the number of votes for which the outcome of the election can be altered with adversarial vote changes is smallest for the voting method (since the majority vote corresponds to the right side of 5). The inductive step and the proof are complete.

For some related observations for ranked choice voting, see, e.g., MPR13, Lemma 3.3.

Proposition 2.3 can easily be extended to unbalanced voting methods. To state such a result, let be a real number and define a majority function with threshold to be a function of the form

Also, we say that two voting methods have the same balance if the number of combinations of votes resulting in candidate winning are the same for each voting method, i.e.,

For example, the majority function with threshold and the majority function with threshold do not have the same balance.

Proposition 2.5 (Adversarial optimality of majority, unbalanced case).

Let be an odd positive integer, and let be an integer satisfying . After the votes have been cast, suppose an adversary can change votes in an election between two candidates with voters. Let be a majority function with threshold , where is an even integer. Let be another voting method such that and have the same balance. Then the number of combinations of votes where the election’s outcome can be altered by the adversary is lesser for than for .

2.2. More than two candidates

It would be desirable to have an analogue of Proposition 2.3 for voting methods with more than two candidates. Such a result might require a version of Harper’s inequality, Theorem 2.4, for multiple sets. It is unclear if such an inequality can be proven.

2.3. Additional comments

Proposition 2.3 can be strengthened slightly, so that a voting method that is “close” to being as resilient as majority must itself be “close” to majority. Instead of applying Theorem 2.4, one instead uses a stronger version, such as KL20.

The majority function is known to be optimal in various senses. For example, the majority function maximizes the number of votes that agree with the outcome of the election O’D14, Theorem 2.33. Apparently Rousseau argued this was an ideal choice for a voting method in 1762 in “Du contrat social.” Theorem 3.6 below, the Majority is Stablest Theorem, also characterizes the majority function as being the most stable to random corruption in votes, among a reasonable class of voting methods.

For more background on social choice theory, see, e.g., O’D14, Chapter 2, O’D, Kal18, Section 3.

3. Independent Random Corruption of Votes

In Proposition 2.3, we showed that the majority function is the most stable voting method to adversarial corruption. The majority function is also most stable when votes are corrupted randomly, as shown below.

Theorem 3.1 (Majority is stablest, informal version, MOO10, Theorem 4.4).

Suppose we run an election with a large number of voters and two candidates. In this election, voters are modelled to have the following random behavior:

(i)

Voters cast their votes randomly, independently, with equal probability of voting for either candidate.

(ii)

Each voter has a small influence on the outcome of the election. (That is, all influences from Definition 1.7 are small.)

Then the majority function is the balanced voting method that best preserves the outcome of the election, when votes have been corrupted independently.

The definition of “best” here is intentionally vague. We will define “best” to mean: maximizing noise stability, as defined below in Definition 3.4. Also, the probability of each vote being changed (corrupted) should be less than in Theorem 3.1. Otherwise the majority preferences of the electorate are reversed upon corruption.

Some remarks concerning the sensibility of the assumptions of Theorem 3.1 now follow:

Suppose we completely ignore the votes, and just declare that the first candidate wins. This voting method is as stable to vote corruption as one can imagine, since any amount of corruption in votes cannot change the outcome of the election. Since this voting method is certainly undemocratic and uninteresting, some assumption in Theorem 3.1 must eliminate it. And indeed, this voting method is not balanced, so Theorem 3.1 ignores it. This voting method corresponds to a constant function .

As we saw in Example 1.9, a dictator function has one large influence, and the remaining voters have no influence on the election’s outcome. Consequently, the dictator voting method is quite stable to independently random changes to votes, since changing the votes of the nondictators has no effect on the election’s outcome. So, as in the previous example, the dictator function is rather stable to vote corruption for a rather uninteresting reason. We therefore eliminate dictator functions from consideration by imposing the democratic assumption (ii) that each voter has a small influence on the outcome of the election.

3.1. Two candidates

In this section, we will formalize the assumptions in Theorem 3.1, resulting in the formal version of the Majority is Stablest Theorem 3.6.

Assumption 1 (Voter assumptions).

There are voters denoted . There are two candidates denoted and .

For any , the th voter casts a single random vote taking the value or . (In particular, we are not dealing with ranked voting methods.)

The votes are independent, identically distributed (i.i.d.) random variables. That is, voters are modelled as independent decision makers with the same probabilities of voting for either candidate.

The voting method is a function . If the votes are , then the winner of the election is .

Remark 3.2.

One could argue that the voter assumptions are not realistic, since, e.g., a small group of friends will most likely share similar views, read similar news items, etc., so that their decisions are not truly independent. On the other hand, modeling a large number of voters to be independent individuals is somewhat plausible, from an aggregate perspective.

Assumption 2 (Voter corruption assumptions).

Let . Suppose we are given the votes of voters choosing between two candidates. The corrupted votes are defined as follows:

The corrupted votes are independent, identically distributed (i.i.d.) random variables.

For each , if , then with probability , is a uniformly random element of , and with probability , .

Remark 3.3.

When , for all , i.e., no vote corruption has occurred. When is close to , is almost the same as , i.e., and are strongly correlated, and a small amount of vote corruption has occurred.

When , the votes and are independent of each other, i.e., the corrupted votes have been so scrambled that they have no dependence (or correlation) with the original votes .

Notation. We denote the original (random) votes cast in the election as , and we denote the corrupted votes as .

Recall that the voting method takes the value or , according to which candidate ( or ) won the election. So, if the winner of the election is the same as the winner of the election with corrupted votes , then

On the other hand, if the winner of the election is different than the winner of the election with corrupted votes , then

So, the voting method that has the largest average value of

will be the most stable on average to random vote corruption. This observation motivates the following definition.

Definition 3.4 (Noise stability).

Let be a voting method. The noise stability of with correlation parameter is

Here denotes expected value, or average value, with respect to the random variables defined in Assumptions 1 and 2.

Remark 3.5.

The probability that the election’s outcome stays the same after vote corruption has occurred is .

3.1.1 Unbiased case. Theorem 3.1 can be restated as: the majority function maximizes noise stability, among a reasonable class of voting methods.

In the theorem below, we denote the majority function as , so that