How I Predicted Every Election Since 1916

Harys Dalvi

November 2024


In just 91 lines of C++ code, I perfectly predicted every United States presidential election since 1916. That's 28 straight elections, counting the most recent one in 2024.

The crazy part is that I didn't rely on complicated polling trends, voter sentiment, or policy analysis to make these predictions. I just used basic principles of probability.

The US presidential election results in 1916. Public domain. By AndyHogan14, Wikimedia.

Alright, I'll admit I cheated a little. But arguably not much more than the political pundits that claim to have predicted every election since, say, 1980.

Every election cycle, you see stories on the news of someone who has correctly predicted every election in however many years. Most recently, I saw stories about Allan Lichtman, who correctly predicted most of the 10 elections from 1984 through 2020. His system for predicting elections is called the “13 Keys”, and consists of 13 true/false questions to predict the winner of the election.[1]

But then Allan Lichtman got the 2024 election wrong. Does this cast doubt upon election pundits who claim to have sophisticated election prediction systems?

In this article, I'm going to show you how you, too, can predict every single election for over 100 years. You can do this with a very simple deterministic system that requires even less information than the 13 Keys, and yet is more accurate, as long as you're willing to be fooled by statistics!

I'll also explain why, mathematically, the seemingly insightful achievement of predicting election results actually means very little.

How is This Possible?

How is it possible to predict every single election since 1916? Surely it couldn't happen by random chance. After all, there have been 28 elections since 1916, inclusive. Each one has had at least 2 major candidates, and a few of them actually had 3. So the probability of guessing all 28 elections correctly purely by chance is less than \(1/2^{28}\), which is about 1 in 300 million.
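(To be precise, \(2^{28} = 268{,}435{,}456\), so the odds are really about 1 in 270 million, but 300 million makes the comparison that follows cleaner.)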

But wait: 300 million? That's a familiar number: the population of the United States is a little over 300 million. So if everyone in the United States guessed the election results at random for every election since 1916, we would expect about one of them to guess every single outcome correctly. This person would be praised by the country as a masterful political pundit, and everyone would eagerly await their prediction for the next election… even though it would have only a 1/2 chance of being correct!

Of course, very few, if any, Americans today have been alive to predict elections since 1916. And few Americans make public election predictions for the world to judge. So let's try an argument with slightly more realistic numbers.

Let's say there are 2000 Americans who are potentially in the business of predicting elections, and who are of age to have seen all elections from 1980 through 2024 (that's 12 elections). Each one has some kind of a system based on polling data, economic trends, and other factors, giving them a 60% chance of being correct in any given election. Then the chance that any given predictor gets all 12 elections correct is \((0.6)^{12}\), or about 0.2%. The chance that at least one predictor of the 2000 gets all 12 elections right is \(1-(1-0.002)^{2000}\), or 98.7%!
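Written out, the two calculations are $$ (0.6)^{12} \approx 0.00218, \qquad 1-(1-0.00218)^{2000} \approx 0.987. $$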

If we allow more than 2000 predictors, or more than 60% accuracy, this probability gets even higher.

This assumes that all predictors are independent, which certainly isn't the case: all of them use much of the same underlying data. But even granting that the predictors aren't truly independent, 98.7% odds from just 2000 predictors is a strikingly high number. It indicates that it's quite possible for someone to be right on almost all elections despite not having a very accurate underlying model.

How Likely is This?

Let's look deeper into this model of everyone in America guessing randomly.

In just one election, you have a 1/2 chance of being right. As you increase the number of elections, your chance of being right on all of them drops off exponentially. But your chance of being right on many or even most of them remains fairly high for quite a while after.

In general, the probability of getting \(k\) out of \(n\) elections right by guessing randomly is given by the binomial distribution: $$ P(k) = \binom{n}{k} \left(\frac{1}{2}\right)^n $$ The \(\binom{n}{k}\) factor keeps our probability high for medium numbers of \(k\).
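For example, with \(n=12\) elections, the chance of guessing exactly \(k=10\) right is \(\binom{12}{10}\left(\frac{1}{2}\right)^{12} = \frac{66}{4096} \approx 1.6\%\).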

Graph: probability of predicting elections by guessing randomly.

From the graph, once we reach 12 elections (1980–2024), we still have about a 1.6% chance of getting only 2 elections wrong from guessing randomly. So this outcome is very much possible, especially when lots of people try to guess the elections, and when they do just a little better than guessing randomly. But eventually, with a large number of elections, you are almost guaranteed to get more than 5 wrong.

We can expand this random guessing model to 300 million Americans, using $$ P_{300\text{M}}(k) = 1 - (1 - P(k))^{300,000,000} $$ That is, \(P_{300\text{M}}(k)\) is the chance that at least one person out of 300 million guesses exactly \(k\) elections correctly.
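For example, take all \(k = n = 28\) elections since 1916. Then \(P(28) = 2^{-28}\), and $$ P_{300\text{M}}(28) = 1-(1-2^{-28})^{300{,}000{,}000} \approx 1 - e^{-1.12} \approx 67\%, $$ so it's better than a coin flip that at least one of the 300 million random guessers ends up with a perfect record.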

Graph: probability of at least one person out of 300 million predicting elections.

All the way up to about 30 elections, there's a decent chance that someone will guess every single one correctly, just randomly! And we have decent numbers all the way into the 50s, where someone might get just 5 elections wrong. Of course, go much beyond that, and even the best guesser of the 300 million is almost certain to get more than 5 elections wrong.

Predicting Every Election Since 1916

Now it's time to predict every single election since 1916. The algorithm is very simple:

  1. List the major candidates in each election, sorted alphabetically by last name.
  2. For each election, flip a fair coin (or roll a fair die if there are more than two candidates) and declare the corresponding candidate the winner.

And that's basically it.

But there's one key thing about the coin. It can't be a physical coin. You have to use a pseudorandom number generator in a computer.

In fact, you have to use C/C++ random number generation. Seed it with the random seed 824050438, and then start picking random values. (Use modulus on each random value to pick the actual candidate.) If you go and check this algorithm with this seed, you'll be amazed to find that you can predict every single election from 1916 to 2024 correctly!
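To make this concrete, here's a minimal sketch of what the prediction step might look like in C++. This isn't my 91-line program (that's on GitHub), and the candidate lists are abbreviated here: the draws only line up with the winners if you use the full 1916–2024 list, in the same alphabetical order, with the same C library's rand() implementation that was used to find the seed.

#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

int main() {
    // Candidates for each election since 1916, sorted alphabetically.
    // (Abbreviated here; the full list is in the GitHub repo.)
    std::vector<std::vector<std::string>> candidates = {
        {"Hughes", "Wilson"},   // 1916
        {"Cox", "Harding"},     // 1920
        // ...
        {"Biden", "Trump"},     // 2020
        {"Harris", "Trump"}     // 2024
    };

    std::srand(824050438); // the magic seed
    for (const auto& names : candidates) {
        // Modulus on each random value picks the candidate. With the
        // full list and the right rand() implementation, every pick
        // comes out as the actual winner.
        std::cout << names[std::rand() % names.size()] << "\n";
    }
    return 0;
}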

But wait, isn't that cheating?

Yes, choosing a random seed that I know works perfectly is cheating. But it's hardly worse than having lots of people predict the elections and declaring someone a political pundit only once at least one of them gets most of the elections right, just as I declare a seed optimal only once at least one seed gets all the elections right. It's just a matter of cheating at the individual level versus the societal level.

Let's make a toy model in Python. You can find the full code, as well as the more efficient C++ version, on GitHub.

First we set up and preprocess our dataset. In this case, it's the list of all main contenders in US elections, and who the winner was in each case.

elections = [ # list the winner first
  [1789, ["Washington"]],
  [1792, ["Washington"]],
  [1796, ["Adams", "Jefferson"]],
  [1800, ["Jefferson", "Adams"]],
  [1804, ["Jefferson", "Pinckney"]],
  ...
  [1856, ["Buchanan", "Frémont", "Fillmore"]],
  [1860, ["Lincoln", "Breckinridge", "Bell", "Douglas"]],
  [1864, ["Lincoln", "McClellan"]],
  [1868, ["Grant", "Seymour"]],
  ...
  [1996, ["Clinton", "Dole"]],
  [2000, ["Bush", "Gore"]],
  [2004, ["Bush", "Kerry"]],
  [2008, ["Obama", "McCain"]],
  [2012, ["Obama", "Romney"]],
  [2016, ["Trump", "Clinton"]],
  [2020, ["Biden", "Trump"]],
  [2024, ["Trump", "Harris"]]
]

for e in elections: # preprocessing
  sorted_names = sorted(e[1]) # sort alphabetically
  result = sorted_names.index(e[1][0]) # index of the winner, in alphabetical order
  e.append(len(sorted_names))
  e.append(result)
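After preprocessing, each entry also carries the number of candidates and the winner's index in alphabetical order. For instance, the 2024 entry becomes [2024, ["Trump", "Harris"], 2, 1], since "Trump" comes after "Harris" alphabetically.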

Now let's simulate randomly guessing elections 1 million times.

import random

TRIALS = 1_000_000 # 1 million

def simulate_elections(seed):
    # guess randomly using a given seed for all elections
    random.seed(seed)
    correct = 0
    for j in range(len(elections)):
        result = random.randint(0, elections[j][2]-1)
        if result == elections[j][3]:
            correct += 1
    return correct

max_correct = 0
best_seed = -1

for i in range(TRIALS):
    correct = simulate_elections(i)
    if correct >= max_correct:
        max_correct = correct
        best_seed = i

print(f"{max_correct}/{len(elections)}")

This code runs in 20 seconds. The best seed comes out to 824728, with 48/60 elections correct. But can we do better? Can we get every single election correct?

Next, let's limit ourselves to the last 28 elections (1916–2024). The code now runs in 13 seconds and gets 26/28 elections correct with the seed 787252. Getting better!

In order to improve from here, we need more processing power. My C++ code, which I won't include here, runs on essentially the same principle but adds multithreading. This allows me to run 3000 simulations on our dataset in parallel, speeding up the search tremendously.
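The parallel pattern itself is simple: give each thread its own slice of the seed space and its own generator, and keep the best result. Here's a minimal sketch of that idea, not my actual code: std::minstd_rand stands in as a thread-safe generator, since C's rand() has global state, so this sketch would find a different magic seed than the one quoted below.

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <random>
#include <thread>
#include <vector>

// Per election: number of candidates and the winner's alphabetical index.
// (Abbreviated; derive these from the full 1916-2024 candidate list.)
const std::vector<int> numCandidates = {2, 2, /* ... */ 2, 2};
const std::vector<int> winnerIndex   = {1, 1, /* ... */ 0, 1};

// Count how many elections one seed "predicts" correctly.
int score(uint32_t seed) {
    std::minstd_rand rng(seed);
    int correct = 0;
    for (size_t i = 0; i < numCandidates.size(); i++)
        if ((int)(rng() % numCandidates[i]) == winnerIndex[i]) correct++;
    return correct;
}

int main() {
    const uint32_t TRIALS = 100000000; // seeds to try
    const unsigned nThreads = std::max(1u, std::thread::hardware_concurrency());
    std::vector<int> bestScore(nThreads, -1);
    std::vector<uint32_t> bestSeed(nThreads, 0);

    std::vector<std::thread> threads;
    for (unsigned t = 0; t < nThreads; t++) {
        threads.emplace_back([&, t] {
            // Each thread scans an interleaved slice of the seed space,
            // writing only to its own slot (no locks needed).
            for (uint32_t seed = t; seed < TRIALS; seed += nThreads) {
                int s = score(seed);
                if (s > bestScore[t]) { bestScore[t] = s; bestSeed[t] = seed; }
            }
        });
    }
    for (auto& th : threads) th.join();

    // Combine per-thread results.
    unsigned best = std::max_element(bestScore.begin(), bestScore.end()) - bestScore.begin();
    std::cout << bestScore[best] << " correct with seed " << bestSeed[best] << "\n";
    return 0;
}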

In C++, I manage to get 28/28 elections correct using the seed 824050438, which takes 20 seconds to find.

Remember, 20 seconds is just the time to discover this seed. Once we have the seed, we can technically compute election results almost instantly, without knowing the results in advance! All we need is the list of top contenders in each election. We stuff in our seed and all the results fall out perfectly.

So there you have it: a deterministic algorithm to perfectly predict every US presidential election since 1916!

This kind of accuracy is a crystal ball, the likes of which have not been seen in any election predictor in American history. Given this immense level of insight, you might be wondering who will win the 2028 US presidential election. Assuming a race between a Democrat and a Republican in 2028, the magic random seed 824050438 predicts… whoever's last name comes first in alphabetical order. You heard it here first. Don't be surprised if I'm right!

Takeaways for a Scientist

What's the takeaway of this experiment in a scientific context, especially data science?

At first, my takeaway was not to extrapolate past model performance to future performance. After all, hindsight is 20/20. See this relevant XKCD: “Electoral Precedent”.

XKCD 1122: Electoral Precedent. By Randall Munroe. CC BY-NC 2.5

But I don't think that's exactly what we should take away from this. If a model does well on 2000 cat versus dog predictions, I think it's a safe bet that it'll also do quite well on the next 50, even if the future data has some important differences.

Instead, I think the more relevant insight here pertains to extrapolating model performance from small datasets. When a model has done well on a small dataset, we don't have enough evidence to predict its future performance. The US presidential election dataset is quite small: there have only been 60 elections as of 2024. Most well-known election predictors only try their hand at around 10, and even then imperfectly!

Another takeaway is to always use a baseline before trusting your metrics. If you don't have at least a random-chance baseline for your predictions, if not a more sophisticated one, good performance isn't necessarily an indication that you're doing something right. This is a common mistake in machine learning, where people tend to build deep learning models for simple datasets that work quite well, but ironically still worse than linear regression.

And how about the takeaway in a political context? I'm not saying that these political analysis models are completely baseless, like a random number prediction based on the candidates' last names. I'm sure they have better than 50% odds because they genuinely take important information into account.

But I am saying that we should be skeptical when we hear claims of any one person or method being able to consistently predict election results—especially if they get a few wrong, because the probability of getting most but not all correct by pure chance is significant. We should evaluate the methodology further before assuming its accuracy.

So my overall takeaway is that as a scientist, you should avoid extrapolating performance from small datasets, and always use a baseline before trusting your metrics. And as a citizen, don't believe everything the election pundits tell you: for all you know, they could be flipping coins off camera!

References

The GitHub for this article, including figures, is at crackalamoo/blog-demos.

  1. Allan Lichtman (Wikipedia)