AI in Politics Applications, Part 2: Applying a Machine Learning Model that Predicts Election Outcomes to the Real World
The potential for AI in political campaigns, as in many industries, is immense. While AI took off in 2023 thanks to advances in computing power, developments in deep learning, and the availability of data, it has yet to meaningfully disrupt political campaigns.
This blog post, split into two parts, aims to show how even very basic applications of artificial intelligence can intersect with political campaigns to increase the understanding, efficiency, and effectiveness of movement building. The first post aimed to provide transparency into how machine learning and data science techniques might be used effectively, while this post digs deeper into how we might best leverage the results.
Last year, Campaign Brain founder Nate Levin experimented with developing a machine learning model, written in Python, to answer the question: can artificial intelligence help predict which candidates are most likely to win future races based on fundraising and expenditure?
Using data from the Federal Election Commission on 1,814 Senate and House races from 2016, a machine learning model was built to understand the role of fundraising and expenditure in predicting a candidate's likelihood of victory. The model looks specifically at two inputs: the total value of contributions received by a candidate and the candidate's total operating expenditure. Its outputs suggest that machine learning can be effective at predicting a candidate's likelihood of success. After the model was trained, it produced a score of .7438 when applied to the testing data, indicating that the model is relatively effective at predicting future results for candidates.
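To make the train-then-score workflow concrete, here is a minimal sketch in plain Python. It is not the original model: the post does not say which algorithm was used, so logistic regression is an assumption, and the data is synthetic (generated by the hypothetical `make_synthetic_candidates` helper), not FEC data. The point it illustrates is scoring on races the model never saw during training, which is what makes a score meaningful as an estimate of future performance.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_candidates(n, rng):
    """Hypothetical stand-in for FEC records (NOT real data).

    Each row is ((log10 contributions, log10 operating expenditure), won).
    """
    rows = []
    for _ in range(n):
        log_contrib = rng.uniform(4.0, 7.5)
        log_spend = log_contrib + rng.gauss(0.0, 0.1)
        # Assumed relationship: better-funded candidates win more often.
        won = 1 if rng.random() < sigmoid(2.0 * (log_contrib - 5.75)) else 0
        rows.append(((log_contrib, log_spend), won))
    return rows


def column_means(rows):
    """Mean of each of the two feature columns."""
    n = len(rows)
    return tuple(sum(x[j] for x, _ in rows) / n for j in range(2))


def fit_logistic(rows, means, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on mean-centered features."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for x, y in rows:
            centered = [x[j] - means[j] for j in range(2)]
            err = sigmoid(w[0] * centered[0] + w[1] * centered[1] + b) - y
            grad_w[0] += err * centered[0]
            grad_w[1] += err * centered[1]
            grad_b += err
        w[0] -= lr * grad_w[0] / n
        w[1] -= lr * grad_w[1] / n
        b -= lr * grad_b / n
    return w, b


def accuracy(rows, means, w, b):
    """Share of rows where the predicted winner flag matches the label."""
    correct = 0
    for x, y in rows:
        z = w[0] * (x[0] - means[0]) + w[1] * (x[1] - means[1]) + b
        predicted = 1 if z >= 0 else 0
        correct += predicted == y
    return correct / len(rows)


rng = random.Random(0)
data = make_synthetic_candidates(1000, rng)
train, test = data[:750], data[750:]  # hold out 25% for scoring

means = column_means(train)  # centering uses training rows only
w, b = fit_logistic(train, means)
train_score = accuracy(train, means, w, b)
test_score = accuracy(test, means, w, b)
print(f"train accuracy: {train_score:.3f}, held-out accuracy: {test_score:.3f}")
```

The numbers this prints say nothing about real elections, since the win probabilities are invented. The design choice worth noting is that the centering constants and the weights are computed from the training rows only, so the held-out score is not contaminated by the data it is measured on.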
In the 2020 election cycle, $14.4 billion was spent on U.S. political campaigns, a new record at the federal level. The presidential race accounted for a record $5.7B of that total, while Congressional races accounted for nearly $9B. This increase in spending is fueled by increased investment from both allied organizations, such as PACs, and small donors. Beginning in 1974, the Federal Election Commission (FEC) required that candidates provide detailed reports of fundraising and expenditure, offering insight into how financial strength varies by office, incumbency, and spending strategy. Because campaign fundraising and campaign expenditures are very highly correlated (r=.97 for incumbents), this study focuses on campaign fundraising.
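For readers unfamiliar with the r statistic quoted above, the following sketch shows how a Pearson correlation between receipts and spending could be computed. The dollar amounts are hypothetical and chosen only for illustration; they are not FEC figures and do not reproduce the r=.97 result.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical incumbents (illustrative dollar amounts, not FEC data).
receipts = [1_200_000, 2_500_000, 800_000, 4_100_000, 1_900_000]
spending = [1_100_000, 2_300_000, 850_000, 3_900_000, 1_700_000]

r = pearson_r(receipts, spending)
print(f"r = {r:.3f}")
```

When two inputs move together this closely, a model learns little from the second that it did not already get from the first, which is the reasoning behind focusing on fundraising.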
For political parties and political action committees, which strategically allocate funds to maximize their collective political power, a stronger ability to predict success is critically important. For political campaign staffers, understanding the likelihood of success may help determine which races to work on. Finally, party leaders must weigh strategic trade-offs at multiple levels and across multiple districts, balancing long-term and short-term priorities, and for them the ability to predict victory through an AI model based on fundraising levels is extremely valuable.
The main real-world application for this model would be to inform decision makers about the status of campaigns and to help identify campaigns that are likely to succeed. Organizations such as political parties and PACs constantly evaluate the status of campaigns and use this information to allocate scarce resources. Organizations with a higher-level goal, such as achieving a majority, must do their best to understand whom to support, and this model could be a valuable tool in making those determinations. Moreover, for individual donors, particularly high-level donors who tend to support many candidates each year, this could be a useful tool for identifying candidates to support.
The primary expected challenge in applying this model to the real world is that it may be answering a question that is already well served. Many resources are currently devoted to predicting who will win political races, and these predictions inform strategic decisions from party leadership, betting markets among followers, and donation levels from individuals. Currently, these estimates are largely based on district analysis and polling. That is, by looking at recent elections, a reasonable estimate of the likelihood of victory for a Democrat or Republican is achievable. For instance, one will often hear that a district is ‘Biden +2’, meaning that in the 2020 general election, Joe Biden carried this district by 2 percentage points. Although demographics and candidates change, this is still an effective data point from which to begin analysis, and it is not certain that a model with ~75% accuracy based on fundraising is an upgrade. That said, a key step to overcome this would be to build a model that combines the fundraising inputs with district analysis to understand with greater certainty who is likely to win.
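As one illustration of what combining fundraising with district analysis could look like, the sketch below turns a lean label such as ‘Biden +2’ into a signed number and places it alongside the two fundraising inputs in a single feature row. Everything here is hypothetical: the function names, the `LABEL_PARTY` mapping, the log scaling, and the dollar amounts are assumptions for illustration, not part of the original model.

```python
import math
import re

# Hypothetical mapping from the name in a lean label to a party code.
LABEL_PARTY = {"Biden": "D", "Trump": "R"}


def lean_for_candidate(label, candidate_party):
    """Convert a label such as 'Biden +2' into a signed margin in points,
    from the perspective of a candidate of the given party ('D' or 'R')."""
    if candidate_party not in ("D", "R"):
        raise ValueError(f"unsupported party code: {candidate_party!r}")
    match = re.fullmatch(r"\s*(\w+)\s*\+\s*(\d+(?:\.\d+)?)\s*", label)
    if match is None or match.group(1) not in LABEL_PARTY:
        raise ValueError(f"unrecognized lean label: {label!r}")
    margin = float(match.group(2))
    # Positive when the district leans toward the candidate's own party.
    return margin if LABEL_PARTY[match.group(1)] == candidate_party else -margin


def build_features(contributions, operating_expenditure, lean_label, party):
    """Assemble one feature row combining fundraising and district lean."""
    return [
        math.log10(contributions + 1.0),          # +1 avoids log10(0)
        math.log10(operating_expenditure + 1.0),
        lean_for_candidate(lean_label, party),
    ]


# Illustrative values only: a Republican running in a 'Biden +2' district.
row = build_features(1_500_000, 1_400_000, "Biden +2", "R")
print(row)
```

Rows built this way could be passed to whatever classifier is used for the fundraising-only model, so the district lean is simply one more input rather than a separate forecast.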
Taken as a whole, this model for predicting the likelihood of victory in United States Senate and House races has real value for candidates and campaigns, individual supporters, and the allied groups and political parties that provide resources. By using fundraising levels, which are consistently reported in line with FEC standards, campaigns can better understand their position in the race, and groups allocating resources have a clearer picture from which to begin the strategic analysis that determines where they direct those resources. Currently, much of this analysis starts from district analysis and polling, and this model represents a cheaper, replicable way to understand relative position in the race. Moreover, as total political fundraising and expenditure continue to increase each cycle, this model is likely to become stronger over time.
It is not, however, strong enough to predict with certainty that a candidate will win. Future improvements should focus on incorporating data from district analysis and live polls directly into the model, which should improve its accuracy. Even then, it is necessary to understand that there is no world in which this model will end up 100% accurate. Politics is characterized by bizarre gaffes and late developments in campaigns, often due to a candidate's regrettable behavior, that may fundamentally alter the course of the race in spite of the district, the polls, and fundraising disparities. That is, this model should always be used as one tool in the toolbox of political parties, individual donors, and political action committees, not in a vacuum as the only angle of analysis. As the House and Senate continue to trend towards increased polarization, this model offers an opportunity to better predict the races on the margin that are certain to determine majority control, and thus ownership of the policy decisions that will have domestic and global consequences for generations to come.