Football’s Most Important Bowl Game Is on a Computer

Gyver Machine Learning and the 2020 Super Bowl

The Kansas City Chiefs are the 2020 Super Bowl champions. In an exciting match with an incredible fourth-quarter comeback, they beat the San Francisco 49ers 31-20. The star of the evening was Patrick Mahomes, who at 24 years old is the youngest quarterback to win Super Bowl MVP honors. Off the field, however, data analytics is slowly becoming one of the most valuable players for football teams.


Like many other sports federations, the National Football League (NFL) uses data analysis and A.I. to give teams and viewers at home meaningful insights during live matches. These real-time, actionable insights can help coaches make crucial decisions during a game. To take these insights to the next level, the NFL has invested in its analytics department. It also took a more creative approach: for the second year in a row, Michael Lopez (director of data and analytics at the NFL) hosted a machine learning competition, held for the data science community of Kaggle. Given huge amounts of data, the data scientists were tasked with predicting the result of a play when a ball carrier takes the handoff. In other words: how far will the player run with the ball before being torpedoed to the ground, based on information about all players' positions, speeds, angles, directions, teams, names, etc.?

“Football’s Most Important Bowl Game Is on a Computer.” – The Wall Street Journal

While last year only 125 people participated, the 2020 Big Data Bowl hosted 2,038 teams from 75 countries that made 32,000 submissions and competed for a prize pool of $75,000. Gyver was among the competitors. In an international team with two data scientists from China and one from Russia, we achieved a silver medal and finished in the top 3%. In this blog post we briefly explain this exciting competition, the main challenges and the approach we took.

Field Plot Football

Since the main objective of the competition was to create a model that predicts the result of a play (when a ball carrier takes the handoff), this was a typical supervised learning challenge. We received data from many different matches over a period of time. We spent the first two months of the challenge (October and November) on pre-processing, analysis, modeling and evaluation. In the following two months (December and January), the models were evaluated every week on new data to determine a final score. In other words, the model was put to the test in a basic production environment using the newest data.

Feature engineering

The most interesting part of this competition was not in the architecture of the neural network or the (hyper)parameter tuning of said network. In this challenge, basic knowledge of football turned out to be a vital part of our solution. Consequently, feature engineering was very important for a high final ranking. 

Our most important features were the so-called Voronoi regions. A Voronoi diagram is simply a partition of a plane into regions, one per object in a given set, where each region contains all points that are closer to that object than to any other. In this case, the regions can be used to determine how much free space a particular player covers on the field. The image below shows the Voronoi region (space covered) for each defender of the opposing team (orange dots).
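To make the definition concrete, here is a minimal sketch of how Voronoi areas can be approximated without any geometry library: lay a grid over the field, assign every grid cell to its nearest player, and count cells. This is an illustration and not our competition code. The function name `voronoi_areas`, the grid step and the field constants (120 by 53.3 yards, end zones included) are assumptions made for the example.

```python
from math import hypot

FIELD_LENGTH = 120.0  # yards, including both end zones (assumed bounds)
FIELD_WIDTH = 53.3    # yards


def voronoi_areas(players, step=0.5):
    """Approximate each player's Voronoi area in square yards.

    players: list of (x, y) positions in yards.
    Every grid cell is assigned to the player nearest to its center;
    a player's area is the number of his cells times the cell area.
    """
    areas = [0.0] * len(players)
    cell_area = step * step
    for i in range(int(FIELD_LENGTH / step)):
        cx = (i + 0.5) * step  # x of the cell center
        for j in range(int(FIELD_WIDTH / step)):
            cy = (j + 0.5) * step  # y of the cell center
            nearest = min(
                range(len(players)),
                key=lambda k: hypot(players[k][0] - cx, players[k][1] - cy),
            )
            areas[nearest] += cell_area
    return areas


# Two players mirrored around midfield split the field evenly.
areas = voronoi_areas([(30.0, 26.65), (90.0, 26.65)])
```

A smaller `step` gives a finer approximation at the cost of more cells to evaluate. An exact polygon-based construction would be faster, but the grid version shows the nearest-player rule directly.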

Plot quarterback

Since we can estimate how much space each opposing player covers, we can now calculate how much free space the ball carrier (the red dot) has in front of him for his rush. This lets us add the feature “voronoi_region_ballcarrier” to our model. Eventually, we created all kinds of variations of Voronoi region features. For example, using the speed and direction of the players, we calculated Voronoi regions based on the expected player positions X seconds from now.
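The sketch below illustrates one possible reading of such a feature; it is not our actual competition code. It assumes each player is given as a hypothetical tuple `(x, y, speed, heading)`, with positions in yards, speed in yards per second and heading in radians measured from the x-axis, and it assumes players keep moving in a straight line at constant speed. The area is estimated on a grid, as a simple stand-in for an exact Voronoi construction.

```python
from math import cos, sin, hypot


def project(player, seconds):
    """Move a player along his current heading at constant speed (assumption)."""
    x, y, speed, heading = player
    return (x + speed * cos(heading) * seconds,
            y + speed * sin(heading) * seconds)


def ballcarrier_region_area(carrier, defenders, seconds=0.0, step=0.5,
                            length=120.0, width=53.3):
    """Grid estimate of the ball carrier's Voronoi area in square yards.

    A grid cell belongs to the ball carrier when he is strictly closer
    to its center than every defender, after projecting all players
    `seconds` into the future.
    """
    bx, by = project(carrier, seconds)
    others = [project(d, seconds) for d in defenders]
    area = 0.0
    for i in range(int(length / step)):
        gx = (i + 0.5) * step
        for j in range(int(width / step)):
            gy = (j + 0.5) * step
            d_carrier = hypot(gx - bx, gy - by)
            if all(d_carrier < hypot(gx - ox, gy - oy) for ox, oy in others):
                area += step * step
    return area


carrier = (30.0, 26.65, 0.0, 0.0)
defender = (90.0, 26.65, 10.0, 3.141592653589793)  # running toward the carrier
now = ballcarrier_region_area(carrier, [defender])
in_one_second = ballcarrier_region_area(carrier, [defender], seconds=1.0)
```

In this toy setup the ball carrier's region shrinks as the defender closes in, which is the kind of signal the projected variants are meant to capture.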

Using these features in a linear model already achieved great results. Switching to more advanced models (neural networks), and trying different architectures and hyperparameter settings improved our score even further. However, in hindsight, feature engineering is what really set us apart from most of the 2,038 participating teams. 

Plot Quarterback 2


While feature engineering in combination with domain knowledge may have been the most important part of getting a decent score, robustness of the model was extremely important for handling the new data that was fed to our models in the second phase. It was to be expected that the new match data could differ slightly from the data we trained on.

After the first “production iteration”, this became a nightmare for many teams because their models crashed. They simply weren’t prepared for variations in the data, and their code therefore failed to run. Luckily, we were prepared for this, since we know that getting a model into production is an often underestimated challenge.
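One common way to guard against this kind of variation is to sanitize every incoming row before it reaches the model. The sketch below is a generic illustration and not our competition code. The field names (`speed`, `x`, `position`), the default values and the set of known positions are all hypothetical.

```python
import math

# Hypothetical set of categories seen during training.
KNOWN_POSITIONS = {"RB", "QB", "WR"}


def to_float(value, default):
    """Parse a number, falling back to a default for missing or bad input."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return default
    # NaN and infinity parse fine but would poison downstream arithmetic.
    return number if math.isfinite(number) else default


def clean_row(row):
    """Return a row that is safe to feed to the model.

    Missing keys, unparseable numbers and unseen categories are replaced
    with defaults instead of raising an exception.
    """
    position = row.get("position")
    return {
        "speed": to_float(row.get("speed"), 0.0),
        "x": to_float(row.get("x"), 60.0),  # midfield as a neutral default
        "position": position if position in KNOWN_POSITIONS else "UNKNOWN",
    }


cleaned = clean_row({"speed": "4.2", "x": None, "position": "FB"})
```

Whether a default value, a dropped row or an explicit "unknown" category is the right fallback depends on the model. The point is that the decision is made up front rather than by a crash on unseen data.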

Throughout the iterations of the second phase of the competition, our model performed steadily, which resulted in a top 3% final ranking: a result we are very happy with. This Kaggle competition is yet another great example of how any context, domain or business can benefit from data science. For more information on the challenge, check out this podcast. If you’re looking for new insights for your business, contact us.
