How The Unabated NFL Simulator Works

Rufus Peabody
August 6, 2021

NFL Simulator

The Unabated NFL Simulator is a powerful tool designed to assist you in projecting the rest of the NFL season. At first glance, the NFL Simulator may seem a bit intimidating. The purpose of this document is to better explain how the NFL Simulator works and how you can get the most out of it.

About The Unabated NFL Simulator 

Our simulator runs 10,000 Monte Carlo simulations of the entirety or remainder of the NFL season. We simulate each week of the season, with the probability of a win/loss/tie determined by the difference in the two teams’ power ratings at that point, home field advantage, and situational effects (bye weeks, added/reduced rest). We update the simulator weekly with the results from the previous week’s games.
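As a rough illustration of how a single matchup could be simulated, the sketch below draws a score margin around the expected margin and counts how often the home team wins. The normal margin distribution, the 13.5-point standard deviation, the 1.5-point home field advantage, and the function names are all assumptions made for illustration, not the simulator’s actual values. Ties are ignored for simplicity.

```python
import random

HOME_FIELD_ADVANTAGE = 1.5  # points; assumed value, not the simulator's
MARGIN_SD = 13.5            # points; assumed spread of NFL score margins


def expected_margin(home_rating, away_rating, rest_edge=0.0):
    """Expected home margin from ratings, home field, and a rest adjustment."""
    return home_rating - away_rating + HOME_FIELD_ADVANTAGE + rest_edge


def estimate_home_win_prob(home_rating, away_rating, rest_edge=0.0,
                           n_sims=10_000, seed=0):
    """Share of simulated games the home team wins (ties ignored)."""
    rng = random.Random(seed)
    mean = expected_margin(home_rating, away_rating, rest_edge)
    home_wins = sum(rng.gauss(mean, MARGIN_SD) > 0 for _ in range(n_sims))
    return home_wins / n_sims


if __name__ == "__main__":
    # A home team rated 3 points better than its opponent.
    print(estimate_home_win_prob(3.0, 0.0))
```

A full season simulation would repeat this for every game on the schedule, 10,000 times over, and tally the results.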

Initially, the power ratings you provide are used directly to simulate game results, but the simulator allows team ratings to evolve as the simulated season progresses. After we simulate the regular season, we account for tiebreakers and simulate each week of the playoffs.


Why Do Power Ratings Change?

Power ratings change over the course of the season. Your power rating for a team in Week 1 will probably not be the same as in Week 5 (or Week 15). That doesn’t mean there’s anything wrong with your ratings. Power ratings change for two reasons: uncertainty and nonstationarity.

Uncertainty comes from the fact that we don’t know team strength; our ratings are our best guess, but there are teams that end up being a lot better than we expect, and teams that fail to meet expectations. I’m not talking about fumble luck being the cause here; I’m talking about Josh Allen being a better QB than most expected last season. And Carson Wentz being worse. As more games are played in a season, we get more data points and learn more about how good a team truly is. So the uncertainty decreases throughout the season.

Nonstationarity means that team strength isn’t some fixed number. It’s fluid, and changes based on injuries, momentum, and many other factors we can’t quantify. That’s the reason recent games carry more weight than games in the distant past. And in our simulation, it’s the reason team ratings continue to change late in the season, even when there is less uncertainty.


Simulating Quarterback Injuries In The NFL Simulator

The most distinctive thing the Unabated NFL Simulator does is simulate quarterback injuries. Quarterback is the most important position in professional sports, and an NFL QB has more of an effect on his team’s success than the other 10 players on offense combined. A QB injury is a major shock that can have a massive impact on a team’s rating. Some teams are more vulnerable to a QB injury than others. If Russell Wilson were to be injured (something that hasn’t happened in a decade), the Seahawks’ rating would fall dramatically, as the drop-off to Seattle’s backup QB is large. But teams like the Broncos or 49ers, who have two similarly capable quarterbacks, are less vulnerable to a QB injury.

Using the last 20 years of data, we flagged QB injuries and missed time. We built a basic model to predict the probability that a starting QB suffers an injury that causes him to miss at least one game, using a number of factors, including age, experience, size, and previous injury history. After each week, we simulate the possibility of a QB injury and, if there is one, simulate the number of games missed (based on historical baselines).

The QB1 INJ parameter reflects the per-game probability that a QB suffers an injury causing him to miss at least one game. A per-game injury probability of 4% would mean a starting quarterback has a 50% likelihood of making it through the 17-game regular season without missing a game. A starting QB with a 2% per-game injury probability will have a 71% chance of an injury-free season, and one with a 1% per-game injury rate has an 84% chance of a healthy year.
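The arithmetic behind those percentages can be checked with a few lines. This sketch assumes a 17-game regular season and that each game’s injury risk is independent of the others; the function name is made up for illustration.

```python
def healthy_season_prob(per_game_injury_prob, games=17):
    """Chance a QB gets through every game without a time-missing injury.

    Assumes independent games and a 17-game regular season.
    """
    return (1 - per_game_injury_prob) ** games


for p in (0.04, 0.02, 0.01):
    print(f"{p:.0%} per game -> {healthy_season_prob(p):.0%} healthy season")
```

The three lines printed are 50%, 71%, and 84%, matching the figures quoted above.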

Our modeled QB1 INJ numbers are included in the Massey-Peabody ratings; otherwise, the pre-populated numbers reflect a global average. Modeling QB injuries is a typical “small data” problem, so the intention is for users to modify these numbers to reflect their own opinions on how injury-prone a QB is.


Backup QB Penalty

We default the backup QB penalty to -3 points in both the basic and advanced modes. This is not optimal. The drop-off from Russell Wilson to Geno Smith is not the same as from Ryan Fitzpatrick to Taylor Heinicke. Leaving every team’s backup QB penalty the same will bias your results a bit towards teams with elite QBs. On the other hand, elite QBs are typically less injury-prone than average, so using the same QB1 INJ number for all QBs will bias your results the other way.


What If I Don’t Want To Simulate Quarterback Injuries? 

Easy. Set every QB1 INJ value to zero. There are some downstream consequences you should be aware of. Since we explicitly separate uncertainty within and between QBs, your sim will understate the true amount of variance in a team’s rating. Literally, you will be simulating the season with the knowledge that no QB gets injured, which isn’t realistic. To offset that, you could adjust the Rating Update Function (RUF) to be a little more aggressive for every team.

However, there is an easier way. If you use the same QB1 INJ number for every quarterback (the default in some modes) and the same backup QB penalty, your expectation for team wins will be the same as if you set all QB injury probabilities to zero. But you will still add uncertainty and, hence, won’t need to change the RUF.


How Do You Deal With a QB Being Benched?

We don’t. However, if a QB gets benched, it (should) mean that his rating fell to a point where the backup was a better option, which shouldn’t really affect our numbers.


What If The Backup QB Gets Injured?

The NFL Simulator does not currently simulate QB injuries for the backup. We could add that as a customization option in the future. In general, the drop-off from backup QB to third-string is small, so we don’t think this is a huge drawback. (Though the 2018 Washington Football Team begs to differ. I’m looking at you, Mark Sanchez and Josh Johnson.)


Walk Me Through An NFL Simulation

The NFL Simulator first simulates score differential based on the difference between the two teams’ power ratings, home field advantage, and difference in rest. Each team’s rating changes after each simulated game based on the (simulated) result relative to expectation. How much the rating changes is based on a function that was empirically estimated using historical Massey-Peabody ratings.

If a team overperforms expectations, their rating improves. If they underperform, they’ll likely see their rating fall. Ratings are more responsive to game results early in the season because there’s more uncertainty. They are also more responsive when a team has a less experienced QB (again, due to more uncertainty). 

But there’s a wrinkle: We don’t know how much a team’s rating will actually change based solely on the final score of a game; the best we can do is an estimate. To know, we’d need to have more detailed game statistics. So we once again must simulate to get a team’s rating change, with the added uncertainty coming from the fact we don’t directly observe the stats that drive the team rating. 
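One way to picture this two-step update is sketched below: the expected rating change is a gain times the surprise (simulated margin minus expected margin), and the realized change is then drawn around that expectation. The gain formula, the 16-start experience cutoff, and the noise level are hypothetical placeholders. The real function was estimated from historical Massey-Peabody ratings and is not reproduced in this article.

```python
import random


def update_gain(week, qb_career_starts):
    """Hypothetical responsiveness: higher early in the season and for
    teams with an inexperienced QB (both numbers are made up)."""
    base = 0.10 / (1 + 0.1 * (week - 1))
    inexperience_bonus = 0.03 if qb_career_starts < 16 else 0.0
    return base + inexperience_bonus


def simulate_rating_change(rating, expected_margin, simulated_margin,
                           week, qb_career_starts, rng, noise_sd=0.5):
    """Draw a new rating around the expected update.

    noise_sd stands in for the game statistics we never observe in a
    simulation; its value here is an assumption.
    """
    surprise = simulated_margin - expected_margin
    mean_change = update_gain(week, qb_career_starts) * surprise
    return rating + rng.gauss(mean_change, noise_sd)


if __name__ == "__main__":
    rng = random.Random(42)
    # Week 1, veteran QB: favored by 3, won by 13.
    print(simulate_rating_change(0.0, 3.0, 13.0, 1, 50, rng))
```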

For an example of why there is this uncertainty in predicting power rating change just based on score differential, think of a team that outplays their opponent, but has a -5 turnover differential and loses. The team’s rating changed, but we need to know how much to attribute to each unit, so we simulate how much the offensive and defensive ratings changed. Both could move in the same direction, or they could move in opposite directions. 

Since we simulate QB injuries in every mode, we need to allow for the starting QB’s rating to change. We assign a percentage of the offense’s rating change to the starting quarterback, with the QB’s proportion of credit/blame changing slightly as a function of the quarterback’s career starts. This means that the QB1 Injury Impact you set will not necessarily end up being how much a team’s rating changes if the starting QB gets hurt. If a team exceeded expectations, its starting QB’s rating likely improved. Therefore, a late-season injury would be more detrimental than we originally thought.
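A small worked example may help show why the injury impact drifts. In the sketch below, the QB’s value is tracked as points above his backup, and a share of each offensive rating change is credited to him. The 50% baseline share, the direction and size of the career-starts adjustment, and all names are assumptions for illustration only.

```python
def qb_share(career_starts):
    """Hypothetical share of an offensive rating change credited to the QB.

    Assumption: starts at 50% and drifts up slightly with experience.
    The simulator's actual share and its direction are not stated here.
    """
    return 0.5 + 0.1 * min(career_starts, 100) / 100


def apply_offense_change(qb_value_over_backup, rest_of_offense,
                         offense_change, career_starts):
    """Split an offensive rating change between the QB and everyone else."""
    share = qb_share(career_starts)
    new_qb = qb_value_over_backup + share * offense_change
    new_rest = rest_of_offense + (1 - share) * offense_change
    return new_qb, new_rest


if __name__ == "__main__":
    # User set a 3-point injury impact; the offense then improves by 2 points.
    qb, rest = apply_offense_change(3.0, 0.0, 2.0, career_starts=0)
    print(qb, rest)  # losing this QB now costs 4 points, not 3
```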

This whole process results in updated power ratings for offense, defense, and starting QB. But wait! We still have to deal with the possibility of a QB injury. We simulate whether a QB was injured based on the QB1 INJ specified by the user. In the event that a QB was injured, we simulate from a distribution of injury length (in games) based on historical QB injury durations. 

While there is some ability to predict QB injuries, we found no QB-by-QB variation in predicted injury duration. Basically, if a QB gets hurt in a simulation, we are spinning a roulette wheel to determine the number of games he’ll miss. We substitute the backup QB’s rating for the starting QB’s rating. The starting QB’s rating returns when he returns from injury. 
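The injury step can be sketched as follows. After each game the starter plays, we check for an injury; if one occurs, we draw the number of games missed and the backup’s penalty applies until the starter returns. The games-missed values and weights are invented for illustration, since the historical distribution is not given in this article, and the function names are hypothetical.

```python
import random

# Hypothetical distribution of games missed (values and weights are made up).
GAMES_MISSED = [1, 2, 3, 4, 6, 8, 16]
WEIGHTS = [30, 20, 15, 10, 10, 8, 7]


def simulate_qb_availability(per_game_injury_prob, n_games=17, rng=None):
    """Return one True/False flag per game: does the starter play?"""
    if rng is None:
        rng = random.Random()
    plays = []
    games_out = 0
    for _ in range(n_games):
        if games_out > 0:  # still injured, so the backup starts
            plays.append(False)
            games_out -= 1
            continue
        plays.append(True)
        if rng.random() < per_game_injury_prob:  # hurt during this game
            games_out = rng.choices(GAMES_MISSED, weights=WEIGHTS)[0]
    return plays


def team_rating(base_rating, backup_penalty, starter_plays):
    """Apply the (negative) backup penalty while the starter is out."""
    return base_rating if starter_plays else base_rating + backup_penalty


if __name__ == "__main__":
    season = simulate_qb_availability(0.04, rng=random.Random(7))
    print([team_rating(2.0, -3.0, played) for played in season])
```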


Rating Update Function (RUF)

What Is RUF?

Team ratings are not static. They change week to week based on how teams play relative to expectation. We give users the ability to change how sensitive these power rating changes are to simulated game outcomes.

A more aggressive RUF setting will lead to a wider range of outcomes for that team. If you think there is more uncertainty around a team’s rating, a more aggressive RUF setting is appropriate. The Jaguars, for example, face many questions that one could argue increase the uncertainty in their rating and would justify a more aggressive RUF: Will Urban Meyer’s collegiate coaching success translate to the NFL? Will Trevor Lawrence set the world on fire, or experience growing pains?

Conversely, some teams, as Dennis Green once opined, “are who we thought they were”. Teams that you believe have less uncertainty in their rating should have a more conservative RUF.
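To see why a more responsive update widens the range of outcomes, consider the toy experiment below. A team starts with a rating of zero, plays 17 games against zero-rated opponents, and moves its rating by a fixed fraction of each game’s surprise. The gain values and their labels, the 13.5-point margin spread, and the fixed opponents are all simplifying assumptions; they are not the simulator’s RUF settings.

```python
import random
import statistics


def simulate_win_totals(gain, n_games=17, n_sims=5000, margin_sd=13.5, seed=0):
    """Season win totals for a team that starts at 0 and faces 0-rated
    opponents, updating its rating by `gain` times each game's surprise."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        rating, wins = 0.0, 0
        for _ in range(n_games):
            margin = rng.gauss(rating, margin_sd)  # expected margin = rating
            wins += margin > 0
            rating += gain * (margin - rating)     # move toward the surprise
        totals.append(wins)
    return totals


if __name__ == "__main__":
    for label, gain in [("conservative", 0.02), ("neutral", 0.05),
                        ("aggressive", 0.10)]:
        spread = statistics.pstdev(simulate_win_totals(gain))
        print(f"{label:>12}: win-total standard deviation {spread:.2f}")
```

The larger the gain, the more an early hot or cold streak feeds into later games, so the win totals spread out.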


Do You Have To Change It?

No. The default is what we have found to be optimal historically for Massey-Peabody rating changes. Why should you trust that Massey-Peabody ratings are tuned properly and optimally weight prior vs. in-season information? You don’t have to! However, our defaults are carefully calibrated. They ensure that you will get the same variance in prediction error (actual wins minus expected wins) as we have observed over the last ten years.


What If I Think There is More Uncertainty For Every Team Due to Covid?

Feel free to make the RUF more aggressive for every team.


How Much Of An Effect Will It Have?

There are 11 possible RUF settings, from -5 (most conservative) to +5 (most aggressive). The most conservative setting will result in an average absolute difference of 1.45 points between a team’s preseason and year-end power ratings (assuming no QB injury). A neutral setting will result in an average absolute difference of 2.45 points. The most aggressive setting sees that number balloon to 4.18.

Want this framed differently? In a simulation where every team had a rating of zero, with no QB injuries, a team would win between 7 and 10 games 64.5% of the time using the most conservative RUF setting. With a neutral setting, that drops to 51.5%, and with the most aggressive setting it falls to 41.9%.
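For context, the same quantity can be computed exactly for a baseline in which ratings never move at all. The sketch assumes 17 independent coin-flip games with no ties and no home field effects, which is a simplification and not a simulator setting.

```python
from math import comb


def prob_wins_between(low, high, games=17, win_prob=0.5):
    """Exact binomial probability of finishing with low..high wins."""
    return sum(
        comb(games, w) * win_prob**w * (1 - win_prob) ** (games - w)
        for w in range(low, high + 1)
    )


print(f"{prob_wins_between(7, 10):.1%}")  # prints 66.8%
```

Under those assumptions the static baseline comes out a little above the 64.5% quoted for the most conservative setting, which is what you would expect if even small rating movements widen the distribution of wins.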


NFL Simulator Results Screen

Here you can see some basic output from the simulation you just ran. We report the average number of wins (not the median) along with playoff, division, conference, and Super Bowl probabilities. These numbers are not based on a formula; they come directly from the 10,000 simulations that were run based on your inputs.

Navigate to the futures screen, and you can select from the available markets in the dropdown. There you can see not only the odds offered at a wide range of books, but also the odds implied by your simulation. Offerings with positive expected value show up in green on the odds screen.
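The comparison behind the green highlighting can be sketched with standard American-odds arithmetic: convert your simulated probability into a fair price, then compute the expected value of a bet at the offered price. The function names are made up for illustration, and the sketch assumes a simple win-or-lose market with no pushes.

```python
def fair_american_odds(prob):
    """Break-even American odds for an outcome with the given probability."""
    if not 0 < prob < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    if prob >= 0.5:
        return -100 * prob / (1 - prob)
    return 100 * (1 - prob) / prob


def expected_value(prob, american_odds, stake=1.0):
    """Expected profit per bet at the offered odds, given your probability."""
    if american_odds > 0:
        profit_if_win = stake * american_odds / 100
    else:
        profit_if_win = stake * 100 / -american_odds
    return prob * profit_if_win - (1 - prob) * stake


if __name__ == "__main__":
    p = 0.25                         # e.g. a team wins its division in 25% of sims
    print(fair_american_odds(p))     # 300.0
    print(expected_value(p, 350))    # 0.125, positive, so a +EV offering
```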

When looking at the Number of Wins markets, you can click on the triple lines to the right of a team’s division and see your fair price on the entire distribution based on your simulations (useful for exact number of wins and alternate win totals).
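Pricing a win total from simulation output comes down to counting. The sketch below takes a list of simulated win totals (one per simulated season) and returns the over, under, and push probabilities for a given line, plus the probability of each exact win total. The sample data and function names are invented for illustration, and wins are treated as whole numbers (ties are ignored).

```python
from collections import Counter


def win_total_probs(simulated_wins, line):
    """Over / under / push probabilities for a win-total line."""
    n = len(simulated_wins)
    over = sum(w > line for w in simulated_wins) / n
    under = sum(w < line for w in simulated_wins) / n
    push = sum(w == line for w in simulated_wins) / n
    return over, under, push


def exact_win_probs(simulated_wins):
    """Probability of each exact win total, sorted by wins."""
    n = len(simulated_wins)
    counts = Counter(simulated_wins)
    return {wins: count / n for wins, count in sorted(counts.items())}


if __name__ == "__main__":
    sims = [7, 8, 8, 9, 10, 9, 8, 11, 6, 9]  # made-up results from 10 sims
    print(win_total_probs(sims, 8.5))  # (0.5, 0.5, 0.0)
    print(win_total_probs(sims, 8))    # (0.5, 0.2, 0.3)
    print(exact_win_probs(sims))
```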