If building sports betting models to beat the NFL or NBA were easy … well, you’d already be part of a large group of sophisticated bettors who move markets with every wager.
If you’re not in the fold with other market-moving quants but you want to explore the concepts behind them, you might be in the right place.
From now through the end of July, we’re going to help the data science-curious ease into the world of regressions, distributions and data sets.
Welcome to Unabated’s Intro to Data Science. If you were in college, it would be Data Science 101. Here, it’s Data Science -110.
What You’ll Learn
Over the course of the next nine articles, we’re going to introduce you to fundamental concepts in data science and explain when and how to apply them.
We’ll teach you where to find your data and how to import it. How to scrub your data squeaky clean. And how to parse it.
Once you have the basics, we will give you the framework to start using your data to find out which data correlates with market prices; to determine whether you’re better off betting derivatives or full game prices; to determine the value of the half-point (half a rebound, strikeout, yard, etc.) in props; and eventually, to build functioning sports betting models that can be effective in less liquid markets.
A new article will be posted, for the most part, every week to 10 days, starting next week with an introductory look at how to collect your data.
The Limits of this Course
At the end of the summer, you won’t come away with a model you can use to beat the most liquid, efficient markets in sports betting.
If you’re dreaming of beating NFL totals this fall, we can help you take your first baby steps on that marathon trail, but we won’t be with you at the finish line.
When you get there, though, think of us. And maybe let us know what market inefficiencies you’re targeting around 2026 when you’ve got something really cooking.
The Tools You’ll Need
You’ll be able to use all these techniques and data armed with nothing more than Excel or Google Sheets. While spreadsheets may not be a long-term solution if you delve deeper into data science or eventually build your own sports betting models, for the purposes of this introduction they’ll be robust enough to get the job done.
To gather data, we will be providing some code in the programming language R that you can use to fetch data and export to a spreadsheet. Don’t worry, you won’t have to learn how to create your own code from scratch. Just learn a few simple steps to run it.
You will need to download the integrated development environment (IDE) of your choice. We recommend RStudio, though you can also use Spyder, Visual Studio Code, Atom, Jupyter or any other IDE, if you have a preference.
If you do want to tackle the material in R, we’ll be providing sample code you can use in each lesson to perform the same tasks.
Or, if you want to learn R in parallel to this project, there are excellent free courses available from Coursera, MIT’s OpenCourseWare and many more.
Most of the IDEs we’ve listed also support Python, another language that can be useful for this kind of work.
If you want to do some reading in your spare time, this extensive sports analytics bibliography from Notre Dame professor Scott Nestler will keep you busy for a long, long time.
Before we really get into the meat of the course, we’re going to show you where to source data, how to obtain it, how to clean it, and how to import it. We’ll have sample data for you to use along the way, but you’ll need to know how to get fresh numbers as you move on and start building your own projects.
Are you ready? We’re ready. We’ll see you next week.