To build a live prediction model you need a data source that contains historical & current data. Finding a good data source is key.
The site(s) or source(s) you choose will determine how much of the process you can automate.
- What sites allow scraping from their
- Scraping HTML tables with Python.
- Advanced Scraping: Inspecting how data is communicated to a website.
- Storing the data.
To check whether a site allows web scraping, look at the robots.txt file for that site. To find this, go to the website's domain name…
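As a rough illustration of what a robots.txt check looks like in code, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The sample rules, the `example.com` URLs, and the helper name `can_scrape` are made up for this example; in practice you would pass in the text of the real file from the site you want to scrape.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""


def can_scrape(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url.

    can_scrape is a hypothetical helper name, not part of any library.
    """
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network call is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(can_scrape(SAMPLE_ROBOTS, "https://example.com/stats/table.html"))
    print(can_scrape(SAMPLE_ROBOTS, "https://example.com/private/data.html"))
```

Keeping the parsing separate from the download makes the check easy to test offline. Note that robots.txt only states the site's crawling preferences; a site's terms of use may add further restrictions.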
Ever wanted to build your own predictive model to create optimal DFS lineups using data science & machine learning?
This article will be your roadmap for connecting the various processes needed to develop predictive player model(s) & optimize your predictions. It is the top-level outline for several other posts that together will get you making predictions in no time.
Each section has its own article, with this article being the hub tying them all together. The data & Python code repo will be linked on each article. The main repo is here.
The last step in building your predictive DFS model is using linear programming to figure out what your best lineups are based on your predictions. For this we will be using the Python package PuLP. This article will cover how to add salary & position constraints, as well as additional constraints to modify the lineups.
Let's import our modules and load our data. …
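To give a feel for what salary & position constraints look like in PuLP, here is a minimal sketch. The player pool, salaries, projections, salary cap, and roster slots are all invented for illustration, and `optimize_lineup` is a hypothetical helper name; this is not the article's actual data or code.

```python
import pulp

# Invented player pool: (name, position, salary, projected points).
PLAYERS = [
    ("QB_A", "QB", 7000, 22.0),
    ("QB_B", "QB", 5500, 18.0),
    ("RB_A", "RB", 8000, 24.0),
    ("RB_B", "RB", 6000, 17.0),
    ("RB_C", "RB", 4500, 12.0),
    ("WR_A", "WR", 7500, 21.0),
    ("WR_B", "WR", 5000, 14.0),
    ("WR_C", "WR", 4000, 11.0),
]
SALARY_CAP = 25000                   # assumed cap for this toy example
SLOTS = {"QB": 1, "RB": 2, "WR": 1}  # assumed roster shape


def optimize_lineup(players, salary_cap, slots):
    """Pick the lineup with the most projected points under the constraints."""
    prob = pulp.LpProblem("dfs_lineup", pulp.LpMaximize)

    # One binary decision variable per player: 1 = in the lineup, 0 = not.
    pick = {name: pulp.LpVariable(f"pick_{name}", cat="Binary")
            for name, _, _, _ in players}

    # Objective: maximize total projected points.
    prob += pulp.lpSum(pts * pick[name] for name, _, _, pts in players)

    # Salary constraint: total salary must not exceed the cap.
    prob += pulp.lpSum(sal * pick[name] for name, _, sal, _ in players) <= salary_cap

    # Position constraints: fill each slot type exactly.
    for pos, count in slots.items():
        prob += pulp.lpSum(pick[name] for name, p, _, _ in players if p == pos) == count

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [name for name in pick if pick[name].varValue > 0.5]


if __name__ == "__main__":
    print(optimize_lineup(PLAYERS, SALARY_CAP, SLOTS))
```

Additional lineup rules can be added the same way, as one more linear constraint on the `pick` variables, for example forcing a specific player in with `pick["RB_A"] == 1`.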
Now that you have built your dataset, created some features, trained & tuned your model, you need to bring it all together to create your live prediction. That is, your prediction before an event occurs, in this case NFL Sunday.
For this to occur we need to gather the necessary features for the upcoming week to make predictions on. This is why we used the .shift() function in the ETL step: so we can make predictions for the current week using previous weeks' data.
The data used is located here. Python code is located here.
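The idea behind .shift() can be sketched with a small pandas example. The DataFrame, its column names (`player`, `week`, `rush_yds`), and the helper name `add_prev_week_feature` are assumptions made for illustration, not the actual dataset or ETL code.

```python
import pandas as pd

# Invented game log: one row per player per week.
games = pd.DataFrame({
    "player": ["A", "A", "A", "B", "B", "B"],
    "week": [1, 2, 3, 1, 2, 3],
    "rush_yds": [50, 80, 65, 30, 45, 90],
})


def add_prev_week_feature(df: pd.DataFrame, col: str) -> pd.DataFrame:
    """Add a column holding each player's value of `col` from the previous row.

    Hypothetical helper. Sorting by player and week first means the previous
    row within each player group is that player's previous week in the data.
    """
    out = df.sort_values(["player", "week"]).copy()
    # shift(1) within each player group moves values down one row, so the
    # row for week N carries the stat from the prior recorded week.
    out[f"prev_{col}"] = out.groupby("player")[col].shift(1)
    return out


features = add_prev_week_feature(games, "rush_yds")

if __name__ == "__main__":
    print(features)
```

Each player's first week has no earlier game, so its shifted value is missing (NaN). Because every feature row only contains information available before that week's games, the same transformation can produce a feature row for an upcoming week that has not been played yet.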
First thing we need to do is update…
Now that we have a feature set we will try out some models, analyze the results & come up with a game plan to predict next week's results.
In this section we will build predictive models based on the quarterbacks in our dataset. We will try two popular gradient boosting algorithms: XGBoost & LightGBM.
Our target variable will be each QB's rank by DraftKings points scored in a given week. You can easily change this target variable to predict who will throw the most TDs, run for the most yards, etc.
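One way a weekly rank target like this could be derived is sketched below with pandas. The column names (`dk_points`, `pass_tds`), the sample numbers, and the helper name `add_weekly_rank` are hypothetical and only meant to show how the target can be swapped.

```python
import pandas as pd

# Invented QB stat lines: one row per quarterback per week.
qb = pd.DataFrame({
    "week": [1, 1, 1, 2, 2, 2],
    "player": ["A", "B", "C", "A", "B", "C"],
    "dk_points": [25.0, 18.5, 30.2, 12.0, 22.4, 19.1],
    "pass_tds": [2, 1, 3, 0, 3, 2],
})


def add_weekly_rank(df: pd.DataFrame, stat_col: str) -> pd.DataFrame:
    """Add a `<stat_col>_rank` column: 1 = best value of stat_col in that week.

    Hypothetical helper, not taken from the article's repo.
    """
    out = df.copy()
    # Rank within each week; ascending=False so the highest value gets rank 1.
    # method="min" gives tied players the same (lowest) rank.
    out[f"{stat_col}_rank"] = (
        out.groupby("week")[stat_col].rank(method="min", ascending=False)
    )
    return out


ranked = add_weekly_rank(qb, "dk_points")      # target: rank by DraftKings points
ranked_tds = add_weekly_rank(qb, "pass_tds")   # swapped target: rank by passing TDs

if __name__ == "__main__":
    print(ranked)
```

Changing the target then comes down to passing a different stat column; the rest of the modeling pipeline can stay the same.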
Now that we have a dataset, it is time to shape & manipulate the data into a state from which we can create a predictive model.
Based on the data we have, a model for each position seems appropriate. In this article we will go over the ETL to create a dataset for predicting a running back's fantasy rank in a given week. The file linked will include the transformations necessary to do each position with their defensive components as well (RB vs. defense).
We have two main…