Let's get into what I'm going to call starter-worthy production.
If you're in a 12-team league with the usual two RB slots to fill, then the top 24 running backs are typically deemed "starters." To get a sense of how things may go for you this week, you visit your favorite rankings site to check where your guys fall in the top 20 to 30 players at the position. But how many points should you feel good about from those players? Can we find a point threshold that helps identify production worthy of being in your starting lineup? Let's take a look...
I'm defining starter-worthy production for each week as the top 24 scores for RB and WR, and the top 12 scores for QB and TE. The dataset consists of the top scores at each position for each week: 50 for RB and WR, 18 for QB, and 20 for TE. I capped the data at those counts for simplicity, and the caps don't change the analysis. Looking at the visualization below, the blue histogram shows what starter-worthy production has looked like by position since the 2015 season. Toggle the separating line to see the point value that best splits the starter-worthy scores from the rest of the data; I'll explain more on that later. There's going to be some overlap, but that's OK and expected. And for you PPR folks, use the slider to see how your PPR value changes things. Take a few minutes to mess around with positions, years, and PPR values.
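If you'd like to try the labeling step yourself, here's a rough sketch of how it could look. This is not the code behind the visualization; the field names (`season`, `week`, `position`, `standard_points`, `receptions`) and both helper functions are made up for illustration, and the sample data is fake. The only things taken from this post are the top-24 and top-12 cutoffs. PPR is handled the usual way, by adding your league's per-reception value for each catch.

```python
from collections import defaultdict

# Weekly cutoffs taken from the post: top 24 for RB/WR, top 12 for QB/TE.
STARTER_CUTOFF = {"RB": 24, "WR": 24, "QB": 12, "TE": 12}


def ppr_points(row, ppr=1.0):
    """Non-PPR points plus `ppr` points per reception (field names are assumed)."""
    return row["standard_points"] + ppr * row["receptions"]


def label_starter_worthy(rows, ppr=1.0):
    """Return a copy of each row with 'points' and a 'starter_worthy' flag added."""
    # Group the scores so ranking happens within one season, week, and position.
    by_week = defaultdict(list)
    for row in rows:
        by_week[(row["season"], row["week"], row["position"])].append(row)

    labeled = []
    for (_season, _week, position), group in by_week.items():
        ranked = sorted(group, key=lambda r: ppr_points(r, ppr), reverse=True)
        cutoff = STARTER_CUTOFF[position]
        for rank, row in enumerate(ranked, start=1):
            labeled.append({
                **row,
                "points": ppr_points(row, ppr),
                "starter_worthy": rank <= cutoff,
            })
    return labeled


# Made-up week of 20 tight ends, purely for illustration.
sample = [
    {"season": 2018, "week": 1, "position": "TE", "player": f"TE{i}",
     "standard_points": float(i), "receptions": 0}
    for i in range(1, 21)
]
labeled = label_starter_worthy(sample, ppr=1.0)
starters = [r for r in labeled if r["starter_worthy"]]
print(len(starters), "of", len(labeled), "tight ends labeled starter-worthy")
```

Ties at the cutoff are broken by whatever order the rows arrive in, which is a simplification; a real dataset might need an explicit tie-break rule.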
So, let's pull some facts out of that visualization. First, we can see the value that best separates out starter-worthy points. As an example, since the start of the 2015 season, the RB starter-worthy threshold in a full-point PPR league is 11.5 points per game. Does that mean if your RB failed to reach 11.5 points you shouldn't have had them in your lineup? Nope. That obviously depends on your roster and available options. What this does highlight is where your roster may be under-performing compared to the rest of your league. And last I looked...under-performing compared to your league is how you lose. Another thing to look at is how much overlap there is between the bottom of the starter-worthy group and the top of the rest. More overlap means more depth at the position, which has been an issue for both running backs and tight ends in 2018.
It's also important to look at how this separation has been trending over the last few seasons. The table to the right breaks down this key value by position and season and allows you to account for your PPR if necessary. QB values changing with PPR does make some sense, too. Every so often a quarterback snags a pass and sometimes it's even intentional.
How did I find these values? The first step was to define starter-worthy using the top-24 and top-12 thresholds I mentioned at the top of this post. I then applied an algorithm called a support vector machine, which is a popular choice when trying to classify things into two or more groups -- in this case, starter-worthy or not. What makes this algorithm a good choice is that its output relies only on the support vectors, which are the data points closest to the separating line. Very low scores and very high scores do not affect the algorithm's output. Well, that's it for the first post. Let me know what you think in the comments, or if you have any questions, let me know by email or on Twitter. You can also subscribe!
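For anyone curious about the support-vector idea described above, here's a toy sketch. It is not the model fit for this post. It assumes a simplified case where the two groups don't overlap at all; in that case, in one dimension, the widest-margin separating point sits halfway between the lowest starter-worthy score and the highest non-starter score, and those two scores are the support vectors. The real data overlaps, so it would need a soft-margin SVM from a library instead. The function name and all of the scores are made up.

```python
def max_margin_threshold(starter_scores, other_scores):
    """Hard-margin separating point for two non-overlapping groups of 1-D scores."""
    lowest_starter = min(starter_scores)   # support vector on the starter side
    highest_other = max(other_scores)      # support vector on the other side
    if highest_other >= lowest_starter:
        raise ValueError("groups overlap; a soft-margin SVM is needed instead")
    # The point that maximizes the margin is the midpoint of the gap.
    return (lowest_starter + highest_other) / 2


# Made-up scores, purely for illustration.
starters = [12.0, 14.5, 18.0, 22.5]
others = [3.0, 6.5, 9.0, 10.0]
base = max_margin_threshold(starters, others)

# Add one monster game and one dud: neither is near the separating line.
with_extremes = max_margin_threshold(starters + [45.0], others + [0.0])

print(base, with_extremes)
```

Both calls return the same threshold, because the added scores are far from the gap between the groups and so never become support vectors.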