My friends and family know that I’ve been devoting a fair amount of time to fantasy baseball lately. I started back in February of this year trying to get my ‘system’ in place for Opening Day in April. While I don’t intend to share everything I’ve tested and learned, the blog is a good place to chime in on some age-old questions.
Today, the question is “Are Hot Streaks and Cold Streaks Real in Fantasy Baseball?”
Why it Matters
The answer matters to fantasy players for a few reasons.
- When a player performs really well for a few games in a row, his Salary on sites such as FanDuel and DraftKings also tends to rise. If there is no such thing as a hot streak, these players would be overpriced, but if a few high-scoring games are a solid indicator of more strong games to come, then you’d want these hot hands in your line-up even at the higher prices.
- When a player performs really poorly (compared to his own prior performance) for a few games in a row, one of two things could be happening. Either it is a random fluctuation and he is ‘due’ for bigger numbers to restore his season-to-date averages, or he is at the beginning of a true slump. A true slump could be a minor virus or injury that isn’t worth disclosing, something harrowing in his personal life or just ‘not seeing the ball well’. If weak performance is an indicator of more weak performance to come, you’d want to avoid such players on your fantasy roster, and the stat-heads may be able to detect emerging slumps faster than the media and pundits.
To answer this question, I focus on occasions where a hitter plays 3 days in a row at any point in the season after he has played at least 20 total games. I refer to the games in these 3-game sequences as ‘prior-to-last’, ‘last’ and ‘today’. I’m interested in what I can predict about ‘today’ (the 3rd game in the sequence) based on what happened in the prior-to-last and last games along with knowledge of the hitter’s season-to-date performance.
The analysis covers the entire 2016 MLB season. All points values are for the FanDuel scoring system in use during the 2016 season.
- For every hitter, calculate their season-to-date Fantasy Points Per Game (FPPG) for all games played prior to ‘today’. (Note that this includes ‘prior-to-last’ and ‘last’.)
- For every hitter, calculate their most recent 2-game average FPPG by averaging the points earned in the ‘prior-to-last’ and ‘last’ games. Convert these values into bucketed ranges.
- Group hitters into similar performance bands based on their season-to-date results. (I want to compare performance in the third game among hitters who are similarly capable overall, and differ only in how their two last games unfolded.)
- Calculate summary statistics for ‘today’ for hitters in similar performance bands, cut by prior 2-game results.
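For readers who want to try this themselves, the first two steps might be sketched in Python roughly as follows. This is an illustration, not the code behind the analysis: the function name `three_day_instances`, the (date, points) input format, and my reading of the 20-game rule (at least 20 games played before ‘today’) are all assumptions, and the sample numbers are made up.

```python
from datetime import date, timedelta


def three_day_instances(games, min_games=20):
    """Find every 'today' game that closes a run of 3 consecutive days.

    games: list of (date, fantasy_points) for ONE hitter, sorted by date,
    assuming one game per calendar day (doubleheaders are not handled).
    min_games: games required before 'today' (an assumed reading of the rule).
    """
    one_day = timedelta(days=1)
    instances = []
    running_total = 0.0  # points scored in all games before index i
    for i, (day, points) in enumerate(games):
        if i >= 2 and i >= min_games:
            prior_day, prior_pts = games[i - 2]  # 'prior-to-last'
            last_day, last_pts = games[i - 1]    # 'last'
            if last_day - prior_day == one_day and day - last_day == one_day:
                instances.append({
                    "today": day,
                    # Season-to-date average includes the two most recent games.
                    "season_fppg": running_total / i,
                    "last2_fppg": (prior_pts + last_pts) / 2,
                    "today_points": points,
                })
        running_total += points
    return instances


# Made-up game log for illustration only (not real 2016 data).
demo_games = [
    (date(2016, 4, 1), 10.0),
    (date(2016, 4, 2), 0.0),
    (date(2016, 4, 3), 20.0),
    (date(2016, 4, 5), 6.0),   # day off on 4/4 breaks the run
    (date(2016, 4, 6), 12.0),
    (date(2016, 4, 7), 3.0),
]
demo_instances = three_day_instances(demo_games, min_games=3)
```

With the toy threshold of 3 games, only 4/7 qualifies as a ‘today’: 4/3 also ends a 3-day run but falls before the threshold, and the day off on 4/4 breaks the sequences ending on 4/5 and 4/6.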
All of that might be easier to understand with a picture. Here is a small annotated excerpt of the 2016 Scoring Logs provided by DFS On Demand. I didn’t actually do this analysis in Excel, but this will help explain the steps.
As of 4/26/16, Albert Pujols had played 20 regular season games, so this is the earliest date for which he is included in the analysis. This image contains 7 occasions where Albert Pujols played 3 days in a row, as noted by the ‘Y’ in the row labeled Today. Notice that these periods can overlap.
Let’s say ‘Today’ is 5/4. On the morning of 5/4, Pujols’ season-to-date average was 9.0 FPPG (average of 4/4 to 5/3) and the average of his last 2 games (5/2 and 5/3) was 14.1. In the game on 5/4, he posted 6 fantasy points.
The question is: among hitters averaging near 9.0 overall, when their last 2 games average near 14, what should we expect from the next game? You could simplify this and ask: will the next game be closer to the season average of 9 or closer to the 2-game streak of 14?
In this picture there are 7 instances to analyze for just 1 hitter. Across all hitters for the whole season, I had 16,751 instances to analyze.
I decided on the ‘buckets’ after examining some histograms of the season-to-date (STD) and last-2-game averages. Not surprisingly, the STD data in orange is much more tightly clustered. The STD data is roughly centered around 10, so I define 4 buckets of overall performance: <8, 8 to 10, 10 to 12 and >12. The averages from the last 2 games (blue) take on a much broader range of values. I define the 6 buckets of recent performance as Exactly 0, >0 to 6, 6 to 12, 12 to 18, 18 to 24 and >24.
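The bucketing itself could be sketched in Python like this. The labels come straight from the lists above, but the post doesn’t say which bucket a value landing exactly on an edge belongs to, so the edge handling here is an assumption, as are the function names `season_bucket` and `recent_bucket`.

```python
from bisect import bisect_left

RECENT_LABELS = ["Exactly 0", ">0 to 6", "6 to 12", "12 to 18", "18 to 24", ">24"]
RECENT_UPPER_EDGES = [6, 12, 18, 24]


def season_bucket(fppg):
    """Map a season-to-date FPPG to one of the 4 overall-performance buckets.

    Assumed edge handling: 8 falls in '8 to 10', 10 and 12 fall in '10 to 12'.
    """
    if fppg < 8:
        return "<8"
    if fppg < 10:
        return "8 to 10"
    if fppg <= 12:
        return "10 to 12"
    return ">12"


def recent_bucket(avg):
    """Map a last-2-game average to one of the 6 recent-performance buckets.

    Assumes averages are never negative and that each upper edge is
    inclusive (6 falls in '>0 to 6', 24 falls in '18 to 24').
    """
    if avg == 0:
        return "Exactly 0"
    return RECENT_LABELS[1 + bisect_left(RECENT_UPPER_EDGES, avg)]


# Pujols on 5/4 from the example above: 9.0 season-to-date, 14.1 over the last 2 games.
pujols_cell = (season_bucket(9.0), recent_bucket(14.1))
```

Under these assumptions the Pujols example lands in the ‘8 to 10’ season bucket and the ‘12 to 18’ recent bucket.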
Finally, the moment you’ve all been waiting for.
We have 16,751 instances to plot corresponding to 24 situations (4 buckets of STD performance times 6 buckets of recent performance). A grouped boxplot is a good way to accomplish this. The solid part of the box plot shows the range in values from the 25th to the 75th percentile. The thick horizontal line in the center shows the median (the 50th percentile). In my analysis I’m not going to focus on the whiskers or the outlier points. Suffice it to say that if you are good enough to play major league baseball for three days in a row, you are good enough to have a 40+ point day every once in a while no matter what your season or recent performance has been.
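The numbers behind each box (25th percentile, median, 75th percentile) can be computed per situation with a short Python sketch like the one below. The helper name `box_summary` and the row format are hypothetical, the quartile method is one of several common conventions (a plotting library may use a slightly different one), and the sample points are invented rather than taken from the 2016 logs.

```python
from collections import defaultdict
from statistics import quantiles


def box_summary(rows):
    """Summarize 'today' points for each (season bucket, recent bucket) cell.

    rows: iterable of (season_label, recent_label, today_points).
    Returns {(season_label, recent_label): {"n", "q1", "median", "q3"}}.
    """
    groups = defaultdict(list)
    for season_label, recent_label, today_points in rows:
        groups[(season_label, recent_label)].append(today_points)

    summary = {}
    for key, pts in groups.items():
        if len(pts) < 2:
            continue  # quantiles() needs at least two data points
        q1, median, q3 = quantiles(pts, n=4, method="inclusive")
        summary[key] = {"n": len(pts), "q1": q1, "median": median, "q3": q3}
    return summary


# Invented 'today' scores for two cells, for illustration only.
demo_rows = [(">12", "Exactly 0", p) for p in (0, 4, 8, 12, 16)]
demo_rows += [(">12", ">24", p) for p in (3, 9, 15)]
demo_summary = box_summary(demo_rows)
```

Each entry in the result describes one box: `q1` and `q3` are the bottom and top of the solid part, and `median` is the thick line.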
(This chart is interactive)
I find this result fascinating. Here is what it means to me. Let me know in the comments if you agree or disagree with this interpretation.
The Stars (>12 STD Average FPPG)
The rightmost stack of boxes corresponds to hitters averaging over 12 FPPG coming into ‘today’. Notice how the 25th and 75th percentiles are essentially the same across all 6 colors. That means that the performance in the most recent 2 games has very little to do with the numbers that these hitters will post in Game 3. The 75th percentile is between 18 and 19 for all 6 groups, meaning that no matter what the prior 2 days’ result, 3 out of 4 hitters will score below this mark and 1 of 4 will score above it in all the groups.
The one difference in the 6 colored boxes is the location of the median. In the yellow box (which corresponds to 2 prior games of 0 points), the median is 7 for Game 3, while for the other 5 groups the median is 9 or higher. This means that if a ‘Star’ player has put up two zeros in a row, there really might be something wrong. Half of the time they’ll exceed 7 in Game 3, but half of everyone else will exceed 9 in Game 3. You want to stay away from Stars with two zeros in a row. They might have an undisclosed injury or sickness that is taking them off their game.
The Fantasy Core (10-12 STD Average FPPG)
I’m calling the guys averaging 10-12 FPPG the Fantasy Core, because you’ll never have enough Salary Cap to roster only Stars. Your rosters will need to draw heavily from this group that performs above average in the long run but isn’t at the very top.
Notice the relative heights of the 6 boxes. The 75th percentile is highest for the ‘Exactly 0’ and ‘>24’ groups and is essentially the same for the 4 groups in the middle. These 2 groups at the extremes of recent performance also have the highest medians for Game 3 points. That is so cool! This is proof that hot streaks do exist (across 3-game intervals). Despite a season average in the 10-12 range, when a player has 2 days in a row that average above 24, he is likely to score higher on the third day than hitters in the same general group who aren’t on hot streaks.
Another amazing fantasy finding (which I almost didn’t want to share) is that players who are generally above average, but have had two consecutive zeros, are an excellent addition to your roster. Not only will they probably be a bit cheaper than they were 2 days ago, they are expected to produce more in Game 3. They are ‘due’ to have a bigger result which will bring their average back in line.
Below Average (8-10 STD Average FPPG)
The same effect can be seen to a lesser extent in this group. Again the 75th percentile is highest for the ‘Exactly 0’ and ‘>24’ groups, but there is almost no difference in the medians. The hot hand effect is definitely real. A hitter who has been slightly below average season-to-date and who busts out 2 games in a row averaging >24 is hot and will do better in Game 3 than others in his season-long peer group. If the very recent performance streak hasn’t been incorporated into their fantasy Salary, then these hitters are great values.
Let me know what further tests you’d like to see to fully explore the question of hot streaks and cold streaks.
I picked 3 game sequences, but I could have gone with 4 or 5 instead. I also could have allowed one day off within the sequence instead of requiring that they be consecutive. Finally, I could have restricted this to the true ‘everyday’ players by requiring >100 games played. It is also possible to do it by team, by lineup position or by Salary.
Which do you want to see?