Fantasy Football - Footballguys Forums

Analysis of last year's FBG projections

Jilez001

Footballguy
I did a little analysis of FBG's projections and thought I'd share the results with whoever was interested. I've uploaded a PDF to Mediafire that I hope is accessible (never used it before). First, to explain the figure: I took Dodds' season-total projections and divided by his projected number of games. I then plotted that vs. each player's actual points per game (as determined via the Data Dominator). I did this mostly because looking only at season totals would be complicated by injuries and outright replacements. In addition, most fantasy leagues are set up around weekly lineup choices, so I'd expect per-game performance to be more useful to us. The R^2 values listed next to the graphs can be interpreted as the percentage of variation in actual performance that is explained by the projections.

Put more plainly: essentially 60% of the variability in players' actual points per game can be explained by Footballguys' pre-season projections! I have done no comparison against other websites, but that is extremely impressive to me as someone who has examined a great many kinds of data.
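The per-game R^2 calculation described above can be sketched in a few lines of Python. The player numbers below are invented for illustration; the real inputs would be Dodds' season-total projections, his projected games, and actual PPG from the Data Dominator.

```python
# Sketch of the per-game R^2 analysis: projected season total / projected
# games vs. actual points per game. All numbers here are made up.
import numpy as np

# (projected season points, projected games, actual points per game)
players = [
    (320.0, 16, 21.5),
    (250.0, 16, 14.2),
    (180.0, 15, 10.8),
    (90.0, 14, 4.1),
]

proj_ppg = np.array([pts / games for pts, games, _ in players])
actual_ppg = np.array([ppg for _, _, ppg in players])

# R^2 is the squared correlation between projected and actual PPG,
# i.e. the share of variance in actuals "explained" by the projections
r = np.corrcoef(proj_ppg, actual_ppg)[0, 1]
r_squared = r ** 2
print(f"R^2 = {r_squared:.3f}")
```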

http://www.mediafire.com/?c9tgjnplgmklxj1 - All player projections vs. actual results available via the FBG Data Dominator (QB = 32 players, RB = 123, WR = 136, TE = 81).

http://www.mediafire.com/?142p1z4lwqnqc2x - Same format of graph, looking only at top starters (QB = 12, RB = 24, WR = 36, TE = 12).

However, the in-season projections are perhaps less promising. The following is a pooled data set of all projection-actual pairs for WR/RB/TE (note: no QBs) using Dodds' weekly projections. In the graph, a projection value of 5 means all projections between 5.00 and 5.99, and so on. The number of projections is the number of times Dodds projected a player to score in that range over the season. It's a graph of median values (50% of actual scores below and 50% above), and the bars around each point are the interquartile range (25th to 75th percentile). In essence, 50% of all actual scores fell within those ranges (yes, the ranges are that big!).

http://www.mediafire.com/?dwptr8t0o8ldnas
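The binning described above can be sketched as follows. The projection/actual pairs are invented; the real data would be every weekly Dodds projection paired with the player's actual score that week.

```python
# Group projection/actual pairs by the integer part of the projection
# (so bin 5 holds projections 5.00-5.99), then summarize each bin with
# its median and interquartile range. Data is invented for illustration.
import numpy as np
from collections import defaultdict

# (weekly projection, actual weekly score) pairs for one position group
pairs = [(5.4, 3.1), (5.9, 8.0), (5.1, 5.5), (6.2, 6.4), (6.8, 2.2), (6.5, 9.9)]

bins = defaultdict(list)
for proj, actual in pairs:
    bins[int(proj)].append(actual)

for b in sorted(bins):
    actuals = np.array(bins[b])
    q25, med, q75 = np.percentile(actuals, [25, 50, 75])
    print(f"bin {b}: n={len(actuals)}, median={med:.2f}, IQR=({q25:.2f}, {q75:.2f})")
```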

The means (not displayed on the graph) of the lower projection values (roughly 1 through 10) are quite accurate, as you can see above. Which is to say: if Dodds predicts a player will score in the 6.00-6.99 range, the mean of the actual scores was 6.37, for example. In that projection range the actual values not only keep increasing, they also track the projected values themselves. Once you reach projections higher than 10, however, the data becomes more erratic. This is possibly due to the much smaller sample sizes involved, but it nonetheless limits the usefulness of discerning between players projected at more than 10 points in a given week.

Another thing to consider is that this data pools three different positions (WR, TE, and RB); the relationships above may be more or less accurate depending on position.

I think the following graph does a good job of showing the value of the projections (for RBs at least; more to follow). The legend might be a little confusing. Basically, it's three lines showing the following: The green line is the season average of PPG at a given rank, so the 1st rank is the average of the highest point totals scored at running back each week. This line is what you could have achieved if you were PERFECT at predicting the ranks of RBs each week. The black line is the PPG of a specific player, ranked accordingly: the 1st black dot is the Arian Foster dot, the 2nd is Darren McFadden, etc. Lastly, the red line is the average actual PPG of FBG's projections, ranked accordingly. So the 1st dot on the red line corresponds to the average of FBG's highest-rated RB for each week (this ended up being a Frankenstein of lots of Foster and Frank Gore, as I recall).

http://www.mediafire.com/?1y4vy01949bq9h7
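The three lines described above can be sketched from a week-by-player score table. The tables below are invented; the real inputs would be actual weekly scores and FBG's weekly projections for every RB.

```python
# Sketch of the three-line comparison for rank 1: the hindsight-perfect
# weekly best (green line), the actual score of the highest-projected
# player each week (red line), and one player's season PPG (black line).
# Score tables are invented for illustration.
import numpy as np

# rows = weeks, columns = players; actual points scored
actual = np.array([
    [25.0, 12.0, 18.0],
    [10.0, 22.0, 15.0],
    [30.0,  8.0, 14.0],
])
# projected points for the same weeks/players
projected = np.array([
    [20.0, 10.0, 15.0],
    [18.0, 12.0, 14.0],
    [22.0,  9.0, 16.0],
])

# Green line, rank 1: average of each week's top actual score
perfect_ppg = actual.max(axis=1).mean()

# Red line, rank 1: average actual score of the highest-projected player each week
top_proj_idx = projected.argmax(axis=1)
fbg_ppg = actual[np.arange(len(actual)), top_proj_idx].mean()

# Black line: season PPG of a single player (column 0 here)
player_ppg = actual[:, 0].mean()

print(perfect_ppg, fbg_ppg, player_ppg)
```

The gap between `perfect_ppg` and `fbg_ppg` is the room for improvement the post talks about; when `fbg_ppg` lands near a single star player's `player_ppg`, that's the "almost identical to Arian Foster" result.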

What's neat about that is the remarkable similarity between the season PPG of the top RB and the average of FBG's weekly projections. Basically, if you had listened to FBG each week and played their highest-projected RB, you would have gotten something almost identical to Arian Foster! This may not seem like a big deal, but don't ignore that one is a projection and one is hindsight. Clearly there is still room for improvement in projections, as I want to get to that elusive green line, and make "kill people and not go to jail" money.

I'll update further if I encounter anything particularly interesting as I look into the data further.

 
I remember a few years back, someone compared the Top 200 Forward rankings to the weekly rankings and found that for RBs, you were better off using the Top 200 Forward rankings to set your weekly lineups. WRs were similar, I think, but QBs were more susceptible to matchups and were better predicted by the weekly rankings.

I've wondered if that holds up over other seasons or if it was a fluke...

 
Can I ask how many data points went into each graph?

Those are some very large R^2 values.

It looks like they are largely driven by players who are not slated to get much playing time and in fact do not.

In other words, the ordinary least squares analyses you did assume bivariate normal data, and both variables appear positively skewed, which can have a big impact.

I would be more curious to see the same plots and effect sizes computed from only the top 12 projected QBs, top 24 projected RBs, top 36 projected WRs, and top 12 TEs.
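The restriction suggested here (recomputing R^2 on only the top-N projected players at a position) can be sketched as follows. The data is randomly generated for illustration; the real inputs would be the projected and actual PPG arrays from the original analysis.

```python
# Sketch of restricting the R^2 calculation to the top-N projected players
# at a position (e.g. top 24 of the 123 projected RBs), which removes the
# clump of near-zero projections that can inflate the fit.
import numpy as np

def top_n_r_squared(proj_ppg, actual_ppg, n):
    """Keep only the n highest-projected players, then compute R^2."""
    proj = np.asarray(proj_ppg)
    actual = np.asarray(actual_ppg)
    keep = np.argsort(proj)[::-1][:n]  # indices of the top-n projections
    r = np.corrcoef(proj[keep], actual[keep])[0, 1]
    return r ** 2

# Illustrative synthetic data: 123 "RBs" with noisy actuals
rng = np.random.default_rng(0)
proj = rng.uniform(0, 20, size=123)
actual = proj + rng.normal(0, 4, size=123)

print(top_n_r_squared(proj, actual, 24))   # starters only
print(top_n_r_squared(proj, actual, 123))  # full pool
```

With real data, the starters-only value is typically lower, since the wide projected-PPG range of the full pool (stars down to deep bench) no longer props up the correlation.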

 
Sure, here goes. QB = 32 cases (presumably the projected QB for most teams); this position has notable omissions on my part. I chose not to include projected backups, as my calculation of PPG for projected backups was all over the place. I assumed this was mostly because FBG doesn't project full starts for backup QBs and projects erratic game totals for these players. The most notable omission was Mike Vick, who they had projected to play 4(?) starts and score around 50(?) points on the season. I also didn't include Dennis Dixon, who they had projected to play 4 starts (these were known to be full starts due to Roethlisberger's suspension); however, my fellow Oregon alumnus only played in 2 games. I can't remember if he got hurt, but I chose to eliminate him anyway, as a 2-game sample was quite small relative to the other QBs and I didn't feel it was an accurate measurement of his actual PPG.

RB = 123, WR = 136, TE = 81.

I agree with you that there is a strong possibility the R^2 values are assisted by a clump of dots right next to 0. However, I think the FBG staff's ability to 'correctly' identify a player who is going to average low PPG is useful to us, as someone to ignore, and in contrast to get a sense of how many Brandon Lloyds we are likely to miss out on from the clump of rejects at the bottom of their projections.

Anyway, I did as you requested and sifted those numbers down to the top projected starters at their various positions. I included the Mediafire link in my initial post. The R^2 values are quite a bit lower for both WRs and RBs; however, I'd take these with a large grain of salt, as the number of samples is obviously quite a bit smaller. Going forward, I'll try to include past years to get a better idea of the association between being a projected 'top starter' and actual points scored.

I don't believe that for the analysis I've done (no hypothesis testing or interval estimates) I need to assume normality.
Though I agree the data is clearly skewed, not normal, and also heteroscedastic (unequal variance across x values). This being a football forum, I don't want to get too bogged down in statistics talk unless I'm misinforming people (definitely open to being convinced of that, via PMs maybe? :nerd: ).
 
I think I could check this without too much hassle. Would be an interesting piece of information if true.
 
Thanks Jilez! I think the numbers on just the projected starters are still pretty good; where I come from, an r of .40 (the R^2 of .16) is still damn useful. And the "truth" probably lies somewhere between the total sample and the starters, as we often have to choose between someone just inside the total projected starters and someone a bit outside (for example, the 22nd-best projected RB and the 29th). So something like 1.5 times the projected starters might be relevant.

BTW, I didn't mean to hit hard on the assumptions thing (such as when the data "are" skewed); it's just that a lot of the players near the bottom are never going to be considered as starters (and might not even be rostered), and they appeared to drive the large values.

In the end, I suspect that an index that penalizes for both lack of convergence and lack of agreement, such as some forms of the intra-class correlation, might be most appropriate, but my schema for those is poor and I have a hard time interpreting what good and bad values are. Anyway, thanks again.
 
Did you?
 
