TheMathNinja
Footballguy
For those of you who use TFL (tackles for loss) as a scoring category in your leagues (like I do) and find that TFL tends to be left out of most preseason IDP projections, I did some quick math to come up with formulas that can help as you make IDP projections for your upcoming drafts:
The following models/formulas are based on the last 2 years of data for the top-scoring IDP's (60 DT's, 100 DE's, and 150 LB's per year), as recorded by MFL. In this counting scheme, TFL's include sacks and are themselves included in the overall tackle totals. The "AdjT" stat is "Adjusted Tackles", where
Adjusted Tackles = Solo Tackles + 0.5*Assisted Tackles [since this is how TFL's and sacks are counted]
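If you'd rather compute AdjT in a script than by hand, here is a minimal Python sketch of the definition above. The function name and the sample stat line are just illustrative assumptions, not anything from MFL.

```python
def adjusted_tackles(solo, assisted):
    """Adjusted Tackles = Solo Tackles + 0.5 * Assisted Tackles."""
    return solo + 0.5 * assisted


# Hypothetical projection: 40 solo tackles and 20 assists.
print(adjusted_tackles(40, 20))  # 50.0
```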
In my original models, I used good statistical sense (i.e. independent variables, so as to eliminate collinearity), but I'm providing simpler, collinearity-ridden formulas here because they're much easier to punch into a calculator and they're basically identical in accuracy:
DT-specific: TFL = 0.05*AdjT + 1.25*Sacks + 1 (R^2 of 0.70)
DE-specific: TFL = 0.1*AdjT + 1*Sacks (R^2 of 0.87)
Combined DL: TFL = 0.1*AdjT + 1.05*Sacks (R^2 of 0.82)
LB: TFL = 0.05*AdjT + 1.1*Sacks + 0.2*ForcedFumbles (R^2 of 0.76)
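For anyone who wants to run these over a whole projection sheet, the four formulas can be sketched in Python roughly like this. The coefficients are copied straight from the formulas above; the function name, the position keys, and the sample stat lines are my own assumptions for illustration.

```python
# (AdjT coefficient, Sacks coefficient, ForcedFumbles coefficient, constant)
TFL_COEFFICIENTS = {
    "DT": (0.05, 1.25, 0.0, 1.0),
    "DE": (0.10, 1.00, 0.0, 0.0),
    "DL": (0.10, 1.05, 0.0, 0.0),  # combined DL model
    "LB": (0.05, 1.10, 0.2, 0.0),
}


def project_tfl(position, adj_t, sacks, forced_fumbles=0.0):
    """Projected TFL from projected AdjT, sacks, and (for LB's) forced fumbles."""
    c_adj, c_sack, c_ff, constant = TFL_COEFFICIENTS[position]
    return c_adj * adj_t + c_sack * sacks + c_ff * forced_fumbles + constant


# Hypothetical stat lines, rounded for display.
print(round(project_tfl("DE", adj_t=40, sacks=10), 2))                     # 14.0
print(round(project_tfl("LB", adj_t=100, sacks=5, forced_fumbles=2), 2))  # 10.9
```

Forced fumbles default to 0 and only carry a nonzero coefficient for LB's, so you can pass the same columns for every position without special-casing.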
There is some intuition to these results. If you are predicting how many TFL's someone will record, you know that a sack is automatically a TFL (so every sack gets counted, and you'd expect the coefficient on the sack variable to be near 1), and the other TFL's will come as some proportion of his tackles in general. For DE's, once you count each sack as 1 TFL, all you need to do is look to the rest of his tackles, divide by 10, and you're there. For DT's, it's not so simple. Beyond the 1 TFL that each sack automatically counts for, sacks also appear to carry additional information about a DT's "backfield presence": his sack count still tells you something about how many tackles of a RB behind the line he will have relative to another DT. For every 1 sack one DT has above another, he will also have, on average, an additional 0.25 non-sack tackles for loss, hence the 1.25 coefficient for DT's.
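To see the 1.25 coefficient at work, here is a quick Python check comparing two hypothetical DT's who differ by exactly one sack. The 30 AdjT and the 4 vs. 5 sacks are made-up numbers, and I'm holding AdjT fixed between the two players purely to isolate the sack coefficient.

```python
def dt_tfl(adj_t, sacks):
    """DT-specific formula: TFL = 0.05*AdjT + 1.25*Sacks + 1."""
    return 0.05 * adj_t + 1.25 * sacks + 1


# Two hypothetical DT's with the same AdjT, one sack apart.
tfl_a = dt_tfl(adj_t=30, sacks=4)
tfl_b = dt_tfl(adj_t=30, sacks=5)

# Total projected TFL gap between the two players.
extra_tfl = tfl_b - tfl_a

# Strip out the sacks themselves to get the gap in non-sack TFL's.
extra_non_sack_tfl = (tfl_b - 5) - (tfl_a - 4)

print(round(extra_tfl, 2))           # 1.25
print(round(extra_non_sack_tfl, 2))  # 0.25
```

Run the same comparison with the DE formula and the non-sack gap comes out to 0, which is the difference between the two positions described above.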
For LB's, Forced Fumbles was significant at the 9% level, so I decided to include it, though I won't die on that hill. It makes some intuitive sense to me that forced fumbles may be a proxy for some kind of "aggressiveness" variable that aids in TFL. As with DT's, sacks still helped to describe how many non-sack TFL's a given LB had: with the 1.1 coefficient, every 10 sacks explain roughly 1 additional non-sack TFL. Once again, I think calling this "backfield presence" or "blitzing propensity" is somewhat useful.
Other stats that could make for even better models, but that I didn't have the data or the time for:
- Added specificity to position (e.g. left DE, right DE, DT in 3-4, DT in 4-3, ROLB, LOLB, ILB, etc.)
- Defensive scheme (making variables or even separate models for whether the player works in a 3-4 or 4-3 scheme, especially for LB's)
- QB hits
- Last year's TFL-to-Tackle ratio for each player
Happy projecting!