Question about the "if you take out that big play" argument

redman · Nov 2, 2006

I'm a pretty poor number cruncher, so I pose this question to the forum.

Most people (me included) jump all over people who point out that "such-and-such-player would only have this yards per carry/reception/attempt average if you removed this big play." The logic in the criticism is that you can't simply ignore big plays, because they count towards the player's statistics and they indicate the ability of the player to break a big play every once in a while, i.e. "past performance may be indicative of future performance".

In statistics, however, there are such things as statistical outliers, meaning unusually large or small numbers that form the exception to the data being studied. In addition, it can be useful to try to figure out what amount of yards per carry, for example, are the most indicative of the "average run" that a RB has. For example, Terrell Davis averaged 4.7 yards per carry in 1995 while Barry Sanders averaged 4.8, however I'm sure that a close examination of their carries would reveal that Barry had a greater proportion of runs for lost yards and also of runs over 20 yards than Davis did. They were very different RB's.

Certain methodologies allow you when analyzing stats to remove both the largest and the smallest numbers from your data set before looking for the mean. However, taking a RB's longest and shortest runs out of the equation is problematic because a RB can gain far more yards past the line of scrimmage than he can lose behind the line of scrimmage on the average carry so, while the impact of a long run on his average as a statistical outlier would be diminished, it would still be there.
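A minimal sketch of that "drop the largest and smallest" idea (a trimmed mean), assuming a made-up list of per-carry yardages; the function name `trimmed_mean` and the numbers are hypothetical, not real play-by-play data:

```python
from statistics import mean

def trimmed_mean(carries, k=1):
    """Mean after dropping the k longest and k shortest carries."""
    if len(carries) <= 2 * k:
        raise ValueError("not enough carries to trim")
    ordered = sorted(carries)
    return mean(ordered[k:len(ordered) - k])

# Hypothetical carries in yards (illustration only).
carries = [-3, 1, 2, 3, 3, 4, 4, 5, 6, 45]

plain = mean(carries)            # every carry counts
trimmed = trimmed_mean(carries)  # drops the -3 and the 45

print(f"plain YPC: {plain:.1f}, trimmed YPC: {trimmed:.1f}")
```

In this toy set the long run lifts the plain average far more than the loss drags it down, which is the asymmetry described above: trimming one carry from each end removes much more from the top than from the bottom.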

The question, then, is how do you go about figuring out what length of run is the most indicative of a RB's carries? Taking the median would seem like a logical way, except that rushing yards are measured in whole numbers on each carry by the NFL leading to pretty homogenous results when comparing RB's, and anyway I don't know of a source that compiles, orders and lists all of a RB's runs each game by distance. That seems like a lot of work. Any ideas here?

Bri · Nov 2, 2006

just go with average and call it a day

Marc Levin · Nov 2, 2006

Bri said:
just go with average and call it a day

flippant answer - two backs may average 4.0 per carry, but I obviously want the one that gets more 4 yard carries - the reason being that a back who averages 4.0, but gets them on one or two big plays in a game, is more likely to be replaced if he stops running the big plays (for whatever reason) - plus, his YPC plummets faster after a couple bad games. Finally, knowing the answer helps determine whether a team's running game is more or less likely to succeed against a tough run D.

The back that gets 4 more often and averages 4 per carry puts his team in a great position much more often than the big-play back - that guy is unlikely to lose his job.

Moreover, redman's question is extremely useful in determining whether an injured starter's immediate backup could be as, nearly as, or more successful. A back that tends to get a lot of yardage on big plays and otherwise gets stopped for less than 3 is much more likely making his numbers on individual talent rather than OL play.

how do you go about figuring out what length of run is the most indicative of a RB's carries? Taking the median would seem like a logical way, except that rushing yards are measured in whole numbers on each carry by the NFL leading to pretty homogenous results when comparing RB's, and anyway I don't know of a source that compiles, orders and lists all of a RB's runs each game by distance. That seems like a lot of work. Any ideas here?

redman, I can give you an uneducated answer, but it'll take me too long to explain it. The short version is that you take the player's YPC and pick a number (with YPC, I would go with about a yard) that is an "acceptable" deviation. You then count how many runs were outside that range - a higher number outside the range = more of a big play back. More numbers inside the range = more consistent an RB (and more likely to be running behind a good offensive line).

I would say that every time a RB who is averaging 5.5 YPC runs anywhere between 4 and 7 yards, that is an "acceptable" deviation - 3 yards in a carry is pretty far off a 5.5 average, and anything at 8 yards or more is pretty far off a 5.5 average (and if a 5.5 back gets to 8 yards, he's likely going a lot further than that).

I would also say that every time a RB who is averaging in the 4.0 range (I'd call 3.9 to 4.2 in that range) runs anywhere between 3 and 5 yards, that is an "acceptable" deviation - 2 yards or less is pretty far off a 4.0-ish average, and anything at 6 or more yards is pretty far off a 4.0 average.
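The counting approach above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `share_near_average`, the default band of one yard, and both lists of carries are hypothetical.

```python
def share_near_average(carries, band=1.0):
    """Fraction of carries that land within `band` yards of the back's own YPC."""
    ypc = sum(carries) / len(carries)
    inside = sum(1 for yards in carries if abs(yards - ypc) <= band)
    return inside / len(carries)

# Two hypothetical backs, each averaging exactly 4.0 yards per carry.
steady = [3, 4, 4, 5, 4, 3, 5, 4, 4, 4]
boom_or_bust = [0, 1, -2, 4, 2, 0, 30, 1, 3, 1]

print(share_near_average(steady))        # every carry is between 3 and 5
print(share_near_average(boom_or_bust))  # only the 4 and the 3 qualify
```

A higher share means a more consistent back by this measure; a lower share points to a back whose average leans on a few big plays. For a 5.5 YPC back the band would be widened (for example `band=1.5` for the 4 to 7 range mentioned above).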

You are correct - to compare any two backs is labor intensive - to compare ALL RBs in the league would require a script (or a heckuva lot of hours to devote to the project). But, what you can do is pull the play by play from the player page on our site and you can see the length of each run without having to sift through the NFL.com play by play. Or you might be able to record answers by using the data dominator.

redman · Nov 2, 2006

Thanks Mark, great post.

coo2ie · Nov 2, 2006

Wow... I think you are overthinking it. If you know a back's average yards per carry and have a good estimate of how many times he will get the ball, then it would seem that this is the best projection.

However, if you really want to know the standard deviation... which is what it sounds like you are after in your question... it would seem the player with the highest std dev would be the player most likely to break the big play (you would need a reasonable # of data points of course). Copy the data into Excel and run a statistical analysis to determine the std dev.
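The same calculation can be done outside a spreadsheet. A rough sketch using Python's standard library, assuming hypothetical per-carry data (the helper name `ypc_and_spread` and both lists are made up for illustration):

```python
from statistics import mean, stdev

def ypc_and_spread(carries):
    """Return (yards per carry, sample standard deviation) for a list of carries."""
    return mean(carries), stdev(carries)

# Hypothetical per-carry yardages; both backs average 4.0 yards per carry.
steady = [3, 4, 4, 5, 4, 3, 5, 4, 4, 4]
boom_or_bust = [0, 1, -2, 4, 2, 0, 30, 1, 3, 1]

for name, carries in (("steady", steady), ("boom_or_bust", boom_or_bust)):
    ypc, sd = ypc_and_spread(carries)
    print(f"{name}: YPC={ypc:.1f}, std dev={sd:.1f}")
```

Both backs show the same average, but the second has a much larger standard deviation. With only a handful of carries the figure is noisy, so the caveat about needing a reasonable number of data points still applies.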

Marc Levin · Nov 2, 2006

redman said:
Thanks Mark, great post.

Wimer didn't post here, I did.

CalBear · Nov 2, 2006

You'll never be able to come up with a statistical model that does a good job of predicting fantasy football production, because there are too many variables and too few data points. The best you can hope for is to use analysis to improve what you're looking at as raw numbers.

A scientist will not discard an outlier just because it is outlying; discarding lowest and highest is a good way to hide important information. In scientific papers there is a specific rationale for any instance where a data point is discarded. So, you have to look at the situations individually. Is the big play typical of the player's performance? Was the game situation one he's likely to be in again? Will there be other opportunities for similar plays, and will the player be able to execute?

An example of an outlier I would have discarded is Marc Boerigter's 8 TDs on 20 receptions in 2002. The team had shown no indication that they were going to move him up the WR depth chart, there were plenty of other TD options around, and he didn't seem that physically talented. Since 2002, he has 19 receptions and 0 TDs. That year was an outlier.

Tatum Bell in 2005 had a lot of long runs. He was coming back into the same system, had no serious competition for touches, and was clearly a very talented back. I would not discard his 2005 season or any of the long runs in it--that was his game.

Ron Dayne in 2005 had one long run. Before that, he had four years of slow, plodding play, and on the rest of his plays in 2005, he still looked slow and plodding. Discarding the one long run probably made sense in evaluating his prospects for 2006.

You have to use judgement.

redman · Nov 2, 2006

Marc Levin said:
redman said:

Thanks Markc, great post.

Wimer didn't post here, I did.

Ok, fixed.

MLBrandow · Nov 2, 2006

If it's one run in a larger block of runs, I think it's acceptable to take the outlier out.

If you're talking multiple runs in a large block of runs, say, "Well, take out the longest run of each game and you have these pedestrian numbers", I think that's different.

If a guy is busting off a long run each game, that's a trend, not a fluke.

If you want an accurate measurement of what a guy can do, or at least a more reflective stat on how well he's been performing, I think it's plenty acceptable to take out one play out of 100 that is an obvious outlier.

If you're talking about taking out 5 or 6 plays out of a few hundred though, I don't think I'd agree as much, if that makes sense.

Bri · Nov 2, 2006

Marc Levin said:
I can give you an uneducated answer, but it'll take me too long to explain it.

Bri · Nov 2, 2006

redman,

do you play in a CDM league? Does your league score YPA? curious why so interested in average

Bri · Nov 2, 2006

Marc Levin said:
Moreover, redman's question is extremely useful in determining whether an injured starter's immediate backup could be as, nearly as, or more successful.

IMO no it's not. Backup RBs often play a 3rd down role. 3rd down backs generally average more

Angrymutt · Nov 2, 2006

You can't look at just YPC. It's simple, you have to look at touches, catches (if ppr league), red zone/goal line, when they get the ball during the game, can they step in if needed, does the team feed off the running game or the passing game, does the D get the offense on the field. IMO if you look at just ypc, then you are not doing your job as a fantasy manager. (it's not really that simple i guess, that is what makes it fun, edited for clarification)

Marc Levin · Nov 2, 2006

Bri said:
Marc Levin said:

Moreover, redman's question is extremely useful in determining whether an injured starter's immediate backup could be as, nearly as, or more successful.

IMO no it's not. Backup RBs often play a 3rd down role. 3rd down backs generally average more

:whoosh: I think you missed why I made the statement - you need to re-read it in context.

BTW, the bolded part is not true - more examples of b/u who are not 3rd d. backs come to mind than the other way around - there is a certain skill set you want out of your third down back that is not necessarily the same skill set you want from your starter (or else your starter would ALSO be your 3rd down back - ala LT, who simply can NOT be taken out of the game in any 3rd down situations b/c of his pass receiving, blocking, and RB speed)

Marc Levin · Nov 2, 2006

Angrymutt said:
You can't look at just YPC. It's simple, you have to look at touches, catches (if ppr league), red zone/goal line, when they get the ball during the game, can they step in if needed, does the team feed off the running game or the passing game, does the D get the offense on the field. IMO if you look at just ypc, then you are not doing your job as a fantasy manager. (it's not really that simple i guess, that is what makes it fun, edited for clarification)

you also get a :whoosh:

You have no idea if redman wants this info for the fantasy purposes you described above - he may be looking for some specific information related exclusively to what is really reflected in a back's YPC.

Marc Levin · Nov 2, 2006

CalBear said:
You'll never be able to come up with a statistical model that does a good job of predicting fantasy football production, because there are too many variables and too few data points. The best you can hope for is to use analysis to improve what you're looking at as raw numbers.

A scientist will not discard an outlier just because it is outlying; discarding lowest and highest is a good way to hide important information. In scientific papers there is a specific rationale for any instance where a data point is discarded. So, you have to look at the situations individually. Is the big play typical of the player's performance? Was the game situation one he's likely to be in again? Will there be other opportunities for similar plays, and will the player be able to execute?

An example of an outlier I would have discarded is Marc Boerigter's 8 TDs on 20 receptions in 2002. The team had shown no indication that they were going to move him up the WR depth chart, there were plenty of other TD options around, and he didn't seem that physically talented. Since 2002, he has 19 receptions and 0 TDs. That year was an outlier.

Tatum Bell in 2005 had a lot of long runs. He was coming back into the same system, had no serious competition for touches, and was clearly a very talented back. I would not discard his 2005 season or any of the long runs in it--that was his game.

Ron Dayne in 2005 had one long run. Before that, he had four years of slow, plodding play, and on the rest of his plays in 2005, he still looked slow and plodding. Discarding the one long run probably made sense in evaluating his prospects for 2006.

You have to use judgement.

very well said - I don't think redman wants to discard outliers - I think he wants to do what I described above.

He wants to find out which RBs run close to their YPC most often.

guderian · Nov 2, 2006

You have to understand that if a player has a big play and you leave it in their stats and project that forward through the rest of the season, then you are implicitly assuming that the player will have another, similar big play. Therefore, don't look backward but look forward as to what is reasonable to assume going forward.

For instance, Chester Taylor had a 95 yard run in the first half of the season. If you double all of his stats as a projection for the rest of the season, then you are implicitly assuming that he will have another 95 yard run. One could argue that he's likely to have another big play, but what if the team is only on the 50 yard line instead of their 5 yard line? Then it's only a 50 yard run.

Bottom line is that you should look at the player and leave in the plays that you think they can replicate and take out the ones that you think are unlikely to be duplicated--there is no easy answer.

Bri · Nov 2, 2006

Marc Levin said:
Bri said:

Marc Levin said:

Moreover, redman's question is extremely useful in determining whether an injured starter's immediate backup could be as, nearly as, or more successful.

IMO no it's not. Backup RBs often play a 3rd down role. 3rd down backs generally average more

:whoosh: I think you missed why I made the statement - you need to re-read it in context.

BTW, the bolded part is not true - more examples of b/u who are not 3rd d. backs come to mind than the other way around - there is a certain skill set you want out of your third down back that is not necessarily the same skill set you want from your starter (or else your starter would ALSO be your 3rd down back - ala LT, who simply can NOT be taken out of the game in any 3rd down situations b/c of his pass receiving, blocking, and RB speed)

well woosh aside, a RB will average more on 3rd down. Googling I can't find it but...

Also, I still think a backup RB will average more.

I used to play in CDMs leagues and (IIRC) they score RBs ranking them 1 thru 30 (or 50?) in total yards, TDs, maybe catches, and YPA. Forgive me, it's been oh 4-5 years. They score the best RB with a 30 and the lowest with a 1 (reverse actual) and those are their points. Anyhow, Yards Per Attempt is just as valuable as yards and TDs etc.

I've been looking on my computer's hard drive. I've got a lot of this info somewhere, maybe I deleted it. I'll tell ya though there are some folks here that play CDM leagues and they'd know. It's part of the game and its scoring, ya study averages.

I found the stats in a DB I'm working on. Holler a team, I'll give it to ya.

Average per down:

Down   Cards   Pats
1st    2.36    4.22
2nd    2.19    4.08
3rd    3.5     4.27
4th    3.0     1.33

thesurfshop19 · Nov 2, 2006

I kind of like the "success rate" stat that SSOG brings up a lot (success = 40% of yards needed on 1st down, 60% on 2nd down, and 100% on 3rd and 4th down).

The way I see it, success rate is to yards per carry as on base percentage is to slugging percentage. I could be way off though -- both numbers seem to be pretty useful in evaluating RBs though. :shrug:
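For anyone who wants to try it, here is a rough sketch of the success-rate definition quoted above (40% of the yards needed on 1st down, 60% on 2nd, 100% on 3rd and 4th). The function names and the sample plays are hypothetical; each play is assumed to be a `(down, yards_to_go, yards_gained)` tuple.

```python
# Percent of the needed yardage a carry must gain to count as a success.
REQUIRED_PCT = {1: 40, 2: 60, 3: 100, 4: 100}

def is_success(down, to_go, gained):
    """True if the carry gained the required share of the yards to go."""
    # Integer arithmetic avoids floating-point trouble at the exact threshold.
    return gained * 100 >= REQUIRED_PCT[down] * to_go

def success_rate(plays):
    """Fraction of (down, to_go, gained) plays that count as successes."""
    return sum(is_success(*play) for play in plays) / len(plays)

# Hypothetical carries: (down, yards to go, yards gained).
plays = [
    (1, 10, 4),  # success: 4 of 10 is exactly 40%
    (1, 10, 3),  # failure
    (2, 6, 4),   # success: needs 3.6
    (2, 7, 3),   # failure: needs 4.2
    (3, 2, 2),   # success: converts
    (3, 1, 0),   # failure
]

print(success_rate(plays))
```

Unlike YPC, a 40-yard run and a 4-yard run on 1st-and-10 count the same here, which is why the two numbers can disagree about the same back.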

Bri · Nov 2, 2006

thesurfshop19 said:
I kind of like the "success rate" stat that SSOG brings up a lot (success = 40% of yards needed on 1st down, 60% on 2nd down, and 100% on 3rd and 4th down).

The way I see it, success rate is to yards per carry as on base percentage is to slugging percentage. I could be way off though -- both numbers seem to be pretty useful in evaluating RBs though.

been there done that discussion

waste of time IMO, but a fun debate

For me, someone with too much time and little useful gleaned from it.

Marc Levin · Nov 2, 2006

Bri said:
Marc Levin said:

Bri said:

Marc Levin said:

Moreover, redman's question is extremely useful in determining whether an injured starter's immediate backup could be as, nearly as, or more successful.

IMO no it's not. Backup RBs often play a 3rd down role. 3rd down backs generally average more

:whoosh: I think you missed why I made the statement - you need to re-read it in context.

BTW, the bolded part is not true - more examples of b/u who are not 3rd d. backs come to mind than the other way around - there is a certain skill set you want out of your third down back that is not necessarily the same skill set you want from your starter (or else your starter would ALSO be your 3rd down back - ala LT, who simply can NOT be taken out of the game in any 3rd down situations b/c of his pass receiving, blocking, and RB speed)

well woosh aside, a RB will average more on 3rd down. Googling I can't find it but...

Also, I still think a backup RB will average more.

:whoosh: in effect - you are still not getting why the stat requested is an important factor in determining whether the b/u RB will play well subbing for the starter if the starter is injured.

Bolded part = :confused: for me. What the heck are you talking about there?

Bri · Nov 2, 2006

Marc (or redman), I can't find the CDM thread since the search term is only 3 letters. Trying to email/PM one of those guys to chime in here. If you have any luck searching, please do

Bri · Nov 2, 2006

Marc Levin said:
Bri said:

well woosh aside, a RB will average more on 3rd down. Googling I can't find it but...

Also, I still think a backup RB will average more.

:whoosh: in effect - you are still not getting why the stat requested is an important factor in determining whether the b/u RB will play well subbing for the starter if the starter is injured.

Bolded part = :confused: for me. What the heck are you talking about there?

Hmmm

let's suppose:

a D plans for the starting RB so his average suffers a little bit

the backup's average is higher

When the backup becomes the starter, his average will then suffer.

Using his average as a backup as an indicator won't show you anything all too useful but will instead frustrate you.

ETA take a look at Fargas or Morris

http://www.profootballreference.com/games/FargJu00.htm#2006

http://www.profootballreference.com/games/MorrMa00.htm#2006

Marc Levin · Nov 2, 2006

guderian said:
For instance, Chester Taylor had a 95 yard run in the first half of the season. If you double all of his stats as a projection for the rest of the season, then you are implicitly assuming that he will have another 95 yard run. One could argue that he's likely to have another big play, but what if the team is only on the 50 yard line instead of their 5 yard line? Then it's only a 50 yard run.

Bottom line is that you should look at the player and leave in the plays that you think they can replicate and take out the ones that you think are unlikely to be duplicated--there is no easy answer.

You are not implicitly assuming that he will have another 95 yard run. You are assuming that he will have 95 more yards than his average of the first 8 games without that run - not that he will get a 95 yard big play run.

What you are saying is that by doubling you not only assume he will run a 95 yard play, but that he will run exactly the same as he did in the first half regardless of any other factors - including the exact same number of bad games. You are basically saying that averages mean nothing. An RB's averages from the first half of the season are generally, but certainly not completely, a decent reflection of his second half.

TinHat · Nov 2, 2006

while Barry Sanders averaged 4.8, however I'm sure that a close examination of their carries would reveal that Barry had a greater proportion of runs for lost yards and also of runs over 20 yards than Davis did. They were very different RB's.

I would just leave it right there. If you try to break it down any more, you will lose valuable insight.

CalBear · Nov 2, 2006

thesurfshop19 said:
I kind of like the "success rate" stat that SSOG brings up a lot (success = 40% of yards needed on 1st down, 60% on 2nd down, and 100% on 3rd and 4th down).

The way I see it, success rate is to yards per carry as on base percentage is to slugging percentage. I could be way off though -- both numbers seem to be pretty useful in evaluating RBs though.

The question with success rate, that I have not seen any data about, is how predictive it is. Does a player's success rate in year N indicate that he will likely have a similar success rate in year N+1? And more to the point, is it a better predictor than the other things we use to predict fantasy scoring? Success rate had Ron Dayne as a top RB last year, which immediately throws it into question.

Marc Levin · Nov 2, 2006

Bri said:
Marc Levin said:

Bri said:

well woosh aside, a RB will average more on 3rd down. Googling I can't find it but...

Also, I still think a backup RB will average more.

:whoosh: in effect - you are still not getting why the stat requested is an important factor in determining whether the b/u RB will play well subbing for the starter if the starter is injured.

Bolded part = :confused: for me. What the heck are you talking about there?

Hmmm

let's suppose:

a D plans for the starting RB so his average suffers a little bit

the backup's average is higher

When the backup becomes the starter, his average will then suffer.

Using his average as a backup as an indicator won't show you anything all too useful but will instead frustrate you.

ETA take a look at Fargas or Morris

http://www.profootballreference.com/games/FargJu00.htm#2006

http://www.profootballreference.com/games/MorrMa00.htm#2006

Yup - as I thought - you are STILL not getting it so I will have to use a hypothetical.

Starting RB #1 averages 4.0 YPC, and runs at or near 4.0 YPC almost every single play - he is obviously behind a great OL and is not greatly individually talented because he's getting 4 yards every play and not much more, yet he's also not ever getting less than 4.0 YPC.

S-RB #1 gets injured. Whoever his backup is will be inserted behind an incredible OL. There is, therefore, a better than reasonable chance that the b/u will have similar (or better) running success.

S-RB#2 averages 4.0 YPC, and almost never runs for 4 yards on any play - loads of negative and 1 or 2 yard carries and a couple big plays each and every game. He is obviously behind a not-so-great OL and is extremely individually talented.

S-RB #2 gets injured. Whoever his backup is will be inserted behind a not so incredible OL. There is, therefore, not a particularly reasonable chance that the b/u will have similar running success as S-RB#2.

I think the above is what's been :whoosh: for you. If you got that, and you still made that bolded statement, then you are giving me stats about b/u numbers for no real good reason - I take what you are saying about b/u as a given.

I assume you are not dumb, so I must not have explained what I meant explicitly enough. That hypo should do it.

Marc Levin · Nov 2, 2006

CalBear said:
thesurfshop19 said:

I kind of like the "success rate" stat that SSOG brings up a lot (success = 40% of yards needed on 1st down, 60% on 2nd down, and 100% on 3rd and 4th down).

The way I see it, success rate is to yards per carry as on base percentage is to slugging percentage. I could be way off though -- both numbers seem to be pretty useful in evaluating RBs though.

The question with success rate, that I have not seen any data about, is how predictive it is. Does a player's success rate in year N indicate that he will likely have a similar success rate in year N+1? And more to the point, is it a better predictor than the other things we use to predict fantasy scoring? Success rate had Ron Dayne as a top RB last year, which immediately throws it into question.

I think the more valuable use of success rate is not to predict year to year success, but game to game.

It'd be a decent stat to predict whether a player's first half stats will continue through the second half. The stat's BEST usage, however, is in comparing RBBC members, who, presumably, face the same defenses and run behind the same OL and in the same game - for example, success rate is a GREAT way to decide whether Addai or Rhodes is running better.

Angrymutt · Nov 2, 2006

Marc Levin said:
Bri said:

Marc Levin said:

Moreover, redman's question is extremely useful in determining whether an injured starter's immediate backup could be as, nearly as, or more successful.

IMO no it's not. Backup RBs often play a 3rd down role. 3rd down backs generally average more

:whoosh: I think you missed why I made the statement - you need to re-read it in context.

BTW, the bolded part is not true - more examples of b/u who are not 3rd d. backs come to mind than the other way around - there is a certain skill set you want out of your third down back that is not necessarily the same skill set you want from your starter (or else your starter would ALSO be your 3rd down back - ala LT, who simply can NOT be taken out of the game in any 3rd down situations b/c of his pass receiving, blocking, and RB speed)

Agreed. If info requested is, "what rb is closest to averaging his ypc each attempt" then you need a much larger sample than what a few weeks would give you. Keep in mind that a team preparing for a back-up when a #1 is out, or preparing for a backup that week after weeks of him getting more and more touches, will have a huge impact. These are things that can't be predicted and thus change from week to week. I'd think the sample is a bit small to get a good idea of what to expect. But if you look at what is available as the season goes on, then which RB is within 1 standard deviation or less from his mean average going into the next week, i'd guess that is the best way. Trouble is you might be crunching numbers all week, and not look at the other factors. i tend to ramble sorry if it doesn't make too much sense

Bri · Nov 2, 2006

Angrymutt said:
Agreed. If info requested is, "what rb is closest to averaging his ypc each attempt" then

(Not to single you out but good example for my confusion)

Marc, to me, this sentence is "If info requested is, what RB is closest to averaging his AVERAGE each attempt". I mean YPC is an average so it's :confused: to me

Bri · Nov 2, 2006

Marc Levin said:
CalBear said:

thesurfshop19 said:

I kind of like the "success rate" stat that SSOG brings up a lot (success = 40% of yards needed on 1st down, 60% on 2nd down, and 100% on 3rd and 4th down).

The way I see it, success rate is to yards per carry as on base percentage is to slugging percentage. I could be way off though -- both numbers seem to be pretty useful in evaluating RBs though.

The question with success rate, that I have not seen any data about, is how predictive it is. Does a player's success rate in year N indicate that he will likely have a similar success rate in year N+1? And more to the point, is it a better predictor than the other things we use to predict fantasy scoring? Success rate had Ron Dayne as a top RB last year, which immediately throws it into question.

I think the more valuable use of success rate is not to predict year to year success, but game to game.

It'd be a decent stat to predict whether a player's first half stats will continue through the second half.

The stat's BEST usage, however, is in comparing RBBC members, who, presumably, face the same defenses and run behind the same OL and in the same game - for example, success rate is a GREAT way to decide whether Addai or Rhodes is running better.

score would affect that more than anything

Maurile Tremblay · Nov 2, 2006

redman said:
Most people (me included) jump all over people who point out that "such-and-such-player would only have this yards per carry/reception/attempt average if you removed this big play."

1. Pointing out that a single play makes a big difference in a guy's average is just another way of pointing out that the sample size is too small for the average to be meaningful.

2. As CalBear states, you should never throw out a play just for the heck of it. Nonetheless, it may make sense not to count an 80-yard TD run as being twice as impressive as a 40-yard TD run. The only difference, most likely, was field position -- but the fact that the guy with the 80-yd run had worse field position on a previous drive doesn't really tell you anything useful about how well he'll run in the future. Therefore -- and I think I saw this idea at FootballOutsiders -- it might make sense (when forming a predictive model) to count any runs longer than 40 yards as just being 40 yards. That reduces the "luck" factor of field position.
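A quick sketch of that capping idea, assuming a hypothetical list of per-carry yardages; the function name `capped_ypc` is made up, and the 40-yard cap is just the figure suggested above:

```python
def capped_ypc(carries, cap=40):
    """Yards per carry with every run longer than `cap` counted as `cap` yards."""
    capped = [min(yards, cap) for yards in carries]
    return sum(capped) / len(capped)

# Hypothetical carries in yards, including one 80-yard run.
carries = [2, 3, 5, 1, 4, 80, 3, 2]

raw_ypc = sum(carries) / len(carries)
adjusted_ypc = capped_ypc(carries)

print(f"raw YPC: {raw_ypc:.1f}, capped YPC: {adjusted_ypc:.1f}")
```

Unlike throwing the play out, the long run still counts as a big play; only the part of its length that depends on field position is discounted.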

Bri · Nov 2, 2006

Marc Levin said:
Bri said:

Marc Levin said:

Bri said:

well woosh aside, a RB will average more on 3rd down. Googling I can't find it but...

Also, I still think a backup RB will average more.

:whoosh: in effect - you are still not getting why the stat requested is an important factor in determining whether the b/u RB will play well subbing for the starter if the starter is injured.

Bolded part = :confused: for me. What the heck are you talking about there?

Hmmm

let's suppose:

a D plans for the starting RB so his average suffers a little bit

the backup's average is higher

When the backup becomes the starter, his average will then suffer.

Using his average as a backup as an indicator won't show you anything all too useful but will instead frustrate you.

ETA take a look at Fargas or Morris

http://www.profootballreference.com/games/FargJu00.htm#2006

http://www.profootballreference.com/games/MorrMa00.htm#2006

Yup - as I thought - you are STILL not getting it so I will have to use a hypothetical.

Starting RB #1 averages 4.0 YPC, and runs at or near 4.0 YPC almost every single play - he is obviously behind a great OL and is not greatly individually talented because he's getting 4 yards every play and not much more, yet he's also not ever getting less than 4.0 YPC.

S-RB #1 gets injured. Whoever his backup is will be inserted behind an incredible OL. There is, therefore, a better than reasonable chance that the b/u will have similar (or better) running success.

S-RB#2 averages 4.0 YPC, and almost never runs for 4 yards on any play - loads of negative and 1 or 2 yard carries and a couple big plays each and every game. He is obviously behind a not-so-great OL and is extremely individually talented.

S-RB #2 gets injured. Whoever his backup is will be inserted behind a not so incredible OL. There is, therefore, not a particularly reasonable chance that the b/u will have similar running success as S-RB#2.

I think the above is what's been :whoosh: for you. If you got that, and you still made that bolded statement, then you are giving me stats about b/u numbers for no real good reason - I take what you are saying about b/u as a given.

I assume you are not dumb, so I must not have explained what I meant explicitly enough. That hypo should do it.

You described the first back as extremely individually talented and yet expect the less talented backup (with no distinguishable qualities mentioned) to perform as well.

Mootej · Nov 3, 2006

redman said:
I'm a pretty poor number cruncher, so I pose this question to the forum.

Most people (me included) jump all over people who point out that "such-and-such-player would only have this yards per carry/reception/attempt average if you removed this big play." The logic in the criticism is that, of course, you can't simply ignore big plays because they count towards the player's statistics of course and they indicate the ability of the player to break a big play every once in a while, i.e. "past performance may be indicative of future performance".

In statistics, however, there are such things as statistical outliers, meaning unusually large or small numbers that form the exception to the data being studied. In addition, it can be useful to try to figure out what amount of yards per carry, for example, are the most indicative of the "average run" that a RB has. For example, Terrell Davis averaged 4.7 yards per carry in 1995 while Barry Sanders averaged 4.8, however I'm sure that a close examination of their carries would reveal that Barry had a greater proportion of runs for lost yards and also of runs over 20 yards than Davis did. They were very different RB's.

Certain methodologies allow you when analyzing stats to remove both the largest and the smallest numbers from your data set before looking for the mean. However, taking a RB's longest and shortest runs out of the equation is problematic because a RB can gain far more yards past the line of scrimmage than he can lose behind the line of scrimmage on the average carry so, while the impact of a long run on his average as a statistical outlier would be diminished, it would still be there.

The question, then, is how do you go about figuring out what length of run is the most indicative of a RB's carries? Taking the median would seem like a logical way, except that rushing yards are measured in whole numbers on each carry by the NFL leading to pretty homogenous results when comparing RB's, and anyway I don't know of a source that compiles, orders and lists all of a RB's runs each game by distance. That seems like a lot of work. Any ideas here?

You might try this.

Dump all the runs into frequency bins: how many times in a given game the RB runs -10 to -5 yds, -5 to 0, 0 to 5, etc. You may want smaller bins if you have enough data.

Fit this to a Poisson distribution. The average predicted by the Poisson will be roughly the most likely number of times in a given game that the RB will run that distance. This will be a better fit and a better number. You can also look at how the frequency distribution looks.

Excel has all the functions (frequency, poisson, graphing).
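A rough sketch of those two steps in Python instead of Excel. The carry lengths and per-game counts are invented for illustration, and `bin_runs` and `poisson_pmf` are hypothetical helper names. It assumes the Poisson is fit simply by setting its mean to the average per-game count.

```python
import math
from collections import Counter

def bin_runs(runs, width=5):
    """Count carries per bin; the key is the bin's lower edge, e.g. -5 covers -5..-1."""
    return Counter((r // width) * width for r in runs)

def poisson_pmf(k, lam):
    """Probability of exactly k events when the average count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Hypothetical carries from one game (yards gained on each carry).
runs = [-2, 0, 1, 1, 2, 3, 4, 4, 6, 8, 15, 31]
bins = bin_runs(runs)

# Hypothetical number of 0-to-4-yard carries in each of six games.
per_game = [7, 9, 6, 8, 10, 8]
lam = sum(per_game) / len(per_game)   # fitted mean = average per-game count
most_likely = math.floor(lam)         # a mode of the Poisson (lam - 1 ties when lam is a whole number)
```

With these made-up numbers the 0-to-4 bin averages 8 carries a game, and a Poisson with mean 8 peaks at 7 and 8 carries. Whether real per-game counts actually follow a Poisson is something to check against the data, not something this sketch shows.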

Bri · Nov 3, 2006

Bri said:
just go with average and call it a day

allow me to go on with this.

Don't analyze yards per attempt and total yards and leave out "rushes" aka "carries". Oh so many people do and it's foolish. FF is about opportunity and that is a RB's opportunity. What does it take to figure out the average? Total yards/carries, right? Sorry to be so blatantly obvious, but how can you discuss a product of that math formula and leave out one of its two essential elements?

Tell me where in this thread the # of carries was discussed.
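To put numbers on why carries matter to the average, here is a small Python sketch. Both backs are invented, and `ypc` and `ypc_without_longest` are hypothetical helper names; the only thing taken from the thread is that YPC is total yards divided by carries.

```python
def ypc(runs):
    """Yards per carry: total yards divided by number of carries."""
    return sum(runs) / len(runs)

def ypc_without_longest(runs):
    """YPC after dropping the single longest run (the 'take out that big play' number)."""
    trimmed = sorted(runs)[:-1]
    return sum(trimmed) / len(trimmed)

# Made-up backs: each gains 3 yards on every carry except one 60-yard run.
backup = [3] * 19 + [60]      # 20 carries
starter = [3] * 199 + [60]    # 200 carries

backup_ypc = ypc(backup)      # 117 yards / 20 carries
starter_ypc = ypc(starter)    # 657 yards / 200 carries
```

The same single run lifts the 20-carry back from 3.0 to 5.85 YPC but the 200-carry back only from 3.0 to about 3.3, so an average quoted without the carry count says little about how much one play is driving it.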

Marc Levin · Nov 3, 2006

Bri said:
You described the first back as extremely individually talented and yet expect the less talented backup (with no distinguishable qualities mentioned) to perform as well.

Bri - you really are not getting this - actually, I pray for your sake that you are simply reading and not getting it.

I said S-RB#1 is NOT all that individually talented - the OL is getting him 4 yards downfield more often than S-RB #1 is getting himself past the LOS - like in Denver, plug in a monkey with a helmet and he can get 4 yards behind Denver's OL.

No back except Barry Sanders (plug him in for S-RB#2) could make chicken salad out of the chicken #### that was Detroit's OL most of the years he was there. Thus, Barry's lines were often negative one yard, two yards, negative one yard, 55 yards and a TD. (rinse, punt, get the ball, repeat)

Barry's backup would not have been able to do crap behind that OL.

Are you coming closer to getting this? Am I REALLY not explaining this clearly?

Bri · Nov 3, 2006

Marc Levin said:
Bri said:

You described the first back as extremely individually talented and yet expect the less talented backup (with no distinguishable qualities mentioned) to perform as well.

Bri - you really are not getting this - actually, I pray for your sake that you are simply reading and not getting it.

I said S-RB#1 is NOT all that individually talented - the OL is getting him 4 yards downfield more often than S-RB #1 is getting himself past the LOS - like in Denver, plug in a monkey with a helmet and he can get 4 yards behind Denver's OL.

No back except Barry Sanders (plug him in for S-RB#2) could make chicken salad out of the chicken #### that was Detroit's OL most of the years he was there. Thus, Barry's lines were often negative one yard, two yards, negative one yard, 55 yards and a TD. (rinse, punt, get the ball, repeat)

Barry's backup would not have been able to do crap behind that OL.

Are you coming closer to getting this? Am I REALLY not explaining this clearly?

I get it. I wrote "first back" when I should have written S-RB#2; sorry I messed your code up. But as I mentioned, you expect the backup to Barry Sanders of the Detroit Lions (aka BSDL's backup) to do as well as him. You're telling me you'd have to analyze stats to know that? I don't recall Walter Payton of the Chicago Bears (aka WPCB) backing up BSDL. oh yeah and

Maurile Tremblay · Nov 3, 2006

Marc Levin said:
Am I REALLY not explaining this clearly?

FWIW, I have no idea what either one of you guys are talking about. :shrug:

Edit to add: Barry Sanders was pretty good, though. So I agree with whoever has that side in the debate.

Phlash · Nov 3, 2006

If we discounted all of Barry Sanders' big runs he'd be considered a horrible running back. Instead he's considered one of the best of all time.

Marc Levin · Nov 3, 2006

Mootej said:
Fit this to a Poisson distribution. The average predicted by the Poisson will be the most likely number of times in a given game that the RB will run that distance. This will be a better fit and a better number. You can also look at how the frequency distribution looks.

Bri - imagine the above being done.

Two RBs have 4.0 YPC.

S-RB#1 has most of his carries in the 3.0-5.0 range. And he has ZERO carries of 20+ yards.

S-RB#2 has NO carries in the 3.0-5.0 range and has most of his carries in the -2 to 2.9 range. He also leads the league in 20+ yard carries.

S-RB#1 is running behind a great OL - and is not individually talented. S-RB#1 can get hurt and S-RB#1's backup is highly likely to be able to run at least 4.0 yards per carry.

S-RB#2 is not running behind a great OL - he is getting hit in the backfield or at the LOS most of the time - but he is able to maintain a 4.0 YPC because he is individually talented once he gets past the OL's blocking. S-RB#2's backup is not likely to maintain a 4.0 YPC average and is more likely to have a poor YPC average.

I hope THAT makes it clear enough for you.
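The two hypothetical backs can be mocked up in a few lines of Python. The carry lists are invented to match the description (same 4.0 YPC, very different spread), and `profile` is a hypothetical helper name, so treat this as an illustration rather than real data.

```python
from statistics import mean, median

def profile(runs):
    """Summarize a list of carry lengths (yards)."""
    n = len(runs)
    return {
        "ypc": mean(runs),
        "median": median(runs),
        "share_3_to_5": sum(3 <= r <= 5 for r in runs) / n,
        "share_20_plus": sum(r >= 20 for r in runs) / n,
    }

# Made-up carries: both backs total 40 yards on 10 carries.
rb1 = [3, 4, 5, 4, 4, 3, 5, 4, 4, 4]        # steady gains behind a good line
rb2 = [-1, 2, -1, 0, 1, 2, -2, 1, 0, 38]    # stuffed often, one long run

p1, p2 = profile(rb1), profile(rb2)
```

Both profiles show 4.0 YPC, but the median carry is 4 yards for the first back and half a yard for the second, and only the second has any carry of 20+ yards. The median and the range shares separate the two backs where the average cannot.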

Marc Levin · Nov 3, 2006

Bri said:
as I mentioned you expect the backup to Barry Sanders of the Detroit Lions (aka BSDL's backup) to do as well as him. You're telling me you'd have to analyze stats to know that? I don't recall Walter Payton of the Chicago Bears (aka WPCB) backing up BSDL. oh yeah and

No - that is not even close to what I said - see above.

Bri · Nov 3, 2006

Marc Levin said:
Mootej said:

Fit this to a Poisson distribution. The average predicted by the Poisson will be the most likely number of times in a given game that the RB will run that distance. This will be a better fit and a better number. You can also look at how the frequency distribution looks.

Bri - imagine the above being done. I hope THAT makes it clear enough for you.

WHAT? Is Bret Michaels singing? I have no clue WTF a poisson is

Bri · Nov 3, 2006

Marc Levin said:
Bri said:

as I mentioned you expect the backup to Barry Sanders of the Detroit Lions (aka BSDL's backup) to do as well as him. You're telling me you'd have to analyze stats to know that? I don't recall Walter Payton of the Chicago Bears (aka WPCB) backing up BSDL.

oh yeah and

No - that is not even close to what I said - see above.

In post 40, for S-FBG-RB2, you said one thing; before that (which I replied to) you said this:

"S-RB #2 gets injured. Whoever his backup is will be inserted behind a not so incredible OL. There is, therefore, not a particularly reasonable chance that the b/u will have similar running success as S-RB#2"

to which I'm replying as I did above. It's not earth-shattering to think Barry's backup isn't as good as him. Why do you need to bother analyzing YPC stats to come to that conclusion?

Marc Levin · Nov 3, 2006

Maurile Tremblay said:
Marc Levin said:

Am I REALLY not explaining this clearly?

FWIW, I have no idea what either one of you guys are talking about. Edit to add: Barry Sanders was pretty good, though. So I agree with whoever has that side in the debate.

That's not up for debate.

Bri was commenting on this:

Bri said:
Bri said:

just go with average and call it a day

flippant answer - two backs may average 4.0 per carry, but I obviously want the one that gets more 4 yard carries - the reason being that a back who averages 4.0, but gets them on one or two big plays in a game, is more likely to be replaced if he stops running the big plays (for whatever reason) - plus, his YPC plummets faster after a couple bad games. Finally, knowing the answer helps determine whether a team's running game is more or less likely to succeed against a tough run D.

The back that gets 4 more often and averages 4 per carry puts his team in a great position much more often than back #1 - that guy is unlikely to lose his job.

Moreover, redman's question is extremely useful in determining whether an injured starter's immediate backup could be as, nearly as, or more successful. A back that tends to get a lot of yardage on big plays and otherwise gets stopped for less than 3 is much more likely making his numbers on individual talent rather than OL play.

I am trying to explain why redman's query is valuable information.

For some reason, Bri is not understanding why having a backup for a guy who runs 4.0 YPC by running mostly 4 yards every carry is better than having a backup for a guy who also runs 4.0 YPC, but mostly runs for negative or low yardage.

Marc Levin · Nov 3, 2006

Bri said:
Marc Levin said:

Bri said:

as I mentioned you expect the backup to Barry Sanders of the Detroit Lions (aka BSDL's backup) to do as well as him. You're telling me you'd have to analyze stats to know that? I don't recall Walter Payton of the Chicago Bears (aka WPCB) backing up BSDL.

oh yeah and

No - that is not even close to what I said - see above.

In post 40, for S-FBG-RB2, you said one thing; before that (which I replied to) you said this:

"S-RB #2 gets injured. Whoever his backup is will be inserted behind a not so incredible OL. There is, therefore, not a particularly reasonable chance that the b/u will have similar running success as S-RB#2"

to which I'm replying as I did above. It's not earth-shattering to think Barry's backup isn't as good as him. Why do you need to bother analyzing YPC stats to come to that conclusion?

I don't think that point is debatable. I am explaining why redman wants the info. Do you REALLY want Najeh Davenport behind Willie Parker? That analysis is getting a little closer to why that info could be useful. We have a perception that Willie Parker makes most of his yardage on big plays - if we had the info redman is asking for, we could either prove or disprove that point and decide whether picking up Davenport is a good idea or a waste of time.

Like I said in the FIRST post - your response was flippant because knowing the answer to redman's query helps determine if the team has a good OL.

Marc Levin · Nov 3, 2006

p.s. - i don't need help knowing that Barry is awesome.

I *might* need that info to analyze whether DeShaun Foster could catch on somewhere else next year.

I might need it to figure out whether Addai is gonna be a stud even if the team turns over half their OL.

I can imagine a LOT of useful applications of the numbers redman wants. Taking YPC and moving on was a flippant answer because you implied that the information he wants has no practical application.

Mootej · Nov 3, 2006

Mootej said:
redman said:

I'm a pretty poor number cruncher, so I pose this question to the forum.

Most people (me included) jump all over people who point out that "such-and-such-player would only have this yards per carry/reception/attempt average if you removed this big play." The logic in the criticism is that, of course, you can't simply ignore big plays because they count towards the player's statistics of course and they indicate the ability of the player to break a big play every once in a while, i.e. "past performance may be indicative of future performance".

In statistics, however, there are such things as statistical outliers, meaning unusually large or small numbers that form the exception to the data being studied. In addition, it can be useful to try to figure out what amount of yards per carry, for example, are the most indicative of the "average run" that a RB has. For example, Terrell Davis averaged 4.7 yards per carry in 1995 while Barry Sanders averaged 4.8, however I'm sure that a close examination of their carries would reveal that Barry had a greater proportion of runs for lost yards and also of runs over 20 yards than Davis did. They were very different RB's.

Certain methodologies allow you when analyzing stats to remove both the largest and the smallest numbers from your data set before looking for the mean. However, taking a RB's longest and shortest runs out of the equation is problematic because a RB can gain far more yards past the line of scrimmage than he can lose behind the line of scrimmage on the average carry so, while the impact of a long run on his average as a statistical outlier would be diminished, it would still be there.

The question, then, is how do you go about figuring out what length of run is the most indicative of a RB's carries? Taking the median would seem like a logical way, except that rushing yards are measured in whole numbers on each carry by the NFL leading to pretty homogenous results when comparing RB's, and anyway I don't know of a source that compiles, orders and lists all of a RB's runs each game by distance. That seems like a lot of work. Any ideas here?

You might try this.

Dump all the runs into frequency bins: how many times in a given game the RB runs -10 to -5 yds, -5 to 0, 0 to 5, etc. You may want smaller bins if you have enough data.

Fit this to a Poisson distribution. The average predicted by the Poisson will be roughly the most likely number of times in a given game that the RB will run that distance. This will be a better fit and a better number. You can also look at how the frequency distribution looks.

Excel has all the functions (frequency, poisson, graphing).

Wish I could post an image here of the graph. I did this for LT for 2006. Most frequent is 1 & 2 yds. Another small mode at 8 yds. Another even smaller around 15 yds. Here are the numbers:

yds   freq
 -5      1
 -4      0
 -3      4
 -2      0
 -1      8
  0     14
  1     24
  2     24
  3     15
  4     17
  5     14
  6      4
  7      3
  8      7
  9      3
 10      2
 11      1
 12      1
 13      0
 14      1
 15      2
 16      1
 17      0
 18      0
 19      1
 20      0
 21      0
 22      0
 23      0
 24      0
 25      0
>25      4
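For what it's worth, the table above is already enough to answer part of redman's original question. A minimal Python sketch, assuming the counts are as posted; `median_run` is a hypothetical helper name, and because the ">25" row hides the exact lengths, the sketch sticks to the median, the modes and a share of short runs instead of the mean.

```python
# Carry-length counts copied from the table above; rows with a count of 0 are left out.
freq = {-5: 1, -3: 4, -1: 8, 0: 14, 1: 24, 2: 24, 3: 15, 4: 17, 5: 14,
        6: 4, 7: 3, 8: 7, 9: 3, 10: 2, 11: 1, 12: 1, 14: 1, 15: 2, 16: 1, 19: 1}
over_25 = 4   # the ">25" row: exact lengths unknown, so no exact mean is possible

total = sum(freq.values()) + over_25

def median_run(freq, over_25):
    """Median carry length; valid as long as it falls below the open-ended bin."""
    total = sum(freq.values()) + over_25
    middle = (total + 1) // 2          # middle carry (the lower middle if total is even)
    seen = 0
    for yds in sorted(freq):
        seen += freq[yds]
        if seen >= middle:
            return yds
    return None                        # median would lie in the ">25" bin

top = max(freq.values())
modes = [y for y, c in freq.items() if c == top]
short_share = sum(c for y, c in freq.items() if y <= 2) / total
```

On these counts there are 151 carries, the median run is 3 yards, the most common runs are 1 and 2 yards, and just under half of all carries go for 2 yards or fewer.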

Bri · Nov 3, 2006

Marc Levin said:
For some reason, Bri is not understanding why having a backup for a guy who runs 4.0 YPC by running mostly 4 yards every carry is better than having a backup for a guy who also runs 4.0 YPC, but mostly runs for negative or low yardage.

I understand that, it doesn't matter. I understand your line of thinking is that the Oline is establishing a 4 YPC and the backup will just plug right in. That doesn't happen though. I mean maybe from time to time but generally ...nope. Gimme some backups plugged in with the same YPC as the starter. In Indy, Edge averaged 4.6. Rhodes averages 3.1. But what about Addai? (talking to myself) He has fewer carries but a 5.1 average, and I already told you a back will average more if he runs on 3rd a lot. Nonetheless a half yard more than Edge is not the same.

Barber averages 5.1 and Jones averages 4.2, same line.

Steven Jackson as a backup averaged 5.0 YPC but as a starter averages 4.0.

Bri · Nov 3, 2006

Marc Levin said:
I can imagine a LOT of useful applications of the numbers redman wants. Taking YPC and moving on was a flippant answer because you implied that the information he wants has no practical application.

it doesn't, especially without taking in the stat: carries, which as I said before, none of you have done.

If you disagree then give me one, give me a practical application. I can sure give ya one with total yards (FBG ranking) or carries:

http://www.nfl.com/stats/leaders/NFL/RYDS/2006/regular

show me a RB with 600 or more carries not doing well

ETA top rushers by average aren't even starting RBs:

Michael Vick, Jerious Norwood, and Michael Turner

Marc Levin · Nov 3, 2006

Bri said:
Marc Levin said:

For some reason, Bri is not understanding why having a backup for a guy who runs 4.0 YPC by running mostly 4 yards every carry is better than having a backup for a guy who also runs 4.0 YPC, but mostly runs for negative or low yardage.

I understand that, it doesn't matter. I understand your line of thinking is that the Oline is establishing a 4 YPC and the backup will just plug right in. That doesn't happen though. I mean maybe from time to time but generally ...nope. Gimme some backups plugged in with the same YPC as the starter. In Indy, Edge averaged 4.6. Rhodes averages 3.1. But what about Addai? (talking to myself) He has fewer carries but a 5.1 average, and I already told you a back will average more if he runs on 3rd a lot. Nonetheless a half yard more than Edge is not the same.

Barber averages 5.1 and Jones averages 4.2, same line.

Steven Jackson as a backup averaged 5.0 YPC but as a starter averages 4.0.

Why do you keep bringing up the backups' averages as backups???

I am making the backup the starter for the consistent 4.0 YPC guy - I do not expect the b/u to duplicate the starter's stats, but I expect him to get pretty close to the starter's numbers and to maybe run better.

Plug a b/u in for the Barry Sanders type and you have no idea what will happen.

Your Edge-Dom/Addai example is EXACTLY why redman's stat is important. The guy subbing in with Edge gone is getting behind a pretty good OL, and Edge is def. a "consistent" 4.6 YPC guy rather than a breakaway guy.

That guy should get near 4.6 yards per carry - Rhodes ain't getting it, but Addai is. Edge wasn't the reason for the Colts running game being effective - witness his numbers in AZ - the OL was. Addai at 5.1 YPC is exactly the kind of info you want to know - is he getting those numbers by most often running near 5 a carry or is he busting off a long run every so often - I'd guess the latter because he is a full half yard per carry ABOVE Edge's numbers from last year.

The OL is getting Addai 2-3 yards into the second line of defense, he's doing the rest. The OL is also getting Rhodes 2-3 yards into the defense, but he has less talent than either Edge or Addai, so he's not making any progress after that.
