
ConstruxBoy Ranking Contest Rationalization

ConstruxBoy

Kate's Daddy
I staged a ranking contest this past season with a specific purpose in mind. First of all, here are the links to the Contest and Results threads:

Results

QBs

RBs

WRs

TEs

PKs

DEFs

The idea behind the contest was hatched while reading a popular book by James Surowiecki called The Wisdom of Crowds. Surowiecki's basic premise is that the "Crowd" is smarter and more accurate than any one person when estimating something unknown. A good example is the number of jelly beans in a jar: he showed that the average of all the guesses tends to be more accurate than almost any individual guess. This holds true even if you include experts in the average, so the accuracy of a group should be greater than the accuracy of an expert. Of course, the average of a group of experts should also be more accurate than the average of a group of non-experts.
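You can see the jelly bean effect in a few lines of Python. A minimal toy sketch (not from the book; the true count and the noise model are made up): simulate a batch of noisy guesses at a known count and compare the crowd average's error to a typical guesser's error.

Code:
import random

random.seed(42)
TRUE_COUNT = 1000   # made-up number of jelly beans in the jar
N_GUESSERS = 50

# Each guesser is noisy and slightly biased; the noise model is arbitrary.
guesses = [TRUE_COUNT * random.uniform(0.5, 1.6) for _ in range(N_GUESSERS)]

crowd_avg = sum(guesses) / len(guesses)
errors = sorted(abs(g - TRUE_COUNT) for g in guesses)

print("crowd average error :", round(abs(crowd_avg - TRUE_COUNT)))
print("median guesser error:", round(errors[len(errors) // 2]))
print("best guesser error  :", round(errors[0]))

Run it a few times with different seeds and the crowd average reliably lands closer than the median guesser, even though the single best guesser usually beats the average. That wrinkle matters for the results below.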

I decided that it would be fun to try this with fantasy football rankings at FootballGuys.com. My expectation was that even giving away $5 a position for the most accurate ranking (other than the consensus, or "crowd" ranking of course) would be offset by the $30 I would get for my fantastic freelance article on the contest this summer. I also decided to include the final pre-season rankings of 6 FBG staff members, plus the overall FBG Staff Average rankings, for each position to represent the experts.

Oops.

As you can see from the results, my expectations were not met. Let's look at each position:

QB: 16 different board members and 4 experts beat the board consensus. Not what I expected. Of course, the FBG Staff average trounced the board consensus, so the secondary theory of a group of experts beating a group of non-experts still looked OK.

RB: A little bit better, as only 6 board members and 4 experts beat the board consensus. And the FBG Staff average beat the board consensus as well.

WR: Back to QB territory as 10 board members and 3 experts beat the board consensus. The FBG Staff average also beat the board consensus for the 3rd straight time. And my wife is wondering why I've paid out $30 already with 3 positions left to go. :yes:

TE: FINALLY! The whole theory is validated by the TE position! :shrug: Not sure what happened here, but the board consensus was the most accurate ranking and beat the FBG Staff average by a decent amount. So Disco Stu had $5 less to make do with. :bag: My thought was that maybe the less scrutinized positions, like TE, PK and DEF, were more difficult to rank and therefore more likely to produce a better consensus ranking.

PK: There goes that theory. The board consensus scored a brutal 194, the 4th worst. 10 of the 13 board entries beat it, and all the experts beat it. The FBG Staff average was better again, but at 190 it still wasn't great. This was clearly a tough position to rank last year.

DEF: Very, very similar to the PKs. The board consensus was once again 4th worst; the FBG Staff average beat it for the 5th time in 6 positions, but it was also quite poor. Again, this looks like a position that is difficult to rank and is as much of a crapshoot as anything.
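A note on the scores: the thread doesn't spell out how numbers like 194 and 190 were computed. A common way to score rankings, and the assumption in this sketch, is to sum the absolute difference between each player's predicted rank and his actual end-of-season rank (lower is better), with the consensus built by averaging each player's rank across all the entries. The score() and consensus() helpers here are hypothetical, not the contest's actual code.

Code:
# Assumed metric: score = sum of |predicted rank - actual rank|, lower is
# better. Consensus = each player's average rank across entries, re-ranked.

def score(predicted, actual):
    """Sum of absolute rank errors for players appearing in both lists."""
    actual_rank = {player: i + 1 for i, player in enumerate(actual)}
    return sum(abs((i + 1) - actual_rank[p])
               for i, p in enumerate(predicted) if p in actual_rank)

def consensus(entries):
    """Average each player's rank across entries, then re-rank."""
    players = entries[0]  # assumes every entry ranks the same player pool
    avg_rank = {p: sum(e.index(p) + 1 for e in entries) / len(entries)
                for p in players}
    return sorted(players, key=avg_rank.get)

# Toy example with made-up players:
entries = [["A", "B", "C", "D"], ["B", "A", "D", "C"], ["A", "C", "B", "D"]]
actual = ["B", "A", "C", "D"]
board = consensus(entries)
print("consensus:", board, "score:", score(board, actual))
for e in entries:
    print("entry:", e, "score:", score(e, actual))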

Conclusions:

At the very least, in this contest, the wisdom of crowds did not hold true. In 5 of the 6 positions, at least one board member beat the board consensus.

In 5 of the 6 positions, the FBG Staff average beat the board consensus. Anyone who doesn't think the FBG Staff members know what they are doing is fooling themselves. This is obviously just one year's worth of data, but at all of the big 3 positions (QB, RB, WR) the FBG Staff average was better than more than half of the board members.

Props to BoltBacker and Lott's Fingertip. Their finishes (QB, RB, WR, TE, PK, DEF):

BB - 1, 2, 19, 15, 2, 1

LF - 8, 4, 1, 16, 3, 4

Nice job guys.

Also props to Jason Wood, who did the best among the 4 experts used at all 6 positions.

The H.K. explanation: What I did not count on when starting this contest was someone purposely submitting bad rankings. But then again, I should have counted on H.K. I forgot that he had beaten me in the Fantasy Bowl of our CHUG league in 2005. That meant that in the 1st round, and every odd-numbered round, I drafted 11th and he drafted 12th. In an effort to hide his true feelings (you could have just used the Force, like Vader, GB), he put in crazy rankings and finished near the bottom at almost every position. What he didn't know was that I didn't even look at the rankings or start compiling the spreadsheet until mid-season. :D

Of course, he won the Fantasy Bowl in CHUG again this year, so maybe he's on to something.

 
In regards to your theory of a consensus of many being more accurate than one, I was thinking about the jelly bean example you used. The differences between that example and FF are striking. First off, the number of beans is already determined (the result exists), so all we need to figure out is what that result is. In FF we do not yet know the result; we are guessing not only what the results will be but which players will produce them. Multiply that by 12 QBs, for example, and the level of difficulty increases.

Now to make it even more difficult, and therefore less exact, add in the factors that will impact each player's final result. QBs are affected by RBs, WRs, defenses, and so on. It's a lot different than a static object such as jelly beans.

I realize this theory could be applied to many things and that the beans were just an example of the theory in application. I just think FF is not the best application of the theory.

 
Where the hell were you in August!!
 
Did you have a large enough sample size at each position? In my case, I think I only managed to do a couple of the spots.

 
No, probably not. And I did mean to address that in the first post. I was really hoping for 40-50 entries for each position.
 
Surprised you didn't get more responses from the huge population of posters here. I only did QBs and was somehow 2nd. I really don't do rankings because I only play in dynasty IDP leagues.
 
Yeah, couldn't really afford to give away big bucks or anything, but was hoping for more.
 
Nothing wrong with the theory, but the sample size is WAAAAY too small for testing this kind of theory. The more complex the exercise, the bigger the sample size needs to be. For something like FF predictions, you'd need thousands participating in the exercise.
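For what it's worth, the crowd-size effect is easy to sketch for the simple jelly-bean-style guess (ranking 12 QBs is a much noisier, higher-dimensional version of the same averaging, so treat the shape of the curve as the point, not the FF-specific numbers). A minimal sketch, assuming unbiased guessers:

Code:
import random

random.seed(3)
TRUE_COUNT, TRIALS = 1000, 2000

# Mean error of the crowd average shrinks roughly like 1/sqrt(crowd size).
for n in (2, 10, 50, 250, 1000):
    total_err = 0.0
    for _ in range(TRIALS):
        guesses = [TRUE_COUNT * random.uniform(0.5, 1.5) for _ in range(n)]
        total_err += abs(sum(guesses) / n - TRUE_COUNT)
    print(f"crowd of {n:4d}: mean error {total_err / TRIALS:6.1f}")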

 
The H.K. explanation: What I did not count on when starting this contest was someone purposely using bad rankings.
Have you tallied the results by eliminating my rankings? If I'd had any idea of the purpose of your analysis, I wouldn't have skewed the results. Given the small sample sizes, it may make a little bit of difference because mine were so far off. Just figured I'd give you something else to do; I know how much free time you have these days... :lmao: PS - Cool concept btw.
 
Nothing wrong with the theory, but the sample size is WAAAAY too small for testing this kind of theory. The more complex the exercise, the bigger the sample size needs to be. For something like FF predictions, you'd need thousands participating in the exercise.
Yeah, that's a good point. I'm not sure how easy it would be to get that many rankings, though. The idea was that if the conclusion was mostly correct (i.e., an average group of rankings should be more accurate than your own), then it needed to be something a fantasy football owner could reasonably do to improve his own chances to win. If it takes two thousand sets of rankings to see an accurate consensus ranking, that's way outside the average fantasy owner's time or resources each year. So even if the theory were true, it wouldn't be very practical. Of course, had I given more thought to the small sample size issue, I could have saved myself a bunch of time and $55. :thumbup:
 
Sample size is not the issue. Even a sample size of two should be better than a sample size of one. The problem is comparing the rankings against actual results. A ranking can really be viewed as a probability distribution for each player. One would expect the consensus ranking, measured against the actual results of the season, to finish near the middle of the entries. But that does not mean the probability distribution behind the consensus ranking was not better than any individual's. And it is almost surely better than most people's rankings. This is not unlike the deception many mutual funds use to suggest that they are better than an index fund.
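That argument can be put to numbers with a toy simulation (all the distributions here are made up, and the scoring rule is the same assumed sum-of-rank-errors as in the sketch above). Give every entrant a noisy view of the players' true strength, make the season itself noisy too, and the consensus posts the best score on average across many simulated seasons, yet a decent chunk of the time some individual entry beats it in a given season:

Code:
import random

random.seed(1)
N_PLAYERS, N_ENTRANTS, TRIALS = 20, 13, 1000

def rank_from_values(values):
    """Turn projected values into a ranking (player ids, best first)."""
    return sorted(range(N_PLAYERS), key=lambda p: -values[p])

def score(ranking, actual_rank):
    """Sum of absolute rank errors; lower is better."""
    return sum(abs((i + 1) - actual_rank[p]) for i, p in enumerate(ranking))

wins, places, cons_scores, ind_scores = 0, [], [], []
for _ in range(TRIALS):
    true_skill = [random.gauss(0, 1) for _ in range(N_PLAYERS)]
    # Each entrant sees true skill through personal noise.
    entries = [rank_from_values([s + random.gauss(0, 1) for s in true_skill])
               for _ in range(N_ENTRANTS)]
    # Consensus: average each player's rank across entries, re-rank.
    avg = {p: sum(e.index(p) for e in entries) / N_ENTRANTS
           for p in range(N_PLAYERS)}
    cons = sorted(range(N_PLAYERS), key=avg.get)
    # The season outcome is noisy too (noise level chosen arbitrarily).
    outcome = rank_from_values([s + random.gauss(0, 1.5) for s in true_skill])
    actual_rank = {p: i + 1 for i, p in enumerate(outcome)}
    scores = [score(e, actual_rank) for e in entries]
    c = score(cons, actual_rank)
    cons_scores.append(c)
    ind_scores.extend(scores)
    wins += c < min(scores)
    places.append(sum(s < c for s in scores) + 1)

print("mean consensus score :", sum(cons_scores) / TRIALS)
print("mean individual score:", sum(ind_scores) / len(ind_scores))
print("consensus outright #1:", wins / TRIALS)
print("median consensus finish:", sorted(places)[TRIALS // 2])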

 
The idea behind the contest was hatched while reading a popular book by James Surowiecki called The Wisdom of Crowds. Surowiecki's basic premise was that the "Crowd" was smarter and more accurate than any one person when determining something that is unknown. A good example is the number of Jelly Beans in a jar. He showed that an average of all the guesses would be a more accurate guess than the guess of any one person.
I have not read the book. But I don't think the average of all the guesses would be the most accurate guess out of a large number of them. It might be the "best" guess in the sense of being a favorite over any specific individual guess before the beans are counted, but I don't think it would be a favorite over the field.

In other words, let's say that we have ten different guessers, who are ranked in their guessing expertise as follows: Allen, Byron, Chad, Dirk, Evan, Fred, Gill, Hal, Ian, Jack.

Let's call the actual closest guess (after the beans have been counted) G1, the second-closest guess G2, and so on.

If we include the average of all the guesses as a separate guess, we have eleven guesses.

The Average Guess might be a favorite over Allen. But it should be an underdog to be G1.

If that makes any sense . . .
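That intuition checks out in a quick Monte Carlo. A minimal sketch, assuming ten interchangeable guessers (no expertise ordering, unlike the Allen-through-Jack lineup above) and an arbitrary noise model:

Code:
import random

random.seed(7)
TRUE_COUNT, N_GUESSERS, TRIALS = 1000, 10, 10000

beats_one = is_g1 = 0
for _ in range(TRIALS):
    guesses = [TRUE_COUNT * random.uniform(0.5, 1.6)
               for _ in range(N_GUESSERS)]
    avg_error = abs(sum(guesses) / N_GUESSERS - TRUE_COUNT)
    errors = [abs(g - TRUE_COUNT) for g in guesses]
    beats_one += avg_error < errors[0]  # vs. one fixed guesser
    is_g1 += avg_error < min(errors)    # vs. the entire field

print("average beats a given guesser:", beats_one / TRIALS)
print("average is the closest (G1)  :", is_g1 / TRIALS)

The first number comes out high and the second low: the Average Guess is a favorite over any one guesser but an underdog to be G1.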

Also, level of expertise does matter. Rain Man might be a favorite over the crowd average.

Similarly, Richard Feynman tells a story from when he was on a committee reviewing science textbooks being considered for use in public schools. Everyone on the committee was supposed to give each book a grade from 1 to 5. Feynman read the books and gave them his grades. At the committee meeting, people announced the grades they'd given to each book, and everyone except Feynman gave a grade to one particular book that he had not graded. Feynman had not graded it because his copy was completely blank. He mentioned this at the meeting, and the representative from that publisher confirmed that, at the time the books were sent out, that book had not been finished: they had a cover, but the content inside wasn't done, so they sent out blank books to everyone.

How were all these people giving grades ("I gave it a 4." "I gave it a 3.") to blank books? It turned out that nobody else realized they were blank, because they were just giving arbitrary grades out to all the books without going through the trouble of actually reading them first. (It was a lot of work to read them all.)

So it was obvious in that case that Feynman's grade alone was more accurate than the average of a bunch of random grades by people who hadn't read the books. You don't improve the accuracy of a wise opinion by averaging it with a bunch of stupid ones.

That's a contrived example, though. In general, I would expect the average FF rankings from people on this board to be very good . . . as long as HK isn't included. ;)

 
Good points. So in simple terms, the board consensus should be better than the average score, but not necessarily better than the best score? And I love me some Feynman. I think I read that story before, actually. Maybe in 'Genius'. Wasn't thinking of it when I did the contest, though.

You should read Wisdom of Crowds. Good book.

 
