There are a few of these but examples of Simpson's Paradox always blow my mind:
In the 1995 season, David Justice had a higher batting average than Derek Jeter.
In the 1996 season, David Justice had a higher batting average than Derek Jeter.
However, Derek Jeter had a higher batting average over the 1995-1996 seasons combined.
I know it has to be some sort of weighting thing, but I'm still struggling to figure out how this is possible.
It was messing with my head as well, but the basic idea is that both of their averages were low in the season where Jeter had few at bats and Justice had a lot (1995), and both were high in the season where Jeter had a lot of at bats and Justice had few (1996). Each player's combined average is pulled toward the season where he batted the most: Jeter's toward his good year, Justice's toward his bad one.
So for an extreme example where Player 2 has a higher batting average in each season but a lower average overall...
Year 1, Player 1: 1/10 = .100 batting average
Year 1, Player 2: 150/1000 = .150 batting average
Year 2, Player 1: 800/1000 = .800 batting average
Year 2, Player 2: 9/10 = .900 batting average
Total, Player 1: 801/1010 = .793 batting average
Total, Player 2: 159/1010 = .157 batting average
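If it helps to see the weighting spelled out, here is a quick Python sketch of the made-up example above. The function names (combined_average, weighted_mean) are just ones I picked for illustration. It checks that the combined average is the same thing as the per-year averages weighted by each year's share of at bats.

```python
from fractions import Fraction

# (hits, at_bats) per year, copied from the extreme example above
player1 = [(1, 10), (800, 1000)]
player2 = [(150, 1000), (9, 10)]

def season_averages(seasons):
    """Batting average for each year on its own."""
    return [Fraction(h, ab) for h, ab in seasons]

def combined_average(seasons):
    """Total hits divided by total at bats across all years."""
    total_hits = sum(h for h, _ in seasons)
    total_at_bats = sum(ab for _, ab in seasons)
    return Fraction(total_hits, total_at_bats)

def weighted_mean(seasons):
    """Per-year averages weighted by each year's share of at bats."""
    total_at_bats = sum(ab for _, ab in seasons)
    return sum(Fraction(h, ab) * Fraction(ab, total_at_bats) for h, ab in seasons)

for name, seasons in [("Player 1", player1), ("Player 2", player2)]:
    per_year = ", ".join(f"{float(a):.3f}" for a in season_averages(seasons))
    print(f"{name}: per year {per_year}; combined {float(combined_average(seasons)):.3f}")
# Player 1: per year 0.100, 0.800; combined 0.793
# Player 2: per year 0.150, 0.900; combined 0.157
```

Player 1's .800 year carries 1000 of his 1010 at bats, while Player 2's .900 year carries only 10 of his 1010, so the two combined averages end up near .800 and .150 respectively.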
In the case of Jeter/Justice, Jeter was 12/48 (.250) in 1995 and 183/582 (.314) in 1996, while Justice was 104/411 (.253) in 1995 and 45/140 (.321) in 1996.
Even though Justice was higher in both seasons, Jeter's total was actually not just a little bit higher, but quite a bit (.310 vs. .270).
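For anyone who wants to check the real numbers, a rough Python sketch using the hits and at bats quoted above (the report helper is just something I made up for this post) prints each season's average next to the share of the player's at bats it accounts for:

```python
# (hits, at_bats) for the 1995 and 1996 seasons, as quoted above
jeter = {1995: (12, 48), 1996: (183, 582)}
justice = {1995: (104, 411), 1996: (45, 140)}

def report(name, seasons):
    """Print per-season averages with at-bat shares; return the combined average."""
    total_hits = sum(h for h, _ in seasons.values())
    total_at_bats = sum(ab for _, ab in seasons.values())
    for year, (hits, at_bats) in seasons.items():
        share = at_bats / total_at_bats
        print(f"{name} {year}: {hits}/{at_bats} = {hits / at_bats:.3f} "
              f"({share:.1%} of his at bats)")
    combined = total_hits / total_at_bats
    print(f"{name} combined: {total_hits}/{total_at_bats} = {combined:.3f}")
    return combined

jeter_total = report("Jeter", jeter)
justice_total = report("Justice", justice)
```

Over 90% of Jeter's at bats came in 1996, when both players hit well, while roughly three quarters of Justice's came in 1995, when both hit around .250. That lopsided weighting is the whole trick.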