vorosmccracken.com

The “triumphant” return of The Knack of the baseball world


Sabermetric Drive By: Adjusting Past Statistics

February 6th, 2008 · 1 Comment

Came up with an idea for adjusting past stats and was wondering if and where I’m going to be tripped up.

The idea is to take all non-pitchers who played in MLB in consecutive seasons and compare their stats from one year to the next. If you just look at the year-to-year league totals, those totals also reflect players who dropped out of the league after the first year and players who entered it in the second. Changes in league averages often have more to do with different types of players coming and going than with any real change in the playing environment.

The key step is to adjust the first-year stats for an age curve. If you don’t, stats that tend to drop almost immediately after a player’s rookie year will be read as league changes rather than player changes. The downside is that we don’t actually know how much the traditional age curves reflect actual aging and how much they reflect a trend toward more and more talent entering the league. What this system does is assume an “average” aging pattern and gauge how the individual years deviate from that average. The final downside is that, so far, the system breaks a bit in years before 1946 because of the war.
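A rough Python sketch of the comparison described above. It is not the actual code behind these numbers. The record layout, the `age_factor` curve, and the weighting by the smaller of the two plate-appearance totals are all assumptions made for illustration.

```python
def age_factor(age):
    """Hypothetical age curve: expected multiplier on a player's HR rate
    from this season to the next. The shape (peak at 27, 1% per year)
    is an assumption for illustration, not a fitted curve."""
    return 1.0 + 0.01 * (27 - age)


def league_change(seasons, year, age_curve=age_factor):
    """Estimate how the HR environment changed from `year` to `year + 1`,
    using only players who appear in both seasons.

    `seasons` is a list of dicts with keys: player, year, age, pa, hr.
    Returns actual / expected HR rate, or None if nobody qualifies.
    """
    by_key = {(s["player"], s["year"]): s for s in seasons}
    expected = 0.0
    actual = 0.0
    for (player, y), first in by_key.items():
        if y != year:
            continue
        second = by_key.get((player, year + 1))
        # Skip players who left the league or had no playing time.
        if second is None or first["pa"] == 0 or second["pa"] == 0:
            continue
        # Assumed weighting: the smaller of the two PA totals.
        weight = min(first["pa"], second["pa"])
        # First-year rate, moved one year along the assumed age curve.
        expected += weight * (first["hr"] / first["pa"]) * age_curve(first["age"])
        actual += weight * (second["hr"] / second["pa"])
    if expected == 0:
        return None
    return actual / expected


sample = [
    {"player": "A", "year": 2000, "age": 27, "pa": 500, "hr": 25},
    {"player": "A", "year": 2001, "age": 28, "pa": 500, "hr": 30},
    {"player": "B", "year": 2000, "age": 35, "pa": 400, "hr": 10},  # dropped out
    {"player": "C", "year": 2001, "age": 22, "pa": 300, "hr": 5},   # entered
]
ratio = league_change(sample, 2000)
```

In the sample, only player A counts toward the ratio. Players B and C appear in just one of the two seasons, so they are left out rather than being allowed to move the league average.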

The big surprise so far is how far downward the stats in the 1950s seem to be adjusted, particularly in the American League. It’s worth noting that the AL in the 1950s was notoriously lopsided in terms of talent so it’s not that far of a stretch. Also Cecil Fielder’s 1990 becomes one of the top five home run seasons ever from 1946-2006. Mike Schmidt’s 1980 similarly gains a lot of ground. Mickey Mantle’s numbers take a big hit.

Any suggestions on obvious errors I might have missed, or ways to improve the system? There are other problems (including what precisely you’d want to use these stats for), but I found it to be an interesting exercise anyway.

(Interesting stats based on 2006 AL) AVG/OBP/SLG – HR:

2001 – Barry Bonds – .323/.508/.805 – 65
1998 – Mark McGwire – .303/.466/.755 – 71
1990 – Cecil Fielder – .302/.401/.677 – 62
1992 – Mark McGwire – .295/.415/.693 – 54
1946 – Hank Greenberg – .265/.376/.628 – 52
1980 – Mike Schmidt – .289/.390/.636 – 52
1971 – Willie Stargell – .301/.401/.644 – 50
1946 – Ted Williams – .317/.497/.663 – 44
1976 – Graig Nettles – .275/.354/.563 – 43 (big improvement from actual)
1966 – Frank Robinson – .310/.413/.600 – 42
1967 – Carl Yastrzemski – .328/.424/.615 – 41
1968 – Ernie Banks – .261/.304/.535 – 41 (his adjusted career high in HR)
1959 – Hank Aaron – .337/.387/.565 – 31
1956 – Mickey Mantle – .317/.427/.565 – 34

Top 5 in adjusted homers from 1946-2006:

Barry Bonds – 733
Hank Aaron – 678
Mark McGwire – 604
Mike Schmidt – 592
Rafael Palmeiro – 582

Top 5 in adjusted hits from 1946-2006:

Pete Rose – 4212
Hank Aaron – 3684
Carl Yastrzemski – 3392
Paul Molitor – 3390
Eddie Murray – 3320

Easiest Home Run Leagues (relative to 2006 AL):

1962 AL
1961 AL
1958 AL
1959 AL
1956 AL

Hardest Home Run Leagues:

1976 AL
1992 AL
1976 NL
1992 NL
1968 NL

(all seasons that preceded expansion).
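One way to get from year-to-year changes to a ranking "relative to 2006 AL" is to chain the yearly ratios back from the base season. The post does not say that this is how the lists above were built, so treat the sketch below as an assumption. It also handles a single league only, and `chain_to_base` and its input format are hypothetical.

```python
def chain_to_base(yearly_change, base_year):
    """Convert year-to-year environment ratios into multipliers that put
    each season on the base year's scale.

    `yearly_change[y]` is the ratio for going from season y to y + 1
    (values above 1 mean the stat got easier to accumulate in y + 1).
    Assumes every year from the earliest key up to base_year - 1 is present.
    """
    factors = {base_year: 1.0}
    first_year = min(yearly_change)
    # Walk backwards from the base year, multiplying ratios as we go.
    for year in range(base_year - 1, first_year - 1, -1):
        factors[year] = factors[year + 1] * yearly_change[year]
    return factors


# Made-up ratios, for illustration only.
changes = {2004: 1.1, 2005: 0.5}
factors = chain_to_base(changes, 2006)

# A season's raw total times its factor gives the base-year equivalent.
adjusted_hr = 40 * factors[2005]
```

Under this convention, a season with a factor below 1.0 was an easier environment than the base year, since its raw totals shrink when converted. A factor above 1.0 marks a harder one.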

Tags: Drive Bys · Uncategorized

1 response so far ↓

  • 1 Larry // Feb 7, 2008 at 11:08 am

    This is a great idea, and a tool I can see being very useful comparing players/styles/stats from different years. It should be very telling grouping years into ‘eras’. It’s got real promise. (Not that I can help proof it. I’m good for Excel sorts, and that’s about it.)
