Tuesday, October 6, 2015

Why MLB Should Seed Playoff Teams Based on Record Alone

Over time, I’ve written several pieces about the change to baseball’s playoff system. Overall, I’d say my position on it has shifted from “Get rid of the second wild card” to “if we’re going to have a fifth team, at least do it better than we are now”. I would hope that prospective improvement is something that we can all agree is a good goal, right? So what would prospective improvement for the current system be?

Well, I think there are a number of things that could be fixed, but the one I want to focus on today is seeding. You may or may not have heard, but the three best records in the majors this year all belong to teams in the NL Central. Despite this, the Pirates and Cubs will need to play one game to determine which one of them “really” deserves a post-season spot, at which point the winner will face the Cardinals. So at most one of the three best records in the majors can reach the Championship Series round.

That’s a little absurd. Why can’t baseball switch to seeding solely based on record, like the NBA recently decided to do? I see people arguing against it all the time, but the arguments just don’t make sense to me. The Pirates won 98 games; the Cubs won 97. You mean to tell me that, because they were assigned to the Central division back in 1994, the 92-win Dodgers and 90-win Mets deserve those automatic bids to the NLDS more? Sure, sure, you can scream “DOESN’T MATTER, JUST WIN YOUR DIVISION” all you want; that still doesn’t explain why teams that did less to win their division (in just about every conceivable way) than the Pirates and the Cubs should see benefits. It’s not even like the Mets and Dodgers were noticeably better at beating the Cardinals; they went 3-4 and 2-5 versus St. Louis respectively, while the Cubs went 8-11 and the Pirates went 9-10.
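To make the proposal concrete, here is a rough Python sketch comparing the two seeding rules on this year's NL field. The win totals for the Pirates, Cubs, Dodgers, and Mets are the ones quoted above; the Cardinals' 100 wins and the function names (`seed_current`, `seed_by_record`) are additions for illustration, and tie-breakers are ignored entirely.

```python
# 2015 NL playoff field. Win totals for the Pirates, Cubs, Dodgers, and Mets
# come from the post; the Cardinals' 100 wins is added here for illustration.
teams = [
    {"name": "Cardinals", "wins": 100, "division_winner": True},
    {"name": "Pirates", "wins": 98, "division_winner": False},
    {"name": "Cubs", "wins": 97, "division_winner": False},
    {"name": "Dodgers", "wins": 92, "division_winner": True},
    {"name": "Mets", "wins": 90, "division_winner": True},
]

def seed_current(field):
    """Division winners take the top seeds; wild cards fill in behind them."""
    return sorted(field, key=lambda t: (not t["division_winner"], -t["wins"]))

def seed_by_record(field):
    """Ignore division titles and order the field by wins alone."""
    return sorted(field, key=lambda t: -t["wins"])

current = [t["name"] for t in seed_current(teams)]
by_record = [t["name"] for t in seed_by_record(teams)]

print("Current format:", current)
print("Record only:   ", by_record)
# Seeds 4 and 5 are the ones stuck in the one-game play-in.
print("Play-in under current format:", current[3:])
print("Play-in under record-only seeding:", by_record[3:])
```

Under the current rule the play-in pairs the Pirates and Cubs; under record-only seeding it would pair the Dodgers and Mets instead.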

Some might point out that it’s rare for the three best teams in the majors to all come from one division, and that is true. However, what’s not uncommon at all is for a wild card winner to have a better record than a division winner; since the first full season with the new format in 1995, there have been thirteen seasons in the AL and fourteen seasons in the NL where the top Wild Card has had a better record than at least one division winner. 

Yep, out of 42 league-seasons, over half have ended with a wild card playing well enough to win a division other than their own. I’d like to think that the regular season means something, that doing well over 162 games gets you some benefit for the final twenty, but as it’s structured now, there’s a good chance that finishing with a better record will be rendered moot. This isn’t some once-in-a-lifetime fluke; it’s a real issue that needs addressing. It’s admittedly a lot more unusual for both wild cards to finish so well; this season marks the sixth time the second wild card team (or the team that would have filled that role, had it existed prior to 2012) would have pulled it off. But that just makes this season different in extent rather than in kind.
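For anyone who wants to check the arithmetic, a quick sketch using only the counts quoted above (the variable names are just for illustration):

```python
first_season, last_season = 1995, 2015
seasons = last_season - first_season + 1   # 21 seasons with the wild card format
league_seasons = seasons * 2               # AL and NL counted separately

al_cases, nl_cases = 13, 14                # counts quoted in the post
total_cases = al_cases + nl_cases

share = total_cases / league_seasons
per_season = total_cases / seasons

print(f"{total_cases} of {league_seasons} league-seasons ({share:.0%})")
print(f"{per_season:.2f} such wild cards per MLB season")
```

That works out to roughly 64% of league-seasons, or a bit more than one such wild card per year across the two leagues.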

But what about the unbalanced schedule, you might say. Baseball has an unbalanced schedule, and it wouldn’t be fair if the wild card team’s record was just a case of fattening up on a weaker division. Well, there are a lot of issues with that statement. For instance, I don’t see how letting the record be the unseen deciding factor is significantly worse than letting geography be the tie-breaker, as it is now.

More notably, though, the case doesn’t seem to hold up even in the abstract. Picture two divisions, one with a wild card team better than the other division’s winner. If the schedule were unbalanced towards more games with intra-divisional foes, which team would have the easier schedule?

Well, knowing only that one division has a wild card team with a better record than the other division’s winner, you’d be safer betting on the wild card having a harder schedule. Why is that? Well, picture it with numbers attached; say that the wild card has the third best record while the worse division winner has the fourth best. Just based on that, we KNOW that the wild card played a lot of games against the best or second-best record in the league, but the weak division winner’s hardest rival was, at best, the fifth-best in the league. Those are the only things we know FOR CERTAIN. Assuming the rest of the league’s records are randomly distributed, we can’t guarantee anything beyond that, meaning the division with the two strong teams is just as likely as the “weak” division to have a stacked bottom three.
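The thought experiment can be roughed out as a small simulation. This is only a sketch under loud assumptions that are not from any real data: a 15-team league split into three 5-team divisions, league ranks shuffled uniformly at random, and "schedule difficulty" judged purely by the ranks of a team's four division-mates. The function name `simulate` and its parameters are made up for the example.

```python
import random

def simulate(trials=20000, teams=15, divisions=3, seed=0):
    """Return (qualifying, wild_card_tougher) counts over random leagues."""
    rng = random.Random(seed)
    size = teams // divisions
    qualifying = 0
    wild_card_tougher = 0
    for _ in range(trials):
        ranks = list(range(1, teams + 1))   # 1 = best record in the league
        rng.shuffle(ranks)
        # Each division is a sorted list of ranks, so index 0 is its winner.
        divs = [sorted(ranks[i * size:(i + 1) * size]) for i in range(divisions)]
        weak_div = max(divs, key=lambda d: d[0])   # division with the worst winner
        wc_div = min(divs, key=lambda d: d[1])     # best second-place team = top wild card
        if wc_div[1] > weak_div[0]:
            continue   # wild card did not out-finish any division winner
        qualifying += 1
        wc_mates = [wc_div[0]] + wc_div[2:]        # the wild card's four division-mates
        weak_mates = weak_div[1:]                  # the weak winner's four division-mates
        # Lower rank numbers mean better teams, so a smaller sum is a tougher division.
        if sum(wc_mates) < sum(weak_mates):
            wild_card_tougher += 1
    return qualifying, wild_card_tougher

qualifying, tougher = simulate()
print(f"{tougher} of {qualifying} qualifying leagues: "
      "wild card had the tougher division-mates")
```

In this toy setup the bottom three of each division are drawn from the same pool, so the tilt toward the wild card comes almost entirely from the one thing we know for certain: it shares a division with a team better than itself.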

That’s great in theory, but does it hold up in practice? To find out, I used ESPN’s strength of schedule numbers. These are the weighted average record of a team’s opponents, based on how many times they played (and factoring out games the opponent played against the first team, to make sure you aren’t penalizing a good team for making their opponents look even worse). The numbers only go back to 2002, but that’s still a good amount to look at.
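As a sketch of that calculation as described here (not necessarily ESPN's exact formula), the function below takes each opponent's record, strips out the games played against the team in question, and weights the result by games played. The function name and the two-opponent schedule are hypothetical.

```python
def strength_of_schedule(opponents):
    """Weighted average of opponents' winning percentages.

    Each entry is (opp_wins, opp_losses, wins_vs_opp, losses_vs_opp),
    where the last two are the team's own results against that opponent.
    """
    weighted_total = 0.0
    games_total = 0
    for opp_wins, opp_losses, wins_vs, losses_vs in opponents:
        # Remove the head-to-head games so a good team is not penalized
        # for dragging down its opponents' records.
        adj_wins = opp_wins - losses_vs
        adj_losses = opp_losses - wins_vs
        adj_pct = adj_wins / (adj_wins + adj_losses)
        games = wins_vs + losses_vs
        weighted_total += games * adj_pct
        games_total += games
    return weighted_total / games_total

# Hypothetical schedule: 19 games against a 90-72 rival, 7 against an 81-81 team.
schedule = [
    (90, 72, 12, 7),
    (81, 81, 3, 4),
]
sos = strength_of_schedule(schedule)
print(f"Strength of schedule: {sos:.3f}")
```

Weighting by games played is what lets a 19-game divisional rival count for far more than a team seen once or twice a year.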

And in that time, I made an interesting discovery; teams face relatively balanced schedules. That’s not to say totally balanced; the average difference in strength of schedule (measured as opponents’ winning percentage) between the toughest schedule and the weakest came out to .027. That translates to about 4.4 wins, or about the difference between the Phillies and Braves this season.

But that number isn’t entirely what we’re looking at; the hardest schedule and the weakest schedule rarely come from the same league, and the two leagues are competing for different playoff spots. When we separate by league, we get the following:

Average AL Range: .019 (3.1 wins)
Average AL Standard Deviation: .0056 (0.9 wins)
Average NL Range: .021 (3.4 wins)
Average NL Standard Deviation: .0062 (1.0 wins)
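The win equivalents are just the gap in opponents' winning percentage multiplied over a 162-game season. A minimal sketch of the conversion, using the rounded figures quoted above (the helper name `to_wins` is made up for the example):

```python
GAMES = 162  # length of the regular season

def to_wins(pct_gap):
    """Convert a gap in opponents' winning percentage into wins over a season."""
    return pct_gap * GAMES

spreads = {
    "Overall range": 0.027,
    "AL range": 0.019,
    "AL standard deviation": 0.0056,
    "NL range": 0.021,
    "NL standard deviation": 0.0062,
}

for label, gap in spreads.items():
    print(f"{label}: {gap:.4f} -> {to_wins(gap):.1f} wins")
```

Because the inputs here are already rounded, the last digit can differ slightly from a figure computed on the unrounded data.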

So over a full season, a large majority of teams in the same league have a schedule within 2 wins of each other in difficulty. This range condenses even further when you look at just the teams vying for the postseason; the average range in strength of schedule among teams finishing first or second in their division or the wild card race is .0151 in the AL and .0154 in the NL, or between 2.4 and 2.5 wins. While those numbers aren’t as equal as a perfectly balanced schedule, I’d say that’s pretty damn close, even given the logistics that need to be factored in around the current schedule.

But let’s just go back to my original hypothesis; when wild card winners have a better record than division winners, do they generally have harder schedules as well? Well, in the 28 league-seasons since 2002, there have been 20 cases where a wild card team has had an equal or better record than an AL or NL division winner. In seventeen of those, the wild card team had a harder schedule as well. Of the remaining three cases, the division winner had a harder schedule than the wild card team only once; in the other two, the wild card and the equal-or-worse division winner were tied on schedule strength.

So why not switch to an entirely record-based seeding method? To recap, we average over one wild card winner per season with a better record than a division winner, so it’s not an uncommon problem, and those wild cards are almost guaranteed to have played an equal or harder schedule than their division-winning foe (not that there’s a huge difference between teams’ schedule strengths to begin with). What reason is there beyond that to deny strong wild cards a free bye into the division series? They already showed they’ve earned it over 162 games. If we have to punish any teams with a one-game play-in for “sneaking into” the postseason, it should be weak division winners.
