Calculating Strength of Schedule

Strength of schedule is an incredibly interesting metric when one gets down into the thick of it. Like a lot of statistics, if you ask one hundred analysts to calculate strength of schedule (SOS), then you may get one hundred different answers. I want to break down the pluses and minuses of each system, and even show how systems may vary depending on the sport.

What Makes For a Strong SOS Calculation?

A few qualities separate a strong SOS method from a weak one. Here are three of them.

Accurately defining who a strong team is: First and most obviously, the system needs to be able to understand that, for instance, a Georgia Bulldogs team with a record of 9-3 is better than a UAB Blazers team with a record of 9-3.
The method should fit the sport: Strength of schedule doesn’t have a one-size-fits-all solution. A solution is much easier to come by in a 162-game season with 30 teams in Major League Baseball than in a 12-game season with hundreds of teams in college football.
Home field or home court advantage: This is next-level stuff that most systems don’t take into account. A game at Cameron Indoor Stadium is clearly much tougher than hosting Duke on one’s own home court.

Bowl Championship Series Strength of Schedule

Though the BCS is now defunct, its calculation for strength of schedule lives on. This system takes the combined record (winning percentage) of the team’s opponents and multiplies it by two. It then adds the combined record of the opponents’ opponents, weighted by one. Finally, that sum is divided by three.

Calculation of Strength of Schedule by the BCS
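
To make the arithmetic concrete, here is a minimal Python sketch of the formula as described above. The function names and the win-loss totals are hypothetical, made up purely for illustration, and it assumes each "record" is a combined (wins, losses) pair converted to a winning percentage.

```python
def win_pct(wins, losses):
    """Winning percentage for a combined win-loss record."""
    games = wins + losses
    return wins / games if games else 0.0


def bcs_sos(opp_record, opp_opp_record):
    """BCS-style SOS: (2 * opponents' pct + opponents' opponents' pct) / 3.

    Each argument is a (wins, losses) tuple of combined records.
    """
    opp = win_pct(*opp_record)
    opp_opp = win_pct(*opp_opp_record)
    return (2 * opp + opp_opp) / 3


# Hypothetical totals: both teams' opponents went 80-64 combined,
# but the opponents' opponents fared very differently.
team_a_sos = bcs_sos((80, 64), (700, 650))
team_b_sos = bcs_sos((80, 64), (600, 750))

print(f"Team A SOS: {team_a_sos:.4f}")
print(f"Team B SOS: {team_b_sos:.4f}")
```

With identical first-level records, the only separation between the two teams comes from the second-level term, which carries one-third of the weight.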

I’d argue that this is currently the best non-rating-based SOS calculation. This method does a few things well and is a really great method when record is the only known variable. Capturing the record of the opponents’ opponents allows the formula to differentiate between two teams like my Georgia/UAB example above. At the first level, SOS has Georgia as an equal to UAB, but at the second level, the record of UAB’s opponents is highly likely to be much worse than that of Georgia’s opponents. The drawback is that this differentiation is limited. Because the second level carries only one-third of the weight, a team like Georgia and a team like UAB still end up closer together than they should.

Central Mean Strength of Schedule

This is a ratings-based strength of schedule system that gives less weight to opponents that are outliers.

From Jeff Sagarin’s Conference Ratings

I am mostly not a fan of this system, even though its results do not differ much from those of our next system. As one can see from the table, this system gives the most weight to the median opponent. This works well in a sport where there are a few opponents that are wild outliers.
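
The idea can be sketched in a few lines of Python. This is not Sagarin’s actual formula: the triangular weights below (1, 2, 3, ... rising to the middle and falling again) are an assumption chosen only to illustrate "most weight on the median opponent, least on the extremes," and the ratings are invented.

```python
def central_mean(ratings):
    """Weighted mean that favors the middle of the sorted ratings.

    Assumed weighting (illustrative only): the i-th sorted rating gets
    weight min(i + 1, n - i), so weights rise toward the median and
    fall off symmetrically toward both extremes.
    """
    if not ratings:
        raise ValueError("need at least one opponent rating")
    ordered = sorted(ratings)
    n = len(ordered)
    weights = [min(i + 1, n - i) for i in range(n)]
    weighted_total = sum(w * r for w, r in zip(weights, ordered))
    return weighted_total / sum(weights)


# Hypothetical opponent ratings with one wild low outlier (60).
opponent_ratings = [60, 80, 82, 84, 86]

plain_average = sum(opponent_ratings) / len(opponent_ratings)
central = central_mean(opponent_ratings)

print(f"Arithmetic mean: {plain_average:.2f}")
print(f"Central mean:    {central:.2f}")
```

Under these assumed weights, the outlier pulls the central mean down less than it pulls down the plain average.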

Arithmetic Mean Strength of Schedule

The arithmetic mean is a simple average of the opponents’ ratings across the course of a season. In my opinion, this is the best system out there for calculating SOS, yet it still isn’t perfect. Its one flaw shows up most often in college basketball, when a very strong team plays a very weak team or vice versa. There are different levels of “very weak” teams, and games against “strong very weak” teams can boost a team’s SOS even though victory is almost guaranteed no matter which weak opponent is on the schedule. This is an area where a system like the Central Mean SOS could be strong.
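
A small Python sketch shows the flaw. The ratings are hypothetical and the function name is mine; the point is only that swapping one hopeless opponent for a slightly less hopeless one raises the average even though an elite team would be a heavy favorite in both games.

```python
from statistics import mean


def arithmetic_mean_sos(opponent_ratings):
    """SOS as the simple average of opponent ratings."""
    if not opponent_ratings:
        raise ValueError("need at least one opponent rating")
    return mean(opponent_ratings)


# Hypothetical schedules for the same elite team. They differ only in
# the weakest opponent: rated 40 in schedule A and 55 in schedule B.
schedule_a = [90, 85, 80, 40]
schedule_b = [90, 85, 80, 55]

sos_a = arithmetic_mean_sos(schedule_a)
sos_b = arithmetic_mean_sos(schedule_b)

print(f"Schedule A SOS: {sos_a:.2f}")
print(f"Schedule B SOS: {sos_b:.2f}")
print(f"Boost from the 'strong very weak' opponent: {sos_b - sos_a:.2f}")
```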

What is the Best Strength of Schedule System?

The simple answer is that it just depends. Very basic strength of schedule systems in most major professional sports will yield similar results. More complex systems are needed as the number of games drops or the number of teams increases. A challenge in figuring out the best system is that there is no target variable to model against. Thus, it can be difficult to tell a good strength of schedule system from a bad one.