This article is part of our The Z Files series.
I'll be referring to the percentage of wins that were saved a lot, so let's abbreviate it as SV/WIN. Most of the math will be determining "r", the Pearson correlation coefficient. This is a measure of the linear correlation between two variables:
r | correlation |
1 | perfect positive linear relationship |
0 | no linear relationship |
-1 | perfect inverse linear relationship |
So, if r = 1, as one variable increases, the second increases proportionately. If r = -1, as one increases, the other decreases.
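For readers who want to see the arithmetic behind r, here is a minimal Python sketch of the standard Pearson formula. The function name `pearson_r` and the toy numbers are mine, made up purely for illustration; they are not baseball data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (numerator) and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when either variable is constant")
    return cov / sqrt(var_x * var_y)

# Made-up toy numbers, not real baseball data.
x = [1, 2, 3, 4, 5]
rising = [2 * v + 1 for v in x]    # moves in lockstep with x
falling = [10 - 3 * v for v in x]  # moves exactly opposite to x

r_up = pearson_r(x, rising)
r_down = pearson_r(x, falling)
print(round(r_up, 6), round(r_down, 6))  # prints 1.0 -1.0
```

Real data lands somewhere between those two extremes, which is where the rest of the numbers in this piece live.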
Correlating SV/WIN to runs scored over the last 20 seasons yields r = -0.62. This means the higher the run-scoring environment, the lower the SV/WIN. Since we're looking at entire seasons, we can say there are fewer saves in years with higher scoring. It's not a perfect correlation, but it's certainly strong enough to use as a basis for projections.
Chances are, you're curious what happened last season with the home run explosion leading to more runs. Sure enough, the 48.5 percent SV/WIN is the lowest in the past 20 campaigns.
Another interesting takeaway from the data is that SV/WIN ranged from 48.5 percent to 53.2 percent. This provides a logical boundary for overall projected saves.
Now let's get more granular and look at some data at the team level in each season for the past twenty years. The studies will be correlating team saves to wins and team SV/WIN to wins:
Stat | Ave r | High r | Low r |
Saves | 0.65 | 0.79 | 0.5 |
SV/WIN | -0.19 | 0.07 | -0.55 |
On a team-by-team basis, it's clear saves correlate best to wins. The relationship isn't perfect, which makes sense. One of the reasons saves are considered a crapshoot is people cherry-picking the instances when a poor team's closer accrued a lot of saves. Instead of dismissing this as luck, let's dig deeper to investigate reasons why this may have occurred.
There are three logical variables to consider: runs allowed, runs scored and run differential. Let's correlate each to team saves. Before presenting the data, which do you think will exhibit the strongest dependence? Using team-wide data for the past 20 seasons:
Runs | Ave r | High r | Low r |
Allowed | -0.55 | -0.44 | -0.71 |
Scored | 0.14 | 0.41 | -0.14 |
Differential | -0.50 | -0.32 | -0.70 |
Now it gets interesting. The least important factor is strength of offense. In some years, there's moderate correlation – the more prolific the offense, the more plentiful the saves. However, in others, there's a tiny negative relationship – the fewer runs scored, the more saves are recorded.
Perhaps even more intriguing is that runs allowed noses out run differential as the leading indicator for team saves. I say this because my guess was that run differential would be the most important factor. It's close, but the quality of the pitching staff is the most relevant contributor.
The next logical question is whether quality of starters or relievers drives the correlation, or perhaps neither. Using the same 20-season sample, the coefficient for starters is -0.46 compared to -0.49 for relievers. There's a slight lean to quality of bullpen, but it isn't much.
While running those correlations, I wondered if innings per start mattered. It does, but only a little as evidenced by an r of 0.25.
When doing projections, it's crucial to have logical boundaries. Looking ahead, my formula for projecting saves is team wins multiplied by SV/WIN. Here's a table showing SV/WIN levels for the past two decades:
SV/WIN | Number |
greater than .7 | 1 |
.65 to .7 | 11 |
.6 to .65 | 46 |
.55 to .6 | 104 |
.5 to .55 | 144 |
.45 to .5 | 174 |
.4 to .45 | 80 |
.35 to .4 | 35 |
.3 to .35 | 5 |
There will always be outliers, but going by the table, 318 of the 600 team seasons (53 percent) saved between 45 and 55 percent of their wins, and 70 percent landed between 45 and 60 percent. The 45 to 55 percent band still seems like a reasonable guideline for projection purposes. Keep in mind, this is independent of the number of wins, just the proportion saved.
Let's summarize the factors making projecting saves something less than a crapshoot:
- The higher the scoring league-wide, the fewer the saves
- For the season, SV/WIN ranges from 49 to 53 percent
- The more team wins, the more saves
- Quality of offense is moot
- The lower the team ERA, the more saves
- Quality of relievers matters only slightly more than quality of starters
- Average innings per start has a very small effect
- Individual teams will save between 45 and 55 percent of their wins
This is enough to come up with a plausible number of saves per team. I'll freely admit, there are others with a deeper statistical background capable of turning this data into an elegant algorithm. However, given the accuracy with which team wins and team ERA can be predicted, I'm comfortable not going back to school to bolster my statistical knowledge. Given the above constraints, my method of adjusting the data manually will yield results within a couple of saves of an intricate formula. This is close enough for me. A difference of a save or two isn't going to alter a ranking, especially since I'm the leader of the "closers do more than get saves" crusade.
I start with projected wins from betting odds. There's a reason for all those extravagant buildings with the pretty lights, waterfalls, statues and the like on the Las Vegas Strip.
To get a team's SV/WIN, I start with the league-wide expectation for offense. Scoring has increased the previous three seasons. Perhaps it continues, though I expect a small drop-off. Regardless, I don't see a huge change in either direction, so baselining each team's SV/WIN at 49 percent feels logical. Big picture, the number of runs will be high, resulting in a SV/WIN at the low end of the boundary.
Next, I estimate each team's ERA, using my projections. Each squad's SV/WIN is massaged accordingly, staying within the 45 to 55 percent limits.
At this point, everything needed to project team saves is available. As cited above, just take projected wins and multiply by SV/WIN.
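As a rough illustration of that multiplication, here is a short Python sketch. The 49 percent baseline and the 45 to 55 percent limits come from the text above. The `era_adjustment` parameter is a hypothetical stand-in for the manual ERA-based nudge, which has no stated formula, and the sketch assumes a better staff pushes SV/WIN up. The win totals and adjustment sizes are invented for the example.

```python
SV_WIN_FLOOR = 0.45     # lower team-level limit from the article
SV_WIN_CEILING = 0.55   # upper team-level limit from the article
BASELINE_SV_WIN = 0.49  # league-wide starting point chosen in the article

def project_team_saves(projected_wins, era_adjustment=0.0):
    """Projected wins times SV/WIN, with SV/WIN kept inside the limits.

    era_adjustment is a hypothetical knob standing in for the manual,
    ERA-based nudge; the article does not give a formula for its size.
    """
    sv_win = BASELINE_SV_WIN + era_adjustment
    # Clamp so no team leaves the 45 to 55 percent band.
    sv_win = max(SV_WIN_FLOOR, min(SV_WIN_CEILING, sv_win))
    return projected_wins * sv_win

# Illustrative inputs only: a 90-win team with a good staff (+0.03)
# and a 70-win team whose nudge would overshoot the floor (-0.10).
good_staff = project_team_saves(90, era_adjustment=0.03)
poor_staff = project_team_saves(70, era_adjustment=-0.10)
print(round(good_staff, 1), round(poor_staff, 1))  # prints 46.8 31.5
```

The clamp is the part doing the real work: however extreme the ERA projection, the team's SV/WIN never drifts outside the historical band.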
We're almost there. Obviously, the endgame is player projections. Looking at historical data, most full-time closers earn about 90 percent of their team's saves. Some achieve even more, but it's rare I project an individual for more than 90 percent of the total. There needs to be a strong recent usage trend to expect more than 90 percent, with 93 percent being the maximum. So, instead of projecting saves for each reliever, I project the percentage of team saves I expect, then the number of saves follows from the above formula.
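The last step can be sketched the same way. The 90 percent typical share and the 93 percent ceiling are from the paragraph above. The function name, the 90-win team and the 7 percent share for a second reliever are assumptions for illustration.

```python
MAX_SHARE = 0.93      # the article's ceiling for any one reliever
TYPICAL_SHARE = 0.90  # about what most full-time closers earn

def project_closer_saves(team_saves, share=TYPICAL_SHARE):
    """Allocate a share of projected team saves to one reliever."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    # Never hand a single reliever more than the 93 percent ceiling.
    return team_saves * min(share, MAX_SHARE)

# Hypothetical bullpen: 44.1 projected team saves (90 wins at 49 percent).
team_saves = 90 * 0.49
closer = project_closer_saves(team_saves)
setup_man = project_closer_saves(team_saves, share=0.07)
print(round(closer, 1), round(setup_man, 1))  # prints 39.7 3.1
```

Working in shares rather than raw saves keeps the individual projections tied to the team total, so a change in projected wins flows through to every reliever automatically.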
The method isn't perfect, but neither are the correlations. There's always variance, leading to outliers. Projecting team wins and ERA carries its own error, which carries over into the saves projection. However, looking at the big picture, I'm more comfortable applying some level of defensible objectivity as opposed to using a more subjective method, seasoned by potentially false assumptions.
More and more leagues are incorporating holds into their scoring. By way of a brief summary, holds correlate fairly well to saves, which is not surprising. Something a little odd is they don't correlate well to bullpen ERA. There's some correlation, but not as strong as one may intuit. So, while there are guidelines with regard to holds, it's much more team-oriented and driven by managerial tendencies.