Win Probability Added



PaulieIsAwesome
07-13-06, 02:08 AM
I want to jump off a discussion PricklyPete and I were having in my "how valuable is Ortiz?" thread. We were discussing the relative merits of WPA. I just read a big summary post by MGL on his and Tangotiger's blog. MGL hates it, and Tango really loves it (much of the non-graphical work you see on the internet about it comes from Tango).

Here's what MGL says:

I’ll take this opportunity to say that I can’t stand WPA, as now everyone is assuming that the new sabermetric uber-stat takes into consideration hitting (or pitching) in the clutch.
I’ve said this a million times and I’ll say it again.
One, since every saber researcher and their mother have shown that an ability to perform better or worse than the average player in the clutch is at best a small skill that very few players possess, and hence has almost no predictive value, waazz the point of WPA??
Two, “the clutch” is only “the clutch” if the player perceives it as such. Players as a general rule know when games are on the line, but there are many situations that WPA sees as “clutch,” that a player would not, and vice versa. For example, everyone must think that a 2 or 3 run lead in the 9th is a “clutch” situation, or why would managers feel a need to bring in their best pitchers?
Three, if you want to come up with a stat to reward performance regardless of whether it was predicated on “luck” or skill or some combination thereof, it makes no sense to reward performance which leads to a loss. WPA of course, while it rewards large positive changes in WP and give demerits for large negative changes in WP, is silent about whether the game was actually won or lost by the player’s team or even whether a run was even scored or not.
If we are going to reward “lucky” behavior that leads to a win, fine, I have no problem with that. If we are going to reward true (context-neutral) skill (like lwts) regardless of the outcome, I have no problem with that either.
To reward luck-based performance that leads to a loss is ridiculous in my opinion.
Anyway, I hate WPA and find it utterly useless. Period.

So he really has two points here:

1. WPA shouldn't undermine all of the previous research on clutch, since (a) that research definitively demonstrated that clutch is a minor factor in baseball performance, and (b) WPA captures a whole lot that most baseball players would not identify as clutch (which is the whole point of the exercise).

2. WPA is in fact a luck-based statistic that tries to encompass all non-luck-based performance.

Tangotiger immediately posted a followup defending WPA (for every reason you would think).

"Guy" then responds with another absolutely fantastic point. WPA is supposed to be all about rewarding the actual hits and plays that lead to a team winning. However, it doesn't actually record whether a play in fact leads to a run. If Jeter hits a leadoff double in a tie game in the ninth, he gets the same amount of WPA whether or not he actually scores and contributes to his team's winning effort, which is supposedly the entire point of WPA.

I've been really critical in this post about WPA, but I should temper it a little. MGL does come out and say: It's pretty good at what it does, and should, maybe, be used in MVP calculations. But that's it.

Read all about it at:
http://www.insidethebook.com/ee/index.php/site/comments/win_expectancy_in_the_mainstream/#comments

Kceracerone
07-14-06, 11:46 AM
Personally, I love this stat. I think it does a fairly solid job of quantifying who has been performing when the game is on the line. I definitely think the writers should use this stat in determining who the MVP should be rather than relying on what they saw on SportsCenter. It provides context for the numbers that the players have put up and enables you to quantify the value a certain player has provided to a team. I realize that it has holes in that it doesn't include fielding and baserunning, but I've never heard an argument for MVP start with "nobody runs the bases like ..."

That being said, I don't think that this stat is very useful when evaluating how good a player is given its limitations in predicting future performance. It seems counterintuitive that players can raise their game when they are in a big spot. The only value I could see is identifying players that have a history of coming up small in pressure situations.

buckyjacobson
07-14-06, 01:57 PM
If the stat actually measures what it claims to, I think it's quite valuable. Not necessarily from the standpoint of whether a given player is "clutch," but in terms of measuring actual contribution. A player may not be "clutch," but if he performs in clutch situations during a given season, and we're talking about that season, that's all that matters -- the fact that he likely will revert to the mean in the future is irrelevant.

PaulieIsAwesome
07-14-06, 02:12 PM
Well, the problem that MGL highlights is with people who like to talk about how it captures the actual, context-dependent performance that leads to wins. If that's what it actually showed, then you should only count performance in games that your team won. Or only count performance that leads to runs: eliminate hits that don't drive in runs and don't end with you being driven in, since you didn't actually help your team score any runs in that inning.

Soriambi
07-14-06, 03:14 PM
I'm not sure that I agree with MGL on his point about whether the guy scores or not or whether the team wins or not. If a team is behind 5-4 in the bottom of the 9th inning with two outs and a player hits a double and then:

A. Is driven in on a two-run HR, winning the game OR
B. Is stranded there when the next hitter Ks

was the double any more or less valuable in situation A than in situation B? At the time it happened, assuming all other variables are equal, it gave the team the same chance to win the game in A as it did in B. I don't think that a player should be punished for hitting the double if the guy after him, who he has no control over, makes an out. He raised the team's chances to win equally in both scenarios; they just won in one and didn't win in the other. In other words, the hit adjusted the probability of winning in the same way both times.
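
Here is the same argument as a small sketch: WPA credits a play with the change in win expectancy at the moment it happens, so the double earns the same credit no matter what the next hitter does. The win-expectancy numbers below are made-up placeholders for illustration, not values from walkoffbalk.com or any real table, and the function names are hypothetical.

```python
# Hypothetical win expectancies for the home team, trailing 5-4,
# bottom of the 9th, two outs. Placeholder numbers for illustration only.
WE_BEFORE_DOUBLE = 0.05   # bases empty (assumed value)
WE_AFTER_DOUBLE = 0.14    # runner on second (assumed value)


def wpa(we_before, we_after):
    """WPA for one play: win expectancy after minus win expectancy before."""
    return we_after - we_before


def credit_by_player(plays):
    """Sum WPA per player from (player, we_before, we_after) tuples."""
    totals = {}
    for player, before, after in plays:
        totals[player] = totals.get(player, 0.0) + wpa(before, after)
    return totals


# Scenario A: the next hitter homers and the game is won (WE goes to 1.0).
scenario_a = [("doubler", WE_BEFORE_DOUBLE, WE_AFTER_DOUBLE),
              ("next_hitter", WE_AFTER_DOUBLE, 1.0)]
# Scenario B: the next hitter strikes out and the game is lost (WE goes to 0.0).
scenario_b = [("doubler", WE_BEFORE_DOUBLE, WE_AFTER_DOUBLE),
              ("next_hitter", WE_AFTER_DOUBLE, 0.0)]

credit_a = credit_by_player(scenario_a)
credit_b = credit_by_player(scenario_b)

# The doubler's credit is identical in both scenarios.
print(credit_a["doubler"], credit_b["doubler"])
# The outcome of the game shows up only in the next hitter's line.
print(credit_a["next_hitter"], credit_b["next_hitter"])
```

Whatever numbers you plug in, the win or the loss lands entirely on the hitter who followed the double.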

PaulieIsAwesome
07-14-06, 03:41 PM
This, I think, is MGL's point:

Why should a player be credited any more if his double comes with a runner on first (which he had no control over) in the bottom of the ninth inning (which he had little control over) than if it comes in the 7th with no one on?

Soriambi
07-14-06, 03:50 PM
I agree with that part of it, but he also seems to be arguing that it's ridiculous that a guy who doesn't score after his double gets credited the same as a guy who does score, which I disagree with. I base that on the part where he says "WPA of course, while it rewards large positive changes in WP and give demerits for large negative changes in WP, is silent about whether the game was actually won or lost by the player’s team or even whether a run was even scored or not." It seems to me like he's criticizing that.

buckyjacobson
07-14-06, 07:45 PM
This, I think, is MGL's point:

Why should a player be credited any more if his double comes with a runner on first (which he had no control over) in the bottom of the ninth inning (which he had little control over) than if it comes in the 7th with no one on?

Why does control of the situation matter?

What matters is how the player performs within the situations he winds up in. If you're trying to compare how player A might perform in the future versus player B, then variations in their situations matter, but if you're looking back on a season (or any period of time) and trying to evaluate who contributed more, control of their circumstances is irrelevant.

As for winning, it's about who contributed the most toward the team's chances of winning. Otherwise, we should look at all stats in that light, only looking at performance in wins. And when doing so, a HR in the bottom of the 9th when already up 10-0 would be more valuable than a game-tying HR in the 9th in a game the team proceeds to lose.

EvanJ
07-15-06, 09:18 AM
If you're up 10-0 you won't bat in the bottom of the ninth.

buckyjacobson
07-15-06, 12:04 PM
lol This is true.

Soriambi
07-19-06, 03:30 PM
Another problem that I discovered with WPA:

According to the numbers at walkoffbalk.com, which allows you to calculate WPA for any situation, there are some sample-size issues with WPA. (Well, that's not according to the site, that's according to me looking at the site.) For instance, if your team is down one run in the bottom of the 9th inning with 2 outs, it gives your team a better chance to win the game with runners on first and second (18% chance to win) than with runners on second and third (16.3% chance to win). Obviously, if you have the tying and winning runs in scoring position, you have a better chance to win than with the tying run in scoring position and the winning run on first. In addition, having runners on 2nd and 3rd means that there's no force at second or third, which would make a win even more likely.

Tom Tango's Run Expectancy table shows this (.466 runs expected with runners on 1st and 2nd and two outs, .634 runs expected with runners on 2nd and 3rd and two outs), and so does his Run Frequency table. With runners on 2nd and 3rd and two outs, teams score 1 run 5.4% of the time, 2 runs 14.1% of the time (so 19.5% of the time they score once or twice), and more than 2 runs 8.1% of the time. With men on first and second and two down, they get one run 10.6% of the time, two runs 5.8% of the time (so they score 1 or 2 runs 16.4% of the time), and more than two runs 6.7% of the time.
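
As a rough sketch of how those run frequencies could be turned into a win chance for this exact spot (home team down one, bottom of the 9th, two outs): scoring exactly one run only ties the game, while two or more wins it. The 50% extra-innings figure is an assumption, and the run frequency numbers are all-innings averages taken from this post, so this is only an approximation of what a real win expectancy table would do.

```python
# Run frequency figures quoted in this post (two outs).
RUN_FREQ = {
    "2nd_and_3rd": {"one": 0.054, "two": 0.141, "more": 0.081},
    "1st_and_2nd": {"one": 0.106, "two": 0.058, "more": 0.067},
}

# Assumption: a game that goes to extra innings is a coin flip.
EXTRA_INNINGS_WIN = 0.5


def win_chance_down_one(freq, extra=EXTRA_INNINGS_WIN):
    """Approximate win chance when trailing by one in the bottom of the 9th."""
    tie = freq["one"]                       # exactly one run ties the game
    walk_off = freq["two"] + freq["more"]   # two or more runs wins it outright
    return walk_off + extra * tie


for state, freq in RUN_FREQ.items():
    print(state, round(win_chance_down_one(freq), 3))
```

Under those assumptions this comes out to about 24.9% with runners on 2nd and 3rd and about 17.8% with runners on 1st and 2nd. The second figure is close to the site's 18%, which suggests the 16.3% is the outlier.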

So those specific situations in the WPA seem to be backwards based on both common sense and the run charts. I'm not sure how many other situations like this there are, and I'm not sure whether walkoffbalk.com uses the same formula as, say, fangraphs.com. But I can't help but think that they could incorporate the run frequency and run expectancy charts into the formula somehow to make the probabilities more accurate and to protect against small-sample-size problems like this. As of now, it seems like they just look at all of the games that have happened, count how many games the teams have won in those situations, and give you the probability based on that. Like I said, that can cause sample-size problems when there have been, say, only 202 total games over the last 25 years (1979-2004, according to walkoffbalk.com) with men on 2nd and 3rd, 2 out in the 9th, and a team down by one. I'm not nearly as mathematically inclined as I would have to be to try to do something like that, though. :lol:
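
To get a feel for how noisy 202 games is, here is a minimal sketch using the standard normal-approximation interval for a proportion. It assumes the 16.3% figure was computed from exactly those 202 games, which the site does not confirm, and the function name is just illustrative.

```python
import math


def binomial_interval(p_hat, n, z=1.96):
    """Normal-approximation 95% interval for an observed win rate."""
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)   # standard error of the rate
    return p_hat - z * se, p_hat + z * se


# 16.3% observed win rate over 202 games (runners on 2nd and 3rd).
low, high = binomial_interval(0.163, 202)
print(f"{low:.3f} to {high:.3f}")
```

The interval runs from roughly 11% to 21%, so the 18% figure for runners on 1st and 2nd sits comfortably inside it. With a sample that small, the table can't really tell the two situations apart.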

PaulieIsAwesome
07-19-06, 03:54 PM
Yeah, I noticed something related to that once: there was some situation, where I was screwing around with the table on walkoffbalk, where in the 6th or something, if you're up 4 runs, a single is detrimental to your team's winning percentage.

All of your suggestions for bettering it are right on, and someone should do it. It will be pretty time consuming and difficult.

homer2931
07-19-06, 04:35 PM
Keith Woolner made a win expectancy chart based on math instead of game results in Baseball Prospectus 2005. I'm not sure if fangraphs uses it, but it does exist.

buckyjacobson
07-19-06, 06:11 PM
Another problem that I discovered with WPA:

According to the numbers at walkoffbalk.com, which allows you to calculate WPA for any situation, there are some issues with sample size with WPA. ...

From what I've read, this is probably the biggest flaw -- i.e., a problem of execution rather than concept. A pretty severe flaw, though. (Of course, I often wonder about many other sabermetrics people employ -- for stats that allege to calculate runs and wins a player contributes, at the end of this year, if we summed the runs for each team's players, would they end up being close to the team's actual runs scored and games won?)
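
For WPA specifically, the bookkeeping should add up by construction. Assuming every game starts at a 50% win expectancy and every change gets credited to someone on the team, a team's play-by-play WPA sums to +0.5 in a win and -0.5 in a loss, so the season total is (wins - losses) / 2. A minimal sketch, with invented win-expectancy paths and hypothetical names:

```python
def team_wpa_for_game(we_path):
    """Sum play-by-play WPA for one team given its win-expectancy path.

    we_path starts at 0.5 and ends at 1.0 (win) or 0.0 (loss).
    """
    return sum(after - before for before, after in zip(we_path, we_path[1:]))


# Hypothetical win-expectancy paths (values invented for illustration).
games = [
    [0.5, 0.62, 0.40, 0.75, 1.0],   # win
    [0.5, 0.35, 0.55, 0.20, 0.0],   # loss
    [0.5, 0.80, 0.95, 1.0],         # win
]

season_wpa = sum(team_wpa_for_game(path) for path in games)
wins = sum(1 for path in games if path[-1] == 1.0)
losses = len(games) - wins

# The per-play changes telescope, so the two numbers match
# (up to floating-point rounding).
print(round(season_wpa, 6), (wins - losses) / 2)
```

This says nothing about whether the underlying win expectancy table is accurate, only that the individual credits always reconcile with the team's record.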

PaulieIsAwesome
07-24-06, 01:37 AM
Dave Studeman did exactly what you are talking about with your last point last week:

http://www.hardballtimes.com/main/article/what-wpa-can-tell-us-about-teams/

In this week's column, he highlights another, enormous problem with WPA: unadjusted, it has a tremendous imbalance towards relievers:

http://www.hardballtimes.com/main/article/what-wpa-can-tell-us-about-players1/

He offers a few ways to adjust WPA to address these concerns.

PaulieIsAwesome
09-14-06, 01:14 AM
Found some new stuff, so I dug up this old thread:

There's a great thread on Baseball Think Factory recently about WPA. http://www.baseballthinkfactory.org/files/newsstand/discussion/espn_rogers2/

Post 40 by Smitty delves into some numbers, and comes up with some pretty interesting results.

Another person in the thread brought up an issue I thought about yesterday: if it hadn't been Ortiz/A-Rod last year, and instead had been, I don't know, Jermaine Dye, or Vlad Guerrero, exhibiting that huge WPA advantage, would WPA be as big in the SABR community as it is now? The Red Sox have probably the largest fanbase on the internet, and their fans are very well represented among posters at Think Factory and Hardball Times. The fact that a member of the RS was helped out by this new fangled success, I believe, helped drive the popularization of this stat more than it would have otherwise.

WHIP
09-14-06, 10:09 PM
There is a nail. And there is a head of this nail.

And you hit the head of this nail. In your post.

BronxByTheBay
09-15-06, 01:36 PM
What a tortured path you took to saying you agreed with his post. What IS wrong with you?

WHIP
09-17-06, 11:46 AM
Gay men now possess a lot of rights in our wonderful nation, the United States of America. For example, they now possess the right to marry in the State of Massachusetts. They are enjoying the freedoms of democratic governance.

You are one such man who is enjoying the new freedoms of democratic governance.

ojo
09-19-06, 09:43 AM
Found some new stuff, so I dug up this old thread:

There's a great thread on Baseball Think Factory recently about WPA. http://www.baseballthinkfactory.org/files/newsstand/discussion/espn_rogers2/

Post 40 by Smitty delves into some numbers, and comes up with some pretty interesting results.

Another person in the thread brought up an issue I thought about yesterday: if it hadn't been Ortiz/A-Rod last year, and instead had been, I don't know, Jermaine Dye, or Vlad Guerrero, exhibiting that huge WPA advantage, would WPA be as big in the SABR community as it is now? The Red Sox have probably the largest fanbase on the internet, and their fans are very well represented among posters at Think Factory and Hardball Times. The fact that a member of the RS was helped out by this new fangled success, I believe, helped drive the popularization of this stat more than it would have otherwise.

I wouldn't assume for a second that the Red Sox have the largest following on the internet. That is essentially saying they have more fans than the Yankees do, which is clearly not true.

BronxByTheBay
09-19-06, 12:15 PM
Gay men now possess a lot of rights in our wonderful nation, the United States of America. For example, they now possess the right to marry in the State of Massachusetts. They are enjoying the freedoms of democratic governance.

You are one such man who is enjoying the new freedoms of democratic governance.

You saying I hate America now?

PaulieIsAwesome
09-19-06, 01:13 PM
I wouldn't assume for a second that the Red Sox have the largest following on the internet. That is essentially saying they have more fans than the Yankees do, which is clearly not true.

Well, within the baseball sabr community, as represented on sites like Hardball Times, Baseball Prospectus, and Baseball Think Factory, while there is a certain Yankee component, there's also a huge Red Sox fan base, probably even bigger than the Yankees'. There are a number of reasons for that, the chief one being Red Sox management's reliance on sabermetric principles.

FarWestNewYork
10-08-06, 06:18 PM
Hmmmmm

gdn
10-08-06, 06:19 PM
Uh???