  1. #1

    A Close Analysis of Yankee Projections.

    This thread is meant as a scientific analysis of poster Hughes2.50’s projection system. He has not disclosed his full methodology, but he has provided most of the information here in the infamous “post 93.” Further information comes from subsequent posts in that thread, and quotes here are from that thread.

    The point of this thread is not to attack a member of this community or to belittle his opinions. My goal is to better understand how he has come to his end results and to examine how his methods compare to the standard methods of projecting pitchers. It is meant as a critique of his method, so that everyone here might understand it and where applicable it might be improved.

    It should be noted that “critique” is neither inherently positive nor negative; I hope everyone who participates in this thread will be able to approach it with the attitude of scientific neutrality with which it was intended. From here I will attempt to transcribe Hughes2.50’s (from now on referred to as H2.5) method from the original post #93. The original post is somewhat disjointed, so it is my hope to provide some clarity by presenting the steps in order where possible. I have also edited some of his quotes as they appear here for spelling and punctuation. The section headings provided below are to help organize the entire enterprise (responses can be directed to or parsed by the appropriate section).

    I apologize for the length, but I think it is necessary to be both thorough and complete so as to be fair to all concerned.


    Section 1

    The start of H2.5’s method is to combine Clay Dreslough’s Defense-Independent Component ERA (DICE) with the MLEs provided by minorleaguesplits.com.

    I have some comments on this so far:

    1. The formula for DICE given in the post is incorrect; the actual formula for DICE is (13HR + 3(BB + HBP) - 2K) / IP + 3.00. I think this is merely a typo, but if it is not, the formula listed would lead to very different results, as the 3.00 runs are included in the numerator rather than added to the final line to make it look more like an ERA.

    2. I am curious as to the choice of Dreslough’s DIPS formula as opposed to FIP (either TangoTiger’s version or The Hardball Times’ version). There are subtle differences in the formulas, but the largest difference is that the latter two use a modifier of 3.20 instead of 3.00 to make the result of the base equation look like ERA. Obviously the use of Dreslough’s formula will give a lower number than FIP (depending on how many HBP the pitcher has, but since this looks at top pitching prospects, I think it’s safe to assume that they will not have a great number of hit batsmen).

    3. At this point I think it’s important to point out that the MLEs are not predictive of future performance. They are intended to translate a specific performance at the minor league level into numbers at the ML level.

    4. The most important point is that H2.5 has not specified in which order he combines the above statistics. I am assuming that he takes the MLE translation of the raw minor league counting stats (HR, BB, K) and then plugs those into the DIPS formula, but he hasn’t stated so explicitly. Any other combination of translations should produce skewed results.
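
    To make comment 1 concrete, here is a minimal sketch of the DICE calculation next to the mis-stated version with the constant inside the numerator. The stat line is made up for illustration and is not any real pitcher's MLE; the third function only swaps in the 3.20 constant from comment 2 and ignores the subtler differences between DICE and FIP.

```python
def dice(hr, bb, hbp, k, ip):
    """DICE: the 3.00 constant is added after dividing by innings pitched."""
    return 3.00 + (13 * hr + 3 * (bb + hbp) - 2 * k) / ip


def dice_constant_in_numerator(hr, bb, hbp, k, ip):
    """The mis-stated form, with 3.00 folded into the numerator."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k + 3.00) / ip


def same_skeleton_with_320(hr, bb, hbp, k, ip):
    """Same skeleton with a 3.20 constant (a simplification, not full FIP)."""
    return 3.20 + (13 * hr + 3 * (bb + hbp) - 2 * k) / ip


# Hypothetical MLE-translated line, invented for this example.
line = dict(hr=10, bb=40, hbp=5, k=150, ip=150.0)

print(f"DICE:                  {dice(**line):.2f}")
print(f"constant in numerator: {dice_constant_in_numerator(**line):.2f}")
print(f"3.20 constant:         {same_skeleton_with_320(**line):.2f}")
```

    With the constant in the numerator the result no longer sits on an ERA-like scale at all, which is why I assume the version in the post is a typo.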

    Section 2

    With these numbers in hand H2.5 generated a list of the top minor league pitchers (see post for a list starting with Adenhart and ending with Ohlendorf) who
    Quote Originally Posted by Hughes2.50
    …pitched in the minors last year, pitch for American League teams, and generated results better than league average ERA+ > 100.
    The formula for ERA+ is 100 * (league average ERA/pitcher’s ERA). H2.5 sets the league average ERA at 4.50.

    This is the first major problem in his method. He is arbitrarily setting the league average ERA at 4.50. ERA+ is meant to set a pitcher’s individual performance against the rest of the pitchers in his league. His source data at minorleaguesplits.com does not provide league ERAs, or even park adjustments. Since different minor leagues tend to vary widely as to whether they favor offense or pitching, it is impossible to produce a meaningful statistic by choosing to set ERA at what is not quite a random number (see below).

    To give an example of how this adjustment works let’s consider the single best year by ERA+ of the modern era (actually the second best season of all time, behind Tim Keefe in 1880), Pedro Martinez’s 2000 season. He had a 1.74 ERA vs. a 4.97 league average ERA for a 285 ERA+. If the league ERA is arbitrarily set at 4.50 he winds up with a 259 ERA+, which is still good, but drops him into a tie with Bob Gibson in 7th place among single season leaders, so the difference is significant.
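
    The arithmetic behind that example, as a short sketch. It uses only the bare ratio given above; as far as I know, published ERA+ figures also fold in park adjustments, so they need not match this ratio to the last digit.

```python
def era_plus(league_era, pitcher_era):
    """ERA+ as defined above: 100 * (league average ERA / pitcher's ERA)."""
    return 100 * league_era / pitcher_era


PEDRO_2000_ERA = 1.74

actual_league = era_plus(4.97, PEDRO_2000_ERA)  # real league average
fixed_league = era_plus(4.50, PEDRO_2000_ERA)   # H2.5's fixed 4.50

print(f"league ERA 4.97: {actual_league:.1f}")
print(f"league ERA 4.50: {fixed_league:.1f}")
print(f"points lost:     {actual_league - fixed_league:.1f}")
```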

    Now 4.50 is not a bad guess to set the league average ERA at; the laERA that Pedro pitched against over his career is 4.49. However, the pitchers on this list are pitching in very different environments, and so far H2.5 has not accounted for league and park adjustments in his method. Using the actual minor league average ERAs, no matter how tedious to compile, would go a long way towards accounting for this. Again, using Pedro Martinez as an example, the laERA has had a decent amount of variation over the course of his career. Using just his time in the AL, it varied between 4.42 and 5.07. Is there any reason to not expect the sum total of minor league performances to have just as wide (if not wider) variation?
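
    To show how much that range matters, here is a small sketch that holds one hypothetical pitcher's ERA fixed and moves only the league average across the AL range quoted above. The 3.50 ERA is an invented number.

```python
def era_plus(league_era, pitcher_era):
    """ERA+ as a bare ratio, with no park adjustment."""
    return 100 * league_era / pitcher_era


PITCHER_ERA = 3.50                # hypothetical pitcher, same ERA everywhere
league_eras = (4.42, 4.50, 5.07)  # AL low, the fixed guess, AL high

results = [(lg, era_plus(lg, PITCHER_ERA)) for lg in league_eras]
for lg, value in results:
    print(f"league ERA {lg:.2f} -> ERA+ {value:.1f}")

spread = max(v for _, v in results) - min(v for _, v in results)
print(f"spread: {spread:.1f} points")
```

    The same pitching line moves by well over a dozen points of ERA+ depending on the environment, which is the adjustment a single fixed 4.50 cannot capture.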

    Section 3

    I must confess that here’s where H2.5 loses me. I had been able to follow what he was doing up until now, if not exactly why he was doing it. But here at the end he has numbers jumping in and out of various parts of his formula, so I’ll provide his text from the original post in two sections:

    Quote Originally Posted by Hughes2.50
    … and then I calculate his career ERA by taking his raw walk and hit batter rate for the year and replace his MLE's with those numbers (Pitchers generally improve their command for the bulk of their career over the MLE values provided in any given year)
    This is where things get very problematic. H2.5 is extrapolating career numbers based on one season’s worth of minor league data. I can only assume that he’s using the player’s “raw walk and hit batter rates” in the DIPS formula shown in section 1 (along with his HR and K rates?).

    I cannot find any reference on any sabermetric (or other) site that would suggest that these rates could be used to project an entire career at the MLB level. The stats as presented would indicate what skill set the player has (or had that particular year), but the rates themselves cannot be expected to simply continue at the same progression. This seems to be an extremely ill-advised use of a small sample size of data.

    From there he goes on:

    Quote Originally Posted by Hughes2.50
    then I calculate the difference in ERA+ between the two values and do a cubic transformation to normalize the distribution of the scores (I'll describe this in more detail if you are interested in the rationale).
    I think I speak for a great many people when I say “Yes! We would very much like to have this explained in more detail!”

    What is the purpose in calculating the different ERA+ values, and what exactly is the value in determining the difference between the single season MLE and the career number?
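
    For reference, this is my reading of the two steps quoted above, sketched with invented numbers. It assumes the raw walk and hit batter totals are swapped into the MLE line over the same innings, and that both ERA+ values use the fixed 4.50 league average; H2.5 has confirmed neither assumption.

```python
LEAGUE_ERA = 4.50  # the fixed league average from Section 2


def dice(hr, bb, hbp, k, ip):
    """DICE with the 3.00 constant outside the fraction."""
    return 3.00 + (13 * hr + 3 * (bb + hbp) - 2 * k) / ip


def era_plus(league_era, pitcher_era):
    return 100 * league_era / pitcher_era


# Hypothetical pitcher: an MLE-translated line and his raw walk/HBP totals.
mle = dict(hr=12, bb=55, hbp=6, k=130, ip=150.0)
raw_bb, raw_hbp = 40, 4

season_era = dice(**mle)                                    # single-season MLE
career_era = dice(**{**mle, "bb": raw_bb, "hbp": raw_hbp})  # "career" value

season_plus = era_plus(LEAGUE_ERA, season_era)
career_plus = era_plus(LEAGUE_ERA, career_era)
difference = career_plus - season_plus

print(f"season ERA+ {season_plus:.1f}, career ERA+ {career_plus:.1f}")
print(f"difference  {difference:.1f}")
```

    Under this reading the "career" number improves purely because the raw walk totals are lower than the translated ones, which is the step I would most like explained.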

    In all the readings and research I have done on baseball, I have never seen reference to a “cubic transformation.” It does not appear to be the basis of any statistical model on any sabermetric site. My background in mathematics is admittedly limited (although that has not hindered any other statistical reading or baseball research I have done), but I have never heard of “cubic transformations.” (Wikipedia doesn’t even have an entry on them (which is more than a little suspicious), and the Google results are mostly about obscure molecules or transitions in N-dimensional space.)

    Which leads me to question H2.5’s use of them. His goal is to “normalize the distribution of the scores,” but I can’t find any evidence that they 1. actually do that, or 2. that it is relevant, either statistically or on a performance basis, to “normalize” them in the first place.
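
    One possible reading, which H2.5 has not confirmed, is a simple power transformation: cube every score to change the shape of the distribution. The sketch below uses made-up, left-skewed scores and a hand-rolled skewness function to show what such a transformation does to the shape. It says nothing about whether doing so is justified for ERA+ differences.

```python
def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Invented, left-skewed scores (cube roots of evenly spaced numbers).
scores = [x ** (1 / 3) for x in (0, 25, 50, 75, 100)]
cubed = [s ** 3 for s in scores]

print(f"skewness before cubing: {skewness(scores):.3f}")
print(f"skewness after cubing:  {abs(skewness(cubed)):.3f}")
```

    The data here was built so that cubing makes it symmetric; real scores would not cooperate so neatly, and the transformed numbers are no longer on the ERA+ scale.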

    Section 4

    The above sections apparently end the mathematical part of the process. H2.5 continues, but it is not clear how exactly this continued commentary impacts his projection methodology. I hope he will provide those answers. I will quote him here and offer my response. Notes: The list is not quoted, but can be found in the original post. The last part of the quoted text comes from the next sequential post in the series (#94).

    Quote Originally Posted by Hughes2.50
    I mentioned here somewhere that another pitcher than Hughes got the top spot based off last year’s results. Staying true to the model, I list Adenhart first above, even though when you look at all of the data, it appeared that Hughes was the better pitcher, when using just the model to inform (something that should never be done, by the way), it is correct to list Adenhart first.
    I feel that when presenting a model it is correct to only reference the model within its own context. Adjustments made outside the model should be made after the fact and noted as such. A good example would be Nate Silver’s use of PECOTA, where he presents the original data and then points out what his system might be misinterpreting and why.

    Note: this passage also seems to indicate that there is data for more than one year. It would be very helpful if that data were added to what we have.

    Quote Originally Posted by Hughes2.50
    As to the Betances and Chamberlain estimates. When I provided those estimates in the fall of last year only Baseball America (of the commercial evaluators) had provided their top ten of the Yankees prospects. My evaluations of both pitchers were heavily dependent upon the reports of Baseball America, and other, non-commercial sources.
    This suggests that there is another component to H2.5’s method that does not use any stats whatsoever. The “scouts vs. stats” debate is often overplayed in my mind, but I feel that it stems from the fact that the middle ground between the two sides is compromised by one side looking at what did happen and the other looking for what has the potential to happen.

    In any event, we do not know how H2.5 is altering his math to take scouting into account (if indeed he is at all). I do not think projection systems should ever do this. It is better to predict an anomaly (the best example I can think of is PECOTA suggesting that Dustin Pedroia is similar to Gary Sheffield) and then address it and learn from it, than to fudge the data or math in an attempt to make it go away. (Witness Einstein and his use of the cosmological constant. Most theories produce anomalies, and they are often the window to a deeper understanding of the theory and/or nature.)

    Quote Originally Posted by Hughes2.50
    One criticism levied against those projections was that it was impossible to suggest that a pitcher who had never pitched above short-season ball could not be evaluated as better than a prospect that had pitched in the high minors. I suspect that most people if they give it a bit of thought can see why that argument is faulty. If it were true, how could baseball talent evaluators expect Andrew Miller to be an ace level prospect in the majors off 5 innings in the low minors? How could a high school pitcher get drafted before a college pitcher?
    This is logically inconsistent and does not address the issue at hand. The point is not that it is impossible to project a player with less experience to have more talent and do better than one with more experience. The point is that statistics measure what has been experienced, and that mathematical models are unlikely to make that projection (without extra help). It is not the fault of the projection system; it is a tool with its own strengths and weaknesses. But that does not mean that this (or any) particular tool should be compromised in order to make it do everything. (I wouldn’t try to turn a table saw into a can opener.)

    Quote Originally Posted by Hughes2.50
    The point here is that when scouts suggest that a pitcher has a 'once in a generation set of tools' (Betances) and has 'high end ace stuff' (Chamberlain) you have to adjust your expectations accordingly if you value the scouts’ opinions (and I do). That doesn't mean that everything will work out, but it does mean that a talent like what Betances has, should be carefully nurtured by the wise organization. Because the return on investment in such situations could be extraordinary.
    This also seems to indicate that the opinions of scouting are being shoehorned into the statistical model. Scouting is essentially an opinion. It can be an informed opinion, and a professional opinion, but it is still an opinion. I think I can speak for most of the sabermetrically inclined when I say that opinions have no place modifying the numbers within a mathematical system.

    By way of example, consider Betances. Despite having “a once in a generation set of tools” he was not taken first overall, and he fell out of the first round due to a “slow start” in the spring. Why are the opinions of other scouts who prefer other players (who were picked earlier in the draft) not reflected in the listed results? Is it even possible?

    Players often do not reach the peaks scouts see in them. This is not a fault of scouting (after all, the job of the scout is to find the possibility “where the return on investment is extraordinary”) but that does not mean that a projection system should be adjusted for such “wish casting.”

    A look at how scouting is being used by sabermetricians can be found in TangoTiger’s “Wisdom of Crowds” fan database experiment. He combines a broad range of opinions to get a reading on something that is very hard to quantify (defense). The idea is that a broad range of opinions will reduce the possibility of outliers and small samples corrupting the final data line. I am extremely doubtful that H2.5 has anything close to this level of input from professional scouts or other “non-public sources.” The most important point is that TangoTiger does not use these results to directly modify the defensive system he works on (Mitchel Lichtman’s UZR).

    Conclusion

    In order to judge how H2.5’s numbers stand up as a projection system, I would like to set my analysis of his system against what I think are axiomatic standards for any projection system.

    1. The system must have a clearly defined methodology.

    Hopefully I have helped somewhat with this, but obviously there are still questions that are up to H2.5 to answer. It should be noted that this does not mean that a system is necessarily “open source.” PECOTA’s inner workings are not available, but Nate Silver can at least explain in clear and concise language how it works. Problems that may arise from the methodology should be acknowledged as such. For example, using a simple weighted 3-2-1 system does not really work for rookies who don’t have any previous ML playing time, but that is understood.
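
    As an illustration of that last example, here is a bare 3-2-1 weighting with invented ERAs. It is only the weighting step; a real system would typically add more on top of it (regression to the mean, playing time), which this sketch leaves out.

```python
def weighted_321(seasons):
    """Weight up to three seasons 3-2-1, most recent season first.

    Returns None when there is no major league history to weight.
    """
    weights = (3, 2, 1)
    recent = seasons[:3]
    if not recent:
        return None
    used = weights[:len(recent)]
    return sum(w * s for w, s in zip(used, recent)) / sum(used)


veteran = weighted_321([3.00, 4.00, 5.00])  # hypothetical ERAs, newest first
rookie = weighted_321([])                   # no ML seasons at all

print(f"veteran: {veteran:.2f}")
print(f"rookie:  {rookie}")
```

    The rookie case returns nothing at all, which is the acknowledged limitation: the method is clear about where it cannot be applied.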

    I feel that it is important to note that he has said that his numbers require league and park adjustments (see here), but this is not described or accounted for in the original.

    Obviously, there is more work to do.

    2. The system must be complete and uniform.

    By this I mean that the system has complete data on all the players within its purview and the data is handled in the same manner for every projection. We should be able to compare any two (or more) like players (i.e. pitchers and position players). This does not rule out making judgments on the raw data set based on reasons outside the system (see the PECOTA example, above), but the data set itself should not be compromised.

    The major issue with regards to this is what appears to be the influence of H2.5’s scouting opinions on his numbers. I would hope he can provide an explanation, as without one his results must be considered flawed and suspect.

    3. The system must be testable.

    Clearly, this is a (I daresay the) major problem with H2.5’s system. We currently only have a small list of player projections, and do not have any way to check against other players outside of the data set he has provided.

    Testability does not mean that the system is expected to be 100% accurate; no system comes anywhere close to that standard. It does, however, mean that we should be able to track how players are doing against the system in order to see exactly how well it does. Without a complete roster of the projections made, it would stand to reason that the sample size is too small to declare any sort of success on the strength of the projections that happen to be accurate.
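
    Tracking a system against results can be as simple as the sketch below, which scores a set of projections with mean absolute error and root mean squared error. The projected and actual ERAs are invented; the point is only that a complete list of projections is needed before numbers like these mean anything.

```python
import math


def mean_absolute_error(projected, actual):
    """Average size of the miss, ignoring direction."""
    return sum(abs(p - a) for p, a in zip(projected, actual)) / len(projected)


def root_mean_squared_error(projected, actual):
    """Like MAE, but large misses are penalized more heavily."""
    squared = [(p - a) ** 2 for p, a in zip(projected, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical projected vs. actual ERAs for three pitchers.
projected = [3.50, 4.20, 2.90]
actual = [4.00, 4.00, 3.90]

mae = mean_absolute_error(projected, actual)
rmse = root_mean_squared_error(projected, actual)
print(f"MAE:  {mae:.3f}")
print(f"RMSE: {rmse:.3f}")
```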

    This also means that the system should be testable against nature. In the case of H2.5’s system we can compare his methods and projections against known mathematical processes and baseball populations.

    The most basic mathematical truism in baseball is to beware of drawing large conclusions on a small sample size. Unfortunately this is exactly what he has done. He has taken pitchers with only a few professional (and no major league) seasons and extrapolated their entire careers. He has not provided a mathematical basis for making that leap of faith.

    He has also projected a pair of 21 year olds to have career ERA+s greater than any other pitcher in history. While it is possible that we are about to bear witness to a new golden age of pitching, common sense would seem to caution against it, and he has not provided a suitable explanation as to why we should forgo that natural impulse.

    Fin

    I hope that this can set the groundwork for a more complete understanding of the numbers and methods discussed, and I sincerely hope that if I have made any errors they will be presented and corrected here.

    Edited to correct a few typos.
    Last edited by Munson's 'Stash; 05-29-07 at 12:36 PM.
    "The abuse on Matsuzaka's arm so far is the sort of thing Dusty Baker masturbates to at night." - OCD SS

  2. #2
    NYYF Triple Crown

    PaulieIsAwesome's Avatar
    Join Date
    Jul 2004
    Location
    Washington, DC

    Re: A Close Analysis of Yankee Projections.

    Very good post. You've summarized and added to every question I've had about Hughes2.50's projection model. Hopefully we can get him in here, and have him answer some of these fundamental questions.

  3. #3
    NYYF Legend

    gdn's Avatar
    Join Date
    Aug 2005
    Location
    your face

    Re: A Close Analysis of Yankee Projections.

    Fantastic post. His projection system is a little clearer to me now. Very well done.

  4. #4

    Re: A Close Analysis of Yankee Projections.

    Quote Originally Posted by gdn
    Fantastic post. His projection system is a little clearer to me now. Very well done.
    How do you understand it more clearly now? I think what he has made clear (in case it wasn't clear from the start) is that his projections are bogus. Even if we ignore the problems laid out above, to adjust an MLE by X% in a favorable direction is nothing but arbitrary.

    http://www.google.com/search?hl=en&q=%22cubically+transformed%22&btnG=Google+Search


    Excellent post, btw.
    Last edited by AMarshal2; 05-28-07 at 08:02 PM.

  5. #5

    Re: A Close Analysis of Yankee Projections.

    Quote Originally Posted by PaulieIsAwesome
    Very good post. You've summarized and added to every question I've had about Hughes2.50's projection model. Hopefully we can get him in here, and have him answer some of these fundamental questions.
    Don't count on it.

    But, agreed on all counts - it's nice to see this laid out so carefully and explained and, well, picked over. I'm pretty sure we can all agree that any system that projects rookie league and single-A pitchers to be hall of famers is one that needs to be picked over.

  6. #6
    NYYF Legend

    gdn's Avatar
    Join Date
    Aug 2005
    Location
    your face

    Re: A Close Analysis of Yankee Projections.

    Quote Originally Posted by AMarshal2
    How do you understand it more clearly now? I think what he has made clear (in case it wasn't clear from the start) is that his projections are bogus. Even if we ignore the problems laid out above, to adjust an MLE by X% in a favorable direction is nothing but arbitrary.

    http://www.google.com/search?hl=en&q=%22cubically+transformed%22&btnG=Google+Search


    Excellent post, btw.
    Well, before this thread, I had absolutely no clue how he was getting his projections and what he was using.

    Now I somewhat understand what he's using.

    So, yes, my understanding is a little clearer.

    Edit: As to the first few hits from the Google search - that's just childish.

  7. #7

    Re: A Close Analysis of Yankee Projections.

    Quote Originally Posted by gdn
    Well, before this thread, I had absolutely no clue how he was getting his projections and what he was using.

    Now I somewhat understand what he's using.

    So, yes, my understanding is a little clearer.

    Edit: As to the first few hits from the Google search - that's just childish.
    As to the google search. And, they wonder why I don't respond.

    A cubic transformation is merely a way to make poorly fitting data more easily analyzed by applying a mathematical function to the data set. Although this is a pretty obscure term among people with some mathematical background, it isn't a mystery among people who have been specifically educated to analyze data sets.

    gdn if you really don't understand something, drop me a pm and I will try to clear it up for you.

    Royal Flush: Hughes, Sabathia, Betances, Brackman, Banuelos.

  8. #8

    Re: A Close Analysis of Yankee Projections.

    Quote Originally Posted by gdn
    Well, before this thread, I had absolutely no clue how he was getting his projections and what he was using.

    Now I somewhat understand what he's using.

    So, yes, my understanding is a little clearer.

    Edit: As to the first few hits from the Google search - that's just childish.
    Well, for me, I understand it less. At first I thought he was just toying with MLE's and adjusting them arbitrarily. I didn't realize there were so many other unanswered questions and problems with his projection (like the HOF level career ERA+'s, for instance).

  9. #9

    Re: A Close Analysis of Yankee Projections.

    Quote Originally Posted by Hughes2.50
    As to the google search. And, they wonder why I don't respond.
    A cubic transformation is merely a way to make poorly fitting data more easily analyzed by applying a mathematical function to the data set. Although this is a pretty obscure term among people with some mathematical background, it isn't a mystery among people who have been specifically educated to analyze data sets.

    gdn if you really don't understand something, drop me a pm and I will try to clear it up for you.
    I apologise if people thought that the google search is somehow out of bounds; my point was to show that this is not a mathematical tool or function that is commonly (or ever, for that matter) used by other baseball analysts. As I stated in the beginning, the purpose here is to learn, and I don't think a look at how his processes are used should be outside that.

    If you have "poorly fitting data," why are you trying to shoehorn it into predictions that do not match nature? If the data does not fit, why does that not point to a more fundamental problem with your methods?

  10. #10

    Re: A Close Analysis of Yankee Projections.

    Wait, so are you fitting a cubic polynomial to the data? Is that what you mean?

    That might explain some of the more extreme projections, because while polynomials can be made to fit data pretty well, they tend to "explode" outside of the main data sample and could thus lead to some pretty extreme forecasts on inputs that are out of the normal range. Which could happen if we are taking small samples of minor league domination.

    Or are you really doing some sort of complex Fourier-type transform? I found this paper (PDF), for example, that gives one cubic transform formula (top of page 11 of the pdf). And to be fair to Hughes, a google search for "cubic transform" gives lots of legitimate hits.

    To be fair to the rest of us, however, that looks like nonsense. And I have studied math, applied math, statistics, and econometrics for over 10 years. (I have a PhD in econ from a good school, and have always done applied, data-driven work.) Even if you can't give us your precise formula (and I can totally understand you not wanting to, much as BP doesn't give out its formulas for PECOTA), perhaps you can explain intuitively what the cubic transform does?

    Is it smoothing out data the way predicted values from a regression would (like fitting the data to a pre-defined functional form, such as a cubic polynomial)? That's something we can all understand, even if the formula/implementation is not something we're going to follow ourselves. Or is it something else entirely?
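
    To illustrate what I mean by polynomials "exploding," here is a small sketch that runs a cubic exactly through four made-up ERA+ values and then evaluates it outside the sample. This is only a guess at what a cubic fit might look like, not a description of what H2.50 actually does, and the numbers are invented.

```python
def cubic_through(points):
    """Return the unique cubic passing through four (x, y) points (Lagrange form)."""
    def curve(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return curve


# Invented ERA+ values at four consecutive minor league stops (x = 0..3).
points = [(0, 100), (1, 110), (2, 105), (3, 125)]
curve = cubic_through(points)

inside = curve(2)   # reproduces the data inside the sample
outside = curve(6)  # three steps past the last observation

print(f"at x=2: {inside:.1f}")
print(f"at x=6: {outside:.1f}")
```

    Inside the sample the curve matches the data exactly; a few steps beyond it, the forecast is several times larger than anything observed.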

  11. #11

    Re: A Close Analysis of Yankee Projections.

    Quote Originally Posted by Munson's 'Stash
    I apologise if people thought that the google search is somehow out of bounds; my point was to show that this is not a mathematical tool or function that is commonly (or ever, for that matter) used by other baseball analysts. As I stated in the beginning, the purpose here is to learn, and I don't think a look at how his processes are used should be outside that.

    If you have "poorly fitting data," why are you trying to shoehorn it into predictions that do not match nature? If the data does not fit, why does that not point to a more fundamental problem with your methods?
    I think that was directed at me. I linked a google search of "cubically transformed" to show that it didn't get any hits that were in any way useful. Of course, the first 3 or 4 hits are from the SoSH forum headers which are poking fun at H2.50. That's life on a message board. You say something silly, you end up in somebody else's sig.

    I've been very critical of H2.50 because I don't think his work is credible and I think he's misleading a lot of people. I don't have a PhD in Econ (though I do have a BA), so if he can in any way show that I'm off base, I'd like to see how he formulates his projections. Until then I remain very skeptical.
    Last edited by AMarshal2; 07-15-07 at 08:49 PM.

  12. #12

    Re: A Close Analysis of Yankee Projections.

    Quote Originally Posted by AMarshal2
    I think that was directed at me. I linked a google search of "cubically transformed" to show that it didn't get any hits that were in any way useful. Of course, the first 3 or 4 hits are from the SoSH forum headers which are poking fun at H2.50. That's life on a message board. You say something silly, you end up in somebody else's sig.

    I've been very critical of H2.50 because I don't think his work is credible and I think he's misleading a lot of people. I don't have a PhD in Econ, so if he can in any way show that I'm off base, I'd like to see how he formulates his projections. Until then I remain very skeptical.
    So am I, but I'm trying to give him the benefit of the doubt. I only stated that I have a PhD in econ because H2.50 keeps implying that if someone had studied enough, they'd know what he's talking about. Maybe that's true, but I have studied a lot of math and statistics, and I do not follow what he's saying. That's not to say that he's necessarily doing something wrong, but I would like at least an intuitive explanation of what the (now infamous) cubic transform is doing.

  13. #13

    Re: A Close Analysis of Yankee Projections.

    Good thread - I appreciate the discussion.

    Whenever I see the word 'fit', in relation to numbers or stats, a red flag goes up. Data can be massaged to make existing trends more clearly visible, but drawing HOF conclusions from AA stats seems a leap. Once one engages in 'fitting' numbers, the possibility that one's own biases leak into the result is great.

    Statistics, sufficiently tortured, will confess to anything.

  14. #14

    Re: A Close Analysis of Yankee Projections.

    Quote Originally Posted by Hughes2.50
    As to the google search. And, they wonder why I don't respond.
    I'm pretty sure no one is wondering why you didn't respond more in depth to this thorough dismantling of your method.


    A cubic transformation is merely a way to make poorly fitting data more easily analyzed by applying a mathematical function to the data set.
    Fine, but then why not explain why you need to make poorly fitting data more applicable to your method? Data is data. When you transform data you're altering it, and then it's no longer the data it was when you first analyzed it. So while your mathematical functions work better with a transformed data set, you're no longer dealing with the real data.
    Although this is a pretty obscure term among people with some mathematical background, it isn't a mystery among people who have been specifically educated to analyze data sets.
    PH25 you are clearly an intelligent person with a reasonably strong knowledge of stats and data.....why would you go so far out of your way to discredit your own knowledge by predicting that most of our minor leaguers are going to be Koufax-esque and then either ignoring or shouting down anyone who disagrees?

    Here you have a perfect opportunity either to defend your work against the plethora of academic questions that have been raised about it or, if you're not invested enough to defend it that way, to explain that the projections are not the most scientific and that you had a little fun analyzing the data and using different formulas to come to these projections. Instead you're focusing on the fact that cubic transformation is more common than Munson's 'Stash said it was, and that you shouldn't have to explain yourself because others occasionally make fun of you.

    I don't post that much, but I've read pretty much all of your posts, and while I'll admit the projections can be fun to read about, their basis in reality is something I think you're going to have to defend a lot more rigorously than your response above if you're really expecting people to take them seriously.

  15. #15
    NYYF Legend

    gdn's Avatar
    Join Date
    Aug 2005
    Location
    your face

    Re: A Close Analysis of Yankee Projections.

    Just so we're clear - I didn't make my comment about "the first 3 or 4 links" to denigrate anyone here in this thread. It was merely an opinion that people on message boards can be childish at times.

  16. #16
    So good they named me twice
    Join Date
    Jan 2005

    Re: A Close Analysis of Yankee Projections.

    I think you guys are being a bit harsh on Hughes2.50.

    He's gone to a lot of trouble to come up with some fantastic projections that say that we're going to have a kick-ass rotation for the next ten years that is going to have the Red Sox chewing our dust.

    That's good news isn't it?

    So, why argue with him?
    World Champions, 2007.

  17. #17

    Re: A Close Analysis of Yankee Projections.

    Quote Originally Posted by Phobicsman
    I think you guys are being a bit harsh on Hughes2.50.

    He's gone to a lot of trouble to come up with some fantastic projections that say that we're going to have a kick-ass rotation for the next ten years that is going to have the Red Sox chewing our dust.

    That's good news isn't it?

    So, why argue with him?
    I can't tell if you're being sarcastic here or not.

    For one, I find the projections to be so over the top that I have a hard time thinking of them in that much of a positive light because they're just so outrageous. That doesn't mean that I'm not excited about the prospect of these 3-4 guys coming through the system together and developing into a good if not great homegrown rotation in the near future. That's a wonderful prospect, but just because somebody makes up a flawed formula declaring that they'll be Koufax/Pedro/Clemens/RJ in their primes and first ballot hall of famers doesn't make it so.

    That's why I'm arguing with him (can't speak for anybody else). If you were being sarcastic then please disregard my post. If you were not, I like projections to have a bit more basis in reality before I get too excited about them.

  18. #18
    Released Outright
    Join Date
    Nov 2004
    Location
    Westchester-ish

    Re: A Close Analysis of Yankee Projections.

    Quote Originally Posted by Dow Jones
    I can't tell if you're being sarcastic here or not.

    For one, I find the projections to be so over the top that I have a hard time thinking of them in that much of a positive light because they're just so outrageous. That doesn't mean that I'm not excited about the prospect of these 3-4 guys coming through the system together and developing into a good if not great homegrown rotation in the near future. That's a wonderful prospect, but just because somebody makes up a flawed formula declaring that they'll be Koufax/Pedro/Clemens/RJ in their primes and first ballot hall of famers doesn't make it so.

    That's why I'm arguing with him (can't speak for anybody else). If you were being sarcastic then please disregard my post. If you were not, I like projections to have a bit more basis in reality before I get too excited about them.
    Phobicsman's been here over 2 years and has 16 posts. They all read similar to that one. Rah-rah.

  19. #19
    So good they named me twice
    Join Date
    Jan 2005

    Re: A Close Analysis of Yankee Projections.

    Quote Originally Posted by yankeebot
    Phobicsman's been here over 2 years and has 16 posts. They all read similar to that one. Rah-rah.
    What's wrong with being 'rah rah' as you put it?

    Better to be optimistic and trying to look on the bright side than coming up with sour cynical stuff all the time.

    I just think Hughes2.50 has come up with some decent statistical analysis - don't see why you have to get abusive if someone wants to stick up for him.
    World Champions, 2007.

  20. #20

    Re: A Close Analysis of Yankee Projections.

    Because the point of objective statistical analysis is to allow models to be criticized; otherwise there would be no point to them.

  21. #21
    Released Outright
    Join Date
    Nov 2004
    Location
    Westchester-ish

    Re: A Close Analysis of Yankee Projections.

    Quote Originally Posted by Phobicsman
    What's wrong with being 'rah rah' as you put it?

    Better to be optimistic and trying to look on the bright side than coming up with sour cynical stuff all the time.

    I just think Hughes2.50 has come up with some decent statistical analysis - don't see why you have to get abusive if someone wants to stick up for him.
    I didn't say there was anything wrong with it. Dow Jones wasn't sure whether or not to take your post as sarcasm. I was simply describing the kind of poster I perceive you to be. With the limited data at hand, I think I was spot on. I have no gripe with optimists.

  22. #22

    Re: A Close Analysis of Yankee Projections.

    Quote Originally Posted by Phobicsman
    I think you guys are being a bit harsh on Hughes2.50.

    He's gone to a lot of trouble to come up with some fantastic projections that say that we're going to have a kick-ass rotation for the next ten years that is going to have the Red Sox chewing our dust.

    That's good news isn't it?

    So, why argue with him?
    Not to pile on, but what if he's wrong?

    The point of any projection system is to try to predict reality as accurately as possible. If your system causes you to overvalue your own players due to excessively optimistic projections, what happens if they don't meet those expectations?

    Consider a hypothetical example:

    The Twins are willing to deal Johan Santana and come to the Yankees because they are one of the places he is willing to go and have talent the Twins are interested in (the Yankees' young pitchers). Who are you willing to give up to get him, knowing that he is on the block and will sign an extension with the team he winds up with (so the Yankees will not be able to just wait until he hits FA)?

    Santana has a career ERA+ of 143, tying him for 9th on the all-time list. But H2.5 projects Hughes to outpitch him with a career ERA+ of 167, and has Humberto Sanchez just behind Santana at a career ERA+ of 142. He also thinks that Betances "has a better than 50-50 chance to have a career ERA+ of over 150." Similarly, Joba Chamberlain has the same chance to be over 130 by his estimation.

    So who do you deal for Santana, if anyone? In the real world the Twins would undoubtedly want Hughes, and probably another of these pitching prospects. If H2.5 is correct, Phil Hughes is going to be the best pitcher in the history of baseball, and Humberto Sanchez is going to be more valuable to the Yankees than Santana (because of the number of years of arb control vs. having near-identical ERA+s). Similarly, Betances and Chamberlain have good chances to be the second-best pitcher in the history of baseball and in the top 30 of all time, respectively.

    But let's consider a projection system with a better track record. PECOTA projects Santana to have 31.4 WARP over the next 5 years. It projects Hughes to produce 21.1 WARP and Sanchez to produce only 7.5 WARP in the same time period (and the low projection for Sanchez isn't out of line if you think that his lack of durability may force him to the bullpen). (Betances and Chamberlain aren't even projected, given so little actual data.)

    Using these numbers instead, the Yankees would come out ahead in a trade of Hughes and Sanchez for Santana (as measured by WARP). Looking closer, say the Twins are willing to give up Santana for Sanchez, Betances, and Chamberlain. By H2.5's analysis this is not a deal the Yankees should make (remember Sanchez = Santana, and both Betances and Chamberlain have a chance to be as good as or better than him). OTOH I really doubt there is an actual GM who wouldn't make that deal.
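    For anyone who wants to check the arithmetic behind that first trade, here is a minimal sketch using only the three PECOTA five-year figures quoted above. The function name and the dictionary layout are just illustrative choices, and a straight sum of WARP ignores salary, years of control, and everything else that goes into a real trade.

```python
# PECOTA five-year WARP figures as quoted in the post (not recomputed here).
warp = {"Santana": 31.4, "Hughes": 21.1, "Sanchez": 7.5}


def trade_balance(received, given_up, projections=warp):
    """Projected WARP received minus projected WARP given up.

    A positive result means the receiving side comes out ahead
    on this single measure.
    """
    gained = sum(projections[name] for name in received)
    lost = sum(projections[name] for name in given_up)
    return round(gained - lost, 1)


# Yankees receive Santana and give up Hughes and Sanchez.
yankees_net = trade_balance(["Santana"], ["Hughes", "Sanchez"])
print(f"Yankees net projected WARP: {yankees_net:+.1f}")
```

    The margin is small (31.4 against 21.1 + 7.5 = 28.6), which is part of the point: under PECOTA the deal is roughly even to slightly favorable, while under H2.5's numbers it would look like a giveaway.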

    There are a lot more considerations that go into looking at a trade, obviously; but this should make it clear that improperly evaluating players (your own and others') leaves your team at an extreme disadvantage against teams with a better understanding of the reality of their players' potential. There is no reason that we as fans should be any less informed if we can help it, or embrace ignorance just because it makes us feel good.
    "The abuse on Matsuzaka's arm so far is the sort of thing Dusty Baker masturbates to at night." - OCD SS

  23. #23
    You know what I mean.
    Join Date
    Apr 2007

    Re: A Close Analysis of Yankee Projections.

    Excellent analysis by Munson's Stash. The long and short of it is that there is no "there" there when it comes to H2.5's methods.

    There is no way to evaluate them, since no control data is shown and since there are a number of significant "black boxes" in his description. There's nothing inherently wrong with making up one's own projection system and then wishing to keep it secret, but when it produces such wildly optimistic results one has to expect that it is going to invite a certain amount of skepticism, if not outright ridicule.

    The fact is that M.S. has done a really good job of not ridiculing or personally attacking H2.5, in spite of the fact that H2.5 seems hell-bent on making himself an easy target with the combination of dismissive arrogance, cryptic elitism, and offhand usage of terminology that sounds for all the world like big-word nonsense.

    The bottom line is that, to an outside observer, H2.5's "projections" may as well be guesses, because there is simply no way to evaluate their validity. There is no way to test his methods and no historical track record.
    Walter: I’ll get you a toe. There are ways, Dude. You don't wanna know about it.
    Dude: Walter...
    Walter: I can get you a toe by 3 o'clock this afternoon, with nail polish.

    Red Sox fan, not a troll

  24. #24
    Released Outright JavyVazquezIsSick's Avatar
    Join Date
    May 2004
    Location
    Dublin, Ireland

    Re: A Close Analysis of Yankee Projections.

    Munson, that was bang-up work, and as soon as I have a spare day I'll actually read the whole thing.

  25. #25

    Re: A Close Analysis of Yankee Projections.

    Print it, stuff it into an envelope, and mail it to Cashman before he goes off on another panic move like Igawa.
