Fire Joe Morgan: if you like baseball, start reading this site regularly. Their mission, it appears, is to humiliate those in the baseball media and front offices who reject the value of the information in baseball stats. The writing and analysis are the most refreshing thing to read about baseball on these internets. The comment on Jim Armstrong was fantastic. Simply take a look at this exchange:

Have you seen some of the quote, unquote stats out there?

My man: when you are talking you say “quote-unquote” to indicate sarcasm. When you are writing you can just put things in quotes. As in: Jim Armstrong is a “journalist.” He is also “funny” and “smart” and I “want to hang out with him” because he seems to have a lot of “good” “points.”

I’m still laughing at this several days later. Anyway, there was another post directed at Jim Lang about how Moneyball is an Al Qaeda-like presence in baseball. His specific complaint is that the Jays didn’t sacrifice bunt with runners on first and second with no outs. Why the outrage at such a statement? Well, anyone who’s looked at an expected run matrix can tell you that, statistically, the benefit of moving the runners up is less than the cost, in expected runs, of sacrificing the out. This is especially true in the American League with the D.H. And it is especially disappointing coming from Jim with the 2, 3, and 4 hitters coming up. Let’s take a look at some of the numbers.

The expected runs for a team with runners on first and second with no outs is 1.51. The expected runs for a team with runners on second and third with one out is 1.44. It obviously isn’t worth the risk in this case. Additionally, we’re assuming that the bunt is perfectly executed, which is a poor assumption here. We’re talking about the American League, where sacrifice bunts are scarce and players are generally not accustomed to bunting. To make things simple, let’s assume that either the bunt is successful, leaving runners at second and third with one out, or the bunt is not successful and the runners remain at first and second with one out, a state worth 0.91 expected runs. And let’s say the probability of a successful bunt is 0.8. The expected runs for such a strategy is then,

0.8*1.44 + 0.2*0.91 = 1.33

Clearly the risk of the sacrifice bunt isn’t worth the reward. In fact, the strategy on average is costing you nearly two tenths of a run (1.51 - 1.33 = 0.18). This is precisely why the sacrifice bunt is not employed as often as it once was.
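For anyone who wants to play with the success probability, here is a minimal sketch of the calculation above. The three run-expectancy values are the ones quoted in this post; the function name and the two-outcome model (clean sacrifice or failed bunt with no runner movement) are simplifying assumptions, not a full run-expectancy model.

```python
# Run-expectancy values quoted in the post (expected runs for the rest of the inning).
RE_FIRST_SECOND_NO_OUTS = 1.51  # runners on first and second, no outs
RE_SECOND_THIRD_ONE_OUT = 1.44  # state after a successful sacrifice
RE_FIRST_SECOND_ONE_OUT = 0.91  # state after a failed bunt (simplified outcome)


def bunt_expected_runs(p_success: float) -> float:
    """Expected runs when bunting, under the simplified two-outcome model."""
    return (p_success * RE_SECOND_THIRD_ONE_OUT
            + (1 - p_success) * RE_FIRST_SECOND_ONE_OUT)


# Compare bunting against swinging away for a few assumed success rates.
for p in (0.6, 0.8, 1.0):
    bunt = bunt_expected_runs(p)
    cost = RE_FIRST_SECOND_NO_OUTS - bunt
    print(f"p(success)={p:.1f}  bunt={bunt:.3f}  cost vs. swinging away={cost:.3f}")
```

Even with a perfect bunter (a success probability of 1.0), the bunt comes out behind the 1.51 runs expected from swinging away.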

Now, was it intuition or stats that proved the sacrifice bunt was a poor strategy? I imagine it was a little of both. But what bothers me the most about these arguments is the stratified sides. Most of the time, the baseball purists and the stats crowd agree on strategy, and the two sides are often working toward the same purpose. But there are still sports writers, managers, and GMs that refuse to listen to the new methods. Consider that managers have significantly reduced the number of sacrifice bunts. If a manager wants to say that’s because of his intuition, that’s fine. But the numbers support that conjecture, and they can be used to evaluate good and bad strategy. Ignoring the value in the numbers can be costly.


There can be only one!

April 11, 2008

The battleground is Minute Maid/Enron/Train Depot field. Brandon Backe, upset at a Pujols slide into a backup catcher, has thrown down the gauntlet. The competition between the two is now, in his words, “escalated.” Excellent. It reminds me of the critically acclaimed film, Bring It On, in which cheerleaders “bring it” at each other repeatedly. Although I can’t substantiate that claim. At any rate, the next meeting between the two will have drama, fireworks, possibly pinatas? Who knows what the future brings. But my question is this: How can Backe extend his dominance over Pujols? Backe, after all, is only human. How much additional “competition” is necessary to destroy el hombre instead of the gentle dominance he has exhibited? Let us examine the line between the two.
Pujols vs.       PA  AB   H  2B  3B  HR  RBI  BB  SO    BA   OBP    SLG    OPS
Brandon Backe    13  10   3   0   0   3    4   3   1  .300  .462  1.200  1.662

Indeed the competition between the two is fierce. In 13 plate appearances Pujols has walked three times with a paltry 3 home runs. His 1.662 OPS is simply anemic. Backe is truly a counter force, a nemesis to Pujols. I am saddened by this of course. Pujols is, at the most, my imaginary friend. I hate to see what will happen to his line when Backe steps up the competish.
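For the curious, the rate stats in that line follow directly from the counting stats. A minimal sketch, assuming no hit-by-pitches or sacrifice flies (the 13 plate appearances are exactly 10 at bats plus 3 walks); the function name is just for illustration:

```python
def slash_line(ab, h, doubles, triples, hr, bb):
    """Return (BA, OBP, SLG, OPS) from counting stats.

    Assumes no hit-by-pitches and no sacrifice flies, so the on-base
    denominator is simply at bats plus walks.
    """
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    ba = h / ab
    obp = (h + bb) / (ab + bb)
    slg = total_bases / ab
    return ba, obp, slg, obp + slg


# Pujols against Backe, using the counting stats from the line above.
ba, obp, slg, ops = slash_line(ab=10, h=3, doubles=0, triples=0, hr=3, bb=3)
print(f"BA {ba:.3f}  OBP {obp:.3f}  SLG {slg:.3f}  OPS {ops:.3f}")
```

Three hits that are all home runs are worth 12 total bases in 10 at bats, which is where the 1.200 slugging percentage comes from.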

I get it, Paul Krugman likes Hillary Clinton. There is yet another column by Krugman describing Clinton’s policy as superior to Obama’s. Furthermore, Krugman comments that Obama’s strength, the anti-war message, is not a focus for voters, while the economy is a greater concern. I have no disagreement with this latter sentiment. However, to claim that her economic message was crucial to her winning in Ohio is curious. Perhaps the news coverage was misplaced or incorrect, but the analysis I read was in response to the cliche “It’s 3 am, are your children safe” television spots. The focus and attention were on her national security credentials and not her economic policy. If she would use the economy as the centerpiece of her platform, it would be a brilliant step, one that apparently eludes her questionable campaign staff. But instead of focusing on her economic policy and what she can do, the message is consistently about what her opponent cannot do. Why hasn’t she used the economy as a stronger issue? She certainly has been touting her foreign relations experience, which has come under question. There must be some questions about her economic experience as well.

Is this an anxiety election as Krugman would have you believe? Possibly, if you believe the Clinton message of gloom and doom. But the Obama message is different and is not motivated by anxiety. The support he gathers is from hope; hope for a better country and political landscape.

From the Washington Post Fact Checker:

Hillary is making a lot more of her Northern Ireland role on the campaign trail than she did in her memoir “Living History.” As the Boston Globe recently noted, her stories of bringing Protestant and Catholic women together have become more dramatic with each retelling. The claim that she brought Catholics and Protestants together “for the first time” seems dubious. This would not be the first time that she has mixed up her chronology.

Stay classy Clinton.

Polls cont.

January 14, 2008

When designing a survey, one can determine the number of individuals that must be sampled to achieve a specified margin of error. Suppose the pollsters want a margin of error of $\pm 3$ points. The width of a confidence interval with confidence level $1-\alpha$ for a binomial proportion is computed using the following equation,
$$w=2z_{\frac{\alpha }{2}}\sqrt{\frac{\theta (1-\theta)}{n}}.$$
Solving for $n$ gives the number of individuals that must be sampled for a given confidence interval width and proportion,
$$n=\left( \frac{2z_{\frac{\alpha }{2}}}{w}\right) ^{2}\theta (1-\theta ).$$
We can substitute 0.06 for $w$, and using calculus we know that the preceding function is maximized when the proportion, $\theta$, is equal to one-half. Setting $\alpha$ to the conventional 0.05, so that $z_{\frac{\alpha }{2}}\approx 1.96$, and computing $n$ we get the number of individuals needed to be sampled
$$n\approx 1067$$
Around 1,000 people sampled from the population will provide a 6 point width in the confidence interval and a margin of error of $\pm 3$ points. Once the population is sampled and the parameters are estimated, what conclusions can be made? The computations return an estimate of the proportion, $\hat{\theta}$, and a $\pm 3$ point confidence interval around the estimate. The interpretation of this confidence interval is tricky and is often misstated in the media. The interpretation is this: if 100 independent random samples were taken from the population and an interval were computed from each, about 95 of those intervals would be expected to contain the true proportion. The true value may or may not lie in the one interval that was actually computed. The study is designed to make capturing the true parameter value likely, but it is far from a certainty. And that is what the pundits and journalists should remember. Try not to draw too much information from a poll that is statistically derived. Or at least understand the uncertainty involved in such games.
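Both the sample size formula and the repeated-sampling interpretation can be checked numerically. The sketch below uses only the Python standard library. The function names, the number of simulated polls, and the random seed are arbitrary choices for illustration, and the simulated "population" is just a coin with a known true proportion.

```python
import random
from statistics import NormalDist


def required_sample_size(width, alpha=0.05, theta=0.5):
    """n = (2 * z_{alpha/2} / w)^2 * theta * (1 - theta), returned unrounded."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return (2 * z / width) ** 2 * theta * (1 - theta)


def coverage_rate(n, theta, alpha=0.05, trials=500, seed=0):
    """Fraction of simulated polls whose interval contains the true theta."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(trials):
        # One simulated poll: n independent yes/no answers with P(yes) = theta.
        successes = sum(rng.random() < theta for _ in range(n))
        theta_hat = successes / n
        half_width = z * (theta_hat * (1 - theta_hat) / n) ** 0.5
        if theta_hat - half_width <= theta <= theta_hat + half_width:
            hits += 1
    return hits / trials


n = required_sample_size(0.06)
print(f"sample size for a 6 point interval width: {n:.1f}")
print(f"share of simulated intervals covering the truth: "
      f"{coverage_rate(round(n), 0.5):.3f}")
```

The unrounded sample size is a shade above 1067, so a strict planner would round up to 1068. The coverage share should land near 0.95, which is the point: the 95 percent describes how often the procedure captures the true proportion across many polls, not a guarantee about any single poll.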

With the Presidential Primaries in full swing it’s important to review the statistical nature of polls. The media can sometimes overestimate the value of information from these surveys and rely on a tool that is a random measure. First issue, is the sample biased? The underlying assumption that is relied upon to conduct a scientific poll is the random sample. If the sample generated is not random then the results should be examined closely. Second issue, what is the underlying population that is being sampled? The population needs to be clearly defined and should be reported when citing results from a poll. Is the population sampled registered voters, members of a particular party, or people listed in the phone book? The poll is designed to infer characteristics about the population, and if there is confusion about the precise definition then interpretation of the results can be misleading. Third issue, the media often misinterpret the meaning of significance and confidence levels. This final issue will be addressed with an example in the next post.

A Simple Tax System

December 16, 2007

One of the more interesting topics in last night’s Republican Presidential debate concerned the flat tax. It should be noted that the flat tax is now generally referred to as the “fair tax.” The sound of a fair tax is more appealing than a flat tax, I suppose. Each candidate expressed interest in either the flat tax or a simplification of the tax code. This is a step in the right direction. A flat tax would likely increase productivity and possibly generate more tax revenue. The problem, however, is that only one of the candidates discussed dealing with the powerful Tax Lobby. A $250 billion a year industry will not go away quietly, and it will require a President and a Congress strong enough to ignore this lobby. The California Legislature was unable to pass such measures after a pilot program proved extremely successful. Let’s hope it is time for such changes, because returning the $250 billion spent on preparing taxes would be in itself a great start.

Efficiency and Equity

December 7, 2007

Economists are generally concerned with efficient markets. The issue that is rarely addressed is equity within markets. That is to say, equity in the form of the allocation of resources or income distributions. It might surprise you that economists are seldom interested in equity. Equity is a normative concept that relies on value judgements. Efficiency is a positive concept with clearly defined properties. It is important to remember that economists, when forming policy, are not concerned with equitable distributions. This is a major reason economists favor policies like free trade, because free trade promotes efficiency. The problem then lies in determining the distributional consequences of a free trade scheme.

Torre and the evil empire

October 25, 2007

It’s disappointing that I feel obligated to comment on the Joe Torre contract insult situation. You might ask, don’t the Yanks get enough attention as it is? Certainly, but Yankees spending can seem so ridiculous at times that it’s an easy target. Sorry friend, already off topic. Anyway, let’s take all the politics and personal feelings out of this. The offer was an incentive-rich, one-year, $5 million contract for Torre with possible $1 million bonuses for reaching each of the Division Series, League Championship Series, and World Series. The incentive clauses were similar to those in other contracts Torre has agreed to. It is a $2.5 million pay cut from last season, yet Torre would still be the highest paid manager in the game by $1.5 million with this new contract. For the 2007 season Torre was the highest paid manager by $4 million over Lou Piniella and was four standard deviations above the mean salary. Take a look at the distribution of managerial salaries for the 2007 season. Torre is all alone in the tail of the distribution. The new offer is still generous for what is considered heavily incentive based. Without the bonuses the contract offer is nearly 2.5 standard deviations above the 2007 average. This is not an insult contract from the Yankees. Torre would still be the highest paid manager by a significant amount, and the contract is probably overvalued given his true marginal value product. If he does decide to manage somewhere else, it is unlikely he’ll find a similar offer.
[Figure: Manager Salary Distribution]
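The salary data behind the figure is not reproduced here, but the two standard-deviation statements above are enough to back out a rough mean and spread. This sketch assumes Torre’s 2007 salary was $7.5 million (the $5 million offer plus the $2.5 million pay cut) and treats "four" and "nearly 2.5" standard deviations as exact, so the results are ballpark figures only. The function name is hypothetical.

```python
def implied_mean_and_sd(x1, z1, x2, z2):
    """Solve (x1 - mean) / sd = z1 and (x2 - mean) / sd = z2 for mean and sd."""
    sd = (x1 - x2) / (z1 - z2)
    mean = x1 - z1 * sd
    return mean, sd


# Salaries in millions of dollars; z-scores as stated (and rounded) in the post.
mean, sd = implied_mean_and_sd(x1=7.5, z1=4.0, x2=5.0, z2=2.5)
print(f"implied 2007 mean salary: ${mean:.2f}M, standard deviation: ${sd:.2f}M")
```

Because the inputs are rounded, the implied figures should be read as an order-of-magnitude check on the post’s numbers, not as the actual 2007 salary statistics.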

17 movies in 17 days

August 21, 2007

The Ames library is amazing, and one of the best collections the library has is its assortment of DVDs. And we’re not talking horrible, archaic documentaries. These are the latest and greatest blockbusters.

To avoid the massive fights and arguments over who gets to check out the newest films, the library allows users to place items on hold. Once you place a movie on hold you receive a request number and wait until it arrives. On the date it comes in, the library sends you an email to pick up your hold. For us, it’s essentially the same as Netflix since we’re four blocks from the library.

From the holds we usually get a new movie every other week, which is plenty. However, at the beginning of August everyone in town goes on vacation. That means a massive number of suspended holds. For us, it means 17 movies in 17 days. That left us 17 nights to get through 17 movies. Here’s the list: