The Boston Big Dude

Monday, April 26, 2010

The 2010 NFL Draft - AFC North

I don't have time to do the AFC South, but here's the North!

Ravens: Baltimore is pretty solid everywhere - their biggest problem is age/injury and its effect on vets like Todd Heap and Ed Reed. Although the Ravens did not draft a successor for the great safety, they did add two first round defensive talents in Sergio Kindle and Terrence Cody despite not picking until Day Two. They drafted two top tight ends and a wide receiver to help out Joe Flacco, and they added a talented developmental project in Barbados-native offensive lineman Ramon Harewood during round six. I like picking up Syracuse defensive tackle Art Jones in round five - before his injury last season, he was a potential top-50 pick. Overall, I give the Ravens credit for taking good players at appropriate times. The biggest concern is that they didn't address their secondary concerns, particularly at cornerback, but they would have had to either take a lesser cornerback or sacrifice one of their more talented targets, such as Cody. They clearly stuck to their board, and I respect that.

Bengals: Jermaine Gresham is an excellent tight end, and it's becoming more and more apparent how big an impact such a player can make. Cincinnati needed help with the passing game, and they got it in Gresham, productive slot receiver/return man Jordan Shipley, and Kansas star Dezmon Briscoe. I think Briscoe was one of the top 4 receivers in the draft and a second round talent, so picking him up in the 6th is an absolute steal. He dropped because of a crummy combine, but I saw the guy play against Oklahoma, and he's got the tools to be successful. They also added depth at every level of an already good defensive unit - I love the Carlos Dunlap selection round 2. Cincinnati could be incredibly good next year, as they may boast the NFL's best defense and a much more powerful offense than last year.

Browns: What's not to like about Cleveland's draft? They didn't bow to the pressure and take Jimmy Clausen with the 7th overall pick - they opted for the talented and skilled cornerback Joe Haden, who thoroughly warranted the high slot. In the second round, they again passed on Clausen and stayed focused on the secondary by taking safety T.J. Ward. While not the most gifted safety available to them, Ward is just a good football player - he's a highly productive playmaker who works hard and plays smart - and I like it when teams pick those kinds of guys. They made other good picks (Carlton Mitchell in round 6 was a steal), but the story is obviously Colt McCoy going in round 3. Now, I know McCoy isn't very tall and he doesn't have a great arm, but the fella is tough, smart, and decent, and he makes plays that win football games. Eventually, production should trump all. I really like Cleveland's draft because they believed in that principle.

Steelers: Much as we might like to see him get pounded, Pittsburgh needed to get better protection for Ben Roethlisberger. They got that in interior lineman Maurkice Pouncey, a physical player who should contribute as a rookie. The Steelers focused heavily on their pass rush and receiving corps, and I particularly like 4th-rounder defensive end/outside linebacker Thaddeus Gibson. Getting Georgia Tech star Jonathan Dwyer in the 6th was miraculous, as Dwyer is every bit the player that legitimate 2nd-rounder Toby Gerhart is. He reads holes well (you have to in an option system) and hits them hard, and that kind of work ethic is always admirable in Steel City. They might have been wise to draft a quarterback in the middle-to-late rounds, as I can't imagine Roethlisberger will be in a Steelers uniform for very many years.

Sunday, April 25, 2010

The 2010 NFL Draft - A few thoughts and the AFC East

So, Commissioner Goodell's three-day spectacle was a rousing success for at least a couple of teams. Before doing a quick team-by-team look, let's hit a couple of bullet points:

I'm not a fan of drafting a quarterback with the first overall pick - if you're picking first, there's a pretty darn good chance you'll get the kid killed - but I think Sam Bradford is about as good a prospect as you'll find. Living in Oklahoma, I got a chance to see him play and he's got the single most important physical tool: accuracy. He does throw a good deep ball, but the 10-yard routes are what keep the chains moving, and he throws those beautifully. He gets great touch on his passes - he can zip it in when he has to, but he knows that sometimes that's not the best play - and he's demonstrated the ability to read defenses and go through progressions. Some people say he's a product of a good offensive line, and while it certainly helped, it's incredibly easy to find highlights where he's hung in the pocket and taken shots or made an accurate throw on the run. Sam's also a good, smart guy who doesn't need the spotlight, so when it's all said and done, he's as good a pick as a quarterback selection ever can be.
I'm surprised that Bradford's fellow Sooner Trent Williams was the first tackle taken, going number 4 to Washington. The guy's definitely got the physical tools to be a great player, but there's a couple thoughts in my mind. He was only a right tackle on Oklahoma's best offensive line (which was also one of the best offensive lines in college football history), and the left tackle on that line, Phil Loadholt, was drafted in the mid-second round last year and is unlikely to ever protect the blindside. Also: he's the guy who got beat during the BYU game when Bradford was injured. He had a fine collegiate career and I hope he has a fine professional career, but there's a reason that Oklahoma State's Russell Okung was rated the top tackle for almost the entire Draft Season. I think that the draft gurus and the personnel men just like tinkering with the boards ... sometimes you get it right the first time, fellas.
Oakland's gotten an unfair rep for making poor draft choices. Sure, the Darrius Heyward-Bey selection was an awful pick, especially with Michael Crabtree and Jeremy Maclin still on the board, but rare is the team that doesn't make stupid picks from time-to-time. You hear the 2004 Robert Gallery pick get slammed ... well, I'm sorry, but every single pundit loved that pick at the time, and that's the standard you should use. You hear the Russell pick get slammed, but, again, people loved him. Now, the Raiders don't always draft conventionally, but they haven't had any more busts than normal, the worst ones were the consensus 'smart' picks, and they've made a couple good choices, too (2003 first rounder Nnamdi Asomugha is the best cornerback in the league, for instance). I say all this because the Raiders had a universally-acclaimed excellent draft, and I don't want you to expect any "surprising competency" jokes.

So, those thoughts were the ones bouncing around. I'm going to go a division at a time over the next week and provide team-by-team thoughts. We'll start with the AFC East.

Bills: I'm not crazy about the C.J. Spiller pick. I think people marked him up for being the cream of a mediocre crop of running backs, and while position scarcity should factor into any decision, it shouldn't override the fact that Spiller had 20 carries in a game only 5 times in 52 games at Clemson. I smell Reggie Bush, and that's not a guy you should take when you've got good tackles in Bryan Bulaga and Anthony Davis still there. They didn't address the o-line until the third day, and their second round pick of defensive tackle Torell Troup was a terrible reach. Guys rated around him were picked many rounds later, and Terrence Cody, Linval Joseph, and Lamarr Houston were still available. A couple nice late picks (Kyle Calloway) don't make up for those stretchy early picks.

Dolphins: They focused heavily on defense (7 of 8), and I think they did a fair job of it. I like trading down in the first round to recoup the second round pick they lost in the Brandon Marshall trade, and the players they got out of it are plenty good. Jared Odrick is a good defensive tackle who will be an effective starter from day one, and they picked up a couple of pass rushing threats in Koa Misi and Chris McCoy. I like taking developmental guys in the middle rounds, and offensive lineman John Jerry is a talented guy. Their other picks are hard-working guys who should at least contribute on special teams. All in all, Miami did a good job.

Patriots: As usual, the Pats had a million picks. First rounder Devin McCourty wasn't my favorite corner available, but he's hardly a bad pick. The Pats also landed two of the top tight ends in the draft in Rob Gronkowski and Aaron Hernandez and two great Florida defenders in Jermaine Cunningham and Brandon Spikes. They added depth to both their lines, a decent punter, and a talented backup (and maybe even replacement four or five years from now) for Tom Brady in Zac Robinson. I saw him at Oklahoma State, and I like his abilities. Overall, New England got good players at good values ... and they got lots of them.

Jets: With just four picks, these guys were the Anti-Patriots, and probably happy to bear the title. Kyle Wilson might have been the second-best cornerback in the draft behind Joe Haden, so he was a good value at 29. Their defense was strong last year, and Wilson could be the league's best nickel back next season; the Jets will be very tough to pass on, and that's a great way to win games in the modern NFL. Second rounder Vlad Ducasse is talented, though he might end up playing guard rather than tackle at the next level, and the Jets have enough offensive line depth that they don't need to play him before he's ready. I like USC running back Joe McKnight (especially in the 4th round) as a complement to Shonn Greene. He's got a reputation as a scat back, though he's a better inside runner than folks give him credit for, and he can immediately contribute. Fullback John Conner, in addition to being mankind's savior, is a physical player who fits right in with the Jets power game. The Fightin' Ryans had few picks, but they did pretty well with them.

Tomorrow: The AFC North and South.

Thursday, April 22, 2010

Fixing IRP

So, I decided to see if the numbers I was compiling for IRP were actually making sense. Since the inherent values for the offensive plays were determined using the league's 2009 Run Expectancy Matrix data, I figured that I could use the league's hit (and walk) totals to estimate the number of runs that should have been scored. If the numbers correspond closely with the actual runs scored, it means that we've got something that really does relate to the most important part of the game: touching all four.

IRP runs scored projection: 20394

Actual MLB 2009 run tally: 22419

Error: 9.03%

Well, that's ballpark - Coors Field-sized, but ballpark. I'd feel much better, though, if we could get the margin of error under 4 or 5%. So, where's the error source? My first decision is to suck it up and include hit by pitch (which some players, such as Ron Hunt, Craig Biggio, and Chase Utley, do have a knack for). This immediately brings the error down to 7.29%. That's certainly better, but it does leave a lot to be desired. Then, after making sure I was using Excel correctly, I start the logical thinking process over again. I quickly latch onto the problem: home runs.

You'll recall I spent a paragraph justifying why a home run was worth a run and a run alone. Well, that paragraph was lousy, and I'm embarrassed that such garbage came from my head. I completely ignored the central idea behind IRP (and my own stern warning to you that you should remember it): the batter is only responsible for the situation the next guy up faces. The batter who hits a home run absolutely deserves the 1 run currently assigned to him, but he also deserves credit for the runs that will score as a result of not making an out. As I see it, the most important thing a batter can do is not make an out, and I wasn't crediting home run hitters for staving off the next half-inning a bit longer.

So, let's add the run expectancy result for each out situation to the one run already assigned. A home run with nobody out, in 2009 at least, is worth 1.52 runs. Averaged out over the three out situations, a home run goes from being worth 1 run in IRP v.1 to being worth about 1.3 runs in IRP v.2. What difference does that make?
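Here's a minimal sketch of that home run arithmetic in Python. The function and variable names are mine, not anything from the spreadsheet. The 0.52 and 0.28 bases-empty figures are the ones quoted in the original IRP post; the two-out figure is never quoted, so the sketch backs it out of the 1.3009 average. Treat that number as an inference from rounded inputs, not a published value.

```python
def home_run_value(re_bases_empty):
    """Average of (1 run + bases-empty run expectancy) over the out states."""
    return sum(1.0 + re for re in re_bases_empty) / len(re_bases_empty)


# Bases-empty run expectancy quoted in these posts (2009 season).
RE_EMPTY_0_OUT = 0.52
RE_EMPTY_1_OUT = 0.28

# The 2009 home run value from the list of inherent values.
HR_VALUE_2009 = 1.3009

# Assumption: solve for the unquoted two-out figure so that the three
# out states average to 1.3009. Inputs are rounded, so this is approximate.
implied_re_empty_2_out = (
    3 * HR_VALUE_2009 - (1 + RE_EMPTY_0_OUT) - (1 + RE_EMPTY_1_OUT) - 1
)

print(round(1 + RE_EMPTY_0_OUT, 2))       # zero-out home run: 1.52
print(round(implied_re_empty_2_out, 2))   # implied two-out expectancy: 0.1
print(round(home_run_value(
    [RE_EMPTY_0_OUT, RE_EMPTY_1_OUT, implied_re_empty_2_out]), 4))  # 1.3009
```

The last line is circular by construction; it only confirms that the three out states, weighted equally, land on the 1.3 figure.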

IRP Runs Scored Projection: 22302

Actual MLB 2009 runs tally: 22419

Error: 0.52%
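For anyone checking the math, the error figures are just the gap between projection and reality as a share of the actual run total. A quick sketch (the function name is my own):

```python
def percent_error(projected, actual):
    """Absolute gap between projection and actual, as a percentage of actual."""
    return abs(projected - actual) / actual * 100.0


ACTUAL_2009_RUNS = 22419

print(round(percent_error(20394, ACTUAL_2009_RUNS), 2))  # IRP v.1: 9.03
print(round(percent_error(22302, ACTUAL_2009_RUNS), 2))  # IRP v.2: 0.52
```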

Now THAT's what I'm talking about. A half-percentage point? Statistically speaking, that's absolutely nothing. Using a season's run expectancy matrix and the hit, walk, and hit by pitch totals, we're able to nail the number of runs scored league-wide. Just multiply the various events by the proper value, add 'em up, and you've got the answer. Just as a refresher, in 2009 those values were:

One-Base - 0.2457

Double - 0.4164

Triple - 0.5825

Homer - 1.3009

(For ease of use, just use 0.25, 0.4, 0.6, and 1.3. It actually comes a tiny bit closer, with the error for league totals being only 0.22% using those approximations, but I suspect that's a random quirk more than anything.)

Now, the question is whether these inherent values remain consistent from year-to-year, and if we can use inherent value data cross-years (using the above 2009 values in 2008, for instance) without significant statistical effect. I tried the latter first, using the 2009 values for each season from 2005 to 2008. Using the exact values rather than the approximations, I got these errors in each year:

2005 - 1.03%

2006 - 2.62%

2007 - 3.30%

2008 - 1.64%

Those are still very good estimates - I'm an engineering major, and we operate under the "5% is good enough" rule of thumb - but they do indicate there is a bit of fluctuation between the exact inherent values over time. That makes sense, of course, since the run expectancy matrixes are similar but not exact from year-to-year, and the trends in the matrixes are over decades rather than single seasons. We can expect the same to be true for the inherent values.

I was curious about those approximations we came up with earlier, for ease of use. Applying them to those same years, we got these errors:

2005 - 0.75%

2006 - 2.36%

2007 - 3.03%

2008 - 1.35%

They all get a little better, and that suggests to me that the numbers bounce around a core value, which the approximations might be closer to than the exact numbers in 2009. With that in mind, I decided to look at the inherent values over the five-year period, 2005 to 2009, using the run expectancy matrix from each season. Here are the results:

Type - 2005 / 2006 / 2007 / 2008 / 2009 / Average / Deviation

One-Base - 0.2596 / 0.2602 / 0.2585 / 0.2495 / 0.2457 / 0.2547 / 0.0066

Double - 0.4254 / 0.4346 / 0.4478 / 0.4239 / 0.4164 / 0.4300 / 0.0120

Triple - 0.6532 / 0.6109 / 0.6207 / 0.6352 / 0.5825 / 0.6205 / 0.0266

Homer - 1.3012 / 1.3171 / 1.3094 / 1.3025 / 1.3009 / 1.3062 / 0.0070

The triple varies the most from year-to-year - not surprising, as the runner on third situation is the most rare of those studied, so there's sample size induced fluctuation. Still, one can see that the numbers are nice and consistent, all plays having relatively small standard deviations. Now I'm going to apply those averages to each individual year and also the numbers for that specific year and look at the percentage error. This should tell us whether we can just declare an average (and make approximations) or if we would be better off using exact, year-specific data. The results:
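The Average and Deviation columns can be rechecked from the yearly values with a few lines of Python. This sketch assumes "Deviation" means the sample standard deviation (dividing by n - 1), which is what reproduces the printed figures; the names are mine.

```python
from statistics import mean, stdev

# Yearly inherent values, 2005 through 2009, copied from the table above.
INHERENT_VALUES = {
    "One-Base": [0.2596, 0.2602, 0.2585, 0.2495, 0.2457],
    "Double": [0.4254, 0.4346, 0.4478, 0.4239, 0.4164],
    "Triple": [0.6532, 0.6109, 0.6207, 0.6352, 0.5825],
    "Homer": [1.3012, 1.3171, 1.3094, 1.3025, 1.3009],
}


def summarize(values_by_year):
    """Return (average, sample standard deviation), rounded to 4 places."""
    return round(mean(values_by_year), 4), round(stdev(values_by_year), 4)


for play, values in INHERENT_VALUES.items():
    average, deviation = summarize(values)
    print(f"{play}: average {average:.4f}, deviation {deviation:.4f}")
```

One caveat: the printed yearly values for doubles average out to about 0.4296 rather than the 0.4300 in the table, so there may be a small transcription slip somewhere in that row. The other three rows match.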

Year - Specific / Average

2005 - 2.49% / 1.62%

2006 - 1.46% / 0.04%

2007 - 0.87% / 0.68%

2008 - 0.32% / 1.03%

2009 - 0.52% / 2.15%

2005 and 2009 were the biggest outlier years according to that big table earlier, so it's unsurprising that the error for the averages there is higher. In 2006, the average projected a total off by all of ten runs. That's just flat out cool. I think the 2005 specific is off by as much as it is because it was a weird year. The triple inherent value is very high, but there weren't that many triples hit (it was virtually tied with 2008 for the fewest in the time interval, and those two years rank far behind the other three), and the one-base value is very low, but there were far fewer one-base plays than in any other year. This means that the real situations were just a bit out of the ordinary, so it becomes a little trickier to predict. I'm not sure, but that seems reasonable. Big picture, though, is that those errors range from "small" to "negligible" to "EUREKA!", so there's nothing to be worried about.

I'm working on providing individual data using these numbers. I plan on using both 5-year and single-season data. For now, though, I'm happy that IRP projects very close to the real value. It is pretty compelling evidence that we can measure the independent value of each play, which is an important step to identifying individual performance. It can also tell us how well each team is taking advantage of its opportunities, or if a player or team is getting unlucky. I see plenty of possibilities. Mostly I'm just happy that the math works out. I'll wrap up by showing you the Independent Run Production and Independent Run Production Average values for the average 2009 big league player.

Average 2009 Major Leaguer (600 Plate Appearances)

1-year IRP: 71.07 runs

1-year IRPA: 0.117 runs/plate appearance

5-year IRP: 73.00

5-year IRPA: 0.120

Monday, April 19, 2010

The Super-Secret Origin of IRP

So early this morning, I had an idea profound enough to wake me up and make me get on the computer. Considering that I had been dreaming about very pleasant things - I don't remember much, but waterskiing pirates were involved - this was not an insignificant occurrence. What moved me so? Why, baseball statistics, of course!

It occurred to me that the result of every plate appearance must have an intrinsic value, value of course meaning runs. That was the original thought rather than the conclusion, and I honestly don't know how I formulated it. I do have a justification, though, which I came up with afterwards. The result of a plate appearance affects how many runs a team can expect to score on average, but it's the circumstance that changes the value of the result (as in, a single is worth more in Case A than in Case B because the bases are loaded in Case A but empty in Case B; it's the situation that dictates the worth, rather than the hit itself). There's no reason to think the result itself will change unless the circumstances around it change. Before we get too heavy, though, here's an example that illustrates just what I'm getting at when I say "value" and gives me something to refer to in my explanation.

The Cardinals are batting. There's nobody out and Ryan Ludwick is on third base. Albert Pujols singles and Ludwick scores. We might therefore think that the single is worth a run - that's the rather noble idea behind Runs Batted In, probably the most maligned statistic in existence. It is true in a concrete way, but there are two big ideas missing: 1) the hit plated a run only because the sequence of events prior to Pujols' plate appearance made it possible - if Ludwick had made an out instead of getting on, no runs would have scored that play; and 2) the batter reaches base and therefore has a chance to score himself, beyond the other results of the play.

Pushing a little more, we can see that the single was actually worth more than a run. According to the run expectancy matrix supplied by Baseball Prospectus, a team with no outs and a runner on third base will score an average of 1.31 runs that inning. Pujols' hit scored Ludwick, and now Matt Holliday comes up with a runner on first base and no outs. In that situation (runner on first, no outs) the Cardinals can expect to score 0.88 more runs. Pujols' hit really did more than just score Ludwick. It also created an opportunity for further scoring.

(This is getting into an area of probability that might be a little confusing. If you're cool, feel free to go on to the next paragraph. If you're my mother, stick around. One would think that since the Cards already scored one run, they should only expect another 0.31. It's important to remember, though, that the situation resets itself every batter, much like a coin-flip. The figure 1.31 was the average of the results of when the batter created situations where more than 1.31 runs could score and when the batter created situations where less than 1.31 runs could score. Obviously, you can't actually score three-tenths of a run, so more or less than the average number will be scored: 0, 1, 2, and so on. Think of the decimal places as telling us which integer is more likely in the grand scheme of things.)

We have statistics in order to identify individual performance, but in a team sport, the natural urge is to figure out how those stats fit into a team's overall showing. In baseball, an offensive player's performance has to be judged by how many runs he's responsible for - after all, runs are the only positive offensive result, so there's nothing that a team should value in its batters besides run production. In our example, though, Ludwick's scoring from third shouldn't factor into assessing how much Pujols contributed to the team's effort, as he wasn't responsible for the situation that scored the run. What he's actually responsible for is the situation that appears for the guy on-deck. That's really the key point behind my dream-interrupting thought, so if you take nothing else away from this post, take that.

So, we've realized that the quality of a plate appearance should really be determined by the circumstance it creates. That passes the eye test: a batter reaches base because he did something right - got a hit, worked a walk - and that increases his team's chances of scoring. He should get credit for that. One thing you could do is just take the difference between the situation the batter faced and the situation that resulted, using the run expectancy matrix. But, in the example, you'd have to take into account that Ludwick scored. Otherwise you'd be docking Pujols about half-a-run despite the fact that he did his number one job - not make an out. If you just add the run scored to the total, you're again giving Pujols credit for the situation he didn't create. What we need to do is separate the valuation of the hit from the run driven in.
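To put numbers on that dilemma, here's a small Python sketch of the naive "difference in run expectancy" approach, using the two figures from the Pujols example. The function name and the optional runs_scored argument are my own framing.

```python
def re_delta(re_before, re_after, runs_scored=0.0):
    """Change in run expectancy across a play, optionally adding runs that scored."""
    return re_after - re_before + runs_scored


RE_RUNNER_ON_THIRD_0_OUT = 1.31  # situation Pujols faced
RE_RUNNER_ON_FIRST_0_OUT = 0.88  # situation he left for Holliday

# Ignore Ludwick's run and Pujols gets docked for hitting a single.
print(round(re_delta(RE_RUNNER_ON_THIRD_0_OUT, RE_RUNNER_ON_FIRST_0_OUT), 2))       # -0.43

# Add the run back and he is credited for a situation he didn't create.
print(round(re_delta(RE_RUNNER_ON_THIRD_0_OUT, RE_RUNNER_ON_FIRST_0_OUT, 1.0), 2))  # 0.57
```

Neither number isolates what the single itself was worth, which is the whole problem.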

You could just give Pujols the 0.88 runs. After all, that specific situation resulted from his plate appearance alone. There's a couple of problems with that, though. First off, it would mean that he'd actually get more credit for making an out and stranding Ludwick at third (the one-out, runner on third situation yields 0.97 runs) than he would by singling. Another problem: even if Pujols had made an out, there would still have been a probability of scoring. Let's say he hit a sacrifice fly and Ludwick scored, bringing up Holliday with nobody on and one out. The situation Pujols created for Holliday would still yield an average of 0.28 runs. Until the third out is recorded, there's always a positive average run scored figure, so it wouldn't be right to give Pujols the full 0.88 runs. That would be crediting him for runs that would have, on average, scored without help from his plate appearance. The last problem, that I can think of right now at least, is that sometimes you'd be crediting the batter with the results of previous plate appearances. It didn't apply in this example, but if there had been a runner on first as well as third, the single would have presented Holliday with two base runners (unless one of them was Enos Slaughter, of course), only one of whom Pujols could properly claim responsibility for. In a real-world sense, yes, that's the situation Pujols left Holliday, but we're wanting to whittle down to the thoroughly independent result in order to find out which results well and truly belong to the individual player.

(There's also a practical problem with trying to deal directly with the run expectancy matrix: time. It would be preposterously time-consuming to go through and record the situation after every single plate appearance for every single player in every single game. It'd be possible with computer programs - that's what Sean Forman does, after all - but I simply don't know enough to be able to manage that task in a timely enough fashion.)

So, what can free the individual's production from the influence of other players? Well, there are three situations that are absolutely free from the effects of earlier batters: nobody on, any number of outs. For instance, with nobody on and nobody out, a team will average 0.52 runs. If the batter singles, the run average goes up to 0.88. This means that the batter's single created an extra 0.36 runs. Look at it. The only play that contributed to that 0.36 increase was the single, for which the batter can claim full responsibility. No other batters and no other events were involved. We can say that, with zero outs, a single by itself is worth 0.36 runs and not worry that the total is under external influence. You repeat the exercise with one and two outs and average the three figures to find out the overall value of the generic single. It's quite reasonable to assume that one-third of all plate appearances take place with zero out, one-third with one out, and one-third with two out - there's simply no substantive reason for that not to be the case - so that's a very easy calculation to make.
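That recipe is short enough to sketch in Python. The names are mine, and the equal weighting of the three out states is the assumption stated above. Only the zero-out pair (0.52 and 0.88) is quoted in this post, so that's the only case run here; the full calculation would pass three before/after pairs, one per out state.

```python
def inherent_value(re_empty_by_outs, re_after_by_outs):
    """Average gain in run expectancy from a bases-empty start.

    re_empty_by_outs: run expectancy with nobody on, one entry per out state.
    re_after_by_outs: run expectancy after the play, same out states.
    Each out state is weighted equally.
    """
    deltas = [
        after - before
        for before, after in zip(re_empty_by_outs, re_after_by_outs)
    ]
    return sum(deltas) / len(deltas)


# Zero-out case only: nobody on (0.52) becomes runner on first (0.88).
print(round(inherent_value([0.52], [0.88]), 2))  # 0.36
```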

A word about home runs: I decided to score a nobody on, nobody out home run as being worth 1 run rather than 1.52 (the run plus the ensuing situation; this was my original strategy). A batter cannot produce more than 1 run in a plate appearance without outside help - he'd need a runner on base in front to drive in or a batter behind him to convert, too. Therefore, we top out at 1 run produced per plate appearance.

I'm about to post a short list containing the runs per plate appearance result. You'll note that I have grouped walks and singles together as "One-Base"; in the vacuum of nobody on, where we can see the inherent value of the result itself and not the effect it has on the surroundings or the surroundings have on it, the one-base plays are well and truly identical. I've left out hit by pitch and reached on error because, honestly, those are largely unpredictable defensive mistakes that don't merit much in the way of credit. They're infrequent enough that ignoring them will have a very minor impact on the data in the end, and we have to leave something for version 2.0, don't we? So, here are the inherent run values for one-base plays, doubles, and triples (rounded to 4 decimal places):

One-Base - 0.2457 runs

Double - 0.4164 runs

Triple - 0.5825 runs

Before I get into the final stretch, I'd like to just provide some thoughts on these results, which I find interesting. The difference between a single and a double and a double and a triple is very, very similar, which is just plain cool. It's somewhat intuitive, but plenty of things you think are that simple just aren't, so that's nifty. What's even more interesting is that a one-base play is almost exactly a quarter of a home run - again, you might guess that it's a quarter, but the fact that it actually is a quarter is awesome - but a triple isn't even three-fifths of a home run. In fact, there's a bigger difference between a triple and a home run than there is between a single and a triple! This really emphasizes how important it is to just get on base. Just look at the 2009 Leaderboard. Walk leader Adrian Gonzalez (119) produced 29.2 runs by taking the free pass. Doubles leader Brian Roberts (56)? 23.3 by double. Triples leader Shane Victorino (13)? Just 7.6 by triple. Just reaching first base frequently produces a lot of runs, although nothing quite packs the wallop of the home run.
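The leaderboard numbers and the gaps between hit types are easy to verify. A quick Python sketch, with names of my own choosing and the home run set to 1 run as in this version of the stat:

```python
# 2009 inherent values; the home run is capped at 1 run in this version.
VALUES_2009 = {
    "one_base": 0.2457,
    "double": 0.4164,
    "triple": 0.5825,
    "home_run": 1.0,
}


def runs_from(event, count, values=VALUES_2009):
    """Runs produced by `count` occurrences of a single event type."""
    return count * values[event]


print(round(runs_from("one_base", 119), 1))  # Gonzalez's walks: 29.2
print(round(runs_from("double", 56), 1))     # Roberts's doubles: 23.3
print(round(runs_from("triple", 13), 1))     # Victorino's triples: 7.6

# Gaps between hit types.
single_to_double = VALUES_2009["double"] - VALUES_2009["one_base"]
double_to_triple = VALUES_2009["triple"] - VALUES_2009["double"]
single_to_triple = VALUES_2009["triple"] - VALUES_2009["one_base"]
triple_to_homer = VALUES_2009["home_run"] - VALUES_2009["triple"]

print(round(single_to_double, 4), round(double_to_triple, 4))  # 0.1707 0.1661
print(round(single_to_triple, 4), round(triple_to_homer, 4))   # 0.3368 0.4175
```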

Anyway, I'm sure you can see how I'm choosing to apply these results. By taking the number of singles+walks, doubles, triples, and home runs a player collects in a season and multiplying them by the proper inherent value, you can see exactly how many runs a player produced independent of his surroundings. We can get a rate statistic by simply dividing this value by the number of plate appearances. This is where the small errors I mentioned before come in, but I really don't think they make enough of a difference to worry about. Even Chase Utley, the three-time reigning hit by pitch king who actually does seem to have a talent for getting hit, would only lose 6 runs using this method. That's pretty small in the grand scheme of things, though I might just bite the bullet and include it anyway. It's starting to niggle at me, though I don't like the idea of messing with a rather pretty looking spreadsheet.
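In code, the whole stat is a weighted sum plus one division. Here's a minimal Python sketch; the function names are mine, and the stat line in the example is made up purely for illustration, not a real player's season.

```python
# 2009 inherent values, with the home run counted as 1 run.
VALUES_V1 = {"one_base": 0.2457, "double": 0.4164, "triple": 0.5825, "home_run": 1.0}


def irp(one_base, doubles, triples, home_runs, values=VALUES_V1):
    """Total run production: each event count times its inherent value."""
    return (
        one_base * values["one_base"]
        + doubles * values["double"]
        + triples * values["triple"]
        + home_runs * values["home_run"]
    )


def irpa(irp_runs, plate_appearances):
    """Rate version: runs produced per plate appearance."""
    return irp_runs / plate_appearances


# Hypothetical stat line (invented for this example): 150 singles+walks,
# 30 doubles, 3 triples, 20 home runs in 600 plate appearances.
total = irp(one_base=150, doubles=30, triples=3, home_runs=20)
print(round(total, 2), round(irpa(total, 600), 3))

# Sanity check on the rate: 65.04 runs over 600 plate appearances.
print(round(irpa(65.04, 600), 3))  # 0.108
```

Swapping in a different set of values (say, a later version with a different home run weight) only requires passing another dictionary as the values argument.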

I'm in the process of crunching the data for every player, but here are a couple of notable ones from 2009. I tried to get two greats, two goods, two averages, and two bads, based on other statistics like OPS+. The totals are first, followed by per plate appearance.

MLB 600 PA Average: 65.04 / 0.108

Joe Mauer: 92.19 / 0.152

Albert Pujols: 117.44 / 0.168

Derek Jeter: 88.31 / 0.123

Troy Tulowitzki: 88.94 / 0.142

Curtis Granderson: 85.52 / 0.120

Kevin Kouzmanoff: 59.02 / 0.103

Yuniesky Betancourt: 43.38 / 0.085

Jason Kendall: 43.52 / 0.083

Interesting. Granderson shows the power of the big fly, as his season was, by all accounts, thoroughly average, but his thirty home runs help him sneak right up on Jeter.

Anyway, the only thing left to do is come up with a name. I like acronyms, so I'm trying to think of a good one. The main idea is about the independent value of each offensive play. Call it Independent Run Production. IRP. The rate value can be Independent Run Production Average, or IRPA. Not insane about it, but it'll do for now. Now I just need to crunch the data.

If you have a better name, or any other thoughts, please let me know!

Sunday, March 21, 2010

MLB: Basic Prognostication

I might go into these races a bit more over the course of the next couple weeks, but I'd like to cover the broad brush strokes for how I expect the playoff and award races to go.

NL East

1. Philadelphia

2. Atlanta

3. New York

4. Florida

5. Washington

This is a division that is potentially loaded. Philadelphia is the closest thing you'll find to a sure-thing, and I think they balance arguably the division's best staff with arguably baseball's best lineup. Atlanta has a starting rotation that could be scary good, but McCann needs help; even Albert Pujols would have trouble producing runs with that group. The Mets are the Mets, Florida is too dependent on young pitching to overcome an iffy lineup, and it's just one year early to call for a surprise Nationals run, something I tentatively plan to do next March.

(Side-note: Nats star Ryan Zimmerman and Rays stud Evan Longoria are ludicrously similar. Last year they had respective lines of .292/.364/.525 and .281/.364/.526. Both hit 33 home runs. Both have glittering defensive reputations backed up by the stats. Considering that some people think Longoria is fast overtaking A-Rod as the game's top man at the hot corner, Zimmerman deserves some love, too. After all, they're practically twins.)

NL Central

1. St. Louis

2. Chicago

3. Houston

4. Cincinnati

5. Milwaukee

6. Pittsburgh

This division has too many teams with playoff upside (in addition to having too many teams, period). Like most, I expect St. Louis to win the division, as they have the best player in the league and a good supporting cast with guys like Holliday, Molina, and the exciting young Rasmus, and they have good-to-great starting pitching. I'm probably ranking Chicago too high, but I'm a big fan of the corner infield combo of Derrek Lee and Aramis Ramirez, and they've got a rotation that could be sneaky good. I'm giving Houston points because Berkman, Carlos Lee, and Oswalt aren't done yet, and there are some young talents like Hunter Pence that could make a run. If any team is going to steal this division from St. Louis, the Astros might have the best chance. Cincy has a potentially deadly staff, but Joey Votto is only allowed to hit once every nine batters. Milwaukee might deserve better than this, but it's a tough division and it's got a very iffy pitching staff. Two great mashers, Fielder and Braun, can't win more than 75 games without help. Pittsburgh will continue to be everybody's Quadruple-A team, although I love Andrew McCutchen.

NL West

1. Los Angeles

2. Colorado (Wild Card)

3. San Francisco

4. Arizona

5. San Diego

The Dodgers have a great young hitting corps and some fine pitchers, so they'll make it three in a row. I wanted to put Arizona at number two and potentially the wild card, but they'll need Webb in order to create the game's best one-two punch with Haren. They've got good young hitters, but Webb's health is too big a concern. Colorado has a Cy Young contender in Jimenez (people still get scared by Coors, but the humidor has mellowed it a lot) and some awesome young players in Tulowitzki and Seth Smith, so I think they're the wild card favorite going in, and could push the Dodgers. San Francisco has one of the league's best one-two starting combos in Lincecum and Cain, but Pablo "Kung Fu Panda" Sandoval is their only hitter who seems to understand the concept of offense. San Diego has a similar situation, but they substitute bad pitching for good pitching. And if you've got bad pitching in Petco, you deserve last place.

AL East

1. New York

2. Boston (Wild Card)

3. Tampa Bay

4. Baltimore

5. Toronto

This will again be baseball's most cutthroat division, as we could be looking at three 90-win teams. The Yankees will score oodles of runs and their pitching is good enough. (Funny thing about the Yankees: Rodriguez is the only guy who even might be baseball's best at his position, and yet this is a lineup that could score 900 runs.) Boston will probably do a better job at run prevention than anyone else in baseball, as their staff is loaded and their defense excellent. They'll be about as good offensively as they were last year - and that's pretty good - and they'll be right in the thick of the best-in-the-Majors discussion all year long. I gave the edge to the Yankees because they can handle an injury to their lineup better than Boston can, and pitching is always the hardest thing to count on. I think both teams get in the playoffs, though. Tampa has loads of talented youngsters, but you're better off banking on the established stars in the Northeast. Baltimore may have two of the division's top three outfielders in Markakis and Jones, and a wonderful catcher in Wieters, but too many pieces have to fall into place for them to make a run at more than .500. Toronto is rebuilding, which is unfortunate for studs like Adam Lind.

AL Central

1. Minnesota

2. Detroit

3. Chicago

4. Cleveland

5. Kansas City

The loss of closer extraordinaire Joe Nathan really hurts the Twins - you ain't replacing 65+ superb innings at the drop of a hat - but Ron Gardenhire will do what he always does: pull wins out of a magical top hat. It also helps when you've got lineup talents like Mauer, Morneau, Cuddyer, Kubel, and Span and you play in a relatively weak division. I like Detroit's starting pitching an awful lot - though I have a problem with young Max Scherzer I'll be writing about soon - and Miggy Cabrera is a wonderful masher. You'll be hard pressed to find a lineup more dependent on the long ball than Chicago, but they're in a good park for it and have a potentially good rotation. Cleveland has Grady Sizemore and that's about it, but somehow it will continue to outperform the lowly Royals. The weird thing about the Royals is they have three remarkable youngsters in Greinke, Soria, and Butler (and a solid one in Alberto Callaspo), and yet they are so terribly bad everywhere else that it doesn't matter. Royals fans must comfort themselves with the knowledge that their three stars will make some fantasy baseball players very happy.

AL West

1. Texas

2. Seattle

3. Los Angeles

4. Oakland

This is a very intriguing race, as three teams are very much in the mix and the fourth has tremendous potential. They could all end up within ten games of each other, which would be remarkable. I'm picking Texas to win because they've got better balance than any other team in this division. They still scored runs despite injuries and bad luck last year, and they should easily be the division's best run producer. They've also got a deep rotation that's got enough talent that some of it will do well, and they've got such a great farm system that they'll be able to handle injuries better than most. Seattle has a wonderful one-two in King Felix and Cliff Lee, they've got a great defense, and while offense is a concern, guys like Franklin Gutierrez and Jose Lopez are much better than they're given credit for. The Angels really don't deserve this third-place listing, but I just can't see them scoring 883 runs again, and they have neither an ace nor a stopper in the bullpen. Oakland has too many ifs, but they've got some young talents who could be very productive.

NL MVP

Albert Pujols will be the league's best player - does that even need to be said? - but I really think the writers will try to get some variety unless Pujols wins that Triple Crown he's been threatening. If the Phillies establish themselves clearly as the National League's best team, Chase Utley will have that going for him, but Matt Kemp might well hit .300, pull off the magical 30-30 homers and steals feat, and win a Gold Glove playing center field for the big-time Los Angeles Dodgers. That's why he's my pick.

AL MVP

Much like the National League, there really shouldn't be any doubt about who the best player actually is: Joe Mauer. I think he's got an excellent chance at repeating, especially since he's now got the "led his team to the playoffs despite major injury to star player" thing going for him. There are other contenders, though. Anybody in Boston or New York, obviously, but I rather like Ian Kinsler to win. He was badly unlucky last year, hitting just .253 due to his ridiculously low batting average on balls in play. That stat usually corrects itself, and Bill James projects that he'll hit 27 home runs, drive in 106 runs, bat .275, and steal 28 bases. Considering he plays very good defense at a premium position and could be the leader on a division-winning club, I like his odds. As a Sox fan and Pedroia nut, it pains me to write all that. (Utley, Pedroia, Kinsler, Cano ... these are some awesome second basemen we have nowadays.)

NL Cy Young

We may have a Randy Johnson-type stranglehold on the Cy Young developing for ol' Tim Lincecum - a great player that I've never much liked, unfortunately - even though Dan Haren is awfully close to being that division's best pitcher. In addition to those two, Adam Wainwright, Roy Halladay, and Ubaldo Jimenez are also contenders. I'll give the edge to Halladay on the premise that he'll get a healthy bump due to the weaker National League lineups and probably snag 20 wins when you combine the Phillies' potent lineup with his admirable ability to go deep into games.

AL Cy Young

Zack Greinke is my personal pick for best pitcher in the Major Leagues, and I have little doubt that he'll have another great season for that terrible franchise in the great town of Kansas City. However, I just can't see the writers going to the pitcher on a 60-win team two years in a row, no matter how awesome he is. Don't forget, these people gave the 2005 Cy Young to Bartolo Colon rather than Johan Santana. It won't be quite such a travesty though, as you've got excellent guys like Felix Hernandez, Justin Verlander, Jon Lester, and C.C. Sabathia pitching for potential playoff teams. Lester will be the ace on the Majors' best staff, and I expect he'll push the league lead in wins, strikeouts, and ERA, so he's my guy.

What? You didn't seriously think my picks would be completely unbiased, did you?

Sunday, February 14, 2010

Pitching: Halladay and Lee

As always, pitchers were the dominant focus of the offseason. Sure, you had the Jason Bay saga - the Mets will like him and the Red Sox won't miss him, but more on that another day - but I'd say that guys like John Lackey, Justin Verlander, Tim Lincecum, Roy Halladay, Felix Hernandez, and Cliff Lee were the biggest stories. Lackey signed with my Red Sox for $82.5 million over five years. Verlander and the Tigers agreed to an $80m/5yr contract extension. Lincecum, probably worried about his Karma, chickened out of an arbitration hearing he surely would have won, and settled for a $23m/2yr deal with the Giants. Lee and Halladay were involved in the big trade that sent Halladay to Philly and shipped Lee off to join Hernandez in Seattle. Hernandez, for his part, was also signed to a five-year deal worth about $80m.

(I'm not quite sure what's up with the similarity of the deals, as I don't think anybody would seriously suggest that Felix Hernandez, Justin Verlander, and John Lackey - or A.J. Burnett, if we stretch back a year - are interchangeable. Maybe there's a Big Name Pitcher Contract template, and all the general managers are just too lazy to change the figures.)

In this first blog entry, I'd like to talk about the biggest deal of them all: the Halladay-Lee Trade. In the minds of some, it was a lateral move - exchanging one ace for another ace. In the minds of others, it was a step backwards - exchanging a proven postseason star for an older pitcher with no playoff experience and some past injuries. In my mind, however, the Phillies made a championship-caliber move. Any time you're in the hunt for a title and you have an opportunity to upgrade your rotation by even a little bit, you have to do it, and Halladay is a better pitcher than Lee (and most every other pitcher in baseball) by a fairly sizable margin. The Phillies came awfully close to winning the World Series last year, and they're a better team with Halladay than with Lee. Certainly they'd be better with both of them than with only one, but we have to assume that the Phillies felt they couldn't make that happen, and it's not unreasonable to give the defending National League Champions the benefit of the doubt.

Let's dismiss the criticisms against Halladay first. He's only about 15 months older than Lee. Given the state of modern training, that's practically nothing when you're talking about guys in their early 30s. He also hasn't had injury problems since 2005, and that's despite leading the league in complete games three straight years and finishing in the top four in innings pitched four straight years. Halladay is no more an injury risk than any other player. He's even a lesser risk than Lee, given recent reports about bone spurs in the southpaw's push-off leg. While the bone spurs aren't major in and of themselves, they can be a persistent problem no matter how good your surgeon is, and better pitchers than Lee have had their careers ended by foot pain. See: Dean, Dizzy.

Sometimes people forget just how good Halladay really is. I know I do. Here in Boston, you hear far more about (the admittedly promising) Clay Buchholz than you do about the former Toronto ace. Much like Arizona's Dan Haren - a guy who has to be in the Majors' best pitcher discussion, by the way - Halladay has simply put up superb numbers for mediocre teams in some random corner of the continent. Halladay's got fantastic complete game and shutout figures, and his winning percentage of .661 is incredible for anyone, let alone someone on at most the third-best team in his division. Let's focus in on WHIP, and see what it has to say. WHIP, or (Walks + Hits) / Innings Pitched, is a great number to look at since pitchers have a great deal of control over their walk rates and over how much good contact batters make. It isn't perfect, but if you ranked pitchers strictly by WHIP, you'd get much closer to an accurate pecking order than you would by looking at traditional metrics such as ERA or strikeouts. It's also an easy number to scale. The best pitchers have WHIPs in the 1.0-1.1 range. The league average is about 1.4. Since 2003, Halladay has either led or been second in the American League in WHIP four times, all while pitching in this decade's toughest division. He was also on pace to lead the league in 2005 before getting hurt.
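
For anyone who wants to run the numbers at home, here's a rough sketch of the WHIP arithmetic in Python. The function names and the sample stat line are mine and purely made up for illustration (they are not any real pitcher's season). The one baseball convention it leans on is that a box score writes innings as "200.2" to mean 200 and two-thirds, not 200.2 in decimal.

```python
def innings_from_notation(ip: str) -> float:
    """Convert box-score innings like "200.2" (200 and 2/3) to a real number."""
    whole, _, outs = ip.partition(".")
    extra_outs = int(outs) if outs else 0
    if extra_outs not in (0, 1, 2):
        raise ValueError("the digit after the point must be 0, 1, or 2 outs")
    return int(whole) + extra_outs / 3


def whip(walks: int, hits: int, innings_pitched: float) -> float:
    """Walks plus hits, all divided by innings pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings pitched must be positive")
    return (walks + hits) / innings_pitched


# Hypothetical season line, invented for this example only.
sample_whip = whip(walks=35, hits=234, innings_pitched=innings_from_notation("239.0"))
print(f"{sample_whip:.3f}")
```

Note the parentheses: add the walks and hits first, then divide. Dividing only the hits by innings and tacking the walks on afterward gives you nonsense.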

Then there's the consistency factor. The former Indian is only two years removed from a 5-8 record and a WHIP over 1.5, while Halladay has essentially been the same pitcher since 2001. Comparing their careers year-by-year, it's interesting to note that Halladay has always finished with a better WHIP than Lee. Yes, that includes Cliff Lee's Cy Young year. (That was a prize he undoubtedly deserved, though I'd say it would have been just as fair for Halladay, the runner-up, to win.) In fact, Halladay's career WHIP of 1.198 is less than a tenth of a point behind Cliff Lee's best season. Neither guy has been a strikeout king, but Halladay has been top 5 three times, while Lee has only been top 10 once. When you look at any of their rates (hits, walks, home runs, strikeouts, strikeouts-to-walks), Halladay has a clear advantage. Really, no matter what metric you use in direct comparison, Halladay has peaked higher, and his averages are as good as Lee's best.

Now let's criticize Cliff Lee. Not really because he deserves it; mostly because I liked looking up the stats. His full season WHIP ranked thirtieth in the Majors. 30th. As in worse than 29 other guys. I don't believe in strict guidelines, but a good rule of thumb is that when a Pittsburgh Pirates pitcher named Ross Ohlendorf is ranked five places ahead of you, you aren't better than one of the decade's elite pitchers. Don't get me wrong. I'd love for my team to have him. I certainly think Seattle has one of the deadliest one-two combinations in baseball ... but there really shouldn't be any doubt about whether Lee is playing the part of Koufax or the part of Drysdale.

It seems pretty clear to me that of the two, Halladay is the better option. While Lee pitched great in the postseason, we have to remember that only a year prior, Cole Hamels was the Fall Classic star. The terrible October he had in '09 should serve as a warning to all those who put too much stock in short series performances. I mean that not as a criticism of Hamels - he's only 26 and he's had several fine seasons and just one pretty unlucky one. I simply don't think we should anoint Lee as the Prince of Big Games and Heir to the Throne of Schilling when his entire career has been a textbook example of inconsistency. Halladay's career has been one of steady excellence, and there's no reason to think that he'll suddenly lose his stuff when he takes the mound during the playoffs.

From a broader perspective, the Phillies made a good decision. Halladay is simply better than Lee, and that was one of the few opportunities they had to improve. They already had an excellent lineup - I'm not a big Ryan Howard fan, but only seven men have ever had more 45 home run seasons, and the guy is only 30 - that can probably go toe-to-toe with the mighty Yankees. Their only deficiency was third base, and they solved that with the Polanco signing. Their bullpen is suspect, but relief pitchers aren't exactly predictable even in the best of circumstances, and it's not like you're going to find a single team in baseball that would feel comfortable trading away a good reliever. If they hadn't pulled the trigger to replace Lee with the superior Halladay, they'd have just treaded water.

The Phillies are good enough to win the National League East even without Halladay. Hanley Ramirez is versatile, but I don't think he can throw 200 innings. Jason Bay is a pretty good slugger, but he won't keep the Mets' rotation healthy. Atlanta is the most serious threat - I really like Tommy Hanson and Brian McCann - but the Phillies have simply too much firepower, and they've got enough depth in the starting rotation to stave off the Braves' largely uninspiring regulars. But winning the division isn't the point. It's all about the World Series. They weren't good enough to win it last year, and unless they wanted to pin their hopes on Placido Polanco, they had an obligation to improve when the opportunity presented itself. If I'm a Phillies fan, I've got to be happy with the team's attitude.