Friday, September 5, 2014

Candidate quality and the forecast

With Alaska Republicans having finally nominated Daniel S. Sullivan as their choice for the Senate seat there--and with my vacation having finally ended--candidate-choosing time has pretty much wrapped up. In nearly all races where primaries have yet to be held, the results are foregone conclusions--New Hampshire's Senate race is going to be Jeanne Shaheen vs. Scott Brown, etc. (The exception is Rhode Island's Democratic primary for the governorship, which is still basically tied and probably won't be settled until September 9.) As a result, it's probably time to start talking about candidate quality.

"Candidate quality" is about as nebulous as political jargon gets. That's largely because it's a composite of a number of qualities that are just as nebulous as "candidate quality", if not more so: name recognition, political experience, etc. However, that doesn't mean we can't quantify it. Some models, like the one FiveThirtyEight recently published, compute candidate quality as a function of the highest political office the candidate has ever held. Depending on what we're trying to calculate, political experience appears to be the single best metric of candidate quality among all possible variables, even better than favorability ratings (although those still probably play a role). But for us, candidate quality is all about trying to figure out how undecided voters will break once it actually comes time to put a name down on the ballot.

While this is (probably) not the same approach others have taken, I've very crudely measured candidate experience by sorting candidates into "tiers" based on their political careers to that point. Specifically:

  • 1st-tier candidates:
    • Multiple-term incumbents (most of these will be senators and not governors, since governors often have term limits that bar them from serving multiple terms).
  • 2nd-tier candidates:
    • Freshman elected incumbents running for re-election (this does not include unelected appointed incumbents running for re-election). 
  • 3rd-tier candidates:
    • Incumbent governor running for Senate
    • Incumbent senator running for governor
    • Former governor or senator running for either office
    • First-term appointed U.S. Senator running for re-election
  • 4th-tier candidates:
    • U.S. Representative
    • Statewide elected official other than governor
    • State legislature party leader
    • Mayor of large city (population 200,000+ or accounts for more than 15% of the state's population)
    • Former U.S. Senator from another state (in practice: Scott Brown, the former U.S. Senator from Massachusetts running for Senate in New Hampshire)
  • 5th-tier candidates:
    • State legislator
    • Statewide appointed official
    • Mayor of a not-large city
    • Former U.S. Representative
    • Former statewide elected official
    • Former mayor of large city
  • 6th-tier candidates:
    • Former state legislator
    • Former statewide appointed official
    • County- or municipal-level executive
  • 7th-tier candidates:
    • Anything else (generic "businessmen" go in here)
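
For anyone who wants to code candidates the same way, here's a rough sketch of the tier list as a lookup table. The role labels (like "us_representative") and the function names are my own shorthand for illustration, not part of any real dataset, and only a handful of the categories are shown; the rest follow the same pattern.

```python
# Hypothetical role labels mapped to the tiers listed above.
# Only some categories are included here; the labels themselves are assumptions.
TIER_BY_ROLE = {
    "multi_term_incumbent": 1,
    "freshman_elected_incumbent": 2,
    "incumbent_governor_running_for_senate": 3,
    "appointed_senator_first_term": 3,
    "us_representative": 4,
    "mayor_large_city": 4,
    "state_legislator": 5,
    "mayor_small_city": 5,
    "former_state_legislator": 6,
    "county_executive": 6,
}


def is_large_city(city_pop, state_pop):
    """Large city per the list: 200,000+ people, or more than 15% of the state."""
    return city_pop >= 200_000 or city_pop > 0.15 * state_pop


def candidate_tier(role):
    """Return the tier for a role label; anything unlisted falls into tier 7."""
    return TIER_BY_ROLE.get(role, 7)
```

A mayor would be coded as "mayor_large_city" or "mayor_small_city" depending on what is_large_city returns for the city and state populations.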

After coding candidates from competitive races in 2006-2013 as tier 1-7, I ran polling data from those races through our averaging formula and calculated the improvement for each candidate--that is, the difference between their eventual performance and their final polling average. I then divided each candidate's improvement by the average proportion of undecided voters in the final polling average to get a variable I called "propimp" (for proportion of improvement). Theoretically, it represents the percentage of undecided voters who broke for a particular candidate.
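
To make the arithmetic concrete, here's a minimal sketch of the propimp calculation. The function and argument names are mine, and the numbers in the usage note are made up for illustration, not taken from any actual race.

```python
def propimp(actual_pct, final_poll_avg_pct, undecided_pct):
    """Proportion of improvement: (actual result - final polling average),
    divided by the undecided share in the final polling average.

    All three arguments are percentages of the vote (e.g. 46.0 for 46%).
    """
    if undecided_pct <= 0:
        raise ValueError("undecided share must be positive")
    improvement = actual_pct - final_poll_avg_pct
    return improvement / undecided_pct
```

For a hypothetical candidate polling at 46% with 8% undecided who finishes with 50%, propimp(50.0, 46.0, 8.0) gives 0.5, meaning half of the undecided voters (in theory) broke that way. A candidate who polls at 48% and finishes at 47% gets a negative value, and one who jumps from 44% to 54% gets a value above 1.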

Now here's what I got when I ran a simple linear regression between propimp and the candidate's tier T:


In the plot, blue is a Democrat, red is a Republican, and green is an independent or third-party candidate. Note first that it is possible to have negative propimp values; that means that polls overstated a candidate's eventual performance. It's also possible to have propimp values greater than 1; that simply means that not only did undecided voters break for a candidate, but some of the opponent's supporters also broke for that candidate. Of course, it also means that one candidate has to have propimp < 0 if another is to have propimp > 1.

The negative correlation is expected--as a candidate's tier number T goes up (i.e., the candidate is less experienced), that candidate's share of the undecided voters goes down. The fit isn't stellar, but the correlation is statistically significant (t = -4.03) and strong enough that undecided voters broke at least evenly for all but one of the first-tier candidates, while no bottom-tier candidates received more than 50% of the undecided vote; indeed, over half of them polled better than they ended up performing in the actual election. It is not a Republican or Democratic phenomenon (although, as it happens, 2006 and 2010 saw a lot of competitive races featuring incumbent Democrats, which is why the left side of the graph looks a lot bluer compared to the right).
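
For reference, here's a bare-bones sketch of how a simple linear regression and the t-statistic on its slope can be computed by hand. This is plain ordinary least squares in standard-library Python, not necessarily the tool I used, and the function name is an assumption for illustration.

```python
import math


def simple_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, t_stat), where t_stat is the slope divided
    by its standard error. Needs at least 3 points and some spread in xs.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    ssr = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    t_stat = slope / std_err if std_err > 0 else math.copysign(math.inf, slope)
    return slope, intercept, t_stat
```

Here xs would be the tier T for each candidate and ys the matching propimp values; a negative slope with a t-statistic well below -2 is what "statistically significant negative correlation" looks like in this setup.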

Ultimately, what this gives us is a fairly strong predictor variable--a candidate's experience tier--that can be used to inform the candidate quality terms in the model. I don't, however, believe that it's the only thing that belongs in the model--specifically, I feel that favorability ratings also go some way toward predicting how undecided voters will break. More popular candidates are better candidates, just like more experienced candidates are better candidates.
