Research Article


Breaking Empirical Deadlocks in the Study of Partisanship: An Overview of Experimental Research Strategies

Donald P. Green

Department of Political Science, Columbia University, 420 W. 118th Street, 7th Floor IAB, New York, NY 10027, USA; E-Mail: [email protected]

Submitted: 19 December 2012 | In revised form: 9 February 2013 | Accepted: 20 February 2013 | Published: 5 March 2013

Abstract: The vast literature on party identification has gradually become bogged down by disputes about how to interpret observational data. This paper proposes the use of experimental designs to shed light on the responsiveness of party identification to short-term forces such as retrospective performance evaluations. Examples of recent field experiments are used to illustrate two types of experimental designs and the assumptions on which they rest.

Keywords: causal inference; field experiments; party identification; research design

1. Introduction

The vast behavioral literature on party identification has been propelled by a series of methodological innovations. The initial conceptualization of party identification as an enduring attachment that shapes the way in which voters view political figures and issues [1] was prompted by the growth and development of survey research in the early 1950s, and theoretical refinements followed as surveys became more widespread and sophisticated [2–4]. During the mid-1970s, nonrecursive statistical models became part of the political science toolkit, and a torrent of studies called into question the assumption that causation flows in one direction from party attachments to issue positions [5,6], performance evaluations [7,8], and candidate evaluations [9]. This line of attack drew on a wide array of surveys, including several conducted outside the United States [10]. By the mid-1980s, political scientists had grown deeply skeptical of the view that party identification is an unmoved mover, developed early in life and unresponsive to short-term changes in the political environment. The simultaneous equations models of the 1970s and early 1980s, however, came under criticism in the wake of another methodological development, the analysis of covariance structures as a means of addressing biases due to measurement error. Response error was said to produce a variety of statistical artifacts, leading scholars to exaggerate the rate of partisan change [11,12] and the responsiveness of partisanship to short-term shifts in the way that voters evaluate incumbent performance and candidates' issue stances [13] in a variety of cross-national settings ([14], but see [15–17]). The most recent methodological innovation was the analysis of aggregate survey data, made possible by the accumulation of several decades of quarterly polling data by commercial and news organizations [18]. This evidence was initially interpreted as demonstrating the malleability of partisanship in the wake of economic fluctuations and scandals, although subsequent work that took sampling variability [19,20] and question wording effects [21] into account tempered this conclusion.

Each wave of methodological innovation has introduced new evidence into debates about the nature and origins of party attachments, but uncertainty remains about how to interpret the results given the welter of competing methodological claims. The study of partisanship currently finds itself in a state of deadlock between theoretical perspectives that emphasize the stability of partisan identities (and social identities more generally) in polities where the parties and their social constituencies are stable [22] and theoretical perspectives that regard partisanship as a running tally of past performance evaluations [7,23], a summary of expectations about future performance [24], or a manifestation of voters' ideological proximity to the parties [6,15].

How might researchers break this deadlock? Many of the central debates ultimately come down to questions of causal inference. The reason methodological debates about two-way causal flows, measurement error, and other specification issues have played such a prominent role in the literature on party identification is that the evidence base is almost entirely drawn from nonexperimental research.
Cross-sectional surveys, panel surveys, and aggregate time-series furnish the data analyst with variation in partisanship and variation in the putative causes of partisanship. What to make of the correlation between these two sets of variables hinges on the substantive modeling assumptions that researchers bring to bear when analyzing the data. Do voters' policy views cause them to adjust their partisan attachments in light of party platforms, or do voters instead follow party leaders' pronouncements on prominent issues of the day [25]? Or do correlations between policy views and party attachments reflect unmeasured variables with which they are both correlated? Sorting out cause and effect statistically requires the researcher to trace this correlation to some putatively exogenous initial conditions. For example, in cross-sectional analysis (e.g., [5]), the identifying assumption is that certain demographic variables predict issue stances but are unrelated to omitted causes of party identification. In panel analysis, the core assumption is slightly weaker: subjects' background attributes and prior attitudes are related to current partisanship only insofar as they influence contemporary issue stances and performance evaluations (e.g., [8]). In time-series analysis, the identifying assumptions are somewhat more complex because they involve a range of propositions about how partisanship and short-term forces are measured over time and how the dynamics of each series are modeled [19,23,26]. Each of the competing approaches involves strong and untestable modeling assumptions. New statistical techniques (e.g., matching) that introduce untestable assumptions of their own are unlikely to advance this literature. Even if voters who harbor different policy views were precisely matched in terms of their measured attributes, a researcher might still wonder whether their different partisan attachments reflect unmeasured attributes, such as pre-adult socialization experiences, that are correlated with policy stances [27].

During the past decade, largely in response to the kinds of identification problems just mentioned, another methodological innovation has taken root in the social sciences. Increasingly, researchers in political science and economics have turned to randomized experiments in order to facilitate causal inference. Experimental designs by no means eliminate problems of inference, but they nonetheless represent an important advance that, at a minimum, calls attention to subtle issues of identification and interpretation. This essay discusses a pair of recent studies that illustrate two broad classes of experimental designs. The first addresses the question of what kinds of stimuli cause people to alter their partisan attachments; the second addresses the question of what downstream consequences follow from an exogenously-induced change in partisanship. We begin by introducing the logic of inference that underlies randomized experiments, then discuss the identification strategies that underlie each study, and conclude by suggesting how an experimental agenda might advance the literature on party identification.

2. Inference from Direct and Downstream Experiments

Randomized experiments—and research designs that attempt to approximate random assignment—are often explicated in terms of a potential outcomes framework [28,29]. The advantages of this framework for statistical practice are twofold: it makes clear what is meant by causal influence and encourages researchers to attempt to estimate causal parameters without invoking the assumption that all individuals are subject to the same treatment effect. These advantages have special value for the literature on party identification, which tends to gloss over important issues of identification, especially as they pertain to variation in treatment effects from one person to the next. What follows is a brief introduction to the potential outcomes framework, drawing on the more extensive presentation in [30].

Before delving into the specifics of how partisanship is influenced by other factors, such as voters' economic assessments or policy stances, let's consider the problem of causal inference in abstract terms. We begin by supposing that each person $i$ harbors two potential outcomes. Let $Y_i(0)$ be $i$'s partisanship if $i$ is not exposed to the treatment, and $Y_i(1)$ be $i$'s partisanship if $i$ is exposed to the treatment. The treatment effect is defined as:

$$\tau_i \equiv Y_i(1) - Y_i(0) \qquad (1)$$

In other words, the treatment effect is defined as the difference between two potential states of the world, one in which the individual receives the treatment, and another in which the individual does not. Extending this logic from a single individual to a set of individuals, we may define the average treatment effect (ATE) as follows:

$$\text{ATE} \equiv E[\tau_i] = E[Y_i(1)] - E[Y_i(0)] \qquad (2)$$

where $E[\cdot]$ indicates an expectation over all subjects. Although empirical research may serve many purposes, one principal aim is to estimate the ATE, the average effect of introducing some sort of information, policy, or incentive.
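To make these definitions concrete, consider the following short simulation sketch in Python. All names and parameter values here are hypothetical illustrations rather than data from any study discussed in this paper; the sketch simply constructs both potential outcomes for a simulated population, something no real dataset permits, and computes the true ATE.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical potential outcomes on an arbitrary partisanship scale:
# y0 is partisanship if untreated, y1 is partisanship if treated.
y0 = rng.normal(loc=0.0, scale=1.0, size=n)
tau = rng.normal(loc=0.5, scale=0.2, size=n)  # heterogeneous individual effects
y1 = y0 + tau

# Equation (2): the ATE is the average of the individual treatment effects.
ate = (y1 - y0).mean()
print(f"True ATE: {ate:.3f}")  # close to 0.5 by construction
```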

In an actual experiment or observational study, we observe subjects in either their treated or untreated states. Let $D_i$ denote the treatment status of each subject, where $D_i = 1$ if treated and 0 if not. The difference in expected outcomes among those who are treated and those who are not treated may be expressed as:

$$E[Y_i(1) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0] \qquad (3)$$

where the notation $E[A_i \mid D_i = B]$ means the average value of $A_i$ among those subjects for which the condition $D_i = B$ holds. For example, one could compare average outcomes (party identification scores) among those who evaluate the economy positively ($D_i = 1$) to average outcomes among those who evaluate the economy negatively ($D_i = 0$).

In a typical observational study, the observed difference in partisanship between those who evaluate the economy positively or negatively may not, in expectation, reveal the average causal effect of economic perceptions. We observe average outcomes for the treated subjects in their treated state and average outcomes of the untreated subjects in their untreated state. To see how this quantity is different, in expectation, from the ATE, we rewrite Equation (3) as:

$$E[(Y_i(1) - Y_i(0)) \mid D_i = 1] + \{E[Y_i(0) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0]\} \qquad (4)$$

In other words, the expected difference in outcomes of the treated and untreated can be decomposed into the sum of two quantities: the average treatment effect for a subset of the subjects (the treated), and a selection bias term. The selection bias term (in braces) is the difference between what the outcome $Y_i(0)$ would have been for those who are treated had they not been treated and the value of $Y_i(0)$ observed among those who were not treated. The threat of selection bias arises whenever systematic processes determine which people receive treatment. In this example, if people choose the sorts of economic news they read and remember, expected $Y_i(0)$ potential outcomes may be quite different among those who evaluate the economy positively or negatively.
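A brief continuation of the hypothetical simulation sketch above shows how the selection bias term in Equation (4) contaminates a naive comparison when treatment take-up is correlated with potential outcomes:

```python
# Self-selection: suppose subjects with higher untreated partisanship (y0)
# are more likely to take the treatment (e.g., to seek out upbeat economic
# news). Take-up probability rises with y0.
p_treat = 1 / (1 + np.exp(-y0))
d = rng.binomial(1, p_treat)

# Naive comparison of observed outcomes, as in Equation (3).
y_obs = np.where(d == 1, y1, y0)
naive = y_obs[d == 1].mean() - y_obs[d == 0].mean()

# The selection bias term in braces in Equation (4).
selection_bias = y0[d == 1].mean() - y0[d == 0].mean()

print(f"Naive difference: {naive:.3f}")           # well above the true ATE
print(f"Selection bias:   {selection_bias:.3f}")  # accounts for the gap
```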

Random assignment solves the selection problem. When random assignment determines which treatment each subject receives, $D_i$ is independent of potential outcomes. Those randomly selected into the treatment group have the same expected outcomes in the treated state as those randomly assigned to remain untreated (control group):

$$E[Y_i(1) \mid D_i = 1] = E[Y_i(1) \mid D_i = 0] = E[Y_i(1)] \qquad (5)$$

By the same token, those randomly assigned to the control group have the same expected $Y_i(0)$ outcomes as those assigned to the treatment group:

$$E[Y_i(0) \mid D_i = 0] = E[Y_i(0) \mid D_i = 1] = E[Y_i(0)] \qquad (6)$$

Equations (5) and (6) reveal why, when subjects are randomly treated, the selection bias term vanishes and the difference between treatment and control group averages provides an unbiased estimate of the ATE. This identification result can be shown by substituting Equations (5) and (6) into Equation (3):

$$E[Y_i(1) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0] = E[Y_i(1)] - E[Y_i(0)] \qquad (7)$$

This proof demonstrates an attractive property of randomized experiments. At the same time, it glosses over two implicit assumptions. One assumption, which plays a minor role in the analysis that follows, is the stable unit treatment value assumption [29], which stipulates that potential outcomes do not depend on which subjects are assigned to treatment. This assumption is jeopardized, for example, when the treatment administered to one subject affects the outcomes of other subjects. More pertinent to our discussion below is the exclusion restriction assumption [31], which requires that outcomes respond solely to the treatment itself and not to the assigned treatment or other backdoor causal pathways that are set in motion by the assignment to treatment or control. For example, we must assume that when we randomly assign economic evaluations, we are not inadvertently deploying other treatments, such as information about the party platforms on environmental issues.
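Replacing self-selection with a coin flip in the same hypothetical simulation illustrates the identification result in Equation (7): under random assignment, the difference in group means estimates the ATE without selection bias.

```python
# Random assignment: treatment status is now independent of potential outcomes.
d_rand = rng.binomial(1, 0.5, size=n)
y_obs = np.where(d_rand == 1, y1, y0)

# Difference in treatment and control group means, as in Equation (7).
diff_in_means = y_obs[d_rand == 1].mean() - y_obs[d_rand == 0].mean()
print(f"Difference in means: {diff_in_means:.3f}")  # close to the true ATE
```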

Readers may be wondering whether an experiment could feasibly assign how people evaluate the economy. The answer is probably not, and we must therefore introduce another layer of notation to describe the imperfect translation of intended treatments into actual treatments. Let $Z_i = 1$ if a subject is assigned to the treatment group, and $Z_i = 0$ if the subject is assigned to the control group. In experiments with full compliance, all those assigned to the treatment group ($Z_i = 1$) also receive the treatment ($D_i = 1$), and all those assigned to the control group ($Z_i = 0$) are untreated ($D_i = 0$). In experiments with some degree of noncompliance, $D_i \neq Z_i$ for some subjects. Encouragement designs, for example, attempt to induce some subjects to take the treatment $D_i$ but recognize that there may be some subjects who will fail to do so or who will take the treatment even when not encouraged.

In the context of experiments that encounter noncompliance, the exclusion restriction holds that $Y_i(d,z) = Y_i(d)$ for all values of $d$ and $z$. In other words, potential outcomes respond solely to actual treatment, not assigned treatment. Consider a recent survey experiment by Middleton [32] that randomly encourages some subjects to read upbeat news stories about the economy ($Z_i$) in an effort to change their assessment of national economic conditions ($D_i$), which in turn may affect their partisanship ($Y_i$). The causal effect of interest is the influence of $D_i$ on $Y_i$, but $D_i$ itself is not randomly assigned. The exclusion restriction holds that assignment $Z_i$ has no influence on $Y_i$ except insofar as it affects $D_i$, which in turn affects $Y_i$. In other words, the encouragement to read a news story is assumed to affect partisanship only insofar as the encouragement changes assessments of national economic conditions.

In order to recover the causal effect of $D_i$ on $Y_i$ using an encouragement design, we need one further assumption known as monotonicity [31]. Describing this assumption requires a bit more terminology. Depending on the way their received treatments potentially respond to treatment assignment, subjects may be classified into four types: Compliers, Never-takers, Always-takers, and Defiers. Compliers are subjects who take the treatment if and only if assigned to the treatment. For this group $D_i(1) - D_i(0) = 1$. Never-takers are those who are always untreated no matter their assignment: $D_i(1) = D_i(0) = 0$. Conversely, Always-takers are those who are always treated no matter their assignment: $D_i(1) = D_i(0) = 1$. Defiers are those who take the treatment if and only if they are assigned to the control group: $D_i(1) - D_i(0) = -1$. The monotonicity assumption stipulates that there are no Defiers. In the context of our running example, when assigned to receive upbeat economic news, everyone's economic assessments either remain unchanged or become more buoyant. Notice that the monotonicity assumption has nothing to do with potential outcomes concerning partisanship, $Y_i$. Monotonicity refers only to the relationship between assigned treatment and actual treatment.
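The compliance types can be made concrete by extending the hypothetical simulation sketch: each subject is given a latent type, and received treatment follows from type and assignment. Because no Defiers are created, monotonicity holds by construction.

```python
# Latent compliance types (the proportions are hypothetical).
types = rng.choice(["complier", "never-taker", "always-taker"],
                   size=n, p=[0.3, 0.5, 0.2])
z = rng.binomial(1, 0.5, size=n)  # randomized encouragement

# Received treatment D(z) follows from type and assignment;
# Compliers take the treatment if and only if encouraged.
d = np.where(types == "always-taker", 1,
             np.where(types == "never-taker", 0, z))

y_obs = np.where(d == 1, y1, y0)
```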

Under the stable unit treatment value, exclusion restriction, and monotonicity assumptions, one can identify the ATE among Compliers [31]. This quantity, the Complier Average Causal Effect (CACE), is estimated by dividing two quantities. The numerator in Equation (8) is the average outcome in the assigned treatment group minus the average outcome in the assigned control group; the denominator is the observed rate of treatment in the assigned treatment group minus the observed rate of treatment in the assigned control group:

$$\widehat{\text{CACE}} = \frac{\hat{E}[Y_i \mid Z_i = 1] - \hat{E}[Y_i \mid Z_i = 0]}{\hat{E}[D_i \mid Z_i = 1] - \hat{E}[D_i \mid Z_i = 0]} \qquad (8)$$

This ratio is equivalent to the estimate generated by an instrumental variables regression of $Y_i$ on $D_i$ using $Z_i$ as an instrumental variable. Because the denominator is a difference between two quantities that are subject to sampling variability, this ratio is consistent but not unbiased and becomes undefined when the treatment rate in the two experimental groups is the same. Precise estimation requires a substantial difference in treatment rates, a point that has special importance for the analysis of what Green and Gerber [33] refer to as downstream experiments.
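Equation (8) can be computed directly from the simulated encouragement data above. The helper below is a hypothetical illustration of this estimator (often called the Wald estimator); in this just-identified case it matches what an instrumental variables regression of $Y_i$ on $D_i$ with instrument $Z_i$ would return.

```python
def wald_estimate(y, d, z):
    """Equation (8): the intent-to-treat effect on the outcome divided
    by the intent-to-treat effect on treatment take-up."""
    itt_y = y[z == 1].mean() - y[z == 0].mean()
    itt_d = d[z == 1].mean() - d[z == 0].mean()
    return itt_y / itt_d

# Recovers the ATE among Compliers in the simulated data.
print(f"Estimated CACE: {wald_estimate(y_obs, d, z):.3f}")
```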

A downstream experiment is one in which an initial randomization causes a change in an outcome, and this outcome is then considered a treatment affecting a subsequent outcome. For example, in Middleton's study of the effects of news coverage on economic assessments [32], subjects in an internet survey were assigned to read newspaper coverage of the 2008 economic crisis. Random assignment produced a change in economic evaluations. A downstream analysis might examine the consequences of changing economic evaluations on party identification. This analysis parallels an encouragement design in terms of its underlying assumptions (stable unit treatment value, exclusion restriction, monotonicity), mode of analysis (instrumental variables regression), and causal estimand (the CACE). Of special importance is the exclusion restriction, which holds that exposure to news stories had no effect on party identification through paths other than economic evaluations. When these assumptions are met, the experimenter obtains consistent estimates of the ATE among Compliers, who are in this case those whose economic evaluations are favorable if and only if they are exposed to the news stories. In order to estimate the CACE with reasonable power, there must be ample numbers of Compliers, which is to say that the news stories must have a sizable impact on economic evaluations. Small numbers of Compliers also mean that a slight violation of the exclusion restriction may lead to severe bias. Thus, the most informative experiments are those that set in motion substantial changes in causal variables, such as economic assessments.

In sum, random assignment allows researchers to sidestep the selection problem, but important assumptions remain. Both full-compliance and encouragement designs force the researcher to impose exclusion restrictions. Encouragement designs require the additional assumption of monotonicity and confine the causal estimand to the average treatment effect among Compliers. Whether one can safely generalize from the ATE among Compliers to the ATE for other subgroups is an open question that may be addressed empirically through replication using different sorts of encouragements ([30], chapter 6).

From the standpoint of estimation, this framework departs markedly from the way in which researchers typically analyze observational data. Using the estimator described in Equation (8), a researcher compares subjects according to their experimental assignments, not according to the treatments they actually receive. Precise estimation requires that the assigned treatments bear a reasonably strong relationship to the treatments that subjects actually receive. In other words, the use of instrumental variables regression to estimate the CACE requires an experimental design that generates ample numbers of Compliers.

In order to see these assumptions and design considerations in action, we next consider a pair of recent experiments. The first assesses the influence of information about incumbent performance on party identification. The second considers the downstream effects of randomly-induced party registration on party identification. Because the technical issues surrounding the downstream study are more complex, we discuss the experiment in more detail.

3. Chong et al. (2011) [34]

Chong, De La O, Karlan, and Wantchekon [34] report the results of a field experiment conducted in Mexico shortly before its 2009 municipal elections. Their intervention followed in the wake of a federal audit of municipal governments. These audits graded municipal governments according to whether they had accounting irregularities indicative of corruption; the auditors also noted whether local administrators had failed to spend federal grant money, suggesting a low level of administrative competence. The researchers conducted a precinct-level leafleting campaign designed to publicize some aspect of the auditors' reports. Some 1,910 precincts were randomly assigned to a control group that received no leaflets. Three random subsets of 150 precincts each received one type of treatment flyer. The first treatment publicized the degree to which the municipality failed to spend federal grant funds. The second publicized the failure to spend grant funds that were supposed to aid the poor. A third graded the municipality according to the amount of evidence of corruption.

Much of the authors' report focuses on how precinct-level vote outcomes changed in the wake of the leafleting campaign; for our purposes, the relevant part of the study examines the effects of the intervention on the individual-level attitudes of 750 respondents who were sampled from 75 of the precincts and surveyed two weeks after the election. Since Mexican elected officials are forbidden to seek reelection, voter displeasure cannot be directed at incumbent candidates; the relevant target is the incumbent party. Chong et al. find that negative report cards addressing corruption (but not failure to spend grant money) significantly diminish respondents' approval of the incumbent mayor and identification with the incumbent's political party. Unfortunately, no follow-up surveys were conducted to assess the extent to which the effects persisted beyond two weeks. Nevertheless, the study remains one of the first experiments to show that party attachments change when performance evaluations are altered exogenously.

Given the sheer number of studies on the topic of party identification, readers may be surprised to learn that the Chong et al. study is among the very few that have attempted to influence party identification via an experimental manipulation. One exception is Cowden and McDermott [35], which reports the results of a series of laboratory studies that sought to influence party attachments through, among other things, role-playing exercises in which undergraduate subjects were asked to take a pro- or anti-Clinton position. None of their interventions succeeded in changing party attachments. Similarly, although split ballot designs have often been used to assess the effects of question wording on responses to party identification measures (e.g., [36]), survey experiments have seldom assessed whether party identification moves in the wake of information about party platforms or performance. A notable recent exception is Lupu [37], which uses a split ballot design to assess the effects of information on party identification in Argentina. Lupu's work builds on Russian, Polish, and Hungarian experiments reported by Brader and Tucker [38]. Unfortunately, these experiments do not measure whether information effects persist over time, a limitation that makes it difficult to interpret the small and contingent treatment effects that these authors report. One of the attractive features of the Chong et al. study is that its intervention and outcome assessment occur at different points in time.

Let's now consider the Chong et al. study from the standpoint of the core assumptions discussed in the previous section. The exclusion restriction in this instance stipulates that random distribution of corruption-related leaflets influences outcomes because it provides evaluative information about incumbent performance. The authors present convincing evidence that the leaflets did tarnish the image of incumbents who were accused of corruption and that precinct-level votes for incumbents accused of corruption were lowered significantly. As for the assumption of excludability, which holds that random assignment affects outcomes only through the treatment, there seem to be few backdoor paths that could explain the effect on partisanship: the leaflets were distributed toward the end of the campaign period, preventing incumbents from responding to the messages; the leaflets themselves did not mention political parties; and the post-election surveys did not prime the respondents to think about the leaflets they might have received. The lack of immediate connection between the intervention and the survey represents an advantage of the Chong et al. design in comparison to the split ballot experiments of Lupu [37] and Brader and Tucker [38].

In sum, the Chong et al. design represents an instructive example of an experimental study that measures the extent to which party identification responds to a theoretically informative, real-world intervention. Although more research of this kind needs to be done before one can draw robust conclusions about party attachments in Mexico or elsewhere, this study suggests that performance-related information regarding corruption has a short-term effect on partisanship, while information concerning unspent grant funds had negligible effects.

4. Gerber, Huber, and Washington (2010) [39]

In the context of the hotly contested presidential primaries of 2008, Gerber, Huber, and Washington [39] conducted an experiment in which they sought to create partisan attachments among self-identified independents. In January of 2008, as the presidential primaries of both parties were intensifying, the authors conducted a survey of registered voters in Connecticut who, when registering, declared themselves unaffiliated with any political party. This declaration rendered them ineligible to vote in the upcoming presidential primaries. Among those who declared themselves in the survey to be independents (including those who "lean" toward the Democrats or Republicans when asked a standard follow-up question about which party they feel closer to), half were randomly selected to receive a letter a week or two later informing them that they must register with a party in order to vote in that party's presidential primary election on 5 February. The letter also included a registration form enabling them to register with a party. In June, respondents were reinterviewed and asked about their party identification, as well as their issue stances and other evaluations.

This experiment parallels the encouragement design described earlier. The pool of experimental subjects comprised self-described independents who were interviewed in January. Random assignment ($Z_i$) determined which subjects were sent a letter. The letter was literally an encouragement to register with a political party. Although the letter might ordinarily be considered the treatment in a standard design, the treatment in the downstream experiment ($D_i$) was whether the subject actually registered as a Democrat or Republican. (The authors discuss other potential outcome variables, such as whether subjects vote in the presidential primaries; what follows is a simplified version of their analysis that conveys the basic logic of the design.) Some members of the control group registered without encouragement; some members of the treatment group failed to register despite encouragement.

The mismatch between assigned and actual treatment prevents us from estimating the ATE for the sample as a whole; instead, we must set our sights on estimating the ATE for Compliers, those who register with a major party if and only if encouraged. In order to identify the CACE, we must assume monotonicity, or the absence of Defiers. In this case, Defiers are those who would register with one of the two major parties if and only if they are assigned to the control group. Intuition suggests that few voters are so hostile to form letters from public officials that they would cancel their plans to register with a major party if (and only if) encouraged to do so. Monotonicity appears to be a plausible assumption here.

Under monotonicity, those who register with a major party in the control group are Always-Takers, and those who register in the treatment group are a combination of Always-Takers and Compliers. Since the treatment and control groups were selected randomly, in expectation they should have the same shares of Always-Takers and Compliers. Thus, the share of Compliers can be estimated by subtracting the party registration rate (7.23%) in the control group (N = 346) from the party registration rate (13.61%) in the treatment group (N = 360). This estimate (0.1361 − 0.0723 = 0.0639) forms the denominator of the estimator in Equation (8). The t-ratio for this estimated effect is 2.78. Using the full sample of subjects (rather than just those reinterviewed in June) leaves no doubt about the robustness of the relationship. For these 2,348 subjects, the t-ratio is 5.48. The experiment did not generate an enormous share of Compliers, but clearly there are enough to support a downstream analysis.

The numerator of Equation (8) is the observed difference in outcomes, in this case, identification with a major party when re-interviewed several months later. Identification could be measured in various ways; for purposes of illustration, we will use the convention of measuring partisan strength by folding the 7-point party identification scale at the center (pure independent) and counting independent leaners as 1, weak partisans as 2, and strong partisans as 3. Using this scoring method, partisan strength averaged 1.0361 in the treatment group, as compared to 0.9624 in the control group. In other words, assignment to receive a letter boosted average partisan strength by 1.0361 − 0.9624 = 0.0737 scale points. Putting the numerator and denominator together gives us the instrumental variables regression estimate of the CACE:

$$\widehat{\text{CACE}} = \frac{\hat{E}[Y_i \mid Z_i = 1] - \hat{E}[Y_i \mid Z_i = 0]}{\hat{E}[D_i \mid Z_i = 1] - \hat{E}[D_i \mid Z_i = 0]} = \frac{1.0361 - 0.9624}{0.1361 - 0.0723} = \frac{0.0737}{0.0639} = 1.153 \qquad (9)$$

This estimate suggests that among Compliers, those who register with a party if and only if encouraged to do so, the act of registering with a party increases partisan strength by 1.153 scale points. The magnitude of this effect is not trivial: in their pre-election round of interviews with registered voters who were not registered with a party (including respondents who were not part of the letter experiment because they were weak or strong partisans), the average level of partisan strength was 1.01 with a standard deviation of 0.85.
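As a check on the arithmetic, the calculation in Equation (9) can be reproduced from the summary statistics quoted above. This is a sketch using the rounded figures reported in the text; the small discrepancy with the published estimate reflects rounding of the inputs.

```python
# Intent-to-treat effects implied by the reported group means and rates.
itt_strength = 1.0361 - 0.9624   # letter's effect on folded partisan strength
itt_register = 0.1361 - 0.0723   # letter's effect on party registration
print(f"Estimated CACE: {itt_strength / itt_register:.3f}")
# Prints 1.155; rounding of these inputs explains the small gap from
# the 1.153 figure shown in Equation (9).
```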

Before drawing substantive inferences based on this estimate, let's first evaluate the plausibility of the exclusion restriction in this application, an issue that Gerber, Huber, and Washington discuss in detail ([39], pp. 737–741). Clearly, the encouragement letters ($Z_i$) influenced party registration ($D_i$) and partisan strength ($Y_i$). The question is whether the exclusion restriction $Y_i(d) = Y_i(d,z)$ is plausible; could it be that potential outcomes for partisan strength respond not only to whether people register with a party but also to whether they receive a letter? The letters themselves were designed to be empty of partisan content; they simply remind voters of the administrative fact that a change of registration will be necessary if they want to participate in an upcoming election. In terms of measurement procedures, the authors took care to assess outcomes in the June survey in ways that preserved the symmetry between treatment and control groups, avoiding any questions that would prompt members of the treatment group to recall the letter or the circumstances surrounding their change in registration. In terms of substantive confounders, it is possible that the letters piqued voters' interest in the campaign, so that even if they did not change their registration, their partisan attachments were altered. This backdoor pathway from $Z_i$ to $Y_i$ seems unlikely, and the authors found no evidence that subjects in the treatment group were any more interested or hungry for political information when interviewed in June (p. 739).

If we accept the exclusion restriction, two issues of interpretation remain. The first is whether one can generalize from the estimated ATE for Compliers to causal effects for other subjects, contexts, and interventions. Would the results be the same if one's treatment caused every person who was registered but unaffiliated with a party to change party registration? This question is best settled by follow-up experiments that assess whether the results depend on the number and frequency of encouragements (which will affect the proportion of Compliers) or the particular arguments that are used in the encouragement. The same goes for experimenting with different contexts: instead of offering voters a chance to vote in both parties' contested primaries, what about circumstances in which only Republican candidates are vying for the nomination?

Another question of interpretation is what to make of the effect of changing registration. A variety of hypotheses could be adduced: a public declaration of a partisan identity changes the way one regards oneself, sets in motion a search for information to justify one's partisan choice, or causes political campaigns to make increased efforts to mobilize and persuade (p. 737). Each of these subsidiary hypotheses has testable implications, and the authors investigate whether subjects in the treatment and control group evaluate partisan figures differently or have different types of interactions with political campaigns. They find that partisan evaluations do change concomitantly with changes in party identification (p. 735), but there is no apparent relationship between the treatment and contact with campaigns or other manifestations of greater interest in issues or information. Over the course of a few months, change in partisanship seems to have coincided with changes in partisan attitudes but not changes in behaviors such as searching for information or discussing politics with others.

We say "coincided" because one cannot distinguish the causative effects of each of the changes that were set in motion by the letter. The authors note that "receipt of the letter informing the recipients about the need to be affiliated with a party in order to vote in that party's primary increased partisan identity, partisan registration, voter turnout, and partisan evaluations of political figures" (p. 737). With just a single randomly assigned treatment (the letter), one cannot separately identify the effects of each intervening variable. For example, one cannot sepa-rately identify the effects of registration and the effects of actually voting; voting is just one of the many possible by-products of registration. If one wanted to isolate the effect of registration per se, a different design would be needed—perhaps encourage unaffiliated voters to re-register with a party shortly after the primary has passed in order to estimate the effect of (solely) registering with a party? Conversely, one could determine whether voting per se increases partisanship by urging people to vote using non-partisan messages (see [40]). The single-factor encouragement used in this study paves the way for more elaborate encouragement designs that aim to identify distinct sources of partisan change.

5. Discussion

The two studies summarized above provide a template for future research. The Chong et al. [34] study offers an example of how one might fruitfully study causes of partisan change by deploying an array of different kinds of interventions. In that study, information about corruption in municipal government led voters to change their party attachments. The Gerber et al. [39] study deploys a treatment that in itself had no partisan content and functioned solely to facilitate behaviors that are believed to reinforce partisanship. By setting in motion randomly generated direct and downstream effects, these experiments provide a method for studying partisanship that is both informative and methodologically defensible.

This style of intervention-oriented research could be expanded to include information about the parties' policy stances, their financial backers, their level of support among different segments of the electorate, and so forth. A combination of treatments could be designed to test competing theories about how party identities are formed. One kind of treatment might be designed to affect retrospective performance evaluations, while another might be crafted to alter perceptions of the parties' platforms or support among voters with different social identities. What makes this approach distinctive is that scholars intervene to mint partisans through randomly assigned treatments rather than passively observing the partisan changes that occur on their own.

Both experiments illustrate how this approach might be deployed in a field setting (perhaps as a by-product of a broader field experiment), but the basic design applies also to laboratory research [35] and survey research [32,37,38]. One could imagine a lab or on-line study in which subjects are pre-screened for weak partisan attachments and then randomly exposed to theoretically-inspired appeals that are designed to move them closer to a political party. For example, one could imagine a "social identity" video that explains what sorts of people favor the Democratic and Republican parties and a competing "spatial proximity" video that explains the ideological stances of the parties with respect to several leading issues. Indeed, one can even imagine a vacuous "feel-good" video that deploys slogans and attractive imagery while endorsing one of the parties; in this case, the same video could be adapted to support each party. The main practical constraint is the need to expose the control group to something that is vaguely similar (but not party-focused) so that subjects in both groups have similar suspicions about what the study is about when reinterviewed at some later point in time.

More challenging is the task of designing experiments to test the effects of partisan attachments on other attitudes and behavior. For example, partisanship is said to alter issue stances, economic evaluations, and interest in political news. In an ideal design, a randomly assigned intervention would affect party attachments without directly affecting these outcomes. This exclusion restriction obviously rules out the use of economic news as an inducement to identify with the allegedly more competent party. It may also rule out naturally occurring random assignments, such as the Vietnam draft lottery [41], which may affect both partisanship and issue stances directly. Developing effective interventions that seem to satisfy the exclusion restriction may require a fair amount of trial-and-error. Social scientists are relatively unaccustomed to developing interventions that successfully change partisanship; the experiments discussed above are important first steps in that direction.

Acknowledgements

An earlier version of this paper was presented at the CISE-ITANES Conference "Revisiting Party Identification: American and European Perspectives", LUISS Guido Carli, Rome, 7–8 October 2011. The author is grateful to the Institution for Social and Policy Studies at Yale University, which furnished the replication data analyzed in this paper.

References

1. Campbell A, Converse PE, Miller WE, Stokes DE. The American Voter. New York, USA: Wiley; 1960.

2. Campbell A, Converse PE, Miller WE, Stokes DE. Elections and the Political Order. New York, USA: Wiley; 1966.

3. Butler D, Stokes DE. Political Change in Britain. London, UK: Macmillan; 1969.

4. Jennings MK, Niemi RG. Generations and Politics: A Panel Study of Young Adults and Their Parents. Princeton, NJ, USA: Princeton University Press; 1981.

5. Jackson JE. Issues, Party Choices, and Presidential Votes. American Journal of Political Science. 1975;19(2):161–185.

6. Franklin CH, Jackson JE. The Dynamics of Party Identification. American Political Science Review. 1983;77(4):957–973.

7. Fiorina MP. Retrospective Voting in American National Elections. New Haven, CT, USA: Yale University Press; 1981.

8. Brody RA, Rothenberg LS. The Instability of Partisanship: An Analysis of the 1980 Presidential Election. British Journal of Political Science. 1988;18(4):445–465.

9. Page BI, Jones CC. Reciprocal Effects of Policy Preferences, Party Loyalties, and the Vote. American Political Science Review. 1979;73(4):1071–1089.

10. Budge I, Crewe I, Farlie D, editors. Party Identification and Beyond: Representations of Voting and Party Competition. New York, USA: John Wiley and Sons; 1976.

11. Achen CH. Mass Political Attitudes and the Survey Response. American Political Science Review. 1975;69(4):1218–1231.

12. Green DP, Palmquist B. How Stable Is Party Identification? Political Behavior. 1994;16(4):437–466.

13. Green DP, Palmquist B. Of Artifacts and Partisan Instability. American Journal of Political Science. 1990;34(3):872–902.

14. Schickler E, Green DP. The Stability of Party Identification in Western Democracies: Results from Eight Panel Surveys. Comparative Political Studies. 1997;30(4):450–483.

15. Bartle J, Bellucci P, editors. Political Parties and Partisanship: Social Identity and Individual Atti-tudes. New York, USA: Routledge; 2009.

16. Clarke HD, McCutcheon AL. The Dynamics of Party Identification Reconsidered. Public Opinion Quarterly. 2009;73(4):704–728.

17. Neundorf A, Stegmueller D, Scotto TJ. The Individual-Level Dynamics of Bounded Partisanship. Public Opinion Quarterly. 2011;75(3):458–482.

18. MacKuen MB, Erikson RS, Stimson JA. Macropartisanship. American Political Science Review. 1989;83(4):1125–1142.

19. Green DP, Palmquist B, Schickler E. Macropartisanship: A Replication and Critique. American Political Science Review. 1998;92(4):883–899.

20. Green DP, Schickler E. A Spirited Defense of Party Identification Against Its Critics. In: Bartle J, Bellucci P, editors. Political Parties and Partisanship: Social Identity and Individual Attitudes. New York, USA: Routledge; 2009, pp. 180–199.

21. Abramson PR, Ostrom CW Jr. Macropartisanship: An Empirical Reassessment. American Political Science Review. 1991;85(1):181–192.

22. Green DP, Palmquist B, Schickler E. Partisan Hearts and Minds: Political Parties and the Social Identities of Voters. New Haven, CT, USA: Yale University Press; 2002.

23. Erikson RS, MacKuen MB, Stimson JA. What Moves Macropartisanship? A Response to Green, Palmquist, and Schickler. American Political Science Review. 1998;92(4):901–912.

24. Achen CH. Parental Socialization and Rational Party Identification. Political Behavior. 2002;24(2):141–170.

25. Lenz GS. Follow the Leader: How Voters Respond to Politicians' Policies and Performance. Chicago, USA: University of Chicago Press; 2012.

26. Box-Steffensmeier JM, Smith RM. The Dynamics of Aggregate Partisanship. American Political Science Review. 1996;90(3):567–580.

27. Sekhon JS. Opiates for the Matches: Matching Methods for Causal Inference. Annual Review of Political Science. 2009;12(1):487–508.

28. Neyman J. On the Application of Probability Theory to Agricultural Experiments. Essay on Principles. Section 9. Statistical Science. 1990;5(4):465–480.

29. Rubin DB. Formal Mode of Statistical Inference for Causal Effects. Journal of Statistical Planning and Inference. 1990;25(3):279–292.

30. Gerber AS, Green DP. Field Experiments: Design, Analysis, and Interpretation. New York, USA: W.W. Norton; 2012.

31. Angrist JD, Imbens GW, Rubin DB. Identification of Causal Effects Using Instrumental Variables. Journal of the American Statistical Association. 1996;91(434):444–455.

32. Middleton J. On the Foundations of Economic Voting: An Experimental Study [PhD Dissertation]. New Haven, CT, USA: Department of Political Science, Yale University; 2011.

33. Green DP, Gerber AS. The Downstream Benefits of Experimentation. Political Analysis. 2002;10(4):394–402.

34. Chong A, de la O AL, Karlan D, Wantchekon L. Looking Beyond the Incumbent: The Effects of Exposing Corruption on Electoral Outcomes. NBER Working Paper No. 17679, December 2011.

35. Cowden JA, McDermott RM. Short-Term Forces and Partisanship. Political Behavior. 2000;22(3):197–222.

36. Johnston R. Party Identification Measures in the Anglo-American Democracies: A National Survey Experiment. American Journal of Political Science. 1992;36(2):542–559.

37. Lupu N. Party Brands and Partisanship: Theory with Evidence from a Survey Experiment in Argentina. American Journal of Political Science. 2013;57(1):49–64.

38. Brader TA, Tucker JA. Reflective and Unreflective Partisans? Experimental Evidence on the Links between Information, Opinion, and Party Identification. Proceedings of the Stanford Workshop on Comparative Politics, Stanford University, Stanford, CA, USA, 19 May 2008.

39. Gerber AS, Huber GA, Washington E. Party Affiliation, Partisanship, and Political Beliefs: A Field Experiment. American Political Science Review. 2010;104(4):720–744.

40. Gerber AS, Green DP. The Effects of Canvassing, Direct Mail, and Telephone Contact on Voter Turnout: A Field Experiment. American Political Science Review. 2000;94(3):653–663.

41. Erikson RS, Stoker L. Caught in the Draft: The Effects of Vietnam Lottery Status on Political Attitudes. American Political Science Review. 2011;105(2):221–237.