Institution for Social and Policy Studies

Testing the Accuracy of Regression Discontinuity Analysis Using Experimental Benchmarks

ISPS Data Archive: Terms of Use

By using, contributing, and/or downloading files associated with scholarly studies available on the ISPS Data Archive, you agree to these terms and conditions.

Please read the ISPS Data Archive Terms of Use.

Author(s): 

Green, Donald P., Terence Y. Leong, Holger L. Kern, Alan S. Gerber, and Christopher W. Larimer

ISPS ID: 
D016
Data type: 
Administrative
Data source(s): 

“Qualified Voter File” (QVF), the official state voter list (as constructed by Mark Grebner at Practical Political Consulting).

Field date: 
August 1, 2006
Location: 
Location details: 
Michigan
Unit of observation: 
Households
Sample size: 
180,002 households; 344,084 individuals/registered voters
Inclusion/exclusion: 
Households in the state of Michigan: The targeting criteria used in this mailing campaign were developed by a political consultant, Mark Grebner, who uses targeting criteria based on a combination of address information readily available from the Qualified Voter File (QVF) and a set of proprietary indices of partisanship and voting behavior developed by his consulting firm. Grebner’s targeting objective was to direct mailings to those who were thought to be especially responsive to them. Mailings were therefore sent to voters whose expected probability of voting was deemed to be moderate. Those believed to be strong Democrats were excluded on the grounds that they had little chance of voting in an election that was meaningful mainly to Republicans. Absentee voters were excluded because they were believed to vote early, before the receipt of these mailings. Sparsely populated streets were excluded because the Neighbors treatment requires the voting histories of several neighbors. Apartment addresses were excluded because apartment numbers are sometimes unreliable, and it is hard to be certain which voters belong to the same household...
We removed everyone for whom we could not assign a valid 9-digit ZIP, people who live on blocks where more than 10% of the addresses included apartment numbers, and people who live on streets with fewer than four addresses (or fewer than 10 voters). Prior to random assignment we also removed households with either of the following characteristics: all members of the household had over a 60% probability of voting by absentee ballot if they voted, or all household members had a greater than 60% probability of choosing the Democratic primary rather than the Republican primary. Absentees were removed because it was thought that many would have decided to vote or not prior to receipt of the experimental mailings, which were sent to arrive just a few days before the election. Those considered overwhelmingly likely to favor the Democratic primary were excluded because it was thought that, given the lack of contested primaries, these citizens would tend to ignore pre-election mailings. We removed everyone who lived on a route where fewer than 25 households remained, because the production process depended on using carrier-route-presort standard mail. Finally, we removed all those who had abstained in the 2004 general election on the grounds that those not voting in this very high turnout election were likely to be “deadwood”: those who had moved, died, or registered under more than one name.
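The household-level exclusion rule described above (drop a household when all of its members exceed the 60% absentee threshold, or all exceed the 60% Democratic-primary threshold) can be illustrated with a minimal Python sketch. The `Voter` class, its field names, and `keep_household` are hypothetical names chosen for illustration; they do not come from the archived files, and the proprietary indices themselves are not reproduced here.

```python
from dataclasses import dataclass
from typing import List

THRESHOLD = 0.60  # the 60% cutoff stated in the inclusion/exclusion notes


@dataclass
class Voter:
    # Hypothetical field names; the real indices are proprietary.
    p_absentee: float     # probability of voting absentee, if voting
    p_dem_primary: float  # probability of choosing the Democratic primary


def keep_household(members: List[Voter]) -> bool:
    """Return True if the household survives the two household-level rules."""
    # Rule 1: every member is above the absentee threshold.
    all_absentee = all(v.p_absentee > THRESHOLD for v in members)
    # Rule 2: every member is above the Democratic-primary threshold.
    all_dem = all(v.p_dem_primary > THRESHOLD for v in members)
    return not (all_absentee or all_dem)


if __name__ == "__main__":
    mixed = [Voter(0.9, 0.1), Voter(0.2, 0.1)]
    print(keep_household(mixed))  # one member is below both cutoffs
```

Note that the rule applies only when all members of a household exceed a threshold, so a household with one likely absentee voter and one unlikely absentee voter is retained in this sketch.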
Randomization procedure: 
Households were randomly assigned to either the control group or one of four treatment groups described next. Each treatment group consisted of approximately 20,000 households, with 99,999 households in the control group. The 180,002 households were sorted into exactly the order required by the USPS for “ECRLOT” eligibility (approximately: by ZIP, then carrier route, then the order in which the carrier walks the route). The households were then divided into 10,000 cells of 18 households each, with each cell consisting of households 1–18, 19–36, and so forth, of the sorted file. As a result, each cell consisted entirely of either one or two carrier routes. A random number was generated for each record, and all 180,002 records were sorted by cell number and then by the random number. The effect was to keep the records of each cell together, but in a random order within the cell. Using this randomly sorted copy of the file, the records in each cell were assigned to treatments in the pattern 1/1/2/2/3/3/4/4/c/c/c/c/c/c/c/c/c/c, where “c” indicates the control group. The records were then re-sorted into carrier route order.
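The blocked assignment described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used in the study: the function name `assign_treatments`, the `seed` parameter, and the use of Python's `random` module are all choices made here. The source does not say how households left over after the last full cell of 18 were handled; this sketch simply lets them take the first positions of the pattern.

```python
import random

# Assignment pattern for one cell of 18 households: two households in
# each of the four treatments and ten in the control group ("c").
PATTERN = ["1", "1", "2", "2", "3", "3", "4", "4"] + ["c"] * 10
CELL_SIZE = len(PATTERN)  # 18


def assign_treatments(households, seed=0):
    """Assign arms to households already sorted in carrier-route order.

    Returns a list of (household, arm) pairs in the original order.
    """
    rng = random.Random(seed)  # seed is an assumption, for reproducibility
    # Cell number comes from the position in the route-sorted file;
    # each record also gets a random number.
    records = [
        (index // CELL_SIZE, rng.random(), index, household)
        for index, household in enumerate(households)
    ]
    # Sorting by (cell, random number) keeps each cell together but
    # puts its records in random order.
    records.sort()
    arms = {}
    for position, (_cell, _draw, index, _household) in enumerate(records):
        arms[index] = PATTERN[position % CELL_SIZE]
    # Re-sort into the original (carrier-route) order.
    return [(household, arms[i]) for i, household in enumerate(households)]


if __name__ == "__main__":
    for household, arm in assign_treatments(list(range(36)), seed=1)[:18]:
        print(household, arm)
```

Because the pattern is applied within each cell, every full cell contributes exactly two households to each treatment and ten to control, which keeps the arms balanced across carrier routes.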
Treatment: 
Mailing. Households assigned to treatment groups were sent one mailing 11 days prior to the primary election. There were four treatment conditions (Civic Duty, Hawthorne, Self, and Neighbors) and a control group.
Treatment administration: 
Mail
Outcome measures: 
Turnout rates in 2001 local elections (at the household level)
Archive date: 
2009
Archive contributor: 
Limor Peer
Owner: 
Green, Donald P., Terence Y. Leong, Holger L. Kern, Alan S. Gerber, and Christopher W. Larimer
Owner contact: 

isps(at)yale(dot)edu

Terms of use: 

Academic, non-commercial

Discipline: 
Area of study: 
Data file number   Description           File format                  Size
D016F00            ReadMe file           .txt                         921
D016F01            Dataset               Excel .csv                   1572864
D016F02            Dataset               R (2.9.1) .Rdata             26528972
D016F03            Program file (main)   R (2.9.1) .R                 16076
D016F04            Program file          R (2.9.1) .R                 819
D016F05            Program file          R (2.9.1) .R                 2662
D016F06            Program file          R (2.9.1) .R                 717
D016F07            Program file          R (2.9.1) .R                 2048
D016F08            Program file          R (2.9.1) .R                 2355
D016F09            Output file           R (2.9.1) .eps               6963
D016F10            Output file           R (2.9.1) .eps               11980
D016F11            Output file           R (2.9.1) .eps               25600
D016F12            Codebook              XML (1.1) .xml               9216
D016F13            Metadata record       Adobe Acrobat (8.1) .pdf     199680