The table above gives the results of a poll published by The Literary Digest magazine on 31 October 1936 (Halloween, appropriately enough), shortly before the 1936 presidential election. The candidates were Franklin Delano Roosevelt (the incumbent president, a Democrat) and Alfred (Alf) Mossman Landon (the Republican challenger, then governor of Kansas). Approximately 10,000,000 questionnaires (in the form of postcards) were mailed to prospective voters, making the Literary Digest poll one of the largest ever conducted. Approximately 2,300,000 were returned. The prospective voters were chosen from the magazine's subscription list, from automobile registration lists, from phone lists, and from club membership lists.
In the data table, the results are given by state. (There were 48 states in 1936.) The variable EV refers to the number of electoral votes of the state, FDR is the Roosevelt total for the state in the poll, and AML is the Landon total for the state in the poll.
Based on the poll, The Literary Digest predicted that Landon would win the 1936 presidential election with 57.1% of the popular vote and an electoral college margin of 370 to 161. In fact, Roosevelt won the election with 60.8% of the popular vote (27,751,841 to 16,679,491) and an electoral college landslide of 523 to 8 (one of the most lopsided results in any presidential election). Roosevelt won 46 of 48 states, losing only Maine and Vermont.
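The arithmetic behind such a prediction can be sketched from the three variables in the table. The Python below is a minimal illustration, not the magazine's documented method: it assumes each state's electoral votes go entirely to that state's poll leader, and it computes Landon's share of the combined FDR and AML totals. The function name `predict` and the toy rows are hypothetical and are not the actual poll figures.

```python
def predict(rows):
    """Summarize a poll given one (ev, fdr, aml) tuple per state."""
    fdr_total = aml_total = 0
    fdr_ev = aml_ev = 0
    for ev, fdr, aml in rows:
        fdr_total += fdr
        aml_total += aml
        # Assumption: winner-take-all, so the state's poll leader
        # receives all of its electoral votes. Ties are left unassigned.
        if fdr > aml:
            fdr_ev += ev
        elif aml > fdr:
            aml_ev += ev
    return {
        # Landon's share of the two-candidate poll total
        "landon_share": aml_total / (fdr_total + aml_total),
        "fdr_ev": fdr_ev,
        "aml_ev": aml_ev,
    }


# Toy rows for illustration only (NOT the real poll data).
toy = [(10, 400, 600), (5, 700, 300), (3, 100, 200)]
result = predict(toy)
print(result)
```

Applied to the full 48-state table, the same function would give the poll's implied popular share and electoral split under these assumptions.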
The Literary Digest, using similar techniques, had correctly predicted the outcome of the previous four presidential elections. But in this case, the magazine was not just wrong, it was spectacularly wrong. In part because of the subsequent loss of prestige and credibility, the magazine folded just two years later.
What went wrong? Clearly the sample was skewed towards wealthier voters: those who could afford magazine subscriptions, cars, phones, and club memberships in the depths of the Great Depression. This sort of bias would not matter if wealthier voters behaved in a similar manner to voters as a whole (as was basically the case in the previous four elections). But in 1936, at a time of great tension between economic classes, this was definitely not the case.
Another problem, harder to assess, is self-selection bias. Only about 23% of the questionnaires (roughly 2,300,000 of 10,000,000) were returned. Were the voters who chose to return them different, in terms of how they planned to vote, from the voters who did not respond?
The links below give the data set in tab-separated text format and comma-separated text format. These are standard formats that can be imported into most statistical and spreadsheet software.
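Either format can be read with Python's standard `csv` module. The sketch below assumes the file has a header row containing the column names EV, FDR, and AML described above; the State column in the sample text, the function name `read_poll`, and the sample values are assumptions made for illustration.

```python
import csv
import io


def read_poll(stream, delimiter="\t"):
    """Parse poll rows from an open text stream into a list of dicts."""
    reader = csv.DictReader(stream, delimiter=delimiter)
    rows = []
    for rec in reader:
        # Convert the three numeric columns; other columns are ignored.
        rows.append({
            "EV": int(rec["EV"]),
            "FDR": int(rec["FDR"]),
            "AML": int(rec["AML"]),
        })
    return rows


# Made-up sample in tab-separated form (NOT the real poll data).
sample = "State\tEV\tFDR\tAML\nX\t10\t400\t600\nY\t5\t700\t300\n"
rows = read_poll(io.StringIO(sample))
print(rows)
```

For a downloaded file, pass an open file object instead of the `io.StringIO` wrapper, and use `delimiter=","` for the comma-separated version.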