We are awash in a sea of data. Thanks to web analytics tools, CRM systems, and social media, we have more data than ever about the behavior of customers and prospects. What is often lacking are the knowledge and skills necessary to turn this data into useful information.
Both are on display in a brilliant study conducted by Seth Stephens-Davidowitz of Harvard. As reported in the Wall Street Journal, using freely available data from Google Insights, skillful research, and clever thinking, he was able to determine that in the 2008 presidential election, racial attitudes reduced President Obama's share of the vote by 3 to 5 percentage points. His method of reaching this conclusion, which we’ll review here, illustrates techniques that marketers can use to gain insight into topics such as product demand forecasting, buying attitudes, geographical preferences, and buyer demographics.
Stephens-Davidowitz performed the study because surveys are notoriously unreliable at capturing the true racial attitudes of voters. Survey participants are highly likely to misreport their true attitudes out of embarrassment. Google-based measures of racial bias are more likely to reflect voters’ attitudes accurately, since people typically run Google searches online while alone. In addition, Google search data is available at finer geographic levels, is more recent, and aggregates information from larger samples than typical surveys.
The method used in the study was as follows:
- Choose a search term that represents the underlying attitude. In this case, Stephens-Davidowitz used a certain well-known racial epithet beginning with “n” as the representative search term.
- He had to make sure that the term was a strong proxy for racial bias. He did this in three ways:
  - Examining some of the output from Google Insights, which includes the top related search terms that contain the word. From the list of related terms, it was clear that the search was motivated by racial bias.
  - Verifying that Google search volumes correlate well with the demographics one would expect to search for a given term. For example, the percent of a state’s residents who say they believe in God explains 65% of the variation in search volume for the word “God”. The table below gives further examples:
  - Noting that the major potential bias in racial attitude survey data, misreporting due to embarrassment, is unlikely to significantly bias Google data. As mentioned previously, the conditions under which people search (online and likely alone) limit this concern. The following table documents substantial search volume for various terms that researchers suspect may be under-reported in surveys.
- He then used Google Insights to determine the geographic variation in the use of this term in searches. Quite a wide variation was found by media market:
- Stephens-Davidowitz next estimated how this bias translated into votes. For a first estimate, he used linear regression to compare voting results by media market in the Obama-McCain election with results in the Kerry-Bush election.
- To verify that his measure of racial bias was a strong predictor of the difference in voting patterns between the two elections, he then added to his analysis additional variables that are known to affect voting outcomes.
Stephens-Davidowitz concludes:
Estimating the effect of racial animus on voting is complicated by surveyed individuals’ propensity to misreport socially unacceptable attitudes. This paper sidesteps surveys using area-level Google search data and administrative voting records. I find that racial animus played a major role in the 2008 election. Relative to the attitudes of the most tolerant area, racial animus cost Obama 3 to 5 percentage points of national popular vote.
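
The regression steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the specification used in the study: the data is synthetic, the single control variable is hypothetical, and the function names (`fit_line`, `residuals`, `slope_with_control`) are invented for this example. It fits a first estimate with simple linear regression, then re-estimates the slope after holding one control variable fixed, using the standard partial-regression (Frisch-Waugh-Lovell) approach.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def residuals(x, y):
    """Part of y that a straight-line fit on x does not explain."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def slope_with_control(x, y, control):
    """Coefficient on x in y = a + b*x + c*control (partial regression)."""
    rx = residuals(control, x)  # x with the control's influence removed
    ry = residuals(control, y)  # y with the control's influence removed
    return fit_line(rx, ry)[1]

# Synthetic data for six hypothetical media markets (NOT from the study).
search_index = [20.0, 35.0, 50.0, 65.0, 80.0, 100.0]  # relative search volume
control = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]              # hypothetical control variable
# Change in vote share between two elections, in percentage points,
# constructed so the true coefficient on search_index is -0.05.
vote_change = [1.0 - 0.05 * s + 0.2 * c for s, c in zip(search_index, control)]

first_estimate = fit_line(search_index, vote_change)[1]
controlled = slope_with_control(search_index, vote_change, control)
print(f"first estimate: {first_estimate:.4f}")
print(f"with control:   {controlled:.4f}")  # about -0.05 by construction
```

Because the synthetic control is correlated with the search index, the first estimate differs from the constructed coefficient, while the controlled estimate recovers it. That gap is the reason for adding variables known to affect the outcome before trusting the slope.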
More details are offered in the full study, “The Effects of Racial Animus on Voting: Evidence Using Google Search Data.” The method described here can be used in a host of marketing applications, including forecasting product demand, buying attitudes, geographical preferences, and buyer demographics.
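
Before relying on a search term in a marketing setting, the proxy check from the study is worth repeating: see how much of the variation in regional search volume is explained by a measure you already trust. The sketch below computes that share (R-squared) for a simple linear fit. The function name `r_squared` and all of the numbers are assumptions made up for illustration.

```python
from statistics import mean

def r_squared(x, y):
    """Share of the variation in y explained by a straight-line fit on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    # For a one-variable regression, R-squared equals the squared correlation.
    return (sxy * sxy) / (sxx * syy)

# Hypothetical figures for four regions (made up for this example).
known_measure = [10.0, 20.0, 30.0, 40.0]  # e.g. a trusted regional statistic
search_index = [25.0, 75.0, 50.0, 100.0]  # relative search volume for the term

print(f"R-squared: {r_squared(known_measure, search_index):.2f}")  # 0.64
```

A value near 1 suggests the search term tracks the known measure closely; a low value is a signal to look for a better term before building forecasts on it.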