On the Accuracy of Nielsen Homescan Data
by Liran Einav, Ephraim Leibtag, and Aviv Nevo
Economic Research Report No. (ERR-69) 34 pp, December 2008
Researchers use Nielsen Homescan data, which provide detailed food-purchase information from a panel of U.S. households, to study the dynamics of retail food markets.
What Is the Issue?
Some questions have been raised regarding the credibility of the Nielsen Homescan data because the data are self-recorded and the recording process is time-consuming. Given the time commitment, households who agree to participate in the sample might not be representative of the U.S. population as a whole, and those who agree to participate may not record their purchases accurately.
What Did the Study Find?
The analysis conducted in this report suggests that the Homescan data contain recording errors in several dimensions, but that the overall accuracy of self-reported data by Homescan panelists seems to be in line with other commonly used (government-collected) economic data sets.
For approximately 20 percent of food-shopping trips recorded in the Nielsen Homescan data, there was no corresponding transaction in the retailer's data, suggesting that either the store or date information was recorded with error. Using the retailer's loyalty card information, the study finds some shopping trips that did not match up with Nielsen Homescan data, implying that households did not record all of their trips in their Homescan records.
For the trips that did match up, roughly 20 percent of the items purchased were not recorded. For those items that were recorded, quantity was reported fairly accurately: 94 percent of the quantity information matched in the two data sets. The match for prices was lower: in almost half of the cases, the two data sets did not agree. However, much of this difference can be attributed to transactions that involved promotional or other temporary sale prices in either the Nielsen Homescan data or the retailer's data.
Nielsen's practice of using store-level data as an estimate of what households actually paid poses a challenge when those stores have multiple possible prices in a given time period due to loyalty card or other shopper-specific price promotions. Indeed, for prices that involve no promotion or temporary price reduction, there are recording errors in only about 17 percent of the cases. Therefore, much of the overall price difference is likely caused by the way Nielsen imputes prices and not by recording errors by the panelists. Mismatched prices would most likely be less of a problem for stores that have only one price per product in a given week, so the results highlight the importance of store pricing practices in food price analysis.
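The reasoning above rests on comparing the price mismatch rate for promoted items with the rate for regularly priced items. A minimal sketch of that split follows. The function name, the tuple layout, the half-cent tolerance, and the toy prices are all assumptions made for illustration; none of them come from the report or from Nielsen's data.

```python
def mismatch_rates(pairs, tolerance=0.005):
    """Share of price mismatches, split by promotion status.

    pairs: iterable of (homescan_price, retailer_price, on_promotion).
    The half-cent tolerance is an assumed cutoff for "prices agree".
    """
    counts = {True: [0, 0], False: [0, 0]}  # promo flag -> [mismatches, total]
    for hs_price, rt_price, on_promotion in pairs:
        counts[bool(on_promotion)][1] += 1
        if abs(hs_price - rt_price) >= tolerance:
            counts[bool(on_promotion)][0] += 1
    return {
        ("promotion" if flag else "regular"): (bad / total if total else None)
        for flag, (bad, total) in counts.items()
    }


# Toy data, invented purely to exercise the function.
toy_pairs = [
    (2.99, 2.49, True),
    (1.50, 1.50, False),
    (3.00, 3.00, False),
    (4.00, 3.50, True),
    (2.00, 2.00, True),
    (1.25, 1.20, False),
]
rates = mismatch_rates(toy_pairs)
```

If the mismatch rate for regularly priced items is much lower than the rate for promoted items, that pattern is consistent with price imputation, not panelist error, driving most of the disagreement.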
The study also compares the recording errors to errors in other commonly used economic data sets and finds that errors in Homescan are of the same order of magnitude as, for example, reporting errors in earnings and employment status.
How Was the Study Conducted?
Homescan records contain all products purchased by a household on a particular day in a particular store, as they were scanned by the consumer. The study compared these records to data obtained from a single retailer. The retailer's data contain the products purchased in each of the transactions at the same store and day reported by the household, as recorded by the cashier. Using data from trips made during 2004, the records from both data sets were matched. The matched transactions were compared and contrasted, and differences in various dimensions were recorded. In order to study the impact the recording errors might make in an applied study, the price paid was regressed on household characteristics in both data sets to see if the results differed.
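The matching step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure the authors used: the record layout (`Purchase`), the trip key of household, store, and date, the half-cent price tolerance, and the toy records are all hypothetical.

```python
from collections import namedtuple

# Hypothetical record layout; field names are assumptions for illustration.
Purchase = namedtuple("Purchase", "household store date upc quantity price")


def index_by_trip(purchases):
    """Group purchases by trip key (household, store, date), then by product."""
    trips = {}
    for p in purchases:
        trips.setdefault((p.household, p.store, p.date), {})[p.upc] = p
    return trips


def share(numerator, denominator):
    """Return a ratio, or None when there is nothing to compare."""
    return numerator / denominator if denominator else None


def compare(homescan, retailer, tolerance=0.005):
    hs_trips, rt_trips = index_by_trip(homescan), index_by_trip(retailer)
    matched = hs_trips.keys() & rt_trips.keys()
    retailer_items = compared = qty_ok = price_ok = 0
    for key in matched:
        for upc, rt in rt_trips[key].items():
            retailer_items += 1
            hs = hs_trips[key].get(upc)
            if hs is None:
                continue  # item bought but not recorded by the household
            compared += 1
            qty_ok += hs.quantity == rt.quantity
            price_ok += abs(hs.price - rt.price) < tolerance
    return {
        "homescan_trips_unmatched_share": share(len(hs_trips) - len(matched), len(hs_trips)),
        "retailer_trips_unrecorded": len(rt_trips) - len(matched),
        "items_unrecorded_share": share(retailer_items - compared, retailer_items),
        "quantity_match_share": share(qty_ok, compared),
        "price_match_share": share(price_ok, compared),
    }


# Toy records, invented for the example.
homescan = [
    Purchase(1, "A", "2004-03-01", "milk", 1, 2.99),
    Purchase(1, "A", "2004-03-01", "bread", 2, 1.50),
    Purchase(2, "A", "2004-03-02", "eggs", 1, 2.00),  # date differs from retailer
]
retailer = [
    Purchase(1, "A", "2004-03-01", "milk", 1, 2.49),
    Purchase(1, "A", "2004-03-01", "bread", 2, 1.50),
    Purchase(1, "A", "2004-03-01", "soda", 1, 0.99),  # never recorded at home
    Purchase(2, "A", "2004-03-03", "eggs", 1, 2.00),
]
result = compare(homescan, retailer)
```

The sketch keys trips on exact store and date, so a misrecorded date shows up as an unmatched trip in both data sets, which mirrors how trip-level and item-level discrepancies are distinguished in the summary above.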