IBM researchers use grocery scanner data to speed investigations during early foodborne illness outbreaks
This study shows that big data and analytics can help identify potential sources of contamination.
Scientists at IBM Research – Almaden, San Jose, Calif., discovered that analyzing retail-scanner data from grocery stores against maps of confirmed cases of foodborne illness can speed early investigations. In the study, “From Farm to Fork: How Spatial-Temporal Data can Accelerate Foodborne Illness Investigation in a Global Food Supply Chain,” researchers demonstrated that with as few as 10 medical-examination reports of foodborne illness, they can narrow the investigation to 12 suspected food products in just a few hours.
Researchers created a data-analytics methodology to review spatio-temporal data, including geographic location and possible time of consumption, for hundreds of grocery product categories. They also analyzed each product for shelf life, geographic location of consumption and likelihood of harboring a particular pathogen, then mapped the information to the known locations of illness outbreaks. Next, the system ranked all grocery products by likelihood of contamination, producing a list from which public health officials could test the top 12 suspected foods and alert the public accordingly.
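The ranking step can be illustrated with a small sketch. This is not the method from the IBM study, whose details are not given here; it is a simplified stand-in that assumes each product's sales are known per region and scores a product by how well its regional sales share explains where the confirmed cases occurred (a multinomial log-likelihood). The function name `rank_products`, the `smoothing` parameter, the region names and the sales figures are all hypothetical, and the sketch ignores shelf life and pathogen likelihood.

```python
import math


def rank_products(sales_by_product, cases_by_region, smoothing=1e-6):
    """Rank products from most to least consistent with the case locations.

    sales_by_product: {product: {region: units_sold}} (illustrative scanner data)
    cases_by_region:  {region: number_of_confirmed_cases}
    smoothing:        small constant so a region with zero sales does not
                      produce log(0); an assumption of this sketch
    """
    scores = {}
    for product, sales in sales_by_product.items():
        total = sum(sales.values())
        if total <= 0:
            continue  # no sales anywhere, so the product cannot be scored
        log_likelihood = 0.0
        for region, n_cases in cases_by_region.items():
            share = sales.get(region, 0.0) / total
            log_likelihood += n_cases * math.log(share + smoothing)
        scores[product] = log_likelihood
    # Highest log-likelihood first: sales pattern best matches case pattern
    return sorted(scores, key=scores.get, reverse=True)


# Made-up example: most cases are in the north, where sausage sells most.
sales = {
    "sausage": {"north": 90, "south": 10},
    "lettuce": {"north": 50, "south": 50},
    "milk": {"north": 10, "south": 90},
}
cases = {"north": 9, "south": 1}
ranking = rank_products(sales, cases)
print(ranking[:2])  # short list of the top two suspects
```

In practice the short list would be cut at a fixed size, such as the 12 products mentioned above, and passed to the lab for testing.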
A traditional investigation can take weeks to months, and the timing can significantly influence the economic and health impact of a disease outbreak.
"When there's an outbreak of foodborne illness, the biggest challenge facing public health officials is the speed at which they can identify the contaminated food source and alert the public," says Kun Hu, public health research scientist, IBM Research – Almaden. "While traditional methods like interviews and surveys are still necessary, analyzing big data from retail grocery scanners can significantly narrow down the list of contaminants in hours for further lab testing. Our study shows that big data and analytics can profoundly reduce investigation time and human error and have a huge impact on public health."
The method in this study has already been applied to an actual E. coli outbreak in Norway. With just 17 confirmed cases of infection, public health officials were able to use this methodology to analyze grocery-scanner data related to more than 2,600 possible food products and create a short list of 10 possible contaminants. Further lab analysis traced the source of contamination to the batch and lot numbers of a specific product: sausage.