Saturday, March 15, 2014

Google Flu Trends as a lesson in big data prediction

A recent article in the science section of TIME magazine reports that prediction using "big data" techniques is not as easy as it is often portrayed. It analyzes the Google Flu Trends case, which rests on the assumption that there is a strong correlation between the spread of flu and searches for flu-related terms in Google. It seems that this does not produce accurate results. The article claims that while big data methods are useful, they should be combined with traditional "small data" methods. There are various definitions of what small data is, for example, the one from the "small data group": "Small data connects people with timely, meaningful insights (derived from big data and/or “local” sources), organized and packaged – often visually – to be accessible, understandable, and actionable for everyday tasks."

I guess that this also relates to the need to understand causality in addition to statistical correlation, a topic I've discussed before on this blog.
