I came across an interesting article by Tim Harford in FT Magazine. The article is in line with several posts I have made on this blog, which express some skepticism about the ability to create "big insights" merely by looking at statistical correlations in past data. Harford gives some examples and concludes that there are some naive beliefs around the big data hype. I'll keep writing more insights about this topic.
This is a blog describing some thoughts about issues related to event processing and thoughts related to my current role. It is written by Opher Etzion and reflects the author's own opinions.
Showing posts with label correlation vs. causality. Show all posts
Thursday, April 3, 2014
Saturday, March 15, 2014
Google Flu Trends as a lesson in big data prediction
A recent article in the science section of TIME magazine reports that prediction using "big data" techniques is not as easy as portrayed. It analyzes the Google Flu Trends case, in which the assumption has been that there is a strong correlation between the spread of flu and the searches for flu-related terms in Google. It seems that this does not produce accurate results. The article claims that while big data methods are useful, they should be combined with traditional "small data" methods. There are various definitions of what small data is - for example, the one from the "small data group": Small data connects people with timely, meaningful insights (derived from big data and/or “local” sources), organized and packaged – often visually – to be accessible, understandable, and actionable for everyday tasks.
I guess that this also relates to the need to understand causality in addition to statistical correlation, which I've discussed before on this blog.
Thursday, May 9, 2013
Causality vs. correlation - statistical reasoning is not enough - NY Times Interview with Dave Ferrucci
Dave Ferrucci, who was until several months ago an IBM Fellow and was known as the father of Watson, was interviewed by the NY Times at his new workplace, Bridgewater Associates.
In the interview Ferrucci somewhat continues the line of thought of Noam Chomsky, saying that AI has concentrated on statistical reasoning based on correlations, but the drawback is that one cannot understand why the prediction made by the statistical reasoning is correct. While Chomsky bluntly stated that statistical reasoning does not create a solid model of the universe, Ferrucci claims that a complementary approach is required - understanding causality. This is a rather old issue. In symbolic logic there is a distinction between material implication and entailment. "Material implication" states that IF A is true then B is true, meaning only that whenever A is true, B is also true, which makes a sentence like "If the week has seven days then the capital city of France is Paris" a valid statement in logic. Entailment, on the other hand, says that "A ENTAILS B" only if the connection is necessary and relevant; in other words, there is causality between them. Thus, Ferrucci now concentrates on building causality models to model the world economy. I concur with the assertion that understanding causality gives better abilities of reasoning and prediction. As David Luckham already noted, causality among events is one of the major abstractions of event processing models. Here is a rather old discussion about causality of events.
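The point about material implication can be checked mechanically. The following is only a small illustrative sketch in Python, not something taken from the interview; the names implies, truth_table and unrelated_but_valid are made up for this example. It shows that material implication looks only at truth values, so two unrelated true statements yield a true implication.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: false only when a is true and b is false."""
    return (not a) or b


# Full truth table for "IF A THEN B" under material implication.
truth_table = {(a, b): implies(a, b) for a, b in product([True, False], repeat=2)}

# Two true statements with no causal or relevance link between them.
week_has_seven_days = True
capital_of_france_is_paris = True

# The implication holds anyway, because only truth values are inspected.
unrelated_but_valid = implies(week_has_seven_days, capital_of_france_is_paris)

for (a, b), value in truth_table.items():
    print(f"A={a!s:5} B={b!s:5} A->B={value}")
print("week/Paris implication holds:", unrelated_but_valid)
```

Nothing in this function can express relevance or causality. Capturing entailment in the sense described above needs an explicit model of how A brings about B, which is exactly what a causality model adds on top of truth values or correlations.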


