Thursday, November 24, 2011
More on big data and event processing
Philip Howard, one of the analysts who has followed the event processing area for many years, recently wrote about "CEP and big data", emphasizing the synergy of applying data mining techniques to big data as a basis for real-time scoring based on predictive models created by those techniques. His inspiration for writing this piece was reviewing the Red Lambda product. It is certainly true that creating event processing patterns off-line using mining techniques, and then tracking these event patterns on-line using event processing, is a valid combination, although the transfer from the data mining part to the event processing part typically requires some more work (in most cases it also involves some manual work). In general, getting a model built in one technology to be used by another technology is not smooth, and requires more work.
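To make the off-line/on-line split concrete, here is a minimal sketch in Python. It is a toy illustration, not how Red Lambda or any CEP product works: the function names `mine_pattern` and `track_pattern`, the threshold rule (mean plus k standard deviations), and the window parameters are all assumptions made up for this example. The dictionary returned by the off-line step stands in for the hand-off between the two technologies, which in practice is where the extra (often manual) work goes.

```python
from collections import deque
from statistics import mean, stdev


def mine_pattern(history, k=3.0):
    """Off-line step (hypothetical): derive a simple pattern from stored data.

    The "mined model" here is just a threshold of mean + k * stdev,
    plus fixed window settings chosen for illustration.
    """
    return {
        "threshold": mean(history) + k * stdev(history),
        "window": 3,      # look at the last 3 events
        "min_count": 2,   # pattern matches if at least 2 exceed the threshold
    }


def track_pattern(events, pattern):
    """On-line step (hypothetical): scan events one by one and yield the
    index of every event at which the mined pattern is matched."""
    recent = deque(maxlen=pattern["window"])
    for i, value in enumerate(events):
        recent.append(value > pattern["threshold"])
        if sum(recent) >= pattern["min_count"]:
            yield i


# Made-up data, for illustration only.
history = [10, 11, 9, 10, 10, 11, 9, 10]
stream = [10, 13, 10, 14, 15, 10, 10, 10]

pattern = mine_pattern(history)
matches = list(track_pattern(stream, pattern))
print(matches)
```

In this sketch the two steps share a language and a data structure, so the hand-off is trivial; with a real mining tool and a real event processing engine, the mined pattern would have to be translated into the engine's own pattern language.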
The synergy between big data and event processing has more patterns of use, since big data is in many cases manifested as streaming data that has to be analyzed in real time. Philip mentions InfoSphere Streams, which is the IBM platform for managing high-throughput streaming data. Data mining on transient data as a source for on-line event processing, and the real-time processing of high-throughput streaming data, are orthogonal topics that relate to two different dimensions of big data; my posting about the four Vs summarizes those dimensions.