Showing posts with label Infosphere Streams. Show all posts

Tuesday, July 10, 2012

Event processing in simulation mode - an Infosphere Streams example

I have written in the past about event processing in simulation mode. Recently, an article on IBM developerWorks described an Infosphere Streams implementation of a traffic simulation.

The picture above is taken from the article. A vehicle generator generates simulated vehicle events. Each round represents 1 second of real time, and cars accelerate, stop and otherwise behave according to traffic rules. This is an example where the stream processing is done not on real-life events, but on simulated events. The reported benefit of using Streams has been the large scale of the simulation (multiple vehicles, rounds of 1 second, multiple streets). Besides the scalability aspects, much of the simulated behavior may be expressed using operators on events. Simulation was one of the first areas to use some type of event processing, so using event processing for simulation closes a circle.
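To make the idea of rounds and operators on simulated events concrete, here is a minimal sketch in Python. It is not the code from the developerWorks article and it does not use the Streams language; the names `VehicleEvent`, `generate_vehicle_events`, `at_speed_limit` and `count_per_round`, the acceleration of 2 m/s per round and the speed limit of 10 m/s are all assumptions made for illustration. The only traffic rule modeled is "accelerate until the speed limit is reached".

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class VehicleEvent:
    """One simulated observation of one vehicle in one round (hypothetical schema)."""
    round_no: int        # simulated second
    vehicle_id: int
    position_m: float
    speed_mps: float


def generate_vehicle_events(num_vehicles: int, rounds: int,
                            accel: float = 2.0,
                            speed_limit: float = 10.0) -> Iterator[VehicleEvent]:
    """Hypothetical vehicle generator: one event per active vehicle per round."""
    positions = [0.0] * num_vehicles
    speeds = [0.0] * num_vehicles
    for round_no in range(1, rounds + 1):
        for vid in range(num_vehicles):
            # Staggered start: vehicle `vid` enters the street in round vid + 1.
            if round_no <= vid:
                continue
            # Assumed traffic rule: accelerate, but never exceed the speed limit.
            speeds[vid] = min(speeds[vid] + accel, speed_limit)
            positions[vid] += speeds[vid]  # each round is one simulated second
            yield VehicleEvent(round_no, vid, positions[vid], speeds[vid])


def at_speed_limit(events: Iterable[VehicleEvent],
                   speed_limit: float = 10.0) -> Iterator[VehicleEvent]:
    """Filter operator: keep only events of vehicles driving at the limit."""
    return (e for e in events if e.speed_mps >= speed_limit)


def count_per_round(events: Iterable[VehicleEvent]) -> Dict[int, int]:
    """Aggregation operator: number of events per simulated round."""
    counts: Dict[int, int] = {}
    for e in events:
        counts[e.round_no] = counts.get(e.round_no, 0) + 1
    return counts


if __name__ == "__main__":
    simulated = generate_vehicle_events(num_vehicles=3, rounds=6)
    print(count_per_round(at_speed_limit(simulated)))
```

The generator plays the role of the event source, and the filter and the aggregation are ordinary operators chained on top of it; nothing in them knows that the events are simulated rather than real.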

Thursday, November 24, 2011

More on big data and event processing



Philip Howard, one of the analysts who has followed the event processing area for many years, recently wrote about "CEP and big data", emphasizing the synergy of applying data mining techniques to big data as a basis for real-time scoring, based on a predictive model created by those techniques. His inspiration for writing this piece was reviewing the Red Lambda product. It is certainly true that creating event processing patterns off-line using mining techniques, and then tracking these event patterns on-line using event processing, is a valid combination, although the transfer from the data mining part to the event processing part typically requires some more work (which in most cases also involves some manual work). In general, getting a model built in one technology to be used by another technology is not smooth, and requires more work.
The synergy between big data and event processing has more patterns of use, since big data in many cases is manifested as streaming data that has to be analyzed in real time. Philip mentions Infosphere Streams, which is the IBM platform for managing high-throughput streaming data. Data mining on transient data as a source for on-line event processing, and real-time processing of high-throughput streaming data, are orthogonal topics that relate to two different dimensions of big data; my posting about the four Vs summarizes those dimensions.

Sunday, May 24, 2009

On System S


The logo above (in Hebrew) is the logo of the academic college of Emek Yezreel, where I serve on the steering committee of the Information Systems department, a new department that started this year. As part of my service to the community, I am helping various institutes establish academic plans that give an opportunity to people who otherwise would not be able to acquire an academic education. As Aliza Shenhar, the dominant president of this college, has said: many of the students are the first in their family ever to obtain an academic degree. We had an interesting discussion about the challenges that they have in teaching a diversified population.

Anyway, today I would like to write a little bit about System S, which has recently been highlighted by IBM, and has been covered in the NY Times, ComputerWorld, and some of the community blogs, by Paul Vincent and Marc Adler. The picture below, taken from the NY Times, shows Steve Mills, the head of IBM Software Group (on the left-hand side), and John Kelly, the head of my organization, IBM Research (on the right-hand side), both senior vice presidents in IBM, reporting directly to the CEO.


So what is System S, and how does it relate to event processing? In the slide that Steve Mills points at, the title is "Stream Computing", and indeed, this system takes streams in the broad sense: anything that constantly sends information of various types, such as video, audio, text and multi-media. The points in this slide show a data flow, and indeed, System S is a platform that can run a data flow of processing elements (in System S terminology). Each of them runs on a stream of a certain type and provides some form of analytics: filtering, aggregation, extracting features out of video, interpreting voice and much more. The platform can take advantage of supercomputers to provide parallel processing and digest a high throughput of data. You can read more about it on the IBM Research website (I am not sure it is up to date). System S is a prelude to an IBM product already announced under the name "Infosphere Streams".

Now, the question is: what is the relationship between System S and event processing? There are two different points. The first is that System S can take as input a large amount of streaming data, filter and aggregate it, and create a relatively small collection of events that can be further processed by some event processing engine. The other is that System S, as said, is a platform; the processing elements in this platform can be, in principle, event processing agents. In fact, while the semantics of the data flow are not identical to the semantics of an event processing network, it is possible to map an event processing network so that it is implemented on the System S platform. The SPADE language provides some such capabilities, and may be extended in time to include more. IBM takes a portfolio approach to event processing (BTW, IBM does not use the "CEP" TLA; it uses its own TLA, "BEP", for Business Event Processing. I tend to use event processing without prefixes and suffixes, as I stated before), since it believes that "one size fits all" does not work, due to differences in functional and non-functional properties. System S is definitely aimed at the high end, in terms of throughput requirements. More - Later.
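The first point, reducing a large raw stream to a small collection of events, can be sketched in a few lines of Python. This is only an illustration of the idea, not System S or SPADE code: the `Reading` and `DerivedEvent` types, the `reduce_stream` function, the 10-unit tumbling window and the threshold of 50 are all assumptions, and the sketch assumes readings arrive in timestamp order.

```python
from typing import Dict, Iterable, Iterator, NamedTuple, Tuple


class Reading(NamedTuple):
    """Raw stream element (hypothetical schema)."""
    timestamp: int
    sensor: str
    value: float


class DerivedEvent(NamedTuple):
    """Event handed on to a downstream event processing engine (hypothetical)."""
    window_start: int
    sensor: str
    mean_value: float


def reduce_stream(readings: Iterable[Reading],
                  window: int = 10,
                  threshold: float = 50.0) -> Iterator[DerivedEvent]:
    """Aggregate readings per sensor over tumbling windows, then filter.

    Only windows whose mean exceeds `threshold` produce an event, so the
    output is much smaller than the input. Assumes timestamp-ordered input.
    """
    current_start = None
    totals: Dict[str, Tuple[float, int]] = {}  # sensor -> (sum, count)

    def flush(start: int) -> Iterator[DerivedEvent]:
        for sensor in sorted(totals):
            total, count = totals[sensor]
            mean = total / count
            if mean > threshold:           # filtering step
                yield DerivedEvent(start, sensor, mean)

    for r in readings:
        start = (r.timestamp // window) * window
        if current_start is not None and start != current_start:
            yield from flush(current_start)  # the previous window is complete
            totals.clear()
        current_start = start
        total, count = totals.get(r.sensor, (0.0, 0))
        totals[r.sensor] = (total + r.value, count + 1)  # aggregation step

    if current_start is not None:
        yield from flush(current_start)      # emit the last, still-open window


if __name__ == "__main__":
    raw = []
    for t in range(20):
        raw.append(Reading(t, "a", 40.0 if t < 10 else 60.0))
        raw.append(Reading(t, "b", 10.0))
    for event in reduce_stream(raw):
        print(event)
```

In this toy run, forty raw readings are reduced to a single derived event, which is the kind of input an event processing engine would then match against patterns.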