Saturday, January 17, 2009

On Distribution and Parallelism in Event Processing

This picture, taken from the site of Nature Reviews as part of an article about "parallel processing in mammalian retina", illustrates that structures like the human body distribute the functions they need to perform, and perform many of them in parallel, using specialized systems.

Turning to event processing: the producers and consumers of an event processing system can be distributed, since events can come from many sources and situations may be consumed by many sinks. The first generation of event processing was mostly centralized, in two senses: functional centralization, in which a monolithic engine performs all processing functions, and location centralization, in which this engine runs on a single server.

Today I'll concentrate on the second aspect of centralization. There are various reasons to decentralize the processing. One is to do some of the activities closer to the producers or consumers. For example: if a producer produces events of which only 1% are relevant to the defined event processing, and the relevant ones can be selected by independent filtering that does not depend on other events, then it is more efficient for the filtering to take place at or close to the producer site, thus eliminating the unnecessary network traffic.
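To make the filtering example concrete, here is a minimal sketch in Python of a stateless filter applied at the source, before events cross the network. Everything in it is an assumption for illustration: the event layout (a dict with "id" and "type"), the names is_relevant, filter_at_source and make_events, and the rule that "alert" events are the relevant ones. It is not taken from any product.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, object]  # hypothetical event representation


def is_relevant(event: Event) -> bool:
    """Stateless predicate: it looks only at the event itself,
    never at other events, so it can run anywhere."""
    return event.get("type") == "alert"


def filter_at_source(
    events: Iterable[Event],
    predicate: Callable[[Event], bool] = is_relevant,
) -> Iterator[Event]:
    """Yield only the events worth sending over the network."""
    for event in events:
        if predicate(event):
            yield event


def make_events(n: int) -> List[Event]:
    """Synthetic producer: one event in a hundred is an 'alert'."""
    return [
        {"id": i, "type": "alert" if i % 100 == 0 else "heartbeat"}
        for i in range(n)
    ]


produced = make_events(1000)
forwarded = list(filter_at_source(produced))
print(f"produced {len(produced)} events, forwarded {len(forwarded)}")
```

Because the predicate does not depend on other events, it needs no shared state and can be placed wherever it saves the most traffic; a filter that had to correlate several events could not be moved this freely.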

Another reason to distribute the functionality is scalability, which is really the old idea of "divide and conquer". The challenge is how to find a "good" partition. First, there is a need to define what a "good" partition is, i.e. looking at it as an optimization problem, what the goal function is. Solving it then depends on the topology, semantics and behavior of the particular application, which can be dynamic.
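As a toy illustration of treating partitioning as an optimization problem, the sketch below picks one possible goal function, load imbalance (the busiest partition's load divided by the mean load), and evaluates it for a simple hash partition by key. Both the goal function and the hash scheme are my assumptions for the sake of the example; a realistic goal function would also have to account for topology and for the semantics of the application, e.g. which events must be processed together.

```python
import zlib
from collections import Counter
from typing import Sequence


def partition_of(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def imbalance(keys: Sequence[str], num_partitions: int) -> float:
    """Hypothetical goal function: max partition load / mean load.
    1.0 means perfectly balanced; larger values are worse."""
    if not keys:
        raise ValueError("need at least one key")
    loads = Counter(partition_of(k, num_partitions) for k in keys)
    mean_load = len(keys) / num_partitions
    return max(loads.values()) / mean_load


# Hypothetical workload: 10,000 events spread over 500 customer keys.
keys = [f"customer-{i % 500}" for i in range(10_000)]
for n in (1, 4, 16):
    print(n, "partitions -> imbalance", round(imbalance(keys, n), 3))
```

With a single partition the imbalance is 1.0 by definition; as the number of partitions grows, the value shows how far a given scheme is from an even split, which is one way to compare candidate partitions under this particular goal.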

IBM has recently released the first version of WBEXS (WebSphere Business Events eXtreme Scale) and made a statement of direction for another product, InfoSphere Streams; both aim to handle scalability by distribution, in different environments. Details about IBM products can be obtained from the appropriate people in IBM. We in the IBM Haifa Research Lab are working on related topics; we presented initial results at DEBS 2008, in the fast abstract session, introducing the stratification approach. The project has advanced substantially since then, and I'll discuss it further in future Blogs (well - I need to go over the Blog and list all the topics I promised to discuss later and have not done so yet...).

More - later
