Event Processing Thinking: off-line event processing


Saturday, March 10, 2012

Event processing as reducer in Map Reduce

A relatively recent posting from MSFT raises a question about the relationship between batch-oriented Map Reduce and on-line-oriented event processing. The answer, according to MSFT, is that event processing can be used as a reducer in Map Reduce, with multiple copies of an event processing engine performing the reduce function.
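To make the idea concrete, here is a minimal sketch in plain Python, not MSFT's implementation and not tied to any real Map Reduce framework. The event schema (key, timestamp, value), the function names, and the tumbling-window sum are all assumptions chosen for illustration; the point is only that each reducer replays its partition in time order, the way an event processing engine would.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical event schema: (key, timestamp, value).
Event = Tuple[str, int, float]


def map_phase(events: Iterable[Event]) -> Dict[str, List[Tuple[int, float]]]:
    """Map step: partition events by key so each reducer sees one key."""
    groups: Dict[str, List[Tuple[int, float]]] = defaultdict(list)
    for key, ts, value in events:
        groups[key].append((ts, value))
    return groups


def event_processing_reducer(
    key: str, items: List[Tuple[int, float]], window: int = 10
) -> List[Tuple[str, int, float]]:
    """Reduce step: stand-in for one copy of an event processing engine.

    Replays the partition in timestamp order and emits the sum of values
    per tumbling window of `window` time units.
    """
    sums: Dict[int, float] = defaultdict(float)
    for ts, value in sorted(items):
        window_start = ts // window * window
        sums[window_start] += value
    return [(key, start, total) for start, total in sorted(sums.items())]


def run_job(events: Iterable[Event], window: int = 10) -> List[Tuple[str, int, float]]:
    """Run map, then one reducer per key (sequentially here, for simplicity)."""
    results: List[Tuple[str, int, float]] = []
    for key, items in sorted(map_phase(events).items()):
        results.extend(event_processing_reducer(key, items, window))
    return results
```

In a real deployment the reducers would run in parallel on separate nodes; running them in a loop here only keeps the sketch self-contained.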

I have written before about offline event processing, with the insight that event processing is useful not only online but also offline, since it provides both an efficient implementation and a high-level abstraction for certain functions (pattern matching, aggregation and more), which makes it attractive to use in batch as well.

Of course, another synergy may be using event processing within real-time Hadoop, like Darkstar, as is frequently articulated by Colin Clark.

Sunday, February 1, 2009

On Off-Line Event Processing

A comment made by Hans Glide on one of my previous postings on this Blog prompted me to dedicate today's posting to Off-Line Event Processing. Well - as a person who is constantly off any line, I feel at home here...

Anyway -- some people may wonder and think that the title above is an oxymoron, since they put "real-time" as part of the definition of event processing. I have used this picture before; it best describes some of what is written about event processing - by everybody:

This, of course, illustrates a collection of blind people touching an elephant; each of them will describe the elephant quite differently. The phenomenon of people saying "event processing is only X", where X defines a subset of the area, is quite common. In our case X = "on line".

The best thing here is to tell you about a concrete example of a customer's application I am somewhat familiar with. The customer is a pharmaceutical company which monitors its supplier-related activities. It looks at events related to these activities and checks them against its internal regulations. The number of such events is several thousand per day, and from a business point of view there is no real-time requirement: the observation of any regulation violation, and the action taken, can wait until the next day. The system works by accumulating events during the day and activating the event processing system at the end of each day, which is actually batch processing done off-line.

An interesting question is why this customer has chosen to use an event processing system, rather than a more traditional approach of putting everything in a database and using SQL queries. The answer is quite simple: this application has some interesting properties:

The number of regulations is relatively high (in the higher range of three digits);
Many of the regulation rules detect temporally oriented patterns that involve multiple events;
Regulations are inserted or modified frequently.

Given all these, it turned out that using an event processing system off-line was the most cost-effective solution. While using SQL is nominally possible, writing these regulations in SQL is not easy, and their sheer number makes the investment in development and maintenance quite high.
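As a rough illustration of why declarative temporal rules are cheaper to maintain than hand-written queries, here is a small Python sketch of an end-of-day run. Everything in it is hypothetical: the event fields, the rule "a delivery must be followed by an inspection within 4 hours", and the names `missing_followup` and `run_end_of_day` are invented for the example and are not the customer's actual regulations or system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Event:
    supplier: str
    kind: str
    hour: int  # hours since the start of the day (assumed time unit)


# A rule takes the day's events and returns a list of violation messages.
Rule = Callable[[List[Event]], List[str]]


def missing_followup(first: str, then: str, within_hours: int) -> Rule:
    """Build a temporal pattern rule: each `first` event must be followed
    by a `then` event for the same supplier within `within_hours`."""

    def check(events: List[Event]) -> List[str]:
        violations = []
        for e in events:
            if e.kind != first:
                continue
            followed = any(
                f.kind == then
                and f.supplier == e.supplier
                and 0 <= f.hour - e.hour <= within_hours
                for f in events
            )
            if not followed:
                violations.append(
                    f"{e.supplier}: {first} at hour {e.hour} "
                    f"has no {then} within {within_hours}h"
                )
        return violations

    return check


# Adding or changing a regulation means editing one entry in this table.
RULES: Dict[str, Rule] = {
    "R1": missing_followup("delivery", "inspection", 4),
}


def run_end_of_day(events: List[Event], rules: Dict[str, Rule]) -> Dict[str, List[str]]:
    """Batch run: evaluate every rule over the day's accumulated events."""
    ordered = sorted(events, key=lambda e: e.hour)
    return {name: rule(ordered) for name, rule in rules.items()}
```

Each regulation is one line in the `RULES` table, so frequent insertions and modifications do not touch the evaluation code. The sketch ignores patterns that span the day boundary, which a real system would have to handle.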

So - the benefit of using event processing here is neither the real-time aspect, nor high throughput support, but simple TCO considerations.

This is not the only application of this type; in fact, I have seen several other cases in which event processing has been used off-line. There is also another branch of off-line processing, which combines on-line and off-line together, but I'll write about it in another posting...

More - Later.
