Friday, April 10, 2009

On the boundaries of event processing

We are in the Passover vacation, which celebrates the biblical story of the exodus from Egypt, where the sea split and people could pass through the gap that was created. Well -- I guess that at that time people just walked, but if it had occurred today it might have looked like this.
Today I would like to write something about the "boundaries" of event processing, based on some discussions last week, related to writing a book about event processing. There are two issues related to the scope:
  • Are the pre-processing done by the producer in order to emit events, and the post-processing of events done by the consumer, part of the event processing system?
  • Is the pre-processing done to obtain the event processing patterns that have to be monitored (e.g. using machine learning techniques) part of the event processing system?
From the point of view of an "event processing language", if we include the pre-processing and post-processing we will have to extend the language to have the expressive power of any programming language, which will lose the focus on specific event processing functionality. Thus, while an application may require pre- and post-processing, this is typically outside the "event processing network". The main point of using an "event processing language", rather than hard-coding the event processing functionality in Java, C# or any other imperative general-purpose language, is to work at a higher level of abstraction. As an analogy, before the days of SQL we had to read from the database, loop over the records, and evaluate the conditions in a hard-coded way. SQL did not provide anything we could not write in Cobol or PL/I (the languages of that time...); it just provided a more concise way to write it. The situation in event processing is similar. We can write something that is specified as:
"Match a pattern of events which is a conjunction of types E1, E2, E3 that refer to the same person and all occur within one hour of an event of type E0 for the same person; if there are several instances of E1, E2, E3, take the most recent of each at the point that the match occurred; and if there are multiple matches within this same time interval, ignore all but the first." Of course, one can write it in Java, but a language that makes it possible to write this pattern in less than one minute is more cost-effective.
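
To give a feel for what the hand-coded alternative involves, here is a minimal sketch of that pattern in Python rather than Java. It is only an illustration, not taken from any event processing product. The names `Event` and `ConjunctionMatcher` are hypothetical, and the sketch assumes that events arrive in timestamp order, that timestamps are in seconds, and that an E0 arriving while a window is already open for the same person is ignored.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

WINDOW_SECONDS = 3600          # "within one hour" of E0
REQUIRED = ("E1", "E2", "E3")  # the conjunction to be matched


@dataclass(frozen=True)
class Event:
    type: str
    person: str
    timestamp: float  # seconds; assumed to be non-decreasing on arrival


@dataclass
class _Window:
    opened_at: float                                      # timestamp of the E0 that opened it
    latest: Dict[str, Event] = field(default_factory=dict)
    matched: bool = False                                 # True once the first match was emitted


class ConjunctionMatcher:
    """Hand-coded version of the E0 -> (E1 and E2 and E3) pattern (hypothetical)."""

    def __init__(self, window: float = WINDOW_SECONDS) -> None:
        self.window = window
        self._open: Dict[str, _Window] = {}  # one open window per person

    def on_event(self, ev: Event) -> Optional[Tuple[Event, ...]]:
        w = self._open.get(ev.person)
        # Close the window for this person once the hour has passed.
        if w is not None and ev.timestamp - w.opened_at > self.window:
            del self._open[ev.person]
            w = None
        if ev.type == "E0":
            if w is None:  # assumption: E0 inside an open window is ignored
                self._open[ev.person] = _Window(opened_at=ev.timestamp)
            return None
        if w is None or w.matched or ev.type not in REQUIRED:
            return None
        w.latest[ev.type] = ev  # keep only the most recent instance of each type
        if all(t in w.latest for t in REQUIRED):
            w.matched = True    # later matches in the same window are ignored
            return tuple(w.latest[t] for t in REQUIRED)
        return None
```

Even this simplified version needs per-person state, window expiry, and explicit handling of the "most recent" and "first match only" policies, all of which the one-sentence specification expresses implicitly.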

Back to the scope -- pre- and post-processing of events and patterns are not part of the event processing system, and are typically done in different technologies. This does not mean that they are not important; sometimes the pre-processing of events is more complicated than the event processing itself, especially since it is hard-coded.
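
One way to picture this boundary is as three separate stages, of which only the middle one belongs to the event processing system. The sketch below is purely illustrative: the function names, the comma-separated input format, and the threshold filter are assumptions made up for the example, not part of any particular product.

```python
from typing import Any, Dict, Iterable, Iterator, List

Event = Dict[str, Any]


def preprocess(raw_readings: Iterable[str]) -> Iterator[Event]:
    """Producer side, OUTSIDE the event processing system.

    General-purpose code that turns raw "sensor, value" text (an assumed
    format) into events.
    """
    for line in raw_readings:
        sensor, value = line.split(",")
        yield {"type": "Reading", "sensor": sensor.strip(), "value": float(value)}


def event_processing_network(events: Iterable[Event], threshold: float) -> Iterator[Event]:
    """INSIDE the event processing system: filter events and derive new ones."""
    for ev in events:
        if ev["value"] > threshold:
            yield {"type": "HighReading", "sensor": ev["sensor"], "value": ev["value"]}


def postprocess(derived: Iterable[Event]) -> List[str]:
    """Consumer side, OUTSIDE the event processing system: act on derived events."""
    return [f"ALERT {ev['sensor']}: {ev['value']}" for ev in derived]


alerts = postprocess(event_processing_network(preprocess(["s1, 10.5", "s2, 3.0"]), 5.0))
```

In this picture only `event_processing_network` would be written in an event processing language; the other two stages stay in whatever general-purpose technology the producer and the consumer already use.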

More on this - later

2 comments:

Paul Vincent said...

Hi Opher - another view is that "PreProcessing" is effectively "IT preparation" - selecting the right EP Agent, filtering out any irrelevant events, etc.

Probably you can do EPElement operations in a preprocessor, but also system control operations...

Cheers

Opher Etzion said...

Hi Paul. The selection of the right EP agents and the filtering out of irrelevant events may be part of the processing itself; filtering events, when done by a producer outside the boundaries of an event processing system, can be counted as pre-processing. The borderline is sometimes fuzzy.

cheers,

Opher