Wednesday, December 10, 2008

On contexts and separation of concerns - the case of "On Off Windows"

The window in the picture cannot be openned or closed without some fixing, which is typical to some of the event processing language that support in fixed window. I have been quite amused in the last couple of days to read the discussion in the CEP-Interest Group on Yahoo!, started by
a query posted by Pedro Bizarro.
If you do not subscribe on this group I am copying Pedro's query:

Hi there,

A company contacted me to solve a query puzzle that they have. They
are reading incoming tuples and want to select all tuples that are
between two tuples. For example, assume that I want all tuples between
the first tuple with PARAM1=2 and the first tuple after this one with
PARAM2=0. This would select all tuples with time between 3 and 8 in
the example below.

1 1 0
2 1 1
3 2 2 <== ON
4 2 3
5 3 5
6 3 4
7 2 3
8 2 0 <== OFF
9 3 1

I have tried a number of event processing engines but cannot write
such a query. (However, I can write the query on a table on Oracle 11g
using Oracle's Analytic functions with the help of a simple PL/SQL

Do you have a solution for the OnOff Window puzzle?


You can browse and see the variety of answers given by different vendors - my intention is not to analyze the results, but look at the semantics of what is asked by Pedro. Pedro is talking about window, typically window is used in order to select a collection of events and perform some operation on them, in Pedro's case the operation is trivial - just select all of them, but we can slightly change the requirement and change it to "find if a pattern P is satisfied in this collection of events" (example of a pattern -- there are at least two events in which PARAM1 > PARAM2 for the same event) and if this pattern is detected perform some action (e.g. send me SMS).

Here we have three types of conditions:

1. Condition for opening the window
2. Condition for closing the window
3. Condition for detecting a certain pattern (in this case the result of the detection is a "situation" since it triggers action outside the event processing network.

Condition 3 is a normal type of "pattern detection" processing, however, conditions 1 and 2 have different roles -- they are used in order to decide which events will be used as input for the pattern detection operation. One of the answers was given by my colleague from IBM Haifa, Guy Sharon, who talked about the AMiT implementation of "lifespan" as a temporal context. Again, without going to the particular language, I would like to point out that the principle of "separation of concerns" --- have distinct separation between conditions to select the events that participate in an operation and the operation itself provides better clarity to the end result.
Since a window is indeed a subset of the term context on which I have blogged several time before and pointed out about the importance of looking at context as a first class citizen in event processing. Context can have several dimensions -- the temporal dimension of context subsumes the notion of window, and there are other dimension (e.g. spatial - that selects events based on their location).

Bottom line: my belief is that the consumability of event processing will increase if we'll succeed to make them available to semi-technical people and have clear semantic roles to the building blocks instead of trust the developers that they will find a way to hack a bit in order to satisfy requirements -- more on this topic -- later.

No comments: