Saturday, October 11, 2008

More on semantics and race conditions

In the previous posting I posed the following scenario:

Given the simple application shown below:

  • There is a single event source (so no clock synchronization issues) which generates events of three types e1, e2, e3.

  • Let's also say that in our story a single event of each type is published (so no synonym issues). The table shows their occurrence time (when they occurred in reality) and detection time (when they were reported to the system). Each of them was reported 1 time unit after its occurrence, so there is no re-ordering problem.

  • Events e1, e2 serve as an input to an EPA of type "pattern detection" which detects a temporal sequence pattern "e1 before e2", and when this is detected, it derives an event e4 - some function of e1 and e2.

  • Events e3 (raw event) and e4 (derived event) serve as input to another EPA of type "pattern detection", which again detects a temporal sequence pattern, "e3 before e4". If this pattern is detected, it creates event e5, which triggers some action in the consumer.

I also asked the question: given the above, will the action triggered by e5 occur? I.e., will the pattern "e3 before e4" be evaluated to true?

I got a few answers to this, and you can read them as comments to the original posting. As promised, I am dedicating this posting to an analysis of this simple case:

The first thing to discuss is the semantics of "temporal sequence". There are two possible types of semantics for temporal sequence, which I call "detection time semantics" and "occurrence time semantics".

  • The detection time semantics is implemented in various languages and means that the temporal order is the order of the time-stamps in which the "event processing platform" detects that this event occurs; if there is a single thread of such detection, then the events are totally ordered, otherwise, there may be several events with the same "detection timestamp".
  • The occurrence time semantics, also implemented in various languages, means that the temporal order is the order of the time-stamps that are provided as part of the event information and designate when this event happened in reality. There is some complexity in synchronizing time in a multi-producer environment; however, in this example we assume a single producer (I'll write about multi-producer cases in another posting).
  • Note that these two order relations may not be identical.
  • There is also a kind of hybrid solution ("total order semantics"). The semantics is really "detection time" semantics, but in order to allow events that arrive a bit late to take their proper place, the events are queued in a buffer (and not considered as detected) until a time-out expires. This lets "out of order" events arrive and be re-ordered in the buffer, and the events are then sent on according to the buffer order.
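
The hybrid solution in the last bullet can be sketched roughly as follows. This is only a minimal illustration under my own assumptions, not how any particular product does it: the names `Event`, `ReorderBuffer` and `timeout` are made up for this sketch, the two events "a" and "b" are hypothetical, and I assume an event is held until its occurrence time plus the time-out has passed.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Event:
    occurrence_time: int
    name: str = field(compare=False)  # ordering uses occurrence time only


class ReorderBuffer:
    """Hold events for `timeout` time units so late arrivals can be re-ordered."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._heap = []  # min-heap keyed on occurrence time

    def arrive(self, event):
        # The event is queued, not yet considered "detected".
        heapq.heappush(self._heap, event)

    def release(self, now):
        """Return, in occurrence order, the events whose waiting period is over."""
        released = []
        while self._heap and self._heap[0].occurrence_time + self.timeout <= now:
            released.append(heapq.heappop(self._heap))
        return released


# Hypothetical arrival order: "b" (occurred at 2) arrives before "a" (occurred at 1).
buf = ReorderBuffer(timeout=2)
buf.arrive(Event(2, "b"))
buf.arrive(Event(1, "a"))

order = [e.name for e in buf.release(now=4)]
print(order)  # ['a', 'b']
```

Without the buffer, plain detection time semantics would see "b" before "a"; with it, the late event takes its proper place at the cost of delaying every event by the time-out.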

Getting back to the example - in the small table on the bottom left-hand side of the figure above, there are occurrence and detection times of e1, e2, e3. For e4 there is only a detection time - e4 differs from {e1, e2, e3} in that it is a derived event and not a raw event like the other three. The question is "what is the occurrence time of a derived event?" There is no single clear answer; there are several possible answers:

  • In the derived event case the occurrence time = detection time, since this event is not a real event but a virtual one; thus its source is the EPA that creates it, and it occurred when it was created. In our case it means that occurrence-time (e4) = 4.
  • Its occurrence time is the occurrence time of the last event that completed the pattern. Since the participating events in the creation of e4 are {e1, e2} and e2 was the last one, which completed the pattern, occurrence-time (e4) = occurrence-time (e2) = 2.
  • Interval semantics: the event e4 occurs in the interval in which all the participants occur, which in this case means occurrence-time (e4) = [1, 2].
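
The three answers above can be written down as three interchangeable policies. The sketch below is only an illustration: the `Event` record and the function names are my own invention, and the time values are the ones used in this posting (e1 and e2 occur at 1 and 2, each is detected one time unit later, and e4 is derived at time 4).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    occurrence_time: int
    detection_time: int


def occurrence_by_derivation(participants, derivation_time):
    """Policy 1: the derived event occurs when the EPA creates it."""
    return derivation_time


def occurrence_by_last_participant(participants, derivation_time):
    """Policy 2: inherit the occurrence time of the event that completed the pattern."""
    last = max(participants, key=lambda e: e.detection_time)
    return last.occurrence_time


def occurrence_as_interval(participants, derivation_time):
    """Policy 3: the interval in which all the participants occur."""
    times = [e.occurrence_time for e in participants]
    return (min(times), max(times))


e1 = Event("e1", occurrence_time=1, detection_time=2)
e2 = Event("e2", occurrence_time=2, detection_time=3)

for policy in (occurrence_by_derivation,
               occurrence_by_last_participant,
               occurrence_as_interval):
    print(policy.__name__, policy([e1, e2], derivation_time=4))
# occurrence_by_derivation 4
# occurrence_by_last_participant 2
# occurrence_as_interval (1, 2)
```

Passing the policy as a function is one simple way to let the user choose the semantics rather than hard-coding a single answer.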

The phenomenon of multiple semantic interpretations applies to various other decisions in the semantics of an event processing language. The preferred solution is to provide the user with semantic "fine tuning" policies, under which the user can choose the desired semantics (with the most common one as the default), instead of "hard coding" a certain semantics. This is one of the benefits of using COTS for event processing, since it is quite difficult to think about such issues when developing EP manually in a conventional language.

The semantics of the second "temporal sequence" (e3, e4) is thus:

  • According to "detection time" semantics -- both have a detection time of 4. As such, the sequence condition is not satisfied. However, if we impose total order by a single thread, this may create a race condition between the two events. In this case it is recommended to use a consistent priority policy - either breadth first (the raw event always comes first) or depth first (the derived event always comes first) - to ensure a deterministic result.
  • According to "occurrence time" semantics -- it depends on the policy chosen, but according to all interpretations e4 occurs before e3, thus the temporal sequence is not satisfied.
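
The priority policy in the detection time case can be illustrated with a small sketch. Everything here is an assumption made for illustration: the tuple layout, the policy names as strings, and the helper functions `total_order` and `sequence_holds` are not taken from any product. The only data used is what is stated above: e3 (raw) and e4 (derived) both have detection time 4.

```python
RAW, DERIVED = "raw", "derived"

# (name, kind, detection_time): both events are detected at time 4.
pending = [("e4", DERIVED, 4), ("e3", RAW, 4)]


def total_order(events, policy):
    """Sort by detection time; break ties with the chosen priority policy."""
    if policy == "breadth_first":    # the raw event always comes first
        rank = {RAW: 0, DERIVED: 1}
    elif policy == "depth_first":    # the derived event always comes first
        rank = {DERIVED: 0, RAW: 1}
    else:
        raise ValueError(f"unknown policy: {policy}")
    return sorted(events, key=lambda ev: (ev[2], rank[ev[1]]))


def sequence_holds(ordered, first, second):
    """True if `first` comes strictly before `second` in the total order."""
    names = [name for name, _, _ in ordered]
    return names.index(first) < names.index(second)


for policy in ("breadth_first", "depth_first"):
    ordered = total_order(pending, policy)
    print(policy, sequence_holds(ordered, "e3", "e4"))
# breadth_first True
# depth_first False
```

Because the tie is always broken the same way, the result no longer depends on which event the single thread happens to pick up first.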

Bottom line: the temporal sequence (e3, e4) is satisfied if:

  • The temporal semantics is detection time
  • It is implemented by total order
  • The total order policy is "breadth first" - namely priority for the raw events.

In all other cases the temporal sequence will not be satisfied and the resulting action will not execute.

Wednesday, October 8, 2008

On HITC and some small stuff

About a month ago I wrote about the Arab Israeli High Tech Center, whose first goal will be to retrain Arab Israeli engineers and mathematicians to work in the very demanding Israeli high-tech industry. Earlier this week there was a very impressive kick-off ceremony for the center (which somehow migrated from AITC to HITC). In the two pictures above, the top one is a group photo of the management and industry advisory committee of the center, and in the bottom one you can see me sitting in the crowd with my eldest daughter, Anat, who decided to come and watch. This project is highly supported by the Israeli high-tech industry. Some companies sent their country general managers, among them HP, Oracle, BMC, EDS and Matrix (an Israeli services company), and other companies, like IBM, Motorola, Intel, Microsoft, Checkpoint and more, were represented by a senior person. There were several hundred people present, and it was quite impressive. Studies will start in February 2009, and there is still a lot of work to be done to make it happen, but people came with a feeling of history in the making.

And back to event processing -- in the next posting I'll talk about the semantic question I posted last week, and meanwhile just some short comments:

  • Mark Tsimelzon from Coral8 tries to define what a "CEP engine" is, stating that there is some confusion in the market about this. I almost agree with what he has written and wondered if I should react, since my reaction can further confuse people... So I'll just remark that the term "platform" is starting to be very popular, but with somewhat different meanings. I'll write more about platforms in the future.
  • Marc Adler is blogging about MSFT Oslo and his CEP application. Without going into further details now, I believe that the direction of letting users work in their own domain terminology and way of thinking, and then mapping it automatically to an execution language (directly or through an intermediate representation), is a correct idea, somewhat beyond the state of the art today. There will probably be several ways to do it, but it is a good topic to work on.
  • Marco from RuleCore is blogging about the pain of their SaaS model and mentions some obstacles. This is a good topic for further discussion; it was also presented in the EPTS meeting by Bob Marcus. Clearly, there are some applications that can be served by this model and some (e.g. distributed applications) that cannot. I will discuss this issue at length in one of the coming postings.

More topics - Later