Saturday, November 29, 2008

On basic classification of terms

Besides the lingering uncertainty in the general IT community about what event processing is (or CEP, the TLA most commonly associated with it), some discussion in the blogosphere has led to further confusion, e.g. the claim that CEP equals BRMS. Carole-Ann Matignon, VP, Product Management at Fair Isaac, has a nice posting in her Blog that talks about this confusion and provides what she calls a simplistic view of the definitions of BPM, BRMS, and CEP. I'll follow the simplistic direction of thinking and claim that this is a mix of terms from three different domains -- application, technique and function.

Function answers the question -- what is being done?
Technique answers the question -- how is it being done?
Application answers the question -- what is the problem being solved?

  • Business Activity Monitoring (BAM) is an application type; it solves the problem of controlling business activities in order to optimize the business, deal with exceptions, etc.
  • Business Rules are a type of technique, which can be used to infer facts from other facts or rules (inference rules), or to determine an action when an event occurs and a condition is satisfied (ECA rules), and more (there are at least half a dozen types of rules, all of which are techniques to do something).
  • Event Processing is really a set of functions that do what the name indicates -- process events. Processing can be filtering, transforming, enriching, routing, detecting patterns, deriving, and some more.
Of course there are inter-relationships among them: some types of business rules are a non-exclusive way to execute some of the functions of event processing, e.g. ECA rules can be used for routing, inference rules can be used for event derivation, and more. However, there are other techniques that can be used for each of these functions. Likewise, a BAM application may use event processing functions as part of its implementation, but may also use other techniques (e.g. being data-driven rather than event-driven and doing all its processing in retrospect, looking periodically at committed data and not at events as data-in-motion). Business Rules can be put to various uses in BAM and in BPM applications, with or without the use of event processing.
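To make the function/technique distinction concrete, here is a minimal sketch of the routing function implemented with the ECA-rule technique. Everything in it (the `EcaRule` class, the `queues` names, the 10,000 threshold) is a hypothetical illustration, not taken from any product; the same routing could equally be implemented with a different technique.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, object]


@dataclass
class EcaRule:
    """Event-Condition-Action: on `event_type`, if `condition` holds, run `action`."""
    event_type: str
    condition: Callable[[Event], bool]
    action: Callable[[Event], None]

    def fire(self, event: Event) -> bool:
        # The condition is only evaluated for events of the matching type.
        if event["type"] == self.event_type and self.condition(event):
            self.action(event)
            return True
        return False


# Hypothetical routing destinations.
queues: Dict[str, List[Event]] = {"fraud_desk": [], "archive": []}

# Two ECA rules that together implement the *function* "routing".
rules = [
    EcaRule("payment", lambda e: e["amount"] > 10_000,
            lambda e: queues["fraud_desk"].append(e)),
    EcaRule("payment", lambda e: e["amount"] <= 10_000,
            lambda e: queues["archive"].append(e)),
]


def route(event: Event) -> int:
    """Offer the event to every rule; return how many rules fired."""
    return sum(rule.fire(event) for rule in rules)


route({"type": "payment", "amount": 25_000})  # goes to the fraud desk
route({"type": "payment", "amount": 40})      # goes to the archive
route({"type": "login", "user": "dana"})      # no rule applies
```

The function here is routing; the ECA rules are just one way of getting it done.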

The functions of event processing are typically used with several motivations:
  • Observation into business processes, business activity monitoring.
  • Dynamic reaction and modification to transactions and business processes.
  • Diagnosis of problems and finding root-cause.
  • Prediction of future problematic events that need to be eliminated or mitigated.
  • Dissemination of data in motion, so that it reaches the right person at the right time in the right granularity.
These motivations are not applications: there are multiple applications that perform diagnosis (system management, car repairs, medical diagnosis, oil drilling management), and there are also multiple applications that perform more than one of these, e.g. observation and reaction, or prediction and reaction. I have blogged before about the metaphor of blind people touching an elephant: some people look at a single line of applications, or a single implementation technique, and say that CEP is X, while another person who is thinking of another concept says that CEP is Y, and thus confusion is created - the following elephant's picture illustrates it (taken from the original postings).

Bottom line: the confusion is a result of equating terms from various classes, and resolving this confusion is indeed very simple.

Friday, November 28, 2008

More on event hierarchies - cycles as hierarchy spoiler

Yesterday I had the honor of having dinner with the Israeli president Shimon Peres (well, not a 1-1 dinner, not even at the same table, but at the table next to his); this was at an event that included many of the senior people of Israeli high-tech, about a project in which I am involved, helping Arabs to break into Israeli high-tech, about which I have written before.

In Israel, the president is a symbolic head of state, while the "CEO" is the prime minister; Shimon Peres, seen in this picture from last week, when he was knighted by the UK queen, is a very impressive person, 85 years old, who has a lot of energy and is considered the ultimate elder statesman. I wish that I'll have the same energy and vitality at the age of 58...

Back to event processing -- reflecting back on the "event hierarchy" discussion. Some people assume that the event processing network is a DAG (Directed Acyclic Graph), a structure whose handling is well known. Unfortunately, the world is sometimes cyclic, and the event processing network that represents this world is then also cyclic.

Let's look at the following example -- I went to the interior ministry for re
This is, of course, an event processing system, and a relatively simple one. Trying to sketch the event processing network, we'll have events like:
customer arrived, clerk arrived, clerk displayed number, customer sits in front of clerk, customer gets up and leaves the clerk, clerk displays number.

Looking at these events we can realize that they create the following cycle:
clerk arrived; clerk displayed N; customer sits in front of clerk; customer leaves clerk; clerk displayed L; etc. This is a cycle: "clerk displayed L" is an indirect descendant of "clerk displayed N", and likewise we need to continue circling...
Since the event processing network deals with event types, we can say that "clerk displayed number" repeats itself. This may even be for the same clerk and the same number: the same clerk, since the clerk's id is fixed, and the same number, since numbers can be recycled (e.g. after getting to 20, return to 1)... The support of cycles, of course, spoils the notion of hierarchy.
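Whether a given network of event types is a DAG can be checked mechanically. The sketch below is one possible way to do it: a depth-first search over a hypothetical "event type A may lead to event type B" map, which is my own encoding of the example above and not a feature of any particular engine.

```python
from typing import Dict, List, Optional

WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored


def find_cycle(edges: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of event types, or None if the graph is a DAG."""
    colour = {node: WHITE for node in edges}
    for targets in edges.values():
        for target in targets:
            colour.setdefault(target, WHITE)
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        colour[node] = GREY
        path.append(node)
        for nxt in edges.get(node, []):
            if colour[nxt] == GREY:
                # Reached a node that is still on the current path: a cycle.
                return path[path.index(nxt):] + [nxt]
            if colour[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(colour):
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# The clerk example, at the level of event types.
clerk_network = {
    "clerk arrived": ["clerk displayed number"],
    "clerk displayed number": ["customer sits in front of clerk"],
    "customer sits in front of clerk": ["customer leaves clerk"],
    "customer leaves clerk": ["clerk displayed number"],
}

cycle = find_cycle(clerk_network)
```

For the clerk network the search reports the loop that starts and ends at "clerk displayed number", which is exactly the hierarchy spoiler.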

How do we know that the cycle will not be infinite? If the context is bounded, then there is a natural stopper for the cycle (e.g. the end of the reception hours). If the context is not bounded, then an infinite loop is possible, and there should be other means to detect and handle it if it occurs (e.g. allowing only a limited number of cycles).
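Both stoppers can be sketched in a few lines. The code below is an illustrative assumption only: `NEXT_EVENT`, `run_network`, the step-based notion of "reception hours" and the `max_cycles` guard are all names I made up for this example.

```python
from typing import Callable, Dict, List, Tuple

# Each event type leads to the next; the last one loops back to the first.
NEXT_EVENT: Dict[str, str] = {
    "clerk displayed number": "customer sits in front of clerk",
    "customer sits in front of clerk": "customer leaves clerk",
    "customer leaves clerk": "clerk displayed number",
}


def run_network(start: str,
                context_open: Callable[[int], bool],
                max_cycles: int) -> Tuple[List[str], str]:
    """Follow the cycle until the context closes or the cycle limit is hit.

    Returns the trace of event types and the reason for stopping.
    """
    trace: List[str] = []
    cycles = 0
    current = start
    step = 0
    while True:
        if not context_open(step):
            return trace, "context closed"       # natural stopper
        if current == start and trace:           # back at the start: one full cycle done
            cycles += 1
            if cycles >= max_cycles:
                return trace, "cycle limit reached"  # safety stopper
        trace.append(current)
        current = NEXT_EVENT[current]
        step += 1


# Bounded context: "reception hours" last 7 steps.
bounded_trace, bounded_reason = run_network(
    "clerk displayed number", lambda step: step < 7, max_cycles=100)

# Unbounded context: only the cycle limit stops the loop.
unbounded_trace, unbounded_reason = run_network(
    "clerk displayed number", lambda step: True, max_cycles=3)
```

In the bounded run the context ends the loop on its own; in the unbounded run nothing would stop it without the explicit limit.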

Should we forbid cycles? We are back to the dilemma about Russell's type theory.
We can forbid them, and then our language will be restricted and will not be able to express scenarios such as the one in the example above. If we allow them, we spoil the strict hierarchy; but, as mentioned in the previous postings, a strict hierarchy of events may not be realistic.

Wednesday, November 26, 2008

On event hierarchies and types

In a comment to my previous posting Harvey has reminded me that Israel won second place ("gold medal") in the Chess Olympics; actually, it was number one through part of the games and then moved to number two, which is also very respectable. I have not played chess since high school, but I have registered to play in an internal IBM championship in Backgammon, a game I have not played a lot recently, so I need to practice...

I am following with interest Marco's Rulecore blogging about geo-spatial operators. I think this is in the right direction, there is a lot of potential in spatial and even spatio-temporal patterns, and I also view space as one of the main dimensions of context. However, today I would like to write about one of the other topics that Marco has been writing about -- event hierarchies. In this posting Marco suggests hierarchy levels of events and rules, such that rules in level 1 consume only events of level 1 and produce events of level 2, and so forth. This reminds me of a great scholar named Bertrand Russell, whose picture you can see below:
Russell has introduced a well-known paradox which in its popular form is called the "Barber Paradox" (Wikipedia claims that
which states that in a village there is a barber who shaves everybody who does not shave himself, and only those. The question is: who shaves the barber? If the barber shaves himself, then he is one of those whom the barber does not shave; however, if the barber does not shave himself, then he is one of those whom the barber shaves. In other words, the barber shaves himself if and only if he does not shave himself. Russell also provided a solution for the paradox, called "type theory", saying that we have a hierarchy of terms and functions/predicates based on "types". Thus if a "person" is of type 1, Barber is a predicate of a person, which makes it a term of type 2; being a term of type 2, it cannot participate in predicates on type 1, and thus the entire paradox does not happen. Type theory indeed resolves the paradox (if you wonder, the "types" in programming languages are indeed derived from Russell's terms); however, the price is that the expressive power of the language is compromised: things which seem to us perfectly valid in natural language are not valid under type theory.

Back to event processing -- let's look at the following example:
  • Our location-based event processing system manages a fleet of cabs and travel reservations.
  • An order arrives for a ride in 30 minutes
    - this is an event of type 1; it is processed and matched against taxi locations and directions to assign the right cab, producing a "cab assignment", which is an event of type 2 (since it was created by an event processing agent, a "rule" in Marco's terminology, that processed events of type 1).
  • After a short while the prospective traveller changes his mind and calls to cancel the order. This is, again, an event of type 1.
  • However, now we have to find out whether the travel has already been allocated and, if so, deallocate it and free the allocated taxi. This operation has to take two events as input -- the travel cancellation (type 1) and the travel allocation (type 2)... Oops... our system works purely on a strict hierarchy of event types -- can't do that!
This demonstrates that by using a strict hierarchy we'll have to give up expressive power, as with Russell's type theory. So it seems that while hierarchy is a good idea, it may also serve as a restriction... One modification is indeed to order the event types and the agents, but to enable an agent of type 2 to accept as input events of both type 1 and type 2.
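The difference between the strict and the relaxed hierarchy fits in two small predicates. This is only a sketch of my reading of the levels described above; the `Agent` class and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Agent:
    name: str
    level: int  # hierarchy level of the agent


def accepts_strict(agent: Agent, input_levels: Iterable[int]) -> bool:
    """Strict hierarchy: an agent of level n consumes only events of level n."""
    return all(level == agent.level for level in input_levels)


def accepts_relaxed(agent: Agent, input_levels: Iterable[int]) -> bool:
    """Relaxed hierarchy: an agent of level n consumes events of level n or lower."""
    return all(level <= agent.level for level in input_levels)


def output_level(agent: Agent) -> int:
    """In both variants an agent of level n produces events of level n + 1."""
    return agent.level + 1


# The deallocation agent needs the cancellation (type 1) and the allocation (type 2).
deallocate = Agent("deallocate cab", level=2)
needed_inputs = [1, 2]

strict_ok = accepts_strict(deallocate, needed_inputs)    # False: mixes levels
relaxed_ok = accepts_relaxed(deallocate, needed_inputs)  # True
```

Under the relaxed rule the levels are still ordered, since every agent still produces events one level above its own.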

With this correction there are certain benefits to using the hierarchy levels, related to optimization possibilities that Marco has not mentioned. We'll discuss them in one of the next postings.

Monday, November 24, 2008

On evaluation criteria for EP products

Typically, I refrain from reacting in this Blog to any marketing material presented by vendors, a restriction I have taken upon myself as the chair of EPTS. I am not deviating from this rule, but since my friends in Coral8 have posted their article entitled: Comprehensive Guide to Evaluating Event Stream Processing Engines on David Luckham's site, as a vendor-neutral service to the community, I am taking the liberty of adding some footnotes to this paper.

On the positive side, I think that this type of work is useful, that discussions about it are also useful, and that many of the criteria presented are valid. We in IBM have in the past devised evaluation criteria for internal purposes that included many of the mentioned criteria; I have to check whether we can expose them.

On the critical side - here are several comments:

1. The first claim is that the authors view "event stream processing" and "complex event processing" as one and the same, saying that customers do not make a distinction between the terms, and that there is no agreed-upon terminology. I refer the authors to the EPTS glossary as a reference for terminology. But regardless of that, I would agree that customers typically don't care what TLA is used; the substance is more important.

2. The statement that this document covers ESP and CEP, which are one and the same, creates the feeling that the document is general. However, reading further, I find among the criteria that define what an ESP engine is the following condition: "...process large volumes of incoming messages or events". This criterion confuses me: is that a fundamental property of an ESP/CEP engine? I have heard in the recent year some analysts' talks saying that most of the potential EP applications are actually not the "high volume" ones; furthermore, the customers I know have various degrees of event volumes, some of them high, some low. So maybe this is not part of the definition of what an engine is, but an evaluation criterion for a certain set of applications.

3. Reading further, I see terms like continuous queries and windows -- terms that already assume a certain type of implementation (indeed, query-based stream processing). This fits the title of "event stream processing", assuming there is an agreement that this is what ESP is; however, it does not represent the entire spectrum. Continuous queries are a technique intended to achieve some functionality, which can also be achieved by other means...

Personally, I believe that "one size fits all" does not work, and that different event processing applications have different functional and non-functional requirements. The various performance aspects are more important in some applications and less important in others; note that there are also no standard benchmarks yet. I hope that the work of the EPTS work group on use cases, which is planned to result in a classification of event processing applications, will yield a finite, manageable number of application classes, so that the evaluation criteria can be partitioned by type.

And -- if possible, hands on experience indeed makes the evaluation more accurate and removes noise of preconceptions and false assumptions... More on evaluation - later.

Sunday, November 23, 2008

On the rain in the window -- windows and temporal contexts

I realized that I have not written for a while; I am not out of topics, just trying to do too many things in parallel... Anyway, I am typically late, relative to most others, in changing from summer clothing to winter clothing, but it happened yesterday; maybe the noise of the heavy rain in the window brought me to change from a short shirt and sandals to a long shirt and shoes.
Winter is a relative term; people who live in some climates would not call our winter a winter.

Last night I attended the concluding session of a "students exchange trip" in which my 13-year-old daughter Hadas participated. They visited a school in Foster City, California, as part of a program called "ambassadors", and they also had to give speeches about various aspects of Israel; one of their challenges was to convince their hosts to come to Israel for a counter-visit. Since the international media create the preconception that Israel is a dangerous place to be, with wars in the streets etc., some people (typically those who have never been here) are afraid to come... It seems that the children succeeded in convincing them that in Haifa we live a normal life, and there is no war in the streets... Actually, I usually tell people who ask me that I feel much safer in Haifa than in New York, London, or Paris. Paris is the only place where I was attacked by thieves, so it is the most terrifying city for me.

Back to the rain in the window. The notion of "window", which came from stream processing, is used to process a sub-stream that is bounded by time (or by number of occurrences). In some cases a window can be specified by a starting time and a duration, or can slide at certain time intervals. In other cases, however, we need to process events in a time interval such as "while it is raining"; this is done either to find certain patterns that are only relevant while it rains, or for the classic stream processing application -- aggregation within a sub-stream. In any case, such an interval is not determined by a fixed time, and its duration is not known in advance. It can be either "while something is in state S" or a time interval that starts with the occurrence of event E1 and ends with the occurrence of event E2. An interval may also expire if the state lasts too long...
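As a rough illustration, the sketch below aggregates within an interval that is opened by one event type and closed by another, with an optional expiry. The function name, the event tuple layout, and the choice to count accidents while it rains are all assumptions made for the example, not a formal definition of context.

```python
from typing import List, Optional, Tuple

Event = Tuple[float, str, float]  # (timestamp, event type, value)


def aggregate_in_context(events: List[Event],
                         initiator: str,
                         terminator: str,
                         measured: str,
                         max_duration: Optional[float] = None) -> List[float]:
    """Sum the values of `measured` events inside each initiator..terminator interval.

    Events are assumed to be sorted by timestamp. An interval that is still
    open when the input ends is not reported.
    """
    results: List[float] = []
    opened_at: Optional[float] = None
    total = 0.0
    for ts, etype, value in events:
        # Expire the interval if the state has lasted too long.
        if (opened_at is not None and max_duration is not None
                and ts - opened_at > max_duration):
            results.append(total)
            opened_at = None
        if etype == initiator:
            if opened_at is None:
                opened_at, total = ts, 0.0
        elif etype == terminator:
            if opened_at is not None:
                results.append(total)
                opened_at = None
        elif etype == measured and opened_at is not None:
            total += value
    return results


stream: List[Event] = [
    (0, "accident", 1), (1, "rain started", 0), (2, "accident", 1),
    (3, "accident", 1), (4, "rain stopped", 0), (5, "accident", 1),
    (6, "rain started", 0), (7, "accident", 1), (20, "accident", 1),
    (21, "rain stopped", 0),
]

# Accidents per rainy interval, without and with an expiry of 10 time units.
no_expiry = aggregate_in_context(stream, "rain started", "rain stopped", "accident")
with_expiry = aggregate_in_context(stream, "rain started", "rain stopped", "accident",
                                   max_duration=10)
```

Note that in this sketch expiry is only noticed when the next event arrives; a real engine would more likely use a timer.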

I'll re-visit the notion of context and its formal definition soon.