Besides the uncertainty that still exists in the general IT community about what event processing is (or CEP, the most common TLA associated with it), some discussion in the blogosphere has led to further confusion, e.g. the claim that CEP equals BRMS. Carole-Ann Matignon, VP, Product Management at Fair Isaac, has a nice posting in her blog that talks about this confusion and provides what she calls a simplistic view of the definitions of BPM, BRMS, and CEP. I'll go with the simplistic direction of thinking and claim that this is a mix of terms from three different domains: application, technique and function.
Function answers the question: what is being done? Technique answers the question: how is it being done? Application answers the question: what is the problem being solved?
Business Activity Monitoring (BAM) is an application type; it solves the problem of controlling business activities in order to optimize the business, deal with exceptions, etc.
Business rules are a type of technique, which can be used to infer facts from other facts or rules (inference rules), or to determine an action when an event occurs and a condition is satisfied (ECA rules), and more (there are at least half a dozen types of rules, which are techniques to do something).
Event processing is really a set of functions that does what the name indicates: process events. Processing can be filtering, transforming, enriching, routing, detecting patterns, deriving, and some more.
Of course there are inter-relationships among them. Some types of business rules are a non-exclusive way to execute some of the functions of event processing, e.g. ECA rules can be used for routing, and inference rules can be used for event derivation. However, there are other techniques that can be used for each of these functions. Likewise, a BAM application may use event processing functions as part of its implementation, but may also use other techniques (e.g. being data-driven rather than event-driven and doing all its processing in retrospect, looking periodically at committed data and not at events as data in motion). Business rules can be put to various uses in BAM and BPM applications, with or without the use of event processing.
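To make the function/technique distinction concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the dict-based event layout, the `EcaRule` class, the order events and the desk names are hypothetical and are not taken from any product. It shows the same function (routing) carried out once by an ECA rule and once by a plain lookup table.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Event = Dict[str, object]  # assumed layout: a plain dict with a "type" key


@dataclass
class EcaRule:
    """Event-Condition-Action: on event_type, if condition holds, run action."""
    event_type: str
    condition: Callable[[Event], bool]
    action: Callable[[Event], Event]

    def fire(self, event: Event) -> Optional[Event]:
        # The rule reacts only to its own event type, and only when the
        # condition is satisfied; otherwise nothing happens.
        if event["type"] == self.event_type and self.condition(event):
            return self.action(event)
        return None


# Technique 1: routing expressed as an ECA rule (hypothetical business case).
route_large_orders = EcaRule(
    event_type="order_placed",
    condition=lambda e: e["amount"] > 1000,
    action=lambda e: {**e, "route_to": "manual_review"},
)

# Technique 2: the same function (routing) without any rule, via a table.
ROUTES = {"order_placed": "order_desk", "order_cancelled": "refund_desk"}


def route_by_table(event: Event) -> str:
    return ROUTES.get(event["type"], "default")
```

Both pieces perform routing; only the technique differs, which is the reason the terms should not be equated.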
The functions of event processing are typically used with several motivations:
Observation into business processes, business activity monitoring.
Dynamic reaction and modification to transactions and business processes.
Diagnosis of problems and finding the root cause.
Prediction of future problematic events that need to be eliminated or mitigated.
Dissemination of data in motion, so that it arrives at the right person at the right time in the right granularity.
These motivations are not applications: there are multiple applications that perform diagnosis (system management, car repairs, medical diagnosis, oil drilling management), and there are also multiple applications that perform more than one of these, e.g. observation and reaction, or prediction and reaction. I have blogged before about the metaphor of blind people touching an elephant: some people look at a single line of applications, or a single implementation technique, and say CEP is X, while another person who is thinking of another concept says CEP is Y, and thus confusion is created. The following picture of the elephant illustrates it (taken from the original postings).
Bottom line: the confusion is a result of equating terms from various classes, and resolving this confusion is indeed very simple.
Back to event processing, reflecting on the "event hierarchy" discussion. Some people assume that the event processing network is a DAG (Directed Acyclic Graph), a structure whose handling is well known. Unfortunately, the world may sometimes be cyclic, and the event processing network that represents this world is then also cyclic.
Let's look at the following example: I went to the interior ministry for re [...] This is, of course, an event processing system, and a relatively simple one. Trying to sketch the event processing network, we'll have events like: customer arrived, clerk arrived, clerk displayed number, customer sits in front of clerk, customer gets up and leaves the clerk, clerk displays number.
Looking at these events, we can realise that they create the following cycle: clerk arrived; clerk displayed N; customer sits in front of clerk; customer leaves clerk; clerk displayed L; etc. This is a cycle: "clerk displayed L" is an indirect descendant of "clerk displayed N", and likewise we need to continue circling. Since the event processing network deals with event types, we can say that "clerk displayed number" repeats itself. This may even be for the same clerk and the same number: the same clerk, since the clerk's id is fixed, and the same number, since numbers can be recycled (e.g. after getting to 20, return to 1). The support of cycles, of course, spoils the notion of hierarchy.
How do we know that a cycle will not be infinite? If the context is bounded, then there is a natural stopper for the cycle (e.g. the end of the reception hours). If the context is not bounded, then an infinite loop is possible, and there should be other means to detect and handle it if it occurs (e.g. allowing only a limited number of cycles).
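The two points above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the edges of the network are my reading of the clerk example, and the function names, the minute-based clock and the `max_cycles` guard are hypothetical. The first function checks whether a network over event types is cyclic; the second walks the display/serve cycle and stops either at the end of the bounded context or when the cycle limit is hit.

```python
from typing import Dict, List

# Assumed network over event *types*: an edge A -> B means that an
# occurrence of A can lead to an occurrence of B.
EPN: Dict[str, List[str]] = {
    "clerk arrived": ["clerk displayed number"],
    "customer arrived": ["customer sits in front of clerk"],
    "clerk displayed number": ["customer sits in front of clerk"],
    "customer sits in front of clerk": ["customer leaves clerk"],
    "customer leaves clerk": ["clerk displayed number"],
}


def has_cycle(graph: Dict[str, List[str]]) -> bool:
    """Depth-first search; meeting a node that is still in progress means a cycle."""
    IN_PROGRESS, DONE = 1, 2
    state: Dict[str, int] = {}

    def visit(node: str) -> bool:
        state[node] = IN_PROGRESS
        for nxt in graph.get(node, []):
            if state.get(nxt) == IN_PROGRESS:
                return True
            if nxt not in state and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(node not in state and visit(node) for node in graph)


def count_display_events(opening: int, closing: int,
                         minutes_per_customer: int,
                         max_cycles: int = 1000) -> int:
    """Count trips around the cycle between opening and closing time."""
    now, cycles = opening, 0
    while now < closing:              # natural stopper: the bounded context
        if cycles >= max_cycles:      # guard for a context that never ends
            raise RuntimeError("cycle limit reached")
        cycles += 1
        now += minutes_per_customer
    return cycles
```

With reception hours of 60 minutes and 15 minutes per customer, the loop ends by itself after four rounds. If time never advances (zero minutes per customer), only the cycle limit stops it.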
Should we forbid cycles? We are back to the dilemma of Russell's type theory. If we forbid them, our language will be restricted and will not be able to express scenarios such as the one in the example above. If we allow them, we spoil the strict hierarchy, but as mentioned in the previous postings, a strict hierarchy of events may not be realistic.
I am following with interest Marco's Rulecore blogging about geo-spatial operators. I think this is in the right direction; there is a lot of potential in spatial and even spatio-temporal patterns, and I also view space as one of the main dimensions of context. However, today I would like to write about one of the other topics that Marco has been writing about: event hierarchies. In this posting Marco suggests hierarchy levels of events and rules, such that rules in level 1 consume only events of level 1 and produce events of level 2, and so forth.
This reminds me of a great scholar named Bertrand Russell, whose picture you can see below. Russell introduced a well-known paradox, which in its popular form is called the "Barber Paradox" (Wikipedia claims that [...]), and which states that in a village there is a barber who shaves everybody who does not shave himself, and nobody else. The question is: who shaves the barber? If the barber shaves himself, then he is one of those who shave themselves, so the barber does not shave him; if the barber does not shave himself, then he is one of those whom the barber shaves, so he does shave himself. In other words, the barber shaves himself if and only if he does not shave himself.
Russell also provided a solution for the paradox, called "type theory", saying that we have a hierarchy of terms and functions/predicates based on "types". Thus, if a "person" is of type 1, Barber is a predicate of a person, which makes it a term of type 2; being a term of type 2, it cannot participate in predicates on type 1, and thus the entire paradox does not happen. Type theory indeed resolves the paradox (if you wonder, the "types" in programming languages are indeed derived from Russell's terms); however, the price is that the expressive power of the language is compromised: things which seem to us perfectly valid in natural language are not valid under type theory.
Back to event processing -- let's look at the following example:
Our location-based event processing system manages a fleet of cabs and travel reservations.
An order arrives for a travel in 30 minutes. This is an event of type 1; it is processed and matched against taxi locations and directions to assign the right cab, producing a "cab assignment", which is an event of type 2 (since it was created by an event processing agent, a "rule" in Marco's terminology, that processed events of type 1).
After a short while the prospective traveller changed his mind and called to cancel the order. This is, again, an event of type 1.
However, now we have to find out whether the travel has already been allocated and, if so, deallocate the travel and free the allocated taxi. This operation has to take two events as input: the travel cancellation (type 1) and the travel allocation (type 2). Oops... our system works purely on a strict hierarchy of event types, so it can't do that!
This demonstrates that by using a strict hierarchy we'll have to give up expressive power, as with Russell's type theory. So it seems that while hierarchy is a good idea, it may also serve as a restriction. One modification is indeed to order the event types and the agents, but to enable an agent of type 2 to accept as input events of both type 1 and type 2.
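The difference between the strict hierarchy and the modified one can be sketched as two validity checks. This is a hedged illustration only: the `EventType` and `Agent` classes, the agent names and the two check functions are hypothetical and are not part of Marco's proposal or of any product. The levels follow the cab example above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EventType:
    name: str
    level: int


@dataclass(frozen=True)
class Agent:
    name: str
    level: int
    inputs: Tuple[EventType, ...]


def valid_strict(agent: Agent) -> bool:
    """Strict hierarchy: an agent of level n consumes only events of level n."""
    return all(e.level == agent.level for e in agent.inputs)


def valid_relaxed(agent: Agent) -> bool:
    """Modified hierarchy: an agent of level n consumes events of level n or lower."""
    return all(e.level <= agent.level for e in agent.inputs)


# Event types from the cab example.
ORDER = EventType("travel order", 1)
CANCELLATION = EventType("travel cancellation", 1)
ASSIGNMENT = EventType("cab assignment", 2)

# Hypothetical agents: the second one needs inputs from two levels.
assign_cab = Agent("assign cab", 1, (ORDER,))
release_cab = Agent("release cab", 2, (CANCELLATION, ASSIGNMENT))
```

Under the strict check `release_cab` is rejected, because it mixes a type 1 cancellation with a type 2 assignment; under the relaxed check it is accepted, while the ordering of levels is still kept.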
With this correction there are certain benefits to using the hierarchy levels, related to optimization possibilities that Marco has not mentioned. We'll discuss them in one of the next postings.
Typically, I refrain from reacting in this blog to any marketing material presented by vendors, a restriction I have taken upon myself as the chair of EPTS. I am not deviating from this rule, but since my friends at Coral8 have posted their article entitled "Comprehensive Guide to Evaluating Event Stream Processing Engines" on David Luckham's site, as a vendor-neutral service to the community, I am taking the liberty of adding some footnotes to this paper.
On the positive side, I think that this type of work is useful, that discussions about it are also useful, and that many of the criteria presented are valid. We at IBM have in the past devised evaluation criteria for internal purposes that included many of the mentioned criteria; I have to check whether we can expose them.
On the critical side, here are several comments:
1. The first claim is that the authors view "event stream processing" and "complex event processing" as one and the same, saying that customers do not make a distinction between the terms and that there is no agreed-upon terminology. I refer the authors to the EPTS glossary as a reference for terminology. Regardless of that, I agree that customers typically don't care which TLA is used; the substance is more important.
2. The statement that this document covers ESP and CEP, which are one and the same, created the feeling that the document is general. However, reading further, I find among the criteria that define what an ESP engine is the following condition: "...process large volumes of incoming messages or events". This criterion confuses me: is that a fundamental property of an ESP/CEP engine? Over the past year I have heard some analysts' talks saying that most of the potential EP applications are actually not the "high volume" ones. Furthermore, the customers I know have various degrees of event volumes, some of them high, some low. So maybe this is not part of the definition of what an engine is, but an evaluation criterion for a certain class of applications.
3. Reading further, I see terms like continuous queries and windows, terms that already assume a certain type of implementation (indeed, query-based stream processing). This fits the title of "event stream processing", assuming there is agreement that this is what ESP is; however, it does not represent the entire spectrum. Continuous queries are a technique intended to achieve some functionality that can also be achieved by other means.
Personally, I believe that "one size fits all" does not work, and that different event processing applications have different functional and non-functional requirements. There are applications in which various performance aspects are more or less important; note that there are also no standard benchmarks yet. I hope that the work of the EPTS work group on use cases, which is planned to result in a classification of event processing applications, will yield a finite, manageable number of application classes, so that the evaluation criteria can be partitioned by type.
And, if possible, hands-on experience indeed makes the evaluation more accurate and removes the noise of preconceptions and false assumptions. More on evaluation later.
Back to the rain in the window. The notion of "window", which came from stream processing, is used to process a sub-stream that is bounded by time (or by number of occurrences). In some cases a window can be specified by a starting time and a duration, or can slide at certain time intervals. In other cases, however, we need to process events in a time interval "while it is raining", either to find certain patterns that are only relevant when it rains, or to run the classic stream processing application: aggregation within a sub-stream. In any case, the interval is not determined by a fixed time, and its duration is not known in advance. It can be either "while something is in state S" or a time interval that starts with the occurrence of event E1 and ends with the occurrence of event E2. An interval may also expire if the state lasts too long.
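An interval that starts with E1, ends with E2 and may expire can be sketched as follows. This is a minimal illustration under stated assumptions, not a definition of context: the `(timestamp, type, value)` tuple layout, the function name and the event names in the usage note are hypothetical. Expiry is checked lazily, only when the next event arrives, since the sketch has no timers, and an interval that is still open at the end of the stream is not reported.

```python
from typing import Iterable, List, Optional, Tuple

Event = Tuple[float, str, float]  # assumed layout: (timestamp, type, value)


def aggregate_while_open(events: Iterable[Event],
                         start_type: str,
                         end_type: str,
                         measure_type: str,
                         max_duration: Optional[float] = None) -> List[float]:
    """Sum measure_type values inside each interval opened by start_type
    and closed by end_type (or by expiry after max_duration)."""
    totals: List[float] = []
    opened_at: Optional[float] = None
    running = 0.0
    for ts, etype, value in events:
        # Expiry: the state lasted too long, so the interval is closed.
        if (opened_at is not None and max_duration is not None
                and ts - opened_at > max_duration):
            totals.append(running)
            opened_at = None
        if etype == start_type and opened_at is None:
            opened_at, running = ts, 0.0      # E1 opens the interval
        elif etype == end_type and opened_at is not None:
            totals.append(running)            # E2 closes the interval
            opened_at = None
        elif etype == measure_type and opened_at is not None:
            running += value                  # aggregate only while open
    return totals
```

For example, with "rain_started" as E1, "rain_stopped" as E2 and "sale" as the measured event, sales that happen before the rain starts or after it stops are ignored, and each rainy interval contributes one total.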
I'll re-visit the notion of context and its formal definition soon.