This is a blog describing some thoughts about issues related to event processing and thoughts related to my current role. It is written by Opher Etzion and reflects the author's own opinions.
Saturday, December 22, 2007
On the envelope for CEP
Thursday, December 20, 2007
On - "one size fits all" and Event Processing
- I don't believe that ESP vs. CEP is the right type of partition in this area;
- There may be a need to have various implementations under one roof (the heterogeneous framework approach).
For the first point: what is the right type of partition? This is a multi-dimensional question, and we still have to learn more to know the most useful combinations.
One of the important dimensions is the "reason for use" dimension; an internal IBM study arrived at five different reasons for use, which I'll write about in one of the next postings.
EPTS has recently launched a workgroup that tries to identify these classifications by doing a comprehensive survey of use cases that will be compared using the same template. A team that consists of Tao Lin (SAP), Dieter Gawlick (Oracle) and Pedro Bizzaro (University of Coimbra, Portugal) is working on this template, and a larger team will handle the survey and analysis. The end result, a collaborative white paper about the state of the practice in event processing, is expected sometime in the second quarter of 2008. Stay tuned.
More - Later.
Wednesday, December 19, 2007
On deleted event, revised event and converse event
Today's topic is inspired by a blog post I recently read by Marco Seiriö. Marco is one of the pioneers of EP blogging; I started reading his blog in January 2006, when he started it as "Blog on ESP". At some point, however, his blog became "Marco writes about complex event processing", further evidence that the name ESP has disappeared. Anyway, in his blog Marco talks about the event model. I'll not discuss the event model today, but will concentrate on one interesting point that Marco raises about "undoing events". This is indeed a pragmatic issue with some semantic difficulties. There are systems in which events can be deleted, and some actions can be triggered by the event deletion. However, an event is not regular data and cannot be treated as such: since an event represents something that happened in reality, events are conceptually "append only". In database terms, one can only insert events, not modify or delete them. Deleting events also removes the ability to trace decisions and actions, or to process the events retrospectively. So when, in reality, do we need to delete/undo/revise events?
- when an event is entered by mistake (typically the mistake is not in the event itself but in some details of its attributes), we need the possibility to revise the event.
- when we wish an event to no longer affect the processing.
- when the event itself has expired, or we no longer need it and will not use it in any other processing, including retrospective processing.
The first case is a revision case. If we are in an "append only" mode, then the way to do it is to enter another event and allow it to override an existing event (or set of events) for the purpose of processing. Example: somebody sent a bid to an electronic auction and realized that one of the details (say, the price he is ready to pay) is wrong; he can then add another bid that overrides the first bid. Why not delete the original bid? The original bid may already be in process, and the overriding cannot stop this process; even if not, retrospective processing may require us to reconstruct a past state that includes the original bid. (These considerations are actually not new: we thoroughly discussed these issues within the temporal database community a decade ago, when Sushil Jajodia, Sury Sripada and I edited a book about temporal databases research and practice.)
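To make the revision case a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the `Bid` class, its `overrides` field and the `effective_bids` function are hypothetical names and do not come from any particular product. The log is never modified; a correcting bid is appended and simply points at the bid it overrides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Bid:
    """A bid event; all names here are hypothetical, for illustration only."""
    event_id: str
    bidder: str
    price: float
    overrides: Optional[str] = None  # id of the earlier bid this one revises


def effective_bids(log: List[Bid]) -> List[Bid]:
    """Return the bids that count for processing, without touching the log."""
    overridden = {bid.overrides for bid in log if bid.overrides is not None}
    return [bid for bid in log if bid.event_id not in overridden]


# The log is append-only: the correction is a new event, not an update.
log: List[Bid] = []
log.append(Bid("b1", "alice", 100.0))
log.append(Bid("b2", "bob", 90.0))
log.append(Bid("b3", "alice", 80.0, overrides="b1"))  # alice corrects her price

current = effective_bids(log)
```

Here `current` holds only `b2` and `b3`, while `b1` is still in the log, so a retrospective process can reconstruct the state as it was before the correction.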
The second case is even more interesting, but similar in the type of thinking. Here we would like to stop an event from taking effect; this can be done by sending a "converse event" that reverses the effect of the event, e.g. cancel bid. The implementation problem is that this event, and maybe its descendant events, may have been flowing all over the event processing network (EPN): some have already left the EPN and triggered actions, some are in process, and some are part of a state but have not been processed yet (e.g. because a pattern has not been detected yet). Theoretically it is possible to apply something similar to a "truth maintenance system" in AI that also covers the actions and compensates for all of them, but this complicates the system, so it is recommended only when it is critical to do so (I'll discuss such cases in another posting). When the event has not left the EPN, it is still possible to stop it, but most systems do not provide a language primitive to do this globally in an EPN; recently I watched a concrete customer case where they had to do it manually.
The third case is the "vacuuming" case: an event is no longer needed (in agents' state, in the global state, etc.). I never got deep into this issue, and thought intuitively that it is a relatively easy problem; however, when the issue was discussed at the Dagstuhl seminar last year, the claim was that the general issue of event vacuuming is still an open question.
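The easy part of the converse event idea can be sketched in a few lines of Python. This is a simplified illustration with hypothetical names (`BidPlaced`, `BidCancelled`, `highest_active_bid`), and it assumes the state is derived by replaying an append-only log. It deliberately does not address the hard part described above, namely compensating for actions that have already left the EPN.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Union


@dataclass(frozen=True)
class BidPlaced:
    bid_id: str
    price: float


@dataclass(frozen=True)
class BidCancelled:
    """The converse event: it reverses the effect of an earlier BidPlaced."""
    bid_id: str


Event = Union[BidPlaced, BidCancelled]


def highest_active_bid(log: List[Event]) -> Optional[Tuple[str, float]]:
    """Replay the log and return (bid_id, price) of the best bid still in effect."""
    active: Dict[str, float] = {}
    for event in log:
        if isinstance(event, BidPlaced):
            active[event.bid_id] = event.price
        elif isinstance(event, BidCancelled):
            # The original event stays in the log; only its effect is removed.
            active.pop(event.bid_id, None)
    if not active:
        return None
    return max(active.items(), key=lambda item: item[1])


log: List[Event] = [
    BidPlaced("b1", 100.0),
    BidPlaced("b2", 120.0),
    BidCancelled("b2"),
]
winner = highest_active_bid(log)
```

After the cancellation, `winner` is the bid `b1` at 100.0, even though `b2` was higher and is still present in the log for tracing purposes.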
I'll stop here now -- spent enough time on this one... more - later
Monday, December 17, 2007
CEP and the story of the captured traveller
Reading the recent posting of my friend Tim Bass entitled "CEP and the story of the Fish", I decided to answer with another story (from the other side of Asia):
A traveller went into the jungle somewhere on the globe and unfortunately was captured by a tribe that was still using ancient weapons. He was brought to the chief, and the chief said: "You have trespassed into the tribe's territory, which is punishable by death. However, I am a very curious person; if you show me something I haven't seen before, I'll let you go." Our unlucky traveller started to look in his pockets, and the only meaningful thing he found was a lighter, so he took his chance and showed it to the chief, saying: "This thing makes fire." However, since he was under great pressure, he pressed once - no fire, pressed twice - no fire; on the third try the lighter indeed produced the promised fire. The chief did not hesitate and said "let him go", so our relieved traveller muttered to himself, "I knew that they had not seen a lighter." But to his surprise the chief said: "Oh, I have seen many lighters, but a Zippo lighter that does not light the first time I have never seen."
When I disagree with somebody else, it is very easy to assume that my point of view is right since I am smarter / know more / am more qualified / older / more experienced / generally always right, etc. My preference is not to doubt the wisdom, experience or qualification of anybody that I am arguing / discussing / debating with, but to make the arguments about the issue and not about the person who makes the arguments.
Enough introduction; now for the main message of this posting. The term CEP (Complex Event Processing) is by now more or less agreed in the industry to denote "computing that performs operations on complex events", where a complex event is an "abstraction or aggregation of events". The term "complex" does not say that the processing is complex, but that it deals with complex events, as defined. Complex event processing typically means detecting predefined patterns that can be expressed by queries/rules/patterns/scripts and are deterministic in nature. Regardless of whether I think that this is the best term, I think that it is important to have commonly agreed terminology; otherwise we confuse the industry, the customers (and sometimes ourselves). Now, Tim Bass claims that since event processing of a stochastic/probabilistic/uncertain nature is more complex than what we call "complex event processing", we have to call that one "complex event processing", and rename what we currently call "complex event processing" to "simple event processing". Unfortunately, it is too late for that, and it is also not justified, again because the "complex" in "complex event processing" does not say that this is "complex processing of events" but that this is "processing of complex events" (a very common misconception!). Bottom line: yes, there is another class of event processing capabilities that requires techniques from AI, machine learning, OR, etc. and that is not deterministic in nature; no, I don't think we should call it "complex event processing". We have suggested the term "intelligent event processing", which I have already referred to in a previous posting; there are a variety of other postings that I have dedicated to terminology.
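For readers who prefer code to definitions, here is a small Python sketch of "processing of complex events" in the deterministic sense used above. It is an illustration only: the `Event` and `ComplexEvent` classes, the `detect_bursts` function and the login example are my own hypothetical choices, and the sketch assumes the input is already ordered by timestamp. The complex event is nothing more than an aggregation of the events that matched a predefined pattern.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Event:
    kind: str
    timestamp: float  # seconds; input is assumed to be ordered by timestamp


@dataclass(frozen=True)
class ComplexEvent:
    """An aggregation of the simple events that matched a pattern."""
    kind: str
    members: Tuple[Event, ...]


def detect_bursts(events: List[Event], kind: str, count: int,
                  window: float) -> List[ComplexEvent]:
    """Derive one complex event whenever `count` events of `kind`
    fall within `window` seconds of each other."""
    recent: List[Event] = []
    derived: List[ComplexEvent] = []
    for event in events:
        if event.kind != kind:
            continue
        recent.append(event)
        # Keep only the events still inside the time window.
        recent = [e for e in recent if event.timestamp - e.timestamp <= window]
        if len(recent) >= count:
            derived.append(ComplexEvent(kind + "_burst", tuple(recent)))
            recent = []  # start over so that bursts do not overlap
    return derived


stream = [
    Event("login_failed", 1.0),
    Event("login_ok", 2.0),
    Event("login_failed", 3.0),
    Event("login_failed", 4.0),
    Event("login_failed", 30.0),
]
bursts = detect_bursts(stream, "login_failed", count=3, window=10.0)
```

The same input always yields the same single `login_failed_burst` complex event, built from the failures at 1.0, 3.0 and 4.0; nothing here is probabilistic, and the processing itself is not complicated.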
More - later
Sunday, December 16, 2007
On Event Stream Processing
This is in part a response to my friend and colleague Claudi for his recent post in the CEP Interest Group
- Mark Palmer, General Manager, Progress Apama, mpalmer@PROGRESS.COM
Another indication of the blurring between ESP and CEP is that the vendor descendants of the academic projects, Streambase and Coral8, now position themselves as "complex event processing" vendors. Both have "complex event processing" all over their homepages: Streambase labels its product as "complex event processing platforms" (well, we'll discuss platforms in another posting); Coral8 has a portal which offers self-service CEP. Aleri, which also provides an SQL-oriented API, also uses the term CEP, although it also uses the term "Aleri streaming platform" as the way to do CEP. Thus, while the term "stream processing" is very much alive in the academic database community (see the VLDB 2007 program, for example), it seems that the market has already voted on the unification of these two terms behind the CEP term.