
This is a blog describing some thoughts about issues related to event processing and thoughts related to my current role. It is written by Opher Etzion and reflects the author's own opinions.
Thursday, April 24, 2008
On the science and engineering of event processing

Monday, April 21, 2008
On Event Clouds
Marc Adler, in a couple of his blog postings, wondered about support for event clouds in the product he chose, and in the end settled on the opinion of the vendor (Mark Tsimelzon from Coral8), who claims that "cloud" is an abstract term and that in reality we face multiple streams that may or may not be ordered. The response comes from Greg-the-architect, who has recently been in "everybody are confused" mode. Greg-the-architect claims that vendors have sinned by misinforming their customers in order to hide their inability to cope with hidden causal relations. So - what can I contribute to that party?
First - let's look again at the definition of event cloud in the glossary:
Event cloud: a partially ordered set of events (poset), either bounded or unbounded, where the partial orderings are imposed by the causal, timing and other relationships between the events.
Clouds have become a fashionable term; we have heard so much about cloud computing over the past year that we all feel like we are flying in various clouds.
What about the clouds/streams debate? One of the stated differences is that a cloud is a poset (partially ordered set) while a stream is totally ordered. I agree that these terms come from two different origins; the real question is whether a cloud can be supported by multiple streams. People focus the discussion on whether streams are always totally ordered or can also carry unordered sets of events, but this is not really an interesting distinction. I agree here with Mark Tsimelzon that a stream can also be unordered; this is up to the implementation. If one wants to distinguish between "streams" that are ordered and other things that can be unordered, I propose the term "pipes" - where an ordered pipe is a stream. But ordered versus unordered is not the main difference. Reading the cloud definition again, it is the notion of causality that is important for having a cloud. The "partial ordering" in the cloud is a result of causality relations between events. I have discussed the notion of causality in a past posting; support for causality (including pre-determined causality that may be the result of mining or of an inference system) is the enabler for the support of clouds (i.e. partial order vs. no order).
The Cloud is indeed the collection of events that an enterprise faces, and this cloud may be implemented by a collection of pipes (or streams, if you wish) plus support for causality relations.
We can also look at a (small) cloud, which is the collection of all events that a single EPA (Event Processing Agent) faces as input - this is just a subset of the big "Cloud", with its own pipes and causality relations.
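To make the role of causality concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not how any product implements clouds: the class name EventCloud, its methods and the event names are all hypothetical, and only the causal part of the glossary definition is modeled (timing and other relationships are left out). Events are stored with explicit "caused by" links, and the partial order falls out of following those links.

```python
class EventCloud:
    """Hypothetical sketch: a set of events plus explicit causality links."""

    def __init__(self):
        # Maps each event id to the ids of its direct causes.
        self.causes = {}

    def add(self, event_id, caused_by=()):
        self.causes[event_id] = set(caused_by)

    def lineage(self, event_id):
        """Return every direct or indirect cause of the given event."""
        ancestors = set()
        stack = list(self.causes.get(event_id, ()))
        while stack:
            current = stack.pop()
            if current not in ancestors:
                ancestors.add(current)
                stack.extend(self.causes.get(current, ()))
        return ancestors

    def precedes(self, a, b):
        """True if a is a (possibly indirect) cause of b."""
        return a in self.lineage(b)

    def comparable(self, a, b):
        """True if the causal order relates the two events at all."""
        return a == b or self.precedes(a, b) or self.precedes(b, a)


cloud = EventCloud()
cloud.add("order_placed")
cloud.add("payment_received", caused_by=["order_placed"])
cloud.add("stock_reserved", caused_by=["order_placed"])
cloud.add("order_shipped", caused_by=["payment_received", "stock_reserved"])

print(cloud.precedes("order_placed", "order_shipped"))
print(cloud.comparable("payment_received", "stock_reserved"))
print(sorted(cloud.lineage("order_shipped")))
```

In this toy cloud the payment and the stock reservation are both caused by the order, but neither causes the other, so they are not comparable: that is exactly what makes the set partially rather than totally ordered. The lineage method is the "trace back" operation that some applications need.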
Now - to the most important question - beyond the terminology game, is it important to make these distinctions?
As stated before, the world of event processing is not monolithic: some applications need total order, others need partial order, and others don't care about the notion of order at all. Causality relations are required by some applications, either because pre-defined relations between events play a role in the event processing, or because there is a need to trace back the lineage of a certain event / action. For other applications they may be just unnecessary overhead. So my (2 cents worth of) advice to people who are looking at CEP products is to look at their requirements and determine whether they need causality and partially ordered sets. It may be that support for totally ordered streams is entirely sufficient for their applications; if it is not, they should look at whether and how causality is implemented. I hope that I have not confused you even more... More - later.
Sunday, April 20, 2008
On Event Pattern Semantics

Today is Passover. While I am far from being religious, there are several traditions we keep; one of them is to have a family dinner on Passover eve and to read (at least part of) the Haggadah. So I searched the internet for a fancy Haggadah in English, and here is the result.
The call for EPTS founding members is also progressing - by now more than 20 companies have either signed or indicated that they are in an internal approval process and intend to sign as EPTS members, in addition to about 20 individual members. We expect this number to grow towards the deadline, and call on anybody who has not joined and wishes to contribute to the emerging EP community to join.
Moving to today's topic: Tom Puzak has posted a message on the CEP interest group about nine features a CEP engine should have. This discussion is useful, since there is no agreed-upon "CEP manifesto" - a definition of the functions that should be supported by "CEP engines" - and we are going to need one, sooner or later.
Since I am working on a tutorial for the DEBS conference that will have event pattern semantics as a major theme, here is a sneak preview of the type of semantic decisions that are needed. These come in addition to the semantics of the specific pattern (conjunction, disjunction, absence, sequence...).
1. In which context is this particular pattern relevant? Context can be temporal (within working hours, 1 hour from the power break), spatial (within the headquarters building), semantic (only for platinum customers) or state-oriented (while it is raining) - or a combination of the various dimensions (I have written before about the notion of context).
2. Does an event participate in the same pattern in a single context or in multiple contexts? This can happen when there is overlap among contexts.
3. Should the action / notification about the detected pattern execute immediately or in a deferred mode (example: at the end of the temporal context)?
4. Within a context - is the pattern existential (i.e. we are looking for a single pattern instance per context), or can there be multiple instances?
5. Using quantifiers on synonyms - taking the example from Tom Puzak's message: we are looking for A, B within 60 seconds (temporal context), and the actual flowing events are: A1 A2 B1 A3 B2 B3. We may want the cartesian product, but typically this is not what we really wish for; thus, we can use quantifiers to select among the A and B events. Quantifiers can be according to order (first, last, each), according to the content of attributes, or both.
6. Can a single event participate in more than one pattern within the same context?
7. Should a newer synonym kill older synonyms?
These are just titles - in the DEBS tutorial I'll explain each with examples and show how they impact the pattern detection behavior.
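As a small taste of decision 5, here is a sketch in Python of order-based quantifiers applied to the A1 A2 B1 A3 B2 B3 example. It rests on my own simplifying assumptions: "synonyms" is read as several events of the same type, all six events are taken to fall inside one 60-second context, the pattern is a plain conjunction of A and B, and the function names select and match_conjunction are hypothetical rather than taken from any CEP language.

```python
from itertools import product


def select(candidates, quantifier):
    """Pick events of one type according to an order-based quantifier."""
    if quantifier == "first":
        return candidates[:1]
    if quantifier == "last":
        return candidates[-1:]
    if quantifier == "each":
        return list(candidates)
    raise ValueError(f"unknown quantifier: {quantifier}")


def match_conjunction(events, a_quantifier="each", b_quantifier="each"):
    """Return (A, B) pairs for events that share a single context."""
    a_events = [name for kind, name in events if kind == "A"]
    b_events = [name for kind, name in events if kind == "B"]
    return list(product(select(a_events, a_quantifier),
                        select(b_events, b_quantifier)))


# The arrival order from the example: A1 A2 B1 A3 B2 B3.
flow = [("A", "A1"), ("A", "A2"), ("B", "B1"),
        ("A", "A3"), ("B", "B2"), ("B", "B3")]

cartesian = match_conjunction(flow, "each", "each")
first_first = match_conjunction(flow, "first", "first")
last_each = match_conjunction(flow, "last", "each")

print(len(cartesian))
print(first_first)
print(last_each)
```

The same six events produce nine matches under "each, each", a single match under "first, first", and three matches under "last, each". Which of these is right depends entirely on the application, which is why the choice has to be expressible in the pattern language.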
Bottom line -- tuning the semantics of a pattern consists of several decisions; if these decisions are not supported in the language and the application does not conform to the default, the result is hacking around... more - later.
Tuesday, April 15, 2008
On Event Processing Agents

- "simple event processing" EPAs - filter and routing,
- "mediated event processing" EPAs - enrichment, transformation, validation
- "Complex event processing" EPAs - pattern detection
- "intelligent event processing" EPAs - prediction, decisions...
The common denominator: each of them receives events as input, emits events as output and does a single type of function.
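The common denominator can be sketched in a few lines of Python. This is only an illustration under my own assumptions: events are plain dictionaries, each EPA is a generator function, and the names filter_epa, enrich_epa and pattern_epa, as well as the order data, are hypothetical. Each function takes events in, emits events out, and does one type of job; chaining them gives a tiny network.

```python
def filter_epa(events, predicate):
    """Simple event processing: pass through only the events of interest."""
    for event in events:
        if predicate(event):
            yield event


def enrich_epa(events, reference_data):
    """Mediated event processing: add attributes looked up by customer."""
    for event in events:
        extra = reference_data.get(event["customer"], {})
        yield {**event, **extra}


def pattern_epa(events, count):
    """Complex event processing: emit a derived event on the count-th
    input event seen for the same customer."""
    seen = {}
    for event in events:
        customer = event["customer"]
        seen[customer] = seen.get(customer, 0) + 1
        if seen[customer] == count:
            yield {"type": "repeat_order", "customer": customer}


orders = [
    {"type": "order", "customer": "c1", "amount": 150},
    {"type": "order", "customer": "c2", "amount": 20},
    {"type": "order", "customer": "c1", "amount": 300},
]
tiers = {"c1": {"tier": "platinum"}}

large_orders = filter_epa(orders, lambda e: e["amount"] >= 100)
enriched = enrich_epa(large_orders, tiers)
alerts = list(pattern_epa(enriched, count=2))

print(alerts)
```

Because every stage has the same shape (events in, events out), the wiring between stages is independent of what each stage does internally.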
I find this type of abstraction both a very easy way to explain to people how EP systems work and a basis for architecture. The EPN routing can be done by standard middleware, or in a stand-alone mode. Other terminology issues raised by David Luckham are the relationships to the "actor model" and to "engines".
The actor model is a model that helps reasoning about concurrency, while agents in AI are autonomous goal-driven artifacts. These are orthogonal terms, of course. In the context of EPA - when looking at EPAs as an executable network, we can look at each EPA as an actor and apply actor models.
Last but not least -- the relationship of EPAs to engines -- an EPA is a software artifact: it can be an instance of an engine, it can be some software that contains an engine, or it can be a hard-coded program, as long as it complies with the EPA definition. In a future world, with interoperability (and perhaps also language) standards, we'll be able to run (and maybe to self-select) multiple engines for the same EPN, residing in different EPAs.
More about EPA types -- later.
Monday, April 14, 2008
On the spectrum of event processing applications

Saturday, April 12, 2008
On Semantic Event Processing

Thursday, April 10, 2008
On Impact 2008
