Sunday, November 9, 2008

More on Web 2.0 and Event Processing


This is a model of the Haifa City hall - as seen in the Mini-Israel park, the best place to see all the interesting sights in Israel if you have only 1 hour to spare... While the elections in the USA were the big story last week, this week we have a smaller story -- electing a mayor. There are nine people who want to be mayor of Haifa, and today at lunch we discussed what we know about some of them... no conclusions yet; we will decide at the last minute, as always.

The Internet now plays a role in elections, including Web 2.0 - blogs, social networks, etc., which have become part of life. I have already written about the impact of blogs; this week I was asked to look at an IBM response to an RFP that the relevant people identified as having been copied from some blog entry (not mine), so it seems that what is written in blogs is starting to impact decision making. Social networks are also very visible. In 2006 Mark Palmer sent me an invitation to connect on LinkedIn; today I have 517 connections (when one passes 500, it shows as 500+, a kind of honor, I guess). Since then I have lost track of the number of social networks I have received invitations to join (and joined). The most famous is of course Facebook, which I joined to see the pictures my daughter is posting; it serves as a major means of communication among members of her generation. But there are at least half a dozen more social networks that keep sending me requests and messages from time to time.

Today I discovered "slideshare" - a slide sharing network - and found out that Tim Bass has created a slideshare group on "complex event processing". I think this is an excellent idea, and I also added some slides there to see if there will be any interest. Slideshare can also be linked to LinkedIn, so LinkedIn members can view it in my LinkedIn profile.

BTW - for those who missed the EPTS 4th event processing symposium - or wish to view it again - the professional part of the meeting has been recorded and can now be found on the Web on the CITT site. Thanks to Thomas Ertlmaier, who worked hard on both the photographing and the editing, and to Rainer von Ammon for initiating and sponsoring this effort.


Friday, November 7, 2008

On "event at a time" vs. "set at a time" processing







Today I broke a lifelong record for the length of a single session with the dentist --- more than 3 hours (I had to go out in the middle and restart the parking card, which allows 2 hours at a time). This is the time of year for setting up plans for next year, and there are plenty of plans --- each asking for 1/3 of my time in some activity; only when the number of "1/3 of your time" commitments becomes bigger than 3 does the phrase "1/3 of your time" become tricky. Anyway -- today I would like to write about "event-at-a-time" and "set-at-a-time" processing, as it seems that people don't understand this issue well. We'll take it by examples:

  • "event-at-a-time" is a processing type in which, when an event is detected, that event is evaluated against the relevant patterns to determine whether its addition (together with possibly earlier events) satisfies a pattern.
  • "set-at-a-time" is a processing type in which patterns are evaluated against a set of events only after the entire set has been detected.

Examples:

  • Look at the pattern And(E1, E2): when the relevant instance of E1 arrives, this becomes a potential pattern; whenever the relevant instance of E2 arrives, the pattern is satisfied. A concrete example: E1 = the buyer received the merchandise, E2 = the seller received the money for the same transaction. The order does not matter - the transaction is closed if both of them occurred. This is "event-at-a-time" processing, since it looks at each event individually when it arrives and determines the state of that pattern instance.
  • Look at the pattern Increasing(E1.A): when the set of all relevant instances of E1 has arrived, the evaluation is done on the entire set. A concrete example: E1 = temperature measurement, A = value. When all the values (e.g., in a specific time window) have arrived, it is possible to determine whether the values are indeed always increasing in time.
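The two examples above can be sketched in a few lines of code. This is a minimal illustration only, not taken from any product; the names `AndPattern` and `increasing_over_window` are made up for this sketch.

```python
class AndPattern:
    """Event-at-a-time: evaluate And(E1, E2) as each event arrives."""
    def __init__(self):
        self.seen = set()  # partial-match state kept between events

    def on_event(self, event_type):
        # Each arriving event updates the state immediately; the pattern
        # is satisfied the moment both event types have occurred.
        self.seen.add(event_type)
        return {"E1", "E2"} <= self.seen

def increasing_over_window(values):
    """Set-at-a-time: evaluate Increasing(E1.A) only after the whole
    window of values has been collected."""
    return all(a < b for a, b in zip(values, values[1:]))

p = AndPattern()
assert p.on_event("E1") is False   # only E1 so far: a potential match
assert p.on_event("E2") is True    # E2 arrives: And(E1, E2) is satisfied

assert increasing_over_window([20.5, 21.0, 21.7]) is True
assert increasing_over_window([20.5, 21.7, 21.0]) is False
```

Note how the order of E1 and E2 does not matter for the conjunction, while the Increasing evaluation waits for the entire window before deciding anything.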

It is interesting to point out that patterns that are geared towards "event-at-a-time" can be implemented as "set-at-a-time"; in this case, patterns are looked for periodically instead of immediately. Note that "set-at-a-time" does not necessarily mean time series processing: the size of the set may be unknown in advance, and the time difference between the different elements in the set may not be fixed.

It is also possible to implement "set-at-a-time" patterns using "event-at-a-time" - in an incremental fashion.
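As a sketch of this incremental direction (assumed names, not from any product): the set-oriented Increasing pattern can be maintained one event at a time by keeping only the last value and a validity flag, instead of storing the whole window.

```python
class IncrementalIncreasing:
    """Evaluate Increasing(E1.A) incrementally: keep only the last value,
    and invalidate the pattern as soon as a non-increasing value arrives."""
    def __init__(self):
        self.last = None
        self.holds = True

    def on_value(self, value):
        if self.last is not None and value <= self.last:
            self.holds = False   # once violated, the window cannot recover
        self.last = value
        return self.holds

inc = IncrementalIncreasing()
assert inc.on_value(20.5) is True
assert inc.on_value(21.0) is True
assert inc.on_value(20.8) is False  # violated; stays violated for this window
```

The state kept per pattern instance is constant-size here, which is exactly what makes the incremental form attractive when windows are large.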

While some applications are handled more easily and more efficiently by one of these two styles, some applications will need both, so hybrid processing will be required for those.

More on this - later.

Tuesday, November 4, 2008

On some EP related conferences


As I have said all I had to say about the EDA vs. CEP issue, and do not wish to get into the "what is CEP" discussion again, I'll write today about some conferences in the event processing area. At the top of this page I've put the logo of DEBS 2009, which has become the flagship research conference of the event processing community. The conference, which has become an ACM conference, will include, like last year, an industrial track that will contain experience reports, architectures, interesting applications, and anything else of interest from the industrial perspective. We'll also try to solicit intermediate reports from the EPTS work-groups. Follow the call for papers on the conference website.

Another conference has announced an extension of its submission deadline: the AAAI symposium on intelligent event processing - submissions extended to November 13th.

Happy conferencing.

Saturday, November 1, 2008

On the head and the tail of EDA


Somehow the discussion around "EDA" and "CEP" continues in the blogosphere - Giles Nelson from Apama has published seven points, of which I quite agree with the first five. Jack van Hoof, who started this whole thread of discussion, argues that "CEP is not the beginning but the finishing of EDA". So what is the answer? Head or tail? The right answer is --- it depends. Some customers work in a methodical architectural way, in which the EDA architecture is set up first, and then CEP tools are invoked on top of this architecture -- this is the case that Jack is talking about. However, in some cases customers don't apply architectural thinking, but just acquire some application that is implemented with a CEP tool, and thus it introduces a kind of ad-hoc EDA for a specific application -- this is the case that Giles is talking about.

My guess is that most early adopters have applied CEP in an ad-hoc way, so it serves as a "head" for more EDA in the future, while recently we see more customers looking at EDA first, since they are approaching it from an enterprise architecture perspective.

If we take the CEP functionality -- finding patterns in multiple events -- it may not be implemented on top of EDA at all, since the same technique may fit a "non-event" environment.

More - Later.

Wednesday, October 29, 2008

On EDA, CEP and disruptive technology

This is the second time in the last few months that the term "disruptive technology" has been used, this time by Mark Palmer, who adds his voice to the EDA vs. CEP discussion. Mark acknowledges the fact that EDA and CEP are not synonyms, but asserts that CEP is the disruptive part of EDA.
Recall that the "disruptive" word was used recently in the Oracle blog and I discussed it at that time; Richard Veryard has answered my response, claiming that "disruptive" is often not better, at least not in the short run.

Anyway, I have noticed in recent weeks two IBM customers who made plans to move to EDA (neither will use CEP, at least not in the short run). The shift to EDA is a fundamental change in architectural thinking, with the introduction of decoupled event-driven thinking. Is it new? Not really. Is it new for these customers? Yes - not only new, but a significant change in thinking. It is an indication that there is beef in EDA, even without introducing CEP, that the enterprise should digest. While thinking in events is natural in daily life, it is still not natural for enterprise architecture and programming paradigms.

Bottom line -- while CEP certainly has its merits (if the customer is mature enough to digest it), EDA seems to be a more fundamental change in thinking relative to the alternative, and has its own beef. More - Later.

Friday, October 24, 2008

On EDA and maturity model



This is the first posting I am writing on my new laptop, a Lenovo T61p. New laptops are a trauma, since with a new laptop one gets new versions of all the software, which do not always behave in a familiar way. I still have not succeeded in convincing the new laptop to work with my home wireless network (it succeeds in connecting, and then disconnects after a few seconds), so I have arranged a workplace near the router and work wired, catching up on reading blogs.
Jack van Hoof claims in his blog that the market does not understand EDA, since people who talk about EDA are really talking about CEP; in a previous posting he makes a distinction between EDA, which is about business events, and CEP, which processes messages.
It is true that people often mix the terms EDA and CEP (both TLAs), which are not quite the same; however, the "message" vs. "event" distinction needs some discussion, as do the relations between all of these, and why a customer of these technologies should care anyway.
First of all - I agree that EDA is an architectural pattern in which various services/components/agents/artifacts communicate in a decoupled way by sending events to some mediator, which routes them based on subscription or another type of routing decision, such that the source and sink may not know each other; CEP, in contrast, is the functionality of detecting patterns among multiple events and deriving new (derived/synthetic) events as a result.
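The decoupling that defines EDA can be shown in a tiny sketch. This is an illustration only (the `Mediator` class and event-type names are made up); real mediators add durability, filtering, and richer routing.

```python
class Mediator:
    """Routes events from producers to subscribers by event type;
    producers and consumers never reference each other directly."""
    def __init__(self):
        self.subscriptions = {}   # event type -> list of callbacks

    def subscribe(self, event_type, callback):
        self.subscriptions.setdefault(event_type, []).append(callback)

    def publish(self, event_type, payload):
        # The producer only knows the mediator; delivery is driven
        # entirely by the subscription table.
        for callback in self.subscriptions.get(event_type, []):
            callback(payload)

received = []
bus = Mediator()
bus.subscribe("OrderShipped", received.append)
bus.publish("OrderShipped", {"order_id": 42})   # source does not know the sink
bus.publish("OrderCancelled", {"order_id": 7})  # no subscriber: dropped
assert received == [{"order_id": 42}]
```

CEP would sit as one more subscriber on such a bus, consuming events and publishing derived events back; nothing in the routing itself is pattern detection.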
On the question of "messages" vs. "events" -- CEP is called CEP since its inputs and outputs are events, according to the definition of an event as "something that happened" (with all its variations - see the EPTS glossary). A message is, in this case, one way to represent an event (not the only one - CEP can also process events represented as objects that are not messages).
The claim that messages in general may not represent events is valid; I can send my picture as a message, and of course some smart guy can find an event in it, but it is not intended to represent an event. However, if we apply pattern detection to messages that do not represent events, then it is not CEP, since CEP assumes, by definition, that its inputs are events. This can be pattern detection on messages, and one can invent some name for it, if it makes sense.
Interestingly, people have also realized that the same types of patterns may be useful for data and not only for events, which makes it detection of patterns on data and not CEP; indeed, we have started to see some work on patterns in the database community, although these patterns are still quite complex.
EDA by itself has value for architectural thinking. Recently I came across a plan made by a big IBM customer that views its future architecture as a combination of SOA and EDA, in the sense that services will communicate among themselves in an event-driven way, but they have not yet discovered CEP at all. In fact, observing various customers that are in different phases of thinking about the "event" issue leads me to think of a kind of maturity model in this area:
  • Phase I -- simple event processing: determine which events are important, how they can be obtained, who should receive them, how they should be transmitted, etc. In some cases this is a major effort - teaching the customer to "think in events" and building the means to emit and consume events - and it can by itself be a big investment, since it can affect running systems and require development of instrumentation/adapters/periodic pull mechanisms. The type of "event processing" provided by basic EDA is typically routing (including subscriptions) and filtering. This phase should have a business benefit in its own right to justify the investment.
  • Phase II - mediated event processing: to bridge semantic and syntactic gaps between the event producer and consumer, additional mediators provide ESB-type functionality: transformation (including split and aggregation), enrichment (from external sources), and validation. This is often needed, and since many of these features are fairly common, it may make sense to use COTS for it, although in simple cases hard-coding may be cost-effective enough. This phase is optional, and it may make sense to unify it with phase III.
  • Phase III - complex event processing: this is a step-function towards looking at "event clouds" instead of events one-by-one, typically built on top of phase I, possibly I + II. I have seen cases in which phase III was implemented as the first step and included phases I and II -- this makes phase III much longer, but makes sense if the customer has a business problem that needs phase III but really did not think in events before.
  • Phase IV (optional) -- this extends the functionality towards intelligent event processing, which I have already discussed in the past.
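To make phases I and II concrete, here is a toy sketch of filtering (phase I) and transformation/enrichment (phase II). All names, the threshold, and the customer-lookup table are invented for illustration; a real phase II would typically use an ESB or similar COTS mediation.

```python
CUSTOMERS = {"c1": "ACME Corp"}  # stand-in for an external enrichment source

def filter_event(event):
    # Phase I style routing/filtering: only pass orders above a threshold.
    return event["amount"] >= 100

def mediate(event):
    # Phase II: transform field names and enrich from the external source,
    # bridging the producer's schema and the consumer's schema.
    return {
        "customer_name": CUSTOMERS.get(event["customer_id"], "unknown"),
        "total": event["amount"],
    }

raw = {"customer_id": "c1", "amount": 250}
if filter_event(raw):
    enriched = mediate(raw)
assert enriched == {"customer_name": "ACME Corp", "total": 250}
```

Phase III would then consume such enriched events and look for patterns across many of them, rather than handling each one in isolation.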

More about maturity models and event processing - later.

Monday, October 20, 2008

More on the semantics of synonyms


Still the lazy days of the holiday week; I took advantage of the time to help my mother, who decided that she wants to leave home and move to a seniors' residence located a 10-minute walk from my house. This requires dealing with many details, so that is what I have been doing for the last three days.... In the picture above (taken from the residence site) you can see the front entrance and the view from the residence tower; on the right-hand side of the upper picture one can see part of the neighborhood we live in (Ramat Begin), surrounded by pine trees all over.
Now, holiday eve again, and this is a good time to visit the blog again. Last time I started the discussion of the semantics of synonyms by posing a simple example of conjunction over a bounded time interval (the same pattern that Hans Glide referred to in his blog), slightly different from the "temporal sequence" pattern.
In the previous posting I have posed the following example:

Detect a pattern that consists of conjunction of two events (order is not important) - e1, e2.
e1 has two attributes = {N, A}; e2 also has two attributes = {N, B}; the pattern matching is partitioned according to the value of N (on context partitions I'll write another time).

For each detection, create a derived event E3 which includes two attributes = {N, C}; E3's values are derived as: E3.N := E1.N; E3.C := E1.A * E2.B.

Let's also assume that the relevant temporal context is time-stamps = [1, 5], and that the events of types E1 and E2 that arrived during this period are displayed in the table below:

[Table of E1 and E2 instances - appeared here as an image]
The question is: how many instances of event E3 are going to be created, and what will be the values of their attributes?

Looking at this example: for N = 2, there is exactly one pair that matches the pattern - the E1 that occurred at timestamp 5 and the E2 that occurred at timestamp 4 - so E3 will have the attributes {N = 2, C = 24}. However, for N = 1, things are more complicated. If we take the set-oriented approach that treats it as a "join" (Cartesian product), then since we have 3 instances of E1 and two instances of E2, we'll get 6 instances of E3 with all combinations. In some cases we may be interested in all combinations, but typically in event processing we are looking for a match and not for a join -- that is the difference between "event-at-a-time" patterns and the "set-at-a-time" patterns used by some of the stream processing semantics. So what is the correct answer? There is no single correct answer; what is needed is to fine-tune the semantics using policies. For those who are hard-coding event processing, or using imperative event processing languages, this entire issue seems a non-issue, since when they develop the code for a particular case they also build (implicitly) the semantics they require for that case. However, policies are required when using higher-level languages (descriptive, declarative, visual, etc.): policies bridge between the fact that semantics is built into the higher-level abstraction and the need to fine-tune that semantics in some cases. In our case we can have several types of policies:

Policies based on order of events - example:

  • For E1 - select the first instance; for E2 - select the last instance.
  • For E1 - select the last instance; for E2 - select the last instance.

Policies based on values - example:

  • For E1 - select the 2 instances with the highest values of A; for E2 - select the instance with the lowest value of B.

These are only examples -- it is also important to select a reasonable default that satisfies the "typical case", so that if the semantics fits this default, no further action is needed.
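Such policies can be expressed as small selection functions over the candidate instances of each event type. The sketch below uses made-up instances for partition N = 1 (the original table was an image and its values are not reproduced; timestamps and A/B values here are invented), and for simplicity each policy selects a single instance per event type.

```python
# Hypothetical instances for partition N = 1: three E1s and two E2s,
# matching the counts discussed above, with invented values.
e1_instances = [{"ts": 1, "A": 3}, {"ts": 2, "A": 5}, {"ts": 4, "A": 7}]
e2_instances = [{"ts": 2, "B": 10}, {"ts": 3, "B": 4}]

def derive_e3(n, e1, e2):
    # E3.N := E1.N ; E3.C := E1.A * E2.B
    return {"N": n, "C": e1["A"] * e2["B"]}

def match_with_policy(e1s, e2s, pick_e1, pick_e2):
    """Select exactly one (E1, E2) pair according to the given policies,
    instead of producing the full Cartesian product (the 'join' reading)."""
    return derive_e3(1, pick_e1(e1s), pick_e2(e2s))

# Order-based policies: first = earliest timestamp, last = latest timestamp.
first = lambda events: min(events, key=lambda e: e["ts"])
last = lambda events: max(events, key=lambda e: e["ts"])
assert match_with_policy(e1_instances, e2_instances, first, last) == {"N": 1, "C": 12}

# Value-based policies: highest A for E1, lowest B for E2.
assert match_with_policy(
    e1_instances, e2_instances,
    lambda es: max(es, key=lambda e: e["A"]),
    lambda es: min(es, key=lambda e: e["B"]),
) == {"N": 1, "C": 28}
```

The point is that the pattern definition stays the same; only the pluggable selection policy changes which single E3 (out of the six possible combinations) is derived.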

In one of the next postings I'll deal with identifying the full set of policies required to make the semantics precise.