Thursday, August 21, 2008

On Web 2.0 and Event Processing

One of the questions that I heard recently is: how does "Web 2.0" relate to "event processing", or vice versa? Well, today everything has to fit with everything else. So here are some wild ideas:

  • Mashup - a kind of fusion of content from different sources; this content is updated as a result of processing events.
  • Social networks - a lot can be learned from tracing events in social networks. Examples: a person who has accumulated at least 5 recommendations recently is probably looking for a job; a person who changes his/her affiliation more than twice a year may also be of interest; one of the social networks sends me birthday alerts...
  • Blogs - besides the fact that there are Blogs about event processing, there can be events related to Blogs: patterns over the content of Blogs, over relations among Blogs, and over the readers of a Blog.

There are, of course, more; these are just a sample. Event processing can also be useful in the investigation of collaboration systems. For example, the informal structure of an organization can be discovered by analyzing collaboration systems. This topic deserves more attention.

Talking about event processing Blogs -- I would like to welcome Colin Clark to the event processing blogosphere.

Wednesday, August 20, 2008

On Event Processing Network and Situations - the semantic bridge

One of the challenges in building a semantic meta-language that captures event processing behavior is to bridge the gap between terms that belong to different domains. I have written before about situations, following the CITT meeting in which this term was discussed. In fact, we used the term "situation" in AMiT, and we also called its core component the "situation manager". We used the term situation to denote a meta-data entity that defines a combination of pattern and derivation, but I must admit that this was a mismatch of terms, although there is a strong correlation among these terms.

First, let's go back to the operational world. In this world events flow within event processing networks (EPN). In the EPN there are agents of various types: simple (filtering), mediating (transformation, enrichment, aggregation, splitting), complex (pattern detection) and intelligent (uncertainty handling, decision heuristics...). At the roots of the network there are producers, and at the leaves of the network there are consumers. This is an operational view of what is done. The term "situation" is, in fact, not an operational term but a semantic term in the consumers' terminology, and can be defined as circumstances that require reaction.
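
The operational view above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular product implements an EPN: the `Agent` class, the `run` function and the temperature events are all hypothetical names and data invented for the sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional

Event = dict  # assumption: an event is represented as a plain dictionary


@dataclass
class Agent:
    """A node in the network: maps one input event to zero or one output events."""
    name: str
    process: Callable[[Event], Optional[Event]]
    downstream: List["Agent"] = field(default_factory=list)

    def emit(self, event: Event) -> None:
        result = self.process(event)
        if result is None:  # nothing to pass on (filtered out, or a leaf)
            return
        for node in self.downstream:
            node.emit(result)


def run(producer: Iterable[Event], root: Agent) -> None:
    """Push every event from a producer into the root agent of the network."""
    for event in producer:
        root.emit(event)


# Leaf of the network: a consumer that simply records what reaches it.
received: List[Event] = []
consumer = Agent("consumer", lambda e: received.append(e))

# Mediating agent: enrichment adds a derived attribute to the event.
enrich = Agent(
    "enrich",
    lambda e: {**e, "fahrenheit": e["celsius"] * 9 / 5 + 32},
    [consumer],
)

# Simple agent: stateless filtering.
hot_only = Agent("filter", lambda e: e if e["celsius"] > 30 else None, [enrich])

# The list plays the role of the producer at the root of the network.
run([{"celsius": 20}, {"celsius": 35}], hot_only)
print(received)
```

Only the second event passes the filter, is enriched, and reaches the consumer. Complex and intelligent agents would need state, which this sketch deliberately leaves out.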

How can we move from the operational world to the semantic world? We have two cases here:

  • Deterministic case: there is an exact mapping between concepts in the operational world and the situation concept;
  • Approximate case: there is only an approximate mapping.

In order to understand it, let's take two examples:

Example 1 - toll violation (deterministic, simple)

  • The circumstance that requires reaction is that somebody crosses a highway toll booth without paying (in Israel the only toll highway is completely automatic, but we'll assume that it occurs elsewhere, and that the toll is not applicable between 9PM and 6AM and on weekends).
  • Getting to the operational domain, there are two cases. One: go through the EZ pass lane and don't pay. Two: go through a manual lane and somehow succeed in crossing the barrier (obstacle?) without paying.
  • The action in both cases: use a camera to capture the license plate, and SMS the picture to the officer on duty on the other side of the bridge.

From the EPN perspective, we have events of a car crossing a certain section of the road (the raw event); the EZ pass reading is an attribute of this event, and if there is no EZ pass it gets a "null" value. There is context information, which is temporal (the hour and day are ones in which the toll is in effect) and spatial (the location of the EZ pass lane). Note: sometimes people mix the notion of context with the notion of situation; I have explained the difference in the past. Within this context, a filter agent that looks for a null value in the EZ pass reading is applied. If the filter agent evaluates to true, then the situation as phrased has been detected in a deterministic way, and indeed the edge going out of the filter agent goes directly to a consumer (the camera snapshot may or may not be considered part of the EPN). This is a case of "simple event processing": stateless filtering, whose output is a situation. This gives a counterexample to the misconception that situations are closely tied to complex event processing. [I could continue the example with the other case, but I think that you got the point by now.]
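
For readers who prefer code, here is a rough Python sketch of the EZ pass case. The field names (`lane`, `timestamp`, `ez_pass_reading`) and the function names are made up for the illustration, and the weekend is assumed to be Saturday and Sunday; none of this comes from a real toll system.

```python
from datetime import datetime
from typing import Optional


def toll_in_effect(ts: datetime) -> bool:
    """Temporal context: no toll between 9PM and 6AM, nor on weekends."""
    is_weekend = ts.weekday() >= 5           # assumption: Saturday and Sunday
    is_night = ts.hour >= 21 or ts.hour < 6
    return not (is_weekend or is_night)


def detect_toll_violation(crossing: dict) -> Optional[dict]:
    """Stateless filter agent: returns a situation for an unpaid crossing, else None."""
    if crossing["lane"] != "ez_pass":        # spatial context: EZ pass lane only
        return None
    if not toll_in_effect(crossing["timestamp"]):
        return None
    if crossing["ez_pass_reading"] is not None:
        return None                          # a reading exists, so the toll was paid
    return {"situation": "toll_violation", "timestamp": crossing["timestamp"]}


# A crossing on a Wednesday morning with no EZ pass reading.
crossing = {
    "lane": "ez_pass",
    "timestamp": datetime(2008, 8, 20, 10, 30),
    "ez_pass_reading": None,
}
print(detect_toll_violation(crossing))
```

The function keeps no state between events: each crossing is judged on its own, which is what makes this "simple" event processing even though its output is a situation.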

Example 2 - Angry Customer (approximate, complex)

The setting is a call center; the situation is: detect an angry customer and refer him or her to the "angry customers officer".

Here life is not that easy. A human agent can detect angry customers by the tone of their voice (or electronic message), but this does not cover all cases of angry customers. So we can look for a pattern saying that a customer who applied for the 3rd time in a single day is an angry customer, and then we need a "pattern detection" agent that detects the pattern "3rd instance of application", where the context partition refers to the same date, same customer, same product. In this case, too, a leaf edge is mapped to a situation, but there are two differences from the previous case:

1. The agent is now a complex event processing agent, since it detects a pattern over multiple events;

2. The edge represents the situation in an approximate way, which means that it can have false positives (the CEP pattern is satisfied but the customer is not really angry, just asked for a lot of information to install the product), or false negatives (the customer called twice, and does not talk in an aggressive tone, yet he is furious).
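
The pattern detection agent described above might look roughly like this in Python. The class name, the event fields and the returned dictionary are hypothetical, chosen only to show the partitioned, stateful nature of the agent.

```python
from collections import defaultdict
from datetime import datetime
from typing import Optional


class ThirdApplicationDetector:
    """Stateful pattern detection agent with a partitioned context."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.counts = defaultdict(int)  # one counter per context partition

    def on_event(self, application: dict) -> Optional[dict]:
        # Context partition: same date, same customer, same product.
        key = (
            application["timestamp"].date(),
            application["customer"],
            application["product"],
        )
        self.counts[key] += 1
        if self.counts[key] == self.threshold:  # fire once, on the 3rd instance
            return {"situation": "angry_customer", "customer": application["customer"]}
        return None


detector = ThirdApplicationDetector()
calls = [
    {"timestamp": datetime(2008, 8, 20, 9, 0), "customer": "c1", "product": "p1"},
    {"timestamp": datetime(2008, 8, 20, 11, 0), "customer": "c1", "product": "p1"},
    {"timestamp": datetime(2008, 8, 20, 15, 0), "customer": "c1", "product": "p1"},
]
results = [detector.on_event(call) for call in calls]
print(results)
```

Unlike the toll filter, the agent has to remember earlier events. And the output is only an approximation of the situation: the sketch fires on the third call whether or not the customer is really angry, and stays silent for a furious customer who called twice.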

In some cases it also makes sense to associate a "certainty factor" with the leaf edge, approximating the measure of belief in the fact that this edge represents the situation. I'll leave the discussion about uncertain situations to another time.

Tuesday, August 19, 2008

On Event Driven BPEL

WS-BPEL has become a leading standard in the area of business process management, and since event processing is typically part of a bigger picture, it often interacts with business process management systems. One of the success factors is to make the environment ready for event processing, i.e. being able to get the right events, and being able to perform the right actions once the event processing network has detected a situation (in one of the next postings I'll discuss how a situation is related to the EPN world).

My student, Alex Kofman, has recently been working on an M.Sc. thesis that looks at this from the BPEL perspective. He has determined what needs to be done in order to extend BPEL to be "event driven", and proposed detailed modifications to the standard. This is intended to enable easy interaction between event processing systems and BPM systems, which play the roles of producer and consumer from the event processing viewpoint. From the BPM viewpoint, input events can serve in decisions about orchestration, while output events can support decisions external to the BPM that are based on the BPM state.

Here are some of the main points raised in this thesis:

1. Ability of BPM to subscribe/unsubscribe to events.
2. Ability of BPM to publish events.
3. Ability to invalidate one or more processes as a result of an event (partially exists).
4. Ability to start a new process as a result of an event (partially exists).
5. Ability to invoke a task upon occurrence of an event.
6. Ability to adapt process execution upon occurrence of an event.
7. Support for the definition of an event processing pattern requested by the BPM engine, as a "callback" to an event processing system.
In general, this will raise the level of abstraction and make the integration easier...

More about this project later. The thesis is now being written, and Alex gave a presentation about it last week in the department seminar, so this is a good time to start communicating these ideas.

Monday, August 18, 2008

On Event Processing Description Language

This is a logo of the UK geologists, who celebrated their 150th anniversary. I have a more modest accomplishment: the Blog dashboard claims that this is my 150th entry in this Blog, which started almost a year ago (on the one-year anniversary, I'll look for some statistics about it).
In one of the first postings to this Blog I talked about a meta-language for describing event processing behavior as a possible candidate for a first standard. Recently, I have been working further on this idea, and soon I'll be able to start talking more about the details (those who heard my tutorial at DEBS 2008 got a sneak preview). The language is a semantic language whose roots come from two directions: the "outside in" direction and the "inside out" direction.
The "outside in" direction is the result of a requirements survey that was done internally in IBM across nine industries, by interviewing IBM domain experts (such as industry CTOs) and, in some industries, also selected customers. The "inside out" direction is the result of looking at existing event processing languages (both from products and from academic projects) and trying to find the union (but also to learn something from the interaction). We are now in the process of doing the "inside out" part. While the goal is not to compare languages but to learn from them, it is still interesting to look at different languages and at the different assumptions that are reflected in their features (example: if the language assumes that the input is a time series, then counting events is equivalent to creating a fixed time interval, while in other cases, where events arrive in a sporadic/chaotic way, these concepts are totally orthogonal). One interesting question is the "effective" expressive power of languages, which is somewhat different from the theoretical expressive power: given a specific requirement, one can often satisfy it with some (or a lot of...) hacking, sometimes by adding code. So in comparing languages one should use a subjective criterion: how easy is it to express the requirement? Is it natural to express it in the language? Can a typical developer determine how to express it? Maybe this can be translated into quantitative measures: length of solution, development time, etc.
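
The remark about time series can be made concrete with a small Python sketch. The two window functions and the sample timestamps are invented for the illustration and are not taken from any of the languages being surveyed.

```python
from typing import List


def count_window(timestamps: List[float], n: int) -> List[float]:
    """The last n events, regardless of when they happened."""
    return timestamps[-n:]


def time_window(timestamps: List[float], now: float, length: float) -> List[float]:
    """All events in the half-open interval (now - length, now]."""
    return [t for t in timestamps if now - length < t <= now]


regular = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]    # time series: one event per time unit
sporadic = [0.5, 3.2, 3.4, 3.6, 5.5, 6.0]   # events arriving in bursts and gaps

# Regular input: "last 3 events" and "last 3 time units" select the same events.
same_for_regular = count_window(regular, 3) == time_window(regular, 6.0, 3.0)

# Sporadic input: the two windows select different events.
same_for_sporadic = count_window(sporadic, 3) == time_window(sporadic, 6.0, 3.0)

print(same_for_regular, same_for_sporadic)
```

With the regular input both windows contain the events at 4.0, 5.0 and 6.0; with the sporadic input the count window holds three events while the time window holds five. A language designed around time series can therefore get away with a single window concept, while a language for sporadic events needs both.
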
While "pattern detection" is certainly the most challenging part of the meta-language, it is by no means the only part; the language also includes sections on event definition, transformation, enrichment, etc. This work also raises various interesting topics to report in this Blog. More later.