Wednesday, August 20, 2008

On Event Processing Network and Situations - the semantic bridge

One of the challenges in building a semantic meta-language that captures event processing behavior is bridging the gap between terms that belong to different domains. I have written before about situations following the CITT meeting, in which this term was discussed. In fact, we used the term "situation" in AMiT, and we also called its core component the "situation manager". We used the term to denote a meta-data entity that defines a combination of pattern and derivation, but I must admit that this was a mismatch of terms, although there is a strong correlation among them.

First, let's go back to the operational world. In this world events flow within event processing networks (EPNs). In the EPN there are agents of various types: simple (filtering), mediating (transformation, enrichment, aggregation, splitting), complex (pattern detection) and intelligent (uncertainty handling, decision heuristics...). At the roots of the network there are producers, and at the leaves there are consumers. This is an operational view of what is done. The term "situation" is in fact not an operational term but a semantic one: it belongs to the consumers' terminology and can be defined as circumstances that require reaction.

How can we move from the operational world to the semantic world? We have two cases here:

  • Deterministic case: there is an exact mapping between concepts in the operational world and the situation concept;
  • Approximate case: there is only an approximate mapping.

In order to understand it, let's take two examples:

Example 1 - toll violation (deterministic, simple)

  • The circumstance that requires reaction is that somebody crosses a highway toll booth without paying (in Israel the only toll highway is completely automatic, but we'll assume that it occurs elsewhere, and that the toll does not apply between 9PM and 6AM or on weekends).
  • Getting it to the operational domain, there are two cases. One: go through the EZ pass lane and don't pay. Two: go through a manual lane and somehow manage to cross the barrier without paying.
  • The action in both cases: use a camera to capture the license plate, and SMS the picture to the officer on duty on the other side of the bridge.

From the EPN perspective, the raw event is a car crossing a certain section of the road. The EZ pass reading is an attribute of this event, and if there is no EZ pass it gets a "null" value. There is context information, which is temporal (an hour and day in which the toll is in effect) and spatial (the location of the EZ pass lane). Note: sometimes people mix the notion of context with the notion of situation; I have explained the difference in the past. Within this context, a filter agent that looks for a null value in the EZ pass reading is applied. If the filter agent evaluates to true, then the situation as phrased has been detected in a deterministic way, and indeed the edge going out of the filter agent goes directly to a consumer (the camera snapshot may or may not be considered part of the EPN). This is a case of "simple event processing": stateless filtering whose output is a situation. This gives a counterexample to the misconception that situations are closely tied to complex event processing. [I could continue the example to the other case, but I think you have got the point by now.]
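To make the deterministic case concrete, here is a minimal Python sketch of the context check followed by the stateless filter agent. All the names (CarCrossing, in_toll_context, toll_violation_filter, the "ez_pass" lane label) are hypothetical, and the sketch assumes a Saturday/Sunday weekend; it is an illustration of the idea, not the way any particular EPN product expresses it.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional


@dataclass
class CarCrossing:
    """Raw event: a car crossing a certain section of the road."""
    lane: str                       # hypothetical lane label, e.g. "ez_pass"
    timestamp: datetime
    ez_pass_reading: Optional[str]  # None plays the role of the "null" value


def in_toll_context(event: CarCrossing) -> bool:
    """Context: spatial (EZ pass lane) and temporal (toll in effect)."""
    is_weekday = event.timestamp.weekday() < 5  # assumption: Sat/Sun weekend
    in_hours = time(6, 0) <= event.timestamp.time() < time(21, 0)
    return event.lane == "ez_pass" and is_weekday and in_hours


def toll_violation_filter(event: CarCrossing) -> bool:
    """Stateless filter agent: True means the situation has been detected."""
    return in_toll_context(event) and event.ez_pass_reading is None
```

The filter keeps no state between events, which is what makes this "simple event processing" even though its output edge carries a situation.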

Example 2 - Angry Customer (approximate, complex)

The setting is a call center. The situation is an angry customer, and the reaction is to refer him or her to the "angry customers officer".

Here life is not that easy. A human agent can detect angry customers by the tone of their voice (or electronic message), but this does not cover all cases of angry customers. So we can look for a pattern saying that a customer who applies for the third time in a single day is an angry customer. We then need a "pattern detection" agent that detects the pattern "third instance of application", where the context partition refers to the same date, same customer and same product. In this case, too, a leaf edge is mapped to a situation, but there are two differences from the previous case:

1. The agent is now a complex event processing agent, since it detects a pattern over multiple events;

2. The edge represents the situation in an approximate way, which means that it can have false positives (the CEP pattern is satisfied but the customer is not really angry, just asked for a lot of information to install the product), or false negatives (the customer called twice, and does not talk in an aggressive tone, yet he is furious).

In some cases it also makes sense to associate a "certainty factor" with the leaf edge, approximating the measure of belief that this edge represents the situation. I'll leave the discussion of uncertain situations for another time.
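The approximate case can be sketched in the same spirit. The following Python fragment is only an illustration under my own assumptions: the class and field names are hypothetical, and the certainty value of 0.7 is an arbitrary placeholder, not a measured figure. It counts applications per context partition (same date, same customer, same product) and emits the derived situation when the third one arrives.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Application:
    """Raw event: a customer applies to the call center."""
    customer: str
    product: str
    day: date


class ThirdApplicationDetector:
    """Stateful pattern detection agent, partitioned by (day, customer, product)."""

    def __init__(self, threshold: int = 3, certainty: float = 0.7):
        self.threshold = threshold
        self.certainty = certainty       # placeholder value, chosen arbitrarily
        self._counts = defaultdict(int)  # one counter per context partition

    def on_event(self, event: Application) -> Optional[dict]:
        key = (event.day, event.customer, event.product)
        self._counts[key] += 1
        # Emit once, exactly when the threshold is reached in this partition.
        if self._counts[key] == self.threshold:
            return {
                "situation": "angry_customer",
                "customer": event.customer,
                "certainty": self.certainty,
            }
        return None
```

Unlike the toll filter, this agent must keep state across events, and its output only approximates the situation: a match may be a false positive, and a furious customer who calls twice is never matched.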


Anonymous said...

Hi Opher,

I would like to comment on the difference between situation and context: you say that situations are more like transitions while context is more like state... While I can understand the difference, I would also like to point out the similarities: a state is in fact a set of sequences of transitions (see What is State?), therefore a state is a (composite) situation, therefore context is a (composite) situation; or, in other words, transitions are simple (non-composite) situations while states (and therefore context) are composite situations.

Treating context (i.e. state) as a (composite) situation leads to a beautiful and elegant economy of concepts in the event processing metamodel...



Opher Etzion said...

Hi Claudi.

I agree that in general states can be expressed as a composite set of transitions. However, not every event is a situation: an event that has only internal use within the EPN is not a situation, and the transitions that relate to the start, end and content of a context may not be situations either. Thus, while the term "context" can be reduced to some composition of events (with some semantics; a composite event by itself is not enough to express context), this is not true for situation. This stems from the semantic meaning of situation, which is a transition in the user domain and not in the computer domain.



Hans said...

PatternStorm - I posted a response to your What is State article based on reading your comment here.

I do detect some interesting semantics surrounding context.

For example, going back to the red alert scenario. Maybe we would like to detect the situation of unauthorized people in a sensitive location during a red alert. So we have the "person entering" events from which the situation must be distilled. To do the distilling, we have the "authorized" condition, the "red alert" condition and the location condition. Which of these is context?

Opher Etzion said...

Hi Hans.

Context can contain several dimensions and can also be a union or intersection of contexts. To take your example, the context for that situation consists of the dimensions:
state = red alert;
spatial = location...

Thus the relevant "person entering" events will be chosen only if they are classified to this context. The "authorized" condition, assuming that it requires enrichment and filtering, is part of the processing. In general, context helps us look at an arriving event and route it to the appropriate processing. Note also that multiple situations can be detected within a single context.
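A rough Python sketch of this split between context and processing, under assumptions of my own: the names PersonEntering, RedAlertContext and detect_unauthorized are hypothetical, and the set of authorized people stands in for whatever database or directory the enrichment would really consult.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass
class PersonEntering:
    """Raw event: a person enters some location."""
    person: str
    location: str


class RedAlertContext:
    """Context as the intersection of a state dimension and a spatial dimension."""

    def __init__(self, sensitive_locations: Iterable[str]):
        self.sensitive_locations: Set[str] = set(sensitive_locations)
        self.red_alert = False  # state dimension, toggled by alert events

    def includes(self, event: PersonEntering) -> bool:
        # The event is classified to this context only if both dimensions hold.
        return self.red_alert and event.location in self.sensitive_locations


def detect_unauthorized(event: PersonEntering,
                        context: RedAlertContext,
                        authorized: Set[str]) -> bool:
    """Processing inside the context: enrichment lookup followed by a filter."""
    if not context.includes(event):
        return False  # event is not relevant to this context at all
    return event.person not in authorized
```

Other situations could reuse the same RedAlertContext object with different processing functions, which is the sense in which several situations can be detected within a single context.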



Hans said...

I'll be interested to see how context is represented in the meta-language.

Off the top of my head, I could see both the authorization and the red alert being enrichment of the event. The authorization maybe requires enrichment from some database/directory of people, while the red alert is an enrichment from some state that keeps track of the alert condition (which might also be a database).

As a programmer, I'm used to choosing among many ways to solve the same problem. But going back to your point about the practical usability of the language - some users may not be so happy to choose from several seemingly equal ways to implement the same logic. This would be especially true if, down the line, their choice of implementation affects the flexibility of the rules (their ability to extend or modify the rules). Then it turns out that the different implementations were not equal after all and the choice really required experience with designing logic. This forces the user to be a "software architect" and that seems to run directly counter to your goals.

Also, from the user's perspective, it seems like context is semantically overloaded.

On one hand, there is context in the sense of some ongoing circumstances that have meaning (the red alert).

On the other hand, there is context in the sense of those conditions (ongoing or otherwise) that cause an event to become a situation.

To define the scenario, the user must first define the red alert context. This is a persistent state that begins with an event and ends with an event.

Now the user navigates to the scenario builder and is asked "under what context does an event become this situation?" Here they are not being asked to define a persistent state, but about the logic that should be evaluated to produce the situation.

I could see the situation becoming worse when the situation requires some pattern detection. Now the current state of the pattern detection algorithm is sort of like a context, although it is quite different from the red alert context.

The semantic overloading seems to be easier than the multiple-paths-to-the-same-goal problem. In the case of semantic overloading, one would simply need to sort out the different kinds of context and give them names.

Opher Etzion said...

Hi Hans.

The meta-language is still a work in progress, so I'll have to defer the answer on that one to a later phase. However, in the AMiT language we have a notion of temporal context called "lifespan", which indeed describes a state that starts with an event and ends with an event or a time offset, or expires. While this can conceptually be done inside the pattern conditions, it provides a semantic separation between:

* "pre-conditions" under which a certain pattern is evaluated

* "conditions" for the evaluation of the pattern.

From experience, developers relate to this difference with no problem.
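To illustrate the separation, here is a small Python sketch of a lifespan-like temporal context. It is not AMiT syntax or its API; the class name, the method names and the use of plain numbers for time are all assumptions made for the example.

```python
from typing import Optional


class Lifespan:
    """Temporal context: opened by an initiator event, closed by a
    terminator event or by expiry after an optional maximum duration."""

    def __init__(self, initiator: str, terminator: str,
                 max_duration: Optional[float] = None):
        self.initiator = initiator
        self.terminator = terminator
        self.max_duration = max_duration
        self._opened_at: Optional[float] = None

    def is_open(self, now: float) -> bool:
        if self._opened_at is None:
            return False
        if (self.max_duration is not None
                and now - self._opened_at >= self.max_duration):
            self._opened_at = None  # the lifespan has expired
            return False
        return True

    def observe(self, event_type: str, now: float) -> None:
        """Update the lifespan from an incoming event type."""
        if self.is_open(now):
            if event_type == self.terminator:
                self._opened_at = None
        elif event_type == self.initiator:
            self._opened_at = now
```

The "pre-condition" is then just a check of is_open before any pattern condition is evaluated, so the pattern logic itself never needs to mention the start and end events.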

The question of language minimality (i.e. each logical function can be done in a single way) has been discussed a lot; it has pros and cons. In general, my belief is that this is too restrictive a goal, since if "all else fails" you fall back on components of the language that can subsume others (e.g. use of predicates), but there should also be an explicit methodology / guidelines on how not to abuse it.