Event Processing Thinking

Thursday, December 18, 2008

"complex event" and "derived event" - are they synonyms ?

Israel is a small country, and its commercial TV stations have only recently discovered "reality" programs. This week saw the final episode of a reality program called "Big Brother", in which a group of people are locked in a house (the one seen in the picture) for 3.5 months (those who survive until the end), doing nothing, without any connection to the outside world, and with cameras everywhere. A dedicated TV channel watched them 24 hours a day, and twice a week there was a prime-time TV show. This program drove the entire country crazy: it got unprecedented ratings and became the main topic of discussion. Today I went to the coffee room to get some coffee and saw around 10 people there in a heated discussion about this TV program. Interestingly, on the night of the final, a group of people from the culture and arts community held a big demonstration against the TV channels that spend their production money on reality shows and drop drama series -- I personally agree with them.
BTW -- event processing can be used to serve as a "big brother" and trace people's activities, but I'll blog about it another time.

I would like to answer a question from Hans Glide about my previous posting. The question was: in the illustration (below), "complex events" is not shown as a subset of "derived events", meaning that there are complex events that are not derived events. Is that true?

The answer is: indeed, "complex events" intersects "derived events", but there are derived events which are not complex events and complex events which are not derived events.

The first case is easy: an enriched event is a derived event, but it is not a complex event.
What about the other direction? Well, going back to what a "derived event" is: it is an event that is created by some "event processing agent" as a result of some event processing function. If an event is a raw event, it is not a derived event. However, there are "raw complex events" in the universe; not all complex events are derived by software artifacts. Since David Luckham is the originator of the term "complex event", I'll use two of his favorite examples:

Tsunami

Economic Crisis

(David referred to the one started in 1929, but our generation also won one of these)

The "economic crisis" is a complex event: it is certainly an event, and it is an aggregation of other events, but this aggregation is not created by software. The raw event is already complex; likewise the "tsunami".

More - later.

Wednesday, December 17, 2008

On Event Derivation

Today I participated in the local IBM Backgammon championship (in Hebrew we use the Turkish name "shesh-besh"). However, my participation did not last very long: I lost the first game, did not move on to the second round, and thus returned to my office... At least the one who beat me is the technician in charge of the video conference equipment, so if you want to conduct video conferences, make sure he likes you, since he is the only one in the lab who knows how to operate this simple equipment.

Today I would like to write a little bit about "event derivation". In one of my past postings I used this illustration to explain the relationships between terms. Why do we need all these terms? Well, we don't, but they are "legacy" terms, so it is good to show the relations among them.

When saying "derivation" -- let's compare it to two other types of derivations:

Inference -- where the inputs are facts and inference rules. For example: fact 1 -- Jack is the father of Mike; fact 2 -- Mike is the father of Alex; inference rule: X is grandfather of Y if X is father of Z and Z is father of Y. From this system we can infer that Jack is the grandfather of Alex. This is a deductive inference; there are other types of inference as well, and they can be obtained by logic programming or by inference rule engines.
Data derivation -- although the term is broader, if we take the relational database variation, it takes a collection of relations as input, applies a query, and gets a relation as output. Here there is no logical inference, but computation over data. Another variation of data derivation is a spreadsheet formula, typically performing some aggregation operator (sum, count, average...) on neighboring data in the table.

Event derivation is neither of the above, but has similar nature to both. The input and output are events, if we present it as functions then:

Inference: Facts X Rules ---> Facts
Data derivation: Relations X Query ---> Relation
Event derivation: Events X Event Derivations ---> Events
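
To make the three function shapes a bit more concrete, here is a minimal Python sketch. The function names (infer_grandfathers, derive_data, derive_events) and the tuple/dictionary encodings are assumptions made for this illustration only; they do not come from any particular rule engine, database, or event processing product.

```python
from collections import defaultdict


def infer_grandfathers(father_facts):
    """Inference: facts x rule -> facts.

    Rule: X is grandfather of Y if X is father of Z and Z is father of Y.
    Each fact is a (father, child) pair.
    """
    return {
        (x, y)
        for (x, z1) in father_facts
        for (z2, y) in father_facts
        if z1 == z2
    }


def derive_data(rows, key, value):
    """Data derivation: relation x query -> relation.

    The "query" here is fixed to SUM(value) GROUP BY key.
    """
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]
    return [{key: k, value: v} for k, v in sorted(totals.items())]


def derive_events(events, derivation):
    """Event derivation: events x derivation -> events.

    `derivation` maps one input event to zero or more output events.
    """
    return [out for event in events for out in derivation(event)]


facts = {("Jack", "Mike"), ("Mike", "Alex")}
grandfathers = infer_grandfathers(facts)
```

With the two facts from the text, the inferred set contains the single pair ("Jack", "Alex").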

Now let's look closely at event derivation. First: why does event derivation occur?

Event derivation may occur for various reasons. While in people's minds it is closely related to pattern detection, it may also be associated with other event processing functions, such as enrichment and transformation.

Some examples:

Example 1:

Input event: Order that consists of the attributes (not including the header) -- customer, item, quantity.
Functionality type: enrichment -- add attributes as: customer-type, customer-reliability-rank to the order.
Output event: the enriched event. In this case the derived event is identical to the event that has been created by the main function (enrichment) and the derivation is just copying.
Output event occurs immediately when function is done
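
A minimal sketch of this enrichment step in Python. The lookup table CUSTOMERS, the function name enrich_order, and the attribute values are hypothetical; in a real system the reference data would come from a customer database or similar source.

```python
# Hypothetical reference data used for enrichment.
CUSTOMERS = {
    "acme": {"customer_type": "gold", "customer_reliability_rank": 1},
}


def enrich_order(order, customers=CUSTOMERS):
    """Return a new (derived) event: the order plus customer attributes.

    The input event is left untouched; unknown customers are passed
    through without extra attributes.
    """
    extra = customers.get(order["customer"], {})
    return {**order, **extra}


order = {"customer": "acme", "item": "widget", "quantity": 10}
enriched = enrich_order(order)  # emitted immediately, no deferral
```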

Example 2:
Input event: Order (same as first example)
Functionality: Aggregation -- quantity by item.
Output event: Item-Demand: Item, Quantity. Multiple events - one for each item.
Output event occurs at the end of the context ("deferred").
comment: here a composite event is being accumulated, and at the end of the context is decomposed to individual events.
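
The deferred behavior can be sketched as follows. The class name ItemDemandAggregator and its two methods are assumptions for illustration: orders are accumulated silently while the context is open, and the Item-Demand events are produced only when the context ends.

```python
from collections import defaultdict


class ItemDemandAggregator:
    """Accumulates quantity by item; emits only at the end of the context."""

    def __init__(self):
        self._totals = defaultdict(int)

    def on_order(self, order):
        # Accumulate only; nothing is emitted while the context is open.
        self._totals[order["item"]] += order["quantity"]

    def on_context_end(self):
        # Decompose the accumulated state into one event per item.
        events = [
            {"type": "Item-Demand", "item": item, "quantity": quantity}
            for item, quantity in sorted(self._totals.items())
        ]
        self._totals.clear()
        return events


aggregator = ItemDemandAggregator()
for o in [
    {"item": "widget", "quantity": 8},
    {"item": "gadget", "quantity": 5},
    {"item": "widget", "quantity": 2},
]:
    aggregator.on_order(o)
item_demand = aggregator.on_context_end()
```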

Example 3:
Input events: Order (same as first), Item-Demand (Output of example 2).
Functionality: Detect Pattern, the pattern is: Quantity of a single order > 2/3 of the total quantity of the same Item.
Output event: Dominant-Order: Order-Id (from header), Item, Dominance Ratio = (Order.Quantity / Item-Demand.Quantity)
Output event occurs as soon as detected.
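
A small sketch of this pattern in Python. The function name detect_dominant_orders and the dictionary keys are hypothetical. For simplicity it checks a batch of orders against the Item-Demand events; a streaming agent would run the same test per event and emit as soon as the condition holds.

```python
def detect_dominant_orders(orders, item_demand):
    """Emit a Dominant-Order event for each order whose quantity is more
    than 2/3 of the total demand for the same item."""
    demand_by_item = {d["item"]: d["quantity"] for d in item_demand}
    dominant = []
    for order in orders:
        total = demand_by_item.get(order["item"])
        if not total:
            continue  # no (or zero) demand known for this item
        # Integer form of quantity / total > 2/3, avoiding float rounding.
        if 3 * order["quantity"] > 2 * total:
            dominant.append({
                "type": "Dominant-Order",
                "order_id": order["order_id"],
                "item": order["item"],
                "dominance_ratio": order["quantity"] / total,
            })
    return dominant


orders = [
    {"order_id": 1, "item": "widget", "quantity": 8},
    {"order_id": 2, "item": "widget", "quantity": 2},
    {"order_id": 3, "item": "gadget", "quantity": 5},
    {"order_id": 4, "item": "gadget", "quantity": 5},
]
item_demand = [
    {"item": "widget", "quantity": 10},
    {"item": "gadget", "quantity": 10},
]
dominant_orders = detect_dominant_orders(orders, item_demand)
```

Here only order 1 qualifies (8 out of 10 widgets); neither gadget order exceeds two thirds of the gadget demand.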

Example 4:
Input events: Temperature-Reading (every 1 minute).
Functionality: Aggregator/collector -- collects all readings of a certain hour and creates one event.
Output event: Every hour -- Composite event that consists of 60 raw events.
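
As an illustration, a collector of this kind might look like the sketch below. The function name collect_hourly, the event type name, and the representation of time as an integer minute counter are assumptions made for the example.

```python
from itertools import groupby


def collect_hourly(readings):
    """Group time-ordered readings into one composite event per hour.

    Each reading is a dict with 'minute' (minutes since the start of the
    stream) and 'temperature'. The raw events are kept as members of the
    composite event. Note: an incomplete final hour is also emitted.
    """
    composites = []
    for hour, group in groupby(readings, key=lambda r: r["minute"] // 60):
        composites.append({
            "type": "Hourly-Temperature",
            "hour": hour,
            "members": list(group),
        })
    return composites


readings = [{"minute": m, "temperature": 20.0} for m in range(120)]
composites = collect_hourly(readings)
```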

These are some examples. More formal definition of event derivation - later.

Sunday, December 14, 2008

EPDL expression for the "on off windows"

Yesterday I drove my 11-year-old (youngest) daughter, Daphna, with four of her friends (all boys), 65 kilometers to the nearest branch of Max Brenner (see pictures above), a chocolate store which also has a restaurant. Most of the menu is chocolate-oriented, but for parents there is also some real food. They have not yet come to Haifa, but I think that they have branches in New York. Anyway, we found the nearest branch and got there...

I have posted an explanation about the approach to the "on off window" that Pedro posted; somebody asked me, beyond the explanation, how this scenario can be expressed in the EPDL meta-language that we are working on. Actually this is a very simple scenario; it is expressed like this:

Let E be the event type, with two attributes param1 and param2 (as given). We also need to route the event e to the agent a, and the output event to a producer. Let's define two output pipes, p1 and p2; both of them carry events of type E.

We define:

Context, name = c, type = temporal, initiated-by = event e with param1 = 2, terminated-by = event e with param2 = 0.

Agent, name = a, type = filter, within-context = c, filter-condition = all, input = p1, output = p2

Explanation: The context determines which subset of the "event cloud" is applicable. There may be multiple agents within a context, but each agent has a single context. In this case we can use the simplest type of agent, a filter, with a trivial filter-condition: "all". Agent a will not be active before context c is active. With the event that satisfies the initiating condition, the context is opened; from then on agent a receives input, does nothing with it, and posts it as output.
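
The following Python sketch mimics the two definitions above. It is not EPDL and not any product's API: the class names TemporalContext and FilterAgent, their methods, and the dictionary encoding of events are assumptions made for illustration. The sample stream is the one from Pedro's query. One more assumption: the context may open again if a later event satisfies the initiating condition.

```python
class TemporalContext:
    """Context c: opened by an initiator event, closed by a terminator event."""

    def __init__(self, initiated_by, terminated_by):
        self.initiated_by = initiated_by
        self.terminated_by = terminated_by
        self.active = False

    def admits(self, event):
        """Return True if the event falls inside the context (boundaries included)."""
        if not self.active:
            if self.initiated_by(event):
                self.active = True
                return True
            return False
        if self.terminated_by(event):
            self.active = False  # the terminator itself still belongs to the window
        return True


class FilterAgent:
    """Agent a: passes on every event admitted by its context and its condition."""

    def __init__(self, within_context, filter_condition=lambda event: True):
        self.within_context = within_context
        self.filter_condition = filter_condition

    def run(self, input_pipe):
        output_pipe = []
        for event in input_pipe:
            if self.within_context.admits(event) and self.filter_condition(event):
                output_pipe.append(event)
        return output_pipe


rows = [(1, 1, 0), (2, 1, 1), (3, 2, 2), (4, 2, 3), (5, 3, 5),
        (6, 3, 4), (7, 2, 3), (8, 2, 0), (9, 3, 1)]
p1 = [{"time": t, "param1": a, "param2": b} for t, a, b in rows]

c = TemporalContext(initiated_by=lambda e: e["param1"] == 2,
                    terminated_by=lambda e: e["param2"] == 0)
a = FilterAgent(within_context=c)  # filter-condition = all
p2 = a.run(p1)
```

The output pipe p2 carries the events with time 3 through 8, which is the selection Pedro asked for.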

This is of course a very simple case. We can also concatenate all selected events into one composite event, or use a more sophisticated agent, e.g. pattern detection.

More educational material about languages -- later.

Friday, December 12, 2008

More On Event Representation

CAC Japan is the only Japanese member of EPTS, and some of their team participated in the EPTS 4th Event Processing Symposium. One of the missions they have taken upon themselves is to bring event processing ideas and concepts closer to the Japanese market. They have sent me the URL of their Japanese portal, and in particular they have translated the EPTS glossary into Japanese.
I salute the initiative; we indeed need to reach out to more parts of the universe. I don't understand the Japanese part of the site, but I trust the CAC team. I hope to have an opportunity to visit Japan some day.

The issue of translation from language to language reminded me of the still chaotic situation in the area of event representation, which I blogged about a year ago.

Many of the existing event processing systems see event representation as an internal issue and make their own decisions about which structure should be supported: flat positional like 1-NF in a relational database, tagged semi-structured (XML), or Java objects. Furthermore, the "header" of an event varies. This reminds me of a discussion a few years ago with some of my IBM colleagues who worked on a standard for management events. They included "managed entity" and "severity" as mandatory attributes, since in their domain an event is always a reported problem. We explained to them that there are other types of events, e.g. money withdrawal. For this type of event there is no severity and no "managed entity". This was a funny discussion, since they did not really think that what we were talking about was an event; however, in their domain, what they had done made perfect sense. This leads to the thought that we may have several layers of attributes:

Universal attributes that should be mandatory in any event --- there should be very few of them --- Marco from RuleCore has suggested in his recent Blog:
Id - every event is unique and has a unique id.
Type - Every event is of a specific type.
Detection time - every event is detected at a specific time.
Entity id - The entity that changed its state.

I agree with the first three, but not every event changes the state of a single entity, and in some cases (e.g. derived events) it is not really important which entity changes its state, so I would move the "Entity id" to the next category.

Domain common attributes (which may also have several levels: domain, sub-domain, sub-sub-domain etc.) -- these may include: event-location, entity that changed its state, time-zone, severity and much more.
Event type specific attributes -- individual for each event type. Some of them may be reference to entities, some descriptive attributes.
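
One possible way to picture the three layers is the following Python sketch. The class name Event, the field names domain and payload, and the sample values are all assumptions for illustration; this is not a proposed standard.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict


@dataclass
class Event:
    # Layer 1: universal attributes, mandatory for every event.
    id: str
    type: str
    detection_time: datetime
    # Layer 2: domain common attributes (e.g. severity, managed entity,
    # location); present only where the domain calls for them.
    domain: Dict[str, Any] = field(default_factory=dict)
    # Layer 3: attributes specific to this event type.
    payload: Dict[str, Any] = field(default_factory=dict)


# A management event carries severity and managed entity...
problem = Event(
    id="e1",
    type="Disk-Failure",
    detection_time=datetime(2008, 12, 12, 9, 30),
    domain={"managed_entity": "server-7", "severity": "critical"},
)

# ...while a money withdrawal has neither, and is still a valid event.
withdrawal = Event(
    id="e2",
    type="Money-Withdrawal",
    detection_time=datetime(2008, 12, 12, 10, 0),
    payload={"account": "123", "amount": 200},
)
```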

There is even more interesting discussion about derived events - but I'll defer it to one of the next postings.

Wednesday, December 10, 2008

On contexts and separation of concerns - the case of "On Off Windows"

The window in the picture cannot be opened or closed without some fixing, which is typical of some of the event processing languages that support only fixed windows. I have been quite amused in the last couple of days to read the discussion in the CEP-Interest Group on Yahoo!, started by a query posted by Pedro Bizarro.
If you do not subscribe to this group, here is a copy of Pedro's query:

Hi there,

A company contacted me to solve a query puzzle that they have. They
are reading incoming tuples and want to select all tuples that are
between two tuples. For example, assume that I want all tuples between
the first tuple with PARAM1=2 and the first tuple after this one with
PARAM2=0. This would select all tuples with time between 3 and 8 in
the example below.

TIME | PARAM1 | PARAM2
1    | 1      | 0
2    | 1      | 1
3    | 2      | 2      <== ON
4    | 2      | 3
5    | 3      | 5
6    | 3      | 4
7    | 2      | 3
8    | 2      | 0      <== OFF
9    | 3      | 1

I have tried a number of event processing engines but cannot write
such a query. (However, I can write the query on a table on Oracle 11g
using Oracle's Analytic functions with the help of a simple PL/SQL
function.)

Do you have a solution for the OnOff Window puzzle?

Thanks!
-pedro

You can browse and see the variety of answers given by different vendors. My intention is not to analyze the results, but to look at the semantics of what Pedro is asking. Pedro is talking about a window. Typically a window is used to select a collection of events and perform some operation on them. In Pedro's case the operation is trivial (just select all of them), but we can slightly change the requirement to "find if a pattern P is satisfied in this collection of events" (example of a pattern: there are at least two events in which PARAM1 > PARAM2 within the same event) and, if this pattern is detected, perform some action (e.g. send me an SMS).

Here we have three types of conditions:

1. Condition for opening the window
2. Condition for closing the window
3. Condition for detecting a certain pattern (in this case the result of the detection is a "situation", since it triggers an action outside the event processing network).

Condition 3 is a normal type of "pattern detection" processing; however, conditions 1 and 2 have different roles: they are used to decide which events will be used as input for the pattern detection operation. One of the answers was given by my colleague from IBM Haifa, Guy Sharon, who talked about the AMiT implementation of "lifespan" as a temporal context. Again, without going into the particular language, I would like to point out that the principle of "separation of concerns" (a distinct separation between the conditions that select the events participating in an operation and the operation itself) provides better clarity to the end result. A window is indeed a special case of the term context, which I have blogged about several times before, pointing out the importance of looking at context as a first class citizen in event processing. Context can have several dimensions: the temporal dimension of context subsumes the notion of window, and there are other dimensions (e.g. spatial, which selects events based on their location).
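
To illustrate the separation, here is a small Python sketch in which the window conditions and the pattern condition live in two independent functions. The function names on_off_window and at_least_two_param1_gt_param2 are hypothetical, and the code is not tied to AMiT or to any other engine; it only uses the sample data from Pedro's query and handles a single window.

```python
def on_off_window(events, opens, closes):
    """Conditions 1 and 2: yield the events from the first one satisfying
    `opens` through the next one satisfying `closes`, both included."""
    active = False
    for event in events:
        if not active:
            if opens(event):
                active = True
                yield event
        else:
            yield event
            if closes(event):
                return


def at_least_two_param1_gt_param2(events):
    """Condition 3: the pattern, evaluated only on the selected events."""
    return sum(1 for e in events if e["param1"] > e["param2"]) >= 2


rows = [(1, 1, 0), (2, 1, 1), (3, 2, 2), (4, 2, 3), (5, 3, 5),
        (6, 3, 4), (7, 2, 3), (8, 2, 0), (9, 3, 1)]
stream = [{"time": t, "param1": a, "param2": b} for t, a, b in rows]

window = list(on_off_window(stream,
                            opens=lambda e: e["param1"] == 2,
                            closes=lambda e: e["param2"] == 0))
detected_in_window = at_least_two_param1_gt_param2(window)
detected_in_whole_stream = at_least_two_param1_gt_param2(stream)
```

On this data the pattern is not detected inside the window (only the event at time 8 has PARAM1 > PARAM2), although it would be detected over the whole stream (times 1, 8 and 9). The context changes the answer without the pattern definition changing at all, which is the point of keeping the two apart.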

Bottom line: my belief is that the consumability of event processing will increase if we succeed in making it available to semi-technical people and give clear semantic roles to the building blocks, instead of trusting that developers will find a way to hack a bit in order to satisfy requirements -- more on this topic later.

Sunday, December 7, 2008

On EPTS working groups

Towards the year 2009, EPTS will increase its activities. Currently six working groups have been approved in a series of meetings of the EPTS steering committee, extended with all the people who proposed working groups. We are going to issue soon a call for comments, vote and participation to the EPTS community.

First, something about the process of EPTS work. The main work will be done in working groups; each working group has two co-leaders (as the proposals go now). The steering committee serves as a facilitator: it helps the proposers devise the charter, makes sure it makes sense, and ensures that it meets the legal requirements (one of the consequences of making EPTS a formal organization is that there are some legal agreements between the members that need to be kept). The second phase, which we are now entering, is putting the working group proposals on the EPTS site, in a members-only section, for comments, vote, and call for participation. Each organizational and individual member can participate in any working group they are interested in. However, participation also means commitment to active participation. We shall hold a members' call to present all the proposals, and then the members will vote. Each negative vote will have to be justified, and the proposal leader will answer; both objections and answers will be made public. After the members' vote, there will be a final discussion in the steering committee, especially for working groups that had objections, and a final decision will be made.
The idea is to finish this whole process in early January and launch the working groups for 2009.
EPTS members will get further instructions. Participation in the working groups is restricted to EPTS members only, for legal reasons; however, everybody can become an EPTS member (organizational member or individual member; see instructions on the EPTS website).

The six working groups that will be presented are:

(1). Glossary: We have issued version 1.1 of the glossary, but the work has not ended; this is a living document and a moving target, as the event processing area is relatively young as a discipline. An agreed-upon glossary is important for having a common language, and this has been done successfully in other disciplines.

(2). Use Cases: This working group continues from 2008 and has devised a template for the analysis of use cases. The idea is to survey a significant number of use cases in order to classify event processing applications.

(3). Meta-modelling: OMG has issued RFPs for meta-modeling standards that are related to event processing, specifically: Event Metamodel and Agent Metamodel and Profile.
EPTS still needs to determine its status of engagement with OMG; depending on that, this can be either an official response to the RFP or input to OMG. In any event, EPTS has been recognized by OMG (and referenced in the RFP itself), and was asked to provide input. The working group will attempt to provide a unified response of the EPTS community. Note that this is the pattern we are pursuing in general: EPTS will not become a standards development organization, but will assist existing organizations in developing EP related standards.

(4). Reference Architecture: In the early pre-EPTS days, there was some work to collect and compare the reference architectures of various vendors. We are now returning to deal with reference architectures, this time in the form of an EPTS working group. This working group will propose one or more reference architectures for various cases (consistent with the evolving classification in the use cases working group and the evolving glossary).

(5). Interoperability Analysis: This working group will study requirements and mechanisms for interoperability, both between event processing products of various vendors, and between event processing products and various producers and consumers of events. After the study, the working group will make recommendations to the EPTS community about next phases (e.g. creation of additional standards, revision of current standards etc.).

(6). Languages Analysis: This working group will study existing event processing languages (both from products and from the literature) to devise (at a semantic level) the set of functions that are being used. After the study, the working group will make recommendations to the EPTS community about next phases (e.g. creation of a single language standard, creation of N variations for various languages, or creation of a meta-language standard...).

I will personally co-chair the languages analysis group (an area that I have spent a lot of time on recently), and will follow all the others.

More working groups may be launched, however, I surveyed only those approved so far to continue to the next phase.

I believe that by the end of 2009, with the reports of these working groups, we'll have advanced the understanding of the event processing discipline and will have a clear road-map for related standards...

Enough for now -- more later.

Friday, December 5, 2008

On Continuous Monitoring and Continuous Actions

In Israel, a new law to fight spam went into effect on December 1st. It forbids sending commercial material over Email, SMS and recorded phone calls, unless the person signed up explicitly (e.g. through a website). Recorded phone calls are especially bad; picking up the phone and talking to a recorded message is something I am allergic to... Unfortunately, this law is in effect only in Israel, and I keep getting spam in Turkish and Russian. I have already blocked more than a dozen spamming sources, but others keep popping up.

Anyway, this week my Master's student, Elad Margalit, had the final exam on his thesis. The thesis dealt with event-driven control of traffic lights, and he has shown (by simulation) that when the green and red frequencies of the traffic lights at individual junctions are optimized based on the actual traffic, waiting times can be shortened substantially (there may be various goal functions here). This is done by continuous monitoring of the traffic (entrance to and exit from a road segment, by camera). IBM Research has done a lot of work in the area of "continuous optimization", which deals with solving optimization problems continuously. There are two variations: solve the optimization problem all the time (in our case, for each cycle of the traffic light), or monitor all the time and solve the optimization problem only when there is a significant change from the base assumptions (in our case, when the distribution of traffic in the various directions is significantly different; otherwise leave the traffic light policies as is). The second variation is based on the assumption that the cost of monitoring is much lower than the cost of actually changing the system behavior; if there is no such difference, then we can use continuous actions.
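
The second variation (monitor all the time, re-optimize only on significant change) can be sketched in Python as follows. This is not the method of the thesis or of IBM Research: the proportional green_split "optimizer", the drift measure, the 0.15 threshold, and all names are stand-in assumptions chosen to keep the example short.

```python
def distribution(counts):
    """Share of traffic per direction; assumes at least one vehicle was counted."""
    total = sum(counts.values())
    return {direction: n / total for direction, n in counts.items()}


def green_split(counts, cycle_seconds=60):
    """Stand-in optimizer: green time proportional to observed traffic."""
    return {d: cycle_seconds * share for d, share in distribution(counts).items()}


def drift(baseline, current):
    """Largest absolute change in any direction's share of traffic."""
    return max(abs(current[d] - baseline[d]) for d in baseline)


class MonitoredController:
    """Monitors every cycle, but re-optimizes only on significant change."""

    def __init__(self, initial_counts, threshold=0.15):
        self.threshold = threshold
        self.baseline = distribution(initial_counts)
        self.policy = green_split(initial_counts)
        self.reoptimizations = 0

    def observe(self, counts):
        current = distribution(counts)  # cheap: runs every cycle
        if drift(self.baseline, current) > self.threshold:
            # Expensive step: change the system behavior.
            self.baseline = current
            self.policy = green_split(counts)
            self.reoptimizations += 1
        return self.policy


controller = MonitoredController({"north_south": 50, "east_west": 50})
controller.observe({"north_south": 55, "east_west": 45})  # small change: policy kept
controller.observe({"north_south": 80, "east_west": 20})  # large change: re-optimize
```

The first variation (continuous action) would correspond to calling green_split on every cycle, without the drift test.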

It is interesting to note that Coral8 has added to its logo the trademarked expression "Continuous Intelligence". Interesting phrase; I am not sure if there is indeed intelligence, but I guess that intelligence is in the eye of the beholder.