Showing posts with label complex event proessing. Show all posts
Showing posts with label complex event proessing. Show all posts

Wednesday, December 24, 2008

On Data Mining and Event Processing

Today I have travelled to Beer Sheva, the capital of the Negev, which is the south part of Israel, that consists mostly of desert. I have visited Ben-Gurion University, met some old friends, and gave a talk in a well-attended seminar on "the next generation of event processing". I have travelled by train (2 hours and 15 minutes each direction), and since my last visit there five years ago or so, they have built a train station from which there is a bridge which goes to the campus, very convenient. Since I am not a frequent train rider in Israel, I discovered that in both ends of the line, there are no signs saying which trains go on which track, and this is assumed to be common knowledge... Although they do notify when a train entered the station where it is going and from which track, but still they have a point to improve.

Since some of the people attended my talk were data mining people they have wondered about the relationships of event processing and data mining, since I've heard this question before I thought the answer will be of interest to more people.

In the "pattern detection" function of event processing, there is a detection in run-time of patterns that have been predetermined, thus, the system knows what patterns it is looking for, and the functionality is to ensure correct and efficient detection of the predetermined patterns.

Data mining is about looking at historical data to find various things. One of the things that can be found are patterns that have some meaning and we know that if this pattern occurs again then it requires some reaction. The classic example is "fraud detection", in which, based on mining past data, there is a determination that a certain pattern of action indicates suspicion of fraud. In this case the data mining determines the pattern, and the event processing system finds in run-time that this pattern occurs. Note, that not all patterns can be mined from past data, for example- if the pattern is looking for violation of regulations, then the pattern stands for the regulation, but this is not mined, but determined by some regulator and is given explicitly.

So - typically data mining is done off-line and event processing is performed on-line, but again, this is not true for all cases, there are examples in which event processing and mining are mixing in run-time. An example is: there is a traffic model, according to which the traffic light policies are set, but there is also constant monitoring of the traffic, and when the monitored traffic is deviating significantly then the traffic model has to be changed and the policies should be set according to the new traffic model, this is kind of mix between event processing and mining, since the actual mining process is triggered by events, and the patterns may change dynamically as the result of this mining process.

More - Later.

Friday, December 12, 2008

More On Event Representation


CAC Japan is the only Japanese member of EPTS, some of their team have participated in the EPTS 4th Event Processing Symposium. One of the missions they have talked upon themselves the task to bring the event processing ideas and concepts of event processing closer to the Japanese market, they have sent me the URL of their Japanese portal
and in specific they have translated the EPTS glossary to Japanese
I salute the initiative, we indeed need to reach out for more parts of the universe. I don't understand the Japanese part of the site, but trust the CAC team. Hope to have an opportunity to visit Japan some day.

The issue of translation from language to language reminded me about the still chaotic situation in the area of event representation on which I have blogged about a year ago.

Much of the existing event processing systems see the event representation as an internal issue and make their own decision about which structure should be supported: flat positional like 1-NF in relational database, tagged semi-structured (XML), or Java objects. Furthermore the "header" of an event varies. This reminds me a discussion a few years ago with some of my IBM colleagues who worked on standard for management events. They included "managed entity" and "severity" as mandatory attributes, since in their domain -- event is always a reported problem. We explained them that there are other types of events, e.g. money withdrawal. For this type of event there is no severity and no "managed entity". This has been a funny discussion since they did not really think that what we are talking about is event, however, in their domain, what they have done made perfectly sense. This leads to a thought that we may have several layers of attributes:

  • Universal attributes that should be mandatory in any event --- there should be very few of them --- Marco from RuleCore has suggested in his recent Blog:
  • Id - every event is unique and have an unique id.
  • Type - Every event is of a specific type.
  • Detection time - every event is detected at a specific time.
  • Entity id - The entity that changed its state.
I agree to the first three, but not every event changes the state of a single entity, and in
some cases (e.g. derived events) it is not really important which entity changes it state, so I
would move the "Entity id" to the next category.
  • Domain common attributes (which also may have several level of domain, sub-domain, sub-sub-domain etc...) -- this may include: event-location, entity that changed its state, time-zone, severity and much more.
  • Event type specific attributes -- individual for each event type. Some of them may be reference to entities, some descriptive attributes.
There is even more interesting discussion about derived events - but I'll defer it to one of the next postings.


Wednesday, November 26, 2008

On event hirearchies and types


In a comment to my previous posting Harvey has reminded me that Israel won second place ("gold medal") in the Chess Olimpics, actually it has been number one through part of the games and then moved to number two which is also very respectable. I have not played chess since high school, but I have registered to play in an internal IBM championship in Backgammon;
A game I have not played a lot recently, so need to practice...

I am following with interest Marco's Rulecore Blogging about geo-spatial operators, I think this is in the right direction, and there is a lot of potential in spatial and even spatio-temporal patterns and I also view space as one of the main dimensions of context. However, today I would like to write about one of the other topics that Marco has been writing about -- event hirearchies, in this posting Marco suggests hirearchy level of events and rules, such as rules in level 1 consume only events of level 1 and produce event of level 2, and so forth. This reminds me a great scholar named Bertrand Russell, whose picture you can see below:
Russell has introduced a well-known paradox which in its popular form is called the "Barber Paradox" (Wikipedia claims that
which states that in a village there is a barber who shaves everybody who does not shave himself. The question is who shaves the barber? if the barber shaves himself that means that the barber is not shaved by the barber, however if the barber shaves the barber that means that the barber does not shave himself, meaning that the barber shaves himself if the barber does not shave himself and vice versa. Russell also provided solution for the paradox called "type theory" saying that we have hierarchy of terms and functions/predicates based on "types". Thus if a "person" is of type 1, Barber is a predicate of a person, which makes it term of type 2, being a term of type 2, it cannot participate in predicates on type 1, and thus the entire paradox does not happen. Type theory indeed resolves the paradox (if you wonder the "types" in programing languages are indeed derived from Russell's terms), however, the price is that the expressive power of the language is compromised, things which seem to us perfectly valid in natural language are not valid under the types theory.

Back to event processing -- let's look at the following example:
  • Our location-based event processing system manages fleet of cabs and travel reservation.
  • An order arrives for a travel in 30 minutes
    - this is an event of type 1, it is processed and matched against taxis locations and directions to assign the right cab and makes "cab assingment" which is an event of type 2 (since it was created by an event processing agent - "rule" in Marco's terminology that processed events of type 1).
  • After a short while the perspective traveller changed his mind and called to cancel the order. This is again, event of type 1.
  • However, now we have to find out if the travel has already been allocated and if yes deallocate the travel and free the allocated taxi. This operation has to take two events as input -- the travel cancellation (type 1) and the travel allocation (type 2)... Oops... our system works purely on strict hierarchy of event types --- can't do that!.
This demonstrates that using strict hieratchy we'll have, like Russell's type theory, to give up expressive power. So it seems that while hierarchy is a good idea, it may also serve as a restriction... One modification is indeed to order the event types and the agents, however enable agent of type 2 to accept as input events at type 1 and 2.

With this correction there are certain benefits for using the hierarchy levels, relate to optimization possiblities that Marco have not mentioned. We'll discuss them in one of the next postings.