Showing posts with label event processing functions.

Thursday, May 20, 2010

Dagstuhl seminar on event processing - the third day

The third day of a Dagstuhl seminar traditionally includes a half-day trip outside the castle. This time we traveled to a place called Mettlach, seen in the picture, sailed on the Saar river past the island shown here, went up to a viewpoint on the hill above it, and then visited a winery, where we tasted six kinds of wine and heard long explanations (in German) about them.

In the morning we had one more breakout session, a deep dive into topic 2: what are the functions of event processing (including non-functional ones), though for some topics there was a difference of opinion on whether they are functional or non-functional (e.g. provenance).

There were discussions about the boundaries of event processing: are "actions" internal or external to event processing? They seem to be external, but for provenance and retraction the event processing system should be aware of them. The team also identified a collection of topics that require further research; here is the list:

  • Use of EP to predict (anticipate) problems
  • Use of predictions (e.g. from simulations) in EP
  • Complex actions
  • Action processing as the converse of event processing
  • Decomposition of complex actions with time constraints
  • Goal directed reaction
  • Adaptive planning
  • Implicit validation
  • Function placement and optimization
  • Real-time machine generated specification
  • Compensation and Retraction
  • Privacy and Security
  • Probabilistic events
  • Provenance
I'll write more about future research topics. Today we finish the deep dives and start to wrap up: determining the structure and schedule of the final document, and moving on to discuss the most important stuff -- what we want to achieve and what the follow-up actions will be.

There is another Dagstuhl tradition - to take a group picture, always in the same place, on the stairs of the castle's old church:



Friday, December 18, 2009

On event processing functions

I have been on a short vacation, and went with (some of) my family to see the film 2012. It is based on an ancient prophecy that the world as we know it will come to an end on December 21st, 2012 -- three more years to see whether this prophecy comes true.

This time I would like to write about event processing functions. I have written about them before; here I am just summarizing them in one place.

There are various functions under the roof of event processing; some applications need all of them, but many need only some, at various levels of sophistication.

Here are the major functions that I have observed:

1. Event distribution: This is the most basic one. Events are disseminated to consumers through intermediate brokers (often called channels); the events may be filtered, but are transferred without change, and any further processing occurs within the consumer's premises and is not part of the event processing system. Pub/sub systems are of this type, and there is a lot of work on such systems in the distributed computing area.

2. Event transformation: This goes another step and sends the consumers transformed events, where the transformation may be translation, aggregation, composition, enrichment, projection or split. Aggregation is probably the most notable transformation, and there are many applications whose main use of event processing is transformation.
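As a hedged illustration (a minimal sketch, not tied to any particular product), the following Python fragment shows an aggregation-style transformation: raw trade events are grouped by symbol and emitted as derived summary events. The event shape and the field names (`symbol`, `quantity`) are assumptions made up for the example.

```python
from collections import defaultdict

def aggregate_by_symbol(events):
    """Transform raw trade events into per-symbol summary events.

    Each input event is assumed to be a dict such as
    {"symbol": "ACME", "quantity": 100}; the field names are illustrative only.
    """
    totals = defaultdict(int)
    for event in events:
        totals[event["symbol"]] += event["quantity"]
    # Emit one derived (aggregated) event per symbol.
    return [{"symbol": s, "total_quantity": q} for s, q in totals.items()]

raw_events = [
    {"symbol": "ACME", "quantity": 100},
    {"symbol": "ACME", "quantity": 50},
    {"symbol": "XYZ", "quantity": 30},
]
print(aggregate_by_symbol(raw_events))
# [{'symbol': 'ACME', 'total_quantity': 150}, {'symbol': 'XYZ', 'total_quantity': 30}]
```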

3. Event pattern matching: This function is to find whether any subset of the input events satisfies a predefined pattern.

Note that some systems require transformation only, some require pattern matching only, and some require both; they can also differ in the level of sophistication needed in each. An application may require only very simple patterns, or sophisticated ones; likewise it may require very simple types of transformation or much more advanced ones.
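To make the pattern-matching function concrete, here is another minimal, hedged sketch in the same spirit (the event fields `type` and `time`, and the "withdrawal followed by a large deposit" scenario, are invented for illustration): it detects a simple "A followed by B within a time window" pattern over a time-ordered input.

```python
def detect_followed_by(events, first_type, second_type, window):
    """Report pairs where an event of first_type is followed by an event of
    second_type within `window` time units. Events are dicts with 'type'
    and 'time' fields (illustrative names) and arrive in time order."""
    matches = []
    pending = []  # times of first_type events awaiting a match
    for e in events:
        if e["type"] == first_type:
            pending.append(e["time"])
        elif e["type"] == second_type:
            # Discard first events whose window has already expired.
            pending = [t for t in pending if e["time"] - t <= window]
            # Pair the remaining first events with this second event.
            matches.extend((t, e["time"]) for t in pending)
            pending = []  # each first event participates in one match here
    return matches

events = [
    {"type": "withdrawal", "time": 0},
    {"type": "withdrawal", "time": 2},
    {"type": "large_deposit", "time": 5},
]
print(detect_followed_by(events, "withdrawal", "large_deposit", window=10))
# [(0, 5), (2, 5)]
```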

4. Situation discovery / event pattern discovery: This function is to discover that some situation occurs without having a predefined pattern, using intelligent techniques. While the first three types of functions are relatively well investigated (although I can't say that all issues are figured out), the fourth one is still a challenge: there are some experiments, but generally it is not well established yet.

This also reminds me of a different topic -- misconceptions around event processing; I'll write about that soon.

Saturday, April 25, 2009

On Revision


Saturday morning, and I am spending some spare time (well -- ignoring my huge to-do list...) reading the autobiography of Shmuel Tamir, who was probably the most influential lawyer in Israel, as well as a political leader for whom I have always had great respect (I don't admire people).

Today I would like to write about the notion of "revision" and relate it to event processing.
It is inspired by, but is not a direct response to, a thread of discussion started by Peter Lin in the complexevents forum under the name "mutability and aggregation".

Revision is somewhat different from modification: in a modification a fact changes because reality changed, while in a revision the recorded fact is corrected because it was wrong. For example: if John Smith moves from the USA to Canada, then the facts about John Smith are modified; while if it was recorded by mistake that John Smith lives in the USA, when in reality he has always lived in Canada, then fixing this is a correction of a recording mistake, i.e. a revision. Some people may wonder why it is important to distinguish between the two.

The first use of "revision" that I came about was in AI, talking about "non monotonic logic", the rationale is that using "classic logic" one can reason about the universe just if there is perfect knowledge, so the example used is that although birds typically can fly, however there are some exceptions -- Penguin does not fly, Ostrich does not fly, bird with broken wings cannot fly etc..
Let's say that Tweety is a bird and we don't know anything else about it. According to classic logic we cannot say whether it flies; however, according to the various non-monotonic logics, since birds typically fly, we can assume for all practical purposes that Tweety flies, as long as we are ready to withdraw this assumption when new information (such as: Tweety is a penguin) becomes available. In that case we may need to retract all the assertions that were inferred, directly or indirectly, from the revised assertion.
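A toy sketch of this non-monotonic behaviour (purely illustrative; it does not implement any particular non-monotonic logic): a default conclusion is drawn from the fact that Tweety is a bird, and retracted once the contradicting fact that Tweety is a penguin becomes known.

```python
class DefaultReasoner:
    """Toy non-monotonic reasoner: birds are assumed to fly unless an
    exception (e.g. penguin) becomes known; conclusions drawn from the
    default are retracted when that happens."""

    def __init__(self):
        self.facts = set()
        self.assumed = set()  # conclusions derived from defaults

    def tell(self, fact):
        self.facts.add(fact)
        self._revise()

    def _revise(self):
        # Recompute default conclusions from scratch after every new fact.
        self.assumed.clear()
        for subject, predicate in self.facts:
            if predicate == "bird" and (subject, "penguin") not in self.facts:
                self.assumed.add((subject, "flies"))  # default conclusion

    def believes(self, fact):
        return fact in self.facts or fact in self.assumed

kb = DefaultReasoner()
kb.tell(("tweety", "bird"))
print(kb.believes(("tweety", "flies")))   # True, by default
kb.tell(("tweety", "penguin"))
print(kb.believes(("tweety", "flies")))   # False, the assumption was retracted
```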

Later in life I worked on temporal databases. One of the motivations for temporal databases has been to issue "as-of" queries, meaning: looking at what was known from the viewpoint of a certain time point in the past. For example, if we investigate possible malpractice by a physician (I heard that the national sport of Americans is to sue their physicians), then in order to determine whether the physician made a reasonable decision we need to know what information was available to the physician at the time the decision was made. To achieve that, facts cannot be deleted or changed; we need an "append only" database. The distinction between "modification" and "revision" is important for the decision analysis: there is a difference between the fever being high only in the next measurement, and the fever having been high already in the measurement before the decision but reported wrongly and revised later. Eleven years ago I co-edited a book about temporal databases which (among other things) discusses these issues.
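As a rough sketch of the "append only" idea (my own simplification, not the model from the book): every recorded value carries the time at which it was recorded, nothing is ever deleted, and an "as-of" query reconstructs what was known at a given time point. The key name and values are invented for the example.

```python
class AppendOnlyStore:
    """Minimal append-only store: each recorded value of a key carries the
    transaction time at which it was recorded; nothing is ever deleted,
    so "as of" queries can reconstruct what was known at any past time."""

    def __init__(self):
        self.records = []  # (recorded_at, key, value)

    def record(self, recorded_at, key, value):
        self.records.append((recorded_at, key, value))

    def as_of(self, key, time_point):
        """Return the latest value for `key` recorded no later than time_point."""
        known = [(t, v) for t, k, v in self.records if k == key and t <= time_point]
        return max(known)[1] if known else None

store = AppendOnlyStore()
store.record(10, "patient_42_fever", 37.0)   # measured and recorded at t=10
store.record(30, "patient_42_fever", 39.5)   # later correction: the reading was wrong
print(store.as_of("patient_42_fever", 20))   # 37.0 -- what the physician knew at t=20
print(store.as_of("patient_42_fever", 40))   # 39.5 -- what is known after the revision
```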

Now, something about revisions and event processing. Recall that an event is something that happens, and it is reported to an event processing system through its projection, which is also known as an event (sometimes: event object, event message). An event that happens in reality cannot be modified or deleted. However, when we move to its projection in the processing system, if we assume that knowledge is not perfect then we can have several cases of revision:



1. The event really did not happen, but it was reported by mistake that it happened, and the mistake was realized later.
2. The event happened, but it was not reported, and this was realized later.
3. The event both happened and was reported, but some information associated with the event (carried in the event's payload) had a wrong value due to an error that was corrected later.
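A hedged sketch of how these three cases might be represented in an event stream (the message kinds and field names are purely illustrative, not a standard): the original reports are never overwritten in place; instead a late report, a retraction, or a correction is appended, and the current view is derived from the whole stream.

```python
def apply_revisions(messages):
    """Maintain the current view of reported events given a stream that may
    contain late reports, retractions, and corrections. Message shapes
    (kind/id/payload fields) are illustrative only."""
    current = {}
    for msg in messages:
        if msg["kind"] == "report":          # case 2: possibly a late report
            current[msg["id"]] = msg["payload"]
        elif msg["kind"] == "retract":       # case 1: the event never happened
            current.pop(msg["id"], None)
        elif msg["kind"] == "correct":       # case 3: the payload value was wrong
            if msg["id"] in current:
                current[msg["id"]].update(msg["payload"])
    return current

stream = [
    {"kind": "report",  "id": "e1", "payload": {"amount": 100}},
    {"kind": "report",  "id": "e2", "payload": {"amount": 70}},
    {"kind": "retract", "id": "e2"},                              # e2 was reported by mistake
    {"kind": "correct", "id": "e1", "payload": {"amount": 150}},  # wrong value fixed later
]
print(apply_revisions(stream))   # {'e1': {'amount': 150}}
```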

I'll post soon a continuation that discusses the implications of revisions on the processing of events. More - Later

Saturday, February 14, 2009

Quantum Leap -- take II


This morning was a sunny Saturday after a few rainy ones, and along with many other people, I went out with my family to nature... We live in Haifa, which besides its beaches and beautiful view of the bay also has a big nature reserve close by called the "Carmel forests" -- not really a forest in global terms, but it has many nice hiking trails, a 15-minute drive from home. Here are some of the flowers we watched today... good to take a break sometimes.

As a follow-up to my previous posting on the quantum leap, here are some more insights. We in the IBM Haifa Research Lab have signed up to look at the "next generation of event processing" and are working on this topic; I may present a tutorial about our findings at DEBS 2009, if accepted.


Here are some initial insights:

  • As in databases, there needs to be a formal model that will gain wide acceptance (over time) to enable the quantum leap, since acceptance provides a critical mass of work directed in the same direction. Our belief is that the "event processing network" model is the one, but it still lacks a solid formal basis.
  • Besides this, there are four areas that will show significant developments in the future; if they are done on the basis of the model, this can provide a coherent whole. The pyramid below shows the four:


  • Platform: While the first generation of event processing is the "engine" land, we are starting to see movement towards platforms that will provide shared services (e.g. global state management, routing, load balancing, security, high availability...), with a possibly heterogeneous collection of event processing agents running on these platforms. There may be platforms with various orientations -- grid platforms, database-oriented platforms, messaging-oriented platforms, streaming (data flow) oriented platforms, to name a few. The platforms may be "event processing platforms" or platforms with wider functions (e.g. event processing agents and other decision agents). Some analysts are talking about extreme transaction processing (XTP) and context-oriented platforms; maybe the platform will mix some or all of the above. Like application servers in enterprise computing, the platform orientation is one of the facets of the next generation.
  • Engineering: Engineering progress is not really considered a revolution, but it is required to enable the higher layers to work in reality. This is the equivalent of what in other areas is query optimization, tuning, configuration, scheduling, load balancing, parallel programming assignments and various other system-related topics. Relational databases became widespread only after the vendors succeeded in getting the engineering parts right, so advancement in this area is critical.
  • Functional: The functionality that products have today is just the start; more functionality will be supported, maybe even substantially more. Some directions: the "intelligent event processing" direction -- looking at discovery of unknown patterns and prediction of future events; adding more context information, like geo-spatial; getting better temporal handling; and probably much more.
  • Usability: Here, probably, will be much of the quantum leap -- raising the abstraction levels. Hierarchy of events and causality, advocated by David Luckham, are really abstractions. However, there is more than just abstracting upward from the implementation; there also need to be abstractions downward from the user's thinking. Instead of trying to visualize and abstract out the implementation model, the opposite direction is to have the abstractions in the user's domain of thinking and translate them (perhaps not 1-1) to an implementation.
The quantum leap will occur with a coherent combination of all these aspects. There may be some new vendors that will offer the next generation as their first generation, since they are liberated from supporting legacy (and may be acquired by larger vendors), and there are existing vendors that are going into some of this in an incremental way...

EPTS will attempt to contribute to the thinking about the next quantum leap through the work of its working groups. At the last EPTS event processing symposium the use cases working group presented a variety of use cases covering a broad range of application types and requirements; this will be one vehicle for determining requirements. Other working groups will contribute in the various areas. In May 2010 we'll hold a major summit of industry and academic people (a Dagstuhl seminar); EPTS members will get a more detailed note about it.

More - Later.

Sunday, September 14, 2008

On sporadic events


I have never been a student at Stamford High School, but Stamford, CT is my home away from home for the next seven days. Starting tomorrow I'll provide some impressions from the Gartner meeting and the EPTS symposium, but I rely on other people in blog-land to have better coverage (e.g. Paul Vincent, with endnotes and references). I arrived earlier today and am resting before the busy week.


Here is one more observation, a thought that came to mind when looking at some of the discussions around the Stream-SQL standards.


While it is claimed that the difference between a "stream" and a "cloud" is that a stream is totally ordered and a cloud is partially ordered, I think there are also some more distinctions. I'll discuss one of them --- sporadic events vs. known events. When dealing with "time series" type of input, the timing of events is known: for each time unit (whatever it is) there is an event (or set of events) that is reported. This is true when the events are stock quotes provided periodically, or signals from sensors provided periodically. There are also events that do not naturally organize themselves in a time-series fashion, for example: bids in an auction, complaints from customers, an irregular deposit, a coffee machine failure, etc.

From the point of view of functionality there is not much difference --- one can create a time series which reports a possibly empty set of events for each time unit, but if in most time units the reported set is empty, this will not be a very efficient way to handle it. On the other hand, a system that does not support time series as a primitive can view all events as sporadic events, though there may be some optimization merit in knowing that events are indeed expected at every time unit. This is just one dimension, but it leads me to reinforce my conclusion that there are various ways to provide event processing functionality, and the most efficient way is probably a hybrid of approaches based on the semantics of the functions and the characteristics of the input. So this is the observation of today; it will be interesting to see what the main discussion topics will be in the coming conferences.
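As a hedged illustration of the point above (the event shape and the numbers are invented for the example), here is a minimal sketch that views sporadic events as a time series with one possibly empty bucket per time unit; with only two events over sixty units, most buckets come out empty, which is exactly the inefficiency mentioned.

```python
from collections import defaultdict

def to_time_series(events, start, end, unit):
    """View sporadic events as a time series: one (possibly empty) bucket per
    time unit between start and end. Event shape ('time' field) is illustrative."""
    buckets = defaultdict(list)
    for e in events:
        buckets[(e["time"] - start) // unit].append(e)
    n_buckets = (end - start) // unit
    return [buckets.get(i, []) for i in range(n_buckets)]

sporadic = [{"time": 3, "what": "bid"}, {"time": 47, "what": "bid"}]
series = to_time_series(sporadic, start=0, end=60, unit=1)
print(sum(1 for bucket in series if bucket))   # 2 non-empty buckets out of 60
```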
More later - from Stamford Hilton.




Thursday, July 10, 2008

On EuroPLoP and Event Processing Patterns




This is Kloster Irsee in Germany, where the EuroPLoP conference is held. The conference itself runs in a kind of opposite way to other conferences: in a regular conference the author of a paper presents; in this conference the author of a paper listens to others criticizing the paper, and does not talk. It also has some other peculiar properties -- some of the sessions are held in rooms from which the chairs have been taken out, so people sit on the floor or stand, and there are also games embedded in the program. My daughter, had she been here, would have said that this is a bunch of nerd children trying to behave in a cool fashion -- well... anyway, we came here to combine work on creating a consortium for an EU project with a three-hour "focus group" inside the conference, organized and moderated by Adrian Paschke. The audience was mixed: some were the EP gang who came to work on the EU project, and some were regular participants in the conference. The various interpretations of patterns related to EP were discussed, as was the need to advance in all of them. This is directly related to my tutorial at DEBS 2008; there is now work in progress on a pattern meta-language that we are doing in IBM, and in a few weeks I'll tell you more about it - stay tuned.
More about my tutorial at DEBS 2008: Paul Vincent from TIBCO took a picture of a slide that I presented on a different computer (due to some unclear technical problem) with, probably, a different character set (Paul also blogged about this slide - see footnote 1). Paul has been kind enough to send me the picture, so here it is; you may see some unknown characters...


More about patterns - soon.

Wednesday, July 9, 2008

On the multiple types of patterns in event processing




The red arrow in the map shows where I am now - in Kaufbeuren, a town in the south-west corner of Bavaria. I arrived last night by train, and it was a challenge to find a place to eat at 9 PM, but after a few minutes of searching I found one. I am on the way to the patterns conference, where we have a working group to establish a partnership towards an EU project proposal. Thus, I would like to write a few lines about patterns. In my DEBS 2008 tutorial I used the following slide:


In this slide there are four possible meanings for patterns in event processing:
(1). Patterns in the sense of functions that an event processing application may perform (e.g. enrich, route, transform, filter, detect pattern).
(2). Patterns that are detected on the history of events (note that "detect pattern" is just one of the patterns of the first type, but the support of this pattern is what makes it CEP).
(3). Patterns in the user's domain - the way things are presented to the user (which may not be translated 1-1 to an implementation pattern)
(4). Patterns in the software engineering sense - best practices of how to use some language / product / technology.
All four are interesting and also have some dependencies. My tutorial concentrated mainly on patterns of type (2); I have written a lot about type (1) in the past, and types (3) and (4) are frontiers that still need to be conquered. More - later.