Saturday, July 5, 2008

On events and Data

I am back at home for 2.5 days; on Monday morning I'll continue travelling the world - this time to Germany, mostly for the DoReMoPat consortium meeting, which is related to the tutorial I gave at DEBS on event processing patterns. Today I would like to refer to one of the topics discussed at DEBS, as a result of the tutorial given by my friends Shailendra and Dieter from Oracle - the question is what the boundaries are (if any) between database processing and event processing.
I have already posted a blog entry about this topic: is event processing a footnote for databases? and will not repeat what was written there. To illustrate the difference between events and data, I'll take the state machine example -- data is like states and events are like transitions. Thus, a database reflects a snapshot that captures a single state of the universe. In event processing we process not the state but the history of transitions, so the type of processing is slightly different (although, as we noticed from Oracle's presentation, there is now an attempt to take the pattern notion from EP and get it into an SQL extension; IMHO the result is somewhat complicated, since it is not a natural hybrid).
Anyway, one of the controversial points has been whether a sensor reading is an event or data -- according to the state/transition test, a sensor reading is an event -- it transitions the database to a new state that includes the new transition. Some clever guy asked me the question - what happens if the sensor reading is the same as the previous one? Nothing has changed in the universe, so it is not a transition. Well - good question, but IMHO this is still a transition. Transitions may return the state machine to the same state (as seen in the state machine example above - the top state has a transition that maps back to itself). Why is it a transition? First, the timestamp is different, so it is a new piece of information that exists (the combination of reading and timestamp), and it also has the potential to change the state of the universe by mapping to another state. More - Later.
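The state/transition distinction can be made concrete with a small sketch. This is only an illustration under my own assumptions: the names `Reading` and `SensorStore`, and the idea of keeping the snapshot and the history side by side, are hypothetical and not taken from any product. The snapshot plays the role of the database state, the history is what an event processor would work on, and a repeated reading is a self-loop: the snapshot value stays the same, but a new transition has still been recorded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """An event: one sensor reading together with its timestamp."""
    timestamp: int   # hypothetical unit, e.g. seconds
    value: float


class SensorStore:
    """Keeps both views: the current state and the history of transitions."""

    def __init__(self):
        self.current_value = None   # the snapshot a database query would see
        self.history = []           # the transitions an event processor sees

    def apply(self, event):
        """Apply one transition; return True if the snapshot value changed."""
        changed = event.value != self.current_value
        self.current_value = event.value
        self.history.append(event)  # recorded even when the value repeats
        return changed


store = SensorStore()
readings = [(1, 20.0), (2, 20.0), (3, 21.5)]
results = [store.apply(Reading(t, v)) for t, v in readings]
```

The second reading repeats the first value, so the snapshot does not change, yet the history grows by one entry whose timestamp differs from the previous one.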

Friday, July 4, 2008

On DEBS 2008

Still in Rome, still in the hotel's bar, since my room does not get enough wireless signal. The DEBS conference site is just next door to the church in which the famous Moses statue is located.
Tonight I have also finished my participation in DEBS (it goes on for one more day, but I will be heading home tomorrow according to the plan). The DEBS conference (disregarding some organizational challenges) has gone well. While it has been a scientific conference, there has been relatively big participation from industry - I have seen people there from vendors as well as customers. DEBS was originally the home of the pub/sub and messaging community; however, they have successfully extended it to include various other event processing "sub-communities" like the CEP guys, the streaming guys, people from the modeling area etc.
I have contributed my part by providing a tutorial about EP patterns - which is a work in progress for building a meta-language which abstracts the semantics behind event processing functionality.
Since it is late tonight, I'll defer the discussions that I had at the conference to later posts, and will just make a note that I owe everybody a discussion on three points:
  • Boundaries and territories among disciplines -- in the wake of discussion with Dieter Gawlick from Oracle.
  • The term "event processing agents" - some discussions, and Blog comments from Paul Vincent
  • The notion of pattern in event processing.

Overall - a very good conference, an opportunity to meet a lot of friends, and (I am ashamed to admit) my first visit to Rome ever.

Monday, June 30, 2008

On the EPTS working group use cases

Hello from hot and humid Rome, Italy. I have just finished eating a pizza dinner just across the street from the Coliseum, seen in this picture; there is a lot of ancient stuff around. I am now writing this Blog in the hotel lobby - just a 5 minute walk from there. Why am I writing from the hotel lobby? One of my criteria for choosing hotels is that they advertise Internet access in their rooms, and this hotel also claims to have Internet access - nominally they have, it is just that the signal in the rooms is too weak, and in order to work one has to come to the lobby - which is a piece of information that the receptionist gladly shares with you after you check in... Anyway, they have a comfortable desk to work at and air-conditioning here, which is more than we had today at the university, where we met for the EPTS working group.
So - to start from the beginning, I arrived in Rome last night for the DEBS conference. If you look at the DEBS site you can discover that the air-conditioning in the university has broken down, and they are moving the conference to another site -- well, but only from Wednesday, and tomorrow we'll have to suffer another day in the non air-conditioned building. This has been the first time in history that I have closed my laptop because it got so hot that I feared it would not survive. Tomorrow I'll give a tutorial in front of a sweating audience just after lunch; it will be fun.
Today we had a F2F meeting of the use cases working group - an interesting mix of four people from different vendors (Dieter Gawlick from Oracle, Richard Tibbetts from Streambase, Brian Connell from WestGlobal and myself), one academic person (Pedro Bizzaro) and one customer (Alex Kozlenkov from BetFair); there are some other members who could not participate, so the working group is larger than the six of us. Despite the heat it has been a long and productive day -- what are we trying to achieve?
We are trying to learn something from the variety of use cases out there. Until now we have heard, in the three EPTS meetings and the Dagstuhl seminar, something like 80 use cases. However, we cannot learn much from them: since they have been presented in an unstructured way, we could not analyze them on the same scale. The purpose is to try to classify event processing applications into several classes, and observe what the important requirements are for each class - in terms of functional and non-functional requirements, ROI measurements etc. This is done to enable existing and potential customers to better understand the types of event processing (complex event processing or other event processing types). Today we have worked on the template - what are the criteria to analyze each individual use case - and got to a draft that will be finalized in the next few weeks. We shall present the template, together with some use cases described using this template, in the next EPTS F2F meeting in September - so stay tuned.
Tomorrow -- DEBS. While DEBS is an academic conference, I am glad to see here many of my industrial colleagues (both vendors and some customers). Establishing a research discipline around event processing, and helping to grow the academic research in this area, has been one of the basic missions of EPTS, and we are supporting DEBS as the academic research flagship. I will write my impressions from the conference later this week (if my laptop survives the heat)...

Sunday, June 29, 2008

On cost-effective EP when high performance is not required

I have already referred to this issue in the past. There are several concrete examples that come to mind - here is one (real system):
An enterprise has internal regulations for handling suppliers. These regulations relate to the interactions with suppliers, which may be represented as events, and involve actions that must be done, or must not be done, within a certain amount of time, or until some event happens. The application monitors compliance with these internal regulations in audit mode, meaning that the results are alerts (which are also treated as events, since there are time-outs from alert sending) and not direct interference in the business processes.
The throughput is far below what is considered "high throughput": it is several thousand events per day. The latency is also not required to be extremely low -- if an alert is issued within a minute or two, it is still very fast auditing. It also does not require any analytics or intelligent procedures, since the regulations are given and deterministic.
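To give a feeling for the kind of regulation described above, here is a minimal sketch of one deadline rule checked in audit mode. Everything in it is an assumption made for illustration: the `Event` and `DeadlineRule` types, the event kinds `order_sent` and `order_confirmed`, and the one-hour limit are hypothetical and do not come from the real system. It also simplifies by tracking only one open trigger per supplier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: int   # hypothetical unit: seconds
    kind: str
    supplier: str


@dataclass(frozen=True)
class DeadlineRule:
    """Hypothetical rule: `expected` must follow `trigger` within `limit` seconds."""
    name: str
    trigger: str
    expected: str
    limit: int


def audit(events, rule, now):
    """Return alert events for triggers not answered within the limit, as of `now`."""
    pending = {}   # supplier -> timestamp of the open trigger (one per supplier)
    alerts = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.kind == rule.trigger:
            pending[ev.supplier] = ev.timestamp
        elif ev.kind == rule.expected and ev.supplier in pending:
            started = pending.pop(ev.supplier)
            if ev.timestamp - started > rule.limit:   # answered, but too late
                alerts.append(Event(started + rule.limit, "alert:" + rule.name, ev.supplier))
    for supplier, started in pending.items():         # never answered
        if now - started > rule.limit:
            alerts.append(Event(started + rule.limit, "alert:" + rule.name, supplier))
    return alerts


rule = DeadlineRule("late_confirmation", "order_sent", "order_confirmed", 3600)
events = [
    Event(0, "order_sent", "A"),
    Event(1800, "order_confirmed", "A"),   # on time
    Event(100, "order_sent", "B"),
    Event(5000, "order_confirmed", "B"),   # late
    Event(200, "order_sent", "C"),         # never confirmed
]
alerts = audit(events, rule, now=10000)
relaxed = audit(events, DeadlineRule("late_confirmation", "order_sent", "order_confirmed", 6000), now=10000)
```

The alerts are themselves events, so they can be fed into further rules. Because the rule is plain data, changing the limit means changing one value rather than rewriting code, which is the sort of change users want to control themselves.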
What are the benefits for the customer to use EP software and not any other solution?
First - some of the regulations are fairly complex. Hard-coding them can be time and resource consuming; putting everything in databases and using SQL was also considered, but some of the regulations are not easy to express in SQL.
Second - regulations tend to change frequently. The users wish to control these changes, and getting them through the slow IT development cycle would delay the introduction of each change.
So in this case the customer's motivations are agility and reduced complexity; more later