Thursday, August 14, 2008

On performance metrics and the new coffee machine


Morning, in my office with the morning coffee and the Blog... In this building the coffee machines are gradually being replaced with new ones. The new machine produces somewhat better coffee, but is noticeably slower. This ties back to one of the topics I have been working on recently - performance metrics for event processing networks. From the coffee machine I can learn that people are ready to trade one property (speed) for another property (quality), which, of course, indicates that a performance metric typically does not consist of a single property. Even the dimensions themselves are tricky; in a previous posting I have indicated that defining latency in an event processing network may have multiple interpretations, and besides this we can look at minimizing the average latency, or minimizing the maximal latency. These are not identical -- "real-time Java" implementations which smooth the garbage collection functions make the maximal latency much lower, but there is a price in average latency (try and observe)... The autonomic computing principle of self-optimization, applied to an event processing network given multiple criteria, is one of the major challenges of the next generation of event processing implementations. This is evolving thinking, so more thoughts on WHAT the optimization parts are and HOW they can be optimized -- in later posts.
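The average-versus-maximal latency tradeoff can be illustrated with a minimal sketch; the latency numbers below are invented for illustration, not measurements of any actual JVM:

```python
# Hypothetical end-to-end latencies, in milliseconds.
# "standard" is fast on average but suffers one long garbage-collection
# pause; "realtime" smooths collection at a small constant cost per event.
standard = [2.0] * 99 + [150.0]   # one 150 ms GC spike in 100 events
realtime = [5.0] * 100            # steady 5 ms, never spikes

def avg(latencies):
    return sum(latencies) / len(latencies)

# "standard" wins on average latency, "realtime" wins on maximal latency:
print(avg(standard), max(standard))
print(avg(realtime), max(realtime))
```

Which machine (or engine) is "faster" depends entirely on which goal function is chosen.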

Tuesday, August 12, 2008

On Top Down and Bottom Up

My B.A. degree is in Philosophy (well, to be accurate, most of the studies were related to logic); later I studied business administration and Computer Science, but I think that Philosophy was the most important thing I have learned, as the other studies provided techniques, while Philosophy provided more basic competencies.

When I decided to study Philosophy, my late father, who was a very practical person, invested an entire evening trying to convince me that I was going to waste my time, and asking why I could not study engineering, medicine or law like everybody else. Well, I listened carefully, and went to study philosophy, a decision I have never regretted - actually some of my friends switched to study philosophy when they realized that I was enjoying my studies and they were not.

I was reminded of this episode of time wasting while catching up on what happened in Blog-land during my vacation, when I discovered a blog posting entitled: fallacies of self-fulfilling CEP use case studies. I am quite amused to read this type of Blog; I hope that the author also feels amused when writing them... Well, after being amused for a few minutes, I thought -- today there is a conference call of the EPTS use case workgroup; maybe I should cancel the call, since all the participants, including myself, are going to waste their time in addition to the time already wasted on fallacies... Fortunately, I studied Philosophy, and moreover investigated fallacy types long before Wikipedia was created, and found the fallacy in the identification of fallacies (an absence-of-fallacy event...), so I decided not to cancel the call... I have a feeling that the participants did not think that they wasted their time, but who knows...
Enough humor; time for some serious content in this posting -- I think that top-down and bottom-up approaches are a good topic to discuss. While the original posting used the metaphor of space rockets,

I'll stay on earth and use the metaphor of a gadget - let's imagine that some inventor invents a gadget, maybe like this:


This gadget has a lot of features, and it was designed in a totally top-down manner.

At some point customers start to acquire this gadget and use it in different ways for different purposes. The inventors of the gadget have a lot of ideas about what else to do in its next version, but besides the top-down innovation there is also a bottom-up process that is called, in control theory, feedback. Typically, a large enough sample of customers is interviewed to understand: what is it used for? how is it used? are the various features needed? understood? used in the way the inventors imagined, or otherwise? are there requirements for this gadget to connect to other gadgets? requirements about operational aspects? This information can be used both to share the experience gained with new customers, and to tune the priorities and ideas for the next generation.
Back from the fascinating gadget world to our no less fascinating EP world -- the use case workgroup study is intended to understand both the ways in which EP technologies are used today, and the additional requirements that customers who have already used EP technologies are looking at (customers that have not used a certain technology typically have difficulty expressing requirements about it, unless they have studied the area...).

My assumption is that the end result of this study will be beneficial for the entire community - customers who would like to learn best practices, vendors that design their next generation, researchers who wish to analyze this market, etc... However, assumptions are just philosophy and as such, my assumptions are as good as the counter-assumption that this is all a self-fulfilling fallacy. Since I have grown up and am wearing the scientist hat these days, I suggest taking the empirical approach, namely, being patient, seeing the end result, and judging it.


And a bottom line about bottom-up and top-down: top-down and bottom-up work have, in general, complementary roles, and are used in different phases of the life-cycle of products/technologies/areas. Important concepts such as use patterns and best practices are, by definition, bottom-up...


Monday, August 11, 2008

On faithful representation and other comments



Back home from vacation in Turkey. The vacation took place in the Limak Limra hotel, about a 1.5-hour drive from Antalya airport (see picture of one of the many swimming pools above). It was a great British philosopher who preached to workaholics like myself "in praise of idleness". So, not having taken the laptop with me, I learned several things:
1. Unlike the Israeli beach which consists of soft sand, the beach in Turkey consists of small and large stones;
2. Turkish chefs know how to cook many types of foods quite well, but have a lot to learn still in preparing Sushi,
3. The reputation of charter flights for long delays is actually deserved (however, this is also true today for many regular flights).


Richard Veryard sent me an Email about his Blog posting entitled "Faithful Representation", in which he referred to an illustration that I made as a "simple situation model" and attributed this model to both Tim Bass and myself (goodness gracious me!). Tim, who constantly claims that he has a much more general view than mine, could not believe that his name and my name were mentioned in the same sentence as agreeing on something, and asserted (I am using "cut and paste" from Tim's Blog): "Opher tends to view CEP as mostly an extension of active database technology where I see CEP as a technology that is much more closely aligned with the cognitive models".


Here are some comments:


1. The illustration that Richard is quoting is not meant to explain what a situation is, but to show the relations among several concepts; I am enclosing it again -



As can be seen, I write there that composite events (which are taken from active database terminology) and complex events (which are not) may both represent situations, which does not say that this is the only way to represent a situation (just as saying that a fish is an animal does not define what an animal is).

2. I have explained the basic idea of a situation in this posting; simply said - a situation is a concept in the "real world" domain (not in the computer domain) that requires reaction. In some cases a single event determines a situation, in some cases detecting a pattern determines a situation, and in other cases patterns only approximate the notion of a situation, and there is no 1-1 mapping between events and situations. Note that in that posting I also provided an example of non-deterministic situations.
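To make these cases concrete, here is a toy sketch; the event types and the three-withdrawals pattern are invented for illustration, and this is not any product's API:

```python
def detect_situations(events):
    """Map a stream of events to situations; note there is no 1-1 mapping."""
    situations = []
    withdrawals = 0  # internal state for a pattern: three withdrawals
    for event in events:
        if event == "fire_alarm":
            # A single event determines a situation directly.
            situations.append("fire")
        elif event == "withdrawal":
            withdrawals += 1
            if withdrawals == 3:
                # A detected pattern determines (or only approximates!)
                # the situation of suspected fraud.
                situations.append("suspected_fraud")
                withdrawals = 0
        # Other events ("deposit", ...) determine no situation at all.
    return situations

stream = ["withdrawal", "deposit", "withdrawal", "withdrawal", "fire_alarm"]
print(detect_situations(stream))  # ['suspected_fraud', 'fire']
```

Note that the pattern here is only a proxy: three withdrawals may or may not correspond to the real-world situation of fraud, which is exactly the approximation point made above.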

3. Regardless of the situation definition, Richard is absolutely right that all over the event processing life-cycle we may have instances in which the events are inaccurate or uncertain, and the reader is referred to this posting for some examples of the uncertainty issues we are dealing with. This is an area that I have been investigating in the last few years together with Avi Gal from the Technion and Segev Wasserkrug (our joint Ph.D. student, who recently graduated with a Ph.D. dissertation denoted as excellent by the exam committee). Hot from the oven - a paper about it is published in the recent (August 2008) issue of IEEE Transactions on Knowledge and Data Engineering, which is dedicated to a "SPECIAL SECTION on Intelligence and Security Informatics". The actual paper can be downloaded from Avi Gal's website. Another paper related to the same study was presented at DEBS 2008.

4. While I totally agree that in some cases handling uncertainty is needed - and certainly some security applications are examples - I also believe that the potential market for the more basic deterministic world is much larger, and we are far from picking all the low-hanging fruit of deterministic event processing.

5. We still have challenges in defining the semantics of the different cases of handling uncertain events/patterns/situations. The fact that there are arithmetics of uncertainty helps, but not everything that exists in AI research fits the real-world requirements of scalability, performance, etc.

6. About the comment that I view event processing as an extension of active database technology -- I view event processing as a discipline in its own right (a topic for another discussion, which I'll defer). It has origins in several disciplines; one of them is active databases, but it has several more ancestors - sensor fusion, discrete event simulation, distributed computing/messaging/pub-sub and some more - and it draws concepts from each of them. Anybody who reads my Blog can see that there is a fundamental difference between active databases, which extend database engines, and event processing, which is not based on database technology; there are some other differences too.

7. My friendly advice to Tim is that before he makes assertions about how and what people think (and this does not necessarily refer to myself), he should re-read his own excellent posting: "red herring fallacies".

More on event processing as a discipline - at a later post.

Tuesday, August 5, 2008

On latency in event processing network


Packing, on my way to a family vacation in Antalya, Turkey.

Following an interesting discussion yesterday about performance metrics -- it turns out that the definition of latency in an event processing network is quite tricky. The reason is that an event can move along multiple paths over the network: on some it is filtered out, on some it just enters the internal state of an agent since it does not complete a pattern, and on some it may complete a pattern and trigger a reaction. Thus there are various ways to define the metric here. This is important since optimization has to take into account the goal function - what is being optimized. More discussion on this area -- after I return. I'll be out of touch for a week.
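A small sketch of the ambiguity (the event records, timestamps, and outcome labels below are invented for illustration, not standard terminology): depending on whether we average over all events or only over those that trigger a reaction, the same run yields different "latency" numbers.

```python
# Each record: entry time, exit time (seconds), and what happened to it.
events = [
    {"in_ts": 0.0, "out_ts": 0.004, "outcome": "filtered"},   # filtered out
    {"in_ts": 0.0, "out_ts": 0.009, "outcome": "stored"},     # partial match only
    {"in_ts": 0.0, "out_ts": 0.021, "outcome": "reaction"},   # completed a pattern
    {"in_ts": 0.0, "out_ts": 0.018, "outcome": "reaction"},
]

def mean_latency(events, outcomes):
    samples = [e["out_ts"] - e["in_ts"] for e in events if e["outcome"] in outcomes]
    return sum(samples) / len(samples)

# Two defensible, but different, definitions of "the" latency:
over_all_events = mean_latency(events, {"filtered", "stored", "reaction"})
over_reactions = mean_latency(events, {"reaction"})
```

Neither definition is wrong; they simply optimize different things, which is why the goal function must be fixed before any optimization is attempted.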

Friday, August 1, 2008

A prelude to the 4th event processing symposium



Last night the tentative agenda of the 4th event processing symposium was sent to the EPTS members. I have issued a call for contributions in various forums, including this Blog.
I would like to thank all those who contributed ideas and volunteered to give a talk or participate in a panel; while I am subjective, it seems to me that we'll have a very interesting program this time.
What are the goals of this meeting?
I believe that F2F meetings are vital for any community. Even in the era of everything virtual, the best means of communication is still a F2F meeting; communities consist of people, and their success is largely dependent upon the ability of people to work together, the chemistry among people, and the communication among people. The informal atmosphere of these meetings, and the restricted number of participants, enable good interactions both in the conference room and in the corridors/restaurant, etc.
Besides the social aspect, the main goal of this meeting is to determine the business agenda of EPTS for the next year, mainly in thinking about work groups, their scope and intended impact.
While the business meeting will be conducted on the third day, the first two (long) days will serve as background.

Some highlights:

four invited talks:

  • A VC investor who has recently invested in an EP company will tell us his perspective about the EP market (hype vs. reality)
  • A customer who has experience in using EP products will tell us his perspective and view of the future.
  • A standards expert will tell us about the impact of standards on communities and areas.

Each invited talk will have a follow-up panel: the first panel with vendors' business managers and analysts, the second panel with customers in various industries (not just finance), the third with CTOs and senior researchers, and the fourth with various people involved in standards. There will also be a session of the two existing work groups: the glossary, and the use cases. The use cases workgroup will present its template and some use cases described using this template. We'll also have a research session to view contributions from the research community.

At the end of the second day we'll hear some topics for thought, including: event processing as a service, event distribution in heterogeneous environments, and academic programs/courses related to EP.

Participation is by invitation only, and is intended primarily for EPTS members; however, there will be some slots available to non-members. If you are interested in participating please send an Email message to: <events@ep-ts.com>.

Saturday, July 26, 2008

On Patagonia Dinosaurs and Disruptive Technologies


Saturday night, at home in Haifa. Today I went along with (most of) my family to visit the fossils of some Patagonia Dinosaurs. The exhibition took place in the local technology museum, and since the last time I had been there they have eliminated the museum's parking lot, asking visitors to find parking in the city center -- which is not easy even on regular days, and becomes more difficult when many people somehow have the same idea of spending Saturday noon looking at old lizards... Anyway, after two rounds I found a good parking place, making a note that elections for the mayor of Haifa are coming soon.

Anyway, looking at the fossils, among the many people who were there, I also looked at a poster explaining the various hypotheses about why these animals became extinct.
This drew my thinking in two directions -- when will humanity become extinct, and, getting back to technology -- when does a disruptive technology make previous technologies obsolete? In our case -- will we have a second (or third) generation of event processing technologies which will be disruptive for everything that exists today? Well, we can only speculate at this point, but it is a good topic to think about... This can be one of the questions we'll ask the senior technologist panel at the EPTS F2F meeting. More - later.

Thursday, July 24, 2008

On optimization criteria for EP applications


This picture shows optimization of sitting on chairs; I actually know a person who sits on a big ball when he works, claiming it is good for his back. I have read with interest Paul Vincent's report on the OMG Real-Time workshop (since I cannot be everywhere, it is good that other people report on what's happening, and Paul is especially good at reporting on conferences); in this meeting there was a discussion about metrics for how to measure event processing applications. We don't have a standard benchmark yet, and I don't believe in a single benchmark that fits all -- rather, in a collection of benchmarks based on a classification of applications. I would like to go deeper into the issue of "runtime performance" mentioned there -- interestingly, "runtime performance" means different things to different people, and indeed different applications have different requirements -- if we just look at the metrics of latency and throughput, then we have the following variations of goal functions (this is probably not a complete list):
  • min (average e2e latency)
  • min (max e2e latency)
  • min (variance e2e latency)
  • min (deviation from time constraints)
  • max (input throughput)
  • max (output throughput)

The metrics are not identical - in latency there is a difference if the goal is to minimize average latency or to minimize maximal latency. For example, in Java the maximal latency can suffer from garbage collection that makes it atypically high, while "real-time Java" implementations that smooth the garbage collection minimize the maximal latency, but the price is that the average latency may grow. Throughput can be measured by input or output events, which are not really identical. Each of these goal functions indicates a different kind of optimization, and this is just by looking at the two parameters of throughput and latency...
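The goal functions listed above can be computed mechanically; here is a sketch (the latency samples, the time constraint, and the event counts are all invented for illustration):

```python
import statistics

e2e_latencies = [3.0, 4.0, 2.5, 9.0, 3.5]  # hypothetical e2e latencies (ms)
deadline = 5.0                              # assumed per-event time constraint (ms)
events_in, events_out = 1000, 850           # filtered events never come out

goal_functions = {
    "avg_e2e_latency": statistics.mean(e2e_latencies),       # minimize
    "max_e2e_latency": max(e2e_latencies),                   # minimize
    "var_e2e_latency": statistics.pvariance(e2e_latencies),  # minimize
    "deadline_deviation": sum(max(0.0, x - deadline)         # minimize
                              for x in e2e_latencies),
    "input_throughput": events_in,                           # maximize
    "output_throughput": events_out,                         # maximize
}
```

An engine tuned to minimize the maximum (9.0 ms here) may well worsen the average, which is exactly the real-time-Java tradeoff described above; the choice of goal function, not the numbers alone, determines which engine "performs better".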

This poses two interesting questions: will there be a partition of the market according to optimization capabilities, or will we be able to build adaptive software that can be tuned to multiple optimization criteria? More about performance metrics - later.