Friday, August 15, 2008

On the first class citizens of enterprise computing

These are the citizens of hawktown, chosen to start this Blog posting. The issue of first class citizens was discussed by Paul Vincent in a recent Blog posting that also cites Forrester's Blog on the four classical elements: process, services, events and information.

This is interesting, since it presents events as first class citizens in the computing world. Paul asks whether CEP is a fifth element and proposes the following diagram.

In the introduction class that I give to my students in the event processing course, I show a slide that explains how events relate to the rest of the universe. Interestingly, this slide shows as first class citizens all four elements identified by Forrester -- process, service, event and information -- with the addition of two elements: entity and context. Entity is often considered to be hidden behind the information which represents it, and context has not yet gained recognition as a first class citizen, although some analysts think it will.

This illustration does not provide all connections among all elements; it is an event-centric view of the universe. Let's go briefly over the different connections of events to the rest of the universe:

1. An event can be related to another event - by various types of causality, by the fact that an event is included in a complex event, and by the fact that an event participates in the derivation function of a derived event (further discussion is here).

2. An event can be related to a context - by the fact that it starts or ends a context instance, is classified to a context, and is created within a certain context.

3. An event can be related to a service - by the fact that a service can be either a producer or a consumer of an event: a service may emit events, and an event may be sent to a service, or even invoke a service.

4. An event can be related to a process - by the fact that a process state may be observed as an event, while an event can affect (insert/modify/cancel) a process, or affect the orchestration of processes.

5. An event can be related to an entity - by the fact that an event references entities within a certain role; a more semantic view of event processing can also be based on relationships among entities.

6. An event has various relations to information -- it can be represented as a (kind of) information, a database operation can itself be an event (as in active databases), and the result of derived events can be a database update action.
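Some of the relations listed above can be pictured as fields on an event record. The following is only a minimal Python sketch under my own assumptions: the class name, the field names and the sample values (such as "order-service" or "customer-17") are hypothetical and do not come from the slide.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Event:
    event_type: str
    # Entities referenced by the event, keyed by the role they play in it.
    entities: dict[str, str] = field(default_factory=dict)
    # Context the event is classified to (hypothetical label).
    context: Optional[str] = None
    # Service that emitted the event (hypothetical name).
    producer: Optional[str] = None
    # Input events of the derivation, if this is a derived event.
    derived_from: list["Event"] = field(default_factory=list)


order = Event("OrderPlaced", entities={"buyer": "customer-17"},
              context="business-hours", producer="order-service")
payment = Event("PaymentReceived", entities={"payer": "customer-17"},
                producer="billing-service")
# A derived event keeps references to the events it was derived from.
order_paid = Event("OrderPaid", entities={"buyer": "customer-17"},
                   derived_from=[order, payment])
```

The relations to process and information are left out to keep the sketch short.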

As for Paul's question - how does CEP fit in? - I agree that CEP touches all, but I would not use Venn diagrams to show the connections here.

  • I view event, process, information, entity, service and context as conceptually different terms, thus there is no intersection between any two; there are relations between each pair.
  • The main function of event processing is to process events, meaning to take one or more events as input, do something (transform, enrich, aggregate, find a pattern, derive...) and produce one or more events as output. Interestingly, event processing has relations with all six elements (I'll use EP instead of CEP, since I am using CEP according to its glossary definition and not according to the "marketing" definition - see discussion here):

1. EP consumes and produces events.

2. EP executes within a context that determines what will be executed.

3. EP can be wrapped as a service. Also EP interacts with services.

4. An EPN is a kind of process, but it cannot be managed by typical BPM tools due to its different routing scheme; it can also interact with processes.

5. Entities are not explicitly part of EP, although some types of processing may also process dependencies among entities.

6. EP has various interactions with information - same as events.
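To make the "events in, events out" description concrete, here is a minimal Python sketch of a single aggregation step that runs within a context. It is an illustration under my own assumptions, not a description of any product: the `Event` fields, the function name `aggregate_by_context`, the "LargeTotal" event type and the threshold are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_type: str
    key: str       # hypothetical context key, e.g. a customer id
    amount: float


def aggregate_by_context(events, threshold):
    """Consume input events, partition them by context key, and produce
    one derived event for each context whose total exceeds the threshold."""
    totals = defaultdict(float)
    for event in events:
        totals[event.key] += event.amount
    return [Event("LargeTotal", key, total)
            for key, total in totals.items() if total > threshold]


inputs = [
    Event("Withdrawal", "alice", 300.0),
    Event("Withdrawal", "bob", 50.0),
    Event("Withdrawal", "alice", 800.0),
]
derived = aggregate_by_context(inputs, threshold=1000.0)
```

With these made-up inputs only the "alice" context crosses the threshold, so a single derived event is produced. The output is again a list of events, which is what allows such steps to be chained into a network.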

Note that sometimes there are activities on the conceptual borderline between event processing and other terms. For example, the action of determining what should be done with an event on the consumer side can be done as part of the EP system, or as part of the consumer software, depending on the level of coupling between them.

More, including a discussion of how the term situation fits in this game, later.

Thursday, August 14, 2008

On performance metrics and the new coffee machine

Morning, in my office with the morning coffee and the Blog... In this building the coffee machines are gradually being replaced with new ones. The new machine produces somewhat better coffee, but is noticeably slower. This ties back to one of the topics that I have been working on recently - performance metrics for event processing networks. From the coffee machine I can learn that people are ready to trade one property (speed) for another property (quality), which, of course, indicates that performance metrics typically do not consist of a single property. Even the dimensions themselves are tricky. In a previous posting I indicated that latency in an event processing network may have multiple interpretations; besides this, we can look at minimizing the average latency, or at minimizing the maximal latency. These are not identical -- "real time Java" implementations, which smooth the garbage collection functions, make the maximal latency much lower, but there is a price in average latency (try and observe)... The autonomic computing principle of self-optimization, applied to an event processing network given multiple criteria, is one of the major challenges of the next generation of event processing implementations. This is evolving thinking, so more thoughts on WHAT the optimization parts are and HOW they can be optimized -- in later posts.
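The difference between the two latency criteria can be shown with a few lines of Python. The numbers below are made up for illustration and are not measurements of any system; the names `bursty` and `smooth` are hypothetical labels for a configuration with occasional long pauses and one with smoothed pauses.

```python
from statistics import mean


def summarize(latencies_ms):
    """Return the two latency criteria discussed above for one sample."""
    return {"average": mean(latencies_ms), "maximum": max(latencies_ms)}


# Made-up latency samples in milliseconds.
bursty = [1, 1, 1, 1, 1, 1, 1, 1, 1, 91]  # mostly fast, one long pause
smooth = [12] * 10                        # uniformly a bit slower

configs = {"bursty": summarize(bursty), "smooth": summarize(smooth)}

# The "best" configuration depends on which criterion is minimized.
best_by_average = min(configs, key=lambda name: configs[name]["average"])
best_by_maximum = min(configs, key=lambda name: configs[name]["maximum"])
```

On this sample the bursty configuration wins on average latency (10 ms against 12 ms), while the smooth one wins on maximal latency (12 ms against 91 ms), so an optimizer has to be told which criterion, or which combination of criteria, to pursue.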

Tuesday, August 12, 2008

On Top Down and Bottom Up

My B.A. degree is in Philosophy (well, to be accurate, most of the studies were related to logic). Later I studied business administration and Computer Science, but I think that Philosophy was the most important thing I have learned, as the other studies provided techniques, while Philosophy provides more basic competencies.

When I decided to study Philosophy, my late father, who was a very practical person, invested an entire evening trying to convince me that I was going to waste my time, and asking why I could not study engineering, medicine or law like everybody else. Well, I listened carefully, and went to study philosophy, a decision I have never regretted - actually, some of my friends switched to philosophy when they realized that I was enjoying my studies and they were not.

I was reminded of this time-wasting episode when I caught up on what happened in Blog-land during my vacation, and discovered the blog posting entitled: fallacies of self-fulfilling CEP use case studies. I am quite amused to read this type of Blog, and I hope that the author also feels amused when writing them... well, after being amused for a few minutes, I thought -- today there is a conference call of the EPTS use case workgroup, maybe I should cancel the call, since all the participants, including myself, are going to waste their time, in addition to the time already wasted on fallacies... Fortunately, I have studied Philosophy, and moreover investigated fallacy types a long time before Wikipedia was created, and failed the fallacy in the identification of fallacies (an absence of fallacy event..), so I decided not to cancel the call.... I have a feeling that the participants did not think that they had wasted their time, but who knows...
Enough humor, time to have some serious stuff in this posting -- I think that top-down and bottom-up approaches are a good topic to discuss. While the original posting used the metaphor of space rockets, I'll stay on the earth and use the metaphor of a gadget - let's imagine that some inventor invents a gadget, maybe like this:

This gadget has a lot of features, and it was designed in a totally top-down manner.

At some point customers start to acquire this gadget and use it in different ways for different purposes. The inventors of this gadget have a lot of ideas about what else to do in the next version, but besides the top-down innovation there is also a bottom-up process that control theory calls feedback. Typically, a large enough sample of customers is interviewed to understand: what is the gadget used for? how is it used? are the various features needed? understood? used the same way as imagined by the inventors, or otherwise? are there requirements for this gadget to connect to other gadgets? requirements about the operational aspects? This information can be used both to explain to new customers the experience gained, and to tune the priorities and ideas for the next generation.

Back from the fascinating gadget world to our no less fascinating EP world -- the use case workgroup study is intended to understand both the ways in which EP technologies are used today, and the additional requirements that customers who already use EP technologies are looking at (customers who have not used a certain technology typically have difficulty expressing requirements about it, unless they have studied the area...).

My assumption is that the end result of this study will be beneficial for the entire community - customers who would like the best practices, vendors that design their next generation, researchers who wish to analyze this market etc... However, assumptions are just philosophy and, as such, my assumptions are as good as the counter-assumption that this is all a self-fulfilling fallacy. Since I have grown up and am wearing the scientist hat these days, I suggest taking the empirical approach, namely, being patient, seeing the end result and judging it.

And the bottom line about bottom-up and top-down: top-down and bottom-up work have, in general, complementary roles, and are used in different phases in the life-cycle of products/technologies/areas. Important concepts such as use patterns and best practices are, by definition, bottom-up...

Monday, August 11, 2008

On faithful representation and other comments

Back home from the vacation in Turkey. The vacation took place in the Limak Limra hotel, about a 1.5 hour drive from Antalya airport (see picture of one of the many swimming pools above). It was a great British philosopher who preached "in praise of idleness" to workaholic people like myself. So, not taking the laptop with me, I have learned several things:
1. Unlike the Israeli beach, which consists of soft sand, the beach in Turkey consists of small and large stones.
2. Turkish chefs know how to cook many types of food quite well, but still have a lot to learn about preparing Sushi.
3. The reputation of Charter flights for long delays is actually true (however, this is also true today for many regular flights).

Richard Veryard has sent me an Email about his Blog posting entitled "Faithful Representation", in which he referred to an illustration that I made as a "simple situation model" and attributed this model to both Tim Bass and myself (goodness gracious me!). Tim, who constantly claims that he has a much more general view than I do, could not believe that his name and my name were mentioned in the same sentence as agreeing on something, and asserted (I am using "cut and paste" from Tim's Blog): "Opher tends to view CEP as mostly an extension of active database technology where I see CEP as a technology that is much more closely aligned with the cognitive models".

Here are some comments:

1. The illustration that Richard is quoting is not meant to explain what a situation is, but to show the relations among several concepts. I am enclosing it again -

As can be seen, I am writing there that composite events (which are taken from active database terminology) and complex events (which are not) may both represent situations, which does not say that this is the only way to represent a situation (just as saying that a fish is an animal does not define what an animal is).

2. I have explained the basic idea of situation in this posting. Simply said, a situation is a concept in the "real world" domain (not in the computer domain) that requires a reaction. In some cases a single event determines a situation; in some cases detecting a pattern determines a situation; and in other cases patterns only approximate the notion of situation, and there is no 1-1 mapping between events and situations. Note that in that posting I have also provided an example of non-deterministic situations.

3. Regardless of the situation definition, Richard is absolutely right that all over the event processing life-cycle we may have instances in which the events are inaccurate or uncertain, and the reader is referred to this posting for some examples of the uncertainty issues we are dealing with. This is an area that I have been investigating in the last few years together with Avi Gal from the Technion and Segev Wasserkrug (our joint Ph.D. student, who graduated recently with a Ph.D. dissertation that was denoted as excellent by the exam committee). Hot from the oven - a paper about it is published in the recent (August 2008) issue of IEEE Transactions on Knowledge and Data Engineering, which is dedicated to "SPECIAL SECTION on Intelligence and Security Informatics". The actual paper can be downloaded from Avi Gal's website. Another paper related to the same study has been presented in DEBS 2008.

4. While I totally agree that in some cases handling uncertainty is needed - and certainly some security applications are an example - I also believe that the potential market for the more basic deterministic world is much bigger, and we are far from picking all the low-hanging fruit of deterministic event processing.

5. We still have challenges in defining the semantics of the different cases of handling uncertain events/patterns/situations. The fact that there are arithmetics of uncertainty helps, but not everything that exists in AI research fits the real world requirements of scalability, performance etc..

6. About the comment of me viewing event processing as an extension of active database technology -- I view event processing as a discipline in its own right (and this is a topic for another discussion, which I'll defer). It has origins in several disciplines; one of them is active databases, but it has several more ancestors - sensor fusion, discrete event simulation, distributed computing/messaging/pub-sub and some more - and it draws concepts from each of them. Anybody who reads my Blog can realize that there is a fundamental difference between active databases, which extend database engines, and event processing, which is not based on database technology; there are some other differences too.

7. My friendly advice to Tim is that before he makes assertions about how and what people think (and this does not refer necessarily to myself) he should re-read his own excellent posting: "red herring fallacies".

More on event processing as a discipline - at a later post.