Saturday, September 29, 2007

The trap of ambiguity - case of "event correlation"

One of the terms missing from the event processing glossary is "event correlation", yet I don't miss this term at all; in fact, I have been trying to lose it for years. The main reason is the homonym effect, in which the same term is used with different meanings. Here is the collection of different meanings of this term that I know of:

1. A network and system management application: following the ComputerWorld article - "Event correlation simplifies and speeds the monitoring of network events by consolidating alerts and error logs into a short, easy-to-understand package" - in this case the idea is to reduce the number of symptoms and concentrate on problems.

2. Some refer to the pattern detection part of CEP as "event correlation".

3. Some call the matching between events and data (e.g., for enrichment) "event correlation".

4. Some use the term in the sense of the "correlation identifier" pattern from the enterprise integration patterns: the act of matching two events based on shared attributes.

5. Last but not least, event correlation refers to statistical correlation between the occurrences of two events, used to mine patterns of causality.
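
To make the fourth meaning concrete, here is a minimal sketch of matching events on a shared correlation identifier; the event shape and field names are my own illustrative assumptions, not part of any standard:

```python
# Sketch of the "correlation identifier" matching (meaning 4): pair a
# request-like event with its response-like event via a shared attribute.
# Event dictionaries and the field name "correlation_id" are illustrative.

def correlate(requests, responses, key="correlation_id"):
    """Return (matched pairs, unmatched responses), keyed on a shared attribute."""
    pending = {e[key]: e for e in requests}  # index requests by correlation id
    pairs, orphans = [], []
    for resp in responses:
        req = pending.pop(resp[key], None)
        if req is not None:
            pairs.append((req, resp))        # correlated: same identifier
        else:
            orphans.append(resp)             # no matching request seen

    return pairs, orphans

requests = [{"correlation_id": "o-17", "type": "OrderPlaced"}]
responses = [{"correlation_id": "o-17", "type": "OrderShipped"},
             {"correlation_id": "o-99", "type": "OrderShipped"}]
pairs, orphans = correlate(requests, responses)
```

Note that this is correlation in the deterministic, attribute-matching sense; it has nothing to do with meaning 5, the statistical sense.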

So what is the problem with homonyms? Since there is already a general confusion of terms in this area, homonyms tend to intensify the confusion, especially among decision makers who don't really have technical depth and who make decisions based on perceived impressions -- this may have undesired results.

Thus, I prefer to use unambiguous terms as much as possible, and to avoid confusing slang.

This is just one example of confusing terminology; in a recent incident I watched people spend much energy trying to understand the meaning of BAM (which I may discuss another time).... The consensus glossary is not just a "nice to have", it is a business tool that can save a lot of time lost to miscommunication.... More, later.

Thursday, September 27, 2007

More on EDA is EDA and SOA is SOA

Hello from Haifa again. My good friend Tim Bass has written a blog entitled EDA is EDA and SOA is SOA. I totally agree that EDA and SOA are not the same; more than that, I don't even think they are talking about the same thing. Many people equate SOA with the synchronous "request-reply" style of interaction, but this is only one way to implement SOA. SOA is not about interaction style; it is about modularization or componentization of the enterprise, and hence of its IT systems, introducing them as "services". The interaction style is a secondary issue: services can interact in both synchronous and asynchronous ways, and there is nothing inherent in the concept of services that says anything about the interaction type.

EDA, by contrast, is indeed about an asynchronous push interaction style. This can be interaction between services, in which case we get both SOA and EDA, or interaction between other software artifacts, in enterprises that don't implement SOA. SOA and EDA each make sense for their own reasons and solve their own problems, and they are completely orthogonal. In enterprises that implement SOA it makes sense to use a combination of both; thus, it is important to position them as complementary and not contradictory, since there are some misconceptions about this (exactly because some people interpret SOA as request/response). I may dedicate another blog to an in-depth discussion of these two concepts.

A side point, since this blog deals with "event processing": event processing is not always implemented on top of EDA. There are cases in which event processing has "request-response" functions, such as getting events in PULL from the producer (which may be done periodically or on demand), or an event processing network that is implemented as part of a transaction, and thus does not fulfill the loosely coupled principle of SOA. Thus "event processing" is not pure EDA.... confused?
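
To illustrate the distinction between the two interaction styles mentioned above, here is a toy sketch (the class and method names are my own assumptions) of a producer that supports both push-style subscription and pull-style polling:

```python
# Toy illustration of the two acquisition styles: push (EDA-style, the
# producer notifies subscribers) versus pull (request-response style, the
# consumer polls the producer on demand). All names here are illustrative.

class EventProducer:
    def __init__(self):
        self._log = []          # events retained for pull-style consumers
        self._subscribers = []  # callbacks for push-style consumers

    def subscribe(self, callback):
        """Push style: the consumer registers a callback and is notified."""
        self._subscribers.append(callback)

    def emit(self, event):
        self._log.append(event)
        for cb in self._subscribers:
            cb(event)           # push: the producer drives the interaction

    def poll(self, since=0):
        """Pull style: the consumer asks for events on demand."""
        return self._log[since:]

producer = EventProducer()
received = []
producer.subscribe(received.append)       # push consumer
producer.emit({"type": "TradeExecuted"})  # both consumers see this event
pulled = producer.poll()                  # pull consumer, request-response
```

The same event flow reaches both consumers; only the coupling differs -- the push consumer is driven by the producer, while the pull consumer initiates a request-response exchange.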
more - later.

Tuesday, September 25, 2007

VLDB - and computer science 2.0

Hello from Vienna. Today the VLDB conference started with an interesting talk by Werner Vogels, the CTO of Amazon, whose blog is entitled AllThingsDistributed, about the framework they have built (and that other retailers use); he referred to Amazon as a technology company that happens to do retail. I think there are many touch points between event processing technology and the Amazon model, but he did not talk about them.

I have been an Amazon customer for years. Somewhere in the late nineties I remember ordering a bunch of books from Amazon and not receiving them in the designated time. I sent an email to Amazon asking about it, and the answer amazed me: "We don't know what happened, we are sending the order again." A day later I received the original shipment, and sent another email to Amazon: I got the original shipment, you may stop the substitute one. The answer I got was even more amazing: "We cannot trace an order once it was issued, keep the books with our compliments." It seems that now they know how to track their orders.

The other two keynote speakers were Mike Stonebraker and Michael Brodie, two old-timers who have been around for a while. Stonebraker gave some variation on his repeating message, "One size fits all: a concept whose time has come and gone", which presents the elephants' (Oracle, Microsoft, IBM) DBMS products as an obsolete concept, and shows that for various types of functionality (including "stream processing", of course) a specialized engine is better than a monolithic one; in fact, he claims, the monolithic engines excel at nothing and should be eliminated. The idea that one size does not fit all is probably true, in databases (and also in event processing). One thing to note (and this follows Mike's talk in EDAPS yesterday as well): he looks at everything through a single criterion -- speed (latency?). I think that reality is a little bit more complex.

Mike Brodie started with a nice video, with music that kept getting louder, showing facts about quantities -- the size of various databases, internet webpages, use of search engines, etc. -- and a trend (the time in which everything doubles is getting shorter and shorter). He also talked briefly about SOA, and about the need to take a new approach that is application-based and semantics-based, and to create Computer Science 2.0. However, I did not understand what new science is required, and in response to a question he answered: I presented the problems, leaving the solutions to you. I am not sure that I have understood the problem (except for engineering issues), but let's wait and see if Computer Science 2.0 will arrive (I think the term 2.0 is starting to be over-hyped; there were some attempts at SOA 2.0 as a combination of SOA and EDA, but I am not sure it caught on as a buzzword). Anyway, whatever Computer Science 2.0 is, event processing should be one of its fundamentals. More later.

Monday, September 24, 2007

EDAPS-07 and event processing research community

Hello from Vienna again. Today the event processing meetings festival, which started with the EPTS meeting exactly a week ago, continued (and ended) in Vienna with the EDAPS workshop, adjacent to the VLDB conference. The EDAPS workshop was chaired by Ling Liu and myself. Unlike the previous meetings, EDAPS is a scientific workshop, where the accepted papers were selected by a program committee. This has been the second time we have run this workshop, and it was attended by around 20 people, with interesting papers.

One of the issues we discussed in the closing session is the creation of a unified research conference. This year there have been three such conferences: DEBS 2007, DEPSA 2007 and EDAPS; besides these, other conferences like RULE-ML 2007 have tracks on "reaction rules", while database conferences have tracks on "streams". Indeed, event processing has many roots -- verification/simulation, active databases, stream management, pub/sub, distributed computing, rules, programming languages, sensor networks, and maybe I forgot something -- but the challenge is to try and build a community of researchers whose primary discipline is event processing. A first step is to try and devise one annual conference that will unify all forces. We have proposed to the DEBS steering committee to open up and make DEBS such a conference, the "flagship" of the event processing community, and decided not to hold an EDAPS instance in 2008, to give DEBS a chance to build a strong conference. Alex Buchmann, who will chair the program committee of DEBS 2008 in Rome, will work to extend the program committee, and we'll help him recruit industry/vendor participation. In my opinion, a research community associated with event processing is important for the existence of event processing as an area and a discipline.

One last comment about EDAPS: we invited Mike Stonebraker to be the keynote speaker. Mike is certainly a great speaker; however, he chose to wear his vendor hat and delivered a sales pitch, with some assertions and generalizations that I wonder if Mike Stonebraker the distinguished scientist would have accepted for publication... well. Tomorrow, the VLDB conference -- more later.

Sunday, September 23, 2007

Baggage handling -- how event processing can help people like me





Hello from Vienna, Austria. The baggage carousel was (almost) the first thing I saw in the Vienna airport; those who know me realize that this is one of the places I most dislike. I try hard not to check luggage, which usually works (I travel with slight overweight on my carry-on luggage, but most airlines allow it), but I came across a strict (manual) agent at the Swiss ticket counter in Miami, and thus had to deposit my bag (in Hebrew there is a one-letter difference between the word meaning "deposit" and the word meaning "abandon", so I always use the second word to describe handing luggage over to an airline). I don't trust airlines to get my luggage there on time -- over the years, my luggage has failed to arrive on time three times, which may not be considered a lot given my travel quantity, but believe me, every such time is a big hassle. Moreover, I have read a statistic that 1.5% of checked luggage does not arrive on time. Anyway, today there was a happy ending -- after 10 minutes of waiting I got my luggage and hurried out of there.

Since this blog is about event processing and not about my personal eccentricities, I am thinking about how event processing can help airlines restore confidence in their baggage handling systems and make neurotic passengers like me more relaxed -- well, it seems that all types of event processing can play here:


  • Track and trace system with notifications, letting the passenger ask questions (or get notifications) confirming that the luggage is on the right aircraft and, when standing near the carousel, providing an estimate of when it is coming (with some RFID readers all over).

  • BAM system that determines if something is going wrong -- the luggage did not arrive at some place, or arrived at the wrong place.

  • Automatic decision system that re-routes the luggage if the passenger was reassigned.

  • Anything else ?


I personally would be willing to pay for a service that notifies me of the status of my checked luggage, especially when standing near that carousel... this is not very difficult to construct...
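
As a toy illustration of the BAM idea above, here is a sketch of detecting luggage that was loaded but never scanned at arrival within a deadline (an "absence" pattern); the event types, field names, and deadline are my own illustrative assumptions:

```python
# Toy BAM-style detection: alert on luggage loaded onto the aircraft but
# not seen at the arrival scan within a deadline. Event names, fields,
# and the 90-minute deadline are illustrative assumptions.

def missing_luggage(events, deadline_minutes=90):
    """Return tag ids loaded on the aircraft but not scanned at arrival in time."""
    loaded, arrived = {}, {}
    for e in events:
        if e["type"] == "LoadedOnAircraft":
            loaded[e["tag"]] = e["minute"]
        elif e["type"] == "ArrivalScan":
            arrived[e["tag"]] = e["minute"]

    alerts = []
    for tag, t_load in loaded.items():
        t_arrive = arrived.get(tag)
        if t_arrive is None or t_arrive - t_load > deadline_minutes:
            alerts.append(tag)  # absence (or late arrival) detected
    return alerts

events = [
    {"type": "LoadedOnAircraft", "tag": "TAG1", "minute": 0},
    {"type": "LoadedOnAircraft", "tag": "TAG2", "minute": 0},
    {"type": "ArrivalScan", "tag": "TAG1", "minute": 75},
]
alerts = missing_luggage(events)
```

In a real deployment the "absence" would of course be detected by a timer firing when the window closes, rather than by a batch scan; the sketch only shows the pattern logic.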



Another small matter is "ultra personalization": after arriving in Austria, not only are people talking to me in German, but so are computerized systems. I had to purchase Internet time in my hotel -- T-Mobile has German-only text, and after a series of guesses and some trial and error, I succeeded in logging in. This blogger system also talks German to me now -- in fact this is not personalization, this is location-based classification with wrong assumptions.... which brings up an interesting discussion about the term "context", which I view as one of the major abstractions useful also in event processing -- more about it later.


Saturday, September 22, 2007

The mythical "event per second"

Still in Orlando; the Gartner Event Processing Summit is now behind us, and the Gartner folks are happy with the turnout and have promised another summit in August 2008. The closing keynote speaker was David Luckham, who needs no introduction in this community, and who talked about the past, present and future of event processing. David has positioned himself as the prophet and the person setting challenges for the community, and talked about "creeping event processing" (present), "event processing as a first-class citizen in computing" (five years from now), and ubiquitous event processing, where event-driven architecture and event processing will be fundamental to the computing infrastructure, with the vision of dynamic event processing networks that consist of an Internet-scale number of agents, dynamically created and destroyed.

Another interesting talk was that of Robert Almgren, who has a long academic record but currently works as Managing Director, Head of Quantitative Strategies for Banc of America Securities. He provided an excellent introduction to algorithmic trading, one of the most pervasive applications of event processing products today. One sentence in his talk attracted my attention: "vendors are talking about 200,000 events per second; this is great, but far from what we need -- the typical load in our case is 7,000 events per second, with a peak of 13,000 events per second, and any product in the market can deal with these loads with no problems". He further pointed out that the main effort has been invested in connectivity with other applications, not in setting up the algo trading system itself. This brings me to earlier thoughts about the "mythical event per second". Old-timers may remember "The Mythical Man-Month", which discussed how problematic it is to estimate time durations in software projects.
While some vendors try to make "events per second" a main property, there are two questions. The first: what does it mean? Is it throughput? Well, one can have very large buffers within the boundaries of the system and make sure that all events sit in the buffer and will be processed eventually -- this is a good property, but it does not make the system high performance. What matters is how many events can actually be processed in a second (thus, latency is impacted). And what do we mean by "process"? This is like talking about "transactions per second" without specifying what happens between the beginning and the end of the transaction; it can mean anything from filtering out the event to using it (and its descendants) in 2,000 pattern detections. Thus, without a benchmark, this term is meaningless.

However, the second question is even more interesting: how critical is high performance? Some vendors have an interest in making it a major requirement, and if you ask a customer "do you want high performance?", nobody will say no -- which reminds me that I once interviewed a customer and asked whether he needed high throughput; the answer was "yes, of course". Then I asked about the quantity, and the answer was "around 10,000 events per hour" -- and if you think about it, for a human this is high throughput. While there are certainly some applications that need high performance, the evidence shows that the majority (say 95%) of candidate applications do not require very high performance, since the main value of event processing is the abstractions that mitigate the complexity, not the throughput. I remember a discussion with the CIO of a bank, who said: well, today we are doing these things in batch; let's learn how to walk before we start dancing. I think we need to teach the customers how to walk first -- use the right abstractions, integrate with their systems, etc. -- as a higher priority.
Bottom line: I think that "high performance" in event processing is somewhat hyped, and is only one of a series of considerations, certainly not the most important for most applications. I started with David Luckham's talk, and will end by saying that EP frameworks like the dynamic event processing network mentioned before will, by definition, resolve scalability issues at the framework level rather than at the engine level.
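
To illustrate the buffering point above -- that a high intake rate is not the same as a high processing rate -- here is a toy simulation with made-up numbers:

```python
# Toy illustration: a system that buffers arrivals can "accept" a high
# events-per-second intake while actually processing far fewer, with the
# backlog (and hence latency) growing without bound. All numbers are made up.

def simulate(arrival_rate, service_rate, seconds):
    """Events arrive at arrival_rate/s and are processed at service_rate/s.
    Returns (events actually processed per second, final backlog size)."""
    backlog = 0.0
    processed = 0.0
    for _ in range(seconds):
        backlog += arrival_rate            # intake: everything is buffered
        done = min(backlog, service_rate)  # but only this much is processed
        processed += done
        backlog -= done
    return processed / seconds, backlog

# "Accepting" 200,000 ev/s while actually processing only 7,000 ev/s:
throughput, backlog = simulate(arrival_rate=200_000, service_rate=7_000,
                               seconds=10)
```

After ten seconds the effective throughput is just the service rate, and nearly two million events are queued -- exactly why an "events per second" figure is meaningless without saying what "process" means and where latency ends up.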

More on frameworks vs. engines - later.

Thursday, September 20, 2007

The role and prospects of standards in event processing


Still in Orlando. We finished the EPTS meeting yesterday (I'll summarize it within the next few days), and I am staying for the Gartner Event Processing Summit. The fact that Gartner has run a commercial conference indicates that the level of interest in event processing has crossed a meaningful threshold. To participate in the Gartner meeting, I also have an IBM "booth duty", which requires me to dress in a way that I am not accustomed to (disguised as a civilized business person) -- somebody told me to put a picture on the blog of how I look, since in reality this is a rare event and you have a low probability of seeing me this way; that is the picture you see at the top.

Anyway, one of the topics mentioned at the Gartner meeting by Roy Schulte is that SOA became a reality only after web services standards started to be pervasive, and he advocated for standards as a major enabler (and the lack of standards as a major obstacle). We had long discussions in the EPTS meeting about standards; it seems there is agreement to advance towards standards, but with some caution. First, some people were not sure that our understanding of event processing is mature enough for standards (this is IMHO also true for the SQL extension proposal that is going on -- I have some doubts about the maturity of their thinking)... On the other hand, we need to show customers that we are working towards standards, and try to make the thinking more mature (more when I summarize the EPTS meeting). Anyway, we can classify the standards into three main groups:
(1). Modelling standards.
(2). Interoperability standards.
(3). Language standards.
OMG is about to issue an RFP for modelling standards, and I guess we'll deal with this issue a lot over the next year. On the language standard, I have already discussed the idea of a meta-language first, and it seems to be getting some momentum. Interoperability standards are more difficult, since there are a lot of related standards both on event structures and on event transport, and we'll probably defer this discussion for later.
Is there a way to accelerate the process of getting to standards? Probably yes, but it requires a level of investment from the community much higher than has been invested so far -- and this is the challenge we'll have to deal with -- more later.