
Wednesday, February 1, 2012

On "CEP and Big Data 2" - comments on Philip Howard's observations.

Philip Howard from Bloor Research has posted some observations on his Blog entitled "CEP and Big Data 2".   Here are some comments (actually nothing new - just summarizing things I have written about before).
Philip deals with four issues:

  • whether the name CEP is appropriate or should be changed
  • who should be credited as the pioneer of this area
  • whether CEP implies real-time processing
  • who the CEP big data platforms are

Here is a summary of my views on each of these topics.

The name "Complex Event Processing"

Exactly four years ago I posted on this Blog an explanation of "why I prefer to use the name event processing without any prefix, infix or suffix". My particular dislike of the term "complex event processing" stems from the ambiguity in the name: some people (including David Luckham, who coined the term) view it as the processing of complex events, some interpret it as complex processing of events, and then debate when something is complex enough and what type of complexity is needed to qualify as CEP. Moreover, some of the vendors use this term for products that are neither of the two. I think that two words are enough for the name of a discipline; examples: information retrieval, machine learning, image processing and many more. Thus, from my point of view, the term "event processing" subsumes all other terms like complex event processing, business event processing, event stream processing and more.

Who gets the pioneering credit

Philip, as a good UK patriot, wonders why the Wikipedia entry and other sources give credit to David Luckham and forget the Apama work that came from Cambridge, UK. Looking at Wikipedia, it has one mention of David, as well as other references (like our EPIA book). It indeed does not mention Apama or any paper by John Bates, but this being Wikipedia, anybody can suggest additions.
David Luckham had a major influence on this area, since he was the first to publish a full book and expose the young area to the general public. An article in IEEE Computer, published in 2009, investigated the history of the area and determined that in the 1990s there were four parallel projects that can be classified as its starting points: David Luckham's project at Stanford, John Bates' project at Cambridge (UK, not Boston), Mani Chandy's at Caltech, and our Amit project at the IBM Haifa Research Lab. I share Philip's view that John Bates should have full credit as one of the pioneers, and I still view David Luckham as the "elder statesman" of the community.

Is CEP necessarily associated with real-time?

I have written several times about this topic, most recently in response to Chris Carlson, to whom Philip also responds. There is some abuse of the term real-time in the industry: while its meaning is "within time constraints", many people interpret it as "with very low latency". These are not the same. In any case, event processing is a functionality; some of its applications require very low latency, some require reacting within real-time constraints (which can be, say, 2 hours), some require both, and some require neither.

Who are the CEP big data platforms?

I have taken upon myself the restriction of not stating opinions on commercial products within this Blog, leaving that to the analysts. Thus I will make only one comment. There is a distinction between two types of software entities, which are sometimes confused in the language people use.

  • An Event Processing Platform is software that enables the creation of an event processing network and handles the routing of events among agents, management, and other common infrastructure issues.
  • An Event Processing Engine is software that enables the creation of the actual function; in EPN terms, it implements agents.
This is similar to the difference between an application server and a single component (programming in the large vs. programming in the small). Some of the available platforms for "event processing for big data" provide only the first one: they supply infrastructure without implementing any functionality themselves, and enable developers to create their own functionality; thus they do not do full-fledged event processing. It seems that many people put both under the same classification (of course, there are products that do both).

Friday, March 6, 2009

On event processing engines and platforms


Today, Friday, is part of our weekend, so it is a good time to do shopping and other arrangements.
My wife and I went to our friendly local bank to open a new account. The lady who handles our account said that they have new software for opening accounts that is extremely difficult to operate, with a lot of screens on which one has to work out what is being asked, and suggested that she do it off-line and call us when it is ready, so that we can come and sign the papers. Once, opening an account was simple and took a few minutes, just signing some forms; the more sophisticated software becomes, the more difficult it is to operate, and sometimes it becomes an obstacle to the business. Often, developers don't really care about the human engineering aspects. Hans Gilde wrote recently about the fact that CEP software is not smart. I agree. On several occasions I have given a talk to audiences of high-school students that gives a rough introduction to AI, under the title "Can a computer think?" While there are some works in AI that strive to achieve this, today's software cannot really think, and is not really smart. One can use software to do things that look smart, but the wisdom is not in the software itself, it is in the way it is used. In the bank case, the software does not even look smart...

This week I had three visitors from Germany, Rainer von Ammon and two of his CITT colleagues, and we made some progress towards defining the EDBPM project that we plan to submit as an EU project. They asked me to pose in my office under my "wall of plaques" (half of them are in Hebrew, so they could not really read them...). So this is my most current picture.




One short clarification -- after my posting entitled "event processing platforms - yes, but..."
I received some private communication claiming that there is confusion between the terms "platforms" and "engines". The claim is that there are vendors who refer to their engines as platforms; moreover, some people refer to any run-time software as an engine. So I thought it worth clarifying how I see the distinction:
  • An Event Processing Platform is software that enables the creation of an event processing network and handles the routing of events among agents, management, and other common infrastructure issues.
  • An Event Processing Engine is software that enables the creation of the actual function; in EPN terms, it implements agents.
This is similar to the difference between an application server and a single component.
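
As a rough illustration of this distinction, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Platform` class, its `register` and `publish` methods, the two toy engines and the event types are hypothetical names, not the interface of any actual product. The platform only knows how to route events among agents; the engines supply the actual functions.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]
# An "engine" here is anything that implements an agent's function:
# it takes one event and returns zero or more derived events.
Agent = Callable[[Event], List[Event]]


class Platform:
    """Hypothetical platform: owns the network and routes events among agents."""

    def __init__(self) -> None:
        self._agents: Dict[str, Agent] = {}
        self._routes: Dict[str, List[str]] = {}  # event type -> agent names

    def register(self, name: str, agent: Agent, consumes: List[str]) -> None:
        """Plug an agent into the network and record which event types it consumes."""
        self._agents[name] = agent
        for event_type in consumes:
            self._routes.setdefault(event_type, []).append(name)

    def publish(self, event: Event) -> List[Event]:
        """Route an event through the network; return events that no agent consumes."""
        pending, outputs = [event], []
        while pending:
            current = pending.pop(0)
            targets = self._routes.get(str(current["type"]), [])
            if not targets:
                outputs.append(current)
                continue
            for name in targets:
                pending.extend(self._agents[name](current))
        return outputs


# Two toy "engines", each implementing one agent's function.
def filter_engine(event: Event) -> List[Event]:
    """Pass on only orders above an (invented) threshold."""
    return [dict(event, type="large_order")] if event["amount"] > 1000 else []


def enrich_engine(event: Event) -> List[Event]:
    """Turn a large order into an alert with an added attribute."""
    return [dict(event, type="alert", priority="high")]


platform = Platform()
platform.register("filter", filter_engine, consumes=["order"])
platform.register("enrich", enrich_engine, consumes=["large_order"])
alerts = platform.publish({"type": "order", "amount": 5000})
```

In this sketch the platform contains no event processing logic of its own, which mirrors the application server vs. single component analogy: replacing `filter_engine` with a different implementation changes what the application does without touching the routing.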

What is the connection?
  • On one extreme, there are closed platforms, i.e. platforms that can run only one type of engine; in this case the distinction becomes fuzzier.
  • On the other extreme, there are open platforms, i.e. platforms that can run multiple engines; in this case the two concepts are totally separate. The main issue here is that the different engines may come with a collection of different languages, and this may make the development of an application more difficult.
The first generation of event processing started with stand-alone engines; the emergence of platforms, and making them open, are the signs of the second generation. I'll say more about the challenges of constructing the next generations later.

Monday, November 24, 2008

On evaluation criteria for EP products


Typically, I refrain from reacting in this Blog to any marketing material presented by vendors, a restriction I have taken upon myself as the chair of EPTS. I am not deviating from this rule, but since my friends at Coral8 have posted their article entitled "Comprehensive Guide to Evaluating Event Stream Processing Engines" on David Luckham's site, as a vendor-neutral service to the community, I am taking the liberty of adding some footnotes to this paper.

On the positive side, I think that this type of work is useful, that discussions about it are also useful, and that many of the criteria presented are valid. We at IBM have in the past devised evaluation criteria for internal purposes that included many of the mentioned criteria; I have to check whether we can expose them.

On the critical side, here are several comments:

1. The first claim is that the authors view "event stream processing" and "complex event processing" as one and the same, saying that customers do not make a distinction between the terms, and that there is no agreed-upon terminology. I refer the authors to the EPTS glossary as a reference for terminology. But regardless of that, I would agree that customers typically don't care which TLA is used; the substance is more important.

2. The statement that this document covers ESP and CEP, which are one and the same, created the feeling that the document is general. However, reading further, I find among the criteria that define what an ESP engine is the following condition: "...process large volumes of incoming messages or events". This criterion confuses me: is that a fundamental property of an ESP/CEP engine? In the past year I have heard some analysts' talks saying that actually most of the potential EP applications are not the "high volume" ones; furthermore, the customers I know have various degrees of event volumes, some of them high, some low. So maybe this is not part of the definition of what an engine is, but an evaluation criterion for a certain class of applications.

3. Reading further, I see terms like continuous queries and windows -- terms that already assume a certain type of implementation (namely, query-based stream processing). This fits the title of "event stream processing", assuming that there is agreement that this is what ESP is; however, it does not represent the entire spectrum. Continuous queries are a technique intended to achieve some functionality, which can also be achieved by other means.

Personally, I believe that "one size fits all" does not work, and that different event processing applications have different functional and non-functional requirements. The various performance aspects are more important in some applications and less important in others; note that there are also no standard benchmarks yet. I hope that the work of the EPTS work group on use cases, which is planned to result in a classification of event processing applications, will yield a finite, manageable number of application classes, so that the evaluation criteria can be partitioned by type.

And -- if possible, hands-on experience indeed makes the evaluation more accurate and removes the noise of preconceptions and false assumptions... More on evaluation later.

Wednesday, October 29, 2008

On EDA, CEP and disruptive technology

This is the second time in the last few months that the term "disruptive technology" has been used, this time by Mark Palmer, who adds his voice to the EDA vs. CEP discussion. Mark acknowledges the fact that EDA and CEP are not synonyms, but asserts that CEP is the disruptive part of EDA.
Recall that the word "disruptive" was used recently in the Oracle Blog, and I discussed it at that time; Richard Veryard answered my response, claiming that "disruptive" is often not better, at least not in the short run.

Anyway, in the last few weeks I have noticed two IBM customers who made plans to move to EDA (neither will use CEP, at least not in the short run). The shift to EDA is a fundamental change in architectural thinking, with the introduction of decoupled, event-driven thinking. Is it new? Not really. Is it new for these customers? Yes, and not only new, but a significant change in thinking. It is an indication that there is beef in EDA, even without introducing CEP, that the enterprise should digest. While thinking in events is natural in daily life, it is still not natural for enterprise architecture and programming paradigms.

Bottom line -- while CEP certainly has its merits (if the customer is mature enough to digest it), EDA seems to be a more fundamental change in thinking relative to the alternative, and has its own beef. More later.

Wednesday, October 15, 2008

On semantics of synonyms


Holiday time -- and I had to spend time replacing my home wireless router (not from the same firm as the one shown in the picture), which did not work well, with another expensive one, whose signal is not received well on the lower floor. I hate wasting my time on restoring things to working order; it is a never-ending story, every time something else needs fixing... we need more robust appliances.

Well - back from my personal frustrations to event processing thinking. I have posted a couple of postings about the different possible interpretations of "event_1 occurs before event_2"; however, different interpretations are not unique to temporal issues and the specific anomaly of having different temporal semantics. Let's take a case in which time does not matter, with the function defined this time as follows:
Detect a pattern that consists of a conjunction of two events (order is not important): E1, E2.
E1 has two attributes = {N, A}; E2 also has two attributes = {N, B}; the pattern matching is partitioned according to the value of N (I'll write about context partitions another time).

For each detection, create a derived event E3, which includes two attributes = {N, C}; the values of E3 are derived as: E3.N := E1.N; E3.C := E1.A * E2.B.

Let's also assume that the relevant temporal context is time-stamps = [1, 5] - and the events of types E1 and E2 that arrived during this period are displayed in the table below:



The question is: how many instances of event E3 are going to be created, and what will be the values of their attributes? Think about it -- I'll discuss it next week.
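
To make the question concrete, here is a small Python sketch of the pattern under two possible interpretations. The table from the post is not reproduced here, so the events below are invented purely for illustration. The two policies, and the function names `derive_all_pairs` and `derive_consume_once`, are assumptions of this sketch rather than the semantics of any particular product.

```python
from collections import defaultdict
from itertools import product

# Invented events for illustration only (not the table from the post):
# (timestamp, type, N, value), where value is A for E1 and B for E2.
events = [
    (1, "E1", 1, 2),
    (2, "E2", 1, 3),
    (3, "E1", 1, 4),
    (4, "E2", 2, 5),
    (5, "E2", 1, 6),
]


def partition(events):
    """Group E1 and E2 values by N, keeping arrival (timestamp) order."""
    parts = defaultdict(lambda: {"E1": [], "E2": []})
    for _, etype, n, value in sorted(events):
        parts[n][etype].append(value)
    return parts


def derive_all_pairs(events):
    """Interpretation 1: every E1/E2 combination in a partition yields an E3."""
    return [
        {"N": n, "C": a * b}
        for n, p in partition(events).items()
        for a, b in product(p["E1"], p["E2"])
    ]


def derive_consume_once(events):
    """Interpretation 2: each event takes part in at most one E3, oldest first."""
    return [
        {"N": n, "C": a * b}
        for n, p in partition(events).items()
        for a, b in zip(p["E1"], p["E2"])
    ]


print(len(derive_all_pairs(events)), derive_all_pairs(events))
print(len(derive_consume_once(events)), derive_consume_once(events))
```

With these invented events, the first interpretation derives four E3 instances for N = 1 and the second derives two; the partition N = 2 yields nothing under either, since it has no E1. The same pattern definition thus gives different results depending on how repeated events of the same type are treated.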