Showing posts with label event processing platofrms. Show all posts

Sunday, May 24, 2009

On System S


The logo above (in Hebrew) is the logo of the Academic College of Emek Yezreel, where I serve on the steering committee of the Information Systems department, a new department that started this year. As part of my service to the community, I am helping various institutes establish academic programs that give an opportunity to people who otherwise would not be able to acquire an academic education. As Aliza Shenhar, the dominant president of this college, has said: many of the students are the first in their family ever to obtain an academic degree. We had an interesting discussion about the challenges they face in teaching a diverse population.

Anyway, today I would like to write a little bit about System S, which has recently been highlighted by IBM and covered in the NY Times, ComputerWorld, and some of the community blogs by Paul Vincent and Marc Adler. The picture below, taken from the NY Times, shows Steve Mills, the head of IBM Software Group (on the left-hand side), and John Kelly, the head of my organization, IBM Research (on the right-hand side), both senior vice presidents in IBM, reporting directly to the CEO.


So what is System S, and how does it relate to event processing? In the slide that Steve Mills points at, the title is "Stream Computing", and indeed this system takes streams in the broad sense: anything that constantly sends information of various types, such as video, audio, text, and multimedia. The points in this slide show a data flow, and indeed System S is a platform that can run a data flow of "processing elements" (in System S terminology). Each of them runs on a stream of a certain type and provides some form of analytics: filtering, aggregation, extracting features out of video, interpreting voice, and much more. The platform can take advantage of supercomputers to provide parallel processing and digest a high throughput of data. You can read more about it on the IBM Research website (I am not sure it is up to date). System S is a prelude to an IBM product already announced under the name "InfoSphere Streams".

Now, what is the relationship between System S and event processing? There are two different points. The first is that System S can take as input a large amount of streaming data, filter and aggregate it, and create a relatively small collection of events that can be further processed by some event processing engine. The second is that System S, as said, is a platform; the processing elements in this platform can, in principle, be event processing agents. In fact, while the semantics of the data flow is not identical to the semantics of an event processing network, it is possible to map an event processing network so that it is implemented by the System S platform. The SPADE language provides some such capabilities, and may be extended in time to include more. IBM takes a portfolio approach to event processing (BTW, IBM does not use the "CEP" TLA; it uses its own TLA, "BEP", for Business Event Processing. I tend to use "event processing" without prefixes and suffixes, as I have stated before), since it believes that "one size fits all" does not work, due to differences in functional and non-functional properties. System S is definitely aimed at the high end in terms of throughput requirements. More - Later.
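
To make the first of these two points a bit more concrete, here is a minimal sketch of a data flow with two processing elements, a filter followed by a windowed aggregator, that reduces a large stream of raw readings to a small collection of derived events. It is plain Python, not the System S language or any System S API; the names `filter_readings`, `aggregate_windows`, and `derive_events`, as well as the threshold and window size, are made up for the illustration.

```python
from typing import Iterable, Iterator, List


def filter_readings(readings: Iterable[float], minimum: float) -> Iterator[float]:
    """Processing element 1: drop readings below the threshold."""
    for value in readings:
        if value >= minimum:
            yield value


def aggregate_windows(readings: Iterable[float], size: int) -> Iterator[dict]:
    """Processing element 2: emit one summary event per full window of `size` readings."""
    window: List[float] = []
    for value in readings:
        window.append(value)
        if len(window) == size:
            yield {"count": size, "mean": sum(window) / size, "max": max(window)}
            window = []  # an incomplete trailing window is simply discarded


def derive_events(raw: Iterable[float], minimum: float = 50.0, size: int = 100) -> List[dict]:
    """Wire the two elements into a data flow and collect the derived events."""
    return list(aggregate_windows(filter_readings(raw, minimum), size))


if __name__ == "__main__":
    # Synthetic input (an assumption for the demo): 10,000 readings cycling through 0..99.
    raw_stream = (float(i % 100) for i in range(10_000))
    events = derive_events(raw_stream)
    print(len(events), "derived events from 10,000 raw readings")
```

The derived events, far fewer than the raw readings, are what would then be handed to an event processing engine for pattern detection.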

Friday, March 6, 2009

On event processing engines and platforms


Today, Friday, is part of our weekend, so it is a good time to do shopping and other errands.
My wife and I went to our friendly local bank to open a new account. The lady who handles our account said that they have new software for opening an account that is extremely difficult to operate, with a lot of screens on which one has to figure out what is being asked, and suggested that she do it off-line and call us when it is ready, so that we can come and sign the papers. Once, opening an account was simple and took a few minutes, just signing some forms; the more sophisticated software becomes, the more difficult it is to operate, and sometimes it becomes an obstacle to the business. Often, developers don't really care about the human engineering aspects. Hans Gilde wrote recently about the fact that CEP software is not smart. I agree. On several occasions I have given a talk to audiences of high-school students that gives a rough introduction to AI, under the title "Can a computer think?" While there is some work in AI that strives to do this, today's software cannot really think, and is not really smart. One can use software to do things that look smart, but the wisdom is not in the software itself; it is in the way it is used. In the bank case, the software does not even look smart...

This week I had three visitors from Germany, Rainer von Ammon and two of his CITT colleagues, and we made some progress towards defining the EDBPM project that we plan to submit as an EU project. They asked me to pose in my office under my "wall of plaques" (half of them are in Hebrew, so they could not really read them...). So this is my most current picture.




One short clarification: after my posting entitled "event processing platforms - yes, but...", I received some private communication claiming that there is confusion between the terms "platforms" and "engines". The claim is that there are vendors who refer to their engines as platforms; moreover, some people refer to any run-time software as an engine. So I thought it worth clarifying how I see the distinction:
  • Event Processing Platform is software that enables the creation of an event processing network, and handles the routing of events among agents, management, and other common infrastructure issues.
  • Event Processing Engine is software that enables the creation of the actual function; in EPN terms, it implements agents.
This is similar to the difference between an application server and a single component.
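
The distinction can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the API of any product: the hypothetical `Platform` class knows nothing about what the agents compute and only wires them together and routes events, while the two agent functions (`large_orders`, `tag_priority`) stand in for what an engine would implement.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

Event = dict
# An agent (the part an "engine" implements): takes an event and returns
# a derived event, or None to drop it.
Agent = Callable[[Event], Optional[Event]]


class Platform:
    """Hypothetical platform: owns the network and routes events among agents."""

    def __init__(self) -> None:
        self._agents: Dict[str, Agent] = {}
        self._routes: Dict[str, List[str]] = defaultdict(list)
        self.delivered: List[Tuple[str, Event]] = []  # output of agents with no successor

    def add_agent(self, name: str, agent: Agent) -> None:
        self._agents[name] = agent

    def connect(self, source: str, target: str) -> None:
        self._routes[source].append(target)

    def publish(self, name: str, event: Event) -> None:
        """Run agent `name` on `event` and forward its output downstream."""
        result = self._agents[name](event)
        if result is None:
            return  # the agent filtered the event out
        targets = self._routes[name]
        if not targets:
            self.delivered.append((name, result))
        for target in targets:
            self.publish(target, result)


def large_orders(event: Event) -> Optional[Event]:
    """Filter agent: keep only orders of 1000 or more."""
    return event if event["amount"] >= 1000 else None


def tag_priority(event: Event) -> Event:
    """Enrichment agent: add a priority attribute."""
    return {**event, "priority": "high"}


platform = Platform()
platform.add_agent("filter", large_orders)
platform.add_agent("enrich", tag_priority)
platform.connect("filter", "enrich")
platform.publish("filter", {"amount": 250})
platform.publish("filter", {"amount": 5000})
print(platform.delivered)
```

Only the order of 5000 reaches the end of the network. Replacing `large_orders` with an agent written for a different engine would not require any change to `Platform`, which is the sense in which the two concepts are separate.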

What is the connection?
  • On one extreme, there are closed platforms, i.e. platforms that can run only one type of engine; in this case the distinction becomes fuzzier.
  • On the other extreme, there are open platforms, i.e. platforms that can run multiple engines; in this case the two concepts are totally separate. The main issue here is that the different engines may come with a collection of different languages, and this may make the development of an application more difficult.
The first generation of event processing started with stand-alone engines; the emergence of platforms, and making them open, are the signs of the second generation. I'll say more about the challenges of constructing the next generations later.

Saturday, February 14, 2009

Quantum Leap -- take II


This morning was a sunny Saturday after a few rainy ones, and along with many other people, I went out with my family into nature... We live in Haifa, which besides its beaches and beautiful view of the bay also has a big nature reserve close by, called the "Carmel forests". It is not really a forest in global terms, but it has many nice hiking trails, a 15-minute drive from home. Here are some of the flowers we saw today... it is good to take a break sometimes.

As a follow-up to my previous posting on the quantum leap, here are some more insights. We in the IBM Haifa Research Lab have signed up to look at the "next generation of event processing" and are working on this topic; I may present a tutorial about our findings at DEBS 2009, if accepted.


Here are some initial insights:

  • As in databases, there needs to be a formal model that gains wide acceptance (over time) to enable the quantum leap, since acceptance provides a critical mass of work going in the same direction. Our belief is that the "event processing network" model is the one, but it still lacks a solid formal basis.
  • Besides this, there are four areas that will show significant developments in the future; if these are done on the basis of the model, it can provide a coherent play. The pyramid below shows the four:


  • Platform: While the first generation of event processing is "engine" land, we are starting to see a movement towards platforms that provide shared services (e.g. global state management, routing, load balancing, security, high availability...), with a possibly heterogeneous collection of event processing agents running on them. There may be platforms with various orientations: grid platforms, database-oriented platforms, messaging-oriented platforms, and streaming (data flow) oriented platforms, to name a few. A platform may be an "event processing platform" or a platform with wider functions (e.g. event processing agents and other decision agents). Some analysts are talking about extreme transaction processing (XTP) and context-oriented platforms; maybe the platform will mix some or all of the above. As with application servers in enterprise computing, the platform orientation is one of the facets of the next generations.
  • Engineering: Engineering progress is not really considered a revolution, but it is required to enable the higher layers to work in reality. This is the equivalent of query optimization, tuning, configuration, scheduling, load balancing, parallel programming assignments, and various other systems-related topics in other areas. Relational databases became widespread only after the vendors succeeded in getting the engineering parts right, so advancement in this area is critical.
  • Functional: The functionality that products have today is just the start; more functionality will be supported, maybe even substantially more. Some directions: the "intelligent event processing" direction, which looks at the discovery of unknown patterns and the prediction of future events; adding more context information, such as geo-spatial context; getting better temporal handling; and probably much more.
  • Usability: Here will probably be much of the quantum leap: getting the abstraction levels higher. Hierarchies of events and causality, advocated by David Luckham, are really abstractions. However, there is more to it than abstractions from the implementation up; there also need to be abstractions from the user's thinking down. Instead of trying to visualize and abstract out the implementation model, the opposite direction is to have the abstractions in the user's domain of thinking and translate them (perhaps not 1-1) to the implementation.
The quantum leap will occur with a coherent combination of all these aspects. There may be some new vendors that will offer the next generation as their first generation, since they are liberated from supporting legacy (and may be acquired by larger vendors), and there are existing vendors that are going into some of this in an incremental way...

EPTS will attempt to contribute to the thinking about the next quantum leap through the work of its working groups. We also saw at the last EPTS event processing symposium that the use cases working group presented a variety of use cases, which cover a broad range of application types and requirements; this will be one vehicle for determining requirements. Other working groups will contribute in the various areas. In May 2010 we'll hold a major summit of industry and academic people (a Dagstuhl Seminar); EPTS members will get a more detailed note about it.

More - Later.

Sunday, February 8, 2009

Event Processing Platform (EPP) --- yes, but...





STAC, as cited in the popular blog of Tim Bass, has determined that the correct name for event processing products is EPP (Event Processing Platforms) rather than CEP. I like the term EPP; actually, I have used this term before, so I should have copyrighted the name...

However, EPP should be used with the right meaning.

In event processing software there are two different things:
  • Platforms that provide the "programming in the large": a container into which different types of functionality can be plugged.
  • EPA implementation software that performs the actual event processing work, e.g. pattern matching, enrichment, filtering etc. This is the "programming in the small" (I have called it "event processing engines", but I am not convinced that this is the best name).
In the EPTS glossary terminology, platforms implement "event processing networks", while the other type implements various types of event processing agents.

These are not the same: there are vendors who provide platforms but use other software to implement agents. For example, BEA provided a platform and used Esper for various functions; if I am not mistaken, this is also true for Event Zero. IBM's InfoSphere Streams is also a platform. All of these are indeed platforms. Some products provide both the EPN platform and various EPA implementations; some provide just the EPA implementation, which runs on various platforms (or as a centralized stand-alone engine).

So, while I agree that the EPN implementations are platforms, I am not sure that the EPA implementations are also platforms, and we probably need a different name for them (engines? not sure)...

And one sentence about the term CEP. As I have written several times, I am not a big fan of this term at all; I consistently talk about event processing, and not about complex event processing, as the name of the discipline that this blog covers. However, this reminds me that I was once a member of a Hebrew technical terminology committee, and one of the terms that came up for discussion was "real-time". For some strange reason, in Hebrew it had been translated literally as "true time". When it came to writing the official glossary endorsed by the Israeli Academy of Hebrew Language, their representative, who knew Hebrew linguistics but not computer science, insisted that the Hebrew word should be a true translation of "real time", giving a long talk about "real numbers" and other real stuff. I argued that from a linguistic point of view he was probably right, but the scientifically wrong name was already well known in the industry, and a decision on another name would not be accepted by the public. After a long discussion he agreed to include my wrong version as an alias to his true version. You can guess which of the two is still being used for "real-time" in Hebrew.
The moral of this story is that it may be too late to change names, since the name CEP has been accepted in the industry for any type of event processing system, whether or not it is scientifically accurate, and as somebody once said: resistance is futile...

I'll continue to use "event processing", will use "event processing platform" for a platform, and am still looking for a term for the "EPA implementation" (engines or otherwise). But my guess is that the people who use CEP to denote any type of EP will continue doing so, since this name may have already taken root. More - Later.

Sunday, January 6, 2008

On Enterprise Service Bus and Event Processing


This is one of the variations of Enterprise Service Bus (ESB) illustrations, which I have taken from an article by one of my IBM colleagues. The topic of today is: what are the relations between ESB and event processing?
  • Event processing functionality that runs Event Processing Agents in various sources requires a platform that will take care of the routing and execution in a distributed environment. There are three alternatives here:
  1. No native platform: an engine that can run on multiple platforms (and thus needs to be integrated into each platform it runs on, by adapters etc.).
  2. Dedicated event processing platform: the event processing part has a dedicated platform that provides the infrastructure for the event processing functions.
  3. Event processing is built as part of an already existing platform.

All of these variations exist in the market today, and there are pros and cons for each of them; smaller vendors may prefer the first alternative, as my friend Marco noted in his blog.

When getting to the third alternative, if the environment is a SOA environment, then the ESB is a natural place in the SOA middleware to be the principal carrier of event processing functionality:

  • It provides messaging infrastructure and routing capabilities.
  • It provides mediations such as validation, transformation, and enrichment that can be reused for event processing (they have a large intersection with the "mediated event processing" functions).
  • It supports a distributed environment.
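
As a rough illustration of the second bullet, the sketch below chains the three mediations named there, validation, transformation, and enrichment, and applies them to an incoming message before it would be handed to an event processing agent. It is plain Python and does not use any real ESB API; the message fields, the `SENSOR_SITES` lookup table, and all function names are assumptions made for the example.

```python
from typing import Callable, List, Optional

Message = dict
# A mediation returns the (possibly changed) message, or None to reject it.
Mediation = Callable[[Message], Optional[Message]]

# Hypothetical reference data used by the enrichment step.
SENSOR_SITES = {"s1": "Haifa"}


def validate(message: Message) -> Optional[Message]:
    """Validation: reject messages that lack the required fields."""
    return message if {"sensor", "celsius"} <= message.keys() else None


def transform(message: Message) -> Message:
    """Transformation: normalize the sensor id and convert the reading to a number."""
    return {
        "sensor": message["sensor"].lower(),
        "celsius": round(float(message["celsius"]), 1),
    }


def enrich(message: Message) -> Message:
    """Enrichment: add the site of the sensor from reference data."""
    return {**message, "site": SENSOR_SITES.get(message["sensor"], "unknown")}


def mediate(message: Message, chain: List[Mediation]) -> Optional[Message]:
    """Apply the mediations in order; stop as soon as one rejects the message."""
    current: Optional[Message] = message
    for step in chain:
        current = step(current)
        if current is None:
            return None
    return current


CHAIN: List[Mediation] = [validate, transform, enrich]

print(mediate({"sensor": "S1", "celsius": "21.46"}, CHAIN))
print(mediate({"sensor": "S1"}, CHAIN))  # rejected by validation
```

The same three steps serve a service consumer and an event processing agent equally well, which is why the overlap with the "mediated event processing" functions is large.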

While the principal usage of the ESB in SOA has been to mediate between consumers and producers of services, being a carrier for event processing is now considered a step in the evolution of ESBs.

This does not say that the ESB is the ONLY place in which event processing functionality can run, which brings us to a discussion about the Event Processing Conceptual Model, which I'll deal with in a subsequent posting.

The ESB gets into the picture in alternative

Thursday, December 20, 2007

On - "one size fits all" and Event Processing


Like a commercial TV station, if a blog wants to get "ratings", one has to post something somewhat controversial: the number of visitors to this blog more than doubled in the last few days, when I had exchanges of opinions and folk stories with Tim Bass. Anyway, I got tired and did not continue that discussion. One somewhat related question that I have received was: does the fact that I don't think it is worth talking about ESP and CEP as separate entities mean that I believe there is a "one size fits all" in event processing? Well, this is a fair question. In the past I did believe it to be true, until I read Mike Stonebraker's immortal assertion: "One size fits all is a concept whose time has come and gone". Actually, I ceased to believe in it a little earlier. I think that the event processing area is not a monolithic area, and some variations are needed. However:
  • I don't believe that ESP vs. CEP is the right type of partition in this area;
  • There may be a need to have various implementations under one roof (the heterogeneous framework approach).

For the first point, what is the right type of partition? This is a multi-dimensional question, and we still have to learn more to know the most useful combinations.

One of the important dimensions is the "reason for use" dimension, and here, in an internal IBM study, we arrived at five different reasons for use; I'll write about it in one of the next postings.

EPTS has recently launched a workgroup that tries to identify these classifications by doing a comprehensive survey of use cases that will be compared using the same template. A team that consists of Tao Lin (SAP), Dieter Gawlick (Oracle) and Pedro Bizzaro (University of Coimbra, Portugal) is working on this template, and a larger team will handle the survey and analysis. The end result, a collaborative white paper about the state of the practice in event processing, is expected sometime in the second quarter of 2008. Stay tuned.

More - Later.

Saturday, December 1, 2007

On CEP and IEP


Ambidexterity is a good property for a boxer: he can decide when it is better to attack with his right hand and when with his left (I am part of the left-handed minority, and should sometime write a post about being left-handed in the right-handed people's world). Likewise, there are problems in the event processing space that can be solved by deterministic means (rules, queries, scripts, patterns; choose your favorite religion), and problems that are solved by stochastic means, using probabilistic networks, machine learning etc. (AKA IEP, Intelligent Event Processing). When there is a pattern that needs to be traced to check compliance with regulations, and the pattern is well defined, a deterministic approach should be used. When there is a need to dynamically change the traffic light policies so that vehicles have minimal waiting time, there is a need to predict the traffic in the next few minutes; this is a non-deterministic problem and requires some stochastic tool (BTW, my student, Elad Margalit, is looking at the traffic lights issue for his M.Sc. thesis). Event Processing Platforms should include various types of functionality, which brings us to another discussion, on the "actor/agent" architecture, which I'll refer to in one of the next posts. More - Later.
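
The two hands can be contrasted in a small Python sketch. Both functions and all the event names are invented for the illustration. The first is a deterministic check of a well-defined pattern (one event type followed by another within a time limit). The second estimates the next vehicle count from past counts by exponential smoothing, which is a deliberately simple stand-in for the probabilistic networks or machine learning a real IEP solution would use, not an example of them.

```python
from typing import Optional, Sequence, Tuple

Event = Tuple[int, str]  # (timestamp in seconds, event type), sorted by timestamp


def violates_rule(events: Sequence[Event], first: str, second: str, within: int) -> bool:
    """Deterministic: is there a `second` event at most `within` seconds after a `first` event?"""
    last_first: Optional[int] = None
    for timestamp, kind in events:
        if kind == first:
            last_first = timestamp
        elif kind == second and last_first is not None and timestamp - last_first <= within:
            return True
    return False


def forecast_next(counts: Sequence[float], alpha: float = 0.5) -> float:
    """Estimate the next count by exponential smoothing of the past counts."""
    if not counts:
        raise ValueError("need at least one observation")
    estimate = float(counts[0])
    for observed in counts[1:]:
        # Recent observations weigh more; `alpha` controls how much.
        estimate = alpha * observed + (1 - alpha) * estimate
    return estimate


# Hypothetical compliance pattern: a large deposit followed by a transfer abroad within a minute.
trace = [(0, "large_deposit"), (30, "transfer_abroad")]
print(violates_rule(trace, "large_deposit", "transfer_abroad", within=60))

# Hypothetical vehicle counts per minute at one junction.
print(forecast_next([10, 20, 40]))
```

The first function always gives the same yes/no answer for the same trace, while the second only gives an estimate whose quality depends on how well the past predicts the next few minutes.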