Saturday, October 4, 2008

On Event Processing Network and Transaction Processing

It is a holiday period, a time in which we have four holidays within three weeks, and it is quite a lazy time here, with many people taking vacations (like the second half of December and the beginning of January in countries with a Christian majority). Due to the holidays and some other events, I'll see my office in the coming week only on Tuesday, but I'm working a bit from home now...

In an IBM internal email exchange this week, a person who does not really understand event processing saw an illustration of an EPN (Event Processing Network) and wondered: this seems like regular transaction processing, so what is the difference?

Indeed, from a bird's-eye view everything looks like a directed graph, such as the one shown at the top of this page. Both transactional flows and EPNs, as well as many other things, are expressed using a directed graph; however, there is a major difference in the semantics of the graph.

To make this concrete, let's take an EPN from a remote patient monitoring application.

The semantics of an EPN are that a node in the graph creates events, and these events are then consumed by other nodes in the graph. For example, the "enrich" node takes a blood pressure reading and enriches it with an indication of whether the patient is diabetic, thus creating a derived event; this derived event is consumed by the node that looks for a pattern in order to alert a physician. Without going too deeply into the application's details, we may also state that, unlike in a control flow, the pattern detection node does not start its execution when all of its predecessors have finished. Since the pattern may look at multiple blood pressure measurements of the same patient, the pattern detection node may exist for a longer period than the enrich node, which is created any time there is a blood pressure reading of a new patient. So the graph does not show the control flow. Moreover, these two nodes don't know each other and communicate through a router (channel) node. So there are some differences between an event processing network and a transactional flow:

  1. The EPN graph does not represent control flow, but event flow.
  2. In a control flow graph, the relationship between predecessor and successor nodes is typically "finish to start" (either "meets" or "after" in Allen's operators, which I'll discuss in a separate posting), which means that the predecessor node must terminate in order for the successor node to start. In an EPN, this may not be the case.
  3. An EPN is not necessarily atomic: one node in the EPN may fail while others continue, and no "atomic commitment protocol" (e.g. 2PC) is applied.
  4. It also may not be isolated: a node can emit events while it continues working; even if it fails later, its emitted events may still be valid, if atomicity is not required.
  5. An EPN can be restricted to behave in a transactional way. This is an interesting observation, as transaction support violates the decoupling principle; however, there are cases in which it is required (again, this deserves some more discussion). More - Later.
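
To make the differences listed above a bit more tangible, here is a minimal Python sketch of the patient monitoring example. Everything in it is an assumption made for illustration: the names `Channel`, `make_enricher` and `HighPressurePattern`, the set of diabetic patients, and the alert rule (two high readings for a diabetic patient) are hypothetical and are not taken from the actual application. It only tries to show event flow through a channel, a long-lived pattern node, and the absence of atomic commitment.

```python
from collections import defaultdict


class Channel:
    """Router node: producers and consumers know the channel, not each other."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, handler):
        self.subscribers.append(handler)

    def publish(self, event):
        for handler in self.subscribers:
            try:
                handler(event)
            except Exception:
                # No atomic commitment: a failing consumer neither rolls back
                # the event nor stops delivery to the other consumers.
                pass


DIABETIC_PATIENTS = {"p1"}  # hypothetical reference data


def make_enricher(out_channel):
    """Enrich node: derives a new event from each raw reading and emits it."""

    def enrich(reading):
        derived = dict(reading, diabetic=reading["patient"] in DIABETIC_PATIENTS)
        out_channel.publish(derived)

    return enrich


class HighPressurePattern:
    """Long-lived node: keeps state across many readings of the same patient."""

    def __init__(self, threshold=140, count=2):
        self.threshold = threshold  # assumed systolic threshold
        self.count = count          # assumed number of high readings
        self.high_readings = defaultdict(int)
        self.alerts = []

    def __call__(self, event):
        if event["diabetic"] and event["systolic"] > self.threshold:
            self.high_readings[event["patient"]] += 1
            if self.high_readings[event["patient"]] == self.count:
                self.alerts.append(event["patient"])


def run_demo():
    raw, enriched = Channel(), Channel()
    pattern = HighPressurePattern()
    raw.subscribe(make_enricher(enriched))
    enriched.subscribe(pattern)
    for reading in [
        {"patient": "p1", "systolic": 150},
        {"patient": "p2", "systolic": 160},
        {"patient": "p1", "systolic": 155},
    ]:
        raw.publish(reading)
    return pattern.alerts
```

Note that the pattern node is not started when the enrich node "finishes"; it simply reacts to each derived event as it arrives and outlives any single enrichment, which is the point of item 2 above.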

Friday, October 3, 2008

On the Genesis and Exodus in Event Processing

One of the greatest scientists I had the honor to meet in person (at a conference in France, 1991) is Lotfi Zadeh, the inventor of "fuzzy sets", which is one of the major ways to formulate inexact thinking. When I was an undergraduate student, there was an urban legend that Zadeh came up with the fuzzy notion when his wife went out of town and left him a cooking recipe; trying to formalize the recipe, he came up with the notion of fuzzy. Later in life I met another great scientist and wonderful person, the late Manfred Kochen, who told me over a lunch in Ann Arbor that he had been a graduate student together with Zadeh at Columbia University. So I told him the urban legend and asked him if it was true. He was quite amused to hear it, and said that the problem that actually started the thinking about fuzzy theory was formalizing the process of parking a car between two parked cars. Assuming that Fred Kochen told me the truth, this was the genesis of fuzzy logic.

It was interesting to observe that Tim Bass, in a couple of his latest Blog postings, has returned to the genesis of "complex event processing", citing topics that emerge from the papers that David Luckham's group at Stanford published in the late 1990s - the list contained:
  • Network Level Monitoring and Management;
  • Cyber Security: Network Intrusion Detection;
  • Enterprise Monitoring and Management;
  • Modelling and Simulation of Collaborative Business Processes;
  • Business Policy Monitoring;
  • Analysis and Debugging of Distributed Systems.

These applications are all still very much alive and kicking in the event processing space.

It is interesting to note that the genesis of data stream management, in one of the earliest papers of the "stream" project, was, surprise, surprise, "network traffic management". It should also be noted that David Luckham and Jennifer Widom reside in the same building.

As the area of event processing has many ancestors, there are some more genesis stories. For example, the term "active database" was first coined by Morgenstern in his VLDB paper from 1983, and Morgenstern's genesis was consistency and integrity applications. We still see compliance and governance (our current names) as major applications. Other ancestors are in the area of system management, whose genesis was the "root cause analysis" application, i.e. diagnosing problems from their symptoms. We in the AMiT project at the IBM Haifa Research Lab also started by looking at system management applications, and at what is now called "business services management" - impact analysis of events in IT on business processes. I think that at least some of the pub/sub companies started with distributing new versions of software to subscribers, and of course some of the current event processing vendors started with applications like algorithmic trading in capital markets.

Having used the biblical term genesis, we may also remember that the successor of "genesis" is "exodus", which in our terms means moving on and not staying just where we happened to start. Some of the software industry is based on niche players, and the niche may be quite big (one of the biggest IT companies in Israel has concentrated for many years mostly on the area of Telco billing, which is probably big enough to sustain niche companies of several thousand employees). However, for more basic software like event processing tools, there is a big benefit in the ability to generalize beyond the genesis. Indeed, we now see that some vendors are going after other markets that may seem beyond their "comfort zone" and need to make some adjustments (this phenomenon may be one of the drivers for standardization in this area, but I'll discuss this issue another time). Thus, we are watching a growing list of applications and business problems for which event processing can be part of the solution, both in the infrastructure area (which should grow to Internet-scale infrastructure) and in the enterprise application area.

To conclude this posting by citing another great speaker: Professor Stu Madnick from MIT, whom I remember giving an amusing talk about theoretical computer science, said something like: a bunch of people went into a closed room, taking with them some problems from the outside world, and since then they are still in the same closed room, still working on the same problems, and sometimes inventing new problems. Well - we shall still solve the original problems, but also look around to find new ones. We are just in the early days of the event processing area, and have probably not yet discovered much of its power to impact the business world. More - Later.

Monday, September 29, 2008

On Semantics and Race Conditions - introduction

In this Blog posting I'll touch upon an issue that requires some attention to the exact semantics.

I'll introduce the topic today -- wait a few days to see if there are comments - and then post the analysis of this case.

Given the simple application shown below:

Let's explain this simple example. Since I would like to concentrate on a single issue, I'll simplify everything else to eliminate any noise.

  • There is a single event source (so no clock synchronization issues) which generates events of three types e1, e2, e3.

  • Let's also say that in our story a single event of each type is published (so no synonym issues). The table shows their occurrence time (when they occurred in reality) and detection time (when they were reported to the system) - each of them was reported 1 time unit after its occurrence, so there is no re-ordering problem.

  • Events e1, e2 serve as an input to an EPA of type "pattern detection" which detects a temporal sequence pattern "e1 before e2", and when this is detected, it derives an event e4 - some function of e1 and e2.

  • Events e3 (raw event) and e4 (derived event) serve as input to another EPA of type "pattern detection", which again detects a temporal sequence pattern, "e3 before e4". If this pattern is detected, it creates event e5, which triggers some action in the consumer.
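
One way to make the setup above concrete is the following Python sketch. It is only an illustration under loud assumptions: the timestamps are hypothetical values chosen for the example (they are not the values in the table), and the two policy names for assigning a timestamp to the derived event e4 are invented here. The sketch does not claim that any particular EP product behaves in either way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    occurrence: int  # when the event happened in reality
    detection: int   # when the event was reported to the system


# Hypothetical timestamps, assumed for illustration only;
# each event is reported 1 time unit after it occurs.
e1 = Event("e1", occurrence=1, detection=2)
e2 = Event("e2", occurrence=3, detection=4)
e3 = Event("e3", occurrence=3, detection=4)


def occurrence_of(event):
    return event.occurrence


def before(a, b, key=occurrence_of):
    """Temporal sequence pattern 'a before b' under the chosen time key."""
    return key(a) < key(b)


def derive_e4(first, second, policy):
    """Derive e4 from e1 and e2 under an assumed timestamp policy."""
    if policy == "derivation_time":
        # e4 is considered to occur when the pattern is detected,
        # i.e. when the last contributing event is reported.
        occurrence = second.detection
    elif policy == "last_input_occurrence":
        # e4 inherits the occurrence time of its last contributing event.
        occurrence = second.occurrence
    else:
        raise ValueError(f"unknown policy: {policy}")
    return Event("e4", occurrence=occurrence, detection=second.detection)


def action_triggered(policy):
    """Return True if 'e3 before e4' holds, so that e5 would be created."""
    if not before(e1, e2):
        return False
    e4 = derive_e4(e1, e2, policy)
    return before(e3, e4)
```

With these assumed values, `action_triggered("derivation_time")` and `action_triggered("last_input_occurrence")` give different answers, which is a hint at why the exact semantics of the derived event's time deserve attention.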

The question is -- given the above, will the action triggered by e5 occur? In other words, will the pattern "e3 before e4" evaluate to true?

Before getting to the analysis -- I wonder what the results will be in current EP solutions:

  1. The action will always be triggered.

  2. The action will never be triggered.

  3. The behavior is non-deterministic (sometimes yes and sometimes no)

  4. Any other possibility (specify).

Please send your answer as a comment to this post; I'll publish an analysis of this case next week.

Happy New Year.

Sunday, September 28, 2008

On the scope of event processing as a discipline again

Back home... short work week due to the Jewish New Year holiday (tomorrow is the holiday eve).
One of the topics that was not discussed in the EPTS meeting is "what is CEP?", and indeed EPTS is looking at "Event Processing" as a discipline, where "Complex Event Processing", no matter how it is defined, is a subset of a larger whole. One of the discussion points is defining the scope of the "event processing" discipline (some people prefer to call it "event-based systems", but we are talking about the same thing). I have already written in this Blog about event processing as a discipline, talking about some interesting subsets.

As one interesting source, let's look at the scope of DEBS 2009:

Event-based systems are rapidly gaining importance in many application domains ranging from real time monitoring systems in production, logistics and networking to complex event processing in finance and security. The event based paradigm has gathered momentum as witnessed by current efforts in areas including publish/subscribe systems, event-driven architectures, complex event processing, business process management and modelling, Grid computing, Web services notifications, information dissemination, event stream processing, and message-oriented middleware. The various communities dealing with event based systems have made progress in different aspects of the problem. The DEBS conference attempts to bring together researchers and practitioners active in the various sub communities to share their views and reach a common understanding.
The scope of the conference covers all topics relevant to event-based computing ranging from those discussed in related disciplines (e.g., coordination, software engineering, peer-to-peer systems, Grid computing, and streaming databases), over domain-specific topics of event-based computing (e.g., workflow management systems, mobile computing, pervasive and ubiquitous computing, sensor networks, user interfaces, component integration, Web services, and embedded systems), to enterprise related topics (e.g., complex event detection, enterprise application integration, real time enterprises, and Web services notifications).
While this is not a definition that I have phrased, it shows that the discipline is diverse and has touch points with some other disciplines (software engineering, databases, sensor networks, embedded systems, etc.). It is also interesting to note that the applications presented in the EPTS use cases group were diverse as well: we have seen applications from Finance and Defense (not surprising), but also from Media and Entertainment, Chemical and Petroleum, Telco, and emergency management.
The event processing discipline crosses several aspects - modeling, architectures, languages, engineering aspects, performance and optimization, user interfaces, intelligent components, and domain-specific additions - again, all of these in the context of creating specific platforms and tools for building event processing applications. In the next few postings I'll return to some micro-level issues I have faced in the last few weeks.
Happy New Year.