Yesterday I participated in the "parents teaching" program in my third daughter's junior high (8th grade) and gave the children a short introduction to the question - does a computer think? I did not give them an answer to this question, but gave them several basic puzzles and explained to them how we can teach a computer to solve them -- one of them was the good old missionaries and cannibals problem.
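For readers curious how a computer can be taught to solve such a puzzle, here is a minimal sketch in Python (my language choice for illustration; the method is the textbook breadth-first search over states, not necessarily what I drew on the board for the children):

```python
from collections import deque

def solve_missionaries_cannibals():
    # A state is (missionaries on left bank, cannibals on left bank, boat on left?).
    start, goal = (3, 3, True), (0, 0, False)
    moves = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]  # possible boat loads

    def legal(m, c):
        # On each bank, missionaries are never outnumbered (unless absent).
        return (0 <= m <= 3 and 0 <= c <= 3 and
                (m == 0 or m >= c) and (3 - m == 0 or 3 - m >= 3 - c))

    queue, seen = deque([(start, [])]), {start}
    while queue:
        (m, c, boat), path = queue.popleft()
        if (m, c, boat) == goal:
            return path
        sign = -1 if boat else 1  # the boat carries people away from its bank
        for dm, dc in moves:
            nm, nc = m + sign * dm, c + sign * dc
            state = (nm, nc, not boat)
            if legal(nm, nc) and state not in seen:
                seen.add(state)
                queue.append((state, path + [(dm, dc)]))

print(solve_missionaries_cannibals())  # shortest sequence of boat loads
```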
From the question "does a computer think?" I will move to the blog of Hans Gilde, who phrased his posting in the form of a question -- CEP is a marketing device, so what does it say about CEP products? The answer is --- not much.
Let's change the TLA from CEP to SOA and ask the same question. The answer is that there are good and bad products marketed under the TLA of SOA; some of them were here before SOA, and maybe some of them will still be here when another TLA dominates.
I have blogged before about the various interpretations of CEP, and the observation about what is called "CEP products" is that there is a variety of implementations that call themselves CEP. This does not teach us anything about the quality of these products, their benefits to the business, etc.
While TLAs have become the property of marketing people for positioning products, somehow discipline names consist of one or two words, such as data management, image processing, graphics, information retrieval, and many more - that's why I consistently use "event processing" when talking about the discipline.
Disciplines normally start in multiple places that try to solve similar (but not necessarily identical) problems; a first generation of products is developed, and sometimes hype is created as well, which is consistent with Gartner's "hype cycle" concept. At the EPTS conference Brenda Michelson argued that, if anything, this area is under-hyped rather than over-hyped. There are some other indications that support her observation.
The early phases of a discipline lack standards, an agreed-upon theory, and coherent thinking.
At the OMG meeting in March 2008, I used the following slide as an example of the indications/conditions for a discipline to succeed:
The fact that EP is not at the maturity level of relational databases or other, more mature disciplines is obvious. However, while there are people who have made a career out of criticizing and complaining that what other people are doing is not good enough, I think that our challenge is to advance. It took years until there was agreement on what a relational database is, during which all databases suddenly became relational (anybody old enough to remember will recall some funny situations of products that claimed to have a relational extension when their makers did not understand the term). We need an event processing manifesto, and a collection of standards, but they will not be constructed in a single day, so we also need patience and persistence... I believe that ten years from now EP will be one of the major disciplines of computing, and that we have the challenge to get there...
BTW - I agree with Hans that if products have business value for customers, they will be used regardless of whether they end up classified as EP or not. More - later
Some people have returned from vacation with a surplus of energy; otherwise I cannot explain why my inbox today was full of mails from the same discussion thread in the everlasting Yahoo CEP interest group, triggered by a question sent by Luis Poreza, a graduate student from the University of Coimbra in Portugal. I am taking the liberty of rewriting the question, since it was phrased as a question about a trading system, and some of the responders answered with trading-related stuff that did not help answer Luis' question. So, getting as far away as possible from the stock market, I will base the rewritten question on the fish market. The story is as follows: the price of 1 kg of fish is determined according to the hour, the demand, the supply, and the general mood of the seller. At 10:50 he set the price at 71; at 11:15 the price went down to 69; there were no more changes by 12:00. There is a computerized system that works in time windows of one hour, starting every hour. The request is to find out, for the time window 11:00 - 12:00, whether the price of 1 kg of fish was ever > 70. The claim is that intuitively the answer is yes, since the price during the interval [10:50, 11:15] was 71; but if we look at all the events that occurred in this window, there was no event with value > 70, thus current "window oriented" tools will answer --- no.
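To make the pitfall concrete, here is a minimal sketch in Python (the event tuples and function are my own illustration, not any product's API) of the naive window-oriented evaluation, which sees only the events whose timestamps fall inside the window:

```python
# Hypothetical price-change events: (time in minutes since midnight, price).
events = [(10 * 60 + 50, 71), (11 * 60 + 15, 69)]

def window_max(events, start, end):
    """Naive window semantics: consider only events with start <= t < end."""
    prices = [p for t, p in events if start <= t < end]
    return max(prices) if prices else None

# The 11:00-12:00 window contains only the 11:15 event (price 69),
# so "was the price ever > 70?" is answered: no.
m = window_max(events, 11 * 60, 12 * 60)
print(m, m is not None and m > 70)  # 69 False
```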
There were plenty of answers; some even tried to answer the question, for example by adding dummy events (one at the end of the interval? one every minute?) with the value 71.
However -- I am going to make the following assertions:
(1). The requirement given is not an event processing pattern.
(2). Attempts to treat it as an event processing pattern are not very useful.
(3). It is, in fact, a kind of temporal query.
(4). It may make sense to have the capability to issue temporal queries in response to events (AKA retrospective event processing), but this has to be done right.
Assertion one - the requirement is not an event processing pattern. An event processing pattern is a function of events; it is no surprise that Luis found it difficult to phrase the requirement as such. Let me take two other examples that look syntactically the same and try to understand what the problem is here:

The government agency example: A government agency known for the long queues in getting service tries to monitor the length of the queue. Periodically some clerk goes out and counts the number of people waiting in the queue. At 10:50 he found 71 people in the queue; at 11:15, 69 people; there were no more samples by 12:00. Now the question is whether there was some point in the time window [11:00, 12:00] at which the number of people in the queue was > 70.
Before starting the discussion, let's look at another example, the bank account example.
At 10:50 Mr. X deposited $30; his previous balance was $41, which made his balance $71.
At 11:15 Mr. X withdrew $2, setting his balance to $69.
From a syntactic point of view, the fish market example looks exactly like the queue monitoring example: in both cases we have events at 10:50 and 11:15 with attributes 71 and 69 respectively. However, they are not the same. The reason is that the price in the fish market is fixed until changed, while the length of the queue may have changed several times, up and down, between samples, since the event here is only a sample and does not cover all changes. Both of these events observe some state (price or queue length), but the semantics is quite different. If we use the dummy-event solution in the queue case, the value will probably be wrong; furthermore, in the queue case we cannot really answer the query with "true" or "false", yet, in reality, periodic sampling is a totally valid type of event. Moreover, if we look at the bank account example, it looks very different from the fish market example -- it has two types of events, and the events do not observe a state; they report a change, and report the change value ("delta"). Thus, looking at the two events of deposit and withdrawal alone, we will not be able to answer the question either; but knowing the state (the account balance) together with the delta (for the deposit and withdrawal), we get something semantically similar to the fish market example.
What can we learn from these examples? First, that the property "the value is the same until it is changed" is not a property of an attribute of an event; it is a property of the state (data) that may be created or updated by events. It is true for some states and not for others. The solutions given were based on the fact that a human knows the semantics of this state and writes an ad-hoc query. But this is processing of the state, based on its semantic properties, and not of the events.
Assertion two -- attempts to treat it as event processing are not very useful.
In the past I've blogged about the hammer and the nail. There is a natural tendency of anybody who has a product to try to stretch its boundaries. This may also backfire: trying to do some functions that the product is not really built for, and not doing a great job of them, can overshadow the good parts of the product. A solution like adding "dummy events" is a kind of hacking. It abuses the notion of an event (since a dummy event did not really happen); moreover, given that this is just an ad-hoc query, and there can be many such queries, covering all of them may require an exponential number of dummy events... Anyway, event processing software is just part of a bigger picture, and instead of improvising or hacking our way to this functionality, it may be more advisable to use a product with a better fit.
Assertion three -- this requirement is in fact a temporal query. I will not get into temporal queries now, but the actual query is over the price of 1 kg of fish as it changes over time. It is an existential query -- checking whether some predicate holds somewhere in the interval. Another example of a temporal query: was there any day during the last 30 days in which the customer withdrew more than $10,000 in a single withdrawal?
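As a minimal sketch (again Python, with an invented data layout) of what such a temporal query looks like when the state is modeled correctly: the price is a step function of time, so the query examines the value that holds over the interval, not just the events that fall inside it:

```python
import bisect

# Price changes as (time in minutes, new price); the price holds until changed.
changes = [(10 * 60 + 50, 71), (11 * 60 + 15, 69)]
times = [t for t, _ in changes]

def price_at(t):
    """Value of the step function at time t (None before the first change)."""
    i = bisect.bisect_right(times, t) - 1
    return changes[i][1] if i >= 0 else None

def ever_above(threshold, start, end):
    """Existential temporal query: did the price exceed threshold in [start, end)?"""
    # Candidates: the value holding at 'start', plus every change in the window.
    candidates = [price_at(start)] + [p for t, p in changes if start <= t < end]
    return any(p is not None and p > threshold for p in candidates)

print(ever_above(70, 11 * 60, 12 * 60))  # True: the price 71 still holds at 11:00
```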
And this example brings us back to assertion four --- it may make sense to couple event processing software with temporal queries. An example: we have an event that makes a customer a "suspect" in money laundering, but we need reinforcement by issuing some temporal queries against the past - like the one written above... I'll write about this type of functionality at a later stage.
Well - it is 1:15 AM, so I'd better get some sleep; tomorrow is again a busy day. So the conclusions: first, not everything that looks simple to do manually is simple to do with a generic type of thinking; second, event processing software should concentrate on doing event processing right, and not on doing other stuff wrong... Some follow-up blog postings -- later
Back in my office, with the machine-made coffee; starting the day by reading some stuff on the Web. First I saw David Luckham's request to write some predictions about the CEP market in 2009. It seems that I've misplaced my crystal ball, which probably means that I am not in the prophecy business these days. While there are things that are beyond our control, I think that the approach taken by Paul Vincent, talking about challenges, is more constructive.
I agree that one of the challenges is to define the borders of the area -- like other areas that have determined a clear definition of their scope -- and maybe a partition into sub-types. There are other challenges around interoperability -- how connectivity to many producers and many consumers of various types can be achieved, and also interoperability between event processors that send events to each other. I view the EPTS work groups that will hopefully be launched later this month (and those continuing from the pre-EPTS era) as vehicles for the community's effort to advance in these areas: the use-cases work group in defining the various sub-types, the language analysis group in working on required functions, the interoperability analysis group on interoperability issues, the meta-modeling group on the modeling perspective, and of course the glossary and the reference architecture as pivots in defining terms and the relationships among them. We shall not finish all this work in 2009, but my challenge to the community is to achieve significant progress in all of these areas during 2009, and make it the year in which much of the discipline is defined.
In addition, I read with interest Philip Howard's short article on "Event Processing Networks" (below is Philip's picture from the Web).
I received it in a direct email from David Tucker, Event Zero's CEO, and later also found it on David Luckham's site. Anybody who reads my blog may realize that I view the EPN as the basis of the conceptual and execution model for event processing. Anybody who reads Philip's article may infer that EPN is a new concept invented by Event Zero, which is not really true, though Event Zero is indeed one of the first companies to implement an EPN-based solution.
The glossary defines an EPN as: a set of event processing agents (EPAs) and a set of event channels connecting them. The glossary definition is very general, and there can be many implementations that fit it. One view of EPN is as a conceptual model, implemented using existing tools; another view of EPN is as an execution architecture. With the few implementations of EPN that exist right now, we see the known "Babylon tower" phenomenon that I have written about in the past -- each implementation chooses its own set of primitives (in this case, agent types).
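To illustrate the glossary definition, and nothing beyond it, here is a minimal sketch in Python of an EPN as agents connected by channels; the two agent types (filter and transform) are my own arbitrary primitives -- exactly the kind of choice each implementation makes for itself:

```python
from collections import deque

class Channel:
    """An event channel: a simple FIFO connecting agents."""
    def __init__(self):
        self.queue = deque()
    def emit(self, event):
        self.queue.append(event)

class Agent:
    """An event processing agent: reads from one channel, writes to another."""
    def __init__(self, fn, source, sink):
        self.fn, self.source, self.sink = fn, source, sink
    def step(self):
        while self.source.queue:
            out = self.fn(self.source.queue.popleft())
            if out is not None:
                self.sink.emit(out)

# A tiny network: raw -> filter agent -> mid -> transform agent -> out.
raw, mid, out = Channel(), Channel(), Channel()
agents = [
    Agent(lambda e: e if e["value"] > 10 else None, raw, mid),  # filter agent
    Agent(lambda e: {**e, "value": e["value"] * 2}, mid, out),  # transform agent
]

for v in (5, 12, 30):
    raw.emit({"value": v})
for a in agents:  # naive sequential scheduling; a real EPN could run agents in parallel
    a.step()
print(list(out.queue))  # [{'value': 24}, {'value': 60}]
```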
The benefits of the EPN model lie in its relative simplicity, its generality, and its natural support for distributed environments and parallel processing (not for free; some more wisdom is required here!). My view is that the concept of EPN should be at the center of the event processing community efforts mentioned before --- from the fundamental theory to the execution optimizations. I'll write more on that in later blogs.
The new year is going to arrive in less than three hours (local time). We don't really celebrate the new year, as our new year already started in September; we celebrate our holidays according to the Hebrew calendar, but daily life is handled according to the Gregorian calendar, so I am mentioning the date. The mood here is not a celebratory one. Various places in the world have their own natural disasters: the east coast of the USA and the Caribbean islands have hurricanes, parts of Asia have tsunamis, the west coast of the USA and some other places have earthquakes, and we in Israel have periods of fighting with one of our neighbors. Some view it as the natural disaster of the Middle East, like hurricanes; however, unlike hurricanes, I strongly believe that human violence can be avoided, but I will not get into the very complex situation in this blog. For those who were asking about my personal safety -- this time we are (so far) quite far from the combat area, and in general I have a better feeling of personal safety here than in many other places in the world.
An interesting comment on my previous posting was made by Richard Veryard.
Richard referred to the comment I made about being more productive in a café, and asked whether this is a result of getting fewer interruptions, and what we can project from this to event processing at the enterprise level.
I think that this is a good point. Actually, one of the most marketed benefits of event processing systems is their ability to help in not missing any event that requires reaction (I think that the phrase identifying "threats" and "opportunities" is much of an over-statement for most detected situations, but I will write about that another time). In some cases, however, the business benefit of an event processing system is actually reducing the number of events, and focusing the decision makers on the important ones.
Some examples:
- In network management there is a phenomenon known as an "event storm": e.g., when some network segment goes down, many devices send "time-out" alerts, which are just symptoms of the real problem. What we want is to reduce this event storm to a single event that needs to be reacted upon.
- I would like to get an alert when my investment portfolio is up by 5% within a single day (as you can see, I am still optimistic). Here I don't care about the many raw events concerning any of my individual investments, but about the situation defined above.
The conclusion (not very surprising) is that sometimes less is more --- an event processing system can eliminate events in various ways (a small sketch follows the list):
- Filtering out unnecessary events;
- Aggregating multiple events into one;
- Reporting a derived event when some pattern is detected.
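Here is the promised sketch (Python, with invented event shapes; the "event storm" threshold is an illustrative assumption, not any vendor's semantics) showing the three reduction styles on a toy stream:

```python
# Invented event stream: network alerts and portfolio ticks.
stream = [
    {"type": "timeout", "device": "d1"},
    {"type": "timeout", "device": "d2"},
    {"type": "timeout", "device": "d3"},
    {"type": "tick", "portfolio_change_pct": 5.2},
    {"type": "heartbeat"},
]

# 1. Filtering: drop events nobody needs to react to.
relevant = [e for e in stream if e["type"] != "heartbeat"]

# 2. Aggregation: collapse many symptom events into one.
timeouts = [e for e in relevant if e["type"] == "timeout"]
storm = None
if len(timeouts) >= 3:  # illustrative "event storm" threshold
    storm = {"type": "segment_down", "symptoms": len(timeouts)}

# 3. Pattern detection: derive a new event when a situation holds.
derived = [
    {"type": "portfolio_alert", "pct": e["portfolio_change_pct"]}
    for e in relevant
    if e["type"] == "tick" and e["portfolio_change_pct"] >= 5
]

print(storm, derived)
```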
I'll blog more about this issue, but remember -- sometimes less is more and more is less...
There are people who like working in coffee shops; in my academic days I had a colleague at U.C. Berkeley who liked to work in one of the coffee shops that surround the beautiful campus. I usually work at home or in my office (where I sometimes spend 13 hours per day), but today I spent the morning in a coffee shop called "Grand Cafe" -- well, not quite the famous one in Paris seen in the picture, but a much smaller one bearing the same name in Haifa. I am not sure that it was cost-effective for the coffee shop, since I ordered one mug of "upside down" coffee, which is the Hebrew name for a coffee in which the milk is put into the mug first and the coffee later. Anyway, I spent the time writing some paper, and found that the coffee shop setting actually makes me more productive than my office... I will try again to see if this is consistent...
I also talked recently with somebody who works for one of the NSM (network and system management) vendors; the discussion was about my favorite topic -- event processing, and why the NSM companies, who deal with events as their primary occupation, did not really try to look at "event processing" as a more general discipline and extend it to more types of applications. The answers I got from him reminded me of two things --- one from the far past, and one from the near past.
From the far past I recall that when I was assigned as a programmer to the Israeli Air Force IT unit, at the age of 18.5, I got my first assignment and very enthusiastically wrote that program (in those days -- in PL/1...). Then I got my second assignment. I read it carefully, went to my division commander, and told him that the second program was quite similar to the first; the changes were in the details -- somewhat different input, somewhat different validity checks, somewhat different calculations, but the same type of thinking. So instead of writing this program, I suggested thinking bigger: look at this class of programs and try to write a "general parametric program", which would take each individual program as a set of arguments to the general program. My division commander heard me out with a lot of patience and then said: great idea; however, our mission is to write certain kinds of applications for the air force, not to invent general programs. You may talk with the guys in the "chief programmer" department (what today we would call the CTO); they are in charge of generic software. I was just a beginner and he was a captain, and the commander, so I deserted these ideas, but pursued them later in life, as I have always been under the impression that people program the same things again and again, and that the level of abstraction in programming should be higher.
So - as you can guess from that story, the NSM guy just told me: our aim has been to build network and system management software, not generic software.
I also remembered that there was some discussion on the Web on the same topic, and found on David Luckham's site an article
entitled "an answer from the engine room of the industry". The question that David phrases is:
I have often asked why the network monitoring applications that were developed in the late 1980’s and early 1990’s didn’t get extended to apply to business level events at about the same time.
The question is answered by Tom Bishop, the CTO of BMC, who is seen here in the picture.
I met Tom Bishop in the late 1990s, after IBM acquired Tivoli (an NSM vendor), when we in IBM Haifa had a project with Tivoli; Tom made the impression of an impressive big guy with a Texan accent. Now he is BMC's CTO. In his answer to David Luckham he gives roughly the same answer, in three parts (quoting Tom Bishop):
- When the architects for these products were building them, they weren't actually thinking of the broadest applications for the types of systems they were trying to build, but were really focused on solving a very specific problem;
- As we know all too well, often the correct way to solve a problem is to find the most general description of the problem and then assume that, if you've done your job correctly, the specific solution can be described as an instance or subset of the more general problem. But this only works if you know to set your sights high enough. In the environment you note above, this didn't happen.
- The people who buy IT management solutions don't care if the solutions they buy might also be used to solve a business activity monitoring solution, and the people who buy business activity monitoring solutions don't care if the solutions they buy might also be used to solve an IT management solution. In fact, these two groups of people almost never talk to each other!
This all revolves around the same phenomenon --- there is a big difference between hard-coding an application that does X and building a generic program of which the application doing X is an instance. Furthermore, the fact that there are various hard-coded applications doing variations of X may help with requirements, but it does not mean that we are closer to a generic application, since the level of abstraction is probably wrong.
I guess that if generic event processing software had existed when the NSM software was built, the NSM vendors would have used it instead of re-inventing the wheel, the same way they used existing databases, existing communication networks, etc.
Event processing as a discipline is about creating generic software. My personal prediction: NSM vendors will gradually merge into the event processing community.
More - Later
My employer, IBM, does not allow transferring vacation days across years; thus, even though I do not celebrate any major holiday this week, I decided that this is a good time to spend the rest of my vacation days for the year and take two weeks off (one of them is already behind me) - spending some time with my children, taking care of some neglected health issues, and also reading books (it is rainy and cold, not a time to wander around much...). I looked around the Web a little today to see if I have missed something, and found on David Luckham's site a reference to Philip Howard from Bloor, who writes about untangling events. I understood that Philip is trying to look at various event-related marketing terms and determine whether they are synonyms, and whether there is a distinct market for each... In doing so, he tries to list the various functions performed by event processing applications, and then reaches the (unsurprising) conclusion that each application does some subset of this functionality. But at the end he admits that he did not get very far and left the question unanswered, promising to dive deeper into it.
In essence he is right in his conclusion -- the various functions create a continuum, of which a specific application may need all or a subset. Typically there is a progression: starting from getting events and disseminating them (pub/sub with some filtering); then advancing to do the same with transformation, aggregation, enrichment, etc. -- so the dissemination relates to derived events and not just to the raw events; and then advancing to pattern detection, to determine which cases need reactions ('situations') and which events should participate in the derived events (yes - I still owe one more posting to formally define derived events).
One can also move beyond all of these and deal with uncertain events, mine event patterns, or apply decision techniques for routing.
I think that there are multiple dimensions for classifying applications:
- Based on functionality, as noted above;
- Based on non-functional requirements -- QoS, scalability in state, event throughput, etc.;
- Based on type of developers --- programmers vs. business developers;
- Based on the goal of the application --- e.g. diagnostics, observation, real-time action...
There may be more classifications --- the question is whether we can determine distinct market segments. Probably yes -- with some overlaps. This requires empirical study, and indeed this is one of the targets of the EPTS use-cases working group, which is chartered to analyze many different use cases and try to classify them. Conceptually, for each type there should be a distinct benchmark that determines its important characteristics.
Still - I think that all the vendors that are going after "event processing" in the large sense will strive to support all the functionality. As an analogy: not all programs require the rich set of built-in functions that exist in programming languages, but typically languages are not offered for subsets of the functionality. Likewise, looking at DBMS products, most vendors support the general case. Note that there is some tension between supporting the general case and supporting a specific function in the most efficient way, but I'll leave this topic for when I am blogging at an earlier hour of the day --- happy holidays.
Today I travelled to Beer Sheva, the capital of the Negev, the southern part of Israel, which consists mostly of desert. I visited Ben-Gurion University, met some old friends, and gave a talk in a well-attended seminar on "the next generation of event processing". I travelled by train (2 hours and 15 minutes in each direction), and since my last visit there, five years ago or so, they have built a train station from which a bridge leads to the campus -- very convenient. Since I am not a frequent train rider in Israel, I discovered that at both ends of the line there are no signs saying which trains go on which track; this is assumed to be common knowledge... They do announce, when a train enters the station, where it is going and from which track, but they still have a point to improve.
Since some of the people attending my talk were data mining people, they wondered about the relationship between event processing and data mining; as I've heard this question before, I thought the answer would be of interest to more people.
In the "pattern detection" function of event processing, there is a detection in run-time of patterns that have been predetermined, thus, the system knows what patterns it is looking for, and the functionality is to ensure correct and efficient detection of the predetermined patterns.
Data mining is about looking at historical data to find various things. Among the things that can be found are patterns that have some meaning, where we know that if such a pattern occurs again it requires some reaction. The classic example is "fraud detection", in which, based on mining past data, there is a determination that a certain pattern of action indicates suspicion of fraud. In this case the data mining determines the pattern, and the event processing system detects at run-time that this pattern occurs. Note that not all patterns can be mined from past data; for example, if the pattern looks for violations of regulations, then the pattern stands for the regulation, and it is not mined but determined by some regulator and given explicitly.
So - typically, data mining is done off-line and event processing is performed on-line; but again, this is not true in all cases, and there are examples in which event processing and mining mix at run-time. An example: there is a traffic model according to which the traffic light policies are set, but there is also constant monitoring of the traffic, and when the monitored traffic deviates significantly, the traffic model has to be changed and the policies set according to the new traffic model. This is a kind of mix between event processing and mining, since the actual mining process is triggered by events, and the patterns may change dynamically as a result of this mining process.
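As a minimal sketch (Python, with a made-up pattern representation) of this division of labor: an off-line step stands in for the mining that produces a pattern from history, and the on-line step merely checks incoming events against the predetermined pattern:

```python
# --- Off-line: a toy stand-in for data mining over historical transactions. ---
history = [120, 95, 110, 9800, 130, 105]

def mine_threshold(past, k=3.0):
    """Toy 'mining': flag amounts more than k times the historical median."""
    median = sorted(past)[len(past) // 2]
    return {"field": "amount", "op": ">", "value": k * median}

pattern = mine_threshold(history)  # the predetermined pattern handed to run-time

# --- On-line: event processing detects the predetermined pattern. ---
def matches(event, pattern):
    return event[pattern["field"]] > pattern["value"]

for event in ({"amount": 200}, {"amount": 12000}):
    if matches(event, pattern):
        print("suspicious:", event)  # here a derived 'situation' event would be emitted
```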
More - Later.