
Wednesday, December 23, 2009

On common misconceptions about event processing - the complexity misconception



David Luckham coined the term "complex event processing", and it has caught on as the marketing term behind many of the vendors that provide event processing platforms (comment: IBM, and recently Progress/Apama, have moved to use the term "business event processing"). While the term succeeded in getting traction, it is also the source of one of the common misconceptions. Luckham talked about complex events and their processing; some people understand the term as the complex processing of events, and some just view it as the intersection between event processing and "complex systems". A complex event is defined as an "abstraction of one or more other events", which still leaves room for interpretation about the nature of abstractions, but this reading is the easier one to understand. The misconception is that it is more intuitive to take the second interpretation, "complex" processing of events, and this brings us to the question -- what is complexity? There can be different dimensions of complexity.

  1. The complexity may stem from the fact that we don't know exactly what we are looking for, and are generally looking for anomalies in the system (e.g. trying to find security violations); thus some AI techniques have to be applied here.
  2. Another case is that we know which patterns or aggregations we are using, but they require complex computation in terms of functional capabilities.
  3. Another case is that the complexity lies in some non-functional requirement: scalability in several directions (scale-up or scale-down), strict real-time performance constraints, a highly distributed system, etc.
  4. Another case of complexity is interoperability: the need to obtain events from many producers and to deliver events to many consumers, which requires instrumentation/modification of a lot of legacy systems.
  5. Yet another case of complexity is unreliable event sources, which requires handling false positives and false negatives.

There are probably more complexity cases; however, the interesting question is whether the main goal of an event processing system is to solve a "complex" problem.

Wearing my scientist hat, it is definitely more exciting to solve complex problems -- even better, problems of a type never seen before. From a pragmatic point of view, however, event processing applications are measured by their business value, and there may be a lot of business value in applying event processing to systems that have none of these complexity measures. From a complexity point of view they can be quite simple; moreover, there may be nothing exciting about the implementation, as it is similar to other implementations already done, but on the measurement of "business value" it brings a lot of value. The value metric is thus orthogonal to any complexity metric, and indeed many of the applications in which event processing technology is very useful are quite simple (according to one analyst report, the "simple" applications are 80-90% of the potential market for event processing technology). While there is certainly a segment of applications for each type of complexity, and more work is required in these directions, the "simple" applications will be the bread and butter.

More misconceptions - later.

Sunday, April 19, 2009

On Event Processing related Books


The number of books related to event processing is growing, and in the next year or so several books will be added to the already existing ones, looking at event processing from different angles.

First let's review the existing books.

"The power of Events" by David Luckham was the pioneering book in this area and became an icon by its own right, the book was published in 2002, and influenced the thinking in this area, including the name coined "complex event processing" that is in use as a name being used for various products in this area.

The book "Distributed Event-Based Systems" by three members of the research community: Gero Mühl, Ludger Fiege, and Peter Pietzuch. It should be noted that the "DEBS" community which concentrated around pub/sub and distributed middleware, has joined forces with the community that deal with event processing languages, architectures and execution models to form the event processing community.
This book published in 2006 deals mostly with the infrastructure that enables event-based interaction (such as: pub/sub) and is a good textbook on the infrastructure topics.
The book "Event-Driven Architecture - how SOA enables the Real-Time Enterprise" is a new book by Hugh Taylor, Angela Yochem, Les Phillips, and Frank Martinez available also as a Kindle book. I have not read it yet, but from the title it seems to look at EDA as a SOA pattern. Will add it to my next periodic Amazon order.

Now - to future books. Amazon also shows a book that has not been released yet, by Sharma Chakravarty, one of the veterans of active databases, and Qingchun Jiang. The book is entitled "Stream Data Processing: A Quality of Service Perspective" and has "Complex Event Processing" in its sub-title; it seems to be a monograph about Sharma's approach to unifying stream and complex event processing, which he presented at DEBS 2008.
Annika Hinze and Alex Buchmann are, again, two people from academia who are well known in the community. Their intended book is a collection of articles entitled "Handbook of Research on Advanced Distributed Event-Based Systems, Publish/Subscribe and Message Filtering Technologies", and is geared towards the research community. I don't know the schedule.

Another forthcoming book, by well-known figures in our community, Roy Schulte and Mani Chandy, is entitled "Event Processing: Designing IT Systems for Agile Companies".


This book will provide a business-oriented view of event processing and its relations with various parts of the enterprise architecture (SOA, BPM, BI).


Last but not least, the book that Peter Niblett and I are writing is entitled "Event Processing in Action". The book focuses on the building blocks of constructing event processing applications, and provides a deep dive into application building using a use case.




The fact that various publishers have invested in event processing oriented books is an indication of the interest in this area. Enriching the community with several books with different viewpoints and focus areas will help in both understanding and teaching event processing concepts and facilities.

Thursday, April 2, 2009

On the EPTS Language Analysis Working Group and April Fool Day


Yesterday, April 1st, was the traditional day for practical jokes, AKA "April Fools' Day". The slideshare site decided to play a practical joke on the people who post slides, sending an Email entitled "you are a slideshare rock star" and saying: We've noticed that your slideshow on SlideShare has been getting a LOT of views in the last 24 hours. Great job ... you must be doing something right. ;-). Since I had just posted a presentation there, I went to see what they were talking about and found out that hundreds of thousands of people had visited my slides in the last day. Since I don't really think that event processing presentations are that popular, at first I thought they had their counts mixed up, but it took me 20 seconds to remember the date and realize it is a practical joke. However, some other people took it seriously and started to notify the whole world about their tremendous accomplishments. At some point during the day the site clarified that this is indeed a practical joke, and got some angry responses along the lines of: after notifying the whole world about it, I am left humiliated. Some humor will not harm anybody; also, people should have some judgement about what is reasonable before notifying the entire world. Well - today is April 2nd and the counts have returned to normal. It turned out that there were 83 views of the presentation I posted on Tuesday -- not really rock star scale, but not bad for the community size for a single day. BTW -- this Blog broke its record high of views twice last week, once on Tuesday, and then again on Thursday.

Yesterday, we also had a meeting (not a joke) of the EPTS language analysis workgroup. We are advancing by setting the criteria for analysis over the dimensions -- events, meta-data, state, computational/execution model and programming model. All EPTS members have access to the members Wiki, and can make comments. If you want to be part of this process and are not an EPTS member yet, you are invited to join; the public website of EPTS has instructions about joining -- one can join as an individual member, and there are no membership fees. While the criteria list is being constructed, we are also constructing the list of languages that will participate in the evaluation. The goal of the evaluation is not to do a "beauty contest" of languages, but to understand the different functions that exist today in languages in order to abstract out a semantic model of event processing functionality. Again, EPTS members can update the language list on the members Wiki. The languages are either languages that exist in products, or languages that have been used in the research community, which may also contain interesting features that do not exist in product languages. More about this workgroup - later.

Thursday, March 19, 2009

On data flows, event flows and EPN


Bob Hagmann from Aleri (ex-Coral8) has advocated a "data flow" model as an underlying model that unifies both of Aleri's engines, and contrasts it with "event delivery systems" in which programmers create state manually if needed. I am not really familiar with the phrase "event delivery system" and don't know what he refers to, but there are event processing systems that employ programming styles different from stream processing, in which state is handled implicitly by the system and the programmer does not really deal with creating states.

But -- I have no interest in "language wars"; my interest these days is somewhat different -- to find a conceptual model that can express, in a seamless way, functionality that is expressed in different programming styles.

Actually the conceptual model of an EPN (event processing network) can be thought of as a kind of data flow (although I prefer the term event flow - as what is flowing is really events). The processing unit is the EPA (Event Processing Agent). There are indeed two types of input to an EPA, which can be called "set-at-a-time" and "event-at-a-time". Typically, SQL-based languages are more geared to "set-at-a-time", while other language styles (like ECA rules) work "event-at-a-time". From a conceptual point of view, an EPA gets events through channels; one input channel may be of a "stream" type, while in another the events flow one by one. As there are some functions that are naturally set-oriented and others that are naturally event-at-a-time oriented, and an application may not fall nicely into one of them, it makes sense to have a kind of hybrid system, with the EPN as the conceptual model on top of both of them...
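To make the distinction concrete, here is a minimal sketch of the two input styles (in Python; this is my own illustration, not code from any particular product): a hypothetical event-at-a-time agent that reacts to each event as it arrives, and a set-at-a-time agent that is invoked on a whole window of events at once.

from dataclasses import dataclass

@dataclass
class Event:
    kind: str        # e.g. "trade", "bid"
    value: float

class EventAtATimeAgent:
    """Reacts to each incoming event individually (ECA-rule style)."""
    def __init__(self, threshold):
        self.threshold = threshold

    def on_event(self, event):
        # condition part of the rule: fire only on large values
        if event.value > self.threshold:
            return Event("alert", event.value)   # derived event
        return None

class SetAtATimeAgent:
    """Operates on a whole set of events at once (SQL / stream style)."""
    def on_window(self, events):
        # aggregate over the set, like SELECT AVG(value) over a window
        values = [e.value for e in events if e.kind == "trade"]
        if not values:
            return None
        return Event("average", sum(values) / len(values))

if __name__ == "__main__":
    window = [Event("trade", 10.0), Event("trade", 95.0), Event("bid", 50.0)]

    one_by_one = EventAtATimeAgent(threshold=80.0)
    for e in window:                                   # events flow one by one
        derived = one_by_one.on_event(e)
        if derived:
            print("event-at-a-time derived:", derived)

    whole_set = SetAtATimeAgent()
    print("set-at-a-time derived:", whole_set.on_window(window))   # whole window at once

In a hybrid EPN, both kinds of agents would simply be nodes in the same network, fed by channels of the appropriate type.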

This is the short answer. More detailed discussion -- later.

Wednesday, March 11, 2009

On misconceptions and fun in event processing


Today I am on a short (two-day) vacation, due to the Purim holiday... Purim is a fun holiday for children, in which they wear costumes; here is my daughter Hadas, posing as a hippie (something she knows only from old movies). The fun takes me to a recent Blog posting of Hans Gilde, who wrote about two false statements and a fun fact about CEP. So I decided to have my own version, talking about three misconceptions. Not really new stuff -- I have written about it before -- but putting it all in one posting. I'll concentrate on three misconceptions: the elephant's view misconception, the complex about complex misconception, and the maturity misconception.




I have used this picture before: several blind people touch various parts of an elephant and reach different conclusions about what an elephant is. The first misconception is that there is a single reason, a single industry, a single killer application, and a single view on what event processing is. Some people strongly identify event processing with certain trading applications in capital markets; the financial markets industry has indeed been an early adopter, but other industries (gaming, chemical and petroleum, travel and transportation, retail, healthcare, insurance and more) are getting there, and we see a wide range of industries and applications, bringing a variety of requirements. I don't foresee any single dominant industry in the future. Some people think that the only ROI for event processing is the support of high throughput, but most EP applications today don't really require high throughput support; in some cases the high throughput support is indeed the ROI, in others it is the TCO of using higher level abstractions or the use of adapters. Some other voices say that the only important requirements are those relevant to some security applications. Well - this is true for security consultants who make their living from these applications -- but not in general. We should strive to have a bird's-eye, holistic view of the entire elephant.




David Luckham coined the term "complex event processing"; he meant the processing of complex events, where a complex event is an abstraction or aggregation of other events that creates a higher level event. However, the ambiguity of this phrase made other people interpret it as "complex processing of events", and somehow it became a name that various people use to denote ANY processing of events. There is even one person who has declared a full-fledged cyber war on anybody who uses the term "complex event processing" outside what he believes to be the original intent of a past DARPA project in the area of security and military operational applications. My view is that complexity has various dimensions: in some cases the complexity is in the event structure; in other cases it stems from the uncertainty inherent in the application (e.g. in fraud detection, the patterns being looked for are a constantly moving target); another type of complexity is in the patterns themselves that need to be supported; and in some cases the complexity is not in the functionality, but in non-functional requirements such as hard real-time constraints on latency, handling high throughput, the need for deterministic performance, etc. For example: network and system management are intended to find the root cause of some events considered as "symptoms". The complexity is in attributing a collection of symptoms to actual problems, and in the fact that the space of symptoms may be incomplete, thus stochastic models are needed; however, the pattern language needed is rather simple. On the other hand, in regulation enforcement the patterns may be complex, but they are given, there is no uncertainty aspect involved, and stochastic models would be of little use (they are used in risk assessment, but that is another opera).
The third misconception is the maturity one. Some vendors claim that they have mature technology, meaning that their product is stable and is being used by customers without substantial problems. This does not mean that the event processing discipline itself is mature.
In the insect life-cycle (see above picture) it goes from the phase of young larva to the phase of mature larva, but the way to an adult insect is still (relatively) long. I always compare the state of event processing today to the state of databases in the hippie era -- the late 1960s. Yes, there were some databases that worked, but the notions of concurrency control, transactions, distributed databases and query optimization, to name a few, were not there; even the formal theory of the relational model was not there. You can draw a direct analogy to what is missing in the event processing state-of-the-practice.

In the fun fact, Hans Gilde talks about constructing the event processing discipline as a cohesive field, as the result of the six EPTS working groups which are currently active. It is indeed fun work, but also a huge challenge. It will take time (Rome was not built in one day either...). We need the best and the brightest to help in that, and I hope that knowledgeable persons like Hans will also join and help in meeting this challenge. More - Later.

Friday, February 27, 2009

On levels of decision makers and event processing - part I



I am sitting now in my living room, watching the heavy rain outside. We did not get a lot of rain this winter; however, when the rain comes it tends to come at the wrong times, like weekends (Friday is part of our weekend, which is Friday and Saturday; Sunday is a normal work day, used to catch up on things, since on this day our colleagues from abroad are idle and there are no conference calls or other interactions).

Today I would like to concentrate on the question -- what level in the enterprise is event processing for? I had a recent discussion with somebody who investigated the BRMS market and asked him this question about BRMS; the answer was that most BRMS products concentrate on the operational level -- the typical example of BRMS is assigning a rate to an insurance policy, which is clearly an operational decision. What is the situation in event processing? There is a famous analyst presentation that talked about "detecting threats and opportunities" as part of the ROI for event processing pattern matching. Let's examine what's behind this title. Sometimes there are risks at the operational level, such as security attacks, but since this presentation concentrated on the business area and not on IT, the meaning was seeking opportunities for the business and mitigating risks for the business, which is beyond the operational level. Such detection is probably at the tactical level, but the outcome can flow to the strategic level, since there may not be an answer to a specific threat or opportunity within the current strategy. On the other hand, event processing is also associated with being done on-line (what some people call "real time", or "near real time" when they are not sure if it is really "real time", which is even worse as a term).

Some interesting questions on this topic are:

1. Are organizations really making tactical and strategic decisions on-line? In the illustration at the top of the page, taken from the Microsoft site, the authors believe that tactical decisions are a matter of days to months, and strategic decisions are made at a resolution of quarters or years. Is there benefit/feasibility in making them on-line?

2. Do we need different variations of event processing for the different levels ?

3. Are the semantics of events the same across all levels?

4. Are there different complementary technologies, and different platforms in the different levels, or can we look at a single event processing system across levels?

Today I am just posting the questions; I will try to address each of them in subsequent postings.


Wednesday, February 25, 2009

More EPTS News


EPTS moved to a hyperactive phase with the launch of the six working groups. This has also been an opportunity to gain more members, as membership in the working groups is a privilege provided to EPTS members. There are several more members belonging to most of the different communities --- academic people, consultants, customers and vendors (no new analysts...). I am especially glad to mention that CA has recently joined EPTS; I welcome CA as an experienced company whose main line relates to managing infrastructure events, and knowing some of its people who have already been involved in the community, I am sure that they can make a substantial contribution to the technical society.

In fact, any new member interested in contributing to the understanding and progress of the event processing area on multiple fronts is most welcome to join. There is a lot of action and room for contribution and innovation. Event Processing is a relatively young discipline, and a discipline's progress is accelerated by community work. I am confident that the activities done today, with a lot of excellent people who invest time and energy to contribute, will make a substantial impact on the universe. If you are interested in joining EPTS, please follow the instructions.

Somebody who looked at the public EPTS home page asked me why there is no trace of all these activities on the site. The answer is that on the public site we'll post final reports, while work in progress is done internally; thus the "action" is reflected in the members Wiki.

EPTS is also starting to be recognized in the larger universe as a representative of the EP community. OMG mentioned EPTS explicitly in its RFP that deals with event meta-modeling. Several professional magazines, both industry oriented journals and general computing magazines, are running articles about EP or CEP and covering EPTS as part of these articles; I'll write more about each such article when I have it in my hands.

Tuesday, February 24, 2009

On BRMS and EP


This is a slide rule, an ancient means of doing arithmetic calculations easily, if one has some experience working with it. When I took the matriculation exam in mathematics many years ago, the Israeli ministry of education did not allow the use of calculators, since calculators at that time were relatively expensive and were considered to give an unfair advantage to those who could afford them; however, they did allow the use of slide rules, so one served me well at that time. Today slide rules, together with logarithmic tables and typewriters, have found their way to museums, but other types of rules are still with us.

Some Blogs have recently made references to the recent Forrester report with the catchy name:
Must You Choose Between Business Rules And Complex Event Processing Platforms?

The Forrester report discusses some confusion that exists between the two terms. It is true that there is some ambiguity in the word "rules": on one hand, rule-based is a kind of programming style that can be used to express event processing patterns; on the other hand, BRMS is a collection of products with a certain functionality. Forrester also claims that, to add to the confusion, there are people who use (or abuse) BRMS products to do CEP applications and CEP products to do business rules applications. You can read the rest of the original report for more details. In my previous posting about state processing and event processing I talked about the difference between the two. In fact, BRMS products process the current snapshot (state), while event processing is about processing the history of transitions; different kinds of techniques and optimizations are used for each.
I have also blogged recently about decision agents, talking about the fact that event processing agents (at least some of them) can be a subset of a larger whole which can be called decision agents. And indeed, while the two types of technologies are distinct, it also makes sense to look at them with a unified view. Here I share the vision of James Taylor, who talks a lot about Enterprise Decision Management, which consists of business rules, event processing and analytics. We'll hear much more about this concept - more later.

Friday, February 20, 2009

On static and dynamic event flows


This picture, which I found in one of the Web albums under "Minneapolis pictures", shows static and dynamic flows.

In continuation to my previous posting on event flow and decoupling, I would like to discuss the issue of static vs. dynamic event flows.

I already discussed the fact that event processing applications can be of many types, and naturally various types have their own properties.

There are applications whose nature is totally dynamic; such an application is the dissemination of alerts about customers' activities in banking systems. There are many subscribers that can subscribe to multiple types of alerts and change their subscriptions from time to time. In this type of application, monitoring the event flow can be done for system management purposes, e.g. collecting statistics about patterns of use, tracing individual flows for exception handling purposes, etc. However, there is no sense of a global event processing network, as there are many flow islands that are not related.

On the other hand, there are event processing applications in which the flows are relatively static: there is a relatively stable set of event processing agents with a relatively stable collection of relationships among them. Actually, many of the event processing applications I have encountered are of this type. Example: an event processing application that manages an auction. The flow here is fixed as long as the auction protocol is not changing, thus the collection of event processing agents and their relationships is fixed. Of course, the run-time instances are still dynamic. This is similar to a database schema that may be relatively stable, while the data itself is dynamic.
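To illustrate what "static" means here, below is a minimal sketch (my own, in Python; the agent names and the simplified auction protocol are invented for illustration) in which the agents and their wiring are fixed at definition time, like a schema, while the events flowing through them at run time are dynamic.

def validate_bid(event):
    # filter agent: drop malformed bids
    return event if event.get("amount", 0) > 0 else None

def enrich_bid(event):
    # enrichment agent: attach the auction's currency
    return {**event, "currency": "USD"}

class DetectWinningBid:
    # stateful pattern agent: derive an event when a bid tops all previous ones
    def __init__(self):
        self.highest = 0

    def __call__(self, event):
        if event["amount"] > self.highest:
            self.highest = event["amount"]
            return {"type": "new_highest_bid", **event}
        return None

# The static part: the agents and their wiring, fixed as long as the
# auction protocol is fixed (analogous to a database schema).
EPN = [validate_bid, enrich_bid, DetectWinningBid()]

def process(event):
    # The dynamic part: run-time event instances traversing the fixed network.
    for agent in EPN:
        event = agent(event)
        if event is None:          # filtered out, or no derived event
            return None
    return event

if __name__ == "__main__":
    for bid in [{"bidder": "a", "amount": 100},
                {"bidder": "b", "amount": 90},
                {"bidder": "c", "amount": 120}]:
        derived = process(bid)
        if derived:
            print(derived)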

The flow modeling is helpful for:
  • the "software engineering" aspect --- debugging, validation, analysis
  • the performance aspect --- enabling scale-out by semantic partitioning, a topic we are working on and which I'll discuss in detail in one of the future postings
  • the management aspect --- provenance, tracing, monitoring
There are more questions that need discussion about dynamic updates to an event processing network, and I'll discuss them in the near future -- more later.

Wednesday, February 18, 2009

On Event Flow and Decoupling


This is a simulation of an anesthesia workstation; it can simulate various cases that create a flow of events inside this configuration, e.g. what happens when there is a power failure.

I was recently asked whether there is a contradiction between two concepts:
  • The decoupling concept: each event processing agent is independent; it subscribes to some events, publishes derived events, and is independent of any other agent -- furthermore, it is decoupled and does not know anything about the other agents.
  • The event flow concept, in which there is explicit modeling of event flows.
My answer is that there is not really a contradiction, since these two principles live at two different levels. The decoupling is at the execution level: event processing agents indeed do not need to communicate with one another, since there is no RPC or any other synchronous communication among them. The event flow concept exists in the modeling and management layers. In the modeling layer, there should be a view of the entire "event processing network" to ensure that the orchestra plays together; in the management layer, there should be a possibility to trace back the provenance of a certain decision or action, or to trace forward the consequences of any event. However, this still does not require violating the decoupling in the execution layer -- that's the beauty of model driven architecture...
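A minimal sketch of this separation, assuming a hypothetical in-memory broker of my own invention (not any particular product): the agents below know only the channels they subscribe and publish to, never each other, while a separate flow model records which channels feed which agents, so that provenance and consequences can be traced without touching the execution path.

from collections import defaultdict

class Broker:
    """Execution layer: agents communicate only through named channels."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def publish(self, channel, event):
        for callback in self.subscribers[channel]:
            callback(event)

# Modeling/management layer: an explicit flow model kept outside the execution path.
FLOW_MODEL = {
    "raw_readings": ["threshold_agent"],   # which agents consume each channel
    "alerts": ["logging_agent"],
}

def trace_forward(channel):
    """Management-level question: which agents are affected by events on this channel?"""
    return FLOW_MODEL.get(channel, [])

if __name__ == "__main__":
    broker = Broker()

    # threshold_agent subscribes to raw readings and publishes alerts;
    # it knows nothing about who, if anybody, consumes the alerts.
    def threshold_agent(event):
        if event["value"] > 100:
            broker.publish("alerts", {"type": "high_reading", **event})

    def logging_agent(event):
        print("ALERT:", event)

    broker.subscribe("raw_readings", threshold_agent)
    broker.subscribe("alerts", logging_agent)

    broker.publish("raw_readings", {"sensor": "s1", "value": 120})
    print("downstream of raw_readings:", trace_forward("raw_readings"))

More - later.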

Saturday, February 7, 2009

On Classification of Event Processing Applications


The illustration above talks about classification in the animal universe; classification is one of the best ways to understand the universe. In our context, I started in the previous posting to discuss the types of functions that exist in generic event processing tools. Today I'll complete the picture by discussing classes of applications. This classification is not a partition: a certain application can have elements of multiple classes. It answers the question ---
what benefit does the customer expect to obtain from an event processing system?

The illustration below is an IBM classification of what "Business Event Processing" is; this is a slightly modified version of the results of a study we conducted within the IBM Academy of Technology that analyzed some use cases. The use cases working group of EPTS is now repeating this exercise, three years later, and probably with a somewhat broader perspective, so the end result may be different, but this will provide a sense of this type of classification:



Starting from the top and going anti-clockwise (I am left-handed...)

  • Business Activity Monitoring (BAM): Observation of a collection of activities to find exceptions and to monitor key performance indicators, in order to alert business stakeholders. This typically requires aggregations and predefined pattern matching.
  • Business logic derived events (sometimes called RTE - Real-Time Enterprise): detecting situations that require reaction (typically with some time constraints). The derivation of the situation may be either by predefined patterns (e.g. regulation enforcement) or by discovered patterns (fraud detection). Most applications use predefined patterns.
  • Predictive Processing: Processing future predicted events in order to eliminate or mitigate them.
  • Stream Analytics: Analysis of various streams (video, voice, data, etc.) to derive individual events (e.g. from a video stream) or trends - this includes "real-time business intelligence".
  • Business Service Management: Monitoring satisfaction of the Service Level Agreements (SLAs) of IT systems.
  • Active Diagnostics: Finding the root-cause problem by looking at a collection of symptoms.
  • Information Derived Events (also known as "information dissemination") -- personalized subscriptions that provide the right information, at the right granularity, to the right person, at the right time.
I'll dedicate (in the next few weeks) a separate posting to each of them with some examples, and reference back to functional and non-functional requirements.

Friday, February 6, 2009

On the first step on the way to an "event processing manifesto"


It was a very busy week and alas I had to neglect the blogging hobby. Now it is Friday night, I am watching a TV program with old Hebrew songs (my favorite), and decided it is a good time to blog a bit; however, our relatively new cat, who looks somewhat like this (this is not his picture, but of a similar cat I've found on the web), decided that I am a good place to rest on and did not want to move -- another creature who is trying to manage me... He is really a kitten that my daughters found and adopted, and as I have written before, giving names in our family is not an easy task, so he has several names and is known as "the cat". I call him Gilgamesh the terrible.

In 2007 we had the first Dagstuhl seminar on event processing, and the same set of organizers (Mani Chandy, Rainer von Ammon and myself) decided to apply again for a second Dagstuhl seminar in 2010. The proposal has passed the committee, with some clarifications that we need to provide about the scope. I'll let you know if and when it is finally approved.

The intention of this Dagstuhl seminar (which lasts 4.5 days) is to give a selected group of people an opportunity to meet in an isolated place for in-depth discussions. The proposed goal of this Dagstuhl seminar is to work on an "event processing manifesto". There have been several manifestos in different areas in the past, for example the OODB manifesto. Hopefully, by the time of the Dagstuhl seminar we'll have advanced work done by the various EPTS working groups that are being launched this month, and we'll be able to utilize their results in order to better define what "event processing" is -- note that I don't use "complex event processing", and I have explained the reasons before.





One of the questions asked is what the scope of "event processing" is, since working with events is quite a wide area -- starting from interrupt handling in operating systems, moving through graphical programming and more. Much of this is related to programming with events in conventional programming, and there are even books dealing with this area. However, our scope is more modest: generic tools for processing events in IT systems. This scope is about what is needed to build a generic tool, not ad-hoc programming hard-coded for every single application, and about IT systems, not operating systems, embedded systems, etc.

The illustration above is a first step in thinking about what an event processing system should include -- parts of it should be mandatory and some optional; however, from a functionality point of view there are:
  • Routing and filtering -- the most basic form of event processing.
  • Mediation -- transformation, enrichment, aggregation, split -- the next level of sophistication.
  • Pattern matching -- (I called it in the past "pattern detection") which may involve multiple events of multiple types.
At the bottom of the illustration there are two other entities:

Event processing platforms, which are enablers for scalability, distribution and other good qualities. Event processing platforms may have their own functions or host others (or both)...

Pattern discovery, which falls under the category of "intelligent event processing". It can be done off-line (typically this is the case) or on-line -- and then the pattern matching may be unified with the discovery.

In different types of applications we may need different subsets. For example, fraud detection requires pattern discovery; security-type detections (e.g. denial-of-service attacks or intrusion) may use on-line pattern detection. On the other hand, other applications don't require pattern discovery at all, for example compliance with regulations, where the regulations are given and cannot be discovered, or BAM systems in which the Key Performance Indicators are determined according to the corporate strategy and cannot be discovered. Furthermore, there are applications in which pattern matching is not required at all, and all processing is of the filtering, routing, enrichment and aggregation type.
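As a concrete (and entirely hypothetical) illustration of the last point, here is a small Python sketch of an application that uses only the first two functional levels -- filtering, routing, enrichment and aggregation -- with no pattern matching at all; the event fields and reference data are made up.

from collections import defaultdict

# Level 1: routing and filtering
def keep_purchases(event):
    # filter: pass through only purchase events
    return event["type"] == "purchase"

def route(event):
    # route by region to a named destination channel
    return "eu_channel" if event["region"] == "EU" else "other_channel"

# Level 2: mediation (enrichment and aggregation)
CUSTOMER_TIERS = {"c1": "gold", "c2": "silver"}   # hypothetical reference data

def enrich(event):
    # enrichment: attach the customer's tier from reference data
    return {**event, "tier": CUSTOMER_TIERS.get(event["customer"], "standard")}

def aggregate(events):
    # aggregation: total purchase amount per destination channel
    totals = defaultdict(float)
    for e in events:
        totals[route(e)] += e["amount"]
    return dict(totals)

if __name__ == "__main__":
    incoming = [
        {"type": "purchase", "customer": "c1", "region": "EU", "amount": 40.0},
        {"type": "page_view", "customer": "c2", "region": "US", "amount": 0.0},
        {"type": "purchase", "customer": "c2", "region": "US", "amount": 15.0},
    ]
    filtered = [enrich(e) for e in incoming if keep_purchases(e)]
    print(filtered)
    print(aggregate(filtered))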

And I'll finish with a footnote to David Luckham's recent article. David is trying to answer "criticism on the Blogosphere" about CEP being marketing hype and about the lack of value from the current set of products. First, I never thought that there is over-hype; on the contrary, relative to the potential of event processing there is under-hype. I am re-posting this illustration, taken from Brenda Michelson's panel presentation at the last EPTS annual symposium.


The hype is relatively low, and in contrast, the analysts' reports all indicate that the EP market grew by 50% or so in 2008, and IDC even claims that for a second year in a row it is the fastest growing middleware type. About the Blogosphere criticism, as I have already written before, much of it stems from different interpretations of the term "complex event processing"; for example, some of the postings of Tim Bass lead me to the conclusion that he believes in the equation: complex event processing = on-line pattern discovery. Again, setting aside the qualifier "complex", there is a large set of applications (probably most of the applications I know) of event processing that do not require stochastic reasoning at all.


I'll post continuation blogs about application types and the functions they require. It is very late - going to sleep.

Sunday, February 1, 2009

On Off-Line Event Processing



A comment made by Hans Gilde on one of my previous postings on this Blog prompted me to dedicate today's posting to off-line event processing. Well - as a person who is constantly off any line, I feel at home here...

Anyway -- some people may wonder and think that the title above is an oxymoron, since they put "real-time" as part of the definition of event processing. I have used this picture before as the one that best describes some of what is written about event processing - by everybody:



This, of course, illustrates a collection of blind people touching an elephant; each of them will describe the elephant quite differently, and the phenomenon of people saying "event processing is only X", where X is a subset of the area, is quite common. In our case X = "on-line".

The best thing here is to tell you about a concrete example of a customer's application I am somewhat familiar with. The customer is a pharmaceutical company which monitors its supplier-related activities. It looks at events related to those activities and checks them against its internal regulations. The amount of such events is several thousand per day, and from a business point of view there are no real-time requirements; the observation of any regulation violation, and the action taken, can happen the next day. The way this system works is to accumulate events during the day and activate the event processing system at the end of each day -- actually batch processing done off-line.

An interesting question is why this customer chose to use an event processing system, rather than the more traditional approach of putting everything in a database and using SQL queries. The answer is quite simple: this application has some interesting properties:
  • The number of regulations is relatively high (in the higher range of three digits);
  • Many of the regulation rules are indeed detections of temporally oriented patterns that involve multiple events;
  • Regulations are inserted or modified frequently.
Given all these, it turned out that the use of an event processing system off-line was the most cost-effective solution. While using SQL is nominally possible, writing these regulations in SQL is not easy, and the magnitude makes the investment in development and maintenance quite high.
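To give a flavor of what such an end-of-day batch run might look like, here is a minimal sketch (my own invention; the regulation, the field names and the 30-day window are made up, not the customer's actual rules) of a temporal pattern checked over one day's accumulated events.

from datetime import datetime, timedelta

# One hypothetical regulation (invented for illustration): every "invoice"
# event must be preceded by an "approval" event from the same supplier
# no more than 30 days earlier.
APPROVAL_WINDOW = timedelta(days=30)

def check_regulation(events):
    """End-of-day batch pass over the day's accumulated events."""
    violations = []
    approvals = {}                      # supplier -> time of latest approval seen
    for e in sorted(events, key=lambda e: e["time"]):
        if e["type"] == "approval":
            approvals[e["supplier"]] = e["time"]
        elif e["type"] == "invoice":
            approved_at = approvals.get(e["supplier"])
            if approved_at is None or e["time"] - approved_at > APPROVAL_WINDOW:
                violations.append(e)    # temporal pattern not satisfied
    return violations

if __name__ == "__main__":
    day_events = [
        {"type": "approval", "supplier": "s1", "time": datetime(2009, 1, 2)},
        {"type": "invoice",  "supplier": "s1", "time": datetime(2009, 1, 20)},
        {"type": "invoice",  "supplier": "s2", "time": datetime(2009, 1, 25)},
    ]
    for v in check_regulation(day_events):
        print("violation:", v)          # s2's invoice had no prior approval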

So - the benefit of using event processing here is neither the real-time aspect nor high throughput support, but simple TCO considerations.

This is not the only application of this type; in fact, I have seen several other cases in which event processing has been used off-line. There is also another branch of off-line processing which combines on-line and off-line together, but I'll write about it in another posting...

More - Later.

Saturday, January 31, 2009

On Decision Agents


Decisions are part of enterprise life, as well as the life of every individual. Take a decision that many readers have experienced: naming children. My wife and I realized at a very early phase that we have completely different tastes in names, so we decided to agree on a protocol for how names are selected: taking turns -- one of us makes a list of five names, and the other selects a name from this list. The selection is done only after the birth, and the list can be modified until the selection is made. To be fair we need an even number of children, so that each of us plays each role an equal number of times (indeed we have four children, which satisfies this requirement). This actually works: neither of us got first priority, and neither of us has to tolerate a name we really hate.

In the paragraph above I described a manual decision, but increasingly decisions are made or recommended by computer software. James Taylor is constantly talking about "decision agents". I thought it would be interesting to look at this notion and discuss how it relates to event processing.

We can say that a "decision agent" has various properties that we can qualify as answer to questions:


  • Why? What is the reason that the decision agent is activated? In the example above -- the birth of a child.
  • Which? Which information is needed in order to make this decision? In the example above -- the list of five names (which has been obtained by another decision or collection of decisions).
  • How? How is the decision made? In the example above -- by some human cognitive process, which may have a reflection in a computerized decision agent.
Back in our world of event processing, we use the term "event processing agent" to denote a software artifact that gets one or more events as input, does something, and produces one or more events as output.

The question is: what is the relationship between a decision agent and an event processing agent?


In a (not very recent) posting, Carole-Ann Matignon from Fair Isaac attempted to demystify some terms. She used an analogy to the human body, saying that BRMS is the brain, while event processing is the sensor that feeds the brain so it can make decisions. So is an event processing agent a sensing agent?

The answer is --- the terms decision agent and event processing agent do intersect, but neither of them subsumes the other.

Returning to the decision agent questions.

The why question:
A decision may be required because some situation has occurred, because some relevant fact has changed, or because somebody made an explicit request to activate the decision agent. In the first case (a situation has occurred), this situation may be a simple event, but it may also be a leaf of an event processing network. In that case, there may be some event processing agents that are part of the decision to activate the agent, so the event processing agent is part of the brain.

The which question: Here again the information needed can vary -- it may relate to the present state, or to the history of states and transitions. Event processing agents can be used to prepare the required information by taking events and filtering, transforming, enriching, aggregating, splitting them, and more. In this case the event processing agent is indeed a sensor, creating input for the decision.

The how question: There are various techniques for reaching a decision; detecting patterns on the event history may be one of them, alongside other techniques such as inferring from facts and rules, applying stochastic decision reasoning, and more. This does not mean that every event processing agent which performs "pattern detection" is a decision agent; sometimes it just derives an event that will be used, directly or indirectly, as input to a decision.
Interestingly, this is true for business rules as well. A business rule may derive a new fact, and the fact itself is not a decision; for example, it may be a classification of a customer based on demographic information. The output of this rule is used as input to (another) decision agent. So BRMS, like event processing, can also play the role of both sensor and brain.

To summarize:

  • An Event Processing Agent may be a Decision Agent, or a provider of input or a trigger to other decision agents (see the sketch below).
  • The same statement is also true for "state processing" business rules.
  • A Decision Agent may be an Event Processing Agent, but it can also consist of several other types of agents.
  • There may be a blend of inter-related decision agents of various types. I'll write more in the future about this assertion.
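As a rough illustration of the first bullet (my own sketch; the "large withdrawal" situation and the card-blocking policy are invented), here is an event processing agent that only derives a situation event, and a separate decision agent that consumes that event, together with other facts, in order to reach an actual decision.

class LargeWithdrawalEPA:
    """Event processing agent: derives a 'large_withdrawal' situation event.
    It makes no decision; it only reports that the situation occurred."""
    def __init__(self, threshold):
        self.threshold = threshold

    def process(self, event):
        if event["type"] == "withdrawal" and event["amount"] > self.threshold:
            return {"type": "large_withdrawal", **event}
        return None

class BlockCardDecisionAgent:
    """Decision agent: combines the derived event (why) with customer facts
    (which) and a simple policy (how) to reach an actual decision."""
    def __init__(self, customer_profiles):
        self.customer_profiles = customer_profiles

    def decide(self, situation):
        profile = self.customer_profiles.get(situation["customer"], {})
        if situation["type"] == "large_withdrawal" and not profile.get("travel_notice"):
            return "block_card"
        return "no_action"

if __name__ == "__main__":
    epa = LargeWithdrawalEPA(threshold=5_000)
    decider = BlockCardDecisionAgent({"c1": {"travel_notice": False}})
    incoming = {"type": "withdrawal", "customer": "c1", "amount": 7_500}
    situation = epa.process(incoming)        # sensing: derive the situation event
    if situation:
        print(decider.decide(situation))      # deciding: -> 'block_card'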







Thursday, January 29, 2009

On state processing and event processing


Yesterday I was visited by my (now ex-) Master's student Elad Margalit, about whose thesis on dynamic setting of traffic light policies I have written before. For some strange reason he decided that I deserve a gift for his graduation, so he brought me a "flip clock" that looks like this. Strangely enough, it flips the labels to show the correct time; all the people who somehow got to my office yesterday thought it is a cool gadget, and it is now located in front of my eyes.

Today's topic is an echo of the discussion started by my friend and ex-IBM colleague Claudi, AKA patternstorm, on the forum of the complexevents site. Claudi has defined a state as a sequence of events, while several others answered that this is not really the definition.

Before getting to the definition, there was also a very concrete motivation that Claudi mentioned -- if we equate state with a "sequence of transitions", then state processing becomes a kind of event processing. I think that it is important to discuss this statement.

While a state is not exactly a sequence of transitions, it is true that the value represented by the state can be reconstructed if we apply the series of transitions to an initial state, and considering that the initial state is null and the first transition creates it, we can obtain all the information as part of the series of transitions.

Let's take a simple example. The state represents the balance of my bank checking account. The transitions start from the one that opens the account, going through deposits, withdrawals, the bank's commissions, etc. I opened my current checking account in 1984. Suppose that I would like to process this state, such as getting an alert every time my account balance becomes negative (unlike in the USA, in Israel overdraft is a common practice). I can make it an event processing activity by taking all transitions since 1984 and reconstructing the state with each new transition; however, this is not an efficient way to do it: first, I would need to keep all the historical transitions forever; second, it is much more cost-effective to maintain the balance as an entity and process it.

State processing and event processing are complementary: in state processing we process the snapshot of the present time, while in event processing we process the history of transitions. If I want to get an alert on overdraft -- this is state processing. If a compliance officer looking for a money laundering suspect is checking whether three deposits of more than $10,000 each were made to my account within a single week, he is doing event processing. In reality we need both, but each of them has different techniques for its cost-effective processing.
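A small sketch of the two styles side by side, using the bank account example above (Python; my own illustration, the account data is made up): the state processor keeps only the current balance, while the event processor keeps a sliding week of the transition history in order to detect the three-large-deposits pattern.

from datetime import datetime, timedelta

class BalanceStateProcessor:
    """State processing: maintain the current snapshot (the balance) only."""
    def __init__(self, opening_balance=0.0):
        self.balance = opening_balance

    def apply(self, transition):
        self.balance += transition["amount"]     # deposits positive, withdrawals negative
        if self.balance < 0:
            print("ALERT: overdraft, balance =", self.balance)

class LargeDepositPatternProcessor:
    """Event processing: detect three deposits > $10,000 within one week."""
    def __init__(self):
        self.recent = []                          # history of large deposits (one-week window)

    def apply(self, transition):
        if transition["amount"] > 10_000:
            self.recent.append(transition["time"])
            cutoff = transition["time"] - timedelta(days=7)
            self.recent = [t for t in self.recent if t >= cutoff]
            if len(self.recent) >= 3:
                print("ALERT: possible money laundering pattern")

if __name__ == "__main__":
    state_proc = BalanceStateProcessor(opening_balance=500.0)
    pattern_proc = LargeDepositPatternProcessor()
    transitions = [
        {"amount": 12_000, "time": datetime(2009, 1, 1)},
        {"amount": -13_000, "time": datetime(2009, 1, 2)},   # triggers the overdraft alert
        {"amount": 11_000, "time": datetime(2009, 1, 3)},
        {"amount": 15_000, "time": datetime(2009, 1, 5)},    # triggers the pattern alert
    ]
    for t in transitions:
        state_proc.apply(t)
        pattern_proc.apply(t)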

More on this topic -- later.

Wednesday, January 28, 2009

On 20000 visitors in the Blog

The number 20,000 typically reminds me of the famous book; you can see a poster taken from the related movie. Today, however, it means something else -- the 20,000th visitor has visited this Blog. Many of them just arrive here somehow while scanning cyberspace; more interestingly, around 1,500 visitors are quite frequent repeat visitors, and a similar number visit from time to time (but in total have visited at least 15 times). 10 percent of the visits were direct, and of the rest, most were referrals from various types of Google options, plus some referring sites: Complexevents (and the forum), Tim Bass's Blog, TIBCO's Blog, RuleCore's Blog and Apama's Blog.
More statistics: the most popular posting, by far, is "On Unicorn, Professor and Infant", written in June 2008 and still fresh. The next one is "On Agnon, the dog, playing and downplaying"; soon I'll write a follow-up to this one. The third one talks about event stream processing, quite an old one. The next one, like the current posting, is gossip about the Blog itself; the last time I wrote about this Blog, almost a year ago, it had 3,000 visitors.
In terms of geographical distribution -- most of the readers are still from the USA, followed by the UK, Israel, Canada, Germany, France, Japan, India and Australia. The number of countries is now 135 - some of the new ones are Reunion, Guam, Swaziland and Namibia.
As far as cities go -- London keeps first place, followed by Haifa (my home town) and New York.
That's all for today -- a professional posting will follow tomorrow.

Tuesday, January 27, 2009

More on Event Pattern Detection and Discovery


One cannot ignore these days the change of president in the USA, something which affects the entire universe. One minority that Mr. Obama belongs to is the minority of left-handed people, as can be clearly seen in the picture. While four of the last USA presidents were left-handed (which makes him the fifth in the last seven presidents), conference rooms in the USA and university classes all have desks only for the right-handed majority. Here is a picture of a left-handed desk.
I am sure that the USA president has much more urgent items on his agenda; however, unlike his predecessors, he may also do something for the deprived minority of left-handed people. BTW - the situation in Israel (whose current prime minister is, surprise... a left-handed person) is somewhat better. We left-handed people may be a small minority (around 10% of the population), but our collective impact on humanity is disproportionately huge -- starting from Alexander the Great, Julius Caesar, Napoleon, Queen Victoria, Lewis Carroll, Mark Twain, Escher, Michelangelo, Leonardo da Vinci and many more... well, enough of that for now.

Going back to one of my previous posts that explained the difference between event pattern detection and event pattern discovery: in the wake of some questions, here is more about the relationship between these two terms:

  • Event pattern detection is performed for patterns that are known in advance; the detection is done "on-line" as the events occur.
  • Event pattern discovery is typically performed off-line; in some cases it can use machine learning techniques on past events, and in other cases it can use natural language understanding techniques to derive patterns from legal documents (e.g. regulations).
  • Pattern discovery creates patterns that are then detected on-line by pattern detection, so the two are complementary techniques.
  • In some cases there is continuous discovery, and the patterns are therefore updated in a dynamic way; however, the discovery still feeds the detection part on-line, and the respective roles are preserved.
  • Last but not least, the discovery process may use simulation techniques, applying detection to simulated events in order to check assumptions about patterns.
Typically, event processing products contain event pattern detection capabilities in one form or another. Event pattern discovery is considered an add-on, typically using techniques that are not particular to event processing.
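To make this division of labor concrete, here is a small sketch (entirely my own; the "discovery" step is just a frequency count standing in for a real machine-learning technique) in which an off-line discovery pass over historical events produces pattern specifications, and an on-line detector then applies them to new events as they arrive -- the complementary roles described above.

from collections import Counter

def discover_pattern(history, min_support=2):
    """Off-line 'discovery' (a stand-in for real machine learning):
    find pairs of event types that frequently occur consecutively."""
    types = [e["type"] for e in history]
    pairs = Counter(zip(types, types[1:]))
    return [pair for pair, count in pairs.items() if count >= min_support]

class OnlineDetector:
    """On-line detection: apply the discovered pattern specs event by event."""
    def __init__(self, patterns):
        self.patterns = set(patterns)
        self.previous = None

    def on_event(self, event):
        pair = (self.previous, event["type"])
        self.previous = event["type"]
        return pair if pair in self.patterns else None

if __name__ == "__main__":
    past = [{"type": t} for t in
            ["login", "download", "login", "download", "logout"]]
    patterns = discover_pattern(past)            # e.g. [('login', 'download')]
    detector = OnlineDetector(patterns)
    for e in [{"type": "login"}, {"type": "download"}]:
        matched = detector.on_event(e)
        if matched:
            print("pattern detected:", matched)   # -> ('login', 'download')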