Monday, December 31, 2007

Has BAM failed?

Last posting of the year 2007 - we don't celebrate the new year as a holiday, but since the civilized world uses the Gregorian calendar and not the Hebrew one, we tend to celebrate holidays according to the Hebrew calendar and do everything else by the Gregorian calendar.
Anyway - this posting is another in the series of responses to Colin White's article. This time I'll refer to one sentence in that article: "This is somewhat analogous to the way business activity monitoring (BAM) solutions were developed independently from the existing business intelligence environment. BAM failed, and the solution was to instead build embedded BI applications that supported the concept of BAM, but which also were integrated with traditional BI processing." The claim is simple:
  • The sin - BAM solutions were developed outside the true religion of BI;
  • The punishment - BAM failed;
  • The redemption - Embedded BI applications have included the concept of BAM -- and lived happily ever after.

In order to analyze this claim, let's first define what "BAM solutions" are. BAM - Business Activity Monitoring (or Management) - stands for a class of applications that observe the behavior of business activities (business processes, transactions, applications etc.). In not-politically-correct terminology (well, we are lagging behind the USA - we have not invented political correctness yet) I can call it "Big Brother is watching"; a term that is sometimes used is "observation". The essence is: look at business activity from the outside, and try to find situations that are of interest. These can be major situations - threats and opportunities - but most of them are micro issues that need to be observed, maybe fixed, maybe just watched closely.

Now for the question of failure or success. Bill Gassman from Gartner claims in his talks that 65% of BAM solutions are industry-specific, and general-purpose software accounts for only 35%. Indeed, while the notion of observation is general, what we would like to observe, and how, is very much domain-dependent. Performance management that traces Key Performance Indicators is one flavor that has been generalized, but many BAM solutions are not about tracing aggregates or measurements; they are about tracing individual situations in the business activities: a delivery has failed to meet its deadline, a customer complaint has not been handled well, a medical treatment has deviated from the protocol for that disease - there are examples from all industries. Event processing is a useful technology in cases where the BAM solution is event-driven and its input is events. Some BAM situations are detected on-line, some are detected in retrospect - I'll talk about retrospective event processing in another posting. General BAM systems are useful for some cases and less useful for others.
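To make the "individual situations" flavor concrete, here is a minimal sketch, not taken from any BAM product: the event shapes, the event-type names and the 48-hour deadline are all illustrative assumptions. It detects deliveries that failed to meet their deadline from a batch of raw events:

```python
from datetime import datetime, timedelta

# Illustrative SLA: an order must be delivered within 48 hours of placement.
SLA = timedelta(hours=48)

def detect_missed_deliveries(events):
    """Return order ids whose delivery exceeded the SLA, or never happened.

    Each event is a dict with keys "type", "order_id" and "time" (assumed shapes).
    """
    placed, delivered = {}, {}
    for e in events:
        if e["type"] == "order_placed":
            placed[e["order_id"]] = e["time"]
        elif e["type"] == "delivered":
            delivered[e["order_id"]] = e["time"]
    missed = []
    for oid, t_placed in placed.items():
        t_delivered = delivered.get(oid)
        # A "situation of interest": no delivery yet, or delivery after the SLA.
        if t_delivered is None or t_delivered - t_placed > SLA:
            missed.append(oid)
    return missed
```

The same detection could of course run on-line, one event at a time; running it over a stored batch is the retrospective flavor mentioned above.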

So - my two cents on Colin White's claims:

  • The concept of BAM has not failed - there are many success stories, although the area has not reached maturity. General-purpose BAM products are effective for part of the BAM market, but many BAM solutions are industry-specific -- event processing is useful both for the general-purpose and for the industry-specific BAM market.
  • And as for BAM on top of BI -- according to Gassman this accounts for 10% of the BAM solutions market, so while it certainly has its own variety of applications, the majority of the BAM market is not a BI market.

P.S. - the picture above is of a skateboard called BAM -- I hope it sells well; click for a free advertisement for skateboards.

More - Later.

Sunday, December 30, 2007

Event Processing for Business Intelligence

From the various descriptions of Business Intelligence found on the Web, I have chosen the one by Hakki Erbug as a starting point.

As noted in several previous postings, IMHO event processing is a set of technologies that have multiple usages and are not really strongly associated with a single type of application. Business Intelligence is similar in that it combines various technologies, but different in that it is focused on using data for decision making in various ways. In the event processing case, decision making is one of the possible usages, but not the only one. This posting will briefly survey how event processing can enrich business intelligence; in fact, at the Gartner EPS summit I saw several BI vendors that are looking at EP as a natural capability to complement their products.

Going over Erbug's illustration clockwise:

1. "Active Data Warehouse" - While traditional warehouses are updated in batch, the notion of an active data warehouse makes the warehouse update itself an event-driven action. The rationale is: when a certain event (raw or derived) occurs, a decision has to be taken; however, the decision relies on a data warehouse, thus an update of the data warehouse should occur before making this decision. The update can be of the same event that happened, or of some collection of data that has not yet been updated in the data warehouse and is needed for the decision making. There can be time constraints associated with the decision (and in turn with the warehouse updates). The time constraints are not necessarily microseconds - they can be minutes or hours - but they are typically well-defined.
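The update-then-decide rationale can be sketched in a few lines. This is a toy illustration, not any vendor's design: the warehouse table, the event shape and the decision rule (flagging customers whose total spend exceeds 1000) are all assumptions made up for the example.

```python
class ActiveWarehouse:
    """Toy "active data warehouse": the update is an event-driven action."""

    def __init__(self):
        self.customer_spend = {}  # warehouse table: customer -> total spend

    def on_event(self, event):
        # 1. Event-driven update of the warehouse, before any decision.
        cust = event["customer"]
        self.customer_spend[cust] = self.customer_spend.get(cust, 0) + event["amount"]
        # 2. The decision relies on the freshly updated warehouse state.
        return self.decide(cust)

    def decide(self, cust):
        # Illustrative decision rule: flag customers whose total exceeds 1000.
        return "review" if self.customer_spend[cust] > 1000 else "ok"
```

The point of the sketch is only the ordering: the warehouse update happens inside the event-driven path, so the decision never reads stale data.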

2. ETL and mediated event processing: ETL has some functional similarity to mediated event processing - it is also about transformation. We see mediated event processing and ETL getting closer to one another, where the difference may be in quality of service. In the future ETL may become a specific case of mediated event processing (of course, ETL folks may say the same from the opposite direction).

3. Real-time analytics: While analytics (simulation, optimization, mining...) has been used for a while in the decision-making part of BI, in the event-driven world the reaction to an event is in some cases temporally bound, which means that there is a real-time constraint, an upper limit on the requested reaction time. This provides a new way of thinking about analytics: without time constraints, an optimization strives to get the "best result" (or, with heuristics, a result satisfying some approximation condition); in real-time analytics the optimization strives to get "the best result that can be obtained in T time-units" (e.g. 18 seconds). How does event processing play in real-time analytics? It may play in a simulation mode: scenarios are created and simulated events are emitted; these in turn may create simulated derived events, which determine the situations of the simulation. This is, of course, in addition to the fact that in an event-driven universe the entire BI cycle is event-driven and relates to the event and its context.
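The "best result in T time-units" idea can be sketched as an anytime search that keeps the best candidate seen so far and stops at the deadline. The objective function and candidate source here are illustrative assumptions, not a real optimizer:

```python
import time

def best_within(objective, candidates, budget_seconds):
    """Return the best candidate found before the deadline (anytime search).

    objective: scoring function to maximize.
    candidates: zero-argument callable returning an iterable of candidates.
    budget_seconds: the real-time bound T.
    """
    deadline = time.monotonic() + budget_seconds
    best, best_score = None, float("-inf")
    for c in candidates():
        score = objective(c)
        if score > best_score:
            best, best_score = c, score  # keep the best answer so far
        if time.monotonic() >= deadline:
            break                        # hard real-time bound reached
    return best, best_score
```

With a random candidate generator this becomes a genuine anytime optimizer: the longer the budget, the better the result, but an answer is always available at the deadline.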

To conclude -- event processing is a natural step in the evolution of BI capabilities, and thus I expect BI suites to support the event-driven flavor... This, again, does not say that BI is the ONLY use of event processing. I still need to address the issue of "has BAM failed because it was not based on BI techniques", as claimed in the article that triggered my discussion of the BI topic - stay tuned.

Thursday, December 27, 2007

On Business Intelligence and Event Processing

This is a quiet time. Although we do not have a holiday period in Israel, for those of us, like myself, whose work largely requires interaction with the rest of the universe, it is a quiet time since my colleagues are away - far fewer emails, no conference calls. Today I spent the day in Rehovot, known for the famous Weizmann Institute (a famous building from the institute is in the picture above), but this time I visited an IBM site not far from the institute. IBM has acquired several Israeli companies in recent years, and they are now fused into a single lab, ILSL - the Israel Software Lab (well - ISL was already taken within IBM by the India Software Lab).

Anyway, coming back, I saw that the blog world of event processing is also alive these days, but this time I would like to comment on "operational analytics: yesterday, today and tomorrow" by Colin White.
In this article Mr. White makes some interesting assertions (at least, in my own interpretation):

(1). CEP is a buzzword which stands for a kind of operational analytics that should be used in extreme cases.
(2). BAM has failed since it was developed independently of BI.
(3). A hint that the fate of CEP (or ESP - it is not clear from his article what the difference is) will be the same if it does not become part of BI.

While I am sure that Mr. White is a big expert on BI, it seems that he also falls into the "hammer and nail" trap that has been discussed by myself and several other bloggers in this area. So here are some preliminary responses to his assertions:

(1). CEP is a technology; it has roots in multiple disciplines, and some of them are in BI, but there is a distance between that and the assertion that it is part of BI. CEP has uses that may not even be connected to BI (e.g. network management diagnostics or situational awareness). Here again we get back to the issue of motivation for using CEP: the consistent view of database people is that the only reason to use CEP is extreme latency/throughput, and that otherwise one can use the right religion of SQL databases. As has been discussed in the EP blogs, there are multiple reasons for using CEP, and high throughput / low latency is one, but not necessarily even the dominant one.

(2). As for "BAM has failed" -- is that a fact or wishful thinking? At the Gartner BAM and EP summit we heard some success stories of BAM and saw some prospering BAM products. While there are synergies and relationships between BAM and BI, I wonder what success/failure criteria were used to derive this assertion.

(3). I have already stated my opinion about the question of whether event processing is a footnote to databases.
While I spent many years as part of the database community, I am of the opinion that event processing is a discipline in its own right, with some interaction and intersection with the database area, as well as other areas (business rules, distributed computing, middleware, software agents).

This is a preliminary reaction - mostly some comments on the article - which does not free me from writing a "positive" article about the relationships between event processing and business intelligence; I'll do so in one of the next postings - more later.

Wednesday, December 26, 2007

EPTS plans for 2008

This picture was taken at the EPTS meeting in Orlando in September 2007; the meeting itself has been described by Paul Vincent in his blog: first day, second and third day.
As written before, EPTS is a consortium (under construction) intended to promote the understanding of, and incubate standards in, the event processing area. Getting into 2008, here is the work plan that was recently approved by the steering committee:
Quarter 1: Launch - as expected when some of the players are big companies, the formalization process takes some time, but it will hopefully converge soon. After getting the approval of the steering committee, the idea is to invite vendors, customers, academic people and other individuals to join the list of founding members. The launch will also include the introduction of the EPTS website (a prototype already exists - thanks to Brian Connell from WestGlobal and Serge Mankovskii from CA), which will support the work of the workgroups and other EPTS-related information (leaving news, blog pointers and the general discussion forum to David Luckham's site). At the launch we'll also sign and seal the first version of the consensus glossary edited by David Luckham and Roy Schulte, with many contributions provided so far (a draft of this glossary is available).
Quarter 2: A "state of the practice" white paper based on a collection of use cases. This is a workgroup launched in the last meeting, with a diversified team of volunteers. The first phase, carried out by Tao Lin from SAP, Dieter Gawlick from Oracle and Pedro Bizzaro from the University of Coimbra in Portugal, is to define a template to describe and compare use cases, where the use cases are those presented in the three EPTS meetings and in the Dagstuhl seminar on event processing, and maybe others. After the glossary, this will be the next community effort towards understanding the state of the practice (there was also some work last year on a "reference architecture", led by Tim Bass, that we'll return to after completing the use cases survey).
Quarter 3: In this quarter we'll hold the fourth EPTS meeting (place and time are not yet determined) and participate in the DEBS conference, which we support in becoming the research "flagship" of the event processing community. We'll also launch a portal of teaching material on event processing for the next school year.
Quarter 4: Our first contribution to standards -- early in the year we'll launch a workgroup to provide the EPTS input for the OMG RFP on meta-modelling.
Many people are contributing their time and energy to these community activities, and I am looking forward to the momentum that will be created after the EPTS formal launch... stay tuned, or even better - join. Instructions about joining -- soon.

Saturday, December 22, 2007

On the envelope for CEP

In two recent blogs, Mark Tsimelzon from Coral8 argued for a CEP server vs. embedded CEP libraries, while Paul Vincent from TIBCO argued for the need for an infrastructure stack outside the CEP engine itself.
Since I avoid making product evaluations in this Blog, I'll talk about the principle.
Conceptually we have a model of an "event producer" that produces the events, an "event consumer" that consumes the events, and the "event processor" which processes the events and stands in the middle. There are two questions about envelopes:
(1). Is the "event processor" an embedded capability inside applications, or a server?
(2). If the "event processor" is a server, is it an independent server or part of a larger middleware?
The answer, as usual, is not binary (zero or one): there are cases in which there is a need to have event processing as an embedded capability (e.g. inside pervasive devices), but Mark is right that the big majority of "event processors" tend towards the server.
The second question is more interesting --- today we have both stand-alone servers and servers that are part of a larger middleware. The rationale behind being part of a larger middleware stems from the fact that event processing is not isolated; it has various relations with different applications in the enterprise. It is true that the loose-coupling nature of event-driven architectures eases the task of separating it from the applications, but integration is still the most costly part of building event processing applications, and the means to ease integration are already built into application integration middleware; if the event processing server is a stand-alone one, there is a need to re-invent this integration. As Paul Vincent rightly says: every $ a CEP vendor spends on middleware integration is a $ less on interesting CEP functionality.
Furthermore, some event processing infrastructure and functions (pub/sub and routing for simple event processing, and ESB mediations like enrichment, transformation etc.) are already there. Thus, it seems that the ROI will be higher if event processors are implemented on top of a "middleware stack".
An interesting observation is that from the point of view of application integration middleware, event processing is becoming a key feature, and there are already some predictions that the standard event processing programming model (which is still not there!) will be the basis for the application servers of the future, e.g. Gartner's XTP, which I have once discussed and should discuss more.

Thursday, December 20, 2007

On - "one size fits all" and Event Processing

Like a commercial TV station, if a blog wants to get "ratings" one has to post something somewhat controversial - the number of visitors to this blog has more than doubled in the last few days, when I had exchanges of opinions and folk stories with Tim Bass; anyway, I got tired and did not continue that discussion. One related question that I received was: does the fact that I don't think it is worth talking about ESP and CEP as separate entities mean that I believe there is a "one size fits all" in event processing? Well, this is a fair question. In the past I did believe it was true, until I read Mike Stonebraker's immortal assertion: "One size fits all is a concept whose time has come and gone". Actually, I ceased to believe in it a little earlier. I think that the event processing area is not a monolithic area, and some variations are needed - however:
  • I don't believe that ESP vs. CEP is the right type of partition in this area;
  • There may be a need to have various implementations under one roof (the heterogeneous framework approach).

For the first point -- what is the right type of partition? This is a multi-dimensional question, and we still have to learn more to know the most useful combinations.

One of the important dimensions is the "reason for use" dimension; in an internal IBM study we identified five different reasons for use - I'll write about it in one of the next postings.

EPTS has recently launched a workgroup that tries to identify these classifications by doing a comprehensive survey of use cases that will be compared using the same template. A team consisting of Tao Lin (SAP), Dieter Gawlick (Oracle) and Pedro Bizzaro (University of Coimbra, Portugal) is working on this template, and a larger team will handle the survey and analysis. The end result - a collaborative white paper about the state of the practice in event processing - is expected sometime in the second quarter of 2008. Stay tuned.

More - Later.

Wednesday, December 19, 2007

On deleted event, revised event and converse event

First, congratulations to my student Ayelet Biger, who successfully passed her M.Sc. thesis defense exam today. Ayelet's thesis topic is Complex Event Processing Scalability by Partition, which deals with parallel execution of CEP logic when there are complex dependencies among the different agents. I'll discuss this issue in one of the later postings - we still need to compose a paper on this thesis for one of the coming conferences. Ayelet is my 17th M.Sc. student to graduate (together with 5 Ph.D. students, this makes it the 22nd thesis exam). Most of the students have done theses on active databases and temporal databases (my past interest areas) and, in the last few years, on event processing. Supervising graduate students is a great way to work on new ideas that I don't have the ability to work on in my regular work; the only thing needed is three more hours in each day...

Today's topic is inspired by a blog posting that I recently read by Marco Seiriö. Marco is one of the pioneers in EP blogging; I started reading his blog in January 2006, when he started it as "Blog on ESP", but at some point it became "Marco writes about complex event processing" - another piece of evidence that the name ESP has disappeared. Anyway, in his blog Marco talks about an event model. I'll not discuss the event model today, but concentrate on one interesting point that Marco raises about "undoing events". This is indeed a pragmatic issue with some semantic difficulties. There are systems in which events can be deleted, and some actions can be triggered by the event deletion. However, an event is not regular data and cannot be treated as such: since an event represents something that happened in reality, events are conceptually "append only" - in database terms, one can only insert events, not modify or delete them. Deleting events also blocks the ability to trace decisions/actions or to process the events retrospectively. So when, in reality, do we need to delete/undo/revise events?

  1. when an event is entered by mistake - typically not the event itself but some details in the event attributes are wrong - we need the ability to revise the event.
  2. when we wish an event to no longer affect the processing.
  3. when the event itself has expired, or we'll not need it anymore and don't need to use it in any other processing - including retrospective processing.

The first case is a revision case. If we are in "append only" mode, the way to do it is to enter another event, with the possibility that it will override an existing event (or set of events) for the purpose of processing. Example: somebody sent a bid to an electronic auction and realized that one of the details (say, the price he is ready to pay) is wrong; he can then add another bid that overrides the first bid. Why not delete the original bid? The original bid may already be in process, and the override cannot stop that process; even if not, for retrospective processing we may need to reconstruct a past state which includes the original bid. (These considerations are actually not new; we discussed them thoroughly within the temporal database community a decade ago, when we - Sushil Jajodia, Sury Sripada and myself - edited a book about temporal databases research and practice.)
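The append-only revision idea can be sketched as follows; the event shapes are illustrative assumptions. The original bid stays in the log forever, a later bid merely overrides it for processing purposes, and past states remain reconstructible:

```python
class EventLog:
    """Append-only event log with revision-by-override (illustrative sketch)."""

    def __init__(self):
        self.log = []  # append-only: events are never modified or deleted

    def append(self, event):
        self.log.append(event)

    def effective_bid(self, bidder):
        """The bid that counts for processing: the bidder's latest one."""
        current = None
        for e in self.log:
            if e["type"] == "bid" and e["bidder"] == bidder:
                current = e  # a later bid overrides earlier ones
        return current

    def as_of(self, n):
        """Reconstruct the log as it was after the first n events (retrospective)."""
        return self.log[:n]
```

Because nothing is ever removed, `as_of` can always recover the state in which the original (mistaken) bid was still in force.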

The second case is even more interesting, but requires a similar type of thinking. Here we would like to prevent an event from taking effect, which can be done by sending a "converse event" that reverses the effect of the original event - e.g. cancel bid. The implementation problem is that the original event, and maybe its descendant events, may have been flowing all over the event processing network: some may have already left the EPN with actions triggered, some are in process, and some are part of a state but have not been processed yet (e.g. because a pattern has not been detected yet). Theoretically one could apply something similar to a "truth maintenance system" in AI, which also tracks the actions and compensates for all of them, but this complicates the system, so it is recommended only when it is critical (I'll discuss such cases in another posting). When the event has not yet left the EPN, it is still possible to stop it, but most systems do not provide a language primitive to do so globally in an EPN; recently I watched a concrete customer case where they had to do it manually.
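The easy part of this case - neutralizing an event that is still held in an agent's state, before it leaves the EPN - can be sketched as below. All names are illustrative assumptions; events that have already triggered actions would need explicit compensation, which this sketch deliberately does not attempt:

```python
def apply_converse(pending, converse):
    """Remove from an agent's pending state the event a converse event targets.

    pending: list of event dicts still held in state (not yet acted upon).
    converse: event dict with a "cancels" key naming the target event's id.
    Returns (new_pending, neutralized_flag).
    """
    target = converse["cancels"]
    still_pending = [e for e in pending if e["id"] != target]
    # If nothing was removed, the target already left the EPN (or never existed)
    # and would require compensation of its triggered actions instead.
    neutralized = len(still_pending) != len(pending)
    return still_pending, neutralized
```

The returned flag is exactly the distinction discussed above: a converse event is cheap while the target is still in state, and expensive (compensation, truth-maintenance style) once it is not.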

The third case is the "vacuuming" case - when an event is no longer needed (in agents' state, in the global state etc.). I never got deep into this issue, but thought intuitively that it is a relatively easy problem; however, when it was discussed in the Dagstuhl seminar last year, the claim was that the general issue of event vacuuming is still an open question.

I'll stop here now -- I've spent enough time on this one... more - later

Monday, December 17, 2007

CEP and the story of the captured traveller

Reading the recent posting of my friend Tim Bass entitled "CEP and the story of the Fish", I decided to answer with another story (from the other side of Asia):

A traveller went into the jungle somewhere on the globe and was unfortunately captured by a tribe that still uses ancient weapons. He is brought to the chief, and the chief says: "You have trespassed into the tribe's territory, which is punishable by death; however, I am a very curious person - if you show me something I haven't seen before, I'll let you go." Our unlucky traveller searched his pockets, and the only meaningful thing he found was a lighter, so he took his chance, showing it to the chief and saying: "this thing makes fire". However, since he was under great pressure, he pressed once - no fire; pressed twice - no fire; on the third try the lighter indeed produced the promised fire. The chief did not hesitate and said "let him go", so our relieved traveller muttered to himself, "I knew they had not seen a lighter" - but to his surprise the chief said: "oh, I have seen many lighters, but a Zippo lighter that does not light on the first try I have never seen".

When someone disagrees with somebody else, it is very easy to assume that my point of view is right since I am smarter / know more / am more qualified / older / more experienced / generally always right etc... My preference is not to doubt the wisdom, experience or qualifications of anybody I am arguing / discussing / debating with, but to make the arguments about the issue and not about the person who makes them...

Enough introduction -- now for the main message of this posting. The term CEP (Complex Event Processing) is now more or less agreed in the industry to denote "computing that performs operations on complex events", where a complex event is an "abstraction or aggregation of events". The term complex does not say that the processing is complex, but that it deals with complex events, as defined. Complex event processing typically detects predefined patterns that can be expressed by queries/rules/patterns/scripts and are deterministic in nature. Regardless of whether I think this is the best term, I think it is important to have a commonly agreed terminology; otherwise we confuse the industry, the customers (and sometimes ourselves). Now, Tim Bass claims that since event processing of a stochastic/probabilistic/uncertain nature is more complex than what we call "complex event processing", we should call that "complex event processing", and rename what we now call "complex event processing" to "simple event processing". Unfortunately, it is too late for that - and it is also not justified, since, again, the "complex" in "complex event processing" does not say this is "complex processing of events" but "processing of complex events" (a very common misconception!). Bottom line: yes - there is another class of event processing capabilities that requires techniques from AI, machine learning, OR etc., and that is not deterministic in nature; no - I don't think we should call it "complex event processing". We have suggested the term "intelligent event processing", which I have already referred to in a previous posting; there are a variety of other postings that I have dedicated to terminology.

More - later

Sunday, December 16, 2007

On Event Stream Processing

This is, in part, a response to my friend and colleague Claudi for his recent post in the CEP Interest Group.

There are many types of streams in the universe - the Gulf Stream that affects the weather, a water stream that provides a pastoral sight, and an audio stream, to name just a few.
In the event processing area, the name "stream" first appeared in the database research community, as a research project at Stanford. Interestingly, the name "event" is never mentioned there, and the term "data stream" is the central concept. The first to blend the "stream" concept with the "event processing" concept was my friend Mark Palmer from Progress, who did not like the word "complex" and thought that the term "event stream processing" would be better accepted; Mark certainly did not mean to talk about data streams in the academic sense. In the discussion session on the term event stream processing in Wikipedia, Mark writes:
You are completely correct in my opinion; these should be merged. And I say this from the perspective of the software vendor that popularized and caused the confusion in the first place. I'm the general manager of the Progress Apama software division and we coined the term "event stream processing" in April of 2005 when we acquired Apama for $30M - we didn't like the term "complex event processing" and decided to make up another term. Yes, stream processing, and data stream processing have been used as terms in academia, but we made up the term ESP as a synonym for CEP. Some on this list will argue that there are subtle, technical differences, but, being in the center of this quagmire of a debate, I think they should be merged, and that ESP should basically go away!
- Mark Palmer, General Manager, Progress Apama, mpalmer@PROGRESS.COM

Another indication of the blurring between ESP and CEP is that the vendor descendants of the academic projects - Streambase and Coral8 - now position themselves as "complex event processing" vendors. Both have "complex event processing" all over their homepages; Streambase labels its product a "complex event processing platform" (well -- we'll discuss platforms in another posting), and Coral8 has a portal which offers self-service CEP. Aleri, which also provides an SQL-oriented API, uses the term CEP as well, although they also use the term "Aleri streaming platform" as the way to do CEP. Thus, while the term "stream processing" is very much alive in the academic database community - see the VLDB 2007 program, for example - it seems that the market has already voted to unify these two terms behind the CEP term.
Why did it happen? In the beginning we saw some 2 x 2 matrices showing that CEP is complex and low-performance, while ESP is simple and high-performance. It does not seem that any vendor thought it was positioned at one of the extremes, since most applications are somewhere in the middle, and confusing the customers with two names, from vendors who competed on roughly the same applications and customers, did not help any of the vendors. Thus, the market wisely moved to one name (BTW, this name could also have been "event stream processing", as Progress suggested, but for some reason the term CEP is the one that caught on - and although some potential customers are still nervous about the word "complex", it got the traction nevertheless).
So far this has been a discussion about branding, which does not answer the question: are there real differences between ESP and CEP? In some cases people have indicated theoretical differences, the most notable being: stream processing is ordered, while CEP is partially ordered.
That may be true, though I was never convinced that "total order" is an inherent property of a stream; it is just the way it happened to be defined in the academic projects. I think the more important difference is whether we start from set-oriented thinking (stream processing) or from individual-event-oriented thinking (event processing); there are pros and cons to each, but the bottom line is that real applications may be mixed. They may have ordered events of the same type (e.g. when we are looking at trends in time series), or unordered events of the same type (e.g. when we are looking at information from various sensors whose original timestamps may not be synchronized) - in fact, both can appear in the same application. It is true that the space of CEP applications is not monolithic, but there are other classifications that are more useful than partial vs. total order. Thus, for practical purposes, let's assume that "stream processing", as defined by those looking for the theoretical differences, indeed covers a subset of the space of functionality; however, this subset is not important enough to have separate products covering it, or even to mention it as a sub-class.
Last but not least -- an answer to Claudi on his claim that there is not really a CEP engine, since none of the current products knows how to obtain general relations among events and calculate transitive closures.
My answer is that event relationship definitions do exist, but this is not the main point. One may claim that "there is not really a CEP engine that contains all the possible language features one can think of", and this is true: the EP discipline is young, and I am sure we have just scratched the surface - EP products will include many features that we did not even think of today (otherwise it would be an indication that the area has failed!). However, setting any particular feature aside, CEP engines do exist today; none is perfect, but they are probably sufficient for the big majority of existing applications, so theoretical perfection may not be the criterion for calling something a "CEP engine" - we'll have to settle for "sufficient conditions".
I'll relate to relations among events, including transitive closure in another postings - but the way they exist or don't exist does not really matter for the question. Long posting today - so this is all for now.

Saturday, December 15, 2007

On simple events and simple event processing

This is a picture I have borrowed from Siemens; however, I'll use it to talk about simple events and simple event processing. There is constant confusion around the terms "simple" (and "complex") here, due to the ambiguity of the phrase simple (complex) event processing - does it mean processing of simple (or complex) events? or does it mean simple (or complex) processing of events? In this posting, I am drilling down into the notion of simplicity.

Let's start with the simple event. I prefer to contrast a simple event with a composite event, where the contrast is in the structure: a composite event is a collection of simple events, and a simple event is an atomic event - nothing is said about the processing yet.

Now - is there different processing for simple events and composite events? In principle, no. There are some functions on collections that are not applicable to atomic events, but if we take a collection of simple events that has not been concatenated, we can apply the same processing to it.

Thus, my preference is to attach "simple" to the processing and not to the event type, and to define simple event processing as a simple type of processing, no matter what the event structure is. What are the characteristics of simple event processing?

  • processing is done on a single event - it does not look at other events; events are processed "one at a time".

  • the only types of processing possible are filtering and routing

  • filtering decides whether this event should be passed

  • routing decides to whom this event should be passed

Basic pub/sub with filtering is simple event processing.

An ECA (Event-Condition-Action) rule is also simple event processing. This does not say that the event cannot be derived/complex/composite - but regardless of the event's history, its structure, the reason it was created, and its source, the ECA rule still performs simple event processing, handling one event at a time: the condition provides the filtering, and the action is in effect routing to somewhere that carries it out.
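As a minimal sketch of this idea (the field names, amounts, and destination names are my own illustrative choices, not from any product), simple event processing reduces to a filter function and a routing function applied to one event at a time:

```python
def is_large(event):
    # filtering: decide whether this event should be passed at all
    return event.get("amount", 0) > 10_000

def route(event):
    # routing: decide to whom this event should be passed
    return "review-desk" if event.get("channel") == "web" else "audit-log"

def simple_event_processing(event, destinations):
    """ECA-style: one event at a time, never looking at other events.
    The condition does the filtering; the action is the routing."""
    if is_large(event):
        destinations[route(event)].append(event)
```

Note that nothing here cares whether the incoming event is raw or itself derived/composite - which is exactly the point made above.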

Furthermore, in many cases "simple event processing" is a preamble to the event processing network, done by the event producer, or post-processing done by the consumer; however, it can still be part of the Event Processing Network.

More related concepts - in the next postings.

Thursday, December 13, 2007

On Event processing and Event Driven Architecture

As those who read this blog can notice, I use the term "event processing" as the name of the discipline this blog deals with. Other people use the term EDA - "event driven architecture" - as their key term. One of the questions I was recently asked is: are they the same, or merely similar, like the two Arizona moths you can see in the picture above? This, of course, depends on the exact meaning, and people do use them interchangeably.
In my opinion these terms are different.
EDA deals with the way components communicate; in this case, unlike the request/response style, EDA is loosely coupled, asynchronous, and delivered via push.
Event Processing deals with the end-to-end processing of events; it may or may not be based on EDA. There are some cases that are EP but not EDA, for example:
  • An event can be obtained by pull (periodically or on demand)
  • An event can be part of a transaction - and thus there is a dependency
  • An event can be communicated to an event processor using a request/response protocol, while the actual functionality is event processing.

(there may be other reasons).

So EP and EDA are not really identical...

Tuesday, December 11, 2007

On sources for uncertainty in Event Processing

There are various sources of uncertainties associated with event processing - here is an attempt to list some of them:

Uncertainties related to the source:
  • Uncertainty that an event happened, due to lack of a credible source, or inaccuracy in the source's reporting (e.g. has the sensor really detected an object, or has there been some power failure in the process?).
  • Uncertainty in classifying an event that happened (murder? suicide? accident?)
  • Uncertainty about the value of a certain attribute in the event (again - inaccuracy of measurement or lack of information)
  • Uncertainty about the timing of an event (it happened sometime during last night, but we don't know when).
  • Uncertainty whether our sources reported all events (we cannot assume a "closed world")
  • Events that are inherently probabilistic (e.g. future/predicted events).

Uncertainties related to the processing:

A pattern in the event history designates a "business situation" in the application domain

  • Uncertainty whether the pattern detection is a sufficient condition to identify the situation, or only an approximation (which is a major source of "false positives" and "false negatives").
  • Uncertainty about the meaning of a "partial satisfaction" of the pattern, e.g. the pattern consists of a conjunction of four events; what happens if three out of the four occur? is it really a binary game?
  • Uncertainty that is driven by one of the uncertainties related to the source (e.g. uncertainty in the timing of an event occurrence may induce uncertainty in a temporal-oriented pattern).
  • Processing of probabilistic events.

There are also uncertainties associated with the event consumer - but these are for now outside the scope of this discussion. More - later.
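One way to make source uncertainty explicit in the processing is to carry a certainty value on each event and propagate it through pattern detection. A sketch, under the simplifying (and often wrong) assumption that source uncertainties are independent; the attribute names are mine:

```python
from dataclasses import dataclass

@dataclass
class UncertainEvent:
    event_type: str
    certainty: float  # probability that the reported event really happened

def conjunction_certainty(events, required_types):
    """Certainty that all required event types occurred, assuming
    independence of the source uncertainties (a simplification)."""
    p = 1.0
    for t in required_types:
        matches = [e.certainty for e in events if e.event_type == t]
        if not matches:
            return 0.0  # no report; under an open world this is itself uncertain
        p *= max(matches)
    return p
```

A downstream rule could then fire only above some certainty threshold, turning the "binary game" of pattern satisfaction into a graded one.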

Sunday, December 9, 2007

On Virtual Events

The glossary defines an event as something that happens; it also has an escape hatch to talk about virtual events, that is: an event that does not happen in the physical world but appears to signify a real world event; an event that is imagined or modeled or simulated.
Which leads to the question: when we are processing events, is there a place for processing events that don't really happen in the physical world? Well, today people have a second life in a virtual reality, and some events in that reality are worth processing - this is an example of a virtual event, but there are other examples. When event processing is used in a simulated mode, the simulated event also does not happen in the physical world; if an event is predicted, it has not happened (yet?) in the physical world; and there are more examples. The question is - what is the difference between an event and a virtual event? Are the boundaries even clear? In the meta-physical world the answer is YES - an event either occurs or does not occur (following the law of the excluded middle); however, in reality we may have uncertain events, and thus, when processing such an event, we don't really know if it is a real event or a virtual event. In previous posts I started to talk about the importance of context, and this is the key to handling these events - an event happens within a context. The context may be the physical deterministic world, a virtual world, a physical stochastic world, a simulated world, a predicted world etc. - and the event happens within this context. The type of processing for these events is the same, so something that is real in one context may be virtual in another context and vice-versa.
Furthermore, something that is considered an important event in one context may be a non-event in another. For example:
the event that the Dow Jones is up by 5 percent is exciting for some people, and a non-event to the fisherman on a Pacific island beach who does not even know what the Dow Jones is; for him, the capture of a fish that weighs 100 KG is a big event, while the broker on Wall Street is not excited by such an event. Thus, whether something deserves to be called an "event" is also a question of context. I'll get to a more formal definition of context in the next posting.

Thursday, December 6, 2007

On Event Representation

Back to a micro-oriented issue: today I'll start a discussion about what's behind the definitions in the event processing glossary, and get to the issue of event representation. As the glossary says, an event is something that happens in reality. We also tend to call the representation of this reality, for the purpose of processing by a computer, an "event". This notion of event has several aliases in the glossary: event object, event message and event tuple. The various aliases indicate that the space of event representations is not uniform: some think of an event as a message that moves around; some think of it as a tuple that is part of a stream - really the twin brother of a tuple in a relational database; and some think of it as an object with arbitrary structure (which may also be hidden). Obviously, there is no "universal event", and unfortunately, since in many cases events are already given from the sources in their given formats, and the event processing designer has little say about it, a generic event processing system has to support multiple types of events, or have adapters that translate all types of events to some canonical type (and typically both - supporting some canonical type of events and having adapters that translate other types to it). Events can be structured, semi-structured (XML), or unstructured (the area of unstructured event processing deserves more focused attention). One of the questions is whether there are common attributes that each event should have to enable event processing. In the data world the answer is no - there is not a single attribute that must exist in all relations (besides the fact that each tuple should be a member of some relation - no floating tuples). For event processing, some attributes have been proposed as common attributes:
  • Event-type
  • Source
  • Time-Stamp (or Time-Interval)

Let's look at the question - are they mandatory or not?

  • The first question is whether each event is an instance of an event-type (or event-class). The glossary says yes: "all events must be instances of event-type". This seems reasonable; however, we may think of some exceptions - such as rare events that have not been classified. I need to drill down on rare events in some other post.
  • The second question is whether the source should be mandatory. Again, this is desirable if we want lineage or the ability to trace back actions/decisions, but there may be cases in which the source is indefinite, or we wish to hide the source (e.g. leaking of information).
  • The third question is whether each event must have a time-stamp (or a time-interval, in case it happens over an interval - another area that needs more discussion). The answer is that many event processing patterns are time related, and if we want to know which event occurred first, or whether two events occurred within 5 minutes of each other, we need to know WHEN the event occurred in reality. However, in some cases it is not known, and in other cases it is not really needed.

It seems that all common attributes are useful, but may be optional in some cases.
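A sketch of such a representation in Python (the field names are my own; the point is that the common attributes are present but each is optional, while the domain content lives in a free-form payload):

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple

@dataclass
class EventObject:
    # proposed common attributes -- useful, but each may be absent
    event_type: Optional[str] = None   # rare events may not be classified yet
    source: Optional[str] = None       # may be indefinite or deliberately hidden
    occurrence: Optional[Tuple[float, float]] = None  # (start, end); a point event has start == end
    # domain-dependent content, shaped by the relevant ontology
    payload: Dict[str, Any] = field(default_factory=dict)
```

Modeling the occurrence as an interval covers both the point case and the over-an-interval case mentioned above with one attribute.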

There are attributes that are common to certain types - such as probability for uncertain events, or spatial coordinates for spatial events - and this is before getting to the content.

The content is determined according to domain-related ontologies - and there is a lot of work today in different application domains and industries to define such ontologies. XML is often used as the ontology language; it has its benefits, but it also carries overhead relative to "flat" events in which the attributes are position oriented rather than keyword oriented.

Events also carry semantic information - such as references to entities in certain roles. In fact, an event can be thought of as a transition from one state to another, and the information included in the event refers to a change in the universe, such as:

what was changed? what entities are affected? when did the change occur? where did the change take place? what other information is important about the change?

This short discussion has already raised several open issues that deserve further discussion - so I'll put these topics in the queue for further postings... more - later.

Wednesday, December 5, 2007

On False Positives and False Negatives

From a syntactic point of view, CEP looks for patterns, and derives an event / triggers an action for each pattern detected. However, detecting the pattern is the mechanical work; the pattern designates a "situation" - an "event" in the customer's frame of reference to which the customer wants to react (there are also "internal" situations for further processing). There is obviously a gap between the intention (the situation) and the way it is detected (the pattern on the event flow). In many cases, satisfying the pattern is a sufficient condition to detect the intended situation; in other cases, it serves only as a "best approximation". This leads to the phenomena of false positives (the pattern is detected, but the situation did not really happen) and false negatives (the situation occurred, but the pattern has not been detected). Some reasons are:
  • Raw events are missed - they do not arrive at all, or do not arrive on time (source or communication issues).
  • Raw events are not accurate - values are wrong (source issues).
  • Temporal order issues - uncertainty about the correct order of events.
  • The pattern does not accurately reflect the conditions for the situation (e.g. there are probabilistic elements)
  • (other reasons?)

As in the time constraints case, there are various utility functions to designate the damage from either false positives or false negatives.
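Such a utility function can be as simple as a weighted cost; a sketch, with weights that are purely illustrative:

```python
def detection_cost(false_positives, false_negatives,
                   fp_cost=1.0, fn_cost=10.0):
    """Total damage from misdetection. Here a missed situation (false
    negative) is weighted ten times worse than a spurious alert (false
    positive) -- the right weights are entirely application dependent."""
    return false_positives * fp_cost + false_negatives * fn_cost
```

Tuning a pattern then becomes a trade-off: tightening it reduces false positives but usually adds false negatives, and the weights decide which way to lean.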

More on that issue - later.

Saturday, December 1, 2007

On CEP and IEP

Ambidexterity is a good property for a boxer - he can decide when it is better to attack with his right hand and when with his left (I am part of the left-handed minority; I should sometime write a post about being left-handed in the right-handed people's world). Likewise, there are problems in the event processing space that can be solved by deterministic means (rules, queries, scripts, patterns - choose your favorite religion), and problems that are solved by stochastic means - using probabilistic networks, machine learning etc. (AKA IEP - Intelligent Event Processing). When there is a well-defined pattern that needs to be traced, say to check compliance with regulations, a deterministic approach should be used; when there is a need to dynamically change traffic light policies to minimize vehicle waiting time, there is a need to predict the traffic in the next few minutes - this is a non-deterministic problem and requires some stochastic tool (BTW - my student, Elad Margalit, is looking at the traffic lights issue in his M.Sc. thesis). Event Processing platforms should include various types of functionality - which brings us to another discussion, on the "actor/agent" architecture, which I'll refer to in one of the next posts. More - later.

Wednesday, November 28, 2007

Introducing DEBS 2008

DEBS 2008 is an international conference that is becoming the "flagship" conference of the event processing community. The research discipline of "event processing" has its origins in various disciplines: databases (active databases, stream processing), verification, simulation, rules, programming languages and distributed computing. DEBS has for several years been a conference of the "pub/sub" community that came from distributed computing, which still controls the conference (I have suggested adding some people from outside this community to the steering committee, but it has not happened yet!). In the course of establishing the "event processing" community, an annual research conference is an important step. As you can see in the call for papers, the list of topics is quite comprehensive; to that end, there is an agreement for 2008 that other research workshops/conferences in this area, such as EDAPS, will join forces with DEBS 2008 (however, we shall still hold the EPTS conference next year, as its goal is different).
In addition to the research program, DEBS 2008 will have:
  • Industrial session - where vendors/customers can present
  • Demo session - to demo products/prototypes
  • Tutorials - in-depth study of the state-of-the-art

There is plenty of time to prepare - the deadline for submission in all these categories is in March (the conference itself is in early July). Last but not least - an opportunity to visit Rome (I have never been there outside the airport).

Monday, November 26, 2007

Non events again - much ado about nothing

Shakespeare fans (and theater fans in general) love the wonderful play : Much Ado about Nothing.
I recalled the title of the play when seeing that different people have written various things this week about the issue of "non events". Tim Bass thinks that this is just a time-based event; there have been various answers to David Luckham's question on the CEP forum. I actually did not think that I would write a second posting on this topic, since I don't think it is that important - it is just one of various basic patterns. However, I'll try to provide a more formal discussion of what is called a "non-event event".

Let's defer the name discussion for a while - and try to define what it is:

This pattern is a function of two things (as many patterns are): an event E, and a context C, for simplicity let's assume that this is a temporal context only - i.e. a time window.

The definition:
  • For all t such that t is a time-stamp and t is an element of C: the event E does not occur at t.
We can of course enable a condition - saying "event E that satisfies predicate P", but this does not change the principle.

The first question is - does the fact that an event E does not occur at a time-stamp t constitute an event? i.e. is a "negative event" an event by itself? The answer is NO - the fact that an event does not happen at a certain time-stamp is not an event, since an event is something that happens, not something that does not happen.

The second question is - is the pattern defined above an event? The answer is: like any other pattern, the detection of this pattern may create a derived event. The semantics is that the detection of this pattern may be interpreted as a situation that requires some reaction, and in this case we choose to derive an event designating that this situation has been detected. This does not change the answer to the first question - the negative event at any time-stamp is still not an event!

Now - to the question of WHEN such a derived event is reported. In general, a derived event may be reported immediately when the pattern is detected, or in a deferred mode (this term is taken from active databases). There are patterns that only make sense in deferred mode - and this pattern is one of them, since it only makes sense to talk about the fact that an event did not happen during a time-interval at the end of that time-interval. Deferred reporting of derived events typically occurs at the end of a temporal context, but it can also carry something like a temporal coherence condition, such as: report within 2 hours. The reporting of the derived event is time-based, but this is true not only for this pattern but for any derived event that is reported in deferred mode.
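A sketch of such deferred detection in Python (the event shape and names are my own illustrations): the check runs only once the time window has closed, and reports a derived event if nothing matching was seen.

```python
def detect_absence(events, event_type, window_start, window_end):
    """Deferred evaluation at the end of the temporal context: returns a
    derived event (here just a dict) if no event of the given type
    occurred within the window, and None otherwise."""
    for e in events:
        if e["type"] == event_type and window_start <= e["time"] <= window_end:
            return None  # the event did occur; no absence to report
    # reported at window_end -- it makes no sense to report earlier
    return {"type": "absence", "of": event_type,
            "window": (window_start, window_end), "reported_at": window_end}
```

Note that the returned dict is the derived event; no "negative event" exists at any individual time-stamp inside the window.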

Last but not least - the name: "non-event event" is a bad name, since it has a flavor of negative events, as noted in question 1. The names "absence pattern", "not pattern", and "time-out pattern" are used for it. Time-out could be a good name, but we need to think about whether it is general enough.

Back to other topics -- in the next post.

Saturday, November 24, 2007

Is there a non-event event ? on absence as a pattern

Recently, there were some discussions, related to the glossary, about the term "non-event event". This seems to be a funny notion that contains a self-contradiction, similar to "soapless soap". I don't think this term is very good - I prefer to call it an "absent event".
Now to the questions: how is it really defined? and is it really an event?
As for the first question: an absent event is a CEP pattern that is defined within a context. Taking some examples:
  • There have not been any major financial transactions during the working day for the customer John Galt.
  • The Pizza order has been received, but the delivery did not arrive within 40 minutes.

In these two examples we are looking for things that did not happen within a certain context. The context of the first example is "John Galt's financial activities within a single working day"; the event that did not happen - a "major financial transaction" - is by itself a pattern that needs to be defined and detected (say, a transaction with a financial value of more than $10,000). The absence holds over all time-stamps that belong to the context, so the formal definition of the absence pattern of event E in context C is: for all time-stamps t that belong to the temporal extension of context C, event E does not occur at time-stamp t.

Let's check this definition on the Pizza order: the context here is a certain order, and its temporal extension starts when the Pizza order is confirmed and expires after 40 minutes (which is the public commitment of the take-away Pizza shop for Pizza delivery).
This is also an example of the use of context as a major abstraction that gets the thinking about event processing closer to the way people think.
Now, to the question of whether the detection of an absence pattern is an event: a pattern detection by itself is not an event (although it may be an event from the point of view of the internal management system of the EP platform); however, it may create a derived event, which has some structure. For example, the derived event created in the wake of the missing Pizza delivery consists of: date, time, pizza store id, value of order. This event may be enriched with a summary of past late deliveries from the same store, and may then notify or orchestrate something.

Wednesday, November 21, 2007

On Real-time, Right-time, latency, throughput and other time-oriented measurements

The illustration shows different types of "real time". This posting was inspired by a comment on a previous posting, and tries to bring some order to several notions of time. First, the term real-time is frequently used in conjunction with event processing. The popular belief is that real-time = very fast, but that is not really what real-time is about. Real-time can be thought of as a deadline accompanied by a utility function that designates the damage from missing the deadline. In this illustration there are four types of real-time:

(a). Soft Real-Time: there is sense in reacting after the deadline, but the utility decreases (maybe fast) and at some point reaches zero - no use doing it at that point, but no damage.
(b). Firm Real-Time: the utility goes immediately to zero when the deadline is missed - no use doing it after the deadline, but no damage.
(c). Hard Essential: when the deadline is missed, the utility function goes to a constant negative value; there is a constant penalty.
(d). Hard Critical: when the deadline is missed, the utility function goes immediately to "minus infinity", meaning: a catastrophe will happen.

One can, of course, define the real-time utility function differently, and create more variations.
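The four variants can be sketched as utility functions of lateness (time past the deadline); the shapes follow the description above, while the concrete numbers (grace period, penalty) are arbitrary illustrations:

```python
import math

def soft(lateness, grace=10.0):
    # utility decays after the deadline and reaches zero -- a late
    # reaction loses value but causes no damage
    return 1.0 if lateness <= 0 else max(0.0, 1.0 - lateness / grace)

def firm(lateness):
    # utility drops to zero immediately at the deadline -- no use, no damage
    return 1.0 if lateness <= 0 else 0.0

def hard_essential(lateness, penalty=5.0):
    # missing the deadline incurs a constant penalty
    return 1.0 if lateness <= 0 else -penalty

def hard_critical(lateness):
    # missing the deadline is a catastrophe ("minus infinity")
    return 1.0 if lateness <= 0 else -math.inf
```

A scheduler can then rank pending reactions by how much utility is at stake, which is exactly where the "right-time" discussion below leads.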

So real-time is not about being fast, but about missing the deadline. The linkage is there if the deadline is very short (need to react within 1/1000 of a second), but many deadlines are longer than that - seconds, minutes, hours or days, depending on what one needs to react to. E.g. the contract we have with our local electricity company says that when a problem is reported, they should start fixing it within 2 hours; the deadline for Pizza delivery (otherwise it is free of charge) in one of our local delivery centers is 40 minutes - so most of the world typically does not work in milliseconds.

"Quality of service" measurements can be either statistical - in 90 percent of the cases the deadline should be met - or individual - in all cases the deadline should be met. Typically there are different scheduling strategies to achieve each of them; for the individual case it is important to have a consistent, deterministic reaction, and this is the motivation for the various implementations of Real-Time Java: Java is non-deterministic by nature, since garbage collection happens at undetermined times and lasts for undetermined durations, which may work for the statistical case but not for the individual case. Real-Time Java does not stand for Java that runs much faster, but for Java in which the garbage collection variations are smoothed, so that its reaction is (more) deterministic.

What other measurements are there? In event processing, latency can be measured end-to-end (from the time the event happens in reality to the time action is taken in reality), can relate only to the event processing part (from the time the producer sends the event until the time the consumer(s) receive the consequences of this event), or can relate to a specific function (agent) in the event processing network. So when latency is mentioned, it should be defined what is really being measured. The deadline typically refers to a time constraint on the latency.

The term throughput designates the amount of events that the system can accept as input in a given time interval (1 second, 1 minute etc.). Throughput does not necessarily entail that all these events are handled within this time interval - it may be the case that the events are put into a buffer and all of them are handled within the next 6 hours; in this case, the throughput is mainly determined by the capacity of the input channels and buffers, and not by the speed of processing. Of course, there are applications that require high throughput together with an individual hard real-time constraint on each event.

The "right time" to react is determined by these time constraints, which in turn determine the scheduling and optimization requirements - more later.

Tuesday, November 20, 2007

"The only motivation to use EP COTS is to cope with high performance requirements" - true or false ?

Somehow I find myself using my collection of analysts' presentations to help me make some points. This time I am showing a slide from Roy Schulte's presentation at the Gartner EPS summit - I'll return to the content of this slide shortly, but will start with a discussion I had yesterday about the reasons that enterprises use COTS for event processing. I have heard (and not for the first time) the assertion that the only reason one would want to use EP software rather than a hard-coded solution is the ability to cope with high throughput / low latency - in short, to deal with high performance requirements. If there are no high performance requirements, there are other solutions; e.g. the database guys think that in this case one can insert all events into a database and use simple SQL queries for CEP patterns, or just use good old C/Java programming for this purpose. This is somewhat inconsistent with my own experience, where customers that did not have "high performance" requirements were eager to use CEP technologies. Indeed, high performance is a reason to use CEP COTS; however, as indicated in Roy Schulte's slide above, it is actually a minor reason. According to Gartner, the high end is somewhere between 5-10 percent of the candidate applications, and looking at the prediction for 2012, the high end will account for 8% out of the 27% total use; note also that Roy Schulte defines the high end as 250 events per second, which is really far from the definition of "high performance", so the numbers are even lower. It seems that the market for "non high performance CEP" is much larger, and will grow faster. If that's so - where does the misconception that EP always equals high performance come from? I think there are two sources. The first is that the early adopters were from the capital markets industry, where some (not all!) of the candidate applications indeed have high performance characteristics.
However, with the growth of the market and the use of EP software in other applications and other industries, these types of applications, while continuing to grow, will not match the higher growth of non-high-performance applications. The other reason is that some vendors make high performance their main message, trying to get the market to believe that this is indeed the most important property.
So if high performance is not the only reason to use EP COTS, what are the other reasons? This is a matter for investigation, but IMHO the main ones are "high level programming" and agility - in short, the ability to reduce the Total Cost of Ownership.
I'll provide more insights about the TCO issue in a future post.

Sunday, November 18, 2007

On CoDA - Context-Driven Architecture

Yefim Natis from Gartner, the same analyst who brought us the XTP vision,
is also responsible for the brand new vision of CoDA = Context-Driven Architecture, which is one or two steps beyond the XTP era. In the previous posts about contexts: (1) ; (2)
I argued for the use of context as a first-class citizen in event processing; indeed, people act and think within contexts, and the use of explicit context can bring computing closer to human thinking. It also has computational benefits (like indexed vs. sequential search in the early days of file systems). Neither Yefim nor I provided a formal definition of context when talking about it, and I still have a "to do" to think about it. One comment, though, about the "Architecture": some people think that SOA and EDA are contradictory terms, since there can be only one "big A" - meaning architecture. As I have already mentioned in the posting about EDA and SOA, they represent different aspects of computing, and thus are orthogonal (but can co-exist). The question is whether context is yet another dimension that can co-exist with the other aspects - again, I still need to think about it. So I'll return to the context concept soon.

Friday, November 16, 2007

Preparing for unexpected events - pickpocket

Today I experienced an unexpected event - I was most likely pickpocketed while loading shopping bags from the supermarket into the car. When I noticed, it was too late - the wallet was gone. It will take me some time and effort to reconstruct its contents: I had to cancel credit cards, get a substitute driving license, get a new identity card, and replace some other less important cards as well. Event processing could help in this case: a locating chip attached to the driving license, credit cards etc. could help locate them (for the privacy fans - this would also let some big brother trace you), and some automatic procedure to replace everything that was there, without bothering me, would also have been helpful. This raises the question of cost-effectiveness: the investment in handling relatively rare events, which carries a noticeable overhead, vs. resolution when they do happen. This is a utility function which weighs the cost and other damages against the damage from the event when it happens - an interesting investigation area, which also has practical impact on applying event processing in business situations: which events should be traced and handled, and which not.

And since the tone of this blog is personal this time, I would also like to write a few lines about a good friend and colleague, Shlomit Zak, who unfortunately passed away recently at a relatively young age. Many of the foundations of the work I have done over the years came from my joint work with Shlomit in the Israeli Air Force, many years ago. Shlomit was one of the brightest persons I have known, who unfortunately did not come close to fulfilling her potential; besides her professional abilities she had an artistic soul, was a gifted piano player with a keen sense of humor, and was famous for writing acrostics. My last Email from her was two months ago, when she sent me a "happy new year" card with an acrostic, and I replied: "happy new year to you too - the acrostic is nice, as usual; the rhyming is mostly reasonable", and got the answer: "and you, as usual, cannot free yourself from the need to give grades" (well, in Hebrew it sounds better...).

Thursday, November 15, 2007

The MARK on the BENCH - and the mythical event per second

A recent news item from BEA talks about a benchmark and cites some EPS (Events Per Second) figures. Unlike some vendors that just cite numbers, there is also a white paper describing the benchmark. I don't wish to refer to the BEA benchmark specifically, but to share some insights about benchmarks in general. Benchmarks have a positive side, in that they enable one either to compare different products based on the same criteria, or to evaluate some properties of a product even when not comparing it to others. Currently there is no "standard" benchmark in the event processing area; thus, vendors invent their own benchmarks, carefully designed to expose much of the strengths, and none of the weaknesses, of their products, and create benchmarks that may not be reproducible in other environments, or with some change in the application. Thus, to make any significant comparison between different products, standard benchmarks need to be constructed. Standard benchmarks, by themselves, may be a double-edged sword: since we have a benchmark-driven industry, vendors will invest a lot of resources in optimizing for the standard benchmark; however, this may not help a specific application, since its requirements may be far enough from the benchmark. Event processing is a heterogeneous area, which means that a single benchmark will not be sufficient - we need a collection of benchmarks, and each customer will have to choose the one or more benchmarks that are closest to its requirements. The standard benchmark should come from a vendor-neutral organization. I know of some academic work in this area, but more needs to be done.
And a word of caution - all the benchmarks refer to performance characteristics such as latency and throughput. But as noted in a previous post on the mythical event per second, I doubt these are the main decision criteria in most applications - thus benchmarks should refer to other dimensions as well (functions, consumability, other non-functional requirements). While there are certainly cases in which high performance characteristics are critical, in general I think this is a bit over-hyped. more - later.
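To make the throughput/latency distinction concrete, here is a minimal, hypothetical micro-benchmark sketch. The `process` function and event shape are invented stand-ins; a real benchmark would need realistic event streams, warm-up runs, and percentile (not just mean) latencies - which is part of why a single EPS number says little on its own:

```python
# Toy micro-benchmark: measure throughput (events/sec) and mean latency
# of a stand-in event-processing function. Illustrative sketch only.
import time

def process(event):
    # stand-in for real pattern matching / filtering logic
    return event["value"] > 100

def run_benchmark(n_events):
    events = [{"value": i % 200} for i in range(n_events)]
    latencies = []
    start = time.perf_counter()
    for e in events:
        t0 = time.perf_counter()
        process(e)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    throughput = n_events / elapsed          # events per second
    mean_latency = sum(latencies) / len(latencies)
    return throughput, mean_latency

eps, lat = run_benchmark(100_000)
print(f"throughput: {eps:,.0f} events/sec, mean latency: {lat * 1e6:.2f} microseconds")
```

Note that even this toy harness bakes in choices (event distribution, single-threaded loop, per-event timing overhead) that a vendor could tune in its favor - which is the reproducibility problem described above.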

Tuesday, November 13, 2007

On more or less important issues - the case of event processing languages

There are things important in life - and things that are less important.
In the picture you can see my four daughters riding two elephants on our family vacation in Thailand earlier this year; for me they are obviously very important, while other things in life may be less important. I am returning to the issue of event processing languages, since I realize that too much energy is being spent in the community on comparing what a certain language's syntax can or cannot do - I admit that I have also contributed my share to this discussion.
Now I am raising the question - is it an important topic to deal with? Yes, I realize that some vendors would like to make a language a differentiator (I am not sure anybody has succeeded in this task), and it is also true that some people in the development community have programming style preferences. But I see the trend of patterns/rules/queries being constructed by semi-technical people, who will not be able to use many of these languages and will need higher-level abstractions anyway.
Therefore, I think that the more important things to understand are the requirements and semantic features of event processing - so I intend, at least for a while, to stay away from programming style and concentrate more on these two issues.
more - later.

Saturday, November 10, 2007

Context and Situation - are they synonyms?

I have spent the last couple of days in Eilat, a resort town near the Red Sea - the Las Vegas of Israel, if one counts hotels, and an interesting combination of desert and beaches in close proximity; in the picture you can see the hotel in which we stayed. Anyway - back to context. A comment I received on the previous posting made me realize that there is a need to explain the difference between "context" and "situation" - both of them semantic terms. In a way, it is similar (but not identical) to the difference between "state" and "transition". A "situation" is something that happens and has some meaning in the consumer's terms; it is the event consumed by the consumer (and as a result may trigger an action). In syntactic terms, a situation is an event, and may be either a raw or a derived event (I'll not get into complex / composite events - it is sufficient to say it can be any kind of event). A context is a state - a context can be created by the occurrence of one event and destroyed by the occurrence of another event. Let's take a simple context: "during red alert". There is an event that declares the "red alert" state, and then there is an event that declares any other color; the latter is considered the "converse event" of the starting event, thus it ends the context. While the declaration of "red alert" is an event and can also be a situation, the fact that another event happened "during red alert" is not an event (semantically). A context may also be spatial - for a fleet management application, a context may be a geographical area: the events are that a vehicle enters and exits the context, but the fact that a "vehicle is within the area" is not an event.
Thus "context" has a distinct semantic existence. As far as implementation goes - there are various relations between events and contexts. After writing my first blog entry about context, somebody drew my attention to the fact that analysts have recently been talking about "context-aware delivery architecture" - I'll refer to this four letter acronym (are we out of all combinations of three letter acronyms?) in one of my next blog entries.
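The red-alert example above can be sketched in a few lines: the context is a state opened by one event and closed by its converse event, while the situation is a derived event detected only while the context is active. The event names here are invented for illustration:

```python
# Sketch of the context/situation distinction:
# a context is a STATE toggled by events; a situation is itself an EVENT,
# derived here only while the context is active.

class Context:
    """A state opened by one event type and closed by its converse event type."""
    def __init__(self, opens, closes):
        self.opens, self.closes = opens, closes
        self.active = False

    def observe(self, event_type):
        if event_type == self.opens:
            self.active = True
        elif event_type == self.closes:
            self.active = False

red_alert = Context(opens="red_alert_declared", closes="alert_color_changed")
situations = []

stream = ["red_alert_declared", "missile_detected",
          "alert_color_changed", "missile_detected"]
for event in stream:
    red_alert.observe(event)
    # the situation is derived only inside the context
    if event == "missile_detected" and red_alert.active:
        situations.append("missile_during_red_alert")

print(situations)  # only the first detection falls within the context
```

Only the first "missile_detected" yields a situation; the second occurs after the context was closed, so semantically nothing of interest happened - even though the raw event is identical.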

Tuesday, November 6, 2007

The notion of context and its role in event processing

I have copied this pyramid from an academic project site at Monash University in Australia that explores the notion of context in pervasive computing.
It represents quite well the location of context - between the raw events ("sensory originated data" in this picture), and the derived events ("situation" in this picture).

The rationale behind this term is that under different circumstances, we identify different situations from the same event or combination of events. This can be temporal: "during working hours" we detect different situations than "during off-working hours", or "as long as red alert is in effect"; it can be spatial: "if it happens within the county limits"; or it can be entity-driven, such as "if it relates to platinum customers".

Is context a first-class abstraction? Not in SQL-oriented languages, where the "where" clause is semantically overloaded, although there is limited support for temporal contexts via the notion of a "time window" (which is usually just a time offset). Some work on context support in rules exists, though not in the mainstream products in this area. However, there are benefits to making context a first-class citizen:

  • It is a first-class semantic abstraction: people think in contexts.
  • It provides computational efficiency. E.g., in an event processing network, context can determine the partition of agents: each agent performs a "function within context" (i.e., detecting a complex event processing pattern), thus events are routed only to the agents that meet their context requirements.
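The second point - routing events only to agents whose context they match - can be sketched as follows, using a hypothetical entity-driven context (customer tier) and invented event shapes:

```python
# Sketch of context-based partitioning in an event processing network:
# each agent receives only the events that fall within its context,
# instead of every agent scanning the full stream.
from collections import defaultdict

def context_of(event):
    # entity-driven context: partition events by customer tier
    return event["tier"]

# one agent (here just an event list) per context partition
agents = defaultdict(list)

events = [
    {"tier": "platinum", "type": "complaint"},
    {"tier": "gold",     "type": "purchase"},
    {"tier": "platinum", "type": "purchase"},
]

for e in events:
    agents[context_of(e)].append(e)  # route only to the matching agent

for ctx, routed in sorted(agents.items()):
    print(ctx, [e["type"] for e in routed])
```

The "platinum" agent sees only platinum events, so a pattern like "complaint followed by purchase for a platinum customer" is matched over a much smaller stream.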

I'll continue to discuss contexts in later blogs.

Sunday, November 4, 2007

Blog **2 - A Blog about this Blog - after the first 1000 readers

Today I have processed the event that the number of "absolute unique visitors" is 1004 at the time I am writing these lines, and this deserves some reflections about this Blog. To be exact, I may be missing some readers in the statistics, since I started the Blog on August 28 and started measuring on September 11. It seems, surprisingly, that people are reading this Blog; my previous experience with writing was writing poetry as a teenager, and that was long ago...
Anyway, I have drawn some statistics: out of the 1004 readers, 12 have commented - the one who has sent by far the most comments is Harvey Reed.
There were also a few who commented in their own Blogs - with or without explicitly mentioning me. The geographical distribution is also interesting - most of my visits are from the USA: 878 visits out of a total of 1,798. This is true also for the "absolute unique visitors" - 525, which is the majority. So I am writing mostly for the USA folks - however, other countries have also contributed visitors to the "absolutely unique" list: the UK is second with 89, India (44) and Canada (43) come next, and then France with 24, and Australia and Israel with 23 each. The total is 59 countries - including one country that I had not previously heard of, Saint Kitts and Nevis - as well as some of our neighbours - Kuwait, Iran, Egypt, Cyprus and Turkey - and countries on all continents. Around half of the readers have returned, and some have returned more than 100 times (!!!). About the content - while most entered through the "root", the most popular direct entries were those that dealt somehow with business rules: the most recent one and the first in this series. I am not sure they have made me very popular among the business rules guys (???). Next were the blog entry with all the TLAs and the mythical event per second piece. The most popular traffic source has been direct traffic (around 20%), and of the referral sites, the biggest by far is the almighty Google; however, 10% of the traffic to my Blog has been referred from Tim Bass's Blog. I think I have mentioned before that Tim urged me for months to write a blog, so Tim's Blog is in a way my "parent Blog". About the content - I did not find a strong correlation between the type of content provided and the number of visitors on a given day, so I'll continue to experiment with writing all kinds of stuff (macro and micro issues) and reacting to other community Blogs. There are plenty of topics that I have promised to come back to, but I will continue doing so in a random fashion. So - thanks a lot to the 1007 readers (the number has grown since I started writing this entry), and see you in the next real Blog (this one probably does not count)...