Wednesday, December 30, 2009

2009 - event processing perspectives

2009 is going away soon, and it is time to summarize it from the event processing perspective, which is the focus of this Blog.

According to analysts, this has been a good year for event processing: in a tough economic climate, the overall market for event processing continued to grow, more or less according to the original predictions, and it is expected to continue its substantial growth. Here are ten statements about event processing in 2009.

  1. In the vendor world, Microsoft has announced a forthcoming product, Software AG has announced that it is working on a product, and more start-ups have joined the area; the most notable acquisition this year is that of Coral8 by Aleri, which was not an intuitive acquisition.
  2. In my own company, IBM -- besides Websphere Business Events (WBE), which was launched in 2008 and grew rapidly in 2009 in number of customers, IBM announced three more products in this area in 2009: Infosphere Streams, Websphere Sensor Events, and the EDA extension for CICS, as IBM believes in making event processing capabilities pervasive throughout its software portfolio.
  3. The emergence of new books. The first book in this area to make a big impact was David Luckham's "The Power of Events". Eight years passed with Luckham's book as the single book in this area. In 2009 several more books were published, most notably the book by Chandy and Schulte. Some more books are due in 2010.
  4. Some popular magazines ran articles about event processing; one of them is the International Journal of Banking Systems, and another is IEEE Computer.
  5. All major analysts issued special reports on event processing. Gartner has written about it before, but has now made it an explicit part of its "hype cycle"; Forrester published a thorough report comparing several products over multiple criteria.
  6. DEBS, the major scientific conference on event processing, has been endorsed by ACM and became the ACM DEBS conference. Over the last couple of years the conference has shifted from a "pub/sub" conference to a broader event processing conference. EPTS provided two tutorials: languages and use cases.
  7. Other workshops connected event processing with other areas, such as event-driven business process management and event-based processing in robotics. These two topics were also discussed in the annual EPTS meeting held in Trento.
  8. It was announced that Streambase will receive the World Economic Forum "technology pioneer" award, an indication that event processing is considered one of the influential technologies for the world economy.
  9. Another winner of the same award is Twitter. This year various applications that process Twitter events have emerged.
  10. The quote of the year comes from Alex Buchmann, in his keynote address at DEBS 2009: regular programming is like drinking with a straw, which is fine when the data is standing still; when the data is moving, as in event processing, using the same kind of thinking is like using a straw to drink from a waterfall.
Something about what's coming in 2010 -- later.

Wednesday, December 23, 2009

On common misconceptions about event processing - the complexity misconception

David Luckham coined the term "complex event processing", and this term has caught on as the marketing term behind many of the vendors that provide event processing platforms (comment: IBM, and recently Progress/Apama, moved to use the term "business event processing"). While this term succeeded in getting traction, it is also the source of one of the common misconceptions. Luckham talked about complex events and their processing; some people understand the term as the complex processing of events, and some just view it as the intersection between event processing and "complex systems". A complex event is defined as an "abstraction of one or more other events", which also leads to some interpretations about the nature of abstractions, so this interpretation is easier to understand. However, the misconception arises because it is more intuitive to read "complex event processing" in the second interpretation, as "complex" processing of events, and this brings us to the question -- what is complexity? There can be different dimensions of complexity.

  1. The complexity may stem from the fact that we don't know what exactly we are looking for, and generally looking for anomalies in the system (e.g. trying to find security violations), thus some AI techniques have to be applied here.
  2. Another case is that we know which patterns or aggregations we are using, but they require complex computing in terms of functional capabilities.
  3. Another case is that the complexity is in some non-functional requirement, such as scalability in several directions (scale-up or scale-down), strict real-time performance constraints, a highly distributed system, etc.
  4. Another case of complexity is in interoperability, the need to obtain events from many producers, and use events in many consumers, which requires instrumentation/modification of a lot of legacy systems.
  5. Yet another case of complexity may be unreliable event sources, handling false positives and false negatives.

There are probably more complexity cases, however the interesting question is whether the main goal of an event processing system is to solve a "complex" problem.

With my scientist hat on, it is definitely more exciting to solve complex problems, and even better, problems of a type that has never been seen before. However, from a pragmatic point of view, event processing applications are measured by their business value, and there may be a lot of business value in applying event processing to systems that have none of these complexity measures. From a complexity point of view they can be quite simple; moreover, there may be nothing exciting about the implementation, as it is similar to other implementations already done, yet on the measure of "business value" it delivers a lot. Thus the value metric is orthogonal to any complexity metric, and indeed many of the applications in which event processing technology is very useful are quite simple (according to one analyst report, the "simple" applications are 80-90% of the potential market for event processing technology). While there is certainly a segment of applications for each type of complexity, and more work is required in these directions, the "simple" applications will be the bread and butter.

More misconceptions - later.

Sunday, December 20, 2009

On common misconceptions about event processing - the single application misconception

We start the introduction chapter for the EPIA book, by stating: Some people say that event processing is the next big thing; some people say that event processing is old hat and there is nothing really new in it. Both groups may be right to a certain extent. As with any field that is relatively new there is some fog around it: some of the fog stems from misconceptions, some from confusing messages by vendors and analysts, and some arises because of a lack of standards, a lack of agreement on terms, and a lack of understanding about some of the basic issues.

In the book we don't really talk about the misconceptions, but I think it is a good topic towards the end of 2009 to dedicate some postings towards the major misconceptions.

I'll start with misconception number 1: Event processing is a single-industry (some even say single-application) technology, and event processing software cannot generalize beyond this single industry/application.

The industry is, of course, capital markets, and the application is algorithmic trading.

The diagram below is taken from the ebizQ customer survey (two years ago) about the business problems that customers expect to solve with event processing; only 9% indicated algorithmic trading.

This misconception originates from the fact that the capital markets industry has indeed been the early adopter of event processing software, and served as a proof of concept for the rest of the industries. There are indeed some vendors that focus mainly on this type of application; however, this does not show the entire picture. From the IBM experience I know of customers in various industries, and most are not in the capital markets area. Turning to the material collected by the EPTS use case work group (the material is on the EPTS members internal site, available to EPTS members), I find quite a lot of examples of systems working in production or being developed in a variety of domains. Here are some samples:
  • Border security radiation detection (Eventzero)
  • Mobile asset geofence (Rulecore)
  • Logistic and scheduling application (Starview)
  • Unauthorized use of heavy machinery (Rulecore)
  • Hospital patient and asset tracking (IBM)
  • Activity monitoring for taxing and fraud detection (IBM)
  • Intelligent CRM in banking (TIBCO)
  • EDA and asynchronous BPM in retail (TIBCO)
  • Situation awareness in energy utilities (TIBCO)
  • Situation awareness in airlines (TIBCO)
  • Reduce cost in injection therapy (IBM)
  • Next generation navigation (CITT)
  • Real-time management of hazardous materials (Oracle)
  • Finding anomalies in point of sales in retail stores (CA)
  • Elderly behavior monitoring (U. of Munich)
These are only samples; I am familiar with a variety of examples in various industries: healthcare, utilities, chemical and petroleum, insurance, security, transportation and others. In the last event processing symposium in Trento, we had one keynote address on event processing in robotics, and there are other areas as well, such as the smart house that monitors energy consumption in the house. While we are now in the first generation, and the utilization of event processing will increase over time, the coverage in terms of industries and applications has gone far beyond algorithmic trading.

More misconceptions in subsequent postings.

Saturday, December 19, 2009

More on the ecosystem of event processing systems

The one-week school break for children ends tomorrow, and today I went with my two younger daughters to see Avatar, a very ambitious movie with a lot of advanced 3D graphics. The downside is that the film runs about 3 hours without a break -- one hour could easily be cut -- but nevertheless the graphics are very impressive; the plot is a kind of paraphrase of other movies.

In my previous posting I wrote about event processing functions. One of the frequently asked questions is whether event processing has value in its own right, or as part of a larger ecosystem. Let's look at another question: DBMS software deals with updating, retrieving, storing, organizing, backing up and restoring data; it also supports concurrency control, transaction management and other things. The DBMS itself does not have value; the value is in the use of the data.

In a similar way, event processing software deals with transferring, filtering, transforming, pattern matching, and situation discovery; it also provides execution control and support for some non-functional properties, but the aim is to receive events, process them and derive more events. The value is not in the derivation of events, but in the way the derived events are used.

In order to complete the cycle we need two other types of software: event producers that produce the raw events, and event consumers that consume the derived events and execute the actions triggered by these events -- the consumers are those who are driving the value.

The producer can be any type of software or sensor. One example of an event producer that I have written about before is the "EDA extension of CICS". It is an example in which software is instrumented to produce events. We are seeing more software of this type in the sensor area, and in other areas as well.

Consumers are the part of the ecosystem that utilizes the events flowing from the event processing systems. I've posted before about the consumer chapter in EPIA. There are many ways to consume events, and consumers can also be connected with other types of enterprise software: events can drive decisions, and can serve as input to business rules; events can drive KPI (Key Performance Indicator) metrics and can be input to BAM systems; they can trigger various actions related to BPM systems; events can drive real-time analytics and be input to BI systems; they can post on Twitter and update Ambient Orbs (see the posting about the EPIA chapter); or, as part of a "smart house", they can turn off lights or control the temperature using some actuators.

Like a DBMS, which requires somebody to feed the data and somebody else to use it, event processing requires its ecosystem. We'll see more of the ecosystem software both on the producer side (like the CICS example) and on the consumer side (like BAM software). More on the event processing ecosystem later.

Friday, December 18, 2009

On event processing functions

I have been on a short vacation, and went with (some of) my family to see the film 2012; it is based on an ancient prophecy that the world as we know it will come to an end on December 21st, 2012 -- three more years to see whether this prophecy will come true.

This time I would like to write about event processing functions. I have written about them before; here I am just summarizing them in one place.

There are various functions under the roof of event processing; some applications need all of them, but many applications need only some of them, at various levels of sophistication.

Here are the major functions that I have observed:

1. Event distribution: This is the most basic one. Event producers disseminate events to consumers through some intermediate brokers (often called channels); the events may be filtered, but are transferred without change, and any processing occurs within the consumer's premises and is not part of the event processing system. Pub/sub systems are of this type, and there is a lot of work on such systems in the distributed computing area.

2. Event transformation: This goes another step and sends the consumers transformed events, where the transformation may be translation, aggregation, composition, enrichment, projection or split. Aggregation is probably the most notable use of transformation, and there are many applications whose main usage of event processing is transformation.

3. Event pattern matching: This function finds whether any subset of the input events satisfies a predefined pattern.

Note that some systems require transformation only, some require pattern matching only, and some require both; systems can also have different levels of sophistication in both. A system may require very simple patterns only, or sophisticated patterns; likewise it may require very simple types of transformation or much more advanced ones.

4. Situation discovery / event pattern discovery: This function discovers that some situation occurs without having a predefined pattern, using intelligent techniques. While the first three types of functions are better investigated (although I can't say that all issues are figured out), the fourth one is still a challenge: there are some experiments, but generally it is not well established yet.

This also reminds me of a different topic -- misconceptions around event processing, and I'll write about this topic soon.
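The first three functions above can be illustrated with a minimal sketch. This is not taken from any product; the `Event` record and the function names `distribute`, `aggregate` and `match_sequence` are hypothetical names chosen for illustration, and the sequence pattern shown is just one simple kind of predefined pattern.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    # Hypothetical minimal event record used only for this sketch.
    type: str
    timestamp: float
    value: float = 0.0


def distribute(events: Iterable[Event],
               predicate: Callable[[Event], bool]) -> List[Event]:
    """Event distribution: events may be filtered but pass through unchanged."""
    return [e for e in events if predicate(e)]


def aggregate(events: Iterable[Event], out_type: str) -> Event:
    """Event transformation (aggregation): derive one event summing the inputs."""
    items = list(events)
    return Event(out_type,
                 max(e.timestamp for e in items),
                 sum(e.value for e in items))


def match_sequence(events: Iterable[Event], first: str,
                   second: str) -> Optional[Tuple[Event, Event]]:
    """Pattern matching: earliest `first` event followed by a later `second`."""
    seen_first = None
    for e in sorted(events, key=lambda ev: ev.timestamp):
        if e.type == first and seen_first is None:
            seen_first = e
        elif e.type == second and seen_first is not None:
            return (seen_first, e)
    return None


events = [Event("order", 1, 10), Event("payment", 2, 10), Event("order", 3, 5)]
orders = distribute(events, lambda e: e.type == "order")
total = aggregate(orders, "order_total")
match = match_sequence(events, "order", "payment")
```

Situation discovery is deliberately left out of the sketch, since it has no predefined pattern to encode.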

Thursday, December 17, 2009

On ecosystem for event processing

Today I woke up early: a car alarm had been activated and was heard by the entire neighborhood, I think, except for the person who owns the car, although it was parked in front of his house. I think that it lasted for 3 hours; maybe it ran out of battery...

I also have to reject a lot of spam comments on my Blog recently, some of the problem was solved when I used the standard way to eliminate automatically created comments, but some still remain. Notes like: "I really like your Blog, you may be interested in my design", linking to a commercial site that can sell everything from home appliances to escort service in London.

One of the sections we added to the book, as requested by the last reviewers, deals with the ecosystem of event processing and related concepts. One of the areas that we have already started investigating is the relationship between event processing and business process management (AKA EDBPM). There is a recent paper on that, co-authored by Rainer von Ammon and some of his CITT colleagues; some of this work was done by Alex Kofman, my M.Sc. student (already graduated) and one of my IBM Haifa Research Lab colleagues. The paper has been linked recently from David Luckham's complexevents website. Enjoy! More on ecosystems later (I was asked to prepare something about the relationships between event processing and business intelligence for some IBM internal forum, and will share it when prepared).

Tuesday, December 15, 2009

On Zamenhof and event processing

Today, December 15th, is Zamenhof's day. Zamenhof is the inventor of the Esperanto language. His idea was simple: create a universal language, with very simple grammar and spelling and no spelling or grammar exceptions. While there are several million Esperanto fans, it has not become the universal language, and we still speak in many languages and dialects. There are streets named after Zamenhof in many of the cities in Israel, not because there are so many Esperanto speakers, but somehow famous Jewish people get priority in street names (in Haifa, Zamenhof street is near Einstein street - another famous Jewish person).

As with natural languages, in event processing we have many languages and language styles (when preparing the course I am teaching now I realized that there are more styles than I had thought; next class we'll discuss this, showing examples from all the different products participating in the EPIA book's website).

I share Zamenhof's dream of getting to a universal language, but this may take time (maybe infinity); it is worth trying to invent the event processing Esperanto.

Sunday, December 13, 2009

On event processing as a service

While working on the Website of the EPIA book, we asked the language owners to provide a downloadable version of the product implementing their language. I was asked by some of the language owners whether they could instead provide their software as a service and let the readers run it on their servers. My answer was positive, and we'll see a couple of such examples (one is already there, one is coming up).

The book's website is just a resource for readers who wish to study languages, but this brought me to a thought about event processing as a service in general.

One of the reasons for doing it is to gain the benefits of cloud computing in terms of scalability. I recently came across some material about activeinsights, which seems to be a new Israeli company developing open source "event stream processing" in the cloud (well - I have some terminology comments for them, but this is not the main point) and which advocates the use of event processing in the cloud to cope with scalability issues. Using the SaaS model for event processing can give rise to some interesting cost models, related either to the input (the number of input events processed) or to the output (the number of situations detected by the event processing service, with some cost per situation, or the number of aggregated/transformed derived events), which ties the cost directly to the benefit. One of the barriers to using event processing as a service is the lack of standards, especially for interoperability: one cannot just connect and run, but has to make a substantial investment in writing adapters in a proprietary way. I assume that we'll see more of that when the cost/benefit model is clarified.
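The difference between the input-based and output-based cost models can be made concrete with a small sketch. The function names and all the prices below are hypothetical, invented only to show how the two models are computed; they do not reflect any actual service's pricing.

```python
def input_based_cost(input_events: int, price_per_event: float) -> float:
    """Cost tied to the input: pay for every event the service processes."""
    return input_events * price_per_event


def output_based_cost(situations: int, derived_events: int,
                      price_per_situation: float,
                      price_per_derived_event: float) -> float:
    """Cost tied to the output: pay per detected situation and derived event."""
    return (situations * price_per_situation
            + derived_events * price_per_derived_event)


# Hypothetical month: many raw events, few detected situations.
by_input = input_based_cost(1000, 0.5)
by_output = output_based_cost(4, 100, 2.0, 0.25)
```

Under the output-based model the bill grows only when the service actually detects or derives something, which is what ties the cost to the benefit.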

There are more interactions between cloud computing and event processing, such as the use of event processing as part of the cloud infrastructure, but this deserves a separate discussion.

Friday, December 11, 2009

On EPIA Website

The Event Processing In Action book that Peter Niblett and I are writing is getting to the last phase before production. We have finished a full draft that is now going through the publisher's review system. The book has a closely related website that is intended to give readers hands-on experience of the concepts described in the book, using representatives of the various programming styles that exist within the state of the practice.
This is done by having a single example implemented in all these languages. Some of these implementations already exist on the website; some have just "placeholders" as a reference to a site. Most allow downloading the software, and some use the "software as a service" model in a cloud. Six languages are already referenced on the site, and more will be added soon, for example IBM's Websphere Business Events.

The website is being prepared by a group of students, and is now exposed for public review.
This is a first version, and there is probably much to improve, so comments are appreciated.

This is a link to the website.
The website is hosted in the book section of the EPTS website.

Thursday, December 10, 2009

Congratulations to John Bates on becoming Progress CTO

It seems that this is a week for congratulating colleagues from the event processing community.
This time, I would like to congratulate John Bates, Apama founder and one of the notable persons in the event processing community, who has become Progress CTO.

The logo below is of the UK-based Commonwealth Telecommunications Organisation, but in our context the meaning of CTO is, of course, Chief Technology Officer (John is indeed from the UK, but in the last few years he crossed the ocean).

The fact that one of the leading figures in the event processing industry becomes the CTO of a larger software company is an indication of both John's personal qualities and the growing importance of event processing technology in the general software industry. I wish John a lot of success in his new role.

Saturday, December 5, 2009

On World Economic Forum 2010 technology pioneers

Today I wish to congratulate my colleagues from Streambase for being recognized by the World Economic Forum as "2010 technology pioneer".

The area of event processing is still evolving and has a lot of innovative people and companies. While there are other innovators in this area, Streambase is certainly an innovative company and among the first wave of products in this area. They deserve a salute for this recognition.

It is also a good sign of recognition for the event processing area, as one of the areas that deserve to be mentioned among those with the highest impact on the world economy.

Monday, November 30, 2009

More on the EPIA upcoming book

I have written before that I am a science fiction fan, and among the (relatively) young generation of science fiction writers, there are some I find very creative in their ideas. One I've written about before is Rob Sawyer; this time I am reading, in my spare time, Brandon Sanderson's book Warbreaker. I have read some of Brandon Sanderson's other books, and I find this one full of creative ideas. All his books are highly recommended if you like this genre. Anyway, I did not write in this Blog last week; I actually took a few days off to work on the EPIA book.

We now have all chapters of the book ready in draft form !!! Several of them are still in the cleaning phase. We will send them to another set of reviewers (Manning is a champion of reviews; it did a review of the outline, and three reviews during the writing of the book), and will hand in the final copy sometime in January, so the target is to have the book out around the end of April.

The new stuff in the book -- some of it is revisions: adding a section about the relationships of event processing to other areas, and adding code samples from various products in different chapters. Some of it is new: a new chapter about implementation issues, covering programming styles (based on the DEBS languages tutorial), non-functional requirements, performance metrics and kinds of optimization. We are also advancing with the book's Website. I'll write more about selected parts of the book later.

Saturday, November 21, 2009

DEBS 2010 site is up

The DEBS 2010 conference site is up now. DEBS is an ACM conference, and the major research conference in the event processing area. DEBS 2010 is held in cooperation with EPTS, and will take place in Cambridge, UK. I am on the organizing committee again, this time in the role of tutorial chair, so I'll be soliciting tutorial submissions. The call for papers, both in the research track and the industry track, as well as for tutorials, demos and more, is on the conference's site.

Friday, November 20, 2009

On Inexact events

Back to chapter 11 in the EPIA book, which deals with challenges that developers and users of event processing systems should be aware of. One of these topics is the issue of inexact events. The basic assumption in current systems is a projection of the "closed world assumption" kind of thinking: it assumes that every event that is reported really happened, that every detail in the event payload is accurate, and that every event that happened was indeed reported. In reality, one or more of these assumptions may be invalid for several reasons, as shown in the following figure:

As shown in this figure there are several reasons for making one or more of these assumptions invalid.

The source (e.g. sensor) may malfunction; if the source is an instrumented program, there may be a bug in the instrumentation.

The source (event producer) may be malicious, and send wrong information in order to sabotage the system.

The inexactness may be a projection of the temporal anomalies discussed before, e.g. a derived event that has not been detected.

This inexactness may be propagated, as a derived event is derived from an event which is by itself inexact.

The source itself may be imprecise, thus some of the content may not be accurate.

The input events may be based on sample or estimates.

The uncertainty does not stop in event content, it also exists in the bridge between events and situations, I'll write on that topic in a separate posting.

Thursday, November 19, 2009

On the Fast Flower Delivery example and various programming styles of event processing

As explained in one of the previous postings, we are using a single example that accompanies the EPIA book throughout; this is part of the book's methodology. I'll write more about the methodology later, as we are now writing the preface for this book, explaining the methodology (among other things). Actually, today I received an additional review of the book from the publisher. The reviewer criticized the example, claiming that since most of the readers are men, an example that relates to flowers may be considered too feminine, and, realizing that it is too late to change, suggested that we select another example in the second edition.

Well, I am thinking what will appeal to the real machos.

Maybe we should go for a Poker example -- this is real macho stuff...

On second thought, the real machos are engaged in boxing, so maybe we should have an example around boxing match... go figure...

Getting serious now. The FFD (Fast Flower Delivery) example is explained in the book using our building block approach; it is also demonstrated using several languages. I approached the entire community earlier this year, and there has been a very good willingness among the "language owners" to participate in this game, implementing the FFD example in various languages. We have six languages participating in the game now. Languages implemented by four commercial products:
  • Aleri (actually the CCL language, originally from Coral8)
  • Apama (owned by Progress Software)
  • Rulecore (a Sweden-based company)
  • Streambase
and two open source projects:

  • Esper
  • Etalis
The reader will be able to look at the example implemented in these six languages and, furthermore, will be able to download a full or demo version of the engine implementing each language. As written before, the logistics of constructing this website, validating the solutions, etc. are handled by students taking my event processing course. Some examples will be brought in-line into the various chapters of the book to give the readers a glimpse of the different styles.
We should have a "Beta version" of this website within a couple of weeks.

I'll post more updates about this experiment.

Tuesday, November 17, 2009

When does a derived event actually happen? - (posting II)

In the previous posting I showed some possible anomalies when dealing with derived events. The picture above shows a snowfall as a derived event; where I am located, in Haifa, this is actually a very rare event (once every 20 years, for a few minutes). There are various types of derived events; this time I'll discuss derived events of two different patterns: the sequence pattern and the time-out pattern.

Example 1: The pattern is: if a sequence of events E1 and then E2 occur, derive event E3.
Let's assume that event E1 occurs at 9:00 and arrives at the system at 9:02, and event E2 occurs at 9:30 and arrives at the system at 9:31. The derived event is derived by the system at 9:33. The question is when event E3 occurs. One can think of three logical possibilities:

I: E3 occurs when it is produced, at 9:33; the rationale: since it is a virtual event, it does not occur in reality, and exists only because it is derived by the system.
II: E3 occurs when the last event that triggers the pattern matching occurs, in this case at 9:30; the rationale: the derived event occurs when the pattern's conditions are satisfied in reality, and this happens when E2 occurs.
III: E3 occurs over the interval [9:00, 9:30]; the rationale: the derived event occurs over the interval of all participating events.

Example 2: The pattern is time-out (absence event). Example: if there is no bid for an auction by the end of the auction time, derive an event "no bidders".
Scenario: A call for bids was issued at 9:00 and is valid for 2 hours; at 11:00 it is closed without any bidders, and at 11:02 the system issues the derived event.
We have three similar alternatives here:
I: The no bidders event occurs at 11:02, the time at which the derived event is issued.
II: The no bidders event occurs at 11:00, when the "bid close" event occurs, which completes the pattern.
III: The no bidders event occurs during the interval [9:00, 11:00] --- since the "absent" event relates to the entire interval.

As in some other cases, there is no single solution that fits all cases, and the actual semantics of a specific case is a matter of policy. We see here three policies, which seem to cover most cases but not necessarily all; that is why there is also a need to enable explicit derivation of the occurrence time of a derived event, i.e. the value of the occurrence time itself can be computed and derived.
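The three policies can be sketched in a few lines of code. This is only an illustration of the semantics discussed above, not the way any particular product implements it; the names `Policy` and `occurrence_time` are hypothetical, and the occurrence time is represented as a (start, end) pair so that a point in time is simply an interval whose two ends coincide.

```python
from datetime import datetime
from enum import Enum
from typing import List, Tuple


class Policy(Enum):
    DERIVATION_TIME = 1    # alternative I: when the system derives the event
    LAST_PARTICIPANT = 2   # alternative II: when the pattern is completed in reality
    INTERVAL = 3           # alternative III: interval spanning all participants


def occurrence_time(policy: Policy,
                    participant_times: List[datetime],
                    derivation_time: datetime) -> Tuple[datetime, datetime]:
    """Return the occurrence time of a derived event as a (start, end) pair."""
    if policy is Policy.DERIVATION_TIME:
        return (derivation_time, derivation_time)
    if policy is Policy.LAST_PARTICIPANT:
        last = max(participant_times)
        return (last, last)
    if policy is Policy.INTERVAL:
        return (min(participant_times), max(participant_times))
    raise ValueError(f"unknown policy: {policy}")


# Example 1: E1 occurs at 9:00, E2 at 9:30, and E3 is derived at 9:33.
day = (2009, 11, 17)
e1, e2 = datetime(*day, 9, 0), datetime(*day, 9, 30)
derived_at = datetime(*day, 9, 33)
results = {p: occurrence_time(p, [e1, e2], derived_at) for p in Policy}
```

For Example 2 the participants would be the opening at 9:00 and the "bid close" at 11:00, with derivation at 11:02. An explicitly computed occurrence time would correspond to a fourth case, in which the caller supplies the value instead of choosing one of these policies.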

More about temporal issues -- later.

Saturday, November 14, 2009

When does a derived event actually happen? - (posting I)

Just finished reading the book "Flash Forward" by Robert Sawyer. Science fiction was always my favorite type of literature, and my favorite writers are Asimov and Heinlein. There are science fiction writers in the following generation who stand out, and the Canadian writer Sawyer, who does not forget to give Canada a role in each of his books, is one of those. I have read several (not yet all) of his books. The best of those I have read so far is the Neanderthal Parallax trilogy, which is also very thought provoking besides being fascinating. The "Flash Forward" book, which is now also becoming a TV series, deals with an experiment that gets everybody in the universe to jump forward 21 years in time for 2 minutes. It is a combination of science fiction, a book that raises some philosophical issues, and a suspense story; highly recommended.

The question of time and deep temporal issues has also been one of my favorite research topics, since time has physical, philosophical, and also computer science implications. Back to event processing: recently I have written the "warnings" chapter in the EPIA book, and one of the interesting questions is: when does a derived event occur?
As discussed before, there are two dimensions for answering the question: occurrence time, which stands for the time at which an event occurs in reality, and detection time, which stands for the time at which an event is detected by the event processing system. Neither of these is obvious when the event is derived. If we take the naive approach that a derived event occurs when the system computes it, then we can have several anomalies. Consider the following simple example: there is an auction system, each auction has an auction context time interval in which the auction is valid, and people are placing bids. The auction works on a fairness criterion, which gives preference to people who placed their bid earlier in case multiple bidders made the maximal bid. The raw event is a bid request, but the entry to the bid process is a derived event, since the event has to be enriched, validated, and have some details added from the previous bid of the same bidder (if one exists). If we take the time at which the derived event was actually produced as its occurrence time, then we can have some semantic anomalies, as shown in the following figure:

Anomaly 1 (on the right hand side) arises because, although the bid request is made within the auction validity interval, the bid entry occurs after the auction interval ends and will not get into the auction processing.
Anomaly 2 (on the left hand side) arises because the order of the bid requests can be reversed by their corresponding derived events, and thus the outcome of this auction may not be consistent with the auction's rules.
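
Both anomalies can be reproduced with a small Python sketch. The bidders, the times (minutes since the auction opened) and the 120 minute auction length are made-up values for illustration; using the request time instead of the derivation time is shown only for contrast, as one possible choice and not as the recommended answer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BidEntry:
    bidder: str
    requested_at: int  # when the raw bid request happened
    derived_at: int    # when the system finished deriving the bid entry


AUCTION_END = 120  # hypothetical auction validity interval: [0, 120]

entries = [
    BidEntry("alice", requested_at=10, derived_at=25),    # slow enrichment
    BidEntry("bob", requested_at=12, derived_at=14),      # fast enrichment
    BidEntry("carol", requested_at=119, derived_at=121),  # derived too late
]


def accepted(entries: List[BidEntry], time_of: Callable[[BidEntry], int]) -> List[str]:
    """Bidders whose chosen time falls inside the auction, earliest first."""
    valid = [e for e in entries if time_of(e) <= AUCTION_END]
    return [e.bidder for e in sorted(valid, key=time_of)]


# Naive approach: the derived event occurs when the system computes it.
naive = accepted(entries, lambda e: e.derived_at)
# Contrast: the derived event takes the time of the raw bid request.
by_request = accepted(entries, lambda e: e.requested_at)

print(naive)       # ['bob', 'alice']  (carol is dropped, alice and bob are swapped)
print(by_request)  # ['alice', 'bob', 'carol']
```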

This is just one example, and it creates a bias towards a particular solution; however, the reality is even more complicated, since in different cases the answer to the question posed in the title of this posting may not be the same. Thus policies should be used to disambiguate the semantics here.

I'll have a follow-up posting with discussion about the proposed policies for this case.

Friday, November 13, 2009

On EPIA and Friday the 13th

Today is Friday the 13th. Some people have superstitions about the number 13 in general (many hotels don't have a 13th floor, sometimes not even a room X13), and about Friday the 13th in particular. It seems that Manning, the publisher of the EPIA book, is offering $13 off the list price in the Manning Early Access Program, so today is an opportunity to purchase the book $13 cheaper: go to the book's MEAP site and, when checking out, use the promotion code fri13.

This is also a good opportunity to give an update on the book status. We have received the review reports from the 2nd review (actually the 3rd, including the reviews on the book proposal). Somehow the reviewers keep changing, which makes them somewhat inconsistent with previous reviews. Reviews are good for improving the quality of the manuscript; they also show the necessity of writing a foreword to the book explaining exactly what the focus of this book is, as various people have various things in mind. As I have written in the recent three book reviews on this Blog, books come with different focus and for different audiences, so it is important to set the expectations right about what the book is (an in-depth technical book about the concepts behind designing event processing applications) and what it is not. It does not follow a single language; it is generic and demonstrated through multiple languages, a concept that is new for some readers. It is also not a book about how EDA fits SOA, BPM, Messaging and other adjacent concepts, and it does not take a business oriented perspective. We write briefly about these topics (some reviewers think they are vital, others think they are boring), we leave the business oriented discussion to the book of Chandy and Schulte, and we'll devise an "additional reading" section for each chapter. We are now working on the last 1/3 of the book and intend to finish by early December, and also to get the first version of the website alive.

Yesterday we also had an internal briefing in IBM about the book, and this is the slide that ended our presentation.

Wednesday, November 11, 2009

On Defining "EVENT" in Earnest

Professional books are not that funny, this is left for comedies. My favorite comedy of all times is Oscar Wilde's "The Importance of Being Earnest". In Hebrew it was translated literally to something like "The importance of seriousness", and everybody who knows what it is talking about understands that this translation totally misses the point of this comedy. Anyway, I recalled Oscar Wilde's old play when reading the book by Mani Chandy and Roy Schulte recently, since they have in their book a section called "defining "EVENT" in Earnest". In this section they say that there are three schools of thought about how EVENT is defined:
  • State-change view - an event is a change in the state of something and as such is reported. Its properties: a change must occur, and this change must be reported. Example: An item previously outside the range of an RFID reader is now within the range of this RFID reader.
  • Happening view -- an event is anything that happens, or is contemplated as happening (the EPTS glossary definition). In this case, a change must occur, but its reporting to the system is optional; not every event according to this definition is of interest to be reported. Example: A person sending an Email.
  • Detectable-condition view -- an event is a detectable condition that can trigger a notification. In this case a change does not have to occur, but reporting should occur. Example: A GPS device reporting a truck's location (note -- the location may not have changed since the last report, since the truck driver went for lunch).

This is an interesting observation. Some people argue that only the first type is an event, while the other types are not. My view is that all the above are actually events. The question is whether we can come up with an inclusive, agreed upon definition of event; maybe the glossary team (co-led by Roy Schulte) should take up this challenge.

More about event types - later.

Tuesday, November 10, 2009

On the Event-Driven Architecture book

Last in the series of 2009 event related books is the book entitled: "Event-driven architecture - How SOA enables the Real-Time Enterprise". This book was published early this year, and I actually purchased it while visiting the USA earlier this year, and while doing the other book reviews it is a good time to write about this book as well.

The book, unlike the others, does not deal with event processing. It deals with EDA as a central concept, starting with a "working definition": event-driven architecture is one that has the ability to detect events and react intelligently to them. I have some trouble digesting this definition, since in my mind, architectures don't possess abilities. Part I of the book talks about "The Theory of EDA", in which it starts with a second "working systemic definition" saying that EDA is the complete array of architectural elements, including design, planning, technology, organization, and so on, which enables the ability to disseminate events immediately to all interested parties, human or automated. So now this is a definition of an architecture for event/message routing, but I already noted that this is not about event processing. Next it goes in depth into the relationships between EDA and SOA, explaining on its way what SOA is. The metaphor used throughout is a nervous system, and the book talks about enterprise nervous systems. The discussion about SOA and related concepts spans four chapters, ending with some hints on how to calculate the ROI of selecting an architecture style, but the ROI discussion remains at the level of headlines. The second part of the book goes from theory to practice; here the authors say that the products implementing EDA are called ESB (Enterprise Service Bus), and (rightfully) claim that the main gap in using EDA is that people are not used to thinking in EDA. However, while they have a chapter called "thinking EDA", its insights on how to "think EDA" stay at a very high level. Going from the thinking to the examples, the book discusses in great detail three examples: Airline flight control, Anti-money laundering, and event-driven productivity infrastructure (under this name there is a description of a framework to connect workflows, E-mail, phone, document repositories, blogs, wikis, social networks and some other stuff).
The book ends after these four example chapters (which actually take more than 50% of the pages), without any conclusion chapter.

It seems that the examples are the essence of the book, and the previous chapters are an introduction. The examples also remain at the transport level, and while in one of the examples "rule engines" are mentioned as part of the architecture, the book says very little about them.

Looking at the reviews on Amazon, the opinions are polarized, going from 1 star to 5 stars. I guess that I am somewhere in the middle: for somebody who does not have a clue about what EDA is, the book provides a simple non-technical explanation, and such people found it useful; however, I agree with the 1 star reviewer that it does not really make a convincing story on the sub-title promise - "How SOA enables the real-time enterprise".

This completes my book reviews. We'll see some more books in this area coming in 2010.

Monday, November 9, 2009

On Stream Data Processing book by Chakravarthy and Jiang

Another related book that arrived yesterday is the book entitled: "Stream Data Processing: A Quality of Service Perspective - modeling, scheduling, load shedding and complex event processing".

First - let's start with a lesson in economics. Looking at the Amazon query about "event processing books", one can see that the Amazon price for the book of Chandy and Schulte that I described yesterday is $32.97, the new EDA book by Taylor et al costs $37.30 on Amazon, and the book I am talking about today has an Amazon price of $112.45 -- roughly the price of four books. So the economic question is: what makes it so expensive? My guess is that books of the type of the two books mentioned (and probably our upcoming book is in the same category) rely on the fact that people will want to buy them out of their own pocket, while academic books, especially those that are part of a Springer series (this one is part of the series "Advances in Database Systems"), have a captive audience of university libraries. I wonder how many people are willing to pay this price out of their own pocket for that book.

Now -- from the business side to the book itself. Sharma is an old colleague from my active database days. The book takes a database approach and starts by explaining why data streams are a paradigm shift relative to traditional databases. Then it moves on to explain the notion of data streams, gets into QoS metrics, moves to data stream challenges, and introduces CEP as a complementary technology whose support as part of the data stream management system is posed as a challenge. This is followed by a literature review, including a survey of commercial and open source stream and CEP systems, which seems to me to have false positives and false negatives. Then starts the more academically oriented discussion about modeling continuous queries, with theorems and Greek letters; next is a discussion about engineering oriented aspects of DSMS like scheduling and load shedding.

After discussing all this, the authors move on to discuss integration between stream and complex event processing, starting with the differences, and stating that it will be difficult to combine incompatible execution models. Nevertheless, the authors are not afraid of difficulties, and a page later they describe an integrated architecture. It is a layered architecture, where stream processing is done first, event generation follows as a second layer, event processing is a third layer, and rule processing is a fourth layer. I think that strict hierarchical architectures are somewhat simplistic for realistic scenarios (I'll need to write something about it at a later point). Then the authors dedicate two chapters to describing their prototypes, and the book concludes with conclusions and future directions, but these seem to be ideas for extending the issues currently discussed.

Bottom line -- seems like an academic journal paper that has scaled up (324 pages, including a long list of references, not lexicographically sorted, and an index). It may be of interest to those who want to study the formal aspects of stream processing.

I also got with the package two books about causality models, but I need to read them first before making any comment on them.

Sunday, November 8, 2009

On the Event Processing book by Chandy and Schulte

Today I got a package of books from Amazon that included two new event processing related books. I'll review the first of them today. This is the book by Mani Chandy and Roy Schulte, called "event processing - designing IT systems for agile companies". The title itself (agile companies) indicates that the book is business related, and indeed it primarily answers the questions: why use event processing, and how is it related to other concepts in enterprise architecture. The book is non-technical and fits the level of managers/CIOs/business analysts. The book starts with an overview and the business context of event processing, talks about business patterns of event processing (another type of patterns, besides all other types of event processing patterns), talks about costs and benefits of event processing applications, and about types of event processing applications. After doing the ROI part, it goes to a more architectural discussion, taking a top-down approach: EDA, events, and employing the architecture. Next there are two chapters about positioning event processing against the rest of the universe: SOA, BPM, BAM, BI, rule engines (I'll write about these positioning attempts in later postings). Towards the end there is a chapter of advice on how to handle event processing applications (and this chapter reads like an analyst's report). The last chapter talks about the future of event processing, again from a business perspective: future applications, barriers and dangers (again a topic to which I should dedicate a complete discussion), and drivers for adoption.

In conclusion: a good book for everybody who wants to know what event processing is and what its business value is. Things that I thought such a book might also include --- some reference to what currently exists in the industry, how the state-of-the-practice relates to the theoretical concepts presented in the book, when COTS event processing should be used vs. hard-coded, which are practical considerations of event processing applications
(maybe in the second edition?)

For those who asked me what the relationship is between this book and the book Peter Niblett and myself are writing, the answer is that our book has a totally different focus: explaining step-by-step what is needed to build an event processing application, and providing the reader an opportunity to experience the various approaches in the state-of-the-practice by providing free downloadable versions of various products and open source. The target population is also different - we aim for designers, architects, developers and CS students, while the book by Mani and Roy is aimed at managers, business analysts and MBA students. The review of the second related book - later.

On challenging topics for event processing developers and users

Spent much of the weekend working on the EPIA book; the time to finish is getting closer, and now it is the last 1/3 of the book. While in the first 2/3 of the book we concentrate on explaining what event processing is, going step-by-step over the different ingredients of building applications, the last part of the book deals with some implementation issues, focuses on challenging topics, and presents our view of the event processing of tomorrow. The chapter that I worked on in the last few days - chapter 11 (has nothing to do with bankruptcy) - deals with challenging topics for event processing developers and users. This means topics that developers and users have to pay attention to, since they are issues that can influence the quality of results obtained from an event processing system, and the current state of the art does not have magic bullets to resolve them. In this posting I'll just provide the list of topics discussed in this chapter; I'll write about some of them in the future. Here is the list:
  • Occurrence time over intervals: Events typically occur over intervals, but for computational reasons it is convenient to approximate the occurrence time by a time-point, and look at events in the discrete space; however, for some events this is not an accurate thing to do, and interval-based temporal semantics should be supported, along with the operations associated with them.
  • Temporal properties of derived events: For a raw event, we defined occurrence time as the time it occurred in reality, and detection time as the time that the system detected its existence. What are the temporal properties of derived events? There is no unique solution to this question.
  • Out-of-order events: This is the most investigated among the challenging topics; however, current solutions are based on assumptions that are sometimes problematic. This problem is about events that arrive out of order, where the event processing operation is order-sensitive.
  • Uncertain events: Uncertainty about whether an event has happened, due to malfunctioning, malicious or inaccurate sources.
  • Inexact content of events: Similar to uncertain events; some content in the event payload, including temporal and spatial properties of the events, may not be accurate.
  • Inexact matching between events and situations: Situations are the events that require reaction in the user's mind. This gets us back from the computer domain to the real-world domain. A situation is represented as a raw or derived event, but this may be only an approximation, since there may be false positives and false negatives in the transfer between the domains.
  • Traceability of lineage for an event or action: This gets to the notion of determination of causality. Since in some cases there are operations in the middle of the causality network that are outside the event processing system's boundaries (e.g. an event consumer who is also an event producer), causality may not be automatically determined.
  • Retraction of events: Ways to undo the logical effects of events, sometimes tricky or impossible, but this seems to be a repeating pattern.
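
To make the out-of-order topic a bit more concrete, here is a minimal Python sketch of one common buffering technique. It is not the solution proposed in the book. It rests on exactly the kind of assumption that can be problematic: that no event arrives more than a fixed slack after a later event. The function name reorder and the slack parameter are my own choices for illustration.

```python
import heapq
from typing import Iterable, List, Tuple

Event = Tuple[int, str]  # (occurrence_time, name)


def reorder(arrivals: Iterable[Event], slack: int) -> Tuple[List[Event], List[Event]]:
    """Release events in occurrence-time order, given them in arrival order.

    An event is held in a buffer until the largest occurrence time seen so
    far is at least `slack` ahead of it. An event older than something that
    was already released violates the assumption and is reported as late.
    """
    buffer: List[Event] = []
    ordered: List[Event] = []
    late: List[Event] = []
    max_seen = None
    released_up_to = None
    for occ, name in arrivals:
        if released_up_to is not None and occ < released_up_to:
            late.append((occ, name))  # too late to be placed in order
            continue
        heapq.heappush(buffer, (occ, name))
        max_seen = occ if max_seen is None else max(max_seen, occ)
        watermark = max_seen - slack
        while buffer and buffer[0][0] <= watermark:
            event = heapq.heappop(buffer)
            released_up_to = event[0]
            ordered.append(event)
    while buffer:  # end of input: flush whatever is still buffered
        ordered.append(heapq.heappop(buffer))
    return ordered, late


arrivals = [(1, "a"), (3, "c"), (2, "b"), (6, "e"), (4, "d"), (10, "g"), (5, "f")]
ordered, late = reorder(arrivals, slack=2)
print(ordered)  # [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (6, 'e'), (10, 'g')]
print(late)     # [(5, 'f')]
```

A larger slack loses fewer events but delays every result, which is the trade-off behind the remark that there are no magic bullets here.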

More about some of them - later.

Wednesday, November 4, 2009

On logic programming and event processing

Alex Alves wrote in his Blog about logic programming extensions for CEP. Logic programming is like the Phoenix, it goes and comes again in different contexts. The first time I heard about it was in the early 1980s, in the context of fifth generation computing, which was promised to be the real computer revolution -- old guys like me may still remember the hype around it. This was based around logic programming; actually its metric was LIPS (logical inferences per second). Then Prolog appeared as a competitor of LISP as an AI language; some of the language wars were documented in the famous paper by Bobrow, entitled: If Prolog is the answer, what is the question? Anyway, both LISP (my own favorite language) and Prolog have stayed in AI courses, but AI oriented programming is now being written in general purpose languages. Next we saw logic programming appearing in databases, in the form of deductive databases, Datalog and its siblings; when I was a graduate student in the late 1980s, deductive databases were the most visible topic in database conferences, and they also somehow vanished a few years later. Now we are observing that among the many language styles for event processing there is also event processing based on logic programming. Alex's Blog talks about Drools; there are also some others of this type. One of the languages that will participate in the EPIA book languages experience is an open source language based on logic programming called ETALIS. I'll report about this languages experience as we advance.

Tuesday, November 3, 2009

On the patterns collections list

Back to dealing with the EPIA book: we are now in the process of the 2/3 book review, and have started to work on the last 1/3. Right now I am working on a section talking about temporal issues in event processing, but before talking about that, I still wish to get back to the previous chapter that deals with event patterns, continuing the discussion that I have started in this posting, and continued in this posting. In the book we bring a collection of patterns; these patterns are not meant to be complete, and we expect to grow the collection of patterns over time using the book's website.

The patterns collected are of several types:

  • Logical operator patterns: all, any, absence that designate conjunction, disjunction and negation event patterns.
  • Threshold oriented patterns: count of events, average/maximum/minimum of some attribute of a collection of events has some binary relationship (e.g. > ) with a given threshold.
  • Relative patterns: relative max/relative min selects the events with the minimal or maximal value for a certain attribute over a collection of events.
  • Modal patterns: sometimes, always, select a collection of events if a certain predicate is satisfied over all/some of the events in this collection.
  • Not select pattern: This is a second level modal pattern that selects events that were not selected by a certain pattern.
  • Sequence pattern: A temporal pattern that denotes a conjunction of events that occur within a predefined order.
  • Trend patterns: Temporal patterns that detect a trend, e.g. a value of a certain attribute is consistently increasing within a context.
  • Spatial distance patterns: These are similar to the threshold patterns, but relate to the distance of events from some point in space.
  • Spatial relative patterns: These are similar to relative patterns, but relate to the relative distance of events from other events.
  • Spatiotemporal patterns: These combine temporal and spatial properties, and designate direction of movement (e.g. moving consistently north, moving towards some entity).

The current full list of patterns consists of 30 patterns, and this list will probably grow. Each of the patterns is defined in the book and demonstrated using an example.
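
To give a feeling for a few of these pattern types, here is a small Python sketch over a collection of events. It is my own simplification and not the definitions used in the book: events are plain dictionaries with a hypothetical "type" and "time" field, the collection stands for the events in one context, and all function names are made up.

```python
from typing import Any, Dict, List, Sequence

Event = Dict[str, Any]  # hypothetical shape: {"type": ..., "time": ..., other attributes}


def all_pattern(events: List[Event], required: Sequence[str]) -> bool:
    """Conjunction: every required event type appears."""
    return set(required) <= {e["type"] for e in events}


def any_pattern(events: List[Event], candidates: Sequence[str]) -> bool:
    """Disjunction: at least one of the event types appears."""
    return bool(set(candidates) & {e["type"] for e in events})


def absence_pattern(events: List[Event], forbidden: Sequence[str]) -> bool:
    """Negation: none of the event types appears."""
    return not any_pattern(events, forbidden)


def count_threshold(events: List[Event], event_type: str, threshold: int) -> bool:
    """Threshold: the count of events of a type is greater than the threshold."""
    return sum(1 for e in events if e["type"] == event_type) > threshold


def relative_max(events: List[Event], attribute: str) -> List[Event]:
    """Relative: select the events with the maximal value of an attribute."""
    top = max(e[attribute] for e in events)
    return [e for e in events if e[attribute] == top]


def sequence_pattern(events: List[Event], order: Sequence[str]) -> bool:
    """Sequence: the event types occur in the given order (gaps allowed)."""
    remaining = iter(sorted(events, key=lambda e: e["time"]))
    return all(any(e["type"] == t for e in remaining) for t in order)


def increasing_trend(events: List[Event], attribute: str) -> bool:
    """Trend: the attribute is strictly increasing over time."""
    values = [e[attribute] for e in sorted(events, key=lambda e: e["time"])]
    return len(values) >= 2 and all(a < b for a, b in zip(values, values[1:]))


window = [
    {"type": "open", "time": 1, "price": 10},
    {"type": "bid", "time": 2, "price": 12},
    {"type": "bid", "time": 3, "price": 15},
    {"type": "close", "time": 4, "price": 15},
]
print(all_pattern(window, ["open", "close"]))              # True
print(absence_pattern(window, ["cancel"]))                 # True
print(sequence_pattern(window, ["open", "bid", "close"]))  # True
print(increasing_trend(window, "price"))                   # False, the last value repeats
```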

More on patterns - later.

Saturday, October 31, 2009

More on responsive, reactive and proactive computing

Earlier this week, I posted in this Blog short definitions of the terms: responsive computing, reactive computing and proactive computing. Somehow, terminology always gets questions and responses, and there has been a thread in the Complex Event Processing forum that started to discuss it. So as a follow-up to this discussion, some clarification. The computing modes (responsive, reactive, proactive) are indeed mutually exclusive; however, a single system may have a combination of all of them.

Active databases are an example that starts with responsive computing: the basic operations are insert, modify, delete or retrieve from a database. Then the active database engine takes these database operations as events, and applies reactive computing to execute rules reacting to these events.

The opposite direction is a reactive system, doing some kind of event processing, which during the event processing operations needs to consult a database in order to enrich an event with more information. The database query issued is responsive computing.
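
This second direction can be sketched in Python with the standard sqlite3 module. The table, its columns and the event fields are made up for the illustration; the only point is that the handler runs because an event arrived (reactive), and inside it a query is sent to the database and answered (responsive).

```python
import sqlite3
from typing import Any, Dict

# A hypothetical reference database, kept in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, tier TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "gold"), (2, "basic")])


def enrich(event: Dict[str, Any], db: sqlite3.Connection) -> Dict[str, Any]:
    """Reactive part: called for each incoming event.

    The query inside is the responsive part: a request is issued to the
    database and its answer is added to the event.
    """
    row = db.execute(
        "SELECT tier FROM customers WHERE id = ?", (event["customer_id"],)
    ).fetchone()
    return {**event, "tier": row[0] if row else None}


order_event = {"type": "order", "customer_id": 1, "amount": 250}
print(enrich(order_event, conn))
# {'type': 'order', 'customer_id': 1, 'amount': 250, 'tier': 'gold'}
```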

Proactive computing may also be combined with reactive and responsive systems.

Hans Gilde has posted on the complex event processing forum an example that combines all three.

While responsive computing is the bread and butter of computing, and reactive computing is now becoming better understood, proactive computing is still lagging behind in terms of realizing its potential; maybe it is the next hill to climb.