Sunday, September 30, 2007

Event Processing - a footnote to databases?

More in the spirit of the VLDB conference I attended last week: there is a perception in the database community that event processing is really part of database technology, and that its functionality can be obtained with regular databases by inserting the events into the database and running "continuous queries" over it. According to this outlook, the only reason customers want engines outside the database engine is that some performance properties, typically throughput and latency, cannot be satisfied by database engines, and even this can be handled by some tricks, such as in-memory databases.
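
To make that outlook concrete, here is a minimal sketch of the "events in a table plus repeated query" approach, in Python with an in-memory SQLite table; the table, column, and function names are my own illustrative choices, not any product's API.

```python
import sqlite3
import time

# Minimal sketch of the "insert events, then poll a query" outlook.
# All table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts REAL, symbol TEXT, price REAL)")

def insert_event(symbol, price):
    """Each incoming event becomes a row in the database."""
    conn.execute("INSERT INTO events VALUES (?, ?, ?)", (time.time(), symbol, price))

def continuous_query(window_seconds=60):
    """Re-evaluate a windowed aggregate over recent events.
    The 'continuous query' is really a snapshot query run again and again."""
    cutoff = time.time() - window_seconds
    return conn.execute(
        "SELECT symbol, AVG(price) FROM events WHERE ts >= ? GROUP BY symbol",
        (cutoff,),
    ).fetchall()

insert_event("XYZ", 10.0)
insert_event("XYZ", 9.5)
print(continuous_query())  # e.g. [('XYZ', 9.75)]
```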


This reminds me of the first conference I ever organized, NGITS 1993 (we did not have conference web pages in those days), where a discussion about the relationship between Artificial Intelligence and databases followed the keynote address of John Mylopoulos, whom I have always considered one of the most visionary people I've ever met. John said something like this: "the difference between the AI and database disciplines is that AI is a scientific discipline, while databases are an engineering discipline, which deals with efficiency issues". He, of course, made the database people who were present quite angry; however, now that I look from the outside (at that time I looked from the inside) at the way database people think, I realize that he was, as usual, right.


While high performance is one of the reasons that customers turn to COTS products in this area, it is only the secondary reason; the main reason event processing software is being used is the level of abstraction it provides, and consequently the improvement in ROI. It also seems that the main competition between different products will be on the ROI (ease of use) front rather than on the performance front.


Event processing differs from database processing in the required functionality: database processing works on a state (a "snapshot"), while event processing works on a set of transitions (an "event cloud"), and this imposes different thinking and hence different abstractions. The attempts to introduce event pattern detection as an extension to database processing (as we have seen in the EPTS meeting, where such a proposal is being prepared) have several attributes, but simplicity is not one of them, and thus they totally miss the point of "ease of use", satisfying only the assertion that event processing should be done within database processing. While these are nice academic attempts, and researchers will probably be able to write a lot of papers about pattern extensions to SQL, I don't believe they will catch on in reality.
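
To illustrate the difference in abstraction, here is a minimal sketch of detecting a simple pattern (three consecutive price drops for a symbol) directly over a stream of transitions; this is my own hypothetical example, not the proposed SQL extension or any product's language.

```python
from collections import defaultdict, deque

class ThreeDropsDetector:
    """Detect the pattern 'three consecutive price drops' per symbol.
    Processing is driven by incoming transitions (events),
    not by querying a stored snapshot."""

    def __init__(self):
        # Keep the last four prices seen per symbol.
        self.last_prices = defaultdict(lambda: deque(maxlen=4))

    def on_event(self, symbol, price):
        prices = self.last_prices[symbol]
        prices.append(price)
        if len(prices) == 4 and all(prices[i] > prices[i + 1] for i in range(3)):
            return f"pattern detected: {symbol} dropped three times in a row"
        return None

detector = ThreeDropsDetector()
for event in [("XYZ", 10.0), ("XYZ", 9.8), ("XYZ", 9.5), ("XYZ", 9.1)]:
    alert = detector.on_event(*event)
    if alert:
        print(alert)
```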


However, databases do have several roles in event processing; here are a few of them:

(1). Databases will be used to store events for retrospective processing. These databases need to support temporal (or even spatio-temporal) characteristics; the database products do not yet provide good support in this area, and this deserves a separate post.

(2). Databases (or in-memory databases) will be used to store intermediate states for recoverability.

(3). Databases will be used to enrich events before processing (mainly with reference data, but sometimes with transactional data); see the sketch after this list.
(4). Data warehouses will be used for embedded analytics.
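
As a small illustration of role (3), here is a minimal sketch of enriching an incoming event with reference data held in a database before the event is processed; the table and field names are hypothetical.

```python
import sqlite3

# Minimal sketch of event enrichment from reference data (role 3).
# Table and field names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id TEXT PRIMARY KEY, segment TEXT, region TEXT)")
conn.execute("INSERT INTO customers VALUES ('C42', 'gold', 'EMEA')")

def enrich(event):
    """Look up reference data for the event's customer and attach it to the event."""
    row = conn.execute(
        "SELECT segment, region FROM customers WHERE customer_id = ?",
        (event["customer_id"],),
    ).fetchone()
    if row:
        event["segment"], event["region"] = row
    return event

print(enrich({"customer_id": "C42", "amount": 150.0}))
# {'customer_id': 'C42', 'amount': 150.0, 'segment': 'gold', 'region': 'EMEA'}
```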


I think that the database community should concentrate on enhancing database technology to support these functions in event processing, e.g. temporal database support, both at the abstraction level and in implementation efficiency, instead of insisting on extending SQL in an unnatural way.
I still need to discuss several topics in more depth, such as temporal databases, retrospective processing, and an alternative approach to SQL patterns, but I will leave these for later posts.
