Monday, September 10, 2007

If SQL extensions are the answer, then what is the question?

Surprisingly, I realized that some people really do read this blog, so from time to time I'll try to touch on controversial topics to make it spicier... The title is, of course, inspired by the title of a famous paper (http://portal.acm.org/citation.cfm?id=4583), and the question is indeed: does SQL fit "event processing", or, which types of event processing applications does the SQL style of programming fit, if any? There have been some discussions in this area, some of them with a passion of a religious nature (well, this was also true for Prolog at the time). One argument in favor is that event processing applications require a database anyway, so the developer will need to use SQL anyway, so it will be easier for the developer to use a single style of programming... Well, this argument can go in the opposite direction: let's assume we decide that there is a more natural style of programming for event processing; we can then use this style and generate SQL "under the cover" to communicate with the database. Another argument is that SQL is declarative; however, contrary to the popular belief among database people (and I am originally part of this community), SQL is not the only declarative language in the universe. Another claim is that there is a lot of prior knowledge about query optimization in SQL. This is true, but much of it is unhelpful for the EP case anyway.

If the idea (as most vendors aspire) is to have one general-purpose event processing language, there are some cases in which I find a mismatch between the SQL way of thinking and the EP way of thinking. Let me point out some of them. First, SQL is set-oriented, which means that a "join" conceptually starts from the Cartesian product and then creates subsets with the select and project operators. In event processing, some applications are set-oriented (e.g. finding trends in time series), but many of them are "event at a time": for each individual event, there is a check whether some pattern is matched. While it is possible (sometimes with difficulty) to express pattern matching in SQL, it is not a natural way to think about it. Second, SQL lacks abstractions that allow fine-tuning the semantics. In the past I presented a relatively simple example on the Yahoo CEP-Interest group and was shown SQL solutions that solve it, but at the price of highly complex queries. Anybody interested in the details can read the example at http://tech.groups.yahoo.com/group/CEP-Interest/message/678; the follow-up messages show how it is done in SQL, and you can form your own impression.
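To make the first point a bit more concrete, here is a minimal Python sketch of the two ways of thinking. It is only an illustration, not anybody's product or the example from the CEP-Interest thread: the pattern ("an order followed by a cancel for the same key within a time window"), the Event type, the WINDOW value and both function names are assumptions made up for this sketch. The first function mimics the join semantics (Cartesian product, then select, then project); the second looks at one event at a time and checks it against state kept from earlier events, assuming the events arrive in timestamp order.

```python
from itertools import product
from typing import NamedTuple


class Event(NamedTuple):
    kind: str  # event type; "order" and "cancel" are hypothetical
    key: str   # correlation key, e.g. a customer id
    ts: int    # timestamp in seconds


WINDOW = 10  # hypothetical: a cancel matches an order at most 10 seconds old


def match_set_oriented(events):
    """Join style: Cartesian product, then select, then project."""
    pairs = product(events, events)  # every (a, b) combination
    selected = (
        (a, b)
        for a, b in pairs
        if a.kind == "order"
        and b.kind == "cancel"
        and a.key == b.key
        and 0 < b.ts - a.ts <= WINDOW
    )
    return [(a.key, a.ts, b.ts) for a, b in selected]


def match_event_at_a_time(stream):
    """Event-at-a-time style: check each arriving event against kept state.

    Assumes the stream is ordered by timestamp. Old orders are never
    expired here, which a real engine would have to take care of.
    """
    open_orders = {}  # key -> timestamps of orders seen so far
    for ev in stream:
        if ev.kind == "order":
            open_orders.setdefault(ev.key, []).append(ev.ts)
        elif ev.kind == "cancel":
            for t in open_orders.get(ev.key, []):
                if 0 < ev.ts - t <= WINDOW:
                    yield (ev.key, t, ev.ts)


SAMPLE = [
    Event("order", "c1", 0),
    Event("order", "c2", 3),
    Event("cancel", "c1", 5),
    Event("cancel", "c2", 20),
    Event("order", "c1", 21),
    Event("cancel", "c1", 25),
]

if __name__ == "__main__":
    print(match_set_oriented(SAMPLE))
    print(list(match_event_at_a_time(SAMPLE)))
```

On this sample both functions find the same two matches, but the first needs the whole set in hand before it can answer, while the second can report a match the moment the cancel event arrives.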

However, since event processing is not a monolithic area, there may be a place for specific cases that do not aim to provide a general language. Is there a benefit to using SQL in such specific cases?
This goes into the issue of the relationship between databases and event processing, which deserves more attention and will be the topic of one of the next postings on this blog.


More - Later

1 comment:

Anonymous said...

Dear Opher, please allow me to spice it up even more.
We are assuming that event processing is somehow exclusively related to pattern matching. Within the boundaries of this assumption, some suggest that SQL is well suited to express a set of patterns. You are clearly suggesting that there might be some, possibly more viable, alternatives. Here I do not want to discuss alternatives to SQL (that might or might not be well suited to support pattern matching); however, in the past, we have certainly seen some quite powerful alternatives based on strongly declarative syntax. Talking about religion(s) and Prolog, you might remember that we were once divided in two tribes – the former worshipping backward chaining, the latter forward chaining. I was a devoted member of the second tribe and I have worked quite a lot with tools like the Automated Reasoning Tool (Inference). This kind of tool (like others) was implemented in Lisp and exposed a very expressive set of constructs to support pattern matching. The same tool (like others) was rule based, each rule being made of a Left Hand Side (expressing the pattern) and a Right Hand Side (expressing the action to be fired). The rule executive was based (like other “production” systems) on the RETE algorithm. Here I’m not suggesting adding yet another floor to THE tower, I’m just reminding that there is a history of alternatives. (I have to admit, though, that it was quite nice to see some of these historical milestones reappearing, here and there, recently.)
What if we contemplated another possibility? What if we started to think about event processing as something not exclusively related to pattern matching?
In this alternate landscape, it might be less expensive to tackle the kind of problems you have submitted to the CEP Interest Group. In general, I think that a very subtle assumption that might be exposed by a pattern matching/aggregation approach is that, in general, a reaction might depend on the capability to establish a relationship amongst a set of assertions (or events, in our case). This might, even more subtly, lead to a logic that is quite batch oriented, rather than event driven: it might be quite daring to assume that a set of events will happen at the same time or that a set of events will have happened at a certain point in time. Thus whenever we describe a pattern referencing a set of events we are, likely, looking at something that has happened in the past, we are aggregating a set of assertions (or facts) and possibly we are not doing anything really different from what analytics do, such as “finding trends in time-series”.