Sunday, August 24, 2008

On Event Stores and Temporal Databases


I am an old-fashioned guy who carries handkerchiefs, like this one, anywhere he goes, it is handy for multiple usages, anyway - while in the past, all department stores in Israel carried handkerchiefs and it was quite a popular product, for some reason, it went out of fashion, and I have hard time to renew the inventory of handkerchiefs, and in this sense, I wish I could step for a minute into the past, buy two dozens of handkerchiefs and return. In the past, I have been involved in work around temporal databases and even co-edited a book in this area. Temporal databases had two major goals:
(1). Keep historical data, and enable easy retrieval of this data
(2). Enable to issue queries "as of" any point in time, i.e. issue query that takes into account the information that was available at a certain point in time (not as seen from "now") - again, returning for the past.
One may wonder why am I writing about temporal databases today, well - the issue of temporal databases is coming back when thinking about "event stores", I know that some of my database colleagues don't like the term "event store" or "event repository", since it does not include explicitly the word "database", but for me, using DBMS is just a possible implementation, while others, such as grid cache are also possible - but this is a topic for another discussion.
Anyway - why do we need an "event store" - in some cases we need to maintain historical events and use them, in some cases even apply pattern detection on past events. For auditing purposes we may also want to issue "as of" queries. Note that temporal representation of events can be done according to multiple temporal dimensions (see discussion about temporal dimensions of events). One of the characteristics of temporal databases are that they are "append only" databases, meaning: database records can be added, but not modified or deleted; modification and deletions are logical operators that create other instances, keeping the old ones. This is linked to one of the properties of events - immutability, which is actually a controversial property that still needs discussion about - in what conditions it is needed. Temporal databases seem to be a proper way to represent historical events.
Some concluding comments:
(1). Current DBMS do not support temporal databases as primitive, although temporal databases have been built as a second layer above them.
(2). Not all events need to be persistent for historical processing, this is a property of event-type, and its retention policies. Different events need to be persisted for different purposes.
(3). The issue of what language should be used to process "event stores" is also a matter of opinion, some believe that SQL is the answer (however, for some patterns it is an awkward language), there is an attempt to extend the SQL language with pattern extensions, here I will quote a wise person, Paul Vincent, who wrote in a footnote to this posting : This will be especially good news for those who like their SQL statements to run to multiple pages… Another option is to use on-line pattern language that is used for on-line patterns, and translated it to SQL (or one of its variations).
There are several issues that still need deeper discussion - but enough for today.

No comments: