Friday, August 19, 2011

On temporal databases and DB2

I have written about temporal databases on this blog before. I worked on temporal databases around 15 years ago, and in 1998 I co-edited the book whose cover is shown here. The temporal database area is notable for the fact that academics tried to drive a standard in it: TSQL2, led by Rick Snodgrass and his colleagues. At that time the effort did not succeed, since none of the DBMS vendors saw it as a high priority; these were the days when the Internet emerged along with XML, and the DBMS vendors had many other things to worry about. Over time, however, some DBMS vendors have adopted temporal capabilities within their products. Oracle has already implemented temporal extensions supporting TSQL in its DBMS product, and recently IBM produced its own version of a temporal database within DB2. It seems that there is now traction for temporal databases in various industries.
Today my colleague Guy Sharon drew my attention to a new article on IBM DeveloperWorks entitled "going back in time", which describes the DB2 temporal capabilities. The traditional dimensions of temporal databases, transaction time and valid time, were given new names: system time and business time (I am making a note to write a post about the overuse of the term "business"). These two dimensions make it possible to ask queries such as: what was the value of a certain attribute on 7/7/2011, as observed from 8/8/2011? The answer can differ from one observation date to another, since the knowledge about the past changes over time.
While the title of that article talks about "going back in time", and temporal databases are indeed typically viewed as a means of recording the past, they can also be used to record the future. This was noted in a work published in 1994 by Arie Segev, Avi Gal and myself, entitled "retroactive and proactive database processing" (I don't think that an online version is available).
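To make the two-dimensional query concrete (the value on 7/7/2011, as observed from 8/8/2011), here is a minimal Python sketch. It is not DB2 syntax and not taken from the article; the `Version` record, the `value_as_of` function and the address data are all hypothetical, chosen only to illustrate how a retroactive correction changes the answer depending on the observation date.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

FOREVER = date.max  # stands in for an open-ended period


@dataclass(frozen=True)
class Version:
    """One row of a bitemporal history (hypothetical layout)."""
    value: str
    valid_from: date     # business (valid) time, inclusive
    valid_to: date       # business (valid) time, exclusive
    recorded_from: date  # system (transaction) time, inclusive
    recorded_to: date    # system (transaction) time, exclusive


def value_as_of(history: List[Version], valid_on: date,
                observed_on: date) -> Optional[str]:
    """Return the value that held on `valid_on`, as known on `observed_on`."""
    for v in history:
        if (v.valid_from <= valid_on < v.valid_to
                and v.recorded_from <= observed_on < v.recorded_to):
            return v.value
    return None


# Hypothetical data: on 1 July we record that the address is "Haifa".
# On 1 August we learn, retroactively, that it changed to "Tel Aviv" on 5 July.
history = [
    Version("Haifa", date(2011, 6, 1), FOREVER,
            date(2011, 7, 1), date(2011, 8, 1)),
    Version("Haifa", date(2011, 6, 1), date(2011, 7, 5),
            date(2011, 8, 1), FOREVER),
    Version("Tel Aviv", date(2011, 7, 5), FOREVER,
            date(2011, 8, 1), FOREVER),
]

before_correction = value_as_of(history, date(2011, 7, 7), date(2011, 7, 10))
after_correction = value_as_of(history, date(2011, 7, 7), date(2011, 8, 8))
print(before_correction, after_correction)  # Haifa Tel Aviv
```

The same question about 7/7/2011 gets two different answers, because the second observation date falls after the correction was recorded. A prediction about a future date can be stored and later revised in the same way.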
Since we have been working over the last year on proactive event-driven processing, the ability to look at predicted future events that can be revised over time is very useful, and we are indeed looking at temporal database techniques for that.  More on that - later.

Tuesday, August 16, 2011

Book review: Managing event information by Gupta and Jain

I am reading the book "Managing event information" by Gupta and Jain, which is part of the Morgan & Claypool Synthesis Lectures series of short books. The authors, professors from U.C. San Diego and U.C. Irvine, concentrate on the modeling of events; the illustration below was copied from the book to give its flavor. The book starts with an introduction and a running example about a news agency. It also sets the scene by defining various types of events: spatio-temporal database events, sensor events, and multimedia events (video, audio).
Chapter 2 deals with "Event Data Models". It starts with the temporal database as a tool for event modeling, where the temporal database represents the collection of states, while an event is any transition in any state (a value change). It then moves on to conceptual temporal models and to E*, a graph-based event model using RDF and ontologies, formalizing E* and showing some examples of modeling with it. It discusses several modeling constructs, such as disjointness (causal or temporal) of events; coverage; ordering; lifespan and spacespan of events; and constraints on graph relationships, such as follows, sub-class, etc.
In chapter 3 the book discusses the implementation of the event data model. The illustration below is a UML diagram showing one of the supported patterns, events that participate within a situation; other patterns, such as causality, are also supported. Later the chapter discusses the storage model for various types of events.
Chapter 4 is entitled "querying events" and discusses the possibility of querying a collection of events. Examples are spatiotemporal queries such as "which meetings are scheduled in this hotel today after a certain talk", which looks like a regular query in a spatiotemporal database. The queries can also support aggregates (looking for frequent meetings with certain characteristics) and hierarchy (expanding to sub-events). It is also possible to query continuous events, graph relationships, etc. These queries seem to be the same as queries on spatiotemporal and graph databases, except for the fact that the database represents events.
Chapter 5 presents a major application of the event model: the creation of a story based on the event model. The storytelling is of the multimedia type, a thread of research that follows Brooks' paper.
Kevin M. Brooks: Do Story Agents Use Rocking Chairs? The Theory and Implementation of One Model for Computational Narrative. ACM Multimedia 1996: 317-328
The book concludes with some applications, a conclusion and references.
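The starting point of Chapter 2, a temporal database as a collection of states with an event being any value change between states, can be illustrated with a short Python sketch. This is only my own illustration of that idea, not the book's E* model; the `ChangeEvent` type, the `events_from_states` function and the meeting snapshots are hypothetical.

```python
from typing import Any, Dict, List, NamedTuple, Tuple


class ChangeEvent(NamedTuple):
    """A value change observed between two consecutive states."""
    time: int
    attribute: str
    old: Any
    new: Any


def events_from_states(
    states: List[Tuple[int, Dict[str, Any]]]
) -> List[ChangeEvent]:
    """Derive one event per attribute whose value differs between
    consecutive (timestamp, state) snapshots."""
    events: List[ChangeEvent] = []
    for (_, prev), (t, curr) in zip(states, states[1:]):
        # Sorted so the output order is deterministic.
        for attr in sorted(prev.keys() | curr.keys()):
            old, new = prev.get(attr), curr.get(attr)
            if old != new:
                events.append(ChangeEvent(t, attr, old, new))
    return events


# Hypothetical snapshots of a single meeting record at times 1, 2 and 3.
snapshots = [
    (1, {"room": "A", "status": "scheduled"}),
    (2, {"room": "B", "status": "scheduled"}),
    (3, {"room": "B", "status": "started"}),
]

events = events_from_states(snapshots)
for e in events:
    print(e)
```

Here the room change at time 2 and the status change at time 3 each become one event, while unchanged attributes produce nothing.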

The E* model is interesting in the sense that it shows various relationships among events, and makes it possible to obtain observations on the events and their relations to states. Since there are no standards in this area, the terminology used in this book is sometimes inconsistent with other publications in this area, but it is generally clear. The book proposes a holistic approach to event modeling, claiming that current event modeling systems look at isolated aspects of events, and cites the famous metaphor of the blind men who touch an elephant from different points; I have used this metaphor when talking about misconceptions about event processing.
It still remains to be seen whether such models will penetrate real-world systems.
I'll give further investigation of E* as a project topic in my event processing course.