Saturday, April 25, 2009

On Revision

Saturday morning, and I am spending some spare time (well -- ignoring my huge to-do list...) reading the autobiography of Shmuel Tamir, who was probably the most influential lawyer in Israel, as well as a political leader for whom I have always had great respect (I don't admire people).

Today I would like to write about the notion of "revision" and relate it to event processing.
This is inspired by, but is not a direct response to, a thread of discussion started by Peter Lin in the complexevents forum under the name "mutability and aggregation".

Revision is somewhat different from modification: in a modification facts are modified, in a revision they are revised. For example, if John Smith moves from the USA to Canada, then the facts about John Smith are modified. If it was recorded by mistake that John Smith lives in the USA, when in reality he has always lived in Canada, then changing the record is the correction of a recording mistake, that is, a revision. Some people may wonder why it is important to distinguish between the two.

The first use of "revision" that I came across was in AI, in the context of "non-monotonic logic". The rationale is that with "classic logic" one can reason about the universe only if there is perfect knowledge. The example used is that birds typically can fly, but there are some exceptions: a penguin does not fly, an ostrich does not fly, a bird with broken wings cannot fly, etc.
Let's say that Tweety is a bird and we don't know anything else about it. According to classic logic we cannot say whether it flies. According to the various non-monotonic logics, however, since birds typically fly, we can assume for all practical purposes that Tweety flies, as long as we are ready to withdraw this assumption when new information (such as: Tweety is a penguin) becomes available. In that case we may need to retract all the assertions that were inferred, directly or indirectly, from the revised assertion.
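
The retraction step can be illustrated with a small sketch in Python. It is only a toy, not an implementation of any particular non-monotonic logic: the class BeliefBase, its methods, and the belief "can_reach_treetop(Tweety)" are hypothetical names made up for the illustration. Each inferred belief remembers the beliefs it was inferred from, so withdrawing one belief also withdraws everything that depends on it.

```python
class BeliefBase:
    """Toy belief store (hypothetical) with dependency-based retraction."""

    def __init__(self):
        self.facts = set()
        self.support = {}  # inferred belief -> set of beliefs it was inferred from

    def assert_fact(self, fact):
        self.facts.add(fact)

    def infer(self, conclusion, premises):
        """Record an inferred belief together with the beliefs that support it."""
        self.facts.add(conclusion)
        self.support[conclusion] = set(premises)

    def retract(self, belief):
        """Withdraw a belief and everything inferred from it, directly or indirectly."""
        self.facts.discard(belief)
        self.support.pop(belief, None)
        dependents = [c for c, premises in self.support.items() if belief in premises]
        for conclusion in dependents:
            if conclusion in self.support:  # may already be gone after an earlier recursion
                self.retract(conclusion)


kb = BeliefBase()
kb.assert_fact("bird(Tweety)")
kb.infer("flies(Tweety)", ["bird(Tweety)"])               # default: birds typically fly
kb.infer("can_reach_treetop(Tweety)", ["flies(Tweety)"])  # inferred indirectly

# New information arrives: Tweety is a penguin, so the default is withdrawn.
kb.assert_fact("penguin(Tweety)")
kb.retract("flies(Tweety)")

print(sorted(kb.facts))
```

After the retraction only the two asserted facts remain; both the default conclusion and the belief inferred from it are gone.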

Later in life, I worked on temporal databases. One of the motivations for temporal databases has been to issue "as-of" queries, meaning: looking at what was known from the viewpoint of a certain time point in the past. For example, if we investigate possible malpractice by a physician (I heard that the national sport of Americans is to sue their physicians), then in order to determine whether the physician made a reasonable decision we need to know what information was available to the physician at the time the decision was made. To achieve that, facts cannot be deleted or changed; we need an "append only" database. The distinction between "modification" and "revision" is important for the decision analysis: there is a difference between the case in which the fever was high in the next measurement, and the case in which the fever was already high in the measurement before the decision, but it was reported wrongly and this information was revised later. Eleven years ago I co-edited a book about temporal databases which (among other things) discusses these issues.
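
As a rough illustration of the physician example, here is a minimal sketch in Python of an append-only log with an "as-of" query. The names Record, AppendOnlyLog, valid_time and recorded_time are assumptions chosen for the illustration and are not taken from any specific temporal database product; time points are plain integers to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    valid_time: int     # when the fever was measured
    recorded_time: int  # when this row was appended to the log
    value: float        # the temperature that was recorded


class AppendOnlyLog:
    """Hypothetical append-only store: rows are never changed or deleted."""

    def __init__(self):
        self.rows = []

    def append(self, valid_time, recorded_time, value):
        self.rows.append(Record(valid_time, recorded_time, value))

    def as_of(self, viewpoint):
        """Return {valid_time: value} as it was known at time `viewpoint`."""
        known = {}
        for row in sorted(self.rows, key=lambda r: r.recorded_time):
            if row.recorded_time <= viewpoint:
                known[row.valid_time] = row.value  # a later recording overrides an earlier one
        return known


log = AppendOnlyLog()
log.append(valid_time=1, recorded_time=1, value=37.0)  # measurement at t=1, reported wrongly
# The physician makes the decision at t=2.
log.append(valid_time=3, recorded_time=3, value=39.5)  # modification: a new, later measurement
log.append(valid_time=1, recorded_time=4, value=39.5)  # revision: the t=1 reading is corrected

at_decision = log.as_of(2)  # what the physician could have known
today = log.as_of(5)        # what is known now

print(at_decision, today)
```

The as-of query at the decision time still shows the wrong low reading, which is exactly the information needed to judge whether the decision was reasonable, while the current view shows the revised value.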

Now, something about revisions and event processing. Recall that an event is something that happens, and it is reported to an event processing system through its projection, which is also known as an event (sometimes: event object, event message). An event that happens in reality cannot be modified or deleted, since it reflects something that happened. However, when we go to its projection in the processing system, again, if we assume that the knowledge is not perfect, then we can have several cases of revision:

1. The event really did not happen, but it was reported by mistake that it happened, and the mistake was realized later.
2. The event happened, but it was not reported, and this was realized later.
3. The event both happened and was reported, but some information associated with the event (reported through the event's payload) had a wrong value due to an error that was corrected later.
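
The three cases above can be sketched in Python as three kinds of revision applied to the set of event projections held by the processing system. This is only an illustrative sketch: RevisionKind, apply_revision, the event identifiers and the payload fields are all hypothetical, and it says nothing yet about what happens to results that were already derived from the revised events.

```python
from enum import Enum


class RevisionKind(Enum):
    RETRACT = "retract"          # case 1: reported, but did not really happen
    LATE_INSERT = "late_insert"  # case 2: happened, but was not reported
    CORRECT = "correct"          # case 3: reported with a wrong payload value


def apply_revision(events, kind, event_id, payload=None):
    """Return a revised copy of `events`; the input dict is left untouched."""
    revised = dict(events)
    if kind is RevisionKind.RETRACT:
        revised.pop(event_id, None)
    elif kind is RevisionKind.LATE_INSERT:
        revised[event_id] = dict(payload)
    elif kind is RevisionKind.CORRECT:
        # Build a new payload dict so the earlier version is not mutated.
        revised[event_id] = {**revised[event_id], **payload}
    return revised


events = {
    "e1": {"type": "withdrawal", "amount": 100},
    "e2": {"type": "deposit", "amount": 50},
}

v1 = apply_revision(events, RevisionKind.RETRACT, "e1")
v2 = apply_revision(v1, RevisionKind.LATE_INSERT, "e3", {"type": "deposit", "amount": 20})
v3 = apply_revision(v2, RevisionKind.CORRECT, "e2", {"amount": 500})

print(v3)
```

Each revision produces a new version and leaves the previous one intact, in the spirit of the append-only approach: the system can still answer what it believed before the revision arrived.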

I'll soon post a continuation that discusses the implications of revisions for the processing of events. More - Later


Marco Seiriö said...

One interesting variation is when we know for sure that we never get correct events. For example for GPS events we know that the probability for the vehicle to actually be located at the exact point reported by the GPS is rather low. It's somewhere near for sure, but not likely exactly where the GPS says.

For most practical purposes we don't have a problem with this. But there are cases where one would like to take this into account.

Opher Etzion said...

Hi Marco. This is another topic: it concerns the inaccuracy of information in an event. It is of course related, but it can be dealt with using uncertainty handling methods.



Richard Veryard said...

Marco's comment raises the question of cascading probability revision. If we revise A, and A is used to infer B, then we need to revise the probability of B, and this in turn affects the probability of C, and so on. Conversely, if C turns out to be untrue, this may give us a reason to revise A.