As a past Dungeon Master, the word crawler always reminds me of the "carrion crawler", a monster you can see in the picture above. Recently, however, a combination of the almighty Google crawler and automatic trading programs based on event processing caused a fiasco that crashed the stock of United Airlines. Some blogs have referred to it: Brenda Michelson, in her blog, talked about the butterfly that led to the computer glitch. Mark Palmer thinks that news should be regulated (some people I know who were born in countries where news is indeed regulated shiver at the idea that news, of any type, should be regulated).
I will not go back over the story, but as a footnote, two issues come to mind: event validation and the issue of occurrence time. So I'll write today about occurrence time, since it is easier...
The work in the temporal area talks about several time dimensions. The bi-temporal model talks about transaction time, the time at which a fact is recorded, and valid time, the time interval in which the fact is valid. In event processing we also look at a bi-temporal model similar to this: detection time, the time at which the message that represents the event was detected by the processing system, and occurrence time, the time at which the event happened in reality (occurrence time can be considered the starting point of a valid time that ends when the event becomes irrelevant, but let's leave that out of scope and concentrate on occurrence time).
Some implementations of event processing base the order of events on detection time, and some support occurrence time. Others base their built-in temporal capabilities on detection time and allow a time to be defined as an attribute, but then the temporal operators have to be hand-coded as regular predicates.
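To make that last case concrete, here is a minimal Python sketch that is not taken from any particular product. The `Event` class, its fields, and the `occurred_before` function are all hypothetical names invented for the illustration: the engine is assumed to stamp `detection_time` itself, while `occurrence_time` is just another attribute, so a "happened before" operator has to be written by hand as an ordinary predicate.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    name: str
    detection_time: datetime   # assumed to be assigned by the engine on arrival
    occurrence_time: datetime  # plain attribute supplied by the event source


def occurred_before(a: Event, b: Event) -> bool:
    """Hand-coded temporal predicate: compares occurrence times, not arrival order."""
    return a.occurrence_time < b.occurrence_time


# Made-up data: the order happened first but reached the system after the payment.
order = Event("order", datetime(2008, 9, 10, 9, 5), datetime(2008, 9, 10, 9, 0))
payment = Event("payment", datetime(2008, 9, 10, 9, 3), datetime(2008, 9, 10, 9, 2))

order_first_in_reality = occurred_before(order, payment)
order_first_on_arrival = order.detection_time < payment.detection_time
```

With this data the two comparisons disagree: the order precedes the payment by occurrence time, but not by detection time.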
One of the common fallacies is that detection time is a good enough metric for temporal operations on events (e.g. trends). First, an event from the past can suddenly pop up out of the blue (I know a person who has a habit of catching up on Email every two weeks or so, and answers an Email before realizing that there has been a whole thread of Emails that makes answering the original Email quite obsolete). Second, the order may not be kept even if the delay from occurrence time to detection time is very small. The order in which medical exams were performed may not be consistent with the order in which their results arrive, and knowing the real order may be important for the differential diagnosis.
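The medical exam case can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `Reading` type, the `trend` function, and the values themselves are made up. The point is only that the same three results give a different trend depending on which time is used for ordering.

```python
from datetime import datetime
from typing import NamedTuple


class Reading(NamedTuple):
    value: float
    occurrence_time: datetime  # when the sample was taken
    detection_time: datetime   # when the result reached the system


def trend(readings, key):
    """Return 'rising', 'falling', or 'mixed' for values ordered by the given time."""
    values = [r.value for r in sorted(readings, key=key)]
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        return "rising"
    if all(a > b for a, b in pairs):
        return "falling"
    return "mixed"


# Made-up data: three samples taken an hour apart; the first result arrives last.
readings = [
    Reading(5.0, datetime(2008, 9, 10, 8, 0), datetime(2008, 9, 10, 12, 30)),
    Reading(6.0, datetime(2008, 9, 10, 9, 0), datetime(2008, 9, 10, 9, 40)),
    Reading(7.0, datetime(2008, 9, 10, 10, 0), datetime(2008, 9, 10, 10, 45)),
]

by_occurrence = trend(readings, key=lambda r: r.occurrence_time)
by_detection = trend(readings, key=lambda r: r.detection_time)
```

Ordered by occurrence time the values rise steadily; ordered by detection time the late result lands at the end and the trend is lost.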
Thinking about standard structures for events: I would think that a "standard header" with some mandatory properties for each event is a good candidate for standardization (I am less optimistic about standards for the content of the event), and in that header the occurrence time should be mandatory.
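A header of this kind might look roughly like the following Python sketch. No such standard exists in this form; `EventHeader`, `Event`, and every field name are hypothetical, chosen only to show occurrence time as a property that cannot be left out while the content stays free-form.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict


@dataclass(frozen=True)
class EventHeader:
    event_type: str
    occurrence_time: datetime  # mandatory: no default, so it cannot be omitted
    detection_time: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self):
        if not isinstance(self.occurrence_time, datetime):
            raise TypeError("occurrence_time must be a datetime")


@dataclass(frozen=True)
class Event:
    header: EventHeader
    payload: Dict[str, Any] = field(default_factory=dict)  # free-form content


def try_build_without_occurrence_time() -> bool:
    """Return True if the header rejects a missing occurrence time."""
    try:
        EventHeader(event_type="news_item")  # type: ignore[call-arg]
    except TypeError:
        return True
    return False


# Made-up example event.
event = Event(
    header=EventHeader("news_item", datetime(2008, 9, 10, 10, 0, tzinfo=timezone.utc)),
    payload={"source": "example"},
)
rejected = try_build_without_occurrence_time()
```

Detection time is filled in on arrival here, while the source has to state when the event actually happened; a header built without an occurrence time is rejected.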
Occurrence time has some inherent issues associated with it, but I'll discuss them another time.