Last week I was on a short visit to England, where I served as an external examiner in a PhD exam (called a "viva" by the locals). The exam was unusual in that it was the first PhD exam I have ever attended where the advisor was not present (actually, the advisor was not invited; I guess it is a matter of culture). The dissertation, submitted by Jenny Li, dealt with performance metrics and a benchmark framework for event processing systems. One of the ideas raised in the dissertation is that measured latency may be subjective in what it means. It took me some time to understand the idea, so I'll explain it through an example.
Let's assume that the pattern that we are looking for is a sequence of four events, of types E1, E2, E3, E4.
The two common metrics associated with latency are:
1. Measurement starts at E4, since the consumer expects to see results only when the last event, the one that completes the pattern, occurs.
2. Measurement starts at any event occurrence and ends when the system finishes processing that individual event, which may be merely storing it in a buffer.
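To make the two metrics concrete, here is a minimal sketch. It assumes a trace in which every event carries an arrival timestamp and a processing-completion timestamp (in milliseconds), and that the match is emitted as soon as E4's processing finishes; the function name and trace format are my own illustration, not anything from the dissertation.

```python
PATTERN = ("E1", "E2", "E3", "E4")

def latencies(trace):
    """trace: chronological list of (event_type, arrival_ts, done_ts).

    Returns (per_event_latencies, match_latency):
    - Metric 2: for each event, the time from its arrival until the
      system finishes processing that individual event.
    - Metric 1: measured only from the arrival of E4, the event that
      completes the pattern, until the match is emitted (here assumed
      to coincide with E4's processing completion).
    """
    per_event = [done - arrival for _, arrival, done in trace]

    match_latency = None
    types = tuple(t for t, _, _ in trace)
    if types[-4:] == PATTERN:
        _, arrival4, done4 = trace[-1]
        match_latency = done4 - arrival4
    return per_event, match_latency
```

For example, if E1, E2 and E3 each take 100 ms to buffer but E4 triggers 500 ms of pattern evaluation, metric 1 reports only the 500 ms, while metric 2 reports all four per-event figures.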
Note that the first metric is biased towards eager evaluation, which does minimal work at the end by preparing sub-patterns in advance, while the second metric is more balanced.
The proposed metric is: start measuring at the event most significant to the consumer, and stop at the end of the processing of the pattern. Say the most significant event to the consumer is E2; then the latency starts at the occurrence of E2 and ends either when the pattern matches, or when E2 can be discarded because no match is possible. This is applicable in cases where all events in the sequence typically occur (e.g. when they are time-series events) at relatively fixed intervals. This is an interesting metric; we asked the student to define definite criteria for when it is an applicable metric and when it is not.
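The proposed metric can also be sketched in code. The sketch below assumes, purely for illustration, that a pending E2 becomes discardable once a fixed time window elapses without the pattern completing; the window value, the stream format, and the function name are my own assumptions, not taken from the dissertation.

```python
PATTERN = ("E1", "E2", "E3", "E4")

def significant_latencies(stream, significant="E2", window=1000):
    """stream: chronological list of (event_type, ts), in milliseconds.

    For each occurrence of the significant event (assumed here to be
    E2), report its latency under the proposed metric: from its arrival
    until the full sequence E1..E4 matches, or until the event can be
    discarded because its (hypothetical) time window expired.
    Returns a list of (outcome, latency) pairs.
    """
    results = []
    progress = 0     # how many pattern steps have been matched so far
    sig_ts = None    # arrival time of the pending significant event
    for event_type, ts in stream:
        # A pending significant event whose window has expired becomes
        # discardable at sig_ts + window, so its latency is the window.
        if sig_ts is not None and ts - sig_ts > window:
            results.append(("discarded", window))
            progress, sig_ts = 0, None
        # Advance the partial match; out-of-order events are ignored
        # in this simplified single-instance detector.
        if event_type == PATTERN[progress]:
            progress += 1
            if event_type == significant:
                sig_ts = ts
            if progress == len(PATTERN):
                results.append(("matched", ts - sig_ts))
                progress, sig_ts = 0, None
    return results
```

With a complete sequence E1(t=0), E2(t=100), E3(t=200), E4(t=300), the E2 latency is 200 ms ending at the match; if the sequence stalls after E2, the latency ends at the window expiry instead.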