In a comment on my previous postings about window boundaries, I was asked why we need two types of interval semantics: half-closed for the time-oriented sliding window, and closed for the event-oriented sliding window (one counts time, the other counts the number of events of certain types).
The question is: why can't we use the half-closed interval semantics for the event-oriented sliding window as well? Say we have a sliding window that counts 5 events of one type; the 6th event, which serves as the starting point of the next window, would also terminate the previous window.
The answer: this solution is not really equivalent to closing the window on the 5th event.
Let's take an example:
Instance 1 occurs at 10:02
Instance 2 occurs at 10:03
Instance 3 occurs at 10:13
Instance 4 occurs at 10:14
Instance 5 occurs at 10:17
Instance 6 occurs at 11:01
According to the closed interval semantics, the interval is [10:02, 10:17]; according to the half-closed interval semantics on the 6th instance, the interval is [10:02, 11:01). This means that an event that occurs at 10:35 does not belong to the window according to the first interpretation, but does belong to it according to the second.
Furthermore, if some derived events are emitted or some action is triggered at the end of the window, this will now happen at 11:01 and not at 10:17, which again may create other problems.
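To make the difference concrete, here is a minimal Python sketch of the two interpretations applied to the six instances above. The names (instances, WINDOW_SIZE and the two functions) are my own illustration and do not come from any product; the sketch only shows which interval contains an event at 10:35 and when each window ends.

```python
from datetime import time

# Timestamps of the six instances from the example above.
instances = [time(10, 2), time(10, 3), time(10, 13),
             time(10, 14), time(10, 17), time(11, 1)]
WINDOW_SIZE = 5  # hypothetical name: number of events that bound one window


def in_closed_window(t):
    """Closed interval: from the 1st instance through the 5th, inclusive."""
    return instances[0] <= t <= instances[WINDOW_SIZE - 1]


def in_half_closed_window(t):
    """Half-closed interval: from the 1st instance up to, excluding, the 6th."""
    return instances[0] <= t < instances[WINDOW_SIZE]


probe = time(10, 35)
print(in_closed_window(probe))       # False: 10:35 is after 10:17
print(in_half_closed_window(probe))  # True: 10:35 is before 11:01

# The moment at which end-of-window processing can start also differs.
closed_end = instances[WINDOW_SIZE - 1]   # 10:17
half_closed_end = instances[WINDOW_SIZE]  # 11:01
```

The gap between the two end points, 44 minutes in this example, is exactly the time during which the two interpretations disagree.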
In some applications the distance between events is very small, since the assumption is that the events of the types that bound the windows are very dense, and then the distinction between the two becomes marginal. However, this is not the general case: in many applications the distance between the 5th and 6th instances may be quite substantial.
This reminds me that in the course I taught, the students implemented projects using various products available on the market today, and one of the teams (I will not disclose the product name) wrote in its report that the window is indeed closed only when the next event arrives. When they debugged their system they therefore had to add a dummy event, otherwise the window would never close.
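The behaviour the students ran into can be reproduced with a small sketch. This is a hypothetical CountWindow class of my own, not the API of the product they used; the close_on_next_event flag is an assumption that switches between the two interpretations.

```python
class CountWindow:
    """Hypothetical event-count window; not the API of any real product."""

    def __init__(self, size, close_on_next_event):
        self.size = size
        self.close_on_next_event = close_on_next_event
        self.buffer = []

    def on_event(self, event):
        """Feed one event; return the events of a closed window, or None."""
        if self.close_on_next_event:
            # Half-closed: a full window closes only when the next event arrives,
            # and that event becomes the first one of the next window.
            if len(self.buffer) == self.size:
                closed, self.buffer = self.buffer, [event]
                return closed
            self.buffer.append(event)
            return None
        # Closed: the window closes as soon as the last counted event arrives.
        self.buffer.append(event)
        if len(self.buffer) == self.size:
            closed, self.buffer = self.buffer, []
            return closed
        return None


events = ["e1", "e2", "e3", "e4", "e5"]
closed_window = CountWindow(5, close_on_next_event=False)
half_closed_window = CountWindow(5, close_on_next_event=True)

results_closed = [closed_window.on_event(e) for e in events]
results_half = [half_closed_window.on_event(e) for e in events]
print(results_closed[-1])  # ['e1', 'e2', 'e3', 'e4', 'e5']
print(results_half[-1])    # None: still waiting for a 6th event

# Only an extra (dummy) event makes the half-closed window close.
flushed = half_closed_window.on_event("dummy")
print(flushed)             # ['e1', 'e2', 'e3', 'e4', 'e5']
```

With the closed interpretation the window's content is available the moment the 5th event arrives; with the other one, nothing comes out until something else is fed in, which is why the team needed the dummy event.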
More window-related discussion later.