This is a blog describing some thoughts about issues related to event processing and thoughts related to my current role. It is written by Opher Etzion and reflects the author's own opinions
Wednesday, December 15, 2010
Revisiting EPN
Since this question is actually a good question, I wanted to share my response. There are two main reasons why our thinking has shifted to the EPN model: efficiency and usability.
I'll start with usability. Experience shows (and this observation is also true of inference-based systems) that people feel more comfortable when they can control the flow explicitly rather than rely on implicit flows: they understand better what the system does, can debug and validate it more easily, and trust such systems more. Note that an EPN is not a workflow: it does not represent control flow, it represents event streaming flow (in a way similar to data flow, with some semantic distinctions).
The other reason is efficiency. If an EPA subscribes to an event type, then either the EPA has to process and filter out a substantial number of irrelevant events, or the number of event types has to be increased substantially. Imagine the following scenario: an event of type ET1 arrives. First it meets a filter that filters out many of the events using some assertion, and then there are various EPAs that process only the filtered-in events. One of these EPAs is an enrichment, adding some information from a database, and the enriched event is then sent to an aggregator for further processing.
If we use the "event type" subscription, there are two choices. The first is to create event type ET2 for the filtered-in events, identical to ET1, and create a derived event of type ET2 for each filtered-in event of type ET1, then create event type ET3 for the enriched event with the added enrichment attribute; then indeed each EPA subscribes to a single event type. The second choice is to use ET1 for all three cases, but add an indication (using some derived attribute) of which variation of ET1 it is, and filter inside the aggregator to accept only the right variation of ET1. Both are inefficient: the first due to the need to manage many more event types, the second because many more events are transmitted to each EPA just to be filtered out, and the order also becomes important here.
The explicit EPN resolves this because each EPA sends its output to a channel, and the channel can route according to source, type, assertion, etc. Thus a specific output terminal of a channel is really the topic to which an EPA subscribes. Note that all the possibilities mentioned before are just special cases of an EPN, and if one insists, such an EPN can be constructed. In the extreme case, one can construct an EPN with a single channel that routes every event to every EPA, leaving each EPA to decide whether it wants to use the event or not, but I would not recommend it as a good design pattern. More - later.
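
To make the filter, enrichment and aggregator scenario concrete, here is a minimal sketch of an explicit EPN in Python. It is only an illustration under stated assumptions: the Channel class, the amount threshold, the customer-to-region lookup (standing in for the database) and all attribute names are hypothetical and do not come from any particular event processing product. The point it tries to show is that every event keeps the single type ET1, and each EPA only ever sees the events routed to it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]
Assertion = Callable[[Event], bool]
Handler = Callable[[Event], None]


@dataclass
class Channel:
    """Hypothetical channel: delivers an event to every subscriber whose assertion holds."""
    routes: List[Tuple[Assertion, Handler]] = field(default_factory=list)

    def subscribe(self, assertion: Assertion, handler: Handler) -> None:
        self.routes.append((assertion, handler))

    def publish(self, event: Event) -> None:
        for assertion, handler in self.routes:
            if assertion(event):
                handler(event)


def build_network(region_lookup: Dict[str, str], threshold: int):
    """Wire source -> filter -> enrich -> aggregate using three channels."""
    source, filtered, enriched = Channel(), Channel(), Channel()
    totals: Dict[str, int] = {}

    def filter_epa(event: Event) -> None:
        # Only filtered-in events are passed on; no ET2 type is created.
        if event["amount"] >= threshold:
            filtered.publish(event)

    def enrich_epa(event: Event) -> None:
        # The lookup table stands in for the database mentioned in the text.
        region = region_lookup.get(event["customer"], "unknown")
        enriched.publish({**event, "region": region})

    def aggregate_epa(event: Event) -> None:
        totals[event["region"]] = totals.get(event["region"], 0) + event["amount"]

    def accept_all(event: Event) -> bool:
        return True

    source.subscribe(lambda e: e["type"] == "ET1", filter_epa)
    filtered.subscribe(accept_all, enrich_epa)
    enriched.subscribe(accept_all, aggregate_epa)
    return source, totals


source, totals = build_network({"alice": "EU", "bob": "US"}, threshold=100)
for customer, amount in [("alice", 150), ("bob", 50), ("alice", 200), ("carol", 120)]:
    source.publish({"type": "ET1", "customer": customer, "amount": amount})
print(totals)
```

In this sketch the aggregator never receives the unfiltered or unenriched events, and no ET2 or ET3 type has to be managed; the routing lives in the channels rather than in the type system.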
Wednesday, November 11, 2009
On Defining "EVENT" in Earnest

- State-change view - an event is a change in the state of something and as such is reported. Its properties: a change must occur, and this change must be reported. Example: an item previously outside the range of an RFID reader is now within the range of this RFID reader.
- Happening view -- an event is anything that happens, or is contemplated as happening (the EPTS glossary definition). In this case a change must occur, but its reporting to the system is optional: not every event according to this definition is of interest to be reported. Example: a person sending an email.
- Detectable-condition view -- an event is a detectable condition that can trigger a notification. In this case a change does not have to occur, but reporting should occur. Example: a GPS device reporting a truck's location (note -- the location may not have changed since the last report, since the truck driver went for lunch).
Saturday, March 15, 2008
The Babylon tower and event formats

Talking about data, one of the topics discussed in the OMG meeting about standards was semantic/structural standards for events. I used the term "Babylon tower" in one of the earliest postings in this Blog, meaning that we have a Babylon tower of languages - like the original tower, which separated the languages. However, there is another Babylon tower that relates to event formats - syntax and semantics - and here the problem is even harder, since there are multiple formats in multiple domains, and nobody has even made an inventory survey. One of the presenters said (and it is true in some cases) that 80 percent of the cost of building an EP application goes into setting up the events from the sources and transforming them into a processable format by composing adapters (hand coded, or by using transformation engines). This is one of the areas that need more investigation, and perhaps we need a meta-data standard here.
Thursday, December 6, 2007
On Event Representation
Back to micro-oriented issues: today I'll start a discussion about what lies behind the definitions in the event processing glossary, and get to the issue of event representation. As the glossary says, an event is something that happens in reality. We also tend to use the term "event" for the representation of this happening for the purpose of processing by a computer. In this sense, event has several aliases in the glossary: event object, event message and event tuple.
The various aliases indicate that the space of event representation is not uniform: some think of an event as a message that moves around; some think of it as a tuple that is part of a stream, really the twin brother of a tuple in a relational database; and some think of it as an object with arbitrary structure (which may also be hidden). Obviously, there is no "universal event". Unfortunately, in many cases events are already given by the sources in their own formats, and the event processing designer has little say about it. A generic event processing system therefore has to support multiple types of events, or have adapters that translate all types of events to some canonical type (and typically both: supporting some canonical type of events and having adapters that translate other types to the canonical type). Events can be structured, semi-structured (XML), or unstructured (the area of unstructured event processing deserves more focused attention).
One of the questions is whether there are common attributes that each event should have to enable event processing. In the data world the answer is no: there is not a single attribute that must exist in all relations (besides the fact that each tuple should be a member of some relation - no floating tuples). For event processing, some attributes have been proposed as common attributes:
- Event-type
- Source
- Time-Stamp (or Time-Interval)
Let's look at the question of whether they are mandatory or not:
- The first question is whether each event is an instance of an event-type (or event-class). The glossary says yes: "all events must be instances of event-type". This seems reasonable; however, we may think of some exceptions, such as rare events that have not been classified. I need to drill down on rare events in some other post.
- The second question is whether the source should be mandatory - again, this is desirable if we want to have lineage or tracing back actions/decisions, but there may be cases in which the source is indefinite, or we wish to hide the source (e.g. leaking of information).
- The third question is whether each event must have a time-stamp (or time-interval in case it happens over an interval - another area that needs more discussion) - the answer is that many event processing patterns are time related, and if we want to know which event occurs first, or if two events occurred within 5 minutes of each other, we need to know WHEN this event occurred in reality. However - in some cases it is not known, in other cases it is not really needed.
It seems that all common attributes are useful, but may be optional in some cases.
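
As a rough illustration of "useful, but optional", here is a minimal sketch of a canonical event in Python in which all three common attributes may be absent. The CanonicalEvent class, the occurred_within helper, the from_sensor_message adapter and the message keys ("kind", "device", "ts") are all hypothetical names invented for this example; they are not taken from the glossary or from any product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Any, Dict, Optional


@dataclass
class CanonicalEvent:
    """Hypothetical canonical event: every common attribute is optional."""
    event_type: Optional[str] = None      # None for an unclassified (e.g. rare) event
    source: Optional[str] = None          # None when the source is unknown or hidden
    timestamp: Optional[datetime] = None  # None when the occurrence time is not known
    payload: Dict[str, Any] = field(default_factory=dict)


def occurred_within(a: CanonicalEvent, b: CanonicalEvent,
                    window: timedelta) -> Optional[bool]:
    """Time-related pattern: None means the question cannot be answered."""
    if a.timestamp is None or b.timestamp is None:
        return None
    return abs(a.timestamp - b.timestamp) <= window


def from_sensor_message(message: Dict[str, Any]) -> CanonicalEvent:
    """Hypothetical adapter from one source format to the canonical type."""
    raw_time = message.get("ts")
    return CanonicalEvent(
        event_type=message.get("kind"),
        source=message.get("device"),
        timestamp=datetime.fromisoformat(raw_time) if raw_time else None,
        payload={k: v for k, v in message.items()
                 if k not in ("kind", "device", "ts")},
    )


first = from_sensor_message(
    {"kind": "reading", "device": "s1", "ts": "2007-12-06T10:00:00", "value": 3})
second = from_sensor_message(
    {"kind": "reading", "device": "s2", "ts": "2007-12-06T10:04:00", "value": 5})
anonymous = from_sensor_message({"value": 7})

print(occurred_within(first, second, timedelta(minutes=5)))     # True
print(occurred_within(first, anonymous, timedelta(minutes=5)))  # None
```

The design choice here is that a missing time-stamp does not make the event invalid; it only makes time-related patterns unanswerable for that event, which the caller has to handle.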
There are also attributes that are common to certain types of events, such as probability for uncertain events, spatial coordinates for spatial events, etc. -- and this is before relating to the content.
The content is determined according to domain-related ontologies, and there is a lot of work today in different application domains and industries to define such ontologies. XML is commonly used as the language for expressing them; it has its own benefits, but it also carries overhead relative to "flat" events, in which the attributes are positional rather than keyword-oriented.
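
The trade-off between keyword-oriented and positional events can be sketched in a few lines of Python. The attribute names, the FLAT_LAYOUT tuple and both sample events are made up for illustration; the only library used is the standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical agreed layout for the "flat" event: meaning comes from position.
FLAT_LAYOUT = ("event_type", "source", "temperature")


def parse_flat(record: str) -> dict:
    """Positional event: compact, but both sides must share FLAT_LAYOUT."""
    return dict(zip(FLAT_LAYOUT, record.split(",")))


def parse_xml(document: str) -> dict:
    """Keyword-oriented event: self-describing, attribute order does not matter."""
    root = ET.fromstring(document)
    return {child.tag: child.text for child in root}


flat_event = "reading,sensor-1,21.5"
xml_event = (
    "<event>"
    "<temperature>21.5</temperature>"
    "<event_type>reading</event_type>"
    "<source>sensor-1</source>"
    "</event>"
)

# Same information either way; the XML form is simply longer on the wire.
assert parse_flat(flat_event) == parse_xml(xml_event)
print(len(flat_event), len(xml_event))
```

Both forms carry the same three attributes; the XML version repeats every attribute name in each event, which is the overhead referred to above, while the flat version depends on a layout agreed outside the event itself.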
Events also carry semantic information, such as references to entities in certain roles. In fact, an event can be thought of as a transition from one state to another, and the information included in the event refers to a change in the universe, such as:
what was changed? what entities are affected? when was it changed? where did the change take place? what other information is important about the change?
This short discussion has already raised several open issues that deserve further discussion, so I'll put these topics on the queue for further postings... more - later.