Thursday, September 11, 2008

On Occurrence time: a footnote to the UAL fiasco

As a past Dungeon Master, the word crawler always reminds me of the "carrion crawler", a monster you can see in the picture above. Recently, however, a combination of the almighty Google crawler and automatic trading programs based on event processing caused a fiasco that crashed the stock of United Airlines. Some blogs have referred to it: Brenda Michelson, in her blog, talked about the butterfly that led to the computer glitch. Mark Palmer thinks that news should be regulated (some people I know who were born in countries where news is indeed regulated shiver at the idea that news - of any type - should be regulated).
I will not go back to the story, but as a footnote, two issues come to mind: event validation and the issue of occurrence time. So I'll write today about occurrence time, since it is easier...
The work in the temporal area talks about several time dimensions. The bi-temporal model talks about transaction time -- the time at which a fact is recorded, and valid time -- the time interval in which the fact is valid. In event processing we also look at a bi-temporal model similar to this: detection time -- the time at which the message that represents the event was detected by the processing system, and occurrence time -- the time at which the event happened in reality (occurrence time can be considered as the starting point of a valid time that ends when the event becomes irrelevant, but let's leave that out of scope and concentrate on occurrence time).
Some implementations of event processing base the order of events on detection time, some support occurrence time, and some base their built-in temporal capabilities on detection time and enable defining times as an attribute, but then the temporal operators have to be hand-coded as regular predicates.
One of the common fallacies is that detection time is good enough as a metric for temporal operations on events (e.g. trends). First, an event from the past can suddenly pop up out of the blue (I know a person who has a habit of catching up on Email every two weeks or so, and answering an Email before realizing that there has been a whole thread of Emails that makes answering the original Email quite obsolete). Second, the order may not be kept even if the delay from the occurrence time to the detection time is very small. The order of medical exams may not be consistent with the order in which the results arrive, and knowing the real order may be important for the differential diagnosis.
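The second point can be illustrated with a minimal Python sketch. The `Event` class, the exam names and the time values are all hypothetical, chosen only to show how ordering by detection time can disagree with ordering by occurrence time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    occurrence_time: float  # when the event happened in reality (hypothetical units)
    detection_time: float   # when the processing system detected the message


# Hypothetical medical exams: exam_B was performed after exam_A,
# but its result reached the processing system first.
events = [
    Event("exam_A", occurrence_time=1.0, detection_time=5.0),
    Event("exam_B", occurrence_time=2.0, detection_time=3.0),
    Event("exam_C", occurrence_time=4.0, detection_time=6.0),
]

# Order as seen by a system that only knows detection time.
by_detection = [e.name for e in sorted(events, key=lambda e: e.detection_time)]

# Order in which the exams actually took place.
by_occurrence = [e.name for e in sorted(events, key=lambda e: e.occurrence_time)]

print(by_detection)
print(by_occurrence)
```

A temporal operator such as "A happened before B" gives different answers on the two orderings, which is why the choice of time dimension matters.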
Thinking about standard structures for events -- I would think that a "standard header" with some mandatory properties for each event is a good candidate for standardization (I am less optimistic about standards for the content of the event), and in the header, the occurrence time should be mandatory.
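As a rough illustration only, such a header could look like the Python sketch below. The class and field names (`EventHeader`, `event_type`, `payload`) are assumptions made for the example and do not come from any existing standard; the only point is that occurrence time has no default and therefore must be supplied.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict


@dataclass(frozen=True)
class EventHeader:
    """Hypothetical standard header shared by all events."""
    event_type: str
    occurrence_time: datetime  # mandatory: no default value
    # Detection time is stamped by the processing system on arrival.
    detection_time: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


@dataclass(frozen=True)
class Event:
    header: EventHeader
    # The content is left unstructured; only the header is standardized here.
    payload: Dict[str, Any] = field(default_factory=dict)


header = EventHeader(
    event_type="exam_result",
    occurrence_time=datetime(2008, 9, 11, 8, 30, tzinfo=timezone.utc),
)
event = Event(header, {"exam": "blood test"})
```

Omitting `occurrence_time` raises a `TypeError` when the header is constructed, so an event without an occurrence time never enters the system.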
Occurrence time has some inherent issues associated with it - but I'll discuss it another time.

Wednesday, September 10, 2008

On events about events

Today, our entire family travelled to the "instruction base" of the Israeli Navy, where my (second) daughter had the ceremony of finishing her basic training. Above is the logo of the Israeli Navy; the Navy is the only branch of the army that has ceremonial white uniforms, so it was an all-white day...

This is also, somehow, the time of the year when many things related to event processing happen. Yesterday, IBM (the company that pays my salary) held what was known internally as the "events event" in Boston for analysts and press. Some reports on it exist in the media; I think that David Berlind's report in "Information Week" is the most thorough one, and it includes video and slides. I have also seen several announcements by other vendors.

Next week there will be two back-to-back events: the Gartner EPS, the second of its kind, followed by the EPTS event processing symposium, which is the fourth of its kind but the first one after the formal EPTS launch. The first is for hearing analysts' reports and some other talks geared to customers; the second is a highly interactive, discussion-oriented meeting geared to the EP community. This is primarily for EPTS members, but we have invited some guests. The EPTS meeting will be video-taped by CITT, and will be posted on the Web, so additional people will have access to the material.

I'll blog more on these events next week. Looking forward to seeing those active in the community and also some new faces...

Monday, September 8, 2008

A footnote to the streamSQL paper

The comment that my good friend Claudi (AKA Pattern Storm) made in the complexevents forum made me curious to actually read this paper. Reading it, I had the uncomfortable feeling that people insist on using a language style that implies a certain type of thinking about event processing; this creates semantic problems, which they then try to solve by using the same type of thinking, with more complicated constructs.

I'll use one simple example taken from the paper, in which they had to deal with semantic problems that were caused by the language semantics.

The scenario (translated to my language - without the "streams") -- Events are reported about cars that move through some segment of the road; each event consists of

There are also simultaneous events, i.e. several events that happen in the same time unit (whatever the time granularity is). The inputs are events of this type; the output is, for each event, a derived event that includes the original attributes of the event and the average speed of cars in the same time unit. If you want to see the types of problems that the SQL implementers see in this simple example, read the streamSQL paper. Instead of discussing SQL, I would like to show an alternative way to think about the same problem.

The slide below shows an alternative way to think about this problem: a very simple EPN (Event Processing Network) that has two functional agents, one producer (e.g. an event emitter that creates events from a video stream produced by a camera that looks at the road) and one consumer (whoever wants to see the output events).

The two agents work under the same temporal context (it can be spatio-temporal if we also want to group by road segment). In this case, a temporal context is opened at the beginning and closed at the end of each time unit.

  • The raw event is called "car position event" and it goes to both agents.
  • The first agent is an aggregator which calculates (incrementally) the average; since it is bound to the context, the average is of events from the same time unit. At the end of the time unit it produces a single event, "speed-average-event", with the structure

  • The second agent is a "pattern detector" which takes two input events - the "car position event" again, and the derived event "speed-average-event"; the pattern that needs to be identified is AND, and the "speed-average-event" for that agent has a consumption policy of "reuse" (which means that an event can be used for multiple patterns). For each AND pattern, the agent produces a derived event, the "output-event", whose structure is:

This EPN does not involve "streams" - the thinking is "event oriented" and it attempts to provide natural thinking about event processing functionality.
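The EPN described above can be approximated by a minimal Python sketch. This is one possible reading under stated assumptions, not a definitive implementation: the attribute names (`car_id`, `speed`, `time_unit`, `average_speed`) and the class and function names are hypothetical, and the temporal context is simulated by closing a time unit as soon as an event from a later unit appears.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class CarPositionEvent:
    car_id: str     # hypothetical attribute
    speed: float    # hypothetical attribute
    time_unit: int  # identifies the temporal context the event belongs to


@dataclass(frozen=True)
class SpeedAverageEvent:
    time_unit: int
    average_speed: float


@dataclass(frozen=True)
class OutputEvent:
    car_id: str
    speed: float
    time_unit: int
    average_speed: float


class Aggregator:
    """Keeps a running sum and count per time unit (incremental average)."""

    def __init__(self) -> None:
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def on_event(self, e: CarPositionEvent) -> None:
        self._sum[e.time_unit] += e.speed
        self._count[e.time_unit] += 1

    def close_context(self, time_unit: int) -> SpeedAverageEvent:
        total = self._sum.pop(time_unit)
        count = self._count.pop(time_unit)
        return SpeedAverageEvent(time_unit, total / count)


class PatternDetector:
    """AND of a car position event and the average event of the same context."""

    def __init__(self) -> None:
        self._pending = defaultdict(list)

    def on_car_event(self, e: CarPositionEvent) -> None:
        self._pending[e.time_unit].append(e)

    def on_average_event(self, avg: SpeedAverageEvent) -> List[OutputEvent]:
        # "reuse" policy: the single average event is matched with every
        # car position event of the same time unit.
        cars = self._pending.pop(avg.time_unit, [])
        return [
            OutputEvent(c.car_id, c.speed, c.time_unit, avg.average_speed)
            for c in cars
        ]


def run_epn(raw_events: Iterable[CarPositionEvent]) -> List[OutputEvent]:
    aggregator, detector = Aggregator(), PatternDetector()
    outputs: List[OutputEvent] = []
    current = None
    # Sorting stands in for events arriving in time-unit order.
    for e in sorted(raw_events, key=lambda ev: ev.time_unit):
        if current is not None and e.time_unit != current:
            outputs.extend(detector.on_average_event(aggregator.close_context(current)))
        current = e.time_unit
        aggregator.on_event(e)    # the raw event goes to both agents
        detector.on_car_event(e)
    if current is not None:
        outputs.extend(detector.on_average_event(aggregator.close_context(current)))
    return outputs


results = run_epn([
    CarPositionEvent("a", 60.0, 1),
    CarPositionEvent("b", 80.0, 1),
    CarPositionEvent("c", 50.0, 2),
])
```

Each car position event yields exactly one output event carrying its own attributes plus the average of its time unit, without any notion of a stream or a join.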


1. This is a rather simple example; it can also be solved by putting the average speed event in a global state (or event store/database) and then enriching it back, but the event-oriented approach is closer to the spirit of the original example, which works on streams.

2. Aggregator and pattern detector are types of agents; there are some (not many) more types. Typically, an event processing network consists of multiple types of agents.

3. "Pattern Storm" claims that stream SQL ignores causality (he is using another scenario from the paper). One can view the relation between the input events and output events of the same agent as a causality relation, and this can be set while defining the EPN.

One general comment (not related to this posting) - to "anonymous" - I'll gladly answer your question if you'll send it back and identify yourself. I don't publish anonymous comments.

I can post the solutions to the rest of the examples in the stream SQL paper if anybody is interested...

Sunday, September 7, 2008

On AITC (Arab Israeli Technology Center)

Today I am blogging about an "off topic" issue -- an initiative that occupies part of my time -- AITC (Arab Israeli Technology Center). Some background: as you probably know, Israel is a high-tech country. Out of the 3.3 million people in the Israeli work-force, 200,000 work in the ICT area (AKA "high tech"), something around 6%. This is an impressive number, considering the fact that the salaries, and thus standards of living, in the high-tech industry are relatively high, and that the supply of high-tech professionals does not meet the demand. However, Israeli high-tech does not have even participation of all populations. There are two populations that are extremely under-represented. One is the ultra-orthodox Jewish population, parts of which choose to live in isolation and not seek employment outside their community; they are outside the scope of this discussion. The other under-represented population is the Arab population, which is around 20% of the population of Israel (I am talking about the Arab citizens of Israel, not the residents of the Palestinian territories). A survey that has been done revealed that the participation of the Arab population in Israeli high-tech is around 0.2% - a difference of two orders of magnitude. Why did it happen? Twenty years ago much of the high-tech was associated with the defense industry. While there still is a defense industry, most of the high-tech is now civilian; furthermore, many multinational companies - Intel, IBM, Oracle, HP, SAP, BMC, CA, Motorola, Microsoft, Google, to name a few - have substantial activity in Israel, and many of them have also acquired Israeli start-ups and continue to operate them as part of the bigger company. However, the image of high-tech being closely associated with the defense industries, which do not employ Arabs, is still pervasive in Arab society.
One of the notable Arab Israeli families decided to change this reality, and to launch a project that will first retrain people from the target population who already have academic degrees in various engineering, science or mathematics disciplines, later establish a private academic institute, and then also an incubator for start-ups. This family has recruited a known (and controversial) person in Israel, who is known for his ability to push things, as the President of this center, and he recruited me to be the Academic Director of this center (based on some bullets in my CV, I guess)... I have decided to take the offer (in my spare time... at least for now), in order to make a contribution to society. The first mission has been to recruit the support of the Israeli high-tech industry, and this has been relatively easy. We now have full support, including participation in an advisory committee, the ability to do projects, and willingness to recruit graduates of this program (well - if we prove that we meet the high standards of the industry). I also compiled a curriculum draft, which has gone through a major revision after consulting with the different companies (we cannot make everybody happy, but it seems that everybody is happy about something, and in the end this makes them happy). Besides the high-tech industry, various politicians are also showing involvement (I am not crazy about politicians in general)... and the Israeli president, Shimon Peres, is giving his sponsorship (which will earn the entire team a dinner with him in 2 months or so)...
Now there are preparations for the launch, and the actual studies are planned to start in February 2009. I am in a phase of recruiting teachers - a blend of people from academia and industry...
BTW - the picture above is of Nazareth, the biggest Arab town, where the center will be located.
How does it relate to event processing? Well - this launch is a big event, but like David Luckham's example of the stock market crash, it does not happen in a day, and it consists of many smaller events. So I'll blog in the future about some of these smaller events that will happen. More later; back to EP blogging.