Showing posts with label evnet processing science. Show all posts

Monday, September 8, 2008

A footnote to the streamSQL paper

The comment that my good friend Claudi (AKA Pattern Storm) made in the complexevents forum made me curious enough to actually read this paper. Reading it, I had the uncomfortable feeling that people insist on using a language style that implies a certain type of thinking about event processing; this creates semantic problems, which they then try to solve with the same type of thinking and more complicated constructs.


I'll use one simple example taken from the paper, in which the authors had to deal with semantic problems caused by the semantics of the language itself.



The scenario (translated to my language - without the "streams") -- Events are reported about cars that move through some segment of the road; each event consists of

There are also simultaneous events, i.e. several events that happen in the same time unit (whatever the time granularity is). The inputs are events of this type; the output is, for each event, a derived event that includes the original attributes of the event and the average speed of cars in the same time unit. If you want to see the types of problems that the SQL implementers see in this simple example, read the streamSQL paper. Instead of discussing SQL, I would like to show an alternative way to think about the same problem.

The slide below shows a very simple EPN (Event Processing Network) with two functional agents, one producer (e.g. an event emitter that creates events from a video stream produced by a camera that looks at the road) and one consumer (whoever wants to see the output events).




The two agents work under the same temporal context (it can be spatio-temporal if we also want to group by road segment) - in this case, a temporal context is opened at the beginning and closed at the end of each time unit.

  • The raw event is called "car position event" and it goes to both agents.
  • The first agent is an aggregator which calculates the average incrementally; since it is bound to the context, the average is over events from the same time unit. At the end of the time unit it produces a single event, "speed-average-event", with the structure

  • The second agent is a "pattern detector" which takes two input events - the "car position event" again, and the derived event "speed-average-event"; the pattern that needs to be identified is AND, and the "speed-average-event" has, for that agent, a consumption policy of "reuse" (which means that a single event can be used in multiple pattern matches). For each AND pattern detected, the agent produces a derived event, the "output-event", whose structure is:

This EPN does not involve "streams" - the thinking is "event oriented", and it attempts to provide a natural way of thinking about event processing functionality.
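
The two agents and their shared temporal context can be sketched in a few lines of Python. This is only an illustration, not the notation of any event processing product: the class names, the method names and the event attributes (car_id, speed, time_unit) are all assumptions, since the attribute lists are not reproduced in the text above. For brevity the sketch groups events by time unit up front instead of opening and closing contexts on a clock.

```python
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical event shapes: car_id, speed and time_unit are assumed names.
@dataclass(frozen=True)
class CarPositionEvent:
    car_id: str
    speed: float
    time_unit: int


@dataclass(frozen=True)
class SpeedAverageEvent:
    time_unit: int
    average_speed: float


@dataclass(frozen=True)
class OutputEvent:
    car_id: str
    speed: float
    time_unit: int
    average_speed: float


class Aggregator:
    """Keeps a running sum and count while the temporal context is open."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def on_event(self, event):
        self.total += event.speed
        self.count += 1

    def on_context_close(self, time_unit):
        # Emits the single derived event for this context.
        return SpeedAverageEvent(time_unit, self.total / self.count)


class PatternDetector:
    """AND of a car position event and the speed-average event."""

    def __init__(self):
        self.pending = []

    def on_event(self, event):
        self.pending.append(event)

    def on_average(self, avg):
        # "Reuse" policy: the same average event takes part in one AND
        # match per pending car position event.
        return [
            OutputEvent(e.car_id, e.speed, e.time_unit, avg.average_speed)
            for e in self.pending
        ]


def run_epn(events):
    """Routes every car position event to both agents, one context per time unit."""
    by_unit = defaultdict(list)
    for e in events:
        by_unit[e.time_unit].append(e)

    outputs = []
    for time_unit in sorted(by_unit):
        aggregator, detector = Aggregator(), PatternDetector()  # context opens
        for e in by_unit[time_unit]:
            aggregator.on_event(e)
            detector.on_event(e)
        avg = aggregator.on_context_close(time_unit)  # context closes
        outputs.extend(detector.on_average(avg))
    return outputs
```

Calling run_epn with two cars at speeds 50 and 70 in the same time unit yields two output events, each carrying an average speed of 60.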

Comments:

1. This is a rather simple example; it can also be solved by putting the average speed event in a global state (or event store/database) and then enriching the events from it - but the event-oriented solution is closer to the spirit of the original example, which works on streams.

2. Aggregator and pattern detector are types of agents; there are some (not many) more types. Typically, an event processing network consists of multiple types of agents.

3. "Pattern Storm" claims that stream SQL ignores causality (he is using another scenario from the paper). One can view the relation between the input events and the output events of the same agent as a causality relation, and this can be set while defining the EPN.
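
For comparison, the global-state alternative mentioned in comment 1 might look roughly like the sketch below: the averages are first written to a store keyed by time unit, and each event is then enriched from that store. The function name and the dictionary keys (car_id, speed, time_unit, average_speed) are assumptions made for illustration, and a plain dict stands in for the event store or database.

```python
from collections import defaultdict


def enrich_from_store(events):
    """Two passes: fill a store of averages, then enrich each event from it.

    `events` is a list of dicts with the assumed keys
    "car_id", "speed" and "time_unit".
    """
    # Pass 1: accumulate sum and count per time unit.
    sums = defaultdict(lambda: [0.0, 0])
    for e in events:
        entry = sums[e["time_unit"]]
        entry[0] += e["speed"]
        entry[1] += 1

    # The "global state": average speed per time unit.
    average_store = {t: total / n for t, (total, n) in sums.items()}

    # Pass 2: enrich every original event with the stored average.
    return [
        {**e, "average_speed": average_store[e["time_unit"]]}
        for e in events
    ]
```

The result is the same as with the two agents, but the events have to wait for the store to be filled, which is less in the spirit of processing events as they arrive.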

One general comment (not related to this posting) - to "anonymous" - I'll gladly answer your question if you send it again and identify yourself. I don't publish anonymous comments.

I can post solutions to the rest of the examples in the stream SQL paper if anybody is interested...

Thursday, April 24, 2008

On the science and engineering of event processing


This is holiday week here, and yesterday I drove about 2 hours south to the Weizmann Institute, a research institute that has a graduate school in some scientific disciplines - among them computer science - and is considered a great place for researchers who are good enough to be accepted and are satisfied with academic salaries... Anyway, the Weizmann Institute hosted a "science festival" for children yesterday; in the picture above you can see the main idea - showing scientific principles through games. Since there were several sites within the institute, the organizers provided air-conditioned buses (it was also an extremely hot day). However, when we arrived there was heavy crowding at the entrance station, and though there were 4 buses waiting, they loaded passengers sequentially -- all waited until one finished loading, and people wondered why they didn't load passengers in parallel - it seems that sometimes engineering is needed to augment science....
Talking about science, there is one country that asks you to state your profession when you fill in the "landing card" in the aircraft before landing: the United Kingdom. I always write my profession as scientist; this is a matter of self-identity, but more than that, it is also a way of life - at the risk of generalizing, I would say that engineers think by induction, while scientists think by deduction. At the NGITS 1993 conference that we held in Israel, in one of the discussions, John Mylopoulos said: "the distinction between the Artificial Intelligence and Database disciplines is that AI is science, while DB is engineering". Of course, the database guys did not like it.
Well - I also wanted to tie the science / engineering issue to "event processing". As typically happens in new areas, while this area has some science origins, the first generation is the engineering era - different vendors came up with implementations that attempted to solve various problems, and the thinking is very much centered on the product one is trying to sell -- thus, if a customer's requirement is not easy to implement, the typical reaction is to do ad-hoc hacking around it; I know from personal experience, having been there a couple of times with different products. Engineering solutions are inductive, sometimes based on induction with N = 1.
The engineering approach is typically the first wave -- I often like to use the analogy of databases in the 1960s.
However, a maturing discipline also needs science, which means looking beyond (maybe behind) the engineering -- getting back to the fundamentals and coming up with a model (like the relational model in databases -- but not really an extension of the relational model, whose purpose is much different). Getting the science part in place will be a vital part of the discipline's maturing; however, this is a longer-term effort. The 2nd generation of event processing products will be more incremental on top of the first one, and still engineering oriented. More about the science of event processing in later posts.