Saturday, February 28, 2009

On fusion confusion and infusion

This is a picture of Akko (AKA Acre), an ancient walled city with a typical Middle Eastern market, the smell of spices, and a fishermen's harbor. I am currently hosting some visitors from Germany, Rainer von Ammon and his colleagues from CITT, to discuss some collaboration topics, including a consortium for an EU project that we are establishing with other partners. Unfortunately, they chose to arrive in the rainiest period we have had this year, so we could not do much sightseeing today; however, we managed to catch a two-hour break in the rain to stroll around the old city of Akko.

I'll get back to the discussion about the questions I posed yesterday soon, as I would like to see if more people want to react before stating my opinion (I am in learning mode...).

Today I'll write something about information fusion and its relationship to event processing; I came across a recent survey article about data fusion in ACM Computing Surveys.

There are various kinds of fusion (data fusion, information fusion, and sensor fusion), and all of them are intended to take information from distinct sources, blend it together, and understand what has happened. A very simple example of sensor fusion is in traffic monitoring: one sensor measures the speed of a car, and a camera takes pictures of the car and its license plate. Fusing the two identifies the fact that a certain car has violated the speed limit. This requires some basic image processing, but it is quite easy to determine what happened. In the area of military intelligence it is much more complicated to understand what happened, what is happening, and what is going to happen, and a variety of techniques are used. The Center for Multisource Information Fusion at the University at Buffalo maintains a site with a bibliography on fusion issues, including tutorials and their proposals to modify the relatively old JDL model, so you can find much more information there.
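To make the traffic example concrete, here is a minimal Python sketch of that kind of fusion. It is only an illustration: the names `SpeedReading`, `PlateReading`, and `fuse_violations`, the 90 km/h limit, and the half-second matching tolerance are all assumptions of mine, and the plate is assumed to have already been extracted from the picture by some image processing step.

```python
from dataclasses import dataclass


@dataclass
class SpeedReading:
    timestamp: float   # seconds since some common clock origin (assumed)
    speed_kmh: float


@dataclass
class PlateReading:
    timestamp: float   # same clock as the speed sensor (assumed)
    plate: str         # plate text already recognized from the picture


def fuse_violations(speeds, plates, limit_kmh=90.0, max_gap_s=0.5):
    """Pair each over-limit speed reading with the plate reading closest in time."""
    violations = []
    for s in speeds:
        if s.speed_kmh <= limit_kmh:
            continue  # not a violation, nothing to fuse
        # Only camera readings taken within the tolerance can belong to this car.
        candidates = [p for p in plates
                      if abs(p.timestamp - s.timestamp) <= max_gap_s]
        if not candidates:
            continue  # no picture to attribute the speed to
        nearest = min(candidates, key=lambda p: abs(p.timestamp - s.timestamp))
        violations.append({
            "plate": nearest.plate,
            "speed_kmh": s.speed_kmh,
            "timestamp": s.timestamp,
        })
    return violations
```

Matching on time alone only works in a case this simple; once several cars pass within the tolerance, the two sources can disagree, and that is where the harder fusion problems begin.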

So where is the confusion? There are people who confuse event processing with other, different areas. Somebody at IBM who saw an illustration of an event processing network once tried to convince me that we are reinventing workflows; some data management people think that event processing is just a footnote to existing query processing. Everyone with a hammer looks at the rest of the world as a bunch of nails.
Likewise, there are people who confuse fusion with event processing.

So what is the infusion? The fact of the matter is that information fusion and event processing are complementary technologies. The goal of fusion is to determine what happened, i.e. to determine which event has occurred. Event processing handles the event after somebody has determined that it happened, and it has multiple goals. The techniques are also different: fusion uses conflict resolution techniques and stochastic modeling, while event processing uses pattern matching, transformation, aggregation, etc. Thus an event can be created using fusion techniques and then processed using an event processing system; this is the infusion.
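As a rough illustration of the event processing side, the Python sketch below takes events that some fusion step has already established (here, speeding violations with a plate and a timestamp) and applies a simple pattern over them: the same plate reaching a number of violations within a time window. The function name `repeat_offenders`, the event fields, the threshold of three, and the one-hour window are hypothetical choices made only for this example, not the behavior of any particular product.

```python
from collections import defaultdict, deque


def repeat_offenders(events, threshold=3, window_s=3600.0):
    """Derive a 'repeat_offender' event whenever one plate accumulates
    at least `threshold` violations within `window_s` seconds."""
    recent = defaultdict(deque)   # plate -> timestamps still inside the window
    derived = []
    for ev in sorted(events, key=lambda e: e["timestamp"]):
        q = recent[ev["plate"]]
        q.append(ev["timestamp"])
        # Expire timestamps that have fallen out of the sliding window.
        while q and ev["timestamp"] - q[0] > window_s:
            q.popleft()
        if len(q) >= threshold:
            derived.append({
                "type": "repeat_offender",
                "plate": ev["plate"],
                "count": len(q),
                "timestamp": ev["timestamp"],
            })
    return derived
```

Note that nothing here asks whether the violations really happened; that question is taken as settled by the time the events arrive, which is the division of labor described above.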

However, there are also potential synergies between these two technologies. The most obvious one is a partnership in which fusion technology serves as a preprocessor that produces events for event processing; this can be beneficial for certain applications. Another type of synergy is that techniques used in fusion can be used in event processing and vice versa. This is an interesting direction to investigate further, along with possible real applications for it. More on this later.


Hans said...

Some types of fusion (sensor fusion) are about models and techniques for minimizing this or that kind of error in data derived from heterogeneous sources. Many CEP products (a subset of the overall EP landscape) are used today to implement these techniques. So that seems like another way that these things are complementary.

Opher Etzion said...

Hello Hans.

There are indeed some EP products that are used for deduplication and other data cleansing applications; however, the overlap with EP is very partial. Other technologies are more common for these applications, and they typically process the data in batch.



Hans said...

Well, there is more than deduplicating and data cleansing happening with EP products. Many of the probabilistic techniques discussed under fusion are used in finance, but under different names or slightly modified. There are plenty of reasonably sophisticated incremental or frequently-running-batch (time window based) algorithms in use.

I do not know which of these you would consider to be true EP, but let's forget about that for a minute.

I was trying to say that, to the extent that a product makes it easier to implement and run algorithms over streaming data, it can help people who want to do this. This sounds different from the case you describe, where fusion is a technology that can be used with EP. There might be cases where the cooperation works the other way around, where EP is used to help implement fusion techniques.