Saturday, August 27, 2011

Siddhi - an open source event processing engine from Sri Lanka


According to Wikipedia, Siddhi translates to: perfection, accomplishment, or unusual skill.   In the picture we can see the eight primary Siddhis.   Siddhi is also the name of an open source event processing engine, recently announced.   From the basic description, it is aimed at supporting event processing in stand-alone applications, and at experimenting with optimization algorithms for the run-time engine.   Siddhi comes from the University of Moratuwa in Sri Lanka, a university that declares an ambitious mission: "to be the most globally recognized knowledge enterprise in Asia".   I like ambitious goals, and releasing open source projects like this can probably contribute to that recognition -- of course, if it gains traction.


I know of event processing open source projects and products coming from the USA, UK, Germany, Austria, Sweden, Israel, and Australia, and I probably miss some (I will be happy to update this list).   So now Sri Lanka joins the list.   We have been seeing more activity in Asia in recent years, and I hope to see more. 

On streams, events, programming-in-the-large and programming-in-the-small


In the tutorial I gave at VLDB 2010, one of the first slides posed a rhetorical question - see above. 
There are four opinions:

  1. Some people think these are aliases.
  2. Some people view stream processing as a subset of event processing that deals with ordered events.
  3. Some people view event processing as a subset of stream processing, saying that an event stream is one type of stream, alongside other types of data streams such as voice streams and video streams.
  4. Some people think these two are actually totally different concepts, relating to different types of applications.

There is some truth in each of them, under some interpretations, but IMHO none of the above is really true.


Curt Monash decided to renew the old terminology discussion on his blog, taking the "stream" approach favored by the database people, who look at "data streams" as data in motion, and view events as a type of data that does not need any special handling.


The difference of opinions and terminology stems from the fact that some people are thinking about apples and some about oranges.    


What is the apple? -- Let's take as an example S4 from Yahoo Labs.  In the blog post I referenced here, I mentioned that S4 is a platform for doing "programming-in-the-large" for stream processing.  What does that mean? -- it supports a data flow graph, where streams flow along the graph's edges, and the processing logic is embedded in the graph's nodes.  How is this logic implemented?  That is not part of the model; each developer can use the platform and implement the nodes, while the platform takes care of the flow and some non-functional properties (distribution, fault tolerance, cluster management, scalability in some aspects, etc.).   
It is a pure programming-in-the-large framework.  There are others like it; in this case the model is blind to the type of stream, and the stream can indeed be a video stream, voice stream, etc.   I would call such a framework "stream processing".
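To make the apple concrete, here is a minimal sketch of such a platform in Python. The API is hypothetical (it is not S4's actual interface); the point is that the platform only routes items along the graph's edges, while each node's logic is an opaque, user-supplied function:

```python
# Minimal sketch of a "programming-in-the-large" stream platform
# (hypothetical API, not S4's actual interface): the platform routes
# items along graph edges; the node logic is opaque to it.

class Node:
    def __init__(self, process):
        self.process = process        # user-supplied logic, opaque to the platform
        self.downstream = []          # outgoing edges of the data flow graph

    def emit(self, item):
        for out in self.process(item):     # a node may emit zero or more items
            for nxt in self.downstream:
                nxt.emit(out)

class Sink(Node):
    """Terminal node that just collects what reaches it."""
    def __init__(self):
        super().__init__(lambda x: [])
        self.received = []

    def emit(self, item):
        self.received.append(item)

# The platform is blind to what flows: items could be events,
# audio frames, video frames, or anything else.
source = Node(lambda x: [x])
double = Node(lambda x: [x * 2])
sink = Sink()
source.downstream.append(double)
double.downstream.append(sink)

for item in [1, 2, 3]:
    source.emit(item)

print(sink.received)  # [2, 4, 6]
```

A real platform would add the non-functional pieces mentioned above (distribution, fault tolerance, cluster management); the graph-plus-opaque-nodes shape is the essence.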


What is the orange? -- If we look at the abstract model of event processing, the way we defined it in the EPIA book, it is a model centered around programming-in-the-small, with language primitives that relate to the semantics of events: mainly the notion of context (when? where? to whom?) of events, and patterns over multiple event occurrences.   The orange does not sound at all like the apple.
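To make the orange concrete as well, here is a minimal Python sketch of programming-in-the-small (hypothetical code, not any specific EP language): a pattern over multiple event occurrences, scoped by a temporal context window:

```python
# Sketch of "programming-in-the-small" event processing (hypothetical,
# not any specific EP language): detecting a sequence pattern over
# multiple event occurrences, scoped by a temporal context window.

from collections import deque

def detect_sequence(events, first_type, second_type, window):
    """Detect occurrences of first_type followed by second_type
    within `window` time units (the temporal context)."""
    pending = deque()   # open first_type occurrences awaiting a match
    matches = []
    for etype, ts in events:            # events as (type, timestamp) pairs
        # expire occurrences that fell out of the context window
        while pending and ts - pending[0] > window:
            pending.popleft()
        if etype == first_type:
            pending.append(ts)
        elif etype == second_type and pending:
            matches.append((pending.popleft(), ts))
    return matches

stream = [("A", 1), ("B", 2), ("A", 5), ("B", 20)]
print(detect_sequence(stream, "A", "B", window=5))  # [(1, 2)]
```

Note that the primitives here are about event semantics (occurrence types, timestamps, a context window), not about how the items are transported -- exactly the opposite emphasis from the stream-platform sketch.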


Can something be both apple and orange? -- The answer is positive.   While event processing can be implemented using various "programming-in-the-large" models, we advocate the "event flow" one, and the "event processing network" can be mapped to the data-flow graph model of streams.   So it is possible, but not necessary, to implement event processing as a kind of stream processing.   It turns out that there are some benefits to doing so, and this indeed seems to be becoming the dominant way of "programming-in-the-large", while the programming-in-the-small is still based on the semantics of events.   


The viewpoint is always the hammer and nail issue.   Those who have the stream processing "programming-in-the-large" see event processing as just an application of their platform, and think that the platform is the main thing.   Those who have an event processing language view the semantics and functionality of the language as the main thing, and the platform as a facilitator.  

The two do not completely overlap: in stream processing one can add a node dealing with audio processing, for which an event processing language might be of little value; likewise, there are implementations of event processing that are based on other programming-in-the-large models (such as a logic programming framework) and not on the stream model.  



Looking at the current state-of-the-art, we see that many implementations indeed lie in the intersection of the two, thus each side can classify them its own way.    The fact that most classify them as event processing may show where the market thinks the value is.  

Friday, August 26, 2011

Doing what one is inspected to do instead of what one is expected to do


Lou Gerstner, the now mythological CEO of IBM whom many still miss, wrote an excellent book about his tenure as IBM's CEO, much of it dedicated to his organizational culture battles.    The most famous quote from Gerstner's book is: people do what they are inspected to do, not what they are expected to do. 


This is indeed true in IBM to this day, but IBM is just a reflection of Western culture, in which everything is weighed and measured.  The metrics are intended to achieve some goals, but take on a life of their own, often losing close contact with the original goals.   Furthermore, people find creative ways to satisfy the metrics in ways that have nothing to do with the original goals, since satisfying the metrics becomes the goal.


Some examples:  In Israel, traffic police officers are measured by the quantity of traffic violation tickets they get credit for.  I remember, many years ago, committing a traffic violation (I turned left where turning left was forbidden between 7-9am; it was indeed a few minutes before 9am).  A police car that was quite far away saw me and started a movie-style chase: the police car not only turned the forbidden way, but traveled on the opposite side of the road, forcing traffic to the side, and drove in a way that was very dangerous to traffic, and when they caught me they wrote on the traffic violation ticket "credit: XXX(name)".    I hope the reward for the credit was worth the effort! 

Other examples:  One of the previous "legal advisers to the government", a position that in Israel includes, among other things, heading the state prosecution, wrote in his autobiography that he had to remind the prosecutors again and again that their goal is to ensure justice; however, since they are measured by the percentage of convictions, once they have decided to prosecute a person they cannot back off, as it would spoil the statistics.

More examples from the education system:   I once talked with somebody who dealt with admission of graduate students at one of the best business schools in the USA.  He told me that sometimes they miss students whom some indications (and faculty members) suggest could be exceptional, but whose GMAT scores are not that great; it is virtually impossible to pass these candidates through the admission committee, since it would spoil the statistics of minimum and average scores of the admitted students, and this is a metric against which the business school measures itself relative to other business schools.

More from the educational system:   My long-term observation is that for many students the goal is to maximize the GPA, and gaining useful knowledge becomes secondary; thus the popularity of elective courses is often determined by past statistics about grade distributions.  One of my students had a special talent for knowing how to study for exams, getting very high grades, and not remembering much a week later.   According to the metrics he won -- he was on the president's list.  This talent still helps him in his life.   
Getting to the business world, the fact that corporations traded on the stock market are measured quarterly by Wall Street analysts on an accounting measure (earnings per share) has a major impact on how the business world behaves:  first, considerations are often short term (due to the quarterly metric), and second, accounting thinking is not always consistent with economic thinking; furthermore, it enforces a low common denominator on the behavior of the business world and reduces variance in the goals and natures of companies.
While metrics are not necessarily bad,  there are two observations about them:

  1. There should be constant check whether the metrics still reflect the goals, and whether those who satisfy the metric do it in a way consistent with the goals - and adjust the metrics (or the goals) to avoid misuse (same kind of thinking as "fraud detection").
  2. There are cases in which goals are not translated well to metrics, or there are exceptions that are consistent with the goals, but not with the metrics -- people should be brave enough to stick to the goals and ignore the metrics (or handle them later).  


I started with a quote from Lou Gerstner, which I see as one of the cornerstones of my own behavior.  Over the years I have had several "motivation talks" with people, instrumented by metrics, advising me how to behave.  I'll end with another phrase I like, which I use with these people:  "My behavior is driven by a compass and not by a weather vane".



Wednesday, August 24, 2011

A video clip from NEXT - demonstrating the proactive idea

The credit for this idea goes to my colleague Zohar Feldman.


A good way to illustrate the proactive idea is to watch the video clip from the movie NEXT,


In this movie Nicolas Cage plays a person who can see two minutes into the future.  In the clip he tries different alternatives for approaching a woman he is pursuing; understanding why each attempt fails, he improves until success is achieved.     This illustrates the proactive idea:  events happen, their consequences are predicted, and a decision on what to do in order to achieve some goal should be taken.  There might be different possible decisions, but we need to predict the consequences of these decisions and make the best one that will get us to the desired outcome.     Enjoy!

Tuesday, August 23, 2011

On flood prediction from IBM Research




There have been some highly visible floods in recent years, like the one in Australia.    Prediction of the course of floods can be tricky, since rivers can have multiple splits.  Today IBM announced that IBM Research, together with UT Austin, has built a model to predict the course of flooding.   This combines IBM analytics research with UT research in the physics of rivers.


Note that disaster management is an area where proactive computing holds significant potential; however, in order to realize this potential, strong prediction abilities are required.   

Additional presentation - talking about AI and EP

To complete the presentations list -- another interesting presentation from RuleML'11 (the first of two), by Alex Artikis and Nenad Stojanovic, discussing AI approaches to situation detection and remaining challenges.

Monday, August 22, 2011

Some EP related slides and videos on the Web


With the help of Google Alerts, I have found some recent presentations that relate to relatively old stuff, but have some interest nevertheless.


Surprisingly, somebody posted a presentation about the RFID/event processing panel from the first EPTS symposium in March 2006 -- relatively old stuff, but interesting even today. 


A video describing the CICS event system has been posted on the IBM Education Assistance server.  This is a system, released in 2009, that instruments CICS to act as an event producer.  


A presentation by Informatica on "Data Governance", where event processing is one of the enabling technologies. 


Last, but not least, an academic presentation from INRIA with the interesting title "undoing event driven adaptation of business processes". The terminology here is somewhat weird, but it relates to the retraction issue, and proposes the use of EP for doing that.


There are also some presentations from DEBS 2011, on the DEBS website, hope that more presentations will be posted.