Tuesday, December 30, 2008

On single application vs. general event processing software - the network and system management case

There are people who like working in coffee shops; in my academic days I had a colleague at U.C. Berkeley who liked to work in one of the coffee shops that surround the beautiful campus. I usually work at home or in my office (where I sometimes spend 13 hours a day), but today I spent the morning in a coffee shop called "Grand Cafe", well - not quite the famous one in Paris seen in the picture, but a much smaller one bearing the same name in Haifa. I am not sure that it was cost-effective for the coffee shop, since I ordered one mug of "upside down" coffee, which is the Hebrew name for a coffee in which the milk is put into the mug first and the coffee later. Anyway - I spent the time writing a paper, and found that the coffee shop setting actually makes me more productive than my office... I will try again to see if this is consistent...

I have also talked recently with somebody who works for one of the NSM (network and system management) vendors; the discussion was about my favorite topic -- event processing, and why the NSM companies, which deal with events as their primary occupation, did not really try to look at "event processing" as a more general discipline and extend it to more types of applications. The answers I got from him reminded me of two things --- one from the far past, and one from the near past.

From the far past I recall that when I was assigned as a programmer to the Israeli Air-Force IT unit, at the age of 18.5, I got my first assignment and very enthusiastically wrote that program (in those days -- in PL/1..). Then I got my second assignment. I read it carefully, and then went to my division commander and told him that the second program was quite similar to the first one; the changes were in some details -- somewhat different input, somewhat different validity checks, somewhat different calculations, but the same type of thinking. So instead of writing this program, I suggested that we think bigger, look at this class of programs, and try to write a "general parametric program" which can express each of the individual programs as a set of arguments to the general program. My division commander listened to me with a lot of patience and then said: great idea, however, our mission is to write certain kinds of applications for the air-force, and not to invent general programs. You may talk with the guys in the "chief programmer" department (what today we would call CTO), they are in charge of generic software. I was just a beginner and he was a Captain, and the commander, so I abandoned these ideas, but pursued them later in life, as I was always under the impression that people program the same things again and again, and that the level of abstraction in programming should be higher.

So - as you can guess from that story, the NSM guy just told me: our aim has been to build network and system management software, and not generic software.

I also remembered that there was some discussion on the Web on the same topic, and found on David Luckham's site an article entitled "an answer from the engine room of the industry". The question that David phrases is:
I have often asked why the network monitoring applications that were developed in the late 1980’s and early 1990’s didn’t get extended to apply to business level events at about the same time.

The question is answered by Tom Bishop, the CTO of BMC, who is seen here in the picture.

I met Tom Bishop in the late 1990s, after IBM had acquired Tivoli (an NSM vendor), and we in IBM Haifa had a project with Tivoli; Tom came across as an impressive, big guy with a Texan accent. Now he is BMC's CTO. In his reply to David Luckham he gives roughly the same answer, in three parts (quoting Tom Bishop):

  1. When the architects for these products were building them, they weren't actually thinking of the broadest applications for the types of systems they were trying to build, but were really focused on solving a very specific problem;
  2. As we know all too well, often the correct way to solve a problem is to find the most general description of the problem and then assume that, if you've done your job correctly, the specific solution can be described as an instance or subset of the more general problem. But this only works if you know to set your sights high enough. In the environment you note above, this didn't happen.
  3. The people who buy IT management solutions don't care if the solutions they buy might also be used to solve a business activity monitoring solution, and the people who buy business activity monitoring solutions don't care if the solutions they buy might also be used to solve an IT management solution. In fact, these two groups of people almost never talk to each other!

This all revolves around the same phenomenon --- there is a big difference between hard-coding an application doing X, and building a generic program of which the application doing X is an instance. Furthermore, the fact that there are various hard-coded applications doing variations of X may help with requirements, but does not mean that it gets us closer to a generic application - since the level of abstraction is probably wrong.
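
To make the distinction a bit more concrete, here is a small sketch in Python. It is only an illustration under my own assumptions: the event fields, the thresholds and all the names (link_down_alerts, Rule, run_rules) are hypothetical and are not taken from any NSM or event processing product. The first function is a hard-coded application doing X; the second part lifts the domain-specific pieces into parameters, so that the network alert and a business-level alert both become instances of the same generic program.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List

Event = Dict[str, Any]  # hypothetical event shape: a plain dictionary


# Hard-coded application: the event type, the threshold and the
# alert text are all baked into the program.
def link_down_alerts(events: Iterable[Event]) -> List[str]:
    alerts = []
    for e in events:
        if e["type"] == "link_status" and e["packet_loss"] > 0.5:
            alerts.append(f"network alert: {e['source']}")
    return alerts


# Generic program: the same control flow, with the domain-specific
# parts (which events, which condition, which output) as parameters.
@dataclass
class Rule:
    event_type: str
    condition: Callable[[Event], bool]
    describe: Callable[[Event], str]


def run_rules(rules: List[Rule], events: Iterable[Event]) -> List[str]:
    alerts = []
    for e in events:
        for r in rules:
            if e["type"] == r.event_type and r.condition(e):
                alerts.append(r.describe(e))
    return alerts


# The network case is now one instance of the generic program...
nsm_rule = Rule(
    event_type="link_status",
    condition=lambda e: e["packet_loss"] > 0.5,
    describe=lambda e: f"network alert: {e['source']}",
)

# ...and a business-level case is another instance, with no new code.
business_rule = Rule(
    event_type="order",
    condition=lambda e: e["amount"] > 10_000,
    describe=lambda e: f"business alert: {e['source']}",
)

events = [
    {"type": "link_status", "source": "router-7", "packet_loss": 0.8},
    {"type": "order", "source": "store-3", "amount": 25_000},
    {"type": "link_status", "source": "router-2", "packet_loss": 0.1},
]

all_alerts = run_rules([nsm_rule, business_rule], events)
```

Note that getting from the first form to the second is not just a matter of collecting several hard-coded variants; somebody has to decide what the right parameters are, which is exactly the level-of-abstraction question.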

I guess that if generic event processing software had existed when the NSM software was built, the NSM vendors would have used it instead of re-inventing the wheel, the same way they used existing databases, existing communication networks, etc.

Event processing as a discipline is about creating generic software. My personal prediction: NSM vendors will gradually merge into the event processing community.

More - Later


Richard Veryard said...

Why does one often work more productively in cafes than in the office? Or for that matter at home?

Less connectivity means fewer new events, hence the ability to focus one's thought-power more effectively on a previously selected set of events. I wonder whether this statement has any broader implications for enterprise-scale event processing?

Opher Etzion said...

Hi Richard. You are correct in your observation that in a cafe there are fewer interruptions, although I was connected to the internet and had my cellphone with me; I also think that a change of environment helps productivity more than staying in the same place. The lesson for enterprise-scale event processing may be that a human should be exposed to relatively few events (although the computer can process many). I'll think further about this issue...