Thursday, June 18, 2009

More on generic vs. specific event processing



Over the last few days I spent part of my time at the NGITS 2009 conference, which was hosted by the IBM Haifa Research Lab. Today the keynote speaker was Alon Halevy, whose picture you can see above. Alon gave an excellent talk about "searching the structured Web". As a database person who joined Google 4 years ago, he discovered that many of the things he knew from databases were not valid at Google, and from a data management point of view he went several steps forward; his challenge was how to process structured data that is hidden in the "deep Web" behind web forms and interfaces. This reminded me of one of my colleagues in the Israeli Air Force, who spent a year of his life, more or less (almost around the clock), writing a proprietary database system based on low-level I/O operations for some logistics process (I think it was an automatic warehouse). He wrote his own version of concurrency control, query capability, recovery, and many other things. He thought at that time that the DBMS products available in the market (this was 1978, I think) were not good enough, and that using them would be a step backwards relative to what he had in mind. I have written in the past about a single application vs. a more general one in the context of why network and system management guys have not developed more general event processing products.

I have been asked several times what is really the main issue behind all the work I am doing (along with many others) on "event processing" as a discipline. What is the new thing? People have processed events forever. People also processed data long before DBMSs were introduced, but the way that my old colleague worked was not scalable: not many people had the skills to do it, and it was not a cost-effective way to work.

Likewise, there were and still are many event processing applications of all types, colors and sizes, that are developed in an ad-hoc fashion, as there are still applications that process data that do not use DBMS products, because some aspect does not make it feasible or cost-effective.
However, it is clear that the DBMS area has made a tremendous contribution to IT and to business in the last few decades.

My own goal around event processing is to make it pervasive as part of enterprise computing, and this will be achieved by generic software. Many of the database techniques were developed due to the need for genericity. Take query optimization as an example: if one writes each query by hand, it is not needed, since every developer can write optimized ad-hoc queries; the requirement for query optimization came from the fact that a generic query language is being used for many purposes.

Personally, my interest is not in building "complex systems" (as the readers of my Blog know, I tend not to use the "CEP" acronym, due to the ambiguity of the term "complex") or one-of-a-kind systems. I think that generic event processing systems will enrich their functionality over time; my interest is to make them pervasive. The first generation of products went some steps in this direction, and the next generations will do more. I have presented several times, in various places, my view of the next generation of event processing, which refers to the challenges on the way to doing it properly.

This topic will be discussed later this year in the 5th EPTS event processing symposium and the event processing Dagstuhl seminar planned for May 2010.

Monday, June 15, 2009

On Intellectual Integrity and honesty



The common denominator between the two gentlemen whose pictures you can see above (besides being philosophers whose names start with S) is that they both talked about intellectual integrity and honesty. I got a lot of responses (most by Email and not as comments to the Blog) to my post on positive thinking, and decided to have another off-topic posting on a related issue.

A few weeks ago I was at the large conference that IBM holds for its customers (WebSphere Impact). Besides meeting customers, this is an opportunity to meet other IBMers from all over. I had a corridor meeting with a senior person in IBM, whom I had never met in person before, and he told me that he had heard about me, and that I have a reputation for standing firm for what I believe in, even if I swim against the current. Well, at least I have a reputation for something. Some people appreciate such behavior, and some do not.

But - instead of telling about myself, I'll tell a related story. In my long university teaching career, I had a lot of teaching assistants.

One of the teaching assistants, whom we shall call TA1, once looked at the slides I was going to present in my class and said that they were not consistent with what he had told the students in the recitation. It turned out that he had got some (not very important) detail wrong. He asked me to change my slides and talk about this in a fuzzy way, so that it would not be inconsistent with what he had told the students. I suggested that he just tell them that he had a minor correction to the last recitation, but he refused, saying that this would harm his professional authority with the students; in short, he said he needed to "save face". I did not like his answer and did not change my presentation. I don't know what happened; maybe the students did not notice. I am actually allergic to the term "save face". I have seen a lot of bad things done for the sake of saving face.



Another teaching assistant, whom we shall call TA2, once asked me for 5 minutes at the beginning of my class, since he wanted to clarify something about the recitation. I thought that he wanted to clarify something about an assignment he had given to the students, but, to my surprise, he said: in the last recitation I made a mistake, here is the mistake I made, and here is the correct version; please copy it so you will not be confused, and I apologize for the mistake. At that point I thanked him and started the class with a discussion with the students about intellectual honesty and its importance. Guess who succeeded better in life, TA1 or TA2? You probably guessed right.
Intellectual honesty often does not pay off; there is something in the culture that prefers other values.

I started by saying that I got many responses to the "positive thinking" posting. I got a cool one today from Septimiu Nechifor, a Blog reader from Romania, who sent me the ultimate response to Kipling's IF -- any relationship to today's posting is just a coincidence...



ANTI - IF by Kostas Varnalis

If you can fool yourself when someone else hits you,
Pretending him a wise man and never you blame him;
If you don't trust nobody and no one's trusting you,
Forgive your sin is easy, but never others' sins.

If you respite the evil no single moment and
You lie louder and louder when other people lie;
If you're enjoying hitting the love with hate and though
Pretend to have a wise and even saint good side.

If you move like a worm and never fly with dreams,
And interest is always up to your highest aim;
If you leave the defeated for the winner all times,
Though both of them betraying is your standing desire.


If you can gain a thousand for every little gift
And mother land to play at cards is not a heavy act;
If you don't pay a penny as duty you have made,
As for being you paid is always right and fair.


If you can urge your thinking and heart and even nerves,
Ill - old all them to make some new and evil acts;
And indecision bowing low deserves from you as serve,
When all bawl "Forward!" you're the only crying "Back!".


If you reject the evil no single time or plot
And in its shadow feel like in a saint life tree shade;
Yours will be whole Earth and all its gifts and mines,
You'll be the first of masters, but never MAN, be sure!





Saturday, June 13, 2009

On CICS event processing


Recently, IBM announced "CICS Event Processing". A full presentation explaining what it is about is available on the Web. As I have written many times in the past, the processing of events is just one part of a bigger picture that includes producing the events before the processing, routing the events to the right processing elements, and consuming the events by consumers. In some cases, devising the event processing application is the easier part of the work, and the more difficult part is connecting it to the rest of the world. Since a substantial amount of the world's transactions go through CICS, which is a rather old but still alive and kicking transaction processor, it makes good sense to use it as a place for instrumentation and to emit events that can be sent either to further processing or directly to a consumer or a dashboard. The event processing part of CICS performs simple and mediated event processing, e.g. filtering, transformation, enrichment and routing. For pattern matching it sends the event to an event processing engine. I think that we'll see more of this producer-side event processing support, which will reduce the need to write ad-hoc adapters and make event processing more cost-effective to use. We'll also see the complementary part, the consumer side, on which I'll write at a later date.
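To make the four "simple and mediated" operations concrete, here is a minimal sketch in plain Python. It does not use or imitate any CICS interface; the event fields, thresholds, branch table and destination names are all hypothetical, chosen only to show what filtering, transformation, enrichment and routing each do to an event at the producer side.

```python
# Hypothetical reference data used for enrichment (not part of any CICS interface).
BRANCH_NAMES = {"B01": "Haifa", "B02": "Tel Aviv"}


def process_at_producer(event: dict):
    """Return (destination, outgoing_event), or None if the event is filtered out."""
    # Filtering: drop events that no consumer is interested in.
    if event["amount_cents"] < 100_000:
        return None

    # Transformation: keep only the needed fields and convert units.
    out = {"account": event["account"], "amount": event["amount_cents"] / 100}

    # Enrichment: add information that the raw event does not carry.
    out["branch_name"] = BRANCH_NAMES.get(event["branch"], "unknown")

    # Routing: large amounts go to a pattern-matching engine, the rest to a dashboard.
    destination = "pattern-engine" if out["amount"] >= 10_000 else "dashboard"
    return destination, out


if __name__ == "__main__":
    raw = {"account": "A1", "branch": "B01", "amount_cents": 250_000}
    print(process_at_producer(raw))
```

Anything that needs to look at more than one event at a time (pattern matching) is deliberately left out of the sketch, since in the description above that work is handed to a separate event processing engine.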

Thursday, June 11, 2009

On some authors' dilemmas


This is an illustration of the "prisoner's dilemma", a well-known concept from game theory. These days I am facing some smaller-scale dilemmas that do not involve decisions about imprisonment, but relate to the book that I am writing together with Peter Niblett - Event Processing in Action.



After we completed the first 1/3 of the book, the draft was sent to many "anonymous reviewers". Yesterday I got the results from 11 reviewers. They generally liked the draft, and some of them made comments that demonstrate some of the dilemmas of writing such a book. The dilemma stems from the fact that the target audience is not monolithic, and this is evident from the variety of opinions. This reminds me that many years ago I taught a basic first-year ("freshmen") university course, and in this course there was a lab in which the teaching assistants taught the students some products (I think it was MS-Access). One of the teaching assistants told me that one student asked him what version of the software should be used, and another student asked him how to insert the floppy disk (remember?) into the disk drive. It was difficult to teach such a heterogeneous audience. In the book's target audience there is less polarization, but there is still variety: people who are part of the "event processing community", people who are somewhat familiar with event processing (or think they are familiar), and people who don't know if event is written with "v" or with "w"... The target audience is further segmented into: developers who are interested only in the technical side, system architects / designers who are interested in understanding principles, and students or newcomers to the area who are interested in studying it. One of the facets of this diversity is that one reviewer wrote that we should write more about the business motivation, since the most important thing is to explain to decision makers what the value of event processing to the enterprise is, while another reviewer thought that the introduction is boring and that we should move directly to chapter 3, which starts with the technical stuff.
The way we chose is to have one introduction chapter that provides an overview, gives ten different examples that represent different types of event processing applications, explains the various reasons for doing event processing, and introduces some key terms. We decided that one introduction chapter is enough for those who want to get some notion of the motivation, and that it may still be of interest to those who already know it. For those who are not really interested in motivation, just in technical details, we'll recommend in the preface to skip this chapter. There is another book being written which is dedicated to the business side (by Mani Chandy and Roy Schulte) for those who would like a complete business-oriented book; our book is more for the technical audience. I'll write more about some other authors' dilemmas in subsequent postings.

Saturday, June 6, 2009

On Positive Thinking



Two news items have been highlighted in the Israeli press, seemingly unrelated: the first one is Obama's speech in Egypt addressing the people of the Middle East, and the second one is about Dudu Topaz, a famous Israeli entertainer, who was arrested for sending hired people to beat, quite hard, some senior people in the Israeli media. In the speech that the U.S. president gave earlier this week he addressed the people of this region and called upon them to exercise positive thinking and overcome their differences in order to reach a permanent settlement. I would say that he called upon the people to desert the "zero sum game" and move to a "cooperative game". Some people around me say that this message is very naive, asserting that since the other side has a negative attitude, a positive attitude is useless. It is always easier to unite people around negative messages, and indeed the current government of Israel was elected using negative messages that the people, through the collective wisdom of democracy, decided to endorse. The other side is not innocent either; generations of children are educated on negative messages of hate. I very much support Obama's call for positive thinking. It will not be easy for people to think that way after a lifetime of negative thinking, but IMHO this should be the way forward.

The other piece of news was somewhat surprising to most. As written above, it concerns a famous Israeli entertainer, who was in the past the "king of rating" in commercial TV, hosting a popular show (which I have never watched). A few years ago, after some embarrassing incidents that he was involved in, the commercial TV stations decided not to hire him anymore. A person like him, who felt like a king and had a huge ego, could not stand the humiliation and frustration of repeated rejections, and hired some bullies to beat some people hard (they needed hospital treatment and surgeries to undo the damage). One of them was the CEO of one of the commercial TV stations, another was a program manager of another commercial TV station (the fact that she is a woman did not matter to him), and the third was the artist's own agent, who did not succeed in taking care of him; the police also found in his house plans concerning several other people. This is an extreme type of negative thinking, but it is worthwhile writing something about it. It happens to many people that one day they step out of some position, or some circumstances change, and suddenly they no longer have their previous power. Some people go on and do other things, but some stay on the sidelines of the court on which they played before, and feel very frustrated by the fact that they are not the players anymore, since they are convinced that they would have played much better. This leads people to invest a lot of energy in negative thinking and negative actions. Dudu Topaz is an extreme case, but I have seen and am seeing various such cases of investing energy in negative thinking.

I can certainly understand frustration. The wheel is spinning, and like many people, I have been up some of the time and down some of the time, but early in my life I decided that positive thinking is a much better attitude to life, and that looking back is not really a good policy, remembering what happened to Lot's wife when she looked back. Still, like any human, I am sometimes tempted to have negative reactions, which have mostly proven to be the wrong ones.


I have written in the past about positive thinking in Blog postings, but it is true for other activities as well.

At an early stage of my life I read the poem "IF" by Rudyard Kipling (in Hebrew translation) and felt that these are not just written words; this is a code of behavior that I should adopt. I have returned to cite Kipling from time to time, when I am thinking about how to behave in an extreme situation. I have copied Kipling's poem below his picture.


If (R. Kipling)
If you can keep your head when all about you
Are losing theirs and blaming it on you,
If you can trust yourself when all men doubt you
But make allowance for their doubting too,
If you can wait and not be tired by waiting,
Or being lied about, don't deal in lies,
Or being hated, don't give way to hating,
And yet don't look too good, nor talk too wise:

If you can dream--and not make dreams your master,
If you can think--and not make thoughts your aim;
If you can meet with Triumph and Disaster
And treat those two impostors just the same;
If you can bear to hear the truth you've spoken
Twisted by knaves to make a trap for fools,
Or watch the things you gave your life to, broken,
And stoop and build 'em up with worn-out tools:

If you can make one heap of all your winnings
And risk it all on one turn of pitch-and-toss,
And lose, and start again at your beginnings
And never breathe a word about your loss;
If you can force your heart and nerve and sinew
To serve your turn long after they are gone,
And so hold on when there is nothing in you
Except the Will which says to them: "Hold on!"

If you can talk with crowds and keep your virtue,
Or walk with kings--nor lose the common touch,
If neither foes nor loving friends can hurt you;
If all men count with you, but none too much,
If you can fill the unforgiving minute
With sixty seconds' worth of distance run,
Yours is the Earth and everything that's in it,
And--which is more--you'll be a Man, my son!




Back to professional postings -- soon.

Thursday, June 4, 2009

On temporal semantics of events - or when has the shipment not arrived ?




In the early 1990s, my home away from home was Berkeley, where I stayed for joint work with Arie Segev on temporal databases. On one of the weekends I strolled along the famous SF Fisherman's Wharf, and there was a store for left-handed people. Since I am part of the deprived minority of left-handed people, I was curious and entered the store. Among the different items there (mostly not very practical), I saw this clock; if you notice, it is a backwards clock, which goes anti-clockwise. I am sure that the owner of the store was right-handed :-). Anyway, I recalled this clock when working on the final version of a paper entitled "Temporal perspectives in event processing" that has recently been accepted for publication. I re-read the paper (as with any paper, it is written, submitted, and then after a few months a review arrives and the author has to be reminded what it was about, revise according to the comments, send it back, and so on, until it is either accepted or rejected), and thought that temporal semantics of events can be a good topic to write about here. The temporal semantics of a backwards clock is, of course, different from that of a regular clock, and this brings me to the temporal semantics of derived events. Some background: an event may have two time-stamps (or intervals) associated with it: occurrence time and detection time. Occurrence time is the time at which the event happened in reality; detection time is the time at which the event processing system detected the event message sent to it. It is easier to process the events (when did they happen? in what order?) according to the detection time; however, for some applications, this may yield incorrect results. There are several issues around obtaining the correct occurrence time, but let's assume that we know how to do it.
While the occurrence time of a raw event (an event that has arrived from an external producer, which assigns the value) is explicitly provided, the question is what the occurrence time of a derived event is. Let's take a simple example: on May 2nd, at 10:30, the customer John Galt issued an order for books, with a guaranteed delivery within 48 hours (see my story with Amazon in its early days as a footnote to this posting). On May 4th, at 10:30, Mr. Galt looked at his (forward-going) watch and said: "the shipment has not arrived by its deadline". The fact that no arrival was reported by the deadline caused the event processing system to derive the event "shipment did not arrive", which is a time-out event (or non-event event, as some vendors call it). Now the question is WHEN did this event happen? The detection time is easy: when some computational process derives the event and emits it to the event processing system, the detection time is set. Let's say that this happened on May 4th at 10:32. The occurrence time is trickier. Actually I can think of three different interpretations:

1. The occurrence time of the "shipment not arrived" is the same as its detection time, which means May 4th, 10:32.

2. The occurrence time of the "shipment not arrived" is the deadline of when it should have arrived, in this case, May 4th, 10:30

3. The occurrence time of the "shipment not arrived" is the entire interval of the 48 hours, since the shipment did not arrive during this interval [May 2nd 10:30, May 4th 10:30].

What is the right answer for the semantics? There is no right answer; as in some other cases in event processing, the system designer should choose among these (and maybe other) alternatives.
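The three interpretations can be written down as a small policy switch. This is only a sketch in Python: the names `OccurrencePolicy` and `occurrence_time` are made up for illustration, every occurrence time is represented as a (start, end) interval so that the point-based and interval-based readings share one type, and the year 2009 is an assumption, since the example above gives only day and time.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Tuple


class OccurrencePolicy(Enum):
    DETECTION_TIME = 1   # interpretation 1: same as the detection time
    DEADLINE = 2         # interpretation 2: the moment the deadline expired
    WHOLE_INTERVAL = 3   # interpretation 3: the entire guarantee interval


def occurrence_time(order_time: datetime,
                    guarantee: timedelta,
                    detection_time: datetime,
                    policy: OccurrencePolicy) -> Tuple[datetime, datetime]:
    """Occurrence time of the derived 'shipment not arrived' event as (start, end)."""
    deadline = order_time + guarantee
    if policy is OccurrencePolicy.DETECTION_TIME:
        return (detection_time, detection_time)
    if policy is OccurrencePolicy.DEADLINE:
        return (deadline, deadline)
    return (order_time, deadline)


if __name__ == "__main__":
    ordered = datetime(2009, 5, 2, 10, 30)    # the year is an assumption
    detected = datetime(2009, 5, 4, 10, 32)
    for policy in OccurrencePolicy:
        print(policy.name, occurrence_time(ordered, timedelta(hours=48), detected, policy))
```

Making the policy an explicit parameter reflects the point above: the choice belongs to the system designer, not to the engine.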

More about temporal semantics -- later.


Footnote: A story from the early days of Amazon.

I was an early customer of Amazon, buying science fiction books through the web (I still do). Typically it took 3 weeks for a shipment to arrive in Israel, so once, after three and a half weeks in which the package did not arrive, I sent an Email to Amazon customer service to ask about it. Their response was surprising -- we don't know what happened, we are re-sending you the books. After two more days I received the original package, and since I thought that maybe the substitute package could still be stopped, I sent another Email to Amazon's friendly customer service, and got an even more amazing response -- after we issue the reservation we cannot control the rest of the process -- so please keep the extra books with our compliments. At that time I thought that this company was not going to survive... Of course, since then they have a much better logistics system... and I have two copies of each of the books in this shipment.

Tuesday, June 2, 2009

On the methodic use case used in the EPIA book


The EPIA (Event Processing in Action) book that Peter Niblett and I are writing has reached its first milestone -- the first 1/3 of the book (five chapters) has been completed in draft form (chapters 4 and 5 should still be posted on the web by the publisher) and sent to a set of reviewers that the publisher has selected. One of the main issues in the book is, of course, to make things concrete by using a concrete example. We chose to use one example that accompanies the entire book and is developed step-by-step through the book's chapters. One piece of good advice we received from one of the initial reviewers of the book outline was to use an example that does not require any domain knowledge (unlike, e.g., a trading example), since domain knowledge can create a communication barrier with readers not familiar with that domain. Taking this advice, we decided to go for a methodic use case that takes things from various applications we are familiar with and wraps them up in a single story which does not require any prior domain knowledge - and this is the "Fast Flower Delivery" use case that I present below.

We present this use case in the descriptive language that we build throughout the book; a sample of a building block in this language, describing the event structure, has been presented in a past posting. The intention is also to go beyond that and add (as an appendix?) some code samples from languages of various kinds -- SQL extension, rule language, script language etc... Some language owners have already agreed to help, and we'll solicit more help from them soon.






Here is the current draft of the "Fast Flower Delivery" use case. It is fairly simple to understand, yet it includes many (though not all) of the concepts we explain in the book:

General description

The flower stores association in a large city has established an agreement with local independent van drivers to deliver flowers from the city’s flower stores to their destinations. When a store gets a flower delivery order it creates a request which is broadcast to relevant drivers within a certain distance from the store, with the time for pick up (typically now) and the required delivery time if it is an urgent delivery. A driver is then assigned and the customer is notified that a delivery has been scheduled. The driver picks up the delivery and delivers it, and the person receiving the flowers confirms the delivery time by signing for it on the driver's mobile device. The system maintains a ranking of each individual driver based on his or her ability to deliver flowers on time. Each store has a profile that can include a constraint on the ranking of its drivers, for example a store can require its drivers to have a ranking greater than 10. The profile also indicates whether the store wants the system to assign drivers automatically, or whether it wants to receive several applications and then make its own choice.

Skeleton Specification

Phase 1: Bid Phase

The communication between the store and the person who makes the order is outside the scope of the system, so as far as we are concerned a delivery’s life-cycle starts when a store places a Delivery Request event into the system. The system enriches the Delivery Request event by adding to it the minimum ranking that the store is prepared to accept (each store has different level of tolerance for service quality). Each van is equipped with a GPS modem which periodically transmits a GPS Location event. The system translates these events, which contain raw latitude and longitude values, into events which indicate which region of the city the driver is currently in. When it receives a Delivery Request event the system matches it to its list of drivers. A filter is applied to this list to select only those authorized drivers who satisfy the ranking requirements and who are currently in nearby regions. A Bid Request event is then broadcast to all drivers that pass this filter.
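The bid phase combines three of the basic operations: enrichment (adding the store's minimum ranking), translation (GPS coordinates to a region) and filtering (ranking and proximity). A rough Python sketch follows. The store table, the latitude boundaries, the region names and the neighbourhood map are all invented for the illustration, and reading "satisfy the ranking requirements" as "ranking at least the store's minimum" is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Driver:
    name: str
    ranking: int
    region: str   # last known region, derived from GPS Location events


# Hypothetical reference data.
STORE_MIN_RANKING: Dict[str, int] = {"store-A": 10}
NEARBY_REGIONS: Dict[str, Set[str]] = {
    "north": {"north", "center"},
    "center": {"north", "center", "south"},
    "south": {"center", "south"},
}


def to_region(latitude: float, longitude: float) -> str:
    """Translate a raw GPS Location into a city region (made-up boundaries)."""
    if latitude >= 32.10:
        return "north"
    if latitude >= 32.05:
        return "center"
    return "south"


def enrich(delivery_request: dict) -> dict:
    """Add the minimum ranking the store is prepared to accept."""
    store = delivery_request["store"]
    return {**delivery_request, "min_ranking": STORE_MIN_RANKING[store]}


def eligible_drivers(request: dict, drivers: List[Driver]) -> List[Driver]:
    """Select the drivers who should receive the Bid Request."""
    nearby = NEARBY_REGIONS[request["region"]]
    return [d for d in drivers
            if d.ranking >= request["min_ranking"] and d.region in nearby]
```

A Bid Request would then be broadcast to every driver returned by `eligible_drivers(enrich(request), drivers)`.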

Phase 2: Assignment phase

A driver responds to the Bid Request by sending a Delivery Bid event designating his or her current location and committing to a pick up time. Two minutes after the broadcast the system starts the assignment process. This is either an automatic or a manual process, depending on the store’s preference. If the process is manual the system collects the Delivery Bid events that match the original Bid Request and sends the five highest-ranked of these to the store. If the process is manual, the store makes the assignment and creates an Assignment event that is sent to the system; if the process is automatic then the first bidder among the selected drivers wins the bid, and the Assignment event is created by the processing system. The pickup time and delivery time are set and the Assignment is sent to the driver.

There are also some alerts associated with this process: If there are no bidders an alert is sent both to the store and to the system manager; if the store has not performed its manual assignment within one minute of receiving its Delivery Bid events then both the store and system manager receive an alert.
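The decision taken when the two-minute bidding window closes can be sketched as a single function. This is an illustrative Python sketch, not the book's notation: the `Bid` record, the dictionary-shaped result and the reading of "first bidder" as "earliest bid time" are assumptions, and the one-minute time-out on the store's manual choice is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Bid:
    driver: str
    ranking: int
    seconds_after_broadcast: float   # when the Delivery Bid arrived


def close_bidding(bids: List[Bid], automatic: bool) -> dict:
    """Decide what happens when the bidding window closes."""
    if not bids:
        # No bidders: alert both the store and the system manager.
        return {"type": "Alert", "reason": "no bidders",
                "to": ["store", "system manager"]}
    if automatic:
        # Automatic mode: the first bidder wins (assumed to mean earliest bid).
        winner = min(bids, key=lambda b: b.seconds_after_broadcast)
        return {"type": "Assignment", "driver": winner.driver}
    # Manual mode: send the five highest-ranked bids to the store to choose from.
    shortlist = sorted(bids, key=lambda b: b.ranking, reverse=True)[:5]
    return {"type": "Shortlist", "drivers": [b.driver for b in shortlist]}
```

In manual mode the Assignment event is created later by the store, so the function only returns the shortlist.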

Phase 3: Delivery process

When the driver arrives to pick up the flowers from the store, the store sends a Pick Up Confirmation event; when the driver delivers the flowers, the person receiving them confirms by signing the driver's mobile device, and this generates a Delivery Confirmation event. Both Pick-Up Confirmation and Delivery Confirmation events have time-stamps associated with them, and this allows the system to generate alert events. A Pick-Up Alert is generated if a Pick-Up Confirmation was not reported within five minutes of the committed pick up time. A Delivery Alert is generated if a Delivery Confirmation was not reported within ten minutes of the committed delivery time.
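Both alerts are time-out (absence) patterns: something should have been reported by a deadline and was not. A minimal Python sketch of the check is below; the function name `needs_alert` and the idea of passing the current time explicitly are choices made for the illustration, while the five-minute and ten-minute grace periods come from the description above.

```python
from datetime import datetime, timedelta
from typing import Optional

PICKUP_GRACE = timedelta(minutes=5)     # Pick-Up Alert threshold
DELIVERY_GRACE = timedelta(minutes=10)  # Delivery Alert threshold


def needs_alert(committed: datetime,
                confirmed: Optional[datetime],
                now: datetime,
                grace: timedelta) -> bool:
    """True if the confirmation was not reported within `grace` of the committed time."""
    deadline = committed + grace
    if confirmed is not None and confirmed <= deadline:
        return False          # confirmed in time, no alert
    return now > deadline     # deadline has passed without a timely confirmation
```

The same function serves both cases: call it with `PICKUP_GRACE` and the committed pick up time for a Pick-Up Alert, or with `DELIVERY_GRACE` and the committed delivery time for a Delivery Alert.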

Phase 4: Ranking Evaluation

The system performs an evaluation of each driver’s ranking every time that driver completes 20 deliveries. If the driver did not have any Delivery Alerts during that period then the system generates a Ranking Increase event indicating that the driver’s ranking has increased by one point. Conversely, if the driver has had more than five Delivery Alerts during that time then the system generates a Ranking Decrease event to reduce the ranking by one point. If the system generates a Ranking Increase for a driver whose previous evaluation had been a Ranking Decrease then it generates an Improvement Note event.
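The evaluation rule for one batch of 20 deliveries can be sketched as follows. This is a Python illustration with invented names (`evaluate_ranking`, plain strings for the derived events); it assumes that an increase, like a decrease, moves the ranking by one point, and that between one and five Delivery Alerts the ranking is left unchanged.

```python
from typing import List, Optional, Tuple


def evaluate_ranking(ranking: int,
                     delivery_alerts: int,
                     previous_outcome: Optional[str] = None) -> Tuple[int, List[str]]:
    """Evaluate one batch of 20 deliveries; return the new ranking and derived events."""
    events: List[str] = []
    if delivery_alerts == 0:
        ranking += 1
        events.append("Ranking Increase")
        if previous_outcome == "Ranking Decrease":
            # An increase right after a decrease also yields an Improvement Note.
            events.append("Improvement Note")
    elif delivery_alerts > 5:
        ranking -= 1
        events.append("Ranking Decrease")
    return ranking, events
```

For example, a driver ranked 9 whose previous evaluation was a decrease and who now has no alerts would move to 10 and receive both a Ranking Increase and an Improvement Note.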

Phase 5: Activity Monitoring

The system aggregates assignment and other events and counts the number of assignments per day for each driver for each day on which the driver has been active. Once a month the system creates reports on drivers' performance, assessing the drivers according to the following criteria:

  • A permanent weak driver is a driver with fewer than five assignments on all the days on which the driver has been active.
  • An idle driver is a driver with at least one day of activity which had no assignments.
  • A consistent weak driver is a driver whose daily assignments are at least two standard deviations lower than the average assignment per driver on each day in question.
  • A consistent strong driver is a driver whose daily assignments are at least two standard deviations higher than the average assignment per driver on each day in question.
  • An improving driver is a driver whose assignments increase or stay the same day by day.
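The five criteria translate into simple predicates over each driver's daily assignment counts. The following Python sketch makes several simplifying assumptions that are not in the use case: every driver is active on the same days, the counts are already aggregated per day, the population standard deviation is used, and a day on which all drivers have the same count (standard deviation zero) never qualifies anyone as consistently weak or strong.

```python
from statistics import mean, pstdev
from typing import Dict, List, Set


def classify_drivers(daily_counts: Dict[str, List[int]]) -> Dict[str, Set[str]]:
    """Map each driver to the set of monthly-report labels that apply."""
    if not daily_counts:
        return {}
    n_days = len(next(iter(daily_counts.values())))
    # Per-day average and standard deviation over all drivers.
    stats = []
    for day in range(n_days):
        values = [counts[day] for counts in daily_counts.values()]
        stats.append((mean(values), pstdev(values)))

    report: Dict[str, Set[str]] = {}
    for driver, counts in daily_counts.items():
        labels: Set[str] = set()
        if all(c < 5 for c in counts):
            labels.add("permanent weak")
        if any(c == 0 for c in counts):
            labels.add("idle")
        if all(sd > 0 and c <= avg - 2 * sd for c, (avg, sd) in zip(counts, stats)):
            labels.add("consistent weak")
        if all(sd > 0 and c >= avg + 2 * sd for c, (avg, sd) in zip(counts, stats)):
            labels.add("consistent strong")
        if all(a <= b for a, b in zip(counts, counts[1:])):
            labels.add("improving")
        report[driver] = labels
    return report
```

Note that the labels are not mutually exclusive: a driver with no assignments at all on any active day is idle, permanently weak and, trivially, "improving" in the increase-or-stay-the-same sense.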

More about the building blocks used in the book will be discussed in future postings.