
Thursday, August 28, 2008

On the "Event Processing Thinking" Blog - after the first year

One of the ways to obtain events is through "calendar events"; this is useful for time-out management, periodic triggering etc. Today I saw a reminder in my calendar: this is the one year anniversary of the "event processing thinking" Blog - you should write something about it. Actually, yesterday I got a note from one of the analyst firms that researches the impact of Web 2.0 on companies, asking me to participate in this study wearing my Blogger hat... This is not the first time that people approach me for various purposes based on reading my Blog, and I can say that I have under-estimated the power of Blogs and the amount of visibility they get. This is probably the most visible communication vehicle that exists today (how many people read papers?)

Looking at Blogland I also realized that this visibility can be a double-edged sword, since people can easily expose their own ignorance, so I am trying to write only about stuff that I think I know something about...

One interesting thing is the statistics (who reads the Blog) - it seems that the previous posting in which I wrote about statistics has been one of the most read postings (see below).

Looking at the Google Analytics statistics, it seems that since the start of measurement (I installed Google Analytics two weeks after the Blog started) more than 10,000 distinct persons (10,139 to be exact) have read this Blog. I have no illusion that there are 10,000 people interested in event processing - some got here through the wonders of the almighty Google (e.g. while looking for a picture of a unicorn) - so a better metric is that 1/3 of the readers returned more than once, and 1,432 readers returned more than 50 times, which is a more reasonable estimate of the number of people interested in the content. It seems that the number of people who read all, or at least 2/3, of the Blog postings is around 800, and this seems to be the size of the effective readership.

What else can I learn from the statistics? The most popular postings are:

(1). Agnon, the dog, playing and downplaying is still, by far, the most popular one; this is one of the postings where I claim that "event processing" is a discipline that stands on its own feet, and is not a footnote to database technology or business rule technology.

(2). Revisiting the Blog **2 again which, like this posting, talks about statistics around this Blog. I wonder why this posting is so popular (or maybe people wanted to look at the map of Arkansas to plan their next holiday).

(3). On infant, professor and unicorn - despite the fact that this posting is much younger, it has had a lot of traction, partly because people are looking for pictures of unicorns, and partly because disputes always bring more ratings... However, rating is not everything, and when I think that I've said all that I need to say about a particular topic, I move on.

As for the geographical distribution of readers: there have been readers from 124 countries.
In terms of number of visits, the big ones are:
(1). USA, (2). UK, (3). Israel, (4). Japan, (5). Germany, (6). Canada, (7). France and (8). India. In terms of number of distinct readers, the big ones are:
(1). USA, (2). UK, (3). Germany, (4). India, (5). Australia, (6). Israel, (7). France and (8). Holland. So it seems that in Japan I have a relatively small (less than 100) but loyal set of readers - I am still looking for an opportunity to travel to Japan, as I have never been there (actually, I have never been to India either).
In the USA there are now readers from all 50 states (+ DC), and the leading ones are: California, Massachusetts and New York. Putting up the Arkansas map helped - Arkansas is now in 16th place in the USA in visits.

The three big cities in terms of visits are still: (1). London, (2). New York City, (3). Bangalore.

I'll not survey the negative and positive reviews about this Blog - I'll let every reader judge; that is the essence of the entire Web 2.0 business! Well, that's all for today; I will return soon with a more professional posting.

Wednesday, February 20, 2008

Revisiting the Blog ** 2 again


I have never been to Arkansas, and in return, nobody from Arkansas has ever read my Blog - not surprisingly. What is surprising is that all the rest of the USA states (with the exception of North Dakota) have people who live there, or at least visited there, and have read this Blog.
I did some investigation on who reads this Blog when it hit 1,000 distinct readers, and decided to re-visit the statistics if the Blog ever hit 3,000 readers. When I checked this morning I found 3,217 distinct readers, so it is a good time to look at some statistics. Actually this number is quite surprising, since I don't really think that there are so many people interested in this, and indeed some are probably just passing by on the Internet roads; however, the statistics show that around 45% returned more than once. Moreover, 569 people have visited the Blog more than 50 times (172 of them more than 200 times, while the number of Blog postings has not yet reached 100), so I guess this provides the size of the community that really reads the material.
From a geographical point of view, it seems that I am still writing mostly for USA readers - 1,530 distinct readers (almost half of the readers). In number of accesses, USA is first, UK second, Israel third, followed by Japan, Thailand, Canada, India, France, Spain and Germany. However, when looking at the number of "distinct readers" the order is somewhat different - USA is still first and UK still second, but here Canada is third, followed by India, Israel, Germany, France, Australia, Holland and Sweden. There is coverage of all continents. Some of our neighbors are represented - "the Palestinian Territory" (which does not have a map yet in Google Analytics, so I don't know where exactly), Egypt, Lebanon, Libya and Iran. From South America, almost all accesses are from Brazil, with some sporadic ones from other countries. In Africa - South Africa and some countries in the north, but also Sudan, Kenya and Uganda. There is also coverage of almost all countries in Europe and Asia - 92 countries in total. As for cities, the three big cities in number of accesses are: Petach-Tikwa (in Israel), London and Bangkok;
the big cities in number of distinct readers are: London, New York City and Bangalore.
The biggest source of referrals is Google in various ways, followed by other Blogs - the CEP Blog, TIBCO's Blog and Rulecore's Blog - and David Luckham's site.
The most popular postings were:
1. Agnon, the dog, playing and downplaying in which I stated the opinion that Business Rules are a possible way to implement CEP, but not the only way - which led to several more postings investigating the relationships.
2. On Event stream Processing in which I stated my controversial opinion that the term "event stream processing" does not represent anything interesting, and that the person who invented it, Mark Palmer, thinks it should go away.
3. On the mythical event per second in which I stated that "event per second" really means nothing outside the context of a specific benchmark. I have dedicated more postings to this issue.
4. On bitter pills, a recent posting that answered Tim Bass' criticism about the state of the EP market and provided a more optimistic view (followed by some constructive assertions in the next posting).
It seems that the popular ones are at the macro level - but I'll continue, from time to time, to also write on micro-level things, since they seem to have their audience as well. The most popular postings among those who arrived through search engines were related to XTP and context terms - so I need to get back to those too.
So - thanks to all the readers -- more event processing related postings - later.

Monday, October 20, 2008

More on the semantics of synonyms


These are still the lazy days of the holiday week. I took advantage of the time to help my mother, who decided that she wants to leave home and move to a seniors' residence located a 10 minute walk from my house; this requires dealing with many details, so that is what I was doing during the last three days.... In the picture above (taken from the residence site) you can see the front entrance and the view seen from the residence tower; on the right hand side of the upper picture one can see part of the neighborhood we live in (Ramat Begin), surrounded by pine trees all over.
Now, holiday eve again, and this is a good time to visit the Blog again. Last time I started the discussion of the semantics of synonyms by posing a simple example of conjunction over a bounded time interval (the same pattern that Hans Glide referred to in his Blog), which is slightly different from the "temporal sequence" pattern.
In the previous posting I have posed the following example:

Detect a pattern that consists of a conjunction of two events (order is not important) - E1, E2.
E1 has two attributes = {N, A}; E2 also has two attributes = {N, B}; the pattern matching is partitioned according to the value of N (on context partitions I'll write another time).

For each detection, create a derived event E3 which includes two attributes = {N, C}; E3's values are derived as: E3.N := E1.N; E3.C := E1.A * E2.B.

Let's also assume that the relevant temporal context is time-stamps = [1, 5], and that the events of types E1 and E2 that arrived during this period are displayed in the table below:




The question is: how many instances of event E3 are going to be created, and what will be the values of their attributes?

Looking at this example, for N = 2 there is exactly one pair that matches the pattern: E1 that occurs at timestamp 5, and E2 that occurs at timestamp 4, so E3 will have the attributes {N = 2, C = 24}. However, for N = 1 things are more complicated. If we take the set-oriented approach that looks at it as a "join" (Cartesian product), then since we have 3 instances of E1 and two instances of E2, we'll get 6 instances of E3 with all combinations. In some cases we may be interested in all combinations, but typically in event processing we are looking for a match and not for a join -- that is the difference between "event-at-a-time" patterns and the "set-at-a-time" patterns used by some of the stream processing semantics. So what is the correct answer? There is no single correct answer, thus what is needed is to fine-tune the semantics using policies. For those who are hard-coding event processing, or using imperative event processing languages, this entire issue seems a non-issue, since when they develop the code for a particular case they also (implicitly) build the semantics they require for that specific case. However, policies are required when using higher level languages (descriptive, declarative, visual etc.): policies are needed to bridge between the fact that semantics is built into the higher level abstraction, and the need to fine-tune the semantics in several cases. In our case we can have several types of policies:

Policies based on order of events - example:

For E1 - select the first instance; for E2 - select the last instance.
For E1 - select the last instance; for E2 - select the last instance.

Policies based on values - example:

For E1 - select the two highest instances by the value of A; for E2 select the lowest instance by the value of B.

These are examples only -- it is also important to select a reasonable default which satisfies the "typical case", so that if the semantics fits this default, no further action is needed.

In one of the next postings I'll deal with identifying the set of policies required in order to make the semantics precise.
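To make the policy idea concrete, here is a minimal Python sketch of the conjunction pattern with selection policies. All names (`match`, `first`, `last`) and the instance values are my own hypothetical choices, not any product's API; the E1/E2 values are invented so that E1.A * E2.B = 24, matching the N = 2 case above.

```python
from collections import defaultdict

# Hypothetical instances of E1 = {N, A} and E2 = {N, B}.
e1_instances = [{"N": 2, "A": 6}]
e2_instances = [{"N": 2, "B": 4}]

def first(instances):
    return instances[0]

def last(instances):
    return instances[-1]

def match(e1s, e2s, pick_e1=first, pick_e2=last):
    """Conjunction partitioned by N: derive one E3 per matched partition."""
    by_n1, by_n2 = defaultdict(list), defaultdict(list)
    for e in e1s:
        by_n1[e["N"]].append(e)
    for e in e2s:
        by_n2[e["N"]].append(e)
    derived = []
    for n in sorted(by_n1.keys() & by_n2.keys()):  # both event types present
        e1, e2 = pick_e1(by_n1[n]), pick_e2(by_n2[n])
        derived.append({"N": n, "C": e1["A"] * e2["B"]})
    return derived

print(match(e1_instances, e2_instances))  # [{'N': 2, 'C': 24}]
```

Swapping `pick_e1`/`pick_e2` (e.g. for a value-based policy such as the maximum by A) changes which single E3 is derived per partition - which is exactly the fine-tuning role that policies play on top of the built-in semantics.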



Monday, August 11, 2008

On faithfull representation and other comments



Back home from the vacation in Turkey. The vacation took place in the Limak Limra hotel, about a 1.5 hour drive from Antalya airport (see picture of one of the many swimming pools above). It was a great British philosopher who preached to workaholics like myself "in praise of idleness". So, not taking the laptop with me, I have learned several things:
1. Unlike the Israeli beach which consists of soft sand, the beach in Turkey consists of small and large stones;
2. Turkish chefs know how to cook many types of food quite well, but still have a lot to learn in preparing Sushi;
3. The reputation of charter flights for long delays is actually true (however, this is also true today for many regular flights).


Richard Veryard has sent me an Email about his Blog posting entitled "Faithfull Representation", in which he referred to an illustration that I have made as a "simple situation model" and attributed this model to both Tim Bass and myself (goodness gracious me!). Tim, who constantly claims that he has a much more general view than me, could not believe that his name and my name were mentioned in the same sentence as agreeing on something, and asserted (I am using "cut and paste" from Tim's Blog): "Opher tends to view CEP as mostly an extension of active database technology where I see CEP as a technology that is much more closely aligned with the cognitive models".


Here are some comments:


1. The illustration that Richard is quoting is not meant to explain what a situation is, but to show the relations among several concepts; I am enclosing it again -



As can be seen, I am writing there that composite events (a term taken from active database terminology) and complex events (which are not) may both represent situations; this does not say that this is the only way to represent a situation (just as saying that a fish is an animal does not define what an animal is).

2. I have explained the basic idea of a situation in this posting. Simply said, a situation is a concept in the "real world" domain (not in the computer domain) that requires reaction. In some cases a single event determines a situation; in some cases detecting a pattern determines a situation; and in other cases patterns only approximate the notion of a situation, and there is no 1-1 mapping between events and situations. Note that in that posting I also provided an example of non-deterministic situations.

3. Regardless of the situation definition, Richard is absolutely right that all over the event processing life-cycle we may have instances in which the events are inaccurate or uncertain, and the reader is referred to this posting for some examples of the uncertainty issues we are dealing with. This is an area that I have been investigating over the last few years together with Avi Gal from the Technion and Segev Wasserkrug (our joint Ph.D. student who graduated recently; his Ph.D. dissertation was denoted as excellent by the exam committee). Hot from the oven - a paper about it is published in the recent (August 2008) issue of IEEE Transactions on Knowledge and Data Engineering, which is dedicated to a "SPECIAL SECTION on Intelligence and Security Informatics". The actual paper can be downloaded from Avi Gal's website. Another paper related to the same study has been presented at DEBS 2008.

4. While I totally agree that in some cases handling uncertainty is needed - and certainly some security applications are examples - I also believe that the potential market for the more basic deterministic world is much bigger, and we are far from picking all the low hanging fruits of deterministic event processing.

5. We still have challenges in defining the semantics of the different cases of handling uncertain events/patterns/situations. The fact that there are arithmetics of uncertainty helps, but not everything that exists in AI research fits the real world requirements of scalability, performance etc.

6. About the comment about me viewing event processing as an extension of active database technology -- I view event processing as a discipline in its own right (and this is a topic for another discussion which I'll defer). It has origins in several disciplines, one of them active databases, but it has several more ancestors - sensor fusion, discrete event simulation, distributed computing/messaging/pub-sub and some more - and draws concepts from each of them. Anybody who reads my Blog can realize that there is a fundamental difference between active databases, which extend database engines, and event processing, which is not based on database technology; there are some other differences too.

7. My friendly advice to Tim is that before he makes assertions about how and what people think (and this does not necessarily refer to myself) he should re-read his own excellent posting: "red herring fallacies".

More on event processing as a discipline - at a later post.

Monday, April 21, 2008

On Event Clouds

Marc Adler, in a couple of his blog postings, wondered about support for event clouds in the product he chose, and in the end settled on the opinion of the vendor (Mark Tsimelzon from Coral8), who claims that "cloud" is an abstract term, and that in reality we are facing multiple streams that may or may not be ordered. A response comes from Greg-the-architect, who has been in "everybody is confused" mode recently. Greg-the-architect claims that vendors have sinned by spreading disinformation towards their customers to hide their inability to cope with hidden causal relations.

So - what can I contribute to that party?


First - let's look again at the definition of event cloud in the glossary:


Event cloud: a partially ordered set of events (poset), either bounded or unbounded, where the partial orderings are imposed by the causal, timing and other relationships between the events.


Cloud has become a fashionable term; we have heard so much about cloud computing in the recent year that we all feel like we are flying in various clouds.


What about the clouds/streams debate? One of the stated differences is that a cloud is a poset (partially ordered set) while a stream is totally ordered. I agree that these terms come from two different origins; the question is whether a cloud can indeed be supported by multiple streams. People focus the discussion on whether streams are always totally ordered or can also support non-ordered sets of events - but this is not really an interesting distinction. I agree here with Mark Tsimelzon that a stream can also be un-ordered; this is up to the implementation. If one wants to make a distinction between "streams" being ordered and other things that can be unordered, I propose the term "pipes" - where an ordered pipe is a stream. But ordered/unordered is not the main difference. Reading the cloud definition again, it is the notion of causality that is important for having a cloud. The "partial ordering" in the cloud is a result of causality relations between events. I have discussed the notion of causality in a past posting; support for causality (including pre-determined causality that may be the result of mining, or of an inference system) is the enabler for the support of clouds (i.e. partial order vs. no order).


A cloud is indeed the collection of events that an enterprise is faced with, and this cloud may be implemented by a collection of pipes (or streams, if you wish) plus support for causality relations.


We can also look at a (small) cloud, which is the collection of all events that a single EPA (Event Processing Agent) faces as input - this is just a subset of the big "Cloud", with its own pipes and causality relations.
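To make the pipes-plus-causality view concrete, here is a small hypothetical Python sketch (the event names and the `causes` relation are invented): a cloud is just a set of events plus a causality relation, and that relation induces only a partial order, so some pairs of events are simply incomparable even if each individual pipe delivering them is totally ordered.

```python
# A tiny event cloud: four events and a causality relation (x caused y).
events = {"a", "b", "c", "d"}
causes = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}

def precedes(x, y, causes):
    """Transitive closure: does x (possibly indirectly) cause y?"""
    frontier = {y2 for (x2, y2) in causes if x2 == x}
    seen = set()
    while frontier:
        z = frontier.pop()
        if z == y:
            return True
        if z in seen:
            continue
        seen.add(z)
        frontier |= {y2 for (x2, y2) in causes if x2 == z}
    return False

def comparable(x, y):
    return precedes(x, y, causes) or precedes(y, x, causes)

# b and c are causally unrelated -> only a partial order, i.e. a cloud,
# even though each pipe delivering b or c may itself be totally ordered.
print(comparable("b", "c"))        # False
print(precedes("a", "d", causes))  # True
```

The point of the sketch is that the "partial order" of the cloud lives in the causality relation, not in the arrival order within any single pipe.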


Now - to the most important question: besides the game of terminology, is it important to make these distinctions?


As stated before, the world of event processing is not monolithic: there are some applications which need total order, other applications which need partial order, and other applications which don't care about the notion of order at all. Causality relations are required by some applications, either when the pre-defined relations between the events play a role in the event processing, or when there is a need to trace back the lineage of a certain event/action. For other applications this may be just unnecessary overhead. So my (2 cents worth of) advice to people who are looking at CEP products is to look at their requirements and determine whether they need causality and partially ordered sets. It may be that support of a totally ordered stream is totally sufficient for their applications; if it is not, they should look at if and how causality is implemented. I hope that I have not confused you even more... More - later.



Saturday, January 5, 2008

On Trifecta and Event Processing


Reading the educated blog of my friend Tim Bass, I had to admit that I don't have any clue what a "Trifecta" is, and rushed to Wikipedia for help, to find out that in horse races it is a bet on the first three horses. I must confess my ignorance - I have never watched a horse race, nor have I gambled on horses. I have made a major technological gamble though, around 10 years ago, betting on "complex event processing" (without using this name and without knowing about the three others who worked in parallel on the same topics - I will at some point write postings about the history). After suffering some periods of frustration, it seems that reality is also slowly getting there, and I still hope it will be the "next big thing" - but we have some work to do in order to get there -- again, a topic for another posting. Now the question posed both by Tim Bass and by Joe McKendrick is whether CEP depends on SOA, and in turn, whether it requires the coupling of SOA and BPM - which is by itself a problematic concept.
My answer is: CEP is a horse that is independent of any other horse, donkey, mule or tiger. As I noted in earlier postings, SOA and EDA are orthogonal - SOA is about modularization or componentization of the enterprise, and hence its IT systems, introducing modules as "services". Now let's talk about the relations between SOA, CEP and BPM.
(1). Services in SOA can be producers and consumers for "Event Processing Agents": as a producer, a service is instrumented to emit events; as a consumer, it can be notified or triggered by an event. Services are one type of consumer and producer but not the only type - every application can be a consumer or producer, thus there is loose coupling.
(2). In an SOA environment there is an ESB (Enterprise Service Bus), whose functions have some overlap with event processing (especially mediated event processing) - it can be a natural host for the "event processing network" (I'll write more on ESBs in another posting). However, again, Event Processing Agents can also run outside the ESB.
(3). Workflows (BPM) are a special type of consumer and producer. In the SOA world, BPM orchestrates services; thus BPM can emit events about the status of the workflow, while as a consumer it can add/modify/delete one or more workflow instances. Some Event Processing applications (e.g. some types of BAM applications) are based on business processes - however, even in an SOA environment this is only one type of consumer and producer, and there are other types (e.g. RTE applications which process events in-line and not in observation mode).
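The loose coupling described in points (1)-(3) can be sketched in a few lines of Python. Everything here is hypothetical (the `EPA` class and its methods are invented names, not any product's API): producers just publish events to an agent, and any kind of consumer - a service, a BPM workflow, a plain application - can subscribe without the producer knowing about it.

```python
# Minimal publish/subscribe sketch of the producer -> EPA -> consumer chain.
class EPA:
    """A trivial Event Processing Agent that just forwards events."""

    def __init__(self):
        self.consumers = []

    def subscribe(self, consumer):
        # A consumer is any callable: a service stub, a workflow trigger, etc.
        self.consumers.append(consumer)

    def publish(self, event):
        # A real EPA would filter, transform or detect patterns here;
        # this sketch forwards every event to every subscriber.
        for consumer in self.consumers:
            consumer(event)

received = []
epa = EPA()
epa.subscribe(received.append)           # consumer: could be a BPM workflow
epa.publish({"type": "order_created"})   # producer: an instrumented service
print(received)  # [{'type': 'order_created'}]
```

The producer never references the consumer directly - which is the loose coupling that lets event processing work with services, workflows, or legacy applications alike.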
Bottom line --- Event Processing can have different interactions with SOA, and when IBM's announcements in this area become available you'll realize that there are different entry points. Event processing can also work in legacy and non-SOA environments.

So today I have learned about horse races and promised some more postings (not promising when) -- more, later.

Sunday, October 21, 2007

Fallacies in implementation of event processing solutions


Back in Israel... The flight that took me back home suddenly landed in Athens instead of continuing to its destination. It turns out that one of the passengers left a laptop on a vacant seat and went to sleep elsewhere in the aircraft; some passengers attracted the attention of the crew, and after some consultations they decided to land at the nearest airport to have security people check the laptop. After landing, the laptop owner woke up and retrieved his laptop, and the aircraft could take off again (well - not so fast, it took almost an hour to take off again) - a strange story (it appeared in several variations in all the daily newspapers in Israel)... a fallacy in a (human) event processing system. Talking about fallacies: recently Hans Glide, in his blog, talked about "failure scenarios" where EP solutions may fail to produce correct results. This reminds me of previous work I have done in the area of derived data using active databases; in a relatively old paper I looked at several possible fallacies in active databases - the analysis is also valid for event processing. Here are some examples:


1. Non-deterministic execution due to conflicting rules.

2. Inability to reconstruct the derived data from the raw data, since multiple rules derive the same data-element.

3. Redundant updates in the case of inter-connected derived updates, e.g. the same data-element is updated several times within one path.

4. Derivations may result in infinite loops.
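The first fallacy is easy to demonstrate. In this hypothetical Python sketch (the rule names and the event are invented for illustration), two derivation rules are both eligible to fire on the same event, and the derived value depends entirely on the unspecified firing order:

```python
# Two conflicting rules that both set the same derived value.
def rule_high(state, event):
    if event["temp"] > 30:
        state["alert"] = "heat"

def rule_low(state, event):
    if event["temp"] > 25:
        state["alert"] = "warm"

def run(rules, event):
    """Fire the rules in the given order; the last write wins."""
    state = {}
    for rule in rules:
        rule(state, event)
    return state["alert"]

event = {"temp": 35}  # both rules are eligible to fire on this event
print(run([rule_high, rule_low], event))  # 'warm'
print(run([rule_low, rule_high], event))  # 'heat' -- order changes the result
```

Unless the execution model fixes the order (or forbids such conflicts), the derived state is non-deterministic - which is exactly what a well-defined execution model is meant to eliminate.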


The paper also shows that it is possible to eliminate these problems using a specific execution model.


Taking it back from active databases to event processing: in active databases this paper was in the framework of deriving data, while in the event processing case we derive events - however, some variations of the same phenomena can occur in event processing - certainly non-deterministic behavior can happen, and some other fallacies as well.


Hans Glide mentions that each event processing implementation project should pay attention to these - and indeed, debugging an event processing application is more complex than debugging regular applications, due to the inter-relations between rules/queries/derivations.

IMHO - it is the responsibility of the EP vendors to provide semantic debuggers that enable doing this. Some are already around, but more work is needed in this area; the users should not be left on their own here... more about validation of EP applications - in one of the next posts (I am promising many follow-ups, but new topics keep coming)...

Thursday, December 27, 2007

On Business Intelligence and Event Processing

This is a quiet time. Although we do not have a holiday period in Israel, for those of us, like myself, for whom a major part of the work requires interaction with the rest of the universe, it is a quiet time, since my colleagues are away - much fewer Emails, no conference calls. Today I spent the day in Rehovot, known for its famous "Weizmann Institute", but this time (a famous building from the institute is in the picture above) I visited an IBM site not far from this institute. IBM has acquired several Israeli companies in recent years, and they are now fused into a single lab, ILSL - Israel Software Lab (well, ISL was already taken in IBM by the India Software Lab).

Anyway, coming back, I saw that the blog world of event processing is also alive these days, but this time I would like to give a few comments on "operational analytics: yesterday, today and tomorrow" by Colin White.
In this article Mr. White makes some interesting assertions (at least, per my own interpretation of these assertions):

(1). CEP is a buzzword which stands for a kind of operational analytics that should be used in extreme cases.
(2). BAM has failed since it has been independent of BI.
(3). A hint that the fate of CEP (or ESP; it is not clear from his article what the difference is) will be the same if it does not become part of BI.

While I am sure that Mr. White is a big expert on BI, it seems that he also falls into the "hammer and nail" trap that has been discussed by myself and several other bloggers in this area. So here are some preliminary responses to his assertions:

(1). CEP is a technology; it has roots in multiple disciplines, and some of it has roots in BI, but there is a distance between this and the assertion that it is part of BI. CEP has different uses that may not even be connected to BI (e.g. network management diagnostics or situational awareness). Here again we get back to the issue of motivation for using CEP - the consistent view of database people is that the only reason to use CEP is extreme latency/throughput, since otherwise one can use the right religion of SQL databases. It has been discussed in the EP blogs that there are multiple reasons for using CEP, and high throughput / low latency is one, but not necessarily even the dominant one.

(2). As for "BAM has failed" -- is that a fact or wishful thinking? At the Gartner BAM and EP summit we heard some success stories of BAM, and saw some prospering BAM products. While there are synergies and relationships between BAM and BI, I wonder what success/failure criteria were used to derive this assertion?

(3). I have already stated my opinion about the question of whether event processing is a footnote to databases.
While I spent many years as part of the database community, I am of the opinion that event processing is a discipline in its own right, with some interaction and intersection with the database area, as well as other areas (business rules, distributed computing, middleware, software agents).

This is a preliminary reaction - mostly giving some comments on the article - which does not free me from writing a "positive" article about the relationships between event processing and business intelligence; I'll do so in one of the next postings - more later.

Monday, January 5, 2009

On event processing and some interesting queries

Some people have returned from vacation with a surplus of energy; otherwise I cannot explain why my inbox today was full of mails from the same thread of discussion in the everlasting Yahoo CEP interest group, triggered by a question sent by Luis Poreza, a graduate student from the University of Coimbra in Portugal. I am taking the liberty of re-writing the question: since it was phrased as a question about a trading system, some of the responders answered with trading related stuff that did not help to answer Luis' question, so, getting as far away as possible from the stock market, I will base the rewritten question in the fish market. The story is as follows: the price of 1 KG of fish is determined according to the hour, the demand, the supply and the general mood of the seller. At 10:50 the seller set this price at 71; then at 11:15 the price went down to 69, with no more changes by 12:00. There is a computerized system that works in time windows of one hour, starting every hour. The request is to find out, for the time window 11:00 - 12:00, whether the price of 1 KG of fish was ever > 70. The claim is that intuitively the answer is yes, since the price in the interval [10:50, 11:15] was 71, but if we look at all the events that occurred within this window there was no event with value > 70, thus current "window oriented" tools will answer --- no.

There have been plenty of answers; some even tried to answer the question, for example by adding dummy events (one at the end of the interval? every minute?) with the value 71.

However -- I am going to claim the following assertions:

(1). The requirement given is not an event processing pattern.
(2). Attempts to treat it as an event processing pattern are not very useful.
(3). It is in fact a kind of temporal query.
(4). There may be sense in having the capability to issue temporal queries in response to events (AKA retrospective event processing), but this has to be done right.

Assertion one - the requirement is not an event processing pattern. An event processing pattern is a function of events; it is no surprise that Luis found some difficulty phrasing it as such. Let me take two other examples that look syntactically the same, and try to understand what the problem is here:



The government agency example: A government agency known for its long queues in getting service tries to monitor the lenght of the queue. Periodically some clerk goes out and counts the number of people waiting in the queue. In 10:50 he found 71 people in the queue, in 11:15 69 people in the queue, no more samples by 12:00. Now the question is -- whether there has been some point in the time window between [11:00, 12:00] in which the number of people in the queue > 70.

Before starting the discussion, let's look at one more example, the bank account example.
At 10:50 Mr. X deposited $30; his previous balance was $41, which made his balance $71.
At 11:15 Mr. X withdrew $2; his balance was set to $69.

From a syntax point of view, the fish market example looks exactly like the queue monitoring example: in both cases we have events at 10:50 and 11:15 with attributes 71 and 69 respectively. However, they are not the same. The reason is that the price in the fish market is fixed until changed, while the length of the queue may have changed several times, up and down, between the samples; the queue event is only a sample and does not cover all the changes. Both of these events observe some state (price or length of queue), but the semantics is quite different. If we use the dummy-event solution in the queue case, the value will probably be wrong; furthermore, in the queue case we cannot really answer the query with "true" or "false". Yet, in reality, periodic sampling is a totally valid type of event. Moreover, the bank account example looks very different from the fish market example: it has two types of events, and the events do not observe a state but report a change, carrying the change value ("delta"). Thus, looking only at the two events of deposit and withdrawal we will not be able to answer the question; but knowing the state (the balance of the account) and the deltas (from the deposit and withdrawal) we get something which is semantically similar to the fish market example.
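To make the delta point concrete, here is a minimal sketch (using the hypothetical values from the example) showing that the deposit and withdrawal events alone do not carry the state; the balance history has to be reconstructed by folding the deltas over the known initial balance:

```python
# Hypothetical delta events from the bank account example: (time, delta)
initial_balance = 41
delta_events = [("10:50", +30),   # deposit
                ("11:15", -2)]    # withdrawal

# Fold the deltas over the initial balance to recover the state history
balance = initial_balance
state_history = []
for time, delta in delta_events:
    balance += delta
    state_history.append((time, balance))

print(state_history)  # [('10:50', 71), ('11:15', 69)]
```

Only after this reconstruction does the account behave like the fish market price: a state that holds its value until the next change.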

What can we learn from these examples? First, that the property "the value is the same until it is changed" is not a property of an attribute in an event; it is a property of the state (data) that may be created or updated by events. This is true for some states and not for others. The proposed solutions rely on the fact that a human knows the semantics of this state and writes an ad-hoc query. However, this is processing of the state, based on its semantic properties, and not of the events.

Assertion two -- attempts to treat it as event processing are not useful.

In the past I've blogged about the hammer and the nail. There is a natural tendency of anybody who has a product to try and stretch its boundaries. This may backfire: trying to perform functions that the product is not good at, and not doing them well, can overshadow the good parts of the product. A solution like adding "dummy events" is a kind of hacking. It abuses the notion of event (since a dummy event did not really happen); moreover, given that this is just an ad-hoc query, and there can be many such queries, we may need an exponential number of dummy events to cover them all... Anyway, event processing software is just part of a bigger picture, and instead of improvising or hacking our way to this functionality, it may be more advisable to use a product with a better fit.


Assertion three -- this requirement is in fact a temporal query. I will not get into temporal queries now, but the actual query is over the price of 1 KG of fish as it changes over time. It is an existential query -- checking whether some predicate holds somewhere in the interval. Another example of a temporal query: was there any day during the last 30 days in which the customer withdrew more than $10,000 in a single withdrawal?
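As a sketch of what such an existential temporal query looks like (my own illustration, not the syntax of any specific temporal database), the price history can be kept as validity intervals derived from the change events, and the query reduces to an interval-overlap check:

```python
from datetime import datetime

# Hypothetical piecewise-constant price history as validity intervals:
# (start, end, value), with each price holding over [start, end)
history = [(datetime(2008, 9, 1, 10, 50), datetime(2008, 9, 1, 11, 15), 71),
           (datetime(2008, 9, 1, 11, 15), datetime(2008, 9, 1, 12, 0), 69)]

def ever(predicate, q_start, q_end):
    """Existential temporal query: does the predicate hold at some
    point of the query interval [q_start, q_end)?"""
    return any(predicate(value)
               for start, end, value in history
               if start < q_end and end > q_start)  # interval overlaps the query

result = ever(lambda price: price > 70,
              datetime(2008, 9, 1, 11, 0), datetime(2008, 9, 1, 12, 0))
print(result)  # True: the 71 interval [10:50, 11:15) overlaps [11:00, 12:00)
```

The answer is now "yes" without any dummy events, because the query runs over the state history rather than over the raw events.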

And this example brings us back to assertion four --- it may make sense to couple event processing software with temporal queries. For example, an event makes a customer "suspect" of money laundering, but we need reinforcement by issuing some temporal queries over the past, like the one written above... I'll write about this type of functionality at a later phase.

Well - it is 1:15 AM, so I'd better get some sleep; tomorrow is again a busy day. So, in conclusion: first, not everything that looks simple to do manually is simple to do with a generic tool; second, event processing software should concentrate on doing event processing right, and not on doing other stuff wrong... Some follow-up Blog postings -- later.

Tuesday, January 21, 2014

Some simplification goals in the design of the event model

I have written in this Blog about our work on "The Event Model", which is based on the search for simplification in event-based modeling.   Here are some of the simplification goals that we strive to achieve while designing the model.   These are high-level goals.

1. Stick to the basics by eliminating technical details.    Looking at designs and implementations of event-driven applications, one can observe that there are two types of logic: the business logic, which directly states how derived events are generated and how the values of their attributes are assigned, and supporting logic that is intended to enrich events or query databases as part of the processing.  The first simplification goal is to let the modeler concentrate on the business logic and keep the supporting technical details out of the way.
2. Employ top-down, goal-oriented design.    Many design tools require logical completeness (such as referential integrity) at all times.  This entails building the model in a bottom-up fashion, namely all the meta-data elements (events, attributes, data elements) are required to be defined prior to referring to them in the logic definition.   Our second simplification design goal is to support top-down design and allow temporary inconsistency, working in a “forgive” mode in which some details may be completed at a later phase.  This design goal complements the “stick to the basics” goal by concentrating on the business logic first and completing the data aspects later.
3. Reduce the quantity of logic artifacts.  In a typical event processing application there may be multiple logic artifacts (event processing agents, queries, or processing elements, depending on the programming model) that stand for the different circumstances in which a single derived event is derived.  Our design goal is to have a single logic artifact for every derived event, accumulating all the circumstances in which this derived event is generated.  This goal reduces the number of logic artifacts and makes it bounded by the quantity of derived events.  It also eases the verifiability of the system, since possible logical contradictions are resolved by the semantics of this single logic artifact.
4. Use fact types as first-class citizens in the model.  In many models, terms in the user’s terminology are modeled as attributes that are subordinate to entities or relationships.  In some cases it is more intuitive to view these concepts as “fact types” and make them first-class citizens of the model, where the entity or event they are associated with is secondary (and may be a matter of implementation decisions).  This is again consistent with the “stick to the basics” goal.

These goals are high level.  I'll write more details in the future about the ways we chose to satisfy each of these goals, and discuss alternatives for doing that.  I guess that over time we'll accumulate more simplification goals. 

Tuesday, September 23, 2008

event processing meets artificial intelligence




Bedford, MA, USA.




At the EPTS symposium last week, Alan Lundberg from TIBCO, who moderated the "business panel", made an analogy to AI, especially to "expert systems": there was hype in the beginning, and people believed it would solve many of the world's problems, but in reality it never recovered from sliding down the hype cycle. This triggered the (somewhat surprising to some) response of Brenda Michelson that EP is actually under-hyped, and that its place in the hype cycle is much lower in the climbing phase than the Gartner analysts draw it. This is the diagram that Brenda presented, with "event processing" in orange, way below SOA (in blue), BPM (in red), and Web 2.0 (in green).






Anyway - this is not the topic of today's Blog; back to the AI issue. The term AI is interesting in the sense that it has spawned several disciplines (e.g. robotics, image processing, information retrieval, data mining and more) which are based on AI principles, but when they mature they stop being AI and become disciplines of their own. This is the same phenomenon we have with philosophy - the mother of all arts and sciences - many disciplines have emerged from philosophy, but once they depart, they are not considered philosophy anymore. Event processing, as a young discipline, is a descendant of multiple disciplines, as stated in the past; AI is certainly one of them.




What are the current topics in which AI touches event processing?




1. Modeling: the basic terms "situation" and "context" have been taken from AI (situation calculus); conceptual modeling is important for the design of EP applications, and AI techniques can help here.



2. Discovery: Prediction of events, mining of patterns - these are all derivatives of machine learning in AI.




3. Reasoning: defining precise semantics of both event processing languages and execution models. Evidently, judging from the recent discussions in the community, this is becoming an important topic - precise reasoning for both the regular case of event processing and the extended case of handling uncertain events.


As my colleague Guy Sharon described in the research session of the EPTS meeting, we in the IBM Haifa Research Lab (together with some colleagues in the IBM Watson Research Center) are engaged in the "Intelligent Event Processing" project, which currently concentrates on the discovery aspects. The idea is to extend the activity, probably through collaborative work with academia. As part of this collaboration we are organizing the "Intelligent Event Processing" workshop, which will take place as one of the AAAI spring symposium series at Stanford University, March 2009. The idea is to have the EP community meet the AI community and create partnerships to deal with these issues... so target this conference for paper submission and/or attendance. More - later.

Saturday, January 1, 2011

My 2010




In 2010 I posted fewer entries in this blog relative to 2008 and 2009 - 128 postings, roughly one every three days. Maybe I've been busier, and maybe I've been lazier - probably a combination of the two.

The happiest day of the year was, surprisingly, the day in August when I returned from a family vacation in western Canada and found two things that had arrived by mail:

12 copies of the EPIA book, fresh out of the press (I have already given 10 of them as presents, keeping 2 for myself - one at home and one in the office); the EPIA book was a major task for 18 months, so I was relieved to see it in print; at some points it felt like a never-ending saga.

 
The second item that arrived on the same day was a plaque designating the fact that I had received an IBM corporate award, which is the highest award that IBM gives. The order of events was somewhat funny: first I got the plaque in the mail; two weeks later I got a letter signed by IBM's CEO notifying me that I am receiving the award; two weeks after that the list of awards was publicly published; and then, after six more weeks, there was the award-granting ceremony, where I was given again the plaque I had got in August.

Another notable event was the Dagstuhl seminar on event processing in May 2010.   We are now in the final phase of editing the end result of this teamwork.  Dagstuhl is a wonderful place, and we had a very good team there, spending 5 days dealing with the present and future of event processing.

I had some trips abroad - both business and pleasure.   Unlike most years, I spent only four days in the USA, in October, for the OMG financial market conference, and spent two and a half weeks on a family vacation in Western Canada; here is a picture with my daughter Daphna somewhere in the Canadian Rockies.


Another trip was to VLDB in Singapore, with a couple of days vacation in Hong-Kong on the way, and several days vacation in Singapore.

Here is a picture from Hong-Kong's wax museum,  where I am photographed with an old friend.


Here are two pictures from Singapore: one from a zoo that resides in a rain-forest in the northern part of the island, and the second from a spa where you can get your feet cleaned by fish.

Another conference I have participated in was DEBS 2010 in Cambridge, UK.



This year I also started work on a new project that deals with proactive computing, and I will write more about it in 2011.

Overall -- an interesting year, which leaves a lot of unfinished challenges for the future.

Wednesday, September 3, 2008

On event processing as a paradigm shift


The readers are probably familiar with the picture that shifts between two faces facing each other (in black) and a white vase. I came across a (relatively) new blogger in this area, Pern Walker, blogging for Oracle's "event driven architecture". The title of the posting is:
Event servers, a disruptive technology. It describes the components of the (former) BEA framework - nothing new here - but the interesting part is the conclusion: event processing COTS is a disruptive technology; it displaces custom code in event processing, since it is more cost-effective.
This reminds me of a discussion we had in May 2007 at the Dagstuhl seminar on event processing. It was a night discussion with wine, led by Roy Schulte, and the question that Roy posed to the participants was: "Will Event Processing (EDA) become a paradigm shift in the next few years or not?"
Today, I don't intend to answer this question; instead I'll post part of the Dagstuhl discussion that included observations about "paradigm shifts" (thanks to my colleague Peter Niblett, who documented the entire Dagstuhl seminar). I'll return to this topic, with my (and maybe other) opinions about the answer, after the EPTS event processing symposium.
Observations (from the Dagstuhl discussion):
  • Paradigm shifts can’t happen if there are too many barriers; have the entry barriers for "event processing" already been removed?
  • Paradigm shifts are more likely to happen when adopters decide they need a whole new avenue of applications; they are less likely to happen as a way of re-engineering existing systems. For example the German population will reach 1:2 old: young ratio by 2020 so this requires a paradigm shift of healthcare models. Can we identify new avenues of relevant applications?
  • Paradigm shifts usually happen as a result of some external change, not just because of innate strengths of the technology itself. Can we identify such external changes?
  • Standardization is not necessary for a paradigm shift, but good, appropriate standards (de facto or otherwise) certainly help.

Another question is where in essence the "paradigm shift" lies - is it the decoupled "event-driven" paradigm? Is it "complex event processing", i.e. the ability to find patterns over multiple events? Is it the entire processing framework, as Oracle's Blog claims?

More - Later