Monday, December 31, 2012

Reflection on blogging in 2012

The year 2012 expires today. This has been the first year since 1985 in which I have not visited the USA (I have been to Europe several times, though); somehow I don't think this will be true for 2013.

Looking at this Blog, I had fewer posts this year (this is the 92nd post of the year; the record year was 2009 with 162), but the flow of readers was bigger than in previous years. I recently came across an article in the HBR Blog entitled "If you're serious about ideas, get serious about blogging".

Looking at the popularity test, the most read post this year was entitled "the pilot decision making process", which shows that the thinking of a pilot is situation driven. One of the main areas I have investigated this year is how to make people think in a situation-driven way when it comes to IT systems, which are dominated by request-response thinking.

Another popular post was not about event processing; it dealt with the question: Is computer science a science or engineering? This question was triggered by the fact that my daughter Daphna participated in a science day at the high school she was going to attend (and is attending now), and it seems that while this school has a computer science major, it does not regard it as a science. My opinion is that computer science is neither science nor engineering but a thing of its own.

Another popular one was again a more generic post, on presentation skills. This is a soft skill that I think is extremely important in today's world. I put emphasis on it in all the courses and seminars I teach; my source of inspiration, as I have written, is Steve Jobs' style of presentation.

Several popular professional posts:
On temporal extensions to SQL 2011. I have been following temporal databases for many years, and the extension to SQL is long overdue.
On the event server as the 21st century application server - following Paul Vincent, I think we are seeing this shift happen.

Last but not least among the popular posts was my review of Dave Maier's keynote at DEBS 2012, where I observed that fragmentation in research makes even distinguished researchers reinvent wheels.

Let's see what blogging topics will be interesting in 2013 -- happy new year.





Thursday, December 27, 2012

On{X} Recipes - cool event based Android application

Thanks to Guy Sharon I've learned about a cool Android application developed by Microsoft Israel called On{X}. It enables the user to specify and use event-condition-action recipes (the name sounds tastier than rules), for example:

Ring on the third call from the same person when my phone is silent 
Set mode to silent between 11:00PM and 7:00am after the phone has not been unlocked for 2 hours
Text my wife "my phone is dying" when the battery goes below 15%

All of these look to me like event patterns. Looking at the On{X} Blog I learned that new features include dynamic regions, noise detection and more. There is a collection of recipes, and one can program more using JavaScript.
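
To make the event-condition-action reading concrete, here is a minimal sketch of the battery recipe. It is written in Python rather than On{X}'s JavaScript, and it does not reproduce the On{X} API: the `Recipe` and `RecipeEngine` classes, the event name `battery_level_changed` and the `sent_texts` list are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Recipe:
    """An event-condition-action rule: on `event_type`, if `condition` holds, run `action`."""
    event_type: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], None]


@dataclass
class RecipeEngine:
    """Hypothetical dispatcher; a stand-in for whatever the phone runtime provides."""
    recipes: List[Recipe] = field(default_factory=list)

    def emit(self, event_type: str, payload: dict) -> int:
        """Dispatch one event and return how many recipes fired."""
        fired = 0
        for recipe in self.recipes:
            if recipe.event_type == event_type and recipe.condition(payload):
                recipe.action(payload)
                fired += 1
        return fired


sent_texts: List[str] = []  # stands in for a real SMS service

engine = RecipeEngine()
engine.recipes.append(Recipe(
    event_type="battery_level_changed",                       # event
    condition=lambda e: e["percent"] < 15,                    # condition
    action=lambda e: sent_texts.append("my phone is dying"),  # action
))

engine.emit("battery_level_changed", {"percent": 40})  # condition false: nothing happens
engine.emit("battery_level_changed", {"percent": 12})  # condition true: the text is "sent"
print(sent_texts)
```

A real recipe would probably also remember that the text was already sent, so that it is not repeated on every further drop in battery level; that state is left out here to keep the sketch short.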

A very interesting event processing application -- I may buy a smartphone one of these days :-)

Wednesday, December 26, 2012

More on request driven vs. event driven




In the table above (taken from a recent presentation I am working on) I have summarized some of the main differences between request driven thinking and event driven thinking. It is interesting to note that many of the activities we do in life are event driven; however, we are programmed to think that computers should be approached in a request driven way, and I have noticed that even if the application itself is event driven by nature, people tend to convert it to request driven. An event driven action is activated not by an explicit request but because an event has occurred, or a derived event was concluded. This may happen at an unknown time and with unknown frequency. Furthermore, a request getting into a system should always entail a response (which can be an error message), whereas an event getting into the system may be ignored because it is out of context, may just increment internal state, or may close the circle of detecting a derived event, which can either stay internal to the system or trigger an external action or notification. Note that only the last case has a visible response to the outside. One of the challenges is to educate people to think in an event driven way rather than a request driven one. I'll write more on event driven thinking in a follow-up post.
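
The contrast can be sketched in a few lines of Python. This is only an illustration under assumed names: the inventory lookup, the `EventDrivenAgent` class and the event types `shift_started` and `machine_fault` are hypothetical and not taken from the presentation. The request handler always returns something; the event handler returns nothing in most cases.

```python
from typing import List, Optional


def handle_request(query: str, inventory: dict) -> str:
    """Request driven: every request gets a response, even if it is an error message."""
    if query not in inventory:
        return f"error: unknown item {query!r}"
    return f"{query}: {inventory[query]} in stock"


class EventDrivenAgent:
    """Event driven: most incoming events produce no visible response."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.active = False                 # context: are we currently interested?
        self.count = 0                      # internal state
        self.notifications: List[str] = []  # the only output visible to the outside

    def on_event(self, event: dict) -> Optional[str]:
        if event["type"] == "shift_started":
            self.active, self.count = True, 0   # opens the context
            return None
        if not self.active or event["type"] != "machine_fault":
            return None                         # out of context: ignored
        self.count += 1                         # just increment internal state
        if self.count < self.threshold:
            return None
        derived = f"derived: {self.count} faults in this shift"
        self.notifications.append(derived)      # derived event triggers a notification
        self.count = 0
        return derived


agent = EventDrivenAgent(threshold=3)
events = [
    {"type": "machine_fault"},   # arrives before the shift starts: ignored
    {"type": "shift_started"},
    {"type": "machine_fault"},
    {"type": "machine_fault"},
    {"type": "machine_fault"},   # third fault in context: derived event
]
results = [agent.on_event(e) for e in events]
print(results)
```

Only the last of the five events produces anything the outside world can see, while `handle_request` answers every call.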

Friday, December 21, 2012

Container fleet management with event processing

An interesting application of event processing technology, reported by Jack Vaughan, is container fleet management at Orient Overseas Container Line (OOCL), a Hong Kong-based company.
It tracks various events, for example monitoring load times to avoid penalties. You can read more in the article itself.

Thursday, December 20, 2012

On event oriented thinking



The title of this Blog is "event processing thinking", and many of the posts are my own thoughts about event processing, but today I would like to write about another topic: event oriented thinking.
An interesting observation from recent discussions is that in daily life we can very easily think in an event-driven way: when the phone rings we either answer it or ignore it, when we hear about a traffic jam we try to find another way, and so on. However, when it comes to thinking about computerized systems, many people are programmed to think in a request-driven way, which means: a person sends a request and the computer responds. Thus people try to use what they know even if they have scenarios that are event-driven. A very typical line of thinking is: an event is data, so we should insert it into a database and then ask a query that reflects the situation we wish to detect. The question is -- when are we going to ask this query? We can ask it on every update, but if the event is just one component in a pattern we wish to detect, most of these queries are redundant (example: we wish to detect when a customer sends three requests within a single day; assuming that most customers don't send more than one request, 99.99% of the queries can be redundant). We can ask the query periodically, but then again we may both ask redundant queries and, moreover, not be able to react on time, since the granularity of our timing is wrong. There is also a notion of continuous query -- but this is not really a query, since it is not the result of a specific request.
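
As a rough illustration of the event-driven alternative for the three-requests example, the following Python sketch keeps a small amount of state per customer and reports the pattern at the moment the third request arrives, with no query at all. The class name `ThreeRequestsInADay` is hypothetical, and "within a single day" is interpreted here as "within one calendar day", which is an assumption.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import DefaultDict, Tuple


class ThreeRequestsInADay:
    """Detect a customer sending `limit` requests within one calendar day."""

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit
        # (customer, day) -> number of requests seen so far on that day
        self.counts: DefaultDict[Tuple[str, date], int] = defaultdict(int)

    def on_request(self, customer: str, when: datetime) -> bool:
        """Process one request event; return True only when the pattern is completed."""
        key = (customer, when.date())
        self.counts[key] += 1
        return self.counts[key] == self.limit


detector = ThreeRequestsInADay()
stream = [
    ("alice", datetime(2012, 12, 20, 9, 0)),
    ("bob", datetime(2012, 12, 20, 9, 5)),
    ("alice", datetime(2012, 12, 20, 13, 30)),
    ("alice", datetime(2012, 12, 20, 17, 45)),  # third request on the same day
    ("alice", datetime(2012, 12, 21, 8, 0)),    # next day: the count starts again
]
detected = [(c, t) for c, t in stream if detector.on_request(c, t)]
print(detected)
```

Each event costs one dictionary update, and the detection happens as soon as the pattern is completed rather than at the next polling interval. A longer-running version would also need to discard the counts of past days.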

The nature of event-driven scenarios is: we don't know when they are going to happen, we don't even know whether they are going to happen, but when they happen we want to do something - sometimes very fast (e.g. earthquake detection). Furthermore, the situation we would like to detect can be realized in a pattern of many events, and each individual event may trigger a reaction, just increment some state, or even be ignored or filtered out.

The fact that people try to model and implement event-driven scenarios using traditional request-response is a thinking mismatch, and creates added complexity. I'll follow up with an example in one of the next posts. More - later.

Tuesday, December 18, 2012

On fifty years of databases

This drawing is taken from the ACM SIGMOD Blog post by Thomas Haigh entitled "Fifty years of databases". It tells the story of IDS (Integrated Data Store), the first database, developed 50 years ago at GE. IDS was designed by Charlie Bachman, who received the Turing Award for his pioneering work on databases, and as Haigh remarks, Bachman was the first Turing Award winner who did not have a PhD, and actually spent his life in industry and not in academia. It is worth remarking that the two people who followed Bachman in receiving Turing Awards in the database area, Ted Codd and Jim Gray, were also from industry; in fact, the academic database community has not had any Turing Award winner to this day. Bachman's Turing Award talk, "The Programmer as Navigator", was very insightful. Bachman compared himself to Copernicus, who said that the earth revolves around the sun and not vice versa, and said that the computing world will revolve around data, where programming will be side effects of operations associated with data. We are not quite there, but for me it was an inspiring goal.
IDS is far from the databases we have today, but it laid down the principles, and started from a pure engineering position; theory came later. I am trying to compare the development of the database area to that of event processing, and think that in many respects we are still where databases were 40 years ago, so the challenge is to advance it further... more on that later.

Saturday, December 15, 2012

On decision latency

James Taylor wrote an article on decision latency in his Blog; the illustration above is taken from it.
The latency for getting a decision consists of three time intervals:

The Capture latency is the time from the occurrence of the event in the universe until it is detected (captured) by the processing system.
The Analysis latency is the time from the detection of the event by the system until it is processed and creates the derived event/data that triggers an action.
The Decision latency has two parts, which I think should be distinct: the time to decide how to react, and the time it takes to react (the action latency).

Counting these two parts separately, the four types of latency are actually orthogonal. For the Capture latency what is required is an infrastructure of instrumentation, sensors and communication; for the Analysis latency -- the latency of the event processing system, and possibly additional analysis that requires querying historical databases; for the decision latency -- the decision tools (either a decision management system or an optimization system); and for the action latency -- the speed with which the action can be applied.

Why is decision latency important? Here we are getting back to the real-time issue. In some cases time is critical for decisions; an example that we saw a few weeks ago is the Israeli anti-rocket system known as "Iron Dome" (Kipat Barzel). The system detects that a rocket was launched, and then it needs to analyze its course and determine whether it is going to hit a populated area, in order to decide whether it is cost-effective to send a missile to hit the rocket (this decision is needed since a missile is very expensive), and if yes, determine when and where the missile will intercept the rocket and launch the missile at the right time in the right direction. Latency in all four phases is critical, and the success of this system saved many lives.
Decision latency is not only related to defense systems; it exists in various other areas - healthcare is certainly one of them, but also business. The time scale does not have to be seconds; decision latency can span minutes or hours.

We can look at the value of a decision as a function of its latency, and can see that there are soft real-time cases, in which the value of the decision goes gradually to zero; firm real-time cases, where it goes to zero at a certain deadline; hard essential cases, in which it goes to a certain penalty (such as missing an SLA); and hard critical cases, where it goes to minus infinity (a disaster case). The Iron Dome case is of that last type.
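
The four cases can be written down as simple value functions of latency. The Python sketch below is only an illustration: the function names, the `grace` parameter and the linear decay in the soft case are assumptions made here, since the text only says that the value "goes gradually to zero".

```python
import math


def soft(latency: float, deadline: float, full_value: float, grace: float) -> float:
    """Soft real-time: after the deadline the value decays (here linearly) to zero."""
    if latency <= deadline:
        return full_value
    remaining = 1.0 - (latency - deadline) / grace
    return full_value * max(0.0, remaining)


def firm(latency: float, deadline: float, full_value: float) -> float:
    """Firm real-time: the value drops to zero at the deadline."""
    return full_value if latency <= deadline else 0.0


def hard_essential(latency: float, deadline: float, full_value: float,
                   penalty: float) -> float:
    """Hard essential: missing the deadline costs a fixed penalty (e.g. a missed SLA)."""
    return full_value if latency <= deadline else -penalty


def hard_critical(latency: float, deadline: float, full_value: float) -> float:
    """Hard critical: missing the deadline is a disaster (minus infinity)."""
    return full_value if latency <= deadline else -math.inf


# A decision worth 100 with a deadline of 10 time units, taken after 15 units:
print(soft(15, 10, 100, grace=10))
print(firm(15, 10, 100))
print(hard_essential(15, 10, 100, penalty=30))
print(hard_critical(15, 10, 100))
```

Comparing such a value function with the cost of guaranteeing the deadline is one way to frame the trade-off between the value of low latency and what it costs to achieve it.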

Of course, there are no miracles, and the cost of a system grows when bounded latency needs to be guaranteed (for example, the cost of extremely robust communication). For each type of event-based system there is a need to calculate the trade-off between the value of low latency and its cost.

Friday, December 7, 2012

Launching the online magazine: Real-time Business Insights: Event Processing in Practice


There are two events today -- the 15th birthday of my daughter Daphna (in the picture taken in our recent family vacation in Malta)




Daphna's birthday will also help me remember the other event happening today - the launch of the online magazine. I wrote about the idea of the online magazine a few months ago; it took some time to accomplish: assemble an editorial board, solicit material from authors, and find a publisher who believes in the idea to take care of the logistics.

After some deliberations, the name chosen is shown on the logo below. The idea is to start with several pilot issues, the first of which is being launched today. The magazine consists of several sections:
Business Strategies presents the customer's perspective on how other enterprises gained value from processing streaming data and events in a variety of use patterns.
Tools and Tactics presents the practitioner's perspective with best practices and lessons learned from those who develop event-based solutions.
Frontline presents the community perspective with provocative and thought provoking opinions.
Industry Insights provides in-depth education about the use of real-time business insights in a particular industry.
On the Horizon provides the researcher's perspective on what's coming next in applied research projects and new technologies.
EP News keeps readers abreast of the latest industry developments.

Each of these sections has a section editor. I volunteered to serve as editor-in-chief and as section editor of the blogging section ("On the Horizon"); to set a personal example, I have also written the first one, expressing an opinion which will come as no surprise to this Blog's readers.

The online magazine can be downloaded from here (registration is required by the publisher).


You are encouraged to download, react (the magazine's website links to the EPTS LinkedIn group), distribute to anybody who might be interested, and contribute to any of the magazine's sections.


Enjoy and react!



Wednesday, December 5, 2012

On behavioral programming


Yesterday I took the train to Rehovot and visited the Weizmann Institute. I gave a talk in a local seminar and was hosted by David Harel's research group. An interesting piece of work they are doing is on behavioral programming, which is a formal, language-independent model for specifying reactive applications. They have experimented with games, robots and a toy helicopter. They are actually specifying event processing systems; I see a possible benefit in their approach in that their formal model is a basis for validation, which in event-based systems is tricky due to their temporal nature. I'll work with them on some use cases from our universe to learn more about their model. It can be interesting.

Saturday, December 1, 2012

On combining event processing with neural networks

This picture is taken from a recent article by Kamalkumar Mistry which explains the benefits of possibly combining event processing with neural networks, starting by explaining each of these technologies separately and then providing a patient monitoring example. In this example the combination of the two technologies is that a patient is monitored in real-time, the signals from the monitors get to an event processing system as input, and after processing is done the output is fed into a neural network, which recommends an action based on an individual model of each patient.

This is a valid use case. I actually thought about an additional use of neural nets, which is to tune the monitoring or, in event processing terminology, to set up the patterns to be monitored. Going back to the patient monitoring example, it is actually one of the use cases we analyzed in the past; in fact it is example no. 1 among the examples in the introductory chapter of the EPIA book (chapter 1, page 7). In our example, the physician can tune the system to have individual monitoring patterns for each patient, since in different patients, different combinations of signals over time may mean different things. The event processing pattern can be recommended by the neural network system based on the same patient model mentioned by Mistry, and in this case the relationship between the neural net and the event processing system is reversed.

Anyway - it is interesting to investigate this combination further.

A promo: stay tuned for the inaugural issue of the online magazine, coming soon...

Wednesday, November 28, 2012

On Dynamic EPAs by Bernhard Seeger

I came across a presentation by Bernhard Seeger entitled "Dynamic complex event processing - not only the engine matters" - the picture above is taken from that presentation. Seeger uses the term "DEPA" for "Dynamic Event Processing Agent". The "dynamic" refers to the ability to add or modify EPAs without affecting the event sources, and event sources without affecting the EPAs, and the ability to change EPAs at run-time (we have supported this feature in Amit).

The references between all the players are indirect and are made through meta-data entities. There are other components to this model -- inclusion of actions in order to check contradictions, and simulations for debugging.

I totally agree that all these features are important (not sure that a new term is needed; this relates to the implementation of EPAs). In fact we have worked on related issues in the past, see our paper in DEBS 2010 entitled "Analyzing the behavior of event processing applications".

In any event - interesting presentation, read and enjoy!   

Saturday, November 24, 2012

The big data hype cycle 2012

I haven't written in the last few days. I have been at an EU project review (as a reviewer) in Brussels, and also had some time to be a tourist: I climbed the Atomium, Brussels' well-known icon,

and visited several museums in the city center, taking refuge from the rain,

including the famous Magritte museum. I have imported some Belgian chocolate (most of it has already been given away) and a Belgian virus, with which I have been struggling for the last couple of days.

I also came across Gartner's big data hype cycle for 2012 -- the first time Gartner chose to look at big data as an area.


You may notice that "complex event processing" is around the peak of the diagram.

It seems that this hype cycle made Irfan Khan, CTO of Sybase, quite furious; his firm reaction was: "Gartner dead wrong about Big Data life-cycle". Khan claims that Big Data is not hype but reality, and that expectations are under-inflated, not over-inflated, since it can do much more than people assume.

I guess that there is growing adoption of technologies associated with Big Data, but I don't think that it has reached the plateau of productivity, as Khan claims, since this is not about whether there are mature products (by the vendors' conception) but about utilization in industry, and it is difficult to say that most organizations have made good use of such technologies. Furthermore, Khan's claim that Big Data expectations are under-inflated actually shows that the plateau of productivity has not been reached.

In any event, the event processing angle is interesting. Note that originally event processing appeared in the hype cycle of enterprise architecture for several years. In 2012 event processing does not appear there explicitly; Big Data appears as one block at the top. This shows that event processing has migrated (at least in Gartner's mind) from the middleware world into the analytics world, and this is also compatible with some of the current trends, but this should be the subject of another posting - coming soon.

Tuesday, November 13, 2012

On the "end of the engineer"

After writing yesterday on science and engineering, somebody drew my attention to a (not new) very visible posting by Tom Gillis on the Forbes Blog entitled "The *End* of the engineer". Gillis, who labels himself as an engineer who grew up in a family of engineers, claims that in the past market competition was about better engineering, and brings some examples of high-tech vendors who failed because others succeeded in achieving better engineering. The claim is that this is no longer the case: the differentiation is not in the engineering but in understanding customers' needs (even if the customers are not aware of them); the ultimate example is the direction in which Steve Jobs took Apple, whose success was due to market insights and not to superior engineering. While engineers are still needed, Gillis claims that they are no longer the ones who will bring the crucial value; rather it is those who can understand the customer's way of thinking. Thus the heroes of high-tech will be those who have "soft" skills, and the education system has to reflect that. An interesting perspective and, as you can imagine, also a controversial one: you can view the comments on the original Blog posting, some of which express strong opinions on either side (the author added a preface to the post in response)... Not sure it is the end of engineering -- but I agree that the education of high-tech workers today should not be about technology only...


Monday, November 12, 2012

On software, engineering and computer science

Today the IBM Haifa Research Lab hosted the Programming Languages and Software Engineering whole-day seminar; the keynote speaker was David Parnas, one of the pioneers of software engineering. Many years ago, when I was in the Israeli Air Force, I investigated the new discipline of software engineering to see whether we could apply it to a big software project that I managed, and Parnas was one of the first names I encountered, but this was the first time I saw him in person. There was a panel on the term and profession of software engineering. It reminded me of one of the posts on this Blog from early this year, entitled "Is Computer Science - science or engineering?", which was triggered by the fact that my daughter Daphna had a "science day" introducing the science classes at the high school she started to attend this year, and despite the fact that they teach "computer science" they did not include it in the science day - thus the school does not think it is a science.

Interestingly, David Parnas, as well as other panelists, doesn't really think of computer science as engineering; in fact, Parnas talked about a fight by an engineers' association in Canada to forbid computer science graduates who don't really have engineering training from calling themselves engineers. I always thought that there is something pretentious in the fact that programmers (typically with computer science education) call themselves "software ENGINEERS". As somebody said in the discussion today -- an engineer is a person who can be sued for negligence if it is proven that rigorous engineering principles were not met.
Until that is true for software, software producers cannot call themselves software engineers.

Back to the question of whether computer science is a science: Vinton Cerf wrote an article in CACM entitled "Where is the science in computer science?" The answer according to Cerf is that, unlike the physical sciences, which are about modeling the world, in computer science the science lies in tools for understanding complex software systems and making predictions about their behavior -- not really sure I am convinced...

It seems that computer science might be an animal of its own - neither science nor engineering, especially in the conventional terms.

Sunday, November 11, 2012

On the Internet of EVERYTHING

Cisco came out recently with the concept of  "The Internet of EVERYTHING".
While the "Internet of Things" deals with connecting anything to the Internet, the Internet of EVERYTHING deals with the things and the semantic connections among things that can make the world actionable in real-time. A simple example is the car theft example.

A car is connected to the Internet through its GPS sensor, which reports its location. It also has semantic relations to a list of eligible drivers who are permitted to drive it. Each of these persons is also connected to the Internet through his or her mobile phone, thus the Internet knows the person's location (disregard the privacy issue for this scenario!). So if the car is moving (inferred from the change in GPS location) and none of the eligible drivers is in the car, it means that the car is stolen, and it can then report to the police and have them track its location.
This scenario is based on things, contexts of things, and processing events about things. It is actually quite straightforward from a technology point of view. I wonder if the "Internet of EVERYTHING" will survive the buzzword test of time...
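
To show how little is needed, here is a minimal Python sketch of the car theft rule. Everything in it is an assumption made for illustration: the `Car` class, the planar coordinates in metres, the two thresholds, and the choice to treat a driver whose phone location is unknown as "not in the car".

```python
from dataclasses import dataclass
from math import hypot
from typing import Dict, FrozenSet, Tuple

Position = Tuple[float, float]  # simplified planar coordinates, in metres


@dataclass
class Car:
    car_id: str
    eligible_drivers: FrozenSet[str]  # the semantic relation: who may drive this car
    last_position: Position


def distance(a: Position, b: Position) -> float:
    return hypot(a[0] - b[0], a[1] - b[1])


def is_probably_stolen(car: Car, new_position: Position,
                       phone_positions: Dict[str, Position],
                       moved_threshold: float = 50.0,
                       in_car_radius: float = 20.0) -> bool:
    """The car moved, and no eligible driver's phone is near its new position."""
    if distance(car.last_position, new_position) <= moved_threshold:
        return False  # the car has not really moved
    for driver in car.eligible_drivers:
        phone = phone_positions.get(driver)  # None if the location is unknown
        if phone is not None and distance(phone, new_position) <= in_car_radius:
            return False  # an eligible driver is with the car
    return True


car = Car("car-1", frozenset({"dana", "eli"}), last_position=(0.0, 0.0))
phones = {"dana": (5.0, 5.0), "eli": (1000.0, 0.0)}

print(is_probably_stolen(car, (300.0, 0.0), phones))    # moved, nobody eligible nearby
print(is_probably_stolen(car, (1000.0, 10.0), phones))  # moved, eli is with the car
```

The rule combines two event streams (car positions and phone positions) with one piece of context (the list of eligible drivers), which is exactly the things, contexts and events structure of the scenario.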

Saturday, November 10, 2012

On IBM scientific accomplishment

This week, the annual accomplishment process of the IBM Research Division was concluded. This is a process that recognizes major impact activities in various categories: scientific, contribution to IBM products, contribution to IBM services, contribution to standards and some more.  

Within this year's process, our work on the event processing conceptual model has been recognized as a scientific accomplishment. The criteria are the number of citations (according to Google Scholar) and support letters from senior members of the scientific community in this specific area.

It is interesting to note that the major publication referred to was the book I have written together with Peter Niblett, "Event Processing in Action".


The interesting fact is that the book was not written as a research-oriented book, but was geared towards the professional market, yet it has accumulated 153 citations so far, with the number steadily growing (when the process started the number was around 130).

Drilling down into the citation list, it is also interesting to observe that while some of the citing papers belong to the event processing community, many others come from different domains and describe implemented systems in areas such as power management in mobile devices (from Finland) and rotorcraft control (from Brazil), as well as others. This indicates that the material in the book had some practical impact in addition to the impact on the scientific community, which is also important, as science is built in layers.

I have been out of research work for about 10 years, during which I kept research activity in the back seat, mainly through supervising PhD and MSc students at the Technion. The major project I was involved in during the years 1998-2005 (AMIT) has a single major publication, which is actually the summary of the PhD dissertation of Asaf Adi (this paper has also accumulated a nice number of citations).

Whether the number of citations is a good metric is another discussion; for me the actual impact (those using the work in practice) is also an encouraging indication that the work was not done in vain -- more later.

Saturday, November 3, 2012

More on statistical reasoning - Chomsky on "Where AI went wrong"

I have written before about the view that statistical reasoning is over-hyped in its claim that all the problems in the universe can be solved by statistical reasoning over past data (see also my report about Sethu Raman's keynote at DEBS'12).

A blunter claim against statistical reasoning has been made by Noam Chomsky, who claims that AI took a wrong direction by making statistical reasoning its mainstream, and that knowledge achieved by statistical reasoning, while "good enough" for various practical uses, is shallow knowledge that only approximates the universe and does not create a solid model of it.
An interesting interview with Noam Chomsky was published recently in "The Atlantic".
Always thought provoking!

I'll write more about correlation vs. causality.  

Saturday, October 27, 2012

StreamEPS from SGT - an open source event processing from Ghana




I was recently approached by a company residing in Accra, Ghana, called SoftGene Technologies, which has developed an open source event processing product called StreamEPS. Looking closely at the description of the supported functionality, one can see that this is an implementation that follows the EPIA book. I'll write in a separate posting about the impact of the book in terms of follow-up work; it is quite interesting...
SoftGene Technologies describes itself as a "Research-lead private company". I like the definition, since I believe that much of the useful software is research led.

This also completes the continent coverage of people developing event processing software.
While quite a lot of software is developed in Europe and North America, there is now also event processing software developed in Asia (Sri Lanka, Japan and Israel - that I know of), Australia, and Brazil.

If there is event processing related software developed in additional countries -- let me know and I'll survey it in this Blog.






Saturday, October 20, 2012

Multidisciplinary research -- duck, cheetah, sailfish and spine-tailed-swift

I have recently written about "lead vs. impact" in industry research; today I would like to continue this series of thoughts by observing that achieving a lead is often a result of multidisciplinary research. There are two ways to approach multidisciplinary research: one is to develop versatile, multidisciplinary researchers, and the other is tight collaboration of researchers from multiple disciplines. The difference can be explained by examples from the animal world.

A duck is a multidisciplinary animal


It swims, it walks and it flies. It is not excellent at any of them, but it can do all.

Looking at the question of who is the fastest swimmer, flier and runner, we come across the sailfish, the spine-tailed-swift and the cheetah.



It is obviously cheaper to keep one duck than to keep these three animals, and in many cases the walking, swimming and flying abilities of the duck are "good enough". This is particularly true when the scale of ambition for the research is "impact". However, when the aim is "lead", it is often the combination of people who are excellent in their disciplines, collaborating together, that constitutes the lead. Versatility cannot replace excellence. More posts in this series - later.

Friday, October 19, 2012

Lead vs. impact in industrial research

The quarterly letter of John Kelly, the head of the IBM Research Division, reminded us that his first directive is to shift the main goal of what IBM Research is doing from impact and contribution to lead.

Much of the work done in industrial research today falls into the contribution/impact category: as illustrated in the picture, the human models are already there, and the contribution is in coloring them. Much of industrial research today concentrates on incremental contributions to the company's products or services.




The alternative is lead.  Take the Israeli invention known as USB stick  (in Israel we call it disk-on-key), the picture below explains why.



Lead means the creation of something new that is not an increment of some existing stuff.

This leads to several questions. First, is this scalable, in the sense of whether there are enough revolutionary ideas, and whether there are enough people skilled enough - even in the research community - to generate such ideas?
Another question is risk: work towards lead is much riskier than work towards incremental contribution. It might also take fewer resources, although there is no exact correlation; some of the revolutions were made by small teams, e.g. the relational model was devised by a single person. Furthermore, leading may mean disrupting some existing interest, and thus accumulating enemies and encountering the corporate immune system; see my post about the "innovator's dilemma".
In my opinion the answer is that while incremental contribution should not be eliminated, the best researchers have to be geared towards the drive for lead. This requires a supporting culture, in which risks are tolerated. This sometimes goes against the DNA of risk-averse companies, and against the tendency to focus on the pressing business-as-usual stuff, which is mainly incremental. I think that even if we assume a certain rate of failures, one revolutionary result is worth a thousand evolutionary ones, and this is the relative scale by which scores should be weighted.

While I referred to the context of research in industry and corporate cultures, this is also somewhat true for academic research, which also has a great deal of incremental work, AKA "delta papers".


I have recently been investigating the impact of my own research work over the years and will write about it soon, with some interim conclusions.


Call for papers - DEBS 2013 in Arlington Texas


The DEBS conference is returning to the USA; in 2013 it will take place in Arlington, Texas.
The call for papers (in the research and industrial tracks), tutorials, demos, PhD workshop, and grand challenge can be found on the conference's website.  Note that the industrial track consists of industry papers and industry experience reports, and that the experience reports do not require full papers.

The deadline for submission in most categories is February 8, 2013  and the conference itself will be held on June 29 - July 3, 2013.


Thursday, October 11, 2012

On gesture events as regular expressions - Proton from Berkeley

Proton is the name of a project in which we have investigated the proactive event-driven approach (see our talk in DEBS'2012). I came across another Proton, this time from UC Berkeley.  It deals with codifying gestures as regular expressions of touch event symbols.  On the website you can find a tutorial, a downloadable version and papers.  Interesting idea, enjoy!
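
To make the idea concrete, here is a minimal sketch of my own in Python. It is not Berkeley Proton's actual syntax or API; the symbol encoding (D for touch down, M for move, U for touch up, followed by a finger number) and the function names are assumptions made up for illustration only.

```python
import re

# Hypothetical encoding (an assumption, not Proton's real notation):
# D = touch down, M = move, U = touch up, followed by a finger number.
ONE_FINGER_DRAG = re.compile(r"D1(M1)+U1")
TWO_FINGER_PINCH = re.compile(r"D1(M1)*D2(M1|M2)+(U1(M2)*U2|U2(M1)*U1)")


def encode(events):
    """Turn a list of (kind, finger) tuples into one string of touch symbols."""
    return "".join(f"{kind}{finger}" for kind, finger in events)


def classify(events):
    """Return the name of the gesture whose pattern matches the whole stream."""
    stream = encode(events)
    if ONE_FINGER_DRAG.fullmatch(stream):
        return "drag"
    if TWO_FINGER_PINCH.fullmatch(stream):
        return "pinch"
    return None
```

A stream such as down, move, move, up on finger 1 is encoded as "D1M1M1U1" and recognized as a drag, while a plain tap ("D1U1") matches neither pattern.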

Wednesday, October 10, 2012

SAS announcement on event processing


SAS announced today that a new "SAS DataFlux Event Stream Processing Engine" will be available in December.  It is described as: "the new software is a form of complex event processing (CEP) technology...incorporates relational, procedural and pattern-matching analysis of structured and unstructured data".  Welcome to the event processing club; this seems to be an indication that the analytics guys see the value of adding event processing to their portfolio.  I guess that the "limited appeal" of event processing has changed enough in the last couple of years to justify it.  Anyway - I welcome SAS to the club, and hope that they will also become an active part of the event processing community.


Sunday, October 7, 2012

On big data, small things and events that matter

In a recent post in the Harvard Business Review Blog entitled "Big Data Doesn't Work if You Ignore the Small Things that Matter", Robert Plant argues that in some cases organizations invest a lot in "big data" projects, trying to get insights about their strategy, while failing to notice the small things, like customers leaving due to bad service.  Indeed, big data and analytics are now fashionable and somewhat over-hyped.  There is also some belief, fueled by the buzz, that it solves all the problems of the universe, as argued by Sethu Raman in his DEBS'12 keynote address.  Events play in the big data game, but also in the small data game, where we try to observe a current happening, such as a time-out on a service or long queues when it relates to service, and other phenomena in other domains.  Sometimes the small things are the most critical.
I'll write more about big data and statistical reasoning in a subsequent post.

Saturday, October 6, 2012

More on the semantic overloading of derived events


I have recently been getting back to the time in which I dealt with semantic data models, and I am now trying to view current event-driven applications in that way; semantic overloading is one of the first interesting issues to emerge.  I'll write more about semantic modeling of event processing later, but right now I'll concentrate on the semantic overloading of derived events.  There are various definitions of the term "event", but in all of them an event represents a VERB in natural language.  Looking at what we defined as derived events, it seems that some of them can indeed be described by a verb in natural language, while others are really described by nouns.  Thus my current thinking is to have the semantic notion of DERIVATION, where the derivation can yield different concepts:
Events - when indeed the derived conclusion is that something (virtually) happened.
Entity facts - when the derived conclusion is a value of some fact
Messages - when the derived conclusion is some observation that has to be notified to some actor. 

Examples from the Fast Flower Delivery use case that we used in the EPIA book.  

The automatic assignment creates a real event -- can be expressed by the verb ASSIGN
The timeout pattern "pickup alert" which means that a pickup was not done on time --- this is an observation that is notified to somebody.  It is therefore a message that can be expressed by NOTIFICATION
The driver-ranking calculated as a function of assignment count, is actually a fact related to driver, driver-ranking is a noun, thus it is a derived fact.
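
The three examples above can be sketched in code. This is only an illustration of the idea, assuming a single Derivation record tagged with the kind of concept it yields; the class names, the payload fields and the ranking formula are hypothetical and are not taken from the EPIA book.

```python
from dataclasses import dataclass
from enum import Enum


class DerivedKind(Enum):
    EVENT = "event"          # something (virtually) happened; named by a verb
    ENTITY_FACT = "fact"     # a value of some fact about an entity; named by a noun
    MESSAGE = "message"      # an observation that is notified to some actor


@dataclass(frozen=True)
class Derivation:
    name: str
    kind: DerivedKind
    payload: dict


def assign_driver(request_id, driver_id):
    """Automatic assignment: a real event, expressed by the verb ASSIGN."""
    return Derivation("assign", DerivedKind.EVENT,
                      {"request": request_id, "driver": driver_id})


def pickup_alert(request_id, picked_up_on_time):
    """Timeout pattern: yields a notification only when the pickup was late."""
    if picked_up_on_time:
        return None
    return Derivation("pickup alert", DerivedKind.MESSAGE, {"request": request_id})


def driver_ranking(driver_id, assignment_count):
    """Derived fact about a driver; the formula is a made-up placeholder."""
    ranking = min(assignment_count // 10, 5)
    return Derivation("driver-ranking", DerivedKind.ENTITY_FACT,
                      {"driver": driver_id, "ranking": ranking})
```

All three functions share the same shape (inputs in, a structured result out), and only the kind tag records whether the result is a happening, a fact or a notification.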

More - later. 

Tuesday, October 2, 2012

On family vacation in Malta

I have not disappeared; I spent most of the previous week on a family vacation in Malta.  Malta is a small country consisting of several islands in the Mediterranean Sea.  The climate is similar to the one we have in Israel.  The local language is also a Semitic language like Hebrew; it is actually mostly a descendant of Arabic.  Like other Mediterranean people, the Maltese are nice, friendly, and take their time.  We stayed a 5-minute walk from the capital city of Valletta, a small city surrounded by walls that looks somewhat similar to the old city of Jerusalem.  In the evening it looks deserted: nobody is walking in the streets, and apart from a few restaurants it looks like a ghost town.


Some highlights:
There is a prehistoric site which has limited visitation (10 per hour); we made a reservation 2 months ahead to get inside.
Malta is a Catholic country and has a lot of ancient churches; we visited some of them.
We decided to rely on public transportation, which is cheap but time consuming, and some of the lines have low frequency --- not really a good choice.
We took a day trip to Gozo, the northern island, which has nice beaches.
The Maltese people like to celebrate - we watched two separate celebrations, one of them in honor of the national fish, called Lampuki, the other in Valletta, the Valletta white night.  I am not sure what the celebration is about, but it was crowded and noisy, in contrast to the usual silence in Valletta.
Hope that my daughters will post pictures on Facebook soon. 

In general:  nice place for vacation.
I'll continue with professional blogging, hopefully tomorrow.

Saturday, September 22, 2012

The semantic overload of derived events


The term "derived events" is frequently used in event processing terminology.  Here is an example taken from the Gigaspaces Blog of using the term "derived event".  A recent discussion with somebody who is just learning the event processing area made me realize that we overload the term "derived event".  On one hand we define an event as "something that happens", and say that a derived event is an event, thus one may assume that derived events also happen.  However, the way this term is typically used has some semantic overloading.  There are actually different types of derived events.

One type of derived event is really an event that we did not observe directly, but concluded that it happened by observing other events.  Examples are fraud detection, money laundering detection and system problem diagnostics.

Another type of derived event is a notification.  We do a calculation based on events and the result is notified to some person or application.  For example: a derived event that calculates the highway fees, based on exit and entry events on the highway, and a rate calculated by the load on the highway.  This is a derived event; however, it does not really stand for something that happens.  The happening here is a notification to the driver of how much the fees are.  There are many derived events of this type.
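
A minimal sketch of the highway fee example, in Python. The event fields, the per-kilometer pricing and the rate formula are assumptions of mine for illustration; the post only says that the fee is based on entry and exit events and on a rate that depends on the load.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crossing:
    vehicle: str
    kind: str         # "entry" or "exit"
    km_marker: float  # position on the highway, in kilometers


def rate_per_km(load):
    """Hypothetical rate: a base price scaled by the highway load in [0, 1]."""
    return 0.5 * (1.0 + load)


def derive_fee_notifications(crossings, load):
    """Pair each exit with the vehicle's pending entry and derive a fee notification."""
    pending = {}
    notifications = []
    for crossing in crossings:
        if crossing.kind == "entry":
            pending[crossing.vehicle] = crossing
        elif crossing.kind == "exit" and crossing.vehicle in pending:
            entry = pending.pop(crossing.vehicle)
            distance = abs(crossing.km_marker - entry.km_marker)
            fee = round(distance * rate_per_km(load), 2)
            notifications.append({"vehicle": crossing.vehicle, "fee": fee})
    return notifications
```

Note that the output is a message addressed to the driver, computed from events, rather than a record of something that happened on the highway.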

The question is whether we should make a distinction between the two cases.  From the semantic point of view they are clearly distinct.  From the execution model and language point of view they are indistinguishable; both take events as input, apply some assertions and functions over a collection of events, and create a structured message sent on some channel.  The question is whether, from a pragmatic point of view, this distinction is important for somebody who takes any role in the life-cycle of the application.  More - later.

Saturday, September 15, 2012

On Google Intelligence Events


I am using Google Analytics (quite infrequently) to view the activity on this Blog, but for those using websites for commercial purposes, or as a social media vehicle, tracking activity on a website is a very valuable tool, especially in our metrics-driven universe. The illustration above, which looks like a typical sense-and-respond cycle, is taken from an article in a seemingly odd location: the Social Media Sun recently reported on a feature in Google Analytics called "Google Intelligence Events" (although I think this feature is not new).  The article claimed that this is "a simple feature called complex event processing", falling into the "complex" word trap, and elsewhere it asserts that "Google's intelligence events is only a basic use of event processing engine".
In fact, what this tool is capable of doing is issuing alerts of two types:

Automatic alerts:  indicating significant changes of traffic to the website
Custom alerts:     threshold crossing of a certain indicator (e.g. the traffic from Singapore was down 20% from the previous day).  Thresholds can be defined over all tracked variables (source, demographics, bounce rate, etc.).

As far as event processing goes -- these are indeed limited capabilities, mainly threshold oriented, comparing two sets of events (based on time).  However, it shows that event processing has made its way into the web analytics world and there is potential for doing more in this space.  More on this - later.
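
To show how little is involved in such a custom alert, here is a rough sketch in Python. It is not Google's implementation or API; the function names and the representation of each time window as a plain list of visit events are assumptions for illustration.

```python
def percent_change(previous_count, current_count):
    """Relative change between two time windows, in percent (None if no baseline)."""
    if previous_count == 0:
        return None
    return (current_count - previous_count) * 100.0 / previous_count


def custom_alert(previous_events, current_events, threshold_percent):
    """Compare two sets of events and alert when traffic dropped past the threshold."""
    change = percent_change(len(previous_events), len(current_events))
    if change is not None and change <= -threshold_percent:
        return f"traffic down {abs(change):.0f}% from previous day"
    return None
```

With 1000 visits yesterday and 800 today, a 20% threshold produces an alert, while a drop to 900 visits does not.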

Sunday, September 9, 2012

Event is a relative term

Following a recent discussion with Jeff Adkins about the semantics of events, one of the observations has been that event is a relative term.  First, different observers can describe the same event in different terms, and indeed different observers may not agree about what really happened, or where it happened, or when it happened, or all of the above, a kind of "Rashomon effect".  It may also be relative in the sense that different views may look at different properties of the event; for example, one view will only look at a customer entering the store, while another view will also look at demographic properties of the customer -- age group, gender.
The relativism effect can mean that event meta-data may have different views with different structures, and that an event instance may have several corresponding instances according to the observers.
The question is -- how to consolidate different observers?  This may involve modal semantics.  I'll write about it at a later phase.

Jeff also recommended that I read a philosophy book called "The ontology of mind: events, processes and states" by Helen Steward.  I have ordered this book and will write a review of it after reading it.

Sunday, September 2, 2012

While you have slept - more about the big brother


Back to Chris Taylor - this time he wrote a guest post in the Forbes blog entitled "While you slept", dealing with issues of privacy.  I have written before about the big brother aspect of using sensors within the smarter planet.  This is another aspect -- a lot of information about us is flowing from social networks and various other systems.  A smart program can look at what we published in blogs, Twitter, Facebook and others and determine our political opinions, religious beliefs and more.  It seems that the combination of sensors, activities on social media, and other systems makes us give up our privacy, and many people don't seem to be bothered -- they allow everybody to see their pictures on Facebook, or read what they write there.  There are people who report on Twitter what they are doing 30 times a day, including all their happy and frustrating minutes.  It seems that there is a growing section of the population who are giving up their privacy of their own will, and other people who are not aware that their privacy is being invaded.  I guess that this is one of the characteristics of the current web generation - losing privacy.  Of course, event processing can help in drawing conclusions about a person.  Chris ends his posting with a call for companies to be aware and set up policies and strategy, and for governments to regulate.

Wednesday, August 29, 2012

On five years of blogging



Five years ago I started this blog. In the first posting, entitled "First Blog", I put a picture of myself (from 8 years ago, I think) and stated that I had never written a blog or anything similar. This is the 694th posting, and I am amazed that I keep doing it.  I am still amazed every time I run into a person telling me that he or she reads my Blog.  The Blog has been much more visible than I ever imagined; I have looked at some statistics and report them below.  Over the years I have been asked several times to advertise stuff (for profit), or to allow people to be guest bloggers, and I always answered politely that this is not really what I have in mind.  The biggest reward I got from this Blog was the offer I received from Manning to write a book following the Blog posts on event processing.  The book "Event Processing in Action", which I wrote with blood, sweat and tears, with Peter Niblett (he was not the reason for the blood and tears), is probably the most important thing I have done so far (but I have some plans to surprise in the future).  As for statistics -- I have looked at two statistics gathering tools: one is Google Analytics, which uses instrumentation I put into the blog 2 weeks after I started it, and the other is the internal statistics of Google Blogs, which started in July 2008.  The results are somewhat inconsistent (the blog statistics show higher numbers).  Anyway -- it seems that I had more than 250,000 page views over the years.  Since many of the visitors are one-time visitors, it is more interesting to see how many regular readers this blog has - the number seems to be around 2500 who read every post on this blog, and around 5000 more who read most of the blog posts.  I don't know that many people (!).  The readers are coming from 199 countries, where the big ten are:
1). USA, 2). UK, 3). Germany, 4). India, 5). Canada, 6). Israel, 7). Philippines, 8). France, 9). Australia, 10). Japan  - among these countries, I have never visited India or Japan.  Among cities, the leading ones are: London, NYC, Bangalore, Tel-Aviv, Manila, Singapore, Karlsruhe and my home city Haifa.
The most read posts were:  On unicorn, professor and infant - where I wrote about hype, analytics and reality.  Interestingly, in 2008 the claim was that CEP is over-hyped; today the opinion is that analytics is over-hyped.  The second most popular is the post on family trees, an off-topic post where I told about a few days of work I invested in constructing a family tree during the Passover vacation in 2010, and the third is one of the oldest, from December 2007, talking about simple events and simple event processing, terms that I don't use anymore.
Enough statistics for today -- five years of blogging passed quickly, let's see if I will be able to proceed for another five.

Back to professional postings - soon.