Friday, May 17, 2013

Prelude to the DEBS 2013 tutorial

The notable event of this week in our family was the wedding of our eldest daughter Anat. In the picture you can see Anat and her new husband Adi with my wife and me, all dressed up.

It was also a very busy week at work; one of the items was submitting the extended abstract of the tutorial that Jeff Adkins and I plan to deliver at DEBS 2013. Here is the tentative outline we submitted in the original proposal:

Outline  
Topic I:  Introduction – Brief history of event processing  
Topic II:  The major differentiation factors of event-based thinking
Topic III:  The ontology of events and event influence
Topic IV:  Anatomy of reactive systems 
Topic V:  Pragmatics – a computation-independent model for event-based systems

The tutorial follows some of my previous postings about "event oriented thinking" and Jeff's 4D classification. It contrasts the conventional way people think about systems with the event-driven way of thinking, and gets into the role of events in language and in system modeling.
Unlike the implementation-oriented thinking we usually employ, this is computation-independent thinking: it looks at event-based systems from the customer's perspective. See (at least some of) you in Arlington.

Friday, May 10, 2013

Event processing - small data vs. big data and the Sorites Paradox.

This picture is taken from a blog post from the "Big Data Journal" by Jim Kaskade entitled "Real-time Big Data or Small Data".  

Kaskade attempts to define quantitative metrics for what counts as "small data" and what counts as "big data".
In terms of throughput, big data is defined as >> 1K events per second, while small data is << 1K events per second; I guess that around 1K events per second counts as medium data...
In terms of variety, big data is defined as at least 6 sources of structured events and at least 6 sources of unstructured events. There are other dimensions as well: for example, small data relates to one function in the organization, while big data spans several lines of business.

The attempt to define where "big data" starts is interesting. The main issue is under what conditions the implementation of systems has to become different, and here the borders are not that clear, since there are currently systems that can scale both up and down.

Interestingly, "big" and "small" are fuzzy terms. This reminds me of one of the variations of the Sorites Paradox that I came across during my philosophy studies many years ago, which goes roughly like this:

Claim:  Every heap of stones is a small heap.
Proof by mathematical induction.
Base:  A heap of 1 stone is a small heap.
Inductive step:  Take a small heap of K stones and add 1 stone; surely it will stay a small heap.



Thursday, May 9, 2013

Causality vs. correlation - statistical reasoning is not enough - NY Times Interview with Dave Ferrucci


Dave Ferrucci, who until several months ago was an IBM Fellow and is known as the father of Watson, was interviewed by the NY Times at his new workplace, Bridgewater Associates.

In the interview Ferrucci somewhat continues the line of thought of Noam Chomsky, saying that AI has concentrated on statistical reasoning based on correlations, but the drawback is that one cannot understand why a prediction made by statistical reasoning is correct. While Chomsky bluntly stated that statistical reasoning does not create a solid model of the universe, Ferrucci claims that a complementary approach is required: understanding causality.

This is a rather old issue. In symbolic logic there is a distinction between "material implication" and entailment. Material implication states that IF A is true THEN B is true, meaning only that whenever A is true, B is also true; this makes a sentence like "If the week has seven days then the capital city of France is Paris" a valid statement in logic. Entailment, on the other hand, says that "A ENTAILS B" only if the connection is necessary and relevant; in other words, there is causality between them. Thus, Ferrucci now concentrates on building causality models to model the world economy. I concur with the assertion that understanding causality gives better abilities of reasoning and prediction. As David Luckham already noted, causality among events is one of the major abstractions of event processing models. Here is a rather old discussion about causality of events.
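
The point about material implication can be checked mechanically. Below is a minimal Python sketch (the function and variable names are mine, chosen for illustration): it encodes "IF A THEN B" as "(not A) or B" and shows that the sentence about the week and Paris comes out true even though neither fact has anything to do with the other. Note that the code cannot express entailment or causality at all; that would need a model of how A brings about B, which is exactly the gap discussed above.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: false only when a is true and b is false."""
    return (not a) or b


def truth_table():
    """All four rows (a, b, a -> b) of the material implication table."""
    return [(a, b, implies(a, b)) for a, b in product([True, False], repeat=2)]


# Two unrelated facts that both happen to be true.
week_has_seven_days = True
paris_is_capital_of_france = True

# True as a material implication, although there is no causal link.
statement = implies(week_has_seven_days, paris_is_capital_of_france)

for a, b, result in truth_table():
    print(f"A={a!s:5} B={b!s:5} A->B={result}")
```

The truth table only looks at the truth values of A and B, never at their content, which is why correlation-style reasoning can accept statements that a causal model would reject.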

Tuesday, May 7, 2013

Event processing academic course at the University of Potsdam

I am following academic courses on event processing, and today I came across a graduate seminar entitled "event processing technology", given at the Hasso Plattner Institute at the University of Potsdam by Mathias Weske, who is known for his work on business process management.

It is interesting to note the topics covered in this seminar: 
  1. Scalability: complex event processing solutions for high performance and low latency 
  2. Aggregation concepts: event processing approaches to extract business information from raw events
  3. Correlation: combining BPM and CEP
  4. Uncertainty: handling of noise in data streams
  5. Prediction: predict future events
  6. Heterogeneity: Processing heterogeneous events

All of them are active research topics in event processing. Some of them cite our work on uncertainty and proactive event processing. It will be interesting to collect information about event processing academic courses worldwide and the lessons learned from them. Academic courses are enablers for making a technology part of mainstream computing.


Tuesday, April 30, 2013

RTInsights online magazine: the April 2013 issue is out




After some delay, the current issue of the RTInsights online magazine is out.
This issue consists of four interesting articles, one in each of the magazine's four themes:

Business Strategies: Chris Taylor, of TIBCO Software, shows readers how Caesars Entertainment keeps track of thousands of customers at over 50 casinos and hotels with CEP, collecting data on each customer's behaviors and transactions to personalize customer experiences and reward big-spending and loyal customers in real time.
Frontline: Nenad Stojanovic, PhD, of the Research Center for Information Technologies at the University of Karlsruhe, explains a leading-edge approach to personalizing content called Adaptive Augmented Reality (A2R), which adapts content in real time to each individual's behavioral cues.
Tools and Tactics: Chris Bird, Principal Consultant at MomentumSI and one of the most experienced practitioners in event processing, explains the uses of both in-stream and out-of-band models for event detection in the analysis of ongoing transactions.
On the Horizon: Published author Phil Windley, PhD and CTO at Kynetx, discusses the coming of the "trillion-node networks" of computing devices and programmable chips embedded into every conceivable product and device (known as the "Internet of Things"), thus adding intelligence and automation to even the simplest tasks.
You can freely download this issue from the RTInsights website. I hope you'll find it useful, and I am looking forward to your feedback (on behalf of the editorial board and the publisher).



Monday, April 29, 2013

On malicious event sources: the Twitter Hoax case

In case anyone missed it, the Twitter Hoax occurred last week. The story is that at 1:08 pm a Twitter message was sent in the name of the Associated Press, reporting two explosions in the White House and the injury of US President Obama. Since there are automatic trading programs that base their decisions (among other sources) on social media, and AP is considered a reliable source, the figure above (taken from the Wall Street Journal) shows what happened to the Dow Jones. The tweet, of course, was not sent by AP; its account was hacked. Within 2 minutes denials started to arrive, and around 1:13 the Dow Jones got back to where it was 5 minutes earlier. Regulators are now checking what they can do about such incidents, as reported today by the NY Times. When I wrote three years ago about the benefits of Twitter as an event source, I noted the danger of abuse.

Actually, when relying on events, the danger of abuse and hacking exists everywhere: one can abuse medical events and sabotage health systems, one can abuse traffic events and make the roads messy, and I guess there are many other creative ways to abuse life. Yet I don't think that going backwards and ignoring incoming information is practical. In fact, we cited malicious sources as one of the motivations for dealing with uncertainty in event processing, but this area is still young. This struggle will continue to evolve as one of the challenges of big data.
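
To give a flavor of what dealing with uncertainty can mean here, below is a rough Python sketch. It is not taken from any existing system. It assumes that each report carries a certainty value for its source, and that reports from different sources are independent, so they can be combined as 1 minus the product of the individual doubts. The source names, the certainty values and the 0.95 threshold are all hypothetical. The idea is simply that a single source, even a usually reliable one, is not enough to trigger an automatic action.

```python
from math import prod


def combined_certainty(reports):
    """Combine (source, certainty) reports of the same event.

    Repeated reports from one source count once (the highest value is kept);
    different sources are assumed independent, which is a strong assumption.
    """
    best_per_source = {}
    for source, certainty in reports:
        best_per_source[source] = max(certainty, best_per_source.get(source, 0.0))
    return 1.0 - prod(1.0 - c for c in best_per_source.values())


def should_act(reports, threshold=0.95):
    """Act automatically only when the combined certainty reaches the threshold."""
    return combined_certainty(reports) >= threshold


# Hypothetical values: one trusted feed alone stays below the threshold,
# even if the same (possibly hacked) feed repeats the message.
single_source = [("wire_feed_a", 0.9), ("wire_feed_a", 0.9)]

# An independent second source pushes the certainty over the threshold.
corroborated = single_source + [("wire_feed_b", 0.6)]

print(should_act(single_source))
print(should_act(corroborated))
```

A hacked account defeats the certainty attached to that one source, so the useful part of the sketch is the requirement for corroboration, not the particular numbers.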

Saturday, April 27, 2013

Machine to machine protocol from NY Times

The "Internet of Things", where anything is connected to the network, is one of the most influential trends today. Cisco, in its vision of the "Internet of Everything" based on the "Internet of Things", predicts that the economic impact of the "Internet of Everything" is 14.4 trillion dollars. There is a lot of work on the infrastructure; one piece of it is MQTT (Message Queue Telemetry Transport), a protocol used to support machine-to-machine communication. The NY Times recently had an article in its blogs on MQTT as an open standard. In the article there is a link to a talk by my IBM colleague Andy Stanford-Clark.
One of the applications mentioned is in the automobile industry: putting sensors in cars, e.g. on the car battery.
Andy has been working on these topics for a long time; in fact, the chapter in our EPIA book that deals with event consumers describes some of Andy's applications, and the picture of the ambient orb that appears in the book was taken at Andy's office. Andy has famous talks and video clips about his house, where he uses MQTT to control the power. A recent video clip that Andy posted, entitled "the house that twitters", demonstrates the idea. There are other presentations and video clips on this topic over the years.
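
To make the publish/subscribe idea behind MQTT a bit more concrete, here is a minimal in-memory sketch in Python. It is not an MQTT client and does no networking; it only imitates how MQTT topic filters work, where "+" matches exactly one topic level and "#" matches all remaining levels. The topic names (such as car/42/battery), the voltage threshold and the TinyBroker class are hypothetical, invented for illustration, and the sketch ignores details of the real protocol such as quality of service, retained messages and topics starting with "$".

```python
from collections import defaultdict


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT-style matching of a topic against a topic filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level of the filter.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class TinyBroker:
    """Hypothetical in-memory stand-in for a broker (no network, no QoS)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic filter -> callbacks

    def subscribe(self, topic_filter, callback):
        self._subscribers[topic_filter].append(callback)

    def publish(self, topic, payload):
        for topic_filter, callbacks in self._subscribers.items():
            if topic_matches(topic_filter, topic):
                for callback in callbacks:
                    callback(topic, payload)


broker = TinyBroker()
low_battery_alerts = []


def on_battery(topic, volts):
    if volts < 11.8:  # hypothetical alert threshold
        low_battery_alerts.append((topic, volts))


# One subscription covers the battery sensor of every car.
broker.subscribe("car/+/battery", on_battery)
broker.publish("car/42/battery", 11.4)
broker.publish("car/42/tyre", 2.1)
broker.publish("car/7/battery", 12.6)
print(low_battery_alerts)
```

The sensor in the car only publishes; it does not need to know who consumes the readings, which is what makes this style a natural fit for event-driven applications.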

The Internet of Things will create most of the world's events in the future, and will be a major factor in making the world event-driven. I'll write soon about the synergy between the old world and the new world.