Event Processing Thinking

Sunday, September 2, 2012

While you have slept - more about the big brother

Back to Chris Taylor - this time he wrote a guest post on the Forbes blog entitled "While you slept", dealing with issues of privacy. I have written before about the big brother aspect of using sensors within the smarter planet. This is another aspect -- a lot of information about us is flowing from social networks and various other systems. A smart program can look at what we publish in blogs, Twitter, Facebook and elsewhere and infer our political opinions, religious beliefs and more. It seems that the combination of sensors, activity on social media, and other systems makes us give up our privacy, and many people don't seem to be bothered -- they allow everybody to see their pictures on Facebook, or read what they write there. There are people who report on Twitter what they are doing 30 times a day, including all their happy and frustrating moments. It seems that there is a growing section of the population who are giving up their privacy of their own will, and other people who are not aware that their privacy is being invaded. I guess that this is one of the characteristics of the current web generation - losing privacy. Of course, event processing can help in drawing conclusions about a person. Chris ends his post with a call for companies to be aware and set up policies and strategy, and for governments to regulate.

Wednesday, August 29, 2012

On five years of blogging

Five years ago I started this blog. In the first posting, entitled "First Blog", I put a picture of myself (from 8 years ago, I think) and stated that I had never written anything like a blog. This is the 694th posting, and I am amazed that I keep doing it. I am still amazed every time I run into a person telling me that he or she reads my Blog. The Blog has been much more visible than I ever imagined; I have looked at some statistics and report them below. Over the years I have been asked several times to advertise stuff (for profit), or to allow people to be guest bloggers, and I always answered politely that this is not really what I have in mind. The biggest reward I got from this Blog was the offer I received from Manning to write a book following the Blog posts on event processing. The book "Event Processing in Action", which I wrote with blood, sweat and tears, with Peter Niblett (he was not the reason for the blood and tears), is probably the most important thing I have done so far (but I have some plans to surprise in the future). As for statistics -- I have looked at two statistics gathering tools: one is Google Analytics, which uses instrumentation I put into the blog 2 weeks after I started it, and the other is the internal statistics of Google Blogs, which started in July 2008. The results are somewhat incompatible (the blog statistics show higher numbers). Anyway -- it seems that I had more than 250,000 page views over the years. Since many of the visitors are one-time visitors, it is more interesting to see how many regular readers this blog has - the number seems to be around 2500 who read every post on this blog, and around 5000 more who read most of the blog posts. I don't know that many people (!). The readers come from 199 countries, where the big ten are:

1). USA, 2). UK, 3). Germany, 4). India, 5). Canada, 6). Israel, 7). Philippines, 8). France, 9). Australia, 10). Japan - among these countries, I have never visited India or Japan. The leading cities are: London, NYC, Bangalore, Tel-Aviv, Manila, Singapore, Karlsruhe and my home city Haifa.

The most read posts were: On unicorn, professor and infant - where I wrote about hype, analytics and reality. Interestingly, in 2008 the claim was that CEP is over-hyped. Today the opinion is that analytics is over-hyped. The second most popular is the post on family trees, an off-topic post where I told about a few days' work I invested in constructing a family tree during the Passover vacation in 2010, and the third is one of the oldest ones, from December 2007, talking about simple events and simple event processing, terms that I don't use anymore.

Enough statistics for today -- five years of blogging passed quickly, let's see if I will be able to proceed for another five.

Back to professional postings - soon.

Saturday, August 25, 2012

Acting faster than the speed of thinking

Chris Taylor from TIBCO has written a post entitled "getting there faster than your opponent" in a Blog with the nice title "a place for good ideas in a fast changing world". In the same spirit as TIBCO's two second advantage, it makes the point that event processing is vital for getting to things faster than others. Chris enlists the famous OODA loop (observe, orient, decide, act), which originated in the US Air Force, to make the point that faster decisions can have an impact on the battlefield. Note that OODA is one variation of control loops; another variation I have written about is the 4D variation.

It is interesting to note that speed of reaction has been one of the initial reasons for using event processing technology in applications like high frequency trading, where trading programs compete on speed. I guess that military applications also gain from competition in fast reaction, as do cyberspace wars.

While these are notable applications, a common misconception is that event processing is restricted to these types of applications. However, there are many other applications in which competing on speed is not an issue, and which can still benefit from event processing because its higher level abstractions reduce the cost of development and maintenance. In fact, one of the first posts in this Blog, almost five years ago, tried to answer the question of whether the only motivation to use event processing is high performance. In that post I discussed the Total Cost of Ownership as a function; I guess it is also applicable today.

The two main observations are: first, acting faster than the speed of human thinking is an important type of event processing application, but not the only reason to use event processing; second, in many cases event processing alone is not enough, and some real-time decision mechanism (reactive or proactive) needs to be applied to achieve autonomic action, since the required speed requires the elimination of the human from the loop.

Saturday, August 18, 2012

On open access in scientific publications

A recent article in "The Scientist" issued a blunt attack on open access in scientific publications, calling it predatory publishing. The article claims that open access publications will not allow people to distinguish between "science" and "non-science", since open access will not guarantee the quality of publications through the rigorous process of peer review, and then gives the main reason for objection --- "implications for tenure and promotion will be significant". Some of the comments compared the arguments with those made against open source software.

I think that there are some orthogonal issues here. First -- I am for open access; actually, every person today can put any paper on a website, and now all we need is to index it so it will be accessible to all. A scientific publication is not different in principle from a newspaper article, blog, TV show or book - in all of them we can find low quality and high quality examples.

The review process is highly subjective and noisy, and the number of false positives and false negatives is quite high. It provides quality control which is better than nothing, but it does not guarantee anything. There can be some quality evaluation models, but this is a different discussion.

I have written before about the fact that the need to measure everything is taking over work in the corporate world. This is also the case in academia: in many cases universities count papers, or use some other quantitative metrics, which fit the current publication system. A new system will require changing these criteria, which is an earthquake in the academic system (which needs earthquakes from time to time)...

I think that open access in scientific publication is happening and resistance is futile.....

Sensors as actuators

Phil Windley, the CTO of Kynetx, has reported recently in his Blog about "talking back to a thermostat". Phil has constructed a thermostat as an event producer. The example he talks about is changing the display color; however, one can think of even more meaningful actions, like turning the air conditioner on and off. Thinking of sensors as vehicles for human awareness is not enough: in our universe there will be more sensors than people, and especially many more sensors than people who are able to look at sensor readings, treat them as data (small or big) and make sense of this data. We are moving more and more into an era of autonomic computing; the Mars rover "Curiosity" is largely autonomic, since the speed of light barrier creates a time delay that makes remote control virtually impossible. We'll see more and more sensors that also serve as actuators, or smart sensors talking directly with smart actuators (e.g. thermostat to the air conditioner controller). The required event processing (filter, aggregation, maybe even patterns that take into consideration other sensors' input, like the presence of people in the house) can be distributed between the smart sensor and the smart actuators without any middleware. More on embedded event processing - later.
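To make the thermostat-to-air-conditioner idea a bit more concrete, here is a minimal sketch in Python of a smart sensor that does its own filtering, aggregation and a simple presence pattern, and then talks directly to the actuator with no middleware in between. All of the names here (SmartThermostat, AirConditioner, the 26 degree threshold, the window of three readings) are assumptions made up for illustration; this is not Phil's implementation.

```python
from collections import deque
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class AirConditioner:
    """Hypothetical smart actuator: it only records the commands it receives."""
    is_on: bool = False
    log: list = field(default_factory=list)

    def command(self, turn_on: bool) -> None:
        if turn_on != self.is_on:          # act only on a state change
            self.is_on = turn_on
            self.log.append("on" if turn_on else "off")


@dataclass
class SmartThermostat:
    """Hypothetical smart sensor that processes its own readings."""
    actuator: AirConditioner
    threshold_c: float = 26.0              # assumed comfort threshold
    window: deque = field(default_factory=lambda: deque(maxlen=3))
    occupied: bool = False

    def on_presence(self, occupied: bool) -> None:
        # Input from another sensor: is anybody in the house?
        self.occupied = occupied

    def on_reading(self, temp_c: float) -> None:
        # Filter: drop physically implausible readings.
        if not -40.0 <= temp_c <= 60.0:
            return
        # Aggregation: moving average over the last few readings.
        self.window.append(temp_c)
        average = mean(self.window)
        # Pattern: hot on average AND somebody is home.
        self.actuator.command(self.occupied and average > self.threshold_c)


ac = AirConditioner()
thermostat = SmartThermostat(actuator=ac)
thermostat.on_presence(True)
for reading in [25.0, 27.0, 999.0, 28.0, 29.0]:
    thermostat.on_reading(reading)
print(ac.log)
```

The split between the two classes is arbitrary; the same filter and aggregation could just as well live inside the actuator, which is exactly the point about distributing the processing.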

Saturday, August 11, 2012

How can the level of uncertainty be determined?

You may identify this formula as the Heisenberg uncertainty principle. Nature is full of uncertainty, and so is the world of business and any other world. It is not difficult to convince people that uncertainty representation and handling is needed, but people wonder how, in practice, users will be able to consume and digest uncertainty based systems. I'll try to address this in a series of posts; the first of them deals with the following issue: uncertainty handling models assume that the level of uncertainty can be quantified. Quantification can be in the discrete world, as a collection of values, or in the continuous world -- probability, measure of belief, and similar metrics. The question is -- how can we determine the value that represents uncertainty?

There are three ways to determine uncertainty: prior knowledge, observation and statistical reasoning.

Prior knowledge often exists due to physical properties: the accuracy of a sensor may be a property reported by its manufacturer, and mathematical models can determine the error rate of a physical measurement due to friction or other physical phenomena. This can also be rooted in statistical analysis, but for a given system it is taken as prior knowledge.

Observation: In some cases an observation includes some uncertainty, such as: I left home somewhere between 8 - 8:30am, there was an accident somewhere on Main Street that made me late for the meeting, I heard that the accident was caused by a drunk driver (but I am not sure this rumor is true), I arrived at the meeting a few minutes after it started, and I waited a long time for the elevator that took me to the 35th floor.

This observation is full of inexact facts: time, space, event attributes and more. People typically do not know how to quantify them, but can use fuzzy terms that can be translated into quantified values either in the discrete space or in the continuous space.
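As a rough illustration of translating such fuzzy terms into quantified values, the sketch below treats "somewhere between 8 - 8:30am" as a uniform interval (continuous space) and maps a hedging phrase like "not sure" to a probability (discrete space). The FUZZY_TERMS table, its numbers, and the choice of a uniform distribution are all assumptions picked for the example, not a recommended calibration.

```python
from dataclasses import dataclass

# Assumed, purely illustrative mapping from fuzzy terms to probabilities.
FUZZY_TERMS = {
    "certain": 1.0,
    "probably": 0.8,
    "not sure": 0.5,
    "unlikely": 0.2,
}


@dataclass(frozen=True)
class UniformInterval:
    """Continuous case: a value known only to lie within [low, high]."""
    low: float
    high: float

    def probability_before(self, x: float) -> float:
        """P(value <= x) under the assumed uniform distribution."""
        if x <= self.low:
            return 0.0
        if x >= self.high:
            return 1.0
        return (x - self.low) / (self.high - self.low)


# "I left home somewhere between 8 - 8:30am", in minutes after midnight.
departure = UniformInterval(low=8 * 60, high=8 * 60 + 30)
print(departure.probability_before(8 * 60 + 15))   # chance of having left by 8:15

# "caused by a drunk driver (but not sure this rumor is true)"
print(FUZZY_TERMS["not sure"])
```

Who decides on the numbers in such a table is exactly the methodology question; the code only shows that once they are decided, the rest is mechanical.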

The statistical reasoning path is based on a learning mechanism that uses the ratio between historical reported input and the real value. The assumption here is that eventually the real value becomes known and can be compared to the reported value.
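One simple reading of this is to keep a history of (reported value, real value) pairs per source and use the fraction of matching pairs as the probability that the next report from that source is correct. The sketch below does just that; the class name, the prior of 0.5 used before any history exists, and the exact-match comparison are assumptions for illustration, and a real mechanism would likely be more elaborate.

```python
class ReliabilityEstimator:
    """Hypothetical learner: estimates how often a source reports the real value."""

    def __init__(self, prior: float = 0.5) -> None:
        self.prior = prior      # assumed value used before any history exists
        self.matches = 0
        self.total = 0

    def record(self, reported, actual) -> None:
        # Called once the real value is eventually known.
        self.total += 1
        if reported == actual:
            self.matches += 1

    def probability_correct(self) -> float:
        # Ratio of correct historical reports to all historical reports.
        if self.total == 0:
            return self.prior
        return self.matches / self.total


estimator = ReliabilityEstimator()
history = [(True, True), (True, False), (False, False), (True, True)]
for reported, actual in history:
    estimator.record(reported, actual)
print(estimator.probability_correct())
```

For numeric readings, the exact-match test would have to be replaced by some tolerance or error measure, which is again an assumption that somebody has to supply.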

There are some interesting questions about the representation formalism, the coverage that can be obtained by these methods, and the methodology for value assignment. I'll write about these questions in a following post.

Friday, August 10, 2012

On what you need vs. what you can consume

In a recent review of our uncertainty project, the major question that came up was -- will users be able to consume uncertain data, or uncertain decision making? The question of consumability is a major one -- and it seems that in some cases consumability is considered a barrier to introducing new technologies. This is true for all technologies based on statistical reasoning, which have the reputation that in order to use them one has to have a PhD in statistical reasoning... Consumability comes at all levels --- there are products that come with hundreds of configurable parameters to control their performance; some famous expert systems were conceived to configure these parameters, as the number of humans who can successfully do it is too small. Consumability also applies to application developers who use languages and tools. In many cases the number of language commands that are never used is high; on the other hand, the developer works hard to achieve the desired functionality, since it is not supported in a simple way. Then, of course, there is the consumability of any product for consumers. Today we have very sophisticated washers and dryers with many programs, which are not understood at all by the consumer, or just look at the remote control of any home appliance -- TV, air conditioner and more -- do we really know what all these buttons are for?

Back to uncertainty --- it is quite clear that the world is uncertain, and ignoring uncertainty in decision making may yield undesirable results. It is also clear that in daily life we often make decisions under uncertainty; the main problem is how to move this intuition into a computerized form in a way that people will be able to utilize. The major questions are:

how are uncertainty quantifications (such as probabilities) obtained?
how are uncertain results consumed in decisions?
how is the quality of uncertain decisions evaluated and taken into account?

I'll write more about it soon - in general, the consumability issue in event processing application design and construction will occupy much of my agenda in the near future. More - later.