Showing posts with label EPA. Show all posts

Wednesday, November 28, 2012

On Dynamic EPAs by Bernhard Seeger

I came across a presentation by Bernhard Seeger entitled "Dynamic complex event processing - not only the engine matters"; the picture above is taken from that presentation. Seeger uses the term "DEPA" for "Dynamic Event Processing Agent". The "dynamic" refers to the ability to add or modify EPAs without affecting the event sources, to add or modify event sources without affecting the EPAs, and to change EPAs at run-time (we have supported this feature in Amit).

The references between all the players are indirect and are made through meta-data entities. There are other components to this model, such as inclusion of actions in order to check contradictions and simulations for debugging.

I totally agree that all these features are important (I am not sure that a new term is needed, since this relates to the implementation of an EPA). In fact we have worked on related issues in the past; see our paper in DEBS 2010 entitled "Analyzing the behavior of event processing applications".

In any event - interesting presentation, read and enjoy!   

Wednesday, December 15, 2010

Revisiting EPN

This illustration, taken from the EPIA book and drawn by Peter Niblett, is a portion of the EPN that describes the "Fast Flower Delivery" example that accompanies the book. In an internal discussion today somebody raised the question of why we need an EPN at all, rather than using the alternative that has been used in Amit and other places: each EPA subscribes to an event type; whenever an event of this event type is detected, the appropriate EPA listens to it and processes it; the entire event flow is implicit, and the person defining the system does not need to worry about it.


Since this is actually a good question, I wanted to share my response. There are two main reasons why we have shifted our thinking to the EPN model: efficiency and usability.


I'll start with usability. Experience shows (and this observation is also true of inference-based systems) that people feel more comfortable when they are able to control the flow rather than having implicit flows: they understand better what the system does, can better debug and validate it, and trust such systems more. Note that an EPN is not a workflow: it does not represent control flow, it represents event streaming flow (in a way similar to data flow, with some semantic distinctions).


The other reason is efficiency. If an EPA subscribes to an event type, then either the EPA has to process and filter out a substantial number of irrelevant events, or the number of event types has to grow considerably. Imagine the following scenario: an event of type ET1 arrives; first it meets a filter that filters out many of the events using some assertion, and then there are various EPAs that process only the filtered-in events. One of these EPAs is an enrichment that adds some information from a database, and the enriched event is then sent to an aggregator for further processing. If we use the "event type" subscription, there are two choices. The first is to create an event type ET2 for the filtered-in events, identical to ET1, and create a derived event of type ET2 for each filtered-in event of type ET1, then create an event type ET3 for the enriched event with the added attribute; then indeed each EPA subscribes to a single event type. The second choice is to use ET1 for all three cases, but add an indication (using some derived attribute) of which variation of ET1 it is, and filter inside the aggregator to keep only the right variation of ET1. Both are inefficient: the first due to the need to manage many more event types, the second because many more events are transmitted to each EPA only to be filtered out, and the order also becomes important here.


The explicit EPN resolves this by the fact that each EPA sends its output to a channel, and the channel can route according to source, type, assertion etc., thus a specific output terminal of a channel is really the topic to which an EPA subscribes. Note that all the possibilities mentioned before are just special cases of EPN, and if one insists, such an EPN can be constructed. In the extreme case, one can construct an EPN with a single channel that routes every event to every EPA, leaving each EPA to decide whether it wants to use it or not, but I would not recommend it as a good design pattern. More - later.
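
The filter, enrichment and aggregation scenario can be sketched as an explicit EPN in a few lines of Python. This is only an illustration under assumptions: the names `Channel`, `make_filter`, `make_enricher`, `make_aggregator` and `run_pipeline`, the `amount`, `customer` and `region` attributes, and the dictionary standing in for the database are all invented here. The point it shows is that every event stays of type ET1, and the channel wiring alone decides which EPA sees it.

```python
from collections import defaultdict


class Channel:
    """Routes each event to every EPA wired to its output terminals."""

    def __init__(self):
        self.targets = []

    def connect(self, target):
        self.targets.append(target)

    def send(self, event):
        for target in self.targets:
            target(event)


def make_filter(assertion, out_channel):
    # Filter EPA: forwards only the events that satisfy the assertion.
    def epa(event):
        if assertion(event):
            out_channel.send(event)
    return epa


def make_enricher(lookup, out_channel):
    # Enrichment EPA: adds an attribute taken from a stand-in reference table.
    def epa(event):
        region = lookup.get(event["customer"], "unknown")
        out_channel.send(dict(event, region=region))
    return epa


def make_aggregator(totals):
    # Aggregation EPA: sums amounts per region; it never sees filtered-out events.
    def epa(event):
        totals[event["region"]] += event["amount"]
    return epa


def run_pipeline(events, lookup, threshold):
    totals = defaultdict(int)
    source, filtered, enriched = Channel(), Channel(), Channel()
    source.connect(make_filter(lambda e: e["amount"] >= threshold, filtered))
    filtered.connect(make_enricher(lookup, enriched))
    enriched.connect(make_aggregator(totals))
    for event in events:
        source.send(event)
    return dict(totals)
```

No ET2 or ET3 type is needed, and the aggregator contains no filtering logic of its own.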

Monday, October 5, 2009

A simple example of agents and contexts


In the last few days, dishes in my house have been washed by hand. Our dishwasher has shown signs of old age, so I acquired a new one. The first step was to order delivery from the store. There is the software engineering principle of "separation of concerns", thus those who do the delivery don't install it. I am not allowed to install it myself, since if I open the package I lose the warranty; only a technician of the service company related to the importer is authorized to open the package. After some coordination this technician arrived today, looked at our kitchen and said -- "this is very common, they have changed the standard, now there is a thermostat on the pipe, so the size of the hole that you need in the kitchen closet is larger". Since there is a separation of concerns, he does not do holes, and I had to call the carpenter, who came later today, so I'll be able to call the technician again tomorrow... This is the result of the fact that the event "standard of pipe changed, thus it affects the size of the pipe" was not reported by the store, since they don't install dishwashers so they probably don't know --- this separation of concerns, and not seeing events as part of a larger story, wasted a lot of society's time and money...

Anyway -- enough complaining.

I got a question from Hans, related to my posting on context, I am copying the question here:

"
I would be interested to hear about how the context concept works in a particular use case: Let's say I have events that contain, among other things, a number. I would like to capture the third quartile (a data point that sits immediately above 75% of the other points) of the number, every hour. I would then like to perform some operation over the previous 8 of these hourly numbers.

How would this be expressed as contexts and agents?"

OK -- here it is:

We start with an event of type E1 that flows into the system from some producer and has some numeric attribute A, for which we want to capture the third quartile.

  • First we define a context C1 of type sliding fixed temporal interval, with both duration and period of 1 hour.
  • Then we define an agent A1, which is valid within C1, calculates the third quartile and produces an event of type E2, with an attribute B that captures the derived value.
  • Further we define a context C2 that is a sliding event interval on event E2, with event count period and event count duration both of 8.
  • And finally we define an agent A2 that subscribes to events of type E2 and is valid within context C2; it does some operation on E2, and may in turn produce (say) an event of type E3, which flows to some consumer.

This is the basic model. Of course, it is a bit more complicated: calculating quartiles may not be a basic function, and if our agent library does not contain an agent that calculates quartiles, it can be built from some combination of agents, since an agent can be recursive and include a mini-EPN; as such we can still model it as a single agent in the high-level view.
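
The two contexts and two agents can be sketched in Python. This is an illustration only, with several assumptions: the function names are invented, E1 events are modelled as `(timestamp in seconds, A)` pairs, the quartile uses the nearest-rank convention (other conventions are equally legitimate), and the "operation" of A2 is passed in as a plain function.

```python
import math
from collections import defaultdict, deque

HOUR = 3600  # seconds


def third_quartile(values):
    # Nearest-rank convention (an assumption; other quartile definitions exist).
    ordered = sorted(values)
    return ordered[math.ceil(0.75 * len(ordered)) - 1]


def agent_a1(e1_events):
    """Context C1: consecutive 1-hour windows (duration = period = 1 hour).

    Emits one E2 event per hour, with attribute B holding the third quartile.
    """
    windows = defaultdict(list)
    for timestamp, a in e1_events:
        windows[int(timestamp // HOUR)].append(a)
    return [{"hour": hour, "B": third_quartile(values)}
            for hour, values in sorted(windows.items())]


def agent_a2(e2_events, operation, duration=8, period=8):
    """Context C2: count-based window over E2 events.

    duration is the window size in events, period is how many events pass
    between two window openings. Emits one E3 event per completed window.
    """
    window = deque(maxlen=duration)
    e3_events = []
    for count, e2 in enumerate(e2_events, start=1):
        window.append(e2["B"])
        if count >= duration and (count - duration) % period == 0:
            e3_events.append({"hour": e2["hour"],
                              "value": operation(list(window))})
    return e3_events
```

With period and duration both 8, as in C2 above, A2 fires once per block of eight hourly values. Calling `agent_a2(e2, operation, period=1)` instead gives the overlapping variant, where the operation runs every hour over the previous eight values.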

The nice thing about this abstraction is that it is quite simple to model such problems... More - Later.



Sunday, September 13, 2009

On event channels

Last week, the Disney Channel arrived on the Israeli cable system and enriched the set of already existing children's channels; so speaking about channels, it is a good time to discuss another type of channel -- an event channel, which is discussed in chapter 6 of the EPIA book draft. Some people view a channel as an edge in the event processing graph, but we view a channel as a type of node, since it has some processing associated with it. We define a channel as a processing element that receives events from one or more source processing elements (we refer to EPAs, producers and consumers as processing elements), makes routing decisions, and sends the input events unchanged to one or more target processing elements in accordance with these routing decisions. Note that, like Event Processing Agents, channels are abstractions and can be implemented in various ways (e.g. through messaging systems, through buffers, through persistent stores etc...). Channels are classified according to their routing schemes. Some of the common routing schemes are:
  • Fixed routing scheme: The channel has predefined input terminals wired to predefined processing elements, and predefined output terminals wired to predefined processing elements. Every event that is received on any input terminal is sent to all output terminals. Note that this type of channel can be defined implicitly.
  • Subscription-based: EPAs or consumers can subscribe to the channel dynamically. The routing decision is determined according to the list of subscribers that is valid at the time that a decision is made.
  • Itinerary-based: The sink's input terminal identifier or identifiers are obtained from some attribute in the event's payload. This is used to send an event to a specific consumer instance when the EPN node is the consumer class.
  • Context-based: The channel makes routing decisions based on the context to which the EPA belongs. This is applicable for the pattern detection ("complex event processing") type of EPA. The channel selects the appropriate run-time EPA based on the context defined in the pattern. I'll discuss contexts at length in one of the next postings, as this is the topic of the next chapter in the book.
  • Type-based: The channel makes routing decisions based on the event type of the event that is being routed.
  • Content-based: The routing decision is based on the event's content; it can be phrased as assertions, rules, decision trees or decision tables, which are based on the input event content as well as on context information.
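
A few of these routing schemes can be sketched as interchangeable routing functions plugged into one channel class. This is only an illustration with invented names (`Channel`, `subscription_based`, `type_based`, `content_based`, `itinerary_based`, and the `destination` attribute); context-based routing is left out because it depends on the context model discussed separately.

```python
class Channel:
    """Sends each input event, unchanged, to the targets picked by `route`."""

    def __init__(self, route):
        self.route = route      # function: (event, targets) -> target names
        self.targets = {}       # target name -> callable processing element

    def subscribe(self, name, target):
        self.targets[name] = target

    def unsubscribe(self, name):
        self.targets.pop(name, None)

    def send(self, event):
        for name in self.route(event, self.targets):
            target = self.targets.get(name)
            if target is not None:
                target(event)


def subscription_based(event, targets):
    # Whoever is subscribed at the time of the decision receives the event.
    return list(targets)


def type_based(table):
    # table: event type -> list of target names
    return lambda event, targets: table.get(event["type"], [])


def content_based(rules):
    # rules: list of (assertion, target name) pairs
    return lambda event, targets: [name for assertion, name in rules
                                   if assertion(event)]


def itinerary_based(attribute="destination"):
    # The target name travels inside the event payload.
    return lambda event, targets: [event[attribute]]
```

For example, `Channel(type_based({"ET1": ["aggregator"]}))` delivers ET1 events only to the element subscribed under the name `aggregator`, while `Channel(subscription_based)` broadcasts to the current subscriber list.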

This is just the basic definition; in one of the next postings I'll show an example of how all these concepts fit together.

Monday, September 7, 2009

On Event Processing Patterns


This is an illustration that was created by one of my former colleagues on the AMiT team, Tali Yatzkar, when she attended a "presentation course", as an exercise in the course, to explain what an event processing pattern is (we did not use this term at that time). This is the original picture; it is animated (the animation is not preserved when copying from file to picture), and the geometric shapes on the left-hand side of the picture are moving. The idea is simple: there are patterns that designate the relationship between a set of events, e.g. a conjunction: event E1 and event E2 both occur in the same context (e.g. relate to the same person within 2 hours). This rather simple idea is the jewel in the crown of event processing systems, and the basis of what David Luckham called Complex Event Processing. It is also what makes a composite event in active database terminology (I have discussed in the past the subtle differences between the definitions of those terms). This illustration, in some variations, has a life of its own, and we saw it in presentations of some other companies and people; I even once had to comment on a Slideshare presentation when it was attributed to (see my comment to this presentation).

Anyway, besides giving Tali her due credit, I am writing about event processing patterns since one of the chapters we are now completing for the second-third review of the EPIA book deals with the notion of event processing pattern as a major abstraction. As with all abstractions in our meta-language, a specific language may implement a certain pattern as a language primitive, or implement it through a combination of language primitives. Those interested in the formal definition will need to read the book, since the formal definition requires definitions of several terms, so I'll give a less formal definition here -- a pattern is really a function that takes a collection (or stream) of input events that satisfy some filtering assertions (e.g. they have to be within context, and have certain other patterns) and returns zero or more "matching sets", each of which includes a collection of individual events that collectively satisfy the pattern. Let's take a couple of examples:

The first example is a bid example: there is a bid for some auction that has been provided on an auction site. The idea is to select a single winner. The input events are auction offering events and bid events. The bid events are partitioned according to the auction offering they refer to, and are also filtered according to time (each auction is open for a certain amount of time only) and according to a threshold condition (the bid has to be no less than a minimal price). The matching set in this case consists of a single bid event per auction offering. The matching pattern here is "relative max", which means that we are looking for the event with the maximal value, relative to the other input events, of some attribute (in this case the bid amount). Note that the "relative max" pattern does not necessarily provide a single bidder, thus we also need a "synonyms policy" to determine what happens when we have multiple events of the same type that match the criteria. In this case we take the fairness criterion of FCFS (first come, first served), and the synonyms policy will be "first", meaning the first bidder that offered the maximal price. In our meta-language this looks like:
Pattern name = Bidder selection; Pattern type = relative max; Input events = (Auction offering, Bid); Context = (segment = by auction offering, temporal = auction offering is open); Filtering assertion = (Bid.Price >= Auction Offering.Minimal Price); policies = (cardinality = single, synonyms = first)

Note that in these three and a quarter lines we have expressed logic that is quite complex, and this is the magic of patterns. As an exercise to the reader, write the equivalent logic in Java, and then change it so that it will choose all bidders that have provided the relative maximum for a second round of bids.
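
For comparison, here is one possible hand-coded equivalent, written in Python rather than Java. It is an illustration under assumptions, not the book's implementation: the function name `select_bids`, the dictionary layout of the events, and the `"all"` value of the synonyms policy (for the second-round variation) are all invented here.

```python
def select_bids(offering, bids, synonyms="first"):
    """Relative-max pattern over the bids of one auction offering.

    offering: dict with id, minimal_price, opens, closes
    bids: iterable of dicts with auction_id, bidder, price, time
    synonyms: "first" keeps the earliest maximal bid, "all" keeps every one
    """
    # Context (segment and temporal) plus the filtering assertion.
    candidates = [
        bid for bid in bids
        if bid["auction_id"] == offering["id"]
        and offering["opens"] <= bid["time"] <= offering["closes"]
        and bid["price"] >= offering["minimal_price"]
    ]
    if not candidates:
        return []  # zero matching sets

    # Relative max: the maximum is taken over the surviving events only.
    top_price = max(bid["price"] for bid in candidates)
    winners = [bid for bid in candidates if bid["price"] == top_price]

    if synonyms == "first":
        return [min(winners, key=lambda bid: bid["time"])]
    return winners
```

Even this compact version has to spell out partitioning, time filtering, thresholding, maximum selection and tie handling by hand, which is what the single pattern definition expresses declaratively.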

The second example is a sequence example; this figure is taken from the EPIA book. The example looks at the case in which a patient is released from the hospital and then admitted again within 48 hours with the same complaint that brought this patient to the hospital the first time.


Here we are looking for a sequence (the order is important, of course) of the patient release event and the patient admission event for the same patient with the same complaint within 48 hours. The definition in our meta-language will be roughly:

Pattern name = Repeating admission; Pattern type = sequence; Input events = (Patient Release, Patient Admission); Context = (segment = by patient and complaint, temporal = Patient release + 48 hours)

This pattern creates a matching set which consists of a pair of events of types patient release and patient admission.

Note that the pattern returns the selected events, and the EPA can derive new events as a function of these selected events.
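
The sequence pattern can be sketched in Python as follows. This is an illustration with invented names and an assumed event layout (dictionaries with `type`, `patient`, `complaint` and a `time` measured in hours), and it assumes the events arrive in time order.

```python
def repeating_admissions(events, window_hours=48):
    """Sequence pattern: a release followed by an admission of the same
    patient with the same complaint, within `window_hours`.

    Returns the matching sets as (release, admission) pairs.
    """
    last_release = {}     # (patient, complaint) -> most recent release event
    matching_sets = []
    for event in events:
        key = (event["patient"], event["complaint"])   # segment context
        if event["type"] == "release":
            last_release[key] = event                  # opens the temporal context
        elif event["type"] == "admission":
            release = last_release.pop(key, None)
            if (release is not None
                    and event["time"] - release["time"] <= window_hours):
                matching_sets.append((release, event))
    return matching_sets
```

The function returns only the matching sets; deriving an alert event from each pair would be the job of the EPA that uses the pattern.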

Here we saw two types of patterns: relative max, which is a set-oriented pattern, and sequence, which is an event-oriented pattern. I'll provide the list of patterns collected so far in one of the future postings.

Saturday, September 5, 2009

On the anatomy of event processing agents


Weekends are catch-up time for any workaholic. In my case, I dedicate the spare time on weekends to advancing the EPIA book, which, as expected, is somewhat behind the original schedule. We are now cleaning up the main chapters of the book, which deal with event processing networks, contexts and patterns. I'll dedicate some of the coming postings to book-related stuff.

In chapter 6 we take a deep dive into the notion of EPN and its various components: event processing agents, channels and global states. Consumers and producers are discussed earlier in the book. The Event Processing Agent (EPA) is a major abstraction we are using. In a recent posting I discussed that the conceptual EPN may be mapped in various ways to implementations; an EPA is part of the EPN, so it is a conceptual creature as well, and can be mapped to an implementation in various ways.

The illustration at the top of this page shows the (logical) anatomy of an EPA. It shows a kind of SCA-like component (SCA now has a proposed events extension, I'll discuss it another time), with input terminals and output terminals. In this case, the EPA gets events from input terminals, each of which has a distinct type of event. Let's take a simple three-event conjunction example. A virtual car dealer matches used car sellers and buyers; once there is an agreement on the transaction, it supervises that it is carried out. The used car buying transaction is a conjunction of three events (their order may not be important):

Event1: The seller physically hands over the car to the buyer
Event2: The buyer transfers to the seller the agreed-upon amount for the car
Event3: The department of transportation approves and records the ownership transfer -- in Israel it is a service given in post offices: the buyer and seller identify themselves and sign, and the post office clerk has access to the transportation ministry vehicle registration system and can verify who the registered owner is, and whether the car has any encumbrance

The three logical steps of an EPA are: filtering, matching (if the EPA is looking for a pattern), and deriving. The filtering part may be done outside the EPA, e.g. if the EPA instance deals with a certain transaction, then the channel should route only events related to this transaction to the EPA; I'll discuss more of the notion of channel in the near future. However, there are cases in which the events should be filtered by the EPA, e.g. when the filtering involves a condition on more than one event. After the filtering part, some subset of the input events survives to be the players in the EPA. If the EPA is also doing pattern matching, it creates matching sets according to the requested pattern. In the case of the used car sale, the pattern is a conjunction of these three events that relate to the same transaction; the order is up to the buyer and seller. When the pattern is matched, it means, in this case, that three events have been detected which together satisfy the pattern. These three events create a matching set; for each EPA instance there may be zero, one or multiple matching sets during the lifespan of this EPA instance. After the matching set is created, the third phase, the derivation phase, determines what should be reported out of the results of this EPA. It can just report the composite event that contains the three events in the matching set, and it can contain any other raw or calculated value that is a function of the attributes of these three events. Furthermore, for different purposes, different derived events can be required, consumed by other EPAs or by consumers. This is the way we describe an EPA; note that this description is general enough to cover EPAs of different types. The input terminals can take either individual events or sets. I'll discuss more about EPA types, channels, contexts and patterns later.
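
The three steps can be sketched for the used car example as a small Python class. This is an illustration only: the class name `ConjunctionEPA`, the event type names, the `transaction` and `time` attributes and the shape of the derived event are all assumptions made for the example.

```python
REQUIRED = {"handover", "payment", "ownership_transfer"}


class ConjunctionEPA:
    """One EPA instance per transaction: filter, then match, then derive."""

    def __init__(self, transaction_id):
        self.transaction_id = transaction_id
        self.pending = {}   # event type -> first event of that type seen

    def filter(self, event):
        # Could equally be done by the channel that feeds this instance.
        return (event["transaction"] == self.transaction_id
                and event["type"] in REQUIRED)

    def match(self, event):
        # Conjunction: all three event types, in any order.
        self.pending.setdefault(event["type"], event)
        if REQUIRED.issubset(self.pending):
            matching_set = [self.pending[t] for t in sorted(REQUIRED)]
            self.pending = {}
            return matching_set
        return None

    def derive(self, matching_set):
        # The derived event carries the matching set plus a calculated value.
        return {
            "type": "sale_completed",
            "transaction": self.transaction_id,
            "completed_at": max(e["time"] for e in matching_set),
            "members": matching_set,
        }

    def on_event(self, event):
        if not self.filter(event):
            return None
        matching_set = self.match(event)
        return self.derive(matching_set) if matching_set else None
```

Feeding the instance a payment, an ownership transfer and a handover for the same transaction, in any order, yields one `sale_completed` event on the third call to `on_event`; all other calls return `None`.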

Monday, August 31, 2009

On conceptual and run-time EPN



I am working now in my spare time on completing the second third of the EPIA book, so I'll have several postings related to the next three chapters of the book, which are now in the "cleaning phase". The topic I'll discuss today deals with the concept of EPN (Event Processing Network), which is a major concept in our book. The approach we have taken in the book is to explain event processing through a meta-language that provides the various event processing concepts, and the event flow through a model based on the event processing network. We are now also completing an editor that will enable the reader to play with the meta-language. However, this meta-language is not an executable language (at least not in this phase), and thus we also show the readers how the same application described in the meta-language is implemented using various executable languages of different language styles. The EPN described by the meta-language is a "conceptual EPN"; it consists of logical EPAs, while the run-time EPN consists of run-time artifacts that implement the run-time instances of the EPAs.
The conceptual EPN can be mapped into a physical implementation in various ways, as shown in this picture:



In the traditional centralized implementation, the entire EPN is executed by a single run-time artifact, and the EPN describes the internal flow within this artifact.

When talking about distributed EPN, the EPN can be distributed according to several criteria:

  • Segment partition: All the EPAs that relate to "platinum customers" are being executed by one run-time artifact, all the EPAs that relate to "gold customers" are being executed by another run-time artifact etc...
  • Function partition: All the EPAs that perform a certain function are being executed by a unique run-time artifact
  • Location partition: All the EPAs that relate to events created in a certain location are being executed by one run-time artifact.
These, of course, are just examples. The most distributed example is, of course, a direct mapping of each EPA instance to its own run-time artifact.
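
The mappings above differ only in the key used to group EPAs into run-time artifacts, which a short Python sketch can make concrete. The names here (`assign`, `PARTITIONS`, and the `segment`, `function` and `location` attributes of an EPA description) are assumptions made for the illustration.

```python
from collections import defaultdict

# Each partition criterion is a function from an EPA description to the
# name of the run-time artifact that should execute it.
PARTITIONS = {
    "centralized": lambda epa: "engine",
    "segment": lambda epa: epa["segment"],
    "function": lambda epa: epa["function"],
    "location": lambda epa: epa["location"],
    "per_instance": lambda epa: epa["name"],
}


def assign(epas, criterion):
    """Maps the EPAs of a conceptual EPN to run-time artifacts."""
    key = PARTITIONS[criterion]
    artifacts = defaultdict(list)
    for epa in epas:
        artifacts[key(epa)].append(epa["name"])
    return dict(artifacts)
```

The conceptual EPN (the list of EPA descriptions) stays the same in all five cases; only the grouping into artifacts changes.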

The conceptual EPN is important for design and validation of the event processing application, while the run-time EPN is useful for control and management of the run-time.

More about EPNs and their components - in later postings.


Thursday, July 23, 2009

On logical and physical interpretations of EPN and EPA


My youngest daughter Daphna finished last week her summer course at the Technion in the framework of the "science seeking youth" program. She studied her first programming course using "Microworlds", a variation of the rather old Logo language. This is of course translated to a lower-level language when it executes in practice, but this fact is totally transparent to those who program in Microworlds. I am using this analogy since there seems to be some terminology discussion going on recently about the terms EPA and EPN. These terms were introduced in the past by David Luckham, who used them to describe a physical operational view of an event processing application. Thus, an EPA is mapped in 1-1 fashion to a software module, and the EPN describes the running software modules and the connections among them using physical channels; the first version of the EPTS glossary reflects this view.

However, the way I am using the terms EPN and EPA is slightly different. The physical view is of interest to system administrators, but for the users, designers and developers the logical view is more relevant, thus I am using these terms in a logical way and not a physical way. In order to demonstrate the difference, let's look at the following simple example: there are many patterns that relate to the management of a call center; one of them is the frustrated customer detection: if a gold customer complains three times within a single day (possibly on multiple issues), then a supervisor should call this customer immediately.
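
As a logical function, the frustrated customer pattern can be sketched in Python in a few lines. This is an illustration under assumptions: the function name and tuple layout are invented, complaints are assumed to arrive in time order, and "within a single day" is read as a sliding 24-hour window (a calendar-day reading would be equally plausible).

```python
from collections import defaultdict, deque

DAY = 24 * 3600  # seconds


def frustrated_customers(complaints, threshold=3, window=DAY):
    """complaints: (timestamp, customer, tier) tuples in time order.

    Returns (timestamp, customer) alerts for gold customers who complained
    `threshold` times within `window` seconds.
    """
    recent = defaultdict(deque)   # customer -> timestamps still in the window
    alerts = []
    for timestamp, customer, tier in complaints:
        if tier != "gold":
            continue
        times = recent[customer]
        times.append(timestamp)
        while timestamp - times[0] >= window:
            times.popleft()       # drop complaints that fell out of the window
        if len(times) >= threshold:
            alerts.append((timestamp, customer))
            times.clear()         # start counting afresh after an alert
    return alerts
```

Nothing in this function says whether it runs as one module for all customers or as one module per customer; that is exactly the physical decision discussed next.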

However, there is a spectrum of ways that this application can be implemented in reality:
  • It is possible to have a centralized implementation with a single software module that executes all the different functions within this application, and actually the EPN is internal to this module;
  • On the other extreme, we can have a software module that implements each single function instance, for example, one agent that detects the frustrated customer pattern for Alice and a different agent that detects the frustrated customer pattern for Bob.
  • Another possibility is a context-oriented implementation --- all patterns related to Alice are processed within a single software module
  • Yet another possibility is a functional partition -- there is a single module for detecting the frustrated customer pattern for all customers
  • There can also be some more combinations.
Should the user / system designer / developer care about it and build a different EPN for each variation? In the past, when event processing was hard coded in general-purpose programming languages, the logical EPN was also the physical EPN, but one of the gains from using dedicated event processing languages is the ability to abstract the implementation out.
The actual mapping of functions to software modules is left to an optimizer, and can be dynamically changed based on changes in the system behavior, load balancing etc. Actually the paper we presented in DEBS 2009 is part of such an optimization scheme. Thus, the way I am using the term EPA is as a single logical function and not necessarily a software module. In the EPIA book we are building our entire concept based on a logical-level meta-language that can be translated to various implementations, and even programming styles. As said, there is also an interest in the physical realization of the EPN, but it is more of interest to system administrators and implementers of event processing products, and it should be transparent to the user of event processing applications. More on this topic - later.

Tuesday, May 12, 2009

On Gartner's EPN Reference Architecture


Today is a holiday (for children, no vacation for adults..) called Lag Baomer; the highlight (besides not going to school) is that last night all the children gathered around bonfires, as seen in the picture. Fun.

Recently Gartner has published a report called "A Gartner Reference Architecture for Event Processing Networks".

On the positive side, it seems that the concept of EPN as an underlying model for event processing is catching on. The readers of the Blog may realize that I am of the opinion that we need an agreed-upon conceptual and execution model for event processing (the same role that the relational model assumes in relational databases; however, I never believed that the relational model per se is appropriate also as the model behind event processing). The book I am writing now, "Event Processing in Action", concentrates on the notion of EPN, and takes a deep dive into the construction of EPN-based applications.

Reading Gartner's report I found some slight differences between the way they describe EPN and my own description. In the Gartner report they define a term called "dissemination network" that consists of event processing agents, channels and the event flow among them, and then they define EPN to be a dissemination network + producers + consumers. I actually could not find any compelling reason to introduce the notion of dissemination network. According to the definition we are using, an event processing network is a directed graph that has nodes for producers, channels, EPAs and consumers, and edges that determine the event flow among them. Another difference is that the Gartner report views event consumers and event producers as types of event processing agents. I have a slightly different opinion: I think that both event producers and consumers are not really event processing agents, since an event processing agent is some software module that processes events and may generate more events. Event consumers and producers have nodes representing them in the EPN in order to make the event flow from and to them explicit; however, these nodes are only proxies of the actual producer and consumer, and for the event processing network they are sources and sinks. The main difference is that EPA functionality is explicitly specified in the EPN definition, while what the producer and consumer do is a "black box". We don't want to include their functionality, since we don't want to extend the event processing language ad infinitum.

Mentioning the EPIA book -- Chapter 3 is now on the Web, and can be obtained through the MEAP program; this is the last chapter in the introductory part, and deals with principles of programming with events. Chapter 4, the first in the deep dive, will be sent to the publisher soon. It has been much more challenging to write, and deals with what information we need to store about events -- I'll Blog about it soon.

Monday, April 27, 2009

More on Revision


Long day today: I got to the office around 8AM and left around 9PM. Since we have a holiday this week I am trying to condense the remaining days of the week, and the result is a long day with plenty of conference calls. The picture above is a glance (from below) at the IBM Haifa Lab (the pair of connected buildings on the right-hand side of the picture); my office is in the back building (known as the "banana" due to its shape), and is not really in the nice part of the building -- the one with the view of the Haifa Bay -- well, one can't have everything in life :-).

I still need to complete the previous posting on revision. I gave some explanation about the concept of revision, and now I still need to discuss implementation of revision in event processing. To recall -- a revision in event processing is getting later knowledge that asserts that either a reported event did not really happen, or some information associated with the event was wrong.

Let's look at two separate cases: one in which the processing has not gone out of the event processing network, and a second in which the results of the processing have gone out to the "outside world".

In the first case, there may be an opportunity to revise the impact of the revised event by doing a kind of undo/redo for all the event processing agents that it passes through, directly or indirectly. Direct ones are easy -- those in which the revised event participates as an input. Indirect ones are more tricky, since we need to trace the causality among events: an event that is an output of an event processing agent in which the revised event participates (relating to the same context) has a causality relation to the original event. Thus, an event processing agent in which this output event participates as an input also needs to do an undo/redo, and since causality is a transitive relation, this continues as far as processing in the EPN has progressed so far. It should be noted that the existence of a causality relation may not require a real undo/redo. Take as an example the case in which an event of type E1 designates a bid, and the event of type E2 designates the bid with the maximal value that arrived in a certain time interval. Let's assume that a certain bid has been revised; however, neither the revised bid nor the revising bid changes the selection of E2.
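
Tracing the transitive causality can be sketched as a breadth-first walk over a log of derivations. This is an illustration only: the function name `affected_agents` and the assumption that the EPN keeps a log of `(agent, input event ids, output event id)` records are mine, not part of any specific product.

```python
from collections import deque


def affected_agents(revised_id, derivations):
    """Finds every agent that may need an undo/redo after a revision.

    derivations: list of (agent, input_ids, output_id) records, one per
    derived event. Returns (agents in discovery order, tainted event ids).
    """
    affected = []
    tainted = {revised_id}
    queue = deque([revised_id])
    while queue:
        event_id = queue.popleft()
        for agent, input_ids, output_id in derivations:
            if event_id not in input_ids:
                continue
            if agent not in affected:
                affected.append(agent)      # directly or indirectly touched
            if output_id not in tainted:
                tainted.add(output_id)      # causality is transitive
                queue.append(output_id)
    return affected, tainted
```

The walk is conservative: it lists every agent that is causally reachable. As the bid example shows, a redo that reproduces the same output could stop the propagation at that agent, which this sketch does not attempt.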

The second case is that the revised event has consequences that have been sent to an external consumer; thus it may have triggered an action, a collection of actions, or a workflow that has been carried out, and this may propagate further ("the butterfly effect"). In this case we can either treat it as "too late" and do nothing, or, in cases where it is critical (e.g. the revised event has some financial meaning), undo/redo the consequences as well. Then we'll need to issue compensation for the triggered action, which may be impossible (the consumer does not support compensation) or difficult. I'll blog again about revising the history and its aspects at a later phase.

Saturday, April 18, 2009

On Event Processing Building Blocks

Back to work for one day in the office, with five conference calls (one with Germany, one with France, one with UK, and two with USA...) and then back home for the weekend. When I have free time I like to read books; the current book I am reading is "A Lion Among Men", one of the books of Gregory Maguire, who writes stories that take as background famous children's stories (in this case - the Wizard of Oz). Actually this is the third one behind the scenes of the Wizard, now taking the Lion as its main character. I have another book by the same author still waiting...

We also submitted the draft of chapter 3 of the "Event Processing in Action" book to the publisher, which will hopefully be posted on the MEAP site soon.

The approach we have taken in the book, as I have written before, is the "building block" approach: describing the event processing principles, and the use case whose construction demonstrates their application, using building blocks, which are like the chemical elements. The application itself is built by using "definition elements", which are like atoms (my partner in writing this book, Peter Niblett, came up with the analogy from the world of chemistry). We believe that this is the right approach to teach what event processing is -- in the "deep dive" part of the book we dedicate a chapter to each of the seven major building blocks and then dive deeper into the types of event processing agents (which deserve a separate discussion). We'll also provide samples of how each building block is realized in different models.

The seven building blocks are:
  • Event type: defines the event schema
  • Event producer: the projection of the event producer over the event processing network (note that the event producer itself is outside the scope)
  • Event consumer: same -- the projection of the event consumer over the EPN.
  • Event channel: the glue that holds the EPN together
  • Event processing agent: the brain that does the entire work; each agent is doing a specific task of processing.
  • Context: the semantic partition of events and agents
  • Event derivation: A building block that is possibly part of each EPA that specifies the derived event.

There are some more building blocks that are used to support these ones, but our claim is that this set of building blocks is what is needed to build an event processing application.
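To make the list a bit more concrete, here is a minimal sketch of how the seven building blocks might look as plain Python data structures. This is my own illustration, not the notation used in the book: all class names, field names and the tiny bid example are assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EventType:            # defines the event schema
    name: str
    attributes: tuple


@dataclass
class Producer:             # projection of a producer onto the EPN
    name: str
    produces: EventType


@dataclass
class Consumer:             # projection of a consumer onto the EPN
    name: str
    consumes: EventType


@dataclass
class Context:              # semantic partition of events
    name: str
    partition_by: str


@dataclass
class Derivation:           # specifies the derived event
    output: EventType
    derive: Callable[[list], dict]


@dataclass
class Agent:                # performs one specific processing task
    name: str
    context: Context
    derivation: Derivation

    def process(self, events):
        # Group events by the context's partition attribute, then derive
        # one output event per group.
        groups = {}
        for event in events:
            groups.setdefault(event[self.context.partition_by], []).append(event)
        return [self.derivation.derive(group) for group in groups.values()]


@dataclass
class Channel:              # glue between EPN nodes
    source: object
    targets: list


bid = EventType("Bid", ("store", "driver", "amount"))
best_bid = EventType("BestBid", ("store", "driver", "amount"))
pick_best = Agent(
    "pick_best_bid",
    Context("per_store", partition_by="store"),
    Derivation(best_bid, lambda group: max(group, key=lambda e: e["amount"])),
)
epn = [
    Channel(Producer("drivers", bid), [pick_best]),
    Channel(pick_best, [Consumer("stores", best_bid)]),
]
```

Calling `pick_best.process(events)` on a list of bid dictionaries returns one best bid per store; the channels here only record who is connected to whom.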

Chapter 4, which is in the advanced phases of being written, starts the deep dive by discussing the event type building block, and in one of the next posts I'll say more about it.

Saturday, April 11, 2009

Some footnotes to the forthcoming book "Event Processing in Action" - Take One

Last night, I went to see a movie (a rare event for me) -- and chose to see "Slumdog Millionaire". My daughter told me later that people who have not read the book enjoyed it more. The movie is OK, even cute; however, for a movie that won 8 Academy Awards, I had somewhat bigger expectations (comparing, for example, to "Gone with the Wind", which also had 8 Academy Awards). Well -- the movie industry is probably not peaking these days...


Today, together with (most of) my tribe, we did some hiking in a place called "Judge River" -- well, a river in local terms, with a modest amount of water, but with a bridge, a lot of trees, some flowers, and, since it is a holiday, a lot of people.

Now back home, and like any Sunday morning I plan to go to one of the coffee shops (I am rotating between the coffee shops in Haifa, well, to be exact, those that have free parking nearby) to work on revisions to the draft of chapter four of the "Event Processing in Action" book.



From time to time I'll blog some footnotes from behind the scenes of the book-being-written. Today I'll blog about several issues: scope, language and exercises.

Scope: The idea is to focus on teaching the event processing concepts step-by-step, using a use case which will accompany the book throughout, so the question is -- what is the scope of event processing? We define this scope by defining the "event processing network", and thus the question, which I started discussing in my previous posting, is whether pre-processing and post-processing to the event processing network are part of the event processing network. While we have a chapter that is dedicated to event producers (and pre-processing) and another chapter that is dedicated to event consumers (and post-processing), the scope of what we discuss as part of the specification of the event processing part does not include what is done by the producers and consumers, whose projection on the EPN is the events they produce and consume. However, there is a case in which a consumer is also a producer, and this is important since there is a possible causality relationship between the event it consumes and the event it produces. As an example: the use case is about "fast flower delivery", and one of its functions is choosing the driver that will get the delivery from among the drivers that have issued a bid. Some of the stores prefer automatic assignment by the system, and some want to get the bids and do the assignment on their own. The automatic assignment is definitely an EPA (Event Processing Agent), since this is software that performs some operation on events. The manual assignment, however, is either really manual or done by some external software that the store uses; either way it is not really part of the EPN, and thus it is not modelled by the system. We are of course interested in tracing the assignment back to the bid which is the input to the store. This is also a good example to show that the same event type can include both raw events (the manual assignments are raw events from the EPN's point of view) and derived events (the automatic assignments).
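A small sketch may help to show how one event type can carry both raw and derived events while still tracing back to the bid. Everything here is assumed for illustration only: the `Assignment` fields, the `ranking` attribute of a bid and the two function names are mine, not the book's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assignment:
    delivery_id: str
    driver: str
    bid_id: str    # the bid this assignment traces back to
    derived: bool  # True if produced by an agent inside the EPN


def automatic_assignment(delivery_id, bids):
    """An EPA inside the EPN: derive the assignment from the bids."""
    best = max(bids, key=lambda b: b["ranking"])
    return Assignment(delivery_id, best["driver"], best["bid_id"], derived=True)


def manual_assignment(delivery_id, chosen_bid):
    """The store decided outside the EPN; the event enters as a raw event."""
    return Assignment(
        delivery_id, chosen_bid["driver"], chosen_bid["bid_id"], derived=False
    )
```

In both cases the `bid_id` keeps the link between the assignment and the bid that the store received, which is what the causality tracing needs.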

Language: We decided neither to use any single language to explain the concepts, nor to invent a new language. However, we believe that just a theoretical discussion will not be enough. What we have decided to do is to take a "building block" approach, in which the different parts of the system (event types, event processing agents etc.) are specified using "definition elements", which are platform-independent concepts, or in other words, a meta-language. In each section we'll provide the full part of the application using this meta-language; in order to connect it to the "ground", we'll also give samples of these definitions using a variety of languages in various styles. Thus, chapter 4, which I am writing now, talks about defining the event schema. We define the schema using our "event type" building block, and will also show definitions in various schema languages (XML, positional relational-schema-like etc.); the same will go for all types of event processing agents. We intend to ask owners of existing languages (from those who will agree to get their languages analyzed by the EPTS event processing languages analysis -- taking on another hat) to provide a definition of our use case in their language, and will check the possibility of posting them all.

Last but not least are the exercises. Since one of the targets of the book is to be a textbook for an academic course on event processing, we have decided to put exercises at the end of each chapter for the benefit of the students and instructors (we also plan to provide slides in the future). One of the questions we agreed with the publisher to ask the reviewers (there is a formal review for each 1/3 of the book) is whether this is the right way, or whether it can make other readers uncomfortable. The options now are: leave as is (exercises at the end of each chapter), move all the exercises to an appendix, or remove them completely from the book and make them available on a website.

That's all for now -- more footnotes - later.