Friday, March 23, 2012

Aleri - Complex Event Processing - Part I


Algo Trading - a CEP use case
Trackback URL
Sybase's Aleri streaming platform is one of the more popular products in the CEP market segment. It's is used in Sybase's trading platform - the RAP edition, which is widely used in capital markets to manage positions in a portfolio. Today, in the first of the multi-part series, I want to provide an overview of the Aleri platform and provide some code samples where required. In the second part, I will present the Aleri Studio, the eclipse based GUI that simplifies the task of modeling CEP workflow and monitor the Aleri server through a dashboard.





Fraud Detection - Another CEP use case. Trackback url

In my previous blog post on Complex Event Processing, I demonstrated the use of Esper, the open source CEP software and Twitter4J API to handle stream of tweets from Twitter.  A CEP product is much more thanhandling just one stream of data though. Single stream of data could be easily handled through the standard asynchronous messaging platforms and does not pose very challenging scalability or latency issues. But when it comes to consuming more than one real time stream of data and to analyzing it in real time, and when correlation between the streams of data is important, nothing beats a CEP platform. The sources feeding streaming platform could vary in speed, volume and complexity. A true enterprise class CEP should deal effectively with various real time high speed data like stock tickers and slower but voluminous offline batch uploads, with equal ease. Apart from providing standard interfaces, CEP should also provide an easier programming language to query the streaming data and to generate continuous intelligence through such features as pattern matching and  snapshot querying.

Sybase Trading Platform - the RAP edition. Trackback URL
To keep it simple and at high level, CEP can be broken down to three basic parts. The first is the mechanism to grab/consume source data. Next is the process of investigating that data, identifying events & patterns and then interacting with target systems by providing them the actionable items. The actionable events take different forms and formats depending on the application you are using the CEP for. An action item could be - selling an equity position based on calculated risk in a risk monitoring application. indicating potential fraud events in money laundering applications or alerting to a catastrophic event in a monitoring system by reading thousands of sensors in a chemical plant. There literally are thousands of scenarios where a manual and off-line inspection of data is simply not an option. After you go through the following section, you may want to try Aleri yourself. This link http://www.sybase.com/aleriform directly takes you to the Aleri download page. Evaluation copy valid for 90 days is freely available from Sybase’s official website. Good amount of documentation, an excellent tutorial and some sample code on the website should help you get started quickly.

 If you are an existing user of any CEP product, I encourage you to compare Aleri with that product and share it with the community or comment on this blog. By somewhat dated estimates, Tibco CEP  is the biggest CEP vendor in the market. I am not sure how much market share another leading product StreamBase has. There is also a webinar you can listen to on Youtube.com that explains CEP benefits in general and some key features of Streambase in specific. For new comers, this serves as an excellent introduction to CEP and a capital markets use case.

An application on Aleri CEP is built by creating a model using the Studio (the gui) or using Splash(the language) or by using the Aleri Modeling language (ML) - the final stage before it is deployed.

Following is a list of the key features of Splash.

  • Data Types - Supports standard data types and XML . Also supports ‘Typedef ‘ for user defined data types.
  • Access Control – a granular level access control enabling  access to a stream or modules (containing many streams)
  • SQL – another way of building a model.  Building an Aleri studio model could take longer due to its visual paradigm. Someone proficient with SQL should be able to do it much faster using Aleri SQL which is very similar to regular SQL we all know.
  • Joins - supported joins are Inner, Left, Right and Full
  • Filter expressions  - include Where, having, Group having
  • ML - Aleri SQL produces data model in Aleri modeling language (ML) – A proficient ML users might use only ML (in place of Aleri Studio and Aleri SQL)to build a model. 
  • The pattern matching language - includes constructs such as ‘within’ to indicate interval (sliding window), ‘from’ to indicate the stream of data and the interesting ‘fby’ that indicates a sequence (followed by)
  • User defined functions – user defined function interface provided in the splash allows you to create functions in C++ or Java and to use them within a splash expression in the model. 


Advanced pattern matching – capabilities are explained through example here. – Following three code segments and their explanations  are directly taken from Sybase's documentation on Aleri.
The first example checks to see whether a broker sends a buy order on the same stock as one of his or her customers, then inserts a buy order for the customer, and then sells that stock. It creates a “buy ahead” event when those actions have occurred in that sequence.

within 5 minutes
from
BuyStock[Symbol=sym; Shares=n1; Broker=b; Customer=c0] as Buy1,
BuyStock[Symbol=sym; Shares=n2; Broker=b; Customer=c1] as Buy2,
SellStock[Symbol=sym; Shares=n1; Broker=b; Customer=c0] as Sell
on Buy1 fby Buy2 fby Sell
{
if ((b = c0) and (b != c1)) {
output [Symbol=sym; Shares=n1; Broker=b];
}
}


This example checks for three events, one following the other, using the fby relationship. Because the same variable sym is used in three patterns, the values in the three events must be the same. Different variables might have the same value, though (e.g., n1 and n2.) It outputs an event if the Broker and Customer from the Buy1 and Sell events are the same, and the Customer from the Buy2 event is different.

The next example shows Boolean operations on events. The rule describes a possible theft condition, when there has been a product reading on a shelf (possibly through RFID), followed by a non-occurrence of a checkout on that product, followed by a reading of the product at a scanner near the door.

within 12 hours
from
ShelfReading[TagId=tag; ProductName=pname] as onShelf,
CounterReading[TagId=tag] as checkout,
ExitReading[TagId=tag; AreaId=area] as exit
on onShelf fby not(checkout) fby exit
output [TagId=t; ProductName=pname; AreaId=area];

The next example shows how to raise an alert if a user tries to log in to an account unsuccessfully three times within 5 minutes.

from
LoginAttempt[IpAddress=ip; Account=acct; Result=0] as login1,
LoginAttempt[IpAddress=ip; Account=acct; Result=0] as login2,
LoginAttempt[IpAddress=ip; Account=acct; Result=0] as login3,
LoginAttempt[IpAddress=ip; Account=acct; Result=1] as login4
on (login1 fby login2 fby login3) and not(login4)
output [Account=acct];

People wishing to break into computer systems often scan a number of TCP/IP ports for an open one, and attempt to exploit vulnerabilities in the programs listening on those ports. Here’s a rule that checks whether a single IP address has attempted connections on three ports, and whether those have been followed by the use of the “sendmail” program.

within 30 minutes
from
Connect[Source=ip; Port=22] as c1,
Connect[Source=ip; Port=23] as c2,
Connect[Source=ip; Port=25] as c3
SendMail[Source=ip] as send
on (c1 and c2 and c3) fby send
output [Source=ip];

Aleri provides many interfaces out of the box for an easy integration with source and target systems.  Through these interfaces/adapters the Aleri platform can communicate with standard relational databases, messaging frameworks like IBM MQ, sockets and file system files. Data in various formats like csv, FIX, Reuters market data, SOAP, http, SMTP is easily consumed by Aleri  through standardized interfaces.

Following are available techniques for integrating Aleri with other systems.


  • Pub/sub API is provided in Java, C++ and dot net - A standard pub/sub mechanism
  • SQL interface with SELECT, UPDATE, DELETE and INSERT statements  used through ODBC and JDBC connection.
  • Built in adapters for market data and FIX  
In the next part of this series we will look at the Aleri Studio, the gui that helps us build the CEP application the easy way.

No comments:

Post a Comment