Friday, February 5, 2016

WSO2 Data Analytics Server (DAS) - Explained Simply & Briefly


  • WSO2 DAS collects and analyzes both real-time and persisted (stored, not temporary) data, even in large volumes, and then produces (visualizes/communicates) the results.

  • It exposes a single API for external data sources to publish data to it.


    Data Collection

  • Event = Unit of data collection
  • Event Stream = Sequence of events of a single type.

  • An Event Stream must first be created in DAS to provide the structure required to process events. [The DAS management console makes this easy :-) ]
  • The required and preferred attributes for the event stream can be defined there (id, values, properties).
  • If we want to analyze data in batch mode (not in real time), we have to persist the event stream information. (If indexing is required, we can configure that too :-) )

  • After creating the path (event stream) to receive data, we have to create an Event Receiver to catch the data from different sources.

  • A second Event Stream is required to publish the processed events of the first event stream. In this stream we can define the information and analytic results that need to be output. (We may persist this stream too, so that it is also available for batch processing.)

  • An Event Publisher must be created for the event-publishing event stream (the second event stream above) to publish the processed and analyzed results to external systems.
  • The Event Simulator can be used to submit events to an event stream (the first event stream mentioned above).
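
    To make the flow above more concrete, here is a minimal Python sketch of the same idea: an input stream, a receiver, a processing step, an output stream, and a publisher. This is only a toy model and not the DAS API. The class EventStream, the stream names, the attributes (sensor_id, temperature, alert) and the threshold of 30.0 are all made up for illustration.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


@dataclass
class EventStream:
    """Toy model of an event stream: a named sequence of events of one type."""
    name: str
    attributes: Dict[str, type]            # attribute name -> expected type
    persisted: bool = False                # keep events for later batch analysis
    store: List[Event] = field(default_factory=list)
    subscribers: List[Callable[[Event], None]] = field(default_factory=list)

    def publish(self, event: Event) -> None:
        # The stream definition is the "structure": reject events that do not fit it.
        if set(event) != set(self.attributes):
            raise ValueError(f"{self.name}: unexpected attributes {sorted(event)}")
        for attr, expected in self.attributes.items():
            if not isinstance(event[attr], expected):
                raise TypeError(f"{self.name}: {attr} must be {expected.__name__}")
        if self.persisted:
            self.store.append(dict(event))
        for subscriber in self.subscribers:
            subscriber(event)


# First stream: receives raw data. Second stream: carries processed results.
input_stream = EventStream(
    "sensor_readings", {"sensor_id": str, "temperature": float}, persisted=True
)
output_stream = EventStream(
    "high_temperature_alerts",
    {"sensor_id": str, "temperature": float, "alert": str},
)


def receive(raw_json: str) -> None:
    """Event receiver: turns external data into an event on the input stream."""
    input_stream.publish(json.loads(raw_json))


def process(event: Event) -> None:
    """Processing step: forwards hot readings to the second stream."""
    if event["temperature"] > 30.0:        # hypothetical threshold
        output_stream.publish({**event, "alert": "HIGH"})


delivered: List[str] = []                  # stands in for an external system


def publish_external(event: Event) -> None:
    """Event publisher: hands processed events to the outside world."""
    delivered.append(json.dumps(event, sort_keys=True))


input_stream.subscribers.append(process)
output_stream.subscribers.append(publish_external)

# "Event simulator": submit a couple of sample events to the first stream.
for raw in ['{"sensor_id": "s1", "temperature": 25.5}',
            '{"sensor_id": "s2", "temperature": 31.0}']:
    receive(raw)
```

    After the two simulated events, both readings are kept in input_stream.store (because the stream is persisted), and only the reading from s2 reaches the delivered list.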

          Analyzing Data

  • We can configure any of the event streams in DAS for analytics (batch analytics / real-time analytics / interactive analytics).

    Batch Analytics
  • We can perform batch analytics only when the event streams are configured to be persisted.
  • The WSO2 DAS batch analytics engine is powered by Apache Spark, so batch analytics scripts written as Spark SQL queries can be used to analyze the persisted data and get results.
  • The Spark console provided in WSO2 DAS can also be used to run Spark queries and get results.
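
    As a rough illustration of what a batch query over persisted events looks like, here is a small Python sketch. Please note that it uses Python's built-in sqlite3 module as a stand-in for Spark SQL, only so that it runs anywhere. The table name sensor_readings and its columns are assumptions made for this example, and the way DAS maps a persisted stream to a Spark table is not shown here.

```python
import sqlite3

# In-memory database standing in for the persisted event store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sensor_readings (sensor_id TEXT, temperature REAL)")

# Events that were "persisted" earlier (hypothetical sample data).
conn.executemany(
    "INSERT INTO sensor_readings VALUES (?, ?)",
    [("s1", 25.5), ("s2", 31.0), ("s1", 27.5), ("s2", 33.0)],
)

# Batch analysis: summarize all stored events per sensor in one pass.
rows = conn.execute(
    """
    SELECT sensor_id, AVG(temperature) AS avg_temperature, COUNT(*) AS readings
    FROM sensor_readings
    GROUP BY sensor_id
    ORDER BY sensor_id
    """
).fetchall()

# Map each sensor to (average temperature, number of readings).
batch_summary = {sensor_id: (avg, count) for sensor_id, avg, count in rows}
conn.close()
```

    The point is that the query runs over everything already stored, not over events as they arrive. That is why the stream has to be persisted first.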

    Real-time Analytics

  • Real-time analytics are done using a set of queries or rules written in the SQL-like Siddhi Query Language and defined in an Execution Plan under Streaming Analytics.
  • For this, events must be published (submitted) to DAS while the analysis is running, because this is real-time analysis and not analysis of persisted data. (The event simulator provided in DAS can be used to simulate event publishing.)
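
    The kind of rule an execution plan expresses can be sketched in plain Python as below. This is not Siddhi and not how DAS runs it internally. It is just a small model of a sliding-window rule that looks at each event as it arrives. The class name SlidingAverageAlert, the window size of 3 and the threshold of 30.0 are assumptions for the example.

```python
from collections import deque
from typing import Any, Dict, List, Optional

Event = Dict[str, Any]


class SlidingAverageAlert:
    """Emits an alert when the average of the last N readings exceeds a threshold."""

    def __init__(self, window_size: int, threshold: float) -> None:
        # deque with maxlen drops the oldest reading automatically.
        self.window: deque = deque(maxlen=window_size)
        self.threshold = threshold

    def on_event(self, event: Event) -> Optional[Event]:
        self.window.append(event["temperature"])
        average = sum(self.window) / len(self.window)
        if average > self.threshold:
            return {"sensor_id": event["sensor_id"], "avg_temperature": average}
        return None


rule = SlidingAverageAlert(window_size=3, threshold=30.0)

# Events arrive one at a time; nothing is read back from storage.
alerts: List[Event] = []
for temperature in [28.0, 30.0, 35.0, 34.0]:
    result = rule.on_event({"sensor_id": "s1", "temperature": temperature})
    if result is not None:
        alerts.append(result)
```

    Here the first two readings produce no alert, and the third and fourth push the window average above the threshold. Results appear while events are flowing, which is why something has to be publishing events at that moment.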
     
    Interactive Analytics

  • Interactive analytics are used to retrieve results quickly through ad hoc querying of received or already processed data. For this, the relevant Event Stream attributes must already be indexed. The Data Explorer under Interactive Analytics in the DAS dashboard is used for this task.
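
    Why indexing matters for ad hoc queries can be shown with a tiny Python sketch. It is only a conceptual model of an index (attribute value -> positions of matching records) and says nothing about how DAS actually implements indexing. The record fields and the helper names build_index and search are made up for this example.

```python
from collections import defaultdict
from typing import Any, Dict, List

Record = Dict[str, Any]

# Already received / processed events (hypothetical sample data).
records: List[Record] = [
    {"sensor_id": "s1", "temperature": 25.5},
    {"sensor_id": "s2", "temperature": 31.0},
    {"sensor_id": "s1", "temperature": 27.5},
]


def build_index(rows: List[Record], attribute: str) -> Dict[Any, List[int]]:
    """Map each value of the indexed attribute to the positions where it occurs."""
    index: Dict[Any, List[int]] = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[attribute]].append(position)
    return index


sensor_index = build_index(records, "sensor_id")


def search(value: Any) -> List[Record]:
    """Ad hoc lookup: jump straight to matching records without scanning them all."""
    return [records[position] for position in sensor_index.get(value, [])]


s1_readings = search("s1")
unknown_readings = search("s9")
```

    Only the indexed attribute (sensor_id here) can be looked up this way, which mirrors the requirement that the attributes we want to search on must be indexed beforehand.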

         Visualizing (communicating) Data

  • WSO2 DAS provides a very easy-to-use, customizable dashboard mechanism to visualize the data and the results of the analytics processes, via the Analytics Dashboard in the Dashboard menu.
  • Various types of gadgets (bar charts / line charts / arc charts / map charts) can be added to the dashboard to represent data for any preferred event stream.

    The following diagram illustrates an overview of the event flow for an example scenario in WSO2 DAS, as we just talked about. (Please note that the names of the event receivers, streams, etc. are chosen just for this example scenario.)



    So this is the overall mechanism in WSO2 DAS in brief, and anybody interested is welcome to try it out. ;-)
    http://wso2.com/products/data-analytics-server/