ESB Analytics Server is the analytics distribution for WSO2 ESB, built on top of WSO2 Data Analytics Server (DAS). Analytics for ESB includes a built-in dashboard for statistics and tracing visualization of Proxy Services, APIs, Endpoints, Sequences and Mediators. Here I will discuss the architecture of the Analytics Server and how it operates behind the scenes to provide this comprehensive dashboard.
The Analytics Server can operate in three modes:
- Statistics Mode
- Tracing Mode
- Offline Mode
In all three modes, data is published from the ESB server to the Analytics Server via the data bridge. The ESB server uses the "publisher" component/feature of the data bridge, while the Analytics Server uses the "receiver" component/feature. The ESB sends one event to the Analytics Server per message flow. Each event contains information about all the components that were invoked during that message flow.
If statistics are enabled for a given Proxy/API on the ESB side, the Analytics Server operates in "Statistics Mode". If tracing and capturing of Synapse properties are also enabled on the ESB side, the Analytics Server operates in "Tracing Mode". The Analytics Server switches between these modes on the fly, depending on the configuration set on the ESB side.
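The switching rule described above can be written down as a small decision function. This is only a sketch: `Mode`, `select_mode` and the two boolean parameters are hypothetical names standing in for the ESB-side configuration, not part of any WSO2 API, and returning `None` when statistics are disabled is my assumption.

```python
from enum import Enum


class Mode(Enum):
    STATISTICS = "statistics"
    TRACING = "tracing"


def select_mode(statistics_enabled: bool, tracing_enabled: bool):
    """Return the mode implied by the ESB-side flags.

    Both parameters are hypothetical stand-ins for the Proxy/API settings.
    """
    if not statistics_enabled:
        # Assumption: tracing builds on statistics, so without statistics
        # there is no mode to report.
        return None
    if tracing_enabled:
        # Statistics plus tracing/Synapse property capture.
        return Mode.TRACING
    return Mode.STATISTICS
```

Because the choice depends only on what the ESB is configured to send, changing a flag on the ESB side is enough to change the result, which matches the "on the fly" switching described above.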
Statistics Mode
In this mode, the ESB server sends information about each mediator, for each message, to the Analytics Server. The Analytics Server calculates summary statistics from this information and stores only those summary statistics; it does not store any raw data coming from the ESB. This is a hybrid solution using both Siddhi (WSO2 CEP) and Apache Spark. This mode generates statistics in real time.
Pros: Can handle much higher throughput. Statistics are available in real time.
Cons: No tracing available, so message-level information is not available in the dashboard.
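To illustrate the idea of keeping only summary statistics and discarding the raw events, here is a minimal sketch in plain Python. The real server does this with Siddhi and Spark; the `SummaryStats` class, the event layout and the field names (`components`, `name`, `duration_ms`) are all assumptions made up for this example.

```python
from collections import defaultdict


class SummaryStats:
    """Running count/total/min/max per component; raw events are not kept."""

    def __init__(self):
        self._stats = defaultdict(
            lambda: {"count": 0, "total_ms": 0.0,
                     "min_ms": float("inf"), "max_ms": 0.0}
        )

    def consume(self, event):
        # One event describes one message flow and lists every component
        # invoked in it (hypothetical event shape).
        for component in event["components"]:
            s = self._stats[component["name"]]
            d = component["duration_ms"]
            s["count"] += 1
            s["total_ms"] += d
            s["min_ms"] = min(s["min_ms"], d)
            s["max_ms"] = max(s["max_ms"], d)
        # The event itself is dropped here; only the aggregates survive.

    def count(self, name):
        s = self._stats.get(name)
        return s["count"] if s else 0

    def average_ms(self, name):
        s = self._stats.get(name)
        return s["total_ms"] / s["count"] if s else None
```

Since the stored state grows with the number of distinct components rather than with the number of messages, this style of processing is what makes the higher throughput plausible. It is also why no individual message can be looked up afterwards.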
Tracing Mode
As in Statistics Mode, the ESB server sends information about each mediator, for each message, to the Analytics Server, and the Analytics Server calculates summary statistics from this information. Unlike the previous case, however, it stores both the statistics and the component-wise data. This enables the user to trace any message using the dashboard. More importantly, this mode also allows a user to view statistics and trace messages in real time.
Pros: Statistics and tracing information are available in real time. Message-level details are also available.
Cons: Throughput is limited. It can handle up to around 7000/n events per second, where n is the number of mediators/components in the message flow of the event sent from the ESB.
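The 7000/n figure is easy to turn into a quick estimate. The helper below is just that arithmetic; the function name and the `budget` parameter are hypothetical, and 7000 is the approximate figure quoted above, not a guaranteed limit.

```python
def max_events_per_second(components_per_flow: int, budget: float = 7000.0) -> float:
    """Approximate Tracing Mode ceiling: budget / n events per second."""
    if components_per_flow < 1:
        raise ValueError("a message flow must contain at least one component")
    return budget / components_per_flow
```

For example, a flow that passes through 7 mediators/components would top out at roughly 1000 events per second, while a flow with 20 would be limited to about 350.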
Offline Mode
This mode also gives the user statistics as well as tracing, similar to the previous "Tracing Mode". Unlike that mode, however, it operates in an offline/batch analytics fashion. More precisely, the Analytics Server stores all incoming events/data from the ESB but does not process them on the fly. Instead, a user can collect data for any period of time and then run a predefined Spark script in order to get the statistics and tracing details.
Pros: Users can trace messages, and message-level details are available. A much higher throughput can be achieved compared to "Tracing Mode".
Cons: No real-time processing.