Azure Stream Analytics
Today I want to show you how Azure Stream Analytics works: a service that analyses and routes streams of events. To do this we need an ingestion service, Event Hub, which provides data to the Stream Analytics job. We also need example data; in this case a Python script generates an event and sends it to Event Hub every 10 seconds.
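A generator along these lines could produce the sample events. This is a minimal sketch: the field names (`DeviceId`, `Continent`, `Value01`, `Value02`, `EventTime`) are illustrative, and sending assumes the `azure-eventhub` package plus your own connection string.

```python
import json
import random
import time
from datetime import datetime, timezone

def make_event():
    """Build one sample event; the field names are illustrative."""
    return {
        "DeviceId": random.randint(1, 5),
        "Continent": random.choice(["Europe", "Asia", "America"]),
        "Value01": round(random.uniform(0, 100), 2),
        "Value02": round(random.uniform(0, 100), 2),
        "EventTime": datetime.now(timezone.utc).isoformat(),
    }

def send_events(conn_str, hub_name, interval_s=10):
    """Send one event to Event Hub every `interval_s` seconds."""
    # Requires the azure-eventhub package: pip install azure-eventhub
    from azure.eventhub import EventData, EventHubProducerClient
    producer = EventHubProducerClient.from_connection_string(
        conn_str=conn_str, eventhub_name=hub_name
    )
    with producer:
        while True:
            batch = producer.create_batch()
            batch.add(EventData(json.dumps(make_event())))
            producer.send_batch(batch)
            time.sleep(interval_s)
```

Run it with `send_events("<connection string>", "<event hub name>")` and leave it running while you build the job.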
Here is an example of the created Event Hub.
As we can see, events are being sent to Event Hub.
Now it is time for Stream Analytics.
After the Stream Analytics job is created, we can focus on three fields: Input, Output and Query.
In this case we have two inputs: Event Hub and SQL Database. The SQL Database holds reference data, which is joined to the data coming from Event Hub.
We have two outputs, SQL Database and Blob Storage, where the data is stored.
Query is the place where the magic happens. On the left side you can preview the data in every stream (inputs and outputs), which is shown in the bottom view.
Your input and output streams are joined together with SQL.
In the case above we use all of the inputs and outputs: we ingest data from Event Hub and the SQL Database, and store the results in the SQL Database and ADLS2.
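A query of this shape would wire all four together. This is a sketch in the Stream Analytics query language: the input, output and column names (and the join key `[Continent]`) are placeholders for your own aliases, and a reference-data join needs no time bound, unlike a stream-to-stream join.

```sql
-- Enrich the event stream with reference data and write to SQL Database
SELECT
    e.[Continent],
    e.[Value01],
    e.[Value02],
    r.[Region]                      -- hypothetical reference column
INTO [sqldb-output]
FROM [eventhub-input] e
JOIN [sqldb-reference] r
    ON e.[Continent] = r.[Continent]

-- Write the same enriched stream to Blob Storage (ADLS2) as well
SELECT
    e.[Continent],
    e.[Value01],
    e.[Value02],
    r.[Region]
INTO [blob-output]
FROM [eventhub-input] e
JOIN [sqldb-reference] r
    ON e.[Continent] = r.[Continent]
```

One job can contain several `SELECT … INTO` statements, which is how a single query feeds both outputs.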
Here is the data stored as JSON in the Data Lake:
And in the SQL Database:
If we want to store events aggregated by time window, we can use one of the built-in windowing functions.
We have five types of windows: tumbling, hopping, sliding, session and snapshot.
The best explanation is provided by Microsoft:
We will use one of them as an example: the hopping window.
We recreate the previous query to make it more concise and implement a hopping window.
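The query might look like this. It is a sketch: the input and output aliases and the `EventTime` timestamp column are assumptions, while the window itself (hop every 10 seconds over the last 20 seconds, grouped by `[Continent]`) follows the description below.

```sql
SELECT
    System.Timestamp() AS [WindowEnd],   -- end time of each window
    [Continent],
    SUM([Value02])     AS [SumValue02],
    COUNT(*)           AS [EventCount]
INTO [adls-output]
FROM [eventhub-input] TIMESTAMP BY [EventTime]
GROUP BY
    [Continent],
    HoppingWindow(second, 20, 10)        -- 20 s window, hopping every 10 s
```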
In the example above we calculate the sum of “Value02” and the count of events. Every 10 seconds, the events from the last 20 seconds are aggregated, grouped by the [Continent] column. “System.Timestamp()” returns the end time of the window.
As you can see, the events are aggregated: