Much of the interest in the kappa architecture has to do with code duplication: the code that transforms data along the cold path is often duplicated on the hot path. If you look back at the source code for brainjammer_batch.exe, in the Program.cs file in the Chapter06/Ch06Ex01 directory on GitHub, you might notice exactly that kind of duplication. The objective of the batch layer code that processes the brain waves is to find the median brain wave reading value per frequency for a given session. The Azure Stream Analytics code snippet you saw in the “Temporal Windows” section does the same thing. In the batch scenario the median is calculated in C# code that uses the Math.Round() method, while in the Azure Stream Analytics scenario it is calculated with the PERCENTILE_CONT() T-SQL function. The scenario proves the point, but as in all similar situations the decision ultimately comes down to your requirements. It is more cost effective to reduce the amount of data and the complexity of your code and architecture; however, not all data analytics solutions require a speed layer.
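To make the duplication concrete, the following sketch shows a median-per-frequency calculation expressed in T-SQL. It is an illustration only: the table and column names (READING, SESSION_ID, FREQUENCY, READING_VALUE) are hypothetical rather than the actual brainjammer schema, the Azure Stream Analytics snippet in the “Temporal Windows” section additionally scopes the calculation to a temporal window, and the C# batch code in Program.cs arrives at the same numbers through its own logic.

-- Sketch only: median reading value per frequency for each session.
-- Table and column names are hypothetical, not the brainjammer schema.
SELECT DISTINCT
       SESSION_ID,
       FREQUENCY,
       PERCENTILE_CONT(0.5)
         WITHIN GROUP (ORDER BY READING_VALUE)
         OVER (PARTITION BY SESSION_ID, FREQUENCY) AS MEDIAN_READING
FROM READING

Maintaining the equivalent of this logic twice, once in T-SQL and once in C#, is exactly the overhead the kappa architecture is intended to remove.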
Create a Stream Processing Solution
The content in this section applies the concepts and technologies discussed in the preceding design-oriented sections. The objective of the stream processing solution is to identify, in real time, the scenario in which the brain waves are being read as they are ingested. The temporal window applied is a tumbling window, and the data sent to the Azure Event Hub is in JSON format, resembling the following:
{
  "ALPHA": 4.4116,
  "BETA_H": 1.237,
  "BETA_L": 1.4998,
  "GAMMA": 0.8494,
  "THETA": 6.4356
}
Complete Exercise 7.2, where you will test that JSON file against an Azure Stream Analytics query. The test ensures that the query output meets the requirements for real-time visualization via a Power BI output, a configuration you will perform in Exercise 7.5.
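For orientation, the query you build and test in Exercise 7.2 might take roughly the following shape. This is a sketch under stated assumptions, not the exercise's actual query: the input alias brainwaves, the output alias powerBI, and the 30-second window length are placeholders, and AVG() is used only to keep the example short, whereas the chapter's median calculations rely on PERCENTILE_CONT(), as noted earlier.

-- Sketch only: aggregate each brain wave frequency over a tumbling window.
-- The aliases (brainwaves, powerBI) and the 30-second window are assumptions.
SELECT
    System.Timestamp() AS WindowEnd,
    AVG(ALPHA)  AS ALPHA,
    AVG(BETA_H) AS BETA_H,
    AVG(BETA_L) AS BETA_L,
    AVG(GAMMA)  AS GAMMA,
    AVG(THETA)  AS THETA
INTO
    powerBI
FROM
    brainwaves
GROUP BY
    TumblingWindow(second, 30)

Each window emits one row of aggregated frequency values, which the Power BI output can chart in near real time and which can then be compared against the value ranges characteristic of each scenario.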