Friday, December 11, 2020

Azure Data Explorer - Approaches For Data Aggregation In Kusto

In my previous posts I tried to transcribe the things that were not too obvious for me when I initially started working on Kusto Query Language. Continuing with the same thought, this time I’m going to share a few of the approaches that can be taken to aggregate the data. 

Let’s consider the below input data:


  1. let demoData = datatable(Environment: string, Version:int, BugCount:int)  
  2. [  
  3. "dev",1, 1,  
  4. "test",1, 1,  
  5. "prod",1, 1,  
  6. "dev",2, 2,  
  7. "test",2, 0,  
  8. "dev",3, 2,  
  9. "test",3, 0,  
  10. "prod",2,2,  
  11. ];

Description

Get the average number of bugs falling under each category. 

Expected Output


 





There are several approaches to achieve this. 


Approach 1 - Using Partition Operator 

Partition operator first partitions the input data with defined criteria and then combines all the results.

  1. demoData| partition by Environment (summarize ceiling(avg(BugCount)) by Environment);  
Approach 2 - Using Join Operator 

Join merges the two tables based on the specified key.


  • demoData| join kind=leftouter (  
  • demoData | summarize ceiling(avg(BugCount)) by Environment) on Environment  
  • | project Environment, avg_BugCount  
  • | distinct Environment,avg_BugCount;  
    Approach 3 - Using Lookup Operator 

    Lookup operator extends the column of the second table and looks up the values in the first one.


    1. let Averages = demoData  
    2. | summarize ceiling(avg(BugCount)) by Environment;  
    3. demoData | lookup (Averages) on Environment  
    4. | project Environment, avg_BugCount  
    5. | distinct Environment,avg_BugCount  

    I hope you enjoyed aggregating data. 

    Happy Kustoing!