Tuesday, October 20, 2020

Azure Log Analytics - Kusto Query - Get data for last 20 days

In continuation of my previous post on getting the Min/Max within each category, this time let's tackle one of the most requested queries: filtering on a datetime field.

To keep this article simple and focused, I have removed the data from all the additional columns, as shown below:

GenerationDate | DescriptionTitle | DescriptionDetail | FeedKey
2020-10-02 00:00:00.0000000 | | |
2020-10-21 00:00:00.0000000 | | |
2020-10-21 00:00:00.0000000 | | |
2020-10-21 00:00:00.0000000 | | |
2020-10-21 00:00:00.0000000 | | |
2020-10-22 00:00:00.0000000 | | |
2020-10-22 00:00:00.0000000 | | |

Query description

The idea is to fetch all records whose GenerationDate falls within the past 20 days.

Approaches

There is more than one way to achieve the expected result. One way could be:

Approach 1

Find the date that falls exactly twenty days back using ago(...) and then apply comparison operators (<= and >=) to filter the records.

This approach works, but it requires more lines of code and extra calculation.

Approach 2

The other approach, requiring far fewer lines of code, uses between(...) as shown below:

DemoData
| where (now() - todatetime(GenerationDate)) between (0d .. 20d);

NOTE: Make sure to do the proper datetime cast, or you may end up getting the error 'Arithmetic expression cannot be carried-out between StringBuffer and DateTime'.
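For readers outside KQL, the same "past 20 days" check can be mirrored in a minimal Python sketch. The records and the `last_n_days` helper below are hypothetical illustrations, with GenerationDate kept as a string to echo the cast issue in the note above.

```python
from datetime import datetime, timedelta

# Hypothetical stand-in for the DemoData table; GenerationDate is stored as a
# string, mirroring the cast issue mentioned in the note above.
demo_data = [
    {"GenerationDate": "2020-10-02"},
    {"GenerationDate": "2020-10-21"},
]

def last_n_days(records, now, n=20):
    """Keep records whose GenerationDate falls within the past n days of now."""
    kept = []
    for record in records:
        # Parse the string first -- the equivalent of todatetime() in KQL.
        generated = datetime.strptime(record["GenerationDate"], "%Y-%m-%d")
        if timedelta(0) <= now - generated <= timedelta(days=n):
            kept.append(record)
    return kept

filtered = last_n_days(demo_data, now=datetime(2020, 10, 20))
print([r["GenerationDate"] for r in filtered])  # ['2020-10-02']
```

A date in the future (like 2020-10-21 relative to the chosen "now") fails the lower bound, just as a negative timespan falls outside between(0d .. 20d).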

    Happy kustoing!

    Friday, October 16, 2020

    Tips for Effective Code Reviews

In the field of software development, code review plays a vital role. It not only enhances code quality but also identifies issues before the code reaches the testers' hands.

Throughout my development experience, I have come across many great guidelines that contribute to an effective code review.

So, as part of this article, I'm going to list down all those points.

    Naming Conventions, Data types, Sizes and Comments

• Variable names should be short but self-explanatory. Follow the standards and casing conventions of the language and platform.
• Prefer naming fields with an adjective or a noun.
• Prefer PascalCasing for resource keys.
• Prefer naming events with a verb. It's good to think about tense while naming events, e.g. rendered, rendering.
• Keep method size as small as possible. I personally prefer a method body of <= 12 lines, so that the complete method can be seen without scrolling the screen.
• All parameters should pass a validation check before being passed into any method.
• Keep class size as small as possible. If that is not achievable due to business requirements and design, the preferred way is to go for partial classes.
• Comments should be present at the class, interface, and method level. If the logic inside a method is complex and not self-explanatory, it's good to add a few lines of comments inside the method as well.
• Proper indentation plays a very big role in the readability of any source code, so never ignore it.
• While working with expression trees, prefer anonymous types over tuples, as you may need to rename properties.
• Avoid boxing and unboxing as much as possible; it's always good to go with concrete data types.
• Prefer StringBuilder over the String class whenever the situation demands heavy concatenation.
• Prefer lazy loading and initialization, as eager initialization increases the memory footprint; initialize only when really required.
• Avoid extending ValueType.
• Prefer enums over static constants.
• Avoid using an enum to store a single value.
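The StringBuilder advice above has a counterpart in most languages. As a rough Python analogy (a sketch, not the C# API itself), collect the parts and join once instead of concatenating in a loop:

```python
# Building a large string with repeated '+' allocates a new string each time;
# collecting the parts and joining once is the Python analogue of using a
# StringBuilder for heavy concatenation.
parts = []
for i in range(5):
    parts.append(f"item-{i}")

combined = ",".join(parts)  # one allocation for the final string
print(combined)  # item-0,item-1,item-2,item-3,item-4
```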

    Object Construction

• Never define a public constructor in an abstract class; a public constructor is useful only if an instance has to be created.
• Avoid doing much work inside a constructor, apart from initializations.
• Avoid static classes unless there is a specific need.

    Object Destruction

• Make sure memory allocation and deallocation are taken care of.
• Avoid using a Finalize method for managed objects, as they are released automatically by the garbage collector. The recommended approach is to implement IDisposable.

    Termination/Exit Criteria and Exception Handling

• There should be a single exit point in any function; excessive use of return statements in a function body should be strictly avoided.
• Always add a default case to a switch statement.
• Always add an else when using if-elseif.
• Anyone can call your code without asking you, so always handle NULL checks.
• Prefer the using statement to automatically clean up resources that implement IDisposable.
• Use a finally block to clean up resources that don't implement IDisposable.
• Handle exceptions properly, catching specific exception types first and the more generic types last.
• Throw exceptions only if you do not want an error to go unnoticed; in other words, it is better to avoid throwing exceptions for simple scenarios.
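The cleanup guidance above (using for IDisposable, finally otherwise) maps naturally onto Python's context-manager protocol. Below is a minimal sketch with a hypothetical Resource class; the names are illustrative only:

```python
class Resource:
    """Hypothetical resource that tracks whether it was cleaned up."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    # Supporting the context-manager protocol is Python's counterpart of
    # implementing IDisposable and wrapping usage in a 'using' statement.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never swallow exceptions during cleanup

res = Resource()
try:
    with res:
        raise ValueError("simulated failure")
except ValueError:
    pass  # handle the specific type; generic handlers belong last

print(res.closed)  # True -- cleanup ran despite the exception
```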

    Serialization

• All classes that need to be serialized must be marked with the Serializable attribute. Once a class is marked as serializable, all its public members are serialized automatically. For any private member to be serialized, it has to be explicitly decorated with the SerializeField attribute.
• Serialize only the required members rather than the complete class.
• While working with serialization, map every property explicitly to a serialized name; that way the code won't break if anyone renames a property in the future.
• Prefer a class or a struct over anonymous types or tuple types for serialization.
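The explicit serialized-name advice can be sketched as follows. This is a hypothetical Python example (the Feed class and its field map are illustrative, assuming JSON as the wire format): each attribute maps to a fixed wire name, so renaming an attribute later does not silently change the payload.

```python
import json

class Feed:
    # Explicit attribute -> serialized-name map; renaming an attribute later
    # only requires updating this table, not every consumer of the payload.
    SERIALIZED_NAMES = {"feed_key": "FeedKey", "title": "DescriptionTitle"}

    def __init__(self, feed_key, title):
        self.feed_key = feed_key
        self.title = title

    def to_json(self):
        # Serialize only the mapped members, not the whole object.
        return json.dumps({wire: getattr(self, attr)
                           for attr, wire in self.SERIALIZED_NAMES.items()})

payload = Feed("acbf-uhef-4t5i-dfff", "Schedule Task").to_json()
print(payload)  # {"FeedKey": "acbf-uhef-4t5i-dfff", "DescriptionTitle": "Schedule Task"}
```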

    Threading

• Avoid using Thread.Abort for thread termination.
• Avoid using Thread.Suspend and Thread.Resume to synchronize threads; a better way to achieve similar behavior is to use Mutex and Monitor.
• Avoid unnecessary synchronization, as it decreases performance and increases the likelihood of deadlocks.
• Prefer the Interlocked class over the lock statement for simple atomic operations.

    Code Structuring

• Implement proper design patterns wherever applicable.
• Prefer parallelism where it fits.
• Make an informed decision between an abstract class and an interface.
• Do not add members to an already released/shipped interface, as it will create versioning problems.
• Choose the inheritance hierarchy very carefully, as badly implemented inheritance can be very dangerous. I personally avoid inheritance as much as I can, especially beyond 2 levels.

    Reflection and Caching

• Try to avoid reflection as much as possible.
• Analyze caching scenarios very carefully, as not all scenarios are suitable for caching.

    Database Practices

• Prefer stored procedures over inline SQL queries.
• Be wise when making indexing decisions. Some tools (e.g. Database Tuning Advisor) also provide indexing suggestions.

    Code Analysis and Coverage

• Code should be free of errors and warnings.
• Using a code analysis tool can help identify many common mistakes. I prefer the one that comes with Visual Studio.
• Keep an eye on code metrics, e.g. cyclomatic complexity, depth of inheritance, lines of code (LOC), etc. If any of these go beyond certain levels, refactoring should be taken up.
• Good code coverage tells how maintainable the code is, so always prefer writing unit test cases, which also help in identifying unused and unreachable code blocks.

Last but not least, there are many well-known coding principles which one can follow:

• DRY – Don't Repeat Yourself
• KISS – Keep It Simple, Stupid
• Hollywood Principle – Don't call us, we'll call you
• SOLID – the five object-oriented design principles (single responsibility, open/closed, Liskov substitution, interface segregation, dependency inversion)
• YAGNI – You Aren't Gonna Need It

Proper code review helps in slashing the maintenance cost of software, and it should be part of day-to-day activities.

    Hope you enjoyed this checklist.

    Back to You

This post is just a start, and there is a lot more that can be added.

Please share your thoughts and comments, so that I can make this list more extensive.

    Tuesday, October 13, 2020

    Azure Log Analytics - Kusto Query - Get Min/Max Within Each Category Filter

In continuation of my previous post on getting the categorical count, this time let's get our hands dirty with one more query involving filter criteria on a datetime field.

    Below is the sample data on which we are going to query:

GenerationDate | IngestionTime | DescriptionTitle | DescriptionDetail | FeedKey
2020-05-21 00:00:00.0000000 | 2020-05-25 02:00:00.0000000 | Schedule Task | Read feed from server 1 | acbf-uhef-4t5i-dfff
2020-05-21 00:00:00.0000000 | 2020-05-25 03:00:00.3000000 | Schedule Task | Read feed from server 1 | acbf-uhef-4t5i-dfff
2020-05-21 00:00:00.0000000 | 2020-05-25 03:00:00.3500000 | Schedule Task | Read feed from server 1 | acbf-uhef-4t5i-dfff
2020-05-21 00:00:00.0000000 | 2020-05-25 03:00:00.3000000 | Monitoring Task | Monitoring failed for LOC | lcbf-u78f-4p5i-dfff
2020-05-21 00:00:00.0000000 | 2020-05-26 02:00:00.0000000 | Schedule Task | Data missing for palto | acbf-uhef-4t5i-dfff
2020-05-22 00:00:00.0000000 | 2020-05-26 00:09:00.0000000 | Schedule Task | Read feed from server 1 | acbf-uhef-4t5i-dfff
2020-05-22 00:00:00.0000000 | 2020-05-27 00:04:00.0000000 | Failover Handling | Disk fault occurred in region R | acbf-uhef-4t5i-dfff

    Query description:

For each unique combination of FeedKey and Description, find the maximum and minimum IngestionTime.

    Kusto query:

let fact = DemoData
| where GenerationDate == datetime(2020-05-21)
| summarize dcount(FeedKey) by DescriptionTitle, DescriptionDetail, FeedKey, GenerationDate;
let minIngestionTimes = fact
| join kind=leftouter DemoData on FeedKey, DescriptionTitle, DescriptionDetail, GenerationDate
| project FeedKey, DescriptionTitle, DescriptionDetail, GenerationDate, IngestionTime
| summarize MinIngestTime = arg_min(IngestionTime, *) by FeedKey, DescriptionTitle, DescriptionDetail;
let maxIngestionTimes = fact
| join kind=leftouter DemoData on FeedKey, DescriptionTitle, DescriptionDetail, GenerationDate
| project FeedKey, DescriptionTitle, DescriptionDetail, GenerationDate, IngestionTime
| summarize MaxIngestTime = arg_max(IngestionTime, *) by FeedKey, DescriptionTitle, DescriptionDetail;
minIngestionTimes
| join kind=innerunique maxIngestionTimes on FeedKey, DescriptionTitle, DescriptionDetail
| extend Description = strcat(DescriptionTitle, " : ", DescriptionDetail)
| project FeedKey, Description, MinIngestTime, MaxIngestTime, GenerationDate
| sort by FeedKey

    Expected output

FeedKey | Description | MinIngestTime | MaxIngestTime | GenerationDate
acbf-uhef-4t5i-dfff | Schedule Task : Read feed from server 1 | 2020-05-25 02:00:00.0000000 | 2020-05-25 03:00:00.3500000 | 2020-05-21 00:00:00.0000000
lcbf-u78f-4p5i-dfff | Monitoring Task : Monitoring failed for LOC | 2020-05-25 03:00:00.3000000 | 2020-05-25 03:00:00.3000000 | 2020-05-21 00:00:00.0000000
acbf-uhef-4t5i-dfff | Schedule Task : Data missing for palto | 2020-05-26 02:00:00.0000000 | 2020-05-26 02:00:00.0000000 | 2020-05-21 00:00:00.0000000
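For readers more comfortable outside KQL, the grouping logic of the query can be sketched in Python. The rows below are a hypothetical subset of the sample table, and min/max over the timestamp strings stands in for arg_min/arg_max:

```python
from collections import defaultdict

# Hypothetical rows mirroring a few entries of the sample table:
# (FeedKey, DescriptionTitle, DescriptionDetail, IngestionTime)
rows = [
    ("acbf-uhef-4t5i-dfff", "Schedule Task", "Read feed from server 1", "2020-05-25 02:00:00.0000000"),
    ("acbf-uhef-4t5i-dfff", "Schedule Task", "Read feed from server 1", "2020-05-25 03:00:00.3500000"),
    ("lcbf-u78f-4p5i-dfff", "Monitoring Task", "Monitoring failed for LOC", "2020-05-25 03:00:00.3000000"),
]

# Group ingestion times per (FeedKey, Description) -- the KQL 'by' clause --
# then take min and max, mirroring arg_min/arg_max. The fixed-width timestamp
# format makes lexicographic min/max agree with chronological order.
times = defaultdict(list)
for feed_key, title, detail, ingestion in rows:
    times[(feed_key, f"{title} : {detail}")].append(ingestion)

summary = {key: (min(ts), max(ts)) for key, ts in times.items()}
for (feed_key, description), (first, last) in sorted(summary.items()):
    print(feed_key, description, first, last)
```

A group with a single row (like the Monitoring Task entry) gets the same value for both min and max, matching the expected output above.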

     Happy kustoing!

    Friday, October 9, 2020

Azure Log Analytics - Kusto Query - Get Categorical Count

It's been a while since I started working on the data analysis side. When it comes to data analysis, it's all about how efficiently one can filter and fetch a small set of useful data from a humongous collection.

I used Kusto Query Language (KQL) to write advanced queries for Azure Log Analytics. When you first start writing queries, it can be quite daunting; keeping that in mind, I thought I should share a few of those queries, which could save beginners a lot of time.

Hence, my next few posts will mostly be about how to achieve an expected output using KQL. So, let's get started with a simple scenario first.

    Below is the sample data on which we are going to query:

GenerationDate | IngestionTime | DescriptionTitle | DescriptionDetail | FeedKey
2020-05-21 00:00:00.0000000 | 2020-05-25 02:00:00.0000000 | Schedule Task | Read feed from server 1 | acbf-uhef-4t5i-dfff
2020-05-21 00:00:00.0000000 | 2020-05-25 03:00:00.3000000 | Schedule Task | Read feed from server 1 | acbf-uhef-4t5i-dfff
2020-05-21 00:00:00.0000000 | 2020-05-25 03:00:00.3000000 | Monitoring Task | Monitoring failed for LOC | lcbf-u78f-4p5i-dfff
2020-05-22 00:00:00.0000000 | 2020-05-26 02:00:00.0000000 | Schedule Task | Data missing for palto | acbf-uhef-4t5i-dfff
2020-05-22 00:00:00.0000000 | 2020-05-26 00:09:00.0000000 | Schedule Task | Read feed from server 1 | acbf-uhef-4t5i-dfff
2020-05-22 00:00:00.0000000 | 2020-05-27 00:04:00.0000000 | Failover Handling | Disk fault occurred in region R | acbf-uhef-4t5i-dfff

    Query description:

Get the count of distinct descriptions for each FeedKey.

    Kusto: 


DemoData
| where GenerationDate >= datetime(2020-05-20) and GenerationDate <= datetime(2020-05-23)
| extend Descriptions = strcat(DescriptionTitle, " : ", DescriptionDetail)
| summarize dcount(FeedKey) by Descriptions, FeedKey
| summarize DescriptionCount = count() by FeedKey
| sort by DescriptionCount desc;

    Expected output:


FeedKey | DescriptionCount
acbf-uhef-4t5i-dfff | 3
lcbf-u78f-4p5i-dfff | 1
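The two-step summarize above (dedupe the descriptions, then count them) can be sketched in Python with a set per FeedKey. The rows below are a hypothetical subset of the sample data:

```python
from collections import defaultdict

# Hypothetical rows: (FeedKey, DescriptionTitle, DescriptionDetail)
rows = [
    ("acbf-uhef-4t5i-dfff", "Schedule Task", "Read feed from server 1"),
    ("acbf-uhef-4t5i-dfff", "Schedule Task", "Read feed from server 1"),
    ("acbf-uhef-4t5i-dfff", "Schedule Task", "Data missing for palto"),
    ("acbf-uhef-4t5i-dfff", "Failover Handling", "Disk fault occurred in region R"),
    ("lcbf-u78f-4p5i-dfff", "Monitoring Task", "Monitoring failed for LOC"),
]

# Collect distinct descriptions per FeedKey (the first summarize), then count
# them (the second summarize) and sort by the count, descending.
descriptions = defaultdict(set)
for feed_key, title, detail in rows:
    descriptions[feed_key].add(f"{title} : {detail}")

counts = {k: len(v) for k, v in descriptions.items()}
print(sorted(counts.items(), key=lambda kv: -kv[1]))
# [('acbf-uhef-4t5i-dfff', 3), ('lcbf-u78f-4p5i-dfff', 1)]
```

The duplicate "Read feed from server 1" row collapses in the set, so acbf-uhef-4t5i-dfff ends up with three distinct descriptions, matching the expected output above.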

     Happy kustoing!