Showing posts with label YouTube. Show all posts

Tuesday, July 6, 2021

Extract Text from Sales Receipt using Pre-Built Model - Azure Form Recognizer

Now that almost everything is moving to online and virtual modes, a very common problem organizations face is processing receipts that are scanned and submitted electronically for reimbursement.

For any claim or reimbursement to be cleared, it must first reach the proper accounts department, depending on the organization and the sector. One way to do this is manually: a person or a team goes through all the digitally scanned receipts and filters them by department or by any other validation and eligibility criteria they may have.

The situation gets worse when the volume of scanned receipts is high. So, to get rid of this manual effort, many organizations have already opted for an AI-based solution, and many more are in the process of doing so.

Certainly, one can use OCR (Optical Character Recognition) technologies to extract the data, but the problem here is not only data extraction; it is also data interpretation. A user could upload the wrong document altogether, one that is not a receipt, so the solution should be robust enough to filter out such cases.

How can AI-based solutions be achieved?

As with many other scenarios, Azure has a service for this one too: Form Recognizer, which provides intelligent processing capabilities and allows us to automate the processing of forms and receipts. Basically, it is a combination of OCR and predictive models, and it falls under the umbrella of Azure Cognitive Services.

Here, OCR handles text extraction, and the models help us pick out the useful information, such as invoice date, address, amount, description, name, or any other relevant field the business demands.

Which models are supported by Form Recognizer?

Form Recognizer supports two types of models: pre-built and custom.

  • Pre-built models – Provided out of the box and already trained on basic sales data in the USA sales format.
  • Custom models – Can be trained on our own data and tailored to our business needs.

So, in this article, I’ll be focusing on the pre-built models and will cover custom model integration as part of another article.

How to get started with Form Recognizer?

The very first thing we need to do is log in to the Azure portal at portal.azure.com and create an Azure resource. There are two ways to create one:

  • Using Azure Form Recognizer
  • Using Azure Cognitive Services

If you plan to use other services under Cognitive Services as well, an existing or new Cognitive Services resource can be used. If you only need the Form Recognizer service, a dedicated Form Recognizer resource works too.

Implementation Details

For development, I'm using Python as the language and Visual Studio Code with Jupyter Notebook. Here is the core implementation:

from azure.ai.formrecognizer import FormRecognizerClient
from azure.core.credentials import AzureKeyCredential

key = "KEY_TO_BE_REPLACED"
endPoint = "ENDPOINT_TO_BE_REPLACED"

client = FormRecognizerClient(endpoint=endPoint, credential=AzureKeyCredential(key))
image = "IMAGE_FILE_PATH"

# Open the receipt in binary mode; the file is closed once the analysis completes
with open(image, "rb") as fd:
    analyzeReceipt = client.begin_recognize_receipts(receipt=fd)
    result = analyzeReceipt.result()

# The first element holds the fields recognized on the receipt
receipt = result[0]

# Note: .get() returns None when a field was not recognized, so .value would fail in that case
print('Address: ', receipt.fields.get("MerchantAddress").value)
print('Contact Number: ', receipt.fields.get("MerchantPhoneNumber").value)
print('Receipt Date: ', str(receipt.fields.get("TransactionDate").value))
print('Tax Paid: ', receipt.fields.get("Tax").value)
print('Total Amount Paid: ', receipt.fields.get("Total").value)

# Print the details of every line item found on the receipt
items_field = receipt.fields.get("Items")
if items_field:
    for line_item in items_field.value:
        for item_name, item in line_item.value.items():
            print(item_name, ': ', item.value)
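Since a user may upload something that is not a receipt at all, one possible safeguard is to check the key fields before trusting them. The following is only a minimal sketch: it assumes each recognized field exposes `.value` and `.confidence` (as the fields in the code above do), and the helper names, the list of required fields, and the 0.7 threshold are my own illustrative choices. Stub objects stand in for `result[0].fields` so that it runs without calling Azure.

```python
from types import SimpleNamespace

# Illustrative choices, not values taken from the Form Recognizer documentation
REQUIRED_FIELDS = ("MerchantAddress", "TransactionDate", "Total")
MIN_CONFIDENCE = 0.7


def field_value(fields, name, default=None):
    """Return fields[name].value, or default when the field is missing."""
    field = fields.get(name)
    return field.value if field is not None else default


def looks_like_receipt(fields, required=REQUIRED_FIELDS, min_confidence=MIN_CONFIDENCE):
    """Accept the document only if every required field has a value
    and was recognized with at least min_confidence."""
    for name in required:
        field = fields.get(name)
        if field is None or field.value is None:
            return False
        if (field.confidence or 0.0) < min_confidence:
            return False
    return True


# Stand-ins for result[0].fields (hypothetical data)
receipt_fields = {
    "MerchantAddress": SimpleNamespace(value="123 Main Street", confidence=0.95),
    "TransactionDate": SimpleNamespace(value="2021-07-06", confidence=0.91),
    "Total": SimpleNamespace(value=14.5, confidence=0.88),
}
not_a_receipt = {"Total": SimpleNamespace(value=None, confidence=0.1)}

print(looks_like_receipt(receipt_fields))               # True
print(looks_like_receipt(not_a_receipt))                # False
print(field_value(not_a_receipt, "Tax", default="n/a"))  # n/a
```

The same `field_value` helper could replace the direct `.get(...).value` calls when a receipt might be missing some fields.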

Sample Input and Output

I've taken the receipt below as the input,

[Image: sample receipt]

and for it, the above code generated the output below:

[Image: generated output]


Summary

This article covers the high-level steps for using a pre-built ML model to read information from a sales receipt, assuming the reader already knows how to use Python, VS Code, and Jupyter Notebook, and how to import Python modules. If you are new to any of these, I recommend watching my video below, which explains this article from start to end.

Thursday, July 1, 2021

Getting Started with Reading Text from an Image using Azure Cognitive Services

In this article, we will learn how to read or extract text from an image, irrespective of whether it is handwritten or printed.

To read text, two things come into the picture. The first is Computer Vision and the second is NLP, which is short for Natural Language Processing. Computer Vision helps us read the text, and NLP is then used to make sense of the identified text. In this article, I’ll focus specifically on the text extraction part.

How Computer Vision Performs Text Extraction

To execute this text extraction task, Computer Vision provides us with two APIs:

  • OCR API
  • Read API

The OCR API works with many languages and is well suited to relatively small amounts of text. If the image contains a lot of text, that is, a text-dominated image, the Read API is the better option.

The OCR API returns information in the form of Regions, Lines, and Words. A region is an area of the image that contains text. So, the output hierarchy is: Regions, then Lines of text in each region, and then Words in each line.
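To make the Regions, Lines, and Words hierarchy concrete, here is a small sketch that walks a response shaped like the OCR API's JSON and rebuilds each line of text. The sample payload is hand-written for illustration (not real service output), and `ocr_lines` is a hypothetical helper name.

```python
def ocr_lines(ocr_response):
    """Walk Regions -> Lines -> Words and return one string per line."""
    lines = []
    for region in ocr_response.get("regions", []):
        for line in region.get("lines", []):
            words = [word["text"] for word in line.get("words", [])]
            lines.append(" ".join(words))
    return lines


# Hand-written sample that mimics the shape of an OCR response
sample_response = {
    "regions": [
        {"lines": [
            {"words": [{"text": "Hello"}, {"text": "World"}]},
            {"words": [{"text": "Azure"}]},
        ]},
        {"lines": [
            {"words": [{"text": "Second"}, {"text": "region"}]},
        ]},
    ]
}

print(ocr_lines(sample_response))  # ['Hello World', 'Azure', 'Second region']
```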

The Read API works very well with images that are heavily loaded with text. The best example of a text-dominated image is a scanned or printed document. Here the output hierarchy is Pages, Lines, and Words. As this API deals with a large number of lines and words, it works asynchronously, so it does not block our application until the whole document is read. The OCR API, in contrast, works synchronously.
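Because the Read API is asynchronous, the caller typically submits the image and then polls for the result. The sketch below shows only that polling pattern: `get_operation` is a hypothetical callable standing in for the call that fetches the operation status, and the status strings are assumed to follow the Read API's values (`notStarted`, `running`, `succeeded`, `failed`). A fake sequence of responses is used so it runs without Azure.

```python
import time


def wait_for_read_result(get_operation, interval_seconds=1.0, max_attempts=30):
    """Poll until the operation reports a final status or attempts run out."""
    for _ in range(max_attempts):
        operation = get_operation()
        if operation.get("status") in ("succeeded", "failed"):
            return operation
        time.sleep(interval_seconds)
    raise TimeoutError("Read operation did not finish in time")


# Fake status responses standing in for the real service (hypothetical data)
_responses = iter([
    {"status": "notStarted"},
    {"status": "running"},
    {"status": "succeeded", "lines": ["Hello World"]},
])

final = wait_for_read_result(lambda: next(_responses), interval_seconds=0)
print(final["status"])  # succeeded
```

The application stays free to do other work between polls, which is the practical difference from the synchronous OCR API.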

Here is a table showing when to use which:

OCR API                                      | Read API
Good for relatively small text               | Good for text-dominated images, i.e. scanned docs
Output hierarchy: Regions >> Lines >> Words  | Output hierarchy: Pages >> Lines >> Words
Works in a synchronous manner                | Works in an asynchronous manner

Do watch my attached video for the demo and code walkthrough:

Monday, June 21, 2021

Understanding Recall and Precision in Simplest Way

This video talks about how to evaluate any machine learning classification model and which metrics are available to do so. It contains simple-to-follow examples along with calculations and a brief overview of the confusion matrix.
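As a quick worked example of the calculation, precision is TP / (TP + FP) and recall is TP / (TP + FN), where TP, FP, and FN are the true positive, false positive, and false negative counts from the confusion matrix. The counts below are made up purely for illustration.

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up counts: 8 true positives, 2 false positives, 4 false negatives
precision, recall = precision_recall(tp=8, fp=2, fn=4)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.80 recall=0.67
```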

Monday, June 14, 2021

Building End-to-End Custom Image Classifier using Azure Custom Vision

This video talks about the complete flow of image classification, starting from image collection, image tagging/labelling, training classification model, evaluating the model, and testing the predictions using Prediction API. This complete flow relies on Azure Custom Vision service.

Detecting Vehicles in an Image using Azure Custom Vision Service

This video explains how to get started with object detection using the Azure Custom Vision service. It talks about how one can detect cars and buses in any valid image, along with some best practices for training the model. One will also learn about some of the limitations of the detection API.

Monday, May 31, 2021

Getting Started with Image Analysis using Azure Cognitive Services

This is the very first video of this series. It talks about what image analysis is, what it can do, and how to use Azure Cognitive Services to analyze any image, followed by C# code to achieve the image analysis results.

Automatically Shutdown the Azure VM using Automation Tasks Template

Want to learn how to automatically shut down your Azure Virtual Machine to save some money? Here you go:


Friday, May 21, 2021

Creating Virtual Environment for Python from VS Code

When we talk about an environment in Python, we mean the context in which our Python application or program runs.

An environment consists of an interpreter and all the installed packages, which means that one can have multiple environments on a single machine, or rather, every Python application can have its own environment.
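For readers who prefer to see this outside VS Code, here is a minimal sketch that creates an isolated environment with Python's standard `venv` module. The folder name `.venv` and the helper name `create_environment` are just illustrative choices, and a temporary directory stands in for a project folder.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(parent_dir, name=".venv"):
    """Create a virtual environment under parent_dir and return its path."""
    env_dir = Path(parent_dir) / name
    # with_pip=False keeps the sketch fast; pass True to bootstrap pip as well
    venv.create(env_dir, with_pip=False)
    return env_dir


# A temporary directory stands in for a project folder
with tempfile.TemporaryDirectory() as project_dir:
    env_dir = create_environment(project_dir)
    # Every environment gets its own pyvenv.cfg describing its interpreter
    has_config = (env_dir / "pyvenv.cfg").exists()
    print(has_config)  # True
```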

Now the question is, why do we need such environments? 

To know more about virtual environments and how to create one using Visual Studio Code, have a look at my video below.

Tuesday, May 18, 2021

Wednesday, May 5, 2021

Chat Application using Azure Web PubSub Service (Preview)

Azure Web PubSub service, as its name says, is based on the publish-subscribe pattern and enables us to build real-time web applications.

Some popular examples where we can use this service are chat-based applications and collaboration applications, such as whiteboarding. We can also use it for any application that needs instant push notifications. In fact, there are many more examples we can think of.

The best part is that we can use Azure Web PubSub service on all platforms that support WebSocket APIs, and it allows up to 100 thousand concurrent connections at any point in time.
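To illustrate the publish-subscribe pattern itself, here is a tiny in-memory sketch. It is not the Web PubSub SDK; the `Hub` class and its methods are hypothetical and only show how a publisher can reach every subscriber of a topic without knowing who they are.

```python
from collections import defaultdict


class Hub:
    """Minimal in-memory hub: subscribers register callbacks per topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver message to every subscriber of topic; return the count."""
        for callback in self._subscribers[topic]:
            callback(message)
        return len(self._subscribers[topic])


hub = Hub()
alice_inbox, bob_inbox = [], []
hub.subscribe("chat", alice_inbox.append)
hub.subscribe("chat", bob_inbox.append)

delivered = hub.publish("chat", "Hello everyone")
print(delivered, alice_inbox, bob_inbox)  # 2 ['Hello everyone'] ['Hello everyone']
```

In the real service, the hub lives in Azure and the callbacks are WebSocket connections, but the roles stay the same.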

Components required to create a basic chat application:

  1. Instance of Azure Web PubSub Service
  2. Publisher application
  3. Subscriber application

To learn how to create and use these components, I’ve created a complete video demonstrating them:

C# Code for Publisher and Subscriber:

Below is the C# code for the respective classes.

Publisher.cs

using System;
using Azure.Messaging.WebPubSub;

var connectionString = "Your_ConnectionString_Here";
var hub = "Your_Hub_Here";
var serviceClient = new WebPubSubServiceClient(connectionString, hub);

// Keep reading messages from the console and broadcast each one to all connected clients
while (true)
{
    Console.Write("Enter message: ");
    string message = Console.ReadLine();
    serviceClient.SendToAll(message);
}

Subscriber.cs

using System;
using Azure.Messaging.WebPubSub;
using Websocket.Client;

var connectionString = "Your_ConnectionString_Here";
var hub = "Your_Hub_Here";

// Either generate the URL, fetch it from a server, or copy a temporary one from the portal
var serviceClient = new WebPubSubServiceClient(connectionString, hub);
var url = serviceClient.GetClientAccessUri();

using (var client = new WebsocketClient(url))
{
    // Print every message pushed by the service
    client.MessageReceived.Subscribe(msg => Console.WriteLine($"Message received: {msg}"));
    await client.Start();
    Console.WriteLine("I'm connected.");
    Console.Read();
}

Hope you enjoyed learning about Azure Web PubSub service.

Monday, May 3, 2021

401 vs 403 vs 409 - When to use What?

This video explains the use of HTTP status codes 401, 403, and 409, along with an example.
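As a rough rule of thumb: 401 means the caller is not authenticated, 403 means the caller is authenticated but not allowed to perform the action, and 409 means the request conflicts with the current state of the resource. The sketch below encodes that decision order; `status_for` and its parameters are hypothetical names used only for illustration.

```python
from http import HTTPStatus


def status_for(authenticated, permitted, conflicts_with_current_state):
    """Pick a status code following the 401 -> 403 -> 409 decision order."""
    if not authenticated:
        return HTTPStatus.UNAUTHORIZED   # 401: we do not know who the caller is
    if not permitted:
        return HTTPStatus.FORBIDDEN      # 403: known caller, action not allowed
    if conflicts_with_current_state:
        return HTTPStatus.CONFLICT       # 409: e.g. creating a record that already exists
    return HTTPStatus.OK


print(int(status_for(False, False, False)))  # 401
print(int(status_for(True, False, False)))   # 403
print(int(status_for(True, True, True)))     # 409
```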

Connecting Azure Account from Visual Studio Code

This video talks about how to connect to Microsoft Azure using Visual Studio Code and which commands can be executed to interact with Azure.

Monday, April 19, 2021

How to Update Secret in GitHub Repository

This video explains why one may need to change the deployment token that gets generated while pushing our Azure Static Web App code to GitHub. It also covers where and how to change this token.

Sunday, April 18, 2021

CI/CD Setup with GitHub Repository using Visual Studio Code - Azure Static Web App

This video explains how to use the Azure Static Web Apps (Preview) extension to set up CI/CD with a GitHub repository. It also explains how to push the code to GitHub, and build and deploy it from VS Code.

Sunday, April 11, 2021

Visual Studio Code - Enable Spelling Checker

Recently, I recorded a video that demonstrates how to get rid of typographic errors we may leave behind while writing code. It is very common that, while writing logic, we are in such a hurry that we do not verify variable names or class names. To help with this, Visual Studio Code has an extension called Code Spell Checker. This video throws some light on how to install and use this extension. Have a look:

Sunday, April 4, 2021

Visual Studio Code - How to get started with C# project

Recently, I recorded a video that demonstrates how to get started with C# development in Visual Studio Code. I give a brief overview of this editor and also talk about which extensions need to be set up to make a developer’s life easy. One will also learn how to create a solution file and a project from scratch using commands.

Tuesday, March 30, 2021

Redirecting Traffic Based On Priority Using Azure Traffic Manager

For any web application to work, there has to be an associated endpoint, which means that whenever a user sends a request, the endpoint is the first thing that gets hit. In simple terms, an endpoint is an internet-facing service that can be hosted either in Microsoft Azure or outside of Azure.

What is Azure Traffic Manager

Now, what is Traffic Manager? As its name suggests, it manages traffic. It distributes traffic across various Azure regions and provides health monitoring capabilities. So, if you are planning multi-region support with high availability, it could be a perfect service for you.

Now, you must be wondering what makes this happen. It uses DNS to route traffic to endpoints based on the selected Traffic Manager profile and the configured routing method. I’ll shortly cover the available routing methods, but before that, let’s take a quick look at some of the major benefits of using Traffic Manager.

Benefits of using Azure Traffic Manager

  • Traffic Manager provides automatic failover whenever an endpoint goes down, as it keeps monitoring the endpoints.
  • In case of planned maintenance, Traffic Manager redirects the traffic to the other configured endpoints, so we need not worry about downtime windows.
  • Traffic Manager can work out which endpoint will provide the lowest latency to the user. In this way, it improves responsiveness by redirecting traffic to such endpoints.
  • Another important capability of Traffic Manager is that it supports not only Azure endpoints but also external, non-Azure endpoints.
  • Lastly, Traffic Manager allows us to combine multiple routing methods to handle complex business scenarios using nested profiles.

Ways to manage traffic

Currently, there are 6 ways one can manage DNS routing.


Performance 

This method is useful when endpoints are configured in different geographic locations and one wants to select the closest one in terms of lowest network latency.


Weighted

In this method, one can create endpoints with designated weights, ranging between 1 and 1000, and traffic is distributed across them in proportion to those weights.
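The idea behind weighted routing can be sketched as a weighted random pick. This is only a simulation of the concept, not how Traffic Manager is implemented; the endpoint names, weights, and `pick_weighted` helper are hypothetical.

```python
import random
from collections import Counter


def pick_weighted(endpoints, rng=random):
    """Pick one endpoint name with probability proportional to its weight."""
    names = list(endpoints)
    weights = [endpoints[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical endpoints with weights in the 1..1000 range
endpoints = {"west-us": 900, "east-us": 100}

rng = random.Random(42)  # fixed seed so repeated runs behave the same
counts = Counter(pick_weighted(endpoints, rng) for _ in range(1000))
print(counts)
```

With these weights, roughly nine out of ten simulated requests should land on the `west-us` endpoint.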


Geographical

This method routes users based on geolocation, that is, based on the geography their DNS query originates from. It may look similar to Performance, but it is actually different. The best use case I can think of is this: say that, due to compliance and government regulations, one wants all the traffic from EAST US to be redirected to the WEST US endpoint. This can be achieved by selecting the geo-based profile.


Priority

As its name says, it works on the basis of assigned priority: traffic is directed to the endpoint with the highest priority. If that endpoint is down or unhealthy, requests are automatically redirected to the endpoint with the second-highest priority, and so on. This routing method is very useful in disaster recovery scenarios.
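The failover behavior can be sketched as picking the healthy endpoint with the best priority. This is a simplified simulation with hypothetical endpoint data; it assumes Azure's convention that a lower number means a higher priority.

```python
def pick_by_priority(endpoints):
    """Return the name of the healthy endpoint with the best priority.

    Assumes a lower number means a higher priority (1 is tried first).
    Returns None when no endpoint is healthy.
    """
    healthy = [endpoint for endpoint in endpoints if endpoint["healthy"]]
    if not healthy:
        return None
    return min(healthy, key=lambda endpoint: endpoint["priority"])["name"]


# Hypothetical endpoints: the primary one is currently down
endpoints = [
    {"name": "primary", "priority": 1, "healthy": False},
    {"name": "secondary", "priority": 2, "healthy": True},
    {"name": "tertiary", "priority": 3, "healthy": True},
]

print(pick_by_priority(endpoints))  # secondary
```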


Multivalue

In this routing method, multiple endpoints are returned for a single client request. One caveat is that endpoints cannot be specified by server name; they must be IPv4 or IPv6 addresses. Unfortunately, I couldn’t find a real-world use case for this one.


Subnet

This method allows us to map ranges of IP addresses to specific endpoints. Say you want certain users, or to be specific, certain IP addresses, to always use the WEST US endpoint while everyone else uses the other endpoints; then this routing method can be used.

Well, enough theory. Let’s have a look at a practical example of how priority-based DNS routing works.
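The subnet idea can be sketched with Python's standard `ipaddress` module. The address range, endpoint names, and `pick_by_subnet` helper are hypothetical (the range comes from the IP blocks reserved for documentation), and the real service resolves this at the DNS level rather than in application code.

```python
import ipaddress

# Hypothetical mapping: clients in this range always go to west-us
SUBNET_MAP = [
    (ipaddress.ip_network("203.0.113.0/24"), "west-us"),
]
DEFAULT_ENDPOINT = "east-us"


def pick_by_subnet(client_ip, subnet_map=SUBNET_MAP, default=DEFAULT_ENDPOINT):
    """Return the endpoint mapped to the client's subnet, else the default."""
    address = ipaddress.ip_address(client_ip)
    for network, endpoint in subnet_map:
        if address in network:
            return endpoint
    return default


print(pick_by_subnet("203.0.113.25"))  # west-us
print(pick_by_subnet("198.51.100.7"))  # east-us
```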


Thursday, March 25, 2021

Enabling Preview Features in Visual Studio 2019

By default, Visual Studio 2019 doesn’t enable preview features. Say you have .NET 5.0 installed on your machine and you are creating a new Console Application.

After creating the application, you will notice that, although .NET 5.0 is installed, the application still picks up .NET Core 3.1 as the default framework. In fact, it doesn’t even ask for a framework selection.

So, how can we select the framework while creating the application? To do this, we need to enable Preview Features in Visual Studio by going to the Options menu, as shown below:

[Screenshot: enabling Preview Features from the Options menu]

Video of this feature can be found here.

Hope you enjoyed learning this cool tip.

Monday, March 22, 2021

Get Notified via Azure Event Grid whenever Azure Blob is updated

In this article, we will learn how to get notified whenever something changes in Azure Storage. Let’s say we want our application to be notified whenever a new file is uploaded to blob storage or a file is deleted from it. How can we do that?

Before jumping directly into the solution, let’s list all the major pieces that contribute to it.

Azure Storage: We need an Azure Storage account to which we will upload our blob objects, e.g. image files.

Azure Event Grid: Next, we need Azure Event Grid. It is an event routing service provided by Microsoft, and the best part is that it has built-in support for events coming from storage blobs and resource groups. We need to create a subscription that specifies which events and which topic we are interested in.

Endpoint: Lastly, we need an endpoint that will receive the notifications. It could be a Function App, a Logic App, or any custom application that is hosted somewhere and is accessible over HTTP.
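To give a feel for what such an endpoint does, here is a minimal sketch of the handling logic, kept independent of any web framework. It assumes the Event Grid event schema, where the request body is a JSON array of events with `eventType` and `data` fields; `handle_events` is a hypothetical function name, and the sample URL is made up.

```python
import json


def handle_events(body):
    """Process an Event Grid request body and return a response payload."""
    events = json.loads(body)
    if isinstance(events, dict):
        events = [events]

    messages = []
    for event in events:
        event_type = event.get("eventType", "")
        data = event.get("data", {})
        # Event Grid first sends a validation event that must be echoed back
        if event_type == "Microsoft.EventGrid.SubscriptionValidationEvent":
            return {"validationResponse": data.get("validationCode")}
        if event_type == "Microsoft.Storage.BlobCreated":
            messages.append(f"created: {data.get('url')}")
        elif event_type == "Microsoft.Storage.BlobDeleted":
            messages.append(f"deleted: {data.get('url')}")
    return {"handled": messages}


# Made-up event mimicking a blob upload notification
sample_body = json.dumps([
    {
        "eventType": "Microsoft.Storage.BlobCreated",
        "data": {"url": "https://example.blob.core.windows.net/images/cat.png"},
    }
])

print(handle_events(sample_body))
```

In a Function App or custom web application, this function would be called with the body of the incoming HTTP POST request.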

With this brief theoretical background, we are ready to proceed with the implementation.

Complete article can be found here.

Video tutorial can be found here.

Saturday, March 20, 2021

Bring Azure Blob Objects Back to Life

Nowadays, many applications use Azure Blob Storage for reading and writing objects. Given that, it is quite common for these objects to get deleted accidentally, due to a user’s negligence or an application's behavior.

So, today’s write-up is about how we can bring our deleted blob objects back to life.

The lifecycle of blob objects is managed by a very well-known concept called versioning. Versioning deals with state: when an object was created, when it was modified, and when it was deleted. If this topic interests you, you can read the complete blog post here or watch it on my YouTube channel.