Tuesday, February 25, 2020

Utilizing Azure Blob and WebJob to Convert Excel Files to Flat File Format

There are already many articles and blog posts describing how to convert an Excel file to a comma-separated file using C#, and in all the cases I referred to, the Excel file is read from the hard drive of a local machine and the CSV file is saved back to the same drive. Knowing this, I'm still going to draft another post.

Wondering why?

Well, this post is slightly different in the way files are read and saved back. Here are the main questions it addresses:
  • What if we have many Excel files to convert but the disk doesn't have enough space to hold all of them? The same applies to the conversion output.
  • What if we don't have permission to save the converted files on the local machine?
  • How can we run this conversion utility using WebJobs?
To address these challenges, we can use Azure capabilities and do everything on the fly, without using local disk space as storage for our files. Let's see it all in action, step by step.

Problem Statement
Read Excel files from Azure blob storage, convert them to CSV format, and upload them back to Azure blob storage. The entire process has to run via a triggered WebJob, so that it can be repeated whenever Excel-to-CSV conversion is required.

Setting up environment
I'm using Visual Studio 2019 v16.4.0 and an active Azure subscription.

High level steps
  • Creating containers in Azure storage
  • Reading Excel from Azure storage
  • Converting Excel to CSV format
  • Uploading CSV to Azure storage
  • Creating Azure WebJob
  • Triggering Azure WebJob
Creating containers in Azure storage
A container must be created under the blob service to store all the Excel files which need to be converted to CSV format. There are two ways to create a container – through the Azure portal or by using C#. As both are well documented on MSDN, I'm not going to repeat the entire procedure.

For detailed steps on how to create a container, please refer to the references section at the end of this article.
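For completeness, creating a container from C# with the Azure.Storage.Blobs SDK takes only a couple of lines. Below is a minimal sketch; the connection string source and the ContainerSetup class name are my own assumptions, while the container names match the ones used throughout this post:

```csharp
using Azure.Storage.Blobs;

public static class ContainerSetup
{
    // Creates the two containers used in this post, if they don't exist yet.
    // The connection string is assumed to come from your app configuration.
    public static void EnsureContainers(string connectionString)
    {
        var serviceClient = new BlobServiceClient(connectionString);
        serviceClient.GetBlobContainerClient("excelcontainer").CreateIfNotExists();
        serviceClient.GetBlobContainerClient("csvcontainer").CreateIfNotExists();
    }
}
```

CreateIfNotExists is idempotent, so this is safe to call on every run of the utility.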
     
For our exercise, I've created two containers named excelcontainer and csvcontainer under one storage account, where:

excelcontainer – holds the Excel files which are to be converted to CSV
csvcontainer – holds the converted CSV files

Below is a screenshot of my excelcontainer, which holds 3 Excel workbooks:

Reading Excel from Azure storage
Now that excelcontainer is ready with uploaded files, it's time to read the data from all those files. Here is the code to do that:


public async Task<List<BlobOutput>> Download(string containerName)
{
    var downloadedData = new List<BlobOutput>();

    // Create a container client for the given container
    BlobContainerClient blobContainerClient =
        _blobServiceClient.GetBlobContainerClient(containerName);

    // List all blobs in the container
    await foreach (BlobItem item in blobContainerClient.GetBlobsAsync())
    {
        // Download each blob's contents as a stream
        BlobClient blobClient = blobContainerClient.GetBlobClient(item.Name);
        BlobDownloadInfo downloadedInfo = await blobClient.DownloadAsync();
        downloadedData.Add(new BlobOutput
        {
            BlobName = item.Name,
            BlobContent = downloadedInfo.Content
        });
    }
    return downloadedData;
}

Where BlobOutput is a DTO with the below members:
public class BlobOutput
{
      public string BlobName { get; set; }
      public Stream BlobContent { get; set; }
}

Converting Excel to CSV format
In the above step, we collected the data from each blob into a stream. In this step, we will convert the streamed data into CSV format. Here is the code for that:


public static List<BlobInput> Convert(List<BlobOutput> inputs)
{
    var dataForBlobInput = new List<BlobInput>();
    foreach (BlobOutput item in inputs)
    {
        using (SpreadsheetDocument document =
               SpreadsheetDocument.Open(item.BlobContent, false))
        {
            foreach (Sheet sheet in
                     document.WorkbookPart.Workbook.Descendants<Sheet>())
            {
                var worksheetPart =
                    (WorksheetPart)document.WorkbookPart.GetPartById(sheet.Id);
                Worksheet worksheet = worksheetPart.Worksheet;

                // Shared strings are stored once per workbook; a workbook
                // with no text cells may not have this part at all
                SharedStringItem[] sharedStringItems =
                    document.WorkbookPart.GetPartsOfType<SharedStringTablePart>()
                            .FirstOrDefault()?
                            .SharedStringTable.Elements<SharedStringItem>()
                            .ToArray() ?? Array.Empty<SharedStringItem>();

                var stringBuilder = new StringBuilder();
                foreach (Row row in worksheet.Descendants<Row>())
                {
                    foreach (Cell cell in row)
                    {
                        string value = string.Empty;
                        if (cell.CellValue != null)
                        {
                            // Text cells store an index into the shared string table
                            if (cell.DataType != null &&
                                cell.DataType.Value == CellValues.SharedString)
                                value = sharedStringItems[int.Parse
                                        (cell.CellValue.Text)].InnerText;
                            else
                                value = cell.CellValue.Text;
                        }
                        stringBuilder.Append($"{value.Trim()},");
                    }
                    stringBuilder.Append("\n");
                }
                byte[] data = Encoding.UTF8.GetBytes
                                  (stringBuilder.ToString().Trim());
                // Path (System.IO) handles file names containing extra dots
                string fileNameWithoutExtn =
                    Path.GetFileNameWithoutExtension(item.BlobName);
                string newFilename = $"{fileNameWithoutExtn}_{sheet.Name}.csv";
                dataForBlobInput.Add(new BlobInput
                {
                    BlobName = newFilename,
                    BlobContent = data
                });
            }
        }
    }
    return dataForBlobInput;
}

where BlobInput is a DTO with the below members:
public class BlobInput
{
        public string BlobName { get; set; }
        public byte[] BlobContent { get; set; }
}

If a workbook contains multiple sheets, a separate CSV will be created for each sheet, with the file name format <ExcelFileName>_<SheetName>.csv.
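One caveat: the simple comma-joining above produces malformed CSV if a cell value itself contains a comma, a quote, or a line break. A minimal escaping helper (my own addition, not part of the original utility) could be dropped in before appending each cell value:

```csharp
using System;

public static class CsvHelper
{
    // Wrap a field in quotes (doubling any embedded quotes) only when the
    // value contains a comma, quote, or line break, per RFC 4180
    public static string Escape(string field)
    {
        if (field == null) return string.Empty;
        if (field.IndexOfAny(new[] { ',', '"', '\n', '\r' }) < 0)
            return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    public static void Main()
    {
        Console.WriteLine(Escape("plain"));       // plain
        Console.WriteLine(Escape("a,b"));         // "a,b"
        Console.WriteLine(Escape("say \"hi\""));  // "say ""hi"""
    }
}
```

With this in place, `stringBuilder.Append($"{value.Trim()},")` becomes `stringBuilder.Append($"{CsvHelper.Escape(value.Trim())},")`.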

Uploading CSV to Azure storage
Once the data is converted to CSV, we are ready to upload the CSV files back to a container. Here is the code to perform this:


public async Task Upload(string containerName, List<BlobInput> inputs)
{
    // Create a container client for the target container
    BlobContainerClient blobContainerClient =
        _blobServiceClient.GetBlobContainerClient(containerName);

    foreach (BlobInput item in inputs)
    {
        // Get a reference to a blob and upload, replacing any existing blob
        BlobClient blobClient =
            blobContainerClient.GetBlobClient(item.BlobName);
        using (var ms = new MemoryStream(item.BlobContent))
        {
            await blobClient.UploadAsync(ms, overwrite: true);
        }
    }
}

So far, we have read the Excel files from a container, converted them to CSV format, and uploaded the results back to another container. All good. The next task is to automate this using a triggered WebJob.
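Tying the three steps together, the WebJob's entry point can be a small console Main. The sketch below is a sketch only: BlobService and ExcelConverter are hypothetical names wrapping the Download/Upload and Convert methods shown above, and "AzureWebJobsStorage" is an assumed app setting holding the storage connection string:

```csharp
using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs;

public class Program
{
    public static async Task Main()
    {
        // "AzureWebJobsStorage" is an assumed setting name; WebJobs
        // conventionally expose the storage connection string this way
        string connectionString =
            Environment.GetEnvironmentVariable("AzureWebJobsStorage");
        var serviceClient = new BlobServiceClient(connectionString);

        // Hypothetical wrappers around the methods shown earlier in this post
        var blobService = new BlobService(serviceClient);
        var excelBlobs = await blobService.Download("excelcontainer");
        var csvBlobs = ExcelConverter.Convert(excelBlobs);
        await blobService.Upload("csvcontainer", csvBlobs);
    }
}
```

Each manual (or scheduled) trigger of the WebJob then re-runs the whole download–convert–upload cycle.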

Creating Azure WebJob
A WebJob can be created from Visual Studio by right-clicking on the project and selecting Publish…

Apart from this, there are many other ways to create a triggered WebJob, and all of them are documented on MSDN.

Triggering Azure WebJob
If everything is set up correctly, you will see the below screen on your Azure portal.


As this is a triggered WebJob, clicking the Run button will trigger the job and produce output as shown below:


Takeaway
Using Azure storage and a WebJob, we converted files from one format to another without using local disk space to save files at any point during the conversion process.

References:

Monday, February 3, 2020

Getting host information from the current URL in ASP.NET Core 3.1

While working on a web application, it's quite natural to jump between various environments (i.e. Development, Testing, Production, etc.) during the phases of the product life cycle. In other words, these environments may have different host addresses. Let's have a look at a few of those.

During the development phase, we usually run our application with http://localhost:8080/features/..., where our host is localhost:8080

During the testing phase, the same application can be run on http://www.consumerapps.com/features/..., where our host is www.consumerapps.com

Now, what if we want to record the host name in a log file for audit purposes? We cannot hard-code it in the application, as it may change based on the environment the application is running on.

In ASP.NET Core 3.1, this can be easily achieved using HttpContext. The first change we have to make is to register IHttpContextAccessor as a singleton (services.AddHttpContextAccessor() is a built-in shorthand for the same registration):

 services.AddSingleton<IHttpContextAccessor, HttpContextAccessor>(); 

Next, we have to make it available to the controller via constructor injection:

public class HomeController : Controller
{
        private readonly ILogger<HomeController> _logger;
        private readonly IHttpContextAccessor _httpContextAccessor;

        public HomeController(ILogger<HomeController> logger,
                              IHttpContextAccessor httpContextAccessor)
        {
            _logger = logger;
            _httpContextAccessor = httpContextAccessor;
        }
}

Once the above setup is done, the host name can be accessed in any action using the below line of code:

string host = _httpContextAccessor.HttpContext.Request.Host.Value;

Hope you enjoyed this tip.

Thursday, December 26, 2019

Globally configuring values for JSON Serializer in ASP.NET Core 3.1

This article will focus on how one can set certain constraints on a given data type for JSON serialization at the application level, which means the change is made globally rather than for a specific custom class or property. We will also see how one can fall back to the default settings after this application-level change.
Let's understand this with the help of an example.
Making application level changes for JSON serialization

The problem statement here is: we want all float values restricted to 3 decimal places. One way to achieve this is to decorate every float property in every model class with a specific attribute using [JsonConverter(typeof(…))].
With the above attribution, one can indeed achieve the goal of conversion or data formatting, but what if there are many float values across the application? Is it feasible to go and change every single float property under every model class? I feel, NO :(

So, the solution to this problem is to apply the setting globally, which can be done in two steps: first, introduce a class which takes care of the conversion or data formatting; second, associate this class with the ASP.NET Core serializer options. Here is the code:
Model class:
public class Calculate
{
  public float Price { get; set; }
  public float Rate { get; set; }
}
Converter class:
public class FloatConverter : JsonConverter<float>
{
  public override float Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
  {
    throw new NotImplementedException();
  }
  public override void Write(Utf8JsonWriter writer, float value, JsonSerializerOptions options)
  {
    writer.WriteStringValue(string.Format("{0:F3}", value));
  }
}

and then a few code changes in the Startup class, as follows:
public void ConfigureServices(IServiceCollection services)
{
  ...
  services.AddControllersWithViews().AddJsonOptions(options =>
    {
       options.JsonSerializerOptions.Converters.Add(new FloatConverter());
    });
}

Now, if you run your application, you will notice that all floats coming out of JSON serialization are shown with 3 decimal places.
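To see the converter in action outside of ASP.NET Core, here is a self-contained sketch (the Demo class and the sample values are my own; I also pass CultureInfo.InvariantCulture so the output doesn't depend on the machine's locale, which the article's shorter version omits):

```csharp
using System;
using System.Globalization;
using System.Text.Json;
using System.Text.Json.Serialization;

public class FloatConverter : JsonConverter<float>
{
    public override float Read(ref Utf8JsonReader reader, Type typeToConvert,
                               JsonSerializerOptions options)
        => throw new NotImplementedException();

    // Emit every float as a string with exactly 3 decimal places;
    // InvariantCulture keeps the output locale-independent
    public override void Write(Utf8JsonWriter writer, float value,
                               JsonSerializerOptions options)
        => writer.WriteStringValue(value.ToString("F3", CultureInfo.InvariantCulture));
}

public class Calculate
{
    public float Price { get; set; }
    public float Rate { get; set; }
}

public static class Demo
{
    public static string Serialize()
    {
        var options = new JsonSerializerOptions();
        options.Converters.Add(new FloatConverter());
        return JsonSerializer.Serialize(
            new Calculate { Price = 12.3456f, Rate = 0.5f }, options);
    }

    public static void Main() =>
        Console.WriteLine(Serialize()); // {"Price":"12.346","Rate":"0.500"}
}
```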

Fallback to default settings

Now let's say there are a few float properties in our model class to which we don't want to apply this global setting. How do we solve this?

Well, to make the default settings work, we have to create another converter which overrides the global setting, and then decorate all such properties with this new converter. This new converter does nothing more than retain the default behavior. In our example, I don't want to apply any conversion logic to the Rate property. So, this is how we can change our model class:

public class Calculate
{
  public float Price { get; set; }
  [JsonConverter(typeof(DefaultFloatConverter))]
  public float Rate { get; set; }
}
and code for DefaultFloatConverter is as follows:
public class DefaultFloatConverter : JsonConverter<float>
{
  public override float Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
  {
   throw new NotImplementedException();
  }
  public override void Write(Utf8JsonWriter writer, float value, JsonSerializerOptions options)
  {
    writer.WriteNumberValue(value);
  }
}

Now, if you run the application, you will notice that Rate comes out in its default precision whereas Price is restricted to 3 decimals.

Hope you enjoyed reading.

Saturday, August 3, 2019

Avoid duplication of ModelState.IsValid in ASP.NET Core

Generally, whenever something is to be saved to the database or to any other place, as a best practice almost everyone validates the state of the model. If the model state is valid, we proceed; if it is invalid, we handle it as a bad request. This looks something like this:

if (!ModelState.IsValid)
{
    // create bad request object
}

All of this is done using the IsValid property.

Problem

Now, what if we have to perform the same validation for all the models? Are we going to write the same validation in each and every controller?

Of course, No.

Solution

Rather than duplicating the same code in each and every controller, we can create a global filter. This filter exposes a few methods, but for our purpose OnActionExecuting is enough.
public class ValidateModelStateFilter : ActionFilterAttribute
{
    public override void OnActionExecuting(ActionExecutingContext context)
    {
        if (!context.ModelState.IsValid)
        {
            context.Result = new BadRequestObjectResult(context.ModelState);
        }
    }
}
       
 
Next is to update the ConfigureServices method under the Startup class:

services.AddMvcCore(options =>
{
     options.Filters.Add(typeof(ValidateModelStateFilter));
});
Once the above changes are done, we need not repeat the same ModelState validation check in each and every controller.

Hope this tip would be useful.

Thursday, July 11, 2019

Received MVP Award for the 3rd time


Another piece of amazing news arrived this week: the MVP award is in my hands for the 3rd time.


Sunday, July 7, 2019

Build errors dialog in Visual Studio

Background
When you are in the middle of writing your logic and accidentally press F5, what happens? Boom… you end up with the below dialog due to compilation errors:


Now think about it. If we are writing code, we want it to be executed whenever we run the application, don't we? It's a very rare case that someone would still like to execute the previous build. At least for me, I never want this to happen.

Solution

So, how do we get rid of this rarely useful prompt? There are two ways:

First, simply check the checkbox 'Do not show this dialog again'. But on one of my machines this checkbox did not even appear, hence I opted for the second option.

Second, go to Tools >> Options… >> Projects and Solutions >> Build and Run. On the right-side panel, change the value of the dropdown under 'On Run, when build or deployment errors occur:' to Do not Launch, and we are done.

Going forward, you will never see that dialog unless the build fails to complete successfully. Hope you like this small trick.