Sunday, May 20, 2012

Performance analysis for String and StringBuilder

Sometimes small changes in our code make a huge difference to performance. There are many tips and tricks available, and I am going to discuss one of them here: String vs StringBuilder. One needs to be careful when working with strings, because they can have a large impact on memory usage. I know there are plenty of articles on the net about String and StringBuilder, but I am still going to show this using some statistics.

Here I am using a .NET Framework (Fx) 4.0 C# console application with separate static methods to showcase my analysis. Basically, I have a String variable named outputString, and I loop 1000 times, concatenating onto outputString in each iteration.

Please note that the concatenation is done with the + operator. Because strings are immutable, every concatenation with + creates a new String object internally. The same happens in my snippet: it loops 1000 times, so it creates 1000 String objects, and each time the new object is assigned to the variable outputString while the previous one becomes garbage. That is why concatenating strings with the plus (+) sign inside a loop costs our application performance.
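The snippet itself is not reproduced in the text, so here is a minimal sketch of what such a loop might look like. The class name ConcatDemo, the method name BuildWithConcat, and the choice to append the loop counter are my assumptions for illustration; only the variable name outputString and the 1000 iterations come from the description above.

```csharp
using System;

public static class ConcatDemo
{
    // Hypothetical sketch: builds a string with the + operator inside a loop.
    public static string BuildWithConcat(int iterations)
    {
        string outputString = string.Empty;
        for (int i = 0; i < iterations; i++)
        {
            // Strings are immutable, so each + allocates a new string and
            // copies the old contents plus the appended text into it.
            outputString = outputString + i.ToString();
        }
        return outputString;
    }

    public static void Main()
    {
        string result = BuildWithConcat(1000);
        Console.WriteLine(result.Length); // prints 2890 (digits of 0..999)
    }
}
```

Each pass throws away the previous string, so the total amount copied grows much faster than the length of the final result.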

Well, I guess this much boring theory is enough. Let's move on to the statistics.

Here I am using CLR Profiler, which is a really good tool for analysing code performance. It tells us how many bytes of memory are consumed, how the Garbage Collector performs, and how many objects end up in the Gen0, Gen1 and Gen2 buckets. At the same time, the statistics it provides are very easy to understand.

OK, I ran CLR Profiler on the code above and got the statistics below. I am not going to cover GC generations in detail, but I would like to touch on them briefly. Newly created objects first land in the G0 bucket. Objects that survive a G0 collection are moved to G1, and objects that survive a G1 collection are moved to G2. The .NET GC visits G1 and G2 far less often than G0. Because it visits G0 frequently, short-lived objects are released quickly and cheaply. So, if your application creates many objects that end up in G1 and G2, it is not a good sign.
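To see promotion between buckets for yourself, a small sketch like the following can help. The class and method names are mine, and forcing collections with GC.Collect() is done here only for demonstration. On the desktop CLR the reported generation usually climbs after each collection, but the exact numbers are up to the runtime, so treat the output as an observation rather than a guarantee.

```csharp
using System;

public static class GenerationDemo
{
    // Returns the generation of one live object right after allocation,
    // after one forced collection, and after a second forced collection.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        int[] seen = new int[3];

        seen[0] = GC.GetGeneration(survivor);

        GC.Collect(); // forced collection, for demonstration only
        seen[1] = GC.GetGeneration(survivor);

        GC.Collect();
        seen[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor); // keep the object reachable until this point
        return seen;
    }

    public static void Main()
    {
        int[] seen = ObserveGenerations();
        Console.WriteLine("After allocation:      gen " + seen[0]);
        Console.WriteLine("After one collection:  gen " + seen[1]);
        Console.WriteLine("After two collections: gen " + seen[2]);
        Console.WriteLine("Highest generation:    gen " + GC.MaxGeneration);
    }
}
```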

Now quickly jumping back to our example:

Here we see that heap bytes are present in all three generations (Gen 0, Gen 1 and Gen 2), and the memory usage is a 7-digit number (2,894,353 bytes).
Relocated bytes are bytes that the GC moved while compacting the heap, which means they belong to objects that survived a collection and are on their way to G1. I am not going to analyse the whole result, but we can already see some negative signs, because some of the objects are ending up in the G1 and G2 buckets as well.

Before commenting on it, let's look at StringBuilder's data. In this example, I created a StringBuilder instance named sb. I am doing the same thing, but instead of a string I am using an instance of StringBuilder. With StringBuilder, appending a value does not create a new string object; the characters are written into the internal buffer of the same sb object, which allocates more space only when that buffer fills up. So, internally, it is not creating a new object for every concatenation. This is the real benefit of StringBuilder compared to a String object.
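As with the String version, the snippet is not reproduced in the text, so this is only a sketch of what the StringBuilder loop might look like. The class name BuilderDemo, the method name BuildWithStringBuilder, and appending the loop counter are assumptions; the instance name sb is from the description above.

```csharp
using System;
using System.Text;

public static class BuilderDemo
{
    // Hypothetical sketch: builds the same text by appending to one StringBuilder.
    public static string BuildWithStringBuilder(int iterations)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < iterations; i++)
        {
            // Append writes into sb's internal buffer; no intermediate
            // string holding the whole result is created per iteration.
            sb.Append(i);
        }
        // The single final string is materialised only once, here.
        return sb.ToString();
    }

    public static void Main()
    {
        string result = BuildWithStringBuilder(1000);
        Console.WriteLine(result.Length); // prints 2890 (digits of 0..999)
    }
}
```

If the final size is roughly known in advance, passing a capacity to the StringBuilder constructor can avoid some of the buffer growth as well.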

Although we are looping 1000 times, we are not creating 1000 string objects. That is how we control memory usage and the creation of new objects. Now let's run the profiler and check out the results.

Here we see that memory bytes are reduced to 5 digits (92,332) and there are no relocated bytes. If we look at the heap bytes, they are reported as unknown (0) for all of G0, G1 and G2. It means that none of the objects were moved to G1 or G2. All the objects were created in G0 and released from G0 itself.

So, we can see a significant difference in both memory usage and the GC's bucket movements.
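If CLR Profiler is not at hand, a rough comparison can also be made from inside the program. The sketch below is not the method used for the numbers above; it is a hypothetical harness (all names are mine) that times each approach with Stopwatch and counts G0 collections with GC.CollectionCount. I use 5000 iterations instead of 1000 so that the difference has a better chance of showing up; the actual figures will vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class CompareDemo
{
    public static string BuildWithConcat(int iterations)
    {
        string outputString = string.Empty;
        for (int i = 0; i < iterations; i++)
        {
            outputString = outputString + i.ToString(); // new string each pass
        }
        return outputString;
    }

    public static string BuildWithStringBuilder(int iterations)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < iterations; i++)
        {
            sb.Append(i); // appends into the same buffer
        }
        return sb.ToString();
    }

    // Runs one approach and reports elapsed time and G0 collections during the run.
    public static void Measure(string label, Func<int, string> build, int iterations)
    {
        // Start from a freshly collected heap so earlier garbage does not skew the count.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        int gen0Before = GC.CollectionCount(0);
        Stopwatch watch = Stopwatch.StartNew();
        string result = build(iterations);
        watch.Stop();
        int gen0After = GC.CollectionCount(0);

        Console.WriteLine("{0}: length={1}, elapsed={2} ms, gen0 collections={3}",
            label, result.Length, watch.ElapsedMilliseconds, gen0After - gen0Before);
    }

    public static void Main()
    {
        const int iterations = 5000; // assumption: larger than the article's 1000
        Measure("String +", BuildWithConcat, iterations);
        Measure("StringBuilder", BuildWithStringBuilder, iterations);
    }
}
```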

Hence we can conclude that we should prefer StringBuilder over String, especially when we are doing many concatenations, such as inside a loop.
