Wednesday, April 9, 2014

Controlling Degree of Concurrent Execution in Parallel Loops

On a multi-core machine, you can use parallelism to speed up long-running and blocking tasks. Most of you have probably used Parallel.For, Parallel.ForEach and the TPL (Task Parallel Library) several times. Generally, parallelism is used to take advantage of all the available cores on the machine, which commonly leads to the shortest execution time for a task.
  1. Have you ever thought about the possible drawbacks of using all the available cores for just a single application?
  2. Is it really necessary to use all the available cores of your machine for a single application? Can't we use only a few of them?
  3. Is there any way to restrict these parallel loops in terms of cores?
In short: there are drawbacks, using every core is not always necessary, and yes, parallel loops can be restricted.

But why should we bother about how many cores are participating in execution?
Well, there can be several reasons. The foremost one, in my view, is this: when a single time-consuming application or task runs on all the available cores and uses most of the processing power, what happens to the other applications running on the same machine? They may slow down noticeably or even appear to hang, which is not acceptable.

Another reason: if your long-running process is executing on a server, it is not a good idea to raise an unlimited number of requests at the same time, since that may lead to server timeouts and, in effect, resembles a DoS (denial-of-service) attack. In layman's terms, your server may go down, which is again unacceptable.

Solution
All the preceding problems can be resolved by explicitly limiting how many cores a parallel loop is allowed to keep busy. Instead of using all the cores for a single long-running process, use only a few of them, so that the other cores remain available to the rest of the applications.

Before proceeding, let's have a look at a small code snippet.

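// Requires: using System.Diagnostics; using System.Threading.Tasks;
// The upper bound (10) is exclusive, so i runs from 1 to 9.
Parallel.For(1, 10, i =>
{
    Debug.WriteLine("{0} is executed on Task {1}", i, Task.CurrentId);
});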

If you run the code above multiple times, your output will most likely vary from run to run. Let's have a look at a few outputs:

Output on first run
3 is executed on Task 2
7 is executed on Task 4
4 is executed on Task 2
6 is executed on Task 2
1 is executed on Task 1
9 is executed on Task 5
8 is executed on Task 4
2 is executed on Task 2
5 is executed on Task 3

Output on second run
9 is executed on Task 5
5 is executed on Task 3
7 is executed on Task 4
3 is executed on Task 2
1 is executed on Task 1
6 is executed on Task 3
2 is executed on Task 5
8 is executed on Task 4
4 is executed on Task 2

How to control this degree of concurrency?
The Task Parallel Library already provides a property for this: MaxDegreeOfParallelism on the ParallelOptions class. By assigning a value to MaxDegreeOfParallelism, we can restrict the number of operations a parallel loop runs concurrently, and thereby the number of processor cores the loop can keep busy at once. The default value of this property is -1, which means there is no restriction on concurrently running operations. Let's quickly jump to the code:

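// Requires: using System.Diagnostics; using System.Threading.Tasks;
// Allow at most 2 iterations of the loop to run at the same time.
ParallelOptions po = new ParallelOptions();
po.MaxDegreeOfParallelism = 2;
Parallel.For(1, 10, po, i =>
{
    Debug.WriteLine("{0} is executed on Task {1}", i, Task.CurrentId);
});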

In the code above, I am setting MaxDegreeOfParallelism to 2, which means that at most 2 iterations run concurrently, which in turn keeps at most 2 cores busy. On executing the code above, you will get output like the following:

Output on first run
5 is executed on Task 2
6 is executed on Task 2
7 is executed on Task 2
8 is executed on Task 2
9 is executed on Task 2
2 is executed on Task 2
3 is executed on Task 2
4 is executed on Task 2
1 is executed on Task 1

Output on second run
1 is executed on Task 2
2 is executed on Task 2
5 is executed on Task 1
6 is executed on Task 1
3 is executed on Task 2
7 is executed on Task 1
4 is executed on Task 2
9 is executed on Task 2
8 is executed on Task 1

No matter how many times you execute the code above, the number of concurrently running tasks will never go above 2.
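
If you would rather verify this cap than read it off the Task IDs, the following is a minimal sketch that counts how many loop bodies are running at the same moment. The class name ConcurrencyProbe, the method MeasurePeak and the 20 ms sleep are my own illustrative choices, not part of the TPL.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrencyProbe
{
    // Hypothetical helper: runs a capped Parallel.For and returns the highest
    // number of loop bodies observed executing at the same moment.
    public static int MeasurePeak(int maxDegree, int iterations)
    {
        object gate = new object();
        int running = 0; // loop bodies currently executing
        int peak = 0;    // highest value of 'running' seen so far

        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.For(0, iterations, options, i =>
        {
            lock (gate)
            {
                running++;
                if (running > peak) peak = running;
            }

            Thread.Sleep(20); // simulated work, so that iterations overlap

            lock (gate)
            {
                running--;
            }
        });

        return peak;
    }

    public static void Main()
    {
        Console.WriteLine("Peak concurrency: " + MeasurePeak(2, 20));
    }
}
```

With a cap of 2, the reported peak should never exceed 2, although it can be lower if the scheduler happens to run the iterations one after another.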

The same concept can also be applied to a Parallel.ForEach loop.
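
As a rough sketch of what that might look like, Parallel.ForEach accepts the same ParallelOptions object as its second argument. The class ForEachDemo, the method ProcessAll and the sample file names are hypothetical and only there for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachDemo
{
    // Hypothetical helper: visits every item with at most 'maxDegree'
    // concurrent operations and returns how many items were processed.
    public static int ProcessAll(IEnumerable<string> items, int maxDegree)
    {
        int processed = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.ForEach(items, options, item =>
        {
            Console.WriteLine("{0} is executed on Task {1}", item, Task.CurrentId);
            Interlocked.Increment(ref processed); // thread-safe counter update
        });

        return processed;
    }

    public static void Main()
    {
        // Sample data, assumed for this example only.
        var files = new List<string> { "a.txt", "b.txt", "c.txt", "d.txt" };
        Console.WriteLine("Processed: " + ProcessAll(files, 2));
    }
}
```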

When does concurrency need to be controlled?
You do not always need to tweak this setting. It varies from scenario to scenario, so please use it with caution. The most common scenarios for using this setting are:
  1. When a huge number of automatically created threads may lead to deadlock or livelock.
  2. When you want your loop or algorithm to use only a limited number of cores. This is usually required when more than one time-consuming algorithm needs to run simultaneously.
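
For the second scenario, one possible approach is to derive the limit from the number of logical cores instead of hard-coding it. The sketch below assumes you want to leave half of the cores to other work; the class CoreBudget, the method DegreeFor and the 0.5 fraction are illustrative assumptions. Keep in mind that MaxDegreeOfParallelism limits how many operations run at once; it does not pin the loop to particular cores.

```csharp
using System;
using System.Threading.Tasks;

public static class CoreBudget
{
    // Hypothetical helper: returns the given fraction of the logical cores,
    // never below 1, because MaxDegreeOfParallelism rejects a value of 0.
    public static int DegreeFor(int logicalCores, double fraction)
    {
        return Math.Max(1, (int)(logicalCores * fraction));
    }

    public static void Main()
    {
        var options = new ParallelOptions
        {
            // Environment.ProcessorCount reports the number of logical processors.
            MaxDegreeOfParallelism = DegreeFor(Environment.ProcessorCount, 0.5)
        };

        Parallel.For(1, 10, options, i =>
        {
            Console.WriteLine("{0} is executed on Task {1}", i, Task.CurrentId);
        });
    }
}
```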
Happy learning!!!
Hope you enjoyed concurrency.
