Friday, 13 December 2019

So when does a Task start?

This question can trip you up if you are not paying attention. So, I wrote a small piece of code to make it clear to the uninitiated.

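using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    // Returns Task<int> because the method returns the managed thread id.
    private static async Task<int> WriteWithDelay()
    {
        // Runs synchronously on the caller's thread, up to the first await.
        Console.WriteLine($"Write before delay in thread id : {Thread.CurrentThread.ManagedThreadId} at {DateTime.UtcNow.Ticks}");
        await Task.Delay(10000).ConfigureAwait(false);
        // Runs as a continuation after the delay, usually on a thread-pool thread.
        Console.WriteLine($"Write after delay in thread id : {Thread.CurrentThread.ManagedThreadId} at {DateTime.UtcNow.Ticks}");
        return Thread.CurrentThread.ManagedThreadId;
    }

    static async Task Main(string[] args)
    {
        Task[] tasks = new Task[10];
        for (int i = 0; i < tasks.Length; i++)
        {
            // Calling the method starts it; the task is already running when it is stored.
            tasks[i] = WriteWithDelay();
        }

        Console.WriteLine($"Before when all at {DateTime.UtcNow.Ticks}");
        await Task.WhenAll(tasks);

        Console.WriteLine($"After when all at {DateTime.UtcNow.Ticks}");
    }
}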

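Well, you can check the output and be clear about the order in which things happen. Each call to WriteWithDelay runs synchronously on the calling thread (thread id 1) up to its first await, so all ten "before delay" lines are printed before "Before when all". The task is already running by the time it is returned; Task.WhenAll does not start anything, it only waits for completion. After the delay, the continuations resume on thread-pool threads, which is why the "after delay" lines show different thread ids.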

Write before delay in thread id : 1 at 637118566941927002
Write before delay in thread id : 1 at 637118566942107032
Write before delay in thread id : 1 at 637118566942114110
Write before delay in thread id : 1 at 637118566942117029
Write before delay in thread id : 1 at 637118566942117029
Write before delay in thread id : 1 at 637118566942117029
Write before delay in thread id : 1 at 637118566942117029
Write before delay in thread id : 1 at 637118566942117029
Write before delay in thread id : 1 at 637118566942127007
Write before delay in thread id : 1 at 637118566942127007
Before when all at 637118566942127007
Write after delay in thread id : 4 at 637118567043690656
Write after delay in thread id : 11 at 637118567043690656
Write after delay in thread id : 5 at 637118567043690656
Write after delay in thread id : 7 at 637118567043690656
Write after delay in thread id : 13 at 637118567043690656
Write after delay in thread id : 9 at 637118567043690656
Write after delay in thread id : 12 at 637118567043690656
Write after delay in thread id : 8 at 637118567043690656
Write after delay in thread id : 6 at 637118567043690656
Write after delay in thread id : 10 at 637118567043700652
After when all at 637118567043871008
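
If you want to see the same ordering without reading timestamps, here is a minimal sketch that records events in a list instead of printing ticks. The names HotTaskDemo, WorkAsync and RunAsync are made up for this illustration and are not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class HotTaskDemo
{
    // Events are appended in the order they happen so the sequence can be inspected.
    private static readonly List<string> Events = new List<string>();

    private static void Record(string message)
    {
        // The callee's continuation may run on another thread, so guard the list.
        lock (Events) Events.Add(message);
    }

    private static async Task<int> WorkAsync()
    {
        // Runs synchronously on the caller's thread, before the Task is returned.
        Record("callee: before await");
        await Task.Delay(100).ConfigureAwait(false);
        // Runs later as a continuation, typically on a thread-pool thread.
        Record("callee: after await");
        return Thread.CurrentThread.ManagedThreadId;
    }

    public static async Task<IReadOnlyList<string>> RunAsync()
    {
        lock (Events) Events.Clear();
        Task<int> task = WorkAsync();   // the method body has already started here
        Record("caller: task returned");
        await task;                     // awaiting only observes completion
        Record("caller: after await");
        lock (Events) return Events.ToArray();
    }

    public static async Task Main()
    {
        foreach (string e in await RunAsync())
            Console.WriteLine(e);
    }
}
```

The recorded order is "callee: before await", "caller: task returned", "callee: after await", "caller: after await". The first two entries are the interesting ones: the callee has already run up to its first await before the caller even gets the Task back.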

Stateful vs Stateless

So, this is a question that I have faced multiple times over the years. Should you write services that are stateful, or should you go for a design that is stateless?

Disclaimer: All opinions expressed below are my own and reflect a point in time. Software evolves very fast and new patterns emerge every year. So do not quote me :).

By stateful service, I mean a service that keeps data as part of the service itself. There are multiple options that allow us to implement stateful services, e.g. Service Fabric Stateful Service.

Let us dive into the pros and cons.


Stateful Services

Pros:

1. Keeps data close to where it is needed the most and minimizes network latency. (*There will still be limits on the size and type of data that you can keep. You will eventually need a persistent backing store anyway.)
2. Can potentially improve the performance of the service.
3. The design almost invariably leads to partitioning the data as well, which can increase the options for scaling.
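
To make the first point a bit more concrete, here is a rough sketch of a service that keeps a hot copy of its data in process and writes through to a backing store. Everything here (IBackingStore, InMemoryBackingStore, StatefulCounterService) is hypothetical and only for illustration; it is not the Service Fabric API, and it is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over a durable store (a database, blob storage, ...).
public interface IBackingStore
{
    bool TryRead(string key, out long value);
    void Write(string key, long value);
}

// Stand-in so the sketch runs; a real store would be remote and durable.
public sealed class InMemoryBackingStore : IBackingStore
{
    private readonly Dictionary<string, long> _data = new Dictionary<string, long>();

    // Counts reads so the effect of the local copy is visible.
    public int Reads { get; private set; }

    public bool TryRead(string key, out long value)
    {
        Reads++;
        return _data.TryGetValue(key, out value); // value is 0 when the key is missing
    }

    public void Write(string key, long value) => _data[key] = value;
}

public sealed class StatefulCounterService
{
    // State that lives inside the service instance.
    private readonly Dictionary<string, long> _local = new Dictionary<string, long>();
    private readonly IBackingStore _store;

    public StatefulCounterService(IBackingStore store) => _store = store;

    public long Increment(string key)
    {
        // Go to the backing store only on the first touch of a key.
        if (!_local.TryGetValue(key, out long current))
            _store.TryRead(key, out current);

        current++;
        _local[key] = current;       // hot copy, served without a network hop
        _store.Write(key, current);  // write-through so the state survives a restart
        return current;
    }
}

public static class StatefulDemo
{
    public static void Main()
    {
        var store = new InMemoryBackingStore();
        var service = new StatefulCounterService(store);
        service.Increment("orders");
        service.Increment("orders");
        // Prints "value=3, store reads=1": only the first call had to read the store.
        Console.WriteLine($"value={service.Increment("orders")}, store reads={store.Reads}");
    }
}
```

Write-through is just one choice for the sketch; batching or replicating the writes would change the durability trade-off.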

Cons:

1. Enforces data affinity, so special considerations need to be put in place if you want to go for active-active sites. The same restriction applies to scale-out scenarios (think consistent hashing).
2. Disaster recovery needs to be carefully planned, as there is a level of coupling between the compute plane and the data plane.
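
Since consistent hashing came up above, here is a minimal sketch of a hash ring with virtual nodes. ConsistentHashRing and the node names are made up for illustration; MD5 is used only as a convenient stable hash, and the lookup is a simple linear scan rather than something tuned for production.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public sealed class ConsistentHashRing
{
    // Ring position -> node name, kept sorted by position.
    private readonly SortedDictionary<uint, string> _ring = new SortedDictionary<uint, string>();
    private readonly int _virtualNodes;

    public ConsistentHashRing(IEnumerable<string> nodes, int virtualNodes = 100)
    {
        _virtualNodes = virtualNodes;
        foreach (string node in nodes) AddNode(node);
    }

    public void AddNode(string node)
    {
        // Several points per node spread its share of keys around the ring.
        for (int i = 0; i < _virtualNodes; i++)
            _ring[Hash($"{node}#{i}")] = node;
    }

    public void RemoveNode(string node)
    {
        for (int i = 0; i < _virtualNodes; i++)
            _ring.Remove(Hash($"{node}#{i}"));
    }

    public string GetNode(string key)
    {
        if (_ring.Count == 0) throw new InvalidOperationException("The ring has no nodes.");
        uint h = Hash(key);
        // The first point at or after the key's hash owns the key.
        foreach (KeyValuePair<uint, string> entry in _ring)
            if (entry.Key >= h) return entry.Value;
        return _ring.First().Value; // wrap around to the start of the ring
    }

    private static uint Hash(string s)
    {
        using (MD5 md5 = MD5.Create())
            return BitConverter.ToUInt32(md5.ComputeHash(Encoding.UTF8.GetBytes(s)), 0);
    }
}

public static class RingDemo
{
    public static void Main()
    {
        var ring = new ConsistentHashRing(new[] { "node-a", "node-b", "node-c" });
        List<string> keys = Enumerable.Range(0, 1000).Select(i => $"key-{i}").ToList();
        Dictionary<string, string> before = keys.ToDictionary(k => k, k => ring.GetNode(k));

        ring.RemoveNode("node-c");
        int moved = keys.Count(k => before[k] != ring.GetNode(k));
        Console.WriteLine($"{moved} of {keys.Count} keys moved after removing node-c");
    }
}
```

The point of the exercise: when a node leaves, only the keys that lived on that node move. Keys owned by the remaining nodes stay where they are, which is what makes scale-out with data affinity manageable.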

Stateless Services

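Pros:

1. Infrastructure (e.g. compute instances) can be scaled without much concern because there is no local data affinity.
2. Data is separated from compute, which keeps the conceptual and physical separation consistent, at least from a visualization perspective.
3. Easier to set up active-active sites as long as the backend can hold up.
4. Disaster recovery for the compute plane is independent of disaster recovery for the data plane.

Cons:

1. Every request that needs data pays a network round trip to the backend, which is exactly the latency a stateful design avoids.
2. The backend has to hold up under the combined load of all instances, so it can become the scaling bottleneck.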
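
As a rough sketch of what "no local data affinity" means in practice: in the code below every service instance is interchangeable because all the data lives in a shared store. ISharedStore, InMemorySharedStore and StatelessCounterService are hypothetical names for illustration only; a real shared store would be an external database or cache.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical abstraction over an external store shared by all instances.
public interface ISharedStore
{
    long Increment(string key);
}

// Stand-in so the sketch runs; a real store would sit behind the network.
public sealed class InMemorySharedStore : ISharedStore
{
    private readonly ConcurrentDictionary<string, long> _data = new ConcurrentDictionary<string, long>();

    public long Increment(string key) => _data.AddOrUpdate(key, 1, (_, v) => v + 1);
}

// Holds no data of its own, so any instance can serve any request.
public sealed class StatelessCounterService
{
    private readonly ISharedStore _store;

    public StatelessCounterService(ISharedStore store) => _store = store;

    public long Increment(string key) => _store.Increment(key);
}

public static class StatelessDemo
{
    public static void Main()
    {
        var store = new InMemorySharedStore();
        var instanceA = new StatelessCounterService(store);
        var instanceB = new StatelessCounterService(store);

        instanceA.Increment("orders");
        instanceB.Increment("orders");

        // A brand new instance picks up where the others left off.
        var instanceC = new StatelessCounterService(store);
        Console.WriteLine(instanceC.Increment("orders")); // prints 3
    }
}
```

Adding or dropping instances needs no data movement, but notice that every single call goes to the store.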

