Thursday 31 December 2020

Two-Sum: C#, Javascript, Python, Java

There are quite a few things that you can learn by solving one computer programming problem in various computer programming languages. So I decided to try out solving Two-Sum problem in 4 languages - C#, JavaScript, Python, and Java.

C# 

public int[] TwoSum(int[] nums, int target) {

        Dictionary<int, int> dict = new Dictionary<int, int>();

        for(int i=0; i < nums.Length; i++){

            int searchTerm = target-nums[i];

            if(dict.ContainsKey(searchTerm)){

                return new int[2] { dict[searchTerm], i};

            }else{

                dict.Add(nums[i], i);

            }

        }       

        return null;

    }

JavaScript

var twoSum = function(nums, target) {

    const map = {};

    for (let i = 0; i < nums.length; i++){

        let diff = target - nums[i];

        if(map[diff] !== undefined){

            return [map[diff], i];

        }else{

            map[nums[i]] = i;

        }

    }

    return null;

};

Python

def twoSum(self, nums: List[int], target: int) -> List[int]:

        map = {}

        for i, num in enumerate(nums) :

            diff = target - num

            if(map.get(diff) != None) :

                return [map[diff], i];

            else :

                map[num] = i;

        return None

Java

public int[] twoSum(int[] nums, int target) {

        Map<Integer, Integer> map = new HashMap<>();

        for (int i = 0; i < nums.length; i++) {

            int complement = target - nums[i];

            if (map.containsKey(complement)) {

                return new int[] { map.get(complement), i };

            }

            map.put(nums[i], i);

        }

        return null;

    }

Observations:

1. Code written in Python is probably the most succinct and readable one than the code written in other languages.

2. Coming from a C# background, it was refreshing to note that primary constructs like dictionary, loop, if-else is pretty much same across other programming languages with a difference in syntax. I did have to struggle to find equivalent of Dictionary in Java though. 

3. Python and JavaScript are similar in tone. Java and C# are similar in tone. There is a lot less ceremony in Python and JavaScript than what is required in Java and C#.

4. When I ran these solutions in LeetCode, the CPU time and memory consumption for each of the languages was a learning experience.


Memory footprint : Python's memory consumption is < 50% of what is typically consumed by Java, C# and JavaScript (perhaps running on Node.JS?). 

Runtime: Somehow Java turned out to be the fastest, I am not sure what compile time/run time optimization would cause this (something for me to learn more about). C# was the slowest by an order of magnitude, perhaps due to the extra time required to load assembly, JIT the code etc. Python was much faster than my expectations :).

All of the above observations can also be result of a transient issue on LeetCode servers. So, don't go by the performance numbers right away - better to run these on local machine and make up your mind.

Saturday 12 December 2020

GitOps!!

Operations in any industry is a tough challenge. It is very different from creating something new. For example - creating a medicine for a disease requires a different process than the process required to industrialize production, distribution and sales processes and ensuring that we never run out of its supply.

Software is also similar. Process for writing code for building functionality is very different from the process required to package the artefacts, distributing it and operationalizing it in production. There are so many methods and tools available to handle these differences.

In the below video Kelsey Hightower talks about GitOps which focuses on providing a process that can simplify the process of managing production. The idea of segregating the production environment config from the actual code artefact (i.e. container image) is an interesting one. One of its core benefits is that it separates environment management and updates from code development. So developers can keep working much faster and we can have processes in place to have non-gated deployments for test environments and gated deployments (e.g. Test environments, UAT, Production rings etc.).

Truly speaking, if you are able to find the right definition of business functionalities and technical functionalities and be able to package them correctly, things can become simpler to upgrade/rollback/canary-test. 

Sunday 6 December 2020

[Recursion]: Finding maximum depth of a binary tree

Recursion in programming is quite common and yet quite tricky to get right. The only you can get it right is by writing code that uses recursion a few times and understand the thinking pattern required to identify smaller version of the same problem.

A typical example of recursion is to find depth of a binary tree.

Say, your tree class looks like following.

public class Node {

    public Node Left { get; set;}

    public Node Right {get; set;}

    public int Value { get; set;}

}

if your tree looks like following, then you can find its maximum depth by just looking at it.

                                                 8

                                          12         15

                                       6    null  null  9

                                                             30

Now, imagine that this tree is much larger. Inspection of the tree is not going to help much. Instead, it might be better to figure out the maximum depth at any given level by 1) looking at depth of its child nodes 2) finding maximum of the depths calculated for its children 3) adding 1 to  it.

So a recursive function that calculates the maximum depth will look like below.

public int FindMaximumDepth(Node node){

    if(node == null){

         return 0;

    }

    var left = FindMaximumDepth(node.Left);

    var right = FindMaximumDepth(node.Right);

    return 1 + Math.Max(left, right);

}

As you can see the above code is quite concise as it uses recursion by finding a smaller version of the same problem by identifying the recursive pattern to the problem. It is quite helpful to have practiced the identification of recursive pattern.

Saturday 14 November 2020

C# Record Type

There are few new features in C# 9.0 that provide more clarity in terms of how to express code that is better aligned with your thoughts.

An example of such feature is "record". Instead of declaring a type as class, you can declare it as a record and that becomes immutable - perfect for message passing. It is much closer to what you would call a "Value Object". Better yet, there is an in-built implementation of ToString() function and value comparison is automatically taken care of - you don't need to write a comparer yourself.

What goes on in the backend - well - record gets compiled as a class which implements IEquatable<T> and provides override implementation of multiple methods.

Consider this example.

If you look at the compile code, it looks like below.



Reference video: 

Sunday 11 October 2020

Event-based vs Message-based integrations

In software design and implementations, two terms "Event" and "Message" are often used interchangeably - most probably because they appear so similar. Throw in "Stream" in the mix and it get murkier. Yet, for purists, there is a big difference.

Let us focus on "Event" and "Message" for now.

Event : Lightweight, used to indicate an "event" only. E.g. an order has been placed. In simplified terms we can say that an event informs about occurrence of something. An event is pretty much discreet. This means that if we have 10 orders placed, we will receive 10 events. You can use your own custom schema to define events or use a standard e.g. CloudEvent. Events can be transmitted over a variety of channels e.g. HTTP, Streaming, Messaging etc. Given that events are lightweight and do not carry verbose information, they are mostly used to "notify" interested systems (pub-sub is one way of doing it, pulling event on a subscription is another way of doing it, etc.).

Message : Verbose, used to provide full detail of what needs to happen. E.g. Order with its line items so that a downstream fulfillment system can use the message to proceed with it. Messages generally follow custom schema as they pretty much "domain" messages and need to fit your own business needs. In specific cases where you need to integrate with a third party system, you might publish your own message schema so that they can understand the message. A good messaging system will also allow you to deduplicate them, group them under a session, perhaps commit a batch of message as part of single transaction, read/write messages in batches. In fact, a conventional message schema allows you to add multiple messages as single message provided it fits the maximum size allowed per message. 

Now that you understand the purpose of each, you can pick the associated technologies such that its fits your business needs.

It is not too difficult to mix and match the two and achieve business goals. Feel free to try it. 

Thursday 17 September 2020

struct and interface : The confusion

The fact that, in C#, a value type like struct can implement an interface can be perplexing. It can lead to few misunderstanding unless you stick to one basic fact that struct is a value type. 

Let us create an interface for example.

public interface ITemp

{

        void M();

}

As it is allowed, we can have a struct that implements this interface.

public struct Temp2 : ITemp

{

        private int val;

        public void M()

        {

            val++;

            Console.WriteLine(val);

        }

}

So far, so good.

What happens if i pass a structure to a method that accepts parameter of type ITemp? Magic of boxing/unboxing takes place that ensures that struct is being passed as a reference. If you have a method that takes parameter of type struct Temp2, the value will be passed as value.

private static void Run(Temp2 temp)

{

            temp.M();

}

private static void Run2(ITemp temp)

{

            temp.M();

}

static void Main(string[] args)

{

var tempStruct = new Temp2();

tempStruct.M(); 

Run(tempStruct);

tempStruct.M();

Run2(tempStruct);

}

As you would have guessed, here is the output.

1

2

2

3

Hopefully, this clarifies the behavior of struct types that implement an interface.

Saturday 12 September 2020

The cost of function call stacks

 So, this one is pretty much a known thing but is generally overlooked. There is a cost associated with each function call stack and as an application developer, we should try to find ways to avoid unnecessary calls.

Let us take a look at an example. I have written 2 versions of a Fibonacci series function in C#.

static void Main(string[] args)
        {
            int number = 42;
            Program p = new Program();
            Stopwatch sw = new Stopwatch();
            sw.Start();
            Console.WriteLine(p.Fibonacci(number));
            sw.Stop();
            Console.WriteLine($"Time taken in ms : {sw.ElapsedMilliseconds}");
           sw.Restart();
            Console.WriteLine(p.FibonacciV2(number));
            sw.Stop();
            Console.WriteLine($"Time taken in ms : {sw.ElapsedMilliseconds}");
        }

        public int Fibonacci(int n)
        {
            if(n == 0 || n == 1){
                return n;
            }

            return Fibonacci(n-2) + Fibonacci(n-1);
        }

        public int FibonacciV2(int n)
        {
            if(n == 0 || n == 1){
                return n;
            }
            
            int result = 0;
            Dictionary<intinttemp = new Dictionary<intint>();
            temp.Add(00);
            temp.Add(11);
            for(int i = 2i <= ni++){
                result = temp[i-2] + temp[i-1];
                temp[i] = result;
            }

            return result;
        }

I compiled this code in release mode and ran 4 times to check the output for comparison purpose.

267914296

Time taken in ms : 1556

267914296

Time taken in ms : 0

267914296

Time taken in ms : 1541

267914296

Time taken in ms : 0

267914296

Time taken in ms : 1609

267914296

Time taken in ms : 0

267914296

Time taken in ms : 1656

267914296

Time taken in ms : 1

What you notice is order of magnitudes faster if you calculate Fibonacci number in an iterative way instead of a recursive way. It also shows the cost of allocation of a function stack. 

If you run the same program to calculate Fibonacci number for a smaller value like 20, the difference is so small that it wouldn't matter. Here is the output.

6765

Time taken in ms : 1

6765

Time taken in ms : 0

6765

Time taken in ms : 1

6765

Time taken in ms : 0

6765

Time taken in ms : 0

6765

Time taken in ms : 0

6765

Time taken in ms : 1

6765

Time taken in ms : 0

This output highlights that increase in total time is not linearly related to depth of recursion.

So where does this fit in the world of object oriented programming where you are asked to create modular single responsibility functions. There is a definite value of modular structure as keeps the Software extensible, robust, and testable. 

As long as your application is not using n layers of function calls to perform trivial tasks, you will be good as the cost is not as high. There are runtime level capabilities in each platform (Java, .NET etc.) where if you chose the right options, nothing bad will happen. 

A typical example in .NET is user input validation which is a frequent and (mostly) trivial operation. You can utilize model binding, custom object validators, or model state to keep validation at the entry point instead of hiding it in a utility class which is 5 layers down.

Friday 26 June 2020

Serverless and containers

Many a times, we hear people use two popular industry terms "containers" and "serverless" interchangeably. This can cause a bit of confusion, especially in terms of categorization of the problem they solve.

To start with, they solve 2 distinct problems.

Containers

Containers were designed to solve the problem of "missing" dependencies or prerequisites when you moved your application binaries from one environment to another environment. For instance, if you had a .NET Framework application, you will need to ensure that the machines (or VMs) in the new environment have the right .NET framework version, right tools, right environment variables, right folder structure, etc. available before you start utilizing it. Containers let you package all such dependencies in the form of an "image" which can be maintained in a secure "container registry". You can use any of the popular container "runtime" software e.g. Docker etc. to spin up instances of the "container image" and make your application available.

In order to scale this set up, you require an "orchestrator" that let you spin up multiple instances of multiple container images and ensure that communication is appropriately set up for interaction between services contained in those images. 

You can run it using your own infra set up in a cloud or an on-prem environment if you have the skills. Or utilize the "managed" offering by your cloud providers and makes it "serverless" for you :).

Serverless

Serverless was born out of the need of customers to pay for what they use in a Cloud infrastructure. In the end, Cloud is just someone else's computer. Customers don't use cloud resources all the time, and therefore they should not have to pay for unused minutes. Moreover, customers require rapid scale up/scale down capabilities from cloud platforms to make their business run efficiently. Sure, there are cloud services/offerings that you can manage scale out policies on your own with some help from the cloud platform, but it adds an overhead to the cloud consumers. 

Serverless offerings from various cloud providers solve this issue and let customers worry about their application code only and be able to run the application in a cost effective manner in cloud.

Point to note - Almost all of cloud service providers let you package your applications in a container and run it in a serverless environment. There are non-container based versions of serverless solutions e.g. Functions as well. In fact, there are even higher level "serverless" solutions like Azure LogicApps, Azure Flow as well which let you focus on your business "code" and let infra be managed by your cloud provider so that you pay based on "execution count".

In summary, you can run containers with/without serverless infrastructure and serverless solutions can let you run containerized and non-containerized applications. They are definitely not same and are complementary to each other.

Now, you may ask where do microservices fit into this :)? Well, we can take up this topic in another post.

Read more about this topic: Link 1 Link2

Tuesday 2 June 2020

Request Collapsing is not same as Request Caching

Though "Request Collapsing" and "Request Caching" sound like close cousins, they are not in same family of functionalities. Both are applicable at transport/server level and do not generally require application level code.

Request Collapsing:
It is useful for cases when you want to collapse (or coalesce) parallel requests, which are same e.g. requesting same resource from the backend, into a single request to the backend. This reduces load on the backend server as number of requests can be reduced significantly. Example of tools that let you do that at http server/entry points are hystrix, varnish etc. You can find more details about request collapsing here.


Request Caching:
It is good for cases when you expect to serve specific type of content frequently and the content isn't expected to change for some period of time. It makes sense to store the value to be returned in a temporary (or persistent) cache store e.g. using a CDN like akamai or letting the browser know that it is ok to cache the value whenever requested, thus reducing load on the backend service. These can be generally governed via HTTP Headers. More details can be found here.

Each use case is different and serves a specific requirements. So it will be possible to mix and match the two when designing your infrastructure.

Tuesday 14 April 2020

Race condition in ARM template

I was recently working on deployment of a web application and created an ARM template for Azure deployment and ran into an interesting issue.

AppGallery Deploy Failed: 'System.Threading.ThreadAbortException: Thread was being aborted.

This was a bit surprising. Why would ARM deployment run into ThreadAbort kind of issues.

After a bit of digging i noted this github thead

I ran into exact issue. Apparently the Azure deployment process ran the steps related to appsettings and MSDeploy in parallel and this caused a race condition resulting in this strange message.

Fix was quite simple - mention appsettings dependent on MSDeploy so that Azure deployment process runs them in sequence, thus avoiding race condition. You can refer the fix mentioned in the same github thread.

If you run into this problem, this can come handy.

Monday 30 March 2020

Redis DataStructure: SortedSet

So let's say that you want to maintain a list of unique items, with items sorted based on the last access time. Few examples of such use case - List of recent actions to be shown against a user profile, List of items in a purchase cart, List of my favorite dishes in a catalog etc.

While there are multiple technology options that can help in this scenario, i wanted to highlight the usage of Redis store for such cases. Redis supports multiple data structures, each serving a set of unique scenario. SortedSet is one such data structure that helps in this particular scenario.

From Redis.IO

Sorted sets are a data type which is similar to a mix between a Set and a Hash. Like sets, sorted sets are composed of unique, non-repeating string elements, so in some sense a sorted set is a set as well.
However while elements inside sets are not ordered, every element in a sorted set is associated with a floating point value, called the score (this is why the type is also similar to a hash, since every element is mapped to a value).
Moreover, elements in a sorted sets are taken in order (so they are not ordered on request, order is a peculiarity of the data structure used to represent sorted sets). They are ordered according to the following rule:
  • If A and B are two elements with a different score, then A > B if A.score is > B.score.
  • If A and B have exactly the same score, then A > B if the A string is lexicographically greater than the B string. A and B strings can't be equal since sorted sets only have unique elements.

I hope things are clear now. Let us get on with a test application code in C# and .NET core.

Let us make an Item class.

public class Item
{
public string Id { get; set; }

public string Name { get; set; }

public string Format()
{
return $"{Id}:{Name}";
}
}

Add a reference to StackExchange.Redis nuget package.

var connection = await ConnectionMultiplexer.ConnectAsync("");
var database = connection.GetDatabase();
string setName = "";

Stopwatch sw = new Stopwatch();

// First insert 20 unique items
sw.Restart();
for (int i = 0; i < 20; i++)
{
Item item = new Item()
{
Id = "90ddc267-e96c-4d48-b05e-43c3cf1e5de1",
Name = "FixedName" + i.ToString()
};
Stopwatch sw1 = new Stopwatch();
sw1.Start();
await database.SortedSetAddAsync(setName, item.Format(), DateTime.UtcNow.Ticks);
sw1.Stop();
Console.WriteLine($"Time taken for 1 SortedSetAddAsync {sw1.ElapsedMilliseconds}");
}

sw.Stop();
Console.WriteLine($"Time taken for 20 SortedSetAddAsync {sw.ElapsedMilliseconds}");

// Try getting top 10 items with their score i.e. access time
var items = await database.SortedSetRangeByRankWithScoresAsync(setName, order: Order.Descending, stop: 9);
Console.WriteLine(items);

// Try overriding items in reverse order
sw.Restart();
for (int i = 19; i >= 0; i--)
{
Item item = new Item()
{
Id = "90ddc267-e96c-4d48-b05e-43c3cf1e5de1",
Name = "FixedName" + i.ToString()
};

Stopwatch sw1 = new Stopwatch();
sw1.Start();
await database.SortedSetAddAsync(setName, item.Format(), DateTime.UtcNow.Ticks);
sw1.Stop();
Console.WriteLine($"Time taken for 1 SortedSetAddAsync {sw1.ElapsedMilliseconds}");
}

sw.Stop();
Console.WriteLine($"Time taken for 20 SortedSetAddAsync {sw.ElapsedMilliseconds}");

// Try getting top 10 items with their score i.e. access time
items = await database.SortedSetRangeByRankWithScoresAsync(setName, order: Order.Descending, stop: 9);
sw.Stop();
Console.WriteLine($"Time taken for top 10 SortedSetRangeByRankWithScoresAsync {sw.ElapsedMilliseconds}");
Console.WriteLine(items);

// Remove items that are not part of top 10 items list
sw.Restart();
await database.SortedSetRemoveRangeByRankAsync(setName, 10, -1);
sw.Stop();
Console.WriteLine($"Time taken for SortedSetRemoveRangeByRankAsync {sw.ElapsedMilliseconds}");

// verify that indeed items have been removed
sw.Restart();
items = await database.SortedSetRangeByRankWithScoresAsync(setName, order: Order.Descending, stop: 9);
sw.Stop();
Console.WriteLine($"Time taken for top 10 SortedSetRangeByRankWithScoresAsync {sw.ElapsedMilliseconds}");
Console.WriteLine(items);


Once you run it, you will realize 2 things.
1. This data structure is quite powerful and handles a lot of concurrency related scenarios out of the box. Less work for you.
2. The operations are fast!! (provided you do not have any major network lag.)

Sunday 29 March 2020

Pro Tip: Killing a windows process

Imagine you are using your beloved windows and working on something important. And suddenly you notice that the "Explorer.exe" is stopped responding - in other words, you are not able to browse folders, programs that open file dialog are hanging now, and you are not able to even open the "Task Manager" from task bar. 

Well, in such cases, one of the options is to "restart" windows :)

But let's say, you want to avoid it because you are not sure if your precious work is saved.

One option you can try is:

1. Open command prompt in admin mode.
2. (optional) Run "tasklist" command to see running processes.
3. Since we know the name of the process that needs to be restarted, we can run taskkill /IM "explorer.exe" /F
4. Now that the previous explorer process is dead, we need to launch a new one. That is simple - run "explorer"


Hope that helps!!

Sunday 5 January 2020

Why is Brotli compression not working in ASP.NET Core?

Now that Brotli has become a widely supported encoding format, I thought of trying it out myself so that I can experience the gains on network bandwidth usage.

However, I faced an interesting challenge. I guess anyone, who tries to use Brotli compression on an ASP.NET core website, might face it and this challenge is primarily related to the way ASP.NET core lines up the modules in its pipeline.

I created an empty ASP.NET Core ReactJS website (I called it "BrotliExperiment") and added following code in the ConfigureServices method of Startup.cs.

             services.Configure(options =>
            {
                options.Level = CompressionLevel.Optimal;
            });

            services.AddResponseCompression(options =>
            {
                options.EnableForHttps = true;
                options.Providers.Add();
            });

I also enabled response compression in the Configure method of Startup.cs using following code.

           app.UseResponseCompression();

However, when i tried to check the response headers, it was not present :(.


So, i figured out from the internet that the issue is that UseResponseCompression method should called before UseStaticFiles method. So I moved it above the UseStaticFiles method.

And it works :)