Tuesday, July 30, 2019

Thread, code and data - Story of a Multithreading Program in Java

There are certain things you don't learn in academics or training classes. You develop that understanding after a few years of work experience, and then you realize it was very basic and wonder how you missed it all those years. Understanding how a multi-threaded Java program executes is one of those things. You have definitely heard about threads, how to start a thread, how to stop a thread, definitions like "an independent path of execution", and all the funky libraries for inter-thread communication, yet when it comes to debugging a multithreaded Java program, you struggle.

At least I can say this from my personal experience. Debugging is, in my opinion, the real trainer: it is only through debugging that you learn subtle concepts and develop an understanding that lasts.

In this article, I am going to talk about three important things in any program's execution, not just in Java: thread, code, and data.

Once you have a good understanding of how these three work together, it becomes much easier to understand how a program is executing, why a certain bug appears only sometimes, why a particular bug appears every time, and why a particular bug is truly random.

How Thread, Code, and Data work together

What is a program? In short, it's a piece of code that is translated into binary instructions for the CPU. The CPU is the one that executes those instructions, e.g. fetch data from memory, add data, subtract data, etc. In short, what you write is your program, the code.

What varies between different executions of the same program is data. An execution here doesn't just mean restarting the program; it means one cycle of processing. For example, in an electronic trading application, processing one order is one execution. You can process thousands of orders in one minute, and with each iteration the data varies.

One more thing to note is that you can create threads in code, which then run in parallel and execute the code written inside their run() method. The key thing to remember is that threads can run in parallel.

When a Java program starts, one thread, known as the main thread, is created, and it executes the code written inside the main method. If you create more threads from there, they are created and started by the main thread; once started, they begin executing the code written in their run() method. See Multithreading and Parallel Computing in Java if you are not familiar with how to create another thread in Java.
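To make this concrete, here is a minimal sketch of the main thread starting a few workers. The class and method names (OrderThreadsDemo, OrderWorker, processOrders) are made up for illustration and are not from any particular library; the point is only that every worker runs the same run() code with a different order id as its data.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; all names here are illustrative assumptions.
class OrderThreadsDemo {

    // The "code": every worker thread executes this same run() method.
    static class OrderWorker implements Runnable {
        private final int orderId;        // the "data": differs per thread
        private final List<String> log;   // shared, thread-safe list for results

        OrderWorker(int orderId, List<String> log) {
            this.orderId = orderId;
            this.log = log;
        }

        @Override
        public void run() {
            log.add(Thread.currentThread().getName() + " processed order " + orderId);
        }
    }

    // Runs on the calling (main) thread: creates, starts, and waits for workers.
    static List<String> processOrders(int count) {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            Thread t = new Thread(new OrderWorker(i, log), "worker-" + i);
            threads.add(t);
            t.start();   // run() now executes on the new thread, not on main
        }
        for (Thread t : threads) {
            try {
                t.join();   // main waits until this worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println("Started by: " + Thread.currentThread().getName());
        // The order of the lines below may differ from run to run.
        processOrders(3).forEach(System.out::println);
    }
}
```

If you run it a few times, the worker lines may come out in a different order each time, which is a first taste of why thread scheduling matters when debugging.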

Thread, Code and Data - How a Multithreaded Java Program Runs

So if you have 10 threads for processing orders, they will run in parallel. In short, a thread executes code, with data coming in. Now we will look at the three kinds of issues we talked about:

1) Issues which always occur

2) Issues which occur only sometimes, but consistently with the same input

3) Issues which are truly random

Issue one is most likely due to faulty code, also known as a programming error, e.g. accessing an invalid index of an array, or calling a method on an object after setting it to null or before initializing it. These are easy to fix because you know where they occur.

You just need knowledge of the programming language and the API to fix this kind of error.

The second issue is more likely to do with data than code. An issue that appears only sometimes, but always with the same input, could be caused by incorrect boundary handling or malformed data, such as an order missing certain fields like price or quantity.

Your program should always be written robustly so that it won't crash when it is given incorrect data as input. The impact should be limited to that one order; the rest of the orders must execute properly.
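Here is one way this might look in code. It is only a sketch: the Order class, its fields, and the notional() and processAll() helpers are hypothetical and invented for this example. A malformed order fails the same way every time it is seen, but the failure is contained to that order and the rest of the batch still goes through.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; all names here are illustrative assumptions.
class RobustOrderProcessing {

    // Assumed order shape: price and quantity may be missing (null) in bad input.
    static class Order {
        final String id;
        final Double price;
        final Integer quantity;

        Order(String id, Double price, Integer quantity) {
            this.id = id;
            this.price = price;
            this.quantity = quantity;
        }
    }

    // Validates the data before using it; same bad input, same failure.
    static double notional(Order order) {
        if (order.price == null || order.quantity == null) {
            throw new IllegalArgumentException(
                    "Order " + order.id + " is missing price or quantity");
        }
        if (order.price <= 0 || order.quantity <= 0) {
            throw new IllegalArgumentException(
                    "Order " + order.id + " has a non-positive price or quantity");
        }
        return order.price * order.quantity;
    }

    // Processes every order; returns the ids of the orders that were rejected.
    static List<String> processAll(List<Order> orders) {
        List<String> rejected = new ArrayList<>();
        for (Order order : orders) {
            try {
                System.out.println(order.id + " -> " + notional(order));
            } catch (IllegalArgumentException e) {
                // Only this order is affected; the loop carries on.
                rejected.add(order.id);
                System.out.println("Rejected: " + e.getMessage());
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
                new Order("A1", 10.0, 5),
                new Order("A2", null, 5),    // malformed: no price
                new Order("A3", 2.5, 4));
        System.out.println("Rejected ids: " + processAll(orders));
    }
}
```

Catching only IllegalArgumentException here is a deliberate choice: it handles the data problems the code knows about, while a genuine programming error would still surface loudly instead of being swallowed.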

The third issue most likely comes from multithreading, where the order and interleaving of multiple threads' execution cause race conditions or deadlocks. These issues are random because they appear only when certain random things happen, e.g. thread 2 getting the CPU before thread 1, or locks being acquired in the wrong order.

Remember, the thread scheduler and the operating system are responsible for allocating CPU to threads. They can pause threads and take the CPU away from them at any time, and all of this can create a unique scenario that exposes a multithreading or synchronization issue.

Your code should never depend on the order in which threads run; it must be robust enough to run correctly under all conditions.
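A classic way to see such a random issue is a shared counter. The sketch below uses made-up names (RaceConditionDemo, countUnsafely, countSafely) and compares a plain int, where counter++ is a read-modify-write that two threads can interleave, with an AtomicInteger from java.util.concurrent.atomic, which performs the increment atomically.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; all names here are illustrative assumptions.
class RaceConditionDemo {

    // Plain shared field: unsafeCounter++ is read, add, write, and is not atomic.
    static int unsafeCounter;

    // May lose updates when two threads read the same old value.
    static int countUnsafely(int threads, int incrementsPerThread) {
        unsafeCounter = 0;
        runInParallel(threads, () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                unsafeCounter++;
            }
        });
        return unsafeCounter;
    }

    // Same work, but each increment is a single atomic operation.
    static int countSafely(int threads, int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger();
        runInParallel(threads, () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                counter.incrementAndGet();
            }
        });
        return counter.get();
    }

    // Starts the given number of threads on the same task and waits for all.
    static void runInParallel(int threads, Runnable task) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(task);
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        int expected = 4 * 100_000;
        System.out.println("Expected: " + expected);
        // The unsafe result can vary between runs and is often below expected.
        System.out.println("Unsafe:   " + countUnsafely(4, 100_000));
        System.out.println("Safe:     " + countSafely(4, 100_000));
    }
}
```

The unsafe version may even print the right number on some runs, and that is exactly what makes this kind of bug feel random: the code and the data are the same, only the thread interleaving changed.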

In short, remember that a thread executes code with data given as input. Each thread works with the same code but different data. While debugging an issue, pay attention to all three: thread, code, and data.

Further Learning
The Complete Java Masterclass
Multithreading and Parallel Computing in Java
Java Concurrency in Practice - The Book
Applying Concurrency and Multi-threading to Common Java Patterns
Java Concurrency in Practice Bundle by Heinz Kabutz
10 Java Multithreading and Concurrency Best Practices
Top 50 Multithreading and Concurrency Questions in Java

Thanks for reading this article so far. If you like this article, then please share it with your friends and colleagues. If you have any questions or feedback, then please drop a note.
