Tuesday, September 12, 2023

How to design a Limit Order Book in Java? System Design Question from Investment Bank

No doubt that Data structure and algorithms are an integral part of any Programming job interview, including Java, C++, or any other programming language. In fact, Data structure and algorithms are quite a favorite one and all top-notch companies including Google, Microsoft, Amazon, and investment banks like Goldman Sachs, Citigroup, Barclays, Morgan Stanley or J.P. Morgan focus extensively on data structure and algorithms while hiring both senior and mid-level developers in Java, C++, and C# positions.

When I shared some traditional, popular, and more frequently asked questions on Data structure and Algorithms in my earlier article, I received a lot of feedback to share some practical, scenario-based questions on data structure e.g. which kind of data structure will you use in a particular scenario and why.

This sounds really interesting, because this not only tests your knowledge of existing data structure e.g. array, linked list, trees, graphs, stack, queue, ring buffer, Map but also test how well you can apply and use this data structure in a given situation, and in some cases, can you come up with a new innovative data structure.

Of course, classical questions on data structure like finding a middle element of a linked list in one pass or internal working of HashMap still matters, but these new questions will give you more fun and edge while preparing for Java, C++ or C# developer roles.




2 Software Design and  Algorithm Interview Questions in Java

Ok, without any more introduction, let's see some good, practical questions from data structure and algorithms:

1. Limit Order Book

Which data structure will you use to implement a Limit Order Book? and Why?
This question is asked mostly on investment banking Java interviews. Sometimes it's also asked as an Object-Oriented design question, which involved detailed design and coding. Let me give a brief intro to a Limit order book, especially for those programmers who are not from financial services firms or not familiar with stock trading.

In Stock trading, exchanges like NYSE (New York Stock Exchange) maintains an order book for every security or stock which is traded on their exchange e.g. GOOG, which is a symbol of Google's stock. There are mainly two kinds of orders customers can send, a buy order, and a sell order. 

When we use Limit Price, which means buy order with a limit price of $50 can be executed if it found a sell order of $50 or an order of lower price says $49, but it can not be executed with a sell order of price $51. 

Similarly, a sell order can execute for a price, which is either equal to or higher than the limit price. In general, a LIMIT order executes if it found a specified price or better price(lower in the case of a buy, and higher in the case of sell).


Orders are executed on a first come first serve basis, so exchange also maintains a time priority. Now, let's see the flow, an order comes to exchange, exchange looks order book for that symbol, if it found a match it executes order otherwise, it adds that order at the end of a price queue, which represents time priority, head of the queue represents the order with highest time priority i.e. the order which comes before all the order below it.

Now our goal is to perform the above operation as quickly as possible. If you look at it closely, it involves finding the opposite order of matching the price, which is equal to or less/greater than the specified price, removing the order from the order book if it matched or canceled, adding order into the order book at an appropriate place if not matched.



In all these operations, traversing is a key, which hints towards binary search tree data structure. If we use a Binary Tree for storing order, we can find a matching order using binary search which is of order O(log2N), not quite fast as O(1) but still a decent one.

Similarly adding and removing orders will also cost that much time, because they involve traversal. Now in order to tackle different symbols, since the order of one symbol can not match to order of another symbol, an OrderBook must be associated with a symbol. This is just a basic idea, without going into exact requirements i.e. methods required to be supported by an order book.

Btw, If you are not familiar with tree data structure and its variants e.g. binary search tree, balanced tree-like AVL and Red-Black tree, and Tries then I suggest you first read a good book on Data Structure and Algorithms like Introduction to Algorithms by Thomas H. Cormen

Scenario based data structure and algorithm question


So in short, you can use something like this:

class MathcingEngine{

   private Map books;

}

class OrderBook{
    private BinaryTree<Order> orders;

    public execute(Order ord){
        //find matching order
        //if match found then create trade
        //else add this order into tree
    }
}

I have used Queue data structure to maintain time priority for orders of the same price. Of course, this is very high level, but this does give you some idea of how to approach a problem and choose a particular data structure. Remember the choice of data structure is mainly driven by the operation performed on that e.g. We used Queue instead of Stack here, because the order which comes first, should execute first (FIFO) if price matches.

I have used a tree data structure because we need to find either a specified price or a better price, which involves search between different prices. Remember, we have not yet considered concurrency, thread-safety, and all those things which are quite important while designing structures for high volume, low latency applications, but that's a topic for a separate discussion.

Here our goal is to focus on choosing the right data structure. Also, if you think the binary tree is not the right data structure for representing a limit order book, you can definitely share your thoughts and solution with us. We will see if it solves the problem with better performance in terms of time and development effort.

On the other hand, if you are not familiar with essential data structures like the array, linked list, binary search tree, balanced trees like AVL and Red-Black Tree, and hash tables, I suggest you read a good book on data structures. For Java developers, I recommend reading "Data Structure and Algorithms Made Easy" by Narasimha Karumanchi. You can also combine this book with these data structure and algorithms courses for better learning. 

2 Practical Data Structure, Algorithm, and Design Interview Questions from Investment Bank



2. Market Data Store

Which data structure will you choose to store Market data? and Why?
This is another data structure question, which is asked in various wall street firms e.g. Barclays, Citibank, Goldman Sachs, Deutsche Bank, WellsFargo, etc. 

Since many of you might not be familiar with Market data, Let me brief you about Market data and key things which will help us to approach the problem. In simple words, Market data are the live prices of different stocks at a given moment. Since prices move very quickly in Exchange, given the volume of high-frequency trading and electronic trading, market data quickly become stale.

Suppose your system is storing market data to analyze and found under-valued or over-valued stocks to make a buying or selling decision, and you need to store only recent market data for that purpose. Now this gives us an idea, that as soon as we receive an update, we should discard old market data, even if it's not yet processed.

Since Market data is processed by another thread and subsequently removed from the store, we need to also think of a data structure that provides consume kind of functionality. Initially, I thought of using an unbounded Queue for storing market data, which is simultaneously processed by another thread, a typical producer-consumer design, but this design has a problem in terms of updating prices at high speed.

scenario based data structure and algorithm questions design


For example, if you receive an update before earlier market data get processed, you can not update that, which means your application will process old data, which is stale and even if better data is available. Though you can tackle this problem by introducing a check and update method i.e. if a market data for that symbol already exists then remove it and add new data to the queue.

You can do that by using the contains() method but got to be careful while overriding equals() and hashCode(), because if you are wrapping symbol and price in an object say stock, then stocks with different prices will be treated as different. So Queue data structure seems a solution for this problem, let me know if you come across any other solution.

I also thought about Map, with LinkedHashMap you can get your order of insertion, and being a Map, you can update Market data in constant time, but Consumer thread, which needs to iterate and process market data will have a hard time over traversing, as Map will constantly updating.

Well, some of these questions can be very open-ended until the interviewer gives some more details, but at the same time they also expect the candidate to ask the right questions, which mainly comes as what, why, when kind of queries, but remember those are very important and also appreciated by interviewers.


That's all on this list of System Design and Practical Data structure and algorithm questions for Java, C++, and C# developers. These are just a way of approaching the problem, you may have another approach or another data structure to solve these problems, and until you can justify your use of a particular data structure in terms of benefits, you are perfectly fine to use that. In fact, improving your own solution, almost always create a better impression during interviews.

At the same time don't forget to prepare basic data structure and algorithms questions on a linked list, array, queue, and stacks. Stay tuned, I will share few more such data structure questions as and when I come across, meanwhile you are always welcome to share something similar. 

Till then you can also use problems given in Algorithm Design Manual by Steven S Skiena to improve your data structure and problem-solving skill.


Related Data Structure and Algorithm Interview Questions from Javarevisited Blog
  • Top 15 Data Structure and Algorithm Interview Questions (see here)
  • Top 20 String coding interview questions (see here)
  • 20 Dynamic Programming Interview questions (dynamic programming questions)
  • 10 Object Oriented Analysis and Design Questions (oop design questions)
  • 133 core Java interview questions of last 5 years (see here)
  • 30 System Design Questions for interviews (system design questions)
  • Top 30 Array Coding Interview Questions with Answers (see here)
  • Top 30 linked list coding interview questions (see here)
  • Top 50 Java Programs from Coding Interviews (see here)
  • Top 5 books on Programming/Coding Interviews (list)
Thanks for reading this article so far. If you like this article then please share it with your friends and colleagues. If you have any questions or doubt then please let us know and I'll try to find an answer for you.

P. S. - If you want to learn System Design in depth and looking for best System Design resources then you can also checkout my earlier post about best System Design courses and Books to learn System in depth, not just for interview but also general Software design task.

2 comments:

  1. Really unique type of questions and it was really useful. I have faced similar questions in one of the big 4 companies. Could you please let me know is there any book or course where I can find these kind of scenario based questions for practice?

    ReplyDelete
  2. Very interesting post, for designing limit order book, these are main considerations
    1. O(1) performance for add order
    2. cancel order in O(1) should remove from bok
    3. execute order in O(1)

    there are also few queries which require to as fast like volume of orders between two prices etc. I am looking for the best datastructure for limit order book, so if anyone has any insight or solution please suggest. Any code example would be better.

    ReplyDelete