15 Java NIO, Socket, and Networking Interview Questions Answers

Networking and socket programming is one of the important areas of the Java programming language, especially for programmers who work on client-server applications. Detailed knowledge of important protocols such as TCP and UDP is very important, especially if you are in the business of writing high-frequency trading applications, which communicate via the FIX protocol or a native exchange protocol. In this article, we will look at some of the frequently asked questions on networking and socket programming, mostly based around the TCP/IP protocol.

This article is a bit light on NIO, though, as it doesn't include questions on multiplexing, selectors, ByteBuffer, and FileChannel, but it does include classic questions like the difference between IO and NIO.

The main focus of this post is to make Java developers familiar with the low-level parts, such as how the TCP and UDP protocols work, socket options, and writing multi-threaded servers in Java.

The questions discussed here are not really tied to the Java programming language and apply to any programming language that allows programmers to write client-server applications.

By the way, if you are going for an interview at an investment bank for a core Java developer role, you had better prepare well on Java NIO, socket programming, TCP, UDP, and networking, along with other popular topics such as multi-threading, the Collections API, and Garbage Collection tuning. You can also contribute any question that was asked to you or is related to socket programming and networking and can be useful for Java interviews.



Java Networking and Socket Programming Questions Answers

Here is my list of 15 interview questions related to networking basics, internet protocols, and socket programming in Java. Though it doesn't contain basic questions on APIs like Socket and ServerSocket, it focuses on the high-level concepts of writing a scalable server in Java using NIO selectors, how to implement one using threads, and the limitations and issues of each approach.

I will probably add a few more questions based on some best practices while writing socket-based applications in Java. If you know a good question on this topic, feel free to suggest.




1. Difference between TCP and UDP protocol? 
There are many differences between TCP (Transmission Control Protocol) and UDP (User Datagram Protocol), but the main one is that TCP is connection-oriented, while UDP is connectionless. TCP provides guaranteed delivery of data in the order it was sent, while UDP doesn't provide any delivery guarantee.

Because of this guarantee, TCP is slower than UDP, as it needs to perform more work. TCP is best suited for messages that you can't afford to lose, like order and trade messages in electronic trading, wire transfers in banking and finance, etc. UDP is more suited for media transmission, where the loss of an occasional packet (known as a datagram) is affordable and doesn't noticeably affect the quality of service.

This answer is enough for most interviews, but you need to be more detailed when you are interviewing as a Java developer for a high-frequency trading desk. Two points that many candidates forget to mention are ordering and data boundaries.

In TCP, bytes are guaranteed to be delivered in the same order as they are sent, but the data boundary is not preserved, which means multiple messages can be combined and sent together, or the receiver may receive one part of a message in one packet and the rest of the message in the next packet.

TCP reassembles the byte stream for you, so the application receives all the data in the order it was sent, but it has to work out for itself where one message ends and the next begins, for example with a length prefix or a delimiter. On the other hand, UDP sends a full message in a datagram packet; if the client receives the packet, it is guaranteed to get the full message, but there is no guarantee that packets will arrive in the same order they were sent.
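Because TCP hands the receiver a stream of bytes rather than discrete messages, applications usually add their own framing. The following is a minimal sketch of one common approach, a 4-byte length prefix in front of each message. The class and method names are made up for illustration, and a byte array stands in for the socket's streams so the example runs without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: frames each message as <4-byte length><payload>.
final class LengthPrefixedFraming {

    // Writes one message as a big-endian length followed by the UTF-8 payload.
    static void writeMessage(OutputStream out, String message) throws IOException {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(payload.length);
        data.write(payload);
        data.flush();
    }

    // Reads exactly one message, however the bytes were split or merged in transit.
    static String readMessage(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();
        byte[] payload = new byte[length];
        data.readFully(payload); // keeps reading until the whole payload has arrived
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Two messages written back to back arrive as one undivided run of bytes.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeMessage(wire, "NEW ORDER");
        writeMessage(wire, "CANCEL");

        InputStream in = new ByteArrayInputStream(wire.toByteArray());
        System.out.println(readMessage(in)); // NEW ORDER
        System.out.println(readMessage(in)); // CANCEL
    }
}
```

With a real connection you would pass socket.getInputStream() and socket.getOutputStream() instead of the in-memory streams.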

In short, you must mention the following differences between TCP and UDP while answering in the interview:
  • TCP guarantees delivery; UDP does not.
  • TCP guarantees the order of messages; UDP doesn't.
  • Data boundaries are not preserved in TCP, but UDP preserves them.
  • TCP is slower compared to UDP.
For a more detailed answer, see my post 9 differences between TCP and UDP protocol.


2. How does the TCP handshake work? 
Three messages are exchanged as part of the TCP handshake: the initiator sends SYN, upon receiving this the listener replies with SYN-ACK, and finally the initiator replies with ACK. At this point the TCP connection moves to the ESTABLISHED state. This process is easily understandable by looking at the following diagram.

[Diagram: TCP three-way handshake]





3. How do you implement reliable transmission in UDP protocol? 
This is usually a follow-up to the previous interview question. Though UDP doesn't provide a delivery guarantee at the protocol level, you can introduce your own logic to maintain reliable messaging e.g. by introducing sequence numbers and retransmission.

If the receiver finds that it has missed a sequence number, it can ask the server to replay that message. The TRDP protocol, which is used by Tibco Rendezvous (a popular high-speed messaging middleware), uses UDP for faster messaging and provides a reliability guarantee by using sequence numbers and retransmission.
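The receiver side of this idea can be sketched in a few lines. This is only an illustration of gap detection, not how TRDP or any particular product does it; the class name and the convention that sequence numbers start at 1 are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical receiver-side helper that spots gaps in sequence numbers.
final class SequenceGapDetector {
    private long nextExpected = 1; // assumption: the sender starts numbering at 1

    // Call this for every datagram received. Returns the sequence numbers that
    // were skipped and should be re-requested from the sender.
    List<Long> onDatagram(long sequence) {
        List<Long> missing = new ArrayList<>();
        if (sequence < nextExpected) {
            return missing; // duplicate or late retransmission, nothing new is missing
        }
        for (long s = nextExpected; s < sequence; s++) {
            missing.add(s);
        }
        nextExpected = sequence + 1;
        return missing;
    }
}
```

For example, after receiving sequence 1 and then sequence 4, onDatagram(4) returns [2, 3], which the receiver would send back to the publisher as a retransmission request. A real implementation would also buffer out-of-order messages and time out requests that are never answered.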


4. What is Network Byte Order? How do two hosts communicate if they have different byte-ordering? 
There are two ways to store a multi-byte value in memory: little-endian (least significant byte at the starting address) and big-endian (most significant byte at the starting address). The ordering a particular machine uses is known as its host byte order. For example, number the bytes of a 32-bit integer 1 to 4, where 1 is the most significant byte: a big-endian processor, such as a traditional IBM PowerPC, stores them as four consecutive bytes in memory in the order 1-2-3-4.

A little-endian processor, such as Intel x86, stores the same integer in the byte order 4-3-2-1. Networking protocols such as TCP are based on a specific network byte order, which uses big-endian byte ordering. If two machines are communicating with each other and they have different byte ordering, values are converted to network byte order before sending and back to host byte order after receiving.

Therefore, a little-endian micro-controller sending to a UDP/IP network must swap the order in which bytes appear within multi-byte values before the values are sent onto the network, and must likewise swap the order of bytes in multi-byte values received from the network before the values are used. In short, you can also say network byte order is the standard for ordering bytes during transmission, and it uses big-endian byte ordering.
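In Java you rarely swap bytes by hand, because ByteBuffer lets you state the byte order explicitly (and defaults to big-endian). The sketch below, with a made-up class name, converts an int to and from network byte order regardless of what the host CPU uses.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class: encodes and decodes a 32-bit integer in network byte order.
final class NetworkByteOrderDemo {

    // Encodes the value with the most significant byte first (big-endian).
    static byte[] toNetworkOrder(int value) {
        return ByteBuffer.allocate(Integer.BYTES)
                .order(ByteOrder.BIG_ENDIAN)
                .putInt(value)
                .array();
    }

    // Decodes four bytes that were received in network byte order.
    static int fromNetworkOrder(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] wire = toNetworkOrder(0x01020304);
        System.out.printf("%02x %02x %02x %02x%n", wire[0], wire[1], wire[2], wire[3]); // 01 02 03 04
        System.out.println("Host byte order: " + ByteOrder.nativeOrder());
    }
}
```

DataOutputStream.writeInt() and DataInputStream.readInt() also use big-endian order, so they interoperate with this encoding.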


5. What is Nagle's algorithm?
If the interviewer is testing your knowledge of the TCP/IP protocol, this question is very likely to come up. Nagle's algorithm is a way of improving the performance of TCP/IP networks by reducing the number of TCP packets that need to be sent over the network. It works by buffering small writes until either a full segment (Maximum Segment Size) has accumulated or the data already sent has been acknowledged.

Small packets, which contain only 1 or 2 bytes of data, carry a disproportionate overhead, because the TCP and IP headers together take about 40 bytes. These small packets can also lead to congestion in a slow network. Nagle's algorithm tries to improve the efficiency of TCP by coalescing them into a larger packet.

Nagle's algorithm can also have a negative effect on larger writes, so if you are writing large chunks of data, it's better to disable it. In general, Nagle's algorithm is a defense against careless applications that send lots of small packets to the network; it brings no benefit to, and can even hurt, well-written applications that take proper care of buffering.
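As an illustration of what "taking care of buffering" can look like in Java, the sketch below wraps an output stream in a BufferedOutputStream so that several small writes reach the underlying stream as one write. CountingOutputStream is a hypothetical stand-in for a socket's output stream that simply counts how often it is written to.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical demo: coalesce small writes in the application before they hit the socket.
final class BufferedWriteDemo {

    // Stand-in for socket.getOutputStream(); counts the write calls it receives.
    static final class CountingOutputStream extends OutputStream {
        final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int writeCalls = 0;

        @Override public void write(int b) {
            writeCalls++;
            sink.write(b);
        }

        @Override public void write(byte[] b, int off, int len) {
            writeCalls++;
            sink.write(b, off, len);
        }
    }

    // Writes every field through a buffer, flushes once, and reports how many
    // writes reached the underlying stream.
    static int sendBuffered(CountingOutputStream raw, byte[]... fields) throws IOException {
        BufferedOutputStream out = new BufferedOutputStream(raw, 1024);
        for (byte[] field : fields) {
            out.write(field);
        }
        out.flush(); // one flush per logical message
        return raw.writeCalls;
    }
}
```

Here three small fields passed to sendBuffered() result in a single write on the underlying stream, so there is only one small chunk for TCP to deal with instead of three.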


6. What is TCP_NODELAY? 
TCP_NODELAY is a socket option, provided by most TCP implementations, that disables Nagle's algorithm. Since Nagle's algorithm interacts badly with TCP's delayed acknowledgment algorithm, it's better to disable Nagle's when you are doing a write-write-read sequence, where a read after two successive writes on the socket may be delayed by up to 500 milliseconds until the second write has reached the destination.

If latency is a bigger concern than bandwidth usage, for example in a network-based multi-player game where the user wants to see actions from other players immediately, it's better to bypass Nagle's delay by setting the TCP_NODELAY flag.
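In Java the flag is exposed as Socket.setTcpNoDelay(). A minimal sketch follows; the class and helper method names are made up, and a throwaway ServerSocket on the loopback address is used only so the example has something to connect to.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo of disabling Nagle's algorithm on a client socket.
final class NoDelayDemo {

    // Connects to the given endpoint and sets TCP_NODELAY on the connection.
    static Socket connectWithoutNagle(InetAddress host, int port) throws IOException {
        Socket socket = new Socket(host, port);
        socket.setTcpNoDelay(true); // small writes are sent immediately
        return socket;
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the operating system for any free port.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = connectWithoutNagle(server.getInetAddress(), server.getLocalPort())) {
            System.out.println("TCP_NODELAY enabled: " + client.getTcpNoDelay());
        }
    }
}
```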


7. What is multicasting or multicast transmission? Which Protocol is generally used for multicast? TCP or UDP? 
Multicasting, or multicast transmission, is one-to-many distribution, where a message is delivered to a group of subscribers simultaneously in a single transmission from the publisher. Copies of the message are automatically created by other network elements, e.g. routers, but only when the topology of the network requires it.

Tibco Rendezvous supports multicast transmission. IP multicast is generally implemented using UDP, because it sends the full data as a datagram packet, which can be replicated and delivered to other subscribers. Since TCP is a point-to-point protocol, it cannot deliver messages to multiple subscribers unless it has a separate connection to each of them.

UDP, however, is not reliable, and messages may be lost or delivered out of order. Reliable multicast protocols such as Pragmatic General Multicast (PGM) have been developed to add loss detection and retransmission on top of IP multicast. IP multicast is widely deployed in enterprises, commercial stock exchanges, and multimedia content delivery networks. A common enterprise use of IP multicast is for IPTV applications.
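A rough sketch of UDP multicast with the standard java.net classes is shown below. The group address 230.0.0.1 and port 4446 are arbitrary values chosen for illustration, the class name is made up, and whether datagrams are actually delivered depends on the network interfaces and routers involved.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a multicast publisher and subscriber.
final class MulticastSketch {
    static final String GROUP = "230.0.0.1"; // assumed multicast group address
    static final int PORT = 4446;            // assumed port

    // Builds a datagram addressed to the multicast group.
    static DatagramPacket buildPacket(String message) throws IOException {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        return new DatagramPacket(payload, payload.length, InetAddress.getByName(GROUP), PORT);
    }

    // Publisher: a single send reaches every subscriber that joined the group.
    static void publish(String message) throws IOException {
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(buildPacket(message));
        }
    }

    // Subscriber: joins the group and blocks until one datagram arrives.
    static String receiveOne() throws IOException {
        InetSocketAddress group = new InetSocketAddress(InetAddress.getByName(GROUP), PORT);
        try (MulticastSocket socket = new MulticastSocket(PORT)) {
            socket.joinGroup(group, null); // null leaves the choice of interface to the system
            byte[] buffer = new byte[1500];
            DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
            socket.receive(packet);
            socket.leaveGroup(group, null);
            return new String(packet.getData(), packet.getOffset(), packet.getLength(),
                    StandardCharsets.UTF_8);
        }
    }
}
```

Note that the publisher needs no knowledge of how many subscribers exist, which is exactly what a point-to-point TCP connection cannot offer.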


8. What is the difference between Topic and Queue in JMS? 
The main difference between Topic and Queue in the Java Message Service shows up when we have multiple consumers consuming messages. If we set up multiple listener threads to consume messages from a Queue, each message will be dispatched to only one thread and not to all threads. On the other hand, in the case of a Topic, each subscriber gets its own copy of the message.



9. What is the difference between IO and NIO?
The main difference between NIO and IO is that NIO provides non-blocking, multiplexed IO, which is critical for writing fast and scalable networking systems, while most of the classes in the classic IO package are blocking: a thread calling read() or write() waits until the operation completes.

NIO takes advantage of operating system facilities such as the select() system call on UNIX systems. Using select(), an application can monitor several resources at the same time and can also poll for network activity without blocking.

The select() system call identifies whether data is pending, so that read() or write() can then be called knowing that they will complete immediately.
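To make the blocking versus non-blocking distinction concrete, here is a small sketch (class and method names are made up) that opens a loopback connection and reads from it in non-blocking mode while the peer has sent nothing. A classic InputStream.read() would wait here, whereas the NIO channel returns straight away.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo of a non-blocking read with NIO channels.
final class NonBlockingReadDemo {

    // Returns the result of a non-blocking read on a connection with no data pending.
    static int readWithNoDataPending() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 asks the operating system for any free port.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {
                accepted.configureBlocking(false);
                // The client has written nothing, so this returns 0 instead of waiting.
                return accepted.read(ByteBuffer.allocate(64));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Bytes read: " + readWithNoDataPending()); // Bytes read: 0
    }
}
```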



10. How do you write a multi-threaded server in Java?
A multi-threaded server is one that can serve multiple clients at the same time, without one client blocking the others. Java provides excellent support for developing such servers. Prior to Java 1.4, you could write a multi-threaded server using traditional socket IO and threads. This had a severe limitation on scalability, because it creates a new thread for each connection and you can only create a limited number of threads, depending upon the machine's and platform's capability.

Though this design can be improved by using thread pools and worker threads, it is still a resource-intensive design. After the introduction of NIO in JDK 1.4, writing scalable servers became a bit easier. You can handle many connections in a single thread by using a Selector, which takes advantage of the non-blocking IO model of Java NIO.
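The single-threaded Selector design can be sketched as a small echo server. This is a simplified illustration rather than production code: the class name is made up, it binds to the loopback address on a port chosen by the operating system, and it ignores details such as partial writes under back-pressure and per-connection error handling.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Hypothetical single-threaded echo server built on a Selector.
final class SelectorEchoServer implements Runnable, AutoCloseable {
    private final Selector selector;
    private final ServerSocketChannel server;

    SelectorEchoServer() throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() {
        return server.socket().getLocalPort();
    }

    @Override public void run() {
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        try {
            while (selector.isOpen()) {
                selector.select(); // blocks until at least one channel is ready
                if (!selector.isOpen()) break;
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buffer.clear();
                        if (client.read(buffer) < 0) { // peer closed the connection
                            client.close();
                            continue;
                        }
                        buffer.flip();
                        // Simplification: spins if the socket's send buffer is full.
                        while (buffer.hasRemaining()) client.write(buffer);
                    }
                }
            }
        } catch (IOException | ClosedSelectorException e) {
            // Selector closed or I/O failed; a real server would log and recover per connection.
        }
    }

    @Override public void close() throws IOException {
        selector.close(); // wakes up a thread blocked in select()
        server.close();
    }
}
```

To try it, start the server with new Thread(server).start() and connect any TCP client to server.port(). All clients are served by that one thread, which is the key difference from the thread-per-connection design.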



11. What is the ephemeral port?
A TCP/IP connection is usually identified by four things: server IP, server port, client IP, and client port. Out of these four, three are well known most of the time; what is not known is the client port, and this is where ephemeral ports come into the picture. Ephemeral ports are dynamic ports assigned by your machine's IP stack from a specified range, known as the ephemeral port range, when a client connection doesn't explicitly specify a port number.

These are short-lived, temporary ports, which can be reused once the connection is closed, but most IP software doesn't reuse an ephemeral port until the whole range is exhausted. Similar to TCP, UDP also uses ephemeral ports when sending datagrams. On Linux the ephemeral port range has traditionally been 32768 to 61000, while older versions of Windows used a default range of 1025 to 5000 (Windows Vista and later default to 49152 to 65535). Similarly, other operating systems have their own ephemeral port ranges.
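You can see an ephemeral port being assigned by connecting a client socket without specifying a local port and then asking the socket which port it got. The sketch below uses a made-up class name and a loopback server so that it runs on a single machine.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo that shows the client port chosen by the operating system.
final class EphemeralPortDemo {

    // Returns { server port, client port } for one loopback connection.
    static int[] connectAndReportPorts() throws IOException {
        // Port 0 means "any free port", so the server port is picked by the OS as well.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            // No local port was given for the client, so the IP stack assigned one.
            return new int[] { server.getLocalPort(), client.getLocalPort() };
        }
    }

    public static void main(String[] args) throws IOException {
        int[] ports = connectAndReportPorts();
        System.out.println("Server port: " + ports[0] + ", client (ephemeral) port: " + ports[1]);
    }
}
```

The exact numbers printed vary from run to run and from one operating system to another.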



12. What is the sliding window protocol? 
Sliding window protocol is a technique for controlling transmitted data packets between two network computers where reliable and sequential delivery of data packets is required, such as provided by the Transmission Control Protocol (TCP). 

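In the sliding window technique, each packet includes a unique consecutive sequence number, which is used by the receiving computer to place data in the correct order. The sender may have only a limited number of unacknowledged packets (the window) in flight at any time, and the window slides forward as acknowledgments arrive. The objective of the sliding window technique is to use the sequence numbers to avoid duplicate data and to request missing data.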
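The sender's bookkeeping can be illustrated with a toy model. This is a sketch of the general idea with cumulative acknowledgments, not TCP's actual implementation (which counts bytes rather than packets and adjusts the window dynamically); the class and method names are made up.

```java
// Hypothetical toy model of the sender side of a sliding window.
final class SlidingWindowSender {
    private final int windowSize;
    private int base = 0;    // oldest sequence number not yet acknowledged
    private int nextSeq = 0; // next sequence number to hand out

    SlidingWindowSender(int windowSize) {
        this.windowSize = windowSize;
    }

    // True while fewer than windowSize packets are waiting for an acknowledgment.
    boolean canSend() {
        return nextSeq - base < windowSize;
    }

    // Assigns the next sequence number to an outgoing packet.
    int send() {
        if (!canSend()) {
            throw new IllegalStateException("window full, wait for an acknowledgment");
        }
        return nextSeq++;
    }

    // Cumulative acknowledgment: everything up to and including seq has arrived,
    // so the window slides forward past it.
    void acknowledge(int seq) {
        if (seq >= base && seq < nextSeq) {
            base = seq + 1;
        }
    }

    int inFlight() {
        return nextSeq - base;
    }
}
```

With a window of 3, the sender can emit packets 0, 1, and 2 and must then wait; once the receiver acknowledges packet 1, two slots free up and packets 3 and 4 can be sent.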


13. When do you get the "too many files open" error? 
Just like a file, a socket connection also needs a file descriptor. Since every machine has a limited number of file descriptors, it's possible for a process to run out of them. When that happens, you will see a "too many files open" error (the usual operating system message is "Too many open files"). On a UNIX-based system you can check how many file descriptors a process is allowed by executing the ulimit -n command, and see how many are currently in use by counting the entries in /proc//fd/
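In Java code, the most common cause of descriptor exhaustion is a socket or stream that is never closed on some error path. A minimal sketch of the usual remedy, try-with-resources, is shown below; the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;

// Hypothetical client that always releases its socket, and with it the file descriptor.
final class DescriptorSafeClient {

    // Sends a single byte and reports whether the socket ended up closed.
    static boolean sendOneByte(InetAddress host, int port) throws IOException {
        Socket handle;
        try (Socket socket = new Socket(host, port)) {
            handle = socket;
            socket.getOutputStream().write(1);
        } // close() runs here, even if write() throws
        return handle.isClosed();
    }
}
```

The same pattern applies to ServerSocket, SocketChannel, and file streams, since all of them implement Closeable.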


14. What is TIME_WAIT state in TCP protocol? When does a socket connection go to TIME_WAIT state? 
When one end of a TCP connection closes it first by making a system call (an active close), that end goes into the TIME_WAIT state once the closing handshake completes. Since TCP packets can be delayed or arrive in the wrong order, the port must not be released immediately, so that late packets belonging to the old connection can still arrive and be discarded. That's why that end of the TCP connection stays in TIME_WAIT for a while.

For example, if the client closes the socket connection first, the client side will go to TIME_WAIT; similarly, if the server closes the connection first, you will see TIME_WAIT there. You can check the status of your TCP and UDP sockets by using networking commands in UNIX such as netstat.


15. What will happen if you have too many socket connections in the TIME_WAIT state on the Server? 
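When a socket connection goes into the TIME_WAIT state, the application has already closed it, but the operating system keeps the connection entry, and with it the local port, reserved until the TIME_WAIT state is gone, i.e. after some configured time. If too many connections are in the TIME_WAIT state, your server may run out of resources such as ephemeral ports for new outgoing connections and kernel memory for tracking sockets, and new connections may start to fail. Note that, on most operating systems, a socket in TIME_WAIT no longer holds a file descriptor, so a "too many files open" error usually points to sockets that were never closed by the application (which show up as CLOSE_WAIT) rather than to TIME_WAIT itself.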
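One practical place where TIME_WAIT bites Java servers is a restart: while old connections on the listening port are still in TIME_WAIT, binding the same port again can fail with "Address already in use" on some platforms. A common mitigation is to enable SO_REUSEADDR before binding. The sketch below shows the order of calls; the class and method names are made up, and the exact behavior of the option differs between operating systems.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper that opens a listening socket with SO_REUSEADDR enabled.
final class ReuseAddressDemo {

    static ServerSocket openRestartableServer(int port) throws IOException {
        ServerSocket server = new ServerSocket(); // unbound, so the option can be set first
        server.setReuseAddress(true);             // must be called before bind() to have an effect
        server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
        return server;
    }
}
```

Passing port 0 lets the operating system choose a free port, which is convenient for experiments; a real server would pass its well-known port.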


That's all in this list of networking and socket programming interview questions and answers. Though I originally intended this list for Java programmers, it is equally useful for any programmer. In fact, this is the bare minimum knowledge of sockets and protocols every programmer should have. I have found that C and C++ programmers are better at answering these questions than the average Java programmer.

One reason for this may be that Java programmers have so many useful libraries like Apache MINA, which do all the low-level work for them. Anyway, knowledge of fundamentals is very important and everything else is just an excuse, but at the same time, I also recommend using tried and tested libraries like Apache MINA for production code.

2 comments :

Anonymous said...

On NIO, I have seen some good questions like :
1) Difference between StringBuilder and ByteBuffer in Java?
2) How Selector works?
3) Difference between Channel and Stream in Java?
4) What is loopback? What happens if your client and server on same host?
5) How many sockets a Java program can open without crashing?

I don't know if you heard about them, but I found them quite interesting.

Anonymous said...

Few more questions to add :
1) What is multicasting?
2) What is difference between broadcast and multicast? which is more efficient and why?
3) what is multicast address and multicast group?
