Wednesday, August 14, 2013

How to fix Too many files open in Tomcat, Weblogic Server

Not many Java programmers know that socket connections are treated like files and use file descriptors, which are a limited resource. Different operating systems have different limits on the number of file handles they can manage. One of the common reasons for Too many files open in Tomcat, Weblogic or any Java application server is too many clients connecting and disconnecting frequently within a very short span of time. Socket connections internally use the TCP protocol, which says that a socket can remain in the TIME_WAIT state for some time even after it is closed. One of the reasons to keep a closed socket in the TIME_WAIT state is to ensure that delayed packets still reach the corresponding socket. Different operating systems have different default times for keeping sockets in the TIME_WAIT state; in Linux it's 60 seconds, while in Windows it is 4 minutes. Remember, the longer the timeout, the longer your closed socket will keep its file handle, which increases the chances of a Too many files open exception.

This also means that if you are running Tomcat, Weblogic, Websphere or any other web server on a Windows machine, you are more prone to this error than on UNIX-based systems, e.g. Solaris or Linux distributions such as Ubuntu.

By the way, this is the same error as the Too many files open exception thrown by code from the IO package if you try to open a new FileInputStream or any other stream pointing to a file resource.

How to solve Too many files open

Now we know that this error comes up because clients are connecting and disconnecting frequently. If that seems unusual for your application, you can find the culprit client and prohibit it from reconnecting, but if it is something your application may expect and you want to handle it on your side, you have two options:

1) Increase number of open file handles or file descriptors per process.
2) Reduce timeout for TIME_WAIT state in your operating system

In UNIX-based operating systems, e.g. Ubuntu or Solaris, you can use the command ulimit -a to find out how many open file handles are allowed per process.

$ ulimit -a
core file size        (blocks, -c) unlimited
data seg size         (kbytes, -d) unlimited
file size             (blocks, -f) unlimited
open files                    (-n) 256
pipe size          (512 bytes, -p) 10
stack size            (kbytes, -s) 8192
cpu time             (seconds, -t) unlimited
max user processes            (-u) 2048
virtual memory        (kbytes, -v) unlimited

You can see open files (-n) 256, which means only 256 open file handles are allowed per process. Remember that Tomcat, Weblogic and any other application server are Java programs running on a JVM; if your Java program exceeds this limit, it will throw the Too many files open error.

You can change this limit to a larger number, e.g. 4096, by using ulimit -n, but do it on the advice of your UNIX system administrator, and if you have a separate UNIX support team, then better escalate to them.

Another important thing to verify is that your process is not leaking file descriptors or handles. That's a tedious thing to find out, but you can use the lsof command to check how many open file handles are owned by a particular process in UNIX or Linux. You can run lsof with the PID of your process, which you can get from the ps command.
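As a complement to lsof, a running JVM can also report its own descriptor usage. The following is only a minimal sketch, assuming a HotSpot/OpenJDK-style JVM on a UNIX-like system where the operating system bean implements com.sun.management.UnixOperatingSystemMXBean; the class name FdUsage is made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdUsage {

    /**
     * Returns {open, max} file descriptor counts for this JVM process,
     * or null when the platform does not expose them (e.g. on Windows).
     */
    public static long[] fdCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = fdCounts();
        if (counts == null) {
            System.out.println("File descriptor counts are not available on this platform");
        } else {
            System.out.println("open=" + counts[0] + " max=" + counts[1]);
        }
    }
}
```

Logging these two numbers periodically may help you spot a slow leak: an open count that keeps climbing towards the maximum under steady load is a hint that some stream or socket is not being closed.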

Similarly, you can change the TIME_WAIT timeout, but do so in consultation with UNIX support, as a really low timeout means you might miss delayed packets. In Linux, the setting usually cited for this is the /proc/sys/net/ipv4/tcp_fin_timeout file; note, though, that this value governs the FIN_WAIT_2 state, while the TIME_WAIT duration itself is fixed at 60 seconds in the kernel. In Windows-based systems, you can see this information in the Windows registry. You can change the TCP TIME_WAIT timeout in Windows by following the steps below:

1) Open the Windows Registry Editor by typing regedit in the Run command window
2) Find the key HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\tcpip\Parameters
3) Add a new DWORD value named TcpTimedWaitDelay and, entering it as a decimal, set the desired timeout in seconds (60-240)
4) Restart your Windows machine.

Remember, you might not have permission to edit the Windows registry, and if you are not comfortable doing so, it is better not to. Instead, ask your Windows network support team, if you have one, to do it for you. The bottom line for fixing Too many files open is to either increase the number of open file handles or reduce the TCP TIME_WAIT timeout.

The Too many files open issue is also common among FIX engines, where clients use the TCP/IP protocol to connect to brokers' FIX servers. A FIX engine needs correct incoming and outgoing sequence numbers to establish a FIX session, so if a client tries to connect with a smaller sequence number than expected at the broker's end, the broker disconnects the session immediately. If the client is well behind and keeps retrying, increasing the sequence number by 1 each time, it can cause Too many files open at the broker's end. To avoid this, let the FIX engine keep track of its sequence number when it restarts.

In short, "Too many files open" can be seen in any Java server application, e.g. Tomcat, Weblogic, WebSphere etc., with clients connecting and disconnecting frequently.
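The FIX advice above, keeping track of the sequence number across restarts, could look roughly like the sketch below. This is not the API of any real FIX engine: the class SeqNumStore, its methods and the one-number-per-file layout are all assumptions made for illustration, and real engines usually ship their own persistent message stores.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper that keeps the next outgoing sequence number in a small file. */
public class SeqNumStore {
    private final Path file;

    public SeqNumStore(Path file) {
        this.file = file;
    }

    /** Returns the saved sequence number, or 1 when nothing has been saved yet. */
    public long load() throws IOException {
        if (!Files.exists(file)) {
            return 1L;
        }
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
        return Long.parseLong(text);
    }

    /** Overwrites the file with the next sequence number to use. */
    public void save(long nextSeqNum) throws IOException {
        Files.write(file, Long.toString(nextSeqNum).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("fix-seq", ".txt");
        SeqNumStore store = new SeqNumStore(file);
        store.save(1042L);
        // After a "restart", a fresh instance resumes from the saved value
        // instead of starting again from 1 and being rejected repeatedly.
        System.out.println("resuming at " + new SeqNumStore(file).load());
        Files.delete(file);
    }
}
```

Note that Files.readAllBytes and Files.write open and close the file themselves, so the helper does not hold a descriptor between calls.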



Anonymous said...

Hello there, what is the difference between the Too many open files and Too many files open errors, aren't both errors the same? Actually I am getting "Could not open more connection Too many open files" in Apache ActiveMQ running on my Windows machine. Can I use the above information to solve this problem? Please help.

Wang said...

In order to fix Too many open files, you must remember to close any stream you open, e.g. FileInputStream, FileOutputStream, SocketInputStream or SocketOutputStream. Always remember to close them in a finally block. It's also worth noting which operations make use of file descriptors, e.g.

- Opening an incoming socket connection will use a file handle. So will an outgoing socket connection.

- Reading and writing files will use file descriptors as well.
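Closing streams reliably, as suggested above, can be sketched with try-with-resources (Java 7 and later), which behaves like closing in a finally block but is harder to get wrong. The class and method names below are made up for illustration.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseStreams {

    /** Counts the bytes in a file; the stream is closed even if read() throws. */
    public static long countBytes(Path file) throws IOException {
        try (InputStream in = new FileInputStream(file.toFile())) {
            long total = 0;
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
            return total;
        } // in.close() runs here automatically, releasing the file descriptor
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("fd-demo", ".txt");
        try {
            Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
            // Far more iterations than a typical descriptor limit; this only
            // succeeds because every stream is closed before the next open.
            for (int i = 0; i < 10_000; i++) {
                countBytes(tmp);
            }
            System.out.println("Read the file 10000 times without leaking descriptors");
        } finally {
            Files.delete(tmp);
        }
    }
}
```

The same pattern applies to sockets, since Socket and ServerSocket also implement Closeable.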

Javin @ clustered vs nonclustered index said...

@Anonymous, I think both "Too many files open" and "Too many open files" are the same error. The actual error is "TOO MANY OPEN FILES", which comes when the limit on file handles is exhausted, either due to poor management, i.e. not closing streams once done, or due to increased volume. The only difference I can think of is that one is thrown from methods that try to open a socket connection, while the other is thrown by the File API.
