Wednesday, January 4, 2012

Why use Memory Mapped File or MappedByteBuffer in Java

Memory mapped files are still a fairly new concept for many Java programmers, even though they have been available since JDK 1.4 as part of the java.nio package. Java IO became considerably faster with the introduction of NIO, and memory mapped files offer some of the fastest IO available in Java, which is the main reason why high performance Java applications use them for persisting data. They are already popular in the high frequency trading space, where electronic trading systems need to be extremely fast and one-way latency to the exchange has to be at the sub-microsecond level. IO has always been a concern for performance sensitive applications, and a memory mapped file lets you read and write file content directly in memory through a MappedByteBuffer, which is a direct byte buffer. A key advantage of memory mapped files is that the operating system takes care of reading and writing: even if your program crashes just after writing into memory, the OS will still write the content to the file. Another notable advantage is shared memory: a memory mapped file can be accessed by more than one process and can act as shared memory with extremely low latency. See also Peter's comment in the comment section.

Earlier we have seen how to read an XML file in Java and how to read a text file in Java. In this Java IO tutorial we are going to look at what a memory mapped file is, how to read and write from a memory mapped file, and important points related to memory mapped files.


What is Memory Mapped File and IO in Java

Memory mapped files let a Java program access file contents directly from memory. This is achieved by mapping the whole file, or a portion of it, into memory. The operating system takes care of loading the requested pages and writing changes back to the file, while the application only deals with memory, which results in very fast IO operations. The memory used to load a memory mapped file is outside of the Java heap space. Java supports memory mapped files through the java.nio package, which provides MappedByteBuffer to read and write from memory.
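As a small illustration of mapping only a portion of a file, the sketch below maps a read-only region and decodes it as text. The class name MapRegionSketch, the helper readRegion and the temporary file are made up for this example and are not part of the original post; it assumes the requested region lies inside the file.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example class, not from the original article.
class MapRegionSketch {

    // Maps `length` bytes starting at `offset` (read-only) and decodes them as ASCII.
    // Assumes offset + length does not exceed the file size.
    static String readRegion(Path file, long offset, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer region = channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
            byte[] bytes = new byte[length];
            region.get(bytes); // copies the mapped bytes; the OS pages them in on access
            return new String(bytes, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-region", ".txt");
        file.toFile().deleteOnExit();
        Files.write(file, "Hello, memory mapped world".getBytes(StandardCharsets.US_ASCII));

        // Only 6 bytes starting at offset 7 are mapped, not the whole file.
        System.out.println(readRegion(file, 7, 6)); // prints "memory"
    }
}
```

Only the mapped region is addressable through the buffer, so the rest of the file never has to be brought into the program's address space.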


Advantages and Disadvantages of Memory Mapped Files

Possibly the main advantage of memory mapped IO is performance, which is important when building a high frequency electronic trading system. Memory mapped files are much faster than standard file access via normal IO. Another big advantage is that memory mapped IO lets you work with a potentially very large file that would otherwise not be accessible, because only the requested pages are loaded. Experiments show that memory mapped IO performs better with large files. It does have a disadvantage in terms of an increased number of page faults: since the operating system only loads a portion of the file into memory, a request for a page that is not present in memory results in a page fault. Most major operating systems, such as Windows, Solaris and other UNIX-like systems, support memory mapped IO, and with a 64-bit architecture you can map almost any file into memory and access it directly from Java. Another advantage is that the file can be shared, giving you shared memory between processes, which can have more than 10x lower latency than using a socket over loopback.
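The sharing behaviour can be sketched inside a single JVM by creating two independent mappings of the same file: a value written through one is read back through the other. In practice the two mappings would live in different processes. The names SharedMappingSketch, mapShared and roundTrip are hypothetical, and the sketch assumes a mainstream OS; the MappedByteBuffer documentation leaves the exact timing of such visibility operating-system dependent, and no cross-process memory ordering is addressed here.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example class, not from the original article.
class SharedMappingSketch {

    // Opens the file and returns a read-write mapping of its first `size` bytes.
    static MappedByteBuffer mapShared(Path file, int size) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // A mapping stays valid after the channel that created it is closed.
            return channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    // Writes a value through one mapping and reads it back through a second one.
    static long roundTrip(Path file, long value) throws IOException {
        MappedByteBuffer writer = mapShared(file, Long.BYTES);
        MappedByteBuffer reader = mapShared(file, Long.BYTES);
        writer.putLong(0, value);  // absolute put at offset 0
        return reader.getLong(0);  // typically sees the value without any force()
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("shared-mapping", ".bin");
        file.toFile().deleteOnExit();
        System.out.println("Read through second mapping: " + roundTrip(file, 42L));
    }
}
```

No explicit flush is involved; both mappings are backed by the same pages of the file, which is what makes this usable as low-latency shared memory.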

MappedByteBuffer Read Write Example in Java

The example below shows how to read and write from a memory mapped file in Java. We use RandomAccessFile to open a file and then map it into memory using FileChannel's map() method. The map() method takes three parameters: the mode, the start position, and the length of the region to be mapped. It returns a MappedByteBuffer, which is a ByteBuffer for dealing with memory mapped files.


import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MemoryMappedFileInJava {

    private static final int count = 10485760; // 10 MB

    public static void main(String[] args) throws Exception {

        // try-with-resources closes the file; the mapping itself stays valid afterwards
        try (RandomAccessFile memoryMappedFile = new RandomAccessFile("largeFile.txt", "rw")) {

            // Mapping a file into memory (the file is grown to 10 MB if it is smaller)
            MappedByteBuffer out = memoryMappedFile.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, count);

            // Writing into Memory Mapped File
            for (int i = 0; i < count; i++) {
                out.put((byte) 'A');
            }

            System.out.println("Writing to Memory Mapped File is completed");

            // Reading from memory mapped file; get(i) is an absolute read,
            // so the buffer position left by the writes does not matter
            for (int i = 0; i < 10; i++) {
                System.out.print((char) out.get(i));
            }
            System.out.println();

            System.out.println("Reading from Memory Mapped File is completed");
        }
    }
}
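The start and length arguments of map() can also be used to walk a file in fixed-size windows. This matters because map() rejects a size larger than Integer.MAX_VALUE, so a single MappedByteBuffer cannot cover a file larger than about 2 GB. The sketch below sums all bytes of a file window by window; WindowedMappingSketch, sumBytes and the window size are illustrative names and values, not part of the original example.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example class, not from the original article.
class WindowedMappingSketch {

    // Sums every byte (as an unsigned value) by mapping one window at a time.
    static long sumBytes(Path file, int windowSize) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long fileSize = channel.size();
            long sum = 0;
            for (long offset = 0; offset < fileSize; offset += windowSize) {
                // The last window may be shorter than windowSize.
                long length = Math.min(windowSize, fileSize - offset);
                MappedByteBuffer window =
                        channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
                while (window.hasRemaining()) {
                    sum += window.get() & 0xFF;
                }
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("windowed", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, new byte[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10});

        // A tiny window of 4 bytes, just to show the file being walked in pieces.
        System.out.println(sumBytes(file, 4)); // prints 55
    }
}
```

For a real large file the window would be far bigger, for example hundreds of megabytes, since each map() call has a cost of its own.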


Summary

To summarize, here are the key points about memory mapped files and IO in Java:

1) Java supports memory mapped IO through the java.nio package.

2) Memory mapped files are used in performance sensitive applications, e.g. high frequency electronic trading platforms.

3) By using memory mapped IO you can load a portion of a large file into memory.

4) A memory mapped file can result in a page fault if the requested page is not in memory.

5) The ability to map a region of a file into memory depends on the addressable size of memory. On a 32-bit machine you cannot address beyond 4 GB, or 2^32 bytes.

6) Memory mapped IO is much faster than stream IO in Java.

7) The memory used to load the file is outside of the Java heap and resides in shared memory, which allows two different processes to access the file. This is possible because a MappedByteBuffer is always a direct byte buffer.

8) Reading and writing a memory mapped file is done by the operating system, so even if your Java program crashes after putting content into memory, the content will make it to disk as long as the OS keeps running.

9) Prefer a direct byte buffer over a non-direct buffer for higher performance.

10) Don't call MappedByteBuffer.force() too often. This method forces the operating system to write the content of memory to disk, so if you call force() every time you write into a memory mapped file, you will not see the true benefit of using a mapped byte buffer; performance will instead be similar to disk IO.

11) In case of a power failure or host failure, there is a slim chance that the content of the memory mapped file has not been written to disk, which means you could lose critical data.

12) A MappedByteBuffer and its file mapping remain valid until the buffer is garbage collected. sun.misc.Cleaner is probably the only option available to unmap a memory mapped file.
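The advice about force() can be sketched as follows: instead of forcing after every write, the code forces once per batch of records and once more at the end for any leftover. The class BatchedForceSketch, the method writeWithBatchedForce and the batch sizes are hypothetical choices for illustration; the right batch size depends on how much data you can afford to lose on a power or host failure.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example class, not from the original article.
class BatchedForceSketch {

    // Writes `records` long values and calls force() once per `forceEvery` records.
    // Returns how many times force() was called.
    static int writeWithBatchedForce(Path file, int records, int forceEvery) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, (long) records * Long.BYTES);
            int forces = 0;
            for (int i = 0; i < records; i++) {
                buffer.putLong(i);
                if ((i + 1) % forceEvery == 0) {
                    buffer.force(); // ask the OS to write dirty pages to the storage device
                    forces++;
                }
            }
            if (records % forceEvery != 0) {
                buffer.force(); // flush the final partial batch
                forces++;
            }
            return forces;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("batched-force", ".bin");
        file.toFile().deleteOnExit();
        int forces = writeWithBatchedForce(file, 1000, 250);
        System.out.println("force() calls for 1000 records: " + forces); // prints 4
    }
}
```

With a batch of 250 the 1000 writes cost four force() calls instead of one thousand, while still bounding how much unflushed data a host failure could lose.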


That's all on memory mapped files and memory mapped IO in Java. It's a pretty useful concept and I encourage you to learn more about it. If you are working in the high frequency trading space, memory mapped files are quite common there.


11 comments :

Peter Lawrey said...

Nice article. I would add that changes don't need to be flushed to disk. If you change a memory value and immediately die (e.g. triggering a SIGSEGV) the change is still saved to disk (Provided your OS doesn't die)

Another advantage is that the file can be shared, giving you shared memory between processes, which can have more than 10x lower latency than using a Socket over loopback.

For maximum performance you can use Unsafe to access the data directly. ;)

BTW: The ads are a bit much. :P

Javin @ how to run Java Program said...

Thanks Peter. Indeed, a key advantage is file sharing and shared memory, which allows two processes to read from a memory mapped file. That is very useful for keeping static or product data in a high frequency trading system.

Peter Lawrey said...

It's even useful for exchanging data which is updated very quickly as well, esp. if it's data you need to persist anyway.

Anonymous said...

Java memory mapped I/O is quite nice!
But I would like to point out that the count number you stated (count = 1010241024) is not 10 MB it's actually 963 MB.
Just don't be surprised when the file grows bigger and bigger ;)

Anonymous said...

Hi Javin, but isn't it true that you cannot unmap a memory mapped file, which can only be done if the garbage collector collects your buffer? In that case isn't memory mapped file a bit useless?

I have read on Peters blog where he used the sun.misc.Cleaner file to clean up the memory mapped file, is this the correct approach? I say this because there is no update on the SUN website regarding this bug

shambhu said...

@Anonymous, you are right, MappedByteBuffer and file mapping remains valid until buffer is garbage collected. sun.misc.Cleaner is probably the only option available to clear memory mapped file. If I come across any other method to unmap or clean contents of MappedByteBuffer I will let you know.

CK said...

What happens when there is a io exception from the disk? Does the java program get a chance to recover or does the process crash?

Anonymous said...

How big memory mapped file can be in Java? I read that it can only be 2GB because ByteBuffer, which is often used as MappedByteBuffer uses integer as index, which means only 2GB? What to do to map a file which is larger than 2GB in Java?

Levi said...

Hello Sir, What is difference between direct and non direct byte buffer in Java?

Anonymous said...

Few more things to remember while using Memory Mapped File for high performance application :

1) Prefer Direct Byte buffer over Non Direct Buffer
2) Don't call MappedByteBuffer.force() too often. This method is meant to force the operating system to write the content of memory to disk, so if you call force() each time you write into a memory mapped file, you will not see the true benefit of using a mapped byte buffer; instead it will be similar to disk IO.
3) In case of power failure or host failure, there is slim chance that content of memory mapped file is not written into disk, which means you could lose critical data.

Cheers

Anonymous said...

Javin,
Thanks for a great article. While looking at the MappedByteBuffer code, I found that it's only DirectByteBuffer backed. So, why do you think it needs to be GCed?
