MD5 checksums are a handy way to verify the integrity of files, and they are easy to generate in Java. Java offers a couple of ways to generate an MD5 checksum for any file: you can use java.security.MessageDigest or an open source library such as Apache Commons Codec or Spring. All three approaches we saw in our earlier article about generating an MD5 hash for a String also apply to files. Since most md5() or md5Hex() methods take a byte array, you can simply read the bytes from an InputStream and pass them to these methods. Apache Commons Codec, from version 1.4, also provides an overloaded method that accepts an InputStream, which makes generating a checksum very easy.
For those who are not familiar with checksums, a checksum is a fixed-size datum computed from a block of data in order to detect accidental changes. Once you create a checksum for a file, which is based on the file's contents, any change to the file, e.g. adding white space or deleting a character, will result in a different checksum. By comparing the stored checksum with the current one, you can detect any change to the file. It's good practice to provide checksums of WAR or JAR files to support teams for a production release. In this Java tutorial we will learn how to create an MD5 checksum for any file in Java.
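To make the change-detection idea concrete, here is a minimal sketch using only the JDK. The class name ChecksumChangeDemo, its helper methods and the sample content "release-1.0" are hypothetical names chosen for illustration; a String stands in for the file's contents to keep the example short.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ChecksumChangeDemo {

    /** Returns the 16-byte MD5 digest of the given text (UTF-8 encoded). */
    static byte[] md5(String content) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(content.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // MD5 ships with standard JDKs, so this is not expected to happen.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** True when the current content still matches the stored checksum. */
    static boolean isUnchanged(byte[] storedChecksum, String currentContent) {
        return MessageDigest.isEqual(storedChecksum, md5(currentContent));
    }

    public static void main(String[] args) {
        byte[] stored = md5("release-1.0");

        // Same content: the checksums match.
        System.out.println(isUnchanged(stored, "release-1.0"));   // true

        // One trailing space added: the checksum no longer matches.
        System.out.println(isUnchanged(stored, "release-1.0 "));  // false
    }
}
```

In practice the stored checksum would come from wherever the release process records it, and the current one would be computed from the file on disk.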
Java program to generate MD5 checksum for Files
Once we create an MD5 checksum for a file, any further change to the file produces a different checksum. In this Java program we will see two ways to create an MD5 checksum for a file. The first method uses the standard Java library and MessageDigest from the java.security package. Notice that we use the update() method of MessageDigest instead of calling digest() with a single byte array. This is the right way to generate the MD5 checksum of a file, because the file could be very large and you might not have enough memory to read it entirely into a byte array, which would result in java.lang.OutOfMemoryError: Java heap space. It's better to read the data in parts and update the MessageDigest as you go. The second method uses Apache Commons Codec to generate the MD5 checksum of a file. From version 1.4, DigestUtils provides an overloaded md5Hex() method that accepts an InputStream, which means you don't need to convert the InputStream to a String or byte array. Let's see a complete Java example that creates an MD5 checksum for any file.
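The sketch below illustrates the first approach, reading the file in chunks and calling update() on MessageDigest. It is a minimal reconstruction under stated assumptions, not the article's original listing: the class name Md5FileChecksum, the 8 KB buffer size and the toHex() helper are choices made here for illustration. It converts the hash to hex two digits per byte rather than through BigInteger. With Commons Codec 1.4 or later on the classpath, the second approach would reduce to a single call to DigestUtils.md5Hex(inputStream); it is left out so the sketch runs on the JDK alone.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5FileChecksum {

    /** Reads the stream in 8 KB chunks so the whole file never sits in memory. */
    static String md5Hex(InputStream in) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            md.update(buffer, 0, read); // feed only the bytes actually read
        }
        return toHex(md.digest());
    }

    /** Convenience overload that opens and closes the file itself. */
    static String md5Hex(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return md5Hex(in);
        }
    }

    /** Emits two hex digits per byte, so leading zeros are preserved. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: java Md5FileChecksum <path-to-file>");
            return;
        }
        System.out.println(md5Hex(Paths.get(args[0])));
    }
}
```

Because the buffer is reused for every chunk, memory use stays constant regardless of the file's size.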
Some programmers use BigInteger to convert the byte array to a hex String, i.e. new BigInteger(1, hash).toString(16), perhaps because it looks like a neat one-liner. But it drops leading zeros, which can cause problems. Let's run the program again after changing the file's content to 27, which produces an MD5 checksum with a leading zero.
Now the MD5 checksum produced with BigInteger.toString(16) contains only 31 characters; the leading zero is missing. It's better to use the conventional way to convert a byte array to a hex String rather than this shortcut. If you really like BigInteger, then you can restore the leading zeros with the format method of String. You can take advantage of the fact that BigInteger only drops leading zeros and that an MD5 hex String always contains 32 characters. Here is a way to use the format method of String to produce a 32-character, lowercase, hexadecimal String that is left-padded with 0:
String.format("%032x",new BigInteger(1, hash));
If you replace the toString() method of BigInteger with the format method of String, you will get the same output from both methods.
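The difference between the two conversions can be seen in a small sketch. As an assumption made for illustration, it hashes the single character a instead of the file content used above: its MD5 value, 0cc175b9c0f1b6a831c399e269772661, is a well-known RFC 1321 test vector that also begins with a zero. The class and method names are hypothetical.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class LeadingZeroDemo {

    /** Returns the 16-byte MD5 digest of the given text (UTF-8 encoded). */
    static byte[] md5(String content) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(content.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Shortcut: BigInteger drops leading zeros, so the result can be shorter than 32 characters. */
    static String viaToString(byte[] hash) {
        return new BigInteger(1, hash).toString(16);
    }

    /** Padded variant: left-pads with 0 to 32 lowercase hex characters. */
    static String viaFormat(byte[] hash) {
        return String.format("%032x", new BigInteger(1, hash));
    }

    public static void main(String[] args) {
        byte[] hash = md5("a");
        System.out.println(viaToString(hash)); // cc175b9c0f1b6a831c399e269772661  (31 characters)
        System.out.println(viaFormat(hash));   // 0cc175b9c0f1b6a831c399e269772661 (32 characters)
    }
}
```

The width of 32 is specific to MD5, whose 16-byte hash always maps to 32 hex digits; a different digest algorithm would need a different width.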
That's all on how to generate an MD5 checksum for a file in Java. As I said, it's good to verify the checksums of files before releasing them to a production environment, and it's pretty easy to generate an MD5 checksum in Java using Apache Commons Codec or even Spring.