Sunday, February 6, 2022

2 Examples to read Zip Files in Java, ZipFile vs ZipInputStream - Tutorial

ZIP is one of the most popular archive and compression formats in the computer world. A Zip file may contain multiple files or folders in compressed form. The Java API provides extensive support for reading Zip files, and all classes related to zip file processing are located in the java.util.zip package. One of the most common tasks with a zip archive is to read it, display the entries it contains, and then extract them into a folder. In this tutorial, we will learn how to do this in Java.

There are two ways to iterate over all items in a given zip archive: you can use either java.util.zip.ZipFile or java.util.zip.ZipInputStream. Each item in a Zip file has a header that records, among other things, its size in bytes. This means you can iterate over all entries without actually decompressing the zip file.

The ZipFile class accepts a java.io.File or a String file name and opens the ZIP file for reading. By default, the UTF-8 charset is used to decode the entry names and comments.
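If an archive was written with a different charset, entry names may fail to decode with the UTF-8 default. Since Java 7, ZipFile also has constructors that take a java.nio.charset.Charset. Here is a minimal sketch of that; the class name ZipNamesWithCharset and the method entryNames are made up for this illustration, and which charset you need depends on the tool that created the archive.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ZipNamesWithCharset {

    // Hypothetical helper: lists entry names, decoding them with the given charset
    static List<String> entryNames(String zipPath, Charset charset) throws IOException {
        List<String> names = new ArrayList<>();
        // ZipFile(String, Charset) is available since Java 7
        try (ZipFile zip = new ZipFile(zipPath, charset)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a zip file, args[1]: charset name such as ISO-8859-1
        for (String name : entryNames(args[0], Charset.forName(args[1]))) {
            System.out.println(name);
        }
    }
}
```

You would call it as, for example, entryNames("C:\\temp\\pics.zip", StandardCharsets.ISO_8859_1) if the entry names were stored in that charset.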

The main benefit of using ZipFile over ZipInputStream is that it uses random access to move between entries, while ZipInputStream is sequential: it works on a stream, so it cannot move to an arbitrary position.

ZipInputStream has to read and decompress all the data of an entry to reach its end before it can read the header of the next entry. That's why it's better to use the ZipFile class rather than ZipInputStream for iterating over all entries of an archive.

We will learn more about how to read Zip files in Java by following an example. By the way, the code should work with zip files created by common zip utilities such as WinZip or WinRAR. Keep in mind that the .ZIP format permits multiple compression algorithms, while java.util.zip handles only stored (uncompressed) and DEFLATE-compressed entries, which is what most tools produce by default. I have tested it with WinZip on Windows 8.




How to Read a Zip archive in Java? Example

In this example, I have used the ZipFile class to iterate over each file in a Zip archive. The entries() method of ZipFile returns an enumeration of ZipEntry objects (getEntry(String) looks up a single entry by name), and each ZipEntry carries metadata including the name, size, and modification date and time.

You can ask ZipFile for the InputStream corresponding to an entry to extract the real data. This means you only incur the cost of decompression when you really need to. By using java.util.zip.ZipFile, you can check each entry and extract only certain entries, depending upon your logic.
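To illustrate this selective access, here is a small sketch that reads exactly one entry by name and leaves the rest of the archive untouched. The class name SingleEntryReader and the method readEntry are hypothetical names chosen for this example; it loads the whole entry into memory, so it is only suitable for small entries.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class SingleEntryReader {

    // Hypothetical helper: returns the decompressed bytes of one entry,
    // or null if the archive has no file entry with that name.
    static byte[] readEntry(String zipPath, String entryName) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath)) {
            ZipEntry entry = zip.getEntry(entryName); // lookup by name, no scanning of data
            if (entry == null || entry.isDirectory()) {
                return null;
            }
            // Decompression happens only here, and only for this entry
            try (InputStream in = zip.getInputStream(entry)) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            }
        }
    }
}
```

For example, readEntry("C:\\temp\\pics.zip", "Image  (2).png") would return only that image's bytes, assuming an entry with that exact name exists.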

ZipFile is good for both sequential and random access of individual file entries. On the other hand, if you are using ZipInputStream then, like any other InputStream, you will need to process all entries sequentially, as shown in the second example.


The key point to remember, especially if you are processing large zip archives, is that Java 6 only supports zip files up to 2GB. Thankfully, Java 7 supports the zip64 mode, which can be used to process large zip files with a size of more than 2GB.




Java Program to read Zip File in Java

Here is a complete Java program to read zip files. You can copy and paste this code into your IDE, such as Eclipse or IntelliJ IDEA, and run it.

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Date;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

/**
 * Java program to iterate over and read file entries from a Zip archive.
 * This program demonstrates two ways to retrieve files from a Zip archive:
 * by using the ZipFile class and by using the ZipInputStream class.
 * @author Javin
 */

public class ZipFileReader {

    // This Zip file contains 11 PNG images
    private static final String FILE_NAME = "C:\\temp\\pics.zip";
    private static final String OUTPUT_DIR = "C:\\temp\\Images\\";
    private static final int BUFFER_SIZE = 8192;

    public static void main(String[] args) throws IOException {

        // Prefer ZipFile over ZipInputStream
        readUsingZipFile();
        // readUsingZipInputStream();

    }

    /*
     * Example of reading Zip archive using ZipFile class
     */

    private static void readUsingZipFile() throws IOException {
        // try-with-resources closes the ZipFile and, with it, every entry stream it returned
        try (ZipFile file = new ZipFile(FILE_NAME)) {
            System.out.println("Iterating over zip file : " + FILE_NAME);

            final Enumeration<? extends ZipEntry> entries = file.entries();
            while (entries.hasMoreElements()) {
                final ZipEntry entry = entries.nextElement();
                System.out.printf("File: %s Size %d  Modified on %TD %n",
                        entry.getName(), entry.getSize(), new Date(entry.getTime()));
                extractEntry(entry, file.getInputStream(entry));
            }
            System.out.printf("Zip file %s extracted successfully in %s%n",
                    FILE_NAME, OUTPUT_DIR);
        }

    }

    /*
     * Example of reading Zip file using ZipInputStream in Java.
     */

    private static void readUsingZipInputStream() throws IOException {
        // Closing the ZipInputStream also closes the wrapped streams
        try (ZipInputStream is = new ZipInputStream(
                new BufferedInputStream(new FileInputStream(FILE_NAME)))) {
            ZipEntry entry;
            while ((entry = is.getNextEntry()) != null) {
                // getSize() may return -1 here when the size is not known up front
                System.out.printf("File: %s Size %d  Modified on %TD %n",
                        entry.getName(), entry.getSize(), new Date(entry.getTime()));
                extractEntry(entry, is);
            }
        }

    }

    /*
     * Utility method to copy the data of one entry from the InputStream
     * into a file under OUTPUT_DIR. The InputStream is left open because
     * ZipInputStream needs it to read the next entry.
     */

    private static void extractEntry(final ZipEntry entry, InputStream is)
               throws IOException {
        final String extractedFile = OUTPUT_DIR + entry.getName();

        // try-with-resources closes the output file on success and on failure;
        // any IOException is propagated to the caller instead of being swallowed
        try (FileOutputStream fos = new FileOutputStream(extractedFile)) {
            final byte[] buf = new byte[BUFFER_SIZE];
            int length;

            while ((length = is.read(buf, 0, buf.length)) >= 0) {
                fos.write(buf, 0, length);
            }
        }

    }

}

Output:
Iterating over zip file : C:\temp\pics.zip
File: Image  (11).png Size 21294  Modified on 10/24/13
File: Image  (1).png Size 22296  Modified on 11/19/13
File: Image  (2).png Size 10458  Modified on 10/24/13
File: Image  (3).png Size 18425  Modified on 11/19/13
File: Image  (4).png Size 31888  Modified on 11/19/13
File: Image  (5).png Size 27454  Modified on 11/19/13
File: Image  (6).png Size 67608  Modified on 11/19/13
File: Image  (7).png Size 8659  Modified on 11/19/13
File: Image  (8).png Size 40015  Modified on 11/19/13
File: Image  (9).png Size 17062  Modified on 10/24/13
File: Image  (10).png Size 42467  Modified on 10/24/13
Zip file C:\temp\pics.zip extracted successfully in C:\temp\Images\

To run this program, you must have a zip file named pics.zip in C:\temp, and the output directory C:\temp\Images must already exist; otherwise the program fails at run time, either because the archive cannot be opened or because extractEntry() cannot create its FileOutputStream in a missing directory. After a successful run, you can see the contents of the zip file extracted inside the output directory.
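The program above assumes a flat archive and a pre-existing output directory. As a hedged sketch of how extraction could be made more robust, the following variant creates missing directories and rejects entries whose names would resolve outside the target directory (for example "../evil.txt"). It uses java.nio.file instead of FileOutputStream, and the class name SafeUnzip and the method unzip are hypothetical names for this illustration, not part of the original program.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class SafeUnzip {

    // Hypothetical helper: extracts every entry of zipPath below targetDir
    static void unzip(Path zipPath, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        Files.createDirectories(root);

        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                Path target = root.resolve(entry.getName()).normalize();

                // Refuse names that point outside the target directory
                if (!target.startsWith(root)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(target);
                    continue;
                }
                // Create parent folders for entries such as "docs/readme.txt"
                Files.createDirectories(target.getParent());
                try (InputStream in = zip.getInputStream(entry)) {
                    Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }
}
```

A call such as SafeUnzip.unzip(Paths.get("C:\\temp\\pics.zip"), Paths.get("C:\\temp\\Images")) would then work even when C:\temp\Images does not exist yet.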


By the way, as an exercise, you can enhance this program to get the name of the zip file from the user and create an output directory of the same name.
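One possible starting point for this exercise is sketched below: it takes the zip file name from the command line and derives an output directory with the same base name next to the archive. The class name OutputDirFromZipName and the method outputDirFor are made up for this sketch, and the extraction itself is left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OutputDirFromZipName {

    // Hypothetical helper: "pics.zip" becomes a sibling directory named "pics"
    static Path outputDirFor(Path zipPath) {
        String fileName = zipPath.getFileName().toString();
        int dot = fileName.lastIndexOf('.');
        // Strip the last extension, if there is one
        String baseName = (dot > 0) ? fileName.substring(0, dot) : fileName;
        return zipPath.resolveSibling(baseName);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java OutputDirFromZipName <zip file>");
            return;
        }
        Path outputDir = outputDirFor(Paths.get(args[0]));
        Files.createDirectories(outputDir); // creates the directory if it is missing
        System.out.println("Output directory: " + outputDir);
    }
}
```

The resulting path can then replace the hard-coded OUTPUT_DIR constant of the program above.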

That's all about how to read Zip files in Java. We have seen two different approaches to iterate over each file entry in a Zip file and retrieve it. You should prefer ZipFile over ZipInputStream for iterating over each file in the archive.

It's also good to know that the java.util.zip package supports the GZIP file format as well, which means you can also read .gz files generated by the gzip command in UNIX from your Java program.
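As a brief illustration of the GZIP support, here is a sketch that reads a gzip-compressed text file line by line with java.util.zip.GZIPInputStream. The class name GzipLineReader and the method readLines are hypothetical, and the sketch assumes the compressed content is UTF-8 text.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

public class GzipLineReader {

    // Hypothetical helper: decompresses a .gz file and returns its lines
    static List<String> readLines(String gzPath) throws IOException {
        List<String> lines = new ArrayList<>();
        try (InputStream file = new FileInputStream(gzPath);
             InputStream gzip = new GZIPInputStream(file);
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(gzip, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

Unlike a Zip archive, a .gz file holds a single compressed stream, so there are no entries to iterate over.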

12 comments:

  1. Can we also compress more than one text file with the help of this API?

  2. nice article, but last words are - "It's also good to know that java.uti.zip package".. Some misprint here in package name

  3. @Of al..., yes you can compress more than one text file using this API. It allows you to create a ZIP file from a directory, which may contain sub-directories. You can even use WinZip, WinRAR and 7-Zip to open Zip files created in Java. In UNIX, you can use zip command on its own or gunzip command.

  4. @Lanqu is right, there seems to be a little L is missing there, it should be java.util.zip package :-) , looks like @Lanqu has got really good observation skills.

  5. I have an issue here, I was trying to create a big zip file to backup lots of images and video files using Java but I am getting java.lang.OutOfMemoryError again? I tried to increase heap space but it didn't help, any idea?

  6. You can use Zip4j library to easily create zip file, extract a single or all files from zip archive, split them just like Winzip and many more. It's very useful open source library to deal with ZIP file in Java. I prefer to use this library instead of writing code every time.

  7. can you please provide code for creating zip files inside another zip file

  8. Hello @Anonymous, a zip file is just another file; try the same program and just ask it to archive a zip file.

  9. "You can use Zip4j library to easily create zip file, extract a single or all files from zip archive, split them just like Winzip and many more."

    Good advice!

  10. is there any way to read nested zip file

  11. Did you try with loop? I mean flatten the file and check if the flattened file is zip again?

  12. I'm getting this "Error: Could not find or load main class ZipFileReader"
