Q

Can I extract bytes from a large ZIP file to search for a string?

I have to develop a service that connects to an FTP server and retrieves a specific ZIP file, which is extremely large. The ZIP file contains only one compressed text file. I need to connect to the FTP server and search the file's contents in memory, without extracting it to disk. Would it be feasible to extract only a small number of bytes into a buffer and search for the string directly, and, if it doesn't match, continue by reading the next chunk of data into the temporary buffer? The available memory is extremely low compared to the size of the file.

A number of packages, both commercial and open source, provide programmatic FTP retrieval and ZIP file manipulation. However, the java.util.zip package in the standard JDK, available at http://java.sun.com/j2se, has built-in support for reading ZIP files. Its ZipInputStream class decompresses data as it is read, so an entry can be processed a chunk at a time instead of being loaded whole.

It is worth taking some time to become familiar with the java.util.zip package. The following example uses the java.util.zip classes to open a ZIP file and read its entries one at a time. It creates the given directory if it does not exist, but writes no files there: each entry's bytes are read into a memory buffer, where they can be operated on as needed:

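import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class Archive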
{
   private String fileName = "";

   public static void main(String[] args)
   {
      // Backslashes in Java string literals must be escaped
      Archive archive =
         new Archive("c:\\myfiles\\test.zip");
      boolean result = archive.extract("c:\\test");
      System.out.println("result = " + result);
   }

   public Archive(String fileName)
   {
      this.fileName = fileName;
   }

   public boolean extract(String directoryPath)
   {
      File dir = new File(directoryPath);
      if (dir.exists() == false)
      {
         if (dir.mkdir() == false)
         {
            System.out.println("Failed to create directory: "
                               + directoryPath);
            return false;
         }
      }

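      // try-with-resources closes both streams even if reading fails
      try (FileInputStream inStream = new FileInputStream(fileName);
           ZipInputStream zInStream = new ZipInputStream(inStream))
      {
         ZipEntry entry;
         int count = 0;
         // Do not call getNextEntry() before the loop, or the first
         // entry is skipped
         while ((entry = zInStream.getNextEntry()) != null)
         {
            if (entry.isDirectory() == false)
            {
               System.out.println("reading entry "
                                  + (++count) + ": "
                                  + entry.getName());
               // Read in fixed-size chunks: getSize() can return -1,
               // and a single read() may not fill the buffer
               byte[] buffer = new byte[8192];
               int n;
               while ((n = zInStream.read(buffer, 0, buffer.length)) != -1)
               {
                  // do something with the first n bytes of buffer
               }
            }
            zInStream.closeEntry();
         }
      }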
      catch (IOException e)
      {
         System.out.println(e.toString());
         return false;
      }

      return true;
   }
}
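
The chunk-by-chunk search described in the question can be sketched along the following lines. This is only an illustration under stated assumptions, not a tested production routine: the class name ZipSearch, the method contains, and the chunkSize parameter are hypothetical names introduced here, the text is assumed to be in a single-byte encoding (ISO-8859-1), and chunkSize is assumed to be at least 1. The in parameter can be any InputStream, for example one obtained from whichever FTP client you use. To catch a match that straddles two chunks, the last (pattern length - 1) bytes of each chunk are carried over to the front of the next one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipSearch
{
   // Returns true if any file entry in the ZIP stream contains needle.
   // Holds roughly chunkSize + needle.length() bytes in memory at a time.
   // Assumes chunkSize >= 1 and a single-byte text encoding.
   public static boolean contains(InputStream in, String needle, int chunkSize)
      throws IOException
   {
      byte[] pattern = needle.getBytes(StandardCharsets.ISO_8859_1);
      if (pattern.length == 0) return true;
      int overlap = pattern.length - 1;
      byte[] window = new byte[overlap + chunkSize];
      try (ZipInputStream zin = new ZipInputStream(in))
      {
         ZipEntry entry;
         while ((entry = zin.getNextEntry()) != null)
         {
            if (entry.isDirectory()) continue;
            int kept = 0; // bytes carried over from the previous chunk
            int n;
            while ((n = zin.read(window, kept, window.length - kept)) != -1)
            {
               int filled = kept + n;
               if (indexOf(window, filled, pattern) >= 0) return true;
               // Keep the tail so a match spanning two chunks is found
               kept = Math.min(overlap, filled);
               System.arraycopy(window, filled - kept, window, 0, kept);
            }
         }
      }
      return false;
   }

   // Naive search for pattern within the first length bytes of data
   private static int indexOf(byte[] data, int length, byte[] pattern)
   {
      for (int i = 0; i + pattern.length <= length; i++)
      {
         int j = 0;
         while (j < pattern.length && data[i + j] == pattern[j]) j++;
         if (j == pattern.length) return i;
      }
      return -1;
   }

   public static void main(String[] args) throws IOException
   {
      // Build a tiny ZIP in memory so the sketch runs without any files
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ZipOutputStream zout = new ZipOutputStream(bytes))
      {
         zout.putNextEntry(new ZipEntry("big.txt"));
         zout.write("alpha beta gamma delta"
                       .getBytes(StandardCharsets.ISO_8859_1));
         zout.closeEntry();
      }
      byte[] zip = bytes.toByteArray();
      // A 4-byte chunk forces matches to span chunk boundaries
      System.out.println(contains(new ByteArrayInputStream(zip), "gamma", 4));
      System.out.println(contains(new ByteArrayInputStream(zip), "omega", 4));
   }
}
```

In practice a larger chunk size, such as a few kilobytes, would be a more sensible choice; the tiny value in main is there only to exercise the boundary handling.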

This was first published in August 2002
