Genotrance

Random thoughts, ideas and experiences

Py2Exe: Zlib not available

Recently, I released ClearAxis, a simple application that allows you to configure the Araxis Merge utility as the default diff tool within Rational ClearCase. The ClearCase diff tool is rather primitive and could really use some replacing.

ClearAxis is written in Python and is packaged using Py2Exe so that any end user can run it without having to install Python. More importantly, ClearCase can only invoke executables from its map file, and running python.exe with command line arguments does not work. As a result, ClearAxis had to be a standalone executable.

While developing ClearAxis, I ran into a problem with Py2Exe that I had encountered earlier with Clump as well. If you run the generated .exe from the directory it is located in, everything works fine. If you specify the full path to the .exe on the command line, that works too. However, when ClearAxis was executed through the ClearCase map file, you’d get the following error in the .exe.log file:

Traceback (most recent call last):
  File "clearaxis.py", line 1, in ?
zipimport.ZipImportError: can't decompress data; zlib not available

With Clump, this error occurred when I added the Clump directory to the system path and tried running clump.exe from any directory. Working around this for Clump wasn’t a huge problem. I created a batch file in the Windows directory which resolved the full path to the .exe. So I didn’t need to add Clump to the system path, nor did I need to figure out this Zlib problem.
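To make the batch file workaround concrete, here is a rough sketch that generates such a launcher. The install path C:\Tools\Clump\clump.exe and the function name make_launcher are made up for illustration; the real batch file was not published in this post.

```python
import ntpath  # Windows path rules, usable on any platform


def make_launcher(exe_path):
    """Return the text of a batch file that runs exe_path by its full path."""
    if not ntpath.isabs(exe_path):
        raise ValueError("launcher needs a fully resolved path: %r" % exe_path)
    # %* forwards every command-line argument to the real executable.
    return '@echo off\r\n"{0}" %*\r\n'.format(exe_path)


# Hypothetical install location, used only as an example.
print(make_launcher(r"C:\Tools\Clump\clump.exe"))
```

Saving the returned text as clump.bat in a directory that is already on the system path gives the same effect as described above: the .exe is always started through its full path.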

With ClearAxis, this workaround wasn’t viable. I searched the web for several hours looking for solutions before I gave up and began digging through the Py2Exe code instead. There are multiple solutions to this issue using combinations of Py2Exe setup flags. The choice is a trade-off between the size of the resulting application and the number of files in the application directory. It also depends on how many executables Py2Exe has to compile for your application.

Py2Exe Default

For reference, here are the default settings in Py2Exe as a baseline. This is a good option if you run your application by resolving its full path, as in a shortcut or batch file.

zipfile = "shared.lib"
compressed = 0

If the .exe path is not fully resolved, the Zlib error occurs.
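For anyone unsure where these settings go, here is a minimal sketch of how they might appear in a setup.py. It assumes the usual Py2Exe convention that zipfile is a keyword argument to setup() while compressed lives in the "py2exe" options dictionary. The script name is taken from the traceback above, but building it as a console target is an assumption, and this is not the actual ClearAxis build script.

```python
def default_setup_kwargs(script="clearaxis.py"):
    """Return setup() keyword arguments for the baseline configuration."""
    return {
        # Assumption: the script is built as a console executable.
        "console": [script],
        # Name of the shared archive that holds the Python libraries.
        "zipfile": "shared.lib",
        # Do not ask Py2Exe to compress the archive.
        "options": {"py2exe": {"compressed": 0}},
    }


# In a real setup.py (Windows, with Py2Exe installed) this would be used as:
#     from distutils.core import setup
#     import py2exe
#     setup(**default_setup_kwargs())
print(default_setup_kwargs())
```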

Method 1

In this method, the Python libraries and required DLLs are compressed and bundled into the .exe itself. Since Zlib is bundled into the .exe, we can now decompress the libraries and the Zlib error goes away.

zipfile = None
compressed = 1
bundle_files = 2

This approach works well for an application with a single executable. For an application with multiple executables, this method increases the total size considerably since none of the common Python libraries and DLLs are shared. Instead, they are redundantly packaged into each executable.

Method 2

In this method, only the Python libraries are bundled into the executable. We can’t turn on compression since the Zlib DLL is not bundled. If we turn on compression, the Zlib error comes back since the libraries still need to be decompressed and Zlib cannot be found.

zipfile = None
compressed = 0

Like Method 1, this approach also increases the total size if multiple executables need to be compiled, though not as much since the DLLs are still shared.

Method 3

In this method, all the Python libraries and the required DLLs are stored on the file system without any compression or packaging. As a result, no decompression needs to be performed and our Zlib error goes away.

zipfile = "shared.lib"
compressed = 0
skip_archive = 1

The issue with this approach is that we end up with more than 200 files spread across more than 15 directories in the resulting application. Not too pretty.
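The three methods differ only in a handful of flags, so they can be kept side by side in one lookup table. The sketch below assumes the standard Py2Exe layout (zipfile as a setup() keyword; compressed, bundle_files and skip_archive under the "py2exe" options). The helper name setup_kwargs and the script names in the example are hypothetical.

```python
# Settings for each method, as (zipfile value, py2exe options) pairs.
METHODS = {
    1: (None, {"compressed": 1, "bundle_files": 2}),          # all in the .exe
    2: (None, {"compressed": 0}),                             # libraries in the .exe
    3: ("shared.lib", {"compressed": 0, "skip_archive": 1}),  # loose files
}


def setup_kwargs(method, console=(), windows=()):
    """Build setup() keyword arguments for one of the three methods."""
    zipfile, py2exe_options = METHODS[method]
    kwargs = {
        "zipfile": zipfile,
        "options": {"py2exe": dict(py2exe_options)},
    }
    if console:
        kwargs["console"] = list(console)   # console executables to build
    if windows:
        kwargs["windows"] = list(windows)   # GUI executables to build
    return kwargs


# Hypothetical script names for an application with a console and a GUI front end.
print(setup_kwargs(2, console=["app_console.py"], windows=["app_gui.py"]))
```

Switching methods then becomes a one-character change in the build script, which makes it easy to compare the resulting sizes and file counts.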

Comparison

The above methods give different results as far as file size and directory structure are concerned. Here’s what it looks like for Clump, which has a single executable. As you can see, Method 1 gives the best bang for the buck.

Method     Size (bytes)    Files
Default    4,299,454       15
1          4,202,393       5
2          5,560,562       15
3          5,542,032       > 200

As for AppSnap, which has two executables (one for the console and one for the GUI), the results are more interesting. Do you choose size or a clean directory structure?

Method     Size (bytes)    Files
Default    5,051,330       21
1          9,106,298       9
2          8,796,440       21
3          6,584,139       > 200

Note: All sizes above are calculated after compressing all executables and DLLs with UPX and recompressing shared.lib with 7Zip at the maximum level. Check out the Py2Exe wiki under the 7Zip and UPX section for more information.
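To put the two tables on a common footing, the sizes can be expressed as a percentage change from the default configuration. The numbers below are copied from the tables above; the function name percent_vs_default is just for illustration.

```python
# Total sizes from the two comparison tables above.
SIZES = {
    "Clump": {"Default": 4299454, "1": 4202393, "2": 5560562, "3": 5542032},
    "AppSnap": {"Default": 5051330, "1": 9106298, "2": 8796440, "3": 6584139},
}


def percent_vs_default(app, method):
    """Size change of a method relative to the default, in percent."""
    base = SIZES[app]["Default"]
    return 100.0 * (SIZES[app][method] - base) / base


for app in ("Clump", "AppSnap"):
    for method in ("1", "2", "3"):
        change = percent_vs_default(app, method)
        print("%-8s Method %s: %+.1f%%" % (app, method, change))
```

For Clump, Method 1 comes out about 2% smaller than the default, while for AppSnap the same method is roughly 80% larger, which is the cost of packaging the shared libraries into each executable.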

Conclusion

If you have one executable in your application, Method 1 seems to be the best option. For two to three executables, Method 2 seems to be a decent middle ground. For four or more executables, Method 3 is the most sensible since the redundancy is no longer justified. All of this applies only if you expect to run your application from the command line in a directory other than its own. Otherwise, the Py2Exe default gives very acceptable results.
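The rule of thumb above is simple enough to write down as a function. This is only a sketch of the guidance in this post: the function name and arguments are made up for illustration, and sending exactly four executables to Method 3 is one reading of the boundary.

```python
def suggest_method(num_executables, runs_from_other_directory=True):
    """Suggest a Py2Exe configuration based on the rule of thumb above."""
    if num_executables < 1:
        raise ValueError("need at least one executable")
    if not runs_from_other_directory:
        # The Zlib error never shows up, so the baseline is good enough.
        return "Default"
    if num_executables == 1:
        return "Method 1"
    if num_executables <= 3:
        return "Method 2"
    # Assumption: four executables are treated the same as five or more.
    return "Method 3"


for count in (1, 2, 4):
    print(count, suggest_method(count))
```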

Which method to choose depends greatly on your application’s requirements. ClearAxis uses Method 2 for now but will probably move to Method 1 in the next release to take advantage of the smaller total size. As for AppSnap, I don’t care about executing it from a directory other than its own, so it is using the default method for now.

Take your pick. At least there’s a solution.

Filed under: Programming
