
C# (.net) interface for 7-Zip archive dlls

Here you can download a C# translation of the interfaces exposed by the 7-Zip archive format dlls. This article was also posted on Code Project, where it won The Code Project Best C# article of June 2008.

About 7-Zip

7-Zip is an open-source archiver with a plug-in interface. New archive formats and/or codecs can be added as dlls. 7-Zip ships with several archive formats preinstalled:

  • 7z - its own format, which features good compression (LZMA, PPMd) but can be slow at packing and unpacking
  • Packing / unpacking: ZIP, GZIP, BZIP2 and TAR
  • Unpacking only: RAR, CAB, ISO, ARJ, LZH, CHM, Z, CPIO, RPM, DEB and NSIS

The project is written in C++.

You can find more on the official 7-Zip site, www.7-zip.org.

About this contribution

This contribution allows you to use the 7-Zip archive format dlls in programs written in .net languages.

I created this module for my own project, which needs to work with archives. Currently my project only extracts archives, so only that part of the 7-Zip interface is translated to C#. I plan to translate the compression part later. If you need that functionality right now, you can implement it yourself with this code and the 7-Zip source code.

This translation is tested and already works in my own project.

Implementation details

All communication with the archive dlls is done through com-like interfaces (for why they are com-like and not com, see the Known issues section). Callbacks are also implemented as interfaces.

Every dll contains classes that implement one or more interfaces. Some formats allow only extracting, others also provide compression. The following public interfaces are translated to C#:

  • IProgress - basic progress callback
  • IArchiveOpenCallback - archive open callback
  • ICryptoGetTextPassword - callback that prompts for the archive password
  • IArchiveExtractCallback - callback for extracting files from an archive
  • IArchiveOpenVolumeCallback - callback for opening additional archive volumes
  • ISequentialInStream - simple read-only stream interface
  • ISequentialOutStream - simple write-only stream interface
  • IInStream - input stream interface with seek capability
  • IOutStream - output stream interface
  • IInArchive - main archive interface

Every dll exports a function for creating an archive handler object and a function for getting archive format properties. These functions are translated as .net delegates:

  • CreateObject - creates an object with the given class id. Used mostly to create an IInArchive instance.
  • GetHandlerProperty - gets the archive format description (implemented class ids, default archive extension, etc.)
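As a rough illustration, the two exports could be declared as delegates along the following lines. This is a minimal sketch, not the declarations shipped in the download: the class name `SevenZipExports` is made up for this example, the parameter types follow the C++ exports as I understand them (`HRESULT CreateObject(const GUID *clsID, const GUID *iid, void **outObject)` and `HRESULT GetHandlerProperty(PROPID propID, PROPVARIANT *value)`), and the variant is passed as a plain IntPtr.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical container class; the name is an assumption made for this sketch.
public static class SevenZipExports
{
    // Assumed native shape: HRESULT CreateObject(const GUID*, const GUID*, void**).
    // The created object is returned as a raw interface pointer.
    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    public delegate int CreateObjectDelegate(
        [In] ref Guid classId,
        [In] ref Guid interfaceId,
        out IntPtr outObject);

    // Assumed native shape: HRESULT GetHandlerProperty(PROPID, PROPVARIANT*).
    // "value" points to caller-allocated memory that receives the variant.
    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    public delegate int GetHandlerPropertyDelegate(int propId, IntPtr value);
}
```

Both delegates return the raw HRESULT, so the caller decides whether to check it by hand or pass it to Marshal.ThrowExceptionForHR.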

Update (1.3): 7-Zip 4.45 changed the dll interface. All archive formats and compression codecs are now implemented in one big dll, so several new exported functions (and matching delegates in this translation) were added to handle several archive handler classes in one dll.

Extracting algorithm

  1. Load the 7z.dll library
  2. Get the CreateObject function (use CreateObjectDelegate)
  3. Call CreateObject with the GUID of the required archive format; the function returns an interface, which you cast to IInArchive
  4. Open an existing archive with IInArchive.Open (you can optionally provide an IArchiveOpenCallback; note that some formats require it and some do not)
  5. Examine the archive content and create a list of the file numbers to extract (the numbers of the files inside the archive)
  6. Call IInArchive.Extract with the file numbers and provide an IArchiveExtractCallback
  7. For each file to extract, 7z.dll calls IArchiveExtractCallback.GetStream; provide a destination file stream for each file
  8. Optionally, implement the other IArchiveExtractCallback functions to show progress, clean up, etc.
  9. Close IInArchive and the existing archive stream
  10. Unload the 7z.dll library
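Steps 1 to 3 and step 10 can be sketched roughly as follows. This is a minimal Windows-only sketch under stated assumptions: `SevenZipLoader` and `WithArchiveHandler` are names invented for this example, the two GUIDs are the zip handler class id and the IInArchive interface id as I recall them from the 7-Zip headers (verify them against the 7-Zip source before relying on them), and the generic `Marshal.GetDelegateForFunctionPointer<T>` overload requires .net 4.5.1 or later.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

// Hypothetical helper; names are assumptions made for this sketch.
public static class SevenZipLoader
{
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr LoadLibrary(string fileName);

    [DllImport("kernel32.dll", CharSet = CharSet.Ansi, ExactSpelling = true, SetLastError = true)]
    private static extern IntPtr GetProcAddress(IntPtr module, string procName);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool FreeLibrary(IntPtr module);

    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    private delegate int CreateObjectDelegate(
        [In] ref Guid classId, [In] ref Guid interfaceId, out IntPtr outObject);

    // Assumed values, recalled from the 7-Zip headers; please double-check them.
    public static readonly Guid ZipHandlerClassId =
        new Guid("23170F69-40C1-278A-1000-000110010000");
    public static readonly Guid InArchiveInterfaceId =
        new Guid("23170F69-40C1-278A-0000-000600600000");

    public static void WithArchiveHandler(string dllPath, Guid classId, Action<IntPtr> useHandler)
    {
        IntPtr module = LoadLibrary(dllPath);                      // step 1
        if (module == IntPtr.Zero) throw new Win32Exception();
        try
        {
            IntPtr proc = GetProcAddress(module, "CreateObject");  // step 2
            if (proc == IntPtr.Zero) throw new Win32Exception();
            var createObject =
                Marshal.GetDelegateForFunctionPointer<CreateObjectDelegate>(proc);

            Guid interfaceId = InArchiveInterfaceId;
            IntPtr handler;
            int hr = createObject(ref classId, ref interfaceId, out handler); // step 3
            Marshal.ThrowExceptionForHR(hr);
            try
            {
                useHandler(handler);                               // steps 4 to 9
            }
            finally
            {
                Marshal.Release(handler);                          // drop our reference
            }
        }
        finally
        {
            FreeLibrary(module);                                   // step 10
        }
    }
}
```

A call might look like `SevenZipLoader.WithArchiveHandler("7z.dll", SevenZipLoader.ZipHandlerClassId, handler => { /* open, list, extract */ });`. The try/finally blocks make sure the handler is released and the library unloaded even when extraction throws.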

Packing algorithm

  1. tbd

Points of interest

7-Zip interfaces use variants (PropVariant) for property values. C# has no built-in type for such variants, so all such parameters are declared in C# as IntPtr. This is done for compatibility and because I prefer not to use unsafe code in my projects.

Fortunately, the managed class System.Runtime.InteropServices.Marshal has a GetObjectForNativeVariant method that converts such "pointers" to objects. However, this method does not handle all PropVariant types (for example VT_FILETIME); for these cases I added my own GetObjectForNativeVariant method to this translation.
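To show the idea, here is a minimal sketch of such a helper. It is not the code from the download: the class name `NativeVariantReader` is made up, and it only relies on the documented PROPVARIANT layout (a 16-bit type tag at offset 0, the value at offset 8) and on VT_FILETIME being 64. Everything it does not handle itself is passed on to Marshal.GetObjectForNativeVariant.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper class; the name is an assumption made for this sketch.
public static class NativeVariantReader
{
    private const ushort VT_EMPTY = 0;
    private const ushort VT_FILETIME = 64;
    private const int ValueOffset = 8;   // the value union starts after vt + 3 reserved words

    public static object GetObjectForNativeVariant(IntPtr variant)
    {
        // The first two bytes hold the variant type tag.
        ushort vt = (ushort)Marshal.ReadInt16(variant);
        switch (vt)
        {
            case VT_EMPTY:
                return null;
            case VT_FILETIME:
                // FILETIME is a 64-bit count of 100 ns intervals since 1601-01-01 UTC.
                long fileTime = Marshal.ReadInt64(variant, ValueOffset);
                return DateTime.FromFileTimeUtc(fileTime);
            default:
                // Let the framework handle the types it already understands.
                return Marshal.GetObjectForNativeVariant(variant);
        }
    }
}
```

Reading the fields with Marshal.ReadInt16 and Marshal.ReadInt64 keeps the helper free of unsafe code.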

7-Zip works with files through its own interfaces, so to open a file on disk or in memory you need to provide a class that implements the necessary interfaces. Several such wrapper classes are included in this translation (they wrap the standard .net Stream class).

Update (1.2): Most of the complexity of PropVariant processing is now hidden in a dedicated PropVariant structure, and interface methods now return PropVariant instead of IntPtr.
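A wrapper of this kind might look roughly like the sketch below. It is an illustration, not the wrapper from the download: the class name `InStreamWrapper` is made up for this example, and the interface declaration (GUID and Read signature, with the native processedSize out-parameter mapped to the return value) is written from my reading of the 7-Zip headers and should be checked against them.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

// Assumed declaration of 7-Zip's ISequentialInStream; verify against IStream.h.
[ComImport]
[Guid("23170F69-40C1-278A-0000-000300010000")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface ISequentialInStream
{
    // Returns the number of bytes actually read (0 means end of stream).
    uint Read(
        [Out, MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)] byte[] data,
        uint size);
}

// Hypothetical wrapper that exposes any readable .net Stream to 7-Zip.
public sealed class InStreamWrapper : ISequentialInStream, IDisposable
{
    private readonly Stream _stream;

    public InStreamWrapper(Stream stream)
    {
        if (stream == null) throw new ArgumentNullException("stream");
        _stream = stream;
    }

    public uint Read(byte[] data, uint size)
    {
        // Never read more than the buffer can hold.
        int count = (int)Math.Min(size, (uint)data.Length);
        return (uint)_stream.Read(data, 0, count);
    }

    public void Dispose()
    {
        _stream.Dispose();
    }
}
```

Because the wrapper only depends on Stream, the same class serves a FileStream for archives on disk and a MemoryStream for archives held in memory.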

Known issues

The first and most disappointing issue is that you cannot use the 7-Zip dlls directly. That is, you cannot simply take the dlls from the 7-Zip distribution and use them in your projects. This is because of the incomplete COM interface implementations in the 7-Zip code. All the problems are related to the IUnknown.QueryInterface implementation. 7-Zip's QueryInterface does not return the IUnknown interface when asked for it (this is the most critical part for working with com interfaces in .net), and some classes do not return any interface at all!

This is because 7-Zip is C++ code that works with pointers, and most functions return direct pointers to interface implementations. As a result, the 7-Zip code does not use QueryInterface at all. Sadly, .net works differently: the first access to any interface always goes through QueryInterface and IUnknown.

So if we use the dlls directly, we constantly get InvalidCastException. We need to make several changes to the 7-Zip code and rebuild the dlls, or ask Igor Pavlov to include such changes in the 7-Zip code itself :)

Important Update: Starting from 7-Zip 4.46 alpha, Igor made the necessary changes in the code. So, from this version forward, you can use the format dlls directly, without applying any patch. Superb!

The second issue is much smaller and is related to multi-threading. If you plan to use the 7-Zip interfaces from only one thread, you have no problem. The problem appears when you try to use one interface from several threads. In this case every thread except the main one (the thread where the interface was created) throws an exception on any interface method call. This is because of RCW behavior. An RCW is an object that wraps a COM interface in .net. When you try to use the interface in a different thread, the RCW tries to marshal the interface and fails (because this implementation does not support ITypeInfo).

Fortunately, I've found a simple solution for this. The main interface (IInArchive) is returned as an IntPtr, not as an RCW object. When you need to access the interface, call System.Runtime.InteropServices.Marshal.GetTypedObjectForIUnknown or a related method to get an RCW object. If you need to use the interface in another thread, simply call System.Runtime.InteropServices.Marshal.FinalReleaseComObject (or ReleaseComObject) and create another RCW wrapper around the returned IntPtr. Of course, in this case you can use the interface in only one thread at a time, but this is better than being tied to a single thread. Any further logic can be implemented easily with correct thread locking.
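One way to package this pattern is sketched below. It is only an illustration of the approach described above, under the assumption that a fresh RCW is created for every call: the class name `PerCallRcw` and its `Use` method are invented for this example, while Marshal.GetTypedObjectForIUnknown and Marshal.FinalReleaseComObject are the framework methods named in the text. The sketch is Windows-only, and the caller still owns the raw pointer and must release it when finished.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: wraps a raw interface pointer and hands out a
// short-lived RCW to whichever thread asks, one thread at a time.
public sealed class PerCallRcw<TInterface> where TInterface : class
{
    private readonly IntPtr _unknown;              // raw pointer, owned by the caller
    private readonly object _gate = new object();  // serializes access across threads

    public PerCallRcw(IntPtr unknown)
    {
        _unknown = unknown;
    }

    public TResult Use<TResult>(Func<TInterface, TResult> action)
    {
        lock (_gate)
        {
            // Create the RCW on the calling thread, so no cross-thread
            // marshaling of the interface is attempted.
            var rcw = (TInterface)Marshal.GetTypedObjectForIUnknown(
                _unknown, typeof(TInterface));
            try
            {
                return action(rcw);
            }
            finally
            {
                // Detach the wrapper so the next thread can build its own.
                Marshal.FinalReleaseComObject(rcw);
            }
        }
    }
}
```

Creating and releasing a wrapper per call costs a little, but it keeps the locking and the wrapper lifetime in one place, so callers cannot forget either.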

The third is a well-known issue, but I think it should still be noted here. It appears that the .net runtime does not support inheritance of com interfaces (interfaces marked with the ComImport attribute). This is definitely a .net bug, but I don't know when Microsoft will fix it, or whether it will be fixed at all.

There is a simple workaround. The derived interface must be declared as a standalone one, and its first methods must be the methods of the base interfaces, in order of appearance. You can see an example of such "inheritance" in the source of this translation.
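For instance, IInStream derives from ISequentialInStream in the 7-Zip headers, so a flattened declaration could look like the sketch below. The GUIDs and method signatures are written from my reading of the 7-Zip headers and are assumptions to verify, not a copy of the declarations in the download.

```csharp
using System;
using System.Runtime.InteropServices;

// Base interface, declared as usual.
[ComImport]
[Guid("23170F69-40C1-278A-0000-000300010000")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface ISequentialInStream
{
    uint Read(
        [Out, MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)] byte[] data,
        uint size);
}

// "Derived" interface: deliberately NOT declared as ": ISequentialInStream".
[ComImport]
[Guid("23170F69-40C1-278A-0000-000300030000")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IInStream
{
    // Methods of the base interface come first, in their original order,
    // so the vtable slots line up with the native layout.
    uint Read(
        [Out, MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)] byte[] data,
        uint size);

    // Methods introduced by IInStream itself follow.
    // newPosition may be IntPtr.Zero when the caller does not need the result.
    void Seek(long offset, uint seekOrigin, IntPtr newPosition);
}
```

The price of this workaround is that a managed object implementing IInStream is not automatically an ISequentialInStream; a class that should serve as both has to list both interfaces explicitly.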

Demo

Due to many requests, I have spent some time and written a little demo program. The demo lacks proper error checking and support for different archive formats (the zip format is hardcoded in the source, but can be easily changed); it lacks almost everything, but it has two advantages: it's simple, and it works.

The demo has only two modes: the first lists all files in an archive, the second extracts a single file from an archive. I think this is enough to understand how to use the 7-Zip interfaces and how to create something more complex.

If you want to run the demo, don't forget to put 7z.dll (available from the official 7-Zip site) in the same folder as the executable.

Update (2.0): The demo can now create simple archives. Some issues found by CodeProject community users were also fixed in this demo.

Version history

2.0 - Packing support introduced, added support for the latest 7-zip version (4.60+), demo updated to show basic packing principles
1.5 - Small demo added
1.3 - Added two new delegates for features added in 7-Zip 4.45
1.2 - Variant type changed from IntPtr to newly created PropVariant structure
1.1 - Stream wrappers added, minor interface translation changes for better usability
1.0 - initial release

Downloads

7zIntf20.zip | mirror (Slow)
module with 7-Zip interfaces translated to c#

7z445bPatch.zip (temporarily removed due to Google bandwidth limitations)
Patched files from 7-Zip 4.45 beta that correctly return the IUnknown interface and contain several bug fixes (all related to QueryInterface; some archive handlers did not return any interface at all!)

7-Zip 4.46 beta sources on SourceForge
