The gzip home page

Important security patch

gzip 1.2.4 may crash when an input file name is too long (over 1020 characters). The buffer overflow may be exploited if gzip is run by a server such as an ftp server. Some ftp servers allow compression and decompression on the fly and are thus vulnerable. See technical details here. This patch to gzip 1.2.4 fixes the problem. The beta version 1.3.3 already includes a sufficient patch; use this version if you have to handle files larger than 2 GB. A new official version of gzip will be released soon.

Introduction

gzip (GNU zip) is a compression utility designed to be a replacement for compress. Its main advantages over compress are much better compression and freedom from patented algorithms. It has been adopted by the GNU project and is now relatively popular on the Internet. gzip was written by Jean-loup Gailly (jloup@gzip.org); Mark Adler wrote the decompression code.

gzip produces files with a .gz extension. gunzip can decompress files created by gzip, compress or pack. The detection of the input format is automatic.
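
Automatic detection of this kind usually relies on the first two bytes of the input. The following Python sketch is only an illustration: the function name detect_format is hypothetical, and the magic numbers (1f 8b for gzip, 1f 9d for compress, 1f 1e for pack) are the commonly documented values, not something stated on this page.

```python
# Commonly documented two-byte signatures; treated as assumptions here.
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"\x1f\x9d": "compress",
    b"\x1f\x1e": "pack",
}

def detect_format(data: bytes) -> str:
    """Return the format name suggested by the first two bytes of data."""
    return MAGIC.get(data[:2], "unknown")

print(detect_format(b"\x1f\x8b\x08\x00"))
```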

The format of the .gz files generated by gzip is described in RFCs (Request For Comments) 1951 and 1952. Some additional information on the gzip format is given here. A brief description of the compression and decompression algorithms used by gzip is given here. A more informal introduction written by Antaeus Feldspar is given here.
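
As a small illustration of the format, here is a Python sketch that decodes the fixed 10-byte header described in RFC 1952. It uses Python's standard gzip module to produce a sample stream; the function name parse_gzip_header is hypothetical and the sketch ignores the optional header fields.

```python
import gzip
import struct

def parse_gzip_header(data: bytes) -> dict:
    """Decode the fixed 10-byte gzip header (RFC 1952, little-endian)."""
    if len(data) < 10:
        raise ValueError("need at least 10 bytes")
    magic, method, flags, mtime, xfl, os_id = struct.unpack("<2sBBIBB", data[:10])
    if magic != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    return {"method": method, "flags": flags, "mtime": mtime,
            "xfl": xfl, "os": os_id}

# mtime=0 keeps the sample stream reproducible.
header = parse_gzip_header(gzip.compress(b"hello, gzip\n", mtime=0))
print(header["method"])  # 8 is the deflate method of RFC 1951
```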

If you have a question about gzip, look first for an answer on this page. If you don't find it, write to support@gzip.org. Please give as much information as possible: at least the name of your operating system (Windows XP, Linux...), the exact command that you typed, and the exact error messages that you got. If you just say "gzip doesn't work", I cannot provide any help.

Sources

The gzip sources, written in C, are available here in various formats. To download one of these files, hold the shift key and click on its link. You can also get these files from many mirror sites.

To extract gzip-1.2.4.tar and compile the sources on Unix systems, do:

    tar xvf gzip-1.2.4.tar
    cd gzip-1.2.4
    ./configure
    make

To extract .tar and .tar.gz files on Windows 9x/NT/2000/ME/XP, use PowerArchiver 6.1 (freeware), 7-zip (freeware) or Winzip (commercial). For tar on MSDOS or other systems, see the FAQ of the comp.compression newsgroup.

On several systems, compiler bugs cause gzip to fail, in particular when optimization options are on. See the section "Special targets" at the end of the INSTALL file for a list of known problems. For all machines, use make check to check that gzip was compiled correctly. Try compiling gzip without any optimization if you have a problem.

The gzip user manual is available here.

Executables

Executables for various systems are available here:

For Linux or BeOS, gzip is already on your system.

To extract tar.Z files on Unix systems, do:

    zcat file.tar.Z | tar xvf -

Frequently Asked Questions

gunzip complains about corrupted data or a CRC error

99.9% of the problems with gzip are due to file transfers done in ASCII mode instead of BINARY mode. In particular, gopher is known to corrupt binary files by considering them as ASCII. Make sure that your local copy of the file has exactly the same byte size as the original.

If you have transferred a file in ASCII mode and you no longer have access to the original, you can try the program fixgz to remove the extra CR (carriage return) bytes inserted by the transfer. A Windows 9x/NT/2000/ME/XP binary is here. But there is absolutely no guarantee that this will actually fix your file. Conclusion: never transfer binary files in ASCII mode. To compile fixgz and run it, do:

    cc -o fixgz fixgz.c
    ./fixgz bad.gz fixed.gz
    gzip -tv fixed.gz
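
The kind of repair described above can be sketched in Python. This is not the fixgz source: it is a rough model which assumes that the transfer inserted a CR before every LF and changed nothing else. The names simulate_ascii_transfer and remove_inserted_cr are hypothetical.

```python
import gzip

def simulate_ascii_transfer(data: bytes) -> bytes:
    """Hypothetical model of an ASCII-mode transfer: every LF becomes CR LF."""
    return data.replace(b"\n", b"\r\n")

def remove_inserted_cr(data: bytes) -> bytes:
    """Drop the CR of every CR LF pair, undoing the model above."""
    return data.replace(b"\r\n", b"\n")

payload = b"some text\n" * 200
original = gzip.compress(payload)
damaged = simulate_ascii_transfer(original)
repaired = remove_inserted_cr(damaged)

# Under the stated assumption, the repaired stream decompresses correctly.
assert gzip.decompress(repaired) == payload
```

A transfer that did something else, for example collapsing CR LF into LF, loses information that this approach cannot restore. That is one reason why there is no guarantee.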

gunzip complains about a multi-part gzip file

This is the same problem as above: a transfer not made in binary mode has corrupted the gzip header, thus fooling gunzip into emitting an incorrect error message. Transfer the file again in binary mode.

Where is gunzip for MSDOS or Windows?

The MSDOS gzip package contains a file README.DOS, please read it. In short:

     copy gzip.exe gunzip.exe

Is there a Windows interface for gzip?

PowerArchiver 6.1, 7-zip and Winzip include the gzip compression code and can decompress .gz and tar.gz files. Win-GZ can compress and decompress files in gzip format. Please note that gzip, 7-zip, PowerArchiver 6.1 and Win-GZ are freeware, but you must register Winzip and PowerArchiver versions later than 6.1 if you use them regularly.

Can I adapt the gzip sources to perform in-memory compression?

Use the zlib data compression library instead.
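
zlib is a C library, but the idea can be shown briefly with Python's standard zlib module, which wraps it. This is only a sketch of in-memory use, not a description of the C interface; the sample data and the chunk size are arbitrary.

```python
import zlib

data = b"gzip and zlib share the deflate algorithm. " * 100

# One-shot compression and decompression, entirely in memory.
packed = zlib.compress(data, 9)
assert zlib.decompress(packed) == data

# Streaming variant, for data that arrives in pieces.
compressor = zlib.compressobj()
chunks = [compressor.compress(data[i:i + 1024])
          for i in range(0, len(data), 1024)]
chunks.append(compressor.flush())
streamed = b"".join(chunks)
assert zlib.decompress(streamed) == data
```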

How can I extract a tar.gz or .tgz file?

Files with extension tar.gz or .tgz are tar files compressed with gzip. On Unix, extract them with:

    gunzip < file.tar.gz | tar xvf -
    gunzip < file.tgz    | tar xvf -

If you have GNU tar, you can use the z option directly:

    gtar xvzf file.tar.gz
    gtar xvzf file.tgz

For Windows 9x/NT/2000/ME/XP, use PowerArchiver 6.1, 7-zip (freeware) or Winzip (commercial).
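
If neither tar nor one of these tools is at hand, a scripting language can read the same files. The Python sketch below uses the standard tarfile module and keeps everything in memory; the function name read_tar_gz is hypothetical, and the sketch skips anything that is not a regular file.

```python
import io
import tarfile

def read_tar_gz(data: bytes) -> dict:
    """Return {member name: contents} for the regular files in a tar.gz."""
    contents = {}
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
        for member in archive:
            if member.isfile():
                contents[member.name] = archive.extractfile(member).read()
    return contents
```

Given the bytes of a file such as file.tar.gz, read_tar_gz returns a dictionary mapping each file name to its contents.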

gzip complains with Broken pipe

If you use the commands described above to extract a tar.gz file, gzip sometimes emits a Broken pipe error message. This can safely be ignored if tar extracted all files without any other error message.

The reason for this error message is that tar stops reading at the logical end of the tar file (a block of zeroes), which is not always the same as its physical end. gzip is then no longer able to write the rest of the tar file into the pipe, which has been closed.

This problem occurs only with some shells, mainly bash. These shells report the SIGPIPE signal to the user, but most others (such as tcsh) silently ignore the pipe error.

You can easily reproduce the same error message with programs other than gzip and tar, for example:

  cat /dev/zero | dd bs=1 count=1
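
The gap between the logical end and the physical end of a tar file can be made visible with a short Python sketch. It assumes the classic layout of 512-byte blocks with the file size stored as octal text at bytes 124 to 135 of each header; the function name logical_end is hypothetical, and the archive is built in memory with Python's tarfile module, which pads its output.

```python
import io
import tarfile

BLOCK = 512  # tar archives are made of 512-byte blocks

def logical_end(archive: bytes) -> int:
    """Offset of the first all-zero header block, where a reader may stop."""
    offset = 0
    while offset + BLOCK <= len(archive):
        header = archive[offset:offset + BLOCK]
        if header == bytes(BLOCK):
            return offset
        # The size field is octal text at bytes 124..135 of the header.
        size = int(header[124:136].strip(b"\0 ") or b"0", 8)
        data_blocks = (size + BLOCK - 1) // BLOCK
        offset += BLOCK * (1 + data_blocks)
    return len(archive)

# Build a one-file archive in memory; tarfile pads it with zero blocks.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    info = tarfile.TarInfo("hello.txt")
    info.size = 6
    tar.addfile(info, io.BytesIO(b"hello\n"))
archive = buf.getvalue()

print(logical_end(archive), len(archive))
```

Everything between the two printed offsets is zero padding. A reader that stops at the first offset and closes the pipe leaves the writer with data it can no longer deliver.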

gzip complains with trailing garbage ignored

Some tar.gz files are padded with zeroes so that their size is a multiple of a certain block size. This occurs in particular when the compressed tar file is on a device such as a magnetic tape. When such files are extracted with commands such as

    gunzip < file.tar.gz | tar xvf -
    gtar xvzf /dev/rmt/0

gunzip correctly decompresses the tar.gz file, then attempts to decompress the rest of the input, which consists of zeroes. Since those zeroes are not in gzip format, gzip ignores them. The tar extract command still works correctly, since gzip has sent through the pipe all the data that tar needs.

You can avoid this harmless warning by using the -q option of gzip, as in:

    gunzip -q < file.tar.gz | tar xvf -
    GZIP=-q           gtar xvzf /dev/rmt/0         # for bash, ksh, sh ...
    (setenv GZIP -q;  gtar xvzf /dev/rmt/0)        # for csh, tcsh, ...
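
To see why the padding is harmless, here is a Python sketch using the standard zlib module in gzip mode. It is an illustration of the situation, not of gzip's own code: the 512 bytes of padding and the sample payload are arbitrary choices.

```python
import gzip
import zlib

payload = b"tape archive contents\n" * 50
padded = gzip.compress(payload) + bytes(512)  # zero padding after the stream

# wbits = MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
decoder = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
output = decoder.decompress(padded)

assert decoder.eof                         # the gzip stream ended cleanly
assert output == payload                   # all the real data was recovered
assert decoder.unused_data == bytes(512)   # only the padding is left over
```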

My hard disk has bad sectors. Can I still recover my .gz files?

You know this already, but let me repeat it: there is no substitute for backups. If you use gzip for important backups, you must test the backups before deleting the original files. To test a tar.gz file, do:

     for GNU tar:  tar tvfz file.tar.gz
     for any tar:  gunzip < file.tar.gz | tar tvf -

If you transfer the tar.gz file to another machine, test the destination file (after the file transfer), not the source file. This will detect bad file transfers. If you are not using tar, at least run gzip -tv file.gz to test important files.

Once the damage is done, recovering damaged .gz files is somewhere between extremely difficult and impossible. If your data is so valuable that you are willing to spend a lot of time to recover part of it, read this.
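
For scripted checks, a rough counterpart of gzip -tv can be written with Python's standard gzip module. This sketch reads the whole stream and reports whether it decoded without error; the function name gzip_ok is hypothetical, and the result says nothing about files that were damaged before they were compressed.

```python
import gzip
import io
import zlib

def gzip_ok(stream) -> bool:
    """Read a gzip stream to the end and report whether it decoded cleanly."""
    try:
        with gzip.GzipFile(fileobj=stream, mode="rb") as gz:
            while gz.read(64 * 1024):
                pass  # discard the data; only errors matter here
    except (OSError, EOFError, zlib.error):
        return False
    return True

good = gzip.compress(b"important backup\n" * 1000)
truncated = good[:-4]  # simulate an incomplete transfer

print(gzip_ok(io.BytesIO(good)), gzip_ok(io.BytesIO(truncated)))
```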

Can gzip handle files of more than 4 gigabytes?

Yes, but you need this patch to the gzip 1.2.4 sources, or the beta version 1.3.x. See the Executables section to download binaries with the patch already included.

Files already compressed without the patch are correct; the patch is useful only for decompression. Without the patch, decompression with zcat outputs the correct data plus a "length error" message, which you can ignore for a single file. For example, you can decompress with

     gunzip < file.gz > file

zcat on multiple files will stop at the first error, so use the patch to avoid any problem.
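
A plausible explanation for the length error, offered here as an interpretation rather than something stated above: RFC 1952 defines the last four bytes of a gzip member (ISIZE) as the uncompressed size modulo 2^32, so for files of 4 gigabytes or more the stored value differs from the true length. The function names in this Python sketch are hypothetical.

```python
import gzip
import struct

def stored_length(gz_data: bytes) -> int:
    """Read ISIZE, the last four bytes of a gzip member (little-endian)."""
    (isize,) = struct.unpack("<I", gz_data[-4:])
    return isize

def isize_for(uncompressed_size: int) -> int:
    """Value the trailer would hold for a file of the given size."""
    return uncompressed_size % 2**32

five_gib = 5 * 2**30
# The stored value is smaller than the real size, so comparing the two
# without reducing the real size modulo 2**32 reports a mismatch.
print(isize_for(five_gib), five_gib)
```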

To get a corrected binary, save the file 4g-patch.tar in the directory containing the gzip sources, then do:

     tar xvf 4g-patch.tar
     make

On some systems (Solaris 2.6, AIX), you may get the error message "Value too large for defined data type". A complete source tree fixing this problem is available here, thanks to Paul Eggert. For AIX, add "-D_POSIX_SOURCE -D_LARGE_FILES -D_LARGE_FILE_API" to the compilation flags.

What about patents?

gzip was developed as a replacement for compress because of the UNISYS and IBM patents covering the LZW algorithm used by compress.

I have probably spent more time studying data compression patents than actually implementing data compression algorithms. I maintain a list of several hundred patents on lossless data compression algorithms, and I made sure that gzip isn't covered by any of them. In particular, the --fast option of gzip is not as fast as it could be, precisely to avoid a patented technique.

The first version of the compression algorithm used by gzip appeared in zip 0.9, publicly released on July 11th 1991. So any patent granted after July 11th 1992 cannot threaten gzip because of the prior art, and I have checked all patents granted before this date.

During my search, I found two interesting patents on a process which is mathematically impossible: compression of random data. This is somewhat equivalent to patents on perpetual motion machines. Check here for a short analysis of these two patents.
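
The impossibility follows from a simple counting argument, sketched here in Python as an illustration (the function names are hypothetical): there are fewer bit strings shorter than n bits than there are strings of exactly n bits, so no lossless scheme can shorten every input.

```python
def inputs_of_length(n_bits: int) -> int:
    """Number of distinct bit strings of exactly n_bits bits."""
    return 2 ** n_bits

def shorter_outputs(n_bits: int) -> int:
    """Number of distinct bit strings with fewer than n_bits bits."""
    return sum(2 ** k for k in range(n_bits))  # equals 2**n_bits - 1

n = 16
# There is always at least one input with no shorter output left for it.
assert shorter_outputs(n) < inputs_of_length(n)
```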

Year 2000?

gzip correctly handles dates in or after the year 2000. More information about GNU software and year 2000 can be found here.

I get a compilation error about utimbuf

On systems which declare utimbuf in unistd.h instead of utime.h or sys/utime.h (such as AIX), use:
 
    make CFLAGS="-UHAVE_UNISTD_H"

I can't compile gzip on Solaris

You need a compiler to compile gzip, and you need gzip to get gcc. To get out of this loop, first install the gzip binary for Solaris (Sparc or i386). See also here for beta versions.

Does gzip support encryption?

No. Simple encryption algorithms such as that of PKZIP can be broken. And adding strong encryption such as that of PGP to gzip would not make much sense because that would duplicate the functionality of those encryption programs. PGP already incorporates the gzip compression code, so use PGP if you need compression plus strong encryption.

If you are satisfied with weak encryption, you can use zip.

Can gzip compress several files into a single archive?

Not directly. You can first create a tar file, then compress it:

    for GNU tar:   gtar cvzf file.tar.gz filenames
    for any tar:   tar cvf -  filenames | gzip > file.tar.gz

Alternatively, you can use zip, PowerArchiver 6.1, 7-zip or Winzip. The zip format allows random access to any file in the archive, but the tar.gz format usually gives a better compression ratio.
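
The tar-then-gzip approach can also be scripted. The Python sketch below uses the standard tarfile module to build the archive in memory; the function name make_tar_gz and the sample file names are hypothetical.

```python
import io
import tarfile

def make_tar_gz(files: dict) -> bytes:
    """Pack {name: contents} into a gzip-compressed tar archive in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as archive:
        for name, contents in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(contents)
            archive.addfile(info, io.BytesIO(contents))
    return buf.getvalue()

blob = make_tar_gz({"a.txt": b"alpha\n", "b.txt": b"beta\n"})
```

Writing blob to a file named file.tar.gz gives an archive that the extraction commands shown earlier on this page can read.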

Can gunzip extract a .zip archive?

No. Use unzip instead. For Windows, use PowerArchiver 6.1, WiZ or 7-zip (freeware), or Winzip (commercial).

WebTV users please read this

The AOL web page for AOL Instant Messenger incorrectly detects the WebTV browser as a Unix browser. WebTV users are therefore mistakenly shown the page for TiK, which cannot work on WebTV. So please do not email me about AOL Instant Messenger or WebTV. The instructions telling you to download gzip cannot work for WebTV.

In any case, there is no way to use AOL Instant Messenger on WebTV. Emailing me about this will not help in any way; I am not related to either WebTV or AOL. The correct contact for WebTV is here. Once again, the web page which led you here contains an error; you must not follow the instructions meant for Unix users only. Thank you for your understanding.

Related links

The home page of gzip's author Jean-loup Gailly
The home page of gzip's co-author Mark Adler
The zlib data compression library, also written by Jean-loup and Mark.

All this started with the Info-ZIP project.

FAQ (Frequently Asked Questions) of the comp.compression newsgroup.

Many thanks to spacer for supporting the gzip.org domain and for its excellent service.
Thanks to Greg Roelofs for the gzip logo.

The gzip home page is www.gzip.org

Last modification: July 27th, 2003. Send your comments to jloup@gzip.org. Jean-loup's PGP key is here.
