gzip and bzip2 are two of the most common compression utilities on Linux and are used almost anywhere compression is required. This is especially true of log files, because the logrotate utility is usually configured by default when a server such as Apache2 is installed. logrotate periodically takes the current working log file, gives it a version number and compresses it, leaving a new, empty log file for the application to write to.
This is great for stopping your disk filling up. However, compressed files are not readable or searchable by the usual command line tools such as less or grep. Obviously, one could simply decompress all the files and then work on them as normal. However, for a busy web or email server, this can be a very large amount of data and is an additional step that is not required because both gzip and bzip2 provide equivalents of less, cat and grep to work directly with the compressed data.
gzip - .gz
zless
zcat
zgrep
bzip2 - .bz2
bzless
bzcat
bzgrep
These commands work in much the same way as their standard counterparts and accept most of the same options. For example, if I need to search through all of my rotated dpkg logs (which have the form dpkg.log.N.gz) for the package apache2, I can run the command:
zgrep apache2 dpkg.log.*.gz
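To try this without touching real logs, here is a minimal sketch that builds two made-up rotated logs in a temporary directory and searches them. The log lines, the demo_dir variable and the file contents are assumptions for illustration only, not real dpkg output.

```shell
# Work in a throwaway directory so no real logs are touched.
demo_dir=$(mktemp -d)

# Two made-up "rotated" logs; the contents are purely illustrative.
printf '%s\n' 'install apache2 2.4' 'install vim 8.2' > "$demo_dir/dpkg.log.1"
printf '%s\n' 'upgrade apache2 2.4' 'remove nano 4.8' 'configure apache2 2.4' > "$demo_dir/dpkg.log.2"

# Compress both files; gzip replaces each one with a .gz version.
gzip "$demo_dir/dpkg.log.1" "$demo_dir/dpkg.log.2"

# Search all compressed logs in one go, just as grep would on plain files.
matches=$(cd "$demo_dir" && zgrep apache2 dpkg.log.*.gz)
printf '%s\n' "$matches"

# Count the matching lines across both files.
match_count=$(printf '%s\n' "$matches" | grep -c apache2)
printf 'matching lines: %s\n' "$match_count"
```

Because more than one file is searched, each match should be prefixed with the name of the file it came from, as with grep.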
To read the file dpkg.log.5.gz:
zless dpkg.log.5.gz
If you need to use command line tools that read uncompressed text from stdin, such as tail, simply pipe the output of zcat into them, e.g.:
zcat dpkg.log.5.gz | tail
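The same pattern works with any tool that reads stdin. The sketch below is a small self-contained demonstration using a made-up five-line file; the demo_dir variable and the "entry N" lines are assumptions for illustration.

```shell
# Work in a throwaway directory with a made-up five-line log.
demo_dir=$(mktemp -d)
printf 'entry %s\n' 1 2 3 4 5 > "$demo_dir/dpkg.log.5"
gzip "$demo_dir/dpkg.log.5"

# zcat decompresses to stdout only, so tail sees plain text.
last_two=$(zcat "$demo_dir/dpkg.log.5.gz" | tail -n 2)
printf '%s\n' "$last_two"

# Any other stdin-reading tool works the same way, e.g. wc to count lines.
entry_count=$(( $(zcat "$demo_dir/dpkg.log.5.gz" | wc -l) ))
printf 'total lines: %s\n' "$entry_count"

# The file on disk is still compressed; nothing was unpacked next to it.
ls "$demo_dir"
```

Since zcat only writes to stdout, the .gz file is left untouched and no uncompressed copy takes up disk space.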
With these tools, you can quickly find the information you need without having to unpack the files first.