This guide explains how to read and search through compressed files on the Linux bash command line without decompressing them first.
gzip and bzip2 are the most common compression utilities on Linux and turn up almost anywhere compression is required. This is especially true of log files, because the logrotate utility is usually configured by default when a server such as Apache2 is installed.
logrotate periodically takes the current working log file, gives it a version number and compresses it, leaving a new, empty log file for the application to write to.
This is great for stopping your disk filling up. However, compressed files are not readable or searchable by the usual command line tools such as less, cat and grep. One could simply decompress all the files and then work on them as normal, but on a busy web or email server this can be a very large amount of data, and the extra step is unnecessary because both gzip and bzip2 provide equivalents of less, cat and grep that work directly with the compressed data.
gzip (.gz): zless, zcat, zgrep
bzip2 (.bz2): bzless, bzcat, bzgrep
These commands work in much the same way as their standard counterparts. For example, if I need to search through all of my rotated dpkg logs (that have the form dpkg.log.N.gz) for the package apache2, I can run the command:
zgrep apache2 dpkg.log.*.gz
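To try this safely away from real logs, here is a minimal sketch that builds two small gzip-compressed files in a scratch directory and searches them with zgrep. The file contents and the tmpdir and matches variable names are made up for illustration; the -h option, which suppresses the file name prefix on each match, is assumed to be passed through to grep as it is by the zgrep shipped with GNU gzip.

```shell
# Create a scratch directory with two sample logs (contents are hypothetical).
tmpdir=$(mktemp -d)
printf 'install apache2\ninstall vim\n' > "$tmpdir/dpkg.log.1"
printf 'upgrade apache2\n' > "$tmpdir/dpkg.log.2"

# Compress them the way logrotate would leave them: dpkg.log.N.gz
gzip "$tmpdir/dpkg.log.1" "$tmpdir/dpkg.log.2"

# Search all rotated logs without unpacking them first.
# -h drops the "filename:" prefix so only the matching lines are kept.
matches=$(zgrep -h apache2 "$tmpdir"/dpkg.log.*.gz)
echo "$matches"

# Remove the scratch directory again.
rm -rf "$tmpdir"
```

The same pattern should apply to .bz2 files by swapping gzip for bzip2 and zgrep for bzgrep.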
Reading the file
If you need to use command line tools that expect uncompressed text on stdin, such as tail, simply pipe the output of zcat into them, e.g.:
zcat dpkg.log.5.gz | tail
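As a self-contained illustration of the same pipe, the sketch below compresses a numbered 100-line sample file and reads only its last three lines. The sample.log file name and the last_lines variable are assumptions for the example, not part of any real log setup.

```shell
# Build a 100-line sample file in a scratch directory and compress it.
tmpdir=$(mktemp -d)
seq 1 100 > "$tmpdir/sample.log"
gzip "$tmpdir/sample.log"

# zcat writes the decompressed text to stdout; tail keeps the last 3 lines.
# The compressed file on disk is left untouched.
last_lines=$(zcat "$tmpdir/sample.log.gz" | tail -n 3)
echo "$last_lines"

# Remove the scratch directory again.
rm -rf "$tmpdir"
```

Because zcat only streams to stdout, any filter that reads stdin (head, sort, awk and so on) can take the place of tail here.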
With these tools, you can quickly find the information you need without having to unpack the files first.