NOTES ON THE LZRW3-A ALGORITHM ============================== Author : Ross Williams (ross@spam.ua.oz.au). Date : 15-Jul-1991. Abstract -------- This file announces the release of my LZRW3-A algorithm which was invented on 31-Dec-1990 along with LZRW3. The LZRW3-A algorithm has the following features: 1 Requires only 16K of memory (for both compression and decompression). 2 The compressor runs about two times faster than Unix compress's. 3 The decompressor runs about three times faster than Unix compress's. 4 Yields a few percent better compression than Unix compress for most files. 5 Allows you to dial up extra compression at a speed cost in the compressor. The speed of the decompressor is not affected. 6 Algorithm is deterministic. 7 Algorithm is free of patent problems. The algorithm has not been patented (nor will it be) and is of the LZ77 class which is fairly clear of patents. 8 This implementation in C is in the public domain. (Timing tests for the speed comparison were performed on a Pyramid 9820.) LZRW3-A dominates Unix compress in memory and speed. LZRW3-A dominates Unix compress in compression for object files and smallish (e.g. 50K) text files, but yields worse compression for large complex English text files. The exact figures for the Calgary corpus on C implementations on my Macintosh-SE are (percentage remaining, compression speed, decompression speed, memory required during compression and decompression): PerRem ComK/S DecK/S ComMem DecMem LZRW1-A 55.1 % 17 K/s 69 K/s 16 K 0 K LZRW2 51.5 % 18 K/s 54 K/s 24 K 16 K LZRW3 50.0 % 20 K/s 33 K/s 16 K 16 K LZRW3-A 45.8 % 8 K/s 31 K/s 16 K 16 K I would like to hear from anyone who knows of an algorithm that performs similarly or better to this one when implemented in C and compiled and run on the same machine on the same files. Availability ------------ The only implementation available is in C. It can be found in the following archive in about mid August 1991. FTP Archive Access: Machine : sirius.itd.adelaide.edu.au [IP=129.127.40.3] Directory : ~pub/compression Files : lzrw3-a.txt - This file. lzrw3-a.c - An implementation in C. Motivation for LZRW3-A ---------------------- LZRW3-A was designed as a direct competitor to Unix compress. LZRW3-A is identical to the LZRW3 algorithm except that it uses a "deep" hash table (of depth 8 by default). See the explanation in the comments in the program code for more details. Benchmark --------- Here are the results of applying LZRW3-A.C compiled under THINK C 4.0 and running on a Mac-SE (8MHz 68000) to the standard Calgary corpus. +----------------------------------------------------------------+ | DATA COMPRESSION TEST | | ===================== | | Time of run : Mon 15-Jul-1991 05:29PM | | Timing accuracy : One part in 100 | | Context length : 262144 bytes (= 256.0000K) | | Test suite : Calgary Corpus Suite | | Files in suite : 14 | | Algorithm : LZRW3-A | | Note: All averages are calculated from the un-rounded values. | +----------------------------------------------------------------+ | File Name Length CxB ComLen %Remn Bits Com K/s Dec K/s | | ---------- ------ --- ------ ----- ---- ------- ------- | | rpus:Bib.D 111261 1 49044 44.1 3.53 8.47 31.19 | | us:Book1.D 768771 3 420464 54.7 4.38 7.27 30.07 | | us:Book2.D 610856 3 277955 45.5 3.64 8.51 33.40 | | rpus:Geo.D 102400 1 84218 82.2 6.58 4.23 15.04 | | pus:News.D 377109 2 192880 51.1 4.09 7.08 25.89 | | pus:Obj1.D 21504 1 12651 58.8 4.71 5.23 17.44 | | pus:Obj2.D 246814 1 108044 43.8 3.50 8.01 28.11 | | s:Paper1.D 53161 1 24526 46.1 3.69 8.11 30.24 | | s:Paper2.D 82199 1 39483 48.0 3.84 8.11 32.04 | | rpus:Pic.D 513216 2 111622 21.7 1.74 10.64 49.31 | | us:Progc.D 39611 1 17923 45.2 3.62 8.06 29.01 | | us:Progl.D 71646 1 24362 34.0 2.72 10.74 39.51 | | us:Progp.D 49379 1 16805 34.0 2.72 10.64 37.58 | | us:Trans.D 93695 1 30296 32.3 2.59 11.02 38.06 | +----------------------------------------------------------------+ | Average 224401 1 100733 45.8 3.67 8.29 31.21 | +----------------------------------------------------------------+ ----