
cbpe

Use Byte-Pair Encoding to compress or decompress a file (QNX)

Syntax:

cbpe [options] [srcfile [dstfile]]

Targets:

QNX 4

Options:

-b size
Set the maximum block size before compression (default 4096).
-c nchars
Set the maximum number of unique characters allowed in a block before compression (default 200).
-d
Decompress the specified file. If this option is specified, all other options are ignored.
-f
Mark the output for the embedded filesystem, so that the filesystem decompresses the file automatically when it's read.
-t count
Set the minimum number of times a byte pair must occur before it's replaced (default 3).

Description:

The cbpe utility compresses the file specified on the command line using a byte-pair encoding algorithm. This algorithm produces fair to good compression of most data. While the compression algorithm is relatively expensive to compute, the decompression algorithm is extremely fast, small, and simple. This allows it to be built directly into, for example, the embedded filesystem (Efsys) and the embedded system IPL for OS images (see romqnx).

By default, cbpe compresses. To decompress a file, specify the -d option.

Normally, the output is written to a file or to the standard output. For a file on the embedded filesystem, you must also tell the filesystem that the file should be decompressed automatically when read. To do this, specify the -f option; without it, the filesystem simply reads back the compressed data as is.

The cbpe utility reads the specified source file and compresses the data in blocks to the associated output file. During decompression, these blocks are re-expanded. To specify the maximum size a block can be expanded to when it's decompressed, use the -b option during compression.

A typical compressed file contains many back-to-back compressed blocks. A small block size adds per-block overhead that reduces the compression ratio, while a block size larger than the default (4096) does little to improve the compression ratio and may in fact reduce it.

The compression algorithm looks for pairs of bytes that can be represented by a single byte. To do this, it needs to reserve some byte values for representing these pairs. This reduces the available character set of the source.
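The pair-replacement idea can be sketched in a few lines of Python. This is only an illustration of generic byte-pair encoding, not the cbpe implementation or its file format, neither of which is described here. The names compress_block and expand_block and the in-memory pair table are hypothetical, and the sketch assumes that unused byte values are taken from whatever the block doesn't contain.

```python
from collections import Counter

def compress_block(block: bytes, threshold: int = 3):
    """Return (compressed bytes, table mapping code -> (left, right))."""
    data = list(block)
    present = set(data)
    # Byte values absent from the block are free to stand for pairs.
    unused = [b for b in range(256) if b not in present]
    table = {}
    while unused:
        # Count adjacent pairs (overlaps are counted; good enough for a sketch).
        pairs = Counter(zip(data, data[1:]))
        if not pairs:
            break
        (left, right), count = pairs.most_common(1)[0]
        if count < threshold:
            break  # no remaining pair is frequent enough to earn a code
        code = unused.pop()
        table[code] = (left, right)
        out, i = [], 0
        while i < len(data):
            if i + 1 < len(data) and data[i] == left and data[i + 1] == right:
                out.append(code)
                i += 2
            else:
                out.append(data[i])
                i += 1
        data = out
    return bytes(data), table

def expand_block(data: bytes, table) -> bytes:
    """Undo compress_block by expanding codes with a small stack."""
    out = []
    stack = list(reversed(data))
    while stack:
        b = stack.pop()
        if b in table:
            left, right = table[b]
            stack.append(right)  # pushed first so left is expanded first
            stack.append(left)
        else:
            out.append(b)
    return bytes(out)
```

For example, compress_block(b"abababababab") replaces the pair "ab" with one unused byte value, then replaces pairs of that code in turn, and expand_block restores the original bytes. The expansion loop is just a table lookup and a stack, which suggests why a decompressor of this kind can be small and fast.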

Input characters are read until either the block size has been reached or more unique characters have been read than the maximum character set permits. At this point a block of data is compressed. To change the maximum character set, use -c.
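The block-boundary rule above can be sketched as follows. This is a guess at the behavior based only on the description, not cbpe's actual code: the function name split_blocks is hypothetical, and the sketch assumes a block is closed just before the byte that would exceed either limit.

```python
def split_blocks(data: bytes, block_size: int = 4096, max_chars: int = 200):
    """Split data into blocks limited by size and by unique byte values."""
    blocks = []
    start = 0
    seen = set()  # distinct byte values in the current block
    for i, byte in enumerate(data):
        is_new = byte not in seen
        full = i - start >= block_size
        too_many = is_new and len(seen) >= max_chars
        # Close the current block before a byte that would break a limit.
        if i > start and (full or too_many):
            blocks.append(data[start:i])
            start = i
            seen = set()
        seen.add(byte)
    if start < len(data):
        blocks.append(data[start:])
    return blocks
```

With the defaults, 10000 identical bytes split into blocks of 4096, 4096, and 1808 bytes, while data that cycles through all 256 byte values is cut each time a block would need a 201st distinct value.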

The default count value for -t provides reasonable compression for a wide range of data. If the data has a limited character set, you may be able to save space by increasing the count; for data that contains all possible byte pairs, you may be able to save space by decreasing it.


Note: The decompressor doesn't need to know the maximum block size, maximum character set, or minimum byte-pair count that were used during compression.

It's best not to assign a new code to a byte pair that may occur only once in the data; the default -t value prevents new codes from being consumed too easily. The total number of codes available for byte pairs is (256 - nchars), where nchars is set by -c.
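As a rough illustration of how -c and -t interact, the following Python sketch computes the code budget from the formula above and lists the byte pairs in a block that occur often enough to deserve a code. The function names are hypothetical, and the pair count includes overlapping occurrences, which may differ from how cbpe itself counts.

```python
from collections import Counter

def code_budget(nchars: int = 200) -> int:
    """Number of byte values left over to represent pairs: 256 - nchars."""
    return 256 - nchars

def pairs_worth_coding(block: bytes, threshold: int = 3):
    """Return the pairs seen at least `threshold` times, most frequent first."""
    counts = Counter(zip(block, block[1:]))
    return [pair for pair, n in counts.most_common() if n >= threshold]
```

With the default -c value of 200, code_budget() gives 56 codes per block. Raising the threshold shortens the list of candidate pairs, so fewer of those codes are spent on pairs that save little space.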

Examples:

To compress the embedded shell and put it on an embedded filesystem (the shell will be decompressed by the filesystem when read):

cbpe -f esh /efs1p1/bin/esh

To compress a file called data and put it in the file called data.bpe:

cbpe data > data.bpe

or

cbpe data data.bpe

To decompress the file called data.bpe and put it in the file called data:

cbpe -d data.bpe > data

or

cbpe -d data.bpe data

See also:

romqnx

