Combined symbol matching facsimile data compression system

Abstract
A facsimile data compression system, called combined symbol matching (CSM), is presented. The system operates in two modes: facsimile and symbol recognition. In the facsimile mode, a symbol blocking operator isolates document symbols such as alphanumeric characters and other recurring binary patterns. The first symbol encountered is placed in a library, and each new symbol is compared with every library entry as it is detected. If a comparison falls within a tolerance, the library identification code is transmitted along with the symbol location coordinates. Otherwise, the new symbol is placed in the library and its binary pattern is transmitted. Nonisolated symbols are left behind as a residue and are coded by a two-dimensional run-length coding method. In the symbol recognition mode, the library is prerecorded and each entry is labeled with its ASCII code. As each character is recognized, only the ASCII code is transmitted. Computer simulation results are presented for the CCITT standard documents. For text-predominant documents, the compression ratio obtained with the CSM algorithm in the facsimile mode exceeds that obtained with the best run-length coding techniques by a factor of two or more; for graphics-predominant documents, the two are comparable. In the symbol recognition mode, compression ratios of 250:1 have been achieved on business letter documents.
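The facsimile-mode library matching described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the comparison measure, the message layout, or how symbols of different sizes are handled, so the sketch assumes a simple count of differing pixels between equally sized patterns, treats differently sized patterns as non-matching, and sends the location coordinates with new symbols as well. The names `mismatch`, `encode_symbols`, and the message tuples are hypothetical, and the symbol blocking and residue coding stages are omitted.

```python
def mismatch(a, b):
    """Count differing pixels between two binary patterns (rows of '0'/'1').

    Assumption: patterns of different dimensions never match, signalled by None.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return None
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def encode_symbols(symbols, tolerance):
    """Encode isolated symbols against a library built on the fly.

    symbols   -- iterable of (x, y, pattern), pattern being a tuple of row strings
    tolerance -- maximum pixel mismatch still accepted as a match (assumed measure)

    Returns (library, messages). A matched symbol yields
    ("match", library_id, x, y); an unmatched symbol is added to the
    library and yields ("new", library_id, x, y, pattern).
    """
    library = []
    messages = []
    for x, y, pattern in symbols:
        match_id = None
        # Compare the new symbol with each library entry in turn.
        for lib_id, entry in enumerate(library):
            d = mismatch(pattern, entry)
            if d is not None and d <= tolerance:
                match_id = lib_id
                break
        if match_id is not None:
            # Within tolerance: send only the library code and the location.
            messages.append(("match", match_id, x, y))
        else:
            # No match: store the pattern and send the binary pattern itself.
            library.append(pattern)
            messages.append(("new", len(library) - 1, x, y, pattern))
    return library, messages
```

With a tolerance of zero, every noisy instance of a character becomes a new library entry; a larger tolerance lets slightly different scans of the same character share one entry, which is where the compression gain on text comes from, at the risk of substituting a wrong symbol if the tolerance is set too loosely.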