Abstract
In Part I, predictive coding was defined and messages, prediction, entropy, and ideal coding were discussed. The present paper defines the criterion by which predictors are to be judged for the purpose of predictive coding: a predictor is optimum in the information theory (IT) sense if it minimizes the entropy of the average error-term distribution. Ordered averages of distributions are defined, and it is shown that a predictor whose average error-term distribution is ordered is a best IT predictor. Special classes of messages for which a best IT predictor can easily be found are considered, and some examples are given. The error terms transmitted in predictive coding are treated as if they were statistically independent. Even when this is the case, or a good approximation, showing that predictive coding can be practical and efficient still requires showing that sequences of statistically independent message terms can always be coded efficiently without impractically large memory requirements. This is done in the final section of the paper.
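To make the optimality criterion concrete, the following is a minimal sketch, not taken from the paper: it compares two hypothetical candidate predictors by the empirical entropy of the error-term distribution each leaves behind, with errors reduced modulo the alphabet size. The function `error_entropy`, the predictors `last_value` and `zero`, and the sample message are all illustrative assumptions.

```python
import math
from collections import Counter

def error_entropy(message, predictor, alphabet_size):
    """Empirical entropy (bits per term) of the error-term distribution
    a predictor leaves, with errors taken modulo the alphabet size."""
    errors = [
        (message[i] - predictor(message[:i])) % alphabet_size
        for i in range(len(message))
    ]
    n = len(errors)
    return -sum((c / n) * math.log2(c / n) for c in Counter(errors).values())

# Two hypothetical candidate predictors for an integer-valued message:
last_value = lambda past: past[-1] if past else 0  # repeat the previous term
zero = lambda past: 0                              # predict nothing

message = [3, 4, 4, 5, 5, 5, 6, 6, 7, 7]
for name, p in [("last-value", last_value), ("zero", zero)]:
    print(name, round(error_entropy(message, p, alphabet_size=8), 3))
```

On this slowly varying sample the last-value predictor concentrates the errors near zero and yields a lower error-term entropy than the zero predictor, so it is the better predictor in the IT sense used here.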
