Abstract
This paper examines a common design for a lexical analyser and its supporting modules. An implementation of the design was tuned to produce the best possible performance. In effect, many of the optimizations that one would expect of a production-quality compiler were carried out by hand. After measuring the cost of tokenizing two large programs with this version, the code was ‘detuned’ to remove specific optimizations and the measurements were repeated. In all cases the basic algorithm was unchanged, so the difference in cost indicates the effectiveness of each optimization. Comparisons were also made with a tool-generated lexical analyser for the same task. On the basis of the measurements, several specific design and optimization strategies are recommended. These recommendations are also valid for software other than lexical analysers.
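To make the notion of ‘tuning’ and ‘detuning’ concrete, the sketch below contrasts a hand-optimized scanner inner loop with a detuned counterpart that computes the same answer at higher cost. It is illustrative only: the table-lookup optimization shown, the character classes, and all identifiers are assumptions for the sake of the example, not details taken from the paper.

    /* Illustrative only: one optimization of the kind a hand-tuned
     * lexical analyser might use -- replacing per-character library
     * calls with a single table lookup in the scanner's inner loop. */
    #include <ctype.h>
    #include <stdio.h>

    enum cclass { CC_LETTER, CC_DIGIT, CC_SPACE, CC_OTHER };

    /* "Tuned" form: classification is one indexed load. */
    static enum cclass class_table[256];

    static void init_class_table(void) {
        for (int c = 0; c < 256; c++) {
            if (isalpha(c))      class_table[c] = CC_LETTER;
            else if (isdigit(c)) class_table[c] = CC_DIGIT;
            else if (isspace(c)) class_table[c] = CC_SPACE;
            else                 class_table[c] = CC_OTHER;
        }
    }

    /* "Detuned" form: the same decision made by calls on every
     * character; the algorithm is unchanged, only the cost differs. */
    static enum cclass classify_slow(int c) {
        if (isalpha(c)) return CC_LETTER;
        if (isdigit(c)) return CC_DIGIT;
        if (isspace(c)) return CC_SPACE;
        return CC_OTHER;
    }

    int main(void) {
        init_class_table();
        const char *src = "count42 +  7";
        int idents = 0;   /* count identifier starts as a stand-in task */
        for (const char *p = src; *p; p++) {
            enum cclass fast = class_table[(unsigned char)*p];
            enum cclass slow = classify_slow((unsigned char)*p);
            if (fast != slow) return 1;   /* same answer either way */
            if (fast == CC_LETTER &&
                (p == src ||
                 class_table[(unsigned char)p[-1]] != CC_LETTER))
                idents++;
        }
        printf("identifier starts: %d\n", idents);
        return 0;
    }

Because both forms implement the same algorithm, timing one against the other isolates the cost of the single optimization, which is the measurement strategy the abstract describes.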
