Over the past decade, much attention has been devoted to the evaluation of air-quality models, with emphasis on model performance in predicting the high concentrations that are important in air-quality regulations. This paper stems from our belief that this practice needs to be expanded to 1) evaluate model physics and 2) deal with the large natural or stochastic variability in concentration. The variability is represented by the root-mean-square fluctuating concentration (σc) about the mean concentration (C) over an ensemble, that is, a given set of meteorological, source, and other conditions. Most air-quality models used in applications predict C, whereas observations are individual realizations drawn from an ensemble. For σc ∼ C, large residuals exist between predicted and observed concentrations, which confuse model evaluations. This paper addresses ways of evaluating model physics in light of the large σc; the focus is on elevated point-source models. Evaluation of model physics requires the separation of the mean model error (the difference between the predicted and observed C) from the natural variability. A residual analysis is shown to be an effective way of doing this. Several examples demonstrate the usefulness of residuals as well as correlation analyses and laboratory data in judging model physics. In general, σc models and predictions of the probability distribution of the fluctuating concentration (c′), Ω(c′), are in the developmental stage, with laboratory data playing an important role. Laboratory data from point-source plumes in a convection tank show that Ω(c′) approximates a self-similar distribution along the plume center plane, a useful result in a residual analysis. At present, there is one model, ARAP, that predicts C, σc, and Ω(c′) for point-source plumes. This model is more computationally demanding than other dispersion models (which predict C only) and must be demonstrated as a practical tool.
However, it predicts an important quantity for applications: the uncertainty in the very high and infrequent concentrations. The uncertainty is large and is needed in evaluating operational performance and in predicting the attainment of air-quality standards.
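
The separation of the mean model error from the natural variability can be illustrated with a minimal sketch. It is not the procedure used in the paper; it assumes only that each observation is one realization, c = C + c′, from a single ensemble, and that the model supplies a predicted C for that ensemble. The function name `split_residuals` and the sample values are hypothetical and chosen purely for illustration.

```python
from statistics import fmean, pstdev


def split_residuals(observed, predicted_mean):
    """Split residuals into a mean model error and a fluctuation scale.

    observed       -- individual concentration realizations from one ensemble
    predicted_mean -- the model's prediction of the ensemble-mean C
    """
    # Residual of each realization about the predicted mean.
    residuals = [c - predicted_mean for c in observed]
    # The ensemble average of the residuals estimates (observed C - predicted C).
    mean_error = fmean(residuals)
    # The rms spread of the residuals about their own mean estimates sigma_c.
    sigma_c = pstdev(residuals)
    return mean_error, sigma_c


# Hypothetical realizations (arbitrary units) for one set of conditions.
observed = [2.0, 9.0, 4.0, 13.0, 7.0]
mean_error, sigma_c = split_residuals(observed, predicted_mean=5.0)
print(f"mean model error = {mean_error:.2f}, sigma_c = {sigma_c:.2f}")
```

Under these assumptions the average residual isolates the error in the predicted C, while the spread of the residuals reflects σc and would be present even for a model whose predicted mean is exact. In practice the ensemble is rarely sampled this cleanly, so the sketch indicates the idea rather than a usable evaluation procedure.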