In analysis of binary data from clustered and longitudinal studies, random effect models have been recently developed to accommodate two-level problems such as subjects nested within clusters or repeated classifications within subjects. Unfortunately, these models cannot be applied to three-level problems that occur frequently in practice. For example, multicenter longitudinal clinical trials involve repeated assessments within individuals and individuals are nested within study centers. This combination of clustered and longitudinal data represents the classic three-level problem in biometry. Similarly, in prevention studies, various educational programs designed to minimize risk taking behavior (e.g., smoking prevention and cessation) may be compared where randomization to various design conditions is at the level of the school and the intervention is performed at the level of the classroom. Previous statistical approaches to the three-level problem for binary response data have either ignored one level of nesting, treated it as a fixed effect, or used first- and second-order Taylor series expansions of the logarithm of the conditional likelihood to linearize these models and estimate model parameters using more conventional procedures for measurement data. Recent studies indicate that these approximate solutions exhibit considerable bias and provide little advantage over use of traditional logistic regression analysis ignoring the hierarchical structure. In this paper, we generalize earlier results for two-level random effects probit and logistic regression models to the three-level case. Parameter estimation is based on full-information maximum marginal likelihood estimation (MMLE) using numerical quadrature to approximate the multiple random effects. The model is illustrated using data from 135 classrooms from 28 schools on the effects of two smoking cessation interventions.