The evaluation of skills in anesthesiology residents is usually subjective and lacks demonstrable reliability. Therefore, an objective, criterion-referenced skill test for measuring performance of continuous lumbar epidural anesthesia was developed. For such a test to be useful, agreement among rater-observers must be demonstrated. Eight performances of continuous lumbar epidural anesthesia were recorded on videotape and simultaneously rated by nine anesthesiology faculty observers to determine inter-rater reliability. Inter-rater agreement was analyzed by calculating coefficient kappa for each item and for the entire test. Coefficient kappa for the entire test was 0.82, indicating a high degree of agreement among raters on the performance or nonperformance of the various items. The development and utility of skill tests are discussed.
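
As an illustration of the agreement statistic described above, the following minimal sketch computes a chance-corrected kappa for more than two raters. The abstract does not say which multi-rater form of kappa was used; Fleiss' kappa is assumed here as one common choice. The function name `fleiss_kappa` and the ratings in the usage example are hypothetical and are not data from the study.

```python
from typing import List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] is the number of raters who assigned subject i
    (here, one recorded performance) to category j (for example,
    "item performed" or "item not performed"). Every row must sum
    to the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_subjects == 0 or n_raters < 2:
        raise ValueError("need at least one subject and two raters")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject must be rated by the same number of raters")

    n_categories = len(counts[0])

    # Proportion of all ratings that fall in each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]

    # Observed agreement for each subject: fraction of rater pairs that agree.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_observed = sum(p_subj) / n_subjects
    p_expected = sum(p * p for p in p_cat)

    if p_expected == 1.0:
        # All ratings fall in one category, so kappa is undefined.
        raise ValueError("kappa is undefined when only one category is used")

    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical ratings for ONE test item: 8 performances, 9 raters each.
    # Columns: [raters marking "performed", raters marking "not performed"].
    item_counts = [
        [9, 0], [9, 0], [8, 1], [0, 9],
        [1, 8], [9, 0], [0, 9], [7, 2],
    ]
    print(f"kappa for this item: {fleiss_kappa(item_counts):.2f}")
```

Under this assumption, per-item values would come from calling the function once per test item. How the item-level tables were combined into a kappa for the entire test is not specified in the abstract, so any pooling scheme would be a further assumption.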