Analysis of Competing Classifiers using Components of Variance of ROC Accuracy Measures

Marcus A. Maloof, Sergey V. Beiden, and Robert F. Wagner

Fukunaga and Hayes' theory of the effect of sample size states that, for a single classifier, the variance of accuracy estimates comes predominantly from the finite test sample. In this paper, we present results from an empirical study that support and extend this theory. To investigate, we conducted a Monte Carlo simulation using Bayesian and non-Bayesian classifiers on detection tasks derived from Gaussian distributions. To compare the methods' performance, we used ROC analysis together with a nonparametric technique that conducts bootstrap experiments to estimate components of variance. The results support the assertion that, for a single classifier, variance comes predominantly from the finite test sample; for two competing classifiers, however, they suggest that variance comes predominantly from the finite training set, not the test set. Our variance estimates scale with sample size in accordance with the theory, but because we examined components of variance rather than total variance, we provide a much more detailed analysis.
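The decomposition described above can be illustrated with a small simulation. The sketch below is not the authors' implementation: it uses a fully crossed Monte Carlo design (independent training sets crossed with independent test sets) rather than the paper's bootstrap resampling, a Fisher linear discriminant as the classifier, and the Mann-Whitney estimate of the area under the ROC curve as the accuracy measure. The class separation `mu1`, sample sizes, and number of replicates are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def auc(scores_neg, scores_pos):
    # Mann-Whitney (Wilcoxon) estimate of the area under the ROC curve
    gt = (scores_pos[:, None] > scores_neg[None, :]).mean()
    eq = (scores_pos[:, None] == scores_neg[None, :]).mean()
    return gt + 0.5 * eq

def train_lda(x0, x1):
    # Fisher linear discriminant for two classes with pooled covariance
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    pooled = np.cov(np.vstack([x0 - m0, x1 - m1]).T)
    return np.linalg.solve(pooled, m1 - m0)

d = 2                      # feature dimension (illustrative)
mu1 = np.ones(d)           # mean shift of the "signal" class (illustrative)
n_train, n_test = 50, 50   # per-class sample sizes
R = 30                     # replicates of each factor

# Independent training sets (rows) crossed with independent test sets (cols)
ws = [train_lda(rng.standard_normal((n_train, d)),
                rng.standard_normal((n_train, d)) + mu1) for _ in range(R)]
tests = [(rng.standard_normal((n_test, d)),
          rng.standard_normal((n_test, d)) + mu1) for _ in range(R)]

A = np.empty((R, R))       # A[i, j] = AUC of classifier i on test set j
for i, w in enumerate(ws):
    for j, (t0, t1) in enumerate(tests):
        A[i, j] = auc(t0 @ w, t1 @ w)

# Crude components of variance: variability of the row means reflects the
# finite training set; variability of the column means reflects the test set.
var_train = A.mean(axis=1).var(ddof=1)
var_test = A.mean(axis=0).var(ddof=1)
```

For a single classifier at these sample sizes, `var_test` typically dominates, consistent with the Fukunaga-Hayes prediction; comparing the `A` matrices of two classifiers trained on the same replicates is the analogous way to probe the competing-classifier case.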

Paper available in PostScript (gzipped) and PDF.

@techreport{maloof.tr.02,
  author      = "Maloof, M. A. and Beiden, S. V. and Wagner, R. F.",
  title       = "Analysis of competing classifiers in terms of
                 components of variance of {ROC} accuracy measures",
  type        = "Technical Report",
  number      = "CS-02-01",
  month       = jan,
  year        = 2002,
  institution = "Department of Computer Science, Georgetown University",
  address     = "Washington, DC"
}