A growing number of diagnostic tests and biomarkers have been validated over recent decades, and the need for personalized medicine will keep this a prominent field of research. Any potential diagnostic tool must be evaluated rigorously, and the first requirement a new testing procedure must fulfill is diagnostic accuracy.
SUMMARY: Diagnostic accuracy measures describe the ability of a test to discriminate between disease and health and/or to predict them. This discriminative and predictive potential can be quantified by measures such as sensitivity and specificity, predictive values, likelihood ratios, area under the receiver operating characteristic curve, overall accuracy and diagnostic odds ratio. Some measures are useful for discriminative purposes, while others serve as predictive tools. Measures of diagnostic accuracy vary in how they depend on the prevalence, spectrum and definition of the disease. In general, they are extremely sensitive to the design of the study. Studies that do not meet strict methodological standards usually over- or underestimate the indicators of test performance, which limits the applicability of their results.
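To make the measures listed above concrete, the following is a minimal sketch of how most of them can be derived from a single 2x2 table of test results against a reference standard. The class name `TwoByTwo`, the function name `accuracy_measures` and the example counts are illustrative assumptions, not taken from the article; the formulas are the standard textbook definitions. The area under the ROC curve is omitted because it requires results at several thresholds rather than one table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from comparing an index test with a reference standard."""
    tp: int  # diseased, test positive
    fp: int  # healthy, test positive
    fn: int  # diseased, test negative
    tn: int  # healthy, test negative


def accuracy_measures(t: TwoByTwo) -> dict:
    """Return common accuracy measures; assumes all four cells are non-zero."""
    total = t.tp + t.fp + t.fn + t.tn
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Predictive values here reflect the prevalence in this sample only.
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "diagnostic_odds_ratio": (t.tp * t.tn) / (t.fp * t.fn),
        "overall_accuracy": (t.tp + t.tn) / total,
    }


# Hypothetical counts chosen only for illustration.
example = TwoByTwo(tp=90, fp=20, fn=10, tn=80)
measures = accuracy_measures(example)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are computed within the diseased and healthy groups respectively, whereas the predictive values are computed within the test-positive and test-negative groups, which is why only the latter change with prevalence.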
KEY MESSAGES: The testing procedure should be verified on a reasonable population that includes people with mild and severe disease, thus providing a comparable spectrum. Sensitivity and specificity are not predictive measures. Predictive values depend on disease prevalence, and their conclusions can be transposed to other settings only when the study is based on a suitable population (e.g. screening studies). Likelihood ratios should be the preferred choice for reporting diagnostic accuracy. Diagnostic accuracy measures must be reported with their confidence intervals. Paired measures (sensitivity and specificity, predictive values or likelihood ratios) should always be reported for clinically meaningful thresholds. How much discriminative or predictive power is needed depends on the clinical diagnostic pathway and on the costs of misclassification (false positives and false negatives) (1).
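Two of the points above lend themselves to a short numerical illustration: the dependence of predictive values on prevalence, and the reporting of confidence intervals. The sketch below applies Bayes' theorem to a fixed sensitivity and specificity at several prevalences, and computes a Wilson score interval for a proportion. The function names, the example values and the choice of the Wilson method are assumptions made for illustration; the article does not prescribe a particular interval method.

```python
import math


def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Positive and negative predictive values via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Same hypothetical test (sensitivity 0.90, specificity 0.80) in three settings.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.80, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")

# Interval for a sensitivity estimated from 90 positives among 100 diseased.
low, high = wilson_interval(90, 100)
print(f"sensitivity 0.90, approx. 95% CI {low:.3f} to {high:.3f}")
```

With the test characteristics held constant, the positive predictive value falls sharply as prevalence drops, which is why predictive values from one setting cannot simply be carried over to another.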