Table 3 adds the numbers in Tables 1 and 2 to describe the outcomes for all pairs judged to be of any value for comparisons.
| Decision       | Nonmate | Mate | All   |
| Exclusion      | 3947    | 611  | 4558  |
| Identification | 6       | 3703 | 3709  |
| Inconclusive   | 1032    | 3875 | 4907  |
| All            | 4985    | 8189 | 13174 |
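As a simple arithmetic check, the counts in Table 3 can be encoded and the marginal totals recomputed. The following minimal sketch is not part of the original study; it simply restates the numbers shown in the table.

```python
# Table 3: examiner decisions (rows) by ground truth (columns) for pairs
# judged to be of value for comparison. Counts are taken from the table above.
table3 = {
    "Exclusion":      {"nonmate": 3947, "mate": 611},
    "Identification": {"nonmate": 6,    "mate": 3703},
    "Inconclusive":   {"nonmate": 1032, "mate": 3875},
}

# Recompute the marginal totals shown in the "All" row and column.
for decision, counts in table3.items():
    print(f"{decision:15s} all = {counts['nonmate'] + counts['mate']}")

nonmates = sum(c["nonmate"] for c in table3.values())   # 4985
mates = sum(c["mate"] for c in table3.values())         # 8189
print("All nonmates:", nonmates, "All mates:", mates,
      "Grand total:", nonmates + mates)                  # 13174
```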
Many books describe the interpretation of simpler two-by-two tables of binary decisions (like exclusion and identification) for two states of nature (such as nonmates and mates). For discussion in a legal context, see David H. Kaye et al., The New Wigmore on Evidence: Expert Evidence (2d ed. 2011). The row for inconclusives complicates the analysis slightly, as indicated below.
1. False Positives and Sensitivity
A false positive is an opinion that the pair of prints originated from the same finger of the same individual (an inclusion or identification) when, in fact, the exemplar and the latent came from different sources (nonmated pairs).
Only 10,052 (59%) of the presentations were deemed of value for individualization (VIn). Of these, 4,083 were nonmates, and 5,969 were mates. Five examiners (5/169 = 3%) made false identifications. Their answers to a questionnaire did not indicate anything unusual in their backgrounds. Three of them said they were certified (one did not respond to the background survey).
One of the five examiners made two false identifications, making the false positive rate (for pairs of prints deemed VIn)
FPR_VIn = P(identification | nonmate & VIn) = 6/4083 = 0.1%.
In clinical medicine, “sensitivity” denotes the probability that a diagnostic or screening test (such as a blood test for a disease) will give a positive result when the disease is present. If the test includes a quantitation that gives an “inconclusive” reading when the blood sample is too small, this would reflect an inherent limitation of the test rather than a lack of sensitivity to the disease when applied to an adequate sample. Analogously, the sensitivity of the examiners is the proportion of identifications among the mated VIn pairs for which the LPEs reached a definite conclusion. By this reasoning,
Sensitivity = P(identification | mate & VIn & conclusion) = 3663/4113 = 89.1%.
If the examiners’ inability to reach a conclusion after declaring a pair of prints to be of value for individualization were treated as detracting from sensitivity, then their sensitivity in this experiment was only
Sensitivity = P(identification | mate & VIn) = 3663/5969 = 61.4%.
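The three figures just given follow directly from the counts reported above. The following minimal sketch reproduces them; it is offered only as a worked illustration of the definitions, not as part of the study.

```python
# Counts for pairs deemed of value for individualization (VIn), from the text above.
nonmated_vin = 4083          # nonmated VIn pairs presented
mated_vin = 5969             # mated VIn pairs presented
false_ids = 6                # identifications of nonmated pairs (false positives)
true_ids = 3663              # identifications of mated pairs
false_exclusions = 450       # exclusions of mated pairs (false negatives)

# False positive rate among nonmated VIn pairs.
fpr_vin = false_ids / nonmated_vin                                    # ~0.1%

# Sensitivity counting only pairs on which a definite conclusion was reached.
sensitivity_conclusions = true_ids / (true_ids + false_exclusions)    # ~89.1%

# Sensitivity treating inconclusives as failures to identify.
sensitivity_all = true_ids / mated_vin                                # ~61.4%

print(f"FPR_VIn = {fpr_vin:.1%}")
print(f"Sensitivity (conclusions only) = {sensitivity_conclusions:.1%}")
print(f"Sensitivity (all mated VIn pairs) = {sensitivity_all:.1%}")
```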
Most examiners did not indicate that the six pairs that produced false positive errors were difficult comparisons, and in only two of the six false positives did the LPE making the error describe the comparison as difficult.
In no case did two examiners make the same false positive error. The errors generally occurred on image pairs for which a large majority of examiners made correct exclusions; one occurred on a pair that the majority of examiners judged inconclusive. Thus, the six erroneous identifications probably would have been detected had blind verification been performed as part of the operational examination process.
Two of the false positive errors involved a single latent print compared with exemplars from different subjects. Four of the five distinct latents on which false positives occurred (compared with 18% of all nonmated latents) had been deposited on a galvanized metal substrate and processed with cyanoacrylate and light gray powder. These images were often partially or fully tonally reversed (light ridges instead of dark) and appeared against a complex background.
2. False Negatives and Specificity
Whereas the false positive rate was only FPR_VIn = 0.1%, the false negative rates, whether computed for pairs deemed of value for individualization alone or for all pairs deemed of value for individualization or exclusion, were much larger:
FNR_VIn = 450/5969 = 7.5%; FNR_VIn+VExO = 611/8189 = 7.5%.
The specificity of a clinical test is the probability that it will report that the disease is absent when the disease actually is absent. Here, if the calculation is limited to pairs that produced definite conclusions,
Specificity = P[exclusion | nonmate & (VIn or VExO) & conclusion] = (3622 + 325) / (3622 + 6 + 325) = 99.8%.
If we regard inconclusives as a sign of the inability to exclude when an exclusion is warranted, however, we get a smaller value:
Specificity = P[exclusion | nonmate & (VIn or VExO)] = (3622 + 325) / (3622 + 6 + 325 + 455 + 577) = 79.2%.
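The false negative rates and the two specificity figures can be checked in the same way. In the sketch below, the split of the 1,032 inconclusive nonmated pairs into 455 VIn and 577 VExO pairs is inferred from the marginal totals above; the code is illustrative only.

```python
# Mated pairs of value, from the text and Table 3 above.
mated_vin = 5969             # mated pairs deemed of value for individualization
mated_all_value = 8189       # mated pairs deemed of value for individualization or exclusion
false_exclusions_vin = 450   # erroneous exclusions among mated VIn pairs
false_exclusions_all = 611   # erroneous exclusions among all mated pairs of value

fnr_vin = false_exclusions_vin / mated_vin               # ~7.5%
fnr_all = false_exclusions_all / mated_all_value         # ~7.5%

# Nonmated pairs of value: exclusions, false identifications, and inconclusives.
correct_exclusions = 3622 + 325     # VIn + VExO exclusions of nonmated pairs
false_ids = 6
inconclusive_nonmates = 455 + 577   # VIn + VExO inconclusives (inferred from the totals)

specificity_conclusions = correct_exclusions / (correct_exclusions + false_ids)
specificity_all = correct_exclusions / (
    correct_exclusions + false_ids + inconclusive_nonmates)

print(f"FNR_VIn = {fnr_vin:.1%}, FNR_VIn+VExO = {fnr_all:.1%}")
print(f"Specificity (conclusions only) = {specificity_conclusions:.1%}")       # ~99.8%
print(f"Specificity (all nonmated pairs of value) = {specificity_all:.1%}")    # ~79.2%
```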
Eighty-five percent of examiners made at least one false negative error, distributed across half of the image pairs that were compared. Awareness of previous errors was not correlated with the false negative errors; indeed, 65% of participants said that they were unaware of ever having made an erroneous exclusion after training.
Years of experience were at best weakly correlated with FNR_VIn. The correlation coefficient was only 0.15 (p = 0.063). The correlation with certification was not even close to statistical significance (p = 0.871).
3. Posterior Probabilities
False negative and positive rates tell us how LPEs responded to mates and nonmates, but they are not direct measures of the probability that an identification or an exclusion is correct. This posterior probability also depends on the prior probability that a pair is from the same source. The formula that gives the posterior probabilities is Bayes’ rule. Using the proportion of mates in the paired prints deemed VIn (59%), the predictive values were
PPV = P(mate | identification & VIn) = 3663/3669 = 99.8%
and
NPV = P(nonmate | exclusion & VIn) = 3622/4072 = 88.9%.
Using the proportion for all pairs designated as of value for either individualization or exclusion gives an essentially identical PPV and a somewhat smaller NPV:
PPV = P[mate | identification & (VIn or VExO)] = 3703/3709 = 99.8%
and
NPV = P[nonmate | exclusion & (VIn or VExO)] = 3947/4558 = 86.6%.
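These predictive values can also be obtained from Bayes’ rule, using the examiners’ identification and exclusion rates together with the prevalence of mates in the experiment, rather than by counting outcomes directly. The following minimal sketch does the calculation for the VIn pairs and reproduces the PPV and NPV reported above.

```python
# Bayes' rule for the VIn pairs, using rates and the prevalence of mates in the experiment.
mated_vin, nonmated_vin = 5969, 4083
p_mate = mated_vin / (mated_vin + nonmated_vin)     # prior probability of a mate, ~0.59

p_id_given_mate = 3663 / mated_vin        # identification rate on mated VIn pairs
p_id_given_nonmate = 6 / nonmated_vin     # false positive rate on nonmated VIn pairs
p_excl_given_nonmate = 3622 / nonmated_vin
p_excl_given_mate = 450 / mated_vin       # false negative rate on mated VIn pairs

ppv = (p_id_given_mate * p_mate) / (
    p_id_given_mate * p_mate + p_id_given_nonmate * (1 - p_mate))
npv = (p_excl_given_nonmate * (1 - p_mate)) / (
    p_excl_given_nonmate * (1 - p_mate) + p_excl_given_mate * p_mate)

print(f"PPV = {ppv:.1%}")   # ~99.8%, the same as 3663/3669
print(f"NPV = {npv:.1%}")   # ~88.9%, the same as 3622/4072
```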
In casework, the prevalence of mated pair comparisons varies substantially among organizations, by case type, and by how candidates are selected. Mated comparisons are far more prevalent in cases where the exemplars come from individuals suspected of leaving the latent print because of nonfingerprint evidence than when candidates come from an AFIS trawl. The predictive values given above therefore would not apply in most cases. The final installment will discuss a much better way to use the experiment to inform a judge or jury about the value of a fingerprint identification.
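To illustrate how strongly the predictive values depend on the prior probability, the same Bayes’ rule calculation can be rerun with a much lower prevalence of mated comparisons. The 10% figure in the sketch below is purely hypothetical and is not an estimate for any actual caseload.

```python
# Same examiner performance rates as in the experiment, but with a hypothetical
# 10% prevalence of mated comparisons (purely illustrative).
p_id_given_mate = 3663 / 5969      # ~0.614
p_id_given_nonmate = 6 / 4083      # ~0.0015
p_mate = 0.10                      # hypothetical prior probability of a mate

ppv = (p_id_given_mate * p_mate) / (
    p_id_given_mate * p_mate + p_id_given_nonmate * (1 - p_mate))
print(f"PPV at 10% prevalence = {ppv:.1%}")   # ~97.9%, lower than the 99.8% in the experiment
```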