Friday, April 29, 2016

False Justice and Prosecutors' Fallacies

False Justice, by Jim and Nancy Petro, is an engaging, first-person tale of a former Ohio Attorney General's involvement in correcting false convictions as well as a summary and refutation of, as the book's subtitle puts it, "Eight Myths that Convict the Innocent." 1/ The book reveals the frustrations that lawyers in the innocence movement know all too well, and it wisely warns prosecutors, police, and the public of pernicious fallacies about criminals and the criminal justice system.

But the book perpetuates a fallacy of a different sort -- a statistical fallacy often called, in legal circles, "the prosecutor's fallacy." 2/ With some types of trace evidence (particularly DNA evidence), it is feasible to estimate the probability of a match between a suspect and the trace evidence given that the suspect is not the source of the trace at the crime scene. We can write this coincidental match probability as P(Match | ~Source).

The problem is that the judge or jury wants to know something else: the probability that a suspect is not the source of the DNA given that there is a match, P(~Source | Match). These two conditional probabilities are conceptually distinct. Sometimes they can be numerically identical or very close to one another, but other times they are not even close. The statistical or logical fallacy consists of naively transforming P(Match | ~Source) into P(~Source | Match).
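
The link between the two conditional probabilities can be written out with Bayes' rule. What follows is the standard statement of the rule in this post's notation, offered as a reminder rather than as anything taken from the book:

```latex
P(\lnot\text{Source} \mid \text{Match})
  = \frac{P(\text{Match} \mid \lnot\text{Source})\, P(\lnot\text{Source})}
         {P(\text{Match} \mid \lnot\text{Source})\, P(\lnot\text{Source})
          + P(\text{Match} \mid \text{Source})\, P(\text{Source})}
```

The denominator is P(Match), so the two quantities coincide only when P(~Source) happens to equal P(Match). In every other case the prior probability P(Source) matters, and transposing the conditional gives the wrong number.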

The first instance of this transposition occurs at page 47, when the Petros quote from a letter intended to persuade a county prosecutor of Clarence Elkins' innocence (and of the guilt of a different inmate in the same cellblock):
We had a very convincing match. In a letter [the Ohio Innocence Project] and Elkin's attorneys ... informed the Summit County prosecutor that newly conducted DNA testing "conclusively exonerates Elkins and implicates Earl Mann in the murder and rapes in which Elkins was convicted." The letter explained that the full profile of the DNA from the girl's panties and Mrs. Judy Johnson's vaginal swab were "consistent with Earl Mann's DNA for full 12-point match."
How convincing was this match? This "full 12-point match" did not involve a random match probability P(Match | ~Source) as small as those for normal STR matches. It came from Y-STR testing. Ordinary forensic STR testing uses loci scattered across different pairs of autosomal chromosomes. For those STRs, estimating the probability of a 12-locus match would involve 24 multiplications of smallish fractions and give rise to tiny match probabilities for any given profile. Not so for forensic Y-STR testing. Y-STRs all lie on a single Y chromosome and are inherited father to son, as one package. Multiplying the population frequencies for individual Y-STRs would not make sense. 3/ Instead of multiplying,
As in all Y-STR DNA analysis, the odds of finding a match are calculated on how many times that specific configuration of markers has been seen in a particular database. In this case, Earl Mann's DNA, in a database of 4,000 samples, matched the crime scene DNA. The letter explained, "Thus far, it ... is a unique Y-STR profile, and there is less than a 1 in 4,000 chance that it is not Earl Mann who left his DNA at the crime scene in the most highly probative areas."
Presumably, "less than ... 1 in 4,000" refers to the fraction 1/4001 that expresses how often the profile has been seen -- only once -- compared to how many Y-STR profiles have been recorded -- 4000 previous profiles plus Mann's. 4/ A 1/4001 "chance that it is not Earl Mann" given that the trace DNA and Mann's share the Y-STR profile is P(~Source | Match). In contrast, 1/4001 is the probability of randomly picking Mann's profile from a population in which 1/4001 of the profiles are just like Mann's. It is P(Match | ~Source). Elkins' lawyers have transposed. To build their case against Mann, they have committed the prosecutor's fallacy.
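
To make the contrast between multiplying and counting concrete, here is a rough Python sketch. The allele frequencies are hypothetical (every allele is set to 0.1 purely for illustration), every locus is treated as heterozygous, and the function names are my own. Only the 4000-profile database figure comes from the letter.

```python
import math

# Hypothetical allele frequencies: 12 autosomal loci, two alleles each,
# every allele assigned a frequency of 0.1 for illustration only.
hypothetical_allele_pairs = [(0.1, 0.1)] * 12


def product_rule_estimate(allele_pairs):
    """Multiply per-locus genotype frequencies (2pq for a heterozygote).

    This assumes independence within and across loci, which is why it is
    not appropriate for Y-STRs inherited together as one package.
    """
    return math.prod(2 * p * q for p, q in allele_pairs)


def counting_estimate(times_seen_before, database_size):
    """Counting estimate as described in the post: add the newly observed
    profile to both the count and the database size."""
    return (times_seen_before + 1) / (database_size + 1)


autosomal_estimate = product_rule_estimate(hypothetical_allele_pairs)
ystr_estimate = counting_estimate(0, 4000)  # never seen in 4000 profiles

print(f"{autosomal_estimate:.3e}")  # 4.096e-21
print(f"{ystr_estimate:.2e}")       # 2.50e-04
```

With these made-up inputs the product-rule figure is many orders of magnitude smaller than 1/4001, which is the sense in which a Y-STR match is much weaker evidence than an ordinary 12-locus STR match.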

Now, Earl Mann was almost certainly guilty -- but not just because he had a matching profile. According to the 2000 Census, Summit County was home to approximately 140,000 men between the ages of 20 and 59. At a rate of 1 man per 4001, we would expect to find 140,000/4001 ≈ 35 of them with matching DNA. Looking at just the Y-STR match, it no longer sounds as if the chance that Mann was not the source is only 1/4001.

In fact, one could argue the chance that Mann was not the source is P(~Source | Match) = 34/35! After all, there were some 35 men in the right age range and locale for whom one could say "the full profile of the DNA from the girl's panties and Mrs. Judy Johnson's vaginal swab were 'consistent with ... for full 12-point match.'" Mann is just one of them. As such, for him, P(Source | Match) = 1/35; hence, P(~Source | Match) = 34/35.  5/
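
The counting argument above takes only a few lines of Python to reproduce. The figures are the ones already given in the post. The variable names are mine, and rounding the expected count to a whole number of men is a simplification.

```python
from fractions import Fraction

# Figures from the post; names are illustrative.
profile_frequency = Fraction(1, 4001)  # estimated P(Match | ~Source)
men_in_county = 140_000                # approximate 2000 Census figure

# Expected number of county men carrying the matching Y-STR profile.
expected_matches = men_in_county * profile_frequency

# Treat each of the (rounded) matching men as equally likely to be the source.
matching_men = round(expected_matches)
p_source_given_match = Fraction(1, matching_men)
p_not_source_given_match = 1 - p_source_given_match

print(matching_men)                                    # 35
print(p_source_given_match, p_not_source_given_match)  # 1/35 34/35
```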

The passage quoted above is not the only instance of transposition in False Justice. It occurs just about every time the Petros quote a random match probability. Most of these probabilities are so small that the resulting likelihood ratio would swamp any reasonable prior probability, so in those instances the transposition makes little practical difference. Still, False Justice does not get its description of the meaning of small match probabilities quite right.
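
Here is a sketch of why a tiny match probability swamps the prior while 1/4001 does not. It uses the odds form of Bayes' rule and assumes that P(Match | Source) = 1 and that each of roughly 140,000 men has the same prior probability (the assumption in note 5). The autosomal match probability of 1e-15 is hypothetical, chosen only for illustration, and the function name is my own.

```python
def posterior_not_source(prior_source: float, match_prob: float) -> float:
    """P(~Source | Match) from the odds form of Bayes' rule.

    Assumes P(Match | Source) = 1, so the likelihood ratio is 1 / match_prob.
    """
    prior_odds = prior_source / (1.0 - prior_source)
    likelihood_ratio = 1.0 / match_prob
    posterior_odds = likelihood_ratio * prior_odds
    return 1.0 / (1.0 + posterior_odds)


# Equal prior for each of about 140,000 men (assumption from note 5).
prior = 1 / 140_000

# Y-STR figure from the post: the posterior is nowhere near 1/4001.
ystr_posterior = posterior_not_source(prior, 1 / 4001)

# Hypothetical autosomal STR match probability, for illustration only.
autosomal_posterior = posterior_not_source(prior, 1e-15)

print(f"{ystr_posterior:.3f}")       # 0.972
print(f"{autosomal_posterior:.1e}")  # 1.4e-10
```

Under these assumptions the Y-STR posterior comes out near 0.972, in line with the 34/35 ≈ 0.971 of the counting argument. With the hypothetical autosomal figure, the chance that the matching man is not the source is negligible even though the prior is the same.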

  1. Jim Petro & Nancy Petro, 2015. False Justice: Eight Myths that Convict the Innocent. Routledge: New York, NY (rev. ed.).
  2. William C. Thompson & Edward L. Schumann, 1987. Interpretation of Statistical Evidence in Criminal Trials: The Prosecutor's Fallacy and the Defense Attorney's Fallacy. Law and Human Behavior, 11(3): 167-187 (introducing the phrase).
  3. See, e.g., David H. Kaye, 2010. The Double Helix and the Law of Evidence. Harvard Univ. Press: Cambridge, MA.
  4. Another way to estimate the Y-STR profile frequency is more commonly used, but that is tangential to the issue of transposition.
  5. A better way to arrive at P(~Source | Match) is to apply Bayes' rule. That formula yields a value very close to 34/35 if one assumes that Mann and every other man in Summit County in the age range mentioned has the same prior probability of being the source of the trace DNA and that everyone else in the world has a source probability of zero.
