In the cornucopia of court opinions, errors in the description of statistical concepts are a dime a dozen. Most never see the light of day. Recently, however, Northwestern University Law Professor Jay Koehler illuminated a few in a Jurimetrics Journal essay entitled “Forensic Fallacies and a Famous Judge.” 1/
The judge is Richard Posner of the U.S. Court of Appeals for the Seventh Circuit. At roughly the same time that Koehler’s essay appeared, the ABA's flagship magazine showcased Judge Posner as “a prolific author whose seemingly endless curriculum vitae includes books on topics as varied as the economic basis of justice, intelligence reform, Bush v. Gore, the failures of capitalism, the Clinton impeachment, law in literature, antitrust theory, disaster response and the origins of nude dancing. ... a restless and engaged public intellectual.” (The ABA Journal’s interesting interview with him that followed these remarks is available online.)
Judge Posner's days as a professor at the University of Chicago Law School also made him a legendary figure in the law-and-economics movement, and Koehler identifies him as “a brilliant, quantitatively minded jurist ... the most-cited legal scholar in the world.” So what kinds of errors did this famous judge make?
You will have to read Koehler’s essay for his full answer. Here, I elaborate on, and supply additional criticism (along with some defense) of, one passage from the opinion in United States v. Ford.
2/ The defendant in the case, John Ford, had been convicted of bank robbery before. Northwestern Law School’s Appellate Advocacy Center represented him on appeal from this second conviction. The Center convinced the Seventh Circuit panel that the photo spread shown the bank manager was so suggestive that his identification of the mug shot of Ford as the armed robber may well have violated due process. Despite this error, the Court of Appeals did not reverse the conviction. Sua sponte, the panel affirmed on the ground that the error was harmless.
It was harmless, in the court’s view, because other evidence was overwhelming. The other evidence was “the bank manager's description of the robber,” who wore “a dust mask that covered his nose and mouth,” together with proof “that the dust mask found outside the bank was the robber's, and the DNA found on the dust mask matched the defendant's DNA.”
3/
In explaining this harmless-error analysis, Judge Posner wrote:
Although the defendant's lawyer tried to throw dust in the jurors' eyes by a vigorous challenge to the DNA evidence, and might have succeeded with another jury, the challenge had no merit. What is involved, very simply, in forensic DNA analysis is comparing a strand of DNA (the genetic code) from the suspect with a strand of DNA found at the crime scene. See "DNA Profiling," Wikipedia, http://en.wikipedia.org/wiki/DNA_profiling (visited May 31, 2012). Comparisons are made at various locations on each strand. At each location there is an allele (a unique gene form). In one location, for example, the probability of a person's having a particular allele might be 7 percent, and in another 10 percent. Suppose that the suspect's DNA and the DNA at the crime scene contained the same alleles at each of the two locations. The probability that the DNA was someone else's would be 7 percent if the comparison were confined to the first location, but only .7 percent (7 percent of 10 percent) if the comparison were expanded to two locations, because the probabilities are independent. Suppose identical alleles were found at 10 locations, which is what happened in this case; the probability that two persons would have so many identical alleles, a probability that can be computed by multiplying together the probabilities of an identical allele at each location, becomes infinitesimally small — in fact 1 in 29 trillion, provided no other comparisons reveal that the alleles at the same location on the two strands of DNA are different. This is the same procedure used for determining the probability that a perfectly balanced coin flipped 10 times in a row will come up heads all 10 times. The probability is $.5^{10}$, which is less than 1 in 1000.
Because the DNA sample taken from the dust mask was incomplete, 10 was all the locations that could be profiled; but that was enough to enable a confident estimation (the 1 in 29 trillion) that the probability that DNA on the dust mask was not the defendant's was exceedingly slight. No evidence was presented to cast doubt on the validity of the DNA test conducted in this case or on the odds stated by the government's expert witness; nor did the cross-examination of the witness, though vigorous, undermine his testimony. The combination in this case of the unimpeached DNA evidence with the bank manager's description of the robber would have persuaded any reasonable jury beyond a reasonable doubt that the defendant was the robber. 4/
Professor Koehler only quotes the part of this discussion involving alleles with population frequencies of 7% and 10%. He observes that:
Judge Posner commits the source probability error. This error involves equating the random match probability (RMP) in DNA analysis with the probability that someone other than the matchee is the source of the matching DNA profile. Judge Posner commits this fallacy when he writes that “the probability that the DNA was someone else's would be 7 percent” in a situation where the suspect's DNA matches DNA found at a crime scene, and where both DNA profiles contain an allele that is common to 7% of the population. Posner's logical error is easily seen when one considers that if the probability that the DNA belongs to someone other than the defendant is 7%, then it would have to be the case that the probability that the DNA belongs to the defendant would have to be 93%. But is this right? It is not. By this logic, every person who shares this particular allele would have a 93% chance of being the source of the DNA in question. Because many people share this allele (for example, 7% of people in Chicago alone would include about 200,000 people), it obviously cannot be the case that each of 200,000 allele-matching Chicagoans has a 93% chance of being the source of the DNA sample in question. To be clear, the probability associated with a DNA match—even a very small probability such as one in one million or one in one billion—does not itself identify the probability that the matchee is or is not the source of a recovered DNA sample. This latter probability cannot be identified absent a fact finder's estimate of the “prior probability” (that is, before the DNA analysis) that the matchee is the source of the evidence. The prior probability depends on nonforensic factors including, but not limited to, the strength of the nonforensic evidence presented in the case (for example, eyewitness testimony, motive, and so forth). 
In other words, counterintuitive though it may seem, the results of a DNA analysis cannot be translated directly into a probability that someone is or is not the source of DNA evidence. Judge Posner is certainly not the only person who has failed to appreciate this point. Since the earliest days of DNA evidence, judges, attorneys, jurors and experts alike have been committing the source probability error. 5/
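Koehler's point about the prior probability can be illustrated with Bayes' rule. What follows is only a minimal sketch, not a calculation taken from the essay or the opinion: the function name `source_probability` is hypothetical, the priors are arbitrary illustrative values, and the sketch assumes that the true source always matches and that a non-source matches with probability equal to the random match probability.

```python
def source_probability(prior: float, rmp: float) -> float:
    """Posterior probability that the matching person is the source.

    Assumptions (illustrative, not from the essay or the opinion):
    - the true source matches with probability 1;
    - a person who is not the source matches with probability `rmp`.
    """
    if not (0.0 <= prior <= 1.0 and 0.0 <= rmp <= 1.0):
        raise ValueError("prior and rmp must be probabilities")
    numerator = prior                      # P(match | source) * P(source)
    denominator = numerator + (1.0 - prior) * rmp
    return numerator / denominator if denominator > 0 else 0.0


rmp = 0.07  # an allele shared by 7% of the population, as in the quoted example

# The same match statistic yields very different source probabilities
# depending on the (hypothetical) prior supplied by the other evidence.
for prior in (0.5, 0.01, 1e-6):
    posterior = source_probability(prior, rmp)
    print(f"prior = {prior:.0e}  ->  posterior = {posterior:.6f}")
```

Under these assumptions, the posterior equals 1 minus the match probability only by coincidence of the prior, which is the gap between a random match probability and a source probability that the passage describes.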
Professor Koehler is correct. For many decades judges have been committing not only the “source probability error” for DNA and other forms of trace evidence 6/ but also the more general fallacy of the transposed conditional with p-values and confidence intervals. 7/ Judge Posner’s opinion is yet another instance. However, in some respects the passage is both more and less troubling than Koehler indicates.
The court’s description of its computation displays a certain ignorance of the genetics of STRs (short tandem repeats) as well as a lack of precision in describing probabilities. First, the STR alleles are not “unique gene forms.” That is, they are not forms of genes, and no forensic STR allele is unique. Some STRs happen to lie within introns of genes, but even those alleles do not change the expressed proteins. Other courts of appeals use slightly better terminology in their opinions.
8/ One state court of appeals did a lot worse, writing that "Human cells contain two genes capable of being analyzed for DNA: mitochondrial DNA (mtDNA) and nuclear DNA (nDNA). We will take each of these genes in turn. ... Nuclear DNA is the larger of the DNA genes."
9/
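Second, it is not “the probabilities” that “are independent.” Independence is a property of events, not of probabilities: the occurrences of the alleles at the different loci are (nearly) independent events, and that is what permits multiplying their unconditional probabilities to arrive at the probability of the joint event (the profile).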
Third, the profile probability is not just the product of the allele frequencies. Two alleles occur at each locus. If one allele (A) at a locus has a 7% population frequency and the other one (B) at that same locus has a 10% frequency, then the frequency of the combination is expected to be $2(.07 \times .10) = 1.4\%$, not .7%. (The multiple of 2 is necessary because the A could have come from the father and the B from the mother or vice versa.) If only one peak (corresponding to A) appears at one locus and only one (B) appears at the other locus, then the two-locus probability is $.07^2 \times .10^2 = .0049\%$. (This assumes that both loci are homozygous, that is, that both parents transmitted the same allele. If the peak heights are not consistent with this supposition and drop-out is suspected, a conservative calculation would be $2(.07 \times 1) \times 2(.10 \times 1) = 2.8\%$.)
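The three calculations in this paragraph can be reproduced in a few lines. This is a minimal sketch under the usual textbook assumptions (Hardy-Weinberg proportions within a locus and independence across loci); the function names are hypothetical and are not drawn from any forensic software.

```python
from math import prod
from typing import Optional


def genotype_frequency(p: float, q: Optional[float] = None) -> float:
    """Expected single-locus genotype frequency.

    Heterozygote with allele frequencies p and q: 2pq.
    Homozygote (q omitted): p squared.
    Assumes Hardy-Weinberg proportions (an assumption of this sketch).
    """
    return p * p if q is None else 2.0 * p * q


def conservative_single_peak(p: float) -> float:
    """Single observed peak when drop-out is suspected: 2(p x 1)."""
    return 2.0 * p * 1.0


def profile_frequency(locus_frequencies) -> float:
    """Multiply per-locus frequencies, assuming independence across loci."""
    return prod(locus_frequencies)


# One locus, alleles A (7%) and B (10%): 2(.07 x .10)
print(f"{genotype_frequency(0.07, 0.10):.4%}")

# Two loci, each showing a single peak and treated as homozygous: .07^2 x .10^2
both_homozygous = profile_frequency(
    [genotype_frequency(0.07), genotype_frequency(0.10)]
)
print(f"{both_homozygous:.4%}")

# Same two loci, conservative treatment if drop-out is suspected
conservative = profile_frequency(
    [conservative_single_peak(0.07), conservative_single_peak(0.10)]
)
print(f"{conservative:.4%}")
```

Keeping the per-locus step separate from the multiplication across loci shows where each assumption enters: the factor of 2 belongs to the genotype at a single locus, while the product rule belongs to the combination of loci.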
Of course, Judge Posner could respond by accusing Professor Koehler (and now me) of trying “to throw dust in the [readers'] eyes” or of missing the forest for the trees. To an extent, that response would be correct.
Sure, it would be nice if the Seventh Circuit could adhere to standard terminology in genetics and statistics, but none of these errors had an effect on the case. In Ford, a substantial fraction of the population could not reasonably be regarded as the source of the DNA on the dust mask. At least not if one believes that the random-match probability is p = 1/29,000,000,000,000. In that event, the probability of at least one unrelated individual in all of Chicago having matching DNA could be approximated as $1 - e^{-np}$, where n is the
Chicago population of about 2.7 million men, women, and children. With np of roughly 9.3 × 10^-8, this duplication probability is only about 0.000009%, or roughly 1 in 11 million.
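The approximation can be checked directly with the two figures given in the text. The sketch below is illustrative only: the function names are hypothetical, and it treats the n individuals as unrelated with independent, identical match probabilities, which is the premise of the approximation itself.

```python
import math


def duplication_probability(n: int, p: float) -> float:
    """Poisson approximation 1 - exp(-n * p) to the chance that at least
    one of n unrelated people has the matching profile."""
    # -expm1(-x) equals 1 - exp(-x) but keeps precision when x is tiny.
    return -math.expm1(-n * p)


def exact_duplication_probability(n: int, p: float) -> float:
    """Exact binomial counterpart, 1 - (1 - p)**n, computed stably."""
    return -math.expm1(n * math.log1p(-p))


n = 2_700_000        # Chicago population used in the text
p = 1 / 29e12        # random-match probability of 1 in 29 trillion

approx = duplication_probability(n, p)
exact = exact_duplication_probability(n, p)
print(f"approximation: {approx:.3e}  ({approx:.7%})")
print(f"exact:         {exact:.3e}  ({exact:.7%})")
```

Using `math.expm1` and `math.log1p` rather than `1 - math.exp(...)` is a design choice for numerical stability, since np is very small here.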
Well, what about a brother matching at these ten loci? A rough calculation (from the formula at page 87 of the 1992 NRC Committee report on forensic DNA technology) gives a full sibling match probability of about 0.001%.
In other words, Judge Posner’s linguistic or computational errors in describing DNA profiling are themselves harmless in this case. But it would be unfortunate if other courts were to rely on the infelicitous phrasing in Ford, especially in cases in which the random-match probabilities are less extreme. It also would be dangerous to emulate the judge's reliance on Wikipedia for an understanding of forensic DNA evidence issues, but that's a complaint for another day.
Notes
1. Jonathan J. Koehler, Forensic Fallacies and a Famous Judge, 54 Jurimetrics J. 211 (2014).
2. 683 F.3d 761 (7th Cir. 2012).
3. Id. at 767.
4. Id. at 768.
5. Koehler, supra note 1, at 215–16 (notes omitted).
6. David H. Kaye, David E. Bernstein & Jennifer L. Mnookin, The New Wigmore: A Treatise on Evidence: Expert Evidence § 14.1.2 (2d ed. 2011); 2014 Cumulative Supp. § 14.5.1(b).
7. Id. §§ 12.8.2(b), 12.8.3(a).
8. E.g., United States v. Mitchell, 652 F.3d 387, 400 (3d Cir. 2011) ("non-genic stretches of DNA not presently recognized as being responsible for trait coding") (quoting United States v. Kincade, 379 F.3d 813, 818 (9th Cir. 2004) (en banc) (plurality opinion)).
9. Diggs v. State, 73 A.3d 306, 317–18 (Md. Ct. Spec. App. 2013).