Monday, September 18, 2023

Is "Match Form" Testimony Poor Form?

The likelihood ratio (LR) is essentially a number that expresses how many times more probable the data from an experiment are if one hypothesis is true than if another hypothesis is true. For example, suppose we make a single measurement of the height of a known individual. Then we do the same for an individual who is covered from head to foot by a sheet. We want to know if we have measured the same individual twice or two different individuals once each. The closer the two measured heights are to one another, the more the measurements support the same-source hypothesis as opposed to the different-source hypothesis.

Why? Because closer measurements are more probable for same-source pairs than for different-source pairs. This implies that in repeated experiments with some proportion of same-source and different-source pairs, the closer measurements will tend to filter out the different-source pairs (which tend to have more distance between the two measurements) and to include more same-source pairs (which tend to be marked by the more similar measurements).
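The filtering effect can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the text: heights are normally distributed in the population (mean 170 cm, standard deviation 7 cm), measurement error is normal (standard deviation 1 cm), half of the pairs are same-source, and "close" means within 1 cm. The function name and all parameter values are hypothetical.

```python
import random

def same_source_share_among_close_pairs(
    n_pairs=20000, p_same=0.5, pop_mean=170.0, pop_sd=7.0,
    meas_sd=1.0, threshold=1.0, seed=0,
):
    """Return the fraction of same-source pairs among pairs whose two
    measured heights differ by no more than `threshold` (in cm)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    close_total = 0
    close_same = 0
    for _ in range(n_pairs):
        same = rng.random() < p_same
        true_1 = rng.gauss(pop_mean, pop_sd)
        # A same-source pair shares one true height; otherwise draw a second person.
        true_2 = true_1 if same else rng.gauss(pop_mean, pop_sd)
        measured_1 = true_1 + rng.gauss(0.0, meas_sd)
        measured_2 = true_2 + rng.gauss(0.0, meas_sd)
        if abs(measured_1 - measured_2) <= threshold:
            close_total += 1
            close_same += same  # bool counts as 0 or 1
    return close_same / close_total

share = same_source_share_among_close_pairs()
print(f"Same-source share among close pairs: {share:.2f} (base rate 0.50)")
```

Under these assumed values, the close pairs are dominated by same-source pairs even though only half of all pairs are same-source.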

By quantifying the relative probability for the data given each hypothesis, the LR indicates how well a given degree of similarity discriminates between the hypotheses. Its value is

LR = Probability(data | H1) / Probability(data | H2),

where H1 is the same-source hypothesis and H2 is the different-source hypothesis.
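For the height example, the ratio can be sketched numerically. The model here is an assumption made for illustration, not something taken from any laboratory's practice: the "data" are reduced to the difference between the two measurements, measurement error is normal with standard deviation `meas_sd`, and true heights in the population are normal with standard deviation `pop_sd`. Under H1 the difference then has variance 2 × meas_sd², and under H2 it has variance 2 × (meas_sd² + pop_sd²). The function name and default values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def height_lr(diff_cm, meas_sd=1.0, pop_sd=7.0):
    """LR = P(difference | H1: same source) / P(difference | H2: different sources),
    using normal densities for the difference between two height measurements."""
    # H1: both measurements are of one person, so only measurement error differs.
    same_source = NormalDist(0.0, sqrt(2.0) * meas_sd)
    # H2: two people, so population variation adds to the measurement error.
    different_source = NormalDist(0.0, sqrt(2.0 * (meas_sd**2 + pop_sd**2)))
    return same_source.pdf(diff_cm) / different_source.pdf(diff_cm)

for diff in (0.0, 2.0, 10.0):
    print(f"difference = {diff:4.1f} cm  ->  LR = {height_lr(diff):.3g}")
```

In this sketch the LR exceeds 1 for small differences (supporting H1) and falls below 1 for large ones (supporting H2). A fuller treatment would also account for how common the measured height itself is, which this simplified version ignores.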

Likelihood ratios are routinely reported in cases with samples from crime scenes or victims that contain DNA from several individuals. A DNA analyst might testify that the electropherograms are ten thousand times more probable if the defendant's DNA is present than if an unrelated person's DNA is there. \1/ We may call such statements "relative-probability-of-the-data" testimony.

But some DNA experts prefer what they call a "match form" for the presentation. \2/ An example of a "match form" statement is that “[a] match between the shoes … and [the defendant] is 9.67 thousand times more probable than a coincidental match to an unrelated African-American person.” \3/ More generally, a match-form presentation states that “a match between the evidence and reference [samples] is (some number) times more probable than coincidence.” \4/

This formulation has been criticized as highly misleading. According to William Thompson, it is

likely to mislead lay people and foster misunderstandings that are detrimental to people accused of a crime. I recommend that Cybergenetics immediately cease using this misleading language and find a better way to explain its findings. Standards development organizations such as OSAC should consider developing standards that address the appropriateness, or inappropriateness, of such presentations. Courts should refuse to admit PG [probabilistic genotyping] evidence when it is mischaracterized in this manner. Lawyers involved in cases in which defendants were convicted based on this misleading language should consider the appropriateness of appellate remedies. \5/

The main concern is that juxtaposing “match” and “coincidence” will lead judges and jurors to think that the "match statistic" pertains to the probabilities of hypotheses (H1 and H2) about the source of the DNA rather than probabilities about the laboratory’s data. In simpler terms, the concern is that most people will understand "coincidence" and "coincidental match" as an assertion that the observed match is the result of coincidence; moreover, they will think that "match" is an assertion that the defendant is the matcher. If that happens, then the assertion that a match is 10,000 times more likely than coincidence would be (mis)understood as a statement that the odds against a coincidence having occurred are 10,000 to 1.

Instead, LR = 10,000 should be understood (according to Bayes' rule) as a statement about the change in the odds that defendant, as opposed to some unknown, unrelated person, is the matcher. For example, if defendant has a strong alibi—strong enough, in conjunction with other evidence, to establish that the prior odds of H1 as opposed to H2 are only 1 to 5,000—then this LR raises the odds to 10,000 x 1:5,000 = 2:1. Such final odds are far from overwhelming.
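The arithmetic in this example is just the odds form of Bayes' rule: posterior odds = LR × prior odds. A minimal sketch, using exact fractions and hypothetical function names, reproduces the 2:1 figure and converts it to a probability:

```python
from fractions import Fraction

def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' rule: posterior odds = LR x prior odds."""
    return likelihood_ratio * prior_odds

def odds_to_probability(odds):
    """Convert odds in favor of H1 (as a single ratio) to a probability of H1."""
    return odds / (1 + odds)

prior = Fraction(1, 5000)   # prior odds of H1 to H2, from the alibi example
lr = Fraction(10000)        # likelihood ratio reported by the analyst

post = posterior_odds(prior, lr)
print(post)                       # 2, i.e., odds of 2:1
print(odds_to_probability(post))  # 2/3
```

Odds of 2:1 correspond to a probability of about 0.67 that the defendant is the source, which is very different from the 10,000:1 odds that a transposed reading of the LR would suggest.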

Cybergenetics does not seem disposed to abandon "match form" testimony. Dr. Thompson claims that for fingerprint comparisons, "'[m]atch' is shorthand for source identification, [s]o, it is predictable that many lay people will interpret the term 'match,' when used to describe DNA evidence, to mean that the person of interest has been identified either definitively or with a high degree of certainty as a contributor." Pointing to a dictionary, Cybergenetics angrily responds that this is just "Thompson’s private language." \6/ But a tradition in forensic science is to equate a "match" with an identification, as shown by the title of articles such as "Is a Match Really a Match? A Primer on the Procedures and Validity of Firearm and Toolmark Identification." \7/ In popular culture, the term may have a similar connotation. Perhaps YouTube trumps Merriam-Webster. \8/

As far as I know, no studies compare the comprehensibility of relative-probability-of-the-data testimony to match-form testimony. Therefore, the law and the practice have to be guided by intuition. My sense is that avoiding the transposition of the probabilities in a likelihood ratio requires special care if the match-versus-coincidence approach is used. The witness must explain not only that a "DNA match" is merely a degree of similarity between the electropherograms being compared, but also that "coincidence" or "coincidental match" is shorthand for the proposition that the "match" is a match to an unrelated person (or other specified source), and that it is not a conclusion that a coincidence has occurred. The phrase "coincidental match" is too ambiguous to be left undefined.

In short, I am not sure that an absolute rule against match-form testimony is necessary, but I see no clear benefit to the phraseology. Relative-probability-of-the-data testimony seems to be a more straightforward description of a DNA likelihood ratio. However, it too needs explanation to reduce the risk of blindly transposing the conditional probabilities for the data into conditional probabilities for the hypotheses. Cases announcing that a likelihood ratio is a ratio of source-hypothesis probabilities are legion. \9/

Notes

  1. Cf. Commonwealth v. McClellan, 178 A.3d 874 (Pa. Super. Ct. 2018) ("[I]t was determined that the DNA sample taken from the gun's grip was at least 384 times more probable if the sample originated from Appellant and two unknown, unrelated individuals than if it originated from a relative to Appellant and two unknown, unrelated individuals").
  2. Mark Perlin, Explaining the Likelihood Ratio in DNA Mixture Interpretation, in Proceedings of Promega's Twenty First International Symposium on Human Identification at 7 (Dec. 29, 2010); cf. Mark W. Perlin, Joseph B. Kadane & Robin W. Cotton, Match Likelihood Ratio for Uncertain Genotypes, 8 Law, Probability & Risk 289 (2009), https://doi.org/10.1093.
  3. United States v. Anderson, No. 4:21-CR-00204, 2023 WL 3510823, at *3 (M.D. Pa. Apr. 26, 2023). For additional instances of “match form” testimony or reporting, see Howell v. Schweitzer, No. 1:20-cv-2853, 2023 WL 1785530 (N.D. Ohio Jan. 11, 2023); Sanford v. Russell, No. 17-13062, 2021 WL 1186495 (E.D. Mich. Mar. 30, 2021); State v. Anthony, 266 So.3d 415 (La. Ct. App. 2019).
  4. Mark W. Perlin et al., TrueAllele Casework on Virginia DNA Mixture Evidence: Computer and Manual Interpretation in 72 Reported Criminal Cases, 9 PLOS ONE e92837, at 8 (2014).
  5. William C. Thompson, Uncertainty in Probabilistic Genotyping of Low Template DNA: A Case Study Comparing STRMix™ and TrueAllele™, 68 J. Forensic Sci. 1049, 1059 (2023), doi:10.1111/1556-4029.15225.
  6. Mark W. Perlin et al., Reporting Exclusionary Results on Complex DNA Evidence, A Case Report Response to 'Uncertainty in Probabilistic Genotyping of Low Template DNA: A Case Study Comparing Strmix™ and Trueallele®' Software 31 (May 18, 2023), available at SSRN: https://ssrn.com/abstract=4449313 or http://dx.doi.org/10.2139/ssrn.4449313.
  7. Stephen G. Bunch et al., Is a Match Really a Match? A Primer on the Procedures and Validity of Firearm and Toolmark Identification, 11 Forensic Science Communications, No. 3 (2009), https://archives.fbi.gov/archives/about-us/lab/forensic-science-communications/fsc/july2009/review/2009_07_review01.htm.
  8. In addition, a dictionary definition of "match" (https://www.merriam-webster.com/dictionary/match) is "a pair suitably associated." Suitable association suggests that a hypothesis about the nature of the association is true.
  9. E.g., State v. Pickett, 246 A.3d 279 (N.J. App. 2021) (The "likelihood ratio [is] a statistic measuring the probability that a given individual was a contributor to the sample against the probability that another, unrelated individual was the contributor.") (citing Justice Ming W. Chin et al., Forensic DNA Evidence § 5.5 (2020)).
