Wednesday, September 27, 2023

How Accurate Is Mass Spectrometry in Forensic Toxicology?

Mass spectrometry (MS) is the "[s]tudy of matter through the formation of gas-phase ions that are characterized using mass spectrometers by their mass, charge, structure, and/or physicochemical properties." ANSI-ASB Standard 098 for Mass Spectral Analysis in Forensic Toxicology § 3.11 (2023). MS has become "the preferred technique for the confirmation of drugs, drug metabolites, relevant xenobiotics, and endogenous analytes in forensic toxicology." Id. at Foreword.

But no "criteria for the acceptance of mass spectrometry data have been ... universally applied by practicing forensic toxicologists." Id. Therefore, the American Academy of Forensic Sciences' Academy Standards Board (ASB) promulgated a "consensus based forensic standard[] within a framework accredited by the American National Standards Institute (ANSI)," id., that provides "minimum requirements." Id. § 1.

To a nonexpert reader (like me), the minimum criteria for the accuracy of MS "confirmation" are not apparent. Consider Section 4.2.1 on "Full-Scan Acquisition using a Single-Stage Low-Resolution Mass Analyzer." It begins with the formal requirement that

[T]he following shall be met when using a single-stage low-resolution mass analyzer in full-scan mode.
a) A minimum of a single diagnostic ion shall be monitored.

It is hard to imagine an MS test method that would not meet the single-ion minimum. Perhaps what makes this requirement meaningful is that the one or more ions must be "diagnostic." However, this adjective begs the question of what the minimum requirement for diagnosticity should be. A "diagnostic ion" is a "molecular ion or fragment ion whose presence and relative abundance are characteristic of the targeted analyte." Id. § 3.4. So what makes an ion "characteristic"? Must it always be present (in some relative abundance) when the "targeted analyte" is in the specimen (at or above some limit of detection)? That would make the ion a marker for the analyte with perfect sensitivity: Pr(ion|analyte) = 1. Even so, it would not be characteristic of the analyte unless its presence is highly specific, that is, unless Pr(no-such-ion|something-else) ≅ 1. But the standard contains no minimum values for sensitivity, specificity, or the likelihood ratio Pr(ion|analyte) / Pr(ion|something-else), which quantifies the positive diagnostic value of a binary test. \1/
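As a rough illustration of the quantities just mentioned, the following Python sketch computes the likelihood ratio for a positive result from a sensitivity and a specificity. The function name and the numerical values are hypothetical, chosen only to show the arithmetic; ASB 098 supplies no such figures.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """Return Pr(ion | analyte) / Pr(ion | something else).

    sensitivity = Pr(ion | analyte)
    specificity = Pr(no such ion | something else)
    """
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity < 1.0):
        raise ValueError("need 0 <= sensitivity <= 1 and 0 <= specificity < 1")
    false_positive_rate = 1.0 - specificity  # Pr(ion | something else)
    return sensitivity / false_positive_rate


# Hypothetical values (assumptions, not taken from any standard).
lr_high_specificity = positive_likelihood_ratio(1.0, 0.99)  # about 100
lr_weak_specificity = positive_likelihood_ratio(1.0, 0.50)  # 2.0
print(lr_high_specificity, lr_weak_specificity)
```

Under these made-up numbers, an ion that is always present with the analyte but also shows up half the time with other substances has a likelihood ratio of only 2, which is why perfect sensitivity alone does not make an ion "characteristic."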

This is not to say that there are no minimum requirements in the standard. There certainly are. For example, Section 4.2.1 continues:

b) When monitoring more than one diagnostic ion:
1. ratios of diagnostic ions shall agree with those calculated from a concurrently analyzed reference material given the tolerances shown in Table 1; OR
2. the spectrum shall be compared using an appropriate library search and be above a pre-defined match factor as demonstrated through method validation.

But the standard does not explain how the tolerances in Table 1 were determined. What are the conditional error probabilities that they produce?

Likewise, establishing a critical value for the "match factor" \2/ before using it is essential to a frequentist decision rule, but what are the operating characteristics of the rule? "Method validation" is governed (to the extent that voluntary standards govern anything) by ANSI-ASB 036, Standard Practices for Method Validation in Forensic Toxicology (2019). This standard requires testing to establish that a method is "fit for purpose," but it gives no accuracy rates that would fulfill this vague directive.
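To make "operating characteristics" concrete, here is a minimal Python sketch that estimates the sensitivity and specificity of a threshold rule from validation data. The match-factor scores, the 0-100 scale, the threshold, and the function name are all invented for illustration; neither ASB 098 nor ASB 036 prescribes them.

```python
from typing import Sequence, Tuple


def operating_characteristics(
    scores_analyte_present: Sequence[float],
    scores_analyte_absent: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Estimate (sensitivity, specificity) of the rule
    'report a match when the match factor is above the threshold'."""
    if not scores_analyte_present or not scores_analyte_absent:
        raise ValueError("both validation sets must be non-empty")
    true_positives = sum(s > threshold for s in scores_analyte_present)
    true_negatives = sum(s <= threshold for s in scores_analyte_absent)
    sensitivity = true_positives / len(scores_analyte_present)
    specificity = true_negatives / len(scores_analyte_absent)
    return sensitivity, specificity


# Made-up validation scores (assumed 0-100 scale, assumed threshold of 80).
present = [97, 95, 92, 88, 79]
absent = [40, 55, 61, 72, 83]
sens, spec = operating_characteristics(present, absent, threshold=80)
print(sens, spec)  # 0.8 0.8
```

A validation report that stated estimates of this kind, with their sampling uncertainty, would let a reader judge whether a pre-defined match factor is in fact "fit for purpose."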

Firms that sell antibody test kits for detecting Covid-19 infections no longer can sell whatever they deem is fit for purpose. In May 2020, the FDA stopped issuing emergency use permits for these diagnostic tests without validation showing that they "are 90% 'sensitive,' or able to detect coronavirus antibodies, and 95% 'specific,' or able to avoid false positive results." \3/ Forensic toxicologists do not seem to have proposed such minimum requirements for MS tests.

Notes

  1. Other toxicology standards refer to ASB 098 as if it indicates what is required to apply the label "diagnostic." ANSI/ASB 113, Standard for Identification Criteria in Forensic Toxicology, § 4.5.2 (2023) ("All precursor and product ions are required to be diagnostic per ASB Standard 098, Standard for Mass Spectral Data Acceptance in Forensic Toxicology (2022).").
  2. Section 3.13 defines "match factor" as a "mathematical value [a scalar?] that indicates the degree of similarity between an unknown spectrum and a reference spectrum."
  3. See How Do Forensic-science Tests Compare to Emergency COVID-19 Tests?, Forensic Sci., Stat. & L., May 5, 2020 (quoting Thomas M. Burton, FDA Sets Standards for Coronavirus Antibody Tests in Crackdown on Fraud, Wall Street J., Updated May 4, 2020 8:24 pm ET, https://www.wsj.com/articles/fda-sets-standards-for-coronavirus-antibody-tests-in-crackdown-on-fraud-11588605373).

Monday, September 18, 2023

Use with Caution: NIJ's Training Course in Population Genetics and Statistics for Forensic Analysts

The National Institute of Justice (NIJ) "is the research, development and evaluation agency of the U.S. Department of Justice . . . dedicated to improving knowledge and understanding of crime and justice issues through science." It offers a series of webpages and video recordings (a "training course") on Population Genetics and Statistics for Forensic Analysts. The course should be approached with caution. I have not worked through all the pages and videos, but here are a few things that rang alarm bells:


NIJ's Training: Many statisticians have employed what is known as Bayesian probability ... which is based on probability as a measure of one's degree of belief. This type of probability is conditional in that the outcome is based on knowing information about other circumstances and is derived from Bayes Theorem.

Comment: Bayes' rule applies to both objective and subjective probabilities. Both types of probability include conditional probabilities. The "type of probability" is not derived from Bayes' Theorem.

NIJ's Training: Conditional probability, by definition, is the probability P of an event A given that an event B has occurred. ... Take the example of a die with six sides. If one was to throw the die, the probability of it landing on any one side would be 1/6. This probability, however, assumes that the die is not weighted or rigged in any way, and that all of the sides contain a different number. If this were not true, then the probability would be conditional and dependent on these other factors.

Comment: The "other factors" are nothing more than part of the description of the experiment whose outcomes are the events that are observed. They are not conditioning events in a sample space.

NIJ's Training: The following equation can be used to determine the probability of the evidence given that a presumed individual is the contributor rather than a random individual in the population: LR = P(E/H1) / P(E/H0) ... . In the case of a single source sample, the hypothesis for the numerator (the suspect is the source of the DNA) is a given, and thus reduces to 1. This reduces to: LR = 1/ P(E/H0) which is simply 1/P, where P is the genotype frequency.

Comment: The hypothesis for the numerator of a likelihood ratio is always "a given"--that is, it goes on the right-hand-side of the expression for a conditional probability. So is the hypothesis in the denominator. Neither probability "reduces to 1" for that reason. Only if the "evidence" is the true genotype in both the recovered sample and the sample from the defendant can it be said that P(E|H1) = 1. In other words, to say that the probability of a reported match is 1 if the defendant is the source treats the probability of laboratory error as zero. That may be acceptable as a simplifying assumption, but the assumption should be made visible in a training course.

NIJ's Training: Although likelihood ratios can be used for determining the significance of single source crime stains, they are more commonly used in mixture interpretation. ... The use of any formula for mixture interpretation should only be applied to cases in which the analyst can reasonably assume "that all contributors to the mixed profile are unrelated to each other, and that allelic dropout has no practical impact."

Comment: This limitation does not apply to modern probabilistic genotyping software!
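The comment above on the single-source likelihood ratio observes that setting P(E|H1) = 1 treats the probability of laboratory error as zero. One simple way to make that assumption visible is sketched below in Python. The error model, the parameter names, and the numbers are assumptions introduced here for illustration; they are not part of the NIJ course and are not offered as estimates of any laboratory's actual error rates.

```python
def reported_match_lr(
    genotype_frequency: float,
    false_negative_rate: float = 0.0,
    false_positive_rate: float = 0.0,
) -> float:
    """LR for the evidence E = 'the laboratory reports a match'.

    Assumed model (hypothetical):
      P(E | H1) = 1 - false_negative_rate
      P(E | H0) = P * (1 - false_negative_rate) + (1 - P) * false_positive_rate
    where P is the genotype frequency.
    """
    p = genotype_frequency
    numerator = 1.0 - false_negative_rate
    denominator = p * (1.0 - false_negative_rate) + (1.0 - p) * false_positive_rate
    return numerator / denominator


# With both error rates set to zero, the LR is simply 1/P, as in the course.
lr_no_error = reported_match_lr(1e-9)
# With an illustrative false-positive rate of 1 in 10,000, the LR is far smaller.
lr_with_error = reported_match_lr(1e-9, false_positive_rate=1e-4)
print(lr_no_error, lr_with_error)
```

In this sketch, the familiar 1/P figure appears only when both error rates are zero; any nonzero false-positive rate that is large relative to P dominates the denominator.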

Is "Match Form" Testimony Poor Form?

The likelihood ratio (LR) is essentially a number that expresses how many times more probable the data from an experiment are if one hypothesis is true than if another hypothesis is true. For example, suppose we make a single measurement of the height of a known individual. Then we do the same for an individual who is covered from head to foot by a sheet. We want to know whether we have measured the same individual twice or two different individuals once each. The closer the two measured heights are to one another, the more the measurements support the same-source hypothesis as opposed to the different-source hypothesis.

Why? Because closer measurements are more probable for same-source pairs than for different-source pairs. This implies that in repeated experiments with some proportion of same-source and different-source pairs, the closer measurements will tend to filter out the different-source pairs (which tend to have more distance between the two measurements) and to include more same-source pairs (which tend to be marked by the more similar measurements).

By quantifying the relative probability for the data given each hypothesis, the LR indicates how well a given degree of similarity discriminates between the hypotheses. Its value is

LR = Probability(data | H1) / Probability(data | H2),

where H1 is the same-source hypothesis and H2 is the different-source hypothesis.
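The height example can be turned into numbers with a toy model, sketched here in Python. It assumes, purely for illustration, that the data are reduced to the difference between the two measurements, that measurement errors are independent and normal, and that heights in the population are normal. The standard deviations (1 cm for measurement error, 7 cm for the population) and the function names are hypothetical.

```python
import math


def normal_pdf(x: float, sd: float) -> float:
    """Density at x of a normal distribution with mean 0 and standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def height_lr(difference_cm: float, measurement_sd: float, population_sd: float) -> float:
    """LR = Probability(difference | H1) / Probability(difference | H2).

    H1 (same source): the two measurements differ only by measurement error.
    H2 (different sources): they also differ because people vary in height.
    """
    sd_same = math.sqrt(2.0) * measurement_sd
    sd_diff = math.sqrt(2.0 * measurement_sd**2 + 2.0 * population_sd**2)
    return normal_pdf(difference_cm, sd_same) / normal_pdf(difference_cm, sd_diff)


# Hypothetical standard deviations: 1 cm measurement error, 7 cm population spread.
for d in (0.0, 1.0, 5.0):
    print(d, height_lr(d, measurement_sd=1.0, population_sd=7.0))
```

Under these assumptions the LR exceeds 1 for small differences and falls below 1 for large ones, which is the sense in which closer measurements support H1 over H2.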

Likelihood ratios are routinely reported in cases with samples from crime scenes or victims that contain DNA from several individuals. A DNA analyst might testify that the electropherograms are ten thousand times more probable if the defendant's DNA is present than if an unrelated person's DNA is there. \1/ We may call such statements "relative-probability-of-the-data" testimony.

But some DNA experts prefer what they call a "match form" for the presentation. \2/ An example of a "match form" statement is that “[a] match between the shoes … and [the defendant] is 9.67 thousand times more probable than a coincidental match to an unrelated African-American person.” \3/ More generally, a match-form presentation states that “a match between the evidence and reference [samples] is (some number) times more probable than coincidence.” \4/

This formulation has been criticized as highly misleading. According to William Thompson, it is

likely to mislead lay people and foster misunderstandings that are detrimental to people accused of a crime. I recommend that Cybergenetics immediately cease using this misleading language and find a better way to explain its findings. Standards development organizations such as OSAC should consider developing standards that address the appropriateness, or inappropriateness, of such presentations. Courts should refuse to admit PG [probabilistic genotyping] evidence when it is mischaracterized in this manner. Lawyers involved in cases in which defendants were convicted based on this misleading language should consider the appropriateness of appellate remedies. \5/

The main concern is that juxtaposing “match” and “coincidence” will lead judges and jurors to think that the "match statistic" pertains to the probabilities of hypotheses (H1 and H2) about the source of the DNA rather than probabilities about the laboratory’s data. In simpler terms, the concern is that most people will understand "coincidence" and "coincidental match" as an assertion that the observed match is the result of coincidence; moreover, they will think that "match" is an assertion that the defendant is the matcher. If that happens, then the assertion that a match is 10,000 times more likely than coincidence would be (mis)understood as a statement that the odds against a coincidence having occurred are 10,000 to 1.

Instead, LR = 10,000 should be understood (according to Bayes' rule) as a statement about the change in the odds that defendant, as opposed to some unknown, unrelated person, is the matcher. For example, if defendant has a strong alibi—strong enough, in conjunction with other evidence, to establish that the prior odds of H1 as opposed to H2 are only 1 to 5,000—then this LR raises the odds to 10,000 x 1:5,000 = 2:1. Such final odds are far from overwhelming.
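The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of Bayes' rule in odds form; the function names are made up, and the conversion from odds to probability assumes that H1 and H2 are the only hypotheses under consideration.

```python
from fractions import Fraction


def posterior_odds(prior_odds: Fraction, likelihood_ratio: Fraction) -> Fraction:
    """Bayes' rule in odds form: posterior odds = likelihood ratio x prior odds."""
    return likelihood_ratio * prior_odds


def odds_to_probability(odds: Fraction) -> Fraction:
    """Convert odds in favor of H1 (versus H2) to a probability of H1."""
    return odds / (1 + odds)


prior = Fraction(1, 5000)   # prior odds of H1 to H2 from the alibi example
lr = Fraction(10_000)       # the reported likelihood ratio
post = posterior_odds(prior, lr)
print(post, odds_to_probability(post))  # 2 2/3

# The transposed reading treats the LR itself as the odds of H1.
print(odds_to_probability(lr))  # 10000/10001
```

Posterior odds of 2:1 correspond to a probability of 2/3 for H1, whereas the transposed reading would put it at 10000/10001.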

Cybergenetics does not seem disposed to abandon "match form" testimony. Dr. Thompson claims that for fingerprint comparisons, "'[m]atch' is shorthand for source identification, [s]o, it is predictable that many lay people will interpret the term 'match,' when used to describe DNA evidence, to mean that the person of interest has been identified either definitively or with a high degree of certainty as a contributor." Pointing to a dictionary, Cybergenetics angrily responds that this is just "Thompson’s private language." \6/ But a tradition in forensic science is to equate a "match" with an identification, as shown by the title of articles such as "Is a Match Really a Match? A Primer on the Procedures and Validity of Firearm and Toolmark Identification." \7/ In popular culture, the term may have a similar connotation. Perhaps YouTube trumps Merriam-Webster. \8/

As far as I know, no studies compare the comprehensibility of relative-probability-of-the-data testimony to match-form testimony. Therefore, law and practice have to be guided by intuition. My sense is that avoiding the transposition of the probabilities in a likelihood ratio requires special care if the match-versus-coincidence approach is used. The witness must explain not only that a "DNA match" is merely a degree of similarity between the electropherograms being compared, but also that "coincidence" or "coincidental match" is shorthand for the proposition that the "match" is a match to an unrelated person (or other specified source), and that it is not a conclusion that a coincidence has occurred. The phrase "coincidental match" is too ambiguous to be left undefined.

In short, I am not sure that an absolute rule against match-form testimony is necessary, but I see no clear benefit to the phraseology. Relative-probability-of-the-data testimony seems to be a more straightforward description of a DNA likelihood ratio. However, it too needs explanation to reduce the risk of blindly transposing the conditional probabilities for the data into conditional probabilities for the hypotheses. Cases announcing that a likelihood ratio is a ratio of source-hypothesis probabilities are legion. \9/

Notes

  1. Cf. Commonwealth v. McClellan, 178 A.3d 874 (Pa. Super. Ct. 2018) ("[I]t was determined that the DNA sample taken from the gun's grip was at least 384 times more probable if the sample originated from Appellant and two unknown, unrelated individuals than if it originated from a relative to Appellant and two unknown, unrelated individuals").
  2. Mark Perlin, Explaining the Likelihood Ratio in DNA Mixture Interpretation, in Proceedings of Promega's Twenty First International Symposium on Human Identification at 7 (Dec. 29, 2010); cf. Mark W. Perlin, Joseph B. Kadane & Robin W. Cotton, Match Likelihood Ratio for Uncertain Genotypes, 8 Law, Probability & Risk 289 (2009), https://doi.org/10.1093.
  3. United States v. Anderson, No. 4:21-CR-00204, 2023 WL 3510823, at *3 (M.D. Pa. Apr. 26, 2023). For additional instances of “match form” testimony or reporting, see Howell v. Schweitzer, No. 1:20-cv-2853, 2023 WL 1785530 (N.D. Ohio Jan. 11, 2023); Sanford v. Russell, No. 17-13062, 2021 WL 1186495 (E.D. Mich. Mar. 30, 2021); State v. Anthony, 266 So.3d 415 (La. Ct. App. 2019).
  4. Mark W. Perlin et al., TrueAllele Casework on Virginia DNA Mixture Evidence: Computer and Manual Interpretation in 72 Reported Criminal Cases, 9 PLOS ONE e92837, at 8 (2014).
  5. William C. Thompson, Uncertainty in Probabilistic Genotyping of Low Template DNA: A Case Study Comparing STRMix™ and TrueAllele™, 68 J. Forensic Sci. 1049, 1059 (2023), doi:10.1111/1556-4029.15225.
  6. Mark W. Perlin et al., Reporting Exclusionary Results on Complex DNA Evidence, A Case Report Response to 'Uncertainty in Probabilistic Genotyping of Low Template DNA: A Case Study Comparing Strmix™ and Trueallele®' Software 31 (May 18, 2023), available at SSRN: https://ssrn.com/abstract=4449313 or http://dx.doi.org/10.2139/ssrn.4449313.
  7. Stephen G. Bunch et al., Is a Match Really a Match? A Primer on the Procedures and Validity of Firearm and Toolmark Identification, 11 Forensic Science Communications, No. 3 (2009), https://archives.fbi.gov/archives/about-us/lab/forensic-science-communications/fsc/july2009/review/2009_07_review01.htm.
  8. In addition, a dictionary definition of "match" (https://www.merriam-webster.com/dictionary/match) is "a pair suitably associated." Suitable association suggests that a hypothesis about the nature of the association is true.
  9. E.g., State v. Pickett, 246 A.3d 279 (N.J. App. 2021) (The "likelihood ratio [is] a statistic measuring the probability that a given individual was a contributor to the sample against the probability that another, unrelated individual was the contributor.") (citing Justice Ming W. Chin et al., Forensic DNA Evidence § 5.5 (2020)).