Monday, September 18, 2023

Use with Caution: NIJ's Training Course in Population Genetics and Statistics for Forensic Analysts

The National Institute of Justice (NIJ) "is the research, development and evaluation agency of the U.S. Department of Justice . . . dedicated to improving knowledge and understanding of crime and justice issues through science." It offers a series of webpages and video recordings (a "training course") on Population Genetics and Statistics for Forensic Analysts. The course should be approached with caution. I have not worked through all the pages and videos, but here are a few things that rang alarm bells:


NIJ's Training Comment

NIJ's training: Many statisticians have employed what is known as Bayesian probability ... which is based on probability as a measure of one's degree of belief. This type of probability is conditional in that the outcome is based on knowing information about other circumstances and is derived from Bayes Theorem.

Comment: Bayes' rule applies to both objective and subjective probabilities. Both types of probability include conditional probabilities. The "type of probability" is not derived from Bayes' Theorem.

NIJ's training: Conditional probability, by definition, is the probability P of an event A given that an event B has occurred. ... Take the example of a die with six sides. If one was to throw the die, the probability of it landing on any one side would be 1/6. This probability, however, assumes that the die is not weighted or rigged in any way, and that all of the sides contain a different number. If this were not true, then the probability would be conditional and dependent on these other factors.

Comment: The "other factors" are nothing more than part of the description of the experiment whose outcomes are the events that are observed. They are not conditioning events in a sample space.
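To make the distinction concrete, here is a minimal Python sketch. It treats the weighting of the die as part of the model (the probabilities assigned to the six faces) and treats conditioning as something done with events inside that model, such as "the roll is even." The loaded-die weights and the helper names prob and cond_prob are my own illustrative assumptions; nothing here comes from the NIJ course.

```python
from fractions import Fraction

def prob(event, weights):
    """P(event) when each face has the given (unnormalized) weight."""
    total = sum(weights.values())
    return sum((weights[face] for face in event), Fraction(0)) / total

def cond_prob(a, b, weights):
    """P(A | B) = P(A and B) / P(B), for events A and B in the same sample space."""
    return prob(a & b, weights) / prob(b, weights)

# Two different experiments (models), each with its own unconditional probabilities.
fair = {face: Fraction(1) for face in range(1, 7)}
loaded = {**fair, 6: Fraction(5)}  # hypothetical die that favors six

is_two = {2}
is_even = {2, 4, 6}

print(prob(is_two, fair))                # unconditional, fair die
print(prob(is_two, loaded))              # still unconditional, different model
print(cond_prob(is_two, is_even, fair))  # conditional on the event "roll is even"
```

Changing the weights changes the experiment, and every probability in it remains unconditional until one conditions on an event such as is_even.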

NIJ's training: The following equation can be used to determine the probability of the evidence given that a presumed individual is the contributor rather than a random individual in the population: LR = P(E/H1) / P(E/H0) ... . In the case of a single source sample, the hypothesis for the numerator (the suspect is the source of the DNA) is a given, and thus reduces to 1. This reduces to: LR = 1/ P(E/H0) which is simply 1/P, where P is the genotype frequency.

Comment: The hypothesis for the numerator of a likelihood ratio is always "a given": it goes on the right-hand side of the expression for a conditional probability. The same is true of the hypothesis in the denominator. Neither probability "reduces to 1" for that reason. Only if the "evidence" is the true genotype in both the recovered sample and the sample from the defendant can it be said that P(E|H1) = 1. In other words, to say that the probability of a reported match is 1 if the defendant is the source treats the probability of laboratory error as zero. That may be acceptable as a simplifying assumption, but the assumption should be made visible in a training course.
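A rough Python sketch shows how much work the hidden assumption does. It takes the evidence E to be a reported match and uses a deliberately simple error model of my own choosing, not one found in the NIJ course: a hypothetical false-negative probability (the lab fails to report a match when the genotypes really are the same) and a hypothetical false-positive probability (the lab reports a match when they differ). The function name, parameter names, and numbers are all illustrative assumptions.

```python
def likelihood_ratio(genotype_freq, false_positive=0.0, false_negative=0.0):
    """LR = P(E | H1) / P(E | H0), where E is a *reported* match.

    H1: the defendant is the source.
    H0: an unrelated person drawn at random from the population is the source.
    The error rates are hypothetical inputs to a simplified model.
    """
    # Under H1 the genotypes truly match, so a match is reported unless the lab errs.
    p_e_given_h1 = 1.0 - false_negative
    # Under H0 a match is reported either because the random source happens to share
    # the genotype (and the lab gets it right) or because of a false positive.
    p_e_given_h0 = (genotype_freq * (1.0 - false_negative)
                    + (1.0 - genotype_freq) * false_positive)
    return p_e_given_h1 / p_e_given_h0

p = 1e-6  # illustrative genotype frequency

print(likelihood_ratio(p))                       # error rates zero: recovers 1/P
print(likelihood_ratio(p, false_positive=1e-4))  # even a small error rate dominates
```

With both error rates set to zero the function returns 1/P, which is the NIJ formula. With any false-positive probability much larger than the genotype frequency, the ratio is governed mostly by the error rate rather than by P.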

NIJ's training: Although likelihood ratios can be used for determining the significance of single source crime stains, they are more commonly used in mixture interpretation. ... The use of any formula for mixture interpretation should only be applied to cases in which the analyst can reasonably assume "that all contributors to the mixed profile are unrelated to each other, and that allelic dropout has no practical impact."

Comment: This limitation does not apply to modern probabilistic genotyping software!
