I do so with some trepidation. Neither the SWGDAM Guidelines nor the articles that I have located supply a simple and clear exposition of the actual workings of any modern forensic PGS. The Guidelines state that
A probabilistic genotyping system is comprised of software, or software and hardware, with analytical and statistical functions that entail complex formulae and algorithms. Particularly useful for low-level DNA samples (i.e., those in which the quantity of DNA for individuals is such that stochastic effects may be observed) and complex mixtures (i.e., multi-contributor samples, particularly those exhibiting allele sharing and/or stochastic effects), probabilistic genotyping approaches can reduce subjectivity in the analysis of DNA typing results.
That sounds great, but what do these "complex formulae and algorithms" do? Well,
probabilistic approaches provide a statistical weighting to the different genotype combinations. Probabilistic genotyping does not utilize a stochastic threshold. Instead, it incorporates a probability of alleles dropping out or in. In making use of more genotyping information when performing statistical calculations and evaluating potential DNA contributors, probabilistic genotyping enhances the ability to distinguish true contributors and noncontributors.
Moreover, "[t]he use of a likelihood ratio as a reporting statistic for probabilistic genotyping differs substantially from binary statistics such as the combined probability of exclusion."
This sounds good too, but what is "a statistical weighting," and how is a probability of exclusion, which is not confined to the values 0 or 1, a "binary statistic"? To gain a clearer picture of what might be going on, I thought I would start with the simplest possible situation (a crime-scene sample with a single contributor) to surmise how a probabilistic analysis might operate. My analysis is something of a guess. Corrections are welcome.
Two Peaks, One Inferred Genotype, One Likelihood Ratio of 50: Not a PGS!
In "short tandem repeat" typing via capillary electrophoresis, the laboratory extracts DNA from a sample and uses the PCR (polymerase chain reaction) to make millions of copies of a short stretch of DNA between a designated starting point and a stopping point (a "locus"). These fragments vary in length among different individuals (although none are unique). The laboratory runs the sample fragments through a machine that measures the quantity of the fragments as a function of the length of the fragments. For example, a plot of the quantity on the y-axis and the fragment length on the x-axis might show two prominent peaks, which I will call A and B, of roughly equal height rising above a noisy baseline. This AB pattern at a single locus is exactly what one would expect for DNA from an individual who inherited a fragment of length A from one parent and a fragment of length B from the other parent. Starting with roughly equal numbers of maternally and paternally inherited DNA molecules in the original sample, PCR should generate about equal quantities of the maternal and paternal length variants ("STR alleles") of the two distinct lengths. These produce the two peaks in the graph (the electropherogram).
The analyst then could compute the “random match probability” or “probability of inclusion” (PI) — that is, the probability P(RAB) that a randomly selected individual would be type AB. Even if the analyst used a computer program to do the calculation, no “probabilistic genotyping” would be involved. The “genotype” AB would be regarded as known to a certainty (for the purpose of the computation), and the probability PI pertains to something else — to the chance of coincidentally finding an individual with a matching profile: PI = P(RAB). If 1 in 50 people have the profile AB, then PI = 1/50.
The evidentiary value of the inclusion can be computed as a “likelihood ratio” (LR). If the hypothesis (Hp) that the suspect, who also is type AB, is the contributor of the DNA in the sample is correct, and if the sample has plenty of undegraded DNA, the probability of the data DAB (an A and a B peak detected in the sample) is P(DAB|Hp) = 1. On the other hand, if someone unrelated to the suspect is the contributor (Hd), then P(DAB|Hd) is the probability of inclusion PI = 1/50. Thus, the evidence — the A and B peaks — is 1/PI = 50 times more probable when the suspect is the contributor than when an unrelated person is. This ratio of the probabilities of the evidence conditional on the hypotheses is the likelihood ratio. It measures the support the evidence lends to Hp as opposed to Hd. LRs greater than 1 support Hp over Hd (e.g., Kaye et al. 2011).
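The arithmetic in this paragraph can be sketched in a few lines of Python. This is only an illustration of the single-genotype calculation, not a piece of any forensic software: the function name likelihood_ratio and the variable names are hypothetical, and the 1-in-50 frequency is the illustrative figure used above, not real population data.

```python
from fractions import Fraction


def likelihood_ratio(p_data_given_hp, p_data_given_hd):
    """Ratio of the probabilities of the evidence under Hp and under Hd."""
    return p_data_given_hp / p_data_given_hd


# Illustrative figures from the text (not real population data).
p_random_ab = Fraction(1, 50)   # P(RAB): chance that a random person is type AB
p_dab_given_hp = Fraction(1)    # suspect is AB and the sample has ample undegraded DNA
p_dab_given_hd = p_random_ab    # an unrelated contributor must happen to be AB

lr = likelihood_ratio(p_dab_given_hp, p_dab_given_hd)
print(lr)  # 50
```

Exact fractions are used instead of floating-point numbers so that the result reads the same way as the hand calculation.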
Two Peaks, Two Inferred Genotypes with Probabilities for Each Genotype: A PGS?
This much is straightforward, conventional thinking. But an AB contributor is not the only conceivable explanation for the two peaks. Maybe they reflect DNA from an AA individual (one who inherited the fragment of length A from both parents), and the B is just an artifact known as “stutter” (Brookes et al. 2012). If this possibility cannot be dismissed as wildly improbable (as it could be if, for example, the putative stutter peak were far from the A peak), then the analysis should take into account both AA and AB as possible contributor profiles.
One way to do so would be to study the detection probability P(DAB) in experiments with samples from AA and AB contributors. Suppose that a large number of such experiments showed that when the contributor is AA, the probability of detecting AB is P(DAB|CAA) = 1/10 and that when the contributor is AB, the probability is P(DAB|CAB) = 1. Sometimes, AA contributors produce AB peaks; AB contributors always do.
In a case in which the suspect is type AB, what is the evidentiary value of the two peaks A and B? The suspect is still AB, so P(DAB|Hp) is unchanged at 1. But the denominator of the LR, P(DAB|Hd), requires us to consider the probability that the contributor’s profile is AA as well as the probability that it is AB. Imagine that the laboratory receives crime-scene samples with DNA profiles that are representative of a population in which 1 in 100 people are AA and (as stated before) 1 in 50 are AB. Because only 1 in 10 DNA samples from AA contributors will appear to be AB, about 1 in 1000 samples will have the AB peaks and come from AA contributors:
P(DAB and CAA) = P(DAB|CAA) ⋅ P(CAA) = (1/10) (1/100) = 1/1000.
More samples, about 20 per 1000, will have the AB peaks and come from AB contributors:
P(DAB and CAB) = P(DAB|CAB) ⋅ P(CAB) = (1) (1/50) = 20/1000.
Thus, in about 20 out of 21 detections of AB peaks, the contributor is AB. (Most readers who have borne with me this far will recognize this result as a simple application of Bayes' rule for the posterior probability: P(CAB|DAB) = 20/21.)
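For readers who would like to check the Bayes' rule step, here is a minimal sketch in Python. It assumes, as the toy example does, that AA and AB are the only contributor genotypes that can give rise to the AB peaks. The dictionary names are hypothetical, and the numbers are the illustrative ones from the text rather than real data.

```python
from fractions import Fraction

# Illustrative figures from the text (not real data).
# Share of samples coming from each contributor genotype.
prior = {"AA": Fraction(1, 100), "AB": Fraction(1, 50)}
# P(DAB | contributor genotype): chance that AB peaks are detected.
p_dab_given = {"AA": Fraction(1, 10), "AB": Fraction(1)}

# Joint probability of detecting AB peaks and having each contributor genotype.
joint = {g: prior[g] * p_dab_given[g] for g in prior}

# Bayes' rule: normalize over the two genotypes under consideration.
total = sum(joint.values())
posterior = {g: joint[g] / total for g in joint}

print(joint["AA"], joint["AB"])          # 1/1000 1/50
print(posterior["AA"], posterior["AB"])  # 1/21 20/21
```

The joint probability for AB prints as 1/50 because the fraction 20/1000 is reduced automatically.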
A PGS thus could assign probabilities of P(CAA|DAB) = 1/21 and P(CAB|DAB) = 20/21 for the two possible contributor genotypes. The hypothesis Hd is that the peaks come either from an unrelated person who is AA or, as before, from an unrelated person who is AB. If the suspect is not the source and if the apparent AB profile really is AA (which has probability 1/21), Hd requires that a random, unrelated person be type AA (an event that has probability P(RAA) = 1/100). Likewise, if the suspect is not the source and the apparent AB profile really is AB (which has probability 20/21), then Hd requires that a random, unrelated person be type AB (an event that has probability P(RAB) = 1/50). Consequently, the probability of the evidence DAB given Hd is
P(DAB|Hd) = P(RAA) ⋅ P(CAA|DAB) + P(RAB) ⋅ P(CAB|DAB)
= (1/100) (1/21) + (1/50) (20/21) = 41/2100 = 0.0195.
This likelihood is very close to the previous denominator of 1/50 = 0.020. The resulting LR is 2100/41 = 51.2.
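The weighted denominator and the resulting LR can be reproduced with the short Python sketch below. It simply follows the toy formula given above and is not meant to represent how any actual PGS is implemented. The variable names are hypothetical, and the genotype weights and population frequencies are the illustrative values from the text.

```python
from fractions import Fraction

# Genotype weights from the toy model, P(C | DAB).
weights = {"AA": Fraction(1, 21), "AB": Fraction(20, 21)}
# Illustrative population frequencies, P(R), for a random unrelated person.
p_random = {"AA": Fraction(1, 100), "AB": Fraction(1, 50)}

# Numerator: the suspect is AB, and AB contributors always yield AB peaks.
p_dab_given_hp = Fraction(1)

# Denominator: weight each genotype's population frequency by its probability.
p_dab_given_hd = sum(p_random[g] * weights[g] for g in weights)

lr = p_dab_given_hp / p_dab_given_hd

print(p_dab_given_hd)            # 41/2100
print(lr, round(float(lr), 1))   # 2100/41 51.2
```

Setting the weight for AA to 0 and the weight for AB to 1 recovers the earlier categorical result of LR = 50, which shows that the only change is the weighting of the candidate genotypes.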
The Probability in PGS
This toy model of a PGS only used information about peak location and only mentioned a stutter peak as a source of uncertainty in the contributor's genotype. A more sophisticated PGS would use peak heights as well and would attend to allele drop-in and drop-out, and other complicating features. The most complete models dispense with the rules of thumb (“analytical thresholds,” “stochastic thresholds,” and “peak-height ratios”) that human examiners employ to decide whether a peak is high enough to count as real, what to do with it in computing a likelihood ratio, and what potential genotypes to cross off the list of possibilities when confronted with a mixture of DNA from several contributors (Kelly et al. 2014).
I do not propose to explain these matters any better than SWGDAM has. My purpose here has been to clarify just what is “probabilistic” about a PGS. The key point is not that the system produces a likelihood ratio as opposed to a probability of exclusion or inclusion. Likelihood ratios also apply to categorical inferences as to what profiles are present in a mixed sample. A PGS is distinctive because it assigns probabilities to the possible profiles and uses more information to arrive at what, one hopes, is a better likelihood ratio for the hypotheses about whether a suspect is a contributor.
- C. Brookes, J.A. Bright, S. Harbison, J. Buckleton, Characterising Stutter in Forensic STR Multiplexes, 6 Forensic Sci. Int’l: Genetics 58–63 (2012)
- David H. Kaye et al., The New Wigmore on Evidence: Expert Evidence (2d ed. 2011)
- Hannah Kelly, Jo-Anne Bright, John S. Buckleton, James M. Curran, A Comparison of Statistical Models for the Analysis of Complex Forensic DNA Profiles, 54 Sci. & Justice 66–70 (2014)
Thanks are owed to Sandy Zabell for correcting errors in the original posting. This version was last updated 1 February 2016.