Sunday, September 24, 2017

How Experts (Mis)Represent Likelihood Ratios for DNA Evidence

Earlier this month, I noted the tendency of journalists to misconstrue a likelihood ratio as odds or probabilities in favor of a source hypothesis. I mentioned expressions such as "the likelihood that a suspect’s DNA is present in a mixture of substances found at a crime scene" and "the probability, weighed against coincidence, that sample X is a match with sample Y." In place of such garbled descriptions, I proposed that
Putting aside all other explanations for the overlap between the mixture and the suspect's alleles--explanations like relatives or some laboratory errors--this likelihood ratio indicates how much the evidence changes the odds in favor of the suspect’s DNA being in the mixture. It quantifies the probative value of the evidence, not the probability that one or another explanation of the evidence is true.
The journalists' misstatements occurred in connection with likelihood ratios involving DNA mixtures, but even experts in forensic inference make the same mistake in simpler situations. The measure of probative value for single-source DNA is a more easily computed likelihood ratio (LR). Unfortunately, it is very easy to describe LRs in ways that invite misunderstanding. Below are two examples:
[I]n the simplest case of a complete, single-source evidence profile, the LR expression reverts to the reciprocal of the profile frequency. For example: Profile frequency = 1/1,000,000 [implies] LR = P(E|H1) / P(E|H2) = 1 / 1/1,000,000 = 1,000,000/1 = 1,000,000 (or 1 million). This could be expressed in words as, "Given the DNA profile found in the evidence, it is 1 million times more likely that it is from the suspect than from another random person with the same profile." -- Norah Rudin & Keith Inman, An Introduction to Forensic DNA Analysis 148-49 (2d ed. 2002).
Comment: If "another random person" had "the same profile," there would be no genetic basis for distinguishing between this individual and the suspect. So how could the suspect possibly be a million times more likely to be the source?
A likelihood ratio is a ratio that compares the likelihood of two hypotheses in the light of data. [I]n the present case there are two hypotheses: the sperm came from twin A or the sperm came from twin B, and then you calculate the likelihood of each hypotheses in the face or in the light of the data, and then you form the ratio [LR] of the two. So the ratio tells you how much more likely one hypothesis is than the other in the light of the experimental data. --Testimony of Michael Krawczak in a pretrial hearing on a motion to exclude evidence in Commonwealth v. McNair, No. 8414CR10768 (Super. Ct., Suffolk Co., Mass.) (transcript, Feb. 15, 2017).
Comment: Defining "likelihood" as a quantity proportional to the probability of data given the hypothesis, the first sentence is correct. But this definition was not provided, and the second sentence further suggests that the "experimental data" makes one twin LR times more probable to be the source than the other. That conclusion is correct only if the prior odds are equal -- an assumption that does not rest on those data.
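To see why the prior odds matter, here is a minimal Python sketch of the odds form of Bayes' rule. The function names and the prior odds are hypothetical, chosen only for illustration; the LR of 1,000,000 is borrowed from the Rudin and Inman example above. Nothing here is meant to reflect how any laboratory or expert actually computes an LR.

```python
from fractions import Fraction


def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' rule: posterior odds = prior odds * LR."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds in favor of a hypothesis into a probability."""
    return odds / (1 + odds)


# LR taken from the single-source example quoted above.
LR = Fraction(1_000_000)

# Hypothetical prior odds (assumptions for illustration only).
for prior in (Fraction(1, 1), Fraction(1, 1_000), Fraction(1, 10_000_000)):
    post = posterior_odds(prior, LR)
    prob = float(odds_to_probability(post))
    print(f"prior odds {prior}: posterior odds {post}, probability {prob:.4f}")
```

Only when the prior odds are even do the posterior odds equal the LR. With hypothetical prior odds of 1 to 10,000,000, the same LR yields posterior odds of just 1 to 10, which is why the LR by itself cannot say how probable either hypothesis is.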
With this kind of prose and testimony, is it any surprise that courts write that "[t]he likelihood ratio 'compares the probability that the defendant was a contributor to the sample with the probability that he was not a contributor to the sample'"? Commonwealth v. Grinkley, 75 Mass.App.Ct. 798, 803, 917 N.E.2d 236, 241 (Mass. Ct. App. 2009) (quoting Commonwealth v. McNickles, 434 Mass. 839, 847, 753 N.E.2d 131 (2001)).

Monday, September 11, 2017

The New York City Medical Examiner's Office "Under Fire" for Low Template DNA Testing

According to an Associated Press story, “DNA lab techniques” are “now under fire.” 1/ The article refers to the procedures used by the New York City Office of the Chief Medical Examiner (OCME) to analyze and interpret low template DNA (LT-DNA) mixtures (samples with minuscule quantities of DNA). Selected segments of the DNA molecules are copied repeatedly (by means of a chemical process known as PCR) to produce enough of them to detect certain highly variable DNA sequences (called STRs). Every round of PCR amplification essentially doubles the number of replicated segments. Following the lead of the U.K.’s former Forensic Science Service, which pioneered a protocol with extra cycles of amplification, the OCME used 31 cycles instead of the standard 28.

But if the PCR primer for an STR does not latch on to enough of the small number of starting DNA molecules, that STR will not appear in the PCR-amplified product. At the same time, if stray human DNA molecules are present in the samples, their STRs can be amplified along with the ones that are of real interest. The first phenomenon is called “drop-out”; the latter is “drop-in.”
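For a sense of scale, the following Python sketch computes what three extra cycles would mean under the idealized assumption that every cycle exactly doubles the number of copies. Real PCR efficiency is lower than that, and the function name is hypothetical; the sketch is only arithmetic, not a model of any laboratory's protocol.

```python
def ideal_copies(start_molecules, cycles):
    """Copies after `cycles` rounds of PCR, assuming perfect doubling each round."""
    return start_molecules * 2 ** cycles


# Compare the standard 28 cycles with the 31 cycles described in the post,
# starting from a single molecule (an assumption made for illustration).
standard = ideal_copies(1, 28)
extended = ideal_copies(1, 31)
extra_fold = extended // standard

print(f"28 cycles: {standard:,} copies")
print(f"31 cycles: {extended:,} copies")
print(f"fold increase from 3 extra cycles: {extra_fold}")
```

Under this idealization, three extra cycles multiply the product eightfold. The same multiplication applies to any stray molecule that happens to be in the tube, which is one way to see why drop-in becomes a concern.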

Initially, OCME analysts interpreted the results by hand. Some years later, the office created a computer program, the Forensic Statistical Tool (FST), that used empirically determined drop-in and drop-out probabilities and generated a measure of the extent to which the DNA results supported the conclusion that the mixture contains a suspect’s DNA as opposed to an unrelated contributor’s. The OCME published a validation study of the software. 2/

Both these “lab techniques” have been “under fire,” as the AP put it, for years. The article suggests that only two courts have decided serious challenges to LT-DNA evidence, that they reached opposite conclusions, and that the more recent view is that the evidence is unreliable. 3/ In fact, a larger number of trial courts have considered challenges to extra cycles of amplification and to the FST program. Almost all of them found the OCME’s approach to have gained scientific acceptance.

The published opinions from New York trial courts are noteworthy. (There are more unpublished ones.) In the first reported case, People v. Megnath, 898 N.Y.S.2d 408 (N.Y. Sup. Ct. Queens Co. 2010), a court admitted manually interpreted LT-DNA evidence, finding that the procedures are not novel and that the modifications are generally accepted. 4/ In United States v. Morgan, 53 F.Supp.3d 732 (S.D.N.Y. 2014), a federal district court reached the same conclusion. In People v. Garcia, 963 N.Y.S.2d 517 (N.Y. Sup. Ct. 2013), a local New York court found general acceptance of both extra cycles and the FST program.

The first setback for the OCME came in People v. Collins, 49 Misc.3d 595, 15 N.Y.S.3d 564 (N.Y. Sup. Ct., Kings Co. 2015), when a well-regarded trial judge conducted an extensive hearing and issued a detailed opinion finding the extra cycles and the FST program had not achieved general acceptance in the scientific community. However, other New York judges have not followed Collins in excluding OCME LT-DNA testimony. People v. Lopez, 50 Misc.3d 632, 23 N.Y.S.3d 820 (N.Y. Sup. Ct., Bronx Co. 2015); People v. Debraux, 50 Misc.3d 247, 21 N.Y.S.3d 535 (N.Y. Co. Sup. Ct. 2015). In the absence of any binding precedent (trial court opinions lack precedential value) and given the elaborate Collins opinion, it is fair to say that “case law on the merits of the science” is not so “clear,” but, quantitatively, it leans toward admissibility.

This is not to say that the opinions are equally persuasive or that they are uniformly well informed. A specious argument that several courts have relied on is that because Bayes’ theorem was discovered centuries ago and likelihood ratios are used in other contexts, the FST necessarily rests on generally accepted methods. E.g., People v. Rodriguez, Ind. No. 5471/2009, Decision and Order (Sup.Ct. N.Y. Co. Oct. 24, 2013). That is comparable to reasoning that because the method of least squares was developed over two centuries ago, every application of linear regression is valid. The same algebra can cover a multitude of sins.

Likewise, the Associated Press (and courts) seem to think that the FST (or more advanced software for computing likelihood ratios) supplies “the likelihood that a suspect’s DNA is present in a mixture of substances found at a crime scene.” 5/ A much longer article in the Atlantic presents a likelihood ratio as "the probability, weighed against coincidence, that sample X is a match with sample Y." 6/ That description is jumbled. The likelihood ratio does not weigh the probability that two samples match "against coincidence."

Rather, the ratio addresses whether the pattern of alleles in a mixed sample is more probable if the suspect's DNA is part of the mixture than if an unrelated individual's DNA is there instead. The ratio is the probability of the complex and possibly incomplete pattern arising under the former hypothesis divided by the probability of the pattern under the latter. Obviously, the ratio of two probabilities is not a probability or a likelihood of anything.

Putting aside all other explanations for the overlap between the mixture and the suspect's alleles--explanations like relatives or some laboratory errors--this likelihood ratio indicates how much the evidence changes the odds in favor of the suspect’s DNA being in the mixture. It quantifies the probative value of the evidence, not the probability that one or another explanation of the evidence is true. Although likelihood-ratio testimony has conceptual advantages, explaining the meaning of the figure in the courtroom so as to avoid the misinterpretations exemplified above can be challenging.
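In symbols, the point of the last two paragraphs can be sketched as follows. The notation is my own shorthand and not anything the FST or the courts use: E stands for the observed pattern of alleles, H_s for the hypothesis that the suspect's DNA is part of the mixture, and H_u for the hypothesis that an unrelated individual's DNA is there instead.

```latex
\[
\mathrm{LR} = \frac{P(E \mid H_s)}{P(E \mid H_u)},
\qquad
\underbrace{\frac{P(H_s \mid E)}{P(H_u \mid E)}}_{\text{posterior odds}}
= \mathrm{LR} \times
\underbrace{\frac{P(H_s)}{P(H_u)}}_{\text{prior odds}}
\]
```

The LR can be any nonnegative number, so it cannot itself be a probability. It is the factor that carries the prior odds to the posterior odds, and the probabilities of the hypotheses appear only on the odds side of the equation.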

  1. Colleen Long, DNA Lab Techniques, 1 Pioneered in New York, Now Under Fire, AP News, Sept. 10, 2017.
  2. Adelle A. Mitchell et al., Validation of a DNA Mixture Statistics Tool Incorporating Allelic Drop-Out and Drop-In, 6 Forensic Sci. Int’l: Genetics 749-761 (2012); Adelle A. Mitchell et al., Likelihood Ratio Statistics for DNA Mixtures Allowing for Drop-out and Drop-in, 3 Forensic Sci. Int'l: Genetics Supp. Series e240-e241 (2011).
  3. Long, supra note 1 ("There is no clear case law on the merits of the science. In 2015, Brooklyn state Supreme Court judge Mark Dwyer tossed a sample collected through the low copy number method. ... But earlier, a judge in Queens found the method scientifically sound.").
  4. For criticism of the “nothing-new” reasoning in the opinion, see David H. Kaye et al., The New Wigmore on Evidence: Expert Evidence (Cum. Supp. 2017).
  5. These are the reporter’s words. Long, supra note 1. For a judicial equivalent, see, for example, People v. Debraux, 50 Misc.3d 247, 256, 21 N.Y.S.3d 535, 543 (N.Y. Co. Sup. Ct. 2015) (referring to FST as “showing that the likelihood that DNA found on a gun was that of the defendant”).
  6. Matthew Shaer, The False Promise of DNA Testing: The Forensic Technique Is Becoming Ever More Common—and Ever Less Reliable, Atlantic, June 2016.

Friday, September 1, 2017

Flaky Academic Journals and Forestry

The legal community may be catching on to the proliferation of predatory, bogus, or just plain flaky journals of medicine, forensic science, statistics, and every other subject that might attract authors willing to pay "open access" fees. As indicated in the Flaky Academic Journals blog, these businesses advertise rigorous peer review, but they operate like vanity presses. A powerful article (noted here) in Bloomberg's BNA Expert Evidence Report and Bloomberg Businessweek alerts litigators to the problem by discussing the most notorious megapublisher of biomedical journals, OMICS International, and its value to drug companies willing to cut corners in presenting their research findings.

The most recent forensic-science article to go this route is Ralph Norman Haber & Lyn Haber, A Forensic Case Study with Only a Single Piece of Evidence, Journal of Forensic Studies, Vol. 2017, issue 1, unpaginated. In fact, it is the only article that the aspiring journal has published (despite spamming for potential authors at least 11 times). The website offers an intriguing description of this "Journal of Forensic Studies." It explains that "Forensic studies is a scientific journal which covers high quality manuscripts which are both relevant and applicable to the broad field of Forestry. This journal encompasses the study related to the majority of forensically related cases."