Friday, October 27, 2017

Dodging Daubert to Admit Bite Mark Evidence

At a symposium for the Advisory Committee on the Federal Rules of Evidence, Chris Fabricant juxtaposed two judicial opinions about bite-mark identification. To begin with, in Coronado v. State, 384 S.W.3d 919 (Tex. App. 2012), the Texas Court of Appeals deemed bite mark comparison to be a “soft science” because it is “based primarily on experience or training.” It then applied a less rigorous standard of admissibility than that for a “hard science.”

The state’s expert dentist, Robert Williams, “acknowledged that there is a lack of scientific studies testing the reliability of bite marks on human skin, likely due to the fact that few people are willing to submit to such a study. However, he did point out there was one study on skin analysis conducted by Dr. Gerald Reynolds using pig skin, ‘the next best thing to human skin.’” The court did not state what the pig skin study showed, but it must have been apparent to the court that direct studies of the ability of dentists to distinguish among potential sources of bite marks were all but nonexistent.

That dentists have a way to exclude and include suspects as possible biters with rates of accuracy that are known or well estimated is not apparent. Yet, the Texas appellate court upheld the admission of the "soft science" testimony without discussing whether it was presented as hard science, as "soft science," or as nonscientific expert testimony.

A trial court in Hillsborough County, Florida, went a step further. Judge Kimberly K. Fernandez wrote that
During the evidentiary hearing, the testimony revealed that there are limited studies regarding the accuracy or error rate of bite mark identification, 3/ and there are no statistical databases regarding uniqueness or frequency in dentition. Despite these factors, the Court finds that this is a comparison-based science and that the lack of such studies or databases is not an accurate indicator of its reliability. See Coronado v. State, 384 S.W. 3d 919 (Tex. App. 2012) ("[B]ecause bite mark analysis is based partly on experience and training, the hard science methods of validation such as assessing the potential rate of error, are not always appropriate for testing its reliability.")
The footnote added that "One study in 1989 reflected that there was a 63% error rate." This is a remarkable addition. Assuming "the error rate" is a false-positive rate for a task comparable to the one in the case, it is at least relevant to the validity of bite-mark evidence. In Coronado, the Texas court found the absence of validation research not preclusive of admissibility. That was questionable enough. But in O'Connell, the court found research that contradicted any claim of validity “inappropriate” to consider! That turns Daubert on its head.

Friday, October 20, 2017

"Probabilistic Genotyping," Monte Carlo Methods, and the Hydrogen Bomb

Many DNA samples found in criminal investigations contain DNA from several people. A number of computer programs seek to "deconvolute" these mixtures -- that is, to infer the several DNA profiles that are mushed together in the electrophoretic data. The better ones do so using probability theory and an estimation procedure known as a Markov Chain Monte Carlo (MCMC) method. These programs are often said to perform "probabilistic genotyping." Although both words in this name are a bit confusing, 1/ lawyers should appreciate that the inferred profiles are just possibilities, not certainties. At the same time, some may find the idea of using techniques borrowed from a gambling casino (in name at least) disturbing. Indeed, I have heard the concern that "You know, don't you, that if the program is rerun, the results can be different!"

The answer is, yes, that is the way the approximation works. Using more steps in the numerical process also could give different output, but would we expect the further computations to make much of a difference? Consider a physical system that computes the value of π. I am thinking of Buffon's Needle. In 1777, Georges-Louis Leclerc, the Count of Buffon, imagined "dropping a needle on a lined sheet of paper and determining the probability of the needle crossing one of the lines on the page." 2/ He found that the probability is directly related to π. For example, if the length of the needle and the distance between the lines are identical, one can estimate π as twice the number of drops divided by the number of hits. 3/ Repeating the needle-dropping procedure the same number of times will rarely give exactly the same answer. (Note that pooling the results for two runs of the procedure is equivalent to one run with twice as many needle drops.) For a very large number of drops, however, the approximation should be pretty good.
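For readers who want to see the run-to-run variation, here is a minimal simulation sketch of the needle experiment. It assumes a needle whose length equals the line spacing (both set to 1); the function name `estimate_pi_buffon` and the seeds are illustrative choices of mine, not anything from the cited analysis. The simulation draws the needle's angle using the known value of π, so it illustrates the sampling behavior rather than offering an independent way to compute π.

```python
import math
import random


def estimate_pi_buffon(num_drops, seed=None):
    """Estimate pi from simulated needle drops (needle length = line spacing = 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_drops):
        # Distance from the needle's center to the nearest line, between 0 and 0.5.
        center = rng.uniform(0.0, 0.5)
        # Acute angle between the needle and the lines, between 0 and pi/2.
        angle = rng.uniform(0.0, math.pi / 2)
        # The needle crosses a line when its half-length, projected across
        # the lines, reaches at least as far as the nearest line.
        if center <= 0.5 * math.sin(angle):
            hits += 1
    if hits == 0:
        raise ValueError("no hits; use more drops")
    # Twice the number of drops divided by the number of hits.
    return 2 * num_drops / hits


if __name__ == "__main__":
    # Different seeds stand in for rerunning the physical experiment.
    for drops in (100, 10_000, 1_000_000):
        runs = [estimate_pi_buffon(drops, seed=s) for s in (1, 2, 3)]
        print(drops, [round(r, 4) for r in runs])
```

Short runs typically scatter noticeably around π, while long runs tend to agree to a couple of decimal places. That is the same behavior one should expect when a Monte Carlo program is rerun.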

MCMC computations are more complicated. They simulate a random walk that samples values of a random variable so as to ascertain a posterior probability distribution. The walk could get stuck for a long time in a particular region. Nevertheless, the general approach is very well established in statistics, and Monte Carlo methods are widely used throughout the sciences. 4/ Indeed, they were integral to the development of nuclear weapons. 5/ The book, Dark Sun: The Making of the Hydrogen Bomb, provides the following account:
On leave from the university, resting at home during his extended recovery [from a severe brain infection], [Stanislaw] Ulam amused himself playing solitaire. Sensitivity to patterns was part of his gift. He realized that he could estimate how a game would turn out if he laid down a few trial cards and then noted what proportion of his tries were successful, rather than attempting to work out all the possible combinations in his head. "It occurred to me then," he remembers, "that this could be equally true of all processes involving branching of events." Fission with its exponential spread of reactions was a branching process; so would the propagation of thermonuclear burning be. "At each stage of the [fission] process, there are many possibilities determining the fate of the neutron. It can scatter at one angle, change its velocity, be absorbed, or produce more neutrons by a fission of the target nucleus, and so on." Instead of trying to derive the expected outcomes of these processes with complex mathematics, Ulam saw, it should be possible to follow a few thousand individual sample particles, selecting a range for each particle's fate at each step of the way by throwing in a random number, and take the outcomes as an approximate answer—a useful estimate. This iterative process was something a computer could do. ...[W]hen he told [John] von Neumann about his solitaire discovery, the Hungarian mathematician was immediately interested in what he called a "statistical approach" that was "very well suited to a digital treatment." The two friends developed the mathematics together and named the procedure the Monte Carlo method (after the famous gaming casino in Monaco) for the element of chance it incorporated. 6/
Even without a computer in place, Los Alamos laboratory staff, including a "bevy of young women who had been hastily recruited to grind manually on electric calculators," 7/ performed preliminary calculations examining the feasibility of igniting a thermonuclear reaction. As Ulam recalled:
We started work each day for four to six hours with slide rule, pencil and paper, making frequent quantitative guesses. ... These estimates were interspersed with stepwise calculations of the behavior of the actual motions [of particles] ... The real times for the individual computational steps were short ... and the spatial subdivisions of the material assembly very small. ... The number of individual computational steps was therefore very large. We filled page upon page with calculations, much of it done by [Cornelius] Everett. In the process he almost wore out his own slide rule. ... I do not know how many man hours were spent on this problem. 8/
  1. In forensic DNA work, probabilities also are presented to explain the probative value of the discovery of a "deterministic" DNA profile -- one that is treated as known to a certainty. See David H. Kaye, SWGDAM Guidelines on "Probabilistic Genotyping Systems" (Part 2), Forensic Sci., Stat. & L., Oct. 25, 2015. In addition, the "genotypes" in "probabilistic genotyping" do not refer to genes.
  2. Office for Mathematical, Science and Technology Education, College of Education, University of Illinois, Buffon's Needle: An Analysis and Simulation,
  3. Id.
  4. See, e.g., Persi Diaconis, The Markov Chain Monte Carlo Revolution, 46 Bull. Am. Math. Soc'y 179-205 (2009); Sanjib Sharma, Markov Chain Monte Carlo Methods for Bayesian Data Analysis in Astronomy, arXiv:1706.01629 [astro-ph.IM], https://doi.org/10.1146/annurev-astro-082214-122339.
  5. Roger Eckhardt, Stan Ulam, John von Neumann, and the Monte Carlo Method, Los Alamos Sci., Special Issue 1987, pp. 131-41,
  6. Richard Rhodes, Dark Sun: The Making of the Hydrogen Bomb 303-04 (1995).
  7. Id. at 423 (quoting Françoise Ulam).
  8. Id.

Wednesday, October 11, 2017

District Court Rejects Defendant's Reliance on PCAST Report as a Reason to Exclude Fingerprint Evidence

Yesterday, the U.S. District Court for the Northern District of Illinois rejected a defendant's motion to exclude a latent fingerprint identification on the theory that "the method used is not sufficiently reliable foundationally or as applied to his case." 1/ The court also precluded, as too "distracting," cross-examination of the FBI latent-print examiner about the FBI's infamous error in apprehending Brandon Mayfield as the Madrid train bomber.

I. "Foundational Validity"

The "foundational validity" challenge came out of the pages of the 2016 PCAST Report. 2/ The PCAST Report seems to equate what it calls "foundational validity" for subjective pattern-matching methods to multiple "black box studies" demonstrating false-positive error probabilities of 5% or less, and it argues that Federal Rule of Evidence 702 requires such a showing of validity.

That this challenge would fail is unsurprising. According to the Court of Appeals for the Seventh Circuit, latent print analysis need not be scientifically valid to be admissible. Furthermore, even if the Seventh Circuit were to reconsider this questionable approach to the admissibility of applications of what the FBI and DHS call "the science of fingerprinting," the PCAST Report concludes that latent print comparisons have foundational scientific validity as defined above.

A. The Seventh Circuit Opinion in Herrera

Scientific validity is not a foundational requirement in the legal framework applied to latent print identification by the Court of Appeals for the Seventh Circuit. In United States v. Herrera, 3/ Judge Richard Posner 4/ observed that "the courts have frequently rebuffed" any "frontal assault on the use of fingerprint evidence in litigation." 5/ Analogizing expert comparisons of fingerprints to "an opinion offered by an art expert asked whether an unsigned painting was painted by the known painter of another painting" and even to eyewitness identifications, 6/ the court held these comparisons admissible because "expert evidence is not limited to 'scientific' evidence," 7/ the examiner was "certified as a latent print examiner by the International Association for Identification," 8/ and "errors in fingerprint matching by expert examiners appear to be very rare." 9/ To reach the last -- and most important -- conclusion, the court relied on the lack of cases of fingerprinting errors within a set of DNA-based exonerations (without indicating how often fingerprints were introduced in those cases), and its understanding that the "probability of two people in the world having identical fingerprints ... appears to be extremely low." 10/

B. False-positive Error Rates in Bonds

In the new district court case of United States v. Bonds, Judge Sara Ellis emphasized the court of appeals' willingness to afford district courts "wide latitude in performing [the] gate-keeping function." Following Herrera (as she had to), she declined to require "scientific validity" for fingerprint comparisons. 11/ This framework deflects or rejects most of the PCAST report's legal reasoning about the need for scientific validation of all pattern-matching methods in criminalistics. But even if "foundational validity" were required, the PCAST Report -- while far more skeptical of latent print work than was the Herrera panel -- is not so skeptical as to maintain that latent print identification is scientifically invalid. Judge Ellis quoted the PCAST Report's conclusion that "latent fingerprint analysis is a foundationally valid subjective methodology—albeit with a false positive rate that is substantial and is likely to be higher than expected by many jurors based on longstanding claims about the infallibility of fingerprint analysis."

Bonds held that the "higher than expected" error rates were not so high as to change the Herrera outcome for nonscientific evidence. Ignoring other research into the validity of latent-print examinations, Judge Ellis wrote that "[a]n FBI study published in 2011 reported a false positive rate (the rate at which the method erroneously called a match between a known and latent print) of 1 in 306, while a 2014 Miami-Dade Police Department Forensic Services Bureau study had a false positive rate of 1 in 18."

Two problems with the sentence are noteworthy. First, it supplies an inaccurate definition of a false positive rate. "[T]he rate at which the method erroneously called a match between a known and latent print" would seem to be an overall error rate for positive associations (matches) in the sample of prints and examiners who were studied. For example, if the experiment used 50 different-source pairs of prints and 50 same-source pairs, and if the examiners declared 5 matches for the different-source pairs and 5 for the same-source pairs, the erroneous matches are 5 out of 100, for an error rate of 5%. However, the false-positive rate is the proportion of positive associations reported for different-source prints. When comparing the 50 different-source pairs, the examiners erred in 5 instances, for a false-positive rate of 5/50 = 10%. In the 50 same-source pairs, there were no opportunities for a false positive. Thus, the standard definition of a false-positive error rate gives the estimate of 0.1 for the false-positive probability. This definition makes sense because none of the same-source pairs in the sample can contribute to false-positive errors.
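The arithmetic in this hypothetical experiment can be laid out in a few lines. This is only a sketch of the two competing definitions using the made-up counts from the example above; the function names are mine.

```python
from fractions import Fraction


def erroneous_match_share(false_matches, total_comparisons):
    """Erroneous matches as a share of ALL comparisons (the court's apparent definition)."""
    return Fraction(false_matches, total_comparisons)


def false_positive_rate(false_matches, different_source_pairs):
    """Erroneous matches as a share of different-source comparisons only."""
    return Fraction(false_matches, different_source_pairs)


# Hypothetical counts from the example in the text.
different_source_pairs = 50
same_source_pairs = 50
false_matches = 5  # matches declared for different-source pairs

overall = erroneous_match_share(false_matches, different_source_pairs + same_source_pairs)
fpr = false_positive_rate(false_matches, different_source_pairs)

print(float(overall))  # 0.05
print(float(fpr))      # 0.1
```

The first figure shrinks as more same-source pairs are added to the experiment, even though the examiners' performance on different-source pairs is unchanged. The second does not.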

Second, the sentence misstates the false positive rates reported in the two studies. Instead of "1 in 306," the 2011 Noblis-FBI experiment found that "[s]ix false positives occurred among 4,083 VID [value for identification] comparisons of nonmated pairs ... ." 12/ In other words (or numbers), the reported false-positive rate (for an examiner without the verification-by-another-examiner step) was 6/4083 = 1/681. This is the only false-positive rate in the body of the study. An online supplement to the article includes "a 95% confidence interval of 0.06% to 0.3% [1 in 1668 to 1 in 333]." 13/ A table in the supplement also reveals that, excluding conclusions of "inconclusive" from the denominator, as is appropriate from the standpoint of judges or jurors, the rate is 6/3628, which corresponds to 1 in 605.

Likewise, the putative rate of 1/18 does not appear in the unpublished Miami-Dade study. A table in the report to a funding agency states that the "False Positive Rate" was 4.2% "Without Inconclusives." 14/ This percentage corresponds to 1 in 24.

So where did the court get its numbers? They apparently came from a gloss in the PCAST Report. That report gives an upper (but not a lower) bound on the false-positive rates that would be seen if the studies used an enormous number of random samples of comparisons (instead of just one). Bending over backwards to avoid incorrect decisions against defendants, PCAST stated that the Noblis-FBI experiment indicated that "the rate could be as high as 1 error in 306 cases" and that the numbers in the Miami-Dade study admit of an error rate that "could be as high as 1 error in 18 cases." 15/ Of course, the error rates in the hypothetical infinite population could be even higher. Or they could be lower.
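To make the gap between an observed rate and an upper bound concrete, here is a rough sketch. It assumes that the bound is a one-sided 95% exact binomial (Clopper-Pearson) limit and that it was computed from the 6-in-3,628 figure in the Noblis-FBI supplement; both are my assumptions for illustration, since the method is not spelled out here. The helper names are mine, and only the standard library is used.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summing terms computed in log space."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def upper_bound(errors, trials, confidence=0.95):
    """One-sided exact upper bound: the p at which P(X <= errors) falls to 1 - confidence."""
    lo, hi = errors / trials, 1.0
    for _ in range(100):  # bisection; the CDF decreases as p increases
        mid = (lo + hi) / 2
        if binom_cdf(errors, trials, mid) > 1 - confidence:
            lo = mid  # so few errors are still plausible here, so the bound is higher
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    errors, trials = 6, 3628  # assumed inputs (false positives, conclusive nonmated comparisons)
    observed = errors / trials
    bound = upper_bound(errors, trials)
    print(f"observed rate: 1 in {1 / observed:.0f}")
    print(f"upper bound:   1 in {1 / bound:.0f}")
```

If the assumptions are right, the second line should come out close to the "1 in 306" figure, which shows that the court's number is a ceiling on the plausible rate rather than the rate the study reported.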

II. Discussing Errors at Trial

The PCAST Report accepts the longstanding view that traces of the patterns in friction ridge skin can be used to associate latent prints that contain sufficient detail with known prints. But it opens the door to arguments about the possibility of false positives. Bonds wanted to confine the analyst to presenting the matching features or, alternatively, to have the analyst declare a match but add that the "level of certainty of a purported match is limited by the most conservative reported false positive rate in an appropriately designed empirical study thus far (i.e., the 1 in 18 false positive rate from the 2014 Miami-Dade study)."

Using a probability of 1 in 18 to describe the "level of certainty" for the average positive association made by examiners like those studied to date seems "ridiculous." Cherry-picking a distorted number from a single study is hardly sound reasoning. And even if 1/18 were the best estimate of the false-positive probability that can be derived from the totality of the scientific research, applying it to explain the "level of certainty" one should have in the examiner's conclusion would not be straightforward. For one thing, the population-wide false-positive probability is not the probability that a given positive finding is false! Three distinct probabilities come into play. 16/ Explaining the real meaning of an estimate of the false-positive probability from PCAST's preferred "black-box" studies in court will be challenging for lawyers and criminalists alike. Merely to state that a number like 1/18 goes to "the weight of the evidence" and can be explored "on cross examination," as Judge Ellis did, is to sweep this problem under the proverbial rug -- or to put it aside for another day.
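A small calculation shows why the false-positive probability cannot simply be read as the chance that a reported match is wrong. The sketch below applies Bayes' rule with inputs that are purely hypothetical: a sensitivity of 0.9 and several made-up proportions of same-source pairs among the comparisons examiners face. Only the 1/18 figure comes from the discussion above, and the function name is mine.

```python
def prob_match_is_false(false_positive_rate, sensitivity, prior_same_source):
    """Probability that a declared match is wrong, by Bayes' rule.

    false_positive_rate: P(match declared | different sources)
    sensitivity:         P(match declared | same source)
    prior_same_source:   share of comparisons that truly are same-source
    """
    false_pos = false_positive_rate * (1 - prior_same_source)
    true_pos = sensitivity * prior_same_source
    if false_pos + true_pos == 0:
        raise ValueError("no matches would ever be declared with these inputs")
    return false_pos / (false_pos + true_pos)


if __name__ == "__main__":
    fpr = 1 / 18        # the disputed figure from the text
    sensitivity = 0.9   # hypothetical
    for prior in (0.1, 0.5, 0.9):  # hypothetical mixes of casework
        p = prob_match_is_false(fpr, sensitivity, prior)
        print(f"prior same-source = {prior}: P(match is false) = {p:.3f}")
```

With the false-positive probability held fixed, the chance that a given match is false swings widely as the other inputs change, which is why a single number like 1/18 does not by itself state a "level of certainty."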

  1. United States v. Myshawn Bonds, No. 15 CR 573-2 (N.D. Ill. Oct. 10, 2017).
  2. Executive Office of the President, President’s Council of Advisors on Science and Technology, Report to the President: Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods (Sept. 2016).
  3. 704 F.3d 480 (7th Cir. 2013).
  4. For remarks on another opinion from the judge, see Judge Richard Posner on DNA Evidence: Missing the Trees for the Forest?, Forensic Sci., Stat. & L., July 19, 2014, 
  5. Herrera, 704 F.3d at 484.
  6. Id. at 485-86.
  7. Id. at 486.
  8. Id.
  9. Id. at 487.
  10. Id.
  11. Judge Ellis stated that she "agree[d] with Herrera's broader reading of Rule 702's reliability requirement."
  12. Bradford T. Ulery, R. Austin Hicklin, JoAnn Buscaglia, & Maria Antonia Roberts, Accuracy and Reliability of Forensic Latent Fingerprint Decisions, 108(19) Proc. Nat’l Acad. Sci. (USA) 7733-7738 (2011).
  13. Available at
  14. Igor Pacheco, Brian Cerchiai, Stephanie Stoiloff, Miami-Dade Research Study for the Reliability of the ACE-V Process: Accuracy & Precision in Latent Fingerprint Examinations, Dec. 2014, at 53 tbl. 4.
  15. PCAST Report, supra note 2, at 94-95.
  16. The False-Positive Fallacy in the First Opinion to Discuss the PCAST Report, Forensic Sci., Stat. & L., November 3, 2016,

Friday, October 6, 2017

Should Forensic-science Standards Be Open Access?

The federal government has spent millions of dollars to generate and improve standards for performing forensic-science tests through the Organization of Scientific Area Committees for Forensic Science (OSAC). Yet, it does not require open access to the standards placed on its "OSAC Registry of Approved Standards." Perhaps that can be justified for existing standards that are the work of other authors -- as is the case for some pre-existing standards that have made it to the Registry. But shouldn't standards that are written by OSAC at public expense be available to the public rather than controlled by private organizations?

When the American Academy of Forensic Sciences (AAFS) established a Standards Board (the ASB) to "work closely with the [OSAC] Forensic Science Standards Board and its subcommittees, which are dedicated to creating a national registry of forensic standards," 1/ ASB demanded the copyright to all standards, no matter how little or how much it contributes to the writing of the standards. It insists that "the following disclaimer shall appear on all ASB published and draft documents:
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or on an intranet, without prior written permission from the Academy Standards Board, American Academy of Forensic Sciences, 410 North 21st Street, Colorado Springs, CO 80904,
Copyright © AAFS Standards Board [year]
Moreover, "[u]nless expressly agreed otherwise by the ASB, all material and information that is provided by participants and is incorporated into an ASB document is considered the sole and exclusive property of the AAFS Standards Board. Individuals shall not copy or distribute final or draft documents without the authorization of the ASB staff." 2/

The phrasing "is considered" departs from the ASB's own guidance that "[t]he active voice should be used in sentences." 3/ Who considers draft documents written by OSAC members "the sole and exclusive property of the AAFS Standards Board"? The ASB? The OSAC? The courts? Why should they? OSAC is not furthering the public interest by giving a private organization a monopoly over its work products. It should retain the copyright and reject the AAFS's unenforceable 4/ "no copying, no distributing" philosophy by releasing its standards under a Creative Commons Attribution license.

  1. Foreword, ASB Style Guide Manual for Standards, Technical Reports and Best Practice Recommendations (2016),
  2. Id. at 12.
  3. Id. at 1.
  4. The asserted restriction on reproduction cannot be enforced literally because many reproductions are fair uses of the copyrighted material. That is what allows me to reproduce the material quoted in this posting without ASB's permission. Arguably, reproducing an entire standard for noncommercial purposes would fall under the open-textured fair-use exception of 17 U.S.C. § 107.