Thursday, October 20, 2016

The First Opinion To Discuss the PCAST Report

In United States v. Chester, No. 13 CR 00774 (N.D. Ill. Oct. 7, 2016), a federal district court denied “defendants’ second joint renewed motion to exclude expert testimony regarding firearm toolmark analysis.” Normally, yet another federal district court opinion admitting testimony of a positive association in toolmarks would not be newsworthy. But this trial court decision came in response to a recent report from the President’s Council of Advisors on Science and Technology (PCAST). This report, Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods, argues that source-attribution testimony for bullet markings has not been demonstrated to possess scientific validity. 1/ Thus, Forensic Magazine ran this headline: “Firearms Evidence Allowed in Chicago Hobos Gang Trial—Despite PCAST Argument.” 2/ So what is PCAST's analysis, and how did the court overcome it?

PCAST expressed concern that the
“Theory of Identification as it Relates to Toolmarks”—which defines the criteria for making an identification—is circular. The “theory” states that an examiner may conclude that two items have a common origin if their marks are in “sufficient agreement,” where “sufficient agreement” is defined as the examiner being convinced that the items are extremely unlikely to have a different origin. In addition, the “theory” explicitly states that conclusions are subjective (p. 104).
The court did not dispute the absence of a standardized procedure with well-defined judgmental criteria for source attribution. Rather, it maintained that the very scientific literature cited by PCAST does not contradict a previous ruling in the case that the judgments of firearms examiners amount to the kind of “scientific knowledge” necessary to admit scientific evidence under the Supreme Court's opinion in Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579 (1993). As discussed at length in many publications (and occasionally in depth), Justice Blackmun's opinion in Daubert articulates a loose, multifactor standard for ascertaining the scientific soundness of proposed testimony, 3/ and the district court had examined the “Daubert factors” in its first ruling. This time, it limited its analysis to “error rates.” Judge John J. Tharp, Jr., wrote:
As such, the report does not dispute the accuracy or acceptance of firearm toolmark analysis within the courts. Rather, the report laments the lack of scientifically rigorous “blackbox” studies needed to demonstrate the reproducibility of results, which is critical to cementing the accuracy of the method. Id. at 11. The report gives detailed explanations of how such studies should be conducted in the future, and the Court hopes researchers will in fact conduct such studies. See id. at 106. However, PCAST did find one scientific study that met its requirements (in addition to a number of other studies with less predictive power as a result of their designs). That study, the “Ames Laboratory study,” found that toolmark analysis has a false positive rate between 1 in 66 and 1 in 46. Id. at 110. The next most reliable study, the “Miami-Dade Study” found a false positive rate between 1 in 49 and 1 in 21. Thus, the defendants’ submission places the error rate at roughly 2%. 3 The Court finds that this is a sufficiently low error rate to weigh in favor of allowing expert testimony. See Daubert v. Merrell Dow Pharms., 509 U.S. 579, 594 (1993) (“the court ordinarily should consider the known or potential rate of error”); United States v. Ashburn, 88 F. Supp. 3d 239, 246 (E.D.N.Y. 2015) (finding error rates between 0.9 and 1.5% to favor admission of expert testimony); United States v. Otero, 849 F. Supp. 2d 425, 434 (D.N.J. 2012) (error rate that “hovered around 1 to 2%” was “low” and supported admitting expert testimony). The other factors remain unchanged from this Court’s earlier ruling on toolmark analysis. See ECF No. 781.

3. Because the experts will testify as to the likelihood that rounds were fired from the same firearm, the relevant error rate in this case is the false positive rate (that is, the likelihood that an expert’s testimony that two bullets were fired by the same source is in fact incorrect).
But is it true that “the report does not dispute the accuracy ... of firearm toolmark analysis within the courts”? The report claims that the accuracy of the statements that appear in court is not known with adequate precision. According to the authors, no convincing range for the risks of errors can be derived from the existing scientific literature. PCAST insists that “[b]ecause firearms analysis is at present a subjective feature-comparison method, its foundational validity can only be established through multiple independent black box studies ... .” 4/ PCAST is emphatic (some would say dogmatic) in insisting that “the sole way to establish foundational validity is through multiple independent ‘black-box’ studies that measure how often examiners reach accurate conclusions across many feature-comparison problems involving samples representative of the intended use. In the absence of such studies, a feature-comparison method cannot be considered scientifically valid” (pp. 66, 68).

Under this criterion for scientific validity, it is hard to see how the PCAST report can be characterized as not disputing accuracy. The Miami-Dade study that the opinion relies on barely counts for PCAST. The report lists it under the heading of “Non-black-box studies of firearms analysis” (p. 106). That leaves a single study, and a single study cannot satisfy the report’s demand for multiple studies. For better or worse, PCAST's bottom line is clear:
At present, there is only a single study that was appropriately designed to test foundational validity and estimate reliability (Ames Laboratory study). Importantly, the study was conducted by an independent group, unaffiliated with a crime laboratory. Although the report is available on the web, it has not yet been subjected to peer review and publication.

The scientific criteria for foundational validity require appropriately designed studies by more than one group to ensure reproducibility. Because there has been only a single appropriately designed study, the current evidence falls short of the scientific criteria for foundational validity. There is thus a need for additional, appropriately designed black-box studies to provide estimates of reliability.
The district court in Chester read this passage as meaning that the science is there, but that it would be nice to have a few more studies to show that other researchers can replicate the small error rates in the unpublished study. But is not PCAST really saying that there is a paucity of acceptable experiments from which to ascertain applicable error probabilities? In this regard, much more than "reliability" (in the statistical sense) and "reproducibility" of a number is at issue. One should ask not just whether a second research group has replicated a given study, but more broadly, whether a solid body of studies with varied designs and different samples of examiners establishes that the findings as a whole are robust and generalizable.

In contrast, the Chester court is satisfied with two studies that it understands to reveal false positive error probabilities in the neighborhood of 2%. Where PCAST is unable to perceive evidence of “validity” in the sense of reasonably well known error probabilities, the court finds the probability of error to be small enough to allow testimony as to the origin of the bullets.
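The arithmetic behind the opinion's "roughly 2%" is easy to check. A minimal sketch, converting the "1 in N" figures quoted in the opinion into percentages (the fractions come from the opinion's summary of the report; the rounding is ours):

```python
# Convert the "1 in N" false-positive figures quoted in the Chester
# opinion into percentages. Note that only the Ames figures and the
# lower Miami-Dade bound sit near 2%; the upper Miami-Dade bound is
# closer to 5%.
rates = {
    "Ames low":        1 / 66,   # about 1.5%
    "Ames high":       1 / 46,   # about 2.2%
    "Miami-Dade low":  1 / 49,   # about 2.0%
    "Miami-Dade high": 1 / 21,   # about 4.8%
}
for label, rate in rates.items():
    print(f"{label}: {100 * rate:.1f}%")
```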

But which probability does the court conclude is comfortingly small? The PCAST report defines a false-positive error probability one way. The Chester court expresses a different understanding of the meaning of this probability. Stay tuned for later discussion of this "false-positive error fallacy."
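To preview the distinction: the studies measure the probability that an examiner reports a match given that the items actually have different sources, whereas the court's footnote describes the probability that the sources actually differ given that the examiner reports a match. The two can diverge sharply. A hypothetical sketch, assuming Bayes' rule with an invented sensitivity and prior (only the 2% rate comes from the studies discussed above):

```python
# Two conditional probabilities that are easy to conflate.
#   fpr:  P(report "same source" | sources actually differ)  -- measured by the studies
#   goal: P(sources actually differ | report "same source")  -- what the footnote describes
# The second also depends on the prior and the true-positive rate,
# both hypothetical here.
fpr = 0.02     # false positive rate, from the studies discussed above
tpr = 0.95     # P(report match | same source) -- assumed for illustration
prior = 0.5    # P(same source) before the comparison -- assumed

p_match = tpr * prior + fpr * (1 - prior)          # P(report "same source")
p_wrong_given_match = fpr * (1 - prior) / p_match  # Bayes' rule

print(f"P(report match | different sources) = {fpr:.1%}")
print(f"P(different sources | report match) = {p_wrong_given_match:.1%}")
```

With a 50% prior the two numbers happen to be close, but lower the prior to 10% and the second probability climbs to roughly 16% while the first stays at 2%, which is why equating them is a fallacy.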

  1. See Eric Lander, William Press, S. James Gates, Jr., Susan L. Graham, J. Michael McQuade, and Daniel Schrag, PCAST Releases Report on Forensic Science in Criminal Courts, Sept. 20, 2016.
  2. Seth Augenstein, Firearms Evidence Allowed in Chicago Hobos Gang Trial—Despite PCAST Argument, Forensic Mag., Oct. 13, 2016.
  3. See, e.g., David H. Kaye, David E. Bernstein & Jennifer L. Mnookin, The New Wigmore: A Treatise on Evidence: Expert Evidence (2d ed. 2011) (updated annually).
  4. P. 106. The definition of “black-box study” at page 48, Box 2, is seriously incomplete. There, the report explains that “[b]y a ‘black-box study,’ we mean an empirical study that assesses a subjective method by having examiners analyze samples and render opinions about the origin or similarity of samples.” But “empirical” studies come in a multitude of designs. The PCAST authors have specific ideas about the design of empirical studies that they call “black-box studies.”
