Saturday, September 21, 2019

Police Genetic Genealogy at GEDmatch: Is Opt-in the Best Policy?

With over a million DNA scans in its database, GEDmatch has been instrumental in the success of genetic genealogy searches in criminal investigations. The genealogy website
provides applications for comparing your DNA test results with those of other people. There are also applications for estimating your ancestry. Some applications are free. More advanced applications require membership in the GEDmatch Tier1 program at $10 per month.
By uploading the text file of genetic data from a direct-to-consumer genetic testing company such as 23andMe to a publicly accessible database such as GEDmatch, genealogy enthusiasts can discover whether they have large blocks of DNA in common with other individuals who have uploaded their data. Extensive haploblock matching might reflect membership in the same family tree.

Police have used this service to identify possible relatives of the unknown individuals whose DNA was recovered from a crime scene. Ordinary genealogy research into various records may lead to the common ancestor and then back down to the descendants -- including the one who left the crime-scene DNA.

This process of genetic genealogy for criminal investigations attracted extensive publicity, leading to a series of changes in GEDmatch's policies. At first, the DNA data from all participants (who chose to make their data available for searches by others) was open to law enforcement trawls. Later, the website's users could designate their data as not available for criminal investigations but still open for searches by the general public. In other words, they could opt out of this one use. Then, the policy became opt-in only. Unless a participant affirmatively chooses to make his or her data available for criminal investigations, it is not included in trawls by police who identify themselves as such.

If police respect these policies, GEDmatch may be useless to them. In the first three weeks since GEDmatch moved to the restrictive policy, only 50,000 people opted in. CeCe Moore, the genetic genealogist with Parabon NanoLabs, which has scored the most successes for police, reported that the crime-scene uploads in that period stopped producing usable matches. "It’s basically useless now," said Moore. "Our work on any new cases is significantly stalled."
Does the opt-in policy go too far? GEDmatch encourages people to use an alias. Users can be contacted, but the individual inquiring won’t find them unless they choose to respond, and anyone can take his or her information off GEDmatch at any time. However, when there is a potential lead to a crime-scene sample, it would be easy for police or prosecutors to issue a subpoena or secure a search warrant to find the email address if not the name behind the alias.

By the way, the owner of GEDmatch said that he is considering charging police a fee to use the website. Unless the database accessible to police returns to its earlier searchable size of a million or so, he may not have many such customers.
C.J. Guerrini et al., Should Police Have Access to Genetic Genealogy Databases?, PLOS Biology, Oct. 2, 2018, 16(10): e2006906.

We conducted the survey online using Amazon Mechanical Turk (MTurk) ... . We restricted participation to individuals who were 18 years of age or older and located in the United States and paid them US$0.25 for taking the survey. ... [T]he majority supported police searches of genetic websites that identify genetic relatives (79%) and disclosure of DTC genetic testing customer information to police (62%), as well as the creation of fake profiles of individuals by police on genealogy websites (65%) (Fig 1). However, respondents were significantly more supportive of these activities (all p < 0.05) when the purpose is to identify perpetrators of violent crimes (80%), perpetrators of crimes against children (78%), or missing persons (77%) than when the purpose is to identify perpetrators of nonviolent crimes (39%).

My note: MTurk respondents may not be representative of users of GEDmatch.


Thursday, August 15, 2019

Post PCAST: Washington D.C. High Court Won't Tolerate No-doubt Testimony Matching a Bullet to a Single Gun

In an opinion relying in part on the PCAST Report on feature-comparison evidence, the District of Columbia's highest court discussed limits on the testimony a firearms-toolmark examiner can give. But it did not get very far. An earlier opinion in Williams v. United States, 130 A.3d 343 (D.C. 2016), determined that the admission of some extreme testimony (described below) was not plain error. In Williams v. United States, 210 A.3d 734 (D.C. 2019) (Williams II), the Court of Appeals revisited the plain-error question in light of later rulings decided before sentencing. It concluded that even though entertaining the opinion testimony was "error" and the error was "plain," the "plain error" exception to the rule against reversing a conviction on the basis of unobjected-to testimony did not justify reversal. (I know, that is a convoluted sentence, but the law on the plain-error exception to the need for a contemporaneous objection is convoluted.)

At trial,
[T]he examiner opined that “these three bullets were fired from this firearm.” On redirect, when asked whether there was “any doubt in [his] mind” that the bullets recovered from Mr. Kang's SUV were fired from the gun found in Mr. Williams's bedroom, the examiner responded, “[n]o, sir.” The examiner elaborated that “[t]hese three bullets were identified as being fired out of Exhibit No. 58. And it doesn't matter how many firearms Hi[-]Point made. Those markings are unique to that gun and that gun only.” The examiner then restated his unequivocal opinion: “Item Number 58 fired these three bullets.”
(Citations omitted). On the petition for rehearing that generated the Williams II opinion, the government relied on a footnote in one of these cases, Gardner v. United States, 140 A.3d 1172 (D.C. 2016). The note in Gardner stated that the holding was “limited in that it allows toolmark experts to offer an opinion that a bullet or shell casing was fired by a particular firearm, but it does not permit them to do so with absolute or 100% certainty.” 140 A.3d at 1184 n.19. The government argued in Williams II that "this footnote authorized opinion testimony identifying a specific bullet as having been fired by a specific gun."

Justice Catharine Easterly's opinion for the court found this interpretation of the footnote "difficult to square with the above-the-line holding that the trial court 'had erred' by admitting the examiner's 'unqualified opinion' that 'the silver gun was the murder weapon.'" Id. at 1184. The opinion added that
Moreover, the publication post Gardner of another federal government report—President's Council of Advisors on Science and Technology (“PCAST”), Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods (Sept. 2016), ... [which] reiterates [that] toolmark and firearms examiners do not currently have a basis to give opinion testimony that matches a specific bullet to a specific gun and that such testimony should not be admitted without a verifiable error rate[,] does not support the government's argument that only express statements of certainty should be prohibited.
Nonetheless, the opinion did not "resolve the ambiguity of Gardner's footnote" because "in this case ... the firearms and toolmark examiner not only testified ... that a specific bullet could be matched to a specific gun, but also that he did not have 'any doubt' about his conclusion." (Footnote omitted.) In the end, after emphasizing that the no-doubt-specific-source testimony "was error," and "the error is plain," the court only held that the plain-error exception to the rule against reversing on the basis of evidence that was not the subject of a contemporaneous objection did not apply. It did not apply because, considering "the government['s] powerful circumstantial case" in other respects, "Mr. Williams ... cannot show a reasonable probability of a different result absent this error."

Reading between the lines, it appears that Justice Easterly was unable to convince the other two panel members to explicitly adopt (in dictum) the procedure PCAST recommended for source attributions -- a categorical conclusion accompanied by the upper bound of an estimated rate of Type I error as seen in so-called black-box experiments (or something similar). She wrote separately that the Gardner footnote "can only logically be understood in one way: as an acknowledgment that the government might be able to present expert opinion testimony that a specific bullet was fired by a specific gun if the examiner could reliably qualify his pattern-matching opinion—i.e., if he can provide a verifiable error rate." To which Senior Judge Frank Nebeker replied in his separate opinion: "This is not a case in which to resolve the knotty question of to what degree of certainty, or not, an expert's opinion is admissible as to a particular fact." 1/

  1. Whether any source-attribution opinion -- with or without some initial qualification as to the degree of certainty -- is necessary or desirable is a further question, even more removed from what Judge Nebeker loosely called a "harmless error judgment." (The harmless-error doctrine is a little different from the plain-error doctrine.)

Sunday, July 21, 2019

Confidence Intervals -- If Only It Were That Simple

Confidence Interval: Statistics such as means (or averages) and medians are often calculated from data from a portion—or sample—of a population rather than from data for an entire population. Statistics based on sample data are called “sample statistics,” whereas those based on an entire population are called “population parameters.” A confidence interval is the range of values of a sample statistic that is likely to contain a population parameter, and that likeliness is expressed with a specific probability. For example, if a study of a sample of 1,500 Americans finds their average weight to be 150 pounds with a 95 percent confidence interval of plus/minus 25 pounds, this means that there is a 95 percent probability that the average weight of the entire American population is between 125 and 175 pounds. --Wm. Nöel & Judy Wang, Is Cannabis a Gateway Drug? Key Findings and Literature Review: A Report Prepared by the Federal Research Division, Library of Congress, Under an Interagency Agreement with the Office of the Director, National Institute of Justice, Office of Justice Programs, U.S. Department of Justice, Nov. 2018, at 3.

[T]here is a 5 percent chance the true value [of a 95% one-sided confidence interval] exceeds the bound. --President’s Council of Advisors on Science and Technology, Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods, Sept. 2016, at 153.
[T]he confidence level does not give the probability that the unknown parameter lies within the confidence interval. ... According to the frequentist theory of statistics, probability statements cannot be made about population characteristics: Probability statements apply to the behavior of samples. That is why the different term ‘confidence’ is used. --David H. Kaye & David A. Freedman, Reference Guide on Statistics, in Reference Manual on Scientific Evidence 211, 247 (Federal Judicial Center & National Research Council Committee on the Development of the Third Edition of the Reference Manual on Scientific Evidence eds., 3d ed. 2011).

Warning! ... [T]he fact that a confidence interval is not a probability statement about [an unknown value] is confusing. --Larry Wasserman, All of Statistics: A Concise Course in Statistical Inference 93 (2004) (emphasis in original).
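The distinction the quoted warnings draw -- that "confidence" describes the long-run behavior of the interval-constructing procedure, not the probability that any one computed interval contains the parameter -- can be illustrated with a short simulation (a sketch of mine, not drawn from the quoted sources):

```python
import random
import statistics

random.seed(1)
TRUE_MEAN, SD, N, TRIALS = 150.0, 30.0, 100, 10_000
Z = 1.96  # normal critical value for a two-sided 95% interval

covered = 0
for _ in range(TRIALS):
    # Draw a sample, compute the usual 95% interval for the mean,
    # and record whether it happens to cover the true mean.
    sample = [random.gauss(TRUE_MEAN, SD) for _ in range(N)]
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / N ** 0.5
    if m - Z * se <= TRUE_MEAN <= m + Z * se:
        covered += 1

print(covered / TRIALS)  # close to 0.95
```

About 95% of the intervals generated this way cover 150, but any particular interval either contains 150 or it does not; the 95% attaches to the procedure, not to a single realized interval.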

Wednesday, July 17, 2019

No Tension Between Rule 704 and Best Principles for Interpreting Forensic-science Test Results

At a webinar on probabilistic genotyping organized by the FBI, the Department of Justice’s Senior Advisor on Forensic Science, Ted Hunt, summarized the rules of evidence that are most pertinent to scientific and expert testimony. In the course of a masterful survey, he suggested that Federal Rule of Evidence 704 somehow conflicts with the evidence-centric approach to evaluating laboratory results recommended by a subcommittee of the National Commission on Forensic Science, by the American Statistical Association, and by European forensic-science service providers. 1/ In this approach, the expert stops short of opining on whether the defendant is the source of the trace. Instead, the expert merely reports that the data are L times more probable when one source hypothesis is true than when some alternative source hypothesis is true. (Or, the expert gives some qualitative expression such as "strong support" when this likelihood ratio is large.)
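A hypothetical sketch of what such evidence-centric reporting looks like in code. The probabilities and the verbal-scale cutoffs below are invented for illustration; published scales vary:

```python
def likelihood_ratio(p_data_given_h1: float, p_data_given_h2: float) -> float:
    """LR = P(data | H1) / P(data | H2): how much more probable the data are
    under one source hypothesis than under the alternative."""
    return p_data_given_h1 / p_data_given_h2

def verbal_scale(lr: float) -> str:
    """A qualitative expression for the LR (cutoffs are illustrative only)."""
    if lr >= 10_000:
        return "very strong support for H1"
    if lr >= 1_000:
        return "strong support for H1"
    if lr >= 100:
        return "moderately strong support for H1"
    if lr >= 10:
        return "moderate support for H1"
    if lr > 1:
        return "limited support for H1"
    return "no support for H1 over H2"

# Invented probabilities: the expert reports the ratio (or its verbal label),
# not an opinion that H1 is true.
lr = likelihood_ratio(0.9, 0.0006)
print(round(lr), verbal_scale(lr))  # 1500 strong support for H1
```

Note what the function does not do: it never outputs "the defendant is the source." That further step requires prior probabilities, which the evidence-centric approach leaves to the factfinder.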

Whatever the merits of these proposals, Rule 704 does not stand in the way of implementing the recommended approach to reporting and testifying. First, the identity of the source of a trace is not necessarily an ultimate issue. To use the example of latent-print identification given in the webinar, the traditional opinion that a named individual is the source of a print is not an opinion on an ultimate issue. Courts have long allowed examiners to testify that the print lifted from a gun comes from a specific finger. But this conclusion is not an opinion on whether the murder defendant is the one who pulled the trigger. The examiner’s source attribution bears on the ultimate issue of causing the death of a human being, but the examiner who reports that the prints were defendant's is not opining that the defendant not only touched the gun (or had prints planted on it) but also pulled the trigger. Indeed, the latent print examiner would have no scientific basis for such an opinion on an element of the crime of murder.

Furthermore, even when an expert does want to express an opinion on an ultimate issue, Rule 704 does not counsel in favor of admitting it into evidence. Rule 704(a) consists of a single sentence: “An opinion is not objectionable just because it embraces an ultimate issue.” The sole function of these words is to repeal an outmoded, common-law rule categorically excluding these opinions. The advisory committee that drafted this repealing rule explained that “to allay any doubt on the subject, the so-called ‘ultimate issue’ rule is specifically abolished by the instant rule.” The committee expressed no positive preference for such opinions over evidence-centric expert testimony. It emphasized that Rules 701, 702, and 403 protect against unsuitable opinions on ultimate issues. Modern courts continue to exclude ultimate-opinion testimony when it is not sufficiently helpful to jurors. For example, conclusions of law remain highly objectionable.

Consequently, any suggestion that Rule 704 is an affirmative reason to admit one kind of testimony over another is misguided. “The effect of Rule 704 is merely to remove the proscription against opinions on ‘ultimate issues' and to shift the focus to whether the testimony is ‘otherwise admissible.’” 2/ If conclusion-centric testimony is admissible, then so is the evidence-centric evaluation that lies behind it--with or without the conclusion.

In sum, there is no tension between Rule 704(a) and the recommendation to follow the evidence-centric approach. Repealing a speed limit on a road does not imply that drivers should put the pedal to the floor.

  1. This is the impression I received. The recording of the webinar should be available at the website of the Forensic Technology Center of Excellence in a week or two.
  2. Torres v. County of Oakland, 758 F.2d 147, 150 (6th Cir.1985).
UPDATED: 18 July 2019 6:22 AM

Saturday, July 6, 2019

Distorting Daubert and Parting Ways with PCAST in Romero-Lobato

United States v. Romero-Lobato 1/ is another opinion applying the criteria for admissibility of scientific evidence articulated in Daubert v. Merrell Dow Pharmaceuticals 2/ to uphold the admissibility of a firearms examiner's conclusion that the microscopic marks on recovered bullets prove that they came from a particular gun. To do so, the U.S. District Court for the District of Nevada rejects the conclusions of the President's Council of Advisors on Science and Technology (PCAST) on validating a scientific procedure.

This is not to say that the result in the case is wrong. There is a principled argument for admitting suitably confined testimony about matching bullet or ammunition marks. But the opinion from U.S. District Court Judge Larry R. Hicks does not contain such an argument. The court does not reach the difficult question of how far a toolmark expert may go in forging a link between ammunition and a particular gun. It did not have to. In what seems to be a poorly developed challenge to firearms-toolmark expertise, the defense sought to exclude all testimony about such an association.

This posting describes the facts of the case, the court's description of the law on the admissibility of source attributions by firearms-toolmark examiners, and its review of the practice under the criteria for admitting scientific evidence set forth by the Supreme Court in Daubert.


A grand jury indicted Eric Romero-Lobato for seven felonies. On March 4, 2018, he allegedly tried to rob the Aguitas Bar and Grill and discharged a firearm (a Taurus PT111 G2) into the ceiling. On May 14, he allegedly stole a woman's car at gunpoint while she was cleaning it at a carwash. Later that night, he crashed the car in a high-speed chase. On the front passenger's seat was a Taurus PT111 G2 handgun.

Steven Johnson, a supervising criminalist in the Forensic Science Division of the Washoe County Sheriff's Office, 3/ was prepared to testify that the handgun had fired a round into the ceiling of the bar. Romero-Lobato moved "to preclude the testimony." The district court held a pretrial hearing at which Johnson testified to his background, training, and experience. He explained that he matched the bullet to the gun using the "AFTE method" advocated by the Association of Firearm and Tool Mark Examiners.

Defendant's challenge rested "on the critical NAS and PCAST Reports as evidence that 'firearms analysis' is not scientifically valid and fails to meet the requisite threshold for admission under Daubert and Federal Rule of Evidence 702." Apparently, the only expert at the hearing was the Sheriff Department's criminalist. Judge Hicks denied the motion to exclude Johnson's expert opinion testimony and issued a relatively detailed opinion.


Skipping over the early judicial resistance to "this is the gun" testimony, 4/ the court noted that despite misgivings about such testimony on the part of several federal district courts, only one reported case has barred all source opinion testimony 5/ and the trend among the more critical courts is to search for ways to admit the conclusion with qualifications on its certainty.

Judge Hicks did not pursue the possibility of admitting but constraining the testimony, apparently because the defendant did not ask for that. Instead, the court reasoned that to overcome the inertia of the current caselaw, a defendant must have extraordinarily strong evidence (although it also recognized that the burden is on the government to prove scientific validity under Daubert). The judge wrote:
[T]he defense has not cited to a single case where a federal court has completely prohibited firearms identification testimony on the basis that it fails the Daubert reliability analysis. The lack of such authority indicates to the Court that defendant's request to exclude Johnson's testimony wholesale is unprecedented, and when such a request is made, a defendant must make a remarkable argument supported by remarkable evidence. Defendant has not done so here.
Defendant's less-than-remarkable evidence was primarily two consensus reports of scientific and other experts who reviewed the literature on firearms-mark comparisons. 6/ Both are remarkable. The first document was the highly publicized National Academy of Sciences committee report on improving forensic science. The committee expressed concerns about the largely subjective comparison process and the absence of studies to adequately measure the uncertainty in the evaluations. The court deemed these concerns to be satisfied by a single research report submitted to the Department of Justice, which funded the study:
The NAS Report, released in 2009, concluded that “[s]ufficient studies have not been done to understand the reliability and repeatability” of firearm and toolmark examination methods. ... The Report's main issue with the AFTE method was that it did not provide a specific protocol for determining a match between a shell casing or bullet and a specific firearm. ... Instead, examiners were to rely on their training and experience to determine if there was a “sufficient agreement” (i.e. match) between the mark patterns on the casing or bullet and the firearm's barrel. ... During the Daubert hearing, Johnson testified about his field's response to the NAS Report, pointing to a 2013 study from Miami-Dade County (“Miami-Dade Study”). The Miami-Dade Study was conducted in direct response to the NAS Report and was designed as a blind study to test the potential error rate for matching fired bullets to specific guns. It examined ten consecutively manufactured barrels from the same manufacturer (Glock) and bullets fired from them to determine if firearm examiners (165 in total) could accurately match the bullets to the barrel. 150 blind test examination kits were sent to forensics laboratories across the United States. The Miami-Dade Study found a potential error rate of less than 1.2% and an error rate by the participants of approximately 0.007%. The Study concluded that “a trained firearm and tool mark examiner with two years of training, regardless of experience, will correctly identify same gun evidence.”
A more complete (and accurate) reading of the Miami Dade Police Department's study shows that it was not designed to measure error rates as they are defined in the NAS report and that the "error rate" was much closer to 1%. That's still small, and, with truly independent verification of an examiner's conclusions, the error rate should be smaller than that for examiners whose findings are not duplicated. Nonetheless, as an earlier posting shows, the data are not as easily interpreted and applied to case work as the report from the crime laboratory suggests. The research study, which has yet to appear in any scientific journal, has severe limitations.

The second report, released late in 2016 by the President's Council of Advisors on Science and Technology (PCAST), flatly maintained that microscopic firearms-marks comparisons had not been scientifically validated. Essentially dismissing the Miami Dade Police study and earlier research as not properly designed to measure the ability of examiners to infer whether the same gun fired test bullets and ones recovered from a crime scene, PCAST reasoned that (1) AFTE-type identification had yet to be shown to be "reliable" within the meaning of Rule 702 (as PCAST interpreted the rule); and (2) if courts disagreed with PCAST's legal analysis of the rule's requirements, they should at least require examiners associating ammunition with a particular firearm to give an upper bound, as ascertained from controlled experiments, on false-positive associations. (These matters are discussed in previous postings.)

The court did not address the second conclusion and gave little or no weight to the first one. It wrote that the 2016 report
concluded that there was only one study done that “was appropriately designed to test foundational validity and estimate reliability,” the Ames Laboratory Study (“Ames Study”). The Ames Study ... reported a false-positive rate of 1.52%. ... The PCAST Report did not reach a conclusion as to whether the AFTE method was reliable or not because there was only one study available that met its criteria.
All true. PCAST certainly did not write that there is a large body of high-quality research that proves toolmark examiners cannot associate expended ammunition with specific guns. PCAST's position is that a single study is not a body of evidence that establishes a scientific theory--replication is crucial. If the court believed that there is such a body of literature, it should have explained the basis for its disagreement with the Council's assessment of the literature. If it agreed with PCAST that the research base is thin, then it should have explained why forensic scientists should be able to testify--as scientists--that they know which gun fired which bullet. This opinion does neither. (I'll come to the court's discussion of Daubert below.)
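PCAST's recommended qualification -- reporting an upper confidence bound on the false-positive rate rather than the point estimate alone -- can be computed with the exact (Clopper-Pearson) method for a binomial proportion. The counts below (22 false positives in 1,445 nonmatching comparisons) are illustrative values chosen to match the 1.52% rate the court quoted, not figures taken from the Ames report itself:

```python
from math import comb

def binom_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

def cp_upper(x: int, n: int, conf: float = 0.95, tol: float = 1e-8) -> float:
    """One-sided Clopper-Pearson upper confidence bound for a proportion:
    the largest p consistent with observing x or fewer events in n trials."""
    if x == n:
        return 1.0
    lo, hi = x / n, 1.0
    while hi - lo > tol:  # bisect: binom_cdf is decreasing in p
        mid = (lo + hi) / 2
        if binom_cdf(x, n, mid) > 1 - conf:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x, n = 22, 1445  # illustrative counts only
print(f"point estimate {x / n:.3%}, 95% upper bound {cp_upper(x, n):.3%}")
```

The upper bound (a bit over 2%) is what PCAST would have the examiner disclose: the data cannot rule out a false-positive rate noticeably higher than the point estimate.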

Instead, the court repeats the old news that
the PCAST Report was criticized by a number of entities, including the DOJ, FBI, ATF, and AFTE. Some of their issues with the Report were its lack of transparency and consistency in determining which studies met its strict criteria and which did not and its failure to consult with any experts in the firearm and tool mark examination field.
Again, all true. And all so superficial. That prosecutors and criminal investigators did not like the presidential science advisors' criticism of their evidence is no surprise. But exactly what was unclear about PCAST's criteria for replicated, controlled, experimental proof? In fact, the DOJ later criticized PCAST for being too clear--for having a "nine-part" "litmus test" rather than more obscure "trade-offs" with which to judge what research is acceptable. 7/

And what was the inconsistency in PCAST's assessment of firearms-marks comparisons? Judge Hicks maintained that
The PCAST Report refused to consider any study that did not meet its strict criteria; to be considered, a study must be a “black box” study, meaning that it must be completely blind for the participants. The committee behind the report rejected studies that it did not consider to be blind, such as where the examiners knew that a bullet or spent casing matched one of the barrels included with the test kit. This is in contrast to studies where it is not possible for an examiner to correctly match a bullet to a barrel through process of elimination.
This explanation reveals no inconsistency. The complaint seems to be that PCAST's criteria for validating a predominantly subjective feature-comparison procedure are too demanding or restrictive, not that these criteria were applied inconsistently. Indeed, no inconsistency in applying the "litmus test" for an acceptable research design to firearms-mark examinations is apparent.

Moreover, the court's definition of "a 'black box' study" is wrong. All that PCAST meant by "black box" is that the researchers are not trying to unpack the process that examiners use and inspect its components. Instead, they say to the examiner, "Go ahead, do your thing. Just tell us your answer, and we'll see if you are right." The term is used by software engineers who test complex programs to verify that the outputs are what they should be for the inputs. The Turing test for the proposition that "machines can think" is a kind of black box test.
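The software-testing sense of the term can be shown in a few lines: a black-box test scores a procedure only by its outputs on inputs with known ground truth, never by inspecting its internals. The "examiner" below is a toy stand-in of my own, not a model of real casework:

```python
def black_box_score(examiner, labeled_pairs):
    """Count false positives and false negatives from outputs alone.
    `examiner` is treated as an opaque function: input in, answer out."""
    fp = fn = 0
    for pair, same_source in labeled_pairs:
        call = examiner(pair)  # "id" (same source) or "exclusion"
        if call == "id" and not same_source:
            fp += 1
        elif call == "exclusion" and same_source:
            fn += 1
    return fp, fn

# Toy examiner: declares an identification whenever the two items
# share a first character. We never look inside its reasoning.
toy_examiner = lambda pair: "id" if pair[0][0] == pair[1][0] else "exclusion"
trials = [(("ax", "ay"), True), (("ax", "bz"), False), (("ab", "ac"), False)]
print(black_box_score(toy_examiner, trials))  # (1, 0): one false positive
```

This is all "black box" means in the PCAST report: ground-truth inputs go in, the examiner's answers come out, and error rates are tallied from the answers alone.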

Nonetheless, this correction is academic. The court is right about the fact that PCAST gave no credence to "closed tests" like those in which an examiner sorts bullets into pairs knowing in advance that every bullet has a mate. Such black-box experiments are not worthless. They show a nonzero level of skill, but they are easier than "open tests" in which an examiner is presented with a single pair of bullets to decide whether they have a common source, then another pair, and another, and so on. In Romero-Lobato, the examiner had one bullet from the ceiling to compare to a test bullet he fired from one suspect gun. There is no "trade-off" that would make the closed-test design appropriate for establishing the examiner's skill at the task he performed.

All that remains of the court's initial efforts to avoid the PCAST report is the tired complaint about a "failure to consult with any experts in the firearm and tool mark examination field." But what consultation does the judge think was missing? The scientists and technologists who constitute the Council asked the forensic science community for statements and literature to support their practices. It shared a draft of its report with the Department of Justice before finalizing it. After releasing the report, it asked for more responses and issued an addendum. Forensic-services providers may complain that the Council did not use the correct criteria, that its members were closed-minded or biased, or that the repeated opportunities to affect the outcome were insufficient or even a sham. But a court needs more than a throw-away sentence about "failure to consult" to justify treating the PCAST report as suspect.


Having cited a single, partly probative police laboratory study as if it were a satisfactory response to the National Academy's concerns and having colored the President's Council report as controversial without addressing the limited substance of the prosecutors' and investigators' complaints, the court offered a "Daubert analysis." It marched through the five indicia that the Supreme Court enumerated as factors that courts might consider in assessing scientific validity and reliability.

A. It Has Been Tested

The Romero-Lobato opinion made much of the fact that "[t]he AFTE methodology has been repeatedly tested" 8/ through "numerous journals [sic] articles and studies exploring the AFTE method" 9/ and via Johnson's perfect record on proficiency tests as proved by his (hearsay and character evidence) testimony. Einstein once expressed impatience "with scientists who take a board of wood, look for its thinnest part and drill a great number of holes where drilling is easy." 10/ Going through the drill of proficiency testing does not prove much if the tests are simple and unrealistic. A score of trivial or poorly designed experiments should not engender great confidence. The relevant question under Daubert is not simply "how many tests so far?" It is how many challenging tests have been passed. The opinion makes no effort to answer that question. It evinces no awareness of the "10 percent error rate in ballistic evidence" noted in the NAS Report that prompted corrective action in the Detroit Police crime laboratory.

Instead of responding to PCAST's criticisms of the design of the AFTE Journal studies, the court wrote that "[a]lthough both the NAS and PCAST Reports were critical of the AFTE method because of its inherent subjectivity, their criticisms do not affect whether the technique they criticize has been repeatedly tested. The fact that numerous studies have been conducted testing the validity and accuracy of the AFTE method weighs in favor of admitting Johnson's testimony."

But surely the question under Daubert is not whether there have been "numerous studies." It is what these studies have shown about the accuracy of trained examiners to match a single unknown bullet with control bullets from a single gun. The court may have been correct in concluding that the testing prong of Daubert favors admissibility here, but its opinion fails to demonstrate that "[t]here is little doubt that the AFTE method of identifying firearms satisfies this Daubert element."

B. Publication and Peer Review

Daubert recognizes that, to facilitate the dissemination, criticism, and modification of theories, modern science relies on publication in refereed journals that members of the scientific community read. Romero-Lobato deems this factor to favor admission for two reasons. First, the AFTE Journal, in which virtually all the studies dismissed by PCAST appear, uses referees. That it is not generally regarded as a significant scientific journal -- it is not available through most academic libraries, for example -- went unnoticed.

Second, the court contended that "of course, the NAS and PCAST Reports themselves constitute peer review despite the unfavorable view the two reports have of the AFTE method. The peer review and publication factor therefore weighs in favor of admissibility." The idea that consensus reports' rejection of a series of studies as truly validating a theory "weighs in favor of admissibility" is difficult to fathom. Some readers might find it preposterous.

C. Error Rates

Just as the court was content to rely on the absolute number of studies as establishing that the AFTE method has been adequately tested, it took the error rates reported in the questioned studies at face value. Finding the numbers to be "very low," and implying (without explanation) that PCAST's criteria are too "strict," it concluded that Daubert's "error rate" factor too "weighs in favor of admissibility."

A more plausible conclusion is that a large body of studies that fail to measure the error rates (false positive and negative associations) appropriately but do not indicate very high error rates is no more than weakly favorable to admission. (For further discussion, see the previous postings on the court's discussion of the Miami Dade and Ames Laboratory technical reports.)

D. Controlling Standards

The court cited no controlling standards for the judgment of "'sufficient agreement' between the 'unique surface contours' of two toolmarks." After reciting the AFTE's definition of "sufficient agreement," Judge Hicks decided that "matching two tool marks essentially comes down to the examiner's subjective judgment based on his training, experience, and knowledge of firearms. This factor weighs against admissibility."

However, the opinion adds that "the consecutive matching striae ('CMS') method," which Johnson used after finding "sufficient agreement," is "an objective standard under Daubert." It is "objective" because an examiner cannot conclude that there is a match unless he "observes two or more sets of three or more consecutive matching markings on a bullet or shell casing." The opinion did not consider the possibility that this numerical rule does little to confine discretion if no standard guides the decision of whether a marking matches. Instead, the opinion debated whether the CMS method should be considered objective and confused that question with how widely the method is used.

The relevant inquiry is not whether a method is subjective or objective. For a predominantly subjective method, the question is whether standards for making subjective judgments will produce more accurate and more reliable (repeatable and reproducible) decisions and how much more accurate and reliable they will be.

E. General Acceptance

Finally, the court found "widespread acceptance in the scientific community." But the basis for this conclusion was flimsy. It consisted of statements from other courts like "the AFTE method ... is 'widely accepted among examiners as reliable'" and "[t]his Daubert factor is designed to prohibit techniques that have 'only minimal support' within the relevant community." Apparently, the court regarded the relevant community as confined to examiners. Judge Hicks wrote that
it is unclear if the PCAST Report would even constitute criticism from the "relevant community" because the committee behind the report did not include any members of the forensic ballistics community ... . The acceptance factor therefore weighs in favor of admitting Johnson's testimony.
If courts insulate forensic-science service providers from the critical scrutiny of outside scientists, how can they legitimately use the general-acceptance criterion to help ascertain whether examiners are presenting "scientific knowledge" à la Daubert or something else?

  1. No. 3:18-cr-00049-LRH-CBC, 2019 WL 2150938 (D. Nev. May 16, 2019).
  2. 509 U.S. 579 (1993).
  3. For a discussion of a case involving inaccurate testimony from the same laboratory that caught the attention of the Supreme Court, see David H. Kaye, The Interpretation of DNA Evidence: A Case Study in Probabilities, National Academies of Sciences, Engineering, and Medicine, Science Policy Decision-making Educational Modules (2016); McDaniel v. Brown: Prosecutorial and Expert Misstatements of Probabilities Do Not Justify Postconviction Relief — At Least Not Here and Not Now, Forensic Sci., Stat. & L., July 7, 2014.
  4. See David H. Kaye, Firearm-Mark Evidence: Looking Back and Looking Ahead, 68 Case W. Res. L. Rev. 723, 724-25 (2018). The court relied on the article's explication of more modern case law.
  5. The U.S. District Court for the District of Colorado excluded toolmark conclusions in the prosecutions for the bombing of the federal office building in Oklahoma City. The toolmarks there came from a screwdriver. David H. Kaye et al., The New Wigmore, A Treatise on Evidence: Expert Evidence 686-87 (2d ed. 2011).
  6. The court was aware of an earlier report from a third national panel of experts raising doubts about the AFTE method, but it did not cite or discuss that report's remarks. Although the 2008 National Academies report on the feasibility of establishing a ballistic imaging database only considered the forensic toolmark analysis of firearms in passing, it gave the practice no compliments. Kaye, supra note 4, at 729-32.
  7. Ted Robert Hunt, Scientific Validity and Error Rates: A Short Response to the PCAST Report, 86 Fordham L. Rev. Online Art. 14 (2017).
  8. Quoting United States v. Ashburn, 88 F.Supp.3d 239, 245 (E.D.N.Y. 2015).
  9. Citing United States v. Otero, 849 F.Supp.2d 425, 432–33 (D.N.J. 2012), for "numerous journals [sic] articles and studies exploring the AFTE method."
  10. Philipp Frank, Einstein's Philosophy of Science, Reviews of Modern Physics (1949).
MODIFIED: 7 July 2019 9:10 EST