Saturday, March 26, 2016

NIST Distances Itself from the First OSAC-approved Forensic Science Standard

On January 11, 2016, a group created by the federal government to develop better standards for forensic science approved -- without changing a single word of substance -- a standard previously promulgated by a committee of ASTM International (formerly the American Society for Testing and Materials). The federally mandated body that showcased this standard is the Organization of Scientific Area Committees for Forensic Science (OSAC). It is supported, administratively and financially, by the Department of Commerce's highly respected National Institute of Standards and Technology (NIST). 1/

The approved standard has the ponderous name of ASTM E2329−14 Standard Practice for Identification of Seized Drugs. To my eye, it looks like an odd choice for the first (and thus far, only) entry in the OSAC Registry of Approved Standards. Why so?

For one thing, this one standard itself seems to approve of nine other ASTM standards -- none of which have been vetted by OSAC. Second, OSAC's Forensic Science Standards Board (FSSB) approved the standard over objections from two out of the three OSAC "resource committees" -- its Legal Resource Committee and its Human Factors Committee. 2/ Third, the standard permits definitive conclusions based on standardless, subjective assessments of botanical specimens. Fourth, without discussing or citing any studies of error probabilities for the methods involved, the standard states or suggests that false-positive errors will not occur. Thus, an earlier posting tartly contrasted some of the language in the Standard to the admonition in the 2009 report of the National Research Council Committee on Identifying the Needs of the Forensic Science Community that "[a]ll results for every forensic science method should indicate the uncertainty in the measurements that are made, and studies must be conducted that enable the estimation of those values."

On March 17, 2016, more than two months after the NIST-created OSAC adopted this first standard, NIST issued a public statement disavowing the standard as written because "concerns have been raised that some of the language in the standard is not scientifically rigorous." 3/ Like the National Research Council, NIST appreciates that "no measurement, qualitative or quantitative, should be characterized as without the risk of error or uncertainty." 4/ The statement adds that "NIST and the FSSB have independently asked that ASTM review the language."

The FSSB's action is, in a way, quite puzzling. Why would the FSSB want the organization that already wrote and approved the standard that the FSSB reviewed and adopted as a registry-ready, gold standard to "review" the FSSB-approved standard? It is not as if some new scientific research suddenly undermined the standard, requiring it to be revised. And even if new information had surfaced after the FSSB voted, the appropriate response would have been to take it down until the issue could be resolved.

Moreover, why would NIST and the FSSB "independently" ask for ASTM review when the subcommittee that wanted the standard on the registry already had promised to secure revisions through ASTM? The record before the FSSB included the following response to criticisms filed by the Legal Resource Committee:
"The Seized Drug subcommittee intends to clarify the quoted language pertaining to uncertainty and error during the next ASTM revision of this document." E2329-14 Seized Drugs Response to LRC Comments FINAL.pdf (277K) SAC Chemistry/Instrument Analysis, Jan. 11, 2016
Does the fact that NIST and the FSSB have added their voices to that of the OSAC Subcommittee on Seized Drugs mean that revisions that clearly should have been made before posting a standard to the repository will occur any sooner? And what can justify leaving a standard that admittedly needs "clarification" on the registry pending the requisite rewriting? Cannot laboratories continue to use the ASTM standard to guide them just as they did before the OSAC registry existed?

Whatever the answers to these questions may be, NIST's reservations about the first OSAC standard, although not spelled out in full, were the subject of questions at a recent meeting of the National Commission on Forensic Science. On March 21, 2016, Commissioner Marc LeBeau asked presenters from NIST whether NIST planned to post statements of agreement as well as disagreement for every future OSAC-approved standard. I cannot locate a transcript or videotape of the meeting, but my recollection is that the answer was essentially "no."

No doubt, NIST hopes that the kerfuffle over ASTM E2329−14 is a one-off, but the apparent inclination of OSAC subcommittees to try to import unimproved ASTM standards into the registry does not bode well. The latest example is ASTM E2388-11 Standard Guide for Minimum Training Requirements for Forensic Document Examiners. It is up for consideration as an OSAC standard (and for public comment during the next couple of weeks) even though OSAC has no approved standard on what the document examiners are expected to do once they are trained.

Disclosure and disclaimer: I am a member of the OSAC Legal Resource Committee. The information and views presented here do not represent those of, and are not necessarily shared by, NIST, OSAC, any unit within these organizations, or any other organization or individuals.

Notes
  1. See OSAC Subcommittee and Scientific Area Committee (SAC) Chairs, https://rticqpub1.connectsolutions.com/content/connect/c1/7/en/events/event/shared/1187757659/speaker_info.html?sco-id=1187765255 ("OSAC is part of an initiative by NIST and the Department of Justice to strengthen forensic science in the United States. The organization is a collaborative body of more than 500 forensic science practitioners and other experts who represent local, state, and federal agencies; academia; and industry. NIST has established OSAC to support the development and promulgation of forensic science consensus documentary standards and guidelines, and to ensure that a sufficient scientific basis exists for each discipline.").
  2. The third resource committee, the Quality Infrastructure Committee (QIC), does not seem to comment on the substance of proposed standards.
  3. NIST Statement on ASTM Standard E2329-14, Mar. 17, 2016, http://www.nist.gov/forensics/nist-statement-on-astm-e2329-14.cfm
  4. That said, the NIST statement cautions that "It is important to note that NIST is not contesting results obtained from seized evidence using the standard." Id.

Monday, March 7, 2016

Hot Paint: Another ASTM Standard (E2937-13) that Needs More Work

A second standard for comparing samples of paint under review for the OSAC Registry is ASTM E2937-13, on "Infrared Spectroscopy in Forensic Paint Examinations." It raises several of the issues previously noted in the broader ASTM E1610-14 "Standard Guide for Forensic Paint Analysis and Comparison."

The standard for IR spectroscopy presupposes that the goal is “to determine whether any significant differences exist between the known and questioned samples,” where a “significant difference” is “a difference between two samples that indicates that the two samples do not have a common origin.” The criminalist then is expected to declare whether “[s]pectra are dissimilar,” “indistinguishable,” or “inconclusive.”

Although categorical judgments have the benefit of simplicity and familiarity, most literature on forensic inference now maintains that analysts should present statements about the weight of the evidence rather than categorical conclusions about source hypotheses. By considering and presenting the degree to which the observations support one hypothesis as compared to another without dictating the conclusion that must be drawn, the analyst supplies the most information. It is not clear whether the standard rejects this view and is intended to preclude experts from using a weight-of-evidence approach to the comparison process.

The categorical approach that the standard adopts is notable for its vagueness. On its face, the definition of “significant difference” permits analysts to declare that differences with almost no discriminating power are so significant that two samples “do not have a common origin.” This lack of guidance arises because any difference that occurs more frequently among two samples with different origins than among two same-source samples “indicates” different origins and hence is “significant.” For example, a difference that arises 1,000 times more often for different-source samples is indicative of different sources. But so is a difference that arises only 10% more often for different-source samples. Both “indicate” non-association. They differ only in the magnitude of the measure of non-association. The 1,000-times-more-often quantity is a strong indication of non-association, whereas the 10% figure is a weak indication. But in both cases, the differences indicate (to some degree) non-association relative to association.
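The contrast can be made concrete with a small calculation. The sketch below is illustrative only: the function name and the probabilities are hypothetical, chosen to reproduce the 1,000-fold and 10% figures, and are not taken from the standard or from any validation study.

```python
def difference_ratio(p_diff_source: float, p_same_source: float) -> float:
    """Ratio of how often a difference is seen in different-source pairs
    to how often it is seen in same-source pairs."""
    if p_same_source <= 0:
        raise ValueError("p_same_source must be positive")
    return p_diff_source / p_same_source


# Hypothetical probabilities of observing a given spectral difference.
strong = difference_ratio(p_diff_source=0.5, p_same_source=0.0005)  # 1,000 times more often
weak = difference_ratio(p_diff_source=0.55, p_same_source=0.5)      # 10% more often

for label, ratio in (("strong", strong), ("weak", weak)):
    # Any ratio above 1 "indicates" different sources in the literal sense.
    print(f"{label}: ratio = {ratio:,.1f}; indicates different sources: {ratio > 1}")
```

Both ratios exceed 1, so both differences would count as “significant” under a literal reading of the definition, even though they differ by three orders of magnitude in strength.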

To avoid this looseness, one might try to read “indicates” as connoting “strongly indicates” or “establishes,” but there is no reason to promulgate an ambiguous standard that requires readers in the fields of forensic science and law to struggle to discern and supply its intended meaning. And, if “establishes” is the intended meaning, then more guidance is needed to help analysts determine, on the basis of objective data about the range of differences seen in same-source and in different-source samples, when a difference is “significant” in the sense of discriminating between the former and the latter types of samples. That is, the standard should supply a validated decision rule; it should present the conditional error probabilities of this decision rule; and it should refer specifically to the studies that have validated it. These features of standards are not absolute requirements for admitting scientific evidence, but they would go far toward assuring courts and counsel that the criteria of “known or potential rate of error” and “standards controlling the technique's operation” enumerated in Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 594 (1993), militate in favor of admissibility (and persuasiveness of the testimony if a case goes to trial).

Section 10.6.1.1 of the ASTM Standard does not begin to do this. It offers an unbounded “rule of thumb” — “that the positions of corresponding peaks in two or more spectra be within ±5 cm^-1. For sharp absorption peaks one should use tighter constraints. One should critically scrutinize the spectra being compared if corresponding peaks vary by more than 5 cm^-1. Replicate collected spectra may be necessary to determine reproducibility of absorption position.” What is the basis for “critical scrutiny”? How many replicates are necessary? When are they necessary? What is the accuracy of examiners who follow the open-ended “rule of thumb”?
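To see how much the rule of thumb leaves open, consider a literal coding of it. This is a sketch under assumptions: the function, the peak lists, and the pairing of peaks by index are hypothetical and are not part of E2937-13.

```python
def flag_peak_shifts(known, questioned, tolerance=5.0):
    """Pair corresponding peak positions (cm^-1) by index and flag any pair
    whose positions differ by more than `tolerance`."""
    if len(known) != len(questioned):
        raise ValueError("peak lists must have the same length to be paired by index")
    results = []
    for k, q in zip(known, questioned):
        shift = abs(k - q)
        results.append((k, q, shift, shift > tolerance))
    return results


# Hypothetical peak positions in cm^-1 for a known and a questioned paint sample.
known_peaks = [1730.0, 1450.0, 1160.0]
questioned_peaks = [1732.0, 1457.0, 1161.0]

for k, q, shift, flagged in flag_peak_shifts(known_peaks, questioned_peaks):
    status = "scrutinize" if flagged else "within tolerance"
    print(f"{k:.0f} vs {q:.0f}: shift {shift:.1f} cm^-1 -> {status}")
```

The code can flag the 7 cm^-1 shift, but nothing in the rule says what follows from the flag: whether the difference is “significant,” how much tighter the tolerance should be for sharp peaks, or how many replicate spectra would settle the matter.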

Given the lack of standards for deciding what is “significant,” the definitions of “dissimilar,” “indistinguishable,” and “inconclusive” are indeterminate. They read:
  • 10.7.1 Spectra are dissimilar if they contain one or more significant differences.
  • 10.7.2 Spectra are indistinguishable if they contain no significant differences.
  • 10.7.3 A spectral comparison is inconclusive if sample size or condition precludes a decision as to whether differences are significant.
Inasmuch as any difference can be considered “significant,” the criminalist has no basis in the standard to declare an inclusion, an exclusion, or an inconclusive outcome. This deprives the standard of the legally desirable status under Daubert of “standards controlling the technique's operation.”

Thursday, March 3, 2016

What Is a "Conservative" Method in Forensic Statistics?

Statistical hypothesis testing involves testing a "null hypothesis" against an "alternative hypothesis." If data are not well outside the range of what would be expected if the null hypothesis is true, then that hypothesis cannot be rejected in favor of the specified alternative. It is usually thought that the more demanding the statistical test, the more "conservative" it is. For example, if a researcher claims to have discovered a new treatment that cures cancer, the null hypothesis is that the new therapy does not work. Sticking with this belief retains the status quo (of not using the novel treatment). In this example, the "conservative" thing to do is to insist on a small p-value (results that have a small probability of arising if the treatment is ineffective) before accepting the alternative.
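A toy calculation shows what insisting on a small p-value means. The numbers are hypothetical, and the exact one-sided binomial test is just one convenient choice of test: suppose 30% of patients recover without the new therapy (the null hypothesis), and 12 of 20 treated patients recover.

```python
from math import comb


def binomial_p_value(successes: int, n: int, p_null: float) -> float:
    """One-sided exact p-value: probability of at least `successes`
    recoveries in `n` patients if the recovery rate is `p_null`."""
    return sum(
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(successes, n + 1)
    )


# Hypothetical trial: 12 recoveries among 20 treated patients.
p = binomial_p_value(successes=12, n=20, p_null=0.30)

for alpha in (0.05, 0.001):
    decision = "reject" if p < alpha else "retain"
    print(f"p = {p:.4f}; at alpha = {alpha}: {decision} the null hypothesis")
```

The same data lead to rejection under the lenient threshold and retention under the demanding one, which is the sense in which the stricter test is the more "conservative."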

Does this carry over to forensic science? Is it conservative to retain the null hypothesis unless there is strong evidence against it? Consider the following excerpt from an FBI publication on forensic glass comparisons 1/:
A conservative threshold will differentiate all samples from different sources but may also indicate that a difference exists in specimens that are actually from the same source. A high threshold for differentiation may not be able to differentiate all specimens from sources that are genuinely different but will not differentiate specimens that are actually from the same source.
The "conservative" scientific stance therefore tends to support or preserve the prosecution's case. The state can produce a witness who can testify "conservatively" to finding that the broken window at the crime scene is chemically indistinguishable from the bit of glass removed from the defendant's sweatshirt.

On the other hand, a committee of the National Academy of Sciences that studied forensic DNA testing defined "conservative" in terms of impact on a defendant's claim of innocence 2/:
Conservative—favoring the defendant. A conservative estimate is deliberately chosen to be more favorable to the defendant than the best (unbiased) estimate would be.
Plainly, the FBI document's use of "conservative" is difficult to square with the NAS committee's definition of the word. The FBI document treats the hypothesis supporting the prosecution's case as the status quo that should be retained unless there is strong evidence to the contrary.

This use of the prosecution's hypothesis that the broken window is the source of the incriminating fragment as the null hypothesis is not necessarily wrong, but it engenders confusion. The confusion can be dispelled if the presentation of the findings includes a statement of how rare or common "indistinguishable" windows are in a relevant population. Evaluating the data about the glass thus would have two steps. In step 1, the data are classified as "indistinguishable" (or not). In step 2, if the samples are indistinguishable, a random match probability is provided to indicate the probative value of that finding with respect to the hypothesis that the glass originated from the broken window.
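The two steps might be sketched as follows. Everything here is an illustrative assumption: the refractive-index values, the match tolerance, and the small reference survey are invented, and a simple relative frequency stands in for whatever estimate of the random match probability a laboratory would actually use.

```python
from typing import Sequence


def indistinguishable(a: float, b: float, tolerance: float) -> bool:
    """Step 1: treat two refractive indices as indistinguishable if they
    differ by no more than `tolerance` (a hypothetical match criterion)."""
    return abs(a - b) <= tolerance


def match_frequency(target: float, reference: Sequence[float], tolerance: float) -> float:
    """Step 2: fraction of reference glass samples that would also be
    declared indistinguishable from the target fragment."""
    matches = sum(indistinguishable(target, r, tolerance) for r in reference)
    return matches / len(reference)


# Hypothetical data: refractive indices of the broken window, the fragment
# from the sweatshirt, and a survey of windows from other sources.
broken_window = 1.5184
fragment = 1.5185
survey = [1.5160, 1.5172, 1.5184, 1.5191, 1.5202,
          1.5215, 1.5186, 1.5230, 1.5148, 1.5177]
TOLERANCE = 0.0002

if indistinguishable(broken_window, fragment, TOLERANCE):
    freq = match_frequency(fragment, survey, TOLERANCE)
    print(f"indistinguishable; about {freq:.0%} of surveyed windows would also match")
else:
    print("distinguishable; the broken window is excluded as the source")
```

The smaller the frequency reported in step 2, the more the finding of "indistinguishable" in step 1 supports the hypothesis that the fragment came from the broken window.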

Of course, if one could articulate the probability of the data given the hypothesis that the source is the broken window versus the probability of the data given that the glass associated with the defendant had a different origin, this two-step process would not be needed. The expert could present these probabilities.

References
  1. Maureen C. Bottrell, Forensic Glass Comparison: Background Information Used in Data Interpretation, 11 Forensic Sci. Communications No. 2 (2009)
  2. National Research Council, Committee on The Evaluation of Forensic DNA Evidence: An Update, The Evaluation of Forensic DNA Evidence 215 (1996)

Tuesday, March 1, 2016

"Reasonable Scientific Certainty," the NCFS, the Law of the Courtroom," and that Pesky Passive Voice

In a posting last week about a proposed National Commission on Forensic Science recommendation for the Attorney General to take the position that an expert witness "is not required" to utter words like "reasonable scientific certainty" as a condition for the admission of the testimony, I discussed a set of cases cited in a public comment from a Commissioner. The author of the public comment wrote me that I was mistaken in at least one respect. He does not maintain that "Recommendation #1 would require the Department of Justice to argue for overturning existing law that 'seem[s] to require' these phrases in some forensic-science identification fields" and that attributing this view to him
takes my comment out of context and mischaracterizes it. My comments relating to the cited federal district court opinions stated that 'a number of federal judges apparently endorse -- and some still seem to require -- the use of these phrases in their courtrooms ... .' I didn't argue that 'existing law' 'seems to require' these phrases in some forensic-science identification fields. Instead, I said that 'a number of federal judges ... seem to require' the use of these phrases. I think that it's fair to say that 'existing law' on this topic and what a given federal district court judge may believe existing law to be, are not necessarily the same thing. (In my experience they can be far from the same thing). Case law does not always (and often does not) translate into the law of the courtroom -- the way trial judges interpret and apply federal rules and case law in trial practice. That was the point I was attempting to make -- not that 'Recommendation #1 would require the Department of Justice to argue for overturning existing law.' That is clearly not the case. However, DOJ attorneys may nevertheless be required to utter those 'magic words' in a given courtroom and in a given case.
It certainly is true that some judges will want the proponents of the evidence to swear to its "reasonable scientific certainty." And, unless a higher court has explicitly ruled that this phraseology should not be used  -- as some have and as is appropriate for the reasons stated in a separate views document that the Commission is developing -- the trial judge may believe that he or she has the legal prerogative to make it "the law of the courtroom." The Commission recommendation recognizes as much, for it explicitly allows the prosecutor to put on such testimony when directed to do so by the judge in the case.

But does this judge's edict really have the status of "law"? Under the crude, legal realist perspective that the law is whatever the court says it is, I suppose one would have to call it "law." However, it seems clearer to call it a judicial practice that is ripe for change (without having to amend any rules of evidence or change any binding caselaw). To encourage this change, and to the extent that the passive voice in Recommendation 1(b) introduces ambiguity, it would be better for the Commission simply to say that "The Attorney General should direct all attorneys appearing on behalf of the Department of Justice ... (b) to assert the legal position that such terminology should not be required."