Thursday, June 12, 2014

More on the Mistakes in “Forensic Science Isn’t Science”

In "Flawed Journalism on Flawed Forensics in Slate Magazine" I referred to "other inaccuracies in the article" by Mark Joseph Stern entitled “Forensic Science Isn’t Science.” Here are some excerpts from the article with some thoughts.

Behind the myriad technical defects of modern forensics lie two extremely basic scientific problems. The first is a pretty clear case of cognitive bias: A startling number of forensics analysts are told by prosecutors what they think the result of any given test will be. This isn’t mere prosecutorial mischief; analysts often ask for as much information about the case as possible—including the identity of the suspect—claiming it helps them know what to look for. Even the most upright analyst is liable to be subconsciously swayed when she already has a conclusion in mind. Yet few forensics labs follow the typical blind experiment model to eliminate bias. Instead, they reenact a small-scale version of Inception, in which analysts are unconsciously convinced of their conclusion before their experiment even begins.

Few psychologists would propose that expectancy effects will cause errors in interpretation in every experiment. The role of cognitive bias in science and inference generally is quite complicated. There is no typical “blind experiment model.” The physicists who announced the discovery of the Higgs boson or those who believed they detected the marks of gravitational waves in the cosmic background radiation had no such model. In much of science, experimenters know what they are looking for; fortunately, some results are not ambiguous and not subject to subtle misinterpretation. There is the joke that if your experiment needs statistics, you should do a better experiment.

That said, when interpretations are more malleable—as is often the case in many forensic disciplines—various methods are available to minimize the chance of this source of error. One is “sequential unmasking,” which protects forensic analysts from unconscious (or conscious) bias that could lead them to misconstrue their data when exposed to information that they do not need to know. There is rarely, if ever, an excuse for not using methods like these. But their absence does not make a solid latent fingerprint match or a matching pair of clean, single-source electropherograms, for example, into small-scale versions of Inception.

Without a government agency overseeing the field, forensic analysts had no incentive to subject their tests to stricter scrutiny. Groups such as the Innocence Project have continually put pressure on the Department of Justice—which almost certainly should have supervised crime labs from the start—to regulate forensics. But until recently, no agency has been willing to wade into the decentralized mess that hundreds of labs across the country had unintentionally created.

When and where was the start of forensic science? Ancient China? Renaissance Europe? Should the U.S. Department of Justice have been supervising the Los Angeles Police Department when it founded the first crime laboratory in the United States in 1923? The DOJ has had enough trouble with the FBI laboratory, whose blunders led to reports from the DOJ’s Office of the Inspector General and which has turned to the National Research Council for advice on more than one occasion. The 2009 report of a committee of the National Research Council had much better ideas for improving the practice of forensic science in the United States. Its recommendation of a new agency entirely outside of the Department of Justice for setting standards and funding research, however, gained little political traction. The current National Commission on Forensic Science is a distorted, toothless, and temporary version of the idea.

In 2009, a National Academy of Sciences committee embarked on a long-overdue quest to study typical forensics analyses with an appropriate level of scientific scrutiny—and the results were deeply chilling.

The committee did not undertake a “scientific” study. It engaged in a policy-oriented review of the state of forensic science without applying any particularly scientific methods. (This is not a criticism of the committee. NRC committees generally collect and review relevant literature and views rather than undertake scientific research of their own.) This committee's quest did not begin in 2009. That is when it ended. Congress voted to fund the study in 2005.

Aside from DNA analysis, not a single forensic practice held up to rigorous inspection. The committee condemned common methods of fingerprint and hair analysis, questioning their accuracy, consistent application, and general validity. Bite-mark analysis—frequently employed in rape and murder cases, including capital cases—was subject to special scorn; the committee questioned whether bite marks could ever be used to positively identify a perpetrator. Ballistics and handwriting analysis, the committee noted, are also based on tenuous and largely untested science.

The report is far more nuanced (some might say conflicted) than this. Here are some excerpts:
"The chemical foundations for the analysis of controlled substances are sound, and there exists an adequate understanding of the uncertainties and potential errors. SWGDRUG has established a fairly complete set of recommended practices." P. 135.

"Historically, friction ridge analysis has served as a valuable tool, both to identify the guilty and to exclude the innocent. Because of the amount of detail available in friction ridges, it seems plausible that a careful comparison of two impressions can accurately discern whether or not they had a common source. Although there is limited information about the accuracy and reliability of friction ridge analyses, claims that these analyses have zero error rates are not scientifically plausible." P. 142.

"Toolmark and firearms analysis suffers from the same limitations discussed above for impression evidence. Because not enough is known about the variabilities among individual tools and guns, we are not able to specify how many points of similarity are necessary for a given level of confidence in the result. Sufficient studies have not been done to understand the reliability and repeatability of the methods. The committee agrees that class characteristics are helpful in narrowing the pool of tools that may have left a distinctive mark. Individual patterns from manufacture or from wear might, in some cases, be distinctive enough to suggest one particular source, but additional studies should be performed to make the process of individualization more precise and repeatable." P. 154.

"Forensic hair examiners generally recognize that various physical characteristics of hairs can be identified and are sufficiently different among individuals that they can be useful in including, or excluding, certain persons from the pool of possible sources of the hair. The results of analyses from hair comparisons typically are accepted as class associations; that is, a conclusion of a 'match' means only that the hair could have come from any person whose hair exhibited—within some levels of measurement uncertainties—the same microscopic characteristics, but it cannot uniquely identify one person. However, this information might be sufficiently useful to 'narrow the pool' by excluding certain persons as sources of the hair." P. 160.

"The scientific basis for handwriting comparisons needs to be strengthened. Recent studies have increased our understanding of the individuality and consistency of handwriting and computer studies and suggest that there may be a scientific basis for handwriting comparison, at least in the absence of intentional obfuscation or forgery. Although there has been only limited research to quantify the reliability and replicability of the practices used by trained document examiners, the committee agrees that there may be some value in handwriting analysis."

"Analysis of inks and paper, being based on well-understood chemistry, presumably rests on a firmer scientific foundation. However, the committee did not receive input on these fairly specialized methods and cannot offer a definitive view regarding the soundness of these methods or of their execution in practice." Pp. 166-67

"As is the case with fiber evidence, analysis of paints and coatings is based on a solid foundation of chemistry to enable class identification." P. 170

"The scientific foundations exist to support the analysis of explosions, because such analysis is based primarily on well-established chemistry." P. 172

"Despite the inherent weaknesses involved in bite mark comparison, it is reasonable to assume that the process can sometimes reliably exclude suspects. Although the methods of collection of bite mark evidence are relatively noncontroversial, there is considerable dispute about the value and reliability of the collected data for interpretation." P. 176

"Scientific studies support some aspects of bloodstain pattern analysis. One can tell, for example, if the blood spattered quickly or slowly, but some experts extrapolate far beyond what can be supported." P. 178
Hardly a ringing endorsement of all police lab techniques, but neither is the report an outright rejection of all or even most techniques now in use.

The report amounted to a searing condemnation of the current practice of forensics and an ominous warning that death row may be filled with innocents.

According to an NRC press release issued in February, 2009, “[t]he report offers no judgment about past convictions or pending cases, and it offers no view as to whether the courts should reassess cases that already have been tried.” Such language may be a compromise among the disparate committee members. But to derive the conclusion that death row is “filled with innocents” even partly from the actual contents of the report, one would have to consider the deficiencies identified in the system, the extent to which these deficiencies generated the evidence used in capital cases, and the other evidence in those cases. Other research is far more helpful in evaluating the prevalence of false convictions.

Given the flimsy foundation upon which the field of forensics is based, you might wonder why judges still allow it into the courtroom.

As the 2009 NRC Committee explained, there is no single "field of forensics." Rather,
"Wide variability exists across forensic science disciplines with regard to techniques, methodologies, reliability, error rates, reporting, underlying research, general acceptability, and the educational background of its practitioners. Some of the forensic science disciplines are laboratory based (e.g., nuclear and mitochondrial DNA analysis, toxicology, and drug analysis); others are based on expert interpretation of observed patterns (e.g., fingerprints, writing samples, toolmarks, bite marks, and specimens such as fibers, hair, and fire debris). Some methods result in class evidence and some in the identification of a specific individual—with the associated uncertainties. The level of scientific development and evaluation varies substantially among the forensic science disciplines." P. 182.
The courts have been lax in responding to overblown testimony in some fields and to those techniques that lack proof of their fundamental precepts.

In 1993, the Supreme Court announced a new test, dubbed the "Daubert standard," to help federal judges determine what scientific evidence is reliable enough to be introduced at trial. The Daubert standard ... wound up frustrating judges and scientists alike. As one dissenter griped, the new test essentially turned judges into "amateur scientists," forced to sift through competing theories to determine what is truly scientific and what is not.

Blaming the persistence of the admissibility of the most dubious forensic disciplines on Daubert is strange. Daubert's standard did not spring into existence fully formed, like Athena from the brow of Zeus. A similar standard was in place in a number of jurisdictions. As The New Wigmore: A Treatise on Evidence shows, the Court borrowed from these cases. A smaller point to note is that there were not one, but two partial dissenters (who concurred in the unanimous judgment). Chief Justice Rehnquist and Justice Stevens objected to the majority’s proffering "general observations" about scientific validity, and they did not complain about the ones that Mr. Stern points to as an explanation for the persistence of questionable forensic "science."

Even more puzzlingly, the new standards called for judges to ask "whether [the technique] has attracted widespread acceptance within a relevant scientific community"—which, as a frustrated federal judge pointed out, required judges to play referee between "vigorous and sincere disagreements" about "the very cutting edge of scientific research, where fact meets theory and certainty dissolves into probability."

That’s Chief Judge Alex Kozinski of the Ninth Circuit Court of Appeals writing on remand in Daubert itself. Judge Kozinski could not possibly be objecting to the Supreme Court's opinion on the ground that "widespread acceptance within a relevant scientific community" is an impenetrable standard. Quite the opposite. He applied that very standard in his previous opinion in the case and was bemoaning what he called the "brave new world" that the Court ushered in as it vacated his opinion. A recent survey of judges found that 96% of the (disappointingly small) fraction responding deemed general scientific acceptance to be helpful -- more than any other commonly used factor in judging the validity of scientific evidence.

American jurors today expect a constant parade of forensic evidence during trials. They also refuse to believe that this evidence might ever be faulty. Lawyers call this the CSI effect, after the popular procedural that portrays forensics as the ultimate truth in crime investigation. [¶] “Once a jury hears something scientific, there’s a kind of mythical infallibility to it,” Peter Neufeld, a co-founder of the Innocence Project, told me. “That’s the association when a person in white lab coat takes the witness stand. By that point—once the jury’s heard it—it’s too late to convince them that maybe the science isn’t so infallible.”

Refusal to question scientific evidence is not what most lawyers call the CSI effect. In any event, jury research does not support the idea that jurors inevitably reject attacks on scientific testimony or that the testimony of the first witness in a figurative white coat is unshakeable.

If judges can’t be trusted to keep spurious forensic analysis out of the courtroom, and juries can’t be trusted to disregard it, then how are we going to keep the next Earl Washington off death row? One option would be to permit anybody convicted on the basis of biological evidence to subject that evidence to DNA analysis—which is, after all, the one form of forensics that scientists agree actually works. But in 2009, the Supreme Court ruled that convicts had no such constitutional right, even where they can show a reasonable probability that DNA analysis would prove their innocence. (The ruling was 5–4, with the usual suspects lining up against convicts’ rights.)

This option, which applies to a limited set of cases (and hence is no general solution), is not foreclosed by District Attorney for the Third Judicial District v. Osborne, 129 S.Ct. 2308 (2009). If there is a minimally plausible claim of actual innocence after conviction, let’s allow such testing by statute. Of course, it would be better to thoroughly test potential DNA evidence (when it is relevant) before trial—something that Osborne's trial counsel declined to request, fearing that it would only strengthen the prosecution’s case.

Until lab technicians follow some uniform guidelines and abandon the dubious techniques glamorized on shows like CSI, forensic science will barely qualify as a science at all. As a recent investigation by Chemical & Engineering News revealed, little progress has been made in the five years since the National Academy of Sciences condemned modern forensic techniques.

Again, as the NRC committee stressed, "forensic science" is not a single, uniform discipline. Since the report, funding has increased, some guidelines have been revised, and significant research has appeared in some fields. Still, the pace resembles that of global warming. It is coming, notwithstanding resistance described in earlier years on this blog.

As for the "investigation by Chemical & Engineering News," the latest I saw from that publication was an article in a May 12, 2014 issue with a map showing selected instances of examiner misconduct dating back to 1993 and indicating that only five states require laboratory accreditation. No effort was made to ascertain how many labs are still operating without accreditation. With no apparent literature review, the article simply asserted that
[I]n the years since [2009], little has been done to shore up the discipline’s scientific base or to make sure that its methods don’t result in wrongful convictions. Quality standards for forensic laboratories remain inconsistent. And funding to implement improvements is scarce. [¶] While politicians and government workers debate changes that could help, fraudsters like forensic chemist Annie Dookhan keep operating in the system. No reform could stop a criminal intent on doing wrong, but a better system might have shown warning signs sooner. And it likely would have prevented some of the larger, systemic problems at the Massachusetts forensics lab where Dookhan worked.
I must be missing the real investigation that the C&E News writers conducted.


  1. Thank you for providing a reasoned, fair and supported response to the Slate article. It is interesting and a bit depressing to me that the field of forensic science seems to get polar reactions: it is either all good or all bad, and your article makes clear that a better way of looking at the field, a realistic way of looking at the field, is that there are strengths and weaknesses in different forensic techniques. I was a visiting fellow at NIJ (1995-98) and convinced NIJ to provide funding for research in handwriting identification and fingerprint identification: at the time the general reaction was that these forensic techniques needed no research, everything was fine in forensic-science land. Now, the general reaction to forensic science is that it is all bad. These polarizing trends are not useful, and I especially appreciate your ability to de-polarize the issue of how forensic science really operates and what is needed. Thank you!

  2. The opposite CSI effect is something that has also been discussed. "An alternative hypothesis, which runs in the opposite direction, is that CSI has fooled the public into thinking that forensic science is far more effective and accurate than it actually is. If true, jurors may be likely to readily accept whatever conclusions forensic science witnesses point them to." "THE CSI EFFECT: POPULAR FICTION ABOUT FORENSIC SCIENCE AFFECTS THE PUBLIC’S EXPECTATIONS ABOUT REAL FORENSIC SCIENCE" by N.J. Schweitzer and Michael J. Sak