Wednesday, December 26, 2018

"Our Worst Fears Have Been Realized" -- Forensic "Evidence, Science, and Reason in an Era of 'Post-truth' Politics" (Part 2)

This posting continues the previous summary of an October 2017 panel discussion of forensic science, with some annotations in the form of footnotes. 1/ It does not include the audience question-and-answer period because that part of the recording, although posted for a time, is no longer available.

PROFESSOR CHARLES FRIED, who represented Merrell Dow Pharmaceuticals in Daubert v. Merrell Dow Pharmaceuticals, described his role as “easy.” In the context of “civil trials ... having to do with whether a particular chemical, which was usually a therapeutic chemical, had ... a capacity to cause a particular untoward event,” the issue “was studied regularly [thanks to] the Food and Drug Administration [which] usually required enormously rigorous, randomized, double-blind trials.” However, in contrast to the “easy domain ... of causation in areas where there were really quite regular methods for testing ... and institutions that did it, .... God help us when we get to fingerprints, bullet lead, bite marks, hair samples. So there is a real problem here, and I have no sympathy with the current Department of Justice.”

Coming from the man who was the Department’s Solicitor General during the Reagan administration, this sentiment is chilling. But it is mild in comparison to DR. ERIC LANDER’s suggestion that the Justice Department has yet to embrace the scientific revolution that began in the 15th or 16th century. In his view,
[W]hat Judge Edwards did [with the NAS committee that he co-chaired] was write a spectacular report that pointed out all the scientific problems. [T]he Department of Justice dismissed it because they said it was about how to make forensic science better. [I]f it wasn’t about admissibility, they didn't really care, because if you ... could talk to a jury, well, you didn’t have to make it better.

So PCAST took the next step. We wrote a report that was about Rule 702. [W]e really didn’t care about anything else. We weren’t writing about how to improve forensic evidence in general. ... We made a specific recommendation to [the] standing committee on the Federal Rules of Evidence that they revise the advisory note around Rule 702, essentially that 702 needs fixing. This morning, they met. ... I spent four and one-half hours with said committee that convened in response to the PCAST report. [2/] ... Ted Hunt was there, and we had a grand old time. So I’m still full of vim and vigor about this thing here. ...

[M]ost of these [feature-comparison] methods weren’t developed to be science. They were developed to be rough heuristics for an investigation. [T]he courts have accepted this kind of evidence despite the lack of any empirical testing. ...

Fingerprints. [In] 1984, the Department of Justice in an official document ... which it disavowed last year, said ... that fingerprints were infallible — papally infallible. [3/] In 2009, the former head of the FBI crime lab testified [that] the error rate was less than one in 11 million. Why? Because the FBI had done 11 million fingerprint cases and he was not aware of an error. ... This is true. It is cited in the PCAST report. [4/] Since the time of Judge Edwards’ 2009 report, the FBI, God bless them, did a real empirical test of fingerprints. And now we have a measurement of an error rate. [O]ne in 600 is their best guess. Could be, with error bars, as high as one in 300. That’s great. We now actually know that it ain’t perfect. It’s not terrible, and you can tell that to a jury. ...

Firearms. They did a whole bunch of fish-in-a-barrel tests. They gave you a bag of bullets. They gave you another bag of bullets. They said every bullet in here has a match in here. Figure out who matches. They make very few mistakes when they know that the right answer is there on the multiple-choice test. If the multiple-choice test includes “none of the above,” you might not do as well. ... They did multiple choices without “none of the above,” and they found an error rate of one in 5,000. Then in 2014, the Department of Defense commissioned a study, and they found, well, one in 50—kind of like one in 5,000—just a hundredfold less.

Hair analysis. They did an amazing study in 1974, which the Justice Department cited last year as the foundational proof of the validity of hair analysis in which they found the error rate was less than 1 in 40,000. That study involved giving people hairs and asking if they thought they matched, and by the way, telling the examiners each hair you’re considering comes from a different person. As a matter of fact, it’s shocking they made any errors at all. When the FBI actually used DNA analysis on hairs that had been said by examiners to match, they found one time in nine they got it wrong.

... Bite marks. The ... field said one in six trillion was the error rate. When you give them four choices of people, they still get it wrong one time in six in that closed set—a remarkable off by one in a trillionfold. ...

Footwear matches is declared in the seminal textbook in the field to have an error rate [of] about one in 683 billion. I can’t tell you how far off that is because there has never, ever been an empirical test of footwear because they know they can calculate that it must be that accurate.

So our radical position — and I say “radical” because Mr. Hunt this morning described PCAST’s position as radical [5/] — was that a forensic feature-comparison method can be considered reliable only if its accuracy has been empirically tested under conditions appropriate to its intended use and found to have accuracy appropriate to the intended use. That’s our radical position, which I think is about sort of the foundation of the scientific revolution — that empirical evidence is necessary. This would have been controversial in ancient Greece and other places, but in the last four hundred years, this hasn’t been so controversial.

But in the forensic community they doubt it. They argue other things can substitute for it. It’s enough if the method is based on science, like based on a true story. The examiners [maintain that] “[w]e haven’t got reliability data, but [we] have good professional practices, training, certification, accreditation programs, professional organizations, best practices manuals, extensive experience using it, and published papers in peer-reviewed journals.” And PCAST noted in passing that the same is true about psychics. If you go online, all of those indicia apply to the field of psychics. There are peer-reviewed journals for psychics, accreditation standards, etc. There’s even a subdiscipline of forensic psychics, by the way. And so we said, those are all good. I don’t want you to not have those things, but they can never establish reliability.

So it’s flamingly obvious, but some people disagree. And of 20 speakers this morning, only three quibbled with the need for empiricism. They all were employed by the Department of Justice. They were Ted Hunt and two colleagues. I asked this question, yes-no, and I would say it broke down 17-3 on "Is empirical evidence actually necessary?" And 17 people are post the scientific revolution, and three are, well, the jury is out on the scientific revolution.

In any case, I’ll just add that the Department of Justice, as you might imagine, hated this report. They hated it. We ... reported to the President, and this was done at the request of the President. We then took it to the Department of Justice, as we do with all agencies, and let them know what we were thinking, and they had a fit. They had a fit because, they said, “Do you realize this could jeopardize existing cases and past convictions?” And they said, “Could you grant us grace, like three or four years to fix all this before we have to live by these rules?” We ... concluded that as scientists, it was not within our purview to grant grace, that others might be able to do that. All we could do was speak the facts. And so we did. And they hated it, and they attempted very hard to kill the report. We did battle for about four months. The Justice Department sent over 300 comments, and we dutifully answered every one and made small changes in response to them. And at the end they still opposed the release of the report. And I will merely note that in [his] first inaugural [address,] President Obama said we will restore science to its rightful place. The White House was faced with a disagreement between its science advisors and the Department of Justice as to whether the report should be released. The White House called the Department of Justice and said, “You’re going to have to wrap your head around the idea this report’s coming out.” And it came out.

One of our recommendations, as I said, was that the federal Judicial Conference should take on this question of, Does Rule 702 need a change, either as to the rule or to the advisory note? There was a robust discussion. There was no agreement as to whether the rule itself should be changed and how—there was a broad range of ideas about that—or whether the advisory note should be changed. We’d recommended just changing the advisory note, but we were told if you don’t change the rule, you can’t change the advisory note. So I suggested put a comma in somewhere and change the advisory note. And they agreed that would trigger it, that would be fine. And we’ll see where it goes. [S]cience isn’t rolling over yet. [I]n the end, science does win out, and we’re just going to have to be very, very stubborn.
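The error-bar arithmetic behind Dr. Lander's fingerprint figures can be sketched with a standard binomial (Wilson score) confidence interval. The counts below are hypothetical, chosen only so that the point estimate and upper bound land near the quoted "one in 600" and "one in 300"; they are not the FBI study's actual tallies, and `wilson_upper` is simply an illustrative helper.

```python
import math

# Hypothetical counts, chosen only to roughly reproduce the rates quoted
# above (best guess ~1 in 600, upper error bar ~1 in 300); these are NOT
# the FBI black-box study's actual figures.
false_positives = 6
comparisons = 3628

def wilson_upper(k, n, z=1.96):
    """Upper end of the 95% Wilson score interval for a binomial proportion."""
    p = k / n
    center = p + z * z / (2 * n)
    spread = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (center + spread) / (1 + z * z / n)

point = false_positives / comparisons
upper = wilson_upper(false_positives, comparisons)
print(f"point estimate: 1 in {1 / point:.0f}")   # about 1 in 600
print(f"95% upper bound: 1 in {1 / upper:.0f}")  # roughly 1 in 300
```

The point is the one Lander makes: with only a handful of observed errors, the plausible error rate spans roughly a factor of two, which is why a measured rate must be reported with its interval rather than as a bare number.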
JUDGE EDWARDS added that
You should all be wondering why the courts haven’t been able to step in and turn us in the right direction, since we’re about justice, supposedly. ... First of all, the people who are testifying ... often don’t know what they don’t know ... . We often have a defense counsel who was not up to the task. We have judges [who] don’t want to move away from precedent unless there’s compelling reason, and there are a lot of cases out there saying that these disciplines are acceptable. And what the judges have done is to accept that precedent and not even allow Daubert hearings in the criminal arena, which is really very sad.

The other thing ... is ... we don’t know how to quantify variability because they haven’t been studied. ... In most of these areas they have not done the studies to quantify the variability, the error rates, et cetera. And the judges get this. So when the judges are told—and there are some judges who are willing to listen carefully—are told you should at least limit the testimony of the expert so they don’t overstate and say “match!” ... [w]hat do you tell their expert they can say and not say? ... If you say to the expert, “Don’t overstate, don’t claim ‘matched,’ claim something less,” the prosecutor is up in arms because ... if you show any uncertainty coming out of the mouth of your expert, you may not meet [the proof-beyond-a-reasonable-doubt] standard. So you have no support coming from the prosecution, and we don’t yet know ... what [to] tell the experts [about] the limits of [their] testimony. [W]e don’t have any good case law helping us. The Supreme Court has given us nothing. The Melendez[-Diaz] case was the best hope we had a number of years ago in 2009. [6/] They cited our report and said it was terrific—we need reform. And then nothing. And there’s been no other case, and that’s where I think we’re stuck. The judges are not moving because I think they don’t know how to limit the testimony of the experts in a way that would be effective and would achieve what we’re talking about.
PROFESSOR FRIED: Let me ask a question because not being a criminal lawyer, I find this puzzling. I am a constitutional lawyer, and ... you have got to prove guilt beyond a reasonable doubt, and there’s the Confrontation Clause — much misused by Justice Scalia, but here it could really do a job. All the judges would have to do — but you’re telling me they don’t do it, and they’re not doing their job, they’re acting unconstitutionally. [Suppose] you get one of these phony experts — and they are phony, some of them are. [I]n the civil area, they’re not only phony, but they’re crooks. I mean they are what [is] known as paid liars, but that’s a different thing. In the criminal area, in the prosecution, they are professional liars. [T]hey may not be paid; nevertheless, why are the defense lawyers not allowed to poke these holes under the Confrontation Clause and under cross-examination? It would fall apart in cross-examination, particularly if you had a contrary expert ... . Why doesn’t that happen? That would create reasonable doubt in an unreasonable number of cases. Why doesn’t that happen? You tell me, judge.

JUDGE EDWARDS: I’ve never understood the bitemark example. And I say this with great sadness .... The judges let it in. They’ve tried to do the cross-examination. It comes in, the judges let it in. If you can get someone who’s been identified as an expert, you’ve got the jury.

PROFESSOR FRIED: But what if you get an expert on the other side?

JUDGE EDWARDS: Here’s the problem. You don’t have scientists, serious scientists, like Eric, who have any interest in doing serious work in forensics.

To which DR. LANDER added that "[i]t's an unusual kind of science when the scientists work for one side. In criminal law, the scientists work almost exclusively for the prosecution." After elaborating, he concluded the panel's presentation with the following reaction to Professor Fried's question about why vigorous cross-examination and countervailing experts do not solve the problems of dubious science and overclaiming: 7/
What do you do when you don't really know what your accuracy is? You don't have a method. End of story, which means it's not admissible. If ... I have a scientific method that measured something [but] I have no clue how accurate it is, it's not a method. It doesn't come in. It doesn't go to weight. It goes to admissibility.
NOTES
  1. "Our Worst Fears Have Been Realized" — Forensic "Evidence, Science, and Reason in an Era of 'Post-truth' Politics" (Part 1). Nov. 20, 2017, https://for-sci-law.blogspot.com/2017/11/our-worst-fears-have-been-realized.html .
  2. The committee's regularly scheduled meeting took place during the afternoon, while Dr. Lander was speaking at Harvard. The committee spent the morning at Boston College listening to short presentations from many invited speakers — among whom Dr. Lander was prominent. The transcript of the addresses and discussion — including back-and-forth between Dr. Lander and a few Justice Department employees — is reproduced in the Fordham Law Review, along with papers supplied by a few of the speakers. Symposium on Forensic Expert Testimony, Daubert, and Rule 702, 86 Fordham L. Rev. 1463 (2018).
  3. The body of the PCAST report does not provide the name, date, or author(s) of the "official document" declaring papal infallibility. Note 97 on page 45 refers only to the defunct URL http://www.justice.gov/olp/file/861906/download. However, a separate list of references for fingerprinting includes the publication "Federal Bureau of Investigation. The Science of Fingerprints. U.S. Government Printing Office. (1984): p. iv." This booklet seems to be referring to a full set of fingerprints as a token of individual identity. It states at iv that
    Of all the methods of identification, fingerprinting alone has proved to be both infallible and feasible. Its superiority over the older methods, such as branding, tattooing, distinctive clothing, photography and body measurements (Bertillon system), has been demonstrated time after time. While many cases of mistaken identification have occurred through the use of these older systems, to date the fingerprints of no two individuals have been found to be identical.
  4. The witness in the case was not "the former head of the FBI crime lab." But he was the head of the FBI's latent fingerprint unit.
  5. The transcript of the advisory committee's symposium at Boston College does not reflect any use of the word "radical" by Ted Hunt. But he did take issue with the insistence in the PCAST report that for highly subjective feature-comparison methods,
    The sole way to establish foundational validity is through multiple independent black box studies that measure how often examiners reach accurate conclusions across many feature-comparison problems involving samples representative of the intended use. In the absence of such studies, the feature comparison method cannot be considered scientifically valid.
    The Department of Justice, he explained, regarded as "wrong and ill advised ... PCAST’s novel premise that the set of criteria that comprise its nonseverable six-part test collectively constitute the exclusive means by which scientific validity of a feature-comparison method can be established." Symposium on Forensic Expert Testimony, Daubert, and Rule 702, 86 Fordham L. Rev. 1463, 1520 (2018). The Department's position is that "mainstream scientific thought" looks to "all available information, evidence, and data." The real issue, of course, is what to do when "all available information" includes almost no well-designed studies of the accuracy and reliability of subjective measurements and opinions from them.
  6. Justice Scalia's opinion for the Court in Melendez-Diaz v. Massachusetts, 557 U.S. 305 (2009), devoted but a single sentence (shown in italics) to the NRC report:
    Nor is it evident that what respondent calls "neutral scientific testing" is as neutral or as reliable as respondent suggests. Forensic evidence is not uniquely immune from the risk of manipulation. According to a recent study conducted under the auspices of the National Academy of Sciences, "[t]he majority of [laboratories producing forensic evidence] are administered by law enforcement agencies, such as police departments, where the laboratory administrator reports to the head of the agency." National Research Council of the National Academies, Strengthening Forensic Science in the United States: A Path Forward 6-1 (Prepublication Copy Feb. 2009) (hereinafter National Academy Report). And "[b]ecause forensic scientists often are driven in their work by a need to answer a particular question related to the issues of a particular case, they sometimes face pressure to sacrifice appropriate methodology for the sake of expediency." Id., at S-17. A forensic analyst responding to a request from a law enforcement official may feel pressure — or have an incentive — to alter the evidence in a manner favorable to the prosecution.
  7. The availability of cross-examination is part of the Justice Department's argument for leaving Rule 702 and the committee note alone. Andrew Goldsmith argued that
    PCAST and the changes predicated on PCAST’s suggestions ignore the basic nature of the criminal justice system. It ignores, as Justice Harry Blackmun wrote in Daubert, both the capabilities of the jury and of the adversary system generally. Vigorous cross-examination, presentation of contrary evidence, and careful instruction on the burden of proof are the traditional and appropriate means of attacking shaky but admissible evidence.
    Symposium on Forensic Expert Testimony, Daubert, and Rule 702, 86 Fordham L. Rev. 1463, 1527 (2018). The last sentence comes verbatim from Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 596 (1993). However, the Department is ignoring the rest of Justice Blackmun's paragraph, which concludes with these words: "These conventional devices, rather than wholesale exclusion under an uncompromising 'general acceptance' test, are the appropriate safeguards where the basis of scientific testimony meets the standards of Rule 702." PCAST's argument is that the testimony does not meet the standards of Rule 702 unless the highly subjective and largely standardless assessments of skill-and-experience based "scientific" experts are adequately tested. The real issue is what adequate testing requires in this context.
