Saturday, May 20, 2017

Science Friday and Contrived Statistics for Hair Comparisons

On May 19th, Public Radio International's Science Friday show had a segment entitled "There’s Less Science In Forensic Science Than You Think." The general theme — that some practices have not been validated by rigorous scientific testing — is a fair (and disturbing) indictment. But listeners may have come away with the impression that the FBI has determined that hair examiners make up statistics from personal experience 95% of the time to help out prosecutors.

Ira Flatow, the show's host, opened with the observation that "The FBI even admitted in 2015, after decades, investigators had overstated the accuracy of hair sample matches over 95% of the time in ways that benefited the prosecution." He returned to this statistic when he asked Betty Layne DesPortes, a lawyer and the current President of the American Academy of Forensic Sciences, the following question:
Dr. DesPortes, I want to go back to that FBI admission in 2015 that for decades investigators had overstated the accuracy of their hair samples, and I mean 95% of the time in a way that benefited the prosecution. Is this a form of cognitive bias coming into the picture?
Ms. DesPortes replied that
It is, and ... you would have overstatement along the lines of, "Well, I’ve never seen in my X years of experience that two hairs would be this similar, so it must be a match," and then they would just start making statistics up based on, "Well, I’ve had a hundred cases in my practice, and there have been a thousand cases in my lab, and nobody else has ever reported similar hairs like this," so let’s just start throwing in one in a hundred thousand as a statistic — "one in a hundred thousand" — and that’s where the misstatement came in.
But neither Ms. DesPortes nor anyone else knows how often FBI examiners cited statistics like "one in a hundred thousand" based on either their recollections of their own casework or their impression of the collective experience of all hair examiners. 1/

To be sure, such testimony would have been flagged as erroneous in the FBI-DOJ Microscopy Hair Comparison Review. But so would a much more scientifically defensible statement such as
The hair removed from the towel exhibited the same microscopic characteristics as the known hair sample, and I concluded it was consistent with having originated from him. However, hair comparison is not like fingerprints, for example. It’s not a positive identification. I can’t make that statement. 2/
The Hair Comparison Review was not designed to produce a meaningful estimate of an error rate for hair comparisons. It generated no statistics on the different categories of problematic testimony. The data and the results have not been released (at least, not publicly) in a form that would allow independent researchers to ascertain the extent to which FBI examiners overstated their findings in various ways. See David H. Kaye, Ultracrepidarianism in Forensic Science: The Hair Evidence Debacle, 72 Wash. & Lee L. Rev. Online 227 (2015).

The interim results from the Hair Comparison Review prompted the Department of Justice to plan a retrospective study of FBI testimony involving other identification methods as well. In July 2016, it asked a group of statisticians how best to conduct the new "Forensic Science Disciplines Review." The informal recommendations that emerged in this "Statisticians' Roundtable" included creating a database of testimony that would permit more rigorous, social science research. But this may never happen. A new President appointed a new Attorney General, who promptly suspended the expanded study.

  1. Ms. DesPortes may not have meant to imply that all the instances of exaggerated testimony were of the type she identified.
  2. That statements like these may be scientifically defensible does not render them admissible or optimal.
(For related postings, click on the label "hair.")
