Wednesday, November 22, 2017

A Two-culture Problem with Forensic Science?

Recently, U.S. Court of Appeals Judge Harry T. Edwards complained that
Forensic practitioners are the people who got us in trouble in the first place. They don't know what they don't know. ... The people who are doing this do not understand what we mean when we say to them, "what you're doing has no scientific foundation." They don't understand it because they were brought up in a different world. They don't understand science. 1/
That is a harsh and sweeping generalization. Judge Edwards, who co-chaired a committee empaneled by the National Research Council to study forensic science in the United States, knows that the forensic-science community is far from monolithic. 2/

But I fear that, in part and in some instances, practitioners' frustration with the criticism that the validity and reliability of pattern-matching practices (such as fingerprint, toolmark, footwear, handwriting, and bitemark comparisons) have yet to be sufficiently demonstrated does reflect a lack of appreciation for what it takes to validate a process of measurement and inference in these fields.

An example is the reaction of the president of the International Association for Identification (IAI) to a draft recommendation of a subcommittee of the U.S. National Commission on Forensic Science. 3/ The document, which received the endorsement of 60% of the members of the Commission 4/ (including, I would guess, all of the nonforensic scientists on the panel), suggested that
Forensic science practitioners should not state that a specific individual or object is the source of the forensic science evidence and should make it clear that, even in circumstances involving extremely strong statistical evidence, it is possible that other individuals or objects could possess or have left a similar set of observed features. Forensic science practitioners should confine their evaluative statements to the support that the findings provide for the claim linked to the forensic evidence.
In other words, practitioners should evaluate the probability of the observed correspondence in the features of the specimens they examine if the specimens have a common origin and the probability of these observations if the specimens are from different sources. But they should not take it upon themselves to opine on the ultimate issue of who is the source.
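The comparison of those two probabilities is commonly summarized as a likelihood ratio. The following is a minimal sketch of that arithmetic; the function name and the numbers are hypothetical illustrations, not values taken from the views document or from any study.

```python
def likelihood_ratio(p_if_same_source: float, p_if_different_sources: float) -> float:
    """Return P(observations | same source) / P(observations | different sources)."""
    if not 0.0 <= p_if_same_source <= 1.0:
        raise ValueError("p_if_same_source must be a probability")
    if not 0.0 < p_if_different_sources <= 1.0:
        raise ValueError("p_if_different_sources must be a nonzero probability")
    return p_if_same_source / p_if_different_sources


# Purely illustrative inputs (assumptions, not empirical estimates).
lr = likelihood_ratio(p_if_same_source=0.9, p_if_different_sources=0.001)
print(f"The observations are about {lr:.0f} times more probable "
      "if the specimens share a source than if they do not.")
```

A statement of this kind speaks only to the support that the findings provide for one hypothesis over the other. It does not announce who the source is.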

Although advocacy of this approach is hardly novel in the international forensic-science community, the president of "the foremost international organization" 5/ of practitioners was incensed at the thought that latent fingerprint examiners should cease and desist from "conclusion decisions" 6/ in court. In an interview for Forensic Magazine, he said that it was just unfair. 7/ He fulminated (or elaborated):
"Even if all the minutiae all match up, you're telling me I can't say it came from the same source?" Ruslander said. "There are millions of fingerprints in AFIS, and there's never been a bad match, to my knowledge," he added. "That's a pretty good empirical study." 8/
Anyone competent in scientific methodology would have to call this kind of study fundamentally misconceived. One could design a good experiment to ascertain the validity of an automated fingerprint matcher, but it would not consist of one person’s memory of no “bad matches” -- whatever that means for a system that merely produces a list of possibly matching candidates rather than a single-source “conclusion decision.” Moreover, even if a perfectly accurate, fully automated system to make single-source attributions existed, what would its uncanny performance tell us about the validity and reliability of mere mortals who do not make their “conclusion decisions” the same way and who are known to err from time to time?
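By way of contrast, a designed validation study records outcomes on comparison pairs whose true source is known in advance and then tallies the errors. A minimal sketch of that tally follows. The class, the field names, and the counts are all hypothetical, chosen only to show the form of the calculation.

```python
from dataclasses import dataclass


@dataclass
class TrialCounts:
    """Outcomes on test pairs whose ground truth is known to the experimenter."""
    same_source_pairs: int        # pairs that truly share a source
    false_exclusions: int         # same-source pairs wrongly reported as different
    different_source_pairs: int   # pairs that truly come from different sources
    false_identifications: int    # different-source pairs wrongly reported as matching


def error_rates(counts: TrialCounts) -> tuple[float, float]:
    """Return (false-positive rate, false-negative rate) observed in the trials."""
    false_positive_rate = counts.false_identifications / counts.different_source_pairs
    false_negative_rate = counts.false_exclusions / counts.same_source_pairs
    return false_positive_rate, false_negative_rate


# Hypothetical counts for illustration only; not data from any real study.
example = TrialCounts(same_source_pairs=1000, false_exclusions=50,
                      different_source_pairs=2000, false_identifications=4)
fpr, fnr = error_rates(example)
print(f"false-positive rate: {fpr:.3%}, false-negative rate: {fnr:.3%}")
```

The essential feature is the known ground truth. A recollection of no "bad matches" in casework supplies no denominator of different-source comparisons against which an error could have been detected.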

Now there are studies that clearly demonstrate that latent print examiners can make source attributions and exclusions at rates far better than chance — in other words, there is scientifically demonstrable expertise even if the procedure is highly subjective and not particularly “scientific” at critical junctures. But remarks like those of the leader of “the world’s oldest and largest forensic science identification association” 9/ as to what is a “good empirical study” only lend credence to Judge Edwards’ complaint. They make it appear that practitioners “were brought up in a different world” and “don't understand science.” The forensic-science community can and must do better.

  1. "Our Worst Fears Have Been Realized" -- Forensic "Evidence, Science, and Reason in an Era of 'Post-truth' Politics" (Part 1), Forensic Science, Statistics and Law, Nov. 20, 2017, 
  2. His committee’s 2009 NRC Report noted that
    the “forensic science community” ... consists of a host of practitioners, including scientists (some with advanced degrees) in the fields of chemistry, biochemistry, biology, and medicine; laboratory technicians; crime scene investigators; and law enforcement officers. There are very important differences, however, between forensic laboratory work and crime scene investigations. There are also sharp distinctions between forensic practitioners who have been trained in chemistry, biochemistry, biology, and medicine (and who bring these disciplines to bear in their work) and technicians who lend support to forensic science enterprises. (P. 7)
  3. In the interest of full disclosure, I should note that I participated in drafting the "views document" that contains this recommendation and that I made suggestions to the Commission for further revisions.
  4. Transcript of Meeting 13, Part 1, Apr. 10, 2017, at 48,
  5. United States v. Herrera, 704 F.3d 480, 486 (7th Cir. 2013).
  6. H.W. “Rus” Ruslander, Feb. 5,  2017, IAI Position Statement on Conclusions, Qualified Statements, and Probability Modeling,
  7. Seth Augenstein, National Commission on Forensic Science Asks for Public Comment, Forensic Mag., Feb. 22, 2017,
  8. Augenstein, supra note 7.
  9. IAI’s Mission Statement,

Monday, November 20, 2017

"Our Worst Fears Have Been Realized" -- Forensic "Evidence, Science, and Reason in an Era of 'Post-truth' Politics" (Part 1)

On October 27, a trio of panelists spoke at the Harvard Law School on "Evidence, Science, and Reason in an Era of 'Post-truth' Politics." The organizers, law professors Scott Brewer and Dan Kahan, called the panel “stellar” -- and for good reason. The speakers were
★ Judge (and now Professor) Harry T. Edwards, who co-chaired the National Academy of Sciences’ 17-member committee on Strengthening Forensic Science in the United States: A Path Forward. The committee’s 2009 report found that “In a number of forensic science disciplines, forensic science professionals have yet to establish either the validity of their approach or the accuracy of their conclusions, and the courts have been utterly ineffective in addressing this problem.”
★ Professor (and former Justice) Charles Fried, who was the U.S. Solicitor General during the Reagan administration, and who argued on behalf of Merrell Dow Pharmaceuticals in the landmark case of Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579 (1993), and
★ Professor Eric S. Lander, who is best known to the forensic-science community for his early testimony and writing on DNA evidence and, of late, for his leadership role in a 2016 report of the President’s Council of Advisors on Science and Technology. This report on “Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods” noted that "[f]ederal appellate courts have not with any consistency or clarity imposed standards ensuring the application of scientifically valid reasoning and reliable methodology in criminal cases involving Daubert questions." His day job lies in directing the Broad Institute of MIT and Harvard, which “is empowering a revolution in biomedicine to accelerate the pace at which the world conquers disease.”
The program did not deal with “Evidence, Science, and Reason” writ large. Mostly, it concerned how the legal system has handled the issue of ensuring that trace evidence, which can associate individuals or objects with crimes, produces “scientific truth.”

What follows is a summary and compilation of some of the more provocative -- and sometimes ad hominem -- statements by Professor Brewer and Judge Edwards. There are also notes on a few of their remarks. I hope to touch on the remainder of the hour-and-a-half program in later installments. The full video recording is on the web.

PROFESSOR BREWER introduced the panel and the “question of how the truth that the [best?] of science has to offer can inform and guide what is called ‘forensic science’ in such a way that when judges and jurors rely, as they clearly do, on forensic science, they are actually relying on information that legitimately and accurately claims the mantle of scientific truth ... .” He opined that “The response to both reports has been, or seems to me anyway, dispiriting.” The response to which he referred was
☁ A statement from then Attorney General Loretta Lynch that “the Department will not be adopting the recommendations [of the 2016] report related to the admissibility of forensic-science evidence”
☁ Testimony from then Senator (now Attorney General) Jeff Sessions after the 2009 report that “I don’t think we should suggest that these proven scientific principles that we’ve been using for decades are somehow uncertain ... .”
☁ “More recently, the Sessions' Justice Department appointed Ted Hunt, a former prosecutor, as was Sessions, as senior forensic advisor overseeing a forensic science working group to create guidelines for forensic examiners to follow in court testimony. Hunt was ... one of two commissioners [on the National Commission on Forensic Science] to reject the recommendation that forensic experts and attorneys working on behalf of the Justice Department stop using the phrase ... ‘to a reasonable degree of scientific certainty.’” 1/
☁ The Justice Department under Attorney General Sessions “disbanded the National Commission on Forensic Science.” (For the official explanation, see The Justice Department’s Explanation for the End of the National Commission on Forensic Science, Forensic Sci., Stat. & L., April 26, 2017.)
JUDGE EDWARDS spoke of the "two-and-one-half years" that the NAS committee spent "going through all of the research that was available," the discouraging conclusions ("a community in disarray"), and the recommendation for an "independent federal agency." As for the latter,
[O]ne of our most important recommendations was the DOJ, the Department of Justice, should not be that agency. ... They had a vested interest in prosecuting. It's inconsistent with the culture of science ... and we were all, to a person, unanimous in the view that DOJ had to be kept out of it, and boy, were we prescient.
Our worst fears have been realized. ... We got no serious help from DOJ once the report issued, and that goes through ... all the administrations that have been involved. And the only time that DOJ acted in a way that has been useful is when they've been under pressure -- for example, when they were exposed on microscopic hair examinations ... and they had to 'fess up ... Other than that, we were not getting help from DOJ. ... I was appalled we could get no support from the Department to try to advance reform movement.
The other part of our worst fears realized is DOJ is now the self-anointed leader of the forensic science reform movement, which is a disaster. ... [I]t is shrewd on their part because they want to control what is and is not done -- mostly what is not done. And they are in control right now, which is really unfortunate.
On the other hand, Judge Edwards added that
Now ... just so you get the full picture, after the [2009 NAS] report issued, it was cited throughout the world ... . The settlement of the hair cases, press reports, and the National Institute of Justice [within the DOJ] has begun to sponsor some research to try to improve some of the disciplines. ... And in 2013, ... DOJ, and (in my view) under pressure, because a lot of us including the press, had been pressing for some movement, they cosponsored with the National Institutes of Standards and Technology the creation of the National Commission on Forensic Science.
But the Commission, Judge Edwards contended, suffered precisely because it was run by the Justice Department:
Now, here's the problem. DOJ held on to it. Not NIST -- DOJ. So any recommendations coming out of this group went to DOJ, and DOJ decided whether or not the recommendations would be implemented. [ 2/ ] Most of the recommendations have not been implemented. [ 3/ ] They only met twice a year. [ 4/ ] There was no real leadership, and at one point one of my colleagues, Judge Jed Rakoff, a district court judge in New York, resigned from the Commission because DOJ was going to limit the scope of their work, and he wrote an article, an op-ed piece, in the press, and they backed down and forced DOJ to open it back up again. [ 5/ ] But they had no enforcement, and the recommendations were not being accepted by DOJ, but there was a little bit of progress because at least until they were shut down, they started to come together around some recommendations that would have advanced the forensic science project. [ 6/ ]
Next, Judge Edwards discussed the PCAST report and the reaction of the Justice Department:
It's a really strong report ... essentially saying that with respect to these pattern-matching disciplines there are serious problems -- this is not science. You have people testifying about things on the assumption that it's science, and there was no scientific basis for what they were saying.

And then you have the current Department of Justice. ... They tried to block the issuance of the White House report -- DOJ did. I know about the internal battle. ... These were world-class scientists who had studied all of these disciplines [and] had come to very serious conclusions about the frailties of these disciplines, and DOJ pressed the White House not to let this report come out. It finally did come out. DOJ said "we're not interested," and when the new administration came in, DOJ said "we're still not interested, and we're not going to apply any of the recommendations here."

The current DOJ -- and I had an opportunity to hear the new leader within the Department of Justice, Ted Hunt -- who spoke to the National Academy of Sciences a week or two ago at a meeting that I was at. This is a person who was on the National Commission of Forensic Science [and] had voted against a number of proposals that would [have] helped to reform the forensic science community. He blasted the PCAST report and said "they did not understand what they were talking about with respect to science." ... I really wish I could have videotaped the exchange. When he blasted PCAST -- and I'm sitting in a room with world-class scientists at the National Academy of Scientists -- and he did his critique on scientific research, and one of my colleagues couldn't stand it any longer. She said "what are you talking about?" She said, "I teach scientific methodology every Friday, every Friday at 1, and you haven't the faintest idea of what you are talking about." And it was exactly accurate. He had no sense of scientific methodology. ... He made a comment that was one of the most astonishing things I've ever heard. He said, "and incidentally, the jury is still out on bitemarks." The jury is not still out on bitemarks. Trust me, there is no science supporting bitemarks, and yet it is still a discipline that we use in the United States, and it's still being accepted by the courts. And this new person who's heading the forensic science wing at DOJ has the chutzpah to say that the jury's still out. And I said, "well if you're really serious about advancing reform, wouldn't the first thing you would want to say to the world be, bitemarks is gone as far as we're concerned?" He had no interest in an independent group overseeing the reform effort, and he may kill the National Institute of Science and Technology effort.
In sum,
The National Commission on Forensic Science is dead now because Justice has killed it. So an enterprise that might have produced recommendations that could have been helpful is no longer in existence. There is this guy in Justice, Ted Hunt, who's now called head of it all, and he has his little working group, and no one knows what they're doing, and they refuse to have an independent science group oversee it.
As for NIST and its creation, the Organization of Scientific Area Committees for Forensic Science (OSAC), Judge Edwards maintained the committees are not staffed by enough "real scientists" who actually "understand science":
NIST still exists, and they oversee Scientific Area Committees. And what they are trying to do is establish standards in the disciplines for each of these groups. But I want you to understand ... while this is a noble enterprise in some respects, it's not going to get us where we need to go. The NIST enterprise with these Area Committees is pretty much dominated by forensic practitioners. Forensic practitioners are the people who got us in trouble in the first place. They don't know what they don't know. That's the problem. ... The people who are doing this do not understand what we mean when we say to them, "what you're doing has no scientific foundation." They don't understand it because they were brought up in a different world. They don't understand science. The disciplines that they are now trying to set standards for, many of them have not been validated and they have not been shown to be reliable. So how do you set standards [for] a discipline that has not been shown to be valid and not shown to be reliable? That's one of the frailties of this whole NIST project. ...

These practitioners do not want to know sources of variability. They don't want to try and understand error rates. They don't want to believe that uncertainty exists. They object to blind studies that would help to confirm the reliability of their work. They're persuaded by very small sample sizes. And they fight the real scientists with whom they are working on these Area Committees. And they dominate the committees by 70% to 30%, and the real scientists on these committees with whom I've been in contact say it's a nightmare trying to struggle with them because they don't understand the issues.
His final remarks concerned courtroom testimony and judicial permissiveness:
The exaggerated testimony in court is horrible. We have people testifying "zero error rate, vanishingly small, essentially zero," and we have appellate court opinions in the federal courts adopting zero error rates as if it were a viable notion. ... [T]he federal rules are no help. ... Rule 702 ... was based on Daubert, which purports to talk about scientific validity, [but] Daubert has been ... a failure in ... the criminal arena ... In criminal cases, the notion of scientific validity that is very much a part of Daubert has not worked. It has failed. And it has failed because ... of judges who are wedded to precedent [and] believe that because we said it before, it must be right, and because these practitioners have been around for a long time, it must be right. In other words, history is the proof, and precedent controls. ... And when the experts come in, even when they have some science ... [in cases involving the compositional analysis of bullet lead] they did not know how to do ... a statistical analysis to look at variability and error rates -- they don't know anything about it. And the courts didn't know they didn't know anything about it. ...
  1. On the nature of Commissioner Hunt’s arguments against the Commission’s recommendation, see "Reasonable Scientific Certainty," the NCFS, the "Law of the Courtroom," and that Pesky Passive Voice, Forensic Sci., Stat. & L., Mar. 1, 2016; Is "Reasonable Scientific Certainty" Unreasonable?, Forensic Sci., Stat. & L., Feb. 26, 2016.
  2. It is not obvious that the division of authority between NIST and DOJ was unreasonable or nefarious. I think Judge Edwards' criticism boils down to frustration with the absence of a centralized, scientific agency that regulates forensic science in America. Neither NIST nor DOJ has the power to make mandatory rules for all forensic scientists. As discussed in a posting of April 12, 2017 (Two Misconceptions About the End of the National Commission on Forensic Science), NCFS provided advice for specific actions by the Attorney General and promulgated more general views for the benefit of what DOJ and NIST call "stakeholders." NIST officials co-chaired and vice-chaired the Commission, and, with DOJ funding, NIST established a complementary structure -- the Organization of Scientific Area Committees on Forensic Science (OSAC) -- to develop science-based, voluntary standards. The stated aim of OSAC is "to identify and promote technically sound, consensus-based, fit-for-purpose documentary standards that are based on sound scientific principles." How well OSAC has met this goal is a distinct question. The Commission recommended the creation of an independent body to ensure "technical merit" of the forensic-science standards that OSAC deems meritorious. OSAC has three "resource committees," but none of them is tasked with reviewing and reporting on the technical merit of standards that the committees propose for addition to a registry of OSAC-approved standards. 
  3. The Commission made recommendations for actions by the Department of Justice regarding accreditation of forensic-science service providers, proficiency testing, public release of quality-management-systems documents, a code of professional responsibility for forensic providers, AFIS interoperability, root-cause analysis of errors, certification of medicolegal death examiners, accreditation of medical examiner and coroner offices, electronic networking of those offices, a national disaster call center, a national office for medicolegal death investigation, model legislation for medicolegal death investigations, use of the term "reasonable scientific certainty," pretrial discovery, documentation and reporting, and forensic-science curriculum development. It also promulgated "views documents" that did not call on the Attorney General to take specific actions. See Work Products Adopted by the Commission, Nov. 6, 2017,
     Computing a percentage for the adoption of the recommendations would not be trivial, as some were adopted in part and rejected in other parts. For example, the code of professional responsibility that the Department adopted omitted or altered the following provisions:
    ▹ Utilize scientifically validated methods and new technologies, while guarding against the use of unproven methods in casework and the misapplication of generally-accepted standards. [Rather than commit to using scientifically validated methods, DOJ enjoins forensic-science professionals to "[c]onduct research and forensic casework using the scientific method or agency best practices. Where validation tools are not known to exist or cannot be obtained, conduct internal or inter-laboratory validation tests in accordance with the quality management system in place."]
    ▹ Conduct independent, impartial, and objective examinations that are fair, unbiased, and fit-for-purpose. [DOJ's version omits "objective" and reads "Conduct examinations that are fair, unbiased, and fit-for-purpose."]
    ▹ Once a report is issued and the adjudicative process has commenced, communicate fully when requested with the parties through their investigators, attorneys, and experts, except when instructed that a legal privilege, protective order or law prevents disclosure. [DOJ's version requires "[h]onest[] communication ... when permitted by ... agency practice."]
    ▹ Appropriately inform affected recipients (either directly or through proper management channels) of all nonconformities or breaches of law or professional standards that adversely affect a previously issued report or testimony and make reasonable efforts to inform all relevant stakeholders, including affected professional and legal parties, victim(s) and defendant(s). [DOJ preferred that its laboratory professionals have a much more limited duty to "[i]nform the prosecutors involved through proper laboratory management channels of material nonconformities or breaches of law or professional standards that adversely affect a previously issued report or testimony."]
  4. The Commission did not meet only twice a year. It met four times in 2014 and 2015, three times in 2016, and twice in the first quarter of 2017, after which its charter was not renewed. The average meeting rate was thus four times a year.
  5. Judge Rakoff did not write an op-ed article -- at least not one that I can find on the web. His letter of resignation appeared in the Washington Post, and newly appointed Deputy Attorney General Sally Yates flew to New York, talked with Judge Rakoff, and rescinded the decision that pretrial discovery rules were outside the Commission's mandate. The Commission made its recommendations for more complete pretrial discovery in criminal cases involving forensic-science evidence, and DOJ implemented them. See "A Bump in the Road" for the National Commission on Forensic Science, Jan. 29, 2015, Forensic Sci., Stat. & L.; Justice Department Reverses Decision on the Mandate of the National Commission on Forensic Science, Jan. 31, 2015; Joseph Ax, After Quitting in Protest, Prominent U.S. Judge Rejoins DOJ Commission, Reuters, Jan. 30, 2015,
  6. Sadly, the Commission was unable to attain a two-thirds majority on a subcommittee's "Final Draft Views on Report and Case Record Contents" (59% voted in favor) and "Final Draft Views on Statistical Statements in Forensic Testimony" (60% voted in favor). Whether more meetings would have produced consensus (in the sense of the required two-thirds vote) on these important matters is unclear.
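The arithmetic behind notes 4 and 6 can be checked in a few lines. This is only a sketch: the variable and function names are hypothetical, and it adopts note 4's own framing that the 2017 meetings fell within a single quarter of a year.

```python
from fractions import Fraction

# Meeting counts by calendar year, as stated in note 4.
meetings_by_year = {2014: 4, 2015: 4, 2016: 3, 2017: 2}

# Assumption: three full years plus one quarter of 2017, following note 4.
years_active = Fraction(3) + Fraction(1, 4)

total_meetings = sum(meetings_by_year.values())
average_per_year = total_meetings / years_active
print(f"{total_meetings} meetings in {float(years_active)} years "
      f"= {float(average_per_year)} meetings per year")


def reaches_two_thirds(percent_in_favor: int) -> bool:
    """True if the share voting in favor meets the two-thirds requirement."""
    return Fraction(percent_in_favor, 100) >= Fraction(2, 3)


# Vote shares reported in note 6 for the two draft views documents.
for share in (59, 60):
    print(f"{share}% in favor reaches two-thirds: {reaches_two_thirds(share)}")
```

Exact fractions are used so that the comparison with two-thirds does not depend on floating-point rounding.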

Saturday, November 18, 2017

A New Breed of Peer Reviewer

Dr. Olivia Doll has ample academic credentials on her C.V. She holds a doctorate in Canine Studies from the Subiaco College of Veterinary Science (Dissertation: Canine Responses to Avian Proximity); a master’s degree in Early Canine Studies from Shenton Park Institute for Canine Refuge Studies; and a bachelor’s degree from the Staffordshire College of Territorial Science. Her research includes such topics as the relationships between Doberman Pinschers and Staffordshire Terriers in domestic environments, the role of domestic canines in promoting optimal mental health in ageing males, the impact of skateboards on canine ambulation, and the benefits of abdominal massage for medium-sized canines.

However, the Shenton Park “Institute” is the animal shelter in which she lived as a pup, and there is no College of Veterinary Science in Subiaco. Her expertise in canine studies comes strictly from her skill, experience, and training (not to mention her instincts) as a five-year-old Staffordshire Terrier, AKA Ollie.

Nevertheless, this tongue-in-jowl record was sufficient to induce the following flaky academic journals to add the pit bull from Perth to their editorial boards:
■ *EC Pulmonary and Respiratory Medicine, Echronicon Open Access
■ Journal of Community Medicine and Public Health Care, Herald Scholarly Open Access
■ Journal of Tobacco Stimulated Diseases, Peertechz
■ Journal of Alcohol and Drug Abuse, SMGroup
■ Alzheimer’s and Parkinsonism: Research and Therapy, SMGroup
■ *Journal of Psychiatry and Mental Disorders, Austin Publishing Group
■ *Global Journal of Addiction and Rehabilitation Medicine (Associate Editor), Juniper Publishers
■ Austin Addiction Sciences, Austin Publishing Group
Ollie's exploits received international coverage in newspapers and in Science in May. Yet, Dr. Doll remains on the websites of the journals marked with an asterisk (accessed November 18, 2017). I do not know if she is still reviewing papers for the Journal of Community Medicine and Public Health Care, for that journal does not list names of editors or reviewers.


Friday, November 17, 2017

Comedic License

The other day, I mentioned a satire by the comedian John Oliver on developments in forensic science. A few minutes of the video was played for an international audience at Harvard Law School on October 27 in the introduction to a panel discussion entitled "Evidence, Science, and Reason in an Era of 'Post-truth' Politics." The clip included remarks from Santae Tribble and his lawyer about hair evidence in the highly (and deservedly) publicized exoneration of Mr. Tribble. 1/ I got the impression from the Oliver video that an FBI criminalist claimed that there was but a 1 in 10 million chance that the hair could have come from anyone else. In Oliver's recounting of the case,
Take Santae Tribble, who was convicted of murder, and served 26 years in large part thanks to an FBI analyst who testified that his hair matched hair left at the scene, and he will tell you the evidence was presented as being rock solid. "They said they matched my hair in all microscopical characteristics. And that's the way they presented it to the jury, and the jury took it for granted that that was my hair."

You know, I can see why they did. Who other than an FBI expert would possibly know that much about hair? ... Jurors in Tribble's case were actually told that there was one chance in ten million that it could be someone else's hair.
I have not seen the trial transcript, but a Washington Post reporter, who presumably has, provided this account:
In Tribble’s case, the FBI agent testified at trial that the hair from the stocking matched Tribble’s “in all microscopic characteristics.” In closing arguments, federal prosecutor David Stanley went further: “There is one chance, perhaps for all we know, in 10 million that it could [be] someone else’s hair.” 2/
If the Post is correct, it was not "an FBI expert" who "actually told" jurors that the probability the hair was not Tribble's was 1/10,000,000. It was an aggressive prosecutor who made up a number that he imagined "perhaps ... could" be true. Like the similar figure of 1/12,000,000 in the notorious case of People v. Collins, 438 P.2d 33 (Cal. 1968), this number plainly was objectionable.

In distinguishing between the criminalist's testimony and the lawyer's argument, I do not mean to be too critical of Oliver's condensed version and passive voice. Sometimes a little oversimplification saves a lot of pedantic explanation. The sad fact is that, whichever participants in the trial were responsible for the 1/10,000,000 figure, Tribble's case exemplifies Oliver's earlier observation that "[t]he problem is, not all forensic science is as reliable as we have become accustomed to believing. ... It's not that all forensic science is bad, because it's not. But too often, its reliability is dangerously overstated ... ." 3/

  1. See The FBI's Worst Hair Days, July 31, 2014, Forensic Sci., Stat. & L.,
  2. Spencer S. Hsu, Santae Tribble Cleared in 1978 Murder Based on DNA Hair Test, Wash. Post, Dec. 14, 2012, (emphasis added).
  3. Cf. David H. Kaye, Ultracrepidarianism in Forensic Science: The Hair Evidence Debacle, 72 Wash. & Lee L. Rev. Online 227 (2015),

After composing these remarks, I located the 2012 motion to vacate Tribble's conviction. (Motion to Vacate Conviction and Dismiss Indictment With Prejudice on the Grounds of Actual Innocence Under the Innocence Protection Act, United States v. Tribble, Crim. No. F-4160-78 (D.C. Super. Ct. Jan. 18, 2012).) This document contains more extended excerpts from the trial transcript. These indicate that although the FBI agent, James Hilverda, did not generate a nonsource probability of 1/10,000,000, he did insist that it was at least "likely" that Tribble was the source, and he indicated that the probability of a coincidental match could be on the order of one in a thousand.
Q. Is it possible that two individuals could have hairs with the same characteristics?
A. It is possible, but my personal experience, I rarely have seen it. Only on very rare occasions have I seen hairs of two individuals that show the same characteristics. Usually when we find hairs that we cannot distinguish from two individuals, it is due to the fact that the hairs are either so dark that you cannot see sufficient characteristics, or they are too light that they do not have very many characteristic[s] to examine. So you cannot make a real good analysis in that manner. But there is a potential even with sufficient characteristics to once in a while to see hairs that you cannot distinguish.
Q. And is it because of that small possibility that two hairs can be from different people that you never give absolutely positive opinion that two hairs, in fact, did come from the same person.
A. That is correct.
Q. If you compare them and they microscopically match in all characteristics, could you say they could have come from the same person?
A. That's correct.
Q. Have you ever seen two hairs from two different people that microscopically match?
A. Well, it is possible in the thousands of hair examples I have done, it is very, very rare to find hairs from two different people that exhibit the same microscopic characteristics. It is possible, but as I said, very rare.
A. ... I found that these hairs — the hairs that I removed from the stocking matched in all microscopic characteristics with the head hair samples submitted to me from Santae Tribble. Therefore, I would say that this hair could have originated from Santae Tribble.
Q. Is there any microscopic characteristics [sic] of the known hair that did not match?
A. No. The hair that I aligned with Santae Tribble matched in all microscopic characteristics, all characteristics were the same.
Q. And were there a large number of characteristics which were the same or—
A. All the characteristics were the same and there was a sufficient number of characteristics to allow me to do my examinations.
A. ... I think you can identify individuals, say, as to race, you can an [sic] indication that it came from an individual because it matches in all characteristics and I would say that when I have-in my experience that I feel when I have made this type of examination, it is likely that it came from the individual which I depicted it as coming from.
Mrs. McCormick said at least one of the robbers, the only one that she saw, was wearing a stocking mask. A block from where that homicide occurred, more or less, through the alley and around the corner on a fresh trail found by a trained canine dog, was found what, a stocking mask. In that stocking mask was found what, a hair.
Now whose hair was that? Well it was compared by the FBI laboratory and they can say for one thing, it wasn't Cleveland Wright's hair. They can say that that hair matched in every microscopic characteristic, the hair of Santae Tribble. Now for scientific reasons, which were explained, they cannot say positively that that is Santae Tribble's hair because on rare occasions they have seen hairs from two different people that matched. But usually that is the case where there have been few characteristics present. Now Agent Hilverda told you that on these hairs, the known hairs of Santae Tribble and the hair found in the stocking, there were plenty of characteristics present, not that situation at all. But because the FBI is cautious, he cannot positively say that that is Santae Tribble's hair.
Well what did Special Agent Hilverda say, microscopic analysis. Throwing the large terms away, what did he say? "Could be." He looked at one hair, he looked at Santae Tribble's hair and said, could be.
[T]he dog found a stocking with a hair which not only could be Santae Tribble's, it exactly matches Santae Tribble's in every microscopic characteristic. And Mr. Potenza says throw away all the scientific terms and reduce it to "could be." Scientific terms are important. He told you how he compares hairs with powerful microscopes. It is not just the color or wave. He could reject Cleveland Wright's hair immediately; it wasn't Cleveland Wright's hair in that stocking. But he couldn't reject Santae Tribble's because it was exactly the same. And the only reason he said could be is because there is one chance, perhaps for all we know, in ten million that it could [be] someone else's hair. But what kind of coincidence is that?
... The hair is a great deal more than "could be." And if you listened to Agent Hilverda and what he really said, the hair exactly matched Santae Tribble's hair, found in the stocking.

Wednesday, November 15, 2017

It’s a Match! But What Is That?

When it comes to identification evidence, no one seems to know precisely what a match means. The comedian John Oliver used the term to riff on CSI and other TV shows in which forensic scientists or their machines announce devastating “matches.” The President’s Council of Advisors on Science and Technology could not make up their minds. The opening pages of their 2016 report included the following sentences:

[T]esting labs lacked validated and consistently-applied procedures ... for declaring whether two [DNA] patterns matched within a given tolerance, and for determining the probability of such matches arising by chance in the population (P. 2)
Here, a “match” is a correspondence in measurements, and it is plainly not synonymous with a proposed identification. The identification would be an inference from the matching measurements that could arise "by chance" or because the DNA samples being analyzed are from the same source.

By subjective methods, we mean methods including key procedures that involve significant human judgment—for example, about which features to select within a pattern or how to determine whether the features are sufficiently similar to be called a probable match. (P. 5 n.3)
Now it seems that “match” refers to “sufficiently similar” features and to an identification of a single, probable source of the traces with these similar features.

Forensic examiners should therefore report findings of a proposed identification with clarity and restraint, explaining in each case that the fact that two samples satisfy a method’s criteria for a proposed match does not mean that the samples are from the same source. For example, if the false positive rate of a method has been found to be 1 in 50, experts should not imply that the method is able to produce results at a higher accuracy. (P. 6)
Here, “proposed match” seems to be equated to “proposed identification.” (Or does “proposed match” mean that the degree of similarity a method uses to characterize the measurements as matching might not really be present in the particular case, but is merely alleged to exist?)

Later, the report argues that
Because the term “match” is likely to imply an inappropriately high probative value, a more neutral term should be used for an examiner’s belief that two samples come from the same source. We suggest the term “proposed identification” to appropriately convey the examiner’s conclusion ... . (Pp. 45-46.)
Is this a blanket recommendation to stop using the term “match” for an observed degree of similarity? It prompted the following rejoinder:
Most scientists would be comfortable with the notion of observing that two samples matched but would, rightly, refuse to take the logically unsupportable step of inferring that this observation amounts to an identification. 1/
I doubt that it is either realistic or essential to banish the word “match” from the lexicon for identification evidence. But it is essential to be clear about its meaning. As one textbook on interpreting forensic-science evidence cautions:
Yet another word that is the source of much confusion is 'match'. 'Match' can mean three different things:
• Two traces share some characteristic which we have defined and categorised, for example, when two fibres are both made of nylon.
• Two traces display characteristics which are on a continuous scale but fall within some arbitrarily defined distance of each other.
• Two traces have the same source, as implied in expressions such as 'probable match' or 'possible match'.
If the word 'match' must be used, it should be carefully defined. 2/
  1. I.W. Evett, C.E.H. Berger, J.S. Buckleton, C. Champod, & G. Jackson, Finding the Way Forward for Forensic Science in the US—A Commentary on the PCAST Report, 278 Forensic Sci. Int'l 16, 19 (2017). One might question whether “most scientists” should “be comfortable” with observing that two samples “matched” on a continuous variable (such as the medullary index of hair). Designating a range of matching and nonmatching values means that a close nonmatch is treated radically differently than an almost identical match at the edge of the matching zone. Ideally, a measure of probative value of evidence should not incorporate this discontinuity.
  2. Bernard Robertson, Charles E. H. Berger, and G. A. Vignaux, Interpreting Evidence: Evaluating Forensic Science in the Courtroom 63 (2d ed. 2016).

Saturday, November 4, 2017

Louisiana's Court of Appeals Brushes Aside PCAST Report for Fingerprints and Toolmark Evidence

A defense effort to exclude fingerprint and toolmark identification evidence failed in State v. Allen, No. 2017-0306, 2017 WL 4974768 (La. Ct. App., 1 Cir., Nov. 1, 2017). The court described the evidence in the following paragraphs:
The police obtained an arrest warrant for the defendant and a search warrant for his apartment. During the ... search ... , the police recovered a .40 caliber handgun and ammunition ... The defendant denied owning a firearm ... .
     BRPD [Baton Rouge Police Department] Corporal Darcy Taylor processed the firearm ... , lifting a fingerprint from the magazine of the gun and swabbing various areas of the gun and magazine. Amber Madere, an expert in latent print comparisons from the Louisiana State Police Crime Lab (LSPCL), examined the fingerprint evidence and found three prints sufficient to make identifications. The latent palm print from the magazine of the gun was identified as the defendant's left palm print.
     Patrick Lane, a LSPCL expert in firearms identification, examined the firearm and ammunition in this case. Lane noted that the firearm in evidence, was the same caliber as the cartridge cases. ... He further test-fired the weapon and fired reference ammunition from the weapon for comparison to the ammunition in evidence. Lane determined that based on the quality and the quantity of markings that were present on the evidence cartridge case and the multiple test fires, the weapon in evidence fired the cartridge case in evidence.
Defendant moved for "a Daubert hearing ... based on a report, released by the President's Council of Advisors on Science and Technology (PCAST) three days before the motion was filed, which called into question the validity of feature-comparison models of testing forensic evidence." The court of appeals decided that there had been such a hearing. The opinion is not explicit about the timing of the hearing. It suggests that it consisted of questions to the testifying criminalists immediately before they testified. In the court's words,
[B]efore the trial court's determination as to their qualification as experts and the admission of their expert testimony, Madere and Lane were thoroughly questioned as to their qualifications, and as to the reliability of the methodology they used, including the rates of false positives and error. The defendant was specifically allowed to reference the PCAST report during questioning. Thus, the trial court allowed a Daubert inquiry to take place in this case. The importance of the Daubert hearing is to allow the trial judge to verify that scientific testimony is relevant and reliable before the jury hears said testimony. Thus, the timing of the hearing is of no moment, as long as it is before the testimony is presented.
The court was correct to suggest that a "hearing" can satisfy Daubert even if it was not conducted before the trial. However, considering that the "hearing" involved only two prosecution witnesses, whether it should be considered "thorough" is not so clear.

As for the proof of scientific validity, the court pointed to a legal history of admissibility mostly predating the 2009 NRC report on Strengthening Forensic Science in the United States and the later PCAST report. It failed to consider a number of federal district court opinions questioning the type of expert testimony apparently used in the case (it is certain that "the weapon ... fired the cartridge case"). Yet, it insisted that "[c]onsidering the firmly established reliability of fingerprint evidence and firearm examination analyses, the expert witness's comparison of the defendant's fingerprints, not with latent prints, but with known fingerprints, ... we find [no] error in the admission of the testimony in question." The assertion that the latent print examiner did not compare "the defendant's fingerprints [to] latent fingerprints" is puzzling. The fingerprint expert testified that "[t]he latent palm print from the magazine of the gun was identified as the defendant's left palm print." That was the challenged testimony, not some unexplained comparison of known prints to known prints.

The text of the opinion did not address the reasoning in the PCAST report. A footnote summarily -- and unconvincingly -- disposed of the PCAST report in a few sentences:
[T]he PCAST report did not wholly undermine the science of firearm analysis or fingerprint identification, nor did it actually establish unacceptable error rates for either field of expertise. In fact, the PCAST report specifically states that fingerprint analysis remains “foundationally valid” and that “whether firearms should be deemed admissible based on current evidence is a decision that belongs to the courts.”
"Did not wholly undermine the science" is faint praise indeed. The council's view about the "foundational validity" (and hence the admissibility under Daubert) of firearms identification via toolmarks was clear: "Because there has been only a single appropriately designed study, the current evidence falls short of the scientific criteria for foundational validity." (P. 111).

As regards fingerprints, the court's description of the report is correct but incomplete. The finding of "foundational validity" was grudging: "The studies collectively demonstrate that many examiners can, under some circumstances, produce correct answers at some level of accuracy." (P. 95). The council translated its misgivings about latent fingerprint identification into the following recommendation:
Overall, it would be appropriate to inform jurors that (1) only two properly designed studies of the accuracy of latent fingerprint analysis have been conducted and (2) these studies found false positive rates that could be as high as 1 in 306 in one study and 1 in 18 in the other study. This would appropriately inform jurors that errors occur at detectable frequencies, allowing them to weigh the probative value of the evidence. (P. 96).
To say that the Louisiana court did not undertake a careful analysis of the PCAST report would be an understatement. Of course, courts need not accept the report's detailed criteria for establishing "validity." Neither must they defer to its particular views on how to convey the probative value of scientific evidence. But if they fail to engage with the reasoning in the report, their opinions will be superficial and unpersuasive.
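The phrase "could be as high as" in the quoted recommendation reads most naturally as an upper confidence bound on a false-positive rate estimated from a finite number of comparisons. The following Python sketch shows one common way to compute such a bound, the one-sided Clopper-Pearson ("exact" binomial) method. The choice of method, the function names, and the counts (6 errors in 3,628 comparisons) are assumptions made for illustration; they are not figures quoted in the passage above, and the report itself should be consulted for the studies' actual data and methods.

```python
import math


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def upper_confidence_bound(errors, trials, alpha=0.05):
    """One-sided (1 - alpha) Clopper-Pearson upper bound on an error rate.

    Finds, by bisection, the rate p at which seeing `errors` or fewer
    errors in `trials` comparisons would have probability alpha.
    """
    lo, hi = errors / trials, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if binom_cdf(errors, trials, mid) > alpha:
            lo = mid  # data still too plausible at this rate; bound is higher
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    errors, trials = 6, 3628  # illustrative counts, assumed for this sketch
    observed = errors / trials
    bound = upper_confidence_bound(errors, trials)
    print(f"observed rate: 1 in {1 / observed:.0f}")
    print(f"95% upper bound: 1 in {1 / bound:.0f}")
```

The point of the exercise is that the upper bound is noticeably larger than the observed rate when errors are rare, which is why a report might tell jurors how high the rate "could be" rather than only what was observed.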

Why the News Stories on the Louisiana Lawyer Dog Are Misleading

The news media and the bloggers are abuzz with stories of how Louisiana judges think that a suspect's statement to "get me a lawyer dog" (or maybe "give me a lawyer, dog") is not an invocation of the right to counsel, which, under Miranda v. Arizona, requires the police to terminate a custodial interrogation. Although the case has nothing to do with forensic science or statistics, this blog often points out journalists' misrepresentations, and I'll digress from the main theme of this blog to explain how the media has misrepresented the case.

Five days ago, a blog called "Hit and Run" observed that Justice Scott Crichton of the Louisiana Supreme Court wrote a concurring opinion to explain why he agreed that the court need not review the case of Warren Demesme. It seems that Demesme said to his interlocutors, "If y'all, this is how I feel, if y'all think I did it, I know that I didn't do it so why don't you just give me a lawyer dog cause this is not what's up." The police continued the interrogation. Demesme made some admissions. Now he is in jail on charges of aggravated rape and indecent behavior with a juvenile. The post's writer, Ed Krayewski (formerly with Fox Business and NBC), read the Justice's opinion and thought
Crichton's argument relies specifically on the ambiguity of what a "lawyer dog" might mean. And this alleged ambiguity is attributable entirely to the lack of a comma between "lawyer" and "dog" in the transcript. As such, the ambiguity is not the suspect's but the court's. And it requires willful ignorance to maintain it.
Credulous writers at Slate, the Washington Post, and other news outlets promptly amplified and embellished Krayewski's report. Slate writer Mark Joseph Stern announced that
Justice Scott Crichton ... wrote, apparently in absolute seriousness, that “the defendant’s ambiguous and equivocal reference to a ‘lawyer dog’ does not constitute an invocation of counsel that warrants termination of the interview.”
Reason’s Ed Krayewski explains that, of course, this assertion is utterly absurd. Demesme was not referring to a dog with a license to practice law, since no such dog exists outside of memes. Rather, as Krayewski writes, Demesme was plainly speaking in vernacular; his statement would be more accurately transcribed as “why don’t you just give me a lawyer, dawg.” The ambiguity rests in the court transcript, not the suspect’s actual words. Yet Crichton chose to construe Demesme’s statement as requesting Lawyer Dog, Esq., rather than interpreting his words by their plain meaning, transcript ambiguity notwithstanding.
This Slate article also urged the U.S. Supreme Court to review the case (if it were to receive a petition from the as-yet-untried defendant). The Post's Tom Jackman joined the bandwagon, arguing that
When a friend says, “I’ll hit you up later dog,” he is stating that he will call again sometime. He is not calling the person a “later dog.”
But that’s not how the courts in Louisiana see it. .... It’s not clear how many lawyer dogs there are in Louisiana, and whether any would have been available to represent the human suspect in this case ... .
Yet, the case clearly does not turn on "the lack of a comma between 'lawyer' and 'dog,'" and Justice Crichton did not maintain that Mr. Demesme's request was too ambiguous because "lawyer" was followed by "dog." Public defender Derwyn D. Bunton contended that when Mr. Demesme said "with emotion and frustration, 'Just give me a lawyer,'" he "unequivocally and unambiguously asserted his right to counsel." (At least, this is what the Washington Post reported.) If this were all there was to the request, there would be no doubt that the police violated Miranda.

The problem for Mr. Demesme is that the "unambiguous" assertion "just give me a lawyer" did not stand alone. It was conditional. What he said was "if y'all think I did it, I know that I didn't do it so why don't you just give me a lawyer ...[?]" For Justice Crichton, the "if" was the source of the ambiguity. That ambiguity did not arise from the phrase "lawyer dog." It would have made no difference if the defendant had said "lawyer" without the "dog." Contrary to the media howling, Justice Crichton was not taking the phrase "lawyer dog" literally. He was taking the phrase "if y'all think" literally. Here is what the judge actually wrote:
I agree with the Court’s decision to deny the defendant’s writ application and write separately to spotlight the very important constitutional issue regarding the invocation of counsel during a law enforcement interview. The defendant voluntarily agreed to be interviewed twice regarding his alleged sexual misconduct with minors. At both interviews detectives advised the defendant of his Miranda rights and the defendant stated he understood and waived those rights. ... I believe the defendant ambiguously referenced a lawyer—prefacing that statement with “if y’all, this is how I feel, if y’all think I did it, I know that I didn’t do it so why don’t you just give me a lawyer dog cause this is not what’s up.”... In my view, the defendant’s ambiguous and equivocal reference to a “lawyer dog” does not constitute an invocation of counsel that warrants termination of the interview ... .
The Justice cited a Louisiana Supreme Court case and the U.S. Supreme Court case, Davis v. United States, 512 U.S. 452 (1994). In Davis, Naval Investigative Service agents questioned a homicide suspect after reciting Miranda warnings and securing his consent to be questioned. An hour and a half into the questioning, the suspect said "[m]aybe I should talk to a lawyer." At that point, "[a]ccording to the uncontradicted testimony of one of the interviewing agents, the interview then proceeded as follows:"
[We m]ade it very clear that we're not here to violate his rights, that if he wants a lawyer, then we will stop any kind of questioning with him, that we weren't going to pursue the matter unless we have it clarified is he asking for a lawyer or is he just making a comment about a lawyer, and he said, [']No, I'm not asking for a lawyer,' and then he continued on, and said, 'No, I don't want a lawyer.'
They took a short break, after which "the agents reminded petitioner of his rights to remain silent and to counsel. The interview then continued for another hour, until petitioner said, 'I think I want a lawyer before I say anything else.' At that point, questioning ceased."

The Supreme Court held that the initial statement “[m]aybe I should talk to a lawyer,” coming after a previous waiver of the right to consult counsel and followed by the clarification that "I'm not asking for a lawyer," could be deemed too equivocal and ambiguous to require the police to terminate the interrogation immediately.

The Louisiana case obviously is different. Police did not seek any clarification of the remark about a lawyer "if y'all think I did it." From what has been reported, they continued without missing a beat. However, in the majority opinion for the Court, Justice Sandra Day O'Connor went well beyond the facts of the Davis case to write that
Of course, when a suspect makes an ambiguous or equivocal statement it will often be good police practice for the interviewing officers to clarify whether or not he actually wants an attorney. That was the procedure followed by the NIS agents in this case. Clarifying questions help protect the rights of the suspect by ensuring that he gets an attorney if he wants one, and will minimize the chance of a confession being suppressed due to subsequent judicial second-guessing as to the meaning of the suspect's statement regarding counsel. But we decline to adopt a rule requiring officers to ask clarifying questions. If the suspect's statement is not an unambiguous or unequivocal request for counsel, the officers have no obligation to stop questioning him.
The Louisiana courts -- and many others -- have taken this dictum -- repudiated by four concurring Justices -- to heart. Whether it should ever apply and whether Justice Crichton's application of it to the "if ..." statement is correct are debatable. But no responsible and knowledgeable journalist could say that the case turned on an untranscribed comma or on the difference between "lawyer" and "lawyer dog." The opinion may be wrong, but it is clearly unfair to portray it as "willful ignorance" and "utterly absurd." The majority opinion in Davis and the cases it has spawned are fair game (and the Post article pursues that quarry), but the writing about the dispositive role of the lawyer dog meme in the Louisiana case is barking up the wrong tree.

Friday, October 27, 2017

Dodging Daubert to Admit Bite Mark Evidence

At a symposium for the Advisory Committee on the Federal Rules of Evidence, Chris Fabricant juxtaposed two judicial opinions about bite-mark identification. To begin with, in Coronado v. State, 384 S.W.3d 919 (Tex. App. 2012), the Texas Court of Appeals deemed bite mark comparison to be a “soft science” because it is “based primarily on experience or training.” It then applied a less rigorous standard of admissibility than that for a “hard science.”

The state’s expert dentist, Robert Williams, “acknowledged that there is a lack of scientific studies testing the reliability of bite marks on human skin, likely due to the fact that few people are willing to submit to such a study. However, he did point out there was one study on skin analysis conducted by Dr. Gerald Reynolds using pig skin, ‘the next best thing to human skin.’” The court did not state what the pig skin study showed, but it must have been apparent to the court that direct studies of the ability of dentists to distinguish among potential sources of bite marks were all but nonexistent.

That dentists have a way to exclude and include suspects as possible biters with rates of accuracy that are known or well estimated is not apparent. Yet, the Texas appellate court upheld the admission of the "soft science" testimony without discussing whether it was presented as hard science, as "soft science," or as nonscientific expert testimony.

A trial court in Hillsborough County, Florida, went a step further. Judge Kimberly K. Fernandez wrote that
During the evidentiary hearing, the testimony revealed that there are limited studies regarding the accuracy or error rate of bite mark identification, 3/ and there are no statistical databases regarding uniqueness or frequency in dentition. Despite these factors, the Court finds that this is a comparison-based science and that the lack of such studies or databases is not an accurate indicator of its reliability. See Coronado v. State, 384 S.W. 3d 919 (Tex. App. 2012) ("[B]ecause bite mark analysis is based partly on experience and training, the hard science methods of validation such as assessing the potential rate of error, are not always appropriate for testing its reliability.")
The footnote added that "One study in 1989 reflected that there was a 63% error rate." This is a remarkable addition. Assuming "the error rate" is a false-positive rate for a task comparable to the one in the case, it is at least relevant to the validity of bite-mark evidence. In Coronado, the Texas court found the absence of validation research not preclusive of admissibility. That was questionable enough. But in O'Connell, the court found research that contradicted any claim of validity “inappropriate” to consider! That turns Daubert on its head.

Friday, October 20, 2017

"Probabilistic Genotyping," Monte Carlo Methods, and the Hydrogen Bomb

Many DNA samples found in criminal investigations contain DNA from several people. A number of computer programs seek to "deconvolute" these mixtures -- that is, to infer the several DNA profiles that are mushed together in the electrophoretic data. The better ones do so using probability theory and an estimation procedure known as a Markov Chain Monte Carlo (MCMC) method. These programs are often said to perform "probabilistic genotyping." Although both words in this name are a bit confusing, 1/ lawyers should appreciate that the inferred profiles are just possibilities, not certainties. At the same time, some may find the idea of using techniques borrowed from a gambling casino (in name at least) disturbing. Indeed, I have heard the concern that "You know, don't you, that if the program is rerun, the results can be different!"

The answer is, yes, that is the way the approximation works. Using more steps in the numerical process also could give different output, but would we expect the further computations to make much of a difference? Consider a physical system that computes the value of π. I am thinking of Buffon's Needle. In 1777, Georges-Louis Leclerc, the Count of Buffon, imagined "dropping a needle on a lined sheet of paper and determining the probability of the needle crossing one of the lines on the page." 2/ He found that the probability is directly related to π. For example, if the length of the needle and the distance between the lines are identical, one can estimate π as twice the number of drops divided by the number of hits. 3/ Repeating the needle-dropping procedure the same number of times will rarely give exactly the same answer. (Note that pooling the results for two runs of the procedure is equivalent to one run with twice as many needle drops.) For a very large number of drops, however, the approximation should be pretty good.
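For readers who want to see the run-to-run variation directly, here is a minimal Python sketch of the needle-dropping procedure. It is an illustration under stated assumptions rather than anyone's production code: the needle length and the line spacing are both set to 1, the function name estimate_pi_buffon and its seed parameter are my own inventions, and the simulation uses the library value of π to draw a random angle, so it demonstrates sampling variability rather than an independent derivation of π.

```python
import math
import random


def estimate_pi_buffon(num_drops, seed=None):
    """Estimate pi by simulating Buffon's needle (needle length == line spacing == 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_drops):
        # Distance from the needle's center to the nearest line.
        center = rng.uniform(0.0, 0.5)
        # Acute angle between the needle and the lines (uses math.pi; see lead-in).
        angle = rng.uniform(0.0, math.pi / 2)
        # The needle crosses a line if half its projected length reaches that line.
        if center <= 0.5 * math.sin(angle):
            hits += 1
    if hits == 0:
        raise ValueError("no hits; use more drops")
    # Twice the number of drops divided by the number of hits.
    return 2.0 * num_drops / hits


if __name__ == "__main__":
    for drops in (100, 10_000, 1_000_000):
        # Three "reruns" with different seeds for each sample size.
        estimates = [estimate_pi_buffon(drops, seed=s) for s in (1, 2, 3)]
        print(drops, [round(e, 4) for e in estimates])
```

Running the script prints a separate estimate for each seed. The estimates generally differ from one rerun to the next, and they tend to cluster more tightly around π as the number of drops grows, which is the behavior described above.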

MCMC computations are more complicated. They simulate a random walk that samples values of a random variable so as to ascertain a posterior probability distribution. The walk could get stuck for a long time in a particular region. Nevertheless, the general approach is very well established in statistics, and Monte Carlo methods are widely used throughout the sciences. 4/ Indeed, they were integral to the development of nuclear weapons. 5/ The book, Dark Sun: The Making of the Hydrogen Bomb, provides the following account:
On leave from the university, resting at home during his extended recovery [from a severe brain infection], [Stanislaw] Ulam amused himself playing solitaire. Sensitivity to patterns was part of his gift. He realized that he could estimate how a game would turn out if he laid down a few trial cards and then noted what proportion of his tries were successful, rather than attempting to work out all the possible combinations in his head. "It occurred to me then," he remembers, "that this could be equally true of all processes involving branching of events." Fission with its exponential spread of reactions was a branching process; so would the propagation of thermonuclear burning be. "At each stage of the [fission] process, there are many possibilities determining the fate of the neutron. It can scatter at one angle, change its velocity, be absorbed, or produce more neutrons by a fission of the target nucleus, and so on." Instead of trying to derive the expected outcomes of these processes with complex mathematics, Ulam saw, it should be possible to follow a few thousand individual sample particles, selecting a range for each particle's fate at each step of the way by throwing in a random number, and take the outcomes as an approximate answer—a useful estimate. This iterative process was something a computer could do. ...[W]hen he told [John] von Neumann about his solitaire discovery, the Hungarian mathematician was immediately interested in what he called a "statistical approach" that was "very well suited to a digital treatment." The two friends developed the mathematics together and named the procedure the Monte Carlo method (after the famous gaming casino in Monaco) for the element of chance it incorporated. 6/
Even without a computer in place, Los Alamos laboratory staff, including a "bevy of young women who had been hastily recruited to grind manually on electric calculators," 7/ performed preliminary calculations examining the feasibility of igniting a thermonuclear reaction. As Ulam recalled:
We started work each day for four to six hours with slide rule, pencil and paper, making frequent quantitative guesses. ... These estimates were interspersed with stepwise calculations of the behavior of the actual motions [of particles] ... The real times for the individual computational steps were short ... and the spatial subdivisions of the material assembly very small. ... The number of individual computational steps was therefore very large. We filled page upon page with calculations, much of it done by [Cornelius] Everett. In the process he almost wore out his own slide rule. ... I do not know how many man hours were spent on this problem. 8/
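To connect this history back to the rerun concern raised earlier, the following is a toy Metropolis sampler, one of the simplest MCMC algorithms. It is only a sketch under assumptions of my own and is not how any probabilistic genotyping program is implemented. The target is the posterior distribution of a proportion after observing 7 successes in 10 trials with a uniform prior (made-up numbers), for which the exact posterior mean is known to be 8/12. The function names, step size, and burn-in length are likewise hypothetical choices.

```python
import math
import random


def log_posterior(p, successes, trials):
    """Unnormalized log posterior of a proportion p under a uniform prior."""
    if not 0.0 < p < 1.0:
        return -math.inf  # zero posterior density outside (0, 1)
    return successes * math.log(p) + (trials - successes) * math.log(1.0 - p)


def metropolis_posterior_mean(successes, trials, steps, seed,
                              step_size=0.2, burn_in=1000):
    """Estimate the posterior mean of p with a random-walk Metropolis sampler."""
    if steps <= burn_in:
        raise ValueError("steps must exceed burn_in")
    rng = random.Random(seed)
    current = 0.5
    current_lp = log_posterior(current, successes, trials)
    total = 0.0
    for i in range(steps):
        # Symmetric random-walk proposal around the current value.
        proposal = current + rng.uniform(-step_size, step_size)
        proposal_lp = log_posterior(proposal, successes, trials)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:
            total += current  # the current value is counted even after a rejection
    return total / (steps - burn_in)


if __name__ == "__main__":
    print("exact posterior mean:", 8 / 12)
    for seed in (1, 2, 3):
        # Each seed plays the role of "rerunning the program".
        print(seed, metropolis_posterior_mean(7, 10, steps=50_000, seed=seed))
```

Each seed yields a slightly different estimate, yet with enough steps the estimates should all sit close to the exact value. Whether the differences between reruns are small enough to ignore is the practical question, and a walk that gets stuck in one region for a long time is the situation in which they might not be.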
  1. In forensic DNA work, probabilities also are presented to explain the probative value of the discovery of a "deterministic" DNA profile -- one that is treated as known to a certainty. See David H. Kaye, SWGDAM Guidelines on "Probabilistic Genotyping Systems" (Part 2), Forensic Sci., Stat. & L., Oct. 25, 2015. In addition, the "genotypes" in "probabilistic genotyping" do not refer to genes.
  2. Office for Mathematics, Science and Technology Education, College of Education, University of Illinois, Buffon's Needle: An Analysis and Simulation.
  3. Id.
  4. See, e.g., Persi Diaconis, The Markov Chain Monte Carlo Revolution, 46 Bull. Am. Math. Soc'y 179-205 (2009); Sanjib Sharma, Markov Chain Monte Carlo Methods for Bayesian Data Analysis in Astronomy, arXiv:1706.01629 [astro-ph.IM], https://doi.org/10.1146/annurev-astro-082214-122339.
  5. Roger Eckhardt, Stan Ulam, John von Neumann, and the Monte Carlo Method, Los Alamos Sci., Special Issue 1987, pp. 131-41.
  6. Richard Rhodes, Dark Sun: The Making of the Hydrogen Bomb 303-04 (1995).
  7. Id. at 423 (quoting Françoise Ulam).
  8. Id.

Wednesday, October 11, 2017

District Court Rejects Defendant's Reliance on PCAST Report as a Reason to Exclude Fingerprint Evidence

Yesterday, the U.S. District Court for the Northern District of Illinois rejected a defendant's motion to exclude a latent fingerprint identification on the theory that "the method used is not sufficiently reliable foundationally or as applied to his case." 1/ The court also precluded, as too "distracting," cross-examination of the FBI latent-print examiner about the FBI's infamous error in apprehending Brandon Mayfield as the Madrid train bomber.

I. "Foundational Validity"

The "foundational validity" challenge came out of the pages of the 2016 PCAST Report. 2/ The PCAST Report seems to equate what it calls "foundational validity" for subjective pattern-matching methods to multiple "black box studies" demonstrating false-positive error probabilities of 5% or less, and it argues that Federal Rule of Evidence 702 requires such a showing of validity.

That this challenge would fail is unsurprising. According to the Court of Appeals for the Seventh Circuit, latent print analysis need not be scientifically valid to be admissible. Furthermore, even if the Seventh Circuit were to reconsider this questionable approach to the admissibility of applications of what the FBI and DHS call "the science of fingerprinting," the PCAST Report concludes that latent print comparisons have foundational scientific validity as defined above.

A. The Seventh Circuit Opinion in Herrera

Scientific validity is not a foundational requirement in the legal framework applied to latent print identification by the Court of Appeals for the Seventh Circuit. In United States v. Herrera, 3/ Judge Richard Posner 4/ observed that "the courts have frequently rebuffed" any "frontal assault on the use of fingerprint evidence in litigation." 5/ Analogizing expert comparisons of fingerprints to "an opinion offered by an art expert asked whether an unsigned painting was painted by the known painter of another painting" and even to eyewitness identifications, 6/ the court held these comparisons admissible because "expert evidence is not limited to 'scientific' evidence," 7/ the examiner was "certified as a latent print examiner by the International Association for Identification," 8/ and "errors in fingerprint matching by expert examiners appear to be very rare." 9/ To reach the last -- and most important -- conclusion, the court relied on the lack of cases of fingerprinting errors within a set of DNA-based exonerations (without indicating how often fingerprints were introduced in those cases), and its understanding that the "probability of two people in the world having identical fingerprints ... appears to be extremely low." 10/

B. False-positive Error Rates in Bonds

In the new district court case of United States v. Bonds, Judge Sara Ellis emphasized the court of appeals' willingness to afford district courts "wide latitude in performing [the] gate-keeping function." Following Herrera (as she had to), she declined to require "scientific validity" for fingerprint comparisons. 11/ This framework deflects or rejects most of the PCAST Report's legal reasoning about the need for scientific validation of all pattern-matching methods in criminalistics. But even if "foundational validity" were required, the PCAST Report -- while far more skeptical of latent print work than was the Herrera panel -- is not so skeptical as to maintain that latent print identification is scientifically invalid. Judge Ellis quoted the PCAST Report's conclusion that "latent fingerprint analysis is a foundationally valid subjective methodology—albeit with a false positive rate that is substantial and is likely to be higher than expected by many jurors based on longstanding claims about the infallibility of fingerprint analysis."

Bonds held that the "higher than expected" error rates were not so high as to change the Herrera outcome for nonscientific evidence. Ignoring other research into the validity of latent-print examinations, Judge Ellis wrote that "[a]n FBI study published in 2011 reported a false positive rate (the rate at which the method erroneously called a match between a known and latent print) of 1 in 306, while a 2014 Miami-Dade Police Department Forensic Services Bureau study had a false positive rate of 1 in 18."

Two problems with the sentence are noteworthy. First, it supplies an inaccurate definition of a false positive rate. "[T]he rate at which the method erroneously called a match between a known and latent print" would seem to be an overall error rate for positive associations (matches) in the sample of prints and examiners who were studied. For example, if the experiment used 50 different-source pairs of prints and 50 same-source pairs, and if the examiners declared 5 matches for the different-source pairs and 5 for the same-source pairs, the erroneous matches are 5 out of 100 comparisons, for an error rate of 5%. However, the false-positive rate is the proportion of positive associations reported for different-source prints. When comparing the 50 different-source pairs, the examiners erred in 5 instances, for a false-positive rate of 5/50 = 10%. In the 50 same-source pairs, there were no opportunities for a false positive. Thus, the standard definition of a false-positive error rate gives the estimate of 0.1 for the false-positive probability. This definition makes sense because none of the same-source pairs in the sample can contribute to false-positive errors.
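The arithmetic in the hypothetical experiment can be checked with a few lines of Python. This is a minimal sketch using only the made-up counts from the example above; the function names are illustrative.

```python
def overall_erroneous_match_rate(false_matches, n_different, n_same):
    """Erroneous matches as a share of ALL comparisons (the court's wording)."""
    return false_matches / (n_different + n_same)

def false_positive_rate(false_matches, n_different):
    """Erroneous matches as a share of different-source comparisons only."""
    return false_matches / n_different

# Hypothetical experiment from the text: 50 different-source pairs,
# 50 same-source pairs, 5 matches declared on different-source pairs.
overall = overall_erroneous_match_rate(false_matches=5, n_different=50, n_same=50)
fpr = false_positive_rate(false_matches=5, n_different=50)
print(f"overall erroneous-match rate: {overall:.0%}")
print(f"false-positive rate: {fpr:.0%}")
```

The two functions differ only in the denominator, which is the whole point: same-source pairs dilute the first rate but have no place in the second.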

Second, the sentence misstates the false positive rates reported in the two studies. Instead of "1 in 306," the 2011 Noblis-FBI experiment found that "[s]ix false positives occurred among 4,083 VID [value for identification] comparisons of nonmated pairs ... ." 12/ In other words (or numbers), the reported false-positive rate (for an examiner without the verification-by-another-examiner step) was 6/4083, or about 1 in 681 (roughly 0.15%). This is the only false-positive rate in the body of the study. An online supplement to the article includes "a 95% confidence interval of 0.06% to 0.3% [1 in 1668 to 1 in 333]." 13/ A table in the supplement also reveals that, excluding conclusions of "inconclusive" from the denominator, as is appropriate from the standpoint of judges or jurors, the rate is 6/3628, which corresponds to about 1 in 605.

Likewise, the putative rate of 1/18 does not appear in the unpublished Miami-Dade study. A table in the report to a funding agency states that the "False Positive Rate" was 4.2% "Without Inconclusives." 14/ This percentage corresponds to about 1 in 24.

So where did the court get its numbers? They apparently came from a gloss in the PCAST Report. That report gives an upper (but not a lower) bound on the false-positive rates that would be seen if the studies used an enormous number of random samples of comparisons (instead of just one). Bending over backwards to avoid incorrect decisions against defendants, PCAST stated that the Noblis-FBI experiment indicated that "the rate could be as high as 1 error in 306 cases" and that the numbers in the Miami-Dade study admit of an error rate that "could be as high as 1 error in 18 cases." 15/ Of course, the error rates in the hypothetical infinite population could be even higher. Or they could be lower.
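To see how an upper bound can sit so far above the observed rate, here is a rough sketch of a one-sided 95% exact binomial (Clopper-Pearson) upper bound applied to the 6/3628 figure from the Noblis-FBI supplement. The choice of that interval method and of that denominator are assumptions made for illustration; the quoted passage does not say exactly how PCAST computed its bound.

```python
import math

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def upper_bound(x, n, alpha=0.05):
    """One-sided exact upper bound: the p at which seeing x or fewer
    errors in n trials would have probability alpha."""
    lo, hi = x / n, 1.0
    for _ in range(200):          # bisection; the CDF falls as p rises
        mid = (lo + hi) / 2
        if binom_cdf(x, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

observed = 6 / 3628
bound = upper_bound(6, 3628)
print(f"observed rate: 1 in {1 / observed:.0f}")
print(f"95% upper bound: 1 in {1 / bound:.0f}")
```

Under these assumptions the bound comes out at roughly half the "1 in" figure of the observed rate, in the neighborhood of the "1 in 306" that PCAST reported. The same procedure with a lower tail would give a correspondingly smaller rate, which is why quoting only the upper end is one-sided in more than the statistical sense.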

II. Discussing Errors at Trial

The PCAST Report accepts the longstanding view that traces of the patterns in friction ridge skin can be used to associate latent prints that contain sufficient detail with known prints. But it opens the door to arguments about the possibility of false positives. Bonds wanted to confine the analyst to presenting the matching features or, alternatively, to have the analyst declare a match but add that the "level of certainty of a purported match is limited by the most conservative reported false positive rate in an appropriately designed empirical study thus far (i.e., the 1 in 18 false positive rate from the 2014 Miami-Dade study)."

Using a probability of 1 in 18 to describe the "level of certainty" for the average positive association made by examiners like those studied to date seems "ridiculous." Cherry-picking a distorted number from a single study is hardly sound reasoning. And even if 1/18 were the best estimate of the false-positive probability that can be derived from the totality of the scientific research, applying it to explain the "level of certainty" one should have in the examiner's conclusion would not be straightforward. For one thing, the population-wide false-positive probability is not the probability that a given positive finding is false! Three distinct probabilities come into play. 16/ Explaining the real meaning of an estimate of the false-positive probability from PCAST's preferred "black-box" studies in court will be challenging for lawyers and criminalists alike. Merely to state that a number like 1/18 goes to "the weight of the evidence" and can be explored "on cross examination," as Judge Ellis did, is to sweep this problem under the proverbial rug -- or to put it aside for another day.
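One way to see why a false-positive probability is not the probability that a reported match is wrong is to run Bayes' rule with different starting points. The sketch below is illustrative only: the sensitivity of 0.9 and the prior probabilities are invented numbers, and the three inputs used here are one common way of framing the problem rather than necessarily the three probabilities discussed in note 16.

```python
def prob_reported_match_is_false(fpr, sensitivity, prior_same_source):
    """P(different source | match reported), by Bayes' rule.

    fpr               = P(match reported | different source)
    sensitivity       = P(match reported | same source)
    prior_same_source = P(same source) before considering the comparison
    """
    false_matches = fpr * (1 - prior_same_source)
    true_matches = sensitivity * prior_same_source
    return false_matches / (false_matches + true_matches)

FPR = 1 / 18          # the disputed figure, used only for illustration
SENSITIVITY = 0.9     # hypothetical value, not taken from any study

even_prior = prob_reported_match_is_false(FPR, SENSITIVITY, prior_same_source=0.5)
low_prior = prob_reported_match_is_false(FPR, SENSITIVITY, prior_same_source=0.01)
print(f"prior 50%: P(match is false) = {even_prior:.3f}")
print(f"prior  1%: P(match is false) = {low_prior:.3f}")
```

With the same 1/18 false-positive probability, the chance that a reported match is false moves from a few percent to a large majority as the prior drops, so no single "level of certainty" follows from the error rate alone.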

  1. United States v. Myshawn Bonds, No. 15 CR 573-2 (N.D. Ill. Oct. 10, 2017).
  2. Executive Office of the President, President’s Council of Advisors on Science and Technology, Report to the President: Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods (Sept. 2016).
  3. 704 F.3d 480 (7th Cir. 2013).
  4. For remarks on another opinion from the judge, see Judge Richard Posner on DNA Evidence: Missing the Trees for the Forest?, Forensic Sci., Stat. & L., July 19, 2014.
  5. Herrera, 704 F.3d at 484.
  6. Id. at 485-86.
  7. Id. at 486.
  8. Id.
  9. Id. at 487.
  10. Id.
  11. Judge Ellis stated that she "agree[d] with Herrera's broader reading of Rule 702's reliability requirement."
  12. Bradford T. Ulery, R. Austin Hicklin, JoAnn Buscaglia, & Maria Antonia Roberts, Accuracy and Reliability of Forensic Latent Fingerprint Decisions, 108(19) Proc. Nat’l Acad. Sci. (USA) 7733-7738 (2011).
  13. Available at
  14. Igor Pacheco, Brian Cerchiai, Stephanie Stoiloff, Miami-Dade Research Study for the Reliability of the ACE-V Process: Accuracy & Precision in Latent Fingerprint Examinations, Dec. 2014, at 53 tbl. 4.
  15. PCAST Report, supra note 2, at 94-95.
  16. The False-Positive Fallacy in the First Opinion to Discuss the PCAST Report, Forensic Sci., Stat. & L., November 3, 2016.

Friday, October 6, 2017

Should Forensic-science Standards Be Open Access?

The federal government has spent millions of dollars to generate and improve standards for performing forensic-science tests through the Organization of Scientific Area Committees for Forensic Science (OSAC). Yet, it does not require open access to the standards placed on its "OSAC Registry of Approved Standards." Perhaps that can be justified for existing standards that are the work of other authors -- as is the case for some pre-existing standards that have made it to the Registry. But shouldn't standards that are written by OSAC at public expense be available to the public rather than controlled by private organizations?

When the American Academy of Forensic Sciences (AAFS) established a Standards Board (the ASB) to "work closely with the [OSAC] Forensic Science Standards Board and its subcommittees, which are dedicated to creating a national registry of forensic standards," 1/ ASB demanded the copyright to all standards, no matter how little or how much it contributes to the writing of the standards. It insists that "the following disclaimer shall appear on all ASB published and draft documents:
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or on an intranet, without prior written permission from the Academy Standards Board, American Academy of Forensic Sciences, 410 North 21st Street, Colorado Springs, CO 80904,
Copyright © AAFS Standards Board [year]
Moreover, "[u]nless expressly agreed otherwise by the ASB, all material and information that is provided by participants and is incorporated into an ASB document is considered the sole and exclusive property of the AAFS Standards Board. Individuals shall not copy or distribute final or draft documents without the authorization of the ASB staff." 2/

The phrasing "is considered" departs from the ASB's own guidance that "[t]he active voice should be used in sentences." 3/ Who considers draft documents written by OSAC members "the sole and exclusive property of the AAFS Standards Board"? The ASB? The OSAC? The courts? Why should they? OSAC is not furthering the public interest by giving a private organization a monopoly over its work products. It should retain the copyright and replace the AAFS's unenforceable 4/ "no copying, no distributing" philosophy with a Creative Commons Attribution license.

  1. Foreword, ASB Style Guide Manual for Standards, Technical Reports and Best Practice Recommendations (2016).
  2. Id. at 12.
  3. Id. at 1.
  4. The asserted restriction on reproduction cannot be enforced literally because many reproductions are fair uses of the copyrighted material. That is what allows me to reproduce the material quoted in this posting without ASB's permission. Arguably, reproducing an entire standard for noncommercial purposes would fall under the open-textured fair-use exception of 17 U.S.C. § 107.

Sunday, September 24, 2017

How Experts (Mis)Represent Likelihood Ratios for DNA Evidence

Earlier this month, I noted the tendency of journalists to misconstrue a likelihood ratio as odds or probabilities in favor of a source hypothesis. I mentioned expressions such as "the likelihood that a suspect’s DNA is present in a mixture of substances found at a crime scene" and "the probability, weighed against coincidence, that sample X is a match with sample Y." In place of such garbled descriptions, I proposed that
Putting aside all other explanations for the overlap between the mixture and the suspect's alleles -- explanations like relatives or some laboratory errors--this likelihood ratio indicates how much the evidence changes the odds in favor of the suspect’s DNA being in the mixture. It quantifies the probative value of the evidence, not the probability that one or another explanation of the evidence is true.
The journalists' misstatements occurred in connection with likelihood ratios involving DNA mixtures, but even experts in forensic inference make the same mistake in simpler situations. The measure of probative value for single-source DNA is a more easily computed likelihood ratio (LR). Unfortunately, it is very easy to describe LRs in ways that invite misunderstanding. Below are two examples:
[I]n the simplest case of a complete, single-source evidence profile, the LR expression reverts to the reciprocal of the profile frequency. For example: Profile frequency = 1/1,000,000 [implies] LR = P(E|H1) / P(E|H2) = 1 / 1/1,000,000 = 1,000,000/1 = 1,000,000 (or 1 million). This could be expressed in words as, "Given the DNA profile found in the evidence, it is 1 million times more likely that it is from the suspect than from another random person with the same profile." -- Norah Rudin & Keith Inman, An Introduction to Forensic DNA Analysis 148-49 (2d ed. 2002).
Comment: If "another random person" had "the same profile," there would be no genetic basis for distinguishing between this individual and the suspect. So how could the suspect possibly be a million times more likely to be the source?
A likelihood ratio is a ratio that compares the likelihood of two hypotheses in the light of data. [I]n the present case there are two hypotheses: the sperm came from twin A or the sperm came from twin B, and then you calculate the likelihood of each hypotheses in the face or in the light of the data, and then you form the ratio [LR] of the two. So the ratio tells you how much more likely one hypothesis is than the other in the light of the experimental data. --Testimony of Michael Krawczak in a pretrial hearing on a motion to exclude evidence in Commonwealth v. McNair, No. 8414CR10768 (Super. Ct., Suffolk Co., Mass.) (transcript, Feb. 15, 2017).
Comment: Defining "likelihood" as a quantity proportional to the probability of data given the hypothesis, the first sentence is correct. But this definition was not provided, and the second sentence further suggests that the "experimental data" makes one twin LR times more probable to be the source than the other. That conclusion is correct only if the prior odds are equal -- an assumption that does not rest on those data.
With this kind of prose and testimony, is it any surprise that courts write that "[t]he likelihood ratio 'compares the probability that the defendant was a contributor to the sample with the probability that he was not a contributor to the sample'"? Commonwealth v. Grinkley, 75 Mass.App.Ct. 798, 803, 917 N.E.2d 236, 241 (Mass. Ct. App. 2009) (quoting Commonwealth v. McNickles, 434 Mass. 839, 847, 753 N.E.2d 131 (2001)).
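The difference between a likelihood ratio and the odds on a hypothesis is easy to show numerically. In the sketch below, the LR of 1,000,000 comes from the Rudin and Inman example above; the prior odds are hypothetical values chosen only to show how much the conclusion depends on them.

```python
def posterior_odds(likelihood_ratio, prior_odds):
    """Bayes' rule in odds form: posterior odds = LR x prior odds."""
    return likelihood_ratio * prior_odds

def odds_to_probability(odds):
    return odds / (1 + odds)

LR = 1_000_000  # from the single-source example quoted above

# Hypothetical prior odds that the suspect is the source.
even = odds_to_probability(posterior_odds(LR, prior_odds=1.0))
long_shot = odds_to_probability(posterior_odds(LR, prior_odds=1 / 10_000_000))
print(f"prior odds 1 to 1:          P(source) = {even:.6f}")
print(f"prior odds 1 to 10 million: P(source) = {long_shot:.4f}")
```

The same LR yields near certainty in one case and a probability under 10% in the other. The LR describes how far the evidence moves the odds; only with equal prior odds does it coincide with the odds themselves, which is the assumption hidden in the statements quoted above.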