Saturday, November 18, 2017

A New Breed of Peer Reviewer

Dr. Olivia Doll has ample academic credentials on her C.V. She holds a doctorate in Canine Studies from the Subiaco College of Veterinary Science (Dissertation: Canine Responses to Avian Proximity); a master’s degree in Early Canine Studies from Shenton Park Institute for Canine Refuge Studies; and a bachelor’s degree from the Staffordshire College of Territorial Science. Her research includes such topics as the relationships between Doberman Pinschers and Staffordshire Terriers in domestic environments, the role of domestic canines in promoting optimal mental health in ageing males, the impact of skateboards on canine ambulation, and the benefits of abdominal massage for medium-sized canines.

However, the Shenton Park “Institute” is the animal shelter in which she lived as a pup, and there is no College of Veterinary Science in Subiaco. Her expertise in canine studies comes strictly from her skill, experience, and training (not to mention her instincts) as a five-year-old Staffordshire Terrier, AKA Ollie.

Nevertheless, this tongue-in-jowl record was sufficient to induce the following flaky academic journals to add the pit bull from Perth to their editorial boards:
■ *EC Pulmonary and Respiratory Medicine, Echronicon Open Access
■ Journal of Community Medicine and Public Health Care, Herald Scholarly Open Access
■ Journal of Tobacco Stimulated Diseases, Peertechz
■ Journal of Alcohol and Drug Abuse, SMGroup
■ Alzheimer’s and Parkinsonism: Research and Therapy, SMGroup
■ *Journal of Psychiatry and Mental Disorders, Austin Publishing Group
■ *Global Journal of Addiction and Rehabilitation Medicine (Associate Editor), Juniper Publishers
■ Austin Addiction Sciences, Austin Publishing Group
Ollie's exploits received international coverage in newspapers and in Science in May. Yet, Dr. Doll remains on the websites of the journals marked with an asterisk (accessed November 18, 2017). I do not know if she is still reviewing papers for the Journal of Community Medicine and Public Health Care, for that journal does not list names of editors or reviewers.


Friday, November 17, 2017

Comedic License

The other day, I mentioned a satire by the comedian John Oliver on developments in forensic science. A few minutes of the video was played for an international audience at Harvard Law School on October 27 in the introduction to a panel discussion entitled "Evidence, Science, and Reason in an Era of 'Post-truth' Politics." The clip included remarks from Santae Tribble and his lawyer about hair evidence in the highly (and deservedly) publicized exoneration of Mr. Tribble. 1/ I got the impression from the Oliver video that an FBI criminalist claimed that there was but a 1 in 10 million chance that the hair could have come from anyone else. In Oliver's recounting of the case,
Take Santae Tribble, who was convicted of murder, and served 26 years in large part thanks to an FBI analyst who testified that his hair matched hair left at the scene, and he will tell you the evidence was presented as being rock solid. "They said they matched my hair in all microscopical characteristics. And that's the way they presented it to the jury, and the jury took it for granted that that was my hair."

You know, I can see why they did. Who other than an FBI expert would possibly know that much about hair? ... Jurors in Tribble's case were actually told that there was one chance in ten million that it could be someone else's hair.
I have not seen the trial transcript, but a Washington Post reporter, who presumably has, provided this account:
In Tribble’s case, the FBI agent testified at trial that the hair from the stocking matched Tribble’s “in all microscopic characteristics.” In closing arguments, federal prosecutor David Stanley went further: “There is one chance, perhaps for all we know, in 10 million that it could [be] someone else’s hair.” 2/
If the Post is correct, it was not "an FBI expert" who "actually told" jurors that the probability the hair was not Tribble's was 1/10,000,000. It was an aggressive prosecutor who made up a number that he imagined "perhaps ... could" be true. Like the similar figure of 1/12,000,000 in the notorious case of People v. Collins, 438 P.2d 33 (Cal. 1968), this number plainly was objectionable.

In distinguishing between the criminalist's testimony and the lawyer's argument, I do not mean to be too critical of Oliver's condensed version and passive voice. Sometimes a little oversimplification saves a lot of pedantic explanation. The sad fact is that, whichever participants in the trial were responsible for the 1/10,000,000 figure, Tribble's case exemplifies Oliver's earlier observation that "[t]he problem is, not all forensic science is as reliable as we have become accustomed to believing. ... It's not that all forensic science is bad, because it's not. But too often, its reliability is dangerously overstated ... ." 3/

  1. See The FBI's Worst Hair Days, July 31, 2014, Forensic Sci., Stat. & L.,
  2. Spencer S. Hsu, Santae Tribble Cleared in 1978 Murder Based on DNA Hair Test, Wash. Post, Dec. 14, 2012, (emphasis added).
  3. Cf. David H. Kaye, Ultracrepidarianism in Forensic Science: The Hair Evidence Debacle, 72 Wash. & Lee L. Rev. Online 227 (2015),

After composing these remarks, I located the 2012 motion to vacate Tribble's conviction. (Motion to Vacate Conviction and Dismiss Indictment With Prejudice on the Grounds of Actual Innocence Under the Innocence Protection Act, United States v. Tribble, Crim. No. F-4160-78 (D.C. Super. Ct. Jan. 18, 2012).) This document contains more extended excerpts from the trial transcript. These indicate that although the FBI agent, James Hilverda, did not generate a nonsource probability of 1/10,000,000, he did insist that it was at least "likely" that Tribble was the source, and he indicated that the probability of a coincidental match could be on the order of one in a thousand.
Q. Is it possible that two individuals could have hairs with the same characteristics?
A. It is possible, but my personal experience, I rarely have seen it. Only on very rare occasions have I seen hairs of two individuals that show the same characteristics. Usually when we find hairs that we cannot distinguish from two individuals, it is due to the fact that the hairs are either so dark that you cannot see sufficient characteristics, or they are too light that they do not have very many characteristic[s] to examine. So you cannot make a real good analysis in that manner. But there is a potential even with sufficient characteristics to once in a while to see hairs that you cannot distinguish.
Q. And is it because of that small possibility that two hairs can be from different people that you never give absolutely positive opinion that two hairs, in fact, did come from the same person.
A. That is correct.
Q. If you compare them and they microscopically match in all characteristics, could you say they could have come from the same person?
A. That's correct.
Q. Have you ever seen two hairs from two different people that microscopically match?
A. Well, it is possible in the thousands of hair examples I have done, it is very, very rare to find hairs from two different people that exhibit the same microscopic characteristics. It is possible, but as I said, very rare.
A. ... I found that these hairs — the hairs that I removed from the stocking matched in all microscopic characteristics with the head hair samples submitted to me from Santae Tribble. Therefore, I would say that this hair could have originated from Santae Tribble.
Q. Is there any microscopic characteristics [sic] of the known hair that did not match?
A. No. The hair that I aligned with Santae Tribble matched in all microscopic characteristics, all characteristics were the same.
Q. And were there a large number of characteristics which were the same or—
A. All the characteristics were the same and there was a sufficient number of characteristics to allow me to do my examinations.
A. ... I think you can identify individuals, say, as to race, you can an [sic] indication that it came from an individual because it matches in all characteristics and I would say that when I have-in my experience that I feel when I have made this type of examination, it is likely that it came from the individual which I depicted it as coming from.
Mrs. McCormick said at least one of the robbers, the only one that she saw, was wearing a stocking mask. A block from where that homicide occurred, more or less, through the alley and around the corner on a fresh trail found by a trained canine dog, was found what, a stocking mask. In that stocking mask was found what, a hair.
Now whose hair was that? Well it was compared by the FBI laboratory and they can say for one thing, it wasn't Cleveland Wright's hair. They can say that that hair matched in every microscopic characteristic, the hair of Santae Tribble. Now for scientific reasons, which were explained, they cannot say positively that that is Santae Tribble's hair because on rare occasions they have seen hairs from two different people that matched. But usually that is the case where there have been few characteristics present. Now Agent Hilverda told you that on these hairs, the known hairs of Santae Tribble and the hair found in the stocking, there were plenty of characteristics present, not that situation at all. But because the FBI is cautious, he cannot positively say that that is Santae Tribble's hair.
Well what did Special Agent Hilverda say, microscopic analysis. Throwing the large terms away, what did he say? "Could be." He looked at one hair, he looked at Santae Tribble's hair and said, could be.
[T]he dog found a stocking with a hair which not only could be Santae Tribble's, it exactly matches Santae Tribble's in every microscopic characteristic. And Mr. Potenza says throw away all the scientific terms and reduce it to "could be." Scientific terms are important. He told you how he compares hairs with powerful microscopes. It is not just the color or wave. He could reject Cleveland Wright's hair immediately; it wasn't Cleveland Wright's hair in that stocking. But he couldn't reject Santae Tribble's because it was exactly the same. And the only reason he said could be is because there is one chance, perhaps for all we know, in ten million that it could [be] someone else's hair. But what kind of coincidence is that?
... The hair is a great deal more than "could be." And if you listened to Agent Hilverda and what he really said, the hair exactly matched Santae Tribble's hair, found in the stocking.

Wednesday, November 15, 2017

It’s a Match! But What Is That?

When it comes to identification evidence, no one seems to know precisely what a match means. The comedian John Oliver used the term to riff on CSI and other TV shows in which forensic scientists or their machines announce devastating “matches.” The President’s Council of Advisors on Science and Technology could not make up their minds. The opening pages of their 2016 report included the following sentences:

[T]esting labs lacked validated and consistently-applied procedures ... for declaring whether two [DNA] patterns matched within a given tolerance, and for determining the probability of such matches arising by chance in the population (P. 2)
Here, a “match” is a correspondence in measurements, and it is plainly not synonymous with a proposed identification. The identification would be an inference from the matching measurements that could arise "by chance" or because the DNA samples being analyzed are from the same source.

By subjective methods, we mean methods including key procedures that involve significant human judgment—for example, about which features to select within a pattern or how to determine whether the features are sufficiently similar to be called a probable match. (P. 5 n.3)
Now it seems that “match” refers to “sufficiently similar” features and to an identification of a single, probable source of the traces with these similar features.

Forensic examiners should therefore report findings of a proposed identification with clarity and restraint, explaining in each case that the fact that two samples satisfy a method’s criteria for a proposed match does not mean that the samples are from the same source. For example, if the false positive rate of a method has been found to be 1 in 50, experts should not imply that the method is able to produce results at a higher accuracy. (P. 6)
Here, “proposed match” seems to be equated to “proposed identification.” (Or does “proposed match” mean that the degree of similarity a method uses to characterize the measurements as matching might not really be present in the particular case, but is merely alleged to exist?)

Later, the report argues that
Because the term “match” is likely to imply an inappropriately high probative value, a more neutral term should be used for an examiner’s belief that two samples come from the same source. We suggest the term “proposed identification” to appropriately convey the examiner’s conclusion ... . (Pp. 45-46.)
Is this a blanket recommendation to stop using the term “match” for an observed degree of similarity? It prompted the following rejoinder:
Most scientists would be comfortable with the notion of observing that two samples matched but would, rightly, refuse to take the logically unsupportable step of inferring that this observation amounts to an identification. 1/
I doubt that it is either realistic or essential to banish the word “match” from the lexicon for identification evidence. But it is essential to be clear about its meaning. As one textbook on interpreting forensic-science evidence cautions:
Yet another word that is the source of much confusion is 'match'. 'Match' can mean three different things:
• Two traces share some characteristic which we have defined and categorised, for example, when two fibres are both made of nylon.
• Two traces display characteristics which are on a continuous scale but fall within some arbitrarily defined distance of each other.
• Two traces have the same source, as implied in expressions such as 'probable match' or 'possible match'.
If the word 'match' must be used, it should be carefully defined. 2/
  1. I.W. Evett, C.E.H. Berger, J.S. Buckleton, C. Champod, & G. Jackson, Finding the Way Forward for Forensic Science in the US—A Commentary on the PCAST Report, 278 Forensic Sci. Int'l 16, 19 (2017). One might question whether “most scientists” should “be comfortable” with observing that two samples “matched” on a continuous variable (such as the medullary index of hair). Designating a range of matching and nonmatching values means that a close nonmatch is treated radically differently than an almost identical match at the edge of the matching zone. Ideally, a measure of probative value of evidence should not incorporate this discontinuity.
  2. Bernard Robertson, Charles E. H. Berger, and G. A. Vignaux, Interpreting Evidence: Evaluating Forensic Science in the Courtroom 63 (2d ed. 2016).

Saturday, November 4, 2017

Louisiana's Court of Appeals Brushes Aside PCAST Report for Fingerprints and Toolmark Evidence

A defense effort to exclude fingerprint and toolmark identification evidence failed in State v. Allen, No. 2017-0306, 2017 WL 4974768 (La. Ct. App., 1 Cir., Nov. 1, 2017). The court described the evidence in the following paragraphs:
The police obtained an arrest warrant for the defendant and a search warrant for his apartment. During the ... search ... , the police recovered a .40 caliber handgun and ammunition ... The defendant denied owning a firearm ... .
     BRPD [Baton Rouge Police Department] Corporal Darcy Taylor processed the firearm ... , lifting a fingerprint from the magazine of the gun and swabbing various areas of the gun and magazine. Amber Madere, an expert in latent print comparisons from the Louisiana State Police Crime Lab (LSPCL), examined the fingerprint evidence and found three prints sufficient to make identifications. The latent palm print from the magazine of the gun was identified as the defendant's left palm print.
     Patrick Lane, a LSPCL expert in firearms identification, examined the firearm and ammunition in this case. Lane noted that the firearm in evidence, was the same caliber as the cartridge cases. ... He further test-fired the weapon and fired reference ammunition from the weapon for comparison to the ammunition in evidence. Lane determined that based on the quality and the quantity of markings that were present on the evidence cartridge case and the multiple test fires, the weapon in evidence fired the cartridge case in evidence.
Defendant moved for "a Daubert hearing ... based on a report, released by the President's Council of Advisors on Science and Technology (PCAST) three days before the motion was filed, which called into question the validity of feature-comparison models of testing forensic evidence." The court of appeals decided that there had been such a hearing. The opinion is not explicit about the timing of the hearing. It suggests that it consisted of questions to the testifying criminalists immediately before they testified. In the court's words,
[B]efore the trial court's determination as to their qualification as experts and the admission of their expert testimony, Madere and Lane were thoroughly questioned as to their qualifications, and as to the reliability of the methodology they used, including the rates of false positives and error. The defendant was specifically allowed to reference the PCAST report during questioning. Thus, the trial court allowed a Daubert inquiry to take place in this case. The importance of the Daubert hearing is to allow the trial judge to verify that scientific testimony is relevant and reliable before the jury hears said testimony. Thus, the timing of the hearing is of no moment, as long as it is before the testimony is presented.
The court was correct to suggest that a "hearing" can satisfy Daubert even if it was not conducted before the trial. However, considering that the "hearing" involved only two prosecution witnesses, whether it should be considered "thorough" is not so clear.

As for the proof of scientific validity, the court pointed to a legal history of admissibility mostly predating the 2009 NRC report on Strengthening Forensic Science in the United States and the later PCAST report. It failed to consider a number of federal district court opinions questioning the type of expert testimony apparently used in the case (it is certain that "the weapon ... fired the cartridge case"). Yet, it insisted that "[c]onsidering the firmly established reliability of fingerprint evidence and firearm examination analyses, the expert witness's comparison of the defendant's fingerprints, not with latent prints, but with known fingerprints, ... we find [no] error in the admission of the testimony in question." The assertion that the latent print examiner did not compare "the defendant's fingerprints [to] latent fingerprints" is puzzling. The fingerprint expert testified that "[t]he latent palm print from the magazine of the gun was identified as the defendant's left palm print." That was the challenged testimony, not some unexplained comparison of known prints to known prints.

The text of the opinion did not address the reasoning in the PCAST report. A footnote summarily -- and unconvincingly -- disposed of the PCAST report in a few sentences:
[T]he PCAST report did not wholly undermine the science of firearm analysis or fingerprint identification, nor did it actually establish unacceptable error rates for either field of expertise. In fact, the PCAST report specifically states that fingerprint analysis remains “foundationally valid” and that “whether firearms should be deemed admissible based on current evidence is a decision that belongs to the courts.”
"Did not wholly undermine the science" is faint praise indeed. The council's view about the "foundational validity" (and hence the admissibility under Daubert) of firearms identification via toolmarks was clear: "Because there has been only a single appropriately designed study, the current evidence falls short of the scientific criteria for foundational validity." (P. 111).

As regards fingerprints, the court's description of the report is correct but incomplete. The finding of "foundational validity" was grudging: "The studies collectively demonstrate that many examiners can, under some circumstances, produce correct answers at some level of accuracy." (P. 95). The council translated its misgivings about latent fingerprint identification into the following recommendation:
Overall, it would be appropriate to inform jurors that (1) only two properly designed studies of the accuracy of latent fingerprint analysis have been conducted and (2) these studies found false positive rates that could be as high as 1 in 306 in one study and 1 in 18 in the other study. This would appropriately inform jurors that errors occur at detectable frequencies, allowing them to weigh the probative value of the evidence. (P. 96).
To say that the Louisiana court did not undertake a careful analysis of the PCAST report would be an understatement. Of course, courts need not accept the report's detailed criteria for establishing "validity." Neither must they defer to its particular views on how to convey the probative value of scientific evidence. But if they fail to engage with the reasoning in the report, their opinions will be superficial and unpersuasive.

Why the News Stories on the Louisiana Lawyer Dog Are Misleading

The news media and the bloggers are abuzz with stories of how Louisiana judges think that a suspect's statement to "get me a lawyer dog" (or maybe "give me a lawyer, dog") is not an invocation of the right to counsel, which, under Miranda v. Arizona, requires the police to terminate a custodial interrogation. Although the case has nothing to do with forensic science or statistics, this blog often points out journalists' misrepresentations, and I'll digress from the main theme of this blog to explain how the media has misrepresented the case.

Five days ago, a blog called "Hit and Run" observed that Justice Scott Crichton of the Louisiana Supreme Court wrote a concurring opinion to explain why he agreed that the court need not review the case of Warren Demesme. It seems that Demesme said to his interlocutors, "If y'all, this is how I feel, if y'all think I did it, I know that I didn't do it so why don't you just give me a lawyer dog cause this is not what's up." The police continued the interrogation. Demesme made some admissions. Now he is in jail on charges of aggravated rape and indecent behavior with a juvenile. The blog's writer, Ed Krayewski (formerly with Fox Business and NBC), read the Justice's opinion and thought
Crichton's argument relies specifically on the ambiguity of what a "lawyer dog" might mean. And this alleged ambiguity is attributable entirely to the lack of a comma between "lawyer" and "dog" in the transcript. As such, the ambiguity is not the suspect's but the court's. And it requires willful ignorance to maintain it.
Credulous writers at Slate, the Washington Post, and other news outlets promptly amplified and embellished Krayewski's report. Slate writer Mark Joseph Stern announced that
Justice Scott Crichton ... wrote, apparently in absolute seriousness, that “the defendant’s ambiguous and equivocal reference to a ‘lawyer dog’ does not constitute an invocation of counsel that warrants termination of the interview.”
Reason’s Ed Krayewski explains that, of course, this assertion is utterly absurd. Demesme was not referring to a dog with a license to practice law, since no such dog exists outside of memes. Rather, as Krayewski writes, Demesme was plainly speaking in vernacular; his statement would be more accurately transcribed as “why don’t you just give me a lawyer, dawg.” The ambiguity rests in the court transcript, not the suspect’s actual words. Yet Crichton chose to construe Demesme’s statement as requesting Lawyer Dog, Esq., rather than interpreting his words by their plain meaning, transcript ambiguity notwithstanding.
This Slate article also urged the U.S. Supreme Court to review the case (if it were to receive a petition from the as-yet-untried defendant). The Post's Tom Jackman joined the bandwagon, arguing that
When a friend says, “I’ll hit you up later dog,” he is stating that he will call again sometime. He is not calling the person a “later dog.”
But that’s not how the courts in Louisiana see it. .... It’s not clear how many lawyer dogs there are in Louisiana, and whether any would have been available to represent the human suspect in this case ... .
Yet, the case clearly does not turn on "the lack of a comma between 'lawyer' and 'dog,'" and Justice Crichton did not maintain that Mr. Demesme's request was too ambiguous because "lawyer" was followed by "dog." Public defender Derwyn D. Bunton contended that when Mr. Demesme said "with emotion and frustration, 'Just give me a lawyer,'" he "unequivocally and unambiguously asserted his right to counsel." (At least, this is what the Washington Post reported.) If this were all there was to the request, there would be no doubt that the police violated Miranda.

The problem for Mr. Demesme is that the "unambiguous" assertion "just give me a lawyer" did not stand alone. It was conditional. What he said was "if y'all think I did it, I know that I didn't do it so why don't you just give me a lawyer ...[?]" For Justice Crichton, the "if" was the source of the ambiguity. That ambiguity did not arise from the phrase "lawyer dog." It would have made no difference if defendant had said "lawyer" without the "dog." Contrary to the media howling, Justice Crichton was not taking the phrase "lawyer dog" literally. He was taking the phrase "if y'all think" literally. Here is what the judge actually wrote:
I agree with the Court’s decision to deny the defendant’s writ application and write separately to spotlight the very important constitutional issue regarding the invocation of counsel during a law enforcement interview. The defendant voluntarily agreed to be interviewed twice regarding his alleged sexual misconduct with minors. At both interviews detectives advised the defendant of his Miranda rights and the defendant stated he understood and waived those rights. ... I believe the defendant ambiguously referenced a lawyer—prefacing that statement with “if y’all, this is how I feel, if y’all think I did it, I know that I didn’t do it so why don’t you just give me a lawyer dog cause this is not what’s up.”... In my view, the defendant’s ambiguous and equivocal reference to a “lawyer dog” does not constitute an invocation of counsel that warrants termination of the interview ... .
The Justice cited a Louisiana Supreme Court case and the U.S. Supreme Court case, Davis v. United States, 512 U.S. 452 (1994). In Davis, Naval Investigative Service agents questioned a homicide suspect after reciting Miranda warnings and securing his consent to be questioned. An hour and a half into the questioning, the suspect said "[m]aybe I should talk to a lawyer." At that point, "[a]ccording to the uncontradicted testimony of one of the interviewing agents, the interview then proceeded as follows:"
[We m]ade it very clear that we're not here to violate his rights, that if he wants a lawyer, then we will stop any kind of questioning with him, that we weren't going to pursue the matter unless we have it clarified is he asking for a lawyer or is he just making a comment about a lawyer, and he said, [']No, I'm not asking for a lawyer,' and then he continued on, and said, 'No, I don't want a lawyer.'
They took a short break, after which "the agents reminded petitioner of his rights to remain silent and to counsel. The interview then continued for another hour, until petitioner said, 'I think I want a lawyer before I say anything else.' At that point, questioning ceased."

The Supreme Court held that the initial statement “[m]aybe I should talk to a lawyer,” coming after a previous waiver of the right to consult counsel and followed by the clarification that "I'm not asking for a lawyer," could be deemed too equivocal and ambiguous to have forced the police to have terminated the interrogation immediately.

The Louisiana case obviously is different. Police did not seek any clarification of the remark about a lawyer "if y'all think I did it." From what has been reported, they continued without missing a beat. However, in the majority opinion for the Court, Justice Sandra Day O'Connor went well beyond the facts of the Davis case to write that
Of course, when a suspect makes an ambiguous or equivocal statement it will often be good police practice for the interviewing officers to clarify whether or not he actually wants an attorney. That was the procedure followed by the NIS agents in this case. Clarifying questions help protect the rights of the suspect by ensuring that he gets an attorney if he wants one, and will minimize the chance of a confession being suppressed due to subsequent judicial second-guessing as to the meaning of the suspect's statement regarding counsel. But we decline to adopt a rule requiring officers to ask clarifying questions. If the suspect's statement is not an unambiguous or unequivocal request for counsel, the officers have no obligation to stop questioning him.
The Louisiana courts -- and many others -- have taken this dictum -- repudiated by four concurring Justices -- to heart. Whether it should ever apply and whether Justice Crichton's application of it to the "if ..." statement is correct are debatable. But no responsible and knowledgeable journalist could say that the case turned on an untranscribed comma or on the difference between "lawyer" and "lawyer dog." The opinion may be wrong, but it is clearly unfair to portray it as "willful ignorance" and "utterly absurd." The majority opinion in Davis and the cases it has spawned are fair game (and the Post article pursues that quarry), but the writing about the dispositive role of the lawyer dog meme in the Louisiana case is barking up the wrong tree.

Friday, October 27, 2017

Dodging Daubert to Admit Bite Mark Evidence

At a symposium for the Advisory Committee on the Federal Rules of Evidence, Chris Fabricant juxtaposed two judicial opinions about bite-mark identification. To begin with, in Coronado v. State, 384 S.W.3d 919 (Tex. App. 2012), the Texas Court of Appeals deemed bite mark comparisons to be a “soft science” because it is “based primarily on experience or training.” It then applied a less rigorous standard of admissibility than that for a “hard science.”

The state’s expert dentist, Robert Williams, “acknowledged that there is a lack of scientific studies testing the reliability of bite marks on human skin, likely due to the fact that few people are willing to submit to such a study. However, he did point out there was one study on skin analysis conducted by Dr. Gerald Reynolds using pig skin, ‘the next best thing to human skin.’” The court did not state what the pig skin study showed, but it must have been apparent to the court that direct studies of the ability of dentists to distinguish among potential sources of bite marks were all but nonexistent.

That dentists have a way to exclude and include suspects as possible biters with rates of accuracy that are known or well estimated is not apparent. Yet, the Texas appellate court upheld the admission of the "soft science" testimony without discussing whether it was presented as hard science, as "soft science," or as nonscientific expert testimony.

A trial court in Hillsborough County, Florida, went a step further. Judge Kimberly K. Fernandez wrote that
During the evidentiary hearing, the testimony revealed that there are limited studies regarding the accuracy or error rate of bite mark identification, 3/ and there are no statistical databases regarding uniqueness or frequency in dentition. Despite these factors, the Court finds that this is a comparison-based science and that the lack of such studies or databases is not an accurate indicator of its reliability. See Coronado v. State, 384 S.W. 3d 919 (Tex. App. 2012) ("[B]ecause bite mark analysis is based partly on experience and training, the hard science methods of validation such as assessing the potential rate of error, are not always appropriate for testing its reliability.")
The footnote added that "One study in 1989 reflected that there was a 63% error rate." This is a remarkable addition. Assuming "the error rate" is a false-positive rate for a task comparable to the one in the case, it is at least relevant to the validity of bite-mark evidence. In Coronado, the Texas court found the absence of validation research not preclusive of admissibility. That was questionable enough. But in O'Connell, the court found the presence of research that contradicted any claim of validity “inappropriate” to consider! That turns Daubert on its head.

Friday, October 20, 2017

"Probabilistic Genotyping," Monte Carlo Methods, and the Hydrogen Bomb

Many DNA samples found in criminal investigations contain DNA from several people. A number of computer programs seek to "deconvolute" these mixtures -- that is, to infer the several DNA profiles that are mushed together in the electrophoretic data. The better ones do so using probability theory and an estimation procedure known as a Markov Chain Monte Carlo (MCMC) method. These programs are often said to perform "probabilistic genotyping." Although both words in this name are a bit confusing, 1/ lawyers should appreciate that the inferred profiles are just possibilities, not certainties. At the same time, some may find the idea of using techniques borrowed from a gambling casino (in name at least) disturbing. Indeed, I have heard the concern that "You know, don't you, that if the program is rerun, the results can be different!"

The answer is, yes, that is the way the approximation works. Using more steps in the numerical process also could give different output, but would we expect the further computations to make much of a difference? Consider a physical system that computes the value of π. I am thinking of Buffon's Needle. In 1777, Georges-Louis Leclerc, the Count of Buffon, imagined "dropping a needle on a lined sheet of paper and determining the probability of the needle crossing one of the lines on the page." 2/ He found that the probability is directly related to π. For example, if the length of the needle and the distance between the lines are identical, one can estimate π as twice the number of drops divided by the number of hits.3/ Repeating the needle-dropping procedure the same number of times will rarely give exactly the same answer. (Note that pooling the results for two runs of the procedure is equivalent to one run with twice as many needle drops.) For a very large number of drops, however, the approximation should be pretty good.
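The run-to-run variation is easy to see in a simulation. The following is a minimal sketch of the needle experiment, assuming a needle exactly as long as the spacing between lines; the function name, the seed, and the numbers of drops are illustrative choices, not anything from Buffon. (The sketch uses the known value of π to pick a random angle, so it illustrates the variability of the estimate rather than offering an independent way to compute π.)

```python
import math
import random

def estimate_pi(drops, rng):
    """Estimate pi by Buffon's needle with needle length == line spacing == 1."""
    hits = 0
    for _ in range(drops):
        # Distance from the needle's center to the nearest line: uniform on [0, 0.5].
        center = rng.uniform(0.0, 0.5)
        # Acute angle between the needle and the lines: uniform on [0, pi/2].
        angle = rng.uniform(0.0, math.pi / 2)
        # The needle crosses a line when its projected half-length reaches that line.
        if center <= 0.5 * math.sin(angle):
            hits += 1
    # Twice the number of drops divided by the number of hits.
    return 2 * drops / hits

rng = random.Random(2017)  # arbitrary seed, so the whole sequence is repeatable
for drops in (100, 10_000, 100_000):
    first = estimate_pi(drops, rng)
    second = estimate_pi(drops, rng)
    print(f"{drops:>7} drops: run 1 = {first:.4f}, run 2 = {second:.4f}")
```

Two runs with the same number of drops generally disagree, but the disagreement should shrink as the number of drops grows.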

MCMC computations are more complicated. They simulate a random walk that samples values of a random variable so as to ascertain a posterior probability distribution. The walk could get stuck for a long time in a particular region. Nevertheless, the general approach is very well established in statistics, and Monte Carlo methods are widely used throughout the sciences. 4/ Indeed, they were integral to the development of nuclear weapons. 5/ The book, Dark Sun: The Making of the Hydrogen Bomb, provides the following account:
On leave from the university, resting at home during his extended recovery [from a severe brain infection], [Stanislaw] Ulam amused himself playing solitaire. Sensitivity to patterns was part of his gift. He realized that he could estimate how a game would turn out if he laid down a few trial cards and then noted what proportion of his tries were successful, rather than attempting to work out all the possible combinations in his head. "It occurred to me then," he remembers, "that this could be equally true of all processes involving branching of events." Fission with its exponential spread of reactions was a branching process; so would the propagation of thermonuclear burning be. "At each stage of the [fission] process, there are many possibilities determining the fate of the neutron. It can scatter at one angle, change its velocity, be absorbed, or produce more neutrons by a fission of the target nucleus, and so on." Instead of trying to derive the expected outcomes of these processes with complex mathematics, Ulam saw, it should be possible to follow a few thousand individual sample particles, selecting a range for each particle's fate at each step of the way by throwing in a random number, and take the outcomes as an approximate answer—a useful estimate. This iterative process was something a computer could do. ...[W]hen he told [John] von Neumann about his solitaire discovery, the Hungarian mathematician was immediately interested in what he called a "statistical approach" that was "very well suited to a digital treatment." The two friends developed the mathematics together and named the procedure the Monte Carlo method (after the famous gaming casino in Monaco) for the element of chance it incorporated. 6/
Even without a computer in place, Los Alamos laboratory staff, including a "bevy of young women who had been hastily recruited to grind manually on electric calculators," 7/ performed preliminary calculations examining the feasibility of igniting a thermonuclear reaction. As Ulam recalled:
We started work each day for four to six hours with slide rule, pencil and paper, making frequent quantitative guesses. ... These estimates were interspersed with stepwise calculations of the behavior of the actual motions [of particles] ... The real times for the individual computational steps were short ... and the spatial subdivisions of the material assembly very small. ... The number of individual computational steps was therefore very large. We filled page upon page with calculations, much of it done by [Cornelius] Everett. In the process he almost wore out his own slide rule. ... I do not know how many man hours were spent on this problem. 8/
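To return to the MCMC computations described above, here is a toy sketch of a random-walk (Metropolis) sampler. It has nothing to do with DNA mixtures: it assumes a coin that landed heads 7 times in 10 tosses and a uniform prior on the chance of heads, and it estimates the posterior mean of that chance. The data, step size, chain length, and seeds are all hypothetical choices made for illustration, and real probabilistic genotyping software is far more elaborate.

```python
import math
import random

def log_posterior(theta, heads, tosses):
    """Log of the unnormalized posterior for a coin's heads probability (uniform prior)."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return heads * math.log(theta) + (tosses - heads) * math.log(1 - theta)

def metropolis_mean(heads, tosses, steps, seed, burn_in=1000):
    """Posterior mean of theta estimated from a random-walk Metropolis chain."""
    rng = random.Random(seed)
    theta = 0.5
    total = 0.0
    for step in range(steps + burn_in):
        # Propose a small random step away from the current value.
        proposal = theta + rng.gauss(0.0, 0.2)
        log_ratio = log_posterior(proposal, heads, tosses) - log_posterior(theta, heads, tosses)
        # Accept with probability min(1, posterior ratio); otherwise stay put.
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = proposal
        if step >= burn_in:
            total += theta
    return total / steps

# Different seeds play the role of "rerunning the program."
for seed in (1, 2, 3):
    print(f"seed {seed}: estimated posterior mean = {metropolis_mean(7, 10, 20_000, seed):.4f}")
```

For this toy problem the exact posterior mean is 8/12, so one can see how close each rerun comes. Reruns should differ slightly from one another, and longer chains should narrow the differences.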
  1. In forensic DNA work, probabilities also are presented to explain the probative value of the discovery of a "deterministic" DNA profile -- one that is treated as known to a certainty. See David H. Kaye, SWGDAM Guidelines on "Probabilistic Genotyping Systems" (Part 2), Forensic Sci., Stat. & L., Oct. 25, 2015. In addition, the "genotypes" in "probabilistic genotyping" do not refer to genes.
  2. Office for Mathematical, Science and Technology Education, College of Education, University of Illinois, Buffon's Needle: An Analysis and Simulation,
  3. Id.
  4. See, e.g., Persi Diaconis, The Markov Chain Monte Carlo Revolution, 46 Bull. Am. Math. Soc'y 179-205 (2009); Sanjib Sharma, Markov Chain Monte Carlo Methods for Bayesian Data Analysis in Astronomy, arXiv:1706.01629 [astro-ph.IM], https://doi.org/10.1146/annurev-astro-082214-122339.
  5. Roger Eckhardt, Stan Ulam, John von Neumann, and the Monte Carlo Method, Los Alamos Sci., Special Issue 1987, pp. 131-41,
  6. Richard Rhodes, Dark Sun: The Making of the Hydrogen Bomb 303-04 (1995).
  7. Id. at 423 (quoting Françoise Ulam).
  8. Id.

Wednesday, October 11, 2017

District Court Rejects Defendant's Reliance on PCAST Report as a Reason to Exclude Fingerprint Evidence

Yesterday, the U.S. District Court for the Northern District of Illinois rejected a defendant's motion to exclude a latent fingerprint identification on the theory that "the method used is not sufficiently reliable foundationally or as applied to his case." 1/ The court also precluded, as too "distracting," cross-examination of the FBI latent-print examiner about the FBI's infamous error in apprehending Brandon Mayfield as the Madrid train bomber.

I. "Foundational Validity"

The "foundational validity" challenge came out of the pages of the 2016 PCAST Report. 2/ The PCAST Report seems to equate what it calls "foundational validity" for subjective pattern-matching methods to multiple "black box studies" demonstrating false-positive error probabilities of 5% or less, and it argues that Federal Rule of Evidence 702 requires such a showing of validity.

That this challenge would fail is unsurprising. According to the Court of Appeals for the Seventh Circuit, latent print analysis need not be scientifically valid to be admissible. Furthermore, even if the Seventh Circuit were to reconsider this questionable approach to the admissibility of applications of what the FBI and DHS call "the science of fingerprinting," the PCAST Report concludes that latent print comparisons have foundational scientific validity as defined above.

A. The Seventh Circuit Opinion in Herrera

Scientific validity is not a foundational requirement in the legal framework applied to latent print identification by the Court of Appeals for the Seventh Circuit. In United States v. Herrera, 3/ Judge Richard Posner 4/ observed that "the courts have frequently rebuffed" any "frontal assault on the use of fingerprint evidence in litigation." 5/ Analogizing expert comparisons of fingerprints to "an opinion offered by an art expert asked whether an unsigned painting was painted by the known painter of another painting" and even to eyewitness identifications, 6/ the court held these comparisons admissible because "expert evidence is not limited to 'scientific' evidence," 7/ the examiner was "certified as a latent print examiner by the International Association for Identification," 8/ and "errors in fingerprint matching by expert examiners appear to be very rare." 9/ To reach the last -- and most important -- conclusion, the court relied on the lack of cases of fingerprinting errors within a set of DNA-based exonerations (without indicating how often fingerprints were introduced in those cases), and its understanding that the "probability of two people in the world having identical fingerprints ... appears to be extremely low." 10/

B. False-positive Error Rates in Bonds

In the new district court case of United States v. Bonds, Judge Sara Ellis emphasized the court of appeals' willingness to afford district courts "wide latitude in performing [the] gate-keeping function." Following Herrera (as she had to), she declined to require "scientific validity" for fingerprint comparisons. 11/ This framework deflects or rejects most of the PCAST report's legal reasoning about the need for scientific validation of all pattern-matching methods in criminalistics. But even if "foundational validity" were required, the PCAST Report -- while far more skeptical of latent print work than was the Herrera panel -- is not so skeptical as to maintain that latent print identification is scientifically invalid. Judge Ellis quoted the PCAST Report's conclusion that "latent fingerprint analysis is a foundationally valid subjective methodology—albeit with a false positive rate that is substantial and is likely to be higher than expected by many jurors based on longstanding claims about the infallibility of fingerprint analysis."

Bonds held that the "higher than expected" error rates were not so high as to change the Herrera outcome for nonscientific evidence. Ignoring other research into the validity of latent-print examinations, Judge Ellis wrote that "[a]n FBI study published in 2011 reported a false positive rate (the rate at which the method erroneously called a match between a known and latent print) of 1 in 306, while a 2014 Miami-Dade Police Department Forensic Services Bureau study had a false positive rate of 1 in 18."

Two problems with the sentence are noteworthy. First, it supplies an inaccurate definition of a false positive rate. "[T]he rate at which the method erroneously called a match between a known and latent print" would seem to be an overall error rate for positive associations (matches) in the sample of prints and examiners who were studied. For example, if the experiment used 50 different-source pairs of prints and 50 same-source pairs, and if the examiners declared 5 matches for the different-source pairs and 5 for the same-source pairs, the erroneous matches are 5 out of 100, for an error rate of 5%. However, the false-positive rate is the proportion of positive associations reported for different-source prints. When comparing the 50 different-source pairs, the examiners erred in 5 instances, for a false-positive rate of 5/50 = 10%. In the 50 same-source pairs, there were no opportunities for a false positive. Thus, the standard definition of a false-positive error rate gives the estimate of 0.1 for the false-positive probability. This definition makes sense because none of the same-source pairs in the sample can contribute to false-positive errors.
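The arithmetic in the hypothetical experiment can be laid out in a few lines. This is only a sketch of the two competing denominators; the function and variable names are made up for the illustration.

```python
def rates(diff_pairs, same_pairs, matches_on_diff):
    """Return (erroneous matches / all comparisons,
               erroneous matches / different-source comparisons)."""
    share_of_all = matches_on_diff / (diff_pairs + same_pairs)
    # Only different-source pairs offer an opportunity for a false positive.
    false_positive_rate = matches_on_diff / diff_pairs
    return share_of_all, false_positive_rate

share, fpr = rates(diff_pairs=50, same_pairs=50, matches_on_diff=5)
print(f"erroneous matches among all comparisons: {share:.0%}")  # 5%
print(f"false-positive rate: {fpr:.0%}")                        # 10%
```

Adding more same-source pairs to the experiment would shrink the first figure but leave the second untouched, which is why the second is the one to report as a false-positive rate.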

Second, the sentence misstates the false positive rates reported in the two studies. Instead of "1 in 306," the 2011 Noblis-FBI experiment found that "[s]ix false positives occurred among 4,083 VID [value for identification] comparisons of nonmated pairs ... ." 12/ In other words (or numbers), the reported false-positive rate (for an examiner without the verification-by-another-examiner step) was 6/4083 = 1/681. This is the only false-positive rate in the body of the study. An online supplement to the article includes "a 95% confidence interval of 0.06% to 0.3% [1 in 1668 to 1 in 333]." 13/ A table in the supplement also reveals that, excluding conclusions of "inconclusive" from the denominator, as is appropriate from the standpoint of judges or jurors, the rate is 6/3628, which corresponds to 1 in 605.

Likewise, the putative rate of 1/18 does not appear in the unpublished Miami-Dade study. A table in the report to a funding agency states that the "False Positive Rate" was 4.2% "Without Inconclusives." 14/ This percentage corresponds to 1 in 24.

So where did the court get its numbers? They apparently came from a gloss in the PCAST Report. That report gives an upper (but not a lower) bound on the false-positive rates that would be seen if the studies used an enormous number of random samples of comparisons (instead of just one). Bending over backwards to avoid incorrect decisions against defendants, PCAST stated that the Noblis-FBI experiment indicated that "the rate could be as high as 1 error in 306 cases" and that the numbers in the Miami-Dade study admit of an error rate that "could be as high as 1 error in 18 cases." 15/ Of course, the error rates in the hypothetical infinite population could be even higher. Or they could be lower.
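To see how an upper bound of this kind can sit so far above the observed rate, consider the 6 errors in 3,628 conclusive comparisons from the Noblis-FBI supplement. The sketch below assumes that the "could be as high as" figure is a one-sided 95% upper confidence bound of the Clopper-Pearson (exact binomial) type. That is my assumption about the method, not something stated in the passage quoted above, and the function names are invented for the illustration.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def upper_bound(errors, trials, confidence=0.95):
    """One-sided Clopper-Pearson upper bound on the error probability (by bisection)."""
    lo, hi = errors / trials, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        # The larger the true probability, the less probable it is to see so few errors.
        if binom_cdf(errors, trials, mid) > 1 - confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

observed = 6 / 3628
bound = upper_bound(6, 3628)
print(f"observed rate: about 1 in {1 / observed:.0f}")
print(f"upper bound:   about 1 in {1 / bound:.0f}")
```

Under that assumption the bound should land in the neighborhood of PCAST's 1 in 306, roughly twice the observed rate of about 1 in 605.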

II. Discussing Errors at Trial

The PCAST Report accepts the longstanding view that traces of the patterns in friction ridge skin can be used to associate latent prints that contain sufficient detail with known prints. But it opens the door to arguments about the possibility of false positives. Bonds wanted to confine the analyst to presenting the matching features or, alternatively, to have the analyst declare a match but add that the "level of certainty of a purported match is limited by the most conservative reported false positive rate in an appropriately designed empirical study thus far (i.e., the 1 in 18 false positive rate from the 2014 Miami-Dade study)."

Using a probability of 1 in 18 to describe the "level of certainty" for the average positive association made by examiners like those studied to date seems "ridiculous." Cherry-picking a distorted number from a single study is hardly sound reasoning. And even if 1/18 were the best estimate of the false-positive probability that can be derived from the totality of the scientific research, applying it to explain the "level of certainty" one should have in the examiner's conclusion would not be straightforward. For one thing, the population-wide false-positive probability is not the probability that a given positive finding is false! Three distinct probabilities come into play. 16/ Explaining the real meaning of an estimate of the false-positive probability from PCAST's preferred "black-box" studies in court will be challenging for lawyers and criminalists alike. Merely to state that a number like 1/18 goes to "the weight of the evidence" and can be explored "on cross examination," as Judge Ellis did, is to sweep this problem under the proverbial rug -- or to put it aside for another day.
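A rough sketch with Bayes' rule shows why the false-positive probability cannot simply be read as the chance that a reported match is wrong. It takes three inputs: the false-positive probability, the probability that an examiner reports a match for same-source prints (the sensitivity), and the prior probability that the prints share a source. Whether these are exactly the three probabilities discussed in the earlier posting is my assumption, and the sensitivity and prior values below are hypothetical numbers chosen only for illustration.

```python
def prob_reported_match_is_false(false_positive_prob, sensitivity, prior_same_source):
    """P(different source | examiner reports a match), by Bayes' rule."""
    false_matches = false_positive_prob * (1 - prior_same_source)
    true_matches = sensitivity * prior_same_source
    return false_matches / (false_matches + true_matches)

FALSE_POSITIVE_PROB = 1 / 18  # the figure urged by the defense
SENSITIVITY = 0.9             # hypothetical, not taken from any study

for prior in (0.01, 0.5, 0.99):  # hypothetical prior probabilities of a common source
    p = prob_reported_match_is_false(FALSE_POSITIVE_PROB, SENSITIVITY, prior)
    print(f"prior {prior:.2f}: probability the reported match is false = {p:.3f}")
```

With the same false-positive probability, the chance that a reported match is false swings widely with the other two inputs, so no single number like 1/18 can stand in for it.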

  1. United States v. Myshawn Bonds, No. 15 CR 573-2 (N.D. Ill. Oct. 10, 2017).
  2. Executive Office of the President, President’s Council of Advisors on Science and Technology, Report to the President: Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods (Sept. 2016).
  3. 704 F.3d 480 (7th Cir. 2013).
  4. For remarks on another opinion from the judge, see Judge Richard Posner on DNA Evidence: Missing the Trees for the Forest?, Forensic Sci., Stat. & L., July 19, 2014, 
  5. Herrera, 704 F.3d at 484.
  6. Id. at 485-86.
  7. Id. at 486.
  8. Id.
  9. Id. at 487.
  10. Id.
  11. Judge Ellis stated that she "agree[d] with Herrera's broader reading of Rule 702's reliability requirement."
  12. Bradford T. Ulery, R. Austin Hicklin, JoAnn Buscaglia, & Maria Antonia Roberts, Accuracy and Reliability of Forensic Latent Fingerprint Decisions, 108(19) Proc. Nat’l Acad. Sci. (USA) 7733-7738 (2011).
  13. Available at
  14. Igor Pacheco, Brian Cerchiai, Stephanie Stoiloff, Miami-Dade Research Study for the Reliability of the ACE-V Process: Accuracy & Precision in Latent Fingerprint Examinations, Dec. 2014, at 53 tbl. 4.
  15. PCAST Report, supra note 2, at 94-95.
  16. The False-Positive Fallacy in the First Opinion to Discuss the PCAST Report, Forensic Sci., Stat. & L., November 3, 2016,

Friday, October 6, 2017

Should Forensic-science Standards Be Open Access?

The federal government has spent millions of dollars to generate and improve standards for performing forensic-science tests through the Organization of Scientific Area Committees for Forensic Science (OSAC). Yet, it does not require open access to the standards placed on its "OSAC Registry of Approved Standards." Perhaps that can be justified for existing standards that are the work of other authors -- as is the case for some pre-existing standards that have made it to the Registry. But shouldn't standards that are written by OSAC at public expense be available to the public rather than controlled by private organizations?

When the American Academy of Forensic Sciences (AAFS) established a Standards Board (the ASB) to "work closely with the [OSAC] Forensic Science Standards Board and its subcommittees, which are dedicated to creating a national registry of forensic standards," 1/ ASB demanded the copyright to all standards, no matter how little or how much it contributes to the writing of the standards. It insists that "the following disclaimer shall appear on all ASB published and draft documents:
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or on an intranet, without prior written permission from the Academy Standards Board, American Academy of Forensic Sciences, 410 North 21st Street, Colorado Springs, CO 80904,
Copyright © AAFS Standards Board [year]
Moreover, "[u]nless expressly agreed otherwise by the ASB, all material and information that is provided by participants and is incorporated into an ASB document is considered the sole and exclusive property of the AAFS Standards Board. Individuals shall not copy or distribute final or draft documents without the authorization of the ASB staff." 2/

The phrasing "is considered" departs from the ASB's own guidance that "[t]he active voice should be used in sentences." 3/ Who considers draft documents written by OSAC members "the sole and exclusive property of the AAFS Standards Board"? The ASB? The OSAC? The courts? Why should they? OSAC is not furthering the public interest by giving a private organization a monopoly over its work products. It should retain the copyright and reject the AAFS's unenforceable 4/ "no copying, no distributing" philosophy via a Creative Commons Attribution license.

  1. Foreword, ASB Style Guide Manual for Standards, Technical  Reports and Best Practice Recommendations (2016),
  2. Id. at 12.
  3. Id. at 1.
  4. The asserted restriction on reproduction cannot be enforced literally because many reproductions are fair uses of the copyrighted material. That is what allows me to reproduce the material quoted in this posting without ASB's permission. Arguably, reproducing an entire standard for noncommercial purposes would fall under the open-textured fair-use exception of 17 U.S.C. § 107.

Sunday, September 24, 2017

How Experts (Mis)Represent Likelihood Ratios for DNA Evidence

Earlier this month, I noted the tendency of journalists to misconstrue a likelihood ratio as odds or probabilities in favor of a source hypothesis. I mentioned expressions such as "the likelihood that a suspect’s DNA is present in a mixture of substances found at a crime scene" and "the probability, weighed against coincidence, that sample X is a match with sample Y." In place of such garbled descriptions, I proposed that
Putting aside all other explanations for the overlap between the mixture and the suspect's alleles -- explanations like relatives or some laboratory errors--this likelihood ratio indicates how much the evidence changes the odds in favor of the suspect’s DNA being in the mixture. It quantifies the probative value of the evidence, not the probability that one or another explanation of the evidence is true.
The journalists' misstatements occurred in connection with likelihood ratios involving DNA mixtures, but even experts in forensic inference make the same mistake in simpler situations. The measure of probative value for single-source DNA is a more easily computed likelihood ratio (LR). Unfortunately, it is very easy to describe LRs in ways that invite misunderstanding. Below are two examples:
[I]n the simplest case of a complete, single-source evidence profile, the LR expression reverts to the reciprocal of the profile frequency. For example: Profile frequency = 1/1,000,000 [implies] LR = P(E|H1) / P(E|H2) = 1 / 1/1,000,000 = 1,000,000/1 = 1,000,000 (or 1 million). This could be expressed in words as, "Given the DNA profile found in the evidence, it is 1 million times more likely that it is from the suspect than from another random person with the same profile." -- Norah Rudin & Keith Inman, An Introduction to Forensic DNA Analysis 148-49 (2d ed. 2002).
Comment: If "another random person" had "the same profile," there would be no genetic basis for distinguishing between this individual and the suspect. So how could the suspect possibly be a million times more likely to be the source?
A likelihood ratio is a ratio that compares the likelihood of two hypotheses in the light of data. [I]n the present case there are two hypotheses: the sperm came from twin A or the sperm came from twin B, and then you calculate the likelihood of each hypotheses in the face or in the light of the data, and then you form the ratio [LR] of the two. So the ratio tells you how much more likely one hypothesis is than the other in the light of the experimental data. --Testimony of Michael Krawczak in a pretrial hearing on a motion to exclude evidence in Commonwealth v. McNair, No. 8414CR10768 (Super. Ct., Suffolk Co., Mass.) (transcript, Feb. 15, 2017).
Comment: Defining "likelihood" as a quantity proportional to the probability of data given the hypothesis, the first sentence is correct. But this definition was not provided, and the second sentence further suggests that the "experimental data" makes one twin LR times more probable to be the source than the other. That conclusion is correct only if the prior odds are equal -- an assumption that does not rest on those data.
With this kind of prose and testimony, is it any surprise that courts write that "[t]he likelihood ratio 'compares the probability that the defendant was a contributor to the sample with the probability that he was not a contributor to the sample'"? Commonwealth v. Grinkley, 75 Mass.App.Ct. 798, 803, 917 N.E.2d 236, 241 (Mass. Ct. App. 2009) (quoting Commonwealth v. McNickles, 434 Mass. 839, 847, 753 N.E.2d 131 (2001)).
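The difference between a likelihood ratio and the odds on a hypothesis can be shown with the standard odds form of Bayes' rule: posterior odds equal the likelihood ratio times the prior odds. The sketch below uses the LR of 1,000,000 from the first example; the prior odds are hypothetical values picked only to show how much they matter.

```python
def posterior_odds(likelihood_ratio, prior_odds):
    """Bayes' rule in odds form: posterior odds = LR x prior odds."""
    return likelihood_ratio * prior_odds

def odds_to_probability(odds):
    """Convert odds in favor of a hypothesis to a probability."""
    return odds / (1 + odds)

LR = 1_000_000  # reciprocal of the 1-in-1,000,000 profile frequency

# Hypothetical prior odds that the suspect is the source.
for prior in (1.0, 1 / 1000, 1 / 10_000_000):
    post = posterior_odds(LR, prior)
    print(f"prior odds {prior:.0e} -> posterior odds {post:.3g}, "
          f"probability {odds_to_probability(post):.4f}")
```

Only when the prior odds are even (1 to 1) do the posterior odds equal the LR. The same LR applied to long prior odds against the suspect can leave the source hypothesis improbable.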

Monday, September 11, 2017

The New York City Medical Examiner's Office "Under Fire" for Low Template DNA Testing

According to an Associated Press story, “DNA lab techniques” are “now under fire.” 1/ The article refers to the procedures used by the New York City Office of the Chief Medical Examiner (OCME) to analyze and interpret low template DNA (LT-DNA) mixtures, that is, samples with minuscule quantities of DNA. Selected segments of the DNA molecules are copied repeatedly (by means of a chemical process known as PCR) to produce enough of them to detect certain highly variable DNA sequences (called STRs). Every round of PCR amplification essentially doubles the number of replicated segments. Following the lead of the U.K.’s former Forensic Science Service, which pioneered a protocol with extra cycles of amplification, the OCME used 31 cycles instead of the standard 28.
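As a back-of-the-envelope illustration of what three extra cycles buy, the sketch below assumes ideal doubling in every cycle, which real reactions only approximate. The function name is made up for the example.

```python
def relative_yield(cycles, baseline_cycles):
    """Fold increase in copies versus the baseline, assuming perfect doubling per cycle."""
    return 2 ** (cycles - baseline_cycles)

# Three extra doublings: 2 * 2 * 2 = 8 times as many copies.
print(relative_yield(31, 28))  # 8
```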

But if the PCR primer for an STR does not latch on to enough of the small number of starting DNA molecules, that STR will not appear in PCR-amplified product. At the same time, if stray human DNA molecules are present in the samples, their STRs can be amplified along with the ones that are of real interest. The first phenomenon is called “drop-out”; the latter is “drop-in.”

Initially, OCME analysts interpreted the results by hand. Some years later, the office created a computer program, the Forensic Statistical Tool (FST), that used empirically determined drop-in and drop-out probabilities and generated a measure of the extent to which the DNA results supported the conclusion that the mixture contains a suspect’s DNA as opposed to an unrelated contributor’s. It published a validation study of the software. 2/

Both these “lab techniques” have been “under fire,” as the AP put it, for years. The article suggests that only two courts have decided serious challenges to LT-DNA evidence, that they reached opposite conclusions, and that the more recent view is that the evidence is unreliable. 3/ In fact, a larger number of trial courts have considered challenges to extra cycles of amplification and to the FST program. Almost all of them found the OCME’s approach to have gained scientific acceptance.

The published opinions from New York trial courts are noteworthy. (There are more unpublished ones.) In the first reported case, People v. Megnath, 898 N.Y.S.2d 408 (N.Y. Sup. Ct. Queens Co. 2010), a court admitted manually interpreted LT-DNA evidence, finding that the procedures are not novel and that the modifications are generally accepted. 4/ In United States v. Morgan, 53 F.Supp.3d 732 (S.D.N.Y. 2014), a federal district court reached the same conclusion. In People v. Garcia, 963 N.Y.S.2d 517 (N.Y. Sup. Ct. 2013), a local New York court found general acceptance of both extra cycles and the FST program.

The first setback for the OCME came in People v. Collins, 49 Misc.3d 595, 15 N.Y.S.3d 564 (N.Y. Sup. Ct., Kings Co. 2015), when a well-regarded trial judge conducted an extensive hearing and issued a detailed opinion finding that the extra cycles and the FST program had not achieved general acceptance in the scientific community. However, other New York judges have not followed Collins in excluding OCME LT-DNA testimony. People v. Lopez, 50 Misc.3d 632, 23 N.Y.S.3d 820 (N.Y. Sup. Ct., Bronx Co. 2015); People v. Debraux, 50 Misc.3d 247, 21 N.Y.S.3d 535 (N.Y. Co. Sup. Ct. 2015). In the absence of any binding precedent (trial court opinions lack precedential value) and given the elaborate Collins opinion, it is fair to say that “case law on the merits of the science” is not so “clear,” but, quantitatively, it leans toward admissibility.

This is not to say that the opinions are equally persuasive or that they are uniformly well informed. A specious argument that several courts have relied on is that because Bayes’ theorem was discovered centuries ago and likelihood ratios are used in other contexts, the FST necessarily rests on generally accepted methods. E.g., People v. Rodriguez, Ind. No. 5471/2009, Decision and Order (Sup.Ct. N.Y. Co. Oct. 24, 2013). That is comparable to reasoning that because the method of least squares was developed over two centuries ago, every application of linear regression is valid. The same algebra can cover a multitude of sins.

Likewise, the Associated Press (and courts) seem to think that the FST (or more advanced software for computing likelihood ratios) supplies “the likelihood that a suspect’s DNA is present in a mixture of substances found at a crime scene.” 5/ A much longer article in the Atlantic presents a likelihood ratio as "the probability, weighed against coincidence, that sample X is a match with sample Y." 6/ That description is jumbled. The likelihood ratio does not weigh the probability that two samples match "against coincidence."

Rather, the ratio addresses whether the pattern of alleles in a mixed sample is more probable if the suspect's DNA is part of the mixture than if an unrelated individual's DNA is there instead. The ratio is the probability of the complex and possibly incomplete pattern arising under the former hypothesis divided by the probability of the pattern under the latter. Obviously, the ratio of two probabilities is not a probability or a likelihood of anything.

Putting aside all other explanations for the overlap between the mixture and the suspect's alleles--explanations like relatives or some laboratory errors--this likelihood ratio indicates how much the evidence changes the odds in favor of the suspect’s DNA being in the mixture. It quantifies the probative value of the evidence, not the probability that one or another explanation of the evidence is true. Although likelihood-ratio testimony has conceptual advantages, explaining the meaning of the figure in the courtroom so as to avoid the misinterpretations exemplified above can be challenging.

  1. Colleen Long, DNA Lab Techniques, 1 Pioneered in New York, Now Under Fire,  AP News, Sept. 10, 2017, 
  2. Adelle A. Mitchell et al., Validation of a DNA Mixture Statistics Tool Incorporating Allelic Drop-Out and Drop-In, 6 Forensic Sci. Int’l: Genetics 749-761 (2012); Adelle A. Mitchell et al., Likelihood Ratio Statistics for DNA Mixtures Allowing for Drop-out and Drop-in, 3 Forensic Sci. Int'l: Genetics Supp. Series e240-e241 (2011).
  3. Long, supra note 1 ("There is no clear case law on the merits of the science. In 2015, Brooklyn state Supreme Court judge Mark Dwyer tossed a sample collected through the low copy number method. ... But earlier, a judge in Queens found the method scientifically sound.").
  4. For criticism of the “nothing-new” reasoning in the opinion, see David H. Kaye et al., The New Wigmore on Evidence: Expert Evidence (Cum. Supp. 2017).
  5. These are the reporter’s words. Long, supra note 1. For a judicial equivalent, see, for example, People v. Debraux, 50 Misc.3d 247, 256, 21 N.Y.S.3d 535, 543 (N.Y. Co. Sup. Ct. 2015) (referring to FST as “showing that the likelihood that DNA found on a gun was that of the defendant”).
  6. Matthew Shaer, The False Promise of DNA Testing: The Forensic Technique Is Becoming Ever More Common—and Ever Less Reliable, Atlantic, June 2016,

Friday, September 1, 2017

Flaky Academic Journals and Forestry

The legal community may be catching on to the proliferation of predatory, bogus, or just plain flaky journals of medicine, forensic science, statistics, and every other subject that might attract authors willing to pay "open access" fees. As indicated in the Flaky Academic Journals blog, these businesses advertise rigorous peer review, but they operate like vanity presses. A powerful article (noted here) in Bloomberg's BNA Expert Evidence Report and Bloomberg Businessweek alerts litigators to the problem by discussing the most notorious megapublisher of biomedical journals, OMICS International, and its value to drug companies willing to cut corners in presenting their research findings.

The most recent forensic-science article to go this route is Ralph Norman Haber & Lyn Haber, A Forensic Case Study with Only a Single Piece of Evidence, Journal of Forensic Studies, Vol. 2017, issue 1, unpaginated. In fact, it is the only article that the aspiring journal has published (despite spamming for potential authors at least 11 times). The website offers an intriguing description of this "Journal of Forensic Studies." It explains that "Forensic studies is a scientific journal which covers high quality manuscripts which are both relevant and applicable to the broad field of Forestry. This journal encompasses the study related to the majority of forensically related cases."

Monday, August 14, 2017

PCAST's Review of Firearms Identification as Reported in the Press

According to the Washington Post,
The President’s Council of Advisors on Science and Technology [PCAST] said that only one valid study, funded by the Pentagon in 2014, established a likely error rate in firearms testing at no more than 1 in 46. Two less rigorous recent studies found a 1 in 20 error rate, the White House panel said. 1/
The impression that one might receive from such reporting is that errors (false positives? false negatives?) occur in about one case in every 20, or maybe one in 40.

Previous postings have discussed the fact that a false-positive probability is not generally the probability that an examiner who reports an association is wrong. Here, I will indicate how well the numbers in the Washington Post correspond to statements from PCAST. Not all of them can be found in the section on "Firearms Analysis" (§ 5.5) in the September 2016 PCAST report, and there are other numbers provided in that section.
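The gap between a false-positive rate and the chance that a reported association is wrong can be made concrete with Bayes' rule. The following is a minimal sketch, not anything taken from the PCAST report or the Post. The function name, the prior probabilities, and the sensitivity of 0.99 are hypothetical values chosen only for illustration; only the 1-in-46 figure comes from the text.

```python
def prob_reported_association_is_wrong(prior_same_source, sensitivity, false_positive_rate):
    """Probability that the items really have different sources, given that the
    examiner reported an association (Bayes' rule).

    prior_same_source   -- assumed fraction of compared pairs that truly share a source
    sensitivity         -- assumed P(association reported | same source)
    false_positive_rate -- P(association reported | different sources)
    """
    true_positives = prior_same_source * sensitivity
    false_positives = (1.0 - prior_same_source) * false_positive_rate
    return false_positives / (true_positives + false_positives)


# The same false-positive rate (the 1-in-46 bound) under two hypothetical case mixes.
FPR = 1 / 46
mostly_true_matches = prob_reported_association_is_wrong(0.9, 0.99, FPR)
mostly_non_matches = prob_reported_association_is_wrong(0.1, 0.99, FPR)
print(f"90% same-source pairs: {mostly_true_matches:.4f}")
print(f"10% same-source pairs: {mostly_non_matches:.4f}")
```

With these made-up inputs, the chance that a reported association is wrong is well under 1 percent in the first mix and above 15 percent in the second, even though the false-positive rate is identical in both.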

But First, Some Background

By way of background, the 2016 report observes that
AFTE’s “Theory of Identification as it Relates to Toolmarks”—which defines the criteria for making an identification—is circular. The “theory” states that an examiner may conclude that two items have a common origin if their marks are in “sufficient agreement,” where “sufficient agreement” is defined as the examiner being convinced that the items are extremely unlikely to have a different origin. In addition, the “theory” explicitly states that conclusions are subjective. 2/
A number of thoughtful forensic scientists agree that such criteria are opaque or circular. 3/ Despite its skepticism of the Association of Firearm and Tool Mark Examiners' criteria for deciding that components of ammunition come from a particular, known gun, PCAST acknowledged that
relatively recently ... its validity [has] been subjected to meaningful empirical testing. Over the past 15 years, the field has undertaken a number of studies that have sought to estimate the accuracy of examiners’ conclusions.
Unfortunately, PCAST finds almost all these studies inadequate. "While the results demonstrate that examiners can under some circumstances identify the source of fired ammunition, many of the studies were not appropriate for assessing scientific validity and estimating the reliability because they employed artificial designs that differ in important ways from the problems faced in casework." 4/ "Specifically, many of the studies employ 'set-based' analyses, in which examiners are asked to perform all pairwise comparisons within or between small samples sets." Some of these studies -- namely, "closed-set" designs -- "may substantially underestimate the false positive rate." The only valid way to study validity and reliability, the report insists, is with experiments that require examiners to examine pairs of items in which the existence of a true association is independent of an association in each and every other pair.

The False-positive Error Rate in the One Valid Study

According to the Post, the "one valid study ... established a likely error rate in firearms testing at no more than 1 in 46." This sentence is correct. PCAST reported a "bound on rate" of "1 in 46." 5/ This figure is the upper bound of a one-sided 95% confidence interval. Of course, the "true" error rate -- the one that would exist if there were no random sampling error in the selection of examiners -- could be much larger than this upper bound. Or, it could be much smaller. 6/ The Post omits the statistically unbiased "estimated rate" of "1 in 66" given in the PCAST report.

The 1 in 20 False-positive Error Rate for "Less Rigorous Recent Studies"

The statement that "[t]wo less rigorous recent studies found a 1 in 20 error rate" seems even less complete. The report mentioned five other studies. Four "set-to-set/closed" studies suggested error rates of 1 in 5103 (1 in 1612 for the 95% upper bound). Presumably, the Post did not see fit to mention all the "less rigorous" studies because these closed-set studies were methodologically hopeless -- at least, that is the view of them expressed in the PCAST report.

The Post's "1 in 20 figure" apparently came from PCAST's follow-up report of 2017. 7/ The addendum refers to a re-analysis of a 14-year-old study of eight FBI examiners co-authored by Stephen Bunch, who "offered an estimate of the number of truly independent comparisons in the study and concluded that the 95% upper confidence bound on the false-positive rate in his study was 4.3%." 8/ This must be one of the Post's "two less rigorous recent studies." In the 2016 report, PCAST identified it as a "set-to-set/partly open" study with an "estimated rate" of 1 in 49 (1 in 21 for the 95% upper bound). 9/

The second "less rigorous" study is indeed more recent (2014). The 2016 report summarizes its findings as follows:
The study found 42 false positives among 995 conclusive examinations. The false positive rate was 4.2 percent (upper 95 percent confidence bound of 5.4 percent). The estimated rate corresponds to 1 error in 24 cases, with the upper bound indicating that the rate could be as high as 1 error in 18 cases. (Note: The paper observes that “in 35 of the erroneous identifications the participants appeared to have made a clerical error, but the authors could not determine this with certainty.” In validation studies, it is inappropriate to exclude errors in a post hoc manner (see Box 4). However, if these 35 errors were to be excluded, the false positive rate would be 0.7 percent (confidence interval 1.4 percent), with the upper bound corresponding to 1 error in 73 cases.) 10/
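The relationship between the counts and the quoted rates can be checked with a short calculation. This is only a sketch: it assumes the upper bound is an exact (Clopper-Pearson) one-sided 95% bound, which the passage quoted here does not state, and the function names are my own. It uses only the 42 false positives in 995 conclusive examinations reported above.

```python
import math


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p), with each term computed in log space."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if x >= n else 0.0
    total = 0.0
    for k in range(x + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * math.log(p) + (n - k) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)


def one_sided_upper_bound(x, n, confidence=0.95):
    """Exact one-sided upper confidence bound on a binomial proportion.

    Finds, by bisection, the p at which seeing x or fewer errors in n trials
    has probability 1 - confidence.
    """
    if x >= n:
        return 1.0
    alpha = 1.0 - confidence
    lo, hi = x / n, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if binom_cdf(x, n, mid) > alpha:
            lo = mid  # x or fewer errors still too plausible: the bound lies higher
        else:
            hi = mid
    return (lo + hi) / 2.0


errors, trials = 42, 995
estimate = errors / trials
upper = one_sided_upper_bound(errors, trials)
print(f"estimated rate: {estimate:.3%}, about 1 in {1 / estimate:.0f}")
print(f"upper 95% bound: {upper:.3%}, about 1 in {1 / upper:.1f}")
```

Under that assumption the calculation should land close to the 4.2 percent estimate and 5.4 percent bound in the quotation. The same function can be applied to the 7 errors that remain if the 35 possible clerical errors are set aside, although the result need not match PCAST's figure exactly if the report used a different method.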
Another Summary

Questions of which studies count, how much they count, and what to make of their limitations are intrinsic to scientific literature reviews. Journalists limited to a few sentences hardly can be expected to capture all the nuances. Even so, a slightly more complete summary of the PCAST review might read as follows:
The President’s Council of Advisors on Science and Technology said that an adequate body of scientific studies does not yet show that toolmark examiners can associate discharged ammunition to a specific firearm with very high accuracy. Only one rigorous study with one type of gun, funded by the Defense Department, has been conducted. It found that examiners who reached firm conclusions made positive associations about 1 time in 66 when examining cartridge cases from different guns. Less rigorous studies have found both higher and lower false-positive error rates for conclusions of individual examiners, the White House panel said.
  1. Spencer S. Hsu & Keith L. Alexander, Forensic Errors Trigger Reviews of D.C. Crime Lab Ballistics Unit, Prosecutors Say, Wash. Post, Mar. 24, 2017.
  2. PCAST, at 104 (footnote omitted).
  3. See, e.g., Christophe Champod, Chris Lennard, Pierre Margot & Milutin Stoilovic, Fingerprints and Other Ridge Skin Impressions 71 (2016) (quoted in David H. Kaye, "The Mask Is Down": Fingerprints and Other Ridge Skin Impressions, Forensic Sci., Stat. & L., Aug. 11, 2017).
  4. PCAST, at 105.
  5. Id. at 111, tbl. 2.
  6. The authors of the study had this to say about the false-positive errors:
    [F]or the pool of participants used in this study the fraction of false positives was approximately 1%. The study was specifically designed to allow us to measure not simply a single number from a large number of comparisons, but also to provide statistical insight into the distribution and variability in false-positive error rates. The result is that we can tell that the overall fraction is not necessarily representative of a rate for each examiner in the pool. Instead, examination of the data shows that the rate is a highly heterogeneous mixture of a few examiners with higher rates and most examiners with much lower error rates. This finding does not mean that 1% of the time each examiner will make a false-positive error. Nor does it mean that 1% of the time laboratories or agencies would report false positives, since this study did not include standard or existing quality assurance procedures, such as peer review or blind reanalysis. What this result does suggest is that quality assurance is extremely important in firearms analysis and that an effective QA system must include the means to identify and correct issues with sufficient monitoring, proficiency testing, and checking in order to find false-positive errors that may be occurring at or below the rates observed in this study.
    David P. Baldwin, Stanley J. Bajic, Max Morris, and Daniel Zamzow, A Study of False-Positive and False-Negative Error Rates in Cartridge Case Comparisons, May 2016, at 18, available at
  7. PCAST, An Addendum to the PCAST Report on Forensic Science in Criminal Courts, Jan. 6, 2017.
  8. Id. at 7.
  9. PCAST, at 111, tbl. 2.
  10. Id. at 95 (footnote omitted).

Friday, August 11, 2017

"The Mask Is Down": Fingerprints and Other Ridge Skin Impressions

The mask is down, and this should lead to heated debates in the near future as many practitioners have not yet realized the earth-shattering nature of the changes. (Preface, at xi).
If you thought that fingerprint identification is a moribund and musty field, you should read the second edition of Fingerprints and Other Ridge Skin Impressions (FORSI for short), by Christophe Champod, Chris Lennard, Pierre Margot, and Milutin Stoilovic.

The first edition "observed a field that is in rapid progress on both detection and identification issues." (Preface 2003). In the ensuing 13 years, "the scientific literature in this area has exploded (over 1,000 publications) and the related professions have been shaken by errors, challenges by courts and other scientists, and changes of a fundamental nature related to previous claims of infallibility and absolute individualization." (Preface 2016, at xi).

The Scientific Method

From the outset, the authors -- all leading researchers in forensic science -- express dissatisfaction with "standard, shallow statements such as 'nature never repeats itself'" and "the tautological argument that every entity in nature is unique." (P. 1). They also dispute the claim, popular among latent print examiners, that the "ACE-V protocol" is a deeply "scientific method":
ACE-V is a useful mnemonic acronym that stands for analysis, comparison, evaluation, and verification ... . Although [ACE-V was] not originally named that way, pioneers in forensic science were already applying such a protocol (Heindl 1927; Locard 1931). ... It is a protocol that does not, in itself, give details as to how the inference is conducted. Most authors stay at this descriptive stage and leave the inferential or decision component of the process to "training and experience" without giving any more guidance as to how examiners arrive at their decisions. As rightly highlighted in the NRC report (National Research Council 2009, pp. 5-12): "ACE-V provides a broadly stated framework for conducting friction ridge analyses. However, this framework is not specific enough to qualify as a validated method for this type of analysis." Some have compared the steps of ACE-V to the steps of standard hypothesis testing, described generally as the "scientific method" (Wertheim 2000; Triplett and Cooney 2006; Reznicek et al. 2010; Brewer 2014). We agree that ACE-V reflects good forensic practice and that there is an element of peer review in the verification stage ... ; however, draping ACE-V with the term "scientific method" runs the risk of giving this acronym more weight than it deserves. (Pp. 34-35).
Indeed, it is hard to know what to make of claims that "standard hypothesis testing" is the "scientific method." Scientific thinking takes many forms, and the source of its spectacular successes is a set of norms and practices for inquiry and acceptance of theories that go beyond some general steps for qualitatively assessing how similar two objects are and what the degree of similarity implies about a possible association between the objects.

Exclusions as Probabilities

Many criminalists think of exclusions as logical deductions. They think, for example, that deductively valid reasoning shows that the same finger could not possibly be the source of two prints that are so radically different in some feature or features. I have always thought that exclusions are part of an inductive logical argument -- not, strictly speaking, a deductive one. 1/ However, FORSI points out that if the probability is zero that "the features in the mark and in the submitted print [are] in correspondence, meaning within tolerances, if these have come from the same source," then "an exclusion of common source is the obvious deductive conclusion ... ." (P. 71). This is correct. Within a Boolean logic (one in which the truth values of all propositions are 1 or 0), exclusions are deductions, and deductive arguments are certainly valid or invalid.

But the usual articulation of what constitutes an exclusion (with probability 1) does not withstand analysis. Every pair of images has some difference in every feature (even when the images come from the same source). How does the examiner know (with probability 1) that a difference "cannot be explained other than by the hypothesis of different sources"? (P. 70). In some forensic identification fields, the answer is that the difference must be "significant." 2/ But this is an evasion. As FORSI explains,
In practice, the difficulty lies in defining what a "significant difference" actually is (Thornton 1977). We could define "significant" as being a clear difference that cannot be readily explained other than by a conclusion that the print and mark are from different sources. But it is a circular definition: Is it "significant" if one cannot resolve it by another explanation than a different source or do we conclude to an exclusion because of the "significant" difference? (Page 71).
Fingerprint examiners have their own specialized vocabulary for characterizing differences in a pair of prints. FORSI defines the terms "exclusion" and "significant" by invoking a concept familiar (albeit unnecessary) in forensic DNA analysis -- the match window within which two measurements of what might be the same allele are said to match. In the fingerprint world, the analog seems to be "tolerance":
The terms used to discuss differences have varied over the years and can cause confusion (Leo 1998). The terminology is now more or less settled (SWGFAST 2013b). Dissimilarities are differences in appearance between two compared friction ridge areas from the same source, whereas discrepancy is the observation of friction ridge detail in one impression that does not exist in the corresponding area of another impression. In the United Kingdom, the term disagreement is also used for discrepancy and the term explainable difference for dissimilarity (Forensic Science Regulator 2015a).

A discrepancy is then a "significant" difference and arises when the compared features are declared to be "out of tolerance" for the examiner, tolerances as defined during the analysis. This ability to distinguish between dissimilarity (compatible to some degree with a common source) and discrepancy (meaning almost de facto different sources) is essential and relies mainly on the examiner's experience. ... The first key question ... then becomes ... :
Q1. How probable is it to observe the features in the mark and in the submitted print in correspondence, meaning within tolerances, if these have come from the same source? (P. 71).
The phrase "almost de facto different sources" is puzzling. "De facto" means in fact as opposed to in law. Whether a print that is just barely out of tolerance originated from the same finger always is a question of fact. I presume "almost de facto different sources" means the smallest point at which the probability of being out of tolerance is so close to zero that we may as well round it off to exactly zero. An exclusion is thus a claim that it is practically impossible for the compared features to be out of tolerance when they are in an image from the same source.

But to insist that this probability is zero is to violate "Cromwell's Rule," as the late Dennis Lindley called the admonition to avoid probabilities of 0 or 1 for empirical claims. As long as there is a non-zero probability that the perceived "discrepancy" could somehow arise -- as there always is if only because every rule of biology could have a hitherto unknown exception -- deductive logic does not make an exclusion a logical certainty. Exclusions are probabilistic. So are "identifications" or "individualizations."

Inclusions as Probabilities

At the opposite pole from an exclusion is a categorical "identification" or "source attribution." Categorical exclusions are statements of probability -- the examiner is reporting "I don't see how these differences could exist for a common source" -- from which it follows that the hypothesis of a different source has a high probability (not that it is deductively certain to be true). Likewise, categorical "identifications" are statements of probability -- now the examiner is reporting "I don't see how all these features could be as similar as they are for different sources" -- from which it follows that the hypothesis of a common source has a high probability (not that it is certain to be true). This leaves a middle zone of inclusions in which the examiner is not confident enough to declare an identification or an exclusion and the examiner makes no effort to describe its probative value -- beyond saying "It is not conclusive proof of anything."

The idea that examiners report all-but-certain exclusions and all-but-certain inclusions ("identifications") has three problems. First, how should examiners get to these states of subjective near-certainty? Second, each report seemed to involve the probability of the observed features under only a single hypothesis -- different source for exclusions and same source for inclusions. Third, everything between the zones of near-certainty gets tossed in the dust bin.

I won't get into the first issue here, but I will note FORSI's treatment of the second two. FORSI seems to accept exclusions (in the sense of near-zero probabilities for the observations given the same-source hypothesis) as satisfactory; nevertheless, for inclusions, it urges examiners to consider the probability of the observations under both hypotheses. In doing so, it adopts a mixed perspective, using a match-window p-value for the exclusion step and a likelihood ratio for an inclusion. Some relevant excerpts follow:
The above discussion has considered the main factors driving toward an exclusion (associated with question Q1); we should now move to the critical factor that will drive toward an identification, with this being the specificity of the corresponding features. ...

Considerable confusion exists among laymen, indeed also among fingerprint examiners, on the use of words such as match, unique, identical, same, and identity. Although the phrase "all fingerprints are unique" has been used to justify fingerprint identification opinions, it is no more than a statement of the obvious. Every entity is unique, because an entity can only be identical to itself. Thus, to say that "this mark and this print are identical to each other" is to invoke a profound misconception; the two might be indistinguishable, but they cannot be identical. In turn, the notion of "indistinguishability" is intimately related to the quantity and quality of detail that has been observed. This leads to distinguishing between the source variability derived from good-quality prints and the expressed variability in the mark, which can be partial, distorted, or blurred (Stoney 1989). Hence, once the examiner is confident that they cannot exclude, the only question that needs to be addressed is simply:
Q2. What is the probability of observing the features in the mark (given their tolerances) if the mark originates from an unknown individual?
If the ratio is calculated between the two probabilities associated with Q1 and Q2, we obtain what is called a likelihood ratio (LR). Q1 becomes the numerator question and Q2 becomes the denominator question. ...

In a nutshell, the numerator is the probability of the observed features if the mark is from the POI, while the denominator is the probability of the observed features if the mark is from a different source. When viewed as a ratio, the strength of the observations is conveyed not only by the response to one or the other of the key questions, but by a balanced assessment of both. ... The LR is especially ... applies regardless of the type of forensic evidence considered and has been put at the core of evaluative reporting in forensic science (Willis 2015). The range of values for the LR is between 0 and infinity. A value of 1 indicates that the forensic findings are equally likely under either proposition and they do not help the case in one direction or the other. A value of 10,000, as an example, means that the forensic finding provides very strong support for the prosecution proposition (same source) as opposed to its alternative (the defense proposition—different sources). A value below 1 will strengthen the case in favor of the view that the mark is from a different source than the POI. The special case of exclusion is when the numerator of the LR is equal to 0, making the LR also equal to 0. Hence, the value of forensic findings is essentially a relative and conditional measure that helps move a case in one direction or the other depending on the magnitude of the LR. The explicit formalization of the problem in the form of a LR is not new in the area of fingerprinting and can be traced back to Stoney (1985). (P. 75)
In advocating a likelihood ratio (albeit one for an initial "exclusion" with poorly defined statistical properties), FORSI is at odds with the historical practice. This practice, as we saw, demands near certainty if an inclusion is to be labelled an "identification" or an "exclusion." In the middle range, examiners "report 'inconclusive' without any other qualifiers of the weight to be assigned to the comparison." (P. 98). FORSI disapproves of this "peculiar state of affairs." (P. 99). It notes that
Examiners could, at times, resort to terms such as "consistent with," "points consistent with," or "the investigated person cannot be excluded as the donor of the mark," but without offering any guidance as to the weight of evidence [see, for example, Maceo (2011a)]. In our view, these expressions are misleading. We object to information formulated in such broad terms that may be given more weight than is justified. These terms have been recently discouraged in the NRC report (National Research Council 2009) and by some courts (e.g., in England and Wales R v. Puacca [2005] EWCA Crim 3001). And this is not a new debate. As early as 1987, Brown and Cropp (1987) suggested to avoid using the expressions "match," "identical" and "consistent with."

There is a need to find appropriate ways to express the value of findings. The assignment of a likelihood ratio is appropriate. Resorting to the term "inconclusive" deprives the court of information that may be essential. (P. 99).
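The likelihood ratio described in the excerpts above is simple to compute once the two probabilities have been assigned. The sketch below is illustrative only. The function names and the input probabilities are hypothetical, and the step from prior odds to posterior odds is ordinary Bayes' rule rather than something spelled out in the FORSI passages quoted here.

```python
import math


def likelihood_ratio(p_features_if_same_source, p_features_if_different_source):
    """LR = answer to Q1 (numerator) divided by answer to Q2 (denominator)."""
    if p_features_if_different_source == 0.0:
        if p_features_if_same_source == 0.0:
            raise ValueError("LR is undefined when both probabilities are zero")
        return math.inf
    return p_features_if_same_source / p_features_if_different_source


def posterior_odds(prior_odds, lr):
    """Bayes' rule in odds form: posterior odds = prior odds x LR."""
    return prior_odds * lr


# Hypothetical probabilities chosen to reproduce the LR of 10,000 used as an
# example in the quotation; they are not measured values.
lr_strong = likelihood_ratio(0.5, 0.00005)

# An "exclusion" in the strict sense: the numerator is exactly zero, so LR = 0.
lr_exclusion = likelihood_ratio(0.0, 0.01)

print(lr_strong, posterior_odds(1 / 1000, lr_strong))
print(lr_exclusion, posterior_odds(1000.0, lr_exclusion))
```

The second case shows why an LR of exactly 0 is such a strong claim: the posterior odds of a common source are 0 whatever the prior odds were, so no other evidence in the case could ever revive the same-source hypothesis. A numerator that is merely very small leaves the posterior odds low but not immovable.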
The Death of "Individualization" and the Sickness of "Individual Characteristics"

The leaders of the latent print community have all but abandoned the notion of "individualization" as a claim that one and only one finger that ever existed could have left the particular print. (Judging from public comments to the National Commission on Forensic Science, however, individual examiners are still comfortable with such testimony.) FORSI explains:
In the fingerprint field, the term identification is often used synonymously with individualization. It represents a statement akin to certainty that a particular mark was made by the friction ridge skin of a particular person. ... Technically identification refers to the assignment of an entity to a specific group or label, whereas individualization represents the special case of identification when the group is of size 1. ... [Individualization] has been called the Earth population paradigm (Champod 2009b). ... Kaye (2009) refers to "universal individualization" relative to the entire world. But identification could also be made without referring to the Earth's population, referring instead to a smaller subset, for example, the members of a country, a city, or a community. In that context, Kaye talks about "local individualization" (relative to a proper subset). This distinction between "local" and "global" was used in two cases ... [W]e would recommend avoiding using the term "individualization." (P. 78).
The whole earth definition of "individualization" also underlies the hoary distinction in forensic science between "class" and "individual" characteristics. But a concatenation of class characteristics can be extremely rare and hence of similar probative value as putatively individual characteristics, and one cannot know a priori that "individual" characteristics are limited to a class of size 1. In the fingerprinting context, FORSI explains that
In the literature, specificity was often treated by distinguishing "class" characteristics from "individual" characteristics. Level 1 features would normally be referred to as class characteristics, whereas levels 2 and 3 deal with "individual" characteristics. That classification had a direct correlation with the subsequent decisions: only comparisons involving "individual" characteristics could lead to an identification conclusion. Unfortunately, the problem of specificity is more complex than this simple dichotomy. This distinction between "class" and "individual" characteristics is just a convenient, oversimplified way of describing specificity. Specificity is a measure on a continuum (probabilities range from 0 to 1, without steps) that can hardly be reduced to two categories without more nuances. The term individual characteristic is particularly misleading, as a concordance of one minutia (leaving aside any consideration of level 3 features) would hardly be considered as enough to identify. The problem with this binary categorization is that it encourages the examiner to disregard the complete spectrum of feature specificity that ranges from low to high. It is proposed that specificity at each feature level be studied without any preconceived classification of its identification capability by itself. Indeed, nothing should prevent a specific general pattern—such as, for example, an arch with continuous ridges from one side to the other (without any minutiae)—from being considered as extremely selective, since no such pattern has been observed to date. (P. 74)
FORSI addresses many other topics -- errors, fraud, automated matching systems, probabilistic systems, chemical methods for detection of prints, and much more. Anyone concerned with latent-fingerprint evidence should read it. Those who do will see why the authors express views like these:
Over the years, the fingerprint community has fostered a state of laissez-faire that left most of the debate to the personal informed decisions of the examiner. This state manifests itself in the dubious terminology and semantics that are used by the profession at large ... . (P. 344).
We would recommend, however, a much more humble way of reporting this type of evidence to the decision maker. Fingerprint examiners should be encouraged to report all their associations by indicating the degree of support the mark provides in favor of an association. In that situation, the terms "identification" or "individualization" may disappear from reporting practices as we have suggested in this book. (P. 345).

  1. David H. Kaye, Are "Exclusions" Deductive and "Identifications" Merely Probabilistic?, Forensic Sci., Stat. & L., Apr. 28, 2017,
  2. E.g., SWGMAT, Forensic Paint Analysis and Comparison Guidelines 3.2.9 (2000), available at