Wednesday, July 17, 2019

No Tension Between Rule 704 and Best Principles for Interpreting Forensic-science Test Results

At a webinar on probabilistic genotyping organized by the FBI, the Department of Justice’s Senior Advisor on Forensic Science, Ted Hunt, summarized the rules of evidence that are most pertinent to scientific and expert testimony. In the course of a masterful survey, he suggested that Federal Rule of Evidence 704 somehow conflicts with the evidence-centric approach to evaluating laboratory results recommended by a subcommittee of the National Commission on Forensic Science, by the American Statistical Association, and by European forensic-science service providers. 1/ In this approach, the expert stops short of opining on whether the defendant is the source of the trace. Instead, the expert merely reports that the data are L times more probable when the source hypothesis is true than when some alternative source hypothesis is true. (Or, the expert gives some qualitative expression such as "strong support" when this likelihood ratio is large.)
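The evidence-centric report reduces to a bit of arithmetic. The sketch below computes a likelihood ratio and maps it to a qualitative expression; the probabilities and the verbal cut-offs are invented for illustration and are not any organization's official scale.

```python
# Sketch of evidence-centric reporting: the examiner reports how much more
# probable the observed data are under one source hypothesis (H1) than under
# an alternative (H2), without opining on whether H1 is true.

def likelihood_ratio(p_data_given_h1: float, p_data_given_h2: float) -> float:
    """L = P(data | H1) / P(data | H2)."""
    return p_data_given_h1 / p_data_given_h2

def verbal_equivalent(lr: float) -> str:
    """Map a likelihood ratio to a qualitative expression (illustrative bands only)."""
    if lr >= 10_000:
        return "very strong support for H1 over H2"
    if lr >= 100:
        return "strong support for H1 over H2"
    if lr >= 1:
        return "limited support for H1 over H2"
    return "support for H2 over H1"

lr = likelihood_ratio(0.9, 0.0001)  # hypothetical probabilities
print(round(lr))                    # 9000
print(verbal_equivalent(lr))        # strong support for H1 over H2
```

The examiner who testifies this way reports only the strength of the evidence (the ratio or its verbal equivalent), leaving the posterior question -- is the defendant the source? -- to the factfinder.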

Whatever the merits of these proposals, Rule 704 does not stand in the way of implementing the recommended approach to reporting and testifying. First, the identity of the source of a trace is not necessarily an ultimate issue. To use the example of latent-print identification given in the webinar, the traditional opinion that a named individual is the source of a print is not an opinion on an ultimate issue. Courts have long allowed examiners to testify that the print lifted from a gun comes from a specific finger. But this conclusion is not an opinion on whether the murder defendant is the one who pulled the trigger. The examiner’s source attribution bears on the ultimate issue of causing the death of a human being, but the examiner who reports that the prints were defendant's is not opining that the defendant not only touched the gun (or had prints planted on it) but also pulled the trigger. Indeed, the latent print examiner would have no scientific basis for such an opinion on an element of the crime of murder.

Furthermore, even when an expert does want to express an opinion on an ultimate issue, Rule 704 does not counsel in favor of admitting it into evidence. Rule 704(a) consists of a single sentence: “An opinion is not objectionable just because it embraces an ultimate issue.” The sole function of these words is to repeal an outmoded, common-law rule categorically excluding these opinions. The advisory committee that drafted this repealing rule explained that “to allay any doubt on the subject, the so-called ‘ultimate issue’ rule is specifically abolished by the instant rule.” The committee expressed no positive preference for such opinions over evidence-centric expert testimony. It emphasized that Rules 701, 702, and 403 protect against unsuitable opinions on ultimate issues. Modern courts continue to exclude ultimate-opinion testimony when it is not sufficiently helpful to jurors. For example, conclusions of law remain highly objectionable.

Consequently, any suggestion that Rule 704 is an affirmative reason to admit one kind of testimony over another is misguided. “The effect of Rule 704 is merely to remove the proscription against opinions on ‘ultimate issues’ and to shift the focus to whether the testimony is ‘otherwise admissible.’” 2/ If conclusion-centric testimony is admissible, then so is the evidence-centric evaluation that lies behind it--with or without the conclusion.

In sum, there is no tension between Rule 704(a) and the recommendation to follow the evidence-centric approach. Repealing a speed limit on a road does not imply that drivers should put the pedal to the floor.

  1. This is the impression I received. The recording of the webinar should be available at the website of the Forensic Technology Center of Excellence in a week or two.
  2. Torres v. County of Oakland, 758 F.2d 147, 150 (6th Cir. 1985).
UPDATED: 18 July 2019 6:22 AM

Saturday, July 6, 2019

Distorting Daubert and Parting Ways with PCAST in Romero-Lobato

United States v. Romero-Lobato 1/ is another opinion applying the criteria for admissibility of scientific evidence articulated in Daubert v. Merrell Dow Pharmaceuticals 2/ to uphold the admissibility of a firearms examiner's conclusion that the microscopic marks on recovered bullets prove that they came from a particular gun. To do so, the U.S. District Court for the District of Nevada rejects the conclusions of the President's Council of Advisors on Science and Technology (PCAST) on validating a scientific procedure.

This is not to say that the result in the case is wrong. There is a principled argument for admitting suitably confined testimony about matching bullet or ammunition marks. But the opinion from U.S. District Court Judge Larry R. Hicks does not contain such an argument. The court does not reach the difficult question of how far a toolmark expert may go in forging a link between ammunition and a particular gun. It did not have to. In what seems to be a poorly developed challenge to firearms-toolmark expertise, the defense sought to exclude all testimony about such an association.

This posting describes the facts of the case, the court's description of the law on the admissibility of source attributions by firearms-toolmark examiners, and its review of the practice under the criteria for admitting scientific evidence set forth by the Supreme Court in Daubert.


A grand jury indicted Eric Romero-Lobato for seven felonies. On March 4, 2018, he allegedly tried to rob the Aguitas Bar and Grill and discharged a firearm (a Taurus PT111 G2) into the ceiling. On May 14, he allegedly stole a woman's car at gunpoint while she was cleaning it at a carwash. Later that night, he crashed the car in a high-speed chase. On the front passenger's seat was a Taurus PT111 G2 handgun.

Steven Johnson, a supervising criminalist in the Forensic Science Division of the Washoe County Sheriff's Office, 3/ was prepared to testify that the handgun had fired a round into the ceiling of the bar. Romero-Lobato moved "to preclude the testimony." The district court held a pretrial hearing at which Johnson testified to his background, training, and experience. He explained that he matched the bullet to the gun using the "AFTE method" advocated by the Association of Firearm and Tool Mark Examiners.

Defendant's challenge rested "on the critical NAS and PCAST Reports as evidence that 'firearms analysis' is not scientifically valid and fails to meet the requisite threshold for admission under Daubert and Federal Rule of Evidence 702." Apparently, the only expert at the hearing was the Sheriff Department's criminalist. Judge Hicks denied the motion to exclude Johnson's expert opinion testimony and issued a relatively detailed opinion.


Skipping over the early judicial resistance to "this is the gun" testimony, 4/ the court noted that despite misgivings about such testimony on the part of several federal district courts, only one reported case has barred all source opinion testimony, 5/ and the trend among the more critical courts is to search for ways to admit the conclusion with qualifications on its certainty.

Judge Hicks did not pursue the possibility of admitting but constraining the testimony, apparently because the defendant did not ask for that. Instead, the court reasoned that to overcome the inertia of the current caselaw, a defendant must have extraordinarily strong evidence (although it also recognized that the burden is on the government to prove scientific validity under Daubert). The judge wrote:
[T]he defense has not cited to a single case where a federal court has completely prohibited firearms identification testimony on the basis that it fails the Daubert reliability analysis. The lack of such authority indicates to the Court that defendant's request to exclude Johnson's testimony wholesale is unprecedented, and when such a request is made, a defendant must make a remarkable argument supported by remarkable evidence. Defendant has not done so here.
Defendant's less-than-remarkable evidence was primarily two consensus reports of scientific and other experts who reviewed the literature on firearms-mark comparisons. 6/ Both are remarkable. The first document was the highly publicized National Academy of Sciences committee report on improving forensic science. The committee expressed concerns about the largely subjective comparison process and the absence of studies to adequately measure the uncertainty in the evaluations. The court deemed these concerns to be satisfied by a single research report submitted to the Department of Justice, which funded the study:
The NAS Report, released in 2009, concluded that “[s]ufficient studies have not been done to understand the reliability and repeatability” of firearm and toolmark examination methods. ... The Report's main issue with the AFTE method was that it did not provide a specific protocol for determining a match between a shell casing or bullet and a specific firearm. ... Instead, examiners were to rely on their training and experience to determine if there was a “sufficient agreement” (i.e. match) between the mark patterns on the casing or bullet and the firearm's barrel. ... During the Daubert hearing, Johnson testified about his field's response to the NAS Report, pointing to a 2013 study from Miami-Dade County (“Miami-Dade Study”). The Miami-Dade Study was conducted in direct response to the NAS Report and was designed as a blind study to test the potential error rate for matching fired bullets to specific guns. It examined ten consecutively manufactured barrels from the same manufacturer (Glock) and bullets fired from them to determine if firearm examiners (165 in total) could accurately match the bullets to the barrel. 150 blind test examination kits were sent to forensics laboratories across the United States. The Miami-Dade Study found a potential error rate of less than 1.2% and an error rate by the participants of approximately 0.007%. The Study concluded that “a trained firearm and tool mark examiner with two years of training, regardless of experience, will correctly identify same gun evidence.”
A more complete (and accurate) reading of the Miami-Dade Police Department's study shows that it was not designed to measure error rates as they are defined in the NAS report and that the "error rate" was much closer to 1%. That's still small, and, with truly independent verification of an examiner's conclusions, the error rate should be smaller than that for examiners whose findings are not duplicated. Nonetheless, as an earlier posting shows, the data are not as easily interpreted and applied to case work as the report from the crime laboratory suggests. The research study, which has yet to appear in any scientific journal, has severe limitations.

The second report, released late in 2016 by the President's Council of Advisors on Science and Technology (PCAST), flatly maintained that microscopic firearms-marks comparisons had not been scientifically validated. Essentially dismissing the Miami-Dade Police and earlier research as not properly designed to measure the ability of examiners to infer whether the same gun fired test bullets and ones recovered from a crime scene, PCAST reasoned that (1) AFTE-type identification had yet to be shown to be "reliable" within the meaning of Rule 702 (as PCAST interpreted the rule); and (2) if courts disagreed with PCAST's legal analysis of the rule's requirements, they should at least require examiners associating ammunition with a particular firearm to give an upper bound, as ascertained from controlled experiments, on false-positive associations. (These matters are discussed in previous postings.)

The court did not address the second conclusion and gave little or no weight to the first one. It wrote that the 2016 report
concluded that there was only one study done that “was appropriately designed to test foundational validity and estimate reliability,” the Ames Laboratory Study (“Ames Study”). The Ames Study ... reported a false-positive rate of 1.52%. ... The PCAST Report did not reach a conclusion as to whether the AFTE method was reliable or not because there was only one study available that met its criteria.
All true. PCAST certainly did not write that there is a large body of high quality research that proves toolmark examiners cannot associate expended ammunition with specific guns. PCAST's position is that a single study is not a body of evidence that establishes a scientific theory--replication is crucial. If the court believed that there is such a body of literature, it should have explained the basis for its disagreement with the Council's assessment of the literature. If it agreed with PCAST that the research base is thin, then it should have explained why forensic scientists should be able to testify--as scientists--that they know which gun fired which bullet. This opinion does neither. (I'll come to the court's discussion of Daubert below.)

Instead, the court repeats the old news that
the PCAST Report was criticized by a number of entities, including the DOJ, FBI, ATF, and AFTE. Some of their issues with the Report were its lack of transparency and consistency in determining which studies met its strict criteria and which did not and its failure to consult with any experts in the firearm and tool mark examination field.
Again, all true. And all so superficial. That prosecutors and criminal investigators did not like the presidential science advisors' criticism of their evidence is no surprise. But exactly what was unclear about PCAST's criteria for replicated, controlled, experimental proof? In fact, the DOJ later criticized PCAST for being too clear--for having a "nine-part" "litmus test" rather than more obscure "trade-offs" with which to judge what research is acceptable. 7/

And what was the inconsistency in PCAST's assessment of firearms-marks comparisons? Judge Hicks maintained that
The PCAST Report refused to consider any study that did not meet its strict criteria; to be considered, a study must be a “black box” study, meaning that it must be completely blind for the participants. The committee behind the report rejected studies that it did not consider to be blind, such as where the examiners knew that a bullet or spent casing matched one of the barrels included with the test kit. This is in contrast to studies where it is not possible for an examiner to correctly match a bullet to a barrel through process of elimination.
This explanation enucleates no inconsistency. The complaint seems to be that PCAST's criteria for validating a predominantly subjective feature-comparison procedure are too demanding or restrictive, not that these criteria were applied inconsistently. Indeed, no inconsistency in applying the "litmus test" for an acceptable research design to firearms-mark examinations is apparent.

Moreover, the court's definition of "a 'black box' study" is wrong. All that PCAST meant by "black box" is that the researchers are not trying to unpack the process that examiners use and inspect its components. Instead, they say to the examiner, "Go ahead, do your thing. Just tell us your answer, and we'll see if you are right." The term is used by software engineers who test complex programs to verify that the outputs are what they should be for the inputs. The Turing test for the proposition that "machines can think" is a kind of black box test.

Nonetheless, this correction is academic. The court is right about the fact that PCAST gave no credence to "closed tests" like those in which an examiner sorts bullets into pairs knowing in advance that every bullet has a mate. Such black-box experiments are not worthless. They show a nonzero level of skill, but they are easier than "open tests" in which an examiner is presented with a single pair of bullets to decide whether they have a common source, then another pair, and another, and so on. In Romero-Lobato, the examiner had one bullet from the ceiling to compare to a test bullet he fired from one suspect gun. There is no "trade-off" that would make the closed-test design appropriate for establishing the examiner's skill at the task he performed.

All that remains of the court's initial efforts to avoid the PCAST report is the tired complaint about a "failure to consult with any experts in the firearm and tool mark examination field." But what consultation does the judge think was missing? The scientists and technologists who constitute the Council asked the forensic science community for statements and literature to support their practices. The Council shared a draft of its report with the Department of Justice before finalizing it. After releasing the report, it asked for more responses and issued an addendum. Forensic-services providers may complain that the Council did not use the correct criteria, that its members were closed-minded or biased, or that the repeated opportunities to affect the outcome were insufficient or even a sham. But a court needs more than a throw-away sentence about "failure to consult" to justify treating the PCAST report as suspect.


Having cited a single, partly probative police laboratory study as if it were a satisfactory response to the National Academy's concerns and having colored the President's Council report as controversial without addressing the limited substance of the prosecutors' and investigators' complaints, the court offered a "Daubert analysis." It marched through the five indicia that the Supreme Court enumerated as factors that courts might consider in assessing scientific validity and reliability.

A. It Has Been Tested

The Romero-Lobato opinion made much of the fact that "[t]he AFTE methodology has been repeatedly tested" 8/ through "numerous journals [sic] articles and studies exploring the AFTE method" 9/ and via Johnson's perfect record on proficiency tests as proved by his (hearsay and character evidence) testimony. Einstein once expressed impatience "with scientists who take a board of wood, look for its thinnest part and drill a great number of holes where drilling is easy." 10/ Going through the drill of proficiency testing does not prove much if the tests are simple and unrealistic. A score of trivial or poorly designed experiments should not engender great confidence. The relevant question under Daubert is not simply "how many tests so far?" It is how many challenging tests have been passed. The opinion makes no effort to answer that question. It evinces no awareness of the "10 percent error rate in ballistic evidence" noted in the NAS Report that prompted corrective action in the Detroit Police crime laboratory.

Instead of responding to PCAST's criticisms of the design of the AFTE Journal studies, the court wrote that "[a]lthough both the NAS and PCAST Reports were critical of the AFTE method because of its inherent subjectivity, their criticisms do not affect whether the technique they criticize has been repeatedly tested. The fact that numerous studies have been conducted testing the validity and accuracy of the AFTE method weighs in favor of admitting Johnson's testimony."

But surely the question under Daubert is not whether there have been "numerous studies." It is what these studies have shown about the accuracy of trained examiners to match a single unknown bullet with control bullets from a single gun. The court may have been correct in concluding that the testing prong of Daubert favors admissibility here, but its opinion fails to demonstrate that "[t]here is little doubt that the AFTE method of identifying firearms satisfies this Daubert element."

B. Publication and Peer Review

Daubert recognizes that, to facilitate the dissemination, criticism, and modification of theories, modern science relies on publication in refereed journals that members of the scientific community read. Romero-Lobato deems this factor to favor admission for two reasons. First, the AFTE Journal, in which virtually all the studies dismissed by PCAST appear, uses referees. That it is not generally regarded as a significant scientific journal -- it is not available through most academic libraries, for example -- went unnoticed.

Second, the court contended that "of course, the NAS and PCAST Reports themselves constitute peer review despite the unfavorable view the two reports have of the AFTE method. The peer review and publication factor therefore weighs in favor of admissibility." The idea that the rejection in consensus reports of a series of studies as truly validating a theory "weighs in favor of admissibility" is difficult to fathom. Some readers might find it preposterous.

C. Error Rates

Just as the court was content to rely on the absolute number of studies as establishing that the AFTE method has been adequately tested, it takes the error rates reported in the questioned studies at face value. Finding the numbers to be "very low," and implying (without explanation) that PCAST's criteria are too "strict," it concludes that Daubert's "error rate" factor too "weighs in favor of admissibility."

A more plausible conclusion is that a large body of studies that fail to measure the error rates (false positive and negative associations) appropriately but do not indicate very high error rates is no more than weakly favorable to admission. (For further discussion, see the previous postings on the court's discussion of the Miami Dade and Ames Laboratory technical reports.)

D. Controlling Standards

The court cited no controlling standards for the judgment of "'sufficient agreement' between the 'unique surface contours' of two toolmarks." After reciting the AFTE's definition of "sufficient agreement," Judge Hicks decided that "matching two tool marks essentially comes down to the examiner's subjective judgment based on his training, experience, and knowledge of firearms. This factor weighs against admissibility."

However, the opinion adds that "the consecutive matching striae ('CMS') method," which Johnson used after finding "sufficient agreement," is "an objective standard under Daubert." It is "objective" because an examiner cannot conclude that there is a match unless he "observes two or more sets of three or more consecutive matching markings on a bullet or shell casing." The opinion did not consider the possibility that this numerical rule does little to confine discretion if no standard guides the decision whether a marking matches. Instead, the opinion debated whether the CMS method should be considered objective and confused that question with how widely the method is used.

The relevant inquiry is not whether a method is subjective or objective. For a predominantly subjective method, the question is whether standards for making subjective judgments will produce more accurate and more reliable (repeatable and reproducible) decisions and how much more accurate and reliable they will be.

E. General Acceptance

Finally, the court found "widespread acceptance in the scientific community." But the basis for this conclusion was flimsy. It consisted of statements from other courts like "the AFTE method ... is 'widely accepted among examiners as reliable'" and "[t]his Daubert factor is designed to prohibit techniques that have 'only minimal support' within the relevant community." Apparently, the court regarded the relevant community as confined to examiners. Judge Hicks wrote that
it is unclear if the PCAST Report would even constitute criticism from the “relevant community” because the committee behind the report did not include any members of the forensic ballistics community ... . The acceptance factor therefore weighs in favor of admitting Johnson's testimony.
If courts insulate forensic-science service providers from the critical scrutiny of outside scientists, how can they legitimately use the general-acceptance criterion to help ascertain whether examiners are presenting "scientific knowledge" à la Daubert or something else?

  1. No. 3:18-cr-00049-LRH-CBC, 2019 WL 2150938 (D. Nev. May 16, 2019).
  2. 509 U.S. 579 (1993).
  3. For a discussion of a case involving inaccurate testimony from the same laboratory that caught the attention of the Supreme Court, see David H. Kaye, The Interpretation of DNA Evidence: A Case Study in Probabilities, National Academies of Science, Engineering and Medicine, Science Policy Decision-making Educational Modules, 2016, available at; McDaniel v. Brown: Prosecutorial and Expert Misstatements of Probabilities Do Not Justify Postconviction Relief — At Least Not Here and Not Now, Forensic Sci., Stat. & L., July 7, 2014,
  4. See David H. Kaye, Firearm-Mark Evidence: Looking Back and Looking Ahead, 68 Case W. Res. L. Rev. 723, 724-25 (2018), available at The court relied on the article's explication of more modern case law.
  5. The U.S. District Court for the District of Colorado excluded toolmark conclusions in the prosecutions for the bombing of the federal office building in Oklahoma City. The toolmarks there came from a screwdriver. David H. Kaye et al., The New Wigmore, A Treatise on Evidence: Expert Evidence 686-87 (2d ed. 2011).
  6. The court was aware of an earlier report from a third national panel of experts raising doubts about the AFTE method, but it did not cite or discuss that report's remarks. Although the 2008 National Academies report on the feasibility of establishing a ballistic imaging database only considered the forensic toolmark analysis of firearms in passing, it gave the practice no compliments. Kaye, supra note 4, at 729-32.
  7. Ted Robert Hunt, Scientific Validity and Error Rates: A Short Response to the PCAST Report, 86 Fordham L. Rev. Online Art. 14 (2017),
  8. Quoting United States v. Ashburn, 88 F.Supp.3d 239, 245 (E.D.N.Y. 2015).
  9. Citing United States v. Otero, 849 F.Supp.2d 425, 432–33 (D.N.J. 2012), for "numerous journals [sic] articles and studies exploring the AFTE method."
  10. Philipp Frank, Einstein's Philosophy of Science, Reviews of Modern Physics (1949).
MODIFIED: 7 July 2019 9:10 EST

Sunday, June 23, 2019

The Miami Dade Bullet-matching Study Surfaces in United States v. Romero-Lobato

Last month, the US District Court for the District of Nevada rejected another challenge to firearms toolmark comparisons. The opinion in United States v. Romero-Lobato, 1/ written by Judge Larry R. Hicks, relies in part on a six-year-old study that has yet to appear in any scientific journal. 2/ The National Institute of Justice (the research-and-development arm of the Department of Justice) funded the Miami-Dade Police Department Crime Laboratory "to evaluate the repeatability and uniqueness of striations imparted by consecutively manufactured EBIS barrels with the same EBIS pattern to spent bullets as well as to determine the error rate for the identification of same gun evidence." 3/ Judge Hicks describes the 2013 study as follows:
The Miami-Dade Study was conducted in direct response to the NAS Report and was designed as a blind study to test the potential error rate for matching fired bullets to specific guns. It examined ten consecutively manufactured barrels from the same manufacturer (Glock) and bullets fired from them to determine if firearm examiners (165 in total) could accurately match the bullets to the barrel. 150 blind test examination kits were sent to forensics laboratories across the United States. The Miami-Dade Study found a potential error rate of less than 1.2% and an error rate by the participants of approximately 0.007%. The Study concluded that “a trained firearm and tool mark examiner with two years of training, regardless of experience, will correctly identify same gun evidence.”
The "NAS Report" was the work of a large committee of scientists, forensic-science practitioners, lawyers, and others assembled by the National Academy of Sciences to recommend improvements in forensic science. A federal judge and a biostatistician co-chaired the committee. In 2009, four years after Congress funded the project, the report arrived. It emphasized the need to measure the error probabilities in pattern-matching tasks and discussed what statisticians call two-by-two contingency tables for estimating the sensitivity (true-positive probability) and specificity (true-negative probability) of the classifications. However, the Miami-Dade study was not designed to measure these quantities. To understand what it did measure, let's look at some of the details in the report to NIJ as well as what the court gleaned from the report (directly or indirectly).
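The two-by-two layout the committee had in mind is simple to set out. In the sketch below, the counts are entirely hypothetical; they only show how sensitivity and specificity would be estimated from a study that presented examiners with known same-source and known different-source pairs.

```python
# A hypothetical 2x2 contingency table of the kind the NAS report discusses:
# rows are ground truth (same source / different source); columns are the
# examiner's call. All counts are invented for illustration.
same_source      = {"identification": 190, "elimination": 10}   # 200 same-source pairs
different_source = {"identification": 5,   "elimination": 195}  # 200 different-source pairs

# Sensitivity (true-positive probability): P(identification | same source)
sensitivity = same_source["identification"] / sum(same_source.values())

# Specificity (true-negative probability): P(elimination | different source)
specificity = different_source["elimination"] / sum(different_source.values())

print(f"sensitivity = {sensitivity:.3f}")  # sensitivity = 0.950
print(f"specificity = {specificity:.3f}")  # specificity = 0.975
```

A design that does not pair each questioned item with known same-source and different-source comparisons cannot fill in this table, which is why the Miami-Dade design, described next, does not yield these quantities.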

A Blind Study?

The study was not blind in the sense of the subjects not realizing that they were being tested. They surely knew that they were not performing normal casework when they received the unusual samples and the special questionnaire with the heading "Answer Sheet: Consecutively Rifled EBIS-2 Test Set" asking such questions as "Is your Laboratory ASCLD/Lab Accredited?" That is not a fatal flaw, but it has some bearing -- not recognized in the report's sections on "external validity" -- on generalizing from the experimental findings to case work. 4/

Volunteer Subjects?

The "150 blind examination kits" somehow went to 201 examiners, not just in the United States, but also in "4 international countries." 5/ The researchers did not consider or reveal the performance of 36 "participants [who] did not meet the two year training requirement for this study." (P. 26). How well they did in comparison to their more experienced colleagues would have been worth knowing, although it would have been hard to draw clear conclusions since there were so few errors on the test. In any event, ignoring the responses from the trainees "resulted in a data-producing sample of 165 participants." (P. 26).

These research subjects came from emails sent to "the membership list for the Association of Firearm and Tool Mark Examiners (AFTE)." (Pp. 15-16). AFTE members all "derive[] a substantial portion of [their] livelihood from the examination, identification, and evaluation of firearms and related materials and/or tool marks." (P. 15). Only 35 of the 165 volunteers were certified by AFTE (P. 30), and 20 worked at unaccredited laboratories (P. 31).

What Error Rates?

Nine of the 165 fully trained subjects (5%) made errors (treating "inconclusive" as a correct response). The usual error rates (false positives and false negatives) are not reported because of the design of the "blind examination kits." The obvious way to obtain those error rates is to ask each subject to evaluate pairs of items -- some from the same source and some from different sources (with the examiners blinded to the true source information known to the researchers). Despite the desire to respond to the NAS report, the Miami Dade Police Department Laboratory did not make "kits" consisting of such a mixture of pairs of same-source and different-source bullets.

Instead, the researchers gave each subject a single collection of ten bullets produced by firing one manufacturer's ammunition in eight of the ten barrels. (Two of these "questioned bullets," as I will call them, came from barrel 3 and two from barrel 9; none came from barrel 4.) Along with the ten questioned bullets, they gave the subjects eight pairs of what we can call "exemplar bullets." Each pair of exemplar bullets came from two test fires of the same eight of the ten consecutively manufactured barrels (barrels 1-3 and 5-9). The task was to associate each questioned bullet with an exemplar pair or to decide that it could not be associated with any of the eight pairs. Or, the research subjects could circle "inconclusive" on the questionnaire. Notice that almost all the questioned bullets came from the barrels that produced the exemplar bullets -- only two such barrels were not a source of an unknown -- and only one barrel that produced a questioned bullet was not represented in the exemplar set.

This complicated and unbalanced design raises several questions. After associating an unknown bullet with an exemplar pair, will an examiner seriously consider the other exemplar pairs? After eliminating a questioned bullet as originating from, say, seven exemplar-pair barrels, would he be inclined to pick one of the remaining three? Because of the extreme overlap in the sets, such strategies would pay off on average. Such interactions could make false eliminations less probable, and true associations more probable, than with the simpler design of a series of single questioned-to-source comparisons.

The report to NIJ does not indicate that the subjects received any instructions to prevent them from having an expectation that most of the questioned bullets would match some pair of exemplar bullets. The only instructions it mentions are on a questionnaire that reads:
Please microscopically compare the known test shots from each of the 8 barrels with the 10 questioned bullets submitted. Indicate your conclusion(s) by circling the appropriate known test fired set number designator on the same line as the alpha unknown bullet. You also have the option of Inconclusive and Elimination. ...
Yet, the report confidently asserts that "[t]he researchers utilized an 'open set' design where the participants had no expectation that all unknown tool marks should match one or more of the unknowns." (P. 28).

To be sure, the study has some value in demonstrating that a subset of the subjects could perform a presumably difficult task of associating unknown bullets with exemplar ones. Moreover, whatever one thinks of this alleged proof of "uniqueness," the results imply that there are microscopic (or other) features of marks on bullets that vary with the barrel through which they traveled. But the study does not supply a good measure of examiner skill at making associations in fully "open" situations.

A 0.007% Error Rate?

As noted above, but not in the court's opinion, 5% of the examiners made some kind of error. That said, there were only 12 false-positive associations or false-negative ones (outright eliminations) out of 165 x 10 = 1,650 answers. (I am assuming that every subject completed the questionnaire for every unknown bullet.) That is an overall error proportion of 12/1650 = 0.007 = 0.7%.

The researchers computed the error rate slightly differently. They reported only the average error rate for the 165 experienced examiners. The vast majority (156) made no errors. Six made one error, and three made two. So the average examiner's proportion of errors was [156(0) + 6(0.1) + 3(0.2)]/165 = 0.007. No difference at all.
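The agreement between the overall proportion and the examiner average can be checked directly. A minimal sketch, using only the counts reported above (156 examiners with no errors, six with one, and three with two):

```python
# Reproducing the two error-rate calculations from the reported counts.
n_examiners = 165
n_questions = 10

# Per-examiner error counts: 156 with 0 errors, 6 with 1, 3 with 2.
error_counts = [0] * 156 + [1] * 6 + [2] * 3

total_errors = sum(error_counts)                      # 12 errors in all
overall = total_errors / (n_examiners * n_questions)  # 12/1650

# Average of the per-examiner error proportions.
average = sum(e / n_questions for e in error_counts) / n_examiners

print(round(overall, 4), round(average, 4))  # 0.0073 0.0073
```

Because every examiner faced the same ten questions, the average of the per-examiner proportions necessarily equals the overall proportion.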

This 0.007 figure is 100 times the number the court gave. Perhaps the opinion had a typographical error -- an adscititious percentage sign that the court missed when it reissued its opinion (to correct other typographical errors). The error rate is still small and would not affect the court's reasoning.

But the overall proportion of errors and the average-examiner error rate could diverge. The report gives the error proportions for the 9 examiners who made errors as 0.1 (6 of the examiners) and 0.2 (another 3 examiners). Apparently, all of the 9 erring examiners evaluated all 10 unknowns. What about the other 156 examiners? Did all of them evaluate all 10? The worst-case scenario is that every one of the 156 error-free examiners answered only one question. That supplies only 156 correct answers. Add this number to the 12 incorrect answers, and we have an error proportion of 12/168 ≈ 0.07 = 7% -- a thousand times larger than the court's number.

However, this worst-case scenario did not occur. The funding report states that "[t]here were 1,496 correct answers, 12 incorrect answers and 142 inconclusive answers." (P. 15). The sum of these numbers of answers is 1,650. Did every examiner answer every question? Apparently so. For this 100% completion rate, the report's emphasis on the examiner average (which is never larger and often smaller than the overall error proportion) is a distinction without a difference.
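The completion-rate inference is simple arithmetic on the funding report's answer counts:

```python
# Did every examiner answer every question? The report's three answer
# categories should sum to 165 examiners x 10 questioned bullets.
correct, incorrect, inconclusive = 1496, 12, 142
total_answers = correct + incorrect + inconclusive
print(total_answers, total_answers == 165 * 10)  # 1650 True
```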

There is a further issue with the number itself. "Inconclusives" are not correct associations. If every examiner had come back with "inconclusive" for every questioned bullet, the researchers could hardly have reported the resulting zero error rate as validating bullet-matching. 6/ From the trial court's viewpoint, inconclusives just do not count. They do not produce testimony of false associations or of false eliminations. The sensible thing to do, in ascertaining error rates for Daubert purposes, is to toss out all "inconclusives."

Doing so here makes little difference. There were 142 inconclusive answers. (P. 15). If these were merely "not used to calculate the overall average error rates," as the report claims (p. 32), the overall error proportion was 12/(1650 - 142) = 12/1508 = 0.008 -- still very small (but still difficult to interpret in terms of the parameters of accuracy for two-by-two tables).

The report to NIJ discussed another finding that, at first blush, could be relevant to the evidence in this case: "Three of these 35 AFTE certified participants reported a total of four errors, resulting in an error rate of 0.011 for AFTE Certified participants." (P. 30). Counter-intuitively, this 1% average is larger than the reported average error rate of 0.007 for all the examiners.
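The report's figure for the certified subgroup can be reproduced without knowing how the four errors were split among the three erring examiners, since the average of the per-examiner proportions reduces to total errors over total answers:

```python
# AFTE-certified subgroup: 35 examiners, 4 errors in total.
# sum(errors_i / 10) / 35 == (total errors) / (10 * 35), regardless of
# how the 4 errors were distributed among the 3 erring examiners.
certified_avg = 4 / (10 * 35)
all_avg = 12 / (10 * 165)
print(round(certified_avg, 3), round(all_avg, 3))  # 0.011 0.007
```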

That the certified examiners did worse than the uncertified ones may be a fluke. The standard error in the estimate of the average-examiner error rate was 0.32 (p. 29), which indicates that, despite the observed difference in the sample data, the study does not reveal whether certified examiners generally do better or worse than uncertified ones. 7/

A Potential Error Rate?

Finally, the court's reference to "a potential error rate of less than 1.2%" deserves mention. The "potential error rate" is tricky. Potentially, the error rate of individual practitioners like the ones who volunteered for the study, with no verification step by another examiner, could be larger (or smaller). There is no sharp and certain line that can be drawn for the maximum possible error rate. (Except that it cannot exceed 100%.)

In this case, 1.2% is the upper limit of a two-sided confidence interval. The Miami Dade authors wrote:
A 95% confidence interval for the average error rate, based on the large sample distribution of the sample average error rate, is between 0.002 and 0.012. Using a confidence interval of 95%, the error rate is no more than 0.012, or 1.2%.
A 95% confidence interval means that if there had been a large number of volunteer studies just like this one, making random draws from an unchanging population of volunteer-examiners and having these examiners perform the same task in the same way, about 95% of the many resulting confidence intervals would encompass the true value for the entire population. But the hypothetical confidence intervals would vary from one experiment to the next. We have a statistical process -- a sort of butterfly net -- that is broad enough to capture the unknown butterfly in about 95% of our swipes. The weird thing is that with each swipe, the size and center of the net change. On the Miami Dade swipe, one end of the net stretched out to the average error rate of 1.2%.
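The report's interval can be reconstructed from the per-examiner proportions given earlier, assuming (as the report indicates) a normal approximation to the distribution of the sample average. This sketch also shows how the upper limit moves as the confidence coefficient changes:

```python
import math

# Per-examiner error proportions: 156 zeros, 6 at 0.1, 3 at 0.2.
rates = [0.0] * 156 + [0.1] * 6 + [0.2] * 3
n = len(rates)
mean = sum(rates) / n
sd = math.sqrt(sum((r - mean) ** 2 for r in rates) / (n - 1))
se = sd / math.sqrt(n)

for conf, z in [(0.90, 1.645), (0.95, 1.960), (0.99, 2.576)]:
    lo, hi = mean - z * se, mean + z * se
    print(f"{conf:.0%}: ({lo:.3f}, {hi:.3f})")
# The 95% line reproduces the report's interval: (0.002, 0.012).
```

The 90% interval's upper limit is lower and the 99% interval's is higher, which is the point made below about asking for different levels of "confidence."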

So the court was literally correct. There is "a potential error rate" of 1.2%. There is also a higher potential error rate that could be formulated -- just ask for 99% "confidence." Or lower -- try 90% confidence. And for every confidence interval that could be constructed by varying the confidence coefficient, there is the potential for the average error rate to exceed the upper limit. Such is the nature of a random variable. Randomness does not make the upper end of the estimate implausible. It just means that it is not "the potential error rate," but rather a clue to how large the actual rate of error for repeated experiments could be.

Contrary to the suggestion in Romero-Lobato, that statistic is not the "potential rate of error" mentioned in the Supreme Court's opinion in Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993). The opinion advises judges to "ordinarily ... consider the known or potential rate of error, see, e.g., United States v. Smith, 869 F. 2d 348, 353-354 (CA7 1989) (surveying studies of the error rate of spectrographic voice identification technique)." The idea is that along with the validity of an underlying theory, how well "a particular scientific technique" works in practice affects the admissibility of evidence generated with that technique. When the technique consists of comparing things like voice spectrograms, the rates at which the process yields correct results in experiments like the ones noted in Smith are known error rates. That is, they are known for the sample of comparisons in the experiment. (The value for all possible examiners' comparisons is never known.)

These experimentally determined error rates are also a "potential rate of error" for the technique as practiced in case work. The sentence in Daubert that speaks to "rate of error" continues by adding, as part of the error-rate issue, "the existence and maintenance of standards controlling the technique's operation, see United States v. Williams, 583 F. 2d 1194, 1198 (CA2 1978) (noting professional organization's standard governing spectrographic analysis)." The experimental testing of the technique shows that it can work -- potentially; controlling standards ensure that it will be applied consistently and appropriately to achieve this known potential. Thus, Daubert's reference to "potential" rates does not translate into a command to regard the upper confidence limit (which merely accounts for sampling error in the experiment) as a potential error rate for practical use.

  1. No. 3:18-cr-00049-LRH-CBC, 2019 WL 2150938 (D. Nev. May 16, 2019).
  2. That is my impression anyway. The court cites the study as Thomas G. Fadul, Jr., et al., An Empirical Study to Improve the Scientific Foundation of Forensic Firearm and Tool Mark Identification Utilizing Consecutively Manufactured Glock EBIS Barrels with the Same EBIS Pattern (2013). The references in Ronald Nichols, Firearm and Toolmark Identification: The Scientific Reliability of the Forensic Science Discipline 133 (2018) (London: Academic Press), also do not indicate a subsequent publication.
  3. P. 3. The first of the two "research hypotheses" was that "[t]rained firearm and tool mark examiners will be able to correctly identify unknown bullets to the firearms that fired them when examining bullets fired through consecutively manufactured barrels with the same EBIS pattern utilizing individual, unique and repeatable striations." (P. 13). The phrase "individual, unique and repeatable striations" begs a question or two.
  4. The researchers were comforted by the thought that "[t]he external validity strength of this research project was that all testing was conducted in a crime laboratory setting." (P. 25). As secondary sources of external validity, they noted that "[p]articipants utilized a comparison microscope," "[t]he participants were trained firearm and tool mark examiners," "[t]he training and experience of the participants strengthened the external validity," and "[t]he number of participants exceeded the minimum sample size needed to be statistically significant." Id. Of course, it is not the "sample size" that is statistically significant, but only a statistic that summarizes an aspect of the data (other than the number of observations).
  5. P. 26 ("A total of 201 examiners representing 125 crime laboratories in 41 states, the District of Columbia, and 4 international countries completed the Consecutively Rifled EBIS-2 Test Set questionnaire/answer sheet.").
  6. Indeed, some observers might argue that an "inconclusive" when there is ample information to reach a conclusion is just wrong. In this context, however, that argument is not persuasive. Certainly, "inconclusives" can be missed opportunities that should be of concern to criminalists, but they are not outright false positives or false negatives.
  7. The opinion does not state whether the examiner in the case -- "Steven Johnson, a supervising criminalist in the Forensic Science Division of the Washoe County Sheriff's Office" -- is certified or not, but it holds that he is "competent to testify" as an expert.

Tuesday, June 11, 2019

Junk DNA (Literally) in Virginia

The Washington Post reported yesterday on a motion in Alexandria Circuit Court to suppress "all evidence flowing from the warrantless search of [Jesse Bjerke's] genetic profile." 1/ Mr. Bjerke is accused of raping a 24-year-old lifeguard at gunpoint at her home after following her from the Alexandria, Va., pool where she worked. She "could describe her attacker only as a thin man she believed was 35 to 40 years old and a little over 6 feet tall." 2/ Swabs taken by a nurse contained sperm from which the Virginia Department of Forensic Sciences obtained a standard STR profile.

Apparently, the STR profile was in neither the Virginia DNA database nor the national one (NDIS). So the police turned to the Virginia bioinformatics company Parabon Labs, which has had success with genetic genealogy searches of the publicly available genealogy database GEDmatch. Parabon reported that
[T]he subject DNA file shares DNA with cousins related to both sides of Jesse's family tree, and the ancestral origins of the subject are equivalent to those of Jesse. These genetic connections are very compelling evidence that the subject is Jesse. The fact that Jesse was residing in Alexandria, VA at the time of the crime in 2016 fits the eyewitness description and his traits are consistent with phenotype predictions, further strengthens the confidence of this conclusion.
Recognizing the inherent limitations in genetic genealogy, Parabon added that
Unfortunately, it is always possible that the subject is another male that is not identifiable through vital records or other research means and is potentially unknown to his biological family. This could be the result if an out-of-wedlock birth, a misattributed paternity, an adoption, or an anonymous abandonment.
The motion suggests that the latter paragraph, together with the firm's boiler-plate disclaimer of warranties and the fact that the report contains hearsay, means that police lacked even probable cause to believe that the sperm came from the defendant. This view of the information that the police received is implausible, but regardless of whether "the facts contained in the Parabon report do not support probable cause," 3/ the police did not use the information either to arrest Mr. Bjerke immediately or to seek a warrant to compel him to submit to DNA sampling. Instead,
Police began following Bjerke at his home and the hospital where he worked as a nurse. They took beer bottles, soda cans and an apple core from his trash. They tracked him to a Spanish restaurant ... and, after he left, bagged the straws he had used.

The DNA could not be eliminated as a match for the sperm from the rape scene, a forensic analysis found, leading to Bjerke’s indictment and arrest in February. With [a] warrant, law enforcement again compared his DNA with the semen at the crime scene. The result: a one in 7.2 billion chance it was not his. 4/
A more precise description of the "one in 7.2 billion chance" is that if Mr. Bjerke is not the source, then an arbitrarily selected unrelated man would have that tiny a chance of having the matching STR profile. The probability of the STR match given the hypothesis that another man is the source is not necessarily the same as the probability that another man is the source given the match. But given a prior probability reflecting the other evidence so far revealed about Mr. Bjerke, there would not be much difference between the conditional probability the laboratory supplied and the article's transposed one.
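Why the transposition makes little practical difference here can be sketched with Bayes' rule: posterior odds equal prior odds times the likelihood ratio. For illustration only, treat the reported 1-in-7.2-billion figure as a random-match probability, so the likelihood ratio is roughly 7.2 billion (a simplification that ignores relatives and laboratory error):

```python
# Posterior source probability from a prior probability and a
# likelihood ratio (posterior odds = prior odds x LR).
def posterior_prob(prior, lr):
    odds = (prior / (1 - prior)) * lr
    return odds / (1 + odds)

lr = 7.2e9  # illustrative LR from the reported random-match probability
for prior in (0.5, 0.01, 1e-6):
    print(prior, posterior_prob(prior, lr))
```

Even a highly skeptical prior of one in a million yields a posterior probability above 0.999, which is why the transposed statement, though technically incorrect, is not far off in this case.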

Faced with such compelling evidence, Mr. Bjerke wants it excluded at trial. The motion states that
For the purposes of this motion, there are three categories of DNA testing. (1) DNA testing conducted before Jesse Bjerke was a suspect in the case; (2) DNA testing conducted without a warrant after Jesse Bjerke became a suspect in the case; and (3) DNA testing conducted with a warrant after Jesse Bjerke's arrest. This motion seeks to suppress all DNA evidence in categories two and three that relate to Jesse Bjerke.
An obstacle is the many cases -- not mentioned in the motion -- holding that shed or "abandoned" DNA is subject to warrantless collection and analysis for identifying features on the theory that the procedure is not a "search" under the Fourth Amendment. The laboratory analysis is not an invasion of Mr. Bjerke's reasonable expectation of privacy -- at least, not if we focus solely on categories (2) and (3), as the motion urges. This standard STR typing was done after the genetic genealogy investigation was completed. The STR profile (which the motion calls a "genetic profile" even though it does not characterize any genes) provides limited information about an individual. For that reason, the conclusion of the majority of courts that testing shed DNA is not a search is supportable, though not ineluctable. ("Limited" does not mean "zero.")

Indeed, most laboratory tests on or for traces from crimes are not treated as searches covered by the warrant and probable-cause protections. Is it a search to have the forensic lab analyze a fingerprint from a glass left at a restaurant? Suppose a defendant tosses a coat in a garbage bin on the street, and the police retrieve it, remove glass particles, and analyze their chemical composition to see whether they match the glass from a broken window in a burglary. Did they need a warrant to study the glass particles?

The underlying issue is how much the Constitution constrains the police in using trace evidence that might associate a known suspect with a crime scene or victim. When the analysis reveals little or nothing more than the fact of the association, I do not see much of an argument for requiring a warrant. That said, there is a little additional information in the usual STR profile, so there is some room for debate here.

However, this case might be even more debatable (although the defense motion does not seem to recognize it) because of category (1) -- the genetic genealogy phase of the case. The police, or rather the firm they hired to derive a genome-wide scan for the genetic genealogy, have much more information about Mr. Bjerke at their disposal. They have on the order of a million SNPs. In theory, Parabon or the police could inspect the SNP data for medical or other sensitive information on Mr. Bjerke now that he has been identified as the probable source of those sperm.

Nevertheless, I do not know why the police or the lab would want to do this, and it has always been true that once a physical DNA sample is in the possession of the police, the possibility exists for medical genetic testing using completely different loci. Testing shed DNA in that way should be considered a search. Bjerke is a step in that direction, but are we there yet?

The Post's online story has 21 comments on it. Not one supported the idea that there was a significant invasion of privacy in the investigation. These comments are a decidedly small sample that does not represent any clear population, but the complete lack of support for the argument that genetic genealogy implicates important personal privacy was striking.

  1. Defendant's Motion to Suppress, Commonwealth v. Bjerke, No. CF19000031 (Cir. Ct., Alexandria, Va. May 20, 2019).
  2. Rachel Weiner, Alexandria Rape Suspect Challenging DNA Search Used to Crack Case, Wash. Post, June 10, 2019, at 1:16 PM.
  3. Defendant's Motion, supra note 1.
  4. Weiner, supra note 2.
  • Thanks to Rachel Weiner for alerting me to the case and providing a copy of the defendant's motion.

Friday, June 7, 2019

Aleatory and Epistemic Uncertainty

An article in the Royal Society's Open Science journal on "communicating uncertainty about facts, numbers and science" is noteworthy for the sheer breadth of the fields it surveys and its effort to devise a taxonomy of uncertainty for the purpose of communicating its nature or degree. The article distinguishes between "aleatory" and "epistemic" uncertainty:

[A] large literature has focused on what is frequently termed 'aleatory uncertainty' due to the fundamental indeterminacy or randomness in the world, often couched in terms of luck or chance. This generally relates to future events, which we can't know for certain. This form of uncertainty is an essential part of the assessment, communication and management of both quantifiable and unquantifiable future risks, and prominent examples include uncertain economic forecasts, climate change models and actuarial survival curves.

By contrast, our focus in this paper is uncertainties about facts, numbers and science due to limited knowledge or ignorance—so-called epistemic uncertainty. Epistemic uncertainty generally, but not always, concerns past or present phenomena that we currently don't know but could, at least in theory, know or establish.

The distinction is of interest to philosophers, psychologists, economists, and statisticians. But it is a little hard to pin down with the definition in the article. Aleatory uncertainty applies on the quantum mechanical level, but is it true that "in theory" predictions like weather and life span cannot be certain? Chaos theory shows that the lack of perfect knowledge about initial conditions of nonlinear systems makes long-term predictions very uncertain, but is it theoretically impossible to have perfect knowledge? The card drawn from a well-shuffled deck is a matter of luck, but if we knew enough about the shuffle, couldn't we know the card that is drawn? Thus, I am not so sure that the distinction is between (1) "fundamental ... randomness in the world" and (2) ignorance that could be remedied "in theory."

Could the distinction be between (1) instances of a phenomenon that has variable outcomes at the level of our existing knowledge of the world and (2) a single instance of a phenomenon that we do not regard as the outcome of a random process or that already has occurred, so that the randomness is gone? The next outcome of rolling a die (an alea in Latin) is always uncertain (unless I change the experimental setup to precisely fix the conditions of the roll), 1/ but whether the last roll produced a 1 is only uncertain to the extent that I cannot trust my vision or memory. I could reduce the latter, epistemic uncertainty by improving my system of making observations. For example, I could have several keen and truthful observers watch the toss, or I could film it and study the recording thoroughly. From this perspective, the frequency and propensity conceptions of probability concern aleatory uncertainty, and the subjective and logical conceptions traffic in both aleatory and epistemic uncertainty.

When it comes to the courtroom, epistemic uncertainty is usually in the forefront, and I may get to that example at a later date. For now, I'll just note that, regardless of whether the distinction offered above between aleatory and epistemic uncertainty is philosophically rigorous, people's attitudes toward aleatory and epistemic risk defined in this way do seem to be somewhat different. 2/

  1. Cf. P. Diaconis, S. Holmes & R. Montgomery, Dynamical Bias in the Coin Toss, 49(2) SIAM Rev. 211-235 (2007).
  2. Gülden Ülkümen, Craig R. Fox & B.F. Malle, Two Dimensions of Subjective Uncertainty: Clues from Natural Language, 145(10) Journal of Experimental Psychology: General 1280-1297; Craig R. Fox & Gülden Ülkümen, Distinguishing Two Dimensions of Uncertainty, in Perspectives on Thinking, Judging, and Decision Making (W. Brun, G. Keren, G. Kirkebøen & H. Montgomery eds. 2011).