Monday, July 1, 2024

“Predictive or Profiling Evidence” and Diaz v. United States

Today the Association of American Law Schools' (AALS) Section on Evidence distributed the following announcement to its members:

In a divided decision, the Supreme Court recently concluded that expert testimony about the likely mental state of individuals arrested with drugs in their position is admissible in criminal trials under the Federal Rules of Evidence in Diaz v. United States, 602 U. S. ____ (2024). This decision was sharply contested in its own right, but also drew attention to an area of broader controversy in the law of evidence: the increasing use of "predictive" or "profiling" evidence, by which expert witnesses present testimony suggesting that an individual is more or less likely to have had a particular mental state or behaved in a particular way based on their personal circumstances or characteristics. Scholars who have written about this phenomenon have expressed disquiet about the use of such evidence (and some courts have limited the use of such evidence in edge cases involving particular prejudice or overbroad characterizations), but no consensus has emerged as to the reasons for objecting to predictive evidence or as to how to systematically distinguish between such evidence and other forms of indirect and circumstantial evidence that is routinely admitted. This panel brings together scholars to discuss predictive and profiling evidence from a variety of perspectives. Was the Supreme Court right to find such evidence consistent with the federal rules governing the admissibility of expert evidence? Is such evidence generally consistent with due process and equal protection concerns? Is it generally empirically sound? Do we need new federal rules or common law doctrines to limit the admissibility of some forms of predictive or profiling evidence?

Who can object to a scholarly program on the topic, even if it is hardly new, having been the subject of a multitude of opinions, a number of statutes, and even a previous AALS program decades ago? Hopefully, the courts and the professoriate have made some progress in understanding what has come to be called "framework evidence." Nor would it be fair to criticize a necessarily brief announcement for not defining "predictive or profiling evidence." Presumably, the evidence teachers do not need a definition.

But I was surprised to read my fellow law professors’ sweeping characterization of Diaz v. United States, 144 S.Ct. 1727 (2024), as establishing that "such evidence [is] consistent with the federal rules governing the admissibility of expert evidence" and "that expert testimony about the likely mental state of individuals arrested with drugs in their position [sic?] is admissible in criminal trials under the Federal Rules of Evidence." Although I doubt that the announcement will cause its narrow range of readers to believe that the case stands for more than it does, I do worry that the same kind of language will crop up in undiscerning judicial opinions and commentary on Diaz. Therefore, it may be worth listing some problematic aspects of the statements.

First, the case plainly held that certain "expert testimony about the likely mental state of individuals arrested with drugs in their position [possession?] is" not "admissible in criminal trials under the Federal Rules of Evidence." Every Justice agreed that no expert can testify that a defendant charged with importing proscribed drugs knew that they were transporting drugs. That would be explicit ultimate-opinion testimony on a criminal defendant's state of mind. (There are cases limiting the Rule 704(b) ban to mental health professionals, but the Court did not consider that possible way to interpret the rule. It stuck with a more literal reading of the text.)

Second, no “predictive or profiling evidence” was introduced in the case, and only one Justice thought it worth discussing. Certainly, Diaz is not a case of a criminal profiler predicting (“inferring” would be more precise) the characteristics of a criminal from the type or manner of the crimes under investigation (or any other such “predictive or profiling evidence”). As the excerpts from the trial transcript reproduced below show, the witness did not claim such expertise; furthermore, the trial judge barred him from stating a belief about the defendant’s knowledge (although one could well think that his testimony made it plain enough what his belief was). 

Third, the issue before the Court was not admissibility under the rules of evidence writ large. It was the scope of a single part of a solitary rule. Federal Rule 704(b), which has no counterpart in the rules of most states, declares that no expert witness may "state an opinion about whether the defendant did or did not have a mental state or condition that constitutes an element of the crime charged or of a defense" because "[t]hose matters are for the trier of fact alone." All that the Diaz Court held was that when such an opinion is not stated and is an inference that does not follow necessarily (as a matter of deductive logic) from the witness’s statements, Rule 704(b) does not preclude its admission. It did not hold—and could not have held—that the Rule makes such testimony admissible. Cf. David H. Kaye, The Ultimate Opinion Rule and Forensic Science Identification, 60 Jurimetrics J. 75 (2020).

Thus, Diaz should not be read as supporting—or opposing—the use of “predictive or profiling evidence” generally, or even in the subcategory of testimony offered to prove a defendant’s state of mind.

More thoughts on the three opinions in the case and how the result fits into the range of possible interpretations of Rule 704(b) will appear in the upcoming supplement to The New Wigmore on Evidence: Expert Evidence § 2.2.3(b) (available online in VitalLaw) and, time permitting, in a further posting about the case here.


Excerpts from Trial Transcript
Mar. 18, 2021

The most pertinent portion of the testimony at issue in Diaz is as follows (with italics added):

BY MR. OLAH [Assistant US Attorney]:
Q. Where do you work?
A. I’m a special agent with Homeland Security Investigations.
Q. And how long have you been with HSI?
A. I’ve been a special agent since 1996. So going on 20 — I believe 28 years.
Q. Were you in law enforcement before joining HSI?
A. Prior to becoming a special agent, I was a U.S. Border Patrol agent. And prior to that, I was a sheriff’s corrections deputy.
***
Q. Have you been involved in drug trafficking investigations as a special agent with HSI?
A. Yes, I have.
Q. Approximately how many such investigations?
A. I’ve been involved in over 500 investigations dealing with distribution of drugs and also the – which would include the importation of drugs.
Q. And can you summarize for the jury the various investigation techniques you’ve used?
A. The techniques I’ve used, I’ve utilized wiretaps, where you actually listen to a drug trafficker talk on the telephone and how they conduct business. I’ve done controlled purchases where I utilized an undercover agent or a cooperating source. And we actually go out on the street and buy the drugs. I’ve spoken with cooperating defendants that have been arrested for drug trafficking related offenses. I’ve talked to cooperating sources that have information related to the distribution of drugs and drug trafficking organizations. I have spoken with other agents that work drug trafficking organizations and have worked on task forces with other agencies such as the Federal Bureau of Investigation, the Drug Enforcement Administration, and local police departments dealing with drug trafficking related crimes.
***
Q. Agent Flood, why are drugs imported into the United States?
MS. IREDALE: Objection, 401.
THE COURT: Overruled.
THE WITNESS: Based upon drugs that – some drugs are manufactured in Mexico and outside the United States. Therefore, they’re brought across the border, into the United States to be sold.
BY MR. OLAH:
***
Q. With respect to vehicles, can you describe the general process of movement from Mexico to wherever it goes?
A. From Mexico, they are packaged. They are put into *** a vehicle. *** I have seen drugs hidden in every area of a vehicle. *** And then they are transported from point A to point B across the border.
Q. And based on your training and experience, are the transporters compensated for their efforts?
A. Yes. It’s a job. It’s to take it from point A to point B.
***
Q. Agent Flood, based on your training and experience, are large quantities of drugs entrusted to drivers that are unaware of those drugs?
MS. IREDALE: Objection. 401, 403.
THE COURT: Overruled.
THE WITNESS: No. In extreme circumstances – actually, in most circumstances, the driver knows they are hired. It’s a business. They are hired to take the drugs from point A to point B.
BY MR. OLAH:
Q. And why aren’t – why don’t they use unknowing couriers, generally?
MS. IREDALE: Objection. 401, 403.
THE COURT: Overruled. You may answer.
THE WITNESS: Generally, it’s a risk of your – your cargo not making it to the new market; not knowing where it’s going; not being able to retrieve it at the ending point, at your point B. So there’s a risk of not delivering your product and, therefore, you’re not going to make any money.
***
Cross Examination
***
Q. So you said that unknowing couriers are very rare.
A. Yes.
***
Q. You work for HSI. Right?
A. Correct.
Q. And you’re aware that your own agency has identified many schemes where drug trafficking organizations use unknowing couriers. Right?
A. I – I know of three schemes that were primarily identified as being possible for an unknowing courier. It doesn’t necessarily mean that they are unknowing couriers. ***

Saturday, June 15, 2024

Volunteer Bias in Interlaboratory Studies

The National Institute of Standards and Technology (NIST) is soliciting laboratories to join an "interlaboratory study" of GC-MS (gas chromatography-mass spectrometry) for seized drug analysis. The announcement that I received, in abbreviated form, reads:

Forensic Science Quality Assurance Program
Seized Drugs General Method for GC-MS Reporting Limits Study

. . .
Study Design, Purpose, and Rationale
The goals of the study are 1) to capture the range of methods, instrumentation, and analytical approaches used in the community, 2) investigate mass spectral variability across methods, and 3) investigate how different reporting practices effect the limit of seized drug reporting.
Timeline and Commitment
Registration is currently open and will close on July 5, 2024. To participate in this study, laboratories must be accredited forensic laboratories based in the United States and have a valid Schedule I & II DEA license, a validated seized drug screening method using GC-MS, and a documented reporting practice. To be considered for the study, participants will be required to complete a pre-study questionnaire pertaining to the method that will be used for sample analysis. After acceptance into the study, participants will be provided a kit of 10 solutions containing mixtures of controlled substances and asked to analyze the solutions and report whether the analytes are present above their established reporting thresholds. Participants will also be required to report chromatographic peak height/area and retention time of each peak and provide the raw datafile from each run. Standards used for comparison will also be reported. [D]ue to a limited number of available kits, completion of the pre-study questionnaire does not guarantee acceptance into the study. . . .
Publication of Results
Upon closure of data entry, laboratories will receive a preliminary report containing a summary of reported data, consensus results, and a summary of analytes present in each mixture. [A] final report . . . will be made publicly available by Spring 2025. . . . NIST will not knowingly reveal laboratory identities associated with study results.
For questions, contact andrea.yarberry@nist.gov
To signup, go to: https://forms.gle/gPDU9aENHguPkw1D7

The effort is laudable, but one might ask why NIST is not beginning with a sampling frame of laboratories created to represent "the community" and then drawing a probability sample from this list. Will the laboratories that notice the announcement and ask to participate present the full "range of methods, instrumentation, and analytical approaches used in the community"? Will the volunteer sample be skewed toward higher quality labs? Will it include all the "different reporting practices [that] effect [sic] the limit of seized drug reporting [whatever this "limit" denotes in the population of laboratories]"?
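
To see how self-selection can distort such a study, consider a minimal simulation sketch. Everything in it is assumed for illustration (the number of laboratories, the error-rate distribution, and the supposition that better-performing labs are more likely to volunteer); none of it is drawn from NIST's actual design.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical population of 400 accredited labs, each with a "true"
# false-reporting rate drawn from a skewed distribution (illustrative only).
population = [random.betavariate(2, 50) for _ in range(400)]

# Assumption: labs with lower error rates are more likely to volunteer.
def volunteers(rate):
    return random.random() < max(0.05, 0.6 - 10 * rate)

volunteer_sample = [r for r in population if volunteers(r)]

# A probability sample of the same size drawn from a complete sampling frame.
probability_sample = random.sample(population, len(volunteer_sample))

print(f"Population mean error rate:   {mean(population):.4f}")
print(f"Volunteer-sample estimate:    {mean(volunteer_sample):.4f}")
print(f"Probability-sample estimate:  {mean(probability_sample):.4f}")
```

On this toy model, the volunteer sample understates the community-wide error rate, while the probability sample does not (apart from ordinary sampling error). The direction and size of any bias in the real study would, of course, depend on which laboratories actually sign up.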

Rigor in sampling may not be required to answer certain questions, but it seems relevant to determining what the sign-up form refers to as "the current landscape of GC-MS methods and associated reporting practices and how those factors effect [sic] the concentration of drug that is/is not ultimately reported." Certainly, it should be a consideration for the legal community if and when the results of the study are presented as an indication of "the known or potential rate of error" for GC-MS analysis as practiced in forensic-science laboratories (Daubert v. Merrell Dow Pharm., 509 U.S. 579, 594 (1993)).

Sunday, May 26, 2024

ISO Standards on Forensic Science: Pay to Play?

"ISO, the International Organization for Standardization, brings global experts together to agree on the best way of doing things – for anything from making a product to managing a process." 1/ For the last few years, it has been devising the following overarching set of standards for all of forensic science:

  • Forensic sciences (TC 272) ISO/DIS 21043-1, Forensic sciences - Part 1: Terms and definitions - 5/27/2024, $58.00
  • ISO/DIS 21043-3, Forensic Sciences - Part 3: Analysis - 5/26/2024, $62.00
  • ISO/DIS 21043-4, Forensic Sciences - Part 4: Interpretation - 5/26/2024, $67.00
  • ISO/DIS 21043-5, Forensic Sciences - Part 5: Reporting - 5/26/2024, $53.00 2/

These are

part of a series which, when completed, will include the different components of the forensic process from scene to courtroom ... . The series describes primarily “what” is standardized, not the “how” or “who”. Best practice manuals and standard operating procedures should describe “how” the requirements of this document would be met. 3/

It sounds like the standards in progress will not specify "the best way of doing things." Will they merely list the things that are in need of "standardization"? Will they be too open-ended to constitute what the U.S. Supreme Court refers to as "standards controlling the technique's operation"? 4/

I cannot answer these questions because I have not seen the drafts that were open for public comment. Members of the public cannot read the drafts without paying ISO the $240 listed above. If anyone who has paid to play has thoughts on these documents that they would like to share beyond the TC (Technical Committee) that drafted the standards, I'll post them--at no charge.

Notes

  1. Int'l Org. for Standardization, About ISO.
  2. ANSI Standards Action, Mar. 15, 2024, at 48.
  3. ISO 21043-1:2018(en) Forensic sciences — Part 1: Terms and definitions.
  4. Daubert v. Merrell Dow Pharm., 509 U.S. 579, 594 (1993).

Friday, January 12, 2024

What's Uniqueness Got to Do with It?

Columbia University has announced that "AI Discovers That Not Every Fingerprint Is Unique"! The subtitle of the press release of January 10, 2024, boldly claims that

Columbia engineers have built a new AI that shatters a long-held belief in forensics–that fingerprints from different fingers of the same person are unique. It turns out they are similar, only we’ve been comparing fingerprints the wrong way!

Forensic Magazine immediately and uncritically rebroadcast the confused statements about uniqueness (quoting verbatim, without acknowledgment, from the press release). According to the Columbia release and Forensic Magazine, "It’s a well-accepted fact in the forensics community that fingerprints of different fingers of the same person—or intra-person fingerprints—are unique and therefore unmatchable." Forensic Magazine adds that "Now, a new study shows an AI-based system has learned to correlate a person’s unique fingerprints with a high degree of accuracy."

Does this mean that the "well-accepted fact" and "long-held belief" in uniqueness have been shattered or not? Clearly not. The study is about similarity, not uniqueness. In fact, uniqueness has essentially nothing to do with it. I can classify equilateral triangles drawn on a flat surface as triangles rather than as other regular polygons whether or not the triangles are each different enough from one another (uniqueness within the set of triangles) for me to notice the differences. To say that objects "are unique and therefore unmatchable" is a non sequitur. A human genome is probably unique to that individual, but forensic geneticists know that six-locus STR profiles are "matchable" to those of other individuals in the population. A cold hit in the U.K. database to a person who could not have been the source of the six-locus profile occurred long ago (as was to be expected, given the random-match probabilities of the genotypes).
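
A back-of-the-envelope calculation shows why such a coincidental hit was only to be expected. The numbers below are assumptions chosen for illustration (I am not reproducing the actual U.K. figures), but they show how even a tiny random-match probability produces an appreciable chance of an adventitious match once enough comparisons are made.

```python
# Illustrative assumptions, not the actual U.K. figures:
p = 1 / 37_000_000   # assumed six-locus random-match probability
n = 50_000_000       # assumed number of profile comparisons from repeated database searches

expected_matches = n * p
prob_at_least_one = 1 - (1 - p) ** n

print(f"Expected coincidental matches:      {expected_matches:.2f}")
print(f"P(at least one coincidental match): {prob_at_least_one:.2f}")
```

With tens of millions of comparisons, the expected number of purely coincidental six-locus matches exceeds one, so a cold hit to a non-source individual is unsurprising even if every full genome is unique.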

Perhaps the myth that the study shatters is that it is impossible to distinguish fingerprints left by different fingers of the same individual X from fingerprints left by fingers of different individuals (not-X). But there is no obvious reason why this would be impossible even if every print is distinguishable from every other print (uniqueness).
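
Here is a toy numerical illustration of that point. Everything in it is assumed (the distributions, the single numeric "trait" per person, and the distance rule); it has nothing to do with real friction-ridge features, but it shows that same-source classification can succeed well above chance even when every item is, strictly speaking, unique.

```python
import random

random.seed(1)

# Toy model: each simulated person has one person-level "trait"; each of that
# person's prints is the trait plus finger-specific noise.  With continuous
# draws, every simulated print is (almost surely) unique.
def make_person():
    trait = random.gauss(0, 1)
    return [trait + random.gauss(0, 0.3) for _ in range(10)]

people = [make_person() for _ in range(200)]

def same_person_pairs(k):
    for _ in range(k):
        person = random.choice(people)
        a, b = random.sample(person, 2)
        yield abs(a - b), True

def different_person_pairs(k):
    for _ in range(k):
        p, q = random.sample(people, 2)
        yield abs(random.choice(p) - random.choice(q)), False

pairs = list(same_person_pairs(5000)) + list(different_person_pairs(5000))

# Call a pair "same person" whenever the two prints are close enough.
threshold = 0.8
correct = sum((dist < threshold) == same for dist, same in pairs)
print(f"Accuracy of the simple distance rule: {correct / len(pairs):.2f}")
```

The simulated prints are all distinct, yet the crude distance rule sorts same-person from different-person pairs far better than coin flipping. Uniqueness and same-source discriminability are simply different questions.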

The Columbia press release describes the study design this way:

[U]ndergraduate senior Gabe Guo ... who had no prior knowledge of forensics, found a public U.S. government database of some 60,000 fingerprints and fed them in pairs into an artificial intelligence-based system known as a deep contrastive network. Sometimes the pairs belonged to the same person (but different fingers), and sometimes they belonged to different people.

Over time, the AI system, which the team designed by modifying a state-of-the-art framework, got better at telling when seemingly unique fingerprints belonged to the same person and when they didn’t. The accuracy for a single pair reached 77%. When multiple pairs were presented, the accuracy shot significantly higher, potentially increasing current forensic efficiency by more than tenfold.
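
The release does not say how the "multiple pairs" were combined, but a simple sketch shows why aggregating several comparisons can push accuracy well above the single-pair figure. Assume, purely for illustration, that each pair comparison is an independent binary decision that is correct 77% of the time and that a simple majority vote decides (the actual system may aggregate similarity scores rather than votes):

```python
from math import comb

def majority_vote_accuracy(p_single: float, n_pairs: int) -> float:
    """Probability that a majority of n_pairs independent comparisons,
    each correct with probability p_single, reaches the right answer.
    n_pairs is assumed odd so that ties cannot occur."""
    return sum(
        comb(n_pairs, k) * p_single**k * (1 - p_single) ** (n_pairs - k)
        for k in range(n_pairs // 2 + 1, n_pairs + 1)
    )

for n in (1, 3, 5, 9):
    print(f"{n} pair(s): {majority_vote_accuracy(0.77, n):.3f}")
```

Under these assumed independence conditions, accuracy rises from 0.77 for one pair to roughly 0.87 with three pairs and about 0.97 with nine, which is at least consistent with the report that accuracy "shot significantly higher" when multiple pairs were presented.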

The press release reported the following odd facts about the authors' attempts to publish their study in a scientific journal:

Once the team verified their results, they quickly sent the findings to a well-established forensics journal, only to receive a rejection a few months later. The anonymous expert reviewer and editor concluded that “It is well known that every fingerprint is unique,” and therefore it would not be possible to detect similarities even if the fingerprints came from the same person.

The team ... fed their AI system even more data, and the system kept improving. Aware of the forensics community's skepticism, the team opted to submit their manuscript to a more general audience. The paper was rejected again, but [Professor Hod] Lipson ... appealed. “I don’t normally argue editorial decisions, but this finding was too important to ignore,” he said. “If this information tips the balance, then I imagine that cold cases could be revived, and even that innocent people could be acquitted.” ...

After more back and forth, the paper was finally accepted for publication by Science Advances. ... One of the sticking points was the following question: What alternative information was the AI actually using that has evaded decades of forensic analysis? ... “The AI was not using ... the patterns used in traditional fingerprint comparison,” said Guo ... . “Instead, it was using something else, related to the angles and curvatures of the swirls and loops in the center of the fingerprint.”

Proprietary fingerprint matching algorithms also do not arrive at matches the way human examiners do. They "see" different features in the patterns and tend to rank the top candidates for true matches in a database trawl differently than the human experts. Again, however, these facts about automated systems neither prove nor disprove claims of uniqueness. And, theoretical uniqueness has little or nothing to do with the actual probative value of assertions of matches by humans, automated systems, or both.

The day after the publicity on the Guo et al. paper, I came across the following report on "Limitations of AI-based predictive models" in a weekly survey of papers in Science. Although not directly applicable here, it is worth quoting:

A central promise of artificial intelligence (AI) in health care is that large datasets can be mined to predict and identify the best course of care for future patients. Unfortunately, we do not know how these models would perform on new patients because they are rarely tested prospectively on truly independent patient samples. Chekroud et al. showed that machine learning models routinely achieve perfect performance in one dataset even when that dataset is a large international multisite clinical trial (see the Perspective by Petzschner). However, when that exact model was tested in truly independent clinical trials, performance fell to chance levels. Even when building what should be a more robust model by aggregating across a group of similar multisite trials, subsequent predictive performance remained poor. -- Science p. 164, 10.1126/science.adg8538; see also p. 149, 10.1126/science.adm9218

Note: This posting was last modified on 1/12/24 2:45 PM

Saturday, November 18, 2023

SWGDE's Best Practices for Remote Collection of Digital Evidence from a Networked Computing Environment

SWGDE 22-F-003-1.0, Best Practices for Remote Collection of Digital Evidence from a Networked Computing Environment, is a forensic-science standard proposed for inclusion on the Organization of Scientific Area Committees for Forensic Science (OSAC) Registry—"a repository of selected published and proposed standards … to promote valid, reliable, and reproducible forensic results.”

The best practices “may not be applicable in all circumstances.” In fact, “[w]hen warranted, an examiner may deviate from these best practices and still obtain reliable, defensible results.” I guess that is why they are called best practices rather than required practices. But what circumstances would justify using anything but the best practices? On this question, the standard is silent. It merely says that “[i]f examiners encounter situations warranting deviation from best practices, they should thoroughly document the specifics of the situation and actions taken.” 

Likewise, the best practices for “preparation” seem rather rudimentary. “Examiners should ascertain the appropriate means of acquiring data from identified networked sources.” No doubt, but how could they ever prepare to collect digital information without ascertaining how to acquire data? What makes a means “appropriate”? All that a digital evidence expert can glean from this document is that he or she “should be aware of the limitations of each acquisition method and consider actions to mitigate these limitations if appropriate” and should consider “methods and limitation variables as they relate to various operating systems.” How does such advice regularize or improve anything?

The same is true of the recommendation that “[p]rior to the acquisition process, examiners should prepare their destination media.” What steps for preparing the destination media are best? Well, “[s]terilization of destination media [whatever the process of “sterilization” is in this context] is not generally required.” But it is required “when needed to satisfy administrative or organizational requirements or when a specific analysis process makes it a prudent practice.” When would sterilization be prudent? The drafters do not seem to be very sure. “[E]xaminers may need to sanitize destination media provided to an external recipient to ensure extraneous data is not disclosed.” Or maybe they don’t? “Examiners may also be required to destroy copies of existing data to comply with legal or regulatory requirements.” Few people would dispute that the best practice is to follow the law, but examiners hardly need best practices documents from standards developing organizations to know that.

The standard is indeterminate when it comes to what it calls “triage”—“preview[ing] the contents of potential data sources prior to acquisition.” We learn that “[e]xaminers may need to preview the contents of potential data sources prior to acquisition” to “reduce the amount of data acquired, avoid acquiring irrelevant information, or comply with restrictions on search authority.” What amount of data makes "triage" a best practice? How does the examiner know that irrelevant information may be present? Why can "triage" sometimes be skipped? When is it desirable, and how should it be done? The standard merely observes that “[t]here may be multiple iterations of triage … .” When are multiple iterations advisable? Well, it “depend[s] on the complexity of the investigation.” Equally vague is the truism that “[e]xaminers should use forensically sound processes to conduct triage to the extent possible.”

Finally, designating steps like “perform acquisition” and “validate collected data” as “best practices” does little to inform examiners of how to collect digital evidence from a network. To be fair, a few parts of the standard are more concrete, and, possibly, other SWGDE standards fill in the blanks. But, on its face, much of this remote acquisition standard simply gestures toward possible best practices. It does not expound them. In this respect, it resembles other standards that emerge from forensic-science standards developing organizations only to be criticized as vague at critical points.