Monday, January 3, 2022

Fitting "Physical Fit" into the Courtroom

The logic of piecing together fragments of broken glass, torn tape, cut paper, and the like seems simple enough. \1/ If the pieces fit in all their details at the edges, and if all surface marks or impressions that would cross an edge also align nicely, one has circumstantial evidence that they were once part of the same object.

The strength of this evidence for a single source depends on the extent and detail of the concordance between the recovered pieces. A physical fit between two halves of a broken plank of wood is powerful evidence for the hypothesis that the two pieces resulted from breaking this one plank. But if the pieces are weathered and the splintered edges dulled, the physical fit will be less precise and less supportive of the claim that they came from the same original plank.

At the other extreme, if two pieces are plainly discordant, they might have come from different places on the same object, with the intermediate pieces being missing. Or they might have come from different objects entirely. Consider tearing off five pieces of duct tape from the same roll of tape and comparing the edges of the first and the last segments. The detailed structure of the edges should not be complementary. Likewise, tearing segments of tape from five different rolls should result in a mismatch between the first and the fifth segment.

Criminalists or materials experts can be extremely helpful in examining the recovered pieces of objects to determine the degree of physical fit -- that is, in elucidating how well the edges fit together and the extent to which a mark on the surface of one piece lines up with a mark on the other when the pieces are aligned. But how they should describe their findings seems to be muddled in forensic-science standards. This posting describes the current vocabulary and argues that it is artificial and a departure from the ordinary meaning of the term "fit." It then outlines better alternatives for reporting the results of an investigation into physical fit.

I. The Standard Approach

Let’s look at a couple of ASTM standards. E2225-19a (Standard Guide for Forensic Examination of Fabrics and Cordage) instructs that “[i]f a physical match is found, it should be reported in a manner that will demonstrate that the two or more pieces of material were at one time a continuous piece of fabric or cordage” (§ 7.2.2). This standard treats the “physical match” as an observable property of the specimens (concordant edges and surface marks) that is conclusive of the hypothesis of a single source (the inference from the data).

ASTM E3260-21 (Standard Guide for Forensic Examination and Comparison of Pressure Sensitive Tapes), on the other hand, characterizes “physical fit” not as a property of the materials, but as a “type of examination that can be performed” (§ 10.5.1). This “conclusive type of examination ... is a physical end match.” Id. It “involves the comparison of edges, fabric (if present), surface striae, and other surface irregularities between samples in which corresponding features provide distinct characteristics that indicate the samples were once joined at the respective separated edges.” Of course, “distinct characteristics that indicate the samples were once joined at the respective separated edges” are not necessarily “conclusive,” making this definition of “physical fit” as a “type of examination” puzzling. The intent, it seems, is to define a physical fit examination (rather than a physical fit) as one that is capable of conclusively proving that the pieces were once joined together.

A Proposed New Standard Guide for the Collection, Analysis and Comparison of Forensic Glass Samples, ASTM WK72932, released for public comment late last year, states that “broken objects can be reassembled to their original configuration ... called a ‘physical fit’” (§ 11.1). But a physical fit is the original configuration of a broken object only if the pieces come from that original object, and this origin story is not true just because a standard defines “physical fit” that way. The evidence from the examination may be that the separate pieces fit together extremely well. If so, the conclusion is that they were once together within or as a unitary object. This conclusion may well be true, but one cannot decide, by the fiat of a definition, that the pieces that are observed to fit together well have been realigned as they once were. Yet a later section similarly asserts that “[a] glass physical fit is a determination that two or more pieces of glass were once part of the same broken glass object” (§ 11.2.8). This effort to define “physical fit” as inherently conclusive prompted eleven lawyers (including me) \2/ to caution ASTM that “[t]he hypothesis or conclusion that fragments come from the same object is not a physical fit. It is an inference drawn from the observations that produce the designation of a physical fit.”

Still more recently, an OSAC subcommittee released a Standard Guide for Forensic Physical Fit Examination (OSAC 2022-S-0015) for public comment before it is delivered to ASTM for consideration there. This proposed standard goes off in another direction. It equates a “physical fit” with the examiner’s state of mind about a hypothetical ensemble of experiments:

13.1 Physical Fit
13.1.1 The items that have been broken, torn, separated, or cut exhibit physical features that realign in a manner that is not expected to be replicated.
13.1.1.1 Physical Fit is the highest degree of association between items. It is the opinion that the observations provide the strongest support for the proposition that the items originated from the same source as opposed to the proposition they originated from different sources.

13.2 No Physical Fit
13.2.1 The items correspond in observed class characteristics, but exhibit physical features that do not realign, or they realign in a manner that could be replicated.
13.2.2 Alternatively, the items can exhibit physical features that partially realign, display simultaneous similarities and differences, show areas of discrepancy (e.g., warped areas, burned areas, missing pieces), or have insufficient individual characteristics that hinder the ability to determine the presence or absence of a physical fit.

Statisticians will notice the shift from (1) the incompletely expressed frequentist idea of an infinite sequence of trials in which different objects A and B are broken and the pieces from A never align with those from B to (2) the likelihoodist conception of support for the same-source hypothesis. But that implicit change in the theory of inference is hardly a cardinal sin in this context. If the probability of a fit at least as good as the one observed is practically zero for different sources, and if the probability of such a fit for the same source is much higher, then the support (the log-likelihood ratio) is very high.
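To make that arithmetic concrete, here is a minimal numerical sketch written in LaTeX notation. The two probabilities are purely hypothetical values chosen for illustration; they do not come from any study or standard.

    % Hypothetical figures, for illustration only
    \[
      LR \;=\; \frac{\Pr(\text{fit at least this good} \mid \text{same source})}
                    {\Pr(\text{fit at least this good} \mid \text{different sources})}
         \;=\; \frac{0.9}{10^{-6}} \;=\; 9 \times 10^{5},
      \qquad \log_{10} LR \;\approx\; 5.95.
    \]

On this reading, “very strong support” is just a verbal report that the (log-)likelihood ratio is large, whatever exact number an examiner might be willing to attach to it.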

Nevertheless, defining physical fit as a categorical opinion rather than a more variable degree of congruency that generates the opinion — and dumping everything short of a perceived fit into the category of “no physical fit” — deviates from the common understanding that physical fit comes in degrees. There can be a remarkably great fit, a pretty good fit, and so on, down to a blatant misfit. The question the examiner must answer, at least intuitively, before the fit/no-fit classification can be made is just how well the pieces fit together. Fit is not a uniform degree of association that springs into existence exactly when a particular examiner is convinced that no other source could account for the complexity and extent of the fit. There is no such thing as “the strongest support.” One can always conceive of a situation with still stronger support (because a fracture or other separation of the pieces could generate an even richer set of irregularities in the edges).

The current approach of defining a physical fit as a single source for the pieces and calling everything else “no fit” does not create a vocabulary that judges or jurors will easily understand. A vocabulary in which physical congruency (fit) lies on a continuum — and that then addresses the inference that should be drawn from the observations — is more transparent. The definitions in the standards collapse the two steps of data acquisition and inference into one.

II. Inference: From Data to Conclusions

So how should examiners answer the question of how well the pieces fit together? An examination for fit yields multidimensional, spatial data. An examiner could present photographs of the aligned edges and surfaces and highlight the concordant and discordant features. Although the highlighting involves some interpretative thinking, I have called a courtroom presentation that stops at this point "features-only testimony." \3/ It is appropriate when examiners have no special expertise at interpreting how strongly their results support the same-source hypothesis. If they are no better than lay judges and jurors at discerning how improbable the features are in the hypothetical cases of repeatedly breaking the same object, it could be argued that these witnesses should not try to interpret the results any further. Such interpretation would not actually assist the trier of fact, as required by Federal Rule of Evidence 702.

For example, a few days ago, a forensic scientist told me of a case in which a criminalist was able to reassemble pieces of glass recovered at the site of a hit-and-run accident so that they fit neatly into the metal holder of a side rear mirror on the suspect’s car that was missing its glass. That’s good detective work, but did the criminalist have any special insights to offer into the obvious implications of this solution to the jigsaw puzzle? (The work was not presented in court because the crime laboratory’s management was concerned that there was no written protocol for pasting mirror fragments back in place. As the scientist observed, that's silly. The evidence practically speaks for itself, and its message is the same with or without a written protocol.)

Nevertheless, let’s assume that examiners do have specialized skill at interpreting the findings about the alignment of the features. The ASTM and OSAC-proposed standards ignore the possibility of a qualitative expression of relative support — for example, “It is far more likely to get the detailed alignment of the features I just showed you if the pieces were broken parts of the same object than if they were from different objects.” Or, similarly, “The detailed alignment gives very strong support to the idea that the pieces broke off of the same object as opposed to two different objects.”

As Part I showed, the standards advocate a fit/no-fit classification in which “fit” is either a statement about the probability of the same-source hypothesis (that the pieces had to have come from the same object) or a statement of belief in the hypothesis (“my opinion is that they were together in the same object — that’s what makes it a physical fit”). No-fit does not have a comparably sharp meaning. It could mean anything from no realistic possibility that the pieces were once contiguous parts of the same object to “partial fit features [that] increase the significance of the finding” (OSAC 2022-S-0015 § 13.2.4).

A more straightforward and comprehensible approach would be to have a three-tiered reporting scale for the support the data give to the same-source hypothesis. What is now called a physical fit would be designated a highly probative physical fit (that is, a physical fit that strongly supports the same-source hypothesis). “Partial fit features” would be described as a limited fit (that gives some support to the same-source hypothesis). Finally, an obvious mismatch could be called a misfit (which strongly supports the conclusion that the pieces were never adjacently located on the same object).
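For readers who want to see the bookkeeping spelled out, the following is a rough sketch, in Python, of how such a three-tiered report might map onto an underlying (subjective) likelihood ratio. The function name and the numerical cut-points are mine, invented solely for illustration; nothing in the current or proposed standards specifies them.

    # Illustrative sketch only: the cut-points are hypothetical and are not taken
    # from any ASTM or OSAC document. They map a subjective likelihood ratio (LR)
    # for the same-source hypothesis onto the three-tiered scale suggested above.

    def report_category(lr: float) -> str:
        """Return a verbal reporting category for a given likelihood ratio."""
        if lr >= 10_000:
            # The observed alignment is vastly more probable if the pieces share a source.
            return "highly probative physical fit"
        if lr > 1:
            # The alignment gives some, but only limited, support for a common source.
            return "limited fit"
        # An obvious mismatch corresponds to an LR far below 1, strongly supporting
        # the conclusion that the pieces were never adjacent parts of the same object.
        return "misfit"

    # Example: a subjective LR of about 500 would be reported as a "limited fit."
    print(report_category(500))        # limited fit
    print(report_category(2_000_000))  # highly probative physical fit

Put this way, the tripartite labels are simply coarse summaries of where the examiner judges the likelihood ratio to lie.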

This tripartite classification is an imperfect way to express an underlying likelihood ratio formed from subjective probabilities. Whether better results would be achieved if analysts were forced to articulate their probabilities, either quantitatively or in the qualitative way mentioned earlier, is an interesting question. But the three-tiered reporting scale is closer to the current practice and seems feasible. \4/ It offers a framework for a better standard on reporting the results of a physical fit examination. Or so it seems to me — those who disagree are encouraged to hit the comment button.

NOTES

  1. But see Forensic Science’s Latest Proof of Uniqueness, Dec. 22, 2013, http://for-sci-law.blogspot.com/2013/12/forensic-sciences-latest-proof-of.html.
  2. The other commenters were Alyse Bertenthal, Amanda Black, Jennifer Friedman, Julia Leighton, Kate Philpott, Emily Prokesch, Matt Redle, Andrea Roth, Maneka Sinha, and Pate Skene.
  3. David H. Kaye et al., The New Wigmore on Evidence: Expert Evidence (2d ed. 2011).
  4. When there is a mismatch, testimony about a physical match has little value. Features other than the alignment of edges and surface markings will need to be studied if the expert is to shed light on whether the pieces came from a single object. The current and proposed standards are clear on this point.
