According to ProPublica, a ruling that Dr. Richard Vorder Bruegge’s “testimony met the Daubert standard” “enshrined the FBI unit’s techniques and testimony as reliable scientific evidence.” This characterization seems to overstate the importance of a single, unpublished pretrial ruling that followed a short, spur-of-the-moment hearing. 2/ Still, the testimony was central to McKreith’s conviction on the bank robbery counts. Dr. Vorder Bruegge’s conclusions fortified the testimony from ordinary witnesses that, among other things, McKreith and the bank robber wore two wrist watches at the same time; wore or had ski masks in south Florida; wore plaid shirts; drove a maroon, burgundy or red-colored car; and had the same general appearance.
Dr. Vorder Bruegge’s signal contribution came from comparing video images from bank cameras to items seized from the defendant’s home and to McKreith himself. Some of this testimony was not definitive. For example, Vorder Bruegge testified that enlargements of the photographs from several bank robberies lacked sufficient resolution for him to say “with a scientific certainty” that they showed wristwatches. Instead, he stated that they were “consistent with these features being two watches on the left wrist of the bank robber.” Likewise, he testified that a bank photograph that showed a profile of the robber’s face lacked the resolution to reveal allegedly “individual identifying characteristics [such as moles, scars, chipped teeth, ear patterns or other facial minutiae],” but “the overall characteristics of the profile, which include the shape of the nose, mouth, and chin” displayed “similarities” “consistent with” McKreith’s being the bank robber.
Dr. Vorder Bruegge’s analysis of the pattern of a plaid Van Heusen shirt in photos from seven of the robberies produced more conclusive evidence. A condensed version of the testimony on direct examination on December 16, 2002, is in the shaded boxes that follow. Discussion is interspersed between the boxes. ProPublica posted the full transcript of the testimony from Dec. 16 and 17.
"Individual Identifying Characteristics"
MR. [Roger] STEFIN [Assistant US Attorney]: Our next witness would be Richard Vorder Bruegge.
THE COURT: Okay, members of the jury, [o]ur next witness will be an expert witness. ....
[Defense counsel objected that the government first needed “to lay a sufficient predicate” and “request[ed] a Daubert ... hearing.” The court held an impromptu hearing, with no written briefing, at which Dr. Vorder Bruegge affirmed that “the techniques [are] well recognized ... in forensics [a]nd in the scientific community at large.” The court overruled the defendant’s objection.]
Q. [by MR. STEFIN] [H]ave you had the opportunity to view the eight by ten photographs that have been introduced in evidence in this particular case with respect to the bank robbery images in each of the eight bank robberies?
A. Yes, I have. ...
Q. ... Let’s talk about the shirt now. What is it about the manufacturing process of shirts that might enable you to identify features of that shirt that could be then compared with image analyses?
A. Well, any, any comparison analysis involves a comparison of first of all, the class characteristics. ... In a shirt like this, the class characteristics ... include such things as, does it have a pattern, which this shirt does — it has a plaid pattern or a checked pattern, if you will. ... It also is a button-down shirt — it's not a pull-over, and it's a long-sleeved shirt. ...
Once one has ... found ... class characteristics that match, one can move on to look at the individual identifying characteristics. ... [I]f I were trying to differentiate one person from another and reach a positive identification, then I would need individual identifying characteristics, such as moles, scars, freckle patterns, chipped teeth. These, these features enable you to really differentiate people down to saying this person is unique from all other people. [B]ecause this shirt is a patterned shirt, there are individual identifying characteristics ... based upon ... the way ... the pieces on the shirt are cut out and then sewn together.
The class-individual distinction is entrenched in forensic science, and there is a grain of truth in it. But it assumes what is to be proved -- that every "individual characteristic" (or some unspecified combination of them) makes an individual distinguishable from every other individual. "Class characteristics" are known to be generic, whereas "individual" ones might not be. Logically, both work the same way -- the presence or absence of a characteristic changes the probability of a common source for the specimens or images being compared. A less tendentious pair of terms would be "generic" and "randomly acquired or incorporated," keeping in mind that random characteristics are not necessarily specific to an individual.
"One in 35 Through Random Processes"
Q. So what can you conclude with respect to the pattern itself on this shirt?
A. [B]ecause we don't see the patterns matching across the seams, we can use that as a way to individualize this shirt relative to the other shirts that would be manufactured at the same time. ...
Q. [W]hat do you call this when you find an individual feature?
A. Individual identify characteristic. Basically, each seam ... can be considered on its own as an entire set of individual identifying characteristics. It's almost one individual identifying characteristic. ... [I]f we look at where the right yoke meets the right front panel, there's this ... thick dark that I've been talking about this comes down from the neck, and it terminates here about 2/3 of the way across this panel ... . Likewise, if we look at the left sleeve, I point out that ... very thin dark line here, which is coming up the right sleeve. It almost exactly meets the point where the yoke joins the front panel. ... Likewise, ... look at the collar itself. You'll see that the collar has a very dark line [that] just cuts the corner of the bottom hole on this side. That is paralleled on the other side because it's one piece. So that's another individuating characteristic for this shirt. ...
Q. All right. And did you take precise measurements of these types of lining up of the different — the thick lines or the thin lines in order to further identify, you know, the individual characteristics of this particular shirt?
A. I measured the width of these features on this shirt so that I could figure out [the] relationship ... between places meeting on one side with those meeting on the other side.
Q. And I think you said that the pattern repeats basically every three-and-one-half inches?
A. Yes. Basically, ... if we just repeat going from this dark line to the next dark line, it's three-and-a-half inches. ...
Q. The thick line or the thin line?
A. The thin line. Each ... of these features repeats every three-and-a-half inches. It's just that the thin line is the smallest feature that we can see very easily.
Q. So you were using the thin line as a point of reference in measuring out the three-and-one-half inches?
A. Exactly. Recall that I said there's a curved surfaces here where the sleeve meets the other seams. That would complicate any type of repeat analysis because since it's not a straight line, that's going perpendicular to that feature, so it's actually going to be longer.
So basically, if we did have a single seam that we were trying to match this pattern to run across it, the chances of that very thin dark line matching up, matching up or at least touching the black line on the other side, would be one in 35 through random processes.
Q. Could you explain that, explain that a little bit, how you came up to the one in 35? ...
A. ... Let's use, let’s use the yoke and the sleeve. ...
Q. OK ... And the yoke, again, is this back panel ... ?
A. Right. I'm using the yoke in the back because they're almost aligned. ... You can see here that on the yoke the dark line is ... slightly below the seam here. And you have to go about half an inch down before you hit the black line on the other side.
Q. Okay. In other words, you can measure the distance between the black line on the yoke to the black line on the sleeve, and you measured, say ... a half an inch?
A. Yeah.
Q. All right. And so just taking that one point of reference, what would be the odds that two shirts coming out the same manufacturing plant being manufactured in this, in this method would end up having the same alignment of yoke and the sleeve, whereby you could have this ... half an inch offset between them just for this one point of identification?
A. Just for that one point, it would be a 1 in 35 chance.
Q. And how did you come up with the number that it's a 1 in 35 of probability that these two items would line up the same in more than one shirt?
A. Because the feature itself that I’m looking to align is 1/35th of the overall repeat length. And to see ... that one feature come up randomly happens to be one in 35 times.
Q. So one in every 35 shirts manufactured by the company using this patterned cloth, the exact same cloth, you could say that one in 35 probability-wise would come up with the exact same alignment between the yoke and the left sleeve?
A. That is correct.
Several things combine here to give rise to a uniform probability distribution. First, the plaid design of the shirt comes from a repeated 3.5"-wide distinctive pattern that contains lines of at least two different thicknesses. Second, these lines can be offset from one another across the seams of the shirt. Third, Dr. Vorder Bruegge measures how large the offset is in 1/10" strips on the 8x10" photographs. Finally, the exact offset -- and hence the strip into which a corresponding line on the other side of a seam falls -- is determined at random. Thus, the number (call it X) of 1/10" strips that separate the starting point of every plaid block on the two different cuts of fabric that are sewn together along a seam can take on the value x = 0, 1, 2, ..., 34, with probability f(x) = 1/35 for every x. The following sketch of a repeating block showing only one line in the pattern may clarify what x stands for:

1/10" strip  ---------| |
1/10" strip           | |
1/10" strip           | |---------} x=2 (2/10" offset)
1/10" strip           | |
...
1/10" strip  ---------| |
1/10" strip           | |
1/10" strip           | |---------
1/10" strip           | |
...
                       s
1/10" strip  ---------|e|
1/10" strip           |a|
1/10" strip           |m|---------

The idea of assigning an equal probability to elementary events dates back to Bernoulli (1713) and Laplace (1814). The underlying principle of insufficient reason, indifference, or symmetry holds that if there is no reason to believe that any event is more likely to occur than any other event in a set of mutually exclusive and jointly exhaustive events, then one should assume that all the events are equally probable. This principle offers one way to motivate or understand the axioms of probability theory. Although it has fallen out of favor, it is not without defenders. 4/
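The uniform model described above can be written out in a few lines. The following is only an illustrative sketch, not anything the FBI examiner used: the function and constant names are invented here, and the 1/10" unit is taken from the testimony that the thin line is 1/35th of the 3.5" repeat.

```python
from fractions import Fraction

# Figures taken from the testimony as summarized above.
REPEAT_LENGTH_IN = Fraction(7, 2)   # the plaid pattern repeats every 3.5 inches
STRIP_WIDTH_IN = Fraction(1, 10)    # the thin line is 1/35th of the repeat

def num_offsets(repeat=REPEAT_LENGTH_IN, strip=STRIP_WIDTH_IN):
    """Number of distinguishable offset positions within one pattern repeat."""
    return int(repeat / strip)

def uniform_pmf(n):
    """Assumed model: every offset x = 0, ..., n-1 is equally likely."""
    return {x: Fraction(1, n) for x in range(n)}

n = num_offsets()
pmf = uniform_pmf(n)
print(n)                  # 35
print(pmf[2])             # 1/35
print(sum(pmf.values()))  # 1
```

Exact fractions are used so that 3.5 divided by 0.1 comes out to exactly 35 rather than a floating-point approximation.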
But Vorder Bruegge did not base his probability model on a metaphysical or philosophical principle. He gave an empirical justification. During the short Daubert inquiry, he testified that he visited "manufacturing plants, factories where articles of clothing [are] made" and that "in this particular instance, I visited ... cutting plants in Alabama where the patterned material is cut out, as well as manufacturing plants where the cut out pieces are sewn together so that I can see for myself how the process takes place." Furthermore, "I've also been to another plant, Guess plant in Southern California, where shirts were also manufactured, and found that they use the same manufacturing process at the Guess Factory as they do at the Arrow shirt factories in Alabama and ... Georgia." He learned that "[m]anufacturers do not make an effort, in general, to make these features align, because to do so would be prohibitively expensive" and that "there are also places where it is not possible ... to make them align because of the curvature such as along the arms and the sleeves."
In essence, he proposed that busy workers stitched the pre-cut pieces of fabric together without regard to how well their patterns aligned with one another. He elaborated:
Q. All right. Could you ... explain the manufacturing process ... as to how shirts of this nature would be made in a factory?
A. [Y]ou start off with a huge bolt of cloth that can be hundreds and hundreds of yards long. The cloth is then laid down at a cutting plant on a table that [is] maybe about a hundred yards long. Now, these huge bolts will be rolled out on the table and then rolled back onto itself until the entire roll is done. The another role is attached, ... and it continues until you have on the order of 500 plies of material. Now, because of the way that the material is rolled back in, this pattern is not going to line up from one ply to the next. ...
Once the fabric is laid out and you've got 500 plies on top, the manufacturers lay out, basically, tracing paper that has cutting patterns. Just like if you were sewing at home and making your own clothes, you would have a pattern to cut out. Only this is a hundred yards long [and] has been designed by engineers whose only job is to figure out how best to place each piece of this shirt in its closest proximity so they are wasting as little of that fabric as possible. ...
[T]hey slap it down and they will actually have people get on there with jigsaws and cut them out by hand. Or in some of the plants they will have computers that can manage the cutting out. Once all of those pieces are cut out, they [are] transferred over to the sewers ... [Y]ou've got people who do nothing all day but sew on sleeves onto shoulders or yolks on to the back.
Q. ... So there will be, like, thousands of pieces for the collar and thousands ... for the sleeves and so forth?
A. ... If you have 500 plies thick, you have 500 pieces in one particular area. ... [O]n this shirt we've got two pieces on the collar. If you look closely at this shirt, you'll see there's actually a piece here it appears on the outside and a piece on the inside. There's another piece that goes around the collar; so that's three pieces. ... [All together, there are] 16 pieces of this same pattern cloth on this shirt [that must be sewn together].
Q. What significance does that have in terms of determining whether or not, you know, identifying the uniqueness of a particular shirt as opposed to any of the other shirts that are manufactured by the plant using the same cloth coloring?
A. Well, as I mentioned before, every one of these pieces is cut out. The plies line up on every row. If, in the manufacturing process, they happened to take a piece from the top layer and try to stitch it to a piece from the next layer down, you're going to get a lot of randomness in this process. Because there is this offset, you're not going to get this pattern lining up across the seam in any consistent way from one shirt to the next. ... [I]n fact, you can't get a curved seam like this to have an alignment across it, because it isn't geometrically possible.
You look at the pockets on this shirt and you'll see that they line up. [T]he fact that one of these pockets lines up makes this shirt twice as expensive to manufacture as it would be if this pocket were on the bias. [T]hey have to make sure that this pocket is cut out and exactly the same orientation as the front panel. Furthermore, they have to actually have a human being take the time to physically line this pocket up when they're sewing ... .
Q. Is there any effort to line up any of the other pieces of cloth? In other words, lining up the lines from the back to the sleeves, or the yolk to the sleeves, or the yolk to the front panels —
A. No. No. And you can see that for yourself just by comparing the way the left sleeve doesn't align with the yoke in the same way that the right sleeve aligns with the yoke. ...
The uniform-probability model of the placement of the lines is not as "preposterous" and "outrageous" as the ProPublica article suggests. But no one can see for themselves that there is no "effort to line up any of the other pieces of cloth" from the mere fact that the plaid patterns are displaced where a "sleeve aligns with the yoke." That is like saying that a marksman is shooting entirely at random because he missed the bullseye. There is a random component, but if the marksman is skilled, bullets are more likely to arrive near the center of the target than to be dispersed equally across it.
Our theory about the marksman firing at random would be more credible if we saw that he was wearing a blindfold and firing rapidly. Likewise, the theory in McKreith is that the workers won't "take the time to physically line [the plaid patterns] up when they are sewing." It could be a good theory, but how has it been validated? Might not some workers try to get the unit patterns to line up just a little bit as they join the pre-cut pieces of fabric? If that happened, smaller cross-seam offsets would be more probable than larger ones. We could test the theory empirically by inspecting shirts. If the uniform probability model is correct, we would expect each possible offset (x = 0, 1, 2, and so on up to 34) to turn up about equally often at a given seam in a large sample of Van Heusen shirts with the same design. The FBI apparently had no such data to support the probability model.
To be sure, the probability model was empirically motivated: Dr. Vorder Bruegge informed himself about the manufacturing process by visiting several manufacturing plants. His sense of the process might be entirely correct. When interviewed, I was not told what model he had adopted and what the basis for it was. But even if I had been asked to study the testimony before reacting, I still might have said that even a plausible model could be "terribly flawed" (or at least not well validated).
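One way such an inspection of shirts could be analyzed is a goodness-of-fit test of the observed offsets against the uniform model. The sketch below is a hedged illustration only: the counts are invented, no such data appear in the record, and the simulation-based p-value is just one of several reasonable ways to run the comparison.

```python
import random

def chi_square_stat(counts):
    """Pearson chi-square statistic against equal expected counts."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

def monte_carlo_p_value(counts, n_sims=2000, seed=0):
    """Share of simulated uniform samples at least as uneven as the data."""
    rng = random.Random(seed)
    observed = chi_square_stat(counts)
    total, k = sum(counts), len(counts)
    hits = 0
    for _ in range(n_sims):
        sim = [0] * k
        for _ in range(total):
            sim[rng.randrange(k)] += 1
        if chi_square_stat(sim) >= observed:
            hits += 1
    return hits / n_sims

# Hypothetical counts of shirts at each offset x = 0, ..., 34 (invented data).
# Small offsets are over-represented, as if sewers nudged patterns toward alignment.
skewed = [50, 30, 20, 15, 15] + [8] * 30

print(chi_square_stat(skewed))
print(monte_carlo_p_value(skewed))
```

A p-value near zero for counts like these would count against the uniform model, while counts spread evenly over the 35 offsets would be consistent with it (though they would not prove it).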
"650 Billion, Give or Take a Few Billion"
A characteristic that results from the manufacturing process that occurs with a probability of 1/35 does not "individualize" an item in the only-one-in-the-universe sense that criminalists use the term. It is a generic feature, and Dr. Vorder Bruegge did not claim otherwise. To arrive at the ultimate opinion took a few more steps. The first was to assume that the offset at each seam is statistically independent of the offset at every other seam (and combination of them).
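What the independence assumption does to the numbers can be seen in a small exact calculation. The dependence model here is purely hypothetical and chosen for simplicity: with probability rho the second seam simply repeats the first seam's offset, and otherwise it is drawn uniformly on its own. Nothing in the testimony suggests this particular mechanism; it only shows that the product rule stops holding once seams are linked.

```python
from fractions import Fraction

N = 35  # possible offsets per seam under the uniform model

def joint_independent(k, n=N):
    """P(all k seams show specified offsets) if the seams are independent."""
    return Fraction(1, n) ** k

def joint_two_seams_linked(t1, t2, rho, n=N):
    """P(seam 1 = t1 and seam 2 = t2) under the hypothetical linkage:
    with probability rho seam 2 copies seam 1, else it is uniform."""
    same = 1 if t1 == t2 else 0
    return Fraction(1, n) * (rho * same + (1 - rho) * Fraction(1, n))

rho = Fraction(1, 5)  # assumed strength of the linkage, for illustration
print(joint_independent(2))               # 1/1225
print(joint_two_seams_linked(3, 3, rho))  # 39/6125, larger than 1/1225
print(joint_two_seams_linked(3, 7, rho))  # 4/6125, smaller than 1/1225
```

In this toy model the true joint probability can be several times larger or somewhat smaller than the product of the marginals, depending on which pair of offsets is observed, so multiplying 1/35 by itself is justified only if independence actually holds.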
Q. Now, would the same randomness apply to all the other features in pieces of cloth that go into the shirt?
A. Yes, they would.
Q. So if you were able to, for example, reach a measurement as far as the left sleeve is concerned, with the yoke, would the line up — would the line up be the same or would it be, again, random with respect to the right sleeve?
A. There's going to be a 1 in 35 chance that it's going to be the same on the right sleeve as it is on the left sleeve.
Q. All right. So if you were able to find, for example, from the photographs, two points of identification, whereby the photograph matches the shirt in one area, maybe the yoke to the sleeve on the left side, and then you're able to find a second point of comparison or identification, say on the right sleeve and the yoke, what would be the odds of two shirts randomly being manufactured coming from the factory that would match this particular shirt?
A. It would be ... 1 in 35. But to simplify things and to be conservative, I prefer to use one in 30. By saying one in 30, that's — each giving it a better chance of being the same, but it makes the math easier. Thirty times 30 is 900. So one in 900 chance that you're going to find another shirt that has the left sleeve aligned to the yoke the same way and the right sleeve aligned to the yoke in the same way.
Q. All right. Now let's say you ... have a good enough picture that you can make three points identification. ...
A. Well, ... if we had sleeve to yoke, yoke to back, and yoke to sleeve, then that’s 30 times 30 times 30, which is one in 27,000.
Q. So if you were able to make three items of — points of identification the odds would be one in 27,000 that there would be two shirts randomly made from the factory that would have those three points of identification exactly the same?
A. Correct. ...
Q. And were you able to actually make those types of identification with respect to the photographs depicted in the bank robbery surveillance photos with this shirt ... .?
A. Yes. I was.
Q. And what was the highest number of points of identification that you were able to match up with respect to any particular bank robbery photo — in the surveillance photographs with respect to this shirt?
A. Eight. ...
Q. So 30 to the eighth power [would] be the odds in which two shirts would be randomly manufactured by the company [with] all those eight points of identification lining up exactly the same?
A. That's correct.
Q. And does your pocket calculator actually print out all the numbers that would come out if you were to insert 30 to the 8th power?
A. No. It came out to be 6.5 x 10 to the 11th, which is basically 650 billion, give or take a few billion.
Some of ProPublica's criticism of this part of the testimony misses the mark. The one example of the "[m]any problems in the examiner’s testimony [that] went unnoticed, or were simply unknown, during trial" is that "Vorder Bruegge undercut the precision of his calculations when he admitted having rounded down the shirt measurements used in his calculations because 'it makes the math easier.'" But how could anyone not notice that he used the figure of 1/30 instead of 1/35? And why is that reduction in "the precision of the calculations" a problem for the defendant? It means that the joint probability is even smaller than the pocket calculator's output.
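The direction of the rounding effect can be checked with exact integer arithmetic. The sketch below is only an illustration: the variable names are mine, and the only inputs taken from the testimony are the per-seam figures of 30 and 35 and the count of eight seams.

```python
# Compare the eight-seam denominators under the two per-seam figures.
# Variable names are illustrative; 30, 35, and 8 come from the testimony.
SEAMS = 8

denominator_rounded = 30 ** SEAMS    # the "conservative" figure used in court
denominator_measured = 35 ** SEAMS   # the figure implied by the 1-in-35 estimate

print(f"30^8 = {denominator_rounded:,}")
print(f"35^8 = {denominator_measured:,}")

# A larger denominator means a smaller random-match probability,
# so rounding 35 down to 30 works in the defendant's favor.
ratio = denominator_measured / denominator_rounded
print(f"35^8 is about {ratio:.1f} times larger than 30^8")
```

On these figures, 30 to the 8th power is 656,100,000,000, which agrees with the calculator's 6.5 x 10 to the 11th, while the unrounded denominator would have been roughly 2.25 trillion.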
When ProPublica contacted me and quoted the 1/650,000,000,000 figure, my reaction was "How could you get that number?" Not knowing anything about the case, I assumed it was the product of frequency estimates for different kinds of characteristics, such as shirt size, color, style, pattern, imperfections, discolorations, and so on. I doubted that such frequencies were known with sufficient accuracy to justify giving a single astronomically large denominator to a jury. 5/
Apparently, "Karen Kafadar, chair of the statistics department at the University of Virginia," had a similar reaction and was among the "seven statisticians and independent forensic scientists" who told ProPublica that "[t]he statistics were also preposterous" because "[t]he features Vorder Bruegge matched might be common in plaid shirts, making them of little value for identifying the garments." Indeed, Dr. Kafadar inveighed "that the 1-in-650-billion claim 'makes about as much sense as the statement two plus two equals five.'"
However, while the proposition that "two plus two equals five" seems to violate mathematical logic, it is not illogical to argue for a uniform probability distribution of offsets and to multiply probabilities as Dr. Vorder Bruegge did. If the 1/10-inch measurements are all correct, if the offsets are uniformly distributed over such strips, and if the independence assumption for different seams holds, then the probability of an equal offset at every one of n corresponding seams is $(1/35)^n$, just as he testified. The assumptions are part of a perfectly logical argument, and they are not inherently "preposterous." But they have not been the subject of any systematic study (that I know of).
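To see why the product rule follows from those assumptions (and only from them), one can enumerate every possible shirt in a toy version of the model. The following is a minimal sketch, assuming 35 equally likely offset positions per seam and independence across seams. The function name and the reference shirt are hypothetical, chosen for illustration.

```python
from fractions import Fraction
from itertools import product

def match_probability_by_enumeration(n_seams, n_offsets=35):
    """Fraction of all equally likely offset combinations that equal a fixed shirt.

    Assumes (as the testimony did) that each seam's offset is uniform over
    n_offsets positions and that seams are independent of one another.
    """
    reference = (0,) * n_seams  # hypothetical reference shirt; any fixed tuple works
    total = 0
    matches = 0
    for combo in product(range(n_offsets), repeat=n_seams):
        total += 1
        if combo == reference:
            matches += 1
    return Fraction(matches, total)

# The enumeration agrees with multiplying 1/35 once per seam.
for n in (1, 2, 3):
    assert match_probability_by_enumeration(n) == Fraction(1, 35) ** n
    print(n, match_probability_by_enumeration(n))
```

The enumeration shows only that the arithmetic is internally consistent. It says nothing about whether real shirts leave the factory with uniformly distributed, independent offsets, which is the empirical question that remains unstudied.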
The Probability of Uniqueness
With all that said, the testimony had an unusual virtue over the typical "individualization" thinking of its day. Dr. Vorder Bruegge followed Lord Kelvin's dictum that "when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science, whatever the matter may be." Dr. Vorder Bruegge was explicit and numerical about the grounds for concluding that only one shirt was involved in the pictures:
Q. [D]id you, during your research, contact the ... Van Heusen Company or the factory to determine ... approximately how many shirts using this cloth and this design were made by them?
A. They do not keep records of every single shirt that they ever make, but they have people who recognize patterns and they also know what their typical runs are. [A]t most, they said they would have made about no more than 18,000 of these shirts. ...
Q. So seven points of identification with respect to the bank robbery surveillance photos and the shirt with the Commerce Union bank robbery?
A. Yes.
Q. Let's go to the next robbery. ...
A. That would be seven, I believe.
Q. So the odds of this being random with two shirts, 30 to the seventh power?
A. If we use 30.
Q. Being conservative?
A. Yes. ...
Q. That's seven points of identification with respect to the Bank of America shirt?
A. Yes.
Q. [F]rom the ... Bank United robbery? ... Five positive points of comparison?
A. Right.
Q. Okay. South Trust ... Well it's either 30 to the 7th power or 30 to the 8th power or 30 to the 6th power?
A. That's it for South Trust. ...
Q. So that would be five points of identification for the Union Bank photographs?
A. Yes.
Q. [I]s there a point that you were able to conclude that all the photographs in all the different bank robberies are depicting the same shirt?
A. Well, for me in this case, it only took three points. Because with one to the 30th chance each time ... having three points ... I've got 30 times 30 times 30, which is 27,000, which is half again as many as 18,000. [S]o for my opinion it's enough to go over to say that three of those are enough. The fact that I've got four, five, six, seven and eight, just makes me all the more certain.
Q. And what is your opinion with respect to this comparison analysis?
A. They're all the same shirt.
Q. All right. And all the shirts. In the bank robbery surveillance photographs, do you have an opinion as to whether or not they are the same shirt as Exhibit 11, which is the questioned shirt?
A. Government Exhibit 11, in my opinion, is the shirt worn by the bank robber in each of these seven bank robberies. ...
Right or wrong, the testimony is transparent about the threshold for concluding that a picture has the defendant's shirt in it. For 18,000 shirts, Dr. Vorder Bruegge asserted, it takes only three seams to conclude that there is but one shirt in existence. But is this a reasonable criterion for an inference of uniqueness?
It might seem that way. One might be tempted to reason as follows: The random-match probability for two seams is 1/900. For 18,000 shirts manufactured as described in the uniform probability model, we would expect to find about 18,000 shirts × 1/900 matches per shirt = 20 shirts with two matching seams. That is way too many to claim individuality. But for three matching seams, the expected number is 18,000 × 1/27,000 = 2/3. In other words, we would expect to find lots of two-seam matches, but not even one three-seam match. So it does not look like a second shirt is very likely for three or more matching seams.
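The expected counts in this tempting line of reasoning are easy to reproduce exactly. The sketch below assumes the testimony's ceiling of 18,000 shirts and the rounded 1-in-30 chance per seam. The function name is my own.

```python
from fractions import Fraction

SHIRTS = 18_000              # upper bound on production reported in the testimony
PER_SEAM = Fraction(1, 30)   # rounded per-seam match probability used in court

def expected_matches(seams, shirts=SHIRTS, per_seam=PER_SEAM):
    """Expected number of shirts matching at `seams` seams,
    under the uniform, independent-seam model."""
    return shirts * per_seam ** seams

for seams in (2, 3):
    value = expected_matches(seams)
    print(f"{seams} seams: expected matches = {value} (about {float(value):.2f})")
```

Exact fractions avoid any rounding in the comparison: the two-seam expectation is a whole number of shirts well above one, and the three-seam expectation is exactly 2/3.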
But there is a problem here that did go unrecognized in the case and the article about it. I have called it the expected value fallacy. 6/ It consists of thinking that as long as the expected number of items in a population is less than 1, the probability that there really is less than one is very high. Life, or at least mathematics, is not this simple. Even when the expected number is a little less than 1, as it is here, the probability of at least one additional matching item is appreciable. In this case, it is about 3/10, as shown in Box 1.
BOX 1. THE PROBABILITY OF ANOTHER MATCHING SHIRT
The number $y$ of items with a given characteristic that has a constant, small probability $p$ of appearing in each item in a large population of size $n$ is approximately a Poisson random variable with the parameter $\lambda = np$. The probability of any $y$ is $f(y; \lambda) = \lambda^y e^{-\lambda} / y!$. For the three-seam match, $\lambda = 2/3$. The conditional probability that at least one more item in the population has the characteristic, given the known fact that one such item is present, is $[1 - f(0; \lambda) - f(1; \lambda)] / [1 - f(0; \lambda)] = 0.30$.
In short, even ignoring any modeling and measurement uncertainty in McKreith, there is a 30% probability that another Van Heusen shirt that matches at three seams has been manufactured. This large a risk of an erroneous conclusion of individuality would not be acceptable under a 2000 FBI policy that uses similar reasoning in explaining when an analyst may testify that a DNA profile comes from a named individual.
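The 30% figure can be reproduced in a few lines. This is a minimal sketch of the Box 1 calculation, assuming the Poisson model with λ = 18,000/27,000. The function names are mine.

```python
import math

def poisson_pmf(y, lam):
    """Poisson probability f(y; lam) = lam**y * exp(-lam) / y!"""
    return lam ** y * math.exp(-lam) / math.factorial(y)

def prob_another_match(lam):
    """P(Y >= 2 | Y >= 1) for Y ~ Poisson(lam): the chance that a second
    matching item exists, given that one is already known to exist."""
    p0 = poisson_pmf(0, lam)
    p1 = poisson_pmf(1, lam)
    return (1 - p0 - p1) / (1 - p0)

lam_three_seams = 18_000 / 30 ** 3   # expected number of three-seam matches, 2/3
print(f"lambda = {lam_three_seams:.4f}")
print(f"P(another matching shirt) = {prob_another_match(lam_three_seams):.2f}")
```

The conditioning matters: dividing by 1 - f(0; λ) reflects the fact that at least one matching shirt, the seized one, is already known to exist.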
Thus, it appears that Dr. Vorder Bruegge chose a relatively undemanding quantitative threshold for source attribution. We can say this only because he was unusually clear as to the probabilistic basis for his conclusion that only one shirt (the defendant's) appeared in all the pictures. It also should be noted that the final opinion would have been the same had he selected a threshold as high as five, for which the conditional probability of matching Van Heusen shirts of the same type would be only 0.0004, or 0.04%. Still, the prosecution introduced the three-seam testimony to make the matches at five and more seams appear more impressive than they were. Even if this abuse of statistical reasoning did not affect the outcome, it was unfortunate.
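How quickly the risk falls as the threshold rises can be tabulated under the same Poisson model. The sketch below assumes 18,000 shirts and the rounded 1-in-30 figure per seam. The function name is hypothetical, and `math.expm1` is used only to keep the subtraction numerically stable when λ is small.

```python
import math

SHIRTS = 18_000   # production ceiling reported in the testimony
PER_SEAM = 30     # rounded number of equally likely offsets per seam

def prob_second_shirt(seams):
    """P(Y >= 2 | Y >= 1) for Y ~ Poisson(lam), lam = SHIRTS / PER_SEAM**seams."""
    lam = SHIRTS / PER_SEAM ** seams
    at_least_one = -math.expm1(-lam)                    # 1 - exp(-lam)
    at_least_two = at_least_one - lam * math.exp(-lam)  # 1 - f(0) - f(1)
    return at_least_two / at_least_one

for seams in (3, 4, 5):
    print(f"{seams} seams: {prob_second_shirt(seams):.4f}")
```

Under these assumptions the conditional probability drops from about 0.30 at three seams to about 0.0004 at five, which is why the choice of threshold, not the multiplication itself, is where the testimony was weakest.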
NOTES
- Ryan Gabrielson, The FBI Says Its Photo Analysis Is Scientific Evidence. Scientists Disagree, ProPublica, Jan. 17, 2019; Ryan Gabrielson, FBI Scientist’s Statements Linked Defendants to Crimes, Even When His Lab Results Didn’t, ProPublica, Feb. 22, 2019.
- United States v. McKreith, 140 Fed.Appx. 112, 2005 WL 1600471 (11th Cir. 2005) (per curiam).
- Hans-Werner Sinn, A Rehabilitation of the Principle of Insufficient Reason, 94 Q. J. Econ. 493 (1980); Jon Williamson, Justifying the Principle of Indifference, 8 European J. Phil. Sci. 559 (2018).
- An oral ruling in the midst of a trial rarely creates or enshrines a rule of law. The U.S. Court of Appeals for the Eleventh Circuit affirmed the conviction, but it did not consider its opinion important enough to release for publication, and McKreith did not raise the Daubert claim on appeal. Although the January article states that McKreith “exhausted his appeals, most of which attempted to dispute the FBI Lab findings,” the Westlaw database’s history of the case displays only one direct appeal and one petition for postconviction relief for ineffective assistance of counsel. In neither of these attacks did McKreith challenge the scientific basis of the FBI lab’s work, and no court seems to have cited the case to support admitting similar testimony.
- The article states that "The statisticians who reviewed Vorder Bruegge’s materials for ProPublica said the examiner’s calculations cannot be correct. Vorder Bruegge’s statistic — 1 in 650 billion — is simply too astronomical to be true, said Kaye, the Penn State professor. There isn’t a database documenting features on plaid-shirt seams like there is for human DNA, making it impossible to determine the likelihood a different shirt would appear to match the robber’s shirt." I was not one of "[t]he statisticians who reviewed Vorder Bruegge’s materials for ProPublica," but I was (and am) of the view that estimating such small numbers on the basis of strong modeling assumptions alone is fraught with danger. The same thing can be said about DNA evidence. Decades ago, England's leading forensic statistician, Ian Evett, rhetorically asked me why American experts give such small probabilities in DNA matches instead of stopping at a number like one in a million.
- David H. Kaye, The Expected Value Fallacy in State v. Wright, 51 Jurimetrics J. 1 (2011).