Monday, December 19, 2011

Finegan's Wake: A Partial DNA Match in Rhode Island

A Rhode Island newspaper reports that "[t]he Cranston Police Department has arrested 49-year-old David Finegan, of no permanent address, for the burglary and sexual assault of an 81-year-old woman." [1] Police "collected [DNA] at the crime scene on May 2." On June 10, the Rhode Island Department of Health reported that it had a DNA profile from the crime with which to query an offender database. The Rhode Island laboratory did not use software designed for kinship searching, but on July 8, the department advised detectives that it had found a partial DNA match to a female inmate.

The article does not speculate on why it took a month to complete a routine computer search and report the results. Was it because of legal concerns? Was the partial match trawl intentional, or was the discovery inadvertent? Whichever it was, the detectives turned their attention to five male siblings. They discovered that one of them, David Finegan, was "in close proximity . . . on the night of the incident."

Why this roundabout method of identifying Finegan? He was on parole in July. Was the underlying offense not one that triggered entry into the DNA database? Was there a backlog in entering offender profiles into the database? Whatever the explanation, Finegan had the misfortune of being picked up on July 14 on a parole violation and held for the weekend. Detectives quickly obtained a search warrant and took a sample of DNA from him before he made bail and dropped out of sight. A week later, they learned that it matched the crime-scene DNA.

Pursuing an anonymous tip, police found and arrested Finegan in Providence. He "is being charged with burglary and first-degree sexual assault." Interestingly, he has an arrest record (including domestic assault, felony DWI, resisting arrest and other assaults) dating back to 1991. A bill that would expand the state database to include arrestees is before House and Senate committees in Rhode Island.


1. Joe Kernan, Arrest Made in Rape of Elderly Woman, Cranston Herald, Dec. 19, 2011


Thanks to Frederick Bieber for informing me of the Cranston Herald article.

Cross-posted from The Double Helix Law Blog

Friday, December 16, 2011

Abusing AFIS -- Conviction by Computer?

On television shows such as NCIS, seconds after a forensic scientist inserts an image of a latent fingerprint into a machine connected to an automated fingerprint identification system (an AFIS), a computer screen blinks MATCH FOUND ... MATCH FOUND ... MATCH FOUND ... , and the chase is on. Apparently, some police and prosecutors think this is real.

Robert Garrett, a past president of the International Association for Identification, writing in Evidence Technology Magazine, describes three cases in which governments have taken very serious actions against individuals based solely on the output of an AFIS search — with no review by any latent print examiner [1]. AFIS searches generate a list of potential matches — often the 20 closest matches as determined by an algorithm that looks at prints very differently from the way humans do. There is no proof that the AFIS ranks the candidates in the way that a skilled human examiner, relying on more information in the images, would. A number 1 candidate can be an obvious mismatch (to the human eye and brain). The Scientific Working Group on Friction Ridge Analysis, Study and Technology (SWGFAST) insists that “AFIS ranks and scores have no role in formulating and stating conclusions based on ACE-V,” the steps that latent print examiners follow [2], and that “[t]he practice of relying on current AFIS technology to individualize latent prints correctly is not sound” [3].

According to Mr. Garrett, U.S. Customs and Immigration Enforcement relies on raw AFIS results to initiate deportation proceedings, and grand juries issue indictments relying on this information without testimony from a qualified examiner that the AFIS match means anything [1, p.10]. He concludes that “AFIS hits must be examined by a qualified fingerprint examiner and the results of that examination verified before any proceedings are commenced against a potential suspect. It is unethical, unprofessional, and—most likely—unconstitutional to do otherwise” [1, p. 11].

I would not presume to question Mr. Garrett’s professional judgment of what is unprofessional conduct in the fingerprint expert community, but it seems fair to ask on what basis he concludes that the practice of using unreviewed AFIS output is “most likely ... unconstitutional.” The article offers two possible bases for this judgment. First, Mr. Garrett writes that
U.S. Supreme Court decisions in Melendez-Diaz v. Massachusetts and, more recently, Bullcoming v. New Mexico, reiterated a defendant’s Sixth Amendment right “to be confronted with the witnesses against him.” Reports of a laboratory or investigative finding do not satisfy the requirement.
This won’t wash. The Sixth Amendment right to confrontation is a trial right. It does not apply to administrative and grand jury proceedings. Even the usual rules for expert evidence are not binding in these proceedings. At trial, an AFIS hit, not used as part of the basis for an examiner’s opinion, would be impermissible under the rules of evidence, since this kind of scientific evidence is neither scientifically valid (under Daubert) nor generally accepted in the scientific community (per Frye). But this does not make it unconstitutional. An argument could be made that it deprives the defendant of due process to be convicted on the strength of such evidence. Cf. McDaniel v. Brown, 130 S.Ct. 665 (2010). However, the Confrontation Clause line of cases on which Mr. Garrett relies suggests that “machine-generated” test results can be introduced without the separate judgment of a human examiner. The state of Illinois currently is seeking to exploit this idea — unpersuasively in the judgment of many commentators — in the pending Supreme Court case of Williams v. Illinois (involving DNA tests and discussed on other postings on this blog).

The second suggestion regarding the constitutionality of AFIS evidence is that
Our society and its government have embraced technology in various forms for its efficiency and economy. In the areas of law enforcement and public safety, these technological advances have included AFIS, the Combined DNA Index System (CODIS), airport security screening devices, and red light/traffic cameras. But these advances bring with them compromises of privacy and our right “…to be secure in their persons, houses, papers, and effects…”
It would be extravagant to assert that AFIS, CODIS, airport magnetometers and ordinary scanners, and red light cameras are “most likely unconstitutional,” and this may not be what Mr. Garrett intended to state or imply.

In any event, the article is an eye-opener. Police and prosecutors who rely on unexamined AFIS matches are not acting professionally or responsibly. Mr. Garrett deserves thanks for shedding some light on this remarkable practice.


1. Robert J. Garrett, Automated Fingerprint Identification Systems (AFIS) and the Identification Process, Evidence Tech. Mag., July–Aug. 2011, at 10–11.

2. SWGFAST, Position Statement on the Role of AFIS Ranks and Scores and the ACE-V Process, Oct. 15, 2011.

3. SWGFAST, Press Kit, May 19, 2004, at 14.1.3.

Thursday, December 15, 2011

Williams v. Illinois (Part II: More Facts, from Outside the Record, and a Question of Ethics)

This morning, Professor Richard Friedman posted a revealing report that Cellmark sent to the Illinois State Police (ISP). As he explains, and as my previous posting on the facts of Williams v. Illinois indicated, the report consists of much more than "machine-generated" statements. But the "lodging" (not part of the record in the case) and Professor Friedman’s remarks warrant a few additions or revisions to my presentation of the facts of the case.

On cross-examination, ISP analyst Karen Kooi Abbinanti, who examined the blood sample that Williams gave under court order in another case, testified to Williams’s STR profile. Because ISP analyst Sandra Lambatos, who provided the state’s only evidence of a DNA match, testified that “there [was] a computer match generated of the male DNA profile found in semen from the vaginal swabs of [LJ] to a male DNA profile that had been identified as having originated from Sandy Williams,” I presumed that the Cellmark report listed this profile as coming from the male fraction of DNA in the vaginal swab. Indeed, Lambatos testified that the “allele chart” in the Cellmark report “included data that [she] used to run [her] data bank search.” Joint Appendix at 61. Thus, I wrote that
The unnamed analyst believed that the semen had the following profile: D3 (16, 19), vWA (17, 17), FGA (18.2, 22), D8 (14, 14), D21 (29, 30), D18 (13, 17), D5 (12, 13), D13 (11, 11), D7 (10, 12), D16 (9, 11), TH01 (7, 7), TPOX (11, 11), and CSF (8, 10). The analyst’s report included this profile . . . .
Now that the report is lodged, it is clear that this singular profile is not what the anonymous Cellmark analyst and Cellmark’s two laboratory directors, Robin Cotton and Jennifer Reynolds, signed off on. Their table, which was Lambatos’s “data,” has the entry of (10, 12, 13) instead of (12, 13) for the D5S818 locus. Had Ms. Lambatos used this tri-allelic genotype, Williams would have been excluded! (Tri-allelic, single-locus profiles are rare, but they are not unheard of. For example, one paper reports three cases of tri-allelic patterns observed during routine forensic casework on 5964 Belgian residents [1], and the D5S818 (10, 12, 13) profile has been observed [2].)

Ms. Lambatos, however, testified on cross-examination that the Cellmark report’s “deduced male donor profile” (to quote the report itself) was not actually a deduced profile, but only a list of deduced alleles. Joint Appendix at 71. Interpreting it in this fashion (which may well be the correct understanding of what the unknown analyst meant to write), she searched the unspecified database for certain two-allele subsets of the three alleles — namely, (13, 13), (10, 13), and (12, 13). Id. This made sense because, if Cellmark had correctly identified the victim’s profile — something that Lambatos did not check — then the rapist rather than the victim had to be the source of the 13-repeat allele.
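The allele-subset logic is simple enough to sketch. Assuming, as the testimony suggests, that the 13-repeat allele had to come from the male contributor, the candidate two-allele genotypes are just the pairs drawn from the deduced alleles that include 13. A minimal illustration in Python (the function name is mine, not anything in the case record):

```python
from itertools import combinations_with_replacement

def candidate_genotypes(deduced_alleles, obligate_allele):
    """Enumerate the two-allele genotypes consistent with a list of deduced
    alleles, keeping only those that contain the obligate (non-victim) allele."""
    return [g for g in combinations_with_replacement(sorted(deduced_alleles), 2)
            if obligate_allele in g]

# Deduced D5S818 alleles from the Cellmark report, with the 13-repeat allele
# attributable only to the male contributor:
print(candidate_genotypes([10, 12, 13], 13))  # → [(10, 13), (12, 13), (13, 13)]
```

These are exactly the three genotypes Ms. Lambatos said she searched for, plus the tri-allelic possibility she set aside.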

The circumscribed nature of Ms. Lambatos’s testimony on direct examination about the “DNA match” is worthy of comment. Full disclosure would have required a scientist to reveal that male profiles other than Williams’s were “consistent with” the vaginal-swab mixture and could have been picked out of a database in her trawl. Instead, Ms. Lambatos acquiesced in or suggested confining her testimony to Williams’s matching profile and the random-match probability associated with that one profile. In other words, she chose not to acknowledge possibilities that were inconsistent with the state’s theory. Does such selectivity contravene the professional responsibility of forensic scientists to “[a]ttempt to qualify their responses while testifying when asked a question with the requirement that a simple ‘yes’ or ‘no’ answer be given, if answering ‘yes’ or ‘no’ would be misleading to the judge or the jury”? [3]

The answer, I think, depends on how misleading Ms. Lambatos’s answers on direct examination were. This was not a case of a single profile that probably could exclude everybody except for a twin brother. The analysts were unable to distinguish between Sandy Williams and other males with similar, but not identical, profiles as possible sources of the male DNA. By not disclosing this fact, Ms. Lambatos and the prosecutor made the DNA “match” sound especially compelling. The prosecutor asked about “the male DNA profile found in the semen.” Ms. Lambatos made no effort to correct or clarify even though she firmly believed that Cellmark was reporting at least three different male profiles for the semen (and that Williams was, of course, a match to only one of them). Hammered with Ms. Lambatos’s figures for the frequency of Williams’s profile, a judge surely would think that only Williams or a mythical twin could have been the rapist. In contrast, a judge who understood that Cellmark's tests also pointed to men with other DNA profiles might have been more willing to entertain some doubt.

The counterargument is that the probative value of the evidence for the ambiguous profile is essentially the same as the probative value of the evidence for the unambiguous profile that Ms. Lambatos was asked about. Assuming that the vaginal swab DNA is a mixture of the victim’s DNA and one man’s DNA, and assuming that the laboratory called all the alleles correctly, the likelihood ratio for the hypotheses of Williams versus that of a random, unrelated man is 1/[p(10,12,13) + p(13,13) + p(10,13) + p(12,13)], where p is the random-match probability for the full genotype, including the alleles shown in parentheses. Ms. Lambatos computed the probability p(12,13) as falling in the quadrillionths. Although I have not consulted allele frequency tables, it is a safe bet that similarly small probabilities would pertain to the profiles with the (13,13) and (10,13) genotypes. The random-match probability for the profile with the tri-allelic pattern would be even smaller. (When asked by the defense, Ms. Lambatos testified that a tri-allelic male was not a real possibility.) Therefore, I would predict that the correct computation would not differ from the number given to the judge by more than an order of magnitude. Hence, the witness’ failure to clarify or correct the prosecutor in her questioning affected the probative value of the evidence minimally.
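The arithmetic behind this prediction can be illustrated with a back-of-the-envelope sketch. The genotype probabilities below are hypothetical stand-ins (only the (12,13) figure tracks the order of magnitude of the reported statistic for the black population); the point is simply that summing several similarly tiny probabilities moves the likelihood ratio by well under an order of magnitude:

```python
# Illustrative only: these genotype probabilities are hypothetical,
# not taken from any published allele-frequency table.
p = {
    "(12,13)":    1 / 8.7e15,   # the order of magnitude Ms. Lambatos reported
    "(13,13)":    1 / 2.0e16,   # assumed to be of similar magnitude
    "(10,13)":    1 / 1.5e16,   # assumed to be of similar magnitude
    "(10,12,13)": 1 / 5.0e18,   # tri-allelic pattern: assumed rarer still
}

lr_reported = 1 / p["(12,13)"]     # the LR implied by the testimony
lr_correct = 1 / sum(p.values())   # the LR for the ambiguous profile

print(f"reported: {lr_reported:.2e}, correct: {lr_correct:.2e}")
```

With these assumed inputs the two likelihood ratios differ by roughly a factor of two, consistent with the claim that the error is well within an order of magnitude.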

Nevertheless, for the expert to present such oversimplified testimony without any qualification seems problematic to me. When confronted with the omissions on cross-examination, the expert owned up to them, but did she ask the prosecutor to present her reasoning accurately in the first place? And if she did, why did the prosecutor not do it?


1. G. Mertens, S. Rand, E. Jehaes et al., Observation of Tri-allelic Patterns in Autosomal STRs During Routine Casework, 2 Forensic Sci. Int’l: Genetics Supplement Series 38–40 (2009).

2. NIST STRBase, Tri-Allelic Patterns, June 2, 2011.

3. American Society of Crime Laboratory Directors Laboratory Accreditation Board, ASCLD/LAB Guiding Principles of Professional Responsibility for Crime Laboratories and Forensic Scientists, Principle 19, Version 1.1, 2009.

Cross posted from The Double Helix Law Blog, 15 Dec. 2011

Was the White Powder Cocaine? Apparently so, but Melendez-Diaz Acquitted

In February, the Boston Globe reported the acquittal of Luis Melendez-Diaz [1]. Melendez-Diaz gained legal fame when the Supreme Court reversed his initial conviction for drug trafficking because the Commonwealth of Massachusetts used sworn "certificates of analysis" from a state laboratory to show that a white powder was cocaine. [2] A state statute allowed this official hearsay, and the certificates were not accompanied by oral testimony (or even a written report) of how the laboratory reached this conclusion.

On retrial, the Commonwealth "called to the stand a chemist from the state Department of Public Health who testified that the substance allegedly found in the back seat of a police cruiser with Melendez-Diaz and two other men in 2001 had tested positive for cocaine." Apparently the accuracy of the laboratory's finding was not in dispute. The defense lawyer said that the case "really seemed to be about guilt by association." Police had arrested Melendez-Diaz and two other men and placed them in the back of a police cruiser. Because the officers saw the men fidgeting on the way to the police station, they searched the car and found 19 bags of a white substance.

Evidently, Melendez-Diaz's proximity to the bags did not convince the jurors of his guilt. A spokesman for the district attorney's office said that “We’re 10 years out from the original incident, and the passage of so much time only makes a case tougher to try.” The acquittal did not free Melendez-Diaz, who is serving a separate 10-year sentence for drug trafficking.


1. Martin Finucane, Drug Defendant Retried on High Court’s Order Is Acquitted, Boston Globe, Feb. 11, 2011.

2. Melendez-Diaz v. Massachusetts, 557 U.S. __, 129 S. Ct. 2527 (2009).

Wednesday, December 14, 2011

Williams v. Illinois (Part I: Just the Facts)

The Supreme Court heard oral argument last week in Williams v. Illinois. The case could have been just another of the thousands of rape cases that work their way through state courts across the country. But the prosecution decided to take a shortcut to convict Sandy Williams. Rather than present the laboratory analyst who produced the DNA profiles of the victim and the rapist, it had an analyst from a completely different laboratory testify in a bench trial. This gap in the state's case eventually attracted the attention of the U.S. Supreme Court. Indeed, Williams is the third case in as many years in which the Supreme Court has agreed to review criminal convictions relying on the findings of laboratory workers who do not appear at trial.

Unsurprisingly, the parties frame the issue before the Court differently. On one hand, according to Williams the question is
Whether the prosecution violates the Confrontation Clause when it presents . . . the substance of a testimonial forensic laboratory report through the trial testimony of an expert witness who took no part in the reported forensic analysis, where the defendant had no opportunity to confront the analysts who authored the report.
Brief for Petitioner at i. On the other hand, according to the state, the issue is
Whether a criminal defendant’s Sixth Amendment right to confront witnesses against him is satisfied where a prosecution expert testifies live at trial to her independent, expert opinions and is subject to unrestricted cross-examination.
Brief for Respondent at i.

How can the expert’s opinions be “independent” if she simply took what someone else told her as true in forming them? What is the value of “unrestricted cross-examination” when the witness’s knowledge of what transpired is so severely restricted? Understanding what information the testifying witness relied on seems crucial to an informed resolution of the case. Yet, the prosecution elicited no detailed information on this at trial. In this posting, I shall describe the facts of the case more fully and what I was able to extract by reading the trial transcript about the “data” (as the state’s witness called it), used in forming her “independent, expert opinions” (as Illinois calls them). Later postings will comment on the oral argument and the legal issues.

The Rape Kit

The Supreme Court of Illinois outlined the events leading to the submission of biological evidence to the state laboratory:
On February 10, 2000, 22–year–old L.J. worked until 8 p.m. as a cashier at a clothing store in Chicago. On her way home . . . [a]s she passed an alley, the defendant came up behind her and forced her to sit in the backseat of a beige station wagon, where he [sexually assaulted her]. He then pushed L.J. out of the car while keeping L.J.'s coat, money, and other items. After L.J. ran home, her mother opened the door and saw her in tears, partially clothed with only one pant leg on. [H]er mother called the police.

Shortly after 9 p.m., Chicago police officers arrived at the home . . . . After L.J. told the officers what had transpired, the officers issued a “flash” message for a black male, 5 foot, 8 inches tall, wearing a black skull cap, a black jacket and driving a beige station wagon. An ambulance transported L.J. and her mother to the emergency room. [V]aginal swabs . . . were . . . placed into a criminal sexual assault evidence collection kit along with L.J.'s blood sample. The kit was sent to the Illinois State Police (ISP) Crime Lab for testing and analysis.
On February 15, 2000, [a] forensic biologist . . . performed tests that confirmed the presence of semen. . . .
People v. Williams, 939 N.E.2d 268, 270–71 (Ill. 2010).

To Maryland and Back

Like too many police laboratories, the ISP lab was behind in processing rape kits and other DNA samples. So after letting the rape kit sit for nine months, it sent the vaginal swab that it knew contained semen along with the reference sample of LJ’s blood to a private company, Cellmark Diagnostics, in Germantown, Maryland, via Federal Express. Cellmark received the samples the next day (November 29, 2000). Another four months went by before Cellmark returned the samples and supplied a report (on April 3, 2001).

The analysis of the reference sample of the victim’s blood at Cellmark should have been straightforward. That sample had plenty of DNA purely from LJ. Nevertheless, no one testified about the electropherogram. Was it clean, with clear peaks, or did it have blobs, pull-up, or off-ladder peaks? The transcript of the testimony, reproduced in relevant parts at the end of this posting, does not contain such questions. The testifying expert never looked at this electropherogram or the data underlying it.

The transcript does suggest, however, that the vaginal swab was not so easily analyzed. Vaginal swabs from rape victims often contain epithelial cells from the victim and some number of sperm cells. Extracting the DNA yields a female fraction from the victim’s epithelial cells and a male fraction from the sperm cells. The two DNA profiles will be mixed together, and someone must deduce which male profiles are consistent with the mixture (and the probability that each such profile is present in the mixture).

Sometimes this “mixture deconvolution” can be avoided by a procedure that extracts the female DNA first and then extracts the male DNA (because sperm cells are harder to break open than are epithelial cells). Under ideal conditions, the latter extract contains DNA from the sperm cells only. Unfortunately, the differential extraction failed. The second extract still was a mixture of DNA from LJ and the unknown rapist.

So the Cellmark analyst did his or her best to decipher it by “subtracting” the victim’s peaks (as ascertained from the victim’s reference blood sample) and attributing the remaining peaks to the rapist. This was not as simple as it sounds, because the individual STR alleles are not that uncommon, and the victim and the rapist probably had some alleles in common. In any event, the Cellmark analyst arrived at a single profile — a set of 13 pairs of numbers characterizing the DNA that the rapist inherited from his mother and father. The unnamed analyst believed that the semen had the following profile: D3 (16, 19), vWA (17, 17), FGA (18.2, 22), D8 (14, 14), D21 (29, 30), D18 (13, 17), D5 (12, 13), D13 (11, 11), D7 (10, 12), D16 (9, 11), TH01 (7, 7), TPOX (11, 11), and CSF (8, 10). The analyst’s report included this profile and LJ’s profile (derived from her reference blood sample) as well as one electropherogram from the mixture.
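The core of the “subtraction” step can be sketched in a few lines of Python. This is a deliberate oversimplification: real mixture interpretation also weighs peak heights and stochastic effects, and the one-locus data below are a hypothetical fragment, not the case electropherograms. In particular, any mixture allele the victim also carries cannot be cleanly assigned to either contributor:

```python
def deduce_male_alleles(mixture, victim):
    """For each locus, attribute to the male contributor every mixture allele
    not explained by the victim's reference profile. Alleles shared by both
    contributors remain ambiguous. (Simplified: ignores peak heights.)"""
    deduced = {}
    for locus, mix_alleles in mixture.items():
        victim_alleles = set(victim.get(locus, ()))
        deduced[locus] = {
            "obligate_male": sorted(set(mix_alleles) - victim_alleles),
            "ambiguous": sorted(set(mix_alleles) & victim_alleles),
        }
    return deduced

# Hypothetical one-locus illustration (not the actual case data):
mixture = {"D5S818": [10, 12, 13]}
victim = {"D5S818": [10, 12]}
print(deduce_male_alleles(mixture, victim))
```

Even in this toy example, the 13-repeat allele is the only one that must come from the male, which is exactly why the shared alleles make the deduction of a full male genotype nontrivial.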

Meanwhile Williams’ Profile Goes Into the Police Database

During this year of desultory activity, Sandy Williams ran into trouble with the Chicago police:
On August 3, 2000, police arrested the defendant for an unrelated offense and, pursuant to a court order, drew a blood sample from [him]. On August 24, 2000, forensic scientist Karen Kooi [at the ISP lab] performed an analysis on the sample . . . . Kooi extracted a [DNA] profile and entered it into the database at the ISP Crime Lab.
Williams, 939 N.E.2d at 270–71. Which database this was is not clear. It seems unlikely that it was the state’s database for linking known criminals to crime-scene samples, for that database was limited to convicted offenders. Only this year did Illinois pass a law to extend the state database to include certain arrestees.

ISP Uses the Cellmark Profile to Pick Out Williams

The analyst at the police lab who received the materials from Cellmark was Sandra Lambatos. Her testimony is rather fuzzy with regard to how she proceeded. I cannot tell for certain whether she immediately entered the male profile as Cellmark reported it into the unspecified computer database system or whether she waited to do that until she looked more deeply into Cellmark’s report. In any event, she queried the database for the profile that Cellmark reported must have come from the sperm. Bingo! This profile matched the profile of Williams taken after his arrest in August. Every 13-locus profile is exceedingly rare in the general population — perhaps unique to an individual and any identical twins.
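The comparison itself, once both profiles are in hand, is mechanical: a match means the same unordered allele pair at every locus. A sketch (the two-locus fragment below borrows alleles from the deduced profile quoted earlier, but the database and names are hypothetical):

```python
def profiles_match(evidence_profile, reference_profile):
    """Exact comparison of two STR profiles: the same unordered allele pair
    at every locus typed in the evidence profile."""
    return all(sorted(evidence_profile[locus]) == sorted(reference_profile[locus])
               for locus in evidence_profile)

# Hypothetical two-locus fragment, not the full 13-locus case profile:
evidence = {"D3S1358": (16, 19), "FGA": (18.2, 22)}
database = {"Sandy Williams": {"D3S1358": (19, 16), "FGA": (22, 18.2)}}

hits = [name for name, profile in database.items()
        if profiles_match(evidence, profile)]
print(hits)
```

The interpretive work, in other words, lies in arriving at the profiles, not in comparing them; the comparison is the trivial final step.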

Police put Williams in a line-up on April 17, 2001, and LJ identified him as the man who sexually assaulted her over a year earlier.

The ISP Analyst Testifies — and the Cellmark Analyst Does Not

At trial, the state did not call anyone from Cellmark — the laboratory that did the DNA profiling of the vaginal swab. It relied instead on Ms. Lambatos to talk about the results. Lambatos verified that the 13 pairs of numbers that constituted the ISP’s profile of Williams’ reference blood sample were the same as the 13 pairs that constituted the male fraction reported by Cellmark. This established that the database profile with Williams’ name associated with it was, indeed, Sandy Williams’ profile.

At some point before the trial, Lambatos looked at one of the electropherograms from Cellmark — the one following the differential extraction for the mixture in the vaginal sample. She did not take the Cellmark electropherograms from the victim and from the vaginal swab mixture and compare them herself to deduce the rapist profile. Instead, she accepted Cellmark’s report of the victim’s profile as a given and looked only at the mixture electropherogram to infer the rapist profile.

In short, from what I can glean from the testimony, Cellmark's deduction of the male profile is a nontrivial, human inference (although it also could be done with software). But it also looks as if Lambatos made the same deduction using (1) Cellmark's electropherogram of the mixture, and (2) the alleles reported by Cellmark for the reference sample of the victim's blood. With regard to the alleles in (2), getting from the electropherogram for the victim's reference sample of blood to her STR profile could have been quite simple, but there could have been a bit of interpretation there too. In any event, this is not a case of an interpretive analyst starting with two electropherograms of single-source profiles — Lambatos' testimony about forming an "independent opinion" from the Cellmark "data" notwithstanding.

Excerpts from the trial testimony of Sandra Lambatos
identifying Sandy Williams as the rapist

Direct examination [JA 42]

Q  What is your current occupation?
A  I am a stay-at-home mom. [JA 43]
Q  Where did you work before that?
A  At the Illinois state police crime laboratory.
. . . [JA 51]
Q  [O]n the date of November 28th of 2000, was evidence from this case sent to (Cellmark) diagnostic laboratory . . . ?
. . . [JA 52]
Q  What was the evidence that was sent?
A  Vaginal swab and a blood standard from [LJ].
. . . [JA 54]
Q  And does this [shipping manifest] indicate the date that the evidence . . . was sent back . . . from (Cellmark) . . . ?
A  It does.
Q  And what is the date . . . ?
A  April 3d of 2001.
. . . [JA 56]
Q  Was there a computer match generated of the male DNA profile found in semen from the vaginal swabs of [LJ] to a male DNA profile that had been identified as having originated from Sandy Williams?
A  Yes, there was.
Q  Did you compare the semen . . . from the vaginal swabs . . . to the male DNA profile that had been identified by Karen Kooi from the blood of Sandy Williams? [JA 57 ]
A  Yes, I did.
. . .
Q  What was your conclusion?
A  I concluded that Sandy Williams cannot be excluded as a possible source of the semen identified in the vaginal swabs.
Q  In other words, is the semen identified in the vaginal swabs of [LJ] consistent with having originated from Sandy Williams?
A  Yes.
Q  What is probability of this profile occurring in the general population?
. . .
A  This profile would be expected to occur in approximately 1 in 8.7 quadrillion black, 1 in 390 quadrillion white, or 1 in 109 quadrillion Hispanic unrelated individuals.
. . . [JA 58]
Q  In your expert opinion, can you call this a match to Sandy Williams?
A  Yes.

Cross examination [JA 61]

Q  And that report [from Cellmark dated Feb. 15, 2001] included an allele chart, correct?
A  Yes.
. . .
Q  And that included data that you used to run your data bank search.
A  Correct. [JA 62]
Q  You did not interpret the results by [Cellmark], did you?
A  Partially. I did review their data, and I did make my own interpretations. So I looked at what the programs, what they sent to me, and did make my own interpretation, my own opinion.
Q  That would be the vaginal swab with respect to the electropherogram E2, right?
. . .
A  Yes.
Q  You did not receive electropherograms for the E1?
A  I believe all I have in my case file is E2, correct.
Q  And you did not receive electropherograms from the standard of [LJ], did you?
A  No, I did not.
. . . [JA 68]
Q  And you reviewed the electropherograms just for that second fraction from the differential extraction [procedure], correct?
A  Correct.
Q  You did not receive the electropherograms for the first part of the procedure, that first part of the extraction, did you?
A  Correct.
. . . [JA 69]
A  [T]hey sent the chart that was in the F1 fraction E1. Also the profile that was in the E2 fraction and the profile that was in [LJ]'s standard, and I had only the electropherograms from the E2 fraction . . . .
Q  But you did not receive their data or their electropherograms?
A.  No, I did not receive electropherograms for those fractions.
Q  You never received any computer data, the electronic data.
A   I myself did not receive that, but that was sent to the laboratory.
Q  You never viewed that?
A  Oh no, I did not.

First posted to The Double Helix Blog, 13 Dec. 2011

Sunday, November 27, 2011

Legislation to Implement Kinship Matching in Criminal DNA Databases

A few years ago, “Jeffrey Rosen, a constitutional law professor at George Washington University, warned: ‘I can guarantee if familial searching proceeds, it will create a political firestorm.’” (1) But familial searching, as it is tendentiously called, prompted no huge political protests when California, Colorado, New York, and Virginia adopted it administratively. Now, legislative initiatives to implement it in various states (2, 3) and federally (4) have begun. However, the proposed legislation is timid, usually authorizing the practice only in murder and sexual assault cases and only after traditional investigative methods have failed.


1. Maura Dolan & Jason Felch, California Takes Lead on DNA Crime-fighting Technique: The State Will Search its Database for Relatives of Unidentified Suspects in Hopes of Developing Leads, Los Angeles Times, Apr. 26, 2008

2. Mike Cook, DNA — It’s All in the Family, Minnesota House of Representatives Session Weekly: News from the House, Apr. 8, 2011

3. Mark Scolforo, DNA Proposal Has Foes: Pa. Bill to Expand its Collection Opposed by the ACLU, Phil. Inquirer, Oct. 2, 2011

4. Press Release, Schiff's Familial DNA Language Passes as Part of Conference Report, Nov. 21, 2011

Cross-posted from The Double Helix Law Blog

Tuesday, October 18, 2011

New Mexico Supreme Court Proposes Rules on Lab Reports

The New Mexico Supreme Court is soliciting comments on "three representative proposals that have been suggested . . . to address the admission of state laboratory forensic analyses in light of Bullcoming." [1] The state court is referring to the U.S. Supreme Court's somewhat unenlightening opinion in Bullcoming v. New Mexico, 131 S. Ct. 2705 (2011), discussed here some months ago. Inasmuch as the rules are a response to a Confrontation Clause decision that applies only in criminal cases, I shall assume that these rules also would apply solely in criminal cases.

The first proposal covers all "forensic scientific evidence including blood and breath alcohol test reports, controlled substance chemical analysis reports." It orders the parties to confer about stipulating to a waiver of the defendant's right to have a laboratory analyst produced for cross-examination on the laboratory's findings and of the state's right to present a live witness along with the reports. [2]

Of course, the parties can initiate such discussions now, and they need not stipulate to anything anyway. The proposed rule states that "The report or print-out ... shall not be admitted at trial without the testimony of necessary witnesses unless the defendant stipulates in writing," but that adds nothing of substance to the status quo (or to the rest of the rule).

But what happens if one or more of the parties do not even want to talk about a stipulation? The rule says that "[i]f either party cannot obtain the opposition’s position regarding a proposed stipulated order, that party may file a motion requesting a hearing to determine the opposition’s position regarding the need for testimony ... ." Maybe the judge can induce the parties to take an irrevocable position well before the trial, as the rule seems to contemplate. This might help the lab schedule its staff time, but is all this judicial rule-making worth the effort to achieve this convenience?

Alternative Rule 2 abolishes the hearsay rule as applied to "a written report of the conduct and results of a chemical analysis of breath or blood for determining blood alcohol concentration." Of course, many jurisdictions have statutes to this effect, and others apply the business records exception to reach the same result. The (new?) hearsay exception makes no difference in New Mexico criminal cases as long as the U.S. Supreme Court adheres to the interpretation of the Confrontation Clause articulated in Crawford v. Washington, 541 U.S. 36 (2004).

Alternative 3 requires the prosecution to serve on the defendant "[a] copy of a report of the methods and findings of any examination conducted by an employee of any governmental laboratory ... no later than ninety (90) days before trial" and to give notice to the defense at the same time if it intends to introduce the report into evidence. If New Mexico prosecutors do not already provide timely disclosure of reports or if this provision requires that more complete reports be prepared than is currently the practice, it would be a significant improvement.

Yet, the rule also imposes a burden on the defense to object in writing before trial. In this regard, it reads as follows: "If the defendant does not file a written objection with the court to the use of the laboratory report and certificate within the time allowed by this subparagraph, then the report and certificate are admissible in evidence."

The thinking seems to be that the defense ordinarily should not have to object before trial. However, if the prosecution affirmatively notifies defense counsel that it does not plan to present a necessary witness, then the defense should be forced to make a pretrial demand for confrontation (or lose that right). The rule would require the defendant to give notice at least 30 days before trial. [3] A dictum in Melendez-Diaz v. Massachusetts, 129 S. Ct. 2527 (2009), approves of such notice-and-demand rules. [4]

Interestingly, none of the proposed rules go so far as to place the burden on the defendant to give notice in all cases in which it learns that a laboratory report exists. The state rules committee's commentary to the first alternative rule states: "This rule applies in lieu of a notice and demand rule, which the committee rejected, and is meant to ensure that the waiver is not made by accident or lack of knowledge. The defendant may waive this right by stipulated order, but the waiver shall be made willingly, knowingly, and intelligently."


1. Myrna Raeder initiated a discussion of these rules on Roger Park's discussion list for law professors. The rules are available at,_2,_3.pdf

2. The proposed rule states that "[t]he parties shall confer and either party may file a stipulated order to admit a report or print-out of results ... or to limit the witnesses required to appear at trial."

3. This proposed rule includes the statement that "[i]f the defendant does not file a written objection with the court to the use of the laboratory report and certificate within the time allowed by this subparagraph, then the report and certificate are admissible in evidence ... ."

4. See;

Thursday, October 6, 2011

DNA Identification Technology: Fast and Furious

Today’s talks at the International Symposium on Human Identification indicated some directions in which DNA-based identification technology will move in the near future. For example, one company reported a way to type 26 different STRs simultaneously. Is that enough to justify testimony of global individualization (with the exception of identical twins)?
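A back-of-the-envelope product-rule calculation suggests why more loci might be thought to support such testimony. The per-locus figure below is a purely hypothetical round number, not a value from any allele-frequency study:

```python
# Sketch: combined random-match probability (RMP) under the product rule.
# The 0.05 average per-locus match probability is HYPOTHETICAL, chosen only
# to illustrate how fast the combined RMP shrinks as loci are added.

def combined_rmp(per_locus_prob: float, n_loci: int) -> float:
    """Product-rule combined RMP for n independent loci."""
    return per_locus_prob ** n_loci

print(combined_rmp(0.05, 13))  # the 13 CODIS core loci of the era
print(combined_rmp(0.05, 26))  # 26 STRs typed simultaneously
```

Whether even a vanishingly small combined RMP licenses a claim of "global individualization" is, of course, the contested inferential question, not a matter of arithmetic alone.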

The Departments of Defense, Homeland Security, and Justice are seeking self-contained devices for rapid STR profiling and interpretation, and several companies claim to be on the verge of delivering them. “Rapid” means an hour or so, and the hope is that these microfluidic devices will permit on-the-spot (or at-the-police-station) results for investigations as well as DNA database queries and entries. One company promises a functioning product in April 2012. Another refers to an existing instrument “compact enough to be used in an office setting, airport security area, mobile van, or field-forward military site.”

None of these has been fully validated. The FBI is figuring on widespread implementation at local police stations in 4-7 years, but police in Palm Bay, Florida, have posted videos on YouTube to advertise their success with a microfluidic device in “Operation Rapid Hit.”

Finally, companies are supplying police with phenotype and ancestry data, including probable eye and hair color. For the future, the most impressive -- and disquieting -- approach uses “next-generation sequencing” to extract all the usual STRs, together with phenotypically and medically informative data in one fell swoop.

Indeed, sequencing the oral bacteria that we host is possible. A speaker described one individual whose microbiome included a bacterium used in the industrial production of yogurt and cheese. Just imagine the APB: “The suspect is a white male with brown hair (probability = 0.45) and blue eyes (probability = 0.95) who likes yogurt.”

Tuesday, October 4, 2011

An Odd Set of Odds in Kinship Matching with DNA Databases

The 22d International Symposium on the Future of Human Identification began yesterday with a set of workshops. One was on "familial searching." The phrase refers to trawling the profiles in a DNA database for certain types of partial matches to a DNA profile from a crime-scene sample.

Partial matches that are useful in generating investigative leads to family members arise much more often when a particular kind of relative (say, a full sibling) is the source of the crime-scene sample than when an individual who is not closely related to the database inhabitant is the source. The ratio of the probability of the partial match under the former condition (a given genetic relationship) to the latter (unrelated individuals) is a likelihood ratio (LR). The LR (or, technically, its logarithm) for siblingship expresses the weight of the evidence in favor of the hypothesis that the source is a full sibling as opposed to an unrelated individual.
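A minimal sketch of such a sibling LR at a single locus, using the standard identity-by-descent probabilities for full siblings, (k0, k1, k2) = (1/4, 1/2, 1/4). The allele frequencies are hypothetical, and real kinship software aggregates LRs across many loci:

```python
# Per-locus LR for "full sibling" vs. "unrelated," for a proband who is
# heterozygous A/B with allele frequencies p and q (illustrative values).

def sibling_lr_het(p: float, q: float) -> float:
    """LR that a matching heterozygote A/B is a full sibling rather than unrelated."""
    # P(sib shares genotype) = k2*1 + k1*(p+q)/2 + k0*2pq = (1 + p + q + 2pq)/4
    prob_sib = 0.25 + 0.5 * (p + q) / 2 + 0.25 * (2 * p * q)
    prob_unrelated = 2 * p * q   # Hardy-Weinberg genotype frequency
    return prob_sib / prob_unrelated

# Rarer shared alleles give stronger evidence of siblingship:
print(sibling_lr_het(0.10, 0.10))
print(sibling_lr_het(0.01, 0.01))
```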

After explaining this idea, the first speaker presented the following formula:
"Odds" = LR_autosomal x LR_Y-STR x 1/N         (1)
She attributed this formula to the California state DNA laboratory that does familial searching in that state. In this equation, N is the size of the database, LR_autosomal is the likelihood ratio for the partial match at a set of autosomal STR loci, and LR_Y-STR is the likelihood ratio for the matching Y-STR haplotype.

She described this as a Bayesian computation that could lead to statements in court such as "there is a 98% probability" that the person whose DNA was found at the crime scene is a brother of Joe Smith, a convicted offender whose DNA profile is in a DNA database.

There are three interesting things to note about these suggestions. To begin with, it is not clear why such a statement would be introduced at trial. By the time the suspect has become a defendant, a new sample of his DNA should have been tested to establish a full match to the crime-scene sample. At that point, why would the judge or jury care whether the defendant is related to a database inhabitant? The relevance of the DNA evidence lies in the full match to the crime-scene sample, and the jury need not consider whether the defendant is a relative of someone not involved in the alleged crime. (One might ask whether the trawl through the database somehow degrades the probative value of the full match, but, if anything, it increases it. [1])

The issue could arise, however, if police were to seek a court order or search warrant to collect a DNA sample from the suspect. At that point, they would need to describe the significance of the partial match to the convicted offender.

This possibility brings us to the second noteworthy point about equation (1). The "odds" (or the corresponding probability) are not the way to present the weight of the partial match. Consider the prior probability of a match in a small database, say, of size N=2. Prior to considering the partial match, why would one think that the probability of a database inhabitant being the sibling of the criminal who resides outside the database is 1/N = 1/2? It is quite improbable that a database of two people includes a relative of every criminal who leaves DNA at a crime scene. The a priori probability for a small database must be closer to 0 than to 1/N.

That the prior probability is less than 1/N is a general result. The only exception occurs when it is absolutely certain that a sibling of the perpetrator is in the database. On that assumption, prior odds of 1 to N-1 are not unreasonable. But that assumption is entirely artificial, and to advise a magistrate that the posterior odds have the value computed according to (1) would be to overstate the implications of the partial match.
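The point can be made concrete with a small calculation. Here pi denotes the probability that the database actually contains a sibling of the source (an illustrative number, as are the LR and database size); equation (1) corresponds to the artificial assumption pi = 1:

```python
# Sketch: posterior odds that a given database inhabitant is a full sibling
# of the crime-scene source, under different priors. All inputs are
# illustrative, not values from any actual search.

def posterior_odds(lr: float, n: int, pi: float) -> float:
    """Posterior odds = LR x prior odds, with prior probability pi/n that
    this particular inhabitant is the sibling. Setting pi = 1 reproduces
    the 1-to-(N-1) prior odds implicit in equation (1)."""
    prior = pi / n
    return lr * prior / (1 - prior)

lr = 1e6        # combined autosomal x Y-STR likelihood ratio (illustrative)
n = 1_000_000   # database size (illustrative)
print(posterior_odds(lr, n, pi=1.0))  # sibling certainly in the database
print(posterior_odds(lr, n, pi=0.1))  # sibling in the database 10% of the time
```

The second figure is an order of magnitude smaller than the first, which is just another way of saying that equation (1) overstates the posterior odds whenever the database is not certain to contain a relative.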

The third thing to note about dividing by N is that it accomplishes nothing in producing a viable list of partially matching profiles in a DNA database trawl. The straightforward approach is to produce a short list of candidates in the database whose first-degree relatives might be the source of the crime-scene sample. The minimum value of LR_autosomal x LR_Y-STR should be large enough to keep the two conditional error probabilities (including a candidate when there is no relationship, and not including a candidate when there is a relationship) small. This threshold value does not depend on N. (A later speaker made this observation.)

Equation (1), it seems, is useless. Instead, the magistrate should be told the value of the LR and how often such large LRs would occur when a crime-scene sample comes from a relative versus how often they would occur when it comes from an unrelated person.


1. David H. Kaye, 2009, Rounding Up the Usual Suspects: A Legal and Logical Analysis of DNA Database Trawls, North Carolina Law Review, 87(2), 425-503.

Friday, September 30, 2011

Prometheus Unbound: Releasing the New Edition of the FJC Reference Manual on Scientific Evidence

Two days ago, the National Academy of Sciences released a third edition of the Federal Judicial Center’s Reference Manual on Scientific Evidence. I listened to the webcast of the unveiling as an insider and an outsider. An insider in that I co-authored two of the chapters. An outsider in that until the Manual eventually emerged, I seemed poised on the exterior side of the event horizon of a singularity into which drafts disappeared and time slowed.

Wednesday's unveiling included remarks from the two co-chairs of the NAS committee assembled to commission and supervise the writing of the manual—a group of five judges and five science professionals (a physician, a toxicologist, an engineer, a statistician, and an epidemiologist). Judge Gladys Kessler explained that in 1993, Daubert v. Merrell Dow Pharmaceuticals established “the gatekeeping role” of federal judges.

Certainly, the majority opinion, penned by Justice Blackmun in Daubert, is famous for this metaphor, but if federal judges were not gatekeepers before 1993, what were they? Surely not sheep. The Daubert opinion draws heavily on prior case law regarding the Federal Rules of Evidence, substituting for the previously dominant “austere” requirement of general acceptance in the scientific community a multifaceted inquiry into “evidential reliability.” Under either legal standard, however, judges are gatekeepers.

Although Judge Kessler correctly suggested that Daubert’s reliability standard (borrowed from earlier court of appeals cases) goes beyond mere relevancy, the same can be said for the general-acceptance standard (announced in a court of appeals case in 1923) that it displaced. Thus, the notion that judges were not gatekeepers for scientific evidence until 1993 always has struck me as odd. (For more on this legal history and the meaning of Daubert, see Kaye et al. (2011).)

Dr. Jerome Kassirer fielded a number of questions from the virtual and physical audiences. One came from a forensic scientist or analyst in Florida who wanted to know if there were any “practicing forensic scientists” on the editorial committee. The response, that the Manual relied on the very detailed 2009 NRC report on forensic science in its treatment of the forensic sciences, missed the subtext of the question. The practicing forensic science community has been bashing the 2009 report for not accurately depicting the knowledge base of forensic identification techniques (other than DNA evidence). The criticism often takes the form of complaints that the committee lacked enough forensic scientists. (For a rejoinder from the co-chair of that committee, see Edwards (2010).)

Another question was why there was no chapter on digital forensics. The answer referred to the failure of the designated author to produce a manuscript that the committee thought would be useful or intelligible to judges. This probably was not the only chapter to fall by the wayside. Indeed, the problems encountered with such chapters may be part of a more complete answer than the one given to the interlocutor who asked whether an 11-year gap between editions was not a bit much.

A final question to which I alerted included a little speech about the importance of Bayesian inference. The questioner wanted to know why the Manual did not mention Bayes’ rule. Evidently, the questioner was not updating his prior beliefs with any data, for the chapters on statistics, DNA evidence, and medicine have substantial discussions of Bayes' rule. A better question would have been why there is not more discussion in the epidemiology chapter or why the presentation in the medicine chapter is so garbled. But that is a subject for another day.


Harry T. Edwards, 2010, The National Academy of Sciences Report on Forensic Sciences: What it Means for the Bench and Bar, Presentation at the Superior Court of the District of Columbia Conference on The Role of the Court in an Age of Developing Science & Technology, Washington, D.C., May 6, available at NAS Report on Forensic Science.pdf.

David H. Kaye et al., 2011, The New Wigmore, A Treatise on Evidence: Expert Evidence, 2d ed., New York, NY: Aspen Pub.

National Research Council Committee on the Development of the Third Edition of the Reference Manual on Scientific Evidence, ed., 2011, Reference Manual on Scientific Evidence, Washington, DC: National Academies Press.

Friday, September 23, 2011

"The first experimental study exploring DNA interpretation"

A recent study, entitled “Subjectivity and Bias in Forensic DNA Mixture Interpretation,” proudly presents itself as “the first experimental study exploring DNA interpretation.” The researchers are to be commended for seeking to examine, on at least a limited basis, the variations in the judgments of analysts about a complex mixture of DNA.

In addition to documenting such variation, they suggest that their experiment shows that motivational or contextual bias caused analysts in an unnamed case in Georgia to include one suspect in a rape case as a possible contributor to the DNA mixture. This claim merits scrutiny. The experiment is not properly designed to investigate causation, and the investigators' causal inference lacks the foundation of a controlled experiment. To put it unkindly, if an experiment is an intervention that at least attempts to control for potentially confounding variables so as to permit a secure inference of causation, then this is no experiment.

In the study, Itiel Dror, a cognitive neuroscientist and Honorary Research Associate at University College London, and Greg Hampikian, a Professor of Biology and Criminal Justice at Boise State University, presented electropherograms to 17 “expert DNA analysts ... in an accredited government laboratory in North America.” The electropherograms came from a complex mixture of DNA from at least four or five people recovered in a gang rape in Georgia. The article does not state how many analysts worked on the case, whether they worked together or separately, the exact information that they received, or whether they peeked at the suspects’ profiles before determining the alleles that were present in the mixture. The authors imply that the Georgia analysts were told that unless they could corroborate the accusations, no prosecution could succeed. Reasonably enough, Dror and Hampikian postulate that such information could bias an individual performing a highly subjective task.

In the actual case, one man pled guilty and accused three others of participating. The three men denied the accusation. The Georgia laboratory found that one of the three could not be excluded. Contrary to the expectations or desires of the police, the analysts either excluded the other two suspects or were unable to reach a conclusion as to them.

The 17 independent analysts shown the electropherograms from the case split on the interpretation of the complex mixture data. The study does not state the analysts’ conclusions for suspects 1 and 2. Presumably, they were consistent with one another and with the Georgia laboratory’s findings. With regard to suspect 3, however, “One examiner concluded that the suspect ‘cannot be excluded’, 4 examiners concluded ‘inconclusive’, and 12 examiners concluded ‘exclude.’”

From these outcomes, the researchers draw two main conclusions. The first is that “even using the ‘gold standard’ DNA, different examiners reach conflicting conclusions based on identical evidentiary data.”

That complex mixture analysis is unreliable (in the technical sense of being subject to considerable inter-examiner variation) is not news to forensic scientists and lawyers. Although the article implies that the NRC report on forensic science presents all DNA analysis as highly objective, the report refers to “interpretational ambiguities,” “the chance of misinterpretation,” and “inexperience in interpreting mixtures” as potential problems (NRC Report 2009, 132). The Federal Judicial Center’s Reference Manual on Scientific Evidence (Kaye & Sensabaugh 2011) explains that “A good deal of judgment can go into the determination of which peaks are real, which are artifacts, which are ‘masked,’ and which are absent for some other reason.” In The Double Helix and the Law of Evidence (2010, 208), I wrote that “As concurrently conducted ... , most mixture analyses involving partial or ambiguous profiles entail considerable subjectivity.” In 2003, Bill Thompson and his colleagues emphasized the risk of misinterpretation in an article for the defense bar.

These concerns about ambiguity and subjectivity have not escaped the attention of the courts. Supreme Court Justice Samuel Alito, quoting a law review article and a book for litigators, wrote that
[F]orensic samples often constitute a mixture of multiple persons, such that it is not clear whose profile is whose, or even how many profiles are in the sample at all. All of these factors make DNA testing in the forensic context far more subjective than simply reporting test results … .
and that
STR analyses are plagued by issues of suboptimal samples, equipment malfunctions and human error, just as any other type of forensic DNA test.
District Attorney’s Office for Third Judicial Dist. v. Osborne, 557 U.S. __ (2009) (Alito, J., concurring). Dror and Hampikian even quote DNA expert Peter Gill as saying that “If you show 10 colleagues a mixture, you will probably end up with 10 different answers.” Learning that 17 examiners were unanimous as to the presence of two profiles in a complex mixture and that they disagreed as to a third supports the widespread recognition that complex mixtures are open to interpretation, and it adds some more information about just how frequently analysts might differ in evaluating one set of electropherograms.

The second conclusion that the authors draw is that in the Georgia case “the extraneous context appears to have influenced the interpretation of the DNA mixture.” This conclusion may well be true; however, it is all but impossible to draw on the basis of this “first experimental study exploring DNA interpretation.” As noted at the outset, the “experimental study” has no treatment group. The study resembles collaborative exercises in DNA interpretation that have been done over the years. A true experiment—or at least a controlled one—would have included some analysts exposed to potentially biasing extraneous information. Their decisions could have been compared to those of the unexposed analysts.

Instead of controlling for confounding variables, the researchers compare the outcomes in their survey of analysts’ performance on an abstract exercise to the outcomes for one or two analysts in the original case. This approach does not permit them to exclude even the most obvious rival hypotheses. Perhaps it was not information about the police theory of the case and the prosecution's needs, but a difference in the labs' protocols that caused the difference. Perhaps the examiners outside of Georgia, who knew they were being studied, were more cautious in their judgments. Or, perhaps the police pressure, desires, or expectations really did have the hypothesized effect in Georgia. The study cannot distinguish among these and other possibilities.

In addition, the difference in outcomes between the Georgia group and the subjects in the study seems to be within the range of unbiased inter-examiner variability. How can one conclude that the Georgia analysts would not have included suspect 3 if they had not received the extraneous information and had followed the same protocol as the other 17? If the variability due to ordinary subjectivity in the process is such that 1 time out of 17 an analyst will include the reference profile in question, then the probability that a Georgia analyst would do so is 0.059. I am not a firm believer in hypothesis testing at the 0.05 level, but I cannot help thinking that even under the hypothesis that the Georgia group was not affected to the slightest degree by the extraneous information, the chance that the result would have been the same is not negligible.
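A rough calculation illustrates the point, taking the 1-in-17 inclusion rate at face value and assuming, purely for illustration (the article does not say), that two Georgia analysts evaluated the mixture independently:

```python
# Sketch: how surprising is the Georgia inclusion under a "no bias" model in
# which ordinary inter-examiner variability alone produces an inclusion with
# probability 1/17 (one inclusion among the 17 study analysts)?

p_include = 1 / 17
print(round(p_include, 3))  # chance a single unbiased analyst would include

def p_at_least_one_inclusion(p: float, k: int) -> float:
    """Chance that at least one of k independent unbiased analysts includes."""
    return 1 - (1 - p) ** k

# With two analysts (a hypothetical number), the chance rises further:
print(round(p_at_least_one_inclusion(p_include, 2), 3))
```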

In raising these concerns, I certainly am not claiming that an expectation or motivation effect arising from information about the nature of the crime and the need for incriminating evidence played no role in the Georgia case. But the research reported in the paper does not go very far to establish that it was a significant factor and that it was the cause of the disparity between the Georgia analysts and the 17 others.

The authors’ caveat that “it is always hard to draw scientific conclusions when dealing with methodologies involving real casework” is not responsive to these criticisms. The problem lies with conclusions that outstrip the data when it would have been straightforward to collect better data. The sample size here does not give a reliable estimate of variability in judgments of examiners working in the same laboratory. The sampling plan ignores the possibility of greater variability across laboratories. Perhaps the confined scope of the study reflects a lack of funding or an unwillingness of forensic analysts to cooperate in research because of the pressure of their caseloads or for other reasons -- a complaint aired in Mnookin et al. (2011). Inasmuch as the researchers do not explain how they chose their small sample, it is hard to know.

Beyond the ubiquitous issue of sample size, the subjects in the "experiment" were not assigned to treatment and control groups. No analysts were given the same extraneous information that the Georgia ones had. Of course, extraneous information presented in an experiment could be less influential than it would be in practice. External validity is always a problem with experiments. But there is reason to believe that a controlled experiment could detect an effect in a case like this. Bill Thompson (2009) reported anecdotes suggesting that even outside of actual casework, different DNA examiners presented with electropherograms of mixtures can be induced to reach different conclusions when given extraneous and unnecessary information about the case. That the effect might be less in simulated conditions does not mean that it is undetectable in a controlled experiment.

Convincing scientific knowledge flows from a combination of well designed experimental and observational studies. Dr. Dror's work on fingerprint comparisons (e.g., Dror 2006) has contributed to a better understanding of the effect of examiner expectations in that task. Experiments designed to detect the impact of potentially biasing information on interpretation of DNA profiles in a controlled setting also would be worth undertaking.


Itiel E. Dror & Greg Hampikian (2011), Subjectivity and Bias in Forensic DNA Mixture Interpretation, Sci. & Justice, doi:10.1016/j.scijus.2011.08.004,

Itiel E. Dror, David Charlton & Ailsa E. Peron (2006), Contextual Information Renders Experts Vulnerable to Making Erroneous Identifications, Forensic Science International, 156(1): 74-78

David H. Kaye (2010), The Double Helix and the Law of Evidence

David H. Kaye & George Sensabaugh (2011), Reference Guide on DNA Identification Evidence, in Reference Manual on Scientific Evidence, 3d ed.

Jennifer L. Mnookin et al. (2011), The Need for a Research Culture in the Forensic Sciences, UCLA Law Review, 58(3): 725-779

National Research Council Committee on Identifying the Needs of the Forensic Sciences Community (2009), Strengthening Forensic Science in the United States: A Path Forward, Wash DC: National Academy Press

William C. Thompson, Simon Ford, Travis Doom, Michael Raymer & Dan E. Krane (2003), Evaluating Forensic DNA Evidence: Essential Elements of a Competent Defense Review, The Champion, Apr. 2003, at 16

William C. Thompson (2009), Painting the Target Around the Matching Profile: the Texas Sharpshooter Fallacy in Forensic DNA Interpretation, Law, Probability & Risk 8: 257-276

Cross-posted at the Double Helix Law blog

Monday, August 29, 2011

The Expected Value Fallacy in State v. Wright

In State v. Wright, 253 P.3d 838 (Mont. 2011), a woman (identified by the Montana Supreme Court as Sierra) complained that her date raped her. No semen was recovered, but a penile swab from Timothy Wright, the man she accused, showed a mixture of DNA. Predictably, the major profile seemed to be the defendant's (presumably coming from his skin cells). The minor profile, however, matched Sierra's, corroborating her accusation. The random-match probability was "1 in 467,700 Caucasians" and smaller for other groups. Id. at 841. The direct examination of the state's DNA analyst included the following colloquy:
Q. When you're determining whether or not [Sierra's] DNA is on that penis, tell me what the language "cannot be excluded" means?
A. So that means that the 16 locations we looked at for a DNA profile was at every of those 16 locations.
Q. So whose DNA is on that penis, that penile swab that you examined at the Lab?
A. Well, it--Timothy Wright and [Sierra] can't be excluded as contributing to that profile.
Q. If you--if you're finding her DNA, how come your conclusion isn't that she's included in the profile? That confuses me.
A. At the Forensic Science Division we don't use the word "included." Instead we use "cannot be excluded." It basically means the same thing. It's just our terminology we use.
Id. The prosecutor pressed on, eliciting the statement that the woman's DNA was present in the penile swab:
Q. Can you explain--let's focus on the Caucasian statistic. Can you explain that statistic to the jury? What's it really mean?
A. So that means that in a population of 467,000 you would expect that one person in that population could be included in this mixture.
Q. All right. How many--what's the population of the state of Montana, do you know?
A. It's approximately a million, just under.
Q. So in this particular scenario we've got a mixture of two DNA's, right?
A. Yes.
Q. Statistically speaking, then, I'm just--I want to make sure I understand you, is there only--are there only two people in the state of Montana that can contribute those particular profiles?
A. Yes. Statistically looking at the state of--or the population of Montana two people in Montana would contribute to this mixture.
Q. Those being whom [sic] according to your test results?
A. According to the test results Timothy Wright and [Sierra].
Id. at 841-42.

The prosecution argued in closing that the woman was included as the contributor and that this meant her DNA was on the defendant. On appeal, the defense contended that the source attribution was a knowingly false statement by the prosecutor.

The Montana Supreme Court concluded that this contention was unfounded, but it also described the analyst's source attribution as "internally inconsistent" with the statements that the defendant "cannot be excluded as a contributor." However, there is no logical inconsistency in this testimony. A test that does not exclude someone includes that person. It may include other people when they are tested--or it may not. If the random match probability is small enough, then the test would be expected to exclude all unrelated people, leaving the defendant (or a twin brother or perhaps a close relative) as the only possible source of the crime-scene DNA.

The actual problem in the case was not that the prosecutor knowingly misrepresented the testimony or deliberately elicited false testimony. It was that the analyst erred when she stated that "two people in Montana would contribute to this mixture . . . Timothy Wright and [Sierra]" solely on the basis of "the test results." The expert apparently reasoned that (1) Montana's population is about 1,000,000; (2) about 500,000 would be women; (3) the exact number of women in Montana with the minor profile is 1 (the random-match probability 1/500,000 times the female population of 500,000); hence, it is practically certain that no one but Sierra contributed the minor profile.

Technically, the quantity 1 is an expected value of a variable X that represents the number of women with the minor profile in a randomly generated population of 500,000 women. Expected values are all around us. If we flip a fair coin twice, the expected number of heads is (1/2) x 2 = 1. But we know quite well that the actual outcomes can vary about the expected value. Thus, the probability that two flips of the coin will produce 2 heads is 1/4. Over many coin flips, this "unexpected value" is expected to occur about 25% of the time. Betting on exactly one head in this situation would produce many losses.

How risky was the analyst's source attribution in Wright? The number of occurrences of the minor profile in the population is analogous to the number of heads that would occur when flipping a very heavily weighted coin 500,000 times. Imagine that we generate many populations of 500,000 profiles from a coin that has a probability of only 1/500,000 on each toss. Some populations will have 0 heads (minor profiles), some will have exactly 1, some will have 2, and so on. The number of heads, X, is approximately a Poisson random variable with mean λ = (1/500,000) x 500,000 = 1. Its probability distribution is f(x; λ) = λ^x e^(−λ)/x! = e^(−1)/x! ≈ 0.368/x!. For example, f(0; 1) = 0.368/0! = 0.368, and f(1; 1) = 0.368/1! = 0.368.
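
These Poisson probabilities are easy to reproduce. A minimal Python sketch (standard library only):

```python
import math

# Poisson pmf: f(x; lam) = lam**x * exp(-lam) / x!
def poisson_pmf(x, lam=1.0):
    return lam ** x * math.exp(-lam) / math.factorial(x)

# With lam = 1 (the expected number of matching profiles):
p0 = poisson_pmf(0)   # P(X = 0) = e**-1, about 0.368
p1 = poisson_pmf(1)   # P(X = 1) = e**-1, about 0.368
print(round(p0, 3), round(p1, 3))
```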

What is the probability that the analyst is wrong in thinking that there is only 1 woman with the minor profile? Of the many populations we generated with the coin, we can ignore some 36.8% of them--the ones with 0 minor profiles. We can ignore them because the real population has one woman (Sierra)--and possibly more with the minor profile. This leaves 63.2% of the populations to consider.

The analyst will be wrong in asserting X = 1 whenever we have a population in which X is 2 or more. This situation occurs in every population for which X is neither 0 nor 1. There are 100% − 36.8% − 36.8% = 26.4% such populations, and all of them are within the 63.2% that apply to this case. Consequently, looking at the DNA evidence in isolation, we conclude that the analyst errs in 26.4/63.2 = 41.8% of the possible populations in asserting that the minor profile from the swab is Sierra's.
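
The conditional-probability step can be verified numerically. A minimal Python sketch, matching the rounded percentages in the text:

```python
import math

p0 = math.exp(-1)            # P(X = 0), about 36.8%
p1 = math.exp(-1)            # P(X = 1), about 36.8%
p_two_or_more = 1 - p0 - p1  # P(X >= 2), about 26.4%

# Condition on the populations consistent with the case (X >= 1,
# since Sierra herself has the minor profile):
p_error = p_two_or_more / (1 - p0)
print(round(p_error, 3))     # about 0.418
```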

Yet, the jury was informed that the DNA test proved that Sierra's DNA was on the swab. The legal issue on appeal should have been whether this extravagant testimony, to which no objection was raised, was plain error or offended due process. Cf. McDaniel v. Brown, 130 S. Ct. 665 (2010) (not reaching the due process issue). But regardless of how those issues might be resolved, a well-trained DNA analyst should not have testified in this fashion. It is hardly news to the forensic science community that the expected number of DNA profiles in a population must be much less than 1 to strongly support an inference that the profile is unique within that population. See David J. Balding, Weight-of-Evidence for Forensic DNA Profiles 148 (2005) (describing the kind of reasoning employed in Wright as a "uniqueness fallacy"); see also Ian W. Evett & Bruce S. Weir, Interpreting DNA Evidence (1998).

Cross-posted from the Double Helix Law blog. An expanded version is published as D.H. Kaye, The Expected Value Fallacy in State v. Wright, 51 Jurimetrics J. 1-8 (2011).

Sunday, August 28, 2011

A Kiss is Just a Kiss: Lip Print Identification in the Criminal Law Bulletin

A periodical for lawyers, the Criminal Law Bulletin, recently published an article on “The Investigative and Evidential Uses of Cheiloscopy (Lip Prints).” [1] The author argues that “concerns about the reliability of lip prints evidence are unfounded” and “that lip prints evidence is admissible evidence.” The legal analysis rests on misconceptions about American and English law, but I won’t spell these out. Instead, I want to comment on the author’s approach to the validation of a technique for scientific identification.

According to the article, the following facts apparently demonstrate that “[c]heiloscopy has a scientific foundation”:

  • “Human lips are made up of wrinkles and grooves, just like fingerprints and footprints. Grooves are of several types and these groove types and wrinkles form the lip pattern which is believed to be unique to every individual.”
  • A 1970 article in the Journal of the Indian Dental Association reported that “none of the lip prints from the 280 Japanese individuals showed the same pattern.”
  • In a 2000 article in the same journal, “Vahanwala and Parekh studied the lip prints of 100 Indians (50 males and 50 females) and concluded that lip prints are unique to individuals” and that “It can therefore be conclusively said — Yes, lip prints are characteristic of an individual and behold a potential to be recognized as a mark of identification like the fingerprint!”
  • "Saraswathi et al. studied 100 individuals, made up of 50 males and 50 females, aged 18 - 30, and found that ‘no individual had single type of lip print in all the four compartments and no two or more individuals had similar type of lip print pattern.’"
  • “A study by Sharma et al [published, as was the previous study, in the Taiwanese Journal of Dental Sciences] also found that lip prints are unique to every individual.”
  • “Tsuchihashi studied 1364 Japanese individuals, 757 males and 607 females, aged 3 - 60 years and found no identical lip prints. . . . [T]he lips of the twins frequently showed patterns extremely similar to those of their parents [but nonetheless distinguishable].”

These studies do not address the relevant questions. The hypothesis that lip prints are unique is neither a necessary nor a sufficient condition for forensic utility. Before a method of identification can be considered valid, research should establish that multiple impressions from the same person are typically closer to one another than two impressions from different individuals [2] and that they are so much closer that an examiner can accurately classify pairs according to their source. Until these questions are asked and answered, the caution expressed in an article not mentioned in the Criminal Law Bulletin seems apt: “Although lip print identification has been utilized in the court of law in isolated cases, more research needs to be conducted in this field . . . .” [3]


1. Norbert Ebisike, The Investigative and Evidential Uses of Cheiloscopy (Lip Prints), 47 Crim. Law Bull. No. 4, Art. 4 (Summer 2011)

2. David H. Kaye, Questioning a Courtroom Proof of the Uniqueness of Fingerprints, 71 Int’l Stat. Rev. 521 (2003), available at

3. Shilpa Patel, Ish Paul, Madhu Sudan Astekar, Gayathri Ramesh, Sowmya G V, A Study of Lip Prints in Relation to Gender, Family and Blood Group, 1 J. Oral & Maxillofacial Pathology No. 1 (2010), abstract available at

Friday, August 26, 2011

The Transposition Fallacy in Matrixx Initiatives, Inc. v. Siracusano: Part II

The previous posting promised a simple example that would demonstrate the fallacy in claims such as this one:
For a p-value of .09, the odds of observing the AER [adverse event report] is 91 percent divided by 9 percent. Put differently, there are 10-to-1 odds that the adverse effect is “real” (or about a 1 in 10 chance that it is not).
Brief of Amici Curiae Statistics Experts Professors Deirdre N. McCloskey and Stephen T. Ziliak in Support of Respondents, Matrixx Initiatives, Inc. v. Siracusano, 131 S.Ct. 1309 (2011) (No. 09-1156).

Here is one such example. A bag contains 100 coins. One of them is a trick coin with tails on both sides; the other 99 are biased coins that have a 0.3 chance of coming up tails and a 0.7 chance of coming up heads. I pick one of these coins at random and flip it twice, obtaining two tails. On the basis of only this sample data (the two tails), you must decide which type of coin I picked. The p-value with respect to the “null hypothesis” (N) that the coin is a normal (albeit biased) heads-tails one is the probability of seeing two tails in the two tosses: p = 0.3 x 0.3 = 0.09. Should you reject the null hypothesis N and conclude that I flipped the unique tails-tails coin? Are the odds for this alternative hypothesis (A) 10:1, as the brief of the statistical experts asserts?

Of course not. Just consider repeating this game over and over. Ninety-nine percent of the time, you would expect me to pick a heads-tails coin. In 9% of those cases, you expect me to get tails-tails on the two tosses (9% x 99% = 8.91%). The other way to get tails-tails on the tosses is to pick the tails-tails coin. You expect this to happen about 1% of the time. Thus, the odds of the tails-tails coin given the data on the outcome of the tosses are 1% to 8.91% = 1:8.91, which is about 1:9. Despite the allegedly significant (in “practical, human, or economic” terms) p-value of 0.09, the alternative hypothesis remains improbable.
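
The repeated-game intuition can be checked by simulation. Here is a minimal Python sketch; the trial count and seed are arbitrary choices for illustration, not anything from the brief:

```python
import random

# Simulate the bag-of-coins game many times and count the two ways
# of producing tails-tails on both flips.
random.seed(0)
trials = 200_000
trick = 0    # drew the tails-tails coin (it always shows two tails)
normal = 0   # drew a biased heads-tails coin and tossed two tails

for _ in range(trials):
    if random.randrange(100) == 0:                          # 1-in-100 draw of the trick coin
        trick += 1
    elif random.random() < 0.3 and random.random() < 0.3:   # P(tails) = 0.3 per flip
        normal += 1

# The ratio normal/trick hovers around 8.91, i.e., odds of roughly
# 1 : 8.91 for the trick coin given two tails.
print(normal / trick)
```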

A more formal derivation uses Bayes' rule for computing posterior odds. Let tt be the event that the coin I picked produced the two tails when tossed (the data), and let "|" stand for "given that" or "conditioned on." Then Bayes' rule reveals that

Odds(A|tt) = L x Odds(A),

where L is the "likelihood ratio" given by P(tt|A) / P(tt|N) and Odds(A) are the odds prior to flipping the coin. The value of L is 1/.09 = 100/9. Hence,

Odds(A|tt) = (100/9) Odds(A).

Because there is only 1 trick coin and 99 normal coins in the bag, the prior odds of A are Odds(A) = 1:99. Hence, the posterior odds are Odds(A|tt) = (100/9)(1/99) = 100/891 = 1:8.91. In other words, the odds for the alternative hypothesis are only about 1:9 -- practically the opposite of the 10:1 odds quoted in the statistics experts' brief.
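
The Bayes'-rule arithmetic above can be reproduced exactly with rational numbers. A minimal Python sketch:

```python
from fractions import Fraction

# Prior odds of the trick (tails-tails) coin: 1 trick coin to 99 normal coins.
prior_odds = Fraction(1, 99)

# Likelihood ratio L = P(tt | A) / P(tt | N) = 1 / 0.09 = 100/9.
likelihood_ratio = Fraction(1) / (Fraction(3, 10) ** 2)

posterior_odds = likelihood_ratio * prior_odds
print(posterior_odds)                                 # 100/891, about 1 : 8.91
print(float(posterior_odds / (1 + posterior_odds)))   # posterior probability, about 0.101
```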

The lesson of this example is not that a statistic with a p-value of 0.09 always can be safely ignored. It is that the p-value, by itself, cannot be converted into a probability that the alternative hypothesis is true (“that the adverse effect is ‘real’”). Knowing that the two tails arise only 9% of the time when the heads-tails coin is the cause does not imply that 9% is the probability that a heads-tails coin is the cause or that 91% is the probability that the tails-tails coin is the “real” cause. Statisticians have warned against this confusion of a p-value with a posterior probability time and again. The brief of "Amici Curiae Statistics Experts" thus brings to mind the old remark, "With friends like these, who needs enemies?" A more complete review of the brief is available at Nathan Schachtman's website (see Further Readings).

Further Reading

David H. Kaye et al., The New Wigmore, A Treatise on Evidence: Expert Evidence (2d ed. 2011).

Nathan A. Schachtman, The Matrixx Oversold, Apr. 4, 2011,

Friday, August 19, 2011

The Transposition Fallacy in Matrixx Initiatives, Inc. v. Siracusano: Part I

One might expect to hear phrases like “Not statistical significance there!” and “There is no way that anybody would tell you that these ten cases are statistically significant” hurled by a disgruntled professor at an underperforming statistics student. Yet, in January 2011, they came from the Supreme Court bench during the argument in Matrixx Initiatives, Inc. v. Siracusano.[1]

The issue before the Court was “[w]hether a plaintiff can state a claim under § 10(b) of the Securities Exchange Act and SEC Rule 10b-5 based on a pharmaceutical company's nondisclosure of adverse event reports even though the reports are not alleged to be statistically significant.” [2] In the case, the manufacturer of the Zicam nasal spray for colds issued reassuring press releases at a time when it was receiving case reports from physicians of loss of smell (anosmia) in Zicam users. The pharmaceutical company, Matrixx Initiatives, succeeded in getting a securities-fraud class action dismissed on the ground that the plaintiffs failed to plead “statistical significance.”

Because case reports are just a series of anecdotes, it is not immediately obvious how they could be statistically significant, but a determined statistician could compare the number of reports in the relevant time period to the number that would be expected under some model of the world in which Zicam is neither a cause nor a correlate of anosmia. If the observed number departed from the expected number by a large enough amount—one that would occur no more than about 5% of the time when the assumption of no association is true (along with all the other features of the model)—then the observed number would be statistically significant at the 0.05 level.
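
To make the determined statistician's computation concrete, here is a hedged Python sketch. The expected and observed counts are hypothetical numbers chosen for illustration, not figures from the Matrixx record:

```python
import math

# Hypothetical: the no-association model predicts 4 adverse event
# reports in the relevant period, and 10 were actually received.
expected = 4
observed = 10

def poisson_pmf(x, lam):
    return lam ** x * math.exp(-lam) / math.factorial(x)

# One-sided p-value: probability of at least `observed` reports when
# the true mean under the no-association model is `expected`.
p_value = 1 - sum(poisson_pmf(x, expected) for x in range(observed))
print(round(p_value, 4))   # well under 0.05 for these made-up counts
```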

The Court rejected any rule that would require securities-fraud plaintiffs to engage in such statistical modeling or computation before filing a complaint. This result makes sense because a reasonable investor might want to know about case reports that do not cross the line for significance. Such anecdotal evidence could be an impetus for further research, FDA action, or product liability claims—any of which could affect the value of the stock. In rejecting a bright-line rule of p < 0.05, the Court made several peculiar statements about statistical significance and the design of studies, but these are not my subject for today. (An older posting, on March 25, has some comments on this issue.)

Instead, I want to look at a small part of an amicus brief from “statistics experts” filed on behalf of the plaintiffs. There is much in this brief, which really comes from two economists (or perhaps these eclectic scholars should be designated historians or philosophers of economics and statistics), with which I would agree (for whatever my agreement is worth). But I was shocked to find the following text in the “Brief of Amici Curiae Statistics Experts Professors Deirdre N. McCloskey and Stephen T. Ziliak in Support of Respondents”:
The 5 percent significance rule insists on 19 to 1 odds that the measured effect is real.26 There is, however, a practical need to keep wide latitude in the odds of uncovering a real effect, which would therefore eschew any bright-line standard of significance. Suppose that a p-value for a particular test comes in at 9 percent. Should this p-value be considered “insignificant” in practical, human, or economic terms? We respectfully answer, “No.” For a p-value of .09, the odds of observing the AER [adverse event report] is 91 percent divided by 9 percent. Put differently, there are 10-to-1 odds that the adverse effect is “real” (or about a 1 in 10 chance that it is not). Odds of 10-to-1 certainly deserve the attention of responsible parties if the effect in question is a terrible event. Sometimes odds as low as, say, 1.5-to-1 might be relevant.27 For example, in the case of the Space Shuttle Challenger disaster, the odds were thought to be extremely low that its O-rings would fail. Moreover, the Vioxx matter discussed above provides an additional example. There, the p-value in question was roughly 0.2,28 which equates to odds of 4 to 1 that the measured effect — that is, that Vioxx resulted in increased risk of heart-related adverse events — was real. The study in question rejected these odds as insignificant, a decision that was proven to be incorrect.

26. At a 5 percent p-value, the probability that the measured effect is “real” is 95 percent, whereas the probability that it is false is 5 percent. Therefore, 95 / 5 equals 19, meaning that the odds of finding a “real” effect are 19 to 1.

27. Odds of 1.5 to 1 correspond to a p-value of 0.4. That is, the odds of the measured effect being real would be 0.6 / 0.4, or 1.5 to 1.

28. Lisse et al., supra note 14, at 543-44.
Why is this explanation out of whack? The fundamental problem is that, within the framework of classical (Neyman-Pearson) hypothesis testing, hypotheses like “the adverse effect is real” or “a measured effect being real” do not have odds or probabilities attached to them. In Bayesian inference, statements like “the probability that the measured effect is ‘real’ is 95 percent, whereas the probability that it is false is 5 percent” are meaningful, but frequentist p-values play no role in that framework. Equating the p-value with the probability that a null hypothesis is true and regarding the complement of a p-value as the probability that the alternative hypothesis is true (that something is “real”) is known as the transposition fallacy. [3] That two “statistics experts” would rely on this crude reasoning to make an otherwise reasonable point is depressing.
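
A small numerical illustration (my own, not from the brief) shows why the transposition matters. In a two-hypothesis Bayesian model, the posterior probability of the null hypothesis after a "significant" result depends on an assumed prior and an assumed power, neither of which the p-value supplies:

```python
# Illustrative only: prior_null and power are assumptions chosen for
# the example, not quantities derivable from a p-value.
def posterior_null(prior_null, alpha=0.05, power=0.8):
    # P(significant result | null) = alpha; P(significant | real effect) = power
    p_significant = alpha * prior_null + power * (1 - prior_null)
    return alpha * prior_null / p_significant

for prior in (0.5, 0.9):
    print(prior, round(posterior_null(prior), 3))
# Even with a result "significant at p = .05," the null hypothesis
# can remain far more than 5% probable under a skeptical prior.
```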

The preceding paragraph is a little technical. Soon, I shall post a simple example that should make the point more concretely and with less jargon.


1. Transcript of Oral Argument, Matrixx Initiatives, Inc. v. Siracusano, 131 S.Ct. 1309 (2011) (No. 09-1156), 2011 WL 65028, at *12 & *16 (Kagan, J.).

2. Petition for Writ of Certiorari at i, Matrixx Initiatives, Inc. v. Siracusano, 131 S.Ct. 1309 (2011) (No. 09-1156), 2010 WL 1063936.

3. David H. Kaye, David E. Bernstein & Jennifer L. Mnookin, The New Wigmore: A Treatise on Evidence: Expert Evidence (2d ed. 2011).

Monday, July 25, 2011

Third Circuit Upholds Federal Arrestee DNA Database Law

In 2009, the United States District Court for the Western District of Pennsylvania made legal history. For the first time, a federal court held that the government lacked the constitutional power to compel individuals who had been arrested and charged with a crime to provide a DNA sample. Today, the Court of Appeals for the Third Circuit (one of the 12 appellate courts that sit one rung below the Supreme Court in the federal judicial system) reversed this ruling. Yet, both courts applied the "totality of the circumstances" standard for ascertaining the reasonableness of searches and seizures. What, then, accounts for the anticlinal outcomes?

Basically, the two courts took very different views of the individual's Fourth Amendment interests and their role in evaluating the legislation. The following passages from the district court opinion and the opinion of the majority of the court of appeals are illustrative.
  • District Court: [T]he search in this instance is one that reveals the most intimate details of an individual's genetic condition, implicating compelling and fundamental “interests in human dignity and privacy.” See Schmerber v. California, 384 U.S. 757, 770 (1966).
  • Court of Appeals: Schmerber recognized society's judgment that blood tests "do not constitute an unduly extensive imposition on an individual's personal privacy and bodily integrity."
  • District Court: [T]o compare the fingerprinting process and the resulting identification information obtained therefrom with DNA profiling is pure folly. Such oversimplification ignores the complex, comprehensive, inherently private information contained in a DNA sample.
  • Court of Appeals: While we acknowledge the seriousness of Mitchell's concerns about the possible misuse and future use of DNA samples, we conclude that these hypothetical possibilities are unsupported by the record before us and thus do not have any substantial weight in our totality of the circumstances analysis.
The district court was not reassured by the fact that DNA identification profiling currently is little more than a token of personal identity. On the basis of a student law review article, it feared that "DNA samples may reveal private information regarding familial lineage and predisposition to over four thousand types of genetic conditions and diseases; they may also identify genetic markers for traits including aggression, sexual orientation, substance addiction, and criminal tendencies."

The majority of the en banc court of appeals was less fearful that the government would change its use of the samples to go beyond the current production and trawling of identification profiles. It observed that "[t]he judiciary risks error by elaborating too fully on the Fourth Amendment implications of emerging technology before its role in society has become clear. ... At this juncture, ... we consider the amount and type of personal information to be contained in the DNA profile to be nominal."

Thus, the district court saw the retained samples as a potentially rich source of private information about the arrestee that the government might exploit some day (although it did not explain why the government would be interested in performing genetic tests for 4,000 or more medical conditions). The court of appeals was content to uphold the status quo: "As currently structured and implemented . . . the DNA Act's compulsory profiling of qualified federal offenders can only be described as minimally invasive--both in terms of the bodily intrusion it occasions, and the information it lawfully produces."

Similar cases are pending before the Second and Ninth Circuits. I predict that both will uphold the federal law -- over some vigorous dissents. When will the Supreme Court step in?


United States v. Mitchell, 681 F.Supp.2d 597 (W.D. Pa. 2009)

United States v. Mitchell, No. 02-2859 (3d Cir. July 25, 2011) (en banc)

Cross-posted from the Double Helix Law Blog

Saturday, July 9, 2011

Junk Science in United States v. Pool

Having granted en banc review in United States v. Pool, 621 F.3d 1213 (9th Cir. 2010), the U.S. Court of Appeals for the Ninth Judicial Circuit is likely to produce as wild a set of conflicting opinions on DNA databases as it did in United States v. Kincade, 379 F.3d 813 (9th Cir. 2004).

The panel that heard the Pool case divided 2-1 and generated 3 opinions. Judge Callahan wrote an opinion upholding the federal law on taking DNA after an arrest. Visiting Judge Lucero (of the Tenth Circuit) joined in this opinion, but he also wrote a separate opinion. Judge Schroeder dissented.

The briefs that informed these three opinions left something to be desired. Here, I'll focus on one of my pet peeves--disingenuous or inane claims about the CODIS STR loci as a threat to privacy.

Appellant's Opening brief (available from a link on EPIC's website, along with a one-sided list of vaguely related articles) is rather shameless in this regard. It starts with the claim that "DNA profiles derived by STR may yield probabilistic evidence of the contributor’s race or sex." [1] Probabilistic evidence of sex from autosomal STRs? The arresting officers or jailers need a genetic test for that?

Then the brief cites Simon Cole's writing to support its sweeping statement that "scientific studies have debunked the notion that these regions of the genetic code are devoid of any biological function." Yet, the brief cites no study that "debunks" the notion that the length polymorphisms of the CODIS tetranucleotide STRs lack "biological function." The concurring opinion of Judge Lucero recognizes that Cole rejects the claim of functionality (for the moment). [2] However, a group in France has a theory and some data for a mechanism through which one such STR could regulate the expression of an enzyme. [3]

Finally, the brief proposes that the "specter of discrimination and stigma could arise where one or more STRs is found to correlate with another genetic marker whose function is known, so that the presence of the seemingly innocuous STR serves as a 'flag' for that genetic predisposition or trait." [4] An accompanying footnote gives this example: "A study in England from 2000 found that one of the markers used in DNA identification is closely related to the gene that codes for insulin, which itself relates to diabetes." [5]

The accused STR is TH01. It has been used in many studies investigating the association between (a) SNPs, VNTRs, and this STR in a complex of genes and (b) a large number of diseases. Unsurprisingly, associations have been observed. Some of the reported associations were spurious and were not replicated. Other associations probably are real. This does not mean that TH01, by itself, is a useful predictor of any of these diseases in a given population. In fact, one forensic biologist used the 2000 paper cited in Pool's brief to show that "such associations [between forensic STRs and disease-causing alleles in genes] are so ridiculously weak that serious protest could never form." [6] His explanation follows:
This is illustrated well by the possible association between certain alleles of an STR named TH01 and diabetes type 1 (Bennett and Todd, 1996; Stead et al., 2000). TH01 alleles are used routinely in DNA typing, and for a minute, the manufacturers of genetic fingerprint kits started to feel the heat over the possible association between an exonic illness and an intronic allele. Fortunately, it takes just a pen and a piece of paper to brush off possible concerns: four out of 1000 Europeans will eventually get diabetes type 1. If you carry one of the ‘risk’ alleles in the intronic TH01 region, your chances of getting diabetes type 1 is 0.13 out of 1000. If I find out that you are carrying the alleged risk allele in my laboratory during DNA typing, I could—but I am not allowed to—calculate your total risk for diabetes as 0.4 × 1.3 = 0.52%. In plain language: in the worst case scenario, one allele of your possible genetic fingerprint might tell me that your general risk of getting diabetes type 1 is increased from 0.4 to 0.52%. All other alleles will not tell me anything about you, or your potential risk for illnesses. Abuse of such information is impossible because it simply has no practical predictive value.
I do not want to "brush off possible concerns," and I understand the pressures and temptations of advocacy. Still, I wonder whether the Sacramento Federal Defender consulted the scientific literature on TH01 before citing an old article. Or whether he knew that the claims in the law review essay cited in the brief were the subject of an extensive rejoinder in the same journal. [7] If he did, he chose not to share this fact with the court. To my mind, that is not good advocacy.


1. Brief at 12 (quoting from a plurality opinion in Kincade).
2. 621 F.3d at 1230.
3. See Rolando Meloni, Post-genomic Era and Gene Discovery for Psychiatric Diseases: There Is a New Art of the Trade? The Example of the HUMTH01 Microsatellite in the Tyrosine Hydroxylase Gene, 26 Molecular Neurobiology 389 (2001).
4. Brief at 12.
5. Id. at 12 n.8.
6. Mark Benecke, Coding or Non-coding? That Is the Question, 3 European Molecular Biology Organization Reports 498 (2002).
7. David H. Kaye, Please, Let's Bury the Junk: The CODIS Loci and the Revelation of Private Information, 102 Nw. U. L. Rev. Colloquy 70 (2007).

Crossposted from the Double Helix Law Blog

Friday, July 8, 2011

Confusing DNA Samples with DNA Profiles?

It is tough for lawyers to get science right. I say this not to denigrate lawyers—I am one myself—but to stress the importance of taking the time and effort to communicate the scientific facts clearly so that the value judgments are persuasive. An article attacking the constitutionality of an Arkansas law on DNA sampling from arrestees illustrates this point. In “Step Out of the Car: License, Registration, and DNA Please,” Associate Professor Brian Gallini of the University of Arkansas School of Law gives an account of DNA profiling that makes it appear that the process reveals “the totality of a person’s genetic makeup” to arrive at an identification profile. At least, that is how the following exposition of DNA profiling for identification could be read:
[E]ven the layperson knows that taking a DNA sample requires an intrusion into the body, which thereafter reveals the totality of a person’s genetic makeup. ... Although courts have characterized DNA swabs as only “minimally intrusive,” they do so without recognizing ... the intrusion upon the arrestee’s interest in keeping the information revealed by a DNA sample private. From a buccal swab, the state obtains an analyzable sample of an arrestee’s DNA. That, in turn, allows the state to perform a polymerase chain reaction procedure (PCR), which involves replicating the DNA sample. This replication then allows the tester to look at “short tandem repeats” (STR). At this stage, the STRs reveal specific areas of DNA known as “loci.” In total, the tester is looking to isolate thirteen different loci in order to identify an individual’s exact genetic makeup. Once complete, the sample potentially “provides the instructions for all human characteristics, from eye color to height to blood type.”
What is wrong with this picture? Let me count the ways:
  1. PCR does not replicate the DNA sample. Human cells can replicate the full nuclear genome, but PCR can only replicate short stretches of DNA from targeted locations—the loci.
  2. Replication itself does not allow the tester to look at STRs. Visualization or ascertainment comes later.
  3. STRs do not “reveal specific areas of DNA known as ‘loci.’” An STR is a certain type of DNA sequence that occurs at, well, an STR locus. PCR primers used in forensic identification amplify only the sequences at these loci. The rest of the genome remains terra incognita.
  4. The tester is not seeking “to identify an individual’s exact genetic makeup.” Rather, the laboratory is seeking to ascertain a small number of variations that are not in genes (or not in the exons of genes).
  5. The physical sample was complete before it was typed. “Once complete,” the tiny profile cannot possibly “provide[] the instructions for all human characteristics, from eye color to height to blood type.” The STR typing never gives any instructions for phenotypes.

Do these corrections mean that samples could not be used to gain information about human phenotypes such as eye color? Of course not. Eye color is a phenotype that can be deduced (in some instances) from genotyping. But such genotyping is not STR profiling.

And how much would it invade your privacy if a laboratory technician were to figure out your eye color in this roundabout way—instead of looking you in the eye? But that’s another story, and I have argued elsewhere against indefinite sample retention.


Brian Gallini, Step Out of the Car: License, Registration, and DNA Please, 62 Ark. L. Rev. 475 (2009)

Crossposted from the Double Helix Law blog.