Tuesday, February 24, 2015

Genetic Determinism and Essentialism on the Electronic Frontier

The latest bit of genetic determinism, a notion discredited in the scientific world, comes from the Electronic Frontier Foundation (EFF). This is not the first time the EFF has strayed from electronics to genetics, where it seems inclined to overstate scientific findings. 1/ Now the organization wants the Supreme Court to decide whether it is an unreasonable search or seizure for police, without probable cause and a warrant, to acquire and analyze shed DNA for identifying features that might link a suspect to a crime. That is a perfectly reasonable request, although, in the unlikely event that the Court takes this bait, making the case for a Fourth Amendment violation will not be easy.

What is less reasonable, indeed, what many geneticists and bioethicists regard as ill-advised, is to portray DNA as a map of “who we are, where we come from and who we will be.” 2/ My DNA is not who I am. It determines some things about me — my blood type, for example — but not my occupation, my interests, my skills, my criminal record, or my political affiliation. Yet, rather than simply point out that people have legitimate reasons to want to maintain the confidentiality of certain traits or risks that DNA analysis could reveal — such as an inherited form of Alzheimer’s disease — the EFF is concerned that “[r]esearchers have theorized DNA may also determine race, intelligence, criminality, sexual orientation, and even political ideology.” 3/

Of course, researchers have “theorized” almost everything at one time or another. And the prospect that police will collect DNA from a suspect surreptitiously to find out if he is a liberal Democrat or a conservative Republican seems a tad silly. Still, I was curious: Is there really a theory of how genes determine political ideology?

I turned to the news article in a 2012 issue of Nature cited by the EFF. 4/ Nothing in the article gives a theory of genetic determinism for political ideology. The article refers to twin studies that imply genetics plays some role in political behavior. There are some reports of candidate genes from studies that have “yet to be independently replicated.” 5/

As for a theory of how unknown genes might, to some degree and in some settings, influence political ideology, the idea is that some genes affect general attitudes or emotional reactions that could relate in some manner to political beliefs. For example,
US conservatives may not seem to have much in common with Iraqi or Italian conservatives, but many political psychologists agree that political ideology can be narrowed down to one basic personality trait: openness to change. Liberals tend to be more accepting of social change than conservatives. ...

Theoretically, a person who is open to change might be more likely to favour gay marriage, immigration and other policies that alter society and are traditionally linked to liberal politics in the United States; personalities leaning towards order and the status quo might support a strong military force to protect a country, policies that clamp down on immigration and bans on same-sex marriage. 6/
These remarks are not a basis for a true friend of the Court to imply that political ideology might be a genetically determined phenotype. 7/

Notes
  1. See David H. Kaye, Dear Judges: A Letter from the Electronic Frontier Foundation to the Ninth Circuit, Forensic Science, Statistics and the Law, Sept. 20, 2012.
  2. Brief of Amicus Curiae Electronic Frontier Foundation in Support of Petitioner on Petition for a Writ of Certiorari, Raynor v. Maryland, No. 14-885, Feb. 18, 2015, at 2.
  3. Id. (note omitted).
  4. Lizzie Buchen, Biology and Ideology: The Anatomy of Politics, 490 Nature 466 (2012).
  5. Id. at 466.
  6. Id. at 468.
  7. For a critical discussion of factual errors and distortions in Supreme Court amicus briefs generally, see Allison Orr Larsen, The Trouble with Amicus Facts, 100 Va. L. Rev. 1757 (2014).

Friday, February 20, 2015

Buza Reloaded: California Supreme Court Grants Review

Yesterday the California Supreme Court granted review in People v. Buza, No. A125542 (Cal. Ct. App., 1st Dist., Dec. 3, 2014), and ordered the Court of Appeal opinion "depublished." A depublication order "is not an expression of the court's opinion of the correctness of the result of the decision or of any law stated in the opinion." Cal. Rules of Court, Rule 8.1125(d) (2015). However, "an opinion of a California Court of Appeal ... that is not ... ordered published must not be cited or relied on by a court or a party in any other action" in California. Rule 8.1115(a).

The California Department of Justice issued an information bulletin advising all state law enforcement agencies that
By operation of state law, the Supreme Court’s order granting review removes the Court of Appeal’s opinion as published authority and prevents citation or reliance on that decision in any other action. As a result of the California Supreme Court’s grant of review of this decision, there is now no state precedent that precludes collection of DNA database samples from adult felony arrestees pursuant to Penal Code section 296.

Penal Code sections 296(a)(2) and 296.1(a) therefore are in full effect and mandate the collection of DNA database samples from all adults arrested for a felony or wobbler offense. All authorized arrestee samples that have been or will be received by the California Department of Justice DNA Data Bank program will be analyzed and uploaded to CODIS.


Sunday, February 15, 2015

"Remarkably Accurate": The Miami-Dade Police Study of Latent Fingerprint Identification (Pt. 2)

A week ago, I noted the Justice Department’s view that a “study of ... latent print examiners ... found that examiners make extremely few errors. Even when examiners did not get an independent second opinion about the decisions, they were remarkably accurate.” 1/ But just how accurate were they?

The police who conducted the study “[p]resented the data to a professor from the Department of Statistics at Florida International University” (p. 39), and this “independent statistician performed a statistical analysis from the data generated” (p. 45). The first table in the report (Table 4, p. 53) contains the following data (in slightly different form):

Table 1. Classifications of Pairs

Examiner's Statement     Nonmates (N)     Mates (M)
–                             953              235
+                              42            2,457
?                             403              446

Here, “+” stands for a positive opinion of identity between a pair of prints (same source), “–” denotes a negative opinion (an exclusion), and “?” indicates a refusal to make either judgment (an inconclusive) even though the examiner initially deemed the prints sufficient for comparison.

What do the numbers in Table 1 mean? As noted in my previous posting, they pertain to the judgments of 109 examiners with regard to various pairings of 80 latent prints with originating friction ridge skin (mates) and nonoriginating skin (nonmates). A total of 3,138 pairs were mates; of these, the examiners reached a positive or negative conclusion in 2,692 instances. Another 1,398 were nonmates; of these, the examiners reached a conclusion in 995 instances. Given that examiners were presented with mates and that they reached a conclusion of some sort, the proportion of matches declared was P(+|M & not-?) = 2,457/2,692 = 91.3%. These were correct matches. For the pairings in which the examiners reached a conclusion, they declared nonmates to match in P(+|N & not-?) = 42/995 = 4.2% of the pairs. These were false positives. With respect to all the comparisons (including the ones that they found to be inconclusive), the true positive rate was P(+|M) = 2,457/3,138 = 78.3%, and the false positive rate was P(+|N) = 42/1,398 = 3.0%. Similar reasoning applies to the exclusions. Altogether, we can write:
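
For readers who want to check the arithmetic, here is a minimal Python sketch (my own illustration, not part of the study; the variable names are simply labels I have chosen) that recomputes these conditional proportions from the counts in Table 1:

# Counts from Table 1 (the study's Table 4, p. 53)
mates = {"+": 2457, "-": 235, "?": 446}       # pairs from the same source
nonmates = {"+": 42, "-": 953, "?": 403}      # pairs from different sources

total_m = sum(mates.values())                 # 3,138 mated pairs
total_n = sum(nonmates.values())              # 1,398 nonmated pairs
concl_m = mates["+"] + mates["-"]             # 2,692 conclusions on mates
concl_n = nonmates["+"] + nonmates["-"]       # 995 conclusions on nonmates

print(f"P(+|M & not-?) = {mates['+'] / concl_m:.1%}")     # 91.3% correct matches among conclusions
print(f"P(+|N & not-?) = {nonmates['+'] / concl_n:.1%}")  # 4.2% false positives among conclusions
print(f"P(+|M) = {mates['+'] / total_m:.1%}")             # 78.3% true positive rate overall
print(f"P(+|N) = {nonmates['+'] / total_n:.1%}")          # 3.0% false positive rate overall
print(f"P(-|M & not-?) = {mates['-'] / concl_m:.1%}")     # 8.7% false negatives among conclusions
print(f"P(-|M) = {mates['-'] / total_m:.1%}")             # 7.5% false negative rate overall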

Table 2. Conditional Error Rates

            Excluding inconclusives        Including inconclusives
False +     P(+ | N & not-?) = 4.2%        P(+ | N) = 3.0%
False –     P(– | M & not-?) = 8.7%        P(– | M) = 7.5%


These error rates, which are clearly reported in the study, do not strike me as "remarkably small," especially considering that they cover the full spectrum of pairs, easy as well as difficult comparisons. Of course, they do not include blind verification of the conclusions, a matter addressed in another part of the study.

The authors report more reassuring values for “Positive Predictive Value” (PPV) and “Negative Predictive Value” (NPV). These were 98.3% and 92.4%, respectively. But these quantities depend on the proportions of pairs that are mates (69%) and nonmates (31%) in the test pairs. The prevalence of mates in casework—or the “prior probability” in a particular case—might be quite different. 2/
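
To see how sensitive the reported figures are to the mix of pairs, the short sketch below (again my own illustration; only the 69% value approximates the study's mix, and the other prevalences are hypothetical) recomputes PPV and NPV from the overall rates in Table 2. Run with the study's mix, it reproduces the reported PPV of 98.3% and the unadjusted NPV of about 80% mentioned in note 2; with a lower prevalence of mates, the PPV drops considerably.

# Overall rates, including inconclusives (Table 2); P(-|N) = 953/1,398 from Table 1
tpr, fpr = 0.783, 0.030      # P(+|M), P(+|N)
tnr, fnr = 0.682, 0.075      # P(-|N), P(-|M)

def ppv_npv(prevalence):
    """PPV and NPV for a given (hypothetical) prevalence of mated pairs."""
    ppv = tpr * prevalence / (tpr * prevalence + fpr * (1 - prevalence))
    npv = tnr * (1 - prevalence) / (tnr * (1 - prevalence) + fnr * prevalence)
    return ppv, npv

for p in (0.69, 0.50, 0.10):                  # study's mix, even mix, low-prevalence mix
    ppv, npv = ppv_npv(p)
    print(f"prevalence {p:.0%}: PPV = {ppv:.1%}, NPV = {npv:.1%}")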

A better statistic for thinking about the probative value of an examiner’s conclusion is the likelihood ratio (LR). How much more often do examiners declare matches when they encounter mated pairs than when they encounter nonmates? And how much more often do they declare exclusions when they encounter nonmates than when they encounter mates?

The LR answers these questions. For declared matches, the LR is P(+|M) / P(+|N) = 0.783 / 0.030 = 26. For declared exclusions, it is P(–|N) / P(–|M) = 9. 3/ These values support the claim that, on average, examiners can distinguish paired mates from paired nonmates. If all the examiners were flipping fair coins to decide, the LRs would be expected to be 1. The examiners did much better than that.
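
Using the rounded rates in the tables, the arithmetic behind these likelihood ratios (and the variants in note 3) takes only a few lines of Python; this is merely a check of the division, not anything reported in the study:

# Overall rates, including inconclusives (Table 2); 0.682 = 953/1,398 from Table 1
tpr, fpr = 0.783, 0.030              # P(+|M), P(+|N)
tnr, fnr = 0.682, 0.075              # P(-|N), P(-|M)
print(round(tpr / fpr))              # about 26: LR for a declared match
print(round(tnr / fnr))              # about 9: LR for a declared exclusion

# Excluding inconclusives (cf. note 3)
print(round(0.913 / 0.042))          # about 22
print(round((1 - 0.042) / 0.087))    # about 11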

Nevertheless, claims of overwhelming confidence across the board do not seem to be justified. If examiners were presented with equal numbers of mates and nonmates, one would expect that a declared match would be a correct match in P(M|+) = 26/27 = 96% of the cases in which a match is declared. 4/ Likewise, a declared exclusion would be a correct classification in P(N|–) = 9/10 = 90% of the instances in which an exclusion is declared. The PPV and NPV in the Miami-Dade study are a little higher because the prevalence of mates was 69% instead of 50%, and the examiners were cautious — they were less likely to err when making positive identifications than negative ones.

Suppose, however, that in a case of average difficulty, an average examiner declared a match when the defendant had strong evidence that he never had been in the room where the fingerprints were found. Let us say that a judge or juror, on the basis of the non-fingerprint evidence in the case, would assign a probability of 1% rather than 50% or 69% to the hypothesis that the defendant is the source of the latent print. The examiner, properly blinded to this evidence, would not know of this small prior probability. An LR of 26 would raise the prior odds of 1:99 to posterior odds of 26:99, that is, a posterior probability of only about 21%. Informing the judge or juror of the reported PPV of 98.3% from the study without explaining that it does not imply a “predictive value” of 98.3% in this case would be very dangerous. It would lead the factfinder to regard the examiner’s conclusion as far more powerful than it actually is.
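
The odds form of Bayes' rule makes this explicit. The sketch below (my own illustration, using the rounded LRs derived above) converts a prior probability to odds, multiplies by the LR, and converts back to a probability; it reproduces the 96% and 90% figures for a 50% prior as well as the posterior of roughly 21% for the 1% prior just discussed.

def posterior(prior_prob, lr):
    """Update a prior probability by a likelihood ratio, via the odds form of Bayes' rule."""
    prior_odds = prior_prob / (1 - prior_prob)
    post_odds = lr * prior_odds
    return post_odds / (1 + post_odds)

print(f"{posterior(0.50, 26):.0%}")   # 96%: equal priors, declared match (cf. note 4)
print(f"{posterior(0.50, 9):.0%}")    # 90%: equal priors, declared exclusion
print(f"{posterior(0.01, 26):.0%}")   # about 21%: the 1% prior discussed above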

Notes

  1. David H. Kaye, "Remarkably Accurate": The Miami-Dade Police Study of Latent Fingerprint Identification (Pt. 1), Forensic Science, Statistics and the Law, Feb. 8, 2015.
  2. In addition, the NPV has been adjusted upward from 80% “[i]n [that] consideration was given to the number of standards presented to the participant.” P. 53.
  3. Removing nondeclarations of matches or exclusions (inconclusives) from the denominators of the LRs does not change the ratios very much. They become 22 and 11, respectively.
  4. This result follows immediately from Bayes' rule with a prevalence of P(M) = P(N) = 1/2, since P(M|+) = P(+|M) P(M) / [P(+|M) P(M) + P(+|N) P(N)] = P(+|M) / [P(+|M) + P(+|N)] = LR / (LR + 1) = 26/27.

Sunday, February 8, 2015

"Remarkably Accurate": The Miami-Dade Police Study of Latent Fingerprint Identification (Pt. 1)

A week ago (Feb. 2, 2015), the Justice Department issued a press release entitled "Fingerprint Examiners Found to Have Very Low Error Rates." According to the Department:
A large-scale study of the accuracy and reliability of decisions made by latent fingerprint examiners found that examiners make extremely few errors. Even when examiners did not get an independent second opinion about the decisions, they were remarkably accurate. But when decisions were verified by an independent reviewer, examiners had a 0% false positive, or incorrect identification, rate and a 3% false negative, or missed identification, rate. ... “The results from the Miami-Dade team address the accuracy, reliability, and validity in the forensic science disciplines, ...” said Gerald LaPorte, Director of NIJ’s Office of Investigative and Forensic Sciences.
Inasmuch as the researchers -- latent print examiners and a police commander in the Miami-Dade Police Department 1/ -- only studied the performance of 109 latent print examiners, it is not clear how many forensic science disciplines the study actually addresses. Nor is it obvious what "validity" means (beyond "accuracy") in this one activity.

But let's put press releases to the side and look into the study itself. The authors assert that
The foundation of latent fingerprint identification is that friction ridge skin is unique and persistent. Through the examination of all of the qualitative and quantitative features available in friction ridge skin, impressions can be positively identified or excluded to the individual that produced it. 2/
This study does next to nothing to validate this foundation. The premise of uniqueness is very difficult to validate, and this study is limited to "80 latent prints with varying quantity and quality of information from [a grand total of] ten known sources." 3/ But, to its credit, the research does tell us about the ability of one large group of examiners to correctly and reliably pair these particular latent prints to the more complete known prints of the fingers that generated them. Let's see how much it reveals in this regard.

The Test Set

As for the prints used in the experiment, "[a] panel of three International Association for Identification (IAI) certified latent print examiners independently examined and compared the 320 latent prints to the known standards and scored each latent print and subsequent comparison to their known standard according to a rating scale that was designed and used for this research; 80 were selected as the final latent prints to be used for testing purposes." 4/ The purpose of the three independent examinations was to rate the latent-known pairs on a difficulty scale "in order to present the participants with a broad range of latent print examinations that were representative of actual casework." 5/ Although the researchers may well have succeeded in fashioning a test set with pairs of varying difficulty, the report does not explain how they knew that this set was "representative of actual casework" and that "[t]he test sets utilized in this study were similar to the work that participants perform on a daily basis." 6/ Neither did they report how consistently the three uber-experts gauged the difficulty of the pairs.

The Examiners Who Were Tested

It seems that readers of the Miami-Dade report must take on faith the assertion that the test set is "representative of actual casework." In contrast, it is plain that the test subjects are not representative of all caseworkers. Rather than seek a random sample of all practicing latent print examiners -- which would be a difficult undertaking -- the researchers chose a convenience sample. Only "[l]atent print examiners in the United States who were an active member [sic] of the IAI received an email invitation from the MDPD FSB inviting them to participate in this study." 7/ Inasmuch as IAI certification is a mark of distinction, the sampling frame diverges from the population of all examiners. Departing from good statistical practice, the report does not state how large the nonresponse rate for IAI-certified invitees was. If it was high (as seems probable), the sample of examiners is likely to be a biased sample of all IAI-certified examiners.

In addition to soliciting participation from IAI-certified examiners, "[a]pplications were also made available to any qualified latent print examiner, regardless of affiliation with a professional organization." 8/ How this was done is not explained, but in the end, 55% of the subjects were not IAI-certified. 9/

Of course, these features of the sampling method do not deprive the study of all value. The experiment shows what a set of motivated examiners (volunteers) with high representation from IAI-certified examiners achieved when they (1) knew that their performance would be used in a report on the capabilities of their profession, (2) had an unspecified period of time to work, and (3) may not have always worked alone on the test materials. In the next posting on the study, I will describe these results.

Notes

  1. The only description of the authors in the report is on the title page, which identifies them as Igor Pacheco, CLPE (MDPD), Brian Cerchiai, CTPE (MDPD), and Stephanie Stoiloff, MS (MDPD). The International Association for Identification lists the first two authors as certified latent print examiners as of Dec. 4, 2014. Mr. Cerchiai is also an IAI-certified tenprint examiner. The third author is a senior police bureau commander in the Forensic Services Bureau of the Miami-Dade Police Department (MDPD). In July 2012, she testified before the Senate Judiciary Committee on behalf of the International Association of Chiefs of Police that "[f]orensic science is not the floundering profession that some may portray it to be."
  2. Igor Pacheco, Brian Cerchiai & Stephanie Stoiloff, Miami-Dade Research Study for the Reliability of the ACE-V Process: Accuracy & Precision in Latent Fingerprint Examinations, Final Technical Report, Award No. 2010-DN-BX-K268, Dec. 2014 (abstract).
  3. Id. The latent prints were not just from fingers. Some were palm prints.
  4. Id. at 24.
  5. Id. at 27.
  6. Id. at 35.
  7. Id. at 34.
  8. Id. at 35.
  9. Id. at 51.
Related Postings
  • Reports on studies in mainstream journals can be found on this blog under the labels "fingerprint" and "error."