Monday, July 28, 2008

DNA and birthdays: fun with math

While I was reading up for my last blog entry, I became enraged that the FBI is doing so little to inform the public that our DNA fingerprints aren’t as unique as originally believed, since DNA profiling has become so important as evidence in court trials. But then my husband reminded me of the birthday problem, and it became clear that there is no real reason to be upset.

Originally, it seemed unfair to me that juries are told that if the DNA profile of a sample from the crime scene matches that of the defendant, the odds of the match being a false positive (in other words, the odds that the sample didn’t really stem from the defendant) are less than one in a billion.

Recently, database searches have shown that the odds of two people sharing identical DNA profiles may be much higher: one search turned up as many as 3 matching pairs of individuals in a database of 30,000 (see previous blog entry). At first glance, it appears that jurors are being duped into thinking that DNA evidence is more solid than it really is.

But they’re not, and a look at the birthday problem shows why.

What is the birthday problem? Stanford professor Keith Devlin explained it on NPR as follows:

"The birthday problem asks how many people you need to have at a party so that there is a better-than-even chance that two of them will share the same birthday. Most people think the answer is 183, the smallest whole number larger than 365/2. In fact, you need just 23. The answer 183 is the correct answer to a very different question: How many people do you need to have at a party so that there is a better-than-even chance that one of them will share YOUR birthday? If there is no restriction on which two people will share a birthday, it makes an enormous difference. With 23 people in a room, there are 253 different ways of pairing two people together, and that gives a lot of possibilities of finding a pair with the same birthday."

The point is that as soon as you start comparing every possible pair of people, instead of comparing one specific individual against everyone else, the odds of finding a match increase dramatically.
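Devlin’s numbers are easy to check. Here is a minimal Python sketch, assuming 365 equally likely birthdays and ignoring leap years. The function names are my own, not Devlin’s.

```python
from math import comb

DAYS = 365  # assumption: 365 equally likely birthdays, no leap years


def prob_shared_birthday(n: int) -> float:
    """Probability that at least two of n people share a birthday."""
    prob_all_different = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        prob_all_different *= (DAYS - k) / DAYS
    return 1.0 - prob_all_different


def smallest_party() -> int:
    """Smallest party size with a better-than-even chance of a shared birthday."""
    n = 1
    while prob_shared_birthday(n) <= 0.5:
        n += 1
    return n


if __name__ == "__main__":
    n = smallest_party()
    print("people needed:", n)
    print("possible pairs:", comb(n, 2))
    print("chance of a shared birthday:", round(prob_shared_birthday(n), 3))
```

Running it gives 23 people, 253 possible pairs, and a chance of just over one half.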

This means it is entirely possible for the odds of a match between the defendant and any one other person to be less than 1 in a billion, even while the odds that some pair among everyone in the database matches are much higher.
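To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes that every pair of profiles matches independently with the same probability, and it plugs in the one-in-a-billion figure quoted to juries and the 30,000-profile database mentioned above. Both are illustrative assumptions rather than measured values, and the names are my own.

```python
from math import comb, expm1, log1p

P_MATCH = 1e-9    # assumption: per-pair match probability ("one in a billion")
DB_SIZE = 30_000  # assumption: database size, taken from the search mentioned above


def prob_at_least_one(trials: int, p: float) -> float:
    """Chance of at least one match in `trials` independent comparisons."""
    # Equivalent to 1 - (1 - p)**trials, computed stably for very small p.
    return -expm1(trials * log1p(-p))


# One specific person compared against everyone else in the database.
one_vs_all = prob_at_least_one(DB_SIZE - 1, P_MATCH)

# Every possible pair in the database compared with each other.
all_pairs = prob_at_least_one(comb(DB_SIZE, 2), P_MATCH)

if __name__ == "__main__":
    print(f"one person vs. the database: {one_vs_all:.6f}")
    print(f"any pair in the database:    {all_pairs:.2f}")
```

Under these assumptions, the chance that one particular person matches anybody in the database is about 3 in 100,000, while the chance that some pair somewhere in the database matches is roughly 36 percent.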

So my anger at the FBI for trying to hush up the story has now been replaced with frustration with the press and myself for not thinking the story through before judging the FBI’s decision.

Any objections to this application of the birthday problem?

Thursday, July 24, 2008

What to keep in mind if you are on the jury

How reliable is DNA evidence really? The FBI is working hard to keep a lid on an increasing number of cases that show that DNA fingerprinting isn’t as credible as we’re supposed to believe, write Jason Felch and Maura Dolan of the LA Times (July 20, 2008).

DNA profiles are commonly admitted as evidence in court cases, and are often sufficient to convict a suspect even when there is no other evidence. The DNA profile of the suspect is compared to that of a sample found at the crime scene.

The vast majority of our DNA is identical from person to person, but there are some stretches, called short tandem repeats (a kind of variable number tandem repeat), which vary in length between individuals. Humans have two copies of DNA, one from mom and one from dad, so we have two versions of each of these repeat segments.

In DNA fingerprinting, investigators look at the length of these repeat segments on both sets of DNA. The Combined DNA Index System (CODIS), the FBI-funded computer system that searches DNA profiles, uses 13 of these repeat segments.

As recently as 2001, a match of 9 loci was sufficient for conviction in many states, though most states now try to compare all 13 loci. Juries are often told that the odds of two unrelated people sharing 9 of these markers are less than one in a billion.
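To make “matching at 9 loci” concrete, here is a minimal Python sketch. It assumes a profile can be stored as a mapping from a locus name to the two repeat lengths found there, one for each copy of the DNA. The locus names and numbers are invented for illustration, and this is not how the CODIS software itself is implemented.

```python
# A profile maps each locus to the two repeat lengths found there.
# Locus names and values below are made up for illustration only.
Profile = dict[str, tuple[int, int]]


def matching_loci(a: Profile, b: Profile) -> int:
    """Count loci where both profiles carry the same two repeat lengths."""
    shared = a.keys() & b.keys()
    # Sort each pair so the order of the two copies does not matter.
    return sum(sorted(a[locus]) == sorted(b[locus]) for locus in shared)


sample: Profile = {"locus_1": (12, 15), "locus_2": (8, 8), "locus_3": (21, 24)}
suspect: Profile = {"locus_1": (15, 12), "locus_2": (8, 9), "locus_3": (21, 24)}

if __name__ == "__main__":
    print(matching_loci(sample, suspect), "of", len(sample), "loci match")
```

A real profile would have 13 entries instead of three, and a “9-locus match” would mean this count comes out to 9.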

But a search of Arizona’s DNA database by Kathryn Troyer in 2001 revealed two unrelated men who matched at 9 of the 13 loci.

Instead of trying to get to the bottom of things, the FBI responded to these findings with skepticism, and even tried to block future searches. Thomas Callaghan, head of the FBI's CODIS unit, called Troyer’s findings “misleading,” and reprimanded her laboratory for releasing the search results to a California court.

Despite Callaghan’s threats to cut off access to the national database, similar searches followed in California, Illinois, and Maryland.

A Maryland judge wrote, “The court will not accept the notion that the extent of a person's due process rights hinges solely on whether some employee of the FBI chooses to authorize the use of the [database] software.”

The database search in Maryland, covering 30,000 profiles, turned up 32 pairs of individuals who matched at 9 loci. Three of these pairs matched at all 13 loci, though it is not clear whether these individuals were related.

The Illinois search revealed 903 pairs of individuals, in a database of 220,000, whose DNA fingerprints matched at 9 loci.
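For a sense of scale, it helps to count how many pairs each search actually compared. This short Python sketch uses only the database sizes and pair counts reported above. The variable names are mine, and the arithmetic says nothing about how many of the matching pairs were relatives.

```python
from math import comb

# (profiles in database, pairs reported matching at 9 loci), as quoted above
searches = {
    "Maryland": (30_000, 32),
    "Illinois": (220_000, 903),
}

if __name__ == "__main__":
    for state, (profiles, matching_pairs) in searches.items():
        total_pairs = comb(profiles, 2)  # every possible pairing of two profiles
        print(f"{state}: {total_pairs:,} possible pairs, "
              f"about 1 match per {total_pairs / matching_pairs:,.0f} pairs")
```

A database of 30,000 profiles contains about 450 million possible pairs, and one of 220,000 contains about 24 billion.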

DNA has become a strong weapon in courtroom battles, so it is easy to see why the FBI and prosecutors would panic at these findings. But it does kind of make you wonder whose interests they are serving by hushing up the truth.