Monday, July 28, 2008

DNA and birthdays: fun with math

While I was reading up for my last blog entry, I became enraged that the FBI is doing so little to inform the public that our DNA fingerprints aren’t as unique as originally believed, since DNA profiling has become so important as evidence in court trials. But then my husband reminded me of the birthday problem, and it became clear that there is no real reason to be upset.

Originally, it seemed unfair to me that juries are told that if the DNA profile of a sample from the crime scene matches that of the defendant, the odds of the match being a false positive (in other words, the odds that the sample didn’t really come from the defendant) are less than one in a billion.

Recently, database searches have shown that the odds of two people sharing identical DNA profiles may be much higher: as many as 3 pairs of individuals in a database of 30,000 (see previous blog entry). At first glance, it appears that jurors are being duped into thinking that DNA evidence is more solid than it really is.

But they’re not, and a look at the birthday problem shows why.

What is the birthday problem? Stanford professor Keith Devlin explained it on NPR as follows:

"The birthday problem asks how many people you need to have at a party so that there is a better-than-even chance that two of them will share the same birthday. Most people think the answer is 183, the smallest whole number larger than 365/2. In fact, you need just 23. The answer 183 is the correct answer to a very different question: How many people do you need to have at a party so that there is a better-than-even chance that one of them will share YOUR birthday? If there is no restriction on which two people will share a birthday, it makes an enormous difference. With 23 people in a room, there are 253 different ways of pairing two people together, and that gives a lot of possibilities of finding a pair with the same birthday."
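For anyone who wants to check the 23 and the 253 for themselves, here is a minimal sketch in Python. It assumes 365 equally likely birthdays and ignores leap years; the function names are my own invention for illustration.

```python
from math import comb


def prob_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday,
    assuming every day of the year is equally likely (an assumption)."""
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def smallest_party(threshold=0.5, days=365):
    """Smallest number of guests for which a shared birthday is
    more likely than the given threshold."""
    n = 1
    while prob_shared_birthday(n, days) <= threshold:
        n += 1
    return n


print(smallest_party())                     # 23
print(comb(23, 2))                          # 253 possible pairs among 23 people
print(round(prob_shared_birthday(23), 3))   # 0.507
```

With 22 guests the probability is still just under one half, which is why 23 is the tipping point.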

The point is that as soon as you start comparing every possible pair of people with each other, instead of comparing everyone against one specific individual, the odds of a match increase dramatically.

This means it is entirely possible for the odds of a match between the defendant and any one other person in the database to be less than 1 in a billion, even while the odds that some two people in the database match each other are much higher.
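The same arithmetic can be tried on a database. The sketch below is only an illustration under loud assumptions: it treats "one in a billion" as the chance that any given pair of profiles matches, it treats all pairs as independent, and it uses the database size of 30,000 mentioned above. The function name is hypothetical, and the sketch does not attempt to reproduce the 3 pairs actually reported, which would depend on the real per-pair probability for the profiles compared.

```python
from math import comb, expm1, log1p


def database_pair_stats(n_profiles, p_pair):
    """Return (number of pairs, expected matching pairs,
    probability of at least one matching pair).

    Assumes every pair matches independently with probability p_pair,
    which is a simplification.
    """
    pairs = comb(n_profiles, 2)
    expected = pairs * p_pair
    # 1 - (1 - p_pair) ** pairs, computed in a numerically stable way.
    p_any = -expm1(pairs * log1p(-p_pair))
    return pairs, expected, p_any


pairs, expected, p_any = database_pair_stats(30_000, 1e-9)
print(pairs)                # 449985000 distinct pairs
print(round(expected, 2))   # 0.45
print(round(p_any, 2))      # 0.36
```

So under these assumptions, a one-in-a-billion chance for any single comparison still leaves roughly a one-in-three chance that at least one pair somewhere in the database matches, simply because there are about 450 million pairs to compare.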

So my anger at the FBI for trying to hush up the story has now been replaced with frustration with the press and myself for not thinking the story through before judging the FBI’s decision.

Any objections to this application of the birthday problem?
