Around 80% of the human genome has an equivalent section in the chimp genome. When we compare these sections we find that they’re around 99% similar. That means, for example, that you only need to change the hair-related genes very slightly to produce either a naked ape or a hairy one. That’s fascinating, and that’s where phrases such as “99% chimp” come from: not from looking at the entire genome, but from the so-called “alignable” parts.
Except they aren’t 99% similar. Or at least, that’s what Tomkins claimed in a recent article published in Answers in Genesis’s scientific journal. His analysis revealed that we’re only 70% similar; and this new number has gotten a lot of publicity in creationist circles. If you’ve spent any time reading ICR or AiG, or even DI, you’ve probably heard this 70% figure being used as evidence against evolution. But how can he arrive at such a different conclusion from the rest of science? Did he make a mistake, or is he stumbling onto a truth suppressed by the hundreds of geneticists examining the human genome each day?
On the face of it his methodology seems sound. Tomkins basically cut the human genome up into loads of little sections (known as slices) and compared them to slices of the chimp genome. He then determined the percentage that was “optimally aligned”, and that was the figure he used. Although this is a different method from the one other researchers used, there’s nothing hugely wrong with it.
Tomkins’ results for how similar the genomes are when different length slices were used
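To make the slice idea concrete, here is a minimal Python sketch. It is not Tomkins’ actual pipeline: the function names, the toy sequences and the use of the longest exact match as a stand-in for “optimally aligned” are all assumptions made purely for illustration. What it shows is that the final percentage depends on an analysis choice such as slice length.

```python
from difflib import SequenceMatcher


def slice_sequence(seq, size):
    """Cut seq into consecutive, non-overlapping slices of `size` bases."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def best_match_length(query, target):
    """Length of the longest exact match of `query` anywhere in `target`.

    Assumption: this is a crude proxy for an alignment score, chosen only
    because it needs nothing outside the standard library.
    """
    matcher = SequenceMatcher(None, query, target, autojunk=False)
    match = matcher.find_longest_match(0, len(query), 0, len(target))
    return match.size


def sliced_similarity(query_genome, target_genome, size):
    """Fraction of query bases covered by each slice's best exact match."""
    slices = slice_sequence(query_genome, size)
    matched = sum(best_match_length(s, target_genome) for s in slices)
    return matched / len(query_genome)


# Hypothetical toy sequences: identical except for one substitution.
human_toy = "ACGTTGCAAGGCTTAACCGG"
chimp_toy = "ACGTTGCAAGTCTTAACCGG"

for size in (5, 10, 20):
    print(size, sliced_similarity(human_toy, chimp_toy, size))
```

On this toy input the two sequences differ at a single base, yet the reported similarity drops sharply once the slice is long enough to span the difference. Real alignment tools tolerate mismatches, so the effect there is less extreme, but the dependence on such choices is the point.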
Similarly, he processed the genome before his analysis with a computer algorithm he wrote himself. Although obviously different to what other researchers do, there’s nothing inherently wrong with it. At almost every step in the research he does something unusual, but not necessarily unscientific. This leads to the first problem I have with his research: “degrees of freedom.”
Over the course of research a scientist has many options as to how to proceed. What statistical analysis should they use? How long should the experiment run for? What computer program should they use? And so forth. Individually these decisions are innocuous but taken together they introduce biases into the data.
For example, insertions and deletions (“indels”) are instances where bits of the genome have been duplicated or deleted. Since a large indel can arise from a single duplication or deletion event, these are typically counted separately with a different method that takes this into account. Tomkins lumps them into his analysis of the whole genome. There’s nothing inherently wrong with doing so, but it will bias the results and make chimps and humans seem more distant.
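A toy calculation shows how much this choice can matter. The sketch below is my own illustration, not a method taken from any of the studies discussed: the two function names and the aligned sequences are hypothetical, and gaps are marked with `-`. One function counts every gapped base as a difference; the other collapses each run of gaps into a single event.

```python
def identity_per_base(a, b):
    """Identity where every gap column counts as one full difference."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def identity_per_event(a, b):
    """Identity where a run of consecutive gap columns counts once."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    differences = 0
    in_gap = False
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            if not in_gap:
                differences += 1  # a new indel event starts here
            in_gap = True
        else:
            in_gap = False
            if x == y:
                matches += 1
            else:
                differences += 1  # ordinary substitution
    return matches / (matches + differences)


# Hypothetical alignment: one 10-base indel plus one substitution.
human_aln = "ACGTACGTAC" + "GGGGGGGGGG" + "TTGCATTGCA"
chimp_aln = "ACGTACGTAC" + "----------" + "TTGCATTGCC"

print(round(identity_per_base(human_aln, chimp_aln), 3))
print(round(identity_per_event(human_aln, chimp_aln), 3))
```

With these made-up sequences the same alignment comes out at roughly 63% identical when every gapped base is counted and roughly 90% when the indel is treated as one event. Neither number is “wrong”; they answer different questions.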
This is one of the reasons replication is so important: getting a different set of researchers to do something likely means different decisions will be taken. If the results are still the same we know the decisions of the researcher (known as “degrees of freedom”) didn’t bias the results too much. The problem here is there’s just the one experiment to go on, with just the one program, one algorithm, one definition of “optimal alignment” and so forth. There’s nothing inherently wrong with the program, algorithm etc. but there is the potential for degrees of freedom to result in biases. Until other researchers try to replicate the results with slightly different methods, these results should be taken with a grain of salt.
Of course, science has been comparing chimp and human DNA for quite a while now, with several experiments using slightly different methods all arriving at figures of 95-99% similar. For example, the original comparison of chimp and human DNA and the recent comparison of chimp, bonobo and human DNA used different computer programs and so forth, but still arrived at a figure of chimp/human similarity of ~99%. Even work by a young earth creationist arrived at this figure! Given all of these different studies arriving at the same conclusion, I’m inclined to think that Tomkins’ methodology – whilst not inherently wrong – introduced a few too many biases into his results.
But say lots of people replicate his method and get similar results, vindicating it. There’s still nothing there that’ll challenge evolution. He’s compared one animal to humans with a unique method and gotten a unique number. Is that number unusually low or high, thus challenging evolution? Without more animals analysed with this method we can’t say. If he examined a crocodile and found it to be 90% similar, then there’d be evidence against evolution. But one anomalous number produced with an anomalous method? Hardly challenging.
So next time you see the 70% figure, I’d be skeptical. The methodology has probably introduced biases (particularly by counting indels in the total figure) and the results need context. Until that’s done Tomkins might as well be saying “humans and chimps are eleventy armchairs similar!”