98.77% Wrong

Posted 28 September 2010 by Guest Contributor

by Joe Felsenstein,
http://evolution.gs.washington.edu/felsenstein.html

Over at Uncommon Descent (in this thread) "niwrad" presents a calculation, lengthily explained, purporting to show that the assertion that human and chimp genomes differ by 1% in their base sequence is wrong.

What "niwrad" does is extraordinary. Choosing random places in one genome (doing this separately for each chromosome), "niwrad" takes 30-base chunks and then looks in the other genome to see whether or not there is a perfect match of all 30 bases. This turns out to occur between 41.60% and 69.06% of the time in autosomes (it varies from chromosome to chromosome). The median is about 65%. So the difference is really 35%, not 1%, right?

Not so fast. If two sequences differ by 1.23% (the actual figure from the chimp genome paper), a one-base chunk will match 98.77% of the time. A two-base chunk will perfectly match (0.9877 x 0.9877) of the time. And so on. A 30-base chunk will match a fraction of the time which is the 30th power of 0.9877. That's 0.6898 of the time. So the 65% figure is pretty close to what is expected from a difference of 1.23% at the single-base level.

However the penny hasn't dropped yet over there (as of this writing, anyway). One commenter ("CharlesJ") has asked whether there isn't about a 1 in 4 chance of a 30-base mismatch if the difference is really 1%. That's correct, and "niwrad" has (somewhat incorrectly) replied that it's actually 1 in 3. This is a bit wrong but one way or the other the whole article goes up in smoke. "niwrad" has not figured that out yet.

Of course what creationists never do when they get upset about the 1% figure and claim it is Much Higher Than That is to compare that figure with the percentage difference with the orang genome or the rhesus macaque genome (gorilla isn't available yet). Those are of course higher yet, no matter how you calculate the figure, leaving the chimp as our closest relative.
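The arithmetic in the post is easy to check. The following is only a minimal sketch: it assumes each base matches independently with probability 0.9877 (the post's own simplification), and the function name `perfect_match_probability` is made up for illustration.

```python
# Probability that a chunk of k bases matches perfectly, assuming every base
# matches independently with probability (1 - per_base_divergence).
def perfect_match_probability(per_base_divergence: float, chunk_length: int) -> float:
    return (1.0 - per_base_divergence) ** chunk_length


DIVERGENCE = 0.0123  # single-base difference quoted in the post

for k in (1, 2, 10, 30):
    p = perfect_match_probability(DIVERGENCE, k)
    print(f"{k:>2}-base chunk: perfect match about {p:.4f} of the time")
```

For 30-base chunks this lands at roughly 0.69, the figure the post compares with niwrad's median of about 65%.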

116 Comments

DS · 28 September 2010

I thought they had the Fig Newton of informational type stuff on their side. Can't he perform the irreducibly complex calculations that are required?

How do they explain the one-to-one correspondence between chimp and human chromosomes and bands? Do they have a twisted calculation for that? How do they explain all of the other genetic data such as SINE insertions and mitochondrial DNA? Let me guess...

Reed A. Cartwright · 28 September 2010

The comparison I performed was completely different from those usually performed by geneticists, because was purely statistical in nature.

Bwahaha. As opposed to the geneticists who perform analyses based both in statistics and genetics. (Note: using Monte Carlo doesn't make a method magically statistical, purely or otherwise.) I don't think niwrad has any understanding of how the statistics that geneticists use actually work. He cites the results of the chimp genome paper without ever bothering to understand what units they are in. As Joe has pointed out, 98+% similarity is a statement about per-aligned-base similarity. Estimating a 30-mer dictionary distance is not going to magically change the results of per-aligned-base similarity.

Take some of my own research. Using about 12,000 orthologous nucleotides in humans and chimps, I estimated evolutionary divergence using statistically sophisticated expectation maximization and hidden Markov model techniques. As you see in Figure 3, humans and chimps are about 1.25% divergent (Look, error bars!). To put it another way, in humans and chimps 79 out of 80 orthologous nucleotides have not changed since their common ancestor. Mice and rats are 16.8% divergent, meaning that 5 out of 6 have not changed.

Despite all their fascination with human-chimp divergence, ID creationists never get around to explaining how two species of vermin are 13 times more divergent than humans and chimps.
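The "79 out of 80" and "5 out of 6" figures follow directly from the quoted divergences. A small sketch of that conversion (the helper name `one_in_n_changed` is hypothetical; the two percentages are the ones given in the comment):

```python
# Convert a per-base divergence into "1 base in N has changed".
def one_in_n_changed(divergence: float) -> float:
    return 1.0 / divergence


human_chimp = 0.0125  # 1.25% divergence quoted in the comment
mouse_rat = 0.168     # 16.8% divergence quoted in the comment

print(f"human/chimp: about 1 in {one_in_n_changed(human_chimp):.0f} bases changed")
print(f"mouse/rat:   about 1 in {one_in_n_changed(mouse_rat):.0f} bases changed")
print(f"mouse/rat is about {mouse_rat / human_chimp:.1f} times as divergent")
```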

IBelieveInGod · 28 September 2010

What percentage of the total genome of both chimps, and humans have been compared against each other?

Reed A. Cartwright · 28 September 2010

IBelieveInGod said: What percentage of the total genome of both chimps, and humans have been compared against each other?

According to the UCSC Genome browser, it's at least 98%.

Reed A. Cartwright · 28 September 2010

I'll also point out that one does not need to look at all 3 billion or so bases to estimate a divergence on the scale of 1.25%. Even with 12,000 bases (which is low given modern data), the potential error of my estimate was 0.2%. Thus from my data the net divergence between the species is nearly certain to be between 1% and 1.5%.
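A back-of-the-envelope check on that error figure, for readers who want one. This is not the expectation-maximization/hidden Markov model analysis described above; it is a naive sketch that assumes the 12,000 sites are independent draws with a 1.25% chance of differing, and uses the ordinary binomial standard error.

```python
import math


# Standard error of a proportion p estimated from n independent sites.
# (Assumption: independent, identically distributed sites; real analyses
# use more careful models than this.)
def binomial_standard_error(p: float, n: int) -> float:
    return math.sqrt(p * (1.0 - p) / n)


divergence = 0.0125
sites = 12_000

se = binomial_standard_error(divergence, sites)
margin = 2 * se  # roughly a 95% margin under a normal approximation

print(f"standard error: {se:.4%}")
print(f"approximate range: {divergence - margin:.2%} to {divergence + margin:.2%}")
```

Under these assumptions the margin comes out near 0.2 percentage points, which is consistent with the range quoted in the comment.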

Reed A. Cartwright · 28 September 2010

However, if the two genomes were really 95% similar or more, as is commonly claimed, also a 30BPM statistical test should produce 95% results, and it does not.

Epic Fail!

Michael Roberts · 28 September 2010

What's new about an epic fail?

Any argument put forward by ID or YEC is an epic fail. All are

eric · 28 September 2010

IBelieveInGod said: What percentage of the total genome of both chimps, and humans have been compared against each other?

At first I thought you were trollingly trying to change the subject away from the fact that Niwrad's entire argument rests on a simple math error. But then I thought, ah hah, you're making a subtle reference to the fact that Niwrad made an equally simple and stupid error by using randomly-selected 30-base chunks of each chromosome. That is an extremely small percent of each genome to compare against each other. Bravo IBIG for highlighting yet another problem with this design argument.

Bobo · 28 September 2010

And hey, guess what?

If instead of comparing only 30 nucleotides, you compare all (approx.) 3,000,000,000 nucleotides in our genomes, the identity is zero!

By Jove, this mathemajigger has disproved evolution! Praise be to the pink unicorn!
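The same point in numbers. This sketch assumes the 1.23% per-base difference from the post and independent sites, and works in log10 because the probabilities for long chunks underflow ordinary floating point. The function name is illustrative only.

```python
import math


# log10 of the probability that `length` consecutive bases all match,
# assuming independent sites that each differ with per_base_divergence.
def log10_perfect_match(per_base_divergence: float, length: int) -> float:
    return length * math.log10(1.0 - per_base_divergence)


for length in (30, 1_000, 100_000, 3_000_000_000):
    exponent = log10_perfect_match(0.0123, length)
    print(f"{length:>13,d} bases: P(perfect match) = 10^{exponent:.1f}")
```

The longer the chunk, the closer the all-or-nothing "identity" gets to zero, even though the per-base similarity never changes.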

DS · 28 September 2010

IBelieveInGod said: What percentage of the total genome of both chimps, and humans have been compared against each other?

What percentage of my questions have you answered? Here are a few more for you: How do you explain the one-to-one correspondence between chimp and human chromosomes and bands? Do you have an explanation for that? How do you explain all of the other genetic data such as SINE insertions and mitochondrial DNA? Let me guess… I thought this retard had been banned from all threads except the bathroom wall. He should definitely be segregated from decent society. He has spewed three hundred and fifty pages of filth all over the bathroom wall. Don't let him do it here.

harold · 28 September 2010

The error is shockingly crude and childish. The person behind "niwrad" is a dolt.

I will break it down to an even simpler analogy.

Imagine two equal length sequences of symbols.

One consists solely of "A"'s. It looks like this "AAAAAAAA....."

The other is 99% A's, but 1% B's. The exact location of the individual B's is not predictable.

A strand of it might look like this "ABAAAAAAAAAAAAAAAAAAA..."

Any dunce can see that if we examine long enough segments, the sequences will be 99% identical. That is, 99% of the time, the symbol at position "n" in the first strand will be identical to the symbol at position "n" of the second strand.

Most people can also see, however, that the probability of a sequence of length "m" chosen from one being identical to the same-position, same-length sequence from the other is (0.99)^m.

Let's imagine a truly asinine person who wants to argue against "the strands look similar theory" for ideological reasons.

He could randomly sample segments of length "m" from either strand and see if they had the exact same sequence as the same position, same length segment from the other strand.

The larger an arbitrarily chosen "m" becomes, of course, the lower the probability that the entire sampled sequence will be identical between the two.

This is exactly what niwrad has done, using m = 30.

To put it another way, his argument is exactly the same as arguing that two equal-length series of coin flips cannot be 50% identical on average, because any given sequence of thirty coin flips has a less than 50% chance of being identical to the next series of thirty coin flips.
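The A/B analogy can be simulated in a few lines. This is a sketch under stated assumptions: strands of 200,000 symbols, B's placed independently with probability 0.01, and non-overlapping windows of length m = 30 standing in for randomly sampled segments. The seed and function names are arbitrary.

```python
import random


def simulate_strand(length: int, b_fraction: float, rng: random.Random) -> str:
    """Build a strand that is mostly 'A', with 'B' at unpredictable positions."""
    return "".join("B" if rng.random() < b_fraction else "A" for _ in range(length))


def per_position_identity(s1: str, s2: str) -> float:
    """Fraction of positions at which the two strands carry the same symbol."""
    return sum(a == b for a, b in zip(s1, s2)) / len(s1)


def window_identity(s1: str, s2: str, m: int) -> float:
    """Fraction of non-overlapping length-m windows that are identical."""
    starts = range(0, len(s1) - m + 1, m)
    hits = sum(s1[i:i + m] == s2[i:i + m] for i in starts)
    return hits / len(starts)


rng = random.Random(2010)
all_a = "A" * 200_000
mostly_a = simulate_strand(200_000, 0.01, rng)

per_site = per_position_identity(all_a, mostly_a)
per_window = window_identity(all_a, mostly_a, 30)

print(f"per-position identity:     {per_site:.4f}")
print(f"30-symbol window identity: {per_window:.4f}  (theory: {0.99 ** 30:.4f})")
```

The two numbers describe the same pair of strands; only the size of the unit being scored differs.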

SWT · 28 September 2010

I am shocked -- shocked! -- to discover that "nirwad" has made what he believes to be a major innovation in how we compare genomes to quantify difference, has applied it to actual data, but yet failed to submit this breakthrough to a peer-reviewed journal for publication.

Another opportunity to build the scientific infrastructure for ID squandered through an abysmally bad research and publication strategy. It's almost like they don't really want knowledgeable review of their work ...

SWT · 28 September 2010

And of course by "nirwad" I meant "niwrad". My deepest apologies for misspelling the pseudonym.

With all the sincerity I can muster,

SWT

Joe Felsenstein · 28 September 2010

eric said: (in part of a response to trolling by IBelieveInGod): ... Niwrad's entire argument rests on a simple math error.

I wouldn't call it a math error so much as comparing apples to apple sauce.

But then I thought, ah hah, you're making a subtle reference to the fact that Niwrad made an equally simple and stupid error by using randomly-selected 30-base chunks of each chromosome. That is an extremely small percent of each genome to compare against each other.

Taking a 30-base chunk (niwrad took a large number of them) and seeing whether each has a match in the other genome isn't itself bad -- it will mostly find matches at the corresponding location. And ten thousand of those, sampled, is a pretty good sample. What the problem is, is that there are 30 bases and a mismatch of one base is enough to make the whole thing count as a 100% mismatch. Horribly biased. If niwrad had instead counted the fraction of the 30 that matched, and averaged that, the result would have been closer to 1.23%.

Joe Felsenstein · 28 September 2010

Our peerless leader Reed Cartwright has pointed out to me that there is also a major response to niwrad's silliness at Todd Wood's blog, Todd being a creationist but an honest biologist.

harold · 28 September 2010

Todd Wood, the world's only honest creationist*.

Therefore, also, the world's least psychologically tormented, but also loneliest, creationist.

*I count only people who actually had access to an education, but choose to deny scientific reality, as creationists. Historical figures from pre-scientific times, or people who have been involuntarily education deprived, don't count.

Mike Elzinga · 28 September 2010

I have said before that The Fundamental Misconception of the ID/creationists goes right back to Henry Morris’ pitting the “myth of evolution against the science of thermodynamics.”

Here is niwrad on the thermodynamic argument. It is not surprising that he gets this wrong also.

It is that fundamental misconception that drives all “statistical calculations” by the ID/creationists. They know without a shadow of a doubt that “everything descends into chaos without a guiding intelligence or program.” It’s because of entropy and the second law, despite the fact that they have learned to say publicly that they don’t believe evolution violates the second law (they have even learned to go out of their way to do some cheap calculations that show it doesn’t). Nevertheless, their thinking reveals the fundamental misconception is still there.

Therefore all their “statistical arguments” begin by selection, using a uniform sampling distribution, from an essentially infinite set of possibilities. It proves evolutionists wrong 10¹⁵⁰ percent of the time.

Joe Felsenstein · 28 September 2010

Joe Felsenstein said: If niwrad had instead counted the fraction of the 30 that matched, and averaged that, the result would have been closer to 1.23%.

Oops, should have been "If niwrad had instead counted the fraction of the 30 that did not match, and averaged that, the result would have been closer to 1.23%". Oh well, you all knew what I meant ...
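A simulation makes the difference between the two scoring rules concrete. This is a sketch, not niwrad's procedure: it invents a random 200,000-base sequence, makes a copy with substitutions at 1.23% per base, and compares chunks at the same position in both (that is, it assumes the homologous location is already known). All names and the seed are illustrative.

```python
import random

BASES = "ACGT"


def mutate(sequence: str, rate: float, rng: random.Random) -> str:
    """Copy a sequence, replacing each base with a different one at the given rate."""
    out = []
    for base in sequence:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def compare_chunks(seq_a: str, seq_b: str, chunk_length: int, samples: int,
                   rng: random.Random) -> tuple[float, float]:
    """Return (fraction of chunks matching perfectly, mean per-base mismatch fraction)."""
    perfect = 0
    mismatch_total = 0.0
    for _ in range(samples):
        start = rng.randrange(len(seq_a) - chunk_length + 1)
        a = seq_a[start:start + chunk_length]
        b = seq_b[start:start + chunk_length]
        mismatches = sum(x != y for x, y in zip(a, b))
        if mismatches == 0:
            perfect += 1
        mismatch_total += mismatches / chunk_length
    return perfect / samples, mismatch_total / samples


rng = random.Random(1)
human_like = "".join(rng.choice(BASES) for _ in range(200_000))
chimp_like = mutate(human_like, 0.0123, rng)

perfect_fraction, mean_mismatch = compare_chunks(human_like, chimp_like, 30, 10_000, rng)
print(f"all-or-nothing 30-base matches: {perfect_fraction:.3f}")
print(f"average fraction of mismatched bases: {mean_mismatch:.4f}")
```

Scoring each chunk as all-or-nothing gives a figure near 0.69, while averaging the mismatched fraction over the same chunks recovers something close to the 1.23% that was put in.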

mrg · 28 September 2010

harold said: Todd Wood, the world's only honest creationist*.

I like to read Wood's blog. I judge him as barking up the wrong tree, of course, but dang ... the guy is a genuine article. Wood's attitude appears to be: "Evo science is on a solid basis as far as the evidence goes, but I believe there's more to it than that, and if I do the grunt work I'll be able to demonstrate it and actually convince the science community of it." Those more inclined to bait our resident creotrolls might ask them what they think of Wood.

mrg · 28 September 2010

Mike Elzinga said: Here is niwrad on the thermodynamic argument. It is not surprising that he gets this wrong also.

I looked over it quickly, not trying to dig into an argument that was as specious as it was obscurely phrased, but I noticed he was making a linkup to "creationut information theory (CIT)" in it. MrE, entropy is your hot button. CIT is mine. (Again, pronounced with a VERY soft "c".) BTW, on Todd Wood -- as he points out at the end of his blog entry, as also noted by JF, all quibbling over the PRECISE percentage difference between human and chimp genomes is irrelevant. No matter how the pie is sliced, chimps still look well more like us genetically than they look like a gorilla.

Mike Elzinga · 28 September 2010

mrg said: I looked over it quickly, not trying to dig into an argument that was as specious as it was obscurely phrased, but I noticed he was making a linkup to "creationut information theory (CIT)" in it. MrE, entropy is your hot button. CIT is mine. (Again, pronounced with a VERY soft "c".)

The reason this entropy thing is my “hot button” issue is because I was around when Morris and Gish launched their attack on the biology teachers and on evolution. I have samples of the early writings of Morris and Gish in my files. I know exactly what they were trying to do; they even said it. Indeed they attacked the fossil record and everything else; but that narrative of “pitting the myth of evolution against the science of thermodynamics” was an explicit program and articulated as such. It remains the centerpiece of creationist arguments even today; you can find it on the websites of ICR and AiG even though they don’t like attention drawn to it. They know they have been repeatedly corrected on this, yet they pushed the misconceptions anyway. And Dembski and Behe picked it up in their approach to dealing with complex assemblies of molecules; so we see the clear track of the narrative right on into ID. ID/creationist “information theory” is their “solution” to the “evolution vs. the second law narrative.” It is their “scientific” theory of their sectarian god. It is how they make their sectarian dogma scientific; and therefore superior to all other dogma. They aren't going to let go of this narrative; it has been too lucrative.

mrg · 28 September 2010

Mike Elzinga said: ID/creationist “information theory” is their “solution” to the “evolution vs. the second law narrative.”

From their point of view, it's actually better. The SLOT is well-defined, and any reasonable examination of the SLOT shows that it neither rules out nor confirms evo science. "Information", however, is not so well-defined and, appropriately, there are no well-defined physical laws associated with it. Is there "information" in the genome? Sure, but having said that, what do we know that we didn't before? We could even come up with ad-hoc ways of measuring it -- number of coding base pairs, for instance -- but that really only allows us to compare genome sizes. It certainly doesn't support derivation of any fundamental physical laws like this snatched-out-of-the-air "Law of Conservation of Information." In the end, however, the SLOT and CIT arguments are the same: "An unmade bed never makes itself."

Mike Elzinga · 28 September 2010

mrg said: From their point of view, it's actually better. The SLOT is well-defined, and any reasonable examination of the SLOT shows that it neither rules out nor confirms evo science.

Actually, this is not quite correct. Matter cannot condense without the 2nd law. To say that the entropy of an isolated system spontaneously goes to a maximum is also to say that matter interacts. You cannot say that entropy increases and deny that matter interacts; that is an oxymoron. And we already know a great deal about how matter interacts, even in very complex systems. There are no known barriers to these processes continuing right on up to and including living organisms. No CIT is required (and that's no CIT).

Ryan Cunningham · 28 September 2010

Why stop at 30-mers? How about comparing 100,000-mers? My genome wouldn't even be 99% similar to my own parents at that level, which means I can't possibly be related to them. Clearly I am the result of immaculate conception, so I'm the new messiah. And I say all Christians should believe evolution is true.

Game. Set. Match.

Ben W · 28 September 2010

From that horrible, horrible thread about the SLOT...

Niwrad says, "Maxwell’s demon is a thought experiment and is fictitious, nevertheless it clearly proves that intelligence can counter entropy in principle."

Wow.

juicyheart · 28 September 2010

Of course what creationists never do when they get upset about the 1% figure and claim it is Much Higher Than That is to compare that figure with the percentage difference with the orang genome or the rhesus macaque genome (gorilla isn’t available yet). Those are of course higher yet, no matter how you calculate the figure, leaving the chimp as our closest relative.

Has this been done for the Bonobo? or is it in the works? Anyone know?

Maya · 28 September 2010

The depressing thing is that a number of people are spending time to refute this nitwit but he'll just let the thread die and then refer back to his "successful counter of the 99% similarity myth" in a few months' time. No honor, no shame, not even the slightest cognitive dissonance.

mrg · 28 September 2010

Ben W said: Niwrad says, "Maxwell’s demon is a thought experiment and is fictitious, nevertheless it clearly proves that intelligence can counter entropy in principle." Wow.

I didn't catch that on my quick glance through the article. And so we prove that over a century of analysis showing that Maxie's lil' demon really IS a fiction is all twaddle. With the cherry on top being the visualization of what NW would say if EVIL-utionists tried to use an argument based on a "fiction". "Not gonna call them names, NW? I would bet you would."

GEORGE · 28 September 2010

Not only are they misunderstanding this data, but they ignore the fact that no matter how close our genomes are to each other, 50 or 60 or 99%, it still means we're... related. How closely or how distantly is hardly the point as far as proving whether or not we are related.

IBelieveInGod · 28 September 2010

GEORGE said: Not only are they misunderstanding this data, but they ignore the fact that no matter how close our genomes are to each other, 50 or 60 or 99%, it still means we're... related. How closely or how distantly is hardly the point as far as proving whether or not we are related.

We do all have a common Creator:)

Mike Elzinga · 28 September 2010

Ben W said: From that horrible, horrible thread about the SLOT... Niwrad says, "Maxwell’s demon is a thought experiment and is fictitious, nevertheless it clearly proves that intelligence can counter entropy in principle." Wow.

This was one of the arguments that, back in the 1970s and 80s, we in the physics community thought was so silly that nobody could possibly take creationists seriously after hearing it. What we were totally naive about was the political nature of the movement and that many of these arguments were taunts directed at the scientific community in order to lure them into debates. The Catch 22 in all this is that to ignore them and snicker among ourselves was to allow the crap to spread, but to debate the creationists was to give them credibility. I think we are in a little better position now because the ID/creationists can no longer distance themselves from the mountain of crap they have generated, whether or not they believe it themselves. It’s all out there on the web and in books, and we can hold our noses and keep the ID/creationists at arm's length while we disinfect. I kinda like the fact that we can now scrub the floor with these IDiots without having to engage them directly and give them the stature they crave.

RBH · 28 September 2010

IBelieveInGod said: We do all have a common Creator:)

Who made it come out exactly as though evolution were true. Amazing, that. And there are data to suggest that if there were a creator, there were probably many creators, not just one.

Mike Elzinga · 28 September 2010

RBH said:
IBelieveInGod said: We do all have a common Creator:)
Who made it come out exactly as though evolution were true. Amazing, that. And there are data to suggest that if there were a creator, there were probably many creators, not just one.

I didn't know the creator was a Commoner.

John Harshman · 28 September 2010

I think you're overestimating the probability of a match here. Your calculation would be correct if niwrad were comparing homologous sequences. But he isn't. Instead he's using one 30-base human sequence as a probe into an entire chimp chromosome. If that 30 bases is repeated anywhere in the chromosome, he counts as a match. If there is no exact repeat, no match. About 4% of human sequences are not orthologous to any chimp sequence. I think it's reasonable to suppose that, if there is a related sequence somewhere in the chimp genome, it would be expected to be less similar to the human probe than would a hypothetical orthologous sequence. Then again, many of these sequences would be repetitive, giving us extra chances at a match. In fact, it's hard to figure out the true expectation of this pointless distance measure. Why 30 bases? Why a search limited to one chromosome? Why a 100% identity cutoff? Wouldn't it just make more sense to align the two genomes and do a direct comparison? But of course that's what nimrod is complaining about. Feh.
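The distinction drawn here, between comparing a chunk with its homologous position and using it as a probe against a whole chromosome, can be shown with a toy example. The sequences below are invented and use 5-base probes for readability; the function names are hypothetical.

```python
def homologous_match(source: str, target: str, start: int, k: int) -> bool:
    """Exact match of the k-base chunk against the same position in the target."""
    return source[start:start + k] == target[start:start + k]


def probe_match_anywhere(source: str, target: str, start: int, k: int) -> bool:
    """Exact match of the k-base chunk anywhere in the target sequence."""
    return source[start:start + k] in target


# Toy data (invented): the target differs from the source at position 4,
# but carries an extra copy of "GTTGC" further along, like a repeat.
source = "ACGTTGCAAT"
target = "ACGTAGCAAT" + "CCCC" + "GTTGC"

k = 5
# Probe "GTTGC": its homologous position is mutated, yet the repeat rescues it.
print(homologous_match(source, target, 2, k), probe_match_anywhere(source, target, 2, k))
# Probe "ACGTT": mutated at its homologous position and no copy elsewhere.
print(homologous_match(source, target, 0, k), probe_match_anywhere(source, target, 0, k))
```

Repeats can therefore push the probe-based score up, while sequence with no counterpart in the other genome pushes it down, which is why the expectation of this measure is hard to pin down.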

Mike Elzinga · 28 September 2010

John Harshman said: Wouldn't it just make more sense to align the two genomes and do a direct comparison? But of course that's what nimrod is complaining about. Feh.

And wouldn’t that involve exactly the same pattern recognition skills the creationists claim they have when they say they recognize design when they see it?

Rob · 28 September 2010

IBIG, Right you are. Evolution is the common creator:)

DS · 28 September 2010

IBelieveInGod said: We do all have a common Creator:)

Great. Then you won't mind explaining exactly why the nested hierarchy of SINE insertions is congruent with all of the other genetic, developmental and morphological data. Did the common creator copy the mistakes? Why did she do this? Why is this pattern, supposedly produced by the deceitful creator, exactly what one would expect if evolution were true? Why are you incapable of answering these questions? Why are you allowed to post here?

Dale Husband · 28 September 2010

DS said: I thought this retard [IBelieveInGod] had been banned from all threads except the bathroom wall. He should definitely be segregated from decent society. He has spewed three hundred and fifty pages of filth all over the bathroom wall. Don't let him do it here.

Ever heard of morphing? That has several definitions, but one example is when you appear with the same screen name but a different e-mail address in order to get around being blocked from a forum like this. Sockpuppetry is similar, except you change both your e-mail address and your screen name, because you want to fool people into thinking you are more than one person. Learn more trolling techniques here: http://scienceblogs.com/pharyngula/plonk.php

What gets people put into the Pharyngula killfile dungeon? This is a list of annoyances; it usually takes more than one incident to get thrown in the slammer, though. The people who've been incarcerated are typically persistent and have a known history of pulling these stunts over and over again.
Concern trolling: A particularly annoying form of trolling in which someone falsely pretends to be offering advice to favor a position they do not endorse; a creationist who masquerades as someone concerned about the arguments for evolution as an excuse to make criticisms.
Godbotting: Making an argument based only on the premise that your holy book is sufficient authority; citing lots of bible verses as if they were persuasive.
Insipidity: A great crime. Being tedious, repetitive, and completely boring; putting the blogger to sleep by going on and on about the same thing all the time.
Morphing: Changing pseudonyms to avoid killfiles.
Slagging: Making only disparaging comments about a group; while some of this is understandable, if your only contribution is consistently "X is bad", even in threads that aren't about X, then you're simply slagging, not discussing.
Sockpuppetry: Like morphing, but with a specific intent: creating multiple identities supporting a position to create a false impression of popularity.
Spamming: Using the comments to sell real estate, mortgage assessments, little blue pills, porn, or Russian mail-order brides. Spammers are not tolerated at all; they are expunged without comment.
Stupidity: Some people will just stun you with the outrageous foolishness of their comments; those who seem to say nothing but stupid things get the axe.
Trolling: Making comments intended only to disrupt a thread and incite flames and confusion.
Wanking: Making self-congratulatory comments intended only to give an impression of your importance or intelligence.

mrg · 28 September 2010

RBH said: And there are data to suggest that if there were a creator, there were probably many creators, not just one.

Well, who's to say the Gods aren't into crowdsourcing? I am always puzzled at people who seem to get upset about the existence of Gods. Personally, I tend to find them amusing. Amazing the places one can go when dealing with entities who have been released from physical constraints.

John Harshman · 28 September 2010

Now I think on it, an additional component of error would be those sequences in the human genome with no homologs at all in the chimp genome. A deletion of even one base in a single-copy sequence in either the chimp or human lineage, for example.

Reed A. Cartwright · 28 September 2010

John Harshman said: Now I think on it, an additional component of error would be those sequences in the human genome with no homologs at all in the chimp genome. A deletion of even one base in a single-copy sequence in either the chimp or human lineage, for example.

But such events are rarer than mutations, so won't add too much additional error. Despite all these sources of error, the results for the 30-mer test agree well with the probability that a 30-nucleotide-long homologous sequence in humans and chimps experiences no mutation.

Henry J · 28 September 2010

Ryan Cunningham, posted 9/28/10 12:58 PM Why stop at 30-mers? How about comparing 100,000-mers? My genome wouldn’t even be 99% similar to my own parents at that level, which means I can’t possibly be related to them. Clearly I am the result of immaculate conception, so I’m the new messiah. And I say all Christians should believe evolution is true. Game. Set. Match.

Inconceivable!!!111!!!eleven!!!!

Frank J · 28 September 2010

Apologies if this was answered already, here or at UcD:

I know that the "% difference" arguments are always sought and fabricated to promote doubt that "RM + NS" can cause the necessary changes in any imaginable length of time. But was there any mention of whether the supposed alternative occurred "in vivo," or required separate origin (abiogenesis) of both lineages? The reason I ask is that the DI's own Michael Behe was quite clear that the alternative is an "in vivo" process, meaning that the 2 lineages shared common ancestors regardless of whether "RM + NS" produces the differences.

Henry J · 28 September 2010

As for the second law arguments, the fact that seeds grow into adult plants, and eggs into adult animals, stops that argument in its tracks, for anybody willing to think.

Evolution is after all a side effect of growth, and any energy associated with it is a tiny fraction of the energy expended for growth.

So that case is closed. (And has been for many decades. Or do I mean centuries?)

Henry J

John Harshman · 28 September 2010

Reed A. Cartwright said: But such events are rarer than mutations, so wont add to much additional error.

While that seems to be the case here, it isn't necessarily true, depending on your value of "much". If, for example, the entire 4% of non-homologous sequences were caused by deletion, then 4% of comparisons would come up negative, prior to throwing out sequences differing by point mutations. The question would then be whether you considered turning 69% similarity into 65% similarity "much". For the record, I consider Roy Britten's measure of genomic similarity (gaps counted as differences) silly too, though not as silly as this one.
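A rough sketch of that adjustment. It assumes, purely for illustration, that a probe has a 96% chance of having a homolog at all and that homologous 30-base chunks match perfectly with probability 0.9877 to the 30th power; neither number is a measurement made here, and the function name is made up.

```python
# Expected fraction of probes with a perfect hit when only some probes have
# a homolog and the rest can never match (illustrative assumption).
def expected_probe_hits(homologous_fraction: float, per_base_divergence: float,
                        k: int) -> float:
    return homologous_fraction * (1.0 - per_base_divergence) ** k


without_loss = expected_probe_hits(1.00, 0.0123, 30)
with_loss = expected_probe_hits(0.96, 0.0123, 30)

print(f"every probe has a homolog: {without_loss:.3f}")
print(f"4% of probes have none:    {with_loss:.3f}")
```

Under these assumptions the expected score drops from about 69% to the mid-60s, the same neighborhood as the 65% mentioned above.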

Mike Elzinga · 28 September 2010

Henry J said: As for the second law arguments, the fact that seeds grow into adult plants, and eggs into adult animals, stops that argument in its tracks, for anybody willing to think. Evolution is after all a side effect of growth, and any energy associated with it is a tiny fraction of the energy expended for growth. So that case is closed. (And has been for many decades. Or do I mean centuries?) Henry J

Here is part of a video of Thomas Kindell reciting almost word-for-word the arguments out of Henry Morris and Gary Parker’s book What is Creation science? He has even lifted the same graphics out of that book. The conflict narrative is illustrated quite well here. Henry Morris’s crap can still be accessed over on the ICR and AiG websites. It’s still the same crap it has always been; but they keep recycling it.

Mike Elzinga · 28 September 2010

Here is more of that video of Thomas Kindell

I think there are other links there that get the rest of that entire talk.

Joe Felsenstein · 28 September 2010

John Harshman said: I think you're overestimating the probability of a match here. Your calculation would be correct if niwrad were comparing homologous sequences. But he isn't. Instead he's using one 30-base human sequence as a probe into an entire chimp chromosome. If that 30 bases is repeated anywhere in the chromosome, he counts as a match. If there is no exact repeat, no match. About 4% of human sequences are not orthologous to any chimp sequence. I think it's reasonable to suppose that, if there is a related sequence somewhere in the chimp genome, it would be expected to be less similar to the human probe than would a hypothetical orthologous sequence. Then again, many of these sequences would be repetitive, giving us extra chances at a match. In fact, it's hard to figure out the true expectation of this pointless distance measure.

As you note, we may also be underestimating the probability of a match if there are (say) trinucleotide repeats. For simple single-copy stuff I would think that the chances are good that the only close match of a 30-base piece would be the homologous copy. If so, in effect this is just one of those alignment-free analyses. It does have a bias downward in that if there are enough changes in the 30, the match may not be detected.

John_S · 28 September 2010

IBelieveInGod said: We do all have a common Creator:)

Or creators, or creators of creators of creators. First Argument: We're not really genetically similar to chimps. Fall-Back Argument: OK, we're genetically similar; but that's because we were all designed by the same god.

Matt Ackerman · 28 September 2010

I'm just checking, but everyone here realizes that niwrad is clearly trolling Uncommon Descent by posting transparently false analysis and seeing just how many creationists defend his (purposeful) stupidity, right? I mean, seriously, do you think that creationists are as inventive as niwrad? This is original research he did. Creationists rarely (if ever) look at the data, no matter how perversely. Additionally, his name is Darwin spelled backward. niwraD does not believe what he posted. I will bet money on it. It's a good trolling effort, but I don't really think we should be wasting our time talking about how stupid it is. It is meant to be stupid. That is the whole point.

ben · 28 September 2010

IBelieveInGod said:
GEORGE said: Not only are they misunderstanding this data, but they ignore the fact that no matter how close our genomes are to each other, 50 or 60 or 99%, it still means we're... related. How closely or how distantly is hardly the point as far as proving whether or not we are related.
We do all have a common Creator:)

Spare us your blatherings about your Flying Spaghetti Monster. That was who you were referring to, right?

Matt Ackerman · 28 September 2010

In other news, bornagain77 has a fairly coherent criticism of the 99% number, in that the 99% number ignores all structural variation. Any sequence which humans possess and chimps do not, or vice versa, is left out of the 99% calculation.

I believe that most of you can see why stating the average similarity of the portion of the DNA which is most similar might be interpreted as slightly misleading by some.

Now, I don't think there is anything wrong with stating the 99% number; it is a good estimate of substitution rates. Nevertheless, SNPs don't make up all the genetic variation in the world.

Leszek · 28 September 2010

I am with Dave Ackerman in calling Poes on this one.

The work firewall won't let me get back there, but if you go down the comments you will find one post where he says that changing the number of pairs closer to 1 will give you the same results as the scientifically presented 1.2%.

I found that comment odd at the time because it basically admits it's all bunk without saying it.

It could still be really just a creationist doing what creationists do best, but my vote is that it's a POE. Provisional of course. It's so hard to tell with them.

Mike Elzinga · 28 September 2010

Matt Ackerman said: I'm just checking, but everyone here realizes that niwrad is clearly trolling Uncommon Descent by posting transparently false analysis and seeing just how many creationists defend his (purposeful) stupidity, right?

I don’t know how Uncommonly Dense chooses those who post subjects over there, but it seems you are suggesting the UD is running this crap up the flagpole and watching who salutes. That may not be entirely out of the question; but I wonder. Is this niwrad so clever that he has even duped the UD crew into thinking he is one of theirs?

Matt Ackerman · 28 September 2010

Of course, what may be more interesting is that the 99% figure really is misleading.

As bornagain77 points out, the 99% figure is a comparison of the portions of the genome which are homologous. In other words, of the portion of the genome which is clearly the same, what percentage of the genome is the same? The 99% figure is meant to be an estimate of the rate of single nucleotide substitutions, and as such is a fine metric. However, much of the genetic variation between humans and chimps occurs as structural variation. In other words, genes are duplicated or deleted, and all of the genes which are unique to humans or unique to chimps are left out of this calculation.

Again, this is fine, if what you want to do is estimate rates of single nucleotide substitutions. But there are such events as gene duplications and gene deletions, and I think we ought to be aware that there have been a lot of duplications and deletions between humans and chimps, which argues that duplications and deletions may just be an important form of sequence evolution. Google AMYLASE some time.

Matt Ackerman · 28 September 2010

Leszek said: I am with Dave Ackerman in calling Poes on this one.

Dave Ackerman is my brother, do you happen to know him?

IBelieveInGod · 28 September 2010

http://www.ncbi.nlm.nih.gov/pubmed/15716009

Mike Elzinga · 28 September 2010

Leszek said: It could still be really just a creationist doing what creationists do best, but my vote is that it's a POE. Provisional of course. It's so hard to tell with them.

There is a fairly robust way of checking. ID/creationists love plug-and-chug because they think it makes them smart. Look at their understanding of concepts instead. No amount of plug-and-chug gets around misconceptions and misrepresentations.

Mike Elzinga · 28 September 2010

IBelieveInGod said: http://www.ncbi.nlm.nih.gov/pubmed/15716009

Instead of bombing this site with your quote-mining and copy/paste arguments, why don’t you instead show us your level of conceptual understanding of science? I have a paper you can go through where everyone here can look on at the same time and actually evaluate objectively and out in the open whether you understand the concepts. And it is a paper from one of your own. Why don’t you explain and justify for us the concepts there? We will know instantly how you are doing. The last troll like you ran away or tried to change the subject. We think you will do the same.

Stanton · 28 September 2010

Mike Elzinga said:
IBelieveInGod said: http://www.ncbi.nlm.nih.gov/pubmed/15716009
Instead of bombing this site with your quote-mining and copy/paste arguments, why don’t you instead show us your level of conceptual understanding of science? I have a paper you can go through where everyone here can look on at the same time and actually evaluate objectively and out in the open whether you understand the concepts. And it is a paper from one of your own. Why don’t you explain and justify for us the concepts there? We will know instantly how you are doing. The last troll like you ran away or tried to change the subject. We think you will do the same.

IBelieve will run away and try to change the subject, too. In fact, it was why he was banished to the Bathroom Wall in the first place. Do also remember that IBelieve is a Creationist who always insists that Evolution is false, that science is actually denying God, and that wanting to teach science to children, instead of religious propaganda, in a science classroom, is exactly the same as wanting to herd theists into gas chambers. And IBelieve hates the Truth, too, as he will deny anything and everyone that runs contrary to his inane claims, including any previous inane claims.

John Vanko · 28 September 2010

This comment has been moved to The Bathroom Wall.

phhht · 28 September 2010

Poofster, Why is it always punish, punish, punish? Why is it never tolerate, accept, forgive unto the 240th generation?

IBelieveInGod said:
GEORGE said: Not only are they misunderstanding this data, but they ignore the fact that no matter how close our genomes are to each other, 50 or 60 or 99%, it still means we're... related. How closely or how distantly is hardly the point as far as proving whether or not we are related.
We do all have a common Creator:)

OgreMkV · 28 September 2010

This comment has been moved to The Bathroom Wall.

Divalent · 28 September 2010

On the off chance that no one else has yet pointed this out, we are even more different than you think. Every single chromosome on the chimp is non-identical to the one in human (including the two chimp chromosomes that make up Human chromosome #2). So we are 100% different. (Which should come as no surprise: after all, no intelligent adult person would actually mistake a chimp for a human.)

[BTW, this web site does not work with IE 8.0. comments don't thread right (no separate pages) and you can't post a comment with it.]

curiouslayman · 28 September 2010

I'm having trouble following the term 'one-base pair'. Could you set this post up with some illustrations of what exactly was being compared?

John Harshman · 28 September 2010

Matt Ackerman said: In other news, bornagain77 has a fairly coherent criticism of the 99% number, in that the 99% number ignores all structural variation. Any sequence which humans possess and chimps do not, or vice versa, is left out of the 99% calculation.

Sure, but if you count that extra variation in any reasonable way -- by which I mean counting mutations -- the numbers come out nearly the same. Structural variations, if by that you mean insertions, deletions, duplications, inversions, chromosomal fissions and fusions, etc., are much rarer than point mutations. The Chimp Genome consortium came up with a figure of 35 million point mutations separating chimps and humans, or 1.23% of a 3 Gb haploid genome. They also estimated 5 million indels, which is pretty small compared to 35 million. The only way you can make that into something big is by counting a 10,000 base indel as 10,000 differences; but it's a single mutation, so the proper metric would count it so. (They ignore even rarer events like inversions, but there are at most a few hundred of them anyway.) Counting every 10,000 base indel as 10,000 differences is how Roy Britten came up with a 5% difference between humans and chimps. I think that's silly, but it's less silly if you use the same metric to compare other species. As many others have said, however you do the comparison, as long as you use the same metric, humans are closer to chimps than to anything else.
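To make the two ways of counting concrete, here is a toy sketch of the difference between a per-event metric (each mutation counts once) and a per-base metric (each affected base counts). The event list, the alignment length, and both function names are invented for illustration; they are not taken from the chimp genome paper.

```python
# Toy data, entirely hypothetical: each mutation event is (kind, bases affected).
events = [("substitution", 1)] * 7 + [("indel", 10_000)]
aligned_length = 1_000_000  # hypothetical number of compared bases

def per_event_difference(events, length):
    """Count each mutation once, however many bases it touches."""
    return len(events) / length

def per_base_difference(events, length):
    """Count every affected base, so one long indel dominates the total."""
    return sum(size for _, size in events) / length

if __name__ == "__main__":
    print(f"per event: {per_event_difference(events, aligned_length):.6%}")
    print(f"per base:  {per_base_difference(events, aligned_length):.6%}")
```

In this toy case a single 10,000-base indel swamps the seven substitutions under the per-base metric while adding only one event under the per-event metric. Either metric can be used for comparisons between species as long as it is applied consistently.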

John Harshman · 28 September 2010

It isn't one-base pair; it's one base-pair. A base is one "letter" of a DNA sequence; "pair" because DNA is a double helix, and each base is paired with another on the opposite strand. In a sequence like ACCGATACGTA, each of those letters is a base. It would be paired with (reading in the same direction for convenience) a sequence of TGGCTATGCAT in the opposite strand. Each of those sequences is 11 bases (or base-pairs, if you like) long.
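As an illustration of the pairing rule just described (A with T, C with G), a few lines of Python can produce the opposite strand for the example sequence. The function name `complement` is just a label chosen here, and the result is read in the same direction as the input, as in the comment, rather than reversed.

```python
# Pairing rule: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq):
    """Return the paired bases on the opposite strand, read in the same direction."""
    return seq.upper().translate(COMPLEMENT)

if __name__ == "__main__":
    strand = "ACCGATACGTA"
    print(strand, len(strand))   # ACCGATACGTA 11
    print(complement(strand))    # TGGCTATGCAT
```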

Mike Elzinga · 28 September 2010

Divalent said: [BTW, this web site does not work with IE 8.0. comments don't thread right (no separate pages) and you can't post a comment with it.]

I'm running IE 8.0; and it seems to work just fine.

Leszek · 28 September 2010

Matt Ackerman said:
Leszek said: I am with Dave Ackerman in calling Poes on this one.
Dave Ackerman is my brother, do you happen to know him?

No actually, I just misread your name somehow.

Michael Roberts · 29 September 2010

RBH said:
IBelieveInGod said: We do all have a common Creator:)
Who made it come out exactly as though evolution were true. Amazing, that. And there are data to suggest that if there were a creator, there were probably many creators, not just one.

Obviously God did! Now I hold that being both a Christian and a dreaded evilutionist:)

TomS · 29 September 2010

However one calculates the percent difference or similarity between humans and chimps, how does that compare with the similarity/difference between chimps and gorillas, or between sheep and goats, or between horses and donkeys?

One might be able to account for there being similarities by appealing to a common designer - a designer working with limitations imposed by the materials being used, or a designer having similar goals for the products - but the interesting thing is that the similarities and differences fall into an overall pattern, known as the "nested hierarchy" or the "tree of life". To the best of my knowledge, no one has ever suggested an explanation of a nested hierarchy which does not involve common descent with modification. (That is, not only for the biological tree, but also for a similar pattern among languages and a similar pattern among manuscript traditions.)

This is a pattern in the world of life which is extremely complex and makes predictions about what will be discovered, and thus cannot be accounted for by "chance" alone. If it is "designed", it is designed to look like common descent with modification.

Ron Okimoto · 29 September 2010

My take is that who would do all the work to produce utter bullshit? My guess is that the author knew that the methodology was bogus, but that he thought that it could be used to confuse the rubes. He didn't even know what he was comparing his data to. Could anyone that would take the time to do this type of senseless analysis be that clueless? It could be giving the guy too much credit, but he likely knew the results would be bogus.

What would be the point of trying something like this when you could just take known genes and compare them? Heck you can take whole BAC contigs and compare them and get the deletions and insertions too.

Ron Okimoto · 29 September 2010

TomS said: However one calculates the percent difference or similarity between humans and chimps, how does that compare with the similarity/difference between chimps and gorillas, or between sheep and goats, or between horses and donkeys? One might be able to account for there being similarities by appealing to a common designer - a designer working with limitations imposed by the materials being used, or a designer having similar goals for the products - but the interesting thing is that the similarities and differences fall into an overall pattern, known as the "nested hierarchy" or the "tree of life". To the best of my knowledge, no one has ever suggested an explanation of a nested hierarchy which does not involve common descent with modification. (That is, not only for the biological tree, but also for a similar pattern among languages and a similar pattern among manuscript traditions.) This is a pattern in the world of life which is extremely complex and makes predictions about what will be discovered, and thus cannot be accounted for by "chance" alone. If it is "designed", it is designed to look like common descent with modification.

Most of the ones that are not utterly clueless understand that the type of similarity observed in the DNA of related organisms indicates an order of creation, but a lot of them can't admit that. They can't deal with the fact that the order is what is expected from biological evolution. Guys like Behe and Denton admit the fact of common descent or descent with modification, but for some reason the majority of naysayers can't bring themselves to face reality. They will quote guys like Yockey about the improbability of making a protein sequence, but won't listen to them when they also say that the variation among those proteins is not random and such a pattern among species would be even more unlikely to occur by chance than the simple nonbiological probability estimates for getting a protein sequence. The nested similarity is what they can't explain and is the biggest single reason that they are wrong. There is a project to sequence the genomes of 10,000 species. What is the naysayers' prediction about what will be discovered? Real scientists know that we will get a much better idea of the evolutionary relationships between taxa. The naysayers will just have more to deny.

Joe Felsenstein · 29 September 2010

TomS said: One might be able to account for there being similarities by appealing to a common designer - a designer working with limitations imposed by the materials being used, or a designer having similar goals for the products - but the interesting thing is that the similarities and differences fall into an overall pattern, known as the "nested hierarchy" or the "tree of life". To the best of my knowledge, no one has ever suggested an explanation of a nested hierarchy which does not involve common descent with modification. (That is, not only for the biological tree, but also for a similar pattern among languages and a similar pattern among manuscript traditions.) This is a pattern in the world of life which is extremely complex and makes predictions about what will be discovered, and thus cannot be accounted for by "chance" alone. If it is "designed", it is designed to look like common descent with modification.

My understanding is that Linnaeus, back in the 1750s, thought that the hierarchical pattern of groups within groups was the pattern laid down by the creator. Linnaeus was not a modest man, and was pleased that he, the great Linnaeus, had perceived this pattern (though he was not the first to come up with a hierarchical pattern). I think later in life he began to waver and think that maybe some of the species had a common origin, and had evolved, though he did not publicize this. As to whether that constitutes an explanation, I don't think most of us would say that it does. The problem with "common design" as a scientific explanation is that if the designer is all-knowing and all-powerful, then she can make any pattern whatsoever, including hierarchical groups or nonhierarchical patterns, and if we can't know her motives there is no prediction there at all. So "common design" is a non-starter as a scientific theory.

DS · 29 September 2010

TomS wrote:

"One might be able to account for there being similarities by appealing to a common designer - a designer working with limitations imposed by the materials being used, or a designer having similar goals for the products - but the interesting thing is that the similarities and differences fall into an overall pattern, known as the “nested hierarchy” or the “tree of life”. To the best of my knowledge, no one has ever suggested an explanation of a nested hierarchy which does not involve common descent with modification. (That is, not only for the biological tree, but also for a similar pattern among languages and a similar pattern among manuscript traditions.)"

Quite true, however, it's even worse than that. Even if you come up with some post hoc rationalization for the nested hierarchy of genetic similarities, you will never be able to explain why there is also a nested hierarchy of SINE insertions. These are genetic mistakes. We know the mechanisms by which they occur and their relative and absolute rates. We know that they cause death and disease and we know that they are not going to ever be useful in any way. There is also no known mechanism whereby they can be reversed, so they persist, even through speciation events. This means that they are perfect phylogenetic markers. And the nested hierarchy of SINE insertions also happens to be exactly the same as the nested hierarchy produced by other types of genetic data. So, if you cling to the idea of some kind of intelligent designer, despite all of the evidence to the contrary, you still can't answer the question - why did she copy the mistakes?

harold · 29 September 2010

IBIG the creationist posted this link -

http://www.ncbi.nlm.nih.gov/pubmed/15716009

IBIG seemed to think that the title supported a vast amount of genetic difference between humans and chimps.

However, let's look at the actual abstract that the link leads to (emphasis mine) -

"Abstract
The chimpanzee is our closest living relative. The morphological differences between the two species are so large that there is no problem in distinguishing between them. However, the nucleotide difference between the two species is surprisingly small. The early genome comparison by DNA hybridization techniques suggested a nucleotide difference of 1-2%. Recently, direct nucleotide sequencing confirmed this estimate. These findings generated the common belief that the human is extremely close to the chimpanzee at the genetic level. However, if one looks at proteins, which are mainly responsible for phenotypic differences, the picture is quite different, and about 80% of proteins are different between the two species. Still, the number of proteins responsible for the phenotypic differences may be smaller since not all genes are directly responsible for phenotypic characters."

I find this abstract somewhat imprecise - I'd have to read the entire article to understand what metric the authors are using to generate the "80% of proteins are different" statement.

Nevertheless, it is obvious to any reader that the abstract itself merely makes points that are within mainstream science. We all know that humans and chimpanzees share very recent common ancestry, we all know that humans and chimpanzees have similar genomes, and we all know that humans and chimpanzees have many phenotypic differences. There's nothing to support creationism here.

DS · 29 September 2010

harold said: IBIG the creationist posted this link - http://www.ncbi.nlm.nih.gov/pubmed/15716009 IBIG seemed to think that the title supported a vast amount of genetic difference between humans and chimps. Nevertheless, it is obvious to any reader that the abstract itself merely makes points that are within mainstream science. We all know that humans and chimpanzees share very recent common ancestry, we all know that humans and chimpanzees have similar genomes, and we all know that humans and chimpanzees have many phenotypic differences. There's nothing to support creationism here.

Do you mean to say that a creationist posted something after only reading the title and that it really proved just the opposite of what he claimed? Shocking! Next thing you know he will be cutting and pasting whole sections of the bible without understanding them either.

harold · 29 September 2010

Getting back to niwrad and his error, I am going to even further clarify what - as far as I can understand - he is doing wrong. This will be consistent with my first post but may be more clear to those who lack biology or basic probability backgrounds.

Let's start by looking at what it means when we say that humans and chimpanzees have 98-99% of nucleotides in common.

Humans and chimpanzees share many genes in common. In fact they have very similar chromosome structures and very similar sized genomes, for that matter.

What this means is that it is often very easy to assign a positional identity to a given nucleotide in the human genome and look at the equivalent nucleotide in the chimpanzee genome, or vice versa. Some judgment is necessary for this process but there are widely accepted and reasonable ways of doing it. This can be done for much less related species, in fact, but it is especially easy to do when species are as close as humans and chimpanzees.

(There are now-antiquated methods of comparing sequence similarity between strands of DNA that don't even involve direct sequencing, and work via hybridization, but I won't bother to discuss that now.)

Since there are four DNA purine and pyrimidine bases, if the human and chimpanzee genomes happened to be similar in size and chromosome organization by coincidence, we would only expect 25% or so of the nucleotides at given definable positions to be the same between the human and chimp genome. (This is basically true even though the four bases do not occur at exactly equal frequency.)
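The "25% or so" figure, and the remark that unequal base frequencies barely change it, can be checked with a short calculation. The sketch below assumes that the two genomes draw bases independently from the same composition, so the chance of agreement at a position is the sum of the squared base frequencies. The skewed frequencies are illustrative values, not measured ones.

```python
def chance_identity(freqs):
    """Probability that two independently drawn bases agree: sum of p_i squared."""
    return sum(p * p for p in freqs.values())

# Equal base frequencies.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
# Illustrative AT-rich composition (hypothetical values).
skewed = {"A": 0.30, "C": 0.20, "G": 0.20, "T": 0.30}

if __name__ == "__main__":
    print(f"uniform: {chance_identity(uniform):.3f}")
    print(f"skewed:  {chance_identity(skewed):.3f}")
```

With equal frequencies the chance agreement is exactly one quarter, and a moderately skewed composition raises it only slightly, which is far below the 98-99% actually observed.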

What we actually see is that, if we look at the human nucleotide at a given definable location, we find the same nucleotide at the same position in the chimpanzee 98-99% of the time. This is mildly oversimplified but essentially what claims of similar nucleotide sequence mean (obviously).

No evidence can ever "rule out" magical "common design" by an arbitrary trickster deity who would mimic evidence for common descent. However, we can note that common descent would be poorly supported if humans and chimpanzees were extremely different at the genetic level, as all other lines of evidence show them to be closely related. In fact, the new genetic sequencing data, which could have raised doubts, is convergent with all other evidence and adds to the support for common descent.

However, an extraordinarily asinine way of attempting to generate a lower number for nucleotide sequence similarity between humans and chimps would be to play an "apples to oranges" game, and compare, instead of individual nucleotide positions, segments of nucleotides at given positions, calling them the same only if the entire segment is identical. This is apparently what niwrad did. At least that's how it's explained in the post above, and that is consistent with the results he got. If the matching is done correctly - comparing thirty human nucleotide identities to the thirty nucleotide identities at legitimately the same position in the chimp genome - and the definition of "the same sequence" is changed to "segments of thirty must be identical", then, as was pointed out, only (.9877)^30 "identity" would be expected.

This is extremely obvious.

I'm not sure why this creationist boob stopped at 30. He could have generated an even lower number by using a longer arbitrary segment length (this would also be a good way of demonstrating his error even to a stubborn person - one could show that changing the arbitrary segment length analyzed had the predicted effect). He could also have generated even lower numbers by doing an apples to oranges trick and comparing non-matched positions.
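The effect of the arbitrary segment length is easy to tabulate. The following is a minimal sketch, assuming each base matches independently with probability 0.9877; the function names and the list of lengths are choices made here for illustration. The second function runs the argument backwards, recovering the per-base identity implied by an observed perfect-match fraction.

```python
def expected_perfect_match(identity, length):
    """Chance that `length` independent positions all match."""
    return identity ** length

def implied_identity(match_fraction, length):
    """Per-base identity implied by an observed perfect-match fraction."""
    return match_fraction ** (1.0 / length)

if __name__ == "__main__":
    for length in (1, 2, 10, 30, 100, 300):
        print(length, round(expected_perfect_match(0.9877, length), 4))
    # Going the other way from the roughly 65% median reported for 30-base chunks:
    print(round(implied_identity(0.65, 30), 4))
```

For a length of 30 this gives about 0.69, the figure in the original post, and the expected match fraction keeps falling as the segment gets longer even though the per-base identity never changes.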

All such nonsense would be equally worthless. Given the obviousness of the error, it's unclear to me how much self-deception, versus desire to deceive others, is a factor.

John · 29 September 2010

That paper is mentioned by several creationists who show no indications of having actually read it. It was brought up by a guy on youtube so many times that I tracked down a copy of it - Eighty percent of proteins are different between humans and chimpanzees (PDF).

Anyone who actually cares to look at it will quickly realize it says nothing like creationists say it does. 80% of proteins are different in that they are not identical (which 20% are), but those that are different mostly only differ by one or two amino acids, highly consistent with inheritance and divergence from a common ancestor. As Table 4 in the paper there shows, the vast majority of proteins have sequence identities between 98-100%.

Robin · 29 September 2010

IBelieveInGod said:
GEORGE said: Not only are they misunderstanding this data, but they ignore the fact that no matter how close our genomes are to each other, 50 or 60 or 99%, it still means we're... related. How closely or how distantly is hardly the point as far as proving whether or not we are related.
We do all have a common Creator:)

Well then, apparently your common creator created all life from a single-celled organism... ...through the process of evolution.

Mike Elzinga · 29 September 2010

Ron Okimoto said: My take is that who would do all the work to produce utter bullshit? My guess is that the author knew that the methodology was bogus, but that he thought that it could be used to confuse the rubes. He didn't even know what he was comparing his data to. Could anyone that would take the time to do this type of senseless analysis be that clueless? It could be giving the guy too much credit, but he likely knew the results would be bogus. What would be the point of trying something like this when you could just take known genes and compare them? Heck you can take whole BAC contigs and compare them and get the deletions and insertions too.

As far back as I can remember these tactics of the creationists (and that goes back into the 1970s at least), the leaders have been deliberately cranking out bullshit about every scientific concept they can get their grubby hands on. I am pretty sure they know exactly what they are doing. I think it is partly taunting; they want to have debates with real scientists. But I also think that they don’t mind having the bullshit go unchallenged. It is the pseudo-science they have invented to prop up their sectarian dogma and make it “superior” to all others. The latest piece of crap along this line is an 18-page article by Jason Lisle over at AiG that purports to solve the “distant starlight problem” for the YECs. In that article, Lisle makes up a bunch of crap on reference frames and then claims it is special relativity. It isn’t even close to special relativity; but the rubes will never know. Given Lisle’s gift for glib bullshitting, I would bet he is hoping that some scientist with a big name will take him on and give him the boost in reputation that will make him a superstar in the wacky world of YECdom. The best approach these days, now that there is so much ID/creationist crap on line and in books, is for nobodies in the science community to take these IDiots down hard without confronting them directly. It is easy these days to attribute all this crap to its authors, so there is no need to give these authors any face time.

mrg · 29 September 2010

John said: Anyone who actually cares to look at it will quickly realize it says nothing like creationists say it does. 80% of proteins are different in that they are not identical (which 20% are), but those that are different mostly only differ by one or two amino acids, highly consistent with inheritance and divergence from a common ancestor.

The statement in the paper is logically on the lines of: "One percent of the buildings in the region were damaged by the hurricane. 80% of the towns there reported damage."
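The analogy can be put into numbers with a small sketch. It assumes, purely for illustration, that damage strikes each building independently and that every town has the same number of buildings; the function names are invented here.

```python
import math

def fraction_of_groups_hit(per_item_rate, group_size):
    """Chance that a group of `group_size` items contains at least one affected item."""
    return 1 - (1 - per_item_rate) ** group_size

def group_size_for(per_item_rate, group_fraction):
    """Group size at which `group_fraction` of groups contain an affected item."""
    return math.log(1 - group_fraction) / math.log(1 - per_item_rate)

if __name__ == "__main__":
    # 1% of buildings damaged; how large must towns be for 80% to report damage?
    print(round(group_size_for(0.01, 0.80)))
    print(round(fraction_of_groups_hit(0.01, 160), 3))
```

Under these assumptions, towns of roughly 160 buildings are enough for 80% of them to report damage when only 1% of buildings are hit. The same arithmetic applies with amino acids in place of buildings and proteins in place of towns.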

harold · 29 September 2010

John -

Anyone who actually cares to look at it will quickly realize it says nothing like creationists say it does. 80% of proteins are different in that they are not identical (which 20% are), but those that are different mostly only differ by one or two amino acids, highly consistent with inheritance and divergence from a common ancestor. As Table 4 in the paper there shows, the vast majority of proteins have sequence identities between 98-100%.

Another example of outrageously dishonest misrepresentation and quote mining by creationists. Even reading the abstract made it obvious that the authors were making a completely mainstream claim.

The work this paper summarizes is of value. Although there's much to be learned about genetics, over the last twenty years, knowledge of genetics and genomics has arguably outpaced everything else. Proteins have taken a back seat for a while, but deserve a lot of study.

The actual data is not surprising. Codons in genes should be at least (.9877)^3, or about 96 to 97%, identical between humans and chimpanzees; actually more due to selection. And different codons may code for the same amino acid. That may have a very subtle phenotypic impact - if the mRNA strand has to "wait for" a rarer tRNA, subtly different folding or similar things may occur, for example - but it won't change the amino acid sequence.

So let's say that if 98.77% of single nucleotides are the same, at least 97% of codons are coding for the same amino acid. And that's an absurdly conservative estimate, ignoring synonymous codons and not accounting for selection.

As it happens, (.97)^~52 = .2 (creationists - I used something called "logarithms" to figure that out quickly - don't waste your time trying to understand). Now, a 52 AA protein is rather short, but we get a surprisingly reasonable number even using that excessively conservative estimate. And we have to remember that larger proteins often contain repeated domains that are coded for by the same nucleotide sequence.

If we were to set the length of the "average" protein at 100 AA, then a 20% rate of protein sequence identity would imply around a 98.4% identity at individual amino acid positions. That's highly reasonable, especially given that selection would more likely act against point mutations that change AA sequence than for them.

(*Presumably some mutations that change AA sequence were selected for, some of them point mutations, and that's a big part of how we became human.*) So in fact, the "80% of proteins different" metric is nicely within the range of what we would predict from the genome sequence data.
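The arithmetic in this comment can be reproduced in a few lines. This is a sketch under the comment's own simplifying assumptions (independent sites, a flat per-codon identity of 0.97, a notional 100 AA "average" protein); the variable names are chosen here for illustration.

```python
import math

base_identity = 0.9877

# Floor on codon identity if all three bases must match: roughly 0.96.
codon_identity_floor = base_identity ** 3

# Protein length at which only 20% of proteins are fully identical,
# given a per-residue identity of 0.97: roughly 53 residues.
length_for_20_percent = math.log(0.2) / math.log(0.97)

# Per-residue identity implied by 20% fully identical proteins
# of 100 residues each: roughly 0.984.
per_residue_identity_100aa = 0.2 ** (1 / 100)

if __name__ == "__main__":
    print(f"codon identity floor:       {codon_identity_floor:.4f}")
    print(f"length giving 20% identity: {length_for_20_percent:.1f}")
    print(f"per-residue identity, 100:  {per_residue_identity_100aa:.4f}")
```

The logarithm step simply solves 0.97^n = 0.2 for n, and the last line inverts the same relationship for a fixed length of 100.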

Flint · 29 September 2010

What I take away from all this is how very little total change at the level of the genes can result in such large morphological changes. However we measure (or even misrepresent that measure) of genetic differences between humans and chimps, we can't help but notice that chimps are quite drastically different every which way.

I sometimes wonder what would be the minimum genetic difference necessary to cause the maximum morphological difference. Might it be in a gene directing an important stage of development?

eric · 29 September 2010

Flint said: I sometimes wonder what would be the minimum genetic difference necessary to cause the maximum morphological difference.

Hmmm...maybe some sort of hox-like gene, i.e. a gene that controls the expression of other genes? I'm thinking segments on millipedes here: there's probably one gene that functions as a "repeat what you just did N times" command for building segments. Modify that gene to read 10N. :)

Harold said: Another example of outrageously dishonest misrepresentation and quote mining by creationists.

IBIG regularly lifts entire sections from AIG, creationist authors, and similar sources. Lately he's gotten better at citing where he got his material from, but my guess is he lifted this link from some creationist site without even bothering to see what it said.

taxmania · 29 September 2010

John said: That paper is mentioned by several creationists who show no indications of having actually read it. It was brought up by a guy on youtube so many times that I tracked down a copy of it - Eighty percent of proteins are different between humans and chimpanzees (PDF). Anyone who actually cares to look at it will quickly realize it says nothing like creationists say it does. 80% of proteins are different in that they are not identical (which 20% are), but those that are different mostly only differ by one or two amino acids, highly consistent with inheritance and divergence from a common ancestor. As Table 4 in the paper there shows, the vast majority of proteins have sequence identities between 98-100%.

I personally agree with what John and the authors of this paper present: minor variations in 80% of the proteins would certainly not be unexpected. Still, we should be careful not to overstate what the paper and Table 4 say about the extent of variation within the proteins. Perhaps a minor point, but Table 4 doesn't show that "the vast majority of proteins have sequence identities between 98-100%". Instead, it shows the frequency distribution of different functional classes of proteins across the 100%, 99%, 98%, and below-98% amino acid identity categories. No information about the portion below 98% is given. I would be surprised if a large portion of proteins had that much variation, but the paper doesn't really address that.

John Harshman · 29 September 2010

If I recall, the mean sequence identity between chimp and human protein-coding sequences is 99.5%, and while the mean doesn't necessarily tell us anything about "the vast majority", I bet the variance isn't all that great. Very few protein-coding sequences will be less than 98% similar. This makes sense if most evolution is neutral (as it almost certainly is). 98.7% similarity is what you get at the neutral rate, and any sequences being conserved by selection, as most protein-coding sequences are, will be more similar than that.

Matt Ackerman · 29 September 2010

Flint said: What I take away from all this is how very little total change at the level of the genes can result in such large morphological changes.

And this is why I don't like the 99% figure. 5% of the DNA between humans and chimps is 100% different, and the remaining 95% is 1% different. It seems likely that most of the phenotypic differences between humans and chimps arise from the 5% of genes which are unique to humans or chimps, and that the 95% of genes that humans and chimps share are less responsible for phenotypic differences. 5% of genes being unique to humans or chimps is a fairly reasonable fraction of genes. That means that roughly 1 out of 40 genes present in humans is not present in chimps.

Matt Ackerman · 29 September 2010

Of course, I should say that all of these comparisons are not between 'humans' and 'chimps' but rather between James Watson, the human, and Clint, the chimp. James Watson differs from other humans, and Clint differs from other chimps, so almost any of these comparisons really overestimates the differences between humans and chimps, owing to an inability to infer ancestral states, arising from a sample size of one (OK, it can be slightly larger in some particular studies, but it won't be if you just compare 'the human *cough Watson* genome' to 'the chimp *cough Clint* genome').

Ron Okimoto · 29 September 2010

harold said: John -
Anyone who actually cares to look at it will quickly realize it says nothing like creationists say it does. 80% of proteins are different in that they are not identical (which 20% are), but those that are different mostly only differ by one or two amino acids, highly consistent with inheritance and divergence from a common ancestor. As Table 4 in the paper there shows, the vast majority of proteins have sequence identities between 98-100%
Another example of outrageously dishonest misrepresentation and quote mining by creationists. Even reading the abstract made it obvious that the authors were making a completely mainstream claim. The work this paper summarizes is of value. Although there's much to be learned about genetics, over the last twenty years, knowledge of genetics and genomics has arguably outpaced everything else. Proteins have taken a back seat for a while, but deserve a lot of study. The actual data is not surprising. Codons in genes should be at least (.9877)^3, or about 97%, identical between humans and chimpanzees; actually more due to selection. And different codons may code for the same amino acid. That may have a very subtle phenotypic impact - if the mRNA strand has to "wait for" a rarer tRNA, subtly different folding or similar things may occur, for example - but it won't change the amino acid sequence. So let's say that if 98.77% of single nucleotides are the same, at least 97% of codons are coding for the same amino acid. And that's an absurdly conservative estimate, ignoring synonymous codons and not accounting for selection. As it happens, (.97)^~52 = .2 (creationists - I used something called "logarithms" to figure that out quickly - don't waste your time trying to understand). Now, a 52 AA protein is rather short, but we get a surprisingly reasonable number even using that excessively conservative estimate. And we have to remember that larger proteins often contain repeated domains that are coded for by the same nucleotide sequence. If we were to set the length of the "average" protein at 100 AA, then a 20% rate of protein sequence identity would imply around 98.4% identity at individual amino acid positions. That's highly reasonable, especially given that selection would likely act more strongly against point mutations that change AA sequence than for them. (*Presumably some mutations that change AA sequence were selected for, some of them point mutations, and that's a big part of how we became human.*) So in fact, the "80% of proteins different" metric is nicely within the range of what we would predict from the genome sequence data.

It is simpler than that. Coding sequence differs less than noncoding sequence between chimps and humans, but it is still around 0.7% different. The average protein gene has a coding sequence of about 1000 base pairs, so you expect 7 nucleotide differences in the average gene. Around 2/3 of these differences would be expected to be amino acid changes just by chance, but there is something called selection (a lot of amino acid substitutions are likely detrimental). So instead of most of the differences changing an amino acid, the ratio is switched: there is usually around a 1:2 ratio of replacement to silent substitutions in genes, because a lot of the changes have been selected against. Even with the switch in ratios, it isn't surprising that 80% of the protein genes screened in the publication contained at least one replacement substitution. The difference in the coding sequence is still less than the differences found within the noncoding regions.
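
The estimate above can be turned into a rough probability. This is only a sketch: the Poisson model for the number of replacement substitutions per gene is an added assumption, not something stated in the comment, and the variable names are hypothetical. The inputs (1000 bp, 0.7% divergence, 1:2 replacement to silent) are the figures quoted above.

```python
import math

def prob_at_least_one(expected_count: float) -> float:
    """P(N >= 1) when N is Poisson-distributed with the given mean (assumed model)."""
    return 1.0 - math.exp(-expected_count)

coding_length_bp = 1000        # "average" coding sequence length from the comment
divergence = 0.007             # about 0.7% difference in coding sequence
replacement_share = 1 / 3      # 1:2 replacement to silent, i.e. one third replacements

expected_differences = coding_length_bp * divergence            # about 7 per gene
expected_replacements = expected_differences * replacement_share

# Chance that an average-length gene carries at least one amino acid change.
p_with_selection = prob_at_least_one(expected_replacements)

# For comparison: the "by chance" share of 2/3 replacements, with no selection.
p_without_selection = prob_at_least_one(expected_differences * 2 / 3)

print(f"expected differences per gene: {expected_differences:.1f}")
print(f"P(>=1 replacement), 1:2 ratio: {p_with_selection:.2f}")
print(f"P(>=1 replacement), 2:1 ratio: {p_without_selection:.2f}")
```

Under these assumptions an average-length gene has roughly a 90% chance of at least one replacement substitution. Real genes vary in length, and shorter ones are less likely to carry a change, so an observed figure of 80% is in the same ballpark.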

John Harshman · 30 September 2010

Matt Ackerman said: And this is why I don't like the 99% figure. 5% of the DNA between humans and chimps is 100% different, and the remaining 95% is 1% different. It seems likely that most of the phenotypic differences between humans and chimps arise from the 5% of genes which are unique to humans or chimps, and that the 95% of genes that humans and chimps share are less responsible for phenotypic differences.

Nope. You assume that 5% of DNA means 5% of genes, but almost all of that 5% is junk, mostly Alu repeats. Anyway, it's silly to count each base of that 5% as equivalent to a point mutation, since we can account for all of it by only (!) 5 million mutations, while there are 35 million point mutations.

Joe Felsenstein · 30 September 2010

Just a quick note to point out that the invalidity of the basic approach of comparing blocks of 30 bases, and counting them as different if there is even one base of mismatch, has still not been understood by "niwrad" over at UD. He is a bit troubled by the objections put forth by "CharlesJ", and says in their post #56:

Forgive me if I don’t understand what you mean in details. Nevertheless I agree with you that the results of the 30BMP test are not directly comparable to those in genomics literature. The 62% 30BPM similarity is not directly comparable with the 99% identity. We need a corrective coefficient. I agree with you also that such corrective coefficient differs depending on we do a 30BPM or 40BPM or 50BPM test … To understand this I argue according to what I did in #29. Given two supposed genomes that match 99% a 30BPM test gives 70% matches. Since the real test gave 62% my first idea to obtain a 30BPM value comparable to 99% is to apply the simple formula: 99×62/70 = 87.7%. In other words the multiplier coefficient that we must apply to the 62% is 99/70 = 1.41.

Actually you don't correct for block size that way. If there is an underlying similarity of p at the one-base level, the probability of a match when there are blocks of B bases is p raised to the Bth power. Call that Q. So to get back to p from Q you just raise Q to the 1/B power. Or use logs and get log(p) by dividing log(Q) by B. With the exception of a couple of pro-evolution commenters there, the rest of them still think "niwrad" has proven that the underlying difference between humans and chimps is nothing like 1%. But then "niwrad" said in the post that

Now, I don’t personally believe that humans and chimps share a common ancestry, for a host of reasons that would take me too long to explain in this post.

so "niwrad" is not accustomed to "getting" basic scientific facts.
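
The block-size correction described above (Q = p^B, so p = Q^(1/B)) can be sketched in a few lines, next to the linear rescaling that "niwrad" proposed. The function names are made up for illustration, and the calculation assumes mismatches fall independently at each base.

```python
def block_match_rate(per_base_identity: float, block_size: int) -> float:
    """Q = p**B: chance that all B bases of a block match."""
    return per_base_identity ** block_size

def per_base_identity_from_blocks(block_rate: float, block_size: int) -> float:
    """p = Q**(1/B): per-base identity implied by a perfect-block match rate."""
    return block_rate ** (1.0 / block_size)

def linear_rescale(observed_pct: float, expected_pct: float, target_pct: float = 99.0) -> float:
    """The 99 x 62 / 70 style 'corrective coefficient' (shown only for contrast)."""
    return target_pct * observed_pct / expected_pct

# Forward direction: 1.23% per-base difference, 30-base blocks.
q = block_match_rate(0.9877, 30)

# Inverse direction: what per-base identity does a 62% block match rate imply?
p_from_62 = per_base_identity_from_blocks(0.62, 30)

print(f"expected 30-base match rate at 98.77%: {q:.4f}")
print(f"per-base identity implied by 62%:      {p_from_62:.4f}")
print(f"linear rescaling of 62%:               {linear_rescale(62, 70):.1f}%")
```

The inverse formula puts a 62% block match rate at a per-base identity of about 98.4%, whereas the linear rescaling gives 87.7%.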

D. P. Robin · 30 September 2010

Michael Roberts said: What's new about an epic fail? Any argument put froward by ID or YEC is an epic fail. All are

You might later have thought that "froward" is a typo. Not so!

http://dictionary.reference.com/browse/froward says: fro·ward /ˈfroʊwərd, ˈfroʊərd/ Show Spelled[froh-werd, froh-erd] –adjective willfully contrary; not easily managed: to be worried about one's froward, intractable child.

Seems spot-on to me! dpr

DS · 30 September 2010

Oh come on. I call POE, again. Darwin spelled backwards! Correction factors so his made up metric can be compared to the way the real scientists do it? Come on, no one can be this stupid. He's just yanking chains, milking it for all it's worth. Has to be a POE.

As for humans and chimps not being related, I pointed out the evidence for that days ago. No amount of hand waving or soul searching is going to make that evidence go away. POE or not, this guy is just plain wrong. I do hope that he realizes that anyone dumb enough to fall for this nonsense probably won't have the sense to understand that they have been duped, even after he fesses up and explains to them that it was all a scam to make them look stupid. That should be worth a laugh, seeing how many of them become POE deniers.

harold · 30 September 2010

Proudly predicting creationist behavior since 1999. Earlier, I said -

(creationists - I used something called “logarithms” to figure that out quickly - don’t waste your time trying to understand)

Maybe someone thought I was being sarcastic. I wasn't. And indeed later, Joe Felsenstein noted this -

Creationist: "To understand this I argue according to what I did in #29. Given two supposed genomes that match 99% a 30BPM test gives 70% matches. Since the real test gave 62% my first idea to obtain a 30BPM value comparable to 99% is to apply the simple formula: 99×62/70 = 87.7%. In other words the multiplier coefficient that we must apply to the 62% is 99/70 = 1.41."
JF: "Actually you don’t correct for block size that way. If there is an underlying similarity of p at the one-base level, the probability of a match when there are blocks of B bases is p raised to the Bth power. Call that Q. So to get back to p from Q you just raise Q to the 1/B power. Or use logs and get log(p) by dividing log(Q) by B"

In fairness, the creationist quoted here did figure out that niwrad was wrong, but was incompetent to correct him. A key point here is that niwrad is not failing due to a lack of basic knowledge of biology. It isn't that he thinks the human and chimpanzee genomes are different because he has been fed false information about one of the genomes. He's failing at the level of basic logic and basic math. And declaring himself a genius for doing so. And no-one at UD, a site ostensibly run by a PhD in a statistics-related field, is able or willing to correct him. Ignorance of the facts is easily correctable. Psychological problems so severe that they make you deny basic math and logic are not easily correctable.

Matt Ackerman · 30 September 2010

Joe Felsenstein said: so "niwrad" is not accustomed to "getting" basic scientific facts.

Again, Darwin (aka niwraD) is clearly an educated person who thinks creationism is a load of crap, and is simply having one over on the folks at Uncommon Descent. If you ask me, it is in very poor taste to mock creationists in general for what niwraD is saying, since niwraD doesn't believe it, and he is trying to look stupid in order to make creationists look stupid.

harold · 30 September 2010

DS -

Oh come on. I call POE, again. Darwin spelled backwards! Correction factors so his made up metric can be compared to the way the real scientists do it? Come on, no one can be this stupid. He’s just yanking chains, milking it for all it’s worth. Has to be a POE.

"Poe's Law" refers to the general principle that religious extremists are so whacked out that those who try to parody them can't be distinguished from the real thing, and vice versa. This guy could be a parody of sorts, and the discussion so far would still be true (including the fact that anyone remotely familiar with genomic sequencing and/or very basic probability could correct him, and no-one at UD has done so). Bombastic, pompous names are common among narcissistic creationists, so a name that could imply "overturning Darwin" doesn't give me much information. My method of detecting possible parodies, which I believe produces better than random results, is as follows - Just as most racists today deny racism and speak in coded language, most creationists, for whatever reason, choose not to speak openly about the hellfire, brutal executions, obsession with and misinterpretation of the relatively tiny proportion of the Bible that talks about sex, and (in many cases) modern ethnic biases that drive them. They will make not-very-veiled threats ("You'll find out soon!") when provoked, but tend to dissemble away from these topics. Meanwhile, most parodists aren't interested in mimicking dissembling, weaseling, and so on, as that isn't much fun. They prefer to parody the "sinners in the hands of an angry God" type stuff of the past. So when I see someone saying something like "Evolutionists are sodomites who will burn in hell", especially without provocation, I think that there is a reasonable probability that it is a parody. When I see a lot of dissembling, squirming, and weaseling, even when challenged, I know I am dealing with a real creationist. As for this guy, the "great genius who easily overturns science with obviously incorrect math" is a common type of sincere creationist. So who can say for sure?

DS · 30 September 2010

harold said: DS -
Oh come on. I call POE, again. Darwin spelled backwards! Correction factors so his made up metric can be compared to the way the real scientists do it? Come on, no one can be this stupid. He’s just yanking chains, milking it for all it’s worth. Has to be a POE.
"Poe's Law" refers to the general principle that religious extremists are so whacked out that those who try to parody them can't be distinguished from the real thing, and vice versa. This guy could be a parody of sorts, and the discussion so far would still be true (including the fact that anyone remotely familiar with genomic sequencing and/or very basic probability could correct him, and no-one at UD has done so). Bombastic, pompous names are common among narcissistic creationists, so a name that could imply "overturning Darwin" doesn't give me much information. My method of detecting possible parodies, which I believe produces better than random results, is as follows - Just as most racists today deny racism and speak in coded language, most creationists, for whatever reason, choose not to speak openly about the hellfire, brutal executions, obsession with and misinterpretation of the relatively tiny proportion of the Bible that talks about sex, and (in many cases) modern ethnic biases that drive them. They will make not-very-veiled threats ("You'll find out soon!") when provoked, but tend to dissemble away from these topics. Meanwhile, most parodists aren't interested in mimicking dissembling, weaseling, and so on, as that isn't much fun. They prefer to parody the "sinners in the hands of an angry God" type stuff of the past. So when I see someone saying something like "Evolutionists are sodomites who will burn in hell", especially without provocation, I think that there is a reasonable probability that it is a parody. When I see a lot of dissembling, squirming, and weaseling, even when challenged, I know I am dealing with a real creationist. As for this guy, the "great genius who easily overturns science with obviously incorrect math" is a common type of sincere creationist. So who can say for sure?

Absolutely. That was my point, which you have made much more clearly and eloquently than I could ever hope to.

Matt Ackerman · 30 September 2010

Nope. You assume that 5% of DNA means 5% of genes, but almost all of that 5% is junk, mostly Alu repeats. Anyway, it’s silly to count each base of that 5% as equivalent to a point mutation, since we can account for all of it by only (!) 5 million mutations, while there are 35 million point mutations.

Typically, comparisons of sequence gain and loss ignore repetitive areas of the genome, because areas with highly repetitive sequence are difficult to assemble. There is of course a bias of duplications and deletions to be in intergenic regions, which is also true of SNPs. However, this bias is surprisingly weak. My numbers were actually coming from the structural divergence between human and chimpanzee genomes in protein coding regions. Approximately 6% of protein coding genes in humans are absent in chimpanzees, and approximately 8% of protein coding genes present in chimpanzees are absent in humans (humans have experienced more lineage specific deletions than chimps) (Demuth et al. The Evolution of Mammalian Gene Families. PLoS ONE 1(1): e85. doi:10.1371/journal.pone.0000085) It is widely agreed that structural divergence between us and our closest relatives is potentially responsible for a large proportion of the phenotypic divergence. Attempting to create figures that describe the total similarity in some abstract way seems to lead the public to the erroneous conclusion that there is insufficient genetic variation to account for phenotypic variation. Few phenotypic differences have been traced to the molecular level, but several are already known to arise from structural variation, which is only to be expected. After all, HOX regulated genes depend on synteny (i.e. genes being next to each other) to determine patterns of expression, so it only makes sense that changes in synteny can be responsible for phenotypic divergence. Ultimately, I can see no point to creating a number that can describe the similarity of any two genomes, even if there were a correct way to do so. Certainly it is important to study sequence identity (i.e. # of substitutions), because single nucleotide substitutions can be very reliably inferred from sequence data and patterns of substitutions are a rich source of data. Structural variants (i.e. insertions, deletions, inversions, and transpositions of greater than a few kb) arise at a low rate in comparison to SNPs, but so what? Asking how similar human and chimp genomes are might be an interesting high school science project, but why should I care? Interesting questions, such as determining the relative contribution of mutational events to adaptive evolution, will not be answered by these sorts of analyses.

John Harshman · 30 September 2010

Matt Ackerman said: My numbers were actually coming from the structural divergence between human and chimpanzee genomes in protein coding regions. Approximately 6% of protein coding genes in humans are absent in chimpanzees, and approximately 8% of protein coding genes present in chimpanzees are absent in humans (humans have experienced more lineage specific deletions than chimps) (Demuth et al. The Evolution of Mammalian Gene Families. PLoS ONE 1(1): e85. doi:10.1371/journal.pone.0000085)

This is a very liberal definition of "gene". If a recent duplication has produced 2 copies of a gene in the human lineage, are those different genes, and can chimpanzees be said to lack a gene? Duplication is an important source of material for evolution, but I suggest that most duplications, like most other mutations, are evolutionarily meaningless. How many gene deletions separating humans and chimps are of single-copy genes and how many are of recent duplicates? Demuth, I notice, ascribes most changes in gene family size to neutral evolution. I do agree that the most interesting questions here are about which differences are functional, and what those functions are. There are uses for distance measures, though.

SWT · 30 September 2010

If this is parody, my hat is off to niwrad -- he/she has managed to convince the powers that be at UD to let him/her be a blog contributor, not simply a commentator.

Matt Ackerman · 30 September 2010

John Harshman said: This is a very liberal definition of "gene".

No, I believe that the professional geneticists are using the definition of gene which is generally accepted in the scientific community. You are perfectly welcome to create your own definition, but don't expect me to use it.

If a recent duplication has produced 2 copies of a gene in the human lineage, are those different genes.

Yes.

and can chimpanzees be said to lack a gene?

Yes. They can be said to lack the duplicate which is unique to humans.

Duplication is an important source of material for evolution, but I suggest that most duplications, like most other mutations, are evolutionarily meaningless.

I strongly doubt it. I suspect that the majority of gene duplications are mildly deleterious.

How many gene deletions separating humans and chimps are of single-copy genes and how many are of recent duplicates?

50% of the deletions are of single copy genes. 50% are of genes in gene families with more than one copy. Page e85.

I do agree that the most interesting questions here are about which differences are functional, and what those functions are. There are uses for distance measures, though.

I didn't say distance measures are useless; in fact, I list their uses. However, distance measurements do not measure some abstract 'genotypic similarity' because, as far as I am aware, no such goal can exist. When the general public reads the words 'chimps are 99% similar to humans' they assume scientists mean some sort of abstract genotypic similarity, which they do not. I don't really see anything you said that disagrees with my point.

Demuth et al. said: "In total, our results support mounting evidence that gene duplication and loss may have played a greater role than nucleotide substitution in the evolution of uniquely human phenotypes, and certainly a greater role than has been widely appreciated."

John Harshman · 30 September 2010

Geneticists may say that deletion of a recent duplicate is indeed loss of a gene, but saying it that way would also tend to deceive a layman into thinking that something important had just happened, like dropping your only copy of, say, cytochrome c. I really do think that most duplicated genes are subsequently deleted, either because they're slightly deleterious or because their loss isn't selected against. And I would further imagine that most of them are pseudogenized (real word?) before they are lost. In some cases, it may be the original copy that's deleted, perhaps even in nearly half of those cases. No matter. None of this prevents gene duplication and loss from being important in evolution, and perhaps more important than point mutations, but I would be interested in seeing the evidence that it's more important.

I do see that about half of all reductions of copy number, during the human lineage, in gene families that are inferred to have been present in the ancestral mammal have resulted in extinction of the family, generally by deletion of a single copy in a family that had been reduced previously to one copy. That doesn't count gene families that weren't in that ancestral mammal, and it doesn't count families that may have had losses but didn't have a net loss, but I'll accept it as an estimate. Can we suppose that most of those losses were in moribund families, i.e. those that humans just weren't using for much?

Robert Byers · 1 October 2010

I read this on Uncommon Descent, and the researcher is showing that humans and primates etc. have like templates but the differences could not come from ToE.
This is a line of investigation that others probably will pick up on ape/human sameness claims in time.
As I told him, biblical creationists should welcome as close a likeness to primate bodies as possible.
It's impossible upon looking at apes to not conclude God simply has one blueprint of life and twists things about. One computer program fits all.
So people were simply given the best type of body in the equation of the blueprint: the ape body. Otherwise an entirely different kind of body would have had to be thought up that still included eyes etc.
Our body is not relevant to conclusions on our origins. Looking for the differences is a waste of time.

hoary puccoon · 1 October 2010

Not being a biologist, I'm confused by the discussion of genes without a discussion of whatever the control areas are called for turning genes on and off.

It seems obvious to me, for instance, that chimp arms are made with very similar genes to ours for all the various proteins -- for hair, nails, bones, muscles, etc. The big difference between us is the longer length of chimp arms relative to the torso and hind legs.

Theoretically, a chimp could have exactly the same genes we do, and still look and act like a chimp, not a human, as different genes are turned on and off at different rates.

So, what are the control areas for the genes called, and how much research goes into differentiating the rates at which genes are turned on and off?

Ron Okimoto · 1 October 2010

hoary puccoon said: Not being a biologist, I'm confused by the discussion of genes without a discussion of whatever the control areas are called for turning genes on and off. It seems obvious to me, for instance, that chimp arms are made with very similar genes to ours for all the various proteins -- for hair, nails, bones, muscles, etc. The big difference between us is the longer length of chimp arms relative to the torso and hind legs. Theoretically, a chimp could have exactly the same genes we do, and still look and act like a chimp, not a human, as different genes are turned on and off at different rates. So, what are the control areas for the genes called, and how much research goes into differentiating the rates at which genes are turned on and off?

Regulatory regions or regulatory DNA sequences control gene action. There are 5 prime untranslated control regions, 3 prime untranslated control regions, intronic control regions, control regions within functional gene sequence, and control regions that can be very distant from the gene that are sometimes called locus control regions. There are also gene products that control gene regulation that can be on other chromosomes.

raven · 1 October 2010

Theoretically, a chimp could have exactly the same genes we do, and still look and act like a chimp, not a human, as different genes are turned on and off at different rates.

Probably more true than not. Chimps and humans differ by 1.23% according to the OP. Two humans can differ by as much as 0.5% according to the human genome project. The vast majority of the human chimp differences must be neutral drift. When they sequenced the Neanderthal genome, they had a hard time finding differences between them and us. One highly speculative calculation implied that the number of biologically significant differences between chimp and human genomes might be around 200 mutations.

W. Kevin Vicklund · 1 October 2010

niwrad realizes his error and abandons the field, citing the Vizzini defense ("Inconceivable!")

Consider that this high figure is obtained under the following conditions very favorable to similarity: (1) the ESM model helps to obtain high value of similarities; (2) the 30BPM test, for definition, is a lavish one because allows a total scrambling of patterns. If one or both of these conditions is not applied the scenario can only get worse for similarity. The conditions #2 implies that to speak of “identity” between genomes is nonsense, despite the high value obtained in the test. Besides 1.27% of difference in 3 billions base genomes makes 38 millions point mutations after all. As a consequence the normalized result of the 30BPM test in no way supports the evolutionist claim of a common ancestor of these genomes. A blind evolution that changes and scrambles 38 millions bases is unthinkable. I am satisfied of this work and wish to thank you for the collaboration.

FWIW, his overall number, including the sex chromosomes, works out to a similarity of 98.4% or a difference of 1.6%, assuming a random dispersal. Or about 48 million bp difference.
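
A quick sketch of how those figures fit together, assuming a round 3 billion base genome and independent mismatches at each base (both simplifications; the names below are just for illustration):

```python
GENOME_BP = 3_000_000_000  # round genome size used in the thread

def differing_bases(difference_fraction: float, genome_bp: int = GENOME_BP) -> int:
    """Number of differing bases implied by a per-base difference fraction."""
    return round(difference_fraction * genome_bp)

def block_match_rate(per_base_identity: float, block_size: int = 30) -> float:
    """Chance that a whole block of bases matches perfectly."""
    return per_base_identity ** block_size

bp_at_1_6_pct = differing_bases(0.016)    # the 1.6% figure above
bp_at_1_27_pct = differing_bases(0.0127)  # the 1.27% figure niwrad quotes

# A 98.4% per-base similarity corresponds to this 30-base perfect-match rate.
match_rate_at_98_4 = block_match_rate(0.984)

print(f"1.6% of 3 billion bases:  {bp_at_1_6_pct:,}")
print(f"1.27% of 3 billion bases: {bp_at_1_27_pct:,}")
print(f"30-base match rate at 98.4% identity: {match_rate_at_98_4:.3f}")
```

So a 30-base match rate in the low 60s of percent and a per-base difference of about 1.6% are two descriptions of the same data.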

hoary puccoon · 2 October 2010

Ron and Raven--
Thanks for responding. Do you know if anyone is actively working on the specific changes in control regions between chimps and humans?

Also, does anyone else suspect that niwrad is a sock puppet for Dembski himself? That line, "a blind evolution that changes and scrambles 38 millions bases is unthinkable" sounds exactly like his brand of 'let's see how much those rubes will swallow' cynicism. Obviously, whether scrambling 38 million pairs is "unthinkable" depends on how many pairs there are total and how many generations separate chimps and humans. There must be about, what, at least a million generations separating chimps and humans, and three billion base pairs? That comes out to about 13 base pair changes per billion base pairs per generation.
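
The back-of-the-envelope rate in the comment above can be reproduced as follows. This is a sketch using the comment's own round numbers (38 million changes, 3 billion base pairs, a million generations); attributing every change to a single lineage is the comment's simplification, and the function name is hypothetical.

```python
def changes_per_billion_bp_per_generation(
    total_changes: float, genome_bp: float, generations: float
) -> float:
    """Average number of base changes per billion base pairs in each generation."""
    per_bp_per_generation = total_changes / genome_bp / generations
    return per_bp_per_generation * 1e9

rate = changes_per_billion_bp_per_generation(
    total_changes=38e6,   # "38 millions point mutations"
    genome_bp=3e9,        # three billion base pairs
    generations=1e6,      # "at least a million generations"
)

# Same figure expressed per whole genome rather than per billion base pairs.
per_genome_per_generation = 38e6 / 1e6

print(f"{rate:.1f} changes per billion bp per generation")
print(f"{per_genome_per_generation:.0f} changes per genome per generation")
```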

raven · 2 October 2010

Obviously, whether scrambling 38 million pairs is “unthinkable” depends on how many pairs there are total and how many generations separate chimps and humans.

Scrambling 38 million base pairs is not unthinkable. It is reality. Two humans can differ by up to 15 million base pairs. We know this by DNA sequencing. If having large numbers of base pair differences was deleterious or "unthinkable" whatever that means, we would all be dead and nonexistent. This is the Fallacy of Argument from Being Stupid.

There must be about, what, at least a million generations separating chimps and humans, and three billion base pairs? That comes out to about 13 base pair changes per billion base pairs per generation.

The number of mutations per human generation is known, again from DNA sequencing. Each human is born with 150 new mutations compared to their parents.

Do you know if anyone is actively working on the specific changes in control regions between chimps and humans?

Probably they are. This work is slow because humans and chimps are not good experimental animals for obvious reasons. More of this sort of work is done in rodents. It can be slow and expensive because some of it involves transgenic mix and match type experiments.

Karen S. · 2 October 2010

Excuse my ignorance, but what is a POE?

Dale Husband · 2 October 2010

Karen S. said: Excuse my ignorance, but what is a POE?

It's a reference to Poe's Law, which says that attempts to mock fundamentalist behavior end up being indistinguishable from actual fundamentalist behavior, because the real thing is already so delusional and nonsensical.

mrg · 2 October 2010

You will also see the term "Loki troll", which is basically the same thing. I tend to prefer it because it's a little more intuitive ... it's more associated with the TALK.ORIGINS forum.

"POE" can be interpreted as "Pretense Of Extremism", but it reality it was orginally expressed by one Nathan Poe. No, it had little or nothing to do with Edgar Allen Poe.

Karen S. · 2 October 2010

Thanks mrg and Dale Husband for defining POE. I've believed for a long time that a certain poster on BioLogos named conrad is one of those. (Now I have a word for his kind.) I simply cannot believe that even a fundie could be so butt-clenchingly stupid.

Karen S. · 2 October 2010

Not being a biologist, I’m confused by the discussion of genes without a discussion of whatever the control areas are called for turning genes on and off.

Nova had a good program on regulatory genes, etc., and you can watch it online: What Darwin Never Knew

Joe Felsenstein · 3 October 2010

harold said:
creationists - I used something called “logarithms” to figure that out quickly - don’t waste your time trying to understand)
Maybe someone thought I was being sarcastic. I wasn't. And indeed later, Joe Felsenstein noted this -
Creationist: "To understand this I argue according to what I did in #29. Given two supposed genomes that match 99% a 30BPM test gives 70% matches. Since the real test gave 62% my first idea to obtain a 30BPM value comparable to 99% is to apply the simple formula: 99×62/70 = 87.7%. In other words the multiplier coefficient that we must apply to the 62% is 99/70 = 1.41."
JF: "Actually you don’t correct for block size that way. If there is an underlying similarity of p at the one-base level, the probability of a match when there are blocks of B bases is p raised to the Bth power. Call that Q. So to get back to p from Q you just raise Q to the 1/B power. Or use logs and get log(p) by dividing log(Q) by B"

Sorry to have missed this. Yes, it is easy using logarithms as I later noted.
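The block-size correction quoted above can be sketched in a few lines. This assumes, as the quoted explanation does, that each base matches independently with the same probability p; the function name and variable names are hypothetical, and the 62% and 30-base figures are the ones from the quoted exchange.

```python
import math

def per_base_similarity(block_match_fraction, block_size):
    """Recover the per-base similarity p from the block match fraction Q.

    Uses Q = p**B, so log(p) = log(Q) / B.
    Assumes every base matches independently with the same probability.
    """
    return math.exp(math.log(block_match_fraction) / block_size)

q_observed = 0.62   # fraction of 30-base blocks that matched perfectly
block_size = 30     # B, the block length in bases

p_estimate = per_base_similarity(q_observed, block_size)

# The linear rescaling from the quoted comment, shown only for comparison.
naive_estimate = 99 * 62 / 70

print(f"per-base similarity from logs: {100 * p_estimate:.2f}%")
print(f"linear rescaling:              {naive_estimate:.1f}%")
```

The log-based estimate comes out a little above 98%, whereas the linear rescaling gives the 87.7% figure from the quoted comment.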

In fairness, the creationist quoted here did figure out that niwrad was wrong, but was incompetent to correct him.

"The crteationist quoted here" was actually niwrad, if I read their comments correctly. Now the non-creationist commenter "CharlesJ" at UD has got the correct formula, but in a messy form. He says to obtain the probability of having a non-match in B bases, you just sum up the all probabilities of obtaining K mismatches using the terms for K>0 in a binomial distribution which has B trials with probabilities p of Heads. But since the binomial probabilities sum to one, it is easier to compute the probability of a B-base match by using just the term for 0 mismatches, and then subtracting that sum from 1. In effect that is the method we have mentioned here.