newappeal t1_iuwownz wrote

The "basically 100%" figure is a nucleotide-for-nucleotide comparison of the genomes. You line up a human and Neanderthal genome, count how many nucleotides have the same identity (A/T/C/G) and divide that by the total length of the genome. (Because the genomes are not exactly the same length, the metric would have to be more nuanced than that, but this imperfect definition is fine for illustrative purposes.) This measure is agnostic to the actual genetic history of each species or individual being compared, but it is broadly reflective of time since the last common ancestor.
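For illustration, the simplified metric above can be sketched in a few lines of Python. This is a minimal sketch, assuming the two sequences are already aligned to the same length (real genome comparisons must handle insertions and deletions). The function name `percent_identity` and the toy sequences are made up for this example.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned positions that carry the same nucleotide."""
    # Assumption: both sequences are pre-aligned and gap-free.
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


# Made-up 20-base sequences that differ at a single position.
human = "ACGTACGTACGTACGTACGT"
neanderthal = "ACGTACGTACGAACGTACGT"

identity = percent_identity(human, neanderthal)
print(f"{identity:.0%} identical")
```

Note that the function never asks where a difference came from; it only counts matches, which is why this measure says nothing about ancestry on its own.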

The "2%" figure is based on ancestry. Here, we're comparing loci (regions of the genome; genes are loci, but "locus" is a more general term than "gene") instead of individual nucleotides. We scan the human genome for long sequences that, as a whole, resemble the sequence at the same location in the Neanderthal genome, then either divide the number of such loci by the number of loci examined, or divide their combined base-pair length by the base-pair length of the genome. Loci determined to be Neanderthal in origin (and determining whether a shared locus was transferred from H. sapiens to H. neanderthalensis or the other way around is its own problem) do not necessarily have 100% sequence identity with the ancestral Neanderthal sequence - indeed, we would not expect them to - but they are more similar to Neanderthals than other regions of the genome are. A higher similarity indicates more recent divergence from Neanderthals, through introgression (interbreeding followed by recombination) rather than through common descent from humans' and Neanderthals' last common ancestor.
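A rough sketch of the two ways of computing that fraction (by number of loci versus by base-pair length) might look like the following. Everything here is an assumption for illustration: the `Locus` class, the per-locus identity values, and the simple similarity cutoff are made up, and real analyses use statistical methods rather than a fixed threshold.

```python
from dataclasses import dataclass


@dataclass
class Locus:
    name: str
    length_bp: int
    identity_to_neanderthal: float  # fraction of matching bases (made-up values)


def neanderthal_fractions(loci, threshold):
    """Return (fraction by locus count, fraction by base-pair length)."""
    # Hypothetical rule: a locus counts as Neanderthal-like if its
    # identity to the Neanderthal sequence meets the threshold.
    flagged = [locus for locus in loci if locus.identity_to_neanderthal >= threshold]
    by_count = len(flagged) / len(loci)
    by_length = sum(l.length_bp for l in flagged) / sum(l.length_bp for l in loci)
    return by_count, by_length


loci = [
    Locus("locus_1", 50_000, 0.9980),
    Locus("locus_2", 100_000, 0.9979),
    Locus("locus_3", 50_000, 0.9995),  # unusually Neanderthal-like
    Locus("locus_4", 200_000, 0.9981),
]

by_count, by_length = neanderthal_fractions(loci, threshold=0.999)
print(f"by locus count: {by_count:.1%}, by base pairs: {by_length:.1%}")
```

With these toy numbers the two denominators give different answers (one locus in four, but only an eighth of the base pairs), which is why it matters which version of the figure is being quoted.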

21

iayork t1_iuwvvw8 wrote

To give OP an example: Imagine two books, 10 chapters long, almost exactly the same, but each page has a typo or two. In Chapter 7, say, one book may say "teh" instead of "the" on page 2, and the other may say "Neandertal" instead of "Neanderthal" on page 3; and so on. Overall, the books are 99.9% identical, but each chapter has a set of diagnostic typos.

Now we create a third book, by replacing chapter 7 of book 1 with chapter 7 of book 2. By comparing the pattern of typos with the parent books, we can clearly tell that chapter 7 comes from a different source.

Are the books 99.9% identical? Yes. Did book 3 get 10% of its content from a different source? Also yes.

57

Rabwull t1_iuzpdhb wrote

I have never seen haplotypes described so simply and clearly. May I steal your analogy every time I explain this, forever? I can cite you as iayork (2022) if you like.

11

newappeal t1_iv0ovwe wrote

I too will be using this analogy in the future. Thank you for adding this, u/iayork!

3

Dan13l_N t1_iv0y384 wrote

But how do you know that the whole chapter 7 comes from a different source? What if only one paragraph comes from another source?

2

Rabwull t1_iv2gbcz wrote

This is an excellent question. In reality, you only know the source between consistent diagnostic typos. If we run out of these typos 10 pages before the end of Chapter 7, we can't be sure from which source those last 10 pages come.
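To make that concrete, here is a small sketch using the book analogy. The page numbers and marker list are invented for the example, and the function names are hypothetical. Each diagnostic typo is recorded as a (page, source) pair; a source is only established between the first and last typo of a consistent run, and the pages between runs stay undetermined.

```python
from itertools import groupby


def confident_tracts(markers):
    """Group consecutive same-source typos into (source, first_page, last_page)."""
    ordered = sorted(markers)  # sort by page number
    tracts = []
    for source, run in groupby(ordered, key=lambda m: m[1]):
        pages = [page for page, _ in run]
        tracts.append((source, pages[0], pages[-1]))
    return tracts


def uncertain_gaps(tracts):
    """Page ranges between adjacent tracts where no typo identifies the source."""
    return [
        (prev[2] + 1, nxt[1] - 1)
        for prev, nxt in zip(tracts, tracts[1:])
        if nxt[1] - prev[2] > 1
    ]


# Made-up example: Chapter 7 runs from page 100 to page 150.
markers = [
    (90, "book1"), (98, "book1"),
    (103, "book2"), (120, "book2"), (140, "book2"),
    (155, "book1"),
]

tracts = confident_tracts(markers)
gaps = uncertain_gaps(tracts)
print(tracts)
print(gaps)
```

In this toy case the last "book2" typo is on page 140, so pages 141 to 154 (including the last 10 pages of Chapter 7) cannot be assigned to either source.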

5