reginald_burke t1_ishxsag wrote

Don’t we have good definitions for this, such as the Levenshtein edit distance? For your example, Levenshtein would say 24 edits (via 24 additions).
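For anyone who wants to see the count, here is a minimal sketch of the textbook dynamic-programming Levenshtein distance. The sequences are made-up toy data (not the example from the parent comment), and `levenshtein` is just an illustrative name.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance where each insertion, deletion, or substitution costs 1."""
    # prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or match for free
            ))
        prev = curr
    return prev[-1]


# Hypothetical toy data: a short sequence with a 24-base run inserted in the middle.
original = "ACGTACGTAC"
mutated = original[:5] + "T" * 24 + original[5:]

distance = levenshtein(original, mutated)
print(distance)
```

Because the two strings differ in length by exactly 24 and 24 insertions are enough to turn one into the other, the distance here comes out as 24.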

4

Ferociousfeind t1_isib75n wrote

Single mutations can also involve copying or deleting large chunks of DNA. Levenshtein would overcount by 23 edits here, because only one event was involved in adding the single 24-base piece of DNA. Levenshtein distance is simple to calculate, but it misses some of the behavior of mutation, and so misses part of the picture. The more true-to-life version is more complex, more nuanced, more open to interpretation, and less capable of giving a single concrete percentage.
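A rough way to illustrate the difference is to count each contiguous inserted, deleted, or replaced run as one event instead of counting every base. This is only a sketch built on Python's standard `difflib`; `count_edit_events` and the toy sequences are hypothetical, and it is a crude stand-in for real mutation inference, not a model of it.

```python
from difflib import SequenceMatcher


def count_edit_events(a: str, b: str) -> int:
    """Count contiguous non-matching blocks between a and b as single events."""
    # autojunk=False turns off difflib's heuristic that ignores very common
    # elements in long sequences, which would distort results on 4-letter DNA.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return sum(1 for tag, *_ in matcher.get_opcodes() if tag != "equal")


# Hypothetical toy data: one 24-base run inserted into a short sequence.
original = "ACGTACGTAC"
mutated = original[:5] + "T" * 24 + original[5:]

events = count_edit_events(original, mutated)
print(events)
```

On this toy input the whole inserted run shows up as a single block, so the event count is 1 where a per-base edit distance would report 24. Which blocks count as "one event" is exactly the part that is up to interpretation.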

2

light24bulbs t1_isi9wop wrote

Truly that's just a count of the number of differing base pairs, which makes complete sense. This isn't that complicated. I'm sure you could argue it isn't the most RELEVANT figure that a geneticist would be concerned with, but, I think it's fair to say that's what they would take it to mean. I'd love to know if I'm wrong about that.

It's binary data: run a diff and give me the count. Since we are talking about the number 24, if 24 base pairs out of the total are different, the fraction that differs is just 24 / total.

Likewise, the average is simple: take any two people and count the number of base pairs that differ between them or are present in one and not the other. Do that many times between different people and average the counts.
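Roughly what I mean, as a sketch. It assumes the sequences are already aligned and the same length, which is a big assumption for real genomes, so it does not handle bases present in one person and not the other. The function names and sample strings are made up for illustration.

```python
from itertools import combinations


def mismatch_fraction(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length, aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    differing = sum(x != y for x, y in zip(a, b))
    return differing / len(a)


def average_pairwise_difference(sequences: list[str]) -> float:
    """Mean mismatch fraction over every unordered pair of sequences."""
    pairs = list(combinations(sequences, 2))
    return sum(mismatch_fraction(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy "genomes", four bases each.
people = ["AAAA", "AAAT", "AATT"]

single = mismatch_fraction("ACGT", "ACCT")     # 1 differing base out of 4
average = average_pairwise_difference(people)  # mean of 0.25, 0.5 and 0.25
print(single, average)
```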

0