Are DNA cMs higher for matches when grandparents are 1st cousins?

+7 votes
I volunteer with the Adoption Angels and have a case with a cM match of 1804 (Aunt or 1/2 Aunt). I now realize that the match’s parents were 1st cousins and wonder if that relationship skews the DNA match prediction (e.g. DNA Painter). I don’t want to narrow the focus of my search for the adoptee’s father if this is the case.

in Genealogy Help by Morgan Mulligan G2G6 (8.2k points)
edited by Morgan Mulligan
On the face of it, the adoptee and the match might have been half or full 2nd cousins (through the match's other parent) in addition to any closer relationship.  So the expected cM for this connection has to be added to the other number.

However, if they were full 1st cousins and full 2nd cousins, that seems very unlikely to account for the result you have, unless there's also a 3rd relationship.

3 Answers

+6 votes
Best answer
OK, let's "Do the math!"

Through the biological mom, the adoptee has what I'll call a non-endogamous grandparent (NG), and an endogamous grandparent (EG). By the latter (EG), I mean, of course, the grandparent who is related to the half-sibling's father.

At each location, for each pair of chromosomes, there is a 50% chance of the DNA from bio-mom being from NG. Half the time adoptee gets the NG DNA there, the half-sibling also does (the other half of the time they don't match).

So far, we have 25% of the time they match, 25% of the time they don't. The tricky part is the OTHER 50% of the time - when adoptee gets the DNA from EG.

Obviously, half the time the adoptee gets the DNA from EG, the half-sibling ALSO gets the EG DNA from mom - a match! So we're up to "they match 50% of the time, and 25% of the time they don't" - 75% of the cases accounted for.

The remaining 25% of the time, the adoptee gets the EG DNA from mom, but the half-sibling DOESN'T. The trick is that they may or may not STILL match in this case - sometimes half-sibling's DAD may contribute matching EG DNA!

Since half-sibling's parents are 1st cousins, they will inherit the same DNA from their common grandparents at a given location 1/4 of the time. But there's also the factor that half-sibling will get DNA from their unrelated paternal grandparent half the time. So the chance that half-sibling has mom's EG DNA - FROM their DAD, despite not getting it from mom - is 1/8.

So in this last 25% of cases, where adoptee gets EG DNA from mom, but half-sibling doesn't, they will match 1/8 of the time ANYWAY, but NOT match 7/8 of the time.

In total, they should match 1/4+1/4+1/4*1/8 = 17/32 of the time, on average. That's as opposed to regular half-siblings, who should match 1/2 = 16/32 of the time.

In terms of centimorgans, if a 100% match is about 3460cM, then a regular half-sibling match would be about 1730cM, while in this case it would be about 1838cM. So it DOES "skew the prediction", but in your case it actually STRENGTHENS the case for their being half-siblings.
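For anyone who wants to check the arithmetic, here is a minimal Python sketch of the three cases above. The function and variable names are made up for illustration, and the 3460cM total and the 1/8 "dad supplies it anyway" factor are taken straight from the answer rather than derived independently.

```python
from fractions import Fraction

TOTAL_CM = 3460  # approximate cM for a 100% match, as assumed in the answer


def endogamous_half_sibling_fraction():
    """Expected matching fraction when the half-sibling's parents are 1st cousins."""
    half = Fraction(1, 2)
    # Adoptee gets NG DNA from mom (1/2) and the half-sibling gets it too (1/2).
    via_ng = half * half
    # Adoptee gets EG DNA from mom (1/2) and the half-sibling gets it from mom too (1/2).
    via_eg_mom = half * half
    # Adoptee gets EG DNA but the half-sibling does not get it from mom (1/4 of cases);
    # the half-sibling's dad supplies matching EG DNA 1/8 of the time (value from the answer).
    via_eg_dad = half * half * Fraction(1, 8)
    return via_ng + via_eg_mom + via_eg_dad


regular = Fraction(1, 2)
endogamous = endogamous_half_sibling_fraction()

print("regular half-sibling:   ", regular, round(float(regular * TOTAL_CM)), "cM")
print("endogamous half-sibling:", endogamous, round(float(endogamous * TOTAL_CM)), "cM")
```

Using exact fractions keeps the 17/32 result visible instead of hiding it in a rounded decimal.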

NOTE: My Golden Rule of Probability is that if something looks fairly simple and straightforward (like the idea of "adding" the cM values for the two relationships that exist simultaneously here) then it's WRONG. Even fairly simple probability problems are often deceptively tricky.
by Frank Stanley G2G6 Mach 3 (35.7k points)
selected by Bill Vincent
Each time you introduce probability (which is great) you also introduce error. Can you estimate how much error would be associated with your final calculation?
Thanks for this Stanley!  In this case the match to the adoptee is actually an Aunt or 1/2 Aunt (not siblings after all).  The Aunt’s (or 1/2 aunt) parents were 1st cousins. How does the math and cMs change in this case?
First, I was going to add, in case it wasn't clear:

I'm saying that I would expect, for this "endogamous half-sibling" case, where the half-sibling's parents are 1st cousins, the average cM value to be about 1838cM, vs 1730cM for "regular half-siblings". Further, that the 1804cM quoted is actually more consistent with "endogamous half-sibling" than it is with "regular half-sibling". That being said, 1804cM is nonetheless still within the bounds of what you'd see for "regular half-sibling".

Second, "Thanks, Bill"! By "error", I assume you mean the normal statistical variation you see around the mean value (also called the "expected value", even though the chance of actually getting that exact value is almost zero). Figuring out how much variation to expect for this "endogamous half-sibling" case ought to be at least as tricky as finding the mean.

Realistically, since the means are fairly close, I wouldn't expect the variation to be very much different between the two cases. Variances add, the variance of the "extra" part would be much smaller than the main part, and they're undoubtedly correlated, subtracting from that. Given that the variation on regular half-siblings isn't that well known in the first place (as far as I can tell), I wouldn't worry about the endogamous case being significantly different.
You're killin' me Morgan! (LOL)

Really, this has been an interesting case for me. I've been kind of paranoid about what endogamy can do to the numbers, and this case tells me that maybe it's not normally all that bad. In a way, this is something of an extreme case - 1st cousins producing offspring - but on the other hand it doesn't say they're related beyond that. Endogamy is really about small communities where intermarrying has been going on for many generations, I guess, so even this case may not speak to the bigger picture on endogamy.

I'll have to take a crack at the "aunt" case(s).
OK, here's the "endogamous aunt" case.

I had a suspicion that it would come out exactly the same (because that's just how things seem to work out), and that turned out to be the case.

If you have the match as a full sibling to the adoptee's biological parent, with the match's parents being 1st cousins to each other, expect about 1838cM, vs. about 1730cM without their parents being 1st cousins. Proof to follow, in case anybody's interested.

Then there's the "half-aunt" case. Really, I don't think this should be considered as plausible. The effect for that case can only be less, and even Blaine's Famous Chart - known for including fairly outrageous outliers - only has that relation going up to 1446cM.

Let's call the grandparents of the Match and Bio-parent M, X1, X2, and N, with M & X1 being on the paternal side, and X2 & N being on the maternal side. Further, X1 and X2 are full siblings. Really, X1 represents the DNA received by the father from the common grandparents, while X2 represents the DNA received from them by the mother. With the father and mother being 1st cousins, the chance that X1 matches X2 is 1 in 4.

At any given location on a pair of chromosomes, Adoptee gets DNA from just one of these four (M, X1, X2, or N). At that same location, Match receives DNA from both parents, with four equally likely possibilities: (X1,X2), (X1, N), (M, X2), and (M, N).

So between Adoptee and Match, there are 16 equally likely possibilities at each location. What happens for twelve of these 16 possibilities is straightforward:

* For the 4 cases where Adoptee has DNA from M, two cases - where Match has (M, X2) or (M, N) - result in a match, but the other two cases result in no match.

* The 4 cases where Adoptee has DNA from N work the same way, with the same result, except with (X1, N) and (M, N) providing the two matches.

* The 2 cases where Adoptee has DNA from X1 or X2, and Match has (M, N) will both result in no match.

* The 2 cases where Adoptee has DNA from X1 or X2, and Match has (X1, X2) will both give a match.

In the 12 cases considered so far (out of the 16 possible), 6 result in matches; the other 6 do not. The remaining 4 cases are where Adoptee has X1 or X2, and Match has X1 or X2, but not both. Two of these cases are also straightforward:

* Adoptee with X1 and Match with (X1, N) gives a match.

* Adoptee with X2 and Match with (M, X2) gives a match.

We're up to 14 cases considered (out of the 16), with 8 resulting in matches, while 6 don't. The remaining cases are:

* Adoptee with X1 and Match with (M, X2)

* Adoptee with X2 and Match with (X1, N)

Since X1 matches X2 1/4 of the time, both of these cases count as a 1/4 match.

Then the total result is (8+0.25+0.25)/(14+1+1) = 8.5/16 = 17/32. Adoptee will match Match at 17/32 of the locations on their chromosome pairs. A 100% match gives about 3460cM, so a 17/32 match should be about 1838cM.
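The 16-case tally above is easy to brute-force. Here is a rough Python sketch that enumerates every (Adoptee, Match) combination and applies the same rules; the labels M, X1, X2, N come from the post, while the helper name `match_weight` and the constants are just illustrative. It assumes, as the post does, that X1 matches X2 1/4 of the time and that all 16 combinations are equally likely.

```python
from fractions import Fraction
from itertools import product

X1_MATCHES_X2 = Fraction(1, 4)  # assumption from the post: the parents are 1st cousins
TOTAL_CM = 3460                 # approximate cM for a 100% match, as used in the post

# Adoptee carries DNA from exactly one of the four grandparents at a location.
ADOPTEE_SOURCES = ["M", "X1", "X2", "N"]
# Match carries one paternal (M or X1) and one maternal (X2 or N) contribution.
MATCH_PAIRS = list(product(["M", "X1"], ["X2", "N"]))


def match_weight(adoptee, pair):
    """Return 1 for a sure match, 1/4 for an X1-vs-X2 cross case, 0 otherwise."""
    if adoptee in pair:
        return Fraction(1)
    if adoptee == "X1" and "X2" in pair:
        return X1_MATCHES_X2
    if adoptee == "X2" and "X1" in pair:
        return X1_MATCHES_X2
    return Fraction(0)


cases = list(product(ADOPTEE_SOURCES, MATCH_PAIRS))
total = sum(match_weight(a, pair) for a, pair in cases)
fraction = total / len(cases)

print(len(cases), "cases, weighted matches:", total)
print("matching fraction:", fraction, "=", round(float(fraction * TOTAL_CM)), "cM")
```

Changing `X1_MATCHES_X2` is a quick way to see how the result would move for parents related in some other way, provided you have a matching probability for that relationship.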

Thanks for the math! I followed it, but would hate to have to come up with it on my own haha.

My grandmother came from an endogamous community (Seventh Day Baptists), and I always seem to see significantly "closer" relationship predictions than I can find actual paths. It's not unusual to see a predicted 3rd-5th cousin actually turn out to be 8th-10th, at least from what I have on paper. So, either one of two things is happening:

  1. There's simply a closer relationship than what I currently have documentation for.
  2. All the endogamous connections result in a lot more STRs in common than you'd get from a single common ancestor. 

I actually think it's closer to either #2 or both. In other words, I don't have all the relationships, and the numbers are skewed whether or not I missed the actual MRCA. I also think the numbers in these cases are much more complex because it's more than just a single instance of married second cousins--that might be true, plus another ancestor had parents who were second cousins once removed, and maybe also third cousins to boot!



This probability stuff is definitely not for the timid, even for people who don't cringe at math in general!  :)

I have a pair of gt-gt grandparents who were second cousins. 3/4 of gt-grandpa's ancestry was from a small community of Germans who migrated from eastern PA to Frederick Co MD, to westernmost PA in the late 1700s (I kind of wonder if another of his grandparents isn't also related to the other two that I know are related to each other). It doesn't seem to have that much effect on the cMs in my matches, but maybe now I'm closer to quantifying that.

The descendants of this gt-gt grandma's siblings seem to have intermarrying cousins as a sort of family tradition! I remember one person who has my 5th-gt grandfather in his pedigree at least 5 times (with nobody closer than 2C marrying!) It's kind of painful to work on that part of the tree...

From what I'm seeing (on AncestryDNA), a match below about 40cM can be anything from a 2C1R to a 6C1R. Something over 40cM is likely a 4C or closer. Over about 70cM practically has to be 3C or closer. The handful I have that are 7C to 8C are below about 11cM. 8C is my limit, so far. Some of these extremely distant people are unverified, but a few have some strong indications of being for-real.

AncestryDNA is so messed-up when it comes to "predicted" or "possible" relations that I assume when people use the term that it doesn't even mean anything, but maybe other outfits do better with that. So I guess I'm proposing a "#3" - that published predicted relations can be misleading. I'd be curious how your results compare with these results I'm seeing.

Here's a document that Emma MacBeath shared with me with study findings of endogamous vs non-endogamous.  The project results on page 22 indicate the cMs are not as far apart as I suspected.

The Shared cM Project

Thanks, Frank & ____. Until I get more confident I have more of a tree, I'll go with #1 (need more paper tree) and #3 (stick to cM numbers and ignore their estimates) for now ;)


I've been thinking about your question, "Can you estimate how much error would be associated with your final calculation?", which I took as asking how much variation we would expect from random chance.

I can't say precisely, but maybe it's insightful to consider modelling this as a binomial distribution. You divide up the 3460cM into a number of equal segments, use the probability calculated, and the binomial distribution tells you the probability of matching on each given number of segments (which you then multiply by the number of centimorgans per segment to get centimorgans). For this case, I used 90 segments, which gives 38.44cM per segment (which I think is about typical for this close a relation).

For the "normal" case, with p=16/32, 95.5% of the time you'd get 36->54 segments matching (that's 1384cM->2076cM). 88.7% of the time it's 38->52 and 1461cM->1999cM. The average is 45 segments (1730cM).

For our "endogamous case", with p=17/32, 94.3% of the time you'd get 39->56 segments matching (1499cM->2153cM), 90.9% of the time it's 40->55 and 1538cM->2114cM, average is 47.8 segments (1838cM).

This is a pretty approximate model, but happens to match the central 90% numbers for AncestryDNA (which are about 1434cM->2028cM, with an average of about 1728cM), fairly decently. In general (for other relations), I think this model tends to understate the variation somewhat, but it doesn't do half bad.

So these 90% and 95% intervals (94.3% & 90.9% are as close as you can get) - take your pick - that I've calculated for our "endogamous" case ought to be pretty close to what the real thing would be.
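If anyone wants to reproduce or tweak these intervals, here is a rough Python sketch of the binomial model described above. The 90 segments and the 3460cM total are the assumptions stated in the post; the function names are made up for illustration, and this is only the simplified model, not how any testing company computes its ranges.

```python
from math import comb

TOTAL_CM = 3460      # approximate cM for a 100% match (assumption from the post)
N_SEGMENTS = 90      # number of equal segments chosen in the post
CM_PER_SEGMENT = TOTAL_CM / N_SEGMENTS  # about 38.44cM


def binom_pmf(k, n, p):
    """Probability of exactly k matching segments out of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def interval_probability(lo, hi, p, n=N_SEGMENTS):
    """Probability that the number of matching segments falls in lo..hi inclusive."""
    return sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))


# (label, p, low segments, high segments)
# 38..52 segments is the range that corresponds to the quoted 1461cM..1999cM.
scenarios = [
    ("regular, wider", 16 / 32, 36, 54),
    ("regular, narrower", 16 / 32, 38, 52),
    ("endogamous, wider", 17 / 32, 39, 56),
    ("endogamous, narrower", 17 / 32, 40, 55),
]

for label, p, lo, hi in scenarios:
    prob = interval_probability(lo, hi, p)
    print(
        f"{label}: {lo}-{hi} segments "
        f"({lo * CM_PER_SEGMENT:.0f}cM-{hi * CM_PER_SEGMENT:.0f}cM), "
        f"probability {prob:.1%}, mean {N_SEGMENTS * p * CM_PER_SEGMENT:.0f}cM"
    )
```

Changing `N_SEGMENTS` shows how sensitive the spread is to the segment-count assumption: fewer, larger segments widen the intervals.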

On a side note, it's been observed that the values follow something close to a normal distribution (the celebrated "bell-shaped curve"), at least for the closer relationships. The neat thing about this binomial distribution model is that for a case such as this it can be approximated pretty accurately with a normal distribution, but for cases of more distant relations it automatically gets asymmetric (NOT like a normal distribution), just like the real distribution is supposed to.

+3 votes
1804cM shared is substantially outside of the normal range for first cousins; the usual maximum without endogamy is around 1225 or so (1804cM is in the normal range for a half sibling).
by C Handy G2G6 Mach 2 (27.2k points)
Right you are... I have corrected my post but the query still remains.  Thanks.
+3 votes


It's hard to estimate this because one can't know how much DNA was shared between the two parents. What I would recommend is to use the Are Your Parents Related utility at GEDMatch to see how many ROH the match has. I usually use this utility automatically for people searching for their parents. Better still is to use the full ROH utility by David Pike, which you can find here:

I'm sure someone will provide a more satisfactory answer to your question, but testing both people who match for ROH is always a good first step.

- Bill

by Bill Vincent G2G6 Mach 7 (75k points)
By the way, you should tell anyone who asks that a centiMorgan is 14 minutes and 24 seconds of your day. (1/100 of a day.) I would definitely do that if I could. :-)
Thanks Bill.  Turns out the relationship of the match to the adoptee is aunt or 1/2 aunt.  I’m waiting to get the data file from the adoptee to check ROH. Do you know if the tools will also pick up grandparents being related in some way?
Wouldn't that be a centimorgen?
