Confirmed with DNA and Ancestry DNA's problematic estimates

+11 votes
353 views

This post is intended to make explicit something that seems like it is already implicitly allowed under WikiTree policy. If others find it agreeable, people could then refer to this post in the future to confirm that their actions comport with policy.

It has been noted before (for instance, Frank Stanley's answer here) that Ancestry DNA will classify a lot of relationships around the 3rd cousin range as "4th-6th" cousin. If you use that as the prediction during the DNA confirmation process, then you cannot proceed to use such a 3rd cousin match.

However, the instructions say only "If the DNA testing company has predicted that you and your match are third cousins or closer..." They do not specify which prediction is used. When viewing the match, the initial prediction is specified as below:

 If you click the link indicated with the arrow, you get taken to a second screen of predictions as below:



As you can see, all three of the most likely estimates are closer than that 4th-6th cousin range Ancestry puts on the previous page.

As the predictions in the chart displayed above are still Ancestry DNA's own, I propose that it is allowed within WikiTree policy to use any of these predicted relationships as the "prediction" during the DNA confirmation process for relationships built from matches with people 3rd cousin or closer. We need a cutoff -- I propose further that we allow any relationship estimate of 5% or higher, although I am happy to revise this number in this post if people settle on a different percentage. (I just chose 5% as the number statisticians use as the "believable" cutoff for hypothesis testing.)
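To make the proposed rule concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the percentages are made-up placeholders rather than the values from the screenshot, and the function name and the set of qualifying relationships are only my illustration, not an official WikiTree list.

```python
# Hypothetical sketch of the proposed 5% rule; names and numbers are illustrative only.

# Relationships treated as "third cousin or closer" for this example.
# This set is an assumption, not an official WikiTree list.
QUALIFYING = {"1C", "1C1R", "2C", "2C1R", "2C2R", "3C"}


def passes_step_5(predictions, documented, cutoff=0.05):
    """Return True if the documented relationship qualifies and the testing
    company lists it with a probability of at least `cutoff`."""
    if documented not in QUALIFYING:
        return False
    for relationships, probability in predictions:
        if documented in relationships and probability >= cutoff:
            return True
    return False


# Made-up probability groups of the kind shown on a "possible relationships" screen.
example_predictions = [
    ({"2C1R", "half 2C"}, 0.40),
    ({"3C", "2C2R"}, 0.30),
    ({"3C1R"}, 0.20),
    ({"4C"}, 0.07),
    ({"4C1R"}, 0.03),
]

print(passes_step_5(example_predictions, "3C"))  # True: 3C is listed at 30%
print(passes_step_5(example_predictions, "4C"))  # False: more distant than 3C
```

The cutoff is a parameter, so a different percentage could be swapped in if people settle on one.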

in Policy and Style by Barry Smith G2G6 Pilot (215k points)
I don't see the issue here. I was using their documented relationship within the tree. The DNA is just used to confirm that relationship.
The current WikiTree instructions for DNA confirmation say you need more. You have a documented tree relationship and also a predicted relationship from DNA. The instructions say explicitly that both must match, and you should report both. I personally think this causes more issues than it solves, making it not worthwhile. But as long as policy is what it is, I follow it. The point of this post is to make clear that WRT Ancestry DNA, WikiTree’s policy as written is less constraining than some may realize.
What I meant was I was using the documented relationship as primary with the DNA just used to confirm it, not the other way around. If all you are saying is that Ancestry's ballpark estimates for cM vs 3rd-cousinship are off, then perhaps the proposal should be what's the minimum cM threshold for a 3rd cousin. The helper text of "predicted relationship" shouldn't be the criterion used if they're in the tree with MRCA and a shared cM count that supports that tree relationship.

Hi Barry,

I know that your intent with this post is to discuss 3rd cousin and closer AncestryDNA matches, but just to clarify, since AncestryDNA doesn't currently provide the tools necessary for segment triangulation, match information from AncestryDNA can't be used for DNA confirmation here for matches more distant than 3rd cousins (or more distant than the equivalent shared DNA relationships, such as 2C2R, half 2C1R, half 1C3R).

But for 3rd cousins (or the above mentioned equivalent shared DNA relationships) or closer, as you've indicated, I use the AncestryDNA "Possible DNA Relationships" link, as well as the Shared cM Project Relationship Chart info (see below), relative to the actual documented paper trail relationship when evaluating AncestryDNA matches for possible confirmation here.

Also per the DNA Confirmation Help Page, the DNA source confirmation statement actually requires either the predicted relationship or the total shared DNA. I don't include in the source statement all the predicted relationships for the total shared DNA from the "Possible DNA Relationships" link, but I do include the total shared DNA. I've not found a way to link to the AncestryDNA "Possible DNA Relationships" link that is associated with the total shared DNA for the match at AncestryDNA.

John Kingman (in that same G2G thread to which you linked) had suggested also checking against this Shared cM Project Relationship Chart. In my DNA source confirmation statements, I usually link to the total cM value for the similar (more recent) chart at DNA Painter (example link for 80cM match).

The suggested 5% cutoff seems reasonable to me.

Mike: "perhaps the proposal should be what's the minimum cM threshold for a 3rd cousin." That would be a proposal for a reworking of the current DNA confirmation policy, and my experience so far has been that any reworkings based on suggestions just don't happen. That's why I didn't go any farther than I did in the OP. And using a minimum cM threshold as the criterion would complicate the current criteria anyway, since it would need to be determined separately for each company -- for instance, Ancestry uses Timber while FTDNA includes segments down to 5cM that should really be excluded. And then there's imputation thrown into the mix at MH and possibly FTDNA to complicate things. It sounds to me like the main principle in designing the DNA confirmation instructions was KISS, so using minimum thresholds will probably not happen.

When you say "shared cM count that supports that tree relationship" -- supports based on what criterion? Bettinger's histograms? Those are problematic in their own way. So the current policy uses the company's own determinations of predicted relationships rather than reference to cM histograms or anything else. The point of my post is that, for those who follow the current policy, Ancestry DNA's "predicted relationship" takes multiple forms, and my reading of the policy is that you can use any of them to justify your marking a relationship as confirmed with DNA.

Rick: my reading of the current WikiTree DNA confirmation help page is that it does not allow for using an external chart of estimated total cM for relationship type (i.e., Bettinger's histograms) for determining a predicted relationship. And that makes some sense, since different companies come up with drastically different cM totals for the same exact comparison of DNA kits -- the fact that Bettinger's charts ignore this difference is just one of the problems with his charts. But regardless, the current instructions don't allow for using Bettinger's charts.

You write, "Also per the DNA Confirmation Help Page, the DNA source confirmation statement actually requires either the predicted relationship or the total shared DNA. I don't include in the source statement all the predicted relationships for the total shared DNA from the "Possible DNA Relationships" link, but I do include the total shared DNA. I've not found a way to link to the AncestryDNA "Possible DNA Relationships" link that is associated with the total shared DNA for the match at AncestryDNA."

 Can you tell me what passage you are referring to on the DNA confirmation page in your first sentence there? I don't see anything down in the sample citations about an either/or. Even if that is there somewhere, my reading of the instructions is that if you don't get past step 5:

If the DNA testing company has predicted that you and your match are third cousins or closer and this corresponds with your genealogically-known relationship, continue.

→ If the DNA-predicted relationship does not correspond with your known relationship, more genealogy or DNA testing needs to be done. See Help:DNA Matches for tips on working with your matches.

then it doesn't matter what the source citation says -- if the DNA testing company did not state you were third cousins or closer, then you have to stop the process and cannot use that match to mark any relationship as confirmed with DNA.

Hi Barry,

I really don't believe that we are in disagreement on this topic. Hopefully this will answer your questions. Let me know if I miss something.

Regarding advancing past step 5: "If the DNA testing company has predicted that you and your match are third cousins or closer and this corresponds with your genealogically-known relationship, continue." ...

I believe I'm interpreting the guidelines as you've indicated: the confirmation guidelines do not "specify which prediction is used". AncestryDNA provides the "Possible DNA Relationships" percentage prediction link, which shows their percentages for the different relationships for their particular shared DNA calculation. I use their percentage prediction to determine if the paper trail genealogically-known relationship corresponds with the possible relationships that they provide.

Regarding "But regardless, the current instructions don't allow for using Bettinger's charts." ...

There isn't anything in the confirmation guidelines that precludes using additional criteria (in addition to AncestryDNA's "Possible DNA Relationships" percentage prediction already used) to verify the consistency of the relationship. So I confirm that the paper trail genealogically-known relationship is consistent with both the AncestryDNA "Possible DNA Relationships" percentage prediction and the Shared cM Project Relationship Chart. In the source statement, I'm unable to link to the AncestryDNA "Possible DNA Relationships" percentage prediction popup, but I do provide the total shared DNA amount that AncestryDNA reports and I link to the total cM value page for the Shared cM Project Relationship Chart at DNA Painter.

Regarding "Can you tell me what passage you are referring to on the DNA confirmation page in your first sentence there? I don't see anything down in the sample citations about an either/or." ...

The passage is actually in 2 places in the source requirements section of that help page (depending upon whether your match is on WikiTree or not): "here is what the source citation needs to include: ... Predicted relationship from the DNA testing company or amount of shared DNA".

Rick: sorry, I thought you had meant either the predicted relationship or the relationship documented with a paper trail (because I didn't read your comment carefully enough). Yes, the help page says either the predicted relationship or shared cM can go in the source citation. But it doesn't explicitly allow for anything other than predicted relationship earlier in the process, on step 5. As you say, you could use more -- but my post is not about going the extra mile. It was about getting through step 5 without getting shut down and being unable to continue.

You use the percentage prediction at Ancestry DNA, as do others. So yes, we are in agreement that it is fine. The only point of this post was to bring awareness to everyone else who doesn't know about those extra predictions that they exist and can be used. And also, since they can be in disagreement with the predicted relationship range on the main match page, they are worth checking. And that even though they can be in disagreement, it is fine to choose which one you use.

And since you wrote that you choose to use the percentage predictions yourself, then yes, we are in agreement.

Hi Barry,

It appears that AncestryDNA may be changing some of their "initial" relationship predictions to be more in line with the "Possible DNA relationships" popup that they provide which shows the percentages for the possible relationship groups based on the amount of shared DNA.

I've noticed that some of my AncestryDNA matches that originally had an "initial" prediction of 3rd-4th Cousin, now reflect a prediction of 2nd-3rd Cousin. Similarly, some of my AncestryDNA matches that originally had an "initial" prediction of 4th-6th Cousin, now reflect a prediction of 3rd-4th Cousin. All of which appear to be more in line with the actual relationships for those particular matches.

I'm seeing those changes in the match list and on a particular match's DNA comparison page. But interestingly, if I go to the match's Ancestry profile, it reflects the old predicted relationship. Also if I've messaged the match, the relationship prediction displayed for the match within the messaging system is also the old prediction.

2 Answers

+1 vote
Hmmmmm, so many points to be made here but let me start off saying that the goal, IMHO, should be to avoid false positives without excluding any true positives.  I hope we can all agree to that.  Next, I think it is reasonable to see that we have a grey area in that 3c and closer are not eligible for triangulation but 3c and closer below a certain level are not eligible for confirmation without triangulation.  OUCH!

With that said, let me make some observations in no particular order:

1) The image above might have the top three categories closer than 4th cousin, but the requirement is to be 3c or closer, so we can exclude any 3c1r relationships. The total confidence for 3c or closer, then, would be the sum of the 1st, 2nd, and 4th entries, or about 76%.

2) Since 3C and closer matches allow one to confirm the MRCA (a single person for half cousins and both for full cousins), I think for full cousins the number of matching segments needs to be some minimum. Obviously, it cannot be lower than two, and with only two, there is a 50/50 chance that they both come from the same person. If we can agree on an acceptable probability that they are from two separate persons, we can mathematically derive the number of segments needed to meet that probability, I should think.

3) For yDNA, we expect at least 90% of the STRs to match, so possibly we should look at using at least a 90% confidence that a match is 3C or closer.

4) The same DNA might register as beyond 3C on Ancestry, but by downloading the results and importing them to other platforms (e.g. GEDmatch, MyHeritage) one can often see them at 3C or closer. In some cases, a prediction is not even made; rather, one needs to take the segments and sizes and use a source such as ISOGG to help determine the probability for 3C or closer.

5) The major reason for the above is a filtering step made by Ancestry using Timber (see: https://blogs.ancestry.com/techroots/filtering-dna-matches-at-ancestrydna-with-timber/)

6) If WT uses minimum numbers of segments and minimum sizes to establish whether a match is good for confirmation, there would be quite a challenge dealing with the various sizes used by testing companies, especially in dealing with short segments.
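
The derivation mentioned in point 2 can be sketched under one simplifying assumption, which is mine and not anything a testing company states: each shared segment independently comes from either of the two common ancestors with equal probability. Under that assumption the chance that all n segments trace to a single ancestor is 2 × (1/2)^n, which reproduces the 50/50 figure for n = 2. Real segments differ in size and are reported differently by each company, so treat this as a rough illustration only. The function names are hypothetical.

```python
def prob_both_ancestors(n_segments):
    """Probability that n shared segments include at least one from each of
    the two common ancestors, assuming each segment independently comes from
    either ancestor with probability 1/2."""
    if n_segments < 2:
        return 0.0
    return 1.0 - 2 * 0.5 ** n_segments


def min_segments(target):
    """Smallest segment count whose probability meets `target`."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be at least 0 and below 1")
    n = 2
    while prob_both_ancestors(n) < target:
        n += 1
    return n


for target in (0.5, 0.9, 0.95, 0.99):
    print(target, min_segments(target))
```

Under this idealized model, a 90% target would need five segments and a 99% target would need eight.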

I could go on, but the challenge is to decide not only what constitutes a reasonable match but also how that would be dealt with by the various testing providers and GEDmatch. A second challenge is what to do with matches at the 3C or closer level that do not meet the guidelines, as triangulation is not an option under current rules (I would say that three 3C matches could be a good triangulation, or even two 3C matches and a 4C, but maybe that's just me).
by Thom Anderson G2G6 Mach 5 (50.4k points)
+2 votes
Slightly off-topic, but it should be noted that there is an error on the DNA Confirmation help page. It says, "If your DNA test match is a third cousin, a second cousin once removed, or closer, continue." The second cousin equivalent to third cousin is actually second cousin TWICE removed, since it is the number of recombination events (DNA passings) which determines how much DNA you and your match are likely to share. That number is eight for third cousins and only seven for second cousins once removed.
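
The count can be checked with a short sketch. It assumes the usual convention that first cousins are separated by four DNA passings, with two more for each additional cousin degree and one more for each generation removed; the function names are hypothetical. The expected fractions are long-run averages, and actual sharing varies widely around them.

```python
def meioses(degree, removed=0):
    """Number of DNA passings separating two cousins. `degree` is 1 for
    first cousins, 2 for second cousins, and so on; `removed` is the number
    of generations removed."""
    return 2 * (degree + 1) + removed


def expected_shared_fraction(degree, removed=0, half=False):
    """Average fraction of autosomal DNA shared. Full cousins share through
    two common ancestors, half cousins through only one."""
    ancestors = 1 if half else 2
    return ancestors * 0.5 ** meioses(degree, removed)


print(meioses(3))     # 8 for third cousins
print(meioses(2, 1))  # 7 for second cousins once removed
print(meioses(2, 2))  # 8 for second cousins twice removed

# Half relationships with 7 passings average the same sharing as full 3C.
print(expected_shared_fraction(3) == expected_shared_fraction(2, 1, half=True))
```

The same arithmetic gives equal average sharing for 3C, 2C2R, half 2C1R, and half 1C3R, which are the equivalents listed earlier in this thread.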
by Bennet George G2G6 Mach 1 (12.3k points)
