AncestryDNA Matching White Paper

+17 votes

Haven't seen this info posted yet, but Ancestry is about to change their matching criteria. 8cM will be the new minimum, so the number of matches should be dramatically reduced.

AncestryDNA Matching White Paper (15 Jul 2020)

in The Tree House by Mike Wells G2G6 Mach 4 (48.4k points)
retagged by Ellen Smith
Found it. Thanks!

The purge has occurred.

Dad prior:

  • All Matches: 111,669
  • "Close" Matches (20-3,480cM): 5,830
  • Distant Matches (6-20cM): 105,839
Dad now:
  • All Matches: 90,433
  • Close Matches: 5,888
  • Distant Matches: 84,545
Mom prior:
  • All Matches: 48,139
  • "Close" Matches (20-3,480cM): 1,208
  • Distant Matches (6-20cM): 46,931
Mom now:
  • All Matches: 41,798
  • Close Matches: 1,224
  • Distant Matches: 40,574

Thanks for the heads-up, Darlene!

Mine from about 10 hours prior (as per my comment above):

  • All Matches: 50,290
  • "Close" Matches (20-3,490cM): 1,155
  • Distant Matches (6-20cM): 49,135

Mine now:

  • All Matches: 24,301
  • "Close" Matches (20-3,490cM): 1,155
  • Distant Matches (6-20cM): 23,146

I had made Notes on any of my matches with "Common Ancestors" which could have a largest segment < 8.0 cM, and those appear to remain in my matches list. Hopefully I'll have time to look at them eventually.

They just couldn't give me a couple more days to get to 160,000 total, could they?

My post-purge match statistics 31 Aug 2020:

  • All Matches: 83,530
  • "Close" Matches (20-3,480cM): 6,279
  • Distant Matches (6-20cM): 77,251

"Distant Matches" lost: 76,319
Percentage lost: 49.7%

In the "distant match" department Rick came in at a 52.9% loss. Darlene's dad lost 14.1% and her mother 13.6%. But Darlene ran the automated utility to save matches of 7.9cM and lower. It'll be interesting to see whether Rick's and my ~50% hold up as a possible median.
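For anyone who wants to check the arithmetic behind these loss figures, here is a minimal Python sketch. The helper name `pct_lost` and the kit labels are my own; the counts are the "Distant Matches" figures as they appear in this thread, and the pre-purge count for the 31 Aug post is reconstructed as 77,251 remaining plus 76,319 lost.

```python
def pct_lost(before: int, after: int) -> float:
    """Percentage of matches lost between two counts, rounded to one decimal."""
    return round(100 * (before - after) / before, 1)


# (distant matches before, distant matches after), as posted in this thread
distant = {
    "Rick": (49_135, 23_146),
    "31 Aug post": (77_251 + 76_319, 77_251),  # prior count reconstructed
    "Darlene's dad": (105_839, 84_545),
    "Darlene's mom": (46_931, 40_574),
}

for kit, (before, after) in distant.items():
    print(f"{kit}: lost {before - after:,} of {before:,} ({pct_lost(before, after)}%)")
```

With the counts as posted, Rick comes out at 52.9% and the 31 Aug post at 49.7%. Darlene's parents come out at about 20.1% and 13.5%, which differs from the 14.1% and 13.6% quoted, so the posted counts may not be the ones that comment was based on.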

Edited: I didn't see Rick's post when I was writing mine. Cannot not include him!

My impression is that Darlene took steps to save a lot of the distant matches for her parents. I only tried to save my matches with "Common Ancestors" that might be removed, which I believe was a significantly smaller percentage than what Darlene saved. We took steps similar to mine with the other 3 kits there to which I have access, and those all also had a drop of 50% or more in matches.

I believe that most AncestryDNA customers took no steps at all to retain potential DNA matches, let alone anything in the range of what Darlene and I did, so I wouldn't be surprised if the AncestryDNA match database size has been reduced by 50%.
Egads!  I'm so glad I ran the utility!  I'd be devastated to have lost 50% . . .

The reason I wanted to keep the smaller ones is that I have found them useful when researching some of my brick walls.  When I get several small shared cM matches, along with numerous larger shared cM matches, that trace to a common ancestor (which I've determined by researching their trees, not because ThruLines shows a common ancestor), I feel more confident that I may be chasing the right family.  I can't be super confident until I can find people on other sites and can locate a TG, but it gives me hope and thus I continue researching that possibility.
Blaine Bettinger likes to do polls, so I'm thinking he may set up a new one to see what average % of matches people lost.
I lost 46% after I managed to save some manually. Didn't know about the tool. My aunt lost 66% and my great uncle lost 60%. Darlene's explanation for the utility of these distant matches is the same as mine. However, this doesn't seem to be the typical customer use case. From posts here and elsewhere, it seems that many people are unaware of the value of these matches.

Even if, say, 90% of those 6-7cM matches were somehow incorrect and a wild goose chase, the remaining 10% surname candidates STILL would be potentially useful for brick wall research and well worth the effort.
I spent days adding those matches to groups.  I queried for the surnames in my lines and used that to add them to groups.  Then any left over, I added to a To Be Checked group, so I lost nothing...  But my fingertips are numb....
I made only very modest efforts to save matches < 8 cM (filtering on common ancestor, etc.). However, I made a list of a few matches with 2 segments, where one of the segments must logically be < 8 cM (e.g. 13 cM total across 2 segments). The ones I listed (just a sample, mind you) still show the same total cM after the purge. From the white paper, it sounded like short segments would be discarded early in the process, but that doesn't appear to be the case.
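The "must logically be < 8 cM" reasoning is just a pigeonhole argument: if every segment were at least 8 cM, the total would be at least 8 cM times the number of segments. A minimal Python sketch of that check follows. The function name and parameters are hypothetical, and it only uses the total and the segment count, since those are all the match list shows.

```python
def must_have_short_segment(total_cm: float, num_segments: int,
                            threshold_cm: float = 8.0) -> bool:
    """True if at least one segment is necessarily shorter than threshold_cm.

    If every segment were >= threshold_cm, the total would be at least
    num_segments * threshold_cm, so a smaller total forces a short segment.
    """
    if num_segments < 1:
        raise ValueError("num_segments must be at least 1")
    return total_cm < num_segments * threshold_cm


print(must_have_short_segment(13, 2))  # True: 13 < 2 * 8
print(must_have_short_segment(20, 2))  # False: could be 10 + 10 or 14 + 6
```

A `False` result does not prove that all segments are 8 cM or longer. It only means the total cannot settle the question.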

8 Answers

+13 votes
Best answer
I saw this announcement from Roberta Estes two nights ago.  I immediately set about assigning a group to a lot of my smaller matches to avoid them being eliminated.  Those smaller matches have led me to researching some lines I wouldn't have previously considered.  Unfortunately, so many people are doing this right now that the ancestry site keeps bombing out -- saying that their backend servers are overloaded.  Frustrating!!

If ancestry would use their head, they would offer a chromosome browser and charge for accessing it.  I would gladly pay $5/month to have access to it.  I understand they are trying to avoid having to spend money to get more computer power.  But I think there are many of us willing to pay to be able to 'unlock' the huge benefit of a chromosome browser, which would increase their revenue . . .
by Darlene Athey-Hill G2G6 Pilot (437k points)
selected by Jana Shea

There is a petition to try to get Ancestry to provide a chromosome browser.

Marty, thank you for pointing this out.  I have signed the petition and shared it on Facebook.  Will everyone else please do the same?  Power to the people!!
Signed petition and donated.  Would be nice if Ancestry also added the features offered by Genetic Affairs. I spend too much time doing work better suited for computers.

I wonder if the recent sale of Ancestry to Blackstone will bring some of the changes that seem to have been bogged down.  When a merger is pending, most companies kind of coast until the deal is done, and then a lot of internal negotiations begin...

+13 votes

Thanks, Mike. And here are Roberta Estes's takes on why this is happening and what actions you should take in the next couple of weeks to preserve what will otherwise be lost:

by Barry Smith G2G6 Pilot (219k points)
Thanks Barry. For sure I've been able to make use of many < 8cM matches' trees.  On the whole, it will be quite a loss of potential ancestral information for everyone.
I will filter to 6-7 cM and Common Ancestors and tag them, at least to save the ones with Common Ancestors. From what I understand, I don't think I need to filter to 8 cM to pick up the 7.6s and the like. I'm probably a bit less passionate about this than others, but just in case...
If you have someone shown as 12cM on 2 segments or 18cM on 3 segments, Estes suggests you may lose such a person. It isn't clear to me if they are filtering out every *segment* under 8cM or just matches with total cM under 8cM, but Estes implies it might be the former. I looked at my entire list of common ancestors and grouped all of these people as well.
I just noticed my stepfather is tagged as my birth father. I dropped him completely, but have to wait for the information to be rebuilt by Ancestry to avoid all the false DNA matches and ThruLines on his side of the tree, which is also deep at the 6-7cM level. I will start with my other DNA kits that don't have this problem. I will probably just tag them with a new group, "Former Distant 6-7cM Common matches".
+6 votes

I have copied and am pasting information I received about three days ago from our local genealogical society DNA SIG regarding Ancestry changing their matching criteria:

On the AncestryDNA Matching FB page, there is an article in the LostCousins newsletter regarding what appears to be some pending changes at Ancestry in coming weeks.  The article is copied below and the link is as follows:
The sender says he is taking steps to save matches in the 15-19 cM range.
In the same conference call I learned that Ancestry are in the process of updating their DNA matching criteria, and that as part of this process almost all  matches where users share less than 8cM will be removed, probably next month (see below for the exceptions). The current threshold is 6cM, and I estimate that as many as 8000 of my 24000 matches will be lost.
The aim is to remove false matches – matches that occur by chance, or because of statistical anomalies. But whilst improving the quality of matches is important, it's inevitable that many valid matches will be discarded. Indeed matches could disappear even if Common Ancestors have been identified.
However, if you’re quick there's a possible solution - I've been advised that matches of under 8cM won't disappear should any one of the following apply:
  1. You've added them to a group (using one of the 32 user-definable coloured circles)
  2. You've entered something in the Notes field
  3. You've sent a message to the other member
I suggest you give priority to those where common ancestors have been identified. This won’t take very long – in my case just 6 of my 75 'Common Ancestors' matches share less than 8cM of DNA and I'd already made a note against all of them. What I hadn't done, however, was go through the same process for all of my cousins – I manage about a dozen tests for relatives – so that's what I'm going to be working on over the next week.
I'm also going to add notes against matches who have surnames in their tree that correspond to my major 'brick walls' - any one of those matches could provide a vital clue! (This is particularly important where the other user's tree is private since I won’t have been able to evaluate the match.) And again, I'm going to have to repeat the process for the cousins whose tests I manage.
I understand that before the end of this week Ancestry will be adding a message to the DNA page notifying users of the impending change, and that they'll also be publishing a White Paper describing the updated matching process, but I wanted LostCousins members to have as much time as possible to prepare for the change.
by Carol Baldwin G2G6 Pilot (521k points)

Ancestry's announcement specifically states: "Our updated matching algorithm will increase the likelihood you are actually related to your very distant matches. As a result, you’ll no longer see matches (or be matched to people) that share less than 8 cM with you - unless you have added a note about them, added them to a custom group or have messaged them. These changes to the matching algorithm will reduce the total number of DNA matches you have and the number of new matches you will receive. It may also affect the number of ThruLines you may see."

Note that the emphasis is mine, not theirs.  So bottom line is that as long as they share 8cM or higher with you, you should still see them.
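To spell out the rule as the announcement words it, here is a minimal Python sketch. This is only my reading of the quoted text, not Ancestry's actual code, and the `Match` class, its field names, and `stays_in_list` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """Hypothetical record for one DNA match (field names are assumptions)."""
    shared_cm: float
    has_note: bool = False
    in_custom_group: bool = False
    messaged: bool = False


def stays_in_list(m: Match, threshold_cm: float = 8.0) -> bool:
    """Apply the rule as worded in the announcement quoted above."""
    if m.shared_cm >= threshold_cm:
        return True
    # Below the threshold, a match is kept only if the user acted on it.
    return m.has_note or m.in_custom_group or m.messaged


print(stays_in_list(Match(shared_cm=7.2)))                        # False
print(stays_in_list(Match(shared_cm=7.2, in_custom_group=True)))  # True
print(stays_in_list(Match(shared_cm=8.0)))                        # True
```

The sketch takes the shared cM at face value. It does not try to guess how Ancestry rounds the displayed figure or recalculates it under the new algorithm.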

+5 votes
It seems somewhat unclear EXACTLY what's going on. At the start of their official notice, it says they're "changing the way we calculate the amount of DNA you share with your matches". That indicates that the cM number would change (which would mean you don't really know which matches are safe from the Big Purge).

But the follow-up details only talk about the lengths of segments - it makes it sound like in cases where there's a short gap between segments that they'll count that as one larger segment instead of two smaller ones. It says that won't affect the cMs, but the number of segments will drop in some cases.
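If that reading is right, the idea would look something like the Python sketch below. To be clear, this is a guess at the general technique (merging neighbouring intervals), not Ancestry's algorithm. The `Segment` fields, the position units, and the `max_gap` value are all assumptions for illustration.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    chrom: int
    start: int   # start position (hypothetical units)
    end: int     # end position (same units)
    cm: float    # length of the segment in cM


def merge_close_segments(segments: List[Segment], max_gap: int) -> List[Segment]:
    """Join neighbouring segments on the same chromosome when the gap is small.

    The merged segment's cM is the sum of its pieces, so the total shared cM
    stays the same while the segment count drops.
    """
    merged: List[Segment] = []
    for seg in sorted(segments, key=lambda s: (s.chrom, s.start)):
        prev = merged[-1] if merged else None
        if (prev is not None and prev.chrom == seg.chrom
                and seg.start - prev.end <= max_gap):
            merged[-1] = Segment(prev.chrom, prev.start,
                                 max(prev.end, seg.end), prev.cm + seg.cm)
        else:
            merged.append(seg)
    return merged


segs = [Segment(1, 100, 200, 6.0), Segment(1, 210, 300, 7.0), Segment(2, 50, 150, 9.0)]
out = merge_close_segments(segs, max_gap=20)
print(len(segs), "->", len(out), "segments")
print("total cM:", sum(s.cm for s in out))
print("longest segment:", max(s.cm for s in out))
```

In this made-up example the total stays at 22 cM, the count goes from 3 segments to 2, and the longest segment becomes 13 cM, which is the kind of change the notice seems to describe.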

And they'll tell you what the longest segment is. I suppose that's a bit more of a glimpse into what's going on under the hood, but I don't see doing a whole lot with just that.

Maybe the White Paper would illuminate the situation, but the last one of those I saw was pretty long, and didn't look like it was necessarily really telling me much.
by Frank Stanley G2G6 Mach 6 (67.0k points)
+6 votes

Here is Blaine Bettinger's FB thread. As usual, there are passionate opinions. And lots of them. Blaine's take is a bit more nuanced than the heading might lead you to believe.

by Marty Acks G2G6 Pilot (118k points)
+7 votes

And here is Judy Russell's (The Legal Genealogist) take: 

by Marty Acks G2G6 Pilot (118k points)
+3 votes
I don't understand what all the concern is about -- so many of those low "matches" are not true matches anyway (Identical by State; IBS).

Personally, I'd be glad to have most of the noise removed.

But I never look at anything below a total 30cM anyway. And most of those seem to be IBS too.

The math and statistics modeling is improving all the time, so I'm sure there will be more changes like this in years to come.
by Dennis Wheeler G2G6 Pilot (537k points)
The concern is for the loss of clues to common ancestral lines. If the 6-7cM matches are lost, then the info from those matches' trees is most likely forever lost as well. So very simply, your matches could have had info on your common ancestors which you did not and would otherwise not have known about.
not very likely
I had over 100 that Ancestry marked with "common ancestor" in 6-7cM match range. Had they not bothered to include our MRCA, that info would be lost. Correct??
0 votes

The bad news... I lost about half of my total matches... down to about 45K total.  Sad to think that this may have eliminated some 3rd cousins who could be as low as 0 cM.

The good news though is that it did keep my matches that I went through and marked/categorized, even with just a "star" and no notes! So if you at least "starred" a lower cM match, it seems to have stayed.

This was just a starred match

by Ken Parman G2G6 Mach 4 (45.6k points)
