Discrepancies between GEDMATCH one-to-one and one-to-many results

+11 votes
689 views
I have been looking closely at the DNA data of a third cousin of mine who tested on 23andMe and uploaded his data to Gedmatch. (We are trying to find his biological father.) I am seeing puzzling discrepancies between Gedmatch's one-to-one and one-to-many comparisons, and I wonder if others here have insights into what's going on.

When I do a one-to-one autosomal comparison on Gedmatch Genesis, using default settings, we match on 50 cM over 4 autosomal segments (longest segment 19 cM), plus 18 cM on one segment on the X chromosome. This is roughly the same as what 23andMe shows -- specifically, matches of 56 cM over 6 autosomal segments (longest segment 17 cM), plus 14 cM on one segment on the X chromosome. (These differences between 23andMe and Gedmatch are attributable to different data-handling protocols.)

But when I do a one-to-many autosomal comparison on Gedmatch Genesis, the report shows a total match of only 19 cM with the largest segment being only 11 cM. That's very inconsistent with the total match of 50 cM and longest segment of 19 cM that the one-to-one report gives me.

Is this a one-off glitch, or is it explained by something about the way Gedmatch Genesis interprets data? I think this may be important to understand, because it affects the usefulness of the one-to-many report on Gedmatch.
in The Tree House by Ellen Smith G2G Astronaut (1.5m points)
I haven't personally analysed the differences, but I know that for the previous classical GEDmatch, there were a number of reports of differences.  Somewhere, there was an instruction to use one-to-many as a finding tool, then always confirm it with a one-to-one.  The one-to-many tool appeared to work as a 'low resolution' analysis, which required a 'high resolution' one-to-one comparison for accurate results.

However, the differences reported then were not nearly as significant as what you are seeing.  It would be good to have an academic study of the difference, to better understand the error margins we can expect.  I very much agree with your final sentence.

I've also seen that "one-to-many" sometimes suggests better matches than appear in a one-to-one comparison that uses the recommended settings in Gedmatch, but this is the opposite situation. The one-to-one comparison shows a much better match than appears in the one-to-many list.

3 Answers

+6 votes
Do you know which chips the two 23andMe tests were performed with?  If one was recent, using the v5 chip, and the other is more than a couple of years old, so done with the v4 or v3 chip, then the two tests will have measured different collections of SNPs for each tester.  The discrepancies could then be the result of overlap thresholds.  If you aren't sure, look at the "overlap" column in the one-to-many report.  Is your overlap number with the other kit highlighted in pink or red?  If so, you will have tested with different chips.  In that case I would suggest dropping the "SNP window size threshold" in the one-to-one comparison to maybe 100 SNPs to see if some of the missing segments reappear.

To call a segment a match, the Genesis one-to-one algorithm default setting requires the segment to have somewhere between 200 and 400 SNPs measured in both kits that match (except that a few discrepancies are allowed, as long as they aren't bunched too closely together).  The one-to-one report tells you that this 200 to 400 threshold is dynamic during the algorithm, so it's hard to say which segments in that range will make the cut.  The cMs of a segment, on the other hand, are determined by where on the chromosome the segment begins and where it ends -- the number of matching positions measured along that segment is irrelevant.  You can have a high-cM segment where only 100 or 200 positions along the segment were measured in common, and if you had measured more positions you might have found some mismatches.
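To make the effect of the SNP threshold concrete, here is a minimal sketch in Python. It is not GEDmatch's actual algorithm: it ignores the allowance for a few discrepancies, the dynamic 200-to-400 window, and the cM calculation, and the names find_segments and min_snps are my own. It only shows how the same pair of kits can yield different segments depending on the minimum run of matching SNPs required.

```python
def half_identical(g1, g2):
    """True if two unphased genotypes (e.g. 'AG' and 'GG') share at least one allele."""
    return bool(set(g1) & set(g2))


def find_segments(positions, kit_a, kit_b, min_snps):
    """Return (start_pos, end_pos, snp_count) for every unbroken run of
    half-identical SNPs that contains at least min_snps positions.

    positions    : sorted base-pair positions measured in BOTH kits
    kit_a, kit_b : dicts mapping position -> genotype string
    min_snps     : hypothetical stand-in for the SNP window threshold
    """
    segments = []
    run = []
    for pos in positions:
        if half_identical(kit_a[pos], kit_b[pos]):
            run.append(pos)
        else:
            # A mismatch ends the current run (the real tools tolerate a few).
            if len(run) >= min_snps:
                segments.append((run[0], run[-1], len(run)))
            run = []
    if len(run) >= min_snps:
        segments.append((run[0], run[-1], len(run)))
    return segments


# Toy data: 150 matching SNPs, one mismatch, then 300 matching SNPs.
positions = list(range(1000, 1000 + 451 * 1000, 1000))
kit_a = {p: "AG" for p in positions}
kit_b = {p: "GG" for p in positions}
kit_a[positions[150]] = "GG"
kit_b[positions[150]] = "AA"  # no shared allele at this position

strict = find_segments(positions, kit_a, kit_b, min_snps=200)
relaxed = find_segments(positions, kit_a, kit_b, min_snps=100)
print(len(strict), len(relaxed))  # prints: 1 2
```

With the threshold at 200 only the 300-SNP run is reported; lowering it to 100 brings back the 150-SNP run as well. Two tools using different thresholds on the same data can therefore disagree on both the segment count and the total.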

I do know from experience that there are different SNP thresholds used in the one-to-one comparison than in the two-to-many comparison, and so presumably also for the one-to-many comparison.  A few weeks ago a match contacted me about a 12 cM segment that is a confirmed false match (neither of my parents matches that fellow on the same segment).  The overlap was only 250 SNPs, and that didn't make the cut in a one-to-one comparison even though it showed up in the two-to-many comparison he had run.
by Barry Smith G2G6 Pilot (292k points)

Sorry, but both of us tested at 23andMe several years ago, with the V3 chip. This cousin of mine wasn't on Gedmatch until recently (he joined at my suggestion), but he has been on my DNA relatives list on 23andMe since I first tested there.

And our "overlap" measurement at Gedmatch is well above 400,000 -- among the highest values I've seen there (definitely not one of those pink values, which are around 80,000 or less).

Note: This is not a case where the one-to-many comparison suggests a match that doesn't appear in one-to-one. This is the opposite situation: a match that is much better in one-to-one than in one-to-many.

Ah, okay.  If your match is willing to share his data file with you, you could run some more detailed tests.  For instance, David Pike's tool will tell you not just matching regions but also how many no-calls and mismatches there are within the matching segments.  Other than that, I'm out of ideas without knowing the details of GEDmatch's algorithms, which, I believe, are subject to change.  (Perhaps a bug was introduced?)
An easier step would be to run the DNA File Diagnostic Utility at GEDmatch on both files, if you haven't done so already.  If one file has poor quality because of a high number of no-calls or mis-calls, then the divergent results could be explained if the one-to-many and one-to-one algorithms have different tolerances for these errors.  This is especially the case since the results you reported suggest that several of these segments are on the small side.  

But the sure-fire method is to get both data files and compare for yourself the regions where the one-to-one reports a match.  If you scan through the few hundred to few thousand SNPs in those regions, you can see for yourself whether they appear to be matching segments.
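If you would rather not scan the SNPs by eye, the comparison can be sketched in a few lines of Python. This assumes both files are in the 23andMe raw-data layout (tab-separated rsid, chromosome, position, genotype, with comment lines starting with "#" and no-calls written as "--"). The function names read_raw and compare_region and the sample rows are made up for illustration.

```python
def read_raw(lines):
    """Parse 23andMe-style raw data lines into {(chromosome, position): genotype}."""
    calls = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        rsid, chrom, pos, genotype = line.split("\t")[:4]
        calls[(chrom, int(pos))] = genotype
    return calls


def compare_region(kit_a, kit_b, chrom, start, end):
    """Count matches, mismatches and no-calls among SNPs present in both kits
    on one chromosome between start and end (inclusive)."""
    counts = {"match": 0, "mismatch": 0, "no_call": 0}
    for key in kit_a.keys() & kit_b.keys():
        c, pos = key
        if c != chrom or not (start <= pos <= end):
            continue
        ga, gb = kit_a[key], kit_b[key]
        if "-" in ga or "-" in gb:
            counts["no_call"] += 1   # at least one kit has no reading here
        elif set(ga) & set(gb):
            counts["match"] += 1     # half-identical: at least one shared allele
        else:
            counts["mismatch"] += 1
    return counts


# Made-up sample rows; a real file has several hundred thousand lines.
file_a = "# rsid\tchromosome\tposition\tgenotype\nrs1\t1\t1000\tAG\nrs2\t1\t2000\tCC\nrs3\t1\t3000\t--\nrs4\t1\t4000\tTT\nrs5\t2\t1000\tAA\n"
file_b = "# rsid\tchromosome\tposition\tgenotype\nrs1\t1\t1000\tGG\nrs2\t1\t2000\tTT\nrs3\t1\t3000\tAA\nrs4\t1\t4000\tCT\nrs5\t2\t1000\tAA\n"

kit_a = read_raw(file_a.splitlines())
kit_b = read_raw(file_b.splitlines())
result = compare_region(kit_a, kit_b, "1", 1000, 4000)
print(result)  # prints: {'match': 2, 'mismatch': 1, 'no_call': 1}
```

For real files you would pass an open file object to read_raw and use the start and end positions that the one-to-one report lists for each segment. A region with many no-calls or scattered mismatches is the kind of segment that two algorithms with different tolerances could score differently.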
+4 votes
You should trust the One-to-one match over a One-to-many match.
by Peter Roberts G2G6 Pilot (705k points)
Yes, but if the match is pretty far down on a person's one-to-many list, it's unlikely ever to be noticed.

ADDED: This cousin is one of my strongest matches in Gedmatch (based on the one-to-one data), but he shows up around number 500 on my one-to-many list (down among the 5th cousins).
+3 votes

I noticed some differences last Feb and did some analysis on it.  I am an Adoption Angel and we use these reports a lot so when I started seeing some unexpected results I alerted Emma and then engaged Ed Williams to look at the data with me.  Because so much of the data is confidential I can't share it.  What I can share is what I found looking at my own data as a control group if you will...  

I had uploaded to the original site 2 kits.  They were both migrated to Genesis.  That data is 100% the same in both places.  

I also uploaded a kit from the original site to Genesis directly.  

And then Genesis took one of my kits and uploaded it to Genesis in their migration so I have 3 kits showing:

When I log into Genesis as myself I see 3 kits for me.  All 3 came from two FTDNA files.  One is out of 36, which is T129400; this is the kit I ran the One-to-Many for in Genesis.

One is out of 37, which is the second swab for my original Au test with FTDNA.  This is kit T336731.

And the third one is GU2396830, which is the upload I did to Genesis a while back, from 37 I think.

Ellen, the system kept saying it was too long, so I deleted the charts.  I sent you the whole email via your personal email with the charts in it.  But for everyone else I am leaving the prose part:
Now, the first view is showing at a much higher level than the Genesis view and the only thing that makes sense is that the new chip is not testing the same locations to the same degree because these tests came from the same test.  And they matched perfectly in the first site.  It seems to me that the match showing in the new Genesis tool is only comparing the old matches and not the new Genesis uploaded data because that does not even match myself...   

Ellen, contact me directly because of the confidentiality issue.  I have permission to share a few things I have just with you to help you see what I saw.  My tests were all with FTDNA.

I have seen similar results from Ancestry, but I have to say I dread anyone who has used 23andMe because that data is often not consistent with what we see for the same testers coming from other sites.  There appear to be additional variables there we have to deal with.  I also seem to find that they show as a more distant relationship than they might actually be.  These statements are based on observing a number of tests for adoptees who used 23andMe and also used Ancestry, FTDNA, or both.  It is not a large batch of people, so really more anecdotal than a research study.

Hope that helps.

by Laura Bozzay G2G6 Pilot (832k points)

Thanks, Laura. I'm coming to the conclusion that we can't always place a lot of credence in what we see on Gedmatch.

I think we have to learn how to read it and know when things might be off due to testing company data not overlapping like we were used to seeing.  I just wish they would talk to people like Ed Williams because he really gets the DNA thing and could help them make this work better I am sure.  

What I am doing is putting more faith in the migrated packages than the new Genesis ones, because I see more consistency there right now.  Then when I see a match with a Genesis kit I know it is likely closer than displayed.  So all is not lost.  Then I run matches through the People Who Match 1 or 2 kits to identify branch lines.


...