How do I manage duplicate kits at GEDmatch?

+22 votes
2.5k views

If you tested with different labs and have more than one kit in GEDmatch, then “You should have your preferred kit set to Public, and any identical kits (same person but from other testing companies) set to Research.  Leaving multiple identical kits set to Public clogs up everyones match lists and reduces the effectiveness of many of the tools (eg. Triangulation reports).”

This and other great GEDmatch tips and information are at

https://web.archive.org/web/20220310183857/https://genie1.com.au/tips-for-using-gedmatch/

in The Tree House by Peter Roberts G2G6 Pilot (694k points)
edited by Peter Roberts
Here are step-by-step instructions for changing a duplicate kit to Research mode:

Log in to your GEDmatch account.

Near the bottom left of the main page, click the yellow pencil icon to the right of the duplicate kit to be changed.

To the right of “Public Profile”, click the radio button for Research and then click the Change button.
I changed it to Research. I did not know I had to have only one public kit. Sorry about that. The reason I took the second test with FTDNA is that I thought the Ancestry test was not correct and was not showing cousins on both sides. By comparing both tests on GEDmatch I was able to prove that both tests were identical.

Thanks
Harry, I had the same sense about the difference between the two testing companies, because Ancestry's follow-through on my 2nd test (a waste of money and time) seemed not to be handled well at all. But you now say that you've proven that the tests were "identical." What made you think that? It's so easy to draw false conclusions, so I had to ask.
I do NOT have a duplicate kit on GEDmatch; however, on the GEDmatch main page my daughter's kit says it may be a duplicate and might need to be deleted. It is my daughter's, not mine. Do you know how I can fix this? Thank you

7 Answers

+10 votes
 
Best answer

The Tier1 tools at GEDmatch.com now have a "Combine multiple kits into 1 superkit" tool that allows you to take kits from multiple test companies and combine them into one superkit.  I'd recommend paying $10 for one month of access to the tools, creating this superkit, making it the public one, and setting the others to Research.

Read more about it in Kitty Cooper's blog post.

by William Foster G2G6 Pilot (120k points)
selected by Peter Roberts

Or, instead of paying $10, one could just combine the kits using some free source or some POSIX commands.

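for i in <list of DNA results files>; do
    # Drop comment and header lines, strip quotes and carriage returns, turn commas into tabs.
    grep -v -e '^#' -e '^rsid' -e '^RSID' "$i" | tr -d '"\r' | tr ',' '\t'
done | awk -F '\t' -v OFS='\t' '
    { g = (NF >= 5) ? ($4 $5) : $4 }        # join alleles that come in two columns
    NF < 4 || g ~ /[-0]/ { next }           # skip short lines and no-calls (-- or 00)
    length(g) == 2 && substr(g, 1, 1) > substr(g, 2, 1) { g = substr(g, 2, 1) substr(g, 1, 1) }  # CA -> AC, etc.
    !seen[$2, $3]++ { print $1, $2, $3, g } # keep the first call per chromosome and position
' > superkit.txt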
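
For anyone who wants to see what commands like the ones above do before trying them on real data, here is a small worked sketch. Everything in it is an assumption made for illustration: the function name combine_kits, the file names, and the two tiny made-up sample files (one in a five-column tab-separated layout, one in a quoted comma-separated layout). It is not an official GEDmatch method, only one way to merge the files and keep a single call per chromosome and position.

```shell
#!/bin/sh
# Hypothetical helper: merge the raw-data files given as arguments
# into one tab-separated list (rsid, chromosome, position, genotype).
combine_kits() {
    for f in "$@"; do
        # Drop comment/header lines, strip quotes and CRs, commas -> tabs.
        grep -v -e '^#' -e '^rsid' -e '^RSID' "$f" | tr -d '"\r' | tr ',' '\t'
    done | awk -F '\t' -v OFS='\t' '
        { g = (NF >= 5) ? ($4 $5) : $4 }      # join two allele columns
        NF < 4 || g ~ /[-0]/ { next }         # skip short lines and no-calls
        length(g) == 2 && substr(g, 1, 1) > substr(g, 2, 1) {
            g = substr(g, 2, 1) substr(g, 1, 1)   # CA -> AC, TG -> GT, ...
        }
        !seen[$2, $3]++ { print $1, $2, $3, g }   # first call per position wins
    '
}

# Made-up sample data, for illustration only.
workdir=$(mktemp -d)
printf '#made-up sample\nrsid\tchromosome\tposition\tallele1\tallele2\nrs1\t1\t100\tC\tA\nrs2\t1\t200\t0\t0\nrs3\t2\t300\tG\tG\n' > "$workdir/kit_a.txt"
printf 'RSID,CHROMOSOME,POSITION,RESULT\r\n"rs1","1","100","AC"\r\n"rs2","1","200","TG"\r\n"rs4","2","400","--"\r\n' > "$workdir/kit_b.csv"

combine_kits "$workdir/kit_a.txt" "$workdir/kit_b.csv" > "$workdir/superkit.txt"
cat "$workdir/superkit.txt"
```

With the sample data this prints three lines: rs1 (AC) and rs3 (GG) from the first file, and rs2 (GT) from the second file, because the first file had a no-call there. To try it on real downloads, pass your own file names to combine_kits and redirect the output to superkit.txt.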

Thanks Thom!  To anyone who understands, please provide step by step instructions (or video) on how to use a free source or POSIX commands to combine the kits with the coding shown.
If you have a Macintosh computer, you can open a shell terminal and run the command there.  But if this seems daunting, I'd recommend the $10 option at GEDmatch, since the other tools you get for 30 days, like DNA triangulation, are very much worth it.
The example of the POSIX commands mostly illustrates how simple and straightforward the process is.  Rather than posting instructions, a web app or browser app could be a better approach.  That said, William Foster's point is on the mark.  For example, I would not see the need to spend $10 simply to create a superkit, but I do buy Tier 1 services to use the other tools, which, of course, include their superkit creation service.
One difference between doing it on GEDmatch versus externally is that with GEDmatch, a kit is created by combining other kits without making that kit available for download, AFAIK.

External methods rely on the original result files from the corresponding test providers that are then combined and can be uploaded to GEDmatch, Family Tree DNA, Living DNA, and MyHeritage.  [I have only tested uploading to GEDmatch.]
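
Before uploading an externally combined file anywhere, a quick sanity check can help. The sketch below assumes the combined file is tab-separated with rsid, chromosome, position, and genotype columns; the function name check_kit and the sample data are made up for illustration. It counts the lines and reports how many repeat a chromosome and position that was already seen.

```shell
#!/bin/sh
# Hypothetical check: count SNP lines and repeated chromosome/position pairs.
check_kit() {
    awk -F '\t' '
        { n++; if (seen[$2, $3]++) dup++ }
        END { printf "snps=%d duplicates=%d\n", n, dup }
    ' "$1"
}

# Made-up sample: the third line repeats chromosome 1, position 100.
sample=$(mktemp)
printf 'rs1\t1\t100\tAC\nrs3\t2\t300\tGG\ni5\t1\t100\tAC\n' > "$sample"
check_kit "$sample"
```

For the sample this reports snps=3 duplicates=1. A duplicates count of 0 on a real combined file suggests each position appears only once.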
@Thom, maybe promote your original comment to an answer.
Thom (or somebody else), can you please create a (YouTube) video which demonstrates using those commands to combine two kits?  Thank you!
+15 votes
Peter, thanks for posting about this.  I have many, many matches (aarrggh!) that have three or more kits, all public, on GEDmatch.  I was just going through one of my GenomeMatePro files the other day combining a bunch of them . . .
by Darlene Athey-Hill G2G6 Pilot (534k points)
+8 votes
Thank you for this. I think I have done it right.
by Kristina Wheeler G2G6 Mach 1 (19.2k points)
+4 votes
Thanks, Peter, for reaching out to individuals and making this public post to help inform and educate us as to how to best activate only one DNA test at GEDmatch, so test results are not adversely affected.
by Cynthia Larson G2G6 Pilot (179k points)
+4 votes
Thank you, Peter. I made a combo kit of my 23andMe and Ancestry kits. I would have used FTDNA/MyHeritage as well, but those were uploaded from Ancestry, so I wasn't sure whether they would make a difference. I do find that I match people at different amounts (or not at all) depending on which kit I use, which is why I kept all three; the Ancestry kit was the original one, and I found more matches on it. But when I saw this, I set the single kits to Research, and the main/public one is now the combo kit :)
by Valerie Sizemore G2G3 (3.3k points)
+2 votes
Once the Superkit (combined kit) is created, there is no need to further modify the status of the base kits because GEDmatch takes care of that on the back end.  Base kits will NOT show up in match lists if there is a Superkit/Combined kit that they are a basis of.
by Virginia Winslett G2G6 Mach 1 (14.5k points)
+3 votes
My experience with combined kits is that they sometimes filter out true positive matches and sometimes present false positive matches that do not appear in any base kit.  This does not mean that I do not value the superkits, but it does mean that I do not want to ignore the base kits.  YMMV
by Living Anderson G2G6 Mach 7 (78.6k points)
Hello Thom,

Would you please provide the GEDmatch IDs for where true positive matches were filtered out by the superkit?

The research kit IDs can be entered in the note field and still used for comparisons.  

Triangulation is almost impossible when one person in the TG has multiple kits in GEDmatch that are not marked as research.

Thanks and sincerely,
Thanks, I believe I found the same thing when I compared matches to the superkit vs. the separate kits from FTDNA and Ancestry.


...