Differences among 23andMe, FTDNA, Ancestry's raw genetic datasets?

+8 votes
595 views
If I decide to provide my raw genetic data to someone, is there any significant difference in data scope or quality among 23andMe, FTDNA and Ancestry? If so, what are those differences? Also, in general, despite any differences, is it typically better to provide the most recent dataset I obtained? Or does that not tend to matter much?

Thanks in advance for your insights.
in The Tree House by Susan Keil G2G6 Mach 6 (67.8k points)
retagged by Susan Keil
Have you created an account with Gedmatch?  If not, that would be a good idea because you can upload the information from all sites and it will compare them all, instead of only one site.
Yes, I have. What I'm trying to ask is this: if I were to upload to, say, sequencing.com, would it be best to upload the most recent test I've done? Or does 23andMe, for example, provide a more thorough dataset, so that I'd be better off providing that one instead of Ancestry, even if I did Ancestry x years ago and 23andMe x+3 years ago?
Edison Williams probably can supply an erudite answer.

I imagine the answer would depend on what you hope to be able to do with the information you would get (whether from sequencing.com or somewhere else).

2 Answers

+10 votes
 
Best answer
The quality is fine for any of those kits, but they may cover different sets of SNPs. For maximum coverage, you could combine your kits with this tool:

http://dnagenics.com/dna-kit-studio/
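
To illustrate what "combining kits" means, here is a minimal Python sketch. It is not how the tool above works internally, only an assumption-laden illustration: each kit is taken to be already loaded as a dictionary mapping rsid to genotype, the function name merge_kits is hypothetical, and the no-call markers ("--", "00", empty string) are assumed. The merged kit is the union of all SNPs, and a real genotype replaces a no-call when another kit has one.

```python
# Assumed markers for "no result at this SNP"; vendors may use others.
NO_CALLS = {"--", "00", ""}

def merge_kits(*kits):
    """Union several {rsid: genotype} dicts, filling no-calls where possible."""
    merged = {}
    for kit in kits:
        for rsid, genotype in kit.items():
            current = merged.get(rsid)
            # Take the genotype if the SNP is new, or if what we have so far
            # is a no-call and this kit has an actual call.
            if current is None or (current in NO_CALLS and genotype not in NO_CALLS):
                merged[rsid] = genotype
    return merged

kit_a = {"rs1": "AG", "rs2": "--"}
kit_b = {"rs2": "CC", "rs3": "TT"}
combined = merge_kits(kit_a, kit_b)
```

In this sketch, when two kits both report a genotype for the same SNP, the kit listed first wins. A real merging tool may resolve such conflicts differently.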
by Ann Turner G2G6 Mach 1 (16.9k points)
selected by Susan Keil
Interesting! Thank you, Ann. I'll communicate with the vendor/recipient to see if they can accept a combined file I "assemble" or if they require the file direct from, say, Ancestry.

I guess this then leads me to the question of whether these raw datasets follow some standard or expected format, as is the case in other industries. (SDTM, CDASH, ADaM, etc. are the ones in my line of work.)
The companies do have somewhat different formats, but they are easily recognized by most places that accept uploads. Some companies use comma separated values while others are tab-delimited. Most companies display the genotype column with the two alleles listed as one string, e.g. AG. However, Ancestry splits the genotype field into two columns, A and G (but the order is not meaningful). The so-called "23andMe" format seems to be mentioned most often: 4 tab-separated fields with rsid, chromosome number, chromosome position, and genotype.
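
As a rough illustration of the format differences described above, the following Python sketch reads either layout into the same four fields. It is only a sketch under stated assumptions: comment lines are assumed to start with "#", a header row is assumed to begin with "rsid", the split-genotype layout is assumed to have exactly five columns, and the function name parse_raw_dna is hypothetical.

```python
import csv

def parse_raw_dna(text):
    """Yield (rsid, chromosome, position, genotype) from raw-data text."""
    # Drop blank lines and comment lines (assumed to start with '#').
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    if not lines:
        return
    # Guess the delimiter from the first remaining line: tab if present, else comma.
    delimiter = "\t" if "\t" in lines[0] else ","
    for row in csv.reader(lines, delimiter=delimiter):
        fields = [f.strip() for f in row]
        if fields[0].lower() == "rsid":
            continue  # header row (assumed naming)
        if len(fields) == 4:
            # "23andMe"-style: genotype is one string, e.g. AG
            rsid, chrom, pos, genotype = fields
        elif len(fields) == 5:
            # Split-genotype style: two allele columns, order not meaningful
            rsid, chrom, pos, allele1, allele2 = fields
            genotype = allele1 + allele2
        else:
            continue  # unrecognized layout; skip the line
        yield rsid, chrom, int(pos), genotype

four_col = "# sample comment\nrs1\t1\t100\tAG\n"
five_col = "rsid\tchromosome\tposition\tallele1\tallele2\nrs1\t1\t100\tA\tG\n"
comma_sep = 'RSID,CHROMOSOME,POSITION,RESULT\n"rs1","1","100","AG"\n'
```

All three sample strings yield the same record, which is why upload sites can usually recognize any of the vendor layouts.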
Thanks, that makes sense. So if someone in academia, for example, is researching or testing something and I want to give them the most robust file I have, there really isn't any difference between a data file provided by X in 2015 and a data file provided by Y in 2019? In other words, the data isn't getting better and better over time, such that the researcher would get less from the 2015 file than from the 2019 file? Sorry for all the questions, and thank you for your patience with me.

Hi Susan, I don't know the answer, but on Gedmatch, if you subscribe to their Tier 1 utilities, you can create a 'Super Kit.'  The Legal Genealogist discussed it here and Kitty Cooper discussed it here. I've tested my parents at 23andMe, ancestry, and FTDNA and combined each of their tests on Gedmatch into the SuperKit.

As a side note, if you upload multiple tests on Gedmatch for the same person, it's recommended that you mark all but one as Research kits, so that only one shows up.  If you create the SuperKit, you would have it showing.

An upload site might have accumulated more data from one chip than another and have a preference. I suspect the sheer amount of data from the newer GSA chip is now overtaking earlier versions. If that's the case, then 23andMe v5 or FTDNA v2 or MyHeritage v2 would be preferable. Ancestry v1 is an earlier chip, very similar to FTDNA v1, MyHeritage v1, and 23andMe v3, while Ancestry v2 is a custom chip with lower overlap.

https://isogg.org/wiki/Autosomal_SNP_comparison_chart
Thank you, Darlene. I have used Gedmatch, sometimes at Tier 1. I don't recall playing with their Super Kit though. Thanks for reminding me of this route.

And, Ann, thanks again. I think what I should do is remind myself of the dates and versions for the 3 tests I did at the various companies. That would help the conversation, it sounds like!
+6 votes
by Linda Peterson G2G6 Pilot (786k points)
Thanks. I checked it and it does not address my question.
