What is most needed for autosomal DNA and ancestry?

+12 votes
1.2k views

The following quote from Dr. Tim Janzen nicely sums up what is most needed for autosomal DNA (auDNA) and ancestry: "None of the
companies are giving us what we really need: automated clustering of our
matches into triangulated groups and then automated searching of the
pedigree charts of those people in the triangulated groups for shared
ancestors, surnames, and locations."

in The Tree House by Peter Roberts G2G6 Pilot (560k points)

Magnus, we take things one step at a time here at WikiTree. One improvement at a time. Evolutionary changes instead of revolutionary changes.

Peter, I want to understand more about AncestryDNA's DNA Circles and what we can recommend to WikiTreers regarding using them for DNA Confirmation. It's disappointing that they're not grouping people who share the same segment. I thought I heard Tim Janzen say that he'd never heard of a DNA Circle, in practice, not representing a triangulated group. That we don't know how they're doing it, which is troublesome, but that they are grouping people who share common ancestry. Maybe I'm remembering this incorrectly or didn't understand him.

Just to let y'all know, I am following this posting very closely. Mags
Peter, about my comment concerning the Circles and Ancestor Hints going away and coming back.

You are right about the way that the Circles and Ancestor Hints are created.

My comment addresses the fact that, whenever Ancestry.com makes a change to their site, a person can often see anywhere from one to all of their Circles and Ancestor Hints disappear for 3-4 days. This is not about the DNA matching; it is a glitch in their programming.

As an example, I have 2 circles: Henry Thomas Heath and Eliza Jane Campbell. Since the beginning of Ancestry circles, I have been included in those circles with 6 other members. Periodically, I will lose one of them for a week, but this time it was both of them. When they return to my list, it is exactly the same 6 members with the exact same information.

With the Ancestor Hints, I had 4 people (Sarah J Lloyd, Robert Mayberry Dickens, Benjamin Franklin Donathan and Russell Rollington Richardson) on 30 March. Then I only had Robert Mayberry Dickens on 2 April. Then I had all 4 back on 4 April.

Sometimes Ancestry makes a change that causes the list to re-populate, but very slowly, and this has nothing to do with DNA. I wanted to convey that you may want to wait a week to see whether the disappearance of a Circle or Ancestor Hint is a permanent change.

Ancestry still has the DNA circles.  I've got 19 circles for my DNA.  Peter is right that it isn't triangulation (although some of it is) for the people forming the circle.  Ancestry states:

"Who is in a DNA Circle?

  • Circle members share DNA with other members of the circle.
  • Circle members all have family tree evidence that they are direct descendants of (ancestor's name).
  • Your DNA matches some or all members of this DNA Circle."
Some of the people in the DNA circle are DNA matches to me, while others are DNA matches to others in the circle but not to me.  In one particular circle, of the 10 other people in it, 6 share DNA with me & 4 don't.
 
I have been able to confirm, using GEDmatch, that there are triangulated groups within some of my DNA circles to prove the ancestral couple. However, some of the circles somewhat 'overlap' (my terminology), i.e. one circle may be for a great-grandfather of mine, while another one is for the mother of that person. Some of the people matching me are part of both circles. So we really need to see what chromosome segments we share and triangulate.
 
In another direction, I've got some 'new ancestor discoveries' on AncestryDNA.  I think that's one step away from being set as an AncestryDNA circle.  I've managed, with most of these, to research and determine that the person mentioned is a descendant of one of my ancestors, but the person mentioned is not my ancestor.  The people that have that particular person in their family tree just haven't yet connected their ancestor to mine...
 
Here are a couple links with a bit more about Ancestry DNA circles:  http://lisalouisecooke.com/2015/12/ancestrydna-dna-circles/
 
I checked and I have it from two experts that AncestryDNA's Circles is not triangulation.  Some who belong to a circle may be a triangulated group but it would usually take a lot of work to figure that out.

Darlene and Peter are correct. Unfortunately, AncestryDNA is using a black box approach. However good it is, as long as they don't give us a chromosome browser so that we can do triangulation, we cannot accept any "proven by DNA" claim that is based solely on whatever AncestryDNA posts on their website (whatever they are calling it right now: DNA Circle, NAD, magic-fairy-dust-ancestor-finder or whatever).

The only way for someone who has only tested with AncestryDNA to use the results for "proven by DNA" on WikiTree is to upload the data of all people in the triangulation group to GEDmatch or a similar website that allows triangulation by the rules stated many times in other posts (see minimum criteria for triangulation).
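As a rough illustration of the kind of check a chromosome browser makes possible, here is a minimal Python sketch that tests whether a set of kits forms a triangulated group: every pair must share a segment on the same chromosome, and all of those segments must overlap in one common region. The `Segment` type, the 7 cM cut-off and the sample numbers are assumptions made up for illustration; they are not WikiTree's or GEDmatch's actual rules or data, and the sketch ignores which side (paternal or maternal) a segment sits on.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Segment:
    chromosome: int
    start: int   # base-pair position where the shared segment starts
    end: int     # base-pair position where the shared segment ends
    cm: float    # segment length in centimorgans, as reported by the tool


MIN_CM = 7.0  # assumed threshold, borrowed from the 7 cM figure used in this thread


def triangulates(kits, shared):
    """Return (chromosome, start, end) of the region common to all pairs,
    or None if the kits do not triangulate.

    `shared` maps frozenset({kit_a, kit_b}) -> Segment (hypothetical layout).
    """
    segments = []
    for a, b in combinations(kits, 2):
        seg = shared.get(frozenset((a, b)))
        if seg is None or seg.cm < MIN_CM:
            return None  # this pair has no large enough shared segment
        segments.append(seg)
    if len({s.chromosome for s in segments}) != 1:
        return None  # the pairwise segments are on different chromosomes
    start = max(s.start for s in segments)
    end = min(s.end for s in segments)
    if start >= end:
        return None  # the pairwise segments do not overlap
    return (segments[0].chromosome, start, end)


# Made-up sample data for three kits A, B and C.
shared = {
    frozenset(("A", "B")): Segment(1, 100, 500, 12.0),
    frozenset(("A", "C")): Segment(1, 200, 600, 10.5),
    frozenset(("B", "C")): Segment(1, 150, 450, 9.0),
}
print(triangulates(["A", "B", "C"], shared))
```

Without the pairwise segment data, which AncestryDNA does not expose, a check like this cannot be run at all, which is the point being made above.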

We should be very strict about this as otherwise the maybe-right / maybe-wrong "proven by AncestryDNA" family trees are spreading like wildfire on WikiTree!

>> We should be very strict about this as otherwise the maybe-right / maybe-wrong "proven by AncestryDNA" family trees are spreading like wildfire on WikiTree!

But what is the problem?

If the theory of pile-ups is true, then we can't trust GEDmatch triangulations, or what we see in the chromosome browser, as proof of IBD ==> it's also a best guess using a known algorithm, but not a true algorithm.....

A triangulation using a pile-up segment must be an unwanted triangulation, but how can we avoid that?

 

No, I have to disagree with that, Magnus. First of all, only a small percentage of your TGs are pile-ups. It obviously depends on how many people a TG needs to have before you call it a pile-up. Let's say 25.

On this basis I can say that less than 10% of TGs are pile-ups.
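To make the "less than 10%" estimate concrete, here is a small Python sketch that computes the share of triangulated groups counted as pile-ups for a given size threshold. The function name and the list of group sizes are made up for illustration; only the threshold of 25 comes from the post above.

```python
def pile_up_share(group_sizes, threshold=25):
    """Fraction of triangulated groups with at least `threshold` members."""
    if not group_sizes:
        return 0.0
    pile_ups = sum(1 for size in group_sizes if size >= threshold)
    return pile_ups / len(group_sizes)


# Made-up TG sizes: twelve groups, one of which is large enough to count as a pile-up.
sizes = [3, 4, 5, 3, 6, 40, 4, 3, 5, 7, 3, 4]
print(f"{pile_up_share(sizes):.1%}")
```

Changing `threshold` shows how sensitive the percentage is to where the line is drawn.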

Secondly, I'm not sure where this "we can't trust gedmatch triangulations" comes from. Right now GEDmatch triangulations are one of the two sources we can trust (the other one being triangulations proven by 23andMe). None of the other sources are safe IMO (I haven't tried DNA.land; that might be similar to GEDmatch).

I was referring to the black box that AncestryDNA is using. We unfortunately can't say which is right and which is wrong, like Peter wrote.

As you're very interested in this topic, I'd suggest you join the DNA genealogy mailing list where lots of experts are discussing exactly these points. Hope to see you there, Magnus, as I appreciate your comments and efforts on WikiTree a lot!

DNA and WikiTree is about confirming the ancestry.  Pile ups are not an issue with that.  If anyone thinks it is, then please point out a 7+ cM pile up where the testers are in WikiTree and have GEDmatch IDs.

Thanks,

A)>> Right now GEDmatch triangulations are one of the two sources we can trust

But if you triangulate on a segment that is a pile-up, then it's "false", as they don't have a recent common ancestor, or we are less certain of the exact relationship (e.g., the number of generations back one would have to go to find a common ancestor) (see figure 4.1 in chapter 4.1)

B)>> Then please point out a 7+ cM pile up where the testers are in WikiTree and have GEDmatch IDs.

Joke: as we don't use templates, it's difficult to find the WikiTree profiles ;-)

C)>>Pile ups are not an issue with that.

My understanding

The issue is that new things are found all the time because this is a rather new area ==> the theory we had 5 years ago may have changed by now, e.g. with pile-ups, because companies like Ancestry now have a lot of tests (mostly American people).

Another problem is that only a small part of the population has been tested, and it's mostly people in the US.

I guess we will find new odd things when new groups of people test.... Another problem is that this is big business, and companies like Ancestry don't tell us what they identify as pile-ups

D)>> point out a 7+ cM pile up

Difficult, as then you need the answers....

I have an odd match on chromosome 1 from position 230488906 to 247093448 for a total of 33.65 cM.

The first possible match we have found is more than 9 generations away, so maybe it's a "false indication"....

From "Goodbye false positives: AncestryDNA updates matching algorithm":
Sometimes the “pile-up” segment can be quite large, possibly even as high as 10 cM or more (although more information will be required to confirm this).

----------

See also:
Here is one person who says that we should be cautious with piled-up segments of 5-10 cM

* Excessive IBD sharing from ISOGG

* Long article about Ancestry matching  


 


 

4 Answers

+2 votes

Peter, if you have time we could have a webchat.... I am no expert but am starting to understand that a wiki is an excellent tool for gathering information. The father of the web, Tim Berners-Lee, started this concept in 1998, and my understanding is that Wikidata is one way that data from a wiki is "reused" in a structured way, i.e. something that we would like to do.....
 
Peter, but a table is just a table..... which is free text with frames... we would like to have structured data ==> we can move it to a database and query it...

e.g. find all people in my ICW who have a person in their family tree with sources from the same church book as a person in my family tree....
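To show why structured data matters for a question like this, here is a minimal Python sketch of the "same church book" query once sources are stored as data rather than free text. The function name, the data layout and every source and match name below are hypothetical placeholders; nothing here reflects how WikiTree or FTDNA actually store or expose this information.

```python
def icw_with_shared_sources(my_sources, icw_trees):
    """Return {match: sorted shared sources} for every ICW match whose tree
    cites at least one source that also appears in my tree."""
    mine = set(my_sources)
    result = {}
    for match, sources in icw_trees.items():
        common = mine & set(sources)
        if common:
            result[match] = sorted(common)
    return result


# Hypothetical structured data: source identifiers per tree.
my_sources = ["Parish X church book 1820-1840", "Parish Y church book 1790-1810"]
icw_trees = {
    "Match 1": ["Parish X church book 1820-1840", "Parish Z church book 1850-1860"],
    "Match 2": ["Parish Q church book 1700-1720"],
}
print(icw_with_shared_sources(my_sources, icw_trees))
```

With sources buried in free-text biographies or table cells, the set intersection above has nothing reliable to work on, which is the argument for templates.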

Maybe the input from FTDNA etc. should be done outside Wikitree.....

Video  https://youtu.be/sAuH6xpsKzA

by C S G2G6 Pilot (274k points)
Magnus, I assume you're familiar with this: http://www.wikitree.com/wiki/Database_Dumps
Conceivably a third party (e.g. you) could put together data from WikiTree and FTDNA, GEDMatch, etc.

Unfortunately, the full text of profiles is not included in the dumps and I don't think it could be. But you could point to whatever is on WikiTree or grab it dynamically.

Also note that we use Schema meta tags on profiles to make this sort of thing easier for developers. These tags could be extended.
Magnus, I'm not confident I can think fast enough for a webchat.  It takes me some time to compose what I write.

Until you mentioned it, I did not know about Wikidata.

I haven't looked into the database dump, but as long as we don't use templates, the bio section is free text ==> it's difficult to reuse the data.... we would like to have semantic knowledge extraction, and for that a template is a good tool....

You've kind of lost me, as usual, Magnus. :-)

My general point here is that you don't have to wait for WikiTree to do all the things you want to do. You could set up a third party site. I can't think of any "semantic knowledge extractions" that you couldn't do independently.

Peter, let me know when you get time... I feel this is enormously complex.... and DNA genealogy has just started.... but the concept of structured information, and using the knowledge inside WikiTree and combining it with other sources, is the future....

My plan is to start learning about the semantic web by doing the training mentioned earlier, to get some understanding ...

I also think there is a version problem with the MediaWiki engine we have on WikiTree; I don't know the status of DBpedia etc....

The DBpedia Information Extraction Framework

 

 

I glanced at Wikidata and I'm quite impressed and excited about the possibilities.

One option would be to export (or cut and paste) the triangulated groups one has in GEDmatch into Wikidata and then link to the shared ancestry via WikiTree's Relationship Finder.  WikiTree (or Wikidata?) could also provide a data visualization of the triangulated group's shared segment (using the chromosome #, start and stop position) in Wikidata.

"You could set up a third party site. I can't think of any "semantic knowledge extractions" that you couldn't do independently."

I think what you have done with WikiTree is "crazy"; for me, just playing around with Swedish parish info on my Mac is a big enough project.... ;-)

Peter, Wikidata is crazy big....

Just the person in Sweden who did the Swedish parish information has spent some hours every day for the last few years....

Here you have a Query server with some example questions https://query.wikidata.org/ 

#List of countries ordered by the number of their cities with female mayor
SELECT ?country ?countryLabel (count(*) as ?count) 
WHERE 
{
    ?city wdt:P31/wdt:P279* wd:Q515 .  # find instances of subclasses of city
    ?city p:P6 ?statement .      # with a P6 (head of government) statement
    ?statement ps:P6 ?mayor .     # ... that has the value ?mayor
    ?mayor wdt:P21 wd:Q6581072 .     # ... where the ?mayor has P21 (sex or gender) female
    FILTER NOT EXISTS { ?statement pq:P582 ?x }  # ... but the statement has no P582 (end date) qualifier
    ?city wdt:P17 ?country .  # Also find the country of the city
     
    # If available, get the "ru" label of the country, use "en" as fallback:
    SERVICE wikibase:label {
        bd:serviceParam wikibase:language "ru,en" .
    }
}
GROUP BY ?country ?countryLabel 
ORDER BY DESC(?count) 
LIMIT 100

 

Click on examples and Execute
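For anyone who would rather run such a query from a script than from the web page, here is a minimal Python sketch that only builds the request URL for the Wikidata query service; it does not send anything. The helper name `build_query_url` and the short sample query are my own for illustration. The `/sparql` endpoint and the `format=json` parameter are what the query service documents, but please check its user manual before relying on them.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "https://query.wikidata.org/sparql"


def build_query_url(sparql):
    """Return a GET URL that asks the query service to run `sparql`
    and return the result as JSON."""
    return ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


# A short sample query (five items that are instances of "city", Q515).
query = "SELECT ?item WHERE { ?item wdt:P31 wd:Q515 } LIMIT 5"
url = build_query_url(query)

# Round-trip check: the encoded URL still carries the original query text.
params = parse_qs(urlsplit(url).query)
print(params["format"][0])
print(params["query"][0] == query)
```

The URL can then be fetched with any HTTP client; the service asks scripts to identify themselves with a descriptive User-Agent header.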

Try the example locations of all Picasso's work

That is, they use the Wikimedia template Coord, which contains coordinates ==> they get the coordinates, and in the query manager they can display them on a map

 

If we started adding structured info in WikiTree, we could map

  • the locations of all people immigrating to MI in 1880-1890
  • the locations where a "family tree" was located between 1870-1890
  • the locations of all your DNA matches on FTDNA that have a tree on WikiTree with locations and have a match in segment 2...
Displaying the same result in a table
+2 votes
Just to throw this "out there".

We already have the structure for DNA "circles" on WikiTree as we already have our ancestors auto-populated with our DNA and others DNA who match us and them.

It would be a matter of wrangling that data and creating some sort of form to pull the common ancestor based on segment matches. Since it automatically aligns with the correct "side" (Paternal/Maternal) of the family, having to identify the HIR wouldn't be an issue?

Kind of thinking out loud here...

Mags
by Mags Gaulden G2G6 Pilot (547k points)

No

In an equation you have the left side and the right side

A = B

In Wikitree we have "hopefully" the family tree from the church books with the correct relation = old classic genealogy

In AncestryDNA Circles I assume you start with the DNA part = shared segments etc... that indicate a relationship.... then I guess they also try to use the family tree info they have (which is often not so good)

So I would say that in WikiTree we have the traditional genealogy with some small extra DNA stuff that indicates which people should be related in a DNA test..... and if you find a person in WikiTree who is DNA tested, you need to get the GEDmatch number for that person and check whether the church book info can also be proven at e.g. GEDmatch.....

IF someone in WikiTree used the WikiTree auDNA template and connected 2 people with a common ancestor and also added shared segments, THEN we would also have a DNA connection and a segment that could be used..... adding the match info just as text means we can't use that info in a database doing intelligent matches....

Hello Magnus,

Thank you. I was day dreaming when the Wikitree auDNA template was created!  I assume [if it were approved and adopted] it should go on only each auDNA tester's profile?  Then use the "Confirmed with DNA" (radio button) for each parent/child relationship back to the shared "ancestral couple"* ?   

It would be excellent if that template provided a link to the Relationship Finder to automatically show the ancestral lines back to their common ancestry.  I wonder if it would be helpful for the template to have the option of adding GEDmatch IDs? [On the other hand if those template variables are "data" then the "database" could in theory find each tester's GEDmatch ID (if there is one) and show their relationship trails.]

*Third cousins and closer matching can include the ancestral couple. Beyond third cousins requires triangulation, and you likely don't know if the segment is from the male or the female of the couple, so the ancestral couple should not have a parent/child "Confirmed with DNA" tag.

Maybe some day your dreams will come true.... 100 years ago, if you had spoken about the internet, no one would have believed you.....

I think the template should be on every person in the "path" between person A <-> B, otherwise it's difficult to understand, when you look at a profile, who is connected and how?!?!?

>> It would be excellent if that template provided a link to the Relationship Finder to automatically show the ancestral lines back to their common ancestry

That's "nearly" one line of code to use the Descendant view... 
 

I have common ancestor Persson-1427 ==>

URL should be http://www.wikitree.com/genealogy/Persson-Descendants-1427 ==> you can see both DNA confirmed Carl Magnus Sälgö and DNA confirmed Barbro Maijgren ancestors  on the same view.....

 

  

 

A video with me playing around with Wikidata, Wikitree and maps and.....

 https://youtu.be/3GxFLci33AM

Magnus,  I agree that we need more structured data.

Have you generated a report from GEDmatch's Triangulation utility?

If we could take an auDNA tester's triangulation report and import it (in a WikiTree "free space page" associated with that auDNA tester) as structured data then we would have the raw data we need.  In theory WikiTree could reveal which GEDmatch IDs have a WikiTree ID.

I haven't played around much with GEDmatch..... (I am a little bit bored with DNA as not much is happening ;-) )

I don't know how good searching is inside WikiTree, but it would be cool to import the list and have search links so that you could see if someone inside WikiTree had that GEDmatch ID.... or had that GEDmatch ID in GEDmatch's Triangulation utility.....
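A rough Python sketch of the lookup being discussed: given the kit IDs from an imported triangulation report, split them into those that map to a WikiTree profile and those that do not. The function name, the kit numbers and the profile IDs are all invented placeholders, and the mapping table is an assumption; WikiTree does not necessarily expose such a table today.

```python
def link_kits_to_profiles(report_kits, kit_to_wikitree):
    """Split the kits in a report into (known, unknown).

    known   -> {kit_id: wikitree_id} for kits found in the mapping
    unknown -> list of kit IDs with no WikiTree profile on record
    """
    known, unknown = {}, []
    for kit in report_kits:
        profile = kit_to_wikitree.get(kit)
        if profile:
            known[kit] = profile
        else:
            unknown.append(kit)
    return known, unknown


# Placeholder data: none of these IDs refer to real kits or profiles.
kit_to_wikitree = {"KIT0001": "Example-1", "KIT0002": "Example-2"}
report_kits = ["KIT0001", "KIT0003", "KIT0002"]

known, unknown = link_kits_to_profiles(report_kits, kit_to_wikitree)
print(known)
print(unknown)
```

The same idea would work as a set of search links, one per kit ID, if the IDs were stored in a template field that the wiki search can index.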

I saw that last week Family Tree upgraded its wiki, and that one module, CirrusSearch, has better searching on templates, but I don't know how it works
 

 

Magnus, another thought I had while watching your video regarding maps and coordinates is that my WikiTree ID, the chromosome number for my segment match, and the start and stop numbers for that matching segment can be mapped as if that chromosome were a geographic place.
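As a toy version of "mapping" a segment onto a chromosome, here is a Python sketch that draws the chromosome as a row of characters and marks the matching segment. The function name is made up, and the chromosome 1 length used in the example (249,250,621 bp, the GRCh37 figure) is an assumption: the start and stop positions are the ones quoted earlier in this thread, and they may be on a different genome build.

```python
def render_segment(start, end, chrom_length, width=50):
    """Draw a chromosome as `width` characters, marking start..end with '#'."""
    first = min(width - 1, start * width // chrom_length)
    last = min(width, -(-end * width // chrom_length))  # ceiling division
    last = max(last, first + 1)  # always show at least one marker
    return "-" * first + "#" * (last - first) + "-" * (width - last)


CHR1_LENGTH = 249_250_621  # assumed GRCh37 length of chromosome 1

# The chromosome 1 match quoted earlier in the thread.
print(render_segment(230488906, 247093448, CHR1_LENGTH))

# A simple case that is easy to check by eye: 20..40 on a length-100 scale.
print(render_segment(20, 40, 100, width=10))
```

Stacking one such row per member of a triangulated group would give a crude text version of a chromosome browser.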

The sky is the limit.....

One of the best programmers I have met is the person who did DNAgen.net, which includes

  1. displaying multiple family trees on a map (DNAGEN database solution)
  2. auto-comparing family trees and reporting back potential matches, along with the distance between the two people (birth/death locations)
  3. another visualisation of ICW
  4. an FTDNA group, dnagen-experiment

I feel he could fix that in 2 minutes....

 

A TED lecture by Tim Berners-Lee, who invented the World Wide Web some 20 years earlier because he was tired of how complex it was to access documents:

https://www.ted.com/talks/tim_berners_lee_on_the_next_web?language=en

Now he would like us to put data on the web as data, as he is tired of how difficult it is to access data:

  • "there is huge frustration that we don't have data on the web as data"
  • "data drives a lot of things happens in our life"
  • "we combine and gets something more interesting" 
  • "we think about having all our data on the web. We call it the linked data"
  • at 8:40 he explains how Wikipedia links its data in template infoboxes to DBpedia and how other sources link into this data structure, i.e. WikiTree could be part of this with genealogy information (read: today's categories) / wiki profile information

The free training Introduction to Linked data and the Semantic WEB delivered by Southampton University

https://www.futurelearn.com/courses/linked-data/1/welcome

Wikitree could deliver

  1. Genealogy data in a structured way, e.g. the Swedish parish categories have gathered good data to be used for doing genealogy
  2. WikiTree profiles have some structured data such as birth/death/name, but by adding events in a person's life and locations we could get much more use out of the data...
I did not think of the descendants' views.  Thank you. Wikitree has X-DNA descendants' view http://www.wikitree.com/treewidget/Wise-1341/890#X  , mtDNA descendants' view  http://www.wikitree.com/treewidget/Archer-1069/890#mt  , Y-DNA descendants' view   http://www.wikitree.com/treewidget/Roberts-7104/890#Y    and for auDNA the regular descendants' view.

Re links to different views 

Y-DNA Descendants
http://www.wikitree.com/treewidget/Roberts-7104/890#Y is no problem
==> sample template code

<includeonly>[http://www.wikitree.com/treewidget/{{{common}}}/890#Y Y-Desc]</includeonly>

mtDNA descendants

then you need to check whether the common ancestor is a man or a woman, which I feel may be a problem in the version of WikiTree we use

http://www.wikitree.com/treewidget/Sawyer-1285/89#mt

==> sample template code

<includeonly>[http://www.wikitree.com/treewidget/{{{common}}}/89#mt mtDNA-Desc]</includeonly>

X-DNA descendants

http://www.wikitree.com/treewidget/Roberts-7103/890#X

==> sample template code

<includeonly>[http://www.wikitree.com/treewidget/{{{common}}}/890#X X-DNA-Desc]</includeonly>

auDNA descendants
I assume you would like to see

http://www.wikitree.com/genealogy/Roberts-Descendants-7103

<includeonly>?????? see below for workaround and an Apache redirect solution</includeonly>



Suggestion for a workaround, because we may lack string manipulation functions when creating templates

The small challenge we have with templates inside WikiTree is that the version of WikiTree we use may lack string manipulation (haven't checked with Chris) ==>

If I have the argument Roberts-7103 and would like to create a URL
http://www.wikitree.com/genealogy/Roberts-Descendants-7103

then I would like to split Roberts-7103 in 2 parts

part1 = Roberts

part2 = 7103

and tell wikitree please display the URL

http://www.wikitree.com/genealogy/part1-Descendants-part2
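Outside of a wiki template, the split described above is a one-liner. Here is a minimal Python sketch of it, only to make the intended transformation explicit; it is not something that can run inside a WikiTree template. The function name is made up, and splitting at the last hyphen is an assumption so that hyphenated surnames would still work.

```python
def descendants_url(wikitree_id):
    """Turn a WikiTree ID such as 'Roberts-7103' into the descendants URL.

    The ID is split at the LAST hyphen, so part1 is the surname and
    part2 is the number.
    """
    part1, _, part2 = wikitree_id.rpartition("-")
    if not part1 or not part2.isdigit():
        raise ValueError(f"not a WikiTree ID: {wikitree_id!r}")
    return f"http://www.wikitree.com/genealogy/{part1}-Descendants-{part2}"


print(descendants_url("Roberts-7103"))
print(descendants_url("Persson-1427"))
```

Both outputs match the URL pattern shown in this thread for Roberts-7103 and Persson-1427.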

I have tried to find functions for that in WikiTree (haven't spoken with Chris) but have not found any. My understanding is that WikiTree runs on an Apache 2.2.31 web server ==> we could add some redirect rules to the web server ==> that redirect to the wanted URLs and work easily with the templates

==> we have a link that is easy to create in a WikiTree template, e.g. some kind of REST-like syntax

http://www.wikitree.com/genealogyRest/Roberts/7103/Get/Descendant

and that is redirected by Apache to the URL we have today
http://www.wikitree.com/genealogy/Roberts-Descendants-7103
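To spell out what such a redirect rule would have to do, here is a small Python sketch of the path mapping using a regular expression. It only emulates the rewrite; on the server this would presumably be an Apache mod_rewrite rule with a similar pattern. The `/genealogyRest/...` path is the suggestion from this post, not an existing WikiTree URL, and the function name is made up.

```python
import re

# Pattern for the suggested REST-like path: /genealogyRest/<surname>/<number>/Get/Descendant
REST_PATTERN = re.compile(r"^/genealogyRest/([^/]+)/(\d+)/Get/Descendant$")


def redirect_target(path):
    """Return the existing descendants path for a REST-like path,
    or None if the path does not match the suggested syntax."""
    match = REST_PATTERN.match(path)
    if match is None:
        return None
    surname, number = match.groups()
    return f"/genealogy/{surname}-Descendants-{number}"


print(redirect_target("/genealogyRest/Roberts/7103/Get/Descendant"))
print(redirect_target("/genealogyRest/Roberts/Get/Descendant"))
```

The benefit for templates is that surname and number arrive as separate path parts, so no string splitting is needed on the wiki side.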

 

+2 votes
It's hard to automate clustering of triangulation groups because different families are focused on different research.  Also, some people would likely object to seeing their DNA automatically grouped for them in this way.  To me it makes more sense to post triangulation groups manually and then let people comment on them. It's like a Venn diagram: there are overlapping interests, and not all people might be interested in the same part of a triangulation.   Especially if it leads to a brick wall, people have different identifiers for who that person is.  For instance, a connection in my line is Brower-Bechtel, and it's McElhaney/Motes in another tree.  Both of us are right.   The work to triangulate this link manually spans decades, and there is no way to automate that part of it.
by Marc Snelling G2G6 (7.2k points)

>> Also, some people would likely object to seeing their DNA automatically grouped for them in this way.

But people at FTDNA etc. have paid FTDNA to do it for them.... no one complains, other than that the user interface at FTDNA could be better....

>> To me it makes more sense to post triangulation groups manually and then let people comment on them.

How?....
How do I find my groups?

>> Not all people might be interested in the same part of triangulation.....

I think that is what you have at FTDNA. A lot of people test, and most of them give up finding relatives ==> you have ICW groups; if you would like to use them, use them, and if you don't, move on. What is the problem? The lesson learned is that in doing DNA genealogy you need to work together, and a common family tree like WikiTree is excellent, and a platform like a wiki is also great...

The challenge I see is how to combine all information sources in a useful way.

Wikitree has a lot of useful data and we could make Wikitree much more useful if we started to add structured data using templates....

 

>> But people at FTDNA etc. have paid FTDNA to do it for them.... no one complains, other than that the user interface at FTDNA could be better....

 

People at FTDNA have paid to have their DNA tested.  Whether they join One Name, Group or other studies is based on personal opt-in decisions.

A lot of the work of finding family member triangulation groups happens outside of FTDNA - often on GedMatch.  Also on independent family sites and through personal contact, and email.   Some of the triangulation groups out there are only known to family members.

ICW groups reveal details of people's ancestry that can be sensitive.  Ethnicity, non-parental events, slavery, there are all kinds of reasons to treat this kind of information with sensitivity.    When Ben Affleck did the Finding Your Roots TV show, he asked the executives not to publish information about his slave-holding ancestors.

Inside private triangulation groups I do not share information publicly unless everyone in the group agrees to.

 

 

 

>> Inside private triangulation groups I do not share information publicly unless everyone in the group agrees to

Yes the equation is not easy. 

One odd thing is that when you upload to FTDNA, you give other people access to your data and tree if FTDNA thinks you are related (i.e. above 7 cM).... why do we trust those people more than others?

I feel most people doing good genealogy research in Sweden would never upload their family tree to WikiTree, as they don't want to share the work they have done....

DNA genealogy will make slow progress as long as we don't get open data....   

 

+2 votes
Circles are based on trees with DNA matches.  There are six people that have a tree with the ancestor Henry Thomas Heath and they share DNA.

Ancestor Hints are based on DNA matches with trees.  There are seven members with Robert Mayberry Dickens in their tree and I share DNA with two of them but he is not in my tree.

I can visit the AncestryDNA profile of those matched, and suggested match, members to see who else I share matches in common with.  This is the closest that Ancestry gets to triangulation but you can begin to build groups of trees and ancestors to explore; and, to see where the DNA links. It's just tedious and labor intensive.
by Kathleen Heath G2G6 Mach 2 (20.6k points)


...