Unconnected Trees

+24 votes

Last summer, during a discussion of who should be our third relationship finder connection, I started rambling about unconnected family trees of various sizes. In the end, the decision was made to go in a different direction (and one that I have no problem with), but I still keep wondering if there is some way to get more information about unconnected trees, like listing them by size, or something like that.

What I said then was:

I can't actually see the data. Actually, I'm not sure that anybody would have the time or patience to do this manually, but if there were a script which could define clusters of connected profiles, I rather suspect that it would find something like this:

Rocks: A large number of profiles with no connections to any other profiles, started by people who signed up as guests, but never went any further with WikiTree, for whatever reason.

Islets: Another large number of clusters of 2-12 profiles where somebody has entered, say, their own immediate family, and maybe back a generation or two, but never built out any farther (whether because they haven't had the time, or the knowledge of the available tools, or what).

Islands: A smaller number of clusters of, say, 13-144 profiles where somebody has been pretty industrious, or maybe several islets have managed to connect with one another, but somehow haven't found a link to the larger family tree.

Subcontinents: Clusters of, say, 145-500 profiles, possibly several smaller clusters that have managed to connect with one another, but not with the larger family tree.

Continents: Clusters of 501+ profiles with no connection to the larger family tree.

The Mainland: I assume that there is one cluster which is massively larger than all the rest, and possibly even larger than all of them put together, and I rather suspect that A.J. is in that cluster, so that if you have a link to A.J., then you're probably on the Mainland. 

Therefore, it seems to me that picking other relationship finders who are also on the mainland would be kind of pointless. (Well, fun, of course, but not really helpful. Once you're connected to A.J., you're probably connected to everybody else on the mainland.)

What would help, or at least encourage, people who aren't connected to A.J. is to pick relationship finders in continents, starting with the largest. In a sense, it doesn't matter who we pick on each continent, although picking someone notable would probably encourage people to at least run the test. And that way, somebody can console themselves by saying, "Okay, I'm not on the Mainland with A.J., but at least I'm on the continent of Doeania, because I'm connected to Jane Doe."

(Or, possibly, it may turn out that A.J.'s continent isn't actually the Mainland, but rather the continent of Jacobsia, and we'd need to pick somebody else as the relationship finder for the real Mainland.)

And then, of course, we can have challenges trying to connect different islets, islands, subcontinents, and continents to the Mainland. Connect an unconnected cluster, and get a "Bridge Builder" badge, or something like that. (I didn't mention trying to connect rocks, because I'm thinking that would be pretty hard to do, but anybody who managed to connect up, say, 100 rocks to other clusters would be my genealogical superhero.)

(And if you don't like those metaphors, feel free to pick one that works for you. I grew up on the water, so these terms just kind of naturally pop into my head.)
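For anyone who does have access to the data, here is a rough sketch of the kind of script described above. It is only an illustration under stated assumptions: I can't see how WikiTree actually stores profiles, so the input format (a list of profile IDs plus a list of parent/child/spouse links as pairs), the names `find_clusters`, `label` and `BANDS`, and the sample IDs are all made up for the example. It groups profiles into connected clusters with a union-find structure and names each cluster by the size bands listed above.

```python
from collections import defaultdict

# Size bands from the post (upper bounds inclusive); anything larger is a "continent".
BANDS = [(1, "rock"), (12, "islet"), (144, "island"), (500, "subcontinent")]


def find_clusters(profiles, links):
    """Return clusters of connected profiles (as sets), largest first."""
    parent = {p: p for p in profiles}
    # Also register profiles that only appear in a link.
    for a, b in links:
        parent.setdefault(a, a)
        parent.setdefault(b, b)

    def find(p):
        # Walk up to the cluster's representative, shortening the path as we go.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Merge the two clusters joined by each link.
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for p in parent:
        clusters[find(p)].add(p)
    return sorted(clusters.values(), key=len, reverse=True)


def label(size):
    """Name a cluster by its size, using the bands above."""
    for upper, name in BANDS:
        if size <= upper:
            return name
    return "continent"


# Hypothetical sample data, not real WikiTree IDs.
profiles = ["Doe-1", "Doe-2", "Doe-3", "Roe-1", "Poe-1", "Poe-2"]
links = [("Doe-1", "Doe-2"), ("Doe-2", "Doe-3"), ("Poe-1", "Poe-2")]
for cluster in find_clusters(profiles, links):
    print(len(cluster), label(len(cluster)))
```

On real data, the first cluster in the returned list would be the candidate for the Mainland, and the bands could be adjusted once the actual size distribution is known.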


asked in WikiTree Tech by Greg Slade G2G6 Pilot (194k points)

Great news on this! Within a few days, we should be able to add numbers indicating how connected the individuals on Special:Unconnected are, including the ability to sort to find the "most connected of the unconnected."

I'll post more on Greg's recent comments on a Connectors Challenge.

And it's now live! You can now see how many connections people on Special:Unconnected have. :-)

Interestingly, Greg, there are no "continents" and only a couple of "sub-continents". Almost everybody has under 100 connections. I'll explain more in a new post soon.

Just took a look--I love the new number of connections function already!! It's particularly helpful to be able to sort by it as well.

Well, I took a look through the list, and got down to the people with only 100 connections without recognising any names. (Well, there's Elizabeth Taylor, but not the one we all know.) I didn't take the time to check all the names against Wikipedia, I just did a quick pass. But I still think it would be a good idea to pick the most famous person in the largest cluster as the next relationship finder.


The thought occurred to me today that, given that most of the unconnected clusters we've found are much smaller than I anticipated, and also to stop mixing metaphors, perhaps a better classification system would be:

  • leaves: individual profiles with no connections at all
  • twigs: clusters of 2-12 profiles
  • branches: clusters of 13-144 profiles
  • boughs: clusters of 145+ profiles not connected to the main tree

Some of the unconnected trees can be guided to the mainland by GEDMatch data. Verified cousins, preferably closer in degree, can greatly narrow down searches, but WikiTree's DNA confirmation aids are practically useless for this because they really only help with refuting already known and documented connections.

A cousin registry of some kind would be useful. Right now, I am doing a systematic analysis of my 1500+ matches on GEDMatch to see who I can triangulate. It would be nice to be able to share that information with my WikiTree matches so we know who we can build bridges to.
I have recently joined the Connectors and am still plugging away at one family that I have written about before on G2G. I spend one evening a week exclusively working to connect what started as 2 individuals and has grown to 34 people. They are connected to each other but not to the main tree. I think it is important to connect them, but in the long run I think it is more valuable to add people and write as comprehensive a biography as possible, with the necessary sources. If the idea that we are all connected is correct, and I believe it is, then eventually everyone will be part of the main tree. I also have noticed that my connections are almost all through my paternal side of the family, so that even distant cousins on my mother's side go by a twisted path through my father's side. I am not sure I see the value in all of this. I would much rather spend a generous amount of time profiling my ancestors so that it is of value to my descendants and anyone else that is interested.
Byer, the footwork of directly constructing family narratives and biographies from records will still be vital and will still go on. No one is required to do genetic genealogy. Some of us will do it anyway. However, genetic genealogy is a tremendous boon to genealogists generally because, when done well, it provides a map for exploration and can triangulate fruitful avenues of search for previously undiscovered connections.

The genetic database I am constructing for myself will help to sort out whether a person is on my mother's or my father's side of the family. With the help of a genetic pedigree program and a database of autosomal data like what GEDMatch provides, I can construct the skeleton of a family tree and place cousins in relation to me without first constructing the connections between me and them. I can match GEDCOMs connected to those people to abstract paths in my family tree, which in some cases will allow me to construct my family tree from ancestors to me and from my ancestors to them.

What's more, no one in my family has ever told me that I have a Japanese ancestor. I wouldn't even know to go looking without genetic admixture analysis. And it is hard to beat genetic genealogy in terms of confirming ancient ancestors; I share 5cM with BRC2 the Hungarian from 3,200 years ago. By triangulation with more modern ancestors and cousins, I can deduce the flow of genes across the world to me, and I can deduce the flow of genes from those same ancestors to thousands (probably millions) of my living cousins. In the long run, this means identifying and integrating larger families across larger time spans and global distances, and immediately, it means sharing a larger family with my children and my immediate family members.

Our goals and methods aren't antagonistic. They are complementary.
Ian, I totally agree with you. I was commenting on the thread as a whole and not on your specific comment. Maybe I put my two cents' worth in the wrong spot. I have recently had my DNA tested and have uploaded the results to WikiTree and GEDMatch. It is very exciting and amazing that a little bit of spit can reach so far back in time. I am researching the correct way to record "verified by DNA" so I can record that information on WikiTree.

In my response I did not mention DNA but instead was commenting on the business of getting the unconnected connected to the Global Tree. Maybe DNA will help, but if the person or persons that you are trying to connect is not a blood relative, DNA is not going to help.

Nikki Barr Byer
Byer, I wasn't sure; it seemed ambiguous.

Though I think that connecting the unconnected is a very valuable thing to do for WikiTree both in terms of the community and in terms of individual value. Statistically many people are going to be related to many other people; as an example, in my own tree I have found that few of the people that WikiTree suggests I would be related to through the DNA confirmation aid are actually related to me, but I also found that those same people tended to be related to other people that it suggested and related to people that I am related to.

Helping them to figure out who they are related to and are not related to helps me out both directly and indirectly. By connecting them with other people, I increase my chances of connecting with people I am related to by DNA.

In the long run, the connections directly facilitate machine transcription of relationships which means that it will become easier for WikiTree as a system to cross reference both genealogical and genetic information to tell us ultimately where missing or erroneous relationships lie. To do that you need both people you are directly related to and people who are only very distantly and indirectly related to you.

1 Answer

+5 votes

This would certainly help the connectors with their task. There was a list of unconnected families produced for the Global Family Reunion. I've been adding locations as comments to this list for some time, have done letters A-K, and have also removed the trees that have now been connected. The list can be found at: [http://www.wikitree.com/wiki/Project:Global_Family_Reunion/Unconnected_Trees].

And if anyone could identify any 'continents' from your list above, then I'm sure many of us connectors would be only too pleased to work towards connecting them. What a bonus to know that you'd connected over 500 people to the 'mainland'.
answered by Carol Keeling G2G6 Mach 3 (32.9k points)
Thank you for your work in maintaining that. I took a look at it some time back, and many of the trees I dipped into had already been connected, so I stopped looking there. But basically, that page is sort of what I had in mind, except with some kind of indicator on it showing the size of each unconnected tree. (Which probably means that it would need to be automated somehow. I can't imagine anybody crawling through trees manually to count up profiles.)

I started from the end of the alphabet, and have marked all the connected fragments from your linked list. S-Z so far. I wasn't 100% sure about removing them from the list, so just marked them as connected.
People who do that kind of manual checking are awesome. Good on ya!

Thank you, Elizabeth. I should really try and finish checking and updating this list of families. I like working on these trees, as each should contain at least 50 unconnected profiles, and it is far better to spend time connecting these, rather than the countless 'stand alone' profiles that have been created.


I've taken to adding a number-of-profiles entry to the trees that I add to those pages, to give people an indication of how big each unconnected branch is.

Is there some kind of limit to how long we leave connected branches up on those pages? I can easily foresee a time when more of them are connected than unconnected, which would reduce the value of the pages somewhat, I'm thinking. (Besides, if people are taking part in the Connectors Challenge, their successes in connecting branches would be documented there.)


The list of unconnected trees has been somewhat eclipsed by the Special:Unconnected upgrade that we were given at the beginning of April, but I can see that the two can work in parallel. I was previously removing all connected trees that I found, in order to keep the list up-to-date, and I think that should continue. I can't see any benefit to leaving them there.

As of today (24 May 2016) the list of unconnected trees that was prepared for the Global Family Reunion is up-to-date. I've added locations where they were missing and removed all of the trees that are now connected. There are still over 330 trees here, all waiting to be connected to the global tree.

What a lot of fascinating ideas for a newcomer to the Connectors group!

Originating in England and now residing in Australia, I've had some immediate successes and it seems that serendipity plays a significant part along with persistence! However, if it is going back in time to make connections then it is surely the royalty of Europe that will help most, as AJ recognised with the current Queen of England, and if it is geographical, across the miles, then colonisation is surely the most important factor.

