
100 Circles: A Geometry of The Tree


Surnames/tags: connectors 100_circles connection_finder

It's circles all the way down


Introduction

There are different ways to look at a tree, and to consider its geometry. You can define it as a set of branches and roots, and that's the way one generally thinks when speaking about genealogical trees. Or you can cut its trunk at some point and look at its growth rings. The approach of WikiTree we propose here is quite similar to the latter, and we could say "100 Rings" instead of "100 Circles".

On this page we use "The Tree" to denote what WikiTree parlance sometimes calls the "Main Tree", "Single Tree", or "Big Tree", that is, the growing set of so-called "connected profiles" (over 24 million in Nov 2021).

This work expands on an original publication by Bernard Vatant under the title Cent cercles de Jean-Joseph (in French, so far). See also the preliminary discussion on G2G.

Follow the conversation on G2G under the tag "100_circles". Additional discussion is taking place on several free-space pages. For those, please see the See also section at the bottom of this page.

Abstract

The WikiTree Connection Finder allows us to compute the distance (expressed in so-called "degrees of connection") between any two profiles of The Tree. This distance is defined as the minimal length of a path between those two profiles.

For the purpose of our exploratory activity, a "circle" is the set of all profiles at a given distance from a profile chosen as Focus[1]. At distance 1 (the first circle) are parents, siblings, children, and spouse(s). The Connection Finder presents these with their gendered names as father and mother et cetera. It also makes no distinction between full and half siblings. On the second circle are the parents of parents and spouses; the siblings of parents and spouses; the children and spouses of siblings; and children and spouses of children. These are what we commonly talk about as grandparents, aunts and uncles, nieces and nephews, grandchildren, and in-laws.
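As a rough illustration of this definition, circles can be computed with a breadth-first search over the connection graph. The sketch below is not the Connection Finder's actual code: the `circles` function, the toy family and all its names are hypothetical, and it assumes that every parent, sibling, child and spouse link counts as exactly one step.

```python
from collections import deque


def circles(graph, focus):
    """Group profiles by their shortest-path distance from the focus."""
    distance = {focus: 0}
    queue = deque([focus])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in distance:  # first visit gives the shortest path
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    result = {}
    for profile, d in distance.items():
        if d > 0:  # the focus itself is not a member of any circle
            result.setdefault(d, set()).add(profile)
    return result


# Hypothetical toy family, not real WikiTree data.
links = [
    ("Focus", "Father"), ("Focus", "Mother"), ("Father", "Mother"),
    ("Focus", "Sibling"), ("Sibling", "Father"), ("Sibling", "Mother"),
    ("Focus", "Spouse"), ("Focus", "Child"), ("Spouse", "Child"),
    ("Father", "Grandfather"), ("Spouse", "Father-in-law"),
    ("Sibling", "Niece"),
]
graph = {}
for a, b in links:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

family_circles = circles(graph, "Focus")
for d in sorted(family_circles):
    print(f"C{d}: {sorted(family_circles[d])}")
```

In this toy family the parents, sibling, spouse and child land in the first circle, while the grandfather, father-in-law and niece land in the second.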

One can explore this way (in theory at least) most of The Tree in about 100 circles. In other words, from any of the over 24 million connected profiles of The Tree there will be very few (if any) profiles at a distance of more than 100 steps of connection (see FAQ for more details on this).

The geometry of circles can be viewed "globally" through a statistical analysis of the distribution of circles population, and "locally" by a systematic completion in WikiTree of the smallest circles around well-chosen Focus Profiles. Those two approaches are complementary.

[1] We formerly used "Center Profile" instead of "Focus Profile". The change was made to avoid confusion with the vocabulary used for centrality in graph theory.

FAQ

Exactly 100 circles?

The number of circles needed to reach all connected profiles depends on the profile chosen as Focus, but for most of them it is indeed less than 100. As of Nov 15, 2021, for our reference profile Mary Stuart, the furthest non-empty circle is C84. In other words, all connected profiles are at most 84 degrees from Mary Stuart. But most of them are actually much closer: over 99% of the connected profiles are in the first 40 circles. The remaining circles, from C41 to C84, contain the "long tail" of far-away branches loosely connected to The Tree and/or extending far into the past. The most populated circle, called the "Peak Circle", is C18 for Mary, and over 90% of profiles are between C10 and C30.

If the Focus is farther from the bulk of The Tree, the above figures shift to higher values, but the overall picture is basically the same. For the original Jean-Joseph Marie Vatant (1804-1875) the Peak Circle is C30 and the furthest circle is C99 (figures as of Nov 15, 2021).
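To make the notion of Peak Circle concrete, here is a minimal sketch that computes it, together with the mean distance to the Focus, from a list of per-circle populations. The `peak_and_mean` helper and the small distribution are hypothetical and only mimic the general shape described above (a sharp rise, a peak, then a long tail); they are not real WikiTree figures.

```python
def peak_and_mean(population):
    """Return (peak circle, mean distance) for a {circle: population} mapping."""
    total = sum(population.values())
    # The peak circle is the mode: the distance with the largest population.
    peak = max(population, key=population.get)
    # The mean is the population-weighted average distance.
    mean = sum(d * n for d, n in population.items()) / total
    return peak, mean


# Hypothetical distribution with a long tail, not real WikiTree data.
toy_population = {1: 5, 2: 20, 3: 60, 4: 100, 5: 70,
                  6: 40, 7: 20, 8: 10, 9: 5, 10: 2}

peak, mean = peak_and_mean(toy_population)
print(f"Peak circle: C{peak}, mean distance: {mean:.2f}")
```

With this toy distribution the mean falls a little above the peak, because the long tail of far circles pulls the average upwards.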

We have those circles, so what?

One can ask what those circles can bring to the WikiTree experience, to the genealogist's work and to everyday life. Here are a few answers (the list is not exhaustive).

  • Shifting viewpoint from "my ancestors", "my family", "my blood" to a more peripheral vision, towards an attention to all four directions of kinship (parents, children, siblings, spouses).
  • Systematic exploration and expansion of the first circles is likely to take you beyond your genealogical comfort zone, discovering unexpected alliances and filiations, and traveling further across time, space and social boundaries.
  • As the original Jean-Joseph Marie Vatant and Olof Andersson cases show, even very endogamic first circles eventually expand towards every corner of The Tree. Checking how and when this happens is a good lesson in the history of migrations and exogamy.

How do I participate?

  • You can join one or more of the existing "Focus" teams. Just contact the Profile Manager. See table below.
  • You can start a new one for a profile of your choice. See below.

How do I choose and propose a new Focus?

A new Focus can be any one of the 24 million+ connected profiles of The Tree: one of your ancestors, a notable, whoever you see fit. Here are some pointers to help you choose.

  • Avoid profiles likely to raise issues or conflicts: living or recently deceased people, obscure or disputed filiations, notorious conflicts among descendants. We recommend profiles with Privacy status set to "Open", born before 1870.
  • A good principle could be to choose a profile about halfway between the point where the circles run into the realm of living people and the point in the past where sources get thin. (Someone living around 1800-1820, when the world population crossed the one billion milestone, seems a good choice.)
  • You should be the Profile Manager and know the profile well. Make sure you can source without problems the profile itself and at least its first and second circles.
  • Check whether the first few circles are already well populated and sourced. This will be labor-intensive work. Of course you can start from scratch with a new profile, but that will be even more work.
  • Choosing a profile with many siblings and children is of course more work to begin with, but also more fun. As an indication, the original Jean-Joseph had 10 siblings, 9 children and 54 grandchildren. That's a solid start.
  • Try to gather a team to help you with this particular profile.

Beyond those recommendations there is at this point neither a formal process nor any particular requirement for your new profile to be included here. Ask the PM of this page or, better, ask to be added to the Trusted List so that you can edit the figures for the profile you manage yourself.

If too many candidates pop up, it might be necessary in the future to adopt more formal rules.

How much work is it?

A lot! Based on the Jean-Joseph experience, it will take more than one year of steady work to complete the circles up to the 4th. The total number of identified and documented profiles is currently (Nov 2021) 580 for the first three circles and over 750 for the fourth circle, the latter figure being far from final and likely to reach the thousand range.

Note (JK): The amount of additional work you will need to do might vary. If you already have a well-filled-out tree, some of the work will likely already have been done. Note also that you can do it at your leisure. There is no deadline. It can be quite interesting to engage in a leisurely exploration of your connections, circle by circle.

What do you mean by a "complete" circle?

A circle is said to be complete when all its possible members seem to have been thoroughly searched for, and all discovered profiles have been sourced and added to WikiTree. Declaring a circle complete is a very bold assumption indeed. Despite thorough research, as yet unknown children, siblings or spouses might be discovered in the future. Declaring a circle incomplete, on the other hand, is acknowledging that other members certainly exist, but have either not been searched for or not been added to WikiTree yet.

How will this work for Americans?

Because the United States is "a nation of immigrants," most Americans will not be able to trace every branch of their tree back for many generations. Choosing a recent ancestor as Focus would allow for more tracing into the past but would mean that the circles could not be extended very far into the future. How will those limitations affect the results? What would be the best choice in such a case?


Note (BV): Beyond the smallest circles, the expansion of circles is actually more horizontal in time (lateral if you prefer, through spouses and siblings) than vertical. Of course the expansion hits walls in the past and in the future, around the 4th circle if the Focus was living around 1820. But bear in mind that each circle will include a growing number of profiles also living around 1820, so there are no limits to the horizontal expansion, and beyond the smallest circles such profiles will form the main population.

Is each person-node only counted once?

So that, for example, a parent is not also counted as a spouse of the other parent and a child is not also counted as a sibling of another child?

Indeed, no double count.

Note that this is a tricky task in very endogamic circles (such as Jean-Joseph's). In case of doubt, the final judge is the distance given by the Connection Finder application: if a profile is at 4 degrees from the Focus, it sits in the 4th circle. But there can be several different paths of the same minimal length (todo: add examples). Checking for duplicates is a major difficulty, which is why we need counting tools such as Aleš's "magic query" to get actual figures.
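The point about several paths of the same minimal length can be sketched with a breadth-first search that records, for each profile, both its distance and the number of distinct shortest paths leading to it. This is only an illustration under assumptions: the function name and the toy family are hypothetical, and this is neither the Connection Finder's code nor Aleš's query. In the toy data, "In-law" is both the spouse's sibling and the sibling's spouse, as happens when two siblings marry two siblings.

```python
from collections import deque


def distances_and_path_counts(graph, focus):
    """Return shortest distances and the number of shortest paths per profile."""
    distance = {focus: 0}
    path_count = {focus: 1}
    queue = deque([focus])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in distance:  # discovered for the first time
                distance[neighbour] = distance[current] + 1
                path_count[neighbour] = 0
                queue.append(neighbour)
            if distance[neighbour] == distance[current] + 1:
                # current lies on a shortest path to neighbour
                path_count[neighbour] += path_count[current]
    return distance, path_count


# Hypothetical toy family, not real WikiTree data.
links = [
    ("Focus", "Spouse"), ("Focus", "Sibling"),
    ("Spouse", "In-law"), ("Sibling", "In-law"),
    ("Focus", "Father"), ("Focus", "Mother"), ("Father", "Mother"),
]
graph = {}
for a, b in links:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

distance, path_count = distances_and_path_counts(graph, "Focus")
for profile in sorted(distance, key=lambda p: (distance[p], p)):
    print(profile, distance[profile], path_count[profile])
```

Each profile appears exactly once in `distance`, so it is counted in a single circle: Mother stays in the first circle even though she is also the father's spouse, and In-law sits once in the second circle despite being reachable by two equally short paths.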

With the fourth circle one end is going to dip into the realm of the living, isn't it?

Yes indeed. Bernard, for one, is sitting alive and well in the fourth circle of Jean-Joseph, as are quite a few of his living siblings, cousins etc. In 2020 there may even have been a last living member of the third circle, born around 1930. The fifth circle extends to children born in the late 1900s, and there are certainly members of the sixth circle yet to be born.

For a certain amount of time, some expansion paths will be forbidden due to privacy concerns. But that's the rule of the game. Such expansions will happen later on, hopefully, thanks to the generations after us. The distribution model has to take this into account.

Will the circles end up lopsided if there are gaps in the information?

That question is discussed here.

What explains the enormous variations in the sizes of the circles for the various Foci shown in the table below?

The distribution of the circles population and their differences are driven by different parameters at different scales. The sum of the populations of all the circles will be the same for all profiles, because it is the total population of the connected Tree. (This is exactly true only if data are retrieved the same day, since connections are updated each night, and the Tree is growing by thousands of profiles every day.)

  • In the smallest circles, the differences are mainly linked to the care taken of the Focus and close relatives by the local PMs, and of course to the size of the families in the first few circles. Large families have potentially over 100 profiles in C2, between 300 and 400 in C3, and over 1,000 in C4. One important part of the work in progress here is to ensure that those small circles are as complete as possible. For our reference Focus Mary Stuart, as for many "Euro Aristos", such work has already mostly been achieved, hence the high values for C4 and C5. Similar values are expected for the (far less notable!) Vatant-5 and Andersson-5056, for whom a systematic completion is under way.
  • Differences in C10 and C15 are linked to the distance of the Focus from the bulk of WikiTree. For Adaline Carlton, although the very first circles are not heavily populated compared to the above, the population grows very sharply after C10, and the distribution after C15 is very similar to that of Mary Stuart, with a peak population at C18.
  • For profiles quite far from the bulk, such as Jean-Joseph Vatant, it can currently take up to the 15th circle to see a significant growth of the circles population. Afterwards, the distribution is similar to the above, but the peak circle is currently C30 (see diagram). Shortcuts to the bulk are certainly yet to be discovered, though. (One important parameter in this case is the very low WikiTree adoption rate in France.)
  • In all cases, the population of the peak circle is about the same, a little less than 10% of the total Tree population. This does not show clearly in the table, but is visible in the diagram.

How is the circles population changing over time?

With WikiTree growth, the circles population is of course changing. The overall population of The Tree is growing at a rate of over 3 million per year. How is this growth distributed over the 100 circles?

The growth of the first few circles is of course driven by the activity of the local PMs. If they have done a good job, the circles up to the second and third can be complete after months of work, and are not subject to significant changes afterwards.

In the "far circles" (C15 and beyond), growth is driven by the global, somewhat random activity of WikiTree, independent of local growth. Since it's always raining where it's wet, most of the growth happens in the circles that are already the most populated. Moreover, new connections keep pulling profiles down into lower circles, making the overall distribution sharper and sharper with time.

As an illustration, the following diagram shows the variation of circles population for Jean-Joseph Vatant over six months, from mid-November 2020 to mid-May 2021. The negative growth beyond C38 does not mean that profiles at those distances have disappeared or been disconnected: they have been pulled closer by reconnections at various distances, and have contributed to the growth of closer circles.

Growth of Jean-Joseph circles over six months (Nov 2020 to May 2021)
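The kind of comparison behind such a diagram can be sketched as a simple difference between two snapshots of per-circle populations. The `circle_growth` helper and both snapshots below are hypothetical and far smaller than the real data; they only show how far circles can shrink while the total keeps growing.

```python
def circle_growth(before, after):
    """Return the change in population for every circle seen in either snapshot."""
    all_circles = sorted(set(before) | set(after))
    return {c: after.get(c, 0) - before.get(c, 0) for c in all_circles}


# Hypothetical per-circle populations, not real WikiTree figures.
snapshot_before = {1: 10, 2: 40, 3: 80, 4: 30, 5: 10}
snapshot_after = {1: 10, 2: 45, 3: 100, 4: 28, 5: 4}

growth = circle_growth(snapshot_before, snapshot_after)
total_growth = sum(growth.values())
shrinking = [c for c, delta in growth.items() if delta < 0]

print("Growth per circle:", growth)
print("Total growth:", total_growth)
print("Circles with negative growth:", shrinking)
```

In this toy case the two farthest circles lose profiles, yet the total population still increases, which is the pattern described above for circles beyond C38.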

How does the connected "Main" tree compare to the total number of WikiTree profiles? Is WikiTree becoming more connected over time?

Yes, WT is becoming more connected over time, although the total number of unconnected profiles has not changed significantly.

Currently (15 Nov 2021), there are over 28 million total profiles on WT according to the website's home page. Of those, over 24 million are connected, according to the Connection Finder.

The absolute number of unconnected profiles has not changed much over the years, staying around 4 million. But thanks to global growth, the relative number of connected has increased from 67% to 83% between 2016 and 2020. As of Nov 2021 it is over 84%.

Many of the unconnected profiles have been there since WT's early days of blind Gedcom imports. "Connectors" are working hard on a daily basis to reduce this number, but at the other end of the workflow, new branches are steadily added by new WikiTreers not yet connected. All in all, those two mechanisms seem to balance each other to keep the total of unconnected more or less stable.

Tools

Sharing a spreadsheet

Working to complete the first few circles requires an external "checklist". It can take the form of a shared spreadsheet on Google Drive or any other convenient shared space, so that several collaborators can work on it together. (ToDo: link to an example of such a spreadsheet)

Querying population for all the circles

Aleš Trtnik has developed a "magic query" yielding the population of all circles for a given Focus. It has been, and will continue to be, used to retrieve and update the data in the table below.

Current Foci

The following table lists the Focus Profiles (aka Foci) currently actively under study (in bold), with the date the data were retrieved. More detail about each Focus Profile, including the profile manager, is given below the table. The "PPP" (Project Protected Profiles) listed are given for reference.

Columns C1 to C20 give the cumulative population of circles. For example the column C10 gives the total population of circles up to the 10th, in other words the number of profiles at a maximal distance of 10 from the focus.

"Peak" value is the distance of the most populated circle (see FAQ). In standard statistics vocabulary, it's called "Mode" of the distribution.

"Mean" value is the average of all distances to the focus. Due to the "long tail" of the distribution, the mean value is always slightly greater than the peak value.

Figures in bold indicate that the circle is "complete". See FAQ for what we mean by that.

Focus ID           C1   C2     C3     C4      C5        C10         C15         C20  Peak  Mean  Update
Vatant-5           23  157    580  1,411   1,972      3,558       5,501      21,474    30  32.4  2022-01-21
Andersson-5056     22  120    530    944   1,421      8,744      29,338     153,540    28  30.3  2021-12-17
Mars-121           22   86    243    798   2,528    302,559   2,420,952  12,232,920    20  21.3  2022-01-18
Carlton-1976       21   80    150    294     527     39,718   2,918,763  13,028,090    18  21.1  2021-12-17
Von_Keyserling-13   8   31    115    196     315      6,595     333,745   5,575,193    22  23.9  2021-12-17
Lothrop-29         28  231  1,188  5,396  21,974  1,706,925  10,523,804  18,938,249    15  17.3  2022-01-21
Stewart-6849       18  120    519  1,905   5,524    148,208   3,552,591  14,815,481    18  20.2  2022-01-07
Windsor-1           9   50    186    525   1,428     63,908   1,102,744  11,083,078    20  21.8  2022-01-07
Gardahaut-1         8   43    132    244     474     21,992   1,569,901  11,058,370    18  22.3  2022-01-07
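Because the columns are cumulative, the population of an individual circle is the difference between two consecutive columns. The sketch below applies this to the C1 to C5 figures listed for Vatant-5 in the table; the `per_circle` helper is a hypothetical name. It only works for consecutive circles, so it cannot be applied directly to the C10, C15 and C20 columns, which skip the circles in between.

```python
def per_circle(cumulative):
    """Convert cumulative populations (C1, C2, ...) into per-circle populations."""
    sizes = []
    previous = 0
    for total in cumulative:
        sizes.append(total - previous)
        previous = total
    return sizes


# Cumulative C1 to C5 figures for Vatant-5, taken from the table above.
vatant_cumulative = [23, 157, 580, 1411, 1972]
vatant_sizes = per_circle(vatant_cumulative)

for number, size in enumerate(vatant_sizes, start=1):
    print(f"Circle {number}: {size} profiles")
```

For Vatant-5 this gives 23, 134, 423, 831 and 561 profiles in circles 1 to 5 respectively.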

The following table provides another view of the circles distribution: how far you have to go from the Focus, in number of circles, to cover a given percentage of the total population (to be completed for other profiles).

Focus ID      1%  10%  50%  90%  99%  Update
Vatant-5      24   27   32   39   49  2021-12-12
Mars-121      10   16   21   28   38  2022-01-18
Lothrop-29     7   11   16   24   33  2021-05-18
Stewart-6849  10   15   19   27   36  2021-05-01
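The figures in this second table can be read as "the first circle at which the cumulative population reaches a given share of the total". A minimal sketch of that computation follows; the `circle_covering` helper and the small distribution are hypothetical, not the query actually used to fill the table.

```python
def circle_covering(population, percent):
    """Return the smallest circle whose cumulative population reaches percent% of the total."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    total = sum(population.values())
    running = 0
    for d in sorted(population):
        running += population[d]
        # Integer comparison avoids floating-point rounding issues.
        if running * 100 >= percent * total:
            return d
    raise ValueError("population is empty")


# Hypothetical distribution, not real WikiTree data.
toy_population = {1: 5, 2: 20, 3: 60, 4: 100, 5: 70,
                  6: 40, 7: 20, 8: 10, 9: 5, 10: 2}

coverage = {p: circle_covering(toy_population, p) for p in (1, 10, 50, 90, 99)}
print(coverage)
```

The gap between the 90% and 99% entries gives a rough feel for how long the tail of the distribution is.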

Focus Profiles:

There is additional information about the analysis under way for some of these profiles on free-space pages listed in the "See also" section below.

  • Vatant-5: Jean-Joseph Marie Vatant, 1804-1875, Bretagne, France. PM: Bernard Vatant. C1, C2 and C3 completed, give or take a few for C3. Target: completion of C4 in 2021.
  • Andersson-5056: Olof Andersson, 1793-1860, Västmanland, Sweden. PM: Eva Ekeblad.
  • Mars-121: Marie Mars, 1689-1776, b. France, d. Québec. PM: Greg Lavoie.
  • Carlton-1976: Adaline Carlton, 1826-1897, Ohio, USA; first circle complete, second in progress. PM: Julie Kelts.
  • Von_Keyserling-13: Graf Hermann Alexander (Hermann) von Keyserling, 1880-1946, Baltic German philosopher. Formerly Von_Keyserlingk-24, PM Kelsey Jackson Williams. On 2021-07-08, Jelena Eckstädt merged Von_Keyserlingk-24 into Von_Keyserling-13 (reason: he wrote his name without the k).
  • Lothrop-29: Samuel Lothrop, 1622-1700, b. England, d. Connecticut, USA. (PPP)
  • Stewart-6849: Mary Stuart (Stewart), Queen of Scots, 1542-1587. (PPP)
  • Windsor-1: HM Queen Elizabeth II A. Windsor, born 1926, Queen of the United Kingdom of Great Britain and Northern Ireland. (Living, PPP)
  • Gardahaut-1: Jeanne Suzanne Gardahaut, 1900-1992, born in Nantes, France, moved to Paris with her family, and crossed over to Seattle in 1920.

The following diagram compares the distribution of circles population for three of the above profiles. The shape of the distribution is quite similar in all three, with a shift towards greater values for profiles that are more "off-center" due to their social, geographical and/or temporal distance from the core of The Tree.

Distribution of circles population for three different profiles

See also

Additional discussion of related issues is at 100 Circles - more discussion.

G2G discussions:

Free-space pages:

External links:

To-do List and temporary notes

  • FAQ (under construction)
  • Improve the "Tools" section
  • Translation of "Cent Cercles" into English
  • Add to See also section
  • An introduction to centrality in graphs
  • Update the graphics for consistency with the table



