100 Circles: A Geometry of The Tree


Surnames/tags: connectors 100_circles connection_finder
Last profile change on 21 May 2023
18:22: Bernard Vatant edited the Text on 100 Circles. (added USBH profile : Ernest Banks (1931-2015))


Introduction : The Circles of The Queen

Various exchanges since the beginning of this project (Nov 2020) have shown the need for a new, gentler, less abstract and as-simple-as-possible introduction to the 100 Circles concepts. So, let's start with a family everyone should be familiar with, namely the Royal Family of the late Queen Elizabeth II Windsor (1926-2022) (hereafter QEII, with due respect).

Most close members of the Royal Family have a profile in WikiTree of course, and the application My Connections shows them, ordered by their distance in "degrees" to QEII. Each "degree" in the "My Connections" page is what we will call here a "circle". The first circle, or degree, gathers 8 people around her : 2 parents, 1 sister, 1 spouse and 4 children. The second circle gathers the parents, spouses, siblings and children of members of the first circle, 41 people altogether : the grandparents, aunts and uncles, nieces and nephews, grandchildren, and in-laws of the initial profile.

The following degrees/circles are computed the same way. Profiles already counted in a previous circle must not be counted twice, and it takes the smart algorithm of the Connection Finder to achieve that. Computing the shortest path to any of the 27+ million connected profiles makes it possible to place each of them in one single circle, for any given state of the database. For example, former US President Barack Obama currently belongs to the 21st circle. The "My Connections" application is limited to 1000 results and/or ten circles, for obvious performance reasons. In the case at hand, the limit is reached in the middle of the 5th circle. But counting the population of all circles beyond this limit is possible, thanks to a query specially developed by Aleš Trtnik for the project. This query is made available to people interested in the project, provided they use it sparingly (it's quite heavy on the server).
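To make the "one profile, one circle" rule concrete, here is a minimal sketch of how circle populations can be counted with a breadth-first search. It is not the Connection Finder algorithm nor the project query, whose internals are not described here; the graph, the profile names and the function `circle_populations` are all invented for illustration.

```python
from collections import deque


def circle_populations(graph, focus):
    """Count profiles per circle around `focus` (index 0 is the focus itself).

    `graph` is a hypothetical adjacency mapping: profile -> iterable of
    directly linked profiles (parents, siblings, spouses, children).
    """
    distance = {focus: 0}
    queue = deque([focus])
    while queue:
        profile = queue.popleft()
        for relative in graph.get(profile, ()):
            # A profile reached earlier already sits in a closer circle,
            # so it is never counted a second time.
            if relative not in distance:
                distance[relative] = distance[profile] + 1
                queue.append(relative)
    counts = [0] * (max(distance.values()) + 1)
    for degrees in distance.values():
        counts[degrees] += 1
    return counts


# Toy data, not real WikiTree profiles.
toy_tree = {
    "focus": ["father", "mother", "spouse"],
    "father": ["focus", "mother", "grandfather"],
    "mother": ["focus", "father"],
    "spouse": ["focus", "in_law"],
    "grandfather": ["father"],
    "in_law": ["spouse"],
}

print(circle_populations(toy_tree, "focus"))  # [1, 3, 2]
```

In this toy tree the mother is linked both to the focus and to the father, but she is counted only once, in the first circle, because breadth-first search always reaches a profile by a shortest path first.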

Do we really get as far as 100 Circles? In this case, barely! As of Sep 2022, the furthest connected profiles are 88 degrees from QEII, but actually fewer than 2,000 profiles (the so-called "Outer Rim") are further than 60 degrees, and over 99.5% of connected profiles are less than 50 degrees away. The following diagram shows the distribution of population for the first 50 circles as of 9 Sep 2022. The peak of the distribution is the 20th circle.

An intriguing question is : who are the almost 3 million people in the 20th circle, the most populated one? The answer is : just about anyone. Random examples (as of May 2022), more or less famous : Matilda (Plantagenet) of England (abt.1156-1189) (direct ancestor at 20 generations), Stanley Ann Dunham (1942-1995) (mother of the above-mentioned Barack Obama), Marie-Hélène Abgrall (1854-1905) (my maternal great-grandmother), Charles Joseph Kobloth (1870-1924) (cabinetmaker in Paris), Henrica van Asten (1757-1832) ... The Connection Finder will help you find more of them. Take a distant profile in the Outer Rim table, e.g. Václav Kuneš, and check its connection to Windsor-1. The profile at distance 20 from QEII on that path is in her 20th circle. Strike that one off the path and you'll get another one, etc.

Changing Focus

One could argue that the above example is far from being a random one in the Single Tree, and that the distribution of her circles is too smooth and regular a curve to be true. How would things look from other viewpoints? We try to answer this question by comparing the distribution of circles population for different "Focus Profiles".

| Focus ID | CC1 | CC2 | CC3 | CC4 | CC7 | CC10 | CC20 | Peak | Mean | e | update |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Windsor-1 | 8 | 49 | 189 | 531 | 8,712 | 76,274 | 14,486,499 | 20 | 21.3 | 91 | 2023-05-01 |
| Stewart-6849 | 17 | 118 | 513 | 1,925 | 26,981 | 153,292 | 17,882,337 | 18 | 20.1 | 88 | 2023-04-30 |
| Aquitaine-84 | 17 | 117 | 462 | 1,567 | 14,448 | 39,119 | 1,063,666 | 25 | 27.4 | 98 | 2023-04-30 |
| Lothrop-29 | 27 | 232 | 1,222 | 5,690 | 254,184 | 2,014,650 | 22,940,145 | 14 | 17.1 | 86 | 2023-04-30 |
| Allerton-6 | 17 | 105 | 575 | 2,986 | 117,226 | 1,391,778 | 21,839,021 | 15 | 17.7 | 89 | 2023-05-01 |
| Du_Toit-82 | 14 | 92 | 577 | 2,774 | 95,518 | 400,957 | 8,225,405 | 22 | 22.9 | 95 | 2023-05-01 |
| Andersson-5056 | 21 | 119 | 530 | 1,076 | 3,596 | 12,850 | 4,729,003 | 23 | 24.9 | 96 | 2023-05-01 |
| Vatant-5 | 22 | 156 | 605 | 1,958 | 4,288 | 5,968 | 99,017 | 28 | 30.8 | 102 | 2023-04-30 |
| Davis-26671 | 26 | 127 | 354 | 796 | 9,886 | 95,607 | 11,673,830 | 21 | 22.1 | 96 | 2023-05-01 |
| Luker-573 | 14 | 98 | 403 | 1,335 | 30,729 | 175,405 | 17,442,384 | 16 | 20.0 | 92 | 2023-05-01 |
| Banks-2508 | 19 | 48 | 88 | 161 | 2,509 | 7,493 | 3,001,422 | 23 | 26.6 | 97 | 2023-05-21 |
| Bacon-2568 | 8 | 25 | 80 | 214 | 3,252 | 34,344 | 14,547,367 | 19 | 21.4 | 92 | 2023-05-01 |

Columns CC1 to CC20 give the cumulative population of circles. For example the column CC7 gives the total population of circles up to the 7th, in other words the number of profiles at a maximal distance of 7 degrees from the Focus. This is the same CC7 number as the Connection Counts appearing on member profiles.

Peak value is the distance of the most populated circle. In standard statistics vocabulary, it is called the "mode" of the distribution.

Mean value is the average of all distances to the Focus. Due to the "long tail" of the distribution, the mean value is always slightly greater than the peak value. As a reference, the average distance between two random connected profiles, based on samples constructed using Jamie Nelson's very cool application, can be assessed, as of July 2022, to be in the interval [22-25] with a confidence of 95%. See G2G discussion for more details.

e is the eccentricity of the profile, in other words the distance of the furthest profiles, or the radius of the greatest non-empty circle.
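The four kinds of columns defined above can all be derived from the list of circle populations. The sketch below shows one way to do it on invented numbers; the function name `circle_statistics` is hypothetical, and it assumes the Focus itself (distance 0) is left out of the mean, which the page does not specify.

```python
from itertools import accumulate


def circle_statistics(populations):
    """Derive CC values, peak, mean and eccentricity from circle populations.

    `populations[k - 1]` is the number of profiles in circle k (k = 1, 2, ...).
    Assumption: the focus itself is not part of the list.
    """
    circles = list(enumerate(populations, start=1))
    cumulative = list(accumulate(populations))  # CC1, CC2, CC3, ...
    total = cumulative[-1]
    # Peak (mode): the distance of the most populated circle.
    peak = max(circles, key=lambda item: item[1])[0]
    # Mean: average distance of all counted profiles to the focus.
    mean = sum(k * n for k, n in circles) / total
    # Eccentricity: radius of the greatest non-empty circle.
    eccentricity = max(k for k, n in circles if n > 0)
    return {
        "cumulative": cumulative,
        "peak": peak,
        "mean": mean,
        "eccentricity": eccentricity,
    }


# Invented populations for circles 1 to 6 (circle 5 is empty on purpose).
toy_populations = [2, 5, 9, 4, 0, 1]
stats = circle_statistics(toy_populations)
print(stats["cumulative"])    # [2, 7, 16, 20, 20, 21]
print(stats["peak"])          # 3
print(stats["eccentricity"])  # 6
```

The single profile in circle 6 is enough to pull the mean (61/21, about 2.9) towards the tail while the peak stays at 3, a miniature version of the "long tail" effect mentioned above.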

Other profiles have previously been included in this table, such as Marie Mars (1689-1776), Adaline (Carlton) Van Wye (1826-1897) and Hermann Alexander von Keyserling (1880-1946), chosen by their respective PMs, who were involved in the original project team.

Shapes of the circles distribution

Are circles distributions really as different as they seem from the above table? The differences are driven by different parameters at different scales. The sum of the populations of all the circles will be the same for all profiles, because it is the total population of the connected Tree. (This is exactly true only if the data are retrieved on the same day, since the Tree is growing by about 10k profiles every day.)

  • In the closest circles, the differences are mainly linked to the care the local PMs have taken of the Focus and close relatives, and of course to the size of the families in the first few circles. Large families have potentially over 100 profiles in C2, between 300 and 400 in C3, and over 1,000 in C4. One important part of the work in progress here is to ensure that those first circles are as complete as possible. For our reference Focus Mary Stuart, as for many "Euro Aristos", such work has already mostly been achieved, hence the high values for C4 and C5. Similar values are expected to be reached for the (far less notable!) Vatant-5 and Andersson-5056, for whom a systematic completion is under way.
  • Differences in C10 and C20 are linked to the distance of the Focus to the (mostly American) bulk of WikiTree. For profiles such as Jean-Joseph Vatant, it takes until the 15th circle to see a significant growth of the circles population, and the peak circle is currently C28. Shortcuts to the bulk are certainly yet to be discovered. (One important parameter in this case is the very low WikiTree adoption rate in France.)
  • In all cases, the population of the peak circle is about the same, a little less than 10% of the total Tree population. When put together on the same diagram, most distributions look globally quite the same, simply more or less shifted to higher distances, and the peak more or less sharp.

Nevertheless, differences appear when looking more closely at a variety of cases. For most profiles under study, the growth to the peak is steeper than the decline to the long tail. In many cases, a secondary "bump" appears beyond the peak, as for Patty (Luker) LaPlante below.

By contrast, some cases show a remarkable symmetry of the distribution on each side of the peak, such as Emma (Davis) Schipp. The reasons for such differences are not known at this stage.

South African profiles typically show a quick start, up to 100k around C8, then a dip between C10 and C15. This is due to a heavily connected local cluster.

Outer Rim profiles, generally connected through a long and tortuous path and several bottlenecks, display a chaotic distribution over many circles. Of course, such a distribution is likely to collapse drastically whenever a shortcut is found.
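The collapse caused by a shortcut can be illustrated on a toy graph. The sketch below is only an illustration under invented data: a chain of ten made-up profiles stands in for a long Outer Rim path, and the helper names `distances_from` and `add_link` are hypothetical.

```python
from collections import deque


def distances_from(graph, focus):
    """Breadth-first search: shortest distance from `focus` to each profile."""
    distance = {focus: 0}
    queue = deque([focus])
    while queue:
        profile = queue.popleft()
        for relative in graph[profile]:
            if relative not in distance:
                distance[relative] = distance[profile] + 1
                queue.append(relative)
    return distance


def add_link(graph, a, b):
    """Add a symmetric family link between two profiles."""
    graph[a].add(b)
    graph[b].add(a)


# A long "tortuous path": p0 - p1 - ... - p9 (invented profiles).
chain = {f"p{i}": set() for i in range(10)}
for i in range(9):
    add_link(chain, f"p{i}", f"p{i + 1}")

before = distances_from(chain, "p0")
add_link(chain, "p0", "p7")  # a newly found shortcut
after = distances_from(chain, "p0")

print(max(before.values()), max(after.values()))  # 9 4
```

A single new link moves the furthest profile from the 9th circle to the 3rd, and the greatest non-empty circle drops from 9 to 4.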

How the circles grow

With WikiTree growth, the circles population is of course changing. The overall population of The Tree is growing at a rate of over 3 million per year. How is this growth distributed over the 100 circles?

The growth of the first few circles is of course driven by the activity of the local PMs. If they have done a good job, the circles up to the second and third can be complete after months of work, and are not subject to significant changes afterwards.

In the "far circles" (C15 and beyond), growth is driven by the global, somewhat random, activity of WikiTree, independent of local growth. Since it's always raining where it's wet, most of the growth happens in the already most populated circles. Moreover, new reconnections keep pulling profiles down to lower circles, making the overall distribution sharper and sharper with time.

As an illustration, the following plot compares the circles population for Jean-Joseph Vatant in November 2020 and November 2022.

In this global picture the work done during the same period to systematically populate the first circles (up to C4) is just invisible. Actually this work had practically no impact on the global distribution, given the strong endogamy of those first circles. The few reconnections with a visible global impact have happened beyond C10.

For other examples, see also the 2020 page Bridges from Sweden, which turned into a comparison between Olof Andersson and four other profiles. It has been updated to show the changes in the circle populations of the same five profiles, contrasting profiles affected only by the general growth of the Big Tree with profiles where "reconnections" have been actively worked on.

The mean distance is slowly but steadily getting smaller for all studied reference profiles. The peak of the distance distribution is getting sharper and shifted to lower values. The following table shows the changes in mean distance over two years for a few reference profiles.

| Profile | Nov 2020 | Nov 2021 | Nov 2022 |
|---|---|---|---|
| Samuel Lothrop | 17.5 | 17.4 | 17.2 |
| Mary Stuart | 20.4 | 20.3 | 20.3 |
| Elizabeth II Windsor | ? | 21.9 | 21.6 |
| Olof Andersson | 32.1 | 30.4 | 25.1 |
| Jean-Joseph Vatant | 33.2 | 32.5 | 30.1 |

There is of course a limit to the growth of each circle, hence to the "collapse" of the distribution towards lower circles, but this limit is far from being reached, for three main reasons :

  • Most connected profiles have not completed even their first circle : parents, siblings, spouse(s) and children are still missing.
  • Each new reconnection pulls profiles to lower circles, and the whole distribution to lower distances. Such collapse events can be spectacular for profiles still at far distances from the bulk of the Tree.
  • Many people in the first circles are yet to be born, or not yet included because they are too recent, for privacy reasons.

This last point is fascinating to consider. For example, the 12 great-grandchildren of QEII are not yet in WikiTree, and when they are, they will add to her C3. Later on, hopefully, their children will add to C4 by 2040 and their grandchildren to C5 by 2070, and the sixth generation will be born somewhere around the turn of the next century. With a mean of 3 generations per century, C9 members will be added by 2200, etc. Even Mary Stuart, who might look quite ancient to most of us, still has more than two centuries to wait to see the last members of her 20th circle come to life. QEII belongs only to her 12th circle ...

The shape of things to come : Billion Profiles Single Tree

Based on current trends, one can conjecture what the circles distribution would look like when the Single Tree has passed the billion threshold. Such a figure seems a bold hypothesis, but at the current annual growth rate of about 15%, this would be achieved in less than 30 years! In the Billion Profiles Single Tree, it can be conjectured that for most connected profiles, the distribution would converge towards a similar shape, with a peak in the 15-20 range, and a mean distance below 20. Differences would simply come from the population of the very first circles, and Samuel Lothrop's case would certainly remain a reference of sharp growth. One can wonder how such figures are possible, but various estimations of the issue, based on the mean size of families, indicate a ballpark estimation of tenfold growth every 2 circles in the ascending part of this asymptotic distribution, once the 10,000 threshold is passed, somewhere around the 6th circle. A tenfold growth every 2 circles would give figures of this kind for the cumulative population :

  • C6 : 10,000
  • C8 : 100,000
  • C10 : 1M (figure already passed by Samuel Lothrop)
  • C12 : 10M
  • C14 : 100M ...

... reaching half a billion around C15, which would be the peak circle (as it already is in Lothrop distribution)
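The arithmetic behind these conjectures can be checked with a short sketch. It assumes a starting population of 27 million connected profiles and a constant 15% annual growth, as quoted on this page, and a strict "tenfold every 2 circles" rule starting at 10,000 in C6; the function names are hypothetical.

```python
import math


def years_to_reach(target, current, annual_growth):
    """Years of compound growth needed to go from `current` to `target`."""
    return math.log(target / current) / math.log(1 + annual_growth)


def projected_cumulative(circle, base_circle=6, base_population=10_000):
    """Cumulative population under a strict 'tenfold every 2 circles' rule."""
    return base_population * 10 ** ((circle - base_circle) / 2)


# 27 million profiles growing 15% a year: roughly 26 years to one billion.
print(years_to_reach(1_000_000_000, 27_000_000, 0.15))

for circle in (6, 8, 10, 12, 14, 15, 16):
    print(f"C{circle}: {projected_cumulative(circle):,.0f}")
```

Under the strict rule, C15 comes out at about 316 million and C16 at one billion, so the half-billion mark falls between the two, in line with "around C15" as a rough estimate.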

All this is of course highly conjectural, but indicates that the shape of things to come is somehow already visible in the current state of affairs. But let the Tree grow, and let's keep gathering data!

Bottom line

One can ask what all this can bring to the WikiTree experience, the genealogist's work and everyday life. Here are a few answers (non-exhaustive list).

  • Shifting viewpoint from "my ancestors", "my family", "my blood" to a more peripheral vision, towards an attention to all four directions of kinship (parents, children, siblings, spouses).
  • Systematic exploration and expansion of the first circles is likely to take you to the edge of your genealogical comfort zone and beyond, discovering unexpected alliances and filiations, traveling further across time, space and social boundaries.
  • As the original Jean-Joseph Marie Vatant and Olof Andersson cases show, even very endogamous first circles eventually expand towards every corner of The Tree. Checking how and when this happens is a good lesson in the history of migrations and exogamy.

See also

Free-space pages:

G2G discussions:

External links:
