Is a lot of the work on Historically Significant people reinventing the wheel?

+9 votes
I like the idea of one big tree. But my observation is that when people load up their GEDCOMs, all too often they duplicate existing people and the merges never proceed.

The core of WikiTree seems to be the Historically Significant people. But I wonder if two things might need to occur to make this robust and avoid duplication.

1. Should WikiTree consider establishing a definitive database on some of the key Royal Houses? There are already two large databases with 200-300,000 people on the website, and one (or both) of these could be imported (with agreement and acknowledgement from the LDS) into WikiTree (British Isles: Peerage, Baronetage & Landed Gentry, 222,000 profiles, and/or Europe: Royal and Noble Houses of Europe, 298,000 profiles). These profiles could be merged with the existing HSAs to create a definitive and locked database.

2. A rigorous search of families could then occur to merge duplicated entries into these families.

My interest in this was triggered by my one 'royal' gateway ancestor, Anne Gascoigne. There seem to be 5 - 10 duplicated Gascoigne family trees on Wikitree covering the 12th to the 17th centuries.

John Cherry
in Policy and Style by John Cherry G2G6 Mach 1 (11.3k points)
recategorized by Chris Whitten

4 Answers

+1 vote
Hi John,

Thanks for your interest in this!

Establishing a definitive database is what we're trying to do, isn't it? That's why we mark profiles as Historically Significant, which marks them as the final profiles for those people. It's not yet "definitive", but it is "locked". Also, I'm not familiar with those databases, but a quick glance suggests that they're based on community trees. If that's right, then they could easily contain errors, and so wouldn't be any better than our current system. Or am I missing something there?

As for the second step, the rigorous search, that is what the group is working on. Sadly, there are few of us, and as you say, a lot of duplicates! So all we can do is keep working on it, and keep trying to get more people to join in the effort.

I'd be interested in hearing more of your thoughts on this! :)

by Liander Lavoie G2G6 Pilot (444k points)
The trees on HistFam vary, but the one that really could be a good starting basis is the British Isles - Peerage, Baronetage & Landed Gentry (220,000 people) because every entry is extensively referenced and based on 15 major historical publications. Importing the entire tree into WikiTree would give you the stem against which all merges of profiles claiming descent from British aristocracy could occur. The European Royal Houses one is almost as rigorous, but is currently being reviewed to improve its sources.

I want to make a little comment about British Peerage. Years ago when I was attempting to trace a Border Reiver family I had occasion to contact the editor of the book that is considered the authoritative source for Peerage. Baron's? I don't remember the name.

What he said was quite amazing. More than a few, he said, of the people who wrote histories for families were not correct. If a family inquired about a specific Royal, the writer made it fit - even if they were not related. I've been very sceptical in my own research ever since.

I have a cousin on each side of the family who fudged. One says dad descends from Erik the Red, and the other says mom descends from Robin Hood's rotund friend, Littlejohn. It makes for a fun story on a cold winter's night. We all have these kinds of stories.
+2 votes

I made similar observations (e.g. with the Vermandois family).

IMHO, the present data processing tends to limit the cleaning of the information. For instance, merging should be an addition of existing information, except where a single one is compulsory. When all profiles have been merged, it would be easier to handle the conflicting information.
by Living Pictet G2G6 Mach 3 (30.7k points)
Jacques, what do you mean "merging should be an addition of existing information, except where a single one is compulsory"?

Can you explain?

We do append the bio, images, memories, comments, etc. But you need to select which database fields to keep from which profile. Is that what you mean, keep these? How would you propose it happen, that the database fields from the merged-in profile be stored in the biography?
Perhaps I spoke too quickly. I was under the impression that one had to choose one of the profiles as a whole. My bad.

My input on this is that if people are passionate about a particular line, nothing in WikiTree stops them from proceeding. What I run into often is that someone loads a large GEDCOM of which they really are only working on their own direct ancestors. The rest of the profiles get little attention. On the other hand, as one who works on a lot of the HSA folks, it is often quite stunning how little research has gone into the creation of an individual. For instance, I've been on the Roosevelt family along with several others for over a month. The entire tree has numerous well-published sources, yet we have duplicate individuals with no credible sourcing. So... bite off as much as you can chew, and don't put so many names into WikiTree that you dread checking in. I would start with a free space page listing the peerage information you've stated. Then folks have a WikiTree source to view before adding new individuals. Here is an example for the founding fathers


+2 votes

I followed up on the Gascoigne line and you are correct: there is a massive number of duplicates. As arborists we'll start to tackle that line. Keep in mind that I wouldn't do any profile work until the merges are completed. It can get very frustrating to clean up a profile only to see your work compounded by the next merge.

by Ed Burke G2G6 Mach 2 (23.5k points)
On WikiTree we are a collaboration, which means that we don't have an authoritative entity that puts out data. Members post the information about their families, which becomes part of a single world tree. We often see discussions about spellings, and unlike other sites where people build separate trees, here we encourage settling on a single profile where those types of issues can be shared. For instance, I'm a Burke, but most if not all records show the family name originally spelled Burk. My ancestor Joseph used both. What is important is that we have only one Joseph in my family, not two, and while other relatives may continue to refer to him as a Burk, here on WikiTree his profile is Burke. Within his profile his name is shown with both spellings. My goal is to connect as many other Burk(e)s to my tree as possible, and I want Joseph to be as easy to find as can be. Having the official exact profile spelling isn't as important since it really isn't used other than as a shortcut to navigate. Having a rich profile with input from multiple interested individuals is what I value most about WikiTree.
I am all in favor of collaboration. But the way it is implemented here makes the whole thing very dependent on the absent or blocking users.

If we intend to reduce the amount of duplicates, some temporary measures - like the ones I mentioned above - could be the way to achieve that, IMHO.
Well said Mr. Burk(e) :D

Liz Shifflett (nee Noland, aka Knowland, Nolan, Nowland, Newlin, etc.)
Jacques: The last name can be changed, but that creates a redirect in the database. That way the ID that used to point to that person will then redirect to that person. If we didn't have that, links would break every time we changed a last name.

The same redirect happens when we merge. The profile ID that was merged away becomes a redirect to the new ID. That way no links are broken.

Since multiple redirects are bad, it is therefore bad to change the last name, and then merge the profile. So instead we just merge the profile. I'm not sure I understand how you think changing the last name first would help us reduce the amount of duplicates.
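To make the redirect point concrete, here is a minimal Python sketch of how an ID table with redirects might behave. It is only an illustration under stated assumptions: the `ProfileIndex` class, its methods, and the sample IDs are all made up for this example and are not WikiTree's actual database design.

```python
class ProfileIndex:
    """Toy model of profile IDs with redirects (hypothetical, not WikiTree's schema)."""

    def __init__(self):
        self.profiles = {}   # live ID -> profile data
        self.redirects = {}  # retired ID -> ID it now points to

    def add(self, profile_id, data):
        self.profiles[profile_id] = data

    def resolve(self, profile_id):
        # Follow redirects until a live profile is reached, counting the hops.
        hops = 0
        while profile_id in self.redirects:
            profile_id = self.redirects[profile_id]
            hops += 1
        return profile_id, hops

    def rename(self, old_id, new_id):
        # Assumption: changing the last name changes the ID,
        # so the old ID is kept as a redirect to the new one.
        self.profiles[new_id] = self.profiles.pop(old_id)
        self.redirects[old_id] = new_id

    def merge(self, away_id, into_id):
        # The merged-away ID becomes a redirect to the surviving profile.
        self.profiles.pop(away_id)
        self.redirects[away_id] = into_id


# Case 1: change the last name first, then merge. Old links need two hops.
renamed = ProfileIndex()
renamed.add("De_Vermandois-1", {"note": "sample duplicate"})
renamed.add("Vermandois-1", {"note": "sample final profile"})
renamed.rename("De_Vermandois-1", "Vermandois-2")
renamed.merge("Vermandois-2", "Vermandois-1")
print(renamed.resolve("De_Vermandois-1"))  # ('Vermandois-1', 2)

# Case 2: just merge. Old links need a single hop.
merged = ProfileIndex()
merged.add("De_Vermandois-1", {"note": "sample duplicate"})
merged.add("Vermandois-1", {"note": "sample final profile"})
merged.merge("De_Vermandois-1", "Vermandois-1")
print(merged.resolve("De_Vermandois-1"))  # ('Vermandois-1', 1)
```

In this toy model both routes end at the same profile; the only difference is the extra redirect left behind by the rename.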
Thanks for the explanation, but I got it the first time. It is a pity that varying data was chosen as the invariant part. AFAIK, people usually choose a reference number to play that role.

IMHO, it is easier to identify duplicates - especially multiple duplicates - if they appear on a single page.
I get what you're saying, but even if you changed the last name before merging, you'd still have to find the duplicates to do that. So either way, you're initially looking for duplicates with different names.
  1. If we are lucky, it could be possible to change last names for all members in one move (e.g. "De Vermandois" to "Vermandois"). This would save a lot of time.
  2. I maintain that having all Vermandois on a single list - instead of 211 "De Vermandois", 46 "Of Vermandois" and 168 "Vermandois", not mentioning the misspelled ones - would make the merging exercise much easier.

In other words, doing all this "by hand" looks like mission impossible. It is like emptying an ocean with a spoon.
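As a rough illustration of the "single list" idea in the two points above, the following Python sketch groups profiles under one key by stripping a leading particle from the last name. Everything in it is an assumption for the sake of the example: the `PARTICLES` set only covers the prefixes mentioned in this thread, the function names are invented, and the sample IDs are not real profiles.

```python
from collections import defaultdict

# Assumption: only the prefixes mentioned in this thread are treated as particles.
PARTICLES = {"de", "of"}


def normalize_surname(surname):
    """Drop leading particles and lowercase, e.g. 'De Vermandois' -> 'vermandois'."""
    words = surname.split()
    while len(words) > 1 and words[0].lower() in PARTICLES:
        words = words[1:]
    return " ".join(words).lower()


def group_by_surname(profiles):
    """Map each normalized surname to the list of profile IDs that share it."""
    groups = defaultdict(list)
    for profile_id, surname in profiles:
        groups[normalize_surname(surname)].append(profile_id)
    return dict(groups)


# Made-up sample data: (profile ID, last name as entered).
sample_profiles = [
    ("De_Vermandois-1", "De Vermandois"),
    ("Of_Vermandois-1", "Of Vermandois"),
    ("Vermandois-1", "Vermandois"),
    ("Burke-1", "Burke"),
]

groups = group_by_surname(sample_profiles)
print(groups["vermandois"])
# ['De_Vermandois-1', 'Of_Vermandois-1', 'Vermandois-1']
```

Note that this only builds a combined list for review; it does not change any last name, and misspelled variants would still need to be found some other way.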

There are too many disagreements about last name spelling to massively allow all last names to be changed. What I'm looking for is simply a way to enter multiple last names into the surname search field and then use all of our normal tools against them. So instead of searching for Burke, I type Burk, Birk, Burke, Burkes, Burks into the field and the list would bring all of them up for viewing. Once I see that we have merge opportunities but different last names, we use the standard tools to request that merge. Use the Bio and alternate last names to document what has been done.
Yes it should. People make mistakes and others carelessly repeat them over and over again because: [1] it is in a handy file on the web, and [2] information on the internet is never wrong.
I totally agree with this! After searching for matches in one spelling, it's necessary to do the process again in another spelling. (I work a lot in my Scottish background.) As we get to ancient names, Gaelic names, Scandinavian names, etc. we cannot predict how the spellings could change through several generations. And then, we always run into the question of "son of" (and "daughter of") and all these iterations in the various languages. Having a search option for more than one way of entering the name would be a marvelous tool.
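A search that accepts several spellings at once, as described in this comment, could look something like the Python sketch below. It is only a sketch: the function names, the comma-separated input format, and the sample profiles are assumptions for illustration, not an existing WikiTree feature.

```python
def parse_surnames(query):
    """Split a comma-separated query into a set of lowercase surnames."""
    return {name.strip().lower() for name in query.split(",") if name.strip()}


def search_surnames(profiles, query):
    """Return every profile whose last name matches any spelling in the query."""
    wanted = parse_surnames(query)
    return [p for p in profiles if p["surname"].lower() in wanted]


# Made-up sample data; the IDs and first names are placeholders.
sample_profiles = [
    {"id": "Burke-1", "surname": "Burke", "first": "Joseph"},
    {"id": "Burk-1", "surname": "Burk", "first": "Joseph"},
    {"id": "Birk-1", "surname": "Birk", "first": "Anna"},
    {"id": "Peake-1", "surname": "Peake", "first": "John"},
]

matches = search_surnames(sample_profiles, "Burk, Birk, Burke, Burkes, Burks")
print([p["id"] for p in matches])
# ['Burke-1', 'Burk-1', 'Birk-1']
```

The result is one combined list, so the two Josephs under different spellings show up side by side as merge candidates while each profile keeps its own last name.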
0 votes
Historically significant people are priceless. The trouble is not in creating the tree. The trouble comes from dumping massive amounts of unsourced data and merging it. One winds up with the same mess found elsewhere on the web. It is especially difficult when one is dealing with other languages during the time when the patronym was extinguished and the new surname emerged. Very often the two don't look alike. Few would guess Longstreet and Brinkerhoff were once Dircksz.

My g.g.grandfather lived in a state that had/has a three-part marriage license. His name was written three ways by three different people. One was the Court Clerk of the same name. Pake, Paeke, Peake, Peak, Peek. Ten years later I am trying to make a connection.

Perhaps it would be better to create one large file for each historical person. It is then one source in one place for all GEDCOMs for John Doe.

It's just two cents.
