Why does WikiTree give false suggestions on creating a parent profile?

+5 votes
251 views

On creating a parent profile for , I got the following message: "If any of the following appear to be a match do not proceed to create the new profile. Connect the existing profile instead.

  • Barend Johannes M Vorster 1780 de Caep de Goede Hoop, Dutch Cape Colony [South Africa] - ~ Managed by Susan de Bruyn. [view] [set as father]
  • If there is no match above, mark the checkbox by names that are close. This will save a "rejected match" so that others do not confuse the two people in the future.
    If one of the above may be a match but you're not sure, request to join the Trusted List or contact the Profile Manager. In very rare cases, you may want to create your new version of the profile and then later propose a merge if they're duplicates. This should not be done often. It's important that WikiTree only has one profile for every person. Duplicates damage our entire project."

Vorster is not the same as Verster! The first names Ryno Johannes are not the same as Barend Johannes M ....

WikiTree profile: Ryno Johannes Verster
in The Tree House by Philip van der Walt G2G6 Pilot (150k points)
I up-voted this post
Thanks Keith, appreciated. I wonder from whom the down-vote was? :-)
We'll never know unless we become sysops :)
... :-)

I didn't vote either up or down, but in this case I can sympathize with the down-voter: your post paints a rather alarmist view of WikiTree that I just don't think is accurate. (It could still turn out to be true, but I don't think this is one of those cases.)

All I can say is that I copy / pasted that exact text. Had I followed the suggestion, I would have created a false profile. The results do not lie.

5 Answers

+7 votes
 
Best answer
Hi Philip,

Because it is not 100% perfect yet, and it requires a human eye to verify :)

Have a nice day!
by Keith Hathaway G2G6 Pilot (603k points)
selected by Philip van der Walt
Thanks Keith, you too (I'm confident that future tweaks will remedy this - I just hope that WikiTree will be ready for the time when quantum computing rules the ether ... :-)
I am fascinated with the quantum world. I'm a YouTube expert on the subject (ha)

Computing will evolve more than we can imagine.

Cheers :)
So I gather. I reckon that this WikiTree enterprise will be considered the primitive one-dimensional pioneer beginning (with other internet genealogical sites) of a new multi-dimensional super-fast genealogy. We will probably be obsolete as WikiTreers ....
It will be a sad day when WikiTree's computer makes the decisions about relationships rather than putting up a list of suggestions from which we can choose, or not.
I have this feeling that between computers and DNA testing, genealogists as a whole will one day be obsolete; no mysteries and all info readily available.

I was sitting with this college senior one time who was recording what a newscaster was saying as homework towards her degree as a court stenographer.  Here she was just a month from graduation and hopefully employment and we both listened as the news report shifted to "Why court stenographers are obsolete".  Of course it was about voice-recognition software and implications.
You are describing a future day in which smiling computers will live lives of creative fulfillment and human beings will be bored stiff.
Maybe. But we taught computers to crunch numbers, so nowadays nobody sits around sliding beads on their abacus all day. We're happy to ask our phones math problems rather than do them ourselves. We can probably find a way to keep ourselves busy even after genealogy is no longer a challenge :)
WikiTree, year 3000.  "Where computers collaborate."
Ha!  Outstanding!
Indeed. That's a good one, Jack! Where computers can democratically hash out issues and up- or down-vote questions / proposals ... as someone did with me after posting this G2G question.
Never take it to heart.... down-voters will be down-voters.  What a way to live though; down-voting the weather, down-voting collaboration, down-voting lunch.  There are enough downs in the world.  Better to seek and spread ups wherever appropriate and to pass politely by people who would do otherwise.
+10 votes
Because people are fallible, spellings vary, and no algorithm is perfect. It is better to have too many possible matches than too few.

I know it can seem overwhelming sometimes and frustrating but just chalk it up to being extra careful in an effort to keep duplicate profiles from being created. I have actually found several matches that I didn't immediately think were matches.
by Deb Durham G2G Astronaut (1m points)
Yes people are fallible. But this is an automated suggestion, algorithmically driven. It should be improved.
I have no doubt it will be improved. I'm just not sure how it will or can be improved to narrow it down in a way that won't ultimately discard potential matches that actually match but have discrepancies.

When I said people are fallible, I meant they make typos, they accidentally enter birth and death dates for a sibling, parent, or spouse, they get the proper name wrong, they get the middle name wrong,  they spell the surname incorrectly, etc. There are so many reasons the correct match may look like the wrong match due to human error I'm not sure it is beneficial to tweak the algorithm to the point it discards these possibilities.

I will say that I have seen some totally off the wall "matches", though. The surname you mentioned could easily be a misspelling or a variation but some aren't even close beyond the first few letters. ;-P
+6 votes
Philip, as others have mentioned, people are not perfect, and they are the ones programming the computers, so a computer program will never be perfect either. If you give the same problem to 10 programmers you will end up with at least 12 solutions, and none of them will be 100% right or wrong.
by Dale Byers G2G Astronaut (1.3m points)
+7 votes
Partly because a lot of the existing data is inaccurate.  For instance, thousands of profiles on WikiTree have acquired extra given names that the person never really had.
by RJ Horace G2G6 Pilot (562k points)
+8 votes
It's actually pretty close -- close enough to suggest, and let a human decide.

Vorster and Verster differ by only one letter, 'o' vs 'e', which could easily be a typo or transcription error. Both have Johannes. In addition to the names being similar, the dates and locations are also a close match.
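To make the "only one letter" point concrete, here is a minimal sketch of one common fuzzy-matching measure, the Levenshtein edit distance. This is an illustration only: the thread does not say how WikiTree actually scores matches, and the function names and the `max_distance` threshold below are assumptions made for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca -> cb (or keep if equal)
            ))
        prev = curr
    return prev[-1]


def is_possible_match(a: str, b: str, max_distance: int = 1) -> bool:
    """Hypothetical rule: flag two names as a possible match when they are
    within max_distance edits of each other, ignoring case."""
    return edit_distance(a.lower(), b.lower()) <= max_distance


print(edit_distance("Vorster", "Verster"))    # 1
print(is_possible_match("Ryno", "Barend"))    # False
```

Under a rule like this, the surnames alone would be flagged while the first names would not, which is why a real matcher would have to weigh several fields (names, dates, places) together before suggesting a profile.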

How many times in your research have you discovered the same person but with different first or middle names? I have one ancestor with eight different suggested middle initials in various trees and sources. Source records are inconsistent and no one seems to really know the truth. Yet it's clear we're all talking about the same person.

If we only relied on exact matching of spelling, many, many ancestors would be lost to time (and typos).

So the computer is trying to be helpful by offering up possible matches using fuzzy matching, and letting the humans decide how remote the possibilities are.

In terms of computer fuzzy matching, I'd say this was an okay suggestion to consider. I've seen suggestions in other applications that were much more farfetched and useless. Yet I'm glad for the suggestions and additional clues.
by Dennis Wheeler G2G6 Pilot (532k points)
