Matches are missing, resulting in duplicates

+5 votes
199 views

I added a profile for Ray Nash Studt (Studt-73) several months ago via a GEDCOM upload. A couple of days ago, I discovered there is a duplicate Ray Nash Studt (Studt-39). The two have the same full name with the exact same spelling and the same birth year. I was surprised, because I try to be very careful about comparing profiles when I'm adding people. But there it is, so I clicked on matches from Studt-39 to propose a merge.

The only potential match that was offered from the Studt-39 profile page was to Ray Stitt (Stitt-39) with no dates. I tried again to find the match from the Studt-73 page. I was offered matches to Stitt-39 again, Stout-1020, and Steed-1959. 

I ran another search for potential matches from the general "Find...Matches" page with very loose parameters, asking it to include any rejected matches. The two profiles still don't come up as potential matches for each other (and this confirms I didn't accidentally reject the match when I uploaded).

I have been able to propose a merge of the two profiles by just going to the general "Matches & Merges" page and typing in the two profile numbers. That is not the problem.

With the same exact full name and birth year, these two should be showing up as potential matches--especially when profiles that don't match either the name or the birth year are being offered as possible matches. And when you click on matches from each profile, the list of potential matches should be the same!

This seems like a glitch in the match programming.

in WikiTree Tech by Regan Conley G2G6 Mach 4 (46.0k points)
edited by Regan Conley
A merge was already proposed on 27 October.

Edit: Oops, you mentioned that. No idea why they aren't being found, sorry.
I can remember making profiles on multiple occasions and finding duplicates later, and I always look for possible matches.

In my case, a lot of it might have to do with Quebecois names being confusing for the match system, but it is certainly annoying!

2 Answers

+3 votes

It might be worth adding your example to https://www.wikitree.com/g2g/1122012/something-mechanism-suggests-potential-duplicates-creation, so that Jamie has various examples all in one place when she tries to improve the matching.

by Paul Masini G2G6 Pilot (389k points)
Thanks! I didn't see that when I searched for similar questions. I will add it there.
+1 vote
The death dates were different by 20 years, so they wouldn't have shown up as a match.
by Jamie Nelson G2G6 Pilot (627k points)

That's insane

Two people with the exact same names and birthdates don't show as a match because one editor didn't have accurate or complete information on a death date?

The algorithm may be functioning the way it's been programmed, but it shouldn't be programmed that way.

Anyone with the same full name and birth year should certainly show up as a potential match for the editor to evaluate. If people that don't even have similar names show up, I would expect anyone that has the same name to show up, especially if they have at least one identical date.

Imagine how many John Smiths there are born the same year (+/- 2 years). The death date is used to narrow the results down.

If you search for John Smith b. 1887 and get 3000 results, then, yes, it makes sense to further reduce that list by other criteria, like location or death date. That's a fairly simple "if more than 50 results, then filter by death date" rule.
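A rough sketch of that rule, to make the idea concrete. This is not WikiTree's actual matching code. The `Profile` class, the `find_matches` function, the 50-result threshold, the tolerances, the example IDs, and the death years are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Profile:
    # Hypothetical record; field names are illustrative, not WikiTree's schema.
    wikitree_id: str
    full_name: str
    birth_year: Optional[int] = None
    death_year: Optional[int] = None


def find_matches(target: Profile, candidates: List[Profile],
                 max_results: int = 50, birth_tol: int = 2,
                 death_tol: int = 2) -> List[Profile]:
    """Match on name and birth year first; use death year only to trim long lists."""
    same_person_candidates = [
        c for c in candidates
        if c.wikitree_id != target.wikitree_id
        and c.full_name.lower() == target.full_name.lower()
        and c.birth_year is not None
        and target.birth_year is not None
        and abs(c.birth_year - target.birth_year) <= birth_tol
    ]
    # Short list: show everything and let the editor evaluate it.
    if len(same_person_candidates) <= max_results or target.death_year is None:
        return same_person_candidates
    # Long list: narrow by death year, keeping profiles with no death year.
    return [
        c for c in same_person_candidates
        if c.death_year is None
        or abs(c.death_year - target.death_year) <= death_tol
    ]


# Made-up example: same name and birth year, death years 20 years apart.
a = Profile("Example-1", "Herbert Octavius Schnicklheimer", 1887, 1950)
b = Profile("Example-2", "Herbert Octavius Schnicklheimer", 1887, 1970)
print([p.wikitree_id for p in find_matches(a, [a, b])])  # ['Example-2']
```

With only one same-name candidate, the death-year mismatch never comes into play, so the other profile is still shown for review.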

It does not make sense that you ask for profiles like Herbert Octavius Schnicklheimer b. 1887 and get ONE other Herbert Octavius Schnicklheimer b. 1887, but hide him (for any reason), simply because the search began from a certain page.

The current search clearly can find both profiles and make them the top results, right next to each other. The problem is that it's intentionally hiding perfect matches for reasons that don't make sense and, worse, offering illogical alternatives ("Guess what?! We found your guy, but forget that! How about Henry Shack in New Zealand instead?").

If the profile matching search is functioning the way it's been programmed to (which it sounds like), then it seems the reason people can't find duplicates is because the programming is hiding them. It would make sense to not do that.

The duplicate detection when creating a profile is just https://www.wikitree.com/wiki/Special:SearchPerson. A list of results is shown after you enter a birth date, and then they update again after you add another piece of information to filter the matches (the death date). It's not the system's problem that the user didn't check the first set of matches. 3 matching pieces of data and 1 major mismatch doesn't make a "perfect" match.

The vast majority of the time that duplicates aren't detected, it is because the first names are different (Liz vs. Elizabeth, etc.). That's currently being worked on, and I suspect there will be a drastic decrease in duplicates created once that's done.
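For illustration only, one common way to handle that kind of first-name mismatch is to map nicknames to a canonical form before comparing. The sketch below is an assumption about the general technique, not a description of what WikiTree is building. The `NICKNAMES` table and both function names are hypothetical.

```python
# Hypothetical nickname table; a real one would be far larger.
NICKNAMES = {
    "liz": "elizabeth",
    "beth": "elizabeth",
    "bill": "william",
}


def canonical_first_name(name: str) -> str:
    """Lowercase and trim the name, then map known nicknames to a full form."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)


def first_names_match(a: str, b: str) -> bool:
    """Treat two first names as equal if their canonical forms agree."""
    return canonical_first_name(a) == canonical_first_name(b)


print(first_names_match("Liz", "Elizabeth"))  # True
print(first_names_match("Liz", "Ray"))        # False
```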
I'm not sure what happens when you add an individual profile.

As you can see from my original post, this duplicate was not presented as a potential match when I created the profile from a GEDCOM upload. I didn't reject it. I didn't fail to "check it." It wasn't presented.

Once I realized a duplicate had been created, I tried to merge and the system didn't offer the two as matches.

"The user" (me) checked and re-checked in different ways, so let's not say that "it's not the system's fault that I didn't check."

And we're not talking about the system not detecting duplicates because the first names don't match. You said it was because the death dates were different on a matching first name, middle name, last name and birth date.

The bottom line: I hear you that it's not a bug and that the system is intentionally programmed to function the way that it is currently functioning. Let's consider the issue closed.
