WikiTree matchbot running blindly around proposing merges ...

+17 votes

I received a long mail from WikiTree matchbot proposing various merges of so-called "duplicates". The only constants are that (a) they all have the same years, (b) they might share a similar or identical first, middle or last name, and (c) one of the profiles does not have any parents attached. I rejected all of them.

This is worrying, because the profiles obviously have either different first and middle names, or last names. For example: Pretorius-2516 (L) and Maartins-1 (R)

I have counted these proposals: 9 in total. And that is me alone. I wonder how many inaccurate or bad merges this surge in matchbot proposals will lead to across the WikiTree spectrum if everyone receives this today! See: Venter-6

* edit: added Space Bot Monitors project to this question now

asked in The Tree House by Philip van der Walt G2G6 Pilot (140k points)
edited by Bea Wijma
Hi Philip, thank you for posting this.  Sounds like Bea also believes there is an important issue here.  I took off your tag "improvements" only to temporarily fit in "bug" and "sysops".  This will hopefully have it be seen by Julie and Erin first thing this a.m.  There is a chance we might even "flag" your post... not for any bad reason at all.  Only to bring it to Eowyn's attention first thing as well.  If so, we can remove the flag and restore tags as soon as a Team member arrives on the scene.

Thank you Philip.
Thanks for the help, Keith! It indeed looks like Matchbot is proposing all kinds of merges with just one thing "matching".

It’s been running a bit wild since the change to the search parameter to add Unknown to the mix (it had previously always been excluded, so these had never been in any search). Pretty sure we got through all those before the next changes to the search parameters: ignoring spaces in a LNAB and a change in how first name/middle name were looked at. It’s gotten a bit ridiculous since then, but “this too [should] pass”.

Liz, [MatchBot MP]

I think it's worth having a Wiki Genealogist look at Maria Catharina Elizabeth (Maartins) Pretorius (1885) and Martha Catharina Pretorius (aft. 1885) to see if they're the same person.

I've offered to Liz that if the MatchBot Monitors would prefer it, we could have MatchBot give them a list of suggestions rather than proposing the merges. But that would be more work for them, rather than distributing it to all volunteers. It's an option Liz and I will keep talking about.
Not only more work, but more than I think should be asked of MatchBot MPs. One benefit I've seen of bad MatchBot proposals is that the manager will post additional information in the rejection comment, including managers who have made no contributions since uploading their gedcom years ago. The MatchBot MPs can't do that. They can only go by what's posted and visible (MatchBot looks at all privacy levels, but we can't).

There is only one other profile on WikiTree with the LNAB Maartins, confirmed by a marriage certificate. Given that there might also be variations, perhaps one of the SAR-project members could look into that (there must be a familial link between the two Maartins profiles somewhere).

However, even if they are duplicates (one seeming to be the source of the one married to Maarten, and even considering that the spouse had Martin as a second name), I do not think that this one "correct" merge proposal (let's say!) would prove the other nine proposals correct as well. Consider all the bad matches (merge proposals today) with the Venter genealogy, which has even experienced genealogists and secondary sources confused by all the in-marriage and faulty secondary sources, some dating back to 1887. The odds are against it.

I am on the Matchbot project and do some of the reviewing/rejecting of bad matches.  I went through what was rejected this morning that I had in my inbox and tried to sort them into groups by reject reason.

Description                   Count
Different Parents/Siblings    14
Different Countries           12
Different States              8
Different LNAB                8
Other                         3
Different First Names         2
Different Spouses             2
Different First/Last Names    1

0711 Matchbot Reject Reasons

I don't know if the match criteria can be tweaked to include looking at the place of birth or parents' names, but that could help to eliminate a lot of the non-matches that we reject.
I still feel that even these non-matches are helpful as I use them to tag profiles as unsourced when applicable as many are really old profiles that haven't been touched in years.  Unfortunately, a lot of these are being matched against up-to-date profiles and causing some unneeded stress.
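To illustrate the suggestion above about checking place of birth and parents' names, here is a minimal sketch in Python. It is not how MatchBot actually works: the `Profile` record, its field names, and the `reject_reasons` and `should_propose` helpers are all hypothetical, and the example IDs are invented. The idea is only that a pair is dropped when both profiles have a value for a field and the values disagree, while missing data never blocks a proposal.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Profile:
    # Hypothetical, simplified record; not WikiTree's real data model.
    wikitree_id: str
    first_name: str
    lnab: str
    birth_year: Optional[int] = None
    birth_country: Optional[str] = None
    father: Optional[str] = None
    mother: Optional[str] = None


def conflicts(a: Optional[str], b: Optional[str]) -> bool:
    """True only when both values are known and they differ."""
    if a is None or b is None:
        return False
    return a.strip().lower() != b.strip().lower()


def reject_reasons(left: Profile, right: Profile) -> List[str]:
    """Collect reasons (named after the tally categories) to skip a proposal."""
    reasons = []
    if conflicts(left.birth_country, right.birth_country):
        reasons.append("Different Countries")
    if conflicts(left.father, right.father) or conflicts(left.mother, right.mother):
        reasons.append("Different Parents/Siblings")
    return reasons


def should_propose(left: Profile, right: Profile) -> bool:
    """Propose a merge only when no known field contradicts the match."""
    return not reject_reasons(left, right)


# Invented example: same first name and year, but different countries.
a = Profile("Example-1", "Mary", "Smith", 1680, "England", "John Smith")
b = Profile("Example-2", "Mary", "Jones", 1680, "Netherlands")
print(should_propose(a, b), reject_reasons(a, b))
```

In the tally above, "Different Parents/Siblings" and "Different Countries" together account for 26 of the 50 rejections, so a check along these lines might remove a good share of them, assuming the old profiles have those fields filled in at all.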
I just got my first email from Matchbot with a proposed merge. It's very confusing, because it was so obviously not a match that it was already marked as a rejected match. There was nothing in the two profiles that would indicate that this was the same person: no common name, no common parents, no common birth date or place. Craige-46 & Mitchell-2677. Is this a common occurrence?

Hi Cindi,

That one threw me for a loop this morning as well. Usually there is at least one thing I can point to that they have in common besides year of birth, but this one really didn't. They had the same year of birth, but that is all they had in common, other than that their first names both began with 'M'.

We have issues with matches not really matching, but this one was really an outlier in having only the year of birth match; usually some other fields match or are very similar.

I will make a note of this and continue to watch to see if we get more.


Thanks. It kinda threw me for a loop to have that merge proposal come through. It was the first I'd seen from Matchbot.

2 Answers

+12 votes
Best answer
It does indeed look like a problem, Philip. It's as if it is now proposing merges for all profiles with perhaps just one thing in common (name, middle name, last name, date)? I'll send a link to your question to the experts; perhaps someone can fix this.
answered by Bea Wijma G2G6 Pilot (243k points)
selected by Philip van der Walt
I had a merge proposal like that too, based solely on the name Mary and the approximate birth date of 1680. The two had totally different surnames at birth and current married names, different husbands, different children, and were from different countries. And looking at the profile, other merge proposals have been way off too.
And it has been doing this for the past 5 or 6 weeks at least! And many of us have been trying to get someone to stop it!
I know a while ago there was a change, and Matchbot is now also comparing and proposing merges for/with Unknowns, so perhaps it has something to do with that. I have added a request for help and a link to this G2G in the leader and the Matchbot group, so I think someone will look into this. Perhaps there's a patch or something that can stop these random merge proposals. It's indeed a problem.
+8 votes
I mentioned this problem recently as well. I agree that it is a problem because some who do not pay attention to the merge proposals may allow some bad merges to be completed by others who think that they are doing the right thing.
answered by Dale Byers G2G Astronaut (1.2m points)
Exactly my worry, but when I posted on Matchbot's page to express my concerns, I was told he's only doing his job. Well he may be only doing his job but he's not doing it very well in my opinion. Before these changes I was having to sift through a hundred or more suggestions when I created a Smith profile, so I gave up sifting. I haven't created a Smith profile recently but it will be interesting to see how many potential "matches" Matchbot comes up with next time I do.

