Has distributed computing been reviewed as an option to calculate matches more frequently?

+5 votes
68 views
In some scientific research projects, the calculations are carried out on the computers of volunteers all over the world. This is called 'distributed computing', and it lets the researchers get their results much faster than they could with only their own computers. Of course this only works if you have a lot of volunteers, but I think these projects are already doing a great job of gathering them on platforms like BOINC, World Community Grid, etc.

Anyway, I was thinking we could use this kind of computational power for the bots on WikiTree. Out of 550,000 genealogists on the site, I think we should be able to find enough volunteers to help EditBot, MatchBot and the other bots on WikiTree keep the site more up to date. This way we wouldn't have to wait until the next day to see the impact of our changes on connections to the global tree, on the error reports the Data Doctors work with, and so on.

Do you guys think this is a good idea? And, maybe more importantly, is it feasible to implement?
in WikiTree Tech by Sibe Bleuzé G2G Crew (810 points)

1 Answer

+1 vote
My idea is the following. Create a string for each individual from fields like name, surname, father, mother, kids, years, etc. Replace "unknown" with an empty string. Then dump the strings into a new table, preferably on a separate server, so that this table has two fields: id and a string field that literally describes the individual and their nearest family. Then duplicate this table and join the two copies into a third table without a key (a cross join). For example, 1,000 records will produce a million records (1,000 × 1,000). The resulting table will have 5 fields: id1, string1, id2, string2, distance.
Then run a script that first transliterates the information to Latin script, and then a script that calculates the Euclidean distance between the strings. Convert the resulting value from an absolute number to a percentage, sort, and delete the rows where id1 equals id2. Done: we get a comparison of how similar each profile is to the existing profiles. Of course this will need crazy computation power, but even if it only runs monthly and a simple comparison finds 5 profiles within 25 days, it will be beneficial!
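To make the steps above concrete, here is a rough Python sketch. It is only an illustration under several assumptions and not how MatchBot or any WikiTree code works: the sample profiles and field names are made up, stripping diacritics stands in for real transliteration, and since "Euclidean distance between strings" is not defined on its own, the sketch takes it to mean the Euclidean distance between character-bigram count vectors. It also compares each unordered pair once instead of building the full cross join and deleting rows where id1 equals id2.

```python
import math
import unicodedata
from collections import Counter
from itertools import combinations

# Hypothetical sample data; the field names are illustrative, not WikiTree's schema.
PROFILES = {
    1: {"name": "Jan", "surname": "Novák", "father": "Petr", "mother": "unknown", "born": "1850"},
    2: {"name": "Jan", "surname": "Novak", "father": "Petr", "mother": "Anna", "born": "1850"},
    3: {"name": "Mary", "surname": "Smith", "father": "unknown", "mother": "unknown", "born": "1901"},
}


def describe(profile):
    """Build one lowercase descriptor string, dropping 'unknown' values."""
    parts = [value for value in profile.values() if value.lower() != "unknown"]
    text = " ".join(parts).lower()
    # Stand-in for transliteration (assumption): this only strips diacritics.
    # Converting Cyrillic, Greek, etc. to Latin would need a dedicated library.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def bigrams(text):
    """Count overlapping two-character substrings."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def similarity_percent(a, b):
    """Euclidean distance between bigram count vectors, rescaled to 0-100."""
    va, vb = bigrams(a), bigrams(b)
    keys = set(va) | set(vb)
    dist = math.sqrt(sum((va[k] - vb[k]) ** 2 for k in keys))
    # Largest possible distance for these two vectors (no shared bigrams).
    worst = math.sqrt(sum(c * c for c in va.values()) + sum(c * c for c in vb.values()))
    if worst == 0:  # both strings shorter than two characters
        return 100.0
    return 100.0 * (1.0 - dist / worst)


def rank_pairs(profiles):
    """Return (id1, id2, similarity) rows, most similar pair first."""
    strings = {pid: describe(p) for pid, p in profiles.items()}
    rows = [
        (id1, id2, similarity_percent(strings[id1], strings[id2]))
        for id1, id2 in combinations(sorted(strings), 2)
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for id1, id2, pct in rank_pairs(PROFILES):
        print(f"{id1} vs {id2}: {pct:.1f}%")
```

The number of pairs still grows with the square of the number of profiles, so a real run would have to split the pairs into batches, and those batches are the kind of independent work units that could be handed out to volunteers' computers.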
by Fedir Indutnyi G2G Crew (540 points)
