Proposal - Preventing gedcom bombs

+13 votes
235 views
Anyone who has moved beyond their own research lines has encountered the problems left by previous gedcom bombs: duplicates, erroneous data, mixed-up families, etc. There is nothing to do about them but work through them.

The problem is that it is still happening. It happens to a lesser extent with the controls now in place, but nevertheless bad uploads are still getting through and are not always being tidied by the uploaders.

I propose that new users should be limited to 30 individuals for their first gedcom upload; they can upload more for checking, but import is limited to 30. This is enough for a 4 generation ancestor tree, a single line going back 15 generations or a smaller family group with siblings and children. After import the user can request permission to upload more, but this can only be approved after someone has checked their first import. The gedcom checker would first check there are no duplicates, pointing the user to the merge pages so they can learn to merge, then they would check (a sample of?) profiles to see if, at least, basic editing has happened: removal of gedcom cruft and of the "this is a gedcom import" message, referring the user to the bio style guide if necessary. The gedcom equipped badge is only awarded after checking the first upload.
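To make the 30-person limit concrete, here is a minimal sketch of how such a check might look. It is only an illustration: the names `FIRST_IMPORT_LIMIT`, `count_individuals` and `may_import` are hypothetical and say nothing about how WikiTree actually processes uploads. It relies on the fact that each person in a GEDCOM file starts with a level-0 line such as `0 @I1@ INDI`. Counting the starting person as generation one, a 4 generation ancestor tree holds 1 + 2 + 4 + 8 = 15 people, which fits within the limit.

```python
import re

# Hypothetical limit taken from the proposal; not an actual WikiTree setting.
FIRST_IMPORT_LIMIT = 30

# A level-0 individual record looks like: 0 @I1@ INDI
INDI_RECORD = re.compile(r"^0\s+@[^@]+@\s+INDI\b")


def count_individuals(gedcom_text):
    """Count the individual (INDI) records in the text of a GEDCOM file."""
    return sum(
        1 for line in gedcom_text.splitlines() if INDI_RECORD.match(line.strip())
    )


def may_import(gedcom_text, is_first_upload, limit=FIRST_IMPORT_LIMIT):
    """Allow the import unless a first-time uploader exceeds the limit."""
    if not is_first_upload:
        return True
    return count_individuals(gedcom_text) <= limit


def ancestors_in_generations(generations):
    """People in a full ancestor tree: 1 + 2 + 4 + ... = 2**n - 1."""
    return 2 ** generations - 1
```

A larger file could still be uploaded for checking, as proposed; only the import step would call `may_import`.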

After this process users would be aware of the work involved and would perhaps be more careful with their future uploads.

I was thinking the checking might be part of the greeters' function, as they already have personal contact with new users, but their workload is already high, so perhaps a new volunteer group would be better.
in WikiTree Tech by Living Geleick G2G6 Pilot (225k points)
I would like to participate in this discussion. I assume that I know what a GEDCOM bomb is and that I have inadvertently engaged in their use. My problems have come about by thinking that I was correctly uploading Swedish letters such as ö, ä and Å and discovering after WikiTree acceptance that èo, èa and êA were, in fact, uploaded. I drew WikiTree's attention to this problem and was able to have one of my three uploads deleted. In correcting this problem I have realized that it has occurred before with Swedish uploads and might be a problem when uploading from Brazil.

Basically though, I do not know what the approvers are looking for. I would think the possible uploading of Swedish, Chinese or Arabic would be looked for, and by including a cover sheet potential language problems might be avoided.
I just registered and tried to upload my tree, which has 7170 individuals representing 28 years of research. Not permitted. That ends my interest in WikiTree right there. It is completely counterintuitive to have a project like WikiTree which aspires to creating some sort of comprehensive worldwide tree yet stops the most extensive data from getting into the system right there. Pathetic. I'm out of here.
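The garbled Swedish letters described above (èo for ö, êA for Å) match what appears when a file in the older ANSEL character set is read as Latin-1: ANSEL stores a diacritic as a separate byte placed before its base letter, and the bytes for diaeresis (0xE8) and ring above (0xEA) display as è and ê in Latin-1. Assuming that is what happened here, which the thread does not confirm, a rough repair sketch could look like this. The function name `repair` and the two-entry table are illustrative only, and a genuine è or ê followed by a letter would be changed wrongly, so it should only be run on text known to be damaged this way.

```python
import unicodedata

# Assumption: the file was ANSEL-encoded but decoded as Latin-1.
# In ANSEL the diacritic byte comes BEFORE the letter it modifies.
ANSEL_AS_LATIN1 = {
    "\u00e8": "\u0308",  # byte 0xE8 (ANSEL diaeresis) -> combining diaeresis
    "\u00ea": "\u030a",  # byte 0xEA (ANSEL ring above) -> combining ring above
}


def repair(text):
    """Turn sequences such as 'èo' back into 'ö' (only these two marks)."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in ANSEL_AS_LATIN1 and nxt.isalpha():
            # Unicode wants the combining mark AFTER the base letter.
            out.append(nxt + ANSEL_AS_LATIN1[ch])
            i += 2
        else:
            out.append(ch)
            i += 1
    # NFC composes letter + combining mark into a single character.
    return unicodedata.normalize("NFC", "".join(out))
```

A safer route, where the genealogy program allows it, is to export the GEDCOM as UTF-8 in the first place and check the `1 CHAR` line in the file header before uploading.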
Malcom

I appreciate your frustration. It has not been easy for me to merge my research into WikiTree. I do believe that a permanent record of the research I have done is rewarding. If I have found 5000 individuals who are connected into one large family, I do not want that information lost.

I have broken up my tree into branches of from 300 to 700 people. The WikiTree philosophy has forced me to check each one of my individuals. I try to be as accurate as possible but I still find the Anna who was recorded as male. Currently I am investigating why two Samuel Gustaf Mattsons, born on different days, were married to the same person and died on the same day. Mistakes happen and I appreciate finding them so that I may correct them.

Do not reject WikiTree. Break up your tree into separate branches, upload them, make sure they are correct and move on. What better way to preserve 28 years of work?

 

Norm

2 Answers

+5 votes
 
Best answer

I like your idea, Rhian. After dealing with a gedcom bomb yesterday I would have to agree some changes are needed. Uploading large gedcoms and ditching them soon after is a problem. I greet on a daily basis so I have a good feel for the number of gedcoms being uploaded during my scheduled time. The number of questions I receive after the upload is another story. Below is a perfect example of why some changes need to be made:

Hi Michelle,

My name is ____ :)
I think I uploaded 4 gedcoms:
2013_09_16.ged
2013_09_16_dead.ged
2013_09_16_living.ged
2013_09_16_Val.ged
The first one was the biggest, with over 5,000 names.
The other three are subsets of that one.
WikiTree is all new to me and I was just experimenting with the facilities. What may happen is that I delete my account and my wife (who is - as I have mentioned - the genealogist in the family) creates her own account.
 
As a greeter and an Arborist I'm torn between adding to the greeters' workload and creating a new volunteer group. The greeters get a feel for what's happening with new users. A new volunteer group would not. Adding a volunteer group to the greeter feed would make things a bit messy. As an Arborist I'd like to see someone engaging new users about gedcoms in order to eliminate some of the messes I'm cleaning up. When I receive an e-mail pertaining to a gedcom like the one above, I shoot an e-mail to Eowyn to give her a heads up.
 
 

 

by Michelle Hartley G2G6 Pilot (167k points)
selected by Matt Pryber
As I said above, the greeters are already in contact with new users and, as you say, they get a feel for what is happening. But I am reluctant to suggest them, as it may increase their workload while decreasing that of the arborists; they have enough work for a decade or two as well.

Perhaps we should look at a mentor system; some may need lots of help, others hardly any. I am not sure how many new users there are a day, or how many come back a second day.
+4 votes
Rhian,
It’s great you’re trying to figure out a solution for this dilemma. The problem with GEDCOMS is one we've debated for a long time. Some have suggested doing away with them altogether, but if we did that, many people with very well developed family trees (the kind of people who are a huge asset to WikiTree) wouldn't join our community because they wouldn't be willing to retype such a huge amount of data. Others don't have genealogical software on their computer and don't want to worry with the technical aspect of adding it and figuring out how to split their GEDCOM. We’ve come across quite a few who have simply given up in the attempt to split a GEDCOM. Others have given up because GEDMatches is too confusing or they would have to skip too many ancestors that have descendants attached to them. The prospect of reconstructing these families overwhelms them.

As far as the greeters taking a more active role in GEDCOM assistance, we’d need a lot more greeters. From midnight last night until noon today, we had confirmed 36 new members so by midnight tonight that number will have increased quite a bit.  We've greeted about 66 guests, not including the fact that most of the new members were also greeted as guests before they were confirmed as members so we contacted most, if not all, of the 36 twice. We also thank guests for uploading GEDCOMS, photos, and other data. Along with that, there's quite a bit of interaction with guests, volunteers, and greeters through email messages. You are correct when you say that some new members don't need much help, but others need a lot. Often many additional hours are spent (off the scheduled time) dealing with those people who need extra help—and we’re happy to do it. We want them to enjoy WikiTree as much as we do and are willing to go the extra mile to make sure that happens, but in order to take on such a large responsibility as GEDCOM help or review, we would need more greeters, a LOT more.
by Debby Black G2G6 Mach 8 (85.0k points)
That was what I was afraid of, which is why I brought up mentors as a possibility. Greeters are already too busy, and a mentor would be someone a greeter passes the user on to, to take over the more technical problems such as gedcoms, profile editing and merges. I have no idea how feasible the idea is; my initial proposal was a better way to control gedcom uploads.

To be honest, if the only way to stop the proliferation of bad profiles from gedcom imports was to totally ban them then I would be in favour of a ban, but I am trying to find a way to have them and to have fewer problem profiles. Computer literacy of the average user is something we are discussing in the Style Committee and I see now that it might be a more general problem in the wider community, which brings me back to mentoring. I shall have to think further about this.
Well said, Debby. Yes, we would need a LOT more. People don't realize how much we do on the side of greeting, helping new members. Many of the questions asked are about gedcoms. My feeling is they need more information. But I don't see that we are the solution unless we had at least twice the number of greeters or more.
One thing I will say real quick while I'm thinking about a response to all this is that there is quite a bit of GEDCOM information available. New members need to READ it. Greeters can help direct them to the GEDCOM FAQ. Perhaps in the message left when confirming someone.
Then you have greeters like me who are part time and have very little, if any, GEDCOM experience. I am "Computer Illiterate," I type slow and learn slow. My ability to be more active is hampered by family obligations. I try to get on two or three times a day. Lately they have been brief stays. For your proposal we would need computer-savvy techs who have the time to dedicate. I do think your proposal has a great deal of merit. As stated above, if we lose members or potential members with well developed lines, would it be worth the restrictions? As I add my tree in WikiTree, I am asked is this person the same as this one. With nothing to go on except a name, how would I know? We still have a lot of these profiles. Matching these people with incoming GEDCOM profiles is impossible. Maybe we need to begin going through and adding approx. dates to these profiles. This would require a standard by which everyone would add a date or research a dated person.
You're right, Eowyn. Most of the information they need is there, but often people feel more comfortable asking us rather than spending a lot of time searching and reading. I generally try to answer their questions and include links so they'll know where the information came from. This is in the hope that they will read further.

Speaking of including GEDCOM information in their confirmation letter, I’ve got my confirmation letter so crammed full of things they need to know that there’s little room left for anything else.

When they first attempt to upload a GEDCOM, perhaps the GEDCOM FAQ should pop up advising them to read the FAQ before they upload a GEDCOM.  If they read it before uploading, that would save some guests and volunteers a lot of grief.  Of course, none of this addresses Rhian’s issue about GEDCOM bombs. The solution to that is beyond me.

Eowyn is right about there being quite a bit of information available.  There is almost too much.  The best advice I would give to new guys is to make sure their trees have Dates and Places (preferably with real sources) and then to start very, VERY small.

It is easier to learn by doing, and the more detailed the instruction is, the less it will be read (how many people have seen the new Apple iOS 7 agreement and bothered to read it at all?). We should teach them how to: 1) check for missing/malformed data; 2) break out one single family from their tree as a GEDCOM; then 3) import and validate the result.
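Step 1 above, checking for missing data, is the kind of thing a short script can do before anything is uploaded. The following is a rough sketch only, assuming a plain GEDCOM text file; the function names are hypothetical and it looks at nothing beyond name, birth date and birth place.

```python
def split_individuals(gedcom_text):
    """Group GEDCOM lines into {individual id: [(level, tag, value), ...]}."""
    records = {}
    current = None
    for raw in gedcom_text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        level = int(parts[0])
        if level == 0:
            # "0 @I1@ INDI" starts a person; any other level-0 line ends one.
            is_indi = len(parts) == 3 and parts[2].strip() == "INDI"
            current = parts[1] if is_indi else None
            if is_indi:
                records[current] = []
        elif current is not None:
            value = parts[2] if len(parts) == 3 else ""
            records[current].append((level, parts[1], value))
    return records


def missing_birth_details(gedcom_text):
    """Return {individual id: [missing items]} for incomplete profiles."""
    report = {}
    for indi_id, lines in split_individuals(gedcom_text).items():
        found = set()
        in_birth = False
        for level, tag, value in lines:
            if level == 1:
                in_birth = tag == "BIRT"
                if tag == "NAME" and value.strip():
                    found.add("name")
            elif level == 2 and in_birth and value.strip():
                if tag == "DATE":
                    found.add("birth date")
                elif tag == "PLAC":
                    found.add("birth place")
        missing = sorted({"name", "birth date", "birth place"} - found)
        if missing:
            report[indi_id] = missing
    return report
```

Anyone listed in the report has only a name, or less, to match on, which is exactly the situation that makes matching so hard.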

Once you get to SEE it done it gets much easier and the problems are fewer.  Alan's comment about:

"I am asked is this person  the same as this one.  With nothing to go on except a name, how would I know?"

gets to the heart of most of the problems with matching, however. I would invite everyone to see William Wood ( http://www.wikitree.com/wiki/Wood-675 ) and do a "search" for matches. I think there are a total of about 3 that have Dates and Places and all are rejects. You do not have much if you only have a name.

Is it possible that with a limit of 30 people for the first gedcom upload, the user would learn the process better? Not being overwhelmed with loads of matches, or having to check hundreds of bios, has to be easier.

Is there a method of making sure people have at least seen, and preferably read, the gedcom FAQ? Perhaps the add gedcom page should have a new first step, before choose file and upload: a 'read the FAQ' step for people who have not uploaded a gedcom before.

Could GEDmatches be improved by using a baptism or burial date where there is no birth or death date? I know some people are against putting calculated dates into a profile, but anything that improves the reliability of GEDmatches has to be good.

I am more convinced that my initial thoughts, that this would be too much work for greeters, are correct, though I still think there should be some sort of check before allowing further uploads. Nobody has yet commented on the mentor idea. I am thinking that it might be a more technical group of volunteers who offer to train and assist users, specifically with gedcom uploads, but also perhaps with complicated merges.
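As a sketch of the fallback idea above, and without any knowledge of how GEDmatches actually works, the matching step could pick the best available date and flag it when it is only a stand-in. The function name and the returned keys are hypothetical; the tags are the standard GEDCOM ones (BIRT birth, BAPM or CHR baptism/christening, DEAT death, BURI burial).

```python
def matching_dates(events):
    """Pick dates for matching from a dict of GEDCOM event tag -> date string.

    Example input: {"BAPM": "3 MAR 1801", "DEAT": "1 JAN 1870"}
    """

    def first(*tags):
        # Return the first non-empty date and the tag it came from.
        for tag in tags:
            date = events.get(tag, "").strip()
            if date:
                return date, tag
        return None, None

    birth, birth_src = first("BIRT", "BAPM", "CHR")
    death, death_src = first("DEAT", "BURI")
    return {
        "birth": birth,
        # True when a baptism or christening date stands in for the birth.
        "birth_is_estimate": birth_src not in (None, "BIRT"),
        "death": death,
        # True when a burial date stands in for the death.
        "death_is_estimate": death_src not in (None, "DEAT"),
    }
```

Keeping the estimate flags separate means the stand-in date is used for matching only and never has to be written into the profile as if it were a birth or death date.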
