Should we stop allowing GEDCOM bio updates to PPP profiles?

+54 votes
1.7k views
When people upload data from their GEDCOMs to existing profiles, they often add some (or a LOT) of duplicated "sources", or "sources" from Family Data Collection, Millennium File, Edmund West, etc.

Some of the profiles they are adding to are VIPs (i.e., Mayflower passengers and the like). The additions these GEDCOM uploads have made are really unnecessary, and are cluttering up the profiles.

Should we stop allowing GEDCOM updates to PPP profiles, in order to further protect these ancestors? I'm not suggesting locking down the profile - only cease allowing the GEDCOM update process to add to it.

One example is shown in the URL, but this applies more broadly than just to him.
WikiTree profile: Richard Warren
in Policy and Style by S Willson G2G6 Pilot (222k points)
Tanya, the new gedcom process is still letting bad data through.
I love being able to enter new people via GEDCOMpare. It saves me typing and reduces the chance of my making a typo in the date or LNAB, while still giving me the chance to clean up the sources and bio.

However, I don't find it helpful to use GEDCOMpare to edit data in existing profiles. For the amount of effort the cleanup takes me, I'm better off manually adding my sources. And so far, no one who's edited the bio via GEDCOM on a profile I manage has made a useful addition; it's all been Ancestry trees or badly formatted duplicates of sources already on the profile.

Hi Sharon,

I'm new to WikiTree and I agree with you.  Adding profiles by GEDCOM lessens the risk of typos but I also strip all the source material and manually add it.  It's actually quicker than trying to edit it.  It's also nice to be able to see a family group at a time.  

I guess the issue (might be) that it took around a week to work out my new 'system' so I'm now laboriously re-checking every profile from that week and adjusting the format.  Thankfully I didn't upload thousands of profiles and I'm down to my last 100 to check. 

I've also learned to focus on one small 'twig' at a time which has helped.

I would hate to see the loss of the GEDCOM uploading system, as I have around 15,000 people to add, and I'm finding a lot of my research has not been added to WikiTree because it is South Australia specific, so no shortcuts for me! Furthermore, I think a 'system' is only as good as its operator. I'm seeing good and bad profiles, both manually added and GEDCOM added, and I think there "might" (caveat!) be an argument to be made that a bad GEDCOM upload "might" be an operator issue +/- bad data/poor research.

I'm in favor of not allowing GEDCOMpare to edit PPP profiles.
Limit GEDCOMs of any type to after 1800 or so; the earlier you go in time, the more likely you are to find profiles already in existence in our tree. Having to clean up GEDCOMpare imports is also an absolute pain when you have already done the work of sourcing and improving a profile to the best available data on them. See my comments below for more.
I don't know whether WikiTree has better coverage of the earlier periods, but my impression has been that coverage in any period depends largely on the specific interests of community members.  Some families or some locations will be heavily represented in the database, while others will be virtually ignored.

A bigger problem is that, the earlier the time-frame, the more likely you are to run into such inconveniences as missing (or never-created) records or the need for specialized research tools or expertise.  Thus, the earlier the time-frame, the more likelihood there is of running afoul of inaccurate, credulous or outright fraudulent published sources and the harder it becomes to prove or disprove their claims.  Then, as you've experienced, once you've managed to disprove some commonly accepted, published "fact," you have to deal with the would-be "corrections."
I agree wholeheartedly as most folks don't have to add more than 3-4 generations now to connect to the tree in some way.

Merging makes a mess and I've seen a number of profiles where the garbage is kept and the good stuff ends up tossed by folks who are confused by the process.  Manual entry would solve that.
I would like to have been given a choice about someone adding a GEDCOM to my tree. I just wanted to cry when I saw that pop up on a person. Most of that stuff is wrong, and I don't like LDS affiliation junk like baptism after death.

GEDCOM is not the genealogy fairy. Genealogy involves work and anyone who believes otherwise is mistaken.

How about adding a bullet on the Wiki profile manager page to permit GEDCOM imports or refuse?
I have found more inaccurate information on Gedcom related profiles as compared to manually added data.
For most of us, PPP = Project Protected Profiles. As an old proposal manager knows, always establish your acronyms the first time you use them in any document, then use away.

And on the topic at hand, limiting access to the Project Protected Profiles is reasonable (especially for ones like the Pilgrims), since those should be well established. Others not so much, and with the current process the system-added sources look better than the handmade ones I create.

16 Answers

+11 votes
Plan B in GEDcompare is - don't mark the match, then use Add to create a duplicate, then merge.  With the new merging form, it comes to very much the same thing - EXCEPT the merging will probably be done by somebody else.

For the gedcommer this has big advantages - it's a lot quicker, other people do a lot of the work, and he gets to be a PM.  The downside is, some stuff might get dropped that he would have kept, but it'll be in the history.

A 3rd option would be possible, where you mark the match, but the system then creates a duplicate and immediately proposes the merge.  Perhaps that would be the way to go with PPPs.  The gedcommer wouldn't become a PM in that case (though he'd join the TL).  But he'd get his stuff imported, and the PM would then get the merge proposal and decide what to keep, if anything.
by Living Horace G2G6 Pilot (631k points)
edited by Living Horace
Maybe option 4 would be for the system to recognize a match for a PPP profile, and when it does, it marks the GEDCOM import person as one that has to be done manually - and doesn't actually create a new profile at all. Then the GEDCOM uploader would have to actually invest some time in determining what to upload, if anything.
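For anyone who finds it easier to compare options 3 and 4 side by side, here is a minimal sketch of the routing decision they describe. It is only an illustration under my own assumptions, not how GEDCOMpare actually works: the `Profile` class, the `is_ppp` flag, the `Action` names and the example IDs are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """Possible outcomes when a GEDCOM person is marked as a match (hypothetical names)."""
    UPDATE_EXISTING = "apply GEDCOM data to the existing profile"
    PROPOSE_MERGE = "create a duplicate and propose a merge to the PM"  # option 3
    MANUAL_ONLY = "create nothing; uploader must edit by hand"          # option 4


@dataclass
class Profile:
    wikitree_id: str  # placeholder ID, not a real profile
    is_ppp: bool      # assumed flag: is the profile project protected?


def route_gedcom_match(match: Profile, ppp_policy: Action = Action.MANUAL_ONLY) -> Action:
    """Decide what happens when a GEDCOM person matches an existing profile."""
    if not match.is_ppp:
        # Unprotected profiles keep the ordinary update path.
        return Action.UPDATE_EXISTING
    # Protected profiles follow whichever option the community settles on.
    return ppp_policy


if __name__ == "__main__":
    print(route_gedcom_match(Profile("Example-1", is_ppp=False)).value)
    print(route_gedcom_match(Profile("Example-2", is_ppp=True)).value)
    print(route_gedcom_match(Profile("Example-2", is_ppp=True), Action.PROPOSE_MERGE).value)
```

Either way the uploader never writes directly to the protected profile; the two options differ only in whether a merge proposal lands on the PM's desk.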
In fact, perhaps it would be logical to have a general principle that a merge into a PPP has to be done by a PM of the PPP.
Would there be some way to add to that general principle so that the PM of a PPP would have to approve any addition of a child or parent?
I like option 3.  It doesn't completely stop people (the duplicate profile becomes a set of proposed changes), but limits actual edits to PM's of PPP's.
No! It is against the WT honor code to make duplicates on purpose! Not a good plan - merged dupes become redirects, and we do not need more of those at all.
+25 votes
I totally agree that these shouldn't be allowed to happen, particularly if a profile has the PPP - Profile Protection. Many of these profiles have been merged and re-merged to get rid of duplicates, clean up the bios, add sources, remove 'dead links' to internet trees, etc.

The gedcom often introduces more dead links, info provided but unsourced which often is already in the bio and sourced, or has sources like S#2222 which means nothing.  Another clean-up is needed, only to wait for the next gedcom.

The person(s) uploading often are not familiar with project guidelines, and don't remove the material that is redundant and/or unsourced.  Or they remove redundant material, keeping what they added and removing the researched material, or they just leave it all, expecting someone else to do the cleaning.

I think the gedcom uploads in these cases should be prevented entirely.
by Chris Hoyt G2G6 Pilot (862k points)
I agree with you completely, Chris.  Gedcom uploads to PPPs should be stopped.
I also agree, Chris.  I had thought PPPs were already blocked from interference from gedcoms, but obviously not.

Perhaps requiring manual entries on all PPP and pre-1700 profiles would help.  Also, what about establishing a waiting period after signing up, or a minimum number of manually-input contributions, before allowing someone to upload a gedcom?  That might help ensure at least some familiarity with WT.
+17 votes
I've been entering my own tree one profile at a time for this very reason. I'd rather do that than have 100s of blank or screwy profiles to merge later.
by Em Laetsch G2G1 (2.0k points)
I enter by hand - not that all mine are so great - but at least I am trying to do it right - research is needed on a lot of mine yet but I am constantly working on it and find more family to add so it gets complicated
I also enter all of mine by hand - and there have been a lot of them, but this permits me to control what goes into the system. It takes more time, but is worth it. I would also support eliminating GEDCOMs entirely. They cause a lot of extra work.
Yes! I admit that many of my profiles only have minimal sources attached because I wanted to enter several generations quickly, so I could connect to later generations with existing profiles. I'll add more sources and improved biographies later. I much prefer this type of "bare bones" profile to all of the ghost profiles that were added by Gedcoms and then never looked at again by the person who added them. It's instant gratification for the uploader and requires no effort or genuine commitment on their part.
That is pretty much how I do it too, I had some GEDCOMS at one point but had not kept them because of a computer crash - and I am not really sorry because I was new to this back then and a lot of it was not sourced or even valid - I am glad I hand wrote the names dates and sources for so many and that I was super lucky with my mom's family that it ended up being from lines that were documented so well compared to lots of people - I really only have one "brick wall" left as far as getting back to point of entering North America - those that were not already here that is
+17 votes
I completely agree. I've had to clean up a set of profiles (prominent aristos) back in September/October which were polluted by a Gedcom upload - even the name fields had been changed around and were no longer what they should be. And so, Gedcompare is still doing this to protected profiles?

The rule is we should communicate with a project before making significant edits to a PP profile. Does GedCompare discuss before making edits? If it does not, then it should not tinker with protected profiles.
by Isabelle Martin G2G6 Pilot (566k points)
Well said. Gedcom uploads don't seem to require any communication before adding their content. Their uploads are definitely not in keeping with our policies for Project Protected Profiles.
I agree with S Willson, GEDCompare does not stop messing with PPPs, just saw it happen a couple of days ago.  :(
+11 votes
I am a newer member to WikiTree. I have uploaded a GEDcom from Ancestry.com. I am very grateful to be able to quickly upload my new people and match others with an already created profile. But, having learned the hard way, I have no desire to create a mess for another profile manager. I think if I add stuff that makes a mess, the Profile Manager should call me out on it and ask me to correct it. In most cases, that means editing duplicated info/sources. It is very hard to add source information from Ancestry.com. I try to find it on Family Search first, but I have had little luck with almost 650 of my own family members.

I would love to have a template for what information needs to be kept from an Ancestry resource. I found some profiles through suggested links from a Project Manager in the Pre-1700. They were good examples and that was very helpful. I now have a visual of a great profile with good sources. I would love to see a few examples of an Ancestry.com GEDcom upload as is, with suggested edits below. Or, perhaps a sample profile as it appears from Ancestry sources and a sample of what the cleaned up version would look like. Kind of like a before and after.

I do agree, we need something in place to prevent a lot of junk.  

And to all who have allowed me on your trusted list, I am slowly working to review and correct any junk I may have left behind.  Just message me if you find one.
by Janet Akins G2G6 (7.5k points)
Thank you, Janet. I wish more new(er) members were like you.
There are some good g2g threads regarding what to keep from Ancestry.com. Ales has been working on templates for "citations" from Ancestry that might actually lead somewhere. If I weren't on my tablet, I would copy and paste some links for you, but just do a search on g2g for ancestry sources or deleting sources from subscription services. Hopefully that will give you some helpful info.
Thank you, Edie.  I really appreciate the information about where I can find help in the area of editing resources from subscription groups.  This is so important to all of Wiki Tree.  New members need to be given the links and information when we join.  It would be very helpful to have more detail on the Wiki Tree Information for Policy and Style.  I know that I read the information.  

For All Wiki Treers--Keep asking specific questions on Policy and Style.  I am very impressed at how much help there is out there.
+12 votes

The text below is from the help-page.

Project-protecting a profile does three things:

  1. It tells the WikiTree community that the profile belongs in a project. Major edits shouldn't be taken lightly. They should be discussed.
  2. It protects the Last Name at Birth. The LNAB on a protected profile cannot be changed and the profile cannot be merged-away.
  3. It protects the parents. You need to be a manager of the profile, or a Project Coordinator or Leader to edit the parents of a PPP.
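To make the limited scope of those three points concrete, here is a rough sketch of them as a permission check. The field names, role strings and function are my own invention for illustration, not WikiTree's actual code; the point is simply that nothing in the list blocks a biography edit, which is why a GEDCOM update can still get through.

```python
from dataclasses import dataclass

# Roles the help page says may edit the parents of a PPP (string names are assumed).
PRIVILEGED_ROLES = {"profile_manager", "project_coordinator", "project_leader"}


@dataclass(frozen=True)
class Edit:
    field: str        # hypothetical: "lnab", "merge_away", "parents", "biography", ...
    editor_role: str  # hypothetical role label for the person making the edit


def ppp_allows(edit: Edit) -> bool:
    """Return True if project protection, as described above, would let the edit through."""
    if edit.field in {"lnab", "merge_away"}:
        # Point 2: the LNAB cannot be changed and the profile cannot be merged away.
        return False
    if edit.field == "parents":
        # Point 3: only a manager, Project Coordinator or Leader may edit parents.
        return edit.editor_role in PRIVILEGED_ROLES
    # Point 1 is a social rule ("should be discussed"), not a technical block,
    # so everything else, including the biography, is allowed.
    return True


if __name__ == "__main__":
    print(ppp_allows(Edit("biography", "member")))
    print(ppp_allows(Edit("parents", "member")))
    print(ppp_allows(Edit("parents", "profile_manager")))
```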
If a new member adds profiles from a gedcom, this is probably not something he/she knows anything about. It takes a while to get familiar with how things are (or should be) done at WikiTree.
 
My opinion is that either the PPP needs to be changed or the gedcom process needs to be changed.
by Maggie Andersson G2G6 Pilot (150k points)
Yet, we're still seeing problems with parents being added.  My guess, after following a few recent discussions of such problems, is that this is happening because someone creates a parent (perhaps through a gedcom), then attaches the PPP as a child.  

It seems to me that, in the spirit of what project protection is intended to address, it would be appropriate to bar any attachment of a protected profile to another profile except by a PM (including the project involved) or Trusted List member.
+14 votes

Chris W. just reported this to me (and asked that I post it here):

"Today or tomorrow we are adding links to encourage project contact
when editing pre-1700 profiles.... [There] will be a status message you see when editing or adding pre-1700 profiles, including from GEDCOMpare."

Hopefully, this will slow down (if not stop?) the bad data that is getting added to new or existing profiles.

EDIT: He also said things are done in steps, and this is the next step. I add this because the solution above does not specifically address the issue originally asked about-- GEDCOMpare edits to PPP profiles (unless they're pre-1700; I suppose most are).

 

by Jillaine Smith G2G6 Pilot (906k points)
Yes, I've noticed my PGM badge is still in place, although I did make notification that I didn't feel able to meet the increased requirements for participation and so would opt out.  I didn't want to foul up whatever process there was to remove badges en masse, so I've just been waiting to see what happened to it.

I was under the impression that other projects were performing similar clean-outs.
Susan, I'll contact you off-board.
hmm, I know my project leader was looking at the project badges and the activity level of many of them a while ago.  Seems like not such a good idea.  I know of one cousin who is part of the project, but also has roots in Eastern Europe and was working on those lately.  Doesn't mean she's not interested in contributing to the project.
Periodically reviewing  and revisiting project involvement is a good thing.

I've found people who requested a project badge and never made another edit on wikitree after being awarded the badge or who have been inactive for years.

I realize that people's interest may vary. That's fine. But projects need active volunteers willing to improve project-related profiles.
I figure WikiTree has over half a million pre-1700s, maybe closer to a million.

Only a small minority of those are covered by projects like PGM and NNS.

Most projects don't work the same way.

PPP is the sharpest level of protection.  There's a wider set of profiles managed by a project.  And an even wider set of profiles watched by a project.

In principle that extends to 5,000 watched profiles per project.  In practice maybe up to 15,000.  About 1%, if all the profiles watched are pre-1700.

So most pre-1700s are unwatched.
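A quick back-of-the-envelope check of those figures, for anyone who wants to play with the numbers. It only reuses the rough estimates quoted above (5,000 to 15,000 watched profiles per project, half a million to a million pre-1700 profiles) and assumes every watched profile is pre-1700; the function name is mine.

```python
def watched_share(watched: int, total_pre1700: int) -> float:
    """Percentage of pre-1700 profiles covered by one project's watchlist."""
    return 100.0 * watched / total_pre1700


if __name__ == "__main__":
    # All four combinations of the rough estimates quoted in this comment.
    for watched in (5_000, 15_000):
        for total in (500_000, 1_000_000):
            share = watched_share(watched, total)
            print(f"{watched:>6} watched of {total:>9} pre-1700 profiles: {share:.1f}%")
```

Depending on which estimates you pair up, a single project covers somewhere between 0.5% and 3% of the pre-1700 profiles, so the conclusion holds either way.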

Bigger lists can be built using categories and stickers.  About 25,000 profiles have the England Sticker, but they aren't all pre-1700.

Question is, what does coordination mean at all these different levels of project involvement?  How many profiles and contributors can a project coordinate?

More projects don't necessarily help, as they tend to just spread the same people more thinly.
I've never thought of myself as having OCD in trying to follow rules and answer questions honestly (but maybe I do: I'm certainly obsessive in trying to find every bit of info about the profiles I work on).
If being a member of the relevant project is sufficient to edit or create pre-1700 profiles, then maybe it would be better if that was the question asked. Maybe some official guidance on what it is supposed to mean could be given (fairly urgently, before the cleanathon).
If membership of a project is also conditional on working within the guidelines re sourced evidence and naming conventions, then this could add a level of quality control. That seems sensible, but is it feasible for the more widely ranging projects?
I noticed the new message on a few profiles I was working on the last few days and, in spite of being mildly OCD,  I chose to ignore it because:

1. I was not making any substantive changes to the profile,

2. I was simply moving the references tag under Sources, adding an Acknowledgement heading and an unsourced template, and

3. They were orphans from a gedcom upload from 2011.

I was working them in order to get the attention of people working unsourced profiles. They appear to be sourced because of the ubiquitous reference to So and so entered this on such and such a date.  I can't adopt that many profiles and most aren't even remotely related to me . I was hoping I wouldn't get my fanny kicked for ignoring it on the few profiles it showed up on, but these are profiles no one has touched since 2011-2012 and are orphans. I decided I would take the risk in order to try and save them from invisibility.
Thank you, Edie, good job!

You know, that sounds like something the EditBot could do, shouldn't be too hard to detect.
That would be great. I'm not sure how many profiles there are, but it's a lot. It's the upload of Christensen-519. She's no longer an active member. I still have a lot to go.
I agree that a person can be in and out of a project - does not mean they quit - like myself who went from the NNS to trying to fix some of my ancestors that are in or right down from PGM folks - and then off to Canada for a bit, but I am not DONE with my NNS people - just elsewhere for a week or two
+5 votes
Every time I think about getting rid of the Gedcom process altogether (which is every time this conversation comes up), I then think about the gap between the number of profiles here and the number of profiles on other sites.  Geni.com claims to have 180 million profiles.  WikiTree currently has 17 million.  I'm not sure how many unique profiles there are on Ancestry and Family Search, but I'm sure both have many millions more unique profiles that aren't on any of the others.  My ambitious side wonders if there's a way WikiTree could get all that genealogy data and boil it down to something usable here, maybe even as a working family tree, waiting to be promoted to the status of a true WikiTree profile.  Gedcom uploads could also be imported using that same status: a working family tree which would only become full profiles after the user has achieved some facility with the site, identifying if profiles already exist for their working file, and determining if their research adds anything to existing profiles.  A pipe dream, I know.
by Kyle Dane G2G6 Pilot (112k points)
Once more I've had someone enter data by GEDCOM upload into an existing profile I manage, added text on DOB etc, his ''source'' was Find-a-grave.  Excuse me????  We are talking another profile in the early to mid 1700s of an ordinary person, NO grave is findable for such, Find-a-Grave is being used as a tree builder like Ancestry etc.  I object most strenuously to it being considered a valid ''source'' in such instances.
Yeah, at that point the grave has to be backed up with attached sources, which can usually be obtained from an old cemetery record collection XD
A little experience is all that is needed for online genealogists to figure out what information on Find-a-grave is source material (headstone photos and inscriptions, usually), and which things are just copied-and-pasted family trees with no sources and/or attribution.  I would say, "object, but be nice about it in hopes that the user will learn the difference."
Our system, here on WikiTree, seems to encourage considering FAG as a source, since it flags discrepancies between profile data and FAG memorial data on the suggestion reports we all get for our watchlists.  The system can't tell whether you've already explained why the data in your profile is preferable to the FAG data - it would be nice, if there was a toggle button on the edit page to trigger "disregard FAG data" - so you have to go in and designate it a false error.
Might Ales be amenable to altering the programming to limit FAG comparisons by date, so that anything before the date when gravestones became the done thing here in North America is ignored?  Because most graves older than 200 years here do not have a gravestone at all.
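Just to show what such a date limit might look like, here is a small sketch. It is purely hypothetical: I have no idea how the suggestion reports are actually programmed, the function and the `cutoffs` table are made up, and the cutoff years would have to be decided by people who know each area.

```python
from typing import Dict, Optional


def should_compare_with_fag(
    death_year: Optional[int],
    region: str,
    cutoffs: Dict[str, int],
) -> bool:
    """Return True if a FAG discrepancy check should run for this profile.

    cutoffs maps a region name to the earliest death year for which
    gravestone data is considered worth comparing (values are placeholders).
    """
    cutoff = cutoffs.get(region)
    if death_year is None or cutoff is None:
        # Assumption: with no death year or no cutoff on file, keep the current behaviour.
        return True
    return death_year >= cutoff


if __name__ == "__main__":
    example_cutoffs = {"region_a": 1820}  # placeholder value, not a researched date
    print(should_compare_with_fag(1750, "region_a", example_cutoffs))
    print(should_compare_with_fag(1850, "region_a", example_cutoffs))
    print(should_compare_with_fag(1750, "region_b", example_cutoffs))
```

Profiles with a death year before the cutoff for their region would simply never be flagged against FAG.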
I think a limitation such as that would be too complex, needing to be done area by area, even just in North America.  I know I've seen photos of contemporary gravestones from the early 1700s in New England (not often in great shape, but there!)  Perhaps some of the difference depends on how settled the area was.  I have difficulty tracking down death information, much less a gravestone, for my more adventurous ancestors, even in the 19th century.
Around here only the rich or VIPs got gravestones.  And a lot of cemeteries ''recycle'' their plots.  People lease land with them, not buy.  Long lease, granted, but there you go.

If it's too complex, then I would suggest removal of the FAG comparison that was installed not that long ago.  Too much dubious data to have it be considered a valid data-checkpoint.
If you're 100% positive your info is more accurate than the entry on FAG (which in general should be the situation you run into, since this platform is much less casual), then on the data_doctor error comparison suggestion with FAG, click the "never show this error again" bubble. As for the fair-condition 17th century colonial gravestones, you can see whether someone has made a "rubbed copy" off the authentic gravestone, which then shows it perfectly clearly. I also know most 17th century colonial gravestones have this angel image on them.

As shown here: https://www.wikitree.com/photo/jpg/Young-8061
Yes, but it's an annoyance to have to go through the procedure to get it off your Watchlist's error report.  For one thing, I always wonder whether I might have missed something, so I go back and check the FAG memorial, just to be certain, even if I've left a note about a discrepancy on the profile. It's time-consuming.  

And I've found that, even if I deliberately leave a link to a FAG memorial off the profile, because it contradicts the documentation I've found - I have an ancestor with a nice, non-contemporary gravestone which says he died in 1898; I found him living with a daughter in the 1900 US Census - that comparison program will flag the "error" for me to deal with.

As I said, I would like to have a nice little button next to the date/place of death fields which tells the program in advance to disregard the FAG data for that profile.

Edit: On second thought, Danielle, you have the right idea: No button; just get rid of the comparison program - and for exactly the reason you cite.  It's nice when you can see a photo of a stone, and sometimes that's the only evidence of the date of death (or birth) you have, but it should be a last resort, not a stand-by.
data doctors already have enough work on their plates without having to deal with things like FAG discrepancies.  I've looked at some FAG links, the photo is that of the cemetery gate in modern times.  And since there have been whole cemeteries moved from their original location, lol, good luck finding such a grave.

I've also had one person come along and ''correct'' capitalization on a place name, I went huh?  Not hardly, Ste-Anne-de-Beaupré does NOT take a capital on ''de''.  Well-meaning I'm sure, but not well informed.
+5 votes
I've experienced not only duplicates, but it STRIPS a lot of things too, like media files.   I know it was the only way to get my FTM files moved to other programs, but it removed a lot of stuff and now I'm dealing with 2, 3 and more duplicates, which almost doubled my tree... what a mess!   I'm going through the entire thing one by one.   In the last 6 months I've deleted over 10,000 duplicated files, with many more to go.   :(
by Rebecca Snider G2G6 Mach 1 (15.9k points)
Yes, WikiTree GEDCOM imports don't import media files, whether the media were in personal software or on a website like Ancestry.

I think that the handling of media files in GEDCOM exports and imports is at the root of some of the biggest problems we have with profiles created from Ancestry.com GEDCOMs. Ancestry users are encouraged to add "facts" to their family trees and to source those facts by creating links to media files that exist only within Ancestry. Those media files don't get exported when a GEDCOM is created from Ancestry, nor do they get imported when that GEDCOM is uploaded to WikiTree. As a result, members who felt that they had attractive and well-sourced profiles in Ancestry may discover that all they have imported to WikiTree is a handful of factoids and a collection of ugly-looking Ancestry URLs that in many cases aren't even accessible to Ancestry members.
+8 votes

It would be so much easier to support this if we didn't have so many utterly screwed up project protected profiles in the first place, in other words, if we would have some quality control in place before a profile gets project protected. And then, yes, by all means, content of the biography should be also protected.

by Helmut Jungschaffer G2G6 Pilot (602k points)
Are uncontrolled gedcom imports one of the factors causing the screwed up project-protected profiles?  If so, then controlling them makes sense:  At the least, stop (or reduce) the on-going damage while the project works on cleaning up all the profiles it protects.

This, of course, assumes that the project members will have more time to devote to cleaning up all the PPP biographies (and parent or child attachments), if they aren't side-tracked by the repeated need to salvage some of them.
My only concern is that project protection sends the wrong message to people who are not familiar with the very limited scope of protection. Consequently, I would feel much better if project protection were not slapped on profiles before they are checked for some minimum standards of quality - such as whether there is any evidence they actually existed. So for me, the discussion about having to protect the few good profiles in the sea of dubious ones only makes sense when we address the quality issues at the same time.
I see your point. I had thought one of the criteria for earning project protection was a developed, documented biography, but I did recently run across a protected profile which was little more than a gedcom dump documented by a family tree (with a broken link, naturally).  Perhaps a first step would be for projects which are protecting profiles to re-examine all the profiles on their lists, then eliminate protection from those which don't meet the standards - or, of course, bring them up to snuff.

But, in the meanwhile, stiffening the protection makes sense to me: finger in the dike, so to speak.
some PPPs are applied to get a pending merge frozen in direction, with name variations being the problem they are trying to solve.  Only they get forgotten there after the merge is done.
Could a list of such non-project PPPs be generated?  Perhaps that would be a start on winnowing the number of profiles being protected - so the remainder could be more easily worked on and their additional protection justified.
don't know, talk to arborists who are the ones who do this from what I understand.
Thanks for the suggestion, Danielle.  I have a query in to an Arborist, just to see whether such a list would even be possible.
+8 votes
I was not experienced enough to understand what was happening in Wikitree when I uploaded a gedcom.  I very much wish I had never done so because it introduced a lot of sloppy profiles.  I've been slowly working on them, but I sure wish I hadn't uploaded.  I think I would be comfortable with entering manually.  I find most sources can be added to a profile with copy / paste functions, either from my genealogy software or by reconfirming the source on Ancestry or FamilySearch, etc.
by Janice Tanche G2G6 (6.6k points)
Janice, how would it have affected you, if there had been a waiting period--one explicitly designed to allow you to familiarize yourself with WikiTree while you worked manually--before you could upload your gedcom?  Would you have simply turned away, or would you have worked through it?
Fair question.  I really don't know how I would have responded.  I might have not persisted in learning about Wikitree.  I know I would have felt I needed some familiar data to work with in order to understand what was going on.  But uploading what I did resulted in a disappointing outcome once I started looking at Wikitree more closely (and I'm still fairly inexperienced).  I think maybe 20 or so (or less) would have been plenty with which to start to explore, and then just release them into the big picture one at a time.  I think if I ever contributed something via gedcom again, I'd be very conservative, choosing only a particular, limited area that I was prepared to work on.
You can no longer mass-import profiles into WikiTree. You can use your GEDCOM to pre-fill person data, but you still have to add each person individually (using the same page that you use to create a profile manually). You can see what the profile will look like before you create it, so any sloppiness can be fixed before profile creation.
Jamie,

I wish that it was working this way. So many of the GEDCOMpare-generated edits we see through the PGM activity feed clearly demonstrate lack of previewing, lack of editing narrative. Unfortunately, this new process is costing a LOT of time for project volunteers. We are STILL coming along behind with our digital brooms. It's very disheartening.
That's not the fault of GEDCOMpare though. It's the fault of the person using it.

I am in favor of not allowing GEDCOM updates to PPP, because having PPP status means there are issues with the profile and changes to the profile need to be made carefully.

But I don't see a reason to stop GEDCOMpare from working for pre-1700 profiles. The person adding the information has the appropriate badge, and there is a message (on both the top and bottom of the page) warning people to collaborate. If a person ignores this and still adds unsourced junk to profiles without collaboration, the solution isn't to take GEDCOMpare away from everybody. The solution is to take the appropriate action against the person not following the honor code and the pre-1700 guidelines.
And who is going to take that appropriate action? I technically have the power to remove pre-1700 badges. Should I do that each time yet another newbie makes a poor edit to a pre-1700 profile? I really have no interest in playing bad cop here.
Whoever gets annoyed enough can take the action? And it could be as simple as sending a private message to help guide the newbie in the right direction, it doesn't necessarily have to be taking the badge away.

People adding junk from GEDCOMs is no different than people adding junk manually. It's just as easy to undo GEDCOM added stuff as it is to undo people pasting the entire text from FaG or Ancestry search results. So I'm not sure why there is so much hate directed at GEDCOMpare.
Perhaps because the gedcoms come in bulk?  If the time-per-gedcom-generated-profile were the same as the time-per-manually-entered-profile, I don't think they would generate sufficient disturbance to create this reaction.
Apparently, I missed something when I was getting started with Wikitree. I've more recently learned more about it.  I consider myself to have some skill as a genealogist to the point that I would never have contributed stuff like what turned up attributed to me after I uploaded my gedcom.  I don't know how long GEDCOMpare has been available.  I think I got a chance to use it and work through it.  However, I suspect all I did was try to get rid of duplicates I was introducing. What I did not want is all the crappy formatting in the profiles, which I've been gradually trying to clean up one at a time.  But I've never found a list to work from (of what I imported), and they mostly don't show as unsourced because they have some source material, even though they need to be cleaned up.  So, what I'm saying is I ended up with an unexpected and unintended result using gedcom import.
The new GEDCOM import launched last September.

Now there is no bulk importing. People have to make a choice for every single person in their GEDCOM if they want the data for that person in WikiTree or not. If the person already exists then they get a merge-like screen, where they choose which data to use, and they are able to edit the biography before saving. If the person doesn't already exist they get sent to the add person page with the data pre-filled. They can make changes to the data before saving the profile. No data is added to WikiTree from a GEDCOM upload without action from the uploader.
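For anyone who finds it easier to read as logic, here is a rough sketch of the per-person flow described above. It is only an illustration of the description in this post, not WikiTree's actual code, and every name in it (`GedcomPerson`, `next_step`, the returned labels) is made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GedcomPerson:
    name: str
    # Hypothetical field: ID of an existing profile this person was matched to, if any.
    matched_profile_id: Optional[str] = None


def next_step(person: GedcomPerson, uploader_wants_import: bool) -> str:
    """Return the screen the uploader would reach for one GEDCOM person."""
    if not uploader_wants_import:
        # Nothing reaches the tree without an explicit choice by the uploader.
        return "skip"
    if person.matched_profile_id is not None:
        # Existing profile: merge-like screen, bio editable before saving.
        return "merge_like_edit_screen"
    # No existing profile: the normal add-person page, pre-filled from the GEDCOM.
    return "add_person_page_prefilled"
```

The point of the sketch is that every branch starts from a decision by the uploader; there is no path that adds data automatically.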
Ah, that would be what I needed, and I think that would have worked for me.  I would still suggest starting off with a smaller number just to be able to see how it works and how much work I'll need to do to fix it up.  Even so, I think some folks might not understand that they are actually adding mistakes because of weak sources (e.g., ancestry trees as sources). Not uploading gedcoms or limiting the size, I suppose, minimizes damage by people who don't understand, but it would probably discourage others from adding quality work that others would love to see.
I have heard the same story over and over and over again from new users and from veteran users talking about their initial gedcom experience with WikiTree, Janice.
Jamie, I believe there may still be people uploading gedcoms with their fingers figuratively on the "go" button.  Just this last week I saw a situation - happily caught by Lisa Murphy (wonderful Arborist that she is) before it came to my attention and before it caused any damage - where someone doing apparently mindless entries tried to merge the profiles for two dead children of the same name (for whom I stand as PM) with the profile of an adult of that name, from a different town and with different parents, who happened to have died within a few years of them.
Jamie, every day (every day!) we see PPP'ed PGM profiles polluted by GEDCOMpare "edits". While yes, people still have to approve of their gedcompare edits one by one, it's still faster than manual editing without gedcompare. And the few active and quality volunteers we do have end up spending their time cleaning up these messes because most of the people using gedcompare with these profiles don't know what they're doing. I'd rather that PGM volunteers were working their way through profiles we haven't yet improved, not having to spend time re-improving "finished" profiles and chasing after newbies. Like I said, disheartening.
I'm thankful for this discussion.  I feel less bad about how I started out with my file.  Hopefully one day I get it all cleaned up.
PGM and a few other sections of the tree have issues with duplication and bad research. That's why PP exists and hopefully, after the discussion period, this policy will be approved.

But people are assuming that these issues are tree-wide, and want to ban GEDCOMs completely because of it. There are almost 800,000 pre-1700 profiles and many more that aren't in WikiTree yet. Most profiles don't have the issues that PGM profiles have.
Is it only "PGM and a few other sections of the tree" which have these issues - because of the particular challenges the research involves, I assume - or are the issues made more obvious in certain places because of project oversight?  As RJ Horace has pointed out here, only a very small portion of the profiles, even of pre-1700 profiles, enjoys even theoretical project oversight.
I'm a manager of over 6000 profiles, and manage a couple hundred pre-1700 profiles, and haven't had a single GEDCOM change to any of them. Then again, the profiles I manage aren't in parts of the tree that have a lot of coverage. Most every pre-1700 person I've wanted to work on has not been entered into WikiTree yet. A lot of people interested in genealogy are interested in PGM or royal profiles, so there will be a lot of activity on those. So let's lock them down because they've already been researched to death. But there are TONS of profiles that haven't been researched yet, and whose data isn't published anywhere online. Let's not make it harder for people to add those.
There are a lot of people in the USA and Canada descended from French colonists to New France, not all of whom are even entered yet, and not all of whom are fully researched.  But some of those that are fully researched have so many descendants that it's a regular repeat operation to merge new profiles into them and sort out the junk from the valid data brought in.  For example, Hélène Desportes, reputedly the first woman of French origin born in the colony, must have millions of descendants by now, spread all over the world.  A way to limit the extra work is most definitely needed.
So true. I wish there was a Metis project to help with and protect those, but right now there is not really one. Because the First Nation way of record keeping is word of mouth passed down, this causes mass confusion and lots of wrong trees out there: crossed lines, extra generations, and so forth. And like a lot of other folks (the Scots and French), many named their kids after relatives, so you have to be alert every second or you jump to the wrong generation within a family. It's easy for mistakes to happen entering by hand too.
There is a project for native American people, might fit under there.

And there's a First Peoples Canada project separate from the Native Americans project.

+4 votes
It can't be just bios.  It wouldn't make sense to allow people to change data fields, and then not let them edit bios to match or cite sources etc.  You'd have to block changes to data fields.

And connections as well.  Actually, if a PPP already has a full set of connections, a gedcommer doesn't need to add any more.  His new connection will be at least one step away.
by Living Horace G2G6 Pilot (631k points)
Your option 3 is still the best idea - when someone tries to GEDCOM into a PP profile, put up a notice it's not allowed and that they should instead create a duplicate profile and propose a merge of the duplicate into the desired profile.  Then it's totally under the control of the project people, and they can treat the changes in the duplicate as 'proposed changes', keep what's good (if anything) and ignore the rest.
However, there's the danger that mega-gedcommers with thousands of aristos in their tree would just create thousands of duplicates.
RJ, your suggestion makes a ton of sense. Do not allow the creation of duplicates to a PPP or make changes to it. They can only create a profile one person away, but then we would still have someone trying to add new spouses or children to the PPP.  (I am assuming that the parent would also be project-protected.)  Then how would that work? The addition could only be approved by the PMs or project leaders?
+5 votes
It does seem like the problem is that you haven't imported your gedcom until you've integrated it all into the main tree.   This means most of the integration is going to be done by newbies in a hurry.  Those Add and Edit buttons look like a to-do list, and people think they're supposed to do them all, and before they do anything else.

Perhaps the intermediate stage can be beefed up.  Say that the gedcom is "installed" as soon as the comparisons have been done.  At that point, every person in the gedcom that's been matched has a kind of Unmerged Match with a WikiTree profile.  Indicate this on the profile, and regard that as the end of the Import procedure.  Remove the Add and Edit buttons from the Gedcompare form.

Then provide those facilities through the normal tree-building process.  Eg, if a profile has a gedcom match, and you do Add Father, you get an option to Add Father From Gedcom.

Then people would usually start from themselves and work backwards, and by the time they get to the immigrants and aristos they might have decided they don't need to edit every profile they have a gedcom match for.   Or it might take a long time to get round to them all.  The job would have its normal place in the priorities and wouldn't seem like a high-priority task as it does at present.

And if people do decide they need to incorporate their gedcom data into an existing profile, this could be done by changing the "Unmerged Match in Gedcom" state to a "Proposed Merge From Gedcom".  This would make the relevant data in the gedcom visible to other users to approve and complete the "merge" in the usual way (with the usual alternative of joining the TL).
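To make the proposed states a bit more concrete, here is a minimal sketch of how they might be modelled. The first two state names come from the proposal above; the `MERGED` and `REJECTED` end states, the `ALLOWED` table, and the `advance` function are assumptions added only for illustration.

```python
from enum import Enum


class MatchState(Enum):
    UNMERGED_MATCH = "Unmerged Match in Gedcom"
    PROPOSED_MERGE = "Proposed Merge From Gedcom"
    MERGED = "Merged"        # assumed end state, not named in the proposal
    REJECTED = "Rejected"    # assumed end state, not named in the proposal


# Which moves are permitted from each state (illustrative only).
ALLOWED = {
    MatchState.UNMERGED_MATCH: {MatchState.PROPOSED_MERGE},
    MatchState.PROPOSED_MERGE: {MatchState.MERGED, MatchState.REJECTED},
    MatchState.MERGED: set(),
    MatchState.REJECTED: set(),
}


def advance(current: MatchState, target: MatchState) -> MatchState:
    """Move a gedcom match to a new state, refusing moves that skip review."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.name} to {target.name}")
    return target
```

In this sketch a match can never jump straight from "Unmerged Match" to "Merged"; it has to pass through the proposed-merge stage where other users can see it.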
by Living Horace G2G6 Pilot (631k points)
I really dislike the idea of intentionally making duplicates that then have to be merged.

We need to diminish the amount of additional work this new process is requiring of projects, not increase it.

I say simply disallow gedcom "editing" of PPP Profiles.  Have the system force a skip.
But if my gedcom has loads of aristos and the system starts skipping them all, I'll probably just stop finding matches.  That's how we got so many duplicates in the first place.
When I say "skip," I mean disallow creation of a profile; i.e., skip the GEDCOM entry.
+8 votes

I poked around in GEDCOMpare to remind myself what a user sees when they try to edit a pre-1700 project-protected profile. It seems that there are currently several alerts and warnings, but maybe more are needed. Here's what I found and what I recommend to possibly improve the current arrangement:

1. Ample alerts exist on edit screen for a PPP profile. When the EDIT window is opened for a PPP profile, there's a red warning banner that says "Project-Protected Profile: Do not save any changes without communicating with project members" and a notice like this (the actual notice I saw), accompanied by the PPP image:

Dunham-151 is collaboratively edited by the members of a project. Changes must be discussed.

  1. Send a private message to the Profile Manager, Puritan Great Migration Project WikiTree.
  2. Post a public comment on the profile.

2. Recommendation: Earlier notice of PPP status could help. Currently, GEDCOMpare doesn't indicate that a profile is PPP until after a person has opened the "EDIT" window, by which time I think it might be too late to dissuade them from completing their planned edit. The COMPARE page (which a person needs to view in order to access the EDIT button) does show the project name as profile manager. It seems to me that adding an indication of "Project Protected" on the COMPARE screen might reduce the incidence of uninformed edits to these profiles.

3. Editing of non-PPP Pre-1700 profiles is just like "usual." For Pre-1700 profiles that aren't project protected, the EDIT window has the new banner that says "If you are not already coordinating with a pre-1700 project, click here before proceeding (required). Thank you!" This is the same as what currently displays for a non-GEDCOMpare edit to a pre-1700 profile.

4. Recommendation: Remind people that they may not need to edit the profile. For all types of profiles, the edit screen advises:

Below is the current data for Dunham-151 (left column) and the new values suggested by the GEDCOM (center column). Select what you want to keep and, if you like, manually edit it (right column). Before clicking the "Save Changes" button, be sure you have evaluated the current profile in its entirety, including any comments or Research Notes.

The final statement in the alert might be too late. People should review the current profile before they even think about editing, not as their last act before pushing the "Save Changes" button. I think it would be helpful to precede that paragraph with the advice:

Before editing this profile, review the current profile in its entirety. The information content you propose to add might already be there, in which case you should cancel your edit.

by Ellen Smith G2G Astronaut (1.5m points)
Actually, the new method for merging profiles, whether or not they are GEDCOM imports of some sort, needs to be amended.  A person can select which data to keep in one click, and if they don't know what they are doing, then poof, all the work that may have been done on the other is just gone.  Not sure how easy it is to recover that; the changes tab only shows that a merge took place.  I personally do not like it.
Thanks for the recommendations, Ellen. I agree with both of them.

@Danielle: Careless people are going to make poor merging decisions no matter what we do. But if you have suggestions for improving the merging process I'll make sure they get on the list of suggestions :)

I just did a merge to see how it might be better.  Right now, one can select the right or left items for each individual fact in the upper section (name, dob etc).  When it comes down to the bio section, one can pick left or right or do manual edit.  But both original versions are just stuck in sequence there, the only thing removed is the extra ==sources==<references /> tags.  

I would remove the option to pick left or right on bio section, since that can very much lose valid data.  Put a notice to review the data before merging by all means.

On the one I just did, the person had a different source than I did, but both are quite valid.  I incorporated it with the existing text.  I know on prior merge of a PPP profile, Guy had asked me which version to keep (long tangled story that), and I had said keep it all and I'll sort it out after merge is done.  

Actually, in both merges and gedcom imports the profile data fields also give me the option of a free-form entry -- so I can add Nicknames, correct place name spellings, etc. I like that -- I don't have to choose between one profile or the other.

And it's very important to be able to choose the text section from one version or the other, rather than being required to merge them and edit the merged version. Sometimes the two profiles are duplicates (because someone copied the text from one profile to the other), so it's a convenience to be able to choose to keep one version and discard the other. And sometimes there's little or nothing of value to be kept from one profile.
+4 votes
Yeah, if only a patch were built through the WikiTree profile search database to prompt an alert, with relevant suggestions, for profiles that match the exact dates of births, marriages, deaths, and locations on existing profiles while a GEDCOM is being uploaded. That info is already in the Data Doctor system, and the data already sends alerts on place names, for instance, in suggested errors. So why can't an alert be produced on certain profiles during a GEDCOM upload, as an easy, user-friendly, and efficient way for a new user to learn to go search the database before creating a duplicate?
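Something along these lines is what I have in mind. This is just a sketch with made-up names (`Vitals`, `find_exact_matches`, the "Example-1" style IDs); it doesn't use any real WikiTree or Data Doctor function, and it leaves out marriages to keep it short.

```python
from typing import Dict, List, NamedTuple


class Vitals(NamedTuple):
    birth_date: str
    birth_place: str
    death_date: str
    death_place: str


def find_exact_matches(incoming: Vitals, existing: Dict[str, Vitals]) -> List[str]:
    """Return IDs of existing profiles whose dates and places all match exactly."""
    return [profile_id for profile_id, vitals in existing.items() if vitals == incoming]


# Hypothetical data, for illustration only.
existing_profiles = {
    "Example-1": Vitals("1600-01-01", "Town A", "1670-05-05", "Town B"),
    "Example-2": Vitals("1600-01-01", "Town A", "1671-05-05", "Town B"),
}
uploaded = Vitals("1600-01-01", "Town A", "1670-05-05", "Town B")
alerts = find_exact_matches(uploaded, existing_profiles)
```

If `alerts` comes back non-empty, the upload screen could show those profiles to the user before offering to create a new one.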

GEDCOM is worth keeping though, and it's highly valuable to be able to keep a steady bank of information coming in to work with inside the system. GEDCOM can make connections that can only be made because of the one and only shard of evidence on planet Earth, uploaded via GEDCOM. XD
by Living Smith G2G6 Mach 6 (60.7k points)
edited by Living Smith
+4 votes
Gedcoms should not be allowed to change existing profiles. Ever. Period. Regardless of whether they are PPP or not. Gedcom imported data changes and sources are not selective and are almost always spam that must be removed.
by Chase Ashley G2G6 Pilot (311k points)


...