WikiTree AGC

Privacy Level: Public (Green)
Date: 23 Jul 2020 [unknown]
Location: [unknown]
Profile manager: Rob Pavey
Last profile change on 15 Sep 2022

WikiTree AGC (Automatic GEDCOM Cleanup) is a Browser Extension developed by Rob Pavey that allows the user to reformat a profile that was created from a GEDCOM (by GEDCOMpare or the earlier GEDCOM import process).

NOTE: There are user options to configure the reformatting which are not immediately obvious, see the User Options section below.

How to install

The extension is free to install and use (except on the Apple App Store). It works in many different browsers.

  • For Chrome, Opera, Brave, Edge, Vivaldi and other Chromium-based browsers, install it from the Chrome Web Store.
  • For Firefox, install it from the Firefox Add-ons page.
  • For Safari on Mac or iOS, go to the App Store and search for "WikiTree AGC".

How to use

In WikiTree, go to a profile that was created from a GEDCOM. Click the edit tab and scroll down to the Edit Text section.

If WikiTree AGC can recognize this as a GEDCOM created profile then there will be an AGC button to the left of the line of buttons above the big text box:

Press this button and the text in the big text box will be reformatted. Scroll down and press the "Preview" button to see what it will look like if saved.

There is a video demonstration on this Feature Friday Episode.

Undo feature

If you want to try again with different options or want to make some manual changes before reformatting you can undo the changes and try again. If you look at the AGC button again you will see it now looks like this (sorry still temporary art!):

Pressing it again will undo the changes (in both the text box and any other fields that the extension has changed). You can then edit the user options (see below) and press the button again to redo the reformatting with the new options.

What problem does this extension address?

The current GEDCOMpare tool creates profiles which many users are not happy with. The primary problem is that the biography section is so unlike what a typical WikiTree user would create that some users feel that it is faster to delete the whole biography and start over. While this may be an exaggeration, if we can automatically create a biography that provides a nice starting point that the users can then gradually improve, that would make the GEDCOM pathway more attractive.

So this extension adds a button that, when pressed, reads the GEDCOMpare created biography and replaces it with a new biography in a chronological narrative format. The user can then preview it and make edits before saving the changes.

Some of the most obvious issues with the latest GEDCOMpare created biographies are:

  • There are no line breaks. So the entire biography section is one paragraph. This makes it hard to read and edit. Also known as "the blob".
  • The facts are grouped by fact type. So all the Residence facts are in one section and all the Marriage facts are in another etc. Thus it is not in chronological order.
  • Even within one fact section (such as Residence) the facts are not sorted chronologically. They are in an order that appears random (probably the order in the GEDCOM).
  • For each fact, the biography contains first the description, then the date and then the location. The description is usually some extra info or notes added by the user or automatically by Ancestry etc. It would be more readable if the date and location came first.
  • Nearly all of the citations (<ref>) are named and are referenced in multiple places. For example there is usually a "Name" fact that references all of the citations (since most citations include the person's name). This is mimicking how Ancestry works but is not how most WikiTree profiles work. (see advanced sourcing for more information).
  • The citations and sources are separated. i.e. none of the <ref>s include the source details directly. Instead they reference a separate source like this: Source: #S-214974479. There can be some benefits to this in reducing the text size of the profile if the same source is used by multiple citations and some users may like this. But it does make the profile harder to read and edit in my opinion. Also it is officially not recommended.
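To illustrate the ordering problem described above, here is a minimal Python sketch of flattening facts that are grouped by type into one chronological list. It is not the extension's actual code (which runs in the browser); the Fact class, the sort_chronologically function and the sample facts are all made up for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Fact:
    fact_type: str      # e.g. "Residence" or "Marriage"
    year: int
    month: int = 0      # 0 means the month is unknown
    day: int = 0        # 0 means the day is unknown
    place: str = ""


def sort_chronologically(facts_by_type: Dict[str, List[Fact]]) -> List[Fact]:
    """Flatten facts grouped by type into a single list ordered by date."""
    all_facts = [fact for group in facts_by_type.values() for fact in group]
    # sorted() is stable, so facts with identical dates keep their input order.
    return sorted(all_facts, key=lambda f: (f.year, f.month, f.day))


# Hypothetical sample data: grouped by type and unsorted within each group.
grouped = {
    "Residence": [
        Fact("Residence", 1911, place="Camden"),
        Fact("Residence", 1891, place="Islington"),
    ],
    "Marriage": [Fact("Marriage", 1902, 6, 14, "London")],
}

timeline = sort_chronologically(grouped)
print([(f.fact_type, f.year) for f in timeline])
```

In this sketch a year-only fact sorts before any dated fact in the same year, which is only one possible choice.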

The extension will also clean up profiles created by older WikiTree GEDCOM import tools. However, if they have been edited too much it will not be able to recognize the events.

What does the extension do?

There is a more detailed list of what it does below. But these screenshots give you a quick taste.

Here is a biography created by GEDCOMpare from an Ancestry GEDCOM:

A typical Ancestry GEDCOM created profile.

If the user goes into edit mode they will see a new button at the left-hand end of the toolbar above the biography. This button only shows up if the biography looks like it was created by GEDCOMpare:

The AGC button is on the left of the toolbar.

Pressing the button will modify the text box below by rewriting the text:

The biography that WikiTree AGC creates.

More details of what the extension does

  • It creates a biography in a narrative form. I.e. it says "Leonard was born on 12 September 1920 in Camden, London, England." rather than "Born: 12 Sep 1920, Camden, London, England.".
  • All facts in the biography are sorted chronologically.
  • It embeds the source text within the <ref>
  • It optionally switches to have only one <ref> for each citation. This is typically with the latest fact using that citation. There are actually five options for when to use named references:
    • Never: Refs are never named. There is one ref for each citation, typically on the latest fact that referenced it. All the sources will be in chronological order.
    • Minimal: Only adds more than one ref for a citation if there would otherwise be no ref for a narrative event.
    • Selective: Only adds more than one ref for a citation if adding a ref to a narrative event will likely add a ref that has a more accurate date or location for that narrative event.
    • Multiple Use: Preserves all refs from the GEDCOMpare biography but only names refs that need to be named.
    • All: Always names references and keeps all refs that GEDCOMpare created.
  • If there is an exact baptism date in the same year as a year-only birth date it switches the "Birth date" field of the profile to be before the baptism date. Same for an exact burial date with a year-only death date in the same year. If this is done then a note of this is added in the research notes. (There is an option to turn this off).
  • For female profiles, if the Current Last Name is the LNAB and there are marriage facts and the last husband's name is known, then the Current Last Name is changed to the last husband's last name (optional).
  • If there is a File fact section, then a list of External Media Links is added at the end of the biography. (This is optional).
  • If the description for a fact contains a link to FindMyPast then a new citation (ref) is created that contains this link and it is removed from the description.
  • Arrival and Departure facts can get linked. For example, if there is an arrival fact and a departure fact that use the same citation it will combine them into a single narrative event.
  • If the person died at an age of 12 or less the "Died Young" sticker is added.
  • All dates in the narrative are transformed into a standard form: dd Month yyyy (e.g. 10 September 1842). This could be user configurable in the future.
  • Duplicate marriage citations are merged.
  • Any existing Biography or Research Notes sections that existed before the GEDCOM import are preserved.
  • Meaningful titles are put on the source citations based on the fact section/type and source. (This is optional).
  • If a source is an Ancestry or MyHeritage source (i.e. a subscription source) it will clean up the source information to remove extraneous data. Also, if it is a recognized source a link to available free sources for this data will be added (this is optional).
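As a rough illustration of the narrative form and the standard date form listed above, the following Python sketch expands a GEDCOM-style date and builds a birth sentence. The function names standard_date and birth_sentence are hypothetical and this is not the extension's own code; it only handles simple dates such as "12 Sep 1920", "Sep 1920" or "1920".

```python
MONTHS = {
    "JAN": "January", "FEB": "February", "MAR": "March", "APR": "April",
    "MAY": "May", "JUN": "June", "JUL": "July", "AUG": "August",
    "SEP": "September", "OCT": "October", "NOV": "November", "DEC": "December",
}


def standard_date(raw: str) -> str:
    """Turn '12 Sep 1920' or '12 SEP 1920' into '12 September 1920'."""
    words = []
    for word in raw.split():
        if word.isdigit():
            # Normalise numbers, which drops a leading zero in the day.
            words.append(str(int(word)))
        else:
            # Expand a month abbreviation; leave any other word unchanged.
            words.append(MONTHS.get(word[:3].upper(), word))
    return " ".join(words)


def birth_sentence(name: str, raw_date: str, place: str) -> str:
    """Build a narrative birth sentence from a name, a date and a place."""
    date_text = standard_date(raw_date)
    # A full date reads "on 12 September 1920"; a partial date reads "in 1920".
    preposition = "on" if len(date_text.split()) == 3 else "in"
    return f"{name} was born {preposition} {date_text} in {place}."


print(birth_sentence("Leonard", "12 Sep 1920", "Camden, London, England"))
```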

Things it does NOT do:

  • If the GEDCOM was exported from Ancestry then the source citations still point to Ancestry records. However, if the source is a recognized source (currently limited to England sources) then it will add a link to where a free source may be found.
    I am working on a new extension to assist with finding free sources for the subscription records.

User options

Not every WikiTree user will have the same preference for how the biography is formatted. The browser extension has user options to allow for this. If you have ideas for new options please let me know. Details of how to use the options follow...

How to edit the user options

In the Chrome toolbar there is an icon that looks like a puzzle piece. Click that and then click the three dots to the right of WikiTree AGC. Then select options from the menu. See picture below:

In Firefox, click on the "burger menu" (top right corner of the browser window) and select "Add-ons and themes". Then click on the "..." next to WikiTree AGC and click Preferences.

The current options screen

That will bring up a new tab in the browser which looks like this (this image is a bit out of date as new options have been added):

Select the options that you want and press save.

Explanation of the options

These are the current options:

  • General
    • Spelling. UK English or US English. e.g. baptised vs. baptized. A future improvement could have an option to select spelling based on the fact location.
    • Whether to add the person's age to narrative events.
    • Whether to add an External Media section to biography if there are files referenced.
  • References
    • When to use named references.
    • Whether to add a newline before the first reference on a narrative event. Doing so makes it slightly easier to edit but inserts a space before the [1] etc.
    • Whether to add a newline between each reference on a narrative event. Doing so makes it slightly easier to edit but inserts a space between the [1] [2] etc.
    • Whether to add newlines within each reference on a narrative event. The newlines are after the opening <ref> and before the closing </ref>. Doing so makes it easier to edit and has no effect on the public view.
    • Whether to add meaningful names to references
  • Research notes
    • Whether to add an "Alternate names" section to the research notes if the GEDCOM has name variations.
  • Other fields
    • If there is an exact baptism date in the same year as a year-only birth date then change the "Birth date" field of the profile to be before the baptism date.
    • If there is an exact burial date in the same year as a year-only death date then change the "Death date" field of the profile to be before the burial date.
    • For female profiles, if the Current Last Name is the LNAB but there are marriages and the last husband's name is known then change the CLN to that.

Reporting problems

If you try this on a profile and there seems to be a bug please do not "Save Changes". Instead, leave the profile as it was created by GEDCOMpare and send a private message to Pavey-429. Include the profile name. I will then try it on that profile myself and debug the issue.

Please include what seems wrong e.g.:

  • The AGC button doesn't show up at all for this profile
  • The AGC button shows up but does nothing
  • The resulting reformatting has an issue
  • ...

Please include the version number of WikiTree AGC that you are using. This is visible on the chrome://extensions/ page or in the "More information" box towards the bottom of the Firefox add-on page.

Also, if you see any issues with these instructions please let me know.

Release Notes

See WikiTree AGC Release Notes page.

Wish list for future enhancements

See WikiTree AGC Wish List.

Acknowledgements

Thanks to the many WikiTree users who have beta tested the extension or provided feedback on it, including: Loralee, Hilary, Christina, Steve, Michelle, Kathleen, Jo, Leandra, Jonathan, Raewyn, Geoff, Frances





Collaboration
Comments: 46

This addon is a real gift. Thank you!

I've only had one issue so far which is it wrapping out on this profile: Catherine Tiedke. When I hit the button, it freezes the entire Chrome browser tab and sometimes the one that launched that page, like the gedmatch page. Could it be her mother's umlaut?

posted by Carrie Quackenbush
How can I identify profiles where I can use this app? I tried on several likely suspects from the suggestion list, but the AGC icon did not appear.
posted by Sharon Danbrook
It looks for certain markers that indicate that it was created from a GEDCOM. Have the profiles that you tried been edited or merged? (You can tell from the Changes tab.) The markers are different for old format imports and newer ones so it is not simple to give a full list.
posted by Rob Pavey
Hi Rob! This is an extremely low-priority and silly request, but in a future version of AGC could you make it so that there's an option to remove Category: Needs GEDCOM Cleanup automatically? I always forget to remove it and have to go back! I've been trying to clean up the profiles in that category & the other ones in the tree they are directly related to, so it would help me a lot.
posted by Liz Marshall
What a fine tool! While it would have saved me many hours of work when I first started uploading my data as GEDCOMS in 2018, I have persisted in editing all of them at least twice since then. I will try using it on some profiles I've recently adopted.

Thanks for your fine work! Jane

posted by Jane (Snell) Copes
edited by Jane (Snell) Copes
Hi! Love the tool, but wondering if you could help with a variant of the tool or suggest something you already know about.

My problem: I have richly sourced GEDCOM entries which match to sparse, already-created WikiTree profiles. If a profile doesn't already exist, I can add it, use AGC to spiff it up and voilà. But, if a profile already exists and I match it, I don't have a good way to automatically extract all of my GEDCOM goodness so that I can manually add it to the existing profile ...

What would be good would be either:

  • a way to have AGC pull directly from my GEDCOM file and give me suggested text, or
  • a way to force the WikiTree "add" functionality to output, which I could then use AGC on

Any suggestions appreciated, thanks! Jeff

posted by Jeff Jones
I think GEDCOMpare lets you do this. If you match an existing profile then press the EDIT button in GEDCOMpare to see a side-by-side view of the existing profile and your GEDCOM data. By default it includes the data from both so all your sources will be included. Then after saving the changes you can run AGC.
posted by Rob Pavey
It would be nice if you could develop a tool that also takes care of the sources. In Sweden we have ArkivDigital, a paid service where all the links to the Swedish church records are to be found. I assume that they are somewhere in the GEDCOM file. If we had a tool that could identify and publish them in a smart way, I and everyone else using ArkivDigital or Riksarkivet in Sweden could share our findings.
This seems like a good start and any innovation to improve the speed at which we can format sources is highly regarded. The only downside with this approach, in my experience, is losing the value of slowly going through sources in depth and thinking about some of the broader implications of what is contained within them. I have recently been doing this for a number of profiles and it is really valuable for exploring how to develop a detailed profile of someone and challenge the unchallenged assumptions present in lots of individual Ancestry family trees. These assumptions are so hard to shift because they have been made once and then repeated without critical thought.

For example, by exploring the detail in English census records, you can find out who people lived with, where, and in some cases why. To me, a lot of this detail is missed if the process is automated. What would be really cool is a process that makes the tedious task of transcribing sources easier that also asks pertinent questions about what may have been missed in the process. Anyway, I really appreciate this contribution. Just thought I would share my experience.

posted by Simon Ross
Are you planning a version for Safari or will either the Chrome or Firefox versions work with Safari?
posted by Gerald Newnham
I am currently (slowly) working on a Safari version of WikiTree Sourcer. If I can get that published on the Apple store I will probably try to do a Safari version of AGC also. It may be a few months away though. The Chrome and Firefox versions will unfortunately not work with Safari.
posted by Rob Pavey
Thank you. That's good news and I'm sure I won't be the only person who will be grateful for your time and effort in writing these very useful programs.
posted by Gerald Newnham
Hi Rob, I'm a little new to this app and started using it recently. It's a huge help! I have one big issue however, and that's with AGC splitting up names into first and middle names. I noticed a thread where you somewhat addressed this by only doing it on UK or USA profiles, but that doesn't always work for USA. We have historical cultures from other countries that continued the use of forename and surname for a good while after coming to the U.S. I work in particular with people of French descent, but there are others too. Also, the preferred name should not be changed. Many people, in the South especially, use their middle name as a preferred name. There is no way the app can determine what is correct for that field.

I'm asking to please please reconsider an option to turn off the change that splits multiple forenames into different fields.

Hi Joyce,

I plan to get back to working on AGC a bit soon. I will look into adding an option.

Cheers, Rob

posted by Rob Pavey
Hi Joyce,

I have released version 1.0.0 of WikiTree AGC. It now has an option for this. The text on the options screen is:

"For old GEDCOM imports move additional names from the Proper First Name field to the Middle Name field:"

I hope that works for you.

posted by Rob Pavey
Hi Rob, I uninstalled the old and tried to install the new version, but was unable to. Returns this error message:

<Package is invalid. Details: 'Could not load options page 'options.html'.'.>

Hi Joyce,

I have never seen that error before. Which browser are you using? Chrome or Firefox? Thanks, Rob

posted by Rob Pavey
I have confirmed that people are able to use the latest version on Chrome and Firefox. It sounds like an issue local to your machine. What browser are you using and how did you update the extension? Possibly something got corrupted on download?
posted by Rob Pavey
I apologize I thought I had answered this question but must have closed without posting it. I'm running Chrome. I updated Chrome to the latest version, rebooted, then reinstalled the app, but still got the same error.
It looks like there must have been an error in the package that I released. I have released a new version 1.0.1. Can you try that?

Thanks, Rob

posted by Rob Pavey
Hi Rob - nicely done. I'm new to WT and just uploaded a GEDCOM of 45 names. Took me hours to get them all cleaned up enough to add nicely. AGC would have really helped ... I think. Anyway I went back and ran AGC on all of them just to make the formatting so much nicer. Now I just have to deal with all the leftover issues AGC commented on!

My GEDCOM comes from Roots Magic. The way the sources/citations are handled there ends up with me having a LOT of duplicated information in the large text box. I manually cleaned that up for that first 45 person GEDCOM. Just about to try a few more people and see if AGC handles it all. If not, I'll get back to you with examples.

But really just wondering if any other Roots Magic users have contacted you to consider changes that would handle Roots Magic specifics.

posted by Glen Bodie
Hi Glen, glad that this is helping you.

I did hear just last week from another user using a RootsMagic GEDCOM who was seeing similar issues and was talking to RootsMagic support about it. That was Schmehl-58 if you are interested in chatting directly.

posted by Rob Pavey
This isn't a Roots Magic tech support issue. It is working correctly as designed and as it has done for many versions of the product.

If RM were to change to give you more options in the generation of a GEDCOM that could help but they are putting finishing touches on a major upgrade to V8 (which they've been working on for several years) and they're bug fixing not taking enhancement requests. So if any GEDCOM option were to get added it is a few years down the road - though waiting to see what V8 looks like seems a prudent thing to do. I had been thinking about writing a GEDCOM post-processor to deal with this until I found out about AGC. You've got the great majority of the infrastructure all in place and working, so that's a better place to deal with it - as long as supporting other GEDCOM generators than Ancestry is in your plan.

Give me a day or two to generate a decent test case with my RM V7, see what AGC does with it, and get back to you.

posted by Glen Bodie
I generated a Roots Magic GEDCOM (#1), imported it using GEDCOMpare (#2), cleaned up the Text with AGC (#3), and then further cleaned it up manually (#4). I extracted from all of those just the section that identified the person and one fact (an Occupation) and the one source for that fact. The extracts make it much shorter and possible to see the specific issue, but sooner or later you will want to see a full GEDCOM, I know. It's a bit much to post here - can I email you an attached text file?

This handling of the complex source representation in Roots Magic is the main thing I was looking at. But there is a further issue in date representation - things that Roots Magic supports but AGC does not recognize as dates. RM allows not just the familiar "Before (BEF) date", the "After (AFT) date" but also "FROM date1 TO date2" and "BETween date1 AND date2". I can get you more information on how those are handled in Roots Magic.

posted by Glen Bodie
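For readers unfamiliar with the date forms mentioned in the comment above, here is a minimal Python sketch of recognizing them. The keywords BEF, AFT, BET ... AND and FROM ... TO are standard GEDCOM date syntax; the function name split_gedcom_date and the returned labels are assumptions made for this illustration and say nothing about how AGC or RootsMagic handle dates internally.

```python
import re
from typing import Optional, Tuple


def split_gedcom_date(raw: str) -> Tuple[str, str, Optional[str]]:
    """Return (kind, first_date, second_date) for a GEDCOM date value.

    kind is "between", "from-to", "before", "after" or "exact".
    second_date is None unless the value holds two dates.
    """
    text = " ".join(raw.split())  # collapse repeated whitespace

    match = re.fullmatch(r"BET (.+) AND (.+)", text, flags=re.IGNORECASE)
    if match:
        return ("between", match.group(1), match.group(2))

    match = re.fullmatch(r"FROM (.+) TO (.+)", text, flags=re.IGNORECASE)
    if match:
        return ("from-to", match.group(1), match.group(2))

    match = re.fullmatch(r"(BEF|AFT) (.+)", text, flags=re.IGNORECASE)
    if match:
        kind = {"BEF": "before", "AFT": "after"}[match.group(1).upper()]
        return (kind, match.group(2), None)

    # Anything else is passed through untouched in this sketch.
    return ("exact", text, None)


print(split_gedcom_date("BET 1 JAN 1900 AND 31 DEC 1910"))
```

Open-ended periods such as "FROM 1880" with no "TO" part are not handled here and fall through as "exact".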
Hi Glen,

I haven't looked at this page for a long time and I realize now that I missed your replies. I may be able to at least improve some of the handling of RootsMagic date formats.

If you can send me a Private Message with an example WikiTree profile text as it is created by GEDCOMpare I can add it to my test suite and work on improving the handling of it.

I have been spending all my time on my other extension (WikiTree Sourcer) but hope to get back to some work on AGC soon.

Cheers, Rob

posted by Rob Pavey
Hi Rob, I use AGC and it is miraculous. However, I just lost my SAVE button this morning. No matter what I did, I could not save any changes to any profile (I'm not using AGC at the moment for these profiles.) After a lot of investigation, I removed the AGC extension, and voila!, I can now save. Any suggestions?

~Kathy

posted by Kathy Evans
Hi Kathy, thanks for reporting the problem.

I'm not able to reproduce the problem myself. I guess it is possible for the WikiTree website to change in a way that could mess up the extension but I'm not seeing it. Could you give a bit more detail? - Was the "SAVE CHANGES" button completely missing? If so, is this true for the one above the edit box as well as the one at the bottom of the page? - Which browser are you using? I assume Chrome?

Thanks, Rob

posted by Rob Pavey
The Save Changes button was grayed out, and under the button it said "No changes to save", and it was both buttons.

Chrome is my main browser, but this morning I tried it on Opera, Vivaldi, and Edge. Right now, Vivaldi is working with all extensions removed. When I tested it, I removed extensions right to left, and after I removed each extension I refreshed each time. After AGC was deleted it resolved, but that may have been because it was the last extension, I can't remember.

Now, I just went back and did that again in Chrome, added the extensions, removed them, and then added them back. The first time I added the extensions, the Save was grayed out. After I removed them, I could save. Then I just added them back in, and I can still Save.

So, obviously, there is no problem with your App, and I am sorry for bothering you with this. I appreciate your quick response.

posted by Kathy Evans
Thanks for the update Kathy. Let me know if it happens again.

The AGC extension does have some code to enable the save changes button after it modifies the text (by sending a 'change' event). Otherwise you get the "No changes to save" message. So there could be some relation. There isn't any code that disables the save button though.

posted by Rob Pavey
Hi Rob,

I had this case where someone used AGC to reformat a profile that had a "Could not interpret date in Birth Date (20 MAY 19??)." message, which was reformatted to "was born on 20 May 1900". Please see https://www.wikitree.com/index.php?title=DeROO-85&diff=117958614&oldid=100169665 That sent me on a fruitless hunt searching for someone born 1900-05-20. Only after inspecting the change log did I notice the 1900 had no basis. Dropping that search filter it was not hard to find the person, with birth date 1911-05-20...

I guess the problem was initially caused by the GEDCOM import software assigning birth date (basic data item) 1900-00-00 from input 20 MAY 19??. Maybe you can implement logic in your otherwise wonderful tool to process "Could not interpret date" cases more sophisticated? Could not find any reference to "interpret" in your documentation.

posted by Jan Terink
Hi Jan,

Thanks for reporting that problem. I'm thinking about the best way for the extension to handle that. Perhaps you have some suggestions? I can detect the "Could not interpret date" text (and also ? characters in dates). Options that I can think of would be:

  1. Error out. Display a message that the profile cannot be reformatted until the date format is fixed, don't make any changes.
  2. Add alert in the "= Issues to be checked =" section. Would have to decide what to do with the date.
  3. Add alert in the "= Issues to be checked =" AND leave the "Could not interpret date" in the bio.

Any thoughts? Rob

posted by Rob Pavey
edited by Rob Pavey
I would go for option 3, Rob.

Looking at the latest 851 suggestions (almost 16K) there are less than 10 occurrences of ?? So that would, imo, not justify a significant development effort.

Thank you!

posted by Jan Terink
I just released version 0.1.26 which includes a check for this situation. I went with option 1 as it was the easiest to implement and seems safest. As you say there are very few profiles with this situation but I think the latest GEDCOMpare may still add this text and more imports are coming from RootsMagic which has date formats that GEDCOMpare does not handle.
posted by Rob Pavey
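The kind of check discussed in this thread (detecting the "Could not interpret date" text and ? characters in dates, and refusing to reformat if either is found) might look roughly like the Python sketch below. It is only an illustration: the function name find_date_problems and the regular expression are assumptions, not the code released in version 0.1.26.

```python
import re
from typing import List

MARKER = "Could not interpret date"

# Assumed date shape: day, month word, then a year containing at least one "?".
DATE_WITH_QUESTION_MARK = re.compile(r"\b\d{1,2} [A-Za-z]{3,9} [\d?]*\?[\d?]*")


def find_date_problems(biography: str) -> List[str]:
    """Return the pieces of text that should block automatic reformatting."""
    problems = []
    if MARKER in biography:
        problems.append(MARKER)
    problems.extend(m.group(0) for m in DATE_WITH_QUESTION_MARK.finditer(biography))
    return problems


text = "Could not interpret date in Birth Date (20 MAY 19??)."
problems = find_date_problems(text)
if problems:
    # Option 1 from the discussion: report the problem and change nothing.
    print("Cannot reformat until these dates are fixed:", problems)
```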
Indeed easiest and safest, Rob, and no problem, considering the limited occurrences.
posted by Jan Terink
I noticed today that if you have initials for middle names in the first name field and they get moved to the middle name field they get reversed.

I will check that it is just initials. It did it with one that had a name followed by an initial.

posted by Hilary (Buckle) Gadsby
I have checked another profile with 2 middle names and it transposed the names when it moved them to the middle name field.
posted by Hilary (Buckle) Gadsby
Thanks for spotting that Hilary!

I have fixed the issue in version 0.1.21

posted by Rob Pavey
Hi Rob,

I have been using AGC for a couple of months now and generally like the way it works. One thing that may be helpful is to look for the first and middle name in the first name and preferred name field. If found, correct it or put an "Issue" in the reference notes to prompt correction. I adopted dozens of profiles imported by GEDCOMs and they all have this issue. Here is an example. Shirley-1037

posted by Scott Anderson
Hi Scott, I will take a look at that. It sounds like the best approach would be to fix it up and add an issue. That way the preferred name field could be modified before generating the bio text (which uses the preferred name everywhere in the narrative facts).
posted by Rob Pavey
Version 0.1.19 is now available and implements your request. Let me know if you see any issues.
posted by Rob Pavey
Is there a way the app could change

Record File Number

Record File Number: geni:6000000008508719699

to something like

in the Sources section, as the FamilySearch ID is currently handled?

I've been copying them out by hand to not lose the information and it is rather tedious.

Example https://www.wikitree.com/index.php?title=Zaborskas-53&diff=113732769&oldid=5477361

posted by Aaron Gullison
edited by Aaron Gullison
Hi Aaron, I have added the feature that you requested. It is in version 0.1.16 which should be available on the Chrome store in a day or so.

Thanks for the suggestion, Rob

posted by Rob Pavey
Wow. Really. Wow. Thank you. This will SO help me get through my earliest gedcom uploads that I have been slowly working through, manually. Love it!
posted by Jillaine Smith
Hi Jillaine,

I am glad to hear that my extension is helping you. Rob

posted by Rob Pavey
Minor Bug Report

I tried the GedCom cleanup on Bancroft-665. One minor glitch but otherwise awesome! For some reason it placed the 2nd level "Death" heading and information right after <ref> in the last census citation. Maybe because it was a 2nd level heading instead of 3rd level like it should have been?

posted by Scott Anderson
Hi Scott,

Yes, it would be because the = Death = section was a second level heading rather than third. It looks like someone manually added that death section after the GEDCOM import. So it treats it like any other second level heading (e.g. Research notes). Rob

posted by Rob Pavey

Categories: GEDCOM Help