Beta testers wanted! WikiTree AGC reformats GEDCOMpare-created profiles into a nice chronological narrative.

+18 votes
398 views

Hi Everyone,

While importing profiles from my Ancestry GEDCOM I grew frustrated with the biographies created by GEDCOMpare and wrote my own Chrome extension called WikiTree AGC that reformats them into a nice starting point for a “typical” WikiTree biography. More information on the extension is on this free space page: https://www.wikitree.com/index.php?title=Space:WikiTree_AGC

It is just one button click to reformat the biography.

It is working great for my GEDCOM and my style preferences. But I realize that everyone has their own style preferences and everyone uses Ancestry (or other tools) in different ways and thus creates different looking GEDCOMs. That is where you come in. The extension has various options to control what it produces but I’m looking for feedback to improve it.

So I am looking for some beta testers to try it out before I publish it to the Chrome extension store.

Requirements for beta testers:

  • You have (or are prepared to install) the Chrome browser
  • You are willing to download a zip file and install an unpacked Chrome extension
  • You have a GEDCOM you can use to add profiles on WikiTree (one from Ancestry or other sources)
  • You are willing to report issues to me

Thanks for your help.

Rob

UPDATED: The extension is now available for free in the Chrome Web Store. The latest development version is also available for anyone to download. All details on the free space page mentioned above.

WikiTree profile: Space:WikiTree_AGC
in The Tree House by Rob Pavey G2G6 Mach 1 (10.7k points)
edited by Rob Pavey
This looks like a wonderful initiative, Rob!
Your write-up, with the different options, looks very good. Have you contacted Jamie to possibly have it listed with the other WikiTree apps or extensions? It would seem that we could use your app just to clean up any GEDCOM profiles that we come across. Do we have to actually create profiles with a GEDCOM?
Thanks Linda,

I haven't contacted Jamie to have it listed yet. I wanted to gauge interest first, then get it published in the Chrome store (I'm in the process of getting it reviewed for publishing).

It can be used to clean up GEDCOM profiles that were imported previously but ONLY if they have not been edited beyond adding a few newlines. Parsing the output of GEDCOMpare isn't simple and I have to make assumptions about the exact order of certain lines (e.g. that the description is on a line before the date for each fact).

For this reason I would prefer that, for beta testing, it is only used on freshly created bios - otherwise it is very hard to determine if an issue is a problem with my extension or because the bio has been previously edited :P
Happy to be a beta tester for you, Rob. Some of my GEDCOM upload has never been touched.
Rob, I added the profile_improvement tag to give this post more visibility.

6 Answers

+2 votes

Great idea, Rob, but should you perhaps create the standard headings within the bio, a la the biographies standard?

I'm also not 100% sure about the named ref being dropped for numbered sources. I know it may not add much to the WikiTree page, but might it help folks who uploaded the GEDCOM keep their sources more organized? I'm not sure how many people care about those numbers, though.

Be happy to be a beta for you, I do have a fair number I could create and use this on.

by Jonathan Crawford G2G6 Mach 9 (92.7k points)
Jonathan, I am not sure what you are saying he isn't doing. He has Biography, Research Notes, and Sources with the references tag. Other headings are optional.

Do you mean headings for Birth, Marriage, Death, Census?  If so, maybe that could be an option, similar to how ref names are created.  Some want biography in a narrative fashion without headings and others like the headings.
As far as the numbered sources being dropped, most people when they go in and clean up a gedcom profile or combine sources, the S## that was used is removed and a different name is created, if a name is needed, because the S## is not needed since it was only a way to find the other information in the sources section. You only need a named ref if it is going to be used an additional time. Some people do create named ref for all their references though, but it is not needed.
Right, optional headers, so maybe include the option to add those headers in the appropriate spot instead of the bolded headers like Birth, Marriage, etc. that GEDCOMpare adds in.

Early Life

Family

Occupation

Death and Legacy

Timeline
Yes, I have seen this in the named refs, but I never remove or modify the name. The database geek in me wonders if there is someone who has a database table or spreadsheet with all the sources they have collected for their family and they actually USE that number to keep themselves organized. I know that there shouldn't be a need for a six digit alphanumeric just to identify a unique source on one profile, but it isn't hurting anything to leave it either, and if it means that someone might be screwed up if I remove it, I don't.

I feel like that falls into "use their style, not yours"
Yes, it could be an option to add subheadings like "Early Life". I would be interested to see if beta testers find that would be useful. GEDCOM-created profiles don't tend to be long enough to justify it in most cases, but someone might be planning to make all their GEDCOM-created profiles really long. The option would probably have to use some rules. For example, if a person died before the age of 12 I currently add the {{Died Young}} sticker. If someone died before 25, or if there are fewer than N events, then additional subheaders are probably not wanted, etc.
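The kind of rules described above could be sketched roughly as follows. This is only an illustration in Python, not the extension's actual code: the function names are hypothetical, and the value chosen for N is an assumption, since the post leaves it open. Only the age thresholds (12 and 25) come from the post.

```python
from typing import Optional

DIED_YOUNG_AGE = 12            # from the post: sticker is added if the person died before 12
MIN_AGE_FOR_SUBHEADINGS = 25   # from the post: no subheadings if the person died before 25
MIN_EVENTS_FOR_SUBHEADINGS = 6 # assumption: the post only says "N"


def wants_died_young_sticker(age_at_death: Optional[int]) -> bool:
    """True if the {{Died Young}} sticker should be added."""
    return age_at_death is not None and age_at_death < DIED_YOUNG_AGE


def wants_subheadings(age_at_death: Optional[int], event_count: int,
                      min_events: int = MIN_EVENTS_FOR_SUBHEADINGS) -> bool:
    """True if a profile looks long enough to justify subheadings like "Early Life"."""
    if age_at_death is not None and age_at_death < MIN_AGE_FOR_SUBHEADINGS:
        return False
    return event_count >= min_events
```

An unknown age at death (None) is treated here as "not known to be young", which is itself a design choice that testers might want to weigh in on.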

This is all useful feedback :)
Thanks for volunteering to test it out Jonathan. I have sent you a PM with instructions.
When the gedcom comes in, it has the 'headings' with the quotes to convert it to bold, so those could possibly be made into equal signs for a header.
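As a rough sketch of that idea (Python, illustrative only): a line that consists solely of a bolded label in wiki markup could be rewritten as a heading. The function name and the choice of a level-3 heading are assumptions, and the exact lines GEDCOMpare emits may differ from what this pattern expects.

```python
import re

# A line that is nothing but a bolded label, e.g. '''Birth''' or '''Birth''':
BOLD_ONLY_LINE = re.compile(r"^\s*'''\s*([^']+?)\s*'''\s*:?\s*$")


def bold_to_heading(line: str, level: int = 3) -> str:
    """Turn a bold-only line into a wiki heading; leave every other line unchanged."""
    match = BOLD_ONLY_LINE.match(line)
    if match is None:
        return line
    marks = "=" * level
    return f"{marks} {match.group(1)} {marks}"
```

Lines where the bold label is followed by other text on the same line are deliberately left alone, since they are part of a sentence rather than a heading.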
That would make it less of a chronological narrative. It would be a biography where the facts/events were grouped by type.

If there is a desire for an option to do this I could add it to my wish list. It would be a non-trivial amount of work (a whole new style).
+1 vote
Sounds awesome! I'd like to try it out
by Christina Mckeithan G2G6 Mach 1 (12.0k points)
Great! I have sent you a PM.
I've redone a bunch of profiles with this now and I have to say it's a huge improvement over what a GEDCOM profile looked like before. I quit using my GEDCOM because of the formatting, but I feel like it's useful again. Thanks Rob!
+1 vote
This is much improved over the current output that we get from Ancestry imports. I have a couple of comments about the format:

-- I would recommend that the "meaningful titles" in the source citations be excluded, not the default. Titles like this are not industry-standard or WikiTree-standard citation format and shouldn't be encouraged, in my opinion.

-- Changing "vague" birth and death date fields to be "before" the exact baptism or burial date is concerning. How are you defining "vague"? Do you mean a year-only date, or would that include dates with year and month without the exact day? Would the user be alerted to the fact that a date field was changed?

-- In the sample output, it shows one item without a citation. Maybe one could assume that it came from the same source as the next item, but that isn't an obvious conclusion from the context. The reader shouldn't be forced to make assumptions about sources, or be left wondering. Any future rearrangement of the statements in the profile could distance it even more from its source.

-- Regarding the "Multiple Use" option for citations - I think this should be the default. It is a clean and efficient method, and is a recommended WikiTree format for using the same citation multiple times.

Just a few thoughts for consideration. Thanks for your work on this, it is definitely worth the effort!
by Joyce Rivette G2G6 Pilot (112k points)
Thanks Joyce,

Good point about the meaningful titles on the sources. That is definitely my personal style rather than a standard WikiTree thing. I will make that optional in the next version.

On the vague birth dates: yes, currently it only does this if the birth date is year-only and in the same year as the baptism. When it does this, it adds a note to the Research Notes section explaining the change. I could also make this behavior optional.
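A minimal sketch of that rule, in Python and under stated assumptions: the function name, the date text format and the wording of the research note are all hypothetical. Only the condition itself (year-only birth date, same year as the baptism) comes from the description above.

```python
import re
from datetime import date
from typing import Optional, Tuple

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def narrow_birth_date(birth: str, baptism: Optional[date]) -> Tuple[str, Optional[str]]:
    """Return (birth_text, research_note); the note is None when nothing was changed."""
    year_only = re.fullmatch(r"\d{4}", birth.strip())
    if baptism is None or year_only is None:
        return birth, None
    if int(year_only.group()) != baptism.year:
        return birth, None
    baptism_text = f"{baptism.day} {MONTHS[baptism.month - 1]} {baptism.year}"
    note = (f"Birth date changed from {birth.strip()} to before {baptism_text} "
            "because the baptism took place in the same year.")
    return f"before {baptism_text}", note
```

Returning the note alongside the new date keeps the change visible to the user, which addresses the concern about being alerted when a date field is altered.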

The sample output is using the "Never" option for named references. The source citation in the GEDCOMpare created biography is linked to both the residence fact and the military fact. With the "Never" option it only puts it on the later of the two facts. Any of the other options would have included a source citation link on that line.

I'm interested in seeing what people think the default for the named references should be after experimenting with it. In the case of GEDCOMs from Ancestry, any source that mentions any idea of the birth date or age gets linked to the birth fact. So if "multiple use" is used all of these sources are listed with the first line of the bio. So I am finding that either "Minimal" or "Selective" works best for Ancestry GEDCOMs. I'm interested to see what others think.

Cheers,

Rob

Yes, I can see that it's a fine line between not missing anything and having too much unnecessary repetition. I personally would rather make the choice to remove the unnecessary bits manually. Then again, although I use Ancestry, I haven't imported any gedcoms from there, only dealt with the aftermath.

How does the "Selective" option work? How does it determine if a date or location is "more accurate"? That seems to be a subjective decision not easily made by programming. I agree it would need more testing to see how well it does with this.

On the "Multiple Use" option, if you haven't already, you should consider the possibility of making it work similar to the "Never" option in that it would only put the full citation on the later of the facts, using the named ref for all the others preceding. That would help spread the bulk of the citation text around, and hopefully result in placing the main citation with the primary event that it refers to most of the time.

Yes, I agree that it is a fine line. My guiding "simple mission statement" is to try to provide the most convenient starting point for the user to then improve the biography, while also making something that looks OK if left for some time before someone comes back to work on it.

Both the "Minimal" and "Selective" options try to pick the "most accurate" references. This is data driven. I use string matching to recognize known sources that I have recorded accuracy information for. For example, most England censuses have an approximate birth year and a birth place accurate to the town, while the 1841 census has a birth date accurate to within 5 years and a birth county (well, actually only whether it is the same county as the current residence, but I don't make that distinction currently). On the other hand, the 1939 register has a birth date accurate to the day but no birth location. This dataset can be increased over time as I come across more cases. It is not perfect but it seems to work quite well.
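To illustrate the data-driven approach described above, here is a small Python sketch. It is not the extension's code: the table layout, the names, the matched title strings and the numeric value used for "approximate birth year" are all assumptions made for the example. Only the three accuracy facts (typical England census, 1841 census, 1939 register) come from the post.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class BirthAccuracy:
    date_years: Optional[float]  # uncertainty of the birth date in years; None = no date info
    place: Optional[str]         # "town", "county", or None = no birth place info


# Checked in order, so more specific patterns must come before general ones.
KNOWN_SOURCES = [
    ("1841 england census", BirthAccuracy(5, "county")),
    ("1939 england and wales register", BirthAccuracy(0, None)),
    ("england census", BirthAccuracy(1, "town")),  # 1 year is an assumed value for "approximate"
]


def lookup_accuracy(title: str) -> Optional[BirthAccuracy]:
    """Find the recorded accuracy for a source title by substring matching."""
    lowered = title.lower()
    for pattern, accuracy in KNOWN_SOURCES:
        if pattern in lowered:
            return accuracy
    return None


def most_accurate_for_birth_date(titles: List[str]) -> Optional[str]:
    """Pick the source title whose birth date has the smallest uncertainty."""
    best_title, best_years = None, None
    for title in titles:
        accuracy = lookup_accuracy(title)
        if accuracy is None or accuracy.date_years is None:
            continue
        if best_years is None or accuracy.date_years < best_years:
            best_title, best_years = title, accuracy.date_years
    return best_title
```

Unrecognized sources simply return None here, so a caller would need a fallback for them; growing the table over time is what makes the selection better.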

All of the options work as you suggest. The full citation is moved to the latest fact and any "secondary uses" of the citation just use the name.
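The placement described here could look something like the following Python sketch. The function name and the data shapes are hypothetical; the ref syntax is the standard wiki form, where a named ref can be reused with a self-closing tag.

```python
from typing import Dict, List, Tuple


def place_citations(facts: List[Tuple[str, List[str]]],
                    citations: Dict[str, str]) -> List[str]:
    """facts: (text, citation ids) in chronological order; citations: id -> full text.

    The full citation goes on the latest fact that uses it; earlier facts
    only reference it by name.
    """
    last_use: Dict[str, int] = {}
    use_count: Dict[str, int] = {}
    for index, (_, ids) in enumerate(facts):
        for cid in ids:
            last_use[cid] = index
            use_count[cid] = use_count.get(cid, 0) + 1

    lines = []
    for index, (text, ids) in enumerate(facts):
        refs = []
        for cid in ids:
            if last_use[cid] != index:
                refs.append(f'<ref name="{cid}" />')                       # secondary use
            elif use_count[cid] > 1:
                refs.append(f'<ref name="{cid}">{citations[cid]}</ref>')  # full, named
            else:
                refs.append(f"<ref>{citations[cid]}</ref>")               # full, single use
        lines.append(text + "".join(refs))
    return lines
```

A citation that is only used once gets no name at all in this sketch, in line with the earlier comment that a named ref is only needed when it is used more than once.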
That's interesting that you've built in some quality checking for sources. I like it! I'm sure there will be more to add to your database as the app gets more widely used. Thanks!
+1 vote
Wanted to give a heads up, I'm getting a lot of 571 suggestions because there's no link to the individual Find a Grave profile in the citations created.
by Christina Mckeithan G2G6 Mach 1 (12.0k points)
Hi Christina,

Could you provide a profile ID of one of the profiles that you reformatted using the extension that is getting this 571 suggestion? Then I can investigate whether it is something that I can fix in the extension. I don't seem to be getting this suggestion on any of my reformatted profiles.

Thanks,
Rob

Never mind, I figured out how to use WikiTree+ to find the profiles that you manage with these suggestions :)

So, looking at Simpson-17621, it looks like the original GEDCOMpare-created profile had:

<ref name="ref_0">
Source: [[#S1046062360]] {{Ancestry Record|60525|33045961}}
</ref>

and

* Source: <span id='S1046062360'>S1046062360</span> U.S., Find A Grave Index, 1600s-Current Ancestry.com Publication: Ancestry.com Operations, Inc. Note: <i>Find A Grave</i>. Find A Grave. http://www.findagrave.com/cgi-bin/fg.cgi. {{Ancestry Record|60525|0}}

and the reformatted profile has:

<ref>'''Burial''': U.S., Find A Grave Index, 1600s-Current Ancestry.com Publication: Ancestry.com Operations, Inc. Note: <i>Find A Grave</i>. Find A Grave. http://www.findagrave.com/cgi-bin/fg.cgi. Citing: {{Ancestry Record|60525|33045961}} (accessed 12 August 2020)</ref>

I should be able to fix this by removing the http://www.findagrave.com/cgi-bin/fg.cgi. in the same way that I am already removing the {{Ancestry Record|60525|0}} that was in the source.
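A sketch of that kind of cleanup, in Python for illustration only (the function and pattern names are hypothetical and this is not the extension's actual code): it drops the generic Find A Grave URL and any Ancestry Record template whose record number is 0, while leaving templates that point at a specific record untouched.

```python
import re

# The generic search URL with no memorial ID; URLs with a query string are left alone.
GENERIC_FAG_URL = re.compile(r"\s*http://www\.findagrave\.com/cgi-bin/fg\.cgi\.?(?![?\w/])")
# An Ancestry Record template whose second parameter is 0, i.e. no specific record.
EMPTY_ANCESTRY_RECORD = re.compile(r"\s*\{\{Ancestry Record\|\d+\|0\}\}")


def clean_source_text(text: str) -> str:
    """Remove the generic URL and empty record templates from a source description."""
    text = GENERIC_FAG_URL.sub("", text)
    text = EMPTY_ANCESTRY_RECORD.sub("", text)
    return text.strip()
```

Applied to the source line quoted above, this would leave the text ending at "Find A Grave." with neither the URL nor the {{Ancestry Record|60525|0}} template.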

Thanks for reporting the issue.

Rob

Thank you! I'll try to remember to leave an example next time :)
0 votes
Would love to help out, Rob!
by Azure Robinson G2G6 Pilot (204k points)

Hi Azure, the AGC extension is now published and out of beta test. You can download it from the Chrome store. See https://www.wikitree.com/index.php?title=Space:WikiTree_AGC

If you find any issues please report them and I will try to fix them. I will be on vacation for most of September though so fixes will have to wait until after that :)

Cheers,
Rob

0 votes

I have 2 minor issues with your description of the illustrated biography section as a "chronological narrative."

To me, this format is not a narrative, but merely the placement of all the facts into a single paragraph. Unless the biography is written in complete sentences, I would prefer the standard timeline to display the facts as written.

And since the "Died" fact appears before the "Residence" and "Military" facts, the display is not chronological.

As I said, these are just minor issues - and mainly my personal preferences (so feel free to ignore!).

by Lindy Jones G2G6 Pilot (217k points)
Hi Lindy, It sounds like you are looking at the “before” picture and thinking it is the “after” picture.
Well, then I don't have any issues.

Thanks for the reply!
