Kentucky Record Transcriptions: A Case Study in Computer Assisted Profile Creation

I will be describing everything in EXCRUCIATING detail, but for now here is a quick summary, with no attempt to make it digestible for the non-specialist:

The book Afro-American deaths of Boone thru Boyle County, Kentucky, 1852-1862 by Gwendolyn Tippe is essentially page after page of tabular data. I downloaded the pages as JPEG images from FamilySearch, then ran the OCR program Tesseract on them to produce text files. I hand-edited those files so that each row became a list of comma-separated values. I then processed the edited text files with a Python program I wrote, which produces two new files. The first is a .csv file that I import into the GRAMPS genealogy program; I then export the tree I created in GRAMPS to a .ged (GEDCOM) file, which I import into WikiTree. The second file goes to another Python program I wrote, which generates the full text of the bio section. As I added the profiles through the GEDCOMpare process, when I got to the step of editing each new profile, all I had to do was copy and paste the generated bio into the text box. BOOM! Done.
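To make the two-output step concrete, here is a minimal sketch of how hand-edited rows might be turned into an import CSV and a generated bio. It is not the actual program described above. The column order in FIELDS, the header names in the import CSV, the bio wording, and the sample row are all assumptions invented for illustration; the real book's columns and the headers GRAMPS expects would need to be checked.

```python
import csv
import io

# Hypothetical column order for one hand-edited row; the book's real columns may differ.
FIELDS = ["given", "surname", "sex", "age", "death_date", "county", "cause"]

# Citation text built from the title and author named in the summary above.
SOURCE = ("Tippe, Gwendolyn. Afro-American deaths of Boone thru Boyle County, "
          "Kentucky, 1852-1862.")


def read_rows(text):
    """Parse hand-edited comma-separated lines into dicts keyed by FIELDS."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(FIELDS, (cell.strip() for cell in row)))
            for row in reader if row]  # blank lines come back as [] and are skipped


def to_import_csv(rows):
    """Build CSV text for a genealogy program; header names are placeholders."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Given", "Surname", "Gender", "Death date", "Death place"])
    for r in rows:
        place = f"{r['county']} County, Kentucky"
        writer.writerow([r["given"], r["surname"], r["sex"], r["death_date"], place])
    return out.getvalue()


def make_bio(r):
    """Generate bio text for one person (assumed wording and section layout)."""
    name = f"{r['given']} {r['surname']}".strip()
    return (
        "== Biography ==\n"
        f"{name} died on {r['death_date']} in {r['county']} County, Kentucky, "
        f"at about age {r['age']}. The recorded cause of death was {r['cause']}.\n"
        "== Sources ==\n"
        "<references />\n"
        f"* {SOURCE}\n"
    )


# Invented sample row, not taken from the book.
SAMPLE = "Jane, Example, F, 40, 2 Mar 1855, Boone, fever\n"

rows = read_rows(SAMPLE)
import_csv_text = to_import_csv(rows)
bio_text = make_bio(rows[0])
print(import_csv_text)
print(bio_text)
```

Keeping the parsing in one function and the two outputs in separate functions mirrors the workflow in the summary: the same cleaned rows feed both the tree import and the bio text, so a correction made in the hand-edited file shows up in both.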
