Help:Database Dumps

Raw data dumps of the publicly-available family tree are available to WikiTree Apps project members and some others. You're encouraged to consider how they might be useful. Join the WT Apps project.

Permissions and Restrictions

Permissions are granted on an individual basis since we do not have a standard license. Do not access or use the data without permission.

Generally speaking, usage should be non-commercial, but commercial purposes may be allowed. We need to make sure they are consistent with the WikiTree Pledge, i.e. "we will never knowingly and willingly sell or transfer the worldwide family tree to any individual or organization that intends to charge for access to it."

Unless or until WikiTree no longer exists, i.e. Interesting.com, Inc. is no longer supporting it and there is no successor organization:

1.) Files should not be redistributed. We want to be sure that anyone using the files does so in accordance with the permissions above and the other conditions in this list.

2.) If applicable for your use, the data in your application must be regularly updated. This is particularly important for privacy reasons. Even though the data is privacy-controlled, privacy settings are edited by members and subject to change. A member may have set a person as public one day and change the setting to private the next day. We also respond to copyright claims and privacy take-down requests on a daily basis.

3.) If your application has hyperlinks, person data should clearly link to the original profiles on WikiTree so that full information, updates, and credit to contributors are available to your users.
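As an illustration of the linking requirement, here is a minimal Python sketch that turns a WikiTree ID from the dump into a profile link. The base URL pattern, the handling of spaces, and the function name `profile_url` are assumptions made for this example, not something the dump files define.

```python
from urllib.parse import quote

# Assumption: profiles live at https://www.wikitree.com/wiki/<WikiTree ID>.
PROFILE_BASE = "https://www.wikitree.com/wiki/"


def profile_url(wikitree_id: str) -> str:
    """Build a link back to the original profile (hypothetical helper)."""
    # Assumption: a space in an ID is written as an underscore in the URL.
    cleaned = wikitree_id.strip().replace(" ", "_")
    # quote() percent-encodes any character that is not URL-safe.
    return PROFILE_BASE + quote(cleaned)
```

Check a few generated links by hand against real profiles before relying on this pattern.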

Access

The files are available on the apps.wikitree.com server. They require a special login for SFTP access. Logins are based on your WikiTree ID so you will need to register as a WikiTree member and confirm your e-mail address if you haven't already done so. Contact Chris.

After SFTP login, navigate up to the /dumps directory.

Updates

The files are currently updated once a week, on Sunday nights.

Using the Files

The dumps are zipped, tab-separated data files. The fields in each file are described in the sections that follow.
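A zipped tab-separated file can be read without unpacking it first. The sketch below uses only the Python standard library. It assumes the archive holds a single UTF-8 data file and makes no assumption about a header row, so check the first row yourself. The function name `iter_dump_rows` is made up for this example.

```python
import csv
import io
import zipfile
from typing import Iterator, List


def iter_dump_rows(zip_path, member=None) -> Iterator[List[str]]:
    """Yield each line of a zipped tab-separated dump as a list of strings."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the archive holds one data file, so take the first
        # member unless the caller names one explicitly.
        name = member or archive.namelist()[0]
        with archive.open(name) as raw:
            # Assumption: the data is UTF-8 encoded.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            # QUOTE_NONE keeps quotation marks in names and places literal.
            reader = csv.reader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            for row in reader:
                yield row
```

Because rows are yielded one at a time, a large dump can be processed without loading the whole file into memory.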

Basic files

dump_people_user

This is the main file. It contains all people on WikiTree except those who are at the Privacy Level "Unlisted." The information included is privacy-controlled. For example, if a person does not have a public family tree, their Father ID and Mother ID will be 0.

Included fields:

  • User ID: This is needed to connect father, mother, marriages, and photos.
  • WikiTree ID: This is needed for the WikiTree URL.
  • Gender
  • First Name: This concatenates WikiTree's Prefix, Formal First Name, and Middle Name fields if the person is public. If the person is private it's Prefix, Preferred First Name, and Middle Initial.
  • Last Name at Birth: aka maiden name.
  • Current Last Name: aka married name.
  • Suffix
  • Birth Date: YYYY-MM-DD or a decade, e.g. 1960s, if the person is private.
  • Birth Date Status: certain/guess/before/after; "blank" means intentionally blank for privacy.
  • Birth Location
  • Death Date
  • Death Date Status
  • Death Location
  • Father ID: User ID for father.
  • Mother ID
  • Privacy

The privacy number corresponds to our Privacy Levels like this:

  • 10: Unlisted
  • 20: Private
  • 30: Private with Public Biography
  • 35: Private with Public Family Tree
  • 40: Private with Public Biography and Family Tree
  • 50: Public
  • 60: Open
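The field conventions above can be captured in a few small helpers. This is a sketch rather than an official parser. The enum member names, `parse_birth_date`, and `parent_id` are invented for the example, and it assumes dates arrive exactly as "YYYY-MM-DD" or as a decade such as "1960s".

```python
import re
from enum import IntEnum
from typing import Optional, Tuple


class Privacy(IntEnum):
    """Privacy numbers as listed for dump_people_user."""
    UNLISTED = 10
    PRIVATE = 20
    PRIVATE_PUBLIC_BIO = 30
    PRIVATE_PUBLIC_TREE = 35
    PRIVATE_PUBLIC_BIO_AND_TREE = 40
    PUBLIC = 50
    OPEN = 60


def parse_birth_date(value: str) -> Tuple[Optional[int], bool]:
    """Return (year, is_decade) for 'YYYY-MM-DD' or a decade like '1960s'."""
    value = value.strip()
    decade = re.fullmatch(r"(\d{4})s", value)
    if decade:
        return int(decade.group(1)), True
    full = re.fullmatch(r"(\d{4})-\d{2}-\d{2}", value)
    if full:
        year = int(full.group(1))
        # Assumption: a year of 0000 means the year is not known.
        return (year if year else None), False
    return None, False


def parent_id(value: str) -> Optional[int]:
    """Treat 0 (or an empty field) as 'no parent available'."""
    number = int(value) if value.strip() else 0
    return number or None
```

Because `Privacy` is an `IntEnum`, levels can be compared directly. For example, `Privacy(int(field)) >= Privacy.PUBLIC` selects Public and Open profiles.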

dump_people_marriages

The only relationship we store other than father and mother is spouses. (Siblings and children can be inferred from parental relationships.) Marriages are stored apart from our main person table.
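To illustrate how children and siblings can be inferred from the parental relationships, here is a small Python sketch. It assumes you have already extracted (User ID, Father ID, Mother ID) triples from the people file as integers, with 0 meaning no parent is given. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_children_index(
    people: Iterable[Tuple[int, int, int]]
) -> Dict[int, Set[int]]:
    """Map each parent's User ID to the User IDs of their children."""
    children: Dict[int, Set[int]] = defaultdict(set)
    for user_id, father_id, mother_id in people:
        for parent in (father_id, mother_id):
            if parent:  # 0 means the parent is missing or hidden
                children[parent].add(user_id)
    return children


def siblings_of(
    user_id: int,
    father_id: int,
    mother_id: int,
    children: Dict[int, Set[int]],
) -> Set[int]:
    """Everyone sharing at least one listed parent (full and half siblings)."""
    found: Set[int] = set()
    for parent in (father_id, mother_id):
        if parent:
            found |= children.get(parent, set())
    found.discard(user_id)
    return found
```

Since parents are set to 0 for people without a public family tree, relationships inferred this way will be incomplete for private profiles.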

The marriages file contains:

  • User ID1
  • User ID2
  • Marriage Location
  • Marriage Date
  • Marriage Date Status: certain/guess/before/after.
  • Privacy1: Privacy Level for User ID1.
  • Privacy2

We store a Marriage End Date but it is not currently included.

The same person can be in multiple marriages.

No marriages are included if either person is private.
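Because the same person can appear in several marriage rows, a spouse lookup has to collect a list per person. The following Python sketch assumes each row is a list of strings in the column order given above, so the first two columns are User ID1 and User ID2. `build_spouse_index` is a name invented for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence


def build_spouse_index(
    marriages: Iterable[Sequence[str]]
) -> Dict[int, List[int]]:
    """Map each User ID to the User IDs of every listed spouse."""
    spouses: Dict[int, List[int]] = defaultdict(list)
    for row in marriages:
        # Assumption: columns 0 and 1 are User ID1 and User ID2.
        first, second = int(row[0]), int(row[1])
        # Record the marriage in both directions.
        spouses[first].append(second)
        spouses[second].append(first)
    return spouses
```

The resulting keys are User IDs, so they can be joined to the User ID field of dump_people_user.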

dump_people_photos

This contains URLs to images on WikiTree along with connected IDs.

thumbnail images

In addition to dump_people_photos there is a zipped file containing the 75-pixel thumbnails.

Extended files

These database dumps were designed to enable Aleš Trtnik to continue building on his impressive work with Project:Database Errors, but the files may be useful to other Project:WT Apps members.

dump_people_user_full

This is a more complete version of dump_people_user. The basic file skips some information and concatenates some fields, e.g. Prefix, First Name, and Middle Name are combined into First Name in that file.

Some fields, in addition to those described above for dump_people_user, that may not be self-explanatory:

  • WikiTree_ID_DB: A cleaned-up version of the WikiTree ID, e.g. St Maur becomes ST_MAUR.
  • Page ID: This ID, not the User ID, is needed to connect bio/sources sections in the text table.
  • Touched: Last modified date.
  • registration: If an active user, the registration date.
  • editcount: If an active user, the number of contributions.
  • thankcount: If an active user, the number of thank-yous.
  • nochildren: The person had no children.
  • nosiblings: The person had no siblings or no more siblings than those listed.
  • creator: The User ID of the member who created the profile.
  • manager: The User ID of the member who manages the profile. This can only list one manager.
  • has_children: The person has children.
  • is_living: A quick reference field derived from the birth/death dates and other information using the method described at Living People. It is updated when the record is saved.
  • background: If there is a background image for the profile.
  • is_locked: This indicates project-protected profiles.

In this export, gender is a number:

  • 0 = No gender or unknown
  • 1 = Male
  • 2 = Female

For status indicators, the following numbers are used:

  • 1 = "guess": This is commonly used as Uncertain. With dates it means "about" or "approximately". With death dates it could mean that the [exact] date is unknown but the person is non-living.
  • 2 = "certain": With dates it means "exact".
  • 3 = "blank": This means intentionally blank, e.g. blank because still living or for privacy. For middle name it means there is no middle name.
  • 8 = "before": This is only used for dates, i.e. before this date.
  • 9 = "after": This is only used for dates, i.e. after this date.
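The numeric gender and status codes can be decoded with plain lookup tables. This is a minimal Python sketch. The wording of the date prefixes and the helper name `describe_date` are choices made for this example, and any status code not listed above is simply shown without a qualifier.

```python
# Codes as listed for the extended export.
GENDER = {0: "unknown", 1: "male", 2: "female"}
STATUS = {1: "guess", 2: "certain", 3: "blank", 8: "before", 9: "after"}

# Assumption: display wording chosen for this example only.
DATE_PREFIX = {
    "guess": "about ",
    "certain": "",
    "before": "before ",
    "after": "after ",
}


def describe_date(date: str, status_code: int) -> str:
    """Render a date with its status qualifier, or '' if blank."""
    status = STATUS.get(status_code)
    if status == "blank" or not date:
        return ""
    # Unlisted codes fall through to an empty prefix.
    return DATE_PREFIX.get(status, "") + date
```

These tables cover only the general status codes. The mother and father status indicators use their own numbers and are not handled here.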

The mother and father status indicators are special. The numbers for these:

dump_people_marriages_full

This is the same as dump_people_marriages but with the marriage end date and all the certainty status indicators included. The certainty status indicators use the codes described above.



This page was last modified 07:37, 2 August 2017.