Project: Data Doctors

Categories: Profile Improvement Project | Functional Projects | DBE Error


This project is intended to improve data in the WikiTree database. It is a subproject of Project: Profile Improvement.

This page is part of the Data Doctors Project.
Latest error report: June 25th 2017.
Errors By: Custom lists.
See WikiTree+ for custom reports and statistics.
Data Doctors Challenge: Summer Vacation.


Mission

The mission of this project is to help WikiTree members create more accurate profiles, which leads to better overall genealogy. Good genealogy is based on sources, so all changes should be sourced.

If we have sources with different facts, then we need to add our analysis so that other WikiTree users can understand our reasoning. This is usually done in a Research Notes section.

Some errors are definite errors, e.g. a mother can’t be born after her child. Other errors mean that a WikiTree profile conflicts with another source. We need to resolve the conflict and correlate the facts/sources.

Never hesitate to ask in G2G. WikiTree is a worldwide genealogy community and we all have our special knowledge.

Data Doctors Challenge

Each week a theme is selected for the Data Doctors weekly challenge. Click here to join our current challenge!

Goals

The goal of this project is to make members aware of and to correct ERRORS, review WARNINGS and investigate HINTS in the WikiTree database.

Most errors are the result of typos or imports of GEDCOMs. Also, beginners can make mistakes that are not easily seen.

Warnings are produced by uncommon data. It may be a typo or it may be unique information.

Hints are the result of inconsistencies in linking to external databases. A hint can be caused by a typo, an error in what we have linked to the profile, or an error in the external database, and it needs investigating.

How to Join

Leader:

Project Coordinators:

Here you can see tag followers and badge holders.

If you would like to be involved in the Database Errors Project please:

  • Add DATA_DOCTORS to your list of followed tags. That way you'll see discussions in your G2G Feed.
  • To get the Data Doctor badge, answer in this thread in G2G and the badge will be awarded to you. By receiving this badge, you are allowed to send more than 20 private messages per day. But you must be aware of the warnings here: Daily limits on messages - Exceptions to the limits. It's important that nobody posts/sends too many similar-looking messages.
  • You will also be invited to the Google group where most of the discussions among Data Doctors take place.

How to Get Started

  • First, read the project documentation below.
  • Second, check and correct errors on your managed profiles to learn the process. (My WikiTree / Error Report or Click here).
  • Next, check and correct errors on your extended family tree to learn the process. (On starting profile Second menu / Error report or Click here).
  • Optionally, check and correct errors on your watchlist. (Click here. It redirects you to the apps server, where you have to log in with your WikiTree credentials).
  • Then you might correct errors by type. Check out the Latest errors list to find error types you can correct. Read the help information on that error to familiarize yourself with specifics and then start correcting. If you are working on the complete list of an error in a specific era add a comment on the Latest errors page so that others will know that error is already being checked. (June 25th 2017)
  • You can also use prepared error lists by country, US states, projects, ... (Custom lists)
  • Or you can correct errors by location (your town, region, country ...) (Click here).
  • Or you can correct errors by person’s names (first or last names). (From a last name page on WikiTree, click Error report, or define a custom search: Click here).

Data Doctors need to become familiar with all error types, since there can be more than one error on a profile and each can give you a clue to finding the main problem. Then all the errors can be corrected at once. But you may choose to specialize in one error type.

How to Get Help

  • If the question is about a specific error, read the page for that error. Links to documentation of errors are at the bottom of this page.
  • Search for related subjects in G2G forum - Tag DATA_DOCTORS.
  • Ask in the G2G forum and add the DATA_DOCTORS tag.


Code of Conduct When Fixing Database Errors

  • If you are working on a particular error from the error sheet, add a note on the Latest errors page so no one else will also look at the same list for that week.
  • Click the status button from the error list to see if an error has already been corrected.
  • If no previous status has been marked, look at the profile for any previous comments that other Data Doctors may have already left so you do not leave a duplicate message. An exception to this is if you have new information (such as a new source) that would help the profile manager to better edit an error.
  • DO NOT remove information from a profile. In most cases, information is simply moved to the correct field or to the biography.
  • If you do make changes such as changing an unknown birth location to a known place, make sure to add the source that validates the change.
  • When errors are corrected, always leave a comment on the Error Status page for anyone who may come after you. Don’t forget to click save.
  • As a courtesy, please write a short change comment in the Explain Your Changes box (right above the profile’s Save button) for each change you make. An example might be: “Edited number in first name-DB error 712.” Try not to use the word “remove” or anything similar. This upsets profile managers.
  • When advising another member on correcting errors, be clear and friendly. Make suggestions not demands. Errors happen to everyone! Word your messages in such a way to make the profile managers feel empowered rather than as if they are being sent to the naughty corner.
  • On open profiles, you do not need to contact the profile managers for every error correction. The change log (under each profile’s “Changes” tab) will show every change that has been made. The exception to this is if you plan to make a major change based on primary sources. Then, it is a courtesy to contact the profile manager through a private message before making any changes.

Links

  • See WikiTree+ for custom reports and statistics.

For older error lists see:

Templates

There is also a template {{db_errors}} that you can put on a profile to get a link to the errors for that profile and the profiles connected to it. Check the documentation for this template.

  • {{db_errors}} ==> Generates a link to a report covering 5 generations from the current profile. This form can be used only in a biography, not in comments.
  • {{db_errors|10}} ==> Same as the first form but for 10 generations. This form can be used only in a biography, not in comments.
  • {{db_errors|10|Sälgö-2}} ==> Same as 2 but starts with Wikiprofile Sälgö-2. This form can be used in comments, freespace pages and everywhere else on WikiTree.
  • {{db_errors|Generations=10|WikiTreeID=Sälgö-3}} ==> This form can be used in comments, freespace pages and everywhere else on WikiTree.
  • {{db_errors|10|Sälgö-1|Y}} ==> Third parameter adds more help text. This form can be used in comments, freespace pages and everywhere else on WikiTree.
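
The positional forms listed above can be put together mechanically. Here is a minimal Python sketch that builds the template text. The function name db_errors_template and its parameters are hypothetical helpers for illustration only and are not part of WikiTree. Falling back to 5 generations when only an ID is given is an assumption based on the bare {{db_errors}} form.

```python
from typing import Optional


def db_errors_template(generations: Optional[int] = None,
                       wikitree_id: Optional[str] = None,
                       extra_help: bool = False) -> str:
    """Build a {{db_errors}} call in the positional forms listed above.

    Hypothetical helper: only the template syntax comes from this page.
    """
    if extra_help and not wikitree_id:
        # In the listed forms the "Y" flag always follows a WikiTree ID.
        raise ValueError("the help-text flag follows the WikiTree ID parameter")
    if generations is None and wikitree_id:
        # Assumption: positional parameters cannot be skipped, so use the
        # 5 generations that the bare template is described as covering.
        generations = 5
    parts = ["db_errors"]
    if generations is not None:
        parts.append(str(generations))
    if wikitree_id:
        parts.append(wikitree_id)
    if extra_help:
        parts.append("Y")
    return "{{" + "|".join(parts) + "}}"


print(db_errors_template(10, "Sälgö-2"))
```

Remember that only the forms with a WikiTree ID can be used outside a biography.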

Special Situations

  • Project Protected Profiles: These will have a box at the top right that says PROJECT PROTECTED. They might have a green privacy level regardless of time period. Only the manager(s) of the profile, Project Coordinators or Leaders may edit these profiles.

Pre-1700 Profiles

  • Pre-1700 means 1500-1699. You must be Pre-1700 certified to edit these profiles. If you have been with WikiTree long enough to know your way around well, please consider getting the Pre-1700 certification in order to work on errors in this time period. Otherwise, leave these errors to Data Doctors who are Pre-1700 certified.

Pre-1500 Profiles

  • Pre-1500 means all profiles dated before 1500. You must be Pre-1500 certified to edit these profiles. However, this certification takes more work to acquire.

If you have a special interest in a particular profile (it is in your family tree, for instance) and you feel an error needs attention, you may post in G2G asking for a Pre-1500 certified member to correct the error on your behalf. Post a very clear message about what needs to be fixed, why, and the source of the information to back up the change. Make sure to add the tag PRE-1500. You can also add DATA_DOCTORS, so we know the origin of corrections. Those with Pre-1500 badges will work these items. Otherwise, leave these errors to Data Doctors who are Pre-1500 certified.

  • Private and public profiles (Level Green to Red): Data Doctors cannot directly correct errors on private profiles. However, there are a few steps that can be taken to help the health of a profile.
    • If the profile should be open due to the person’s age (anyone born 150 or more years ago or who died 100 or more years ago), go to the second tab at the top right of the profile page where it shows the profile’s Wiki ID. In the drop down, go down to the choice that says, “Open Profile Request.” You will be asked to confirm the age is old enough to require it being opened. Also, leaving a short note of evidence (such as birth date is 1830 or child was born in 1797) will help the manager make a faster decision to open the profile. You will receive an email when the profile has been opened. You can then go back to correct your error. Leave a comment on the Error Status Page about what you did. This will alert others who might come along that a second request is not needed.
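
The age rule for an Open Profile Request can be written as a small check. This is only a sketch in Python, assuming that comparing years is precise enough. The function name should_be_open is hypothetical, and the two limits are the ones quoted above.

```python
from datetime import date
from typing import Optional

BIRTH_LIMIT_YEARS = 150  # from the text: born 150 years ago
DEATH_LIMIT_YEARS = 100  # from the text: died 100 years ago


def should_be_open(birth_year: Optional[int],
                   death_year: Optional[int],
                   today: Optional[date] = None) -> bool:
    """Return True when either known year reaches its limit.

    Hypothetical helper; it compares years only, which is a simplification.
    """
    current_year = (today or date.today()).year
    if birth_year is not None and current_year - birth_year >= BIRTH_LIMIT_YEARS:
        return True
    if death_year is not None and current_year - death_year >= DEATH_LIMIT_YEARS:
        return True
    return False


# The evidence example from the text: a birth date of 1830.
print(should_be_open(1830, None, today=date(2017, 6, 25)))
```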

Description of Errors

Status

Setting the status of an error is described on the Error status Help page.

100 Person

  • 101 Birth in future: This one is obvious. We are not fortune tellers. It is probably a typo in the birth date. It is checked on all profiles with a date.
  • 102 Death in future: This one is obvious. We are not fortune tellers. It is probably a typo in the death date. It is checked on all profiles with a date.
  • 103 Death before birth: The death date is before the birth date. It is probably a typo in the birth date or death date. It is checked on all profiles with both dates. For now a one-year gap is allowed to handle dates without a month and day.
  • 104 Too old: The person is too old. At the moment the maximum age is 115 years, and it will be lowered as current errors are corrected. It is probably a typo in the birth date or death date. It is checked on all profiles with both dates.
  • 105 Duplicate sibling: These are profiles that have a sibling with the same full name, birth and death dates, and both parents. They are probably duplicates and need to be merged. If they are not, you can mark the error as a False Error, and you need to do it on both siblings. The required similarity will be reduced as current errors are corrected.
  • 106 Duplicates between global tree and unconnected: These are profiles that have the same full name and birth and death dates and are not connected in any tree. Orphan profiles are ignored. These are probably duplicates and need to be merged, and with this action an unconnected tree is connected to the global tree. Some of the profiles may already be connected, because my global tree is smaller due to connections through private profiles.
  • 107 Full name in UPPERCASE: These are profiles that have the whole full name in uppercase.
  • 108 Full name in lowercase: These are profiles that have the whole full name in lowercase.
  • 109 Profile should be open (birth date): These are profiles that should be open, since the birth date is more than 200 years ago or the birth date is wrong.
  • 110 Profile should be open (death date): These are profiles that should be open, since the death date is more than 200 years ago or the death date is wrong.
  • 111 Died too young to be parent: These are profiles of people who died under 10 years old and have children without a birth date.
  • 112 Person is father and mother: These are profiles that are father to some children and mother to others.
  • 113 Duplicate in relatives:
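
The date checks 101 to 104 are simple comparisons. Here is a minimal Python sketch of them. The function name person_date_errors is hypothetical, and reading the "one year gap" for 103 as a comparison of years only is an assumption. The age for 104 is also taken from the difference in years, so this is not the exact calculation used in the error report.

```python
from datetime import date
from typing import List, Optional

MAX_AGE_YEARS = 115  # limit quoted for error 104


def person_date_errors(birth: Optional[date],
                       death: Optional[date],
                       today: date) -> List[int]:
    """Return the 100-series error numbers triggered by the two dates."""
    errors = []
    if birth is not None and birth > today:
        errors.append(101)  # birth in future
    if death is not None and death > today:
        errors.append(102)  # death in future
    if birth is not None and death is not None:
        # Assumption: the "one year gap" means that only the years are
        # compared, so dates without month and day do not raise 103.
        if death.year < birth.year:
            errors.append(103)  # death before birth
        elif death.year - birth.year > MAX_AGE_YEARS:
            errors.append(104)  # too old
    return errors


print(person_date_errors(date(1850, 5, 1), date(1840, 1, 1), date(2017, 6, 25)))
```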

200 Father

  • 201 Father is self: This person is their own parent. The parent should be deleted or replaced with the correct one.
  • 202 Parents are same: This person's mother and father are the same person. One parent should be deleted or replaced with the correct one.
  • 203 Father is Female: This means that the left person is defined as the father of the right person. There are two possible errors: the left person has the wrong gender, or the right person has swapped parents (father should be mother).
  • 204 Father has no Gender: This person's father doesn't have a gender. Set the parent's gender.
  • 205 Father is too young or not born: This person's father was too young, or not yet born, to be the parent, so probably one birth date is wrong. The limit is set at 10 years. Correct the birth date.
  • 206 Father is too old: This person's father was too old to be the parent, so probably one birth date is wrong. The limit is set at 99 years. Correct the birth date.
  • 207 Father is also a child: This person's father is also his/her child. He cannot be both. One relation should be deleted or replaced with the correct one.
  • 208 Father is also a spouse: This person's father is also her husband. This is rarely true. One relation should be deleted or replaced with the correct one.
  • 209 Father is also a sibling: This person's father is also his/her sibling. If there is no 201 error, the problem is in the mother's children. The mother's children should be corrected.
  • 210 Father was dead before birth: This person's father died before the birth, so probably the birth date or the father's death date is wrong. Correct the wrong date.
  • 211 Duplicate sibling by Father: There is a profile with the same full name, birth and death dates, and the same father. The mother is different. These two profiles are probably duplicates and need to be merged.
  • 212 Profile should be open (Child birth date): These are similar to 109 and 110, but are identified for profiles with no birth or death date whose children were born more than 200 years ago.
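
The date-based father checks (205, 206 and 210) come down to an age calculation and one comparison. Here is a minimal Python sketch, assuming full dates are known. The function names are hypothetical. Whether exactly 99 years counts as too old is not stated on this page, so treating only ages above 99 as an error is an assumption.

```python
from datetime import date
from typing import List, Optional

MIN_PARENT_AGE = 10  # limit quoted for error 205
MAX_PARENT_AGE = 99  # limit quoted for error 206


def age_on(born: date, on: date) -> int:
    """Whole years from `born` to `on` (negative if `on` is earlier)."""
    years = on.year - born.year
    if (on.month, on.day) < (born.month, born.day):
        years -= 1
    return years


def father_date_errors(father_birth: Optional[date],
                       father_death: Optional[date],
                       child_birth: Optional[date]) -> List[int]:
    """Return the date-based 200-series errors for one father and child."""
    errors = []
    if child_birth is None:
        return errors  # nothing to compare against
    if father_birth is not None:
        age = age_on(father_birth, child_birth)
        if age < MIN_PARENT_AGE:
            errors.append(205)  # too young or not born
        elif age > MAX_PARENT_AGE:
            errors.append(206)  # too old
    if father_death is not None and father_death < child_birth:
        errors.append(210)  # dead before birth
    return errors


print(father_date_errors(date(1800, 1, 1), date(1820, 1, 1), date(1830, 1, 1)))
```

The mother checks 305 and 306 quote the same limits, so the same comparison would apply there with the 300-series numbers.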

300 Mother

  • 301 Mother is self: This person is their own parent. The parent should be deleted or replaced with the correct one.
  • 303 Mother is Male: This means that the left person is defined as the mother of the right person. There are two possible errors: the left person has the wrong gender, or the right person has swapped parents (mother should be father). The same goes for error 203.
  • 304 Mother has no Gender: This person's mother doesn't have a gender. Set the parent's gender.
  • 305 Mother is too young or not born: This person's mother was too young, or not yet born, to be the parent, so probably one birth date is wrong. The limit is set at 10 years. Correct the birth date.
  • 306 Mother is too old: This person's mother was too old to be the parent, so probably one birth date is wrong. The limit is set at 99 years. Correct the birth date.
  • 307 Mother is also a child: This person's mother is also his/her child. She cannot be both. One relation should be deleted or replaced with the correct one.
  • 308 Mother is also a spouse: This person's mother is also his wife. This is rarely true. One relation should be deleted or replaced with the correct one.
  • 309 Mother is also a sibling: This person's mother is also his/her sibling. If there is no 301 error, the problem is in the father's children. The father's children should be corrected.
  • 310 Mother was dead before birth: This person's mother died before the birth, so probably the birth date or the mother's death date is wrong. Correct the wrong date.
  • 311 Duplicate sibling by Mother: There is a profile with the same full name, birth and death dates, and the same mother. The father is different. These two profiles are probably duplicates and need to be merged.
  • 312 Profile should be open (Child birth date): These are similar to 109 and 110, but are identified for profiles with no birth or death date whose children were born more than 200 years ago.
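
The relationship checks 301, 307, 308 and 309 only ask whether the mother also appears in another role for the same person. Here is a minimal Python sketch of that idea. The Profile record, its field names and the IDs in the example are hypothetical and do not reflect how WikiTree stores profiles.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Profile:
    """Hypothetical minimal record; field names are illustrative only."""
    wikitree_id: str
    mother: Optional[str] = None
    children: Set[str] = field(default_factory=set)
    spouses: Set[str] = field(default_factory=set)
    siblings: Set[str] = field(default_factory=set)


def mother_relation_errors(p: Profile) -> List[int]:
    """Return the relationship-based 300-series errors for one profile."""
    errors = []
    if p.mother is None:
        return errors
    if p.mother == p.wikitree_id:
        errors.append(301)  # mother is self
    if p.mother in p.children:
        errors.append(307)  # mother is also a child
    if p.mother in p.spouses:
        errors.append(308)  # mother is also a spouse
    if p.mother in p.siblings:
        errors.append(309)  # mother is also a sibling
    return errors


example = Profile("Example-1", mother="Example-2", children={"Example-2"})
print(mother_relation_errors(example))
```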

400 Marriage

500 Name / Gender

  • 501 Wrong male gender: Person with this name should be male, but is defined as female. So probably gender is wrong or name is incorrect. Correct gender or name.
  • 502 Missing male gender: Person has no gender defined and according to name should be male. Enter gender.
  • 503 Probably wrong male gender: Person with this name should statistically be male, but is defined as female. So probably gender is wrong or name is incorrect. Correct gender or name.
  • 504 Missing probably male gender: Person has no gender defined and according to name is probably male. Enter gender.
  • 505 Wrong female gender: Person with this name should be female, but is defined as male. So probably gender is wrong or name is incorrect. Correct gender or name.
  • 506 Missing female gender: Person has no gender defined and according to name should be female. Enter gender.
  • 507 Probably wrong female gender: Person with this name should statistically be female, but is defined as male. So probably gender is wrong or name is incorrect. Correct gender or name.
  • 508 Missing probably female gender: Person has no gender defined and according to name is probably female. Enter gender.
  • 509 Missing gender: Person has no gender defined and it cannot be derived from the name. Enter gender.
  • 510 Unique name without gender: Person has no gender defined and has a unique name, so the gender cannot be derived from the name. Enter gender and possibly correct the name.
  • 511 Unique name (spelling): These are names that appear only once in the database. They are possibly misspelled.
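
Error 511 is a counting exercise: a name that occurs only once is a candidate for a misspelling. Here is a minimal Python sketch, assuming the names are available as a plain list. The function name unique_names is hypothetical, and the comparison is case-sensitive, which may differ from the real report.

```python
from collections import Counter
from typing import Iterable, List


def unique_names(first_names: Iterable[str]) -> List[str]:
    """Return the names that occur exactly once, in first-seen order."""
    names = [n.strip() for n in first_names if n and n.strip()]
    counts = Counter(names)
    return [n for n in names if counts[n] == 1]


# "Jhon" appears once and is reported as a possible misspelling.
print(unique_names(["John", "Mary", "Jhon", "John", "Mary"]))
```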

550 Wikidata

Some WikiTree profiles are matched with a profile in WikiData; see Space:Wikidata.

570 FindAGrave

Some WikiTree profiles are referencing memorials on findagrave.com.

600 Location

Location errors are split into 3 groups as follows: 601-630 Birth location, 631-660 Death location, 661-690 Marriage location. Some errors can be defined by users; see Space:Database_Errors_Definition.
  • 601, 631, 661 Wrong word in location: The text is not a location. If the location is not known, the location field should be empty.
  • 602, 632, 662 "Y" location: Y is not a location. I think these locations were part of GEDCOM imports (maybe some error in GEDCOM format) and were never corrected. "Yes" is also checked.
  • 603, 633, 663 USA used too early: USA is used before the country existed. The old name should be used.
  • 604, 634, 664 Too short location: Short locations are not allowed, since they can be ambiguous. Also, people from other parts of the world don't understand them. For now the minimum length is 4 characters, with exceptions like USA and UK. American states should be at least in the form PA, USA, which is longer than 4 letters.
  • 605, 635, 665 Number in location: The location contains only a number. It is often a date entered in the wrong field.
  • 606, 636, 666 Bogus location: This location was inserted by the autocomplete feature of some software or websites.
  • 607, 637, 667 Misspelled word:
  • 608, 638, 668 Misspelled country:
  • 609, 639, 669 Wrong character:
  • 610, 640, 670 Location in UPPERCASE:
  • 611, 641, 671 Location in lowercase:
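
The numbering of location errors follows a fixed pattern: the death location number is the birth location number plus 30, and the marriage location number is the birth location number plus 60. Here is a minimal Python sketch of three of the checks (602, 604 and 605) using that pattern. The function name location_errors and the field labels are hypothetical, and the list of short exceptions holds only the two examples named above.

```python
from typing import List

FIELD_OFFSET = {"birth": 0, "death": 30, "marriage": 60}
MIN_LENGTH = 4                    # minimum length quoted for error 604
SHORT_EXCEPTIONS = {"USA", "UK"}  # only the exceptions named in the text


def location_errors(location: str, field: str) -> List[int]:
    """Return the error number for one location field, if any.

    Hypothetical helper; covers only the "Y", too short and number checks.
    """
    offset = FIELD_OFFSET[field]
    text = location.strip()
    errors = []
    if not text:
        return errors  # an empty field is fine when the place is unknown
    if text.lower() in {"y", "yes"}:
        errors.append(602 + offset)  # "Y" location
    elif text.isdigit():
        errors.append(605 + offset)  # number in location
    elif len(text) < MIN_LENGTH and text not in SHORT_EXCEPTIONS:
        errors.append(604 + offset)  # too short location
    return errors


print(location_errors("yes", "death"))
```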

700 Name Errors

800 Biography

900

  • 901 unconnected empty public profiles: I added this error to find empty unlinked profiles. That means the profile has no relations (parents, children, marriage) and no birth or death data (neither date nor location) and is public. This was added at Jillaine Smith's request.
  • 902 unconnected empty open profiles: I added this error to find empty unlinked profiles. That means the profile has no relations (parents, children, marriage) and no birth or death data (neither date nor location) and is open. This was added at Jillaine Smith's request.

910 Sweden Specific

ToDo

  • Merging tool that would also compare relatives. Done.
  • Create completeness scoring of the profile.
  • Add new errors
    • Problems with Unicode characters in some GEDCOM imports. Done.
    • 104: At the moment max age is 110 years and will be lowered as current errors are corrected.
    • 105: Reduce similarity as current errors are corrected.
    • 1xx: Find duplicates where S=Š or ss=ß or A=Å.
    • 205, 305: Do not allow false errors. Calculations are now exact, except for Before and After qualifiers.
    • 400: Mclean-3147 suggested finding all partners with the same LNAB. Generally they shouldn't be the same.
    • 600 Locations
      • Look into what counts as a correct location field.
    • 900: Any profile field empty.


This page was last modified 10:40, 27 June 2017.