Database errors project (5 June 2016)

Privacy Level: Open (White)
Date: 5 Jun 2016 to 12 Jun 2016
Location: Worldwidemap
Surname/tag: data_doctors
Profile manager: Aleš Trtnik

Categories: DD Suggestions.

This page is part of the Data Doctors Project.
Latest report: February 10th 2019 and the Spreadsheet.
Custom reports by: Suggestion lists, Unsourced lists, Unconnected lists.
See WikiTree+ for custom reports and statistics.
Data Doctors Challenge: Dates_VIII .

Analysis was done on data from June 5th 2016.

Related post.

Here are pages of errors lists with basic person data and links to WikiTree.

News

Added Error 211, 311

Added error 211 Duplicate sibling by father and error 311 Duplicate sibling by mother, which list profiles with the same FullName, birth date, death date and one parent. They are similar to error 105 Duplicate sibling.

1125453 Errors Total 0000-0000 0001-1499 1500-1699 1700-1799 1800-1899 1900-1999 2000-Now Now-9999
211 Duplicate sibling by father 4542 20 360 1086 2634 442
311 Duplicate sibling by mother 1644 16 128 378 952 170
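
As a rough illustration of the check described above, here is a minimal sketch that groups profiles by full name, birth date, death date and one parent. The field names (`full_name`, `birth`, `death`, `father`, `mother`) and the sample IDs are hypothetical and are not taken from the actual analysis code.

```python
from collections import defaultdict

def duplicate_siblings(profiles, parent_field):
    """Return groups of profile ids that share full name, birth date,
    death date and the given parent ("father" or "mother")."""
    groups = defaultdict(list)
    for p in profiles:
        parent = p.get(parent_field)
        if parent is None:
            continue  # no parent on this side, so nothing to compare
        key = (p["full_name"], p["birth"], p["death"], parent)
        groups[key].append(p["id"])
    return [ids for ids in groups.values() if len(ids) > 1]

# Hypothetical sample data for illustration only.
profiles = [
    {"id": "Doe-1", "full_name": "John Doe", "birth": "1800-01-01",
     "death": "1870-05-05", "father": "Doe-10", "mother": "Roe-3"},
    {"id": "Doe-2", "full_name": "John Doe", "birth": "1800-01-01",
     "death": "1870-05-05", "father": "Doe-10", "mother": None},
    {"id": "Doe-3", "full_name": "Jane Doe", "birth": "1802-03-03",
     "death": "1880-06-06", "father": "Doe-10", "mother": "Roe-3"},
]

by_father = duplicate_siblings(profiles, "father")  # analogous to error 211
by_mother = duplicate_siblings(profiles, "mother")  # analogous to error 311
```

In this sample, Doe-1 and Doe-2 are grouped by father but not by mother, because Doe-2 has no mother attached.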

Updated Error 511

When checking a name for uniqueness, it is now also checked against last names, so most -son names and last names used as middle names are no longer reported as errors.
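
The idea can be sketched as follows. This is only an assumption about how such a check might work: the function name, the "occurs exactly once" threshold and the whitespace splitting are illustrative choices, not the actual algorithm.

```python
from collections import Counter

def unique_name_parts(first_names, last_names):
    """Return name parts that occur only once among all first-name fields
    and that are not known as a last name either."""
    counts = Counter(part for name in first_names for part in name.split())
    known_last = set(last_names)
    return sorted(part for part, n in counts.items()
                  if n == 1 and part not in known_last)

# Hypothetical sample data for illustration only.
first_names = ["Florence Wythel", "Freas Brown", "Florence Mary",
               "Mary Ann", "Ann Freas"]
last_names = ["Brown", "Saunders"]

flagged = unique_name_parts(first_names, last_names)
```

Here "Brown" occurs only once as a name part but is not flagged because it is also a last name, while "Wythel" is still reported.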

Updated Name analyses

For multi-word names, the name occurrence check now also checks each word separately; see http://www.sdms.si:92/wikitree/ShowFirstNames.htm. An automatic link to this analysis is added to each error.

Updated error lists

The prepared error lists have been reduced in size to show 2000 instead of 5000 errors per page. The sort order of the lists has also changed: open profiles, which are easy to edit, come first, followed by more protected ones.

Updated bogus locations

Errors Total 0000-0000 0001-1499 1500-1699 1700-1799 1800-1899 1900-1999 2000-Now Now-9999
606 Bogus birth location 11 1 2 8
636 Bogus death location 1112 40 221 310 367 172 2
666 Bogus marriage location 1 1

Errors

Analysis was done on data from June 5th 2016.

1171499 Errors Total 0000-0000 0001-1499 1500-1699 1700-1799 1800-1899 1900-1999 2000-Now Now-9999
101 Birth in future 236 236
102 Death in future 272 2 33 229 5 3
103 Death before birth 14087 129 590 1525 6520 5200 81 42
104 Too old 6557 419 1164 1982 2854 133 4 1
105 Duplicate sibling 2365 6 14 250 1342 753
106 Duplicates between global tree and unconnected 2664 20 374 891 1219 160
107 Full name in UPPERCASE 2997 1916 44 58 369 593 17
108 Full name in lowercase 3118 2845 1 6 41 218 7
109 Profile should be open (birth date) 18604 7 403 183 3142 14869
110 Profile should be open (death date) 1571 673 5 70 147 429 245 2
201 Father is self 109 58 10 2 28 11
202 Parents are same 71 6 35 30
203 Father is Female 6243 877 34 304 1161 3420 447
204 Father has no Gender 898 439 2 420 37
205 Father is too young or not born 42799 2035 4585 9274 19190 7607 78 30
206 Father is too old 6577 519 1477 2367 2183 31
207 Father is also a child 364 74 27 34 87 122 20
208 Father is also a spouse 65 9 4 2 5 43 2
209 Father is also a sibling 2828 353 114 253 585 1303 220
210 Father was dead before birth 42179 1675 2672 8429 12419 16104 880
301 Mother is self 3 2 1
303 Mother is Male 6626 672 64 298 1210 3939 443
304 Mother has no Gender 867 752 1 78 36
305 Mother too young or not born 54490 2493 6153 12906 24518 8328 75 17
306 Mother is too old 5564 413 1166 1899 2056 30
307 Mother is also a child 8 1 2 3 2
308 Mother is also a spouse 989 125 62 73 235 477 17
309 Mother is also a sibling 231 40 7 5 27 127 25
310 Mother was dead before birth 48198 2128 1967 10166 14910 18088 937 2
401 Spouse is self 2 1 1
402 Unknown gender of spouse 1681 672 2 4 914 89
403 Single sex marriage 1406 278 82 55 133 722 136
404 Marriage before birth 10925 210 940 2362 5312 2093 5 3
405 Married too old 2750 123 426 764 1437
406 Marriage after death 13268 569 371 2045 3228 6388 667
407 Lived too long after marriage 1420 28 11 138 318 741 184
408 Multiple marriages on same day 9853 247 5 1630 2965 4647 359
409 Marriage to duplicate person 31076 4034 263 4098 7808 13238 1633 2
501 Wrong male gender 8291 1432 6 656 1298 3689 1207 3
502 Missing male gender 73695 37749 12 1011 5320 20624 8947 31 1
503 Probably wrong male gender 6095 1313 59 545 695 2209 1263 10 1
504 Missing probably male gender 35528 24443 8 29 836 5133 5039 39 1
505 Wrong female gender 8987 1408 3 400 1222 4690 1262 2
506 Missing female gender 59846 29756 9 617 4231 18554 6661 17 1
507 Probably wrong female gender 4985 852 20 232 689 2178 1012 2
508 Missing probably female gender 28538 19904 8 140 590 4101 3775 19 1
509 Missing gender 96461 79945 38 409 1442 7812 6760 54 1
510 Unique name without gender 23171 10728 39 177 679 6917 4562 69
511 Unique first name (spelling) 237250 45504 5993 10776 18043 90376 65173 1376 9
512 Separators in first name 68136 11597 361 2528 6537 31042 16027 44
601 Unknown birth location 9026 1691 2 361 1348 4264 1357 3
604 Birth location too short 13787 1798 70 975 2153 7308 1483
605 Number in birth location 1742 975 1 56 164 524 22
606 Bogus birth location 11 1 2 8
631 Unknown death location 16148 1421 773 2568 9202 2184
632 Y death location 5529 60 350 5032 87
634 Death location too short 15752 1024 204 927 2241 10108 1248
635 Number in death location 1045 206 1 125 300 349 64
636 Bogus death location 939 40 173 265 321 140
661 Unknown marriage location 1349 67 65 234 750 233
664 Marriage location too short 2789 214 14 334 579 1414 234
665 Number in marriage location 14 1 1 10 2
901 Unconnected empty public profiles 35194 35194
902 Unconnected empty open profiles 17489 17489

Changes since previous update

Explanation of error changes:

  • The number of person profiles increased by 0.44%, so all errors would be expected to increase by the same proportion, that is, by 5125 errors.
  • Instead, there are 13084 fewer errors.

So by my estimate, 18209 errors were corrected in 7 days.

Note: One correction often fixes multiple errors, because the same problem is reported in different error groups.
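
The estimate above is simply the expected growth plus the observed net decrease. A minimal sketch of that arithmetic, using the figures quoted in the text (the function name is illustrative):

```python
def estimate_corrected(expected_growth, net_decrease):
    """Errors corrected = errors that growth alone would have added
    plus the net drop actually observed."""
    return expected_growth + net_decrease

corrected = estimate_corrected(expected_growth=5125, net_decrease=13084)
print(corrected)  # 18209
```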

1.5. 11.5. 15.5. 22.5. 29.5. Projected 5.6. Reduction Delta%
Profiles 11184648 11206469 11262275 11311734 11361166 100,437%
Locations 10707022 10732026 10798925 10853793 10910167 100,519%
Father 6015213 6045514 6072066 6098843 100,441%
Mother 5661601 5689774 5714594 5739296 100,432%
Marriages 2564629 2577987 2588736 2600127 100,440%
101 Birth in future 343 312 308 253 242 243 236 7 2,90%
102 Death in future 370 343 331 313 307 308 272 36 11,79%
103 Death before birth 13139 13111 13080 13022 14153 14215 14087 128 0,90%
104 Too old 7021 7036 7001 6968 6621 6650 6557 93 1,40%
105 Duplicate sibling 4711 3892 3607 3026 2687 2699 2365 334 12,37%
106 Duplicates between bigtree and unconnected 3253 3293 3245 2959 2789 2801 2664 137 4,90%
107 Full name in UPPERCASE 3153 3136 3088 3101 2997 104 3,37%
108 Full name in lowercase 3207 3193 3193 3207 3118 89 2,77%
109 Profile should be open (birth date) 11667 11439 18738 18820 18604 216 1,15%
110 Profile should be open (death date) 1516 1512 1664 1671 1571 100 6,00%
201 Father is self 251 240 121 114 112 112 109 3 3,11%
202 Parents are same 224 221 193 98 79 79 71 8 10,52%
203 Father is Female 6167 6244 6257 6253 6376 6404 6243 161 2,52%
204 Father has no Gender 2159 1689 1175 1026 954 958 898 60 6,28%
205 Father is too young or not born 48551 48867 48694 48607 42740 42928 42799 129 0,30%
206 Father is too old 6952 6955 6928 6789 6648 6677 6577 100 1,50%
207 Father is also a child 510 502 393 378 375 377 364 13 3,36%
208 Father is also a spouse 241 234 232 216 208 209 65 144 68,89%
209 Father is also a sibling 3527 3512 3236 3078 2967 2980 2828 152 5,10%
210 Father was dead before birth 32482 32559 32505 32506 42118 42304 42179 125 0,29%
301 Mother is self 10 6 5 5 8 8 3 5 62,66%
303 Mother is Male 8321 7931 7880 7798 7333 7365 6626 739 10,03%
304 Mother has no Gender 2101 1856 1715 1540 1074 1079 867 212 19,62%
305 Mother too young or not born 65178 65596 65535 65559 54347 54582 54490 92 0,17%
306 Mother is too old 5822 5817 5783 5716 5606 5630 5564 66 1,18%
307 Mother is also a child 35 34 13 11 12 12 8 4 33,62%
308 Mother is also a spouse 1566 1578 1516 1322 1186 1191 989 202 16,97%
309 Mother is also a sibling 373 364 362 356 351 353 231 122 34,47%
310 Mother was dead before birth 31202 31224 31153 31067 48175 48383 48198 185 0,38%
401 Spouse is self 4 3 3 3 2 2 2 0 0,44%
402 Unknown gender of spouse 2990 2538 2386 2068 1742 1750 1681 69 3,92%
403 Single sex marriage 4671 4001 3998 3502 2156 2165 1406 759 35,07%
404 Marriage before birth 10937 10704 10634 10526 11152 11201 10925 276 2,46%
405 Married too old 2857 2871 2847 2812 2760 2772 2750 22 0,80%
406 Marriage after death 12580 12602 12556 12506 13271 13329 13268 61 0,46%
407 Death too old after Marriage 2027 1818 1769 1659 1583 1590 1420 170 10,69%
408 Multiple marriages on same day 10234 10198 10011 10055 9853 202 2,01%
409 Marriage to duplicate person 31870 31772 31338 31476 31076 400 1,27%
501 Wrong male gender 7130 7012 9064 8813 8515 8552 8291 261 3,05%
502 Missing male gender 53397 53276 75402 74800 74330 74655 73695 960 1,29%
503 Probably wrong male gender 8380 8349 6237 6147 6389 6417 6095 322 5,02%
504 Probably missing male gender 56357 56486 36129 35644 35693 35849 35528 321 0,90%
505 Wrong female gender 9072 8717 10479 10295 9699 9741 8987 754 7,74%
506 Missing female gender 51058 51119 60682 60375 59697 59958 59846 112 0,19%
507 Probably wrong female gender 7027 6946 5389 5260 5575 5599 4985 614 10,97%
508 Probably missing female gender 37889 37983 29462 29382 30243 30375 28538 1837 6,05%
509 Missing gender 97415 97714 95747 96390 95072 95487 96461 -974 -1,02%
510 Unique name without gender 24792 24854 23654 23454 23373 23475 23171 304 1,30%
511 Unique name (spelling) 476223 346225 283116 293826 295110 293941 1169 0,40%
512 Separators in first name 68680 68716 68383 68177 68475 68136 339 0,49%
601 Unknown birth location 9291 9343 9366 9392 9003 9050 9026 24 0,26%
604 Birth location too short 16791 13986 13796 13868 13787 81 0,58%
605 Number in birth location 1933 1764 1773 1742 31 1,76%
606 Bogus birth location 58 58 11 47 81,13%
631 Unknown death location 16230 16454 16505 16509 16198 16282 16148 134 0,82%
632 Y death location 6542 6534 6736 6355 5662 5691 5529 162 2,85%
634 Death location too short 18242 17071 16033 16116 15752 364 2,26%
635 Number in death location 1625 1239 1245 1045 200 16,09%
636 Bogus death location 980 985 939 46 4,68%
661 Unknown marriage location 1328 1350 1346 1348 1350 1357 1349 8 0,59%
664 Marriage location too short 3246 3020 2802 2817 2789 28 0,98%
665 Number in marriage location 228 11 11 14 -3 -26,62%
901 Unconnected empty public profiles 35473 35433 35418 35346 35432 35587 35194 393 1,10%
902 Unconnected empty open profiles 17242 17221 17172 17136 17235 17310 17489 -179 -1,03%
Total 719198 1261647 1229116 1159314 1180318 1185533,466 1171499 13084 1,10%
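
The Projected, Reduction and Delta% columns are not explained on the page. Judging from rows such as 101 and 201, they appear to be computed as in the sketch below: the 29.5. count is scaled by a growth factor (the Profiles factor of 100,437% for row 101, the Father factor of 100,441% for row 201), and the reduction is measured against that projection. This is an inference from the numbers, not a documented formula, and the function name is illustrative.

```python
def project_row(previous, growth_pct, actual):
    """Return (projected, reduction, delta_pct) for one table row.

    previous   -- count on the previous report date (29.5.)
    growth_pct -- growth factor in percent, e.g. 100.437
    actual     -- count on the current report date (5.6.)
    """
    projected = previous * growth_pct / 100.0
    reduction = projected - actual
    delta_pct = 100.0 * reduction / projected
    return projected, reduction, delta_pct

# Row 101 Birth in future: 242 on 29.5., 236 on 5.6., Profiles factor.
p101, r101, d101 = project_row(242, 100.437, 236)

# Row 201 Father is self: 112 on 29.5., 109 on 5.6., Father factor.
p201, r201, d201 = project_row(112, 100.441, 109)
```

Rounded, these reproduce the table values for both rows: 243, 7 and 2,90% for row 101, and 112, 3 and 3,11% for row 201.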






Collaboration

On 11 Jun 2016 at 11:13 GMT Marty (Lenover) Acks wrote:

636 bogus death location largely completed - all dates that were open (Privacy 60) , left a few where there was different info in location and death date

On 10 Jun 2016 at 00:50 GMT Abby (Brown) Glann wrote:

Working Error 109

On 9 Jun 2016 at 03:22 GMT Paul Gierszewski wrote:

407: fixed or messaged through 0000, 0001-1499 and 1500-1699, with contributions from several people.

On 8 Jun 2016 at 15:37 GMT Aleš Trtnik wrote:

Pat, ask such questions in G2G.

Wythel is unique as a name part. If it appeared a few times, it wouldn't be an error.

I might also correct the algorithm to allow last names as middle names, since your case is quite common.

On 8 Jun 2016 at 14:28 GMT Pat D Saunders wrote:

511 error: many of my ancestors were given first and middle names that were the first and surnames of someone else. Freas Brown Saunders and Florence Wythel Saunders are examples of such names. It is valid for these people to have middle names that are not common first names. I hope that these names will not be flagged as errors once the middle names are moved to the middle name fields.

On 8 Jun 2016 at 13:37 GMT Pat D Saunders wrote:

511 error: most of these have a middle name or initial after the first space. I would suggest using a different error description for these.

Most 511s on my list had unusual but valid first names. A few were indicating multiple spellings of one name. "Maria(h)" is one example. I would not error flag any name that is all alphabetic; there are just too many possibilities in a world that speaks many languages.

You have taken on an enormous task in trying to find invalid names. I used to write programs that took full names and tried to put them into first name and last name fields. I was amazed at some of the results that I got.

On 7 Jun 2016 at 23:22 GMT Mikey Anonymous wrote:

102: finished up for 1800-1899 public (Marty already did some)

605: 1500-1699 done

661: 1500-1699 done

664: 1500-1699 through "?" (Askildsen Duasen-1), skip few just above for same PM

On 7 Jun 2016 at 13:25 GMT Aleš Trtnik wrote:

502: missing male gender, 1500-1699 Completed

On 7 Jun 2016 at 12:24 GMT Nan Lambert wrote:

Continuing work on 502: missing male gender, 1700-1799.

On 6 Jun 2016 at 18:54 GMT Aleš Trtnik wrote:

@ B. W. J. Molier

I don't think this can be an error or even a warning. A tree usually ends where you run out of information. To see what you want, you can use the dynamic tree, which shows where one parent is missing.

BTW: Ask for such things on G2G.
