Database errors project (July 31 2016)

Privacy Level: Open (White)
Date: 31 Jul 2016 to 7 Aug 2016
Location: Worldwide
Surname/tag: data_doctors
Profile manager: Aleš Trtnik

Categories: DD Suggestions.

This page is part of the Data Doctors Project.
Latest report: February 17th 2019 and the Spreadsheet.
Custom reports by: Suggestion lists, Unsourced lists, Unconnected lists.
See WikiTree+ for custom reports and statistics.
Data Doctors Challenge: Names_VIII.
Category suggestion report: Latest report

Analysis was done on data from July 31st 2016.

Related post.

Below are the error lists, with basic person data and links to WikiTree.

News

Custom checking and correcting

You can check the spelling of any word at http://wikitree.sdms.si/default.htm, in the Analyse group under the Location spelling item. There you can also manually check any location spelling, view the misspelled profiles, and correct them.

Updated errors 607, 637, 667 Location spelling

I added spelling verification for a few locations and will add more over time. For now I check Massachusetts, Leicestershire, London and England. I added all words that appear more than 10000 times and are longer than 9 letters. Shorter words have more exceptions, so I will add them on request. Please check the error lists for actual locations that are flagged by mistake; I am sure I missed some.
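The selection rule (words seen more than 10000 times and longer than 9 letters) could be sketched roughly as follows. This is only an illustration, not the actual WikiTree+ code: the function name `candidate_words`, the tokenization and the sample data are all assumptions.

```python
import re
from collections import Counter

def candidate_words(locations, min_count=10000, min_length=9):
    """Return words that occur more than min_count times and are longer than min_length letters."""
    counts = Counter()
    for location in locations:
        # Split on anything that is not a letter; real location fields may need smarter tokenizing.
        counts.update(re.findall(r"[^\W\d_]+", location))
    return sorted(
        word for word, n in counts.items()
        if n > min_count and len(word) > min_length
    )

# Tiny demonstration with a lowered count threshold (hypothetical data).
sample = ["Boston, Massachusetts", "Salem, Massachusetts", "London, England"]
print(candidate_words(sample, min_count=1))  # ['Massachusetts']
```

Shorter words such as London would not pass the length filter in this sketch and would have to be added by hand.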

You can find the correct spelling of checked words here. There you can also add new locations to be checked. Each added word must be checked against actual locations. London has exceptions for Lyndon, Loudon, Loddon and Longdon because they are real places.
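A spelling check with exceptions might look like the sketch below. The page does not say how near-misses are detected, so the use of edit distance, the threshold of 1, and the names `CHECKED_WORDS` and `suspected_misspelling` are assumptions; only the London exceptions come from the text above.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]

# Exceptions taken from the text above; the dictionary layout is hypothetical.
CHECKED_WORDS = {
    "London": {"Lyndon", "Loudon", "Loddon", "Longdon"},
}

def suspected_misspelling(word, max_distance=1):
    """Return the checked word that `word` probably misspells, or None."""
    for correct, exceptions in CHECKED_WORDS.items():
        if word == correct or word in exceptions:
            continue
        if edit_distance(word.lower(), correct.lower()) <= max_distance:
            return correct
    return None

print(suspected_misspelling("Londen"))  # London
print(suspected_misspelling("Loudon"))  # None
```

Under these assumptions, any real place that is one letter away from a checked word keeps being flagged until it is added to the exception set, which is why each new word needs its exceptions reviewed.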

Last week's news

New errors column

I added a new column to the table below, showing only the errors that appeared in the last week. With the new data validation on input, these numbers will drop significantly.

New errors count

I added the first occurrence for each error. That lets me see the new errors each week due to data growth. The number is exact for the last two weeks. Older errors cannot be compared directly due to changes in the algorithms. My estimate in previous updates was right on target: approximately 10000 new errors due to data growth. I expect this number to drop drastically with the new data validation on input. Way to go, Chris.
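One way to derive a weekly count of new errors from first-occurrence data is sketched below. The page does not describe how the first occurrence is stored, so the data layout (a mapping from a profile ID and error code to the date the error was first seen) and all names are assumptions for illustration.

```python
from collections import Counter
from datetime import date

def record_first_seen(first_seen, current_errors, run_date):
    """Store run_date for every (profile_id, error_code) pair not seen in earlier runs."""
    for key in current_errors:
        first_seen.setdefault(key, run_date)
    return first_seen

def new_errors_by_code(first_seen, current_errors, run_date):
    """Count current errors per error code whose first occurrence is this run."""
    return Counter(
        code for (profile_id, code) in current_errors
        if first_seen.get((profile_id, code)) == run_date
    )

# Hypothetical profile IDs and two weekly runs.
first_seen = {}
week1 = {("Smith-1", 103), ("Jones-2", 203)}
week2 = {("Smith-1", 103), ("Brown-3", 103), ("Brown-3", 607)}

record_first_seen(first_seen, week1, date(2016, 7, 24))
record_first_seen(first_seen, week2, date(2016, 7, 31))
print(new_errors_by_code(first_seen, week2, date(2016, 7, 31)))
```

An error that was already present in an earlier run keeps its original date, so it is not counted as new again. If a detection algorithm changes, the stored dates for that error code are no longer comparable.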

http://www.softdata.si/osebe_staro/ales/wikitree/Errors%20age.htm

Errors

1283385 Errors Total 0000-0000 0001-1499 1500-1699 1700-1799 1800-1899 1900-1999 2000-Now Now-9999 Open New
101 Birth in future 128 128 3
102 Death in future 170 1 13 151 5 50 3
103 Death before birth 13460 15 565 1460 6259 5078 82 1 10532 28
104 Too old 5946 270 1070 1888 2629 85 4 5424 22
105 Duplicate sibling 1194 48 167 430 549 813 103
106 Duplicates between global tree and unconnected 1674 2 300 485 802 85 1342 74
107 Full name in UPPERCASE 2512 1607 20 8 326 537 14 104 2
108 Full name in lowercase 3120 2844 5 40 224 7 14 3
109 Profile should be open (birth date) 17354 9 2989 14356 78
110 Profile should be open (death date) 1000 226 4 112 413 243 2 8
111 Died too young to be parent 505 35 65 65 195 138 7 389 1
112 Person is Father and mother 1068 141 6 7 110 683 121 718 13
201 Father is self 94 58 25 11 1 2
202 Parents are same 61 6 2 27 26 11 3
203 Father is Female 4779 773 26 164 825 2548 443 3438 133
204 Father has no Gender 589 401 188 5 5
205 Father is too young or not born 41840 1980 4458 9000 18817 7485 80 20 35298 228
206 Father is too old 6275 446 1377 2316 2125 11 5900 16
207 Father is also a child 166 62 16 14 11 49 14 60 12
208 Father is also a spouse 26 5 3 17 1 6 3
209 Father is also a sibling 1640 307 96 162 113 804 158 963 18
210 Father was dead before birth 40989 1643 2604 7985 12029 15870 858 37604 141
211 Duplicate sibling by father 3945 320 936 2298 391 2938 70
212 Profile should be open (Child birth date) 4532 4532 10
301 Mother is self 3 3 1
303 Mother is Male 5213 637 57 242 1000 2873 404 3719 82
304 Mother has no Gender 851 671 1 151 28 229 30
305 Mother too young or not born 53338 2463 5957 12539 24133 8165 76 5 45732 277
306 Mother is too old 5257 325 1066 1841 2014 11 4897 13
307 Mother is also a child 2 2
308 Mother is also a spouse 351 75 50 33 61 122 10 193 28
309 Mother is also a sibling 111 29 2 2 2 58 18 30 8
310 Mother was dead before birth 46285 2043 1899 9497 14313 17686 845 2 42432 204
311 Duplicate sibling by mother 1299 8 118 310 734 129 886 38
312 Profile should be open (Child birth date) 4251 4251 18
401 Spouse is self 1 1
402 Unknown gender of spouse 794 507 4 236 47 62 23
403 Single sex marriage 1328 241 14 23 80 758 212 599 104
404 Marriage before birth 10050 200 911 2165 4855 1919 8620 57
405 Married too old 2549 94 378 721 1356 2312 12
406 Marriage after death 12650 540 351 1909 3086 6112 652 11235 43
407 Lived too long after marriage 731 22 11 88 160 383 67 621 2
408 Multiple marriages on same day 8935 127 14 1407 2674 4417 296 7678 171
409 Marriage to duplicate person 28379 3767 230 3554 6912 12436 1480 23383 638
501 Wrong male gender 6712 1413 26 86 823 3272 1089 3 4489 76
502 Missing male gender 67256 38371 2 31 2396 17733 8691 32 22240 510
503 Probably wrong male gender 5621 1297 45 482 607 1987 1193 10 3421 107
504 Missing probably male gender 35206 25033 33 651 4693 4755 41 7168 473
505 Wrong female gender 7080 1262 3 187 396 4069 1159 4 4768 75
506 Missing female gender 54750 30462 14 2564 15242 6449 19 19564 425
507 Probably wrong female gender 5095 864 20 222 681 2174 1132 2 3417 129
508 Missing probably female gender 28952 20977 4 74 501 4014 3362 20 5504 569
509 Missing gender 96998 83264 18 354 1225 6307 5782 48 11675 957
510 Unique name without gender 23072 10952 10 153 420 6800 4680 57 9897 125
511 Unique first name (spelling) 257671 48964 7704 12433 19892 96982 70228 1463 5 145830 1896
601 Unknown birth location 8730 1363 6 1343 4565 1453 6990 68
604 Birth location too short 10590 1133 45 924 1961 5180 1347 7018 55
605 Number in birth location 546 356 1 170 19 11 12
607 Misspelled word in birth location 3485 106 62 669 1178 1220 250 3139 3485
631 Unknown death location 15689 1172 598 2579 9407 1933 13972 115
632 Y death location 1107 24 1006 77 1
634 Death location too short 8849 714 133 739 1564 4708 991 5306 36
635 Number in death location 423 69 1 1 292 60 11 10
637 Misspelled word in death location 1926 59 42 477 496 681 171 1706 1926
661 Unknown marriage location 1185 69 1 252 603 260 937 11
664 Marriage location too short 2423 193 312 513 1297 108 1666 13
665 Number in marriage location 4 1 1 2
667 Misspelled word in marriage location 785 45 6 200 246 263 25 713 785
711 Separators in Prefix 1176 85 10 82 34 513 451 1 574 10
712 Number in Prefix 426 45 15 58 154 154 239 5
713 Suffix in Prefix 2684 301 56 118 435 757 1013 4 1512 2684
714 Wrong word in Prefix 1025 237 8 20 64 346 329 21 545 1025
721 Separators in First Name 49899 8365 255 2243 5927 27708 5401 28399 167
722 Number in First Name 92 84 8 44 1
723 Prefix in First Name 67911 13543 2894 3417 5511 32597 9945 4 47022 67911
724 Wrong word in First Name 7906 1999 80 443 887 3021 1475 1 5461 7906
731 Separators in Preferred Name 14647 2380 82 90 142 1945 9963 45 807 139
732 Number in Preferred Name 27 15 5 7 7
733 Prefix in Preferred Name 36659 7799 2632 572 482 2796 22312 66 5917 36658
734 Wrong word in Preferred Name 1748 528 2 6 16 132 1048 16 149 1748
741 Separators in Middle Name 1438 111 18 75 107 845 282 1077 13
743 Prefix in Middle Name 3252 178 48 66 290 2007 663 2722 3252
744 Wrong word in Middle Name 1019 160 3 20 144 501 190 1 881 1019
751 Separators in Nicknames 2627 130 318 278 343 1222 336 2272 32
752 Number in Nicknames 165 4 2 45 113 1 77 1
753 Prefix in Nicknames 10210 725 4023 2011 1203 1602 646 9775 10210
754 Wrong word in Nicknames 930 76 21 59 95 504 175 731 930
761 Separators in Suffix 8298 1285 185 1078 1832 2549 1362 7 5765 45
762 Number in Suffix 1401 285 156 395 399 165 1 1019 2
763 Prefix in Suffix 9063 1132 191 521 1073 2177 3953 16 3940 9063
764 Wrong word in Suffix 1429 93 5 156 230 491 450 4 775 1429
771 Separators in Last Name at Birth 5193 881 79 858 1223 1492 653 7 3491 7
781 Separators in Current Last Name 6699 1323 119 898 1286 1927 1139 7 4268 19
782 Number in Current Last Name 1 1
791 Separators in Last Name Other 4906 306 195 1000 1090 1179 1133 2 1 3435 33
792 Number in Last Name Other 5 2 3 1
901 Unconnected empty public profiles 35169 35169 22
902 Unconnected empty open profiles 17572 17572 17572 83
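To make the first few date checks concrete, here is a minimal sketch of errors 101, 102 and 103. The function name, the use of `datetime.date` values and the `today` parameter are assumptions. Checks such as 104 (Too old) depend on thresholds that this page does not state, so they are left out.

```python
from datetime import date

def date_errors(birth, death, today):
    """Return the error codes (101, 102, 103) that apply to one profile.

    birth and death are datetime.date values, or None when unknown.
    """
    errors = []
    if birth is not None and birth > today:
        errors.append(101)  # Birth in future
    if death is not None and death > today:
        errors.append(102)  # Death in future
    if birth is not None and death is not None and death < birth:
        errors.append(103)  # Death before birth
    return errors

analysis_date = date(2016, 7, 31)
print(date_errors(date(1850, 5, 1), date(1840, 1, 1), analysis_date))  # [103]
print(date_errors(date(2020, 1, 1), None, analysis_date))              # [101]
```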

Changes since previous update

Detailed statistics are available at http://wikitree.sdms.si/default.htm in the Statistics section.

Collaboration

On 6 Aug 2016 at 11:44 GMT Marty (Lenover) Acks wrote:

Slowly working through 501 - Open

On 5 Aug 2016 at 22:48 GMT Aleš Trtnik wrote:

Lindon added to the exceptions. Those entries will be gone on Monday.

On 5 Aug 2016 at 21:46 GMT Patricia Roche wrote:

Error 637 - London. I just noticed a bunch of Lindon, Utah entries on the list and am not seeing any spelling errors, so the report likely thinks it should be London, but Lindon is a real place in Utah Co., Utah.

On 4 Aug 2016 at 03:24 GMT Patricia Roche wrote:

Working on 637 - "Massachusetts" only and also correcting, if in birth field. Started from the top

On 3 Aug 2016 at 18:56 GMT Jamie Nelson wrote:

Working on 206 this week.

On 2 Aug 2016 at 23:46 GMT Aleš Trtnik wrote:

112: 1700-1800 Working From end

505: 1700-1800 whole

On 2 Aug 2016 at 05:30 GMT Graeme Olney wrote:

207 Open - Done (Except Pre-1500 and No Dates)

208 Open - Done (Except Pre-1500)

308 Open - Started