Database errors project (July 24 2016)

Privacy Level: Open (White)
Date: 24 Jul 2016 to 31 Jul 2016
Location: Worldwidemap
Surname/tag: data_doctors
Profile manager: Aleš Trtnik

Categories: DD Suggestions.

This page is part of the Data Doctors Project.
Latest report: February 17th 2019 and the Spreadsheet.
Custom reports by: Suggestion lists, Unsourced lists, Unconnected lists.
See WikiTree+ for custom reports and statistics.
Data Doctors Challenge: Names_VIII .
Category suggestion report: Latest report

Analysis was done on data from July 24th 2016.

Related post.

The pages below list the errors, with basic person data and links to WikiTree.

News

New errors 607, 637, 667 Location spelling

I added spelling verification for a few locations and will add more over time. For now I check Massachusetts, Leicestershire, London and England. I will also add a Freespace configuration of locations. Each flagged word must be checked against actual place names: London, for example, has exceptions for Lyndon, Loudon, Loddon and Longdon because those are real places.
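The page does not say how the spelling check is implemented. As a rough illustration only, the idea of flagging near-misses of a checked name while skipping real places could be sketched as follows. The edit-distance threshold of 1, the function names, and the word-splitting rule are all assumptions made for this example, not the actual WikiTree+ logic.

```python
import re

# Names currently checked, as listed on this page.
CHECKED_NAMES = ["Massachusetts", "Leicestershire", "London", "England"]
# Real places that look like misspellings of London (from this page).
REAL_PLACES = {"lyndon", "loudon", "loddon", "longdon"}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def misspelled_words(location: str, max_distance: int = 1):
    """Return (word, intended_name) pairs for likely misspellings.

    max_distance is an assumed threshold, not a documented one.
    """
    findings = []
    for word in re.findall(r"[^\W\d_]+", location):
        lower = word.lower()
        if lower in REAL_PLACES:
            continue  # an actual place, not a typo
        for name in CHECKED_NAMES:
            distance = edit_distance(lower, name.lower())
            if 0 < distance <= max_distance:
                findings.append((word, name))
                break
    return findings


print(misspelled_words("Londen, Englnd"))
print(misspelled_words("Loudon, Tennessee"))
```

All four London exceptions named above are exactly one edit away from "London", which is why a check of this kind needs an explicit exception list.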

1274085 Errors Total 0000-0000 0001-1499 1500-1699 1700-1799 1800-1899 1900-1999 2000-Now Now-9999 Open New
607 Misspelled word in birth location 3453 107 62 667 1156 1212 249 3107
637 Misspelled word in death location 1916 58 42 475 491 680 170 1696
667 Misspelled word in marriage location 782 44 6 199 245 263 25 710

New errors column

I added a new column to the table below that shows only the errors that appeared in the last week. With the new data validation on input, these numbers will drop significantly.

New errors count

I added the date of first occurrence for each error. That lets me see the new errors that appear each week due to data growth. The number is exact for the last two weeks; older errors cannot be compared directly because the algorithms have changed. My estimate in previous updates was right on target: roughly 10,000 new errors due to data growth. I expect this number to drop drastically with the new data validation on input. Way to go, Chris.
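One way to picture the first-occurrence bookkeeping described above is a lookup keyed by profile and error code that is updated on every weekly run. This is a minimal sketch under assumed names and data shapes (the key format, `update_first_seen`, and the sample profile IDs are hypothetical); the page does not describe the real storage.

```python
from datetime import date


def update_first_seen(first_seen, current_errors, run_date):
    """Record the run date for errors not seen before; return the new ones.

    first_seen:     dict mapping (profile_id, error_code) -> date first seen
    current_errors: iterable of (profile_id, error_code) found in this run
    """
    new_errors = set()
    for key in current_errors:
        if key not in first_seen:
            first_seen[key] = run_date
            new_errors.add(key)
    return new_errors


# Hypothetical sample data for two consecutive weekly runs.
first_seen = {}
update_first_seen(first_seen, [("Smith-1", 103), ("Jones-2", 607)], date(2016, 7, 17))
new = update_first_seen(first_seen, [("Smith-1", 103), ("Brown-3", 404)], date(2016, 7, 24))
print(len(new))
```

Errors whose detection rule changed between runs would show up as new under this scheme, which matches the caveat above that older counts cannot be compared directly.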

http://www.softdata.si/osebe_staro/ales/wikitree/Errors%20age.htm

Errors

1188778 Errors Total 0000-0000 0001-1499 1500-1699 1700-1799 1800-1899 1900-1999 2000-Now Now-9999 Open New
101 Birth in future 156 156 2 2
102 Death in future 179 1 13 159 5 1 57 6
103 Death before birth 13593 17 569 1468 6323 5126 83 7 10642 108
104 Too old 5971 271 1075 1901 2635 85 4 5448 40
105 Duplicate sibling 1179 48 179 414 538 806 94
106 Duplicates between global tree and unconnected 1808 2 298 543 877 88 1462 67
107 Full name in UPPERCASE 2737 1743 21 10 349 597 17 118 7
108 Full name in lowercase 3124 2848 5 40 224 7 20 2
109 Profile should be open (birth date) 17549 7 3046 14496 64
110 Profile should be open (death date) 1000 226 3 112 413 244 2 5
111 Died too young to be parent 507 36 66 66 194 138 7 389 1
112 Person is Father and mother 1172 158 11 8 174 695 126 826 23
201 Father is self 93 57 25 11
202 Parents are same 86 6 1 53 26 36 1
203 Father is Female 4821 787 28 164 882 2515 445 3470 117
204 Father has no Gender 599 403 196 15 2
205 Father is too young or not born 42133 1988 4501 9060 18956 7525 80 23 35562 387
206 Father is too old 6344 452 1389 2338 2151 14 5963 85
207 Father is also a child 224 67 17 24 9 91 16 111
208 Father is also a spouse 37 5 3 4 3 20 2 16 3
209 Father is also a sibling 1768 318 97 179 111 896 167 1073 25
210 Father was dead before birth 41323 1665 2618 8059 12146 15970 865 37921 358
211 Duplicate sibling by father 3993 2 328 962 2312 389 2982 81
212 Profile should be open (Child birth date) 4554 4554 11
301 Mother is self 3 2 1
303 Mother is Male 5282 643 59 242 1061 2874 403 3787 94
304 Mother has no Gender 865 685 2 3 139 36 239 8
305 Mother too young or not born 53624 2476 6007 12624 24260 8174 76 7 46001 495
306 Mother is too old 5328 327 1075 1871 2041 14 4958 52
307 Mother is also a child 8 1 2 3 2 5
308 Mother is also a spouse 509 74 51 45 58 263 18 337 21
309 Mother is also a sibling 134 32 2 5 7 69 19 52 2
310 Mother was dead before birth 46608 2065 1898 9598 14409 17771 865 2 42712 330
311 Duplicate sibling by mother 1295 8 124 308 726 129 891 24
312 Profile should be open (Child birth date) 4262 4262 23
401 Spouse is self 1 1
402 Unknown gender of spouse 787 504 2 4 233 44 54 18
403 Single sex marriage 1329 246 15 30 115 718 205 602 75
404 Marriage before birth 10212 202 923 2208 4927 1952 8765 109
405 Married too old 2609 93 387 755 1374 2372 29
406 Marriage after death 12781 543 352 1939 3140 6149 658 11356 107
407 Lived too long after marriage 781 24 11 88 167 417 74 669 28
408 Multiple marriages on same day 9034 129 16 1437 2744 4408 300 7767 195
409 Marriage to duplicate person 28425 3795 230 3576 6932 12417 1475 23411 542
501 Wrong male gender 6774 1419 26 89 866 3293 1078 3 4550 98
502 Missing male gender 67448 38179 2 31 2443 18089 8671 32 1 22654 610
503 Probably wrong male gender 5615 1297 46 458 613 2001 1190 10 3413 65
504 Missing probably male gender 35195 24836 34 671 4816 4797 41 7372 350
505 Wrong female gender 7560 1266 3 192 799 4112 1184 4 5255 74
506 Missing female gender 55091 30325 14 2617 15599 6518 18 20075 498
507 Probably wrong female gender 5085 858 19 221 689 2173 1123 2 3437 37
508 Missing probably female gender 29034 20671 4 76 502 4083 3678 20 5913 283
509 Missing gender 97099 82933 18 360 1247 6530 5963 48 12174 821
510 Unique name without gender 23431 10948 10 154 433 7007 4822 57 10254 192
511 Unique first name (spelling) 257644 49053 7721 12499 19989 97052 69876 1446 8 145988 1630
601 Unknown birth location 8892 1386 3 1377 4640 1486 7147 251
604 Birth location too short 11107 1135 45 927 1967 5687 1346 7523 48
605 Number in birth location 545 359 169 17 11 5
631 Unknown death location 15756 1190 599 2595 9422 1950 14042 102
632 Y death location 1128 24 1027 77 21 21
634 Death location too short 8941 715 132 741 1573 4786 994 5384 61
635 Number in death location 417 68 1 2 287 59 5 5
661 Unknown marriage location 1186 70 1 252 602 261 938 9
664 Marriage location too short 2416 193 313 508 1296 106 1657 6
665 Number in marriage location 3 1 2
711 Separators in Prefix 1177 85 10 84 34 512 451 1 575 10
712 Number in Prefix 422 46 15 58 152 151 237 1
721 Separators in First Name 49944 8374 255 2279 5950 27700 5386 28414 210
722 Number in First Name 92 84 8 44
731 Separators in Preferred Name 64318 10751 328 2357 6059 29555 15224 44 29073 242
732 Number in Preferred Name 120 100 13 7 51
741 Separators in Middle Name 1434 112 19 74 106 838 285 1074 22
751 Separators in Nicknames 2649 133 322 282 352 1217 343 2293 30
752 Number in Nicknames 165 4 2 45 113 1 77
761 Separators in Suffix 8335 1289 186 1091 1861 2545 1356 7 5809 46
762 Number in Suffix 1409 287 157 398 401 165 1 1026 3
771 Separators in Last Name at Birth 5201 880 80 861 1223 1492 658 7 3498 11
781 Separators in Current Last Name 6708 1322 120 907 1289 1921 1142 7 4277 25
782 Number in Current Last Name 1 1
791 Separators in Last Name Other 4928 306 195 1016 1106 1176 1126 2 1 3462 63
792 Number in Last Name Other 7 3 1 3 2
901 Unconnected empty public profiles 35195 35195 11
902 Unconnected empty open profiles 17529 17529 17529 105
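To make a few of the date-based rows in the table concrete, here is a small sketch of how errors 101, 102, 103, 404 and 406 might be detected for one profile. The rule definitions are inferred from the error names alone; the function name, its parameters, and the use of complete dates are assumptions, and the real checks may handle partial or uncertain dates differently.

```python
from datetime import date
from typing import List, Optional


def date_errors(birth: Optional[date],
                death: Optional[date],
                marriage: Optional[date],
                today: date) -> List[int]:
    """Return the error codes triggered by one profile's dates.

    Missing dates are passed as None and skip the related checks.
    """
    codes = []
    if birth and birth > today:
        codes.append(101)  # Birth in future
    if death and death > today:
        codes.append(102)  # Death in future
    if birth and death and death < birth:
        codes.append(103)  # Death before birth
    if birth and marriage and marriage < birth:
        codes.append(404)  # Marriage before birth
    if death and marriage and marriage > death:
        codes.append(406)  # Marriage after death
    return codes


analysis_date = date(2016, 7, 24)
print(date_errors(date(1850, 1, 1), date(1840, 1, 1), None, analysis_date))
```

Rules such as 104 (Too old) depend on thresholds that this page does not state, so they are left out of the sketch.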

Changes since previous update

Detailed statistics are available in the Statistics section of http://wikitree.sdms.si/default.htm.

Collaboration

On 26 Jul 2016 at 15:45 GMT Aleš Trtnik wrote:

I did it manually. Since it didn't change much, I no longer publish it. But if you are interested, you can see statistics for each error type, with graphs, in the Statistics section of http://wikitree.sdms.si/default.htm. You can see there that open profiles are being corrected and will soon be free of errors.

On 26 Jul 2016 at 15:33 GMT S (Hill) Willson wrote:

Are we still showing improvement week to week? The information we used to get on how many errors were corrected was inspiring. I didn't see that in the statistics reports anywhere, but perhaps I missed it.

On 26 Jul 2016 at 11:27 GMT Esmé (Pieterse) van der Westhuizen wrote:

105 : 1700-1799 done

On 26 Jul 2016 at 07:40 GMT Dawn Ellis wrote:

204 open - done

On 26 Jul 2016 at 07:23 GMT Dawn Ellis wrote:

605 - open - done

On 26 Jul 2016 at 07:19 GMT Dawn Ellis wrote:

632 Open - done

On 25 Jul 2016 at 23:31 GMT Graeme Olney wrote:

207: Open - Done (except for those with no dates and Pre-1500)

208: ALL - Done (Except Pre-1500) 308 : Open - Working on

On 25 Jul 2016 at 23:26 GMT Graeme Olney wrote:

201: Open - Done

On 25 Jul 2016 at 23:21 GMT Aleš Trtnik wrote:

112: 1700-1800 Working From end

505: 1700-1800 Working From end

On 25 Jul 2016 at 21:56 GMT Jamie Nelson wrote:

Going to work on 112 this week.
