Surnames I'm tracking - annual report

+19 votes
584 views

A little over a year ago, I decided to change the way I was tracking the surnames of my direct ancestors. I had started my first One Name Study (on Slades) in December 2015, just eight months after I first joined WikiTree. I followed that up with a One Name Study on Frenches in October 2016, and another one for Welches in August 2019. At first, I was annoyed that I couldn't have One Name Studies for all the surnames in my family tree, and was only limited to seven, but by the time I started the third one, I was beginning to realise that tracking a surname takes work, and so the number of One Name Studies that I could realistically do justice to was probably just the three that I had already started. Managing eight, or 16, or 32 One Name Studies just wasn't realistic. (Or at least, not until my T-shirt designs go viral and I get so rich that I can quit my job and hire servants to do things like wash the dishes. Or maybe hire assistants to manage One Name Studies. Or, hey, if I'm going to be rich, why not both?)

For a while, I tried doing something for all three name studies each month. Then, for a while, I tried rotating between them. But eventually, I realised that, while the situation was slowly improving for those three surnames, those of my ancestors who weren't born with one of those three surnames were just as badly neglected as they had ever been.

So, starting in November of 2020, I decided to list all the surnames for the previous four generations in my family (and in the family of the light of my life and the delight of my eyes) and work out a schedule to work on two surnames per month. Between the two families, we have 32 surnames.

in The Tree House by Greg Slade G2G6 Pilot (680k points)
edited by Greg Slade

I track the same information for each surname, whether there's a One Name Study for it or not. (Well, with one exception: for surnames which don't have One Name Studies, I don't bother tracking whether the {{One Name Study|name=Surname}} template has been applied to profiles, since there's no category for that surname to add them to.)

I posted a thread called "Stuff I put on my One Name Study pages" a while back, and it talks about the stuff I track for each surname. I am also in the process of making my spreadsheets for the surnames I'm working on available for download. If you want to do your own spreadsheet for a surname that you're studying, you can download one or more of the spreadsheets listed in the tables below, and adapt them for your own situation. (If you want the spreadsheet for a specific surname that I haven't put up yet, please send me a private message, and I'll send a copy to you.) I have tried to put enough notes into the spreadsheets about how they work that you can adapt them to your needs.

I have done a little number-crunching on the surnames I'm studying as a whole. Out of the 31 surnames I'm tracking, 85.5% of the profiles are connected to the main tree, which is a little better than the numbers for WikiTree as a whole. However, I can't take any credit for that. The secret seems to be that most of the surnames in our family don't have that many profiles on WikiTree. Out of the 31 surnames I'm tracking, 22 of them have fewer than 10,000 profiles each, and the least common surname only has 13 profiles, all of which are connected. It seems like the larger surnames have the lowest connection rates. Of the nine surnames in our families with 10,000 profiles or more, six of them come out in the lower portion of the list when I sort it by percent connected.

Surnames I'm studying, ranked by percent connected to the main tree - 2022

A similar kind of dynamic seems to be going on when it comes to notables listed on Wikipedia having profiles created on WikiTree. Again, the more common surnames tend to be on the lower part of the curve.

These numbers need to be taken with a grain of salt, since, for the more common surnames, I can't get through the lists on Wikipedia in a single month, let alone create profiles for missing notables. These numbers may change once I actually finish tallying the most common surnames.

However, even with the numbers being incomplete at this point, it seems to me that people with more common surnames are less motivated to make sure that notables with the same surname have profiles on WikiTree. If your surname is uncommon enough that there are only one or a few notables with your surname, you may be motivated to make sure that all of them get WikiTree profiles. If your name is common enough that there are already a bunch of notables with the same surname on WikiTree, you may not have the same motivation to add more. The situation seems to be bad enough for Miller, White, and Phillips. I can't even imagine how bad it would be for Wong, Smith, or Jones.

Surnames on WikiTree - by percentage of notables from Wikipedia with WikiTree profiles - 2022

There is a measure which I only started tracking this year. This one shows the percentages of "tallied" profiles (profiles from ThePeerage.com, Wikipedia, and our watchlists) and of profiles on WikiTree as a whole which are unsourced, partially sourced (at least a secondary source), and "fully" sourced* (at least three primary sources).

This chart shows the difficulty in measuring sourcing levels for WikiTree as a whole, or even a single Last Name At Birth on WikiTree: for the tallied profiles, I depend on the counts in my spreadsheet, where I assign sourcing levels as I check the profiles. For WikiTree as a whole, I depend on WikiTree to get a count of profiles with the {{Unsourced}} template, and WikiTree+ (thank you, Aleš!) to get a count of profiles with either [[Category:More_Records_Needed]] or [[Category:Profiles_with_Incomplete_Sourcing]]. (I imagine that the Categorization Project will merge those two category hierarchies eventually.) But because both templates and categories are applied manually, there are a ton of profiles to which they should be applied but haven't been yet. So that's why the tallied profiles show as less than 25% "fully sourced", while all profiles show as more than 95% "fully sourced". (This is one big reason why I keep saying that such-and-such a percentage of profiles on WikiTree is "supposedly" sourced.) Note that the average percentage of profiles with no primary sources across the 31 surnames that I'm tracking is about 5%. The percentage for many of the surnames I track is improbably low.
So one of the things I do when I am working on a surname is search on that surname to get to that surname's genealogy page (like https://www.wikitree.com/genealogy/West), then click on the "table view" link, then click on the "Edit Date" column header, and work down the list, checking the open profiles to see if they have at least one primary source. If they don't, I look to see if I can find one and add it to the profile. If I can't find a source, then I add the {{Unsourced}} template. At least that way, subtracting the profiles with the {{Unsourced}} template from the total number of profiles will give a slightly more accurate picture of the state of sourcing. (Sometimes, I think we should have a "Template-A-Thon" to add the {{Unsourced}} template to profiles that don't have sources.) As it is, some surnames have absurdly low numbers of profiles with the {{Unsourced}} template, so the sourcing situation looks much less dire than it actually is.

(The number of profiles with [[Category:More_Records_Needed]] or similar is also misleadingly low, but while I would like to see more accurate numbers there, it's more important that profiles with no sources at all be easy for sourcerers to find.)
Surnames on WikiTree - sourcing status
* Yes, I know that three primary sources doesn't actually qualify as fully sourced, because ideally every fact stated in a biography should be backed up by at least one primary source, but I based my sourcing levels on the work of Paul Gierszewski, who first came up with a system for measuring how well profiles are sourced.

Somebody asked me how I derive the numbers I use in these reports. Here is my explanation:

For example, if I search on Crozier with no first name, I get a list of all profiles with Crozier as the Last Name At Birth. The current total of Crozier profiles is at the top of the page, a little to the right of centre.

Next, I click on the unsourced link, and that gives me a list of Crozier profiles with the {{Unsourced}} template applied. I just count them up. (With more popular surnames, I have to page through the list, and hopefully not lose track of how many pages I've gone through before I hit the first entry with a different surname, and then multiply the pages by 200 and add the number on the partially full page to get the total.) This number won't be accurate, for reasons I've explained above, but hopefully it will get more accurate as time goes on, as people add the {{Unsourced}} template to more profiles.
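The page-counting arithmetic above can be written down as a tiny helper. This is only a sketch of the sum I do by hand: the function name and its checks are made up for illustration, and the only figure taken from the description above is the 200 entries per full page.

```python
PAGE_SIZE = 200  # entries on each full page of the list, as described above


def count_from_pages(full_pages: int, entries_on_last_page: int) -> int:
    """Total = full pages x 200 + entries counted on the partially full page."""
    if full_pages < 0 or entries_on_last_page < 0:
        raise ValueError("counts cannot be negative")
    if entries_on_last_page > PAGE_SIZE:
        raise ValueError("a page holds at most 200 entries")
    return full_pages * PAGE_SIZE + entries_on_last_page
```

So three full pages plus 57 entries on the last page would be `count_from_pages(3, 57)`, and a short list that fits on one page is just `count_from_pages(0, n)`.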

Next, I go back to the Crozier genealogy page and click on the unconnected link. Then I click "No" beside "Limit to Open Privacy Level" to get the number of all unconnected Croziers. That number is accurate, at least as of Monday morning. (I think that particular script only runs once a week.)
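For anybody who wants to check the "percent connected" figure, here is a minimal sketch of how it falls out of the two numbers above (the total on the genealogy page and the unconnected count). The function name is hypothetical, and my spreadsheet does this with a formula rather than Python.

```python
def percent_connected(total_profiles: int, unconnected: int) -> float:
    """Share of profiles connected to the main tree, as a percentage."""
    if total_profiles <= 0:
        raise ValueError("total_profiles must be positive")
    if not 0 <= unconnected <= total_profiles:
        raise ValueError("unconnected must be between 0 and total_profiles")
    return 100.0 * (total_profiles - unconnected) / total_profiles
```

The same subtraction works for the sourcing numbers: take the total, subtract the profiles with the {{Unsourced}} template, and divide by the total.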

To see the Croziers on ThePeerage.com, I go to the home page there, then click on the Surnames button, then click on "C", then I type Command-F (Control-F for Windows or Linux) and type Crozier into the search box in my browser. Then I click on the highlighted link to see the list. Darryl includes both Croziers at birth and those who married Croziers, so I follow each link, and enter the name, dates, and ID number into my spreadsheet, keeping the living people (including those born less than 120 years ago with no death date listed) and those for whom Crozier was not their Last Name At Birth separate, so I'm only tallying deceased Croziers at birth. I also have columns for sourcing level, whether a profile is connected to the main tree, whether there's a picture of the person on their profile, etc. I also use IF formulae to put a "1" in the appropriate sourcing level column, so I can get totals of how many profiles are at each level. Then I have the spreadsheet do all the number crunching and chart drawing, setting the charts to use WikiTree-ish colours just to look better, and try to get the reports posted soon after the end of the month (although fairly often, Real Life interferes with that).
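If you would rather script the tally than use IF formulae, the sorting rules above might look roughly like this. It is a sketch under assumptions: the `Entry` fields and the sourcing-level labels are names I made up for the example, and treating someone with no birth year as not presumed living is a guess, since I didn't spell out that case above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

LIVING_CUTOFF_YEARS = 120  # born less than 120 years ago, no death date listed


@dataclass
class Entry:
    name: str
    last_name_at_birth: str
    birth_year: Optional[int]
    death_year: Optional[int]
    sourcing_level: str  # e.g. "unsourced", "partial", "full" (hypothetical labels)


def presumed_living(entry: Entry, current_year: int) -> bool:
    """Apply the 120-year rule to entries with no death date."""
    if entry.death_year is not None:
        return False
    if entry.birth_year is None:
        return False  # assumption: this case is not covered in the post
    return current_year - entry.birth_year < LIVING_CUTOFF_YEARS


def tally(entries: list[Entry], surname: str, current_year: int) -> Counter:
    """Count deceased people with the surname at birth, per sourcing level."""
    counts: Counter = Counter()
    for entry in entries:
        if entry.last_name_at_birth != surname:
            continue  # married into the name: kept separate, not tallied
        if presumed_living(entry, current_year):
            continue  # living people are kept separate, not tallied
        counts[entry.sourcing_level] += 1  # same idea as the "1" in a column
    return counts
```

The `Counter` plays the role of the column totals: each profile adds one to exactly one sourcing level.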

For Wikipedia, some surnames have articles like Crozier (surname). Others have more elaborate titles like List of people with surname Miller. Others simply have something like West (name). Others are more simple still, like McNair. The least common surnames have no article at all, and I just have to search for anybody with that surname. Assuming there is a page for a given surname, I work through it the same as with lists on ThePeerage.com, with the difference that there are also fictional characters to separate out.

For a couple of the measures, I use Aleš's reports:

https://wikitree.sdms.si/default.htm?report=srch1&Query=LastNameAtBirth%3DCrozier+unlinked&MaxProfiles=500&Format=

gets me a list of Crozier profiles which aren't linked to any other profiles. I have to say that this measure has come down a lot in recent years. A while back, most of the unconnected profiles on WikiTree were unlinked, but clearly a lot of WikiTreers have been working to link and even connect unlinked profiles.

https://wikitree.sdms.si/default.htm?report=srch1&Query=LastNameAtBirth%3DCrozier+needs+record&MaxProfiles=500&Format=

gives me a list of Croziers which have categories like "Needs More Records" or "Needs Birth Record", etc. applied. It does give me some false positives on some surnames when other categories happen to have the right words (or the wrong ones, depending on your perspective). It also gives me false negatives for reasons that I will have to discuss with Aleš sometime. But at least it's a lot faster than trying to go through all the possible maintenance categories looking for Croziers.
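Since those two report links only differ in the surname and the search keywords, they can be built rather than edited by hand. Here is a small sketch using Python's standard `urllib.parse.urlencode`. The function name is made up, and the parameter names are simply copied from the two links above; I am not assuming anything else about how WikiTree+ interprets them.

```python
from urllib.parse import urlencode

WIKITREE_PLUS_BASE = "https://wikitree.sdms.si/default.htm"


def wikitree_plus_url(surname: str, keywords: str, max_profiles: int = 500) -> str:
    """Build a WikiTree+ search link shaped like the two examples above."""
    params = {
        "report": "srch1",
        "Query": f"LastNameAtBirth={surname} {keywords}",
        "MaxProfiles": max_profiles,
        "Format": "",
    }
    # urlencode turns "=" into %3D and spaces into "+", matching the links above
    return f"{WIKITREE_PLUS_BASE}?{urlencode(params)}"


print(wikitree_plus_url("Crozier", "unlinked"))
print(wikitree_plus_url("Crozier", "needs record"))
```

Swapping "Crozier" for another surname gives the equivalent reports for that name.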

Inspired by Dean Slade's work on Slades, I have added two new datasets to the spreadsheets for the surnames I'm studying:

This will have the most visible effect on those surnames with the fewest profiles on WikiTree. (In some cases, there are more people with a given surname listed on those sites than there are on WikiTree.)

I'm also going to try putting up the spreadsheet that I use for each surname for anybody to download, and that's reflected in the new column in the tables above. I had shared the spreadsheets for a couple of surnames before, but of course those would mostly be of interest to other people who are also studying those particular surnames. Now, over time, I should get up spreadsheets for all 32 surnames that I'm studying.

And, as before, if anybody wants to adapt a spreadsheet to use for your study of a different surname, you're free to do so. 

I have found that the surnames with more profiles are easier to work with if I break up the worksheets. So, in a spreadsheet for a surname with fewer profiles, I have a worksheet for each dataset. But in a spreadsheet for a surname with more profiles, I break them up more, so in the Miller spreadsheet, there are separate worksheets for Living Millers listed in Wikipedia, people listed in Wikipedia for whom Miller was not their Last Name At Birth, and fictional Millers listed in Wikipedia, so the Wikipedia worksheet only contains deceased real people with Miller as their Last Name At Birth listed on Wikipedia (plus summaries from the other sheets).

Here are the 32 surnames that I'm working on, in alphabetical order:

Surname      Profiles  Name Study  G2G Thread  Spreadsheet
Cannon       7,647     -           1504815     GMX
Carmichael   4,052     -           1454054     -
Crozier      1,747     -           1180383     GMX
Cutlip       686       -           1233224     -
French       15,161    French      304922      -
Gierszewski  36        -           1341185     -
Goodes       180       -           1262928     -
Guy          4,970     -           1485899     -
Hancock      11,487    Hancock     1485881     -
Kelso        1,601     -           1202550     GMX
Kibler       697       -           1255842     -
McClure      8,074     McClure     1308966     -
McDonald     28,799    -           1329053     -
McKeller     54        -           1312514     -
McMillan     7,571     McMillan    1504766     -
McNair       2,133     -           1197941     -
Miller       107,090   Miller      1180399     GMX
Nickel       1,209     -           1218147     GMX
Phillips     45,569    -           1338991     -
Pudder       20        -           1148684     -
Rucks        172       -           1453115     -
Singleton    4,912     Singleton   1504517     -
Slade        3,816     Slade       203070      GMX
Thorne       3,894     -           -           -
Toppin       163       -           1308963     -
Waddell      3,492     Waddell     1148681     GMX
Webb         26,654    -           1310214     -
Welch        13,113    Welch       903315      -
West         27,653    West        1203124     -
Westfall     1,846     Westfall    1340696     -
White        78,790    White       1198753     GMX
Woods        19,757    Woods       1218137     -

  • The number of profiles for each surname is from my tallies on January 1, 2024.
  • I only manage three One Name Studies. The rest listed in the table are managed by other people.
  • There is no G2G thread for "Thornes on WikiTree" yet, because I only found out about the Thornes in our family tree late last year. I plan to work on Thornes in August.
Wow Greg that's amazing

I should mention a couple of other things that I do with the surnames I work on:

First, for each surname, when I check to see whether the notables with that surname as their Last Name At Birth listed on Wikipedia have profiles on WikiTree, if I find a profile for that person (or create one), and haven't been able to connect it yet, I add them to the Unconnected Notables page (or one of its sub-pages) because it's much more likely for that profile to get connected if multiple people are trying, rather than just me.

Second, for each surname, I do a WikiTree search with nothing in the "First Name" box, just the surname in the "Last Name or ID" box. The resulting page is where I get the numbers for the beginning-of-the-month and end-of-the-month reports for each surname. On that page, there's a row of "Quick links for members:". The third-last link says "table view". When I click that, I get another page with the same profiles, only in a table. And, more importantly, a table with sortable columns. So I click on the "Edit Date" header, and the entries get sorted from the ones last edited the longest time ago to the ones edited most recently. And, helpfully, the last column shows the Privacy Level for each profile. Then, I work down the rows and check the profiles with the Privacy Level set to Open (the ones with a black circle with a white centre). For each profile that's open, I check to see if it's sourced, and how well.

  • If it's got three or more primary sources, I usually just move on to the next one, although if I have another source that's not already listed, I'll add it first.
  • If it's got fewer than three primary sources, I'll see if I can add any more. If I can't get it up to three primary sources, I'll add [[Category:Location, Needs More Records]].
  • If it's got no sources, I will check to see if I can add any. If I can't find any sources, I'll add {{Unsourced|Location}}. If I find one or more sources, but can't get the profile up to three primary sources, I'll add [[Category:Location, Needs More Records]].

You will have noticed that I talk about "primary" sources, which, presumably, means that there is some other kind of source. There is:

  • I consider a "primary source" to be anything that is an official record. That is, things like birth, christening, census, school, military, marriage, death, or burial records.
  • I consider "secondary" sources things like entries in Wikipedia (or any other encyclopaedia or dictionary), biographies, genealogy books or articles, family trees on other web sites, newspaper or magazine articles (including obituaries), blog posts, etc.

I do not consider notes that say things like "Wikipedia" or "Ancestry" or "Family Tree" or "1871 Census", but don't have links to actual entries or documents, to be sources at all. I don't delete them (and will complete them if I have access to the necessary source), but if a profile has one or more of those, but no real sources, I will still add the Unsourced template.

Similarly, I don't consider "Personal knowledge of" a person who wasn't even born before the person the profile is about died to be a source, either. Those notes, I will delete, and replace with real sources if I can.
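The sourcing rules in this comment boil down to a small decision, sketched below. This describes the state of a profile after I've already tried to add whatever sources I can find. The function name and its arguments are made up for the example, and the counts are meant to include only real sources (bare notes like "Ancestry" with no link don't count). "Location" stands in for whichever place name applies, just as it does in the list above.

```python
from typing import Optional


def maintenance_tag(primary_sources: int, secondary_sources: int,
                    location: str) -> Optional[str]:
    """Pick the template or category to add, or None if nothing is needed."""
    if primary_sources < 0 or secondary_sources < 0:
        raise ValueError("source counts cannot be negative")
    if primary_sources >= 3:
        return None  # three or more primary sources: move on to the next profile
    if primary_sources == 0 and secondary_sources == 0:
        return "{{Unsourced|" + location + "}}"  # no real sources at all
    # some sources, but still short of three primary ones
    return "[[Category:" + location + ", Needs More Records]]"
```

For instance, a profile with one census record and an obituary would come back with the Needs More Records category, while one with nothing but a note saying "Family Tree" would come back with the Unsourced template.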

1 Answer

+3 votes
Fred Kelso.  I found the marriage record; could he have been from OH?  Also found a Fred Kelso in Monroe Co., NY on 1892, 1900 and 1905 censuses with father Fred and mother Elizabeth; not sure if it's him.
by Lorraine O'Dell G2G6 Mach 4 (42.3k points)

Thanks to a tip, I finally found his parents. His mother's Last Name At Birth was Thorne, so now I have a complete list of 32 surnames.
