Regional Categories vs Location Field

+8 votes
140 views

I was looking over the Help pages regarding categorization, specifically Help:Category Names. I happened to notice some wording that I had never really paid attention to before. Under Category Names for Regions, there is wording that states:

“Category names for locations follow the same general principles as Location Fields.”

As I investigated a bit further, I noted that on Help:Location Fields, it states that:

“The rules above apply to Categories as well.”

I know that many of the high-level regional categories were created during the introduction of the Category: namespace, but country-specific projects later took these categories over and started to propose and implement the regional naming standards from there. Everything has worked fine (with few, if any, major hiccups that I can see) for years, but now we may be starting to see some issues...

One of the fundamental features of the location field is that it uses the entire location structure. As an example, a search for “Houston” will return possible matches, including:

  • Houston, Harris, Texas, United States

  • Houston, Harris, Republic of Texas

  • Houston, Harrisburg, Republic of Texas

  • ...

Through the use of the category namespace, and the current naming structure for United States regional categories, this would only be represented by [[Category:Houston, Texas]] (in this instance, the scope is limited to the naming of the category only). If we were to follow the guidance of the location fields, I would assume that the correct name would be [[Category:Houston, Harris, Texas, United States]]?
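To make the difference concrete, here is a minimal Python sketch of the two naming styles. The function names are hypothetical, and the short form assumes a four-level US location (place, county, state, country) as in the Houston example above; it is only an illustration, not how WikiTree actually builds category names.

```python
def full_category(location: str) -> str:
    """Category name that mirrors the Location Field, every level spelled out."""
    return f"[[Category:{location}]]"


def short_us_category(location: str) -> str:
    """Category name in the current US style: place and state only.

    Assumption: the location has exactly four comma-separated levels
    (place, county, state, country).
    """
    place, _county, state, _country = [part.strip() for part in location.split(",")]
    return f"[[Category:{place}, {state}]]"


location = "Houston, Harris, Texas, United States"
print(full_category(location))      # [[Category:Houston, Harris, Texas, United States]]
print(short_us_category(location))  # [[Category:Houston, Texas]]
```

The short form drops the county and the country, which is exactly the information a reader from another country may need to tell two similarly named places apart.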

This recently came to light during another G2G discussion in a proposal for Norway regional categories. A comment in the proposal from Leif states that:

In a constantly growing list of categories, some emphasis should be put on keeping the common namespace clean and tidy. This is the reason for spelling out intermediate units up to and including the country level in all place categories, ensuring a clear and unambiguous ... namespace.

Through follow-up discussions and conversations within the thread, it started to become clear that some of the issues being seen in the category namespace relate back to the addition of the category picker tool that was released back in June. When you try to perform a search of a location, you receive a number of possible options that don’t always make sense to users from other countries. As an example, I worked on a profile a couple of years ago, where the person was listed on a US Census as being born in “Moravia”. When typing “Moravia” into the category picker tool, you are presented with multiple regional options, including:

  • Moravia

  • Moravia, Iowa

  • Moravia, New York

  • Moravia (village), New York

In this case, is the first entry for Moravia perhaps meant to represent Moravia in Czechoslovakia, Moravia in Austria, Moravia in Costa Rica, Moravia in Jamaica, Moravia in Colombia, Moravia in Texas, Moravia in Pennsylvania, Moravia in Idaho, Moravia in Ecuador, or Moravia in Oklahoma (none of which are currently represented by categories in the category picker tool)?

To expand further, some might argue that a place as recognizable as Texas would be hard to mistake, but with the category tool, it is not as clear now. Could this be Texas in Bolivia, Texas in Ecuador, Texas in Australia… The same applies to “Houston, Texas”, and while it makes sense as a category name, how does that now relate to places like Houston, Texas, Missouri?

in The Tree House by Steve Harris G2G6 Pilot (336k points)
I agree that the sentence you cite from the page https://www.wikitree.com/wiki/Help:Category_Names presented an overly simplistic view of the situation.

However, a few days ago you moved almost all of the place-category information formerly on that page to https://www.wikitree.com/wiki/Help:Category_Names_for_Regions, where the issues you identify are covered in more nuanced detail (for example, at https://www.wikitree.com/wiki/Help:Category_Names_for_Regions#US_State_and_County_Categories ). So I'm rather confused about what it is you are now trying to discuss. (Please 'splain me.)
Ellen, you are correct in that the page information was recently moved, but that was to shorten the Help:Category Names page into an easier to read and follow format. This move did not change any of the rules or the guidance, so that is not at all an issue or factor in this case.

The main issue is the statement quoted from Help:Category Names (which was not modified, and has been in place since at least 2017) that Category Names should follow the same general principles as Location Fields. I know it is a lot to read and take in, but the paragraph starting with "Through the use of the category namespace..." starts to detail the exact issues.

For a condensed version, I would say that 'Category names for locations do not reflect those intended to be used in the Location Field'.
BTW (and a bit off-topic), I cringe every time someone suggests that "Houston, Harris, Texas" -- or, even worse: "Harris, Texas" for the county -- is the ideal way to report a U.S. place name.

In U.S. usage, it is rare to reference a county without including the word "county" (e.g., we would refer to "Harris County" or "Harris County, Texas", not "Harris, Texas"). Also, we almost never mention the county when identifying a city or other legally designated municipality (a New England town, Pennsylvania borough, incorporated village in Illinois, etc.) -- and we can omit the county name because a municipality name (but not a township name) is unique within the state. However, a number of states have a county and city (or other municipality) that share the same name -- and all too often the city isn't in the county of the same name (to see what I mean compare Decatur County, Tennessee, and the incorporated town of Decatur, Tennessee, or compare Franklin County, Tennessee, and the city of Franklin, Tennessee). Therefore, to correspond with standard usage and to avoid serious errors due to ambiguity in city vs. county names, when naming a county in a location data field it's real important to get in the habit of including the word "County."

This may sound weird given the post I made, but I completely understand and agree with you!

Since we use FamilySearch's Place Research database to make suggestions for location names, and this is often used on profiles (type something, select what pops up and move on), this is the scenario I have described herein.

But you are 100% correct, that in normal usage, it would be called out as "Harris County" and not simply "Harris"!

Ellen, I’ve brought this up in two recent G2G posts, the use of county name but not the word, “County.” The location suggestion list in this instance is useless, as the word, “County,” is not in the suggestions. This causes all sorts of problems. The example I gave in the last thread applies. Henderson, North Carolina is not the same as Henderson County, North Carolina. The two are not even close geographically!!

In an earlier thread, I was told WT just copied the FamilySearch format for the location suggestions, but that format doesn’t work here on WT. And while we lift up laud and honor in FamilySearch as all-in-all for some things, this is one instance where it is just not working.

I have to type out the whole location. Really, it’s not that big a deal. However new folks (or veteran members) who use that format just create later problems. It does make a difference between looking for land records in Vance County (Henderson, North Carolina) and Henderson County, North Carolina. 

I think I’m beating a dead horse here. This is probably my one great WT objection. Now, I’ll just drop it!

In the same way in Queensland, Australia.  You wouldn't write Noosa, Noosa, Queensland.  You'd write it as "Noosa, Noosa Shire, Queensland".

(I can't say what it'd be in other states.  I was in Western Australia 40-plus years ago, long enough to have my daughter and return to Qld when she was 4 weeks old.  Same with Victoria; I lived there so long ago I was too young to take an interest, or to care; and I never lived in the other states.  But full designation is always to be desired, for clarity of non-locals.)

No need to drop it Pip, we are just discussing. Everyone's allowed to have an opinion.
I’ve gone back to adding the word County to the US places.  The pop-up box says one is not required to use the suggestion, and when I hit that Henderson, North Carolina entry, I regressed.

@Pip: WT hasn't just "copied the FamilySearch format" for location suggestions. WT is using the location database maintained by FamilySearch.

The quality of location suggestions for Sweden makes me deeply grateful for the option to turn them off.

Eva, I don’t understand, being as technologically challenged as I am. Is WT connected somehow to FS for this (location suggestions)?
@Natalie: You’re right. I let my frustration get the better of me. Thanks!
@Pip: Yes.

I don't know how they do it - if it's fetched from there in realtime or if it's a once-in-a-while data dump. Probably the latter.

Ellen mentioned the use of "County" in location fields, so I thought I'd share from [this section] of the Virginia Project page:

When entering a Virginia location in a profile's datafields, please include as appropriate "County" or "Parish" if you know that Parish or County is meant (many Virginia cities, counties, and parishes share a name but not a location). 

2 Answers

+3 votes
In the past, even recently, I would have considered this unnecessary, but since we have been discussing Norway's place categories, I'm beginning to see it another way. Also, I have been looking at names of schools in US Colleges and Universities that have no place name attached.

Here's one: https://www.wikitree.com/wiki/Category:Hanover_College

If I'm on the category page, I can look up and see the parent categories and know I'm looking at a college in Indiana. If I use the category picker button from a profile, I find two categories (in this case, the category is duplicated and one has the place name and the other does not.) But, how would I know which one of these is the correct category to choose?

Place names offer the same confusion, as Steven pointed out.
by Natalie Trott G2G6 Pilot (585k points)

Category names for institutions such as universities should not be expected to include geographic information, except in situations where there is potential ambiguity (e.g., two different entities with the same name or very similar names, or a university with two locations that people want to categorize separately).  Category descriptions and parent categories can provide location details. (Category picker is an aid in identifying categories. If it finds two potentially applicable categories, a member ought to be able to look at the categories and figure out what to do.)

Accordingly, it seems to me that:

  1. Category: Hanover College, Hanover, Indiana should be merged into Category:Hanover College.
  2. The surviving category should have Category: Hanover, Indiana as one of its parent categories.
That is the current way they are supposed to be set up, but I see the merit in adding a place.
When this idea of always including the country in category names was introduced for Norway I was just intuitively against it - probably because it seems so unnecessary (not needed to avoid duplication/clashes of category names) and repetitive.

Then I started worrying about category names getting too long. I'm not sure they really will, in Norway - and I'm not sure that there aren't already some very long category names, working just fine. But I did experiment with a really long subcategory name (should be deleted again by now) - and it skews the columns on the parent category page.

I have also started wondering whether all these categories with Norway in them would start crowding out categories like "Norway, Indiana" and "Norway, Michigan". Well, probably not; the category picker seems to prioritize categories starting with whatever it is you write - although typing "Norway" also brings up a few ending in Norway.
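As a rough illustration of that behaviour, here is a small Python sketch of prefix-first ranking. It is a guess at the general idea only; the function name is hypothetical, "Oslo, Norway" is an illustrative category name, and nothing here is taken from the actual category picker code.

```python
def rank_matches(query: str, names: list[str]) -> list[str]:
    """Return names containing the query, with names that start with it first."""
    q = query.casefold()
    hits = [name for name in names if q in name.casefold()]
    # sorted() is stable and False sorts before True,
    # so names starting with the query keep their order and come first.
    return sorted(hits, key=lambda name: not name.casefold().startswith(q))


categories = ["Oslo, Norway", "Norway, Indiana", "Moravia, Iowa", "Norway, Michigan"]
print(rank_matches("Norway", categories))
# ['Norway, Indiana', 'Norway, Michigan', 'Oslo, Norway']
```

With a ranking like this, categories ending in Norway would still show up, but below the ones that start with it.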
Yes, the length of the name is a concern, but sometimes there is just no other way but to have a long category name. As long as careful thought goes into the naming, it can be done.
+4 votes
While I agree that it is important to avoid ambiguity in location category names - and Steven has found some good illustrations showing the necessity to think a bit in advance - I think that a rule to always include the country in location category names will run counter to the policy of having one single landing category for regions where WikiTree has more than one language stream.
by Eva Ekeblad G2G6 Pilot (336k points)
We actually have that issue now, do we not?
Yes, I suppose so.

How is it being solved?
Is there a general model for solutions?
