Category Names for Regions

+12 votes

This is a discussion post only and is not a proposal!

Based on some feedback received by multiple users, the recent general name formats being adopted by multiple projects when restructuring their regional categories, and some general comments that can be found in G2G, I would like to bring a topic up for discussion to help gain some insight on any possible pros and cons for Full Location Naming in Categories.

Currently, the approved and documented regional category naming structures can be found at Help:Category Names for Regions. In many cases, care was taken in these structures to help avoid any ambiguities within the structure itself, but may not have included considerations for ambiguities on a global level. For instance, townships in the US can be found across multiple states, and may even be duplicated within the same state (there are 16 Franklin Townships in Pennsylvania), so the format adopted was "Township, County, State" to avoid any ambiguities with townships in the US structure.

In other cases, there may be ambiguities across global categories. Such is the case for Hamburg, which could be a state in Germany, a city in Iowa or Eastern Cape, South Africa, a borough in New Jersey or Pennsylvania, one of two unincorporated communities in Ohio, one of two towns in Wisconsin, etc.

These issues are also compounded by the recent introduction of the category picker. A category name may make sense when you view the category page directly (you can read the text on the page, see the parents, etc.), but that context is not available in the picker tool (not to mention that many items are left off the returned results).

So - What would be the Pros and Cons of using full location formats in categories? Such as:

  • Crosby, Harris County, Texas, United States;
  • Hamburg, Avoyelles Parish, Louisiana, United States; or
  • Landschaft Sylt, Nordfriesland, Schleswig-Holstein, Germany
in The Tree House by Steven Harris G2G6 Pilot (213k points)
The name for "Landschaft Sylt" would be "Amt Landschaft Sylt" (Amt is part of the official name). And in keeping with your other examples you might want to call it "Kreis Nordfriesland". And Germany isn't really what they use.

13 Answers

+6 votes
One pro with this is that it would make categories closer to the location field location picker.

When categories were at the top, this would have had a visual impact but with categories displayed at the bottom, this should not be a problem.
by Doug McCallum G2G6 Pilot (297k points)
edited by Steven Harris
Doug, do you know, without counting, what is the displayed character width of the category picker?  Longer place names might be problematical.
I don't know the max width, but I've used some that are fairly long. I suspect it is about 80 at present.
+6 votes
Cons: Longer category names, much renaming

Pros: Less ambiguity, more consistency across the globe

Numbers wise, it's a wash for me, but the weight of the pros wins out.
by Natalie Trott G2G6 Pilot (461k points)
+8 votes
I think it's extremely, extremely important to allow for regional variations. We are discussing this in the Wales Project right now. In England, the county structure has been relatively stable since 1066, so it makes sense to have Category: Place, County, England. In Wales, the county structure has changed significantly twice or three times in the last half century and was not very stable before that. Having the structure simply be Category: Place, Wales eliminates enormous difficulties in having to know exactly when one is categorizing in order to know the correct category; virtually everyone born in the current county structure is still living and not eligible for profiling on WikiTree.
ago by Jack Day G2G6 Pilot (265k points)
Would it be an option to have Place, Wales as a category and Place, county1, Wales and Place, county2, Wales, etc. as subcategories of Place, Wales?
I am firmly in Jack's camp here.
Joke, you want the lowest level category to be the "landing category" that people put on profiles, and that category should not be confusing. So Category: Place, Wales, should be the landing category. The idea then is that this lowest level category should be a subcategory under the counties; if Place is in County 1 presently, but it used to be part of County 2, and before that it was part of County 3, then Place, Wales would be subcategorized under Category: County 1, Wales, Category: County 2, Wales, and Category: County 3, Wales.

What I would hope we avoid is having Category:  Sameplace, County 1, Wales, and also Category: Sameplace, County 2, Wales, and Category: Sameplace, County 3, Wales, because then the user would have to decide which to put on the profile, and the chance of error would approach 100%.
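As a rough sketch of the structure described above, the landing category can be modelled as a single entry that lists every county category it sits under. The names "Place", "County 1" and so on are the placeholders from the comment, and the dictionary layout and function names are assumptions for illustration only, not how WikiTree stores categories.

```python
from collections import defaultdict

# Hypothetical model: each landing category lists every parent it sits under.
# There is exactly one landing category per place, with no county in its name.
PARENTS = {
    "Place, Wales": ["County 1, Wales", "County 2, Wales", "County 3, Wales"],
}


def children_by_parent(parents):
    """Invert the child -> parents map so each county lists its subcategories."""
    index = defaultdict(list)
    for child, parent_list in parents.items():
        for parent in parent_list:
            index[parent].append(child)
    return dict(index)


def landing_categories(parents):
    """Categories a user would put on a profile: one per place."""
    return sorted(parents)


index = children_by_parent(PARENTS)
print(landing_categories(PARENTS))
print(index["County 2, Wales"])
```

With this layout the user only ever chooses "Place, Wales", while each of the three county categories still shows the place among its subcategories.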
Jack, I think what you stated is exactly what Joke suggested.
Steven, I read it the opposite, that Joke would have place, wales, as the higher category, and he would have place, county1, wales, and place, county2, wales, as subcategories of place, wales -- and therefore they would be the landing categories.  

What I hope we can retain is that place, wales, would be the lowest level, landing category, and that place, wales, could be a subcategory of county1, wales and county2, wales. I don't want any categories that put the place and the county in the same category because that becomes hopelessly confusing, given the frequent changes of county structure in Wales, and which places are in which counties, in Wales.
+6 votes
I would strongly object to making the county a mandatory part of the category name in the U.S. states where I do most of my work. The most important reason for this is similar to the argument that Jack Day makes for Wales: county structures and boundaries have not been consistent or stable over time.

My favorite example to cite is Hingham, Massachusetts. Although the records transcripts that people import from FamilySearch and Ancestry say it's in Plymouth County, in fact before 1643 it wasn't in any county; from 1643 to 1793, it was in Suffolk County; from 1793 to 1803, it was in Norfolk County; and only since 1803 has it been in Plymouth County. Same town through all that (same town histories, vital records books, etc.), so the town should have just one category, but a person looking for court or probate records needs to find the right county for the relevant time period.
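The Hingham timeline can be written down as a small lookup, which shows how one category for the town can coexist with a date-dependent county for record searches. This is only a sketch: the category string and function name are hypothetical, and treating each change year as belonging to the newer county is an assumption, since the dates above do not say which county applies within a change year.

```python
from bisect import bisect_right

# One category for the town, regardless of period (hypothetical name).
HINGHAM_CATEGORY = "Hingham, Massachusetts"

# Years at which Hingham's county changed, taken from the dates given above.
CHANGE_YEARS = [1643, 1793, 1803]
# Assumption: a change year counts as part of the newer county.
COUNTIES = [None, "Suffolk County", "Norfolk County", "Plymouth County"]


def hingham_county(year):
    """Return the county Hingham belonged to in `year`, or None before 1643."""
    return COUNTIES[bisect_right(CHANGE_YEARS, year)]


for year in (1638, 1700, 1800, 1810):
    print(year, HINGHAM_CATEGORY, hingham_county(year))
```

The category never changes in the loop; only the county a researcher would consult for court or probate records does.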

Then there's the case of Washington County, Rhode Island. Gedcom-imported profiles claim that towns like Westerly and Kingstown were part of Washington County since the earliest settlement, but the county didn't acquire the name Washington until about a century and a half later (after the Revolutionary War); before that it was Kings County. And people who live there will say they live in "South County," which doesn't exist, but that doesn't do any real harm since the state eliminated most of the governmental function of counties in the 19th century.

I could offer other examples in other states...
ago by Ellen Smith G2G6 Pilot (933k points)

This is where we are going to start to get at odds with the current wording on Help: Location Fields, which states "use their conventions instead of ours":

Place names, and even boundaries, change over time. They also have different names in different languages. We aim to use the name that was used by the people in that place, at the time of the event you're recording. This standard is often difficult or even impossible to apply, but it is an ideal that members from all over the world can agree upon.

Does this mean we should strive for as accurate of place names as possible, even if it were to separate related profiles?

The rule of "use their conventions, not ours" applies to the documentation of people's lives, which is done in the text of a biography and also in the profile data, including the location fields. Categories do not exist -- and should not be relied upon -- to document people's lives. Categories support genealogy; they are not a substitute for people's biographies. Within WikiTree, categories are like the tail of the dog that consists of genealogy; the tail should not be allowed to wag the dog.

If you insist that the singular town of Hingham must have different categories for every change of designation for the entities that encompass it, I guess you would say that an ancestor who was born in Hingham shortly after his parents immigrated there in 1638 and died there in 1700 would (assuming we don't also bother to have a category for the short-lived Dominion of New England) need to be separately categorized in:

  • Hingham, Massachusetts Bay Colony
  • Hingham, Suffolk County, Massachusetts Bay Colony
  • Hingham, Suffolk County, Province of Massachusetts Bay

And that person's great-great grandchild who lived their whole life in Hingham (say from 1750 to 1810) might need to be categorized in:

  • Hingham, Suffolk County, Province of Massachusetts Bay
  • Hingham, Suffolk County, Massachusetts, United States
  • Hingham, Norfolk County, Massachusetts, United States
  • Hingham, Plymouth County, Massachusetts, United States

I can't see how any useful purpose would be served by doing that...

I am not insisting anything, just pointing out that what we think and what the Help pages state appear to be at odds...
No need to feel constrained by that Help page. Note that the Help page discusses usage in Location Fields. It does not suggest that the same rule must also apply to category names.

Ellen, have you viewed the page by chance? It specifically calls out Location Categories - and that the rules for Location Fields apply to location categories...


I remember more than one discussion within the categorization google group to the effect that that particular language on that Help page was inconsistent with other guidance on categories as tools for usefulness and needed to be changed.  There were at least the beginnings of updating the category help pages, but apparently did not get to the point of revising that particular language on that page. If it still hasn’t been revised, it needs to be fixed.
The location field is not a category.  The purpose of the location field is to tell a truth.   The purpose of a category is to group people for some useful purpose, I would say a genealogical purpose.  The two obviously overlap, but they are different things, and the rules for one don't necessarily apply to the other.
That text that applies the location field rules to categories has been on the Help page since the page was created in May 2017. I can't tell who created it. Apparently it was never revised to reflect community discussions of the subject of categories.
I don't understand how or why that text was added, since the location field only has space for a location in one language, while the categories in 2017 envisaged multiple single-language category streams -- that was prior to the discussions on mirroring.  So I think the text on the help page, to the extent that it called for categories and location fields to behave in the same way, was a mistake that contradicted virtually everything else being said at the time.
I also agree that this text on the help page needs clarification.

I don't think the same rules should always apply to both location fields and categories. The fields are there to record punctual events, while I see categories more like "intervals". Creating a separate category for every administrative hierarchy that ever existed (with a new category each time one of the administrative entities at any level changed) would IMO be too complicated and impractical, and would result in sparsely populated or empty categories which few contributors would dare to use.
+5 votes

This structure closely resembles the new Norwegian Location Category Structure, which was discussed here last year.

I'm pleased to see that others appear to arrive at the same conclusions. To avoid collisions and avoid polluting the global category namespace, it becomes increasingly important to decide upon unambiguous category structures for place names.

ago by Leif Biberg Kristensen G2G6 Mach 2 (29.3k points)
Leif, the problem is that what creates an unambiguous category structure for one country can create an absolute mess in another, because the politicians who create regional entities and their boundaries did not do it with our categorization issues in mind!

That means that, while of course there needs to be some coordination and discussion, place category structures need to be developed by the projects concerned with that place. Trying to have one structure that applies to the whole world is like the proverbial "Procrustean Bed", where you start out by creating the bed, and then you stretch the short people and cut off the feet of the tall ones to get them all to fit.
+6 votes
In the New Zealand Project, the categories are based on the current regional council boundaries. These are not unlike counties in other countries. There have been so many changes of names over the years that it’s complicated and mind boggling to keep up with and enter accurately. The way we do it is to put the historic name in the profile data, but categorise under the current legally recognised name. After all, the place hasn’t moved! For example, Richmond, Tasman, New Zealand. It’s not that long, is it? Perhaps in a bigger city, there might be a suburb, but still, it’s only four parts and everyone can find it on Google Maps.
ago by Fiona Gilliver G2G6 Mach 9 (93k points)
+5 votes
An important con would be to define what a "full location" is. For instance, looking at the unfortunate suggestions for location fields that FamilySearch provides us, locations in France are in the format: "Place, Department, Region, France". The Region part is a) totally redundant (each department being unique, no need for a region to disambiguate), b) very often inappropriate (administrative regions were introduced at a much later date than departments, yet FamilySearch systematically includes them), c) subject to change (regions were completely restructured in 2016). The third part is particularly annoying since it would force a completely unnecessary subdivision of categories, i.e. taking an example close to home, I was born in "Chaumont-en-Vexin, Oise, Picardie, France" which is now officially "Chaumont-en-Vexin, Oise, Hauts-de-France, France" (you might object that it is not an absurd subdivision since I'm ancient, but I will maintain that such a division would serve no purpose at all). And yes, it makes for loooong category names. In the case of France, adding "France" in the end is in my view unnecessary, since the department names are unambiguous (with the only possible exception of Savoie which was known as Savoie before it was part of France; but the department itself is currently categorized as "Savoie, France"). I could live with adding France at the end of all landing-level location categories, but would vehemently oppose adding the region.

On the other hand, I have toyed with the idea of making a proposal for Belgium (which I will probably never make, since from what I have observed lately it would not be accepted) and this proposal would make adding the country name at the end mandatory for Belgian location categories. This is warranted by the existence of two Belgian provinces with ambiguous names.
ago by I R G2G6 Pilot (258k points)
+4 votes

I think one issue that would need to be addressed is what language would be used for the country name. If we want to attract more members who do not speak English, then in my view something as fundamental as the category for the birthplace should be available entirely in the local language. Whether parallel categories exist in other languages would be a matter for discussion, but may have impacts on things like requiring a language selection in the category picker. This would undoubtedly then have immediate consistency implications for things like the migration categories, but probably also more widely across much of the existing category structure.

ago by Paul Masini G2G6 Mach 4 (45.2k points)

Hi Paul, we already have multi-language options available in categories, and just recently had an update for mirroring categories published.

As an example, see:

Any profile placed in the English category will also show in the French category, and vice versa.

Hi Steven, thanks for your reply. The example you quote actually goes part way to illustrating the issue that I was getting at.

I don't know exactly how the multi-language options work for categories, but I see that Category:Nouveau-Brunswick contains both Category:Bouctouche, Nouveau-Brunswick and Category:Restigouche, New_Brunswick. These are obviously not consistently named, but at least it is reasonably clear what they both represent. Whether a French-Canadian would be put off by having to use a category name containing New Brunswick rather than Nouveau Brunswick I have no idea, but I am sure they would be able to recognise both variations.

A better illustration of the problem that I see would be the wife of my 2nd cousin twice removed, who was born in Shanghai, China. If we do not want to put off potential members from China who do not speak English, the location category should probably be something like 上海,中国 if it is to have a name only in one language. Even if it is then put in several higher level categories one of which is called China, that is not going to help me to know that is the correct category for Shanghai as I don't speak Mandarin (I used a translator to get those symbols). But then if a parallel category called Shanghai, China is created, what makes that so special compared with, for example, Σανγκάη, Κίνα or شنغهاي، الصين ? (Apologies to any speakers of Greek or Arabic if those are not correct, blame the translator not me!)

Even using one of the examples from your original post, should there be a single location category called Landschaft Sylt, Nordfriesland, Schleswig-Holstein, Deutschland, or should there also be a parallel category but using "Germany" instead of "Deutschland"? If they both exist as categories, how do you avoid the category picker getting overcrowded with multiple variations of the same place name?

If the multi-language options for categories already deal with all those issues then that is fantastic, but it isn't clear to me at the moment that they do.

Hi Paul, since this was such a recent promotion for mirroring categories - there is still much cleanup to be performed as you noted. In general, categories should only be listed in the language structure in which they are written.

In regards to language usage, this is something we are trying to work on by reviving the Languages project in coordination with this new category mirroring. So for me, who also does not read or write in Mandarin, I can use the English language stream for any profiles. As the Mandarin language categories are expanded and linked to the English language stream, my profile will automatically show in both English and Mandarin, and whatever other language version may be added in the future.

In regards to the example for Germany, ideally this would be available in both languages, again in coordination with the Languages project.

In regards to the Category picker, this is an issue where the display shows truncated results (not a full listing), and I am not sure what the possible change to overcome this would be, since we could potentially have thousands of categories matching a single search word.

The New Brunswick category hierarchy needs some work which was put off until we saw the results of the new language support. That work can go ahead now.
+4 votes
I agree to a certain extent--there are too many towns which have the same names as counties which are in totally different parts of a state. And too many same-named towns within a state. But I can see Ellen's point, as well--if you add the county to the categorization, then that may necessitate a date parameter at the lowest category level which adds another layer of complexity.

I find categorization difficult and avoid doing it. I'm in favor of whatever can be done to simplify. I would imagine many WikiTreers feel the same.
ago by Nelda Spires G2G6 Mach 8 (82.3k points)
+4 votes

I think this may be going too far.  I'll cite as example of permutations the current city of Louiseville in Québec, which has a long history, but not under that name:

The seigneurie of Rivière-du-Loup was created in 1665 or 1672 (depending on sources and what they base themselves on).

It was served by missionaries from 1714, ie only had a mission and not a church per se until 1786, when it officially got its own parish priest.  The parish name applied was St-Antoine-de-Padoue throughout.  One finds references to it under St-Antoine de la Rivière-du-Loup over this time period.  Parish territory limits were set in 1722.

The parish municipality of St-Antoine-de-la-Rivière-du-Loup was established by ordinance on 1 July 1845.

On 19 May 1879 the municipality of the village of Rivière-du-Loup got renamed Louiseville in honour of the youngest daughter of Queen Victoria, married to the then governor general of Canada.  This was part of a split of the existing village.  Which parts of the village went where would need in depth study.

In 1988, Louiseville and the village of St-Antoine-de-la-Rivière-du-Loup fused together to become a greater Louiseville.

(sources are in French, since English sources lack detail)

Rivière-du-Loup was sometimes called Rivière-du-Loup-en-Haut to disambiguate it from the Rivière-du-Loup which is on Gaspé peninsula.

So, from the above, can anybody tell who was living in which part of Rivière-du-Loup?  Few records distinguish between the parish municipality of St-Antoine.... and the village municipality of Rivière-du-Loup.  Louiseville is in there, obviously, but only after 1879.  And there was only one church there for the longest time.

So, creating categories that try to follow all these permutations and trying to name them with all the variations etc that also apply, except for Louiseville, which is post-confederation in any case, is going to be counter-productive.  There's currently a gap in time for categories for this location due to all these fancy goings-on.

Only in instances where there is ambiguity would I expand location names, and I certainly wouldn't add the country to all of them unless there is ambiguity, like Isle-de-France, recently discussed in categorization: there's one in Mauritius history (its earliest name), and there's the region in France.

ago by Danielle Liard G2G6 Pilot (209k points)
How do you handle the Location Fields for the profiles of those areas?
Not done a whole lot in this time period there, but pre 1879 I enter Rivière-du-Loup and appropriate 2nd referent (Canada, Nouvelle-France, Province of Québec, Bas-Canada) and note in bio this will become Louiseville.

To take another example, if one were to categorize for current location of Chambly with this method, it could come out: Category: Chambly, Municipalité Régionale de Comté de la Vallée-du-Richelieu, Montérégie, Québec, Canada. Talk about a mouthful. And since MRCs are quite modern, we would then have to find a way to disambiguate from earlier versions of the same town prior to the formation of those.

PLEASE!  Let's keep things simple.

Don't forget, this was a discussion only, and not a proposal. The idea was to get this information out into the open since it has been requested many times. We see it constantly in location proposals, wanted categories, category renames, etc.

By all means let's get it out there. And I did understand this was only a discussion. I am attempting to illustrate some of the consequences which would happen.
+5 votes

Are we not restructuring categories for the sake of the exceptions? One might understand the duplicate occurrences of town names in different countries, even within countries and even states, being quite common. To concatenate the category name to include both place, and then county/province/state (effectively the next tier up) ensures the duplications are dramatically reduced. Surely only those categories which still suffer ambiguity issues after this level should be handled as exceptions to the rule. The shorter the place category name the better.
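One way to picture this rule is a small routine that starts from "place plus the next tier up" and only adds further tiers for the names that still collide. It is a minimal sketch under stated assumptions: the function name and the `min_parts` parameter are made up for illustration, the two "Newtown" entries are hypothetical data invented to force a collision, and the other two locations are the examples from the original post.

```python
def shortest_unique_names(locations, min_parts=2):
    """Map each location tuple (most specific part first) to its shortest
    prefix of at least `min_parts` components that no other location shares.
    If no prefix is unique, the full name is used."""
    names = {}
    for loc in locations:
        for k in range(min(min_parts, len(loc)), len(loc) + 1):
            prefix = loc[:k]
            clashes = sum(1 for other in locations if other[:k] == prefix)
            if clashes == 1:
                break
        names[loc] = ", ".join(loc[:k])
    return names


LOCATIONS = [
    ("Crosby", "Harris County", "Texas", "United States"),
    ("Hamburg", "Avoyelles Parish", "Louisiana", "United States"),
    # Hypothetical entries that still collide at "place, next tier up".
    ("Newtown", "Example County", "State A", "Country X"),
    ("Newtown", "Example County", "State B", "Country X"),
]

for full, short in shortest_unique_names(LOCATIONS).items():
    print(short)
```

Only the two colliding entries pick up a third component; everything else keeps the two-part name.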

ago by Andrew Field G2G6 Mach 1 (15.4k points)
+3 votes
I have long been a proponent of clarity over brevity, but I also think it is important to have regional flexibility.

My personal preference would be to have the country name as part of all locations as it adds considerable clarity and most especially tells members which location guidelines to consult.

But when it comes to defining what else constitutes a “full location,” for WikiTree purposes, I think we ought to defer to the regional projects which have the most knowledge about what aspects of location naming are most stable, which are truly part of the same hierarchy, which are most helpful in terms of clarity, and which are readily available.

For example, in the United States, many states have townships which are within counties, but are not towns, villages or communities.  They are important for purposes of deed descriptions to land and for census purposes, but they are not generally part of the full name or location of towns.  In other states, townships are used in a considerably different manner and have a great deal more importance.

Another problem with attempting to define “full location” globally is that, while the concepts of town/village/community and country work with most locations globally, everything else in the middle varies widely: different countries define things like counties very differently and change a lot of it over time.

So, I would rephrase the question and instead ask what the pros and cons would be of including the country name in all location categories.
ago by
edited ago
+3 votes
I understand Jack's problem with Wales only too well. We have the same problem with district names and border changes in South Africa, and there were more than just three or four of them over three and a half centuries. As such, we proposed to use the commonly known areas Cape, Transvaal, Natal, Free State and South West Africa, with one landing category for each town.

Unfortunately there is much resistance from the categorization community, who oppose this type of structure, which is based on a geographical concept rather than a historically correct naming convention. They would rather have several landing categories per town and then tell us that dates are not approved for use in category names. When you look at how the category picker proposes categories with just the landing category names, you have to adjust or you will just have one big mess. But to have these mile long names does not go down well either. The shortest possible name to identify and distinguish a place should be the criterion instead of a standard elaborate naming convention.
ago by Louis Heyman G2G6 Mach 4 (42.5k points)
edited ago by Louis Heyman
Louis, I am afraid your comments are incorrect.

There is no resistance from the Categorization project and no effort to create several landing categories where only one is needed. In fact it is quite the opposite, as we have offered to assist the South African Roots project many times in their structure documentation to ensure that a structure could be as easy to use as possible and fully documented for all to use.

You are also misinterpreting the discussion on dates, as the comments were meant to indicate that dates should be a part of the structure proposal, and not accepted to be a de facto standard in regional categorization.
I am very happy to hear that, thanks. It seems like we can now move forward.
