Beyond LNAB : Proposal for a new Naming Scheme

The Big Picture : We're One Family, but ...

This section has been added following some exchanges in the G2G discussion about this proposal. It seems necessary to set it in the framework of a more general question. What is the scope of WikiTree?

The author of this proposal is European. He has said many times on G2G that he joined WikiTree (in 2019) with some hesitation about "the English North American mold", to borrow the expression used in one of the G2G comments. The data model is part of this mold, as are the conventions, the language, and in particular the naming scheme.

Taking the WikiTree claim "We're One Family" at face value, one can make the naive and bold assumption that the "We" in this claim is to be taken with the largest possible scope, potentially encompassing all humans who have lived on Earth during the past 20 centuries (arbitrary but fair enough time limits set by WikiTree), provided they left behind some trustworthy document usable as a source. With this bold assumption in mind, the following facts are worth considering:

  • For about 75% of the WikiTree time span (before 1500), "English North American" was barely a concept.
  • During the remaining 25% of time, up to today, people easily fitting into the "English North American mold" represent less than 5% of the world population.

In other words, the current data model and naming scheme are to some degree unfit for over 95% of the potential profiles in the universal scope claimed by WikiTree. If they want to join or be added to WikiTree, the members of this 95% have to abide, willingly or not, by rules set by and for the 5%, and work out their own local adaptations.

Three years of experience trying to bring French amateur genealogists aboard, close relatives or not, with a failure rate of about 99%, and more generally the very poor adoption of WikiTree outside the Anglo-American world, show that something has to be done to reach out to that 95% and make them feel at home. The current proposal tries to be a step in this direction. Other steps will be needed, such as a multilingual interface, but that's another story.

Abstract

This page proposes a new naming scheme for WikiTree, with the following objectives:

  • Solving the many issues linked to the mandatory LNAB as the basis for WikiTree IDs
  • Conformance to largely shared naming standards
  • Usability for any kind of local naming conventions
  • More natural name display on WikiTree pages
  • Better search efficiency
  • Lossless transition from the existing data, no breaking of the existing WikiTree IDs

Issues with the current naming scheme

It has been a longstanding issue that the current naming scheme is difficult to use beyond its original cultural scope. The many difficulties it generates have been discussed over and over on G2G. Convoluted conventions have been proposed to enable the naming of profiles that do not fit easily into the model. But such conventions are difficult to understand, often counter-intuitive, and not always known by users, hence unevenly applied; when applied, they result in both a heavy, ugly name display and extreme difficulty in finding those profiles through internal search.

Building the WikiTree ID on the mandatory LNAB (Last Name At Birth) field is certainly the worst aspect of the naming scheme, and a focal point of frustration.

  • LNAB does not make sense at all in many cases
  • In many cases, decision of what should be the LNAB is not obvious
  • Even when it makes sense, it's not necessarily known at profile creation
  • Changing the LNAB afterwards requires the heavy process of creating a new profile, merging and redirecting, which puts extra load on server resources.
  • The LNAB is often not the name actually used in other records or sources
  • In several contexts, forcing the use of LNAB entails ugly convoluted conventions

The main technical advantage of LNAB is that it generates user-friendly wiki IDs, easier to remember than the unique database ID, which currently has 8 digits. It's easy to remember that I am identified by Vatant-1, which indicates that I was the first person bearing this name to have a profile in WikiTree. IDs of this kind should continue to exist, and any transition to a new naming scheme should of course preserve the legacy IDs.

The new scheme proposal

Three new fields are added to the profile:

  • Key Name : Mandatory, single value. This field replaces LNAB as the field used to generate the WikiTree ID. Its value is the LNAB by default, but it can be set to any other name as seen fit: if the LNAB is not known, if it does not make sense, or to align all family members on a single spelling when variations occur across siblings or generations, etc.
  • Display Name : Optional. A name under which the person is best known, as on Wikipedia for example. This field is freely editable. If non-empty, its value is displayed as the heading of the profile page, instead of the default concatenation Prefix + Preferred Name + (LNAB) + Current Last Name + Suffix. In standard librarian language this field would be called "Preferred Name", but unfortunately there is already a field with that label.

Change note (2022-06-22) : This field was made optional, to be used only if the default concatenation rule yields a heavy, convoluted, or otherwise unclear display.

  • Alternative Name : Optional, multiple values allowed.

Nice to have : A language tag on "Display Name" and "Alternative Name", enabling multilingual variants of the name and, in the future, display in the user's preferred language, etc.

All other existing fields are kept as they are, but all are optional. In particular "Proper First Name", "Preferred Name", "Last Name at Birth" and "Current Last Name" all become optional.
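
The field list above can be sketched as a small data structure. This is only an illustration: the attribute names are hypothetical, and the rule that the LNAB appears in parentheses only when it differs from the Current Last Name is an assumption about the default concatenation, not a description of the actual WikiTree code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Profile:
    # Hypothetical attribute names mirroring the fields of this proposal.
    key_name: str                              # mandatory, used to build the WikiTree ID
    display_name: Optional[str] = None         # optional, overrides the default heading
    alternative_names: List[str] = field(default_factory=list)
    prefix: Optional[str] = None               # all legacy fields become optional
    preferred_name: Optional[str] = None
    lnab: Optional[str] = None
    current_last_name: Optional[str] = None
    suffix: Optional[str] = None


def heading(p: Profile) -> str:
    """Return the Display Name if set, otherwise the default concatenation."""
    if p.display_name:
        return p.display_name
    parts = [p.prefix, p.preferred_name]
    # Assumption: the LNAB is parenthesised only when it differs from
    # the Current Last Name.
    if p.lnab and p.current_last_name and p.lnab != p.current_last_name:
        parts += [f"({p.lnab})", p.current_last_name]
    else:
        parts.append(p.current_last_name or p.lnab)
    parts.append(p.suffix)
    # Empty optional fields are simply skipped.
    return " ".join(part for part in parts if part)
```

With only a given name filled in, `heading` returns that given name alone, which is the behaviour wanted for profiles without any last name.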

The icing on the cake of this new model is its conformance to standard usage for thesauri and similar vocabularies, as expressed for example in the W3C Recommendation SKOS. The "Display Name" and "Alternative Name" fields map naturally onto the skos:prefLabel and skos:altLabel properties.
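
As a rough illustration of this mapping, the sketch below serializes the name fields of a profile as SKOS labels in Turtle syntax. The function names and the use of the profile URL as the resource identifier are assumptions made for the example. Display names are held in a dictionary keyed by language tag, which mirrors the SKOS rule that a resource has at most one skos:prefLabel per language.

```python
from typing import Dict, List, Tuple

PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


def literal(text: str, lang: str = "") -> str:
    """Format a Turtle string literal, with an optional language tag."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}' if lang else f'"{escaped}"'


def to_skos_turtle(
    uri: str,
    display_names: Dict[str, str],             # language tag ("" if none) -> name
    alternative_names: List[Tuple[str, str]],  # (name, language tag or "")
) -> str:
    """Map Display Name to skos:prefLabel and Alternative Name to skos:altLabel."""
    statements = [
        f"skos:prefLabel {literal(name, lang)}"
        for lang, name in display_names.items()
    ]
    statements += [
        f"skos:altLabel {literal(name, lang)}"
        for name, lang in alternative_names
    ]
    if not statements:
        return PREFIX  # nothing to say about this profile
    body = " ;\n    ".join(statements)
    return f"{PREFIX}\n\n<{uri}>\n    {body} ."


# Illustrative call; the URI is an assumed identifier for the profile.
turtle = to_skos_turtle(
    "https://www.wikitree.com/wiki/Kennedy-96",
    {"": "John F. Kennedy"},
    [("JFK", "")],
)
```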

A lossless migration

In order to avoid any information loss, the following migration rules are set for existing profiles.

  • The current value of LNAB is copied into the Key Name field. This way, the legacy WikiTree IDs and URLs are not broken. Afterwards, the LNAB is editable without changing the Key Name, so minor corrections of the LNAB can be made without changing the WikiTree ID, avoiding the creation of a new profile and a merge. That process would be needed only in the rare cases where the Key Name has to be changed for good reasons. After creation, the Key Name could be considered an opaque ID, and need not even appear in the standard editing form.
  • The "Display Name" and "Alternative Name" fields are left blank until edited. The current display rules are not changed, since they are fine for the majority of profiles. When the current rules result in a convoluted, ambiguous, or counter-intuitive display, the "Display Name" can be edited as seen fit. When "Display Name" is non-empty, its content overrides the result of the automatic display generation.
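
The two migration rules can be sketched as a single function. This is a minimal illustration, assuming a profile is represented as a dictionary with hypothetical keys such as `wikitree_id` and `lnab`; it says nothing about how the real database is organized.

```python
from typing import Any, Dict


def migrate(legacy: Dict[str, Any]) -> Dict[str, Any]:
    """Add the three new fields to a legacy profile without losing anything."""
    migrated = dict(legacy)                 # every legacy field is kept as it is
    migrated["key_name"] = legacy["lnab"]   # rule 1: Key Name := current LNAB
    migrated["display_name"] = None         # rule 2: left blank until edited
    migrated["alternative_names"] = []      # rule 2: left blank until edited
    return migrated


# Example taken from this page: the ID is untouched by the migration.
profile = migrate({"wikitree_id": "La_Tour_d'Auvergne-3", "lnab": "La Tour d'Auvergne"})

# Later correction of the LNAB: the Key Name and the ID stay the same.
profile["lnab"] = "de La Tour d'Auvergne"
```

Because the WikiTree ID is carried over unchanged and the Key Name is fixed at migration time, later edits of the LNAB no longer require a new profile and a merge.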

Examples

Simple cases : Key Name = LNAB

In the simplest cases, there is no need to change the LNAB after it has been copied into the Key Name field. This is the case for the majority of profiles.

Ex#1 : Bernard Vatant

  • Key Name : "Vatant"
  • LNAB : "Vatant"
  • ID : "Vatant-1"
  • Display Name : "Bernard Vatant"

Ex#2 : John Fitzgerald Kennedy (1917-1963)

  • Key Name : "Kennedy"
  • LNAB : "Kennedy"
  • ID : "Kennedy-96"
  • Display Name : "John F. Kennedy"
  • Alternative Name : "JFK"

Ex#3 : Jacqueline Lee (Bouvier) Onassis (1929-1994)

  • Key Name : "Bouvier"
  • ID : "Bouvier-19"
  • LNAB : "Bouvier"
  • Display Name : "Jacqueline Kennedy Onassis"
  • Alternative Name(s) : "Jacqueline Kennedy", "Jackie Kennedy", "Jackie Onassis"

Ex#4 : Mary (Stewart) Stuart Queen of Scots (1542-1587)

  • Key Name : "Stewart"
  • LNAB : "Stewart"
  • ID : "Stewart-6849"
  • Display Name : "Mary Stuart, Queen of Scots" @en
  • Display Name : "Marie Stuart, Reine de France et d'Ecosse" @fr

Key Name different from LNAB

In many cases, there are good reasons to disconnect the Key Name from the LNAB.

Euro Aristo

Ex#5 : Henri (La Tour d'Auvergne) de La Tour d'Auvergne (1611-1675)

  • LNAB (edited after migration) : "de La Tour d'Auvergne"
  • Key Name : "La Tour d'Auvergne"
  • ID : "La_Tour_d'Auvergne-3"
  • Display Name : "Henri de La Tour d'Auvergne"
  • Alternative Name(s) : "Henri, vicomte de Turenne" @fr, "Turenne"

In this case the Key Name is the legacy LNAB (without the particle "de"). The new LNAB includes the particle. This is a controversial topic, but the model allows the LNAB to be kept with or without the particle, with no impact on the Key Name.

Multiple LNAB

Ex#6 : Théodore Alexis (Borrelli) Borrelli de Serres (1769-1829)

  • Key Name : "Borrelli"
  • ID : "Borrelli-23"
  • LNAB : "Borrely", "Borrelly", "Borrelli"

The legacy LNAB, which gives the Key Name, was set to "Borrelli", but two other spellings are found in the baptism record: the priest writes "Borrely", while the family members sign either "Borrelly" or "Borrelli". Any or all of those forms could be set as LNAB afterwards. The one chosen as Key Name (Borrelli) is somewhat arbitrary, but is aligned with other members of the family.

Unknown LNAB

Ex#7 : Marguerite Le Doucen (abt.1673-1723)

  • Key Name : "Le Doucen"
  • ID : "Le_Doucen-1"
  • LNAB : unknown

No birth record is known; the legacy LNAB was found in the marriage record. After migration, the profile is edited and the LNAB set to "unknown". Note that keeping the legacy value in the LNAB for a while is not critical. It is known from the cultural context that an LNAB exists.

No LNAB

In some cases, the Last Name At Birth does not make sense at all. In such a case the field LNAB after migration, or at creation of a new profile, is left blank.

For example, abandoned children of unknown parents (many such cases are found in 19th-century cities) had only a given name such as "Claude". In that case the given name is used as Key Name, the ID is, e.g., "Claude-123456", and the LNAB field is left blank.
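
The way an ID derives from the Key Name in the examples on this page can be sketched as follows. The function names are hypothetical. The rule that spaces become underscores is inferred from the IDs quoted here ("Le_Doucen-1", "La_Tour_d'Auvergne-3"), and where the sequence number comes from is left out of the sketch.

```python
from typing import Optional


def choose_key_name(lnab: Optional[str], fallback: Optional[str]) -> str:
    """Use the LNAB by default, else any other name seen fit (e.g. a given name)."""
    key = (lnab or "").strip() or (fallback or "").strip()
    if not key:
        raise ValueError("Key Name is mandatory: provide an LNAB or a fallback name")
    return key


def build_wikitree_id(key_name: str, number: int) -> str:
    """Join the Key Name and a sequence number, replacing spaces with underscores."""
    return f"{key_name.strip().replace(' ', '_')}-{number}"


# An abandoned child known only as "Claude": no LNAB, given name as Key Name.
claude_id = build_wikitree_id(choose_key_name(None, "Claude"), 123456)
```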

Corner cases

Some profiles are just a nightmare to deal with in the current state of affairs.

Ex#8 : Roman (Kacew) Gary (1914-1980)

The current naming is the lesser evil under the current model. Best known as "Romain Gary", his French name, he was born in the Russian Empire (Vilna, today Vilnius, Lithuania) under the name "Рома́н Ле́йбович Ка́цев", one of whose several Latin transliterations is "Roman Kacew" (dropping the second name). He published under several pen names, the most famous being "Emile Ajar".

The current LNAB, and hence the Key Name after migration, is "Kacew", giving the ID "Kacew-1". The LNAB could be reset afterwards to the Cyrillic form "Ка́цев", and the various names sorted as follows:

  • Display Name(s) : "Romain Gary"@fr, "Рома́н Ле́йбович Ка́цев"@ru
  • Alternative Name(s) : "Emile Ajar"@fr, "Roman Kacew" ...
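
With language tags on the Display Name, the heading could follow the reader's language preferences. The sketch below shows one possible selection rule. The function name and the fallback policy (an untagged name first, otherwise the first name listed) are assumptions, not part of the proposal.

```python
from typing import List, Optional, Tuple

# A tagged name is a (text, language tag) pair; the tag is None when absent.
TaggedName = Tuple[str, Optional[str]]


def pick_display_name(names: List[TaggedName], preferred_langs: List[str]) -> Optional[str]:
    """Return the Display Name in the first preferred language that is available."""
    for lang in preferred_langs:
        for text, tag in names:
            if tag == lang:
                return text
    # Assumed fallback: an untagged name if there is one, else the first listed.
    for text, tag in names:
        if tag is None:
            return text
    return names[0][0] if names else None


gary = [("Romain Gary", "fr"), ("Рома́н Ле́йбович Ка́цев", "ru")]
for_russian_reader = pick_display_name(gary, ["ru", "en"])
for_english_reader = pick_display_name(gary, ["en"])
```

Here a reader who prefers Russian gets the Cyrillic form, while a reader whose language has no matching variant falls back to "Romain Gary", the first name listed.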

FAQ

This section tries to address the various arguments raised in the G2G discussion.

Too much complexity

Argument : The current naming scheme is already complex enough. Adding new fields will add to the burden of editing and create more potential for confusion.

Answer : For the majority of existing profiles, where the LNAB is known and makes sense and the default concatenation rule generates a clear name, nothing will change. The three new fields could even be set as "hidden" in user preferences.

Too costly

Argument : Adding new fields to the database has a cost, and similar proposals for new fields have already been refused.

Answer : The cost has to be assessed, but it should be weighed against the benefits.

The current model is OK for the majority of profiles

Argument : Why change the model when it's OK for most profiles? The minority for which the model is unfit can adapt, local projects can set rules for how to use the model in "foreign" contexts, like how to fill the LNAB.

Answer : The profiles for which the current model is OK will see no change. The LNAB will be used as Key Name as it is now, and Display Name and Alternative Name are optional.

Dropping LNAB as key will create more duplicates

Argument : The LNAB is a common reference basis. Without this reference, people will use anything as Key Name and matching new profiles will be more difficult, leading to the creation of more duplicates.

Answer : The matching algorithm uses the LNAB, but also the other name fields, and the dates. Matching efficiency is an issue independent of the field used to build identifiers.

Proliferation of names will make search less efficient

Argument : Letting people drop anything in the new name fields is likely to bring confusion in the search results.

Answer : A good search engine is happy to leverage a variety of names. A full-text search on the Display Name and Alternative Name fields is likely to make it easier to find profiles matching, e.g., "Queen Victoria". The matching algorithm can also benefit from those new fields.
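
To make this answer concrete, here is a toy search over the Display Name and Alternative Name fields, using profiles quoted on this page. It is a plain substring match that ignores case and accents, standing in for a real full-text engine; the dictionary keys and function names are hypothetical.

```python
import unicodedata
from typing import Any, Dict, List


def normalize(text: str) -> str:
    """Strip accents and case so that 'Émile' and 'emile' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


def search(profiles: List[Dict[str, Any]], query: str) -> List[str]:
    """Return the IDs of profiles whose display or alternative names contain the query."""
    needle = normalize(query)
    if not needle:
        return []
    hits = []
    for p in profiles:
        names = [p.get("display_name") or ""] + list(p.get("alternative_names", []))
        if any(needle in normalize(name) for name in names):
            hits.append(p["id"])
    return hits


profiles = [
    {"id": "Kennedy-96", "display_name": "John F. Kennedy", "alternative_names": ["JFK"]},
    {"id": "Bouvier-19", "display_name": "Jacqueline Kennedy Onassis",
     "alternative_names": ["Jacqueline Kennedy", "Jackie Kennedy", "Jackie Onassis"]},
    {"id": "La_Tour_d'Auvergne-3", "display_name": "Henri de La Tour d'Auvergne",
     "alternative_names": ["Henri, vicomte de Turenne", "Turenne"]},
]
```

A query such as "turenne" or "jfk" reaches the right profile through the new fields alone, without the searcher having to know the LNAB.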

Genealogy needs a specific naming scheme (not Wikipedia)

Argument : A genealogical site should have a stricter naming policy than a general site.

Answer : More constraints on the model means less usability outside its original context. The target is to allow the model to fit the largest scope possible.




