
England Project - Identification of Profiles

The current system

A Lancashire profile is one where a person was born, married or died in Lancashire. A person who was born in Cheshire, married in Lancashire and died in Yorkshire would be counted in all three counties.
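As an illustration of this counting rule, here is a minimal Python sketch. The profile records, the field names and the `counties_for` helper are all hypothetical, and each event is assumed to have already been resolved to a county. It also assumes that a profile is counted once per county, even when more than one of its events falls there.

```python
from collections import Counter

# Hypothetical, simplified profiles: each event is already resolved to a county.
profiles = [
    {"id": "Example-1", "birth": "Cheshire", "marriage": "Lancashire", "death": "Yorkshire"},
    {"id": "Example-2", "birth": "Lancashire", "marriage": None, "death": "Lancashire"},
]

def counties_for(profile):
    """Return the set of counties a profile touches via birth, marriage or death."""
    return {profile[event] for event in ("birth", "marriage", "death") if profile[event]}

county_totals = Counter()
for profile in profiles:
    # Using a set means a profile adds one to each county it touches,
    # however many of its events fall in that county (an assumption).
    county_totals.update(counties_for(profile))
```

Here Example-1 adds one to each of Cheshire, Lancashire and Yorkshire, while Example-2 adds one to Lancashire only.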

Ideally, every profile would follow WikiTree’s protocol of Town/Village/City, County, Country, properly spelt and punctuated. When the protocol is followed, the profile is included in the county statistics for total profiles, Unsourced, Suggestions, etc. The following forms for Ancoats, a district of Manchester, are recognised:

  • Ancoats, Lancashire, England
  • Ancoats, Lancashire, England, United Kingdom
  • Ancoats, Manchester, Lancashire, England
  • Ancoats, Manchester, Lancashire, England, United Kingdom

Unfortunately, a significant number of profiles don’t follow the protocol; consequently, these profiles are not included in county reports. They are either counted as England profiles, or are not allocated to any country.

The new system

An improved process for analysing location fields and allocating profiles to a county has been developed. This gives scope for some poorly formatted variants in location fields to be recognised.

In addition to well-formatted profiles, towns, cities and districts followed by any of the terms listed below will now be counted as Lancashire, England profiles:

  • Lancashire, England, UK
  • Lancs, England
  • Lancashire
  • Lancashire England
  • Lancashire, UK
  • Lancashire, United Kingdom

The same approach will be applied to other counties.
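The matching described above could be sketched roughly as follows. This is only an illustration: the `LANCASHIRE_ENDINGS` list is a hypothetical stand-in for the Lancashire rows of the Locations Table, the `normalise` and `is_lancashire` helpers are invented for the example, and ignoring letter case and repeated spaces is an assumption rather than a documented rule.

```python
# Hypothetical stand-in for the Lancashire entries in the Locations Table.
LANCASHIRE_ENDINGS = [
    "Lancashire, England",
    "Lancashire, England, United Kingdom",
    "Lancashire, England, UK",
    "Lancs, England",
    "Lancashire",
    "Lancashire England",
    "Lancashire, UK",
    "Lancashire, United Kingdom",
]

def normalise(text):
    """Collapse repeated whitespace and ignore case (an assumption)."""
    return " ".join(text.split()).casefold()

def is_lancashire(location):
    """True if the location is a place name followed by an accepted ending."""
    loc = normalise(location)
    for ending in LANCASHIRE_ENDINGS:
        # Require a comma before the ending so that it follows a town,
        # city or district rather than matching part of a longer word.
        if loc.endswith(", " + normalise(ending)):
            return True
    return False
```

For example, `is_lancashire("Ancoats, Lancs, England")` and `is_lancashire("Ancoats, Manchester, Lancashire")` are both true under these assumptions, while `is_lancashire("Ancoats, Cheshire, England")` is false.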

These variants are contained in a Locations Table maintained by Ales. It is planned to switch to the new system on 15 October.

A working party has been reviewing the impact of making the switch; all counties should see an immediate increase in their profile count. There will also be resultant (and possibly disproportionate) increases in Suggestions, Unsourced, Unconnected and Unknown figures.

We are not accepting a drop in standards. Poorly formatted variations should be corrected. In due course, abbreviations such as Lancs. and other variants will be identified as suggestions and flagged on Lancashire and Data Doctors reports.

New Counties

In addition to our existing counties, statistics will be produced for Greater Manchester and West Midlands.

Next Steps

There is scope not only to incorporate more variations in the punctuation and abbreviation of counties, but also to add cities and large towns to the Locations Table, so that more profiles can be identified at county level. This will be considered by the working party as a second phase of this project. Changes will be communicated on an ongoing basis.
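One way of deciding which variations to add first is to count how often each unrecognised ending occurs. The sketch below is illustrative only: the `unmatched` sample data and the `last_part` helper are hypothetical, and it simply looks at the final comma-separated part of each location field.

```python
from collections import Counter

def last_part(location):
    """Return the final comma-separated part of a location field."""
    pieces = [p.strip() for p in location.split(",") if p.strip()]
    return pieces[-1] if pieces else ""

# Hypothetical sample of location fields not recognised by the table.
unmatched = [
    "Ancoats, Lancs.",
    "Bolton, Lancs.",
    "Preston, Lancs.",
    "Salford, Manchester",
    "Ancoats, Manchester",
    "Newark, NTT",
]

ending_counts = Counter(last_part(loc) for loc in unmatched)
# The most common endings are the ones that would bring in the most
# profiles if they were added to the table or flagged for correction.
top_endings = ending_counts.most_common(3)
```

Any ending that ranks highly would still need checking by hand, since a town or city name on its own may also match places outside England.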





Collaboration


Comments: 13

On rare occasions I have seen counties being entered with their Chapman code, e.g. NTT for Nottinghamshire. I've no idea how many of these there are; is it worth considering them as well?
posted by Derrick Watson
Hi Derrick

Thanks for the question.

The next stage of the process will be (mainly) driven by data.

A query will be run to show the working party the most frequently used, poorly formatted locations that can be allocated to a county or to England. From there, they will be 'flagged' on reports to be corrected by County Teams and/or Profile Improvement Teams.

An old version of the query would suggest that we will pick up the most profiles if we add large towns and cities to the Locations Table. The working party will assess the data to ensure that we don't pick up significant numbers of non-England profiles.

There is a balance to be found of helping us to identify more England profiles and making good use of Ales' time. I think we will get to the stage where we use bespoke Data Doctors reports to find profiles with location fields that need attention rather than asking Ales to add more locations to the table.

Regards Steve

posted by Steven Whitfield
Hi Steve,

Thanks for fixing this. Looking at the latest version of the table, the only significant one that still appears to be missing is "Isle of Wight, Hampshire, England, United Kingdom" (which is the correct location from 1801 until 1890, after which the Isle of Wight was no longer officially part of Hampshire).

Regards,

Paul

posted by Paul Masini
Thanks Paul

The process we are going through is primarily about identifying profiles that are missed by the current system, and then allocating them to a county team for review (if needed).

"Isle of Wight, Hampshire, England, United Kingdom" will be allocated to Hampshire and included in their stats and improvement reports because it ends with "Hampshire, England, United Kingdom", which is a variation in the Hampshire table.

Regards Steve

posted by Steven Whitfield
Thanks Steve

In that case "Isle of Wight, Hampshire, England" should not appear either, since exactly the same logic applies to that too. (I only suggested "Isle of Wight, Hampshire, England, United Kingdom" should be added because I saw the one without "United Kingdom" was there - logically the list should include either both or neither, not one without the other).

Regards,

Paul

posted by Paul Masini
You're absolutely right Paul. There are already several inconsistencies in the table, which is not ideal. The table was designed for a different purpose from the one we are using it for. We are still evolving our understanding of how things work and finding the best way of specifying what we need.

I'm sure that as we progress we will improve this.

Regards Steve

posted by Steven Whitfield
Hi Steve,

I think there might be a problem at the moment with allocating the unsourced profiles from the Isle of Wight that don't have Hampshire in the location as well. I think they are supposed to appear in https://www.wikitree.com/wiki/Automated:DD_Unsourced_List_ENG_HAM, but it looks like at the moment they are actually in https://www.wikitree.com/wiki/Automated:DD_Unsourced_List_ENG_UNK.

See for example https://www.wikitree.com/wiki/Casford-251 in https://www.softdata.si/wt/Unsourced_20221016/ENG_UNK/2_1700-1799_0.htm or https://www.wikitree.com/wiki/Hill-13332 in https://www.softdata.si/wt/Unsourced_20221016/ENG_UNK/2_1800-1899_0.htm.

Hopefully just a small tweak is required to get them in the right place.

Regards,

Paul

posted by Paul Masini
Hi Paul

Thanks for bringing this to my attention. I will look into it and hopefully we can get it sorted quickly.

I'll post on here when I have an answer.

Regards Steve

posted by Steven Whitfield
Hi Paul

The source of the problem has been identified. I have been told it should be sorted in next week's figures.

I very much appreciate you bringing this to my attention.

Regards Steve

posted by Steven Whitfield
Although I think the Isle of Wight ceased to be part of Hampshire in about 1890?
posted by Sarah Long
Hi Sarah

You have reminded me of an important point that we haven't covered that is perhaps worth making.

In our statistics, we don't allocate profiles to county teams based on strict county boundaries (that, like the Isle of Wight, might have changed over time).

With this new process we are trying to identify more profiles as England profiles and ideally to allocate them to an appropriate team for review. Under the old system, about 4% of the profiles we will now report on were previously not identified as England profiles or sat in a very large 'England' pot rather than being on a county team's report for review.

Regards Steve

posted by Steven Whitfield
I notice that the new location table only has one entry for the Isle of Wight, which is the plain "Isle of Wight" with nothing else, which occurs on 905 profiles. However according to https://www.wikitree.com/wiki/Space:England%2C_Regional_and_County_Statistics_Page there are about 13,000 profiles from the Isle of Wight. I think the table needs to have the correctly formatted "Isle of Wight, England" and "Isle of Wight, England, United Kingdom" added to it. Other incorrect variations could also be added for consistency with the other counties.
posted by Paul Masini
Hi Paul

Thanks for picking this up. I have been sorting this out with Ales over the last hour or so.

By working through this process, we have identified that many of the Isle of Wight profiles were in fact being counted twice under the old system, in Hampshire and in the Isle of Wight lines of the England statistics. This needed sorting.

The table header for this section has now been amended to England, Hampshire, Isle of Wight, which will mean that Isle of Wight profiles will be included in Hampshire and not duplicated; we have also added the England and United Kingdom variations (as well as IOW and I.O.W.)

If members want to work specifically on Isle of Wight profiles (or any other town for that matter) we will have to run tailored reports in Data Doctors.

Regards Steve

posted by Steven Whitfield