Database Project Error: mark profiles with odd location pattern

+9 votes

Suggestion: Some of the early GEDCOM imports to WikiTree are unsourced, and when I look at some Swedish imports, the birth locations chosen for the children are totally crazy for the early 1800s....

For me it's an indication that the genealogy was not done correctly using the church books, but more in an Ancestry click-and-forget style... ;-)


1) Extract the birth and death locations for the father, mother, and children, and check whether a person could plausibly have moved that much during that time period. If not, flag it as a warning.
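A minimal sketch of what such a check might look like, assuming each location has already been resolved to coordinates. The function names (`haversine_km`, `location_warnings`), the 500 km threshold, and the sample coordinates are all assumptions made for illustration; nothing here comes from the actual Database Errors code.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(a)))


def location_warnings(parent_coords, child_coords, max_km=500.0):
    """Return (child, nearest_km) pairs for children born farther than
    max_km from every known parent location."""
    warnings = []
    for child, (clat, clon) in child_coords.items():
        if not parent_coords:
            continue  # nothing to compare against, so no warning
        nearest = min(
            haversine_km(clat, clon, plat, plon)
            for plat, plon in parent_coords.values()
        )
        if nearest > max_km:
            warnings.append((child, round(nearest)))
    return warnings


# Approximate, illustrative coordinates (hypothetical family, not a real profile).
parents = {"father birth": (59.33, 18.07), "mother birth": (57.71, 11.97)}
children = {"child A": (57.71, 11.97), "child B": (48.86, 2.35)}
print(location_warnings(parents, children))
```

The threshold would probably need to vary by country and time period, and a result should be treated as a warning to look for sources, not as proof of an error.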

in WikiTree Tech by Living Sälgö G2G6 Pilot (304k points)
retagged by Dorothy Barry

4 Answers

+9 votes
Best answer
Such checking could be done, but let us wait for location standardisation from Chris. Then we will build on that.

You probably mean that the parents were born in Sweden, died in Sweden, and had a child born in France. Quite unlikely before the 20th century.
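The country-level version of this check could be sketched roughly as follows. It assumes that place strings end with the country name, which is exactly what location standardisation would have to guarantee; `country_of`, `foreign_birth_warning`, and the 1900 cutoff are hypothetical names and values chosen for the example.

```python
def country_of(place):
    """Last comma-separated part of a place string, or None if empty."""
    parts = [p.strip() for p in place.split(",") if p.strip()]
    return parts[-1] if parts else None


def foreign_birth_warning(child_place, child_birth_year, parent_places,
                          cutoff_year=1900):
    """True if the child was born before cutoff_year in a country that
    none of the parents' birth or death places mention."""
    child_country = country_of(child_place)
    parent_countries = {country_of(p) for p in parent_places} - {None}
    if child_country is None or not parent_countries:
        return False  # not enough data to judge
    return child_birth_year < cutoff_year and child_country not in parent_countries


# Hypothetical example: parents recorded only in Sweden, child in France.
print(foreign_birth_warning(
    "Paris, France", 1810,
    ["Stockholm, Sweden", "Uppsala, Sweden"],
))
```

A comparison on raw strings like this breaks as soon as a place omits the country or uses a historical name, so it can only produce a hint, and border changes over time would still cause false warnings.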
by Aleš Trtnik G2G6 Pilot (822k points)
selected by Dale Byers
Even with the location standards it could result in problems. For example, the family of one famous American moved to Missouri in the 1800s when that was still under Spanish rule, but had children born in the same location after the United States controlled it. Could there still be errors? Yes. But the false errors would probably vastly outnumber the actual errors.

Yes Aleš Trtnik 

I feel the lesson learned is that we have too many people who died in Y, Somme, and that genealogy in 2016 without sources is asking for problems....

Maybe marking some GEDCOM imports as more or less good would also be a way of getting a warning....

No one wants to climb someone else's family tree.

You already have an error for unsourced so why look for problems that would be solved by just finding proper sources? Even back in the 1400's not everyone lived and died in the same area so find the source, proof, before you call it an error.

Let's fix the problems we already know about before we try to find any more. Adding more errors will only slow the progress being made on the current ones, and fixing those could actually eliminate most of the errors your proposal would find without doing anything else.

Dale, by your logic Project:Database_Errors has slowed us down and shouldn't have started until Category Unsourced = 0.

My feeling is that we get more errors every week with new people uploading GEDCOM files..... and as long as we don't have more validations or do some backend cleaning, we will have more and more errors....

Aleš: "wait for location standardisation from Chris. Then we will build on that."

I am afraid that standardisation will move us away from using the location field to find errors. My understanding is that we should use historically correct names, so Google Maps and other tools will not understand the locations, and we will have problems plotting the locations on a map.

Magnus, according to the latest figures there are fewer errors this week than last week, so we are making progress. I just feel that adding more errors to the project would be counterproductive, as we would then start moving the wrong way, with more errors than the week before, and that could cause people to give up. Most of the "problems" you are proposing to find will be eliminated just by finding and adding sources, which I will be getting back to after this comment.

I will spend my time looking for solutions rather than looking for more problems.
+6 votes
I can't agree with that, although I don't claim to know Sweden inside out. But it is too open to personal interpretation, isn't it?

Could you give an example?
by anonymous G2G6 Pilot (285k points)
I think the problems become obvious when a person with no local knowledge starts doing genealogy in a foreign area.

I will come back with examples....

Here is one that I feel deserves a small warning. No sources other than links to Ancestry that add no value... My guess is he thought Älvsborg and Älvsåker were the same place or.....

See the map: it's 1340 km, a long trip in 1781... not impossible but....

Father born January 5, 1781 in Älvsborg, Norrbotten, Sweden

Grisebo, Älvsåker sn, Hallands län

Hjälmared 2 Torp, Älvsåker sn, Hallands län, (N) 


Grisebo Torp, Älvsåker sn, Hallands län, (N) 


On the above profile we have the following warnings:

  1. No sources other than an Ancestry family tree
    1. Difficult for a computer to track when we don't use templates
  2. Location pattern (see above)
  3. It's a GEDCOM import
  4. It has never been edited
  5. Ancestry links are "empty"
OK I see what you mean, Magnus, but I tend to agree with others that it would turn up more false errors than true ones. It's surprising how much people did get around in the 18th C.
+6 votes
This could result in more False Errors than accurate ones

Example: One family I track has a father born in Texas, United States, who died in Ohio, United States, with at least one child born in Canada. Far too often there are several children born in different locations to parents whose birth and death locations match none of their children's. This does not make it an error, but rather an example of a mobile society.
by Dale Byers G2G Astronaut (1.7m points)

As Mr Aleš Trtnik can check patterns on name and gender.....

Why can't we see patterns of how people are moving in different countries during different time periods....

If we add more structured data, we can more easily get better quality. I would like to see structured data for

  1. locations
  2. timeline
  3. sources
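To make "structured data" concrete, here is one possible shape for such a record, sketched with plain Python dataclasses. The class and field names (`Event`, `Profile`, `timeline`, `unsourced_events`) and the sample profile ID are hypothetical; this is not how WikiTree stores profiles.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    kind: str                      # e.g. "birth", "death", "residence"
    year: Optional[int] = None     # None when the year is unknown
    place: Optional[str] = None    # None when the place is unknown
    sources: List[str] = field(default_factory=list)


@dataclass
class Profile:
    profile_id: str
    events: List[Event] = field(default_factory=list)

    def timeline(self):
        """Events with a known year, oldest first."""
        dated = [e for e in self.events if e.year is not None]
        return sorted(dated, key=lambda e: e.year)

    def unsourced_events(self):
        """Events that carry no source at all."""
        return [e for e in self.events if not e.sources]


# Hypothetical profile, invented for the example.
p = Profile("Example-1", [
    Event("death", 1850, "Halland, Sweden"),
    Event("birth", 1781, "Norrbotten, Sweden", ["church book"]),
    Event("residence", None, "Halland, Sweden"),
])
print([e.kind for e in p.timeline()])
print([e.kind for e in p.unsourced_events()])
```

With locations, dates, and sources held as separate fields, checks for location patterns or missing sources become simple queries instead of text parsing.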
Magnus, the information you need for your suggestion on my answer is not in the database dump he gets, and is not even in most profiles at this time. For that reason alone, even the best programmer could not check for every location someone ever lived.

Dale, I don't follow you. If you are commenting on my latest remark, is that a comment that we should have more templates ==> we get more structured information ==> we can use our family tree for more interesting things.....

I guess templates will soon appear in the database dumps, or a template can be created so that it's machine readable.

Isenhour-40 is machine readable (see Google Structured reading tool). I think we can do templates so that the HTML generated is machine readable ==> we can check for better quality in our family tree data....

Magnus, 1st, most profiles that would be affected by your change, including the one I refer to, have no templates on them. 2nd, the profile you referenced has the father and his son born in the same state, so it would not show as an error and is well within the realm of probability. I believe that citing a birth location as an error based only on the birth and death locations of the parents is never a good idea without sources to say it is wrong. I have a close relationship with the family of the person I used as an example and can tell you that not all of the locations the father lived in are listed anywhere in the profile. So even if the biography were read by a computer and included in the data dump, nothing in his profile would suggest that his child was born in Canada, but they were, and that is an actual fact.

Adding a category template for every location a person ever lived could result in a very large number of templates and an excessively long biography that adds no useful information except in very limited cases. Just for me alone I would have to have 13 locations listed for where I lived and only 2 would link to any useful information in my biography to date.

>> 2nd the profile you referenced has the father and his son born in the same State

Isenhour-40 was just an example of how a WikiTree profile is already machine readable today. It has nothing to do with finding errors.

Regarding templates, I think most people on WikiTree agree with you and are not interested in generating migration maps and history timelines directly from the Wiki profile.

Below: a Wikidata query with the result displayed on a map

Wikidata query tool (a lot of the data is from Wikipedia and templates)

As my final comment on this thread let me say that yes I do look at the error report and after checking those that are connected with my watchlist I fixed a couple of minor errors but soon discovered that the majority of the errors on the list for me would need either better sources or else I would just be guessing, and that is what caused the problems in the first place. With that in mind and given the fact that I have been trying to add sources and improve the profiles on my watchlist I have decided that the way I was doing things is the best way to remove the errors.

I choose to work smarter, not harder.
+4 votes

In my genealogy research, I have a rule of thumb that I use based off of statistics I had read about. Most people who have ever lived never move further than about 20 miles from where they're born; this is particularly true of pre-automobile and pre-airplane eras with notable exceptions for certain populations like the early American colonies. Individual data points vary widely.

So here are some articles about how far people tend to wander during their lives.

Human Migration, Wikipedia

Pew Research Center: Who Moves? Who Stays Put? Where’s Home?

Lifetime Mobility in the United States: 2010


Reconstructing the Lifetime Movements of Ancient People: A Neolithic Case Study from Southern England

Got any more links?

Also, the genealogy data that we're generating would be uniquely suited to the study of people's lifetime variance in distance from place of birth.
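As a rough sketch of what such a study could compute, the snippet below summarises a list of birth-to-death distances and reports the share that falls within the 20-mile rule of thumb. The function name `mobility_summary` and the sample distances are made up for illustration; real input would have to come from geocoded profiles.

```python
from statistics import median, pstdev

KM_PER_MILE = 1.609344


def mobility_summary(distances_km, radius_miles=20.0):
    """Median, population standard deviation, and the share of people
    whose birth-to-death distance is within radius_miles."""
    if not distances_km:
        raise ValueError("need at least one distance")
    radius_km = radius_miles * KM_PER_MILE
    within = sum(1 for d in distances_km if d <= radius_km)
    return {
        "median_km": median(distances_km),
        "stdev_km": pstdev(distances_km),
        "share_within_radius": within / len(distances_km),
    }


# Made-up distances in km, one per person (not real data).
sample = [0, 10, 20, 1340]
print(mobility_summary(sample))
```

A single long-distance move dominates the standard deviation, which is one reason the median and the within-radius share are probably the more useful figures here.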

by Ian Mclean G2G6 Mach 1 (13.7k points)

Thanks for sharing. I hope the WikiTree community starts to realize, with the Database Errors project, how many benefits you get if you add data as data.

.... the crazy thing in Sweden is that "normal" people never moved, and then starting in 1850 some went to America, and by 1900 about 1 million people had left. Chicago was the city with the 2nd largest Swedish population, and you would normally know more people in Chicago than in Stockholm...

Yesterday I learned that some people also left for New Zealand because it was a cheaper trip...

Thanks for sharing.....

My hope is that we start using the structured data in Wikitree to generate maps on the fly and hopefully also get Timelines of structured info we add....

Adding data as Data is making genealogy much more interesting 

Just getting statistics on causes of death for different time periods and regions would be interesting... or seeing how the value in the estate inventory changed over time, or differed between parts of the country, would be cool to have as structured data in the WikiTree profiles....
