New dump is being finalized: Biography preview

+18 votes
540 views

I received the first preview of the biography data from Chris and have some questions and examples. I got 1% of the complete database, and the profiles are all from the early days of WikiTree, so I think newer biographies will look better on average.

Longest profiles (longer than 100K):

http://www.wikitree.com/wiki/Brown-665
(a lot of duplicated sources)
http://www.wikitree.com/wiki/Turner-176
(a lot of duplicated sources, nothing visible unless in edit mode)
http://www.wikitree.com/wiki/Hart-69
(Nice extended profile)
http://www.wikitree.com/wiki/Mayo-13
(Nice extended profile)
http://www.wikitree.com/wiki/De_Forest-15
(Nice extended profile)
http://www.wikitree.com/wiki/Sisson-12
(Nice extended profile)

23655 profiles are shorter than 100 letters, for example:

  • (all GEDCOM data imported)
  • This person was created on 09 March 2010 through the import of arie.ged.
  • ...

http://www.wikitree.com/wiki/Kosta-1

http://www.wikitree.com/wiki/Conn-5

 

With such a ratio, an "Empty profile" error check would flag too many profiles.
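One way to keep an "Empty profile" check from drowning in hits might be to ignore section headings and the GEDCOM import line before counting. This is only a rough sketch: the function names are made up, the import-line pattern is modelled on the single example quoted above, and the 100-letter threshold is reused from the count above as an assumption.

```python
import re

# Assumption: import notices follow the wording of the example above.
IMPORT_LINE = re.compile(
    r"This person was created on \d{1,2} \w+ \d{4} "
    r"through the import of \S+\.ged\.?",
    re.IGNORECASE,
)
# Wiki section headings such as "== Biography ==" on a line of their own.
HEADING = re.compile(r"^=+[^=\n]+=+[ \t]*$", re.MULTILINE)


def meaningful_length(wikitext: str) -> int:
    """Count letters and digits left after dropping boilerplate."""
    text = IMPORT_LINE.sub("", wikitext)
    text = HEADING.sub("", text)
    return sum(ch.isalnum() for ch in text)


def is_effectively_empty(wikitext: str, threshold: int = 100) -> bool:
    """True when fewer than `threshold` meaningful characters remain."""
    return meaningful_length(wikitext) < threshold
```

Profiles that contain only the import notice could then be reported separately from profiles that are short but hand-written.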

 

Is this just the recommended form of a bio, or is it obligatory?

 

== Biography ==

...

== Sources ==

...

Enough for a beginning.

WikiTree profile: Emily Turner
in WikiTree Tech by Aleš Trtnik G2G6 Pilot (808k points)
retagged by Maggie N.
Done.

It contained the same facts repeated over and over and over and over ... plus a duplicated profile with the same facts repeated over and over and over and over. I probably left more info than I need to.

If someone would like to track down some more accessible source citations, that would be awesome. I'm out of time now and have to go run a bunch of errands.

Cleaning GEDCOM 

I'm starting to feel that Talk pages would have been nice.

It would be best if the profile showed only the nice information. The "ugly" GEDCOM could be moved to the talk page, along with discussions like the one below about birth locations.

Birth location 

Maybe it's just the same location explained in 4 different ways.... why didn't they use GPS and smartphones?

  1. London, England
  2. Middx City, Middlesex, England
    1. Middlesex is greater London
  3. London City
  4. Aldersgate, Middlesex, England
    1. It looks like "Aldersgate" indicates whether it was inside or outside the city walls (see link)
    2. More about Aldersgate, a gate in the London Wall

 


Big pic Aldgate is NE

Magnus --

You should add your notes to the profile in a Research Notes section ;-)

Julie, this is pre-pre-research, but I will do that and be a good citizen in WikiTree land. I hope to pass by London in the next month, so then maybe I will have more to add to the research section.

PS: I connected Brown-665 to this G2G topic, at least.

Nice!!
We need something like a separate research notes page where our entries cannot be deleted by anyone else, including the PM.
You mean something like the "Talk"-page on Wikipedia?
Not sure we should make notes that can't be deleted by anyone.  We've had some malicious members in the past, and some that get carried away sometimes and leave rude messages.
I'm not involved in Wikipedia, so no idea what Talk page is like.

And I didn't mean the staff couldn't delete, just something like a forum where only moderators and admin can delete.

Re Julie: I did some checking about the London Wall, and it looks like Wikimedia/Wikipedia has a cool new function for annotating pictures. It feels like it would be magic to have it on WikiTree and be able to comment on old pictures.

Picture Link
Video

3 Answers

+8 votes
 
Best answer

Hi Aleš,

"Recommended vs. obligatory" ... that's actually a hard question to answer.

I think it's the same question as this from the style FAQ:

Is it forbidden to break the style rules?

We don't usually use the word "forbidden" when talking about style rules.

Things like pornography and spam are forbidden through our legal Terms of Service. The points of the Honor Code, such as those on courtesy and citing sources, are rules that all active members are expected to follow. Styles and standards are more like guidelines. Style rules are the community consensus for what should be done.

That said, we strongly recommend against using anything other than recommended styles, especially on Open profiles. If you do something that isn't specifically recommended on private or free-space profiles, you do so at your own risk. See below.

 

by Chris Whitten G2G Astronaut (1.5m points)
selected by Maryann Hurt
+10 votes
Duplicated headings could be an indicator that the bio was not edited after a merge.

== Sources ==

== Sources ==

http://www.wikitree.com/wiki/Clarke-93

http://www.wikitree.com/wiki/Kidd-37
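A check like this could be sketched roughly as follows. The function name is hypothetical, and it assumes headings are written in the usual `== Title ==` wiki form with matching markers on both sides.

```python
import re
from collections import Counter

# Captures the "=" run and the title; \1 requires the same run on the right.
HEADING = re.compile(r"^(=+)[ \t]*([^=\n]+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def duplicated_headings(wikitext: str) -> list[str]:
    """Return lower-cased heading titles that occur more than once."""
    counts = Counter(
        title.strip().lower() for _, title in HEADING.findall(wikitext)
    )
    return sorted(title for title, n in counts.items() if n > 1)
```

A non-empty result (for example `sources` appearing twice) would then be reported as a possible unedited merge.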
by Aleš Trtnik G2G6 Pilot (808k points)
Will the bio be compared with the dump?
I don't quite understand.

The bio will be included in the new dump that Chris is preparing, so we can create new error checks based on the biography part of the profile.
I was thinking that another indicator of a bio not being revisited would be when a DOB is revised in the top part but not explained in the bio.
I understand what you mean.

No. Computers are still not smart enough for such things. Maybe in 20 years or more.
+8 votes
I could also validate all links on the profiles.

One error could cover the DNS part of the URL, i.e. the host name no longer resolves.

The other would cover the whole URL, but that might be a problem, since some source links require a login to access the data. Those could be identified and ignored for standard sites (ancestry, findagrave,...). For other sites, should such links even be allowed?
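The first two steps (skip the standard sites, then test the DNS part) might look roughly like this. It is a sketch, not the actual checker: the function names, the result labels and the `KNOWN_SITES` list are all assumptions, and fetching the whole URL is left out. The resolver is a parameter so the logic can be tried without network access.

```python
import re
import socket
from urllib.parse import urlsplit

# Rough URL matcher for wiki text; stops at whitespace and bracket characters.
URL_RE = re.compile(r"https?://[^\s\]\[<>|}\"']+")

# Assumption: hosts treated as "standard sites" whose links are not checked.
KNOWN_SITES = ("ancestry.com", "findagrave.com")


def extract_urls(wikitext: str) -> list[str]:
    """Find URLs and trim trailing sentence punctuation."""
    return [u.rstrip(".,;)") for u in URL_RE.findall(wikitext)]


def classify_link(url: str, resolve=socket.gethostbyname) -> str:
    """Return 'malformed', 'known_site', 'dns_error' or 'resolves'."""
    host = (urlsplit(url).hostname or "").lower()
    if not host:
        return "malformed"
    if any(host == s or host.endswith("." + s) for s in KNOWN_SITES):
        return "known_site"  # ignored, may need a login
    try:
        resolve(host)
    except (OSError, UnicodeError):  # socket.gaierror is an OSError
        return "dns_error"
    return "resolves"  # candidate for a later whole-URL check
```

Only links that come back as `resolves` would need the slower whole-URL request.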
by Aleš Trtnik G2G6 Pilot (808k points)
  1. Do you get the raw text or the template?!?!
     
  2. Do you get the categories? I feel we have some links on categories but we have no bot checking if they are valid
     
  3. Links
    1. One problem is dead links 
    2. Another problem is that uploaded GEDCOM files too often create links to Ancestry that have no genealogical value when you follow them:
       
      1. They are empty
        Example: Sweden-29 links to an Ancestry family tree that is empty and of no genealogical interest ==> it should be flagged as an error, and then the link should be deleted.

        Example of an "empty" Ancestry page; links like this are created by many GEDCOMs uploaded to WikiTree and add no value to genealogy.
         
        1. Maybe an XPath could be used to see what e.g.
          //*[@id="fixed_div"]/div/div/div/div[1]

          contains.
           
      2. They need a login ==> not a preferred source inside WikiTree
        Example: Eisenhart-39 links to a private Ancestry tree, which is not the preferred way of sourcing
        1. It looks like you get redirected to RequestTreeAccess, if that could help
1.) I get the wiki text: what you see in the biography editor.

2.) Categories are extracted separately by Chris into an additional table, so I also get categories added by templates.

3.2.1 Require login

3.2.2 Require login

We could group login links as a third error.

Ok Aleš 

The Ancestry links are rather depressing when you check them:
220 000 hits for the search site:www.wikitree.com/wiki AMTCitationRedir

It feels like 9 out of 10 are useless.

My odd personal opinion is that all the sources from an external family tree such as Ancestry should be moved over to WikiTree, and the Ancestry family tree should just be in the See also section.

You're a rock star Aleš. Validating links is just one of the hundreds of things we should be doing, but never actually get done. You're ticking off item after item on WikiTree's to-do list (and items we'd never thought of before).

I sometimes feel like my biggest use of my Ancestry account is to determine which Ancestry links are dead and which still function. Displays like that one from Sweden-29 indicate that the person who made the Ancestry tree has abandoned the tree, and the content is permanently gone. All "AMTCitationRedir" URLs with that same Tree ID (in that instance tid=21525863) will be equally useless. Removing the dead links and the associated advertorial text is tedious, even if a person could quickly identify which profiles have the URL (and which of the multiple Ancestry URLs on that page are affected), but an automated search designed to find all instances of a particular confirmed-dead tid would make the removal process go more quickly.

However, for WikiTree profiles that have little or no other indication of their information sources, I don't think that we should remove:

  1. Ancestry Tree links that require a subscription but are still working
  2. Ancestry Tree links that still work, but require permission from the content owner for access
  3. For instances where there is no working link to the Ancestry Tree, a brief indication that "Ancestry Family Trees" was the source of the content in the profile. I currently prefer to replace the lengthy citation to a dead Family Tree with *Ancestry Family Trees. Online publication - Provo, UT, USA: Ancestry.com.  Original data:  Family Tree files submitted by Ancestry members.

Note: People who don't have access to Ancestry should be aware that the string trees.ancestry.com in a URL does not necessarily indicate a link to an Ancestry Family Tree. That same link format also applies to other user-contributed content and records. Apparently when people save a link to an Ancestry record in their own Ancestry area (something I've never done), the link gets a trees.ancestry.com URL that carries forward when they later create a gedcom. These links generally continue working (with content quality that ranges from execrable to excellent); they don't go away when a Tree owner quits.
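The automated search for a confirmed-dead tid described above could be sketched roughly like this, run over the biography text in the dump. The function name and the `bios` mapping are hypothetical, and it assumes the tree ID is carried as a `tid` query parameter on URLs that contain "AMTCitationRedir". It matches on that string rather than on trees.ancestry.com, since that host also serves links that keep working.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Rough URL matcher for wiki text; stops at whitespace and bracket characters.
URL_RE = re.compile(r"https?://[^\s\]\[<>|}\"']+")


def profiles_citing_tree(bios: dict[str, str], dead_tid: str) -> dict[str, list[str]]:
    """Map profile ID -> AMTCitationRedir URLs whose tid equals dead_tid.

    `bios` is assumed to map a profile ID to its biography wiki text.
    """
    hits: dict[str, list[str]] = {}
    for profile_id, text in bios.items():
        urls = []
        for url in URL_RE.findall(text):
            if "AMTCitationRedir" not in url:
                continue
            # Assumption: the tree ID is the "tid" query parameter.
            if dead_tid in parse_qs(urlsplit(url).query).get("tid", []):
                urls.append(url)
        if urls:
            hits[profile_id] = urls
    return hits
```

The result lists, per profile, exactly which of the Ancestry URLs on the page belong to the dead tree, so the other links can be left alone.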
