Historical special character makes profiles unviewable

+11 votes
389 views
This problem affects a number of profiles. One example is:
http://www.wikitree.com/wiki/Westphal-71

The issue is that some historical GED imports from 2010 (possibly other years) seem to include a special character that cannot be read. As such, the whole profile becomes unreadable and it appears to the regular user that "Nobody has entered anything".

I had attempted to fix this issue with my bot but due to the nature of the issue I accidentally translated other special characters that didn't need translating (which then needed fixing).

Given the controversy surrounding that, I thought it would be best to raise it as a WikiTree bug instead.

Thanks.
WikiTree profile: Frederick Westphal
in WikiTree Tech by Jonathan Wainwright G2G5 (5.2k points)
retagged by Living Sälgö
Your example is privacy green, so we can't see what you're talking about.
You can see what he's talking about by looking at the change history for that profile.

Looking at the change history, it appears that the bot changed an unrecognizable character in WikiTree into a copyright sign, which in the context appears to make sense. Michelle then manually corrected this change by changing the copyright sign back into an unreadable character. The end result is something that is true to the original import but useless to the reader. At least, all I can see is a placeholder character ().

My question is whether this is a storm in a teacup that some people are pouncing on because they are opposed to the use of a bot. What good does it do to have characters on WikiTree that WikiTree cannot display?

I restored the profile to test the restore feature was working correctly after the bot made a change. I made no purposeful change of the character.
Some changes appeared to have been made by the profile manager when the bot actually made the change. This has been corrected.
Michelle, no criticism intended, I understand the process behind this. My main intent was to comment on the use of this example in what I would call, for lack of a better term, the anti-bot hysteria of 2 days ago.

Helmut,

None taken. :)

4 Answers

+6 votes
 
Best answer
I think I understand technically what is happening (at a high level): it does appear this was injected during GED import. I'd like to test your theory, though, that the current import doesn't introduce this problem.

Hopefully a Tech can figure out why it's not displaying and correct that problem, so that it becomes displayable.

So when you attempt to edit, Jonathan, can you see the bio and do you believe you could remove the character? Or is it uneditable for you?
by Scott Fulkerson G2G Astronaut (1.5m points)
selected by Jonathan Wainwright
Editing the bio allows me to change/remove the character. The profile is then made visible.
Jonathan, I'd like to distinguish between 2 different things - an error introduced during gedcom import that is somehow associated with the copyright character and the copyright character itself.

If it is the copyright character itself that's causing the bot to become confused then that will be a serious problem.  Copyright characters are sometimes manually entered in profiles when the person doing the editing includes a quote from a copyrighted source that permits the use of the material, in which case the copyright notice from the source is also included in what is entered.

Do you know that the problem is tied to an error that is distinct from instances of the copyright character?
It's not the copyright character that is the problem. The character that is a problem represents the copyright character but is unreadable in its current form. So there are no problems with people using the copyright character. As for the bot, it can now read and translate the character. The reason I raised the issue is not because it is causing any issues with the bot. It isn't. I raised the issue because I thought it might be of concern that whole profiles are being hidden from regular humans because of this bug.
Yeah - I see that now. Looks like it was uploaded 2010 with a GEDCOM file which has since then been deleted. Original Profile Mgr no longer around. No reasonable way to go back and find what the original character really was, although it does appear that it was intended to represent a copyright character. The black diamond with the white question mark is intended to represent a "replacement character". So Jonathan is 100% correct - the black diamond/question mark is not the problem, just the symptom, unless that object itself causes profile bios to become invisible.

Jonathan, as a test, can you add the "replacement character" to another profile and have someone who does not have rights take a look and see if they notice anything different? (or you may have already done so). If it turns out that the character causing the issue is that one (which I'm beginning to lean that way), then it should be a simple process to just write a script to look at bios and remove that character anywhere it shows up. Minor impact overall, but it could open up bio information that should have been available before.
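
A minimal sketch of the kind of script described above, in Python. It assumes the bio text is already available as a plain string; `strip_replacement_char` is a hypothetical helper, not part of any WikiTree tooling, and how bios would be fetched or saved is deliberately left out.

```python
from typing import Tuple

REPLACEMENT_CHAR = "\ufffd"  # U+FFFD, usually drawn as a black diamond with a white question mark


def strip_replacement_char(bio: str) -> Tuple[str, int]:
    """Return the bio with every U+FFFD removed, plus how many were removed."""
    count = bio.count(REPLACEMENT_CHAR)
    return bio.replace(REPLACEMENT_CHAR, ""), count


# Made-up bio text, for illustration only.
cleaned, removed = strip_replacement_char("Source: \ufffd 2010 Example Archive")
print(repr(cleaned), removed)
```

Only U+FFFD is touched, so real special characters such as © or ä pass through unchanged. Returning the count makes it easy to log which bios were changed and by how much.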

As far as the symbol goes, from what I can tell, it's the last symbol on the Tahoma font chart, called a Replacement Character (reproducible as U+FFFD for those of you who understand recreating such a thing).
FYI - I've tried both introducing the character from the fontset (which comes up like a "blank" spot) and copying the one from the thread here, and neither appears to lock out the profile, so it's possible this is truly a legacy issue (as Jonathan believes) and will only affect profiles created during a narrow range of time. At least we can hope that is the case...
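
For anyone wondering how a copyright sign could end up as U+FFFD in the first place, here is a small Python sketch of one common way it happens. It is only an assumption that the 2010 import worked like this: the file stores the copyright sign as a single byte (0xA9 in Latin-1), and the reader treats the bytes as UTF-8, where a lone 0xA9 is invalid.

```python
# Assumption: the original file was Latin-1, where the copyright sign is one byte.
original = "\u00a9 2010"
raw_bytes = original.encode("latin-1")

# Reading those bytes as UTF-8 fails on 0xA9; errors="replace" substitutes U+FFFD.
misread = raw_bytes.decode("utf-8", errors="replace")

# Reading them with the matching encoding recovers the copyright sign.
correct = raw_bytes.decode("latin-1")

print(repr(misread), repr(correct))
```

Once the substituted text has been saved, the original byte is gone, which fits with there being no way to recover the original character from the profile alone.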
Scott, I could be wrong, but I believe Jonathan is saying that the profiles were hidden before his bot went through them - nobody realized that, though, because nobody had ever looked at these profiles - they were just uploaded in the gedcom and then promptly ignored and forgotten.

I think what he's saying is that, as a result of his bot having encountered this problem, we are now aware that it is there.  Perhaps we need to try to find profiles that the bot has not yet touched that were in old gedcoms and have part or all of the information in their bios not displaying on the view page, but the content is there on the edit page.
I don't believe that the symbol that looks like a white question mark inside a black diamond is what was originally in the profiles.  I think there was (and probably still is) a character in there that caused the bot to enter that symbol when it encountered it.  I suspect that the original character that suppressed the display was most likely a control code - you know, the kind made when you press a combination of CTRL and a letter.  It would have been a character that does not print, so we wouldn't see it on the view page - we see the symbol that the bot entered on the view page and that is the only way we know that the page has an error and has hidden content.  I expect that many other pages are like these and also have hidden content, but because the bot has not been unleashed on them, we don't see any clue to let us know that there is information that is not showing.
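
If the control code theory is right, it could be checked with something like the Python sketch below. It assumes the raw bio text can be copied out of the edit page as a string; `find_control_chars` is a made-up helper name, and the sample text is invented.

```python
import unicodedata


def find_control_chars(text: str) -> list:
    """Return (index, code point) pairs for control characters other than tab, newline and CR."""
    allowed = {"\t", "\n", "\r"}
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cc" and ch not in allowed
    ]


# Invented sample with a CTRL-C (U+0003) hidden after the year.
print(find_control_chars("Born 1850\x03 in Hamburg"))
```

Note that U+FFFD itself is not a control character (Unicode classes it as a symbol), so this check and a check for the replacement character would find different things.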
The profile was reverted before I posted this question, so anything the bot did or didn't do is no longer visible (except on the changes page). If you check the changes you will see that the bot replaced the funny looking symbol with a copyright symbol, which is what you expected.

I know exactly what the profile looked like before the bot did anything because when the bot encountered this page first time round it just stopped, so I looked at the profile myself and it said there was nothing in it. On further investigation, I found that when I edited the page, there was something there, but because of the funny looking symbol it wouldn't display.

I wrote in a bit of extra code on my bot to get it to start up again. It was that same bit of code that caused problems in other profiles (but not this one) because I didn't check whether the code would have any bad side effects. This bit of code has since been replaced with different code that has no bad side effects.
+3 votes
This is another example of the problem with the bots released by ForoTree over the weekend. It also highlights that, no matter what the author of the bots says, we may not yet have seen the end result of releasing these bots here.
by Dale Byers G2G Astronaut (1.7m points)
reshown by Chris Whitten
This is a problem that existed before the bot. The bot was just trying to fix it, the bot wasn't the cause of this problem.
+3 votes
It's time to put a stop to the use of this/these destructive bots before the entire Wikitree database is corrupted.

If and when it is decided that this technology is appropriate, sufficiently developed and reliable, then let it be cautiously introduced (on a limited trial basis) by Chris and his team.

I, for one, would not be happy for an automated programme to be changing data on profiles that I manage with no collaboration and without any human intervention.
by Peter Knowles G2G6 Mach 6 (69.5k points)
This is a problem that existed before the bot. The bot was just trying to fix it, the bot wasn't the cause of this problem.
Yes, Roland, there were and still are problems associated with errors caused by gedcoms. The bot tried to fix the gedcom problem but failed, and in so doing, created further problems itself. It now seems that someone will need to resolve the problems created by the bot. The problem, as I see it, is the bot - not the gedcom.

This question is to address the pre-existing problems of characters that make profiles not viewable.

This specific problem is neither the bot nor the gedcom's fault, but it's how WikiTree handles some specific characters.

Another question has been asked specifically to discuss bot usage:

http://www.wikitree.com/g2g/154910/should-bots-be-used-on-wikitree-to-modify-profiles

Peter,

The bot was not simply turned loose on WikiTree.  The particular profile used as an example here is one that Michelle Hartley manages.  She is a technically knowledgeable member, also a leader I might add, who is a member of the apps group.  She volunteered to test the bot.  She fully understands that it might not yet be ready for prime time and she is also quite capable of exercising cognizance of its actions.

This situation is very much in keeping with industry accepted best practices for staged deployment of a new feature.  Unfortunately, it seems to have turned into a three ring circus when some members observed it happening and started the G2G ball rolling on what appears to have morphed into a witch hunt.

My bottom line opinions:

  • Yes, we absolutely need to exercise very careful monitoring of the implementation of any bots here. 
  • Yes, we will need to restrict the kinds of actions that bots are permitted to perform. 
  • Finally, yes, we need to be continually open to consideration of the value of enhanced capability offered by new technologies as they emerge.  If we do not keep up with progress then our progress will degrade to stagnation and then start down the path toward extinction.
Thanks Gaile for clarifying some aspects of what has been happening in recent days. Unfortunately most Wikitree members are not aware of some of these technological developments (and experimentation) and only see the problems that are raised when things go wrong.

As for myself, technology left me behind several years ago, so although I understand what people are trying to achieve, I don't understand the intricacies of software development.

As someone who has read virtually every question (and most of the answers and comments) on G2G for nearly four years, I can say that many Wikitree developments are not always to my liking. In itself, that's not a problem because Wikitree will (hopefully) be around long after I am gone and, as Charles Darwin once said, it's survival of the fittest - those remaining are those who were able to adapt (evolve).

I have expressed my opinions on a number of the questions, answers and comments generated by what seems to be a contentious issue over the past week or so. I will now bow out and become a silent observer of where Wikitree goes from here.
+1 Peter.
Thank you Gaile!! Ditto!! I'm not able to give long responses while on vacation. I did volunteer to beta test these bots and have been collaborating with Jon. If I feel something is wrong with what Jon is doing I will tell him or block the account if something goes wrong. He takes any criticism well and tries to improve in those areas. Nothing negative towards Jon as he's well aware bugs happen and he's completely capable of fixing them quite quickly. These bots don't do anything to profiles unless they are in my watch list or that of someone else who might have volunteered. Keep in mind Jon could design something really useful for WikiTree. I will also add Jon is doing this for free. Anyone with design or programming skills knows that testing is needed and unplanned bugs happen. Will respond more later.

 

(edit was to correct name to Gaile)
+4 votes

This is what the character looks like to me:

Does it make sense to anybody to have these characters in profiles? This is a WikiTree problem since it does not have a character set installed allowing the proper representation of these characters. It is also a problem of the profile manager who should investigate if the character can be represented with the installed character set.

There are a few customized character sets for certain medieval letters around that are not Unicode, but apart from those there should really be no problem with any character.

by Helmut Jungschaffer G2G6 Pilot (595k points)
It's not a medieval character. It should be a copyright symbol. However, the GEDCOM seems to have been imported incorrectly.

Just to be clear, if the GEDCOM were re-imported now, I am pretty confident this would not happen. Whatever bug caused this error in the first place seems to have been fixed since.

The issue I am trying to highlight is that there may be hundreds or thousands of profiles that were imported several years ago which are currently showing as blank profiles when in fact there are lots of sources hidden because of this copyright character being stored incorrectly.
What about using a bot to identify such profiles and just generate a list?
This would probably have to be an official bot (like the official Merge Bot) as it would involve checking every single profile (i.e., millions of profiles). It would also involve checking non-public profiles, which can't be checked by a non-official bot without permission.
I think given the sensitive nature of this issue for some people that would be my thoughts, too.
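
A list-only pass like the one suggested could be sketched in Python as follows. This is a rough illustration, not an official tool: it assumes some existing process can supply (profile ID, bio text) pairs, and `list_affected` and the sample IDs are made up. Nothing is written back to any profile.

```python
from typing import Iterable, List, Tuple

REPLACEMENT_CHAR = "\ufffd"  # U+FFFD, the unreadable placeholder character


def list_affected(profiles: Iterable[Tuple[str, str]]) -> List[str]:
    """Return the IDs of profiles whose bio contains U+FFFD; nothing is modified."""
    return [profile_id for profile_id, bio in profiles if REPLACEMENT_CHAR in bio]


# Invented sample data standing in for real profiles.
sample = [
    ("Example-1", "Source: \ufffd 2010 Example Archive"),
    ("Example-2", "Source: \u00a9 2010 Example Archive"),
]
print(list_affected(sample))
```

Because the function only reads, the resulting list could be reviewed by a human before anyone decides whether to change the profiles.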


...