Why do Nos Origines and Genealogy Canada websites generate a Link Error 966?

+10 votes
305 views
Their links work perfectly fine. Why are they listed as 401 Unauthorized? How do we correct them?

Here are more profiles with these links.

*https://www.wikitree.com/wiki/Leblanc-3319

*https://www.wikitree.com/wiki/Girouard-4510

*https://www.wikitree.com/wiki/Bastarache-2

Thanks!
WikiTree profile: Charles Godin
in WikiTree Help by Gisèle Cormier G2G6 Mach 6 (67.1k points)

1 Answer

+8 votes
The links go through redirects, but those return 302 status codes, which are not errors and should resolve just fine. 302 (Found) means the resource has moved temporarily; a permanent move would be 301. I'm not seeing anything obvious that would trigger a 401, but it may be something in the bot environment.
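For reference, here is a small sketch that looks up the standard reason phrases for the status codes mentioned in this thread, using Python's standard library. The helper name `describe` is only for illustration.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Return the standard reason phrase for an HTTP status code."""
    return HTTPStatus(code).phrase


# Redirects (3xx) versus access refusals (4xx) discussed in this thread.
for code in (301, 302, 401, 403):
    print(code, describe(code))
```

A link checker that follows redirects would normally treat 301 and 302 as fine, and only report the final status it receives.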
by Doug McCallum G2G6 Pilot (535k points)

There is quite a restrictive robots.txt file:

https://www.nosorigines.qc.ca/robots.txt

I wonder what Aleš's bot has as user agent.

What does "restrictive file" mean?
It's a file that tells crawlers, identified by their user agent, which parts of the site they aren't allowed to access. Usually it restricts the bad crawlers, but it is possible that the one WikiTree is using is on the list.

It's clear from the file that nosorigines is cautious about which crawlers it allows. A further possibility is that when the site can't identify a crawler from its user agent string, or the crawler ignores its robots.txt file, it actively blocks access by returning the 401 Unauthorized HTTP status code (though 403 Forbidden might seem more correct in this situation).
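To illustrate how a crawler can check its own user agent against a robots.txt file, here is a minimal sketch using Python's `urllib.robotparser`. The robots.txt content and the user agent name `ExampleLinkChecker` are made up for the example; this is not the real nosorigines file, and I don't know what user agent the WikiTree bot actually sends.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is allowed, everyone else is not.
# This is NOT the actual content of https://www.nosorigines.qc.ca/robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.org is a placeholder domain.
print(is_allowed(SAMPLE_ROBOTS_TXT, "Googlebot", "https://example.org/page"))
print(is_allowed(SAMPLE_ROBOTS_TXT, "ExampleLinkChecker/1.0", "https://example.org/page"))
```

Note that robots.txt is only advisory. A 401 or 403 response comes from the server itself, which may block on user agent, IP address, or region regardless of what robots.txt says.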

By definition, robots.txt should be accessible; otherwise crawlers can't know what they are allowed to access.

Actually, I can't access any nosorigines.qc.ca URL, not even https://www.nosorigines.qc.ca/. I even tried from a few proxies without success. It seems that the site is completely locked for Europe and for many proxies.

I programmed the link validator not to scan a domain once it returns results indicating that it is locked. I will try to rescan the domain.
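The idea of skipping a domain after a blocked response could be sketched roughly like this. This is not the actual link validator code; the class name `DomainSkipper`, its methods, and the choice of 401 and 403 as "blocked" statuses are all assumptions for illustration.

```python
from urllib.parse import urlsplit

# Assumption: these statuses are treated as "the site is blocking us".
BLOCK_STATUSES = {401, 403}


class DomainSkipper:
    """Remember domains that returned a blocked status and skip them later."""

    def __init__(self) -> None:
        self.blocked: set[str] = set()

    def record(self, url: str, status: int) -> None:
        # Mark the whole domain as blocked when one of its URLs is refused.
        host = urlsplit(url).hostname
        if status in BLOCK_STATUSES and host:
            self.blocked.add(host)

    def should_scan(self, url: str) -> bool:
        return urlsplit(url).hostname not in self.blocked

    def clear(self, domain: str) -> None:
        # Allow a manual rescan of a previously blocked domain.
        self.blocked.discard(domain)


skipper = DomainSkipper()
skipper.record("https://www.nosorigines.qc.ca/robots.txt", 401)
print(skipper.should_scan("https://www.nosorigines.qc.ca/"))  # skipped until cleared
skipper.clear("www.nosorigines.qc.ca")
print(skipper.should_scan("https://www.nosorigines.qc.ca/"))
```

With a scheme like this, links on a blocked domain keep their last recorded status until someone clears the domain and rescans it.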
Thank you, Aleš. Otherwise, should we mark them as False, since we can access them? Nosorigines isn't always the best source because *most* of its entries are unsourced, but I hate to remove the links because some do offer good clues.

I don't mean to imply that I remove sources with Link Errors.


...