Challenge of the Week: Help repair or resolve broken links

+23 votes
1.1k views

Hi WikiTreers!

WikiTree+ data analysis has found thousands of broken links inside profiles.

Sometimes it's just because there is a typo in the URL. More often, the webpage or website has moved, or no longer exists.

Each one needs personal attention from someone like you.

We want to correct/update the link if possible. If an updated link can't be found, we want to delete it and leave the rest of the source/reference, or the name of the website.

Fixing bad links helps all of us. Broken links make WikiTree look unreliable and out-of-date. Google ranks all our ancestor profiles lower in search results because of them.

Will you help fix some of these?

Please post an answer below if you'll be joining us, or if you have questions. Thank you so much!

in The Tree House by Eowyn Walker G2G Astronaut (2.5m points)
reshown by Aleš Trtnik
Challenge is active.

Please remember to check for a capture by archive.org's Wayback Machine before deleting a link.
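For anyone who would rather script that check than do it by hand, here is a minimal Python sketch. It assumes the Internet Archive's public availability endpoint at https://archive.org/wayback/available and the JSON shape it normally returns; the function names are made up for illustration and are not part of any WikiTree tool.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Internet Archive's public "availability" endpoint.
WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(link):
    """Build the lookup URL for one possibly broken link."""
    return WAYBACK_API + "?" + urllib.parse.urlencode({"url": link})


def closest_snapshot(payload):
    """Return the archived URL from a decoded API response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(link, timeout=10):
    """Ask the Wayback Machine for a capture (needs network access)."""
    with urllib.request.urlopen(availability_url(link), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))
```

If `lookup(...)` returns None, no capture was reported, and the link is a candidate for manual searching before it is removed.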

20 Answers

+13 votes

I'd like to work on these, but can you give me an example of how to fix this link? I did check it on the Wayback Machine and the URL is not archived: http://hcapps.co.hendricks.in.us/webview/Archive/ArchiveViewer.asp?bump=0&b=200&SelectedFolderName=20000067

by Elizabeth Coltrane G2G6 Mach 2 (25.1k points)
The first thing to do is try and find the record again, since the URL probably changed or the source was deleted.

http://hcapps.co.hendricks.in.us/ is still the website in question, and when we go to -> browse the archives, it comes up with a list of information (a ton of links to record sets).

Obviously without knowing what you were looking for (specific records, names, etc), I would not be able to replicate the steps properly in order to find the image. However, when clicking around I noticed that the URLs follow a specific format.

Take for example, the first thing I clicked on.

http://hcapps.co.hendricks.in.us/ArchiveViewer.aspx?Dir=1&ret=http://hcapps.co.hendricks.in.us/DataWarehouse/Archive/SectionMenus/MLBookIndex.aspx&bump=0&b=200&SelectedFolderName=00000173

To find what I believe is the source you were looking for, I removed the secondary URL found after "&ret=" and replaced it with the URL you sent. This brought me to a record page.

http://hcapps.co.hendricks.in.us/ArchiveViewer.aspx?Dir=1&ret=http://hcapps.co.hendricks.in.us/webview/Archive/ArchiveViewer.asp?bump=0&b=200&SelectedFolderName=20000067
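The substitution described above can be sketched in a few lines of Python. This only mirrors the URL pattern observed on this one site; the helper name and the "&ret=" marker default are assumptions for illustration, and other sites will need a different approach.

```python
def swap_ret_target(working_url, old_url, marker="&ret="):
    """Keep the working URL up to and including the marker, then append old_url."""
    head, sep, _tail = working_url.partition(marker)
    if not sep:
        raise ValueError("marker not found in working URL")
    return head + sep + old_url


# A URL that currently works on the site (taken from the thread).
working = (
    "http://hcapps.co.hendricks.in.us/ArchiveViewer.aspx?Dir=1"
    "&ret=http://hcapps.co.hendricks.in.us/DataWarehouse/Archive/"
    "SectionMenus/MLBookIndex.aspx&bump=0&b=200&SelectedFolderName=00000173"
)
# The broken URL from the profile (taken from the thread).
broken = (
    "http://hcapps.co.hendricks.in.us/webview/Archive/ArchiveViewer.asp"
    "?bump=0&b=200&SelectedFolderName=20000067"
)

candidate = swap_ret_target(working, broken)
```

The result is only a candidate: it still has to be opened and checked against the person or record the profile cites.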

This is why it's super important to 1) make sure you label your outlinks! (even if it is just the name of the collection, though preferably it would have more data than that) and 2) make sure that you are archiving your sources. I make this easy on myself by having my browser automatically archive pages I visit with the Wayback Machine browser extension, but that does not always work (especially with Ancestry.com), so it's ultra important to be thorough.

But to answer your question: normally for this sort of thing I go to the base URL to see if the website is still available. If it is, I will often click around or use its search function until I find a lead. Alternatively, I will Google ' "key phrase" site:whateverthewebsiteis.com'; for example, I would Google ' "Elizabeth Coltrane" site:wikitree.com' (with your name in quotes). I also do this if the website is no longer available: googling the name of the website first often brings me to its new URL. Unfortunately, all websites (including WikiTree) are susceptible to eventual link rot, which is why I really advocate for saving as much as possible in the Wayback Machine.
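As a small illustration of the search pattern above, this Python sketch builds the quoted-phrase query from a key phrase and either a bare domain or a full (possibly broken) URL. The function name is hypothetical, and the `site:` operator is used only as described in this thread.

```python
from urllib.parse import urlsplit


def site_query(key_phrase, site_or_url):
    """Build a '"phrase" site:host' search string."""
    # A bare domain such as "wikitree.com" has no scheme, so urlsplit
    # reports no hostname and the input is used unchanged.
    host = urlsplit(site_or_url).hostname or site_or_url
    return f'"{key_phrase}" site:{host}'


query = site_query("Elizabeth Coltrane", "wikitree.com")
```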
I am cleaning up profiles in this week's challenge. The above URL is in one of the profiles I'm working on. The question I'm asking is how I should leave it in the profile once fixed: what is the 'style', I guess, that should remain? Take it out altogether or leave the main website? I took all of the steps you mentioned and left the main URL; do you also describe how to search on the profile, or just leave the URL?
Very helpful response. Thanks for sharing your knowledge.
Elizabeth, I would try and leave as much information as possible -- the record set, the people in it, and a link. Anything that would help identify what the link is, since that link especially is pretty much unidentifiable and could change at any time. (That being said, sometimes, I leave the archived link with it instead of the live one -- but my browser also automatically archives sites I visit through the Wayback Machine, so I am covered either way).

Also, Curt, thanks! :)
Ok, thank you, Liz.
+11 votes

Thank you, Ales, for this new challenge.  I will try to do some. 

And thank you, Liz, for your comment/reminder about using the Wayback Machine before deleting a link.

by Robin Shaules G2G Astronaut (1.5m points)
+12 votes
I can do some today.
by Marcie Ruiz G2G6 Mach 5 (60.0k points)
+11 votes

I'm in! Hopefully I can get a few done today.

by Mindy Silva G2G Astronaut (1.1m points)
+12 votes
I found one I could do, and will try another one or two. It was a challenge to figure out what to update the URL to.

Are there any instructions on WikiTree about what to look for to fix these errors? If some URLs are really old, could the http-to-https redirect simply have expired, so that the fix is just switching to https?

Thanks!
by Sally Kimbel G2G6 Pilot (106k points)
+12 votes
I'm doing some.
by Randall Gardner G2G6 Mach 3 (37.0k points)
+13 votes
FYI, broken links such as the following:

404 Not Found, http://cslib.cdmhost.com/digital/collection/p15019coll15/id/63/rec/52

can be corrected by replacing "cslib.cdmhost.com" with "cslib.contentdm.oclc.org":

http://cslib.contentdm.oclc.org/digital/collection/p15019coll15/id/63/rec/52
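For a batch of links with the same problem, the replacement can be sketched in Python as below. This is only an illustration with a hypothetical helper name; the optional `scheme` argument is an addition for cases where the new host should be reached over https.

```python
from urllib.parse import urlsplit, urlunsplit


def swap_host(url, old_host, new_host, scheme=None):
    """Replace old_host with new_host, leaving path and query untouched."""
    parts = urlsplit(url)
    if parts.hostname != old_host:
        return url  # not one of the affected links
    # Note: the whole network location is replaced, so any port or
    # user info in the original URL is dropped.
    return urlunsplit(
        (scheme or parts.scheme, new_host, parts.path, parts.query, parts.fragment)
    )


fixed = swap_host(
    "http://cslib.cdmhost.com/digital/collection/p15019coll15/id/63/rec/52",
    "cslib.cdmhost.com",
    "cslib.contentdm.oclc.org",
)
```

Each rewritten link should still be opened once to confirm it resolves to the same record.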
by Marcie Ruiz G2G6 Mach 5 (60.0k points)
If you have identified a solution like this, all suggestions for that domain can be found using this link:

https://plus.wikitree.com/function/WTWebAll/Suggestions.htm?Query=cslib_cdmhost_com&MaxErrors=1000&ProfileSearch=1&ErrorID=965

Here it is clearly a domain name change from http://cslib.cdmhost.com to https://cslib.contentdm.oclc.org and that can be corrected by EditBOT. I will run it.
Thanks Aleš!
+12 votes
I will try to help
by Liza Gervais G2G6 Pilot (393k points)
+11 votes
Sounds interesting, I'll try a few.
by Vik-Thor Rose G2G6 Mach 3 (34.5k points)
+9 votes
I am going to accept the challenge.
by Sharon Kellar G2G4 (4.7k points)
+9 votes
I'm happy to work on this.
by Valmay Young G2G6 Mach 1 (10.6k points)
+8 votes
I'll see what I can accomplish.
by Connie Mack G2G6 Mach 2 (22.8k points)
+9 votes

I'll look at a few. I'd also recommend that people check out the Wayback Machine before dropping links.

On a related note, I found out just yesterday that the popular RootsWeb site is being taken down this April 6th. No doubt there will be many more broken links once that happens, and it seems that site is not being archived by the Wayback Machine. If you know of RootsWeb resources being used as a source, you can trigger a save, though, to preserve it for the future.

by Jim Patterson G2G6 Mach 1 (14.0k points)
Thanks for the heads-up about the demise of RootsWeb. The paywalls are closing in. Time to download any threads I might want to preserve.
+9 votes
I'll take this as my cue to fix some of the many broken census links in my watchlist. Library and Archives Canada loves to change their link structures!
by Liander Lavoie G2G6 Pilot (454k points)
+8 votes
I would like to help out!
by Norma Price G2G6 (6.5k points)
+7 votes
I would like to help with the broken links. Do we have to check in anywhere on a spreadsheet, or do we just fix them?
by Helen Holt G2G6 Mach 1 (15.8k points)
+7 votes
I will try to work on this. I have several in my own watchlist to fix.
by Joyce Rivette G2G6 Pilot (179k points)
+8 votes

I've already answered, but just wanted to point out that a few of these links look dodgy (IP addresses rather than domains) but are actually the best available. So flagging Error 961 based on syntax alone is causing some false positives; perhaps they should be verified with an HTTP HEAD request.

In particular, interment records for some Halifax, NS cemeteries can be found at http://66.241.235.145 . I've done a reverse DNS check, and there is apparently no corresponding domain name, nor does the web site itself hint at one. It is, as near as I can tell, set up to use an IP address, not a domain name.

Just heard from the manager of some of those pages. The original domain was ccchalifax.com but at some point they forgot to renew it, and just left it that way.
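To make the suggestion concrete, here is a rough Python sketch of the two checks: a purely syntactic test for an IP-address host, and a live check of whether the server answers. It is a guess at what such a verification could look like, not a description of how WikiTree+ detects Error 961; both function names are hypothetical.

```python
import ipaddress
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def host_is_ip(url):
    """True if the URL's host is a literal IPv4 or IPv6 address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def head_status(url, timeout=10):
    """Send a HEAD request; return the HTTP status, or None if unreachable."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, ...
```

A link where `host_is_ip` is true but `head_status` returns 200 would be one of the false positives described above. Some servers reject HEAD requests, so an error status here is a hint rather than proof that the page is gone.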

by Jim Patterson G2G6 Mach 1 (14.0k points)
edited by Jim Patterson

Even with a DNS name, websites are unstable, since owners can "forget to renew it". But an IP address can change whenever they change internet provider, so it is even more unstable. Such links should be avoided.

+7 votes
Hi, I'm not going to do this as part of the challenge, but I'm letting people know about a couple of source links that are used a lot in French Canadian profiles.

One is PRDH. There are still lots of old links; Ales made a nice widget to fix and update them, turning them into a template. Put the bare URL between single square brackets (leave one space at the end), then fix it with the app that drops down from ''Biography'' in the list of apps (automated corrections).

Another is PREFEN. I have been working on fixing a lot of these. This was another university research program; some links with a section that reads ''notices'' in the URL can be updated via the Wayback Machine. Please don't just delete these. There is priceless data in there, very often including images of original records.
by Danielle Liard G2G6 Pilot (661k points)
+6 votes
I'll tackle a few of these.
by Ward Hindman G2G6 Mach 3 (35.0k points)


...