It can be tedious to check for broken links on a web page, especially a web page with many links. Other than using desktop tools, i.e. Xenu, Screaming Frog installed on your workstation, Google Docs is a good alternative tool that can be accessed anywhere to check for broken links on your web pages.
1. Create a new spreadsheet in Google Docs, to extract the URLs and anchor text on the web page.
2. Input the web page URL in cell A1.
3. The ImportXML function is useful to extract the necessary desired data on the web page. Insert the ImportXML function, =ImportXML(A1, “//a”) in cell A3 to extract the anchor text of all hyperlinks on the web page. Insert another ImportXML function, =ImportXML(A1, “//a/@href”) in cell B3 to extract the pointing URLs of all hyperlinks.
Here is an example of the spreadsheet on Google Docs with ImportXML function. Feel free to make a copy of it.
4. Copy the generated list of anchor texts and URLs onto another new spreadsheet on Google Docs.
5. Update all the relative path URLs to absolute path.
6. In the new spreadsheet with the list of anchor texts and URLs, create a script in the spreadsheet to check for the server status code for each of the URLs. Click “Tools > Script editor” on the spreadsheet top menu to open a new window to insert or edit the script.
7. Insert the below chunk of code. You have created a script function, named “HTTPResponse”. Save the script with a suitable project name, i.e. “Check Server Status Code”.
function HTTPResponse( uri ) {
var response_code;
try {response_code = UrlFetchApp.fetch( uri ).getResponseCode().toString();}
catch(error) {response_code = error.toString().match( / returned code (ddd)./ )[1] ;}
finally {return response_code ;}
}
8. Go back to the spreadsheet with the list of anchor texts and URLs. Insert the HTTPResponse function in a new column, to check for the server status code.
A “200” value suggests that the URL is available working.
A “404” value means that the URL is unavailable, which you have to remove the broken link or update it to a new working URL. You can learn more about the HTTP server status code.
Feel free to make a copy of the Google Docs spreadsheet with HTTPResponse function.