Yes, to a certain extent. PicoSearch records the URLs of the bad links that it finds while indexing your website. For each bad link URL, PicoSearch also records the URL of the originating valid page where that bad link was first encountered. This is not necessarily a complete report, because links outside of the Indexing Scope are not recorded, and at most one originating URL is recorded for a given bad link in a given indexing. The reason is that the indexer is designed to find and follow each link once, and only for the pages that are supposed to be in your search engine. Crawling every page of your site to find every broken link would take a lot of extra time and bandwidth, which would be inefficient for the purpose of building your search index.
That being said, you can use the information PicoSearch records to track and fix your site's bad links, at least on the pages that are within your Indexing Scope. The Most Recent Indexing Log in your account manager will contain lines that start with [Download Failed (404 Not Found) to indicate that a specific link caused a 404 or "not found" error. By default, PicoSearch will try this link again later in the indexing, in case your site was intermittently down when the link was first tried. In that case you should see a line that starts with [Retrying URL] lower in your Indexing Log, indicating that PicoSearch is trying a bad link for a second, third, or fourth time. (The default setting is two attempts per link, which works for most sites, but if you prefer more retries or none, just change the "Retry Not Found" setting in the Index Modes section of your account manager.) On the final attempt, if the page is still "not found", PicoSearch will also record the originating URL where it first found the bad link. Your Indexing Log will then contain a block of text like the following, typically towards the end of the log (note that the URLs here are for illustration only):
[Retrying URL] http://www.mysite.com/bad_page.html
[Visited URL]
[Download Failed (404 Not Found) - no more retries] http://www.mysite.com/bad_page.html
[Originating URL] http://www.mysite.com/good_page_with_bad_link.html
Going with the example above, you would fix the bad link on good_page_with_bad_link.html and run another indexing. If a link to bad_page.html still occurs on another page of your site, the next indexing will report it in the same way, with that other page shown in the [Originating URL] line. You can use this method to fix all your bad links incrementally, until you no longer see them reported in your Indexing Log. If you still require a complete bad links report for your entire site, we recommend looking for other websites or software intended for broken link reporting.
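
If your Indexing Log is long, you may prefer to pull out the bad links with a small script rather than reading the log by eye. The following is only a minimal sketch in Python, not a PicoSearch tool. It assumes you have copied the log text out of your account manager, and that the lines look exactly like the illustrative block above; the function name parse_bad_links and the variable sample_log are made up for this example, and a real log may be formatted differently.

```python
import re

# Line formats assumed from the illustrative log block above;
# adjust the patterns if your actual Indexing Log looks different.
FAILED = re.compile(
    r"^\[Download Failed \(404 Not Found\) - no more retries\]\s*(\S+)"
)
ORIGIN = re.compile(r"^\[Originating URL\]\s*(\S+)")


def parse_bad_links(log_text):
    """Return (bad_url, originating_url) pairs found in the log text."""
    pairs = []
    pending = None  # bad URL still waiting for its [Originating URL] line
    for line in log_text.splitlines():
        line = line.strip()
        failed = FAILED.match(line)
        if failed:
            pending = failed.group(1)
            continue
        origin = ORIGIN.match(line)
        if origin and pending is not None:
            pairs.append((pending, origin.group(1)))
            pending = None
    return pairs


# Hypothetical sample, copied from the illustrative block above.
sample_log = """\
[Retrying URL] http://www.mysite.com/bad_page.html
[Visited URL]
[Download Failed (404 Not Found) - no more retries] http://www.mysite.com/bad_page.html
[Originating URL] http://www.mysite.com/good_page_with_bad_link.html
"""

for bad_url, origin_url in parse_bad_links(sample_log):
    print(f"{bad_url} is linked from {origin_url}")
```

Each pair tells you which page to edit. Because only one originating URL is recorded per bad link in a given indexing, you would rerun the script on the new log after each indexing until it returns nothing.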