Help with PicoSearch

How can I restrict the scope of my index?

Your PicoSearch index is built by starting from the website entry points that you provide. Like any website visitor, PicoSearch will follow the links that it can find. The scope, or breadth, of your index will come from which page links get automatically followed. For most people, the default behavior will work just fine. But if you are having problems with missing or extra documents, the primary link restrictions setting is an important point to understand.
 
For each indexer entry point (and you may have several), one of three levels of indexing restriction must apply in order to guide the following of links.
  • Directory level restriction is the default, and it is appropriate for most individuals with hosted websites. Directory level restriction means that only links at or below the directory of the homepage (entry point) will be indexed.
      NOTE: Because so many webmasters mix their links with and without the "www", PicoSearch will follow either form interchangeably. Thus, http://www.mysite.com may become http://mysite.com for a while if there is a link without the "www" on the site. If this becomes a problem for you, either make all of your links consistent, or explicitly tell PicoSearch not to follow a certain format with the Excluded Paths feature (for example, enter http://mysite.com as an Excluded Path to keep only the links with the "www").

  • Server level means that only links on the same machine as the homepage will be followed. This is useful for people who have several servers working off of one domain name (such as http://this.mydomain.com and http://that.mydomain.com).

  • Domain level is the most general restriction. It is for people who own a domain name and want all links that remain within the domain to be included.
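
The three levels can be pictured as three tests applied to each link found during indexing. The Python sketch below is only an illustration of the rules as described above, not PicoSearch's actual code: the function names (strip_www, base_domain, directory_of, in_scope) are hypothetical, the "www" folding is assumed to apply at every level, and the domain is naively assumed to be the last two labels of the host name (which is wrong for names such as example.co.uk).

```python
from posixpath import dirname
from urllib.parse import urlsplit

LEVELS = ("directory", "server", "domain")


def strip_www(host: str) -> str:
    """Treat "www.mysite.com" and "mysite.com" as the same host."""
    return host[4:] if host.startswith("www.") else host


def base_domain(host: str) -> str:
    """Naive assumption: the domain is the last two labels of the host."""
    return ".".join(host.split(".")[-2:])


def directory_of(path: str) -> str:
    """Return the directory part of a URL path, always ending in "/"."""
    if path.endswith("/"):
        return path
    return dirname(path).rstrip("/") + "/"


def in_scope(entry_point: str, link: str, level: str = "directory") -> bool:
    """Decide whether a link would be followed under the given level."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")

    entry, target = urlsplit(entry_point), urlsplit(link)
    entry_host = strip_www((entry.hostname or "").lower())
    target_host = strip_www((target.hostname or "").lower())

    if level == "domain":
        # Any host inside the same domain is acceptable.
        return base_domain(target_host) == base_domain(entry_host)
    if target_host != entry_host:
        # Server and directory levels both require the same machine.
        return False
    if level == "server":
        return True
    # Directory level: the link must sit at or below the entry directory.
    return (target.path or "/").startswith(directory_of(entry.path or "/"))
```

For example, with the entry point http://www.mysite.com/docs/index.html, the link http://mysite.com/docs/faq/a.html passes the directory test, while http://www.mysite.com/other/b.html passes only at server level or wider. Likewise, from http://this.mydomain.com/, a link to http://that.mydomain.com/page.html is accepted only at domain level.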

In addition to the three levels of link restriction, other tricks and tools are available nearby in your Account Management control panel. See the FAQ on skipping web documents and the FAQ on adding web documents. You can also check the list of documents that made it into your index from the link in your control panel. You can experiment with the settings and re-index as often as you like; if problems persist, contact us.
