Your PicoSearch index is built by starting from the website entry points
that you provide. Like any website visitor, PicoSearch will follow the
links that it can find. The scope, or breadth, of your index will come
from which page links get automatically followed. For most people, the
default behavior will work just fine. But if you are having problems
with missing or extra documents, the primary link restrictions setting is an important point to understand.
For each indexer entry point (and you may have several), one of three
levels of indexing restriction must apply in order to guide the
following of links.
- Directory level restriction is the default, and it is
appropriate for most individuals with hosted websites. Directory level
restriction means that only links at or below the directory of the
homepage (entry point) will be indexed.
NOTE: Because so
many webmasters mix links with and without the "www", PicoSearch
will follow either interchangeably. Thus, http://www.mysite.com may
become http://mysite.com for a while if there is a link without the
"www" on the site. If this becomes a problem for you, either
make all of your links consistent, or explicitly tell
PicoSearch not to follow a certain format using the Excluded Paths feature
(for example, enter http://mysite.com as an Excluded Path to keep only the
links with the "www").
- Server level means that only links on the same machine as the homepage will be
followed. This is useful for people who may have several servers
working off of one domain name (such as http://this.mydomain.com and
- Domain level is the most general restriction, for people who
own a domain name and wish all links which remain within the domain to
be followed.
In addition to the three levels of link restriction, other tricks and
tools are available nearby in your Account Management control panel.
See the FAQ on skipping web documents and the FAQ on adding web documents.
You can check the list of documents which made it into your index from
the link in your control panel as well. You can experiment with the
settings and re-index as often as you like. If problems persist, contact us.
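To make the three restriction levels more concrete, here is a minimal Python sketch of how a link might be tested against an entry point. It is an illustration only, not PicoSearch's actual implementation: the function name in_scope, the level names, and the example.com hosts are all hypothetical, and the domain check naively treats the last two labels of the host name as the domain.

```python
from urllib.parse import urlsplit
import posixpath


def _host(url):
    """Return the lowercased host name with any leading "www." removed,
    so that www and non-www forms are treated interchangeably."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def _directory(url):
    """Return the directory part of the URL path, always ending in "/"."""
    path = urlsplit(url).path or "/"
    if path.endswith("/"):
        return path
    # Otherwise assume the last path segment is a file name and drop it.
    return posixpath.dirname(path).rstrip("/") + "/"


def in_scope(entry_point, link, level="directory"):
    """Hypothetical check: would `link` be followed from `entry_point`?"""
    entry_host, link_host = _host(entry_point), _host(link)

    if level == "domain":
        # Assumption: the domain is the last two labels (wrong for e.g. .co.uk).
        domain = ".".join(entry_host.split(".")[-2:])
        return link_host == domain or link_host.endswith("." + domain)

    if link_host != entry_host:
        return False  # both remaining levels require the same server
    if level == "server":
        return True
    if level == "directory":
        link_path = urlsplit(link).path or "/"
        return link_path.startswith(_directory(entry_point))
    raise ValueError("unknown restriction level: " + level)
```

For example, with the entry point http://www.mysite.com/docs/index.html, this sketch accepts http://mysite.com/docs/a/b.html at directory level (the "www" is ignored) but rejects http://www.mysite.com/other/x.html, which is only accepted at server level. A link to a different host under the same domain is accepted only at domain level.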