Yes. PicoSearch honors the robot exclusion protocol: it will not follow links or enter directories that have been deliberately disallowed in the robots.txt configuration file located on your server. For users who do not have access to their server and thus cannot create their own robots.txt file, we provide our own special HTML directives for NOINDEX and NOFOLLOW.
Note that Sitemaps listed in robots.txt are not followed automatically. If you want PicoSearch to index each URL from your Sitemap, be sure to also add that Sitemap's URL in the Entry Points section of your account manager.
If you ever need to refer to the PicoSearch spider by name in your robots.txt file, its name is "PicoSearch/1.0". It is up to you to find the robots.txt syntax that fits your needs, but here is a basic example that first disallows all robots from all directories and then allows PicoSearch in a subsequent, more specific rule. Notice that allowing is best accomplished with a Disallow line that names no directory, or that names only the directories to exclude, so that all other directories are allowed by implication:
User-agent: *
Disallow: /

User-agent: PicoSearch/1.0
Disallow:
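
If you would like to check how rules like these resolve before uploading them, the following Python sketch may help. It is a simplified illustration, not PicoSearch's actual implementation: the function names `parse_robots` and `is_allowed` are hypothetical, only User-agent and Disallow lines are handled, and it assumes that a record naming the spider exactly takes precedence over the `*` record.

```python
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: PicoSearch/1.0
Disallow:
"""


def parse_robots(text):
    """Parse robots.txt text into {lowercased user-agent: [disallowed path prefixes]}."""
    groups = {}
    agents = []          # user-agents named by the record currently being read
    seen_rule = False    # True once the current record has had a Disallow line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:                     # a User-agent after rules starts a new record
                agents, seen_rule = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field == "disallow":
            seen_rule = True
            if value:                         # an empty Disallow disallows nothing
                for agent in agents:
                    groups[agent].append(value)
    return groups


def is_allowed(groups, user_agent, path):
    """Assumed precedence: the record naming user_agent, otherwise the '*' record."""
    rules = groups.get(user_agent.lower(), groups.get("*", []))
    return not any(path.startswith(prefix) for prefix in rules)


groups = parse_robots(ROBOTS_TXT)
print(is_allowed(groups, "PicoSearch/1.0", "/docs/page.html"))  # True
print(is_allowed(groups, "OtherBot", "/docs/page.html"))        # False
```

With the example file, the empty Disallow line in the PicoSearch record leaves every path open to PicoSearch, while any other robot falls back to the `*` record and is blocked from the whole site.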