Every PicoSearch hosted search engine account has a maximum number of
pages that will be indexed. This is the primary limit on your search
engine: you can put the search box on as many pages as you like, and
your site visitors can search as much as they want, but the number of
pages that can be searched is preset in your account.
This is because the unit of text search is a page of text. One
page is generally one file, such as an HTML file. The exceptions
are PDFs, which may count as the number of pages in the PDF (see the
FAQ for details), and very long HTML files, which can count as more
than one page at indexing time.
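As a rough illustration of the counting rule above, the sketch below estimates how many pages a set of files might use. It is not PicoSearch's actual algorithm: the function name `estimate_pages`, the dictionary keys, and especially the `chars_per_page` threshold for long HTML files are assumptions made up for this example, and PDFs are counted at their full page count even though they only "may" count that way.

```python
import math


def estimate_pages(documents, chars_per_page=100_000):
    """Estimate indexed pages for a list of document descriptions.

    Each document is a dict with a "type" key ("pdf" or anything else).
    PDFs carry a "pdf_pages" count; other files carry a "chars" length.
    chars_per_page is a hypothetical cutoff, not a PicoSearch setting.
    """
    total = 0
    for doc in documents:
        if doc["type"] == "pdf":
            # Upper bound: every page of the PDF counts as one page.
            total += max(1, doc.get("pdf_pages", 1))
        else:
            # A normal file is one page; a very long one counts as several.
            total += max(1, math.ceil(doc.get("chars", 0) / chars_per_page))
    return total


docs = [
    {"type": "html", "chars": 20_000},    # ordinary page
    {"type": "pdf", "pdf_pages": 12},     # 12-page PDF
    {"type": "html", "chars": 250_000},   # very long HTML file
]
print(estimate_pages(docs))  # 1 + 12 + 3 = 16
```

Comparing an estimate like this with your account maximum can hint at whether PDFs or a few very long files are using a large share of your pages.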
PicoSearch gives account maximums that are generous for most
websites. If you know from the start that you have a lot of pages on
your site, you can choose a bigger account to cover your needs. If you
grow more than expected later on, you can always upgrade to get more
pages without losing account value. The online payment system will
upgrade from Professional to Premium with a pro-rated calculation to
include the days left in your account. If you need more than Premium,
just contact us.
What happens when you hit your maximum pages? You will see
warnings in your indexing email notifications and in your account
manager that not all pages have been indexed, and therefore not all of
your site will be searched. PicoSearch will not collect documents above
your maximum, so if you need to find out how many pages your site
would use with PicoSearch, you can contact us to run a trial size.
If you hit your maximum, don't panic, but do look into it as
soon as possible. You may need to buy more pages, or you may just need
to focus your search engine a little more for both efficiency and
accuracy. Consider the following possibilities:
- You may need to upgrade for more pages: see the Make Payments
link under the blue Managing features of your account manager, or use
the payment link in an email notification warning that not all of your
pages are being searched.
- You may need to cut out unnecessary links or pages: see the
List of Documents link under the blue Managing features of your account
manager, or follow links from an email notification. Use the document
list to verify that you want to search everything that's being indexed.
There may be a directory you can do without, or the indexer may have
fallen into a "spider trap", such as a calendar function on your site
that generates endless URLs which look like different pages to
PicoSearch. See the FAQ on how to control the skipping of text or links
for powerful techniques, such as Exclusions and Skip Tags, that trim
your search engine down to size.
- You may want to limit a file type or control duplicates:
after looking at your List of Documents, you may decide to cut or
control a type of file. Under your account manager's Indexing features,
the Index Modes section has switches for plain text files,
case-sensitive URLs, and duplicate documents. The Additional File
Formats section lets you turn file types on or off, and even set a
maximum number of pages per PDF. See the FAQ for details on PDFs if
this file type is using a large share of your page limit.
- You may actually be within your page maximum but have some
unsearchable files worth Excluding: PicoSearch tries to err on the side
of indexing your site, which means that sometimes an odd file type gets
interpreted as text and then skipped if it's suspiciously long. This
could be happening if you see an account warning that not all documents
got indexed, even though you haven't actually hit your maximum pages.
See your List of Documents for a notice on which files were the
problem, and use the skipping controls to Exclude these files next
time. This will improve indexing efficiency and remove a source of
confusion for you.
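When reviewing a long document list for a spider trap like the calendar case mentioned above, one simple approach is to group URLs by host and path and count how many different query strings each path appears with. The sketch below assumes you have copied the URLs from your List of Documents into a Python list; the function name `find_possible_spider_traps`, the `threshold` value, and the example.com URLs are all hypothetical and not part of PicoSearch.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def find_possible_spider_traps(urls, threshold=50):
    """Return {host/path: number of distinct query strings} for every
    path that appears with at least `threshold` different queries."""
    variants = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        # Ignore the query string when grouping, so calendar.php?month=1
        # and calendar.php?month=2 land in the same bucket.
        key = f"{parts.netloc.lower()}{parts.path}"
        variants[key].add(parts.query)
    return {
        key: len(queries)
        for key, queries in variants.items()
        if len(queries) >= threshold
    }


# Made-up document list: 120 calendar URLs plus one ordinary page.
urls = [
    f"http://www.example.com/calendar.php?month={month}&year={year}"
    for year in range(2000, 2010)
    for month in range(1, 13)
]
urls.append("http://www.example.com/about.html")

print(find_possible_spider_traps(urls))
# {'www.example.com/calendar.php': 120}
```

A path flagged this way is only a candidate: a product catalog can legitimately have many query-string variants, so check a few of the URLs before Excluding the path.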