Help with PicoSearch

How can I use Direct Searches to send certain searches directly to certain pages?

The essential function of a search engine is to take the word or words that a user is searching for and return the search results, which list the documents that contain those words. Links from the list of search results take the user to the specific documents, and the information in the search results (title, meta description, concordance, etc.) helps the user decide which links to follow. PicoSearch does all this, and gives you tools to control which documents are searched and what order they may appear in. The Promotions feature is a particularly powerful way to make sure that certain documents appear first in the search results list for certain searches.

But what if you would like to supplement your search engine with some Direct Searches? That is, for particular searches, you would like to bypass the search results list and just go directly to a particular page on your site. In a search engine that searches the entire internet, it would be risky to assume that the first search result is the only document you want to see. But in your customized PicoSearch search engine, you may know very well if a certain search is best answered by immediately going to a certain page.

PicoSearch offers the Direct Searches feature in your account manager's Designing Section to enable immediate linking to a particular internet address. In the PicoSearch tradition of giving you maximum power, Direct Searches also support some very interesting capabilities, including:
  • multiple direct search statements that will be applied in the order of entry in your account manager
  • run-time checking for valid URLs to jump to (look before you leap)
  • keyword patterns with wildcards to match the user's searches for greater flexibility
  • capture plugins that insert matching pieces of the user's search into the URL to create a whole range of target URLs, effectively supporting look-up dictionaries
  • wildcards in the URLs of the targeted pages that make the direct search conditional, so that the user will go to that page only if it was in the search results already
  • partition prefixes that specify which partitions the direct search can trigger within
  • first result searches that go straight to the first search result (if you're feeling lucky!)


Unconditional Direct Searches

An unconditional direct search is one that goes to a fully specified URL whenever the keyword pattern matches the user's search. For example:

cars = http://www.mysite.com/list_of_cars.html

Now whenever the user of your search engine types cars, they will be sent directly to the URL for list_of_cars.html. This URL must be complete and fully qualified, meaning that it starts with http://, because the link comes from the PicoSearch site and must get to the new page exactly. Notice that this URL doesn't even have to be in your search engine, so you could potentially send people to reference pages outside of your site, if you're confident that certain searchers should go there.

As an unconditional rule, the direct search will go straight to the URL at searching time. You should therefore make sure that this URL is reliable so that PicoSearch doesn't send your users to a broken link. If you control the page, you might want to add a comment in the HTML so that future webmasters know that your PicoSearch is depending on the page. To help ensure that your searches don't go directly to a broken link, PicoSearch will check your Direct Searches at indexing time, and invalidate any broken ones with an initial # (see the Maintenance and Statistics topic below).


Look Before You Leap

Before we get into the various tricks that direct searches can do, including creating patterns of possible searches and URL targets, here's an important concern: what if you want to make absolutely sure the URL exists before you send the user to it? For this, replace the = with a =?= to mean "look before you leap". PicoSearch will do a run-time fetch of the head of the URL as fast as it can, and if it's not a valid page then PicoSearch will consider the next matching direct search, or just go to the regular search results page if nothing else.

A head fetch is not a full page fetch, so it is an efficient way to look ahead, but it will still take longer than not checking at all. So don't burden your direct searches with it unless you need to. When you do need it, don't worry: the user won't notice a trivial delay but certainly will notice a page error! For example, if you're relying on the Capture Plugins technique (described below) to generate ranges of URLs on your site that might not exist, here's a pattern that will go to a definition page only if it's available, whenever the user types define before something else.

define {*} =?= http://www.mysite.com/definition_{*}.html
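
The fall-through behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not PicoSearch's actual code: the names head_ok and first_live_target are hypothetical, and the sketch assumes the keyword matching has already produced an ordered list of candidate URLs, each flagged as =?= (check first) or = (no check).

```python
import urllib.request
from typing import Callable, Iterable, Optional, Tuple


def head_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if a HEAD request for url answers with a 2xx status."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (ValueError, OSError):
        # Malformed URL, network failure, timeout, or HTTP error status.
        return False


def first_live_target(
    candidates: Iterable[Tuple[str, bool]],
    exists: Callable[[str], bool] = head_ok,
) -> Optional[str]:
    """Pick the first usable target from (url, look_before_leap) pairs.

    Candidates are assumed to be in the order the rules were entered.
    A result of None means: show the regular search results page.
    """
    for url, look_before_leap in candidates:
        if not look_before_leap or exists(url):
            return url
    return None
```

Passing a different exists function makes the sketch easy to try without touching the network, for example exists=lambda url: url in known_pages for some set of known pages.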


Keywords and Wildcards

When we say keyword, we really mean a flexible pattern of one or more words for the direct search to match. The match is case insensitive, so Cars will match cars in the example above. The match also ignores double quotes, since double quotes are optionally used to hold words together in exact phrase searching. The match will also be accent insensitive if you have the Use Accent Insensitivity option turned on in the Alternate Character Options section of your account manager (it is on by default). No other keyword variations will match automatically, however, such as plurals/singulars, synonyms, or being part of a larger phrase. To help with these issues, the keywords can use commas (,), spaces ( ), and the wildcards asterisk (*) and question mark (?).

You can add multiple keywords to a direct search, separated by commas. Multiple word phrases with spaces are allowed. For example:

car,cars,list cars = http://www.mysite.com/list_of_cars.html

This will cause anyone who types car or cars or list cars to go directly to the page list_of_cars.html. The match must still be exact, so for more flexibility you can use the wildcards * and ? in the keywords. A * will match any number of optional non-space characters, and ? matches at most one optional non-space character. For example:

car*,list car? = http://www.mysite.com/list_of_cars.html

Now you'll match more searches, but be careful to consider what your users will reasonably be searching for given the topic of your website. The pattern car* will match car and cars but also match carpool and careful. The pattern car? will match car and cars but also carp and care. Of course, it's unlikely that a site about cars will also discuss carp, but carpools could be an issue. The keyword patterns for Promotions have the same potential for overmatching, but it matters most for direct searches, because the user goes directly to a URL and never sees the list of other search results.

The default behavior of direct search keyword patterns is that they must match all the terms that the user types, and in the same order, just to trigger the direct search. Therefore, if the user types list of cars then this will not match the direct search example above. To help with matching more of the user’s search terms, the wildcard * takes on a special meaning when it is not connected with a word, i.e. it is separated by spaces at the ends or in the middle of keywords. A single * means one optional word, and a double ** means any number of optional words. For example:

* car*,list ** car? = http://www.mysite.com/list_of_cars.html

Now the first keyword pattern with a single * by itself will match if the user types car or all cars, but it won’t match show me all cars because that’s more than one word. The second keyword pattern with a double ** will match if the user types list cars or list of cars or even list of all your cars. So at this rate, you can see that if you wanted to match the word car anywhere in what the user types, you could use the pattern:

** car ** = http://www.mysite.com/list_of_cars.html
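
To make the matching rules in this section concrete, here is a small Python sketch of how such keyword patterns could be evaluated. It is an illustration written from the description above, not PicoSearch's implementation; the function names are hypothetical, and accent insensitivity is left out.

```python
import re
from typing import List


def _word_regex(token: str) -> "re.Pattern[str]":
    """Turn one pattern word into a regex: * = any run of non-space
    characters, ? = at most one non-space character."""
    parts = []
    for ch in token:
        if ch == "*":
            parts.append(r"\S*")
        elif ch == "?":
            parts.append(r"\S?")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def _match_tokens(tokens: List[str], words: List[str]) -> bool:
    """Match pattern tokens against query words, all words, in order."""
    if not tokens:
        return not words
    head, rest = tokens[0], tokens[1:]
    if head == "*":  # a standalone * is one optional word
        return _match_tokens(rest, words) or (
            bool(words) and _match_tokens(rest, words[1:])
        )
    if head == "**":  # a standalone ** is any number of optional words
        return any(_match_tokens(rest, words[i:]) for i in range(len(words) + 1))
    return (
        bool(words)
        and _word_regex(head).fullmatch(words[0]) is not None
        and _match_tokens(rest, words[1:])
    )


def keyword_matches(pattern: str, query: str) -> bool:
    """True if any comma-separated alternative matches the whole query."""
    words = query.replace('"', "").split()  # double quotes are ignored
    return any(
        _match_tokens(alt.split(), words)
        for alt in pattern.split(",")
        if alt.strip()
    )
```

With this sketch, keyword_matches("* car*,list ** car?", "list of all your cars") is true, while keyword_matches("* car*,list ** car?", "show me all cars") is false, matching the examples above.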


Conditional Partitions

If you want a direct search to activate only when the user is searching in one or more certain partitions, you can prefix the pattern with partitionNAME:. For example, to jump to the page about car(s) only when the user is searching in the CARS or VEHICLES partition, use the following:

partitionCARS: partitionVEHICLES: car? = http://www.mysite.com/cars.html
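
As a rough illustration of the prefix syntax, the following Python sketch separates the partition prefixes from the keyword pattern and decides whether a rule applies. The function names are hypothetical, and the exact-case comparison of partition names is an assumption, since the text above does not say how PicoSearch compares them.

```python
import re
from typing import List, Tuple

# One "partitionNAME:" prefix, with optional surrounding whitespace.
_PREFIX = re.compile(r"\s*partition([^\s:]+):\s*")


def split_partitions(left_side: str) -> Tuple[List[str], str]:
    """Split the left side of a rule into (partition names, keyword pattern)."""
    partitions = []
    rest = left_side
    while True:
        match = _PREFIX.match(rest)
        if match is None:
            break
        partitions.append(match.group(1))
        rest = rest[match.end():]
    return partitions, rest.strip()


def rule_applies(left_side: str, active_partition: str) -> bool:
    """A rule without prefixes applies everywhere; otherwise the partition
    being searched must be one of the listed names (assumed exact match)."""
    partitions, _ = split_partitions(left_side)
    return not partitions or active_partition in partitions
```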


Conditional Direct Searches

Direct searches become even more interesting when you add the wildcard * to the right side of the direct search, that is within the URL. The meaning is to use the direct search only if the entire URL pattern matches one of the actual search result URLs that come from running the user’s search. This behavior not only allows for more generalized direct searches, it also helps to ensure that the direct search really is relevant for what the user typed. The more wildcards you have in the left keyword side of the direct search, the more you might want to make the direct search conditional to avoid overmatching and confusing the user. And if you don’t want the wildcard to ever match more than one URL, you can always add * to the beginning of a fully specified URL (since nothing comes before the http anyway). For example:

** car ** = *http://www.mysite.com/list_of_cars.html

With the double ** on both sides of car, whenever the user types the word car in any search then the direct search would normally have gone straight to the list_of_cars.html. But by putting a * at the beginning of the URL, the direct search becomes conditional, thus dependent upon the list_of_cars.html being an actual search result in the first page of results for what was typed. So if you are returning ten results for a site about cars, then a search for car list will trigger the direct search if list_of_cars.html was in the top ten results anyway. This seems likely, especially if you set Promotions on that URL for some words like list. If however the user types Ford car parts, it’s more likely that other pages will come first based on the words Ford and parts, thus blocking the direct search which would have been inappropriate.

If you’re more interested in using conditionality to match multiple URLs with a single direct search statement, then you’re free to use the * for complex URL patterns. Notice however that ? cannot be a wildcard in a URL as it was in keywords, because ? is the character that introduces CGI arguments in URLs. So for example:

** list ** = http://www.mysite.com/list_of_*.html

This pattern could work well to go directly to the list of whatever the user is finding, as long as they use the word list somewhere in their search. The conditional search is making sure that there is a page of form list_of_*.html that is being returned in the search results anyway, so since it’s relevant the user will go to it directly. Thus, a search for list of cars would likely go to list_of_cars.html, and list of trucks would likely go to list_of_trucks.html, assuming you’ve made those pages on your site and they’re in the first page of search results. Furthermore, as with Promotions, there is a Maximize Scope option on the Direct Searches feature that will make your conditional searches match beyond the first page of results.
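
The conditional check can be pictured with a short Python sketch. It is a simplified model under stated assumptions, not PicoSearch's code: the function name conditional_target is hypothetical, result_urls stands for the result URLs that are in scope (the first page by default), and the comparison is assumed to be case sensitive.

```python
import re
from typing import Iterable, Optional


def conditional_target(url_pattern: str, result_urls: Iterable[str]) -> Optional[str]:
    """Return the first result URL that the whole pattern matches, or None.

    Only * is a wildcard here; ? is left alone because it is a
    normal character in URLs.
    """
    regex = re.compile(".*".join(re.escape(piece) for piece in url_pattern.split("*")))
    for url in result_urls:
        if regex.fullmatch(url):
            return url
    return None  # no match: fall back to the regular results list
```

For instance, with results that list list_of_trucks.html before list_of_cars.html, the pattern http://www.mysite.com/list_of_*.html selects the trucks page, because it is the first result that matches.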

There is one more interesting feature of conditionals involving anchors. Anchors allow browsers to jump down to a pre-specified section of a page depending on the URL. The URL must have #text at the end of it, and the HTML of the page must have <a name="text">...</a> within it, where "text" is any text. This is all standard HTML.

If you have an anchor on the end of your conditional direct search, PicoSearch will not include the anchor in the requirement to match for a result URL. This is good, because the URLs in your search engine are not likely to have anchors anyway. But when the URL does match a search result, the anchor will be used in the direct search. So to build on the previous example:

** list ** = http://www.mysite.com/list_of_*.html#listings

Now when the user has the word list in their search, not only will they jump to the first list page that matches, but also they will jump directly down to the #listings anchor if there is one. This could be handy to skip the top part of long pages.
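
The anchor handling amounts to setting the #text part aside while matching and putting it back afterwards. A minimal Python sketch of that step, with a hypothetical function name and the same simplifying assumptions as any model of the matching (only * is a wildcard, comparison is case sensitive):

```python
import re
from typing import Iterable, Optional


def conditional_target_with_anchor(
    url_pattern: str, result_urls: Iterable[str]
) -> Optional[str]:
    """Match result URLs without the anchor, then re-attach the anchor."""
    base, hash_sign, anchor = url_pattern.partition("#")
    regex = re.compile(".*".join(re.escape(piece) for piece in base.split("*")))
    for url in result_urls:
        if regex.fullmatch(url):
            # hash_sign and anchor are empty strings when no anchor was given.
            return url + hash_sign + anchor
    return None
```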


First Result Searches (If You Dare)

If you don’t specify a URL for a keyword pattern and only use a *, then you will see another special behavior that we call First Result Searching. This has been referred to in some search engines as being lucky or instant searching. It’s kind of risky, but the meaning is to go directly to the first search result, if there is one, no matter what it is. In the context of your custom search engine, combined with keyword patterns, this could still be fairly predictable and useful. For example:

** warranty ** = *

Now whenever a user has the word warranty anywhere in their search, they will go directly to the first search result. If you had only one warranty page on your site, then you might have wanted to spell out the URL in the direct search pattern, either conditionally or unconditionally (see above). But if you have several warranty pages, and in general the word warranty is rare on your site, you might feel confident enough to just send the user straight to the first page the search returns.

For the extreme case, a universal First Result Searching pattern for one-word searches would be *=* and for any number of words would be **=*, each on a line by itself. The one-word case might be useful if you're confident that single-word searches get the right page first, while multiple-word searches are less predictable and should display all the results. The all-searches line of **=* pretty much short-circuits your search engine so that it never shows lists of results. Either line could come after other, more specific Direct Searches lines, however, since direct searches are applied at searching time in the order in which they were entered.

PicoSearch also takes a run-time argument so you can play with First Result searching from only certain search boxes on your site. This argument has the effect of doing a First Result search as a last resort after running through the direct searches that are already in your account manager. The HTML code to add to your search box for one word searches to go to the First Result is the following:

<input type="hidden" name="ds" value="one">

And for all searches (any number of words) add this within your search box code:

<input type="hidden" name="ds" value="all">


Capture Plugins for URL patterns

Curly braces can be used to capture all or parts of the matching query that triggers a direct search, and plug those parts into the URL. This can have the effect of creating a series of direct search results that map queries to URLs on your site, thus triggering a kind of question-answer searching that doesn't even have to rely on your indexer! For example, if we want to respond directly to a search for some word of the form definition WORD by going to a URL of form definition_WORD.html we could do the following:

definition {*} =?= http://www.mysite.com/definition_{*}.html

So now if a user types definition spoongate, PicoSearch will try to visit http://www.mysite.com/definition_spoongate.html on your site. Because the rule uses the =?= syntax for "look before you leap" behavior, the URL doesn't have to exist: if it is missing, PicoSearch moves on to the next matching direct search or to the regular search results.

You can have multiple captures too that will plug in order of appearance. For example if you wanted to enforce that definition be part of the URL, as well as constrain the words to starting with spoon, you could use this rule:

{definition} spoon{*} =?= http://www.mysite.com/{*}_spoon{*}.html

You can further constrain the matching search to a limited set of possibilities separated by a vertical bar, like this rule to only go to the definition pages for spoongate, marshjam, and wigphone. We stopped using the =?= here because we can be sure of how many URLs we need to support on the site (3).

{definition} {spoongate|marshjam|wigphone} = http://www.mysite.com/{*}_{*}.html

Capture plugins can combine with regular wildcards of * or ? on the left, and * on the right side to make the direct search conditional for only when the result is actually in the results of PicoSearch's indexed search. Conditionality would be another way to ensure that the URL is only used if it actually exists. For example, in this rule we allow a single optional letter on the end of the defined word in order to match a plural -s, plus any number of words thereafter, but only go to the matching URL if it is actually in the search. Again we stopped using the =?= here because if it's in the search, the URL must have been indexed and exist already.

{definition} {spoongate|marshjam|wigphone}? ** = *http://www.mysite.com/{*}_{*}.html
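
The capture-and-plug mechanism can be modeled in Python roughly as follows. This is a sketch built from the examples above, not PicoSearch's implementation. The function names are hypothetical, and two details are assumptions: the query is lowercased before captures are plugged into the URL, and no URL encoding is applied to the captured text.

```python
import re
from typing import Optional


def _token_to_regex(token: str) -> str:
    """Translate one space-separated pattern token into a regex fragment.
    Every query word is matched together with one trailing space."""
    if token == "*":  # standalone *: one optional word
        return r"(?:\S+ )?"
    if token == "**":  # standalone **: any number of optional words
        return r"(?:\S+ )*"
    out = []
    in_braces = False
    for ch in token:
        if ch == "{":
            in_braces = True
            out.append("(")  # braces become a capturing group
        elif ch == "}":
            in_braces = False
            out.append(")")
        elif ch == "|" and in_braces:
            out.append("|")  # alternatives inside a capture
        elif ch == "*":
            out.append(r"\S*")
        elif ch == "?":
            out.append(r"\S?")
        else:
            out.append(re.escape(ch))
    return "".join(out) + " "


def expand_rule(left: str, url_template: str, query: str) -> Optional[str]:
    """Return the URL with captures plugged in, or None if left doesn't match."""
    words = query.replace('"', "").lower().split()  # lowercasing is an assumption
    regex = re.compile("".join(_token_to_regex(t) for t in left.split()), re.IGNORECASE)
    match = regex.fullmatch(" ".join(words) + " ")
    if match is None:
        return None
    url = url_template
    for captured in match.groups():  # plug captures in order of appearance
        url = url.replace("{*}", captured, 1)
    return url
```

Note that a leading * on the right side, which marks the rule as conditional, passes through untouched; checking the expanded URL against the search results would be a separate step.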


Maintenance and Statistics

To help ensure that your unconditional direct searches really will go to valid URLs, PicoSearch tests the URLs each time it indexes your site. If an unconditional direct search URL is down, PicoSearch will deactivate your direct search by putting a # at the beginning of the line. This indexing-time check cannot be done for URLs with plugins or wildcards, since there is a pattern and not a single URL to check. If you are unsure whether such URLs exist, you can use the "look before you leap" syntax.

You should not use the # yourself as a way to comment out direct searches, however, since PicoSearch will test every unconditional search again on the next indexing, and potentially reactivate lines as their URLs become available. To add your own comments to your direct searches input, precede a comment line with either #COMMENT# or <!-- (as in an HTML comment, although the closing --> is not needed). For example:

#COMMENT# this is a direct searches comment
<!-- this is another direct searches comment
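
The distinction between your own comments and lines that PicoSearch has deactivated can be summarized in a tiny Python sketch. The function name and the four labels are hypothetical, and treating the #COMMENT# marker as case sensitive is an assumption.

```python
def classify_line(line: str) -> str:
    """Classify one line of direct searches input."""
    stripped = line.strip()
    if not stripped:
        return "blank"
    # Check the comment markers first, since #COMMENT# also starts with #.
    if stripped.startswith("#COMMENT#") or stripped.startswith("<!--"):
        return "comment"
    if stripped.startswith("#"):
        return "deactivated"  # may be reactivated at the next indexing
    return "active"
```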


Because Direct Searches are so distinct in the user’s experience, they will be recorded in the statistics separately. If you have direct search statements in your account manager, then your first page of statistics (the one with the general categorical totals) will include a running count of all triggered direct searches, and another total of just the conditional direct searches. The difference between the two will thus be the number of unconditional direct searches that user searches have triggered.

Free accounts get just 5 direct search lines to play with, while paying accounts get virtually unlimited (thousands). Furthermore, paying accounts get an additional page of statistics just for their top direct searches. This view is similar in format to the top pages and top partitions statistics. You will see the top direct search URLs that your users saw, and the top searches that triggered these. Direct Searches will not be otherwise logged in your statistics, and in particular will not be included in the top pages view, so you can pretty much isolate and understand how your direct searches are being seen by your visitors.


