Search is an important function of any business database system, whether it runs over internal databases, internal document stores, or the content of a website. It is needed by both internal company staff and external customers. A simple database query such as "List existing customers with a postal code for Argleton" is a trivial piece of in-house software development, probably written in SQL. More complex searches, such as "Find all product brochure text that references the Bindeez product" or "Search the customer-uploaded reviews for any synonyms of 'caught fire' and 'pets' or 'children'", are more difficult to implement. Search, especially free-text search or searching the text of scanned document images, is a specialist discipline.
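As an illustration of how small the simple case is, the following minimal sketch runs an exact-match customer lookup with Python's built-in sqlite3 module. The customers table, its columns, the customer names and the postal codes are all hypothetical and invented for the example.

```python
import sqlite3


def demo_connection():
    """Create an in-memory database holding a few invented customer rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (name TEXT, town TEXT, postal_code TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [
            ("A. Smith", "Argleton", "ZZ1 1AA"),   # hypothetical data
            ("B. Patel", "Ormskirk", "ZZ2 9XY"),   # hypothetical data
            ("C. Jones", "Argleton", "ZZ1 2BB"),   # hypothetical data
        ],
    )
    return conn


def customers_in_town(conn, town):
    """Return (name, postal_code) rows whose town matches exactly."""
    cur = conn.execute(
        "SELECT name, postal_code FROM customers "
        "WHERE town = ? ORDER BY name",
        (town,),  # parameter binding avoids SQL injection
    )
    return cur.fetchall()
```

A query of this kind depends only on an exact match against a structured column, which is why it needs no specialist search technology. The free-text and synonym searches described above cannot be expressed this way.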
Externally-provided search services
Outsourcing the search function to a specialist search company, delivered as software as a service, can make a more capable search function available to even the smallest organisation. Two methods are popular:
Web-mediated search
One method searches a company's publicly visible web presence. An existing search engine such as Bing or Google is encouraged to crawl the site, as it normally would anyway.[1] A link to the company's favoured search partner is coded into its web site as a simple HTML web form or search box. When a query is submitted, this search box searches the main Google (or other) corpus for the text string, but returns only results from that particular web site. These results are then displayed on the site's page, as if they had been returned by the site itself. This feature is very easily implemented: the search form simply includes a site: qualifier in the query string passed to the search engine.[2]
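The site: qualifier technique can be sketched as follows. This is only an illustration: the function name site_search_url is hypothetical, and the default engine address and its q parameter are assumptions about the chosen search engine rather than details given above.

```python
from urllib.parse import urlencode


def site_search_url(query, site, engine_url="https://www.google.com/search"):
    """Build a search URL whose results are limited to a single web site.

    The restriction is made by appending a site: qualifier to the user's
    query text before it is URL-encoded into the query string.
    """
    restricted_query = f"{query} site:{site}"
    return f"{engine_url}?{urlencode({'q': restricted_query})}"


# Example: what a search box on example.com might submit.
print(site_search_url("product brochure", "example.com"))
```

The site itself stores no index and runs no search code; it only rewrites the query and relies on the engine having already crawled its pages.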
Search as a service
The second method is more capable, although more complex to set up. It can also support enterprise search, covering private resources that are not visible on the public web. Only this form is commonly termed 'Search as a service'. A search provider offers a search service, and a contract is agreed with the client to support its searches. The client then uses the provider's API to upload the content to be searched, or indexing metadata for that content if this is already available. The provider then constructs a search index for this content. If the content is free text or similar unstructured data, it is first tokenised by Lucene or a similar process.[i]
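The tokenise-then-index step can be sketched in a few lines. This is a simplified stand-in, not Lucene: the tokeniser below just lowercases the text and splits it on non-alphanumeric characters, and the function names and document identifiers are hypothetical.

```python
import re
from collections import defaultdict

# Simplified tokeniser pattern: runs of lowercase letters and digits.
TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenise(text):
    """Split free text into lowercase tokens."""
    return TOKEN_RE.findall(text.lower())


def build_index(documents):
    """Build an inverted index mapping each token to the set of
    identifiers of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenise(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return identifiers of documents containing every query token."""
    tokens = tokenise(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result
```

In a hosted service the client would send the documents through the provider's API, and the index would be built and queried on the provider's side. A production tokeniser would also handle stemming, stop words and synonyms, which this sketch omits.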
Search as a service may also be particularly useful for mobile applications, where the client device is limited in storage, processing speed and connection bandwidth. This approach is taken by Algolia, one provider in this field. Alternatively, newer service providers such as ExpertRec[4] have simplified the approach further: instead of requiring data to be uploaded via an API, they extract the data with a crawler and then tokenise it with Lucene/Solr.
Federated search
Search as a service should not be confused with federated search, such as Z39.50. Federated search is also a service in which an agent queries one or more external search engines. In that case, however, the search engine providers are closely coupled to the content databases. The remoting service passes only the query and the results; it does not pass content metadata to populate the search indexes.
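The contrast can be illustrated with a minimal sketch of a federated search agent. It does not implement Z39.50 or any real protocol: each engine is modelled as a hypothetical callable that takes a query string and returns a list of result identifiers from its own, separately maintained index.

```python
def federated_search(query, engines):
    """Send one query to several independent engines and merge the hits.

    Only the query goes out and only results come back. No content or
    indexing metadata is ever uploaded to the engines.
    """
    merged = []
    seen = set()
    for name, engine in engines.items():
        for hit in engine(query):
            if hit not in seen:  # drop duplicates across engines
                seen.add(hit)
                merged.append((name, hit))
    return merged


# Two stub engines standing in for remote catalogues (hypothetical data).
def library_a(query):
    return ["doc-1", "doc-2"] if "fire" in query else []


def library_b(query):
    return ["doc-2", "doc-3"] if "fire" in query else []


print(federated_search("caught fire", {"A": library_a, "B": library_b}))
```

Unlike the search as a service model, the agent here never sees or stores the underlying content, so it has no index of its own to build or maintain.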