
Get a Free Sample

Wondering about CMS Watch research? Sign up to receive free samples of any of our products.





 

Endeca ProFind

A Look at "Guided Navigation" for Enterprise Search

by Stephen Arnold
29-Oct-2004

This article is heavily excerpted from a 10-page review of Endeca ProFind in the CMS Watch Enterprise Search Report, released in October, 2004.


 
Endeca ProFind 4.5 at a Glance

Product: Endeca ProFind 4.5

Platform: Linux, Solaris, Windows 2000 and XP

License Fee: Starts at $50,000 but goes up rapidly.

Purpose: Index both Web and enterprise content and data, creating browsable, categorized results.

Secret Sauce: "Guided Navigation" through automatically generated taxonomies and metadata can make for highly usable search results interfaces and intuitive discovery mechanisms.

Caveats: High cost. Guided Navigation may be better suited to e-commerce than enterprise search.

Compare to: Autonomy, Convera, FAST Search & Transfer, Verity, InQuira, Vivisimo

Key Take-Aways: Usability enhanced by Guided Navigation, which provides the searcher with a more contextual picture of the results, and more easily refines long lists. A robust toolset of APIs and application development tools facilitates both customization and modification of the engine based on a rapid application development paradigm. Although Endeca has found some success in enterprise search, the product may be better suited to e-commerce and CRM use cases.

Introduction

Starting the company in 1999, Endeca's founders sought to find a better way to integrate, organize, and -- especially -- navigate increasingly large enterprise data sets.

The company developed the "Endeca Navigation Engine," one of the first navigation solutions based on meta-relational indexing. Endeca's approach is to apply -- automatically if necessary -- structured metadata to content. The resulting search results interface then presents the data through different views, or "facets," of the information, allowing the underlying content to be segmented and displayed in different ways....

The secret sauce for Endeca, then, is its focus on putting search in a context. That context may be:

  • a business process
  • the results generated by a single-term query.

Let's look at the first case, using an example of an investment brokerage. Instead of requiring a salesperson to enter a query for a retirement account option or a product specification, Endeca hooks the query to a particular business process. As a result, links to stored queries are displayed to the salesperson. A trigger for a query can be the sales representative clicking on a customer's account number or a default display for general inquiries. The idea is that time is saved because the searcher does not have to formulate a query, and it may have been pre-executed in any case.
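The idea of hooking stored queries to a business process can be sketched roughly as follows. This is a minimal illustration in Python, not Endeca's API: the event names, the query syntax, and the `links_for` helper are all assumptions made up for the example.

```python
from string import Template

# Hypothetical stored queries, keyed by the business event that triggers them.
# The event names and the query syntax are invented for illustration.
STORED_QUERIES = {
    "customer_account_selected": [
        ("Retirement account options", Template("type:retirement AND segment:$segment")),
        ("Open service requests", Template("type:service_request AND account:$account_id")),
    ],
    "general_inquiry": [
        ("Product specifications", Template("type:product_spec")),
    ],
}


def links_for(event, **context):
    """Return (label, query) pairs for an event.

    Unknown events fall back to the default display for general inquiries.
    """
    templates = STORED_QUERIES.get(event, STORED_QUERIES["general_inquiry"])
    return [(label, t.substitute(context)) for label, t in templates]


# A sales representative clicks on a customer's account number.
links = links_for("customer_account_selected", account_id="A-1001", segment="retail")
for label, query in links:
    print(f"{label}: {query}")
```

The salesperson never types a query here; the click supplies the context, and the application only has to render the returned links.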

For the second case, simple text search, Endeca offers some new twists as well, in the form of browsable and highly contextualized results. A good example of the Endeca approach can be seen with the "Book Browser" application at book purveyor Barnes & Noble...

Perhaps more so than other niche search players, you will see Endeca in head-to-head competitions with the "Big 4" enterprise search vendors (Autonomy, Convera, FAST Search & Transfer, and Verity). Interestingly, these majors are beginning to introduce their own "guided navigation" features. Imitation is a form of flattery in the competitive world of search system licensing, but in sales presentations, when competitors use the phrase guided navigation, the buzz around Endeca goes up a notch....

Endeca has been relatively successful, but you will note among its customers a predominance of business-to-consumer (B2C) e-commerce companies. These firms have usually already broken down their product sets into fairly straightforward, fielded categories. Indeed, the success of multifaceted taxonomies in the commerce space has raised substantial expectations that similar clarity could be brought to bear on enterprise content repositories. It turns out, of course, that enterprise content is not so neatly structured -- and this places a renewed premium on the basics -- the underlying indexing and search capabilities of any guided navigation system -- in addition to its capability to segment results....

Search Functionality

Like other advanced search systems, Endeca offers a licensee a wealth of features and functions. The centerpiece of the Endeca search system, however, is the metadata that the Endeca indexing engine generates when content is processed into the system, which is leveraged to generate results interfaces featuring "Guided Navigation"....

Guided Navigation

Along with automated metadata generation from unstructured content, a results interface featuring "Guided Navigation" is a central feature of Endeca. In theory, navigable categories help users who are not familiar with data to ask smarter questions by exposing all the choices available to them. But Guided Navigation goes far beyond typical "browse" solutions by making use of faceted navigation, a more complex, but often more efficient and usable way to find information than monolithic taxonomies.

  • Guided Navigation creates myriad valid browse paths to each record, rather than just the few paths available in a taxonomy, greatly increasing the likelihood that a user will find a record.
  • Guided Navigation lets users prioritize their choices in their own personalized way, rather than forcing them down the arbitrary path of the taxonomist.
  • Endeca normalizes and semantically structures the query vocabulary. This step helps keep search results from missing related documents, including content from a source or collection that the user did not know about. The system updates all navigation options at each click, showing users only the valid next questions they can ask and eliminating a multitude of possible dead-end paths. According to Endeca, "Users see only the possible outcomes, never the impossible ones."
  • Endeca's version of Guided Navigation can integrate text search at each step.
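The behavior described in the points above (any facet can be chosen in any order, options are recomputed at each click, and text search can be applied at any step) can be sketched generically. The Python below is a toy illustration of faceted navigation in general, not Endeca's implementation; the records, the facet names, and the `refine` and `next_options` helpers are hypothetical.

```python
from collections import Counter

# Hypothetical records; the titles and facet names are illustrative only.
RECORDS = [
    {"title": "Intro to Bonds", "format": "book", "topic": "investing", "audience": "beginner"},
    {"title": "Retirement Planning", "format": "book", "topic": "retirement", "audience": "beginner"},
    {"title": "Options Strategies", "format": "video", "topic": "investing", "audience": "advanced"},
    {"title": "401(k) Basics", "format": "video", "topic": "retirement", "audience": "beginner"},
]
FACETS = ("format", "topic", "audience")


def refine(records, selections, text=None):
    """Return records matching every selected facet value and, optionally, a text term."""
    hits = [r for r in records if all(r[f] == v for f, v in selections.items())]
    if text:
        hits = [r for r in hits if text.lower() in r["title"].lower()]
    return hits


def next_options(hits, selections):
    """Count the facet values present in the current hits, skipping facets already chosen.

    Because counts come only from the remaining hits, every option offered
    leads to at least one record, so there are no dead-end paths.
    """
    options = {}
    for facet in FACETS:
        if facet in selections:
            continue
        options[facet] = dict(Counter(r[facet] for r in hits))
    return options


# The user clicks "retirement" first; they could equally have started with format or audience.
selections = {"topic": "retirement"}
hits = refine(RECORDS, selections)
options = next_options(hits, selections)
print(options)
```

In this toy data set, choosing the "retirement" topic leaves only beginner-level items, so "advanced" is not offered as a next step. A production engine would compute such counts from an index rather than by scanning records.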

The question you will want to investigate carefully is whether Endeca's old-fashioned text search -- as opposed to its sexy categorization capabilities -- is good enough for your needs. Unlike competitor Vivisimo [which offers a classification add-on atop other search vendors' systems], text search with Endeca is integrated into the package, and not an external bolt-on....

Observations

Key: "+" indicates a positive feature, a "-" indicates an absent feature or a drawback, a "" indicates that the potential customer should closely investigate this aspect of the solution in the context of their own information sets.

Endeca Feature Snapshot

Feature               Rating  Notes

Search
Natural language      +       Supported
Boolean               +       Supported
Data type             +       Word, XML, PowerPoint, and more than 120 other file types found in organizations.
Fuzzy search          +       Supported
Relevance ranking     +       Yes, result sets are ranked by relevance.
Non-text content              Yes, if metadata are available for the binary object
Spider                +       Yes with graphical interfaces to key settings such as spidering depth
Fielded Search        +       Supported
Metasearch function           Collections may be formed.

System
Architecture                  Unix and Windows with a focus on integrating saved searches with specific work tasks, including within other enterprise applications.
Standards             +       Multiple Web and database standards are supported.
Infrastructure                Separate content instances are used for certain applications. Distributed indexes can be supported.
Indexing                      Creates an index that allows the licensee to use default "facets" or "related concepts." An administrative interface allows the licensee to develop customized term lists, synonyms, and "use for" lists.
User interface        +       Web browser. "Guided navigation" functions can be integrated into other applications.
Document limit                Performs optimally when collection sizes are carefully managed.
Hosted service        -       Can be supported, but not natively available.
Administrative tools  +       Graphical; toolkit supports mainstream programming languages
API                   +       Provided and allows access to index, interface (facets), and search functions.
Code support          +       C++, Java
Security              +       Handled by security flags on servers, folders, and documents


Benefits

Among the benefits associated with the Endeca solution are:

  • Combined structured and unstructured matches, featuring targeted search and relevance ranking within structured, unstructured and semi-structured information.
  • A multifaceted and highly contextual "Guided Navigation" approach to information location, access, and search.
  • A very attractive results interface that allows a user to organize information choices with single clicks.
  • APIs and tools to embed the technology into other enterprise systems to replace less usable search systems bundled with various portals, ERP, and other such applications.

Drawbacks

Don't be lulled by Endeca's low entry-level licensing price. As with most enterprise search systems able to scale and support extensive customization, license and customization costs can easily reach into six figures (USD) -- or more.

Systems that integrate structured and unstructured data into a single index with suitable metatags can place a significant processing burden on indexing subsystems. Robust hardware and sufficient network resources are required to permit extraction of new records in a database table, metatagging, and especially updating the core index.

Endeca's system, like other enterprise search systems that offer similar scalability, can be complex to configure. Among the issues that licensees may wish to consider are:

  • Less structure in the content means more classification challenges.
  • A mixed environment of structured and unstructured data means that metatagging is essential, and results may depend heavily on the effectiveness of Endeca's autoclassification facility against your corpus of information.
  • Management support is necessary to address content access and security issues raised in a metasearch environment.
  • The more diverse the source material -- what Endeca calls the "content stew" -- the greater the infrastructure resources required.

Conclusion

Endeca is one of the few search engine companies that warrants serious consideration solely on the basis of its markedly usable results interfaces. This may say more about the other vendors today than it does about Endeca.

The company also makes a strong case for information discovery. Endeca's premise is that "Search often works when you know precisely what you want, and know how to ask for it. Search fails when you don't know which choices you have, because 'you don't know what you don't know.'"

But it is important to separate the notion of "discovery" from usability. Yes, the interface seems more intuitive and the notion of "multifaceted" taxonomies has strong merit, but not every search use-case calls for a browsable, discovery-oriented approach. Endeca has found substantial success in e-commerce environments, but less penetration into traditional enterprise search, where the intrinsic quality of traditional text search and retrieval results is often as important as (or more important than) results classification. Categorization is a means to an end, and will always work better on highly structured data sets.

At the end of the day, therefore, it's the quality of results for searchers that matters. For e-commerce companies, Endeca can bring an attractive ROI as customers drill down faster through large catalogs. But for enterprise search, Guided Navigation is not a slam dunk. As with every other vendor in this report, you will want to test the system carefully with real users before buying.





About the Author

Stephen Arnold

Stephen Arnold is a three-decade search and retrieval industry veteran. He is founder of Arnold IT, an independent technology consulting and analysis firm. Arnold served as the principal author of the CMS Watch Enterprise Search Report, which provides detailed, objective reviews of 29 Enterprise Search products.





