Search Marketplace
Recent Trends in Enterprise Search
by Stephen Arnold
03-Aug-2005 --

The truth is that nothing associated with locating information is cheap, easy, or fast.
Internet search certainly works. Google and Yahoo! use statistical voting algorithms to deliver results that hit dead center in the middle of a consumer-search bell curve. But organizations are not Google or Yahoo!, with advertisers paying the bills and thousands of the world's smartest people working on squeezing more revenue from search.
As explained in the Enterprise Search Report, search within an enterprise is a different problem, and often does not work as well as employees want. Enterprise search follows a different path that leads into a swamp of business language, security, changes to data, and computer systems. In fact, enterprise search is a misnomer because no organization with sane management wants all of its information searchable.
So instead of one bull's eye, there are many. A lawyer working for the chief financial officer may need everything pertaining to a deal involving dozens of employees, several departments, and documents in all versions over a span of years. An engineer in product development needs to locate the AutoCAD drawing used for a part ordered three months ago from a vendor in Taiwan.
Either of these two examples illustrates the challenge for enterprise search. Each professional requires a specific type of search functionality. A utility or commodity search system can only go so far; beyond that point, a specialized system must be available. In fact, I see the marketplace dividing into two classes of search vendors: vendors of utility search and specialty players.
Advent of Utility Search
Today most enterprise search systems are utilities; that is, a search system can match a user's query to items in an index, work on a wide range of file types, and display results in a relevance-ranked list. Search is no longer special. Just like electricity flowing from a power main, enterprise search delivers these basic, commodity functions. And as such, traditional enterprise search vendors are vulnerable to "baked-in search" within specific data management or content management systems. Oracle, Microsoft, IBM, and SAP all include search in their other enterprise software. Although the technologies vary to some degree, each of these systems offers essentially the same type of search and retrieval.
Therefore, enterprise search system vendors have to explain first why their search is different from or better than Google and then explain why a company should spend money to buy search to replace what is already included in other enterprise software.
As a result, for the last two-and-a-half years, enterprise search vendors have repeatedly tried to recalibrate themselves with hope-inducing features. But has enterprise search actually improved over that time? The answer is, "Some." What has happened is that the enterprise search system vendors have highlighted certain functions and positioned these functions as eliminating the myriad annoyances of an existing enterprise search system. When an enterprise search system is upgraded, some annoyances are reduced but may not be entirely eliminated.
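To make the "utility" functions concrete, here is a minimal sketch in Python of matching a query against an index and returning a relevance-ranked list. It is an illustration only, not any vendor's method: the sample documents, the function names, and the simple TF-IDF-style scoring are all assumptions chosen for brevity.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> {doc_id: count of term in that doc}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, n_docs, query):
    """Return doc ids ranked by a simple TF-IDF-style score (an assumed scheme)."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        # Terms that appear in fewer documents carry more weight.
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by doc id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical documents, loosely echoing the engineer's example above.
docs = {
    "memo": "Quarterly budget memo for the finance team",
    "drawing": "AutoCAD drawing for the bracket part ordered from the vendor",
    "contract": "Vendor contract covering part pricing and delivery",
}
index = build_index(docs)
results = search(index, len(docs), "vendor part drawing")
print(results)
```

A sketch like this covers the commodity case. It says nothing about security, versions, or file formats, which is where the specialty systems come in.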
Recent Trends
The enterprise search sector rides a different trend bandwagon every six or nine months. Let's look at some of the trends:
In 2003, the Google Appliance entered the enterprise market. The Google Appliance rode the coattails of Google Search and caught other vendors with their complexities exposed. Google offered a search toaster for what looked like a fixed price. Since its introduction, the search toaster has not always been as easy to operate as promised, and the pricing for serious systems hits six figures or more.
Today, an enterprise can select from three competing appliances: the original appliance produced by Thunderstone, the Google Appliance, and the Index Engines' appliance. Buy one, plug it in, and presto, enterprise search.
For searching Intranet websites or certain specialized content, an appliance generally works. For true enterprise applications, Google threatens to get tangled in the same tar ball as Autonomy, Convera, FAST Search & Transfer, and Verity (the "Big Four" of enterprise search), albeit with a different twist. In order to differentiate themselves from one another, these flagship search companies offer a wide range of bells and whistles. Most of these advanced functions promise to make the search experience better, faster, and more intuitive. The truth is that layering on functionality can increase costs and slow performance unless an organization adds more computing horsepower to the search system. Google is rumored to be planning for its forthcoming Appliance Version 5 to allow multiple Google Appliances to "hook up" and share resources. Google's advanced technology is a refreshing change from the technologies used by the Big Four. Google automates; the Big Four require a licensee to invest in extensive customization.
By 2004, integrating into business processes became the must-have enterprise search function. The leader of this trend was Endeca, a company with the canny ability to explain search and retrieval in terms of payoff to the customer. Endeca's insight was not just advanced technology. The notion was that canned and free-form search of content limited to what an employee needed to perform her job was the solution to an organization's enterprise search woes. Endeca was right. Narrowing content domains and creating searches that launch automatically eliminated the confusion the "search box" presents to many employees.
Now most enterprise search systems have some type of embedded search component. FAST Search & Transfer, among others, has adopted Endeca's marketing lingo. Most enterprise search systems can launch a saved search and push the results to a Web page available to a specific user. Sadly, however, not all employee information needs can be anticipated with answers coming from known domains of content.
Later in 2004, the buzz increased for XML. Essential for Web Services, XML was trotted out as the way to break down the barriers between structured information (the data residing in Oracle or SQL Server databases) and unstructured information (electronic mail, Word documents, and PowerPoint slides). Never mind the fact that creating XML with consistent tags across documents from different sources represents a non-trivial task. Most enterprise search systems supported structured documents before the XML hyperbole kicked in. Corporate procurement teams began to consider the XML-centric technology from specialists such as Diesel Point and vendors of integrated knowledge management systems such as Open Text and Hummingbird.
By March 2005 automatic classification of content had gained momentum. The idea is that organizations have content and, in the words of Donald Rumsfeld, "Don't know what they don't know." As one might expect, automatic classification has become available in most enterprise search vendors' products, but the results have been spotty; hence the temptation to go with specialized technologies.
The approach is identical to that used by the aftermarket industry for automobiles. Don't like your car's stereo? Upgrade. Don't like your wheels? Upgrade. These aftermarket add-ons for enterprise search now include -- in addition to auto-classification -- federated searching, non-English linguistic functions, and dozens of other search exotica and performance boosters. There are scores of search add-on vendors. Some are established technology firms such as Inxight, a spin-out from Xerox PARC.
Why are these needed? The functions available from many enterprise search system vendors are uneven. Human specialists will still do a better job indexing Farsi or combining the results of different collections of content than today's core systems can provide, and so the search for magic bullets continues.
What's ahead for the balance of 2005 and early 2006? The answer is, text mining, sometimes called "text analytics" or "content mining." Borrowing from the older realm of data mining in the business intelligence field, the notion behind this new essential enterprise search feature is that algorithms can tirelessly grind through text, discern patterns, and discover relationships. A user can consult the reports generated by a text mining system and uncover relationships, insights, and nuances previously not available at the click of a mouse. No searching required. No reading of documents necessary. Both SAS and SPSS (vendors of statistical software) are in the text mining business, along now with most vendors of enterprise search. Here again, expect hype to supersede results in the short term.
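For a sense of what "grinding through text" to discover relationships can mean at its very simplest, consider the sketch below. It counts how often pairs of names appear in the same document. Everything in it is an assumption for illustration: the entity list, the sample sentences, and the function name are hypothetical, and commercial text mining products use far more elaborate linguistic and statistical methods.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical watch list of names to track; a real system would
# extract entities automatically rather than rely on a fixed list.
ENTITIES = ["oracle", "sap", "google", "endeca"]


def cooccurrences(documents, entities):
    """Count how many documents mention each pair of entities together."""
    pairs = Counter()
    for text in documents:
        words = set(re.findall(r"[a-z]+", text.lower()))
        # Sort so each pair is always counted under the same key.
        present = sorted(e for e in entities if e in words)
        pairs.update(combinations(present, 2))
    return pairs


# Hypothetical sample documents.
documents = [
    "Oracle and SAP both bundle search with their applications.",
    "SAP ships TREX; Oracle ships its own search.",
    "Google sells an appliance.",
]
pairs = cooccurrences(documents, ENTITIES)
print(pairs.most_common(1))
```

A report built on counts like these can hint at a relationship, but a human still has to decide whether the pattern means anything.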
Observations
Several observations are warranted.
First, search system vendors are scrambling to figure out how to sell a product that can keep most of the customers happy most of the time, but in a relatively flat marketplace still grab sales from competitors' order books. Expect a bit of fudging as vendors push to replace existing systems.
Second, the focus on XML tagging, taxonomies, and algorithmic processes should suggest to you by now that this search-and-retrieval activity is complicated. The appliance approach works for some search applications, but usually not for those where the content entails a mix of images, text, and numbers.
Third, the enterprise search community is turning full circle or rediscovering what indexers have known since scrolls were rolled and pushed into clay tubes at Ephesus. Humans and human-like processes are needed to supplement or do certain types of taxonomy development, indexing, classification and analysis. We don't live in a "Star Trek" universe; a machine can only perform certain limited functions.
Search systems, yes, but from whom?
A good search system can be built from any one of more than 100 different companies' software today. Nevertheless, information remains a slippery animal. The way to domesticate information is to narrow the field of focus. Organizations are made up of chemists, lawyers, accountants, salespeople, and dozens of other specialists. Chemists and engineers have one type of search need. Lawyers have another.
This means the longer-term future of enterprise search may reside with the big boys -- Oracle, IBM, Microsoft, and Google -- who can put a basic search utility inside their applications, databases, and servers. Integration of this type is already underway at Oracle and SAP. The latter's NetWeaver product includes the TREX search tool. Any SAP application can be searched in myriad ways because TREX is in the DNA of SAP R/3.
Going forward, pure-play enterprise search vendors will change again and probably become more specialized, not more generalized. Most of the vendors will be forced to do one or two things well, not many things in an average or poor way.
In the meantime, enterprises seeking the perfect enterprise search system face a digital snipe hunt. The problem is, there aren't any snipe.


