
Value of Organized Knowledge

By Jack Bryar, 2002-01-21


An Old Problem Gets Worse

In recent years, the volume of news and information resources available to the typical corporate employee has grown exponentially. Corporate Web traffic has jumped by over 600% annually. Web-available content exceeds several billion items. Executives frequently receive more than 200 e-mails a day. The amount of corporate data generated per employee doubles every 18 months.

    "If you printed the information available through our Intranet, it would stretch from the earth to the sun."
    -- Marc Auckland, World-Wide Chief Knowledge Manager, British Telecom

Corporate managers are worried. Sixty percent say that info-glut is having a negative effect on productivity. IDC estimated that in 1999, US Fortune 500 companies lost $12 billion due to an inability to locate knowledge resources amidst all the clutter. Eighty percent of executives believe the problem will get worse before it gets better.

Adding to the strain is the fact that this content is so difficult to access. Corporate information exists in many forms. Each form (electronic news, email, databases, Web pages, archived documents, etc.) resides in its own format, accessible only through its own index system. Much of it is scattered across the enterprise.

Yet, access is critical. New applications have sprung up requiring access to information across the enterprise, sometimes across multiple businesses. These include next-generation customer care, competitive research, and B2B transactions. In order to get at the information needed to run these applications, information itself needs to be re-structured, and re-organized -- and so does the method of getting at that information.

Info-Illiteracy: A Barrier to Finding Information

The problem of business info-glut is worse than it appears. Many employees lack the skills needed to find the information they require.

For years, putting tools in the hands of the users was considered the best way for companies and their knowledge workers to get their hands on the information they needed. In most cases, that has meant providing users with a search engine similar to systems found on public Websites. Today, many knowledge workers have to navigate as many as six different search engines and database indexes each day.

New research casts doubt on how well search engines work for most users. A study of AltaVista users revealed a surprising amount of info-illiteracy. According to that study:

  • 80% couldn't/wouldn't build a working Boolean search
  • 87% used fewer than three words
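
For readers unfamiliar with the term, a Boolean search combines terms with AND, OR and NOT. The following is only a minimal sketch of the idea: the three sample documents and the helper names (build_index, term) are hypothetical and are not drawn from the AltaVista study.

```python
# Hypothetical mini-corpus; the document ids and text are illustrative only.
DOCS = {
    1: "babe ruth hit sixty home runs",
    2: "babe the pig wins the sheepdog trial",
    3: "ruth wrote a review of the film",
}

def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index

INDEX = build_index(DOCS)
ALL_IDS = set(DOCS)

def term(word):
    """Return the ids of documents containing `word` (empty set if none)."""
    return INDEX.get(word.lower(), set())

# babe AND ruth: documents that contain both words
babe_and_ruth = term("babe") & term("ruth")

# babe AND NOT pig: documents that contain "babe" but not "pig"
babe_not_pig = term("babe") & (ALL_IDS - term("pig"))

# ruth OR pig: documents that contain either word
ruth_or_pig = term("ruth") | term("pig")
```

A one-word query for "babe" returns both the baseball story and the pig story; the operators are what narrow the result, and they are exactly the step most users in the study skipped.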

Which Babe Did You Mean?

A big part of the problem is that the same term can have different meanings to different people. Not knowing which terms will uncover sought-after information is a significant barrier for many knowledge workers. Any successful strategy for managing information has to overcome this problem.

    In 1814, Thomas Jefferson was so dissatisfied with the ruined and disorganized state of the documents at the Library of Congress that he donated his collection and then personally reclassified all the books there.
    -- Source: Systems of Knowledge Organization for Digital Libraries, Gail Hodge

XML to the Rescue?

The Internet has been described as the world's largest library, with the books thrown all over the floor. Many corporate information systems look just as disorganized. Information managers are convinced that the best solution to this clutter involves wrapping up all electronic document forms inside a common format, so that the content inside can be more easily found and used by different applications.

The wrapper being used by most organizations today is XML. XML allows the tagging of a document with a description of what the document is about, and where it came from. Searching on XML meta-tags can certainly simplify the search process.
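
As a rough illustration of what such a wrapper might look like, the sketch below uses Python's standard xml.etree.ElementTree module. The element names (document, meta, subject, source, body) and the sample content are assumptions made for this example, not a published schema or any vendor's format.

```python
import xml.etree.ElementTree as ET

def wrap_document(body, subject, source):
    """Wrap raw text in an XML envelope carrying two descriptive tags.

    The element names used here are hypothetical, chosen for illustration.
    """
    doc = ET.Element("document")
    meta = ET.SubElement(doc, "meta")
    ET.SubElement(meta, "subject").text = subject  # what the document is about
    ET.SubElement(meta, "source").text = source    # where it came from
    ET.SubElement(doc, "body").text = body
    return doc

def find_by_subject(docs, subject):
    """Return the wrapped documents whose subject tag matches exactly."""
    return [d for d in docs if d.findtext("meta/subject") == subject]

docs = [
    wrap_document("Carrier expands broadband rollout.",
                  "Digital Subscriber Line", "newswire"),
    wrap_document("Quarterly sales memo.", "Sales", "email"),
]

matches = find_by_subject(docs, "Digital Subscriber Line")
xml_text = ET.tostring(docs[0], encoding="unicode")
```

The search never looks at the body text; it succeeds or fails entirely on what was written into the subject tag.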

Unfortunately, XML does not solve the problem of finding information. It only standardizes the problem. It requires that any XML tagging system clearly understand what the document is about, and it needs to anticipate the search process someone might try to use to find it. This takes time, a great deal of sophistication, or both. Otherwise, the process results in hiding essential documents behind generic, idiosyncratic or meaningless tags, making the information management and retrieval problem even worse.

In order for XML tagging to be meaningful for search and retrieval, the terms used to tag content have to be intuitive enough to encourage their use by information-seekers. They should be structured in a standardized way; less as a set of variable keywords and more like a set of subject categories. These subject categories should be set up in a hierarchical fashion, with logical subtopics and overviews. This, in short, is a taxonomy.
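
One simple way to picture such a hierarchy is as a table that links each category to its parent. The sketch below assumes a small, invented set of categories; it is not the NewsEdge taxonomy, and the function names are hypothetical.

```python
# Hypothetical taxonomy: each category maps to its parent
# (None marks a top-level topic).
PARENT = {
    "Technology": None,
    "Telecommunications": "Technology",
    "Broadband": "Telecommunications",
    "Digital Subscriber Line": "Broadband",
    "Cable Modem": "Broadband",
}

def path_to(category):
    """Return the chain of categories from the top-level topic down to `category`."""
    path = []
    while category is not None:
        path.append(category)
        category = PARENT[category]
    return list(reversed(path))

def subtopics(category):
    """Return every category that sits beneath `category`, at any depth."""
    found = []
    for child, parent in PARENT.items():
        if parent == category:
            found.append(child)
            found.extend(subtopics(child))
    return found

dsl_path = path_to("Digital Subscriber Line")
telecom_subtopics = subtopics("Telecommunications")
```

Because every category knows its parent, a searcher can start from a broad overview such as "Telecommunications" and drill down, or start from a narrow topic and see where it sits.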

The ability to search or manipulate content "by category" is an essential benefit of a successful XML tagging process.

Taxonomies Defined

Taxonomies are sometimes called "classification schemes" or "categorization schemes." Each refers to grouping together similar items into broad "buckets" or "topics" which themselves can be grouped together in ever-broader "hierarchies." Examples of taxonomies include systems as diverse as the Dewey Decimal System found in small libraries, Yahoo's Subject Index, and the massive taxonomic system proposed by Linnaeus and used by generations of biology students. Wherever they are used, they have the same goal -- to organize knowledge about a given subject.

A sample taxonomy from NewsEdge:

[Figure: Sample Taxonomy]

Taxonomies and The Search Process

Perhaps the greatest benefit of taxonomies is improved searching.

Properly constructed taxonomies simplify the process of gathering "the right" information for daily business use by simplifying the vocabulary used in the search process. Tagging systems using raw keywords or similar strategies are likely to generate search error rates approaching those of straight text searches. For example, while a search on the word "DSL" will find stories on a particular type of broadband technology, it will miss others, and may accidentally find content referring to Dutch Sign Language or Data SubLanguage.

A better approach would be to define these documents as belonging to the subject category, "Digital Subscriber Line." If the searcher can focus on a proven set of categories rather than guess at keywords, the chances of finding the right content are far greater, and the process will be faster and more reliable.
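
The difference between the two approaches can be sketched in a few lines. The three documents and their category assignments below are invented for illustration, and the function names are hypothetical.

```python
# Hypothetical documents: headline text plus a category assigned by a tagger.
DOCUMENTS = [
    {"text": "New DSL service reaches rural customers",
     "category": "Digital Subscriber Line"},
    {"text": "Phone company upgrades copper lines for broadband",
     "category": "Digital Subscriber Line"},
    {"text": "DSL interpreters needed at Amsterdam conference",
     "category": "Dutch Sign Language"},
]

def keyword_search(docs, keyword):
    """Return documents whose text contains `keyword` as a whole word."""
    return [d for d in docs if keyword.lower() in d["text"].lower().split()]

def category_search(docs, category):
    """Return documents tagged with exactly this subject category."""
    return [d for d in docs if d["category"] == category]

keyword_hits = keyword_search(DOCUMENTS, "DSL")
category_hits = category_search(DOCUMENTS, "Digital Subscriber Line")
```

In this toy data set the keyword search returns the sign-language story and misses the broadband story that never uses the abbreviation, while the category search returns both broadband stories and nothing else.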

The most important contribution of taxonomies to the search process is that they work.

Even using a relatively primitive taxonomic system, Microsoft reported a 40% improvement in hit rates. Satisfaction metrics doubled. In addition, the time spent trying to find a given document was significantly reduced. The success rate of taxonomy-based searching reduces the strain on systems and on the people who use them.

Business is Complicated

Naturally, one of the most important criteria for a taxonomy is that it should be easy to navigate. But building solid taxonomies is much easier said than done. Consider, for example, a taxonomy of business subjects.

Businesses vary in size and have multiple points of focus. Business activities involve an array of subjects that do not always fall into logical groupings. Subject boundaries are often fuzzy.

Subject hierarchies can feel artificial, as content, particularly business-critical content, may fall into multiple categories. Indeed, most executive-level business documents involve several categories. Traditional indexing schemes dissolve in complexity as the number of unique concepts grows.

So, while some subjects are relatively easy to categorize, most business functions are not. (I should know: NewsEdge has spent several years developing a proprietary business taxonomy.) Nevertheless, you should seriously consider developing a taxonomy for the content management system residing underneath your e-business efforts in general, and your Intranet in particular. Your content contributors and end-users alike will be grateful.

Copyright, 2001 - 2010, Real Story Group. All rights reserved.
