Names

What Does Your CMS Call This Guy?

by Mark Baker
09-Jan-2002



Introduction

A content management system must manage the relationships of the information objects it contains. There are two ways to relate information objects: linking and naming. Linking creates a specific connection between two (or more) specific information objects. Naming clarifies the names of things referred to in one information object in such a way that it is possible at a later time to create a link to many different objects.

For example, in this passage from the review of Rio Lobo in Leonard Maltin's Video and Movie Guide (Signet, 1994):

    <p>Hawks' final film is a lighthearted Western in the Rio Bravo mold, with the Duke as an ex-Union colonel out to settle some old scores.</p>

We can create a link from the words "the Duke" to a web site about John Wayne:

    <p>Hawks' final film is a lighthearted Western in the Rio Bravo mold, with <A HREF="http://www.westerns.com/stars/john_wayne/index.html">the Duke</A> as an ex-Union colonel out to settle some old scores.</p>

Or we can use markup to clarify that the words "the Duke" are in fact a nickname for the actor named John Wayne:

    <p><director name="Howard Hawks">Hawks'</director> final film is a lighthearted Western in the <movie>Rio Bravo</movie> mold, with <actor name="John Wayne">the Duke</actor> as an ex-Union colonel out to settle some old scores.</p>

The advantage of this second approach is that we have now preserved the information required to form a variety of different links at a later time.
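As a rough illustration of forming links from names at delivery time, the Python sketch below renders a shortened version of the marked-up sentence as HTML. The `LINK_TARGETS` table and the `render_links` function are hypothetical names introduced here, and the sketch assumes that every child element of the paragraph is a name reference.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical lookup table: (kind of object, controlled name) -> link target.
LINK_TARGETS = {
    ("actor", "John Wayne"): "http://www.westerns.com/stars/john_wayne/index.html",
}

# Shortened version of the review sentence, for illustration only.
REVIEW = (
    '<p>A Western in the <movie>Rio Bravo</movie> mold, with '
    '<actor name="John Wayne">the Duke</actor> as an ex-Union colonel.</p>'
)

def render_links(xml_text, targets):
    """Render a marked-up paragraph as HTML, linking names that have a target."""
    root = ET.fromstring(xml_text)
    parts = [escape(root.text or "")]
    for child in root:
        text = escape(child.text or "")
        # Elements without a name attribute use their own text as the name.
        key = (child.tag, child.get("name", child.text))
        url = targets.get(key)
        parts.append(f'<a href="{escape(url)}">{text}</a>' if url else text)
        parts.append(escape(child.tail or ""))
    return f"<p>{''.join(parts)}</p>"
```

Calling `render_links(REVIEW, LINK_TARGETS)` links "the Duke" and leaves "Rio Bravo" as plain text. Passing a different table produces a different set of links from the same content.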

Since one of the key virtues of using a content management system is to give you control over how your information is linked, organized and delivered, forming relationships based on names rather than links is a useful way of gaining greater control over your content and increasing the range of uses you can make of it.

The purpose of this paper is to examine the types of names and naming schemes that can be used to form relationships in a content management system.

Naming things

For purposes of naming objects referred to in content we require a naming scheme that:

  1. adequately identifies the named object for purposes of the processing to be performed to generate links, and
  2. allows authors or editors to identify objects using a name they know or can easily access.

As we'll see below, these requirements sometimes compete with each other.

There are three types of names to choose from. Each has its advantages and drawbacks.

Common names
The common name is the name those familiar with the object normally call it. Common names seldom make stable identifiers, because they are subject to ambiguities and cannot always be guaranteed unique. They also tend to change if

  • an underlying product changes name,
  • the underlying vocabulary changes (as it does frequently in high tech),
  • the current name proves unclear to readers, or
  • the author needs to reorder conceptual or abstract material.

Abstract names
Abstract names are generally created solely for the purpose of being guaranteed unique, unambiguous, and stable. For instance, because people share common first names and surnames with others, most governments assign citizens a number that identifies them uniquely. Citizens keep the same number for life, even if their name changes. It is common practice in database systems to generate unique local key values as abstract names for records in a table.

It is generally good practice to make abstract names meaningless, since meaningful abstract names may tend to be adopted as common names and thus be subject to all the pressures that change common names.
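A minimal Python sketch of the idea, assuming a simple in-memory registry (the `NameRegistry` class and its methods are hypothetical): each object receives a meaningless number that never changes, while its common name stays free to change or to collide with other common names.

```python
import itertools

class NameRegistry:
    """Assigns each object a meaningless, permanent abstract name."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._common_names = {}  # abstract name -> current common name

    def register(self, common_name):
        abstract_name = str(next(self._next_id))
        self._common_names[abstract_name] = common_name
        return abstract_name

    def rename(self, abstract_name, new_common_name):
        # Only the common name changes; the abstract name never does.
        self._common_names[abstract_name] = new_common_name

    def common_name(self, abstract_name):
        return self._common_names[abstract_name]

    def find(self, common_name):
        """Return every abstract name that currently has this common name."""
        return [k for k, v in self._common_names.items() if v == common_name]
```

Two people registered as "John Smith" get different abstract names, and renaming one of them leaves both abstract names untouched.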

Formal names
For some sets of objects there are more formal naming schemes used in place of common names where greater precision is required. In botany, for instance, every plant has a unique Latin name, though it may have many common and local names. Formal names require a formal model and some authority charged with the maintenance of the model and the collection of formal names associated with it.

Formal names are usually stable and unique, though not as absolutely so as abstract names (a botanical name may change, for instance, if the plant in question is proved to belong to a different species or family than first thought). The set of formal names often overlaps the set of common names. Houseplants, for instance, are sometimes known by common English names and sometimes by the Latin botanical names.

You can also create a system of formal names for use within your own system. Essentially, such a system would be a collection of human readable names that obey a set of rules that ensure that they are unique and unambiguous within the context in which they are used.
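One way such a home-grown scheme might be enforced is sketched below in Python. The rule itself (lowercase words joined by single hyphens) and the `FormalNames` class are assumptions made for illustration; your own rules would depend on your content.

```python
import re

# Assumed rule: lowercase words separated by single hyphens, e.g. "john-wayne".
FORMAL_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")

class FormalNames:
    """Keeps formal names unique within each namespace."""

    def __init__(self):
        self._names = {}  # namespace -> set of names already in use

    def check(self, namespace, name):
        """Return None if the name may be added, else a description of the problem."""
        if not FORMAL_NAME.fullmatch(name):
            return f"{name!r} does not follow the naming rules"
        if name in self._names.get(namespace, set()):
            return f"{name!r} is already used in {namespace!r}"
        return None

    def add(self, namespace, name):
        problem = self.check(namespace, name)
        if problem:
            raise ValueError(problem)
        self._names.setdefault(namespace, set()).add(name)
```

Because uniqueness is checked per namespace, the same formal name can be used for an actor and a director without ambiguity.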

Working with names in markup

To understand how names are used in CM systems, let's look at how they are used in relational database systems (RDBMSs). RDBMSs rely on unique local keys as names for establishing relationships between tables. While these local keys are not actually required to be abstract names, this is the recommended practice. In well-managed databases, every new record is automatically assigned a unique and unchangeable ID number (often called the "surrogate key") when it is created. This ID number field is used as the local key for the table. When a relationship is formed from a record in one table to a record in another, the local key from one table is entered as a "foreign key" in the related table.

Database tools commonly provide simple mechanisms that developers can use to shield users from the abstract names. This allows users to see common names as they perform linking operations. The user sees a common name, but the database identifies the relationship with the abstract name. For tabular data, this provides the best of both worlds.
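The sketch below shows this arrangement using Python's built-in `sqlite3` module. The `actor` and `review` tables and the `reviews_with_actor_names` function are hypothetical and exist only to illustrate the pattern: the relationship is stored as an abstract key, and a join presents the common name to the user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE actor (
        id   INTEGER PRIMARY KEY,   -- surrogate key, assigned by the database
        name TEXT NOT NULL          -- common name, free to change
    );
    CREATE TABLE review (
        id       INTEGER PRIMARY KEY,
        title    TEXT NOT NULL,
        actor_id INTEGER NOT NULL REFERENCES actor(id)  -- foreign key
    );
""")

actor_id = conn.execute(
    "INSERT INTO actor (name) VALUES (?)", ("John Wayne",)
).lastrowid
conn.execute(
    "INSERT INTO review (title, actor_id) VALUES (?, ?)", ("Rio Lobo", actor_id)
)

def reviews_with_actor_names(conn):
    """Join on the abstract key, but return only names a user would recognize."""
    rows = conn.execute(
        "SELECT review.title, actor.name FROM review "
        "JOIN actor ON actor.id = review.actor_id ORDER BY review.id"
    )
    return rows.fetchall()
```

If the actor's common name is later edited, the review still points at the same record, because the relationship depends only on the key.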

However, other types of material (descriptive, highly textual, or inherently hierarchical) are usually held in documents, and documents and document-based tagging schemes such as XML provide no such facility for hiding abstract names.

Content management systems must manage information relationships that occur both in tabular data and descriptive text. They typically do this by subsuming descriptive material into the tabular structure while maintaining full access to the relationships present in the descriptive text. To do this, a system can use XML to mark up a piece of text to indicate:

  1. That it is a name or a reference to an object (a real world object, or an information object).
  2. What kind of object it is (what namespace it belongs to).
  3. The controlled form of the name being used in the namespace.

In the example cited earlier:

    <actor name="John Wayne">the Duke</actor>

The tag "actor" specifies that the text "the Duke" is a name (in this case a nickname) for a real world object, and it specifies that the type or namespace for that object is "actor". Actors' names are managed by the Screen Actors Guild, so we have a formal name, "John Wayne", that corresponds to the nickname "the Duke", and this is specified by the "name" attribute.
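To make the three pieces of information concrete, here is a small Python sketch that pulls them out of marked-up text. The `NameReference` type and `extract_names` function are hypothetical, and the sketch assumes that every element inside the enclosing paragraph is a name reference.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple

class NameReference(NamedTuple):
    namespace: str   # what kind of object it is (the tag)
    name: str        # controlled form of the name
    text: str        # the words the author actually wrote

def extract_names(xml_text):
    """List every name reference found inside a marked-up paragraph."""
    root = ET.fromstring(xml_text)
    refs = []
    for elem in root.iter():
        if elem is root:
            continue  # the enclosing paragraph is not itself a name
        text = "".join(elem.itertext())
        # Without a name attribute, the text itself is taken as the name.
        refs.append(NameReference(elem.tag, elem.get("name", text), text))
    return refs
```

For the example above, wrapped in a paragraph element, the result is a single reference with namespace "actor", name "John Wayne" and text "the Duke".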

What this markup does, in database terms, is insert a foreign key into the markup of the information component. Whether or not there is a database table with a local key of "John Wayne" can remain an open question, but the fact that the name is established in its proper namespace means that we can form a relationship with any resource that exposes the local key "John Wayne" -- now or in the future.

This is extremely useful, because it allows you to establish links from older material in your system to newly added material without having to go back and retrofit the old material. As long as the new material is given a name that you can connect to the names embedded in the old content, the link can be formed dynamically when the material is presented. Thus if you subsequently add a biography of John Wayne to your collection of movie-related content, a link from "the Duke" in the Rio Lobo review to this biography can be generated (based on the name "John Wayne") the next time the review is accessed.
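The timing is the important point, and a short Python sketch may make it clearer. The `ContentCatalog` class is hypothetical; it simply indexes resources by the names they expose and looks them up each time content is presented.

```python
from collections import defaultdict

class ContentCatalog:
    """Index of resources keyed by the names they expose."""

    def __init__(self):
        self._resources = defaultdict(list)

    def publish(self, namespace, name, url):
        """Record that a resource about this named object is now available."""
        self._resources[(namespace, name)].append(url)

    def links_for(self, namespace, name):
        # Looked up on every access, so newly published material is found
        # without touching the older content that refers to it.
        return list(self._resources.get((namespace, name), []))
```

Before the biography is added, `links_for("actor", "John Wayne")` returns an empty list. Once the biography is published under that name, the same call returns its address, and the review itself has not been edited.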

Naming database records

In some cases, you may want to establish a direct relationship from a phrase in your content to a database record in your system. You may do this either because

  • the database record actually contains the thing you wish to refer to, or
  • you are employing a database table to contain extended identifying information that you can traverse to link to other useful bits of content.

To make a reference to a database record from your markup, you can create an attribute to refer to the appropriate table, and use the value of the key field as the attribute value to identify the record. For example, in the sentence

    "Jacques Villeneuve's car crossed the line in first place."

you can add markup to relate the words "Jacques Villeneuve's car" to the entry for "Williams" (the name of the Formula 1 team Villeneuve drove for) in the "Car" table with simple markup:

    "<car ID="342">Jacques Villeneuve's car</car> crossed the line in first place."

The problem is that the author looking at this markup has no easy way to verify the accuracy of the link or to understand what the link means.

We can solve this by adding the common name to the markup:

    "<car ID="342" name="Williams">Jacques Villeneuve's car</car> crossed the line in first place."

Now it is clear what the markup means. (The tag should be generated automatically from the database, both to ensure accuracy and to simplify authoring.)

Of course, this markup has now become rather verbose. In fact, the information object in question is named three times: by its abstract name ("342"), its common name ("Williams") and an ad hoc name ("Jacques Villeneuve's car"). There are advantages to this verbose markup, however, as it greatly improves referential integrity checking and change management.
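As a sketch of the kind of integrity check the redundant names make possible, the Python below compares the ID and name in the markup against a table. The `CAR_TABLE` dictionary is a hypothetical stand-in for the "Car" table, `check_references` is an illustrative function, and the sentence is assumed to be wrapped in an enclosing element so that it parses as XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the "Car" table: key field -> common name.
CAR_TABLE = {"342": "Williams"}

def check_references(xml_text, table, tag="car"):
    """Return a list of problems found in the references of one kind."""
    problems = []
    for elem in ET.fromstring(xml_text).iter(tag):
        record_id, name = elem.get("ID"), elem.get("name")
        if record_id not in table:
            problems.append(f"ID {record_id} not found")
        elif table[record_id] != name:
            problems.append(
                f"ID {record_id} is {table[record_id]!r}, markup says {name!r}"
            )
    return problems
```

A mistyped ID that happens to match another record is caught because the common name no longer agrees, and a record renamed in the database shows up as a mismatch that an editor can review.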

Conclusion

Using markup to clarify the meaning of names in your content -- rather than embedding pre-selected links -- opens up many opportunities to provide linking that is appropriate to a particular audience or medium, and to achieve consistent linking policies for your information set. It also simplifies link management and makes life easier for authors. In return, it requires you to think carefully about the naming scheme your content requires.

For more information on this approach to relationship management in a content management system see the OmniMark Technologies white paper "Content Engineering" at http://www.omnimark.com/products/contentengineering.pdf.



About the Author

Mark Baker

Mark Baker is the principal of Analecta Communications, a communication consulting firm in Ottawa, Canada. His former positions include Manager of Information Engineering methods at Nortel and Director of Communications for OmniMark Technologies. Mark has written and spoken extensively on single sourcing and markup. He is co-author of HTML Unleashed, 2nd Edition and author of Internet Programming with OmniMark. He is currently writing a book on refactoring content.


