Deployment

Pay Attention to Content Deployment

by Darren Guarnaccia
05-May-2004



So you are putting a new website or new area of your site under content management. The templates have been built, content has been entered, but there's just one last thing you have to do: Deploy the new site -- text, images, script, etc. -- from your staging and development environment to production. And going forward you'll have to do it every time you add or modify content.

Increasingly, organizations seek to separate content management, or the "development environment," from content delivery, or the "production environment." Under this approach, editors and content contributors work in a staging or development environment, which then publishes the content to a production environment. This approach has many advantages, but it also introduces new challenges, including the entire process of content deployment.

In this article I'll suggest many things to consider and identify some choices you need to make to ensure that your deployment process is secure, consistent, and non-disruptive for your authors and readership alike.

Infrastructure Security

The first major consideration when deploying content from staging to production environments is infrastructure security. One of the most important steps in securing your online presence is locking down known vulnerabilities on your production servers. Separating content management from delivery should make this task easier by reducing the number of users that have access to your production servers.

However, your CMS server will need some way to communicate with the delivery tier -- if only to deploy content -- so you need to consider the security dimensions of any deployment regimen.

Now that you've established your separate environments, you need to ensure that you can map the rights assigned in the staging environment to the production environment. Depending on the type of site you have (Internet, extranet, or intranet), you may need to apply rights to those files once they are published. You might lock down all the accounting department's HTML pages so that only people in the accounting department can access them, or you may need to remove the file system restrictions altogether when publishing your public-facing Internet site. This brings up an interesting dilemma. How are you going to map your file system rights for these files from the staging server to the production server? Or even worse, what if they are different types of servers, say NT and Solaris? This sort of rights synchronization issue is the kind that always bites CMS development teams in the shorts at the 11th hour.

To complicate matters, you've locked your production server down so tight that it squeaks when it walks, but you have to poke at least one little hole in the barrier to allow your content to be published. Consider building automation or using third-party tools to apply rights as appropriate to the production environment.
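As a rough illustration of the kind of automation meant here, the Python sketch below applies file modes to a published tree from a simple rules table. It assumes a POSIX production server; the section names, the modes, and the function names are hypothetical, and a real deployment would also need to map owners and groups (or ACLs on Windows).

```python
import os
from pathlib import PurePosixPath

# Hypothetical rights table: site section -> POSIX mode on the production server.
# The section names and modes are illustrative assumptions, not a recommendation.
RIGHTS_RULES = {
    "": 0o644,            # default: world-readable public pages
    "accounting": 0o640,  # owner and group only (e.g. an accounting group)
}

def mode_for(relative_path, rules=RIGHTS_RULES):
    """Return the mode of the most specific section rule that contains the path."""
    parts = PurePosixPath(relative_path).parts
    # Try the longest directory prefix first, then progressively shorter ones.
    for depth in range(len(parts) - 1, -1, -1):
        section = "/".join(parts[:depth])
        if section in rules:
            return rules[section]
    raise KeyError(f"no rule covers {relative_path!r}")

def apply_rights(root, rules=RIGHTS_RULES):
    """Walk the published tree and set each file's mode from the rules table."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            relative = os.path.relpath(full, root).replace(os.sep, "/")
            os.chmod(full, mode_for(relative, rules))
```

Running something like apply_rights as the last step of every publish keeps the rights on production a function of the rules table rather than of whatever the transfer tool happened to preserve.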

Now let's look at some deployment alternatives, starting with transport, then configuration management, and concluding with a look at easing transitions for site visitors.

Good Ol' FTP?

I find that the most common method used today to move files is the File Transfer Protocol, or FTP. There are many FTP utilities out there, and most CMS packages have built one into their solutions. However, the FTP protocol and its various implementations bring several challenges.

The first major issue is security. FTP servers seem to be one of the favored targets of the hacker community, and many security vulnerabilities have centered around the FTP services on a server. What's worse, many FTP packages transmit user names and passwords in the clear. That's right, no encryption. Those nasty fellows with their black hats will gladly sniff the wire waiting for your clear text passwords, and then it's all over but the crying for your little web server. So, for these very reasons, many system administrators refuse to allow FTP services to be available in production environments.

Fortunately, there are some ways to combat some of these issues. Everyone loves to talk about encryption. It's today's modern sport, comparing encryption strengths like baseball cards. Sure, you can encrypt the session through a VPN and use firewall rules to restrict access to known addresses, but wait, there's more. FTP by its very nature offers no guarantee that the files are delivered accurately and completely. If something happens during the FTP session, there is no guaranteed error-checking to guard against file corruption. Furthermore, FTP does not provide a mechanism to request a resend of the file should the session be interrupted. Lastly, if there is an interruption in the FTP transport, there is no mechanism to roll back the rest of the transmission. This is also problematic when publishing to an array of servers. Should the transmission fail halfway on one of the servers, there is no native way to roll back that server to its previous state.

There are newer, more secure versions of FTP, such as Secure FTP. Utilities leveraging this new standard offer encryption, and are less vulnerable. However, they are more difficult to configure. Other vendors have built proprietary extensions to FTP to make it more transactional, but you have to be careful here about the robustness of the tool and possibilities for vendor lock-in. So what are your choices beyond FTP?

Synchronization

Synchronization technologies offer good deployment alternatives and provide the ability to replicate a part of a file system to multiple destinations. The main advantage of these technologies is that deployments can be treated as discrete transactional units, meaning you can roll back to the prior state should a problem occur in the delivery process. Synchronization tools can also validate that the file is delivered correctly using checksum comparisons or similar mechanisms.
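To make the "discrete transactional unit" idea concrete, here is a minimal Python sketch, not any vendor's implementation. It assumes the new release has already been copied into its own directory by whatever transport you use, that the live site is served through a symbolic link on a POSIX server, and that all function names are hypothetical. The release only goes live if every file's SHA-256 checksum matches the source; otherwise the live site is left untouched.

```python
import hashlib
import os

def sha256_of(path):
    """Checksum a file in chunks so large assets do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def release_is_complete(source_dir, release_dir):
    """True only if every source file exists in release_dir with the same checksum."""
    for dirpath, _dirnames, filenames in os.walk(source_dir):
        for name in filenames:
            source_file = os.path.join(dirpath, name)
            relative = os.path.relpath(source_file, source_dir)
            copied_file = os.path.join(release_dir, relative)
            if not os.path.isfile(copied_file):
                return False
            if sha256_of(source_file) != sha256_of(copied_file):
                return False
    return True

def activate(release_dir, live_link):
    """Repoint the live symlink in one rename (POSIX); the old release stays on disk."""
    temp_link = live_link + ".tmp"
    os.symlink(release_dir, temp_link)
    os.replace(temp_link, live_link)

def deploy(source_dir, release_dir, live_link):
    """Activate the release only if it verifies; otherwise leave the live site alone."""
    if not release_is_complete(source_dir, release_dir):
        return False
    activate(release_dir, live_link)
    return True
```

Because the previous release directory is never modified, rolling back amounts to calling activate again with the old directory. Across two machines you would compute the source checksums on staging and compare them on production rather than reading both trees from one process.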

There are also other proprietary products from various vendors that address this issue. The bottom line is: You need to think about how you will transport your content files and what level of fault tolerance you want built into the process.

Configuration Management Matters

Beyond the transportation issues, there are several configuration management issues that need consideration. Configuration management is the Achilles' heel of nearly all Web applications, and content management is no exception.

The first issue concerns working in a multiplatform environment. Publishing from a Windows staging environment to a Unix production environment introduces some technical challenges, such as dealing with file naming issues and correctly mapping file systems from one platform to another. It is also important to consider case sensitivity and the impact of migrating from a case-insensitive environment to a case-sensitive one: a link that resolves on the staging server can break on production if its capitalization differs from the file name. As your site grows (and it will grow now that you have a CMS) the supporting file system structure will need to grow with it.
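One way to catch the case sensitivity problem before it reaches production is a pre-publish check along these lines. This is only a sketch: it assumes you can already produce a list of the links used in your pages and a list of the files being published, and the function name is hypothetical.

```python
def case_mismatched_links(links, files):
    """Return (link, actual_file) pairs that match only when case is ignored.

    Such links work on a case-insensitive staging server but would return
    "not found" on a case-sensitive production server.
    """
    exact = set(files)
    by_folded = {name.casefold(): name for name in files}
    problems = []
    for link in links:
        if link in exact:
            continue  # exact match: safe on both platforms
        actual = by_folded.get(link.casefold())
        if actual is not None:
            problems.append((link, actual))
    return problems
```

For example, a link to images/Logo.GIF is reported when the published file is images/logo.gif, while links to files that do not exist at all are left for an ordinary broken-link check.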

Up to this point we've been focusing on moving physical files, but there is more to a modern website than flat HTML and image files. Today's websites contain not only HTML code, but also a mixture of scripting languages such as ASP, JSP, ColdFusion and many others. This variety of scripting code can often interact with database content, and deliver results dynamically.

At this stage, you need to ask yourself three questions:

  1. How will I synchronize the database content from staging with my production environment?
  2. How will I merge my code base that is embedded to varying degrees within my content or at least essential to its display?
  3. How will I prompt my search engine to scan and index the newest content?

Let's look at each one in turn.

On the subject of database synchronization, most content management applications have only primitive mechanisms for taking content from their own repositories and pushing it into databases. For more complex deployments, consider evaluating ETL (Extract, Transform, Load) technologies from various data warehousing technology vendors. This market has mature products that are capable of extracting data from a variety of database structures, transforming it in a variety of ways, and pushing it into target locations.

Now you have to synchronize code. To integrate with existing applications, most modern content management packages provide the ability to layer code such as ASP and JSP into the templates used by the CMS (see example below). It is also possible to structure those templates to pull the code directly from another source.

[Figure: Coding a Template in RedDot]

Regardless of whether you're managing simple templating logic or using the CMS as a full-blown code management system (or something in-between), it is important here to decide what level of integration you need. Start by considering how often that code will change. With the template-driven approach most websites use, you typically don't need to manipulate scripting code very much once it is put in place. Moreover, the typical CMS can allow developers to abstract commonly-changed code components of a template and allow users to make selections in the CMS. This refactoring results in more controlled code adjustments that can be QA'd once, while variations have been pushed to user-configurable selections that can be treated as a simple content type.

Let's look at a practical example: The Marketing department frequently asks the development team to generate new banner messages to appear based on certain conditions -- such as time of day or the value of items in a shopping cart. Every time Marketing requests a new banner with a new set of rules (like time of day it should appear) the developer needs to modify some script code (such as ASP) and code the image path into the logic of the page. Of course, since we're talking about application logic, the entire process must move through a structured QA or release process, which can take days or weeks.

Instead, use your CMS to codify the types of logic that can be chosen by the user (time rotation, cart value, etc.) and place those choices in a drop-down list. Based on the author's selection, the appropriate amount of code gets placed into the page. Any other variables that drive that code can be managed as content as well, such as the times and dates that piece of content should show up. Once all the permutations have been thoroughly tested, turn your marketing users loose and let them make selections in their pages that change business logic, with no fear of the pages breaking. Your users will be ecstatic, and you've saved yourself a ton of mundane work and testing. Just as importantly, you are now deploying content and simple configurations upon each change, rather than repeatedly deploying code, with all the attendant possibilities for error and requirements for a formal release process.
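The banner example can be sketched in a few lines of Python rather than ASP. Everything here is hypothetical (the rule types, field names, and image paths); the point is only that the tested logic stays fixed while the rules themselves are plain data that an author could select and edit in the CMS.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional

@dataclass(frozen=True)
class BannerRule:
    """One author-selected rule; in practice this would be stored as content."""
    image_path: str
    rule_type: str                 # "time_of_day" or "cart_value"
    start: Optional[time] = None   # used by "time_of_day"
    end: Optional[time] = None     # used by "time_of_day"
    min_cart_value: float = 0.0    # used by "cart_value"

def rule_matches(rule, now, cart_value):
    """The fixed, QA'd logic: one branch per rule type offered in the drop-down."""
    if rule.rule_type == "time_of_day":
        return rule.start <= now < rule.end
    if rule.rule_type == "cart_value":
        return cart_value >= rule.min_cart_value
    return False

def choose_banner(rules, now, cart_value, default="/images/default.gif"):
    """Return the image for the first matching rule, or the default banner."""
    for rule in rules:
        if rule_matches(rule, now, cart_value):
            return rule.image_path
    return default
```

When Marketing wants a new lunchtime banner, they add a rule such as BannerRule("/images/lunch.gif", "time_of_day", start=time(11, 0), end=time(14, 0)); no code changes, so nothing has to go back through the release process.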

And speaking of Quality Assurance processes, it's important to understand how you will test changes to your code base and keep your QA environment fresh. In the example above, you can see how you can reduce how often you are changing code, with pure content changes only needing to go through the appropriate workflow (if any). But you should still keep your development and staging environments in sync with your production environment, even if it's only content refreshes. So be sure that you are deploying content to any test/dev/QA servers as you update production servers, so that developers working on new code are working with the latest content, images, and templates.

Flatter will get you (almost) everywhere.

Now that you have a content management solution, you can dispense with the overly nested directory structures you may have maintained as an organizing aid for your authors. It may have made sense when your content managers had to actually slog through directories to find files to edit, but consider the poor site visitor that sees the resulting URL: http://www.yoursite.com/web/1993/html/files/department/will/this/thing/ever/end/good/grief/I/hope/I/don't/have/to/remember/this/let/alone/email/it/to/someone%20else/index.html.

A CMS can prevent such unneeded complexity by allowing you to classify and manage content independently of source directory. Keep in mind that modern file systems can easily handle 1000 files in a directory without issue. By reducing the number of directories, you will reduce the complexity of your deployment and provide simpler, more intuitive URLs for the various pages comprising the site.

Rollback is Overrated

One of the biggest stated concerns for content management buyers is the concept of rollback. This is the idea of deploying a piece of content and having the ability to revert back to the prior state of that content as necessary. In my opinion, the need for this is overstated and too much importance is placed on this feature. Most CMS users rarely have a need for rollback. It is far more typical to simply make the corrective changes, update the content, and redeploy, instead.

While the feature may come "out of the box" for some content management solutions, using some rollback features may require adding complexity to your publication process, or worse, fixation on this one capability may drive you to buy more CMS than you need. Bottom line: When choosing or implementing a CMS, think hard about how important rollback really is, and how you actually see content managers using it.

Deploy with Grace

On the other hand, one of the most often overlooked aspects of content deployment is how to gracefully update visitors' views when they are on your site as you publish new content. This, of course, depends on the extent of the changes to the content, the type of changes you are making, and how your infrastructure is configured. If your site is deployed across multiple load-balanced servers, how will you ensure that all the content is published to all servers, so that a site visitor has a consistent experience during the publishing process? How will a reader be affected when new code is introduced onto a page that reader is perusing? Every site is different, but it's important to consider the impact on the visitor's experience, even from the initial stages of planning your site. This is especially critical on large site deployments that can take literally hours to complete.

So what are the different strategies to ensure continuity to the visitor's experience? The age-old strategy is, of course, to perform updates to content in the middle of the night when most people are asleep. This works in some cases, but don't forget international visitors surfing your site during the middle of their day 12 time zones away.

Another simple strategy is to deploy from the bottom up. It would be problematic if a main news page were updated with headlines for four new articles, yet the corresponding pages with each story's content were not yet available. So control the publish order to force pages in the lowest levels in the site structure to be published first.
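If your publishing tool lets you control the order, bottom-up publishing can be as simple as a sort. The sketch below assumes that URL path depth is a fair proxy for a page's level in the site structure, which will not hold for every site; the function name is hypothetical.

```python
def publish_order(paths):
    """Order pages deepest-first so index pages go live after the pages they link to."""
    def depth(path):
        # "/news/2004/05/story1.html" -> 3 directory levels below the root
        return path.strip("/").count("/")
    # Sort by descending depth; ties are broken alphabetically for a stable order.
    return sorted(paths, key=lambda p: (-depth(p), p))
```

Given the home page, a news index, and two story pages under /news/2004/05/, this publishes the two stories first, then the news index, and the home page last.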

Incrementally updating individual areas of a page or individual pages of a site is also a good practice, instead of republishing the entire site as a complete edition merely for simple changes (some people do this as an artifice for spamming Internet search engines into believing content has been refreshed site-wide -- not a good idea).
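A minimal way to publish incrementally is to compare a manifest of the last published edition with a manifest of the current one and ship only the difference. The sketch below assumes each manifest is a dictionary mapping a file path to a content checksum; how you build those manifests, and the function name, are left as assumptions.

```python
def changed_files(previous, current):
    """Compare two {path: checksum} manifests.

    Returns (to_publish, to_delete): files that are new or whose checksum
    changed, and files that existed in the previous edition but are gone now.
    """
    to_publish = sorted(
        path for path, checksum in current.items()
        if previous.get(path) != checksum
    )
    to_delete = sorted(path for path in previous if path not in current)
    return to_publish, to_delete
```

Unchanged files never leave the staging server, which shortens the publish window and reduces the amount of content that can arrive half-delivered.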

For highly complex environments with interactive sites employing personalization, it is common to marshal all existing users onto a single server and then update the remaining servers with new content. All newly arriving visitors are directed to the new, updated site, while those with an existing session see only the old content until their individual sessions end. That final server is then updated once all user sessions are complete. Several technology vendors have incorporated this concept into their offerings. This enables each publication to be virtualized, meaning every update to a page creates a new version of the site, and the application then manages the users and their sessions across the old and new versions of the site content. As users end their sessions on the old version of the site, they are then directed to the updated one.
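The session-pinning idea can be illustrated with a small in-memory sketch. This is not how any particular vendor implements it; the class and method names are hypothetical, and a real system would keep this state in the load balancer or a shared session store rather than in one process.

```python
class VersionRouter:
    """Pin each session to the site version that was live when it started."""

    def __init__(self, live_version):
        self.live_version = live_version
        self._session_versions = {}

    def publish(self, new_version):
        """Make a new version live for sessions that start from now on."""
        self.live_version = new_version

    def version_for(self, session_id):
        """Existing sessions keep their version; new sessions get the live one."""
        return self._session_versions.setdefault(session_id, self.live_version)

    def end_session(self, session_id):
        self._session_versions.pop(session_id, None)

    def versions_in_use(self):
        """Versions that must stay available: the live one plus any still pinned."""
        return set(self._session_versions.values()) | {self.live_version}
```

Once versions_in_use no longer contains the old version, the server (or directory) holding it can be updated or retired safely.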

Cache In

Lastly, for the more dynamically driven sites that have caching mechanisms, it's a good idea to create scripts that pre-cache as much of the new content and scripts as possible, to prevent your first handful of readers from having a horrible experience. For example you might wish to precompile a modified JSP page, or re-insert commonly-used (but now updated) elements into the cache.
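A pre-caching script can be as simple as requesting each updated URL once right after the publish completes, so that the server compiles and caches the page before a real visitor asks for it. The sketch below uses Python's standard library; the function names are hypothetical, and it assumes that a plain GET request is enough to populate your particular cache.

```python
from urllib.request import urlopen

def fetch_status(url, timeout=10.0):
    """Request the page once and return the HTTP status code."""
    with urlopen(url, timeout=timeout) as response:
        return response.status

def warm_cache(urls, fetch=fetch_status):
    """Request every URL once; return a list of (url, reason) for any that failed."""
    failures = []
    for url in urls:
        try:
            status = fetch(url)
        except OSError as error:
            # urllib's URLError and HTTPError are both subclasses of OSError.
            failures.append((url, str(error)))
            continue
        if status != 200:
            failures.append((url, f"HTTP {status}"))
    return failures
```

The fetch parameter exists so the script can be tested without a network; in normal use you would call warm_cache with just the list of freshly published URLs and alert on a non-empty result.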

While Web content management solutions today give business users the ability to author content for websites, don't overlook the details and processes involved in physically updating the content of your site. This can present thorny technical challenges and therefore inevitably requires the care and attention of skilled technologists. But it is worth paying very close attention, to ensure that your carefully crafted content arrives consistently where it belongs, and becomes readily available to your site readership without a hitch.



About the Author

Darren Guarnaccia

Mr. Guarnaccia comes to RedDot Solutions with more than 13 years of professional technical experience. Prior to joining RedDot, Guarnaccia held the position of Chief Technology Officer for White Horse where he delivered e-business strategy and planning services for customer-facing solutions to key Fortune 100 clients such as General Motors and AT&T Wireless.


