The GRUPA Gremlin
Is There a Gremlin in Your Website?
by Tony Byrne
30-Jun-2003

There may be a gremlin in your web publishing system. At first he seemed attractive and made himself quite popular. All the major analyst firms and trade publications said you needed him, so you made the extra effort to bring him on board. But now he's eating up all your resources and leaving behind big messes. This is particularly irksome because the gremlin's care and feeding requires constant intervention from expensive specialists. And what's worse, the little bugger multiplies himself as your site grows bigger.
He's the GRUPA gremlin. GRUPA stands for "Gratuitous Runtime Page Assembly." It's what happens when you overapply the once (and still) popular idea that your system should always generate web pages "on the fly," i.e. a user clicks on a page that triggers some logic to extract snippets of content from a repository and assemble a complete page to stream back to the browser.
Perhaps a critical business requirement compels you to maintain a system to assemble pages at clicktime, or "runtime." But there's also a good chance you are doing so unnecessarily. It could be high time you got rid of this gremlin, before the problem gets worse.
What's Wrong with "On-the-Fly"?
When you implemented a dynamic publishing system, you probably thought Mr. GRUPA was a nice gremlin. Well, he's not. Runtime page assembly is actually quite expensive, so it's in your interest to avoid it except when completely necessary. Let's look at the costs of dynamic content delivery from the standpoint of the various actors in your Web publishing system.
Authors: For content contributors inputting text, GRUPA may not be such a big deal. If you have a perfect separation of content, presentation, and other business logic, authors only have to worry about getting text into the system, not how it is displayed to content consumers. But in the real world, authors too frequently overwrite or mangle script code and other display logic. Of course, they can still mess up the display wrapper in a static HTML file, but at least then, the damage is usually limited to one page.
Editors: The biggest challenge for editors and reviewers within sites using runtime page assembly is the need to preview (or "virtualize") content in context. As a practical matter, many organizations sign off on content in a partial display vacuum, then wait for the pages to go "live" before reviewing them in final form. In short, their production servers act as a staging environment, because their CMS cannot replicate the dynamic delivery logic. The solution to this problem requires investing in additional staging infrastructure and synchronizing that with your CMS workflow review processes. In the end, you expend extra time as well as money.
Readers: Site visitors are the most important actors in your Web publishing system, and GRUPA affects them perhaps the most, especially because of its impact on page load times (more about that below).
Systems: When relying on runtime page assembly, you will likely end up having to support two dynamic systems, your CMS and your delivery environment. If runtime assembly is adding little or no value, then GRUPA has doubled (at least) your work and complexity with scant return at best.
In Praise of Static Websites
Another reason to avoid the GRUPA gremlin is that static publishing models can provide substantial value. Here are a few of the advantages of delivering static pages:
- Static pages get delivered faster.
As much as 10 times faster than dynamically-generated pages, depending on how many and what kind of calls a dynamic page spawns. In theory, caching systems will make up some lost ground, but not without adding a lot more complexity. Speedy delivery makes content consumers happy, and means that you require less hardware to support your delivery system. Moreover, the only hardware you need is webservers, which are cheaper, simpler to administer, and generally easier to secure than application servers.
- Static pages are easier to index.
Your site search engine can be tuned more readily against a static file repository (and no need to invest in more expensive software that will index file and database content together).
- Static pages show a better face to external search engines.
Google loves static pages. External search engine spiders may not spider your dynamic content at all -- and if they do, they may not be able to assign a proper title to the result set, reducing the quality of the returns.
- Meaningful file names improve usability.
And there is also semantic value in the location path; note how CNN and other news sites organize content directories by date and/or topic. Many people -- especially frequent visitors -- do look at the location bar for cues.
- No need to worry about database uptime.
For that matter, there's no need to concern yourself with database performance at all -- except for the DB that's connected to your CMS, but it's not going to get hit as hard as your production delivery system.
- Static pages are actually quite mobile.
When you generate a static site, you can host it nearly anywhere across and beyond your enterprise: on your Windows servers, Solaris machines, or your partner's Mac OS webserver.
How Did We Get Here?
If static publishing is so great, why did so many of us invite the GRUPA gremlin to take up residence?
At first, many site managers adopted dynamic delivery models for practical reasons. Before the advent of mature content management systems, one of the few ways to maintain a single authoritative copy of a header or footer snippet, for example, was to use Server-Side Includes (SSIs), a rather insecure and quite performance-intensive -- but nonetheless effective -- approach to content assembly. Then came the advent of embedded scripting languages (ASP/PHP/CF/etc), which used faster approaches to "includes" and the promise of applying more sophisticated, template-driven navigation and presentation logic.
Today, however, the typical Web content management system can maintain and organize discrete page elements (like "footer," or "leftnav") and more atomic sub-elements. If you don't need any runtime input from your site visitors to determine how to combine those elements into pages, then all you need to do is preassemble them into a static HTML file at approval time and deploy those pages from your CMS to a webserver as a final step in your workflows. Think of this as "baking" a page -- or indeed an entire site -- as opposed to "frying" content when people click through your links (this graphic illustrates the difference).
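To make the baking idea concrete, here is a minimal sketch in Python. It is not how any particular CMS does it: the element names, the template, and the functions `bake_page` and `bake_site` are all hypothetical, and a real system would pull approved content from its repository rather than from a dictionary.

```python
from pathlib import Path
from string import Template

# Hypothetical shared elements, the kind a CMS maintains as discrete snippets.
ELEMENTS = {
    "header": "<div id='header'>Example Corp</div>",
    "leftnav": "<ul><li><a href='/news/index.html'>News</a></li></ul>",
    "footer": "<div id='footer'>Copyright Example Corp</div>",
}

# Hypothetical page wrapper; every placeholder is filled at bake time.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head><body>\n"
    "$header\n$leftnav\n<div id='content'>$body</div>\n$footer\n"
    "</body></html>\n"
)


def bake_page(title: str, body: str) -> str:
    """Assemble one complete HTML page from the shared elements."""
    return PAGE_TEMPLATE.substitute(title=title, body=body, **ELEMENTS)


def bake_site(pages: dict, out_dir: Path) -> list:
    """Write every approved page under out_dir as a static file."""
    written = []
    for rel_path, page in pages.items():
        target = out_dir / rel_path
        # Create date/topic directories as needed.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(bake_page(page["title"], page["body"]), encoding="utf-8")
        written.append(target)
    return written
```

Calling `bake_site({"news/2003/06/launch.html": {"title": "Launch", "body": "<p>Hello</p>"}}, Path("public"))` would leave a finished HTML file on disk. From then on a plain webserver can deliver it with no assembly logic at clicktime.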
There is another, less practical reason why dynamically-generated sites have proliferated: siteowners wanted to create the impression that a truly dynamic organization was behind them. On-the-fly publishing became cool to engineer. "Static" came to mean out of date; "dynamic" meant the very latest information. This is certainly true for stock quotes and online auctions, where every minute counts. But aren't you sick of visiting URLs that look like this:
http://www.somesite.com/news.asp?list=latest&type=greatest
...only to find 3 press releases from early 2001? At the end of the day, your web content is only as fresh as your humanoid content managers make it, irrespective of your delivery system.
In fact, many commercial enterprises have taken their cue from the media industry and converted to the idea of baking regular "editions" of their sites on an hourly, daily, or weekly basis. They still have emergency workflows for urgent updates in-between editions, but resort to them only sparingly.
Perhaps these enterprises are listening to Web content gurus like Gerry McGovern, who point out that taking this approach lends more discipline, predictability, and quality to Web publishing processes, as well as a better experience for end-users. Content contributors conditioned to seeing their new stuff go live immediately may balk at the notion of organized editions. But in this era of heightened awareness of enterprise accountability and records management, McGovern and others ask, is it really necessary to publish new content willy-nilly throughout the day?
You don't have to produce "editions" to adopt static publishing, but the managerial approach of packaging up a new version of a site (or section of a site) definitely goes hand-in-hand with a baking paradigm.
When Do You Need Runtime Page Assembly?
There is certainly a case to be made for runtime page (or screen) assembly. Just be sure to know when it is essential for a business purpose, and when it is gratuitous. Let's consider some possible use cases that proponents frequently employ to espouse dynamic delivery:
- Personalization
Personalization certainly requires runtime assembly. Your delivery system can't personalize a page until it knows who the person is, and people only reveal themselves when they click. A question to ask about personalization, though: do you really need it? Personalization is complex and costly. It has paid some dividends for e-commerce and subscription business models, but few others. Even if your boss insists that every site visitor be greeted by name, you can still pre-assemble all the other elements on the page that do not require browser input, leaving only the element that writes out "Welcome back, Ms. Jones" as the sole dynamic function. Consider this a kind of "parbaking." Ms. Jones will thank you for speeding page delivery times.
- Custom presentations
Perhaps you want to alter parts of the site or page for different sets of users. For example, you might sniff browsers to alter the display in Mozilla. One answer to this requirement is to bake different editions of your static site. You could bake a wireless edition while you're at it. Custom presentation is generally not a good reason to conduct dynamic delivery and is in fact one of the prime incubators of the GRUPA gremlin.
- Dynamic Navigation
You want your site to show links to related content, or autogenerate links to sibling, parent, and daughter pages in a site hierarchy. In most cases, these can be baked in just like the header and footer. If you don't need visitor input to set the navigation logic, why do it at runtime? The gremlin is at work here.
- Employee and other E-business Portals
There is a reasonable case here for dynamic delivery. Many corporations want their employees or partners to log into a kind of "personal dashboard" each morning. E-business portals also tend to suck in content (often fast-changing content) from other subsystems besides a default Web CMS. It's generally not advisable or even possible to significantly pre-generate this data, though some possibilities could emerge to pre-assemble some content elements before they hit your delivery system.
Nevertheless, don't confuse this kind of dashboard with a "small p" public portal that simply serves as an entryway to lower-level pages and sites. If you're not aggregating content from multiple internal systems to present personalized views, you likely remain a good candidate for a baking strategy.
- Rotating elements
This one is a bit tougher. If you have components that rotate randomly on each page load (CMS Watch does), you may need to generate and assemble these at runtime. You might consider a rotation schedule and bake editions anyway. And as always, you may not need to generate the entire page dynamically, just the rotating elements, so consider parbaking.
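The parbaking idea that comes up under personalization and rotating elements can be sketched roughly as follows. This is an illustration under assumptions, not a recipe from any product: the slot marker `GREETING_SLOT` and the functions `parbake` and `finish` are invented names, and how you learn the visitor's name (a cookie, a login session) is left out entirely.

```python
import html
from string import Template
from typing import Optional

# Hypothetical marker left in the baked page for the one dynamic element.
GREETING_SLOT = "<!--GREETING-->"


def parbake(title: str, body: str) -> str:
    """Done once, at approval time: assemble everything except the greeting."""
    page = Template(
        "<html><head><title>$title</title></head><body>\n"
        "<div id='greeting'>$slot</div>\n"
        "<div id='content'>$body</div>\n"
        "</body></html>\n"
    )
    return page.substitute(title=title, body=body, slot=GREETING_SLOT)


def finish(parbaked: str, visitor_name: Optional[str]) -> str:
    """Done per request: fill the single dynamic slot and nothing else."""
    if visitor_name:
        # Escape the name so visitor-supplied text cannot inject markup.
        greeting = "Welcome back, " + html.escape(visitor_name)
    else:
        greeting = ""
    return parbaked.replace(GREETING_SLOT, greeting, 1)
```

The per-request work shrinks to one string replacement, so `finish(parbake("Home", "<p>News</p>"), "Ms. Jones")` does far less than reassembling the whole page would. A rotating element could fill the same kind of slot.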
Give Yourself a Choice
Baking static sites certainly carries its own challenges. Preassembly of static pages from dynamic elements can be complicated and put stress on CMS machine cycles, as well as potentially gum up the network connecting your CMS with your delivery infrastructure. I would argue this is still better than stressing your customer-facing delivery servers with continual page assembly.
If you change your logo and it appears on all your pages, you'll need to reassemble and redeploy every static page on your site. On a large site, that will take hours. You'll also need a bit more file system storage on your delivery servers. But so what? Storage is cheap.
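A small sketch shows why a logo change is the worst case. It assumes your CMS can tell you which elements each page includes; the page list, element names, and both functions are hypothetical.

```python
from collections import defaultdict


def build_usage_index(page_elements: dict) -> dict:
    """Invert page -> elements into element -> pages that include it."""
    index = defaultdict(set)
    for page, elements in page_elements.items():
        for element in elements:
            index[element].add(page)
    return dict(index)


def pages_to_rebake(index: dict, changed_element: str) -> list:
    """Return the pages that must be reassembled after one element changes."""
    return sorted(index.get(changed_element, set()))


# Hypothetical site: the logo and footer are everywhere, the promo is not.
PAGE_ELEMENTS = {
    "index.html": {"logo", "footer", "promo"},
    "about.html": {"logo", "footer"},
    "news/launch.html": {"logo", "footer"},
}

index = build_usage_index(PAGE_ELEMENTS)
print(pages_to_rebake(index, "logo"))   # every page on the site
print(pages_to_rebake(index, "promo"))  # only the page that shows the promo
```

An element that appears everywhere forces a full rebake, while a change to a less widely used element touches only the pages that include it.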
For most organizations, I think these challenges will pale in comparison to dealing with the GRUPA gremlin. So when you look at CMS architectures, make sure you can easily implement a baking regimen, for at least sections of your sites.
Unfortunately, some content management packages (like Midgard, powering this site) require you to assemble dynamic pages at runtime. This is especially the case with products that enforce tightly-coupled content management and delivery architectures, and in those cases, a baking solution often entails complicated work-arounds. If you're looking to convert to a content management system and want to avoid gratuitous runtime page assembly, look for a system that can pregenerate and deploy static files.


