Web Services and CMS
Web Services and Content Management
by Travis Wissink
20-Jul-2004

Over the last year, Web Services-based technology implementations have displayed efficiencies as well as revealed immaturities. The Web Service solution stack encompasses several core, solid specifications such as UDDI, WSDL, and SOAP (which I won't get into here; for more details, see this article). Concurrently, we have seen the rise of a more generic notion of "services-oriented architectures," where core capabilities are exposed and leveraged as generic services, using Web Services standards or other similar approaches.
In theory, as more CMS products expose functionality through Web Services (or similar constructs), a content management system could become the nucleus of broader enterprise content networks. To do so, the CMS will need to become more outward-facing, providing practical services to the enterprise and interacting with other enterprise services.
Core CMS Services
Many CMS suites contain at least the following services. Let's look at each one.
- The "CMS Core" is what I usually call the CMS engine; others call it the application engine.
- Library services are similar to a library catalog system. They allow certain people to view, check out, and review content, inspect its metadata, and track its versions.
- Most CMS products allow the warehousing of classification terms for applying attributes to assets. The CMS then stores any terms associated with specific content objects in the library system as metadata.
- Most products have a workflow subsystem that lets content creators and managers implement some sort of business process around their CMS assets.
- Search is a mechanism that allows content managers to investigate the CMS repository for specific assets.
- Distribution or deployment functionality is where the CMS pushes content to a delivery system. Some products deploy content to a file system. Others deploy content to a database. Still others can replicate content to a proprietary caching system. Many use some combination of these approaches.
- Lastly, most content management systems allow for transformations of content. This may entail converting XML content to HTML or standard office application files to PDF, among many other transformations.
Exposing Services to the Enterprise
Let's imagine that CMS services can be exposed to other applications through a Web Services interface. The illustration above shows some of the methods that could be made available to other applications, either to invoke within the CMS, or to utilize for content outside the CMS repository.
Consider this scenario. Your company regularly releases new products, both to its e-catalog and out to its distributor channel. Upon doing so, you need to deploy content related to those products to various websites. Using a services-oriented architecture, your company's product data management or e-commerce systems could activate the CMS distribution service to push particular content into production (even related content outside the CMS) when certain products get released. If that's too automatic a business process, the workflow service could be triggered instead, to activate an approval process.
This is possible today in a few of the more developed CMS products that have packaged up their APIs as Web Services interfaces.
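To make the release scenario concrete, here is a minimal Java sketch of the calling side. Every name in it (DistributionService, WorkflowService, ReleaseHandler, and their methods) is hypothetical and does not come from any particular CMS product; in practice each interface would be backed by a Web Services client stub rather than the in-memory lambdas shown in main.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ProductReleaseSketch {

    // Hypothetical facade over the CMS distribution service.
    interface DistributionService {
        void deploy(String contentId, String target);
    }

    // Hypothetical facade over the CMS workflow service.
    interface WorkflowService {
        void startApproval(String contentId);
    }

    // Called by the product data management or e-commerce system on release.
    static class ReleaseHandler {
        private final DistributionService distribution;
        private final WorkflowService workflow;
        private final boolean requireApproval;

        ReleaseHandler(DistributionService distribution,
                       WorkflowService workflow,
                       boolean requireApproval) {
            this.distribution = distribution;
            this.workflow = workflow;
            this.requireApproval = requireApproval;
        }

        // Either pushes related content straight to production, or routes it
        // through an approval process if that is too automatic.
        void onProductReleased(List<String> relatedContentIds, String target) {
            for (String id : relatedContentIds) {
                if (requireApproval) {
                    workflow.startApproval(id);
                } else {
                    distribution.deploy(id, target);
                }
            }
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        ReleaseHandler handler = new ReleaseHandler(
                (id, target) -> log.add("deploy " + id + " to " + target),
                id -> log.add("approve " + id),
                true);
        handler.onProductReleased(Arrays.asList("datasheet-1", "faq-7"), "public-site");
        log.forEach(System.out::println);
    }
}
```

The requireApproval flag is only a stand-in for whatever business rule decides between direct distribution and an approval workflow.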
Taking Advantage of Other Services
What about the other way? What if your CMS could take advantage of other enterprise services? Well, many CMS packages -- while they can produce services -- often cannot natively consume services. This could be hubris (many software vendors imagine their product as the center of the enterprise information infrastructure), or more simply a lack of interest among customers still struggling to create and propagate enterprise services.
In any case, the possibilities are intriguing. Let's look at the relatively simple case of supplying a dynamic spell-check service to your CMS so you do not have to rely on local dictionaries. In this case, we will actually use a public Web Service, supplied by Google (you could imagine other internal services running behind your firewall instead).
Here is a use case that depicts a web form or templating engine of a CMS that needs a spell checker.
- A user is in a CMS form, editing content
- The user highlights a word or phrase
- The user right-clicks the highlight and selects spell check
- The system creates a SOAP message and sends it to Google
- Google responds with its spelling suggestion
- The user is prompted to accept or decline the suggestion
I'll dive into some code now. If that makes your eyes glaze over, just jump down to the next section.
Below is a sample WSDL from Google's Web Service. I have cut out a lot of other functionality to show only the operations of the spelling suggestion service.

Next is another snippet from the same WSDL, this one showing the input and output message formats.

Here is a sample SOAP message. This message gets sent to the suggestion spelling operation. We are trying to find out if "britney" is spelled properly.

Of course, our phrase isn't spelled properly, so the service's reply message looks like the following.

Google's spelling suggestion service replies with a suggestion of "britney spears."
The samples above depict the actual Web Service and its inputs and outputs. Making it all work required a surprisingly small amount of Java code.
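As a rough illustration of how little code is involved, here is a minimal sketch that builds the request message and pulls the suggestion out of a reply, using only the XML classes that ship with the JDK. It is not the code used for this article. The operation name doSpellingSuggestion, the urn:GoogleSearch namespace, and the key, phrase, and return part names are assumptions recalled from Google's SOAP API of the time, and the reply in main is canned rather than fetched over the network.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SpellingSuggestionSketch {

    // Builds a SOAP 1.1 request envelope as a string.
    // ASSUMPTION: operation, namespace, and part names are illustrative.
    static String buildRequest(String key, String phrase) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\">"
                + "<soapenv:Body>"
                + "<ns1:doSpellingSuggestion xmlns:ns1=\"urn:GoogleSearch\">"
                + "<key>" + escapeXml(key) + "</key>"
                + "<phrase>" + escapeXml(phrase) + "</phrase>"
                + "</ns1:doSpellingSuggestion>"
                + "</soapenv:Body>"
                + "</soapenv:Envelope>";
    }

    // Escapes the characters that would otherwise break element content.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Returns the text of the first <return> element, or null if there is
    // no suggestion in the reply.
    static String parseSuggestion(String soapReply) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Refuse DOCTYPE declarations in replies from outside services.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(soapReply)));
            NodeList nodes = doc.getElementsByTagName("return");
            if (nodes.getLength() == 0) {
                return null;
            }
            String text = nodes.item(0).getTextContent().trim();
            return text.isEmpty() ? null : text;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse SOAP reply", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildRequest("YOUR-LICENSE-KEY", "britney"));
        // Canned reply standing in for the real service response.
        String cannedReply =
                "<soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\">"
                + "<soapenv:Body>"
                + "<ns1:doSpellingSuggestionResponse xmlns:ns1=\"urn:GoogleSearch\">"
                + "<return>britney spears</return>"
                + "</ns1:doSpellingSuggestionResponse>"
                + "</soapenv:Body>"
                + "</soapenv:Envelope>";
        System.out.println(parseSuggestion(cannedReply));
    }
}
```

A CMS form would call buildRequest with the highlighted phrase, post the result to the service endpoint, and hand the reply to parseSuggestion before prompting the user to accept or decline.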
ECM without the Suite: Services-based architecture
Many ECM vendors would have you satisfy all your ECM needs by purchasing their varied packages (some offer several dozen products and modules). In doing so, however, you risk vendor lock-in, surprisingly heavy integration work, and getting stuck with modules that do not effectively meet particular use cases where a best-of-breed approach would perform better.
A services-oriented architecture could help to resolve this problem by enabling you to apply specialized technologies (including home-grown and open-source) as part of a broader ECM architecture. Consider the following chart of a potential services stack.
I'll start from the bottom up. So should you.
Third Layer
This is the management and configuration layer, where each service is registered, defined, and configured. For instance, a forms service needs configuration and further development. A developer would modify the forms service and, if needed, define security against individual forms. A business manager can define a business process in another configuration manager (in this case a BPEL editor). Such an IDE is typically installed on the business manager's desktop, so it could be a fat client rather than a Web-based tool.
Second Layer
This is the exposure layer that represents the services themselves. This layer isolates the underlying vendor products yet exposes enterprise-accessible operations via Web Services. Each of the services in the second layer can operate on its own, without a central ECM application engine.
I added three new functions in the service layer that we haven't reviewed yet. Imaging services provide many image asset functions, like scanning, conversion, and resizing. Identity services assist all other applications with access and authorization. Finally, an XForms implementation assists us with structured data collection.
Although there is no single ECM product described here, I should stress that these services are products, or combinations of products, in their own right. Some services may be based in PHP and others may be based in Java, but neither cares about the other's platform language because they all expose themselves for external interaction, likely (though not exclusively) through Web Services specifications.
First Layer
This is the access or presentation layer. Here we have an enterprise portal (EIP) and some sort of directory of services (UDDI in Web Services parlance). First, the UDDI registry is where services may register themselves so that other internal and external systems can interact with them. The EIP can be a very useful tool here, since enterprise users get access to specific services via the portal. For example, an HR staffer may want to access business manager services to initiate the hiring of a new manager.
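The register-then-look-up idea behind the directory can be sketched in a few lines of Java. This is only an in-memory stand-in and not the UDDI API; the class name, method names, and endpoint URL are all hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

class ServiceRegistrySketch {

    // Maps a service name to the endpoint where it can be reached.
    private final Map<String, String> endpoints = new ConcurrentHashMap<>();

    // A service announces itself so that other systems can find it.
    void register(String serviceName, String endpointUrl) {
        endpoints.put(serviceName, endpointUrl);
    }

    // A portal or another service asks where a given service lives.
    Optional<String> lookup(String serviceName) {
        return Optional.ofNullable(endpoints.get(serviceName));
    }

    public static void main(String[] args) {
        ServiceRegistrySketch registry = new ServiceRegistrySketch();
        registry.register("workflow", "http://intranet.example/services/workflow");
        System.out.println(registry.lookup("workflow").orElse("not registered"));
        System.out.println(registry.lookup("imaging").orElse("not registered"));
    }
}
```

A real directory would also carry a description of each service's interface (its WSDL, in Web Services terms), not just its address.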
An Example
The following image is a data flow diagram. This diagram depicts users, user and system interaction points, and business flow. In our case it models how a portal user interacts with different services (likely via portlets) to perform one of their weekly business responsibilities.
The content contributor first authenticates in the EIP and gains access to a new content portlet. After the contributor triggers a new content task, the business process service initiates a new content workflow and directs the contributor to the "select content type" task and portlet. The user then selects a content type and submits it to the next task, which forwards the user to the eForms portlet, where the user can enter their information. In most circumstances, the system and/or the user might also apply some implicit and explicit metadata, invoking the classification service. After the user submits the content, the BPM service transitions it to the next task in the content life cycle, which gets posted to the inbox of the contributor's manager for review.
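The task sequence in this flow can be sketched as a small state machine. The Java below is a simplification under stated assumptions: the task names and the NewContentProcess class are hypothetical, and a real business process service would be driven by a process definition (BPEL, for instance) rather than a hard-coded enum.

```java
import java.util.ArrayList;
import java.util.List;

class ContentFlowSketch {

    // Tasks in the order the contributor encounters them.
    enum Task { NEW_CONTENT, SELECT_CONTENT_TYPE, ENTER_CONTENT, MANAGER_REVIEW }

    static class NewContentProcess {
        private Task current = Task.NEW_CONTENT;
        private final List<String> metadata = new ArrayList<>();

        Task current() {
            return current;
        }

        List<String> metadata() {
            return metadata;
        }

        // Stand-in for a call to the classification service.
        void classify(String term) {
            metadata.add(term);
        }

        // Completes the current task and moves to the next one.
        // Once the content is with the manager for review, it stays there.
        Task submit() {
            Task[] all = Task.values();
            if (current.ordinal() < all.length - 1) {
                current = all[current.ordinal() + 1];
            }
            return current;
        }
    }

    public static void main(String[] args) {
        NewContentProcess process = new NewContentProcess();
        process.submit();                 // contributor triggers new content
        process.submit();                 // contributor picks a content type
        process.classify("press-release");
        process.submit();                 // contributor submits the eForm
        System.out.println(process.current() + " " + process.metadata());
    }
}
```

Each submit call corresponds to one hand-off by the business process service; the portlets themselves are left out entirely.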
Today, your CMS package might be providing all those services. But does it do them all equally well? And aren't other enterprise applications doing many of the same things? Why support scads of workflow engines? There can be substantial benefits in specialization here.
Closing
In conclusion, Web Services can assist an enterprise in extending its content management capabilities to other systems and vice versa, thereby obtaining more value from its IT investments. You can also reap the benefits of specialization. A services-oriented architecture allows different systems to do what they do best, enabling a cleaner separation of concerns than is found in most enterprises today, and potentially reducing your dependence on major suite vendors.





