<?xml version="1.0" encoding="us-ascii"?>
<rss version="2.0" xml:base="http://www.cmswatch.com" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/">
   <channel>
      <title>CMS Watch Search and Information Access Feed</title>
      <link>http://www.cmswatch.com</link>
      <description>CMS Watch headlines about Search and Information Access</description>
      <language>en-us</language>
      <lastBuildDate>Sat,  4 Jul 2009 11:14:20 -0400</lastBuildDate>
      <dc:creator>editor@cmswatch.com (Tony Byrne)</dc:creator>
      <dc:rights>Copyright 2005, CMS Watch</dc:rights>
      <dc:publisher>CMS Watch</dc:publisher>
      <image>
         <title>CMS Watch Search and Information Access Feed</title>
         <url>http://www.cmswatch.com/images/cmswatch_logo.gif</url>
         <link>http://www.cmswatch.com</link>
         <width>82</width>
         <height>36</height>
         <description>CMS Watch logo</description>
      </image>
      <item>
         <title>What Wimbledon and vendor selection have in common</title>
         <description>&lt;p&gt;As &lt;a href=&quot;http://www.telegraph.co.uk/sport/tennis/wimbledon/5532994/Wimbledon-2009-Andy-Murray-mania-grips-the-nation.html&quot;&gt;Murraymania&lt;/a&gt; swept the UK, I settled into my Court No. 2 seat on Wimbledon's always-action-packed middle Saturday. In addition to the matches of Serbia's Ivanovic, Australia's Hewitt, and Russia's Safina, I had a great view of the Centre Court scoreboard, so during breaks I was keenly watching the results of Andy Roddick's match. &lt;br /&gt;
&lt;br /&gt;
&amp;quot;Andy's got the first set,&amp;quot; I said to my cousin, who's studying in London and joined me for the day. &amp;quot;Andy's not playing yet,&amp;quot; interjected the Brit to the other side of me. &amp;quot;Yes he is,&amp;quot; I said. Pause. &amp;quot;Oh you mean, &lt;em&gt;your&lt;/em&gt; Andy,&amp;quot; he replied. &amp;quot;Right,&amp;quot; I smiled back, &amp;quot;not &lt;em&gt;your&lt;/em&gt; Andy, who plays tonight.&amp;quot;&amp;nbsp; Then came the most interesting comment: &amp;quot;Well, he's not really my Andy,&amp;quot; the gent said. &amp;quot;I'm English, and he's Scottish.&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Territorial rivalries are perhaps more pronounced in sport than any other pastime, be it the Boston Red Sox vs. the New York Yankees, the Calgary Flames vs. the Edmonton Oilers, or the New Zealand All Blacks vs. the Australian Wallabies.&amp;nbsp; Such territorial rivalries aren't altogether absent from the content management vendor selection process, either, and I find this much more pronounced on the eastern side of the Atlantic than in my native North America.&lt;br /&gt;
&lt;br /&gt;
Of course, in Europe and the UK there are many nations and territories (both political and historic) in a comparatively tiny geographic area, which makes a perfect petri dish for such rivalries to fester. &lt;/p&gt;
&lt;p&gt;When I work with clients or subscribers to help them select vendors, the two or three finalists often end up being very technologically similar, and once tools have been tested and deemed appropriate for clients' environments, the conversations just prior to final selection often become very much &amp;quot;cultural.&amp;quot; It's not just about whether the team is qualified, and if the support line is open when their time zone is open for business. It's also about &lt;em&gt;who they are&lt;/em&gt;, and when a few hundred thousand sterling or euros are on the table, the rivalries come out in closed-door conversations.
&lt;ul&gt;
    &lt;li&gt;&amp;quot;They're Belgian -- there's so many jokes about Belgium. Isn't there a reason for that?&amp;quot;&lt;/li&gt;
    &lt;li&gt;&amp;quot;They're Dutch -- so they're blunt, that's good, right? Aren't the Dutch cheap, too?&amp;quot;&lt;/li&gt;
    &lt;li&gt;&amp;quot;They're German -- so regimented -- is that right for us? What if the schedule slips, will they charge us double?&amp;quot;&lt;/li&gt;
    &lt;li&gt;I wouldn't even know where to start on the Scandinavian rivalries, which go back to the days when Sweden and Denmark traded off conquering the whole of Northern Europe.&lt;/li&gt;
&lt;/ul&gt;&lt;/p&gt;
&lt;p&gt;I end up spending quite a bit of time talking with clients about how they can benefit from vendor characteristics that are different from how their company normally functions. A bit of German organization and Dutch bluntness can be a great thing if your company has neither. I also watch vendors make an extra effort to bring in the &amp;quot;local flavor&amp;quot; to meetings -- someone from the local country or territory, if headquarters is on the other side of the continent. This always makes a big difference to buyers -- more than I believe it should. The English sales guy in the meeting in London isn't going to be the one you'll be working with, or providing you the ongoing service you'll need. Good service is good service, regardless of where it's provided &lt;em&gt;from&lt;/em&gt;.&lt;br /&gt;
&lt;br /&gt;
As an American who does a lot of work in Europe and the UK, I also experience trepidation on the part of some buyers. &amp;quot;Oh, you're &lt;em&gt;American&lt;/em&gt;,&amp;quot; I sometimes hear when I connect with a potential client via phone or meet up in person. Well yes, but CMS Watch is also a UK Limited Company, and one of our Principals is a Brit, and I'm perfectly happy to use the word &amp;quot;whilst&amp;quot; and drink a warm beer with you after work, if it makes you more comfortable.&lt;em&gt; (Note: Americans can be even more blunt than the Dutch.) &lt;/em&gt;Expertise may be what matters in the end, but it's far from the only factor when closing a deal. &lt;br /&gt;
&lt;br /&gt;
Stereotyping is dangerous, and as the world becomes smaller, you the technology buyer need to think more about benefiting from that which may seem foreign or &amp;quot;too different&amp;quot; for your organization. Yes, chemistry is important, but suppliers should be adept enough to adapt to your environment, and yet bring new approaches and attitudes to the project to help you be successful.&amp;nbsp; Be it tennis or a vendor competition, the most appropriate mix of factors needs to come together to create success, and sometimes those characteristics may not be the ones you're used to, or the ones possessed by your fellow countrymen. &lt;br /&gt;
&lt;br /&gt;
As for my final take on Wimbledon: I wish Rafael Nadal wasn't injured. I'd love to see Federer break the majors record, but I'd be just as thrilled to see Roddick pull through. I don't care where Andy Murray is from, I'll cheer for him to play well, along with anyone willing to call himself a Briton. &lt;br /&gt;
&lt;br /&gt;
May the best player win, wherever he's from.&lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1636-Wimbledon-Selection?source=RSS</link>
         <category>Digital Asset Management</category>
         <author>tregli@cmswatch.com(Theresa Regli)</author>
         <pubDate>Thu,  2 Jul 2009 15:26:00 -0400</pubDate>
      </item>
      <item>
         <title>A browser is a search engine?</title>
         <description>&lt;p&gt;In a clip on YouTube, an interviewer asks passersby in Times Square what a browser is. The surprising result: many think it's a search engine.&lt;/p&gt;
&lt;p&gt;&lt;object height=&quot;170&quot; width=&quot;280&quot;&gt;
&lt;param value=&quot;http://www.youtube.com/v/o4MwTvtyrUQ&amp;amp;hl=en&amp;amp;fs=1&amp;amp;&quot; name=&quot;movie&quot; /&gt;
&lt;param value=&quot;true&quot; name=&quot;allowFullScreen&quot; /&gt;
&lt;param value=&quot;always&quot; name=&quot;allowscriptaccess&quot; /&gt;&lt;embed height=&quot;170&quot; width=&quot;280&quot; allowfullscreen=&quot;true&quot; allowscriptaccess=&quot;always&quot; type=&quot;application/x-shockwave-flash&quot; src=&quot;http://www.youtube.com/v/o4MwTvtyrUQ&amp;amp;hl=en&amp;amp;fs=1&amp;amp;&quot;&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/p&gt;
  
&lt;p&gt;Of course, the video has been produced by the Google Creative Lab so you could expect it to promote search engines. However, students in Rotterdam &lt;a href=&quot;http://www.youtube.com/watch?v=lEt0N3xu0Do&amp;amp;cc_load_policy=1&quot;&gt;repeated the question&lt;/a&gt; -- and got people answering mostly the same things.&lt;/p&gt;
&lt;p&gt;I tend to think a search engine is one of those somewhat hidden components, like the starter engine of a car. Everybody will &amp;quot;search&amp;quot; all the time, but not too many people know there's such a thing as a &amp;quot;search engine&amp;quot; behind it. Explaining the technicalities of what it takes to run it won't particularly help, either, though nobody is happy when it doesn't work well.&lt;/p&gt;
&lt;p&gt;So if the experiment proves anything it's that the men and women in the street really don't know what's going on behind the scenes. Paradoxically, that's a sign of progress: they don't &lt;em&gt;have&lt;/em&gt; to know in order to be able to use it, much the same way as you don't have to be a mechanic to drive a car anymore.&lt;/p&gt;
&lt;p&gt;So before you head off on a big search, portal, collaboration, or content management project, invest the time to find out what's what and whether it's actually what's needed. After all, you may have been asked for a search engine -- when what's really needed is just a browser. And likewise, you may have been tasked with setting up SharePoint -- when what you really need is just the search engine.&lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1632-Browser-as-Search?source=RSS</link>
         <category>Search and Information Access</category>
         <author>bloem@radagio.com(Adriaan Bloem)</author>
         <pubDate>Tue, 30 Jun 2009 20:36:00 -0400</pubDate>
      </item>
      <item>
         <title>Lucene can read almost anything: Lucid and ISYS team up</title>
         <description>&lt;p&gt;A few months ago, I blogged about &lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/ISYS&quot;&gt;ISYS&lt;/a&gt; offering their document converter filters &lt;a href=&quot;http://www.cmswatch.com/Trends/1564-Read-me-that-file-so-I-can-index-it,-please&quot;&gt;as a separate component&lt;/a&gt;. My thought was these would come in handy to add on to &lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/Apache&quot;&gt;Lucene&lt;/a&gt; (which, by itself, can't actually read Microsoft Office files, let alone more exotic document types.) That would still leave you with a bit of DIY work, though: integrating the filters in your Lucene implementation.&lt;/p&gt;
&lt;p&gt;As it turns out, &lt;a href=&quot;http://www.cmswatch.com/Trends/1486-Startup-offers-commercial-support-for-Lucene&quot;&gt;Lucid Imagination&lt;/a&gt; had exactly that idea. The company, which offers commercial support for Lucene and Solr, is now offering its own &amp;quot;LucidWorks&amp;quot; versions with the ISYS filters integrated. This means one of the gaps between open source and commercial search products has been bridged:&amp;nbsp;with the filters, Lucene, too, can read over 200 file types.&lt;/p&gt;
&lt;p&gt;According to Lucid, this has been one of the favorite doubts commercial vendors would cast over the open source search engine, and the move should level the playing field. However, as a customer, you should be aware that there are a couple of other things you may take for granted that are missing. Lucene doesn't come with connectors to various content repositories, for instance, or even a simple web crawler.&lt;/p&gt;
&lt;p&gt;Still, the filters are a welcome addition, and they're certainly an improvement over what's currently available as open source. It's not just in the numbers: ask yourself how you think a converter will read a three-column Word document. You may be surprised to know that some will just go across all the first lines from left to right, then the second lines, etcetera. As always in &lt;a href=&quot;http://www.cmswatch.com/Search/Report/&quot;&gt;&lt;i&gt;Search &amp;amp;&amp;nbsp;Information Access&lt;/i&gt;&lt;/a&gt;, the devil is in the details -- and knowing about these details will pay off.&lt;/p&gt;
&lt;p&gt;The added filters aren't free, but they're not exactly expensive, either. There's a 14-day trial, and you can get a subset (e.g., Microsoft Office) of the filters for as little as $3,250 for 2 years, or pay $10,000 for all of them (including those pesky legacy formats you'll discover in a distant corner of your fileserver when you least expect it). That's still a long way off from the hundreds of thousands even a Google Appliance implementation may cost you in licensing. (Though there's no such thing as a &lt;a href=&quot;http://www.cmswatch.com/Trends/1247-Enterprise-search:-free-as-in-free-beer&quot;&gt;free lunch or free beer&lt;/a&gt; with open source, either.)&lt;/p&gt;
&lt;p&gt;So this is interesting news if you're considering Lucene, but what about ISYS?&amp;nbsp;Aren't they selling the family silver? Well, let me wrap up this post by meandering off into history. As the (perhaps apocryphal) story has it, when the Dutch were at war with the Spanish in the 16th century, they were still selling cannons to their opponents. They figured they might as well make a profit out of it:&amp;nbsp;the outcome would be determined by strategy, anyway.&lt;/p&gt;
&lt;p&gt;Open source projects and commercial vendors, on the other hand, don't even have to be at war. And as with a Spanish Rioja or a Dutch Heineken, it's all about picking the right one for the occasion.&lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1631-Lucene-Lucid-ISYS?source=RSS</link>
         <category>Search and Information Access</category>
         <author>bloem@radagio.com(Adriaan Bloem)</author>
         <pubDate>Tue, 30 Jun 2009 19:19:00 -0400</pubDate>
      </item>
      <item>
         <title>The Coming Acronym Crisis</title>
         <description>&lt;p&gt;As I talk to people in the content-technology industry (if I may call it that), I'm struck by a common thread that has begun to emerge in conversations involving roadmaps and futures. Vendors are beginning to unshackle themselves from acronyms. Let me spare you the suspense and take you straight to the disturbing punchline: I believe we are headed for an acronym crisis.&lt;br /&gt;
&lt;br /&gt;
I've heard Digital Asset Management vendors say that &lt;a href=&quot;http://www.cmswatch.com/DAM/Report/&quot;&gt;DAM&lt;/a&gt; is not a good acronym any more, because it conjures a narrow, obsolete picture of the problem space. DAM platforms have grown. Offerings like &lt;a href=&quot;http://www.cmswatch.com/DAM/Vendors/MediaBeacon&quot;&gt;MediaBeacon&lt;/a&gt; R3volution, MediaBin (now owned by &lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/Autonomy&quot;&gt;Autonomy&lt;/a&gt;), &lt;a href=&quot;http://www.cmswatch.com/DAM/Vendors/North%20Plains&quot;&gt;North Plains&lt;/a&gt; Telescope, and others, are beginning to include functionalities that are, in many cases, rather &lt;a href=&quot;http://www.cmswatch.com/ECM/Report/&quot;&gt;ECM&lt;/a&gt;-like. Likewise, many &lt;a href=&quot;http://www.cmswatch.com/CMS/Report/&quot;&gt;WCM&lt;/a&gt; products -- &lt;a href=&quot;http://www.cmswatch.com/CMS/Vendors/Alterian&quot;&gt;Alterian&lt;/a&gt; Immediacy, &lt;a href=&quot;http://www.cmswatch.com/CMS/Vendors/PaperThin&quot;&gt;PaperThin&lt;/a&gt; CommonSpot, and many others -- continue to incorporate more and more DAM-like features (e.g., lightbox previews, inline image editing, renditions, image metadata support, Flash previews). Some products, like &lt;a href=&quot;http://www.cmswatch.com/CMS/Vendors/Day%20Software&quot;&gt;Day Software&lt;/a&gt;'s Communiqu&amp;eacute;, have such smooth integration between WCM and DAM offerings that it's hard to tell where one begins and the other one ends. &lt;br /&gt;
&lt;br /&gt;
But it's not just a matter of WCM+DAM convergence (something that's been talked about, and has been happening, for a long time now). &lt;a href=&quot;http://www.cmswatch.com/Search/Report/&quot;&gt;Search&lt;/a&gt; and &lt;a href=&quot;http://www.cmswatch.com/Analytics/Report/&quot;&gt;Web Analytics&lt;/a&gt; are increasingly integral to Content Management solutions, and these technologies are, in turn, driving more personalization and dynamism into Web-facing systems. The information that comes out of all this has high business value and needs to be fed back into any number of other business applications (CRM, BI, KM, Sales Lead Management systems, etc.), lest ROI suffer. But the crisscrossing of so many technologies leads to a paradox: What do you call a system that combines features of WCM, DAM, Web Analytics, Search and Information Access, and maybe CRM as well? No one acronym seems to do the job. &lt;br /&gt;
&lt;br /&gt;
All of this begs the question of whether acronyms are really that important to begin with. I think they are, actually. (Otherwise we wouldn't have so many of them -- and they wouldn't persist for so long after becoming obsolete.) As old acronyms fall into disuse, new ones emerge. CMS gives way to WCM which gives way to WEM &lt;span style=&quot;visibility: visible;&quot; id=&quot;main&quot;&gt;&lt;span style=&quot;visibility: visible;&quot; id=&quot;search&quot;&gt;(Web Experience Management)&lt;/span&gt;&lt;/span&gt;, which in turn will someday give way to something else. Right now, though, the industry is at a crossroads. What do you call a CMS that incorporates aspects of DAM, WCM, &lt;a href=&quot;http://en.wikipedia.org/wiki/Document_management_system&quot;&gt;DM&lt;/a&gt;, Search, Web Analytics, and &lt;a href=&quot;http://en.wikipedia.org/wiki/Text_mining&quot;&gt;Text Mining&lt;/a&gt;, plus (say) a few &lt;a href=&quot;http://www.cmswatch.com/Social/Report/&quot;&gt;social apps&lt;/a&gt;? You can call it a content platform (CP), but that feels vaguely unsatisfying.&lt;br /&gt;
&lt;br /&gt;
Nomenclature is important, I think. But in this case, I'm fresh out of acronyms. And for an analyst, that's embarrassing.&lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1627-Acronym-Crisis?source=RSS</link>
         <category>Digital Asset Management</category>
         <author>kthomas@cmswatch.com(Kas Thomas)</author>
         <pubDate>Thu, 25 Jun 2009 13:14:00 -0400</pubDate>
      </item>
      <item>
         <title>GSA6: Google Billions, Revisited</title>
         <description>&lt;p&gt;Last week, I posted a &lt;a href=&quot;http://www.cmswatch.com/Trends/1607-GSA-V6-Hype&quot;&gt;highly critical comment&lt;/a&gt; on &lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/Google/&quot;&gt;Google&lt;/a&gt;'s marketing of the Appliance, version 6. My main qualm is that the hyperbole makes it very hard to understand what it actually is they're selling. What you get with a GSA is not exactly how it looks on YouTube (well, the box is, but not necessarily the internals).&lt;/p&gt;
&lt;p&gt;Of course, in my quest to get you the real story, I'm not going to leave it at &amp;quot;press releases and documentation don't match up&amp;quot;. The interesting bit is what the software is actually capable of; even more interesting is what customers are doing with it in reality.&lt;/p&gt;
&lt;p&gt;For now, I'll zoom in on what made the headlines: the Appliance's new capability to index billions of documents, rather than the 30 million of the previous version. I noted two things about this:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;The Dynamic Scalability feature, according to the version 6.0 documentation, &amp;quot;enables multiple Google Search Appliances to work together to scale up to 30 million documents and provide unified search results&amp;quot; (not billions);&lt;/li&gt;
    &lt;li&gt;Being able to index billions of documents in general (and this applies to all vendors) is a rather meaningless statement, since it really depends on what you're indexing (I used the example of 10-digit phone numbers vs. 40 MB PDFs).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Google got in touch with me to explain this, and this led to two surprises.&lt;/p&gt;
&lt;p&gt;First of all: Dynamic Scalability is, in fact, the feature that would enable indexing billions of documents, and this isn't a beta feature. So what about the documentation's reference to a 30 million document limit? As it turns out: &lt;i&gt;this is an error in Google's documentation&lt;/i&gt;. (For now, the error is still in the &amp;quot;&lt;a href=&quot;http://code.google.com/apis/searchappliance/documentation/60/NewFeatures60.html#federation&quot;&gt;Guide to Software Release 6.0&lt;/a&gt;&amp;quot;, but I've been told this will be corrected.) According to Google, there is no hardwired limit to the number of documents you can index using multiple machines (as long as you buy lots and lots of Appliances to do it on, of course).&lt;/p&gt;
&lt;p&gt;Secondly, about the difference between indexing 10-digit phone numbers and 40 MB PDFs: I've been told that the Appliance's hardware is carefully over-spec'ed to handle the load Google claims it can deal with. (The Dell PowerEdge R710s the vendor ships would out-perform many commodity servers.) My 40 MB comment was a bit of a jab: an Appliance won't index documents larger than 30 MB. But as Google explained, the limit has been set so they can guarantee that when they say a GB-7007 can index 10 million documents, it can actually index 10 million of those 30 MB PDFs when that's what you need to do. And to be fair, if large documents are an issue for you, you'll want to read our &lt;i&gt;&lt;a href=&quot;http://www.cmswatch.com/Search/Report/&quot;&gt;Search &amp;amp; Information Access Report&lt;/a&gt;&lt;/i&gt; product evaluations carefully, since most enterprise search products have similar limits.&lt;/p&gt;
&lt;p&gt;In the end, of course, the proof will be in the pudding: even if the software is capable of tying together 38 appliances to index a billion documents, this may not mean you'd actually want to. What are minor issues on a smaller corpus suddenly become major problems on that scale, and I'm looking forward to seeing how real enterprises are faring in deploying a cluster of GSAs for such high volumes.&lt;/p&gt;
&lt;p&gt;And if anything: you still shouldn't believe the hype. Google's &amp;quot;billion document index&amp;quot; headline was syndicated across hundreds of news sources before even Google itself found out its documentation contradicted this. You'll want to be sure to get your information from a reliable source.&lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1611-Google-Search-Billions?source=RSS</link>
         <category>Search and Information Access</category>
         <author>bloem@radagio.com(Adriaan Bloem)</author>
         <pubDate>Thu, 11 Jun 2009 05:09:00 -0400</pubDate>
      </item>
      <item>
         <title>Google Search Appliance v6: don't believe the hype</title>
         <description>&lt;p&gt;The release of a new major version of the &lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/Google/&quot;&gt;Google Search Appliance&lt;/a&gt; usually creates lots of excitement, but that excitement fades quickly once people start to use the machines for real. Even many of the Google resellers I talk to admit to some disappointment: first highly anticipated features are finally introduced, and then they turn out to be rather DIY.&lt;/p&gt;
&lt;p&gt;I don't want to sound like a broken record -- see &lt;a href=&quot;http://www.cmswatch.com/Trends/1122-Google-Search-Appliance:-small-step-in-technology,-giant-leap-in-marketing&quot;&gt;Google Search Appliance: small step in technology, giant leap in marketing&lt;/a&gt; or &lt;a href=&quot;http://www.cmswatch.com/Trends/1339-The-Emperor%27s-New-Box&quot;&gt;The Emperor's New Box&lt;/a&gt;. The point with many of the Google Appliance's new features is that typically, they're not actually on the box itself, and quite often, that means you &lt;a href=&quot;http://www.cmswatch.com/Trends/668-Google-expands-enterprise-search-targets...sort-of&quot;&gt;start losing the convenience of having a plug-and-play appliance&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;So pardon me for being skeptical of what the new version 6 of the yellow server will actually bring to the table. Try to line up the several lists of new features: the PowerPoint presentation I saw was slightly different than the &lt;a href=&quot;http://www.google.com/enterprise/search/gsa.html&quot;&gt;&amp;quot;New!&amp;quot; list on the product page&lt;/a&gt;. The &lt;a href=&quot;http://www.youtube.com/watch?v=ioesRtK52rc&quot;&gt;YouTube announcement&lt;/a&gt; mentions cross-language-enterprise search, &lt;a href=&quot;http://www.cmswatch.com/Trends/1474-Google-is-an-end---Translate-knows-how-to-bake&quot;&gt;which isn't a feature but an experimental project&lt;/a&gt;. Actually, looking at the version 6 documentation, &lt;a href=&quot;http://code.google.com/apis/searchappliance/documentation/60/NewFeatures60.html#new&quot;&gt;quite a few of the new features turn out to be still in beta&lt;/a&gt;. Many of the announcements mention &amp;quot;early binding&amp;quot; security as a new feature, but &lt;a href=&quot;http://code.google.com/search/#q=%22early%20binding%22&quot;&gt;I haven't been able to find this in the documentation&lt;/a&gt;. The same goes for many of the other capabilities you'll hear buzzing about: they're very hard to trace back to actual features in the new version 6.&lt;/p&gt;
&lt;p&gt;Perhaps the most hyped novelty is scaling, or as Mountain View likes to call it, (GSA)&lt;sup&gt;n&lt;/sup&gt;. As the &lt;a href=&quot;http://googleenterprise.blogspot.com/2009/06/introducing-google-search-appliance-60.html&quot;&gt;Google Enterprise blog&lt;/a&gt; gushes, &amp;quot;When we tested it out, the product manager was pretty excited about all the new features and search power. He was used to hearing about the millions of docs we could handle &amp;ndash; but this time we were going to push it to a new realm: billions.&amp;quot; They've even made a &lt;a href=&quot;http://www.youtube.com/watch?v=rE0KNvECG2s&quot;&gt;video that shows the product manager being all excited&lt;/a&gt;. I can imagine him being excited -- like many others, Google charges per indexed document. But a flat claim of &amp;quot;billions,&amp;quot; with &amp;quot;less than five server racks!&amp;quot; means very little to me. Billions of what? 10-digit phone numbers or 40 MB PDFs? (I could go on for hours on why this is quite meaningless, since it all depends on the exact circumstances, but suffice it to say that I've already seen several vendors demonstrate billion-document indexes on a lot less than five server racks.)&lt;/p&gt;
&lt;p&gt;However, I was pleased to see that this is one of the features that's actually on the box itself in version 6. Well... &amp;quot;Dynamic Scalability&amp;quot; is, at any rate, which &amp;quot;enables multiple Google Search Appliances to work together to scale up to 30 million documents&amp;quot;. So how do you get from 30 million to billions? I presume by using &amp;quot;Distributed Crawling&amp;quot;, since this &amp;quot;greatly increases the number of documents that can be crawled&amp;quot;. That's a beta feature, though, which can only lead me to conclude that the theme of this release should actually be (GSA)&lt;sup&gt;beta&lt;/sup&gt; (pronounced as &amp;quot;to the power of beta&amp;quot;).&lt;/p&gt;
&lt;p&gt;I tend to think that in every market, there's room for Rolls Royces, Volkswagens, and everything in between. But it makes very little sense to take a Volkswagen and &lt;a href=&quot;http://www.priceofhistoys.com/wp-content/uploads/2006/08/vw-truck-rolls-2.jpg&quot;&gt;turn it into a Rolls Royce&lt;/a&gt;. Thankfully, &lt;a href=&quot;http://www.youtube.com/watch?v=cv157ZIInUk&quot;&gt;Volkswagen understands this&lt;/a&gt;. Maybe Google should, too.&lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1607-GSA-V6-Hype?source=RSS</link>
         <category>Search and Information Access</category>
         <author>bloem@radagio.com(Adriaan Bloem)</author>
         <pubDate>Tue,  2 Jun 2009 13:54:00 -0400</pubDate>
      </item>
      <item>
         <title>When customers are unreasonable</title>
         <description>&lt;p&gt;At CMS Watch, we're advocates only for you, the technology customer. But sometimes the customer is unreasonable. In my experience, &amp;quot;bad&amp;quot; customers most typically cause problems around schedules and scope, more so than cheap-skating the vendor.&lt;/p&gt;
&lt;p&gt;Nonetheless, this video nicely captures how customers can go too far in demanding more value than they're willing to pay for. It's funny. (Hat-tip to &lt;a href=&quot;http://www.linkedin.com/in/jenniferhoppe&quot;&gt;Jennifer Mayne Hoppe&lt;/a&gt; and &lt;a href=&quot;http://www.heidistrom.com/&quot;&gt;Heidi Strom Moon&lt;/a&gt;.)&lt;/p&gt;
&lt;p align=&quot;center&quot;&gt;&lt;object height=&quot;255&quot; width=&quot;420&quot;&gt;
&lt;param name=&quot;movie&quot; value=&quot;http://www.youtube.com/v/R2a8TRSgzZY&amp;amp;hl=en&amp;amp;fs=1&quot; /&gt;
&lt;param name=&quot;allowFullScreen&quot; value=&quot;true&quot; /&gt;
&lt;param name=&quot;allowscriptaccess&quot; value=&quot;always&quot; /&gt;&lt;embed height=&quot;255&quot; width=&quot;420&quot; src=&quot;http://www.youtube.com/v/R2a8TRSgzZY&amp;amp;hl=en&amp;amp;fs=1&quot; type=&quot;application/x-shockwave-flash&quot; allowscriptaccess=&quot;always&quot; allowfullscreen=&quot;true&quot;&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/p&gt;
&lt;p&gt;But let's talk about the world of software and implementation services -- which, alas, is not like ordering dinner, buying a DVD, or getting your hair cut.&lt;/p&gt;
&lt;p&gt;If this clever video were about the real world of software sales, then...&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;There would be no prices on the restaurant menu&lt;/li&gt;
    &lt;li&gt;The DVD cover price would include the box only (&amp;quot;Oh, so you wanted the disk, too?&amp;quot;)&lt;/li&gt;
    &lt;li&gt;Highlights could only be acquired as part of a larger &amp;quot;make-over suite&amp;quot; of bundled services&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href=&quot;http://www.cmswatch.com/Reports/&quot;&gt;Our research&lt;/a&gt; is designed to help you save money, and part of what we explain to you is how to negotiate a better deal. At the end of the day, though, your suppliers are in business too, so the best relationships engender transparency -- but also allow for both parties to profit.&lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1601-When-Customer-Unreasonable?source=RSS</link>
         <category>Digital Asset Management</category>
         <author>tbyrne@cmswatch.com(Tony Byrne)</author>
         <pubDate>Wed, 27 May 2009 19:22:00 -0400</pubDate>
      </item>
      <item>
         <title>Wolfram Alpha search engine: just the facts</title>
         <description>&lt;p&gt;After months of high anticipation of what this &amp;quot;&lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/Google&quot;&gt;Google&lt;/a&gt; killer&amp;quot; would bring, &lt;a href=&quot;http://www.wolframalpha.com&quot;&gt;Wolfram Alpha&lt;/a&gt; was finally opened to the public a week ago.&lt;/p&gt;
&lt;p&gt;It's common practice in this day and age to compare any new public search engine to Google. Or more to the point, to debate whether or not the newcomer will be able to snag a piece of the giant's share of searches: currently just over 70% of the US market, and a near-monopoly of over 90% in most of Europe. This was the main point with &lt;a href=&quot;http://www.cmswatch.com/Trends/1333-Cuil-could-be-cool&quot;&gt;Cuil&lt;/a&gt;, and it will be the main point with &lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/Microsoft&quot;&gt;Microsoft&lt;/a&gt;'s upcoming new web search engine, codenamed &amp;quot;Kumo&amp;quot;.&lt;/p&gt;
&lt;p&gt;Wolfram Alpha, of course, has to be different. And in this case, it actually is; so much so that it makes very little sense to compare it to Google. It's a &amp;quot;computational knowledge engine,&amp;quot; which tries to &amp;quot;understand&amp;quot; your query (not so much in a natural language way, though it does try, but more as a mathematical equation) and then calculates the answer based on a corpus of &amp;quot;curated facts.&amp;quot;&lt;/p&gt;
&lt;p&gt;If you must compare it, look at Google's &amp;quot;&lt;a href=&quot;http://www.google.com/help/features.html&quot;&gt;reference tools&lt;/a&gt;,&amp;quot; e.g. queries like &amp;quot;&lt;a href=&quot;http://www.google.nl/search?q=10+dollars+in+euros&quot;&gt;10 dollars in euros&lt;/a&gt;&amp;quot; (currently, about 7.26). Wolfram Alpha certainly does a &lt;a href=&quot;http://wolframalpha.com/input/?i=10+dollars+in+euros&quot;&gt;better job at this&lt;/a&gt; (including nice graphs of the rate over the last year or last ten years). It will also, for instance, &amp;quot;&lt;a href=&quot;http://wolframalpha.com/input/?i=compare+york+to+new+york&quot;&gt;compare York to New York&lt;/a&gt;&amp;quot; for you (and thankfully, it'll &lt;a href=&quot;http://www.cmswatch.com/Trends/1594-Search-Queries-Starbucks&quot;&gt;show you how it has parsed your request&lt;/a&gt;). Think of it as an excellent way to query sources like the &lt;a href=&quot;https://www.cia.gov/library/publications/the-world-factbook/&quot;&gt;CIA World Factbook&lt;/a&gt; or &lt;a href=&quot;http://www.wikipedia.org&quot;&gt;Wikipedia&lt;/a&gt; and perform comparisons.&lt;/p&gt;
&lt;p&gt;Playing around with it, you'll also run into its limitations quite quickly. You can ask it the average temperature in San Francisco or the average temperature in The Netherlands, but it won't compare the average temperatures in San Francisco and The Netherlands. You can ask it to divide 12 by 3 and convert the result from dollars to euros, and you can ask it for the &amp;quot;currency of The Netherlands.&amp;quot; But you can't ask it what &amp;quot;4 in US currency is in The Netherlands currency.&amp;quot;&lt;/p&gt;
&lt;p&gt;In short, while it allows for very complex mathematical queries, please don't ask it to check more than two facts at the same time. And if you're not looking for facts, you're completely out of luck: it can show you the color purple (the book), but you'll learn little more of substance than that it was first published in 1983 and won a Pulitzer.&lt;/p&gt;
&lt;p&gt;It's tempting to digress on the roots of Wolfram Alpha and talk about &lt;a href=&quot;http://en.wikipedia.org/wiki/Stephen_Wolfram&quot;&gt;Stephen Wolfram&lt;/a&gt;, his &lt;a href=&quot;http://en.wikipedia.org/wiki/Mathematica&quot;&gt;Mathematica&lt;/a&gt; software, and his book &amp;quot;&lt;a href=&quot;http://en.wikipedia.org/wiki/A_New_Kind_of_Science&quot;&gt;A New Kind of Science&lt;/a&gt;&amp;quot;, which attempts to answer, well, &lt;a href=&quot;http://www.bbc.co.uk/cult/hitchhikers/guide/question.shtml&quot;&gt;the question of life, the universe, and everything&lt;/a&gt;. But to be honest, this is probably as helpful as an explanation of &lt;a href=&quot;http://en.wikipedia.org/wiki/Bayes_theorem&quot;&gt;Bayes' theorem&lt;/a&gt; is to understanding the usefulness of &lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/Autonomy&quot;&gt;Autonomy IDOL&lt;/a&gt;. The main difference is probably that Stephen Wolfram, unlike &lt;a href=&quot;http://en.wikipedia.org/wiki/Thomas_Bayes&quot;&gt;Thomas Bayes&lt;/a&gt;, is alive and well, and his ego is an important driving factor behind Wolfram Alpha.&lt;/p&gt;
&lt;p&gt;There's certainly a use for the engine, and you may want to bookmark the site for that reason. You may even want to use the API for this functionality once it becomes available. But &amp;quot;computational knowledge&amp;quot; has its limitations. Wolfram Alpha will, in fact, by using its principles of Mathematica and NKS, compute the answer to the question of life, the universe, and everything. &lt;a href=&quot;http://wolframalpha.com/input/?i=answer+to+life+the+universe+and+everything&quot;&gt;The answer is 42&lt;/a&gt;. I'll leave it up to you to decide whether this is just the wrong question to ask, or whether the answer is just really unhelpful.&lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1595-Wolfram-Alpha?source=RSS</link>
         <category>Search and Information Access</category>
         <author>bloem@radagio.com(Adriaan Bloem)</author>
         <pubDate>Mon, 25 May 2009 14:53:00 -0400</pubDate>
      </item>
      <item>
         <title>Search, query syntax, Google and Starbucks</title>
         <description>&lt;p&gt;There's too much to write up in one blog post, so I'll pick one of the cherries for you here. Last week's &lt;a href=&quot;http://www.enterprisesearchsummit.com/&quot;&gt;Enterprise Search Summit&lt;/a&gt; in New York ended with roundtable discussions -- which meandered around several topics, and then down the escalators into the hotel lobby.&lt;/p&gt;
&lt;p&gt;I was discussing querying a search engine with &lt;a href=&quot;http://thenoisychannel.com/&quot;&gt;Daniel Tunkelang&lt;/a&gt; (Chief Scientist of &lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/Endeca&quot;&gt;Endeca&lt;/a&gt;), standing across the street from one of the ubiquitous Starbucks. He told me search phrases are much like coffee orders: if you don't go to Starbucks often, you'll hesitantly ask for a large coffee, with milk, maybe skim milk, and you'll say please. If you go there every day, you just go in and say &amp;quot;venti skim milk latte,&amp;quot; pay, and leave with your coffee.&lt;/p&gt;
&lt;p&gt;When we use &lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/Google&quot;&gt;Google&lt;/a&gt; on the web, we're used to the same kind of formulas -- we know Google's query language because we've grown accustomed to it, not because it's particularly good or bad at understanding what we want.&lt;/p&gt;
&lt;p&gt;Endeca is doing research into parsing users' queries into meaningful commands for a search engine (as are many others), and it'll be interesting to see what they come up with. But there is a useful insight here that has a broader appeal: while public web search engines (like Google) have an interest in making the parsing seem easy, within the enterprise there's no reason to hide what processing has been performed on your original query.&lt;/p&gt;
&lt;p&gt;As I walked into Starbucks the next morning for my daily dose of caffeine (sadly lacking my Lavazza Qualit&amp;agrave; Oro when traveling), I realized it had once taken me several attempts to order this, as &amp;quot;a large espresso,&amp;quot; &amp;quot;an espresso with three extra shots,&amp;quot; or &amp;quot;a double doppio.&amp;quot; I broke the code because I could hear the order go to the barista as a &amp;quot;quad.&amp;quot;&lt;/p&gt;
&lt;p&gt;Maybe you should let your users see what you're actually ordering the search engine to serve up, as well. Show what you're expanding, removing, or translating. And don't forget that &amp;quot;no results&amp;quot; isn't necessarily a bad thing -- as long as you're sure there really are no results to display. It's more helpful to be straightforward than overly polite!&lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1594-Search-Queries-Starbucks?source=RSS</link>
         <category>Search and Information Access</category>
         <author>bloem@radagio.com(Adriaan Bloem)</author>
         <pubDate>Wed, 20 May 2009 15:33:00 -0400</pubDate>
      </item>
      <item>
         <title>Search, Portals, and SharePoint at Interop-Vegas</title>
         <description>&lt;p&gt;Next week you can find &lt;a href=&quot;http://www.cmswatch.com/Analyst/10-Pelz-Sharpe&quot;&gt;Alan&lt;/a&gt;    and me at &lt;a href=&quot;http://www.interop.com/lasvegas/&quot;&gt;Interop 2009&lt;/a&gt; in Las    Vegas. We'll have a booth there to show some of CMS Watch's latest research,    so feel free to stop by and say hello.&lt;/p&gt;
&lt;p&gt;On Wednesday, we're also co-teaching an intensive, day-long &lt;a href=&quot;http://www.interop.com/lasvegas/conference/workshops.php#1242802800&quot;&gt;workshop    on contemporary enterprise information management&lt;/a&gt;. The morning session covers    &amp;quot;Accessing Enterprise Information via Portals and Search Technology.&amp;quot;    Then after lunch we tackle &amp;quot;Evaluating SharePoint in the Enterprise.&amp;quot;&lt;/p&gt;
&lt;p&gt;If you're interested in Portals, Search, and SharePoint, I encourage you to    sign up. There's a lot happening in all three domains, and we're going to see    if we can sort it all out in one action-packed day. Hope to see you there!&lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1589-Search-Portals-SharePoint-Interop?source=RSS</link>
         <category>Enterprise Portals</category>
         <author>tbyrne@cmswatch.com(Tony Byrne)</author>
         <pubDate>Mon, 11 May 2009 15:04:00 -0400</pubDate>
      </item>
      <item>
         <title>JBoye 2009 conference wrap-up</title>
         <description>&lt;p&gt;I had the pleasure of participating in the &lt;a href=&quot;http://www.jboye.com/conferences/philadelphia09&quot;&gt;JBoye 
  09 conference&lt;/a&gt; in Philadelphia, USA earlier this week. For those who missed 
  it, here are some interesting tidbits, in no particular order.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.rosenfeldmedia.com&quot;&gt;Lou Rosenfeld&lt;/a&gt; talked about the importance of cross-pollinating the work of 
  information architecture, content management, and user experience (money quote: 
  &amp;quot;CM-UX = money down the toilet&amp;quot;). But the meat of his keynote revolved 
  around how to reconcile the different approaches of web analytics and user experience. 
  Web analytics tries to answer &amp;quot;what.&amp;quot; UX tries to answer &amp;quot;why.&amp;quot; 
  Of course you need to ask both, but when you ask them together, you can get 
  some very revealing answers.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.classicsys.com/&quot;&gt;Jim Hobart&lt;/a&gt; gave a high-speed tour of different approaches to navigating complex 
  hierarchies, based on actual customer testing. A sprinkling: &lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Avoid three levels of trees in casual user interfaces&lt;/li&gt;
  &lt;li&gt;Avoid cascading menus or menus that require fine motor mousing skills 
    (we have to fix ours!); indent, or use single-level drop-down instead, with 
    lots of signaling&lt;/li&gt;
  &lt;li&gt;The ideal wait-state for a mouseover pop-out is 1250 to 1800 milliseconds, 
    depending on the size of the target area&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;From the annals of making-sure-your-SharePoint-consultancy-knows-what-the-hell-they're-talking-about: 
  &lt;a href=&quot;http://www.jboye.com/conferences/philadelphia09/speakers/dorthe_jespersen&quot;&gt;Dorthe Jespersen&lt;/a&gt; of JBoye shared an anecdote from a major Danish enterprise whose Microsoft 
  partner told them repeatedly (!) that &amp;quot;it's impossible to turn off MySites.&amp;quot; 
  Ahem. Actually it's considered a best practice by many large enterprises to 
  disable MySites upon initial install and then selectively turn them on after 
  addressing technical and governance issues. See &lt;a href=&quot;http://www.cmswatch.com/SharePoint/Research/&quot;&gt;our SharePoint research&lt;/a&gt; for 
  more details.&lt;/p&gt;
&lt;p&gt;The ever thought-provoking &lt;a href=&quot;http://www.intranetfocus.com&quot;&gt;Martin White&lt;/a&gt; and &lt;a href=&quot;http://www.metatorial.com&quot;&gt;Bob Boiko&lt;/a&gt; set off some great discussions 
  about whether we should call ourselves &amp;quot;information managers&amp;quot; rather 
  than &amp;quot;content managers.&amp;quot; The debate is of course partly over semantics 
  (personally, I think Information = Content + Data, but some people find that 
  too trite), and partly about better marketing the importance of what we do within 
  the enterprise. Is &amp;quot;information manager&amp;quot; a more impactful title? ...I 
  don't think so. But then, what is...?&lt;/p&gt;
&lt;p&gt;On a related note, I had the pleasure of moderating the final &amp;quot;town hall 
  debate&amp;quot; between Dana Hallman of the US Treasury Dept and Jerry Boyle of 
  Thermo Fisher Scientific. The following resolutions were put forth to the audience, 
  who voted on them after some lively sparring:&lt;/p&gt;
&lt;p&gt;&amp;quot;Face it: Your privacy is long gone&amp;quot; -- passed&lt;/p&gt;
&lt;p&gt;&amp;quot;We need to stop whining and start doing&amp;quot; -- passed&lt;/p&gt;
&lt;p&gt;&amp;quot;I've seen the future and it's SharePoint&amp;quot; -- failed&lt;/p&gt;
&lt;p&gt;&amp;quot;Communications Directors should call themselves Information Managers&amp;quot; 
  -- failed&lt;/p&gt;
&lt;p&gt;&amp;quot;LinkedIn will soon become an important enterprise information management 
  platform&amp;quot; -- failed&lt;/p&gt;
&lt;p&gt;In sum, the conference was a super learning experience. If you care about web 
  and web content technologies and practices, this is a great event. &lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1587-JBoye-Wrap-Up?source=RSS</link>
         <category>Search and Information Access</category>
         <author>tbyrne@cmswatch.com(Tony Byrne)</author>
         <pubDate>Fri,  8 May 2009 09:26:00 -0400</pubDate>
      </item>
      <item>
         <title>Search in New York next week</title>
         <description>&lt;p&gt;I'm excited to be headed over to New York next week. The &lt;a href=&quot;http://www.enterprisesearchsummit.com/2009/&quot;&gt;Enterprise Search Summit 2009&lt;/a&gt; brings together a veritable who's-who in search, and I'm looking forward to attending an event where search is the featured topic (and not just another parallel track on the program).&lt;/p&gt;
&lt;p&gt;If you're going, be sure to say hi to us. We love to hear real-life stories about search from end-users in the enterprise and exchange thoughts with you. I'll be doing a presentation (my &lt;a href=&quot;http://www.eurocoins.co.uk/images/2002eurozone2eurocentrev240.jpg&quot;&gt;2 cents' worth&lt;/a&gt; on &lt;a href=&quot;http://www.enterprisesearchsummit.com/2009/speaker.shtml?speaker=AdriaanBloem&quot;&gt;pragmatic ways to implement search&lt;/a&gt;), but I'm really attending to listen (rather than just talk).&lt;/p&gt;
&lt;p&gt;If you haven't decided yet, it's not too late. You can attend the keynotes for free, and better yet, you can get a $200 discount if you sign up &lt;a href=&quot;https://secure.infotoday.com/forms/default.aspx?form=ess2009&amp;amp;priority=speak3&quot;&gt;using this link&lt;/a&gt;. Hope to see you there!&lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1583-Enterprise-Search-Summit-2009?source=RSS</link>
         <category>Search and Information Access</category>
         <author>bloem@radagio.com(Adriaan Bloem)</author>
         <pubDate>Wed,  6 May 2009 18:27:00 -0400</pubDate>
      </item>
      <item>
         <title>Join us at Internet World UK</title>
         <description>&lt;p&gt;When I think of Earls Court, London, the first thing that comes to mind for me is the many Led Zeppelin concerts I grew up listening to that were recorded in this hallowed venue. Next week Earls Court will take on a different sort of ambiance, to be sure, for &lt;a href=&quot;http://internetworld.co.uk/&quot;&gt;Internet World UK&lt;/a&gt;, the largest expo of its kind in London. &lt;/p&gt;
&lt;p&gt;Join me and my colleagues Tony Byrne and Alan Pelz-Sharpe (who are almost, but not quite, as cool as Jimmy Page &amp;amp; Robert Plant) as we host sessions about Social Software, Web Analytics, Findability, and DAM. We'll also have a stand featuring our now-infamous &amp;quot;content management therapy&amp;quot; couch. Come by this free show, spend some time with us, and share your content management challenges. &lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1578-Internet-World-UK?source=RSS</link>
         <category>Digital Asset Management</category>
         <author>tregli@cmswatch.com(Theresa Regli)</author>
         <pubDate>Fri, 24 Apr 2009 13:22:00 -0400</pubDate>
      </item>
      <item>
         <title>Vivisimo - still searching for real social marketing</title>
         <description>&lt;p&gt;&lt;a href=&quot;http://meetstan.com&quot;&gt;Meet Stan&lt;/a&gt;, the everyman, who'll try to sell you on &lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/Vivisimo&quot;&gt;Vivisimo Velocity&lt;/a&gt; enterprise search software. Stan stars in videos explaining the wonders of enterprise search, writes a blog, and can be friended on &lt;a href=&quot;http://www.cmswatch.com/Social/Vendors/Facebook&quot;&gt;Facebook&lt;/a&gt; and followed on Twitter.&lt;/p&gt;

&lt;p&gt;While I appreciate attempts at explaining what enterprise search is, or can do, I'm not sure Stan is going to help. It may just be me, but when I hear &amp;quot;Meet Stan&amp;quot;, I'm expecting Eminem (&amp;quot;Meet Stan. After meeting a young girl at a rave party, things start getting hot and heavy in an upstairs bedroom&amp;quot;). That must be a different Stan, though; in Vivisimo's world, all men wear mustaches. (And you recognize &amp;quot;hot women,&amp;quot; like Anne, because, well... they have no facial hair.)&lt;/p&gt;
&lt;p&gt;&lt;img height=&quot;97&quot; width=&quot;97&quot; alt=&quot;meet stan&quot; src=&quot;http://meetstan.com/images/layout/blogface.gif&quot; /&gt;&lt;/p&gt;
&lt;p&gt;So interestingly, there isn't really a lesson here about &lt;a href=&quot;http://www.cmswatch.com/Search/Report/&quot;&gt;enterprise search&lt;/a&gt;. Rather, this is an example of how hard it is to use &lt;a href=&quot;http://www.cmswatch.com/Social/Report/&quot;&gt;social software&lt;/a&gt; for marketing purposes. Vivisimo, for one, seems to have confused &amp;quot;viral&amp;quot; with &amp;quot;cheesy.&amp;quot;&lt;/p&gt;
&lt;p&gt;Stan does reply to Twitter comments, but that has mostly taught me he likes microbreweries. It's not the kind of interactive marketing that will engage an audience; you will need to take the people you're addressing a bit more seriously. Stan is talking &lt;i&gt;to&lt;/i&gt; us -- rather than talking &lt;i&gt;with&lt;/i&gt; us.&lt;/p&gt;
&lt;p&gt;On the upside, at least Stan doesn't speak in &lt;a href=&quot;http://bancomicsans.com&quot;&gt;Comic Sans&lt;/a&gt;. But if you're thinking of interactive marketing, or using social software to reach your customers, you should probably think more about what content would attract visitors, and how you're going to interact with them, rather than having an agency design something that ends up as a rather static brochure avatar.&lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1576-Vivisimo-Stan?source=RSS</link>
         <category>Enterprise Social Software</category>
         <author>bloem@radagio.com(Adriaan Bloem)</author>
         <pubDate>Tue, 21 Apr 2009 04:48:00 -0400</pubDate>
      </item>
      <item>
         <title>Read me that file so I can index it, please</title>
         <description>&lt;p&gt;One of those easy-to-overlook but important details of a search engine: will it actually read your files? You may be interested in &lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/Apache&quot;&gt;Lucene&lt;/a&gt;, but you'll have to find a way to feed it Office documents and PDFs.&lt;/p&gt;
&lt;p&gt;Search engines don't actually index the Word document or PDF directly; they index text. This is where document filters come into play. These do their best to get the text from the file (and usually some metadata, such as an &amp;quot;author&amp;quot; field). If you've ever tried to open some exotic document format in a plain text editor (e.g., Notepad or vi) you'll understand this can be far from trivial: many of these formats aren't very straightforward.&lt;/p&gt;
&lt;p&gt;The problem isn't just finding the text. There are quite a few complications: reading across two- or three-column layouts; what to do with footnotes; or what to index, period. Spreadsheets are troublesome, but what do you make of images, audio, video? And for many scenarios (like indexing a file share) there will be exotic file types to deal with. (I recall the comments at a municipality once: &amp;quot;But we don't have any exotic file types.&amp;quot; Three months later, a full crawl unearthed a stack of CAD/CAM files that were vital for planning.) To make matters worse, file formats change with each software version that comes out (will the converter read Office 2007 or just Office 95?).&lt;/p&gt;
&lt;p&gt;Since it's complicated to build and maintain good filters, most vendors buy them off-the-shelf. As I've &lt;a href=&quot;http://www.cmswatch.com/Trends/1185-The-game-of-musical-chairs-continues-in-enterprise-search&quot;&gt;talked about before&lt;/a&gt;, the market has been cornered by &lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/Oracle&quot;&gt;Oracle&lt;/a&gt; (with the INSO filters) and &lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/Autonomy&quot;&gt;Autonomy&lt;/a&gt; (with the KeyView filters). Almost all the search engines out there use either Oracle's or Autonomy's converters. A notable exception is &lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/Microsoft&quot;&gt;Microsoft&lt;/a&gt;, which has its own standard for this, IFilters. But IFilters are of varying quality, they don't always work with every Microsoft software product, and you may very well have to build a custom filter yourself for some ancient or rare software.&lt;/p&gt;
&lt;p&gt;And there's &lt;a href=&quot;http://www.cmswatch.com/Search/Vendors/ISYS&quot;&gt;ISYS&lt;/a&gt; -- probably the only vendor we cover in our &lt;a href=&quot;http://www.cmswatch.com/Search/Report/&quot;&gt;&lt;i&gt;Search &amp;amp; Information Access Report&lt;/i&gt;&lt;/a&gt; that has developed converters for over 200 document types entirely on its own. (Even Oracle and Autonomy didn't really build filters themselves -- they bought the companies that produced them.)&lt;/p&gt;
&lt;p&gt;It makes sense, then, that ISYS now tries to bank on that hidden capital. The vendor &lt;a href=&quot;http://www.isys-search.com/company/newsevents/filereaders.html&quot;&gt;announced last week&lt;/a&gt; it's releasing its &lt;a href=&quot;http://www.isys-search.com/technology/filereaders/index.html&quot;&gt;File Readers&lt;/a&gt; as a separately available product. It'll be interesting to see these show up in Lucene implementations (and in &lt;a href=&quot;http://www.cmswatch.com/Trends/1548-CMS-Search-Lucene&quot;&gt;content management systems embedding search&lt;/a&gt;). More options means more choice. Black may be the fastest drying paint, but maybe you can now &lt;a href=&quot;http://en.wikipedia.org/wiki/Ford_Model_T#Colors&quot;&gt;have that Model T in purple&lt;/a&gt; again.&lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1564-Read-me-that-file-so-I-can-index-it,-please?source=RSS</link>
         <category>Search and Information Access</category>
         <author>bloem@radagio.com(Adriaan Bloem)</author>
         <pubDate>Wed,  8 Apr 2009 15:20:00 -0400</pubDate>
      </item>
      <item>
         <title>Listening to the Pogue</title>
         <description>&lt;p&gt;That would be &lt;a href=&quot;http://www.davidpogue.com/&quot;&gt;David Pogue&lt;/a&gt;, &lt;em&gt;New    York Times&lt;/em&gt; technology columnist and author of the &amp;quot;&lt;a href=&quot;http://missingmanuals.com/&quot;&gt;Missing    Manuals&lt;/a&gt;&amp;quot; series of books. He'll be delivering the first keynote at    &lt;a href=&quot;http://www.jboye.com/conferences/philadelphia09/&quot;&gt;JBoye 09 in Philadelphia&lt;/a&gt;    next month. Even better, he's teaching a 3-hour (!) course on &amp;quot;&lt;a href=&quot;http://www.jboye.com/conferences/philadelphia09/tutorial/6&quot;&gt;Google:    The Missing Manual&lt;/a&gt;.&amp;quot; Both promise to be as entertaining as they are    informative.&lt;/p&gt;
&lt;p&gt;Hope to see you in Philly in any case. (BTW, you can get a 20% registration    discount using the code &lt;strong&gt;cmswatchphilly&lt;/strong&gt;.)&lt;/p&gt;</description>
         <link>http://www.cmswatch.com/Trends/1560-Pogue?source=RSS</link>
         <category>Search and Information Access</category>
         <author>tbyrne@cmswatch.com(Tony Byrne)</author>
         <pubDate>Tue,  7 Apr 2009 00:27:00 -0400</pubDate>
      </item>

   </channel>
</rss>

