TrendWatch Blog

Drupal, Mollom, and the Future of Blog Spam

11-Jun-2008

Is it just me, or has anyone else been struck by the lack of attention being paid to blog comment spam?

No one needs a reminder of how severe the spam problem is with e-mail. But e-mail spam is just one piece of the spam pie. (Oh man, talk about a hard-to-swallow metaphor...) Somewhere between 80 and 90 percent of comments posted to blogs and wikis come from spambots or their human surrogates. Bear in mind that, as technologies go, blogging is fairly new compared with e-mail. We're still near the beginning of the blog-spam curve.

To the extent that Social Software and Web CMS vendors sell, bundle, or pre-integrate blog and wiki solutions for you to employ beyond the firewall, they're selling you spam magnets as part of the deal. But they're not necessarily helping you with spam filtration.

You'd expect Social Software purveyors to be pioneers in this area, and some of them have decent services. But surprisingly, many of the vendors covered in our just-published Enterprise Social Software Report 2008: Networking & Collaboration Within and Beyond the Enterprise scored rather poorly on anti-spam capabilities.

Typical remedies for blog spam include comment moderation, challenge-response techniques, and automated filtering based on some combination of reputation assessment and AI-based text analysis. There are problems with all three approaches.

Moderation is tantamount to hand-processing. This is impractical in many cases and will only become more so over time.

A more practical deterrent is the CAPTCHA (a common challenge-response technique). The idea is that if you can correctly identify the letters in a distorted GIF image of a word, you're human, not a spambot, and therefore can be trusted not to post garbage. The CAPTCHA deters robots remarkably well (so far, at least), but it also deters legitimate posters to some extent. (Not everyone wants to play a word game in order to leave a comment.) And it will not deter a malicious human: offshore boiler rooms of paid CAPTCHA-breakers can (and do) still break through.

Filtering based on AI-driven text analysis can be effective for blog comments as well as e-mail. The problem with text analysis is that unless misclassification errors can be kept to just a couple of percent, you're still letting a lot of junk through. Consider a blog that receives 100 comments. Typically, 80 will be spam. An AI-based spam filter that catches 90 percent of spam will let 8 of those 80 bogus comments through. Since you had just 20 legitimate comments to begin with, you're left with a situation where over a quarter of your published comments (8 of 28) are bogus, and that assumes the filter never blocks a legitimate comment.
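To make the arithmetic concrete, here is a minimal Python sketch of the same back-of-the-envelope calculation. The function name and parameters are made up for illustration, and it assumes (as the example above does) that the filter never rejects a legitimate comment.

```python
def published_spam_share(total_comments, spam_rate, catch_rate):
    """Share of published comments that are spam.

    Assumes the filter catches `catch_rate` of all spam and never
    blocks a legitimate comment (no false positives).
    """
    spam = total_comments * spam_rate
    legitimate = total_comments - spam
    leaked = spam * (1 - catch_rate)  # spam that slips past the filter
    return leaked / (legitimate + leaked)


for catch_rate in (0.90, 0.98):
    share = published_spam_share(100, 0.80, catch_rate)
    print(f"{catch_rate:.0%} catch rate -> {share:.1%} of published comments are spam")
# 90% catch rate -> 28.6% of published comments are spam
# 98% catch rate -> 7.4% of published comments are spam
```

Even at a 98 percent catch rate, roughly one published comment in fourteen is still junk under these assumptions.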

Comment spam mitigation technology is obviously a work in progress. Some interesting new work in this area is being pursued by none other than Dries Buytaert (creator of Drupal). Buytaert, along with university classmate Benjamin Schrauwen, recently introduced Mollom, a comment-filtering SaaS offering (free for non-commercial users). Buytaert and Schrauwen hold doctorates in computer science. Schrauwen's is in machine learning.

Mollom relies mostly on proprietary text analysis techniques, but takes a multi-tiered approach. When a comment arrives for analysis, it is classified as ham (good), spam (bad), or uncertain. When the content's quality is uncertain, Mollom issues a CAPTCHA challenge to the submitter. If the submitter passes the CAPTCHA test, the content is marked as good. Buytaert and Schrauwen claim that Mollom (currently used by 1,459 websites) is 99.78 percent effective.
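The tiered flow described above can be sketched in a few lines of Python. This is not Mollom's API or algorithm (those are not public). The `spam_score` input, the thresholds, and the function names are all hypothetical stand-ins for whatever the text analysis actually produces.

```python
from enum import Enum
from typing import Callable


class Verdict(Enum):
    HAM = "ham"
    SPAM = "spam"
    UNSURE = "unsure"


def classify(spam_score: float, ham_below: float = 0.2, spam_above: float = 0.8) -> Verdict:
    """Map a hypothetical 0..1 spam score onto three tiers (thresholds are invented)."""
    if spam_score < ham_below:
        return Verdict.HAM
    if spam_score > spam_above:
        return Verdict.SPAM
    return Verdict.UNSURE


def should_publish(spam_score: float, passes_captcha: Callable[[], bool]) -> bool:
    """Publish ham, reject spam, and fall back to a CAPTCHA only when unsure."""
    verdict = classify(spam_score)
    if verdict is Verdict.HAM:
        return True
    if verdict is Verdict.SPAM:
        return False
    # Uncertain content: the CAPTCHA is presented only in this case.
    return passes_captcha()
```

The appeal of this design is that most legitimate posters never see a CAPTCHA at all; only submissions in the uncertain middle band are challenged.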

What makes Mollom better than, say, Akismet? It's hard to know, at this point. Mollom's algorithms are a closely guarded secret (but are likely to be the original work of Schrauwen). Akismet says only that it runs "hundreds of tests" on every incoming comment (which sounds more than a bit Rube Goldberg-ish).

Mollom's most important differentiator may ultimately be its ability to act as an OpenID reputation service. For every incoming request associated with an OpenID value, Mollom updates the reputation of that ID based on the scoring of the associated comment(s). Over time, the trustworthiness of any user who has an OpenID becomes a simple table lookup rather than an elaborate exercise in artificial intelligence.
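As a rough illustration of the "table lookup" idea, the Python sketch below keeps a running tally of ham and spam verdicts per OpenID. How Mollom actually stores or weights reputation is not disclosed, so the `ReputationTable` class, its methods, and the ham-fraction score are assumptions made purely for the example.

```python
from collections import defaultdict


class ReputationTable:
    """Hypothetical per-OpenID tally of past comment verdicts."""

    def __init__(self):
        self._counts = defaultdict(lambda: {"ham": 0, "spam": 0})

    def record(self, openid: str, verdict: str) -> None:
        """Update an identity's history after one of its comments is scored."""
        if verdict not in ("ham", "spam"):
            raise ValueError("verdict must be 'ham' or 'spam'")
        self._counts[openid][verdict] += 1

    def trust(self, openid: str):
        """Return the fraction of this identity's comments scored as ham,
        or None if the identity has never been seen."""
        counts = self._counts.get(openid)  # .get() does not create an entry
        if counts is None:
            return None
        return counts["ham"] / (counts["ham"] + counts["spam"])


table = ReputationTable()
for verdict in ("ham", "ham", "ham", "spam"):
    table.record("https://example.org/alice", verdict)

print(table.trust("https://example.org/alice"))    # 0.75
print(table.trust("https://example.org/stranger")) # None
```

Once an identity has enough history, a lookup like `trust()` can stand in for a fresh round of text analysis on every comment.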

If you're in the process of selecting a Web CMS and/or Social Software vendor, and you plan to deploy public-facing blogs or wikis, be sure to take comment spam mitigation into account. Moderation of comments (by humans) is inherently costly. A SaaS offering like Mollom or Akismet may not completely eliminate the need for moderation but could be money well spent. One thing is certain: spam is something you need to budget for and architect around. Ask your vendors what kind of help you can expect from them. And don't settle for the sound of crickets chirping.

- Submitted by: Kas Thomas, Analyst - Twitter: kasthomas
