Beating the RSS crunch with aggregation/Bloglines

How to alleviate RSS traffic by using aggregation and Bloglines.

Readers who don't follow RSS closely, or the congestion problems caused by newsreaders that poll feeds too frequently, might have missed the announcement on September 28, 2004, that Bloglines had created a set of APIs that developers can use to access its aggregated blog database. Not coincidentally, Bloglines announced the same day that the makers of the blog- and newsreading applications FeedDemon, NetNewsWire and Blogbot have decided to use the Bloglines APIs in their popular desktop software.

Understanding why this matters requires some quick background. Many RSS readers poll every Web site to which their users have subscribed, and some do so by default as often as every 5 minutes, so it's not hard to see how a server hosting a popular collection of blogs can quickly become overwhelmed with update requests. In fact, recent news reports (for example, the eWeek story from 9/20/2004, "RSS Comes with Bandwidth Price Tag") have stressed that smaller sites can be overwhelmed, and even the largest sites heavily taxed, by keeping up with RSS update polls. A nerd joke currently in circulation even makes the point that a successful RSS site is indistinguishable from a distributed denial of service attack (because a whole bunch of clients around the Internet make heavy and simultaneous requests for service)!
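To put rough numbers on the polling problem, here is a back-of-the-envelope sketch. The 5-minute interval comes from the paragraph above; the subscriber count and feed size are made-up figures chosen only for illustration, and the function names are hypothetical.

```python
# Rough estimate of the load that fixed-interval polling puts on one feed.
# The 5-minute interval is from the article; the subscriber count and
# feed size below are illustrative assumptions, not measured values.

SECONDS_PER_DAY = 24 * 60 * 60


def daily_polls(subscribers: int, poll_interval_minutes: int) -> int:
    """Requests per day when every subscriber polls on a fixed interval."""
    polls_per_client = SECONDS_PER_DAY // (poll_interval_minutes * 60)
    return subscribers * polls_per_client


def daily_bytes(subscribers: int, poll_interval_minutes: int, feed_bytes: int) -> int:
    """Bytes served per day if every poll downloads the whole feed."""
    return daily_polls(subscribers, poll_interval_minutes) * feed_bytes


if __name__ == "__main__":
    subscribers = 10_000  # assumed audience size
    feed_bytes = 20_000   # assumed feed document size (about 20 KB)
    polls = daily_polls(subscribers, 5)
    total = daily_bytes(subscribers, 5, feed_bytes)
    print(f"{polls:,} requests/day, {total / 1e9:.1f} GB/day")
```

Under these assumptions, each client makes 288 requests a day, so 10,000 subscribers generate 2,880,000 requests and 57.6 GB of traffic daily for a single feed, whether or not anything in it has changed.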

What Bloglines does for RSS feeds is very much like what Google and Yahoo do for popular Web pages and information: it compiles this content into its own database, so that requests for frequently accessed feeds are satisfied from a local cache instead of requiring the original server to handle yet another update or access request. If users and mirror sites draw their RSS feeds from Bloglines instead of the original site, Bloglines can refresh its copy quite frequently at a much lower bandwidth cost to that site than if each of those users and sites polled it directly. Thus, blog aggregation databases hold an important key to keeping RSS update traffic manageable.
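The effect of putting a cache between readers and the original site can be shown with a toy model. This is only a sketch of the general idea, not a description of how Bloglines is built; every class name and number in it is hypothetical.

```python
from typing import Optional, Tuple

# Toy model: count the requests that reach the origin server when clients
# poll it directly versus through a caching aggregator.
# All names and figures are hypothetical.


class OriginServer:
    def __init__(self) -> None:
        self.hits = 0

    def fetch_feed(self) -> str:
        self.hits += 1
        return "<rss>...</rss>"


class Aggregator:
    """Fetches from the origin at most once per refresh interval."""

    def __init__(self, origin: OriginServer, refresh_seconds: int) -> None:
        self.origin = origin
        self.refresh_seconds = refresh_seconds
        self._cached: Optional[str] = None
        self._fetched_at = 0

    def fetch_feed(self, now: int) -> str:
        stale = (
            self._cached is None
            or now - self._fetched_at >= self.refresh_seconds
        )
        if stale:
            # Only a stale cache costs the origin a request.
            self._cached = self.origin.fetch_feed()
            self._fetched_at = now
        return self._cached


def simulate(clients: int, poll_seconds: int, duration_seconds: int,
             refresh_seconds: int) -> Tuple[int, int]:
    """Return (origin hits with direct polling, origin hits via aggregator)."""
    direct = OriginServer()
    shielded = OriginServer()
    aggregator = Aggregator(shielded, refresh_seconds)
    for now in range(0, duration_seconds, poll_seconds):
        for _ in range(clients):
            direct.fetch_feed()
            aggregator.fetch_feed(now)
    return direct.hits, shielded.hits


if __name__ == "__main__":
    print(simulate(clients=1000, poll_seconds=300,
                   duration_seconds=3600, refresh_seconds=300))
```

With 1,000 clients polling every 5 minutes for an hour, and the aggregator refreshing its copy every 5 minutes, the directly polled origin handles 12,000 requests while the shielded one handles 12.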

Furthermore, by publishing a freely available set of Web services APIs, Bloglines permits developers to write code that reads RSS and Atom feeds (Atom is a related blog feed format, also based on XML) straight from the Bloglines database. A quick review of the Bloglines API documentation shows that the APIs are simple and straightforward. In fact, there are only three small APIs in total:

  • Notifier API: used to gather a count of unread items in a Bloglines account (uses a single update function)
  • Sync API: used to access subscription lists and/or unread items (uses two functions: listsubs to retrieve subscription data on a per-account basis and getitems to retrieve blog entries on a per-subscription basis)
  • Blogroll API: used to incorporate subscription lists into other sites or applications (provides a mechanism to create lists of links to other blogs, using a single chunk of pre-defined HTML markup that loads a script from the Bloglines server)
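As a rough illustration of how small a client wrapper around the Notifier and Sync functions could be, the sketch below only builds request URLs. The function names update, listsubs and getitems come from the list above; the base URL and the query parameter names are placeholders invented for this example, so check the Bloglines API documentation for the real endpoints, parameters and authentication requirements.

```python
from urllib.parse import urlencode

# Placeholder base URL, NOT the real Bloglines endpoint.
BASE_URL = "https://aggregator.example"


def build_url(function: str, **params: str) -> str:
    """Join a function name and optional query parameters into a URL."""
    query = urlencode(sorted(params.items()))
    return f"{BASE_URL}/{function}" + (f"?{query}" if query else "")


def notifier_update_url(user: str) -> str:
    # Notifier API: unread count for one account.
    # The "user" parameter name is an assumption.
    return build_url("update", user=user)


def sync_listsubs_url() -> str:
    # Sync API: subscription list for the authenticated account.
    return build_url("listsubs")


def sync_getitems_url(subscription_id: str) -> str:
    # Sync API: entries for one subscription.
    # The "subscription" parameter name is an assumption.
    return build_url("getitems", subscription=subscription_id)


if __name__ == "__main__":
    print(notifier_update_url("reader@example.com"))
    print(sync_listsubs_url())
    print(sync_getitems_url("42"))
```

Keeping URL construction separate from the network call makes it easy to swap in the documented endpoint and parameters later without touching the rest of the client.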

This is pretty simple stuff, and it makes it easy for Web developers and site managers to add pretty spiffy blog support to their offerings. It also comes with one more, possibly substantial, advantage: those who've worked with RSS (and, to some extent, Atom) know that such feeds come in a variety of formats, each with its own little eccentricities and minor format and access variations. By grabbing feeds from Bloglines, you're guaranteed that all incoming feeds use the same, well-documented format, so you can promise your users anything that Bloglines can deliver. Believe me, that's a lot!
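To see the kind of format handling an aggregator spares you, here is a minimal sketch of what a client has to do when it reads feeds directly: detect the format and map each one onto a common shape. It handles only RSS 2.0 and Atom 1.0 (an assumed choice of versions; Atom 1.0 postdates this article), and the sample documents and the parse_entries name are invented for the example.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Atom 1.0 namespace in ElementTree's {uri}tag notation (assumed version).
ATOM_NS = "{http://www.w3.org/2005/Atom}"

RSS_SAMPLE = (
    '<rss version="2.0"><channel><title>Example</title>'
    "<item><title>Hello</title><link>http://example.com/1</link></item>"
    "</channel></rss>"
)

ATOM_SAMPLE = (
    '<feed xmlns="http://www.w3.org/2005/Atom"><title>Example</title>'
    '<entry><title>Hello</title><link href="http://example.com/1"/></entry>'
    "</feed>"
)


def parse_entries(xml_text: str) -> List[Dict[str, str]]:
    """Return title/link pairs from an RSS 2.0 or Atom 1.0 document."""
    root = ET.fromstring(xml_text)
    entries: List[Dict[str, str]] = []
    if root.tag == "rss":
        # RSS keeps the link as element text.
        for item in root.iter("item"):
            entries.append({
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
            })
    elif root.tag == ATOM_NS + "feed":
        # Atom is namespaced and keeps the link in an href attribute.
        for entry in root.iter(ATOM_NS + "entry"):
            link = entry.find(ATOM_NS + "link")
            entries.append({
                "title": entry.findtext(ATOM_NS + "title", default=""),
                "link": link.get("href", "") if link is not None else "",
            })
    else:
        raise ValueError(f"unrecognized feed root element: {root.tag}")
    return entries


if __name__ == "__main__":
    print(parse_entries(RSS_SAMPLE))
    print(parse_entries(ATOM_SAMPLE))
```

Both samples reduce to the same title and link, but only because the code branches on the format. Every additional feed dialect means another branch, which is exactly the work a single aggregator format removes.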


Ed Tittel is a full-time writer and trainer whose interests include XML and development topics, along with IT Certification and information security topics. E-mail Ed at etittel@techtarget.com with comments, questions, or suggested topics or tools for review.


This was first published in October 2004
