In previous tips on Web syndication, we dug into the Resource Description Framework (RDF), on which so much syndication information hangs, and looked at the differences between RSS versions, including what makes RSS 1.0 profoundly divergent from RSS 2.0. In this tip, we'll look at Atom 1.0, an attempt to carry the torch originally lit for RSS 0.9x and 2.0 further forward, without muddying the waters by sticking to RSS terminology. Atom continues the fork of the RSS path that stands for Really Simple Syndication: it aims to provide a straightforward, human-readable syndication markup language while also supporting a "robust, flexible, and consistent content model" (in the words of James Snell, an IBM software engineer and a contributor to RFC 4287, from his excellent article "An overview of the Atom 1.0 Syndication Format").
Atom 1.0 has become increasingly popular since its release in 2005 as an IETF RFC (4287, currently a Proposed Standard) for many reasons, including:
- Simple, straightforward syntax.
- Ability to support many types of content and pointers to content, including plain text, escaped HTML, well-formed XHTML, XML of just about any kind, Base64-encoded binary and URI pointers to content not included in a feed. By contrast, RSS handles only plain text and escaped HTML content.
- Disambiguating features to discriminate among feeds and entries based on the presence of a unique identifier (which may be a URI for a blog or some other Web resource, or even a 128-bit Globally Unique Identifier, also known as a GUID), a title (which provides a short, human-readable subject for each entry) and a timestamp to indicate when the most recent update occurred. When multiple sources for the same feed are present, this combination of data elements makes it easy to decide which one is most current and therefore which one should be used.
- A well-defined model for extending Atom markup, including clear specifications as to where extension elements may or may not appear, clear identification of which extensions are language-sensitive (and therefore affected by xml:lang attributes) and descriptions of how Atom parsers or handlers must respond when they encounter an unknown or unfamiliar extension in the markup.
- Clear specifications for required and optional metadata elements in the Atom namespace, so that authors and contributors may be readily identified, feed categories provided and all kinds of other descriptive information supplied (generator, id, link, logo, rights, source and summary are all defined among Atom metadata elements).
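To make the content-type and disambiguation points above more concrete, here is a minimal Python sketch that uses only the standard library. It reads the identifier, title and timestamp from a small feed, lists the declared content type of each entry, and picks the most current of two copies of the same feed. The feed text, the urn:uuid identifiers, the author name and the helper names (make_feed, feed_summary, most_current) are invented for illustration; only the Atom namespace URI and the element names are taken from RFC 4287.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def make_feed(updated):
    """Return a hypothetical feed document with the given feed-level timestamp."""
    return f"""<feed xmlns="http://www.w3.org/2005/Atom">
  <id>urn:uuid:60a76c80-d399-11d9-b93c-0003939e0af6</id>
  <title>Example Feed</title>
  <updated>{updated}</updated>
  <author><name>Jane Doe</name></author>
  <entry>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <title>Plain text entry</title>
    <updated>2005-12-13T18:30:02Z</updated>
    <content type="text">Hello, Atom.</content>
  </entry>
  <entry>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6b</id>
    <title>Escaped HTML entry</title>
    <updated>2005-12-12T09:00:00Z</updated>
    <content type="html">&lt;p&gt;Hello, &lt;em&gt;Atom&lt;/em&gt;.&lt;/p&gt;</content>
  </entry>
</feed>"""


def parse_updated(text):
    """Parse an Atom timestamp; 'Z' is rewritten so older Pythons accept it."""
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def feed_summary(xml_text):
    """Collect the feed's id, title, timestamp and per-entry content types."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.findtext("atom:id", namespaces=ATOM_NS),
        "title": root.findtext("atom:title", namespaces=ATOM_NS),
        "updated": parse_updated(root.findtext("atom:updated", namespaces=ATOM_NS)),
        # The type attribute defaults to "text" when it is absent.
        "content_types": [
            content.get("type", "text")
            for content in root.findall("atom:entry/atom:content", ATOM_NS)
        ],
    }


def most_current(copies):
    """Given several copies of one feed, return the summary of the newest copy."""
    summaries = [feed_summary(text) for text in copies]
    if len({summary["id"] for summary in summaries}) != 1:
        raise ValueError("copies do not share the same feed id")
    return max(summaries, key=lambda summary: summary["updated"])
```

Calling most_current with two copies produced by make_feed, each with a different timestamp, returns the summary of the copy whose feed-level updated value is later. A real aggregator would need to handle more cases (missing elements, XHTML content, out-of-line content via a src attribute), so treat this as a sketch rather than a complete parser.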
These are just the major features of Atom, but the markup language also incorporates mechanisms that allow individual entries to exist outside a specific feed, thereby supporting improved options for aggregation and for how syndicated content may be distributed. Likewise, it supports timestamps compatible with both XML Schema and ISO 8601, relative URIs using the xml:base attribute, enhanced internationalization and language support through Internationalized Resource Identifiers (IRIs) and xml:lang, improved accessibility features, mechanisms to simplify feed subscription processes and even a MIME media type to identify Atom 1.0 documents, among other features.
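As a rough illustration of two of these features, the Python sketch below parses an entry that stands on its own as a document (its root element is entry rather than feed) and resolves a relative link against xml:base. The entry text, the example.org URLs, the author name and the helper name resolved_links are hypothetical. The sketch also makes a simplifying assumption: it reads xml:base only from the root element, whereas in real documents the attribute may appear on nested elements as well.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

ATOM = "{http://www.w3.org/2005/Atom}"
XML = "{http://www.w3.org/XML/1998/namespace}"

# A hypothetical standalone entry document: no enclosing feed element.
ENTRY_DOC = """<entry xmlns="http://www.w3.org/2005/Atom"
       xml:base="http://example.org/blog/" xml:lang="en">
  <id>http://example.org/blog/2005/atom-tip</id>
  <title>A standalone entry</title>
  <updated>2005-12-13T18:30:02Z</updated>
  <author><name>Jane Doe</name></author>
  <link rel="alternate" type="text/html" href="2005/atom-tip.html"/>
</entry>"""


def resolved_links(xml_text):
    """Return each link href resolved against the root element's xml:base."""
    root = ET.fromstring(xml_text)
    base = root.get(XML + "base", "")  # simplification: root-level base only
    return [urljoin(base, link.get("href")) for link in root.iter(ATOM + "link")]


def entry_language(xml_text):
    """Return the xml:lang value declared on the root element, if any."""
    return ET.fromstring(xml_text).get(XML + "lang")


if __name__ == "__main__":
    print(resolved_links(ENTRY_DOC))
    print(entry_language(ENTRY_DOC))
```

With the sample entry, the relative href 2005/atom-tip.html resolves to http://example.org/blog/2005/atom-tip.html, and the declared language is en.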
In addition, Atom 1.0 offers improved support for podcasting. Whereas the RSS 2.0 enclosure tag supports only one audio format per item (and thus requires multiple feeds to support multiple audio formats), Atom 1.0 markup supports multiple enclosure links within a single feed, where each link may be explicitly typed to identify a different format (MP3, BitTorrent, WMA, Ogg Vorbis and so forth).
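The enclosure mechanism described above can be sketched in a few lines of Python. In Atom, an enclosure is a link element with rel="enclosure", and its type attribute carries the media type. The episode URLs, the byte lengths, the specific media type strings chosen for each file and the helper names (enclosures, pick_enclosure) are assumptions made for this example, not something a particular feed or tool prescribes.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# A hypothetical podcast entry offering the same episode in several formats.
PODCAST_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>http://example.org/podcast/episode-1</id>
  <title>Episode 1</title>
  <updated>2005-12-13T18:30:02Z</updated>
  <link rel="enclosure" type="audio/mpeg" length="1337000"
        href="http://example.org/audio/ep1.mp3"/>
  <link rel="enclosure" type="application/ogg" length="1200000"
        href="http://example.org/audio/ep1.ogg"/>
  <link rel="enclosure" type="application/x-bittorrent"
        href="http://example.org/audio/ep1.torrent"/>
</entry>"""


def enclosures(xml_text):
    """Map each enclosure's declared media type to its href."""
    root = ET.fromstring(xml_text)
    return {
        link.get("type"): link.get("href")
        for link in root.iter(ATOM + "link")
        if link.get("rel") == "enclosure"
    }


def pick_enclosure(xml_text, preferred_types):
    """Return the href of the first preferred type the entry offers, else None."""
    available = enclosures(xml_text)
    for media_type in preferred_types:
        if media_type in available:
            return available[media_type]
    return None
```

A client that prefers Ogg but can fall back to MP3 might call pick_enclosure(PODCAST_ENTRY, ["application/ogg", "audio/mpeg"]) and download whatever href comes back, all from a single entry in a single feed.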
Finally, Atom 1.0 has proven to be a richer and more powerful mechanism for defining and delivering syndicated feeds than either version of RSS. This explains why most current tools support Atom as well as RSS for syndication and why so many content developers are turning to Atom for syndication purposes. It's nice to observe that the intent of Atom's original designers to "create a more technologically sound design" and to "let the new format [Atom] to work in harmony with the architecture as well as the culture of the Web" has apparently been realized (quotations come from Uche Ogbuji's seminal Atom article "Use the Atom format for syndicating news and more", published in May 2004).
Ed Tittel is a full-time writer and trainer whose interests include XML and development topics, along with IT certification and information security topics. Among his many XML projects are XML For Dummies, 4th edition (Wiley, 2005) and Schaum's Easy Outline of XML (McGraw-Hill, 2004). E-mail Ed at firstname.lastname@example.org with comments, questions or suggested topics or tools for review.