Monday, October 20, 2008

RDFa - W3C recommendation for adding RDF annotations to HTML documents

The folks over at W3C have been working on a new scheme, RDFa, that allows RDF annotations to be added to Web pages. Strictly speaking, the annotations are for XHTML documents rather than HTML in general. W3C just announced that RDFa is now a full-fledged "Recommendation" (a W3C standard) and has an updated Primer. According to the RDFa Primer:

Today's web is built predominantly for human consumption. Even as machine-readable data begins to appear on the web, it is typically distributed in a separate file, with a separate format, and very limited correspondence between the human and machine versions. As a result, web browsers can provide only minimal assistance to humans in parsing and processing web data: browsers only see presentation information. We introduce RDFa, which provides a set of XHTML attributes to augment visual data with machine-readable hints. We show how to express simple and more complex datasets using RDFa, and in particular how to turn the existing human-visible text and links into machine-readable data without repeating content.


The web is a rich, distributed repository of interconnected information organized primarily for human consumption. On a typical web page, an XHTML author might specify a headline, then a smaller sub-headline, a block of italicized text, a few paragraphs of average-size text, and, finally, a few single-word links. Web browsers will follow these presentation instructions faithfully. However, only the human mind understands that the headline is, in fact, the blog post title, the sub-headline indicates the author, the italicized text is the article's publication date, and the single-word links are categorization labels. The gap between what programs and humans understand is large.

What if the browser received information on the meaning of a web page's visual elements? A dinner party announced on a blog could be easily copied to the user's calendar, an author's complete contact information to the user's address book. Users could automatically recall previously browsed articles according to categorization labels (often called tags). A photo copied and pasted from a web site to a school report would carry with it a link back to the photographer, giving her proper credit. When web data meant for humans is augmented with hints meant for computer programs, these programs become significantly more helpful, because they begin to understand the data's structure.

RDFa allows XHTML authors to do just that. Using a few simple XHTML attributes, authors can mark up human-readable data with machine-readable indicators for browsers and other programs to interpret. A web page can include markup for items as simple as the title of an article, or as complex as a user's complete social network.

RDFa benefits from the extensive power of RDF [RDF], the W3C's standard for interoperable machine-readable data. However, readers of this document are not expected to understand RDF. Readers are expected to understand at least a basic level of XHTML.
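
To give a rough feel for what "a few simple XHTML attributes" might look like in practice, here is a minimal sketch in Python. The XHTML fragment is illustrative only: it uses the RDFa `property` and `content` attributes with Dublin Core terms, and the title, author, and date values are made up for the example. The `extract_properties` helper is a hypothetical name, and it is not a conforming RDFa processor; it ignores prefix resolution, `about`, `rel`, and everything else in the processing rules, and simply lists the annotated values a program could pick up.

```python
import xml.etree.ElementTree as ET

# Illustrative XHTML fragment (an assumption, not taken from a real page).
# The visible text doubles as the machine-readable value, except for the
# date, where the "content" attribute supplies a machine-friendly form.
XHTML_SNIPPET = """
<div xmlns:dc="http://purl.org/dc/elements/1.1/">
  <h2 property="dc:title">The trouble with Bob</h2>
  <h3 property="dc:creator">Alice</h3>
  <span property="dc:date" content="2008-10-20">October 20, 2008</span>
</div>
"""


def extract_properties(xhtml):
    """Return (property, value) pairs for elements carrying a property attribute.

    Hypothetical helper for illustration only; it does not implement the
    RDFa processing rules.
    """
    root = ET.fromstring(xhtml)
    pairs = []
    for element in root.iter():
        name = element.get("property")
        if name is None:
            continue
        # Prefer an explicit "content" attribute, else use the visible text.
        value = element.get("content")
        if value is None:
            value = "".join(element.itertext()).strip()
        pairs.append((name, value))
    return pairs


if __name__ == "__main__":
    for name, value in extract_properties(XHTML_SNIPPET):
        print(name, "=", value)
```

The point of the sketch is only that the same markup serves both audiences: a browser renders the headline and date as usual, while a program reading the attributes can tell which text is the title, who the author is, and what the date is.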

I personally have not studied RDFa yet, but I have a strong suspicion that it may be relevant to my interests.

OTOH, it may simply represent a steppingstone on the path to better things.

-- Jack Krupansky

