The Road to the Semantic Web is Paved with Microformats
Google recently, and quietly, announced something huge: “rich snippets”. Rich snippets are smart previews displayed right on a search results page. While Google has long relied on snippets to attach a bit of information to each link (thus letting users know what they might expect on each linked page), rich snippets go a step further: they extract key characteristics of the page, be it the rating in a review or a person’s contact information. Google doesn’t have to guess; it knows. Google’s rich snippets are powered by microformats and RDFa, two semantic standards that are rapidly gaining adoption. Google’s implementation allows semantically marked-up web content (such as reviews and contact information) to be exposed, aggregated and averaged in a Google search results page. In short, after years in the lab, the web is at last, albeit quietly, becoming semantic!
Microformats are not a substitute for the semantic web; they are a stepping stone, and a very important one. They demonstrate the feasibility and value of adding semantic meaning to web page content. They do so using existing browsers and standards. They do so today, in the field, not in the lab. By making web pages understandable to both humans (also known as readers…) and machines, using current technologies, current browsers and minimal effort, microformats allow web content to be reliably understood and aggregated by search engines. The future is bright. Google could, for example, calculate an average review rating for a book from a list of semantically compliant sites. Google could also uniquely identify a user as a single human being across sites. The semantic web, a web of meaning, is finally taking shape.
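To make the book review example concrete, here is a minimal sketch in Python of how an aggregator might average ratings across pages. It assumes each page marks its rating with the "rating" class name from the hReview microformat; the sample pages, the RatingExtractor class and the average_rating function are hypothetical names made up for illustration, and this is in no way a description of how Google actually does it.

```python
from html.parser import HTMLParser


class RatingExtractor(HTMLParser):
    """Collects the numeric text of every element whose class list includes 'rating'.

    Simplification: assumes well-formed markup with explicit closing tags.
    """

    def __init__(self):
        super().__init__()
        self.ratings = []
        self._depth = 0    # nesting depth inside a 'rating' element (0 = outside)
        self._buffer = []  # text fragments collected inside the current element

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1          # nested tag inside a rating element
        elif "rating" in classes:
            self._depth = 1           # entering a rating element
            self._buffer = []

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
            if self._depth == 0:      # leaving the rating element
                text = "".join(self._buffer).strip()
                try:
                    self.ratings.append(float(text))
                except ValueError:
                    pass              # ignore non-numeric ratings in this sketch

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def average_rating(pages):
    """Return the mean of all ratings found in the given HTML strings, or None."""
    ratings = []
    for html in pages:
        extractor = RatingExtractor()
        extractor.feed(html)
        ratings.extend(extractor.ratings)
    return sum(ratings) / len(ratings) if ratings else None


# Hypothetical review snippets, as two different sites might publish them.
SITE_A = (
    '<div class="hreview"><span class="item"><span class="fn">Example Book</span></span> '
    'Rating: <span class="rating">4</span> out of 5</div>'
)
SITE_B = (
    '<div class="hreview"><span class="item"><span class="fn">Example Book</span></span> '
    '<abbr class="rating" title="3.5">3.5</abbr> stars</div>'
)

if __name__ == "__main__":
    print(average_rating([SITE_A, SITE_B]))
```

The point is that nothing here guesses: the markup itself says which number is the rating, so the aggregator only has to read it.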
I am convinced the semantic web is going to change the way we publish content and the way we exchange, correlate and aggregate information, both in the public domain and in the enterprise. It’s an exciting time for web professionals, who can look forward to building companies and next-generation systems that leverage semantic data.

In Toronto and interested in the semantic web? Join us at the Toronto Semantic Web group on LinkedIn.

[…] I recently made Betterdot’s Contact Us page both human and machine readable by adding hCard microformat markup to the underlying XHTML. This notion of “machine readable” content is arguably […]
Helping Machines Read, A Simple Microformat Case Study « The Server-Side Pad
May 19, 2009 at 10:12 pm
[…] largest collection of human knowledge ever assembled. It’s also slowly being re-engineered semantically as a giant global database. Thus opportunities abound for businesses to systematically mine the […]
Fee-Based APIs Are Coming (It’s a Good Thing!) « The Server-Side Pad
May 28, 2009 at 9:47 pm
[…] is a lot of opportunity to enrich the machine understanding of web communications, but I really think this problem is probably best addressed with some kind of […]
euri.ca , nsfw tag considered harmful (: euri.ca :)
June 14, 2009 at 6:11 pm
Fabien, I read with interest in the August issue of Technology Review about Wolfram’s ‘Alpha’ “answer engine”. Also mentioned were Google Squared and Yahoo’s SearchMonkey. While the Semantic Web was alluded to, there was a lot of emphasis placed on the complexity of “curating” data from across the web. I guess for hard facts that’s fine, but for secondary or tertiary data, there is a lot of interference going on. It seems to me that these answer engines are in conflict with the Semantic Web, since they are guessing at what the answers are by inferring what the input data are. Nothing was mentioned of RDF, which is a crucial source-level (from a data perspective) control on the semantics of the data. As a “data guy” I don’t think I would use an answer engine. At least with Wikipedia I know what the references are. Scary.
Ian Mosley
July 19, 2009 at 9:51 pm