Via Adrian Holovaty: the W3C’s Semantic data extractor, which tries to extract some information from a semantically rich HTML document.
I also mentioned my idea in the comments:
What might also be of interest is a tool that clearly displays the outline of a document (as the W3C Validator does) and explains that the words within <hn> tags have more meaning to search engines and other software than the words within <font> and <b> tags.
Again, the basic idea of “here’s what a computer sees as important” with the extra push of “here’s what search engines see as important.”
Hmmm, perhaps “have more meaning” should really say “are more important” or something to that effect.
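For the curious, here is a rough sketch of what the outline half of that tool might look like, in Python using only the standard library's `html.parser`. The names `OutlineParser`, `extract_outline`, and `format_outline` are made up for illustration, and this is not how the W3C tools actually work; it only shows that heading text is easy for software to pick out, while text inside `<font>` and `<b>` is not treated as structure.

```python
import re
from html.parser import HTMLParser

# Matches h1 through h6 only (HTMLParser lowercases tag names for us).
HEADING = re.compile(r"h([1-6])$")


class OutlineParser(HTMLParser):
    """Hypothetical helper: collects (level, text) for each heading."""

    def __init__(self):
        super().__init__()
        self.outline = []    # list of (level, heading text)
        self._level = None   # level of the heading we are inside, if any
        self._buf = []       # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        match = HEADING.match(tag)
        if match:
            self._level = int(match.group(1))
            self._buf = []

    def handle_data(self, data):
        # Only text inside a heading is kept; <font> and <b> text
        # elsewhere in the page is ignored.
        if self._level is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and HEADING.match(tag):
            text = " ".join("".join(self._buf).split())  # tidy whitespace
            self.outline.append((self._level, text))
            self._level = None


def extract_outline(html):
    """Return the document's headings as a list of (level, text)."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


def format_outline(outline):
    """Indent each heading by its level to show the document outline."""
    return "\n".join("  " * (level - 1) + text for level, text in outline)
```

Feeding it a page and printing `format_outline(extract_outline(page))` gives an indented list of headings, which is roughly the "here's what a computer sees as important" view. Nested or malformed headings are not handled; this is a sketch, not a validator.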
Now I just need to figure out how to semantically mark up the bit above where I quoted myself on my own site with a comment I made on another site. Ah, that’s a task for another day…