Consistent Microformats – David Shanske

One of the problems in consuming microformats is consistency. People structure their pages in a variety of different ways.

Many people have written code to solve this problem. I do it in my library, Parse This. Aaron Parecki does it in his XRay library. The Microsub specification has a stricter jf2 output so that the client does not have to make all sorts of checks.

This is the point. It is easier to consume a clean and consistent parsed microformats structure. Some of this would probably be solved by some consensus on the matter.
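To illustrate why a stricter structure is easier to consume, here is a minimal Python sketch that flattens one parsed microformats2 item into a simpler, jf2-style dictionary. It assumes the usual parsed mf2 JSON shape (a `type` list and a `properties` dictionary whose values are always lists). The function names are hypothetical, and this is a simplification for illustration, not the full jf2 rules and not the code used by Parse This or XRay.

```python
def flatten_value(value):
    """Recursively flatten nested mf2 items; leave other values untouched."""
    if isinstance(value, dict) and "type" in value and "properties" in value:
        return to_jf2_like(value)
    return value


def to_jf2_like(item):
    """Flatten one parsed mf2 item into a simpler jf2-style dict (hypothetical helper)."""
    out = {}
    types = item.get("type", [])
    if types:
        first = types[0]
        # "h-entry" becomes "entry"; only the first type is kept in this sketch.
        out["type"] = first[2:] if first.startswith("h-") else first
    for name, values in item.get("properties", {}).items():
        flat = [flatten_value(v) for v in values]
        if not flat:
            continue
        # A single value is unwrapped so the consumer does not need to index [0].
        out[name] = flat[0] if len(flat) == 1 else flat
    return out


item = {
    "type": ["h-entry"],
    "properties": {
        "name": ["Hello"],
        "category": ["a", "b"],
        "author": [{"type": ["h-card"], "properties": {"name": ["David"]}}],
    },
}
simple = to_jf2_like(item)
```

With the flattened form, a client can read `simple["name"]` directly instead of checking whether the property exists, whether it is a list, and whether that list is empty.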

So, what do Parse This and its ilk do? I lack a name for this sort of code.

It has two options: feed or single return.
Feed tries to identify and standardize an h-feed. This means that if there are multiple top-level h- items, it will try to convert them into an h-feed.
Single tries to identify the top-level h- item that matches the URL of the page.
In both cases, it runs authorship discovery to find the representative author and adds this as an author property to the h-feed, single h-entry, etc.
It also tries to run post type discovery.
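The feed and single options can be sketched roughly as follows. This is a minimal Python illustration that assumes the standard parsed mf2 JSON shape with a top-level `items` list. The function names `as_feed` and `as_single` are hypothetical, the URL comparison (ignoring a trailing slash) is an assumption, and authorship and post type discovery are left out. It is not how Parse This is actually implemented.

```python
def top_level_items(parsed):
    """Return the top-level items that carry at least one h- type."""
    return [
        item for item in parsed.get("items", [])
        if any(t.startswith("h-") for t in item.get("type", []))
    ]


def as_feed(parsed):
    """Feed option: return an existing h-feed, or wrap multiple items in one."""
    items = top_level_items(parsed)
    for item in items:
        if "h-feed" in item.get("type", []):
            return item
    if len(items) > 1:
        # Assumption: the synthetic feed holds the items as its children.
        return {"type": ["h-feed"], "properties": {}, "children": items}
    return items[0] if items else None


def as_single(parsed, page_url):
    """Single option: return the top-level item whose url matches the page."""
    target = page_url.rstrip("/")
    for item in top_level_items(parsed):
        urls = item.get("properties", {}).get("url", [])
        if any(isinstance(u, str) and u.rstrip("/") == target for u in urls):
            return item
    return None


parsed = {
    "items": [
        {"type": ["h-entry"], "properties": {"url": ["https://example.com/a"]}},
        {"type": ["h-entry"], "properties": {"url": ["https://example.com/b/"]}},
    ]
}
feed = as_feed(parsed)
single = as_single(parsed, "https://example.com/b")
```

Here `feed` is a synthetic h-feed with both entries as children, and `single` is the second entry. A real implementation would also need a fallback for pages where no item declares a matching URL.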

David Shanske
