Exploring Feed Discovery and Markup

Finding feeds to subscribe to is a challenge that I have explored while implementing code in support of the Yarns Microsub Server. I want to publish feeds in a way that others can find them: not just users, but also automated systems that present them to users.

So, let’s start with the homepage. Someone’s homepage may link to one or more additional feeds that contain content other than what is visible on the page.

To explore this, I’m looking at link types represented by rel values. This exploration is partly for my own notes, and partly for comment.

Rel-Home

Let’s start with rel-home. By adding rel="home" to a hyperlink, a page indicates that the destination of that hyperlink is the homepage of the site in which the current page appears. Rel-home should be used with rel-alternate to link from a permalink page to the feed for the site.

Rel values combine, and they also combine with other attributes. So rel="home alternate" would not simply mean that the destination is both a homepage and an alternate; it would mean a link from a permalink page to the feed for the site (home page).
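To make the idea of combined meanings concrete, here is a minimal sketch of how a consumer might read a rel attribute as a set of tokens rather than as a single value. The function names and the returned descriptions are my own illustrations, not part of any specification.

```python
def rel_tokens(rel_value):
    """Split a rel attribute into a set of lowercase tokens (order does not matter)."""
    return set(rel_value.lower().split())


def describe_link(rel_value):
    """Map a combination of rel tokens to the meanings discussed above (illustrative only)."""
    tokens = rel_tokens(rel_value)
    # Check the combination first, since it carries a more specific meaning.
    if {"home", "alternate"} <= tokens:
        return "feed for the site"
    if "home" in tokens:
        return "homepage of the site"
    if "alternate" in tokens:
        return "alternate representation of this page"
    return "no feed-related meaning"
```

Because the tokens are treated as a set, rel="alternate home" and rel="home alternate" would be read the same way.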

Rel-Alternate

By itself, rel-alternate means that the destination is an alternate representation or version of the current page. However, it combines in special ways with other rel values and other attributes to provide different, more specific meanings.

When used in combination with the type attribute (with a value other than that of the document itself; e.g. other than “text/html”), rel="alternate" means a link to a representation of the contents of the current document in a different format, as designated by the type attribute.
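A rough sketch of that rule follows, assuming the current document is text/html unless told otherwise. The function name alternate_format and its parameters are hypothetical.

```python
def alternate_format(rel_value, type_value, document_type="text/html"):
    """Return the media type of an alternate-format link, or None if it is not one."""
    tokens = set(rel_value.lower().split())
    if "alternate" not in tokens:
        return None
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = (type_value or "").split(";")[0].strip().lower()
    # A missing type, or the same type as the document, is not a different format.
    if not media_type or media_type == document_type:
        return None
    return media_type
```

For example, alternate_format("alternate", "application/atom+xml") yields the Atom media type, while the same call with "text/html" yields None.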

Rel-Feed

According to the Microformats wiki, rel-feed is for publishing and discovering a feed of the current page.  This is somewhat confusing, because many feeds currently use rel-alternate and type for this.

The entry cites a 2006 blog entry, which notes that the rel-alternate usage must be maintained due to adoption, but points out that feeds are not always alternative representations of a page.

Rel-feed, in this entry, is suggested as an explicit statement that something is a feed.

If combinations create additional meaning, feed could also be combined with alternate and home to indicate different relationships.

POSSE Post Discovery

The original rel-feed proposal was only supported on <link> elements, and not on <a> elements. There was no literature forbidding it, so in 2014, Bridgy began to search for the property on both elements.

This is all encapsulated in the POSSE Post Discovery algorithm. It notes that a rel-feed link of type text/html would be considered an h-feed and, if none is found, the page itself would be considered the author's unfiltered feed.
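The part of this that looks for rel-feed on both element types can be sketched with Python's standard library. This is only my reading of the description above, not Bridgy's actual code: the class and function names are hypothetical, and treating a missing type attribute as text/html is my assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RelFeedFinder(HTMLParser):
    """Collect rel-feed URLs from both <link> and <a> elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("link", "a"):
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        # Assumption: a missing type attribute is treated as text/html.
        media_type = (attr.get("type") or "text/html").lower()
        if "feed" in rels and href and media_type == "text/html":
            self.found.append(urljoin(self.base_url, href))


def find_rel_feeds(html, base_url):
    """Return rel-feed URLs, falling back to the page itself if none are found."""
    finder = RelFeedFinder(base_url)
    finder.feed(html)
    return finder.found or [base_url]
```

The fallback in find_rel_feeds mirrors the last step: when no rel-feed is advertised, the page itself is treated as the feed.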

Feed Discovery

The IndieWeb Wiki discusses a proposed way to determine a primary feed.

  • Namely, if you cannot subscribe to a URL, as it itself is not a feed, does it advertise a rel-feed?
  • If not, does the main object of the page contain a feed nested inside it?
  • Are there multiple feeds on the page, or multiple rel-feeds?
  • Identify which of the feeds has a url property set to that of the page and declare that the primary.
  • If there are multiple feeds on the page, and the URL has no fragment identifier in it, then there is no clear canonical version.

I’d like to modify that idea as follows. This is just a rough work in progress.

  • Is this a feed?
    • Yes – This is the primary feed for the URL.
      • Is there a link on this page to itself with the property rel-home?
        • If yes, then this is the primary feed for the site.
        • If no, it is merely the feed for that URL of the site.
      • Are there additional rel-feed entries on the page that point to different URLs?
        • If so, the main site page should be linked with rel-home so that the primary feed for the site can be determined.
    • No
      • Does it have a rel-alternate on the page? This would indicate an alternative representation of the page.
      • Does the rel-alternate have a type of application/rss+xml, application/atom+xml, or application/feed+json? Alternatively, if there is application/json, you can probably assume a JSON Feed.
      • Does it have a link with both rel-alternate and rel-home on the page? This would indicate the main feed of the site.
      • Does it have a rel-feed on the page? This would indicate links to alternative feeds. For example, a feed for a category or date archive of which this page was a part.
    • Are there multiple feeds on the page?
      • If so, assume the first is the primary
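The outline above can be sketched as code. This is only one possible reading of a rough work in progress: Link, discover, and FEED_TYPES are hypothetical names, the links are assumed to be already extracted from the page in document order, and counting application/json as a JSON Feed follows the "probably" in the list rather than any firm rule.

```python
from dataclasses import dataclass

# Media types treated as feeds; application/json is only an assumption.
FEED_TYPES = {
    "application/rss+xml",
    "application/atom+xml",
    "application/feed+json",
    "application/json",
}


@dataclass
class Link:
    rel: str
    href: str
    type: str = ""


def discover(page_url, page_is_feed, links):
    """Return the primary feed, the site feed, and any other feeds for a page."""
    site_feed = None
    candidates = []
    for link in links:
        rels = set(link.rel.lower().split())
        if page_is_feed:
            if "home" in rels and link.href == page_url:
                site_feed = page_url  # the page links to itself as home
            elif "feed" in rels and link.href != page_url:
                candidates.append(link.href)
        elif {"alternate", "home"} <= rels:
            site_feed = site_feed or link.href
            candidates.append(link.href)
        elif "alternate" in rels and link.type.lower() in FEED_TYPES:
            candidates.append(link.href)
        elif "feed" in rels:
            candidates.append(link.href)
    if page_is_feed:
        return {"primary": page_url, "site_feed": site_feed, "others": candidates}
    # With several feeds and nothing better to go on, assume the first is primary.
    primary = candidates[0] if candidates else None
    return {"primary": primary, "site_feed": site_feed, "others": candidates[1:]}
```

Keeping the site feed separate from the primary feed lets a reader offer both: the feed for the URL the user entered, and the feed for the site as a whole.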

Conclusions and Musings

Researching this brings me to several questions and conclusions.

  1. Who is actually checking for multiple rel values on the same property and deriving different meanings for them?
  2. I should publish the home rel and consume it in regard to feeds.
  3. Would a link with a rel="feed tag" indicate a tag archive? According to the rel-tag draft, the last path element of the URL indicates the actual tag, not the text.
  4. Would a link with a rel="feed" and a datetime property indicate a date archive?
  5. Would a link with a rel="feed author" indicate an author archive?
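If the answers to questions 3 through 5 turned out to be yes, a consumer might classify feed links along these lines. This is purely a hypothetical sketch: none of these combinations is established practice as far as I know, and the function names and the datetime_value parameter are my own inventions.

```python
from urllib.parse import unquote, urlparse


def tag_from_url(href):
    """Take the tag from the last path element of the URL, not from the link text."""
    path = urlparse(href).path.rstrip("/")
    return unquote(path.rsplit("/", 1)[-1])


def classify_feed_link(rel_value, href, datetime_value=None):
    """Guess what kind of archive a rel-feed link points to (hypothetical rules)."""
    rels = set(rel_value.lower().split())
    if "feed" not in rels:
        return None
    if "tag" in rels:
        return ("tag archive", tag_from_url(href))
    if "author" in rels:
        return ("author archive", href)
    if datetime_value:
        return ("date archive", datetime_value)
    return ("feed", href)
```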

Speaking as someone who continues to try to improve feed discovery, I think that being able to identify tag feeds, home feeds, date feeds, and author feeds from a page would address the recently noted desire of Chris Aldrich to have these links.

The difference is that, instead of being side files such as RSS or JSONFeed, they would be links to marked-up h-feeds, and human readable as such. I already have links to tag and date archives, and the title of the link is clear on what those items are.

But when a feed reader tries to do discovery on the page, should it find the primary feed, and any feeds a post is a part of and offer those?

Example in plaintext.

  • View All Posts by John Doe
  • View All Posts tagged with ‘indiewebcamp’
  • View All Posts made on November 12.

 

Still thinking about this, but welcome feedback.

David Shanske

My day job is in training for an airline. I also develop Indieweb WordPress plugins so that others can take control of their online identity.

