an introduction

Why Are There Permalinks?

Tom Coates wrote a nice piece recently (On Permalinks and Paradigms) about how a simple innovation, an afterthought really, was the key to the weblog revolution of the last couple of years.

A permalink is simply a direct, permanent link to a document, or some other object like an image. Because it is designed to never, ever change, it can safely be used to refer to the object from elsewhere on the internet. If you have ever clicked on a link and seen "404 File not found," then you have experienced first-hand why permalinks are a good idea.

Location location location

Things on websites change. Documents, even whole folders full of documents, get moved from one place to another. Less often, a filename is changed, or you switch content management software. Every now and then you might convert your site to an entirely different format, say, from HTML to XML.

All of those things will change the location of a document, and break any links, bookmarks, and printed references to it that people and/or robots have created over time.


There Should Be A Standard

...but there isn't. There is an oddly difficult-to-find recommendation, however. In 1998, Tim Berners-Lee wrote Cool URIs Don't Change under the W3C Style banner. He suggests that URIs be actively designed to be as simple and therefore stable as possible. His recommendation is to remove all of the following information from your URIs:
  • Author's name
  • Subject
  • Status
    directories like "old" and "draft" and so on
  • Access restrictions
  • File name extension
    This is a very common one. "cgi", even ".html" is something which will change. You may not be using HTML for that page in 20 years time, but you might want today's links to it to still be valid.
  • Software mechanisms
    Look for "cgi", "exec" and other give-away "look what software we are using" bits in URIs. Anyone want to commit to using perl cgi scripts all their lives?
He stresses the importance of creation date as one of the few document attributes that isn't likely to change (unless you're cooking the books). And it makes sense if you look at traditional journals, which have been organised successfully by publication year for centuries.
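
Two items on that checklist, file name extensions and software or status give-aways, are visible in the text of the URI itself, so they can be checked roughly by machine. The following is only a sketch of that idea: the function name uri_warnings, the example.org addresses, and the hint lists are illustrative assumptions of mine, not part of the W3C recommendation or of Berylium.

```python
import posixpath
from urllib.parse import urlsplit

# Illustrative hint lists (assumptions, far from exhaustive).
STATUS_HINTS = {"old", "draft"}
SOFTWARE_HINTS = {"cgi", "cgi-bin", "exec"}


def uri_warnings(uri):
    """Return a list of checklist items that this URI appears to violate."""
    segments = [s for s in urlsplit(uri).path.split("/") if s]
    warnings = []

    # A file name extension on the last path segment ties the URI to a format.
    if segments:
        ext = posixpath.splitext(segments[-1])[1]
        if ext:
            warnings.append(f"file name extension {ext}")

    # Directory names that reveal document status or the software in use.
    for segment in segments:
        if segment.lower() in STATUS_HINTS:
            warnings.append(f"status directory {segment}")
        if segment.lower() in SOFTWARE_HINTS:
            warnings.append(f"software mechanism {segment}")
    return warnings


print(uri_warnings("http://example.org/draft/cgi-bin/report.html"))
print(uri_warnings("http://example.org/document-395476"))
```

Author names, subjects, and access restrictions can't be spotted this way; those still need a human eye.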


How Does Berylium Generate Permalinks?

Only moderately well, according to Mr. Berners-Lee's recommendation. But better than most.

A Berylium permalink looks like http://sitename/objtype-id

That encodes three (four if you include http) pieces of information which are relatively unlikely to change. There is no filetype extension and no software signature.
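
To make those pieces concrete, here is a minimal sketch that takes a permalink of that shape apart. It is not Berylium's own code: the function name parse_permalink and the host example.org are made up for illustration, and it assumes the id is a plain decimal number.

```python
from urllib.parse import urlsplit


def parse_permalink(url):
    """Split http://sitename/objtype-id into (scheme, sitename, objtype, id)."""
    parts = urlsplit(url)
    # Split on the last hyphen, so an objtype that itself contains
    # hyphens would still parse (an assumption, not a documented rule).
    objtype, sep, obj_id = parts.path.lstrip("/").rpartition("-")
    if not sep or not objtype or not obj_id.isdecimal():
        raise ValueError(f"not a permalink: {url!r}")
    return parts.scheme, parts.netloc, objtype, int(obj_id)


print(parse_permalink("http://example.org/document-395476"))
# -> ('http', 'example.org', 'document', 395476)
```

Nothing in the result depends on a file format or on the software serving the page, which is the point.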

There is also no date, so "document", a popular objtype, will need to mean the same thing in 2246 that it does today. By 2246 there will also be 243 years' worth of unique ids. (But I suspect by then we'll already have said everything there is to say anyway. heh.)

Should there be a year field? Probably, as it would allow the server to use a lookup table to discover where it should look for "document-395476".
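
As a rough sketch of what such a lookup might look like: everything here is hypothetical, including the /year/objtype-id path layout, the store names, and the function name locate. Berylium permalinks currently carry no year at all.

```python
# Hypothetical table: which store holds objects created in a given year.
STORE_BY_YEAR = {
    2003: "archive-2003",
    2004: "archive-2004",
}


def locate(path, stores=STORE_BY_YEAR):
    """Resolve a hypothetical /year/objtype-id path to (store, objtype, id)."""
    year_text, _, rest = path.strip("/").partition("/")
    objtype, sep, obj_id = rest.rpartition("-")
    if not (year_text.isdecimal() and sep and objtype and obj_id.isdecimal()):
        raise ValueError(f"not a dated permalink: {path!r}")
    year = int(year_text)
    if year not in stores:
        raise LookupError(f"no store recorded for {year}")
    return stores[year], objtype, int(obj_id)


print(locate("/2003/document-395476"))
# -> ('archive-2003', 'document', 395476)
```

The year never changes once the object exists, so the table can grow or be repointed without touching any published link.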

The other caveat about the current permalink implementation is that it serves as a redirect, not a seemingly permanent URI. In other words, if you bookmark the page you're looking at, you won't be storing the permalink even if you used the permalink to get there. This will be changed prior to Berylium2's release.

In fact, it might make more sense to always redirect to the permalink from the human-readable URI. But Berylium tries to strike a balance between the way that humans conceive of website structure and the way that websites should be engineered. Providing the permalink on the page should be enough. We'll see.
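
The two policies can be compared side by side in a few lines. This is a toy model under stated assumptions, not Berylium's request handling: the path table, the function name respond, and the choice of a 302 status code are all invented for the example.

```python
# Hypothetical mapping from a human-readable path to its permalink.
PERMALINK_FOR = {"/intro/why-permalinks": "/document-42"}
HUMAN_FOR = {perm: human for human, perm in PERMALINK_FOR.items()}

REDIRECT = 302  # assumed status code; the article does not say which is used


def respond(path, canonical="human"):
    """Return (status, location) for a request under the chosen policy."""
    if canonical == "human" and path in HUMAN_FOR:
        # Behaviour described above: the permalink redirects to the readable URI.
        return REDIRECT, HUMAN_FOR[path]
    if canonical == "permalink" and path in PERMALINK_FOR:
        # Alternative: the readable URI redirects to the permalink.
        return REDIRECT, PERMALINK_FOR[path]
    if path in PERMALINK_FOR or path in HUMAN_FOR:
        return 200, path  # already at the canonical address
    return 404, path


print(respond("/document-42"))                                  # redirect to readable URI
print(respond("/intro/why-permalinks", canonical="permalink"))  # redirect to permalink
```

Under the first policy a bookmark ends up holding the human-readable address; under the second it holds the permalink.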

By Chris Snyder on June 17, 2003 at 8:54pm
