Yesterday Terry Ryan announced the launch of WhichElement.com, a site dedicated to the semantic side of HTML5. This is an area I had not dug into a lot previously. Terry is quite passionate about it and because of this I've taken a deeper look at the topic and found it to be quite interesting. I highly encourage you to check out the site and participate in the discussion. I thought I'd take a few minutes and discuss a bit about the technology in use on the site.
As Terry mentioned in his blog post, he wanted the site to be as lightweight as possible. This meant no back-end server, no database, etc. Basically just a simple web server and that's it. This presented a problem for content creators though. We wanted to make it as easy as possible for folks to write new articles, which meant minimizing the size of the template they had to use. This also made it easier on us. If we ever changed the header, for example, it would suck to have to manually go through all of the articles and update each one.
Obviously a server-side solution would make this trivial. You would take your article template and simply add directives to include a global header and footer. (Technically you could also use Server Side Includes. Does anyone still use those?) But without a server-side program in place, what could we do?
My solution was what I thought to be a simple enough technique. Given a request for an article, say whichelement.com/articles.html?article=address, I would use JavaScript to handle:
- Parsing out the requested content, in this case "address"
- Performing an Ajax request to load in the address content, which would be a file with minimal layout and focused entirely on the specific content of the article
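As a rough sketch of those two steps (not the code that ended up on the site), the parsing and the mapping to a content file could look something like this. The helper names and the "/articles/" folder are assumptions made up for illustration, and the sample URL has "http://" added so it can be parsed.

```javascript
// Hypothetical helper: pull the "article" query parameter out of a URL.
function getRequestedArticle(href) {
  var url = new URL(href);
  // searchParams.get returns null when the parameter is missing
  return url.searchParams.get("article");
}

// Hypothetical helper: map an article name to the file holding its content.
// The folder layout here is an assumption, not the site's real layout.
function contentUrlFor(article, folder) {
  return folder + encodeURIComponent(article) + "/index.html";
}

var article = getRequestedArticle("http://whichelement.com/articles.html?article=address");
console.log(article);
console.log(contentUrlFor(article, "/articles/"));
```

In the browser you would pass window.location.href to the first helper and hand the resulting path to your Ajax call.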
This turned out to be rather simple. JavaScript gives you access to the URL. Given that, it's then trivial to do an Ajax request to get the content. We decided it would be nice to make the URLs a bit cleaner. If you go to the site now, you can see the form we have in place: http://whichelement.com/concepts/address. To get this to work, I "cheated" a bit and made use of an Apache URL rewrite scheme:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ engine.html [QSA,L]
This logic notices a request for a nonexistent file or folder and rewrites the request to engine.html. The engine file contains the basic site template with certain aspects empty. Back on the JavaScript side we have this being run on startup:
function doRoute(folder) {
    var loc = window.location.href;
    // strip the final / if there is one
    if (loc.substr(loc.length - 1, 1) == "/") loc = loc.substr(0, loc.length - 1);
    var parts = loc.split("/");
    // the last URL segment names the article to load
    var target = parts[parts.length - 1];
    window.document.title = target + " " + window.document.title;
    $("#tagcrumb").text(target);
    $.ajax({
        url: folder + target + "/index.html",
        success: function(res, code) {
            $("#mainArticle").html(res);
            resizeFooter();
        },
        error: function(err) {
            // assume 404 and load in the "bad" article, it better exist...
            $("#mainArticle").load(folder + "bad/index.html");
            $("#tagcrumb").text("unobtainium");
        }
    });
}
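One edge case worth noting: window.location.href includes any query string or hash, so a URL such as /concepts/address?x=1 would produce a target of "address?x=1". Here is a minimal sketch of a more defensive way to pull out the last path segment. The targetFromUrl name is hypothetical and this is not what the site runs.

```javascript
// Hypothetical helper, not part of the site's code.
// Returns the last non-empty path segment of a URL, ignoring any
// query string, hash, or trailing slash. Returns "" for the site root.
function targetFromUrl(href) {
  // pathname excludes the ?query and #hash portions
  var path = new URL(href).pathname;
  // filter(Boolean) drops the empty strings produced by leading,
  // trailing, or doubled slashes
  var segments = path.split("/").filter(Boolean);
  return segments.length ? segments[segments.length - 1] : "";
}

console.log(targetFromUrl("http://whichelement.com/concepts/address/?x=1#top"));
```

Inside doRoute this could stand in for the substr and split lines, called as targetFromUrl(window.location.href).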
The argument, folder, is simply an abstraction. We've got two instances of this engine in place, so the folder argument lets me specify which one is in use. You can see the URL parsing, and really, after that it's just a URL request and content update. Here's an example of one of the content files on the site:
<header>
<h1>Address</h1>
<p>A postal address, where someone would deliver mail.</p>
</header>
<figure class="example" >
<img src="/concepts/articles/address/address_example.jpg" width="299" height="227" alt="An example address" />
<figcaption>An example address, as if you had no idea what <q>postal address</q> meant.</figcaption>
</figure>
<h2>Overview</h2>
<p>It seems straightforward: there is an element named &lt;address&gt;. Case closed, right?
It's not so simple; the spec has something to say. </p>
<h2>Candidates</h2>
<ul>
<li><a href="/elements/address">&lt;address&gt;</a></li>
<li>A collection of <a href="/elements/div">&lt;div&gt;</a> and <a href="/elements/span">&lt;span&gt;</a> elements.</li>
</ul>
<p>At first glance it would seem &lt;address&gt; would be the right choice, it's called <em>address</em> for goodness' sake. But alas, as per the specification, address in the context of an <abbr>HTML5</abbr> element &lt;address&gt; means <q>To whom should I <strong>address</strong> my issue with this piece of content.</q> So in this case it is referring to the authors of the article or the maintainer of the page. If a postal address makes sense in that context, then you can use it; otherwise postal addresses should be marked up some other way.</p>
<h2>Verdict</h2>
<p>We recommend a collection of <a href="/elements/div">&lt;div&gt;</a> and <a href="/elements/span">&lt;span&gt;</a> elements because the spec clearly states &lt;address&gt; is not the correct element to use.
</p>
<h2>Further Reading</h2>
<ul class="optionlist">
<li><a href="http://html5doctor.com/the-address-element/">HTML5 Doctor -> The Address Element</a></li>
<li><a href="http://dev.w3.org/html5/spec/the-address-element.html#the-address-element">w3c -> The address element</a></li>
</ul>
I think you will agree - this is pretty darn slim. There's one last piece to this puzzle I want to share. The solution I created worked fine, but with one exception. If a non-JavaScript capable browser came to the site, they would see nothing. (Well, no content.) I was perfectly happy with that outside of the fact that Google's search engine would also see nothing. I added a slight tweak to my Apache rewrite rule to simply bypass it for Google's search bot:
RewriteEngine On
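# Match "Googlebot" anywhere in the User-Agent, since the common Googlebot
# string begins with "Mozilla/5.0 (compatible; Googlebot/2.1; ..."
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]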
RewriteRule ^(.*)$ /concepts/articles/$1.html [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ engine.html [QSA,L]
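To make the flow of those rules easier to follow, here is a rough JavaScript model of the decision they make for each request. It is only a sketch: resolveRequest and its arguments are hypothetical names, the path is assumed to be relative to the folder holding the rules, and it ignores Apache internals such as rules being run again against a rewritten path.

```javascript
// Rough, hypothetical model of the rewrite rules above.
// path:         request path relative to the folder holding the rules
// isGooglebot:  whether the User-Agent condition matched
// existsOnDisk: function reporting whether a path is a real file or folder
function resolveRequest(path, isGooglebot, existsOnDisk) {
  // First rule: the crawler is sent straight to the static article file.
  // A RewriteCond only applies to the RewriteRule right after it, so the
  // file/folder checks further down do not guard this rule.
  if (isGooglebot) {
    return "/concepts/articles/" + path + ".html";
  }
  // Second rule: anything that is not a real file or folder gets the engine.
  if (!existsOnDisk(path)) {
    return "engine.html";
  }
  // No rule matched: the request is served as-is.
  return path;
}

var nothingExists = function () { return false; };
console.log(resolveRequest("address", true, nothingExists));
console.log(resolveRequest("address", false, nothingExists));
```

One thing the model highlights is that, as written, the crawler rule would also apply to requests for real files such as images, since it has no file or folder check of its own.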
So - any comments on this technique? It seems to be working well, but obviously with the site going live just yesterday, I definitely assume there are going to be some edge cases.
p.s. I had to edit the code sample for Address to remove the code tag. My blog uses the code tag to do highlighting. I wanted to be sure that was out in the open so that if anyone compares it to what they see in Firebug, they know I wasn't trying to make the template seem even smaller than it really is.
Archived Comments
Neat stuff, something I'm really into at the mo - using as little server resource as possible.
You can actually achieve this with 'zero server' if you use Amazon S3's website bucket feature (ok, not a real CDN and there are obviously servers running somewhere, but close enough).
You set up your default page as index.html and your error page as index.html, and the JS can handle the routing, including 404s for stuff it doesn't know what to do with.
Also, take a look at backbone.js and its routing if you haven't already - designed for this sort of thing (so less wheel reinventing)
S3 websites:
http://aws.typepad.com/aws/...
re: Amazon - you know - I think it IS on Amazon - but not S3, EC2.
re: Backbone.js: Not a fan. I don't think it's bad, I just don't dig it. I much prefer Angular.
Ray...
I'd love for you to take a look at my Ember.js article. Curious to see what you think about it. It's super lightweight as far as code goes.
Brian Rinaldi asked me to expand on it for an article that will be published in ADC next month (most likely).
I like Terry's choice of fonts.
I've just popped on to the Angular site and seen your testimonial on there, nice ;). Really like the WhichElement site too.
I'd say the approach was sound. Probably using backbone or angular for the routing is overkill seeing as you've condensed it down to such slim code.
To offer yet another, serverless approach, did you consider an offline CMS such as Jekyll? Jekyll powers github pages which are pure static content (my blog runs this way). This way, you need no alternative for crawlers. Only issue I see is search, though I'm sure there are JS implementations out there...
@andy: I saw your Ember blog post. It looks ok. :)
@Phillip: Another thing Terry turned me on to.
@DW: Nope, never considered it. Never even heard of it. :)
@dominic From what I know of Jekyll it's powered by Ruby. While I might get away with doing a serverless backend, I couldn't get away with doing one that, even statically, is powered by something other than ColdFusion.
Ha ha, ok. Site sponsored by Adobe? I personally use my local machine to build my site and deploy it - no server involved and I can install what I like (should that be the issue).
Anyways, it's good to look around, and it will be interesting to see what sort of solutions come around when (and if) we can run CF from the command line. E.g. a CF implementation along the lines of Jekyll (that does not need a running CF server to generate).
Nope, the site isn't sponsored by Adobe (outside of that fact that stuff like this is part of our jobs).
@Ray Yeah, I guess being Adobe ColdFusion Evangelists does rather change the position! (didn't realise that TR was one also -duh to me).
On the flip side - I do believe (as I'm sure you do too), that exploring solutions outside of CF really benefits us as CF developers and eventually the community (I'm not saying that you should use Jekyll here, would clearly be an odd choice).
But anyways, I've hijacked your post with enough yadda yadda already. Great work on the site.
Actually, I'm not a ColdFusion evangelist. My role (and Terry too) is in a group dedicated to web standards. I've been spending a lot of my time lately in HTML5, CSS3, and JS APIs. It's absolutely fascinating.
My bad, I read 'Developer Evangelist' and misinterpreted. But yeah, sure is - a job by itself.