HTML5: Going Offline

HTML5 Logo

Although the HTML5 specification still has Working Draft status at the W3C, browser support is quite good. If you’re using the latest version of your favorite browser, chances are that it can handle some, if not all, of the new HTML5 features. If you want to stay on top of what happens on the client side of HTML5, the Wikipedia article Comparison of layout engines (HTML5) seems fairly up to date on what the different layout engines support.

Among the new features is offline web application support, which at the time of writing is implemented by all the major browsers except Internet Explorer. For users of browsers that do support it, parts of your web application (or all of it) can be made available even when they are offline, or when your production servers have taken a dive and the website is down. Of course this never happens in your production environment, but offline web application support is still a good thing to know about in case you have a friend who you suspect might find himself in this unwelcome situation some day.

When a user visits your web application, the browser checks the HTML page for a reference to a cache manifest file. If it finds one, it reads the manifest for information on how your web application can be cached for offline use. Naturally, this means that the user needs to visit your web application at least once before the client can save it for use in offline mode.

Adding offline support to your web application is surprisingly easy. It all starts with some HTML editing. In your index.html file, or whichever file contains the <html> tag (I’m guessing you’re using some kind of template system), add a manifest attribute to that tag, like this:

...
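The markup for this step could look like the string in the sketch below. It assumes the manifest is called cache.manifest and sits in the root of the web application, as in the rest of this post. The findManifestReference helper is hypothetical and only mimics, very roughly, how a browser finds the reference:

```javascript
// Sketch only: a real browser uses its HTML parser, not a regular expression.
// Returns the value of the manifest attribute on the <html> tag, or null.
function findManifestReference(html) {
  const tag = /<html\b[^>]*>/i.exec(html);
  if (!tag) return null;
  const attr = /\smanifest\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/i.exec(tag[0]);
  if (!attr) return null;
  return attr[1] ?? attr[2] ?? attr[3];
}

// Assumed markup: the manifest lives at /cache.manifest in the web root.
const page = [
  "<!DOCTYPE html>",
  '<html manifest="/cache.manifest">',
  "<head><title>My web application</title></head>",
  "<body></body>",
  "</html>",
].join("\n");

console.log(findManifestReference(page)); // prints "/cache.manifest"
```

A page without the attribute yields null, which corresponds to a page the browser will not try to cache for offline use.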

You’ve now created a reference to the cache manifest, the file that tells the client what can be stored for offline use and what cannot. Next, create the cache.manifest file in the root of your web application.

CACHE MANIFEST
http://www.example.com/index.html
/logo.png
/scripts/very-important-javascripts.js
http://www.example.com/styles/screen.css

As you can see, the cache manifest can contain both absolute URLs and relative paths. Now, there is a small quirk you have to work around when it comes to the cache manifest: it has to be sent to the client with the text/cache-manifest content type, or the client will not recognize it. If you’re working on an Apache-hosted web application, you can simply add the following to the .htaccess file, which is most likely located in the root of the web application (if the file doesn’t exist, feel free to create it):

AddType text/cache-manifest .manifest

This will make sure that the server sends the file to the client with the correct content type. The first time the user visits your web application, the client will now download the resources listed in the cache manifest and cache them locally. If the user disconnects from the internet and refreshes the web application, all of those resources will be available offline. Magic!
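If your application isn’t served by Apache, the same rule has to live somewhere else. As a rough sketch, a hand-rolled static file server could pick the content type from the file extension. The MIME_TYPES table and the contentTypeFor function are made-up names for illustration, and the sketch doesn’t handle query strings:

```javascript
// Hypothetical extension-to-MIME table for a hand-rolled static file server.
const MIME_TYPES = {
  ".manifest": "text/cache-manifest", // the type the cache manifest requires
  ".html": "text/html",
  ".css": "text/css",
  ".js": "application/javascript",
  ".png": "image/png",
};

// Returns the Content-Type to send for a request path.
function contentTypeFor(path) {
  const dot = path.lastIndexOf(".");
  const extension = dot === -1 ? "" : path.slice(dot).toLowerCase();
  return MIME_TYPES[extension] || "application/octet-stream";
}

console.log(contentTypeFor("/cache.manifest")); // prints "text/cache-manifest"
```

The returned value would go into the Content-Type response header for the manifest request.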

In some cases, however, you want to explicitly stop the client from trying to cache particular resources, for instance a script used to track visitors. Here’s a modified version of our first cache.manifest file:

CACHE MANIFEST
NETWORK:
/tracking.cgi
CACHE:
http://www.example.com/index.html
/logo.png
/scripts/very-important-javascripts.js
http://www.example.com/styles/screen.css

Nothing listed in the NETWORK section will ever be cached or made available offline. Trying to load the tracking.cgi resource while in offline mode will result in an error. The CACHE section of the manifest file contains all the resources from the first file, and they will be cached and made available offline as in the previous example.
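To make the section rules a bit more concrete, here is a small sketch of how a manifest could be split into its sections. The parseManifest function is my own illustration, not what any browser actually runs. It treats entries that appear before any section header as belonging to CACHE, which is how the very first manifest in this post works, and it also understands the FALLBACK section with its pairs of URIs:

```javascript
// Splits manifest text into its sections. Entries that appear before any
// section header belong to CACHE, which is the default section.
function parseManifest(text) {
  const lines = text.split(/\r?\n/).map((line) => line.trim());
  if (lines[0] !== "CACHE MANIFEST") {
    throw new Error("A cache manifest must start with CACHE MANIFEST");
  }
  const sections = { CACHE: [], NETWORK: [], FALLBACK: [] };
  let current = "CACHE";
  for (const line of lines.slice(1)) {
    if (line === "" || line.startsWith("#")) continue; // blank line or comment
    const header = /^(CACHE|NETWORK|FALLBACK):$/.exec(line);
    if (header) {
      current = header[1]; // switch to the named section
    } else if (current === "FALLBACK") {
      const [resource, fallback] = line.split(/\s+/);
      sections.FALLBACK.push({ resource, fallback });
    } else {
      sections[current].push(line);
    }
  }
  return sections;
}

const sections = parseManifest([
  "CACHE MANIFEST",
  "NETWORK:",
  "/tracking.cgi",
  "CACHE:",
  "http://www.example.com/index.html",
  "/logo.png",
].join("\n"));

console.log(sections.NETWORK); // the one entry that always needs a connection
console.log(sections.CACHE);   // the two entries that are stored for offline use
```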

There is also an option to create a fallback for files that you don’t want the client to cache. Let’s say, for instance, that you want to use a different logo in offline mode, perhaps to show users that they are working offline. The following change to the cache.manifest file will enable this:

CACHE MANIFEST
NETWORK:
/tracking.cgi
FALLBACK:
/logo.png /logo_offline.png
CACHE:
http://www.example.com/index.html
/scripts/very-important-javascripts.js
http://www.example.com/styles/screen.css

Using the above cache manifest, the client will cache the logo_offline.png resource and show it instead of the logo.png resource whenever the web application is accessed in offline mode. Each entry in the FALLBACK section consists of two URIs. The first URI is the resource, the second is the fallback. Both URIs must be from the same origin as the manifest file. The first URI is matched as a prefix, so a single entry can cover a whole folder. Wildcard patterns such as *.html are not supported in the FALLBACK section. Let’s say that you want all of the files in the /images/ folder to be replaced by one particular resource, and every other resource that isn’t cached, HTML pages included, to be replaced by another resource when the user is offline:

CACHE MANIFEST
NETWORK:
/tracking.cgi
FALLBACK:
/images/ /only-available-offline.png
/ /offline.html

Note that there’s a major gotcha when working with offline web application support. If you change a resource listed in the CACHE section of the manifest, the client will not download it again. It keeps serving the cached copy until the manifest file itself changes. Because of this, it might be a good idea to generate the manifest files dynamically and regenerate them whenever any of the resources they list are modified. A generation timestamp somewhere in the manifest changes the file, which makes the client download the latest version of the resources it’s supposed to cache. The timestamp can be included as a comment, as long as CACHE MANIFEST remains the first line of the file:

CACHE MANIFEST
# 2011-06-04 21:00:44
NETWORK:
/tracking.cgi
FALLBACK:
/images/ /only-available-offline.png
/ /offline.html
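Generating the manifest could be sketched along these lines. The buildManifest and formatTimestamp functions and their parameter names are made up for this example. Note that the sketch writes the timestamp comment on the second line, since a manifest has to start with the CACHE MANIFEST line:

```javascript
// Builds manifest text. The comment goes on the second line because the file
// has to start with CACHE MANIFEST. All parameter names are illustrative.
function buildManifest({ generatedAt, cache = [], network = [], fallback = [] }) {
  const lines = ["CACHE MANIFEST", `# ${formatTimestamp(generatedAt)}`];
  if (network.length > 0) lines.push("NETWORK:", ...network);
  if (fallback.length > 0) {
    const pairs = fallback.map(([resource, replacement]) => `${resource} ${replacement}`);
    lines.push("FALLBACK:", ...pairs);
  }
  if (cache.length > 0) lines.push("CACHE:", ...cache);
  return lines.join("\n") + "\n";
}

// Formats a Date as "YYYY-MM-DD HH:MM:SS" in UTC.
function formatTimestamp(date) {
  return date.toISOString().slice(0, 19).replace("T", " ");
}

const manifest = buildManifest({
  generatedAt: new Date(Date.UTC(2011, 5, 4, 21, 0, 44)),
  network: ["/tracking.cgi"],
  fallback: [["/logo.png", "/logo_offline.png"]],
  cache: ["/scripts/very-important-javascripts.js"],
});

console.log(manifest);
```

In a real setup, generatedAt would probably be the latest modification time among the listed resources, so that the manifest only changes when one of them does.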

As an extra bonus, it’s even possible for your end users to make changes, for instance change some numbers in a budget, even if they are offline. Most of the work to achieve this, however, is up to you, the web developer. It involves using the local storage on the client, reading a DOM flag that indicates whether the client is offline or online, and syncing the data with the web server when the client comes back online. But all that is outside the scope of this post.
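Just to give a taste of what that work could look like, here is a very rough sketch. It assumes a storage object with the same getItem, setItem and removeItem methods as window.localStorage, and an online flag like navigator.onLine. The function names, the pendingChanges key and the send callback are all hypothetical, and the sketch ignores failed uploads entirely:

```javascript
// Queues edits in a localStorage-like store while offline and flushes them
// once the browser reports that it is online again. Names are hypothetical.
const QUEUE_KEY = "pendingChanges";

// Reads the queued changes, or an empty list if nothing is stored yet.
function readQueue(storage) {
  return JSON.parse(storage.getItem(QUEUE_KEY) || "[]");
}

// Saves a change locally; nothing is sent from here.
function queueChange(storage, change) {
  const queue = readQueue(storage);
  queue.push(change);
  storage.setItem(QUEUE_KEY, JSON.stringify(queue));
}

// Passes every queued change to `send` if `isOnline` is true, then clears
// the queue. Returns the number of changes sent.
function syncChanges(storage, isOnline, send) {
  if (!isOnline) return 0;
  const queue = readQueue(storage);
  queue.forEach((change) => send(change));
  storage.removeItem(QUEUE_KEY);
  return queue.length;
}

// In a browser, the wiring might look like this (postToServer is made up):
//   queueChange(window.localStorage, { cell: "B2", value: 1200 });
//   window.addEventListener("online", () =>
//     syncChanges(window.localStorage, navigator.onLine, postToServer));
```

Passing the storage and the online flag in as arguments keeps the functions free of browser globals, which makes them easy to try out with a fake store.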

As we’ve seen, there are some pretty neat features in HTML5, and those features will become even neater when we start to think of great ways of utilizing them.


This entry is also available at BEKK Open.



