Custom Attributes in HTML 5

Created: May 26th, 09

An interesting new part of HTML 5 is its formal support (or should I say endorsement) of custom attributes inside HTML elements. Technically it's always been possible to inject arbitrary attributes into an element and parse them using JavaScript's getAttribute() method, but not without getting an earful from the W3C validator (not to mention some of your peers) each time. The following fails as valid HTML:

<div id="mydiv" brand="toyota" model="prius">
John is very happy with his Toyota Prius, because he saves on gas.
</div>

This is because custom attributes of any type weren't part of HTML's specs, until now that is. Under growing pressure from the webmaster community, HTML 5 has finally given in, giving us a new "data" attribute that lets you define custom attributes in a structured way within HTML elements. Let's see what that's all about, and how it's useful.

Defining and parsing the data attribute

In HTML 5, you define custom attributes using the "data" attribute. The exact format is "data-*", where "*" is replaced with the desired custom attribute name, then set to the desired string value. For example:

<div id="mydiv" data-brand="toyota" data-model="prius">
John is very happy with his Toyota Prius, because he saves on gas.
</div>

Your attribute name must be prefixed with "data-" in order to validate in HTML 5. So in other words, while HTML 5 supports custom attributes, it doesn't allow for arbitrary attribute names.

The appeal of custom attributes is that they let you easily associate tidbits of information with an element, to be parsed later using JavaScript for example. There are two ways to retrieve the value of "data" attributes using JavaScript: the first is via the good old-fashioned getAttribute() method, and the second is by accessing the "dataset" property of the element. Let's see an example of each:

var mydiv=document.getElementById('mydiv')

//Using DOM's getAttribute() method
var brand=mydiv.getAttribute("data-brand") //returns "toyota"
mydiv.setAttribute("data-brand", "mazda") //changes "data-brand" to "mazda"
mydiv.removeAttribute("data-brand") //removes "data-brand" attribute entirely

//Using the element's dataset property

var brand=mydiv.dataset.brand //returns "toyota"
mydiv.dataset.brand='mazda' //changes "data-brand" to "mazda"
delete mydiv.dataset.brand //removes "data-brand" attribute (assigning null would store the string "null" instead)

Before you go wild with the "dataset" property, it should be mentioned that, at the time of writing, it isn't widely supported in browsers yet, so for the time being, it's a good idea to stick to getAttribute() instead. Just FYI however, "dataset" should exist on every element, returning a name/value map of every data attribute defined on the element. To access a particular data attribute, reference it by name without the "data-" prefix; a hyphenated name such as "data-model-year" is exposed in camelCase, as dataset.modelYear.
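If you'd like to use "dataset" where it exists without giving up on browsers that lack it, one option is a small wrapper that tries "dataset" first and falls back to getAttribute(). The following is only a sketch: the function names getData() and toDatasetKey() are made up for illustration and aren't part of any specification or library.

```javascript
// Hypothetical helper: convert a hyphenated name such as "model-year"
// into the camelCase key ("modelYear") that dataset uses.
function toDatasetKey(name) {
  return name.replace(/-([a-z])/g, function (match, letter) {
    return letter.toUpperCase();
  });
}

// Hypothetical helper: read the value of data-<name> from an element.
// Uses dataset when the browser provides it, getAttribute() otherwise.
// Returns null when the attribute isn't defined, in both cases.
function getData(el, name) {
  if (el.dataset) {
    var value = el.dataset[toDatasetKey(name)];
    return value === undefined ? null : value;
  }
  return el.getAttribute('data-' + name);
}

// Usage in a browser, assuming the "mydiv" element from the example above
if (typeof document !== 'undefined') {
  var mydiv = document.getElementById('mydiv');
  if (mydiv) {
    var brand = getData(mydiv, 'brand');
  }
}
```

Normalizing a missing attribute to null in both branches means the calling code behaves the same regardless of which path the browser takes.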

The HTML 5 doctype

Since the "data" attribute is the brainchild of HTML 5, your page should carry a doctype that informs the W3C validator of this if you want the page to validate (after all, isn't that the whole point?). The common XHTML or HTML 4 doctypes will fail, as the "data" attribute is a fish out of water in those settings. So what to use? W3C advocates the very simple doctype:

<!DOCTYPE html>

for HTML 5. Now, don't let the simplicity of this doctype fool you: it not only validates the document as HTML 5, but also causes all browsers to render the web page in standards-compliant mode, similar to what the other proper doctypes do.
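If you want to confirm which rendering mode a page ended up in, the DOM exposes a "compatMode" property on the document, which reads "CSS1Compat" in standards mode and "BackCompat" in quirks mode. Here's a quick sketch of such a check; the function name isStandardsMode() is just an illustrative choice.

```javascript
// Hypothetical helper: report whether the given document is being
// rendered in standards mode, based on its compatMode property.
function isStandardsMode(doc) {
  return doc.compatMode === 'CSS1Compat';
}

// Usage in a browser: true when the page's doctype triggers standards mode
if (typeof document !== 'undefined') {
  var standards = isStandardsMode(document);
}
```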

Example- Making use of the "data" attribute in image rollovers