HTML5 Video Tag basics.

Posted on July 19, 2010


There is a lot of scuttlebutt about HTML5 and the future. All is not what it seems, and I am sure the future of video with HTML5 will not turn out exactly as predicted. Below is a crash course on using the HTML5 video tag on the PC.

Don’t expect Adobe to sit on their collective hands either. I do like the H.264 Codec and if that becomes the standard, great. Keep in mind, there is still the question of how to license and pay for the privilege of using this codec and whether or not another option comes along in the next few years!

In any case, this is a great primer on the Video Tag.

Computerworld – If you want to watch Internet-delivered video on your PC, the vast majority of Web sites have settled on a single, consistent way to do that. That’s the good news. The bad news is that this single, consistent delivery system is Adobe Flash, with all its security and stability issues.
But now a new way to deliver video through a browser is coming to the fore, one intended to be native to the browser itself: HTML 5’s <video> tag. In this article I’ll look at how the <video> tag can be used with the new generation of browsers. I’ll also examine how parts of this equation (the browsers and, to some degree, the video formats themselves) are also still very much in flux.
Online video before HTML 5
One could fill a decent-size book talking about all the formats that have been used to deliver Web video at one time or another: Microsoft’s .avi and .wmv container formats and the gang of codecs delivered with them, Apple’s QuickTime, RealNetworks’ RealVideo and RealAudio formats, and so on. Microsoft’s Silverlight also deserves mention, since it allows providers such as Netflix to distribute content with embedded copy protection — a feature not likely to fall out of demand as long as money changes hands for video content.
However, the video delivery system that’s most widely deployed right now is Adobe’s Flash.
The Flash Player was, and still is, one of the few browser add-ons that almost everyone is likely to have. Browsers on Macs and PCs alike typically support Flash by default, since a growing amount of Web content in general depends on it. It could be argued that Flash has become a video-delivery system as a byproduct of its original intention, which was to bring vector-based animation to the Web.
But Flash has problems as a video delivery system. It’s proprietary. It requires the use of third-party code rather than something native to the browser. It has been lambasted for its lack of security and instability. The list goes on. It’s a solution, when people have been hungering for the solution.
Hello to the <video> tag
The history of the <video> tag starts with the Web Hypertext Application Technology Working Group (WHATWG), a consortium made up of folks from Apple, the Mozilla Foundation and Opera Software. The WHATWG was created in 2004 to focus on the development of HTML 5 as a response to what it felt was the disregard of the World Wide Web Consortium (W3C) for real-world developers vis-à-vis XHTML and the then-extant HTML standards.
The first proposals for a <video> tag were submitted to the WHATWG in 2007 by Opera Software. The idea was simple: Create a framework in which Web browsers can natively play back video without being forced to fall back on third-party plug-ins. The user gets the experience of video that just works, those hosting the video have less maintenance to perform, and everyone walks away happy.
That’s the theory, anyway. The practice has been another story entirely.
The codec conundrum
When the <video> tag was first proposed in the HTML 5 draft specification, one key omission from the spec was which video (and audio) codecs would be natively supported by the browser. As a result, while there are several video codecs that can be used in conjunction with the <video> tag, browser makers are not obliged to support any one of them: It’s entirely their choice which codecs to include support for.
The original plan involved specifying the Theora video and Vorbis audio codecs as a baseline that all browsers should be able to play, but this was dropped in favor of an approach where no specific codec was recommended. Instead, the WHATWG expressed a desire for a codec that could be used in an unencumbered fashion and had a better guarantee of patent indemnity than Theora/Vorbis offered at the time.
The change sparked criticism among developers and might well have been one of the motivating factors in Google’s offer of the VP8 codec as another baseline codec candidate.
In the end, the following three codecs have emerged as the main contenders for <video> tag support: H.264, Theora and VP8.
H.264: Microsoft and Apple have been major proponents of the H.264 codec family, which has already been broadly implemented and supported — not just on the Web, but in cameras, in Blu-ray discs, and many other media that need powerful, efficient compression.
What’s contentious about H.264 is not the technology itself but the licensing. H.264’s usage is governed by the MPEG LA group, which levies a sliding scale of fees for H.264 based on the intended use. That said, the vast majority of end users on the Web might never pay anything for using H.264, for a couple of reasons.
First, the MPEG LA has stated that for the next five years it will collect no royalties for H.264 Web streams that are offered free to end users.
Second, in the cases where you’re dealing with for-pay content, odds are that the usage fees have already been assumed by someone else. For example, if you’re encoding stuff in Windows and uploading it to YouTube as a pay-per-view item, you pay no licensing fees for using H.264, because any costs that might be levied have already been assumed by (in this case) Microsoft and Google.
For additional information on this issue, Ed Bott of ZDNet has explained how H.264 licensing fees work and why it wouldn’t be in the MPEG LA’s interest to suddenly ratchet up licensing fees when the current free-to-stream provisions for Web playback are up for revision in a few years. Florian Mueller’s analysis is also interesting — he examines the MPEG LA licensing terms from the point of view of an opponent of software patents, noting that the MPEG LA’s licensing scheme, while not an ideal arrangement, does serve a useful function in a world where software patents exist and must be acknowledged.
That said, companies like Mozilla have not been set at ease — for example, according to Mike Shaver, vice president of engineering at Mozilla Corp., MPEG LA’s licensing isn’t flexible enough to make solid exceptions for free software. Mozilla has opted to support Theora/Vorbis directly in its Firefox browser (and will support WebM in Version 4.0), and it has no plans to add native H.264 support.
Theora: Free software proponents have advocated the open Theora video format (with its matching Vorbis audio codec), which requires no licensing fees at all and has implementations immediately ready to use. But Theora has been criticized on a number of grounds: It isn’t as technologically advanced as other codecs; there isn’t much material encoded in the format, so current video would have to be recoded; and Theora’s patent status could be subjected to future legal challenges (something Steve Jobs has hinted at).
VP8: A more evolved version of the Theora codec family (they share common ancestors), VP8 was developed by On2 Technologies, which also created one of Flash’s video codecs. Google has since purchased On2 outright, and while Google now owns the patent for VP8, it’s allowing unrestricted use of the codec without licensing fees under the banner of “the WebM Project.” (WebM is Google’s name for VP8 video plus Vorbis audio.)
This makes VP8 sound like a sure thing, but there are two problems. The first is that there are serious questions about how polished the spec is — a factor that has serious implications for, say, hardware devices that shoot video directly in VP8. If VP8 is going to be in flux, then cameras that shoot video in VP8 would need to be firmware-upgradable (and have updates published by their makers) to use newer, better performing versions of the codec.
Another problem is VP8’s quality and compression efficiency compared to H.264. One analysis, by Jason Garrett-Glaser, a developer on the FFmpeg project, has put the quality of VP8 on a par with H.264’s “baseline” spec: in other words, good but not great, and with H.264 way out in front in certain respects. He also believes that VP8’s spec relies way too much on the snippets of code provided by Google. Most specifications for a standard (like the <video> tag itself) are drafted and discussed in depth before a single line of code is written; in Garrett-Glaser’s view, the only real VP8 spec we have right now is the code, a cart-before-the-horse situation.
How to add HTML 5 video to your site
The codecs you choose as your starting defaults should be dictated at least in part by what browsers are run by the majority of your visitors. Mark Pilgrim’s Dive Into HTML 5 site has a detailed dissection of the competing and conflicting codecs, and it includes a handy chart that describes what current and next-generation browsers will support. Chrome is way out in the lead: The upcoming Chrome 6 will support all three major families of codec out of the box. As mentioned before, Firefox will support WebM in its upcoming Version 4.0, and it supports Theora, but not H.264, in Versions 3.5 and up. The most recent Internet Explorer 9 Platform Preview plays back H.264 natively; support for other codecs will most likely only be available as add-ons.
So if you’re planning on adding HTML 5 video support to your site, what’s the best way to cut through this Gordian Knot of standards? Right now, the only viable long-term answer is to hedge your bets by doing the following:
1. Encode your video in at least two different formats, with Flash being one of them as a universal worst-case fallback.
2. Set up your <video> tags to degrade gracefully, so that browsers without support for a given tier of video will fall back to whatever else is available.
3. Test your site tirelessly — not just with multiple browsers, but with multiple versions of individual browsers and on as many different platforms as you can: desktops, laptops, smartphones, etc.
Conversion tools
Assuming you’ve decided which codecs you will use to run videos in HTML 5, you then have to convert your video into that format. There are several tools available.
H.264 tools
Because H.264 is already a broadly used standard, odds are that whatever professional-grade program you have for creating video (such as Adobe Premiere or QuickTime Pro) will support exporting in that format. That said, there are also several open-source/free H.264 encoders available. Examples include the ffdshow library, packaged for Windows as the “ffdshow tryouts” codec pack, and the stand-alone programs Handbrake and Avidemux.
Note that your use of any of these tools must conform to the licensing requirements for H.264. Using an open-source implementation of H.264 doesn’t absolve you of this. Generally, if you’re rehosting video through a provider who already has a licensing agreement (e.g., YouTube), or you’re not creating video “where there is remuneration for the title distributed,” you won’t have to pay anything. But you still need to sign a license agreement with MPEG LA to use H.264 or host your content with a third-party provider that already has one.
Theora tools
In keeping with Theora’s free-and-open promise, the tools for creating Theora videos are available free of charge across multiple platforms.
An interesting place to start is the Firefogg extension for Firefox, which lets you use Firefox 3.5 and up as a front end for a Theora video converter. Feed it a video file, set a few basic options, click Save, and the conversion takes place in the browser as you watch. Be warned that the program is picky about the file format you provide: The .mov files that came from my digital camera had to be converted into .avi before they could be used. Firefogg also trades convenience for power: It’s easy to use, but you can convert only one file at a time.
A more powerful but less convenient tool is the ffmpeg2theora command-line encoder utility. It’s more powerful in that it gives you complete control over the encoding parameters, less convenient in that you have to supply a whole slew of switches to the program to work it. Your best bet is to use a front end of some kind, such as Theora Converter, which allows you to batch-process files and see the most important options at a glance (but be warned — it’s still in alpha). The above-mentioned Handbrake also exports to Theora.
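As a rough illustration of the command-line route, the sketch below assembles ffmpeg2theora command lines for a folder of files, which is one way to batch-process without a front end. It only builds the commands and does not run them. The folder layout, the .avi pattern and the quality values are assumptions chosen for illustration, and the switches should be checked against the version of ffmpeg2theora you have installed.

```python
from pathlib import Path


def theora_command(source: Path, video_quality: int = 7, audio_quality: int = 3) -> list[str]:
    """Build (but do not run) an ffmpeg2theora command line for one file."""
    output = source.with_suffix(".ogv")  # same name, Theora container extension
    return [
        "ffmpeg2theora",
        str(source),
        "--videoquality", str(video_quality),  # higher means better quality, bigger file
        "--audioquality", str(audio_quality),  # Vorbis audio quality setting
        "-o", str(output),
    ]


def batch_commands(folder: Path, pattern: str = "*.avi") -> list[list[str]]:
    """Return one command per matching file, in sorted order."""
    return [theora_command(path) for path in sorted(folder.glob(pattern))]
```

Once the generated commands look right, each list can be handed to subprocess.run(command, check=True) to perform the conversion.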
Finally, if you use programs that export through DirectShow filters, a DirectShow Theora filter is available in both 32- and 64-bit implementations.
WebM tools
Because WebM is still very new — especially in its current no-license-fee incarnation — the tool set isn’t as polished as it is with Theora or H.264. The WebM project’s Web site lists only a few basic tools, including a DirectShow filter for Windows and a stand-alone command-line encoder called makeWebm. It’s important to realize that WebM is subject to further refinement and improvement, and therefore these tools are likely to undergo refinement as WebM itself is changed.
(Incidentally, the just-released beta 1 of Firefox 4.0 supports WebM playback. Try it out for yourself: Go to, click “Join the HTML5 Beta,” and add “&webm=1” to any search to look for WebM-encoded videos.)
Using the <video> tag
Codecs aside, the most important thing about using video in HTML 5 is the construction of the <video> tag itself. In a perfect world, you’d just need to point to the video stream in question, like this:

But in our less-than-perfect world, the <video> tag sports a whole bevy of options that you choose from to ensure that your videos play back correctly across browsers and platforms. That said, most of this complexity exists for a good reason.
To illustrate this, here’s an example of a <video> tag:


The outer elements of the <video> tag have several options:

WIDTH / HEIGHT: The width and height of the video, in pixels.
CONTROLS: Add this option to show playback controls on the video.
PRELOAD: Tells the browser to start downloading (but not playing) the video as soon as the page is loaded. Use PRELOAD="none" to explicitly tell the browser not to preload the video.
AUTOPLAY: Include this to start playback of the video automatically.

The <source> subtag lets you specify which video file, or files, to play back. If you specify more than one file via multiple <source> tags, the browser will attempt to load each file in turn. In the above example, the .MP4 file (an H.264 stream) will be loaded first; if the browser can’t play that one, the .OGV (Theora) stream is loaded next, and so on. There’s no practical limit to the number of <source> tags you can provide, but more than three is probably impractical.
The most complicated and problematic option for the <source> tag is the TYPE parameter, which describes to the browser the exact combination of codecs needed to play a particular video. This way the browser doesn’t have to start downloading the video and perform its own codec detection on it (which may well be flawed) to figure out whether or not it can even play the video in question. If the browser knows in advance it can’t play back a certain type of stream, it doesn’t download it. You and the people viewing your videos will save a lot of bandwidth in exchange for a bit of hassle on your part.
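The load-each-file-in-turn behavior can be modeled in a few lines. This is only a simplified sketch of the selection order, not how any browser is actually implemented; the file names are hypothetical, and the two codec sets reflect the browser support described earlier in this article.

```python
from typing import Optional

# (file, codec family) pairs in the order the source tags would be listed.
# The file names are hypothetical.
SOURCES = [
    ("clip.mp4", "h264"),
    ("clip.ogv", "theora"),
    ("clip.webm", "webm"),
]

# Codec support as described in this article for two browsers.
FIREFOX_3_5 = {"theora"}
IE9_PREVIEW = {"h264"}


def pick_source(sources: list[tuple[str, str]], playable: set[str]) -> Optional[str]:
    """Return the first listed file whose codec family the browser can play."""
    for filename, codec_family in sources:
        if codec_family in playable:
            return filename
    return None  # no listed file is playable


print(pick_source(SOURCES, FIREFOX_3_5))  # clip.ogv
print(pick_source(SOURCES, IE9_PREVIEW))  # clip.mp4
```

Because the first playable entry wins, the order of the list is effectively a preference ranking: put the format you would most like browsers to use at the top.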
For Theora and WebM codecs, the TYPE parameters are simple enough; the above examples encompass the most common scenarios. For H.264, though, the options become quite complicated because H.264 streams can be encoded in a number of profiles, and the TYPE descriptor has to match the profile used to encode the file. If you’re using only one profile for all your videos, you can create the profile string once and forget about it; if not, you’ll need to find some way to ensure the right TYPE string is associated with the file in question (e.g., via metadata in a content management system).
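One way to keep the right TYPE string attached to each file is to generate the markup from a lookup table. The sketch below is a minimal illustration under stated assumptions: the function names and file names are hypothetical, URLs are assumed to need no escaping, and the H.264 string assumes Baseline-profile video with AAC-LC audio. Files encoded with another profile would need a different value.

```python
from pathlib import PurePosixPath

# TYPE strings keyed by file extension. The .mp4 entry assumes H.264
# Baseline profile video plus AAC-LC audio; other profiles differ.
SOURCE_TYPES = {
    ".mp4": 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"',
    ".ogv": 'video/ogg; codecs="theora, vorbis"',
    ".webm": 'video/webm; codecs="vp8, vorbis"',
}


def source_tag(url: str) -> str:
    """Render one source tag with the TYPE string for its extension."""
    extension = PurePosixPath(url).suffix.lower()
    # The TYPE value contains double quotes, so the attributes use single quotes.
    return f"  <source src='{url}' type='{SOURCE_TYPES[extension]}'>"


def video_tag(urls: list[str], width: int, height: int,
              controls: bool = True, preload: bool = True,
              autoplay: bool = False) -> str:
    """Render a video tag whose sources are tried in the order given."""
    attributes = [f'width="{width}"', f'height="{height}"']
    if controls:
        attributes.append("controls")
    attributes.append("preload" if preload else 'preload="none"')
    if autoplay:
        attributes.append("autoplay")
    lines = [f"<video {' '.join(attributes)}>"]
    lines += [source_tag(url) for url in urls]
    lines.append("</video>")
    return "\n".join(lines)
```

Calling video_tag(["clip.mp4", "clip.ogv", "clip.webm"], 320, 240) yields a tag with three sources in that order. In a content management system, the lookup table could be replaced by per-file metadata.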
One other extremely important thing that many webmasters overlook is the MIME type for the files. Firefox, for instance, is extremely dependent on MIME types for determining what to do with a given file. To that end, use the following MIME types when configuring your Web server:

MIME Types
MIME type
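As a hedged illustration of what such a server configuration amounts to, the sketch below extends Python's built-in development server so that it sends the MIME types commonly used for the three formats discussed here. The extension-to-type mapping is an assumption based on those common values rather than a copy of the table above, and the class and function names are hypothetical. A production server such as Apache or IIS would be configured through its own settings instead.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

# Assumed mapping of file extensions to the commonly used video MIME types.
VIDEO_MIME_TYPES = {
    ".mp4": "video/mp4",
    ".ogv": "video/ogg",
    ".webm": "video/webm",
}


class VideoAwareHandler(SimpleHTTPRequestHandler):
    """Static file handler that labels video files with explicit MIME types."""

    # Keep the handler's default mappings and add the video ones on top.
    extensions_map = {**SimpleHTTPRequestHandler.extensions_map, **VIDEO_MIME_TYPES}


def serve(port: int = 8000) -> None:
    """Serve the current directory until interrupted (for local testing only)."""
    ThreadingHTTPServer(("", port), VideoAwareHandler).serve_forever()
```

Running serve() from the folder that holds the video files gives a quick way to check that a browser receives the expected Content-Type header.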

Falling back to Flash
So what happens if the page is accessed by users who don’t have <video> tag support in their browsers? For them, you’ll need to provide a way to allow the page to serve up a Flash-encoded version of the video. The good news is that there’s a way to do this that is a legitimate behavior of the <video> tag, not a byproduct of some other behavior or a browser-specific quirk.
The way this works is fairly clever: The Flash object is embedded within the <video> tag itself. If the browser supports <video>, it attempts to use a stream from the <source> tags and ignores the other elements. A browser that doesn’t recognize <video> skips the unfamiliar tags and renders the <object> tag that points to a Flash player instead.
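The nesting can be sketched by generating the markup. This is a minimal illustration, not the Video for Everybody template: the function name, the player.swf file and the "file=" flashvars parameter are all hypothetical, since the parameter a Flash player expects depends on which player you supply.

```python
from html import escape


def video_with_flash_fallback(sources: list[str], player_swf: str,
                              flash_video: str, width: int, height: int) -> str:
    """Render a video tag with a Flash object nested inside as the fallback."""
    size = f'width="{width}" height="{height}"'
    player = escape(player_swf, quote=True)
    # "file=" is a placeholder; real players define their own parameter names.
    flashvars = escape(f"file={flash_video}", quote=True)
    lines = [f"<video {size} controls>"]
    # Native sources come first, in order of preference.
    lines += [f'  <source src="{escape(url, quote=True)}">' for url in sources]
    # The Flash object sits inside the video tag, after the sources.
    lines += [
        f'  <object {size} type="application/x-shockwave-flash" data="{player}">',
        f'    <param name="movie" value="{player}">',
        f'    <param name="flashvars" value="{flashvars}">',
        "  </object>",
        "</video>",
    ]
    return "\n".join(lines)
```

For example, video_with_flash_fallback(["clip.mp4", "clip.ogv"], "player.swf", "clip.mp4", 320, 240) points the Flash player at the same H.264 file that the first source uses, so no extra encode is needed for the fallback.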
For an example of this, check out a template called Video for Everybody, created by the folks at Camen Design. It not only falls back elegantly from HTML 5 to Flash but will also work in HTML 4 by degrading to Flash as a fallback, too.
Other third-party packages serve up the <video> tag in different sauces of HTML 5, JavaScript or Flash, depending on what additional effects you want to invoke. For example, the Projekktor project lets you embed a fairly advanced media player in a Web page with all sorts of extensions, such as the ability to refer to an existing YouTube video. HTML 5 is used by default, but Flash works as a fallback when needed. Projekktor’s code is GPL2-licensed, so it can be used and redistributed freely as-is.
Note that you have to supply the Flash player yourself: You can’t just tell Flash to stream a video file. On the plus side, the current edition of Flash streams H.264 natively, so if you already have an H.264 encode you can simply re-use that.
The biggest obstacle the <video> tag faces is how each browser chooses to implement it: what codecs are supported and how they’re presented to the end user. It’s all in flux, and that means any current implementation might change as the browsers themselves evolve over the next revision and beyond.
It’s likely that we’ll see two tiers of content that use the <video> tag: H.264-encoded content hosted by professional services and portals, such as YouTube and Vimeo, and open-standards content encoded in WebM and Theora, hosted wherever there’s bandwidth and space to spare.
The important thing is that they’ll exist side by side, powered by the same in-browser technology. If one does emerge as the standard, if only in a de facto fashion, it won’t be for lack of competition.