FOMS 2012 Recap: Here Come The Text Tracks!

Sep 6, 2012


FOMS (Foundations of Open Media Software) is an annual unconference for media engineers, known for its attitude of getting things done. This year’s edition – held in Paris, France – again had a great mix of attendees representing codec manufacturers, media frameworks, web browsers and video players.

Text Tracks

On the web browser side, the biggest topic was the implementation of the HTML5 <track> element and WebVTT. These can be used to add interactivity and accessibility to video elements. See our previous blog post on text tracks for more info.

At FOMS, both Opera and Chrome demoed working text track implementations. For Opera, this functionality will probably ship with version 12.5, while Chrome users have to wait until version 23. Safari 6 and Internet Explorer 10 will have <track> support too, but Firefox is not actively working on it.

Despite all the progress and working implementations, the WebVTT specification is not yet done. The outstanding issues are the implementation of roll-up captions (for live broadcasting, like this example) and the ability to store CSS in WebVTT, which matters for players like VLC or Flash that cannot access the webpage. Both items were heavily discussed during the workshop, and proposals for implementation were filed with the W3C.

Beyond Captions

Though captions in themselves are great, HTML5 Text Tracks can do a lot more. At FOMS, we saw several demos showing applications of WebVTT beyond captions. The demos we presented are listed below.

Note: you need a browser with text track support to see the demos.

  • Preview Thumbs: these thumbnails, known from Hulu and YouTube, pop up when hovering over the seek bar. The thumbs are implemented using a JPG sprite and a WebVTT file whose cues link to the individual thumbs with an xywh media fragment (for example, a URL ending in #xywh=0,0,160,90).
  • Chapter Markers: this demo prints chapter markers on an alternative seek bar for the video. When clicking a marker, the browser seeks to the start of that chapter.
  • Slide Syncing: in this demo, related artworks are displayed for certain ranges of the video. This kind of video-page interaction, now easily implemented, has many applications (PowerPoint presentations, sports statistics, etc.).
  • Timeline Search: this demo allows you to search the text tracks of a video to retrieve in-video search results. Widespread use of WebVTT will likely lead to search engines applying this trick on a much larger scale.
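The preview thumbs approach can be sketched in a few lines. This is a minimal illustration, not the code behind the demo: the file name thumbs.jpg, the 160x90 tile size, and every function name are assumptions made up for this example, and the parser handles only plain cue blocks (no header metadata, notes, or cue settings).

```typescript
// A cue as used in this sketch: start/end in seconds plus the raw payload text.
interface Cue {
  start: number;
  end: number;
  text: string;
}

// Region of the sprite image addressed by an xywh media fragment (pixels).
interface Thumb {
  url: string;
  x: number;
  y: number;
  w: number;
  h: number;
}

// Convert a WebVTT timestamp ("mm:ss.ttt" or "hh:mm:ss.ttt") to seconds.
function toSeconds(stamp: string): number {
  const parts = stamp.trim().split(":").map(Number);
  return parts.reduce((total, part) => total * 60 + part, 0);
}

// Parse only the cue blocks of a WebVTT file; everything else is skipped.
function parseCues(vtt: string): Cue[] {
  const cues: Cue[] = [];
  const blocks = vtt.replace(/\r\n?/g, "\n").split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n").filter((l) => l.trim() !== "");
    let timing = -1;
    for (let i = 0; i < lines.length; i++) {
      if (lines[i].indexOf("-->") !== -1) {
        timing = i;
        break;
      }
    }
    if (timing === -1) continue; // e.g. the "WEBVTT" header block
    const [start, rest] = lines[timing].split("-->");
    const end = rest.trim().split(/\s+/)[0]; // drop any cue settings
    cues.push({
      start: toSeconds(start),
      end: toSeconds(end),
      text: lines.slice(timing + 1).join("\n"),
    });
  }
  return cues;
}

// Split a payload such as "thumbs.jpg#xywh=160,0,160,90" into URL and region.
function parseThumb(payload: string): Thumb | null {
  const match = /^(.*)#xywh=(\d+),(\d+),(\d+),(\d+)$/.exec(payload.trim());
  if (!match) return null;
  return {
    url: match[1],
    x: Number(match[2]),
    y: Number(match[3]),
    w: Number(match[4]),
    h: Number(match[5]),
  };
}

// Find the thumbnail to show while hovering at `time` seconds.
function thumbAt(cues: Cue[], time: number): Thumb | null {
  for (const cue of cues) {
    if (time >= cue.start && time < cue.end) return parseThumb(cue.text);
  }
  return null;
}

// Hypothetical thumbnail track: two 160x90 tiles, five seconds each.
const sampleVtt = [
  "WEBVTT",
  "",
  "00:00.000 --> 00:05.000",
  "thumbs.jpg#xywh=0,0,160,90",
  "",
  "00:05.000 --> 00:10.000",
  "thumbs.jpg#xywh=160,0,160,90",
].join("\n");

const hoverThumb = thumbAt(parseCues(sampleVtt), 7);
```

A player could then render the result by using the sprite as a CSS background image on a fixed-size element and offsetting it by the negative x and y values. In a browser with native text track support, the cues would come from the track's cue list instead of a hand-written parser.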

Other Developments

Another interesting subject at FOMS was the implementation of adaptive streaming in JavaScript. A team of Chrome engineers presented a new demo of the Media Source API, which allows arbitrary chunks of video to be appended to an already playing stream. As we noted in our HTML5 progress update last year, web developers can use this API to create adaptive streaming applications using fMP4 or WebM video fragments (TS may be coming too, for anyone interested in building HLS support in HTML5). The API can be enabled in Chrome 23 using chrome://flags.
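To make the adaptive part concrete, here is a minimal sketch of the decision a JavaScript player has to make before each fragment request: which quality level to fetch next. It is not the code from the Chrome demo and it does not call the Media Source API itself. The rendition names, bitrates, function names, and the 0.8 safety factor are all assumptions for illustration. The chosen fragment would then be downloaded and appended to the playing stream through the API.

```typescript
// One quality level of the same video; bitrate in bits per second.
interface Rendition {
  name: string;
  bitrate: number;
}

// Estimate throughput from the last fragment download.
function throughputBps(bytes: number, durationMs: number): number {
  return (bytes * 8 * 1000) / durationMs;
}

// Pick the highest rendition that fits within a safety margin of the
// measured throughput; fall back to the lowest one if none fits.
function pickRendition(
  renditions: Rendition[],
  measuredBps: number,
  safety: number = 0.8
): Rendition {
  if (renditions.length === 0) throw new Error("no renditions available");
  const sorted = renditions.slice().sort((a, b) => a.bitrate - b.bitrate);
  let choice = sorted[0];
  for (const rendition of sorted) {
    if (rendition.bitrate <= measuredBps * safety) choice = rendition;
  }
  return choice;
}

// Hypothetical bitrate ladder for this sketch.
const ladder: Rendition[] = [
  { name: "low", bitrate: 400000 },
  { name: "medium", bitrate: 1200000 },
  { name: "high", bitrate: 2500000 },
];

// Last fragment: 250,000 bytes in one second, i.e. 2 Mbit/s measured.
const nextRendition = pickRendition(ladder, throughputBps(250000, 1000));
```

The safety factor leaves headroom so that a small drop in bandwidth does not immediately stall playback. Real players typically also smooth the throughput estimate over several fragments and take the buffer level into account.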

Many other interesting developments, like the new Opus audio codec and the EME content protection scheme, were covered. The FOMS website contains the detailed notes for all of our sessions. In summary, it became clear at FOMS 2012 that a lot is going on in the open media scene, and many great tools are yet to come.