
Web Posted on: November 20, 1998


Recent Developments in Accessible Web-based Multimedia

Geoff Freed
Project Manager, Web Access Project
CPB/WGBH National Center for Accessible Media
WGBH Educational Foundation
125 Western Ave.
Boston, MA 02134
voice/tty: 617 492-9258
fax: 617 782-2155
e-mail: geoff_freed@wgbh.org

Introduction

For millions of Americans, the World Wide Web is an exciting new tool for learning and communicating. For millions of disabled computer users, however, the Web's enhanced graphics, audio, and video capabilities are out of reach. The Web Access Project, begun in 1996 by the CPB/WGBH National Center for Accessible Media (NCAM), is part of the global effort to lower or remove the Web's accessibility barriers. NCAM is a research and development facility dedicated to the issues of media technology for people with disabilities in their homes, workplaces, schools and communities. NCAM is the latest media access initiative of WGBH, Boston's public broadcaster, which founded The Caption Center in 1972 and Descriptive Video Service in 1990. Drawing on this background in making media content accessible, NCAM has focused largely, but not exclusively, on this aspect of Web access.

This paper will describe NCAM's efforts to make Web-based multimedia more accessible to users who are deaf, hard of hearing, blind or visually impaired. In order to make multimedia more accessible, NCAM has developed techniques which apply broadcast-based accessibility technologies-- closed captions and audio descriptions-- to the Web. As of this writing, NCAM has experimented with at least four methods, using Apple's QuickTime (TM) software, Microsoft's Synchronized Accessible Media Interchange (SAMI) format, the World Wide Web Consortium's (W3C) Synchronized Multimedia Integration Language (SMIL), and WGBH's MAGpie authoring software.


QuickTime and Captioning

Apple's QuickTime 3.0 and MoviePlayer, which comes with QuickTime, allow captions and descriptions to be added to a movie using either a Macintosh or PC. Previous versions of QuickTime software may also be used, but only on the Macintosh platform. Whatever version of QuickTime is used for creation, though, the end result may be played back on either a Macintosh or PC.

A QuickTime movie is made up of separate video and audio tracks. At least one multimedia player (MoviePlayer version 2.1 or higher) allows the user to toggle the tracks on and off. Because the tracks are discrete, a movie may have multiple audio and video tracks, any number of which may be selected by the user. For example, if a movie carries audio tracks in several languages, the user can select the appropriate language track at the time of playback.

In addition to video and audio tracks, multiple text tracks may be included with the clip. A text track becomes, for access purposes, a caption track, but can also be used to provide foreign-language subtitles or even to serve as a keyword-searchable index to the clip. If the user views the movie clip directly from the Web site using streaming software, the caption track is open-- that is, it can't be turned off. However, if the clip is downloaded and played locally using QuickTime software, the caption track may be toggled on or off, thus simulating closed captions. (Note: QuickTime 3.0 allows this toggling on either the Macintosh or PC; previous versions of QuickTime allow toggling only on the Macintosh.) If the clip is downloaded and played using any other multimedia player, the captions remain open.

A captioned movie clip, therefore, contains the normal video and audio tracks plus the additional text track. Unlike broadcast captions, which obscure a portion of the visible picture, captioned movie clips display the text track in a small window below the video (although QuickTime 3.0 allows the window to be positioned virtually anywhere). In its experiments, NCAM was able to fit approximately 19 rows of text below a movie clip before running out of space on the computer monitor. However, displaying more than three rows of text at once may prove impractical as the viewer may have difficulty reading the captions and keeping up with the video.
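The three-row guideline above can be illustrated with a minimal Python sketch that breaks a long caption into successive blocks of at most three rows. The function name split_caption and the 32-character row width are assumptions chosen for illustration; neither comes from QuickTime or from NCAM's process.

```python
import textwrap


def split_caption(text, width=32, max_rows=3):
    """Split caption text into blocks of at most max_rows rows.

    width and max_rows are illustrative defaults, not QuickTime settings.
    Each block is a list of rows meant to be shown on screen together.
    """
    rows = textwrap.wrap(text, width=width)
    # Group consecutive rows into blocks of max_rows.
    return [rows[i:i + max_rows] for i in range(0, len(rows), max_rows)]


if __name__ == "__main__":
    caption = (
        "Brief narration describing key visual elements is inserted "
        "into the pauses in the dialog, which makes the action of a "
        "movie clip easier to follow."
    )
    for number, block in enumerate(split_caption(caption), start=1):
        print(f"Block {number}:")
        for row in block:
            print("  " + row)
```

Each block would then be given its own time stamp in the text track, so that the viewer never sees more than three rows at once.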

Sample captioned QuickTime movie clips and step-by-step details of the captioning process may be found at NCAM's Web site.


QuickTime and Audio Descriptions

Not only is it possible to add text tracks to a QuickTime movie clip, it is also possible to add extra audio tracks-- specifically, an audio description track, which increases a movie clip's accessibility for people who are blind or visually impaired. Audio descriptions of QuickTime clips are similar to those found on certain television programs or home videos. Brief narration describing key visual elements is inserted into the pauses in the dialog. This narration makes it easier for blind or visually impaired users to follow the action of a movie clip. The narration track is recorded separately and, using QuickTime's MoviePlayer 2.1 or greater, pasted into the movie. (QuickTime 3.0's MoviePlayer will add sound using either a PC or Macintosh; earlier versions of MoviePlayer will add sound using a Macintosh only.) As with the caption track in a captioned QuickTime movie, the user may toggle the audio description track on and off, depending on the movie playback device being used. To view several different examples of described movie clips, visit NCAM's Web site. Instructions on creating described movie clips can be found there as well.


Microsoft SAMI

While QuickTime captioning and description methods require authors to encode accessibility features into the multimedia file itself, research is underway to simplify this process. In the fall of 1998, Microsoft(R) released a new accessibility authoring format and associated tools called the Synchronized Accessible Media Interchange (SAMI) format. SAMI synchronizes the primary media (video, for example) with externally stored and referenced caption or audio description content. Because SAMI is based on HTML, it can be adopted easily by those already familiar with Web-page authoring. This also allows developers to easily add or point to captioning content for Web-based or offline multimedia, such as CD-ROM. SAMI files are text files, so they can be read by any operating system. Caption or description files can be stored and transmitted from the same location as the primary media or can be played in sync with media originating anywhere on the Web, as long as the time codes and references are properly matched. The SAMI file format specification is available to the public as an open standard (no licensing fees).

A SAMI captioning or description file contains timecode information which corresponds to elapsed time in a multimedia source file, such as audio, video or animation. The source file can be played by Microsoft's Media Player, which synchronizes it with a SAMI file to render the captions or descriptions at the appropriate time. The user can toggle on or off either the captions or descriptions. Users also have great flexibility in adjusting the appearance and presentation of captions to suit their needs and preferences. SAMI supports captioning in multiple languages, and is also well suited for synchronized text highlighting. For more information, including sample SAMI multimedia clips, visit the Microsoft Accessibility site.
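To make the relationship between timecodes and caption text concrete, here is a minimal Python sketch that writes a SAMI-style caption file from a list of timed cues. The element layout (SYNC elements with a Start time in milliseconds, and a style class marking the caption language) follows commonly published SAMI samples rather than anything quoted in this paper, so treat the exact markup, the class name ENUSCC, and the function name build_sami as assumptions.

```python
from html import escape


def build_sami(cues, title="Captions", class_name="ENUSCC", lang="en-US"):
    """Return a SAMI-style caption document as a string.

    cues is a list of (start_ms, text) pairs, where start_ms is the
    elapsed time in the media file at which the caption should appear.
    An empty text clears the caption area.
    """
    lines = [
        "<SAMI>",
        "<HEAD>",
        f"<TITLE>{escape(title)}</TITLE>",
        '<STYLE TYPE="text/css"><!--',
        "P { font-family: Arial; color: white; background-color: black; }",
        f".{class_name} {{ Name: English Captions; lang: {lang}; SAMIType: CC; }}",
        "--></STYLE>",
        "</HEAD>",
        "<BODY>",
    ]
    for start_ms, text in sorted(cues):
        # A non-breaking space is the usual way to blank the caption area.
        body = escape(text) if text else "&nbsp;"
        lines.append(f"<SYNC Start={int(start_ms)}>")
        lines.append(f"<P Class={class_name}>{body}")
    lines += ["</BODY>", "</SAMI>"]
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [(0, "Hello & welcome."), (3000, ""), (4000, "Let's begin.")]
    print(build_sami(sample))
```

Because the output is ordinary text, it can be stored alongside the media file or on another server, as described above; the player is responsible for matching each Start time against the elapsed time of the clip.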


The W3C's Synchronized Multimedia Integration Language (SMIL)

To ease the authoring process of TV-like multimedia presentations on the Web, the W3C has designed the Synchronized Multimedia Integration Language (SMIL). Released in June of 1998, SMIL allows for the creation of time-based, streaming [...] audio descriptions. WGBH's experience in captioning thousands of clips for Microsoft Encarta, as well as providing descriptive narration for DVDs, has helped inform the design of the tool. MAGpie will be available from the NCAM Web site.


Benefits of Accessible Movie Clips

Deaf and hard-of-hearing Web users are the immediate and obvious beneficiaries of captioned movie clips. However, the benefits extend beyond this audience. Those using computers which lack sound capability, for example, can view captioned clips and follow the soundtrack visually rather than aurally. Also, as many educators have already discovered, captions used in conjunction with both audio and video can be a valuable tool for improving reading skills of children and adults.

A captioned movie's text track can also be used as a reference tool: some movie players have a "search" feature which allows the user to scan the text track for a specific keyword or phrase, making it easy to locate a specific spot in the movie clip. Depending on the software, this search function works even when a text track is hidden.
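The search feature described above can be sketched in a few lines of Python. This is an illustration of the idea only, not the behavior of any particular movie player; the cue structure and the names find_cues and format_time are hypothetical.

```python
def find_cues(cues, phrase):
    """Return every (start_seconds, text) cue whose text contains phrase.

    The match ignores case. cues is a list of (start_seconds, text)
    pairs taken from a movie's caption track.
    """
    needle = phrase.casefold()
    return [(start, text) for start, text in cues if needle in text.casefold()]


def format_time(seconds):
    """Format a number of seconds as MM:SS for display."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    caption_track = [
        (0, "Welcome to the aquarium."),
        (12, "The whale shark is the largest fish."),
        (75, "Whales, however, are mammals."),
    ]
    for start, text in find_cues(caption_track, "whale"):
        print(format_time(start), text)
```

A player would then jump to the start time of the chosen match. The search reads the cue data rather than the screen, which is why it can work even when the text track is hidden.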

Another useful feature of a captioned movie clip is the transcript which is generated as part of the captioning process. Displaying a link to the movie's transcript allows the user to read the text before deciding if it is worth the time to download and view the movie. At a minimum, transcripts may be used by those who do not have any video-playback capability, as a partial substitute for the clip itself. For maximum accessibility, transcripts should always be used in conjunction with audio-only clips.

As with captions, the benefits of audio descriptions reach beyond the primary audience of blind or visually impaired users. Preliminary research has shown that described movies or television programs can help reinforce concepts or vocabulary in classroom situations. The same can be true for Web-based multimedia. Even more importantly, a Web-based movie clip is not limited to being played only in real-time. That is, the clip may be stopped, started and randomly accessed at will, or different audio and/or video tracks may be paused while other tracks continue to play. For example, during a clip that deals with a complex math equation, the video may be paused while the audio-description track delivers an in-depth explanation of the equation displayed on the screen. When applied to science or math multimedia, this technique allows for greater understanding of concepts that might otherwise go by the viewer too quickly.

Adding captions or descriptions to Web-based multimedia has one further potential benefit-- preservation of bandwidth. As more and more people log on to the Web, and as content providers utilize byte-intensive multimedia, access for all users will become slower and slower. As accessibility technologies are perfected, however, users will be able to request and download specific media components. That is, a blind user will be able to ignore the video portion of a movie clip but retrieve the program audio and descriptions only, thus avoiding the transfer of large amounts of unneeded data. Likewise, a deaf user may only want to download the video and caption portions of a clip, ignoring all audio.
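The bandwidth argument can be made concrete with a little arithmetic. The Python sketch below uses invented component sizes for a single clip (the figures in TRACK_SIZES_KB are hypothetical, as are the function names) and compares what a blind user and a deaf user would each need to download against the size of the full clip.

```python
# Hypothetical component sizes, in kilobytes, for one clip.
# These figures are illustrative and are not measurements from NCAM.
TRACK_SIZES_KB = {"video": 4200, "audio": 600, "description": 250, "captions": 4}


def tracks_for(video=True, audio=True, captions=False, description=False):
    """Return the names of the media components a user asks for."""
    wanted = {
        "video": video,
        "audio": audio,
        "captions": captions,
        "description": description,
    }
    return [name for name, selected in wanted.items() if selected]


def transfer_kb(tracks, sizes=TRACK_SIZES_KB):
    """Total kilobytes transferred for the selected components."""
    return sum(sizes[name] for name in tracks)


if __name__ == "__main__":
    full = transfer_kb(TRACK_SIZES_KB)  # every component
    blind_user = tracks_for(video=False, description=True)
    deaf_user = tracks_for(audio=False, captions=True)
    for label, tracks in (("blind user", blind_user), ("deaf user", deaf_user)):
        needed = transfer_kb(tracks)
        print(f"{label}: {tracks} -> {needed} KB of {full} KB")
```

With these invented sizes, the blind user's selection omits the video, which is by far the largest component, while the deaf user's selection adds only a few kilobytes of caption text to the video.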

For more information on multimedia access technology, visit the NCAM Web site.