NMIS Final Report ©1997

NMIS Project Final Report 1993 - 1997

3.3 Video Magazine and Library Services


Introduction

Network access to multimedia libraries will bring new benefits to K-12 schools nationwide. For instructors who need to assemble current, high-quality materials quickly, network-accessible libraries offer new opportunities to enhance student achievement. Technological improvements will also soon allow archived multimedia information to be retrieved in real time and either composed automatically for students or manipulated by them.

Accomplishments

The NMIS Project Team at MIT, in cooperation with Turner Educational Services, Inc. (TESI), has successfully supported the implementation of a World Wide Web (WWW) accessible multimedia magazine and video library service. For more than two years, the INTERNET CNN Newsroom Video Magazine has been generated automatically each day from the traditional analog version of the CNN Newsroom program by capturing and processing the same video that is delivered to schools for use with traditional video equipment. In addition, the entire archive has been made available in the searchable INTERNET CNN Newsroom Video Library [5].

INTERNET CNN Newsroom made particularly significant contributions to understanding the production and delivery requirements of multimedia information services. One accomplishment in this area has been the successful demonstration of techniques to automatically capture, encode, index and segment digital video. These automatic capturing and encoding processes have been complemented with processes to automatically generate and assemble the INTERNET CNN Newsroom Digital Video Magazine on a daily basis. Each weekday morning, a sequence of processes executes to assemble the video, images, and text to dynamically generate an HTML document distributed on the WWW. Furthermore, the tools developed for INTERNET CNN Newsroom were designed as generic applications and have also been used to automatically encode content for the health care information products program.

Two complementary software mechanisms, Live Media Display and Hierarchical Media Distribution, were developed early in the NMIS Project for use with INTERNET CNN Newsroom to address two problems: the time required to download a file to local disk, and the amount of local storage required to buffer the video before playback [28]. Live Media Display was one of the earliest attempts to enable a Windows-based browser to support continuous video delivery (streaming). It saved time and disk space by allowing a video file to be played while it was being received from the network, instead of waiting for the entire file to be received and stored [29]. However, with 1.5 Mbps video streams, sufficient bandwidth to support Live Media Display was not always available. Hierarchical Media Distribution (caching close to the client) addressed this problem: even when there was not enough bandwidth to sustain the full bit rate of a video file delivered as a continuous stream from the main server, the first portion of the media file could be temporarily buffered and then delivered based on an intelligent estimation of the total file size, the average bit rate achieved thus far, and the amount already received. Later in the project, NMIS deployed workstations for use with INTERNET CNN Newsroom configured with NetPlay, a revised version of Live Media Display that could cache a few gigabytes to support both immediate delivery and short-term storage of recently used files. The INTERNET CNN Newsroom Video Magazine exhibited particularly good short-term cache hit rates because the most frequently requested video clips were those from the last few days' news [5].
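
The buffering decision described above can be illustrated with a short sketch. It is not the NMIS implementation: it assumes a constant playback bit rate, assumes that the average rate achieved so far will continue, and uses hypothetical function names. Under those assumptions, playback can start once the remaining download time no longer exceeds the playing time of the whole file.

```python
import math


def bytes_needed_before_playback(total_bytes, playback_bps, avg_received_bps):
    """Bytes to buffer so the download finishes no later than playback does.

    Assumption (not from the report): constant playback bit rate and a
    network rate that stays at the average observed so far.
    """
    if total_bytes < 0 or playback_bps <= 0 or avg_received_bps < 0:
        raise ValueError("sizes and rates must be non-negative, playback rate positive")
    if avg_received_bps >= playback_bps:
        return 0  # the network keeps up, so no pre-buffering is needed
    # Solve (total - buffered) * 8 / avg_received_bps <= total * 8 / playback_bps
    return math.ceil(total_bytes * (1 - avg_received_bps / playback_bps))


def can_start_playback(total_bytes, received_bytes, playback_bps, avg_received_bps):
    """True once enough of the file has arrived to play without stalling."""
    needed = bytes_needed_before_playback(total_bytes, playback_bps, avg_received_bps)
    return received_bytes >= needed
```

For a 10,000,000-byte file encoded at 1.5 Mbps and arriving at an average of 750 kbps, this sketch asks for half of the file (5,000,000 bytes) before playback begins.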

Beyond achieving production-level performance as a daily educational information service, the project has archived approximately 300 megabytes per day of INTERNET CNN Newsroom program segments, producing one of the most extensive digital video libraries available on the Internet. As of the summer of 1997, more than two years of CNN Newsroom programs have been captured and stored on the NMIS server. Of these, six to nine months have been available in an archive to users on the Internet at any given time (approximately 60 gigabytes). Storage improvements may soon expand access from the Internet to the full archive (approximately 150 gigabytes). This digital video testbed has proven to be large enough to test many of the infrastructure and policy questions that were at the heart of the NMIS Project. Efforts to explore distribution performance using unusually high bandwidth were especially facilitated by the use of experimental cable modems, which were installed in a school in Lexington, MA, as early as 1994.
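
The storage figures above can be cross-checked with simple arithmetic. The sketch below assumes five program days per week with no breaks and 1 gigabyte = 1000 megabytes; both are simplifying assumptions, not figures from the report.

```python
MB_PER_DAY = 300        # from the report: approximately 300 megabytes per day
PROGRAM_DAYS_PER_WEEK = 5  # assumption: one program each weekday, no breaks


def archive_size_gb(weeks):
    """Approximate archive size in gigabytes after the given number of weeks."""
    return MB_PER_DAY * PROGRAM_DAYS_PER_WEEK * weeks / 1000


six_months = archive_size_gb(26)    # about half a year of weekday programs
nine_months = archive_size_gb(39)   # about three quarters of a year
two_years = archive_size_gb(104)    # two full years
```

Under these assumptions, six to nine months comes to roughly 39 to 58.5 gigabytes and two years to about 156 gigabytes, which is in line with the approximate 60 and 150 gigabyte figures quoted above.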

Experience with the INTERNET CNN Newsroom project showed that permanently segmenting large blocks of digital video into a series of discrete files could not accommodate flexible repurposing of variable and overlapping sections. To address this problem, StreamObjects was developed to make it possible to distribute dynamically segmented digital video over the Internet using the conventional client/server technologies of the WWW. This software allows a client to download a dynamically created media stream containing only the precise segment of video and/or audio requested, without creating another file on the server or requiring specialized media servers. With this approach, it is unnecessary to decide during encoding how to segment the file, because no mechanical process is required to re-segment a file into smaller, larger or overlapping segments [20].
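
The general idea of serving a segment on request, without writing a new file, can be sketched as follows. This is an illustration only and not the StreamObjects design: the `frame_offsets` table (byte offset of each frame, plus a final end-of-file offset) and the function name are hypothetical, and a real MPEG stream would also need attention to headers and inter-frame dependencies, which the sketch ignores.

```python
import io
from typing import BinaryIO, Iterator, Sequence


def stream_segment(media: BinaryIO, frame_offsets: Sequence[int],
                   start_frame: int, end_frame: int,
                   chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the bytes of frames [start_frame, end_frame) from one large file.

    frame_offsets[i] is the byte offset of frame i; the last entry is the
    end-of-file offset (a hypothetical index, built ahead of time).
    """
    if not 0 <= start_frame < end_frame <= len(frame_offsets) - 1:
        raise ValueError("frame range is outside the indexed file")
    begin = frame_offsets[start_frame]
    remaining = frame_offsets[end_frame] - begin
    media.seek(begin)
    while remaining > 0:
        chunk = media.read(min(chunk_size, remaining))
        if not chunk:  # file shorter than the index claims
            break
        remaining -= len(chunk)
        yield chunk


# Toy "video": four frames of four bytes each.
toy_media = io.BytesIO(b"AAAABBBBCCCCDDDD")
toy_offsets = [0, 4, 8, 12, 16]
middle = b"".join(stream_segment(toy_media, toy_offsets, 1, 3))
overlap = b"".join(stream_segment(toy_media, toy_offsets, 2, 4))
```

Because each request is answered from the same source file, the two overlapping requests above need no re-segmentation step and leave no extra files behind.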

Future Directions

A popular component of the INTERNET CNN Newsroom site has been the searchable library of video segments. The key to the archive's functionality has been storing video files with files containing the corresponding closed captioned text for use by a commercial search engine to identify relevant small segments of video. The shortcoming of this approach is that it does not function well in a model which employs large dynamically segmented video files accessed by exact frame numbers rather than a series of small, hard segmented MPEG files accessed by name. When dynamic segmentation is used, each time a search engine indexes a file, it aggregates entire programs rather than specific segments. Using a relational database such as Oracle for managing the relationships between video file time coding and closed captioned text can offer an alternative approach for replicating the function of searching for exact segments.
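
A minimal sketch of the relational approach follows. It uses Python's built-in `sqlite3` module as a stand-in for a database such as Oracle, and the table layout, column names, file names and caption text are all hypothetical. Each row ties a span of frames in a large video file to its caption text, so a text search returns exact segments rather than whole programs.

```python
import sqlite3


def build_caption_index(rows):
    """Create an in-memory table of (video_file, start_frame, end_frame, caption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE captions ("
        "video_file TEXT, start_frame INTEGER, end_frame INTEGER, caption TEXT)"
    )
    conn.executemany("INSERT INTO captions VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def find_segments(conn, term):
    """Return (video_file, start_frame, end_frame) for captions containing term.

    Note: '%' and '_' inside term act as LIKE wildcards in this simple sketch.
    """
    cur = conn.execute(
        "SELECT video_file, start_frame, end_frame FROM captions "
        "WHERE caption LIKE ? ORDER BY video_file, start_frame",
        (f"%{term}%",),
    )
    return cur.fetchall()


# Hypothetical sample data for illustration only.
sample_rows = [
    ("1997-06-02.mpg", 0, 900, "Today: flooding along the river"),
    ("1997-06-02.mpg", 900, 2100, "Space shuttle launch delayed"),
    ("1997-06-03.mpg", 300, 1500, "Shuttle crew prepares for launch"),
]
index = build_caption_index(sample_rows)
hits = find_segments(index, "shuttle")
```

The frame ranges returned by such a query could then be passed to a dynamic segmentation service to retrieve only the matching portion of each program.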

While the availability of closed captioned text has been an advantage for INTERNET CNN Newsroom, this model has limited transferability because it ultimately depends upon the availability of hand generated text. The replacement of closed captioned text with automatically generated text would allow the system to be adapted to a much wider variety of situations. While there do not appear to be any immediate breakthroughs in affordable, automatic video image processing and recognition on the horizon, there are viable automatic speech to text systems. Two potential systems for prototype experiments are the Sphinx speech recognition software available from Carnegie Mellon's Speech Group [31] and HARK Recognizer(TM) available from BBN Systems and Technologies Speech Solutions Group [1]. In the future, the incorporation of automatic speech to text software packages like these would make it possible to automatically generate a textual representation of the contents of the audio stream of a video file.

Cataloging digital video using a transcription of the audio track has been functional for the INTERNET CNN Newsroom Video Library, but it remains a relatively limited mechanism for cataloging the rich content of digital video for use in access or repurposing. A simple illustration of this point is the case of finding a video in which there is only instrumental music rather than spoken audio. Smart Video Markup Language (SVML) is a richer mechanism for cataloging video files which has recently been developed in association with the INTERNET CNN Newsroom component of the NMIS Project [24]. SVML is a "hierarchical content markup language" derived from SGML which makes it easy to retain a basic set of header information like author and copyright when a video file is segmented. A more advanced benefit of SVML's hierarchical nature is that authors might embed many alternative hierarchies in a digital video file and adjust the beginning and end points of those hierarchies. Future work in this area includes building editors to embed SVML directly in the MPEG file so that it can be easily transmitted, decoded, parsed and used to automatically populate databases for use in digital video file storage and retrieval.

The current approach for distributing INTERNET CNN Newsroom Video Magazine only supports user-initiated requests for immediate delivery or caching. A more convenient approach would be to use a "push" mechanism to allow appropriately programmed caches to "subscribe" and automatically receive a day's files based upon a preprogrammed delivery from the server.

In both the user request and subscription models described so far, normal IP Unicast for file delivery has been assumed. This means that a complete copy of each file must be distributed over the Internet for every client request. While the INTERNET CNN Newsroom Video Magazine has been delivered on a daily basis via Unicast to a small number of pilot sites, server and bandwidth constraints have required deliberate limits to be placed on the number of sites accessing the server at any given time. Implementing IP Multicast in the future could dramatically reduce the bandwidth needed to provide wider daily distribution. This technology allows a single copy of a file to be sent to a large number of specified caching servers (subscribers) because the file can be automatically propagated by routers. A subscription model using IP Multicast is particularly useful for avoiding real-time bandwidth constraints because it can also use the multicast files to "prime" caches instead of sending files directly to client machines. Once primed, the caches can provide continuous streams of video to clients on demand without relying on the main server. This proposed distribution model is illustrated in Figure 3 below.

Figure 3: Proposed NMIS Media Distribution Model
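
The bandwidth argument for IP Multicast can be made concrete with a small calculation. The sketch below compares the volume of data leaving the main server per day under the two models; the function name and the figure of 20 subscribing caches are hypothetical, and the calculation ignores retransmissions and protocol overhead.

```python
def server_egress_mb(daily_mb, subscriber_count, multicast):
    """Megabytes the main server sends per day to prime all subscribing caches."""
    if daily_mb < 0 or subscriber_count < 0:
        raise ValueError("inputs must be non-negative")
    if subscriber_count == 0:
        return 0
    # Unicast sends one full copy per subscriber; multicast sends one copy
    # in total and relies on routers to replicate it toward each subscriber.
    return daily_mb if multicast else daily_mb * subscriber_count


unicast_total = server_egress_mb(300, 20, multicast=False)
multicast_total = server_egress_mb(300, 20, multicast=True)
```

With the report's figure of approximately 300 megabytes per day and 20 assumed caches, Unicast requires about 6000 megabytes of server output per day, while Multicast requires about 300 regardless of the number of subscribers.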