Network Access to Multimedia Information - Summary Introduction This document is a summary of a report entitled "Network Access to Multimedia Information", by Chris Adie. The full report is available at URL ftp://ftp.ed.ac.uk/pub/mmaccess. The report and this summary are Copyright 1993 RARE. The full report contains a disclaimer which also applies to this summary. The report is concerned with issues in the intersection of networked information retrieval, database and multimedia technologies. It aims to establish research and academic user requirements for network access to multimedia data, to look at existing systems which offer partial solutions, and to identify what needs to be done to satisfy the most pressing requirements. User Requirements There are a number of reasons why multimedia data may need to be accessed remotely (as opposed to physically distributing the data, eg on CD-ROM). These reasons centre on the cost of physical distribution, versus the timeliness of network distribution. Of course, there is a cost associated with network distribution, but this tends to be hidden from the end user. User requirements have been determined by studying existing and proposed projects involving networked multimedia data. It has proved convenient to divide the applications into four classes according to their requirements: multimedia database applications, academic (particularly scientific) publishing applications, cal (computer-aided learning), and general multimedia information services. Database applications typically involve large collections of monomedia (non-text) data with associated textual and numeric fields. They require a range of search and retrieval techniques. Publishing applications require a range of media types, hyperlinking, and the capability to access the same data using different access paradigms (search, browse, hierarchical, links). Authentication and charging facilities are required. Cal applications require sophisticated presentation and synchronisation capabilities, of the type found in existing multimedia authoring tools. Authentication and monitoring facilities are required. General multimedia information services include on-line documentation, campus-wide information systems, and other systems which don't conveniently fall into the preceding categories. Hyperlinking is perhaps the most common requirement in this area. The analysis of these application areas allows a number of important user requirements to be identified: Support for the Apple Macintosh, UNIX and PC/MS Windows environments. Support for a wide range of media types - text, image, graphics and application-specific media being most important, followed by video and sound. Support for hyperlinking, and for multiple access structures to be built on the same underlying data. Support for sophisticated synchronisation and presentation facilities. Support for a range of database searching techniques. Support for user annotation of information, and for user-controlled display of sequenced media. Adequate responsiveness - the maximum time taken to retrieve a node should not exceed 20s. Support for user authentication, a charging mechanism, and monitoring facilities. The ability to execute scripts. Support for mail-based access to multimedia documents, and (where appropriate) for printing multimedia documents. Powerful, easy-to-use authoring tools. Existing Systems The main information retrieval systems in use on the Internet are Gopher, Wais, and the World-Wide Web. All work on a client-server paradigm, and all provide some degree of support for multimedia data. Gopher presents the user with a hierarchical arrangement of nodes which are either directories (menus), leaf nodes (documents containing text or other media types), or search nodes (allowing some set of documents to be searched using keywords, possibly using WAIS). A range of media types is supported. Extensions currently being developed for Gopher (Gopher+) provide better support for multimedia data. Gopher has a very high penetration (there are over 1000 Gopher servers on the Internet), but it does not provide hyperlinks and is inflexibly hierarchical. Wais (Wide Area Information Server) allows users to search for documents in remote databases. Full-text indexing of the databases allows all documents containing particular (combinations of) words to be identified and retrieved. Non-text data (principally image data) can be handled, but indexing such documents is only performed on the document file name, severely limiting its usefulness. However, WAIS is ideally suited to text search applications. World-Wide Web (WWW) is a large-scale distributed hypermedia system. The Web consists of nodes (also called documents) and links. Links are connections between documents: to follow a link, the user clicks on a highlighted word in the source document, which causes the linked-to document to be retrieved and displayed. A document can be one of a variety of media types, or it can be a search node in a similar sense to Gopher. The WWW addressing method means that WAIS and Gopher servers may also be accessed from (indeed, form part of) the Web. WWW has a smaller penetration than Gopher, but is growing faster. The Web technology is currently being revised to take better account of the needs of multimedia information. These systems all go some way to meet the user requirements. Support for multiple platforms and for a wide range of media types (through "viewer" software external to the client program) is good. Only WWW has hyperlinks. There is little or no support for sophisticated presentation and synchronisation requirements. Support for database querying tends to be limited to "keyword" searches, but current developments in Gopher and WWW should make more sophisticated queries possible. Some clients support user annotation of documents. Response times for all three systems vary substantially depending on the network distance between client and server, and there is no support for isochronous data transfer. There is little in the way of authentication, charging and monitoring facilities, although these are planned for WWW. Scripting is not supported because of security issues WWW supports a mail responder. The only system sufficiently complex to warrant an authoring tool is WWW, which has editors to support its hypertext markup language. Research There are a number of research projects which are of significant interest. Hyper-G is an ambitious distributed hypermedia research project at the University of Graz. It combines concepts of hypermedia, information retrieval systems and documentation systems with aspects of communication and collaboration, and computer-supported teaching and learning. Automatic generation of hyperlinks is supported, and there is a concept of generic structures which can exist in parallel with the hyperlink structure. Hyper-G is based on UNIX, and is in use as a CWIS at Graz. Gateways between Hyper-G and WWW exist. Microcosm is a PC-based hypermedia system developed at the University of Southampton. It can be viewed as an integrating hypermedia framework - a layer on top of a range of existing applications which enables relationships between different documents to be established. Hyperlinks are maintained separately from the data. Networking support for Microcosm is currently under development, as are versions of Microcosm for the Apple Macintosh and for UNIX. Microcosm is currently being "commercialised". AthenaMuse 2 is an ambitious distributed hypermedia authoring and presentation system under development by a university/industry consortium based at MIT. It will have good facilities for presentation and synchronisation of multimedia data, strong authoring support, and will include support for networking isochronous data. It will be a commercial product. Initial versions will support UNIX and X windows, with a PC/MS Windows version following. Apple Macintosh support has lower priority. The European Commission sponsors a number of peripherally relevant projects through its Esprit and RACE research programmes. These programmes tend to be oriented towards commercial markets, and are thus not directly relevant. An exception is the Esprit IDOMENEUS project, which brings together workers in the database, information retrieval and multimedia fields. It is recommended that RARE establish a liaison with this project. There are a variety of other academic and commercial research projects which are also of interest. None of them are as directly relevant as those outlined above. Standards There are a number of existing and emerging standards for structuring hypermedia applications. Of these, the most important are SGML, HyTime, MHEG, ODA, PREMO and Acrobat. All bar the last are de jure standards, while Acrobat is a commercial product which is being proposed as a de facto standard. SGML (Standard Generalized Markup Language) is a markup language for delimiting the logical and semantic content of text documents. Because of its flexibility, it has become an important tool in hypermedia systems. HyTime is an ISO standardised infrastructure for representing integrated, open hypermedia documents, and is based on SGML. HyTime has great expressive power, but is not optimised for run-time efficiency. It is recommended that future RARE work on networked hypermedia should take account of the importance of SGML and HyTime. MHEG (Multimedia and Hypermedia information coding Experts Group) is a draft ISO standard for representing hypermedia applications in a platform-independent form. It uses an object-oriented approach, and is optimised for run-time efficiency. Full IS status for MHEG is expected in 1994. It is recommended that RARE keep a watching brief on MHEG. The ODA (Open Document Architecture) standard is being enhanced to incorporate multimedia and hypermedia features. However, interest in ODA is perceived to be decreasing, and it is recommended that ODA should not form a basis for further RARE work in networked hypermedia. PREMO is a new work item in the ISO graphics standardisation community, which appears to overlap with MHEG and HyTime. It is not clear that the PREMO work, which is at a very early stage, is worthwhile in view of the existence of those standards. Acrobat PDF is a format for representing multimedia (printable) documents in a portable, revisable form. It is based on Postscript, and is being proposed by Adobe Inc (originators of Postscript) as an industry standard. RARE should maintain awareness of this technology in view of its potential impact on multimedia information systems. There are various standards which have relevance to the way multimedia data is accessed across the network. Many of these have been described in a previous report [Adi93]. Two further access protocols are the proposed multimedia extensions to SQL, and the Document Filing and Retrieval protocol. Neither of these are likely to have major significance for networked multimedia information systems. Other standards of importance include: MIME, a multimedia email standard which defines a range of media types and encoding methods for those types which are useful in a wider context. AVIs (Audio-Visual Interactive services) and the associated multimedia scripting language SMSL, which form a standardisation initiative within CCITT (now ITU- TSS) to specify interactive multimedia services which can be provided across telephone/ISDN networks. There are two important trade associations which are involved in standardisation work. The Interactive Multimedia Association (IMA) has a Compatibility Project which is developing a specification for platform- independent interactive multimedia systems, including networking aspects. A newly-formed group, the Multimedia Communications Forum (MMCF), plans to provide input to the standards bodies. It is recommended that RARE become an Observing Member of the MMCF. Future Directions Three common design approaches emerge from the variety of systems and standards analysed in this report. They can be described in terms of distinctions between different aspects of the system: content is distinct from hyperstructure media type is distinct from media encoding data is distinct from protocol Distributed hypermedia systems are emerging from the research/development phase into the experimental deployment phase. However, the existing global information systems (Gopher, WAIS and WWW) are still largely limited to the use of external viewers for non- textual data. The most significant mismatches between the capabilities of currently-deployed systems and user requirements are in the areas of presentation and quality of service (ie responsiveness). Improving QOS is significantly more difficult than improving presentation capabilities, but there are a number of possible ways in which this could be addressed. Improving feedback to the user, greater multi-threading of applications, pre-fetching, caching, the use of alternative "views" of a node, and the use of isochronous data streams are all avenues which are worth exploring. In order to address these problems, it is recommended that RARE seek to adapt and enhance existing tools, rather than develop new ones. In particular, it is recommended that RARE select the World-Wide Web to concentrate its efforts on. The reasons for this choice revolve around the flexibility of the WWW design, the availability of hyperlinks, the existing effort which is already going into multimedia support in WWW, the fact that it is an integrating solution incorporating both WAIS and Gopher support, and its high rate of growth compared to Gopher (despite Gopher's wider deployment). Gopher is the main competitor to WWW, but its inflexibly hierarchical structure and the absence of hyperlinks make it difficult to use for highly-interactive multimedia applications. It is recommended that RARE should invite proposals for and subsequently commission work to: Develop conversion tools from commercial multimedia authoring packages to WWW, and accompanying authoring guidelines. Implement and evaluate the most promising ways of overcoming the QOS problem. Implement a specific user project using these tools, to validate that the facilities being developed are truly relevant to real applications. Use the experience gained to inform and influence the development of the WWW technology. Contribute to the development of PC/MS Windows and Apple Macintosh WWW clients, particularly in the multimedia data handling area. It is noted that the rapid growth of WWW may in the future lead to problems through the implementation of multiple, uncoordinated and mutually incompatible add-on features. To guard against this trend, it may be appropriate for RARE, in coordination with CERN and other interested parties such as NCSA, to: Encourage the formation of a consortium to coordinate WWW technical development.