CD-ROM Redigitization Project (CDRiP)

Dekker, Harrison
CDRiP began with a fairly narrow goal: to leverage low-cost disk storage and create an improved framework for storing, finding, and using numeric data published in CD and DVD format. A further goal was to employ emerging metadata standards rather than legacy or proprietary ones, to promote the extensibility of the system. Other goals have emerged as the project has progressed, most significantly the need to address the software and operating system dependencies often associated with these materials. This last issue is of particular importance given the prevalence of CD products that contain data in proprietary (or obsolete) formats that can only be accessed with the custom software applications accompanying the data on these disks.

As a small-scale, one-developer project, it was important to choose an approach that allowed a great amount of flexibility. Accordingly, we decided on what is commonly referred to as an iterative design process. Iterative design is defined as "a design methodology based on a cyclic process of prototyping, testing, analyzing, and refining a work in progress. In iterative design, interaction with the designed system is used as a form of research for informing and evolving a project, as successive versions, or iterations of a design are implemented." (http://www.gmlb.com/articles/iterativedesign.html) In essence, this approach has allowed us to begin production work on certain aspects of the project before implementation decisions on other aspects have been finalized.

In a nutshell, CDRiP is a framework for saving CD image files (ISO format) to a network file server and automatically generating metadata, both through interaction with the library catalog and through other programmed processes. Over a thousand CDs and DVDs have been successfully "redigitized" and, with their accompanying metadata files, added to the repository.
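As a rough illustration of the "image file plus generated metadata" idea, the following minimal Python sketch writes a small XML sidecar record next to an ISO image. It is not the project's actual code: the choice of Dublin Core elements, the unqualified `filename`, `size_bytes`, and `sha256` elements, and the function names `sha256_of`, `build_record`, and `write_record` are all assumptions made for this example, and the title and catalog identifier are passed in by hand rather than fetched from a library catalog.

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path

# Dublin Core is used here only as an example of a non-proprietary
# metadata vocabulary; the project's actual schema is not stated.
DC_NS = "http://purl.org/dc/elements/1.1/"


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_record(iso_path, title, catalog_id):
    """Build an XML element describing one ISO image (hypothetical layout)."""
    iso_path = Path(iso_path)
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")
    # Descriptive fields, supplied by the caller in this sketch.
    ET.SubElement(record, f"{{{DC_NS}}}title").text = title
    ET.SubElement(record, f"{{{DC_NS}}}identifier").text = catalog_id
    # Technical fields, derived from the image file itself.
    ET.SubElement(record, "filename").text = iso_path.name
    ET.SubElement(record, "size_bytes").text = str(iso_path.stat().st_size)
    ET.SubElement(record, "sha256").text = sha256_of(iso_path)
    return record


def write_record(iso_path, title, catalog_id):
    """Write the record next to the image as <name>.xml and return its path."""
    out_path = Path(iso_path).with_suffix(".xml")
    record = build_record(iso_path, title, catalog_id)
    ET.ElementTree(record).write(out_path, encoding="utf-8", xml_declaration=True)
    return out_path
```

Storing a checksum alongside each image is one common way to verify later that a stored copy is still intact; whether CDRiP records one is not stated in this abstract.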
Under development, and working in prototype, are procedures that allow an end user to remotely access the repository and, when needed, install applications in a controlled "virtual machine" environment. This approach provides an immediate solution to most of the problems associated with installing legacy software under a modern operating system. When all else fails, it also provides an environment in which software can be installed and run under the operating system version for which it was developed. At the planning stage are automated processes for flagging items for additional processing, such as inclusion in the Library's preservation repository workflow or publishing CD contents directly to the web. Eventually, an XML database for enhanced search and retrieval will be implemented.

Learning outcomes:
- Better understanding of the long-term preservation and access issues associated with CD-ROM collections
- Better understanding of how to apply non-MARC metadata in a library application
- Better understanding of virtual machine software and its relevance in the digital library