Open Repositories Conference

Series

Open Repositories Conference

Permanent Link

https://hdl.handle.net/1853/70945

Series Type

Event Series

Associated Organization(s)

Organizational Unit

Library

Publication Search Results

Now showing 1 - 10 of 132

When Ruby Met Fedora

(Georgia Institute of Technology, 2009-05-21) Zumwalt, Matthew

This session will provide technical, Fedora-specific background to complement the plenary presentation titled "Many Lightweight Views into Complex Repository Content". We will cover the purpose and features of ActiveFedora, looking at working applications and code samples to support the discussion. We will also discuss topics such as Content Modeling, Search & Indexing, and Security (Authentication + Authorization).
Making DSpace 1.5 Your Own: Customization tips & tricks

(Georgia Institute of Technology, 2009-05-21) Donohue, Tim

DSpace 1.5 represents a big step towards the future of DSpace software. Are you still trying to wrap your mind around 1.5 or wanting to ready yourself for an upgrade? Get a better understanding of the new DSpace 1.5 architecture, features and customization options! Learn how to customize DSpace 1.5 by taking advantage of the Configurable Submission system, the Configurable Browse system, Manakin (XMLUI), Maven and many other new features. This talk/tutorial will concentrate on what is newly available in DSpace 1.5, and how DSpace 1.5 can be customized in a more "modular" fashion. Although a brief introduction to new DSpace 1.5 features will be provided, the majority of the talk will concentrate on how you can customize DSpace in this new architecture (Maven + Manakin, especially). Examples of both minor and major customizations will be presented, based on upgrade experience in migrating a highly-customized version of DSpace 1.4.2 (IDEALS - http://www.ideals.illinois.edu) to DSpace 1.5 and Manakin.
PhotoCat: Implementing a Cataloging Tool for a Live Fedora Repository

(Georgia Institute of Technology, 2009-05-21) Ozakca, Muzaffer ; Dunn, Jon

In this presentation, we will discuss the development process of a metadata cataloging application called PhotoCat (short for Photo Cataloging Application) created by the Indiana University Digital Library Program to allow catalogers and archivists to easily enter and manage item-level MODS descriptive metadata for image collections in IU's Fedora repository. Although admittedly this is not a unique use of Fedora, we faced a few interesting challenges along the way in using Fedora as a storage backend for such a tool. PhotoCat provides access to metadata records for multiple collections in a flexible way. Even though they may all use the same metadata standard (e.g. MODS), each collection may use different subsets of the available elements or use elements in slightly different ways. PhotoCat, in addition to search, browse and user management capabilities, provides a customizable interface and metadata model that define a) the Web form that accepts user input and b) instructions for populating a metadata record from that form. Functionality similar to this is found both in Fez and Muradora, two popular Fedora front-ends. In this presentation, we will talk about how our implementation is different from these systems and the unique requirements that led to our current implementation. One of the challenges we ran into was related to batch updates. Users familiar with traditional online database applications expect updates to multiple records to be reflected nearly instantaneously. Updating a single element in a batch of XML metadata datastreams in Fedora, on the other hand, requires that each object's datastream be retrieved, updated, and stored back in the repository. One of the options for achieving a perceived real time operation was limiting updates to a small fixed number of records, but this wasn't acceptable for large collections with thousands of records that needed to be updated at the same time. 
We also considered placing a faster middleware layer between the application and Fedora, but ended up opting for an asynchronous, behind-the-scenes approach to this issue. Another big challenge was generating metadata records that are syntactically and semantically correct. One of the aims of the project was that the metadata generated by the application would obey established guidelines and best practices for a given metadata standard. We considered XForms for this purpose but ended up implementing our own display/model web component, and in this presentation, we will discuss the reasons for that choice.
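The asynchronous batch update described above (retrieve each object's datastream, update it, store it back, all without making the user wait) might look roughly like the following. This is a minimal sketch and not PhotoCat's actual code: the in-memory `repository` dictionary is a stand-in for Fedora, the names `update_element`, `worker`, and `batch_update` are hypothetical, and the MODS namespace is omitted for brevity.

```python
import queue
import threading
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Fedora: object id -> XML metadata datastream.
# Real MODS records are namespaced; the namespace is left out here for brevity.
repository = {
    "obj:1": "<mods><genre>photo</genre></mods>",
    "obj:2": "<mods><genre>photo</genre></mods>",
}


def update_element(xml_text, tag, new_value):
    """Return a copy of the datastream with every <tag> element set to new_value."""
    root = ET.fromstring(xml_text)
    for element in root.iter(tag):
        element.text = new_value
    return ET.tostring(root, encoding="unicode")


def worker(jobs):
    """Retrieve, update, and store each datastream, one object at a time."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work
            jobs.task_done()
            break
        object_id, tag, new_value = job
        repository[object_id] = update_element(repository[object_id], tag, new_value)
        jobs.task_done()


def batch_update(object_ids, tag, new_value):
    """Queue the updates and return immediately; a background thread applies them."""
    jobs = queue.Queue()
    threading.Thread(target=worker, args=(jobs,), daemon=True).start()
    for object_id in object_ids:
        jobs.put((object_id, tag, new_value))
    jobs.put(None)
    return jobs
```

A caller would invoke `batch_update(["obj:1", "obj:2"], "genre", "photograph")` and carry on; calling `join()` on the returned queue blocks until every queued update has been written back.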
Digital Repositories and the Semantic Web: Semantic Search and Navigation for DSpace

(Georgia Institute of Technology, 2009-05-21) Alexopoulos, Andreas D. ; Koutsomitropoulos, Dimitrios ; Papatheodorou, Theodore S. ; Solomou, Georgia D.

In many digital repository implementations, resources are described against some flavor of metadata schema, most commonly the Dublin Core Metadata Element Set (DCMES), as is the case with the DSpace system. However, such an approach cannot capture richer semantic relations that exist or may be implied, in the sense of a Semantic Web ontology. We therefore first suggest a method to semantically enrich the underlying data model and develop an automatic translation of the flatly organized metadata information to this new ontology. We then propose an implementation that provides for inference-based knowledge discovery, retrieval and navigation on top of digital repositories, based on this ontology. We apply this technique to real information stored in the University of Patras Institutional Repository, which is based on DSpace, and confirm that more powerful, inference-based queries can indeed be performed.
Unicorn: The myth of federated search realized simply. Unifying DSpace repositories with the PKP Harvester tool

(Georgia Institute of Technology, 2009-05-21) Davison, John ; Gilbertson, Keith

The Ohio Digital Resource Commons, located at http://drc.ohiolink.edu, is a union of DSpace repositories operated by higher education institutions in Ohio. The repositories are largely organized and supported by OhioLINK, a consortium of 89 Ohio college and university libraries. In support of the vision of the Digital Resource Commons as a statewide resource, the repository operators saw an immediate need for a federated search tool. A "build it now" approach was taken, and federated searching was implemented in a short timeframe at OhioLINK using the PKP Harvester (http://pkp.sfu.ca/?q=harvester) software. A demonstration of the federated search feature at the Digital Resource Commons will be given, highlighting local customizations that were made to PKP Harvester and DSpace in support of the project. These customizations include changes made to mimic the appearance and behavior of existing search interfaces at OhioLINK, and changes made to meet expressed user requirements. Particular attention will be given to a DSpace change that allows image thumbnails to be displayed in federated search results. Issues encountered during the configuration, implementation, and deployment of the PKP Harvester and DSpace OAI-PMH server will be presented, and the choices made in response to these issues will be explained. The process of integrating the search results with the DSpace interface will be detailed, including ongoing efforts to improve the user experience. The Digital Resource Commons' federated search was implemented as a metadata-based search. We will present a general comparison between metadata and full-text searching, highlighting the advantages and disadvantages of each method. A discussion of metadata uniformity and quality concerns will be presented in the context of federated searching. Particular problems encountered with our metadata will be described, with lessons learned and suggestions for resolution. 
Operational and maintenance concerns of this system will be discussed, including the metadata harvesting schedule and the need to flush and rebuild indexes when the metadata schema changes. Future ideas for the DRC's federated search feature will be explored, including an implementation of faceted searching using Solr; harvesting of non-DSpace repositories, such as CONTENTdm and Fedora; and, finally, the possibility of discarding the current model in favor of an OAI-ORE based system, developed for DSpace at Texas Digital Library, that allows for the possibility of full-text federated searching.
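Metadata-based federated search of this kind rests on harvesting records from each repository's OAI-PMH server. The sketch below illustrates the general shape of one harvesting step and is not the PKP Harvester's code: the function names, the base URL in the usage note, and the `SAMPLE_RESPONSE` document are hypothetical, while the `ListRecords` verb, the `oai_dc` metadata prefix, and the namespaces come from the OAI-PMH and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, resumption_token=None):
    """Build a ListRecords request; follow-up pages carry only the resumption token."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return base_url + "?" + urlencode(params)


def parse_records(response_xml):
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    return records


# Hypothetical response, trimmed to the elements the parser reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:123/1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample item</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""
```

A harvester would fetch `list_records_url("http://example.org/oai/request")`, pass the response body to `parse_records`, index the results, and repeat with the resumption token until the server stops returning one.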
Disseminating Broadcast Archives: Exposing WGBH Materials for Scholarly Use

(Georgia Institute of Technology, 2009-05-21) Beer, Chris ; Michael, Courtney

The WGBH Media Library and Archives is currently prototyping an online archive of moving image content. Funded by The Andrew W. Mellon Foundation, the project seeks to serve scholars in their efforts to incorporate media into their research and communications activities. WGBH Boston is the single greatest producer of programming for PBS. Our archive holds the master copies of television and radio programs dating back to the 1950s. Not only do we hold final programs, we also hold all of the numerous interviews, stock footage, music, producer's notes, and images that went into the making of the films. As these materials are used and re-used, the relationships between assets become increasingly complex. These relationships, however, are vital information necessary for a researcher to interpret and understand our archive. In addition to the complexity of our collection, our project must consider the needs of traditional, text-oriented scholars and the rising generation of "digital natives" for whom content format is not a boundary. To that end, we are incorporating annotation, citation and other workflow tools to facilitate the use of moving images in scholarly work. We are currently prototyping a Fedora-backed online archive incorporating search, browse, data visualization, and web services. We will present the open source infrastructure behind our web project which includes Fedora, Solr and a PHP front end. Our Fedora content model addresses the specific needs of a moving image archive, allowing for the expression of complex relationships between conceptual and instantiated assets. In addition, it allows us to express the myriad permutations and oddities occurring within broadcast asset relationships. We will share lessons learned and new challenges regarding the representation of archival moving image collections online, the unique cataloging and metadata needs of the online researcher, and barriers to the use of online archives by scholarly researchers. 
Finally, we will cover technical challenges involving storage and delivery of long-form video content, rights management, user authentication, and sustainable business models.
Depth Customization of DSpace: Best Practices and Techniques of Institutional Repository at IIT Kanpur, India

(Georgia Institute of Technology, 2009-05-21) Shrivastava, V. D. ; Shukla, Gaurav ; Vijaianand, S. K.

Recognizing the importance of institutional repositories (IRs) for global visibility and further research, the Indian Institute of Technology Kanpur, India, planned and designed a full-fledged IR project that started in mid-2005. We defined a distinct roadmap for establishing our IR in two phases. The first phase is the mass digitization of the entire collection of Masters theses and Doctoral dissertations produced since 1963, spanning one million pages of complex content. An in-house script uploads the content to the content server, together with metadata extracted from our existing server and encoded in XML, and the pages undergo quality checking. Initially, theses were submitted manually. Researchers can now submit their theses online without depositing a hard copy in the library. The second phase covers digitizing scholarly publications other than theses from our academic community. Its coverage is substantial, both in the number of items and in the strategy we use for our IR. A key feature of our system is the in-depth customization of DSpace in several places to add enhanced features. We found that the default features of even the latest version of DSpace are not sufficient for an academic community to establish a fully functional IR that delivers the right information to the right user at the time it is needed. After a detailed study, we incorporated significant features such as workflow, additional browse and search options, cross-collection search, linking to keywords/subject/homepage/citation, and total item counts by supervisor/subject/citation.
Additionally, user login authentication against a central database, IP-based access restrictions, embargo, and bitstream encryption are provided. A redesigned feedback form has also been added to improve the scope and functionality of our system. These features are unique to our IR system and may be useful to any similar academic environment using DSpace to power its institutional repository.
Enhanced Content Models for Fedora

(Georgia Institute of Technology, 2009-05-21) Blekinge-Rasmussen, Asger

This presentation introduces Enhanced Content Models for Fedora. Enhanced Content Models have a number of new features compared to the Fedora 3.0 content models. The first of these is the more elaborate specification of the data objects. The second is the repository view system, which allows the repository to dynamically remap the contained data to virtual data objects. The third is the set of object creation templates, which allow the content models to behave as object classes from which new data object instances can be made. All our work is under the Apache 2.0 License, and will be available as add-ons to Fedora. Fedora is an extensible repository system, containing data objects and content models, which hold descriptions of the data objects that subscribe to them. In Fedora 3.0, Content Models express the classes of objects and tie data objects to disseminators, but do little else. Content Models are formal descriptions of data objects, which should be distinguished from datamodels, which are descriptions of collections of data. Having a datamodel is a requirement for many digital repositories, and the easiest solution is creating an interface that only allows data to be entered in a special format. If all data is input through this interface, it will adhere to the datamodel. Unfortunately, this has the side effect of coupling the datamodel to the program code of the interface. We believe that the datamodel should not be part of the interface; it should be part of the repository. We achieved this by enhancing the Content Models. The Enhanced Content Models can specify the cardinality and target classes of relations, and schemas for datastreams. We have implemented a validator, which checks data objects against their Content Models. A set of Enhanced Content Models makes up a datamodel. In Fedora, you might have atomic objects making up a "record", but for indexing purposes, this record must be flattened to one compound.
Enhanced Content Models can specify how to do this flattening, in the repository view system, and we have implemented a web service to create such compounds. In many OO programming languages, new objects are created as instances of a class. Enhanced Content Models implement this pattern. You can declare certain data objects to be templates for an enhanced content model. We have developed a web service that can create new objects in Fedora, given a content model and a template to use as a basis.
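The idea of a content model that declares the cardinality and target class of each relation, with a validator that checks data objects against it, can be illustrated as follows. This is a sketch under stated assumptions and not the authors' implementation or Fedora's API: the classes `RelationRule`, `ContentModel`, and `DataObject`, the `validate` function, and the `demo:` identifiers are all hypothetical, and datastream schema checks are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RelationRule:
    """Hypothetical rule: how many targets a relation may have, and of which model."""
    name: str
    target_model: str
    min_count: int = 0
    max_count: Optional[int] = None  # None means unbounded


@dataclass
class ContentModel:
    name: str
    relations: List[RelationRule] = field(default_factory=list)


@dataclass
class DataObject:
    pid: str
    model: str
    relations: Dict[str, List[str]] = field(default_factory=dict)  # name -> target pids


def validate(obj, models, objects):
    """Return a list of human-readable violations of obj's content model."""
    errors = []
    for rule in models[obj.model].relations:
        targets = obj.relations.get(rule.name, [])
        if len(targets) < rule.min_count:
            errors.append(f"{obj.pid}: '{rule.name}' needs at least {rule.min_count} target(s)")
        if rule.max_count is not None and len(targets) > rule.max_count:
            errors.append(f"{obj.pid}: '{rule.name}' allows at most {rule.max_count} target(s)")
        for pid in targets:
            target = objects.get(pid)
            if target is None or target.model != rule.target_model:
                errors.append(f"{obj.pid}: '{rule.name}' target {pid} is not a {rule.target_model}")
    return errors


# Illustrative data: a Book must have at least one hasPage relation to a Page.
models = {
    "Page": ContentModel("Page"),
    "Book": ContentModel("Book", [RelationRule("hasPage", "Page", min_count=1)]),
}
objects = {
    "demo:page1": DataObject("demo:page1", "Page"),
    "demo:book1": DataObject("demo:book1", "Book", {"hasPage": ["demo:page1"]}),
    "demo:book2": DataObject("demo:book2", "Book"),
}
```

Here `validate(objects["demo:book1"], models, objects)` reports no violations, while `demo:book2` is flagged because it has no pages. Keeping the rules in model objects rather than in form-handling code mirrors the abstract's point that the datamodel should live in the repository and not in the interface.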
Fedora 3.0 and METS: A Partnership for the Organization, Presentation and Preservation of Digital Objects

(Georgia Institute of Technology, 2009-05-21) Catapano, Terry ; Hoebelheinrich, Nancy J. ; Yott, Patrick

Fedora is being implemented in many different kinds of repositories, even within a single institution, e.g., institutional repositories, preservation repositories, and metadata repositories. Within many institutions, METS (Metadata Encoding & Transmission Standard, http://www.loc.gov/standards/mets/) is being used to encode and package content files and metadata for many of the digital objects within these repositories. Much has been speculated about how the two could work together, particularly with the expansion of "content models" within Fedora 3.0. In this proposal, three academic institutions will discuss decisions, plans, and issues arising out of the implementation of a "Paged Text" content model that incorporates the use of METS for various purposes related to the management of metadata for this type of digital object during its lifecycle. Within two 20-minute presentations, each presenter will provide the context for the type and purpose of the repositories being discussed within his/her institution as well as the related services that pertain to the discussion. In addition, each presenter will explain for what purpose METS is being used within the repositories, e.g., to "stage" content and metadata as a pre-SIP target or organizer for vendors, and/or to package content files and metadata for export to preservation. Areas of discussion will include how METS is or potentially could be used in conjunction with the more generalizable mechanisms built within Fedora to manage the structure of a digital object, the disseminators interacting with a digital object (such as page turners for text), and the workflow associated with different "moments" within the lifecycle of the digital object. Presenters will discuss lessons learned as well as future areas of exploration as the Fedora and METS communities continue to work together to optimize the use of each when it makes sense to do so. Questions and discussion from the audience will be encouraged.
EIAH's Experience in Localization and Customization of DSpace

(Georgia Institute of Technology, 2009-05-21) Khazraee, Emad ; Malek, Hamed ; Moaddeli, Saeed

The Encyclopedia of Iranian Architectural History (EIAH) was established to increase the accessibility of the widespread resources and documents related to Iranian architectural history and to provide a better and more productive space for collaboration among researchers and scholars, enabling them to expand and improve the encyclopedia. The infrastructure is designed as a three-level structure. The top level is a knowledge representation level: an ontology of Iranian architectural history, a conceptual model designed for this specific area of study. The middle level is the mediator level, which is responsible for establishing the relations between concepts and documents and for enhancing search and semantic interoperability. The underlying level is a digital repository: a localized and customized version of the DSpace institutional repository. The main changes that we made to DSpace were implementing the Persian calendar (a.k.a. the Jalali calendar) and fixing the representation of Persian numerals. We have also translated all the messages into Persian and changed the JSPUI to correctly display pages containing right-to-left text. A team of librarians at EIAH reviewed the workflows, and based on their feedback, some changes were made to the way the metadata appears for each item and the way workflows progress. Using our EIAH metadata standards, we have our own Application Profile based on Dublin Core embedded in DSpace for describing the documents available on the history of Iranian architecture. Our development plan for DSpace includes: a book viewer; a customized search engine (Lucene-based); federated search across multiple DSpace installations in national cultural heritage centers; a connection between Semantic MediaWiki and DSpace through OAI-PMH; a thumbnail generator for different text document formats; packaging DSpace for Red Hat-based systems; and development of our Application Profile based on the Singapore Framework for DCAP.
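One of the localization steps mentioned above, displaying Persian numerals, can be illustrated with a small character mapping. This is only a sketch of the general technique and not EIAH's DSpace patch (DSpace itself is written in Java); the name `to_persian_numerals` is hypothetical, and the code points U+06F0 to U+06F9 are the Unicode Extended Arabic-Indic digits used for Persian.

```python
# Map ASCII digits 0-9 to the Persian digits U+06F0 through U+06F9.
PERSIAN_DIGITS = {ord(str(i)): chr(0x06F0 + i) for i in range(10)}


def to_persian_numerals(text: str) -> str:
    """Replace every ASCII digit in text with its Persian counterpart."""
    return text.translate(PERSIAN_DIGITS)
```

Applied at display time, for example to item counts or dates, a helper like this leaves stored metadata untouched while presenting numerals in the form Persian readers expect.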