Open Repositories Conference

Series

Open Repositories Conference

Permanent Link

https://hdl.handle.net/1853/70945

Series Type

Event Series

Associated Organization(s)

Organizational Unit

Library

Publication Search Results

When Ruby Met Fedora

(Georgia Institute of Technology, 2009-05-21) Zumwalt, Matthew

This session will provide technical, Fedora-specific background to complement the plenary presentation titled "Many Lightweight Views into Complex Repository Content". We will cover the purpose and features of ActiveFedora, looking at working applications and code samples to support the discussion. We will also discuss topics such as Content Modeling, Search & Indexing, and Security (Authentication + Authorization).

Digital Repositories and the Semantic Web: Semantic Search and Navigation for DSpace

(Georgia Institute of Technology, 2009-05-21) Alexopoulos, Andreas D. ; Koutsomitropoulos, Dimitrios ; Papatheodorou, Theodore S. ; Solomou, Georgia D.

In many digital repository implementations, resources are described against some flavor of metadata schema, most commonly the Dublin Core Metadata Element Set (DCMES), as is the case with the DSpace system. However, such an approach cannot capture richer semantic relations that exist or may be implied, in the sense of a Semantic Web ontology. Therefore, we first suggest a method to semantically intensify the underlying data model and develop an automatic translation of the flatly organized metadata into this new ontology. Then we propose an implementation that provides for inference-based knowledge discovery, retrieval and navigation on top of digital repositories, based on this ontology. We apply this technique to real information stored in the University of Patras Institutional Repository, which is based on DSpace, and confirm that more powerful, inference-based queries can indeed be performed.

Unicorn: The myth of federated search realized simply. Unifying DSpace repositories with the PKP Harvester tool

(Georgia Institute of Technology, 2009-05-21) Davison, John ; Gilbertson, Keith

The Ohio Digital Resource Commons, located at http://drc.ohiolink.edu, is a union of DSpace repositories operated by higher education institutions in Ohio. The repositories are largely organized and supported by OhioLINK, a consortium of 89 Ohio college and university libraries. In support of the vision of the Digital Resource Commons as a statewide resource, the repository operators saw an immediate need for a federated search tool. A "build it now" approach was taken, and federated searching was implemented in a short timeframe at OhioLINK using the PKP Harvester (http://pkp.sfu.ca/?q=harvester) software. A demonstration of the federated search feature at the Digital Resource Commons will be given, highlighting local customizations that were made to PKP Harvester and DSpace in support of the project. These customizations include changes made to mimic the appearance and behavior of existing search interfaces at OhioLINK, and changes made to meet expressed user requirements. Particular attention will be given to a DSpace change that allows image thumbnails to be displayed in federated search results. Issues encountered during the configuration, implementation, and deployment of the PKP Harvester and DSpace OAI-PMH server will be presented, and the choices made in response to these issues will be explained. The process of integrating the search results with the DSpace interface will be detailed, including ongoing efforts to improve the user experience. The Digital Resource Commons' federated search was implemented as a metadata-based search. We will present a general comparison between metadata and full-text searching, highlighting the advantages and disadvantages of each method. A discussion of metadata uniformity and quality concerns will be presented in the context of federated searching. Particular problems encountered with our metadata will be described, with lessons learned and suggestions for resolution. 
Operational and maintenance concerns of this system will be discussed, including the metadata harvesting schedule and the need to flush and rebuild indexes when the metadata schema changes. Future ideas for the DRC's federated search feature will be explored, including an implementation of faceted searching using Solr; harvesting of non-DSpace repositories, such as CONTENTdm and Fedora; and, finally, the possibility of discarding the current model in favor of an OAI-ORE-based system, developed for DSpace at Texas Digital Library, that allows for the possibility of full-text federated searching.

Disseminating Broadcast Archives: Exposing WGBH Materials for Scholarly Use

(Georgia Institute of Technology, 2009-05-21) Beer, Chris ; Michael, Courtney

The WGBH Media Library and Archives is currently prototyping an online archive of moving image content. Funded by The Andrew W. Mellon Foundation, the project seeks to serve scholars in their efforts to incorporate media into their research and communications activities. WGBH Boston is the single greatest producer of programming for PBS. Our archive holds the master copies of television and radio programs dating back to the 1950s. Not only do we hold final programs, we also hold all of the numerous interviews, stock footage, music, producer's notes, and images that went into the making of the films. As these materials are used and re-used, the relationships between assets become increasingly complex. These relationships, however, are vital information necessary for a researcher to interpret and understand our archive. In addition to the complexity of our collection, our project must consider the needs of traditional, text-oriented scholars and the rising generation of "digital natives" for whom content format is not a boundary. To that end, we are incorporating annotation, citation and other workflow tools to facilitate the use of moving images in scholarly work. We are currently prototyping a Fedora-backed online archive incorporating search, browse, data visualization, and web services. We will present the open source infrastructure behind our web project which includes Fedora, Solr and a PHP front end. Our Fedora content model addresses the specific needs of a moving image archive, allowing for the expression of complex relationships between conceptual and instantiated assets. In addition, it allows us to express the myriad permutations and oddities occurring within broadcast asset relationships. We will share lessons learned and new challenges regarding the representation of archival moving image collections online, the unique cataloging and metadata needs of the online researcher, and barriers to the use of online archives by scholarly researchers. 
Finally, we will cover technical challenges involving storage and delivery of long-form video content, rights management, user authentication, and sustainable business models.

Fedora 3.0 and METS: A Partnership for the Organization, Presentation and Preservation of Digital Objects

(Georgia Institute of Technology, 2009-05-21) Catapano, Terry ; Hoebelheinrich, Nancy J. ; Yott, Patrick

Fedora is being implemented in many different kinds of repositories even within a single institution, e.g., institutional repositories, preservation repositories, and metadata repositories. Within many institutions, METS (Metadata Encoding & Transmission Standard, http://www.loc.gov/standards/mets/) is being used to encode and package content files and metadata for many of the digital objects within these repositories. Much has been speculated about how the two could work together, particularly with the expansion of "content models" within Fedora 3.0. In this proposal, three different academic institutions will discuss decisions, plans, and issues arising out of the implementation of a "Paged Text" content model that incorporates the use of METS for various purposes related to the management of metadata for this type of digital object during its lifecycle. Within two 20-minute presentations, each presenter will provide the context for the type and purpose of the repositories being discussed within his/her institution as well as the related services that pertain to the discussion. In addition, each presenter will explain for what purpose METS is being used within the repositories, e.g., to "stage" content and metadata as a pre-SIP target or organizer for vendors, and/or to package content files and metadata for export to preservation. Areas of discussion will include how METS is or potentially could be used in conjunction with the more generalizable mechanisms built within Fedora to manage the structure of a digital object, the disseminators interacting with a digital object (such as page turners for text), and the workflow associated with different "moments" within the lifecycle of the digital object. Presenters will discuss lessons learned as well as future areas of exploration as the Fedora and METS communities continue to work together to optimize the use of each when it makes sense to do so. Questions and discussion from the audience will be encouraged.

EIAH's Experience in Localization and Customization of DSpace

(Georgia Institute of Technology, 2009-05-21) Khazraee, Emad ; Malek, Hamed ; Moaddeli, Saeed

The Encyclopedia of Iranian Architectural History (EIAH) was established with the goal of increasing the accessibility of the widespread resources and documents related to Iranian architectural history and providing a better and more productive space for collaboration among researchers and scholars, enabling them to expand and improve this encyclopedia. The infrastructure we designed has three levels. The top level is a knowledge representation level: an ontology of Iranian architectural history, a conceptual model designed for this specific area of study. The middle level is the mediator level, which is responsible for establishing the relation between concepts and documents and for enhancing search and semantic interoperability. The underlying level is a digital repository: a localized and customized version of the DSpace institutional repository. The main changes that we made to DSpace were implementing the Persian calendar (a.k.a. the Jalali calendar) and fixing the representation of Persian numerals. We have also translated all the messages into Persian and changed the JSPUI to correctly display pages containing right-to-left text. A team of librarians at EIAH reviewed the workflows and, based on their feedback, some changes were made to the way the metadata appears for each item and the way workflows progress. Using our EIAH metadata standards, we have our own application profile based on Dublin Core embedded in DSpace for describing the documents available on the history of Iranian architecture. Our development plan for DSpace includes: a book viewer; a customized search engine (Lucene-based); federated search on multiple DSpace installations in national cultural heritage centers; establishing a connection between Semantic MediaWiki and DSpace through OAI-PMH; a thumbnail generator for different text document formats; packaging DSpace for Red Hat-based systems; and development of our application profile based on the Singapore Framework for DCAP.

Beyond the Tutorial: Complex Content Models in Fedora 3

(Georgia Institute of Technology, 2009-05-21) Gorman, Peter ; Prater, Scott

The University of Wisconsin Digital Collections Center recently began a pilot project to create a digital collection of learning objects stored in a Fedora 3.1 repository. This pilot project is the proof-of-concept of many ideas and discussions, extending back over five years, concerning the problem of storing, searching, and retrieving heterogeneous objects and object types, linked together in complex relations, in a way that is loosely coupled with front-end user display applications. During the course of this presentation, we will describe the issues we've confronted implementing increasingly rich digital collections over the past decade (including some real-life examples of complex digital objects), models we have developed to resolve many of those issues, and how we have started implementing those models in Fedora 3.1, using the new Content Model functionality.

Fedora 3: A Smooth Migration

(Georgia Institute of Technology, 2009-05-21) Durbin, Michael

The dramatic changes between the architecture of Fedora 2 and Fedora 3 offer exciting opportunities for improved functionality and organization through the addition of a more formalized content model architecture (CMA). On the other hand, these changes may make the migration of existing production repositories seem a daunting task, where one must balance the advantages of new features with the requirements to maintain uninterrupted and unaltered service. In this presentation, we will present the case of the 2008 migration of the Indiana University Digital Library Program's Fedora repository from version 2.2.4 to 3.1. The first portion of this talk will be dedicated to technical and logistical challenges associated with the migration of a repository of nearly half a million objects. These include balancing ingest scheduling with migration timing, dealing with large amounts of data, and switching from Oracle to MySQL as the relational database system. There will be discussion of the considerations associated with migrating to the new CMA and use of the "generator" application that is part of Fedora 3's migration tools. The pros and cons of various techniques of modeling standard types of objects such as images, books, and serials will be presented, as well as the advantages and disadvantages of our outcome. Finally, some time will be spent exploring the implications of migration to Fedora 3 on the custom tools and services that manage ingest, search and delivery in our repository. These include a Lucene-backed search service that was updated to work with the new messaging architecture, an ingest tool that had to accommodate new content models and a new version of FOXML, as well as our identifier resolution service used to maintain persistent URLs to objects (based around the OCLC PURL Resolver service).

Fedora Content Modelling for Improved Services for Research Databases

(Georgia Institute of Technology, 2009-05-21) Heller, Alfred ; Karstensen, Mikael ; Pedersen, Gert Schmeltz

A re-implementation of the research database of the Technical University of Denmark, DTU, is based on Fedora. The backbone consists of content models for primary and secondary entities and their relationships, giving flexible and powerful extraction capabilities for interoperability and reporting. By adopting such an abstract data model, the platform enables new and improved services for researchers, librarians and administrators.

Case Studies in Repository Workflows: Three Approaches

(Georgia Institute of Technology, 2009-05-21) Cramer, Tom ; Green, Richard ; McRae, Lynn ; Sigmon, Tim ; Wayland, Ross

"Lightweight workflow" is both an oxymoron and a continual aspiration of the many stakeholders in the repository community. As part of the Hydra Project, the University of Hull, University of Virginia and Stanford University are collaboratively developing a reusable application framework that will sit on top of Fedora. Developing support for workflow (defined here as orchestrating multistep processes that may include human interaction) is integral to the project. The partners consciously chose to take three different paths in implementing and integrating workflow into the overall solution. This paper briefly details the three different workflow approaches the collaborators are taking, why they chose them, and the apparent pros and cons of each.