High-Throughput Workflow for Computer-Assisted Human Parsing of Biological Specimen Label Data

Amin, Aliasgar; Arsiwala, Zainab; Best, Jason; Huang, Jane Q.; McCotter, Melody; Moen, William E.; Neill, Amanda

dc.contributor.author	Amin, Aliasgar	en_US
dc.contributor.author	Arsiwala, Zainab	en_US
dc.contributor.author	Best, Jason	en_US
dc.contributor.author	Huang, Jane Q.	en_US
dc.contributor.author	McCotter, Melody	en_US
dc.contributor.author	Moen, William E.	en_US
dc.contributor.author	Neill, Amanda	en_US
dc.contributor.corporatename	Botanical Research Institute of Texas	en_US
dc.contributor.corporatename	University of North Texas	en_US
dc.contributor.corporatename	Texas Center for Digital Knowledge	en_US
dc.date.accessioned	2009-06-11T20:48:17Z
dc.date.available	2009-06-11T20:48:17Z
dc.date.issued	2009-05	en_US
dc.description	4th International Conference on Open Repositories	en_US
dc.description	This presentation was part of the session : Conference Posters	en_US
dc.description.abstract	Hundreds of thousands of specimens in herbaria and natural history museums worldwide are potential candidates for digitization, making them more accessible to researchers. An herbarium contains collections of preserved plant specimens created for scientific use. Herbarium specimens are ideal natural history objects for digitization, as the plants are pressed flat and dried, and mounted on individual sheets of paper, creating a nearly two-dimensional object. Building digital repositories of herbarium specimens can increase use and exposure of the collections while simultaneously reducing physical handling. As important as the digitized specimens are, the data contained on the associated specimen labels provide critical information about each specimen (e.g., scientific name, geographic location of specimen, etc.). The volume and heterogeneity of these printed label data present challenges in transforming them into meaningful digital form to support research. The Apiary Project is addressing these challenges by exploring and developing transformation processes in a systematic workflow that yields high-quality machine-processable label data in a cost- and time-efficient manner. The University of North Texas's Texas Center for Digital Knowledge (TxCDK) and the Botanical Research Institute of Texas (BRIT), with funding from an Institute of Museum and Library Services National Leadership Grant, are conducting fundamental research with the goal of identifying how human intelligence can be combined with machine processes for effective and efficient transformation of specimen label information. The results of this research will yield a new workflow model for effective and efficient label data transformation, correction, and enhancement.	en_US
dc.description.sponsorship	Institute of Museum and Library Services, National Leadership Grant	en_US
dc.identifier.uri	http://hdl.handle.net/1853/28412
dc.publisher	Georgia Institute of Technology	en_US
dc.relation.ispartofseries	OR09. Conference Posters	en_US
dc.subject	Digital libraries	en_US
dc.subject	Digital repositories	en_US
dc.subject	Botanical specimens	en_US
dc.title	High-Throughput Workflow for Computer-Assisted Human Parsing of Biological Specimen Label Data	en_US
dc.type	Text
dc.type.genre	Proceedings
dspace.entity.type	Publication
local.contributor.corporatename	Library
local.relation.ispartofseries	Open Repositories Conference
relation.isOrgUnitOfPublication	bf0ff3d1-48ff-4cf4-baa3-4c783958e37a
relation.isSeriesOfPublication	91d86e5c-4993-46f1-b27e-8195cabcdede

Files

Original bundle

Name: 176-669-1-PB.docx
Size: 15.61 KB
Format: Unknown data format
Description: MS Word Extended Abstract

Name: 176-670-1-PB.pdf
Size: 29.94 KB
Format: Adobe Portable Document Format
Description: PDF Extended Abstract

Collections

Scholarly Events
