Title:
High-Throughput Workflow for Computer-Assisted Human Parsing of Biological Specimen Label Data

dc.contributor.author Amin, Aliasgar en_US
dc.contributor.author Arsiwala, Zainab en_US
dc.contributor.author Best, Jason en_US
dc.contributor.author Huang, Jane Q. en_US
dc.contributor.author McCotter, Melody en_US
dc.contributor.author Moen, William E. en_US
dc.contributor.author Neill, Amanda en_US
dc.contributor.corporatename Botanical Research Institute of Texas en_US
dc.contributor.corporatename University of North Texas en_US
dc.contributor.corporatename Texas Center for Digital Knowledge en_US
dc.date.accessioned 2009-06-11T20:48:17Z
dc.date.available 2009-06-11T20:48:17Z
dc.date.issued 2009-05 en_US
dc.description 4th International Conference on Open Repositories en_US
dc.description This presentation was part of the session: Conference Posters en_US
dc.description.abstract Hundreds of thousands of specimens in herbaria and natural history museums worldwide are potential candidates for digitization, making them more accessible to researchers. An herbarium contains collections of preserved plant specimens created for scientific use. Herbarium specimens are ideal natural history objects for digitization, as the plants are pressed flat and dried, and mounted on individual sheets of paper, creating a nearly two-dimensional object. Building digital repositories of herbarium specimens can increase use and exposure of the collections while simultaneously reducing physical handling. As important as the digitized specimens are, the data contained on the associated specimen labels provide critical information about each specimen (e.g., scientific name, geographic location of specimen, etc.). The volume and heterogeneity of these printed label data present challenges in transforming them into meaningful digital form to support research. The Apiary Project is addressing these challenges by exploring and developing transformation processes in a systematic workflow that yields high-quality machine-processable label data in a cost- and time-efficient manner. The University of North Texas's Texas Center for Digital Knowledge (TxCDK) and the Botanical Research Institute of Texas (BRIT), with funding from an Institute of Museum and Library Services National Leadership Grant, are conducting fundamental research with the goal of identifying how human intelligence can be combined with machine processes for effective and efficient transformation of specimen label information. The results of this research will yield a new workflow model for effective and efficient label data transformation, correction, and enhancement. en_US
dc.description.sponsorship Institute of Museum and Library Services, National Leadership Grant en_US
dc.identifier.uri http://hdl.handle.net/1853/28412
dc.publisher Georgia Institute of Technology en_US
dc.relation.ispartofseries OR09. Conference Posters en_US
dc.subject Digital libraries en_US
dc.subject Digital repositories en_US
dc.subject Botanical specimens en_US
dc.title High-Throughput Workflow for Computer-Assisted Human Parsing of Biological Specimen Label Data en_US
dc.type Text
dc.type.genre Proceedings
dspace.entity.type Publication
local.contributor.corporatename Library
local.relation.ispartofseries Open Repositories Conference
relation.isOrgUnitOfPublication bf0ff3d1-48ff-4cf4-baa3-4c783958e37a
relation.isSeriesOfPublication 91d86e5c-4993-46f1-b27e-8195cabcdede
Files
Original bundle
Name: 176-669-1-PB.docx
Size: 15.61 KB
Format: Unknown data format
Description: MS Word Extended Abstract

Name: 176-670-1-PB.pdf
Size: 29.94 KB
Format: Adobe Portable Document Format
Description: PDF Extended Abstract