Apache Spark Performance Compared to a Traditional Relational Database using Open Source Big Data Health Software

Author(s)
Powers, Joshua
Advisor(s)
Editor(s)
Associated Organization(s)
Organizational Unit
School of Computer Science
School established in 2007
Series
Supplementary to:
Abstract
The author outlines how big data software can be used to speed up health analytics software when it is faced with big data problems. Specific analyses from the Observational Health Data Sciences and Informatics (OHDSI) analytics tools will be rewritten to demonstrate Apache Spark's ability to process data more quickly using Resilient Distributed Datasets (RDDs) than traditional relational databases such as PostgreSQL.
Sponsor
Date
2016-04-24
Extent
Resource Type
Text
Resource Subtype
Article
Rights Statement
Rights URI