GWAS Data and Federally-Funded Public Access Repositories: Considering the Ethical, Policy, and Social Implications from Multiple View Points

Koenig, Barbara
McCormick, Jennifer
Wu, Joel
The collection and distribution of individual genotypic and phenotypic data has increased enormously in the past 10 years and shows no signs of slowing down. The rapid growth of bioinformatics technologies and genome-wide association studies (GWAS) continues to increase the quantity and availability of GWAS data. As the size, complexity, and number of GWAS increase, so do the risks to individual privacy, confidentiality, and autonomy, and to the public's trust in genomic research. The utility of collecting and using genotype and phenotype data to advance our understanding of human health and well-being must be kept in line with individual rights. Policy governing both the deposit of GWAS data into publicly accessible databases and the retrieval of that data by downstream researchers must balance the scientific benefit against the potential social and ethical risks. GWAS are an important tool in ascertaining the genetic contribution to health risks, as well as in the development of new therapeutic targets. The significance of GWAS data depends not only on the size of the study population, but also on the accuracy of the phenotypic measures and the density of the markers used in genotyping; as such, meaningful GWAS data are expensive and difficult to obtain. The scientific community has recognized the high value of sharing the genotype and phenotype data obtained through GWAS. In response, the NIH established the Database of Genotypes and Phenotypes (dbGaP), a centralized database of GWAS data collected from large-scale NIH-funded studies. The data deposited into dbGaP can be distributed either publicly or through a restricted process, depending on the nature of the data being shared and on who is requesting it. All NIH-supported investigators are required to deposit GWAS data in dbGaP in order to maintain their NIH funding.
Sharing GWAS data involves distributing sensitive personal information: at minimum, genotype and phenotype data, and potentially additional information such as individual medical records. Sharing GWAS data not only exposes research subjects to additional risk of identification, harm, and loss of autonomy over how that data might be used, but also exposes researchers and institutions to increased risks of liability, loss of public trust, and loss of funding. These increased risks raise important ethical, legal, policy, and social issues that must be addressed by institutional and federal data sharing policy. While dbGaP is currently the only such data repository, it is very likely that similar databases operated by federal research agencies will come online in the near future. There are various stakeholders in this endeavor to make GWAS data more accessible: researchers, both those who deposit the data and those who obtain it from the repository; policy-makers who want to see the utility of the science they fund maximized; patient advocates who focus on pushing science along; government and individual institution research administrators who must oversee the repository and/or access to the data; and research participants who contribute samples to the GWAS. This paper will discuss the experiences of the authors in creating an institutional policy as well as their participation in deliberations about cross-institutional data sharing policies. Particular emphasis will be placed on how to balance the utility of the data for scientific progress while still keeping in sight the rights, privacy, and confidentiality of the individual. In addition, the authors will present preliminary empirical data from interviews conducted with various stakeholders.
The paper will also address the following questions regarding genotype and phenotype data sharing policy for publicly accessible repositories operated by federal research agencies, in light of emerging legal and policy developments from late 2008 and 2009: (1) how to balance the scientific value of data sharing against the privacy risks; (2) who should have access to different types of GWAS data; and (3) whether GWAS data can be de-identified, and what "de-identified" really means. As noted above, dbGaP is one of the first such publicly accessible genotype and phenotype databases, and determining the relevant issues in creating policy for these databases is of critical importance. As we move forward in this era of large-scale GWAS, striving for advances in biomedical research and for individualized medicine to eventually become commonplace, the combined perspectives of all stakeholders should be synthesized into a guiding framework for how research institutions and the federal government can create informed, relevant policy, not only to protect the privacy, confidentiality, and personal interests of individual research subjects, but also to protect and promote the progress of individualized research medicine as a whole.