Examination of previously identified associations within the Genetic Analysis Workshop 19 data
© The Author(s). 2016
Published: 18 October 2016
We investigate the possible replication of “known” associations between single-nucleotide polymorphisms (SNPs) and blood pressure and expression phenotypes. Previous studies have provided a list of 95 SNPs thought to be associated with blood pressure phenotypes, of which 44 were present in the Genetic Analysis Workshop 19 (GAW19) family-imputed genome-wide association studies (GWAS) data and 5 in the GAW19 unrelateds sequence data. Using only the real (not simulated) GAW19 data, we show, using FaST-LMM (Factored Spectrally Transformed Linear Mixed Model) to account for family relatedness, that none of our candidate SNPs yields a significant p value. Furthermore, a study of epistasis, aiming to detect statistical interactions between loci with respect to their association with transcription levels, has provided a list of 30 associated interacting SNP pairs, of which 13 are present in the GAW19 family GWAS and expression data. We show for this set of results, using the program GEMMA (genome-wide efficient mixed-model analysis) to account for family relatedness, that there is evidence of replication within the real GAW19 data. Two individual SNP pairs reach significance, and the set of remaining results gives a combined p value of 0.017, suggesting that at least 1 of these remaining SNP pairs interacts to influence an expression phenotype.
Previous studies using very large data sets have provided a list of single-nucleotide polymorphisms (SNPs) believed to be associated with blood pressure and expression phenotypes. We attempt to replicate these SNPs in the Genetic Analysis Workshop 19 (GAW19) family genome-wide association studies (GWAS) data set and GAW19 sequence data, which may indicate the feasibility of finding novel SNPs in the GAW19 data sets.
Family genome-wide association studies data
The GAW19 family GWAS data consisted of 959 individuals in 20 families with SNP data for odd chromosomes, including both real and imputed SNP data, and phenotype data for systolic blood pressure (SBP), diastolic blood pressure (DBP), and hypertension (HTN). Quality control was performed identically to that by Eu-ahsunthornwattana et al. on the Genetic Analysis Workshop 18 data. This resulted in 954 individuals in 20 families. The phenotype data consisted of longitudinal data measured at 4 time points, with covariates for smoking, HTN medication, and age. Covariates and measurements over multiple time points were accounted for by transformation to a single “average” quantitative trait for each phenotype, as described by Eu-ahsunthornwattana et al.
A meta-analysis of 87,736 individuals conducted by Tragante et al. provided 95 candidate SNPs associated with blood pressure–related phenotypes; of these, 44 candidate SNPs were present in the GAW19 family data. Two extra phenotypes were created from the GAW19 phenotype data, as defined by Tragante et al.: (a) mean arterial pressure (MAP) = 2/3 DBP + 1/3 SBP; and (b) pulse pressure (PP) = SBP − DBP.
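The two derived phenotypes follow directly from the formulas above. A minimal sketch, assuming blood pressure values in mmHg; the function name and the example values are illustrative and not taken from the GAW19 data:

```python
def derived_phenotypes(sbp, dbp):
    """Return (MAP, PP) from systolic (sbp) and diastolic (dbp) blood pressure."""
    # MAP = 2/3 DBP + 1/3 SBP, as defined in the text
    map_value = (2.0 / 3.0) * dbp + (1.0 / 3.0) * sbp
    # PP = SBP - DBP
    pp_value = sbp - dbp
    return map_value, pp_value


# Hypothetical example values, not study data
map_value, pp_value = derived_phenotypes(sbp=120.0, dbp=80.0)
```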
Estimates of the power to detect association, at significance level 0.05, for each of the tested SNPs with the appropriate phenotypes were calculated using the program Quanto (http://biostats.usc.edu/Quanto.html), assuming that the individuals are unrelated, thus providing upper limits for the power. Parameter estimates and minor allele frequencies were taken from Tragante et al., and sample sizes were assumed to equal those of the GAW19 family GWAS data.
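To illustrate the kind of calculation involved, the sketch below approximates the power of a two-sided test for an additive SNP effect on a quantitative trait in unrelated individuals, using a normal approximation. It is not Quanto's implementation; the function name, the Hardy-Weinberg assumption, and all input values are assumptions made for illustration only.

```python
from statistics import NormalDist


def power_additive_qt(n, maf, beta, trait_sd, alpha=0.05):
    """Approximate power for an additive SNP effect in n unrelated individuals.

    beta is the per-allele effect on the trait and trait_sd the trait
    standard deviation (same units). Assumes Hardy-Weinberg equilibrium.
    """
    genotype_var = 2.0 * maf * (1.0 - maf)          # variance of allele count
    r2 = beta ** 2 * genotype_var / trait_sd ** 2   # fraction of variance explained
    ncp = n * r2 / (1.0 - r2)                       # noncentrality parameter
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    root = ncp ** 0.5
    # Probability that the test statistic falls in either rejection tail
    return std_normal.cdf(root - z_crit) + std_normal.cdf(-root - z_crit)


# Hypothetical inputs: 954 individuals, MAF 0.3, effect 0.5 mmHg, SD 15 mmHg
example_power = power_additive_qt(n=954, maf=0.3, beta=0.5, trait_sd=15.0)
```

With a zero effect size the function returns the significance level itself, which is a convenient sanity check. Because relatedness reduces the effective sample size, values obtained this way are upper limits for family data.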
Unrelated sequence data
Of the 95 previously associated SNPs, only 5 were found in the GAW19 sequence data. The data consisted of 1943 unrelated individuals (of whom 92 had missing phenotype data), together with 1 covariate for HTN medication. PLINK was used to calculate p values using linear regression.
A recent study by Hemani et al., motivated by a desire to investigate the extent to which epistasis (the phenomenon whereby one polymorphism’s effect on a trait depends on other polymorphisms present in the genome) might influence complex traits, detected 30 gene–gene (SNP–SNP) interactions associated with transcription. We attempted to replicate these associations using the GAW19 family GWAS and expression data. Of the 30 candidate SNP pairs, 11 were not considered because SNPs were on even chromosomes and 6 because of missing gene-probe data, leaving 13 SNP pairs. The gene probes in the GAW19 data were different from those used by Hemani et al., and were adjusted to account for covariates using the same method as was described for the blood pressure phenotypes. One gene (CTSC) had 2 gene probes, giving 14 interaction tests in total.
To test for SNP–SNP interactions while allowing for family relatedness, GEMMA (genome-wide efficient mixed-model analysis) was used with an estimated kinship matrix. GEMMA does not have an interaction option, but it does allow covariates, which were used to encode the SNP data in 2 linear mixed models. The first model encoded 3 variables: the intercept and the number of minor alleles for each of the 2 SNPs. The second model encoded an extra variable given by the product of the numbers of minor alleles of the 2 SNPs, thus imposing an additive × additive interaction model. The maximized likelihoods of the 2 models were compared using a likelihood ratio test to obtain p values.
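The comparison of the two models can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the helper names are hypothetical, genotypes are coded as minor-allele counts (0, 1, 2), and the log-likelihood values would in practice come from the two fitted mixed models. Because the second model adds a single parameter, the likelihood ratio statistic is compared with a chi-square distribution with 1 degree of freedom.

```python
import math


def covariate_rows(snp1, snp2, with_interaction):
    """Build covariate rows: intercept, SNP 1 count, SNP 2 count, and
    optionally their product (additive x additive interaction term)."""
    rows = []
    for a, b in zip(snp1, snp2):
        row = [1, a, b]
        if with_interaction:
            row.append(a * b)
        rows.append(row)
    return rows


def lrt_p_value(loglik_null, loglik_alt):
    """Likelihood ratio test with 1 degree of freedom.

    Returns (statistic, p value). For a chi-square variable with 1 df,
    P(X > x) = erfc(sqrt(x / 2)).
    """
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical genotypes for three individuals
model1 = covariate_rows([0, 1, 2], [2, 1, 0], with_interaction=False)
model2 = covariate_rows([0, 1, 2], [2, 1, 0], with_interaction=True)

# Placeholder log-likelihoods, not values from the study
stat, p_value = lrt_p_value(loglik_null=-100.0, loglik_alt=-97.0)
```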
An overall p value for all 14 interaction tests was calculated using the method previously described for single SNP association analyses in the GAW19 family GWAS data, using 10 million replicates.
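The method referred to here is not reproduced in this section. Purely as an illustration of how a simulation-based overall p value can be obtained, the sketch below assumes independent tests and uses the product of the p values as the combined statistic; the study's own procedure may differ, for example in how correlation between tests is handled. The function name, the default replicate count, and the seed are all assumptions.

```python
import math
import random


def combined_p_value(p_values, replicates=10_000, seed=1):
    """Monte Carlo p value for the product of p values.

    Assumes the tests are independent, so that under the null each
    p value is uniform on (0, 1]. All p values must be greater than 0.
    """
    rng = random.Random(seed)
    observed = sum(math.log(p) for p in p_values)
    k = len(p_values)
    hits = 0
    for _ in range(replicates):
        # 1 - random() lies in (0, 1], which avoids log(0)
        simulated = sum(math.log(1.0 - rng.random()) for _ in range(k))
        if simulated <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (replicates + 1)
```

The smallest p value this estimator can return is 1/(replicates + 1), which is one reason a large number of replicates is needed when small overall p values are to be resolved.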
Family genome-wide association studies data
GWAS family data
Upper limits of the power to detect the tested SNP associations, at significance level 0.05, gave estimates ranging from approximately 0.080 to 0.30.
Unrelateds sequence data
Unrelateds sequence SNP data
Gene expression data and SNP–SNP interactions
The candidate SNPs and blood pressure phenotypes investigated here were previously detected in large meta-analyses or other replicated studies, giving considerable confidence that these SNPs are genuinely associated. However, the GAW19 family GWAS data consist of only 954 related individuals, for which the upper limits on power (at nominal p value 0.05) ranged from approximately 0.080 to 0.30, so it is perhaps not unexpected that no associations were replicated. The sample size of the unrelateds sequence data, 1943 individuals, was greater, but nonetheless did not replicate any previously observed associations. Although the low sample sizes are the most obvious reason for the nonreplication, there may also be more subtle reasons, such as the relatedness and ethnicity of the samples used. The quality and accuracy of the measured phenotypes may also be relevant, in particular whether or not individuals took HTN medication.
The SNP–SNP interactions previously shown to be associated with transcription did, however, show some evidence of replication, with 2 SNP pairs showing significant evidence of association and the remaining SNP pairs giving an overall p value of 0.017, indicating that at least 1 additional SNP pair is associated. It may be argued that the power to detect such associations is greater because of the more direct link between SNPs and transcription. We note, however, that the interpretation of such findings as representing genuine interactions (as opposed to haplotype effects, possibly marking an untyped causal variant) can be flawed when the SNPs are close to one another.
There was no evidence of replication using the GAW19 data for previously found SNP associations with blood pressure phenotypes, possibly because of the low sample size. However, there was some evidence of replication for SNP–SNP interactions associated with transcription. This may be the result of a greater power to detect associations with transcription than with more distantly related phenotypes.
Support for this work was provided by the Wellcome Trust (grant references 087436/Z/08/Z and 102858/Z/13/Z). JE receives scholarship and funding from Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand.
This article has been published as part of BMC Proceedings Volume 10 Supplement 7, 2016: Genetic Analysis Workshop 19: Sequence, Blood Pressure and Expression Data. Summary articles. The full contents of the supplement are available online at http://bmcproc.biomedcentral.com/articles/supplements/volume-10-supplement-7. Publication of the proceedings of Genetic Analysis Workshop 19 was supported by National Institutes of Health grant R01 GM031575.
RH conducted statistical analyses and drafted the manuscript. JE prepared data and helped conduct statistical analysis. RD prepared the sequence data. HJC conceived the overall study and critically revised the manuscript. All authors read and approved the final manuscript.
The authors declare they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Blangero J, Teslovich TM, Sim X, Almeida MA, Jun G, Dyer TD, Johnson M, Peralta JM, Manning AK, Wood AR, et al. Omics squared: human genomic, transcriptomic, and phenotypic data for Genetic Analysis Workshop 19. BMC Proc. 2015;9 Suppl 8:S2.
- Eu-ahsunthornwattana J, Howey RA, Cordell HJ. Accounting for relatedness in family-based association studies: application to Genetic Analysis Workshop 18 data. BMC Proc. 2014;8 Suppl 1:S79.
- Tragante V, Barnes MR, Ganesh SK, Lanktree MB, Guo W, Franceschini N, Smith EN, Johnson T, Holmes MV, Padmanabhan S, et al. Gene-centric meta-analysis in 87,736 individuals of European ancestry identifies multiple blood-pressure-related loci. Am J Hum Genet. 2014;94(3):349–60.
- Dudbridge F, Koeleman BP. Rank truncated product of P-values, with application to genomewide association scans. Genet Epidemiol. 2003;25(4):360–6.
- Schumann E. Generating correlated uniform variates. COMISEF. 2009. http://comisef.wikidot.com/tutorial:correlateduniformvariates.
- Yekutieli D, Benjamini Y. Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics. J Stat Plan Inference. 1999;82:171–96.
- Hemani G, Shakhbazov K, Westra HJ, Esko T, Henders AK, McRae AF, Yang J, Gibson G, Martin NG, Metspalu A, et al. Detection and replication of epistasis influencing transcription in humans. Nature. 2014;508(7495):249–53.
- Wood AR, Tuke MA, Nalls MA, Hernandez DG, Bandinelli S, Singleton AB, Melzer D, Ferrucci L, Frayling TM, Weedon MN. Another explanation for apparent epistasis. Nature. 2014;514(7520):E3–5.