Comparison of 2 models for gene–environment interactions: an example of simulated gene–medication interactions on systolic blood pressure in family-based data
© The Author(s). 2016
Published: 18 October 2016
Nearly half of adults in the United States who are diagnosed with hypertension use blood-pressure-lowering medications. Yet there is a large interindividual variability in the response to these medications. Two complementary gene–environment interaction methods have been published and incorporated into publicly available software packages to examine interaction effects, including whether genetic variants modify the association between medication use and blood pressure. The first approach uses a gene–environment interaction term to measure the change in outcome when both the genetic marker and medication are present (the “interaction model”). The second approach tests for effect-size differences between strata of an environmental exposure (the “med-diff” approach). However, no studies have quantitatively compared how these methods perform with respect to 1 or 2 degree of freedom (DF) tests or in family-based data sets. We evaluated these 2 approaches using simulated genotype–medication response interactions at 3 single nucleotide polymorphisms (SNPs) across a range of minor allele frequencies (MAFs 0.1–5.4 %) using the Genetic Analysis Workshop 19 family sample.
The estimated interaction effect sizes were on average larger in the interaction model approach compared to the med-diff approach. The true positive proportion was higher for the med-diff approach for SNPs less than 1 % MAF, but higher for the interaction model when common variants were evaluated (MAF >5 %). The interaction model produced lower false-positive proportions than expected (5 %) across a range of MAFs for both the 1DF and 2DF tests. In contrast, the med-diff approach produced higher but stable false-positive proportions around 5 % across MAFs for both tests.
Although the 1DF tests both performed similarly for common variants, the interaction model estimated true interaction effects with less bias and higher true positive proportions than the med-diff approach. However, if rare variation (MAF <5 %) is of interest, our findings suggest that when convergence is achieved, the med-diff approach may estimate true interaction effects more conservatively and with less variability.
Hypertension, defined as an average systolic blood pressure (SBP) of 140 mm Hg or higher or an average diastolic blood pressure (DBP) of 90 mm Hg or higher, affects approximately 30 % of American adults, 45 % of whom use antihypertensive medications for blood pressure (BP) control [1, 2]. Broad interindividual variability in responsiveness to antihypertensive medications suggests that genetics may modify response to treatment [3, 4]. Furthermore, SBP and DBP are heritable, and candidate-gene and genome-wide association studies have uncovered more than 50 loci associated with BP [5–15]. Detection of genetic markers responsible for differential pharmacologic response informs our understanding of biological pathways relevant to hypertension, as well as future interventions to reduce its burden [16, 17].
Two complementary gene–environment (G × E) interaction methods have been described in the literature to test G × E interactions such as differential response to antihypertensives resulting from genetic variation. The first method (the “interaction model”) tests for interaction using a gene–environment interaction term to measure the change in outcome when both the genetic marker and environmental factor are present, as compared to when the genetic marker is present but the environmental factor is not. The second method (the “med-diff” approach) tests for effect size differences between strata that differ by environmental exposure. Both methods provide 1 degree of freedom (DF) tests of gene–medication interactions as well as 2DF (or joint) tests of these interactions and the genetic main effect using publicly available software.
Although these methods have been assumed to be theoretically equivalent, no previous studies have directly compared them. Therefore, in this study we aimed to evaluate their performance by comparing both their power to detect simulated interaction effects and their false-positive proportions (FPPs) in family-based data from the Genetic Analysis Workshop 19 (GAW19). We first calculated the true-positive proportion (TPP) for the 1DF and 2DF tests using 3 coding variants at CYP3A43 of varying minor allele frequencies (MAFs) with simulated genotype–medication response interactions. We then used the TPP to evaluate the power to detect the simulated main effect at MAP4 (the simulated single nucleotide polymorphism [SNP] with the largest proportion of variance explained in SBP, MAF 2.7 %) using a 2DF test in each approach. Lastly, we assessed the observed FPPs of each approach across the odd-numbered chromosomes without simulated effects, using both 1DF and 2DF tests in publicly available software.
The Type 2 Diabetes Genetic Exploration by Next-generation sequencing in Ethnic Samples (T2D-GENES) Consortium Project genotypic data and the GAW19 simulated phenotypic data have been described separately. The GAW19 genotypic dosage data come from whole genome sequence variants for 20 extended Mexican American families collected as part of the San Antonio Family Studies. Imputation for missing SNP genotypes in pedigrees was conducted using a likelihood-based method implemented in MERLIN, based on the framework of available high-density genome-wide SNP data. The GAW19 conveners simulated 200 replicates of phenotypic data based on the observed longitudinal data in the family-based sample. These data included 3 predicted deleterious coding variants in CYP3A43 with simulated gene–medication interactions in the absence of genetic main effects. We were aware that carriers of these risk variants were assigned to be nonresponsive to the simulated BP treatment effect on SBP of −6.2 mm Hg (βInt = 6.2 mm Hg). Additionally, there were 984 SNPs with simulated genetic main effects for SBP explaining between less than 0.1 % and 2.78 % of the phenotypic variance.
Accounting for family relatedness and population structure
We accounted for family relatedness using linear mixed models implemented in MMAP (A Comprehensive Mixed Model Program for Analysis of Pedigree and Population Data) [23, 24]. To account for population structure, we applied principal component (PC) analysis to the observed genotypic data. PCs were initially calculated in unrelated founders (n = 117) using a subset of 28,156 SNPs selected for uniform coverage and low mutual linkage disequilibrium (r² ≤ 0.2). PCs were then assigned to all other individuals (n = 959) by applying the founder-derived PC estimates to each individual’s genotypes with the predict function in R (www.R-project.org). We included the top 5 PCs in all association analyses.
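The founder-then-project step described above can be illustrated with a minimal sketch. It is not the authors' R code: it assumes a tiny toy dosage matrix, computes only the first PC (the analysis used the top 5), and uses a hand-rolled power iteration so that it runs on the Python standard library alone. The names `top_pc`, `project`, `founders`, and `relative` are hypothetical.

```python
import math
import random


def top_pc(rows, n_iter=500):
    """Return (per-SNP means, leading principal axis) of a dosage matrix.

    rows: one list of allele dosages per founder (equal lengths).
    The axis is found by power iteration on X^T X of the centered matrix.
    """
    n_snps = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(n_snps)]
    centered = [[r[j] - means[j] for j in range(n_snps)] for r in rows]

    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    axis = [rng.gauss(0, 1) for _ in range(n_snps)]
    for _ in range(n_iter):
        scores = [sum(c[j] * axis[j] for j in range(n_snps)) for c in centered]
        new = [sum(centered[i][j] * scores[i] for i in range(len(centered)))
               for j in range(n_snps)]
        norm = math.sqrt(sum(x * x for x in new))
        axis = [x / norm for x in new]
    return means, axis


def project(genotypes, means, axis):
    """Score one individual on the founder-derived axis."""
    return sum((g - m) * a for g, m, a in zip(genotypes, means, axis))


# Toy data (hypothetical): 5 founders x 4 SNPs, then one non-founder.
founders = [
    [0, 1, 2, 0],
    [1, 1, 0, 0],
    [2, 0, 1, 1],
    [0, 2, 2, 0],
    [1, 0, 0, 2],
]
relative = [1, 1, 1, 0]

means, axis = top_pc(founders)
founder_scores = [project(f, means, axis) for f in founders]
relative_score = project(relative, means, axis)
```

Centering the non-founder on the founder means, rather than on its own sample, is what keeps relatives from influencing the axes. This is the reason for estimating PCs in unrelated individuals first.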
Gene–environment interaction analyses
Our analysis evaluated simulated SBP at the last time point (t = 3), when both the prevalence of hypertension and use of antihypertensive medications were the highest. We assessed the appropriateness of model-based SEs by examining the heterogeneity of residuals by medication status using a likelihood ratio test to compare the homogeneity model with the heterogeneity model, and based on these results, our models allowed the residual error term to differ by medication status (p ≥0.05). All G × E analyses were adjusted for age, sex, population structure, and relatedness. We filtered out any SNP that exhibited a minor allele count of less than 2. The characteristics of true-positive findings were investigated at the 3 SNPs in CYP3A43 with simulated gene–medication effects (1DF and 2DF tests), and the only SNP in MAP4 with a simulated genetic main effect (2.87 % variation in SBP, chr3: 48040283, −9.91 mm Hg per minor allele) and greater than 80 % estimated power using a 2DF test. The FPP was calculated for the odd-numbered chromosomes using SNPs beyond 500 kb of the simulated effects (gene–medication effects for the 1DF tests, or simulated main and interaction genetic effects for the 2DF joint tests). Both true and false positives were considered statistically significant using a p value criterion of less than 0.05. Based on the simulated prevalence of medication status and SBP distribution, we estimated the expected TPP using an approximate effective sample size of 80 % of the total sample (n = 849), to account for the nonindependence of relative pairs. Power analyses were conducted using Quanto.
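The comparison of a homogeneity model with a heterogeneity model can be sketched as a likelihood ratio test on residual variances. The example below is a simplified illustration, not the MMAP procedure: it assumes independent, mean-zero normal residuals and ignores relatedness. The function name `variance_lrt` and the toy residuals are hypothetical.

```python
import math


def variance_lrt(resid_unmedicated, resid_medicated):
    """Likelihood ratio test of equal residual variance in two strata.

    Homogeneity model: one variance shared by both strata.
    Heterogeneity model: a separate variance per stratum.
    Returns (chi-square statistic, p value on 1 DF).
    """
    groups = [resid_unmedicated, resid_medicated]
    n_total = sum(len(g) for g in groups)
    pooled_var = sum(r * r for g in groups for r in g) / n_total  # ML estimate

    # 2 * (logL_heterogeneity - logL_homogeneity) at the ML estimates
    stat = n_total * math.log(pooled_var)
    for g in groups:
        group_var = sum(r * r for r in g) / len(g)  # ML estimate per stratum
        stat -= len(g) * math.log(group_var)
    stat = max(stat, 0.0)  # guard against tiny negative rounding error

    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) upper tail
    return stat, p_value


# Toy residuals (hypothetical): the medicated stratum is three times as spread out.
stat_equal, p_equal = variance_lrt([1, -1, 1, -1], [1, -1, 1, -1])
stat_diff, p_diff = variance_lrt([1, -1, 1, -1], [3, -3, 3, -3])
```

A small p value favors the heterogeneity model, that is, letting the residual error term differ by medication status.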
The interaction between the simulated genotypes and BP medication status on SBP at t = 3 (equation 1) was modeled to calculate the estimated interaction effect, model-based SEs, and p values using MMAP. The 1DF and 2DF joint tests (shown below) for this method have been described by Manning et al.
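As a hedged illustration of the two tests for the interaction model, the sketch below assumes the usual form SBP = β0 + βSNP·SNP + βMed·Med + βInt·SNP×Med + covariates. The 1DF test is taken to be a Wald test of βInt alone, and the 2DF joint test a Wald test of (βSNP, βInt) using their 2 × 2 covariance matrix. This is our reading of the approach attributed to Manning et al., not code from MMAP. The function name `interaction_tests` and all variances and covariances are hypothetical; only the effect size of 6.2 mm Hg comes from the text.

```python
import math


def interaction_tests(beta_snp, beta_int, var_snp, var_int, cov_snp_int):
    """Wald tests for a model with a SNP x medication interaction term.

    Returns the 1DF statistic and p value for the interaction alone, and the
    2DF statistic and p value for the joint test of main and interaction effects.
    """
    # 1DF: interaction estimate against its own variance.
    chi2_1df = beta_int ** 2 / var_int
    p_1df = math.erfc(math.sqrt(chi2_1df / 2.0))  # chi-square(1) upper tail

    # 2DF: quadratic form b' V^-1 b with V inverted in closed form.
    det = var_snp * var_int - cov_snp_int ** 2
    chi2_2df = (beta_snp ** 2 * var_int
                - 2.0 * beta_snp * beta_int * cov_snp_int
                + beta_int ** 2 * var_snp) / det
    p_2df = math.exp(-chi2_2df / 2.0)  # chi-square(2) upper tail

    return {"chi2_1df": chi2_1df, "p_1df": p_1df,
            "chi2_2df": chi2_2df, "p_2df": p_2df}


# No main effect and an interaction of 6.2 mm Hg (as simulated);
# the variances and covariance are made-up values for illustration.
result = interaction_tests(beta_snp=0.0, beta_int=6.2,
                           var_snp=4.0, var_int=9.0, cov_snp_int=-3.0)

# With zero covariance and no main effect the two statistics coincide.
uncorrelated = interaction_tests(0.0, 6.2, 4.0, 9.0, 0.0)
```

Because the main-effect and interaction estimates are typically correlated, the joint test needs the off-diagonal covariance term; dropping it changes the 2DF statistic.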
Medication-stratified, “med-diff” approach
To apply the med-diff approach to BP medication–stratified results (Models 2a, b), we modeled the genetic main effect within strata of BP medication status using MMAP. We then used EasyStrata to estimate the Spearman rank correlation coefficient between strata across all SNPs (r, range across replicates and chromosomes: −0.13 to 0.16) and the magnitude, SE, and p value of the between-strata difference. The 1DF and the 2DF joint tests (shown below) have been described by Randall et al. and Aschard et al.
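The stratified tests can be sketched as follows. This is our understanding of the published tests rather than EasyStrata code: the 1DF test is assumed to be a Z test of the difference in stratum-specific effects, with the SE of the difference corrected by the between-strata correlation r, and the 2DF joint test is assumed to be the sum of the squared stratum-specific Z statistics. The function name `med_diff_tests` and the SEs are hypothetical; the effect of 6.2 mm Hg in medicated carriers mirrors the simulated interaction.

```python
import math


def med_diff_tests(beta_med, se_med, beta_unmed, se_unmed, r=0.0):
    """Difference (1DF) and joint (2DF) tests from medication-stratified results.

    r is the correlation between the stratum-specific estimates (for example,
    a Spearman rank correlation computed across all SNPs).
    """
    # 1DF: difference in genetic effects between strata.
    diff = beta_med - beta_unmed
    se_diff = math.sqrt(se_med ** 2 + se_unmed ** 2
                        - 2.0 * r * se_med * se_unmed)
    z_diff = diff / se_diff
    p_1df = math.erfc(abs(z_diff) / math.sqrt(2.0))  # two-sided normal p value

    # 2DF: sum of squared stratum Z statistics; treats the strata as independent.
    chi2_2df = (beta_med / se_med) ** 2 + (beta_unmed / se_unmed) ** 2
    p_2df = math.exp(-chi2_2df / 2.0)  # chi-square(2) upper tail

    return {"diff": diff, "se_diff": se_diff, "z_diff": z_diff,
            "p_1df": p_1df, "chi2_2df": chi2_2df, "p_2df": p_2df}


# Hypothetical SEs; no genetic effect in the unmedicated stratum.
res = med_diff_tests(beta_med=6.2, se_med=3.0, beta_unmed=0.0, se_unmed=4.0)

# A positive correlation between strata shrinks the SE of the difference.
res_corr = med_diff_tests(6.2, 3.0, 0.0, 4.0, r=0.16)
```

In this sketch r enters only the 1DF test, so the 2DF statistic is the same whatever the correlation between strata.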
The average age was 48.1 years (58 % female) with the youngest and oldest participants across all replicates being 18 and 101 years old, respectively. The mean prevalence of BP medication use at the last simulated time point was 32.7 % (range across replicates: 30.2 to 36.4 %). The average change in SBP in individuals who initiated BP medication between the first and last time points was −6.9 mm Hg. The average odd-numbered chromosome-wide convergence was higher for the interaction model than for the med-diff approach (1DF: 8.0 × 10⁶ vs. 7.0 × 10⁶; 2DF: 6.5 × 10⁶ vs. 5.7 × 10⁶).
CYP3A43 true-positive gene–medication effects (1DF)
Table: Power to detect true-positive gene–medication interactions at CYP3A43 using 2 approaches. Gene–medication interactions (200 replicates) were simulated to be 6.2 mm Hg at 3 single nucleotide polymorphisms (SNPs) representing a range of minor allele frequencies (MAFs) (0.1 to 5.4 %).
[Table values not recovered from the source. Recovered column header: 2DF P(SNP, Int). Recovered rows, each reporting replicates with P <0.05 among converged replicates: chr7:99457518-A (0.1 % MAF), 141 converged; chr7:99454482-G (0.8 % MAF), 190 converged; chr7:99457605-C (5.4 % MAF), 200 converged.]
MAP4 true-positive main genetic effects (2DF)
At chromosome (chr) 3:48040283, the main genetic effect was overestimated by both the interaction model (−13.1 mm Hg) and the med-diff stratified analysis of nonmedicated individuals (−13.3 mm Hg), whereas the med-diff analysis of medicated individuals underestimated the effect (−7.8 mm Hg). The 2DF joint p values were comparable between the 2 approaches with TPPs of 100 % (n = 156).
False-positive proportions (1DF, 2DF)
Randall et al. have argued that 1DF tests, such as the interaction model and the med-diff approach, are particularly useful for informing public health interventions by highlighting the nature by which the environment may attenuate or exacerbate genetic predisposition to disease susceptibility. Our preliminary results indicate that these 2 G × E approaches differ notably with respect to the interaction effect simulated herein. The interaction model may be better at detecting true-positive interactions than the med-diff approach for common SNPs, but the med-diff approach may estimate interaction effects more conservatively (ie, closer to the null) and with less variability at rare SNPs (MAF <1 %). However, it is unclear how nonconvergence may be influencing these results.
The simulated GAW19 data set used in this analysis did not contain any simulated interaction effects at loci with simulated main genetic effects. Thus, using the simulated GAW19 data we were unable to validate a true positive with both main and interaction effects for either approach (2DF test), which may limit the generalizability of our findings. We were, however, able to compare our ability to detect a strong main effect at MAP4 using a 2DF joint test of main and interaction effects. Even though we expect that a 1DF test of the main genetic effect would be a more powerful approach than a 2DF test when there is a true genetic main effect, this “true positive” assessment may represent a real-world application of a 2DF test, wherein the investigator has no knowledge of the underlying true effects and may be interested in assessing the influence of potential interactions on established genetic loci. Furthermore, unlike the med-diff approach 1DF test implementation, the published 2DF test implemented in publicly available software does not account for the potential for correlation between strata due to relatedness. It is unclear how this may bias the observed 2DF results and methods comparisons made herein. A more thorough investigation of these approaches in family-based and unrelated data sets, to detect associations at loci with both true main and interaction effects, is warranted in future work.
Based on our findings, the use of these G × E approaches on SNPs with less than 1 % MAF may lead to unstable TPPs and FPPs. First, nonconvergence may plague the most moderate samplings of the data, allowing the SEs to appear smaller than they really are and TPPs higher than expected, as we observed for the least common CYP3A43 SNP (MAF 0.1 %). At the SNPs examined at CYP3A43 and MAP4 we observed identical convergence between the 2 approaches. Yet we observed lower odd-numbered chromosome-wide convergence for the med-diff approach than for the interaction model, because all medication-stratified models must have converged in order to apply the med-diff approach. Second, we observed FPPs for the interaction model that were lower than expected (5 %) for both 1DF and 2DF tests, which was not the case for the med-diff approach. An analytic focus on low-frequency SNPs (MAF 1–5 %) or common SNPs (MAF >5 %) may minimize the observed difference in the FPPs between the 2 approaches.
In this specific simulated example of gene–medication interactions in family-based data, the med-diff approach exhibited greater power to detect interaction effects for rare variants (MAF <1 %), whereas the interaction model exhibited greater power for common alleles (MAF >5 %). However, the med-diff method resulted in a stable but greater FPP for low-frequency variants as compared to the interaction method. In summary, both approaches are robust for common variants (MAF >5 %), but become less concordant as MAF decreases. One of the benefits of the stratified analysis of the med-diff approach is that it is less computationally intensive, but it may not be appropriate for continuous environmental factors. This indicates that model selection may in part be context-specific. Furthermore, if rare variation is of interest in future investigations, our findings here suggest that the med-diff approach may estimate true interaction effects more robustly, with less variability, and with stable FPPs around 5 % as expected; however, if common variants are the focus, the interaction model may be more robust.
Future investigations should consider different types of simulated interactions, for example, where genetic effects are in opposite directions in the 2 environmental strata or where genetic effects differ in magnitude between strata (ie, when both interaction and main effects are present). This current study used the GAW19 simulated family-based data to fill a gap in the G × E literature and make quantitative comparisons directly relevant to model choice in future effect estimation (TPP) and discovery (FPP) studies in the field of genetic epidemiology.
We would like to thank Thomas W. Winkler for his help implementing the analysis in EasyStrata and his review of the manuscript.
The T2D-GENES Consortium Project is supported by National Institutes of Health grants U01 DK085524, U01 DK085501, U01 DK085526, U01 DK085584, and U01 DK085545. The San Antonio Family Studies are supported by P01 HL045222, R01 DK047482, and R01 DK053889. This work is funded in part through NIH grants U01 HL084756, 2 T32 HL007055-36, and the AHA Pre-/Post-doctoral Fellowships. This article has been published as part of BMC Proceedings Volume 10 Supplement 7, 2016: Genetic Analysis Workshop 19: Sequence, Blood Pressure and Expression Data. Summary articles. The full contents of the supplement are available online at http://bmcproc.biomedcentral.com/articles/supplements/volume-10-supplement-7. Publication of the proceedings of Genetic Analysis Workshop 19 was supported by National Institutes of Health grant R01 GM031575.
Availability of data and materials
The data were made available as part of participation in the GAW19 workshop.
LFR participated in this study’s design and coordination, and in statistical analyses, assembling, and drafting the manuscript. CJH assisted in statistical analyses, assembling, and drafting the manuscript. MG, AGH, AAS, and AEJ assisted in statistical analyses and drafting the manuscript. SML, JRO, and KEN assisted in design of the study and drafting the manuscript. CLA, GC, NF, VSV, and KLY assisted in drafting the manuscript. AEJ conceived of the study and participated in its design. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
1. Centers for Disease Control and Prevention (CDC). Vital signs: awareness and treatment of uncontrolled hypertension among adults—United States, 2003–2010. MMWR Morb Mortal Wkly Rep. 2012;61:703–9.
2. Gu Q, Burt VL, Dillon CF, Yoon S. Trends in antihypertensive medication use and blood pressure control among United States adults with hypertension: the National Health And Nutrition Examination Survey, 2001 to 2010. Circulation. 2012;126(17):2105–14.
3. Konoshita T, Genomic Disease Outcome Consortium (G-DOC) Study Investigators. Do genetic variants of the renin-angiotensin system predict blood pressure response to renin-angiotensin system-blocking drugs?: a systematic review of pharmacogenomics in the renin-angiotensin system. Curr Hypertens Rep. 2011;13(5):356–61.
4. Suonsyrja T, Donner K, Hannila-Handelberg T, Fodstad H, Kontula K, Hiltunen TP. Common genetic variation of beta1- and beta2-adrenergic receptor and response to four classes of antihypertensive treatment. Pharmacogenet Genomics. 2010;20(5):342–5.
5. Franceschini N, Fox E, Zhang Z, Edwards TL, Nalls MA, Sung YJ, Tayo BO, Sun YV, Gottesman O, Adeyemo A, et al. Genome-wide association analysis of blood-pressure traits in African-ancestry individuals reveals common associated genes in African and non-African populations. Am J Hum Genet. 2013;93(3):545–54.
6. He J, Kelly TN, Zhao Q, Li H, Huang J, Wang L, Jaquish CE, Sung YJ, Shimmin LC, Lu F, et al. Genome-wide association study identifies 8 novel loci associated with blood pressure responses to interventions in Han Chinese. Circ Cardiovasc Genet. 2013;6(6):598–607.
7. Hottenga JJ, Boomsma DI, Kupper N, Posthuma D, Snieder H, Willemsen G, de Geus EJ. Heritability and stability of resting blood pressure. Twin Res Hum Genet. 2005;8(5):499–508.
8. International Consortium for Blood Pressure Genome-Wide Association Studies, Ehret GB, Munroe PB, Rice KM, Bochud M, Johnson AD, Chasman DI, Smith AV, Tobin MD, Verwoert GC, et al. Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature. 2011;478(7367):103–9.
9. Kato N, Takeuchi F, Tabara Y, Kelly TN, Go MJ, Sim X, Tay WT, Chen CH, Zhang Y, Yamamoto K, et al. Meta-analysis of genome-wide association studies identifies common variants associated with blood pressure variation in East Asians. Nat Genet. 2011;43(6):531–8.
10. Kraja AT, Vaidya D, Pankow JS, Goodarzi MO, Assimes TL, Kullo IJ, Sovio U, Mathias RA, Sun YV, Franceschini N, et al. A bivariate genome-wide approach to metabolic syndrome: STAMPEED consortium. Diabetes. 2011;60(4):1329–39.
11. Levy D, Ehret GB, Rice K, Verwoert GC, Launer LJ, Dehghan A, Glazer NL, Morrison AC, Johnson AD, Aspelund T, et al. Genome-wide association study of blood pressure and hypertension. Nat Genet. 2009;41(6):677–87.
12. Newton-Cheh C, Johnson T, Gateva V, Tobin MD, Bochud M, Coin L, Najjar SS, Zhao JH, Heath SC, Eyheramendy S, et al. Genome-wide association study identifies eight loci associated with blood pressure. Nat Genet. 2009;41(6):666–76.
13. Tragante V, Barnes MR, Ganesh SK, Lanktree MB, Guo W, Franceschini N, Smith EN, Johnson T, Holmes MV, Padmanabhan S, et al. Gene-centric meta-analysis in 87,736 individuals of European ancestry identifies multiple blood-pressure-related loci. Am J Hum Genet. 2014;94(3):349–60.
14. Wain LV, Verwoert GC, O’Reilly PF, Shi G, Johnson T, Johnson AD, Bochud M, Rice KM, Henneman P, Smith AV, et al. Genome-wide association study identifies six new loci influencing pulse pressure and mean arterial pressure. Nat Genet. 2011;43(10):1005–11.
15. Rutherford S, Cai G, Lopez-Alvarenga JC, Kent JW, Voruganti VS, Proffitt JM, Curran JE, Johnson MP, Dyer TD, Jowett JB, et al. A chromosome 11q quantitative-trait locus influences change of blood-pressure measurements over time in Mexican Americans of the San Antonio Family Heart Study. Am J Hum Genet. 2007;81(4):744–55.
16. Lee JW, Aminkeng F, Bhavsar AP, Shaw K, Carleton BC, Hayden MR, Ross CJ. The emerging era of pharmacogenomics: current successes, future potential, and challenges. Clin Genet. 2014;86(1):21–8.
17. Weitzel KW, Elsey AR, Langaee TY, Burkley B, Nessl DR, Obeng AO, Staley BJ, Dong HJ, Allan RW, Liu JF, et al. Clinical pharmacogenetics implementation: approaches, successes, and challenges. Am J Med Genet C: Semin Med Genet. 2014;166C(1):56–67.
18. Manning AK, LaValley M, Liu CT, Rice K, An P, Liu Y, Miljkovic I, Rasmussen-Torvik L, Harris TB, Province MA, et al. Meta-analysis of gene-environment interaction: joint estimation of SNP and SNP x environment regression coefficients. Genet Epidemiol. 2011;35(1):11–8.
19. Randall JC, Winkler TW, Kutalik Z, Berndt SI, Jackson AU, Monda KL, Kilpeläinen TO, Esko T, Mägi R, Li S, et al. Sex-stratified genome-wide association studies including 270,000 individuals show sexual dimorphism in genetic loci for anthropometric traits. PLoS Genet. 2013;9(6):e1003500.
20. Blangero J, Teslovich TM, Sim X, Almeida MA, Jun G, Dyer TD, Johnson M, Peralta JM, Manning AK, Wood AR, et al. Omics squared: human genomic, transcriptomic, and phenotypic data for Genetic Analysis Workshop 19. BMC Proc. 2015;9 Suppl 8:S2.
21. Almasy L, Dyer TD, Peralta JM, Jun G, Wood AR, Fuchsberger C, Almeida MA, Kent Jr JW, Fowler S, Blackwell TW, et al. T2D-GENES Consortium: Data for Genetic Analysis Workshop 18: human whole genome sequence, blood pressure, and simulated phenotypes in extended pedigrees. BMC Proc. 2014;8 Suppl 1:S2.
22. Abecasis GR, Cherny SS, Cookson WO, Cardon LR. Merlin—rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet. 2002;30(1):97–101.
23. O’Connell JR. MMAP: A Comprehensive Mixed Model Program for Analysis of Pedigree and Population Data. Boston: American Society of Human Genetics; 2013.
24. O’Connell JR. MMAP User Guide. Baltimore: University of Maryland; 2014.
25. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38(8):904–9.
26. Kelly TN, Takeuchi F, Tabara Y, Edwards TL, Kim YJ, Chen P, Li H, Wu Y, Yang CF, Zhang Y, et al. Genome-wide association study meta-analysis reveals transethnic replication of mean arterial and pulse pressure loci. Hypertension. 2013;62(5):853–9.
27. Winkler TW. EasyStrata: Evaluation of Stratified Genome-Wide Association Meta-Analysis Results. 2014. http://cran.r-project.org/web/packages/EasyStrata/index.html.
28. Aschard H, Hancock DB, London SJ, Kraft P. Genome-wide meta-analysis of joint tests for genetic and gene-environment interaction effects. Hum Hered. 2010;70(4):292–300.