Volume 8 Supplement 1
Genetic Analysis Workshop 18: Human sequence data in extended pedigrees
A powerful statistical method identifies novel loci associated with diastolic blood pressure triggered by nonlinear geneenvironment interaction
 Honglang Wang^{1},
 Tao He^{1},
 Cen Wu^{1},
 PingShou Zhong^{1} and
 Yuehua Cui^{1}Email author
DOI: 10.1186/175365618S1S61
© Wang et al.; licensee BioMed Central Ltd. 2014
Published: 17 June 2014
Abstract
The genetic basis of blood pressure often involves multiple genetic factors and their interactions with environmental factors. Geneenvironment interaction is assumed to play an important role in determining individual blood pressure variability. Older people are more prone to high blood pressure than younger ones and the risk may not display a linear trend over the life span. However, which gene shows sensitivity to aging in its effect on blood pressure is not clear. In this work, we allowed the genetic effect to vary over time and propose a varyingcoefficient model to identify potential genetic players that show nonlinear response across different age stages. We detected 2 novel loci, gene MIR1263 (a microRNA coding gene) on chromosome 3 and gene UNC13B on chromosome 9, that are nonlinearly associated with diastolic blood pressure. Further experimental validation is needed to confirm this finding.
Background
The genetic basis of a complex trait often involves multiple genetic factors functioning in a coordinated manner. The extent to which our genetic blueprint expresses also depends on the interactions between genetic and environmental factors. Increasing evidence shows the importance of geneenvironment (G × E) interactions in determining the risk of a variety of diseases such as respiratory diseases [1], obesity [2], and psychiatric disorders [3]. For a review of G × E interaction, see the work of Hunter [4]. The empirical evidence underscores the importance of developing novel statistical approaches to identify major genetic players that are sensitive to environmental stimuli and to further understand how they function.
Blood pressure is a heritable trait influenced by several biological pathways sensitive to environmental stimuli. High blood pressure, or hypertension, affects more than 1 billion people worldwide. It damages an individual's body in many ways over time, leading to heart disease, stroke, kidney failure, and other health problems [5]. Age is known to be a risk factor for high blood pressure. Systolic blood pressure rises with age, whereas the diastolic blood pressure tends to fall. For people with preexisting high blood pressure, this agerelated pattern occurs even if the blood pressure is well controlled with medication [6]. The reasons why blood pressure changes with age are still poorly understood, but are a topic of intense research. Thus, age should be an important predictor when searching for genetic players responsible for hypertension. However, few studies have considered an agedependent mechanism in their analysis.
where Y is a quantitative trait (diastolic blood pressure in this analysis), G is the genetic variable, X is the environmental variable (age), and is the error term. This is a classical linear model for G × E interaction analysis. As can be seen, equation (1) automatically assumes a linear interaction mechanism between G and X because the coefficient for G is a linear function in X. However, the contribution of the same gene to blood pressure may be quite different at different age levels. This nonlinear penetrance can be well understood by a statistical varyingcoefficient (VC) model [9]. VC models allow the coefficients to change smoothly and nonlinearly with other variables so that one can explore the dynamic feature of a response over time with great flexibility and nice interpretability [10].
In this work, we applied VC models to detect genetic variants associated with diastolic blood pressure from the Genetic Analysis Workshop 18 (GAW18) data with 142 unrelated individuals. We allowed the contribution of genetic variants to blood pressure to vary over time via varying coefficients. We further proposed a sequence of hypothesis tests to evaluate whether the effect of a genetic variant is sensitive to aging, and if it is, is it in a linear or nonlinear fashion? Using this analysis, we identified 2 novel loci that show nonlinear effects over time to affect blood pressure.
Methods
The model
for given $\left(X,G\right)$ and the response $Y$ with $E\left(\u03f5X,G\right)=0$ and $Var\left(\u03f5\text{}X,G\right)=1;$${\sigma}^{2}\left(X\right)=Var\left(YX,G\right)$ is the conditional variance function. The mean function is defined as $m\left(X,G\right)=\alpha \left(X\right)+\beta \left(X\right)G,$ where $\beta \left(X\right)$ is a smoothing function in $X$. Under the VC modeling framework, the effect of a gene is allowed to vary as a function of environmental factors, either linearly or nonlinearly, captured by the model itself. Thus, the VC model has the potential to dissect the nonlinear penetrance of genetic variants. Here we also allow nonlinear function of $X$ with $Y$ modeled by $\alpha \left(X\right)$. This nonlinear term adjusts the nonlinear effect of $X$ when estimating the nonlinear effect of $\beta \left(X\right)$. If we take $\alpha \left(X\right)={\alpha}_{0}+{\alpha}_{1}X,$ equation (1) is just a special case of the VC model when $\beta \left(X\right)={\beta}_{1}+{\beta}_{2}X$.
Hypothesis testing
The following list shows all 4 mean models involved in our analysis.

Model 1: $m\left(X,G\right)=\alpha \left(X\right),$ no genetic effect at all;

Model 2: $m\left(X,G\right)=\alpha \left(X\right)+\beta G,$ linear genetic effect without interaction;

Model 3: $m\left(X,G\right)=\alpha \left(X\right)+\left({\beta}_{0}+{\beta}_{1}X\right)G,$ linear genetic effect with interaction; and

Model 4: $m\left(X,G\right)=\alpha \left(X\right)+\beta \left(X\right)G,$ nonlinear genetic effect.
The rejection of the null indicates that the genetic effect is sensitive to age in a nonlinear fashion. The sequence of hypothesis tests stated above was suggested by Ma et al [9] for optimal power to detect association.
Model implementation
where ${\hat{\tau}}^{2}=1/n{\sum}_{i=1}^{n}{\left\{{Y}_{i}\hat{m}\left({X}_{i},{G}_{i}\right)\right\}}^{2}$ Then the same number of knots ${N}_{\alpha}$ and degree ${p}_{\alpha}$ were applied to estimate function $\alpha \left(X\right)$ when fitting mean models 2 to 4.
Thus we have $\widehat{\alpha}\left(x\right)={\sum}_{t=1}^{{N}_{\alpha}+{p}_{\alpha}+1}{\widehat{\theta}}_{t}{B}_{t}\left(x\right)$ and $\widehat{\beta}\left(x\right)={\sum}_{s=1}^{{N}_{\beta}+{p}_{\beta}+1}{\widehat{\lambda}}_{s}{B}_{s}\left(x\right),$ where ${N}_{\beta}$ and ${p}_{\beta}$ are also selected following the above BIC criterion.
Thus we have $\hat{v}={\left({B}^{T}B\right)}^{1}{B}^{T}{\hat{\u03f5}}^{2},$ and ${\left({\hat{\sigma}}^{2}\left({X}_{1}\right),\cdots \phantom{\rule{0.3em}{0ex}},{\hat{\sigma}}^{2}\left({X}_{n}\right)\right)}^{T}=B\hat{v}=B{\left({B}^{T}B\right)}^{1}{B}^{T}{\hat{\u03f5}}^{2}.$ Wild bootstrap can be applied to assess the significance of ${H}_{0}^{3}$[11].
Results
We applied the above models to the GAW18 genomewide association data. We focused our analysis on diastolic blood pressure (DBP) to identify any genetic players that can explain the variability of DBP triggered by nonlinear genetic penetrance over time. We treated DBP as the response Y and age as the X variable. The genetic variable G is coded following an additive model, that is, G = 1, 0, −1, corresponding to genotype AA, Aa, aa, respectively. In total, 142 individuals and 388,099 SNPs were left after removing SNPs with a minor allele frequency less than 0.05. These SNPs are distributed on oddnumbered chromosomes from chromosome 1 to chromosome 21.
List of SNPs with p value <5 × 10^{−}^{7}
rs ID  Gene name  Chr  ${p}_{41}$  ${p}_{31}$  ${p}_{21}$  ${p}_{42}$  ${p}_{43}$ 

rs1086097  MIR1263  3  1.9 × 10^{−7}  0.009  0.90  4.97 × 10^{−8}  1.08 × 10^{−6} 
rs686697  MIR1263  3  3.4 × 10^{−7}  0.005  0.76  9.24 × 10^{−8}  3.41 × 10^{−6} 
rs483558  unknown  3  4.7 × 10^{−7}  0.007  0.76  1.30 × 10^{−7}  3.72 × 10^{−6} 
rs9863717*  unknown  3  4.96 × 10^{−8}  0.009  0.95  1.23 × 10^{−8}  2.55 × 10^{−7} 
rs1575160*  unknown  3  8.7 × 10^{−8}  0.011  0.70  2.33 × 10^{−8}  3.76 × 10^{−7} 
rs723877  UNC13B  9  4.3 × 10^{−7}  1.25 × 10^{−6}  2.37 × 10^{−5}  6.1 × 10^{−4}  0.02 
rs10972462*  UNC13B  9  9.5 × 10^{−8}  6.96 × 10^{−7}  1.31 × 10^{−5}  2.3 × 10^{−4}  0.007 
Discussion and conclusions
In this work, we proposed to model the genetic effect as a nonlinear function of age. It is clear that the classical linear model, with or without interaction, is just a special case of the VC model. However, the VC model has the flexibility to capture potential nonlinear genetic effects over time. Evidence of nonlinear genetic effects has been reported previously. For example, Laitala et al [13] reported the curvilinear genetic effect on interindividual differences in coffee consumption over age. In a study of congenital scoliosis in mice [14], the authors found that mutations in genes HES7 and MESP2 are sensitive to different degrees of hypoxia, which is responsible for a nonlinear increase in the severity and penetrance of vertebral defects. Our analysis identified 2 novel loci associated with DBP with nonlinear genetic effects. They can be missed by the traditional linear interaction model. However, because statistical significance does not necessarily imply causality, further experimental validation is needed to confirm the finding.
As shown in Ma et al [9], the VC model loses power because of high degrees of freedom in the test in cases where the genetic effect is not very complex, such as in a linear form. Thus one should assess constant or linear effects first, followed by fitting the corresponding model suggested by the results of the tests. In this analysis, we found that the coefficients are constant for most SNPs.
Note that the function $\alpha \left(X\right)$ models the overall mean of DBP over time when there is no genetic effect. When a linear structure for $\alpha \left(X\right)\left(={\alpha}_{0}+{\alpha}_{1}X\right)$ is forced, we observe inflated signals for testing ${H}_{0}:\beta \left(X\right)=0$. Thus, the incorporation of this nonlinear function can largely reduce false positives. In this analysis we coded the genetic variable G in an additive fashion, although other disease models such as dominant or recessive can also be assumed, while the optimal one can be selected based on a model selection criterion such as BIC.
Declarations
Acknowledgements
We greatly appreciate the GAW18 data provider and the National Institutes of Health grant that supports GAW18. We also wish to thank the anonymous reviewer for the insightful comments that greatly improved the manuscript. This work was partially supported by National Science Foundation grants DMS1209112 and IOS1237969. The GAW18 whole genome sequence data were provided by the T2DGENES Consortium, which is supported by NIH grants U01 DK085524, U01 DK085584, U01 DK085501, U01 DK085526, and U01 DK085545. The other genetic and phenotypic data for GAW18 were provided by the San Antonio Family Heart Study and San Antonio Family Diabetes/Gallbladder Study, which are supported by NIH grants P01 HL045222, R01 DK047482, and R01 DK053889. The Genetic Analysis Workshop is supported by NIH grant R01 GM031575.
This article has been published as part of BMC Proceedings Volume 8 Supplement 1, 2014: Genetic Analysis Workshop 18. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcproc/supplements/8/S1. Publication charges for this supplement were funded by the Texas Biomedical Research Institute.
Authors’ Affiliations
References
 Kleeberger SR, Peden D: Geneenvironment interactions in asthma and other respiratory diseases. Annu Rev Med. 2005, 56: 383400. 10.1146/annurev.med.56.062904.144908.View ArticlePubMedGoogle Scholar
 Qi L, Cho YA: Geneenvironment interaction and obesity. Nutr Rev. 2008, 66: 684694. 10.1111/j.17534887.2008.00128.x.PubMed CentralView ArticlePubMedGoogle Scholar
 Caspi A, Moffitt TE: Geneenvironment interactions in psychiatry: joining forces with neuroscience. Nat Rev Neurosci. 2006, 7: 583590.View ArticlePubMedGoogle Scholar
 Hunter DJ: Geneenvironment interactions in human diseases. Nat Rev Genet. 2005, 6: 287298.View ArticlePubMedGoogle Scholar
 The International Consortium for Blood Pressure GenomeWide Association Studies: Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature. 2011, 478: 103109. 10.1038/nature10405.PubMed CentralView ArticleGoogle Scholar
 Kannel WB: Blood pressure as a cardiovascular risk factor: prevention and treatment. JAMA. 1996, 275: 15711576. 10.1001/jama.1996.03530440051036.View ArticlePubMedGoogle Scholar
 Falconer DS: The problem of environment and selection. Am Nat. 1952, 86: 293298. 10.1086/281736.View ArticleGoogle Scholar
 Hoffman AA, Parsons PA: Evolutionary Genetics and Environmental Stress. 1991, New York, Oxford University PressGoogle Scholar
 Ma S, Yang L, Romero R, Cui Y: Varying coefficient model for geneenvironment interaction: a nonlinear look. Bioinformatics. 2011, 27: 21192126. 10.1093/bioinformatics/btr318.PubMed CentralView ArticlePubMedGoogle Scholar
 Fan J, Zhang W: Statistical methods with varying coefficient models. Stat Interface. 2008, 1: 179195. 10.4310/SII.2008.v1.n1.a15.PubMed CentralView ArticlePubMedGoogle Scholar
 Härdle W, Mammen E: Comparing nonparametric versus parametric regression fits. Ann Stat. 1993, 21: 19261947. 10.1214/aos/1176349403.View ArticleGoogle Scholar
 EURAGEDIC Consortium: G/T Substitution in intron 1 of the UNC13B gene is associated with increased risk of nephropathy in patients with type 1 diabetes. Diabetes. 2008, 57: 28432850. 10.2337/db080073.
 Laitala VS, Kaprio J, Silventoinen K: Genetics of coffee consumption and its stability. Addiction. 2008, 103: 20542061. 10.1111/j.13600443.2008.02375.x.View ArticlePubMedGoogle Scholar
 Sparrow DB, Chapman G, Smith AJ, Mattar MZ, Major JA, O'Reilly VC, Saga Y, Zackai EH, Dormans JP, Alman BA, et al: A mechanism for geneenvironment interaction in the etiology of congenital scoliosis. Cell. 2012, 149: 295306. 10.1016/j.cell.2012.02.054.View ArticlePubMedGoogle Scholar
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.