Haplotype approach for association analysis on hypertension
© Shen et al.; licensee BioMed Central Ltd. 2014
Published: 17 June 2014
We applied a gene-based haplotype approach to the genome-wide association analysis of hypertension using Genetic Analysis Workshop 18 data for unrelated individuals. Associations between single-nucleotide polymorphisms and the clinical outcome were first assessed, and haplotypes were then constructed based on the gene information and the linkage disequilibrium plot. Extensive haplotype analysis was also conducted for the whole of chromosome 3. We found 1 block from the ULK4 gene and 2 blocks from the LOC64690 gene that were significantly associated with hypertension.
Hypertension is a major risk factor for many diseases, including stroke and heart failure. Various genetic studies have been done, and a number of genes have been identified as having strong associations with hypertension or high blood pressure. In our study, we proposed a haplotype approach to identify blocks within a gene that have strong associations with hypertension. Focusing on a block of the gene, instead of looking only at a particular point, may better capture the disease pattern and take the potential interactions between markers into account. In addition, because the number of tests is reduced compared with single-nucleotide polymorphism (SNP) tests, there is less penalty from multiple testing. We report significant haplotypes from the association analysis.
Definition of outcome and predictors
Hypertension was defined as systolic blood pressure > 140 mm Hg and diastolic blood pressure > 90 mm Hg, or as being on antihypertensive medications at a specific examination. For this study, we defined our outcome as "ever-hypertension" if an individual was hypertensive at any of the 4 examinations, and as "never-hypertension" if the hypertension criteria were never met at those 4 examinations. In this way, we collapsed the longitudinal data into a single hypertension outcome per individual. The genetic analysis was restricted to unrelated individuals.
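The outcome definition above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not the code used in the study: the function names, the tuple layout (systolic, diastolic, medication flag), and the choice to treat an examination with missing blood pressure and no medication as not meeting the criteria are all assumptions made here.

```python
from typing import Optional, Sequence, Tuple

# One examination: (systolic mm Hg, diastolic mm Hg, on antihypertensive medication).
# This layout is a hypothetical convenience, not the GAW18 file format.
Exam = Tuple[Optional[float], Optional[float], bool]


def hypertensive_at_exam(sbp: Optional[float], dbp: Optional[float],
                         on_medication: bool) -> bool:
    """Apply the per-examination rule exactly as worded in the text."""
    # Assumption: missing blood pressure readings cannot satisfy the BP criterion.
    high_bp = (sbp is not None and dbp is not None
               and sbp > 140 and dbp > 90)
    return bool(on_medication) or high_bp


def ever_hypertension(exams: Sequence[Exam]) -> bool:
    """True if the individual met the criteria at any examination."""
    return any(hypertensive_at_exam(*exam) for exam in exams)


# Example: normal at exam 1, elevated at exam 2 -> "ever-hypertension".
print(ever_hypertension([(120, 80, False), (150, 95, False)]))
```

Collapsing the examinations with `any` mirrors the "ever" versus "never" wording: a single qualifying examination is enough to classify an individual as ever-hypertensive.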
Quality control of genotype data
Preliminary analysis and gene-based haplotype construction
A logistic regression model was used to test the association between each SNP and the defined hypertension outcome, adjusting for covariates and for the principal component vectors obtained from the population stratification procedure. We first identified nominally significant SNPs (p < 5 × 10−4) from this preliminary model and then located the corresponding genes using the annotation information (T. Nalpathamkalam et al., unpublished data, 2012). For each gene, we defined the haplotype block as a region of high linkage disequilibrium (LD) containing the significant SNPs found in the preliminary model. The blocks were defined by the confidence interval (CI) algorithm as well as the 4-gamete rule algorithm. Then, for each block, we estimated the haplotype frequencies and, for every individual, the probability of carrying each haplotype. The LD blocks and haplotype frequencies were estimated using HAPLOVIEW and PHASE [12–14].
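To make the 4-gamete rule concrete, the following Python sketch checks a single pair of biallelic SNPs. It assumes phased two-SNP haplotypes are already available (for example, from a phasing program) and that a gamete counts as "observed" when its frequency is at least 0.01; that threshold, the function name, and the string encoding of haplotypes are assumptions for illustration and are not taken from the study.

```python
from collections import Counter
from typing import Iterable


def four_gamete_pass(haplotypes: Iterable[str], min_freq: float = 0.01) -> bool:
    """Return True if fewer than four two-SNP gametes are observed.

    haplotypes: phased two-marker haplotypes such as "AC" or "GT",
                one entry per chromosome.
    min_freq:   illustrative frequency below which a gamete is ignored.
    """
    counts = Counter(haplotypes)
    total = sum(counts.values())
    observed = [g for g, c in counts.items() if c / total >= min_freq]
    # Seeing all four gametes suggests a historical recombination between
    # the two SNPs, so the pair would not be placed in the same block.
    return len(observed) < 4


# Three gametes only: the pair is compatible with a single block.
print(four_gamete_pass(["AC"] * 50 + ["AT"] * 30 + ["GC"] * 20))
# All four gametes at appreciable frequency: evidence of recombination.
print(four_gamete_pass(["AC"] * 40 + ["AT"] * 30 + ["GC"] * 20 + ["GT"] * 10))
```

A block-building procedure would apply this pairwise check to runs of consecutive markers; that outer loop is omitted here to keep the sketch short.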
where $p_{hu}$ and $p_{hv}$ denote haplotype frequencies estimated from PHASE. If the omnibus test was significant, meaning that at least 1 haplotype should be kept in the model, we then conducted haplotype-specific tests for each haplotype in the block to identify the specific haplotype strongly associated with the outcome.
Summary of phenotypes and genotypes
Summary of phenotype data
Preliminary association analysis and haplotype construction
Significant SNPs from preliminary model and corresponding genes
Odds ratio (confidence interval)    p value
0.29 (0.15, 0.56)                   2 × 10−4
0.31 (0.18, 0.55)                   7 × 10−5
0.18 (0.08, 0.39)                   2 × 10−5
3.53 (1.87, 6.64)                   9 × 10−5
3.19 (1.74, 5.87)                   2 × 10−4
3.52 (1.86, 6.66)                   1 × 10−4
3.59 (1.77, 7.28)                   4 × 10−4
4.95 (2.06, 11.89)                  3 × 10−4
Significant haplotypes from model 1 in "Methods: Haplotype analysis" section
2.7215 (1.3998, 5.2912)
2.7489 (1.2476, 6.0569)
LOC64690 (rs6785346, rs9857853)
0.2430 (0.1202, 0.4913)
1 × 10−4
3.3028 (1.4293, 7.6320)
3.8169 (1.7371, 8.3867)
9 × 10−4
3.6333 (1.5983, 8.2590)
Adding the interaction between haplotype and age did not improve the model. Power analysis showed that for gene ULK4, at least 258 individuals were needed to have 80% power to detect an interaction effect with a ratio of odds ratios (OR) of 2.0, whereas only 92 individuals were required for the main effects model. For gene LOC64690, 514 individuals were required to reach 80% power for the interaction model (given a ratio of OR of 2.0), whereas only 100 individuals were required for the main effects model to achieve the same level of power.
We also conducted haplotype analysis on the whole of chromosome 3 in PLINK. In PLINK, haplotype blocks are estimated following the default procedure in HAPLOVIEW, and pairwise LD is calculated only for SNPs within 100 kilobases (kb) of each other. We fitted models with and without covariate adjustment. A total of 6389 haplotype blocks were constructed by PLINK, and no haplotype was significant in the omnibus test at the Bonferroni-corrected significance level of 0.05/6389 ≈ 8 × 10−6.
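The Bonferroni threshold quoted above is simply the family-wise level divided by the number of blocks tested. A minimal Python check, using a helper name chosen here for illustration:

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that controls the family-wise error at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# 6389 haplotype blocks on chromosome 3, family-wise level 0.05.
threshold = bonferroni_threshold(0.05, 6389)
print(f"{threshold:.2e}")  # prints 7.83e-06, i.e. roughly 8 x 10^-6
```

Because the haplotype analysis tests 6389 blocks rather than every individual SNP on the chromosome, the per-test threshold is less stringent than it would be for a SNP-by-SNP scan of the same region.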
These results indicate that the haplotype containing SNP rs2700464 on ULK4 is strongly associated with our defined hypertension outcome. Daniel et al. concluded that ULK4 is associated with high blood pressure and, potentially, hypertension. We also found that 2 haplotype blocks on LOC64690 had a strong relationship with hypertension. The interaction effect between age and haplotype was not significant in any model; however, the power analysis indicated that our sample size, although sufficient for the main effects model, was too small to detect an interaction effect.
We focused only on unrelated individuals in our study, ignoring family structures. In further research, we may include the family structure and try to model the complex relationships between family members. In addition, we ran the permutation test for haplotypes in the candidate blocks as well as on the whole of chromosome 3. However, the population structure is not preserved for a logistic model when doing permutation tests. Therefore, the permutation p values may not be a good estimate of the asymptotic p values. We may consider using the biased urn method to overcome this problem in further research.
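For readers unfamiliar with the procedure, a naive label-permutation test is sketched below in Python. It is not the analysis run in the study and not the biased urn method: it compares haplotype carrier proportions between cases and controls and shuffles outcome labels uniformly, which is exactly the kind of permutation that ignores covariates and population structure. The function name, the test statistic, and the default number of permutations are assumptions for illustration.

```python
import random
from typing import Sequence


def permutation_p_value(outcome: Sequence[int], carrier: Sequence[int],
                        n_perm: int = 2000, seed: int = 0) -> float:
    """Naive permutation p value for a case/control difference in carrier rate.

    outcome: 1 for ever-hypertension, 0 for never-hypertension.
    carrier: 1 if the individual carries the haplotype, else 0.
    """
    def stat(labels: Sequence[int]) -> float:
        cases = [c for c, y in zip(carrier, labels) if y == 1]
        controls = [c for c, y in zip(carrier, labels) if y == 0]
        return abs(sum(cases) / len(cases) - sum(controls) / len(controls))

    rng = random.Random(seed)
    observed = stat(outcome)
    shuffled = list(outcome)
    hits = 0
    for _ in range(n_perm):
        # Uniform shuffling treats all individuals as exchangeable,
        # which breaks any link between outcome and confounders.
        rng.shuffle(shuffled)
        if stat(shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


outcome = [1] * 20 + [0] * 20
carrier = [1] * 18 + [0] * 2 + [1] * 2 + [0] * 18
print(permutation_p_value(outcome, carrier))
```

A confounder-aware scheme would instead permute labels with unequal probabilities that depend on each individual's covariates, which is the idea behind the alternative mentioned in the text.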
The Genetic Analysis Workshop 18 (GAW18) whole genome sequence data were provided by the T2D-GENES Consortium, which is supported by NIH grants U01 DK085524, U01 DK085584, U01 DK085501, U01 DK085526, and U01 DK085545. The other genetic and phenotypic data for GAW18 were provided by the San Antonio Family Heart Study and San Antonio Family Diabetes/Gallbladder Study, which are supported by NIH grants P01 HL045222, R01 DK047482, and R01 DK053889. The Genetic Analysis Workshop is supported by NIH grant R01 GM031575.
This article has been published as part of BMC Proceedings Volume 8 Supplement 1, 2014: Genetic Analysis Workshop 18. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcproc/supplements/8/S1. Publication charges for this supplement were funded by the Texas Biomedical Research Institute.
- Kim JJ, Vaziri SA, Elson P, Rini I, Ganapathi MK, Ganapathi R: VEGF single nucleotide polymorphisms and correlation to sunitinib-induced hypertension in metastatic renal cell carcinoma patients [abstract]. J Clin Oncol. 2009, 27: 15s. 10.1200/JCO.2008.21.7695.
- Davidson S: Research suggests importance of haplotypes over SNPs. Nat Biotechnol. 2000, 18: 1134-1135. 10.1038/81100.
- Zhao K, Aranzana MJ, Kim S, Lister C, Shindo C, Tang C, Toomajian C, Zheng H, Dean C, Marjoram P, et al: An Arabidopsis example of association mapping in structured samples. PLoS Genet. 2007, 3: e4. 10.1371/journal.pgen.0030004.
- Durrleman S, Simon R: Flexible regression models with cubic splines. Stat Med. 1989, 8: 551-561. 10.1002/sim.4780080504.
- Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ, et al: PLINK: a toolset for whole-genome association and population-based linkage analysis. Am J Hum Genet. 2007, 81: 559-575. 10.1086/519795.
- The International HapMap Consortium: The International HapMap Project. Nature. 2003, 426: 789-796.
- Patterson NJ, Price AL, Reich D: Population structure and eigenanalysis. PLoS Genet. 2006, 2: e190. 10.1371/journal.pgen.0020190.
- Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D: Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006, 38: 904-909. 10.1038/ng1847.
- Gabriel SB: The structure of haplotype blocks in the human genome. Science. 2002, 296: 2225-2229. 10.1126/science.1069424.
- Wang N: Distribution of recombination crossovers and the origin of haplotype blocks: the interplay of population history, recombination, and mutation. Am J Hum Genet. 2002, 71: 1227-1234. 10.1086/344398.
- Barrett JC, Fry B, Maller J, Daly MJ: Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005, 21: 263-265. 10.1093/bioinformatics/bth457.
- Stephens M, Donnelly P: A comparison of Bayesian methods for haplotype reconstruction from population genotype data. Am J Hum Genet. 2003, 73: 1162-1169. 10.1086/379378.
- Stephens M, Scheet P: Accounting for decay of linkage disequilibrium in haplotype inference and missing-data imputation. Am J Hum Genet. 2005, 76: 449-462. 10.1086/428594.
- Stephens M, Smith N, Donnelly P: A new statistical method for haplotype reconstruction from population data. Am J Hum Genet. 2001, 68: 978-989. 10.1086/319501.
- Zaykin DV, Westfall PH, Young SS, Karnoub MA, Wagner MJ, Ehm MG: Testing association of statistically inferred haplotypes with discrete and continuous traits in samples of unrelated individuals. Hum Hered. 2002, 53: 79-91. 10.1159/000057986.
- Gauderman WJ, Morrison JM: QUANTO 1.1: a computer program for power and sample size calculations for genetic-epidemiology studies. 2006, [http://hydra.usc.edu/gxe]
- Daniel L: Genome-wide association study of blood pressure and hypertension. Nat Genet. 2009, 41: 677-687. 10.1038/ng.384.
- Epstein MP, Duncan R, Jiang Y, Conneely KN, Allen AS, Satten GA: A permutation procedure to correct for confounders in case-control studies, including tests of rare variation. Am J Hum Genet. 2012, 91: 215-223. 10.1016/j.ajhg.2012.06.004.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.