Homozygosity disequilibrium and its gene regulation
© The Author(s). 2016
Published: 18 October 2016
Homozygosity disequilibrium (HD) describes a nonrandom pattern of sizable runs of homozygosity (ROH) that deviated from a random distribution of homozygotes and heterozygotes in the genome. In this study, we developed a double-weight local polynomial model for estimating homozygosity intensity. This new estimation method enables considering the local property and genetic information of homozygosity in the human genome when detecting regions of HD. By using this new method, we estimated whole-genome homozygosity intensities by analyzing real whole-genome sequencing data of 959 related individuals from 20 large pedigrees provided by Genetic Analysis Workshop 19 (GAW19). Through the analysis, we derived the distribution of HD in the human genome and provided evidence for the genetic component of natural variation in HD. Generalized estimating equation analysis for 855 related individuals was performed to identify regions of HD associated with diastolic blood pressure (DBP), systolic blood pressure, and hypertension (HTN), with concomitant adjustment for age and sex. We identified one DBP-associated and 2 HTN-associated regions of HD. We also studied the gene regulation of HD by analyzing the real whole-genome transcription data of 647 individuals. A set of gene expressions regulated by the DBP- and HTN-associated regions of HD was identified. Finally, we conducted simulation studies to evaluate the performance of our homozygosity association test. The results showed that the association test had a high power and that type 1 error was controlled. The methods have been integrated into our developed Loss-of-Heterozygosity Analysis Suite software, which can be downloaded at http://www.stat.sinica.edu.tw/hsinchou/genetics/loh/LOHAS.htm.
Homozygosity disequilibrium (HD) is defined as a nonrandom pattern of sizable runs of homozygosity (ROH) that deviated from a random distribution of homozygotes and heterozygotes in the genome . The 2 types of ROH are: narrow-sense and broad-sense ROHs. A narrow-sense ROH indicates a contiguous stretch of homozygotes in an intact genomic region [2, 3]. A broad-sense ROH entails a homozygosity-rich region that contains a small proportion of heterozygotes, which are caused by genotyping errors, missing genotypes, or mutations . The genetic mechanisms of HD include autozygosity , natural selection , and chromosomal aberrations . Genetic contributions of HD to the susceptibility of Mendelian diseases, complex disorders, and cancers were reported . A population difference in HD was also reported .
We developed statistical methods and software (Loss-of-Heterozygosity Analysis Suite [LOHAS]) to dissect the whole-genome patterns of HD through whole-genome single nucleotide polymorphism (SNP) analysis . Our method adopted a broad-sense definition of ROH. The methods were applied to investigate the association between HD and disease susceptibility [1, 7, 8] and the relationship between HD and the genetic structures of global populations . Because no studies had investigated HD by using whole-genome sequencing (WGS) data, we expanded our methods and software to analyze the WGS data set provided by Genetic Analysis Workshop 18 . The method was based on the assumption that all rare variants (RVs) have an equal weight, even though common homozygotes of RVs with a lower minor allele frequency (MAF) carry less homozygosity information .
In this study, we analyzed real WGS and whole-genome transcription (WGT) data and analyzed simulated data from Genetic Analysis Workshop 19 (GAW19) and our previous study . This study aimed to (a) develop new statistical methods and analysis tools for examining HD in WGS data; (b) characterize patterns of HD in the human genome by using WGS data; (c) identify regions of HD associated with diastolic blood pressure (DBP), systolic blood pressure (SBP), and hypertension (HTN); (d) examine gene expression regulated by HD; and (e) evaluate the power and type 1 error of the proposed genome-wide homozygosity association analysis according to simulated data sets.
The estimates of the homozygosity intensity ranged from 0 to 1. A higher value indicates a higher homozygosity. Heritability of homozygosity intensity in a region of HD was calculated by using a variance component linkage analysis in MERLIN.
Generalized estimation equation (GEE) analysis was used to examine 855 related individuals and their WGS data, blood pressure (SBP and DBP), HTN, and covariates (age and sex). In the GEE analysis, DBP values, SBP values, and HTN status were modeled as 3 separate response variables. Subsequently, within each sliding window, the association between each response variable and homozygosity intensity was examined separately, with concomitant adjustment for age and sex. An identity link was used in the analysis of SBP and DBP and a logit link was used in the analysis of HTN. Furthermore, we identified gene expression regulated by the significant region(s) of HD. In GEE analysis, each gene expression was modeled as a response variable and its association with homozygosity intensity in the DBP-, SBP-, and HTN-associated regions of HD was examined individually, with concomitant adjustment for age and sex.
We performed simulation studies to evaluate the type 1 error and power of homozygosity association tests. The type 1 error was analyzed by examining the association between Q1 and homozygosity intensities, with covariate adjustments for age and sex, by using GEE analysis. The test power was evaluated by examining the association between Q1 and homozygosity intensities in the 3 Q1-associated regions of HD, with covariate adjustments for age and sex, by using GEE analysis. Throughout the study, the nominal significance level of statistical tests was 0.05 after adjusting for the false discovery rate (FDR) for the multiple-testing problem.
We performed GEE analysis to identify gene expression regulated by the DBP- and HTN-associated regions of HD, with concomitant adjustment for age and sex. The results revealed that the expression of 427 genes was regulated by the DBP-associated region on chromosome 11 (see Fig. 2), where 32 genes were found to be associated with cardiovascular diseases according to the Genetic Association Database (http://geneticassociationdb.nih.gov/). Moreover, the expression of 6 and 2 genes was regulated by the HTN-associated regions on chromosomes 11 and 15, respectively (see Fig. 2).
We evaluated the type 1 error of the homozygosity association tests by analyzing 200 simulation data sets of quantitative trait Q1 provided by GAW19. After FDR correction, a homozygosity association analysis of the 849 related individuals calculated the mean of the type 1 errors as 0.0018. We also evaluated the test power by analyzing 3 Q1-associated regions of HD in 200 simulation data sets . The results showed that the power increased as the probability p increased, where the probability p increased as b 1 increased. The power of the homozygosity association test was 1.000 in each condition, except for the following 3 parameter combinations (b 0 , b 1 ) = (−25, 0.4), (−20, 0.3), and (−25, 0.3). The power for the 3 Q1-associated regions of HD that represented the 10th, 50th, and 90th percentiles of proportions of RVs on chromosome 21 was: 0.920, 0.985, and 0.775 for (b 0 , b 1 ) = (−25, 0.4), respectively; 0.630, 0.820, 0.43 for (b 0 , b 1 ) = (−20, 0.3), respectively; and 0.075, 0.070, and 0.125 for (b 0 , b 1 ) = (−25, 0.3), respectively. The power for (b 0 , b 1 ) = (−25, 0.3) was very low because the probability p was lower than 0.03 for every individual.
Our previous study examined the genomic patterns of HD in 270 samples from populations with African, Asian, and European ancestry in the International HapMap II Project . The results showed that HD changed significantly among ethnic groups. In the classification analysis, samples from different populations were classified according to the HD results. In contrast, the HD pattern of the African population was least similar to the Asian and white populations; the African population had the fewest regions of HD. In this study, we derived the distribution of HD in the human genome, provided evidence for the genetic component of natural variation in HD, identified regions of HD associated with DBP and HTN, and also identified a set of gene expressions regulated by the DBP- and HTN-associated regions of HD. The follow-up analyses will include, but are not limited to, a copy number analysis and natural selection analysis that will facilitate verifying the genetic mechanism of HD.
The contributions of this study are summarized as follows: First, we developed a new method for considering the local property and genetic information of homozygosity in homozygosity intensity estimation. In this method, the assumption that RVs have equal importance when defining ROH is relaxed and, the method has a higher power than our previous homozygosity association test . Moreover, this method does not require imputing the homozygosity intensity for the common homozygote of an RV. Therefore, this method is more effective computationally than the previous estimation procedure . Second, our homozygosity intensity analyses of WGS data provide a whole-genome blueprint of HD and evidence for the genetic component of natural variation in HD. Third, the regions and genes of HD associated with DBP and HTN were identified using genome-wide homozygosity association tests, enriching the collection of genes associated with susceptibility to HTN and other blood pressure–related problems. Finally, the association analysis of WGS and WGT data revealed previously unreported evidence of gene regulation of HD. These findings provide a valuable analysis tool and knowledge for future medical genomics and population genomics studies.
We are grateful to GAW19 for providing the genomic and transcriptomic data sets. This work was partially supported by grants from the Ministry of Science and Technology of Taiwan (MOST 103-2314-B-001-008-MY3 and NSC100-2314-B-001-005-MY3) and the Career Development Award of Academia Sinica (AS-100-CDA-M03). We thank 2 anonymous reviewers for constructive suggestions for this paper.
This article has been published as part of BMC Proceedings Volume 10 Supplement 7, 2016: Genetic Analysis Workshop 19: Sequence, Blood Pressure and Expression Data. Summary articles. The full contents of the supplement are available online at http://bmcproc.biomedcentral.com/articles/supplements/volume-10-supplement-7. Publication of the proceedings of Genetic Analysis Workshop 19 was supported by National Institutes of Health grant R01 GM031575.
HCY conceived of the study, developed statistical methods, and prepared the manuscript. YTL developed analysis software and analyzed the data with HCY. Both authors read and approved the final manuscript.
The authors declare they have no competing interests.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Yang HC, Chang LC, Liang YJ, Lin CH, Wang PL. A genome-wide homozygosity association study identifies runs of homozygosity associated with rheumatoid arthritis in the human major histocompatibility complex. PLoS One. 2012;7(4):e34840.View ArticlePubMedPubMed CentralGoogle Scholar
- Gibson J, Morton NE, Collins A. Extended tracts of homozygosity in outbred human populations. Hum Mol Genet. 2006;15(5):789–95.View ArticlePubMedGoogle Scholar
- Li LH, Ho SF, Chen CH, Wei CY, Wong WC, Li LY, et al. Long contiguous stretches of homozygosity in the human genome. Hum Mutat. 2006;27(11):1115–21.View ArticlePubMedGoogle Scholar
- Sabeti PC, Varilly P, Fry B, Lohmueller J, Hostetter E, Cotsapas C, et al. Genome-wide detection and characterization of positive selection in human populations. Nature. 2007;449(7164):913–8.View ArticlePubMedPubMed CentralGoogle Scholar
- Cavenee WK, Dryja TP, Phillips RA, Benedict WF, Godbout R, Gallie BL, et al. Expression of recessive alleles by chromosomal mechanisms in retinoblastoma. Nature. 1983;305(5937):779–84.View ArticlePubMedGoogle Scholar
- Ku CS, Naidoo N, Teo SM, Pawitan Y. Regions of homozygosity and their impact on complex diseases and traits. Hum Genet. 2011;129(1):1–15.View ArticlePubMedGoogle Scholar
- Yang HC, Chang LC, Huggins RM, Chen CH, Mullighan CG. LOHAS: loss-of-heterozygosity analysis suite. Genet Epidemiol. 2011;35(4):247–60.PubMedGoogle Scholar
- Huggins R, Li LH, Lin YC, Yu AL, Yang HC. Nonparametric estimation of LOH using Affymetrix SNP genotyping arrays for unpaired samples. J Hum Genet. 2008;53(11–12):983–90.View ArticlePubMedGoogle Scholar
- Yang HC, Li HW. Analysis of homozygosity disequilibrium using whole-genome sequencing data. BMC Proc. 2014;8 Suppl 1:S15.View ArticlePubMedPubMed CentralGoogle Scholar
- Cui JS, Hopper JL, Harrap SB. Antihypertensive treatments obscure familial contributions to blood pressure variation. Hypertension. 2003;41(2):207–10.View ArticlePubMedGoogle Scholar
- Yasuda SU, Zhang L, Huang SM. The role of ethnicity in variability in response to drugs: focus on clinical pharmacology studies. Clin Pharmacol Ther. 2008;84(3):417–23.View ArticlePubMedGoogle Scholar