The role of heritability in mapping expression quantitative trait loci
BMC Proceedings volume 1, Article number: S86 (2007)
Gene expression, as a heritable complex trait, has recently been used in many genome-wide linkage studies. The estimated overall heritability of each trait may be considered as evidence of a genetic contribution to the total phenotypic variation, which implies the possibility of mapping genome regions responsible for the gene expression variation via linkage analysis. However, heritability has been found to be an inconsistent predictor of significant linkage signals. To investigate this issue in human studies, we performed genome-wide linkage analysis on the 3554 gene expression traits of 194 Centre d'Etude du Polymorphisme Humain individuals provided by Genetic Analysis Workshop 15. Out of the 422 expression traits with significant linkage signals identified (LOD > 5.3), 89 traits have low estimated heritability (h2 < 10%), among which 23 traits have an estimated heritability equal to 0. Linkage analysis on individual pedigrees shows that the overall LOD scores may result from a few pedigrees with strong linkage signals. Screening gene expressions before linkage analysis using a relatively low heritability cut-off (excluding traits with h2 < 20%) may result in a loss of significant linkage signals, especially for trans-acting expression quantitative trait loci (49%).
Gene expression levels have been studied as heritable complex traits in genetic linkage studies. To dissect such traits, a fundamental question is what proportion of the variation in gene expression can be attributed to genetic factors. Broad-sense heritability, the proportion of phenotypic variation explained by genetic factors, can be estimated to address this question, especially for selecting traits of interest before mapping. Evidence for a significant heritable component of expression traits has been found in heritability studies on humans [3, 4], yeast and mice. Various aspects of heritability in gene expression were systematically reviewed by Stamatoyannopoulos, and about 10% to 50% of the between-individual variation in transcript levels was found to be heritable. Most recently, Petretto et al. studied the influence of heritability on the detection of cis- and trans-acting expression quantitative trait loci (eQTLs) using the BHX/HXB panel of rat recombinant inbred strains in a tissue-specific context and concluded that heritability alone is not a reliable predictor of whether an eQTL will be detected for an expression trait. However, compared to rat crosses of inbred lines, human samples may have more experimental variability due to cell line handling and more extreme allele frequencies at the loci of interest. Thus, whether a similar conclusion can be drawn for human samples deserves further investigation. Using data for the 14 Centre d'Etude du Polymorphisme Humain (CEPH) Utah families provided by Genetic Analysis Workshop 15 (GAW15), we explored the relationship between the heritability of a gene's expression level and the power to identify regions regulating its expression. We also investigated the contributions from each individual family.
We used 3554 expression profile traits together with marker data on 2819 autosomal single-nucleotide polymorphisms (SNPs) from the 14 CEPH families, consisting of 194 individuals, provided by GAW15. Sex-specific Rutgers genetic maps were provided by Sung et al.
Heritability estimates and genome-wide linkage analysis
To study the relationship between the heritability of a gene expression trait and the detection of eQTLs, we estimated narrow-sense trait heritability, assuming no dominance effect, using the standard variance-component model. Genome-wide linkage analysis was conducted using MERLIN-REGRESS and the MERLIN variance-components method without covariate adjustment [10, 11]. The estimated heritability of each expression trait, together with the sample mean and variance from all 14 pedigrees, was used as the estimate of the population trait distribution parameters in the linkage analysis with MERLIN-REGRESS. The error-checking algorithm implemented in MERLIN was applied before linkage analysis was conducted.
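For reference, a standard variance-component model of the kind mentioned above can be written as follows. This is a sketch under a purely additive polygenic assumption (no dominance and no shared-environment term); the symbols \sigma_a^2, \sigma_e^2 and \phi_{ij} are notation introduced here for illustration and do not appear in the original text.

```latex
% y_i: expression trait value of individual i
% \sigma_a^2: additive genetic variance; \sigma_e^2: residual (environmental) variance
% \phi_{ij}: kinship coefficient between relatives i and j in the same pedigree
\begin{aligned}
\operatorname{Var}(y_i) &= \sigma_a^2 + \sigma_e^2, \\
\operatorname{Cov}(y_i, y_j) &= 2\,\phi_{ij}\,\sigma_a^2 \quad (i \neq j), \\
h^2 &= \frac{\sigma_a^2}{\sigma_a^2 + \sigma_e^2}.
\end{aligned}
```

Under this parameterization, an estimate of h2 = 0 corresponds to the additive variance component being estimated at its lower bound of zero.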
To study whether heritability can be a reliable predictor of linkage signals, we selected gene expression traits that show a linkage signal but have low heritability estimates. We applied permutation procedures to examine whether the observed linkage signals are false signals. To adjust for multiple comparisons, the observed genome-wide maximum LOD score for a gene expression trait was compared to the genome-wide maximum LOD score for the same trait from a permuted sample. The p-value is the proportion of the 1000 permutations in which the genome-wide maximum LOD score from the permuted data is greater than or equal to the observed genome-wide maximum LOD score. To preserve pedigree structures, permutations were performed within each pedigree.
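The permutation procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that trait values are shuffled among members of the same pedigree, and `genome_scan_max_lod` is a hypothetical callable standing in for a full genome scan (for example, a wrapper around a linkage program) that returns the genome-wide maximum LOD score.

```python
import random

def permute_within_pedigrees(trait_by_pedigree, rng):
    """Shuffle trait values among members of the same pedigree only,
    so that pedigree membership is preserved."""
    permuted = {}
    for ped_id, values in trait_by_pedigree.items():
        shuffled = list(values)
        rng.shuffle(shuffled)
        permuted[ped_id] = shuffled
    return permuted

def empirical_p_value(observed_max_lod, permuted_max_lods):
    """Proportion of permutations whose genome-wide maximum LOD score
    is greater than or equal to the observed maximum."""
    count = sum(1 for lod in permuted_max_lods if lod >= observed_max_lod)
    return count / len(permuted_max_lods)

def permutation_test(trait_by_pedigree, genome_scan_max_lod,
                     n_perm=1000, seed=0):
    """genome_scan_max_lod is a hypothetical user-supplied function that
    maps {pedigree_id: [trait values]} to a genome-wide maximum LOD."""
    rng = random.Random(seed)
    observed = genome_scan_max_lod(trait_by_pedigree)
    permuted = [
        genome_scan_max_lod(permute_within_pedigrees(trait_by_pedigree, rng))
        for _ in range(n_perm)
    ]
    return empirical_p_value(observed, permuted)
```

Comparing genome-wide maxima, rather than LOD scores at a single position, is what makes the resulting p-value account for testing many positions across the genome.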
Genome-wide linkage analysis
LOD > 5.3 was used as the significance threshold (corresponding to a point-wise p-value < 3.9 × 10^-7, and a genome-wide p-value = 0.001) as in . We identified 422 gene expressions with significant linkage signals, 25 of which have h2 > 50%. A positive correlation (0.68) was observed between the heritability estimates and the LOD scores for these 25 gene expressions. However, this correlation dropped to 0.12 when all 3554 gene expressions were considered. Moreover, among the 880 gene expressions that have h2 < 10%, 89 have significant linkage signals, including 23 traits with an estimated heritability of 0 (Fig. 1).
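The correspondence between the LOD threshold and the point-wise p-value can be checked with a short calculation. The sketch below assumes the usual asymptotic approximation for a one-parameter linkage test, in which 2 ln(10) × LOD follows a 50:50 mixture of a point mass at zero and a chi-square distribution with one degree of freedom; the text does not state how the quoted value was derived, so this is an assumption.

```python
import math

def lod_to_pointwise_p(lod):
    """Approximate point-wise p-value for a LOD score, assuming
    2*ln(10)*LOD is a 50:50 mixture of 0 and chi-square(1 df).
    That mixture tail equals the one-sided standard normal tail
    at z = sqrt(2*ln(10)*LOD)."""
    chi2 = 2.0 * math.log(10.0) * lod
    z = math.sqrt(chi2)
    # One-sided normal tail: 1 - Phi(z) = 0.5 * erfc(z / sqrt(2))
    return 0.5 * math.erfc(z / math.sqrt(2.0))

if __name__ == "__main__":
    print(f"LOD 5.3 -> point-wise p = {lod_to_pointwise_p(5.3):.2e}")
```

Under this approximation, a LOD of 5.3 gives a p-value close to the 3.9 × 10^-7 quoted above.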
Heritability as predictor for linkage analysis
We broadly defined eQTLs located on the same chromosome as the mapped gene as cis-acting eQTLs, and all others as trans-acting eQTLs. Accordingly, the 422 gene expressions with linkage signals were grouped into a cis-acting group of 49 expressions and a trans-acting group of 373 expressions. Ten percent of the cis-acting gene expressions and 23% of the trans-acting gene expressions had an estimated heritability of less than 10%. For gene expressions in the cis-acting group, the estimated heritability has a mean of 0.35, which is higher than the mean of 0.22 in the trans-acting group. If we screened gene expressions using h2 > 20% as the first step before linkage analysis, 183 out of the 373 (49%) gene expressions with trans-acting eQTLs and 13 out of the 49 (27%) gene expressions with cis-acting eQTLs would be excluded at the screening stage (Fig. 2). Permutation analysis on the gene expressions with the bottom five heritability estimates (all with h2 = 0) among the 422 gene expressions with linkage signals indicated that the observed linkage signals are not false positives at the 0.05 significance level (Table 1).
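The broad cis/trans definition and the effect of a heritability screen can be illustrated with a small sketch. The function names and the toy heritability values are hypothetical; only the counts in the final lines (183 of 373 and 13 of 49) come from the text.

```python
def classify_eqtl(gene_chromosome, linkage_chromosome):
    """Broad definition used here: an eQTL on the same chromosome as
    the gene is cis-acting; any other location is trans-acting."""
    return "cis" if gene_chromosome == linkage_chromosome else "trans"

def fraction_excluded(h2_estimates, cutoff=0.20):
    """Fraction of traits dropped by a screen that keeps only traits
    with estimated heritability strictly above the cutoff."""
    excluded = sum(1 for h2 in h2_estimates if h2 <= cutoff)
    return excluded / len(h2_estimates)

if __name__ == "__main__":
    # Toy heritability estimates (hypothetical values).
    print(fraction_excluded([0.0, 0.15, 0.35, 0.60]))
    # Counts reported in the text for the h2 > 20% screen.
    print(f"trans-acting lost: {183 / 373:.0%}")
    print(f"cis-acting lost:   {13 / 49:.0%}")
```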
Between-pedigree and within-pedigree variation
Based on the heritability estimation, we separated out 89 gene expressions with significant linkage signals but that have h2 < 10%. We further assessed the contribution of each pedigree to the LOD score of the selected 89 gene expressions through the regression-based linkage analysis, where population trait mean, variance, and heritability estimations obtained from all 14 pedigrees were used. We found that 72 out of the 89 selected gene expressions had at least one pedigree contributing a LOD score greater than one (Table 2). This suggests that some pedigrees contribute to the overall significance and explains why high LOD scores were observed for these gene expressions with low heritability estimates.
We also separated out the 25 gene expressions with significant linkage signals that have h2 > 50%, and further studied the heritability difference between the two groups. We calculated the ratio of between-pedigree variation to the total variation, defined as the sum of between-pedigree variation and within-pedigree variation, and compared it to the heritability estimate (Fig. 3). A mean ratio of 0.09 was found for the group with h2 < 10% (89 gene expressions), and a mean ratio of 0.30 for the group with h2 > 50% (25 gene expressions). Across all 3554 gene expressions, the mean ratio is 0.09 for the 880 gene expressions with h2 < 10%, and 0.25 for the 108 gene expressions with h2 > 50%. The heritability difference between these two groups suggests that the within-pedigree variation (e.g., environmental variance) may affect the heritability estimate more than the between-pedigree variation. Therefore, heritability estimates made without appropriate consideration of potential environmental contributors may be biased estimates of the genetic contribution to a complex trait.
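One way to compute the ratio of between-pedigree variation to total variation is a one-way sum-of-squares decomposition, sketched below. The text does not specify whether sums of squares or variance-component estimates were used, so this choice and the function name are assumptions made for illustration.

```python
def between_pedigree_ratio(trait_by_pedigree):
    """Ratio of between-pedigree to total variation, where total is the
    sum of between- and within-pedigree sums of squares."""
    all_values = [v for values in trait_by_pedigree.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)

    between = 0.0
    within = 0.0
    for values in trait_by_pedigree.values():
        ped_mean = sum(values) / len(values)
        # Spread of pedigree means around the grand mean.
        between += len(values) * (ped_mean - grand_mean) ** 2
        # Spread of individuals around their own pedigree mean.
        within += sum((v - ped_mean) ** 2 for v in values)

    total = between + within
    return between / total if total > 0 else 0.0

if __name__ == "__main__":
    # Toy data (hypothetical): two pedigrees of two individuals each.
    print(between_pedigree_ratio({"ped1": [0.0, 2.0], "ped2": [2.0, 4.0]}))
```

A ratio near 0 means almost all variation lies among members of the same pedigree, while a ratio near 1 means the pedigrees differ from one another but their members are nearly identical.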
In this study, we have investigated the relationship between the estimated heritability of gene expression traits and the corresponding linkage signals using the CEPH population data provided by GAW15 Problem 1. This was motivated by a recent study using heritability as a filter for trait selection before conducting linkage analysis in a rat cross. The rationale for this filtering is that the estimated heritability may be a good indicator of the statistical power to detect significant linkage regions for complex traits. Highly heritable traits may have a larger genetic contribution to their total phenotypic variance, hence significant linkage regions tend to be detected more easily. However, we found 89 expression traits with significant linkage signals that actually have small estimated heritability (h2 < 10%). Among these 89 expression traits, 23 have an estimated heritability of 0. Our study suggests that significant genomic regions can be identified even for genes with low heritability estimates, indicating that heritability may not be a reliable predictor of linkage mapping results. A significant portion of eQTLs may be filtered out by a relatively low heritability cut-off, especially trans-acting eQTLs (49%). Based on our analysis, we also found that, in general, MERLIN-REGRESS gave a higher genome-wide maximum LOD score than the MERLIN variance-component method (Table 1), although a positive correlation (0.60) exists between the results from the two methods. However, for the subset of genes chosen in Tables 1 and 2, we did not observe any obvious influential factors, e.g., outliers, that could cause the difference in results.
The results from linkage analysis on individual pedigrees indicate that a significant LOD score may result from a few pedigrees with strong linkage signals, even for gene expression traits with a low overall heritability estimate. In such cases, the proportion of between-pedigree variation in the total variation is small, but the linkage analysis is not affected by between-pedigree variation.
Highly heritable genes have a greater proportion of phenotypic variation explained by genetic effects and tend to have genomic regions showing significant linkage scores. However, some genes with low heritability also show high linkage scores, indicating that heritability is not a consistent predictor of eQTL mapping results. Caution is warranted when using estimated heritability as a filter to select genes for genome-wide linkage analysis.
1. Gibson G, Weir B: The quantitative genetics of transcription. Trends Genet. 2005, 21: 616-623. 10.1016/j.tig.2005.08.010.
2. Chesler EJ, Lu L, Shou S, Qu Y, Gu J, Wang J, Hsu HC, Mountz JD, Baldwin NE, Langston MA, Threadgill DW, Manly KF, Williams RW: Complex trait analysis of gene expression uncovers polygenic and pleiotropic networks that modulate nervous system function. Nat Genet. 2005, 37: 233-242. 10.1038/ng1518.
3. Schadt EE, Monks SA, Drake TA, Lusis AJ, Che N, Colinayo V, Ruff TG, Milligan SB, Lamb JR, Cavet G, Linsley PS, Mao M, Stoughton RB, Friend SH: Genetics of gene expression surveyed in maize, mouse and man. Nature. 2003, 422: 297-302. 10.1038/nature01434.
4. Cheung VG, Conlin LK, Weber TM, Arcaro M, Jen KY, Morley M, Spielman RS: Natural variation in human gene expression assessed in lymphoblastoid cells. Nat Genet. 2003, 33: 422-425. 10.1038/ng1094.
5. Brem RB, Yvert G, Clinton R, Kruglyak L: Genetic dissection of transcriptional regulation in budding yeast. Science. 2002, 296: 752-755. 10.1126/science.1069516.
6. Stamatoyannopoulos JA: The genomics of gene expression. Genomics. 2004, 84: 449-457. 10.1016/j.ygeno.2004.05.002.
7. Petretto E, Mangion J, Dickens NJ, Cook SA, Kumaran MK, Lu H, Fischer J, Maatz H, Kren V, Pravenec M, Hubner N, Aitman TJ: Heritability and tissue specificity of expression quantitative trait loci. PLoS Genet. 2006, 2: e172. 10.1371/journal.pgen.0020172.
8. Morley M, Molony CM, Weber TM, Devlin JL, Ewens KG, Spielman RS, Cheung VG: Genetic analysis of genome-wide variation in human gene expression. Nature. 2004, 430: 743-747. 10.1038/nature02797.
9. Sung YJ, Di Y, Fu AQ, Rothstein JH, Sieh W, Tong L, Thompson EA, Wijsman EM: Comparison of multipoint linkage analyses for quantitative traits in the CEPH data: parametric LOD scores, variance components LOD scores, and Bayes factors. BMC Proc. 2007, 1 (Suppl 1): S93.
10. Abecasis GR, Cherny SS, Cookson WO, Cardon LR: Merlin – rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet. 2002, 30: 97-101. 10.1038/ng786.
11. Sham PC, Purcell S, Cherny SS, Abecasis GR: Powerful regression-based quantitative-trait linkage analysis of general pedigrees. Am J Hum Genet. 2002, 71: 238-253. 10.1086/341560.
12. Lynch M, Walsh B: Components of environmental variation. Genetics and Analysis of Quantitative Traits. 1998, Sunderland, MA: Sinauer Associates, Inc, 107-129.
This research was supported in part by NIH grant GM 59507. DB was supported by the NIH Institutional Training Grant for Informatics Research.
This article has been published as part of BMC Proceedings Volume 1 Supplement 1, 2007: Genetic Analysis Workshop 15: Gene Expression Analysis and Approaches to Detecting Multiple Functional Loci. The full contents of the supplement are available online at http://www.biomedcentral.com/1753-6561/1?issue=S1.
The author(s) declare that they have no competing interests.