A genome-wide linkage study of GAW15 gene expression data
© Kan et al; licensee BioMed Central Ltd. 2007
Published: 18 December 2007
Recently, gene expression levels have been shown to demonstrate familial aggregation, suggesting a direct role of heritable DNA variation. We studied the gene expression levels in lymphoblastoid cells of the Centre d'Etude du Polymorphisme Humain Utah families made available to Genetic Analysis Workshop 15 (GAW15), using genome-wide linkage analyses.
Heritability was estimated for each expression phenotype. Genome-wide linkage analysis was then performed for the expression levels of all the genes using the 2819 SNPs.
Heritability exceeded 0.21 for 50% of the expression phenotypes. Genome-wide linkage analysis showed that 19 of them reached significance after correcting for multiple comparisons, only 4 of which had been reported previously. We did not identify any hot spots of transcriptional regulation when requiring a LOD score > 5.3 as evidence of significant linkage.
Our analysis suggests that inconsistent results in comparison to the previous report may be due to the different approaches, phenotype transformation, and different pedigree data used in the analyses.
Genetic diseases are the ultimate manifestation of pathological genetic variation, although under some circumstances they may also reflect the influence of environmental factors. Gene expression at the transcript level (i.e., the "gene expression phenotype") is considered an intermediate stage between DNA sequence variation and complex traits. Recently, Cheung et al. studied variation in human gene expression across the genome by comparing variation among unrelated individuals, among siblings within families, and between monozygotic twins. They found significant evidence for familial aggregation of gene expression phenotypes, suggesting a contribution from germ-line genetic variation. The same group also performed genome-wide linkage analysis for expression levels of 3554 genes in 14 large Centre d'Etude du Polymorphisme Humain (CEPH) Utah families by genotyping 2756 autosomal single-nucleotide polymorphisms (SNPs). They identified significant linkage evidence for a large proportion of the expression phenotypes, further supporting a role for DNA sequence variation in these phenotypes. Furthermore, they identified regions, designated hot spots of transcriptional regulation, with significant linkage to several expression phenotypes. We studied the same expression data, made available to Genetic Analysis Workshop 15 (GAW15), using the variance-components method implemented in Merlin, and compared our results with those of the original report, which were obtained with SIBPAL in S.A.G.E. The rationale for this comparison is that the variance-components approach may be more powerful than SIBPAL when a phenotype is normally distributed, whereas SIBPAL is robust to departures from normality and the variance-components approach is not.
The human gene expression data in lymphoblastoid cells came from 14 three-generation CEPH Utah families. The expression levels of 3554 of the 8500 genes tested were available for GAW15. In addition, 2819 autosomal SNPs were genotyped and provided by GAW15. The linkage map of the SNPs was obtained by interpolation from the deCode map (Kong et al.). We analyzed 3354 expression phenotypes after excluding SNPs on the X chromosome and those whose gene locations we could not determine.
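The map interpolation mentioned above can be illustrated with a minimal sketch. It assumes simple linear interpolation between flanking reference markers; the function name `interpolate_cm`, the clamping at the map ends, and the marker positions are all hypothetical and are not taken from the GAW15 data or the deCode map.

```python
from bisect import bisect_right

def interpolate_cm(pos_bp, ref_bp, ref_cm):
    """Linearly interpolate a genetic position (cM) for a physical position (bp).

    ref_bp and ref_cm are parallel lists for reference-map markers, sorted by
    strictly increasing bp. Positions outside the reference range are clamped
    to the nearest reference marker (an assumption of this sketch).
    """
    if pos_bp <= ref_bp[0]:
        return ref_cm[0]
    if pos_bp >= ref_bp[-1]:
        return ref_cm[-1]
    hi = bisect_right(ref_bp, pos_bp)  # first reference marker beyond pos_bp
    lo = hi - 1                        # last reference marker at or before pos_bp
    frac = (pos_bp - ref_bp[lo]) / (ref_bp[hi] - ref_bp[lo])
    return ref_cm[lo] + frac * (ref_cm[hi] - ref_cm[lo])

# Hypothetical reference markers (bp, cM); not real map values.
ref_bp = [1_000_000, 2_000_000, 4_000_000]
ref_cm = [0.0, 1.5, 3.5]
snp_cm = interpolate_cm(3_000_000, ref_bp, ref_cm)
```

In this toy case the SNP lies halfway between the second and third reference markers, so it is assigned the midpoint of their genetic positions.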
We used the software SOLAR to calculate heritability for each expression phenotype under the assumption of a polygenic model. Genome-wide linkage analysis for each expression phenotype was then performed using the multipoint variance-components method as implemented in the software package Merlin. The variance-components method decomposes the total variance into the additive effect of a quantitative trait locus (QTL), polygenic effects, and random environmental effects. The likelihood ratio test was applied to test the null hypothesis of no additive genetic variance due to the QTL. We also performed linkage analysis using SIBPAL with the w4 option for some phenotypes for comparison. Because evaluating corrections for multiple testing was not our goal, and in the spirit of a GAW analysis, we present only point-wise test results here despite the large number of tests performed.
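The quantities described above can be sketched in a few lines. This is an illustration under stated assumptions, not the SOLAR or Merlin implementation: heritability is taken as the polygenic variance divided by the total variance, the LOD score as the log10 likelihood ratio of the QTL model against the null model, and the point-wise p-value assumes that the likelihood ratio statistic follows a 50:50 mixture of a point mass at zero and a chi-square distribution with 1 degree of freedom. All function names and input values are hypothetical.

```python
import math

def heritability(sigma2_g, sigma2_e):
    """Heritability under a polygenic model: polygenic variance / total variance."""
    return sigma2_g / (sigma2_g + sigma2_e)

def lod_from_loglik(loglik_qtl, loglik_null):
    """LOD score: log10 likelihood ratio of the QTL model versus the null model.

    Takes natural-log likelihoods; negative differences are truncated at zero.
    """
    return max(loglik_qtl - loglik_null, 0.0) / math.log(10)

def pointwise_p(lod):
    """Point-wise p-value for a LOD score testing a single variance component.

    Assumes (in this sketch) that 2*ln(10)*LOD follows a 50:50 mixture of a
    point mass at zero and a chi-square distribution with 1 df.
    """
    if lod <= 0:
        return 1.0
    # For chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)), with x = 2*ln(10)*lod
    return 0.5 * math.erfc(math.sqrt(math.log(10) * lod))

# Hypothetical values for illustration only.
h2 = heritability(0.21, 0.79)
lod = lod_from_loglik(-100.0, -100.0 - 2.0 * math.log(10))
p_at_threshold = pointwise_p(5.3)
```

Under these assumptions a LOD score of 3 corresponds to a point-wise p-value of roughly 10⁻⁴, which is the conventional correspondence used in linkage analysis; the same function can be applied to a threshold such as LOD 5.3.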
Table: 19 gene expression phenotypes with genome-wide significant linkage evidence after correcting for multiple tests
Column headers: Merlin VC LOD | Linkage peak position
Surviving entries (gene names and column assignment not recoverable):
2.9 × 10⁻¹³
7.4 × 10⁻¹³
1.1 × 10⁻¹⁶
4.4 × 10⁻¹⁴
4.4 × 10⁻¹⁶
3.0 × 10⁻¹¹
4.4 × 10⁻¹²
5.0 × 10⁻⁹
5.2 × 10⁻⁵
2.7 × 10⁻⁵
5.0 × 10⁻⁶
3.2 × 10⁻⁵
4.4 × 10⁻¹¹
2.1 × 10⁻¹⁰
7.4 × 10⁻¹⁰
7.0 × 10⁻⁹
2.8 × 10⁻⁸
4.8 × 10⁻⁸
3.1 × 10⁻¹⁵
1.8 × 10⁻¹⁴
2.1 × 10⁻⁵
9.2 × 10⁻⁷
5.2 × 10⁻¹¹
4.9 × 10⁻¹⁴
1.1 × 10⁻¹⁰
6.7 × 10⁻¹⁶
7.8 × 10⁻⁷
5.0 × 10⁻⁵
3.7 × 10⁻⁶
1.5 × 10⁻⁵
Table: Hotspots of transcriptional regulation
Column headers: No. of hits | Window region (cM)
Gene expression phenotypes offer important insight into naturally occurring variation and might represent intermediate phenotypes between some genetic diseases and DNA variation. The genetic contribution to expression phenotypes has been studied in species from yeast to human [1, 8, 9]. Linkage evidence for a large proportion of the human expression phenotypes was detected in the CEPH Utah families by Morley et al., who also identified many hot spots of transcriptional regulation. Our heritability analysis of this data set likewise suggested that genetics has a modest influence on gene expression phenotypes. Overall, therefore, our results are consistent with the report by Morley et al. However, the two reports also differ. Among the 13 expression phenotypes with the strongest linkage evidence reported by Morley et al., only four are present in our analysis.
Further analyses suggested several factors that might contribute to the inconsistencies, as summarized below. 1) We used different analysis approaches. In our multipoint genome-wide linkage analysis we used the variance-components approach implemented in Merlin, whereas Morley et al. applied SIBPAL, which is robust to the normality assumption. Even when applying both methods to exactly the same data for the 19 gene expression phenotypes, we obtained different conclusions regarding linkage for 10 of them. One potential reason is the difference in power between the two approaches when a trait satisfies the normality assumption. 2) The phenotype transformation may also contribute. For example, after log transformation, the linkage evidence for expression of UGT2B17 and PYGB was no longer statistically significant. 3) Morley et al. did not include the data from grandparents in their analysis, whereas we used all the family data. This may also play a role, although confirmation is required from an analysis that does not use the grandparental data.
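Why a transformation can change the outcome of a normality-sensitive method can be seen in a small sketch. It only shows how a log transformation alters the shape of a right-skewed trait distribution. It does not reproduce the linkage analysis, and the intensities and the `skewness` helper are hypothetical.

```python
import math

def skewness(values):
    """Skewness as the third standardized moment (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical right-skewed expression intensities (not GAW15 data).
raw = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
logged = [math.log2(v) for v in raw]

raw_skew = skewness(raw)     # clearly positive: long right tail
log_skew = skewness(logged)  # symmetric after the log transformation
```

A trait that departs strongly from normality on the raw scale may look far closer to symmetric after transformation, which is one way linkage evidence from a variance-components analysis can differ between scales.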
We did not observe the hot spot of transcriptional regulation on chromosome 14 reported by Morley et al. This inconsistency may also be explained by the reasons mentioned above. In addition, Bastone et al. reported that the evidence for hot spots of transcriptional regulation on chromosome 14 reported by Morley et al. is driven by a single family, indicating that genetic heterogeneity exists in gene expression phenotypes. Wang et al. performed simulation and permutation analyses that included and excluded the highly correlated phenotypes, and their results suggest that the hot spots might be artifacts. Further independent studies, perhaps with larger sample sizes, may be required to identify the true biological patterns.
This work was supported by grants from the National Human Genome Research Institute and the National Heart, Lung and Blood Institute (HG003054, HL074166). We thank Dr. Wijsman and two reviewers for helpful comments, which have resulted in a greatly improved manuscript.
This article has been published as part of BMC Proceedings Volume 1 Supplement 1, 2007: Genetic Analysis Workshop 15: Gene Expression Analysis and Approaches to Detecting Multiple Functional Loci. The full contents of the supplement are available online at http://www.biomedcentral.com/1753-6561/1?issue=S1.
- Cheung VG, Conlin LK, Weber TM, Arcaro M, Jen KY, Morley M, Spielman R: Natural variation in human gene expression assessed in lymphoblastoid cells. Nat Genet. 2003, 33: 422-425. 10.1038/ng1094.
- Morley M, Molony CM, Weber TM, Devlin JL, Ewens KG, Spielman RS, Cheung VG: Genetic analysis of genome-wide variation in human gene expression. Nature. 2004, 430: 743-747. 10.1038/nature02797.
- Abecasis GR, Cherny SS, Cookson WO, Cardon LR: Merlin – rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet. 2002, 30: 97-101. 10.1038/ng786.
- S.A.G.E. Statistical Analysis for Genetic Epidemiology, release 5.3. [http://genepi.cwru.edu/]
- Kong A, Gudbjartsson DF, Sainz J, Jonsdottir GM, Gudjonsson SA, Richardsson B, Sigurdardottir S, Barnard J, Hallbeck B, Masson G, Shlien A, Palsson ST, Frigge ML, Thorgeirsson TE, Gulcher JR, Stefansson K: A high-resolution recombination map of the human genome. Nat Genet. 2002, 31: 241-247.
- Almasy L, Blangero J: Multipoint quantitative-trait linkage analysis in general pedigrees. Am J Hum Genet. 1998, 62: 1198-1211. 10.1086/301844.
- Lander E, Kruglyak L: Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat Genet. 1995, 11: 241-247. 10.1038/ng1195-241.
- Elston RC, Buxbaum S, Jacobs KB, Olson JM: Haseman and Elston revisited. Genet Epidemiol. 2000, 19: 1-17. 10.1002/1098-2272(200007)19:1<1::AID-GEPI1>3.0.CO;2-E.
- Cheung VG, Jen KY, Weber T, Morley M, Devlin JL, Ewens KG, Spielman RS: Genetics of quantitative variation in human gene expression. Cold Spring Harb Symp Quant Biol. 2003, 68: 403-407.
- Bastone LA, Putt ME, Ten Have TR, Cheung VG, Spielman RS: Genetic heterogeneity and trans regulators of gene expression. BMC Proc. 2007, 1 (Suppl 1): S80.
- Wang S, Zheng T, Wang Y: Transcription activity hot spot, is it real or an artifact?. BMC Proc. 2007, 1 (Suppl 1): S94.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.