Volume 3 Supplement 7

Genetic Analysis Workshop 16

Defining genetic determinants of the Metabolic Syndrome in the Framingham Heart Study using association and structural equation modeling methods

Abstract

The Metabolic Syndrome (MetSyn), which is a clustering of traits including insulin resistance, obesity, hypertension and dyslipidemia, is estimated to have a substantial genetic component, yet few specific genetic targets have been identified. Factor analysis, a sub-type of structural equation modeling (SEM), has been used to model the complex relationships in MetSyn. Therefore, we aimed to define the genetic determinants of MetSyn in the Framingham Heart Study (Offspring Cohort, Exam 7) using the Affymetrix 50 k Human Gene Panel and three different approaches: 1) an association-based "one-SNP-at-a-time" analysis with MetSyn as a binary trait using the World Health Organization criteria; 2) an association-based "one-SNP-at-a-time" analysis with MetSyn as a continuous trait using second-order factor scores derived from four first-order factors; and, 3) a multivariate SEM analysis with MetSyn as a continuous, second-order factor modeled with multiple putative genes, which were represented by latent constructs defined using multiple SNPs in each gene. Results were similar between approaches in that CSMD1 SNPs were associated with MetSyn in Approaches 1 and 2; however, the effects of CSMD1 diminished in Approach 3 when modeled simultaneously with six other genes, most notably CETP and STARD13, which were strongly associated with the Lipids and MetSyn factors, respectively. We conclude that modeling multiple genes as latent constructs on first-order trait factors, most proximal to the gene's function with limited paths directly from genes to the second-order MetSyn factor, using SEM is the most viable approach toward understanding overall gene variation effects in the presence of multiple putative SNPs.

Background

The Metabolic Syndrome (MetSyn) is a clustering of metabolic disturbances that increases the risk of type 2 diabetes and cardiovascular disease [1], and may contribute to the pathogenesis of other complex diseases, including cancer [2]. MetSyn is estimated to affect over 47 million adult Americans [3, 4] and is becoming increasingly prevalent worldwide [5, 6]. Although the prevalence of MetSyn has been shown to increase with age, recent studies have shown a rise among younger people in the U.S., particularly in women 20 to 39 years of age [1]. Interestingly, this rise mirrors the increasing rates of obesity in women of these ages.

Although it is well established that MetSyn involves the co-occurrence of multiple metabolic traits, there are differences in the formal definitions set forth by the World Health Organization (WHO), the National Cholesterol Education Program Third Adult Treatment Panel (NCEP-ATP III), the American Heart Association/National Heart, Lung and Blood Institute (AHA/NHLBI) and the International Diabetes Federation (IDF), predominantly in defining the most relevant elements and their biological cut-points, which has contributed to confusion in the literature [7]. Nevertheless, all of these definitions include criteria on four common traits: 1) insulin resistance, 2) obesity, 3) hypertension, and 4) dyslipidemia. Factor analysis, a statistical method under the umbrella of structural equation modeling (SEM), has been used, albeit sparingly, to help define the critical elements and structure of the syndrome. Studies conducted in adults using 8 to 10 metabolic measures (fasting insulin, fasting glucose, post-challenge insulin, post-challenge glucose, body mass index (BMI), waist circumference or waist-to-hip ratio (WHR), high density lipoprotein-cholesterol (HDL), triglycerides (TG), systolic blood pressure (SBP), diastolic blood pressure (DBP)) have shown that the MetSyn is best described, statistically, as a unifying, second-order factor defined by four first-order factors (Insulin Resistance, Obesity, Hypertension, Lipids) [8–10].

MetSyn is hypothesized to have a fairly large genetic component, with heritability estimates ranging from 6.3% to 50% [11], yet few potential genetic targets have been identified. Thus, we aimed to define the genetic determinants of MetSyn in the Framingham Heart Study (Offspring Cohort, Exam 7) using the Affymetrix 50 k Human Gene Panel data and three different approaches: 1) an association-based "one-single-nucleotide polymorphism (SNP)-at-a-time" analysis with MetSyn defined as a binary trait using the WHO criteria [7]; 2) an association-based "one-SNP-at-a-time" analysis with MetSyn defined as a continuous trait using second-order factor scores derived from insulin resistance, obesity, hypertension, and dyslipidemia factors; and 3) a multivariate SEM analysis with MetSyn defined as a second-order continuous factor trait modeled simultaneously with putative genes (identified in Approaches 1 and 2), which we represented as latent constructs defined by multiple SNPs within each gene.

Methods

Data cleaning and preparation: phenotype and genotype variables

First, we examined the distribution of metabolic variables (TG, HDL, SBP, DBP, fasting glucose, BMI) in the Offspring Cohort, Exam 7 using SAS v9.1 (SAS Institute Inc., Cary, NC). Variables that departed from a normal distribution (TG, HDL, fasting glucose), as judged by visual inspection of histograms and quantile-quantile plots and by formal Shapiro-Wilk and Kolmogorov-Smirnov tests, were natural-log transformed. To adjust for potential bias by antihypertensive treatment and more closely reflect pretreatment BP values, we added 10 mm Hg to SBP and 5 mm Hg to DBP, following Cui et al. [12], in subjects who reported taking blood pressure medications. We used the WHO criteria [7] to define MetSyn; however, waist circumference and microalbuminuria measures were not available, and we applied the most recent IDF fasting glucose cut-point value of ≥100 mg/dL [7]. Mendelian inconsistencies were identified in the Affymetrix 50 k Human Gene Panel data using MARKERINFO (S.A.G.E. v5.4.1). If an inconsistency was found, genotypes of all individuals in that family were set to missing. Of the 2760 subjects in the Offspring Cohort (Approach 1), 2544 had complete data on all metabolic measures (Approach 2) and 1512 had complete data on all metabolic measures and putative genotypes (Approach 3).
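The preparation steps described above can be sketched in a few lines of Python. This is only an illustration, not the SAS or S.A.G.E. code used in the study: the record layout, the field names (`tg`, `hdl`, `glucose`, `sbp`, `dbp`, `on_bp_meds`) and both helper functions are hypothetical.

```python
import math

# Hypothetical record layout; field names are illustrative, not from the study files.
def prepare_phenotypes(record):
    """Return a cleaned copy of one subject's metabolic measures."""
    out = dict(record)
    # Natural-log transform the skewed measures named in the text.
    for key in ("tg", "hdl", "glucose"):
        out["ln_" + key] = math.log(record[key])
    # Add 10/5 mm Hg for subjects reporting antihypertensive medication [12].
    if record["on_bp_meds"]:
        out["sbp"] = record["sbp"] + 10
        out["dbp"] = record["dbp"] + 5
    return out

def mask_inconsistent_families(genotypes, family_of, bad_families):
    """Set genotypes to None (missing) for every member of a flagged family."""
    return {
        subject: (None if family_of[subject] in bad_families else genotype)
        for subject, genotype in genotypes.items()
    }

# Made-up example values.
subject = {"tg": 150.0, "hdl": 45.0, "glucose": 98.0,
           "sbp": 128, "dbp": 82, "on_bp_meds": True}
cleaned = prepare_phenotypes(subject)

genotypes = {"s1": "AA", "s2": "AB", "s3": "BB"}
family_of = {"s1": "fam1", "s2": "fam1", "s3": "fam2"}
masked = mask_inconsistent_families(genotypes, family_of, bad_families={"fam1"})
```

Masking the whole family, rather than only the individuals involved in the inconsistency, mirrors the conservative rule stated in the text.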

Statistical methods for association-based analyses

Approaches 1 and 2

We evaluated the potential association between each SNP on the 50 k panel and each phenotype using the following model [13] in ASSOC (S.A.G.E. v5.4.1):

$$h(y_i) = h\left(\alpha + \sum_{j=1}^{n} \gamma_j c_{ji} + \delta z_i\right) + \eta_i$$

where, for any individual $i$ with trait $y_i$, $\alpha$ is the intercept, $c_{ji}$ is any one of $n$ individual-specific covariates, $\eta_i$ is a random effect comprising, in our analyses, the sibling and individual-specific errors, and $z_i$ is a genotype indicator for allele A at a diallelic locus with alleles A and B. Here $h$ is the generalized modulus power transformation [14], under which the regression coefficients, $\gamma_j$ and $\delta$, are estimated as median unbiased on the original scale of measurement. Analyses were adjusted for age, sex and age × sex. p-Values were calculated using likelihood-ratio and Wald tests and compared to ensure consistency; however, we report only the Wald p-values since results were similar in all cases. In Approaches 1 and 2, a gene was considered statistically significant if it had ≥2 SNPs associated with an individual metabolic variable or MetSyn at p < 0.001. Significant genes were then utilized in Approach 3.
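The gene-level selection rule (≥2 SNPs at p < 0.001 for the same phenotype) is easy to express in code. The sketch below is illustrative only. The result rows, gene names and SNP names are made up. The `additive_code` helper assumes the conventional additive coding (AA = 1, AB = 0, BB = -1), which is an assumption here because the coding is not spelled out in the text above.

```python
from collections import defaultdict

P_THRESHOLD = 0.001  # per-SNP threshold stated in the text
MIN_SNPS = 2         # a gene needs at least this many qualifying SNPs

def additive_code(genotype, allele="A"):
    """Assumed additive coding: AA -> 1, AB -> 0, BB -> -1 (illustrative only)."""
    return genotype.count(allele) - 1

def putative_genes(results):
    """results: iterable of (gene, snp, phenotype, wald_p) tuples.

    Returns {gene: {phenotype: [snps]}} for every gene/phenotype pair
    with at least MIN_SNPS SNPs below P_THRESHOLD.
    """
    hits = defaultdict(lambda: defaultdict(list))
    for gene, snp, phenotype, p in results:
        if p < P_THRESHOLD:
            hits[gene][phenotype].append(snp)
    selected = {}
    for gene, by_phenotype in hits.items():
        kept = {ph: snps for ph, snps in by_phenotype.items()
                if len(snps) >= MIN_SNPS}
        if kept:
            selected[gene] = kept
    return selected

# Made-up association results for three hypothetical genes.
rows = [
    ("GENE1", "snp1", "HDL", 2.0e-5),
    ("GENE1", "snp2", "HDL", 8.0e-4),
    ("GENE2", "snp3", "BMI", 4.0e-4),   # only one qualifying SNP
    ("GENE3", "snp4", "SBP", 9.0e-4),
    ("GENE3", "snp5", "TG", 9.0e-4),    # two hits, but on different phenotypes
]
selected = putative_genes(rows)
```

In this toy input only `GENE1` is retained, because the rule counts qualifying SNPs per phenotype rather than per gene overall.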

Statistical methods for factor analysis (FA) and SEM

Approaches 2 and 3

We used previous reports to devise our second-order MetSyn factor model [9, 10]; however, because fasting insulin and waist circumference were not available, the first-order factors Insulin Resistance and Obesity were defined using only fasting glucose and BMI, respectively. Similar to previous models [9, 10], the BP and Lipids first-order factors were defined using SBP and DBP, and TG and HDL measures, respectively. We performed confirmatory factor analysis using a robust maximum likelihood estimator (MLR), which provides test statistics and standard errors robust to non-independence of observations and non-normality (Mplus v5.1; TYPE = COMPLEX), to formally test our second-order MetSyn model and to generate corresponding factor scores. In Approach 2, we examined potential associations between each SNP on the 50 k panel and the factor scores with ASSOC (S.A.G.E. v5.4.1). In Approach 3, we extended the latent gene construct SEM method of Nock et al. [15] using the robust maximum likelihood estimator (MLR; Mplus v5.1) to simultaneously model MetSyn as a second-order factor together with multiple putative genes identified in Approaches 1 and 2. Similar to Nock et al. [15], we used eigenvalues, scree plots, factor patterns, Cronbach's alpha and linkage disequilibrium (LD) plots (Haploview v4.1) to help select the most informative SNPs in devising the latent gene constructs. For putative genes identified in Approaches 1 and 2, we utilized all available SNPs on the 50 k panel, including those SNPs found to be statistically significant in Approaches 1 and 2, unless they provided redundant information and created a linear dependency.

To assess the overall model goodness-of-fit to the data, the χ² test, comparative fit index (CFI), root mean square error of approximation (RMSEA) and standardized root mean square residual (SRMR) were evaluated [16]. The χ² test, which evaluates whether the observed covariance matrix is equal to the model-implied covariance matrix predicted by the parameters, is very sensitive to sample size and model complexity. Thus, other fit indices such as the CFI, RMSEA, and SRMR have been proposed as alternative descriptive measures for evaluating model fit [16]. CFI values (an index relatively insensitive to sample size and model complexity) of ≥0.90 and ≥0.95 indicate acceptable and good fit, respectively [17]. RMSEA values (an index less sensitive to sample size and favoring more parsimonious models) of ≤0.06 represent good fit, while values >0.10 represent unacceptable fit [17]. SRMR values of ≤0.08 and <0.10 represent good and acceptable fit, respectively [16, 17]. All p-values are from two-sided tests, and statistical significance was set at p ≤ 0.05 in Approach 3.
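The cut-offs quoted above can be collected into a small helper, shown here as a minimal sketch. The function names are hypothetical. The `rmsea` function uses the common unscaled point-estimate formula, sqrt(max(χ² − df, 0) / (df · (N − 1))), which is an assumption: robust estimators such as MLR apply a scaling correction that is ignored here. The labels "below cut-off" and "intermediate" are placeholders for ranges that the text does not name.

```python
import math

def rmsea(chi2, df, n):
    """Unscaled RMSEA point estimate from chi-square, degrees of freedom and sample size."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def describe_fit(cfi, rmsea_value, srmr):
    """Label each index using the cut-offs quoted in the text [16, 17]."""
    labels = {}
    if cfi >= 0.95:
        labels["CFI"] = "good"
    elif cfi >= 0.90:
        labels["CFI"] = "acceptable"
    else:
        labels["CFI"] = "below cut-off"
    if rmsea_value <= 0.06:
        labels["RMSEA"] = "good"
    elif rmsea_value > 0.10:
        labels["RMSEA"] = "unacceptable"
    else:
        labels["RMSEA"] = "intermediate"  # the text gives no label for this band
    if srmr <= 0.08:
        labels["SRMR"] = "good"
    elif srmr < 0.10:
        labels["SRMR"] = "acceptable"
    else:
        labels["SRMR"] = "below cut-off"
    return labels

# Illustrative values only, not results from the study.
example_rmsea = rmsea(chi2=100.0, df=20, n=501)
example_labels = describe_fit(cfi=0.93, rmsea_value=example_rmsea, srmr=0.05)
```

Because the χ² statistic grows with sample size while the RMSEA divides by N − 1, a model can fail the χ² test at p < 0.05 and still be labeled as fitting well by the descriptive indices.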

Results

In Approach 1, we evaluated the potential associations between each SNP on the Affymetrix 50 k Human Gene Panel and MetSyn as a binary trait defined using the modified WHO criteria (see Methods) and between each SNP and each individual metabolic measure. We found that several genes had two or more SNPs with significant (p < 0.001) associations, the majority of which are listed in Table 1. Of these genes, the most significant associations were observed with CETP SNPs rs11508026 (p = 7.60 × 10⁻⁷) and rs3764261 (p = 1.34 × 10⁻⁶). Two of the same KIAA0329 SNPs (rs12434098, rs1190547) were associated with TG and HDL but not with MetSyn. Multiple CSMD1 SNPs were associated with BMI and MetSyn, while WDR64 SNPs were associated with SBP and MetSyn.

Table 1. Genes with ≥2 SNPs associated with individual metabolic measures and MetSyn at p < 0.001

In Approach 2, we modeled MetSyn as a continuous, second-order factor defined by four first-order factors (Obesity, Insulin Resistance, BP, Lipids) and observed a good fitting model by several fit indices (χ² = 59.48, df = 7, p < 0.05; CFI = 0.97; RMSEA = 0.05; SRMR = 0.02). We found that Insulin Resistance was the most important factor (βstd = 0.99 ± standard error of βstd = 0.02; p < 0.001), followed by Obesity (0.96 ± 0.03; p < 0.001), Lipids (0.69 ± 0.05; p < 0.001), and BP (0.50 ± 0.04; p < 0.001). Using this model, we examined potential associations between each SNP on the 50 k panel and the second- and first-order factor scores (Table 2). KIAA0329 SNPs were found to be associated with Lipids factor scores (Table 2), which is consistent with the results observed in Approach 1 in that KIAA0329 SNPs were also associated with TG and HDL measures. However, CSMD1 was the only gene with ≥2 SNPs found to be associated with MetSyn factor scores (p < 0.001).

In Approach 3, we extended the latent gene construct SEM method of Nock et al. [15] to model multiple putative genes identified in Approaches 1 and 2 simultaneously with MetSyn as a second-order factor. Modeling MetSyn as a second-order factor with 24 SNPs in seven genes improved model fit by several indices (χ² = 523.52, df = 375, p < 0.05; CFI = 0.99; RMSEA = 0.01; SRMR = 0.02) and increased the R² of MetSyn from 0.23 to 0.43, compared to the same model without genes. To further illustrate the utility of SEM, we also added a path between MetSyn and coronary heart disease (CHD) in models with and without genes. As shown in Figure 1, the strongest associations in terms of effect size and statistical significance were found between CETP and Lipids (βstd = 0.15 ± 0.02; p = 1.04 × 10⁻⁸), STARD13 and Insulin Resistance (βstd = 0.14 ± 0.08; p = 0.05), and STARD13 and MetSyn (βstd = 0.08 ± 0.03; p = 0.007); however, the CSMD1 latent construct was not associated with MetSyn even when using different combinations of SNPs to devise the construct (data not shown). The MYO16 latent gene construct was not associated with either the BP or MetSyn factors and, for reasons of parsimony, was dropped from the final model shown in Figure 1. The association between MetSyn and CHD was similar, but slightly attenuated, in the model with (βstd = 0.13 ± 0.04; p = 0.002) versus without (βstd = 0.14 ± 0.03; p < 0.001) the seven genes.

Table 2. Genes with ≥2 SNPs associated with MetSyn factor scores at p < 0.001
Figure 1

Model of MetSyn and genes as latent constructs using SEM. The model showed good overall fit (χ² = 1336.00, df = 457, p < 0.05; CFI = 0.92; RMSEA = 0.03; SRMR = 0.03). Standardized loadings and corresponding standard errors are depicted above arrows. Blue, MetSyn traits; Red, Genes; Green, coronary heart disease; *p ≤ 0.05; **p ≤ 0.10. Residuals not shown for clarity.

Discussion

Results between approaches were similar in that CSMD1 SNPs were found to be associated with MetSyn when using both the modified WHO definition (Approach 1) and the factor scores (Approach 2). However, when evaluating associations between each SNP and each individual metabolic measure, the factor scores (Approach 2) produced fewer putative genes with ≥2 SNPs at p < 0.001. Because we were most interested in defining putative genes and not individual SNPs, we retained genes with ≥2 SNPs at p < 0.001 rather than correcting for multiple tests using a standard Bonferroni approach. If we had applied a correction factor for 2,000 tests (p ≤ 2.5 × 10⁻⁵), which is the approximate number of genes on the 50 k panel, only CETP (mean SNP p = 1.05 × 10⁻⁶) would have qualified for use in Approach 3. Interestingly, the CETP latent gene construct (Approach 3) had the strongest association of all of the gene constructs in terms of effect size (βstd = 0.15) and significance (p = 1.04 × 10⁻⁸) in the 7-gene, 24-SNP model (Figure 1).
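The Bonferroni arithmetic in this paragraph can be checked directly. The short sketch below uses only numbers quoted in the text (α = 0.05, roughly 2,000 gene-level tests, and the two CETP SNP p-values reported in Results); the variable names are illustrative.

```python
ALPHA = 0.05
N_TESTS = 2000  # approximate number of genes on the 50 k panel, per the text

# Bonferroni-corrected significance threshold.
bonferroni_threshold = ALPHA / N_TESTS

# The two CETP SNP p-values reported for Approach 1.
cetp_p_values = [7.60e-7, 1.34e-6]
mean_cetp_p = sum(cetp_p_values) / len(cetp_p_values)

print(f"threshold   = {bonferroni_threshold:.1e}")
print(f"mean CETP p = {mean_cetp_p:.2e}")
print("CETP passes:", mean_cetp_p <= bonferroni_threshold)
```

The threshold works out to 2.5 × 10⁻⁵ and the mean CETP p-value to 1.05 × 10⁻⁶, matching the values given in the paragraph.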

Although CSMD1 SNPs were associated with MetSyn in Approaches 1 and 2, the CSMD1 latent gene construct (Approach 3) was not associated with MetSyn when modeled in the presence of SNPs in six other genes, even when the construct was devised with different combinations of SNPs. This emphasizes an important advantage of Approach 3: it can better control for the effects of multiple putative SNPs (and genes) in the same model. Although sample sizes differed between approaches, the consistent findings we observed across all three approaches for CETP (HDL (Approach 1), Lipids (Approach 2)) and STARD13 (fasting glucose (Approach 1), Insulin Resistance (Approach 3)) make attributing the CSMD1 discrepancies to sample size differences less compelling. Moreover, SNPs in genes previously shown to be associated with MetSyn using the WHO criteria, including LDLR, PPARG, and ACE [18], were not found to be significant at p ≤ 0.05 in our study. The lack of replication may be due to modifications we had to make to the WHO definition to accommodate available data and, perhaps, genetic heterogeneity of this complex phenotype.

Conclusion

The multivariate framework of SEM is inherently better suited for modeling the hierarchical, complex relations involved in MetSyn, and the latent gene construct SEM approach appears particularly useful for disentangling the influence of individual genes on MetSyn in the presence of multiple putative SNPs.

Abbreviations

BMI: Body mass index
CHD: Coronary heart disease
CFI: Comparative fit index
DBP: Diastolic blood pressure
FA: Factor analysis
HDL: High density lipoprotein-cholesterol
IDF: International Diabetes Federation
LD: Linkage disequilibrium
MetSyn: Metabolic Syndrome
RMSEA: Root mean square error of approximation
SBP: Systolic blood pressure
SEM: Structural equation modeling
SNP: Single-nucleotide polymorphism
SRMR: Standardized root mean square residual
WHO: World Health Organization

References

  1. Mitrakou A: Women's health and metabolic syndrome. Ann NY Acad Sci. 2006, 1092: 33-48. 10.1196/annals.1365.003.

  2. Cowey S, Hardy RW: The metabolic syndrome: a high-risk state for cancer?. Am J Pathol. 2006, 169: 1505-1522. 10.2353/ajpath.2006.051090.

  3. Ford ES, Giles WH, Dietz WH: Prevalence of the metabolic syndrome among US adults: findings from third National Health and Nutrition Examination Survey. JAMA. 2002, 287: 356-359. 10.1001/jama.287.3.356.

  4. Ford ES: Prevalence of the metabolic syndrome defined by the International Diabetes Federation among adults in the U.S. Diabetes Care. 2005, 28: 2745-2749. 10.2337/diacare.28.11.2745.

  5. Adams RJ, Appleton S, Wilson DH, Taylor AW, Dal Grande E, Chittleborough C, Gill T, Ruffin R: Population comparison of two clinical approaches to the metabolic syndrome: implications of the new International Diabetes Federation consensus definition. Diabetes Care. 2005, 28: 2777-2779. 10.2337/diacare.28.11.2777.

  6. Elabbassi WN, Haddad HA: The epidemic of metabolic syndrome. Saudi Med J. 2005, 26: 373-375.

  7. Daskalopoulou SS, Athyros VG, Kolovou GD, Anagnostopoulou KK, Mikhailidis DP: Definitions of metabolic syndrome: where are we now?. Curr Vasc Pharmacol. 2006, 4: 185-197. 10.2174/157016106777698450.

  8. Lafortuna CL, Adorni F, Agosti F, Sartorio A: Factor analysis of metabolic syndrome components in obese women. Nutr Metab Cardiovasc Dis. 2008, 18: 233-241. 10.1016/j.numecd.2007.02.002.

  9. Shen BJ, Todaro JF, Niaura R, McCaffery JM, Zhang J, Spiro A, Ward KD: Are metabolic risk factors one unified syndrome? Modeling the structure of the metabolic syndrome X. Am J Epidemiol. 2003, 157: 701-711. 10.1093/aje/kwg045.

  10. Shen BJ, Goldberg RB, Llabre MM, Schneiderman N: Is the factor structure of the metabolic syndrome comparable between men and women and across three ethnic groups: the Miami Community Health Study. Ann Epidemiol. 2006, 16: 131-137. 10.1016/j.annepidem.2005.06.049.

  11. Terán-García M, Bouchard C: Genetics of the metabolic syndrome. Appl Physiol Nutr Metab. 2007, 32: 89-114. 10.1139/H06-102.

  12. Cui JS, Hopper JL, Harrap SB: Antihypertensive treatments obscure familial contributions to blood pressure variation. Hypertension. 2003, 41: 207-210. 10.1161/01.HYP.0000044938.94050.E3.

  13. George VT, Elston RC: Testing the association between polymorphic markers and quantitative traits in pedigrees. Genet Epidemiol. 1987, 4: 193-201. 10.1002/gepi.1370040304.

  14. George VT, Elston RC: Generalized modulus power-transformation. Commun Stat Theory Methods. 1988, 17: 2933-2952. 10.1080/03610928808829781.

  15. Nock NL, Larkin EK, Morris NJ, Li Y, Stein CM: Modeling the complex gene × environment interplay in the simulated rheumatoid arthritis GAW15 data using latent variable structural equation modeling. BMC Proc. 2007, 1 (suppl 1): S118-10.1186/1753-6561-1-s1-s118.

  16. Kline RB: Measurement models and confirmatory factor analysis. Principles and Practice of Structural Equation Modeling. 2005, New York, Guilford Press, 133-145. 2nd edition.

  17. Hu LT, Bentler PM: Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Structural Equation Modeling. 1999, 6: 1-55. 10.1080/10705519909540118.

  18. Pollex RL, Hegele RA: Genetic determinants of the metabolic syndrome. Nat Clin Pract Cardiovasc Med. 2006, 3: 482-489. 10.1038/ncpcardio0638.

Acknowledgements

The Genetic Analysis Workshops are supported by NIH grant R01 GM031575 from the National Institute of General Medical Sciences. Additional support was provided by NIH NCI K07 CA129162, NCI U54 CA116867, R25T CA094186, KL2 RR024990, and NCRR RR03655.

This article has been published as part of BMC Proceedings Volume 3 Supplement 7, 2009: Genetic Analysis Workshop 16. The full contents of the supplement are available online at http://www.biomedcentral.com/1753-6561/3?issue=S7.

Author information

Corresponding author

Correspondence to Nora L Nock.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

NLN conceived the study, conducted the statistical analyses, and prepared the manuscript. XW, YS, DB, CT, and PR cleaned the data and helped conduct analyses for Approach 1. CG-M and CS helped with study design and coordination and drafting the manuscript.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Nock, N.L., Wang, X., Thompson, C.L. et al. Defining genetic determinants of the Metabolic Syndrome in the Framingham Heart Study using association and structural equation modeling methods. BMC Proc 3 (Suppl 7), S50 (2009). https://doi.org/10.1186/1753-6561-3-S7-S50
