Volume 3 Supplement 7
Genetic Analysis Workshop 16
Defining genetic determinants of the Metabolic Syndrome in the Framingham Heart Study using association and structural equation modeling methods
 Nora L Nock^{1, 2},
 Xuefeng Wang^{1},
 Cheryl L Thompson^{2, 3},
 Yeunjoo Song^{1},
 Dan Baechle^{1},
 Paola Raska^{1},
 Catherine M Stein^{1} and
 Courtney Gray-McGuire^{1, 2}
DOI: 10.1186/1753-6561-3-S7-S50
© Nock et al; licensee BioMed Central Ltd. 2009
Published: 15 December 2009
Abstract
The Metabolic Syndrome (MetSyn), which is a clustering of traits including insulin resistance, obesity, hypertension and dyslipidemia, is estimated to have a substantial genetic component, yet few specific genetic targets have been identified. Factor analysis, a subtype of structural equation modeling (SEM), has been used to model the complex relationships in MetSyn. Therefore, we aimed to define the genetic determinants of MetSyn in the Framingham Heart Study (Offspring Cohort, Exam 7) using the Affymetrix 50 k Human Gene Panel and three different approaches: 1) an association-based "one-SNP-at-a-time" analysis with MetSyn as a binary trait using the World Health Organization criteria; 2) an association-based "one-SNP-at-a-time" analysis with MetSyn as a continuous trait using second-order factor scores derived from four first-order factors; and 3) a multivariate SEM analysis with MetSyn as a continuous, second-order factor modeled with multiple putative genes, which were represented by latent constructs defined using multiple SNPs in each gene. Results were similar between approaches in that CSMD1 SNPs were associated with MetSyn in Approaches 1 and 2; however, the effects of CSMD1 diminished in Approach 3 when modeled simultaneously with six other genes, most notably CETP and STARD13, which were strongly associated with the Lipids and MetSyn factors, respectively. We conclude that using SEM to model multiple genes as latent constructs on the first-order trait factors most proximal to each gene's function, with limited paths directly from genes to the second-order MetSyn factor, is the most viable approach toward understanding overall gene variation effects in the presence of multiple putative SNPs.
Background
The Metabolic Syndrome (MetSyn) is a clustering of metabolic disturbances that increases the risk of type 2 diabetes and cardiovascular disease [1], and may contribute to the pathogenesis of other complex diseases, including cancer [2]. MetSyn is estimated to affect over 47 million adult Americans [3, 4] and is becoming increasingly prevalent worldwide [5, 6]. Although the prevalence of MetSyn has been shown to increase with age, recent studies have shown a rise in MetSyn among younger people in the U.S., particularly in women 20 to 39 years of age [1]. Interestingly, this rise mirrors the increasing rates of obesity in women of these ages.
Although it is well established that MetSyn involves the co-occurrence of multiple metabolic traits, there are differences in the formal definitions set forth by the World Health Organization (WHO), the National Cholesterol Education Program Third Adult Treatment Panel (NCEP-ATP III), the American Heart Association/National Heart, Lung and Blood Institute (AHA/NHLBI) and the International Diabetes Federation (IDF), predominantly in defining the most relevant elements and their biological cutpoints, which has contributed to confusion in the literature [7]. Nevertheless, all of these definitions include criteria on four common traits: 1) insulin resistance, 2) obesity, 3) hypertension, and 4) dyslipidemia. Factor analysis, a statistical method under the umbrella of structural equation modeling (SEM), has been used, albeit sparingly, to help define the critical elements and structure of the syndrome. Studies conducted in adults using 8 to 10 metabolic measures (fasting insulin, fasting glucose, post-challenge insulin, post-challenge glucose, body mass index (BMI), waist circumference or waist-to-hip ratio (WHR), high-density lipoprotein cholesterol (HDL), triglycerides (TG), systolic blood pressure (SBP), diastolic blood pressure (DBP)) have shown that MetSyn is best described, statistically, as a unifying, second-order factor defined by four first-order factors (Insulin Resistance, Obesity, Hypertension, Lipids) [8–10].
MetSyn is hypothesized to have a fairly large genetic component, with heritability estimates ranging from 6.3% to 50% [11], yet few potential genetic targets have been identified. Thus, we aimed to define the genetic determinants of MetSyn in the Framingham Heart Study (Offspring Cohort, Exam 7) using the Affymetrix 50 k Human Gene Panel data and three different approaches: 1) an association-based "one-single-nucleotide polymorphism (SNP)-at-a-time" analysis with MetSyn defined as a binary trait using the WHO criteria [7]; 2) an association-based "one-SNP-at-a-time" analysis with MetSyn defined as a continuous trait using second-order factor scores derived from insulin resistance, obesity, hypertension, and dyslipidemia factors; and 3) a multivariate SEM analysis with MetSyn defined as a second-order continuous factor trait modeled simultaneously with putative genes (identified in Approaches 1 and 2), which we represented as latent constructs defined by multiple SNPs within each gene.
Methods
Data cleaning and preparation: phenotype and genotype variables
First, we examined the distribution of metabolic variables (TG, HDL, SBP, DBP, fasting glucose, BMI) in the Offspring Cohort, Exam 7 using SAS v9.1 (SAS Institute Inc., Cary, NC). Variables not following a normal distribution (TG, HDL, fasting glucose), as determined by visual inspection of histograms and quantile-quantile plots and by formal Shapiro-Wilk and Kolmogorov-Smirnov tests, were natural log transformed. To adjust for potential bias by antihypertensive treatment and more closely reflect pretreatment BP values, we added 10 mm Hg to SBP and 5 mm Hg to DBP, following Cui et al. [12], in subjects who reported taking blood pressure medications. We used the WHO criteria [7] to define MetSyn; however, waist circumference and microalbuminuria measures were not available, and we applied the most recent IDF fasting glucose cutpoint value of ≥100 mg/dL [7]. Mendelian inconsistencies were identified in the Affymetrix 50 k Human Gene Panel data using MARKERINFO (S.A.G.E. v5.4.1). If an inconsistency was found, genotypes of all individuals in that family were set to missing. Of the 2760 subjects in the Offspring Cohort (Approach 1), 2544 had complete data on all metabolic measures (Approach 2) and 1512 had complete data on all metabolic measures and putative genotypes (Approach 3).
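The treatment adjustment and log transformation described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the SAS code used in the study: the field names (`sbp`, `dbp`, `tg`, `hdl`, `glucose`, `on_bp_meds`) and the example subject are hypothetical, and only the offsets and the list of transformed variables come from the text.

```python
import math

# Offsets added for subjects reporting blood pressure medication
# (values from the text, following Cui et al. [12]).
SBP_OFFSET_MMHG = 10.0
DBP_OFFSET_MMHG = 5.0

# Variables the text reports as natural-log transformed.
LOG_TRANSFORMED = ("tg", "hdl", "glucose")


def prepare_subject(record):
    """Return a copy of one subject's record with BP adjusted for
    treatment and the skewed traits natural-log transformed."""
    out = dict(record)
    if record["on_bp_meds"]:
        out["sbp"] = record["sbp"] + SBP_OFFSET_MMHG
        out["dbp"] = record["dbp"] + DBP_OFFSET_MMHG
    for name in LOG_TRANSFORMED:
        out["ln_" + name] = math.log(record[name])
    return out


# Hypothetical subject, for illustration only.
subject = {"sbp": 128.0, "dbp": 82.0, "tg": 150.0, "hdl": 45.0,
           "glucose": 100.0, "on_bp_meds": True}
prepared = prepare_subject(subject)
print(prepared["sbp"], prepared["dbp"], round(prepared["ln_tg"], 3))
```

The original values are kept alongside the `ln_` fields so that untransformed measures remain available for cutpoint-based definitions such as the fasting glucose threshold.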
Statistical methods for associationbased analyses
Approach 1 and 2
where h is the generalized modulus power transformation [14], which estimates the regression coefficients, γ_{j} and δ, as median unbiased on the original scale of measurement. Analyses were adjusted for age, sex and age × sex. p-Values were calculated using likelihood-ratio and Wald tests and compared to ensure consistency; however, we report only the Wald p-values since results were similar in all cases. In Approaches 1 and 2, a gene was considered statistically significant if it had ≥2 SNPs associated with an individual metabolic variable or MetSyn at p < 0.001. Significant genes were then utilized in Approach 3.
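The gene-level selection rule (at least two SNPs per gene at p < 0.001 for a given trait) can be sketched as a simple filter over per-SNP results. This is an illustrative sketch, not the S.A.G.E. workflow: the tuple layout and the function name `select_genes` are assumptions, the CETP and STARD13 rows reuse p-values reported in the Results tables, and `GENE_X` is a hypothetical gene added to show a rejection.

```python
from collections import defaultdict

P_THRESHOLD = 0.001  # per-SNP threshold from the text
MIN_SNPS = 2         # minimum qualifying SNPs per gene from the text


def select_genes(results, p_threshold=P_THRESHOLD, min_snps=MIN_SNPS):
    """Return {(trait, gene): [snps]} for genes with at least
    min_snps SNPs at p < p_threshold for the same trait."""
    hits = defaultdict(list)
    for trait, gene, snp, p_value in results:
        if p_value < p_threshold:
            hits[(trait, gene)].append(snp)
    return {key: snps for key, snps in hits.items() if len(snps) >= min_snps}


# (trait, gene, SNP, Wald p-value)
results = [
    ("HDL", "CETP", "rs11508026", 7.60e-7),
    ("HDL", "CETP", "rs3764261", 1.34e-6),
    ("Fasting glucose", "STARD13", "rs515192", 0.000266),
    ("Fasting glucose", "STARD13", "rs2858808", 0.000808),
    ("HDL", "GENE_X", "rs_hypothetical_1", 0.0004),  # hypothetical
    ("HDL", "GENE_X", "rs_hypothetical_2", 0.02),    # hypothetical
]

selected = select_genes(results)
for (trait, gene), snps in sorted(selected.items()):
    print(trait, gene, snps)
```

Grouping by trait and gene together means a gene is not selected on the strength of single hits spread across different traits, which matches the wording of the rule above.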
Statistical methods for factor analysis (FA) and SEM
Approach 2 and 3
We used previous reports to devise our second-order MetSyn factor model [9, 10]; however, because fasting insulin and waist circumference were not available, the first-order factors Insulin Resistance and Obesity were defined using only fasting glucose and BMI, respectively. Similar to previous models [9, 10], the BP and Lipids first-order factors were defined using SBP and DBP and TG and HDL measures, respectively. We performed confirmatory factor analysis using a robust maximum likelihood estimator (MLR), which provides test statistics and standard errors robust to non-independence of observations and non-normality (Mplus v5.1; TYPE = COMPLEX), to formally test our second-order MetSyn model and to generate corresponding factor scores. In Approach 2, we examined potential associations between each SNP on the 50 k panel and the factor scores with ASSOC (S.A.G.E. v5.4.1). In Approach 3, we extended the latent gene construct SEM method of Nock et al. [15] using the robust maximum likelihood estimator (MLR; Mplus v5.1) to simultaneously model MetSyn as a second-order factor together with multiple putative genes identified in Approaches 1 and 2. Similar to Nock et al. [15], we used eigenvalues, scree plots, factor patterns, Cronbach's alpha and linkage disequilibrium (LD) plots (Haploview v4.1) to help select the most informative SNPs in devising the latent gene constructs. For putative genes identified in Approaches 1 and 2, we utilized all available SNPs on the 50 k panel, including those SNPs found to be statistically significant in Approaches 1 and 2, unless they provided redundant information and created a linear dependency. To assess the overall model goodness-of-fit to the data, the χ^{2} test, comparative fit index (CFI), root mean square error of approximation (RMSEA) and standardized root mean square residual (SRMR) were evaluated [16].
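Of the SNP-selection diagnostics listed above, Cronbach's alpha is the simplest to illustrate. The sketch below computes the standard alpha statistic for a set of SNPs treated as items; it is not the authors' procedure, which combined alpha with eigenvalues, scree plots, factor patterns and LD plots. The additive 0/1/2 genotype coding and the toy genotypes are assumptions made for the example.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of
    per-subject scores of equal length (sample variances)."""
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1.0 - item_variance_sum / variance(totals))


# Hypothetical additive genotype codes (0, 1 or 2 minor alleles)
# for three SNPs in six subjects.
snp_a = [0, 1, 2, 1, 0, 2]
snp_b = [0, 1, 2, 1, 0, 2]  # identical to snp_a in this toy example
snp_c = [0, 1, 2, 2, 0, 1]

alpha = cronbach_alpha([snp_a, snp_b, snp_c])
print(round(alpha, 4))
```

Two SNPs with identical genotypes give an alpha of exactly 1, which is also the situation the text describes as redundant information creating a linear dependency, so a high alpha alone does not justify keeping every SNP in a construct.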
The χ^{2} test, which evaluates whether the covariance matrix is equal to the model-implied covariance matrix predicted by the parameters, is very sensitive to sample size and complexity. Thus, other fit indices such as the CFI, RMSEA, and SRMR have been proposed as alternative descriptive measures for evaluating model fit [16]. For the CFI, which is relatively insensitive to sample size and model complexity, values of ≥0.90 and ≥0.95 indicate acceptable and good fit, respectively [17]. For the RMSEA, an index less sensitive to sample size and favoring more parsimonious models, values of ≤0.06 represent good fit while values >0.10 represent unacceptable fit [17]. SRMR values of ≤0.08 and <0.10 represent good and acceptable fit, respectively [16, 17]. All p-values are from two-sided tests, and statistical significance was set at p ≤ 0.05 in Approach 3.
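The cutoffs quoted above can be collected into a small helper for reading model output. This is a sketch of the thresholds as stated in this section only; the function names, the example fit values, and the labels "poor" and "intermediate" (for ranges the text does not name) are assumptions.

```python
def classify_cfi(cfi):
    """CFI: >=0.95 good, >=0.90 acceptable (cutoffs from the text)."""
    if cfi >= 0.95:
        return "good"
    if cfi >= 0.90:
        return "acceptable"
    return "poor"  # label assumed; the text gives no name for CFI < 0.90


def classify_rmsea(rmsea):
    """RMSEA: <=0.06 good, >0.10 unacceptable (cutoffs from the text)."""
    if rmsea <= 0.06:
        return "good"
    if rmsea > 0.10:
        return "unacceptable"
    return "intermediate"  # label assumed for 0.06 < RMSEA <= 0.10


def classify_srmr(srmr):
    """SRMR: <=0.08 good, <0.10 acceptable (cutoffs from the text)."""
    if srmr <= 0.08:
        return "good"
    if srmr < 0.10:
        return "acceptable"
    return "poor"  # label assumed; the text gives no name for SRMR >= 0.10


# Hypothetical fit statistics, for illustration only.
fit = {"CFI": 0.93, "RMSEA": 0.05, "SRMR": 0.09}
summary = {
    "CFI": classify_cfi(fit["CFI"]),
    "RMSEA": classify_rmsea(fit["RMSEA"]),
    "SRMR": classify_srmr(fit["SRMR"]),
}
print(summary)
```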
Results
Genes with ≥2 SNPs associated with individual metabolic measures and MetSyn^{a} at p < 0.001
Trait and Chr  Gene symbol  Gene ID  rs number  Base pair/AA change  MAF  β (S.E.)  p-Value

Fasting glucose
Chr 13  STARD13  90627  515192  Outside G/T  0.400  0.034 (0.009)  0.000266
Chr 13  STARD13  90627  2858808  Intron C/T  0.364  0.032 (0.009)  0.000808
BMI
Chr 4  KCTD8  386617  13143747  Thr/Thr  0.092  1.612 (0.419)  0.000118
Chr 4  KCTD8  386617  17599556  Intron A/C  0.142  0.978 (0.276)  0.000371
Chr 4  KCTD8  386617  2347926  Intron A/C  0.242  1.066 (0.299)  0.000400
Chr 8  CSMD1  64478  1997137  Intron G/T  0.167  0.944 (0.281)  0.000284
Chr 8  CSMD1  64478  2930355  Intron A/G  0.183  1.241 (0.342)  0.000778
Chr 13    729646  311865  Outside C/T  0.267  1.293 (0.265)  0.000001
Chr 13    729646  1006255  Outside C/T  0.417  0.933 (0.240)  0.000102
TG
Chr 14  KIAA0329  9895  1210074  Intron A/G  0.246  0.093 (0.027)  0.000462
Chr 14  KIAA0329  9895  12434098  Intron C/T  0.475  0.101 (0.029)  0.000682
Chr 14  KIAA0329  9895  1190547  Intron C/G  0.242  0.091 (0.028)  0.000966
HDL
Chr 14  KIAA0329  9895  12434098  Intron C/T  0.475  0.050 (0.014)  0.000322
Chr 14  KIAA0329  9895  1190547  Intron C/G  0.242  0.049 (0.014)  0.000637
Chr 16  CETP  1071  11508026  Intron C/T  0.492  0.075 (0.016)  7.60 × 10^{-7}
Chr 16  CETP  1071  3764261  Outside G/T  0.367  0.069 (0.014)  1.34 × 10^{-6}
SBP
Chr 1  WDR64  128025  12074374  Trp/Arg  0.208  4.271 (1.207)  0.000246
Chr 1  WDR64  128025  12095445  Gln/Arg    4.390 (1.197)  0.000402
Chr 13  MYO16  23026  4772992  Intron A/G  0.383  2.895 (0.876)  0.000158
Chr 13  MYO16  23026  6492144  Intron C/G  0.392  2.964 (0.785)  0.000777
Chr 13  MYO16  23026  9514889  Intron A/G  0.192  2.673 (0.795)  0.000953
MetSyn^{a}
Chr 1  WDR64  128025  12074374  Trp/Arg  0.208  0.060 (0.018)  0.000721
Chr 1  WDR64  128025  12095445  Gln/Arg    0.061 (0.018)  0.000961
Chr 8  CSMD1^{b}  64478  7013078  Intron A/C  0.040  0.135 (0.039)  0.000582
Chr 8  CSMD1  64478  12549291  Intron G/T  0.250  0.045 (0.013)  0.000808
Genes with ≥2 SNPs associated with MetSyn factor scores^{a} at p < 0.001
Trait and Chr  Gene symbol  Gene ID  rs number  Base pair change  MAF  β (S.E.)  p-Value

Insulin resistance (first-order factor)
Chr 8  CSMD1  64478  7013078  Intron A/C  0.040  4.724 (1.392)  0.000594
Chr 8  CSMD1  64478  1997137  Intron G/T  0.167  2.572 (0.755)  0.000623
Obesity (first-order factor)
Chr 8  CSMD1  64478  1997137  Intron G/T  0.167  0.674 (0.194)  0.000469
Chr 8  CSMD1  64478  7013078  Intron A/C  0.040  1.222 (0.360)  0.000510
Lipids (first-order factor)^{b}
Chr 14  KIAA0329  9895  1210074  Intron A/G  0.246  0.230 (0.063)  0.000285
Chr 14  KIAA0329  9895  12434098  Intron C/T  0.475  0.211 (0.060)  0.000420
Chr 14  KIAA0329  9895  1190547  Intron C/G  0.242  0.304 (0.129)  0.000585
MetSyn (second-order factor)^{a}
Chr 8  CSMD1  64478  1997137  Intron G/T  0.167  0.189 (0.056)  0.000583
Chr 8  CSMD1  64478  7013078  Intron A/C  0.040  0.428 (0.103)  0.000632
Discussion
Results between approaches were similar in that CSMD1 SNPs were found to be associated with MetSyn when using both the modified WHO definition (Approach 1) and the factor scores (Approach 2). However, when evaluating associations between each SNP and each individual metabolic measure, the factor scores (Approach 2) produced fewer putative genes with ≥2 SNPs at a threshold of p < 0.001. Because we were most interested in defining putative genes and not individual SNPs, we retained genes with ≥2 SNPs at p < 0.001 rather than correcting for multiple tests using a standard Bonferroni correction. If we had applied a correction factor for 2,000 tests (p ≤ 2.5 × 10^{-5}), which is the approximate number of genes on the 50 k panel, only CETP (mean SNP p = 1.05 × 10^{-6}) would have qualified for use in Approach 3. Interestingly, the CETP latent gene construct (Approach 3) had the strongest association of all of the gene constructs in terms of effect size (β_{std} = 0.15) and significance (p = 1.04 × 10^{-8}) in the 7-gene, 24-SNP model (Figure 1).
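The Bonferroni arithmetic in this paragraph can be checked directly. The sketch below assumes a family-wise α of 0.05 (not stated explicitly for this calculation, but it is the value that yields the quoted threshold for 2,000 tests) and uses the Wald p-values reported in the first Results table for the two CETP SNPs and the two STARD13 SNPs; the variable and function names are illustrative.

```python
ALPHA = 0.05    # assumed family-wise error rate
N_TESTS = 2000  # approximate number of genes on the 50 k panel (from the text)

bonferroni_threshold = ALPHA / N_TESTS

# Wald p-values for the SNPs reported per gene in the Results table.
gene_p_values = {
    "CETP": [7.60e-7, 1.34e-6],
    "STARD13": [0.000266, 0.000808],
}


def mean_p(p_values):
    """Arithmetic mean of the per-SNP p-values for one gene."""
    return sum(p_values) / len(p_values)


qualifies = {gene: mean_p(ps) <= bonferroni_threshold
             for gene, ps in gene_p_values.items()}

print(f"threshold = {bonferroni_threshold:.1e}")
print(f"mean CETP p = {mean_p(gene_p_values['CETP']):.2e}")
print(qualifies)
```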
Although CSMD1 SNPs were associated with MetSyn in Approaches 1 and 2, the CSMD1 latent gene construct (Approach 3) was not associated with MetSyn when modeled in the presence of SNPs in six other genes, even when the construct was devised with different combinations of SNPs. This emphasizes an important advantage of Approach 3: it can better control for the effects of multiple putative SNPs (and genes) in the same model. Although sample sizes differed between approaches, the consistent findings we observed across all three approaches for CETP (HDL (Approach 1), Lipids (Approach 2)) and STARD13 (fasting glucose (Approach 1), Insulin Resistance (Approach 3)) make it less plausible that the CSMD1 discrepancies are due to sample size differences. Moreover, SNPs in genes previously shown to be associated with MetSyn using the WHO criteria, including LDLR, PPARG, and ACE [18], were not found to be significant at p ≤ 0.05 in our study. The lack of replication may be due to modifications we had to make to the WHO definition to accommodate available data and, perhaps, to genetic heterogeneity of this complex phenotype.
Conclusion
The multivariate framework of SEM is inherently better suited for modeling the hierarchical, complex relations involved in MetSyn, and the latent gene construct SEM approach appears particularly useful for disentangling the influence of individual genes on MetSyn in the presence of multiple putative SNPs.
List of abbreviations used
BMI: Body mass index
CHD: Coronary heart disease
CFI: Comparative fit index
DBP: Diastolic blood pressure
FA: Factor analysis
HDL: High-density lipoprotein cholesterol
IDF: International Diabetes Federation
LD: Linkage disequilibrium
MetSyn: Metabolic Syndrome
RMSEA: Root mean square error of approximation
SBP: Systolic blood pressure
SEM: Structural equation modeling
SNP: Single-nucleotide polymorphism
SRMR: Standardized root mean square residual
WHO: World Health Organization
Declarations
Acknowledgements
The Genetic Analysis Workshops are supported by NIH grant R01 GM031575 from the National Institute of General Medical Sciences. Additional support was provided by NIH NCI K07 CA129162, NCI U54 CA116867, R25T CA094186, KL2 RR024990, and NCRR RR03655.
This article has been published as part of BMC Proceedings Volume 3 Supplement 7, 2009: Genetic Analysis Workshop 16. The full contents of the supplement are available online at http://www.biomedcentral.com/1753-6561/3?issue=S7.
References
1. Mitrakou A: Women's health and metabolic syndrome. Ann NY Acad Sci. 2006, 1092: 33-48. 10.1196/annals.1365.003.
2. Cowey S, Hardy RW: The metabolic syndrome: a high-risk state for cancer?. Am J Pathol. 2006, 169: 1505-1522. 10.2353/ajpath.2006.051090.
3. Ford ES, Giles WH, Dietz WH: Prevalence of the metabolic syndrome among US adults: findings from third National Health and Nutrition Examination Survey. JAMA. 2002, 287: 356-359. 10.1001/jama.287.3.356.
4. Ford ES: Prevalence of the metabolic syndrome defined by the International Diabetes Federation among adults in the U.S. Diabetes Care. 2005, 28: 2745-2749. 10.2337/diacare.28.11.2745.
5. Adams RJ, Appleton S, Wilson DH, Taylor AW, Dal Grande E, Chittleborough C, Gill T, Ruffin R: Population comparison of two clinical approaches to the metabolic syndrome: implications of the new International Diabetes Federation consensus definition. Diabetes Care. 2005, 28: 2777-2779. 10.2337/diacare.28.11.2777.
6. Elabbassi WN, Haddad HA: The epidemic of metabolic syndrome. Saudi Med J. 2005, 26: 373-375.
7. Daskalopoulou SS, Athyros VG, Kolovou GD, Anagnostopoulou KK, Mikhailidis DP: Definitions of metabolic syndrome: where are we now?. Curr Vasc Pharmacol. 2006, 4: 185-197. 10.2174/157016106777698450.
8. Lafortuna CL, Adorni F, Agosti F, Sartorio A: Factor analysis of metabolic syndrome components in obese women. Nutr Metab Cardiovasc Dis. 2008, 18: 233-241. 10.1016/j.numecd.2007.02.002.
9. Shen BJ, Todaro JF, Niaura R, McCaffery JM, Zhang J, Spiro A, Ward KD: Are metabolic risk factors one unified syndrome? Modeling the structure of the metabolic syndrome X. Am J Epidemiol. 2003, 157: 701-711. 10.1093/aje/kwg045.
10. Shen BJ, Goldberg RB, Llabre MM, Schneiderman N: Is the factor structure of the metabolic syndrome comparable between men and women and across three ethnic groups: the Miami Community Health Study. Ann Epidemiol. 2006, 16: 131-137. 10.1016/j.annepidem.2005.06.049.
11. Terán-García M, Bouchard C: Genetics of the metabolic syndrome. Appl Physiol Nutr Metab. 2007, 32: 89-114. 10.1139/H06-102.
12. Cui JS, Hopper JL, Harrap SB: Antihypertensive treatments obscure familial contributions to blood pressure variation. Hypertension. 2003, 41: 207-210. 10.1161/01.HYP.0000044938.94050.E3.
13. George VT, Elston RC: Testing the association between polymorphic markers and quantitative traits in pedigrees. Genet Epidemiol. 1987, 4: 193-201. 10.1002/gepi.1370040304.
14. George VT, Elston RC: Generalized modulus power-transformation. Commun Stat Theory Methods. 1988, 17: 2933-2952. 10.1080/03610928808829781.
15. Nock NL, Larkin EK, Morris NJ, Li Y, Stein CM: Modeling the complex gene × environment interplay in the simulated rheumatoid arthritis GAW15 data using latent variable structural equation modeling. BMC Proc. 2007, 1 (suppl 1): S118. 10.1186/1753-6561-1-s1-s118.
16. Kline RB: Measurement models and confirmatory factor analysis. Principles and Practice of Structural Equation Modeling. 2005, New York, Guilford Press, 133-145. 2nd edition.
17. Hu LT, Bentler PM: Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Structural Equation Modeling. 1999, 6: 1-55. 10.1080/10705519909540118.
18. Pollex RL, Hegele RA: Genetic determinants of the metabolic syndrome. Nat Clin Pract Cardiovasc Med. 2006, 3: 482-489. 10.1038/ncpcardio0638.
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.