Comparison of the power of haplotype-based versus single- and multilocus association methods for gene × environment (gene × sex) interactions and application to gene × smoking and gene × sex interactions in rheumatoid arthritis

Accounting for interactions with environmental factors in association studies may improve the power to detect genetic effects and may help identifying important environmental effect modifiers. The power of unphased genotype-versus haplotype-based methods in regions with high linkage disequilibrium (LD), as measured by D', for analyzing gene × environment (gene × sex) interactions was compared using the Genetic Analysis Workshop 15 (GAW15) simulated data on rheumatoid arthritis with prior knowledge of the answers. Stepwise and regular conditional logistic regression (CLR) was performed using a matched case-control sample for a HLA region interacting with sex. Haplotype-based analyses were performed using a haplotype-sharing-based Mantel statistic and a test for haplotype-trait association in a general linear model framework. A step-down minP algorithm was applied to derive adjusted p-values and to allow for power comparisons. These methods were also applied to the GAW15 real data set for PTPN22. For markers in strong LD, stepwise CLR performed poorly because of the correlation/collinearity between the predictors in the model. The power was high for detecting genetic main effects using simple CLR models and haplotype-based methods and for detecting joint effects using CLR and Mantel statistics. Only the haplotype-trait association test had high power to detect the gene × sex interaction. In the PTPN22 region with markers characterized by strong LD, all methods indicated a significant genotype × sex interaction in a sample of about 1000 subjects. The previously reported R620W single-nucleotide polymorphism was identified using logistic regression, but the haplotype-based methods did not provide any precise location information.


Background
The inclusion of gene × environment interaction terms in association analyses may improve the power to detect genetic effects and may contribute to the identification of important environmental effect modifiers [1]. Current technology makes it possible to genotype very dense sets of markers and the underlying genomic structure can be captured using haplotypes. The merits and drawbacks of haplotype-based compared with unphased genotypebased methods in the analysis of joint effects of genetic and non-genetic factors in regions with high linkage disequilibrium (LD, as measured by D') are of particular interest. When investigating candidate genes using dense markers that are in strong LD, the goal is to distinguish potentially causal variant(s) from those showing association merely due to LD. Using the Genetic Analysis Workshop 15 (GAW15) simulated data, we compared the power of three genotype-and haplotype-based methods to account for gene × environment interactions both to identify the genetic region and to identify the causal variant(s) in case-control scenarios. The methods are ordinary conditional logistic regression (CLR) and stepwise conditional logistic regression (SCLR) [2], Mantel statistics using haplotype sharing [3], and a haplotype-trait association statistic using a maximum likelihood estimate to test for interaction effects [4]. In contrast to the haplotypetrait association statistic, which provides a statistical test for each haplotype, CLR, SCLR, and the Mantel statistics provide a statistical test for each genetic marker. The methods also differ in the number of tests performed as well as in the number of degrees of freedom. To account for multiple hypotheses testing when investigating multiple single-nucleotide polymorphisms (SNPs) in a candidate gene and to obtain comparable power estimates, we applied a step-down minP algorithm [5].
In addition, we explored the performance of these methods in the GAW15 real data set for PTPN22 using markers in strong LD (in terms of D'). The data are a subset of the published data on rheumatoid arthritis (RA) [6]. RA is a complex autoimmune disease more common in women and in smokers [7] and with a moderately strong genetic component. Genetic association with RA was found in the PTPN22 gene [6,8,9], where the SNP R620W has been found to have a stronger effect in males than in females [6,10].

Simulated data sets
From the GAW15 simulated data modeling RA, we selected 100 replicate samples to represent a matched case-control study. For each replicate, the first affected offspring within an affected sibling pair of the first 500 families (1500 families in total) was chosen as case. Out of 2000 unrelated unaffected controls, 500 were matched to the cases by age and sex. The proportion of females:males in the case-control pairs was 2:1. In the simulated HLA region from the high-density scan on chromosome 6 with 18,000 SNPs, the disease loci DR and C have the same physical position and are in high LD. The DR locus has been simulated to have a strong effect on the risk of RA independently of sex, while locus C was generated to interact with sex, such that it increases risk only in women. We investigated these loci and the 10 flanking SNPs on either side, spanning a region of about 300 kb. The alleles of the multiallelic DR locus were recoded as biallelic based on the answers, i.e., the risk allele vs. all other alleles combined.

Patients and genotypes in the RA sample
For the analysis of the real data set on PTPN22, we used all 1001 individuals from the NARAC (North American Rheumatoid Arthritis Consortium) sample for whom genotypes were available, these were 665 unrelated RA cases and 336 unrelated unaffected controls (Table 1, see Plenge et al. [6] for details of ascertainment). For the analysis of interaction effects of SNPs and smoking, we excluded individuals whose smoking status was not known. The control sample is not representative of the population regarding the distribution of sex and smoking because it includes a higher percentage of females and smokers than the case sample. Thus the estimated main effects of sex and smoking are strongly biased and not presented here. All 14 SNPs provided in PTPN22, which span approximately 58 kb, were investigated ( Table 2).

Logistic regression
We investigated four different sets of CLR models: 1) the main effect of each marker individually in 22 separate regression models; 2) only the joint effect of each genotyped marker individually with sex as an interaction term. This approach was considered in order to compare CLR and the Mantel statistics for joint effects, M 1 , mentioned below; 3) both the main effect of each marker individually and their interaction with sex; 4) SCLR (stepwise CLR with backwards model selection based on the Akaike information criterion) with the full model including all main and first-order interaction effects of the 22 markers simultaneously. The SCLR approach was proposed specifically to distinguish between causal variants and those merely in LD [2]. Note that these analyses only use unphased multilocus genotype data. We used an additive genetic model on the logit scale, i.e., multiplicative on the odds ratio scale, without dominance effects to reduce the number of degrees of freedom.
We employed analogous unconditional LR models for the real data on PTPN22 because cases and controls were not matched. Here, gene × environment (current smoking) and gene × sex interactions were investigated. All CLR and LR analyses were performed using the computer program R (version 2.3.1), using the general linear model (glm) procedure and the step function for stepwise model selection.

Haplotype-based analysis 2.1 Mantel statistic using haplotype sharing
We applied the approach of Mantel's statistics for spacetime clustering to correlate genetic and phenotypic similarity [3]:  and denote the environmental factor of s i and s j . M 1 is a test for joint effects and does not distinguish between genetic main effects and interaction effects.

Haplotype-trait association
We applied a test for haplotype-trait association in a general linear model framework, using maximum likelihood estimates for main and interaction effects of haplotypes and non-genetic factors, allowing for haplotype phase to be ambiguous [4,11]. Two regression models were applied. The first model contained sex and all main effects of haplotypes. The second model included all main effects and first-order haplotype-sex interaction effects. The effects of the haplotypes were modeled as additive. We first used the most frequent haplotype, which included the risk alleles on the loci DR and C, as baseline category.
We also analyzed the data using a less frequent haplotype, which did not include any risk alleles, as the baseline haplotype. Certainly, without prior knowledge of the answers, we would not have done so. Only haplotypes with a frequency of at least 5% were considered. We modified the function haplo.glm, which is included in the haplo.stats R-library [4].
For the PTPN22 data set, we used a model which included the main effects of the haplotypes, sex, and smoking, as well as the first-order haplotype-sex or haplotype-smoking interaction terms.

Permutation procedure and step-down minP adjusted p-values
The numbers of tests and degrees of freedom differed between the statistical methods and models. Thus, we permuted the case-control status while keeping together genotypes and sex-as well as smoking status in the analysis of the real data-for each individual, and calculated adjusted p-values by a step-down minP algorithm [5].

Simulated data
As depicted in Table 3, power of CLR was generally high for Model 1 to detect the genetic main effect at the DR locus and for Model 2 to detect the joint effect of the SNPs and sex. For SNPs 5 and 15 only, there was hardly any power. In contrast, modeling of both a genetic main effect and an interaction effect resulted in very low power to detect the interaction and low to moderate power to detect the genetic main effect. SCLR performed highly unsatisfactorily and had very low power for all effects modeled.
The haplotype-sharing-based Mantel statistics had 100% power for all markers both for the genetic main effect and for the joint effect (Table 4) even when no more than 50 case-control pairs were investigated (data not shown). For the haplotype-trait association test, we present results only for the four haplotypes, which were observed in at least 80 of 100 samples ( Table 2). The GAW15 data were simulated such that the allele coded 3 at the DR locus increases RA risk while the allele coded as 1 at locus C was simulated to increase risk for RA only in women. Thus it was surprising that both risk-related alleles were also included in the most common haplotype, the reference haplotype by default. All three haplotypes indicated the main and the interaction effect with power estimates ranging between 0.82 and 1, and 0.65 and 0.78, respectively. We reexamined the data using as reference the second most frequent haplotype, which did not comprise the risk alleles. The estimated power was moderate to high for the detection of the main effect (0.57 to 1) and low for the detection of the interaction effect with sex (0.08 to 0.51, data not shown).

Real data
In the PTPN22 region, LR identified significant effects at R620W (p = 0.04) and rs1217413 (0.05), considering main effects only and when considering also interaction effects with sex (p = 0.007 and p = 0.026, respectively) ( Table 2). The remaining SNPs surrounding R620W did not show significant results. An interaction effect with smoking was not observed. As observed for the simulated data, power was low for all effects modeled using stepwise LR (data not shown).
The Mantel statistics did not yield significant main effects of the investigated SNPs with the lowest adjusted p-value at 0.23 (Table 2). However, evidence for joint effects (M 1 ) both with sex and with smoking was found (lowest adjusted p-values of 0.02 and <0.001, respectively). Six common haplotypes with frequencies ≥ 5% were estimated in PTPN22 (Table 5). Applying the haplotype-trait association test, one haplotype detected with a frequency 13.6% was associated with a significantly increased risk in the model comprising only main effects. This haplotype differed from all other haplotypes at locus R620W (Table  5). One haplotype with frequency 8.52%, which was found to be significant in the haplotype-sex analysis, was associated with a decreased risk. Compared to the reference haplotype, this haplotype carries the same allele at R620W, but differs for SNPs rs12730735 and rs1217414. In both cases, the respective alleles are also present on other haplotypes, which do not show an interaction, thus a specific interacting disease variant could not be identified. There was no evidence for a haplotype-smoking interaction.

Discussion
Current technology permits genotyping of dense marker sets across the whole genome. Once associated regions have been identified, efforts will be made to confirm these z s j results in independent samples by a hypothesis-based candidate gene approach in which control of the type I error rate will be of great importance. Accounting for gene × environment interaction given moderate genetic main effects may improve the power to detect or confirm an association with a gene, a genetic region, or causal vari-ant(s). Given this scenario, we investigated the performance of some methods proposed for detecting genetic main and gene × environmental interaction effects. We found that stepwise LR or CLR performed poorly given strong LD (as measured by D') between markers-a phenomenon called (multi-)collinearity that results in over-  fiting of the model and weak robustness [12]. It is well known that automated stepwise methods do not necessarily produce the best model if there are redundant predictors; often the model created will have poor predictive accuracy [13][14][15]. Although stepwise model selection was proposed for elucidating which of several markers in LD might be functionally relevant [2], this strategy seems to be inefficient when strong LD (in terms of D') exists. In this situation, ordinary LR performs very well when only the main effect of each marker is tested separately in a model and the p-values are adjusted for the total number of tests performed. The low power for SNP 5 is likely due to low LD (r 2 ≈ 0) with the disease loci and non-matching allele frequencies [16]. This does not appear to apply for SNP 15, where power is low but LD is stronger. All other markers had a power of about 100% for the genetic main effect despite mostly low r 2 -values between these markers and the disease loci. Therefore, in the simulated scenario, it was not possible to localize the functional variant within this region, e.g., there was no general tendency for the true causal locus to have the smallest p-value or largest effect size estimate. The simulated genetic main effect was also much stronger than the interaction effect, so that there was no improvement adding a gene × sex interaction term in the model. A much larger sample size would have been necessary for detecting the interaction effect.
There was only a limited number of haplotypes estimated from the 22 simulated markers due to the strong LD (as measured by D') between them. Both haplotype-based methods were able to detect association in the investigated region but they failed to identify the disease variant(s). The findings regarding joint effects suggest that analysis for joint effects may be an option for detecting genetic effects in some situations. For smaller sample sizes, the Mantel statistics for joint effects (M 1 ), however, is more powerful than a similar LR modeling of interaction effects (data not shown).
For the analysis of PTPN22 in RA, we used a subset of the data of Plenge et al. [6], who identified an interaction of R620W with sex in a case-only analysis in their complete data set of about 4100 RA cases and controls. Using the different LR models, we were also able to detect this genotype × sex interaction in a subsample of 1001 subjects at R620W. The lack of significance for the genetic main effect in the model including both main and interaction terms, indicated that the effect of R620W indeed depends on the gender of the genotype carrier. Furthermore, it was possible to distinguish R620W from neighboring markers with allele frequencies that did not match and low r 2 -values [16]. The Mantel statistic failed to identify main effects but showed strong evidence for a joint effect both with gender and with smoking in model M 1 . These significant effects may however be due to apparent strong main effects of sex and smoking caused by an unrepresentative control sample and may not reflect a true interaction effect. No specific marker could be detected as disease-causing polymorphism because of the dependency of the test statistics at different markers. By contrast, the results of the haplotype association method haplo.glm were similar to the results of the LR models implicating the R620W locus as a risk locus for RA. However, it is unclear whether the significant haplotype-sex interaction observed for a different haplotype refers to a further disease variant.

Conclusion
Our results suggest that stepwise or other automated variable selection methods are not suitable for the investigation of gene × environment interactions in regions with high LD, as measured by D'. Given strong genetic main effects and moderate gene-sex interaction, as in the simu-  Loci in the table are listed as described in Table 4. a Established risk locus R620W is bold, underlined b p-Values ≤ 0.05 are given in bold.
lated data, both simple CLR and haplotype-based methods have adequate power for the detection of genetic main effects without considering gene × sex interaction in moderate sample sizes. In the simulated scenario, haplotypetrait association tests had better power than simple LR modeling to detect moderate gene × sex interactions. However, this result cannot be generalized to other situations without further simulation studies under different genetic models. On the other hand, haplotype-based methods that always use data from consecutive markers tend to yield similar results for neighboring markers, thus making localization of a causal variant potentially more difficult than with genotype-based methods that consider each marker at a time. Thus, the potential for localization of causal variants with each method should also be explored in further simulation studies.