Sample
The simulated data set from GAW15 Problem 3 consists of a 5-cM microsatellite genome scan for each of 100 replicates, in which each replicate represents a random sample of 1500 ASPs with RA and their parents (four-person pedigrees). Data from all replicates were analyzed with researchers unblinded to the simulation parameters. The DR locus on chromosome 6 was simulated as the primary disease susceptibility locus, with additional genetic and environmental factors affecting the risk of disease. Only Locus A on chromosome 16 was simulated as an effect modifier on the risk of RA due to the DR locus. Thus, it was used as the test locus for gene × gene interaction. The DR locus has three alleles, X, 1, and 4, with frequencies 0.65, 0.1, and 0.25, respectively. Locus A is diallelic and acts in a dominant fashion, with a frequency of 0.3 for the risk allele "A". Assuming Hardy-Weinberg proportions at the DR locus and holding other risk factors constant, the marginal risk of RA due to the DR locus is 5.2 in individuals with the A allele at Locus A, and it decreases to 3.5 in individuals who are homozygous for the low-risk "a" allele at Locus A.
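As a small worked illustration of the Hardy-Weinberg assumption above, the following sketch derives the DR genotype frequencies from the stated allele frequencies. The function name `hwe_genotype_freqs` is hypothetical, and the carrier frequency at Locus A additionally assumes that the stated 0.3 is an allele frequency and that Hardy-Weinberg proportions also hold at Locus A, neither of which is stated explicitly in the text.

```python
from itertools import combinations_with_replacement

# Allele frequencies at the DR locus, as stated in the text.
dr_freq = {"X": 0.65, "1": 0.10, "4": 0.25}

def hwe_genotype_freqs(allele_freq):
    """Unordered genotype frequencies under Hardy-Weinberg proportions."""
    freqs = {}
    for a, b in combinations_with_replacement(list(allele_freq), 2):
        p, q = allele_freq[a], allele_freq[b]
        # Homozygotes: p^2; heterozygotes: 2pq.
        freqs[f"{a}/{b}"] = p * p if a == b else 2 * p * q
    return freqs

dr_genotypes = hwe_genotype_freqs(dr_freq)

# Assumption: 0.3 is the frequency of risk allele "A" and Locus A is in
# Hardy-Weinberg proportions, so carriers are everyone except a/a.
carrier_freq_A = 1 - (1 - 0.3) ** 2

for genotype, freq in dr_genotypes.items():
    print(f"{genotype}: {freq:.4f}")
print(f"carriers of A: {carrier_freq_A:.2f}")
```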
Multipoint allele sharing from ASPs was determined using GENIBD (S.A.G.E. v5.2). Parental genotype data were recoded to missing for deceased individuals. Several coding schemes for the covariate DR locus were examined: 1) the X allele under an additive genetic model; 2) the 4 allele under an additive model; and 3) a linear combination of the covariates based entirely on the simulated risk levels provided in the solutions. The "linear" coding for each individual, given their DR locus genotype, was constructed as follows: 1) the "X/X" genotype was assigned a value of 0; 2) the "X/1" and "X/4" genotypes were assigned a value of 1; 3) the "4/4" genotype was assigned a value of 2; and 4) the "1/4" and "1/1" genotypes were assigned a value of 3. This coding scheme was designed to capture in a simple fashion the increased risk associated with the DR1 and DR4 alleles, on the basis of the risk multipliers, which are 0.8, 1, 2, and 6 for the four coded groups, respectively.
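The coding schemes described above can be sketched as simple lookups. This is a minimal illustration, not the code used in the analysis; the names `linear_code` and `additive_code` and the "allele/allele" string format are assumptions made for the example.

```python
# "Linear" coding from the text, keyed by the unordered set of alleles
# so that "X/4" and "4/X" are treated identically.
LINEAR_CODE = {
    frozenset(["X"]): 0,       # X/X
    frozenset(["X", "1"]): 1,  # X/1
    frozenset(["X", "4"]): 1,  # X/4
    frozenset(["4"]): 2,       # 4/4
    frozenset(["1", "4"]): 3,  # 1/4
    frozenset(["1"]): 3,       # 1/1
}

def linear_code(genotype):
    """Return the linear covariate value for a genotype such as 'X/4'."""
    return LINEAR_CODE[frozenset(genotype.split("/"))]

def additive_code(genotype, allele):
    """Additive coding: number of copies (0, 1, or 2) of the given allele."""
    return genotype.split("/").count(allele)

print(linear_code("4/X"))          # heterozygote with one X allele
print(additive_code("4/4", "4"))   # two copies of allele 4
```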
Statistical analysis
Power was estimated as the percentage of replicates in which the p-value for linkage on chromosome 16 was less than 0.05. Type I error was determined by taking the chromosomes with no simulated disease or quantitative trait loci and averaging the number of times a replicate exceeded the threshold value of the test statistic at the α = 0.05 level. Within each replicate, the locus with the highest proportion of alleles shared IBD within 20 cM of the DR locus was selected as the point with the most significant evidence for linkage to the DR locus, because linkage can be detected as far as 20 cM away from the causal locus [8].
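The bookkeeping described above can be sketched as follows. This is an illustrative reading rather than the authors' scripts: the function names, the per-chromosome layout of the null p-values, and the toy numbers are all assumptions.

```python
def rejection_rate(p_values, alpha=0.05):
    """Fraction of replicates with p < alpha (used here as the power estimate)."""
    return sum(p < alpha for p in p_values) / len(p_values)

def type_i_error(null_p_by_chromosome, alpha=0.05):
    """Average the rejection rate over chromosomes with no simulated loci.

    Assumption: one p-value per replicate for each null chromosome.
    """
    rates = [rejection_rate(p, alpha) for p in null_p_by_chromosome.values()]
    return sum(rates) / len(rates)

def best_linked_locus(positions_cm, mean_ibd, target_cm, window_cm=20.0):
    """Pick the marker with the highest mean IBD sharing within the window."""
    candidates = [
        (share, pos)
        for pos, share in zip(positions_cm, mean_ibd)
        if abs(pos - target_cm) <= window_cm
    ]
    share, pos = max(candidates)
    return pos, share

# Toy values, for illustration only.
power = rejection_rate([0.01, 0.20, 0.04, 0.50])
alpha_hat = type_i_error({"chr1": [0.01, 0.50], "chr2": [0.60, 0.70]})
locus = best_linked_locus([30, 45, 50, 55, 80], [0.70, 0.55, 0.60, 0.58, 0.90], 50)
```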
1) Conditional Method
Let π be the mean proportion of alleles shared IBD between ASPs at a marker locus. The mean test compares the average amount of allele sharing IBD at a marker locus to the expected value of π = 0.5. Any excess of allele sharing across all sibling pairs is taken as evidence of a disease susceptibility locus. A traditional t-statistic with n - 1 degrees of freedom can be computed to compare the observed allele sharing to the null value of 0.5. A genome scan using the mean test was repeated, selecting only ASPs in which the proportion of alleles shared IBD at the DR locus was greater than or equal to a cut-off value, thus extracting families with evidence for linkage to the DR locus. Three cut-off values were selected: 0.5, 0.7, and 0.9. By testing these subsets, we were effectively applying the 0/1 weights proposed by Cox et al. [1] to select ASPs with evidence for allele sharing at the DR locus. ASPs contributing to linkage at the DR locus should also be linked to the A locus if an interaction exists [1]. Analyses were performed using the mean test in SIBPAL (S.A.G.E. v5.2).
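The conditional analysis can be sketched as a subset selection followed by a one-sample t-statistic. This is a minimal illustration and not the SIBPAL implementation; the function names and the toy IBD values are assumptions.

```python
import math
from statistics import mean, stdev

def select_pairs(pi_test_locus, pi_dr_locus, cutoff):
    """Keep test-locus sharing only for ASPs whose DR-locus sharing >= cutoff.

    This corresponds to 0/1 weighting of sib pairs by DR-locus sharing.
    """
    return [p for p, d in zip(pi_test_locus, pi_dr_locus) if d >= cutoff]

def mean_test(pi_hat, null=0.5):
    """One-sample t-statistic for mean IBD sharing against the null value."""
    n = len(pi_hat)
    t = (mean(pi_hat) - null) / (stdev(pi_hat) / math.sqrt(n))
    return t, n - 1  # statistic and degrees of freedom

# Toy estimated IBD proportions for six sib pairs (illustrative only).
pi_dr = [0.40, 0.60, 0.75, 0.95, 1.00, 0.50]
pi_a = [0.50, 0.55, 0.60, 0.70, 0.80, 0.45]

subset = select_pairs(pi_a, pi_dr, cutoff=0.7)
t_stat, df = mean_test(subset)
```

Running the same function with cut-offs 0.5, 0.7, and 0.9 gives the three conditional scans described above.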
2) Mean Interaction Test Method
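Alternatively, an intercept-only regression model is equivalent to the mean test [2]:

$$\pi_i = \pi_0 + \varepsilon_i,$$

where the errors $\varepsilon_i$ for each ASP $i$ are normally distributed with mean 0 and variance $\sigma^2$. A test for linkage only can be conducted by a likelihood ratio test or by the Wald test $\left((\pi_0 - 0.5)/\mathrm{s.e.}(\pi_0)\right)^2$. The regression-based mean test is extended to allow for the inclusion of a mean-centered covariate $X_i$ with regression coefficient $\beta$, that is, $\pi_i = \pi_0 + \beta X_i + \varepsilon_i$, where $X_i$ captures the joint values of the sibling pairs at the DR locus as described above [2]. In this analysis we used the mean-corrected average of the sibling values:

$$X_i = \frac{x_{i1} + x_{i2}}{2} - \bar{x},$$

where $x_{i1}$ and $x_{i2}$ denote the coded DR values of the two siblings in pair $i$ and $\bar{x}$ is the mean of the pair averages over all ASPs.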
A likelihood ratio test was conducted of the null hypothesis $\pi_0 = 0.5$ and $\beta = 0$ against the alternative that $\pi_0 > 0.5$ or $\beta \neq 0$, with a resultant test statistic that is distributed as a 50:50 mixture of $\chi^2_1$ and $\chi^2_2$ [2]. In addition, using SAS v. 8.1, we performed a Wald test of $\beta = 0$ against the alternative that $\beta \neq 0$, which can be interpreted as a test for interaction.
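The two tests in this section can be illustrated with a short sketch using only the standard library. It is not the SAS analysis: the function names are hypothetical, the regression is fitted by ordinary least squares, and referring the Wald statistic for β to a χ² distribution with 1 degree of freedom is an assumption about the asymptotic reference distribution.

```python
import math

def chi2_sf_1df(t):
    """Upper-tail probability of a chi-square with 1 df."""
    return math.erfc(math.sqrt(t / 2.0))

def chi2_sf_2df(t):
    """Upper-tail probability of a chi-square with 2 df."""
    return math.exp(-t / 2.0)

def mixture_p_value(lrt):
    """P-value under a 50:50 mixture of chi-square(1) and chi-square(2)."""
    if lrt <= 0:
        return 1.0
    return 0.5 * chi2_sf_1df(lrt) + 0.5 * chi2_sf_2df(lrt)

def wald_beta(pi, x):
    """Least-squares fit of pi_i = pi0 + beta * X_i + e_i with X centered.

    Returns (pi0, beta, Wald statistic, p-value); the p-value assumes a
    chi-square(1) reference distribution.
    """
    n = len(pi)
    x_bar = sum(x) / n
    xc = [v - x_bar for v in x]          # mean-centered covariate
    pi0 = sum(pi) / n                    # intercept equals mean sharing
    sxx = sum(v * v for v in xc)
    beta = sum(v * (p - pi0) for v, p in zip(xc, pi)) / sxx
    resid = [p - pi0 - beta * v for p, v in zip(pi, xc)]
    sigma2 = sum(r * r for r in resid) / (n - 2)
    wald = beta ** 2 / (sigma2 / sxx)
    return pi0, beta, wald, chi2_sf_1df(wald)

# Toy data: IBD sharing at the test locus and pair-level DR covariate.
pi0_hat, beta_hat, wald, p_beta = wald_beta([0.4, 0.6, 0.5, 0.7], [0, 1, 2, 3])
p_mix = mixture_p_value(3.84)
```

Because the covariate is mean-centered, the fitted intercept is simply the mean sharing, so the linkage component of the test keeps the same interpretation as in the intercept-only model.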
3) Conditional Logistic Model Method
LODPAL (S.A.G.E. v5.2) implements the conditional logistic model [4], which estimates $\lambda_i$, the recurrence risk ratio for an affected sibling pair that shares $i$ alleles IBD (for $i$ = 0, 1, or 2), with the constraint that $\lambda_2 = 3.634\lambda_1 - 2.634$ [6]. The effect of covariates was assessed by estimating $\lambda_1 = \exp(\beta + \gamma x)$, where $\beta$ measures the genetic effect at the marker, $x$ is the sib-pair covariate, and $\gamma$ is its coefficient. The DR locus was included as a covariate by summing, for each sibling pair, the two siblings' mean-corrected values under the genotype codes described above. A likelihood ratio test was conducted by comparing 2 ln 10 times the difference in LOD scores between models with and without the covariate to a χ² distribution. For this distribution to be valid, loci with LOD = 0 were removed from the analysis, and the denominators for the calculations of type I error and power were adjusted accordingly. This adjustment is needed because LODPAL rounds any negative LOD score up to 0.
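The quantities in this section can be sketched as follows. This is an illustration of the formulas only, not of LODPAL itself. The function names are hypothetical, the critical value is left to the caller because the degrees of freedom are not given here, and dropping a replicate when either model has LOD = 0 is an assumption about how the exclusion was applied.

```python
import math

def lambda1(beta, gamma, x):
    """Recurrence risk ratio for pairs sharing one allele IBD."""
    return math.exp(beta + gamma * x)

def lambda2(lam1):
    """Constraint linking lambda_2 to lambda_1 in the one-parameter model."""
    return 3.634 * lam1 - 2.634

def lrt_from_lod(lod_with, lod_without):
    """Convert a LOD difference to a likelihood ratio statistic (2 ln 10 * dLOD)."""
    return 2.0 * math.log(10.0) * (lod_with - lod_without)

def rejection_rate_nonzero(lod_pairs, critical_value):
    """Rejection rate after excluding replicates with a LOD of exactly 0.

    lod_pairs: (LOD with covariate, LOD without covariate) per replicate.
    Returns (rate, denominator); the denominator counts retained replicates.
    """
    kept = [(a, b) for a, b in lod_pairs if a != 0.0 and b != 0.0]
    if not kept:
        return float("nan"), 0
    hits = sum(lrt_from_lod(a, b) > critical_value for a, b in kept)
    return hits / len(kept), len(kept)

# Toy example; 3.84 is a hypothetical threshold (the 0.05 critical value
# of a chi-square with 1 df), used for illustration only.
rate, denom = rejection_rate_nonzero([(2.0, 1.0), (0.0, 0.0), (1.1, 1.0)], 3.84)
```

Note that $\lambda_1 = 1$ gives $\lambda_2 = 1$ under the constraint, so the no-linkage null is recovered when $\beta + \gamma x = 0$.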