The genotyped sample of families provided by NARAC to the Genetic Analysis Workshop 15 consisted of 757 families (8017 individuals), in which at least one individual per family was genotyped [5, 6]. We initially examined evidence for two-locus linkage using 380 autosomal microsatellite markers in 511 NARAC families, with 627 affected full-sib pairs (ASP), 29 affected maternal half-sib pairs (AMHSP), and 2 affected paternal half-sib pairs (APHSP). The peaks in the 2D surface were also tested for two-locus evidence for linkage using all of the available genotype data in the NARAC collection. There were 744 families (with 911 ASP, 43 AMHSP, and 2 APHSP) with genotypes available for the autosomal microsatellite markers (380 markers), for the autosomal single-nucleotide polymorphism (SNP) markers (5407 markers), or for all microsatellite and SNP autosomal markers (5787 markers). Marker order was based on build 18 of the human genome and genetic distances were obtained from the Rutgers map [7]. Where genetic distances were unavailable, we used physical distances to linearly interpolate the corresponding Rutgers cM location; where markers were located beyond the ends of the Rutgers map, we assumed that 1 cM = 1 Mb. In the lod-score calculations we used Haldane map units and founder allele frequencies, and when SNP markers were included in the analysis, we clustered markers using a linkage disequilibrium threshold of r² > 0.1.
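The map construction described above can be illustrated with a minimal sketch. It assumes that each marker has a physical position in Mb and that a sorted set of reference markers with known physical and genetic positions is available. The function names and the input lists are hypothetical and are not part of the original analysis pipeline. The Haldane map function is included only to show how a map distance translates into a recombination fraction.

```python
from bisect import bisect_left
import math


def interpolate_cm(pos_mb, ref_mb, ref_cm):
    """Estimate a genetic position (cM) from a physical position (Mb).

    ref_mb and ref_cm are hypothetical parallel lists of reference markers,
    sorted by physical position, with strictly increasing ref_mb.
    """
    if pos_mb <= ref_mb[0]:
        # Beyond the start of the reference map: assume 1 cM = 1 Mb.
        return ref_cm[0] - (ref_mb[0] - pos_mb)
    if pos_mb >= ref_mb[-1]:
        # Beyond the end of the reference map: assume 1 cM = 1 Mb.
        return ref_cm[-1] + (pos_mb - ref_mb[-1])
    # Find the flanking reference markers and interpolate linearly.
    i = bisect_left(ref_mb, pos_mb)
    x0, x1 = ref_mb[i - 1], ref_mb[i]
    y0, y1 = ref_cm[i - 1], ref_cm[i]
    return y0 + (y1 - y0) * (pos_mb - x0) / (x1 - x0)


def haldane_theta(distance_cm):
    """Recombination fraction for a map distance in cM (Haldane map function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))
```

For example, a marker lying halfway between two reference markers in physical distance is assigned the midpoint of their genetic positions, while a marker 2 Mb past the last reference marker is placed 2 cM beyond it.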
Non-parametric linkage analysis was carried out initially using the single-locus maximum lod score (MLS) test statistic [8]. Two-locus non-parametric linkage analysis was performed using the two-locus extension of the MLS [9, 10] implemented in Merloc [3]. Merloc uses likelihood estimates from Merlin [11] to estimate the joint two-locus allele sharing probabilities, which are then used to calculate the two-locus MLS by numerical maximization. The two-locus MLS under the most general two-locus model (GEN) is a function of the eight variance components at the two loci: the additive and dominance variances at locus 1 (VA1 and VD1) and locus 2 (VA2 and VD2), and the four epistatic variances (VA1A2, VA1D2, VD1A2, and VD1D2). Different genetic models can be fitted to the data by restricting the number of free variance components; for example, the two-locus additive (ADD) and multiplicative (MUL) models and the single-locus (SL) models are all nested within the general epistatic model.
In the variance-component framework, epistasis can be defined as a departure from an additive or a multiplicative two-locus penetrance function. To assess the evidence for epistasis as a departure from additivity, the two-locus MLS under the general model may be compared to the MLS under the nested additive model with no interaction terms (GEN-ADD). Alternatively, a test for epistasis would compare the MLS under the general model with that under the multiplicative model (GEN-MUL). The MLS under a two-locus epsilon-epistatic model [3] (where a single parameter, ε, captures the degree of epistasis) and the maximum-likelihood estimate of ε can also be used to indicate the degree of epistasis (with ε = 0 corresponding to an additive model, ε = 1 to the multiplicative model, and ε > 10³ to an extreme epistatic model).
We used previously published significance thresholds [3] to assess the significance of our findings of two-locus linkage compared to a null model in which neither locus affects the trait, where GEN = 5.85 corresponds to a 2D genome-wide type I error rate of 0.05, and GEN = 4.30 corresponds to suggestive evidence for two-locus linkage. To assess the significance of the two-locus linkage results for pairs of loci that included 6p21, we performed two-locus simulations of chromosome pairs 6–16, 6–18, 6–8, and 6–1, keeping chromosome 6 fixed (real data) against 1000 replicates of the simulated second chromosome (null effect). To estimate the point-wise significance of our findings of epistasis, defined as a departure from either additivity or multiplicativity, we simulated two fully informative markers under the null two-locus model of interest (ADD for GEN-ADD and MUL for GEN-MUL). This was achieved by sampling from the two-locus allele sharing distribution estimated under the null two-locus model (ADD or MUL) during the linkage analysis of our actual data. This procedure was performed at each 2D coordinate of interest, and 100,000 replicates were used to obtain the GEN-ADD or GEN-MUL thresholds at that coordinate. A similar approach was applied to assess the significance of a secondary locus at 6p22. We generated 100,000 replicates of fully informative markers by sampling from the observed 6p21 single-locus allele sharing distribution, and subsequently analyzed the replicates under the GEN model.
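As an illustration of how a point-wise significance level might be derived from such replicates, the sketch below compares an observed statistic with the statistics obtained from null replicates. It assumes that the GEN-ADD (or GEN-MUL) statistic is the difference between the two MLS values, which is one reading of the notation rather than a statement of how Merloc computes it. The function names are hypothetical, and the add-one correction in the empirical p-value is a common convention adopted here as an assumption, not a detail taken from the analysis.

```python
def epistasis_statistic(mls_general, mls_null_model):
    """Difference between the MLS under GEN and under the nested null model.

    Assumption: GEN-ADD and GEN-MUL are treated as simple differences.
    """
    return mls_general - mls_null_model


def empirical_p_value(observed, null_statistics):
    """Point-wise empirical p-value from simulated null replicates.

    Counts replicates whose statistic is at least as large as the observed
    value; the add-one correction (an assumption here) avoids reporting
    a p-value of exactly zero.
    """
    n_replicates = len(null_statistics)
    n_exceeding = sum(1 for s in null_statistics if s >= observed)
    return (n_exceeding + 1) / (n_replicates + 1)
```

With 100,000 replicates, the smallest p-value that this estimator can return is 1/100,001, which bounds the resolution of the point-wise significance levels.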