Primary analysis
Evidence of heterogeneity was detected under the linkage model for all five singleton phenotypes. Specifically, in the two-class model, every phenotype showed a slope consistent with zero in one class and a statistically significant negative slope in the other ("linked") class (see Table 1). For all phenotypes, the estimated slope in the "linked" class is more negative than the "marginal" slope (i.e., the slope obtained in the one-class model). As one would expect, the marginal slope appears to be an average of the negative and zero slopes in the two classes.
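The averaging behavior described above can be illustrated with a small, noise-free sketch. It is not the latent class fit itself: the IBD-sharing values, the intercept, the "linked" slope, and the class sizes below are all made up for illustration, and the two classes are assumed to have the same distribution of IBD sharing. Under those assumptions, the pooled (one-class) least-squares slope equals the "linked" slope scaled by the fraction of pairs in the "linked" class.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def expected_sq_diff(ibd_values, intercept, slope):
    """Noise-free expected squared sib-pair trait difference at each IBD value."""
    return [intercept + slope * pi for pi in ibd_values]


# Hypothetical IBD-sharing proportions for one block of sib pairs.
ibd_block = [0.0, 0.5, 0.5, 1.0]

# Hypothetical parameters: one block of "linked" pairs, three of "unlinked".
linked_slope = -4.0
intercept = 6.0
n_linked_blocks, n_unlinked_blocks = 1, 3

x_linked = ibd_block * n_linked_blocks
y_linked = expected_sq_diff(x_linked, intercept, linked_slope)
x_unlinked = ibd_block * n_unlinked_blocks
y_unlinked = expected_sq_diff(x_unlinked, intercept, 0.0)

# One-class ("marginal") fit pools both classes.
marginal_slope = ols_slope(x_linked + x_unlinked, y_linked + y_unlinked)
linked_fraction = n_linked_blocks / (n_linked_blocks + n_unlinked_blocks)

print(marginal_slope, linked_fraction * linked_slope)
```

In this toy setting the marginal slope is one quarter of the "linked" slope because one quarter of the pairs are "linked"; with real data the relationship holds only approximately.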
We assigned each family to its more likely class, on the basis of estimates of family-specific probabilities of latent class membership. In general, the classes were well separated, with estimated class membership probabilities close to zero or one. For families classified as "linked", the estimated probability of membership in the "linked" class ranged from 0.918 to 1.000, with a mean of 0.988. For families classified as "unlinked", the estimates ranged from 0.000 to 0.024, with a mean of 0.001. We found that relatively few families (two or three) contributed to the overall evidence of linkage, despite the strong marginal effect. This suggests that, for these five phenotypes, we detected QTLs with relatively rare alleles that have strong effects.
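The assignment rule described above can be sketched as follows. This is a minimal illustration, assuming that "more likely class" corresponds to a cutoff of 0.5 on the estimated probability of "linked" membership; the family labels and probabilities are invented and are not the study data.

```python
def classify_families(linked_probs, cutoff=0.5):
    """Assign each family to its more likely class.

    linked_probs maps a family ID to its estimated probability of
    membership in the "linked" class. The 0.5 cutoff is an assumption.
    """
    return {
        fam: ("linked" if p > cutoff else "unlinked")
        for fam, p in linked_probs.items()
    }


def summarize_class(linked_probs, labels, label):
    """Return (min, max, mean) of the "linked" probabilities within one class."""
    values = [linked_probs[f] for f in linked_probs if labels[f] == label]
    if not values:
        return None
    return min(values), max(values), sum(values) / len(values)


# Invented example probabilities (hypothetical family IDs).
example_probs = {
    "fam_A": 0.999,
    "fam_B": 0.918,
    "fam_C": 0.001,
    "fam_D": 0.024,
    "fam_E": 0.000,
}

labels = classify_families(example_probs)
for cls in ("linked", "unlinked"):
    print(cls, summarize_class(example_probs, labels, cls))
```

When the probabilities cluster near zero and one, as reported here, the choice of cutoff has little influence on the resulting classification.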
We repeated the linkage scans for each class separately. Figure 1 shows the full-sample and class-specific linkage scans for all five phenotypes, using markers on the trans peak chromosome. The stratified scans were implemented in S.A.G.E. [7]. These graphs suggest that we can effectively use the latent class model to identify the subset of families that provide most of the evidence of linkage.
In addition to the chromosome-specific linkage scan, we performed a stratified, genome-wide linkage scan for each singleton phenotype. By omitting those families whose phenotypic variation is already explained (the "linked" families above), we might gain power to detect additional QTLs. This reasoning assumes that when, for example, families in the sample are segregating for two distinct QTLs, each individual family segregates for one or the other, but not both. Of course, this is not necessarily true. However, if the less common allele at each of the QTLs is rare, it may be a good working assumption. In the linkage scans, the evidence for linkage is measured by the negative log p-value from H-E regression. Across phenotypes, the maximum negative log p-value at a new linkage peak in the "unlinked" families ranged from 3.3 for CBR1 (up from 0.3 in all families) to 4.1 for DSCR2 (up from 2.2 in all families). In the original analysis, Morley et al. [4] used two different levels of stringency for determining the genome-wide significance of evidence for linkage; the less stringent threshold was a negative log p-value of 4.4. Thus, while the evidence of linkage to other regions in the genome did increase in the "unlinked" families, none of the new linkage peaks achieved genome-wide significance by this criterion.
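The comparison against the less stringent threshold can be checked with a few lines of arithmetic. The sketch below uses only the two phenotypes whose values are quoted in the text (CBR1 and DSCR2); the function and variable names are illustrative, and treating a peak as significant when it meets or exceeds the threshold is an assumption.

```python
# Less stringent genome-wide threshold (negative log p-value) quoted in the text.
THRESHOLD = 4.4

# Negative log p-values quoted in the text: all families vs. "unlinked" families.
peaks = {
    "CBR1": {"all": 0.3, "unlinked": 3.3},
    "DSCR2": {"all": 2.2, "unlinked": 4.1},
}


def summarize_peak(all_families, unlinked, threshold=THRESHOLD):
    """Report the gain in evidence and whether the new peak reaches the threshold."""
    return {
        "gain": round(unlinked - all_families, 1),
        "significant": unlinked >= threshold,  # assumed rule: meet or exceed
    }


results = {name: summarize_peak(v["all"], v["unlinked"]) for name, v in peaks.items()}
for name, res in results.items():
    print(name, res)
```

Both peaks gain evidence in the stratified scan, yet neither reaches 4.4, which matches the conclusion stated above.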
Secondary analysis
We next applied the latent class method to the chromosome 14 and 20 phenotypes. Specifically, we analyzed data for 29 phenotypes with significant evidence of linkage to a 5-Mb region on chromosome 14. Again, we found that a relatively small number of families contributed to the overall evidence of linkage (range 1 to 7, average 2.4). In fact, for 12 phenotypes, only one family was classified as "linked". Surprisingly, a single family (CEPH 1418) was classified as "linked" for all phenotypes. We repeated the linkage scan after removing that family and found that only one phenotype continued to exhibit linkage to chromosome 14. We compared the expression values of family 1418 to those of the other families and found the mean and range of expression values to be quite similar. Thus, there is nothing remarkable about this family with respect to phenotypic values. The strong dependence of the linkage findings on this family might indicate a relatively rare allele at a QTL with a very strong effect, detected primarily in one family.
We repeated the analysis described above for 24 expression phenotypes with significant linkage to a 5-Mb region on chromosome 20. In contrast to the results for the chromosome 14 phenotypes, we found that different families contributed linkage evidence for different phenotypes. All families contributed evidence for at least two phenotypes, and the maximum number of phenotypes to which any one family contributed was 14. In general, somewhat larger numbers of families contributed to the evidence of linkage for the chromosome 20 phenotypes (range 1 to 9, average 3.6) than for the chromosome 14 phenotypes.
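The family-level summaries reported for the chromosome 14 and chromosome 20 phenotypes (number of "linked" families per phenotype, families "linked" for every phenotype, and number of phenotypes per family) can be tabulated as in the sketch below. The input is a toy mapping from phenotype to the set of families classified as "linked"; the phenotype names and all family labels other than 1418 are hypothetical, and the memberships are invented.

```python
from collections import Counter


def linked_family_summary(linked_by_phenotype):
    """Summarize which families are classified as "linked" across phenotypes.

    Returns (families per phenotype, families linked for every phenotype,
    number of phenotypes per family).
    """
    family_sets = {ph: set(fams) for ph, fams in linked_by_phenotype.items()}
    families_per_phenotype = {ph: len(fams) for ph, fams in family_sets.items()}
    if family_sets:
        linked_for_all = set.intersection(*family_sets.values())
    else:
        linked_for_all = set()
    phenotypes_per_family = Counter(
        fam for fams in family_sets.values() for fam in fams
    )
    return families_per_phenotype, linked_for_all, phenotypes_per_family


# Toy input: invented memberships, not the study results.
toy = {
    "phen_1": {"1418"},
    "phen_2": {"1418", "fam_X"},
    "phen_3": {"1418", "fam_X", "fam_Y"},
}

per_phenotype, in_all, per_family = linked_family_summary(toy)
counts = list(per_phenotype.values())
print("range:", min(counts), "to", max(counts), "mean:", sum(counts) / len(counts))
print("linked for all phenotypes:", sorted(in_all))
print("max phenotypes for one family:", max(per_family.values()))
```

A non-empty intersection corresponds to the chromosome 14 pattern, in which one family is "linked" for every phenotype; an empty intersection with a flatter per-family count corresponds to the chromosome 20 pattern.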