### Statistical model

Let *Y*_{
ij
}^{n}denote the expression of gene *n* in member *j* of family *i* and let *G*_{
ij
}^{m}be the corresponding SNP genotype at marker *m* at location *x*_{
m
}. For the means and covariances of the expression traits, we adopted a generalized estimating equations model of the form used by Thomas et al. [7]

*E*(*Y*_{
ij
}^{n}) ≡ μ_{
ij
}^{n}= α_{0}^{n}+ Σ_{
m
}*A*^{nm}*G*_{
ij
}^{m} (1)

*E*(*C*_{
ijk
}^{n}) ≡ χ_{
ijk
}^{n}= β_{0}^{n}+ *B*^{n}*Z*_{
ijk
}(*X*^{n}), (2)

where *C*_{
ijk
}^{n}= (*Y*_{
ij
}^{n}- μ_{
ij
}^{n})(*Y*_{
ik
}^{n}- μ_{
ik
}^{n}) and *Z*_{
ijk
}(*x*) are the estimated *E*(*IBD*_{
ijk
}(*x*)|**G**_{
i
}) at chromosomal location *x* for pairs (*j*, *k*) from nuclear family *i*, based on the complete multilocus marker data. *X*^{n}is a latent variable for location of the unobserved causal locus linked to expression trait *n*. For *j* = *k*, *V*(*Y*_{
ij
}^{n}) = χ^{n}models the gene expression variance in Eq. (2).

In Eq. (1), the regression coefficients *A*^{nm}are modeled as a mixtures of null values with probabilities 1-π^{nm}and a normal distribution of non-null values with means α^{nm}expressed in terms of row and column effects:

*A*^{nm}~ (1 - π^{nm}) δ (**0**) + π^{nm}*N*(α^{nm}, σ^{2}), (3)

where

α^{nm}= γ_{0}^{A}+ γ_{1}^{A}I(*x*_{
m
}∈ *R*_{
n
}) + *e*_{
m
}^{A}+ *h*_{
n
}^{A} (4a)

logit(π^{nm}) = γ_{0}^{P}+ γ_{1}^{P}I(*x*_{
m
}∈ *R*_{
n
}) + *e*_{
m
}^{P}+ *h*_{
n
}^{P}. (4b)

The parameter γ_{1} distinguishes between *cis* and *trans* effects, a *cis* interaction occurs when the chromosomal location *x*_{
m
}of SNP *m* is within the interval *R*_{
n
}, the alignment region for the gene expression probe *n*. The random effects **e** and **h** are distributed as

(*e*_{
m
}^{A}, *e*_{
m
}^{P}) ~ *N*_{2}(**0**, **T**) (5a)

(*h*_{
n
}^{A}, *h*_{
n
}^{P}) ~ *N*_{2}(**0**, **W**) (5b)

and the γs, **T**, **W** have uninformative normal and Wishart priors.

The regression coefficients *B*^{n}in the covariance model in Eq. (2) are handled similarly, except that we assume each trait has at most one region linked to it. (This is not essential to the method, because Eq. (2) could be extended to a summation over multiple independent linkage regions, but it would not make sense to offer all marker locations simultaneously, since the IBD variables are highly correlated from one location to the next.) Thus, we assume

*B*^{n}~ *N* [γ_{0}^{B}+ γ_{1}^{B}I(*X*^{n}∈ *R*_{
n
}), τ^{2}] (6)

and pick a uniform prior on *X*^{n}; to simplify the calculations, we restrict *X*^{n}to the observed marker locations *x*_{
m
}and compute IBD probabilities only at these locations. Thus, *X*^{n}has a discrete distribution with prior masses inversely proportional to the local marker density, here estimated simply as |*x*_{m+1 }- *x*_{m-1}|. The full model is represented in the directed acyclic graph shown in Figure 1.

We fitted the model using a Markov-chain Monte Carlo (MCMC) approach, implemented in Matlab. Updates of all parameters except the location parameters *X*^{n}values involve standard Gibbs sampling from their respective full conditional distributions, e.g., [α_{0}^{n}|**Y**^{n}, **G**, **A**], [β_{0}^{n}|**C**^{n}, **Z**, **B**], [**A**^{nm}|**Y**^{n}, **G**^{m}, α_{0}^{n}, π^{nm}, α^{nm}, σ^{2}], etc. The updates of the *X* values are based on a Metropolis-Hastings procedure with a random walk proposal. The sequence was started ten times from several initial points chosen from an overdispersed prior around rough estimates. Half of the initial samples are discarded and the second half is kept. The number of kept samples, *L* = 4000, is chosen to be large enough so that for all parameters of interest the variance between sequences *V*_{
B
}is comparable to that within sequence *V*_{
W
}, *R* < 1.10:

\widehat{R}=\sqrt{\frac{L-1}{L}+\frac{1}{L}\frac{{V}_{B}}{{V}_{W}}}.

(7)

The rationale behind this convergence monitoring procedure is described and justified by Gelman et al. [8].

### Subjects, genotypes, and phenotypes

In order to keep the computation to a manageable level, we restricted this analysis to the SNP genotypes and expressed genes on chromosome 11, since previous analyses by Morley et al. [2] had found evidence of both *cis* and *trans* linkages at this chromosome. The final data set thus had 116 SNPs and 189 expressed genes. IBD status was estimated from the complete two-generation pedigrees (excluding grandparents) by a program written by JM based on the Lander-Green algorithm [9]. All 378 sib pairs (110 individuals) from the available 14 families were included in the phenotype analysis.