Volume 4 Supplement 1
Proceedings of the 13th European workshop on QTL mapping and marker assisted selection
Extensive QTL and association analyses of the QTLMAS2009 Data
 Georgia Hadjipavlou†^{1},
 Gib Hemani†^{1},
 Richard Leach†^{1},
 Bruno Louro†^{1},
 Javad Nadaf†^{1},
 Suzanne Rowe†^{1} and
 Dirk-Jan de Koning^{1}
DOI: 10.1186/1753-6561-4-S1-S11
© de Koning et al; licensee BioMed Central Ltd. 2010
Published: 31 March 2010
Abstract
Background
We applied a range of genome-wide association (GWA) methods to map quantitative trait loci (QTL) in the simulated dataset provided by the QTLMAS2009 workshop to derive a comprehensive set of results. A Gompertz curve was modelled on the yield data and showed good predictive properties. QTL analyses were done on the raw measurements and on the individual parameters of the Gompertz curve and its predicted growth for each interval. Half-sib and variance component linkage analysis revealed QTL with different modes of inheritance but with low resolution. This was complemented by association studies using single markers or haplotypes, and additive, dominance, parent-of-origin and epistatic QTL effects. All association analyses were done on phenotypes pre-corrected for pedigree effects. These methods detected QTL positions with high concordance to each other and with greater refinement of the linkage signals. Two-locus interaction analysis detected no epistatic pairs of QTL. Overall, using stringent thresholds we identified QTL regions using linkage analyses, corroborated by 6 individual SNPs with significant effects as well as two putatively imprinted SNPs.
Conclusions
We obtained consistent results across a combination of intra- and inter-family-based methods using flexible linear models to evaluate a variety of models. The Gompertz curve fitted the data well and provided complementary information on the detected QTL. Retrospective comparison of the results with the actual simulated data showed that the best results were obtained by including both yield and the parameters from the Gompertz curve, despite the data being simulated using a logistic function.
Background
The QTLMAS2009 data is structured in families and allows both linkage and association approaches to be evaluated. Here we describe a comprehensive set of analyses to detect QTL in the simulated population in order to compare routinely used methods of linkage and association analysis. The half-sib method is fast and robust but ignores family information other than the parent-offspring relationship being analysed. The variance component analysis is computationally more intensive but models all relationships and is easily extended to non-additive scenarios. Direct association of marker genotypes is computationally fast but requires denser markers and is more susceptible to data stratification. Jointly, these analyses represent a number of models that are expected to give good insight into the genetic architecture of the trait.
Methods
Treatment of phenotypic data
All analyses used the simulated data on 1000 offspring from 20 dams, nested in 5 sire families. Univariate analyses in ASReml [1] were used to estimate heritabilities at each of the time points. The Gompertz growth function, modelling weight over time, was fitted across all trait data using nonlinear regression in SAS. The following parameterization of the Gompertz equation was used: $y(t) = A \exp\{-\exp[B e (C - t)/A]\}$, where:
y(t) = yield at time t; A = final yield; B = maximum growth rate; C = age at maximum growth rate.
The Gompertz function was then fitted to trait information for each individual separately and individual estimates of the model parameters A, B, C were extracted. Subsequently, parameter estimates for each individual were employed in the model equation and its derivative to predict yield and growth rate (yield per day), respectively, at the 5 time points for which trait information was available (0 to 530) and at time 600.
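The prediction step can be illustrated with a minimal sketch. It assumes the parameterization y(t) = A exp{-exp[B e (C - t)/A]}, under which the growth rate at t = C equals B and the yield tends to A. The parameter values below are hypothetical and are not estimates from the workshop data; the time points are those reported for the trait plus time 600.

```python
import math


def gompertz_yield(t, A, B, C):
    """Yield at time t: A * exp(-exp(B * e * (C - t) / A))."""
    return A * math.exp(-math.exp(B * math.e * (C - t) / A))


def gompertz_rate(t, A, B, C):
    """First derivative of gompertz_yield with respect to t (yield per day)."""
    k = B * math.e / A
    inner = math.exp(k * (C - t))
    return A * math.exp(-inner) * inner * k


# Hypothetical parameter estimates for one individual (illustrative only).
A_hat, B_hat, C_hat = 40.0, 0.12, 200.0

# The five recorded time points plus the extrapolated time 600.
time_points = [0, 132, 265, 397, 530, 600]
predictions = [
    (t, gompertz_yield(t, A_hat, B_hat, C_hat), gompertz_rate(t, A_hat, B_hat, C_hat))
    for t in time_points
]

for t, y, rate in predictions:
    print(f"t={t:>3}  yield={y:8.3f}  growth rate={rate:.4f}")
```

In this form the three parameters keep their stated interpretation, so each individual's estimates of A, B and C can be analysed directly as traits.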
Half-sib QTL analyses
QTL analyses of the simulated phenotypes, the estimated Gompertz parameters A, B, and C and the predicted growth rates at the given time points and at time 600 were conducted using the half-sib QTL analysis, as described by Knott et al. [2] and implemented in the web-based software GridQTL [3]. All half-sib analyses were performed for paternal and maternal half-sib families. Empirical thresholds were obtained by permutation tests using 2000 permutations per chromosome. From these chromosome-wide thresholds the following significance levels were derived: chromosome-wide and genome-wide 5% and 1%. QTLs detected at the 1% chromosome-wide significance level were included in the one-QTL model as cofactors. For chromosomes where a single QTL had been identified, a two-QTL model was evaluated and the best-fitting two-QTL model was tested against the best one-QTL model.
Variance component QTL analyses
A variance component approach was used to look for additive, dominant and imprinted QTL. Following a two-step approach [4], identical-by-descent (IBD) coefficients were estimated for all relationships in the pedigree with the recursive method of Pong-Wong et al. [5]. Variance components for each model were estimated using ASReml [1]. The following models were evaluated:
(1) y = Xβ + Zu + e (null or polygenic)
(2) y = Xβ + Zu + Za + e (additive QTL)
(3) y = Xβ + Zu + Za + Zd + e (additive QTL + dominant QTL)
(4) y = Xβ + Zu + Z_{m}m + Z_{p}p + e (maternal QTL + paternal QTL)
where y is a vector of phenotypic observations, β is a vector of fixed effects, and u, a, d, m, p and e are vectors of random additive polygenic effects, additive and dominance QTL effects, maternal and paternal QTL effects and residuals, respectively. X, Z, Z_{m} and Z_{p} are incidence matrices relating observations to fixed effects, random genetic effects, maternally expressed QTL effects and paternally expressed QTL effects, respectively. Variances for polygenic and QTL effects are distributed as follows: Var(u) = Aσ^{2}_{a}, Var(a) = Gσ^{2}_{q}, Var(d) = Dσ^{2}_{d}, Var(m) = G_{M}σ^{2}_{m}, Var(p) = G_{P}σ^{2}_{p}, Var(e) = Iσ^{2}_{e}. A is the standard additive relationship matrix based on pedigree data only; G, G_{M}, G_{P} and D are the appropriate relationship matrices used to model the additive, maternal, paternal and dominant QTL effects at each position tested, as outlined by Liu et al. [6].
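As a worked illustration, the variance definitions above imply the following phenotypic covariance structures for models (2) and (4). This assumes the random effects are mutually uncorrelated, which is the usual convention but is not stated explicitly in the text.

```latex
% Model (2): polygenic + additive QTL
\mathrm{Var}(\mathbf{y}) = \mathbf{Z}\mathbf{A}\mathbf{Z}'\sigma^{2}_{a}
                         + \mathbf{Z}\mathbf{G}\mathbf{Z}'\sigma^{2}_{q}
                         + \mathbf{I}\sigma^{2}_{e}

% Model (4): polygenic + maternal QTL + paternal QTL
\mathrm{Var}(\mathbf{y}) = \mathbf{Z}\mathbf{A}\mathbf{Z}'\sigma^{2}_{a}
                         + \mathbf{Z}_{m}\mathbf{G}_{M}\mathbf{Z}_{m}'\sigma^{2}_{m}
                         + \mathbf{Z}_{p}\mathbf{G}_{P}\mathbf{Z}_{p}'\sigma^{2}_{p}
                         + \mathbf{I}\sigma^{2}_{e}
```

Only the QTL relationship matrices change from one tested position to the next; the polygenic term is the same at every position.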
The logarithm of the likelihood ratio test statistic was used to test the presence of a QTL at given locations along the genome. A nominal χ^{2}_{1} or χ^{2}_{2} was used depending on whether one or two extra parameters were estimated. This has been shown to be conservative as the theoretical distribution is a mixture between 0 and χ^{2}_{1} or χ^{2}_{1} and χ^{2}_{2}, respectively [7].
Models (3) vs. (2) were compared to detect dominant effects. Models (4) vs. (1) were compared to test for an additive QTL whilst allowing the maternal and paternal components to vary, and (4) vs. (2) to test whether the additive effect was better explained by allowing different parental contributions.
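The likelihood ratio test for nested models can be sketched as below. The log-likelihood values are hypothetical, the function names are illustrative, and the 50:50 mixing proportion used for the one-parameter case is the standard asymptotic result for a single variance component tested on its boundary rather than something stated in the text.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with 1 or 2 df."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or df = 2 are supported in this sketch")


def lrt_statistic(loglik_full, loglik_reduced):
    """Twice the difference in maximised log-likelihoods of nested models."""
    return 2.0 * (loglik_full - loglik_reduced)


def nominal_p(lrt, extra_params):
    """Nominal p-value: chi-square with df = number of extra parameters."""
    return chi2_sf(lrt, extra_params)


def mixture_p_one_component(lrt):
    """P-value under a 50:50 mixture of a point mass at 0 and chi-square(1)."""
    return 0.5 * chi2_sf(lrt, 1) if lrt > 0 else 1.0


# Hypothetical log-likelihoods for model (1) (polygenic) and model (2) (additive QTL).
loglik_polygenic = -1520.4
loglik_additive = -1514.1

lrt = lrt_statistic(loglik_additive, loglik_polygenic)
print("LRT:", lrt)
print("nominal p (1 df):", nominal_p(lrt, 1))
print("mixture p:", mixture_p_one_component(lrt))
```

The mixture p-value is half the nominal one, which is why using the nominal χ^{2}_{1} threshold is conservative.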
Association studies
We first tested the level of linkage disequilibrium (LD) using Haploview [8]. For association analyses, we used the simulated phenotypes as well as the Gompertz parameters A, B, C (results not shown). Association analyses were performed using the GRAMMAR approach [10], which comprises two stages. First, ASReml is used to correct each phenotype for polygenic effects; second, additive, dominant and imprinting models are sequentially fitted for each marker on the residual phenotypic values with an ANOVA test. Where the imprinting model gave a better fit for a given SNP, we generated a 5% significance level by performing 2000 randomizations in which the maternal and paternal allelic origin was randomly swapped for half the offspring. An empirical genome-wide threshold of 5% was generated from 1,000 permutations. We also applied haplotype analyses and exhaustive epistatic searches, but these revealed no additional QTL.
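The second stage and the parent-of-origin randomization can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' pipeline: each coding is tested alone with a one-covariate regression F statistic rather than fitted sequentially in an ANOVA, the residuals and alleles are simulated toy data, and all function names are hypothetical.

```python
import random


def marker_codings(paternal, maternal):
    """Per-offspring covariates from paternal/maternal alleles coded 0 or 1."""
    additive = [p + m for p, m in zip(paternal, maternal)]            # allele count
    dominance = [1 if p != m else 0 for p, m in zip(paternal, maternal)]  # heterozygote
    imprinting = [p - m for p, m in zip(paternal, maternal)]          # origin contrast
    return additive, dominance, imprinting


def f_statistic(y, x):
    """F statistic (1 numerator df) for the simple regression of y on x."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    ss_reg = sxy ** 2 / sxx
    ss_res = syy - ss_reg
    return float("inf") if ss_res <= 0 else ss_reg / (ss_res / (n - 2))


def swap_origin_for_half(paternal, maternal, rng):
    """Swap paternal and maternal alleles for a random half of the offspring."""
    n = len(paternal)
    chosen = set(rng.sample(range(n), n // 2))
    new_pat = [maternal[i] if i in chosen else paternal[i] for i in range(n)]
    new_mat = [paternal[i] if i in chosen else maternal[i] for i in range(n)]
    return new_pat, new_mat


def imprinting_randomization_p(residuals, paternal, maternal, n_rand=2000, seed=1):
    """Empirical p-value for the imprinting contrast under origin swapping."""
    rng = random.Random(seed)
    observed = f_statistic(residuals, marker_codings(paternal, maternal)[2])
    exceed = 0
    for _ in range(n_rand):
        pat, mat = swap_origin_for_half(paternal, maternal, rng)
        if f_statistic(residuals, marker_codings(pat, mat)[2]) >= observed:
            exceed += 1
    return (exceed + 1) / (n_rand + 1)


# Toy data: residuals carry an additive effect only, no parent-of-origin effect.
rng = random.Random(42)
n = 200
paternal = [rng.randint(0, 1) for _ in range(n)]
maternal = [rng.randint(0, 1) for _ in range(n)]
residuals = [0.5 * (p + m) + rng.gauss(0, 1) for p, m in zip(paternal, maternal)]

additive, dominance, imprinting = marker_codings(paternal, maternal)
p_imprint = imprinting_randomization_p(residuals, paternal, maternal, n_rand=500)
print("additive F:", f_statistic(residuals, additive))
print("imprinting randomization p:", p_imprint)
```

Swapping allelic origin leaves the additive and dominance codings unchanged, so the randomization specifically targets the parent-of-origin contrast.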
Results and discussion
Descriptive statistics
The estimated heritability was ~0.50 for all time points, varying between 0.46 and 0.50. The heritabilities for the Gompertz parameters A, B and C were 0.45, 0.48 and 0.26, respectively. The LD between adjacent SNP pairs was generally low. With 453 markers spanning a genetic distance of 5 Morgans, the marker map appeared to be sparse, and the pattern of LD reflects this. Chromosomes 1 to 4 had similar distributions of r^{2} values, the mean between adjacent markers being ~0.15, but chromosome 5 appeared to have much lower LD (Additional file 1).
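For reference, the r^{2} measure between adjacent markers can be computed as in the sketch below. It assumes phased haplotypes coded 0/1, whereas Haploview estimates haplotype frequencies from genotype data; the haplotypes shown are toy values and the function names are illustrative.

```python
def ld_r2(hap_a, hap_b):
    """r^2 between two biallelic SNPs from phased haplotypes coded 0/1."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")  # undefined when either SNP is monomorphic
    return d * d / denom


def mean_adjacent_r2(snps):
    """Mean r^2 over consecutive SNP pairs; `snps` is a list of haplotype vectors."""
    values = [ld_r2(a, b) for a, b in zip(snps, snps[1:])]
    return sum(values) / len(values)


# Toy haplotypes for three consecutive SNPs on eight chromosomes.
snps = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
print("mean adjacent r^2:", mean_adjacent_r2(snps))

# Rough average marker spacing for 453 markers over 5 Morgans (500 cM),
# ignoring chromosome ends.
print("approximate spacing (cM):", 500.0 / 453)
```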
Linkage analyses
The results of the half-sib analyses are summarised in Additional file 2 and curves for all half-sib analyses are presented in Additional file 3. Genome-wide significant QTL were identified for all time points and all chromosomes. A QTL on chromosome 1 (43 cM) was highly significant for growth rate and yield across all times from both sire and dam half-sib analyses. On chromosome 2, the sire analysis identified 3 significant QTL while the dam analysis revealed two QTL (Additional file 2). On chromosome 3, two QTL were detected from the sire half-sib analysis while the dam analysis showed evidence for two QTL (Additional file 2). The sire half-sib regression resulted in one QTL on chromosome 4 (78-79 cM) which was significant at early times. On chromosome 4, the dam analysis identified two QTL at 58 cM and 86 cM for yield at times 0-132. Finally, from the sire analysis a QTL on chromosome 5 was significant at 75-76 cM for growth rate from time 132 onward and yield for time points 265-530, whereas the dam regression revealed a QTL (99 cM) for yield and growth rate at early times. Major QTL on chromosome 1 (38-44 cM) from both parental analyses and paternally inherited ones on chromosomes 2 and 5 were detected for the Gompertz parameters for final yield (A) and maximum growth rate (B).
Table 1. Summary of most significant QTL results for yield from variance component analyses.

| Model | Chromosome | QTL position (cM)^{1} | LRT | Time point | % variance explained by QTL |
|---|---|---|---|---|---|
| Additive | 1 | 43 | 135.6** | 530 | 35 |
| Additive | 2 | 5 | 25.42** | 530 | 6.4 |
| Additive | 2 | 38 | 24.1** | 397 | 7.3 |
| Additive | 3 | 17 | 14.05** | 0 | 5.03 |
| Additive | 3 | 93 | 8.6** | 265 | 4.92 |
| Additive | 4 | 37 | 15.61** | 0 | 5.66 |
| Additive | 4 | 77 | 27.85** | 0 | 7.16 |
| Additive | 5 | 73 | 11.98** | 530 | 5.0 |
| Dominant^{2} | 4 | 75 | 7.32** | 397 | 2.3/5.5 |
| Dominant^{2} | 5 | 30 | 2.44 | 397 | 0/3.4 |
| Imprinting^{3} | 2 | 9 | 4.5* | 132 | 0/5.5 |
| Imprinting^{3} | 2 | 62 | 3.4 | 530 | 0/4.6 |
| Imprinting^{3} | 3 | 76 | 3.0 | 265 | 2.6/0 |
| Imprinting^{3} | 3 | 76 | 3.6 | 397 | 2.4/0 |
| Imprinting^{3} | 4 | 9 | 8.3** | 0 | 6.2/0 |
Association analyses using GRAMMAR
Table 2. Most significant associations with single SNPs.

| Model | Chromosome | SNP number (cM) | -log_{10}P | Time point^{a} | % variance explained by QTL^{a} |
|---|---|---|---|---|---|
| Additive | 1 | 37 (44.5) | 15.0-22.6 | 0-530 | 6.7-9.9 |
| Additive | 2 | 98 (3.6) | 4.0-6.1 | 132-530 | 1.8-2.8 |
| Additive | 2 | 174 (88.3) | 3.3-4.6 | 132-530 | 1.3-1.6 |
| Additive | 3 | 222 (31.1) | 3.0-5.2 | 0-265 | 0.9-1.77 |
| Additive | 4 | 338 (71.7) | 3.0-7.3 | 0-530 | 1.0-2.8 |
| Dominant | 4 | 315 (38.8) | 4.4-5.1 | 0-530 | 1.7-2.0 |
| Imprinting | 1 | 53 (55.6) | 3.9 | 530 | 1.6 |
| Imprinting | 2 | 138 (48.5) | 3.6 | 0 | 1.5 |
Overall comparison
The linkage analyses give a single QTL on chromosome 1, up to 3 QTL regions on chromosomes 2, 3, and 5 and 2 QTL regions on chromosome 4 (Additional file 1 and Table 1). There are some more speculative QTL with potential imprinting effects on chromosomes 2, 3, and 4, but these require further scrutiny. The association study shows convincing evidence for 6 SNPs, all of which coincide with QTL regions. There are two putatively imprinted SNPs, but these show little concordance with the putative imprinted regions from the VC analyses.
Epilogue
The retrospective comparison of the performance of the methods used here with the simulated data is shown in Additional file 7. Because the Gompertz growth curve fitted the yields well, half-sib regression analysis of the Gompertz model descriptors performed better than all other methods employed, even though the actual QTL were simulated for the 3 parameters of the logistic function. The half-sib, association and VC analyses of yield data detected 12, 6 and 10 out of the 18 simulated QTL, respectively, with 1, 2 and 5 false positives. Analysis of the growth model descriptors (growth rate and the Gompertz model parameters) resulted in 15 QTL with no false positives. The association analysis tended to underestimate the variance explained by the QTL, with the VC analysis giving the most accurate estimation of variance explained (S2). Comparison of the methods clearly demonstrates the risks of false detection of non-additive segregation. The imprinted QTL falsely detected by the VC and association analyses can be explained by segregation of QTL where only a limited number of parents are heterozygous for the QTL. As a result, the QTL effect may appear to come from the parents of a single sex only. The apparent parental origin differences are clearly illustrated by the differences in half-sib regression results (Additional files 2 and 3). The dominant effects, however, are more difficult to interpret, perhaps coming from higher yield within a particular full-sib family and thus masquerading as dominance. Results from these extended models provide a valuable insight and perhaps serve as a warning on the effects of data structure on results from non-additive models.
Notes
List of abbreviations used
 QTL: Quantitative Trait Locus
 LD: Linkage Disequilibrium
 GRAMMAR: Genome-wide Rapid Analysis using Mixed Models And Regression
Declarations
Acknowledgements
This work has made use of the resources provided by the Edinburgh Compute and Data Facility (ECDF) (http://www.ecdf.ed.ac.uk/). The ECDF is partially supported by the eDIKT initiative (http://www.edikt.org). All authors would like to thank BBSRC for their financial support through an Institute Strategic Programme Grant to Roslin Institute. GH1 acknowledges the GENACT Project while JN and BL acknowledge the SABRETRAIN project (EC Contract number MESTCT2005 020558), both funded by the Marie Curie Host Fellowships for Early Stage Research Training. SR and DJK acknowledge the EC-funded Integrated Project SABRE (EC contract number FOODCT200601625) and the BBSRC-supported GridQTL project.
This article has been published as part of BMC Proceedings Volume 4 Supplement 1, 2009: Proceedings of 13th European workshop on QTL mapping and marker assisted selection.
The full contents of the supplement are available online at http://www.biomedcentral.com/1753-6561/4?issue=S1.
Authors’ Affiliations
References
 1. Gilmour A, Cullis B, Welham S, Thompson R: ASREML User's Manual. New South Wales Agricultural Institute, Orange, NSW, Australia. 1998.
 2. Knott SA, Elsen JM, Haley CS: Methods for multiple-marker mapping of quantitative trait loci in half-sib populations. Theoretical and Applied Genetics. 1996, 93: 71-80. 10.1007/BF00225729.
 3. Seaton G, Hernandez J, Grunchec J, White I, Allen J, Koning D: GridQTL: a Grid portal for QTL mapping of compute intensive datasets. Proceedings of the 8th World Congress on Genetics Applied to Livestock Production, Belo Horizonte, Minas Gerais. 2006, 27: 07.
 4. George AW, Visscher PM, Haley CS: Mapping quantitative trait loci in complex pedigrees: A two step variance component approach. Genetics. 2000, 156: 2081-2092.
 5. Pong-Wong R, George AW, Woolliams JA, Haley CS: A simple and rapid method for calculating identity-by-descent matrices using multiple markers. Genet Sel Evol. 2001, 33: 453-471. 10.1186/1297-9686-33-5-453.
 6. Liu Y, Jansen GB, Lin CY: The covariance between relatives conditional on genetic markers. Genet Sel Evol. 2002, 34: 657-678. 10.1186/1297-9686-34-6-657.
 7. Visscher PM: A note on the asymptotic distribution of likelihood ratio tests to test variance components. Twin Res Hum Genet. 2006, 9: 490-495. 10.1375/twin.9.4.490.
 8. Barrett JC, Fry B, Maller J, Daly MJ: Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005, 21: 263-265. 10.1093/bioinformatics/bth457.
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.