The analysed data are the simulated data set from the XII QTL Workshop, consisting of 5,865 individuals from seven generations, divided into (i) a group of 4,665 animals from generations 1–4 for which both phenotypes and genotypes are available, and (ii) a group of 1,200 animals from generations 5–7 for which only genotypes are available. Phenotypes represent a quantitative trait, while genotypes represent 6,000 SNP markers evenly spaced at 0.1 cM intervals over six chromosomes. In our analysis five different SNP data sets are considered. They comprise:

- a set of all available 6,000 SNPs (SNP6000),

- a set of 3,328 SNPs selected based on their estimated minor allele frequency (MAF) using the condition: MAF ≥ 0.3 (SNP3328),

- a set of 1,200 SNPs selected as every 5th SNP out of the available set (SNP1200),

- a set of 600 SNPs selected as every 10th SNP out of the available set (SNP600),

- a set of 300 SNPs selected as every 20th SNP out of the available set (SNP300).
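The two selection rules used for these data sets can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, genotypes are assumed to be coded as 0, 1, or 2 copies of one allele, and the thinning is assumed to keep the k-th, 2k-th, ... marker (the text does not state at which marker the thinning starts).

```python
# Illustrative sketch of the SNP subset selection; all names are hypothetical.

def minor_allele_frequency(scores):
    """Estimated MAF of one SNP from genotype scores 0/1/2."""
    p = sum(scores) / (2 * len(scores))  # frequency of the counted allele
    return min(p, 1 - p)


def select_by_maf(genotypes, threshold=0.3):
    """Return 0-based indices of SNPs whose estimated MAF is >= threshold.

    genotypes[i][j] is the score (0, 1, or 2) of animal i at SNP j.
    """
    n_snp = len(genotypes[0])
    keep = []
    for j in range(n_snp):
        column = [row[j] for row in genotypes]
        if minor_allele_frequency(column) >= threshold:
            keep.append(j)
    return keep


def select_every_kth(n_snp, k):
    """Return 0-based indices of every k-th SNP (assumed start: the k-th marker)."""
    return list(range(k - 1, n_snp, k))


# Sizes of the thinned panels for 6,000 markers.
panel_sizes = {k: len(select_every_kth(6000, k)) for k in (5, 10, 20)}

# Toy example: 4 animals, 3 SNPs; only SNPs with MAF >= 0.3 are kept.
toy_genotypes = [[0, 1, 2], [0, 1, 2], [1, 1, 0], [0, 2, 0]]
toy_kept = select_by_maf(toy_genotypes)
```

With 6,000 markers, thinning by 5, 10, and 20 yields panels of 1,200, 600, and 300 SNPs, matching the sizes of SNP1200, SNP600, and SNP300.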

For prediction of EBVs a standard mixed animal model is applied: **y** = *μ* + **Zα** + **e**, where **y** is a vector of phenotypic values, *μ* is the overall mean, **Z** is the incidence matrix linking phenotypes to animals, $\boldsymbol{\alpha} \sim N\left(\mathbf{0}, \mathbf{A}\sigma_{\alpha}^{2}\right)$ is a vector of random additive polygenic effects of animals with a covariance matrix given by the numerator relationship matrix (**A**) and the additive polygenic variance component $\sigma_{\alpha}^{2}$, and $\mathbf{e} \sim N\left(\mathbf{0}, \mathbf{I}\sigma_{e}^{2}\right)$ is a vector of residuals. GBVs are defined as the sum of additive effects of SNPs, estimated from the SNP data sets defined above using the following models:

- (1) **y** = *μ* + **Xq** + **e**, where **q** ($N_{SNP} \times 1$) is a vector of fixed additive SNP effects, **X** is the corresponding design matrix with scores 0, 1, or 2 for SNP genotypes 11, 12, or 22, respectively, $N_{SNP}$ is the number of SNPs considered, and the other model parameters are defined as above.

- (2) **y** = *μ* + **Xq** + **Zα** + **e**, with all the parameters defined as above.

- (3) **y** = *μ* + **Zq** + **e**, where $\mathbf{q} \sim N\left(\mathbf{0}, \mathbf{I}\sigma_{\alpha}^{2}\right)$ is a vector of random SNP effects and **Z** is the corresponding design matrix with scores 0, 1, or 2 for SNP genotypes 11, 12, or 22, respectively.
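As an illustration of model (3), the sketch below solves its mixed model equations with known variance components and returns GBVs as the sum of estimated SNP effects weighted by the genotype scores. It is a pure-Python sketch under stated assumptions, not the R programmes used in the study: the function names, the toy data, and the small Gaussian-elimination solver are all hypothetical, and the shrinkage ratio is taken as $\lambda = \sigma_{e}^{2}/\sigma_{\alpha}^{2}$.

```python
# Illustrative sketch of model (3): y = mu + Zq + e with q ~ N(0, I * var_snp).
# All names and the toy data are hypothetical.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_random_snp_model(genotypes, y, var_e, var_snp):
    """Solve the MME of model (3); return (mu_hat, q_hat, gbv)."""
    n, p = len(y), len(genotypes[0])
    lam = var_e / var_snp  # assumed shrinkage ratio
    # Joint design matrix W = [1 | Z]: intercept column plus genotype scores.
    w = [[1.0] + [float(g) for g in row] for row in genotypes]
    lhs = [[sum(w[i][r] * w[i][c] for i in range(n)) for c in range(p + 1)]
           for r in range(p + 1)]
    for j in range(1, p + 1):
        lhs[j][j] += lam  # shrink the SNP effects only, not the mean
    rhs = [sum(w[i][r] * y[i] for i in range(n)) for r in range(p + 1)]
    solution = solve_linear_system(lhs, rhs)
    mu_hat, q_hat = solution[0], solution[1:]
    # GBV of each animal: sum of its genotype scores times the SNP effects.
    gbv = [sum(g * q for g, q in zip(row, q_hat)) for row in genotypes]
    return mu_hat, q_hat, gbv


# Toy example: 5 animals, 2 SNPs, variance components treated as known.
toy_genotypes = [[0, 1], [1, 2], [2, 0], [1, 1], [2, 2]]
toy_y = [1.0, 2.5, 2.0, 2.1, 3.9]
mu_hat, q_hat, gbv = fit_random_snp_model(toy_genotypes, toy_y, var_e=2.0, var_snp=1.0)
```

The only difference from the least-squares normal equations of model (1) is the term $\lambda$ added to the diagonal of the SNP block. A dense solver such as this one is only practical for small numbers of SNPs.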

Note that EBVs and GBVs are estimated for the 4,665 animals from the first four generations. Parameters of all the mixed models were estimated by solving the mixed model equations (MME, [1]), whereas the effects in model 1 were estimated by least squares. The DFREML package [2] was used to estimate the parameters and variance components of the EBV model, whereas the parameters of the GBV models (models 1–3) were estimated using R programmes. For models 1–3, the residual and additive polygenic variance components were assumed known and were set to the estimates obtained from the EBV model. Because building the inverse of the MME coefficient matrix required too much memory, we were unable to estimate the parameters of models 2 and 3 for the data set with all SNPs.
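For reference, the mixed model equations mentioned above can be written out in their textbook form. The symbol $\lambda = \sigma_{e}^{2}/\sigma_{\alpha}^{2}$ and the vector of ones $\mathbf{1}$ are notation introduced here for illustration. They do not appear in the model definitions. For the EBV model:

$$
\begin{bmatrix}
\mathbf{1}'\mathbf{1} & \mathbf{1}'\mathbf{Z} \\
\mathbf{Z}'\mathbf{1} & \mathbf{Z}'\mathbf{Z} + \mathbf{A}^{-1}\lambda
\end{bmatrix}
\begin{bmatrix}
\hat{\mu} \\ \hat{\boldsymbol{\alpha}}
\end{bmatrix}
=
\begin{bmatrix}
\mathbf{1}'\mathbf{y} \\ \mathbf{Z}'\mathbf{y}
\end{bmatrix}
$$

For model 2, which adds the fixed SNP effects **q** with design matrix **X**:

$$
\begin{bmatrix}
\mathbf{1}'\mathbf{1} & \mathbf{1}'\mathbf{X} & \mathbf{1}'\mathbf{Z} \\
\mathbf{X}'\mathbf{1} & \mathbf{X}'\mathbf{X} & \mathbf{X}'\mathbf{Z} \\
\mathbf{Z}'\mathbf{1} & \mathbf{Z}'\mathbf{X} & \mathbf{Z}'\mathbf{Z} + \mathbf{A}^{-1}\lambda
\end{bmatrix}
\begin{bmatrix}
\hat{\mu} \\ \hat{\mathbf{q}} \\ \hat{\boldsymbol{\alpha}}
\end{bmatrix}
=
\begin{bmatrix}
\mathbf{1}'\mathbf{y} \\ \mathbf{X}'\mathbf{y} \\ \mathbf{Z}'\mathbf{y}
\end{bmatrix}
$$

In this form the coefficient matrix of model 2 has one row and column for the mean, one for each SNP, and one for each animal effect, so its order grows with both the number of SNPs and the number of animals.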