Meyer, Karin
Principal Component and Factor Analytic Models In International Sire Evaluation
2010, Tyriseva, A-M, Meyer, Karin, Fikse, W F, Ducrocq, V, Jakobsen, J, Lidauer, M H, Mantysaari, E A
Various studies have addressed the challenge of variance component estimation for multiple-trait across country evaluation (MACE) and attempted to ease the burden of the estimation process. Several of these have focused on decomposing the genetic covariance matrix into its matrices of eigenvalues and eigenvectors, namely the principal component (PC) and factor analytic (FA) approaches (e.g., Leclerc et al., 2005; Mäntysaari, 2004). For highly correlated traits, some eigenvalues contribute very little to the genetic variation, and this is exploited by ignoring the PCs with negligible effects. For the PC approach this results in dimension reduction. The FA model also includes trait-specific variances, which results in a full rank (co)variance (VCV) matrix unless some of these variances are zero. Leclerc et al. (2005) studied both PC and FA approaches for a sub-set of well-linked base countries, performing dimension reduction for this sub-set and estimating the contribution of the remaining countries to these PCs or factors. Mäntysaari (2004) introduced a bottom-up PC approach, which begins with a sub-set of countries and adds the remaining countries sequentially. By examining at each step whether or not the new country increases the rank of the genetic VCV matrix, it fits only PCs with non-negligible eigenvalues and thus avoids over-parameterized models. Direct estimation of only the important genetic principal components has been proposed by Kirkpatrick and Meyer (2004). However, this requires the appropriate rank to be known or to be estimated. Similarly, a VCV matrix can be estimated by imposing an FA structure directly. The bottom-up approach has recently been tested for variance component estimation for MACE with promising results (Tyrisevä et al., 2009). Both direct PC and FA approaches have been applied to beef cattle data sets and have demonstrated their potential for large, multi-trait data sets (e.g., Meyer, 2007a).
The objective of this study is to assess the impact of alternative parameterizations (PC and FA) for the estimation of variance components on practical predictions of breeding values with MACE.
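The difference between the two structures described in this abstract can be illustrated with a small numerical sketch. It is not the estimation procedure used in the study (which relies on REML); it only shows, for a hypothetical 4-country covariance matrix with unit variances and all genetic correlations equal to 0.9, how keeping the leading PC reduces the rank and how an FA structure with trait-specific variances stays full rank. The function name `reduced_rank_pc` and all numbers are assumptions made for illustration.

```python
import numpy as np


def reduced_rank_pc(sigma, m):
    """Approximate a symmetric covariance matrix by its m leading principal components."""
    eigvals, eigvecs = np.linalg.eigh(sigma)      # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]             # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    lead = eigvecs[:, :m] * eigvals[:m]           # scale each leading eigenvector by its eigenvalue
    return lead @ eigvecs[:, :m].T, eigvals


# Hypothetical genetic covariance matrix: 4 countries, unit variances, correlations of 0.9
k = 4
sigma = np.full((k, k), 0.9) + 0.1 * np.eye(k)

# PC approach: keep only the leading component (rank 1 instead of rank 4)
sigma_pc, eigvals = reduced_rank_pc(sigma, m=1)
explained = eigvals[0] / eigvals.sum()            # 3.7 / 4.0 = 0.925 for this matrix

# FA approach: one common factor plus trait-specific variances (full rank)
loadings = np.sqrt(0.9) * np.ones((k, 1))         # hypothetical factor loadings
specific = np.full(k, 0.1)                        # hypothetical trait-specific variances
sigma_fa = loadings @ loadings.T + np.diag(specific)

print("rank of PC approximation:", np.linalg.matrix_rank(sigma_pc))
print("rank of FA matrix:", np.linalg.matrix_rank(sigma_fa))
print("share of variance in leading PC:", round(explained, 3))
```

In this constructed example the single-factor FA matrix reproduces the original matrix exactly, while the rank-1 PC approximation captures 92.5% of the total variance and discards the remainder.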
Principal component approach in variance component estimation for international sire evaluation
2011, Tyriseva, A-M, Meyer, Karin, Fikse, F, Ducrocq, V, Jakobsen, J, Lidauer, M H, Mantysaari, E A
Background: The dairy cattle breeding industry is a highly globalized business, which needs internationally comparable and reliable breeding values of sires. The International Bull Evaluation Service, Interbull, was established in 1983 to respond to this need. Currently, Interbull performs multiple-trait across country evaluations (MACE) for several traits and breeds in dairy cattle and provides international breeding values to its member countries. Estimating parameters for MACE is challenging since the structure of datasets and conventional use of multiple-trait models easily result in over-parameterized genetic covariance matrices. The number of parameters to be estimated can be reduced by taking into account only the leading principal components of the traits considered. For MACE, this is readily implemented in a random regression model. Methods: This article compares two principal component approaches to estimate variance components for MACE using real datasets. The methods tested were a REML approach that directly estimates the genetic principal components (direct PC) and the so-called bottom-up REML approach (bottom-up PC), in which traits are sequentially added to the analysis and the statistically significant genetic principal components are retained. Furthermore, this article evaluates the utility of the bottom-up PC approach to determine the appropriate rank of the (co)variance matrix. Results: Our study demonstrates the usefulness of both approaches and shows that they can be applied to large multi-country models considering all concerned countries simultaneously. These strategies can thus replace the current practice of estimating the covariance components required through a series of analyses involving selected subsets of traits. Our results support the importance of using the appropriate rank in the genetic (co)variance matrix. Using too low a rank resulted in biased parameter estimates, whereas too high a rank did not result in bias, but increased the standard errors of the estimates and, notably, the computing time. Conclusions: In terms of estimation accuracy, both principal component approaches performed equally well and permitted the use of more parsimonious models through random regression MACE. The advantage of the bottom-up PC approach is that it does not require any prior knowledge of the rank. However, with a predetermined rank, the direct PC approach needs less computing time than the bottom-up PC approach.
Recommendations for Estimation of Variance Components for International Sire Evaluation
2010, Tyriseva, A-M, Meyer, Karin, Fikse, F, Ducrocq, V, Jakobsen, J, Lidauer, MH, Mantysaari, EA
This study assessed the impact of alternative parameterizations for the estimation of variance components on practical predictions of breeding values with MACE. Interbull MACE Holstein evaluations for somatic cell count (April 2009) and protein yield (August 2007) were considered. The MACE model was expressed in terms of a random regression model, which facilitates exploitation of principal component and factor analytic approaches. Both methods reduce the number of parameters to be estimated and benefit from the more parsimonious variance structure. Genetic parameters from the different approaches were very similar when the optimal fit was used. Over-fitting did not affect the estimates, but increased estimation time, whereas fitting too few parameters affected bull rankings in different countries.
Comparison of Different Variance Component Estimation Approaches for MACE: Direct and Bottom-up PC
2009, Tyriseva, A M, Meyer, Karin, Jakobsen, J, Ducrocq, V, Fikse, F, Lidauer, M H, Mantysaari, E A
Multiple-trait across country evaluation (MACE) is used for international genetic evaluation of dairy bulls. MACE treats records in different countries as different traits. Thus, a sire will get a breeding value for each participating country. Whenever a country makes changes to its national evaluation model, the genetic variance-covariance (VCV) matrix needs to be re-estimated. Estimation of the VCV matrix is a difficult task. For the Holstein production evaluation, which includes 26 traits, it is not possible to estimate the VCV matrix in a single analysis with the currently available estimation methods and the given time constraints. Hence, the complete matrix is built from analyses of sub-sets. This readily results in a non-positive definite matrix, and a bending procedure (Jorjani et al., 2003) needs to be applied to obtain a positive definite matrix. In addition, the VCV matrix is usually over-parameterized as genetic correlations between countries are generally high.
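To illustrate why a matrix assembled from sub-set analyses may fail to be positive definite, and what bending does in principle, the following sketch uses plain eigenvalue flooring. This is an assumed, simplified stand-in and not a reimplementation of the procedure cited in the abstract; the function name `bend_to_positive_definite`, the floor value, and the 3-country correlation matrix are all hypothetical.

```python
import numpy as np


def bend_to_positive_definite(matrix, floor=1e-4):
    """Raise eigenvalues below `floor` to `floor` and rebuild the matrix.

    Simplified, unweighted eigenvalue flooring used only for illustration.
    """
    sym = (matrix + matrix.T) / 2.0               # enforce exact symmetry
    eigvals, eigvecs = np.linalg.eigh(sym)
    clipped = np.maximum(eigvals, floor)          # floor the offending eigenvalues
    bent = (eigvecs * clipped) @ eigvecs.T        # V diag(clipped) V^T
    return (bent + bent.T) / 2.0


# Hypothetical correlation matrix patched together from pairwise analyses:
# countries 1-2 and 1-3 are highly correlated, yet 2-3 is estimated as only 0.2.
patched = np.array([
    [1.0, 0.9, 0.9],
    [0.9, 1.0, 0.2],
    [0.9, 0.2, 1.0],
])

bent = bend_to_positive_definite(patched)
min_before = np.linalg.eigvalsh(patched).min()    # negative: not positive definite
min_after = np.linalg.eigvalsh(bent).min()        # positive after flooring

print("smallest eigenvalue before bending:", min_before)
print("smallest eigenvalue after bending:", min_after)
```

Note that flooring the eigenvalues also shifts the diagonal away from exactly 1, so a practical procedure would need to decide how to treat the variances and which elements to trust most.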
Principal component and factor analytic models in international sire evaluation
2011, Tyriseva, A-M, Meyer, Karin, Fikse, F, Ducrocq, V, Jakobsen, J, Lidauer, M H, Mantysaari, E A
Background: Interbull is a non-profit organization that provides internationally comparable breeding values for globalized dairy cattle breeding programmes. Due to different trait definitions and models for genetic evaluation between countries, each biological trait is treated as a different trait in each of the participating countries. This yields a genetic covariance matrix of dimension equal to the number of countries, which typically involves high genetic correlations between countries. This gives rise to several problems such as over-parameterized models and increased sampling variances, if genetic (co)variance matrices are considered to be unstructured. Methods: Principal component (PC) and factor analytic (FA) models allow highly parsimonious representations of the (co)variance matrix compared to the standard multi-trait model and have, therefore, attracted considerable interest for their potential to ease the burden of the estimation process for multiple-trait across country evaluation (MACE). This study evaluated the utility of PC and FA models to estimate variance components and to predict breeding values for MACE for protein yield. This was tested using a dataset comprising Holstein bull evaluations obtained in 2007 from 25 countries. Results: In total, 19 principal components or nine factors were needed to explain the genetic variation in the test dataset. Estimates of the genetic parameters under the optimal fit were almost identical for the two approaches. Furthermore, the results were in good agreement with those obtained from the full rank model and with those provided by Interbull. The estimation time was shortest for models fitting the optimal number of parameters and longer when under- or over-parameterized models were applied. Correlations between estimated breeding values (EBV) from the PC19 and PC25 models were unity. With few exceptions, correlations between EBV obtained using FA and PC approaches under the optimal fit were ≥ 0.99. For both approaches, EBV correlations decreased when the optimal model was compared with models fitting too few parameters. Conclusions: Genetic parameters from the PC and FA approaches were very similar when the optimal number of principal components or factors was fitted. Over-fitting increased estimation time and standard errors of the estimates but did not affect the estimates of genetic correlations or the predictions of breeding values, whereas fitting too few parameters affected bull rankings in different countries.
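As a rough guide to how parsimonious these models are, the number of genetic covariance parameters can be counted for each structure. The sketch below uses the usual counting formulas for an unstructured matrix, a rank-m PC model, and an FA model with m factors; these formulas are assumed here and are not quoted from the abstract. Only the 25 countries, 19 principal components and nine factors come from the abstract, and the function names are hypothetical.

```python
def n_params_full(k):
    """Unstructured k x k covariance matrix: k variances plus k(k-1)/2 covariances."""
    return k * (k + 1) // 2


def n_params_pc(k, m):
    """Rank-m PC model: m eigenvalues plus the free elements of m orthonormal eigenvectors."""
    return m * (2 * k - m + 1) // 2


def n_params_fa(k, m):
    """FA model with m factors: identifiable loadings plus k trait-specific variances."""
    return n_params_pc(k, m) + k


countries = 25
print("full rank:", n_params_full(countries))     # 325
print("PC19:", n_params_pc(countries, 19))        # 304
print("FA9:", n_params_fa(countries, 9))          # 214
```

Under these assumed formulas, fitting all 25 PCs gives the same count as the unstructured model, which is consistent with the abstract treating PC25 as the full rank case.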