Meyer, Karin
Direct Estimation of Genetic Principal Components: Simplified Analysis of Complex Phenotypes
2004, Kirkpatrick, M, Meyer, K
Estimating the genetic and environmental variances for multivariate and function-valued phenotypes poses problems for estimation and interpretation. Even when the phenotype of interest has a large number of dimensions, most variation is typically associated with a small number of principal components (eigenvectors or eigenfunctions). We propose an approach that directly estimates these leading principal components; these then give estimates for the covariance matrices (or functions). Direct estimation of the principal components reduces the number of parameters to be estimated, uses the data efficiently, and provides the basis for new estimation algorithms. We develop these concepts for both multivariate and function-valued phenotypes and illustrate their application in the restricted maximum-likelihood framework.
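The saving in parameters described in this abstract can be illustrated with a small numerical sketch. It is not the authors' estimation procedure: it simply truncates the eigendecomposition of a known covariance matrix, whereas the paper estimates the leading components directly within REML. The function names, the simulated matrix, and the choice of q = 6 traits with m = 2 components are assumptions made for illustration. The parameter count used is m(2q - m + 1)/2, which reduces to q(q + 1)/2 at full rank.

```python
import numpy as np

def reduced_rank_covariance(sigma, m):
    """Rebuild a covariance matrix from its m leading principal components."""
    eigvals, eigvecs = np.linalg.eigh(sigma)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:m]      # indices of the m largest eigenvalues
    lead_vals = eigvals[order]
    lead_vecs = eigvecs[:, order]
    return lead_vecs @ np.diag(lead_vals) @ lead_vecs.T

def n_parameters(q, m):
    """Parameters needed to describe m principal components of q traits."""
    return m * (2 * q - m + 1) // 2

# Hypothetical example: 6 traits whose variation is dominated by 2 components.
rng = np.random.default_rng(0)
loadings = rng.normal(size=(6, 2))
sigma = loadings @ loadings.T + 0.01 * np.eye(6)

approx = reduced_rank_covariance(sigma, 2)
print("parameters, full rank:", n_parameters(6, 6))
print("parameters, rank 2   :", n_parameters(6, 2))
print("largest absolute error:", np.abs(sigma - approx).max())
```

With most variation concentrated in two components, the rank-2 matrix stays close to the full matrix while needing about half as many parameters.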
Performance of REML algorithms in multivariate analyses fitting reduced rank and factor-analytic models
2007, Meyer, Karin
Convergence behaviour of restricted maximum likelihood algorithms in multivariate analyses imposing a factor-analytic structure on covariance matrices is examined. Results indicate that estimation for such models can entail a more difficult maximisation problem than 'unstructured' estimation. On the other hand, if only factors explaining negligible variation are omitted, convergence can be faster as parameters at the boundaries of the parameter space have been eliminated. The 'parameter expanded' expectation maximisation algorithm tends to require many more iterates than the 'average information' algorithm, but is useful, in particular when combined with the latter.
Improving REML estimates of genetic parameters through penalties on correlation matrices
2014, Meyer, Karin
Penalized REML estimation can substantially reduce sampling variation in estimates of covariance matrices, and yield estimates of genetic parameters closer to population values than standard analyses. A number of suitable penalties based on prior distributions of correlation matrices from the Bayesian literature are described, and a simulation study is presented demonstrating their efficacy. Results show that reductions of 'loss' in estimates of the genetic covariance matrix, a conglomerate of sampling variance and bias, well over 50% are readily obtained for multivariate analyses of small samples. Default settings for a mild degree of penalization are proposed, which make such analyses suitable for routine use without increasing computational requirements.
Penalized maximum likelihood estimates of genetic covariance matrices with shrinkage towards phenotypic dispersion
2011, Meyer, Karin, Kirkpatrick, Mark, Gianola, Daniel
A simulation study examining the effects of 'regularization' on estimates of genetic covariance matrices for small samples is presented. This is achieved by penalizing the likelihood, and three types of penalties are examined. It is shown that regularized estimation can substantially enhance the accuracy of estimates of genetic parameters. Penalties shrinking estimates of genetic covariances or correlations towards their phenotypic counterparts acted somewhat differently to those aimed at reducing the spread of sample eigenvalues. While improvements of estimates were found to be comparable overall, shrinkage of genetic towards phenotypic correlations resulted in least bias.
Components of Variance Underlying Fitness in a Natural Population of the Great Tit, 'Parus major'
2004, McCleery, R H, Pettifor, R A, Armbruster, P, Meyer, Karin, Sheldon, B C, Perrins, C M
Traits that are closely associated with fitness tend to have lower heritability (h²) values than those that are not. This has commonly been interpreted as evidence that natural selection tends to deplete genetic variation more rapidly for traits more closely associated with fitness (a corollary of Fisher’s Fundamental Theorem), but Price and Schluter (1991) suggested the pattern might be due to higher residual variance in traits more closely related to fitness. The relationship between overall fitness (lifetime recruitment) and eleven different traits for females, and eight traits for males, was quantified for great tits ('Parus major') studied in their natural environment of Wytham Wood, England, using data collected over 38 years. Heritabilities and the coefficients of additive genetic and residual variance (CVA and CVR respectively) were estimated using an "animal model". In both males and females a trait’s correlation (r) with fitness was negatively related to its h², but positively related to its CVR. CVA was not related to the trait’s correlation with fitness in either sex. This is the third study using directly measured fitness in a wild population in a natural environment to show the important role of residual variation in determining the pattern of lower heritabilities for traits more closely related to fitness, as predicted by Price and Schluter (1991).
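The distinction this abstract draws between h² and the coefficients of variation can be made concrete with a short sketch. It assumes the simplest case, in which phenotypic variance is the sum of additive genetic and residual variance, and uses the usual definition of a coefficient of variation as 100 times the standard deviation over the trait mean. The function names and all numbers are invented for illustration and are not taken from the study.

```python
import math

def heritability(v_a, v_r):
    """h² = VA / VP, assuming VP = VA + VR (no other variance components)."""
    return v_a / (v_a + v_r)

def coefficient_of_variation(variance, mean):
    """Coefficient of variation in percent: 100 * sqrt(variance) / mean."""
    return 100.0 * math.sqrt(variance) / mean

# Two hypothetical traits with the same mean and additive genetic variance,
# differing only in residual variance.
traits = {
    "trait A": {"v_a": 4.0, "v_r": 4.0, "mean": 20.0},
    "trait B": {"v_a": 4.0, "v_r": 12.0, "mean": 20.0},
}

for name, t in traits.items():
    h2 = heritability(t["v_a"], t["v_r"])
    cva = coefficient_of_variation(t["v_a"], t["mean"])
    cvr = coefficient_of_variation(t["v_r"], t["mean"])
    print(f"{name}: h2 = {h2:.2f}, CVA = {cva:.1f}, CVR = {cvr:.1f}")
```

In this constructed case trait B has the lower h² although its CVA equals that of trait A, because only its residual variance is larger. This is the kind of pattern the abstract attributes to Price and Schluter (1991).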
Pooling Estimates of Covariance Components Using a Penalized Maximum Likelihood Approach
2012, Meyer, Karin
Estimates of large genetic covariance matrices are commonly obtained by pooling results from a series of analyses of small subsets of traits. Procedures available to pool the part-estimates differ in their efficacy in accounting for unequal accuracies of estimates and sampling correlations, and ensuring that pooled matrices are within the parameter space. We propose a maximum likelihood (ML) approach to combine estimates, treating sets from individual part-analyses as matrices of mean squares and cross-products from independent families. This facilitates simultaneous pooling of estimates for all sources of variation considered, readily allows for weighted estimation or a given structure of the pooled matrices, and provides a framework for regularized estimation by penalizing the likelihood. A simulation study is presented, comparing the quality of combined estimates for several procedures, including truncation or shrinkage of either canonical or individual matrix eigen-values, iterative summation of expanded part matrices, and the ML approach, considering a range of penalties. Shrinking eigen-values of individual matrices towards their mean reduced losses in the pooled estimates, but substantially increased proportional losses in their phenotypic counterparts and thus yielded estimates differing most from corresponding full multivariate analyses of all traits. Assuming a simple pseudo-pedigree structure when combining estimates for all random effects simultaneously using ML allowed sampling correlations between estimates of different components from the same part-analysis to be approximated sufficiently to yield pooled matrices closest to full multivariate results, with little change in phenotypic components. Imposing a mild penalty to shrink matrices for random effects towards their sum proved highly advantageous, markedly reducing losses in estimates and more than compensating for the reduction in efficiency of using the data inherent in analyses by parts. 
Penalized ML provides a flexible alternative to current methods for pooling estimates from part-analyses with good sampling properties, and should be adopted more widely.
Which Genomic Relationship Matrix?
2015, Tier, Bruce, Meyer, Karin, Ferdosi, Mohammad
Genomic information can accurately specify relationships among animals, including between those without known common ancestors. Genetic variances estimated with genomic data relate to unknown, more distant, founder populations than those defined by the pedigree. Starting from different sets of assumptions, the properties of some alternative genomic relationship matrices (G) are explored. Although the assumptions and matrices differ, the resulting sets of estimated breeding values predict the differences between animals identically, despite obtaining different estimates of the additive genetic variance - showing that there are many ways of building G that provide identical results. For some methods integer and logic, rather than floating point, operations will expedite building G many-fold.
Restricted Maximum Likelihood to estimate variance components for mixed models with two random factors
1987-03-15, Meyer, Karin
A Restricted Maximum Likelihood procedure is described to estimate variance components for a univariate mixed model with two random factors. An EM-type algorithm is presented with a reparameterisation to speed up the rate of convergence. Computing strategies are outlined for models common to the analysis of animal breeding data, allowing for both a nested and a cross-classified design of the two random factors. Two special cases are considered: firstly, the total number of levels of fixed effects is small compared to the number of levels of both random factors; secondly, one fixed effect with a large number of levels is to be fitted in addition to other fixed effects with few levels. A small numerical example is given to illustrate details.
First estimates of covariance functions for lifetime growth of Angus cattle
2003, Meyer, Karin
Estimates of covariance functions for weights of Angus cattle from birth to 3000 days of age were obtained using Bayesian analysis. Data consisted of records in 69 herds with at least 50 mature cow weights, and records in 6 additional herds with 60% or more animals having at least four weights, 551,259 records on 197,915 animals in total. The model of analysis fitted contemporary groups and cubic regressions on orthogonal polynomials of age nested within sex, birth type, dam age class and lactation status as fixed effects. Random effects fitted were cubic and quartic regressions on orthogonal polynomials of age for animals' direct genetic and permanent environmental effects, and quadratic regressions, restricted to 0 to 600 days of age, for maternal genetic and environmental effects. Measurement error variances were modelled through a step function with 32 classes, yielding 69 covariance components to be estimated.
"SNP Snappy": A Strategy for Fast Genome-Wide Association Studies Fitting a Full Mixed Model
2012, Meyer, Karin, Tier, Bruce
A strategy to reduce computational demands of genome-wide association studies fitting a mixed model is presented. Improvements are achieved by utilizing a large proportion of calculations that remain constant across the multiple analyses for individual markers involved, with estimates obtained without inverting large matrices.