  • Publication
    Accuracy of genomic selection: Comparing theory and results
    (Association for the Advancement of Animal Breeding and Genetics (AAABG), 2009)
    Hayes, B J; Daetwyler, H D; Bowman, P; Moser, G; Khatkar, M; Raadsma, H W; Goddard, M E
    Deterministic predictions of the accuracy of genomic breeding values in selection candidates with no phenotypes have been derived based on the heritability of the trait, the number of phenotyped and genotyped animals in the reference population where the marker effects are estimated, the effective population size and the length of the genome. We assessed the value of these deterministic predictions given the results that have been achieved in Holstein and Jersey dairy cattle. We conclude that the deterministic predictions are a useful guide for establishing the size of the reference populations which must be assembled in order to predict genomic breeding values at a desired level of accuracy in selection candidates.
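The abstract above names the inputs to the deterministic prediction but not the formula itself. As a minimal sketch, the code below uses one form commonly cited in the genomic selection literature: accuracy r = sqrt(N h² / (N h² + Me)), with the number of independent chromosome segments approximated as Me = 2 Ne L / ln(4 Ne L). Whether the paper uses exactly this form is an assumption, and the function names and example values are hypothetical.

```python
import math


def effective_segments(ne: float, genome_morgans: float) -> float:
    """Approximate number of independent chromosome segments.

    Assumed form: Me = 2*Ne*L / ln(4*Ne*L), with L in Morgans.
    """
    return 2.0 * ne * genome_morgans / math.log(4.0 * ne * genome_morgans)


def predicted_accuracy(n_ref: int, h2: float, ne: float, genome_morgans: float) -> float:
    """Predicted accuracy of genomic breeding values for unphenotyped candidates.

    Assumed form: r = sqrt(N*h2 / (N*h2 + Me)).
    """
    me = effective_segments(ne, genome_morgans)
    return math.sqrt(n_ref * h2 / (n_ref * h2 + me))


if __name__ == "__main__":
    # Illustrative values only (not taken from the paper).
    for n_ref in (500, 1000, 5000, 20000):
        r = predicted_accuracy(n_ref, h2=0.5, ne=100, genome_morgans=30)
        print(f"N = {n_ref:6d}  predicted accuracy = {r:.3f}")
```

Under this form, accuracy rises with reference population size and heritability, and falls as effective population size or genome length increases, which is why the prediction can guide how large a reference population needs to be.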
  • Publication
    Combining two markov chain monte carlo approaches for linkage and association studies with a complex pedigree and multi marker loci
    In QTL mapping using linkage and/or linkage disequilibrium, an important process is to find the pattern of inheritance states and haplotype configurations, a process known as haplotype reconstruction. Haplotype reconstruction is routinely based upon observed pedigree information and marker genotypes for individuals in the pedigree. It is not feasible for exact methods to use all such information for large complex pedigrees, especially when there are many missing genotypes. Markov Chain Monte Carlo (MCMC) approaches have been widely used to handle a complex pedigree with sparse genotypic data. However, they often have reducibility problems or are slow to converge. Combining two different MCMC approaches improves computational speed and mixing properties. It allows reliable estimates, such as identity by descent coefficients between individuals, to be obtained within a reasonable time.
  • Publication
    Genome wide selection in dairy cattle based on high-density genome-wide SNP analysis: From discovery to application
    (Association for the Advancement of Animal Breeding and Genetics (AAABG), 2007)
    Raadsma, H W; Zenger, K R; Nicholas, F W; Khatkar, M S; Moser, G; Solkner, J; Cavanagh, J A L; Hawken, R J; Hobbs, M; Barris, W
    A genome wide selection (GWS) platform was developed for prediction of genetic merit in dairy cattle. The critical components of the GWS platform included a genome wide SNP analysis assay representing 15,036 SNPs, 1,546 progeny-tested Holstein Friesian sires with EBV (ABV) for 42 lactation performance traits, and a series of complexity reduction methods with internal and external cross validation. Molecular Breeding Values (MBV) derived using a fraction of the available SNP information were shown to have high predictive value for genetic merit (r=0.65-0.87 with ABV) in bulls not used in the training data from which the SNP effects were derived. GWS can be used in the absence of SNP location and pedigree to make potentially highly accurate predictions of genetic merit at an early age from DNA analyses.
  • Publication
    Population stratification, not genotype error, causes some SNPs to depart from Hardy-Weinberg Equilibrium
    (Association for the Advancement of Animal Breeding and Genetics (AAABG), 2009)
    Large scale whole genome scans generate massive amounts of genotype data. It is essential to check genotype integrity and identify genotype errors prior to association analysis. Departure from Hardy-Weinberg Equilibrium has been adopted as one of the main methods to identify genotype errors. However, population stratification also causes departure from Hardy-Weinberg Equilibrium, which is a disadvantage of this approach. This study used two sets of SNP genotypes to show that, after basic editing on call rate and minor allele frequency, up to 13% of SNPs in one dataset departed from Hardy-Weinberg Equilibrium (HWD). About one third of these HWD SNPs, which could be falsely identified as genotype errors, were attributable to population subdivision (e.g. herd of origin, cohort). The corresponding figures for the second dataset were 21% and 16%, respectively. This approach can avoid improper culling of a considerable proportion of SNPs.
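To illustrate why stratification alone can produce apparent departure from Hardy-Weinberg Equilibrium, the sketch below applies a standard one-degree-of-freedom chi-square test to genotype counts. The abstract does not say which test the study used, so the choice of test, the function name and the genotype counts are all assumptions made for illustration.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square test (1 df) for departure from Hardy-Weinberg proportions.

    Returns (test statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical herds, each in Hardy-Weinberg proportions on its own.
    herd_1 = (81, 18, 1)   # allele A frequency 0.9
    herd_2 = (1, 18, 81)   # allele A frequency 0.1
    pooled = tuple(a + b for a, b in zip(herd_1, herd_2))
    for label, counts in (("herd 1", herd_1), ("herd 2", herd_2), ("pooled", pooled)):
        stat, p_value = hwe_chi_square(*counts)
        print(f"{label}: chi-square = {stat:.2f}, p = {p_value:.3g}")
```

Each hypothetical herd passes the test on its own, yet the pooled sample shows a strong heterozygote deficit. Testing within subpopulations (herd, cohort) before culling a SNP is one way to separate this effect from genuine genotyping error.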
  • Publication
    Single nucleotide polymorphisms in suppressor of cytokine signalling-2 gene and association with feed conversion ratio and growth in pigs
    (Association for the Advancement of Animal Breeding and Genetics (AAABG), 2007)
    Piper, E; Chen, Y; Luxford, B G; Moran, C
    The Suppressor of Cytokine Signalling-2 (SOCS2) is the main negative regulator of somatic growth through the mediation of growth hormone signalling (GH/IGF-1). Knock-out and naturally mutant mice have high growth phenotypes. We have mapped the porcine SOCS2 gene to chromosome 5q, close to a reported QTL for food conversion ratio (Lee et al., 2003). Here we report five single nucleotide polymorphisms identified by sequencing of the promoter region and exon 1. One PCR-RFLP assay was designed for genotyping the polymorphism at position 1667 (A/G). Association analyses were performed in an Australian mapping resource pedigree (PRDC-US43) for a number of traits (feed conversion ratio, backfat, IGF-1 level and growth traits) and showed significant effects on average daily gain on test (ADG2) (p<0.01) and marginal association with feed conversion ratio (FCR) (p<0.08).
  • Publication
    Multi-Environment Trial Analysis For 'Pinus Radiata'
    (Scion, 2008)
    Dutkowski, G; Wu, H X; Powell, M B; McRae, T A
    A stem-diameter data set of five combined trials of 'Pinus radiata' D. Don was used to identify and determine the nature of genetics by environment (GxE) interaction. The restricted maximum likelihood approach was applied to handle the main issues of the multi-environment trial analysis: (1) Testing sources of heterogeneity of variance and lack of between-sites genetic correlation; (2) Modelling the heterogeneity of error variance among trials and micro-environmental variation within each trial; and (3) Selecting the best model for prediction of breeding values. Model comparison was based on the criterion of log-likelihood. The significance of variance components was tested by the likelihood ratio test which showed that all sources of GxE interactions were highly significant, indicating that GxE interactions occurred in these five trials due to both the heterogeneity of variances and the lack of correlation. Estimates of Type B genetic correlations were increased slightly by correcting for the heterogeneity of variances. The full model, which accommodated heterogeneity of error variances between trials, spatial variation within trials, and fitting a separate GxE interaction variance for each trial, was superior to other models for this multi-environment trial.
  • Publication
    Statistical methods to interpret genotypic data
    (2007)
    Woolaston, Alex; Murison, Robert
    Recent developments in genetic techniques have provided high throughput tools such as single nucleotide polymorphism (SNP) chips and cDNA microarrays to assist in genetic selection. Such high throughput devices necessitate new statistical approaches so that the massive amounts of data gathered can be exploited in an effective manner. This thesis describes some statistical methods that can be applied to SNP data and microarray data. Firstly, the use of SNP data to predict molecular breeding value (MBV) is studied. Principal component analysis (PCA) is used to summarize the variation of the high dimensional SNP space within a smaller dimensional projection space of principal components (PCs). It is demonstrated how the PCs can be used in principal component regression (PCR) to predict the MBV of dairy cattle from their SNP values alone with both simulated and real data. Highly reliable estimated breeding values (EBVs) are available for the real animals. A cross-validation method is used to predict MBVs for dairy sires, with a correlation of 0.69 between the EBVs and estimated MBVs obtained for these real data. The impact of erroneous SNP values, missing SNP values and the number of animals with known EBVs genotyped is also examined. Through simulation, it is found that erroneous SNP values of greater than 2% reduce the accuracy of prediction, whereas the number of missing SNP values has little impact on the accuracy of prediction. As expected, an increase in the number of animals with already known EBVs increases the accuracy of prediction. Kernel regression is used to predict MBV from the intrinsically discrete SNP data. Binomial kernels, which treat the SNP values as a discrete variable, and a Gaussian kernel, which imposes a continuous structure on the marker data, are employed and compared. It is empirically demonstrated that the Gaussian kernel outperforms the binomial kernel when used in Nadaraya-Watson kernel regression. 
    Secondly, statistical methods to account for the nuisance spatial trends found in microarray slides are assessed. Wavelets are proposed as a method of modeling spatial effects in two colour cDNA microarrays where the spatial error component may be represented as a fractal surface. This method is compared with smoothing splines plus first order autoregressive detrending using data collected from mice in a time-course experiment. Two schemes for selecting control genes are also assessed for these data: (i) pre-determined genes and (ii) genes that do not over- or under-express throughout the experiment. It is shown that the spatial adjustment and the set of control genes can influence the interpretation of test genes. Results from this microarray study are also used to generate simulated data to assess the models to remove spatial trends. The wavelets threshold approach is the most successful when the nuisance spatial trends in the images are rough and fractal, but there is little difference between the models for images with smoother spatial bias.
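The first part of the thesis abstract above describes Nadaraya-Watson kernel regression with a Gaussian kernel applied to SNP data. The following is a minimal sketch of that general technique, assuming genotypes coded 0/1/2 and a single bandwidth parameter. The function name, the coding and the toy data are hypothetical and are not taken from the thesis.

```python
import numpy as np


def nadaraya_watson(x_train: np.ndarray, y_train: np.ndarray,
                    x_new: np.ndarray, bandwidth: float) -> np.ndarray:
    """Nadaraya-Watson prediction with a Gaussian kernel.

    x_train: (n_train, n_snp) genotype matrix, assumed coded 0/1/2.
    y_train: (n_train,) known breeding values.
    x_new:   (n_new, n_snp) genotypes of animals to predict.
    """
    # Squared Euclidean distance between every new and every training animal.
    diff = x_new[:, None, :] - x_train[None, :, :]
    dist_sq = (diff ** 2).sum(axis=2)
    # Gaussian kernel weights; closer genotypes receive larger weights.
    weights = np.exp(-dist_sq / (2.0 * bandwidth ** 2))
    # Each prediction is a weighted average of the training values.
    return (weights @ y_train) / weights.sum(axis=1)


if __name__ == "__main__":
    # Toy data for illustration only.
    x_train = np.array([[0, 1, 2], [2, 1, 0], [1, 1, 1]], dtype=float)
    y_train = np.array([1.0, 3.0, 2.0])
    x_new = np.array([[0, 1, 1]], dtype=float)
    print(nadaraya_watson(x_train, y_train, x_new, bandwidth=1.0))
```

The bandwidth controls the smoothing: a very large value pulls every prediction towards the training mean, while a very small value makes each prediction follow its nearest training genotype. In practice it would be chosen by cross-validation.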
  • Publication
    Fine mapping QTL with haplotypes determined from dense single nucleotide polymorphic markers
    (Association for the Advancement of Animal Breeding and Genetics (AAABG), 2007)
    Hawken, Rachel
    We use publicly available methods to impute missing genotypes, infer haplotypes and partition haplotypes into blocks for large numbers of single nucleotide polymorphic data on two sections of chromosomes. Haplotype trend regression was used to associate these haplotype blocks with a continuously distributed trait. A number of significant chromosomal regions that were not found with single-marker tests were identified. This study demonstrated a feasible framework for fine-mapping QTL using haplotypes of SNP markers.
  • Publication
    A comparison of five methods to predict genomic breeding values of dairy bulls from genome-wide SNP markers
    (BioMed Central Ltd, 2009)
    Moser, Gerhard; Khatkar, Mehar S; Raadsma, Herman W
    Background: Genomic selection (GS) uses molecular breeding values (MBV) derived from dense markers across the entire genome for selection of young animals. The accuracy of MBV prediction is important for a successful application of GS. Recently, several methods have been proposed to estimate MBV. Initial simulation studies have shown that these methods can accurately predict MBV. In this study we compared the accuracies and possible bias of five different regression methods in an empirical application in dairy cattle. Methods: Genotypes of 7,372 SNP and highly accurate EBV of 1,945 dairy bulls were used to predict MBV for protein percentage (PPT) and a profit index (Australian Selection Index, ASI). Marker effects were estimated by least squares regression (FR-LS), Bayesian regression (Bayes-R), random regression best linear unbiased prediction (RR-BLUP), partial least squares regression (PLSR) and nonparametric support vector regression (SVR) in a training set of 1,239 bulls. Accuracy and bias of MBV prediction were calculated from cross-validation of the training set and tested against a test team of 706 young bulls. Results: For both traits, FR-LS using a subset of SNP was significantly less accurate than all other methods, which used all SNP. Accuracies obtained by Bayes-R, RR-BLUP, PLSR and SVR were very similar for ASI (0.39-0.45) and for PPT (0.55-0.61). Overall, SVR gave the highest accuracy. All methods resulted in biased MBV predictions for ASI; for PPT, only RR-BLUP and SVR predictions were unbiased. A significant decrease in accuracy of prediction of ASI was seen in young test cohorts of bulls compared to the accuracy derived from cross-validation of the training set. This reduction was not apparent for PPT. Combining MBV predictions with pedigree based predictions gave 1.05 to 1.34 times higher accuracies compared to predictions based on pedigree alone. Computational requirements differed considerably among methods, with PLSR and RR-BLUP requiring the least computing time. Conclusions: The four methods which use information from all SNP, namely RR-BLUP, Bayes-R, PLSR and SVR, generate similar accuracies of MBV prediction for genomic selection, and their use in the selection of immediate future generations in dairy cattle will be comparable. The use of FR-LS in genomic selection is not recommended.
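Of the five methods compared in the abstract above, RR-BLUP is the simplest to sketch: when all marker effects are assumed to share one variance, estimating them reduces to ridge regression on centred genotypes. The code below is a minimal illustration of that general idea, not the authors' implementation. The function names, the simulated data and the shrinkage parameter `lam` (the assumed ratio of residual to marker-effect variance) are all hypothetical.

```python
import numpy as np


def fit_rr_blup(genotypes: np.ndarray, ebv: np.ndarray, lam: float):
    """Estimate marker effects by ridge regression (RR-BLUP style).

    genotypes: (n_animals, n_snp) matrix, assumed coded 0/1/2.
    ebv:       (n_animals,) response, e.g. estimated breeding values.
    lam:       shrinkage parameter (assumed residual / marker variance ratio).
    Returns (intercept, column means used for centring, marker effects).
    """
    mu = ebv.mean()
    col_means = genotypes.mean(axis=0)
    z = genotypes - col_means
    n_snp = z.shape[1]
    # Solve (Z'Z + lam * I) a = Z'(y - mu) for the marker effects a.
    effects = np.linalg.solve(z.T @ z + lam * np.eye(n_snp), z.T @ (ebv - mu))
    return mu, col_means, effects


def predict_mbv(genotypes: np.ndarray, mu: float,
                col_means: np.ndarray, effects: np.ndarray) -> np.ndarray:
    """Molecular breeding values for new animals from fitted marker effects."""
    return mu + (genotypes - col_means) @ effects


if __name__ == "__main__":
    # Simulated data for illustration only.
    rng = np.random.default_rng(1)
    train = rng.integers(0, 3, size=(200, 50)).astype(float)
    test = rng.integers(0, 3, size=(100, 50)).astype(float)
    true_effects = rng.normal(0.0, 0.1, size=50)
    y_train = train @ true_effects + rng.normal(0.0, 0.5, size=200)
    y_test = test @ true_effects + rng.normal(0.0, 0.5, size=100)
    mu, means, effects = fit_rr_blup(train, y_train, lam=25.0)
    mbv = predict_mbv(test, mu, means, effects)
    # Accuracy in the sense used above: correlation of MBV with the target.
    print("correlation in held-out animals:", np.corrcoef(mbv, y_test)[0, 1])
```

Evaluating the correlation in animals held out of training mirrors the distinction the abstract draws between cross-validation within the training set and validation in a separate cohort of young bulls.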
  • Publication
    Direct Molecular Markers for Pig Improvement: APL1756 QTL Analyses on Chromosomes 10, 9 and 4
    (2004)
    NSW Department of Primary Industries
    This report summarizes analyses at AGBU and covers two areas. The first area is the supplementary analyses for APL1756 and US43 animals on chromosome 10, as a result of additional genotyping for two APL1756 sire families and re-scoring of some genotypes. The second area reports the analysis of chromosome 4 for US43 animals and of chromosome 9 for APL1756 animals.