Whole-genome ordinary ridge regression including gene-gene interaction effects


Academic year: 2022


IT 13 017

Degree project, 15 credits, March 2013

Whole-genome ordinary ridge regression including gene-gene interaction effects

Marzieh Farzamfar

Department of Information Technology

Faculty of Science and Technology, UTH unit
Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, House 4, Floor 0
Postal address: Box 536, 751 21 Uppsala
Telephone: 018 – 471 30 03
Fax: 018 – 471 30 00
Website: http://www.teknat.uu.se/student

Abstract

Using markers across the entire genome to predict genomic values can improve the prediction of complex traits. In this study, we considered all main effects (1981 markers) and epistatic effects (1962180 markers) in a wheat data set with 280 accessions to investigate whether including epistatic effects can improve genomic prediction. The results of our simulations using the real data showed that the contribution of epistasis to phenotypic prediction is very small. However, including epistatic effects in the model allows us to separate the epistatic effects from the main effects (estimated by a model without epistasis).

Examiner: Jarmo Rantakokko. Subject reviewer: Maya Neytcheva. Supervisor: Xia Shen.


Contents

1 Introduction
2 Basic terminology and essential concepts in population genetics
3 Linear Models
3.1 Linear Mixed Model
3.2 Generalized Linear Mixed Model
4 Data Analysis
4.1 Genomic prediction considering epistatic effects
5 Conclusion

List of Tables

1 Correlations for each trait with both methods. Mean of each trait in nine locations is taken for simulation.
2 P-values, 1. Mean of the differences for comparing mean squared error, 2. Mean of the differences for comparing correlations.
3 Prediction of genetic values for yield, weight, height and flowering date
4 Mean of the differences for mean squared error (MSE) and correlation of predicted values between two methods.


List of Figures

1 Alleles and a gene locus on homologous chromosomes [1]
2 Epistasis example [3]. Two different genes, C and B, affect the development of coat color in mice. CC and Cc mice have normal color development, but cc mice are albino. BB or Bb results in black agouti mice and bb results in brown agouti mice. Agouti is a fur color in mice (mixed brown color).
3 A generalized linear mixed model [7]
4 Mean of each trait in 9 locations is considered for simulation. Additive genome variance of a predicted trait with main effects model (x-axis) versus additive genome variance of a predicted trait with main and epistatic effects model (y-axis).
5 Residual variance of the main effects model versus residual variance of the epistatic effects model. Simulation of each trait is done for the mean of the trait over 9 locations; x-axis: variance of the residual before including epistatic effects; y-axis: variance of the residual after including epistatic effects.
6 Variance of four traits when they are predicted with only main effects versus variance of the traits after applying epistatic effects. In each plot, nine different colors represent the ratio of the variance values for each trait in nine different locations.
7 Residual variance of one predicted trait (weight) in 9 locations with the main effects model on the x-axis versus residual variance of the same trait with main and epistatic effects on the y-axis. The residual variance differences for the three other phenotypes (yield, height, flowering date) are similar to those for weight.


1 Introduction

Many researchers in plant and animal breeding study genomic selection using a large number of genetic markers. The methodology of using whole-genome markers to improve quantitative traits in plant breeding in a large population is known as genomic selection. Genomic selection combines phenotypic data with marker data in order to maximize the accuracy of the predicted genotypic values. Scientists estimate the marker effects from a reference population using particular statistical models and, based on these marker effects, predict the genetic values of new genotypes [4]. The difficulty of genomic selection is that the number of marker effects p is typically much larger than the number of observations n, i.e., p ≫ n.

Prediction of genomic values can be more efficient when both main effects and epistatic effects are used [5]. Understanding the role of epistasis in genomic selection is currently an active research area in genetics [13]. Zhiqiu Hu and coworkers showed the importance of epistatic effects in the genetic determination of traits in soybean [5]. They demonstrated that predicting genomic values using markers across the entire genome can maximize the efficiency of genomic selection, and that using all markers (main effects and epistatic effects) can decrease the prediction error significantly. We treat epistatic effects in the same way as main marker effects in so-called mixed models.

Further, in 2009, Lorenzana and Bernardo [8] analyzed whether an empirical Bayes approach for modeling epistatic effects can improve the accuracy of prediction. They showed that including epistatic effects in the prediction of genetic values does not have any advantage and may lead to poorer predictions. These conflicting results show that more investigation is necessary to see whether including epistatic effects in the prediction of genotypic values has a potential advantage.

In computational biology, different methods have been developed for analyzing genetic data with a large number of individuals. A generalized ridge regression method, HEM (heteroscedastic effects model), has recently been developed for analyzing high-dimensional genomic data [11]. With this method it is possible to fit all genome-wide additive genetic effects. Including all potential two-way interactions (epistatic effects) has not been implemented yet.

The aim of this project is to evaluate whether accounting for whole-genome epistatic effects would benefit genomic prediction. One part of the HEM algorithm is extended to fit a multiple random effects model including a correlated random effects term for all the pairwise interactions. The estimation is based on the R package “hglm” [10] for fitting various random effects models.

2 Basic terminology and essential concepts in population genetics

Gene In biological organisms, the basic unit of inheritance is the gene. A gene is a short segment of DNA that contains the information needed to produce a particular protein. Chromosomes carry all the genes in an organism. Genes also determine different characteristics in humans; for instance, eye color, height, and other traits are determined by genes. The specific location of a DNA sequence (gene) on a chromosome is called a locus, and a chromosome has many loci. A gene at a particular locus can exist in different forms, and each of these forms is called an allele (Figure 1).

Genotype The genotype is the information contained in the genes of an individual; it describes the genetic variation that exists in that individual. In other words, it is a set of genetic markers that describes which forms of the genes the individual carries.

Phenotype Each living organism has many different characteristics and traits, such as physical structure, color, and behavior; anything that is observable in an organism is part of its phenotype. The relationship between phenotype and genotype can be written in the following form [9]:

genotype (G) + environment (E) → phenotype (P) (1)

Figure 1: Alleles and a gene locus on homologous chromosomes [1]


Gene Interactions When two genes work together to create a particular phenotype, we refer to this as gene interaction. There are different kinds of genetic interaction. When the allelic effects at one locus depend on a second locus, the genetic interaction between the loci is called epistasis. In this interaction, an allele of one gene masks the phenotype of the alleles of the other gene, so that four genotypes can produce fewer than four phenotypes. In other words, epistasis happens when the alleles of one gene hide or alter the expression of the alleles of another gene. Figure 2 shows an example of epistasis in mice. In the example, allele B determines the pattern of the coat, and another locus decides whether the mouse has color at all: genotype cc is without color, while genotypes CC and Cc have color. In an additive genetic interaction, the combined effect of alleles at different gene loci is equal to the sum of their individual effects.

Figure 2: Epistasis example [3]. Two different genes, C and B, affect the development of coat color in mice. CC and Cc mice have normal color development, but cc mice are albino. BB or Bb results in black agouti mice and bb results in brown agouti mice. Agouti is a fur color in mice (mixed brown color).

3 Linear Models

Linear models (regression) are useful for modeling the relationship between a dependent variable $Y$ and independent variables $X_1, X_2, X_3, \ldots, X_p$. When $p = 1$ the model is called simple regression, and when $p > 1$ it is called multiple regression. Regression analysis is used to predict future observations and to explain the structure of data. As a simple example, assume we have the height $X_1$ and age $X_2$ of trees and we want to predict the weight of the trees.

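Usually the data are represented in the form of an array:

\[
\begin{array}{ccc}
y_1 & x_{11} & x_{12} \\
y_2 & x_{21} & x_{22} \\
y_3 & x_{31} & x_{32} \\
\vdots & \vdots & \vdots \\
y_n & x_{n1} & x_{n2}
\end{array}
\]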

Here $n$ is the number of observations and $y_i$ is the observation for the $i$-th tree. We can model the relationship between the dependent and independent variables by a linear function of the following type:

\[
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon. \tag{2}
\]

We then need to estimate the unknown parameters $\beta_0, \beta_1, \beta_2$. In equation (2), $\varepsilon$ is the error. In matrix representation, the regression equation for this example is

\[
y = X\beta + \varepsilon, \tag{3}
\]

where $X$ is

\[
X = \begin{pmatrix}
1 & x_{11} & x_{12} \\
1 & x_{21} & x_{22} \\
\vdots & \vdots & \vdots \\
1 & x_{n1} & x_{n2}
\end{pmatrix}.
\]

We estimate $\beta = (\beta_0, \beta_1, \beta_2)^T$ by the least squares method. Consider the sum of the squared errors,

\[
\sum_{i=1}^{n} \varepsilon_i^2 = \varepsilon^T \varepsilon = (y - X\beta)^T (y - X\beta). \tag{4}
\]

If we differentiate equation (4) with respect to $\beta$ and set the result equal to zero, then $\hat\beta$ satisfies the normal equation

\[
X^T X \hat\beta = X^T y. \tag{5}
\]

Then, if $X^T X$ is invertible, we have

\[
\hat\beta = (X^T X)^{-1} X^T y. \tag{6}
\]
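As a minimal sketch of equations (5) and (6), the normal equations can be solved numerically as shown here. The tree measurements and the coefficients are invented for illustration, NumPy is assumed to be available, and the response is generated without noise so that the estimate reproduces the chosen coefficients.

```python
import numpy as np

def ols_normal_equations(X, y):
    """Solve the normal equations X^T X beta = X^T y for beta."""
    XtX = X.T @ X
    Xty = X.T @ y
    # Solving the linear system avoids forming the explicit inverse of X^T X.
    return np.linalg.solve(XtX, Xty)

# Hypothetical tree data: height (x1) and age (x2); values are invented.
height = np.array([2.0, 3.5, 5.0, 6.5, 8.0])
age = np.array([1.0, 3.0, 4.0, 7.0, 8.0])
beta_true = np.array([1.0, 2.0, 0.5])    # invented coefficients (b0, b1, b2)

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones_like(height), height, age])
y = X @ beta_true                         # noise-free response for illustration

beta_hat = ols_normal_equations(X, y)
print(beta_hat)
```

Solving the system directly, rather than inverting $X^T X$, is a common numerical choice; it gives the same estimate as equation (6) whenever $X^T X$ is invertible.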


3.1 Linear Mixed Model

The linear mixed model is one of the popular statistical models in the biological and social sciences. It is written as

\[
y = X\beta + Zu + \varepsilon, \tag{7}
\]

where $y$ is the $n \times 1$ vector of observations, $\beta$ is the $p \times 1$ vector of fixed effects, $u$ is the $q \times 1$ vector of random effects, $\varepsilon$ is the $n \times 1$ vector of random error terms, $X$ is the $n \times p$ design matrix relating the observations $y$ to $\beta$, and $Z$ is the $n \times q$ design matrix relating the observations $y$ to $u$. Suppose that $u$ and $\varepsilon$ are uncorrelated random variables with $E[u] = 0$, $E[\varepsilon] = 0$, $\mathrm{Var}[u] = G$, $\mathrm{Var}[\varepsilon] = R$ and $\mathrm{cov}(\varepsilon, u) = 0$. Then the expectation and variance of the vector $y$ are

\[
E[y] = X\beta, \qquad \mathrm{Var}[y] = V = ZGZ^T + R. \tag{8}
\]

Suppose the random terms are normally distributed,

\[
u \sim N(0, G), \qquad \varepsilon \sim N(0, R). \tag{9}
\]

Then the vector $y$ is normally distributed, $y \sim N(X\beta, V)$. We can use Henderson's mixed model equations (MME) [2] to find $\hat\beta$, the best linear unbiased estimator of $\beta$, and $\hat u$, the best linear unbiased predictor of $u$:

\[
\begin{pmatrix} X^T R^{-1} X & X^T R^{-1} Z \\ Z^T R^{-1} X & Z^T R^{-1} Z + G^{-1} \end{pmatrix}
\begin{pmatrix} \hat\beta \\ \hat u \end{pmatrix}
=
\begin{pmatrix} X^T R^{-1} y \\ Z^T R^{-1} y \end{pmatrix}.
\]

Given the variance components in $R$ and $G$, the solutions can be written as

\[
\hat\beta = (X^T V^{-1} X)^{-1} X^T V^{-1} y, \tag{10}
\]

\[
\hat u = G Z^T V^{-1} (y - X\hat\beta). \tag{11}
\]

To evaluate equations (10) and (11) we need the variance matrices. We assume that $R = I_n \sigma_\varepsilon^2$, $G = I_q \sigma_u^2$ and $V = ZGZ^T + R$, together with the distributional assumptions $u \sim N(0, G)$, $\varepsilon \sim N(0, R)$ and $y \sim N(X\beta, V)$.
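The following sketch illustrates equations (10) and (11) and checks them against a direct solution of the mixed model equations. It assumes that the variance components are known, that $R$ and $G$ are scaled identity matrices, and that NumPy is available; the data, sizes and variance values are invented for illustration and are not those of this study.

```python
import numpy as np

def blue_blup(X, Z, y, sigma2_u, sigma2_e):
    """Return (beta_hat, u_hat) from equations (10) and (11),
    assuming R = sigma2_e * I_n and G = sigma2_u * I_q."""
    n, q = Z.shape
    G = sigma2_u * np.eye(q)
    R = sigma2_e * np.eye(n)
    V = Z @ G @ Z.T + R
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    beta_hat = np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)
    u_hat = G @ Z.T @ np.linalg.solve(V, y - X @ beta_hat)
    return beta_hat, u_hat

def solve_mme(X, Z, y, sigma2_u, sigma2_e):
    """Solve Henderson's mixed model equations under the same assumptions."""
    p, q = X.shape[1], Z.shape[1]
    lhs = np.block([
        [X.T @ X / sigma2_e, X.T @ Z / sigma2_e],
        [Z.T @ X / sigma2_e, Z.T @ Z / sigma2_e + np.eye(q) / sigma2_u],
    ])
    rhs = np.concatenate([X.T @ y, Z.T @ y]) / sigma2_e
    sol = np.linalg.solve(lhs, rhs)
    return sol[:p], sol[p:]

# Invented toy data with more random effects than observations (q > n).
rng = np.random.default_rng(0)
n, q = 12, 30
X = np.ones((n, 1))                                 # intercept only
Z = rng.integers(0, 2, size=(n, q)).astype(float)   # 0/1 marker-like design
y = rng.normal(size=n)

beta_a, u_a = blue_blup(X, Z, y, sigma2_u=0.5, sigma2_e=1.0)
beta_b, u_b = solve_mme(X, Z, y, sigma2_u=0.5, sigma2_e=1.0)
print(np.allclose(beta_a, beta_b), np.allclose(u_a, u_b))
```

The first route works with the $n \times n$ matrix $V$, while the mixed model equations work with a $(p + q) \times (p + q)$ system, so which one is cheaper depends on whether $n$ or $q$ is larger.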

3.2 Generalized Linear Mixed Model

A generalized linear mixed model (GLMM) [6] is similar to the linear mixed model: it includes random effects $u \sim N(0, G)$, fixed effects $\beta$, design matrices $X$ and $Z$, and a vector of observations $y$. In a GLMM, the distribution of $y$ conditional on the random effects has mean $\mu$ and covariance matrix $R$. Furthermore, we have a linear predictor $\eta$ and a link function $g$, defined by $\eta = g(\mu)$. The link function $g$ relates the linear model to the response variable, which is how linear regression is generalized. The dependent variables $y$ are generated by a particular distribution. In a generalized linear model, the expected value of $y$ is obtained through the inverse link function $g^{-1}$ as

\[
E(y) = \mu = g^{-1}(X\beta) = g^{-1}(\eta). \tag{12}
\]

In equation (12), $\eta$ is the linear predictor and $g$ is the link function. In a GLMM the expectation is calculated as

\[
E[y \mid u] = g^{-1}(X\beta + Zu) = g^{-1}(\eta). \tag{13}
\]

As we can see, the linear predictor now contains random effects in addition to the fixed effects. The random effects are assumed to be normally distributed, $u \sim N(0, G)$. Thus the linear predictor in a GLMM is a combination of fixed and random effects.
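To make the role of the link function concrete, the sketch below evaluates the conditional mean of equation (13) for a single observation. The logit link is used purely as an example of a link function and is an assumption, as are all the numerical values; with the identity link the same function reduces to the linear mixed model.

```python
import math

def logit(mu):
    """Logit link g(mu) = log(mu / (1 - mu)), used here only as an example."""
    return math.log(mu / (1.0 - mu))

def inv_logit(eta):
    """Inverse link g^{-1}(eta), mapping the linear predictor to a mean."""
    return 1.0 / (1.0 + math.exp(-eta))

def conditional_mean(x_row, beta, z_row, u, inverse_link):
    """E[y | u] = g^{-1}(x' beta + z' u) for one observation."""
    eta = sum(x * b for x, b in zip(x_row, beta))
    eta += sum(z * v for z, v in zip(z_row, u))
    return inverse_link(eta)

# Hypothetical values: one intercept and two random effects.
beta = [0.2]
u = [0.5, -0.3]
x_row, z_row = [1.0], [1.0, 0.0]          # linear predictor eta = 0.2 + 0.5 = 0.7

mu_logit = conditional_mean(x_row, beta, z_row, u, inv_logit)
mu_identity = conditional_mean(x_row, beta, z_row, u, lambda eta: eta)
print(mu_logit, mu_identity)
```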


Figure 3: A Generalized Linear mixed model [7]

4 Data Analysis

In this work we used data from the study by Wang et al. on a wheat population [13]. Their study used LASSO to select a subset of markers with both main and epistatic effects, and they showed that, for those markers, considering epistatic effects apparently improved the phenotypic prediction. Our purpose was to investigate the possibility of using all the pairwise epistatic effects to achieve maximum efficiency in prediction. The data include genotypes and phenotypes from the 2010 wheat breeding program in Nebraska. We have phenotype data (yield, grain volume-weight, plant height, flowering date) for 280 lines at nine different locations, and 1981 markers were measured. Missing genotypes were imputed using the binomial distribution: by counting the ones and zeros in each row of the genotype matrix, we calculate the probability of observing a one, and the missing values are then drawn from a binomial distribution with that probability. After imputation the genotype matrix is ready to use in our model. The matrix of calculated coefficients (kinship matrix), which represents the epistatic effects, must then be computed from it. The entries of the kinship matrix are the kinship coefficients between pairs of marker columns. The calculated coefficients are used as random effects in the generalized linear mixed model.
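The imputation step described above can be sketched as follows. This is an illustration under stated assumptions rather than the code used in this work: genotypes are assumed to be coded 0/1, missing calls are marked with `None`, each missing value is a single Bernoulli draw (a binomial with one trial), and the fallback probability for a row with no observed values is a hypothetical choice.

```python
import random

def impute_row(row, rng):
    """Fill missing entries (None) of a 0/1 genotype row with Bernoulli draws
    whose success probability is the observed frequency of ones in that row."""
    observed = [g for g in row if g is not None]
    # Fallback of 0.5 for a fully missing row is an assumption of this sketch.
    p_one = sum(observed) / len(observed) if observed else 0.5
    return [g if g is not None else int(rng.random() < p_one) for g in row]

def impute_matrix(genotypes, seed=0):
    """Impute every row of the genotype matrix with a seeded generator."""
    rng = random.Random(seed)
    return [impute_row(row, rng) for row in genotypes]

# Hypothetical toy genotype matrix with missing calls marked as None.
genotypes = [
    [1, 0, None, 1, 1],
    [0, 0, 0, None, None],
    [1, None, 1, 1, 1],
]
imputed = impute_matrix(genotypes)
print(imputed)
```

A row that contains only zeros (or only ones) among its observed values is always imputed with zeros (or ones), because the estimated probability is then exactly 0 (or 1).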

4.1 Genomic prediction considering epistatic effects

Research by Wang and colleagues [13] showed that using epistatic effects in plant breeding increased the accuracy of phenotype prediction significantly. They applied the adaptive mixed LASSO and considered the following linear mixed effects model:

\[
y = X\beta + Zu + \varepsilon. \tag{14}
\]

In their setup, LASSO was applied with the main and epistatic effects included in the fixed effects part of the model, i.e., $\beta$. Here we apply the linear mixed model to the same genotype data. The difference is that we consider two random effects terms, where one includes all the main effects and the other includes all the pairwise epistatic effects. This is similar to a two-term ridge regression and can be fitted as a multiple random effects model,

\[
y = X\beta + Z_1 u_1 + Z_2 u_2 + \varepsilon. \tag{15}
\]

The estimation of the fixed effects, the random effects, the variance components and their standard errors was done with the R package “hglm” [10]. The procedure is as follows. First, 75 percent of the genotypes are chosen, and the hglm method is used to estimate all the parameters needed to predict the phenotype. With the estimated parameters, we predict the phenotype values for the remaining 25 percent of the genotypes. Model (15) is compared with model (14), which does not include the interaction effects ($u_2$).
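The comparison of models (14) and (15) on a 75/25 split can be sketched as follows. This is not the hglm procedure: the variance components are fixed by hand instead of being estimated, the interaction design matrix is assumed to consist of element-wise products of marker columns, and the data are simulated toy values. The function names and NumPy usage are illustrative assumptions.

```python
import numpy as np
from itertools import combinations

def pairwise_products(M):
    """Columns of element-wise products for every pair of marker columns."""
    pairs = list(combinations(range(M.shape[1]), 2))
    return np.column_stack([M[:, i] * M[:, j] for i, j in pairs])

def fit_predict(Z_train_list, y_train, Z_test_list, variances, sigma2_e):
    """BLUP prediction with an intercept and one ridge term per design matrix.
    Variance components are taken as given, not estimated."""
    n = len(y_train)
    X = np.ones((n, 1))
    V = sigma2_e * np.eye(n)
    for Z, s2 in zip(Z_train_list, variances):
        V += s2 * (Z @ Z.T)
    beta = np.linalg.solve(X.T @ np.linalg.solve(V, X),
                           X.T @ np.linalg.solve(V, y_train))
    resid = np.linalg.solve(V, y_train - X @ beta)       # V^{-1} (y - X beta)
    pred = np.full(Z_test_list[0].shape[0], beta[0])
    for Z, Zt, s2 in zip(Z_train_list, Z_test_list, variances):
        pred += Zt @ (s2 * (Z.T @ resid))                # Z_test u_hat
    return pred

rng = np.random.default_rng(1)
n, m = 80, 10                                            # toy sizes, not the thesis data
M = rng.integers(0, 2, size=(n, m)).astype(float)
y = M @ rng.normal(size=m) + rng.normal(size=n)

idx = rng.permutation(n)
train, test = idx[: int(0.75 * n)], idx[int(0.75 * n):]
Z1, Z2 = M, pairwise_products(M)

pred_main = fit_predict([Z1[train]], y[train], [Z1[test]], [1.0], 1.0)
pred_epi = fit_predict([Z1[train], Z2[train]], y[train],
                       [Z1[test], Z2[test]], [1.0, 0.01], 1.0)

for name, pred in [("main", pred_main), ("main+epistatic", pred_epi)]:
    corr = np.corrcoef(pred, y[test])[0, 1]
    mse = np.mean((pred - y[test]) ** 2)
    print(f"{name}: corr={corr:.3f}, MSE={mse:.3f}")
```

Setting the variance of the interaction term to zero makes the two-term model collapse to the main effects model, which is a convenient sanity check for an implementation of this kind.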


5 Conclusion

To find out whether there is an improvement in prediction, or whether there is no difference between the two sets of results, we need to compare the means of the two sets. In addition, we need to assess the difference between their means relative to the variability of their values [12]. A paired t-test is a suitable test in our case. Table 2 reports the p-values for the mean squared error and the correlation. As Table 2 shows, the p-values are greater than 0.05, so we cannot reject the hypothesis of equal means. This means that the new method has not produced any significant improvement.
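The paired comparison described above can be sketched with the standard library alone. The function returns the mean of the differences and the t statistic; the input values are invented and are not results from this study.

```python
import math
import statistics

def paired_t_statistic(a, b):
    """Return (mean difference, t statistic, degrees of freedom) for paired samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)           # sample standard deviation (n - 1)
    t = mean_d / (sd_d / math.sqrt(n))
    return mean_d, t, n - 1

# Invented MSE values from repeated splits for two models.
mse_main = [1.0, 2.0, 3.0, 4.0]
mse_epi = [0.5, 2.5, 2.0, 4.0]
mean_d, t, df = paired_t_statistic(mse_main, mse_epi)
print(mean_d, t, df)
```

The p-value is then obtained from the t distribution with `df` degrees of freedom; if SciPy is available, `scipy.stats.ttest_rel` performs the same paired test and returns the p-value directly.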

Phenotype    Correlation without epistasis    Correlation with epistasis
Yield        0.23                             0.23
Weight       0.11                             0.11
Height       0.09                             0.09
Flowering    0.30                             0.30

Table 1: Correlations for each trait with both methods. The mean of each trait over the nine locations is used for the simulation.

Phenotype    P-value for MSE    MD MSE (1)    P-value for correlation    MD Corr (2)
Yield        0.20               4 × 10^-3     0.22                       -4 × 10^-4
Weight       0.14               6 × 10^-4     0.07                       -1 × 10^-3
Height       0.09               1 × 10^-3     0.18                       -1 × 10^-3
Flowering    0.99               -2 × 10^-3    0.99                       1 × 10^-3

Table 2: P-values. (1) Mean of the differences for comparing mean squared error; (2) mean of the differences for comparing correlations.

We applied hierarchical generalized linear models (HGLM), including epistatic effects as random effects, with the aim of improving prediction performance. We have the values of four traits at nine different locations, and the average of each trait over the nine locations is used in our model. The correlations and p-values from our simulation are shown in Tables 1 and 2. Figure 4 presents the variance of the predicted traits without epistatic effects, V0, versus the variance of the predicted traits with main and epistatic effects, V1. The reduction in variance after including epistatic effects is clear in Figure 4.


In order to compare our results with those in Table 1 of [13], we also ran the simulation for each location separately. Table 3 shows the changes in correlation and mean squared error when predicting with main effects only and with main and epistatic effects. The correlation increases very slightly in only a few cases; most correlation values remain unchanged, and in some cases the correlation decreases. Moreover, the p-values show that there is no large difference between the errors of the two prediction models; these errors are almost unchanged.

We have evaluated two models of genomic selection for the wheat data: the epistatic effects model and the main effects model. The results show that the additive genome variance did not increase after including epistatic effects, $V_{A0} \approx V_{A1} + V_I$, where the subscript 0 refers to the main effects model and the subscript 1 to the epistatic effects model. In fact, the sum of the two variances from the epistatic effects model (17), the variance of the additive effects and the variance of the epistatic effects, is equal to the variance of the additive effects from the main effects model (16):

\[
V_y = \underbrace{V_{A0}} + V_{E0}, \tag{16}
\]

\[
V_y = \underbrace{V_{A1} + V_I} + V_{E1}. \tag{17}
\]

Furthermore, the residual variance for each trait did not change, $V_{E0} \approx V_{E1}$, as we can see in Figure 6 for the trait weight.


Location    Corr-M           Corr-M,E         P-value MSE       P-value Corr

Yield
1           1.64 × 10^-1     1.61 × 10^-1     0.99              0.99
2           1.20 × 10^-1     1.18 × 10^-1     0.92              0.99
3           3.57 × 10^-2     3.59 × 10^-2     0.69              0.45
4           1.76 × 10^-1     1.76 × 10^-1     0.79              0.70
5           1.30 × 10^-1     1.24 × 10^-1     1                 1
6           -3.38 × 10^-2    -3.74 × 10^-2    0.02              1
7           8.21 × 10^-2     8.21 × 10^-2     0.78              0.52
8           3.61 × 10^-2     3.22 × 10^-2     0.99              1
9           2.08 × 10^-2     1.51 × 10^-2     0.99              1

Weight
1           1.96 × 10^-1     2.06 × 10^-1     0.07              1.7 × 10^-4
2           1.78 × 10^-1     1.77 × 10^-1     0.96              0.91
3           2.13 × 10^-1     2.19 × 10^-1     2.801 × 10^-16    7 × 10^-16
4           2.77 × 10^-2     3.38 × 10^-2     2.806 × 10^-1     1.204 × 10^-15
5           1.48 × 10^-2     1.46 × 10^-2     0.91              0.57
6           1.07 × 10^-1     1.04 × 10^-1     0.99              1
7           -2.46 × 10^-2    -2.36 × 10^-2    0.75              0.35
8           5.58 × 10^-2     5.39 × 10^-2     0.82              0.99
9           9.33 × 10^-2     9.18 × 10^-2     0.88              0.86

Plant Height
1           1.13 × 10^-1     1.10 × 10^-1     0.99              0.99
2           9.74 × 10^-2     9.70 × 10^-2     0.95              0.71
3           -1.22 × 10^-1    -1.08 × 10^-1    0.99              2.2 × 10^-16
4           1.10 × 10^-1     1.10 × 10^-1     0.99              0.59
6           1.51 × 10^-1     1.52 × 10^-1     0.12              0.08
7           1.24 × 10^-1     1.21 × 10^-1     0.99              0.99
8           1.85 × 10^-1     1.93 × 10^-1     6.178 × 10^-9     4.694 × 10^-15
9           -8.15 × 10^-2    -7.77 × 10^-2    0.97              6.412 × 10^-5

Flowering date
1           2.46 × 10^-1     2.42 × 10^-1     1                 1
2           2.45 × 10^-1     2.46 × 10^-1     0.31              0.02
3           2.73 × 10^-1     2.70 × 10^-1     0.99              1

Table 3: Prediction of genetic values for yield, weight, height and flowering date. Corr-M: correlation with main effects only; Corr-M,E: correlation with main and epistatic effects.


Location    Mean difference MSE    Mean difference Corr

Yield
1           -7.76 × 10^-2          3.04 × 10^-3
2           -1.55 × 10^-2          1.97 × 10^-3
3           -4.90 × 10^-3          -1.57 × 10^-4
4           -1.14 × 10^-2          3.11 × 10^-4
5           -8.87 × 10^-2          5.71 × 10^-3
6           9.64 × 10^-3           3.60 × 10^-3
7           -1.76 × 10^-2          5.22 × 10^-5
8           -4.83 × 10^-2          3.86 × 10^-3
9           -5.42 × 10^-2          5.68 × 10^-3

Weight
1           2.61 × 10^-2           -9.32 × 10^-3
2           -1.61 × 10^-3          6.78 × 10^-4
3           5.35 × 10^-3           -5.37 × 10^-3
4           8.18 × 10^-4           -6.11 × 10^-3
5           -2.49 × 10^-2          2.56 × 10^-4
6           -1.97 × 10^-3          2.86 × 10^-3
7           -8.77 × 10^-3          -1.02 × 10^-3
8           -1.40 × 10^-3          1.89 × 10^-3
9           -1.46 × 10^-2          1.47 × 10^-3

Height
1           -5.46 × 10^-3          2.64 × 10^-3
2           -3.72 × 10^-3          4.79 × 10^-4
3           -4.07 × 10^-3          -1.32 × 10^-2
4           -7.37 × 10^-3          2.02 × 10^-4
6           1.64 × 10^-3           -8.17 × 10^-4
7           -8.37 × 10^-3          2.51 × 10^-3
8           1.79 × 10^-2           -7.44 × 10^-3
9           -3.29 × 10^-3          -3.77 × 10^-3

Flowering date
1           -3.09 × 10^-3          3.84 × 10^-3
2           5.17 × 10^-4           -1.48 × 10^-3
3           -6.41 × 10^-3          3.38 × 10^-3

Table 4: Mean of the differences for mean squared error (MSE) and correlation of predicted values between two methods.

Figure 4: Mean of each trait in 9 locations is considered for simulation. Additive genome variance of a predicted trait with main effects model (x-axis) versus additive genome variance of a predicted trait with main and epistatic effects model (y-axis).

[Figure 4 panels: Yield Variance (V.yield0 vs. V.yield1), Weight Variance (V.weight0 vs. V.weight1), Height Variance (V.height0 vs. V.height1), Flowering time Variance (V.flower0 vs. V.flower1).]

Figure 5: Residual variance of the main effects model versus residual variance of the epistatic effects model. Simulation of each trait is done for the mean of the trait over 9 locations; x-axis: variance of the residual before including epistatic effects; y-axis: variance of the residual after including epistatic effects.

[Figure 5 panels: MSE Variance Yield (V.yieldE0 vs. V.yieldE1), MSE Variance Weight (V.weightE0 vs. V.weightE1), MSE Variance Height (V.heightE0 vs. V.heightE1), MSE Variance Flowering Time (V.flowerE0 vs. V.flowerE1).]

Figure 6: Variance of four traits when they are predicted with only main effects versus variance of the traits after applying epistatic effects. In each plot, nine different colors represent the ratio of the variance values for each trait in nine different locations.

[Figure 6 panels: Yield Variance For 9 Locations (Yield.Variance.M vs. Yield.Variance.ME), Variance Weight (V.weight0 vs. V.weight1), Height Variance For 9 Locations (Height.Variance.M vs. Height.Variance.ME), Flowering Variance For 3 Locations (flowering.variance.M vs. flowering.variance.ME).]

Figure 7: Residual variance of one predicted trait (weight) in 9 locations with the main effects model on the x-axis versus residual variance of the same trait with main and epistatic effects on the y-axis. The residual variance differences for the three other phenotypes (yield, height, flowering date) are similar to those for weight.

[Figure 7 panels: MSE Variance Weight 1 to MSE Variance Weight 9, one panel per location, each plotting V.weightE0 vs. V.weightE1.]

References

[1] Alleles, alternative versions of a gene. http://cikgurozaini.blogspot.se/2010/06/genetic-1.html. Accessed: 2/13/2013.

[2] Applications of Linear Models in Animal Breeding by Henderson 1984. http://cgil.uoguelph.ca/pub/Henderson.html. Accessed: 13/02/2013.

[3] Gene interactions. http://bioserv.fiu.edu/~walterm/genbio2004/chapter10_trans_genetics/genetics_pics_post.htm. Accessed: 1/25/2013.

[4] Genomic selection and prediction in plant breeding. http://genomics.cimmyt.org/. Accessed: 01/02/2013.

[5] Zhiqiu Hu and Yongguang Li. Genomic value prediction for quantitative traits under the epistatic model. BMC Genetics, 1471-2156:12–15, 2011.

[6] N.E. Breslow and D.G. Clayton. Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88:9–25, March 1993.

[7] Stephen D. Kachman. An introduction to generalized linear mixed models. In Implementation Strategies for National Beef Cattle Evaluation, pages 59–73, 2000.

[8] Robenzon E. Lorenzana and Rex Bernardo. Accuracy of genotypic value predictions for marker-based selection in biparental plant populations. Theoretical and Applied Genetics, 120:151–161, 2009.

[9] Michael Lynch and Bruce Walsh. Genetics and analysis of quantitative traits. Sunderland, Ma, 1997.

[10] Lars Rönnegård, Xia Shen, and Moudud Alam. hglm: A package for fitting hierarchical generalized linear models. The R Journal, 2(2):20–28, December 2010.

[11] Xia Shen, Moudud Alam, Freddy Fikse, and Lars Rönnegård. A novel generalized ridge regression method for quantitative genetics. Genetics, 2013.

[12] Timothy C. Urdan. Statistics In Plain English. Routledge, 2010.

[13] D Wang, I El-Basyoni, PS Baenziger, J Crossa, K Eskridge, and I Dweikat. Prediction of genetic values of quantitative traits with epistatic effects in plant breeding populations, 2012.
